Search Results for author: Justin Wang

Found 9 papers, 3 papers with code

Building A Proof-Oriented Programmer That Is 64% Better Than GPT-4o Under Data Scarsity

no code implementations • 17 Feb 2025 • Dylan Zhang, Justin Wang, Tianran Sun

Existing LMs struggle with proof-oriented programming due to data scarcity, which manifests in two key ways: (1) a lack of sufficient corpora for proof-oriented programming languages such as F*, and (2) the absence of large-scale, project-level proof-oriented implementations that can teach the model the intricate reasoning process involved in proof-oriented programming.

Data Augmentation

AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents

1 code implementation • 11 Oct 2024 • Maksym Andriushchenko, Alexandra Souly, Mateusz Dziemian, Derek Duenas, Maxwell Lin, Justin Wang, Dan Hendrycks, Andy Zou, Zico Kolter, Matt Fredrikson, Eric Winsor, Jerome Wynne, Yarin Gal, Xander Davies

The robustness of LLMs to jailbreak attacks, where users design prompts to circumvent safety measures and misuse model capabilities, has been studied primarily for LLMs acting as simple chatbots.

$\textbf{Only-IF}$: Revealing the Decisive Effect of Instruction Diversity on Generalization

no code implementations • 7 Oct 2024 • Dylan Zhang, Justin Wang, Francois Charton

In both cases, we demonstrate that 1) better performance can be achieved by increasing the diversity of an established dataset while keeping the data size constant, and 2) when scaling up the data, diversifying the semantics of instructions is more effective than simply increasing the quantity of similar data.

Diversity, Instruction Following

MAGICS: Adversarial RL with Minimax Actors Guided by Implicit Critic Stackelberg for Convergent Neural Synthesis of Robot Safety

no code implementations • 20 Sep 2024 • Justin Wang, Haimin Hu, Duy Phuong Nguyen, Jaime Fernández Fisac

While robust optimal control theory provides a rigorous framework to compute robot control policies that are provably safe, it struggles to scale to high-dimensional problems, leading to increased use of deep learning for tractable synthesis of robot safety.

OpenAI Gym, Reinforcement Learning (RL)

Tamper-Resistant Safeguards for Open-Weight LLMs

2 code implementations • 1 Aug 2024 • Rishub Tamirisa, Bhrugu Bharathi, Long Phan, Andy Zhou, Alice Gatti, Tarun Suresh, Maxwell Lin, Justin Wang, Rowan Wang, Ron Arel, Andy Zou, Dawn Song, Bo Li, Dan Hendrycks, Mantas Mazeika

Rapid advances in the capabilities of large language models (LLMs) have raised widespread concerns regarding their potential for malicious use.

Red Teaming, TAR

From Symbolic Tasks to Code Generation: Diversification Yields Better Task Performers

no code implementations • 30 May 2024 • Dylan Zhang, Justin Wang, Francois Charton

Instruction tuning -- tuning large language models on instruction-output pairs -- is a promising technique for making models better adapted to the real world.

Code Generation

Instruction Diversity Drives Generalization To Unseen Tasks

no code implementations • 16 Feb 2024 • Dylan Zhang, Justin Wang, Francois Charton

We investigate the trade-off between the number of instructions the model is trained on and the number of training samples provided for each instruction and observe that the diversity of the instruction set determines generalization.

Diversity, Language Modeling, +2

3D Pose Detection in Videos: Focusing on Occlusion

no code implementations • 24 Jun 2020 • Justin Wang, Edward Xu, Kangrui Xue, Lukasz Kidzinski

In this work, we build upon existing methods for occlusion-aware 3D pose detection in videos.

Position
