Search Results for author: Jerry Zhi-Yang He

Found 5 papers, 0 papers with code

CoS: Enhancing Personalization and Mitigating Bias with Context Steering

no code implementations · 2 May 2024 · Jerry Zhi-Yang He, Sashrika Pandey, Mariah L. Schrum, Anca Dragan

Proper usage of the context enables the LLM to generate personalized responses, whereas inappropriate contextual influence can lead to stereotypical and potentially harmful generations (e.g., associating "female" with "housekeeper").
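The core idea of steering contextual influence can be illustrated as rescaling the shift a context induces on next-token logits. A minimal sketch, assuming a contrastive formulation where the steering factor `lam` scales the difference between context-conditioned and context-free logits (the function name and scale convention are my assumptions, not the paper's API):

```python
import numpy as np

def steer_logits(logits_with_context, logits_without_context, lam):
    """Rescale the contextual shift: lam=1 reproduces ordinary conditioning,
    lam>1 amplifies the context's influence, lam=0 removes it entirely."""
    shift = logits_with_context - logits_without_context
    return logits_without_context + lam * shift

base = np.array([2.0, 1.0, 0.5])  # next-token logits without the context
ctx = np.array([2.5, 0.5, 0.5])   # next-token logits with the context prepended

print(steer_logits(ctx, base, 1.0))  # [2.5 0.5 0.5] -- unchanged conditioning
print(steer_logits(ctx, base, 0.0))  # [2.  1.  0.5] -- context influence removed
```

Dialing `lam` down would dampen an over-influential (potentially stereotyping) context, while dialing it up would strengthen personalization.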

Quantifying Assistive Robustness Via the Natural-Adversarial Frontier

no code implementations · 16 Oct 2023 · Jerry Zhi-Yang He, Zackory Erickson, Daniel S. Brown, Anca D. Dragan

We propose that capturing robustness in these interactive settings requires constructing and analyzing the entire natural-adversarial frontier: the Pareto-frontier of human policies that are the best trade-offs between naturalness and low robot performance.
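The frontier described above is a standard Pareto set over two objectives. A minimal illustrative sketch (not the paper's code; policy names and scores are hypothetical) that keeps the human policies for which no other policy is simultaneously at least as natural and at least as damaging to robot performance:

```python
def pareto_frontier(policies):
    """Return policies not dominated on (higher naturalness, lower robot performance)."""
    frontier = []
    for p in policies:
        dominated = any(
            q is not p
            and q["naturalness"] >= p["naturalness"]
            and q["robot_performance"] <= p["robot_performance"]
            for q in policies
        )
        if not dominated:
            frontier.append(p)
    return frontier

policies = [
    {"name": "natural", "naturalness": 0.9, "robot_performance": 0.8},
    {"name": "adversarial", "naturalness": 0.2, "robot_performance": 0.1},
    {"name": "dominated", "naturalness": 0.2, "robot_performance": 0.8},
]
print([p["name"] for p in pareto_frontier(policies)])  # ['natural', 'adversarial']
```

The "dominated" policy is dropped because the "natural" policy is more natural while degrading robot performance at least as much.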

Learning Representations that Enable Generalization in Assistive Tasks

no code implementations · 5 Dec 2022 · Jerry Zhi-Yang He, Aditi Raghunathan, Daniel S. Brown, Zackory Erickson, Anca D. Dragan

We advocate that generalization to such OOD policies benefits from (1) learning a good latent representation for human policies that test-time humans can accurately be mapped to, and (2) making that representation adaptable with test-time interaction data, instead of relying on it to perfectly capture the space of human policies based on the simulated population only.

Causal Confusion and Reward Misidentification in Preference-Based Reward Learning

no code implementations · 13 Apr 2022 · Jeremy Tien, Jerry Zhi-Yang He, Zackory Erickson, Anca D. Dragan, Daniel S. Brown

While much prior work focuses on causal confusion in reinforcement learning and behavioral cloning, we focus on a systematic study of causal confusion and reward misidentification when learning from preferences.

Imitation Learning

Assisted Robust Reward Design

no code implementations · 18 Nov 2021 · Jerry Zhi-Yang He, Anca D. Dragan

We contribute an Assisted Reward Design method that speeds up the design process by anticipating and influencing this future evidence: rather than letting the designer eventually encounter failure cases and revise the reward then, the method actively exposes the designer to such environments during the development phase.

Autonomous Driving
