Search Results for author: Nevan Wichers

Found 12 papers, 2 papers with code

Gradient-Based Language Model Red Teaming

1 code implementation • 30 Jan 2024 • Nevan Wichers, Carson Denison, Ahmad Beirami

Red teaming is a common strategy for identifying weaknesses in generative language models (LMs), where adversarial prompts are produced that trigger an LM to generate unsafe responses.

Language Modelling

Beyond Sparse Rewards: Enhancing Reinforcement Learning with Language Model Critique in Text Generation

no code implementations • 14 Jan 2024 • Meng Cao, Lei Shu, Lei Yu, Yun Zhu, Nevan Wichers, Yinxiao Liu, Lei Meng

We investigate this approach under two different settings: one where the policy model is smaller and is paired with a more powerful critic model, and another where a single language model fulfills both roles.

Language Modelling, Reinforcement Learning +2

Fusion-Eval: Integrating Evaluators with LLMs

no code implementations • 15 Nov 2023 • Lei Shu, Nevan Wichers, Liangchen Luo, Yun Zhu, Yinxiao Liu, Jindong Chen, Lei Meng

Evaluating natural language systems poses significant challenges, particularly in the realms of natural language understanding and high-level reasoning.

Natural Language Understanding

SAFER: Data-Efficient and Safe Reinforcement Learning via Skill Acquisition

no code implementations • 10 Feb 2022 • Dylan Slack, Yinlam Chow, Bo Dai, Nevan Wichers

However, we identify that these techniques are not well equipped for safe policy learning because they ignore negative experiences (e.g., unsafe or unsuccessful), focusing only on positive experiences, which harms their ability to generalize to new tasks safely.

Reinforcement Learning (RL) +2

SAFER: Data-Efficient and Safe Reinforcement Learning Through Skill Acquisition

no code implementations • 29 Sep 2021 • Dylan Z Slack, Yinlam Chow, Bo Dai, Nevan Wichers

Though many reinforcement learning (RL) problems involve learning policies in settings with difficult-to-specify safety constraints and sparse rewards, current methods struggle to rapidly and safely acquire successful policies.

Reinforcement Learning (RL) +2

ActionBert: Leveraging User Actions for Semantic Understanding of User Interfaces

no code implementations • 22 Dec 2020 • Zecheng He, Srinivas Sunkara, Xiaoxue Zang, Ying Xu, Lijuan Liu, Nevan Wichers, Gabriel Schubiner, Ruby Lee, Jindong Chen, Blaise Agüera y Arcas

Our methodology is designed to leverage visual, linguistic and domain-specific features in user interaction traces to pre-train generic feature representations of UIs and their components.

Retrieval

RL agents Implicitly Learning Human Preferences

1 code implementation • 14 Feb 2020 • Nevan Wichers

In the real world, RL agents should be rewarded for fulfilling human preferences.

Resolving Spurious Correlations in Causal Models of Environments via Interventions

no code implementations • 12 Feb 2020 • Sergei Volodin, Nevan Wichers, Jeremy Nixon

We consider the problem of inferring a causal model of a reinforcement learning environment and we propose a method to deal with spurious correlations.

Decision Making

Resolving Referring Expressions in Images With Labeled Elements

no code implementations • 24 Oct 2018 • Nevan Wichers, Dilek Hakkani-Tur, Jindong Chen

Images may have elements containing text and a bounding box associated with them, for example, text identified via optical character recognition on a computer screen image, or a natural image with labeled objects.

Optical Character Recognition (OCR) +1

Hierarchical Long-term Video Prediction without Supervision

no code implementations • ICML 2018 • Nevan Wichers, Ruben Villegas, Dumitru Erhan, Honglak Lee

Much of recent research has been devoted to video prediction and generation, yet most of the previous works have demonstrated only limited success in generating videos on short-term horizons.

Video Prediction

Unsupervised Hierarchical Video Prediction

no code implementations • ICLR 2018 • Nevan Wichers, Dumitru Erhan, Honglak Lee

Much recent research has been devoted to video prediction and generation, but mostly for short-scale time horizons.

Video Prediction
