Search Results for author: Usman Anwar

Found 14 papers, 4 papers with code

The Reality of AI and Biorisk

no code implementations • 2 Dec 2024 • Aidan Peppin, Anka Reuel, Stephen Casper, Elliot Jones, Andrew Strait, Usman Anwar, Anurag Agrawal, Sayash Kapoor, Sanmi Koyejo, Marie Pellat, Rishi Bommasani, Nick Frosst, Sara Hooker

To accurately and confidently answer the question 'could an AI model or system increase biorisk', it is necessary to have both a sound theoretical threat model for how AI models or systems could increase biorisk and a robust method for testing that threat model.

Learning to Forget using Hypernetworks

no code implementations • 1 Dec 2024 • Jose Miguel Lara Rangel, Stefan Schoepf, Jack Foster, David Krueger, Usman Anwar

Machine unlearning is gaining increasing attention as a way to remove adversarial data poisoning attacks from already trained models and to comply with privacy and AI regulations.

Data Poisoning, Machine Unlearning

Comparing Bottom-Up and Top-Down Steering Approaches on In-Context Learning Tasks

no code implementations • 11 Nov 2024 • Madeline Brumley, Joe Kwon, David Krueger, Dmitrii Krasheninnikov, Usman Anwar

A key objective of interpretability research on large language models (LLMs) is to develop methods for robustly steering models toward desired behaviors.

In-Context Learning

Noisy Zero-Shot Coordination: Breaking The Common Knowledge Assumption In Zero-Shot Coordination Games

1 code implementation • 7 Nov 2024 • Usman Anwar, Ashish Pandian, Jia Wan, David Krueger, Jakob Foerster

We show that with NZSC training, RL agents can be trained to coordinate well with novel partners even when the (exact) problem setting of the coordination is not common knowledge.

Meta-Learning, Reinforcement Learning (RL)

Adversarial Robustness of In-Context Learning in Transformers for Linear Regression

no code implementations • 7 Nov 2024 • Usman Anwar, Johannes von Oswald, Louis Kirsch, David Krueger, Spencer Frei

This work investigates the vulnerability of in-context learning in transformers to hijacking attacks, focusing on the setting of linear regression tasks.

Adversarial Robustness, In-Context Learning +1

IDs for AI Systems

no code implementations • 17 Jun 2024 • Alan Chan, Noam Kolt, Peter Wills, Usman Anwar, Christian Schroeder de Witt, Nitarshan Rajkumar, Lewis Hammond, David Krueger, Lennart Heim, Markus Anderljung

AI systems are increasingly pervasive, yet information needed to decide whether and how to engage with them may not exist or be accessible.

Reward Model Ensembles Help Mitigate Overoptimization

2 code implementations • 4 Oct 2023 • Thomas Coste, Usman Anwar, Robert Kirk, David Krueger

Gao et al. (2023) studied this phenomenon in a synthetic human feedback setup with a significantly larger "gold" reward model acting as the true reward (instead of humans) and showed that overoptimization remains a persistent problem regardless of the size of the proxy reward model and training data used.

model, Model Optimization

Domain Generalization for Robust Model-Based Offline Reinforcement Learning

no code implementations • 27 Nov 2022 • Alan Clark, Shoaib Ahmed Siddiqui, Robert Kirk, Usman Anwar, Stephen Chung, David Krueger

Existing offline reinforcement learning (RL) algorithms typically assume that training data is either: 1) generated by a known policy, or 2) of entirely unknown origin.

Domain Generalization, Offline RL +3

Constrained Reinforcement Learning With Learned Constraints

no code implementations • 1 Jan 2021 • Shehryar Malik, Usman Anwar, Alireza Aghasi, Ali Ahmed

In this work, given a reward function and a set of demonstrations from an expert that maximizes this reward function while respecting unknown constraints, we propose a framework to learn the most likely constraints that the expert respects.

reinforcement-learning, Reinforcement Learning +1

Inverse Constrained Reinforcement Learning

1 code implementation • 19 Nov 2020 • Usman Anwar, Shehryar Malik, Alireza Aghasi, Ali Ahmed

However, for the real-world deployment of reinforcement learning (RL), it is critical that RL agents are aware of these constraints, so that they can act safely.

reinforcement-learning, Reinforcement Learning +1

Learning To Solve Differential Equations Across Initial Conditions

no code implementations • ICLR Workshop DeepDiffEq 2019 • Shehryar Malik, Usman Anwar, Ali Ahmed, Alireza Aghasi

Recently, there has been a lot of interest in using neural networks for solving partial differential equations.
