Search Results for author: Lewis Hammond

Found 13 papers, 5 papers with code

Foundational Challenges in Assuring Alignment and Safety of Large Language Models

1 code implementation • 15 Apr 2024 • Usman Anwar, Abulhair Saparov, Javier Rando, Daniel Paleka, Miles Turpin, Peter Hase, Ekdeep Singh Lubana, Erik Jenner, Stephen Casper, Oliver Sourbut, Benjamin L. Edelman, Zhaowei Zhang, Mario Günther, Anton Korinek, Jose Hernandez-Orallo, Lewis Hammond, Eric Bigelow, Alexander Pan, Lauro Langosco, Tomasz Korbak, Heidi Zhang, Ruiqi Zhong, Seán Ó hÉigeartaigh, Gabriel Recchia, Giulio Corsi, Alan Chan, Markus Anderljung, Lilian Edwards, Yoshua Bengio, Danqi Chen, Samuel Albanie, Tegan Maharaj, Jakob Foerster, Florian Tramer, He He, Atoosa Kasirzadeh, Yejin Choi, David Krueger

This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs).

Cooperation and Control in Delegation Games

no code implementations • 24 Feb 2024 • Oliver Sourbut, Lewis Hammond, Harriet Wood

Many settings of interest involving humans and machines -- from virtual personal assistants to autonomous vehicles -- can naturally be modelled as principals (humans) delegating to agents (machines), which then interact with each other on their principals' behalf.

Autonomous Vehicles

Secret Collusion Among Generative AI Agents

no code implementations • 12 Feb 2024 • Sumeet Ramesh Motwani, Mikhail Baranchuk, Martin Strohmeier, Vijay Bolina, Philip H. S. Torr, Lewis Hammond, Christian Schroeder de Witt

In this paper, we comprehensively formalise the problem of secret collusion in systems of generative AI agents by drawing on relevant concepts from both the AI and security literature.

Visibility into AI Agents

no code implementations • 23 Jan 2024 • Alan Chan, Carson Ezell, Max Kaufmann, Kevin Wei, Lewis Hammond, Herbie Bradley, Emma Bluemke, Nitarshan Rajkumar, David Krueger, Noam Kolt, Lennart Heim, Markus Anderljung

Increased delegation of commercial, scientific, governmental, and personal activities to AI agents -- systems capable of pursuing complex goals with limited supervision -- may exacerbate existing societal risks and introduce new risks.

Informativeness

Welfare Diplomacy: Benchmarking Language Model Cooperation

1 code implementation • 13 Oct 2023 • Gabriel Mukobi, Hannah Erlebach, Niklas Lauffer, Lewis Hammond, Alan Chan, Jesse Clifton

The growing capabilities and increasingly widespread deployment of AI systems necessitate robust benchmarks for measuring their cooperative capabilities.

Benchmarking • Language Modelling

On Imperfect Recall in Multi-Agent Influence Diagrams

no code implementations • 11 Jul 2023 • James Fox, Matt MacDermott, Lewis Hammond, Paul Harrenstein, Alessandro Abate, Michael Wooldridge

Multi-agent influence diagrams (MAIDs) are a popular game-theoretic model based on Bayesian networks.

Reasoning about Causality in Games

no code implementations • 5 Jan 2023 • Lewis Hammond, James Fox, Tom Everitt, Ryan Carey, Alessandro Abate, Michael Wooldridge

Regarding question iii), we describe correspondences between causal games and other formalisms, and explain how causal games can be used to answer queries that other causal or game-theoretic models do not support.

Lexicographic Multi-Objective Reinforcement Learning

1 code implementation • 28 Dec 2022 • Joar Skalse, Lewis Hammond, Charlie Griffin, Alessandro Abate

In this work we introduce reinforcement learning techniques for solving lexicographic multi-objective problems.

Multi-Objective Reinforcement Learning • reinforcement-learning

Bounded Robustness in Reinforcement Learning via Lexicographic Objectives

no code implementations • 30 Sep 2022 • Daniel Jarne Ornia, Licio Romao, Lewis Hammond, Manuel Mazo Jr., Alessandro Abate

Policy robustness in Reinforcement Learning may not be desirable at any cost: the alterations caused by robustness requirements from otherwise optimal policies should be explainable, quantifiable and formally verifiable.

reinforcement-learning • Reinforcement Learning (RL)

Rational Verification for Probabilistic Systems

no code implementations • 19 Jul 2021 • Julian Gutierrez, Lewis Hammond, Anthony W. Lin, Muhammad Najib, Michael Wooldridge

Rational verification is the problem of determining which temporal logic properties will hold in a multi-agent system, under the assumption that agents in the system act rationally, by choosing strategies that collectively form a game-theoretic equilibrium.

Equilibrium Refinements for Multi-Agent Influence Diagrams: Theory and Practice

1 code implementation • 9 Feb 2021 • Lewis Hammond, James Fox, Tom Everitt, Alessandro Abate, Michael Wooldridge

Multi-agent influence diagrams (MAIDs) are a popular form of graphical model that, for certain classes of games, have been shown to offer key complexity and explainability advantages over traditional extensive form game (EFG) representations.

Multi-Agent Reinforcement Learning with Temporal Logic Specifications

1 code implementation • 1 Feb 2021 • Lewis Hammond, Alessandro Abate, Julian Gutierrez, Michael Wooldridge

In this paper, we study the problem of learning to satisfy temporal logic specifications with a group of agents in an unknown environment, which may exhibit probabilistic behaviour.

Multi-agent Reinforcement Learning • reinforcement-learning +1

Learning Tractable Probabilistic Models for Moral Responsibility and Blame

no code implementations • 8 Oct 2018 • Lewis Hammond, Vaishak Belle

From the viewpoint of such systems, the urgent questions are: (a) How can models of moral scenarios and blameworthiness be extracted and learnt automatically from data?

Decision Making • Management +2
