Search Results for author: Iason Gabriel

Found 13 papers, 1 paper with code

Accounting for Offensive Speech as a Practice of Resistance

no code implementations • NAACL (WOAH) 2022 • Mark Diaz, Razvan Amironesei, Laura Weidinger, Iason Gabriel

Tasks such as toxicity detection, hate speech detection, and online harassment detection have been developed for identifying interactions involving offensive speech.

Hate Speech Detection • Philosophy

Sociotechnical Safety Evaluation of Generative AI Systems

no code implementations • 18 Oct 2023 • Laura Weidinger, Maribeth Rauh, Nahema Marchal, Arianna Manzini, Lisa Anne Hendricks, Juan Mateos-Garcia, Stevie Bergman, Jackie Kay, Conor Griffin, Ben Bariach, Iason Gabriel, Verena Rieser, William Isaac

First, we propose a three-layered framework that takes a structured, sociotechnical approach to evaluating these risks.

Model evaluation for extreme risks

no code implementations • 24 May 2023 • Toby Shevlane, Sebastian Farquhar, Ben Garfinkel, Mary Phuong, Jess Whittlestone, Jade Leung, Daniel Kokotajlo, Nahema Marchal, Markus Anderljung, Noam Kolt, Lewis Ho, Divya Siddarth, Shahar Avin, Will Hawkins, Been Kim, Iason Gabriel, Vijay Bolina, Jack Clark, Yoshua Bengio, Paul Christiano, Allan Dafoe

Current approaches to building general-purpose AI systems tend to produce systems with both beneficial and harmful capabilities.

Manifestations of Xenophobia in AI Systems

no code implementations • 15 Dec 2022 • Nenad Tomasev, Jonathan Leader Maynard, Iason Gabriel

Xenophobia is one of the key drivers of marginalisation, discrimination, and conflict, yet many prominent machine learning (ML) fairness frameworks fail to comprehensively measure or mitigate the resulting xenophobic harms.

Fairness • Recommendation Systems

A Human Rights-Based Approach to Responsible AI

no code implementations • 6 Oct 2022 • Vinodkumar Prabhakaran, Margaret Mitchell, Timnit Gebru, Iason Gabriel

Research on fairness, accountability, transparency and ethics of AI-based interventions in society has gained much-needed momentum in recent years.

Ethics • Fairness

Improving alignment of dialogue agents via targeted human judgements

no code implementations • 28 Sep 2022 • Amelia Glaese, Nat McAleese, Maja Trębacz, John Aslanides, Vlad Firoiu, Timo Ewalds, Maribeth Rauh, Laura Weidinger, Martin Chadwick, Phoebe Thacker, Lucy Campbell-Gillingham, Jonathan Uesato, Po-Sen Huang, Ramona Comanescu, Fan Yang, Abigail See, Sumanth Dathathri, Rory Greig, Charlie Chen, Doug Fritz, Jaume Sanchez Elias, Richard Green, Soňa Mokrá, Nicholas Fernando, Boxi Wu, Rachel Foley, Susannah Young, Iason Gabriel, William Isaac, John Mellor, Demis Hassabis, Koray Kavukcuoglu, Lisa Anne Hendricks, Geoffrey Irving

We present Sparrow, an information-seeking dialogue agent trained to be more helpful, correct, and harmless compared to prompted language model baselines.

Language Modelling

In conversation with Artificial Intelligence: aligning language models with human values

no code implementations • 1 Sep 2022 • Atoosa Kasirzadeh, Iason Gabriel

Furthermore, we explore how these norms can be used to align conversational agents with human values across a range of different discursive domains.

Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models

no code implementations • 16 Jun 2022 • Maribeth Rauh, John Mellor, Jonathan Uesato, Po-Sen Huang, Johannes Welbl, Laura Weidinger, Sumanth Dathathri, Amelia Glaese, Geoffrey Irving, Iason Gabriel, William Isaac, Lisa Anne Hendricks

Large language models produce human-like text that drives a growing number of applications.

Benchmarking • Language Modelling • +1

Ethical and social risks of harm from Language Models

no code implementations • 8 Dec 2021 • Laura Weidinger, John Mellor, Maribeth Rauh, Conor Griffin, Jonathan Uesato, Po-Sen Huang, Myra Cheng, Mia Glaese, Borja Balle, Atoosa Kasirzadeh, Zac Kenton, Sasha Brown, Will Hawkins, Tom Stepleton, Courtney Biles, Abeba Birhane, Julia Haas, Laura Rimell, Lisa Anne Hendricks, William Isaac, Sean Legassick, Geoffrey Irving, Iason Gabriel

We discuss the points of origin of different risks and point to potential mitigation approaches.

Misinformation

Scaling Language Models: Methods, Analysis & Insights from Training Gopher

2 code implementations • NA 2021 • Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, Eliza Rutherford, Tom Hennigan, Jacob Menick, Albin Cassirer, Richard Powell, George van den Driessche, Lisa Anne Hendricks, Maribeth Rauh, Po-Sen Huang, Amelia Glaese, Johannes Welbl, Sumanth Dathathri, Saffron Huang, Jonathan Uesato, John Mellor, Irina Higgins, Antonia Creswell, Nat McAleese, Amy Wu, Erich Elsen, Siddhant Jayakumar, Elena Buchatskaya, David Budden, Esme Sutherland, Karen Simonyan, Michela Paganini, Laurent Sifre, Lena Martens, Xiang Lorraine Li, Adhiguna Kuncoro, Aida Nematzadeh, Elena Gribovskaya, Domenic Donato, Angeliki Lazaridou, Arthur Mensch, Jean-Baptiste Lespiau, Maria Tsimpoukelli, Nikolai Grigorev, Doug Fritz, Thibault Sottiaux, Mantas Pajarskas, Toby Pohlen, Zhitao Gong, Daniel Toyama, Cyprien de Masson d'Autume, Yujia Li, Tayfun Terzi, Vladimir Mikulik, Igor Babuschkin, Aidan Clark, Diego de Las Casas, Aurelia Guy, Chris Jones, James Bradbury, Matthew Johnson, Blake Hechtman, Laura Weidinger, Iason Gabriel, William Isaac, Ed Lockhart, Simon Osindero, Laura Rimell, Chris Dyer, Oriol Vinyals, Kareem Ayoub, Jeff Stanway, Lorrayne Bennett, Demis Hassabis, Koray Kavukcuoglu, Geoffrey Irving

Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and understand the world.

Ranked #1 on Language Modelling on StackExchange

Abstract Algebra • Anachronisms • +133

755

Alignment of Language Agents

no code implementations • 26 Mar 2021 • Zachary Kenton, Tom Everitt, Laura Weidinger, Iason Gabriel, Vladimir Mikulik, Geoffrey Irving

For artificial intelligence to be beneficial to humans, the behaviour of AI agents needs to be aligned with what humans want.

Modelling Cooperation in Network Games with Spatio-Temporal Complexity

no code implementations • 13 Feb 2021 • Michiel A. Bakker, Richard Everett, Laura Weidinger, Iason Gabriel, William S. Isaac, Joel Z. Leibo, Edward Hughes

Such systems have local incentives for individuals, whose behavior has an impact on the global outcome for the group.

Management • reinforcement-learning • +1

The Challenge of Value Alignment: from Fairer Algorithms to AI Safety

no code implementations • 15 Jan 2021 • Iason Gabriel, Vafa Ghazavi

This paper addresses the question of how to align AI systems with human values and situates it within a wider body of thought regarding technology and value.

Fairness • Computers and Society
