Search Results for author: Alex Tamkin

Found 24 papers, 14 papers with code

Bayesian Preference Elicitation with Language Models

no code implementations • 8 Mar 2024 • Kunal Handa, Yarin Gal, Ellie Pavlick, Noah Goodman, Jacob Andreas, Alex Tamkin, Belinda Z. Li

We introduce OPEN (Optimal Preference Elicitation with Natural language), a framework that uses Bayesian Optimal Experimental Design (BOED) to guide the choice of informative questions and an LM to extract features and translate abstract BOED queries into natural language questions.

Experimental Design

Evaluating and Mitigating Discrimination in Language Model Decisions

no code implementations • 6 Dec 2023 • Alex Tamkin, Amanda Askell, Liane Lovitt, Esin Durmus, Nicholas Joseph, Shauna Kravec, Karina Nguyen, Jared Kaplan, Deep Ganguli

We present a method for proactively evaluating the potential discriminatory impact of LMs in a wide range of use cases, including hypothetical use cases where they have not yet been deployed.

Language Modelling, Prompt Engineering

Codebook Features: Sparse and Discrete Interpretability for Neural Networks

1 code implementation • 26 Oct 2023 • Alex Tamkin, Mohammad Taufeeque, Noah D. Goodman

In this setting, our approach overcomes the superposition problem by assigning states to distinct codes, and we find that we can make the neural network behave as if it is in a different state by activating the code for that state.

Quantization

Social Contract AI: Aligning AI Assistants with Implicit Group Norms

1 code implementation • 26 Oct 2023 • Jan-Philipp Fränken, Sam Kwok, Peixuan Ye, Kanishk Gandhi, Dilip Arumugam, Jared Moore, Alex Tamkin, Tobias Gerstenberg, Noah D. Goodman

We explore the idea of aligning an AI assistant by inverting a model of users' (unknown) preferences from observed interactions.

Eliciting Human Preferences with Language Models

1 code implementation • 17 Oct 2023 • Belinda Z. Li, Alex Tamkin, Noah Goodman, Jacob Andreas

Language models (LMs) can be directed to perform target tasks by using labeled examples or natural language prompts.

Studying Large Language Model Generalization with Influence Functions

2 code implementations • 7 Aug 2023 • Roger Grosse, Juhan Bae, Cem Anil, Nelson Elhage, Alex Tamkin, Amirhossein Tajdini, Benoit Steiner, Dustin Li, Esin Durmus, Ethan Perez, Evan Hubinger, Kamilė Lukošiūtė, Karina Nguyen, Nicholas Joseph, Sam McCandlish, Jared Kaplan, Samuel R. Bowman

When trying to gain better visibility into a machine learning model in order to understand and mitigate the associated risks, a potentially valuable source of evidence is: which training examples most contribute to a given behavior?

counterfactual, Language Modelling +2

Towards Measuring the Representation of Subjective Global Opinions in Language Models

no code implementations • 28 Jun 2023 • Esin Durmus, Karina Nguyen, Thomas I. Liao, Nicholas Schiefer, Amanda Askell, Anton Bakhtin, Carol Chen, Zac Hatfield-Dodds, Danny Hernandez, Nicholas Joseph, Liane Lovitt, Sam McCandlish, Orowa Sikder, Alex Tamkin, Janel Thamkul, Jared Kaplan, Jack Clark, Deep Ganguli

We first build a dataset, GlobalOpinionQA, comprised of questions and answers from cross-national surveys designed to capture diverse opinions on global issues across different countries.

Operationalising the Definition of General Purpose AI Systems: Assessing Four Approaches

no code implementations • 5 Jun 2023 • Risto Uuk, Carlos Ignacio Gutierrez, Alex Tamkin

The European Union's Artificial Intelligence (AI) Act is set to be a landmark legal instrument for regulating AI technology.

BenchMD: A Benchmark for Unified Learning on Medical Images and Sensors

1 code implementation • 17 Apr 2023 • Kathryn Wantlin, Chenwei Wu, Shih-Cheng Huang, Oishi Banerjee, Farah Dadabhoy, Veeral Vipin Mehta, Ryan Wonhee Han, Fang Cao, Raja R. Narayan, Errol Colak, Adewole Adamson, Laura Heacock, Geoffrey H. Tison, Alex Tamkin, Pranav Rajpurkar

Finally, we evaluate performance on out-of-distribution data collected at different hospitals than the training data, representing naturally-occurring distribution shifts that frequently degrade the performance of medical AI models.

Self-Supervised Learning

Multispectral Contrastive Learning with Viewmaker Networks

1 code implementation • 11 Feb 2023 • Jasmine Bayrooti, Noah Goodman, Alex Tamkin

Contrastive learning methods have been applied to a range of domains and modalities by training models to identify similar "views" of data points.

Contrastive Learning, Self-Supervised Learning

Task Ambiguity in Humans and Language Models

no code implementations • 20 Dec 2022 • Alex Tamkin, Kunal Handa, Avash Shrestha, Noah Goodman

We investigate how both humans and models behave in the face of such task ambiguity by proposing AmbiBench, a new benchmark of six ambiguously-specified classification tasks.

Active Learning Helps Pretrained Models Learn the Intended Task

1 code implementation • 18 Apr 2022 • Alex Tamkin, Dat Nguyen, Salil Deshpande, Jesse Mu, Noah Goodman

Models can fail in unpredictable ways during deployment due to task ambiguity, when multiple behaviors are consistent with the provided training data.

Active Learning

Oolong: Investigating What Makes Transfer Learning Hard with Controlled Studies

1 code implementation • 24 Feb 2022 • Zhengxuan Wu, Alex Tamkin, Isabel Papadimitriou

When we transfer a pretrained language model to a new language, there are many axes of variation that change at once.

Cross-Lingual Transfer, Language Modelling +1

Tradeoffs Between Contrastive and Supervised Learning: An Empirical Study

no code implementations • 10 Dec 2021 • Ananya Karthik, Mike Wu, Noah Goodman, Alex Tamkin

Contrastive learning has made considerable progress in computer vision, outperforming supervised pretraining on a range of downstream datasets.

Contrastive Learning, Image Classification

DABS: A Domain-Agnostic Benchmark for Self-Supervised Learning

1 code implementation • 23 Nov 2021 • Alex Tamkin, Vincent Liu, Rongfei Lu, Daniel Fein, Colin Schultz, Noah Goodman

Self-supervised learning algorithms, including BERT and SimCLR, have enabled significant strides in fields like natural language processing, computer vision, and speech processing.

Self-Supervised Learning

Pretrained models are active learners

no code implementations • 29 Sep 2021 • Alex Tamkin, Dat Nguyen, Salil Deshpande, Jesse Mu, Noah Goodman

An important barrier to the safe deployment of machine learning systems is the risk of task ambiguity, where multiple behaviors are consistent with the provided examples.

Active Learning

C5T5: Controllable Generation of Organic Molecules with Transformers

1 code implementation • 23 Aug 2021 • Daniel Rothchild, Alex Tamkin, Julie Yu, Ujval Misra, Joseph Gonzalez

Methods for designing organic materials with desired properties have high potential impact across fields such as medicine, renewable energy, petrochemical engineering, and agriculture.

Drug Discovery, molecular representation

On the Opportunities and Risks of Foundation Models

2 code implementations • 16 Aug 2021 • Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, Aditi Raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang

AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks.

Transfer Learning

Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models

no code implementations • 4 Feb 2021 • Alex Tamkin, Miles Brundage, Jack Clark, Deep Ganguli

On October 14th, 2020, researchers from OpenAI, the Stanford Institute for Human-Centered Artificial Intelligence, and other universities convened to discuss open research questions surrounding GPT-3, the largest publicly-disclosed dense language model at the time.

Language Modelling, Philosophy

Viewmaker Networks: Learning Views for Unsupervised Representation Learning

1 code implementation • ICLR 2021 • Alex Tamkin, Mike Wu, Noah Goodman

However, designing these views requires considerable trial and error by human experts, hindering widespread adoption of unsupervised representation learning methods across domains and modalities.

Contrastive Learning, Representation Learning

Being Optimistic to Be Conservative: Quickly Learning a CVaR Policy

no code implementations • 5 Nov 2019 • Ramtin Keramati, Christoph Dann, Alex Tamkin, Emma Brunskill

While maximizing expected return is the goal in most reinforcement learning approaches, risk-sensitive objectives such as conditional value at risk (CVaR) are more suitable for many high-stakes applications.
