Search Results for author: David Bau

Found 44 papers, 29 papers with code

Locating and Editing Factual Associations in Mamba

1 code implementation • 4 Apr 2024 • Arnab Sen Sharma, David Atkinson, David Bau

We investigate the mechanisms of factual recall in the Mamba state space model.

Paper
Code

Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models

1 code implementation • 28 Mar 2024 • Samuel Marks, Can Rager, Eric J. Michaud, Yonatan Belinkov, David Bau, Aaron Mueller

We introduce methods for discovering and applying sparse feature circuits.

Language Modelling

Paper
Code

Model Lakes

no code implementations • 4 Mar 2024 • Koyena Pal, David Bau, Renée J. Miller

And we discuss what principled data management techniques can be brought to bear on the study of large model management.

Management

Paper
Add Code

Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking

no code implementations • 22 Feb 2024 • Nikhil Prakash, Tamar Rott Shaham, Tal Haklay, Yonatan Belinkov, David Bau

We identify the mechanism that enables entity tracking and show that (i) in both the original model and its fine-tuned versions primarily the same circuit implements entity tracking.

Code Generation Instruction Following

Paper
Add Code

Measuring and Controlling Instruction (In)Stability in Language Model Dialogs

1 code implementation • 13 Feb 2024 • Kenneth Li, Tianle Liu, Naomi Bashkansky, David Bau, Fernanda Viégas, Hanspeter Pfister, Martin Wattenberg

System-prompting is a standard tool for customizing language-model chatbots, enabling them to follow a specific instruction.

Chatbot Language Modelling

Paper
Code

Black-Box Access is Insufficient for Rigorous AI Audits

no code implementations • 25 Jan 2024 • Stephen Casper, Carson Ezell, Charlotte Siegmann, Noam Kolt, Taylor Lynn Curtis, Benjamin Bucknall, Andreas Haupt, Kevin Wei, Jérémy Scheurer, Marius Hobbhahn, Lee Sharkey, Satyapriya Krishna, Marvin Von Hagen, Silas Alberti, Alan Chan, Qinyi Sun, Michael Gerovitch, David Bau, Max Tegmark, David Krueger, Dylan Hadfield-Menell

The effectiveness of an audit, however, depends on the degree of system access granted to auditors.

Paper
Add Code

Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models

1 code implementation • 20 Nov 2023 • Rohit Gandikota, Joanna Materzynska, Tingrui Zhou, Antonio Torralba, David Bau

We present a method to create interpretable concept sliders that enable precise control over attributes in image generations from diffusion models.

Image Generation

717

Paper
Code

Testing Language Model Agents Safely in the Wild

no code implementations • 17 Nov 2023 • Silen Naihin, David Atkinson, Marc Green, Merwane Hamadi, Craig Swift, Douglas Schonholtz, Adam Tauman Kalai, David Bau

A prerequisite for safe autonomy-in-the-wild is safe testing-in-the-wild.

Language Modelling

Paper
Add Code

Future Lens: Anticipating Subsequent Tokens from a Single Hidden State

no code implementations • 8 Nov 2023 • Koyena Pal, Jiuding Sun, Andrew Yuan, Byron C. Wallace, David Bau

More concretely, in this paper we ask: Given a hidden (internal) representation of a single token at position $t$ in an input, can we reliably anticipate the tokens that will appear at positions $\geq t + 2$?

Paper
Add Code

Function Vectors in Large Language Models

no code implementations • 23 Oct 2023 • Eric Todd, Millicent L. Li, Arnab Sen Sharma, Aaron Mueller, Byron C. Wallace, David Bau

Using causal mediation analysis on a diverse range of in-context-learning (ICL) tasks, we find that a small number attention heads transport a compact representation of the demonstrated task, which we call a function vector (FV).

In-Context Learning

Paper
Add Code

FIND: A Function Description Benchmark for Evaluating Interpretability Methods

1 code implementation • NeurIPS 2023 • Sarah Schwettmann, Tamar Rott Shaham, Joanna Materzynska, Neil Chowdhury, Shuang Li, Jacob Andreas, David Bau, Antonio Torralba

FIND contains functions that resemble components of trained neural networks, and accompanying descriptions of the kind we seek to generate.

Paper
Code

Unified Concept Editing in Diffusion Models

1 code implementation • 25 Aug 2023 • Rohit Gandikota, Hadas Orgad, Yonatan Belinkov, Joanna Materzyńska, David Bau

Text-to-image models suffer from various safety issues that may limit their suitability for deployment.

Paper
Code

Linearity of Relation Decoding in Transformer Language Models

1 code implementation • 17 Aug 2023 • Evan Hernandez, Arnab Sen Sharma, Tal Haklay, Kevin Meng, Martin Wattenberg, Jacob Andreas, Yonatan Belinkov, David Bau

Linear relation representations may be obtained by constructing a first-order approximation to the LM from a single prompt, and they exist for a variety of factual, commonsense, and linguistic relations.

Relation

Paper
Code

Multimodal Neurons in Pretrained Text-Only Transformers

no code implementations • 3 Aug 2023 • Sarah Schwettmann, Neil Chowdhury, Samuel Klein, David Bau, Antonio Torralba

Language models demonstrate remarkable capacity to generalize representations learned in one modality to downstream tasks in other modalities.

Image Captioning

Paper
Add Code

Discovering Variable Binding Circuitry with Desiderata

no code implementations • 7 Jul 2023 • Xander Davies, Max Nadeau, Nikhil Prakash, Tamar Rott Shaham, David Bau

Recent work has shown that computation in language models may be human-understandable, with successful efforts to localize and intervene on both single-unit features and input-output circuits.

Paper
Add Code

Erasing Concepts from Diffusion Models

2 code implementations • ICCV 2023 • Rohit Gandikota, Joanna Materzynska, Jaden Fiotto-Kaufman, David Bau

We propose a fine-tuning method that can erase a visual concept from a pre-trained diffusion model, given only the name of the style and using negative guidance as a teacher.

Text-based Image Editing

469

Paper
Code

Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task

1 code implementation • 24 Oct 2022 • Kenneth Li, Aspen K. Hopkins, David Bau, Fernanda Viégas, Hanspeter Pfister, Martin Wattenberg

Language models show a surprising range of capabilities, but the source of their apparent competence is unclear.

147

Paper
Code

Mass-Editing Memory in a Transformer

2 code implementations • 13 Oct 2022 • Kevin Meng, Arnab Sen Sharma, Alex Andonian, Yonatan Belinkov, David Bau

Recent work has shown exciting promise in updating large language models with new memories, so as to replace obsolete information or add specialized knowledge.

Language Modelling

376

Paper
Code

Content-Based Search for Deep Generative Models

1 code implementation • 6 Oct 2022 • Daohan Lu, Sheng-Yu Wang, Nupur Kumari, Rohan Agarwal, Mia Tang, David Bau, Jun-Yan Zhu

To address this need, we introduce the task of content-based model search: given a query and a large set of generative models, finding the models that best match the query.

Ranked #1 on Model Description Based Search on Generative Models

Contrastive Learning Image and Sketch based Model Retrieval +4

216

Paper
Code

Rewriting Geometric Rules of a GAN

1 code implementation • 28 Jul 2022 • Sheng-Yu Wang, David Bau, Jun-Yan Zhu

Our method allows a user to create a model that synthesizes endless objects with defined geometric changes, enabling the creation of a new generative model without the burden of curating a large-scale dataset.

172

Paper
Code

Local Relighting of Real Scenes

2 code implementations • 6 Jul 2022 • Audrey Cui, Ali Jahanian, Agata Lapedriza, Antonio Torralba, Shahin Mahdizadehaghdam, Rohit Kumar, David Bau

We introduce the task of local relighting, which changes a photograph of a scene by switching on and off the light sources that are visible within the image.

Image Relighting

Paper
Code

Disentangling visual and written concepts in CLIP

no code implementations • CVPR 2022 • Joanna Materzynska, Antonio Torralba, David Bau

The CLIP network measures the similarity between natural text and images; in this work, we investigate the entanglement of the representation of word images and natural images in its image encoder.

Retrieval

Paper
Add Code

Locating and Editing Factual Associations in GPT

2 code implementations • 10 Feb 2022 • Kevin Meng, David Bau, Alex Andonian, Yonatan Belinkov

To test our hypothesis that these computations correspond to factual association recall, we modify feed-forward weights to update specific factual associations using Rank-One Model Editing (ROME).

counterfactual Model Editing +2

1,176

Paper
Code

Natural Language Descriptions of Deep Visual Features

2 code implementations • 26 Jan 2022 • Evan Hernandez, Sarah Schwettmann, David Bau, Teona Bagashvili, Antonio Torralba, Jacob Andreas

Given a neuron, MILAN generates a description by searching for a natural language string that maximizes pointwise mutual information with the image regions in which the neuron is active.

Attribute

Paper
Code

Editing a classifier by rewriting its prediction rules

1 code implementation • NeurIPS 2021 • Shibani Santurkar, Dimitris Tsipras, Mahalaxmi Elango, David Bau, Antonio Torralba, Aleksander Madry

We present a methodology for modifying the behavior of a classifier by directly rewriting its prediction rules.

Paper
Code

Toward a Visual Concept Vocabulary for GAN Latent Space

1 code implementation • ICCV 2021 • Sarah Schwettmann, Evan Hernandez, David Bau, Samuel Klein, Jacob Andreas, Antonio Torralba

A large body of recent work has identified transformations in the latent spaces of generative adversarial networks (GANs) that consistently and interpretably transform generated images.

Disentanglement

Paper
Code

Natural Language Descriptions of Deep Features

no code implementations • ICLR 2022 • Evan Hernandez, Sarah Schwettmann, David Bau, Teona Bagashvili, Antonio Torralba, Jacob Andreas

Given a neuron, MILAN generates a description by searching for a natural language string that maximizes pointwise mutual information with the image regions in which the neuron is active.

Attribute

Paper
Add Code

Sketch Your Own GAN

1 code implementation • ICCV 2021 • Sheng-Yu Wang, David Bau, Jun-Yan Zhu

In particular, we change the weights of an original GAN model according to user sketches.

Image Generation

709

Paper
Code

Paint by Word

no code implementations • 19 Mar 2021 • Alex Andonian, Sabrina Osmany, Audrey Cui, YeonHwan Park, Ali Jahanian, Antonio Torralba, David Bau

We investigate the problem of zero-shot semantic image painting.

Paper
Add Code

Understanding the Role of Individual Units in a Deep Neural Network

2 code implementations • 10 Sep 2020 • David Bau, Jun-Yan Zhu, Hendrik Strobelt, Agata Lapedriza, Bolei Zhou, Antonio Torralba

Second, we use a similar analytic method to analyze a generative adversarial network (GAN) model trained to generate scenes.

Generative Adversarial Network Image Classification +2

298

Paper
Code

What makes fake images detectable? Understanding properties that generalize

1 code implementation • ECCV 2020 • Lucy Chai, David Bau, Ser-Nam Lim, Phillip Isola

The quality of image generation and manipulation is reaching impressive levels, making it increasingly difficult for a human to distinguish between what is real and what is fake.

Image Generation

127

Paper
Code

Rewriting a Deep Generative Model

3 code implementations • ECCV 2020 • David Bau, Steven Liu, Tongzhou Wang, Jun-Yan Zhu, Antonio Torralba

To address the problem, we propose a formulation in which the desired rule is changed by manipulating a layer of a deep network as a linear associative memory.

709

Paper
Code

Diverse Image Generation via Self-Conditioned GANs

2 code implementations • CVPR 2020 • Steven Liu, Tongzhou Wang, David Bau, Jun-Yan Zhu, Antonio Torralba

We introduce a simple but effective unsupervised method for generating realistic and diverse images.

Clustering Image Generation

154

Paper
Code

Semantic Photo Manipulation with a Generative Image Prior

1 code implementation • 15 May 2020 • David Bau, Hendrik Strobelt, William Peebles, Jonas Wulff, Bolei Zhou, Jun-Yan Zhu, Antonio Torralba

First, it is hard for GANs to precisely reproduce an input image.

992

Paper
Code

Seeing What a GAN Cannot Generate

1 code implementation • ICCV 2019 • David Bau, Jun-Yan Zhu, Jonas Wulff, William Peebles, Hendrik Strobelt, Bolei Zhou, Antonio Torralba

Differences in statistics reveal object classes that are omitted by a GAN.

Semantic Segmentation

184

Paper
Code

Dissecting Pruned Neural Networks

no code implementations • 29 Jun 2019 • Jonathan Frankle, David Bau

Namely, we consider the effect of removing unnecessary structure on the number of hidden units that learn disentangled representations of human-recognizable concepts as identified by network dissection.

Paper
Add Code

Visualizing and Understanding GANs

no code implementations • ICLR Workshop DeepGenStruct 2019 • David Bau, Jun-Yan Zhu, Hendrik Strobelt, Bolei Zhou, Joshua B. Tenenbaum, William T. Freeman, Antonio Torralba

We present an analytic framework to visualize and understand GANs at the unit-, object-, and scene-level.

Object

Paper
Add Code

On the Units of GANs (Extended Abstract)

no code implementations • 29 Jan 2019 • David Bau, Jun-Yan Zhu, Hendrik Strobelt, Bolei Zhou, Joshua B. Tenenbaum, William T. Freeman, Antonio Torralba

We quantify the causal effect of interpretable units by measuring the ability of interventions to control objects in the output.

Paper
Add Code

GAN Dissection: Visualizing and Understanding Generative Adversarial Networks

8 code implementations • ICLR 2019 • David Bau, Jun-Yan Zhu, Hendrik Strobelt, Bolei Zhou, Joshua B. Tenenbaum, William T. Freeman, Antonio Torralba

Then, we quantify the causal effect of interpretable units by measuring the ability of interventions to control objects in the output.

Image Generation Object

1,770

Paper
Code

Interpretable Basis Decomposition for Visual Explanation

1 code implementation • ECCV 2018 • Bolei Zhou, Yiyou Sun, David Bau, Antonio Torralba

Explanations of the decisions made by a deep neural network are important for human end-users to be able to understand and diagnose the trustworthiness of the system.

Paper
Code

Revisiting the Importance of Individual Units in CNNs via Ablation

no code implementations • 7 Jun 2018 • Bolei Zhou, Yiyou Sun, David Bau, Antonio Torralba

We confirm that unit attributes such as class selectivity are a poor predictor for impact on overall accuracy as found previously in recent work \cite{morcos2018importance}.

General Classification

Paper
Add Code

Explaining Explanations: An Overview of Interpretability of Machine Learning

1 code implementation • 31 May 2018 • Leilani H. Gilpin, David Bau, Ben Z. Yuan, Ayesha Bajwa, Michael Specter, Lalana Kagal

There has recently been a surge of work in explanatory artificial intelligence (XAI).

BIG-bench Machine Learning Explainable Artificial Intelligence (XAI) +1

Paper
Code

Interpreting Deep Visual Representations via Network Dissection

2 code implementations • 15 Nov 2017 • Bolei Zhou, David Bau, Aude Oliva, Antonio Torralba

In this work, we describe Network Dissection, a method that interprets networks by providing labels for the units of their deep visual representations.

211

Paper
Code

Network Dissection: Quantifying Interpretability of Deep Visual Representations

1 code implementation • CVPR 2017 • David Bau, Bolei Zhou, Aditya Khosla, Aude Oliva, Antonio Torralba

Given any CNN model, the proposed method draws on a broad data set of visual concepts to score the semantics of hidden units at each intermediate convolutional layer.

211

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.