Search Results for author: Chris Olah

Found 18 papers, 10 papers with code

In-context Learning and Induction Heads

no code implementations 24 Sep 2022 Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, Nova DasSarma, Tom Henighan, Ben Mann, Amanda Askell, Yuntao Bai, Anna Chen, Tom Conerly, Dawn Drain, Deep Ganguli, Zac Hatfield-Dodds, Danny Hernandez, Scott Johnston, Andy Jones, Jackson Kernion, Liane Lovitt, Kamal Ndousse, Dario Amodei, Tom Brown, Jack Clark, Jared Kaplan, Sam McCandlish, Chris Olah

In this work, we present preliminary and indirect evidence for a hypothesis that induction heads might constitute the mechanism for the majority of all "in-context learning" in large transformer models (i.e. decreasing loss at increasing token indices).

Multimodal Neurons in Artificial Neural Networks

1 code implementation Distill 2021 Gabriel Goh, Nick Cammarata, Chelsea Voss, Shan Carter, Michael Petrov, Ludwig Schubert, Alec Radford, Chris Olah

It’s the fact that you plug visual information into the rich tapestry of memory that brings it to life.

Visualizing Weights

no code implementations Distill 2021 Chelsea Voss, Nick Cammarata, Gabriel Goh, Michael Petrov, Ludwig Schubert, Ben Egan, Swee Kiat Lim, Chris Olah

Trying to understand artificial neural networks also has a lot in common with neuroscience, which tries to understand biological neural networks.

High-Low Frequency Detectors

no code implementations Distill 2021 Ludwig Schubert, Chelsea Voss, Nick Cammarata, Gabriel Goh, Chris Olah

Yet, when systematically characterizing the early layers of InceptionV1, we found a full fifteen neurons of mixed3a that appear to detect a high frequency pattern on one side, and a low frequency pattern on the other.

Vocal Bursts Intensity Prediction

Curve Detectors

no code implementations Distill 2020 Nick Cammarata, Gabriel Goh, Shan Carter, Ludwig Schubert, Michael Petrov, Chris Olah

Every vision model we've explored in detail contains neurons which detect curves.

Thread: Circuits

no code implementations Distill 2020 Nick Cammarata, Shan Carter, Gabriel Goh, Chris Olah, Michael Petrov, Ludwig Schubert, Chelsea Voss, Ben Egan, Swee Kiat Lim

To facilitate exploration of this direction, Distill is inviting a “thread” of short articles on circuits, interspersed with critical commentary by experts in adjacent fields.

Feature Visualization

1 code implementation Distill 2020 Chris Olah, Alexander Mordvintsev, Ludwig Schubert

There is a growing sense that neural networks need to be interpretable to humans.

Activation Atlas

1 code implementation Distill 2019 Shan Carter, Zan Armstrong, Ludwig Schubert, Ian Johnson, Chris Olah

By using feature inversion to visualize millions of activations from an image classification network, we create an explorable activation atlas of features the network has learned which can reveal how the network typically represents some concepts.

General Classification, Image Classification

Differentiable Image Parameterizations

2 code implementations Distill 2018 Alexander Mordvintsev, Nicola Pezzotti, Ludwig Schubert, Chris Olah

Typically, we parameterize the input image as the RGB values of each pixel, but that isn’t the only way.

Image Generation

The Building Blocks of Interpretability

1 code implementation Distill 2018 Chris Olah, Arvind Satyanarayan, Ian Johnson, Shan Carter, Ludwig Schubert, Katherine Ye, Alexander Mordvintsev

In this article, we treat existing interpretability methods as fundamental and composable building blocks for rich user interfaces.

Concrete Problems in AI Safety

1 code implementation 21 Jun 2016 Dario Amodei, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, Dan Mané

Rapid progress in machine learning and artificial intelligence (AI) has brought increasing attention to the potential impacts of AI technologies on society.

BIG-bench Machine Learning, Safe Exploration
