Model Editing
105 papers with code • 0 benchmarks • 1 dataset
Most implemented papers
Locating and Editing Factual Associations in GPT
To test our hypothesis that these computations correspond to factual association recall, we modify feed-forward weights to update specific factual associations using Rank-One Model Editing (ROME).
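The heart of ROME is a rank-one update that rewrites what a feed-forward weight returns for one key vector while changing the matrix as little as possible. The minimal sketch below shows that least-squares form only; it omits ROME's covariance statistics and causal-tracing step, and `k_star` / `v_star` stand in for the key and value vectors the paper derives.

```python
import torch

def rank_one_edit(W: torch.Tensor, k_star: torch.Tensor, v_star: torch.Tensor) -> torch.Tensor:
    """Minimal-norm rank-one update so the edited weight maps k_star to v_star.

    W: (d_out, d_in) feed-forward weight, treated as a linear associative memory.
    k_star: (d_in,) key vector for the fact being rewritten.
    v_star: (d_out,) desired value vector encoding the new association.
    """
    residual = v_star - W @ k_star                               # what the current weight gets wrong
    update = torch.outer(residual, k_star) / (k_star @ k_star)   # rank-one correction
    return W + update

# After the edit, W_new @ k_star equals v_star (up to numerical precision),
# while inputs orthogonal to k_star are left unchanged.
```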
Editing Large Language Models: Problems, Methods, and Opportunities
Our objective is to provide insight into the effectiveness and feasibility of each editing technique, helping the community choose the most appropriate method for a given task or context.
Fast Model Editing at Scale
To enable easy post-hoc editing at scale, we propose Model Editor Networks using Gradient Decomposition (MEND), a collection of small auxiliary editing networks that use a single desired input-output pair to make fast, local edits to a pre-trained model's behavior.
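The gradient-decomposition idea can be sketched as follows: the weight gradient of a linear layer factors into an outer product of the layer input and the output gradient, so an editor network only needs to transform two vectors rather than a full weight matrix. The class and dimensions below are an illustrative toy, not MEND's actual architecture (which, among other things, shares editor parameters across layers).

```python
import torch
import torch.nn as nn

class GradientDecompositionEditor(nn.Module):
    """Toy editor: transform the two rank-one factors of a linear layer's
    weight gradient, then recombine them into a proposed weight edit."""

    def __init__(self, d_in: int, d_out: int, hidden: int = 128):
        super().__init__()
        self.edit_x = nn.Sequential(nn.Linear(d_in, hidden), nn.ReLU(), nn.Linear(hidden, d_in))
        self.edit_delta = nn.Sequential(nn.Linear(d_out, hidden), nn.ReLU(), nn.Linear(hidden, d_out))

    def forward(self, x: torch.Tensor, delta: torch.Tensor) -> torch.Tensor:
        # x: (d_in,) layer input; delta: (d_out,) gradient of the loss w.r.t. the layer output.
        x_tilde = self.edit_x(x)
        delta_tilde = self.edit_delta(delta)
        return torch.outer(delta_tilde, x_tilde)   # (d_out, d_in) proposed weight edit

# A fast edit is then applied as, e.g., W.data -= lr * editor(x, delta).
```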
Sparse Autoencoders Find Highly Interpretable Features in Language Models
One hypothesised cause of polysemanticity is superposition, where neural networks represent more features than they have neurons by assigning features to an overcomplete set of directions in activation space, rather than to individual neurons.
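A minimal sketch of the kind of sparse autoencoder the paper trains on language-model activations: the dictionary is overcomplete (n_features > d_model), and an L1 penalty pushes each activation to decompose into a few directions. The hyperparameters and exact loss weighting here are assumptions.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Overcomplete autoencoder over model activations: each activation is
    reconstructed as a sparse combination of learned dictionary directions."""

    def __init__(self, d_model: int, n_features: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, n_features)
        self.decoder = nn.Linear(n_features, d_model, bias=False)

    def forward(self, acts: torch.Tensor):
        codes = torch.relu(self.encoder(acts))   # sparse, non-negative feature coefficients
        recon = self.decoder(codes)              # reconstruction from dictionary directions
        return recon, codes

def sae_loss(recon, acts, codes, l1_coeff: float = 1e-3):
    # Reconstruction error plus an L1 sparsity penalty on the feature codes.
    return torch.mean((recon - acts) ** 2) + l1_coeff * codes.abs().mean()
```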
pyvene: A Library for Understanding and Improving PyTorch Models via Interventions
Interventions on model-internal states are fundamental operations in many areas of AI, including model editing, steering, robustness, and interpretability.
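pyvene defines its own intervention abstractions; as a rough, library-agnostic illustration of the underlying operation, a PyTorch forward hook can shift a module's hidden state along a fixed direction. The module path and steering vector below are hypothetical.

```python
import torch

def add_steering_hook(module: torch.nn.Module, direction: torch.Tensor, alpha: float = 1.0):
    """Shift `module`'s output along a fixed direction: one of the simplest
    interventions on model-internal state (an activation addition)."""
    def hook(mod, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + alpha * direction   # broadcast over batch and sequence dims
        return (hidden,) + output[1:] if isinstance(output, tuple) else hidden
    return module.register_forward_hook(hook)

# Hypothetical usage on a GPT-2-style model:
#   handle = add_steering_hook(model.transformer.h[10], steering_vec, alpha=4.0)
#   ... run generation ...
#   handle.remove()   # undo the intervention
```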
Interpretability, Then What? Editing Machine Learning Models to Reflect Human Knowledge and Values
Machine learning (ML) interpretability techniques can reveal undesirable patterns in data that models exploit to make predictions, potentially causing harms once deployed.
Editing Implicit Assumptions in Text-to-Image Diffusion Models
Our Text-to-Image Model Editing method, TIME for short, receives a pair of inputs: a "source" under-specified prompt for which the model makes an implicit assumption (e.g., "a pack of roses"), and a "destination" prompt that describes the same setting, but with a specified desired attribute (e.g., "a pack of blue roses").
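TIME applies a closed-form, ridge-regularized edit to the cross-attention projections that map text embeddings to keys and values, so that the source prompt produces the values the destination prompt would have. The single-vector sketch below captures that least-squares form; in the paper the edit aggregates over all source-prompt tokens and cross-attention layers, and the variable names are mine.

```python
import torch

def time_style_edit(W: torch.Tensor, c_src: torch.Tensor, c_dst: torch.Tensor,
                    lam: float = 0.1) -> torch.Tensor:
    """Closed-form projection edit in the spirit of TIME.

    W: (d_out, d_emb) cross-attention key or value projection matrix.
    c_src / c_dst: (d_emb,) embeddings of the under-specified and specified prompts.
    lam: ridge coefficient keeping the edited matrix close to the original.
    """
    d = W.shape[1]
    v_star = W @ c_dst                                   # target values from the destination prompt
    A = lam * torch.eye(d) + torch.outer(c_src, c_src)   # ridge-regularized normal matrix
    B = lam * W + torch.outer(v_star, c_src)
    return B @ torch.linalg.inv(A)                       # minimizer of ||W'c_src - v*||^2 + lam||W' - W||^2
```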
A Comprehensive Study of Knowledge Editing for Large Language Models
In this paper, we first define the knowledge editing problem and then provide a comprehensive review of cutting-edge approaches.
Model Editing as a Robust and Denoised variant of DPO: A Case Study on Toxicity
These tuning-based methods require large-scale preference data for training and are susceptible to noisy preference data.
Latent Adversarial Training Improves Robustness to Persistent Harmful Behaviors in LLMs
The LLM red-teaming literature has produced a wide variety of 'jailbreaking' techniques to elicit harmful text from models that were fine-tuned to be harmless.
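Latent adversarial training perturbs hidden activations rather than inputs: an inner loop searches for a bounded perturbation that maximizes the training loss, and the model is then updated to behave well under that perturbation. The sketch below assumes a PyTorch model whose `layer` returns a plain tensor, a known activation shape, and a generic `compute_loss`; none of these mirror the paper's exact setup.

```python
import torch

def lat_step(model, layer, batch, compute_loss, optimizer, act_shape,
             eps: float = 1.0, inner_steps: int = 5, inner_lr: float = 0.1):
    """One latent adversarial training step (sketch, under the assumptions above)."""
    # delta perturbs `layer`'s output on every forward pass below.
    delta = torch.zeros(act_shape, requires_grad=True)
    handle = layer.register_forward_hook(lambda mod, inp, out: out + delta)
    try:
        # Inner loop: gradient ascent on the loss w.r.t. the latent perturbation.
        for _ in range(inner_steps):
            loss = compute_loss(model, batch)
            (grad,) = torch.autograd.grad(loss, delta)
            with torch.no_grad():
                delta += inner_lr * grad
                delta.clamp_(-eps, eps)   # keep the perturbation bounded
        # Outer step: train the model to keep the loss low despite the perturbation.
        optimizer.zero_grad()
        compute_loss(model, batch).backward()
        optimizer.step()
    finally:
        handle.remove()   # always detach the perturbation hook
```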