Search Results for author: Aengus Lynch

Found 4 papers, 2 papers with code

Eight Methods to Evaluate Robust Unlearning in LLMs

no code implementations • 26 Feb 2024 • Aengus Lynch, Phillip Guo, Aidan Ewart, Stephen Casper, Dylan Hadfield-Menell

Machine unlearning can be useful for removing harmful capabilities and memorized text from large language models (LLMs), but there are not yet standardized methods for rigorously evaluating it.

Machine Unlearning

Towards Automated Circuit Discovery for Mechanistic Interpretability

2 code implementations • NeurIPS 2023 • Arthur Conmy, Augustine N. Mavor-Parker, Aengus Lynch, Stefan Heimersheim, Adrià Garriga-Alonso

For example, the ACDC algorithm rediscovered 5/5 of the component types in a circuit in GPT-2 Small that computes the Greater-Than operation.

Spawrious: A Benchmark for Fine Control of Spurious Correlation Biases

2 code implementations • 9 Mar 2023 • Aengus Lynch, Gbètondji J-S Dovonon, Jean Kaddour, Ricardo Silva

The problem of spurious correlations (SCs) arises when a classifier relies on non-predictive features that happen to be correlated with the labels in the training data.

Image Captioning, Image Classification

Causal Machine Learning: A Survey and Open Problems

no code implementations • 30 Jun 2022 • Jean Kaddour, Aengus Lynch, Qi Liu, Matt J. Kusner, Ricardo Silva

Causal Machine Learning (CausalML) is an umbrella term for machine learning methods that formalize the data-generation process as a structural causal model (SCM).

BIG-bench Machine Learning, Fairness
