Search Results for author: Alexander Wettig

Found 16 papers, 15 papers with code

Metadata Conditioning Accelerates Language Model Pre-training

1 code implementation • 3 Jan 2025 • Tianyu Gao, Alexander Wettig, Luxi He, Yihe Dong, Sadhika Malladi, Danqi Chen

The vast diversity of styles, domains, and quality levels present in language model pre-training corpora is essential in developing general model capabilities, but efficiently learning and deploying the correct behaviors exemplified in each of these heterogeneous data sources is challenging.

Language Modeling • Language Modelling • +1

How to Train Long-Context Language Models (Effectively)

1 code implementation • 3 Oct 2024 • Tianyu Gao, Alexander Wettig, Howard Yen, Danqi Chen

We study continued training and supervised fine-tuning (SFT) of a language model (LM) to make effective use of long-context information.

Finding Transformer Circuits with Edge Pruning

1 code implementation • 24 Jun 2024 • Adithya Bhaskar, Alexander Wettig, Dan Friedman, Danqi Chen

Our method finds circuits in GPT-2 that use less than half the number of edges compared to circuits found by previous methods while being equally faithful to the full model predictions on standard circuit-finding tasks.

In-Context Learning • Language Modelling

QuRating: Selecting High-Quality Data for Training Language Models

1 code implementation • 15 Feb 2024 • Alexander Wettig, Aatmik Gupta, Saumya Malik, Danqi Chen

We train a QuRater model to learn scalar ratings from pairwise judgments, and use it to annotate a 260B training corpus with quality ratings for each of the four criteria.

In-Context Learning
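The QuRating description above (learning scalar quality ratings from pairwise judgments, then annotating a corpus with them) can be illustrated with a short sketch. This is a minimal, hypothetical example assuming a Bradley-Terry-style objective; the `ToyRater` model, embedding dimension, and random data are illustrative stand-ins, not the paper's actual implementation.

```python
# Minimal sketch: learning scalar quality ratings from pairwise judgments
# with a Bradley-Terry-style objective (illustrative only; not the paper's code).
import torch
import torch.nn as nn

class ToyRater(nn.Module):
    """Maps a document embedding to a single scalar quality score."""
    def __init__(self, dim: int = 128):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.score(x).squeeze(-1)

def pairwise_loss(rater: ToyRater, preferred: torch.Tensor, other: torch.Tensor) -> torch.Tensor:
    # P(preferred beats other) = sigmoid(s_pref - s_other); maximize its log-likelihood.
    margin = rater(preferred) - rater(other)
    return -torch.nn.functional.logsigmoid(margin).mean()

rater = ToyRater()
opt = torch.optim.Adam(rater.parameters(), lr=1e-3)
for _ in range(100):
    preferred = torch.randn(32, 128) + 0.5  # stand-in for "higher quality" documents
    other = torch.randn(32, 128)
    loss = pairwise_loss(rater, preferred, other)
    opt.zero_grad()
    loss.backward()
    opt.step()

# At annotation time, rater(embedding) yields one scalar rating per document,
# which can then be used to select or weight pre-training data.
```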

Poisoning Retrieval Corpora by Injecting Adversarial Passages

1 code implementation • 29 Oct 2023 • Zexuan Zhong, Ziqing Huang, Alexander Wettig, Danqi Chen

Dense retrievers have achieved state-of-the-art performance in various information retrieval tasks, but to what extent can they be safely deployed in real-world applications?

Information Retrieval • Natural Questions • +1

SWE-bench: Can Language Models Resolve Real-World GitHub Issues?

4 code implementations • 10 Oct 2023 • Carlos E. Jimenez, John Yang, Alexander Wettig, Shunyu Yao, Kexin Pei, Ofir Press, Karthik Narasimhan

We find real-world software engineering to be a rich, sustainable, and challenging testbed for evaluating the next generation of language models.

Bug fixing • Code Generation • +1

Learning Transformer Programs

1 code implementation • NeurIPS 2023 • Dan Friedman, Alexander Wettig, Danqi Chen

Recent research in mechanistic interpretability has attempted to reverse-engineer Transformer models by carefully inspecting network weights and activations.

In-Context Learning • Interpretable Machine Learning • +4

Adapting Language Models to Compress Contexts

1 code implementation • 24 May 2023 • Alexis Chevalier, Alexander Wettig, Anirudh Ajith, Danqi Chen

Transformer-based language models (LMs) are powerful and widely-applicable tools, but their usefulness is constrained by a finite context window and the expensive computational cost of processing long text documents.

In-Context Learning • Language Modeling • +4

Finding Dataset Shortcuts with Grammar Induction

1 code implementation • 20 Oct 2022 • Dan Friedman, Alexander Wettig, Danqi Chen

Many NLP datasets have been found to contain shortcuts: simple decision rules that achieve surprisingly high accuracy.

Diagnostic • Sentence • +1
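The notion of a dataset shortcut described in the entry above (a simple decision rule that scores surprisingly well) can be made concrete with a toy sketch. The tiny sentiment examples and the keyword rule below are hypothetical and only illustrate the concept; they are not taken from the paper.

```python
# Minimal illustration of a dataset "shortcut": a trivial keyword rule that scores
# well on a biased toy dataset (hypothetical data; not from the paper).
examples = [
    ("the film was not good", 0),
    ("i did not enjoy this at all", 0),
    ("a wonderful, moving story", 1),
    ("great acting and a sharp script", 1),
    ("not worth watching", 0),
    ("surprisingly good despite the hype", 1),
    ("not bad at all, actually quite fun", 1),  # breaks the shortcut
]

def shortcut_rule(text: str) -> int:
    # Decision rule: predict "negative" (0) whenever the word "not" appears.
    return 0 if "not" in text.split() else 1

accuracy = sum(shortcut_rule(t) == y for t, y in examples) / len(examples)
print(f"shortcut accuracy: {accuracy:.2f}")  # high on this biased sample despite ignoring meaning
```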

A Kernel-Based View of Language Model Fine-Tuning

1 code implementation • 11 Oct 2022 • Sadhika Malladi, Alexander Wettig, Dingli Yu, Danqi Chen, Sanjeev Arora

It has become standard to solve NLP tasks by fine-tuning pre-trained language models (LMs), especially in low-data settings.

Language Modeling • Language Modelling
