Search Results for author: Dirk Groeneveld

Found 20 papers, 10 papers with code

Critical Batch Size Revisited: A Simple Empirical Approach to Large-Batch Language Model Training

no code implementations • 29 May 2025 • William Merrill, Shane Arora, Dirk Groeneveld, Hannaneh Hajishirzi

The right batch size is important when training language models at scale: a large batch size is necessary for fast training, but a batch size that is too large will harm token efficiency.

Language Modeling • Language Modelling • +1
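
As a rough illustration of the batch-size tradeoff described in the abstract above, the sketch below uses the classic critical-batch-size relations from McCandlish et al. (2018), in which optimization steps fall and token cost rises as the batch size crosses the gradient-noise scale. This is a generic framework, not necessarily the empirical approach proposed in this paper, and all symbols and constants (s_min, e_min, b_noise) are illustrative assumptions.

    # Illustrative sketch of the critical-batch-size tradeoff (McCandlish et al., 2018);
    # not the method of this paper. s_min, e_min, b_noise and all values are assumed.

    def steps_needed(batch_size: float, s_min: float, b_noise: float) -> float:
        """Optimization steps needed grow as the batch size falls below the noise scale."""
        return s_min * (1.0 + b_noise / batch_size)

    def tokens_needed(batch_size: float, e_min: float, b_noise: float) -> float:
        """Token (data) cost grows as the batch size rises above the noise scale."""
        return e_min * (1.0 + batch_size / b_noise)

    if __name__ == "__main__":
        s_min, e_min, b_noise = 1e5, 1e9, 2 ** 21  # made-up constants for illustration
        for b in (2 ** 18, 2 ** 20, 2 ** 21, 2 ** 22, 2 ** 24):
            print(f"B={b:>10,d}  steps={steps_needed(b, s_min, b_noise):.2e}  "
                  f"tokens={tokens_needed(b, e_min, b_noise):.2e}")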

DataDecide: How to Predict Best Pretraining Data with Small Experiments

1 code implementation • 15 Apr 2025 • Ian Magnusson, Nguyen Tai, Ben Bogin, David Heineman, Jena D. Hwang, Luca Soldaini, Akshita Bhagia, Jiacheng Liu, Dirk Groeneveld, Oyvind Tafjord, Noah A. Smith, Pang Wei Koh, Jesse Dodge

Because large language models are expensive to pretrain on different datasets, using smaller-scale experiments to decide on data is crucial for reducing costs.

ARC • HellaSwag • +3

What's In My Big Data?

1 code implementation • 31 Oct 2023 • Yanai Elazar, Akshita Bhagia, Ian Magnusson, Abhilasha Ravichander, Dustin Schwenk, Alane Suhr, Pete Walsh, Dirk Groeneveld, Luca Soldaini, Sameer Singh, Hanna Hajishirzi, Noah A. Smith, Jesse Dodge

We open-source WIMBD's code and artifacts to provide a standard set of evaluations for new text-based corpora and to encourage more analyses and transparency around them.

Benchmarking

Just CHOP: Embarrassingly Simple LLM Compression

no code implementations • 24 May 2023 • Ananya Harsh Jha, Tom Sherborne, Evan Pete Walsh, Dirk Groeneveld, Emma Strubell, Iz Beltagy

Large language models (LLMs) enable unparalleled few- and zero-shot reasoning capabilities, but at the cost of a high computational footprint.

Knowledge Distillation • Language Modeling • +3

Continued Pretraining for Better Zero- and Few-Shot Promptability

1 code implementation • 19 Oct 2022 • Zhaofeng Wu, Robert L. Logan IV, Pete Walsh, Akshita Bhagia, Dirk Groeneveld, Sameer Singh, Iz Beltagy

We demonstrate that a simple recipe, continued pretraining that incorporates a trainable prompt during multi-task learning, leads to improved promptability in both zero- and few-shot settings compared to existing methods, with up to 31% relative improvement.

Language Modeling • Language Modelling • +2
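
The recipe summarized above centers on a trainable ("soft") prompt that is prepended to the input during continued multi-task pretraining. The sketch below shows that generic soft-prompt mechanism in PyTorch; the class and parameter names (SoftPromptWrapper, prompt_len) are illustrative assumptions and are not taken from the paper's released code.

    # Minimal sketch of a trainable soft prompt prepended to input embeddings;
    # a generic prompt-tuning setup, not the paper's exact implementation.
    import torch
    import torch.nn as nn

    class SoftPromptWrapper(nn.Module):
        def __init__(self, embed: nn.Embedding, prompt_len: int = 20):
            super().__init__()
            self.embed = embed  # the model's existing token embedding table
            d_model = embed.embedding_dim
            # The only new parameters: prompt_len trainable prompt vectors.
            self.prompt = nn.Parameter(torch.randn(prompt_len, d_model) * 0.02)

        def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
            # input_ids: (batch, seq_len) -> token embeddings: (batch, seq_len, d_model)
            tok = self.embed(input_ids)
            batch = tok.size(0)
            # Prepend the same trainable prompt to every example in the batch.
            prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
            return torch.cat([prompt, tok], dim=1)  # (batch, prompt_len + seq_len, d_model)

    # During continued pretraining on multi-task data, these prompt vectors are
    # updated alongside the rest of the model, then reused at prompting time.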

A Simple Yet Strong Pipeline for HotpotQA

no code implementations • EMNLP 2020 • Dirk Groeneveld, Tushar Khot, Mausam, Ashish Sabharwal

State-of-the-art models for multi-hop question answering typically augment large-scale language models like BERT with additional, intuitively useful capabilities such as named entity recognition, graph-based reasoning, and question decomposition.

Multi-hop Question Answering • named-entity-recognition • +4

From 'F' to 'A' on the N.Y. Regents Science Exams: An Overview of the Aristo Project

no code implementations • 4 Sep 2019 • Peter Clark, Oren Etzioni, Daniel Khashabi, Tushar Khot, Bhavana Dalvi Mishra, Kyle Richardson, Ashish Sabharwal, Carissa Schoenick, Oyvind Tafjord, Niket Tandon, Sumithra Bhakthavatsalam, Dirk Groeneveld, Michal Guerquin, Michael Schmitz

This paper reports unprecedented success on the Grade 8 New York Regents Science Exam, where for the first time a system scores more than 90% on the exam's non-diagram, multiple choice (NDMC) questions.

Multiple-choice Question Answering
