no code implementations • NAACL (GeBNLP) 2022 • Amanda Bertsch, Ashley Oh, Sanika Natu, Swetha Gangu, Alan W. Black, Emma Strubell
We extend our analysis to a longitudinal study of bias in film dialogue over the last 110 years and find that continued pre-training on OpenSubtitles encodes additional bias into BERT.
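For readers unfamiliar with the setup, continued (domain-adaptive) pre-training simply resumes masked-language-model training on a new corpus. A minimal sketch with the Hugging Face transformers library is below; the corpus path and hyperparameters are illustrative placeholders, not the paper's configuration.

```python
# Sketch: continued masked-LM pre-training of BERT on a text corpus
# (illustrative corpus path and hyperparameters; not the paper's setup).
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Assumes a plain-text file with one subtitle line per row (hypothetical path).
corpus = load_dataset("text", data_files={"train": "subtitles.txt"})["train"]
corpus = corpus.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=128),
                    batched=True, remove_columns=["text"])

collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert-continued", num_train_epochs=1,
                           per_device_train_batch_size=32),
    train_dataset=corpus,
    data_collator=collator,
)
trainer.train()  # further MLM training shifts the model toward the new domain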
1 code implementation • EMNLP 2021 • Zhisong Zhang, Emma Strubell, Eduard Hovy
Although recent developments in neural architectures and pre-trained representations have greatly increased state-of-the-art model performance on fully-supervised semantic role labeling (SRL), the task remains challenging for languages where supervised SRL training data are not abundant.
1 code implementation • ACL (spnlp) 2021 • Zhisong Zhang, Emma Strubell, Eduard Hovy
In this work, we empirically compare span extraction methods for the task of semantic role labeling (SRL).
no code implementations • 22 May 2024 • Benjamin C. Lee, David Brooks, Arthur van Benthem, Udit Gupta, Gage Hills, Vincent Liu, Benjamin Pierce, Christopher Stewart, Emma Strubell, Gu-Yeon Wei, Adam Wierman, Yuan Yao, Minlan Yu
For embodied carbon, we must re-think conventional design strategies -- over-provisioned monolithic servers, frequent hardware refresh cycles, custom silicon -- and adopt life-cycle design strategies that more effectively reduce, reuse and recycle hardware at scale.
1 code implementation • 22 May 2024 • Sang Keun Choe, Hwijeen Ahn, Juhan Bae, Kewen Zhao, Minsoo Kang, Youngseog Chung, Adithya Pratapa, Willie Neiswanger, Emma Strubell, Teruko Mitamura, Jeff Schneider, Eduard Hovy, Roger Grosse, Eric Xing
Large language models (LLMs) are trained on a vast amount of human-written data, but data providers often remain uncredited.
1 code implementation • 1 Apr 2024 • Muhammad Khalifa, David Wadden, Emma Strubell, Honglak Lee, Lu Wang, Iz Beltagy, Hao Peng
We investigate the problem of intrinsic source citation, where LLMs are required to cite the pretraining source supporting a generated response.
3 code implementations • 1 Feb 2024 • Dirk Groeneveld, Iz Beltagy, Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, Russell Authur, Khyathi Raghavi Chandu, Arman Cohan, Jennifer Dumas, Yanai Elazar, Yuling Gu, Jack Hessel, Tushar Khot, William Merrill, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Valentina Pyatkin, Abhilasha Ravichander, Dustin Schwenk, Saurabh Shah, Will Smith, Emma Strubell, Nishant Subramani, Mitchell Wortsman, Pradeep Dasigi, Nathan Lambert, Kyle Richardson, Luke Zettlemoyer, Jesse Dodge, Kyle Lo, Luca Soldaini, Noah A. Smith, Hannaneh Hajishirzi
Given the importance of these details in scientifically studying these models, including their biases and potential risks, we believe it is essential for the research community to have access to powerful, truly open LMs.
1 code implementation • 31 Jan 2024 • Luca Soldaini, Rodney Kinney, Akshita Bhagia, Dustin Schwenk, David Atkinson, Russell Authur, Ben Bogin, Khyathi Chandu, Jennifer Dumas, Yanai Elazar, Valentin Hofmann, Ananya Harsh Jha, Sachin Kumar, Li Lucy, Xinxi Lyu, Nathan Lambert, Ian Magnusson, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Abhilasha Ravichander, Kyle Richardson, Zejiang Shen, Emma Strubell, Nishant Subramani, Oyvind Tafjord, Pete Walsh, Luke Zettlemoyer, Noah A. Smith, Hannaneh Hajishirzi, Iz Beltagy, Dirk Groeneveld, Jesse Dodge, Kyle Lo
As a result, it is challenging to conduct and advance scientific research on language modeling, such as understanding how training data impacts model capabilities and limitations.
1 code implementation • 12 Jan 2024 • Li Lucy, Suchin Gururangan, Luca Soldaini, Emma Strubell, David Bamman, Lauren F. Klein, Jesse Dodge
Large language models' (LLMs) abilities are drawn from their pretraining data, and model development begins with data curation.
1 code implementation • 9 Dec 2023 • Gustavo Gonçalves, Emma Strubell
Large Language Models (LLMs) trained with self-supervision on vast corpora of web text fit to the social biases of that text.
no code implementations • 28 Nov 2023 • Alexandra Sasha Luccioni, Yacine Jernite, Emma Strubell
Recent years have seen a surge in the popularity of commercial AI products based on generative, multi-purpose AI systems promising a unified approach to building machine learning (ML) models into technology.
no code implementations • 17 Nov 2023 • Xiaorong Wang, Clara Na, Emma Strubell, Sorelle Friedler, Sasha Luccioni
Despite the popularity of the 'pre-train then fine-tune' paradigm in the NLP community, existing work quantifying energy costs and associated carbon emissions has largely focused on language model pre-training.
no code implementations • 11 Oct 2023 • Sireesh Gururaja, Amanda Bertsch, Clara Na, David Gray Widder, Emma Strubell
NLP is in a period of disruptive change that is impacting our methodologies, funding sources, and public perception.
no code implementations • 19 Jul 2023 • Hao Peng, Qingqing Cao, Jesse Dodge, Matthew E. Peters, Jared Fernandez, Tom Sherborne, Kyle Lo, Sam Skjonsberg, Emma Strubell, Darrell Plessas, Iz Beltagy, Evan Pete Walsh, Noah A. Smith, Hannaneh Hajishirzi
In response, we introduce Pentathlon, a benchmark for holistic and realistic evaluation of model efficiency.
no code implementations • 30 Jun 2023 • Harnoor Dhingra, Preetiha Jayashanker, Sayali Moghe, Emma Strubell
Large Language Models (LLMs) are trained primarily on minimally processed web text, which exhibits the same wide range of social biases held by the humans who created that content.
no code implementations • 29 Jun 2023 • Ji-Ung Lee, Haritz Puerto, Betty van Aken, Yuki Arase, Jessica Zosa Forde, Leon Derczynski, Andreas Rücklé, Iryna Gurevych, Roy Schwartz, Emma Strubell, Jesse Dodge
Many recent improvements in NLP stem from the development and use of large pre-trained language models (PLMs) with billions of parameters.
no code implementations • 24 May 2023 • Ananya Harsh Jha, Tom Sherborne, Evan Pete Walsh, Dirk Groeneveld, Emma Strubell, Iz Beltagy
Large language models (LLMs) enable unparalleled few- and zero-shot reasoning capabilities but at a high computational footprint.
1 code implementation • 22 May 2023 • Zhisong Zhang, Emma Strubell, Eduard Hovy
To address this challenge, we adopt an error estimator to adaptively decide the partial selection ratio according to the current model's capability.
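The excerpt does not spell out the estimator, so the following is only a hypothetical illustration of the general idea: map an estimate of the model's error rate to the fraction of candidate substructures that get routed to human annotation. The confidence-based proxy and all names are assumptions, not the paper's method.

```python
# Hypothetical sketch: choose what fraction of candidate substructures to send
# for (partial) annotation based on an estimate of the model's error rate.
import numpy as np

def estimated_error_rate(confidences):
    """Rough proxy: expected error = mean (1 - max-label probability)."""
    return float(np.mean(1.0 - np.asarray(confidences)))

def partial_selection_ratio(confidences, floor=0.1, ceiling=0.9):
    """The weaker the model looks, the larger the fraction we ask humans to label."""
    return float(np.clip(estimated_error_rate(confidences), floor, ceiling))

def select_substructures(candidates, confidences):
    """Keep the least-confident substructures up to the adaptive ratio."""
    ratio = partial_selection_ratio(confidences)
    k = max(1, int(ratio * len(candidates)))
    order = np.argsort(confidences)          # ascending confidence
    return [candidates[i] for i in order[:k]]

# Example: 5 candidate arcs/spans with model confidences.
cands = ["arc1", "arc2", "arc3", "arc4", "arc5"]
confs = [0.95, 0.40, 0.85, 0.55, 0.30]
print(select_substructures(cands, confs))    # lowest-confidence items first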
no code implementations • 29 Apr 2023 • Rajshekhar Das, Jonathan Francis, Sanket Vaibhav Mehta, Jean Oh, Emma Strubell, Jose Moura
Self-training based on pseudo-labels has emerged as a dominant approach for addressing conditional distribution shifts in unsupervised domain adaptation (UDA) for semantic segmentation problems.
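As a rough illustration of pseudo-label self-training in this setting, the sketch below assigns per-pixel pseudo-labels from the model's own predictions on target-domain images, discards low-confidence pixels, and trains on the rest. The threshold and function names are illustrative, not taken from the paper.

```python
# Minimal sketch of pseudo-label self-training for unsupervised domain
# adaptation in semantic segmentation (names and threshold are illustrative).
import torch
import torch.nn.functional as F

def pseudo_label_step(model, optimizer, target_images, threshold=0.9):
    """One self-training step on unlabeled target-domain images."""
    model.eval()
    with torch.no_grad():
        logits = model(target_images)                  # (B, C, H, W)
        probs = F.softmax(logits, dim=1)
        conf, pseudo = probs.max(dim=1)                # per-pixel pseudo-labels
        pseudo[conf < threshold] = 255                 # ignore low-confidence pixels

    model.train()
    loss = F.cross_entropy(model(target_images), pseudo, ignore_index=255)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()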
1 code implementation • 13 Feb 2023 • Jared Fernandez, Jacob Kahn, Clara Na, Yonatan Bisk, Emma Strubell
In this work, we examine this phenomenon through a series of case studies analyzing the effects of model design decisions, framework paradigms, and hardware platforms on total model latency.
no code implementations • 20 Dec 2022 • Dheeru Dua, Emma Strubell, Sameer Singh, Pat Verga
Recent advances in open-domain question answering (ODQA) have demonstrated impressive accuracy on standard Wikipedia-style benchmarks.
no code implementations • 19 Dec 2022 • Sanket Vaibhav Mehta, Jai Gupta, Yi Tay, Mostafa Dehghani, Vinh Q. Tran, Jinfeng Rao, Marc Najork, Emma Strubell, Donald Metzler
In this work, we introduce DSI++, a continual learning challenge for DSI to incrementally index new documents while being able to answer queries related to both previously and newly indexed documents.
no code implementations • 11 Dec 2022 • Zheng Wang, Juncheng B Li, Shuhui Qu, Florian Metze, Emma Strubell
In this work, we incorporate exponentially decaying quantization-error-aware noise together with a learnable scale of task loss gradient to approximate the effect of a quantization operator.
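A hedged sketch of the general idea follows: instead of applying a hard, non-differentiable quantizer during training, the weights are perturbed with noise proportional to the quantization error, with the noise magnitude decaying exponentially over training steps. The decay schedule and all names are assumptions, and the learnable task-loss-gradient scale is omitted; this is not the paper's exact formulation.

```python
# Hedged sketch: quantization-error-aware noise with exponential decay,
# used as a soft stand-in for a hard quantization operator during training.
import torch

def fake_quant_with_decaying_noise(w, step, num_bits=8, decay=1e-4):
    """Return weights perturbed toward their quantized values."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.abs().max() / qmax + 1e-8
    w_q = torch.round(w / scale) * scale              # what the quantizer would produce
    quant_err = w_q - w                               # per-weight quantization error
    alpha = torch.exp(torch.tensor(-decay * step))    # exponentially decaying factor
    noise = alpha * quant_err * torch.rand_like(w)    # error-aware noise
    # Straight-through style: the noise is a constant in the backward pass.
    return w + noise.detach()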
no code implementations • 8 Nov 2022 • Marius Hessenthaler, Emma Strubell, Dirk Hovy, Anne Lauscher
Fairness and environmental impact are important research directions for the sustainable development of artificial intelligence.
1 code implementation • 18 Oct 2022 • Zhisong Zhang, Emma Strubell, Eduard Hovy
In this work, we provide a survey of active learning (AL) for its applications in natural language processing (NLP).
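To illustrate the setting the survey covers, here is a generic pool-based active learning loop with least-confidence sampling; the model interface and all names are placeholders rather than anything specific to the paper.

```python
# Generic pool-based active learning with least-confidence sampling
# (sklearn-style model interface; purely illustrative).
import numpy as np

def active_learning_loop(model, labeled, unlabeled, oracle, rounds=5, batch=10):
    for _ in range(rounds):
        model.fit(*labeled)                          # retrain on current labels
        probs = model.predict_proba(unlabeled)       # (N, num_classes)
        uncertainty = 1.0 - probs.max(axis=1)        # least-confident sampling
        picks = np.argsort(-uncertainty)[:batch]     # most uncertain examples
        new_x = unlabeled[picks]
        new_y = oracle(new_x)                        # human annotation
        labeled = (np.vstack([labeled[0], new_x]),
                   np.concatenate([labeled[1], new_y]))
        unlabeled = np.delete(unlabeled, picks, axis=0)
    return model, labeled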
1 code implementation • 14 Oct 2022 • Nupoor Gandhi, Anjalie Field, Emma Strubell
Although recent neural models for coreference resolution have led to substantial improvements on benchmark datasets, transferring these models to new target domains containing out-of-vocabulary spans and requiring differing annotation schemes remains challenging.
no code implementations • 13 Oct 2022 • Zheng Wang, Juncheng B Li, Shuhui Qu, Florian Metze, Emma Strubell
Quantization is an effective technique to reduce memory footprint, inference latency, and power consumption of deep learning models.
no code implementations • 31 Aug 2022 • Marcos Treviso, Ji-Ung Lee, Tianchu Ji, Betty van Aken, Qingqing Cao, Manuel R. Ciosici, Michael Hassid, Kenneth Heafield, Sara Hooker, Colin Raffel, Pedro H. Martins, André F. T. Martins, Jessica Zosa Forde, Peter Milder, Edwin Simpson, Noam Slonim, Jesse Dodge, Emma Strubell, Niranjan Balasubramanian, Leon Derczynski, Iryna Gurevych, Roy Schwartz
Recent work in natural language processing (NLP) has yielded appealing results from scaling model parameters and training data; however, using only scale to improve performance means that resource consumption also grows.
no code implementations • 10 Jun 2022 • Jesse Dodge, Taylor Prewitt, Remi Tachet des Combes, Erika Odmark, Roy Schwartz, Emma Strubell, Alexandra Sasha Luccioni, Noah A. Smith, Nicole DeCario, Will Buchanan
By providing unprecedented access to computational resources, cloud computing has enabled rapid growth in technologies such as machine learning, the computational demands of which incur a high energy cost and a commensurate carbon footprint.
no code implementations • 25 May 2022 • Clara Na, Sanket Vaibhav Mehta, Emma Strubell
Model compression by way of parameter pruning, quantization, or distillation has recently gained popularity as an approach for reducing the computational requirements of modern deep neural network models for NLP.
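As a concrete instance of one of the compression methods named above, the sketch below performs global magnitude pruning on a PyTorch model; it is illustrative only and not the paper's specific approach.

```python
# Illustrative global magnitude pruning: zero out the smallest-magnitude
# weights across all linear layers (not the paper's specific method).
import torch

def magnitude_prune(model, sparsity=0.5):
    weights = [m.weight for m in model.modules() if isinstance(m, torch.nn.Linear)]
    all_vals = torch.cat([w.detach().abs().flatten() for w in weights])
    k = max(1, int(sparsity * all_vals.numel()))
    threshold = all_vals.kthvalue(k).values          # global magnitude cutoff
    with torch.no_grad():
        for w in weights:
            w.mul_((w.abs() >= threshold).float())   # keep only the larger weights
    return model

# Example: prune half the weights of a small MLP.
mlp = torch.nn.Sequential(torch.nn.Linear(32, 64), torch.nn.ReLU(), torch.nn.Linear(64, 8))
magnitude_prune(mlp, sparsity=0.5)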
1 code implementation • NeurIPS 2023 • Sanket Vaibhav Mehta, Darshan Patil, Sarath Chandar, Emma Strubell
The lifelong learning paradigm in machine learning is an attractive alternative to the more prominent isolated learning scheme, not only due to its resemblance to biological learning but also due to its potential to reduce energy waste by obviating excessive model re-training.
1 code implementation • ACL 2022 • Sanket Vaibhav Mehta, Jinfeng Rao, Yi Tay, Mihir Kale, Ankur P. Parikh, Emma Strubell
Data-to-text generation focuses on generating fluent natural language responses from structured meaning representations (MRs).
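For readers new to the task, a toy meaning representation and its target text are shown below; the fields are made-up examples in the style of common data-to-text benchmarks, not from the paper's data.

```python
# Toy data-to-text example: a structured meaning representation (MR)
# paired with the fluent text a model should generate (illustrative fields).
mr = {"name": "The Golden Palace", "eat_type": "restaurant",
      "food": "Italian", "price_range": "moderate", "area": "riverside"}
target = ("The Golden Palace is a moderately priced Italian restaurant "
          "in the riverside area.")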
no code implementations • 29 Sep 2021 • Rajshekhar Das, Jonathan Francis, Sanket Vaibhav Mehta, Jean Oh, Emma Strubell, Jose Moura
Crucially, the objectness constraint is agnostic to the ground-truth semantic segmentation labels and, therefore, remains appropriate for unsupervised adaptation settings.
no code implementations • 1 Jan 2021 • Juncheng B Li, Shuhui Qu, Xinjian Li, Emma Strubell, Florian Metze
Quantization of neural network parameters and activations has emerged as a successful approach to reducing the model size and inference time on hardware that supports native low-precision arithmetic.
3 code implementations • ACL 2019 • Emma Strubell, Ananya Ganesh, Andrew McCallum
Recent progress in hardware and methodology for training neural networks has ushered in a new generation of large networks trained on abundant data.
no code implementations • WS 2019 • Sheshera Mysore, Zach Jensen, Edward Kim, Kevin Huang, Haw-Shiuan Chang, Emma Strubell, Jeffrey Flanigan, Andrew McCallum, Elsa Olivetti
Materials science literature contains millions of materials synthesis procedures described in unstructured natural language text.
1 code implementation • 31 Dec 2018 • Edward Kim, Zach Jensen, Alexander van Grootel, Kevin Huang, Matthew Staib, Sheshera Mysore, Haw-Shiuan Chang, Emma Strubell, Andrew McCallum, Stefanie Jegelka, Elsa Olivetti
Leveraging new data sources is a key step in accelerating the pace of materials design and discovery.
no code implementations • WS 2018 • Emma Strubell, Andrew McCallum
Do unsupervised methods for learning rich, contextualized token representations obviate the need for explicit modeling of linguistic structure in neural network models for semantic role labeling (SRL)?
1 code implementation • EMNLP 2018 • Emma Strubell, Patrick Verga, Daniel Andor, David Weiss, Andrew McCallum
Unlike previous models which require significant pre-processing to prepare linguistic features, LISA can incorporate syntax using merely raw tokens as input, encoding the sequence only once to simultaneously perform parsing, predicate detection and role labeling for all predicates.
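To make the "encode once, predict several tasks" idea concrete, here is a hedged sketch of a shared encoder feeding separate heads for arc scoring, predicate detection, and role labeling. Dimensions, layer choices, and names are illustrative; this is not the LISA architecture itself.

```python
# Hedged sketch: one shared encoder, several task heads over the same encoding.
import torch
import torch.nn as nn

class MultiTaskTagger(nn.Module):
    def __init__(self, vocab_size, d_model=256, n_roles=60):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)  # shared encoder
        self.arc_head = nn.Linear(d_model, d_model)   # parsing (arc scores)
        self.pred_head = nn.Linear(d_model, 2)        # predicate detection
        self.role_head = nn.Linear(d_model, n_roles)  # semantic role labels

    def forward(self, token_ids):
        h = self.encoder(self.embed(token_ids))             # encode the sequence once
        arc_scores = self.arc_head(h) @ h.transpose(1, 2)   # head-dependent scores
        return {"arcs": arc_scores,
                "predicates": self.pred_head(h),
                "roles": self.role_head(h)}

# Example: a batch of 2 sentences of length 10 (integer token ids).
model = MultiTaskTagger(vocab_size=10000)
out = model(torch.randint(0, 10000, (2, 10)))
print({k: v.shape for k, v in out.items()})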
1 code implementation • NAACL 2018 • Patrick Verga, Emma Strubell, Andrew McCallum
Most work in relation extraction forms a prediction by looking at a short span of text within a single sentence containing a single entity pair mention.
no code implementations • 18 Nov 2017 • Sheshera Mysore, Edward Kim, Emma Strubell, Ao Liu, Haw-Shiuan Chang, Srikrishna Kompella, Kevin Huang, Andrew McCallum, Elsa Olivetti
In this work, we present a system for automatically extracting structured representations of synthesis procedures from the texts of materials science journal articles that describe explicit, experimental syntheses of inorganic compounds.
no code implementations • 23 Oct 2017 • Patrick Verga, Emma Strubell, Ofer Shai, Andrew McCallum
We propose a model to consider all mention and entity pairs simultaneously in order to make a prediction.
no code implementations • WS 2017 • Emma Strubell, Andrew McCallum
Dependency parses are an effective way to inject linguistic knowledge into many downstream tasks, and many practitioners wish to efficiently parse sentences at scale.
4 code implementations • EMNLP 2017 • Emma Strubell, Patrick Verga, David Belanger, Andrew McCallum
Today when many practitioners run basic NLP on the entire web and large-volume traffic, faster methods are paramount to saving time and energy costs.
Ranked #25 on Named Entity Recognition (NER) on Ontonotes v5 (English)
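One common way to get such speedups for sequence labeling is to replace recurrent encoders with stacked dilated convolutions, whose receptive field grows exponentially with depth while every token is processed in parallel. Below is a minimal sketch of that idea; the hyperparameters and names are assumptions, not the paper's exact model.

```python
# Illustrative dilated-convolution sequence tagger (not the paper's exact model).
import torch
import torch.nn as nn

class DilatedCNNTagger(nn.Module):
    def __init__(self, vocab_size, num_tags, d=128, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d)
        self.convs = nn.ModuleList(
            nn.Conv1d(d, d, kernel_size=3, dilation=r, padding=r)  # length-preserving
            for r in dilations)
        self.out = nn.Linear(d, num_tags)

    def forward(self, token_ids):                  # (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)  # (batch, d, seq_len) for Conv1d
        for conv in self.convs:
            x = torch.relu(conv(x))                # receptive field widens each layer
        return self.out(x.transpose(1, 2))         # per-token tag scores

tags = DilatedCNNTagger(vocab_size=10000, num_tags=9)(torch.randint(0, 10000, (2, 20)))
print(tags.shape)  # torch.Size([2, 20, 9])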
1 code implementation • NAACL 2016 • Patrick Verga, David Belanger, Emma Strubell, Benjamin Roth, Andrew McCallum
In response, this paper introduces significant further improvements to the coverage and flexibility of universal schema relation extraction: predictions for entities unseen in training and multilingual transfer learning to domains with no annotation.
no code implementations • IJCNLP 2015 • Emma Strubell, Luke Vilnis, Kate Silverstein, Andrew McCallum
We present paired learning and inference algorithms for significantly reducing computation and increasing speed of the vector dot products in the classifiers that are at the heart of many NLP components.
no code implementations • 30 Oct 2014 • Emma Strubell, Luke Vilnis, Andrew McCallum
We present paired learning and inference algorithms for significantly reducing computation and increasing speed of the vector dot products in the classifiers that are at the heart of many NLP components.