1 code implementation • EMNLP 2021 • Ari Holtzman, Peter West, Vered Shwartz, Yejin Choi, Luke Zettlemoyer
Large language models have shown promising results in zero-shot settings.
1 code implementation • ACL 2022 • Rabeeh Karimi Mahabadi, Luke Zettlemoyer, James Henderson, Lambert Mathias, Marzieh Saeidi, Veselin Stoyanov, Majid Yazdani
Current methods for few-shot fine-tuning of pretrained masked language models (PLMs) require carefully engineered prompts and verbalizers for each new task to convert examples into a cloze-format that the PLM can score.
1 code implementation • 26 Feb 2025 • Margaret Li, Sneha Kudugunta, Luke Zettlemoyer
Each of these factors may affect the fitted law, and therefore, the conclusions of a given study.
1 code implementation • 20 Feb 2025 • Michihiro Yasunaga, Luke Zettlemoyer, Marjan Ghazvininejad
Reward models play an essential role in training vision-language models (VLMs) by assessing output quality to enable aligning with human preferences.
no code implementations • 6 Feb 2025 • Shangbin Feng, Zifeng Wang, Palash Goyal, Yike Wang, Weijia Shi, Huang Xia, Hamid Palangi, Luke Zettlemoyer, Yulia Tsvetkov, Chen-Yu Lee, Tomas Pfister
We propose Heterogeneous Swarms, an algorithm to design multi-LLM systems by jointly optimizing model roles and weights.
2 code implementations • 31 Jan 2025 • Niklas Muennighoff, Zitong Yang, Weijia Shi, Xiang Lisa Li, Li Fei-Fei, Hannaneh Hajishirzi, Luke Zettlemoyer, Percy Liang, Emmanuel Candès, Tatsunori Hashimoto
After supervised finetuning the Qwen2.5-32B-Instruct language model on s1K and equipping it with budget forcing, our model s1-32B exceeds o1-preview on competition math questions by up to 27% (MATH and AIME24); a minimal sketch of the budget-forcing idea follows this entry.
Ranked #5 on Mathematical Reasoning on AIME24
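The entry above describes s1's test-time scaling via budget forcing, which either caps the model's reasoning trace or extends it before the final answer is requested. The sketch below illustrates the idea over a hypothetical `generate(prompt, max_new_tokens, stop)` callable (an assumed interface, not the released s1 code), with word counts standing in for token counts.

```python
# Hedged sketch of budget forcing; `generate(prompt, max_new_tokens, stop)` is an
# assumed interface for any decoder-only LM, not a specific library API.

def budget_forced_generate(generate, prompt, min_thinking_words=0,
                           max_thinking_tokens=512, max_forcing_rounds=4):
    """Cap a reasoning trace at a budget, or force it to continue by appending 'Wait'."""
    # Cap: stop the thinking trace when the budget is exhausted or the model tries to answer.
    thinking = generate(prompt, max_new_tokens=max_thinking_tokens, stop=["Final Answer:"])

    # Force: if the trace ends too early, append "Wait" so the model keeps reasoning,
    # trading extra test-time compute for accuracy.
    rounds = 0
    while len(thinking.split()) < min_thinking_words and rounds < max_forcing_rounds:
        thinking += "\nWait,"
        thinking += generate(prompt + thinking, max_new_tokens=max_thinking_tokens,
                             stop=["Final Answer:"])
        rounds += 1

    # Ask for the answer once the thinking budget is settled.
    answer = generate(prompt + thinking + "\nFinal Answer:", max_new_tokens=64, stop=["\n"])
    return thinking, answer
```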
1 code implementation • 27 Jan 2025 • Weixin Liang, Junhong Shen, Genghan Zhang, Ning Dong, Luke Zettlemoyer, Lili Yu
In the Chameleon setting, Mixture-of-Mamba reaches similar image loss with just 42.50% of the FLOPs at the 1.4B scale, and similar text loss with just 65.40% of the FLOPs.
no code implementations • 6 Jan 2025 • Junhong Shen, Kushal Tirumala, Michihiro Yasunaga, Ishan Misra, Luke Zettlemoyer, Lili Yu, Chunting Zhou
Most existing image tokenizers encode images into a fixed number of tokens or patches, overlooking the inherent variability in image complexity.
3 code implementations • 31 Dec 2024 • Team OLMo, Pete Walsh, Luca Soldaini, Dirk Groeneveld, Kyle Lo, Shane Arora, Akshita Bhagia, Yuling Gu, Shengyi Huang, Matt Jordan, Nathan Lambert, Dustin Schwenk, Oyvind Tafjord, Taira Anderson, David Atkinson, Faeze Brahman, Christopher Clark, Pradeep Dasigi, Nouha Dziri, Michal Guerquin, Hamish Ivison, Pang Wei Koh, Jiacheng Liu, Saumya Malik, William Merrill, Lester James V. Miranda, Jacob Morrison, Tyler Murray, Crystal Nam, Valentina Pyatkin, Aman Rangapur, Michael Schmitz, Sam Skjonsberg, David Wadden, Christopher Wilhelm, Michael Wilson, Luke Zettlemoyer, Ali Farhadi, Noah A. Smith, Hannaneh Hajishirzi
Our modified model architecture and training recipe achieve both better training stability and improved per-token efficiency.
no code implementations • 24 Dec 2024 • Mingda Chen, Yang Li, Karthik Padthe, Rulin Shao, Alicia Sun, Luke Zettlemoyer, Gargi Ghosh, Wen-tau Yih
Large language models can generate factually inaccurate content, a problem known as hallucination.
no code implementations • 20 Dec 2024 • Vivek Ramanujan, Kushal Tirumala, Armen Aghajanyan, Luke Zettlemoyer, Ali Farhadi
Current image generation methods, such as latent diffusion and discrete token-based generation, depend on a two-stage training approach.
no code implementations • 19 Dec 2024 • Weijia Shi, Xiaochuang Han, Chunting Zhou, Weixin Liang, Xi Victoria Lin, Luke Zettlemoyer, Lili Yu
We also demonstrate that this framework can adapt existing vision-language models with multimodal generation ability.
1 code implementation • 13 Dec 2024 • Artidoro Pagnoni, Ram Pasunuru, Pedro Rodriguez, John Nguyen, Benjamin Muller, Margaret Li, Chunting Zhou, Lili Yu, Jason Weston, Luke Zettlemoyer, Gargi Ghosh, Mike Lewis, Ari Holtzman, Srinivasan Iyer
We introduce the Byte Latent Transformer (BLT), a new byte-level LLM architecture that, for the first time, matches tokenization-based LLM performance at scale with significant improvements in inference efficiency and robustness.
1 code implementation • 12 Dec 2024 • Vincent-Pierre Berges, Barlas Oğuz, Daniel Haziza, Wen-tau Yih, Luke Zettlemoyer, Gargi Ghosh
We provide a fully parallelizable memory layer implementation, demonstrating scaling laws with up to 128B memory parameters, pretrained to 1 trillion tokens, compared against base models with up to 8B parameters.
no code implementations • 5 Dec 2024 • Michihiro Yasunaga, Leonid Shamis, Chunting Zhou, Andrew Cohen, Jason Weston, Luke Zettlemoyer, Marjan Ghazvininejad
Recent approaches to large language model (LLM) alignment typically require millions of human annotations or rely on external aligned models for synthetic data generation.
no code implementations • 2 Dec 2024 • Jaskirat Singh, Lindsey Li, Weijia Shi, Ranjay Krishna, Yejin Choi, Pang Wei Koh, Michael F. Cohen, Stephen Gould, Liang Zheng, Luke Zettlemoyer
We introduce negative token merging (NegToMe), a simple but effective training-free approach which performs adversarial guidance through images by selectively pushing apart matching visual features between reference and generated images during the reverse diffusion process.
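The NegToMe entry above describes adversarial guidance that pushes apart matching visual features between a reference image and the image being generated. The sketch below is a loose, assumption-laden illustration of that matching-then-push-apart step on token features; the function name, threshold, and scaling are hypothetical, and this is not the released implementation.

```python
import torch
import torch.nn.functional as F

def negative_token_merge(gen_tokens, ref_tokens, alpha=0.1, threshold=0.5):
    """Push generated token features away from their closest reference matches (sketch).

    gen_tokens: (N, d) features of the image being generated at some diffusion step.
    ref_tokens: (M, d) features of the reference image to steer away from.
    """
    gen_n = F.normalize(gen_tokens, dim=-1)
    ref_n = F.normalize(ref_tokens, dim=-1)
    sim = gen_n @ ref_n.T                                  # (N, M) cosine similarities
    best_sim, best_idx = sim.max(dim=-1)                   # closest reference token per generated token
    matched_ref = ref_tokens[best_idx]                     # (N, d)
    mask = (best_sim > threshold).float().unsqueeze(-1)    # only adjust confident matches
    # Move each matched generated token further away from its reference counterpart.
    return gen_tokens + alpha * mask * (gen_tokens - matched_ref)

# Toy usage on random features.
gen, ref = torch.randn(16, 8), torch.randn(12, 8)
print(negative_token_merge(gen, ref).shape)                # torch.Size([16, 8])
```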
3 code implementations • 21 Nov 2024 • Akari Asai, Jacqueline He, Rulin Shao, Weijia Shi, Amanpreet Singh, Joseph Chee Chang, Kyle Lo, Luca Soldaini, Sergey Feldman, Mike D'Arcy, David Wadden, Matt Latzke, Minyang Tian, Pan Ji, Shengyan Liu, Hao Tong, Bohao Wu, Yanyu Xiong, Luke Zettlemoyer, Graham Neubig, Dan Weld, Doug Downey, Wen-tau Yih, Pang Wei Koh, Hannaneh Hajishirzi
Scientific progress depends on researchers' ability to synthesize the growing body of literature.
no code implementations • 8 Nov 2024 • Tong Chen, Hao Fang, Patrick Xia, Xiaodong Liu, Benjamin Van Durme, Luke Zettlemoyer, Jianfeng Gao, Hao Cheng
In StreamingQA, our approach is effective in injecting knowledge into the LM's parameters, achieving a 63.5% improvement in F1 score over the model with supervised fine-tuning (from $19.5$ to $31.5$) for contexts as long as 32K tokens.
1 code implementation • 7 Nov 2024 • Weixin Liang, Lili Yu, Liang Luo, Srinivasan Iyer, Ning Dong, Chunting Zhou, Gargi Ghosh, Mike Lewis, Wen-tau Yih, Luke Zettlemoyer, Xi Victoria Lin
In the Transfusion setting, where text and image are trained with different objectives, a 7B MoT model matches the image modality performance of the dense baseline with one third of the FLOPs, and a 760M MoT model outperforms a 1.4B dense baseline across key image generation metrics.
no code implementations • 6 Nov 2024 • Anaelia Ovalle, Krunoslav Lehman Pavasovic, Louis Martin, Luke Zettlemoyer, Eric Michael Smith, Adina Williams, Levent Sagun
Natural-language assistants are designed to provide users with helpful responses while avoiding harmful outputs, largely achieved through alignment to human preferences.
1 code implementation • 22 Oct 2024 • Hu Xu, Po-Yao Huang, Xiaoqing Ellen Tan, Ching-Feng Yeh, Jacob Kahn, Christine Jou, Gargi Ghosh, Omer Levy, Luke Zettlemoyer, Wen-tau Yih, Shang-Wen Li, Saining Xie, Christoph Feichtenhofer
This paper focuses on creating synthetic data to improve the quality of image captions.
no code implementations • 15 Oct 2024 • Seonghyeon Ye, Joel Jang, Byeongguk Jeon, Sejune Joo, Jianwei Yang, Baolin Peng, Ajay Mandlekar, Reuben Tan, Yu-Wei Chao, Bill Yuchen Lin, Lars Liden, Kimin Lee, Jianfeng Gao, Luke Zettlemoyer, Dieter Fox, Minjoon Seo
We introduce Latent Action Pretraining for general Action models (LAPA), an unsupervised method for pretraining Vision-Language-Action (VLA) models without ground-truth robot action labels.
3 code implementations • 20 Aug 2024 • Chunting Zhou, Lili Yu, Arun Babu, Kushal Tirumala, Michihiro Yasunaga, Leonid Shamis, Jacob Kahn, Xuezhe Ma, Luke Zettlemoyer, Omer Levy
Our experiments show that Transfusion scales significantly better than quantizing images and training a language model over discrete image tokens.
1 code implementation • 12 Aug 2024 • Hila Gonen, Terra Blevins, Alisa Liu, Luke Zettlemoyer, Noah A. Smith
Despite their wide adoption, the biases and unintended behaviors of language models remain poorly understood.
no code implementations • 8 Aug 2024 • Thao Nguyen, Jeffrey Li, Sewoong Oh, Ludwig Schmidt, Jason Weston, Luke Zettlemoyer, Xian Li
We propose a new method, instruction back-and-forth translation, to construct high-quality synthetic data grounded in world knowledge for aligning large language models (LLMs).
no code implementations • 31 Jul 2024 • Xi Victoria Lin, Akshat Shrivastava, Liang Luo, Srinivasan Iyer, Mike Lewis, Gargi Ghosh, Luke Zettlemoyer, Armen Aghajanyan
Under a 1-trillion-token training budget, the MoMa 1.4B model, featuring 4 text experts and 4 image experts, achieves impressive FLOPs savings: 3.7x overall, with 2.6x for text and 5.2x for image processing compared to a compute-equivalent dense baseline, measured by pre-training loss.
1 code implementation • 9 Jul 2024 • Rulin Shao, Jacqueline He, Akari Asai, Weijia Shi, Tim Dettmers, Sewon Min, Luke Zettlemoyer, Pang Wei Koh
Specifically, we find that increasing the size of the datastore used by a retrieval-based LM monotonically improves language modeling and several downstream tasks without obvious saturation, such that a smaller model augmented with a large datastore outperforms a larger LM-only model on knowledge-intensive tasks.
1 code implementation • 9 Jul 2024 • Tong Chen, Akari Asai, Niloofar Mireshghallah, Sewon Min, James Grimmelmann, Yejin Choi, Hannaneh Hajishirzi, Luke Zettlemoyer, Pang Wei Koh
Evaluating the degree of reproduction of copyright-protected content by language models (LMs) is of significant interest to the AI and legal communities.
1 code implementation • 8 Jul 2024 • Weijia Shi, Jaechan Lee, Yangsibo Huang, Sadhika Malladi, Jieyu Zhao, Ari Holtzman, Daogao Liu, Luke Zettlemoyer, Noah A. Smith, Chiyuan Zhang
Data owners may request the removal of their data from a trained model due to privacy or copyright concerns.
no code implementations • 26 Jun 2024 • Boyi Wei, Weijia Shi, Yangsibo Huang, Noah A. Smith, Chiyuan Zhang, Luke Zettlemoyer, Kai Li, Peter Henderson
Language models (LMs) derive their capabilities from extensive training on diverse data, including potentially copyrighted material.
1 code implementation • 20 Jun 2024 • Luxi He, Yangsibo Huang, Weijia Shi, Tinghao Xie, Haotian Liu, Yue Wang, Luke Zettlemoyer, Chiyuan Zhang, Danqi Chen, Peter Henderson
We show that state-of-the-art image and video generation models can still generate characters even if characters' names are not explicitly mentioned, sometimes with only two generic keywords (e.g., prompting with "videogame, plumber" consistently generates Nintendo's Mario character).
3 code implementations • 17 Jun 2024 • Jeffrey Li, Alex Fang, Georgios Smyrnis, Maor Ivgi, Matt Jordan, Samir Gadre, Hritik Bansal, Etash Guha, Sedrick Keh, Kushal Arora, Saurabh Garg, Rui Xin, Niklas Muennighoff, Reinhard Heckel, Jean Mercat, Mayee Chen, Suchin Gururangan, Mitchell Wortsman, Alon Albalak, Yonatan Bitton, Marianna Nezhurina, Amro Abbas, Cheng-Yu Hsieh, Dhruba Ghosh, Josh Gardner, Maciej Kilian, HANLIN ZHANG, Rulin Shao, Sarah Pratt, Sunny Sanyal, Gabriel Ilharco, Giannis Daras, Kalyani Marathe, Aaron Gokaslan, Jieyu Zhang, Khyathi Chandu, Thao Nguyen, Igor Vasiljevic, Sham Kakade, Shuran Song, Sujay Sanghavi, Fartash Faghri, Sewoong Oh, Luke Zettlemoyer, Kyle Lo, Alaaeldin El-Nouby, Hadi Pouransari, Alexander Toshev, Stephanie Wang, Dirk Groeneveld, Luca Soldaini, Pang Wei Koh, Jenia Jitsev, Thomas Kollar, Alexandros G. Dimakis, Yair Carmon, Achal Dave, Ludwig Schmidt, Vaishaal Shankar
We introduce DataComp for Language Models (DCLM), a testbed for controlled dataset experiments with the goal of improving language models.
no code implementations • 13 Jun 2024 • Yushi Hu, Weijia Shi, Xingyu Fu, Dan Roth, Mari Ostendorf, Luke Zettlemoyer, Noah A Smith, Ranjay Krishna
In this work, we introduce Sketchpad, a framework that gives multimodal LMs a visual sketchpad and tools to draw on the sketchpad.
no code implementations • 21 May 2024 • Maciej Kilian, Varun Jampani, Luke Zettlemoyer
We find that token prediction methods, led by next-token prediction, significantly outperform diffusion on prompt following.
no code implementations • 26 Apr 2024 • Vasu Sharma, Karthik Padthe, Newsha Ardalani, Kushal Tirumala, Russell Howes, Hu Xu, Po-Yao Huang, Shang-Wen Li, Armen Aghajanyan, Gargi Ghosh, Luke Zettlemoyer
In recent times, training Language Models (LMs) has relied on computationally heavy training over massive datasets, which makes the training process extremely laborious.
1 code implementation • CVPR 2024 • Jiawei Ma, Po-Yao Huang, Saining Xie, Shang-Wen Li, Luke Zettlemoyer, Shih-Fu Chang, Wen-tau Yih, Hu Xu
The success of contrastive language-image pretraining (CLIP) relies on the supervision from the pairing between images and captions, which tends to be noisy in web-crawled data.
1 code implementation • 12 Apr 2024 • Xuezhe Ma, Xiaomeng Yang, Wenhan Xiong, Beidi Chen, Lili Yu, Hao Zhang, Jonathan May, Luke Zettlemoyer, Omer Levy, Chunting Zhou
The quadratic complexity and weak length extrapolation of Transformers limits their ability to scale to long sequences, and while sub-quadratic solutions like linear attention and state space models exist, they empirically underperform Transformers in pretraining efficiency and downstream task accuracy.
no code implementations • 15 Mar 2024 • Tomasz Limisiewicz, Terra Blevins, Hila Gonen, Orevaoghene Ahia, Luke Zettlemoyer
A major consideration in multilingual language modeling is how to best represent languages with diverse vocabularies and scripts.
no code implementations • 5 Mar 2024 • Akari Asai, Zexuan Zhong, Danqi Chen, Pang Wei Koh, Luke Zettlemoyer, Hannaneh Hajishirzi, Wen-tau Yih
Parametric language models (LMs), which are trained on vast amounts of web data, exhibit remarkable flexibility and capability.
no code implementations • 16 Feb 2024 • Haoqiang Kang, Terra Blevins, Luke Zettlemoyer
While many hallucination detection techniques have been evaluated on English text, their effectiveness in multilingual contexts remains unknown.
1 code implementation • 12 Feb 2024 • Michael Duan, Anshuman Suri, Niloofar Mireshghallah, Sewon Min, Weijia Shi, Luke Zettlemoyer, Yulia Tsvetkov, Yejin Choi, David Evans, Hannaneh Hajishirzi
Membership inference attacks (MIAs) attempt to predict whether a particular datapoint is a member of a target model's training data.
3 code implementations • 1 Feb 2024 • Dirk Groeneveld, Iz Beltagy, Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, Russell Authur, Khyathi Raghavi Chandu, Arman Cohan, Jennifer Dumas, Yanai Elazar, Yuling Gu, Jack Hessel, Tushar Khot, William Merrill, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Valentina Pyatkin, Abhilasha Ravichander, Dustin Schwenk, Saurabh Shah, Will Smith, Emma Strubell, Nishant Subramani, Mitchell Wortsman, Pradeep Dasigi, Nathan Lambert, Kyle Richardson, Luke Zettlemoyer, Jesse Dodge, Kyle Lo, Luca Soldaini, Noah A. Smith, Hannaneh Hajishirzi
Given the importance of these details in scientifically studying these models, including their biases and potential risks, we believe it is essential for the research community to have access to powerful, truly open LMs.
1 code implementation • 31 Jan 2024 • Luca Soldaini, Rodney Kinney, Akshita Bhagia, Dustin Schwenk, David Atkinson, Russell Authur, Ben Bogin, Khyathi Chandu, Jennifer Dumas, Yanai Elazar, Valentin Hofmann, Ananya Harsh Jha, Sachin Kumar, Li Lucy, Xinxi Lyu, Nathan Lambert, Ian Magnusson, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Abhilasha Ravichander, Kyle Richardson, Zejiang Shen, Emma Strubell, Nishant Subramani, Oyvind Tafjord, Pete Walsh, Luke Zettlemoyer, Noah A. Smith, Hannaneh Hajishirzi, Iz Beltagy, Dirk Groeneveld, Jesse Dodge, Kyle Lo
As a result, it is challenging to conduct and advance scientific research on language modeling, such as understanding how training data impacts model capabilities and limitations.
2 code implementations • 30 Jan 2024 • Jiacheng Liu, Sewon Min, Luke Zettlemoyer, Yejin Choi, Hannaneh Hajishirzi
Second, existing $n$-gram LMs use small $n$ which hinders their performance; we instead allow $n$ to be arbitrarily large, by introducing a new $\infty$-gram LM with backoff.
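The entry above contrasts small-$n$ $n$-gram LMs with an $\infty$-gram LM that backs off from arbitrarily long contexts. The toy sketch below shows that backoff logic with a naive linear scan over a tokenized corpus; the actual system relies on suffix arrays to make this fast at trillion-token scale, and the function here is only illustrative.

```python
from collections import Counter

def infty_gram_next_token(corpus, context):
    """Infinity-gram with backoff (naive sketch): use the longest suffix of `context`
    that occurs in `corpus`, and return the empirical distribution over the tokens
    that follow it. The real system uses suffix arrays; this linear scan only
    demonstrates the backoff behaviour.
    """
    for start in range(len(context)):            # try the longest suffix first, then back off
        suffix = context[start:]
        counts = Counter(
            corpus[i + len(suffix)]
            for i in range(len(corpus) - len(suffix))
            if corpus[i:i + len(suffix)] == suffix
        )
        if counts:
            total = sum(counts.values())
            return {tok: c / total for tok, c in counts.items()}
    return {}                                    # no suffix of the context (down to one token) was seen

# Toy usage: "on the cat" is unseen, so the model backs off to "the cat".
corpus = "the cat sat on the mat the cat ran".split()
print(infty_gram_next_token(corpus, "on the cat".split()))   # {'sat': 0.5, 'ran': 0.5}
```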
1 code implementation • 19 Jan 2024 • Terra Blevins, Tomasz Limisiewicz, Suchin Gururangan, Margaret Li, Hila Gonen, Noah A. Smith, Luke Zettlemoyer
Despite their popularity in non-English NLP, multilingual language models often underperform monolingual ones due to inter-language competition for model parameters.
no code implementations • 8 Dec 2023 • Olga Golovneva, Sean O'Brien, Ramakanth Pasunuru, Tianlu Wang, Luke Zettlemoyer, Maryam Fazel-Zarandi, Asli Celikyilmaz
Using constrained reasoning, PathFinder integrates novel quality constraints, pruning, and exploration methods to enhance the efficiency and the quality of generation.
1 code implementation • 25 Oct 2023 • Weijia Shi, Anirudh Ajith, Mengzhou Xia, Yangsibo Huang, Daogao Liu, Terra Blevins, Danqi Chen, Luke Zettlemoyer
Min-K% Prob can be applied without any knowledge about the pretraining corpus or any additional training, departing from previous detection methods that require training a reference model on data that is similar to the pretraining data.
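The Min-K% Prob detector described above reduces to a simple statistic: the average log-probability of the least likely tokens in a candidate text under the target model. A minimal sketch, assuming per-token log-probabilities have already been computed with that model:

```python
import numpy as np

def min_k_percent_prob(token_log_probs, k=20.0):
    """Min-K% Prob membership score: average log-probability of the k% least likely tokens.
    Higher scores suggest the text is more likely to have been in the pretraining data.
    token_log_probs: per-token log p(x_t | x_<t) from the target model (assumed precomputed).
    """
    log_probs = np.sort(np.asarray(token_log_probs))     # ascending: least likely tokens first
    num = max(1, int(len(log_probs) * k / 100.0))
    return float(log_probs[:num].mean())

# Toy usage: a text with uniformly high token probabilities scores higher than one
# containing a few very surprising tokens.
print(min_k_percent_prob([-1.2, -0.8, -1.0, -0.9, -1.1]))
print(min_k_percent_prob([-1.2, -0.8, -9.0, -0.9, -8.5]))
```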
1 code implementation • 17 Oct 2023 • Joel Jang, Seungone Kim, Bill Yuchen Lin, Yizhong Wang, Jack Hessel, Luke Zettlemoyer, Hannaneh Hajishirzi, Yejin Choi, Prithviraj Ammanabrolu
In this work, we study the Reinforcement Learning from Personalized Human Feedback (RLPHF) problem, wherein LLMs are aligned to multiple (sometimes conflicting) preferences by modeling alignment as a Multi-Objective Reinforcement Learning (MORL) problem.
1 code implementation • 16 Oct 2023 • Weijia Shi, Sewon Min, Maria Lomeli, Chunting Zhou, Margaret Li, Gergely Szilvasy, Rich James, Xi Victoria Lin, Noah A. Smith, Luke Zettlemoyer, Scott Yih, Mike Lewis
Large language models (LMs) are currently trained to predict tokens given document prefixes, enabling them to directly perform long-form generation and prompting-style tasks which can be reduced to document completion.
no code implementations • 2 Oct 2023 • Xi Victoria Lin, Xilun Chen, Mingda Chen, Weijia Shi, Maria Lomeli, Rich James, Pedro Rodriguez, Jacob Kahn, Gergely Szilvasy, Mike Lewis, Luke Zettlemoyer, Scott Yih
Retrieval-augmented language models (RALMs) improve performance by accessing long-tail and up-to-date knowledge from external data stores, but are challenging to build.
Ranked #21 on Question Answering on TriviaQA (using extra training data)
2 code implementations • 28 Sep 2023 • Hu Xu, Saining Xie, Xiaoqing Ellen Tan, Po-Yao Huang, Russell Howes, Vasu Sharma, Shang-Wen Li, Gargi Ghosh, Luke Zettlemoyer, Christoph Feichtenhofer
We believe that the main ingredient to the success of CLIP is its data and not the model architecture or pre-training objective.
1 code implementation • 5 Sep 2023 • Lili Yu, Bowen Shi, Ramakanth Pasunuru, Benjamin Muller, Olga Golovneva, Tianlu Wang, Arun Babu, Binh Tang, Brian Karrer, Shelly Sheynin, Candace Ross, Adam Polyak, Russell Howes, Vasu Sharma, Puxin Xu, Hovhannes Tamoyan, Oron Ashual, Uriel Singer, Shang-Wen Li, Susan Zhang, Richard James, Gargi Ghosh, Yaniv Taigman, Maryam Fazel-Zarandi, Asli Celikyilmaz, Luke Zettlemoyer, Armen Aghajanyan
It is also a general-purpose model that can do both text-to-image and image-to-text generation, allowing us to introduce self-contained contrastive decoding methods that produce high-quality outputs.
Ranked #2 on Text-to-Image Generation on COCO
1 code implementation • 31 Aug 2023 • Benjamin Muller, Belen Alastruey, Prangthip Hansanti, Elahe Kalbassi, Christophe Ropers, Eric Michael Smith, Adina Williams, Luke Zettlemoyer, Pierre Andrews, Marta R. Costa-jussà
We showcase it to report gender representation in WMT training data and development data for the News task, confirming that current data is skewed towards masculine representation.
2 code implementations • 31 Aug 2023 • Lucas Bandarkar, Davis Liang, Benjamin Muller, Mikel Artetxe, Satya Narayan Shukla, Donald Husa, Naman Goyal, Abhinandan Krishnan, Luke Zettlemoyer, Madian Khabsa
We use this dataset to evaluate the capabilities of multilingual masked language models (MLMs) and large language models (LLMs).
2 code implementations • 11 Aug 2023 • Xian Li, Ping Yu, Chunting Zhou, Timo Schick, Omer Levy, Luke Zettlemoyer, Jason Weston, Mike Lewis
We present a scalable method to build a high quality instruction following language model by automatically labelling human-written text with corresponding instructions.
1 code implementation • 8 Aug 2023 • Sewon Min, Suchin Gururangan, Eric Wallace, Weijia Shi, Hannaneh Hajishirzi, Noah A. Smith, Luke Zettlemoyer
SILO is built by (1) training a parametric LM on Open License Corpus (OLC), a new corpus we curate with 228B tokens of public domain and permissively licensed text and (2) augmenting it with a more general and easily modifiable nonparametric datastore (e.g., containing copyrighted books or news) that is only queried during inference.
1 code implementation • 8 Aug 2023 • Tianlu Wang, Ping Yu, Xiaoqing Ellen Tan, Sean O'Brien, Ramakanth Pasunuru, Jane Dwivedi-Yu, Olga Golovneva, Luke Zettlemoyer, Maryam Fazel-Zarandi, Asli Celikyilmaz
As large language models improve, there is increasing interest in techniques that leverage these models' capabilities to refine their own outputs.
no code implementations • 31 Jul 2023 • Ari Holtzman, Peter West, Luke Zettlemoyer
Coaxing out desired behavior from pretrained models, while avoiding undesirable ones, has redefined NLP and is reshaping how we interact with computers.
no code implementations • 24 May 2023 • Chenglei Si, Weijia Shi, Chen Zhao, Luke Zettlemoyer, Jordan Boyd-Graber
Beyond generalizability, the interpretable design of MoRE improves selective question answering results compared to baselines without incorporating inter-expert agreement.
3 code implementations • 24 May 2023 • Weijia Shi, Xiaochuang Han, Mike Lewis, Yulia Tsvetkov, Luke Zettlemoyer, Scott Wen-tau Yih
Language models (LMs) often struggle to pay enough attention to the input context, and generate texts that are unfaithful or contain hallucinations.
17 code implementations • NeurIPS 2023 • Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, Luke Zettlemoyer
Our best model family, which we name Guanaco, outperforms all previous openly released models on the Vicuna benchmark, reaching 99.3% of the performance level of ChatGPT while only requiring 24 hours of finetuning on a single GPU.
Ranked #1 on parameter-efficient fine-tuning on BoolQ
4 code implementations • 23 May 2023 • Sewon Min, Kalpesh Krishna, Xinxi Lyu, Mike Lewis, Wen-tau Yih, Pang Wei Koh, Mohit Iyyer, Luke Zettlemoyer, Hannaneh Hajishirzi
Evaluating the factuality of long-form text generated by large language models (LMs) is non-trivial because (1) generations often contain a mixture of supported and unsupported pieces of information, making binary judgments of quality inadequate, and (2) human evaluation is time-consuming and costly.
no code implementations • 23 May 2023 • Mikel Artetxe, Vedanuj Goswami, Shruti Bhosale, Angela Fan, Luke Zettlemoyer
Machine Translation (MT) has been widely used for cross-lingual classification, either by translating the test set into English and running inference with a monolingual model (translate-test), or translating the training set into the target languages and finetuning a multilingual model (translate-train).
5 code implementations • NeurIPS 2023 • Chunting Zhou, PengFei Liu, Puxin Xu, Srini Iyer, Jiao Sun, Yuning Mao, Xuezhe Ma, Avia Efrat, Ping Yu, Lili Yu, Susan Zhang, Gargi Ghosh, Mike Lewis, Luke Zettlemoyer, Omer Levy
Large language models are trained in two stages: (1) unsupervised pretraining from raw text, to learn general-purpose representations, and (2) large scale instruction tuning and reinforcement learning, to better align to end tasks and user preferences.
no code implementations • NeurIPS 2023 • Lili Yu, Dániel Simig, Colin Flaherty, Armen Aghajanyan, Luke Zettlemoyer, Mike Lewis
Autoregressive transformers are spectacular models for short sequences but scale poorly to long sequences such as high-resolution images, podcasts, code, or books.
no code implementations • 26 Apr 2023 • Haoqiang Kang, Terra Blevins, Luke Zettlemoyer
To better understand this contrast, we present a new study investigating how well PLMs capture cross-lingual word sense with Contextual Word-Level Translation (C-WLT), an extension of word-level translation that prompts the model to translate a given word in context.
1 code implementation • NeurIPS 2023 • Mitchell Wortsman, Tim Dettmers, Luke Zettlemoyer, Ari Morcos, Ali Farhadi, Ludwig Schmidt
We introduce new methods for 1) accelerating and 2) stabilizing training for large language-vision models.
1 code implementation • 24 Mar 2023 • Suchin Gururangan, Margaret Li, Mike Lewis, Weijia Shi, Tim Althoff, Noah A. Smith, Luke Zettlemoyer
Large language models are typically trained densely: all parameters are updated with respect to all inputs.
2 code implementations • 16 Mar 2023 • Bhargavi Paranjape, Scott Lundberg, Sameer Singh, Hannaneh Hajishirzi, Luke Zettlemoyer, Marco Tulio Ribeiro
We introduce Automatic Reasoning and Tool-use (ART), a framework that uses frozen LLMs to automatically generate intermediate reasoning steps as a program.
no code implementations • 15 Feb 2023 • Marjan Ghazvininejad, Hila Gonen, Luke Zettlemoyer
Large language models (LLMs) demonstrate remarkable machine translation (MT) abilities via prompting, even though they were not explicitly trained for this task.
1 code implementation • NeurIPS 2023 • Timo Schick, Jane Dwivedi-Yu, Roberto Dessì, Roberta Raileanu, Maria Lomeli, Luke Zettlemoyer, Nicola Cancedda, Thomas Scialom
Language models (LMs) exhibit remarkable abilities to solve new tasks from just a few examples or textual instructions, especially at scale.
1 code implementation • 4 Feb 2023 • Yu Meng, Jitin Krishnan, Sinong Wang, Qifan Wang, Yuning Mao, Han Fang, Marjan Ghazvininejad, Jiawei Han, Luke Zettlemoyer
In this work, we offer a new perspective on the consequence of such a discrepancy: We demonstrate empirically and theoretically that MLM pretraining allocates some model dimensions exclusively for representing $\texttt{[MASK]}$ tokens, resulting in a representation deficiency for real tokens and limiting the pretrained model's expressiveness when it is adapted to downstream data without $\texttt{[MASK]}$ tokens.
2 code implementations • 30 Jan 2023 • Weijia Shi, Sewon Min, Michihiro Yasunaga, Minjoon Seo, Rich James, Mike Lewis, Luke Zettlemoyer, Wen-tau Yih
We introduce REPLUG, a retrieval-augmented language modeling framework that treats the language model (LM) as a black box and augments it with a tuneable retrieval model.
Ranked #15 on Question Answering on Natural Questions
3 code implementations • 25 Jan 2023 • Davis Liang, Hila Gonen, Yuning Mao, Rui Hou, Naman Goyal, Marjan Ghazvininejad, Luke Zettlemoyer, Madian Khabsa
Large multilingual language models typically rely on a single vocabulary shared across 100+ languages.
no code implementations • 10 Jan 2023 • Armen Aghajanyan, Lili Yu, Alexis Conneau, Wei-Ning Hsu, Karen Hambardzumyan, Susan Zhang, Stephen Roller, Naman Goyal, Omer Levy, Luke Zettlemoyer
To better understand the scaling properties of such mixed-modal models, we conducted over 250 experiments using seven different modalities and model sizes ranging from 8 million to 30 billion, trained on 5-100 billion tokens.
1 code implementation • ICCV 2023 • Hu Xu, Saining Xie, Po-Yao Huang, Licheng Yu, Russell Howes, Gargi Ghosh, Luke Zettlemoyer, Christoph Feichtenhofer
Large vision-language models are generally applicable to many downstream tasks, but come at an exorbitant training cost that only large institutions can afford.
1 code implementation • 22 Dec 2022 • Srinivasan Iyer, Xi Victoria Lin, Ramakanth Pasunuru, Todor Mihaylov, Daniel Simig, Ping Yu, Kurt Shuster, Tianlu Wang, Qing Liu, Punit Singh Koura, Xian Li, Brian O'Horo, Gabriel Pereyra, Jeff Wang, Christopher Dewan, Asli Celikyilmaz, Luke Zettlemoyer, Ves Stoyanov
To this end, we create OPT-IML Bench: a large benchmark for Instruction Meta-Learning (IML) of 2000 NLP tasks consolidated into task categories from 8 existing benchmarks, and prepare an evaluation framework to measure three types of model generalizations: to tasks from fully held-out categories, to held-out tasks from seen categories, and to held-out instances from seen tasks.
Ranked #26 on Natural Language Inference on RTE
no code implementations • 20 Dec 2022 • Weijia Shi, Xiaochuang Han, Hila Gonen, Ari Holtzman, Yulia Tsvetkov, Luke Zettlemoyer
Large language models can perform new tasks in a zero-shot fashion, given natural language prompts that specify the desired behavior.
2 code implementations • 20 Dec 2022 • Boshi Wang, Sewon Min, Xiang Deng, Jiaming Shen, You Wu, Luke Zettlemoyer, Huan Sun
Chain-of-Thought (CoT) prompting can dramatically improve the multi-step reasoning abilities of large language models (LLMs).
1 code implementation • 19 Dec 2022 • Mengzhou Xia, Mikel Artetxe, Chunting Zhou, Xi Victoria Lin, Ramakanth Pasunuru, Danqi Chen, Luke Zettlemoyer, Ves Stoyanov
Why do larger language models demonstrate more desirable behaviors?
2 code implementations • 19 Dec 2022 • Xinxi Lyu, Sewon Min, Iz Beltagy, Luke Zettlemoyer, Hannaneh Hajishirzi
Although large language models can be prompted for both zero- and few-shot learning, performance drops significantly when no demonstrations are available.
4 code implementations • 19 Dec 2022 • Hongjin Su, Weijia Shi, Jungo Kasai, Yizhong Wang, Yushi Hu, Mari Ostendorf, Wen-tau Yih, Noah A. Smith, Luke Zettlemoyer, Tao Yu
Our analysis suggests that INSTRUCTOR is robust to changes in instructions, and that instruction finetuning mitigates the challenge of training a single model on diverse datasets.
1 code implementation • 19 Dec 2022 • Tim Dettmers, Luke Zettlemoyer
Quantization methods reduce the number of bits required to represent each parameter in a model, trading accuracy for smaller memory footprints and inference latencies.
1 code implementation • 15 Dec 2022 • Olga Golovneva, Moya Chen, Spencer Poff, Martin Corredor, Luke Zettlemoyer, Maryam Fazel-Zarandi, Asli Celikyilmaz
Large language models show improved downstream task performance when prompted to generate step-by-step reasoning to justify their final answers.
no code implementations • 8 Dec 2022 • Hila Gonen, Srini Iyer, Terra Blevins, Noah A. Smith, Luke Zettlemoyer
Language models can be prompted to perform a wide variety of zero- and few-shot learning problems.
2 code implementations • 5 Dec 2022 • Sweta Agrawal, Chunting Zhou, Mike Lewis, Luke Zettlemoyer, Marjan Ghazvininejad
Large-scale generative models show an impressive ability to perform a wide range of Natural Language Processing (NLP) tasks using in-context learning, where a few examples are used to describe a task to the model.
1 code implementation • 2 Dec 2022 • Sewon Min, Weijia Shi, Mike Lewis, Xilun Chen, Wen-tau Yih, Hannaneh Hajishirzi, Luke Zettlemoyer
Existing language models (LMs) predict tokens with a softmax over a finite vocabulary, which can make it difficult to predict rare tokens or phrases.
1 code implementation • 2 Dec 2022 • Bhargavi Paranjape, Pradeep Dasigi, Vivek Srikumar, Luke Zettlemoyer, Hannaneh Hajishirzi
We propose AGRO -- Adversarial Group discovery for Distributionally Robust Optimization -- an end-to-end approach that jointly identifies error-prone groups and improves accuracy on them.
1 code implementation • 30 Nov 2022 • Xinyan Velocity Yu, Sewon Min, Luke Zettlemoyer, Hannaneh Hajishirzi
We find that 25% of questions contain false presuppositions, and provide annotations for these presuppositions and their corrections.
no code implementations • 22 Nov 2022 • Michihiro Yasunaga, Armen Aghajanyan, Weijia Shi, Rich James, Jure Leskovec, Percy Liang, Mike Lewis, Luke Zettlemoyer, Wen-tau Yih
To integrate knowledge in a more scalable and modular way, we propose a retrieval-augmented multimodal model, which enables a base multimodal model (generator) to refer to relevant text and images fetched by a retriever from external memory (e.g., documents on the web).
Ranked #8 on Image Captioning on MS COCO
2 code implementations • 18 Nov 2022 • Yuhang Lai, Chengxi Li, Yiming Wang, Tianyi Zhang, Ruiqi Zhong, Luke Zettlemoyer, Scott Wen-tau Yih, Daniel Fried, Sida Wang, Tao Yu
We introduce DS-1000, a code generation benchmark with a thousand data science problems spanning seven Python libraries, such as NumPy and Pandas.
no code implementations • 15 Nov 2022 • Terra Blevins, Hila Gonen, Luke Zettlemoyer
Although pretrained language models (PLMs) can be prompted to perform a wide range of language tasks, it remains an open question how much this ability comes from generalizable linguistic understanding versus surface-level lexical patterns.
2 code implementations • 27 Oct 2022 • Xiang Lisa Li, Ari Holtzman, Daniel Fried, Percy Liang, Jason Eisner, Tatsunori Hashimoto, Luke Zettlemoyer, Mike Lewis
We propose contrastive decoding (CD), a reliable decoding approach that optimizes a contrastive objective subject to a plausibility constraint.
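Contrastive decoding, as described above, ranks tokens by the difference between an expert and an amateur model's log-probabilities, restricted to tokens the expert already finds plausible. A minimal single-step sketch with toy distributions (the `alpha` plausibility cutoff mirrors the constraint mentioned in the abstract):

```python
import numpy as np

def contrastive_decoding_step(log_p_expert, log_p_amateur, alpha=0.1):
    """One decoding step of contrastive decoding (sketch).
    Keeps only tokens the expert finds plausible (prob >= alpha * max expert prob),
    then ranks them by expert log-prob minus amateur log-prob.
    """
    p_expert = np.exp(log_p_expert)
    plausible = p_expert >= alpha * p_expert.max()             # plausibility constraint
    scores = np.where(plausible, log_p_expert - log_p_amateur, -np.inf)
    return int(np.argmax(scores))

# Toy usage with a 5-token vocabulary.
log_p_expert = np.log([0.40, 0.30, 0.15, 0.10, 0.05])
log_p_amateur = np.log([0.50, 0.10, 0.15, 0.20, 0.05])
print(contrastive_decoding_step(log_p_expert, log_p_amateur))  # token 1: plausible, least amateur-like
```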
1 code implementation • 25 Oct 2022 • Victor Zhong, Weijia Shi, Wen-tau Yih, Luke Zettlemoyer
Moreover, existing models are not robust to variations in question constraints, but can be made more robust by tuning on clusters of related questions.
1 code implementation • 13 Oct 2022 • Machel Reid, Victor Zhong, Suchin Gururangan, Luke Zettlemoyer
We present M2D2, a fine-grained, massively multi-domain corpus for studying domain adaptation in language models (LMs).
1 code implementation • 10 Oct 2022 • Tanay Dixit, Bhargavi Paranjape, Hannaneh Hajishirzi, Luke Zettlemoyer
We present COunterfactual Generation via Retrieval and Editing (CORE), a retrieval-augmented generation framework for creating diverse counterfactual perturbations for CDA.
4 code implementations • 6 Oct 2022 • Zhoujun Cheng, Tianbao Xie, Peng Shi, Chengzu Li, Rahul Nadkarni, Yushi Hu, Caiming Xiong, Dragomir Radev, Mari Ostendorf, Luke Zettlemoyer, Noah A. Smith, Tao Yu
We propose Binder, a training-free neural-symbolic framework that maps the task input to a program, which (1) allows binding a unified API of language model (LM) functionalities to a programming language (e.g., SQL, Python) to extend its grammar coverage and thus tackle more diverse questions, (2) adopts an LM as both the program parser and the underlying model called by the API during execution, and (3) requires only a few in-context exemplar annotations.
Ranked #5 on Table-based Fact Verification on TabFact
1 code implementation • 30 Sep 2022 • Victor Zhong, Jesse Mu, Luke Zettlemoyer, Edward Grefenstette, Tim Rocktäschel
Recent work has shown that augmenting environments with language descriptions improves policy learning.
7 code implementations • 21 Sep 2022 • Xuezhe Ma, Chunting Zhou, Xiang Kong, Junxian He, Liangke Gui, Graham Neubig, Jonathan May, Luke Zettlemoyer
The design choices in the Transformer attention mechanism, including weak inductive bias and quadratic computational complexity, have limited its application for modeling long sequences.
Ranked #1 on ListOps on ListOps
1 code implementation • 5 Sep 2022 • Hongjin Su, Jungo Kasai, Chen Henry Wu, Weijia Shi, Tianlu Wang, Jiayi Xin, Rui Zhang, Mari Ostendorf, Luke Zettlemoyer, Noah A. Smith, Tao Yu
Departing from recent in-context learning methods, we formulate an annotation-efficient, two-step framework: selective annotation that chooses a pool of examples to annotate from unlabeled data in advance, followed by prompt retrieval that retrieves task examples from the annotated pool at test time.
3 code implementations • 15 Aug 2022 • Tim Dettmers, Mike Lewis, Younes Belkada, Luke Zettlemoyer
We develop a procedure for Int8 matrix multiplication for feed-forward and attention projection layers in transformers, which cuts the memory needed for inference by half while retaining full precision performance; a minimal sketch of the idea follows this entry.
Ranked #2 on Language Modelling on C4
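The Int8 inference entry above quantizes the feed-forward and attention projection matmuls to 8 bits. The sketch below shows one plausible reading of that recipe: vector-wise absmax quantization with a simple outlier split kept in floating point. It is an assumption-laden approximation for illustration, not the released bitsandbytes kernels, and `outlier_threshold` is a hypothetical parameter.

```python
import numpy as np

def int8_matmul_with_outliers(X, W, outlier_threshold=6.0):
    """Sketch of mixed-precision Int8 matmul: quantize most features to int8 with
    row/column-wise absmax scaling, keep outlier feature dimensions in float.
    """
    # Split feature dimensions into outliers (any activation above threshold) and the rest.
    outlier_dims = np.any(np.abs(X) > outlier_threshold, axis=0)
    X_out, W_out = X[:, outlier_dims], W[outlier_dims, :]
    X_reg, W_reg = X[:, ~outlier_dims], W[~outlier_dims, :]

    # Vector-wise absmax quantization to int8.
    sx = np.abs(X_reg).max(axis=1, keepdims=True) / 127.0 + 1e-12   # per-row scales
    sw = np.abs(W_reg).max(axis=0, keepdims=True) / 127.0 + 1e-12   # per-column scales
    Xq = np.round(X_reg / sx).astype(np.int8)
    Wq = np.round(W_reg / sw).astype(np.int8)

    # Int8 matmul accumulated in int32, then dequantized; outlier dims computed in float.
    Y_reg = (Xq.astype(np.int32) @ Wq.astype(np.int32)) * (sx * sw)
    return Y_reg + X_out @ W_out

# Toy check against the float matmul.
rng = np.random.default_rng(0)
X, W = rng.normal(size=(4, 16)), rng.normal(size=(16, 8))
X[:, 3] *= 20.0                                  # make one feature dimension an outlier
print(np.max(np.abs(int8_matmul_with_outliers(X, W) - X @ W)))
```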
2 code implementations • 5 Aug 2022 • Margaret Li, Suchin Gururangan, Tim Dettmers, Mike Lewis, Tim Althoff, Noah A. Smith, Luke Zettlemoyer
New ELMs are learned by branching from (mixtures of) ELMs in the current set, further training the parameters on data for the new domain, and then merging the resulting model back into the set for future use.
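The branch-train-merge recipe above grows a set of expert LMs (ELMs) by branching from the current set, training on a new domain, and merging the result back. The sketch below illustrates only the branch step as parameter averaging over existing experts; the weighting scheme and the tiny `nn.Linear` experts are stand-ins, not the paper's actual models.

```python
import torch
import torch.nn as nn

def branch_from_experts(expert_state_dicts, weights):
    """Initialize a new expert LM by parameter-averaging a mixture of existing experts
    (the 'branch' step). Training on the new domain and adding the result back to the
    expert set are left to the usual training loop.
    """
    new_state = {}
    for name in expert_state_dicts[0]:
        new_state[name] = sum(w * sd[name] for w, sd in zip(weights, expert_state_dicts))
    return new_state

# Toy usage with two tiny "experts".
make_expert = lambda: nn.Linear(8, 8)
experts = [make_expert().state_dict(), make_expert().state_dict()]
branched = branch_from_experts(experts, weights=[0.7, 0.3])
new_expert = make_expert()
new_expert.load_state_dict(branched)   # then fine-tune new_expert on the new domain's data
```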
1 code implementation • 29 Jun 2022 • Paden Tomasello, Akshat Shrivastava, Daniel Lazar, Po-chun Hsu, Duc Le, Adithya Sagar, Ali Elkahky, Jade Copet, Wei-Ning Hsu, Yossi Adi, Robin Algayres, Tu Ahn Nguyen, Emmanuel Dupoux, Luke Zettlemoyer, Abdelrahman Mohamed
Furthermore, in addition to the human-recorded audio, we are releasing a TTS-generated version to benchmark the performance for low-resource domain adaptation of end-to-end SLU systems.
Automatic Speech Recognition (ASR), +4
1 code implementation • 21 Jun 2022 • Devendra Singh Sachan, Mike Lewis, Dani Yogatama, Luke Zettlemoyer, Joelle Pineau, Manzil Zaheer
We introduce ART, a new corpus-level autoencoding approach for training dense retrieval models that does not require any labeled training data.
no code implementations • 7 Jun 2022 • Siddharth Dalmia, Dmytro Okhonko, Mike Lewis, Sergey Edunov, Shinji Watanabe, Florian Metze, Luke Zettlemoyer, Abdelrahman Mohamed
We describe LegoNN, a procedure for building encoder-decoder architectures so that their parts can be applied to other tasks without the need for any fine-tuning.
Automatic Speech Recognition (ASR), +5
1 code implementation • 27 May 2022 • Weijia Shi, Julian Michael, Suchin Gururangan, Luke Zettlemoyer
Retrieval-augmented language models (LMs) use non-parametric memory to substantially outperform their non-retrieval counterparts on perplexity-based evaluations, but it is an open question whether they achieve similar gains in few- and zero-shot end-task accuracy.
no code implementations • 25 May 2022 • Suzanna Sia, Anton Belyy, Amjad Almahairi, Madian Khabsa, Luke Zettlemoyer, Lambert Mathias
Evaluating an explanation's faithfulness is desired for many reasons such as trust, interpretability, and diagnosing the sources of a model's errors.
no code implementations • 24 May 2022 • Terra Blevins, Hila Gonen, Luke Zettlemoyer
The emergent cross-lingual transfer seen in multilingual pretrained models has sparked significant interest in studying their behavior.
no code implementations • 24 May 2022 • Mikel Artetxe, Jingfei Du, Naman Goyal, Luke Zettlemoyer, Ves Stoyanov
Prior work on language model pre-training has explored different architectures and learning objectives, but differences in data, hyperparameters and evaluation make a principled comparison difficult.
no code implementations • 22 May 2022 • Kushal Tirumala, Aram H. Markosyan, Luke Zettlemoyer, Armen Aghajanyan
Despite their wide adoption, the underlying training and memorization dynamics of very large language models are not well understood.
no code implementations • 9 May 2022 • Mandar Joshi, Terra Blevins, Mike Lewis, Daniel S. Weld, Luke Zettlemoyer
Creating labeled natural language training data is expensive and requires significant human effort.
11 code implementations • 2 May 2022 • Susan Zhang, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen, Shuohui Chen, Christopher Dewan, Mona Diab, Xian Li, Xi Victoria Lin, Todor Mihaylov, Myle Ott, Sam Shleifer, Kurt Shuster, Daniel Simig, Punit Singh Koura, Anjali Sridhar, Tianlu Wang, Luke Zettlemoyer
Large language models, which are often trained for hundreds of thousands of compute days, have shown remarkable capabilities for zero- and few-shot learning.
Ranked #2 on Stereotypical Bias Analysis on CrowS-Pairs
1 code implementation • 25 Apr 2022 • Freda Shi, Daniel Fried, Marjan Ghazvininejad, Luke Zettlemoyer, Sida I. Wang
In this work, we introduce execution result-based minimum Bayes risk decoding (MBR-EXEC) for program selection and show that it improves the few-shot performance of pretrained code models on natural-language-to-code tasks; a minimal sketch follows this entry.
Ranked #47 on Code Generation on MBPP
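MBR-EXEC, referenced above, selects among sampled programs by executing them and preferring the candidate whose execution results agree with the most other samples. A toy sketch, with candidates represented as Python callables rather than generated source strings:

```python
from collections import Counter

def mbr_exec_select(candidates, test_inputs):
    """Select among sampled candidate programs by execution agreement (MBR-EXEC sketch):
    run every candidate on the same inputs and pick the one whose outputs match the
    most other candidates.
    """
    def outcome(prog):
        results = []
        for x in test_inputs:
            try:
                results.append(prog(x))
            except Exception:
                results.append("<error>")
        return tuple(results)

    outcomes = [outcome(c) for c in candidates]
    agreement = Counter(outcomes)
    # Pick the candidate whose execution result is shared by the largest number of samples.
    best_idx = max(range(len(candidates)), key=lambda i: agreement[outcomes[i]])
    return candidates[best_idx]

# Toy usage: three sampled "programs" for squaring a number; the buggy one is outvoted.
cands = [lambda x: x * x, lambda x: x ** 2, lambda x: x + x]
chosen = mbr_exec_select(cands, test_inputs=[2, 3])
print(chosen(5))   # 25
```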
no code implementations • 17 Apr 2022 • Terra Blevins, Luke Zettlemoyer
English pretrained language models, which make up the backbone of many modern NLP systems, require huge amounts of unlabeled training data.
1 code implementation • 15 Apr 2022 • Devendra Singh Sachan, Mike Lewis, Mandar Joshi, Armen Aghajanyan, Wen-tau Yih, Joelle Pineau, Luke Zettlemoyer
We propose a simple and effective re-ranking method for improving passage retrieval in open question answering.
3 code implementations • 12 Apr 2022 • Daniel Fried, Armen Aghajanyan, Jessy Lin, Sida Wang, Eric Wallace, Freda Shi, Ruiqi Zhong, Wen-tau Yih, Luke Zettlemoyer, Mike Lewis
Our model is the first generative model that is able to directly perform zero-shot code infilling, which we evaluate on challenging tasks such as type inference, comment generation, and variable re-naming.
Ranked #95 on Code Generation on MBPP
2 code implementations • 3 Apr 2022 • Rabeeh Karimi Mahabadi, Luke Zettlemoyer, James Henderson, Marzieh Saeidi, Lambert Mathias, Veselin Stoyanov, Majid Yazdani
Current methods for few-shot fine-tuning of pretrained masked language models (PLMs) require carefully engineered prompts and verbalizers for each new task to convert examples into a cloze-format that the PLM can score.
2 code implementations • 25 Feb 2022 • Sewon Min, Xinxi Lyu, Ari Holtzman, Mikel Artetxe, Mike Lewis, Hannaneh Hajishirzi, Luke Zettlemoyer
Large language models (LMs) are able to in-context learn -- perform a new task via inference alone by conditioning on a few input-label pairs (demonstrations) and making predictions for new inputs.
no code implementations • 25 Jan 2022 • Suchin Gururangan, Dallas Card, Sarah K. Dreier, Emily K. Gade, Leroy Z. Wang, Zeyu Wang, Luke Zettlemoyer, Noah A. Smith
Language models increasingly rely on massive web dumps for diverse text data.
no code implementations • 19 Jan 2022 • Armen Aghajanyan, Bernie Huang, Candace Ross, Vladimir Karpukhin, Hu Xu, Naman Goyal, Dmytro Okhonko, Mandar Joshi, Gargi Ghosh, Mike Lewis, Luke Zettlemoyer
We introduce CM3, a family of causally masked generative models trained over a large corpus of structured multi-modal documents that can contain both text and image tokens.
1 code implementation • 16 Jan 2022 • Tianbao Xie, Chen Henry Wu, Peng Shi, Ruiqi Zhong, Torsten Scholak, Michihiro Yasunaga, Chien-Sheng Wu, Ming Zhong, Pengcheng Yin, Sida I. Wang, Victor Zhong, Bailin Wang, Chengzu Li, Connor Boyle, Ansong Ni, Ziyu Yao, Dragomir Radev, Caiming Xiong, Lingpeng Kong, Rui Zhang, Noah A. Smith, Luke Zettlemoyer, Tao Yu
Structured knowledge grounding (SKG) leverages structured knowledge to complete user requests, such as semantic parsing over databases and question answering over knowledge bases.
Ranked #1 on Task-Oriented Dialogue Systems on KVRET
2 code implementations • 20 Dec 2021 • Xi Victoria Lin, Todor Mihaylov, Mikel Artetxe, Tianlu Wang, Shuohui Chen, Daniel Simig, Myle Ott, Naman Goyal, Shruti Bhosale, Jingfei Du, Ramakanth Pasunuru, Sam Shleifer, Punit Singh Koura, Vishrav Chaudhary, Brian O'Horo, Jeff Wang, Luke Zettlemoyer, Zornitsa Kozareva, Mona Diab, Veselin Stoyanov, Xian Li
Large-scale generative language models such as GPT-3 are competitive few-shot learners.
no code implementations • 20 Dec 2021 • Mikel Artetxe, Shruti Bhosale, Naman Goyal, Todor Mihaylov, Myle Ott, Sam Shleifer, Xi Victoria Lin, Jingfei Du, Srinivasan Iyer, Ramakanth Pasunuru, Giri Anantharaman, Xian Li, Shuohui Chen, Halil Akin, Mandeep Baines, Louis Martin, Xing Zhou, Punit Singh Koura, Brian O'Horo, Jeff Wang, Luke Zettlemoyer, Mona Diab, Zornitsa Kozareva, Ves Stoyanov
This paper presents a detailed empirical study of how autoregressive MoE language models scale in comparison with dense models in a wide range of settings: in- and out-of-domain language modeling, zero- and few-shot priming, and full-shot fine-tuning.
no code implementations • 7 Dec 2021 • Darsh J Shah, Sinong Wang, Han Fang, Hao Ma, Luke Zettlemoyer
The ubiquity of offensive and hateful content on online fora necessitates automatic solutions that detect such content competently across target groups.
2 code implementations • NAACL 2022 • Belinda Z. Li, Jane Yu, Madian Khabsa, Luke Zettlemoyer, Alon Halevy, Jacob Andreas
When a neural language model (LM) is adapted to perform a new task, what aspects of the task predict the eventual performance of the model?
no code implementations • NeurIPS 2021 • Victor Zhong, Austin Hanjie, Sida Wang, Karthik Narasimhan, Luke Zettlemoyer
We hope SILG enables the community to quickly identify new methodologies for language grounding that generalize to a diverse set of environments and their associated challenges.
no code implementations • Findings (NAACL) 2022 • Eleftheria Briakou, Sida I. Wang, Luke Zettlemoyer, Marjan Ghazvininejad
Mined bitexts can contain imperfect translations that yield unreliable training signals for Neural Machine Translation (NMT).
2 code implementations • NAACL 2022 • Sewon Min, Mike Lewis, Luke Zettlemoyer, Hannaneh Hajishirzi
We introduce MetaICL (Meta-training for In-Context Learning), a new meta-training framework for few-shot learning where a pretrained language model is tuned to do in-context learning on a large set of training tasks.
1 code implementation • 20 Oct 2021 • Victor Zhong, Austin W. Hanjie, Sida I. Wang, Karthik Narasimhan, Luke Zettlemoyer
We hope SILG enables the community to quickly identify new methodologies for language grounding that generalize to a diverse set of environments and their associated challenges.
3 code implementations • ICLR 2022 • Tim Dettmers, Mike Lewis, Sam Shleifer, Luke Zettlemoyer
To maintain stability and performance, we combine block-wise quantization with two additional changes: (1) dynamic quantization, a form of non-linear optimization that is precise for both large and small magnitude values, and (2) a stable embedding layer to reduce gradient variance that comes from the highly non-uniform distribution of input tokens in language models.
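The 8-bit optimizer entry above combines block-wise quantization with a dynamic (non-linear) code and a stable embedding layer. The sketch below covers only the block-wise absmax part, the piece that isolates outliers to their own block; the dynamic codebook and embedding changes are omitted, and `block_size` is an illustrative default.

```python
import numpy as np

def blockwise_quantize(x, block_size=256):
    """Block-wise absmax quantization (sketch): each block of optimizer state is
    normalized by its own maximum magnitude and stored in int8, so a single large
    value only affects precision inside its block.
    """
    x = np.asarray(x, dtype=np.float32)
    pad = (-len(x)) % block_size
    blocks = np.pad(x, (0, pad)).reshape(-1, block_size)
    scales = np.abs(blocks).max(axis=1, keepdims=True) + 1e-12
    q = np.round(blocks / scales * 127).astype(np.int8)
    return q, scales, len(x)

def blockwise_dequantize(q, scales, n):
    return (q.astype(np.float32) / 127 * scales).reshape(-1)[:n]

# Toy round trip: one outlier no longer degrades every other value's precision.
rng = np.random.default_rng(0)
state = np.concatenate([rng.normal(size=1000) * 0.01, [50.0]])
q, s, n = blockwise_quantize(state)
print(np.abs(blockwise_dequantize(q, s, n) - state).max())
```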
2 code implementations • EMNLP 2021 • Hu Xu, Gargi Ghosh, Po-Yao Huang, Dmytro Okhonko, Armen Aghajanyan, Florian Metze, Luke Zettlemoyer, Christoph Feichtenhofer
We present VideoCLIP, a contrastive approach to pre-train a unified model for zero-shot video and text understanding, without using any labels on downstream tasks.
Ranked #1 on Temporal Action Localization on CrossTask (using extra training data)
Action Segmentation, Long Video Retrieval (Background Removed), +5
2 code implementations • NAACL 2022 • Suchin Gururangan, Mike Lewis, Ari Holtzman, Noah A. Smith, Luke Zettlemoyer
We introduce a new domain expert mixture (DEMix) layer that enables conditioning a language model (LM) on the domain of the input text.
1 code implementation • ACL 2022 • Sewon Min, Mike Lewis, Hannaneh Hajishirzi, Luke Zettlemoyer
We introduce a noisy channel approach for language model prompting in few-shot text classification.
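The noisy channel prompting entry above scores how likely the input is given each label, rather than the label given the input. A minimal sketch with a hypothetical `lm_log_prob(prefix, continuation)` helper (an assumed interface, not a specific library API), using a sentiment template for illustration and an implicit uniform prior over labels:

```python
# Hedged sketch of channel vs. direct prompting for few-shot classification.
# `lm_log_prob(prefix, continuation)` is an assumed helper that returns the LM's
# log-probability of `continuation` given `prefix`.

LABELS = ["positive", "negative"]

def direct_classify(lm_log_prob, text, labels=LABELS):
    # Direct prompting: score p(label | text).
    return max(labels, key=lambda y: lm_log_prob(f"Review: {text}\nSentiment:", f" {y}"))

def channel_classify(lm_log_prob, text, labels=LABELS):
    # Noisy channel: score p(text | label); with a uniform prior over labels this is
    # proportional to p(label | text) by Bayes' rule, but it tends to behave more
    # stably in the few-shot prompting regime.
    return max(labels, key=lambda y: lm_log_prob(f"Sentiment: {y}\nReview:", f" {text}"))
```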
1 code implementation • ACL 2021 • Weijia Shi, Mandar Joshi, Luke Zettlemoyer
Short textual descriptions of entities provide summaries of their key attributes and have been shown to be useful sources of background knowledge for tasks such as entity linking and question answering.
2 code implementations • 26 Jul 2021 • Jesse Thomason, Mohit Shridhar, Yonatan Bisk, Chris Paxton, Luke Zettlemoyer
We introduce several CLIP-based models for distinguishing objects and demonstrate that while recent advances in jointly modeling vision and language are useful for robotic language understanding, it is still the case that these image-based models are weaker at understanding the 3D nature of objects -- properties which play a key role in manipulation.
no code implementations • ICLR 2022 • Armen Aghajanyan, Dmytro Okhonko, Mike Lewis, Mandar Joshi, Hu Xu, Gargi Ghosh, Luke Zettlemoyer
We introduce HTLM, a hyper-text language model trained on a large-scale web crawl.
Ranked #1 on Table-to-Text Generation on DART
2 code implementations • ACL 2022 • Jungsoo Park, Sewon Min, Jaewoo Kang, Luke Zettlemoyer, Hannaneh Hajishirzi
Claims in FAVIQ are verified to be natural, contain little lexical bias, and require a complete understanding of the evidence for verification.
1 code implementation • Findings (ACL) 2022 • Robin Jia, Mike Lewis, Luke Zettlemoyer
We propose a pre-training objective based on question answering (QA) for learning general-purpose contextual representations, motivated by the intuition that the representation of a phrase in a passage should encode all questions that the phrase can answer in context.
no code implementations • Findings (ACL) 2021 • Bhargavi Paranjape, Julian Michael, Marjan Ghazvininejad, Luke Zettlemoyer, Hannaneh Hajishirzi
Many commonsense reasoning NLP tasks involve choosing between one or more possible answers to a question or prompt based on knowledge that is often implicit.
2 code implementations • NeurIPS 2021 • Xuezhe Ma, Xiang Kong, Sinong Wang, Chunting Zhou, Jonathan May, Hao Ma, Luke Zettlemoyer
Specifically, with the first attention function, Luna packs the input sequence into a sequence of fixed length.
1 code implementation • Findings (ACL) 2021 • Hu Xu, Gargi Ghosh, Po-Yao Huang, Prahal Arora, Masoumeh Aminzadeh, Christoph Feichtenhofer, Florian Metze, Luke Zettlemoyer
We present a simplified, task-agnostic multi-modal pre-training approach that can accept either video or text input, or both for a variety of end tasks.
Ranked #2 on Temporal Action Localization on CrossTask (using extra training data)
2 code implementations • 16 Apr 2021 • Ari Holtzman, Peter West, Vered Shwartz, Yejin Choi, Luke Zettlemoyer
Large language models have shown promising results in zero-shot settings (Brown et al., 2020; Radford et al., 2019).
1 code implementation • 30 Mar 2021 • Mike Lewis, Shruti Bhosale, Tim Dettmers, Naman Goyal, Luke Zettlemoyer
Sparse layers can dramatically improve the efficiency of training and inference by routing each token to specialized expert modules that contain only a small fraction of the model parameters.
1 code implementation • 23 Mar 2021 • Nicola De Cao, Ledell Wu, Kashyap Popat, Mikel Artetxe, Naman Goyal, Mikhail Plekhanov, Luke Zettlemoyer, Nicola Cancedda, Sebastian Riedel, Fabio Petroni
Moreover, in a zero-shot setting on languages with no training data at all, mGENRE treats the target language as a latent variable that is marginalized at prediction time.
Ranked #2 on Entity Disambiguation on Mewsli-9 (using extra training data)
no code implementations • EACL 2021 • Terra Blevins, Mandar Joshi, Luke Zettlemoyer
Current models for Word Sense Disambiguation (WSD) struggle to disambiguate rare senses, despite reaching human performance on global WSD metrics.
2 code implementations • EMNLP 2021 • Armen Aghajanyan, Anchit Gupta, Akshat Shrivastava, Xilun Chen, Luke Zettlemoyer, Sonal Gupta
We propose pre-finetuning, an additional large-scale learning stage between language model pre-training and fine-tuning.
Ranked #3 on Text Summarization on Reddit TIFU (using extra training data)
no code implementations • ICLR 2021 • Asish Ghoshal, Xilun Chen, Sonal Gupta, Luke Zettlemoyer, Yashar Mehdad
Training with soft targets instead of hard targets has been shown to improve performance and calibration of deep neural networks.
no code implementations • ACL 2021 • Haoyue Shi, Luke Zettlemoyer, Sida I. Wang
Bilingual lexicons map words in one language to their translations in another, and are typically induced by learning linear projections to align monolingual word embedding spaces.
2 code implementations • ACL 2021 • Armen Aghajanyan, Luke Zettlemoyer, Sonal Gupta
Although pretrained language models can be fine-tuned to produce state-of-the-art results for a very wide range of language understanding tasks, the dynamics of this process are not well understood, especially in the low data regime.
Ranked #1 on Paraphrase Identification on Quora Question Pairs (Direct Intrinsic Dimension metric)
1 code implementation • COLING 2020 • Ayal Klein, Jonathan Mamou, Valentina Pyatkin, Daniela Stepanov, Hangfeng He, Dan Roth, Luke Zettlemoyer, Ido Dagan
We propose a new semantic scheme for capturing predicate-argument relations for nominalizations, termed QANom.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Christopher Clark, Mark Yatskar, Luke Zettlemoyer
We evaluate performance on synthetic datasets, and four datasets built to penalize models that exploit known biases on textual entailment, visual question answering, and image recognition tasks.
2 code implementations • Findings (ACL) 2021 • Chunting Zhou, Graham Neubig, Jiatao Gu, Mona Diab, Paco Guzman, Luke Zettlemoyer, Marjan Ghazvininejad
Neural sequence models can generate highly fluent sentences, but recent studies have also shown that they are also prone to hallucinate additional content not supported by the input.
1 code implementation • EMNLP 2020 • Xilun Chen, Asish Ghoshal, Yashar Mehdad, Luke Zettlemoyer, Sonal Gupta
Task-oriented semantic parsing is a critical component of virtual assistants, which is responsible for understanding the user's intents (set reminder, play music, etc.).
5 code implementations • ICLR 2021 • Urvashi Khandelwal, Angela Fan, Dan Jurafsky, Luke Zettlemoyer, Mike Lewis
We introduce $k$-nearest-neighbor machine translation ($k$NN-MT), which predicts tokens with a nearest neighbor classifier over a large datastore of cached examples, using representations from a neural translation model for similarity search.
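The $k$NN-MT entry above predicts tokens by searching a datastore of cached decoder states paired with the target tokens that followed them, then interpolating that nearest-neighbor distribution with the base model's. A small numpy sketch with toy data ($k$, the temperature, and the interpolation weight `lmbda` are illustrative defaults, not the paper's tuned values):

```python
import numpy as np

def knn_mt_distribution(query, keys, values, vocab_size, model_probs,
                        k=8, temperature=10.0, lmbda=0.5):
    """kNN-MT sketch: retrieve the k nearest cached decoder states, turn their distances
    into a distribution over the target tokens they were followed by, and interpolate
    with the base translation model's distribution.
    keys:   (N, d) cached decoder hidden states
    values: (N,)   target tokens observed after each cached state
    """
    dists = np.sum((keys - query) ** 2, axis=1)
    nn_idx = np.argsort(dists)[:k]
    weights = np.exp(-dists[nn_idx] / temperature)
    weights /= weights.sum()
    knn_probs = np.zeros(vocab_size)
    np.add.at(knn_probs, values[nn_idx], weights)          # aggregate neighbors by token id
    return lmbda * knn_probs + (1 - lmbda) * model_probs

# Toy usage with a 5-token vocabulary and a random datastore.
rng = np.random.default_rng(0)
keys = rng.normal(size=(100, 16))
values = rng.integers(0, 5, size=100)
query = keys[0] + 0.01 * rng.normal(size=16)               # close to one cached state
model_probs = np.full(5, 0.2)
print(knn_mt_distribution(query, keys, values, 5, model_probs))
```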
1 code implementation • EMNLP 2020 • Victor Zhong, Mike Lewis, Sida I. Wang, Luke Zettlemoyer
We propose Grounded Adaptation for Zero-shot Executable Semantic Parsing (GAZP) to adapt an existing semantic parser to new environments (e.g. new database schemas).
Ranked #6 on Text-To-SQL on SParC
3 code implementations • ICLR 2021 • Armen Aghajanyan, Akshat Shrivastava, Anchit Gupta, Naman Goyal, Luke Zettlemoyer, Sonal Gupta
Although widely adopted, existing approaches for fine-tuning pre-trained language models have been shown to be unstable across hyper-parameter settings, motivating recent work on trust region methods.
Abstractive Text Summarization, Cross-Lingual Natural Language Inference
2 code implementations • ICLR 2021 • Sachin Mehta, Marjan Ghazvininejad, Srinivasan Iyer, Luke Zettlemoyer, Hannaneh Hajishirzi
We introduce a deep and light-weight transformer, DeLighT, that delivers similar or better performance than standard transformer-based models with significantly fewer parameters.
Ranked #1 on Machine Translation on WMT2016 English-French
no code implementations • ACL 2020 • Terra Blevins, Luke Zettlemoyer
A major obstacle in Word Sense Disambiguation (WSD) is that word senses are not uniformly distributed, causing existing models to generally perform poorly on senses that are either rare or unseen during training.
Ranked #10 on Word Sense Disambiguation on Supervised:
no code implementations • ACL 2020 • Nabil Hossain, Marjan Ghazvininejad, Luke Zettlemoyer
Retrieve-and-edit seq2seq methods typically retrieve an output from the training set and learn a model to edit it to produce the final output.
2 code implementations • NeurIPS 2020 • Mike Lewis, Marjan Ghazvininejad, Gargi Ghosh, Armen Aghajanyan, Sida Wang, Luke Zettlemoyer
The objective noisily captures aspects of paraphrase, translation, multi-document summarization, and information retrieval, allowing for strong zero-shot performance on several tasks.
1 code implementation • 6 May 2020 • Terra Blevins, Luke Zettlemoyer
A major obstacle in Word Sense Disambiguation (WSD) is that word senses are not uniformly distributed, causing existing models to generally perform poorly on senses that are either rare or unseen during training.
2 code implementations • EMNLP 2020 • Bhargavi Paranjape, Mandar Joshi, John Thickstun, Hannaneh Hajishirzi, Luke Zettlemoyer
Decisions of complex language understanding models can be rationalized by limiting their inputs to a relevant subsequence of the original text.
1 code implementation • ACL 2020 • Belinda Z. Li, Gabriel Stanovsky, Luke Zettlemoyer
We improve upon pairwise annotation for active learning in coreference resolution, by asking annotators to identify mention antecedents if a presented mention pair is deemed not coreferent.
2 code implementations • EMNLP 2020 • Sewon Min, Julian Michael, Hannaneh Hajishirzi, Luke Zettlemoyer
Ambiguity is inherent to open-domain question answering; especially when exploring new topics, it can be difficult to ask questions that have a single, unambiguous answer.
1 code implementation • ICML 2020 • Marjan Ghazvininejad, Vladimir Karpukhin, Luke Zettlemoyer, Omer Levy
This difficulty is compounded during training with cross-entropy loss, which can heavily penalize small shifts in word order.
no code implementations • 23 Jan 2020 • Marjan Ghazvininejad, Omer Levy, Luke Zettlemoyer
The recently proposed mask-predict decoding algorithm has narrowed the performance gap between semi-autoregressive machine translation models and the traditional left-to-right approach.
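For readers unfamiliar with mask-predict decoding, here is a rough sketch of the general iterative idea (predict all target positions in parallel, then repeatedly re-mask and re-predict the least confident tokens). The model interface, masking schedule, and handling of length prediction are simplified placeholders, not the paper's exact algorithm:

```python
import numpy as np

MASK_ID = 0  # assumed mask token id

def mask_predict(model, src_tokens, tgt_len, iterations=10):
    """model(src, tgt) is assumed to return a (tgt_len, vocab) array of per-position probabilities."""
    tokens = np.full(tgt_len, MASK_ID, dtype=np.int64)  # start with every position masked
    scores = np.zeros(tgt_len)
    for it in range(1, iterations + 1):
        probs = model(src_tokens, tokens)
        masked = tokens == MASK_ID
        tokens[masked] = probs.argmax(axis=-1)[masked]   # fill masked positions in parallel
        scores[masked] = probs.max(axis=-1)[masked]
        n_mask = int(tgt_len * (1 - it / iterations))    # linearly decaying number of re-masked tokens
        if n_mask == 0:
            break
        remask = np.argsort(scores)[:n_mask]             # re-mask the least confident positions
        tokens[remask] = MASK_ID
    return tokens
```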
8 code implementations • 22 Jan 2020 • Yinhan Liu, Jiatao Gu, Naman Goyal, Xian Li, Sergey Edunov, Marjan Ghazvininejad, Mike Lewis, Luke Zettlemoyer
This paper demonstrates that multilingual denoising pre-training produces significant performance gains across a wide variety of machine translation (MT) tasks.
11 code implementations • CVPR 2020 • Mohit Shridhar, Jesse Thomason, Daniel Gordon, Yonatan Bisk, Winson Han, Roozbeh Mottaghi, Luke Zettlemoyer, Dieter Fox
We present ALFRED (Action Learning From Realistic Environments and Directives), a benchmark for learning a mapping from natural language instructions and egocentric vision to sequences of actions for household tasks.
7 code implementations • 10 Nov 2019 • Sewon Min, Danqi Chen, Luke Zettlemoyer, Hannaneh Hajishirzi
We introduce an approach for open-domain question answering (QA) that retrieves and reads a passage graph, where vertices are passages of text and edges represent relationships that are derived from an external knowledge base or co-occurrence in the same article.
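A small, hypothetical sketch of the passage-graph structure described above; the helper name and the link construction are illustrative only (links might come from knowledge-base relations or from passages co-occurring in the same article):

```python
from collections import defaultdict

def build_passage_graph(passages, links):
    """passages: dict passage_id -> text; links: iterable of (id_a, id_b) pairs.
    Returns an undirected adjacency-list graph over passages."""
    graph = defaultdict(set)
    for a, b in links:
        if a in passages and b in passages:
            graph[a].add(b)
            graph[b].add(a)
    return graph

# Example: two passages linked because they co-occur in the same article.
passages = {"p1": "intro paragraph ...", "p2": "section paragraph ..."}
graph = build_passage_graph(passages, [("p1", "p2")])
assert "p2" in graph["p1"]
```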
3 code implementations • EMNLP 2020 • Ledell Wu, Fabio Petroni, Martin Josifoski, Sebastian Riedel, Luke Zettlemoyer
This paper introduces a conceptually simple, scalable, and highly effective BERT-based entity linking model, along with an extensive evaluation of its accuracy-speed trade-off.
no code implementations • 9 Nov 2019 • Siddharth Dalmia, Abdel-rahman Mohamed, Mike Lewis, Florian Metze, Luke Zettlemoyer
Inspired by modular software design principles of independence, interchangeability, and clarity of interface, we introduce a method for enforcing encoder-decoder modularity in seq2seq models without sacrificing the overall model quality or its full differentiability.
1 code implementation • ACL 2020 • Paul Roit, Ayal Klein, Daniela Stepanov, Jonathan Mamou, Julian Michael, Gabriel Stanovsky, Luke Zettlemoyer, Ido Dagan
Question-answer driven Semantic Role Labeling (QA-SRL) was proposed as an attractive open and natural flavour of SRL, potentially attainable from laymen.
34 code implementations • ACL 2020 • Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer, Veselin Stoyanov
We also present a detailed empirical analysis of the key factors that are required to achieve these gains, including the trade-offs between (1) positive transfer and capacity dilution and (2) the performance of high and low resource languages at scale.
no code implementations • ACL 2020 • Shijie Wu, Alexis Conneau, Haoran Li, Luke Zettlemoyer, Veselin Stoyanov
We study the problem of multilingual masked language modeling, i.e., the training of a single model on concatenated text from multiple languages, and present a detailed study of several factors that influence why these models are so effective for cross-lingual transfer.
no code implementations • IJCNLP 2019 • Panupong Pasupat, Sonal Gupta, Karishma Mandyam, Rushin Shah, Mike Lewis, Luke Zettlemoyer
We propose a semantic parser for parsing compositional utterances into Task Oriented Parse (TOP), a tree representation that has intents and slots as labels of nesting tree nodes.
5 code implementations • ICLR 2020 • Urvashi Khandelwal, Omer Levy, Dan Jurafsky, Luke Zettlemoyer, Mike Lewis
Applying this augmentation to a strong Wikitext-103 LM, with neighbors drawn from the original training set, our $k$NN-LM achieves a new state-of-the-art perplexity of 15.79 - a 2.9 point improvement with no additional training.
Ranked #10 on Language Modelling on WikiText-103
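The interpolation at the heart of the $k$NN-LM entry above is commonly written as follows, where $f(x)$ is the LM's context representation, $\mathcal{N}(f(x))$ the retrieved neighbors with cached keys $k_i$ and target tokens $v_i$, $d$ a distance function, and $\lambda$ a tunable mixing weight:

$$
p(y \mid x) = \lambda\, p_{\mathrm{kNN}}(y \mid x) + (1-\lambda)\, p_{\mathrm{LM}}(y \mid x),
\qquad
p_{\mathrm{kNN}}(y \mid x) \propto \sum_{(k_i, v_i) \in \mathcal{N}(f(x))} \mathbb{1}[y = v_i]\, \exp\big(-d(k_i, f(x))\big).
$$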
47 code implementations • ACL 2020 • Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdel-rahman Mohamed, Omer Levy, Ves Stoyanov, Luke Zettlemoyer
We evaluate a number of noising approaches, finding the best performance by both randomly shuffling the order of the original sentences and using a novel in-filling scheme, where spans of text are replaced with a single mask token.
Ranked #3 on Open-Domain Question Answering on ELI5
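A minimal, hypothetical sketch of the two noising operations highlighted in the entry above (sentence-order shuffling and span infilling with a single mask token). The span-length draw here is a crude exponential stand-in for the Poisson distribution typically used, and overlapping spans are not handled carefully; it illustrates the idea, not the released implementation:

```python
import random

MASK = "<mask>"

def shuffle_sentences(sentences, rng=random):
    out = list(sentences)
    rng.shuffle(out)
    return out

def infill_spans(tokens, mask_ratio=0.3, mean_span_len=3, rng=random):
    """Replace randomly sampled spans with a single <mask> token each."""
    tokens = list(tokens)
    target = int(len(tokens) * mask_ratio)
    masked = 0
    while masked < target:
        span_len = max(1, int(rng.expovariate(1.0 / mean_span_len)))
        start = rng.randrange(len(tokens))
        end = min(start + span_len, len(tokens))
        masked += end - start
        tokens[start:end] = [MASK]   # the whole span collapses to one mask token
    return tokens
```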
2 code implementations • IJCNLP 2019 • Rajas Agashe, Srinivasan Iyer, Luke Zettlemoyer
Interactive programming with interleaved code snippet cells and natural language markdown has recently been gaining popularity in the form of Jupyter notebooks, which accelerate prototyping and collaboration.
1 code implementation • IJCNLP 2019 • Sewon Min, Danqi Chen, Hannaneh Hajishirzi, Luke Zettlemoyer
Many question answering (QA) tasks only provide weak supervision for how the answer should be computed.
Ranked #2 on Question Answering on NarrativeQA
3 code implementations • IJCNLP 2019 • Christopher Clark, Mark Yatskar, Luke Zettlemoyer
Our method has two stages: we (1) train a naive model that makes predictions exclusively based on dataset biases, and (2) train a robust model as part of an ensemble with the naive one in order to encourage it to focus on other patterns in the data that are more likely to generalize.
Ranked #5 on Visual Question Answering (VQA) on VQA-CP
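A hedged sketch of a product-of-experts style ensemble in the spirit of the two-stage recipe described above: the bias-only model's predictions are combined with the main model's during training, so the main model gains little from patterns the bias model already captures, and at test time the main model is used alone. Shapes and the exact combination rule are illustrative, not necessarily the paper's final variant:

```python
import numpy as np

def log_softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def ensemble_nll(main_logits, bias_logits, labels):
    """Training loss computed through the combined (product-of-experts) distribution."""
    combined = log_softmax(main_logits) + log_softmax(bias_logits)  # sum of log-probs = product of experts
    combined = log_softmax(combined)                                # renormalize
    return -combined[np.arange(len(labels)), labels].mean()

# At inference, predictions come from main_logits alone, without the bias model.
```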
2 code implementations • IJCNLP 2019 • Mandar Joshi, Omer Levy, Daniel S. Weld, Luke Zettlemoyer
We apply BERT to coreference resolution, achieving strong improvements on the OntoNotes (+3.9 F1) and GAP (+11.5 F1) benchmarks.
Ranked #11 on Coreference Resolution on CoNLL 2012 (using extra training data)
67 code implementations • 26 Jul 2019 • Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov
Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging.
Ranked #1 on Only Connect Walls Dataset Task 1 (Grouping) on OCW (Wasserstein Distance (WD) metric, using extra training data)
6 code implementations • TACL 2020 • Mandar Joshi, Danqi Chen, Yinhan Liu, Daniel S. Weld, Luke Zettlemoyer, Omer Levy
We present SpanBERT, a pre-training method that is designed to better represent and predict spans of text.
Ranked #1 on Question Answering on TriviaQA (F1 metric)
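An illustrative sketch of contiguous span masking, the core idea behind the span-level pre-training mentioned above. The actual method samples span lengths from a geometric distribution and adds a span-boundary objective; this simplified version uses uniform lengths and only records what would need to be reconstructed:

```python
import random

MASK = "[MASK]"

def mask_spans(tokens, mask_ratio=0.15, max_span_len=10, rng=random):
    """Mask contiguous spans (rather than individual tokens) up to a masking budget."""
    tokens = list(tokens)
    budget = int(len(tokens) * mask_ratio)
    spans = []
    while budget > 0 and tokens:
        span_len = min(rng.randint(1, max_span_len), budget, len(tokens))
        start = rng.randrange(0, len(tokens) - span_len + 1)
        spans.append((start, start + span_len, tokens[start:start + span_len]))
        for i in range(start, start + span_len):
            tokens[i] = MASK
        budget -= span_len
    return tokens, spans  # `spans` records the targets the model must reconstruct
```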
2 code implementations • 10 Jul 2019 • Jesse Thomason, Michael Murray, Maya Cakmak, Luke Zettlemoyer
To train agents that search an environment for a goal location, we define the Navigation from Dialog History task.
2 code implementations • ICLR 2020 • Tim Dettmers, Luke Zettlemoyer
We demonstrate the possibility of what we call sparse learning: accelerated training of deep neural networks that maintain sparse weights throughout training while achieving dense performance levels.
Ranked #49 on Image Classification on MNIST
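A generic prune-and-regrow loop for keeping a fixed level of weight sparsity throughout training. The paper's actual method (sparse momentum) uses momentum statistics to decide where to regrow connections; this simplified sketch regrows at random and is only meant to illustrate the overall mechanism:

```python
import numpy as np

def prune_and_regrow(weights: np.ndarray, prune_frac: float, rng: np.random.Generator):
    """weights is assumed to be a contiguous array in which zeros mark pruned connections."""
    flat = weights.ravel()                     # view into the same memory
    alive = np.flatnonzero(flat)
    n = int(len(alive) * prune_frac)
    if n == 0:
        return weights
    # Prune the smallest-magnitude surviving weights ...
    pruned = alive[np.argsort(np.abs(flat[alive]))[:n]]
    flat[pruned] = 0.0
    # ... and regrow the same number of connections elsewhere (here: chosen at random).
    candidates = np.setdiff1d(np.flatnonzero(flat == 0.0), pruned)
    regrow = rng.choice(candidates, size=min(n, len(candidates)), replace=False)
    flat[regrow] = 1e-3 * rng.standard_normal(len(regrow))
    return weights
```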
1 code implementation • ACL 2019 • Victor Zhong, Luke Zettlemoyer
Conversational machine reading systems help users answer high-level questions (e.g., determine if they qualify for particular government benefits) when they do not know the exact rules by which the determination is made (e.g., whether they need certain income levels or veteran status).
1 code implementation • ACL 2019 • Sewon Min, Eric Wallace, Sameer Singh, Matt Gardner, Hannaneh Hajishirzi, Luke Zettlemoyer
Multi-hop reading comprehension (RC) questions are challenging because they require reading and reasoning over multiple paragraphs.
2 code implementations • ACL 2019 • Sewon Min, Victor Zhong, Luke Zettlemoyer, Hannaneh Hajishirzi
Multi-hop Reading Comprehension (RC) requires reasoning and aggregation across several paragraphs.
Ranked #66 on Question Answering on HotpotQA
no code implementations • ACL 2019 • Terra Blevins, Luke Zettlemoyer
We incorporate morphological supervision into character language models (CLMs) via multitasking and show that this addition improves bits-per-character (BPC) performance across 24 languages, even when the morphology data and language modeling data are disjoint.
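A minimal sketch of the multitask setup described above: a character-level language-modeling loss and a morphological tagging loss are simply summed. The interfaces and the weighting term are placeholder assumptions:

```python
def multitask_loss(lm_log_probs, next_chars, morph_log_probs, morph_tags, alpha=1.0):
    """lm_log_probs[t][c]: log-probability of character c at step t.
    morph_log_probs[t][m]: log-probability of morphological tag m at step t.
    Returns the joint training objective (language modeling + alpha * morphology)."""
    lm_loss = -sum(lm_log_probs[t][c] for t, c in enumerate(next_chars)) / len(next_chars)
    morph_loss = -sum(morph_log_probs[t][m] for t, m in enumerate(morph_tags)) / len(morph_tags)
    return lm_loss + alpha * morph_loss
```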
2 code implementations • ACL 2019 • Gabriel Stanovsky, Noah A. Smith, Luke Zettlemoyer
We present the first challenge set and evaluation protocol for the analysis of gender bias in machine translation (MT).
no code implementations • NAACL 2019 • Pradeep Dasigi, Matt Gardner, Shikhar Murty, Luke Zettlemoyer, Eduard Hovy
Training semantic parsers from question-answer pairs typically involves searching over an exponentially large space of logical forms, and an unguided search can easily be misled by spurious logical forms that coincidentally evaluate to the correct answer.
4 code implementations • 26 Apr 2019 • Abdelrahman Mohamed, Dmytro Okhonko, Luke Zettlemoyer
The recent success of transformer networks for neural machine translation and other NLP tasks has led to a surge of research applying them to speech recognition.
2 code implementations • IJCNLP 2019 • Marjan Ghazvininejad, Omer Levy, Yinhan Liu, Luke Zettlemoyer
Most machine translation systems generate text autoregressively from left to right.
no code implementations • IJCNLP 2019 • Srinivasan Iyer, Alvin Cheung, Luke Zettlemoyer
Programmers typically organize executable source code using high-level coding patterns or idiomatic structures such as nested loops, exception handlers and recursive blocks, rather than as individual code tokens.
no code implementations • IJCNLP 2019 • Alexei Baevski, Sergey Edunov, Yinhan Liu, Luke Zettlemoyer, Michael Auli
We present a new approach for pretraining a bi-directional transformer model that provides significant performance gains across a variety of language understanding problems.
Ranked #12 on Constituency Parsing on Penn Treebank
no code implementations • 15 Feb 2019 • Arash Einolghozati, Panupong Pasupat, Sonal Gupta, Rushin Shah, Mrinal Mohit, Mike Lewis, Luke Zettlemoyer
Semantic parsing using hierarchical representations has recently been proposed for task-oriented dialog with promising results [Gupta et al., 2018].
1 code implementation • ACL 2019 • Fei Liu, Luke Zettlemoyer, Jacob Eisenstein
We present a new architecture for storing and accessing entity mentions during online text processing.
3 code implementations • NAACL 2019 • Mandar Joshi, Eunsol Choi, Omer Levy, Daniel S. Weld, Luke Zettlemoyer
Reasoning about implied relationships (e.g., paraphrastic, common sense, encyclopedic) between pairs of words is crucial for many cross-sentence inference problems.