Search Results for author: Omer Levy

Found 81 papers, 48 papers with code

RoBERTa: A Robustly Optimized BERT Pretraining Approach

58 code implementations 26 Jul 2019 Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov

Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging.

 Ranked #1 on Only Connect Walls Dataset Task 1 (Grouping) on OCW (Wasserstein Distance (WD) metric, using extra training data)

Document Image Classification Language Modelling +13

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

42 code implementations ACL 2020 Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdel-rahman Mohamed, Omer Levy, Ves Stoyanov, Luke Zettlemoyer

We evaluate a number of noising approaches, finding the best performance by both randomly shuffling the order of the original sentences and using a novel in-filling scheme, where spans of text are replaced with a single mask token.

Abstractive Text Summarization Denoising +5
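
The in-filling corruption described above is simple to sketch. A minimal illustration in Python: the Poisson-distributed span lengths follow the paper's description, while the plain-string mask symbol, the 30% masking budget, and the whitespace-level tokens are simplifications for illustration.

import numpy as np

def text_infilling(tokens, mask="<mask>", mask_ratio=0.3, poisson_lam=3.0, rng=None):
    # Replace sampled spans of tokens with a single mask symbol each.
    # Span lengths ~ Poisson(lam); a zero-length span inserts a mask without
    # removing any tokens, mirroring the in-filling scheme described above.
    rng = rng or np.random.default_rng()
    tokens = list(tokens)
    budget = int(round(len(tokens) * mask_ratio))
    while budget > 0 and len(tokens) > 1:
        span = min(int(rng.poisson(poisson_lam)), budget)
        start = int(rng.integers(0, len(tokens) - span + 1))
        tokens[start:start + span] = [mask]
        budget -= max(span, 1)
    return tokens

def shuffle_sentences(sentences, rng=None):
    # The second corruption: randomly permute the original sentence order.
    rng = rng or np.random.default_rng()
    return [sentences[i] for i in rng.permutation(len(sentences))]

The denoising objective then trains the sequence-to-sequence model to reconstruct the original document from the corrupted version.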

Generalization through Memorization: Nearest Neighbor Language Models

5 code implementations ICLR 2020 Urvashi Khandelwal, Omer Levy, Dan Jurafsky, Luke Zettlemoyer, Mike Lewis

Applying this augmentation to a strong WikiText-103 LM, with neighbors drawn from the original training set, our $k$NN-LM achieves a new state-of-the-art perplexity of 15.79, a 2.9 point improvement with no additional training.

Domain Adaptation Language Modelling +1
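
A minimal sketch of the interpolation behind the result above, assuming a datastore of (context vector, next token) pairs recorded while running the trained LM over its training set. The L2 distance, softmax over negative distances, and interpolation weight follow the paper's description; the brute-force search and the default values are illustrative (the released code uses approximate search over a much larger datastore).

import numpy as np

def knn_lm_probs(context_vec, p_lm, keys, values, vocab_size, k=1024, lam=0.25):
    # keys:   (N, d) context vectors from the datastore
    # values: (N,)   the token that followed each stored context
    # p_lm:   (vocab_size,) next-token distribution from the base LM
    dists = np.linalg.norm(keys - context_vec, axis=1)   # L2 distance to every key
    nn = np.argsort(dists)[:k]                           # k nearest neighbors
    weights = np.exp(-dists[nn])
    weights /= weights.sum()                             # softmax over negative distances
    p_knn = np.zeros(vocab_size)
    np.add.at(p_knn, values[nn], weights)                # aggregate neighbors by target token
    return lam * p_knn + (1.0 - lam) * p_lm              # interpolate the two distributions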

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

3 code implementations9 Jun 2022 Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, Ambrose Slone, Ameet Rahane, Anantharaman S. Iyer, Anders Andreassen, Andrea Madotto, Andrea Santilli, Andreas Stuhlmüller, Andrew Dai, Andrew La, Andrew Lampinen, Andy Zou, Angela Jiang, Angelica Chen, Anh Vuong, Animesh Gupta, Anna Gottardi, Antonio Norelli, Anu Venkatesh, Arash Gholamidavoodi, Arfa Tabassum, Arul Menezes, Arun Kirubarajan, Asher Mullokandov, Ashish Sabharwal, Austin Herrick, Avia Efrat, Aykut Erdem, Ayla Karakaş, B. Ryan Roberts, Bao Sheng Loe, Barret Zoph, Bartłomiej Bojanowski, Batuhan Özyurt, Behnam Hedayatnia, Behnam Neyshabur, Benjamin Inden, Benno Stein, Berk Ekmekci, Bill Yuchen Lin, Blake Howald, Bryan Orinion, Cameron Diao, Cameron Dour, Catherine Stinson, Cedrick Argueta, César Ferri Ramírez, Chandan Singh, Charles Rathkopf, Chenlin Meng, Chitta Baral, Chiyu Wu, Chris Callison-Burch, Chris Waites, Christian Voigt, Christopher D. Manning, Christopher Potts, Cindy Ramirez, Clara E. Rivera, Clemencia Siro, Colin Raffel, Courtney Ashcraft, Cristina Garbacea, Damien Sileo, Dan Garrette, Dan Hendrycks, Dan Kilman, Dan Roth, Daniel Freeman, Daniel Khashabi, Daniel Levy, Daniel Moseguí González, Danielle Perszyk, Danny Hernandez, Danqi Chen, Daphne Ippolito, Dar Gilboa, David Dohan, David Drakard, David Jurgens, Debajyoti Datta, Deep Ganguli, Denis Emelin, Denis Kleyko, Deniz Yuret, Derek Chen, Derek Tam, Dieuwke Hupkes, Diganta Misra, Dilyar Buzan, Dimitri Coelho Mollo, Diyi Yang, Dong-Ho Lee, Dylan Schrader, Ekaterina Shutova, Ekin Dogus Cubuk, Elad Segal, Eleanor Hagerman, Elizabeth Barnes, Elizabeth Donoway, Ellie Pavlick, Emanuele Rodola, Emma Lam, Eric Chu, Eric Tang, Erkut Erdem, Ernie Chang, Ethan A. Chi, Ethan Dyer, Ethan Jerzak, Ethan Kim, Eunice Engefu Manyasi, Evgenii Zheltonozhskii, Fanyue Xia, Fatemeh Siar, Fernando Martínez-Plumed, Francesca Happé, Francois Chollet, Frieda Rong, Gaurav Mishra, Genta Indra Winata, Gerard de Melo, Germán Kruszewski, Giambattista Parascandolo, Giorgio Mariani, Gloria Wang, Gonzalo Jaimovitch-López, Gregor Betz, Guy Gur-Ari, Hana Galijasevic, Hannah Kim, Hannah Rashkin, Hannaneh Hajishirzi, Harsh Mehta, Hayden Bogar, Henry Shevlin, Hinrich Schütze, Hiromu Yakura, Hongming Zhang, Hugh Mee Wong, Ian Ng, Isaac Noble, Jaap Jumelet, Jack Geissinger, Jackson Kernion, Jacob Hilton, Jaehoon Lee, Jaime Fernández Fisac, James B. Simon, James Koppel, James Zheng, James Zou, Jan Kocoń, Jana Thompson, Janelle Wingfield, Jared Kaplan, Jarema Radom, Jascha Sohl-Dickstein, Jason Phang, Jason Wei, Jason Yosinski, Jekaterina Novikova, Jelle Bosscher, Jennifer Marsh, Jeremy Kim, Jeroen Taal, Jesse Engel, Jesujoba Alabi, Jiacheng Xu, Jiaming Song, Jillian Tang, Joan Waweru, John Burden, John Miller, John U. Balis, Jonathan Batchelder, Jonathan Berant, Jörg Frohberg, Jos Rozen, Jose Hernandez-Orallo, Joseph Boudeman, Joseph Guerr, Joseph Jones, Joshua B. Tenenbaum, Joshua S. Rule, Joyce Chua, Kamil Kanclerz, Karen Livescu, Karl Krauth, Karthik Gopalakrishnan, Katerina Ignatyeva, Katja Markert, Kaustubh D. 
Dhole, Kevin Gimpel, Kevin Omondi, Kory Mathewson, Kristen Chiafullo, Ksenia Shkaruta, Kumar Shridhar, Kyle McDonell, Kyle Richardson, Laria Reynolds, Leo Gao, Li Zhang, Liam Dugan, Lianhui Qin, Lidia Contreras-Ochando, Louis-Philippe Morency, Luca Moschella, Lucas Lam, Lucy Noble, Ludwig Schmidt, Luheng He, Luis Oliveros Colón, Luke Metz, Lütfi Kerem Şenel, Maarten Bosma, Maarten Sap, Maartje ter Hoeve, Maheen Farooqi, Manaal Faruqui, Mantas Mazeika, Marco Baturan, Marco Marelli, Marco Maru, Maria Jose Ramírez Quintana, Marie Tolkiehn, Mario Giulianelli, Martha Lewis, Martin Potthast, Matthew L. Leavitt, Matthias Hagen, Mátyás Schubert, Medina Orduna Baitemirova, Melody Arnaud, Melvin McElrath, Michael A. Yee, Michael Cohen, Michael Gu, Michael Ivanitskiy, Michael Starritt, Michael Strube, Michał Swędrowski, Michele Bevilacqua, Michihiro Yasunaga, Mihir Kale, Mike Cain, Mimee Xu, Mirac Suzgun, Mitch Walker, Mo Tiwari, Mohit Bansal, Moin Aminnaseri, Mor Geva, Mozhdeh Gheini, Mukund Varma T, Nanyun Peng, Nathan A. Chi, Nayeon Lee, Neta Gur-Ari Krakover, Nicholas Cameron, Nicholas Roberts, Nick Doiron, Nicole Martinez, Nikita Nangia, Niklas Deckers, Niklas Muennighoff, Nitish Shirish Keskar, Niveditha S. Iyer, Noah Constant, Noah Fiedel, Nuan Wen, Oliver Zhang, Omar Agha, Omar Elbaghdadi, Omer Levy, Owain Evans, Pablo Antonio Moreno Casares, Parth Doshi, Pascale Fung, Paul Pu Liang, Paul Vicol, Pegah Alipoormolabashi, Peiyuan Liao, Percy Liang, Peter Chang, Peter Eckersley, Phu Mon Htut, Pinyu Hwang, Piotr Miłkowski, Piyush Patil, Pouya Pezeshkpour, Priti Oli, Qiaozhu Mei, Qing Lyu, Qinlang Chen, Rabin Banjade, Rachel Etta Rudolph, Raefer Gabriel, Rahel Habacker, Ramon Risco, Raphaël Millière, Rhythm Garg, Richard Barnes, Rif A. Saurous, Riku Arakawa, Robbe Raymaekers, Robert Frank, Rohan Sikand, Roman Novak, Roman Sitelew, Ronan LeBras, Rosanne Liu, Rowan Jacobs, Rui Zhang, Ruslan Salakhutdinov, Ryan Chi, Ryan Lee, Ryan Stovall, Ryan Teehan, Rylan Yang, Sahib Singh, Saif M. Mohammad, Sajant Anand, Sam Dillavou, Sam Shleifer, Sam Wiseman, Samuel Gruetter, Samuel R. Bowman, Samuel S. Schoenholz, Sanghyun Han, Sanjeev Kwatra, Sarah A. Rous, Sarik Ghazarian, Sayan Ghosh, Sean Casey, Sebastian Bischoff, Sebastian Gehrmann, Sebastian Schuster, Sepideh Sadeghi, Shadi Hamdan, Sharon Zhou, Shashank Srivastava, Sherry Shi, Shikhar Singh, Shima Asaadi, Shixiang Shane Gu, Shubh Pachchigar, Shubham Toshniwal, Shyam Upadhyay, Shyamolima, Debnath, Siamak Shakeri, Simon Thormeyer, Simone Melzi, Siva Reddy, Sneha Priscilla Makini, Soo-Hwan Lee, Spencer Torene, Sriharsha Hatwar, Stanislas Dehaene, Stefan Divic, Stefano Ermon, Stella Biderman, Stephanie Lin, Stephen Prasad, Steven T. Piantadosi, Stuart M. 
Shieber, Summer Misherghi, Svetlana Kiritchenko, Swaroop Mishra, Tal Linzen, Tal Schuster, Tao Li, Tao Yu, Tariq Ali, Tatsu Hashimoto, Te-Lin Wu, Théo Desbordes, Theodore Rothschild, Thomas Phan, Tianle Wang, Tiberius Nkinyili, Timo Schick, Timofei Kornev, Titus Tunduny, Tobias Gerstenberg, Trenton Chang, Trishala Neeraj, Tushar Khot, Tyler Shultz, Uri Shaham, Vedant Misra, Vera Demberg, Victoria Nyamai, Vikas Raunak, Vinay Ramasesh, Vinay Uday Prabhu, Vishakh Padmakumar, Vivek Srikumar, William Fedus, William Saunders, William Zhang, Wout Vossen, Xiang Ren, Xiaoyu Tong, Xinran Zhao, Xinyi Wu, Xudong Shen, Yadollah Yaghoobzadeh, Yair Lakretz, Yangqiu Song, Yasaman Bahri, Yejin Choi, Yichi Yang, Yiding Hao, Yifu Chen, Yonatan Belinkov, Yu Hou, Yufang Hou, Yuntao Bai, Zachary Seid, Zhuoye Zhao, Zijian Wang, Zijie J. Wang, ZiRui Wang, Ziyi Wu

BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models.

Common Sense Reasoning Math +1

LIMA: Less Is More for Alignment

5 code implementations NeurIPS 2023 Chunting Zhou, PengFei Liu, Puxin Xu, Srini Iyer, Jiao Sun, Yuning Mao, Xuezhe Ma, Avia Efrat, Ping Yu, Lili Yu, Susan Zhang, Gargi Ghosh, Mike Lewis, Luke Zettlemoyer, Omer Levy

Large language models are trained in two stages: (1) unsupervised pretraining from raw text, to learn general-purpose representations, and (2) large scale instruction tuning and reinforcement learning, to better align to end tasks and user preferences.

Language Modelling reinforcement-learning

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding

11 code implementations WS 2018 Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman

For natural language understanding (NLU) technology to be maximally useful, both practically and as a scientific object of study, it must be general: it must be able to process language in a way that is not exclusively tailored to any one specific task or dataset.

Natural Language Inference Natural Language Understanding +2

SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems

5 code implementations NeurIPS 2019 Alex Wang, Yada Pruksachatkun, Nikita Nangia, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman

In the last year, new models and methods for pretraining and transfer learning have driven striking performance improvements across a range of language understanding tasks.

Transfer Learning

code2vec: Learning Distributed Representations of Code

9 code implementations 26 Mar 2018 Uri Alon, Meital Zilberstein, Omer Levy, Eran Yahav

We demonstrate the effectiveness of our approach by using it to predict a method's name from the vector representation of its body.

code2seq: Generating Sequences from Structured Representations of Code

6 code implementations ICLR 2019 Uri Alon, Shaked Brody, Omer Levy, Eran Yahav

The ability to generate natural language sequences from source code snippets has a variety of applications such as code summarization, documentation, and retrieval.

Code Summarization NMT +3

What Does BERT Look At? An Analysis of BERT's Attention

2 code implementations WS 2019 Kevin Clark, Urvashi Khandelwal, Omer Levy, Christopher D. Manning

Large pre-trained neural networks such as BERT have had great recent success in NLP, motivating a growing body of research investigating what aspects of language they are able to learn from unlabeled data.

Language Modelling Sentence

BERT for Coreference Resolution: Baselines and Analysis

2 code implementations IJCNLP 2019 Mandar Joshi, Omer Levy, Daniel S. Weld, Luke Zettlemoyer

We apply BERT to coreference resolution, achieving strong improvements on the OntoNotes (+3.9 F1) and GAP (+11.5 F1) benchmarks.

Ranked #10 on Coreference Resolution on CoNLL 2012 (using extra training data)

Pick-a-Pic: An Open Dataset of User Preferences for Text-to-Image Generation

1 code implementation NeurIPS 2023 Yuval Kirstain, Adam Polyak, Uriel Singer, Shahbuland Matiana, Joe Penna, Omer Levy

Using this web app we build Pick-a-Pic, a large, open dataset of text-to-image prompts and real users' preferences over generated images.

Text-to-Image Generation

How to Train BERT with an Academic Budget

4 code implementations EMNLP 2021 Peter Izsak, Moshe Berchansky, Omer Levy

While large language models a la BERT are used ubiquitously in NLP, pretraining them is considered a luxury that only a few well-funded industry labs can afford.

Language Modelling Linguistic Acceptability +4

A General Path-Based Representation for Predicting Program Properties

3 code implementations 26 Mar 2018 Uri Alon, Meital Zilberstein, Omer Levy, Eran Yahav

A major challenge when learning from programs is $\textit{how to represent programs in a way that facilitates effective learning}$.

Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor

3 code implementations 19 Dec 2022 Or Honovich, Thomas Scialom, Omer Levy, Timo Schick

We collect 64,000 examples by prompting a language model with three seed examples of instructions and eliciting a fourth.

Language Modelling
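
The collection step amounts to few-shot prompting: show the model three seed examples and let it complete a fourth. A rough sketch of how such a prompt could be assembled; the field names, wording, and the `complete` wrapper are hypothetical placeholders, not the paper's exact prompt.

def build_elicitation_prompt(seed_examples):
    # seed_examples: three dicts with "instruction", "input", and "constraints"
    # fields (illustrative field names).
    parts = []
    for i, ex in enumerate(seed_examples, start=1):
        parts.append(
            f"Example {i}\n"
            f"Instruction: {ex['instruction']}\n"
            f"Input: {ex['input']}\n"
            f"Constraints: {ex['constraints']}\n"
        )
    parts.append("Example 4\nInstruction:")  # the model is left to invent a fourth example
    return "\n".join(parts)

# Hypothetical usage with any text-completion API:
#   new_example = complete(build_elicitation_prompt(seeds), stop="Example 5")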

Are Sixteen Heads Really Better than One?

3 code implementations NeurIPS 2019 Paul Michel, Omer Levy, Graham Neubig

Attention is a powerful and ubiquitous mechanism for allowing neural models to focus on particular salient pieces of information by taking their weighted average when making predictions.

Self-Alignment with Instruction Backtranslation

2 code implementations 11 Aug 2023 Xian Li, Ping Yu, Chunting Zhou, Timo Schick, Omer Levy, Luke Zettlemoyer, Jason Weston, Mike Lewis

We present a scalable method to build a high quality instruction following language model by automatically labelling human-written text with corresponding instructions.

Instruction Following Language Modelling

Jointly Predicting Predicates and Arguments in Neural Semantic Role Labeling

1 code implementation ACL 2018 Luheng He, Kenton Lee, Omer Levy, Luke Zettlemoyer

Recent BIO-tagging-based neural semantic role labeling models are very high performing, but assume gold predicates as part of the input and cannot incorporate span-level features.

Semantic Role Labeling

Few-Shot Question Answering by Pretraining Span Selection

4 code implementations ACL 2021 Ori Ram, Yuval Kirstain, Jonathan Berant, Amir Globerson, Omer Levy

Given a passage with multiple sets of recurring spans, we mask in each set all recurring spans but one, and ask the model to select the correct span in the passage for each masked span.

Question Answering
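
A minimal sketch of the recurring-span masking described above, assuming spans have already been grouped by surface form. The `[QUESTION]` placeholder and the random choice of which occurrence to keep are illustrative; the actual pretraining operates on subword tokens with a dedicated span-selection head.

import random

def mask_recurring_spans(tokens, recurring_spans, mask="[QUESTION]", rng=random):
    # recurring_spans: a list of occurrence lists; each occurrence is a
    # (start, end) token-index pair, and all occurrences in one list share
    # the same surface form.
    answers, masked = [], []
    for occurrences in recurring_spans:
        keep = rng.randrange(len(occurrences))           # the one span left intact
        for j, span in enumerate(occurrences):
            (answers if j == keep else masked).append(span)
    out = list(tokens)
    for start, end in sorted(masked, reverse=True):      # right-to-left keeps indices valid
        out[start:end] = [mask]
    return out, answers   # answers index the original (uncorrupted) token sequence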

Structural Language Models of Code

2 code implementations ICML 2020 Uri Alon, Roy Sadaka, Omer Levy, Eran Yahav

We introduce a new approach to any-code completion that leverages the strict syntax of programming languages to model a code snippet as a tree: structural language modeling (SLM).

C++ code Code Completion +2

Ultra-Fine Entity Typing

1 code implementation ACL 2018 Eunsol Choi, Omer Levy, Yejin Choi, Luke Zettlemoyer

We introduce a new entity typing task: given a sentence with an entity mention, the goal is to predict a set of free-form phrases (e.g., skyscraper, songwriter, or criminal) that describe appropriate types for the target entity.

Entity Linking Entity Typing +1

Transformer Feed-Forward Layers Are Key-Value Memories

1 code implementation EMNLP 2021 Mor Geva, Roei Schuster, Jonathan Berant, Omer Levy

Feed-forward layers constitute two-thirds of a transformer model's parameters, yet their role in the network remains under-explored.
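
The key-value reading in the title can be stated compactly. Up to bias terms and notation (this is a schematic restatement, not the paper's exact formulation), a transformer feed-forward layer computes

$$ \mathrm{FF}(x) = f(x \cdot K^{\top}) \cdot V, $$

where the rows of the first matrix $K$ act as keys matched against the input and the rows of the second matrix $V$ act as the values they retrieve, and the paper analyzes what these learned key-value pairs capture.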

pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference

3 code implementations NAACL 2019 Mandar Joshi, Eunsol Choi, Omer Levy, Daniel S. Weld, Luke Zettlemoyer

Reasoning about implied relationships (e.g., paraphrastic, common sense, encyclopedic) between pairs of words is crucial for many cross-sentence inference problems.

Common Sense Reasoning Sentence +1

SCROLLS: Standardized CompaRison Over Long Language Sequences

2 code implementations 10 Jan 2022 Uri Shaham, Elad Segal, Maor Ivgi, Avia Efrat, Ori Yoran, Adi Haviv, Ankit Gupta, Wenhan Xiong, Mor Geva, Jonathan Berant, Omer Levy

NLP benchmarks have largely focused on short texts, such as sentences and paragraphs, even though long texts comprise a considerable amount of natural language in the wild.

Long-range modeling Natural Language Inference +1

Learning to Retrieve Passages without Supervision

1 code implementation NAACL 2022 Ori Ram, Gal Shachaf, Omer Levy, Jonathan Berant, Amir Globerson

Dense retrievers for open-domain question answering (ODQA) have been shown to achieve impressive performance by training on large datasets of question-passage pairs.

Contrastive Learning Open-Domain Question Answering +1

Instruction Induction: From Few Examples to Natural Language Task Descriptions

1 code implementation 22 May 2022 Or Honovich, Uri Shaham, Samuel R. Bowman, Omer Levy

Large language models are able to perform a task by conditioning on a few input-output demonstrations - a paradigm known as in-context learning.

In-Context Learning

Zero-Shot Relation Extraction via Reading Comprehension

2 code implementations CONLL 2017 Omer Levy, Minjoon Seo, Eunsol Choi, Luke Zettlemoyer

We show that relation extraction can be reduced to answering simple reading comprehension questions, by associating one or more natural-language questions with each relation slot.

Reading Comprehension Relation +5
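
A minimal sketch of the reduction described above: each relation slot is paired with one or more question templates, and an extractive reading-comprehension model answers them over the sentence, abstaining when no answer exists. The template wording, the relation names, and the `answer_question` reader below are hypothetical placeholders.

# Illustrative question templates; the paper associates several
# natural-language questions with each relation slot.
TEMPLATES = {
    "educated_at": ["Where did {subject} study?",
                    "Which university did {subject} graduate from?"],
    "occupation":  ["What does {subject} do for a living?"],
    "spouse":      ["Who is {subject} married to?"],
}

def extract_relation(sentence, subject, relation, answer_question):
    # answer_question(question, passage) -> answer span or None; any extractive
    # reading-comprehension model works here. Returning None lets the system
    # say that the relation is not expressed in the sentence.
    for template in TEMPLATES.get(relation, []):
        answer = answer_question(template.format(subject=subject), sentence)
        if answer is not None:
            return answer
    return None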

Recurrent Additive Networks

2 code implementations 21 May 2017 Kenton Lee, Omer Levy, Luke Zettlemoyer

We introduce recurrent additive networks (RANs), a new gated RNN which is distinguished by the use of purely additive latent state updates.

Language Modelling
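
Up to notation and bias terms, one way to write the purely additive update described above is the following sketch of the recurrence: the candidate content is a linear function of the input, and the latent state changes only through element-wise gated addition.

$$
\begin{aligned}
\tilde{c}_t &= W_{cx}\, x_t \\
i_t &= \sigma(W_{ih}\, h_{t-1} + W_{ix}\, x_t) \\
f_t &= \sigma(W_{fh}\, h_{t-1} + W_{fx}\, x_t) \\
c_t &= i_t \odot \tilde{c}_t + f_t \odot c_{t-1} \\
h_t &= g(c_t)
\end{aligned}
$$

Unlike an LSTM, no non-linearity is applied to the content before it enters the state, which is what makes the latent dynamics additive.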

Vision Transformers with Mixed-Resolution Tokenization

1 code implementation 1 Apr 2023 Tomer Ronen, Omer Levy, Avram Golbert

In this work, we apply this approach to Vision Transformers by introducing a novel image tokenization scheme, replacing the standard uniform grid with a mixed-resolution sequence of tokens, where each token represents a patch of arbitrary size.

Image Classification

Coreference Resolution without Span Representations

1 code implementation ACL 2021 Yuval Kirstain, Ori Ram, Omer Levy

The introduction of pretrained language models has reduced many complex task-specific NLP models to simple lightweight layers.

coreference-resolution

Neural Machine Translation without Embeddings

2 code implementations NAACL 2021 Uri Shaham, Omer Levy

Many NLP models operate over sequences of subword tokens produced by hand-crafted tokenization rules and heuristic subword induction algorithms.

Machine Translation Translation

Aligned Cross Entropy for Non-Autoregressive Machine Translation

1 code implementation ICML 2020 Marjan Ghazvininejad, Vladimir Karpukhin, Luke Zettlemoyer, Omer Levy

This difficulty is compounded during training with cross entropy loss, which can highly penalize small shifts in word order.

Machine Translation Translation

Transformer Language Models without Positional Encodings Still Learn Positional Information

1 code implementation 30 Mar 2022 Adi Haviv, Ori Ram, Ofir Press, Peter Izsak, Omer Levy

Causal transformer language models (LMs), such as GPT-3, typically require some form of positional encoding, such as positional embeddings.

Position

ZeroSCROLLS: A Zero-Shot Benchmark for Long Text Understanding

1 code implementation 23 May 2023 Uri Shaham, Maor Ivgi, Avia Efrat, Jonathan Berant, Omer Levy

We introduce ZeroSCROLLS, a zero-shot benchmark for natural language understanding over long texts, which contains only test and small validation sets, without training data.

Natural Language Understanding

How Optimal is Greedy Decoding for Extractive Question Answering?

1 code implementation 12 Aug 2021 Or Castel, Ori Ram, Avia Efrat, Omer Levy

However, this approach does not ensure that the answer is a span in the given passage, nor does it guarantee that it is the most probable one.

Extractive Question-Answering Question Answering +1

A Few More Examples May Be Worth Billions of Parameters

1 code implementation 8 Oct 2021 Yuval Kirstain, Patrick Lewis, Sebastian Riedel, Omer Levy

We investigate the dynamics of increasing the number of model parameters versus the number of labeled examples across a wide variety of tasks.

Extractive Question-Answering Multiple-choice +1

LMentry: A Language Model Benchmark of Elementary Language Tasks

1 code implementation 3 Nov 2022 Avia Efrat, Or Honovich, Omer Levy

As the performance of large language models rapidly improves, benchmarks are getting larger and more complex as well.

Language Modelling Sentence

word2vec Explained: deriving Mikolov et al.'s negative-sampling word-embedding method

5 code implementations 15 Feb 2014 Yoav Goldberg, Omer Levy

The word2vec software of Tomas Mikolov and colleagues (https://code.google.com/p/word2vec/) has gained a lot of traction lately, and provides state-of-the-art word embeddings.

Language Modelling Word Embeddings
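
For reference, the negative-sampling objective that the note derives maximizes, for each observed word-context pair $(w, c)$ with $k$ negative contexts $c_N$ drawn from a noise distribution $P_D$ over contexts:

$$ \log \sigma(\vec{w} \cdot \vec{c}) \;+\; k \cdot \mathbb{E}_{c_N \sim P_D}\!\left[\log \sigma(-\vec{w} \cdot \vec{c}_N)\right] $$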

ParaShoot: A Hebrew Question Answering Dataset

1 code implementation EMNLP (MRQA) 2021 Omri Keren, Omer Levy

NLP research in Hebrew has largely focused on morphology and syntax, where rich annotated datasets in the spirit of Universal Dependencies are available.

Question Answering

Models In a Spelling Bee: Language Models Implicitly Learn the Character Composition of Tokens

1 code implementation NAACL 2022 Itay Itzhak, Omer Levy

Standard pretrained language models operate on sequences of subword tokens without direct access to the characters that compose each token's string representation.

Language Modelling

LSTMs Exploit Linguistic Attributes of Data

no code implementations WS 2018 Nelson F. Liu, Omer Levy, Roy Schwartz, Chenhao Tan, Noah A. Smith

While recurrent neural networks have found success in a variety of natural language processing applications, they are general models of sequential data.

Memorization Open-Ended Question Answering

Simulating Action Dynamics with Neural Process Networks

no code implementations ICLR 2018 Antoine Bosselut, Omer Levy, Ari Holtzman, Corin Ennis, Dieter Fox, Yejin Choi

Understanding procedural language requires anticipating the causal effects of actions, even when they are not explicitly stated.

Deep RNNs Encode Soft Hierarchical Syntax

no code implementations ACL 2018 Terra Blevins, Omer Levy, Luke Zettlemoyer

We present a set of experiments to demonstrate that deep recurrent neural networks (RNNs) learn internal representations that capture soft hierarchical notions of syntax from highly varied supervision.

Dependency Parsing Language Modelling +3

Long Short-Term Memory as a Dynamically Computed Element-wise Weighted Sum

no code implementations ACL 2018 Omer Levy, Kenton Lee, Nicholas FitzGerald, Luke Zettlemoyer

LSTMs were introduced to combat vanishing gradients in simple RNNs by augmenting them with gated additive recurrent connections.

Annotation Artifacts in Natural Language Inference Data

no code implementations NAACL 2018 Suchin Gururangan, Swabha Swayamdipta, Omer Levy, Roy Schwartz, Samuel R. Bowman, Noah A. Smith

Large-scale datasets for natural language inference are created by presenting crowd workers with a sentence (premise), and asking them to generate three new sentences (hypotheses) that it entails, contradicts, or is logically neutral with respect to.

Natural Language Inference Negation +2

A Strong Baseline for Learning Cross-Lingual Word Embeddings from Sentence Alignments

no code implementations EACL 2017 Omer Levy, Anders Søgaard, Yoav Goldberg

While cross-lingual word embeddings have been studied extensively in recent years, the qualitative differences between the different algorithms remain vague.

Cross-Lingual Word Embeddings Sentence +1

Modeling Extractive Sentence Intersection via Subtree Entailment

no code implementations COLING 2016 Omer Levy, Ido Dagan, Gabriel Stanovsky, Judith Eckle-Kohler, Iryna Gurevych

Sentence intersection captures the semantic overlap of two texts, generalizing over paradigms such as textual entailment and semantic text similarity.

Abstractive Text Summarization Natural Language Inference +2

Neural Word Embedding as Implicit Matrix Factorization

no code implementations NeurIPS 2014 Omer Levy, Yoav Goldberg

We analyze skip-gram with negative-sampling (SGNS), a word embedding method introduced by Mikolov et al., and show that it is implicitly factorizing a word-context matrix, whose cells are the pointwise mutual information (PMI) of the respective word and context pairs, shifted by a global constant.

Word Similarity
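
Stated as a formula, the result above says that with sufficiently many dimensions, SGNS with $k$ negative samples implicitly factorizes the word-context matrix

$$ M_{ij} \;=\; \vec{w}_i \cdot \vec{c}_j \;=\; \mathrm{PMI}(w_i, c_j) - \log k, $$

i.e., the pointwise mutual information shifted by the global constant $\log k$.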

Improving Distributional Similarity with Lessons Learned from Word Embeddings

no code implementations TACL 2015 Omer Levy, Yoav Goldberg, Ido Dagan

Recent trends suggest that neural-network-inspired word embedding models outperform traditional count-based distributional models on word similarity and analogy detection tasks.

Word Embeddings Word Similarity

Semi-Autoregressive Training Improves Mask-Predict Decoding

no code implementations 23 Jan 2020 Marjan Ghazvininejad, Omer Levy, Luke Zettlemoyer

The recently proposed mask-predict decoding algorithm has narrowed the performance gap between semi-autoregressive machine translation models and the traditional left-to-right approach.

Machine Translation Translation

The Turking Test: Can Language Models Understand Instructions?

no code implementations 22 Oct 2020 Avia Efrat, Omer Levy

Supervised machine learning provides the learner with a set of input-output examples of the target task.

Language Modelling Sentence

Can Latent Alignments Improve Autoregressive Machine Translation?

no code implementations NAACL 2021 Adi Haviv, Lior Vassertail, Omer Levy

Latent alignment objectives such as CTC and AXE significantly improve non-autoregressive machine translation models.

Machine Translation Translation

What Do You Get When You Cross Beam Search with Nucleus Sampling?

no code implementations insights (ACL) 2022 Uri Shaham, Omer Levy

We combine beam search with the probabilistic pruning technique of nucleus sampling to create two deterministic nucleus search algorithms for natural language generation.

Machine Translation Text Generation +1

Structural Language Models for Any-Code Generation

no code implementations 25 Sep 2019 Uri Alon, Roy Sadaka, Omer Levy, Eran Yahav

We introduce a new approach to AnyGen that leverages the strict syntax of programming languages to model a code snippet as a tree: structural language modeling (SLM).

C++ code Code Generation +1

Are Mutually Intelligible Languages Easier to Translate?

no code implementations 31 Jan 2022 Avital Friedland, Jonathan Zeltser, Omer Levy

Two languages are considered mutually intelligible if their native speakers can communicate with each other, while using their own mother tongue.

Translation

Breaking Character: Are Subwords Good Enough for MRLs After All?

no code implementations 10 Apr 2022 Omri Keren, Tal Avinari, Reut Tsarfaty, Omer Levy

Large pretrained language models (PLMs) typically tokenize the input string into contiguous subwords before any pretraining or inference.

Extractive Question-Answering Language Modelling +7

Causes and Cures for Interference in Multilingual Translation

no code implementations 14 Dec 2022 Uri Shaham, Maha Elbayad, Vedanuj Goswami, Omer Levy, Shruti Bhosale

Multilingual machine translation models can benefit from synergy between different language pairs, but also suffer from interference.

Machine Translation Translation

A Simple Baseline for Beam Search Reranking

no code implementations 17 Dec 2022 Lior Vassertail, Omer Levy

Reranking methods in machine translation aim to close the gap between common evaluation metrics (e.g., BLEU) and maximum likelihood learning and decoding algorithms.

Machine Translation Translation

Scaling Laws for Generative Mixed-Modal Language Models

no code implementations 10 Jan 2023 Armen Aghajanyan, Lili Yu, Alexis Conneau, Wei-Ning Hsu, Karen Hambardzumyan, Susan Zhang, Stephen Roller, Naman Goyal, Omer Levy, Luke Zettlemoyer

To better understand the scaling properties of such mixed-modal models, we conducted over 250 experiments using seven different modalities and model sizes ranging from 8 million to 30 billion parameters, trained on 5-100 billion tokens.

X&Fuse: Fusing Visual Information in Text-to-Image Generation

no code implementations 2 Mar 2023 Yuval Kirstain, Omer Levy, Adam Polyak

We introduce X&Fuse, a general approach for conditioning on visual information when generating images from text.

Text-to-Image Generation

Branch-Solve-Merge Improves Large Language Model Evaluation and Generation

no code implementations 23 Oct 2023 Swarnadeep Saha, Omer Levy, Asli Celikyilmaz, Mohit Bansal, Jason Weston, Xian Li

Large Language Models (LLMs) are frequently used for multi-faceted language generation and evaluation tasks that involve satisfying intricate user constraints or taking into account multiple aspects and criteria.

Language Modelling Large Language Model +1
