Search Results for author: Omer Levy

Found 67 papers, 37 papers with code

Instruction Induction: From Few Examples to Natural Language Task Descriptions

1 code implementation · 22 May 2022 · Or Honovich, Uri Shaham, Samuel R. Bowman, Omer Levy

Large language models are able to perform a task by conditioning on a few input-output demonstrations - a paradigm known as in-context learning.

Breaking Character: Are Subwords Good Enough for MRLs After All?

no code implementations · 10 Apr 2022 · Omri Keren, Tal Avinari, Reut Tsarfaty, Omer Levy

Large pretrained language models (PLMs) typically tokenize the input string into contiguous subwords before any pretraining or inference.

Language Modelling · Morphological Disambiguation · +4

Transformer Language Models without Positional Encodings Still Learn Positional Information

no code implementations · 30 Mar 2022 · Adi Haviv, Ori Ram, Ofir Press, Peter Izsak, Omer Levy

Transformers typically require some form of positional encoding, such as positional embeddings, to process natural language sequences.

Are Mutually Intelligible Languages Easier to Translate?

no code implementations · 31 Jan 2022 · Avital Friedland, Jonathan Zeltser, Omer Levy

Two languages are considered mutually intelligible if their native speakers can communicate with each other while using their own mother tongue.

SCROLLS: Standardized CompaRison Over Long Language Sequences

1 code implementation · 10 Jan 2022 · Uri Shaham, Elad Segal, Maor Ivgi, Avia Efrat, Ori Yoran, Adi Haviv, Ankit Gupta, Wenhan Xiong, Mor Geva, Jonathan Berant, Omer Levy

NLP benchmarks have largely focused on short texts, such as sentences and paragraphs, even though long texts comprise a considerable amount of natural language in the wild.

Long-range modeling · Natural Language Inference · +1

Learning to Retrieve Passages without Supervision

1 code implementation · 14 Dec 2021 · Ori Ram, Gal Shachaf, Omer Levy, Jonathan Berant, Amir Globerson

Dense retrievers for open-domain question answering (ODQA) have been shown to achieve impressive performance by training on large datasets of question-passage pairs.

Contrastive Learning · Open-Domain Question Answering

Simple Local Attentions Remain Competitive for Long-Context Tasks

1 code implementation · 14 Dec 2021 · Wenhan Xiong, Barlas Oğuz, Anchit Gupta, Xilun Chen, Diana Liskovich, Omer Levy, Wen-tau Yih, Yashar Mehdad

Many NLP tasks require processing long contexts beyond the length limit of pretrained models.

A Few More Examples May Be Worth Billions of Parameters

1 code implementation · 8 Oct 2021 · Yuval Kirstain, Patrick Lewis, Sebastian Riedel, Omer Levy

We investigate the dynamics of increasing the number of model parameters versus the number of labeled examples across a wide variety of tasks.

Multiple-choice · Question Answering

ParaShoot: A Hebrew Question Answering Dataset

1 code implementation · EMNLP (MRQA) 2021 · Omri Keren, Omer Levy

NLP research in Hebrew has largely focused on morphology and syntax, where rich annotated datasets in the spirit of Universal Dependencies are available.

Question Answering

Models In a Spelling Bee: Language Models Implicitly Learn the Character Composition of Tokens

no code implementations · 25 Aug 2021 · Itay Itzhak, Omer Levy

Standard pretrained language models operate on sequences of subword tokens without direct access to the characters that compose each token's string representation.

Language Modelling · Pretrained Language Models

How Optimal is Greedy Decoding for Extractive Question Answering?

1 code implementation · 12 Aug 2021 · Or Castel, Ori Ram, Avia Efrat, Omer Levy

However, this approach does not ensure that the answer is a span in the given passage, nor does it guarantee that it is the most probable one.

Pretrained Language Models · Question Answering · +1

What Do You Get When You Cross Beam Search with Nucleus Sampling?

no code implementations · insights (ACL) 2022 · Uri Shaham, Omer Levy

We combine beam search with the probabilistic pruning technique of nucleus sampling to create two deterministic nucleus search algorithms for natural language generation.

Machine Translation · Text Generation · +1
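
The pruning step the combination builds on can be sketched as follows. This is an illustrative toy, not the paper's code; the function name and the renormalization choice are ours. At each decoding step, a deterministic search would expand hypotheses only over tokens that survive the pruning.

```python
import numpy as np

def nucleus_prune(probs, p=0.9):
    """Keep the smallest set of tokens whose cumulative probability
    reaches p (the 'nucleus'); zero out the tail and renormalize."""
    order = np.argsort(probs)[::-1]              # tokens by descending probability
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, p) + 1  # size of smallest prefix with mass >= p
    pruned = np.zeros_like(probs)
    keep = order[:cutoff]
    pruned[keep] = probs[keep]
    return pruned / pruned.sum()

next_token_probs = np.array([0.5, 0.3, 0.15, 0.05])
pruned = nucleus_prune(next_token_probs, p=0.8)  # tail tokens get probability 0
```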

Can Latent Alignments Improve Autoregressive Machine Translation?

no code implementations · NAACL 2021 · Adi Haviv, Lior Vassertail, Omer Levy

Latent alignment objectives such as CTC and AXE significantly improve non-autoregressive machine translation models.

Machine Translation · Translation

How to Train BERT with an Academic Budget

3 code implementations · EMNLP 2021 · Peter Izsak, Moshe Berchansky, Omer Levy

While large language models a la BERT are used ubiquitously in NLP, pretraining them is considered a luxury that only a few well-funded industry labs can afford.

Language Modelling · Linguistic Acceptability · +4

Cryptonite: A Cryptic Crossword Benchmark for Extreme Ambiguity in Language

1 code implementation · EMNLP 2021 · Avia Efrat, Uri Shaham, Dan Kilman, Omer Levy

Current NLP datasets targeting ambiguity can be solved by a native speaker with relative ease.

Coreference Resolution without Span Representations

1 code implementation · ACL 2021 · Yuval Kirstain, Ori Ram, Omer Levy

The introduction of pretrained language models has reduced many complex task-specific NLP models to simple lightweight layers.

Coreference Resolution · Pretrained Language Models

Few-Shot Question Answering by Pretraining Span Selection

4 code implementations · ACL 2021 · Ori Ram, Yuval Kirstain, Jonathan Berant, Amir Globerson, Omer Levy

Given a passage with multiple sets of recurring spans, we mask in each set all recurring spans but one, and ask the model to select the correct span in the passage for each masked span.

Question Answering
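
The masking scheme described above can be sketched with a toy routine. In the actual model each masked occurrence is replaced by a single special question token and resolved with a span-selection head; the sketch below keeps only the core idea of retaining one occurrence per recurring span, and all names and simplifications (fixed-length spans, token-wise masking) are ours.

```python
import random

def mask_recurring_spans(tokens, span_len=2, mask="[MASK]", seed=0):
    """For each span that recurs in the passage, keep one randomly
    chosen occurrence as the 'answer' and mask every other occurrence."""
    rng = random.Random(seed)
    occurrences = {}
    for i in range(len(tokens) - span_len + 1):
        occurrences.setdefault(tuple(tokens[i:i + span_len]), []).append(i)
    out = list(tokens)
    for span, starts in occurrences.items():
        if len(starts) < 2:
            continue                      # span does not recur; leave it alone
        keep = rng.choice(starts)         # the occurrence the model must select
        for start in starts:
            if start != keep:
                out[start:start + span_len] = [mask] * span_len
    return out

masked = mask_recurring_spans("the cat sat on the cat".split())
```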

Transformer Feed-Forward Layers Are Key-Value Memories

1 code implementation · EMNLP 2021 · Mor Geva, Roei Schuster, Jonathan Berant, Omer Levy

Feed-forward layers constitute two-thirds of a transformer model's parameters, yet their role in the network remains under-explored.

The Turking Test: Can Language Models Understand Instructions?

no code implementations · 22 Oct 2020 · Avia Efrat, Omer Levy

Supervised machine learning provides the learner with a set of input-output examples of the target task.

Language Modelling

Neural Machine Translation without Embeddings

2 code implementations · NAACL 2021 · Uri Shaham, Omer Levy

Many NLP models operate over sequences of subword tokens produced by hand-crafted tokenization rules and heuristic subword induction algorithms.

Machine Translation · Translation

Aligned Cross Entropy for Non-Autoregressive Machine Translation

1 code implementation · ICML 2020 · Marjan Ghazvininejad, Vladimir Karpukhin, Luke Zettlemoyer, Omer Levy

This difficulty is compounded during training with cross entropy loss, which can highly penalize small shifts in word order.

Machine Translation · Translation

Semi-Autoregressive Training Improves Mask-Predict Decoding

no code implementations · 23 Jan 2020 · Marjan Ghazvininejad, Omer Levy, Luke Zettlemoyer

The recently proposed mask-predict decoding algorithm has narrowed the performance gap between semi-autoregressive machine translation models and the traditional left-to-right approach.

Machine Translation · Translation

Generalization through Memorization: Nearest Neighbor Language Models

2 code implementations · ICLR 2020 · Urvashi Khandelwal, Omer Levy, Dan Jurafsky, Luke Zettlemoyer, Mike Lewis

Applying this augmentation to a strong Wikitext-103 LM, with neighbors drawn from the original training set, our $k$NN-LM achieves a new state-of-the-art perplexity of 15.79 - a 2.9 point improvement with no additional training.

Domain Adaptation · Language Modelling
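
The augmentation amounts to interpolating the LM's next-token distribution with one induced by nearest-neighbor retrieval over cached contexts. A minimal sketch; the function names, toy distances, and the interpolation weight are illustrative, not the paper's code.

```python
import numpy as np

def knn_distribution(distances, neighbor_next_tokens, vocab_size):
    """Turn retrieved neighbors into a next-token distribution:
    softmax over negative distances, aggregated per neighbor's target."""
    weights = np.exp(-np.asarray(distances, dtype=float))
    weights /= weights.sum()
    p = np.zeros(vocab_size)
    for w, tok in zip(weights, neighbor_next_tokens):
        p[tok] += w
    return p

def knn_lm(p_lm, p_knn, lam=0.25):
    """Interpolate the parametric LM with the retrieval distribution."""
    return lam * p_knn + (1.0 - lam) * p_lm

p_lm = np.array([0.7, 0.2, 0.1])                        # LM's next-token probabilities
p_knn = knn_distribution([0.1, 0.5, 0.2], [1, 1, 2], vocab_size=3)
p = knn_lm(p_lm, p_knn)                                 # still a valid distribution
```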

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

31 code implementations · ACL 2020 · Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdel-rahman Mohamed, Omer Levy, Ves Stoyanov, Luke Zettlemoyer

We evaluate a number of noising approaches, finding the best performance by both randomly shuffling the order of the original sentences and using a novel in-filling scheme, where spans of text are replaced with a single mask token.

Abstractive Text Summarization · Denoising · +5
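
The two noising transforms highlighted in the excerpt can be sketched on plain strings. This toy draws a uniform span length where the paper samples span lengths from a Poisson distribution, and the function and token names are ours.

```python
import random

def bart_noise(sentences, seed=0, mask="<mask>"):
    """Apply two BART-style corruptions: shuffle the sentence order,
    then replace one random token span with a single mask token."""
    rng = random.Random(seed)
    shuffled = sentences[:]
    rng.shuffle(shuffled)                     # sentence permutation
    tokens = " ".join(shuffled).split()
    span_len = rng.randint(1, 3)              # toy stand-in for Poisson span lengths
    start = rng.randrange(0, len(tokens) - span_len + 1)
    tokens[start:start + span_len] = [mask]   # text infilling: whole span -> one <mask>
    return " ".join(tokens)

noised = bart_noise(["The cat sat .", "It was warm ."])
```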

Structural Language Models of Code

2 code implementations · ICML 2020 · Uri Alon, Roy Sadaka, Omer Levy, Eran Yahav

We introduce a new approach to any-code completion that leverages the strict syntax of programming languages to model a code snippet as a tree: structural language modeling (SLM).

Code Completion · Code Generation · +1

Structural Language Models for Any-Code Generation

no code implementations · 25 Sep 2019 · Uri Alon, Roy Sadaka, Omer Levy, Eran Yahav

We introduce a new approach to AnyGen that leverages the strict syntax of programming languages to model a code snippet as a tree: structural language modeling (SLM).

Code Generation · Language Modelling

BERT for Coreference Resolution: Baselines and Analysis

2 code implementations · IJCNLP 2019 · Mandar Joshi, Omer Levy, Daniel S. Weld, Luke Zettlemoyer

We apply BERT to coreference resolution, achieving strong improvements on the OntoNotes (+3.9 F1) and GAP (+11.5 F1) benchmarks.

Coreference Resolution

What Does BERT Look At? An Analysis of BERT's Attention

2 code implementations · WS 2019 · Kevin Clark, Urvashi Khandelwal, Omer Levy, Christopher D. Manning

Large pre-trained neural networks such as BERT have had great recent success in NLP, motivating a growing body of research investigating what aspects of language they are able to learn from unlabeled data.

Language Modelling

Are Sixteen Heads Really Better than One?

3 code implementations · NeurIPS 2019 · Paul Michel, Omer Levy, Graham Neubig

Attention is a powerful and ubiquitous mechanism for allowing neural models to focus on particular salient pieces of information by taking their weighted average when making predictions.
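
The mechanism described in that sentence is scaled dot-product attention; a minimal single-head sketch (illustrative only; the paper's finding is that many such heads can be pruned at test time with little loss in quality):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(queries, keys, values):
    """Scaled dot-product attention: each output is a weighted average
    of the values, weighted by query-key similarity."""
    scores = queries @ keys.T / np.sqrt(keys.shape[-1])
    weights = softmax(scores)
    return weights @ values, weights

q = np.array([[1.0, 0.0]])
k = np.array([[1.0, 0.0], [0.0, 1.0]])
v = np.array([[10.0, 0.0], [0.0, 10.0]])
out, w = attention(q, k, v)        # rows of w sum to 1: a proper weighted average
```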

SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems

3 code implementations · NeurIPS 2019 · Alex Wang, Yada Pruksachatkun, Nikita Nangia, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman

In the last year, new models and methods for pretraining and transfer learning have driven striking performance improvements across a range of language understanding tasks.

Transfer Learning

pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference

3 code implementations · NAACL 2019 · Mandar Joshi, Eunsol Choi, Omer Levy, Daniel S. Weld, Luke Zettlemoyer

Reasoning about implied relationships (e.g., paraphrastic, common sense, encyclopedic) between pairs of words is crucial for many cross-sentence inference problems.

Common Sense Reasoning · Word Embeddings

code2seq: Generating Sequences from Structured Representations of Code

6 code implementations · ICLR 2019 · Uri Alon, Shaked Brody, Omer Levy, Eran Yahav

The ability to generate natural language sequences from source code snippets has a variety of applications such as code summarization, documentation, and retrieval.

Code Summarization · Source Code Summarization · +1

Ultra-Fine Entity Typing

no code implementations · ACL 2018 · Eunsol Choi, Omer Levy, Yejin Choi, Luke Zettlemoyer

We introduce a new entity typing task: given a sentence with an entity mention, the goal is to predict a set of free-form phrases (e.g., skyscraper, songwriter, or criminal) that describe appropriate types for the target entity.

Entity Linking · Entity Typing

LSTMs Exploit Linguistic Attributes of Data

no code implementations · WS 2018 · Nelson F. Liu, Omer Levy, Roy Schwartz, Chenhao Tan, Noah A. Smith

While recurrent neural networks have found success in a variety of natural language processing applications, they are general models of sequential data.

Jointly Predicting Predicates and Arguments in Neural Semantic Role Labeling

1 code implementation · ACL 2018 · Luheng He, Kenton Lee, Omer Levy, Luke Zettlemoyer

Recent BIO-tagging-based neural semantic role labeling models are very high performing, but assume gold predicates as part of the input and cannot incorporate span-level features.

Semantic Role Labeling

Deep RNNs Encode Soft Hierarchical Syntax

no code implementations · ACL 2018 · Terra Blevins, Omer Levy, Luke Zettlemoyer

We present a set of experiments to demonstrate that deep recurrent neural networks (RNNs) learn internal representations that capture soft hierarchical notions of syntax from highly varied supervision.

Dependency Parsing · Language Modelling · +3

Long Short-Term Memory as a Dynamically Computed Element-wise Weighted Sum

no code implementations · ACL 2018 · Omer Levy, Kenton Lee, Nicholas FitzGerald, Luke Zettlemoyer

LSTMs were introduced to combat vanishing gradients in simple RNNs by augmenting them with gated additive recurrent connections.

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding

8 code implementations · WS 2018 · Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman

For natural language understanding (NLU) technology to be maximally useful, both practically and as a scientific object of study, it must be general: it must be able to process language in a way that is not exclusively tailored to any one specific task or dataset.

Natural Language Inference · Natural Language Understanding · +1

code2vec: Learning Distributed Representations of Code

9 code implementations · 26 Mar 2018 · Uri Alon, Meital Zilberstein, Omer Levy, Eran Yahav

We demonstrate the effectiveness of our approach by using it to predict a method's name from the vector representation of its body.

A General Path-Based Representation for Predicting Program Properties

3 code implementations · 26 Mar 2018 · Uri Alon, Meital Zilberstein, Omer Levy, Eran Yahav

A major challenge when learning from programs is $\textit{how to represent programs in a way that facilitates effective learning}$.

Annotation Artifacts in Natural Language Inference Data

no code implementations · NAACL 2018 · Suchin Gururangan, Swabha Swayamdipta, Omer Levy, Roy Schwartz, Samuel R. Bowman, Noah A. Smith

Large-scale datasets for natural language inference are created by presenting crowd workers with a sentence (premise), and asking them to generate three new sentences (hypotheses) that it entails, contradicts, or is logically neutral with respect to.

Natural Language Inference · Text Categorization

Simulating Action Dynamics with Neural Process Networks

no code implementations · ICLR 2018 · Antoine Bosselut, Omer Levy, Ari Holtzman, Corin Ennis, Dieter Fox, Yejin Choi

Understanding procedural language requires anticipating the causal effects of actions, even when they are not explicitly stated.

Zero-Shot Relation Extraction via Reading Comprehension

2 code implementations · CONLL 2017 · Omer Levy, Minjoon Seo, Eunsol Choi, Luke Zettlemoyer

We show that relation extraction can be reduced to answering simple reading comprehension questions, by associating one or more natural-language questions with each relation slot.

Reading Comprehension · Relation Extraction · +2

Recurrent Additive Networks

2 code implementations · 21 May 2017 · Kenton Lee, Omer Levy, Luke Zettlemoyer

We introduce recurrent additive networks (RANs), a new gated RNN which is distinguished by the use of purely additive latent state updates.

Language Modelling
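
The "purely additive latent state update" can be written down directly. A minimal single-step sketch: the weight names are ours, and as a simplification the gates here condition on the previous state rather than a separate output, so this is the flavor of the update, not the paper's exact formulation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ran_step(x, c_prev, Wx, Wi, Ui, Wf, Uf):
    """One recurrent additive network (RAN) step: no tanh candidate;
    the state is a gated sum of a linear input projection and the
    previous state."""
    content = Wx @ x                          # linear content layer
    i = sigmoid(Wi @ x + Ui @ c_prev)         # input gate
    f = sigmoid(Wf @ x + Uf @ c_prev)         # forget gate
    c = i * content + f * c_prev              # purely additive state update
    h = np.tanh(c)                            # output non-linearity
    return h, c

d = 3
rng = np.random.default_rng(0)
Wx, Wi, Ui, Wf, Uf = (rng.standard_normal((d, d)) * 0.1 for _ in range(5))
h, c = ran_step(rng.standard_normal(d), np.zeros(d), Wx, Wi, Ui, Wf, Uf)
```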

Modeling Extractive Sentence Intersection via Subtree Entailment

no code implementations · COLING 2016 · Omer Levy, Ido Dagan, Gabriel Stanovsky, Judith Eckle-Kohler, Iryna Gurevych

Sentence intersection captures the semantic overlap of two texts, generalizing over paradigms such as textual entailment and semantic text similarity.

Abstractive Text Summarization · Natural Language Inference · +1

A Strong Baseline for Learning Cross-Lingual Word Embeddings from Sentence Alignments

no code implementations · EACL 2017 · Omer Levy, Anders Søgaard, Yoav Goldberg

While cross-lingual word embeddings have been studied extensively in recent years, the qualitative differences between the different algorithms remain vague.

Cross-Lingual Word Embeddings · Word Embeddings

Improving Distributional Similarity with Lessons Learned from Word Embeddings

no code implementations · TACL 2015 · Omer Levy, Yoav Goldberg, Ido Dagan

Recent trends suggest that neural-network-inspired word embedding models outperform traditional count-based distributional models on word similarity and analogy detection tasks.

Word Embeddings · Word Similarity

Neural Word Embedding as Implicit Matrix Factorization

no code implementations · NeurIPS 2014 · Omer Levy, Yoav Goldberg

We analyze skip-gram with negative-sampling (SGNS), a word embedding method introduced by Mikolov et al., and show that it is implicitly factorizing a word-context matrix, whose cells are the pointwise mutual information (PMI) of the respective word and context pairs, shifted by a global constant.

Word Similarity
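
That factorization target is easy to materialize for a toy corpus. A sketch that builds the shifted-PMI matrix PMI(w, c) - log k from a word-context co-occurrence count matrix, where k is the number of negative samples; the function name is ours.

```python
import numpy as np

def shifted_pmi(counts, k=5):
    """The matrix SGNS implicitly factorizes: pointwise mutual
    information of each word-context pair, shifted by log k."""
    total = counts.sum()
    p_w = counts.sum(axis=1, keepdims=True) / total   # marginal P(w)
    p_c = counts.sum(axis=0, keepdims=True) / total   # marginal P(c)
    p_wc = counts / total                             # joint P(w, c)
    with np.errstate(divide="ignore"):                # unseen pairs -> -inf
        pmi = np.log(p_wc / (p_w * p_c))
    return pmi - np.log(k)

counts = np.array([[10.0, 0.0],
                   [0.0, 10.0]])                      # toy co-occurrence counts
M = shifted_pmi(counts, k=5)
```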

word2vec Explained: deriving Mikolov et al.'s negative-sampling word-embedding method

5 code implementations · 15 Feb 2014 · Yoav Goldberg, Omer Levy

The word2vec software of Tomas Mikolov and colleagues (https://code.google.com/p/word2vec/) has gained a lot of traction lately, and provides state-of-the-art word embeddings.

Language Modelling · Word Embeddings
