Search Results for author: Omer Levy

Found 67 papers, 37 papers with code

Instruction Induction: From Few Examples to Natural Language Task Descriptions

1 code implementation · 22 May 2022 · Or Honovich, Uri Shaham, Samuel R. Bowman, Omer Levy

Large language models are able to perform a task by conditioning on a few input-output demonstrations - a paradigm known as in-context learning.

Breaking Character: Are Subwords Good Enough for MRLs After All?

no code implementations · 10 Apr 2022 · Omri Keren, Tal Avinari, Reut Tsarfaty, Omer Levy

Large pretrained language models (PLMs) typically tokenize the input string into contiguous subwords before any pretraining or inference.

Language Modelling · Morphological Disambiguation · +4

Transformer Language Models without Positional Encodings Still Learn Positional Information

no code implementations · 30 Mar 2022 · Adi Haviv, Ori Ram, Ofir Press, Peter Izsak, Omer Levy

Transformers typically require some form of positional encoding, such as positional embeddings, to process natural language sequences.

Are Mutually Intelligible Languages Easier to Translate?

no code implementations · 31 Jan 2022 · Avital Friedland, Jonathan Zeltser, Omer Levy

Two languages are considered mutually intelligible if their native speakers can communicate with each other while using their own mother tongue.

SCROLLS: Standardized CompaRison Over Long Language Sequences

1 code implementation · 10 Jan 2022 · Uri Shaham, Elad Segal, Maor Ivgi, Avia Efrat, Ori Yoran, Adi Haviv, Ankit Gupta, Wenhan Xiong, Mor Geva, Jonathan Berant, Omer Levy

NLP benchmarks have largely focused on short texts, such as sentences and paragraphs, even though long texts comprise a considerable amount of natural language in the wild.

Long-range modeling · Natural Language Inference · +1

Learning to Retrieve Passages without Supervision

1 code implementation · 14 Dec 2021 · Ori Ram, Gal Shachaf, Omer Levy, Jonathan Berant, Amir Globerson

Dense retrievers for open-domain question answering (ODQA) have been shown to achieve impressive performance by training on large datasets of question-passage pairs.

Contrastive Learning · Open-Domain Question Answering

Simple Local Attentions Remain Competitive for Long-Context Tasks

1 code implementation · 14 Dec 2021 · Wenhan Xiong, Barlas Oğuz, Anchit Gupta, Xilun Chen, Diana Liskovich, Omer Levy, Wen-tau Yih, Yashar Mehdad

Many NLP tasks require processing long contexts beyond the length limit of pretrained models.

A Few More Examples May Be Worth Billions of Parameters

1 code implementation · 8 Oct 2021 · Yuval Kirstain, Patrick Lewis, Sebastian Riedel, Omer Levy

We investigate the dynamics of increasing the number of model parameters versus the number of labeled examples across a wide variety of tasks.

Multiple-choice · Question Answering

ParaShoot: A Hebrew Question Answering Dataset

1 code implementation · EMNLP (MRQA) 2021 · Omri Keren, Omer Levy

NLP research in Hebrew has largely focused on morphology and syntax, where rich annotated datasets in the spirit of Universal Dependencies are available.

Question Answering

Models In a Spelling Bee: Language Models Implicitly Learn the Character Composition of Tokens

no code implementations · 25 Aug 2021 · Itay Itzhak, Omer Levy

Standard pretrained language models operate on sequences of subword tokens without direct access to the characters that compose each token's string representation.

Language Modelling · Pretrained Language Models

How Optimal is Greedy Decoding for Extractive Question Answering?

1 code implementation · 12 Aug 2021 · Or Castel, Ori Ram, Avia Efrat, Omer Levy

However, this approach does not ensure that the answer is a span in the given passage, nor does it guarantee that it is the most probable one.

Pretrained Language Models · Question Answering · +1

What Do You Get When You Cross Beam Search with Nucleus Sampling?

no code implementations · insights (ACL) 2022 · Uri Shaham, Omer Levy

We combine beam search with the probabilistic pruning technique of nucleus sampling to create two deterministic nucleus search algorithms for natural language generation.

Machine Translation · Text Generation · +1
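
The pruning step the combination builds on can be sketched as follows. This is an illustrative toy, not the paper's code; the function name and the renormalization choice are ours. At each decoding step, a deterministic search would expand hypotheses only over tokens that survive the pruning.

```python
import numpy as np

def nucleus_prune(probs, p=0.9):
    """Keep the smallest set of tokens whose cumulative probability
    reaches p (the 'nucleus'); zero out the tail and renormalize."""
    order = np.argsort(probs)[::-1]              # tokens by descending probability
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, p) + 1  # size of smallest prefix with mass >= p
    pruned = np.zeros_like(probs)
    keep = order[:cutoff]
    pruned[keep] = probs[keep]
    return pruned / pruned.sum()

next_token_probs = np.array([0.5, 0.3, 0.15, 0.05])
pruned = nucleus_prune(next_token_probs, p=0.8)  # tail tokens get probability 0
```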

Can Latent Alignments Improve Autoregressive Machine Translation?

no code implementations · NAACL 2021 · Adi Haviv, Lior Vassertail, Omer Levy

Latent alignment objectives such as CTC and AXE significantly improve non-autoregressive machine translation models.

Machine Translation · Translation

How to Train BERT with an Academic Budget

3 code implementations · EMNLP 2021 · Peter Izsak, Moshe Berchansky, Omer Levy

While large language models a la BERT are used ubiquitously in NLP, pretraining them is considered a luxury that only a few well-funded industry labs can afford.

Language Modelling · Linguistic Acceptability · +4

Cryptonite: A Cryptic Crossword Benchmark for Extreme Ambiguity in Language

1 code implementation · EMNLP 2021 · Avia Efrat, Uri Shaham, Dan Kilman, Omer Levy

Current NLP datasets targeting ambiguity can be solved by a native speaker with relative ease.

Coreference Resolution without Span Representations

1 code implementation · ACL 2021 · Yuval Kirstain, Ori Ram, Omer Levy

The introduction of pretrained language models has reduced many complex task-specific NLP models to simple lightweight layers.

Coreference Resolution · Pretrained Language Models

Few-Shot Question Answering by Pretraining Span Selection

4 code implementations · ACL 2021 · Ori Ram, Yuval Kirstain, Jonathan Berant, Amir Globerson, Omer Levy

Given a passage with multiple sets of recurring spans, we mask in each set all recurring spans but one, and ask the model to select the correct span in the passage for each masked span.

Question Answering
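
The masking scheme described above can be sketched with a toy routine. In the actual model each masked occurrence is replaced by a single special question token and resolved with a span-selection head; the sketch below keeps only the core idea of retaining one occurrence per recurring span, and all names and simplifications (fixed-length spans, token-wise masking) are ours.

```python
import random

def mask_recurring_spans(tokens, span_len=2, mask="[MASK]", seed=0):
    """For each span that recurs in the passage, keep one randomly
    chosen occurrence as the 'answer' and mask every other occurrence."""
    rng = random.Random(seed)
    occurrences = {}
    for i in range(len(tokens) - span_len + 1):
        occurrences.setdefault(tuple(tokens[i:i + span_len]), []).append(i)
    out = list(tokens)
    for span, starts in occurrences.items():
        if len(starts) < 2:
            continue                      # span does not recur; leave it alone
        keep = rng.choice(starts)         # the occurrence the model must select
        for start in starts:
            if start != keep:
                out[start:start + span_len] = [mask] * span_len
    return out

masked = mask_recurring_spans("the cat sat on the cat".split())
```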

Transformer Feed-Forward Layers Are Key-Value Memories

1 code implementation · EMNLP 2021 · Mor Geva, Roei Schuster, Jonathan Berant, Omer Levy

Feed-forward layers constitute two-thirds of a transformer model's parameters, yet their role in the network remains under-explored.

The Turking Test: Can Language Models Understand Instructions?

no code implementations · 22 Oct 2020 · Avia Efrat, Omer Levy

Supervised machine learning provides the learner with a set of input-output examples of the target task.

Language Modelling

Neural Machine Translation without Embeddings

2 code implementations · NAACL 2021 · Uri Shaham, Omer Levy

Many NLP models operate over sequences of subword tokens produced by hand-crafted tokenization rules and heuristic subword induction algorithms.

Machine Translation · Translation

Aligned Cross Entropy for Non-Autoregressive Machine Translation

1 code implementation · ICML 2020 · Marjan Ghazvininejad, Vladimir Karpukhin, Luke Zettlemoyer, Omer Levy

This difficulty is compounded during training with cross entropy loss, which can highly penalize small shifts in word order.

Machine Translation · Translation

Semi-Autoregressive Training Improves Mask-Predict Decoding

no code implementations · 23 Jan 2020 · Marjan Ghazvininejad, Omer Levy, Luke Zettlemoyer

The recently proposed mask-predict decoding algorithm has narrowed the performance gap between semi-autoregressive machine translation models and the traditional left-to-right approach.

Machine Translation · Translation

Generalization through Memorization: Nearest Neighbor Language Models

2 code implementations · ICLR 2020 · Urvashi Khandelwal, Omer Levy, Dan Jurafsky, Luke Zettlemoyer, Mike Lewis

Applying this augmentation to a strong Wikitext-103 LM, with neighbors drawn from the original training set, our $k$NN-LM achieves a new state-of-the-art perplexity of 15.79 - a 2.9 point improvement with no additional training.

Domain Adaptation · Language Modelling
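
The augmentation amounts to interpolating the LM's next-token distribution with one induced by nearest-neighbor retrieval over cached contexts. A minimal sketch; the function names, toy distances, and the interpolation weight are illustrative, not the paper's code.

```python
import numpy as np

def knn_distribution(distances, neighbor_next_tokens, vocab_size):
    """Turn retrieved neighbors into a next-token distribution:
    softmax over negative distances, aggregated per neighbor's target."""
    weights = np.exp(-np.asarray(distances, dtype=float))
    weights /= weights.sum()
    p = np.zeros(vocab_size)
    for w, tok in zip(weights, neighbor_next_tokens):
        p[tok] += w
    return p

def knn_lm(p_lm, p_knn, lam=0.25):
    """Interpolate the parametric LM with the retrieval distribution."""
    return lam * p_knn + (1.0 - lam) * p_lm

p_lm = np.array([0.7, 0.2, 0.1])                        # LM's next-token probabilities
p_knn = knn_distribution([0.1, 0.5, 0.2], [1, 1, 2], vocab_size=3)
p = knn_lm(p_lm, p_knn)                                 # still a valid distribution
```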

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

31 code implementations · ACL 2020 · Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdel-rahman Mohamed, Omer Levy, Ves Stoyanov, Luke Zettlemoyer

We evaluate a number of noising approaches, finding the best performance by both randomly shuffling the order of the original sentences and using a novel in-filling scheme, where spans of text are replaced with a single mask token.

Abstractive Text Summarization · Denoising · +5
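
The two noising transforms highlighted in the excerpt can be sketched on plain strings. This toy draws a uniform span length where the paper samples span lengths from a Poisson distribution, and the function and token names are ours.

```python
import random

def bart_noise(sentences, seed=0, mask="<mask>"):
    """Apply two BART-style corruptions: shuffle the sentence order,
    then replace one random token span with a single mask token."""
    rng = random.Random(seed)
    shuffled = sentences[:]
    rng.shuffle(shuffled)                     # sentence permutation
    tokens = " ".join(shuffled).split()
    span_len = rng.randint(1, 3)              # toy stand-in for Poisson span lengths
    start = rng.randrange(0, len(tokens) - span_len + 1)
    tokens[start:start + span_len] = [mask]   # text infilling: whole span -> one <mask>
    return " ".join(tokens)

noised = bart_noise(["The cat sat .", "It was warm ."])
```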

Structural Language Models of Code

2 code implementations · ICML 2020 · Uri Alon, Roy Sadaka, Omer Levy, Eran Yahav

We introduce a new approach to any-code completion that leverages the strict syntax of programming languages to model a code snippet as a tree: structural language modeling (SLM).

Code Completion · Code Generation · +1

Structural Language Models for Any-Code Generation

no code implementations · 25 Sep 2019 · Uri Alon, Roy Sadaka, Omer Levy, Eran Yahav

We introduce a new approach to AnyGen that leverages the strict syntax of programming languages to model a code snippet as a tree: structural language modeling (SLM).

Code Generation · Language Modelling

BERT for Coreference Resolution: Baselines and Analysis

2 code implementations · IJCNLP 2019 · Mandar Joshi, Omer Levy, Daniel S. Weld, Luke Zettlemoyer

We apply BERT to coreference resolution, achieving strong improvements on the OntoNotes (+3.9 F1) and GAP (+11.5 F1) benchmarks.

Coreference Resolution

What Does BERT Look At? An Analysis of BERT's Attention

2 code implementations · WS 2019 · Kevin Clark, Urvashi Khandelwal, Omer Levy, Christopher D. Manning

Large pre-trained neural networks such as BERT have had great recent success in NLP, motivating a growing body of research investigating what aspects of language they are able to learn from unlabeled data.

Language Modelling

Are Sixteen Heads Really Better than One?

3 code implementations · NeurIPS 2019 · Paul Michel, Omer Levy, Graham Neubig

Attention is a powerful and ubiquitous mechanism for allowing neural models to focus on particular salient pieces of information by taking their weighted average when making predictions.
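
The mechanism described in that sentence is scaled dot-product attention; a minimal single-head sketch (illustrative only; the paper's finding is that many such heads can be pruned at test time with little loss in quality):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(queries, keys, values):
    """Scaled dot-product attention: each output is a weighted average
    of the values, weighted by query-key similarity."""
    scores = queries @ keys.T / np.sqrt(keys.shape[-1])
    weights = softmax(scores)
    return weights @ values, weights

q = np.array([[1.0, 0.0]])
k = np.array([[1.0, 0.0], [0.0, 1.0]])
v = np.array([[10.0, 0.0], [0.0, 10.0]])
out, w = attention(q, k, v)        # rows of w sum to 1: a proper weighted average
```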

SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems

3 code implementations · NeurIPS 2019 · Alex Wang, Yada Pruksachatkun, Nikita Nangia, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman

In the last year, new models and methods for pretraining and transfer learning have driven striking performance improvements across a range of language understanding tasks.

Transfer Learning

pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference

3 code implementations · NAACL 2019 · Mandar Joshi, Eunsol Choi, Omer Levy, Daniel S. Weld, Luke Zettlemoyer

Reasoning about implied relationships (e.g., paraphrastic, common sense, encyclopedic) between pairs of words is crucial for many cross-sentence inference problems.

Common Sense Reasoning · Word Embeddings

code2seq: Generating Sequences from Structured Representations of Code

6 code implementations · ICLR 2019 · Uri Alon, Shaked Brody, Omer Levy, Eran Yahav

The ability to generate natural language sequences from source code snippets has a variety of applications such as code summarization, documentation, and retrieval.

Code Summarization · Source Code Summarization · +1

Ultra-Fine Entity Typing

no code implementations · ACL 2018 · Eunsol Choi, Omer Levy, Yejin Choi, Luke Zettlemoyer

We introduce a new entity typing task: given a sentence with an entity mention, the goal is to predict a set of free-form phrases (e.g., skyscraper, songwriter, or criminal) that describe appropriate types for the target entity.

Entity Linking · Entity Typing

LSTMs Exploit Linguistic Attributes of Data

no code implementations · WS 2018 · Nelson F. Liu, Omer Levy, Roy Schwartz, Chenhao Tan, Noah A. Smith

While recurrent neural networks have found success in a variety of natural language processing applications, they are general models of sequential data.

Jointly Predicting Predicates and Arguments in Neural Semantic Role Labeling

1 code implementation · ACL 2018 · Luheng He, Kenton Lee, Omer Levy, Luke Zettlemoyer

Recent BIO-tagging-based neural semantic role labeling models are very high performing, but assume gold predicates as part of the input and cannot incorporate span-level features.

Semantic Role Labeling

Deep RNNs Encode Soft Hierarchical Syntax

no code implementations · ACL 2018 · Terra Blevins, Omer Levy, Luke Zettlemoyer

We present a set of experiments to demonstrate that deep recurrent neural networks (RNNs) learn internal representations that capture soft hierarchical notions of syntax from highly varied supervision.

Dependency Parsing · Language Modelling · +3

Long Short-Term Memory as a Dynamically Computed Element-wise Weighted Sum

no code implementations · ACL 2018 · Omer Levy, Kenton Lee, Nicholas FitzGerald, Luke Zettlemoyer

LSTMs were introduced to combat vanishing gradients in simple RNNs by augmenting them with gated additive recurrent connections.

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding

8 code implementations · WS 2018 · Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman

For natural language understanding (NLU) technology to be maximally useful, both practically and as a scientific object of study, it must be general: it must be able to process language in a way that is not exclusively tailored to any one specific task or dataset.

Natural Language Inference · Natural Language Understanding · +1

code2vec: Learning Distributed Representations of Code

9 code implementations · 26 Mar 2018 · Uri Alon, Meital Zilberstein, Omer Levy, Eran Yahav

We demonstrate the effectiveness of our approach by using it to predict a method's name from the vector representation of its body.

A General Path-Based Representation for Predicting Program Properties

3 code implementations · 26 Mar 2018 · Uri Alon, Meital Zilberstein, Omer Levy, Eran Yahav

A major challenge when learning from programs is $\textit{how to represent programs in a way that facilitates effective learning}$.

Annotation Artifacts in Natural Language Inference Data

no code implementations · NAACL 2018 · Suchin Gururangan, Swabha Swayamdipta, Omer Levy, Roy Schwartz, Samuel R. Bowman, Noah A. Smith

Large-scale datasets for natural language inference are created by presenting crowd workers with a sentence (premise), and asking them to generate three new sentences (hypotheses) that it entails, contradicts, or is logically neutral with respect to.

Natural Language Inference · Text Categorization

Simulating Action Dynamics with Neural Process Networks

no code implementations · ICLR 2018 · Antoine Bosselut, Omer Levy, Ari Holtzman, Corin Ennis, Dieter Fox, Yejin Choi

Understanding procedural language requires anticipating the causal effects of actions, even when they are not explicitly stated.

Zero-Shot Relation Extraction via Reading Comprehension

2 code implementations · CONLL 2017 · Omer Levy, Minjoon Seo, Eunsol Choi, Luke Zettlemoyer

We show that relation extraction can be reduced to answering simple reading comprehension questions, by associating one or more natural-language questions with each relation slot.

Reading Comprehension · Relation Extraction · +2

Recurrent Additive Networks

2 code implementations · 21 May 2017 · Kenton Lee, Omer Levy, Luke Zettlemoyer

We introduce recurrent additive networks (RANs), a new gated RNN which is distinguished by the use of purely additive latent state updates.

Language Modelling
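
The "purely additive latent state update" can be written down directly. A minimal single-step sketch: the weight names are ours, and as a simplification the gates here condition on the previous state rather than a separate output, so this is the flavor of the update, not the paper's exact formulation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ran_step(x, c_prev, Wx, Wi, Ui, Wf, Uf):
    """One recurrent additive network (RAN) step: no tanh candidate;
    the state is a gated sum of a linear input projection and the
    previous state."""
    content = Wx @ x                          # linear content layer
    i = sigmoid(Wi @ x + Ui @ c_prev)         # input gate
    f = sigmoid(Wf @ x + Uf @ c_prev)         # forget gate
    c = i * content + f * c_prev              # purely additive state update
    h = np.tanh(c)                            # output non-linearity
    return h, c

d = 3
rng = np.random.default_rng(0)
Wx, Wi, Ui, Wf, Uf = (rng.standard_normal((d, d)) * 0.1 for _ in range(5))
h, c = ran_step(rng.standard_normal(d), np.zeros(d), Wx, Wi, Ui, Wf, Uf)
```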

Modeling Extractive Sentence Intersection via Subtree Entailment

no code implementations · COLING 2016 · Omer Levy, Ido Dagan, Gabriel Stanovsky, Judith Eckle-Kohler, Iryna Gurevych

Sentence intersection captures the semantic overlap of two texts, generalizing over paradigms such as textual entailment and semantic text similarity.

Abstractive Text Summarization · Natural Language Inference · +1

A Strong Baseline for Learning Cross-Lingual Word Embeddings from Sentence Alignments

no code implementations · EACL 2017 · Omer Levy, Anders Søgaard, Yoav Goldberg

While cross-lingual word embeddings have been studied extensively in recent years, the qualitative differences between the different algorithms remain vague.

Cross-Lingual Word Embeddings · Word Embeddings

Improving Distributional Similarity with Lessons Learned from Word Embeddings

no code implementations · TACL 2015 · Omer Levy, Yoav Goldberg, Ido Dagan

Recent trends suggest that neural-network-inspired word embedding models outperform traditional count-based distributional models on word similarity and analogy detection tasks.

Word Embeddings · Word Similarity

Neural Word Embedding as Implicit Matrix Factorization

no code implementations · NeurIPS 2014 · Omer Levy, Yoav Goldberg

We analyze skip-gram with negative-sampling (SGNS), a word embedding method introduced by Mikolov et al., and show that it is implicitly factorizing a word-context matrix, whose cells are the pointwise mutual information (PMI) of the respective word and context pairs, shifted by a global constant.

Word Similarity
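
That factorization target is easy to materialize for a toy corpus. A sketch that builds the shifted-PMI matrix PMI(w, c) - log k from a word-context co-occurrence count matrix, where k is the number of negative samples; the function name is ours.

```python
import numpy as np

def shifted_pmi(counts, k=5):
    """The matrix SGNS implicitly factorizes: pointwise mutual
    information of each word-context pair, shifted by log k."""
    total = counts.sum()
    p_w = counts.sum(axis=1, keepdims=True) / total   # marginal P(w)
    p_c = counts.sum(axis=0, keepdims=True) / total   # marginal P(c)
    p_wc = counts / total                             # joint P(w, c)
    with np.errstate(divide="ignore"):                # unseen pairs -> -inf
        pmi = np.log(p_wc / (p_w * p_c))
    return pmi - np.log(k)

counts = np.array([[10.0, 0.0],
                   [0.0, 10.0]])                      # toy co-occurrence counts
M = shifted_pmi(counts, k=5)
```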

word2vec Explained: deriving Mikolov et al.'s negative-sampling word-embedding method

5 code implementations · 15 Feb 2014 · Yoav Goldberg, Omer Levy

The word2vec software of Tomas Mikolov and colleagues (https://code.google.com/p/word2vec/) has gained a lot of traction lately, and provides state-of-the-art word embeddings.

Language Modelling · Word Embeddings
