Search Results for author: Yoav Goldberg

Found 168 papers, 85 papers with code

It’s not Greek to mBERT: Inducing Word-Level Translations from Multilingual BERT

1 code implementation EMNLP (BlackboxNLP) 2020 Hila Gonen, Shauli Ravfogel, Yanai Elazar, Yoav Goldberg

Recent works have demonstrated that multilingual BERT (mBERT) learns rich cross-lingual representations that allow for transfer across languages.


Leveraging Collection-Wide Similarities for Unsupervised Document Structure Extraction

no code implementations 21 Feb 2024 Gili Lior, Yoav Goldberg, Gabriel Stanovsky

Document collections of various domains, e.g., legal, medical, or financial, often share some underlying collection-wide structure, which captures information that can aid both human users and structure-aware models.

Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models

1 code implementation 19 Feb 2024 Mosh Levy, Alon Jacoby, Yoav Goldberg

This paper explores the impact of extending input lengths on the capabilities of Large Language Models (LLMs).

What Changed? Converting Representational Interventions to Natural Language

no code implementations 17 Feb 2024 Matan Avitan, Ryan Cotterell, Yoav Goldberg, Shauli Ravfogel

Interventions targeting the representation space of language models (LMs) have emerged as effective means to influence model behavior.


NERetrieve: Dataset for Next Generation Named Entity Recognition and Retrieval

1 code implementation 22 Oct 2023 Uri Katz, Matan Vetzler, Amir DN Cohen, Yoav Goldberg

The third, and most challenging, is the move from the recognition setup to a novel retrieval setup, where the query is a zero-shot entity type, and the expected result is all the sentences from a large, pre-indexed corpus that contain entities of these types, and their corresponding spans.

Named Entity Recognition +3

Hierarchy Builder: Organizing Textual Spans into a Hierarchy to Facilitate Navigation

no code implementations 18 Sep 2023 Itay Yair, Hillel Taub-Tabib, Yoav Goldberg

Information extraction systems often produce hundreds to thousands of strings on a specific topic.

Unsupervised Mapping of Arguments of Deverbal Nouns to Their Corresponding Verbal Labels

no code implementations 24 Jun 2023 Aviv Weinstein, Yoav Goldberg

Deverbal nouns are nominal forms of verbs commonly used in written English texts to describe events or actions, as well as their arguments.

Linguistic Binding in Diffusion Models: Enhancing Attribute Correspondence through Attention Map Alignment

1 code implementation NeurIPS 2023 Royi Rassin, Eran Hirsch, Daniel Glickman, Shauli Ravfogel, Yoav Goldberg, Gal Chechik

This reflects an impaired mapping between linguistic binding of entities and modifiers in the prompt and visual binding of the corresponding elements in the generated image.

Attribute Sentence +1

Conjunct Resolution in the Face of Verbal Omissions

no code implementations 26 May 2023 Royi Rassin, Yoav Goldberg, Reut Tsarfaty

In this work we propose a conjunct resolution task that operates directly on the text and makes use of a split-and-rephrase paradigm in order to recover the missing elements in the coordination structure.

Missing Elements Sentence +1

ChatGPT and Simple Linguistic Inferences: Blind Spots and Blinds

no code implementations 24 May 2023 Victoria Basmov, Yoav Goldberg, Reut Tsarfaty

This paper sheds light on the limitations of ChatGPT's understanding capabilities, focusing on simple inference tasks that are typically easy for humans but appear to be challenging for the model.

Retrieving Texts based on Abstract Descriptions

no code implementations 21 May 2023 Shauli Ravfogel, Valentina Pyatkin, Amir DN Cohen, Avshalom Manevich, Yoav Goldberg

While instruction-tuned Large Language Models (LLMs) excel at extracting information from text, they are not suitable for locating texts conforming to a given description in a large document collection (semantic retrieval).

Language Modelling Large Language Model +2

Stop Uploading Test Data in Plain Text: Practical Strategies for Mitigating Data Contamination by Evaluation Benchmarks

1 code implementation 17 May 2023 Alon Jacovi, Avi Caciularu, Omer Goldman, Yoav Goldberg

Data contamination has become prevalent and challenging with the rise of models pretrained on large automatically-crawled corpora.


Conformal Nucleus Sampling

no code implementations 4 May 2023 Shauli Ravfogel, Yoav Goldberg, Jacob Goldberger

Language models generate text by successively sampling the next word.

Conformal Prediction
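The paper studies conformal calibration of nucleus (top-p) sampling; as background, the truncation step itself can be sketched in a few lines. The toy distribution below is invented for illustration and this is not the paper's calibration procedure:

```python
def nucleus(probs, p=0.9):
    """Return the smallest set of highest-probability tokens whose
    cumulative probability reaches p (the 'nucleus')."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    chosen, total = [], 0.0
    for tok, pr in ranked:
        chosen.append(tok)
        total += pr
        if total >= p:
            break
    return chosen

# Toy next-word distribution (made up for illustration).
probs = {"the": 0.5, "a": 0.3, "dog": 0.15, "zebra": 0.05}
print(nucleus(probs, p=0.9))  # smallest prefix covering >= 0.9 probability
```

At generation time one samples the next word from the renormalized nucleus rather than the full vocabulary; choosing p per-context is where conformal prediction enters in the paper.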

Neighboring Words Affect Human Interpretation of Saliency Explanations

1 code implementation 4 May 2023 Alon Jacovi, Hendrik Schuff, Heike Adel, Ngoc Thang Vu, Yoav Goldberg

Word-level saliency explanations ("heat maps over words") are often used to communicate feature-attribution in text-based models.

Training Large Scale Polynomial CNNs for E2E Inference over Homomorphic Encryption

no code implementations 26 Apr 2023 Moran Baruch, Nir Drucker, Gilad Ezov, Yoav Goldberg, Eyal Kushnir, Jenny Lerner, Omri Soceanu, Itamar Zimerman

Training large-scale CNNs that during inference can be run under Homomorphic Encryption (HE) is challenging due to the need to use only polynomial operations.

Privacy Preserving Transfer Learning

Two Kinds of Recall

no code implementations 19 Mar 2023 Yoav Goldberg

I demonstrate through experiments that while neural methods are indeed significantly better at d-recall, it is sometimes the case that pattern-based methods are still substantially better at e-recall.


At Your Fingertips: Extracting Piano Fingering Instructions from Videos

no code implementations 7 Mar 2023 Amit Moryossef, Yanai Elazar, Yoav Goldberg

Piano fingering -- knowing which finger to use to play each note in a musical piece -- is a hard and important skill to master when learning to play the piano.

Lexical Generalization Improves with Larger Models and Longer Training

1 code implementation 23 Oct 2022 Elron Bandel, Yoav Goldberg, Yanai Elazar

While fine-tuned language models perform well on many tasks, they were also shown to rely on superficial surface features such as lexical overlap.

Natural Language Inference Reading Comprehension

DALLE-2 is Seeing Double: Flaws in Word-to-Concept Mapping in Text2Image Models

no code implementations 19 Oct 2022 Royi Rassin, Shauli Ravfogel, Yoav Goldberg

We study the way DALLE-2 maps symbols (words) in the prompt to their references (entities or properties of entities in the generated image).

Log-linear Guardedness and its Implications

no code implementations 18 Oct 2022 Shauli Ravfogel, Yoav Goldberg, Ryan Cotterell

Methods for erasing human-interpretable concepts from neural representations that assume linearity have been found to be tractable and useful.

CIKQA: Learning Commonsense Inference with a Unified Knowledge-in-the-loop QA Paradigm

no code implementations 12 Oct 2022 Hongming Zhang, Yintong Huo, Yanai Elazar, Yangqiu Song, Yoav Goldberg, Dan Roth

We first align commonsense tasks with relevant knowledge from commonsense knowledge bases and ask humans to annotate whether the knowledge is enough or not.

Question Answering Task 2

Understanding Transformer Memorization Recall Through Idioms

1 code implementation 7 Oct 2022 Adi Haviv, Ido Cohen, Jacob Gidron, Roei Schuster, Yoav Goldberg, Mor Geva

In this work, we offer the first methodological framework for probing and characterizing recall of memorized sequences in transformer LMs.


F-coref: Fast, Accurate and Easy to Use Coreference Resolution

1 code implementation 9 Sep 2022 Shon Otmazgin, Arie Cattan, Yoav Goldberg

We introduce fastcoref, a Python package for fast, accurate, and easy-to-use English coreference resolution.


Measuring Causal Effects of Data Statistics on Language Model's `Factual' Predictions

no code implementations 28 Jul 2022 Yanai Elazar, Nora Kassner, Shauli Ravfogel, Amir Feder, Abhilasha Ravichander, Marius Mosbach, Yonatan Belinkov, Hinrich Schütze, Yoav Goldberg

Our causal framework and our results demonstrate the importance of studying datasets and the benefits of causality for understanding NLP models.

Rivendell: Project-Based Academic Search Engine

no code implementations 26 Jun 2022 Teddy Lazebnik, Hanna Weitman, Yoav Goldberg, Gal A. Kaminka

We posit that in searching for research papers, a combination of a life-time search engine with an explicitly-provided context (project) provides a solution to the concept drift problem.

LingMess: Linguistically Informed Multi Expert Scorers for Coreference Resolution

2 code implementations 25 May 2022 Shon Otmazgin, Arie Cattan, Yoav Goldberg

While coreference resolution typically involves various linguistic challenges, recent models are based on a single pairwise scorer for all types of pairs.


LM-Debugger: An Interactive Tool for Inspection and Intervention in Transformer-Based Language Models

1 code implementation 26 Apr 2022 Mor Geva, Avi Caciularu, Guy Dar, Paul Roit, Shoval Sadde, Micah Shlain, Bar Tamir, Yoav Goldberg

The opaque nature and unexplained behavior of transformer-based language models (LMs) have spurred a wide interest in interpreting their predictions.

Analyzing Gender Representation in Multilingual Models

1 code implementation RepL4NLP (ACL) 2022 Hila Gonen, Shauli Ravfogel, Yoav Goldberg

Multilingual language models were shown to allow for nontrivial transfer across scripts and languages.

Gender Classification

Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space

1 code implementation 28 Mar 2022 Mor Geva, Avi Caciularu, Kevin Ro Wang, Yoav Goldberg

Transformer-based language models (LMs) are at the core of modern NLP, but their internal prediction construction process is opaque and largely not understood.

Linear Adversarial Concept Erasure

2 code implementations 28 Jan 2022 Shauli Ravfogel, Michael Twiton, Yoav Goldberg, Ryan Cotterell

Modern neural models trained on textual data rely on pre-trained representations that emerge without direct supervision.

Kernelized Concept Erasure

1 code implementation 28 Jan 2022 Shauli Ravfogel, Francisco Vargas, Yoav Goldberg, Ryan Cotterell

One prominent approach for the identification of concepts in neural representations is searching for a linear subspace whose erasure prevents the prediction of the concept from the representations.

Human Interpretation of Saliency-based Explanation Over Text

1 code implementation 27 Jan 2022 Hendrik Schuff, Alon Jacovi, Heike Adel, Yoav Goldberg, Ngoc Thang Vu

In this work, we focus on this question through a study of saliency-based explanations over textual data.

CommonsenseQA 2.0: Exposing the Limits of AI through Gamification

no code implementations 14 Jan 2022 Alon Talmor, Ori Yoran, Ronan Le Bras, Chandra Bhagavatula, Yoav Goldberg, Yejin Choi, Jonathan Berant

Constructing benchmarks that test the abilities of modern natural language understanding models is difficult - pre-trained language models exploit artifacts in benchmarks to achieve human parity, but still fail on adversarial examples and make errors that demonstrate a lack of common sense.

Common Sense Reasoning Natural Language Understanding

Simple, Interpretable and Stable Method for Detecting Words with Usage Change across Corpora

1 code implementation ACL 2020 Hila Gonen, Ganesh Jawahar, Djamé Seddah, Yoav Goldberg

The problem of comparing two bodies of text and searching for words that differ in their usage between them arises often in digital humanities and computational social science.

Word Embeddings

Large Scale Substitution-based Word Sense Induction

no code implementations ACL 2022 Matan Eyal, Shoval Sadde, Hillel Taub-Tabib, Yoav Goldberg

We present a word-sense induction method based on pre-trained masked language models (MLMs), which can cheaply scale to large vocabularies and large corpora.

Outlier Detection Word Embeddings +1

Text-based NP Enrichment

1 code implementation 24 Sep 2021 Yanai Elazar, Victoria Basmov, Yoav Goldberg, Reut Tsarfaty

Understanding the relations between entities denoted by NPs in a text is a critical part of human-like natural language understanding.

Natural Language Understanding

Asking It All: Generating Contextualized Questions for any Semantic Role

1 code implementation EMNLP 2021 Valentina Pyatkin, Paul Roit, Julian Michael, Reut Tsarfaty, Yoav Goldberg, Ido Dagan

We develop a two-stage model for this task, which first produces a context-independent question prototype for each role and then revises it to be contextually appropriate for the passage.

Question Generation

BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models

5 code implementations ACL 2022 Elad Ben Zaken, Shauli Ravfogel, Yoav Goldberg

We introduce BitFit, a sparse-finetuning method where only the bias terms of the model (or a subset of them) are modified.

Language Modelling
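The selection rule behind BitFit is simple: keep only bias terms trainable and freeze everything else. A framework-agnostic sketch over a hypothetical parameter table (the module names and sizes below are illustrative, not BERT's actual layout):

```python
# Hypothetical parameter table for one tiny transformer layer; names follow
# the common "<module>.weight" / "<module>.bias" convention.
params = {
    "attn.query.weight": 768 * 768,  "attn.query.bias": 768,
    "attn.key.weight":   768 * 768,  "attn.key.bias":   768,
    "ffn.in.weight":     768 * 3072, "ffn.in.bias":     3072,
    "ffn.out.weight":    3072 * 768, "ffn.out.bias":    768,
}

# BitFit-style partition: only bias terms are updated during fine-tuning.
trainable = {n: sz for n, sz in params.items() if n.endswith(".bias")}
frozen    = {n: sz for n, sz in params.items() if not n.endswith(".bias")}

frac = sum(trainable.values()) / sum(params.values())
print(f"training {frac:.4%} of parameters")
```

In a real framework the same partition is expressed by toggling per-parameter gradient flags (e.g., `requires_grad` in PyTorch); the point of the sketch is how small the trainable fraction becomes.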

Thinking Like Transformers

4 code implementations 13 Jun 2021 Gail Weiss, Yoav Goldberg, Eran Yahav

In this paper we aim to change that, proposing a computational model for the transformer-encoder in the form of a programming language.
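The proposed language (RASP) builds programs from attention-like primitives. A loose Python sketch of its two core operations, a pairwise selector and an averaging aggregate, mimicking one attention head (simplified, and not the paper's actual syntax):

```python
def select(keys, queries, pred):
    """Selector: for each query position, mark which key positions it
    attends to, via a boolean predicate over (key, query) pairs."""
    return [[pred(k, q) for k in keys] for q in queries]

def aggregate(sel_matrix, values):
    """Aggregate: average the values at the selected positions
    (uniform attention over the selected set)."""
    out = []
    for row in sel_matrix:
        picked = [v for s, v in zip(row, values) if s]
        out.append(sum(picked) / len(picked) if picked else 0)
    return out

tokens = [1, 0, 1, 1]
always = select(tokens, tokens, lambda k, q: True)  # attend everywhere
frac_ones = aggregate(always, tokens)  # fraction of 1s, at every position
```

Composing such selectors and aggregates yields programs that describe what a transformer-encoder can compute, which is the paper's central device.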

Neural Extractive Search

no code implementations ACL 2021 Shauli Ravfogel, Hillel Taub-Tabib, Yoav Goldberg

We advocate for a search paradigm called "extractive search", in which a search query is enriched with capture-slots, to allow for such rapid extraction.


Data Augmentation for Sign Language Gloss Translation

no code implementations MTSummit 2021 Amit Moryossef, Kayo Yin, Graham Neubig, Yoav Goldberg

Sign language translation (SLT) is often decomposed into video-to-gloss recognition and gloss-to-text translation, where a gloss is a sequence of transcribed spoken-language words in the order in which they are signed.

Data Augmentation Low-Resource Neural Machine Translation +3

Including Signed Languages in Natural Language Processing

no code implementations ACL 2021 Kayo Yin, Amit Moryossef, Julie Hochgesang, Yoav Goldberg, Malihe Alikhani

Signed languages are the primary means of communication for many deaf and hard of hearing individuals.

Provable Limitations of Acquiring Meaning from Ungrounded Form: What Will Future Language Models Understand?

no code implementations 22 Apr 2021 William Merrill, Yoav Goldberg, Roy Schwartz, Noah A. Smith

We study whether assertions enable a system to emulate representations preserving semantic relations like equivalence.

Back to Square One: Artifact Detection, Training and Commonsense Disentanglement in the Winograd Schema

no code implementations EMNLP 2021 Yanai Elazar, Hongming Zhang, Yoav Goldberg, Dan Roth

To support this claim, we first show that the current evaluation method of WS is sub-optimal and propose a modification that uses twin sentences for evaluation.

Bias Detection Disentanglement +1

Does BERT Pretrained on Clinical Notes Reveal Sensitive Data?

4 code implementations NAACL 2021 Eric Lehman, Sarthak Jain, Karl Pichotta, Yoav Goldberg, Byron C. Wallace

The cost of training such models (and the necessity of data access to do so) coupled with their utility motivates parameter sharing, i.e., the release of pretrained models such as ClinicalBERT.

Contrastive Explanations for Model Interpretability

1 code implementation EMNLP 2021 Alon Jacovi, Swabha Swayamdipta, Shauli Ravfogel, Yanai Elazar, Yejin Choi, Yoav Goldberg

Our method is based on projecting model representation to a latent space that captures only the features that are useful (to the model) to differentiate two potential decisions.

Text Classification

Measuring and Improving Consistency in Pretrained Language Models

1 code implementation 1 Feb 2021 Yanai Elazar, Nora Kassner, Shauli Ravfogel, Abhilasha Ravichander, Eduard Hovy, Hinrich Schütze, Yoav Goldberg

In this paper we study the question: Are Pretrained Language Models (PLMs) consistent with respect to factual knowledge?

A simple geometric proof for the benefit of depth in ReLU networks

no code implementations 18 Jan 2021 Asaf Amrami, Yoav Goldberg

We present a simple proof for the benefit of depth in multi-layer feedforward network with rectified activation ("depth separation").

Facts2Story: Controlling Text Generation by Key Facts

1 code implementation COLING 2020 Eyal Orbach, Yoav Goldberg

Recent advancements in self-attention neural network architectures have raised the bar for open-ended text generation.

Text Generation

Compressing Pre-trained Language Models by Matrix Decomposition

no code implementations Asian Chapter of the Association for Computational Linguistics 2020 Matan Ben Noach, Yoav Goldberg

Large pre-trained language models reach state-of-the-art results on many different NLP tasks when fine-tuned individually; they also come with significant memory and computational requirements, calling for methods to reduce model sizes (green AI).

Model Compression

Effects of Parameter Norm Growth During Transformer Training: Inductive Bias from Gradient Descent

1 code implementation EMNLP 2021 William Merrill, Vivek Ramanujan, Yoav Goldberg, Roy Schwartz, Noah Smith

To better understand this bias, we study the tendency for transformer parameters to grow in magnitude ($\ell_2$ norm) during training, and its implications for the emergent representations within self-attention layers.

Inductive Bias

It's not Greek to mBERT: Inducing Word-Level Translations from Multilingual BERT

1 code implementation 16 Oct 2020 Hila Gonen, Shauli Ravfogel, Yanai Elazar, Yoav Goldberg

Recent works have demonstrated that multilingual BERT (mBERT) learns rich cross-lingual representations that allow for transfer across languages.


Relation Classification as Two-way Span-Prediction

no code implementations 9 Oct 2020 Amir DN Cohen, Shachar Rosenman, Yoav Goldberg

The current supervised relation classification (RC) task uses a single embedding to represent the relation between a pair of entities.

Classification General Classification +4

Exposing Shallow Heuristics of Relation Extraction Models with Challenge Data

1 code implementation EMNLP 2020 Shachar Rosenman, Alon Jacovi, Yoav Goldberg

The process of collecting and annotating training data may introduce distribution artifacts which may limit the ability of models to learn correct generalization behavior.

Attribute Question Answering +2

Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge

1 code implementation NeurIPS 2020 Alon Talmor, Oyvind Tafjord, Peter Clark, Yoav Goldberg, Jonathan Berant

In this work, we provide a first demonstration that LMs can be trained to reliably perform systematic reasoning combining both implicit, pre-trained knowledge and explicit natural language statements.

World Knowledge

Interactive Extractive Search over Biomedical Corpora

no code implementations WS 2020 Hillel Taub-Tabib, Micah Shlain, Shoval Sadde, Dan Lahav, Matan Eyal, Yaara Cohen, Yoav Goldberg

We present a system that allows life-science researchers to search a linguistically annotated corpus of scientific texts using patterns over dependency graphs, as well as using patterns over token sequences and a powerful variant of boolean keyword queries.

Retrieval Sentence

Amnesic Probing: Behavioral Explanation with Amnesic Counterfactuals

no code implementations 1 Jun 2020 Yanai Elazar, Shauli Ravfogel, Alon Jacovi, Yoav Goldberg

In this work, we point out the inability to infer behavioral conclusions from probing results and offer an alternative method that focuses on how the information is being used, rather than on what information is encoded.

Aligning Faithful Interpretations with their Social Attribution

1 code implementation 1 Jun 2020 Alon Jacovi, Yoav Goldberg

We find that the requirement of model interpretations to be faithful is vague and incomplete.

pyBART: Evidence-based Syntactic Transformations for IE

1 code implementation ACL 2020 Aryeh Tiktinsky, Yoav Goldberg, Reut Tsarfaty

We present pyBART, an easy-to-use open-source Python library for converting English UD trees either to Enhanced UD graphs or to our representation.

Relation Extraction

A Two-Stage Masked LM Method for Term Set Expansion

1 code implementation ACL 2020 Guy Kushilevitz, Shaul Markovitch, Yoav Goldberg

We tackle the task of Term Set Expansion (TSE): given a small seed set of example terms from a semantic class, finding more members of that class.

Vocal Bursts Valence Prediction

A Formal Hierarchy of RNN Architectures

no code implementations ACL 2020 William Merrill, Gail Weiss, Yoav Goldberg, Roy Schwartz, Noah A. Smith, Eran Yahav

While formally extending these findings to unsaturated RNNs is left to future work, we hypothesize that the practical learnable capacity of unsaturated RNNs obeys a similar hierarchy.

Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection

2 code implementations ACL 2020 Shauli Ravfogel, Yanai Elazar, Hila Gonen, Michael Twiton, Yoav Goldberg

The ability to control for the kinds of information encoded in neural representation has a variety of use cases, especially in light of the challenge of interpreting these models.

Fairness Multi-class Classification +1
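The nullspace-projection idea can be sketched with a toy linear probe. Here a class-mean-difference direction stands in for the trained classifier (the actual INLP method iterates this step with freshly trained classifiers, and the data below is invented):

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def unit(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def probe_direction(X, z):
    """Class-mean difference: a toy stand-in for the linear classifier
    trained at each nullspace-projection step."""
    pos = [x for x, zi in zip(X, z) if zi == 1]
    neg = [x for x, zi in zip(X, z) if zi == 0]
    m_pos = [sum(c) / len(pos) for c in zip(*pos)]
    m_neg = [sum(c) / len(neg) for c in zip(*neg)]
    return unit([a - b for a, b in zip(m_pos, m_neg)])

def project_out(X, w):
    """Project every row of X onto the nullspace of the unit vector w,
    i.e., remove the component of each row along w."""
    return [[xi - dot(x, w) * wi for xi, wi in zip(x, w)] for x in X]

# Toy representations where the protected attribute z sits mostly in dim 0.
X = [[1.0, 0.2, 0.1], [0.9, -0.1, 0.3], [-1.1, 0.4, 0.2], [-0.8, 0.0, -0.2]]
z = [1, 1, 0, 0]

w = probe_direction(X, z)
X_clean = project_out(X, w)  # this probe can no longer predict z from X_clean
```

After projection every row is orthogonal to the probe direction, so that particular linear predictor of the protected attribute is guarded against; iterating with retrained probes removes the attribute's whole linear subspace.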

Towards Faithfully Interpretable NLP Systems: How should we define and evaluate faithfulness?

no code implementations ACL 2020 Alon Jacovi, Yoav Goldberg

With the growing popularity of deep-learning based NLP models, comes a need for interpretable systems.

Unsupervised Domain Clusters in Pretrained Language Models

1 code implementation ACL 2020 Roee Aharoni, Yoav Goldberg

The notion of "in-domain data" in NLP is often over-simplistic and vague, as textual data varies in many nuanced linguistic aspects such as topic, style or level of formality.

Machine Translation Sentence +1

oLMpics -- On what Language Model Pre-training Captures

1 code implementation 31 Dec 2019 Alon Talmor, Yanai Elazar, Yoav Goldberg, Jonathan Berant

A fundamental challenge is to understand whether the performance of a LM on a task should be attributed to the pre-trained representations or to the process of fine-tuning on the task data.

Language Modelling

Learning Deterministic Weighted Automata with Queries and Counterexamples

1 code implementation NeurIPS 2019 Gail Weiss, Yoav Goldberg, Eran Yahav

We present an algorithm for extraction of a probabilistic deterministic finite automaton (PDFA) from a given black-box language model, such as a recurrent neural network (RNN).

Language Modelling

Scalable Evaluation and Improvement of Document Set Expansion via Neural Positive-Unlabeled Learning

1 code implementation EACL 2021 Alon Jacovi, Gang Niu, Yoav Goldberg, Masashi Sugiyama

We consider the situation in which a user has collected a small set of documents on a cohesive topic, and they want to retrieve additional documents on this topic from a large collection.

Information Retrieval Retrieval

Improving Quality and Efficiency in Plan-based Neural Data-to-Text Generation

1 code implementation WS 2019 Amit Moryossef, Ido Dagan, Yoav Goldberg

We follow the step-by-step approach to neural data-to-text generation we proposed in Moryossef et al. (2019), in which the generation process is divided into a text-planning stage followed by a plan-realization stage.

Data-to-Text Generation Referring Expression +1

Transfer Learning Between Related Tasks Using Expected Label Proportions

1 code implementation IJCNLP 2019 Matan Ben Noach, Yoav Goldberg

We propose a novel application of the XR framework for transfer learning between related tasks, where knowing the labels of task A provides an estimation of the label proportion of task B.

Sentence Sentiment Analysis +2

Ab Antiquo: Neural Proto-language Reconstruction

2 code implementations NAACL 2021 Carlo Meloni, Shauli Ravfogel, Yoav Goldberg

Historical linguists have identified regularities in the process of historic sound change.

Filling Gender & Number Gaps in Neural Machine Translation with Black-box Context Injection

no code implementations WS 2019 Amit Moryossef, Roee Aharoni, Yoav Goldberg

When translating from a language that does not morphologically mark information such as gender and number into a language that does, translation systems must "guess" this missing information, often leading to incorrect translations in the given context.

Machine Translation Test +1

Aligning Vector-spaces with Noisy Supervised Lexicon

1 code implementation NAACL 2019 Noa Yehezkel Lubin, Jacob Goldberger, Yoav Goldberg

The algorithm jointly learns the noise level in the lexicon, finds the set of noisy pairs, and learns the mapping between the spaces.


Towards better substitution-based word sense induction

2 code implementations 29 May 2019 Asaf Amrami, Yoav Goldberg

Word sense induction (WSI) is the task of unsupervised clustering of word usages within a sentence to distinguish senses.

Clustering Sentence +1

Where's My Head? Definition, Dataset and Models for Numeric Fused-Heads Identification and Resolution

1 code implementation 26 May 2019 Yanai Elazar, Yoav Goldberg

We provide the first computational treatment of fused-heads constructions (FH), focusing on the numeric fused-heads (NFH).

Missing Elements Sentence

Towards Neural Decompilation

no code implementations 20 May 2019 Omer Katz, Yuval Olshaker, Yoav Goldberg, Eran Yahav

We address the problem of automatic decompilation, converting a program in low-level representation back to a higher-level human-readable programming language.

C++ code Machine Translation +1

Step-by-Step: Separating Planning from Realization in Neural Data-to-Text Generation

1 code implementation NAACL 2019 Amit Moryossef, Yoav Goldberg, Ido Dagan

We propose to split the generation process into a symbolic text-planning stage that is faithful to the input, followed by a neural generation stage that focuses only on realization.

Data-to-Text Generation Graph-to-Sequence

Aligning Vector-spaces with Noisy Supervised Lexicons

1 code implementation 25 Mar 2019 Noa Yehezkel Lubin, Jacob Goldberger, Yoav Goldberg

The algorithm jointly learns the noise level in the lexicon, finds the set of noisy pairs, and learns the mapping between the spaces.


Studying the Inductive Biases of RNNs with Synthetic Variations of Natural Languages

2 code implementations NAACL 2019 Shauli Ravfogel, Yoav Goldberg, Tal Linzen

How do typological properties such as word order and morphological case marking affect the ability of neural sequence models to acquire the syntax of a language?


Filling Gender & Number Gaps in Neural Machine Translation with Black-box Context Injection

no code implementations 8 Mar 2019 Amit Moryossef, Roee Aharoni, Yoav Goldberg

When translating from a language that does not morphologically mark information such as gender and number into a language that does, translation systems must "guess" this missing information, often leading to incorrect translations in the given context.

Machine Translation Test +1

Where's My Head? Definition, Data Set, and Models for Numeric Fused-Head Identification and Resolution

no code implementations TACL 2019 Yanai Elazar, Yoav Goldberg

We provide the first computational treatment of fused-heads constructions (FHs), focusing on the numeric fused-heads (NFHs).


A Little Is Enough: Circumventing Defenses For Distributed Learning

4 code implementations NeurIPS 2019 Moran Baruch, Gilad Baruch, Yoav Goldberg

We show that 20% of corrupt workers are sufficient to degrade a CIFAR10 model's accuracy by 50%, as well as to introduce backdoors into MNIST and CIFAR10 models without hurting their accuracy.

Assessing BERT's Syntactic Abilities

2 code implementations 16 Jan 2019 Yoav Goldberg

I assess the extent to which the recently introduced BERT model captures English syntactic phenomena, using (1) naturally-occurring subject-verb agreement stimuli; (2) "colorless green ideas" subject-verb agreement stimuli, in which content words in natural sentences are randomly replaced with words sharing the same part-of-speech and inflection; and (3) manually crafted stimuli for subject-verb agreement and reflexive anaphora phenomena.

Can LSTM Learn to Capture Agreement? The Case of Basque

no code implementations WS 2018 Shauli Ravfogel, Francis M. Tyers, Yoav Goldberg

We propose the Basque agreement prediction task as challenging benchmark for models that attempt to learn regularities in human language.


Word Sense Induction with Neural biLM and Symmetric Patterns

1 code implementation EMNLP 2018 Asaf Amrami, Yoav Goldberg

An established method for Word Sense Induction (WSI) uses a language model to predict probable substitutes for target words, and induces senses by clustering these resulting substitute vectors.

Clustering Word Sense Induction
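The pipeline this entry describes (substitutes predicted by a language model, then clustering into senses) can be mimicked with hand-made substitute sets and a greedy overlap clustering. The sentences and substitutes below are invented for illustration, not real model output:

```python
# Each usage of the ambiguous word "bass" is represented by the substitutes a
# language model might propose for it in that sentence (hand-made here).
usages = {
    "He caught a bass in the lake":     {"fish", "trout", "salmon"},
    "The bass was grilled with lemon":  {"fish", "salmon", "fillet"},
    "Turn up the bass on the speakers": {"volume", "treble", "sound"},
    "The bass line drives the song":    {"rhythm", "drum", "sound"},
}

def jaccard(a, b):
    return len(a & b) / len(a | b)

def induce_senses(usages, threshold=0.2):
    """Greedy clustering of usages by substitute overlap: each usage joins
    the first cluster whose representative set it resembles, otherwise it
    starts a new sense."""
    clusters = []  # list of (representative substitute set, [sentences])
    for sent, subs in usages.items():
        for rep, members in clusters:
            if jaccard(subs, rep) >= threshold:
                rep |= subs          # grow the sense's substitute vocabulary
                members.append(sent)
                break
        else:
            clusters.append((set(subs), [sent]))
    return [members for _, members in clusters]

senses = induce_senses(usages)  # fish-sense vs. audio-sense
```

Real systems represent each usage as a vector over substitutes and use stronger clustering, but the induced structure is the same: usages whose substitutes overlap end up in the same sense.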

Adversarial Removal of Demographic Attributes from Text Data

1 code implementation EMNLP 2018 Yanai Elazar, Yoav Goldberg

Recent advances in Representation Learning and Adversarial Training seem to succeed in removing unwanted features from the learned representation.

Representation Learning

Term Set Expansion based on Multi-Context Term Embeddings: an End-to-end Workflow

no code implementations 26 Jul 2018 Jonathan Mamou, Oren Pereg, Moshe Wasserblat, Ido Dagan, Yoav Goldberg, Alon Eirew, Yael Green, Shira Guskin, Peter Izsak, Daniel Korat

We present SetExpander, a corpus-based system for expanding a seed set of terms into a more complete set of terms that belong to the same semantic class.

Split and Rephrase: Better Evaluation and Stronger Baselines

1 code implementation ACL 2018 Roee Aharoni, Yoav Goldberg

To aid this, we present a new train-development-test data split and neural models augmented with a copy-mechanism, outperforming the best reported baseline by 8.68 BLEU and fostering further progress on the task.

Machine Translation Memorization +3

On the Practical Computational Power of Finite Precision RNNs for Language Recognition

1 code implementation ACL 2018 Gail Weiss, Yoav Goldberg, Eran Yahav

While Recurrent Neural Networks (RNNs) are famously known to be Turing complete, this relies on infinite precision in the states and unbounded computation time.

Breaking NLI Systems with Sentences that Require Simple Lexical Inferences

2 code implementations ACL 2018 Max Glockner, Vered Shwartz, Yoav Goldberg

We create a new NLI test set that shows the deficiency of state-of-the-art models in inferences that require lexical and world knowledge.

Test World Knowledge

Split and Rephrase: Better Evaluation and a Stronger Baseline

2 code implementations 2 May 2018 Roee Aharoni, Yoav Goldberg

To aid this, we present a new train-development-test data split and neural models augmented with a copy-mechanism, outperforming the best reported baseline by 8.68 BLEU and fostering further progress on the task.

Memorization Sentence +2

LaVAN: Localized and Visible Adversarial Noise

1 code implementation ICML 2018 Danny Karmon, Daniel Zoran, Yoav Goldberg

Most works on adversarial examples for deep-learning based image classifiers use noise that, while small, covers the entire image.


Controlling Linguistic Style Aspects in Neural Language Generation

no code implementations WS 2017 Jessica Ficler, Yoav Goldberg

Most work on neural natural language generation (NNLG) focus on controlling the content of the generated text.

Language Modelling Text Generation

Exploring the Syntactic Abilities of RNNs with Multi-task Learning

1 code implementation CONLL 2017 Emile Enguehard, Yoav Goldberg, Tal Linzen

Recent work has explored the syntactic abilities of RNNs using the subject-verb agreement task, which diagnoses sensitivity to sentence structure.

CCG Supertagging Language Modelling +3

Greedy Transition-Based Dependency Parsing with Stack LSTMs

no code implementations CL 2017 Miguel Ballesteros, Chris Dyer, Yoav Goldberg, Noah A. Smith

During training, dynamic oracles alternate between sampling parser states from the training data and from the model as it is being learned, making the model more robust to the kinds of errors that will be made at test time.

Test Transition-Based Dependency Parsing

On-the-fly Operation Batching in Dynamic Computation Graphs

2 code implementations NeurIPS 2017 Graham Neubig, Yoav Goldberg, Chris Dyer

Dynamic neural network toolkits such as PyTorch, DyNet, and Chainer offer more flexibility for implementing models that cope with data of varying dimensions and structure, relative to toolkits that operate on statically declared computations (e.g., TensorFlow, CNTK, and Theano).

Towards String-to-Tree Neural Machine Translation

no code implementations ACL 2017 Roee Aharoni, Yoav Goldberg

We present a simple method to incorporate syntactic information about the target language in a neural machine translation system by translating into linearized, lexicalized constituency trees.

Machine Translation NMT +1

The Interplay of Semantics and Morphology in Word Embeddings

1 code implementation EACL 2017 Oded Avraham, Yoav Goldberg

We explore the ability of word embeddings to capture both semantic and morphological similarity, as affected by the different types of linguistic properties (surface form, lemma, morphological tag) used to compose the representation of each word.


Improving a Strong Neural Parser with Conjunction-Specific Features

no code implementations EACL 2017 Jessica Ficler, Yoav Goldberg

While dependency parsers reach very high overall accuracy, some dependency relations are much harder than others.

Dependency Parsing

DyNet: The Dynamic Neural Network Toolkit

4 code implementations 15 Jan 2017 Graham Neubig, Chris Dyer, Yoav Goldberg, Austin Matthews, Waleed Ammar, Antonios Anastasopoulos, Miguel Ballesteros, David Chiang, Daniel Clothiaux, Trevor Cohn, Kevin Duh, Manaal Faruqui, Cynthia Gan, Dan Garrette, Yangfeng Ji, Lingpeng Kong, Adhiguna Kuncoro, Gaurav Kumar, Chaitanya Malaviya, Paul Michel, Yusuke Oda, Matthew Richardson, Naomi Saphra, Swabha Swayamdipta, Pengcheng Yin

In the static declaration strategy that is used in toolkits like Theano, CNTK, and TensorFlow, the user first defines a computation graph (a symbolic representation of the computation), and then examples are fed into an engine that executes this computation and computes its derivatives.
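To illustrate the dynamic-declaration alternative the abstract contrasts with static declaration, here is a minimal define-by-run autodiff sketch in plain Python (a toy illustration, not DyNet's actual API): the graph is built implicitly while ordinary code runs, so per-example structure and control flow come for free.

```python
class Var:
    """A scalar node in a computation graph built dynamically, as the
    forward code executes, rather than declared up front."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # (parent_var, local_gradient) pairs
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def backward(self, upstream=1.0):
        # Recursively push gradients along every path to the leaves
        # (fine for this small DAG; a real toolkit uses a topological order).
        self.grad += upstream
        for parent, local in self.parents:
            parent.backward(upstream * local)

x, y = Var(2.0), Var(3.0)
z = x * y + x            # the graph for this example is created right here
z.backward()             # dz/dx = y + 1 = 4, dz/dy = x = 2
```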

graph construction

Semi Supervised Preposition-Sense Disambiguation using Multilingual Data

no code implementations COLING 2016 Hila Gonen, Yoav Goldberg

Prepositions are very common and very ambiguous, and understanding their sense is critical for understanding the meaning of the sentence.

General Classification Sentence +1

Morphological Inflection Generation with Hard Monotonic Attention

1 code implementation ACL 2017 Roee Aharoni, Yoav Goldberg

We present a neural model for morphological inflection generation which employs a hard attention mechanism, inspired by the nearly-monotonic alignment commonly found between the characters in a word and the characters in its inflection.
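As a rough illustration of the step/write action interface behind hard monotonic attention (a hypothetical interpreter of an already-chosen action sequence, not the paper's trained model, which learns which actions to emit):

```python
def transduce(lemma, actions):
    """Replay ('write:<c>' / 'step') actions over the input.
    The attention pointer only ever moves right, so the alignment
    between lemma and inflection is hard and monotonic."""
    pointer, output = 0, []
    for action in actions:
        if action == "step":
            pointer = min(pointer + 1, len(lemma) - 1)  # advance attention
        else:                      # 'write:<c>': emit one output symbol
            output.append(action.split(":", 1)[1])
    return "".join(output)

# e.g. copying "walk" and appending the past-tense suffix "ed"
actions = ["write:w", "step", "write:a", "step", "write:l", "step",
           "write:k", "step", "write:e", "write:d"]
```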

Hard Attention Morphological Inflection

Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies

5 code implementations TACL 2016 Tal Linzen, Emmanuel Dupoux, Yoav Goldberg

The success of long short-term memory (LSTM) neural networks in language processing is typically attributed to their ability to capture long-distance statistical regularities.
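A minimal sketch of how one number-prediction instance for this diagnostic is constructed (field and function names are my own): the model sees only the words before the verb and must predict the verb's grammatical number. The singular attractor "cabinet" between the plural subject and the verb is what makes such examples hard.

```python
def agreement_instance(tokens, verb_index, verb_number):
    """Build one subject-verb agreement instance: the prefix up to
    (excluding) the verb, labeled with the verb's number."""
    return {"prefix": tokens[:verb_index], "label": verb_number}

sentence = "the keys to the cabinet are on the table".split()
inst = agreement_instance(sentence, verb_index=5, verb_number="plural")
```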

Language Modelling

A Neural Network for Coordination Boundary Prediction

no code implementations EMNLP 2016 Jessica Ficler, Yoav Goldberg

We propose a neural-network based model for coordination boundary prediction.

A Strong Baseline for Learning Cross-Lingual Word Embeddings from Sentence Alignments

no code implementations EACL 2017 Omer Levy, Anders Søgaard, Yoav Goldberg

While cross-lingual word embeddings have been studied extensively in recent years, the qualitative differences between the different algorithms remain vague.

Cross-Lingual Word Embeddings Sentence +1

Fine-grained Analysis of Sentence Embeddings Using Auxiliary Prediction Tasks

3 code implementations 15 Aug 2016 Yossi Adi, Einat Kermany, Yonatan Belinkov, Ofer Lavi, Yoav Goldberg

The analysis sheds light on the relative strengths of different sentence embedding methods with respect to these low level prediction tasks, and on the effect of the encoded vector's dimensionality on the resulting representations.
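A sketch of the paper's three auxiliary prediction tasks (sentence length, word content, and word order), shown here only as label extraction from raw tokens; a classifier is then trained to predict each label from the sentence embedding alone. Function and argument names are hypothetical.

```python
def probing_labels(tokens, probe_word, w1, w2):
    """Derive gold labels for the three low-level probing tasks."""
    return {
        "length": len(tokens),                          # sentence length
        "content": probe_word in tokens,                # word content
        "order": tokens.index(w1) < tokens.index(w2),   # word order
    }

labels = probing_labels("the cat sat on the mat".split(),
                        probe_word="cat", w1="cat", w2="mat")
```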

Sentence Sentence Embedding +1

Coordination Annotation Extension in the Penn Tree Bank

1 code implementation ACL 2016 Jessica Ficler, Yoav Goldberg

Coordination is an important and common syntactic construction which is not handled well by state-of-the-art parsers.

Improved Parsing for Argument-Clusters Coordination

no code implementations ACL 2016 Jessica Ficler, Yoav Goldberg

Syntactic parsers perform poorly in prediction of Argument-Cluster Coordination (ACC).

Multilingual Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Models and Auxiliary Loss

3 code implementations ACL 2016 Barbara Plank, Anders Søgaard, Yoav Goldberg

Bidirectional long short-term memory (bi-LSTM) networks have recently proven successful for various NLP sequence modeling tasks, but little is known about their reliance on input representations, target languages, data set size, and label noise.
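The auxiliary loss in this paper predicts a token's log frequency alongside its POS tag, with both output layers sharing the bi-LSTM. A hedged sketch of label construction only; the exact binning below is my assumption, and the shared network itself is omitted.

```python
import math
from collections import Counter

def freqbin_labels(sentences):
    """Auxiliary log-frequency-bin labels: rare words get bin 0,
    more frequent words higher bins (one plausible binning)."""
    counts = Counter(w for sent in sentences for w in sent)
    return {w: int(math.log(c, 2)) if c > 1 else 0
            for w, c in counts.items()}

bins = freqbin_labels([["the", "cat"], ["the", "dog"]])
```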

Part-Of-Speech Tagging POS +1

Improving sentence compression by learning to predict gaze

no code implementations NAACL 2016 Sigrid Klerke, Yoav Goldberg, Anders Søgaard

We show how eye-tracking corpora can be used to improve sentence compression models, presenting a novel multi-task learning algorithm based on multi-layer LSTMs.

Multi-Task Learning Sentence +1

Improving Hypernymy Detection with an Integrated Path-based and Distributional Method

1 code implementation ACL 2016 Vered Shwartz, Yoav Goldberg, Ido Dagan

Detecting hypernymy relations is a key task in NLP, which is addressed in the literature using two complementary approaches.

Training with Exploration Improves a Greedy Stack-LSTM Parser

no code implementations 11 Mar 2016 Miguel Ballesteros, Yoav Goldberg, Chris Dyer, Noah A. Smith

We adapt the greedy Stack-LSTM dependency parser of Dyer et al. (2015) to support a training-with-exploration procedure using dynamic oracles (Goldberg and Nivre, 2013) instead of cross-entropy minimization.

Chinese Dependency Parsing Dependency Parsing

Getting More Out Of Syntax with PropS

no code implementations 4 Mar 2016 Gabriel Stanovsky, Jessica Ficler, Ido Dagan, Yoav Goldberg

Semantic NLP applications often rely on dependency trees to recognize major elements of the proposition structure of sentences.

Open Information Extraction

Easy-First Dependency Parsing with Hierarchical Tree LSTMs

no code implementations TACL 2016 Eliyahu Kiperwasser, Yoav Goldberg

We suggest a compositional vector representation of parse trees that relies on a recursive combination of recurrent-neural network encoders.

Dependency Parsing Word Embeddings

A Primer on Neural Network Models for Natural Language Processing

1 code implementation 2 Oct 2015 Yoav Goldberg

Over the past few years, neural networks have re-emerged as powerful machine-learning models, yielding state-of-the-art results in fields such as image recognition and speech processing.

BIG-bench Machine Learning

Improving Distributional Similarity with Lessons Learned from Word Embeddings

no code implementations TACL 2015 Omer Levy, Yoav Goldberg, Ido Dagan

Recent trends suggest that neural-network-inspired word embedding models outperform traditional count-based distributional models on word similarity and analogy detection tasks.

Word Embeddings Word Similarity

Neural Word Embedding as Implicit Matrix Factorization

no code implementations NeurIPS 2014 Omer Levy, Yoav Goldberg

We analyze skip-gram with negative-sampling (SGNS), a word embedding method introduced by Mikolov et al., and show that it is implicitly factorizing a word-context matrix, whose cells are the pointwise mutual information (PMI) of the respective word and context pairs, shifted by a global constant.
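A small self-contained sketch of the shifted PMI matrix the paper identifies, over a toy list of observed (word, context) pairs only; unobserved cells and the SPPMI clipping discussed in the paper are omitted.

```python
import math
from collections import Counter

def shifted_pmi(pairs, k=1):
    """The cells SGNS implicitly factorizes: PMI(w, c) - log k,
    where k is the number of negative samples."""
    pair_counts = Counter(pairs)
    w_counts = Counter(w for w, _ in pairs)
    c_counts = Counter(c for _, c in pairs)
    total = len(pairs)
    return {(w, c): math.log(n * total / (w_counts[w] * c_counts[c]))
                    - math.log(k)
            for (w, c), n in pair_counts.items()}

m = shifted_pmi([("dog", "cat"), ("fish", "water")], k=1)
```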

Word Similarity

word2vec Explained: deriving Mikolov et al.'s negative-sampling word-embedding method

5 code implementations 15 Feb 2014 Yoav Goldberg, Omer Levy

The word2vec software of Tomas Mikolov and colleagues (https://code.google.com/p/word2vec/) has gained a lot of traction lately, and provides state-of-the-art word embeddings.

Language Modelling Word Embeddings

A Tabular Method for Dynamic Oracles in Transition-Based Parsing

no code implementations TACL 2014 Yoav Goldberg, Francesco Sartorio, Giorgio Satta

We develop parsing oracles for two transition-based dependency parsers, including the arc-standard parser, solving a problem that was left open in (Goldberg and Nivre, 2013).

Training Deterministic Parsers with Non-Deterministic Oracles

no code implementations TACL 2013 Yoav Goldberg, Joakim Nivre

This problem is aggravated by the fact that they are normally trained using oracles that are deterministic and incomplete in the sense that they assume a unique canonical path through the transition system and are only valid as long as the parser does not stray from this path.

