Search Results for author: Yanai Elazar

Found 49 papers, 24 papers with code

It’s not Greek to mBERT: Inducing Word-Level Translations from Multilingual BERT

1 code implementation EMNLP (BlackboxNLP) 2020 Hila Gonen, Shauli Ravfogel, Yanai Elazar, Yoav Goldberg

Recent works have demonstrated that multilingual BERT (mBERT) learns rich cross-lingual representations that allow for transfer across languages.

Translation

On Linear Representations and Pretraining Data Frequency in Language Models

no code implementations 16 Apr 2025 Jack Merullo, Noah A. Smith, Sarah Wiegreffe, Yanai Elazar

Pretraining data has a direct impact on the behaviors and quality of language models (LMs), but we only understand the most basic principles of this relationship.

In-Context Learning +1

Better Aligned with Survey Respondents or Training Data? Unveiling Political Leanings of LLMs on U.S. Supreme Court Cases

no code implementations 25 Feb 2025 Shanshan Xu, T. Y. S. S Santosh, Yanai Elazar, Quirin Vogel, Barbara Plank, Matthias Grabmair

The increased adoption of Large Language Models (LLMs) and their potential to shape public opinion have sparked interest in assessing these models' political leanings.

GRADE: Quantifying Sample Diversity in Text-to-Image Models

no code implementations 29 Oct 2024 Royi Rassin, Aviv Slobodkin, Shauli Ravfogel, Yanai Elazar, Yoav Goldberg

GRADE leverages the world knowledge embedded in large language models and visual question-answering systems to identify relevant concept-specific axes of diversity (e.g., "shape" and "color" for the concept "cookie").

Attribute Diversity +3
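A minimal sketch of the diversity measurement this entry describes, assuming (as an illustration, not the paper's exact formulation) that attribute values extracted by a VQA system for one concept are scored with normalized entropy; the function name and toy data below are hypothetical:

```python
# Hypothetical sketch: score diversity along one concept-specific axis as the
# normalized entropy of the attribute values a VQA model reports for the
# generated images of that concept.
from collections import Counter
import math

def normalized_entropy(values):
    """Entropy of the empirical distribution over values, normalized to [0, 1]."""
    counts = Counter(values)
    total = sum(counts.values())
    probs = [c / total for c in counts.values()]
    if len(probs) <= 1:
        return 0.0
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(probs))

# e.g., hypothetical "shape" answers for 100 generated "cookie" images
shapes = ["round"] * 92 + ["square"] * 5 + ["star"] * 3
print(f"diversity along 'shape': {normalized_entropy(shapes):.3f}")  # low -> little variation
```

A score near 0 means the model keeps producing the same attribute value; a score near 1 means it varies roughly uniformly along that axis.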

Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback

1 code implementation 24 Oct 2024 Lester James V. Miranda, Yizhong Wang, Yanai Elazar, Sachin Kumar, Valentina Pyatkin, Faeze Brahman, Noah A. Smith, Hannaneh Hajishirzi, Pradeep Dasigi

We analyze features from the routing model to identify characteristics of instances that can benefit from human feedback, e.g., prompts with a moderate safety concern or moderate intent complexity.

How Many Van Goghs Does It Take to Van Gogh? Finding the Imitation Threshold

1 code implementation 19 Oct 2024 Sahil Verma, Royi Rassin, Arnav Das, Gantavya Bhatt, Preethi Seshadri, Chirag Shah, Jeff Bilmes, Hannaneh Hajishirzi, Yanai Elazar

We seek to determine the point at which a model was trained on enough instances to imitate a concept -- the imitation threshold.

Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data

no code implementations 20 Jul 2024 Xinyi Wang, Antonis Antoniades, Yanai Elazar, Alfonso Amayuelas, Alon Albalak, Kexun Zhang, William Yang Wang

Furthermore, while model performance improves across all tasks as LLM size increases, only factual question answering shows an increase in memorization, whereas machine translation and reasoning tasks exhibit greater generalization, producing more novel outputs.

Language Modelling Machine Translation +6

Detection and Measurement of Syntactic Templates in Generated Text

no code implementations 28 Jun 2024 Chantal Shaib, Yanai Elazar, Junyi Jessy Li, Byron C. Wallace

Recent work on evaluating the diversity of text generated by LLMs has focused on word-level features.

Diversity Memorization

Evaluating $n$-Gram Novelty of Language Models Using Rusty-DAWG

1 code implementation 18 Jun 2024 William Merrill, Noah A. Smith, Yanai Elazar

In this work, we investigate the extent to which modern LMs generate $n$-grams from their training data, evaluating both (i) the probability LMs assign to complete training $n$-grams and (ii) $n$-novelty, the proportion of $n$-grams generated by an LM that did not appear in the training data (for arbitrarily large $n$).
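A minimal sketch of the $n$-novelty measure defined in this snippet (part (ii) only), using a plain in-memory set of training $n$-grams; the actual evaluation relies on the Rusty-DAWG index precisely because such a set does not scale to full pretraining corpora or arbitrarily large $n$:

```python
# Toy sketch of n-novelty: the fraction of n-grams in generated text that
# never appear in the training data.
def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def n_novelty(generated_tokens, training_ngram_set, n):
    """Proportion of generated n-grams absent from the training data."""
    gen = ngrams(generated_tokens, n)
    if not gen:
        return 0.0
    novel = sum(1 for g in gen if g not in training_ngram_set)
    return novel / len(gen)

train = "the cat sat on the mat".split()
generated = "the cat sat on a red mat".split()
train_ngrams = set(ngrams(train, 3))
print(n_novelty(generated, train_ngrams, 3))  # 0.6: three of five trigrams are novel
```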

Applying Intrinsic Debiasing on Downstream Tasks: Challenges and Considerations for Machine Translation

no code implementations 2 Jun 2024 Bar Iluz, Yanai Elazar, Asaf Yehudai, Gabriel Stanovsky

Most works on gender bias focus on intrinsic bias -- removing traces of information about a protected group from the model's internal representation.

Machine Translation

Calibrating Large Language Models with Sample Consistency

no code implementations 21 Feb 2024 Qing Lyu, Kumar Shridhar, Chaitanya Malaviya, Li Zhang, Yanai Elazar, Niket Tandon, Marianna Apidianaki, Mrinmaya Sachan, Chris Callison-Burch

Accurately gauging the confidence level of Large Language Models' (LLMs) predictions is pivotal for their reliable application.
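A hedged sketch of the sample-consistency idea behind this calibration work, assuming agreement among sampled answers is used directly as a confidence score; `generate_answer` is a hypothetical stand-in for an LLM sampling call, not the paper's interface:

```python
# Sample several answers to the same question and treat agreement with the
# majority answer as the model's confidence in that answer.
from collections import Counter
import random

def consistency_confidence(question, generate_answer, k=10):
    answers = [generate_answer(question) for _ in range(k)]
    majority, count = Counter(answers).most_common(1)[0]
    return majority, count / k  # answer and its agreement-based confidence

# toy stand-in for a sampled LLM
def generate_answer(question):
    return random.choice(["42", "42", "42", "41"])

answer, confidence = consistency_confidence("What is 6 * 7?", generate_answer)
print(answer, confidence)
```

Good calibration then means this agreement score tracks how often the majority answer is actually correct.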

Measuring and Improving Attentiveness to Partial Inputs with Counterfactuals

no code implementations 16 Nov 2023 Yanai Elazar, Bhargavi Paranjape, Hao Peng, Sarah Wiegreffe, Khyathi Raghavi, Vivek Srikumar, Sameer Singh, Noah A. Smith

Previous work has found that datasets with paired inputs are prone to correlations between a specific part of the input (e.g., the hypothesis in NLI) and the label; consequently, models trained on only that part of the input outperform chance.

counterfactual In-Context Learning +2

What's In My Big Data?

1 code implementation 31 Oct 2023 Yanai Elazar, Akshita Bhagia, Ian Magnusson, Abhilasha Ravichander, Dustin Schwenk, Alane Suhr, Pete Walsh, Dirk Groeneveld, Luca Soldaini, Sameer Singh, Hanna Hajishirzi, Noah A. Smith, Jesse Dodge

We open-source WIMBD's code and artifacts to provide a standard set of evaluations for new text-based corpora and to encourage more analyses and transparency around them.

Benchmarking

The Bias Amplification Paradox in Text-to-Image Generation

1 code implementation 1 Aug 2023 Preethi Seshadri, Sameer Singh, Yanai Elazar

Bias amplification is a phenomenon in which models exacerbate biases or stereotypes present in the training data.

Text to Image Generation Text-to-Image Generation
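A worked toy example of one common way to quantify this phenomenon (an assumed formulation, not necessarily the paper's exact metric): amplification as the gap between an attribute's rate in generated images and its rate in the training data for the same prompt or concept:

```python
# Assumed amplification measure: positive values mean the model exaggerates
# the skew already present in the training data.
def amplification(p_generated, p_train):
    """Attribute-given-concept rate in generations minus the same rate in training data."""
    return p_generated - p_train

# e.g., 70% of training images for a prompt show one attribute value,
# but 95% of generated images do: the skew is amplified by 0.25
print(amplification(0.95, 0.70))
```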

Estimating the Causal Effect of Early ArXiving on Paper Acceptance

2 code implementations 24 Jun 2023 Yanai Elazar, Jiayao Zhang, David Wadden, Bo Zhang, Noah A. Smith

However, since quality is a challenging construct to estimate, we use the negative outcome control method, using paper citation count as a control variable to debias the quality confounding effect.

Causal Inference

Few-shot Fine-tuning vs. In-context Learning: A Fair Comparison and Evaluation

1 code implementation 26 May 2023 Marius Mosbach, Tiago Pimentel, Shauli Ravfogel, Dietrich Klakow, Yanai Elazar

In this paper, we compare the generalization of few-shot fine-tuning and in-context learning to challenge datasets, while controlling for the models used, the number of examples, and the number of parameters, ranging from 125M to 30B.

Domain Generalization In-Context Learning

At Your Fingertips: Extracting Piano Fingering Instructions from Videos

no code implementations 7 Mar 2023 Amit Moryossef, Yanai Elazar, Yoav Goldberg

Piano fingering, knowing which finger to use to play each note in a musical piece, is a hard and important skill to master when learning to play the piano.

Lexical Generalization Improves with Larger Models and Longer Training

1 code implementation 23 Oct 2022 Elron Bandel, Yoav Goldberg, Yanai Elazar

While fine-tuned language models perform well on many tasks, they were also shown to rely on superficial surface features such as lexical overlap.

Natural Language Inference Reading Comprehension

CIKQA: Learning Commonsense Inference with a Unified Knowledge-in-the-loop QA Paradigm

no code implementations 12 Oct 2022 Hongming Zhang, Yintong Huo, Yanai Elazar, Yangqiu Song, Yoav Goldberg, Dan Roth

We first align commonsense tasks with relevant knowledge from commonsense knowledge bases and ask humans to annotate whether the knowledge is enough or not.

Question Answering Task 2

Measuring Causal Effects of Data Statistics on Language Model's `Factual' Predictions

no code implementations 28 Jul 2022 Yanai Elazar, Nora Kassner, Shauli Ravfogel, Amir Feder, Abhilasha Ravichander, Marius Mosbach, Yonatan Belinkov, Hinrich Schütze, Yoav Goldberg

Our causal framework and our results demonstrate the importance of studying datasets and the benefits of causality for understanding NLP models.

Text-based NP Enrichment

1 code implementation 24 Sep 2021 Yanai Elazar, Victoria Basmov, Yoav Goldberg, Reut Tsarfaty

Understanding the relations between entities denoted by NPs in a text is a critical part of human-like natural language understanding.

Natural Language Understanding

Contrastive Explanations for Model Interpretability

1 code implementation EMNLP 2021 Alon Jacovi, Swabha Swayamdipta, Shauli Ravfogel, Yanai Elazar, Yejin Choi, Yoav Goldberg

Our method is based on projecting model representation to a latent space that captures only the features that are useful (to the model) to differentiate two potential decisions.

model text-classification +1
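A minimal sketch of one way to realize such a contrastive projection for a linear classification head (a simplification for illustration, not the paper's full method): keep only the component of the representation that lies along the direction separating the fact label from the foil label:

```python
# Rank-1 contrastive projection: everything orthogonal to the direction that
# distinguishes the two candidate decisions is discarded.
import numpy as np

def contrastive_projection(h, w_fact, w_foil):
    """Keep only the component of h along the direction separating two labels."""
    d = w_fact - w_foil
    d = d / np.linalg.norm(d)
    return np.outer(d, d) @ h  # projection onto span(w_fact - w_foil)

h = np.random.randn(768)                                    # a hidden representation
w_cat, w_dog = np.random.randn(768), np.random.randn(768)   # classifier rows for two labels
h_contrastive = contrastive_projection(h, w_cat, w_dog)
```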

Measuring and Improving Consistency in Pretrained Language Models

1 code implementation 1 Feb 2021 Yanai Elazar, Nora Kassner, Shauli Ravfogel, Abhilasha Ravichander, Eduard Hovy, Hinrich Schütze, Yoav Goldberg

In this paper we study the question: Are Pretrained Language Models (PLMs) consistent with respect to factual knowledge?
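A small sketch of the consistency notion this question implies, assuming it is measured as agreement across paraphrases of the same fact; the helper and toy predictions below are illustrative, not the paper's benchmark code:

```python
# Consistency as the fraction of paraphrase pairs on which the model makes
# the same prediction for the same underlying fact.
from itertools import combinations

def consistency(predictions_per_paraphrase):
    """predictions_per_paraphrase: model predictions, one per paraphrase of a fact."""
    pairs = list(combinations(predictions_per_paraphrase, 2))
    if not pairs:
        return 1.0
    return sum(a == b for a, b in pairs) / len(pairs)

# e.g., predictions for "X was born in __" / "X is a native of __" / "The birthplace of X is __"
print(consistency(["Paris", "Paris", "Lyon"]))  # one of three pairs agrees
```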

First Align, then Predict: Understanding the Cross-Lingual Ability of Multilingual BERT

1 code implementation EACL 2021 Benjamin Muller, Yanai Elazar, Benoît Sagot, Djamé Seddah

Such transfer emerges by fine-tuning on a task of interest in one language and evaluating on a distinct language, not seen during the fine-tuning.

Language Modeling Language Modelling +1

It's not Greek to mBERT: Inducing Word-Level Translations from Multilingual BERT

1 code implementation 16 Oct 2020 Hila Gonen, Shauli Ravfogel, Yanai Elazar, Yoav Goldberg

Recent works have demonstrated that multilingual BERT (mBERT) learns rich cross-lingual representations that allow for transfer across languages.

Translation

Do Language Embeddings Capture Scales?

no code implementations EMNLP (BlackboxNLP) 2020 Xikun Zhang, Deepak Ramachandran, Ian Tenney, Yanai Elazar, Dan Roth

Pretrained Language Models (LMs) have been shown to possess significant linguistic, common sense, and factual knowledge.

Common Sense Reasoning

Evaluating NLP Models via Contrast Sets

no code implementations 1 Oct 2020 Matt Gardner, Yoav Artzi, Victoria Basmova, Jonathan Berant, Ben Bogin, Sihao Chen, Pradeep Dasigi, Dheeru Dua, Yanai Elazar, Ananth Gottumukkala, Nitish Gupta, Hanna Hajishirzi, Gabriel Ilharco, Daniel Khashabi, Kevin Lin, Jiangming Liu, Nelson F. Liu, Phoebe Mulcaire, Qiang Ning, Sameer Singh, Noah A. Smith, Sanjay Subramanian, Reut Tsarfaty, Eric Wallace, A. Zhang, Ben Zhou

Unfortunately, when a dataset has systematic gaps (e.g., annotation artifacts), these evaluations are misleading: a model can learn simple decision rules that perform well on the test set but do not capture a dataset's intended capabilities.

Reading Comprehension Sentiment Analysis

Amnesic Probing: Behavioral Explanation with Amnesic Counterfactuals

no code implementations 1 Jun 2020 Yanai Elazar, Shauli Ravfogel, Alon Jacovi, Yoav Goldberg

In this work, we point out the inability to infer behavioral conclusions from probing results and offer an alternative method that focuses on how the information is being used, rather than on what information is encoded.
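A hedged sketch of the amnesic intuition: remove a property from the representations (e.g., with a nullspace projection such as the one in the following entry) and measure how much task behavior degrades; the stubbed score function and toy projection below are illustrative only:

```python
# A large drop after removal suggests the model actually uses the property,
# not merely that the property is linearly decodable.
import numpy as np

def amnesic_effect(score_fn, X, P):
    """Drop in task score after removing a property from representations X via projection P."""
    return score_fn(X) - score_fn(X @ P)

# toy illustration: the "task" is recovering feature 0, and P projects that feature out
X = np.random.randn(100, 8)
P = np.eye(8); P[0, 0] = 0.0
score = lambda M: 0.0 if M[:, 0].std() == 0 else float(np.corrcoef(M[:, 0], X[:, 0])[0, 1] ** 2)
print(amnesic_effect(score, X, P))  # ~1.0: behavior collapses, so the property was used
```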

Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection

2 code implementations ACL 2020 Shauli Ravfogel, Yanai Elazar, Hila Gonen, Michael Twiton, Yoav Goldberg

The ability to control for the kinds of information encoded in neural representation has a variety of use cases, especially in light of the challenge of interpreting these models.

Fairness Multi-class Classification +1
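A compact sketch of Iterative Nullspace Projection as it is usually formulated (a minimal illustration, not the released implementation): repeatedly fit a linear classifier for the protected attribute and project the representations onto that classifier's nullspace until the attribute is no longer linearly recoverable:

```python
# Iteratively remove the linear direction most predictive of the protected
# attribute z from the representations X.
import numpy as np
from sklearn.linear_model import LogisticRegression

def inlp(X, z, n_iters=10):
    """X: (n, d) representations; z: (n,) protected-attribute labels."""
    d = X.shape[1]
    P = np.eye(d)                                   # accumulated guarding projection
    for _ in range(n_iters):
        clf = LogisticRegression(max_iter=1000).fit(X @ P, z)
        w = clf.coef_ / np.linalg.norm(clf.coef_)   # (1, d) direction predictive of z
        P = (np.eye(d) - w.T @ w) @ P               # project that direction out
    return P                                        # apply as X_debiased = X @ P

X = np.random.randn(200, 16)
z = np.random.randint(0, 2, size=200)
X_debiased = X @ inlp(X, z, n_iters=5)
```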

oLMpics -- On what Language Model Pre-training Captures

2 code implementations 31 Dec 2019 Alon Talmor, Yanai Elazar, Yoav Goldberg, Jonathan Berant

A fundamental challenge is to understand whether the performance of an LM on a task should be attributed to the pre-trained representations or to the process of fine-tuning on the task data.

Language Modeling Language Modelling +1

Adversarial Removal of Demographic Attributes Revisited

no code implementations IJCNLP 2019 Maria Barrett, Yova Kementchedjhieva, Yanai Elazar, Desmond Elliott, Anders Søgaard

Elazar and Goldberg (2018) showed that protected attributes can be extracted from the representations of a debiased neural network for mention detection at above-chance levels, by evaluating a diagnostic classifier on a held-out subsample of the data it was trained on.

Diagnostic

Where's My Head? Definition, Dataset and Models for Numeric Fused-Heads Identification and Resolution

1 code implementation 26 May 2019 Yanai Elazar, Yoav Goldberg

We provide the first computational treatment of fused-heads constructions (FH), focusing on the numeric fused-heads (NFH).

Missing Elements Sentence

Where's My Head? Definition, Data Set, and Models for Numeric Fused-Head Identification and Resolution

no code implementations TACL 2019 Yanai Elazar, Yoav Goldberg

We provide the first computational treatment of fused-heads constructions (FHs), focusing on the numeric fused-heads (NFHs).

Sentence

Adversarial Removal of Demographic Attributes from Text Data

1 code implementation EMNLP 2018 Yanai Elazar, Yoav Goldberg

Recent advances in Representation Learning and Adversarial Training seem to succeed in removing unwanted features from the learned representation.

Representation Learning
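A brief sketch of a standard adversarial-removal setup with gradient reversal (assumed here for illustration; the paper's exact architecture may differ): an encoder is trained for the main task while an adversary tries to predict the protected attribute from the encoding, and reversing the adversary's gradient pushes the encoder to discard that attribute:

```python
# Gradient-reversal adversarial training: the encoder helps the task head
# while being updated to hurt the attribute adversary.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

encoder = nn.Linear(300, 64)
task_head = nn.Linear(64, 2)        # main task labels
adversary = nn.Linear(64, 2)        # protected attribute

x = torch.randn(8, 300)
h = torch.relu(encoder(x))
task_logits = task_head(h)
adv_logits = adversary(GradReverse.apply(h, 1.0))
# total loss = task_loss(task_logits, y) + adv_loss(adv_logits, z); backprop then
# updates the encoder to support the task while degrading the adversary.
```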
