Search Results for author: Hila Gonen

Found 25 papers, 12 papers with code

It’s not Greek to mBERT: Inducing Word-Level Translations from Multilingual BERT

1 code implementation EMNLP (BlackboxNLP) 2020 Hila Gonen, Shauli Ravfogel, Yanai Elazar, Yoav Goldberg

Recent works have demonstrated that multilingual BERT (mBERT) learns rich cross-lingual representations that allow for transfer across languages.


MYTE: Morphology-Driven Byte Encoding for Better and Fairer Multilingual Language Modeling

no code implementations 15 Mar 2024 Tomasz Limisiewicz, Terra Blevins, Hila Gonen, Orevaoghene Ahia, Luke Zettlemoyer

A major consideration in multilingual language modeling is how to best represent languages with diverse vocabularies and scripts.

Language Modelling

Breaking the Curse of Multilinguality with Cross-lingual Expert Language Models

no code implementations 19 Jan 2024 Terra Blevins, Tomasz Limisiewicz, Suchin Gururangan, Margaret Li, Hila Gonen, Noah A. Smith, Luke Zettlemoyer

Despite their popularity in non-English NLP, multilingual language models often underperform monolingual ones due to inter-language competition for model parameters.

That was the last straw, we need more: Are Translation Systems Sensitive to Disambiguating Context?

1 code implementation 23 Oct 2023 Jaechan Lee, Alisa Liu, Orevaoghene Ahia, Hila Gonen, Noah A. Smith

In experiments, we compare MT-specific models and language models for (i) their preference when given an ambiguous subsentence, (ii) their sensitivity to disambiguating context, and (iii) the performance disparity between figurative and literal source sentences.


Do All Languages Cost the Same? Tokenization in the Era of Commercial Language Models

no code implementations 23 May 2023 Orevaoghene Ahia, Sachin Kumar, Hila Gonen, Jungo Kasai, David R. Mortensen, Noah A. Smith, Yulia Tsvetkov

Language models have graduated from being research prototypes to commercialized products offered as web APIs, and recent works have highlighted the multilingual capabilities of these products.

Fairness Language Modelling

Dictionary-based Phrase-level Prompting of Large Language Models for Machine Translation

no code implementations 15 Feb 2023 Marjan Ghazvininejad, Hila Gonen, Luke Zettlemoyer

Large language models (LLMs) demonstrate remarkable machine translation (MT) abilities via prompting, even though they were not explicitly trained for this task.

Machine Translation Translation

Toward Human Readable Prompt Tuning: Kubrick's The Shining is a good movie, and a good prompt too?

no code implementations 20 Dec 2022 Weijia Shi, Xiaochuang Han, Hila Gonen, Ari Holtzman, Yulia Tsvetkov, Luke Zettlemoyer

Large language models can perform new tasks in a zero-shot fashion, given natural language prompts that specify the desired behavior.

Demystifying Prompts in Language Models via Perplexity Estimation

no code implementations 8 Dec 2022 Hila Gonen, Srini Iyer, Terra Blevins, Noah A. Smith, Luke Zettlemoyer

Language models can be prompted to perform a wide variety of zero- and few-shot learning problems.

Few-Shot Learning

Prompting Language Models for Linguistic Structure

no code implementations 15 Nov 2022 Terra Blevins, Hila Gonen, Luke Zettlemoyer

Although pretrained language models (PLMs) can be prompted to perform a wide range of language tasks, it remains an open question how much this ability comes from generalizable linguistic understanding versus surface-level lexical patterns.

Chunking In-Context Learning +8

Analyzing the Mono- and Cross-Lingual Pretraining Dynamics of Multilingual Language Models

no code implementations 24 May 2022 Terra Blevins, Hila Gonen, Luke Zettlemoyer

The emergent cross-lingual transfer seen in multilingual pretrained models has sparked significant interest in studying their behavior.

Cross-Lingual Transfer XLM-R

Analyzing Gender Representation in Multilingual Models

1 code implementation RepL4NLP (ACL) 2022 Hila Gonen, Shauli Ravfogel, Yoav Goldberg

Multilingual language models were shown to allow for nontrivial transfer across scripts and languages.

Gender Classification

Simple, Interpretable and Stable Method for Detecting Words with Usage Change across Corpora

1 code implementation ACL 2020 Hila Gonen, Ganesh Jawahar, Djamé Seddah, Yoav Goldberg

The problem of comparing two bodies of text and searching for words that differ in their usage between them arises often in digital humanities and computational social science.

Word Embeddings

Identifying Helpful Sentences in Product Reviews

no code implementations NAACL 2021 Iftah Gamzu, Hila Gonen, Gilad Kutiel, Ran Levy, Eugene Agichtein

This task is closely related to the task of Multi Document Summarization in the product reviews domain but differs in its objective and its level of conciseness.

Document Summarization Multi-Document Summarization +1

Pick a Fight or Bite your Tongue: Investigation of Gender Differences in Idiomatic Language Usage

1 code implementation COLING 2020 Ella Rabinovich, Hila Gonen, Suzanne Stevenson

A large body of research on gender-linked language has established foundations regarding cross-gender differences in lexical, emotional, and topical preferences, along with their sociological underpinnings.

It's not Greek to mBERT: Inducing Word-Level Translations from Multilingual BERT

1 code implementation 16 Oct 2020 Hila Gonen, Shauli Ravfogel, Yanai Elazar, Yoav Goldberg

Recent works have demonstrated that multilingual BERT (mBERT) learns rich cross-lingual representations that allow for transfer across languages.


Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection

2 code implementations ACL 2020 Shauli Ravfogel, Yanai Elazar, Hila Gonen, Michael Twiton, Yoav Goldberg

The ability to control for the kinds of information encoded in neural representation has a variety of use cases, especially in light of the challenge of interpreting these models.

Fairness Multi-class Classification +1

It's All in the Name: Mitigating Gender Bias with Name-Based Counterfactual Data Substitution

no code implementations IJCNLP 2019 Rowan Hall Maudslay, Hila Gonen, Ryan Cotterell, Simone Teufel

An alternative approach is Counterfactual Data Augmentation (CDA), in which a corpus is duplicated and augmented to remove bias, e.g. by swapping all inherently-gendered words in the copy.

counterfactual Data Augmentation +1

Semi Supervised Preposition-Sense Disambiguation using Multilingual Data

no code implementations COLING 2016 Hila Gonen, Yoav Goldberg

Prepositions are very common and very ambiguous, and understanding their sense is critical for understanding the meaning of the sentence.

General Classification Sentence +1
