1 code implementation • EMNLP (BlackboxNLP) 2020 • Hila Gonen, Shauli Ravfogel, Yanai Elazar, Yoav Goldberg
Recent works have demonstrated that multilingual BERT (mBERT) learns rich cross-lingual representations that allow for transfer across languages.
no code implementations • 15 Mar 2024 • Tomasz Limisiewicz, Terra Blevins, Hila Gonen, Orevaoghene Ahia, Luke Zettlemoyer
A major consideration in multilingual language modeling is how to best represent languages with diverse vocabularies and scripts.
no code implementations • 19 Jan 2024 • Terra Blevins, Tomasz Limisiewicz, Suchin Gururangan, Margaret Li, Hila Gonen, Noah A. Smith, Luke Zettlemoyer
Despite their popularity in non-English NLP, multilingual language models often underperform monolingual ones due to inter-language competition for model parameters.
1 code implementation • arXiv 2023 • Stephen Mayhew, Terra Blevins, Shuheng Liu, Marek Šuppa, Hila Gonen, Joseph Marvin Imperial, Börje F. Karlsson, Peiqin Lin, Nikola Ljubešić, LJ Miranda, Barbara Plank, Arij Riabi, Yuval Pinter
We introduce Universal NER (UNER), an open, community-driven project to develop gold-standard NER benchmarks in many languages.
Ranked #1 on Named Entity Recognition (NER) on UNER v1 (Danish)
1 code implementation • 23 Oct 2023 • Jaechan Lee, Alisa Liu, Orevaoghene Ahia, Hila Gonen, Noah A. Smith
In experiments, we compare MT-specific models and language models for (i) their preference when given an ambiguous subsentence, (ii) their sensitivity to disambiguating context, and (iii) the performance disparity between figurative and literal source sentences.
no code implementations • 24 May 2023 • Akari Asai, Sneha Kudugunta, Xinyan Velocity Yu, Terra Blevins, Hila Gonen, Machel Reid, Yulia Tsvetkov, Sebastian Ruder, Hannaneh Hajishirzi
Despite remarkable advancements in few-shot generalization in natural language processing, most models are developed and evaluated primarily in English.
no code implementations • 23 May 2023 • Orevaoghene Ahia, Sachin Kumar, Hila Gonen, Jungo Kasai, David R. Mortensen, Noah A. Smith, Yulia Tsvetkov
Language models have graduated from being research prototypes to commercialized products offered as web APIs, and recent works have highlighted the multilingual capabilities of these products.
no code implementations • 15 Feb 2023 • Marjan Ghazvininejad, Hila Gonen, Luke Zettlemoyer
Large language models (LLMs) demonstrate remarkable machine translation (MT) abilities via prompting, even though they were not explicitly trained for this task.
2 code implementations • 25 Jan 2023 • Davis Liang, Hila Gonen, Yuning Mao, Rui Hou, Naman Goyal, Marjan Ghazvininejad, Luke Zettlemoyer, Madian Khabsa
Large multilingual language models typically rely on a single vocabulary shared across 100+ languages.
no code implementations • 20 Dec 2022 • Weijia Shi, Xiaochuang Han, Hila Gonen, Ari Holtzman, Yulia Tsvetkov, Luke Zettlemoyer
Large language models can perform new tasks in a zero-shot fashion, given natural language prompts that specify the desired behavior.
no code implementations • 8 Dec 2022 • Hila Gonen, Srini Iyer, Terra Blevins, Noah A. Smith, Luke Zettlemoyer
Language models can be prompted to perform a wide variety of zero- and few-shot learning problems.
no code implementations • 15 Nov 2022 • Terra Blevins, Hila Gonen, Luke Zettlemoyer
Although pretrained language models (PLMs) can be prompted to perform a wide range of language tasks, it remains an open question how much this ability comes from generalizable linguistic understanding versus surface-level lexical patterns.
no code implementations • 24 May 2022 • Terra Blevins, Hila Gonen, Luke Zettlemoyer
The emergent cross-lingual transfer seen in multilingual pretrained models has sparked significant interest in studying their behavior.
1 code implementation • RepL4NLP (ACL) 2022 • Hila Gonen, Shauli Ravfogel, Yoav Goldberg
Multilingual language models have been shown to allow for nontrivial transfer across scripts and languages.
1 code implementation • ACL 2020 • Hila Gonen, Ganesh Jawahar, Djamé Seddah, Yoav Goldberg
The problem of comparing two bodies of text and searching for words that differ in their usage between them arises often in digital humanities and computational social science.
no code implementations • NAACL 2021 • Iftah Gamzu, Hila Gonen, Gilad Kutiel, Ran Levy, Eugene Agichtein
This task is closely related to the task of Multi-Document Summarization in the product reviews domain but differs in its objective and its level of conciseness.
1 code implementation • COLING 2020 • Ella Rabinovich, Hila Gonen, Suzanne Stevenson
A large body of research on gender-linked language has established foundations regarding cross-gender differences in lexical, emotional, and topical preferences, along with their sociological underpinnings.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Hila Gonen, Kellie Webster
The successful application of neural methods to machine translation has realized huge quality advances for the community.
2 code implementations • ACL 2020 • Shauli Ravfogel, Yanai Elazar, Hila Gonen, Michael Twiton, Yoav Goldberg
The ability to control for the kinds of information encoded in neural representation has a variety of use cases, especially in light of the challenge of interpreting these models.
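This entry describes removing, or "guarding," specific information from neural representations. As a rough illustration of the iterative nullspace-projection idea behind it (a minimal sketch on synthetic data using scikit-learn's `LogisticRegression`, not the authors' implementation; all names here are illustrative):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def nullspace_projection(W):
    """Projection matrix onto the nullspace of the rows of W (shape d x d)."""
    # Orthonormal basis of the row space via SVD, then project it out.
    _, s, Vt = np.linalg.svd(W, full_matrices=False)
    basis = Vt[s > 1e-10]  # directions the linear classifier relies on
    return np.eye(W.shape[1]) - basis.T @ basis

def inlp(X, z, n_iter=3):
    """Iteratively remove linearly decodable information about z from X."""
    P = np.eye(X.shape[1])
    Xp = X.copy()
    for _ in range(n_iter):
        # Train a linear probe for the protected attribute z,
        # then project the representations onto its nullspace.
        clf = LogisticRegression(max_iter=1000).fit(Xp, z)
        P = nullspace_projection(clf.coef_) @ P
        Xp = X @ P.T
    return Xp, P

# Toy data: the attribute z is encoded along the first coordinate.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
z = (X[:, 0] > 0).astype(int)

X_clean, P = inlp(X, z, n_iter=3)
# After projection, a fresh linear probe should do much worse at
# recovering z from X_clean than from the original X.
acc = LogisticRegression(max_iter=1000).fit(X_clean, z).score(X_clean, z)
```

Each iteration removes one linear direction predictive of the attribute, so a few rounds suffice on this toy example; the paper applies the same loop to real model representations.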
1 code implementation • CoNLL 2019 • Hila Gonen, Yova Kementchedjhieva, Yoav Goldberg
Many natural languages also assign grammatical gender to inanimate nouns.
no code implementations • IJCNLP 2019 • Rowan Hall Maudslay, Hila Gonen, Ryan Cotterell, Simone Teufel
An alternative approach is Counterfactual Data Augmentation (CDA), in which a corpus is duplicated and augmented to remove bias, e.g., by swapping all inherently gendered words in the copy.
2 code implementations • NAACL 2019 • Hila Gonen, Yoav Goldberg
Word embeddings are widely used in NLP for a vast range of tasks.
1 code implementation • IJCNLP 2019 • Hila Gonen, Yoav Goldberg
We focus on the problem of language modeling for code-switched language, in the context of automatic speech recognition (ASR).
Automatic Speech Recognition (ASR) +2
no code implementations • COLING 2016 • Hila Gonen, Yoav Goldberg
Prepositions are very common and very ambiguous, and understanding their sense is critical for understanding the meaning of the sentence.