Search Results for author: Clara Vania

Found 22 papers, 7 papers with code

Crowdsourcing Beyond Annotation: Case Studies in Benchmark Data Collection

no code implementations • EMNLP (ACL) 2021 • Alane Suhr, Clara Vania, Nikita Nangia, Maarten Sap, Mark Yatskar, Samuel R. Bowman, Yoav Artzi

Even though it is such a fundamental tool in NLP, crowdsourcing use is largely guided by common practices and the personal experience of researchers.

Paper
Add Code

Improving Distantly Supervised Document-Level Relation Extraction Through Natural Language Inference

no code implementations • DeepLo 2022 • Clara Vania, Grace Lee and Andrea Pierleoni

Document-level Relation Extraction Natural Language Inference +1

Paper
Add Code

SIGMORPHON 2021 Shared Task on Morphological Reinflection: Generalization Across Languages

no code implementations • ACL (SIGMORPHON) 2021 • Tiago Pimentel, Maria Ryskina, Sabrina J. Mielke, Shijie Wu, Eleanor Chodroff, Brian Leonard, Garrett Nicolai, Yustinus Ghanggo Ate, Salam Khalifa, Nizar Habash, Charbel El-Khaissi, Omer Goldman, Michael Gasser, William Lane, Matt Coler, Arturo Oncevay, Jaime Rafael Montoya Samame, Gema Celeste Silva Villegas, Adam Ek, Jean-Philippe Bernardy, Andrey Shcherbakov, Aziyana Bayyr-ool, Karina Sheifer, Sofya Ganieva, Matvey Plugaryov, Elena Klyachko, Ali Salehi, Andrew Krizhanovsky, Natalia Krizhanovsky, Clara Vania, Sardana Ivanova, Aelita Salchak, Christopher Straughn, Zoey Liu, Jonathan North Washington, Duygu Ataman, Witold Kieraś, Marcin Woliński, Totok Suhardijanto, Niklas Stoehr, Zahroh Nuriah, Shyam Ratan, Francis M. Tyers, Edoardo M. Ponti, Grant Aiton, Richard J. Hatcher, Emily Prud'hommeaux, Ritesh Kumar, Mans Hulden, Botond Barta, Dorina Lakatos, Gábor Szolnok, Judit Ács, Mohit Raj, David Yarowsky, Ryan Cotterell, Ben Ambridge, Ekaterina Vylomova

This year's iteration of the SIGMORPHON Shared Task on morphological reinflection focuses on typological diversity and cross-lingual variation of morphosyntactic features.

Paper
Add Code

WebIE: Faithful and Robust Information Extraction on the Web

no code implementations • 23 May 2023 • Chenxi Whitehouse, Clara Vania, Alham Fikri Aji, Christos Christodoulopoulos, Andrea Pierleoni

We evaluate the in-domain, out-of-domain, and zero-shot cross-lingual performance of generative IE models and find models trained on WebIE show better generalisability.

Entity Linking

Paper
Add Code

Endowing Language Models with Multimodal Knowledge Graph Representations

1 code implementation • 27 Jun 2022 • Ningyuan Huang, Yash R. Deshpande, Yibo Liu, Houda Alberts, Kyunghyun Cho, Clara Vania, Iacer Calixto

We use the recently released VisualSem KG as our external knowledge repository, which covers a subset of Wikipedia and WordNet entities, and compare a mix of tuple-based and graph-based algorithms to learn entity and relation representations that are grounded on the KG multimodal information.

Multilingual Named Entity Recognition named-entity-recognition +2

Paper
Code

UniMorph 4.0: Universal Morphology

no code implementations • LREC 2022 • Khuyagbaatar Batsuren, Omer Goldman, Salam Khalifa, Nizar Habash, Witold Kieraś, Gábor Bella, Brian Leonard, Garrett Nicolai, Kyle Gorman, Yustinus Ghanggo Ate, Maria Ryskina, Sabrina J. Mielke, Elena Budianskaya, Charbel El-Khaissi, Tiago Pimentel, Michael Gasser, William Lane, Mohit Raj, Matt Coler, Jaime Rafael Montoya Samame, Delio Siticonatzi Camaiteri, Benoît Sagot, Esaú Zumaeta Rojas, Didier López Francis, Arturo Oncevay, Juan López Bautista, Gema Celeste Silva Villegas, Lucas Torroba Hennigen, Adam Ek, David Guriel, Peter Dirix, Jean-Philippe Bernardy, Andrey Scherbakov, Aziyana Bayyr-ool, Antonios Anastasopoulos, Roberto Zariquiey, Karina Sheifer, Sofya Ganieva, Hilaria Cruz, Ritván Karahóǧa, Stella Markantonatou, George Pavlidis, Matvey Plugaryov, Elena Klyachko, Ali Salehi, Candy Angulo, Jatayu Baxi, Andrew Krizhanovsky, Natalia Krizhanovskaya, Elizabeth Salesky, Clara Vania, Sardana Ivanova, Jennifer White, Rowan Hall Maudslay, Josef Valvoda, Ran Zmigrod, Paula Czarnowska, Irene Nikkarinen, Aelita Salchak, Brijesh Bhatt, Christopher Straughn, Zoey Liu, Jonathan North Washington, Yuval Pinter, Duygu Ataman, Marcin Wolinski, Totok Suhardijanto, Anna Yablonskaya, Niklas Stoehr, Hossep Dolatian, Zahroh Nuriah, Shyam Ratan, Francis M. Tyers, Edoardo M. Ponti, Grant Aiton, Aryaman Arora, Richard J. Hatcher, Ritesh Kumar, Jeremiah Young, Daria Rodionova, Anastasia Yemelina, Taras Andrushko, Igor Marchenko, Polina Mashkovtseva, Alexandra Serova, Emily Prud'hommeaux, Maria Nepomniashchaya, Fausto Giunchiglia, Eleanor Chodroff, Mans Hulden, Miikka Silfverberg, Arya D. McCarthy, David Yarowsky, Ryan Cotterell, Reut Tsarfaty, Ekaterina Vylomova

The project comprises two major thrusts: a language-independent feature schema for rich morphological annotation and a type-level resource of annotated data in diverse languages realizing that schema.

Morphological Inflection

Paper
Add Code

IndoNLI: A Natural Language Inference Dataset for Indonesian

1 code implementation • EMNLP 2021 • Rahmad Mahendra, Alham Fikri Aji, Samuel Louvan, Fahrurrozi Rahman, Clara Vania

The expert-annotated data is used exclusively as a test set.

Natural Language Inference Sentence +1

Paper
Code

Comparing Test Sets with Item Response Theory

no code implementations • ACL 2021 • Clara Vania, Phu Mon Htut, William Huang, Dhara Mungra, Richard Yuanzhe Pang, Jason Phang, Haokun Liu, Kyunghyun Cho, Samuel R. Bowman

Recent years have seen numerous NLP datasets introduced to evaluate the performance of fine-tuned models on natural language understanding tasks.

Natural Language Understanding

Paper
Add Code

What Ingredients Make for an Effective Crowdsourcing Protocol for Difficult NLU Data Collection Tasks?

1 code implementation • ACL 2021 • Nikita Nangia, Saku Sugawara, Harsh Trivedi, Alex Warstadt, Clara Vania, Samuel R. Bowman

However, we find that training crowdworkers, and then using an iterative process of collecting data, sending feedback, and qualifying workers based on expert judgments is an effective means of collecting challenging data.

Multiple-choice Natural Language Understanding +1

Paper
Code

Asking Crowdworkers to Write Entailment Examples: The Best of Bad Options

1 code implementation • Asian Chapter of the Association for Computational Linguistics 2020 • Clara Vania, Ruijie Chen, Samuel R. Bowman

Using these protocols and a writing-based baseline, we collect several new English NLI datasets of over 3k examples each, each using a fixed amount of annotator time, but a varying number of examples to fit that time budget.

Natural Language Inference Transfer Learning

Paper
Code

CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models

1 code implementation • EMNLP 2020 • Nikita Nangia, Clara Vania, Rasika Bhalerao, Samuel R. Bowman

To measure some forms of social bias in language models against protected demographic groups in the US, we introduce the Crowdsourced Stereotype Pairs benchmark (CrowS-Pairs).

Paper
Code

VisualSem: A High-quality Knowledge Graph for Vision and Language

1 code implementation • EMNLP (MRL) 2021 • Houda Alberts, Teresa Huang, Yash Deshpande, Yibo Liu, Kyunghyun Cho, Clara Vania, Iacer Calixto

We also release a neural multi-modal retrieval model that can use images or sentences as inputs and retrieves entities in the KG.

Data Augmentation Natural Language Understanding +2

Paper
Code

Intermediate-Task Transfer Learning with Pretrained Language Models: When and Why Does It Work?

no code implementations • ACL 2020 • Yada Pruksachatkun, Jason Phang, Haokun Liu, Phu Mon Htut, Xiaoyi Zhang, Richard Yuanzhe Pang, Clara Vania, Katharina Kann, Samuel R. Bowman

However, we fail to observe more granular correlations between probing and target task performance, highlighting the need for further work on broad-coverage probing benchmarks.

coreference-resolution Natural Language Understanding +1

Paper
Add Code

English Intermediate-Task Training Improves Zero-Shot Cross-Lingual Transfer Too

no code implementations • Asian Chapter of the Association for Computational Linguistics 2020 • Jason Phang, Iacer Calixto, Phu Mon Htut, Yada Pruksachatkun, Haokun Liu, Clara Vania, Katharina Kann, Samuel R. Bowman

Intermediate-task training---fine-tuning a pretrained model on an intermediate task before fine-tuning again on the target task---often improves model performance substantially on language understanding tasks in monolingual English settings.

Ranked #20 on Zero-Shot Cross-Lingual Transfer on XTREME

Question Answering Retrieval +3

Paper
Add Code

Intermediate-Task Transfer Learning with Pretrained Models for Natural Language Understanding: When and Why Does It Work?

no code implementations • 1 May 2020 • Yada Pruksachatkun, Jason Phang, Haokun Liu, Phu Mon Htut, Xiaoyi Zhang, Richard Yuanzhe Pang, Clara Vania, Katharina Kann, Samuel R. Bowman

However, we fail to observe more granular correlations between probing and target task performance, highlighting the need for further work on broad-coverage probing benchmarks.

coreference-resolution Natural Language Understanding +1

Paper
Add Code

A systematic comparison of methods for low-resource dependency parsing on genuinely low-resource languages

no code implementations • IJCNLP 2019 • Clara Vania, Yova Kementchedjhieva, Anders Søgaard, Adam Lopez

Parsers are available for only a handful of the world's languages, since they require lots of training data.

Data Augmentation Dependency Parsing +1

Paper
Add Code

LINSPECTOR: Multilingual Probing Tasks for Word Representations

3 code implementations • CL 2020 • Gözde Gül Şahin, Clara Vania, Ilia Kuznetsov, Iryna Gurevych

We present a reusable methodology for creation and evaluation of such tests in a multilingual setting.

Dependency Parsing named-entity-recognition +8

Paper
Code

Explicitly modeling case improves neural dependency parsing

no code implementations • WS 2018 • Clara Vania, Adam Lopez

Neural dependency parsing models that compose word representations from characters can presumably exploit morphosyntax when making attachment decisions.

Dependency Parsing Multi-Task Learning

Paper
Add Code

What do character-level models learn about morphology? The case of dependency parsing

no code implementations • EMNLP 2018 • Clara Vania, Andreas Grivas, Adam Lopez

When parsing morphologically-rich languages with neural models, it is beneficial to model input at the character level, and it has been claimed that this is because character-level models learn morphology.

Dependency Parsing Morphological Analysis