no code implementations • Findings (EMNLP) 2021 • Janosch Haber, Massimo Poesio
One of the central aspects of contextualised language models is that they should be able to distinguish the meaning of lexically ambiguous words by their contexts.
no code implementations • ACL 2022 • Tommaso Fornaciari, Alexandra Uma, Massimo Poesio, Dirk Hovy
Natural Language Processing (NLP)’s applied nature makes it necessary to select the most effective and robust models.
no code implementations • ACL (CODI, CRAC) 2021 • Sopan Khosla, Juntao Yu, Ramesh Manuvinakurike, Vincent Ng, Massimo Poesio, Michael Strube, Carolyn Rosé
In this paper, we provide an overview of the CODI-CRAC 2021 Shared-Task: Anaphora Resolution in Dialogue.
no code implementations • COLING (CRAC) 2020 • Abdulrahman Aloraini, Massimo Poesio
We propose a BERT-based multilingual model for AZP identification from predicted zero pronoun positions, and evaluate it on the Arabic and Chinese portions of OntoNotes 5.0.
no code implementations • PaM 2020 • Janosch Haber, Massimo Poesio
Homonymy is often used to showcase one of the advantages of context-sensitive word embedding techniques such as ELMo and BERT.
no code implementations • COLING (WANLP) 2020 • Abdulrahman Aloraini, Massimo Poesio, Ayman Alhelbawy
We present the Arabic dialect identification system that we used for the country-level subtask of the NADI challenge.
no code implementations • ACL (BPPF) 2021 • Valerio Basile, Michael Fell, Tommaso Fornaciari, Dirk Hovy, Silviu Paun, Barbara Plank, Massimo Poesio, Alexandra Uma
Instead, we suggest that we need to better capture the sources of disagreement to improve today’s evaluation practice.
1 code implementation • 24 May 2022 • Silviu Paun, Juntao Yu, Nafise Sadat Moosavi, Massimo Poesio
Anaphoric reference is an aspect of language interpretation covering a variety of types of interpretation beyond the simple case of identity reference to entities introduced via nominal expressions covered by the traditional coreference task in its most recent incarnation in ONTONOTES and similar datasets.
no code implementations • 27 Sep 2021 • Janosch Haber, Massimo Poesio
One of the central aspects of contextualised language models is that they should be able to distinguish the meaning of lexically ambiguous words by their contexts.
no code implementations • CRAC (ACL) 2021 • Pengcheng Lu, Massimo Poesio
Issues with coreference resolution are one of the most frequently mentioned challenges for information extraction from the biomedical literature.
no code implementations • CRAC (ACL) 2021 • Abdulrahman Aloraini, Massimo Poesio
In pro-drop languages like Arabic, Chinese, Italian, Japanese, Spanish, and many others, unrealized (null) arguments in certain syntactic positions can refer to a previously introduced entity, and are thus called anaphoric zero pronouns.
no code implementations • SEMEVAL 2021 • Alexandra Uma, Tommaso Fornaciari, Anca Dumitrache, Tristan Miller, Jon Chamberlain, Barbara Plank, Edwin Simpson, Massimo Poesio
Disagreement between coders is ubiquitous in virtually all datasets annotated with human judgements in both natural language processing and computer vision.
no code implementations • NAACL 2021 • Tommaso Fornaciari, Alexandra Uma, Silviu Paun, Barbara Plank, Dirk Hovy, Massimo Poesio
Supervised learning assumes that a ground truth label exists.
1 code implementation • NAACL 2021 • Juntao Yu, Nafise Sadat Moosavi, Silviu Paun, Massimo Poesio
Split-antecedent anaphora is rarer and more complex to resolve than single-antecedent anaphora; as a result, it is not annotated in many datasets designed to test coreference, and previous work on resolving this type of anaphora was carried out in unrealistic conditions that assume gold mentions and/or gold split-antecedent anaphors are available.
no code implementations • EACL 2021 • Tommaso Fornaciari, Federico Bianchi, Massimo Poesio, Dirk Hovy
In most cases, however, the target texts’ preceding context is not considered.
1 code implementation • COLING 2020 • Juntao Yu, Massimo Poesio
can be achieved on full bridging resolution with this architecture.
no code implementations • Joint Conference on Lexical and Computational Semantics 2020 • Janosch Haber, Massimo Poesio
Co-predication is one of the most frequently used linguistic tests to tell apart shifts in polysemic sense from changes in homonymic meaning.
1 code implementation • COLING (CRAC) 2020 • Abdulrahman Aloraini, Juntao Yu, Massimo Poesio
No neural coreference resolver for Arabic exists; in fact, we are not aware of any learning-based coreference resolver for Arabic since (Bjorkelund and Kuhn, 2014).
1 code implementation • COLING 2020 • Juntao Yu, Nafise Sadat Moosavi, Silviu Paun, Massimo Poesio
One limitation of virtually all coreference resolution models is the focus on single-antecedent anaphors.
1 code implementation • ACL 2020 • Juntao Yu, Bernd Bohnet, Massimo Poesio
Named Entity Recognition (NER) is a fundamental task in Natural Language Processing, concerned with identifying spans of text expressing references to entities.
Ranked #1 on Named Entity Recognition on GENIA
no code implementations • LREC 2020 • Osman Doruk Kicikoglu, Richard Bartle, Jon Chamberlain, Silviu Paun, Massimo Poesio
As the uses of Games-With-A-Purpose (GWAPs) broaden, the systems that incorporate them have expanded in complexity.
no code implementations • LREC 2020 • Abdulrahman Aloraini, Massimo Poesio
In languages like Arabic, Chinese, Italian, Japanese, Korean, Portuguese, Spanish, and many others, predicate arguments in certain syntactic positions are not realized instead of being realized as overt pronouns, and are thus called zero- or null-pronouns.
no code implementations • LREC 2020 • Jon Chamberlain, Udo Kruschwitz, Massimo Poesio
Crowdsourcing approaches provide a difficult design challenge for developers.
1 code implementation • 7 Mar 2020 • Juntao Yu, Massimo Poesio
can be achieved on full bridging resolution with this architecture.
1 code implementation • LREC 2020 • Juntao Yu, Alexandra Uma, Massimo Poesio
In this paper, we introduce an architecture to simultaneously identify non-referring expressions (including expletives, predicatives, and other types) and build coreference chains, including singletons.
Ranked #1 on Coreference Resolution on The ARRAU Corpus
1 code implementation • 25 Oct 2019 • Keith Vella, Massimo Poesio, Michael Sigamani, Cihan Dogan, Aimore Dutra, Dimitrios Dimakopoulos, Alfredo Gemma, Ella Walters
We present an automated evaluation method to measure fluidity in conversational dialogue systems.
no code implementations • 25 Sep 2019 • Silviu Paun, Juntao Yu, Jon Chamberlain, Udo Kruschwitz, Massimo Poesio
The model is also flexible enough to be used in standard annotation tasks for classification where it registers on par performance with the state of the art.
1 code implementation • LREC 2020 • Juntao Yu, Bernd Bohnet, Massimo Poesio
We then evaluate our models for coreference resolution by using mentions predicted by our best model in state-of-the-art coreference systems.
1 code implementation • ACL 2019 • Chris Madge, Juntao Yu, Jon Chamberlain, Udo Kruschwitz, Silviu Paun, Massimo Poesio
One of the key steps in language resource creation is the identification of the text segments to be annotated, or markables, which depending on the task may vary from nominal chunks for named entity resolution to (potentially nested) noun phrases in coreference resolution (or mentions) to larger text segments in text segmentation.
1 code implementation • ACL 2019 • Nafise Sadat Moosavi, Leo Born, Massimo Poesio, Michael Strube
To address this problem, minimum spans are manually annotated in smaller corpora.
no code implementations • NAACL 2019 • Massimo Poesio, Jon Chamberlain, Silviu Paun, Juntao Yu, Alexandra Uma, Udo Kruschwitz
The corpus, containing annotations for about 108,000 markables, is one of the largest corpora for coreference for English, and one of the largest crowdsourced NLP corpora, but its main feature is the large number of judgments per markable: 20 on average, and over 2.2M in total.
no code implementations • EMNLP 2018 • Silviu Paun, Jon Chamberlain, Udo Kruschwitz, Juntao Yu, Massimo Poesio
The availability of large scale annotated corpora for coreference is essential to the development of the field.
no code implementations • WS 2018 • Massimo Poesio, Yulia Grishina, Varada Kolhatkar, Nafise Moosavi, Ina Roesiger, Adam Roussel, Fabian Simonjetz, Alexandra Uma, Olga Uryupina, Juntao Yu, Heike Zinsmeister
The most distinctive feature of the corpus is the annotation of a wide range of anaphoric relations, including bridging references and discourse deixis in addition to identity (coreference).
no code implementations • TACL 2018 • Silviu Paun, Bob Carpenter, Jon Chamberlain, Dirk Hovy, Udo Kruschwitz, Massimo Poesio
We evaluate these models along four aspects: comparison to gold labels, predictive accuracy for new annotations, annotator characterization, and item difficulty, using four datasets with varying degrees of noise in the form of random (spammy) annotators.
no code implementations • WS 2017 • Sophie Chesney, Maria Liakata, Massimo Poesio, Matthew Purver
This paper discusses the problem of incongruent headlines: those which do not accurately represent the information contained in the article with which they occur.
no code implementations • TACL 2017 • Andrew J. Anderson, Douwe Kiela, Stephen Clark, Massimo Poesio
Dual coding theory considers concrete concepts to be encoded in the brain both linguistically and visually, and abstract concepts only linguistically.
no code implementations • WS 2016 • Fabio Celli, Evgeny Stepanov, Massimo Poesio, Giuseppe Riccardi
On June 23rd 2016, the UK held the referendum which ratified the exit from the EU.
no code implementations • LREC 2016 • Jon Chamberlain, Massimo Poesio, Udo Kruschwitz
Corpora are typically annotated by several experts to create a gold standard; however, there are now compelling reasons to use a non-expert crowd to annotate text, driven by cost, speed and scalability.
no code implementations • LREC 2016 • Olga Uryupina, Ron Artstein, Antonella Bristot, Federica Cavicchio, Kepa Rodriguez, Massimo Poesio
This paper presents a second release of the ARRAU dataset: a multi-domain corpus with thorough linguistically motivated annotation of anaphora and related phenomena.
no code implementations • LREC 2016 • Mijail Kabadjov, Udo Kruschwitz, Massimo Poesio, Josef Steinberger, Jorge Valderrama, Hugo Zaragoza
In this paper we present the OnForumS corpus developed for the shared task of the same name on Online Forum Summarisation (OnForumS at MultiLing’15).
no code implementations • TACL 2015 • Maha Althobaiti, Udo Kruschwitz, Massimo Poesio
Supervised methods can achieve high performance on NLP tasks, such as Named Entity Recognition (NER), but new annotations are required for every new domain and/or genre change.
no code implementations • LREC 2014 • Maha Althobaiti, Udo Kruschwitz, Massimo Poesio
We present a free, Java-based library named “AraNLP” that covers various Arabic text preprocessing tools.
no code implementations • 12 Feb 2014 • Fabio Celli, Massimo Poesio
We present PR2, a personality recognition system available online, that performs instance-based classification of Big5 personality types from unstructured text, using language-independent features.
no code implementations • LREC 2012 • Olga Uryupina, Massimo Poesio
Several corpora annotated for coreference have been made available in the past decade.
no code implementations • LREC 2012 • Tommaso Fornaciari, Massimo Poesio
In criminal proceedings, sometimes it is not easy to evaluate the sincerity of oral testimonies.