Search Results for author: Massimo Poesio

Found 58 papers, 12 papers with code

Patterns of Polysemy and Homonymy in Contextualised Language Models

no code implementations Findings (EMNLP) 2021 Janosch Haber, Massimo Poesio

One of the central aspects of contextualised language models is that they should be able to distinguish the meaning of lexically ambiguous words by their contexts.

Anaphoric Zero Pronoun Identification: A Multilingual Approach

no code implementations COLING (CRAC) 2020 Abdulrahman Aloraini, Massimo Poesio

We propose a BERT-based multilingual model for AZP identification from predicted zero pronoun positions, and evaluate it on the Arabic and Chinese portions of OntoNotes 5.0.

Transfer Learning

Word Sense Distance in Human Similarity Judgements and Contextualised Word Embeddings

no code implementations PaM 2020 Janosch Haber, Massimo Poesio

Homonymy is often used to showcase one of the advantages of context-sensitive word embedding techniques such as ELMo and BERT.

Word Embeddings

Scoring Coreference Chains with Split-Antecedent Anaphors

1 code implementation 24 May 2022 Silviu Paun, Juntao Yu, Nafise Sadat Moosavi, Massimo Poesio

Anaphoric reference is an aspect of language interpretation covering a variety of types of interpretation beyond the simple case of identity reference to entities introduced via nominal expressions covered by the traditional coreference task in its most recent incarnation in ONTONOTES and similar datasets.

Patterns of Lexical Ambiguity in Contextualised Language Models

no code implementations 27 Sep 2021 Janosch Haber, Massimo Poesio

One of the central aspects of contextualised language models is that they should be able to distinguish the meaning of lexically ambiguous words by their contexts.

Coreference Resolution for the Biomedical Domain: A Survey

no code implementations CRAC (ACL) 2021 Pengcheng Lu, Massimo Poesio

Issues with coreference resolution are one of the most frequently mentioned challenges for information extraction from the biomedical literature.

Coreference Resolution

Data Augmentation Methods for Anaphoric Zero Pronouns

no code implementations CRAC (ACL) 2021 Abdulrahman Aloraini, Massimo Poesio

In pro-drop languages like Arabic, Chinese, Italian, Japanese, Spanish, and many others, unrealized (null) arguments in certain syntactic positions can refer to a previously introduced entity, and are thus called anaphoric zero pronouns.

Data Augmentation

SemEval-2021 Task 12: Learning with Disagreements

no code implementations SEMEVAL 2021 Alexandra Uma, Tommaso Fornaciari, Anca Dumitrache, Tristan Miller, Jon Chamberlain, Barbara Plank, Edwin Simpson, Massimo Poesio

Disagreement between coders is ubiquitous in virtually all datasets annotated with human judgements in both natural language processing and computer vision.

Computer Vision Natural Language Processing

Stay Together: A System for Single and Split-antecedent Anaphora Resolution

1 code implementation NAACL 2021 Juntao Yu, Nafise Sadat Moosavi, Silviu Paun, Massimo Poesio

Split-antecedent anaphora is rarer and more complex to resolve than single-antecedent anaphora; as a result, it is not annotated in many datasets designed to test coreference, and previous work on resolving this type of anaphora was carried out in unrealistic conditions that assume gold mentions and/or gold split-antecedent anaphors are available.

Neural Coreference Resolution for Arabic

1 code implementation COLING (CRAC) 2020 Abdulrahman Aloraini, Juntao Yu, Massimo Poesio

No neural coreference resolver for Arabic exists; in fact, we are not aware of any learning-based coreference resolver for Arabic since (Björkelund and Kuhn, 2014).

Coreference Resolution

Named Entity Recognition as Dependency Parsing

1 code implementation ACL 2020 Juntao Yu, Bernd Bohnet, Massimo Poesio

Named Entity Recognition (NER) is a fundamental task in Natural Language Processing, concerned with identifying spans of text expressing references to entities.

Dependency Parsing named-entity-recognition +3

Aggregation Driven Progression System for GWAPs

no code implementations LREC 2020 Osman Doruk Kicikoglu, Richard Bartle, Jon Chamberlain, Silviu Paun, Massimo Poesio

As the use of Games-With-A-Purpose (GWAPs) broadens, the systems that incorporate them have expanded in complexity.

Cross-lingual Zero Pronoun Resolution

no code implementations LREC 2020 Abdulrahman Aloraini, Massimo Poesio

In languages like Arabic, Chinese, Italian, Japanese, Korean, Portuguese, Spanish, and many others, predicate arguments in certain syntactic positions are not realized instead of being realized as overt pronouns, and are thus called zero- or null-pronouns.

Machine Translation Translation

A Cluster Ranking Model for Full Anaphora Resolution

1 code implementation LREC 2020 Juntao Yu, Alexandra Uma, Massimo Poesio

In this paper, we introduce an architecture to simultaneously identify non-referring expressions (including expletives, predicative NPs, and other types) and build coreference chains, including singletons.

Coreference Resolution

A Mention-Pair Model of Annotation with Nonparametric User Communities

no code implementations 25 Sep 2019 Silviu Paun, Juntao Yu, Jon Chamberlain, Udo Kruschwitz, Massimo Poesio

The model is also flexible enough to be used in standard annotation tasks for classification where it registers on par performance with the state of the art.

Neural Mention Detection

1 code implementation LREC 2020 Juntao Yu, Bernd Bohnet, Massimo Poesio

We then evaluate our models for coreference resolution by using mentions predicted by our best model in state-of-the-art coreference systems.

Coreference Resolution NER

Crowdsourcing and Aggregating Nested Markable Annotations

1 code implementation ACL 2019 Chris Madge, Juntao Yu, Jon Chamberlain, Udo Kruschwitz, Silviu Paun, Massimo Poesio

One of the key steps in language resource creation is the identification of the text segments to be annotated, or markables, which depending on the task may vary from nominal chunks for named entity resolution to (potentially nested) noun phrases in coreference resolution (or mentions) to larger text segments in text segmentation.

Coreference Resolution Entity Resolution +1

A Crowdsourced Corpus of Multiple Judgments and Disagreement on Anaphoric Interpretation

no code implementations NAACL 2019 Massimo Poesio, Jon Chamberlain, Silviu Paun, Juntao Yu, Alexandra Uma, Udo Kruschwitz

The corpus, containing annotations for about 108,000 markables, is one of the largest corpora for coreference for English, and one of the largest crowdsourced NLP corpora, but its main feature is the large number of judgments per markable: 20 on average, and over 2.2M in total.

Anaphora Resolution with the ARRAU Corpus

no code implementations WS 2018 Massimo Poesio, Yulia Grishina, Varada Kolhatkar, Nafise Moosavi, Ina Roesiger, Adam Roussel, Fabian Simonjetz, Alexandra Uma, Olga Uryupina, Juntao Yu, Heike Zinsmeister

The most distinctive feature of the corpus is the annotation of a wide range of anaphoric relations, including bridging references and discourse deixis in addition to identity (coreference).

Comparing Bayesian Models of Annotation

no code implementations TACL 2018 Silviu Paun, Bob Carpenter, Jon Chamberlain, Dirk Hovy, Udo Kruschwitz, Massimo Poesio

We evaluate these models along four aspects: comparison to gold labels, predictive accuracy for new annotations, annotator characterization, and item difficulty, using four datasets with varying degrees of noise in the form of random (spammy) annotators.

Model Selection Natural Language Processing

Incongruent Headlines: Yet Another Way to Mislead Your Readers

no code implementations WS 2017 Sophie Chesney, Maria Liakata, Massimo Poesio, Matthew Purver

This paper discusses the problem of incongruent headlines: those which do not accurately represent the information contained in the article with which they occur.

Natural Language Processing

Visually Grounded and Textual Semantic Models Differentially Decode Brain Activity Associated with Concrete and Abstract Nouns

no code implementations TACL 2017 Andrew J. Anderson, Douwe Kiela, Stephen Clark, Massimo Poesio

Dual coding theory considers concrete concepts to be encoded in the brain both linguistically and visually, and abstract concepts only linguistically.

Phrase Detectives Corpus 1.0 Crowdsourced Anaphoric Coreference.

no code implementations LREC 2016 Jon Chamberlain, Massimo Poesio, Udo Kruschwitz

Corpora are typically annotated by several experts to create a gold standard; however, there are now compelling reasons to use a non-expert crowd to annotate text, driven by cost, speed and scalability.

ARRAU: Linguistically-Motivated Annotation of Anaphoric Descriptions

no code implementations LREC 2016 Olga Uryupina, Ron Artstein, Antonella Bristot, Federica Cavicchio, Kepa Rodriguez, Massimo Poesio

This paper presents a second release of the ARRAU dataset: a multi-domain corpus with thorough linguistically motivated annotation of anaphora and related phenomena.

The OnForumS corpus from the Shared Task on Online Forum Summarisation at MultiLing 2015

no code implementations LREC 2016 Mijail Kabadjov, Udo Kruschwitz, Massimo Poesio, Josef Steinberger, Jorge Valderrama, Hugo Zaragoza

In this paper we present the OnForumS corpus developed for the shared task of the same name on Online Forum Summarisation (OnForumS at MultiLing'15).

Combining Minimally-supervised Methods for Arabic Named Entity Recognition

no code implementations TACL 2015 Maha Althobaiti, Udo Kruschwitz, Massimo Poesio

Supervised methods can achieve high performance on NLP tasks, such as Named Entity Recognition (NER), but new annotations are required for every new domain and/or genre change.

named-entity-recognition NER +1

PR2: A Language Independent Unsupervised Tool for Personality Recognition from Text

no code implementations 12 Feb 2014 Fabio Celli, Massimo Poesio

We present PR2, a personality recognition system available online, that performs instance-based classification of Big5 personality types from unstructured text, using language-independent features.

General Classification

DeCour: a corpus of DEceptive statements in Italian COURts

no code implementations LREC 2012 Tommaso Fornaciari, Massimo Poesio

In criminal proceedings, sometimes it is not easy to evaluate the sincerity of oral testimonies.

Deception Detection
