Search Results for author: Silviu Paun

Found 14 papers, 4 papers with code

We Need to Consider Disagreement in Evaluation

no code implementations ACL (BPPF) 2021 Valerio Basile, Michael Fell, Tommaso Fornaciari, Dirk Hovy, Silviu Paun, Barbara Plank, Massimo Poesio, Alexandra Uma

Instead, we suggest that we need to better capture the sources of disagreement to improve today’s evaluation practice.
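To make this concrete, here is a minimal sketch, not from the paper, of scoring a prediction against the full set of annotator labels rather than a single adjudicated gold label; the item and label names are hypothetical:

```python
from collections import Counter

def soft_accuracy(pred_label, annotations):
    """Credit a prediction by the fraction of annotators who chose it,
    instead of 0/1 agreement with a single adjudicated gold label."""
    counts = Counter(annotations)
    return counts[pred_label] / len(annotations)

# Hypothetical item on which six annotators genuinely disagree.
annotations = ["offensive", "offensive", "not", "offensive", "not", "offensive"]
print(soft_accuracy("offensive", annotations))  # ~0.67 rather than a hard 1
print(soft_accuracy("not", annotations))        # ~0.33 rather than a hard 0
```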

The Universal Anaphora Scorer

no code implementations LREC 2022 Juntao Yu, Sopan Khosla, Nafise Sadat Moosavi, Silviu Paun, Sameer Pradhan, Massimo Poesio

It also supports the evaluation of split antecedent anaphora and discourse deixis, for which no tools existed.

Aggregating Crowdsourced and Automatic Judgments to Scale Up a Corpus of Anaphoric Reference for Fiction and Wikipedia Texts

no code implementations 11 Oct 2022 Juntao Yu, Silviu Paun, Maris Camilleri, Paloma Carretero Garcia, Jon Chamberlain, Udo Kruschwitz, Massimo Poesio

Although several datasets annotated for anaphoric reference/coreference exist, even the largest such datasets have limitations in terms of size, range of domains, coverage of anaphoric phenomena, and size of documents included.

Scoring Coreference Chains with Split-Antecedent Anaphors

1 code implementation 24 May 2022 Silviu Paun, Juntao Yu, Nafise Sadat Moosavi, Massimo Poesio

Anaphoric reference is an aspect of language interpretation covering a variety of types of interpretation beyond simple identity reference to entities introduced via nominal expressions, the case covered by the traditional coreference task in its most recent incarnation in ONTONOTES and similar datasets.
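For instance, in "John met Mary and they left", the plural pronoun has two antecedents introduced by separate nominal expressions. The sketch below is illustrative only and is not the metric proposed in the paper (which extends existing coreference metrics); it simply gives partial credit for a split-antecedent anaphor via set overlap:

```python
def split_antecedent_credit(pred_antecedents, gold_antecedents):
    """Illustrative partial credit for a split-antecedent anaphor:
    Jaccard overlap between predicted and gold antecedent sets."""
    pred, gold = set(pred_antecedents), set(gold_antecedents)
    return len(pred & gold) / len(pred | gold)

# "John met Mary and they left": 'they' has two antecedents.
gold = {"John", "Mary"}
print(split_antecedent_credit({"John"}, gold))          # 0.5
print(split_antecedent_credit({"John", "Mary"}, gold))  # 1.0
```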

Stay Together: A System for Single and Split-antecedent Anaphora Resolution

1 code implementation NAACL 2021 Juntao Yu, Nafise Sadat Moosavi, Silviu Paun, Massimo Poesio

Split-antecedent anaphora is rarer and more complex to resolve than single-antecedent anaphora; as a result, it is not annotated in many datasets designed to test coreference, and previous work on resolving this type of anaphora was carried out in unrealistic conditions that assume gold mentions and/or gold split-antecedent anaphors are available.

Aggregating and Learning from Multiple Annotators

no code implementations EACL 2021 Silviu Paun, Edwin Simpson

There is also a growing body of recent work arguing that following the convention and training with adjudicated labels ignores any uncertainty the labellers had in their classifications, which results in models with poorer generalisation capabilities.
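A minimal sketch of the alternative this line of work argues for, assuming hypothetical annotation counts: train against the empirical distribution of annotator labels, so the loss preserves their uncertainty:

```python
import numpy as np

def soft_label_loss(model_probs, annotation_counts):
    """Cross-entropy against the empirical annotator label distribution
    rather than a single adjudicated (hard) label."""
    target = annotation_counts / annotation_counts.sum()
    return -np.sum(target * np.log(model_probs))

# Hypothetical item: 7 of 10 annotators chose class 0, 3 chose class 1.
counts = np.array([7.0, 3.0])
print(soft_label_loss(np.array([0.70, 0.30]), counts))  # lower: matches the split
print(soft_label_loss(np.array([0.99, 0.01]), counts))  # higher: overconfident
```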

Aggregation Driven Progression System for GWAPs

no code implementations LREC 2020 Osman Doruk Kicikoglu, Richard Bartle, Jon Chamberlain, Silviu Paun, Massimo Poesio

As the uses of Games-With-A-Purpose (GWAPs) broaden, the systems that incorporate them have grown in complexity.

A Mention-Pair Model of Annotation with Nonparametric User Communities

no code implementations 25 Sep 2019 Silviu Paun, Juntao Yu, Jon Chamberlain, Udo Kruschwitz, Massimo Poesio

The model is also flexible enough to be used in standard annotation tasks for classification, where it achieves performance on par with the state of the art.

Crowdsourcing and Aggregating Nested Markable Annotations

1 code implementation ACL 2019 Chris Madge, Juntao Yu, Jon Chamberlain, Udo Kruschwitz, Silviu Paun, Massimo Poesio

One of the key steps in language resource creation is the identification of the text segments to be annotated, or markables, which depending on the task may vary from nominal chunks for named entity resolution, to (potentially nested) noun phrases, or mentions, in coreference resolution, to larger text segments in text segmentation.

Coreference Resolution, Entity Resolution +1
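As an illustration of nested markables, spans can be kept as token offsets, and nesting reduces to span containment; a minimal sketch with a hypothetical example, not the annotation pipeline from the paper:

```python
def is_nested(inner, outer):
    """True if span `inner` lies inside span `outer` without being equal
    (token offsets, end-exclusive)."""
    return outer[0] <= inner[0] and inner[1] <= outer[1] and inner != outer

tokens = ["the", "mayor", "of", "Boston"]
outer, inner = (0, 4), (3, 4)  # the full NP and the nested NP "Boston"
print(is_nested(inner, outer), tokens[inner[0]:inner[1]])  # True ['Boston']
```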

A Crowdsourced Corpus of Multiple Judgments and Disagreement on Anaphoric Interpretation

no code implementations NAACL 2019 Massimo Poesio, Jon Chamberlain, Silviu Paun, Juntao Yu, Alexandra Uma, Udo Kruschwitz

The corpus, containing annotations for about 108,000 markables, is one of the largest corpora for coreference for English, and one of the largest crowdsourced NLP corpora, but its main feature is the large number of judgments per markable: 20 on average, and over 2.2M in total.

Comparing Bayesian Models of Annotation

no code implementations TACL 2018 Silviu Paun, Bob Carpenter, Jon Chamberlain, Dirk Hovy, Udo Kruschwitz, Massimo Poesio

We evaluate these models along four aspects: comparison to gold labels, predictive accuracy for new annotations, annotator characterization, and item difficulty, using four datasets with varying degrees of noise in the form of random (spammy) annotators.

Model Selection
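One classic member of this model family is the Dawid and Skene (1979) confusion-matrix model; below is a minimal EM sketch of that style of aggregation (the paper compares fully Bayesian variants with priors and pooling, which this sketch omits):

```python
import numpy as np

def dawid_skene(labels, n_classes, n_iter=50):
    """Minimal Dawid & Skene (1979) style EM aggregation (sketch only).
    labels[i] maps annotator id -> class chosen for item i."""
    n_items = len(labels)
    annotators = {a for row in labels for a in row}
    # Initialise item-class posteriors from per-item vote proportions.
    T = np.zeros((n_items, n_classes))
    for i, row in enumerate(labels):
        for c in row.values():
            T[i, c] += 1
    T /= T.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # M-step: class prior and one confusion matrix per annotator.
        pi = T.mean(axis=0)
        conf = {a: np.full((n_classes, n_classes), 1e-6) for a in annotators}
        for i, row in enumerate(labels):
            for a, c in row.items():
                conf[a][:, c] += T[i]
        for a in annotators:
            conf[a] /= conf[a].sum(axis=1, keepdims=True)
        # E-step: re-estimate item posteriors given prior and confusions.
        logT = np.tile(np.log(pi), (n_items, 1))
        for i, row in enumerate(labels):
            for a, c in row.items():
                logT[i] += np.log(conf[a][:, c])
        T = np.exp(logT - logT.max(axis=1, keepdims=True))
        T /= T.sum(axis=1, keepdims=True)
    return T

# Two items, three annotators; annotator 2 systematically flips labels,
# which the per-annotator confusion matrix can capture.
labels = [{0: 0, 1: 0, 2: 1}, {0: 1, 1: 1, 2: 0}]
print(dawid_skene(labels, n_classes=2).round(2))
```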
