Search Results for author: Merel Scholman

Found 15 papers, 2 papers with code

Design Choices in Crowdsourcing Discourse Relation Annotations: The Effect of Worker Selection and Training

no code implementations LREC 2022 Merel Scholman, Valentina Pyatkin, Frances Yung, Ido Dagan, Reut Tsarfaty, Vera Demberg

The current contribution studies the effect of worker selection and training on the agreement on implicit relation labels between workers and gold labels, for both the DC and the QA method.


Establishing Annotation Quality in Multi-label Annotations

no code implementations COLING 2022 Marian Marchal, Merel Scholman, Frances Yung, Vera Demberg

In many linguistic fields requiring annotated data, multiple interpretations of a single item are possible.

A practical perspective on connective generation

no code implementations CODI 2021 Frances Yung, Merel Scholman, Vera Demberg

In the current contribution, we analyse whether a sophisticated connective generation module is necessary to select a connective, or whether this can be solved with simple methods (such as random choice between connectives that are known to express a given relation, or usage of a generic language model).

Language Modelling Relation +1

Semi-automatic discourse annotation in a low-resource language: Developing a connective lexicon for Nigerian Pidgin

no code implementations CODI 2021 Marian Marchal, Merel Scholman, Vera Demberg

The lexicon shows that the majority of Nigerian Pidgin connectives are borrowed from its English lexifier, but that there are also some connectives that are unique to Nigerian Pidgin.


Comparison of methods for explicit discourse connective identification across various domains

no code implementations CODI 2021 Merel Scholman, Tianai Dong, Frances Yung, Vera Demberg

Existing parsing methods use varying approaches to identify explicit discourse connectives, but their performance has not been consistently evaluated in comparison to each other, nor have they been evaluated consistently on text other than newspaper articles.

DiscoGeM: A Crowdsourced Corpus of Genre-Mixed Implicit Discourse Relations

1 code implementation LREC 2022 Merel Scholman, Tianai Dong, Frances Yung, Vera Demberg

Both the corpus and the dataset can facilitate a multitude of applications and research purposes, for example by serving as training data to improve the performance of automatic discourse relation parsers, or by supporting research into non-connective signals of discourse relations.

Relation Relation Classification

Label distributions help implicit discourse relation classification

no code implementations COLING (CODI, CRAC) 2022 Frances Yung, Kaveri Anuranjana, Merel Scholman, Vera Demberg

Implicit discourse relations can convey more than one relation sense, but much of the research on discourse relations has focused on single relation senses.

Classification Implicit Discourse Relation Classification +1

Modeling Orthographic Variation Improves NLP Performance for Nigerian Pidgin

no code implementations 28 Apr 2024 Pin-Jie Lin, Merel Scholman, Muhammed Saeed, Vera Demberg

We test the effect of this data augmentation on two critical NLP tasks: machine translation and sentiment analysis.

Data Augmentation Machine Translation +2

Prompting Implicit Discourse Relation Annotation

no code implementations 7 Feb 2024 Frances Yung, Mansoor Ahmad, Merel Scholman, Vera Demberg

Pre-trained large language models, such as ChatGPT, achieve outstanding performance in various reasoning tasks without supervised training and were found to have outperformed crowdsourcing workers.

Classification Implicit Discourse Relation Classification +3

Low-Resource Cross-Lingual Adaptive Training for Nigerian Pidgin

1 code implementation 1 Jul 2023 Pin-Jie Lin, Muhammed Saeed, Ernie Chang, Merel Scholman

In this work, we aim to improve both text classification and translation of Nigerian Pidgin (Naija) by collecting a large-scale parallel English-Pidgin corpus, and we further propose a framework of cross-lingual adaptive training that includes both continual and task-adaptive training so as to adapt a base pre-trained model to low-resource languages.

text-classification Text Classification +1

Crowdsourcing Discourse Relation Annotations by a Two-Step Connective Insertion Task

no code implementations WS 2019 Frances Yung, Vera Demberg, Merel Scholman

The prospect of being able to crowd-source coherence relations holds the promise of acquiring annotations for new texts quickly, which could then increase the size and variety of discourse-annotated corpora.

Relation Vocal Bursts Valence Prediction

How compatible are our discourse annotations? Insights from mapping RST-DT and PDTB annotations

no code implementations 28 Apr 2017 Vera Demberg, Fatemeh Torabi Asr, Merel Scholman

Discourse-annotated corpora are an important resource for the community, but they are often annotated according to different frameworks.

Implicit Relations Relation

Annotating Discourse Relations in Spoken Language: A Comparison of the PDTB and CCR Frameworks

no code implementations LREC 2016 Ines Rehbein, Merel Scholman, Vera Demberg

In discourse relation annotation, a variety of frameworks are currently in use, and most of them have been developed for and applied mainly to written data.

