no code implementations • GermEval 2021 • Kwabena Odame Akomeah, Udo Kruschwitz, Bernd Ludwig
Exploring the issue of overfitting, we uncovered that, due to a bug in the pipeline, the runs we submitted had not been trained on the full training set but only on a small subset.
1 code implementation • LREC 2022 • Christoph Turban, Udo Kruschwitz
A different, more recent development is the automatic augmentation of training data.
no code implementations • SemEval (NAACL) 2022 • Selina Meyer, Maximilian Schmidhuber, Udo Kruschwitz
In this description paper we outline the system architecture submitted to Task 4, Subtask 1 at SemEval-2022.
1 code implementation • 6 Oct 2024 • Sabrina Guidotti, Gregor Donabauer, Simone Somazzi, Udo Kruschwitz, Davide Taibi, Dimitri Ognibene
The widespread use of social media has highlighted potential negative impacts on society and individuals, largely driven by recommendation algorithms that shape user behavior and social dynamics.
1 code implementation • 18 Jul 2024 • Samy Ateia, Udo Kruschwitz
We participated in the 12th BioASQ challenge, which is a retrieval-augmented generation (RAG) setting, and explored the performance of the current models Claude 3 Opus, GPT-3.5-turbo and Mixtral 8x7B with in-context learning (zero-shot, few-shot) and QLoRA fine-tuning.
1 code implementation • 12 Apr 2024 • Wan-Hua Her, Udo Kruschwitz
Machine Translation has made impressive progress in recent years, offering close to human-level performance on many languages, but studies have primarily focused on high-resource languages with a broad online presence and ample resources.
1 code implementation • 28 Feb 2024 • Gregor Donabauer, Udo Kruschwitz
Pre-training of neural networks has recently revolutionized the field of Natural Language Processing (NLP), having previously demonstrated its effectiveness in computer vision.
1 code implementation • 28 Jun 2023 • Samy Ateia, Udo Kruschwitz
We assessed the performance of the commercial Large Language Models (LLMs) GPT-3.5-Turbo and GPT-4 on tasks from the 2023 BioASQ challenge.
1 code implementation • 13 Dec 2022 • Gregor Donabauer, Udo Kruschwitz
Fake news detection has become a research area that goes far beyond purely academic interest, as it has direct implications for our society as a whole.
no code implementations • 11 Oct 2022 • Juntao Yu, Silviu Paun, Maris Camilleri, Paloma Carretero Garcia, Jon Chamberlain, Udo Kruschwitz, Massimo Poesio
Although several datasets annotated for anaphoric reference/coreference exist, even the largest such datasets have limitations in terms of size, range of domains, coverage of anaphoric phenomena, and size of documents included.
1 code implementation • LREC 2022 • Miriam Schirmer, Udo Kruschwitz, Gregor Donabauer
Recent progress in natural language processing has been impressive in many different areas with transformer-based approaches setting new benchmarks for a wide range of applications.
1 code implementation • LREC 2022 • Philipp Hartl, Udo Kruschwitz
The distribution of fake news is not a new problem, but it is a rapidly growing one.
no code implementations • GermEval 2021 • Hoai Nam Tran, Udo Kruschwitz
This paper describes our approach (ur-iw-hnt) for the Shared Task of GermEval2021 to identify toxic, engaging, and fact-claiming comments.
1 code implementation • 25 Jun 2021 • Tony Russell-Rose, Philip Gooch, Udo Kruschwitz
Knowledge workers (such as healthcare information professionals, patent agents and recruitment professionals) undertake work tasks where search forms a core part of their duties.
no code implementations • LREC 2020 • Jon Chamberlain, Udo Kruschwitz, Massimo Poesio
Crowdsourcing approaches pose a difficult design challenge for developers.
no code implementations • 25 Sep 2019 • Silviu Paun, Juntao Yu, Jon Chamberlain, Udo Kruschwitz, Massimo Poesio
The model is also flexible enough to be used in standard annotation tasks for classification where it registers on par performance with the state of the art.
1 code implementation • ACL 2019 • Chris Madge, Juntao Yu, Jon Chamberlain, Udo Kruschwitz, Silviu Paun, Massimo Poesio
One of the key steps in language resource creation is the identification of the text segments to be annotated, or markables, which, depending on the task, may vary from nominal chunks for named entity resolution, to (potentially nested) noun phrases in coreference resolution (or mentions), to larger text segments in text segmentation.
no code implementations • NAACL 2019 • Massimo Poesio, Jon Chamberlain, Silviu Paun, Juntao Yu, Alexandra Uma, Udo Kruschwitz
The corpus, containing annotations for about 108,000 markables, is one of the largest corpora for coreference for English, and one of the largest crowdsourced NLP corpora, but its main feature is the large number of judgments per markable: 20 on average, and over 2.2M in total.
no code implementations • 11 May 2019 • Suzan Verberne, Jiyin He, Gineke Wiggers, Tony Russell-Rose, Udo Kruschwitz, Arjen P. de Vries
Search conducted in a work context is an everyday activity that has been around since long before the Web was invented, yet we still seem to understand little about its general characteristics.
no code implementations • EMNLP 2018 • Silviu Paun, Jon Chamberlain, Udo Kruschwitz, Juntao Yu, Massimo Poesio
The availability of large scale annotated corpora for coreference is essential to the development of the field.
no code implementations • ICLR 2018 • Dino S. Ratcliffe, Luca Citi, Sam Devlin, Udo Kruschwitz
Many deep reinforcement learning approaches use graphical state representations; as a result, visually distinct games that share the same underlying structure cannot effectively share knowledge.
no code implementations • TACL 2018 • Silviu Paun, Bob Carpenter, Jon Chamberlain, Dirk Hovy, Udo Kruschwitz, Massimo Poesio
We evaluate these models along four aspects: comparison to gold labels, predictive accuracy for new annotations, annotator characterization, and item difficulty, using four datasets with varying degrees of noise in the form of random (spammy) annotators.
no code implementations • LREC 2016 • Jon Chamberlain, Massimo Poesio, Udo Kruschwitz
Corpora are typically annotated by several experts to create a gold standard; however, there are now compelling reasons to use a non-expert crowd to annotate text, driven by cost, speed and scalability.
no code implementations • LREC 2016 • Mijail Kabadjov, Udo Kruschwitz, Massimo Poesio, Josef Steinberger, Jorge Valderrama, Hugo Zaragoza
In this paper we present the OnForumS corpus developed for the shared task of the same name on Online Forum Summarisation (OnForumS at MultiLing'15).
no code implementations • LREC 2016 • Ayman Alhelbawy, Massimo Poesio, Udo Kruschwitz
In this paper we present a new corpus of Arabic tweets that mention some form of violent event, developed to support the automatic identification of Human Rights Abuse.
no code implementations • TACL 2015 • Maha Althobaiti, Udo Kruschwitz, Massimo Poesio
Supervised methods can achieve high performance on NLP tasks, such as Named Entity Recognition (NER), but new annotations are required for every new domain and/or genre change.
no code implementations • LREC 2014 • Maha Althobaiti, Udo Kruschwitz, Massimo Poesio
We present a free, Java-based library named "AraNLP" that covers various Arabic text preprocessing tools.
no code implementations • LREC 2012 • Danica Damljanović, Udo Kruschwitz, M-Dyaa Albakour, Johann Petrak, Mihai Lupu
Our approach is based on exploiting the structure inherent in an RDF graph and then applying the methods from statistical semantics, and in particular, Random Indexing, in order to discover contextually related terms.
no code implementations • LREC 2012 • Ahmet Aker, Mahmoud El-Haj, M-Dyaa Albakour, Udo Kruschwitz
For each run we assessed the results provided by 25 workers on a set of 10 tasks.