Search Results for author: Torsten Zesch

Found 67 papers, 9 papers with code

Bye, Bye, Maintenance Work? Using Model Cloning to Approximate the Behavior of Legacy Tools

no code implementations • KONVENS (WS) 2022 • Piush Aggarwal, Torsten Zesch

LeSpell - A Multi-Lingual Benchmark Corpus of Spelling Errors to Develop Spellchecking Methods for Learner Language

1 code implementation • LREC 2022 • Marie Bexte, Ronja Laarmann-Quante, Andrea Horbach, Torsten Zesch

Spellchecking text written by language learners is especially challenging because errors made by learners differ both quantitatively and qualitatively from errors made by already proficient learners.

Robustness of end-to-end Automatic Speech Recognition Models – A Case Study using Mozilla DeepSpeech

no code implementations • KONVENS (WS) 2021 • Aashish Agarwal, Torsten Zesch

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +1

Improving Generalization of Hate Speech Detection Systems to Novel Target Groups via Domain Adaptation

no code implementations • NAACL (WOAH) 2022 • Florian Ludwig, Klara Dolos, Torsten Zesch, Eleanor Hobley

Despite recent advances in machine learning based hate speech detection, classifiers still struggle with generalizing knowledge to out-of-domain data samples.

Hate Speech Detection • Unsupervised Domain Adaptation

Implicit Phenomena in Short-answer Scoring Data

no code implementations • ACL (unimplicit) 2021 • Marie Bexte, Andrea Horbach, Torsten Zesch

We therefore quantify to what extent implicit language phenomena occur in short answer datasets and examine the influence they have on automatic scoring performance.

Word Embeddings

A Crash Course on Ethics for Natural Language Processing

no code implementations • NAACL (TeachingNLP) 2021 • Annemarie Friedrich, Torsten Zesch

It is generally agreed upon in the natural language processing (NLP) community that ethics should be integrated into any curriculum.

Ethics

Analyzing the Real Vulnerability of Hate Speech Detection Systems against Targeted Intentional Noise

no code implementations • COLING (WNUT) 2022 • Piush Aggarwal, Torsten Zesch

Hate speech detection systems have been shown to be vulnerable against obfuscation attacks, where a potential hater tries to circumvent detection by deliberately introducing noise in their posts.

Hate Speech Detection

‘Meet me at the ribary’ – Acceptability of spelling variants in free-text answers to listening comprehension prompts

no code implementations • NAACL (BEA) 2022 • Ronja Laarmann-Quante, Leska Schwarz, Andrea Horbach, Torsten Zesch

When listening comprehension is tested as a free-text production task, a challenge for scoring the answers is the resulting wide range of spelling variants.

Similarity-Based Content Scoring - How to Make S-BERT Keep Up With BERT

1 code implementation • NAACL (BEA) 2022 • Marie Bexte, Andrea Horbach, Torsten Zesch

The dominating paradigm for content scoring is to learn an instance-based model, i.e., to use lexical features derived from the learner answers themselves.

C-Test Collector: A Proficiency Testing Application to Collect Training Data for C-Tests

no code implementations • EACL (BEA) 2021 • Christian Haring, Rene Lehmann, Andrea Horbach, Torsten Zesch

We present the C-Test Collector, a web-based tool that allows language learners to test their proficiency level using c-tests.

VL-BERT+: Detecting Protected Groups in Hateful Multimodal Memes

no code implementations • ACL (WOAH) 2021 • Piush Aggarwal, Michelle Espranita Liman, Darina Gold, Torsten Zesch

This paper describes our submission (winning solution for Task A) to the Shared Task on Hateful Meme Detection at WOAH 2021.

Data Augmentation • Hateful Meme Classification

Comprehensive Study on German Language Models for Clinical and Biomedical Text Understanding

no code implementations • 8 Apr 2024 • Ahmad Idrissi-Yaghir, Amin Dada, Henning Schäfer, Kamyar Arzideh, Giulia Baldini, Jan Trienes, Max Hasin, Jeanette Bewersdorff, Cynthia S. Schmidt, Marie Bauer, Kaleb E. Smith, Jiang Bian, Yonghui Wu, Jörg Schlötterer, Torsten Zesch, Peter A. Horn, Christin Seifert, Felix Nensa, Jens Kleesiek, Christoph M. Friedrich

Recent advances in natural language processing (NLP) can be largely attributed to the advent of pre-trained language models such as BERT and RoBERTa.

Domain Adaptation • Extractive Question-Answering • +5

Text or Image? What is More Important in Cross-Domain Generalization Capabilities of Hate Meme Detection Models?

no code implementations • 7 Feb 2024 • Piush Aggarwal, Jawar Mehrabanian, Weigang Huang, Özge Alacam, Torsten Zesch

This paper delves into the formidable challenge of cross-domain generalization in multimodal hate meme detection, presenting compelling findings.

Domain Generalization • Image Captioning

HateProof: Are Hateful Meme Detection Systems really Robust?

no code implementations • 11 Feb 2023 • Piush Aggarwal, Pranit Chawla, Mithun Das, Punyajoy Saha, Binny Mathew, Torsten Zesch, Animesh Mukherjee

Empirically, we find a noticeable performance drop of as high as 10% in the macro-F1 score for certain attacks.

Contrastive Learning

Robustness of end-to-end Automatic Speech Recognition Models – A Case Study using Mozilla DeepSpeech

no code implementations • 8 May 2021 • Aashish Agarwal, Torsten Zesch

When evaluating the performance of automatic speech recognition models, usually word error rate within a certain dataset is used.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +2

Effects of Layer Freezing on Transferring a Speech Recognition System to Under-resourced Languages

1 code implementation • KONVENS (WS) 2021 • Onno Eberhard, Torsten Zesch

In this paper, we investigate the effect of layer freezing on the effectiveness of model transfer in the area of automatic speech recognition.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +1

Don't take "nswvtnvakgxpm" for an answer – The surprising vulnerability of automatic content scoring systems to adversarial input

no code implementations • COLING 2020 • Yuning Ding, Brian Riordan, Andrea Horbach, Aoife Cahill, Torsten Zesch

Automatic content scoring systems are widely used on short answer tasks to save human effort.

Chinese Content Scoring: Open-Access Datasets and Features on Different Segmentation Levels

no code implementations • Asian Chapter of the Association for Computational Linguistics 2020 • Yuning Ding, Andrea Horbach, Torsten Zesch

As a review of prior work for Chinese content scoring shows a lack of open-access data in the field, we present two short-answer data sets for Chinese.

Decomposing and Comparing Meaning Relations: Paraphrasing, Textual Entailment, Contradiction, and Specificity

no code implementations • LREC 2020 • Venelin Kovatchev, Darina Gold, M. Antonia Marti, Maria Salamo, Torsten Zesch

We use the typology to annotate a corpus of 520 sentence pairs in English and we demonstrate that unlike previous typologies, SHARel can be applied to all relations of interest with a high inter-annotator agreement.

Natural Language Inference • Sentence • +1

A Legal Approach to Hate Speech – Operationalizing the EU's Legal Framework against the Expression of Hatred as an NLP Task

no code implementations • 7 Apr 2020 • Frederike Zufall, Marius Hamacher, Katharina Kloppenborg, Torsten Zesch

We propose a 'legal approach' to hate speech detection by operationalization of the decision as to whether a post is subject to criminal law into an NLP task.

Decision Making • Hate Speech Detection

Divide and Extract – Disentangling Clause Splitting and Proposition Extraction

no code implementations • RANLP 2019 • Darina Gold, Torsten Zesch

The resulting proposition evaluation dataset allows us to independently compare the performance of proposition extraction systems on simple and complex clauses.

Annotating and analyzing the interactions between meaning relations

1 code implementation • WS 2019 • Darina Gold, Venelin Kovatchev, Torsten Zesch

Here we present a corpus annotated with these relations and the analysis of these results.

Natural Language Inference • Semantic Similarity • +3

ltl.uni-due at SemEval-2019 Task 5: Simple but Effective Lexico-Semantic Features for Detecting Hate Speech in Twitter

no code implementations • SEMEVAL 2019 • Huangpan Zhang, Michael Wojatzki, Tobias Horsmann, Torsten Zesch

On the Spanish data our system is ranked 25th out of 39.

LTL-UDE at SemEval-2019 Task 6: BERT and Two-Vote Classification for Categorizing Offensiveness

no code implementations • SEMEVAL 2019 • Piush Aggarwal, Tobias Horsmann, Michael Wojatzki, Torsten Zesch

We present results for Subtask A and C of SemEval 2019 Shared Task 6.

General Classification

From legal to technical concept: Towards an automated classification of German political Twitter postings as criminal offenses

1 code implementation • NAACL 2019 • Frederike Zufall, Tobias Horsmann, Torsten Zesch

In this article, we analyze which Twitter posts could actually be deemed offenses under German criminal law.

General Classification

The Role of Diacritics in Increasing the Difficulty of Arabic Lexical Recognition Tests

no code implementations • WS 2018 • Osama Hamed, Torsten Zesch

Cross-Lingual Content Scoring

no code implementations • WS 2018 • Andrea Horbach, Sebastian Stennmanns, Torsten Zesch

We investigate the feasibility of cross-lingual content scoring, a scenario where training and test data in an automatic scoring task are from two different languages.

Machine Translation

Agree or Disagree: Predicting Judgments on Nuanced Assertions

1 code implementation • SEMEVAL 2018 • Michael Wojatzki, Torsten Zesch, Saif Mohammad, Svetlana Kiritchenko

Being able to predict whether people agree or disagree with an assertion (i.e., an explicit, self-contained statement) has several applications ranging from predicting how many people will like or dislike a social media post to classifying posts based on whether they are in accordance with a particular point of view.

DeepTC – An Extension of DKPro Text Classification for Fostering Reproducibility of Deep Learning Experiments

no code implementations • LREC 2018 • Tobias Horsmann, Torsten Zesch

General Classification • text-classification • +2

ESCRITO - An NLP-Enhanced Educational Scoring Toolkit

no code implementations • LREC 2018 • Torsten Zesch, Andrea Horbach

Argument Mining • Grammatical Error Correction • +2

Quantifying Qualitative Data for Understanding Controversial Issues

no code implementations • LREC 2018 • Michael Wojatzki, Saif Mohammad, Torsten Zesch, Svetlana Kiritchenko

Argument Mining • Decision Making • +2

The Influence of Spelling Errors on Content Scoring Performance

no code implementations • WS 2017 • Andrea Horbach, Yuning Ding, Torsten Zesch

Spelling errors occur frequently in educational settings, but their influence on automatic scoring is largely unknown.

BIG-bench Machine Learning

Fine-grained essay scoring of a complex writing task for native speakers

no code implementations • WS 2017 • Andrea Horbach, Dirk Scholten-Akoun, Yuning Ding, Torsten Zesch

Automatic essay scoring is nowadays successfully used even in high-stakes tests, but this is mainly limited to holistic scoring of learner essays.

Investigating neural architectures for short answer scoring

no code implementations • WS 2017 • Brian Riordan, Andrea Horbach, Aoife Cahill, Torsten Zesch, Chong Min Lee

Neural approaches to automated essay scoring have recently shown state-of-the-art performance.

Automated Essay Scoring • Reading Comprehension • +1

Same same, but different: Compositionality of paraphrase granularity levels

1 code implementation • RANLP 2017 • Darina Benikova, Torsten Zesch

Paraphrases exist on different granularity levels, the most frequently used one being the sentential level.

Machine Translation • Question Answering • +2

Do LSTMs really work so well for PoS tagging? – A replication study

no code implementations • EMNLP 2017 • Tobias Horsmann, Torsten Zesch

A recent study by Plank et al. (2016) found that LSTM-based PoS taggers considerably improve over the current state-of-the-art when evaluated on the corpora of the Universal Dependencies project that use a coarse-grained tagset.

Feature Engineering • Part-Of-Speech Tagging • +2