Search Results for author: Steffen Eger

Found 59 papers, 33 papers with code

Evaluation of Coreference Resolution Systems Under Adversarial Attacks

no code implementations EMNLP (CODI) 2020 Haixia Chai, Wei Zhao, Steffen Eger, Michael Strube

A substantial overlap of coreferent mentions in the CoNLL dataset magnifies the recent progress on coreference resolution.

Coreference Resolution

TUDA-Reproducibility @ ReproGen: Replicability of Human Evaluation of Text-to-Text and Concept-to-Text Generation

no code implementations INLG (ACL) 2021 Christian Richter, Yanran Chen, Steffen Eger

This paper describes our contribution to the Shared Task ReproGen by Belz et al. (2021), which investigates the reproducibility of human evaluations in the context of Natural Language Generation.

Concept-To-Text Generation Paper generation

End-to-end style-conditioned poetry generation: What does it take to learn from examples alone?

no code implementations EMNLP (LaTeCHCLfL, CLFL, LaTeCH) 2021 Jörg Wöckener, Thomas Haider, Tristan Miller, The-Khang Nguyen, Thanh Tung Linh Nguyen, Minh Vu Pham, Jonas Belouadi, Steffen Eger

In this work, we design an end-to-end model for poetry generation based on conditioned recurrent neural network (RNN) language models whose goal is to learn stylistic features (poem length, sentiment, alliteration, and rhyming) from examples alone.

TUDa at WMT21: Sentence-Level Direct Assessment with Adapters

no code implementations WMT (EMNLP) 2021 Gregor Geigle, Jonas Stadtmüller, Wei Zhao, Jonas Pfeiffer, Steffen Eger

This paper presents our submissions to the WMT2021 Shared Task on Quality Estimation, Task 1 Sentence-Level Direct Assessment.

Reproducibility Issues for BERT-based Evaluation Metrics

1 code implementation 30 Mar 2022 Yanran Chen, Jonas Belouadi, Steffen Eger

We find that reproduction of claims and results often fails because of (i) heavy undocumented preprocessing involved in the metrics, (ii) missing code and (iii) reporting weaker results for the baseline metrics.

Machine Translation Text Generation

Towards Explainable Evaluation Metrics for Natural Language Generation

1 code implementation 21 Mar 2022 Christoph Leiter, Piyawat Lertvittayakumjorn, Marina Fomicheva, Wei Zhao, Yang Gao, Steffen Eger

We also provide a synthesizing overview over recent approaches for explainable machine translation metrics and discuss how they relate to those goals and properties.

Machine Translation Text Generation +1

Detecting Stance in Scientific Papers: Did we get more Negative Recently?

no code implementations 28 Feb 2022 Dominik Beese, Begüm Altunbaş, Görkem Güzeler, Steffen Eger

In this paper, we classify scientific articles in the domain of natural language processing (NLP) and machine learning (ML) into whether (i) they extend the current state-of-the-art by introduction of novel techniques which beat existing models or whether (ii) they mainly criticize the existing state-of-the-art, i.e., that it is deficient with respect to some property (e.g., wrong evaluation, wrong datasets, misleading task specification).

USCORE: An Effective Approach to Fully Unsupervised Evaluation Metrics for Machine Translation

no code implementations 21 Feb 2022 Jonas Belouadi, Steffen Eger

In particular, we use an unsupervised evaluation metric to mine pseudo-parallel data, which we use to remap deficient underlying vector spaces (in an iterative manner) and to induce an unsupervised MT system, which then provides pseudo-references as an additional component in the metric.

Machine Translation Parallel Corpus Mining +2

Constrained Density Matching and Modeling for Cross-lingual Alignment of Contextualized Representations

no code implementations 31 Jan 2022 Wei Zhao, Steffen Eger

In this work, we attribute the data hungriness of previous alignment techniques to two limitations: (i) the inability to sufficiently leverage data and (ii) these techniques are not trained properly.

DiscoScore: Evaluating Text Generation with BERT and Discourse Coherence

1 code implementation 26 Jan 2022 Wei Zhao, Michael Strube, Steffen Eger

We find that (i) the majority of BERT-based metrics correlate much worse with human rated coherence than early discourse metrics, invented a decade ago; (ii) the recent state-of-the-art BARTScore is weak when operated at system level -- which is particularly problematic as systems are typically compared in this manner.

Document Level Machine Translation Machine Translation +1

Better than Average: Paired Evaluation of NLP Systems

1 code implementation ACL 2021 Maxime Peyrard, Wei Zhao, Steffen Eger, Robert West

Evaluation in NLP is usually done by comparing the scores of competing systems independently averaged over a common set of test instances.

Constrained Density Matching and Modeling for Effective Contextualized Alignment

no code implementations 29 Sep 2021 Wei Zhao, Steffen Eger

In this work, we analyze the limitations according to which previous alignments become very resource-intensive, viz., (i) the inability to sufficiently leverage data and (ii) that alignments are not trained properly.

Diachronic Analysis of German Parliamentary Proceedings: Ideological Shifts through the Lens of Political Biases

1 code implementation 13 Aug 2021 Tobias Walter, Celina Kirschner, Steffen Eger, Goran Glavaš, Anne Lauscher, Simone Paolo Ponzetto

We analyze bias in historical corpora as encoded in diachronic distributional semantic models by focusing on two specific forms of bias, namely a political (i.e., anti-communism) and racist (i.e., antisemitism) one.

Diachronic Word Embeddings Word Embeddings

Graph Routing between Capsules

no code implementations 22 Jun 2021 Yang Li, Wei Zhao, Erik Cambria, Suhang Wang, Steffen Eger

Therefore, in this paper, we introduce a new capsule network with graph routing to learn both relationships, where capsules in each layer are treated as the nodes of a graph.

Text Classification

CMCE at SemEval-2020 Task 1: Clustering on Manifolds of Contextualized Embeddings to Detect Historical Meaning Shifts

1 code implementation SEMEVAL 2020 David Rother, Thomas Haider, Steffen Eger

Remarkably, with only 10 dimensional MBERT embeddings (reduced from the original size of 768), our submitted model performs best on subtask 1 for English and ranks third in subtask 2 for English.

Change Detection Word Embeddings

Probing Multilingual BERT for Genetic and Typological Signals

no code implementations COLING 2020 Taraka Rama, Lisa Beinborn, Steffen Eger

We probe the layers in multilingual BERT (mBERT) for phylogenetic and geographic language signals across 100 languages and compute language distances based on the mBERT representations.

Vec2Sent: Probing Sentence Embeddings with Natural Language Generation

1 code implementation COLING 2020 Martin Kerscher, Steffen Eger

We introspect black-box sentence embeddings by conditionally generating from them with the objective to retrieve the underlying discrete sentence.

Sentence Embeddings Text Generation

From Hero to Zéroe: A Benchmark of Low-Level Adversarial Attacks

1 code implementation 12 Oct 2020 Steffen Eger, Yannik Benz

Adversarial attacks are label-preserving modifications to inputs of machine learning classifiers designed to fool machines but not humans.

Natural Language Inference Part-Of-Speech Tagging +1

How to Probe Sentence Embeddings in Low-Resource Languages: On Structural Design Choices for Probing Task Evaluation

1 code implementation CONLL 2020 Steffen Eger, Johannes Daxenberger, Iryna Gurevych

We then probe embeddings in a multilingual setup with design choices that lie in a 'stable region', as we identify for English, and find that results on English do not transfer to other languages.

Sentence Embeddings

On the Limitations of Cross-lingual Encoders as Exposed by Reference-Free Machine Translation Evaluation

1 code implementation ACL 2020 Wei Zhao, Goran Glavaš, Maxime Peyrard, Yang Gao, Robert West, Steffen Eger

We systematically investigate a range of metrics based on state-of-the-art cross-lingual semantic representations obtained with pretrained M-BERT and LASER.

Language Modelling Machine Translation +4

PO-EMO: Conceptualization, Annotation, and Modeling of Aesthetic Emotions in German and English Poetry

1 code implementation LREC 2020 Thomas Haider, Steffen Eger, Evgeny Kim, Roman Klinger, Winfried Menninghaus

Thus, we conceptualize a set of aesthetic emotions that are predictive of aesthetic appreciation in the reader, and allow the annotation of multiple labels per line to capture mixed emotions within their context.

Emotion Classification Emotion Recognition

Semantic Change and Emerging Tropes In a Large Corpus of New High German Poetry

1 code implementation WS 2019 Thomas Haider, Steffen Eger

Due to its semantic succinctness and novelty of expression, poetry is a great test bed for semantic change analysis.

Towards Scalable and Reliable Capsule Networks for Challenging NLP Applications

5 code implementations ACL 2019 Wei Zhao, Haiyun Peng, Steffen Eger, Erik Cambria, Min Yang

Obstacles hindering the development of capsule networks for challenging NLP applications include poor scalability to large output spaces and less reliable routing processes.

Ranked #1 on Text Classification on RCV1 (P@1 metric)

General Classification Multi Label Text Classification +2

Pitfalls in the Evaluation of Sentence Embeddings

no code implementations WS 2019 Steffen Eger, Andreas Rücklé, Iryna Gurevych

Our motivation is to challenge the current evaluation of sentence embeddings and to provide an easy-to-access reference for future research.

Sentence Embeddings

Does My Rebuttal Matter? Insights from a Major NLP Conference

1 code implementation NAACL 2019 Yang Gao, Steffen Eger, Ilia Kuznetsov, Iryna Gurevych, Yusuke Miyao

We then focus on the role of the rebuttal phase, and propose a novel task to predict after-rebuttal (i.e., final) scores from initial reviews and author responses.

Text Processing Like Humans Do: Visually Attacking and Shielding NLP Systems

no code implementations NAACL 2019 Steffen Eger, Gözde Gül Şahin, Andreas Rücklé, Ji-Ung Lee, Claudia Schulz, Mohsen Mesgar, Krishnkant Swarnkar, Edwin Simpson, Iryna Gurevych

Visual modifications to text are often used to obfuscate offensive comments in social media (e.g., "!d10t") or as a writing style ("1337" in "leet speak"), among other scenarios.

Adversarial Attack

Predicting Research Trends From Arxiv

1 code implementation 7 Mar 2019 Steffen Eger, Chao Li, Florian Netzer, Iryna Gurevych

By extrapolation, we predict that these topics will remain lead problems/approaches in their fields in the short- and mid-term.

reinforcement-learning Text Generation

Is it Time to Swish? Comparing Deep Learning Activation Functions Across NLP tasks

1 code implementation EMNLP 2018 Steffen Eger, Paul Youssef, Iryna Gurevych

Activation functions play a crucial role in neural networks because they are the nonlinearities which have been attributed to the success story of deep learning.

Image Classification

One Size Fits All? A simple LSTM for non-literal token and construction-level classification

no code implementations COLING 2018 Erik-Lân Do Dinh, Steffen Eger, Iryna Gurevych

In this paper, we tackle four different tasks of non-literal language classification: token and construction level metaphor detection, classification of idiomatic use of infinitive-verb compounds, and classification of non-literal particle verbs.

Classification General Classification +1

Multi-Task Learning for Argumentation Mining in Low-Resource Settings

1 code implementation NAACL 2018 Claudia Schulz, Steffen Eger, Johannes Daxenberger, Tobias Kahse, Iryna Gurevych

We investigate whether and where multi-task learning (MTL) can improve performance on NLP problems related to argumentation mining (AM), in particular argument component identification.

Multi-Task Learning

Neural End-to-End Learning for Computational Argumentation Mining

2 code implementations ACL 2017 Steffen Eger, Johannes Daxenberger, Iryna Gurevych

Contrary to models that operate on the argument component level, we find that framing AM as dependency parsing leads to subpar performance results.

Dependency Parsing Frame +2

EELECTION at SemEval-2017 Task 10: Ensemble of nEural Learners for kEyphrase ClassificaTION

1 code implementation SEMEVAL 2017 Steffen Eger, Erik-Lân Do Dinh, Ilia Kuznetsov, Masoud Kiaeeha, Iryna Gurevych

From these approaches, we created an ensemble of differently hyper-parameterized systems, achieving a micro-F1-score of 0.63 on the test data.

General Classification

Complex Decomposition of the Negative Distance kernel

no code implementations 5 Jan 2016 Tim vor der Brück, Steffen Eger, Alexander Mehler

Our evaluation shows that the power kernel produces F-scores that are comparable to the reference kernels, but is -- except for the linear kernel -- faster to compute.

Document Classification General Classification +1

On the Number of Many-to-Many Alignments of Multiple Sequences

no code implementations 2 Nov 2015 Steffen Eger

We provide a new asymptotic formula for the case $S=\{(s_1,\ldots, s_N) \:|\: 1\le s_i\le 2\}$.
