1 code implementation • NAACL 2022 • David Ifeoluwa Adelani, Jesujoba Oluwadara Alabi, Angela Fan, Julia Kreutzer, Xiaoyu Shen, Machel Reid, Dana Ruiter, Dietrich Klakow, Peter Nabende, Ernie Chang, Tajuddeen Gwadabe, Freshia Sackey, Bonaventure F. P. Dossou, Chris Chinenye Emezue, Colin Leong, Michael Beukman, Shamsuddeen Hassan Muhammad, Guyo Dub Jarso, Oreen Yousuf, Andre Niyongabo Rubungo, Gilles Hacheme, Eric Peter Wairagala, Muhammad Umair Nasir, Benjamin Ayoade Ajibade, Tunde Oluwaseyi Ajayi, Yvonne Wambui Gitau, Jade Abbott, Mohamed Ahmed, Millicent Ochieng, Anuoluwapo Aremu, Perez Ogayo, Jonathan Mukiibi, Fatoumata Ouoba Kabore, Godson Koffi Kalipe, Derguene Mbaye, Allahsera Auguste Tapo, Victoire Memdjokam Koagne, Edwin Munkoh-Buabeng, Valencia Wagner, Idris Abdulmumin, Ayodele Awokoya, Happy Buzaaba, Blessing Sibanda, Andiswa Bukula, Sam Manthalu
We focus on two questions: 1) How can pre-trained models be used for languages not included in the initial pre-training? and 2) How can the resulting translation models effectively transfer to new domains?
1 code implementation • EMNLP 2020 • Moritz Wolf, Dana Ruiter, Ashwin Geet D'Sa, Liane Reiners, Jan Alexandersson, Dietrich Klakow
A lot of real-world phenomena are complex and cannot be captured by single task annotations.
1 code implementation • NAACL (WOAH) 2022 • Awantee Deshpande, Dana Ruiter, Marius Mosbach, Dietrich Klakow
Analyzing ethnic or religious bias is important for improving fairness, accountability, and transparency of natural language processing models.
1 code implementation • ACL 2019 • Dana Ruiter, Cristina España-Bonet, Josef van Genabith
We present a simple new method where an emergent NMT system is used for simultaneously selecting training data and learning internal NMT representations.
1 code implementation • 15 Mar 2021 • David I. Adelani, Dana Ruiter, Jesujoba O. Alabi, Damilola Adebonojo, Adesina Ayeni, Mofe Adeyemi, Ayodele Awokoya, Cristina España-Bonet
We investigate how and when this training condition affects the final quality and intelligibility of a translation.
1 code implementation • EACL 2021 • Susann Boy, Dana Ruiter, Dietrich Klakow
This is done using a transfer learning approach, where the parameters learned by an emoji-based source task are transferred to a sentiment target task.
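As a rough illustration of this kind of parameter transfer, here is a minimal sketch assuming a simple PyTorch classifier; the architecture, sizes, and class counts are illustrative, not the paper's actual setup. The encoder trained on emoji prediction is reused for the sentiment task, and only the output head is replaced.

```python
import torch.nn as nn

class Classifier(nn.Module):
    def __init__(self, vocab_size, emb_dim, hidden, n_classes):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.EmbeddingBag(vocab_size, emb_dim),  # simple bag-of-words text encoder
            nn.Linear(emb_dim, hidden),
            nn.ReLU(),
        )
        self.head = nn.Linear(hidden, n_classes)  # task-specific output layer

    def forward(self, token_ids):
        return self.head(self.encoder(token_ids))

# 1) Source task: predict which of e.g. 64 emojis accompanies a tweet.
source = Classifier(vocab_size=30000, emb_dim=128, hidden=256, n_classes=64)
# ... train `source` on (tweet, emoji) pairs ...

# 2) Target task: copy the learned encoder parameters, attach a fresh sentiment head.
target = Classifier(vocab_size=30000, emb_dim=128, hidden=256, n_classes=3)
target.encoder.load_state_dict(source.encoder.state_dict())
# ... fine-tune `target` on the (typically much smaller) sentiment data ...
```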
1 code implementation • ACL (WOAH) 2021 • Vanessa Hahn, Dana Ruiter, Thomas Kleinbauer, Dietrich Klakow
We observe that, on both similar and distant target tasks and across all languages, the subspace-based representations transfer more effectively than standard BERT representations in the zero-shot setting, with improvements between +10.9 and +42.9 F1 over the baselines across all tested monolingual and cross-lingual scenarios.
1 code implementation • LREC 2022 • Dana Ruiter, Liane Reiners, Ashwin Geet D'Sa, Thomas Kleinbauer, Dominique Fohr, Irina Illina, Dietrich Klakow, Christian Schemer, Angeliki Monnier
Even though hate speech (HS) online has been an important object of research in the last decade, most HS-related corpora over-simplify the phenomenon of hate by attempting to label user comments as "hate" or "neutral".
no code implementations • WS 2019 • Cristina España-Bonet, Dana Ruiter
This paper describes the UdS-DFKI submission to the WMT2019 news translation task for Gujarati–English (a low-resource pair) and German–English (document-level evaluation).
no code implementations • EMNLP 2020 • Dana Ruiter, Josef van Genabith, Cristina España-Bonet
Self-supervised neural machine translation (SSNMT) jointly learns to identify and select suitable training data from comparable (rather than parallel) corpora and to translate, in a way that the two tasks support each other in a virtuous circle.
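A toy sketch of the data-selection side of this loop, assuming the emergent system's own encoder provides sentence representations; the function names and the simple threshold criterion are hypothetical simplifications of the paper's method:

```python
import torch.nn.functional as F

def select_pairs(encode_src, encode_tgt, candidates, threshold=0.5):
    """Keep (src, tgt) sentence pairs whose representations are similar.

    encode_src / encode_tgt map a sentence to a 1-D tensor, e.g. the mean
    of the emergent NMT encoder's hidden states for that sentence.
    """
    selected = []
    for src, tgt in candidates:
        sim = F.cosine_similarity(encode_src(src), encode_tgt(tgt), dim=0)
        if sim.item() >= threshold:      # the model deems the pair parallel
            selected.append((src, tgt))
    return selected                      # fed back in as NMT training data
```

The virtuous circle the abstract refers to: as translation quality improves, the encoder representations improve, which improves selection, which in turn yields better training pairs.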
no code implementations • MTSummit 2021 • Dana Ruiter, Dietrich Klakow, Josef van Genabith, Cristina España-Bonet
For most language combinations, parallel data is either scarce or simply unavailable.
no code implementations • WMT (EMNLP) 2021 • Svetlana Tchistiakova, Jesujoba Alabi, Koel Dutta Chowdhury, Sourav Dutta, Dana Ruiter
We describe the EdinSaar submission to the shared task of Multilingual Low-Resource Translation for North Germanic Languages at the Sixth Conference on Machine Translation (WMT2021).
no code implementations • EMNLP (insights) 2020 • Ashwin Geet D'Sa, Irina Illina, Dominique Fohr, Dietrich Klakow, Dana Ruiter
In this paper, label propagation-based semi-supervised learning is explored for the task of hate speech classification.
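For a concrete reference point, scikit-learn ships an implementation of this general technique; below is a minimal sketch in which unlabeled comments (marked -1) receive labels from their neighbors in feature space. The features and label scheme are illustrative, not the paper's exact setup.

```python
import numpy as np
from sklearn.semi_supervised import LabelPropagation

# X: one feature vector per comment (e.g. sentence embeddings), shape (n, d)
X = np.random.rand(100, 64)
# y: 0 = non-hate, 1 = hate, -1 = unlabeled (the semi-supervised part)
y = np.full(100, -1)
y[:10] = np.random.randint(0, 2, size=10)   # only a few labeled comments

model = LabelPropagation(kernel="knn", n_neighbors=7)
model.fit(X, y)
propagated = model.transduction_   # inferred labels for all 100 comments
```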
no code implementations • MTSummit 2021 • David Adelani, Dana Ruiter, Jesujoba Alabi, Damilola Adebonojo, Adesina Ayeni, Mofe Adeyemi, Ayodele Esther Awokoya, Cristina España-Bonet
Massively multilingual machine translation (MT) has shown impressive capabilities, including zero- and few-shot translation between low-resource language pairs.
no code implementations • WMT (EMNLP) 2020 • Sourav Dutta, Jesujoba Alabi, Saptarashmi Bandyopadhyay, Dana Ruiter, Josef van Genabith
This paper describes the UdS-DFKI submission to the shared task for unsupervised machine translation (MT) and very low-resource supervised MT between German (de) and Upper Sorbian (hsb) at the Fifth Conference on Machine Translation (WMT20).
no code implementations • 25 Sep 2019 • Dana Ruiter, Cristina España-Bonet, Josef van Genabith
Self-supervised neural machine translation (SS-NMT) learns how to extract/select suitable training data from comparable (rather than parallel) corpora and how to translate, in a way that the two tasks support each other in a virtuous circle.
1 code implementation • NAACL (SocialNLP) 2022 • Dana Ruiter, Thomas Kleinbauer, Cristina España-Bonet, Josef van Genabith, Dietrich Klakow
Recent research on style transfer takes inspiration from unsupervised neural machine translation (UNMT), learning from large amounts of non-parallel data by exploiting cycle consistency loss, back-translation, and denoising autoencoders.
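A toy sketch of the two training signals this line of work borrows from UNMT, i.e. denoising autoencoding and back-translation with a cycle-consistency objective; the `model` interface here is hypothetical, for illustration only, not the paper's code:

```python
import random

def corrupt(tokens, drop_p=0.1, k=3):
    """Denoising-autoencoder noise: word dropout plus a local shuffle."""
    kept = [t for t in tokens if random.random() > drop_p]
    # each surviving token may move at most ~k positions (standard UNMT-style noise)
    order = sorted(range(len(kept)), key=lambda i: i + random.uniform(0, k))
    return [kept[i] for i in order]

def training_step(model, x, style_a, style_b):
    # 1) Denoising autoencoding: reconstruct x from a corrupted copy of itself.
    loss_dae = model.loss(src=corrupt(x), tgt=x, style=style_a)
    # 2) Back-translation / cycle consistency: transfer x to style B with the
    #    current model, then learn to map that output back to the original x.
    x_b = model.generate(x, style=style_b)
    loss_bt = model.loss(src=x_b, tgt=x, style=style_a)
    return loss_dae + loss_bt   # hypothetical `model` API, assumed for the sketch
```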