Search Results for author: Josef van Genabith

Found 122 papers, 18 papers with code

BIRA: Improved Predictive Exchange Word Clustering

1 code implementation • NAACL 2016 • Jon Dehdari, Liling Tan, Josef van Genabith

Chunking Clustering +2

Paper
Code

Massively Multilingual Neural Grapheme-to-Phoneme Conversion

1 code implementation • WS 2017 • Ben Peters, Jon Dehdari, Josef van Genabith

Grapheme-to-phoneme conversion (g2p) is necessary for text-to-speech and automatic speech recognition systems.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Paper
Code

INFODENS: An Open-source Framework for Learning Text Representations

1 code implementation • 16 Oct 2018 • Ahmad Taie, Raphael Rubino, Josef van Genabith

The advent of representation learning methods enabled large performance gains on various language tasks, alleviating the need for manual feature engineering.

Feature Engineering General Classification +3

Paper
Code

ReVal: A Simple and Effective Machine Translation Evaluation Metric Based on Recurrent Neural Networks

1 code implementation • EMNLP 2015 • Rohit Gupta, Constantin Orăsan, Josef van Genabith

Feature Engineering Machine Translation +3

Paper
Code

Machine Translation Evaluation using Recurrent Neural Networks

1 code implementation • WS 2015 • Rohit Gupta, Constantin Orăsan, Josef van Genabith

Feature Engineering Machine Translation +3

Paper
Code

LLMCheckup: Conversational Examination of Large Language Models via Interpretability Tools

1 code implementation • 23 Jan 2024 • Qianli Wang, Tatiana Anikina, Nils Feldhus, Josef van Genabith, Leonhard Hennig, Sebastian Möller

Interpretability tools that offer explanations in the form of a dialogue have demonstrated their efficacy in enhancing users' understanding, as one-off explanations may occasionally fall short in providing sufficient information to the user.

counterfactual Fact Checking +4

Paper
Code

Mid-Air Hand Gestures for Post-Editing of Machine Translation

1 code implementation • ACL 2021 • Rashad Albo Jamara, Nico Herbig, Antonio Krüger, Josef van Genabith

Here, we present the first study that investigates the usefulness of mid-air hand gestures in combination with the keyboard (GK) for text editing in PE of MT.

Machine Translation Translation

Paper
Code

Investigating the Helpfulness of Word-Level Quality Estimation for Post-Editing Machine Translation Output

1 code implementation • EMNLP 2021 • Raksha Shenoy, Nico Herbig, Antonio Krüger, Josef van Genabith

For helpful quality levels, a visualization reflecting the uncertainty of the QE model is preferred.

Machine Translation Translation

Paper
Code

Self-Supervised Neural Machine Translation

1 code implementation • ACL 2019 • Dana Ruiter, Cristina España-Bonet, Josef van Genabith

We present a simple new method where an emergent NMT system is used for simultaneously selecting training data and learning internal NMT representations.

Machine Translation NMT +1

Paper
Code

Neural Morphological Tagging from Characters for Morphologically Rich Languages

1 code implementation • 21 Jun 2016 • Georg Heigold, Guenter Neumann, Josef van Genabith

We systematically explore a variety of neural architectures (DNN, CNN, CNNHighway, LSTM, BLSTM) to obtain character-based word vectors combined with bidirectional LSTMs to model across-word context in an end-to-end setting.

Morphological Tagging TAG +1

Paper
Code

Exploring Paracrawl for Document-level Neural Machine Translation

1 code implementation • 20 Apr 2023 • Yusser Al Ghussin, Jingyi Zhang, Josef van Genabith

We show that document-level NMT models trained with only parallel paragraphs from Paracrawl can be used to translate real documents from TED, News and Europarl, outperforming sentence-level NMT models.

Machine Translation NMT +2

Paper
Code

TransIns: Document Translation with Markup Reinsertion

1 code implementation • EMNLP (ACL) 2021 • Jörg Steffen, Josef van Genabith

This is challenging, as markup can be nested, apply to spans contiguous in source but non-contiguous in target etc.

Document Translation NMT +1

Paper
Code

Do not Rely on Relay Translations: Multilingual Parallel Direct Europarl

2 code implementations • MoTra (NoDaLiDa) 2021 • Kwabena Amponsah-Kaakyire, Daria Pylypenko, Cristina España-Bonet, Josef van Genabith

Paper
Code

Are the Best Multilingual Document Embeddings simply Based on Sentence Embeddings?

1 code implementation • 28 Apr 2023 • Sonal Sannigrahi, Josef van Genabith, Cristina España-Bonet

We demonstrate that while a simple sentence average results in a strong baseline for classification tasks, more complex combinations are necessary for semantic tasks.

Sentence Sentence Embeddings +1

Paper
Code

When your Cousin has the Right Connections: Unsupervised Bilingual Lexicon Induction for Related Data-Imbalanced Languages

1 code implementation • 23 May 2023 • Niyati Bafna, Cristina España-Bonet, Josef van Genabith, Benoît Sagot, Rachel Bawden

Most existing approaches for unsupervised bilingual lexicon induction (BLI) depend on good quality static or contextual embeddings requiring large monolingual corpora for both languages.

Bilingual Lexicon Induction Language Modelling

Paper
Code

An Empirical Analysis of NMT-Derived Interlingual Embeddings and their Use in Parallel Sentence Identification

no code implementations • 18 Apr 2017 • Cristina España-Bonet, Ádám Csaba Varga, Alberto Barrón-Cedeño, Josef van Genabith

First, we systematically study the NMT context vectors, i.e., the output of the encoder, and their power as an interlingua representation of a sentence.

Machine Translation NMT +3

Paper
Add Code

Exploring the Use of Text Classification in the Legal Domain

no code implementations • 25 Oct 2017 • Octavia-Maria Sulea, Marcos Zampieri, Shervin Malmasi, Mihaela Vela, Liviu P. Dinu, Josef van Genabith

In this paper, we investigate the application of text classification methods to support law professionals.

General Classification text-classification +1

Paper
Add Code

Predicting the Law Area and Decisions of French Supreme Court Cases

no code implementations • RANLP 2017 • Octavia-Maria Sulea, Marcos Zampieri, Mihaela Vela, Josef van Genabith

In this paper, we investigate the application of text classification methods to predict the law area and the decision of cases judged by the French Supreme Court.

General Classification text-classification +1

Paper
Add Code

How Robust Are Character-Based Word Embeddings in Tagging and MT Against Wrod Scramlbing or Randdm Nouse?

no code implementations • WS 2018 • Georg Heigold, Günter Neumann, Josef van Genabith

In this paper, we study the impact of noisy input.

Machine Translation Morphological Tagging +2

Paper
Add Code

A Neural Network based Approach to Automatic Post-Editing

no code implementations • ACL 2016 • Santanu Pal, Sudip Kumar Naskar, Mihaela Vela, Josef van Genabith

Automatic Post-Editing

Paper
Add Code

An Extensive Empirical Evaluation of Character-Based Morphological Tagging for 14 Languages

no code implementations • EACL 2017 • Georg Heigold, Guenter Neumann, Josef van Genabith

This paper investigates neural character-based morphological tagging for languages with complex morphology and large tag sets.

Language Modelling Machine Translation +5

Paper
Add Code

Neural Automatic Post-Editing Using Prior Alignment and Reranking

no code implementations • EACL 2017 • Santanu Pal, Sudip Kumar Naskar, Mihaela Vela, Qun Liu, Josef van Genabith

APE translations produced by our system show statistically significant improvements over the first-stage MT, phrase-based APE and the best reported score on the WMT 2016 APE dataset by a previous neural APE system.

Automatic Post-Editing NMT +2

Paper
Add Code

Common Round: Application of Language Technologies to Large-Scale Web Debates

no code implementations • EACL 2017 • Hans Uszkoreit, Aleksandra Gabryszak, Leonhard Hennig, Jörg Steffen, Renlong Ai, Stephan Busemann, Jon Dehdari, Josef van Genabith, Georg Heigold, Nils Rethmeier, Raphael Rubino, Sven Schmeier, Philippe Thomas, He Wang, Feiyu Xu

Web debates play an important role in enabling broad participation of constituencies in social, political and economic decision-taking.

Decision Making Machine Translation +2

Paper
Add Code

Information Density and Quality Estimation Features as Translationese Indicators for Human Translation Classification

no code implementations • NAACL 2016 • Raphael Rubino, Ekaterina Lapshinova-Koltunski, Josef van Genabith

General Classification Machine Translation +1

Paper
Add Code

Scaling Up Word Clustering

no code implementations • NAACL 2016 • Jon Dehdari, Liling Tan, Josef van Genabith

Chunking Clustering +2

Paper
Add Code

SAARSHEFF at SemEval-2016 Task 1: Semantic Textual Similarity with Machine Translation Evaluation Metrics and (eXtreme) Boosted Tree Ensembles

no code implementations • SEMEVAL 2016 • Liling Tan, Carolina Scarton, Lucia Specia, Josef van Genabith

Machine Translation Semantic Textual Similarity

Paper
Add Code

WOLVESAAR at SemEval-2016 Task 1: Replicating the Success of Monolingual Word Alignment and Neural Embeddings for Semantic Textual Similarity

no code implementations • SEMEVAL 2016 • Hannah Bechara, Rohit Gupta, Liling Tan, Constantin Orăsan, Ruslan Mitkov, Josef van Genabith

Information Retrieval Machine Translation +4

Paper
Add Code

MacSaar at SemEval-2016 Task 11: Zipfian and Character Features for Complex Word Identification

no code implementations • SEMEVAL 2016 • Marcos Zampieri, Liling Tan, Josef van Genabith

Complex Word Identification Lexical Simplification +1

Paper
Add Code

USAAR at SemEval-2016 Task 13: Hyponym Endocentricity

no code implementations • SEMEVAL 2016 • Liling Tan, Francis Bond, Josef van Genabith

Word Embeddings

Paper
Add Code

Code-Mixed Question Answering Challenge: Crowd-sourcing Data and Techniques

no code implementations • WS 2018 • Khyathi Chandu, Ekaterina Loginova, Vishal Gupta, Josef van Genabith, Günter Neumann, Manoj Chinnakotla, Eric Nyberg, Alan W. Black

As a first step towards fostering research which supports CM in NLP applications, we systematically crowd-sourced and curated an evaluation dataset for factoid question answering in three CM languages - Hinglish (Hindi+English), Tenglish (Telugu+English) and Tamlish (Tamil+English) which belong to two language families (Indo-Aryan and Dravidian).

Question Answering Sentence

Paper
Add Code

A Transformer-Based Multi-Source Automatic Post-Editing System

no code implementations • WS 2018 • Santanu Pal, Nico Herbig, Antonio Krüger, Josef van Genabith

The proposed model is an extension of the transformer architecture: two separate self-attention-based encoders encode the machine translation output (mt) and the source (src), followed by a joint encoder that attends over a combination of these two encoded sequences (encsrc and encmt) for generating the post-edited sentence.

Automatic Post-Editing NMT +2

Paper
Add Code

The Effect of Error Rate in Artificially Generated Data for Automatic Preposition and Determiner Correction

no code implementations • WS 2017 • Fraser Bowen, Jon Dehdari, Josef van Genabith

In this research we investigate the impact of mismatches in the density and type of error between training and test data on a neural system correcting preposition and determiner errors.

Grammatical Error Correction Machine Translation

Paper
Add Code

JU-USAAR: A Domain Adaptive MT System

no code implementations • WS 2016 • Koushik Pahari, Alapan Kuila, Santanu Pal, Sudip Kumar Naskar, Sivaji Bandyopadhyay, Josef van Genabith

Domain Adaptation Language Modelling +1

Paper
Add Code

USAAR: An Operation Sequential Model for Automatic Statistical Post-Editing

no code implementations • WS 2016 • Santanu Pal, Marcos Zampieri, Josef van Genabith

Automatic Post-Editing Word Alignment

Paper
Add Code

Modeling Diachronic Change in Scientific Writing with Information Density

no code implementations • COLING 2016 • Raphael Rubino, Stefania Degaetano-Ortlieb, Elke Teich, Josef van Genabith

In this paper we investigate the introduction of information theory inspired features to study long term diachronic change on three levels: lexis, part-of-speech and syntax.

General Classification Informativeness

Paper
Add Code

Multi-Engine and Multi-Alignment Based Automatic Post-Editing and its Impact on Translation Productivity

no code implementations • COLING 2016 • Santanu Pal, Sudip Kumar Naskar, Josef van Genabith

In the paper we show that parallel system combination in the APE stage of a sequential MT-APE combination yields substantial translation improvements both measured in terms of automatic evaluation metrics as well as in terms of productivity improvements measured in a post-editing experiment.

Automatic Post-Editing Translation

Paper
Add Code

CATaLog Online: A Web-based CAT Tool for Distributed Translation with Data Capture for APE and Translation Process Research

no code implementations • COLING 2016 • Santanu Pal, Sudip Kumar Naskar, Marcos Zampieri, Tapas Nayak, Josef van Genabith

We present a free web-based CAT tool called CATaLog Online which provides a novel and user-friendly online CAT environment for post-editors/translators.

Automatic Post-Editing Translation

Paper
Add Code

European Language Resource Coordination: Collecting Language Resources for Public Sector Multilingual Information Management

no code implementations • LREC 2018 • Andrea Lösch, Valérie Mapelli, Stelios Piperidis, Andrejs Vasiļjevs, Lilli Smal, Thierry Declerck, Eileen Schnur, Khalid Choukri, Josef van Genabith

Machine Translation Management

Paper
Add Code

Head-Driven Hierarchical Phrase-based Translation

no code implementations • ACL 2012 • Junhui Li, Zhaopeng Tu, Guodong Zhou, Josef van Genabith

Dependency Parsing Machine Translation +1

Paper
Add Code

Identifying High-Impact Sub-Structures for Convolution Kernels in Document-level Sentiment Classification

no code implementations • ACL 2012 • Zhaopeng Tu, Yifan He, Jennifer Foster, Josef van Genabith, Qun Liu, ShouXun Lin

General Classification Sentiment Analysis +1

Paper
Add Code

Active Learning for Post-Editing Based Incrementally Retrained MT

no code implementations • EACL 2014 • Aswarth Abhilash Dara, Josef van Genabith, Qun Liu, John Judge, Antonio Toral

Active Learning Domain Adaptation +1

Paper
Add Code

TMTprime: A Recommender System for MT and TM Integration

no code implementations • NAACL 2013 • Aswarth Abhilash Dara, Sandipan Dandapat, Declan Groves, Josef van Genabith

Machine Translation Recommendation Systems

Paper
Add Code

USAAR-SHEFFIELD: Semantic Textual Similarity with Deep Regression and Machine Translation Evaluation Metrics

no code implementations • SEMEVAL 2015 • Liling Tan, Carolina Scarton, Lucia Specia, Josef van Genabith

Dimensionality Reduction Machine Translation +3

Paper
Add Code

USAAR-WLV: Hypernym Generation with Deep Neural Nets

no code implementations • SEMEVAL 2015 • Liling Tan, Rohit Gupta, Josef van Genabith

Word Embeddings

Paper
Add Code

CNGL-CORE: Referential Translation Machines for Measuring Semantic Similarity

no code implementations • SEMEVAL 2013 • Ergun Biçici, Josef van Genabith

Lemmatization Machine Translation +4

Paper
Add Code

CNGL: Grading Student Answers by Acts of Translation

no code implementations • SEMEVAL 2013 • Ergun Biçici, Josef van Genabith

Machine Translation Question Answering +2

Paper
Add Code

UdS-Sant: English–German Hybrid Machine Translation System

no code implementations • WS 2015 • Santanu Pal, Sudip Naskar, Josef van Genabith

Entity Extraction using GAN Machine Translation +2

Paper
Add Code

USAAR-SAPE: An English–Spanish Statistical Automatic Post-Editing System

no code implementations • WS 2015 • Santanu Pal, Mihaela Vela, Sudip Kumar Naskar, Josef van Genabith

Automatic Post-Editing Word Alignment

Paper
Add Code

Passive and Pervasive Use of Bilingual Dictionary in Statistical Machine Translation

no code implementations • WS 2015 • Liling Tan, Josef van Genabith, Francis Bond

Domain Adaptation Language Modelling +3

Paper
Add Code

Can Translation Memories afford not to use paraphrasing?

no code implementations • WS 2015 • Rohit Gupta, Constantin Orăsan, Marcos Zampieri, Mihaela Vela, Josef van Genabith

Semantic Textual Similarity Translation

Paper
Add Code

Searching for Context: a Study on Document-Level Labels for Translation Quality Estimation

no code implementations • WS 2015 • Carolina Scarton, Marcos Zampieri, Mihaela Vela, Josef van Genabith, Lucia Specia

Machine Translation Translation

Paper
Add Code

Re-assessing the WMT2013 Human Evaluation with Professional Translators Trainees

no code implementations • WS 2015 • Mihaela Vela, Josef van Genabith

Machine Translation

Paper
Add Code

An Awkward Disparity between BLEU / RIBES Scores and Human Judgements in Machine Translation

no code implementations • WS 2015 • Liling Tan, Jon Dehdari, Josef van Genabith

Machine Translation Translation

Paper
Add Code

CATaLog: New Approaches to TM and Post Editing Interfaces

no code implementations • WS 2015 • Tapas Nayek, Sudip Kumar Naskar, Santanu Pal, Marcos Zampieri, Mihaela Vela, Josef van Genabith

Machine Translation

Paper
Add Code

Comparing Approaches to the Identification of Similar Languages

no code implementations • WS 2015 • Marcos Zampieri, Binyam Gebrekidan Gebre, Hernani Costa, Josef van Genabith

Language Identification

Paper
Add Code

How Sentiment Analysis Can Help Machine Translation

no code implementations • WS 2014 • Santanu Pal, Braja Gopal Patra, Dipankar Das, Sudip Kumar Naskar, Sivaji Bandyopadhyay, Josef van Genabith

Machine Translation Sentiment Analysis +1

Paper
Add Code

Shallow Semantically-Informed PBSMT and HPBSMT

no code implementations • WS 2013 • Tsuyoshi Okita, Qun Liu, Josef van Genabith

Domain Adaptation Machine Translation

Paper
Add Code

Combining EBMT, SMT, TM and IR Technologies for Quality and Scale

no code implementations • WS 2012 • Sandipan Dandapat, Sara Morrissey, Andy Way, Josef van Genabith

Language Modelling Machine Translation

Paper
Add Code

Using Syntactic Head Information in Hierarchical Phrase-Based Translation

no code implementations • WS 2012 • Junhui Li, Zhaopeng Tu, Guodong Zhou, Josef van Genabith

Dependency Parsing Machine Translation +1

Paper
Add Code

System Combination with Extra Alignment Information

no code implementations • WS 2012 • Xiaofeng Wu, Tsuyoshi Okita, Josef van Genabith, Qun Liu

Paper
Add Code

Topic Modeling-based Domain Adaptation for System Combination

no code implementations • WS 2012 • Tsuyoshi Okita, Antonio Toral, Josef van Genabith

Document Classification Domain Adaptation +1

Paper
Add Code

Sentence-Level Quality Estimation for MT System Combination

no code implementations • WS 2012 • Tsuyoshi Okita, Raphaël Rubino, Josef van Genabith

Machine Translation Sentence

Paper
Add Code

Results from the ML4HMT-12 Shared Task on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid Machine Translation

no code implementations • WS 2012 • Christian Federmann, Tsuyoshi Okita, Maite Melero, Marta R. Costa-Jussa, Toni Badia, Josef van Genabith

Machine Translation

Paper
Add Code

The Floating Arabic Dictionary: An Automatic Method for Updating a Lexical Database through the Detection and Lemmatization of Unknown Words

no code implementations • COLING 2012 • Mohammed Attia, Younes Samih, Khaled Shaalan, Josef van Genabith

Lemmatization

Paper
Add Code

Translation Quality-Based Supplementary Data Selection by Incremental Update of Translation Models

no code implementations • COLING 2012 • Pratyush Banerjee, Sudip Kumar Naskar, Johann Roturier, Andy Way, Josef van Genabith

Domain Adaptation Machine Translation +1

Paper
Add Code

An Evaluation of Statistical Post-Editing Systems Applied to RBMT and SMT Systems

no code implementations • COLING 2012 • Hanna Béchara, Raphaël Rubino, Yifan He, Yanjun Ma, Josef van Genabith

Machine Translation

Paper
Add Code

Simple and Effective Parameter Tuning for Domain Adaptation of Statistical Machine Translation

no code implementations • COLING 2012 • Pavel Pecina, Antonio Toral, Josef van Genabith

Domain Adaptation Machine Translation +1

Paper
Add Code

Improved Spelling Error Detection and Correction for Arabic

no code implementations • COLING 2012 • Mohammed Attia, Pavel Pecina, Younes Samih, Khaled Shaalan, Josef van Genabith

Language Modelling

Paper
Add Code

Combining Multiple Alignments to Improve Machine Translation

no code implementations • COLING 2012 • Zhaopeng Tu, Yang Liu, Yifan He, Josef van Genabith, Qun Liu, ShouXun Lin

Machine Translation Translation +1

Paper
Add Code

The Strategic Impact of META-NET on the Regional, National and International Level

no code implementations • LREC 2014 • Georg Rehm, Hans Uszkoreit, Sophia Ananiadou, Núria Bel, Audronė Bielevičienė, Lars Borin, António Branco, Gerhard Budin, Nicoletta Calzolari, Walter Daelemans, Radovan Garabík, Marko Grobelnik, Carmen García-Mateo, Josef van Genabith, Jan Hajič, Inma Hernáez, John Judge, Svetla Koeva, Simon Krek, Cvetana Krstev, Krister Lindén, Bernardo Magnini, Joseph Mariani, John McNaught, Maite Melero, Monica Monachini, Asunción Moreno, Jan Odijk, Maciej Ogrodniczuk, Piotr Pęzik, Stelios Piperidis, Adam Przepiórkowski, Eiríkur Rögnvaldsson, Michael Rosner, Bolette Pedersen, Inguna Skadiņa, Koenraad De Smedt, Marko Tadić, Paul Thompson, Dan Tufiş, Tamás Váradi, Andrejs Vasiļjevs, Kadri Vider, Jolanta Zabarskaite

This article provides an overview of the dissemination work carried out in META-NET from 2010 until early 2014; we describe its impact on the regional, national and international level, mainly with regard to politics and the situation of funding for LT topics.

Machine Translation

Paper
Add Code

Irish Treebanking and Parsing: A Preliminary Evaluation

no code implementations • LREC 2012 • Teresa Lynn, Özlem Çetinoğlu, Jennifer Foster, Elaine Uí Dhonnchadha, Mark Dras, Josef van Genabith

This paper describes the early stages in the development of new language resources for Irish, namely the first Irish dependency treebank and the first Irish statistical dependency parser.

Machine Translation POS

Paper
Add Code

A Richly Annotated, Multilingual Parallel Corpus for Hybrid Machine Translation

no code implementations • LREC 2012 • Eleftherios Avramidis, Marta R. Costa-jussà, Christian Federmann, Josef van Genabith, Maite Melero, Pavel Pecina

This corpus aims to serve as a basic resource for further research on whether hybrid machine translation algorithms and system combination techniques can benefit from additional (linguistically motivated, decoding, and runtime) information provided by the different systems involved.

Machine Translation Translation

Paper
Add Code

Arabic Word Generation and Modelling for Spell Checking

no code implementations • LREC 2012 • Khaled Shaalan, Mohammed Attia, Pavel Pecina, Younes Samih, Josef van Genabith

Furthermore, from a large list of valid forms and invalid forms we create a character-based tri-gram language model to approximate knowledge about permissible character clusters in Arabic, creating a novel method for detecting spelling errors.

Language Modelling Morphological Analysis +2

Paper
Add Code

Automatic Extraction and Evaluation of Arabic LFG Resources

no code implementations • LREC 2012 • Mohammed Attia, Khaled Shaalan, Lamia Tounsi, Josef van Genabith

We utilize this annotation to automatically acquire grammatical function (dependency) based subcategorization frames and paths linking long-distance dependencies (LDDs).

POS

Paper
Add Code

The ML4HMT Workshop on Optimising the Division of Labour in Hybrid Machine Translation

no code implementations • LREC 2012 • Christian Federmann, Eleftherios Avramidis, Marta R. Costa-jussà, Josef van Genabith, Maite Melero, Pavel Pecina

We describe the Shared Task on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid Machine Translation (ML4HMT) which aims to foster research on improved system combination approaches for machine translation (MT).

Language Modelling Machine Translation +1

Paper
Add Code

Integrating Artificial and Human Intelligence for Efficient Translation

no code implementations • 7 Mar 2019 • Nico Herbig, Santanu Pal, Josef van Genabith, Antonio Krüger

Current advances in machine translation increase the need for translators to switch from traditional translation to post-editing of machine-translated text, a process that saves time and improves quality.

Machine Translation Translation

Paper
Add Code

CATaLog Online: Porting a Post-editing Tool to the Web

no code implementations • LREC 2016 • Santanu Pal, Marcos Zampieri, Sudip Kumar Naskar, Tapas Nayak, Mihaela Vela, Josef van Genabith

The tool features a number of editing and log functions similar to the desktop version of CATaLog enhanced with several new features that we describe in detail in this paper.

Machine Translation Management +1

Paper
Add Code

Fostering the Next Generation of European Language Technology: Recent Developments ― Emerging Initiatives ― Challenges and Opportunities

no code implementations • LREC 2016 • Georg Rehm, Jan Hajič, Josef van Genabith, Andrejs Vasiljevs

META-NET is a European network of excellence, founded in 2010, that consists of 60 research centres in 34 European countries.

Paper
Add Code

UdS Submission for the WMT 19 Automatic Post-Editing Task

no code implementations • WS 2019 • Hongfei Xu, Qiuhui Liu, Josef van Genabith

In this paper, we describe our submission to the English-German APE shared task at WMT 2019.

Automatic Post-Editing NMT

Paper
Add Code

JU-Saarland Submission to the WMT2019 English–Gujarati Translation Shared Task

no code implementations • WS 2019 • Riktim Mondal, Shankha Raj Nayek, Aditya Chowdhury, Santanu Pal, Sudip Kumar Naskar, Josef van Genabith

In this paper we describe our joint submission (JU-Saarland) from Jadavpur University and Saarland University in the WMT 2019 news translation shared task for the English–Gujarati language pair within the translation task sub-track.

Machine Translation NMT +1

Paper
Add Code

DFKI-NMT Submission to the WMT19 News Translation Task

no code implementations • WS 2019 • Jingyi Zhang, Josef van Genabith

This paper describes the DFKI-NMT submission to the WMT19 News translation task.

NMT Translation

Paper
Add Code

USAAR-DFKI – The Transference Architecture for English–German Automatic Post-Editing

no code implementations • WS 2019 • Santanu Pal, Hongfei Xu, Nico Herbig, Antonio Krüger, Josef van Genabith

In this paper we present an English–German Automatic Post-Editing (APE) system called transference, submitted to the APE Task organized at WMT 2019.

Automatic Post-Editing Translation

Paper
Add Code

UDS–DFKI Submission to the WMT2019 Czech–Polish Similar Language Translation Shared Task

no code implementations • WS 2019 • Santanu Pal, Marcos Zampieri, Josef van Genabith

The first edition of this shared task featured data from three pairs of similar languages: Czech and Polish, Hindi and Nepali, and Portuguese and Spanish.

Translation

Paper
Add Code

The Transference Architecture for Automatic Post-Editing

no code implementations • COLING 2020 • Santanu Pal, Hongfei Xu, Nico Herbig, Sudip Kumar Naskar, Antonio Krueger, Josef van Genabith

In automatic post-editing (APE) it makes sense to condition post-editing (pe) decisions on both the source (src) and the machine translated text (mt) as input.

Automatic Post-Editing NMT

Paper
Add Code

Improving CAT Tools in the Translation Workflow: New Approaches and Evaluation

no code implementations • WS 2019 • Mihaela Vela, Santanu Pal, Marcos Zampieri, Sudip Kumar Naskar, Josef van Genabith

User feedback revealed that the users preferred using CATaLog Online over existing CAT tools in some respects, especially by selecting the output of the MT system and taking advantage of the color scheme for TM suggestions.

Automatic Post-Editing Management +1

Paper
Add Code

UDS–DFKI Submission to the WMT2019 Similar Language Translation Shared Task

no code implementations • 16 Aug 2019 • Santanu Pal, Marcos Zampieri, Josef van Genabith

The first edition of this shared task featured data from three pairs of similar languages: Czech and Polish, Hindi and Nepali, and Portuguese and Spanish.

Translation

Paper
Add Code

Analysing Coreference in Transformer Outputs

no code implementations • WS 2019 • Ekaterina Lapshinova-Koltunski, Cristina España-Bonet, Josef van Genabith

We analyse coreference phenomena in three neural machine translation systems trained with different data settings with or without access to explicit intra- and cross-sentential anaphoric information.

Machine Translation Translation

Paper
Add Code

Lipschitz Constrained Parameter Initialization for Deep Transformers

no code implementations • ACL 2020 • Hongfei Xu, Qiuhui Liu, Josef van Genabith, Deyi Xiong, Jingyi Zhang

In this paper, we first empirically demonstrate that a simple modification made in the official implementation, which changes the computation order of residual connection and layer normalization, can significantly ease the optimization of deep Transformers.

Translation

Paper
Add Code

Probing Word Translations in the Transformer and Trading Decoder for Encoder Layers

no code implementations • NAACL 2021 • Hongfei Xu, Josef van Genabith, Qiuhui Liu, Deyi Xiong

Due to its effectiveness and performance, the Transformer translation model has attracted wide attention, most recently in terms of probing-based approaches.

Translation Word Translation

Paper
Add Code

The European Language Technology Landscape in 2020: Language-Centric and Human-Centric AI for Cross-Cultural Communication in Multilingual Europe

no code implementations • LREC 2020 • Georg Rehm, Katrin Marheinecke, Stefanie Hegele, Stelios Piperidis, Kalina Bontcheva, Jan Hajič, Khalid Choukri, Andrejs Vasiļjevs, Gerhard Backfried, Christoph Prinz, José Manuel Gómez Pérez, Luc Meertens, Paul Lukowicz, Josef van Genabith, Andrea Lösch, Philipp Slusallek, Morten Irgens, Patrick Gatellier, Joachim Köhler, Laure Le Bars, Dimitra Anastasiou, Albina Auksoriūtė, Núria Bel, António Branco, Gerhard Budin, Walter Daelemans, Koenraad De Smedt, Radovan Garabík, Maria Gavriilidou, Dagmar Gromann, Svetla Koeva, Simon Krek, Cvetana Krstev, Krister Lindén, Bernardo Magnini, Jan Odijk, Maciej Ogrodniczuk, Eiríkur Rögnvaldsson, Mike Rosner, Bolette Sandford Pedersen, Inguna Skadiņa, Marko Tadić, Dan Tufiş, Tamás Váradi, Kadri Vider, Andy Way, François Yvon

Multilingualism is a cultural cornerstone of Europe and firmly anchored in the European treaties including full language equality.

Misconceptions

Paper
Add Code

Self-Induced Curriculum Learning in Self-Supervised Neural Machine Translation

no code implementations • EMNLP 2020 • Dana Ruiter, Josef van Genabith, Cristina España-Bonet

Self-supervised neural machine translation (SSNMT) jointly learns to identify and select suitable training data from comparable (rather than parallel) corpora and to translate, in a way that the two tasks support each other in a virtuous circle.

Denoising Machine Translation +1

Paper
Add Code

Dynamically Adjusting Transformer Batch Size by Monitoring Gradient Direction Change

no code implementations • ACL 2020 • Hongfei Xu, Josef van Genabith, Deyi Xiong, Qiuhui Liu

We propose to automatically and dynamically determine batch sizes by accumulating gradients of mini-batches and performing an optimization step at just the time when the direction of gradients starts to fluctuate.

Paper

Language Data Sharing in European Public Services -- Overcoming Obstacles and Creating Sustainable Data Sharing Infrastructures

no code implementations • LREC 2020 • Lilli Smal, Andrea Lösch, Josef van Genabith, Maria Giagkou, Thierry Declerck, Stephan Busemann

Data is key in training modern language technologies.

Management

Paper

MMPE: A Multi-Modal Interface for Post-Editing Machine Translation

no code implementations • ACL 2020 • Nico Herbig, Tim Düwel, Santanu Pal, Kalliopi Meladaki, Mahsa Monshizadeh, Antonio Krüger, Josef van Genabith

On the other hand, speech and multi-modal combinations of select & speech are considered suitable for replacements and insertions but offer less potential for deletion and reordering.

Machine Translation Translation

Paper

MMPE: A Multi-Modal Interface using Handwriting, Touch Reordering, and Speech Commands for Post-Editing Machine Translation

no code implementations • ACL 2020 • Nico Herbig, Santanu Pal, Tim Düwel, Kalliopi Meladaki, Mahsa Monshizadeh, Vladislav Hnatovskiy, Antonio Krüger, Josef van Genabith

The shift from traditional translation to post-editing (PE) of machine-translated (MT) text can save time and reduce errors, but it also affects the design of translation interfaces, as the task changes from mainly generating text to correcting errors within otherwise helpful translation proposals.

Machine Translation Translation

Paper

How Human is Machine Translationese? Comparing Human and Machine Translations of Text and Speech

no code implementations • WS 2020 • Yuri Bizzoni, Tom S Juzek, Cristina España-Bonet, Koel Dutta Chowdhury, Josef van Genabith, Elke Teich

Some translationese features tend to appear in simultaneous interpreting with higher frequency than in human text translation, but the reasons for this are unclear.

Machine Translation Translation

Paper

Learning Source Phrase Representations for Neural Machine Translation

no code implementations • ACL 2020 • Hongfei Xu, Josef van Genabith, Deyi Xiong, Qiuhui Liu, Jingyi Zhang

Considering that modeling phrases instead of words has significantly improved the Statistical Machine Translation (SMT) approach through the use of larger translation blocks ("phrases") and its reordering ability, modeling NMT at phrase level is an intuitive proposal to help the model capture long-distance relationships.

Machine Translation NMT +1

Paper

Rewiring the Transformer with Depth-Wise LSTMs

no code implementations • 13 Jul 2020 • Hongfei Xu, Yang Song, Qiuhui Liu, Josef van Genabith, Deyi Xiong

Stacking non-linear layers allows deep neural networks to model complicated functions, and including residual connections in Transformer layers is beneficial for convergence and performance.

NMT Time Series Analysis

Paper

Linguistically inspired morphological inflection with a sequence to sequence model

no code implementations • 4 Sep 2020 • Eleni Metheniti, Guenter Neumann, Josef van Genabith

Inflection is an essential part of every human language's morphology, yet little effort has been made to unify linguistic theory and computational methods in recent years.

Language Acquisition LEMMA +1

Paper

Learning Hard Retrieval Decoder Attention for Transformers

no code implementations • Findings (EMNLP) 2021 • Hongfei Xu, Qiuhui Liu, Josef van Genabith, Deyi Xiong

The Transformer translation model is based on the multi-head attention mechanism, which can be parallelized easily.

Machine Translation Retrieval +2

Paper

Translation Quality Estimation by Jointly Learning to Score and Rank

no code implementations • EMNLP 2020 • Jingyi Zhang, Josef van Genabith

In order to make use of different types of human evaluation data for supervised learning, we present a multi-task learning QE model that jointly learns two tasks: score a translation and rank two translations.

Multi-Task Learning Sentence +2

Paper

Integrating Unsupervised Data Generation into Self-Supervised Neural Machine Translation for Low-Resource Languages

no code implementations • MTSummit 2021 • Dana Ruiter, Dietrich Klakow, Josef van Genabith, Cristina España-Bonet

For most language combinations, parallel data is either scarce or simply unavailable.

Denoising NMT +3

Paper

Modeling Task-Aware MIMO Cardinality for Efficient Multilingual Neural Machine Translation

no code implementations • ACL 2021 • Hongfei Xu, Qiuhui Liu, Josef van Genabith, Deyi Xiong

In this paper, we propose to efficiently increase the capacity for multilingual NMT by increasing the cardinality.

Machine Translation NMT +1

Paper

Multi-Head Highly Parallelized LSTM Decoder for Neural Machine Translation

no code implementations • ACL 2021 • Hongfei Xu, Qiuhui Liu, Josef van Genabith, Deyi Xiong, Meng Zhang

This has to be computed n times for a sequence of length n. The linear transformations involved in the LSTM gate and state computations are the major cost factors in this.

Machine Translation Translation

Paper

A Bidirectional Transformer Based Alignment Model for Unsupervised Word Alignment

no code implementations • ACL 2021 • Jingyi Zhang, Josef van Genabith

We further fine-tune the target-to-source attention in the BTBA model to obtain better alignments using a full context based optimization method and self-supervised training.

Machine Translation Translation +1

Paper

Understanding Translationese in Multi-view Embedding Spaces

no code implementations • COLING 2020 • Koel Dutta Chowdhury, Cristina España-Bonet, Josef van Genabith

Recent studies use a combination of lexical and syntactic features to show that footprints of the source language remain visible in translations, to the extent that it is possible to predict the original source language from the translation.

Translation

Paper

Comparing Feature-Engineering and Feature-Learning Approaches for Multilingual Translationese Classification

no code implementations • EMNLP 2021 • Daria Pylypenko, Kwabena Amponsah-Kaakyire, Koel Dutta Chowdhury, Josef van Genabith, Cristina España-Bonet

Traditional hand-crafted linguistically-informed features have often been used for distinguishing between translated and original non-translated texts.

Feature Engineering Feature Importance +1

Paper

Improving the Multi-Modal Post-Editing (MMPE) CAT Environment based on Professional Translators’ Feedback

no code implementations • AMTA 2020 • Nico Herbig, Santanu Pal, Tim Düwel, Raksha Shenoy, Antonio Krüger, Josef van Genabith

Paper

English to Manipuri and Mizo Post-Editing Effort and its Impact on Low Resource Machine Translation

no code implementations • ICON 2020 • Loitongbam Sanayai Meetei, Thoudam Doren Singh, Sivaji Bandyopadhyay, Mihaela Vela, Josef van Genabith

A Computer Assisted Translation (CAT) tool is used to record the time, keystroke and other indicators to measure PE effort in terms of temporal and technical effort.

Machine Translation Translation

Paper

UdS-DFKI@WMT20: Unsupervised MT and Very Low Resource Supervised MT for German-Upper Sorbian

no code implementations • WMT (EMNLP) 2020 • Sourav Dutta, Jesujoba Alabi, Saptarashmi Bandyopadhyay, Dana Ruiter, Josef van Genabith

This paper describes the UdS-DFKI submission to the shared task for unsupervised machine translation (MT) and very low-resource supervised MT between German (de) and Upper Sorbian (hsb) at the Fifth Conference of Machine Translation (WMT20).

Translation Unsupervised Machine Translation

Paper

Tracing Source Language Interference in Translation with Graph-Isomorphism Measures

no code implementations • RANLP 2021 • Koel Dutta Chowdhury, Cristina España-Bonet, Josef van Genabith

Previous research has used linguistic features to show that translations exhibit traces of source language interference and that phylogenetic trees between languages can be reconstructed from the results of translations into the same language.

Open-Ended Question Answering Translation

Paper

Self-Induced Curriculum Learning in Neural Machine Translation

no code implementations • 25 Sep 2019 • Dana Ruiter, Cristina España-Bonet, Josef van Genabith

Self-supervised neural machine translation (SS-NMT) learns how to extract/select suitable training data from comparable (rather than parallel) corpora and how to translate, in a way that the two tasks support each other in a virtuous circle.

Denoising Machine Translation +2

Paper

Going beyond zero-shot MT: combining phonological, morphological and semantic factors. The UdS-DFKI System at IWSLT 2017

no code implementations • IWSLT 2017 • Cristina España-Bonet, Josef van Genabith

This paper describes the UdS-DFKI participation to the multilingual task of the IWSLT Evaluation 2017.

Translation

Paper

Towards Debiasing Translation Artifacts

1 code implementation • NAACL 2022 • Koel Dutta Chowdhury, Rricha Jalota, Cristina España-Bonet, Josef van Genabith

Cross-lingual natural language processing relies on translation, either by humans or machines, at different levels, from translating training data to translating test sets.

Natural Language Inference Sentence +1

Paper
Code

Exploiting Social Media Content for Self-Supervised Style Transfer

1 code implementation • NAACL (SocialNLP) 2022 • Dana Ruiter, Thomas Kleinbauer, Cristina España-Bonet, Josef van Genabith, Dietrich Klakow

Recent research on style transfer takes inspiration from unsupervised neural machine translation (UNMT), learning from large amounts of non-parallel data by exploiting cycle consistency loss, back-translation, and denoising autoencoders.

Attribute Denoising +4

Paper
Code

Explaining Translationese: why are Neural Classifiers Better and what do they Learn?

no code implementations • 24 Oct 2022 • Kwabena Amponsah-Kaakyire, Daria Pylypenko, Josef van Genabith, Cristina España-Bonet

Previous research did not show (i) whether the difference is because of the features, the classifiers or both, and (ii) what the neural classifiers actually learn.

Feature Engineering Representation Learning

Paper

NAPG: Non-Autoregressive Program Generation for Hybrid Tabular-Textual Question Answering

no code implementations • 7 Nov 2022 • Tengxun Zhang, Hongfei Xu, Josef van Genabith, Deyi Xiong, Hongying Zan

Hybrid tabular-textual question answering (QA) requires reasoning from heterogeneous information, and the types of reasoning are mainly divided into numerical reasoning and span extraction.

Question Answering

Paper

RescueSpeech: A German Corpus for Speech Recognition in Search and Rescue Domain

no code implementations • 6 Jun 2023 • Sangeet Sagar, Mirco Ravanelli, Bernd Kiefer, Ivana Kruijff Korbayova, Josef van Genabith

Despite the recent advancements in speech recognition, there are still difficulties in accurately transcribing conversational and emotional speech in noisy and reverberant acoustic environments.

Decision Making Robust Speech Recognition +1

Paper

Measuring Spurious Correlation in Classification: 'Clever Hans' in Translationese

1 code implementation • 25 Aug 2023 • Angana Borah, Daria Pylypenko, Cristina España-Bonet, Josef van Genabith

Translationese signals are subtle (especially for professional translation) and compete with many other signals in the data such as genre, style, author, and, in particular, topic.

Classification

Paper
Code

Translating away Translationese without Parallel Data

no code implementations • 28 Oct 2023 • Rricha Jalota, Koel Dutta Chowdhury, Cristina España-Bonet, Josef van Genabith

We show how we can eliminate the need for parallel validation data by combining the self-supervised loss with an unsupervised loss.

Binary Classification Language Modelling +3

Paper

Investigating the Encoding of Words in BERT's Neurons using Feature Textualization

no code implementations • 14 Nov 2023 • Tanja Baeumel, Soniya Vijayakumar, Josef van Genabith, Guenter Neumann, Simon Ostermann

Pretrained language models (PLMs) form the basis of most state-of-the-art NLP technologies.

Paper

Where exactly does contextualization in a PLM happen?

no code implementations • 11 Dec 2023 • Soniya Vijayakumar, Tanja Bäumel, Simon Ostermann, Josef van Genabith

Pre-trained Language Models (PLMs) have been shown to be consistently successful in a plethora of NLP tasks due to their ability to learn contextualized representations of words (Ethayarajh, 2019).

Language Modelling Sentence +1

Paper
