Search Results for author: Andy Way

Found 134 papers, 14 papers with code

Introducing the Digital Language Equality Metric: Technological Factors

no code implementations TDLE (LREC) 2022 Federico Gaspari, Owen Gallagher, Georg Rehm, Maria Giagkou, Stelios Piperidis, Jane Dunne, Andy Way

The paper situates this ongoing work with a strong European focus in the broader context of related efforts, and explains how the DLE Metric can help track the progress towards DLE for all languages of Europe, focusing in particular on the role played by the TFs.

On Machine Translation of User Reviews

no code implementations RANLP 2021 Maja Popović, Alberto Poncelas, Marija Brkic, Andy Way

This work investigates neural machine translation (NMT) systems for translating English user reviews into Croatian and Serbian, two similar morphologically complex languages.

Machine Translation NMT +1

The ADAPT’s Submissions to the WMT20 Biomedical Translation Task

no code implementations WMT (EMNLP) 2020 Prashant Nayak, Rejwanul Haque, Andy Way

This paper describes the ADAPT Centre’s submissions to the WMT20 Biomedical Translation Shared Task for English-to-Basque.

Machine Translation NMT +1

The ADAPT System Description for the WMT20 News Translation Task

no code implementations WMT (EMNLP) 2020 Venkatesh Parthasarathy, Akshai Ramesh, Rejwanul Haque, Andy Way

This paper describes the ADAPT Centre’s submissions to the WMT20 News translation shared task for English-to-Tamil and Tamil-to-English.

Machine Translation NMT +1

Identifying Complaints from Product Reviews: A Case Study on Hindi

1 code implementation ICON 2020 Raghvendra Pratap Singh, Rejwanul Haque, Mohammed Hasanuzzaman, Andy Way

Automatic recognition of customer complaints on products or services that they purchase can be crucial for the organisations, multinationals and online retailers since they can exploit this information to fulfil their customers’ expectations including managing and resolving the complaints.

Machine Translation NMT +1

Using Wordnet to Improve Reordering in Hierarchical Phrase-Based Statistical Machine Translation

no code implementations GWC 2016 Arefeh Kazemi, Antonio Toral, Andy Way

We propose the use of WordNet synsets in a syntax-based reordering model for hierarchical statistical machine translation (HPB-SMT) to enable the model to generalize to phrases not seen in the training data but that have equivalent meaning.

Machine Translation Translation

Machine Translation in the Covid domain: an English-Irish case study for LoResMT 2021

no code implementations MTSummit 2021 Seamus Lankford, Haithem Afli, Andy Way

Translation models for the specific domain of translating Covid data from English to Irish were developed for the LoResMT 2021 shared task.

Domain Adaptation Machine Translation +1

Modelling Source- and Target- Language Syntactic Information as Conditional Context in Interactive Neural Machine Translation

no code implementations EAMT 2020 Kamal Kumar Gupta, Rejwanul Haque, Asif Ekbal, Pushpak Bhattacharyya, Andy Way

In this study, we model source-language syntactic constituency parse and target-language syntactic descriptions in the form of supertags as conditional context for interactive prediction in neural MT (NMT).

Machine Translation NMT +1

MT syntactic priming effects on L2 English speakers

no code implementations EAMT 2020 Natália Resende, Benjamin Cowan, Andy Way

In this paper, we tested 20 Brazilian Portuguese speakers at intermediate and advanced English proficiency levels to investigate the influence of Google Translate’s MT system on the mental processing of English as a second language.


Investigating Low-resource Machine Translation for English-to-Tamil

no code implementations loresmt (AACL) 2020 Akshai Ramesh, Venkatesh Balavadhani parthasa, Rejwanul Haque, Andy Way

Statistical machine translation (SMT), which was the dominant paradigm in machine translation (MT) research for nearly three decades, has recently been superseded by end-to-end deep learning approaches to MT.

Machine Translation NMT +1

gaHealth: An English–Irish Bilingual Corpus of Health Data

1 code implementation LREC 2022 Séamus Lankford, Haithem Afli, Órla Ní Loinsigh, Andy Way

However, in the context of low-resource languages, there is a paucity of parallel datasets available for developing translation models.

Machine Translation Translation

Compiling a Highly Accurate Bilingual Lexicon by Combining Different Approaches

no code implementations gwll (LREC) 2022 Steinþór Steingrímsson, Luke O’Brien, Finnur Ingimundarson, Hrafn Loftsson, Andy Way

By combining the most promising approaches and data sets, using confidence scores calculated from the data and the results of manually evaluating samples from our manual evaluation as indicators, we are able to induce lists of translations with a very high acceptance rate.

Cross-Lingual Word Embeddings Machine Translation +1

Developing Machine Translation Engines for Multilingual Participatory Spaces

no code implementations EAMT 2022 Pintu Lohar, Guodong Xie, Andy Way

It is often a challenging task to build Machine Translation (MT) engines for a specific domain due to the lack of parallel data in that area.

Machine Translation Translation

Overview of the ELE Project

no code implementations EAMT 2022 Itziar Aldabe, Jane Dunne, Aritz Farwell, Owen Gallagher, Federico Gaspari, Maria Giagkou, Jan Hajic, Jens Peter Kückens, Teresa Lynn, Georg Rehm, German Rigau, Katrin Marheinecke, Stelios Piperidis, Natalia Resende, Tea Vojtěchová, Andy Way

This paper provides an overview of the ongoing European Language Equality (ELE) project, an 18-month action funded by the European Commission which involves 52 partners.

Neural Machine Translation for translating into Croatian and Serbian

no code implementations VarDial (COLING) 2020 Maja Popović, Alberto Poncelas, Marija Brkic, Andy Way

Furthermore, translation performance from English is much better than from German, partly because German is morphologically more complex and partly because the corpus consists mostly of parallel human translations instead of original text and its human translation.

Machine Translation NMT +1

An Error-based Investigation of Statistical and Neural Machine Translation Performance on Hindi-to-Tamil and English-to-Tamil

no code implementations AACL (WAT) 2020 Akshai Ramesh, Venkatesh Balavadhani parthasa, Rejwanul Haque, Andy Way

Statistical machine translation (SMT) was the state-of-the-art in machine translation (MT) research for more than two decades, but has since been superseded by neural MT (NMT).

Machine Translation NMT +1

The ADAPT Centre’s Neural MT Systems for the WAT 2020 Document-Level Translation Task

no code implementations AACL (WAT) 2020 Wandri Jooste, Rejwanul Haque, Andy Way

In this paper we describe the ADAPT Centre’s submissions to the WAT 2020 document-level Business Scene Dialogue (BSD) Translation task.

Data Augmentation Machine Translation +1

MTrill project: Machine Translation impact on language learning

no code implementations EAMT 2020 Natália Resende, Andy Way

The literature on MT and Computer Assisted Language Learning (CALL) shows that, over the years, MT systems have been facilitating language teaching and also language learning (Niño, 2006).

Language Acquisition Machine Translation +1

A human evaluation of English-Irish statistical and neural machine translation

no code implementations EAMT 2020 Meghan Dowling, Sheila Castilho, Joss Moorkens, Teresa Lynn, Andy Way

With official status in both Ireland and the EU, there is a need for high-quality English-Irish (EN-GA) machine translation (MT) systems which are suitable for use in a professional translation environment.

Machine Translation Translation

DELA Corpus - A Document-Level Corpus Annotated with Context-Related Issues

1 code implementation WMT (EMNLP) 2021 Sheila Castilho, João Lucas Cavalheiro Camargo, Miguel Menezes, Andy Way

Recently, the Machine Translation (MT) community has become more interested in document-level evaluation especially in light of reactions to claims of “human parity”, since examining the quality at the level of the document rather than at the sentence level allows for the assessment of suprasentential context, providing a more reliable evaluation.

Machine Translation Translation

SentAlign: Accurate and Scalable Sentence Alignment

1 code implementation 15 Nov 2023 Steinþór Steingrímsson, Hrafn Loftsson, Andy Way

We present SentAlign, an accurate sentence alignment tool designed to handle very large parallel document pairs.

Machine Translation Translation

Adaptive Machine Translation with Large Language Models

1 code implementation 30 Jan 2023 Yasmin Moslem, Rejwanul Haque, John D. Kelleher, Andy Way

By feeding an LLM at inference time with a prompt that consists of a list of translation pairs, it can then simulate the domain and style characteristics.

Domain Adaptation Information Retrieval +4

Dependency Graph-to-String Statistical Machine Translation

no code implementations 20 Mar 2021 Liangyou Li, Andy Way, Qun Liu

We present graph-based translation models which translate source graphs into target strings.

Machine Translation Translation

Arabisc: Context-Sensitive Neural Spelling Checker

1 code implementation 1 Dec 2020 Yasmin Moslem, Rejwanul Haque, Andy Way

Accordingly, we made use of a bidirectional LSTM language model (LM) for our context-sensitive spelling detection and correction model which is shown to have much control over the correction process.

Language Modelling Spelling Correction

The Impact of Indirect Machine Translation on Sentiment Classification

no code implementations AMTA 2020 Alberto Poncelas, Pintu Lohar, Andy Way, James Hadley

Furthermore, as performing a direct translation is not always possible, we explore the performance of automatic classifiers on sentences that have been translated using a pivot MT system.

Classification General Classification +4

The ADAPT System Description for the STAPLE 2020 English-to-Portuguese Translation Task

no code implementations WS 2020 Rejwanul Haque, Yasmin Moslem, Andy Way

This paper describes the ADAPT Centre’s submission to STAPLE (Simultaneous Translation and Paraphrase for Language Education) 2020, a shared task of the 4th Workshop on Neural Generation and Translation (WNGT), for the English-to-Portuguese translation task.

Machine Translation Translation

Effectively Aligning and Filtering Parallel Corpora under Sparse Data Conditions

no code implementations ACL 2020 Steinþór Steingrímsson, Hrafn Loftsson, Andy Way

When rich morphology exacerbates the data sparsity problem, it is imperative to have accurate alignment and filtering methods that can help make the most of what is available by maximising the number of correctly translated segments in a corpus and minimising noise by removing incorrect translations and segments containing extraneous data.

Machine Translation Translation

Selecting Backtranslated Data from Multiple Sources for Improved Neural Machine Translation

no code implementations ACL 2020 Xabier Soto, Dimitar Shterionov, Alberto Poncelas, Andy Way

Machine translation (MT) has benefited from using synthetic training data originating from translating monolingual corpora, a technique known as backtranslation.

Machine Translation Translation

On Context Span Needed for Machine Translation Evaluation

no code implementations LREC 2020 Sheila Castilho, Maja Popović, Andy Way

Despite increasing efforts to improve evaluation of machine translation (MT) by going beyond the sentence level to the document level, the definition of what exactly constitutes a “document level” is still not clear.

Machine Translation Translation

A Tool for Facilitating OCR Postediting in Historical Documents

1 code implementation LREC 2020 Alberto Poncelas, Mohammad Aboomar, Jan Buts, James Hadley, Andy Way

This paper reports on a tool built for postediting the output of Tesseract, more specifically for correcting common errors in digitized historical documents.

Language Modelling Optical Character Recognition +1

Multiple Segmentations of Thai Sentences for Neural Machine Translation

no code implementations LREC 2020 Alberto Poncelas, Wichaya Pidchamook, Chao-Hong Liu, James Hadley, Andy Way

Thai is a low-resource language, so it is often the case that data is not available in sufficient quantities to train a Neural Machine Translation (NMT) model which performs to a high level of quality.

Machine Translation NMT +1

Selecting Artificially-Generated Sentences for Fine-Tuning Neural Machine Translation

no code implementations WS 2019 Alberto Poncelas, Andy Way

Neural Machine Translation (NMT) models tend to achieve best performance when larger sets of parallel sentences are provided for training.

Machine Translation NMT +2

Getting Gender Right in Neural Machine Translation

no code implementations EMNLP 2018 Eva Vanmassenhove, Christian Hardmeier, Andy Way

Our contribution is two-fold: (1) the compilation of large datasets with speaker information for 20 language pairs, and (2) a simple set of experiments that incorporate gender information into NMT for multiple language pairs.

Machine Translation NMT +1

Combining SMT and NMT Back-Translated Data for Efficient NMT

no code implementations 9 Sep 2019 Alberto Poncelas, Maja Popovic, Dimitar Shterionov, Gideon Maillette de Buy Wenniger, Andy Way

Neural Machine Translation (NMT) models achieve their best performance when large sets of parallel data are used for training.

Machine Translation NMT +1

Building English-to-Serbian Machine Translation System for IMDb Movie Reviews

1 code implementation WS 2019 Pintu Lohar, Maja Popović, Andy Way

This paper reports the results of the first experiment dealing with the challenges of building a machine translation system for user-generated content involving a complex South Slavic language.

Machine Translation Translation

Lost in Translation: Loss and Decay of Linguistic Richness in Machine Translation

no code implementations WS 2019 Eva Vanmassenhove, Dimitar Shterionov, Andy Way

This work presents an empirical approach to quantifying the loss of lexical richness in Machine Translation (MT) systems compared to Human Translation (HT).

Machine Translation Translation

No Padding Please: Efficient Neural Handwriting Recognition

1 code implementation 28 Feb 2019 Gideon Maillette de Buy Wenniger, Lambert Schomaker, Andy Way

Neural handwriting recognition (NHR) is the recognition of handwritten text with deep learning models, such as multi-dimensional long short-term memory (MDLSTM) recurrent neural networks.

Handwriting Recognition Handwritten Text Recognition

ABI Neural Ensemble Model for Gender Prediction Adapt Bar-Ilan Submission for the CLIN29 Shared Task on Gender Prediction

no code implementations 23 Feb 2019 Eva Vanmassenhove, Amit Moryossef, Alberto Poncelas, Andy Way, Dimitar Shterionov

In contradiction with the results described in previous comparable shared tasks, our neural models performed better than our best traditional approaches with our best feature set-up.

Gender Prediction

Data Selection with Feature Decay Algorithms Using an Approximated Target Side

no code implementations IWSLT (EMNLP) 2018 Alberto Poncelas, Gideon Maillette de Buy Wenniger, Andy Way

A limitation of these methods to date is that using the source-side test set does not by itself guarantee that sentences are selected with correct translations, or translations that are suitable given the test-set domain.

Machine Translation NMT +2

Learning to Jointly Translate and Predict Dropped Pronouns with a Shared Reconstruction Mechanism

no code implementations EMNLP 2018 Long-Yue Wang, Zhaopeng Tu, Andy Way, Qun Liu

Pronouns are frequently omitted in pro-drop languages, such as Chinese, generally leading to significant challenges with respect to the production of complete translations.

Machine Translation Translation

Extracting In-domain Training Corpora for Neural Machine Translation Using Data Selection Methods

no code implementations WS 2018 Catarina Cruz Silva, Chao-Hong Liu, Alberto Poncelas, Andy Way

Data selection is a process used in selecting a subset of parallel data for the training of machine translation (MT) systems, so that 1) resources for training might be reduced, 2) trained models could perform better than those trained with the whole corpus, and/or 3) trained models are more tailored to specific domains.

Machine Translation NMT +1

Multi-Level Structured Self-Attentions for Distantly Supervised Relation Extraction

no code implementations EMNLP 2018 Jinhua Du, Jingguang Han, Andy Way, Dadong Wan

Targeting the MIL issue, the structured sentence-level attention learns a 2-D matrix where each row vector represents a weight distribution on selection of different valid instances.

Relation Extraction valid

Attaining the Unattainable? Reassessing Claims of Human Parity in Neural Machine Translation

1 code implementation WS 2018 Antonio Toral, Sheila Castilho, Ke Hu, Andy Way

We reassess a recent study (Hassan et al., 2018) that claimed that machine translation (MT) has reached human parity for the translation of news from Chinese into English, using pairwise ranking and considering three variables that were not taken into account in that previous study: the language in which the source side of the test set was originally written, the translation proficiency of the evaluators, and the provision of inter-sentential context.

Machine Translation Test +1

Tailoring Neural Architectures for Translating from Morphologically Rich Languages

no code implementations COLING 2018 Peyman Passban, Andy Way, Qun Liu

A morphologically complex word (MCW) is a hierarchical constituent with meaning-preserving subunits, so word-based models which rely on surface forms might not be powerful enough to translate such structures.

Machine Translation NMT +1

Fine-Grained Temporal Orientation and its Relationship with Psycho-Demographic Correlates

no code implementations NAACL 2018 Sabyasachi Kamila, Mohammed Hasanuzzaman, Asif Ekbal, Pushpak Bhattacharyya, Andy Way

In this paper, we propose a very first study to demonstrate the association between the sentiment view of the temporal orientation of the users and their different psycho-demographic attributes by analyzing their tweets.

Decision Making Test

Investigating Backtranslation in Neural Machine Translation

no code implementations 17 Apr 2018 Alberto Poncelas, Dimitar Shterionov, Andy Way, Gideon Maillette de Buy Wenniger, Peyman Passban

A prerequisite for training corpus-based machine translation (MT) systems -- either Statistical MT (SMT) or Neural MT (NMT) -- is the availability of high-quality parallel data.

Machine Translation NMT +1

Quality expectations of machine translation

no code implementations 22 Mar 2018 Andy Way

Machine Translation (MT) is being deployed for a range of use-cases by millions of people on a daily basis.

Machine Translation Translation

What Level of Quality can Neural Machine Translation Attain on Literary Text?

no code implementations 15 Jan 2018 Antonio Toral, Andy Way

Given the rise of a new approach to MT, Neural MT (NMT), and its promising performance on different text types, we assess the translation quality it can attain on what is perceived to be the greatest challenge for MT: literary text.

Machine Translation NMT +1

ADAPT at IJCNLP-2017 Task 4: A Multinomial Naive Bayes Classification Approach for Customer Feedback Analysis task

no code implementations IJCNLP 2017 Pintu Lohar, Koel Dutta Chowdhury, Haithem Afli, Mohammed Hasanuzzaman, Andy Way

In this paper, we analyse the real world samples of customer feedback from Microsoft Office customers in four languages, i.e., English, French, Spanish and Japanese and conclude a five-plus-one-classes categorisation (comment, request, bug, complaint, meaningless and undetermined) for meaning classification.

Classification General Classification +3

Demographic Word Embeddings for Racism Detection on Twitter

no code implementations IJCNLP 2017 Mohammed Hasanuzzaman, Gaël Dias, Andy Way

Most social media platforms grant users freedom of speech by allowing them to freely express their thoughts, beliefs, and opinions.

Classification General Classification +1

Semantics-Enhanced Task-Oriented Dialogue Translation: A Case Study on Hotel Booking

no code implementations IJCNLP 2017 Long-Yue Wang, Jinhua Du, Liangyou Li, Zhaopeng Tu, Andy Way, Qun Liu

We showcase TODAY, a semantics-enhanced task-oriented dialogue translation system, whose novelties are: (i) task-oriented named entity (NE) definition and a hybrid strategy for NE recognition and translation; and (ii) a novel grounded semantic method for dialogue understanding and task-order management.

Dialogue Understanding Machine Translation +3

Human Evaluation of Multi-modal Neural Machine Translation: A Case-Study on E-Commerce Listing Titles

no code implementations WS 2017 Iacer Calixto, Daniel Stein, Evgeny Matusov, Sheila Castilho, Andy Way

Nonetheless, human evaluators ranked translations from a multi-modal NMT model as better than those of a text-only NMT over 88% of the time, which suggests that images do help NMT in this use-case.

Machine Translation NMT +1

Ethical Considerations in NLP Shared Tasks

no code implementations WS 2017 Carla Parra Escartín, Wessel Reijers, Teresa Lynn, Joss Moorkens, Andy Way, Chao-Hong Liu

Shared tasks are increasingly common in our field, and new challenges are suggested at almost every conference and workshop.

Ethics Machine Translation

Using Images to Improve Machine-Translating E-Commerce Product Listings.

no code implementations EACL 2017 Iacer Calixto, Daniel Stein, Evgeny Matusov, Pintu Lohar, Sheila Castilho, Andy Way

We evaluate our models quantitatively using BLEU and TER and find that (i) additional synthetic data has a general positive impact on text-only and multi-modal NMT models, and that (ii) using a multi-modal NMT model for re-ranking n-best lists improves TER significantly across different n-best list sizes.

Machine Translation NMT +2

Context-Aware Graph Segmentation for Graph-Based Translation

no code implementations EACL 2017 Liangyou Li, Andy Way, Qun Liu

In this paper, we present an improved graph-based translation model which segments an input graph into node-induced subgraphs by taking source context into consideration.

Segmentation Translation

Identifying Effective Translations for Cross-lingual Arabic-to-English User-generated Speech Search

no code implementations WS 2017 Ahmad Khwileh, Haithem Afli, Gareth Jones, Andy Way

Cross Language Information Retrieval (CLIR) systems are a valuable tool to enable speakers of one language to search for content of interest expressed in a different language.

Information Retrieval Machine Translation +2

Fast Gated Neural Domain Adaptation: Language Model as a Case Study

no code implementations COLING 2016 Jian Zhang, Xiaofeng Wu, Andy Way, Qun Liu

We show that the neural LM perplexity can be reduced by 7.395 and 12.011 using the proposed domain adaptation mechanism on the Penn Treebank and News data, respectively.

Domain Adaptation Language Modelling +2

Enriching Phrase Tables for Statistical Machine Translation Using Mixed Embeddings

no code implementations COLING 2016 Peyman Passban, Qun Liu, Andy Way

PBSMT engines by default provide four probability scores in phrase tables which are considered as the main set of bilingual features.

Document Classification Machine Translation +2

Topic-Informed Neural Machine Translation

no code implementations COLING 2016 Jian Zhang, Liangyou Li, Andy Way, Qun Liu

In recent years, neural machine translation (NMT) has demonstrated state-of-the-art machine translation (MT) performance.

Machine Translation NMT +3

Automatic Construction of Discourse Corpora for Dialogue Translation

no code implementations LREC 2016 Long-Yue Wang, Xiaojun Zhang, Zhaopeng Tu, Andy Way, Qun Liu

Then tags such as speaker and discourse boundary from the script data are projected to its subtitle data via an information retrieval approach in order to map monolingual discourse to bilingual texts.

Information Retrieval Language Modelling +3

ProphetMT: A Tree-based SMT-driven Controlled Language Authoring/Post-Editing Tool

no code implementations LREC 2016 Xiaofeng Wu, Jinhua Du, Qun Liu, Andy Way

This paper presents ProphetMT, a tree-based SMT-driven Controlled Language (CL) authoring and post-editing tool.


Enhancing Access to Online Education: Quality Machine Translation of MOOC Content

no code implementations LREC 2016 Valia Kordoni, Antal Van den Bosch, Katia Lida Kermanidis, Vilelmini Sosoni, Kostadin Cholakov, Iris Hendrickx, Matthias Huck, Andy Way

The present work is an overview of the TraMOOC (Translation for Massive Open Online Courses) research and innovation project, a machine translation approach for online educational content.

Machine Translation Sentiment Analysis +1

Using SMT for OCR Error Correction of Historical Texts

no code implementations LREC 2016 Haithem Afli, Zhengwei Qiu, Andy Way, Páraic Sheridan

A trend to digitize historical paper-based archives has emerged in recent years, with the advent of digital optical scanners.

Language Modelling Machine Translation +3

Using BabelNet to Improve OOV Coverage in SMT

no code implementations LREC 2016 Jinhua Du, Andy Way, Andrzej Zydron

Out-of-vocabulary words (OOVs) are a ubiquitous and difficult problem in statistical machine translation (SMT).

Domain Adaptation Machine Translation +1

A Novel Approach to Dropped Pronoun Translation

no code implementations NAACL 2016 Long-Yue Wang, Zhaopeng Tu, Xiaojun Zhang, Hang Li, Andy Way, Qun Liu

Finally, we integrate the above outputs into our translation system to recall missing pronouns by both extracting rules from the DP-labelled training data and translating the DP-generated input sentences.

Machine Translation Translation

SUMAT: Data Collection and Parallel Corpus Compilation for Machine Translation of Subtitles

no code implementations LREC 2012 Volha Petukhova, Rodrigo Agerri, Mark Fishel, Sergio Penkale, Arantza del Pozo, Mirjam Sepesy Maučec, Andy Way, Panayota Georgakopoulou, Martin Volk

Subtitling and audiovisual translation have been recognized as areas that could greatly benefit from the introduction of Statistical Machine Translation (SMT) followed by post-editing, in order to increase the efficiency of the subtitle production process.

Machine Translation Translation
