Search Results for author: Lucia Specia

Found 199 papers, 41 papers with code

Bayesian Model-Agnostic Meta-Learning with Matrix-Valued Kernels for Quality Estimation

no code implementations ACL (RepL4NLP) 2021 Abiola Obamuyide, Marina Fomicheva, Lucia Specia

To address these challenges, we propose a Bayesian meta-learning approach for adapting QE models to the needs and preferences of each user with limited supervision.

Machine Translation Meta-Learning +1

The (Un)Suitability of Automatic Evaluation Metrics for Text Simplification

1 code implementation CL (ACL) 2021 Fernando Alva-Manchego, Carolina Scarton, Lucia Specia

Second, we conduct the first meta-evaluation of automatic metrics in Text Simplification, using our new data set (and other existing data) to analyze the variation of the correlation between metrics’ scores and human judgments across three dimensions: the perceived simplicity level, the system type, and the set of references used for computation.

Sentence Text Simplification

Towards a Better Understanding of Noise in Natural Language Processing

no code implementations RANLP 2021 Khetam Al Sharou, Zhenhao Li, Lucia Specia

In this paper, we propose a definition and taxonomy of various types of non-standard textual content – generally referred to as “noise” – in Natural Language Processing (NLP).

Multimodal Simultaneous Machine Translation

no code implementations MMTLRL (RANLP) 2021 Lucia Specia

Simultaneous machine translation (SiMT) aims to translate a continuous input text stream into another language with the lowest latency and highest quality possible.

Machine Translation Translation

Quality In, Quality Out: Learning from Actual Mistakes

no code implementations EAMT 2020 Frederic Blain, Nikolaos Aletras, Lucia Specia

However, QE models are often trained on noisy approximations of quality annotations derived from the proportion of post-edited words in translated sentences instead of direct human annotations of translation errors.

Machine Translation Sentence +2

The IWSLT 2019 Evaluation Campaign

no code implementations EMNLP (IWSLT) 2019 Jan Niehues, Rolando Cattoni, Sebastian Stüker, Matteo Negri, Marco Turchi, Thanh-Le Ha, Elizabeth Salesky, Ramon Sanabria, Loic Barrault, Lucia Specia, Marcello Federico

The IWSLT 2019 evaluation campaign featured three tasks: speech translation of (i) TED talks and (ii) How2 instructional videos from English into German and Portuguese, and (iii) text translation of TED talks from English into Czech.

Translation

Findings of the WMT 2020 Shared Task on Quality Estimation

no code implementations WMT (EMNLP) 2020 Lucia Specia, Frédéric Blain, Marina Fomicheva, Erick Fonseca, Vishrav Chaudhary, Francisco Guzmán, André F. T. Martins

We report the results of the WMT20 shared task on Quality Estimation, where the challenge is to predict the quality of the output of neural machine translation systems at the word, sentence and document levels.

Machine Translation Sentence +1

BERGAMOT-LATTE Submissions for the WMT20 Quality Estimation Shared Task

no code implementations WMT (EMNLP) 2020 Marina Fomicheva, Shuo Sun, Lisa Yankovskaya, Frédéric Blain, Vishrav Chaudhary, Mark Fishel, Francisco Guzmán, Lucia Specia

We explore (a) a black-box approach to QE based on pre-trained representations; and (b) glass-box approaches that leverage various indicators that can be extracted from the neural MT systems.

Sentence Task 2

Leveraging Pre-trained Language Models for Gender Debiasing

no code implementations LREC 2022 Nishtha Jain, Declan Groves, Lucia Specia, Maja Popović

This work explores a light-weight method to generate gender variants for a given text using pre-trained language models as the resource, without any task-specific labelled data.

Text Generation

Multilingual and Multimodal Learning for Brazilian Portuguese

no code implementations LREC 2022 Júlia Sato, Helena Caseli, Lucia Specia

The good BLEU and METEOR values obtained for this new language pair, relative to the original English-German VTLM, establish the suitability of the model for other languages.

Language Modelling Sentence +1

A Taxonomy and Study of Critical Errors in Machine Translation

no code implementations EAMT 2022 Khetam Al Sharou, Lucia Specia

We also study the impact of the source text on generating critical errors in the translation and, based on this, propose a set of recommendations on aspects of the MT that need further scrutiny, especially for user-generated content, to avoid generating such errors, and hence improve online communication.

Machine Translation Translation

Bias Mitigation in Machine Translation Quality Estimation

1 code implementation ACL 2022 Hanna Behnke, Marina Fomicheva, Lucia Specia

Machine Translation Quality Estimation (QE) aims to build predictive models to assess the quality of machine-generated translations in the absence of reference translations.

Binary Classification Machine Translation +1

Validating Quality Estimation in a Computer-Aided Translation Workflow: Speed, Cost and Quality Trade-off

no code implementations MTSummit 2021 Fernando Alva-Manchego, Lucia Specia, Sara Szoc, Tom Vanallemeersch, Heidi Depraetere

In this scenario, a Quality Estimation (QE) tool can be used to score MT outputs, and a threshold on the QE scores can be applied to decide whether an MT output can be used as-is or requires human post-edition.

Machine Translation Translation
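
The decision step described above is, in essence, a thresholding rule over sentence-level QE scores. A minimal sketch of that routing logic in Python, where the scorer, the score range and the threshold value are assumptions rather than settings taken from the paper:

    # Minimal sketch of QE-based routing in a CAT workflow (illustrative only).
    # Assumptions: qe_score() returns a sentence-level quality score in [0, 1],
    # higher meaning better, and THRESHOLD is tuned on held-out post-editing data.
    THRESHOLD = 0.8  # hypothetical operating point

    def route(source: str, mt_output: str, qe_score) -> str:
        """Decide whether an MT output can be used as-is or needs post-editing."""
        score = qe_score(source, mt_output)
        return "use-as-is" if score >= THRESHOLD else "post-edit"

    if __name__ == "__main__":
        dummy_qe = lambda src, hyp: 0.65  # stand-in for a real QE model
        print(route("Hello world", "Bonjour le monde", dummy_qe))  # -> post-edit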

ICL’s Submission to the WMT21 Critical Error Detection Shared Task

no code implementations WMT (EMNLP) 2021 Genze Jiang, Zhenhao Li, Lucia Specia

This paper presents Imperial College London’s submissions to the WMT21 Quality Estimation (QE) Shared Task 3: Critical Error Detection.

Feature Engineering

Findings of the WMT 2021 Shared Task on Quality Estimation

no code implementations WMT (EMNLP) 2021 Lucia Specia, Frédéric Blain, Marina Fomicheva, Chrysoula Zerva, Zhenhao Li, Vishrav Chaudhary, André F. T. Martins

We report the results of the WMT 2021 shared task on Quality Estimation, where the challenge is to predict the quality of the output of neural machine translation systems at the word and sentence levels.

Machine Translation Sentence +1

From Understanding to Utilization: A Survey on Explainability for Large Language Models

no code implementations 23 Jan 2024 Haoyan Luo, Lucia Specia

This survey underscores the imperative for increased explainability in LLMs, delving into both the research on explainability and the various methodologies and tasks that utilize an understanding of these models.

Model Editing

Reducing Hallucinations in Neural Machine Translation with Feature Attribution

no code implementations 17 Nov 2022 Joël Tang, Marina Fomicheva, Lucia Specia

We present a case study focusing on model understanding and regularisation to reduce hallucinations in NMT.

Machine Translation NMT +2

Scene Text Recognition with Semantics

no code implementations 19 Oct 2022 Joshua Cesare Placidi, Yishu Miao, Zixu Wang, Lucia Specia

Scene Text Recognition (STR) models have achieved high performance in recent years on benchmark datasets where text images are presented with minimal noise.

Scene Text Recognition

Contrastive Video-Language Learning with Fine-grained Frame Sampling

no code implementations 10 Oct 2022 Zixu Wang, Yujie Zhong, Yishu Miao, Lin Ma, Lucia Specia

However, even in paired video-text segments, only a subset of the frames is semantically relevant to the corresponding text, with the remainder representing noise, and the ratio of noisy frames is higher for longer videos.

Question Answering Representation Learning +3

Burst2Vec: An Adversarial Multi-Task Approach for Predicting Emotion, Age, and Origin from Vocal Bursts

1 code implementation 24 Jun 2022 Atijit Anuchitanukul, Lucia Specia

We present Burst2Vec, our multi-task learning approach to predict emotion, age, and origin (i.e., native country/language) from vocal bursts.

Multi-Task Learning

Logically Consistent Adversarial Attacks for Soft Theorem Provers

1 code implementation 29 Apr 2022 Alexander Gaskell, Yishu Miao, Lucia Specia, Francesca Toni

We propose a novel, generative adversarial framework for probing and improving these models' reasoning capabilities.

Automated Theorem Proving

Supervised Visual Attention for Simultaneous Multimodal Machine Translation

no code implementations 23 Jan 2022 Veneta Haralampieva, Ozan Caglayan, Lucia Specia

A particular use for such multimodal systems is the task of simultaneous machine translation, where visual context has been shown to complement the partial information provided by the source sentence, especially in the early phases of translation.

Multimodal Machine Translation Sentence +1

Revisiting Contextual Toxicity Detection in Conversations

no code implementations 24 Nov 2021 Atijit Anuchitanukul, Julia Ive, Lucia Specia

We then propose to bring these findings into computational detection models by introducing and evaluating (a) neural architectures for contextual toxicity detection that are aware of the conversational structure, and (b) data augmentation strategies that can help model contextual toxicity detection.

Data Augmentation Toxic Comment Classification

Guiding Visual Question Generation

no code implementations NAACL 2022 Nihir Vedd, Zixu Wang, Marek Rei, Yishu Miao, Lucia Specia

In traditional Visual Question Generation (VQG), most images have multiple concepts (e.g., objects and categories) for which a question could be generated, but models are trained to mimic an arbitrary choice of concept as given in their training data.

Question Generation Question-Generation +2

Pushing the Right Buttons: Adversarial Evaluation of Quality Estimation

1 code implementation WMT (EMNLP) 2021 Diptesh Kanojia, Marina Fomicheva, Tharindu Ranasinghe, Frédéric Blain, Constantin Orăsan, Lucia Specia

However, this ability is yet to be tested in the current evaluation practices, where QE systems are assessed only in terms of their correlation with human judgements.

Machine Translation Translation

Classification-based Quality Estimation: Small and Efficient Models for Real-world Applications

no code implementations EMNLP 2021 Shuo Sun, Ahmed El-Kishky, Vishrav Chaudhary, James Cross, Francisco Guzmán, Lucia Specia

Sentence-level Quality estimation (QE) of machine translation is traditionally formulated as a regression task, and the performance of QE models is typically measured by Pearson correlation with human labels.

Machine Translation Model Compression +3
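
Since sentence-level QE is conventionally scored by Pearson correlation with human labels, here is a small self-contained example of that evaluation step; the score lists are made-up numbers, not data from the paper:

    import math

    def pearson(xs, ys):
        """Pearson correlation between predicted QE scores and human labels."""
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
        sy = math.sqrt(sum((y - my) ** 2 for y in ys))
        return cov / (sx * sy)

    # Toy numbers: model predictions vs. human quality judgements.
    predicted = [0.71, 0.42, 0.90, 0.55]
    human = [0.68, 0.35, 0.88, 0.60]
    print(round(pearson(predicted, human), 3))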

Translation Error Detection as Rationale Extraction

no code implementations Findings (ACL) 2022 Marina Fomicheva, Lucia Specia, Nikolaos Aletras

Recent Quality Estimation (QE) models based on multilingual pre-trained representations have achieved very competitive results when predicting the overall quality of translated sentences.

Sentence Translation

Continual Quality Estimation with Online Bayesian Meta-Learning

no code implementations ACL 2021 Abiola Obamuyide, Marina Fomicheva, Lucia Specia

Most current quality estimation (QE) models for machine translation are trained and evaluated in a static setting where training and test data are assumed to be from a fixed distribution.

Machine Translation Meta-Learning +1

Knowledge Distillation for Quality Estimation

1 code implementation Findings (ACL) 2021 Amit Gajbhiye, Marina Fomicheva, Fernando Alva-Manchego, Frédéric Blain, Abiola Obamuyide, Nikolaos Aletras, Lucia Specia

Quality Estimation (QE) is the task of automatically predicting Machine Translation quality in the absence of reference translations, making it applicable in real-time settings, such as translating online social media conversations.

Data Augmentation Knowledge Distillation +2
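
For a regression task like sentence-level QE, knowledge distillation generally means training a compact student to reproduce a large teacher's scores. The sketch below shows that generic objective (a weighted mix of teacher-matching and gold-label losses); it is an assumption-laden illustration, not necessarily the exact setup of this paper:

    import torch
    import torch.nn as nn

    mse = nn.MSELoss()
    alpha = 0.5  # hypothetical weight between teacher and gold supervision

    def distillation_loss(student_scores, teacher_scores, gold_scores):
        """Generic distillation objective for regression-style QE."""
        return alpha * mse(student_scores, teacher_scores) + \
               (1 - alpha) * mse(student_scores, gold_scores)

    # Toy tensors standing in for batch predictions and labels.
    student = torch.rand(8, requires_grad=True)
    teacher = torch.rand(8)
    gold = torch.rand(8)
    loss = distillation_loss(student, teacher, gold)
    loss.backward()
    print(float(loss))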

BERTGEN: Multi-task Generation through BERT

1 code implementation ACL 2021 Faidon Mitzalis, Ozan Caglayan, Pranava Madhyastha, Lucia Specia

We present BERTGEN, a novel generative, decoder-only model which extends BERT by fusing multimodal and multilingual pretrained models VL-BERT and M-BERT, respectively.

Image Captioning Multimodal Machine Translation +2

SentSim: Crosslingual Semantic Evaluation of Machine Translation

no code implementations NAACL 2021 Yurun Song, Junchen Zhao, Lucia Specia

Machine translation (MT) is currently evaluated in one of two ways: in a monolingual fashion, by comparing the system output with one or more human reference translations, or in a trained crosslingual fashion, by building a supervised model to predict quality scores from human-labeled data.

Machine Translation Semantic Similarity +3

Cross-Modal Generative Augmentation for Visual Question Answering

no code implementations 11 May 2021 Zixu Wang, Yishu Miao, Lucia Specia

Experiments on Visual Question Answering as downstream task demonstrate the effectiveness of the proposed generative model, which is able to improve strong UpDn-based models to achieve state-of-the-art performance.

Data Augmentation Question Answering +1

What Makes a Scientific Paper be Accepted for Publication?

no code implementations EMNLP (CINLP) 2021 Panagiotis Fytas, Georgios Rizos, Lucia Specia

Although peer review has been an essential component of academia since the 1600s, it has repeatedly been criticised for lack of transparency and consistency.

Backtranslation Feedback Improves User Confidence in MT, Not Quality

1 code implementation NAACL 2021 Vilém Zouhar, Michal Novák, Matúš Žilinec, Ondřej Bojar, Mateo Obregón, Robin L. Hill, Frédéric Blain, Marina Fomicheva, Lucia Specia, Lisa Yankovskaya

Translating text into a language unknown to the text's author, dubbed outbound translation, is a modern need for which the user experience has significant room for improvement, beyond the basic machine translation facility.

Machine Translation Translation

Visual Cues and Error Correction for Translation Robustness

1 code implementation Findings (EMNLP) 2021 Zhenhao Li, Marek Rei, Lucia Specia

Neural Machine Translation models are sensitive to noise in the input texts, such as misspelled words and ungrammatical constructions.

Machine Translation Translation

MultiSubs: A Large-scale Multimodal and Multilingual Dataset

1 code implementation LREC 2022 Josiah Wang, Pranava Madhyastha, Josiel Figueiredo, Chiraag Lala, Lucia Specia

The dataset will benefit research on visual grounding of words, especially in the context of free-form sentences, and can be obtained from https://doi.org/10.5281/zenodo.5034604 under a Creative Commons licence.

Multimodal Lexical Translation Multimodal Text Prediction +2

Exploring Supervised and Unsupervised Rewards in Machine Translation

1 code implementation EACL 2021 Julia Ive, Zixu Wang, Marina Fomicheva, Lucia Specia

Reinforcement Learning (RL) is a powerful framework to address the discrepancy between loss functions used during training and the final evaluation metrics to be used at test time.

Machine Translation Reinforcement Learning (RL) +2

Exploiting Multimodal Reinforcement Learning for Simultaneous Machine Translation

1 code implementation EACL 2021 Julia Ive, Andy Mingren Li, Yishu Miao, Ozan Caglayan, Pranava Madhyastha, Lucia Specia

This paper addresses the problem of simultaneous machine translation (SiMT) by exploring two main concepts: (a) adaptive policies to learn a good trade-off between high translation quality and low latency; and (b) visual information to support this process by providing additional (visual) contextual information which may be available before the textual input is produced.

Machine Translation reinforcement-learning +2
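
For contrast with the adaptive read/write policies explored above, the simplest SiMT policy is the fixed wait-k schedule: read k source tokens, then alternate writing and reading. A toy sketch, where translate_prefix is a hypothetical stand-in for an incremental decoder and the target is assumed to be as long as the source:

    # Toy wait-k simultaneous decoding loop (fixed, non-adaptive policy).
    # translate_prefix() is a hypothetical incremental decoder; for simplicity
    # the target is assumed to have the same length as the source.
    def wait_k_decode(source_tokens, k, translate_prefix):
        read, written = [], []
        for i, tok in enumerate(source_tokens, start=1):
            read.append(tok)                                      # READ action
            if i >= k:
                written.append(translate_prefix(read, written))   # WRITE action
        while len(written) < len(source_tokens):                  # flush after source ends
            written.append(translate_prefix(read, written))
        return written

    if __name__ == "__main__":
        dummy = lambda src, tgt: f"t{len(tgt)}"  # placeholder "decoder"
        print(wait_k_decode("a b c d e".split(), 3, dummy))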

Latent Variable Models for Visual Question Answering

no code implementations 16 Jan 2021 Zixu Wang, Yishu Miao, Lucia Specia

Current work on Visual Question Answering (VQA) explore deterministic approaches conditioned on various types of image and question features.

Benchmarking Question Answering +1

MSVD-Turkish: A Comprehensive Multimodal Dataset for Integrated Vision and Language Research in Turkish

no code implementations 13 Dec 2020 Begum Citamak, Ozan Caglayan, Menekse Kuyu, Erkut Erdem, Aykut Erdem, Pranava Madhyastha, Lucia Specia

We hope that the MSVD-Turkish dataset and the results reported in this work will lead to better video captioning and multimodal machine translation models for Turkish and other morphology rich and agglutinative languages.

Multimodal Machine Translation Sentence +3

An Exploratory Study on Multilingual Quality Estimation

no code implementations Asian Chapter of the Association for Computational Linguistics 2020 Shuo Sun, Marina Fomicheva, Frédéric Blain, Vishrav Chaudhary, Ahmed El-Kishky, Adithya Renduchintala, Francisco Guzmán, Lucia Specia

Predicting the quality of machine translation has traditionally been addressed with language-specific models, under the assumption that the quality label distribution or linguistic features exhibit traits that are not shared across languages.

Machine Translation Translation

Watch and Learn: Mapping Language and Noisy Real-world Videos with Self-supervision

1 code implementation 19 Nov 2020 Yujie Zhong, Linhai Xie, Sen Wang, Lucia Specia, Yishu Miao

In this paper, we teach machines to understand visuals and natural language by learning the mapping between sentences and noisy video snippets without explicit annotations.

Retrieval Self-Supervised Learning

FIND: Human-in-the-Loop Debugging Deep Text Classifiers

1 code implementation EMNLP 2020 Piyawat Lertvittayakumjorn, Lucia Specia, Francesca Toni

Since obtaining a perfect training dataset (i.e., a dataset which is considerably large, unbiased, and well-representative of unseen cases) is hardly possible, many real-world text classifiers are trained on the available, yet imperfect, datasets.

Simultaneous Machine Translation with Visual Context

1 code implementation EMNLP 2020 Ozan Caglayan, Julia Ive, Veneta Haralampieva, Pranava Madhyastha, Loïc Barrault, Lucia Specia

Simultaneous machine translation (SiMT) aims to translate a continuous input text stream into another language with the lowest latency and highest quality possible.

Machine Translation Translation

Are we Estimating or Guesstimating Translation Quality?

no code implementations ACL 2020 Shuo Sun, Francisco Guzmán, Lucia Specia

Recent advances in pre-trained multilingual language models lead to state-of-the-art results on the task of quality estimation (QE) for machine translation.

Machine Translation Translation

Multimodal Quality Estimation for Machine Translation

no code implementations ACL 2020 Shu Okabe, Frédéric Blain, Lucia Specia

We propose approaches to Quality Estimation (QE) for Machine Translation that explore both text and visual modalities for Multimodal QE.

Machine Translation Sentence +1

Exploring Model Consensus to Generate Translation Paraphrases

1 code implementation WS 2020 Zhenhao Li, Marina Fomicheva, Lucia Specia

This paper describes our submission to the 2020 Duolingo Shared Task on Simultaneous Translation And Paraphrase for Language Education (STAPLE).

Machine Translation Translation

Unsupervised Quality Estimation for Neural Machine Translation

3 code implementations 21 May 2020 Marina Fomicheva, Shuo Sun, Lisa Yankovskaya, Frédéric Blain, Francisco Guzmán, Mark Fishel, Nikolaos Aletras, Vishrav Chaudhary, Lucia Specia

Quality Estimation (QE) is an important component in making Machine Translation (MT) useful in real-world applications, as it aims to inform the user of the quality of the MT output at test time.

Machine Translation Translation +1

ASSET: A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations

1 code implementation ACL 2020 Fernando Alva-Manchego, Louis Martin, Antoine Bordes, Carolina Scarton, Benoît Sagot, Lucia Specia

Furthermore, we motivate the need for developing better methods for automatic evaluation using ASSET, since we show that current popular metrics may not be suitable when multiple simplification transformations are performed.

Sentence

Data-Driven Sentence Simplification: Survey and Benchmark

no code implementations CL 2020 Fernando Alva-Manchego, Carolina Scarton, Lucia Specia

Sentence Simplification (SS) aims to modify a sentence in order to make it easier to read and understand.

Sentence

Multimodal Machine Translation through Visuals and Speech

no code implementations 28 Nov 2019 Umut Sulubacak, Ozan Caglayan, Stig-Arne Grönroos, Aku Rouhe, Desmond Elliott, Lucia Specia, Jörg Tiedemann

Multimodal machine translation involves drawing information from more than one modality, based on the assumption that the additional modalities will contain useful alternative views of the input data.

Image Captioning Multimodal Machine Translation +4

Improving Neural Machine Translation Robustness via Data Augmentation: Beyond Back-Translation

1 code implementation WS 2019 Zhenhao Li, Lucia Specia

Neural Machine Translation (NMT) models have been proved strong when translating clean texts, but they are very sensitive to noise in the input.

Data Augmentation Domain Adaptation +3

Deep Copycat Networks for Text-to-Text Generation

1 code implementation IJCNLP 2019 Julia Ive, Pranava Madhyastha, Lucia Specia

Most text-to-text generation tasks, for example text summarisation and text simplification, require copying words from the input to the output.

Automatic Post-Editing Text Generation +2

Transformer-based Cascaded Multimodal Speech Translation

no code implementations EMNLP (IWSLT) 2019 Zixiu Wu, Ozan Caglayan, Julia Ive, Josiah Wang, Lucia Specia

Upon conducting extensive experiments, we found that (i) the explored visual integration schemes often harm the translation performance for the transformer and additive deliberation, but considerably improve the cascade deliberation; (ii) the transformer and cascade deliberation integrate the visual modality better than the additive deliberation, as shown by the incongruence analysis.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Imperial College London Submission to VATEX Video Captioning Task

no code implementations 16 Oct 2019 Ozan Caglayan, Zixiu Wu, Pranava Madhyastha, Josiah Wang, Lucia Specia

This paper describes the Imperial College London team's submission to the 2019 VATEX video captioning challenge, where we first explore two sequence-to-sequence models, namely a recurrent (GRU) model and a transformer model, which generate captions from the I3D action features.

Video Captioning

Improving Neural Machine Translation Robustness via Data Augmentation: Beyond Back Translation

no code implementations 7 Oct 2019 Zhenhao Li, Lucia Specia

Neural Machine Translation (NMT) models have been proved strong when translating clean texts, but they are very sensitive to noise in the input.

Data Augmentation Domain Adaptation +3

Taking MT Evaluation Metrics to Extremes: Beyond Correlation with Human Judgments

no code implementations CL 2019 Marina Fomicheva, Lucia Specia

Much work has been dedicated to the improvement of evaluation metrics to achieve a higher correlation with human judgments.

Machine Translation Translation

Phrase Localization Without Paired Training Examples

1 code implementation ICCV 2019 Josiah Wang, Lucia Specia

Localizing phrases in images is an important part of image understanding and can be useful in many applications that require mappings between textual and visual information.

Semantic Similarity Semantic Textual Similarity

EASSE: Easier Automatic Sentence Simplification Evaluation

1 code implementation IJCNLP 2019 Fernando Alva-Manchego, Louis Martin, Carolina Scarton, Lucia Specia

We introduce EASSE, a Python package aiming to facilitate and standardise automatic evaluation and comparison of Sentence Simplification (SS) systems.

Sentence
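
A typical use of the package is computing SARI for system simplifications against multiple references. The sketch below reflects my understanding of the EASSE API; the import path, function name and argument names are assumptions that may differ across package versions:

    # Hedged sketch of EASSE usage for SARI (pip install easse); the function
    # name and signature are assumptions about the package API.
    from easse.sari import corpus_sari

    orig_sents = ["The feline perched upon the mat."]
    sys_sents = ["The cat sat on the mat."]
    # One inner list per reference set, aligned with the system sentences.
    refs_sents = [["The cat sat on the mat."], ["A cat sat on the mat."]]

    print(corpus_sari(orig_sents=orig_sents, sys_sents=sys_sents, refs_sents=refs_sents))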

Predicting Actions to Help Predict Translations

no code implementations 5 Aug 2019 Zixiu Wu, Julia Ive, Josiah Wang, Pranava Madhyastha, Lucia Specia

The question we ask ourselves is whether visual features can support the translation process. In particular, given that this is a dataset extracted from videos, we focus on the translation of actions, which we believe are poorly captured in the static image-text datasets currently used for multimodal translation.

Translation

VIFIDEL: Evaluating the Visual Fidelity of Image Descriptions

no code implementations ACL 2019 Pranava Madhyastha, Josiah Wang, Lucia Specia

It estimates the faithfulness of a generated caption with respect to the content of the actual image, based on the semantic similarity between labels of objects depicted in images and words in the description.

Semantic Similarity Semantic Textual Similarity
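
As a rough illustration of that idea (not the VIFIDEL metric itself), a caption can be scored by the average best-match similarity between embeddings of detected object labels and caption words; embed() below is a hypothetical word-embedding lookup, shown here with tiny made-up vectors:

    import math

    def cosine(u, v):
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return sum(a * b for a, b in zip(u, v)) / (nu * nv) if nu and nv else 0.0

    def fidelity(object_labels, caption_tokens, embed):
        """Average, over detected labels, of the best match among caption words."""
        best_matches = [
            max(cosine(embed(label), embed(tok)) for tok in caption_tokens)
            for label in object_labels
        ]
        return sum(best_matches) / len(best_matches)

    if __name__ == "__main__":
        toy = {"dog": [1.0, 0.0], "puppy": [0.9, 0.1], "park": [0.0, 1.0]}
        embed = lambda w: toy.get(w, [0.0, 0.0])
        print(fidelity(["dog", "park"], "a puppy in the park".split(), embed))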

Distilling Translations with Visual Awareness

1 code implementation ACL 2019 Julia Ive, Pranava Madhyastha, Lucia Specia

Previous work on multimodal machine translation has shown that visual information is only needed in very specific cases, for example in the presence of ambiguous words where the textual context is not sufficient.

Ranked #3 on Multimodal Machine Translation on Multi30K (Meteor (EN-FR) metric)

Multimodal Machine Translation Translation

Grounded Word Sense Translation

no code implementations WS 2019 Chiraag Lala, Pranava Madhyastha, Lucia Specia

Recent work on visually grounded language learning has focused on broader applications of grounded representations, such as visual question answering and multimodal machine translation.

Grounded language learning Multimodal Machine Translation +3

Probing the Need for Visual Context in Multimodal Machine Translation

no code implementations NAACL 2019 Ozan Caglayan, Pranava Madhyastha, Lucia Specia, Loïc Barrault

Current work on multimodal machine translation (MMT) has suggested that the visual modality is either unnecessary or only marginally beneficial.

Multimodal Machine Translation Translation

How2: A Large-scale Dataset for Multimodal Language Understanding

2 code implementations 1 Nov 2018 Ramon Sanabria, Ozan Caglayan, Shruti Palaskar, Desmond Elliott, Loïc Barrault, Lucia Specia, Florian Metze

In this paper, we introduce How2, a multimodal collection of instructional videos with English subtitles and crowdsourced Portuguese translations.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

End-to-end Image Captioning Exploits Distributional Similarity in Multimodal Space

1 code implementation WS 2018 Pranava Swaroop Madhyastha, Josiah Wang, Lucia Specia

We hypothesize that end-to-end neural image captioning systems work seemingly well because they exploit and learn 'distributional similarity' in a multimodal feature space, by mapping a test image to similar training images in this space and generating a caption from the same space.

Image Captioning Text Generation
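
A minimal sketch of what that hypothesis looks like when made explicit: caption a test image by retrieving the nearest training image in a shared feature space and reusing its caption. The three-dimensional features and captions below are made up; real systems operate on learned image features:

    import math

    def cosine(u, v):
        return sum(a * b for a, b in zip(u, v)) / (
            math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

    # Toy "training set" of (image feature vector, caption) pairs.
    train_set = [
        ([0.9, 0.1, 0.0], "a dog running on grass"),
        ([0.1, 0.8, 0.1], "a plate of pasta on a table"),
        ([0.0, 0.2, 0.9], "a red bus on a city street"),
    ]

    def caption_by_retrieval(test_features):
        _, best_caption = max(train_set, key=lambda pair: cosine(pair[0], test_features))
        return best_caption

    print(caption_by_retrieval([0.85, 0.2, 0.05]))  # -> "a dog running on grass"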

Assessing Crosslingual Discourse Relations in Machine Translation

1 code implementation 7 Oct 2018 Karin Sim Smith, Lucia Specia

In an attempt to improve overall translation quality, there has been an increasing focus on integrating more linguistic elements into Machine Translation (MT).

Machine Translation Translation

Findings of the Third Shared Task on Multimodal Machine Translation

1 code implementation WS 2018 Loïc Barrault, Fethi Bougares, Lucia Specia, Chiraag Lala, Desmond Elliott, Stella Frank

In this task a source sentence in English is supplemented by an image and participating systems are required to generate a translation for such a sentence into German, French or Czech.

Multimodal Machine Translation Sentence +1

Sheffield Submissions for WMT18 Multimodal Translation Shared Task

no code implementations WS 2018 Chiraag Lala, Pranava Swaroop Madhyastha, Carolina Scarton, Lucia Specia

For task 1b, we explore three approaches: (i) re-ranking based on cross-lingual word sense disambiguation (as for task 1), (ii) re-ranking based on consensus of NMT n-best lists from German-Czech, French-Czech and English-Czech systems, and (iii) data augmentation by generating English source data through machine translation from French to English and from German to English followed by hypothesis selection using a multimodal-reranker.

Data Augmentation Multimodal Machine Translation +4

Findings of the WMT 2018 Shared Task on Quality Estimation

no code implementations WS 2018 Lucia Specia, Frédéric Blain, Varvara Logacheva, Ramón Astudillo, André F. T. Martins

We report the results of the WMT18 shared task on Quality Estimation, i.e., the task of predicting the quality of the output of machine translation systems at various granularity levels: word, phrase, sentence and document.

Machine Translation Sentence +1

End-to-end Image Captioning Exploits Multimodal Distributional Similarity

no code implementations 11 Sep 2018 Pranava Madhyastha, Josiah Wang, Lucia Specia

We hypothesize that end-to-end neural image captioning systems work seemingly well because they exploit and learn 'distributional similarity' in a multimodal feature space by mapping a test image to similar training images in this space and generating a caption from the same space.

Image Captioning Text Generation

deepQuest: A Framework for Neural-based Quality Estimation

2 code implementations COLING 2018 Julia Ive, Frédéric Blain, Lucia Specia

Our approach is significantly faster and yields performance improvements for a range of document-level quality estimation tasks.

Feature Engineering Machine Translation +2

Learning Simplifications for Specific Target Audiences

no code implementations ACL 2018 Carolina Scarton, Lucia Specia

Text simplification (TS) is a monolingual text-to-text transformation task where an original (complex) text is transformed into a target (simpler) text.

Lexical Simplification Machine Translation +4

Vis-Eval Metric Viewer: A Visualisation Tool for Inspecting and Evaluating Metric Scores of Machine Translation Output

no code implementations NAACL 2018 David Steele, Lucia Specia

Machine Translation systems are usually evaluated and compared using automated evaluation metrics such as BLEU and METEOR to score the generated translations against human translations.

Machine Translation Sentence +1
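
One common way to compute the underlying BLEU scores that such a tool visualises (not necessarily what Vis-Eval uses internally) is sacrebleu; a small example with made-up sentences, noting that METEOR would require a separate package:

    # Corpus-level BLEU with sacrebleu (pip install sacrebleu).
    import sacrebleu

    hypotheses = ["the cat sat on the mat", "there is a dog in the garden"]
    # One inner list per reference stream, aligned with the hypotheses.
    references = [["the cat sat on the mat", "a dog is in the garden"]]

    bleu = sacrebleu.corpus_bleu(hypotheses, references)
    print(bleu.score)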

Defoiling Foiled Image Captions

1 code implementation NAACL 2018 Pranava Madhyastha, Josiah Wang, Lucia Specia

We address the task of detecting foiled image captions, i.e., identifying whether a caption contains a word that has been deliberately replaced by a semantically similar word, thus rendering it inaccurate with respect to the image being described.

Descriptive Image Captioning +1

Object Counts! Bringing Explicit Detections Back into Image Captioning

no code implementations NAACL 2018 Josiah Wang, Pranava Madhyastha, Lucia Specia

The use of explicit object detectors as an intermediate step to image captioning - which used to constitute an essential stage in early work - is often bypassed in the currently dominant end-to-end approaches, where the language model is conditioned directly on a mid-level image embedding.

Image Captioning Language Modelling +1

What is image captioning made of?

1 code implementation ICLR 2018 Pranava Madhyastha, Josiah Wang, Lucia Specia

We hypothesize that end-to-end neural image captioning systems work seemingly well because they exploit and learn ‘distributional similarity’ in a multimodal feature space, by mapping a test image to similar training images in this space and generating a caption from the same space.

Image Captioning Text Generation

Learning How to Simplify From Explicit Labeling of Complex-Simplified Text Pairs

1 code implementation IJCNLP 2017 Fernando Alva-Manchego, Joachim Bingel, Gustavo Paetzold, Carolina Scarton, Lucia Specia

Current research in text simplification has been hampered by two central problems: (i) the small amount of high-quality parallel simplification data available, and (ii) the lack of explicit annotations of simplification operations, such as deletions or substitutions, on existing data.

Machine Translation Sentence +2

The Ultimate Presentation Makeup Tutorial: How to Polish your Posters, Slides and Presentations Skills

no code implementations IJCNLP 2017 Gustavo Paetzold, Lucia Specia

There is no question that our research community has been, and still is, producing a vast amount of interesting strategies, models and tools for a wide array of problems and challenges in diverse areas of knowledge.

MASSAlign: Alignment and Annotation of Comparable Documents

no code implementations IJCNLP 2017 Gustavo Paetzold, Fernando Alva-Manchego, Lucia Specia

We introduce MASSAlign: a Python library for the alignment and annotation of monolingual comparable documents.

Sentence

MUSST: A Multilingual Syntactic Simplification Tool

no code implementations IJCNLP 2017 Carolina Scarton, Alessio Palmero Aprosio, Sara Tonelli, Tamara Martín Wanton, Lucia Specia

Our implementation includes a set of general-purpose simplification rules, as well as a sentence selection module (to select sentences to be simplified) and a confidence model (to select only promising simplifications).

Lexical Simplification Sentence +1

Lexical Simplification with Neural Ranking

no code implementations EACL 2017 Gustavo Paetzold, Lucia Specia

We present a new Lexical Simplification approach that exploits Neural Networks to learn substitutions from the Newsela corpus - a large set of professionally produced simplifications.

Complex Word Identification Information Retrieval +3

Anita: An Intelligent Text Adaptation Tool

no code implementations COLING 2016 Gustavo Paetzold, Lucia Specia

We introduce Anita: a flexible and intelligent Text Adaptation tool for web content that provides Text Simplification and Text Enhancement modules.

Lexical Simplification Text Simplification

Collecting and Exploring Everyday Language for Predicting Psycholinguistic Properties of Words

no code implementations COLING 2016 Gustavo Paetzold, Lucia Specia

Exploring language usage through frequency analysis in large corpora is a defining feature in most recent work in corpus and computational linguistics.

Text Simplification

Personalized Machine Translation: Preserving Original Author Traits

no code implementations EACL 2017 Ella Rabinovich, Shachar Mirkin, Raj Nath Patel, Lucia Specia, Shuly Wintner

The language that we produce reflects our personality, and various personal and demographic characteristics can be detected in natural language texts.

Domain Adaptation Machine Translation +1

Exploring Prediction Uncertainty in Machine Translation Quality Estimation

no code implementations CONLL 2016 Daniel Beck, Lucia Specia, Trevor Cohn

Machine Translation Quality Estimation is a notoriously difficult task, which lessens its usefulness in real-world translation environments.

Machine Translation Translation

Cohere: A Toolkit for Local Coherence

1 code implementation LREC 2016 Karin Sim Smith, Wilker Aziz, Lucia Specia

We describe COHERE, our coherence toolkit which incorporates various complementary models for capturing and measuring different aspects of text coherence.

Phrase Level Segmentation and Labelling of Machine Translation Errors

no code implementations LREC 2016 Frédéric Blain, Varvara Logacheva, Lucia Specia

This paper presents our work towards a novel approach for Quality Estimation (QE) of machine translation based on sequences of adjacent words, the so-called phrases.

Machine Translation Sentence +1

Benchmarking Lexical Simplification Systems

no code implementations LREC 2016 Gustavo Paetzold, Lucia Specia

Lexical Simplification is the task of replacing complex words in a text with simpler alternatives.

Benchmarking Lexical Simplification
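
A deliberately naive baseline of the kind such a benchmark would compare against: replace any rare word that has known synonyms with its most frequent (and so presumably simplest) synonym. The frequency list and synonym dictionary below are tiny made-up stand-ins for real resources:

    # Naive frequency-based lexical simplification baseline (toy resources only).
    word_freq = {"big": 900, "large": 700, "enormous": 60, "gigantic": 40,
                 "use": 950, "utilise": 30}
    synonyms = {"gigantic": ["big", "large", "enormous"], "utilise": ["use"]}
    COMPLEXITY_CUTOFF = 100  # words rarer than this count as "complex"

    def simplify(sentence: str) -> str:
        out = []
        for word in sentence.split():
            if word_freq.get(word, 0) < COMPLEXITY_CUTOFF and word in synonyms:
                # pick the most frequent substitution candidate
                out.append(max(synonyms[word], key=lambda w: word_freq.get(w, 0)))
            else:
                out.append(word)
        return " ".join(out)

    print(simplify("they utilise a gigantic telescope"))  # -> they use a big telescope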

The USFD Spoken Language Translation System for IWSLT 2014

no code implementations 13 Sep 2015 Raymond W. M. Ng, Mortaza Doulaty, Rama Doddipatla, Wilker Aziz, Kashif Shah, Oscar Saz, Madina Hasan, Ghada Alharbi, Lucia Specia, Thomas Hain

The USFD primary system incorporates state-of-the-art ASR and MT techniques and gives BLEU scores of 23.45 and 14.75 on the English-to-French and English-to-German speech-to-text translation tasks with the IWSLT 2014 data.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

A Quality-based Active Sample Selection Strategy for Statistical Machine Translation

no code implementations LREC 2014 Varvara Logacheva, Lucia Specia

Our approach is based on a quality estimation technique which involves a wider range of features of the source text, automatic translation, and machine translation system compared to previous work.

Active Learning Machine Translation +3

An efficient and user-friendly tool for machine translation quality estimation

no code implementations LREC 2014 Kashif Shah, Marco Turchi, Lucia Specia

We present a new version of QUEST ― an open source framework for machine translation quality estimation ― which brings a number of improvements: (i) it provides a Web interface and functionalities such that non-expert users, e.g. translators or lay-users of machine translations, can get quality predictions (or internal features of the framework) for translations without having to install the toolkit, obtain resources or build prediction models; (ii) it significantly improves over the previous runtime performance by keeping resources (such as language models) in memory; (iii) it provides an option for users to submit the source text only and automatically obtain translations from Bing Translator; (iv) it provides a ranking of multiple translations submitted by users for each source text according to their estimated quality.

Machine Translation Translation

PET: a Tool for Post-editing and Assessing Machine Translation

no code implementations LREC 2012 Wilker Aziz, Sheila Castilho, Lucia Specia

Given the significant improvements in Machine Translation (MT) quality and the increasing demand for translations, post-editing of automatic translations is becoming a popular practice in the translation industry.

Machine Translation Sentence +1
