Search Results for author: Lucia Specia

Found 199 papers, 41 papers with code

Bayesian Model-Agnostic Meta-Learning with Matrix-Valued Kernels for Quality Estimation

no code implementations ACL (RepL4NLP) 2021 Abiola Obamuyide, Marina Fomicheva, Lucia Specia

To address these challenges, we propose a Bayesian meta-learning approach for adapting QE models to the needs and preferences of each user with limited supervision.

Machine Translation Meta-Learning +1

The (Un)Suitability of Automatic Evaluation Metrics for Text Simplification

1 code implementation CL (ACL) 2021 Fernando Alva-Manchego, Carolina Scarton, Lucia Specia

Second, we conduct the first meta-evaluation of automatic metrics in Text Simplification, using our new data set (and other existing data) to analyze the variation of the correlation between metrics’ scores and human judgments across three dimensions: the perceived simplicity level, the system type, and the set of references used for computation.

Sentence Text Simplification

Towards a Better Understanding of Noise in Natural Language Processing

no code implementations RANLP 2021 Khetam Al Sharou, Zhenhao Li, Lucia Specia

In this paper, we propose a definition and taxonomy of various types of non-standard textual content – generally referred to as “noise” – in Natural Language Processing (NLP).

Multimodal Simultaneous Machine Translation

no code implementations MMTLRL (RANLP) 2021 Lucia Specia

Simultaneous machine translation (SiMT) aims to translate a continuous input text stream into another language with the lowest latency and highest quality possible.

Machine Translation Translation

Quality In, Quality Out: Learning from Actual Mistakes

no code implementations EAMT 2020 Frederic Blain, Nikolaos Aletras, Lucia Specia

However, QE models are often trained on noisy approximations of quality annotations derived from the proportion of post-edited words in translated sentences instead of direct human annotations of translation errors.

Machine Translation Sentence +2

The IWSLT 2019 Evaluation Campaign

no code implementations EMNLP (IWSLT) 2019 Jan Niehues, Rolando Cattoni, Sebastian Stüker, Matteo Negri, Marco Turchi, Thanh-Le Ha, Elizabeth Salesky, Ramon Sanabria, Loic Barrault, Lucia Specia, Marcello Federico

The IWSLT 2019 evaluation campaign featured three tasks: speech translation of (i) TED talks and (ii) How2 instructional videos from English into German and Portuguese, and (iii) text translation of TED talks from English into Czech.

Translation

Findings of the WMT 2020 Shared Task on Quality Estimation

no code implementations WMT (EMNLP) 2020 Lucia Specia, Frédéric Blain, Marina Fomicheva, Erick Fonseca, Vishrav Chaudhary, Francisco Guzmán, André F. T. Martins

We report the results of the WMT20 shared task on Quality Estimation, where the challenge is to predict the quality of the output of neural machine translation systems at the word, sentence and document levels.

Machine Translation Sentence +1

BERGAMOT-LATTE Submissions for the WMT20 Quality Estimation Shared Task

no code implementations WMT (EMNLP) 2020 Marina Fomicheva, Shuo Sun, Lisa Yankovskaya, Frédéric Blain, Vishrav Chaudhary, Mark Fishel, Francisco Guzmán, Lucia Specia

We explore (a) a black-box approach to QE based on pre-trained representations; and (b) glass-box approaches that leverage various indicators that can be extracted from the neural MT systems.

Sentence Task 2

Leveraging Pre-trained Language Models for Gender Debiasing

no code implementations LREC 2022 Nishtha Jain, Declan Groves, Lucia Specia, Maja Popović

This work explores a light-weight method to generate gender variants for a given text using pre-trained language models as the resource, without any task-specific labelled data.

Text Generation

Multilingual and Multimodal Learning for Brazilian Portuguese

no code implementations LREC 2022 Júlia Sato, Helena Caseli, Lucia Specia

The good BLEU and METEOR values obtained for this new language pair, relative to the original English-German VTLM, establish the suitability of the model for other languages.

Language Modelling Sentence +1

A Taxonomy and Study of Critical Errors in Machine Translation

no code implementations EAMT 2022 Khetam Al Sharou, Lucia Specia

We also study the impact of the source text on generating critical errors in the translation and, based on this, propose a set of recommendations on aspects of the MT that need further scrutiny, especially for user-generated content, to avoid generating such errors, and hence improve online communication.

Machine Translation Translation

Bias Mitigation in Machine Translation Quality Estimation

1 code implementation ACL 2022 Hanna Behnke, Marina Fomicheva, Lucia Specia

Machine Translation Quality Estimation (QE) aims to build predictive models to assess the quality of machine-generated translations in the absence of reference translations.

Binary Classification Machine Translation +1

Validating Quality Estimation in a Computer-Aided Translation Workflow: Speed, Cost and Quality Trade-off

no code implementations MTSummit 2021 Fernando Alva-Manchego, Lucia Specia, Sara Szoc, Tom Vanallemeersch, Heidi Depraetere

In this scenario, a Quality Estimation (QE) tool can be used to score MT outputs, and a threshold on the QE scores can be applied to decide whether an MT output can be used as-is or requires human post-edition.

Machine Translation Translation
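
The decision step described above is, in essence, a thresholding rule over sentence-level QE scores. A minimal sketch of that routing logic in Python, where the scorer, the score range and the threshold value are assumptions rather than settings taken from the paper:

    # Minimal sketch of QE-based routing in a CAT workflow (illustrative only).
    # Assumptions: qe_score() returns a sentence-level quality score in [0, 1],
    # higher meaning better, and THRESHOLD is tuned on held-out post-editing data.
    THRESHOLD = 0.8  # hypothetical operating point

    def route(source: str, mt_output: str, qe_score) -> str:
        """Decide whether an MT output can be used as-is or needs post-editing."""
        score = qe_score(source, mt_output)
        return "use-as-is" if score >= THRESHOLD else "post-edit"

    if __name__ == "__main__":
        dummy_qe = lambda src, hyp: 0.65  # stand-in for a real QE model
        print(route("Hello world", "Bonjour le monde", dummy_qe))  # -> post-edit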

ICL’s Submission to the WMT21 Critical Error Detection Shared Task

no code implementations WMT (EMNLP) 2021 Genze Jiang, Zhenhao Li, Lucia Specia

This paper presents Imperial College London’s submissions to the WMT21 Quality Estimation (QE) Shared Task 3: Critical Error Detection.

Feature Engineering

Findings of the WMT 2021 Shared Task on Quality Estimation

no code implementations WMT (EMNLP) 2021 Lucia Specia, Frédéric Blain, Marina Fomicheva, Chrysoula Zerva, Zhenhao Li, Vishrav Chaudhary, André F. T. Martins

We report the results of the WMT 2021 shared task on Quality Estimation, where the challenge is to predict the quality of the output of neural machine translation systems at the word and sentence levels.

Machine Translation Sentence +1

From Understanding to Utilization: A Survey on Explainability for Large Language Models

no code implementations 23 Jan 2024 Haoyan Luo, Lucia Specia

This survey underscores the imperative for increased explainability in LLMs, delving into both the research on explainability and the various methodologies and tasks that utilize an understanding of these models.

Model Editing

Reducing Hallucinations in Neural Machine Translation with Feature Attribution

no code implementations 17 Nov 2022 Joël Tang, Marina Fomicheva, Lucia Specia

We present a case study focusing on model understanding and regularisation to reduce hallucinations in NMT.

Machine Translation NMT +2

Scene Text Recognition with Semantics

no code implementations 19 Oct 2022 Joshua Cesare Placidi, Yishu Miao, Zixu Wang, Lucia Specia

Scene Text Recognition (STR) models have achieved high performance in recent years on benchmark datasets where text images are presented with minimal noise.

Scene Text Recognition

Contrastive Video-Language Learning with Fine-grained Frame Sampling

no code implementations 10 Oct 2022 Zixu Wang, Yujie Zhong, Yishu Miao, Lin Ma, Lucia Specia

However, even in paired video-text segments, only a subset of the frames is semantically relevant to the corresponding text, with the remainder representing noise, and the ratio of noisy frames is higher for longer videos.

Question Answering Representation Learning +3

Burst2Vec: An Adversarial Multi-Task Approach for Predicting Emotion, Age, and Origin from Vocal Bursts

1 code implementation 24 Jun 2022 Atijit Anuchitanukul, Lucia Specia

We present Burst2Vec, our multi-task learning approach to predict emotion, age, and origin (i.e., native country/language) from vocal bursts.

Multi-Task Learning

Logically Consistent Adversarial Attacks for Soft Theorem Provers

1 code implementation 29 Apr 2022 Alexander Gaskell, Yishu Miao, Lucia Specia, Francesca Toni

We propose a novel, generative adversarial framework for probing and improving these models' reasoning capabilities.

Automated Theorem Proving

Supervised Visual Attention for Simultaneous Multimodal Machine Translation

no code implementations 23 Jan 2022 Veneta Haralampieva, Ozan Caglayan, Lucia Specia

A particular use for such multimodal systems is the task of simultaneous machine translation, where visual context has been shown to complement the partial information provided by the source sentence, especially in the early phases of translation.

Multimodal Machine Translation Sentence +1

Revisiting Contextual Toxicity Detection in Conversations

no code implementations 24 Nov 2021 Atijit Anuchitanukul, Julia Ive, Lucia Specia

We then propose to bring these findings into computational detection models by introducing and evaluating (a) neural architectures for contextual toxicity detection that are aware of the conversational structure, and (b) data augmentation strategies that can help model contextual toxicity detection.

Data Augmentation Toxic Comment Classification

Guiding Visual Question Generation

no code implementations NAACL 2022 Nihir Vedd, Zixu Wang, Marek Rei, Yishu Miao, Lucia Specia

In traditional Visual Question Generation (VQG), most images have multiple concepts (e.g., objects and categories) for which a question could be generated, but models are trained to mimic an arbitrary choice of concept as given in their training data.

Question Generation Question-Generation +2

Pushing the Right Buttons: Adversarial Evaluation of Quality Estimation

1 code implementation WMT (EMNLP) 2021 Diptesh Kanojia, Marina Fomicheva, Tharindu Ranasinghe, Frédéric Blain, Constantin Orăsan, Lucia Specia

However, this ability is yet to be tested in the current evaluation practices, where QE systems are assessed only in terms of their correlation with human judgements.

Machine Translation Translation

Classification-based Quality Estimation: Small and Efficient Models for Real-world Applications

no code implementations EMNLP 2021 Shuo Sun, Ahmed El-Kishky, Vishrav Chaudhary, James Cross, Francisco Guzmán, Lucia Specia

Sentence-level Quality estimation (QE) of machine translation is traditionally formulated as a regression task, and the performance of QE models is typically measured by Pearson correlation with human labels.

Machine Translation Model Compression +3
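
Since sentence-level QE is conventionally scored by Pearson correlation with human labels, here is a small self-contained example of that evaluation step; the score lists are made-up numbers, not data from the paper:

    import math

    def pearson(xs, ys):
        """Pearson correlation between predicted QE scores and human labels."""
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
        sy = math.sqrt(sum((y - my) ** 2 for y in ys))
        return cov / (sx * sy)

    # Toy numbers: model predictions vs. human quality judgements.
    predicted = [0.71, 0.42, 0.90, 0.55]
    human = [0.68, 0.35, 0.88, 0.60]
    print(round(pearson(predicted, human), 3))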

Translation Error Detection as Rationale Extraction

no code implementations Findings (ACL) 2022 Marina Fomicheva, Lucia Specia, Nikolaos Aletras

Recent Quality Estimation (QE) models based on multilingual pre-trained representations have achieved very competitive results when predicting the overall quality of translated sentences.

Sentence Translation

Continual Quality Estimation with Online Bayesian Meta-Learning

no code implementations ACL 2021 Abiola Obamuyide, Marina Fomicheva, Lucia Specia

Most current quality estimation (QE) models for machine translation are trained and evaluated in a static setting where training and test data are assumed to be from a fixed distribution.

Machine Translation Meta-Learning +1

Knowledge Distillation for Quality Estimation

1 code implementation Findings (ACL) 2021 Amit Gajbhiye, Marina Fomicheva, Fernando Alva-Manchego, Frédéric Blain, Abiola Obamuyide, Nikolaos Aletras, Lucia Specia

Quality Estimation (QE) is the task of automatically predicting Machine Translation quality in the absence of reference translations, making it applicable in real-time settings, such as translating online social media conversations.

Data Augmentation Knowledge Distillation +2
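
For a regression task like sentence-level QE, knowledge distillation generally means training a compact student to reproduce a large teacher's scores. The sketch below shows that generic objective (a weighted mix of teacher-matching and gold-label losses); it is an assumption-laden illustration, not necessarily the exact setup of this paper:

    import torch
    import torch.nn as nn

    mse = nn.MSELoss()
    alpha = 0.5  # hypothetical weight between teacher and gold supervision

    def distillation_loss(student_scores, teacher_scores, gold_scores):
        """Generic distillation objective for regression-style QE."""
        return alpha * mse(student_scores, teacher_scores) + \
               (1 - alpha) * mse(student_scores, gold_scores)

    # Toy tensors standing in for batch predictions and labels.
    student = torch.rand(8, requires_grad=True)
    teacher = torch.rand(8)
    gold = torch.rand(8)
    loss = distillation_loss(student, teacher, gold)
    loss.backward()
    print(float(loss))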

BERTGEN: Multi-task Generation through BERT

1 code implementation ACL 2021 Faidon Mitzalis, Ozan Caglayan, Pranava Madhyastha, Lucia Specia

We present BERTGEN, a novel generative, decoder-only model which extends BERT by fusing multimodal and multilingual pretrained models VL-BERT and M-BERT, respectively.

Image Captioning Multimodal Machine Translation +2

SentSim: Crosslingual Semantic Evaluation of Machine Translation

no code implementations NAACL 2021 Yurun Song, Junchen Zhao, Lucia Specia

Machine translation (MT) is currently evaluated in one of two ways: in a monolingual fashion, by comparing the system output with one or more human reference translations, or in a trained crosslingual fashion, by building a supervised model to predict quality scores from human-labeled data.

Machine Translation Semantic Similarity +3

Cross-Modal Generative Augmentation for Visual Question Answering

no code implementations 11 May 2021 Zixu Wang, Yishu Miao, Lucia Specia

Experiments on Visual Question Answering as downstream task demonstrate the effectiveness of the proposed generative model, which is able to improve strong UpDn-based models to achieve state-of-the-art performance.

Data Augmentation Question Answering +1

What Makes a Scientific Paper be Accepted for Publication?

no code implementations EMNLP (CINLP) 2021 Panagiotis Fytas, Georgios Rizos, Lucia Specia

Although peer review has been an essential component of academia since the 1600s, it has repeatedly been criticised for lack of transparency and consistency.

Backtranslation Feedback Improves User Confidence in MT, Not Quality

1 code implementation NAACL 2021 Vilém Zouhar, Michal Novák, Matúš Žilinec, Ondřej Bojar, Mateo Obregón, Robin L. Hill, Frédéric Blain, Marina Fomicheva, Lucia Specia, Lisa Yankovskaya

Translating text into a language unknown to the text's author, dubbed outbound translation, is a modern need for which the user experience has significant room for improvement, beyond the basic machine translation facility.

Machine Translation Translation

Visual Cues and Error Correction for Translation Robustness

1 code implementation Findings (EMNLP) 2021 Zhenhao Li, Marek Rei, Lucia Specia

Neural Machine Translation models are sensitive to noise in the input texts, such as misspelled words and ungrammatical constructions.

Machine Translation Translation

MultiSubs: A Large-scale Multimodal and Multilingual Dataset

1 code implementation LREC 2022 Josiah Wang, Pranava Madhyastha, Josiel Figueiredo, Chiraag Lala, Lucia Specia

The dataset will benefit research on visual grounding of words, especially in the context of free-form sentences, and can be obtained from https://doi.org/10.5281/zenodo.5034604 under a Creative Commons licence.

Multimodal Lexical Translation Multimodal Text Prediction +2

Exploring Supervised and Unsupervised Rewards in Machine Translation

1 code implementation EACL 2021 Julia Ive, Zixu Wang, Marina Fomicheva, Lucia Specia

Reinforcement Learning (RL) is a powerful framework to address the discrepancy between loss functions used during training and the final evaluation metrics to be used at test time.

Machine Translation Reinforcement Learning (RL) +2

Exploiting Multimodal Reinforcement Learning for Simultaneous Machine Translation

1 code implementation EACL 2021 Julia Ive, Andy Mingren Li, Yishu Miao, Ozan Caglayan, Pranava Madhyastha, Lucia Specia

This paper addresses the problem of simultaneous machine translation (SiMT) by exploring two main concepts: (a) adaptive policies to learn a good trade-off between high translation quality and low latency; and (b) visual information to support this process by providing additional (visual) contextual information which may be available before the textual input is produced.

Machine Translation reinforcement-learning +2
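
For contrast with the adaptive read/write policies explored above, the simplest SiMT policy is the fixed wait-k schedule: read k source tokens, then alternate writing and reading. A toy sketch, where translate_prefix is a hypothetical stand-in for an incremental decoder and the target is assumed to be as long as the source:

    # Toy wait-k simultaneous decoding loop (fixed, non-adaptive policy).
    # translate_prefix() is a hypothetical incremental decoder; for simplicity
    # the target is assumed to have the same length as the source.
    def wait_k_decode(source_tokens, k, translate_prefix):
        read, written = [], []
        for i, tok in enumerate(source_tokens, start=1):
            read.append(tok)                                      # READ action
            if i >= k:
                written.append(translate_prefix(read, written))   # WRITE action
        while len(written) < len(source_tokens):                  # flush after source ends
            written.append(translate_prefix(read, written))
        return written

    if __name__ == "__main__":
        dummy = lambda src, tgt: f"t{len(tgt)}"  # placeholder "decoder"
        print(wait_k_decode("a b c d e".split(), 3, dummy))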

Latent Variable Models for Visual Question Answering

no code implementations 16 Jan 2021 Zixu Wang, Yishu Miao, Lucia Specia

Current work on Visual Question Answering (VQA) explore deterministic approaches conditioned on various types of image and question features.

Benchmarking Question Answering +1

MSVD-Turkish: A Comprehensive Multimodal Dataset for Integrated Vision and Language Research in Turkish

no code implementations 13 Dec 2020 Begum Citamak, Ozan Caglayan, Menekse Kuyu, Erkut Erdem, Aykut Erdem, Pranava Madhyastha, Lucia Specia

We hope that the MSVD-Turkish dataset and the results reported in this work will lead to better video captioning and multimodal machine translation models for Turkish and other morphology rich and agglutinative languages.

Multimodal Machine Translation Sentence +3

An Exploratory Study on Multilingual Quality Estimation

no code implementations Asian Chapter of the Association for Computational Linguistics 2020 Shuo Sun, Marina Fomicheva, Frédéric Blain, Vishrav Chaudhary, Ahmed El-Kishky, Adithya Renduchintala, Francisco Guzmán, Lucia Specia

Predicting the quality of machine translation has traditionally been addressed with language-specific models, under the assumption that the quality label distribution or linguistic features exhibit traits that are not shared across languages.

Machine Translation Translation

Watch and Learn: Mapping Language and Noisy Real-world Videos with Self-supervision

1 code implementation 19 Nov 2020 Yujie Zhong, Linhai Xie, Sen Wang, Lucia Specia, Yishu Miao

In this paper, we teach machines to understand visuals and natural language by learning the mapping between sentences and noisy video snippets without explicit annotations.

Retrieval Self-Supervised Learning

FIND: Human-in-the-Loop Debugging Deep Text Classifiers

1 code implementation EMNLP 2020 Piyawat Lertvittayakumjorn, Lucia Specia, Francesca Toni

Since obtaining a perfect training dataset (i.e., a dataset which is considerably large, unbiased, and well-representative of unseen cases) is hardly possible, many real-world text classifiers are trained on the available, yet imperfect, datasets.

Simultaneous Machine Translation with Visual Context

1 code implementation EMNLP 2020 Ozan Caglayan, Julia Ive, Veneta Haralampieva, Pranava Madhyastha, Loïc Barrault, Lucia Specia

Simultaneous machine translation (SiMT) aims to translate a continuous input text stream into another language with the lowest latency and highest quality possible.

Machine Translation Translation

Are we Estimating or Guesstimating Translation Quality?

no code implementations ACL 2020 Shuo Sun, Francisco Guzmán, Lucia Specia

Recent advances in pre-trained multilingual language models lead to state-of-the-art results on the task of quality estimation (QE) for machine translation.

Machine Translation Translation

Multimodal Quality Estimation for Machine Translation

no code implementations ACL 2020 Shu Okabe, Frédéric Blain, Lucia Specia

We propose approaches to Quality Estimation (QE) for Machine Translation that explore both text and visual modalities for Multimodal QE.

Machine Translation Sentence +1

Exploring Model Consensus to Generate Translation Paraphrases

1 code implementation WS 2020 Zhenhao Li, Marina Fomicheva, Lucia Specia

This paper describes our submission to the 2020 Duolingo Shared Task on Simultaneous Translation And Paraphrase for Language Education (STAPLE).

Machine Translation Translation

Unsupervised Quality Estimation for Neural Machine Translation

3 code implementations 21 May 2020 Marina Fomicheva, Shuo Sun, Lisa Yankovskaya, Frédéric Blain, Francisco Guzmán, Mark Fishel, Nikolaos Aletras, Vishrav Chaudhary, Lucia Specia

Quality Estimation (QE) is an important component in making Machine Translation (MT) useful in real-world applications, as it aims to inform the user of the quality of the MT output at test time.

Machine Translation Translation +1

ASSET: A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations

1 code implementation ACL 2020 Fernando Alva-Manchego, Louis Martin, Antoine Bordes, Carolina Scarton, Benoît Sagot, Lucia Specia

Furthermore, we motivate the need for developing better methods for automatic evaluation using ASSET, since we show that current popular metrics may not be suitable when multiple simplification transformations are performed.

Sentence

Data-Driven Sentence Simplification: Survey and Benchmark

no code implementations CL 2020 Fernando Alva-Manchego, Carolina Scarton, Lucia Specia

Sentence Simplification (SS) aims to modify a sentence in order to make it easier to read and understand.

Sentence

Multimodal Machine Translation through Visuals and Speech

no code implementations 28 Nov 2019 Umut Sulubacak, Ozan Caglayan, Stig-Arne Grönroos, Aku Rouhe, Desmond Elliott, Lucia Specia, Jörg Tiedemann

Multimodal machine translation involves drawing information from more than one modality, based on the assumption that the additional modalities will contain useful alternative views of the input data.

Image Captioning Multimodal Machine Translation +4

Improving Neural Machine Translation Robustness via Data Augmentation: Beyond Back-Translation

1 code implementation WS 2019 Zhenhao Li, Lucia Specia

Neural Machine Translation (NMT) models have been proved strong when translating clean texts, but they are very sensitive to noise in the input.

Data Augmentation Domain Adaptation +3

Deep Copycat Networks for Text-to-Text Generation

1 code implementation IJCNLP 2019 Julia Ive, Pranava Madhyastha, Lucia Specia

Most text-to-text generation tasks, for example text summarisation and text simplification, require copying words from the input to the output.

Automatic Post-Editing Text Generation +2

Transformer-based Cascaded Multimodal Speech Translation

no code implementations EMNLP (IWSLT) 2019 Zixiu Wu, Ozan Caglayan, Julia Ive, Josiah Wang, Lucia Specia

Upon conducting extensive experiments, we found that (i) the explored visual integration schemes often harm the translation performance for the transformer and additive deliberation, but considerably improve the cascade deliberation; (ii) the transformer and cascade deliberation integrate the visual modality better than the additive deliberation, as shown by the incongruence analysis.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Imperial College London Submission to VATEX Video Captioning Task

no code implementations 16 Oct 2019 Ozan Caglayan, Zixiu Wu, Pranava Madhyastha, Josiah Wang, Lucia Specia

This paper describes the Imperial College London team's submission to the 2019 VATEX video captioning challenge, where we first explore two sequence-to-sequence models, namely a recurrent (GRU) model and a transformer model, which generate captions from the I3D action features.

Video Captioning

Improving Neural Machine Translation Robustness via Data Augmentation: Beyond Back Translation

no code implementations 7 Oct 2019 Zhenhao Li, Lucia Specia

Neural Machine Translation (NMT) models have been proved strong when translating clean texts, but they are very sensitive to noise in the input.

Data Augmentation Domain Adaptation +3

Taking MT Evaluation Metrics to Extremes: Beyond Correlation with Human Judgments

no code implementations CL 2019 Marina Fomicheva, Lucia Specia

Much work has been dedicated to the improvement of evaluation metrics to achieve a higher correlation with human judgments.

Machine Translation Translation

Phrase Localization Without Paired Training Examples

1 code implementation ICCV 2019 Josiah Wang, Lucia Specia

Localizing phrases in images is an important part of image understanding and can be useful in many applications that require mappings between textual and visual information.

Semantic Similarity Semantic Textual Similarity

EASSE: Easier Automatic Sentence Simplification Evaluation

1 code implementation IJCNLP 2019 Fernando Alva-Manchego, Louis Martin, Carolina Scarton, Lucia Specia

We introduce EASSE, a Python package aiming to facilitate and standardise automatic evaluation and comparison of Sentence Simplification (SS) systems.

Sentence
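
A typical use of the package is computing SARI for system simplifications against multiple references. The sketch below reflects my understanding of the EASSE API; the import path, function name and argument names are assumptions that may differ across package versions:

    # Hedged sketch of EASSE usage for SARI (pip install easse); the function
    # name and signature are assumptions about the package API.
    from easse.sari import corpus_sari

    orig_sents = ["The feline perched upon the mat."]
    sys_sents = ["The cat sat on the mat."]
    # One inner list per reference set, aligned with the system sentences.
    refs_sents = [["The cat sat on the mat."], ["A cat sat on the mat."]]

    print(corpus_sari(orig_sents=orig_sents, sys_sents=sys_sents, refs_sents=refs_sents))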

Predicting Actions to Help Predict Translations

no code implementations 5 Aug 2019 Zixiu Wu, Julia Ive, Josiah Wang, Pranava Madhyastha, Lucia Specia

The question we ask ourselves is whether visual features can support the translation process. In particular, given that this is a dataset extracted from videos, we focus on the translation of actions, which we believe are poorly captured in the static image-text datasets currently used for multimodal translation.

Translation

VIFIDEL: Evaluating the Visual Fidelity of Image Descriptions

no code implementations ACL 2019 Pranava Madhyastha, Josiah Wang, Lucia Specia

It estimates the faithfulness of a generated caption with respect to the content of the actual image, based on the semantic similarity between labels of objects depicted in images and words in the description.

Semantic Similarity Semantic Textual Similarity
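
As a rough illustration of that idea (not the VIFIDEL metric itself), a caption can be scored by the average best-match similarity between embeddings of detected object labels and caption words; embed() below is a hypothetical word-embedding lookup, shown here with tiny made-up vectors:

    import math

    def cosine(u, v):
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return sum(a * b for a, b in zip(u, v)) / (nu * nv) if nu and nv else 0.0

    def fidelity(object_labels, caption_tokens, embed):
        """Average, over detected labels, of the best match among caption words."""
        best_matches = [
            max(cosine(embed(label), embed(tok)) for tok in caption_tokens)
            for label in object_labels
        ]
        return sum(best_matches) / len(best_matches)

    if __name__ == "__main__":
        toy = {"dog": [1.0, 0.0], "puppy": [0.9, 0.1], "park": [0.0, 1.0]}
        embed = lambda w: toy.get(w, [0.0, 0.0])
        print(fidelity(["dog", "park"], "a puppy in the park".split(), embed))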

Distilling Translations with Visual Awareness

1 code implementation ACL 2019 Julia Ive, Pranava Madhyastha, Lucia Specia

Previous work on multimodal machine translation has shown that visual information is only needed in very specific cases, for example in the presence of ambiguous words where the textual context is not sufficient.

Ranked #3 on Multimodal Machine Translation on Multi30K (Meteor (EN-FR) metric)

Multimodal Machine Translation Translation

Grounded Word Sense Translation

no code implementations WS 2019 Chiraag Lala, Pranava Madhyastha, Lucia Specia

Recent work on visually grounded language learning has focused on broader applications of grounded representations, such as visual question answering and multimodal machine translation.

Grounded language learning Multimodal Machine Translation +3

Probing the Need for Visual Context in Multimodal Machine Translation

no code implementations NAACL 2019 Ozan Caglayan, Pranava Madhyastha, Lucia Specia, Loïc Barrault

Current work on multimodal machine translation (MMT) has suggested that the visual modality is either unnecessary or only marginally beneficial.

Multimodal Machine Translation Translation

How2: A Large-scale Dataset for Multimodal Language Understanding

2 code implementations 1 Nov 2018 Ramon Sanabria, Ozan Caglayan, Shruti Palaskar, Desmond Elliott, Loïc Barrault, Lucia Specia, Florian Metze

In this paper, we introduce How2, a multimodal collection of instructional videos with English subtitles and crowdsourced Portuguese translations.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

End-to-end Image Captioning Exploits Distributional Similarity in Multimodal Space

1 code implementation WS 2018 Pranava Swaroop Madhyastha, Josiah Wang, Lucia Specia

We hypothesize that end-to-end neural image captioning systems work seemingly well because they exploit and learn 'distributional similarity' in a multimodal feature space, by mapping a test image to similar training images in this space and generating a caption from the same space.

Image Captioning Text Generation
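
A minimal sketch of what that hypothesis looks like when made explicit: caption a test image by retrieving the nearest training image in a shared feature space and reusing its caption. The three-dimensional features and captions below are made up; real systems operate on learned image features:

    import math

    def cosine(u, v):
        return sum(a * b for a, b in zip(u, v)) / (
            math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

    # Toy "training set" of (image feature vector, caption) pairs.
    train_set = [
        ([0.9, 0.1, 0.0], "a dog running on grass"),
        ([0.1, 0.8, 0.1], "a plate of pasta on a table"),
        ([0.0, 0.2, 0.9], "a red bus on a city street"),
    ]

    def caption_by_retrieval(test_features):
        _, best_caption = max(train_set, key=lambda pair: cosine(pair[0], test_features))
        return best_caption

    print(caption_by_retrieval([0.85, 0.2, 0.05]))  # -> "a dog running on grass"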

Assessing Crosslingual Discourse Relations in Machine Translation

1 code implementation 7 Oct 2018 Karin Sim Smith, Lucia Specia

In an attempt to improve overall translation quality, there has been an increasing focus on integrating more linguistic elements into Machine Translation (MT).

Machine Translation Translation

Findings of the Third Shared Task on Multimodal Machine Translation

1 code implementation WS 2018 Loïc Barrault, Fethi Bougares, Lucia Specia, Chiraag Lala, Desmond Elliott, Stella Frank

In this task a source sentence in English is supplemented by an image and participating systems are required to generate a translation for such a sentence into German, French or Czech.

Multimodal Machine Translation Sentence +1

Sheffield Submissions for WMT18 Multimodal Translation Shared Task

no code implementations WS 2018 Chiraag Lala, Pranava Swaroop Madhyastha, Carolina Scarton, Lucia Specia

For task 1b, we explore three approaches: (i) re-ranking based on cross-lingual word sense disambiguation (as for task 1), (ii) re-ranking based on consensus of NMT n-best lists from German-Czech, French-Czech and English-Czech systems, and (iii) data augmentation by generating English source data through machine translation from French to English and from German to English followed by hypothesis selection using a multimodal-reranker.

Data Augmentation Multimodal Machine Translation +4

Findings of the WMT 2018 Shared Task on Quality Estimation

no code implementations WS 2018 Lucia Specia, Frédéric Blain, Varvara Logacheva, Ramón Astudillo, André F. T. Martins

We report the results of the WMT18 shared task on Quality Estimation, i.e., the task of predicting the quality of the output of machine translation systems at various granularity levels: word, phrase, sentence and document.

Machine Translation Sentence +1

End-to-end Image Captioning Exploits Multimodal Distributional Similarity

no code implementations 11 Sep 2018 Pranava Madhyastha, Josiah Wang, Lucia Specia

We hypothesize that end-to-end neural image captioning systems work seemingly well because they exploit and learn 'distributional similarity' in a multimodal feature space by mapping a test image to similar training images in this space and generating a caption from the same space.

Image Captioning Text Generation

deepQuest: A Framework for Neural-based Quality Estimation

2 code implementations COLING 2018 Julia Ive, Frédéric Blain, Lucia Specia

Our approach is significantly faster and yields performance improvements for a range of document-level quality estimation tasks.

Feature Engineering Machine Translation +2

Learning Simplifications for Specific Target Audiences

no code implementations ACL 2018 Carolina Scarton, Lucia Specia

Text simplification (TS) is a monolingual text-to-text transformation task where an original (complex) text is transformed into a target (simpler) text.

Lexical Simplification Machine Translation +4

Vis-Eval Metric Viewer: A Visualisation Tool for Inspecting and Evaluating Metric Scores of Machine Translation Output

no code implementations NAACL 2018 David Steele, Lucia Specia

Machine Translation systems are usually evaluated and compared using automated evaluation metrics such as BLEU and METEOR to score the generated translations against human translations.

Machine Translation Sentence +1
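
One common way to compute the underlying BLEU scores that such a tool visualises (not necessarily what Vis-Eval uses internally) is sacrebleu; a small example with made-up sentences, noting that METEOR would require a separate package:

    # Corpus-level BLEU with sacrebleu (pip install sacrebleu).
    import sacrebleu

    hypotheses = ["the cat sat on the mat", "there is a dog in the garden"]
    # One inner list per reference stream, aligned with the hypotheses.
    references = [["the cat sat on the mat", "a dog is in the garden"]]

    bleu = sacrebleu.corpus_bleu(hypotheses, references)
    print(bleu.score)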

Defoiling Foiled Image Captions

1 code implementation NAACL 2018 Pranava Madhyastha, Josiah Wang, Lucia Specia

We address the task of detecting foiled image captions, i.e., identifying whether a caption contains a word that has been deliberately replaced by a semantically similar word, thus rendering it inaccurate with respect to the image being described.

Descriptive Image Captioning +1

Object Counts! Bringing Explicit Detections Back into Image Captioning

no code implementations NAACL 2018 Josiah Wang, Pranava Madhyastha, Lucia Specia

The use of explicit object detectors as an intermediate step to image captioning - which used to constitute an essential stage in early work - is often bypassed in the currently dominant end-to-end approaches, where the language model is conditioned directly on a mid-level image embedding.

Image Captioning Language Modelling +1

What is image captioning made of?

1 code implementation ICLR 2018 Pranava Madhyastha, Josiah Wang, Lucia Specia

We hypothesize that end-to-end neural image captioning systems work seemingly well because they exploit and learn ‘distributional similarity’ in a multimodal feature space, by mapping a test image to similar training images in this space and generating a caption from the same space.

Image Captioning Text Generation

Learning How to Simplify From Explicit Labeling of Complex-Simplified Text Pairs

1 code implementation IJCNLP 2017 Fernando Alva-Manchego, Joachim Bingel, Gustavo Paetzold, Carolina Scarton, Lucia Specia

Current research in text simplification has been hampered by two central problems: (i) the small amount of high-quality parallel simplification data available, and (ii) the lack of explicit annotations of simplification operations, such as deletions or substitutions, on existing data.

Machine Translation Sentence +2

The Ultimate Presentation Makeup Tutorial: How to Polish your Posters, Slides and Presentations Skills

no code implementations IJCNLP 2017 Gustavo Paetzold, Lucia Specia

There is no question that our research community has been, and still is, producing a vast amount of interesting strategies, models and tools for a wide array of problems and challenges in diverse areas of knowledge.

MASSAlign: Alignment and Annotation of Comparable Documents

no code implementations IJCNLP 2017 Gustavo Paetzold, Fernando Alva-Manchego, Lucia Specia

We introduce MASSAlign: a Python library for the alignment and annotation of monolingual comparable documents.

Sentence

MUSST: A Multilingual Syntactic Simplification Tool

no code implementations IJCNLP 2017 Carolina Scarton, Alessio Palmero Aprosio, Sara Tonelli, Tamara Martín Wanton, Lucia Specia

Our implementation includes a set of general-purpose simplification rules, as well as a sentence selection module (to select sentences to be simplified) and a confidence model (to select only promising simplifications).

Lexical Simplification Sentence +1

Lexical Simplification with Neural Ranking

no code implementations EACL 2017 Gustavo Paetzold, Lucia Specia

We present a new Lexical Simplification approach that exploits Neural Networks to learn substitutions from the Newsela corpus - a large set of professionally produced simplifications.

Complex Word Identification Information Retrieval +3

Anita: An Intelligent Text Adaptation Tool

no code implementations COLING 2016 Gustavo Paetzold, Lucia Specia

We introduce Anita: a flexible and intelligent Text Adaptation tool for web content that provides Text Simplification and Text Enhancement modules.

Lexical Simplification Text Simplification

Collecting and Exploring Everyday Language for Predicting Psycholinguistic Properties of Words

no code implementations COLING 2016 Gustavo Paetzold, Lucia Specia

Exploring language usage through frequency analysis in large corpora is a defining feature in most recent work in corpus and computational linguistics.

Text Simplification

Personalized Machine Translation: Preserving Original Author Traits

no code implementations EACL 2017 Ella Rabinovich, Shachar Mirkin, Raj Nath Patel, Lucia Specia, Shuly Wintner

The language that we produce reflects our personality, and various personal and demographic characteristics can be detected in natural language texts.

Domain Adaptation Machine Translation +1

Exploring Prediction Uncertainty in Machine Translation Quality Estimation

no code implementations CONLL 2016 Daniel Beck, Lucia Specia, Trevor Cohn

Machine Translation Quality Estimation is a notoriously difficult task, which lessens its usefulness in real-world translation environments.

Machine Translation Translation

Cohere: A Toolkit for Local Coherence

1 code implementation LREC 2016 Karin Sim Smith, Wilker Aziz, Lucia Specia

We describe COHERE, our coherence toolkit which incorporates various complementary models for capturing and measuring different aspects of text coherence.

Phrase Level Segmentation and Labelling of Machine Translation Errors

no code implementations LREC 2016 Frédéric Blain, Varvara Logacheva, Lucia Specia

This paper presents our work towards a novel approach for Quality Estimation (QE) of machine translation based on sequences of adjacent words, the so-called phrases.

Machine Translation Sentence +1

Benchmarking Lexical Simplification Systems

no code implementations LREC 2016 Gustavo Paetzold, Lucia Specia

Lexical Simplification is the task of replacing complex words in a text with simpler alternatives.

Benchmarking Lexical Simplification
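
A deliberately naive baseline of the kind such a benchmark would compare against: replace any rare word that has known synonyms with its most frequent (and so presumably simplest) synonym. The frequency list and synonym dictionary below are tiny made-up stand-ins for real resources:

    # Naive frequency-based lexical simplification baseline (toy resources only).
    word_freq = {"big": 900, "large": 700, "enormous": 60, "gigantic": 40,
                 "use": 950, "utilise": 30}
    synonyms = {"gigantic": ["big", "large", "enormous"], "utilise": ["use"]}
    COMPLEXITY_CUTOFF = 100  # words rarer than this count as "complex"

    def simplify(sentence: str) -> str:
        out = []
        for word in sentence.split():
            if word_freq.get(word, 0) < COMPLEXITY_CUTOFF and word in synonyms:
                # pick the most frequent substitution candidate
                out.append(max(synonyms[word], key=lambda w: word_freq.get(w, 0)))
            else:
                out.append(word)
        return " ".join(out)

    print(simplify("they utilise a gigantic telescope"))  # -> they use a big telescope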

The USFD Spoken Language Translation System for IWSLT 2014

no code implementations 13 Sep 2015 Raymond W. M. Ng, Mortaza Doulaty, Rama Doddipatla, Wilker Aziz, Kashif Shah, Oscar Saz, Madina Hasan, Ghada Alharbi, Lucia Specia, Thomas Hain

The USFD primary system incorporates state-of-the-art ASR and MT techniques and gives BLEU scores of 23.45 and 14.75 on the English-to-French and English-to-German speech-to-text translation tasks with the IWSLT 2014 data.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

A Quality-based Active Sample Selection Strategy for Statistical Machine Translation

no code implementations LREC 2014 Varvara Logacheva, Lucia Specia

Our approach is based on a quality estimation technique which involves a wider range of features of the source text, automatic translation, and machine translation system compared to previous work.

Active Learning Machine Translation +3

An efficient and user-friendly tool for machine translation quality estimation

no code implementations LREC 2014 Kashif Shah, Marco Turchi, Lucia Specia

We present a new version of QUEST ― an open source framework for machine translation quality estimation ― which brings a number of improvements: (i) it provides a Web interface and functionalities such that non-expert users, e.g. translators or lay-users of machine translations, can get quality predictions (or internal features of the framework) for translations without having to install the toolkit, obtain resources or build prediction models; (ii) it significantly improves over the previous runtime performance by keeping resources (such as language models) in memory; (iii) it provides an option for users to submit the source text only and automatically obtain translations from Bing Translator; (iv) it provides a ranking of multiple translations submitted by users for each source text according to their estimated quality.

Machine Translation Translation

PET: a Tool for Post-editing and Assessing Machine Translation

no code implementations LREC 2012 Wilker Aziz, Sheila Castilho, Lucia Specia

Given the significant improvements in Machine Translation (MT) quality and the increasing demand for translations, post-editing of automatic translations is becoming a popular practice in the translation industry.

Machine Translation Sentence +1
