Search Results for author: Lucia Specia

Found 179 papers, 36 papers with code

Validating Quality Estimation in a Computer-Aided Translation Workflow: Speed, Cost and Quality Trade-off

no code implementations MTSummit 2021 Fernando Alva-Manchego, Lucia Specia, Sara Szoc, Tom Vanallemeersch, Heidi Depraetere

In this scenario, a Quality Estimation (QE) tool can be used to score MT outputs, and a threshold on the QE scores can be applied to decide whether an MT output can be used as-is or requires human post-edition.

Machine Translation Translation

Towards a Better Understanding of Noise in Natural Language Processing

no code implementations RANLP 2021 Khetam Al Sharou, Zhenhao Li, Lucia Specia

In this paper, we propose a definition and taxonomy of various types of non-standard textual content – generally referred to as “noise” – in Natural Language Processing (NLP).

Multimodal Simultaneous Machine Translation

no code implementations MMTLRL (RANLP) 2021 Lucia Specia

Simultaneous machine translation (SiMT) aims to translate a continuous input text stream into another language with the lowest latency and highest quality possible.

Machine Translation Translation

BERGAMOT-LATTE Submissions for the WMT20 Quality Estimation Shared Task

no code implementations WMT (EMNLP) 2020 Marina Fomicheva, Shuo Sun, Lisa Yankovskaya, Frédéric Blain, Vishrav Chaudhary, Mark Fishel, Francisco Guzmán, Lucia Specia

We explore (a) a black-box approach to QE based on pre-trained representations; and (b) glass-box approaches that leverage various indicators that can be extracted from the neural MT systems.

Findings of the WMT 2020 Shared Task on Quality Estimation

no code implementations WMT (EMNLP) 2020 Lucia Specia, Frédéric Blain, Marina Fomicheva, Erick Fonseca, Vishrav Chaudhary, Francisco Guzmán, André F. T. Martins

We report the results of the WMT20 shared task on Quality Estimation, where the challenge is to predict the quality of the output of neural machine translation systems at the word, sentence and document levels.

Document-level Machine Translation +1

Bayesian Model-Agnostic Meta-Learning with Matrix-Valued Kernels for Quality Estimation

no code implementations ACL (RepL4NLP) 2021 Abiola Obamuyide, Marina Fomicheva, Lucia Specia

To address these challenges, we propose a Bayesian meta-learning approach for adapting QE models to the needs and preferences of each user with limited supervision.

Machine Translation Meta-Learning +1

Quality In, Quality Out: Learning from Actual Mistakes

no code implementations EAMT 2020 Frederic Blain, Nikolaos Aletras, Lucia Specia

However, QE models are often trained on noisy approximations of quality annotations derived from the proportion of post-edited words in translated sentences instead of direct human annotations of translation errors.

Machine Translation Transfer Learning +1

Revisiting Contextual Toxicity Detection in Conversations

no code implementations 24 Nov 2021 Julia Ive, Atijit Anuchitanukul, Lucia Specia

We then propose to bring these findings into computational detection models by introducing (a) neural architectures for contextual toxicity detection that are aware of the conversational structure, and (b) data augmentation strategies that can help model contextual toxicity detection.

Data Augmentation

Guiding Visual Question Generation

no code implementations 15 Oct 2021 Nihir Vedd, Zixu Wang, Marek Rei, Yishu Miao, Lucia Specia

In traditional Visual Question Generation (VQG), most images have multiple concepts (e.g. objects and categories) for which a question could be generated, but models are trained to mimic an arbitrary choice of concept as given in their training data.

Question Generation Visual Question Answering

Pushing the Right Buttons: Adversarial Evaluation of Quality Estimation

1 code implementation 22 Sep 2021 Diptesh Kanojia, Marina Fomicheva, Tharindu Ranasinghe, Frédéric Blain, Constantin Orăsan, Lucia Specia

However, this ability is yet to be tested in the current evaluation practices, where QE systems are assessed only in terms of their correlation with human judgements.

Machine Translation Translation

Classification-based Quality Estimation: Small and Efficient Models for Real-world Applications

no code implementations EMNLP 2021 Shuo Sun, Ahmed El-Kishky, Vishrav Chaudhary, James Cross, Francisco Guzmán, Lucia Specia

Sentence-level Quality estimation (QE) of machine translation is traditionally formulated as a regression task, and the performance of QE models is typically measured by Pearson correlation with human labels.

Machine Translation Model Compression +1

Translation Error Detection as Rationale Extraction

no code implementations 27 Aug 2021 Marina Fomicheva, Lucia Specia, Nikolaos Aletras

Recent Quality Estimation (QE) models based on multilingual pre-trained representations have achieved very competitive results when predicting the overall quality of translated sentences.

Translation

Continual Quality Estimation with Online Bayesian Meta-Learning

no code implementations ACL 2021 Abiola Obamuyide, Marina Fomicheva, Lucia Specia

Most current quality estimation (QE) models for machine translation are trained and evaluated in a static setting where training and test data are assumed to be from a fixed distribution.

Machine Translation Meta-Learning +1

Knowledge Distillation for Quality Estimation

1 code implementation Findings (ACL) 2021 Amit Gajbhiye, Marina Fomicheva, Fernando Alva-Manchego, Frédéric Blain, Abiola Obamuyide, Nikolaos Aletras, Lucia Specia

Quality Estimation (QE) is the task of automatically predicting Machine Translation quality in the absence of reference translations, making it applicable in real-time settings, such as translating online social media conversations.

Data Augmentation Knowledge Distillation +2

BERTGEN: Multi-task Generation through BERT

1 code implementation ACL 2021 Faidon Mitzalis, Ozan Caglayan, Pranava Madhyastha, Lucia Specia

We present BERTGEN, a novel generative, decoder-only model which extends BERT by fusing multimodal and multilingual pretrained models VL-BERT and M-BERT, respectively.

Image Captioning Multimodal Machine Translation +2

SentSim: Crosslingual Semantic Evaluation of Machine Translation

no code implementations NAACL 2021 Yurun Song, Junchen Zhao, Lucia Specia

Machine translation (MT) is currently evaluated in one of two ways: in a monolingual fashion, by comparing the system output with one or more human reference translations, or in a trained crosslingual fashion, by building a supervised model to predict quality scores from human-labeled data.

Machine Translation Semantic Similarity +2

Cross-Modal Generative Augmentation for Visual Question Answering

no code implementations 11 May 2021 Zixu Wang, Yishu Miao, Lucia Specia

Experiments on Visual Question Answering as downstream task demonstrate the effectiveness of the proposed generative model, which is able to improve strong UpDn-based models to achieve state-of-the-art performance.

Data Augmentation Question Answering +1

What Makes a Scientific Paper be Accepted for Publication?

no code implementations EMNLP (CINLP) 2021 Panagiotis Fytas, Georgios Rizos, Lucia Specia

Despite peer-reviewing being an essential component of academia since the 1600s, it has repeatedly received criticisms for lack of transparency and consistency.

Backtranslation Feedback Improves User Confidence in MT, Not Quality

1 code implementation NAACL 2021 Vilém Zouhar, Michal Novák, Matúš Žilinec, Ondřej Bojar, Mateo Obregón, Robin L. Hill, Frédéric Blain, Marina Fomicheva, Lucia Specia, Lisa Yankovskaya

Translating text into a language unknown to the text's author, dubbed outbound translation, is a modern need for which the user experience has significant room for improvement, beyond the basic machine translation facility.

Machine Translation Translation

Visual Cues and Error Correction for Translation Robustness

1 code implementation Findings (EMNLP) 2021 Zhenhao Li, Marek Rei, Lucia Specia

Neural Machine Translation models are sensitive to noise in the input texts, such as misspelled words and ungrammatical constructions.

Machine Translation Translation

MultiSubs: A Large-scale Multimodal and Multilingual Dataset

1 code implementation 2 Mar 2021 Josiah Wang, Pranava Madhyastha, Josiel Figueiredo, Chiraag Lala, Lucia Specia

The dataset will benefit research on visual grounding of words especially in the context of free-form sentences, and can be obtained from https://doi.org/10.5281/zenodo.5034604 under a Creative Commons licence.

Multimodal Lexical Translation Multimodal Text Prediction +1

Exploiting Multimodal Reinforcement Learning for Simultaneous Machine Translation

1 code implementation EACL 2021 Julia Ive, Andy Mingren Li, Yishu Miao, Ozan Caglayan, Pranava Madhyastha, Lucia Specia

This paper addresses the problem of simultaneous machine translation (SiMT) by exploring two main concepts: (a) adaptive policies to learn a good trade-off between high translation quality and low latency; and (b) visual information to support this process by providing additional (visual) contextual information which may be available before the textual input is produced.

Machine Translation Translation

Exploring Supervised and Unsupervised Rewards in Machine Translation

1 code implementation EACL 2021 Julia Ive, Zixu Wang, Marina Fomicheva, Lucia Specia

Reinforcement Learning (RL) is a powerful framework to address the discrepancy between loss functions used during training and the final evaluation metrics to be used at test time.

Machine Translation Text Generation +1

Latent Variable Models for Visual Question Answering

no code implementations 16 Jan 2021 Zixu Wang, Yishu Miao, Lucia Specia

Current work on Visual Question Answering (VQA) explores deterministic approaches conditioned on various types of image and question features.

Latent Variable Models Question Answering +1

MSVD-Turkish: A Comprehensive Multimodal Dataset for Integrated Vision and Language Research in Turkish

no code implementations 13 Dec 2020 Begum Citamak, Ozan Caglayan, Menekse Kuyu, Erkut Erdem, Aykut Erdem, Pranava Madhyastha, Lucia Specia

We hope that the MSVD-Turkish dataset and the results reported in this work will lead to better video captioning and multimodal machine translation models for Turkish and other morphology rich and agglutinative languages.

Multimodal Machine Translation Translation +2

An Exploratory Study on Multilingual Quality Estimation

no code implementations Asian Chapter of the Association for Computational Linguistics 2020 Shuo Sun, Marina Fomicheva, Frédéric Blain, Vishrav Chaudhary, Ahmed El-Kishky, Adithya Renduchintala, Francisco Guzmán, Lucia Specia

Predicting the quality of machine translation has traditionally been addressed with language-specific models, under the assumption that the quality label distribution or linguistic features exhibit traits that are not shared across languages.

Machine Translation Translation

Watch and Learn: Mapping Language and Noisy Real-world Videos with Self-supervision

1 code implementation 19 Nov 2020 Yujie Zhong, Linhai Xie, Sen Wang, Lucia Specia, Yishu Miao

In this paper, we teach machines to understand visuals and natural language by learning the mapping between sentences and noisy video snippets without explicit annotations.

Self-Supervised Learning

FIND: Human-in-the-Loop Debugging Deep Text Classifiers

1 code implementation EMNLP 2020 Piyawat Lertvittayakumjorn, Lucia Specia, Francesca Toni

Since obtaining a perfect training dataset (i.e., a dataset which is considerably large, unbiased, and well-representative of unseen cases) is hardly possible, many real-world text classifiers are trained on the available, yet imperfect, datasets.

Simultaneous Machine Translation with Visual Context

1 code implementation EMNLP 2020 Ozan Caglayan, Julia Ive, Veneta Haralampieva, Pranava Madhyastha, Loïc Barrault, Lucia Specia

Simultaneous machine translation (SiMT) aims to translate a continuous input text stream into another language with the lowest latency and highest quality possible.

Machine Translation Translation

Exploring Model Consensus to Generate Translation Paraphrases

1 code implementation WS 2020 Zhenhao Li, Marina Fomicheva, Lucia Specia

This paper describes our submission to the 2020 Duolingo Shared Task on Simultaneous Translation And Paraphrase for Language Education (STAPLE).

Machine Translation Translation

Multimodal Quality Estimation for Machine Translation

no code implementations ACL 2020 Shu Okabe, Frédéric Blain, Lucia Specia

We propose approaches to Quality Estimation (QE) for Machine Translation that explore both text and visual modalities for Multimodal QE.

Document-level Machine Translation +1

Are we Estimating or Guesstimating Translation Quality?

no code implementations ACL 2020 Shuo Sun, Francisco Guzmán, Lucia Specia

Recent advances in pre-trained multilingual language models lead to state-of-the-art results on the task of quality estimation (QE) for machine translation.

Machine Translation Translation

Unsupervised Quality Estimation for Neural Machine Translation

3 code implementations 21 May 2020 Marina Fomicheva, Shuo Sun, Lisa Yankovskaya, Frédéric Blain, Francisco Guzmán, Mark Fishel, Nikolaos Aletras, Vishrav Chaudhary, Lucia Specia

Quality Estimation (QE) is an important component in making Machine Translation (MT) useful in real-world applications, as it aims to inform the user of the quality of the MT output at test time.

Machine Translation Translation

ASSET: A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations

1 code implementation ACL 2020 Fernando Alva-Manchego, Louis Martin, Antoine Bordes, Carolina Scarton, Benoît Sagot, Lucia Specia

Furthermore, we motivate the need for developing better methods for automatic evaluation using ASSET, since we show that current popular metrics may not be suitable when multiple simplification transformations are performed.

Data-Driven Sentence Simplification: Survey and Benchmark

no code implementations CL 2020 Fernando Alva-Manchego, Carolina Scarton, Lucia Specia

Sentence Simplification (SS) aims to modify a sentence in order to make it easier to read and understand.

Multimodal Machine Translation through Visuals and Speech

no code implementations 28 Nov 2019 Umut Sulubacak, Ozan Caglayan, Stig-Arne Grönroos, Aku Rouhe, Desmond Elliott, Lucia Specia, Jörg Tiedemann

Multimodal machine translation involves drawing information from more than one modality, based on the assumption that the additional modalities will contain useful alternative views of the input data.

Image Captioning Multimodal Machine Translation +3

Improving Neural Machine Translation Robustness via Data Augmentation: Beyond Back-Translation

1 code implementation WS 2019 Zhenhao Li, Lucia Specia

Neural Machine Translation (NMT) models have proven strong when translating clean texts, but they are very sensitive to noise in the input.

Data Augmentation Domain Adaptation +2

Deep Copycat Networks for Text-to-Text Generation

1 code implementation IJCNLP 2019 Julia Ive, Pranava Madhyastha, Lucia Specia

Most text-to-text generation tasks, for example text summarisation and text simplification, require copying words from the input to the output.

Automatic Post-Editing Text Generation +2

Transformer-based Cascaded Multimodal Speech Translation

no code implementations 29 Oct 2019 Zixiu Wu, Ozan Caglayan, Julia Ive, Josiah Wang, Lucia Specia

Upon conducting extensive experiments, we found that (i) the explored visual integration schemes often harm the translation performance for the transformer and additive deliberation, but considerably improve the cascade deliberation; (ii) the transformer and cascade deliberation integrate the visual modality better than the additive deliberation, as shown by the incongruence analysis.

automatic-speech-recognition Multimodal Machine Translation +2

Imperial College London Submission to VATEX Video Captioning Task

no code implementations 16 Oct 2019 Ozan Caglayan, Zixiu Wu, Pranava Madhyastha, Josiah Wang, Lucia Specia

This paper describes the Imperial College London team's submission to the 2019 VATEX video captioning challenge, where we first explore two sequence-to-sequence models, namely a recurrent (GRU) model and a transformer model, which generate captions from the I3D action features.

Video Captioning

Estimating post-editing effort: a study on human judgements, task-based and reference-based metrics of MT quality

1 code implementation 14 Oct 2019 Carolina Scarton, Mikel L. Forcada, Miquel Esplà-Gomis, Lucia Specia

To that end, we report experiments on a dataset with newly-collected post-editing indicators and show their usefulness when estimating post-editing effort.

Machine Translation Translation

Improving Neural Machine Translation Robustness via Data Augmentation: Beyond Back Translation

no code implementations 7 Oct 2019 Zhenhao Li, Lucia Specia

Neural Machine Translation (NMT) models have proven strong when translating clean texts, but they are very sensitive to noise in the input.

Data Augmentation Domain Adaptation +2

Taking MT Evaluation Metrics to Extremes: Beyond Correlation with Human Judgments

no code implementations CL 2019 Marina Fomicheva, Lucia Specia

Much work has been dedicated to the improvement of evaluation metrics to achieve a higher correlation with human judgments.

Machine Translation Translation

Phrase Localization Without Paired Training Examples

1 code implementation ICCV 2019 Josiah Wang, Lucia Specia

Localizing phrases in images is an important part of image understanding and can be useful in many applications that require mappings between textual and visual information.

Semantic Similarity Semantic Textual Similarity

EASSE: Easier Automatic Sentence Simplification Evaluation

1 code implementation IJCNLP 2019 Fernando Alva-Manchego, Louis Martin, Carolina Scarton, Lucia Specia

We introduce EASSE, a Python package aiming to facilitate and standardise automatic evaluation and comparison of Sentence Simplification (SS) systems.

Predicting Actions to Help Predict Translations

no code implementations 5 Aug 2019 Zixiu Wu, Julia Ive, Josiah Wang, Pranava Madhyastha, Lucia Specia

The question we ask is whether visual features can support the translation process. In particular, given that this dataset is extracted from videos, we focus on the translation of actions, which we believe are poorly captured in the static image-text datasets currently used for multimodal translation.

Translation

VIFIDEL: Evaluating the Visual Fidelity of Image Descriptions

no code implementations ACL 2019 Pranava Madhyastha, Josiah Wang, Lucia Specia

It estimates the faithfulness of a generated caption with respect to the content of the actual image, based on the semantic similarity between labels of objects depicted in images and words in the description.

Semantic Similarity Semantic Textual Similarity

Distilling Translations with Visual Awareness

1 code implementation ACL 2019 Julia Ive, Pranava Madhyastha, Lucia Specia

Previous work on multimodal machine translation has shown that visual information is only needed in very specific cases, for example in the presence of ambiguous words where the textual context is not sufficient.

Ranked #2 on Multimodal Machine Translation on Multi30K (Meteor (EN-FR) metric)

Multimodal Machine Translation Translation

Grounded Word Sense Translation

no code implementations WS 2019 Chiraag Lala, Pranava Madhyastha, Lucia Specia

Recent work on visually grounded language learning has focused on broader applications of grounded representations, such as visual question answering and multimodal machine translation.

Grounded language learning Multimodal Machine Translation +3

Probing the Need for Visual Context in Multimodal Machine Translation

no code implementations NAACL 2019 Ozan Caglayan, Pranava Madhyastha, Lucia Specia, Loïc Barrault

Current work on multimodal machine translation (MMT) has suggested that the visual modality is either unnecessary or only marginally beneficial.

Multimodal Machine Translation Translation

How2: A Large-scale Dataset for Multimodal Language Understanding

1 code implementation 1 Nov 2018 Ramon Sanabria, Ozan Caglayan, Shruti Palaskar, Desmond Elliott, Loïc Barrault, Lucia Specia, Florian Metze

In this paper, we introduce How2, a multimodal collection of instructional videos with English subtitles and crowdsourced Portuguese translations.

automatic-speech-recognition Language understanding +3

End-to-end Image Captioning Exploits Distributional Similarity in Multimodal Space

1 code implementation WS 2018 Pranava Swaroop Madhyastha, Josiah Wang, Lucia Specia

We hypothesize that end-to-end neural image captioning systems work seemingly well because they exploit and learn ‘distributional similarity’ in a multimodal feature space, by mapping a test image to similar training images in this space and generating a caption from the same space.

Image Captioning Text Generation

Assessing Crosslingual Discourse Relations in Machine Translation

1 code implementation 7 Oct 2018 Karin Sim Smith, Lucia Specia

In an attempt to improve overall translation quality, there has been an increasing focus on integrating more linguistic elements into Machine Translation (MT).

Machine Translation Translation

Findings of the WMT 2018 Shared Task on Quality Estimation

no code implementations WS 2018 Lucia Specia, Frédéric Blain, Varvara Logacheva, Ramón Astudillo, André F. T. Martins

We report the results of the WMT18 shared task on Quality Estimation, i.e. the task of predicting the quality of the output of machine translation systems at various granularity levels: word, phrase, sentence and document.

Machine Translation Translation

Findings of the Third Shared Task on Multimodal Machine Translation

1 code implementation WS 2018 Loïc Barrault, Fethi Bougares, Lucia Specia, Chiraag Lala, Desmond Elliott, Stella Frank

In this task a source sentence in English is supplemented by an image and participating systems are required to generate a translation for such a sentence into German, French or Czech.

Multimodal Machine Translation Translation

Sheffield Submissions for WMT18 Multimodal Translation Shared Task

no code implementations WS 2018 Chiraag Lala, Pranava Swaroop Madhyastha, Carolina Scarton, Lucia Specia

For task 1b, we explore three approaches: (i) re-ranking based on cross-lingual word sense disambiguation (as for task 1), (ii) re-ranking based on consensus of NMT n-best lists from German-Czech, French-Czech and English-Czech systems, and (iii) data augmentation by generating English source data through machine translation from French to English and from German to English followed by hypothesis selection using a multimodal-reranker.

Data Augmentation Multimodal Machine Translation +3

End-to-end Image Captioning Exploits Multimodal Distributional Similarity

no code implementations 11 Sep 2018 Pranava Madhyastha, Josiah Wang, Lucia Specia

We hypothesize that end-to-end neural image captioning systems work seemingly well because they exploit and learn ‘distributional similarity’ in a multimodal feature space by mapping a test image to similar training images in this space and generating a caption from the same space.

Image Captioning Text Generation

deepQuest: A Framework for Neural-based Quality Estimation

2 code implementations COLING 2018 Julia Ive, Frédéric Blain, Lucia Specia

Our approach is significantly faster and yields performance improvements for a range of document-level quality estimation tasks.

Document-level Feature Engineering +2

Learning Simplifications for Specific Target Audiences

no code implementations ACL 2018 Carolina Scarton, Lucia Specia

Text simplification (TS) is a monolingual text-to-text transformation task where an original (complex) text is transformed into a target (simpler) text.

Lexical Simplification Machine Translation +3

Vis-Eval Metric Viewer: A Visualisation Tool for Inspecting and Evaluating Metric Scores of Machine Translation Output

no code implementations NAACL 2018 David Steele, Lucia Specia

Machine Translation systems are usually evaluated and compared using automated evaluation metrics such as BLEU and METEOR to score the generated translations against human translations.

Machine Translation Translation

Defoiling Foiled Image Captions

1 code implementation NAACL 2018 Pranava Madhyastha, Josiah Wang, Lucia Specia

We address the task of detecting foiled image captions, i.e. identifying whether a caption contains a word that has been deliberately replaced by a semantically similar word, thus rendering it inaccurate with respect to the image being described.

Image Captioning

Object Counts! Bringing Explicit Detections Back into Image Captioning

no code implementations NAACL 2018 Josiah Wang, Pranava Madhyastha, Lucia Specia

The use of explicit object detectors as an intermediate step to image captioning - which used to constitute an essential stage in early work - is often bypassed in the currently dominant end-to-end approaches, where the language model is conditioned directly on a mid-level image embedding.

Image Captioning Language Modelling

What is image captioning made of?

1 code implementation ICLR 2018 Pranava Madhyastha, Josiah Wang, Lucia Specia

We hypothesize that end-to-end neural image captioning systems work seemingly well because they exploit and learn ‘distributional similarity’ in a multimodal feature space, by mapping a test image to similar training images in this space and generating a caption from the same space.

Image Captioning Text Generation

Learning How to Simplify From Explicit Labeling of Complex-Simplified Text Pairs

1 code implementation IJCNLP 2017 Fernando Alva-Manchego, Joachim Bingel, Gustavo Paetzold, Carolina Scarton, Lucia Specia

Current research in text simplification has been hampered by two central problems: (i) the small amount of high-quality parallel simplification data available, and (ii) the lack of explicit annotations of simplification operations, such as deletions or substitutions, on existing data.

Machine Translation Sentence Compression +1

MUSST: A Multilingual Syntactic Simplification Tool

no code implementations IJCNLP 2017 Carolina Scarton, Alessio Palmero Aprosio, Sara Tonelli, Tamara Martín Wanton, Lucia Specia

Our implementation includes a set of general-purpose simplification rules, as well as a sentence selection module (to select sentences to be simplified) and a confidence model (to select only promising simplifications).

Lexical Simplification Text Simplification

The Ultimate Presentation Makeup Tutorial: How to Polish your Posters, Slides and Presentations Skills

no code implementations IJCNLP 2017 Gustavo Paetzold, Lucia Specia

There is no question that our research community has been, and still is, producing an enormous number of interesting strategies, models and tools for a wide array of problems and challenges in diverse areas of knowledge.

MASSAlign: Alignment and Annotation of Comparable Documents

no code implementations IJCNLP 2017 Gustavo Paetzold, Fernando Alva-Manchego, Lucia Specia

We introduce MASSAlign: a Python library for the alignment and annotation of monolingual comparable documents.

Lexical Simplification with Neural Ranking

no code implementations EACL 2017 Gustavo Paetzold, Lucia Specia

We present a new Lexical Simplification approach that exploits Neural Networks to learn substitutions from the Newsela corpus - a large set of professionally produced simplifications.

Complex Word Identification Information Retrieval +2

Collecting and Exploring Everyday Language for Predicting Psycholinguistic Properties of Words

no code implementations COLING 2016 Gustavo Paetzold, Lucia Specia

Exploring language usage through frequency analysis in large corpora is a defining feature in most recent work in corpus and computational linguistics.

Text Simplification

Anita: An Intelligent Text Adaptation Tool

no code implementations COLING 2016 Gustavo Paetzold, Lucia Specia

We introduce Anita: a flexible and intelligent Text Adaptation tool for web content that provides Text Simplification and Text Enhancement modules.

Lexical Simplification Text Simplification

Personalized Machine Translation: Preserving Original Author Traits

no code implementations EACL 2017 Ella Rabinovich, Shachar Mirkin, Raj Nath Patel, Lucia Specia, Shuly Wintner

The language that we produce reflects our personality, and various personal and demographic characteristics can be detected in natural language texts.

Domain Adaptation Machine Translation +1

Exploring Prediction Uncertainty in Machine Translation Quality Estimation

no code implementations CONLL 2016 Daniel Beck, Lucia Specia, Trevor Cohn

Machine Translation Quality Estimation is a notoriously difficult task, which lessens its usefulness in real-world translation environments.

Machine Translation Translation

Cohere: A Toolkit for Local Coherence

1 code implementation LREC 2016 Karin Sim Smith, Wilker Aziz, Lucia Specia

We describe COHERE, our coherence toolkit which incorporates various complementary models for capturing and measuring different aspects of text coherence.

Phrase Level Segmentation and Labelling of Machine Translation Errors

no code implementations LREC 2016 Frédéric Blain, Varvara Logacheva, Lucia Specia

This paper presents our work towards a novel approach for Quality Estimation (QE) of machine translation based on sequences of adjacent words, the so-called phrases.

Machine Translation Translation

Benchmarking Lexical Simplification Systems

no code implementations LREC 2016 Gustavo Paetzold, Lucia Specia

Lexical Simplification is the task of replacing complex words in a text with simpler alternatives.

Lexical Simplification

The USFD Spoken Language Translation System for IWSLT 2014

no code implementations 13 Sep 2015 Raymond W. M. Ng, Mortaza Doulaty, Rama Doddipatla, Wilker Aziz, Kashif Shah, Oscar Saz, Madina Hasan, Ghada Alharbi, Lucia Specia, Thomas Hain

The USFD primary system incorporates state-of-the-art ASR and MT techniques and gives BLEU scores of 23.45 and 14.75 on the English-to-French and English-to-German speech-to-text translation tasks, respectively, with the IWSLT 2014 data.

automatic-speech-recognition Machine Translation +3