no code implementations • EAMT 2022 • José G.C. de Souza, Ricardo Rei, Ana C. Farinha, Helena Moniz, André F. T. Martins
This paper presents QUARTZ, QUality-AwaRe machine Translation, a project led by Unbabel which aims at developing machine translation systems that are more robust and produce fewer critical errors.
no code implementations • WMT (EMNLP) 2021 • Chrysoula Zerva, Daan van Stigt, Ricardo Rei, Ana C Farinha, Pedro Ramos, José G. C. de Souza, Taisiya Glushkova, Miguel Vera, Fabio Kepler, André F. T. Martins
We present the joint contribution of IST and Unbabel to the WMT 2021 Shared Task on Quality Estimation.
no code implementations • WMT (EMNLP) 2021 • Lucia Specia, Frédéric Blain, Marina Fomicheva, Chrysoula Zerva, Zhenhao Li, Vishrav Chaudhary, André F. T. Martins
We report the results of the WMT 2021 shared task on Quality Estimation, where the challenge is to predict the quality of the output of neural machine translation systems at the word and sentence levels.
no code implementations • WMT (EMNLP) 2020 • João Moura, Miguel Vera, Daan van Stigt, Fabio Kepler, André F. T. Martins
We present the joint contribution of IST and Unbabel to the WMT 2020 Shared Task on Quality Estimation.
no code implementations • WMT (EMNLP) 2020 • Lucia Specia, Frédéric Blain, Marina Fomicheva, Erick Fonseca, Vishrav Chaudhary, Francisco Guzmán, André F. T. Martins
We report the results of the WMT20 shared task on Quality Estimation, where the challenge is to predict the quality of the output of neural machine translation systems at the word, sentence and document levels.
1 code implementation • WMT (EMNLP) 2020 • M. Amin Farajian, António V. Lopes, André F. T. Martins, Sameen Maruf, Gholamreza Haffari
We report the results of the first edition of the WMT shared task on chat translation.
no code implementations • EAMT 2020 • André F. T. Martins, Joao Graca, Paulo Dimas, Helena Moniz, Graham Neubig
This paper presents the Multilingual Artificial Intelligence Agent Assistant (MAIA), a project led by Unbabel with the collaboration of CMU, INESC-ID and IT Lisbon.
no code implementations • EAMT 2022 • André F. T. Martins
DeepSPIN is a research project funded by the European Research Council (ERC) whose goal is to develop new neural structured prediction methods, models, and algorithms for improving the quality, interpretability, and data-efficiency of natural language processing (NLP) systems, with special emphasis on machine translation and quality estimation applications.
no code implementations • EAMT 2020 • António Lopes, M. Amin Farajian, Rachel Bawden, Michael Zhang, André F. T. Martins
In this paper we provide a systematic comparison of existing and new document-level neural machine translation solutions.
1 code implementation • WMT (EMNLP) 2021 • Ricardo Rei, Ana C Farinha, Chrysoula Zerva, Daan van Stigt, Craig Stewart, Pedro Ramos, Taisiya Glushkova, André F. T. Martins, Alon Lavie
In this paper, we present the joint contribution of Unbabel and IST to the WMT 2021 Metrics Shared Task.
no code implementations • 24 May 2022 • Pedro Henrique Martins, Zita Marinho, André F. T. Martins
Experiments on machine translation in two settings, static domain adaptation and "on-the-fly" adaptation, show that the chunk-based $k$NN-MT model leads to a significant speed-up (up to 4 times) with only a small drop in translation quality.
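To make the source of the speed-up concrete, here is a minimal sketch of the chunk-based retrieval idea, assuming a datastore of cached decoder states mapped to multi-token target chunks; all names and the random data are hypothetical stand-ins, not the paper's implementation:

```python
import numpy as np

# Hypothetical datastore: cached decoder states (keys) mapped to
# multi-token target chunks (values). In the real model the keys come
# from a trained NMT decoder; random data is used here as a stand-in.
rng = np.random.default_rng(0)
KEYS = rng.normal(size=(1000, 64))               # cached decoder states
CHUNKS = rng.integers(0, 32000, size=(1000, 4))  # 4-token target chunks

def retrieve_chunks(query, k=8, temp=10.0):
    """Return the k nearest chunks and their softmax-over-distance weights."""
    d = np.linalg.norm(KEYS - query, axis=1)
    idx = np.argsort(d)[:k]
    w = np.exp(-d[idx] / temp)
    return CHUNKS[idx], w / w.sum()

def knn_next_token_probs(chunks, weights, step, vocab=32000):
    """Next-token distribution read off the retrieved chunks. Because
    each chunk spans several tokens, one retrieval can be reused for
    several decoding steps -- this is where the speed-up comes from."""
    p = np.zeros(vocab)
    for w, chunk in zip(weights, chunks):
        p[chunk[step]] += w
    return p

# Interpolate with the base model as in kNN-MT:
# p(y_t | x) = lam * p_knn + (1 - lam) * p_model
```

Vanilla $k$NN-MT queries the datastore at every target token; the chunk-based variant amortizes one (expensive) nearest-neighbor search over several steps.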
1 code implementation • 2 May 2022 • Patrick Fernandes, António Farinhas, Ricardo Rei, José G. C. de Souza, Perez Ogayo, Graham Neubig, André F. T. Martins
Despite recent progress in machine translation quality estimation and evaluation, decoding in neural machine translation (NMT) is mostly oblivious to this progress and centers on finding the most probable translation according to the model (MAP decoding), approximated with beam search.
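One concrete alternative to MAP decoding in this line of work is minimum Bayes risk (MBR) decoding, which reranks candidates by expected utility. The sketch below uses the candidate list itself as a Monte Carlo approximation of the model's translation distribution; the `utility` function is a hypothetical stand-in for a learned quality metric:

```python
from typing import Callable, List

def mbr_decode(candidates: List[str],
               utility: Callable[[str, str], float]) -> str:
    """Pick the candidate with the highest expected utility, treating
    the other candidates as pseudo-references."""
    best, best_score = candidates[0], float("-inf")
    for hyp in candidates:
        score = sum(utility(hyp, ref) for ref in candidates if ref is not hyp)
        if score > best_score:
            best, best_score = hyp, score
    return best

# Toy usage, with unigram Jaccard overlap standing in for a real metric:
def overlap(h: str, r: str) -> float:
    hs, rs = set(h.split()), set(r.split())
    return len(hs & rs) / max(len(hs | rs), 1)

cands = ["the cat sat", "a cat sat down", "the cat sat down"]
print(mbr_decode(cands, overlap))  # -> "the cat sat down"
```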
1 code implementation • SpaNLP (ACL) 2022 • Pedro Henrique Martins, Zita Marinho, André F. T. Martins
On the other hand, semi-parametric models have been shown to successfully perform domain adaptation by retrieving examples from an in-domain datastore (Khandelwal et al., 2021).
1 code implementation • 22 Apr 2022 • Patrick Fernandes, Marcos Treviso, Danish Pruthi, André F. T. Martins, Graham Neubig
In this work, leveraging meta-learning techniques, we extend this idea to improve the quality of the explanations themselves, specifically by optimizing explanations such that student models more effectively learn to simulate the original model.
1 code implementation • 13 Apr 2022 • Chrysoula Zerva, Taisiya Glushkova, Ricardo Rei, André F. T. Martins
Neural-based machine translation (MT) evaluation metrics are progressing fast.
1 code implementation • 4 Mar 2022 • Gonçalo R. A. Faria, André F. T. Martins, Mário A. T. Figueiredo
Recent work has shown promising results in causal discovery by leveraging interventional data with gradient-based methods, even when the intervened variables are unknown.
1 code implementation • 8 Feb 2022 • Tsvetomila Mihaylova, Vlad Niculae, André F. T. Martins
In this paper, we combine the representational strengths of factor graphs and of neural networks, proposing undirected neural networks (UNNs): a flexible framework for specifying computations that can be performed in any order.
no code implementations • spnlp (ACL) 2022 • Marcos Treviso, António Góis, Patrick Fernandes, Erick Fonseca, André F. T. Martins
Transformers' quadratic complexity with respect to the input sequence length has motivated a body of work on efficient sparse approximations to softmax.
no code implementations • 15 Sep 2021 • Kayo Yin, Patrick Fernandes, André F. T. Martins, Graham Neubig
Although proper handling of discourse phenomena significantly contributes to the quality of machine translation (MT), common translation quality metrics do not adequately capture them.
2 code implementations • Findings (EMNLP) 2021 • Taisiya Glushkova, Chrysoula Zerva, Ricardo Rei, André F. T. Martins
Several neural-based metrics have been recently proposed to evaluate machine translation quality.
2 code implementations • EMNLP 2021 • Nuno Miguel Guerreiro, André F. T. Martins
Selective rationalization aims to produce decisions along with rationales (e.g., text highlights or word alignments between two sentences).
1 code implementation • 1 Sep 2021 • Pedro Henrique Martins, Zita Marinho, André F. T. Martins
Transformers are unable to model long-term memories effectively, since the amount of computation they need to perform grows with the context length.
Ranked #1 on Dialogue Generation on WikiText-103
1 code implementation • ICLR 2022 • António Farinhas, Wilker Aziz, Vlad Niculae, André F. T. Martins
Neural networks and other machine learning models compute continuous representations, while humans communicate mostly through discrete symbols.
1 code implementation • 4 Aug 2021 • André F. T. Martins, Marcos Treviso, António Farinhas, Pedro M. Q. Aguiar, Mário A. T. Figueiredo, Mathieu Blondel, Vlad Niculae
When $\Omega$ is a Tsallis negentropy with parameter $\alpha$, we obtain "deformed exponential families," which include $\alpha$-entmax and sparsemax ($\alpha = 2$) as particular cases.
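For reference, the objects involved can be restated as follows (standard definitions, not quoted verbatim from the paper):

```latex
% Tsallis negentropy with parameter \alpha (\alpha \neq 1), and the
% entmax mapping it induces over the probability simplex \triangle:
\Omega_\alpha(p) = \frac{1}{\alpha(\alpha - 1)} \sum_j \left( p_j^\alpha - p_j \right),
\qquad
\alpha\text{-entmax}(z) = \operatorname*{arg\,max}_{p \in \triangle}\; p^\top z - \Omega_\alpha(p).
```

For $\alpha = 2$, $\Omega_2(p) = \tfrac{1}{2}\lVert p \rVert^2 - \tfrac{1}{2}$ on the simplex, and the arg max reduces to the Euclidean projection of $z$ onto $\triangle$, i.e., sparsemax.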
1 code implementation • ACL 2021 • Kayo Yin, Patrick Fernandes, Danish Pruthi, Aditi Chaudhary, André F. T. Martins, Graham Neubig
Are models paying large amounts of attention to the same context?
1 code implementation • ACL 2021 • Patrick Fernandes, Kayo Yin, Graham Neubig, André F. T. Martins
Recent work in neural machine translation has demonstrated both the necessity and feasibility of using inter-sentential context: context from sentences other than those currently being translated.
no code implementations • 7 Apr 2021 • António Farinhas, André F. T. Martins, Pedro M. Q. Aguiar
Visual attention mechanisms are a key component of neural network models for computer vision.
no code implementations • 1 Apr 2021 • André F. T. Martins
Neural networks and other machine learning models compute continuous representations, while humans communicate with discrete symbols.
1 code implementation • NAACL 2021 • Ben Peters, André F. T. Martins
Current sequence-to-sequence models are trained to minimize cross-entropy and use softmax to compute the locally normalized probabilities over target sequences.
1 code implementation • 9 Oct 2020 • Marina Fomicheva, Shuo Sun, Erick Fonseca, Chrysoula Zerva, Frédéric Blain, Vishrav Chaudhary, Francisco Guzmán, Nina Lopatina, Lucia Specia, André F. T. Martins
We present MLQE-PE, a new dataset for Machine Translation (MT) Quality Estimation (QE) and Automatic Post-Editing (APE).
1 code implementation • EMNLP 2020 • Tsvetomila Mihaylova, Vlad Niculae, André F. T. Martins
Latent structure models are a powerful tool for modeling language data: they can mitigate the error propagation and annotation bottleneck in pipeline systems, while simultaneously uncovering linguistic insights about the data.
1 code implementation • NeurIPS 2020 • Gonçalo M. Correia, Vlad Niculae, Wilker Aziz, André F. T. Martins
In this paper, we propose a new training strategy which replaces these estimators by an exact yet efficient marginalization.
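The practical upshot is that when the latent distribution is sparse, the expectation over latent assignments can be computed exactly by summing over the (small) support, with no sampling. A toy illustration with made-up numbers:

```python
import numpy as np

# A sparse latent distribution, e.g. the output of a sparsemax/entmax
# mapping over latent-assignment scores (values here are made up):
p = np.array([0.55, 0.45, 0.0, 0.0])
loss_per_assignment = np.array([10.0, 20.0, 30.0, 40.0])

# Exact marginalization: only the support contributes, so summing over
# the nonzero entries is exact and cheap -- no score-function (REINFORCE)
# or straight-through gradient estimator is needed.
support = np.nonzero(p)[0]
expected_loss = float((p[support] * loss_per_assignment[support]).sum())
print(expected_loss)  # 14.5
```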
2 code implementations • NeurIPS 2020 • André F. T. Martins, António Farinhas, Marcos Treviso, Vlad Niculae, Pedro M. Q. Aguiar, Mário A. T. Figueiredo
Exponential families are widely used in machine learning; they include many distributions in continuous and discrete domains (e.g., Gaussian, Dirichlet, Poisson, and categorical distributions via the softmax transformation).
Ranked #28 on Visual Question Answering on VQA v2 test-dev
no code implementations • EMNLP (BlackboxNLP) 2020 • Marcos V. Treviso, André F. T. Martins
Explainability is a topic of growing importance in NLP.
1 code implementation • EMNLP 2020 • Pedro Henrique Martins, Zita Marinho, André F. T. Martins
Current state-of-the-art text generators build on powerful language models such as GPT-2, achieving impressive performance.
1 code implementation • ICML 2020 • Vlad Niculae, André F. T. Martins
Structured prediction requires manipulating a large number of combinatorial structures, e.g., dependency trees or alignments, either as latent or output variables.
3 code implementations • IJCNLP 2019 • Gonçalo M. Correia, Vlad Niculae, André F. T. Martins
Our quantitative and qualitative analysis shows that heads in different layers learn different sparsity preferences and tend to be more diverse in their attention distributions than those of softmax Transformers.
1 code implementation • 24 Jul 2019 • António Góis, André F. T. Martins
The combination of machines and humans for translation is effective, with many studies showing productivity gains when humans post-edit machine-translated output instead of translating from scratch.
no code implementations • 24 Jul 2019 • André F. T. Martins, Vlad Niculae
These notes aim to shed light on the recently proposed structured projected intermediate gradient optimization technique (SPIGOT, Peng et al., 2018).
no code implementations • WS 2019 • Fabio Kepler, Jonay Trénous, Marcos Treviso, Miguel Vera, António Góis, M. Amin Farajian, António V. Lopes, André F. T. Martins
We present the contribution of the Unbabel team to the WMT 2019 Shared Task on Quality Estimation.
no code implementations • ACL 2019 • Pedro Henrique Martins, Zita Marinho, André F. T. Martins
Named entity recognition (NER) and entity linking (EL) are two fundamentally related tasks, since in order to perform EL, first the mentions to entities have to be detected.
Ranked #6 on Entity Linking on AIDA-CoNLL
2 code implementations • ACL 2019 • Tsvetomila Mihaylova, André F. T. Martins
In the Transformer model, unlike the RNN, the generation of a new word attends to the full sentence generated so far rather than only to the last word, which makes the scheduled sampling technique non-trivial to apply.
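A common workaround, and roughly the shape of the approach studied here (the model interface below is hypothetical), is a two-pass scheme: run one teacher-forced pass to obtain the model's own predictions, then train on a per-position mixture of gold and predicted tokens:

```python
import torch
import torch.nn.functional as F

def scheduled_sampling_step(model, src, gold_tgt, mix_prob: float):
    """Two-pass scheduled sampling for a Transformer decoder.
    `model(src, tgt_in)` is assumed to return logits of shape
    (batch, tgt_len, vocab); input/target shifting is elided."""
    # Pass 1: ordinary teacher forcing, to obtain model predictions.
    with torch.no_grad():
        preds = model(src, gold_tgt).argmax(dim=-1)

    # Per position, swap the gold token for the model's prediction
    # with probability mix_prob (the schedule raises this over training).
    swap = torch.rand(gold_tgt.shape, device=gold_tgt.device) < mix_prob
    mixed = torch.where(swap, preds, gold_tgt)

    # Pass 2: train against the gold targets while conditioning on the mix.
    logits = model(src, mixed)
    return F.cross_entropy(logits.transpose(1, 2), gold_tgt)
```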
1 code implementation • 14 Jun 2019 • Gonçalo M. Correia, André F. T. Martins
Automatic post-editing (APE) seeks to automatically refine the output of a black-box machine translation (MT) system through human post-edits.
no code implementations • WS 2019 • António V. Lopes, M. Amin Farajian, Gonçalo M. Correia, Jonay Trenous, André F. T. Martins
Analogously to dual-encoder architectures, we develop a BERT-based encoder-decoder (BED) model in which a single pretrained BERT encoder receives both the source (src) and machine translation (tgt) strings.
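The pairing can be expressed directly with a standard BERT tokenizer, which encodes two strings as a single segmented sequence; the checkpoint below is illustrative rather than the paper's exact setup:

```python
from transformers import BertTokenizer

tok = BertTokenizer.from_pretrained("bert-base-multilingual-cased")
src = "The house is small."
mt = "Das Haus ist klein."

# Sentence-pair encoding: [CLS] src ... [SEP] mt ... [SEP], with
# token_type_ids distinguishing the two segments for the encoder.
enc = tok(src, mt, return_tensors="pt")
print(tok.convert_ids_to_tokens(enc["input_ids"][0]))
```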
1 code implementation • ACL 2019 • Ben Peters, Vlad Niculae, André F. T. Martins
Sequence-to-sequence models are a powerful workhorse of NLP.
1 code implementation • NAACL 2019 • Afonso Mendes, Shashi Narayan, Sebastião Miranda, Zita Marinho, André F. T. Martins, Shay B. Cohen
We present a new neural model for text summarization that first extracts sentences from a document and then compresses them.
1 code implementation • NAACL 2019 • Sameen Maruf, André F. T. Martins, Gholamreza Haffari
Despite the progress made in sentence-level NMT, current systems still fall short at achieving fluent, good quality translation for a full document.
1 code implementation • ACL 2019 • Fábio Kepler, Jonay Trénous, Marcos Treviso, Miguel Vera, André F. T. Martins
We introduce OpenKiwi, a PyTorch-based open source framework for translation quality estimation.
3 code implementations • 8 Jan 2019 • Mathieu Blondel, André F. T. Martins, Vlad Niculae
Over the past decades, numerous loss functions have been proposed for a variety of supervised learning tasks, including regression, classification, ranking, and more generally structured prediction.
1 code implementation • EMNLP 2018 • Vlad Niculae, André F. T. Martins, Claire Cardie
Deep NLP models benefit from underlying structures in the data (e.g., parse trees), typically extracted using off-the-shelf parsers.
1 code implementation • WS 2018 • Sameen Maruf, André F. T. Martins, Gholamreza Haffari
In this work, we propose the task of translating Bilingual Multi-Speaker Conversations, and explore neural architectures which exploit both source and target-side conversation histories for this task.
2 code implementations • 24 May 2018 • Mathieu Blondel, André F. T. Martins, Vlad Niculae
This paper studies Fenchel-Young losses, a generic way to construct convex loss functions from a regularization function.
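The construction itself is compact enough to state (restated from the paper's framing):

```latex
% Fenchel-Young loss generated by a regularizer \Omega, for model
% scores \theta and target y, with \Omega^* the convex conjugate:
L_\Omega(\theta; y) = \Omega^*(\theta) + \Omega(y) - \langle \theta, y \rangle .
```

By the Fenchel-Young inequality the loss is nonnegative, and it vanishes exactly when $y \in \partial\Omega^*(\theta)$. Familiar special cases include the logistic (cross-entropy) loss, when $\Omega$ is the negative Shannon entropy restricted to the simplex, and the sparsemax loss, when $\Omega(p) = \tfrac{1}{2}\lVert p \rVert^2$.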
1 code implementation • ACL 2018 • Chaitanya Malaviya, Pedro Ferreira, André F. T. Martins
In NMT, words are sometimes dropped from the source or generated repeatedly in the translation.
2 code implementations • ACL 2018 • Marcin Junczys-Dowmunt, Roman Grundkiewicz, Tomasz Dwojak, Hieu Hoang, Kenneth Heafield, Tom Neckermann, Frank Seide, Ulrich Germann, Alham Fikri Aji, Nikolay Bogoychev, André F. T. Martins, Alexandra Birch
We present Marian, an efficient and self-contained Neural Machine Translation framework with an integrated automatic differentiation engine based on dynamic computation graphs.
3 code implementations • ICML 2018 • Vlad Niculae, André F. T. Martins, Mathieu Blondel, Claire Cardie
Structured prediction requires searching over a combinatorial number of structures.
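The SparseMAP relaxation handles this by replacing MAP inference with a quadratic objective over the marginal polytope, whose solutions are sparse combinations of only a few structures (restated from the paper's formulation):

```latex
% M is the marginal polytope: the convex hull of the indicator
% vectors of all feasible structures.
\mathrm{SparseMAP}(\theta) = \operatorname*{arg\,max}_{\mu \in \mathcal{M}}\;
  \langle \theta, \mu \rangle - \tfrac{1}{2}\lVert \mu \rVert^2 .
```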
8 code implementations • 5 Feb 2016 • André F. T. Martins, Ramón Fernandez Astudillo
We propose sparsemax, a new activation function similar to the traditional softmax, but able to output sparse probabilities.
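Sparsemax has a simple closed form: the Euclidean projection of the scores onto the probability simplex, computable exactly by sorting and thresholding. A minimal numpy sketch of that projection (following the standard algorithm, not copied from the paper's code):

```python
import numpy as np

def sparsemax(z: np.ndarray) -> np.ndarray:
    """Euclidean projection of z onto the probability simplex.
    Unlike softmax, the output can contain exact zeros."""
    z_sorted = np.sort(z)[::-1]                 # sort scores descending
    cumsum = np.cumsum(z_sorted) - 1.0
    ks = np.arange(1, z.size + 1)
    # Largest k such that 1 + k * z_(k) > sum of the top-k scores:
    k = ks[z_sorted - cumsum / ks > 0][-1]
    tau = cumsum[k - 1] / k                     # threshold
    return np.maximum(z - tau, 0.0)

print(sparsemax(np.array([2.0, 1.9, -1.0, -2.0])))  # [0.55 0.45 0.   0.  ]
```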
no code implementations • IJCNLP 2015 • Daniel Fernández-González, André F. T. Martins
We reduce phrase-representation parsing to dependency parsing.