Search Results for author: André F. T. Martins

Found 58 papers, 39 papers with code

QUARTZ: Quality-Aware Machine Translation

no code implementations EAMT 2022 José G. C. de Souza, Ricardo Rei, Ana C. Farinha, Helena Moniz, André F. T. Martins

This paper presents QUARTZ, QUality-AwaRe machine Translation, a project led by Unbabel that aims to develop machine translation systems that are more robust and produce fewer critical errors.

Machine Translation · Translation

Findings of the WMT 2021 Shared Task on Quality Estimation

no code implementations WMT (EMNLP) 2021 Lucia Specia, Frédéric Blain, Marina Fomicheva, Chrysoula Zerva, Zhenhao Li, Vishrav Chaudhary, André F. T. Martins

We report the results of the WMT 2021 shared task on Quality Estimation, where the challenge is to predict the quality of the output of neural machine translation systems at the word and sentence levels.

Machine Translation · Translation

Findings of the WMT 2020 Shared Task on Quality Estimation

no code implementations WMT (EMNLP) 2020 Lucia Specia, Frédéric Blain, Marina Fomicheva, Erick Fonseca, Vishrav Chaudhary, Francisco Guzmán, André F. T. Martins

We report the results of the WMT20 shared task on Quality Estimation, where the challenge is to predict the quality of the output of neural machine translation systems at the word, sentence and document levels.

Machine Translation · Translation

Project MAIA: Multilingual AI Agent Assistant

no code implementations EAMT 2020 André F. T. Martins, João Graça, Paulo Dimas, Helena Moniz, Graham Neubig

This paper presents the Multilingual Artificial Intelligence Agent Assistant (MAIA), a project led by Unbabel with the collaboration of CMU, INESC-ID and IT Lisbon.

Natural Language Processing · Translation

DeepSPIN: Deep Structured Prediction for Natural Language Processing

no code implementations EAMT 2022 André F. T. Martins

DeepSPIN is a research project funded by the European Research Council (ERC) whose goal is to develop new neural structured prediction methods, models, and algorithms for improving the quality, interpretability, and data-efficiency of natural language processing (NLP) systems, with special emphasis on machine translation and quality estimation applications.

Machine Translation · Natural Language Processing · +2

Chunk-based Nearest Neighbor Machine Translation

no code implementations 24 May 2022 Pedro Henrique Martins, Zita Marinho, André F. T. Martins

Experiments on machine translation in two settings, static domain adaptation and "on-the-fly" adaptation, show that the chunk-based $k$NN-MT model leads to a significant speed-up (up to 4 times) with only a small drop in translation quality.

Domain Adaptation · Language Modelling · +2

Quality-Aware Decoding for Neural Machine Translation

1 code implementation 2 May 2022 Patrick Fernandes, António Farinhas, Ricardo Rei, José G. C. de Souza, Perez Ogayo, Graham Neubig, André F. T. Martins

Despite the progress in machine translation quality estimation and evaluation in recent years, decoding in neural machine translation (NMT) is mostly oblivious to this progress and centers on finding the most probable translation according to the model (MAP decoding), approximated with beam search.
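In contrast, quality-aware decoding selects among candidate translations using learned quality metrics rather than model probability alone. A minimal sketch of the N-best reranking variant, assuming hypothetical `translate_nbest` and `quality_score` callables standing in for an NMT system and a QE/metric model (e.g., COMET):

```python
# Minimal sketch of quality-aware N-best reranking. `translate_nbest`
# and `quality_score` are illustrative placeholders, not a real API.

def quality_aware_decode(source, translate_nbest, quality_score, n=20):
    """Pick the candidate a quality metric prefers, instead of the
    single most probable (MAP) translation from beam search."""
    candidates = translate_nbest(source, n)  # beam search or sampling
    return max(candidates, key=lambda c: quality_score(source, c))
```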

Machine Translation · Translation

Efficient Machine Translation Domain Adaptation

1 code implementation SpaNLP (ACL) 2022 Pedro Henrique Martins, Zita Marinho, André F. T. Martins

On the other hand, semi-parametric models have been shown to successfully perform domain adaptation by retrieving examples from an in-domain datastore (Khandelwal et al., 2021).
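For context, a hedged NumPy sketch of the token-level $k$NN-MT interpolation from Khandelwal et al. (2021) that the snippet refers to; the datastore of (decoder-state, target-token) pairs and all names here are illustrative:

```python
import numpy as np

def knn_mt_probs(p_model, hidden, keys, values, vocab_size,
                 k=8, temperature=10.0, lam=0.5):
    """Mix the NMT next-token distribution with a distribution built
    from the k nearest datastore entries to the decoder state."""
    dists = np.linalg.norm(keys - hidden, axis=1)  # distance to each key
    nn = np.argsort(dists)[:k]                     # k nearest neighbors
    w = np.exp(-dists[nn] / temperature)           # closer => larger weight
    w /= w.sum()
    p_knn = np.zeros(vocab_size)
    for weight, token in zip(w, values[nn]):
        p_knn[token] += weight                     # aggregate by target token
    return lam * p_knn + (1 - lam) * p_model       # interpolate distributions
```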

Domain Adaptation · Language Modelling · +2

Learning to Scaffold: Optimizing Model Explanations for Teaching

1 code implementation 22 Apr 2022 Patrick Fernandes, Marcos Treviso, Danish Pruthi, André F. T. Martins, Graham Neubig

In this work, leveraging meta-learning techniques, we extend this idea to improve the quality of the explanations themselves, specifically by optimizing explanations such that student models more effectively learn to simulate the original model.

Meta-Learning · Natural Language Processing

Differentiable Causal Discovery Under Latent Interventions

1 code implementation 4 Mar 2022 Gonçalo R. A. Faria, André F. T. Martins, Mário A. T. Figueiredo

Recent work has shown promising results in causal discovery by leveraging interventional data with gradient-based methods, even when the intervened variables are unknown.

Causal Discovery · Variational Inference

Modeling Structure with Undirected Neural Networks

1 code implementation 8 Feb 2022 Tsvetomila Mihaylova, Vlad Niculae, André F. T. Martins

In this paper, we combine the representational strengths of factor graphs and of neural networks, proposing undirected neural networks (UNNs): a flexible framework for specifying computations that can be performed in any order.

Dependency Parsing · Image Classification

Predicting Attention Sparsity in Transformers

no code implementations spnlp (ACL) 2022 Marcos Treviso, António Góis, Patrick Fernandes, Erick Fonseca, André F. T. Martins

Transformers' quadratic complexity with respect to the input sequence length has motivated a body of work on efficient sparse approximations to softmax.

Language Modelling · Machine Translation · +3

When Does Translation Require Context? A Data-driven, Multilingual Exploration

no code implementations 15 Sep 2021 Kayo Yin, Patrick Fernandes, André F. T. Martins, Graham Neubig

Although proper handling of discourse phenomena significantly contributes to the quality of machine translation (MT), common translation quality metrics do not adequately capture them.

Machine Translation · Translation

SPECTRA: Sparse Structured Text Rationalization

2 code implementations EMNLP 2021 Nuno Miguel Guerreiro, André F. T. Martins

Selective rationalization aims to produce decisions along with rationales (e.g., text highlights or word alignments between two sentences).

Natural Language Inference

$\infty$-former: Infinite Memory Transformer

1 code implementation 1 Sep 2021 Pedro Henrique Martins, Zita Marinho, André F. T. Martins

Transformers are unable to model long-term memories effectively, since the amount of computation they need to perform grows with the context length.

Dialogue Generation · Language Modelling

Sparse Communication via Mixed Distributions

1 code implementation ICLR 2022 António Farinhas, Wilker Aziz, Vlad Niculae, André F. T. Martins

Neural networks and other machine learning models compute continuous representations, while humans communicate mostly through discrete symbols.

Sparse Continuous Distributions and Fenchel-Young Losses

1 code implementation 4 Aug 2021 André F. T. Martins, Marcos Treviso, António Farinhas, Pedro M. Q. Aguiar, Mário A. T. Figueiredo, Mathieu Blondel, Vlad Niculae

When $\Omega$ is a Tsallis negentropy with parameter $\alpha$, we obtain "deformed exponential families," which include $\alpha$-entmax and sparsemax ($\alpha = 2$) as particular cases.
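For reference, the Tsallis $\alpha$-entropy and the entmax mapping it induces, as used in this line of work ($\alpha = 2$ recovers sparsemax; $\alpha \to 1$ recovers softmax):

```latex
% Tsallis \alpha-entropy over the simplex \Delta, and the induced mapping.
H_\alpha(p) =
\begin{cases}
  \frac{1}{\alpha(\alpha - 1)} \sum_j \bigl( p_j - p_j^\alpha \bigr), & \alpha \neq 1, \\[2pt]
  -\sum_j p_j \log p_j, & \alpha = 1,
\end{cases}
\qquad
\alpha\text{-entmax}(z) = \operatorname*{arg\,max}_{p \in \Delta} \; p^\top z + H_\alpha(p)
```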

Audio Classification · Question Answering · +1

Measuring and Increasing Context Usage in Context-Aware Machine Translation

1 code implementation ACL 2021 Patrick Fernandes, Kayo Yin, Graham Neubig, André F. T. Martins

Recent work in neural machine translation has demonstrated both the necessity and feasibility of using inter-sentential context -- context from sentences other than those currently being translated.

Document Level Machine Translation · Machine Translation · +1

Reconciling the Discrete-Continuous Divide: Towards a Mathematical Theory of Sparse Communication

no code implementations 1 Apr 2021 André F. T. Martins

Neural networks and other machine learning models compute continuous representations, while humans communicate with discrete symbols.

Smoothing and Shrinking the Sparse Seq2Seq Search Space

1 code implementation NAACL 2021 Ben Peters, André F. T. Martins

Current sequence-to-sequence models are trained to minimize cross-entropy and use softmax to compute the locally normalized probabilities over target sequences.
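Written out, the locally normalized setup this snippet describes, with cross-entropy training on the gold target:

```latex
% Locally normalized seq2seq: each target-token distribution is a softmax
% over decoder scores; training minimizes cross-entropy on the gold target.
p_\theta(y \mid x) = \prod_{t=1}^{|y|}
  \operatorname{softmax}\bigl( f_\theta(x, y_{<t}) \bigr)_{y_t},
\qquad
\mathcal{L}(\theta) = -\log p_\theta(y^\star \mid x)
```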

Machine Translation · Morphological Inflection · +1

Understanding the Mechanics of SPIGOT: Surrogate Gradients for Latent Structure Learning

1 code implementation EMNLP 2020 Tsvetomila Mihaylova, Vlad Niculae, André F. T. Martins

Latent structure models are a powerful tool for modeling language data: they can mitigate the error propagation and annotation bottleneck in pipeline systems, while simultaneously uncovering linguistic insights about the data.

Efficient Marginalization of Discrete and Structured Latent Variables via Sparsity

1 code implementation NeurIPS 2020 Gonçalo M. Correia, Vlad Niculae, Wilker Aziz, André F. T. Martins

In this paper, we propose a new training strategy which replaces these estimators by an exact yet efficient marginalization.
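A minimal sketch of the idea, assuming the model already produced a sparse posterior over a discrete latent variable (e.g., via a sparsemax-style mapping); `downstream_loss` is a hypothetical placeholder:

```python
import numpy as np

def sparse_marginal_loss(posterior, downstream_loss):
    """Exact expected loss: only latent values with nonzero probability
    contribute, so a sparse posterior keeps the sum cheap; no sampling-based
    gradient estimator (REINFORCE, straight-through) is needed."""
    support = np.flatnonzero(posterior)
    return sum(posterior[z] * downstream_loss(z) for z in support)
```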

Sparse and Continuous Attention Mechanisms

2 code implementations NeurIPS 2020 André F. T. Martins, António Farinhas, Marcos Treviso, Vlad Niculae, Pedro M. Q. Aguiar, Mário A. T. Figueiredo

Exponential families are widely used in machine learning; they include many distributions in continuous and discrete domains (e.g., Gaussian, Dirichlet, Poisson, and categorical distributions via the softmax transformation).

Machine Translation · Question Answering · +3

Sparse Text Generation

1 code implementation EMNLP 2020 Pedro Henrique Martins, Zita Marinho, André F. T. Martins

Current state-of-the-art text generators build on powerful language models such as GPT-2, achieving impressive performance.

Dialogue Generation · Language Modelling · +1

LP-SparseMAP: Differentiable Relaxed Optimization for Sparse Structured Prediction

1 code implementation ICML 2020 Vlad Niculae, André F. T. Martins

Structured prediction requires manipulating a large number of combinatorial structures, e.g., dependency trees or alignments, either as latent or output variables.

Structured Prediction

Adaptively Sparse Transformers

3 code implementations IJCNLP 2019 Gonçalo M. Correia, Vlad Niculae, André F. T. Martins

Our quantitative and qualitative analysis shows that heads in different layers learn different sparsity preferences and exhibit more diverse attention distributions than those of softmax Transformers.

Machine Translation · Translation

Translator2Vec: Understanding and Representing Human Post-Editors

1 code implementation 24 Jul 2019 António Góis, André F. T. Martins

The combination of machines and humans for translation is effective, with many studies showing productivity gains when humans post-edit machine-translated output instead of translating from scratch.

Translation

Notes on Latent Structure Models and SPIGOT

no code implementations 24 Jul 2019 André F. T. Martins, Vlad Niculae

These notes aim to shed light on the recently proposed structured projected intermediate gradient optimization technique (SPIGOT, Peng et al., 2018).

Joint Learning of Named Entity Recognition and Entity Linking

no code implementations ACL 2019 Pedro Henrique Martins, Zita Marinho, André F. T. Martins

Named entity recognition (NER) and entity linking (EL) are two fundamentally related tasks, since in order to perform EL, entity mentions must first be detected.

Entity Linking · Multi-Task Learning · +3

Scheduled Sampling for Transformers

2 code implementations ACL 2019 Tsvetomila Mihaylova, André F. T. Martins

In the Transformer model, unlike the RNN, the generation of a new word attends to the full sentence generated so far, not only to the last word, and it is not straightforward to apply the scheduled sampling technique.
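One way to adapt it is a two-pass scheme: decode once with teacher forcing, then decode again on a mix of gold and model-predicted tokens. The sketch below is a hedged illustration of that idea; `model.decode` and `model.loss` are placeholders, not a specific library API:

```python
import random

def scheduled_sampling_step(model, src, gold, sample_prob=0.25):
    # Pass 1: ordinary teacher forcing to obtain the model's own predictions.
    logits = model.decode(src, gold)
    predicted = logits.argmax(-1)
    # Mix inputs: each position keeps the gold token, or takes the model's
    # prediction with probability `sample_prob`.
    mixed = [p if random.random() < sample_prob else g
             for g, p in zip(gold, predicted)]
    # Pass 2: decode on the mixed sequence; the loss still uses the gold target.
    return model.loss(model.decode(src, mixed), gold)
```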

A Simple and Effective Approach to Automatic Post-Editing with Transfer Learning

1 code implementation 14 Jun 2019 Gonçalo M. Correia, André F. T. Martins

Automatic post-editing (APE) seeks to automatically refine the output of a black-box machine translation (MT) system through human post-edits.

Automatic Post-Editing · Transfer Learning · +1

Unbabel's Submission to the WMT2019 APE Shared Task: BERT-based Encoder-Decoder for Automatic Post-Editing

no code implementations WS 2019 António V. Lopes, M. Amin Farajian, Gonçalo M. Correia, Jonay Trenous, André F. T. Martins

Analogously to dual-encoder architectures, we develop a BERT-based encoder-decoder (BED) model in which a single pretrained BERT encoder receives both the source src and machine translation tgt strings.
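As an illustration of the joint-encoding idea (not the paper's exact preprocessing), BERT's standard sentence-pair format can carry both strings through a single encoder:

```python
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")
src = "The cat sat on the mat."
mt = "Die Katze saß auf der Matte."
# Produces [CLS] src tokens [SEP] mt tokens [SEP]; `token_type_ids`
# distinguishes the source segment from the machine-translation segment.
encoded = tokenizer(src, mt, return_tensors="pt")
print(encoded["input_ids"].shape, encoded["token_type_ids"][0].tolist())
```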

Automatic Post-Editing · Translation

Selective Attention for Context-aware Neural Machine Translation

1 code implementation NAACL 2019 Sameen Maruf, André F. T. Martins, Gholamreza Haffari

Despite the progress made in sentence-level NMT, current systems still fall short at achieving fluent, good quality translation for a full document.

Machine Translation · Translation

Learning with Fenchel-Young Losses

3 code implementations 8 Jan 2019 Mathieu Blondel, André F. T. Martins, Vlad Niculae

Over the past decades, numerous loss functions have been proposed for a variety of supervised learning tasks, including regression, classification, ranking, and more generally structured prediction.

Structured Prediction

Towards Dynamic Computation Graphs via Sparse Latent Structure

1 code implementation EMNLP 2018 Vlad Niculae, André F. T. Martins, Claire Cardie

Deep NLP models benefit from underlying structures in the data (e.g., parse trees), typically extracted using off-the-shelf parsers.

graph construction

Contextual Neural Model for Translating Bilingual Multi-Speaker Conversations

1 code implementation WS 2018 Sameen Maruf, André F. T. Martins, Gholamreza Haffari

In this work, we propose the task of translating Bilingual Multi-Speaker Conversations, and explore neural architectures which exploit both source and target-side conversation histories for this task.

Document Translation · Machine Translation · +1

Learning Classifiers with Fenchel-Young Losses: Generalized Entropies, Margins, and Algorithms

2 code implementations 24 May 2018 Mathieu Blondel, André F. T. Martins, Vlad Niculae

This paper studies Fenchel-Young losses, a generic way to construct convex loss functions from a regularization function.
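For reference, the Fenchel-Young loss generated by a regularizer $\Omega$ with convex conjugate $\Omega^*$:

```latex
% Fenchel-Young loss: nonnegative, convex in \theta, and zero exactly when
% y is the regularized prediction \nabla \Omega^*(\theta). Taking \Omega to be
% the negative Shannon entropy recovers the logistic (softmax) loss.
L_\Omega(\theta; y) = \Omega^*(\theta) + \Omega(y) - \langle \theta, y \rangle
```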

Marian: Fast Neural Machine Translation in C++

2 code implementations ACL 2018 Marcin Junczys-Dowmunt, Roman Grundkiewicz, Tomasz Dwojak, Hieu Hoang, Kenneth Heafield, Tom Neckermann, Frank Seide, Ulrich Germann, Alham Fikri Aji, Nikolay Bogoychev, André F. T. Martins, Alexandra Birch

We present Marian, an efficient and self-contained Neural Machine Translation framework with an integrated automatic differentiation engine based on dynamic computation graphs.

Machine Translation · Translation

From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification

8 code implementations 5 Feb 2016 André F. T. Martins, Ramón Fernandez Astudillo

We propose sparsemax, a new activation function similar to the traditional softmax, but able to output sparse probabilities.
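A minimal NumPy sketch of sparsemax, via its closed-form solution as the Euclidean projection of the scores onto the probability simplex (sort, find the support size, threshold):

```python
import numpy as np

def sparsemax(z):
    """Project scores z onto the probability simplex; unlike softmax,
    entries below the threshold tau get exactly zero probability."""
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]             # scores in descending order
    cumsum = np.cumsum(z_sorted)
    ks = np.arange(1, z.size + 1)
    k = ks[1 + ks * z_sorted > cumsum][-1]  # support size
    tau = (cumsum[k - 1] - 1.0) / k         # threshold
    return np.maximum(z - tau, 0.0)

print(sparsemax([1.2, 0.8, -1.0]))  # [0.7 0.3 0. ]; the lowest score is zeroed
```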

Classification · General Classification · +2
