no code implementations • WMT (EMNLP) 2020 • João Moura, Miguel Vera, Daan van Stigt, Fabio Kepler, André F. T. Martins
We present the joint contribution of IST and Unbabel to the WMT 2020 Shared Task on Quality Estimation.
1 code implementation • WMT (EMNLP) 2021 • Ricardo Rei, Ana C Farinha, Chrysoula Zerva, Daan van Stigt, Craig Stewart, Pedro Ramos, Taisiya Glushkova, André F. T. Martins, Alon Lavie
In this paper, we present the joint contribution of Unbabel and IST to the WMT 2021 Metrics Shared Task.
no code implementations • WMT (EMNLP) 2021 • Chrysoula Zerva, Daan van Stigt, Ricardo Rei, Ana C Farinha, Pedro Ramos, José G. C. de Souza, Taisiya Glushkova, Miguel Vera, Fabio Kepler, André F. T. Martins
We present the joint contribution of IST and Unbabel to the WMT 2021 Shared Task on Quality Estimation.
no code implementations • WMT (EMNLP) 2021 • Lucia Specia, Frédéric Blain, Marina Fomicheva, Chrysoula Zerva, Zhenhao Li, Vishrav Chaudhary, André F. T. Martins
We report the results of the WMT 2021 shared task on Quality Estimation, where the challenge is to predict the quality of the output of neural machine translation systems at the word and sentence levels.
no code implementations • WMT (EMNLP) 2020 • Lucia Specia, Frédéric Blain, Marina Fomicheva, Erick Fonseca, Vishrav Chaudhary, Francisco Guzmán, André F. T. Martins
We report the results of the WMT20 shared task on Quality Estimation, where the challenge is to predict the quality of the output of neural machine translation systems at the word, sentence and document levels.
1 code implementation • WMT (EMNLP) 2020 • M. Amin Farajian, António V. Lopes, André F. T. Martins, Sameen Maruf, Gholamreza Haffari
We report the results of the first edition of the WMT shared task on chat translation.
no code implementations • EAMT 2020 • António Lopes, M. Amin Farajian, Rachel Bawden, Michael Zhang, André F. T. Martins
In this paper we provide a systematic comparison of existing and new document-level neural machine translation solutions.
no code implementations • EAMT 2022 • José G.C. de Souza, Ricardo Rei, Ana C. Farinha, Helena Moniz, André F. T. Martins
This paper presents QUARTZ, QUality-AwaRe machine Translation, a project led by Unbabel which aims at developing machine translation systems that are more robust and produce fewer critical errors.
no code implementations • EAMT 2020 • André F. T. Martins, Joao Graca, Paulo Dimas, Helena Moniz, Graham Neubig
This paper presents the Multilingual Artificial Intelligence Agent Assistant (MAIA), a project led by Unbabel with the collaboration of CMU, INESC-ID and IT Lisbon.
no code implementations • EAMT 2022 • André F. T. Martins
DeepSPIN is a research project funded by the European Research Council (ERC) whose goal is to develop new neural structured prediction methods, models, and algorithms for improving the quality, interpretability, and data-efficiency of natural language processing (NLP) systems, with special emphasis on machine translation and quality estimation applications.
no code implementations • 24 Sep 2024 • Pedro Henrique Martins, Patrick Fernandes, João Alves, Nuno M. Guerreiro, Ricardo Rei, Duarte M. Alves, José Pombal, Amin Farajian, Manuel Faysse, Mateusz Klimaszewski, Pierre Colombo, Barry Haddow, José G. C. de Souza, Alexandra Birch, André F. T. Martins
The quality of open-weight LLMs has seen significant improvement, yet they remain predominantly focused on English.
no code implementations • 11 Sep 2024 • António Farinhas, Haau-Sing Li, André F. T. Martins
In this paper, we draw a parallel between this strategy and the use of redundancy to decrease the error rate in noisy communication channels.
1 code implementation • 25 Aug 2024 • Haau-Sing Li, Patrick Fernandes, Iryna Gurevych, André F. T. Martins
Recently, a diverse set of decoding and reranking procedures has been shown to be effective for LLM-based code generation.
no code implementations • 7 Jul 2024 • Hugo Pitorro, Pavlo Vasylenko, Marcos Treviso, André F. T. Martins
Transformers are the current architecture of choice for NLP, but their attention layers do not scale well to long contexts.
no code implementations • 29 Jun 2024 • Peiqin Lin, André F. T. Martins, Hinrich Schütze
Recent studies have highlighted the potential of exploiting parallel corpora to enhance multilingual large language models, improving performance in both bilingual tasks, e.g., machine translation, and general-purpose tasks, e.g., text classification.
no code implementations • 27 Jun 2024 • Marcos Treviso, Nuno M. Guerreiro, Sweta Agrawal, Ricardo Rei, José Pombal, Tania Vaz, Helena Wu, Beatriz Silva, Daan van Stigt, André F. T. Martins
While machine translation (MT) systems are achieving increasingly strong performance on benchmarks, they often produce translations with errors and anomalies.
1 code implementation • 26 Jun 2024 • Anna Bavaresco, Raffaella Bernardi, Leonardo Bertolazzi, Desmond Elliott, Raquel Fernández, Albert Gatt, Esam Ghaleb, Mario Giulianelli, Michael Hanna, Alexander Koller, André F. T. Martins, Philipp Mondorf, Vera Neplenbroek, Sandro Pezzelle, Barbara Plank, David Schlangen, Alessandro Suglia, Aditya K Surikuchi, Ece Takmaz, Alberto Testoni
There is an increasing trend towards evaluating NLP models with LLM-generated judgments instead of human judgments.
1 code implementation • 28 May 2024 • Gonçalo R. A. Faria, Sweta Agrawal, António Farinhas, Ricardo Rei, José G. C. de Souza, André F. T. Martins
An important challenge in machine translation (MT) is to generate high-quality and diverse translations.
no code implementations • 28 May 2024 • Sweta Agrawal, António Farinhas, Ricardo Rei, André F. T. Martins
Automatic metrics for evaluating translation quality are typically validated by measuring how well they correlate with human assessments.
1 code implementation • 8 May 2024 • Peiqin Lin, André F. T. Martins, Hinrich Schütze
Thus, we introduce XAMPLER: Cross-Lingual Example Retrieval, a method tailored to tackle the challenge of cross-lingual in-context learning using only annotated English data.
no code implementations • 3 May 2024 • Margarida M. Campos, António Farinhas, Chrysoula Zerva, Mário A. T. Figueiredo, André F. T. Martins
The rapid proliferation of large language models and natural language processing (NLP) applications creates a crucial need for uncertainty quantification to mitigate risks such as hallucinations and to enhance decision-making reliability in critical applications.
1 code implementation • 13 Mar 2024 • Sweta Agrawal, Amin Farajian, Patrick Fernandes, Ricardo Rei, André F. T. Martins
Our findings show that augmenting neural learned metrics with contextual information helps improve correlation with human judgments in the reference-free scenario and when evaluating translations in out-of-English settings.
no code implementations • 6 Mar 2024 • Ben Peters, André F. T. Martins
Neural machine translation (MT) models achieve strong results across a variety of settings, but it is widely believed that they are highly sensitive to "noisy" inputs, such as spelling errors, abbreviations, and other formatting issues.
2 code implementations • 27 Feb 2024 • Duarte M. Alves, José Pombal, Nuno M. Guerreiro, Pedro H. Martins, João Alves, Amin Farajian, Ben Peters, Ricardo Rei, Patrick Fernandes, Sweta Agrawal, Pierre Colombo, José G. C. de Souza, André F. T. Martins
While general-purpose large language models (LLMs) demonstrate proficiency on multiple tasks within the domain of translation, approaches based on open LLMs are competitive only when specializing on a single task.
1 code implementation • 1 Feb 2024 • Dennis Ulmer, Chrysoula Zerva, André F. T. Martins
Conformal prediction is an attractive framework for providing predictions imbued with statistical guarantees; however, its application to text generation is challenging, since i.i.d. assumptions rarely hold in that setting.
1 code implementation • 1 Feb 2024 • Manuel Faysse, Patrick Fernandes, Nuno M. Guerreiro, António Loison, Duarte M. Alves, Caio Corro, Nicolas Boizard, João Alves, Ricardo Rei, Pedro H. Martins, Antoni Bigata Casademunt, François Yvon, André F. T. Martins, Gautier Viaud, Céline Hudelot, Pierre Colombo
We introduce CroissantLLM, a 1.3B language model pretrained on a set of 3T English and French tokens, to bring to the research and industrial community a high-performance, fully open-sourced bilingual model that runs swiftly on consumer-grade local hardware.
no code implementations • 24 Jan 2024 • Peiqin Lin, Shaoxiong Ji, Jörg Tiedemann, André F. T. Martins, Hinrich Schütze
Large language models (LLMs) have advanced the state of the art in natural language processing.
no code implementations • 15 Nov 2023 • Miguel Moura Ramos, Patrick Fernandes, António Farinhas, André F. T. Martins
A core ingredient in RLHF's success in aligning and improving large language models (LLMs) is its reward model, trained using human feedback on model outputs.
no code implementations • 20 Oct 2023 • Duarte M. Alves, Nuno M. Guerreiro, João Alves, José Pombal, Ricardo Rei, José G. C. de Souza, Pierre Colombo, André F. T. Martins
Experiments on 10 language pairs show that our proposed approach recovers the original few-shot capabilities while keeping the added benefits of finetuning.
1 code implementation • 17 Oct 2023 • António Farinhas, José G. C. de Souza, André F. T. Martins
Large language models (LLMs) are becoming a one-fits-many solution, but they sometimes hallucinate or produce unreliable output.
2 code implementations • 16 Oct 2023 • Nuno M. Guerreiro, Ricardo Rei, Daan van Stigt, Luisa Coheur, Pierre Colombo, André F. T. Martins
Widely used learned metrics for machine translation evaluation, such as COMET and BLEURT, estimate the quality of a translation hypothesis by providing a single sentence-level score.
1 code implementation • 2 Oct 2023 • António Farinhas, Chrysoula Zerva, Dennis Ulmer, André F. T. Martins
Split conformal prediction has recently sparked great interest due to its ability to provide formally guaranteed uncertainty sets or intervals for predictions made by black-box neural models, ensuring a predefined probability of containing the actual ground truth.
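For readers unfamiliar with the framework referenced above, the following is a minimal sketch of split conformal prediction for a regression-style score, not the paper's non-exchangeable extension; the function name and the absolute-error nonconformity score are illustrative assumptions.

    import numpy as np

    def split_conformal_interval(cal_preds, cal_targets, test_pred, alpha=0.1):
        # Nonconformity scores on a held-out calibration set: absolute residuals.
        scores = np.sort(np.abs(np.asarray(cal_targets) - np.asarray(cal_preds)))
        n = len(scores)
        # Finite-sample corrected quantile: the ceil((n+1)(1-alpha))-th smallest score.
        rank = min(int(np.ceil((n + 1) * (1 - alpha))), n)
        q_hat = scores[rank - 1]
        # Interval that covers the true value with probability >= 1 - alpha,
        # assuming calibration and test points are exchangeable.
        return test_pred - q_hat, test_pred + q_hat

    rng = np.random.default_rng(0)
    preds, targets = rng.normal(size=500), rng.normal(size=500)
    print(split_conformal_interval(preds, targets, test_pred=0.3, alpha=0.1))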
1 code implementation • 21 Sep 2023 • Ricardo Rei, Nuno M. Guerreiro, José Pombal, Daan van Stigt, Marcos Treviso, Luisa Coheur, José G. C. de Souza, André F. T. Martins
Our team participated in all tasks: sentence- and word-level quality prediction (task 1) and fine-grained error span detection (task 2).
1 code implementation • 14 Aug 2023 • Patrick Fernandes, Daniel Deutsch, Mara Finkelstein, Parker Riley, André F. T. Martins, Graham Neubig, Ankush Garg, Jonathan H. Clark, Markus Freitag, Orhan Firat
Automatic evaluation of machine translation (MT) is a critical tool driving the rapid iterative development of MT systems.
no code implementations • 9 Jun 2023 • Chrysoula Zerva, André F. T. Martins
Several uncertainty estimation methods have been recently proposed for machine translation evaluation.
1 code implementation • 30 May 2023 • Taisiya Glushkova, Chrysoula Zerva, André F. T. Martins
Although neural-based machine translation evaluation metrics, such as COMET or BLEURT, have achieved strong correlations with human judgements, they are sometimes unreliable in detecting certain phenomena that can be considered as critical errors, such as deviations in entities and numbers.
1 code implementation • 26 May 2023 • Marcos Treviso, Alexis Ross, Nuno M. Guerreiro, André F. T. Martins
Selective rationales and counterfactual examples have emerged as two effective, complementary classes of interpretability methods for analyzing and training NLP models.
1 code implementation • 23 May 2023 • Peiqin Lin, Chengzhi Hu, Zheyu Zhang, André F. T. Martins, Hinrich Schütze
Recent multilingual pretrained language models (mPLMs) have been shown to encode strong language-specific signals, which are not explicitly provided during pretraining.
Open-Ended Question Answering • Zero-Shot Cross-Lingual Transfer
1 code implementation • 20 May 2023 • Ayyoob Imani, Peiqin Lin, Amir Hossein Kargaran, Silvia Severini, Masoud Jalili Sabet, Nora Kassner, Chunlan Ma, Helmut Schmid, André F. T. Martins, François Yvon, Hinrich Schütze
The NLP community has mainly focused on scaling Large Language Models (LLMs) vertically, i.e., making them better for about 100 languages.
1 code implementation • 19 May 2023 • Ricardo Rei, Nuno M. Guerreiro, Marcos Treviso, Luisa Coheur, Alon Lavie, André F. T. Martins
Neural metrics for machine translation evaluation, such as COMET, exhibit significant improvements in their correlation with human judgments, as compared to traditional metrics based on lexical overlap, such as BLEU.
no code implementations • 1 May 2023 • Patrick Fernandes, Aman Madaan, Emmy Liu, António Farinhas, Pedro Henrique Martins, Amanda Bertsch, José G. C. de Souza, Shuyan Zhou, Tongshuang Wu, Graham Neubig, André F. T. Martins
Many recent advances in natural language generation have been fueled by training large language models on internet-scale data.
1 code implementation • 28 Mar 2023 • Nuno M. Guerreiro, Duarte Alves, Jonas Waldendorf, Barry Haddow, Alexandra Birch, Pierre Colombo, André F. T. Martins
Large-scale multilingual machine translation systems have demonstrated remarkable ability to translate directly between numerous languages, making them increasingly appealing for real-world applications.
no code implementations • 18 Jan 2023 • Vlad Niculae, Caio F. Corro, Nikita Nangia, Tsvetomila Mihaylova, André F. T. Martins
Many types of data from fields including natural language processing, computer vision, and bioinformatics, are well represented by discrete, compositional structures such as trees, sequences, or matchings.
1 code implementation • 19 Dec 2022 • Haau-Sing Li, Mohsen Mesgar, André F. T. Martins, Iryna Gurevych
We hypothesize that the under-specification of a natural language description can be resolved by asking clarification questions.
1 code implementation • 19 Dec 2022 • Nuno M. Guerreiro, Pierre Colombo, Pablo Piantanida, André F. T. Martins
Neural machine translation (NMT) has become the de-facto standard in real-world machine translation applications.
1 code implementation • 27 Oct 2022 • Diogo Pernes, Afonso Mendes, André F. T. Martins
Current abstractive summarization systems exhibit important weaknesses, such as omitting relevant information and generating factual inconsistencies (also known as hallucinations), which prevent their deployment in real-world applications.
1 code implementation • 13 Sep 2022 • Ricardo Rei, Marcos Treviso, Nuno M. Guerreiro, Chrysoula Zerva, Ana C. Farinha, Christine Maroti, José G. C. de Souza, Taisiya Glushkova, Duarte M. Alves, Alon Lavie, Luisa Coheur, André F. T. Martins
We present the joint contribution of IST and Unbabel to the WMT 2022 Shared Task on Quality Estimation (QE).
no code implementations • 31 Aug 2022 • Marcos Treviso, Ji-Ung Lee, Tianchu Ji, Betty van Aken, Qingqing Cao, Manuel R. Ciosici, Michael Hassid, Kenneth Heafield, Sara Hooker, Colin Raffel, Pedro H. Martins, André F. T. Martins, Jessica Zosa Forde, Peter Milder, Edwin Simpson, Noam Slonim, Jesse Dodge, Emma Strubell, Niranjan Balasubramanian, Leon Derczynski, Iryna Gurevych, Roy Schwartz
Recent work in natural language processing (NLP) has yielded appealing results from scaling model parameters and training data; however, using only scale to improve performance means that resource consumption also grows.
2 code implementations • 10 Aug 2022 • Nuno M. Guerreiro, Elena Voita, André F. T. Martins
Although the problem of hallucinations in neural machine translation (NMT) has received some attention, research on this highly pathological phenomenon lacks solid ground.
1 code implementation • 24 May 2022 • Pedro Henrique Martins, Zita Marinho, André F. T. Martins
Semi-parametric models, which augment generation with retrieval, have led to impressive results in language modeling and machine translation, due to their ability to retrieve fine-grained information from a datastore of examples.
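As a rough illustration of the semi-parametric idea described above, the sketch below interpolates a base model's next-token distribution with one induced by retrieved nearest neighbours, in the spirit of kNN-augmented language modeling; the function name, the distance-to-weight choice, and the interpolation weight are assumptions, not the paper's exact formulation.

    import numpy as np

    def knn_interpolate(p_model, neighbour_tokens, neighbour_dists, lam=0.25):
        # Turn retrieval distances into weights (closer neighbours count more),
        # aggregate them per vocabulary item, and mix with the base distribution.
        weights = np.exp(-np.asarray(neighbour_dists, dtype=float))
        weights /= weights.sum()
        p_knn = np.zeros_like(p_model)
        np.add.at(p_knn, np.asarray(neighbour_tokens), weights)
        return lam * p_knn + (1.0 - lam) * p_model

    p_model = np.full(5, 0.2)                      # uniform base distribution over 5 tokens
    print(knn_interpolate(p_model, [2, 2, 4], [0.1, 0.3, 0.9]))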
1 code implementation • NAACL 2022 • Patrick Fernandes, António Farinhas, Ricardo Rei, José G. C. de Souza, Perez Ogayo, Graham Neubig, André F. T. Martins
Despite the progress in machine translation quality estimation and evaluation in recent years, decoding in neural machine translation (NMT) is mostly oblivious to this progress and centers on finding the most probable translation according to the model (MAP decoding), approximated with beam search.
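To make the contrast with MAP decoding concrete, here is a minimal sketch of quality-aware reranking via minimum Bayes risk over sampled candidates; `utility` is a placeholder for any pairwise quality or similarity metric and is an assumption, not this paper's specific setup.

    def mbr_decode(candidates, utility):
        # Expected utility of each hypothesis, using the other samples as
        # pseudo-references; return the hypothesis with the highest expectation.
        best, best_score = None, float("-inf")
        for hyp in candidates:
            score = sum(utility(hyp, ref) for ref in candidates) / len(candidates)
            if score > best_score:
                best, best_score = hyp, score
        return best

    def overlap(hyp, ref):
        # Toy stand-in utility: fraction of reference word types covered by the hypothesis.
        ref_words = set(ref.split())
        return len(set(hyp.split()) & ref_words) / max(len(ref_words), 1)

    samples = ["the cat sat", "a cat sat down", "the cat sat down"]
    print(mbr_decode(samples, overlap))            # picks the consensus-like candidate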
1 code implementation • SpaNLP (ACL) 2022 • Pedro Henrique Martins, Zita Marinho, André F. T. Martins
On the other hand, semi-parametric models have been shown to successfully perform domain adaptation by retrieving examples from an in-domain datastore (Khandelwal et al., 2021).
1 code implementation • 22 Apr 2022 • Patrick Fernandes, Marcos Treviso, Danish Pruthi, André F. T. Martins, Graham Neubig
In this work, leveraging meta-learning techniques, we extend this idea to improve the quality of the explanations themselves, specifically by optimizing explanations such that student models more effectively learn to simulate the original model.
1 code implementation • 13 Apr 2022 • Chrysoula Zerva, Taisiya Glushkova, Ricardo Rei, André F. T. Martins
Trainable evaluation metrics for machine translation (MT) exhibit strong correlation with human judgements, but they are often hard to interpret and might produce unreliable scores under noisy or out-of-domain data.
1 code implementation • 4 Mar 2022 • Gonçalo R. A. Faria, André F. T. Martins, Mário A. T. Figueiredo
Recent work has shown promising results in causal discovery by leveraging interventional data with gradient-based methods, even when the intervened variables are unknown.
1 code implementation • 8 Feb 2022 • Tsvetomila Mihaylova, Vlad Niculae, André F. T. Martins
In this paper, we combine the representational strengths of factor graphs and of neural networks, proposing undirected neural networks (UNNs): a flexible framework for specifying computations that can be performed in any order.
no code implementations • spnlp (ACL) 2022 • Marcos Treviso, António Góis, Patrick Fernandes, Erick Fonseca, André F. T. Martins
Transformers' quadratic complexity with respect to the input sequence length has motivated a body of work on efficient sparse approximations to softmax.
no code implementations • 15 Sep 2021 • Patrick Fernandes, Kayo Yin, Emmy Liu, André F. T. Martins, Graham Neubig
Although proper handling of discourse significantly contributes to the quality of machine translation (MT), these improvements are not adequately measured in common translation quality metrics.
2 code implementations • Findings (EMNLP) 2021 • Taisiya Glushkova, Chrysoula Zerva, Ricardo Rei, André F. T. Martins
Several neural-based metrics have been recently proposed to evaluate machine translation quality.
2 code implementations • EMNLP 2021 • Nuno Miguel Guerreiro, André F. T. Martins
Selective rationalization aims to produce decisions along with rationales (e.g., text highlights or word alignments between two sentences).
1 code implementation • 1 Sep 2021 • Pedro Henrique Martins, Zita Marinho, André F. T. Martins
Transformers are unable to model long-term memories effectively, since the amount of computation they need to perform grows with the context length.
Ranked #1 on Dialogue Generation on CMU-DoG
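A tiny, generic illustration (not this paper's method) of why the computation mentioned above grows with context length: vanilla self-attention materializes an n-by-n score matrix, so cost scales quadratically in the number of tokens n.

    import numpy as np

    def self_attention(X):
        d = X.shape[-1]
        scores = X @ X.T / np.sqrt(d)                     # (n, n) matrix: the quadratic bottleneck
        scores -= scores.max(axis=-1, keepdims=True)      # numerical stability
        weights = np.exp(scores)
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ X

    n, d = 1024, 64
    X = np.random.randn(n, d)
    out = self_attention(X)       # the score matrix alone has n*n ~ 1e6 entries for n = 1024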
1 code implementation • ICLR 2022 • António Farinhas, Wilker Aziz, Vlad Niculae, André F. T. Martins
Neural networks and other machine learning models compute continuous representations, while humans communicate mostly through discrete symbols.
1 code implementation • 4 Aug 2021 • André F. T. Martins, Marcos Treviso, António Farinhas, Pedro M. Q. Aguiar, Mário A. T. Figueiredo, Mathieu Blondel, Vlad Niculae
In contrast, for finite domains, recent work on sparse alternatives to softmax (e.g., sparsemax, $\alpha$-entmax, and fusedmax) has led to distributions with varying support.
1 code implementation • ACL 2021 • Kayo Yin, Patrick Fernandes, Danish Pruthi, Aditi Chaudhary, André F. T. Martins, Graham Neubig
Are models paying large amounts of attention to the same context?
1 code implementation • ACL 2021 • Patrick Fernandes, Kayo Yin, Graham Neubig, André F. T. Martins
Recent work in neural machine translation has demonstrated both the necessity and feasibility of using inter-sentential context, i.e., context from sentences other than those currently being translated.
no code implementations • 7 Apr 2021 • António Farinhas, André F. T. Martins, Pedro M. Q. Aguiar
Visual attention mechanisms are a key component of neural network models for computer vision.
no code implementations • 1 Apr 2021 • André F. T. Martins
Neural networks and other machine learning models compute continuous representations, while humans communicate with discrete symbols.
1 code implementation • NAACL 2021 • Ben Peters, André F. T. Martins
Current sequence-to-sequence models are trained to minimize cross-entropy and use softmax to compute the locally normalized probabilities over target sequences.
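For concreteness, the standard objective the snippet refers to can be sketched as follows: a softmax over the vocabulary at each time step and the summed negative log-likelihood of the gold target tokens (a generic sketch, not this paper's proposed alternative).

    import numpy as np

    def seq_cross_entropy(logits, targets):
        # logits: (T, V) per-step scores over the vocabulary; targets: (T,) gold token ids.
        logits = logits - logits.max(axis=-1, keepdims=True)                     # stability
        log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))  # log-softmax
        return -log_probs[np.arange(len(targets)), targets].sum()                # summed NLL

    logits = np.random.default_rng(0).normal(size=(5, 10))    # 5 steps, vocabulary of 10
    targets = np.array([1, 4, 2, 0, 9])
    print(seq_cross_entropy(logits, targets))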
1 code implementation • LREC 2022 • Marina Fomicheva, Shuo Sun, Erick Fonseca, Chrysoula Zerva, Frédéric Blain, Vishrav Chaudhary, Francisco Guzmán, Nina Lopatina, Lucia Specia, André F. T. Martins
We present MLQE-PE, a new dataset for Machine Translation (MT) Quality Estimation (QE) and Automatic Post-Editing (APE).
1 code implementation • EMNLP 2020 • Tsvetomila Mihaylova, Vlad Niculae, André F. T. Martins
Latent structure models are a powerful tool for modeling language data: they can mitigate the error propagation and annotation bottleneck in pipeline systems, while simultaneously uncovering linguistic insights about the data.
1 code implementation • NeurIPS 2020 • Gonçalo M. Correia, Vlad Niculae, Wilker Aziz, André F. T. Martins
In this paper, we propose a new training strategy which replaces these estimators by an exact yet efficient marginalization.
2 code implementations • NeurIPS 2020 • André F. T. Martins, António Farinhas, Marcos Treviso, Vlad Niculae, Pedro M. Q. Aguiar, Mário A. T. Figueiredo
Exponential families are widely used in machine learning; they include many distributions in continuous and discrete domains (e.g., Gaussian, Dirichlet, Poisson, and categorical distributions via the softmax transformation).
Ranked #36 on Visual Question Answering (VQA) on VQA v2 test-std
1 code implementation • EMNLP (BlackboxNLP) 2020 • Marcos V. Treviso, André F. T. Martins
Explainability is a topic of growing importance in NLP.
1 code implementation • EMNLP 2020 • Pedro Henrique Martins, Zita Marinho, André F. T. Martins
Current state-of-the-art text generators build on powerful language models such as GPT-2, achieving impressive performance.
1 code implementation • ICML 2020 • Vlad Niculae, André F. T. Martins
Structured prediction requires manipulating a large number of combinatorial structures, e.g., dependency trees or alignments, either as latent or output variables.
3 code implementations • IJCNLP 2019 • Gonçalo M. Correia, Vlad Niculae, André F. T. Martins
Our quantitative and qualitative analysis finds that heads in different layers learn different sparsity preferences and tend to be more diverse in their attention distributions than those of softmax Transformers.
Ranked #1 on Machine Translation on IWSLT2017 German-English
no code implementations • WS 2019 • Fabio Kepler, Jonay Trénous, Marcos Treviso, Miguel Vera, António Góis, M. Amin Farajian, António V. Lopes, André F. T. Martins
We present the contribution of the Unbabel team to the WMT 2019 Shared Task on Quality Estimation.
no code implementations • 24 Jul 2019 • André F. T. Martins, Vlad Niculae
These notes aim to shed light on the recently proposed structured projected intermediate gradient optimization technique (SPIGOT, Peng et al., 2018).
1 code implementation • 24 Jul 2019 • António Góis, André F. T. Martins
The combination of machines and humans for translation is effective, with many studies showing productivity gains when humans post-edit machine-translated output instead of translating from scratch.
no code implementations • ACL 2019 • Pedro Henrique Martins, Zita Marinho, André F. T. Martins
Named entity recognition (NER) and entity linking (EL) are two fundamentally related tasks, since in order to perform EL, first the mentions to entities have to be detected.
Ranked #12 on Entity Linking on AIDA-CoNLL
3 code implementations • ACL 2019 • Tsvetomila Mihaylova, André F. T. Martins
In the Transformer model, unlike in RNNs, the generation of a new word attends to the full sentence generated so far rather than only to the last word, so it is not straightforward to apply the scheduled sampling technique.
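As background for the snippet above, the core mixing step of scheduled sampling can be sketched as below; this shows only the token-mixing idea, with a hypothetical helper, and not the paper's full two-pass Transformer procedure.

    import numpy as np

    def mix_inputs(gold_tokens, model_predictions, p, rng=None):
        # With probability p, each decoder input token is replaced by the model's
        # own prediction from a first pass; otherwise the gold token is kept.
        rng = rng or np.random.default_rng(0)
        gold = np.asarray(gold_tokens)
        pred = np.asarray(model_predictions)
        use_model = rng.random(gold.shape) < p
        return np.where(use_model, pred, gold)

    print(mix_inputs([12, 7, 33, 5], [12, 9, 33, 8], p=0.25))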
1 code implementation • 14 Jun 2019 • Gonçalo M. Correia, André F. T. Martins
Automatic post-editing (APE) seeks to automatically refine the output of a black-box machine translation (MT) system through human post-edits.
no code implementations • WS 2019 • António V. Lopes, M. Amin Farajian, Gonçalo M. Correia, Jonay Trenous, André F. T. Martins
Analogously to dual-encoder architectures, we develop a BERT-based encoder-decoder (BED) model in which a single pretrained BERT encoder receives both the source (src) and machine translation (tgt) strings.
1 code implementation • ACL 2019 • Ben Peters, Vlad Niculae, André F. T. Martins
Sequence-to-sequence models are a powerful workhorse of NLP.
1 code implementation • NAACL 2019 • Afonso Mendes, Shashi Narayan, Sebastião Miranda, Zita Marinho, André F. T. Martins, Shay B. Cohen
We present a new neural model for text summarization that first extracts sentences from a document and then compresses them.
1 code implementation • NAACL 2019 • Sameen Maruf, André F. T. Martins, Gholamreza Haffari
Despite the progress made in sentence-level NMT, current systems still fall short at achieving fluent, good quality translation for a full document.
1 code implementation • ACL 2019 • Fábio Kepler, Jonay Trénous, Marcos Treviso, Miguel Vera, André F. T. Martins
We introduce OpenKiwi, a PyTorch-based open source framework for translation quality estimation.
3 code implementations • 8 Jan 2019 • Mathieu Blondel, André F. T. Martins, Vlad Niculae
Over the past decades, numerous loss functions have been proposed for a variety of supervised learning tasks, including regression, classification, ranking, and, more generally, structured prediction.
1 code implementation • EMNLP 2018 • Vlad Niculae, André F. T. Martins, Claire Cardie
Deep NLP models benefit from underlying structures in the data, e.g., parse trees, typically extracted using off-the-shelf parsers.
1 code implementation • WS 2018 • Sameen Maruf, André F. T. Martins, Gholamreza Haffari
In this work, we propose the task of translating Bilingual Multi-Speaker Conversations, and explore neural architectures which exploit both source and target-side conversation histories for this task.
2 code implementations • 24 May 2018 • Mathieu Blondel, André F. T. Martins, Vlad Niculae
This paper studies Fenchel-Young losses, a generic way to construct convex loss functions from a regularization function.
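For concreteness, the construction referred to above can be stated as follows (the standard definition of this family; the special cases noted afterwards are the commonly cited ones):

    \mathcal{L}_{\Omega}(\theta; y) \;=\; \Omega^{*}(\theta) + \Omega(y) - \langle \theta, y \rangle,
    \qquad \text{where } \Omega^{*}(\theta) \;=\; \sup_{\mu \in \operatorname{dom} \Omega} \langle \theta, \mu \rangle - \Omega(\mu).

Taking Omega to be the negative Shannon entropy restricted to the probability simplex recovers the logistic (cross-entropy) loss, while taking Omega to be half the squared Euclidean norm on the simplex yields the sparsemax loss.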
1 code implementation • ACL 2018 • Chaitanya Malaviya, Pedro Ferreira, André F. T. Martins
In NMT, words are sometimes dropped from the source or generated repeatedly in the translation.
3 code implementations • ACL 2018 • Marcin Junczys-Dowmunt, Roman Grundkiewicz, Tomasz Dwojak, Hieu Hoang, Kenneth Heafield, Tom Neckermann, Frank Seide, Ulrich Germann, Alham Fikri Aji, Nikolay Bogoychev, André F. T. Martins, Alexandra Birch
We present Marian, an efficient and self-contained Neural Machine Translation framework with an integrated automatic differentiation engine based on dynamic computation graphs.
3 code implementations • ICML 2018 • Vlad Niculae, André F. T. Martins, Mathieu Blondel, Claire Cardie
Structured prediction requires searching over a combinatorial number of structures.
8 code implementations • 5 Feb 2016 • André F. T. Martins, Ramón Fernandez Astudillo
We propose sparsemax, a new activation function similar to the traditional softmax, but able to output sparse probabilities.
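Because sparsemax is easy to state exactly, here is a minimal NumPy sketch of the projection it computes (a generic implementation of the published definition, not the authors' released code).

    import numpy as np

    def sparsemax(z):
        # Euclidean projection of the score vector z onto the probability simplex.
        # Unlike softmax, the result can contain exact zeros.
        z = np.asarray(z, dtype=float)
        z_sorted = np.sort(z)[::-1]              # scores in decreasing order
        k = np.arange(1, len(z) + 1)
        cumsum = np.cumsum(z_sorted)
        support = 1 + k * z_sorted > cumsum      # which sorted entries stay in the support
        k_z = k[support][-1]                     # support size
        tau = (cumsum[k_z - 1] - 1) / k_z        # threshold subtracted from every score
        return np.maximum(z - tau, 0.0)

    print(sparsemax([0.5, 1.0, 0.1]))            # [0.25 0.75 0.  ]  (softmax would assign no zeros)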
no code implementations • IJCNLP 2015 • Daniel Fernández-González, André F. T. Martins
We reduce phrase-representation parsing to dependency parsing.