no code implementations • 21 Feb 2025 • Hugo Pitorro, Marcos Treviso
State space models (SSMs), such as Mamba, have emerged as an efficient alternative to transformers for long-context sequence modeling.
1 code implementation • 17 Feb 2025 • Nuno Gonçalves, Marcos Treviso, André F. T. Martins
The computational cost of softmax-based attention in transformers limits their applicability to long-context tasks.
1 code implementation • 7 Jul 2024 • Hugo Pitorro, Pavlo Vasylenko, Marcos Treviso, André F. T. Martins
Transformers are the current architecture of choice for NLP, but their attention layers do not scale well to long contexts.
no code implementations • 27 Jun 2024 • Marcos Treviso, Nuno M. Guerreiro, Sweta Agrawal, Ricardo Rei, José Pombal, Tania Vaz, Helena Wu, Beatriz Silva, Daan van Stigt, André F. T. Martins
While machine translation (MT) systems are achieving increasingly strong performance on benchmarks, they often produce translations with errors and anomalies.
1 code implementation • 21 Sep 2023 • Ricardo Rei, Nuno M. Guerreiro, José Pombal, Daan van Stigt, Marcos Treviso, Luisa Coheur, José G. C. de Souza, André F. T. Martins
Our team participated in all tasks: sentence- and word-level quality prediction (task 1) and fine-grained error span detection (task 2).
1 code implementation • 26 May 2023 • Marcos Treviso, Alexis Ross, Nuno M. Guerreiro, André F. T. Martins
Selective rationales and counterfactual examples have emerged as two effective, complementary classes of interpretability methods for analyzing and training NLP models.
1 code implementation • 19 May 2023 • Ricardo Rei, Nuno M. Guerreiro, Marcos Treviso, Luisa Coheur, Alon Lavie, André F. T. Martins
Neural metrics for machine translation evaluation, such as COMET, exhibit significant improvements in their correlation with human judgments, as compared to traditional metrics based on lexical overlap, such as BLEU.
1 code implementation • 13 Sep 2022 • Ricardo Rei, Marcos Treviso, Nuno M. Guerreiro, Chrysoula Zerva, Ana C. Farinha, Christine Maroti, José G. C. de Souza, Taisiya Glushkova, Duarte M. Alves, Alon Lavie, Luisa Coheur, André F. T. Martins
We present the joint contribution of IST and Unbabel to the WMT 2022 Shared Task on Quality Estimation (QE).
no code implementations • 31 Aug 2022 • Marcos Treviso, Ji-Ung Lee, Tianchu Ji, Betty van Aken, Qingqing Cao, Manuel R. Ciosici, Michael Hassid, Kenneth Heafield, Sara Hooker, Colin Raffel, Pedro H. Martins, André F. T. Martins, Jessica Zosa Forde, Peter Milder, Edwin Simpson, Noam Slonim, Jesse Dodge, Emma Strubell, Niranjan Balasubramanian, Leon Derczynski, Iryna Gurevych, Roy Schwartz
Recent work in natural language processing (NLP) has yielded appealing results from scaling model parameters and training data; however, using only scale to improve performance means that resource consumption also grows.
1 code implementation • 22 Apr 2022 • Patrick Fernandes, Marcos Treviso, Danish Pruthi, André F. T. Martins, Graham Neubig
In this work, leveraging meta-learning techniques, we extend this idea to improve the quality of the explanations themselves, specifically by optimizing explanations such that student models more effectively learn to simulate the original model.
no code implementations • spnlp (ACL) 2022 • Marcos Treviso, António Góis, Patrick Fernandes, Erick Fonseca, André F. T. Martins
Transformers' quadratic complexity with respect to the input sequence length has motivated a body of work on efficient sparse approximations to softmax.
1 code implementation • 4 Aug 2021 • André F. T. Martins, Marcos Treviso, António Farinhas, Pedro M. Q. Aguiar, Mário A. T. Figueiredo, Mathieu Blondel, Vlad Niculae
In contrast, for finite domains, recent work on sparse alternatives to softmax (e.g., sparsemax, $\alpha$-entmax, and fusedmax) has led to distributions with varying support.
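To illustrate what "distributions with varying support" means, here is a minimal NumPy sketch of sparsemax (the Euclidean projection of the logits onto the probability simplex). This is an illustrative re-implementation, not the authors' code; their optimized versions live in the `entmax` library.

```python
import numpy as np

def sparsemax(z):
    """Sparsemax: projects logits z onto the probability simplex.
    Unlike softmax, it can assign exactly zero probability to
    low-scoring entries, so the support varies with the input."""
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]            # logits in descending order
    k = np.arange(1, len(z) + 1)
    cumsum = np.cumsum(z_sorted)
    support = 1 + k * z_sorted > cumsum    # entries kept in the support
    k_z = k[support][-1]                   # support size
    tau = (cumsum[k_z - 1] - 1) / k_z      # threshold subtracted from z
    return np.maximum(z - tau, 0.0)

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()
```

For example, `sparsemax([1.5, 1.0, -1.0])` returns `[0.75, 0.25, 0.0]`, zeroing out the weakest entry, whereas softmax assigns strictly positive probability to every entry.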
2 code implementations • NeurIPS 2020 • André F. T. Martins, António Farinhas, Marcos Treviso, Vlad Niculae, Pedro M. Q. Aguiar, Mário A. T. Figueiredo
Exponential families are widely used in machine learning; they include many distributions in continuous and discrete domains (e.g., Gaussian, Dirichlet, Poisson, and categorical distributions via the softmax transformation).
Ranked #35 on Visual Question Answering (VQA) on VQA v2 test-std
no code implementations • LREC 2020 • Edresson Casanova, Marcos Treviso, Lilian Hübner, Sandra Aluísio
Automatic analysis of connected speech by natural language processing techniques is a promising direction for diagnosing cognitive impairments.
no code implementations • WS 2019 • Fabio Kepler, Jonay Trénous, Marcos Treviso, Miguel Vera, António Góis, M. Amin Farajian, António V. Lopes, André F. T. Martins
We present the contribution of the Unbabel team to the WMT 2019 Shared Task on Quality Estimation.
1 code implementation • ACL 2019 • Fábio Kepler, Jonay Trénous, Marcos Treviso, Miguel Vera, André F. T. Martins
We introduce OpenKiwi, a PyTorch-based open source framework for translation quality estimation.
3 code implementations • WS 2017 • Nathan Hartmann, Erick Fonseca, Christopher Shulby, Marcos Treviso, Jessica Rodrigues, Sandra Aluisio
Word embeddings have been found to provide meaningful representations for words in an efficient way; therefore, they have become common in Natural Language Processing systems.