no code implementations • 15 Nov 2023 • Miguel Moura Ramos, Patrick Fernandes, António Farinhas, André F. T. Martins
A core ingredient in RLHF's success in aligning and improving large language models (LLMs) is its reward model, trained using human feedback on model outputs.
1 code implementation • 17 Oct 2023 • António Farinhas, José G. C. de Souza, André F. T. Martins
Large language models (LLMs) are becoming a one-fits-many solution, but they sometimes hallucinate or produce unreliable output.
1 code implementation • 2 Oct 2023 • António Farinhas, Chrysoula Zerva, Dennis Ulmer, André F. T. Martins
Split conformal prediction has recently sparked great interest due to its ability to provide formally guaranteed uncertainty sets or intervals for predictions made by black-box neural models, ensuring a predefined probability of containing the actual ground truth.
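As a quick illustration of the split conformal recipe behind that guarantee, the sketch below builds prediction sets from a held-out calibration set. The toy data, the nonconformity score, and the coverage level alpha = 0.1 are assumptions for the example, not the setup used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cal, n_classes, alpha = 500, 5, 0.1

# Toy calibration data: a somewhat-accurate "model" via bumped logits.
labels = rng.integers(0, n_classes, size=n_cal)
logits = rng.normal(size=(n_cal, n_classes))
logits[np.arange(n_cal), labels] += 2.0
probs = np.exp(logits)
probs /= probs.sum(axis=1, keepdims=True)

# Nonconformity score: 1 minus the probability given to the true label.
scores = 1.0 - probs[np.arange(n_cal), labels]

# Split conformal threshold: the ceil((n+1)(1-alpha))-th smallest score.
k = int(np.ceil((n_cal + 1) * (1 - alpha)))
qhat = np.sort(scores)[k - 1]

# Prediction set for a new example drawn the same way: every label
# whose score falls below the threshold.
test_label = int(rng.integers(0, n_classes))
test_logits = rng.normal(size=n_classes)
test_logits[test_label] += 2.0
test_probs = np.exp(test_logits)
test_probs /= test_probs.sum()
prediction_set = np.flatnonzero(1.0 - test_probs <= qhat)
print(prediction_set, "true label:", test_label)
```

Under exchangeability of calibration and test data, such sets contain the true label with probability at least 1 - alpha, marginally over the draw of the data.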
no code implementations • 1 May 2023 • Patrick Fernandes, Aman Madaan, Emmy Liu, António Farinhas, Pedro Henrique Martins, Amanda Bertsch, José G. C. de Souza, Shuyan Zhou, Tongshuang Wu, Graham Neubig, André F. T. Martins
Many recent advances in natural language generation have been fueled by training large language models on internet-scale data.
1 code implementation • NAACL 2022 • Patrick Fernandes, António Farinhas, Ricardo Rei, José G. C. de Souza, Perez Ogayo, Graham Neubig, André F. T. Martins
Despite the progress in machine translation quality estimation and evaluation in recent years, decoding in neural machine translation (NMT) remains largely oblivious to it, centering on finding the most probable translation according to the model (MAP decoding), approximated with beam search.
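The alternative explored in this line of work is to bring quality signals into decoding itself, for instance by reranking an N-best list or by minimum Bayes risk (MBR) selection. Below is a minimal sketch of MBR reranking, with a toy token-overlap utility standing in for the learned quality metrics such methods actually use; the candidate list is invented for illustration.

```python
# Toy sketch of minimum Bayes risk (MBR) reranking over an N-best list:
# pick the candidate with the highest expected utility against the others,
# treating the list itself as samples from the model. The token-overlap
# utility below is a stand-in for a learned quality metric.
def utility(hyp: str, ref: str) -> float:
    h, r = set(hyp.split()), set(ref.split())
    return len(h & r) / max(len(h | r), 1)

def mbr_decode(candidates):
    def expected_utility(hyp):
        return sum(utility(hyp, other) for other in candidates) / len(candidates)
    return max(candidates, key=expected_utility)

nbest = [
    "the cat sat on the mat",
    "a cat sat on the mat",
    "the cat is sitting on a mat",
]
print(mbr_decode(nbest))  # the candidate most "agreed on" by the list
```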
1 code implementation • ICLR 2022 • António Farinhas, Wilker Aziz, Vlad Niculae, André F. T. Martins
Neural networks and other machine learning models compute continuous representations, while humans communicate mostly through discrete symbols.
1 code implementation • 4 Aug 2021 • André F. T. Martins, Marcos Treviso, António Farinhas, Pedro M. Q. Aguiar, Mário A. T. Figueiredo, Mathieu Blondel, Vlad Niculae
In contrast, for finite domains, recent work on sparse alternatives to softmax (e.g., sparsemax, $\alpha$-entmax, and fusedmax) has led to distributions with varying support.
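Of these sparse alternatives, sparsemax is the simplest to write down: it is the Euclidean projection of the score vector onto the probability simplex, which can assign exactly zero probability to low-scoring entries. A minimal NumPy sketch using the sort-based closed form (the input scores are made up for illustration):

```python
import numpy as np

def sparsemax(z: np.ndarray) -> np.ndarray:
    """Euclidean projection of scores z onto the probability simplex."""
    z_sorted = np.sort(z)[::-1]
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, len(z) + 1)
    support = 1 + k * z_sorted > cumsum      # which sorted entries stay positive
    tau = (cumsum[support][-1] - 1) / k[support][-1]
    return np.maximum(z - tau, 0.0)

z = np.array([2.0, 1.0, -1.0])
print(sparsemax(z))                          # [1. 0. 0.] -- sparse support
print(np.exp(z) / np.exp(z).sum())           # softmax: dense, everything > 0
```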
no code implementations • 7 Apr 2021 • António Farinhas, André F. T. Martins, Pedro M. Q. Aguiar
Visual attention mechanisms are a key component of neural network models for computer vision.
2 code implementations • NeurIPS 2020 • André F. T. Martins, António Farinhas, Marcos Treviso, Vlad Niculae, Pedro M. Q. Aguiar, Mário A. T. Figueiredo
Exponential families are widely used in machine learning; they include many distributions in continuous and discrete domains (e.g., Gaussian, Dirichlet, Poisson, and categorical distributions via the softmax transformation).
Ranked #36 on Visual Question Answering (VQA) on VQA v2 test-std
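The continuous counterpart of sparsemax studied in this work replaces the Gaussian (the softmax/exponential-family case) with densities of compact support, such as truncated parabolas. A rough 1-D numerical sketch, assuming a quadratic score function and solving for the normalizing threshold by bisection (the grid size and iteration count are arbitrary choices here, not the paper's closed-form solution):

```python
import numpy as np

# 1-D illustration of a sparse continuous distribution: given a score
# function f(t), take density p(t) = max(f(t) - tau, 0), with tau chosen
# so that p integrates to 1. For a quadratic f this yields a truncated
# parabola with compact support, unlike the Gaussian.
t = np.linspace(-5.0, 5.0, 10001)
dt = t[1] - t[0]
f = -0.5 * t**2                    # quadratic score (location 0, scale 1)

def mass(tau):
    return np.maximum(f - tau, 0.0).sum() * dt

lo, hi = f.min(), f.max()          # brackets: mass(lo) > 1 > mass(hi)
for _ in range(100):               # bisection for the threshold tau
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if mass(mid) > 1.0 else (lo, mid)

p = np.maximum(f - 0.5 * (lo + hi), 0.0)
print(p.sum() * dt)                # ~1.0: a valid density
print((p > 0).mean())              # < 1.0: compact support
```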