Search Results for author: Patrick Fernandes

Found 20 papers, 11 papers with code

CMU’s IWSLT 2022 Dialect Speech Translation System

no code implementations IWSLT (ACL) 2022 Brian Yan, Patrick Fernandes, Siddharth Dalmia, Jiatong Shi, Yifan Peng, Dan Berrebbi, Xinyi Wang, Graham Neubig, Shinji Watanabe

We use additional paired Modern Standard Arabic data (MSA) to directly improve the speech recognition (ASR) and machine translation (MT) components of our cascaded systems.

Knowledge Distillation Machine Translation +3

Is Context Helpful for Chat Translation Evaluation?

no code implementations 13 Mar 2024 Sweta Agrawal, Amin Farajian, Patrick Fernandes, Ricardo Rei, André F. T. Martins

Our findings show that augmenting neural learned metrics with contextual information helps improve correlation with human judgments in the reference-free scenario and when evaluating translations in out-of-English settings.

Language Modelling Large Language Model +2

Tower: An Open Multilingual Large Language Model for Translation-Related Tasks

1 code implementation 27 Feb 2024 Duarte M. Alves, José Pombal, Nuno M. Guerreiro, Pedro H. Martins, João Alves, Amin Farajian, Ben Peters, Ricardo Rei, Patrick Fernandes, Sweta Agrawal, Pierre Colombo, José G. C. de Souza, André F. T. Martins

While general-purpose large language models (LLMs) demonstrate proficiency on multiple tasks within the domain of translation, approaches based on open LLMs are competitive only when specializing on a single task.

Language Modelling Large Language Model +1

CroissantLLM: A Truly Bilingual French-English Language Model

1 code implementation 1 Feb 2024 Manuel Faysse, Patrick Fernandes, Nuno M. Guerreiro, António Loison, Duarte M. Alves, Caio Corro, Nicolas Boizard, João Alves, Ricardo Rei, Pedro H. Martins, Antoni Bigata Casademunt, François Yvon, André F. T. Martins, Gautier Viaud, Céline Hudelot, Pierre Colombo

We introduce CroissantLLM, a 1.3B language model pretrained on a set of 3T English and French tokens, to bring to the research and industrial community a high-performance, fully open-sourced bilingual model that runs swiftly on consumer-grade local hardware.

Language Modelling Large Language Model

Context-aware Neural Machine Translation for English-Japanese Business Scene Dialogues

1 code implementation 20 Nov 2023 Sumire Honda, Patrick Fernandes, Chrysoula Zerva

We make use of Conditional Cross-Mutual Information (CXMI) to explore how much of the context the model uses and generalise CXMI to study the impact of the extra-sentential context.

Machine Translation NMT +2

Aligning Neural Machine Translation Models: Human Feedback in Training and Inference

no code implementations 15 Nov 2023 Miguel Moura Ramos, Patrick Fernandes, António Farinhas, André F. T. Martins

A core ingredient in RLHF's success in aligning and improving large language models (LLMs) is its reward model, trained using human feedback on model outputs.

Language Modelling Machine Translation +1

Epsilon Sampling Rocks: Investigating Sampling Strategies for Minimum Bayes Risk Decoding for Machine Translation

1 code implementation 17 May 2023 Markus Freitag, Behrooz Ghorbani, Patrick Fernandes

Recent advances in machine translation (MT) have shown that Minimum Bayes Risk (MBR) decoding can be a powerful alternative to beam search decoding, especially when combined with neural-based utility functions.

Machine Translation

Scaling Laws for Multilingual Neural Machine Translation

no code implementations 19 Feb 2023 Patrick Fernandes, Behrooz Ghorbani, Xavier Garcia, Markus Freitag, Orhan Firat

Through a novel joint scaling law formulation, we compute the effective number of parameters allocated to each language pair and examine the role of language similarity in the scaling behavior of our models.

Machine Translation Translation

A Multi-dimensional Evaluation of Tokenizer-free Multilingual Pretrained Models

no code implementations 13 Oct 2022 Jimin Sun, Patrick Fernandes, Xinyi Wang, Graham Neubig

Recent work on tokenizer-free multilingual pretrained models shows promising results in improving cross-lingual transfer and reducing engineering overhead (Clark et al., 2022; Xue et al., 2022).

Cross-Lingual Transfer

Quality-Aware Decoding for Neural Machine Translation

1 code implementation NAACL 2022 Patrick Fernandes, António Farinhas, Ricardo Rei, José G. C. de Souza, Perez Ogayo, Graham Neubig, André F. T. Martins

Despite the progress in machine translation quality estimation and evaluation in the last years, decoding in neural machine translation (NMT) is mostly oblivious to this and centers around finding the most probable translation according to the model (MAP decoding), approximated with beam search.

Machine Translation NMT +1

Learning to Scaffold: Optimizing Model Explanations for Teaching

1 code implementation 22 Apr 2022 Patrick Fernandes, Marcos Treviso, Danish Pruthi, André F. T. Martins, Graham Neubig

In this work, leveraging meta-learning techniques, we extend this idea to improve the quality of the explanations themselves, specifically by optimizing explanations such that student models more effectively learn to simulate the original model.


Predicting Attention Sparsity in Transformers

no code implementations spnlp (ACL) 2022 Marcos Treviso, António Góis, Patrick Fernandes, Erick Fonseca, André F. T. Martins

Transformers' quadratic complexity with respect to the input sequence length has motivated a body of work on efficient sparse approximations to softmax.

Language Modelling Machine Translation +3

When Does Translation Require Context? A Data-driven, Multilingual Exploration

no code implementations 15 Sep 2021 Patrick Fernandes, Kayo Yin, Emmy Liu, André F. T. Martins, Graham Neubig

Although proper handling of discourse significantly contributes to the quality of machine translation (MT), these improvements are not adequately measured in common translation quality metrics.

Machine Translation Translation

Measuring and Increasing Context Usage in Context-Aware Machine Translation

1 code implementation ACL 2021 Patrick Fernandes, Kayo Yin, Graham Neubig, André F. T. Martins

Recent work in neural machine translation has demonstrated both the necessity and feasibility of using inter-sentential context -- context from sentences other than those currently being translated.

Document Level Machine Translation Machine Translation +1

Structured Neural Summarization

3 code implementations ICLR 2019 Patrick Fernandes, Miltiadis Allamanis, Marc Brockschmidt

Summarization of long sequences into a concise statement is a core problem in natural language processing, requiring non-trivial understanding of the input.

Source Code Summarization
