Search Results for author: Paul-Ambroise Duquenne

Found 12 papers, 6 papers with code

Multimodal and Multilingual Embeddings for Large-Scale Speech Mining

1 code implementation NeurIPS 2021 Paul-Ambroise Duquenne, Hongyu Gong, Holger Schwenk

Using a similarity metric in that multimodal embedding space, we perform mining of audio in German, French, Spanish and English from Librivox against billions of sentences from Common Crawl.

Speech-to-Speech Translation, Translation

SONAR: Sentence-Level Multimodal and Language-Agnostic Representations

1 code implementation 22 Aug 2023 Paul-Ambroise Duquenne, Holger Schwenk, Benoît Sagot

Our single text encoder, covering 200 languages, substantially outperforms existing sentence embeddings such as LASER3 and LaBSE on the xsim and xsim++ multilingual similarity search tasks.

Machine Translation, Sentence (+4 more)

Language Modeling Is Compression

1 code implementation 19 Sep 2023 Grégoire Delétang, Anian Ruoss, Paul-Ambroise Duquenne, Elliot Catt, Tim Genewein, Christopher Mattern, Jordi Grau-Moya, Li Kevin Wenliang, Matthew Aitchison, Laurent Orseau, Marcus Hutter, Joel Veness

We show that large language models are powerful general-purpose predictors and that the compression viewpoint provides novel insights into scaling laws, tokenization, and in-context learning.

In-Context Learning, Language Modelling

Modular Speech-to-Text Translation for Zero-Shot Cross-Modal Transfer

no code implementations 5 Oct 2023 Paul-Ambroise Duquenne, Holger Schwenk, Benoît Sagot

Recent research has shown that independently trained encoders and decoders, combined through a shared fixed-size representation, can achieve competitive performance in speech-to-text translation.

Speech-to-Text Translation, Translation
