Search Results for author: Julien Abadji

Found 2 papers, 0 papers with code

mOSCAR: A Large-scale Multilingual and Multimodal Document-level Corpus

no code implementations 13 Jun 2024 Matthieu Futeral, Armel Zebaze, Pedro Ortiz Suarez, Julien Abadji, Rémi Lacroix, Cordelia Schmid, Rachel Bawden, Benoît Sagot

We additionally train two types of multilingual model to prove the benefits of mOSCAR: (1) a model trained on a subset of mOSCAR and captioning data and (2) a model trained on captioning data only.

Few-Shot Learning, In-Context Learning

Towards a Cleaner Document-Oriented Multilingual Crawled Corpus

no code implementations LREC 2022 Julien Abadji, Pedro Ortiz Suarez, Laurent Romary, Benoît Sagot

The need for large raw corpora has dramatically increased in recent years with the introduction of transfer learning and semi-supervised learning methods to Natural Language Processing.

Transfer Learning
