no code implementations • 7 Feb 2024 • Shangmin Guo, Biao Zhang, Tianlin Liu, Tianqi Liu, Misha Khalman, Felipe Llinares, Alexandre Rame, Thomas Mesnard, Yao Zhao, Bilal Piot, Johan Ferret, Mathieu Blondel
Moreover, responses in these datasets are often sampled from a language model distinct from the one being aligned, and since the model evolves over training, the alignment phase is inevitably off-policy.
1 code implementation • 1 Oct 2023 • Mustafa Shukor, Alexandre Rame, Corentin Dancette, Matthieu Cord
Based on our ICL study, (3) we push ICL further and propose new multimodal ICL variants such as Multitask-ICL, Chain-of-Hindsight-ICL, and Self-Correcting-ICL.
1 code implementation • 30 Jul 2023 • Mustafa Shukor, Corentin Dancette, Alexandre Rame, Matthieu Cord
Our model is efficiently pretrained on many tasks, based on task balancing and multimodal curriculum learning.
2 code implementations • 7 Sep 2021 • Alexandre Rame, Corentin Dancette, Matthieu Cord
In this paper, we introduce a new regularization, named Fishr, that enforces domain invariance in the space of the gradients of the loss: specifically, the domain-level variances of gradients are matched across training domains (a minimal sketch of this idea follows this entry).
Ranked #30 on Domain Generalization on TerraIncognita
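The core intuition is that the distribution of per-sample loss gradients should look alike across training domains. Below is a minimal, illustrative sketch of this gradient-variance matching, assuming a linear classifier and restricting the penalty to its weights; the helper names (`per_sample_grads`, `fishr_penalty`) and the explicit per-sample loop are simplifications for clarity, not the official implementation (per-sample gradient libraries such as BackPACK can replace the loop).

```python
import torch
import torch.nn.functional as F


def per_sample_grads(classifier, features, labels):
    """Per-example gradients of the loss w.r.t. the linear classifier weights.

    Loop-based for readability only; efficient per-sample gradient tooling
    would be used in practice.
    """
    grads = []
    for x, y in zip(features, labels):
        loss = F.cross_entropy(classifier(x.unsqueeze(0)), y.unsqueeze(0))
        g = torch.autograd.grad(loss, classifier.weight, create_graph=True)[0]
        grads.append(g.flatten())
    return torch.stack(grads)                      # (n_samples, n_params)


def fishr_penalty(classifier, domain_batches):
    """Penalize mismatch between domain-level variances of gradients.

    `domain_batches` is a list of (features, labels) pairs, one per domain;
    the name and signature are illustrative, not the paper's code.
    """
    variances = []
    for features, labels in domain_batches:
        g = per_sample_grads(classifier, features, labels)
        variances.append(g.var(dim=0))             # per-parameter variance over the batch
    variances = torch.stack(variances)             # (n_domains, n_params)
    mean_var = variances.mean(dim=0, keepdim=True)
    return ((variances - mean_var) ** 2).sum(dim=1).mean()


# The penalty would be added to the usual empirical risk with a trade-off
# coefficient, e.g. loss = erm_loss + lam * fishr_penalty(classifier, batches).
```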
1 code implementation • ICCV 2021 • Alexandre Rame, Remy Sun, Matthieu Cord
Recent strategies achieved ensembling "for free" by fitting concurrently diverse subnetworks inside a single base network (see the sketch after this entry).
Ranked #15 on Image Classification on Tiny ImageNet Classification
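A rough sketch of the multi-input multi-output idea behind such "free" ensembles, under simple assumptions (two subnetworks, a shared backbone mapping images to `feat_dim` features); the class and argument names are hypothetical, not the paper's code.

```python
import torch.nn as nn


class TwoInTwoOutNet(nn.Module):
    """Two subnetworks sharing one base network (MIMO-style sketch)."""

    def __init__(self, backbone, feat_dim, num_classes):
        super().__init__()
        self.encoder1 = nn.Conv2d(3, 3, kernel_size=3, padding=1)
        self.encoder2 = nn.Conv2d(3, 3, kernel_size=3, padding=1)
        self.backbone = backbone                   # shared extractor, assumed to output (N, feat_dim)
        self.head1 = nn.Linear(feat_dim, num_classes)
        self.head2 = nn.Linear(feat_dim, num_classes)

    def forward(self, x1, x2, lam=0.5):
        # Mix the two encoded inputs so both subnetworks share the backbone.
        mixed = lam * self.encoder1(x1) + (1.0 - lam) * self.encoder2(x2)
        feats = self.backbone(mixed)
        return self.head1(feats), self.head2(feats)


# Training: cross-entropy on (out1, y1) and (out2, y2) with two images per sample.
# Inference: feed the same image twice and average the two heads' predictions,
# which yields an ensemble without training extra networks.
```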
no code implementations • ICLR 2021 • Alexandre Rame, Matthieu Cord
Deep ensembles perform better than a single network thanks to the diversity among their members.
no code implementations • 6 Oct 2020 • Alexandre Rame, Arthur Douillard, Charles Ollion
For this reason, in addition to a first color classifier, our newly proposed architecture includes a second regression stage for refinement.
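A hedged sketch of what such a two-stage head could look like: a classifier followed by a regression stage that refines the prediction into continuous color values, conditioned on the class posterior. All names and dimensions here are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn


class ClassifyThenRegress(nn.Module):
    """Two-stage head: a color classifier followed by a regression refinement."""

    def __init__(self, feat_dim, num_color_classes):
        super().__init__()
        self.classifier = nn.Linear(feat_dim, num_color_classes)
        # Second stage: regress refined color values from the image features
        # and the class posterior produced by the first stage.
        self.regressor = nn.Sequential(
            nn.Linear(feat_dim + num_color_classes, 128),
            nn.ReLU(),
            nn.Linear(128, 3),
            nn.Sigmoid(),                          # e.g. refined RGB values in [0, 1]
        )

    def forward(self, feats):
        logits = self.classifier(feats)
        probs = logits.softmax(dim=-1)
        refined_rgb = self.regressor(torch.cat([feats, probs], dim=-1))
        return logits, refined_rgb
```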
no code implementations • 6 Dec 2018 • Alexandre Rame, Emilien Garreau, Hedi Ben-Younes, Charles Ollion
As in self-training methods, the predictions of these initial detectors compensate for the missing annotations on the complementary datasets.
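A small sketch of this pseudo-labeling step, assuming a torchvision-style detector that returns dictionaries of boxes, labels, and scores; the helper `complete_annotations` and its confidence-threshold rule are hypothetical, not the paper's method.

```python
import torch


def complete_annotations(detector, image, gt_boxes, gt_labels,
                         missing_classes, score_threshold=0.8):
    """Augment a dataset's ground truth with confident detections for the
    classes that this dataset does not annotate."""
    detector.eval()
    with torch.no_grad():
        preds = detector([image])[0]               # torchvision-style output dict
    pseudo_boxes, pseudo_labels = [], []
    for box, label, score in zip(preds["boxes"], preds["labels"], preds["scores"]):
        if label.item() in missing_classes and score.item() >= score_threshold:
            pseudo_boxes.append(box)
            pseudo_labels.append(label)
    boxes = torch.cat([gt_boxes, torch.stack(pseudo_boxes)]) if pseudo_boxes else gt_boxes
    labels = torch.cat([gt_labels, torch.stack(pseudo_labels)]) if pseudo_labels else gt_labels
    return boxes, labels
```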