1 code implementation • 17 Dec 2024 • Guillaume Couairon, Renu Singh, Anastase Charantonis, Christian Lessig, Claire Monteleoni
We then design a probabilistic weather model called ArchesWeatherGen based on flow matching, a modern variant of diffusion models, that is trained to project ArchesWeather's predictions to the distribution of ERA5 weather states.
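The flow-matching training objective referenced above can be sketched as follows. This is a minimal NumPy illustration of the generic rectified-flow formulation (interpolate between noise and data, regress the velocity), not ArchesWeatherGen's actual training code; all variable names and the toy shapes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def flow_matching_pair(x0, x1, t):
    """Linear (rectified-flow) interpolation between noise x0 and data x1.

    Returns the point x_t on the probability path and the target
    velocity (x1 - x0) that a learned vector field should regress.
    """
    x_t = (1.0 - t) * x0 + t * x1
    v_target = x1 - x0
    return x_t, v_target

# Toy tensors standing in for gridded weather fields.
x0 = rng.standard_normal((4, 8, 8))   # Gaussian noise sample
x1 = rng.standard_normal((4, 8, 8))   # data sample
t = 0.3

x_t, v_target = flow_matching_pair(x0, x1, t)

# A model v_theta(x_t, t) would minimize the loss below; here we
# evaluate it for a dummy zero predictor to show the objective.
v_pred = np.zeros_like(x_t)
loss = np.mean((v_pred - v_target) ** 2)
```

At sampling time, integrating the learned velocity field from t=0 to t=1 transports noise samples to the data distribution.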
1 code implementation • 11 Dec 2024 • LCM team, Loïc Barrault, Paul-Ambroise Duquenne, Maha Elbayad, Artyom Kozhevnikov, Belen Alastruey, Pierre Andrews, Mariano Coria, Guillaume Couairon, Marta R. Costa-jussà, David Dale, Hady Elsahar, Kevin Heffernan, João Maria Janeiro, Tuan Tran, Christophe Ropers, Eduardo Sánchez, Robin San Roman, Alexandre Mourachko, Safiyyah Saleem, Holger Schwenk
In this paper, we present an architecture that operates on an explicit higher-level semantic representation, which we name a concept.
1 code implementation • 23 May 2024 • Guillaume Couairon, Christian Lessig, Anastase Charantonis, Claire Monteleoni
In this paper, we show that the 3D local processing in Pangu-Weather is computationally sub-optimal.
no code implementations • 29 Mar 2024 • Barbara Toniella Corradini, Mustafa Shukor, Paul Couairon, Guillaume Couairon, Franco Scarselli, Matthieu Cord
The pipeline is as follows: the image is passed to both a captioner model (i.e., BLIP) and a diffusion model (i.e., Stable Diffusion) to generate a text description and a visual representation, respectively.
no code implementations • 17 Oct 2023 • Pierre Fernandez, Guillaume Couairon, Teddy Furon, Matthijs Douze
The rapid growth of transformer-based models raises concerns about their integrity and proof of ownership.
no code implementations • 18 Sep 2023 • Asya Grechka, Guillaume Couairon, Matthieu Cord
For the specific task of image inpainting, the current guiding mechanism relies on copying and pasting the known regions from the input image at each denoising step.
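The copy-and-paste guidance described above can be sketched as follows. This is a minimal NumPy illustration of the generic mechanism (denoise everywhere, then overwrite the known region with the input re-noised to the current level), with a stand-in denoiser; all names, the noise schedule, and the toy mask are hypothetical, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

def noise_to_level(x, t, noise):
    """Forward-diffuse a clean image x to noise level t (0 = clean)."""
    return np.sqrt(1.0 - t) * x + np.sqrt(t) * noise

def inpaint_step(x_t, x_known, mask, t, denoise_fn, noise):
    """One guided denoising step: apply the denoiser, then copy-paste
    the known (mask == 1) region from the input image, re-noised to
    the current level so statistics match the rest of the sample."""
    x_t = denoise_fn(x_t)                           # plain denoising update
    x_known_t = noise_to_level(x_known, t, noise)   # re-noised known pixels
    return mask * x_known_t + (1.0 - mask) * x_t

# Toy setup: the left half of the image is known, the right half missing.
x_known = rng.standard_normal((8, 8))
mask = np.zeros((8, 8)); mask[:, :4] = 1.0
x_t = rng.standard_normal((8, 8))
noise = rng.standard_normal((8, 8))
t = 0.5

# Stand-in denoiser that just shrinks the sample toward zero.
x_next = inpaint_step(x_t, x_known, mask, t, lambda x: 0.9 * x, noise)
```

Repeating this step over the full noise schedule keeps the known regions pinned to the input while the missing regions are synthesized.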
no code implementations • ICCV 2023 • Guillaume Couairon, Marlène Careil, Matthieu Cord, Stéphane Lathuilière, Jakob Verbeek
Large-scale text-to-image diffusion models have significantly improved the state of the art in generative image modelling and allow for an intuitive and powerful user interface to drive the image generation process.
1 code implementation • NeurIPS 2023 • Alexandre Ramé, Guillaume Couairon, Mustafa Shukor, Corentin Dancette, Jean-Baptiste Gaya, Laure Soulier, Matthieu Cord
Foundation models are first pre-trained on vast unsupervised datasets and then fine-tuned on labeled data.
1 code implementation • 14 Apr 2023 • Jamie Tolan, Hung-I Yang, Ben Nosarzewski, Guillaume Couairon, Huy Vo, John Brandt, Justine Spore, Sayantan Majumdar, Daniel Haziza, Janaki Vamaraju, Theo Moutakanni, Piotr Bojanowski, Tracy Johns, Brian White, Tobias Tiecke, Camille Couprie
The maps are generated by the extraction of features from a self-supervised model trained on Maxar imagery from 2017 to 2020, and the training of a dense prediction decoder against aerial lidar maps.
2 code implementations • ICCV 2023 • Pierre Fernandez, Guillaume Couairon, Hervé Jégou, Matthijs Douze, Teddy Furon
For instance, it detects the origin of an image generated from a text prompt, then cropped to keep 10% of the content, with over 90% accuracy at a false positive rate below 10^-6.
4 code implementations • 20 Oct 2022 • Guillaume Couairon, Jakob Verbeek, Holger Schwenk, Matthieu Cord
Semantic image editing is an extension of image generation, with the additional constraint that the generated image should be as similar as possible to a given input image.
1 code implementation • 29 Aug 2022 • Mustafa Shukor, Guillaume Couairon, Matthieu Cord
Vision and Language Pretraining has become the prevalent approach for tackling multimodal downstream tasks.
1 code implementation • 20 Apr 2022 • Mustafa Shukor, Guillaume Couairon, Asya Grechka, Matthieu Cord
We propose a new retrieval framework, T-Food (Transformer Decoders with MultiModal Regularization for Cross-Modal Food Retrieval) that exploits the interaction between modalities in a novel regularization scheme, while using only unimodal encoders at test time for efficient retrieval.
Ranked #3 on Cross-Modal Retrieval on Recipe1M
1 code implementation • CVPR 2022 • Guillaume Couairon, Asya Grechka, Jakob Verbeek, Holger Schwenk, Matthieu Cord
Via the latent space of an auto-encoder, we iteratively transform the input image toward the target point, ensuring coherence and quality with a variety of novel regularization terms.
4 code implementations • CVPR 2022 • Amanpreet Singh, Ronghang Hu, Vedanuj Goswami, Guillaume Couairon, Wojciech Galuba, Marcus Rohrbach, Douwe Kiela
State-of-the-art vision and vision-and-language models rely on large-scale visio-linguistic pretraining for obtaining good performance on a variety of downstream tasks.
Ranked #4 on Image Retrieval on MS COCO
no code implementations • 6 Dec 2021 • Guillaume Couairon, Matthieu Cord, Matthijs Douze, Holger Schwenk
We introduce the SIMAT dataset to evaluate the task of Image Retrieval with Multimodal queries.
1 code implementation • CVPR 2022 • Arthur Douillard, Alexandre Ramé, Guillaume Couairon, Matthieu Cord
Our strategy scales to a large number of tasks while incurring negligible memory and time overheads, thanks to strict control of parameter expansion.
Ranked #2 on Incremental Learning on ImageNet - 10 steps