1 code implementation • 6 Jan 2025 • Théophane Vallaeys, Matthew Muckley, Jakob Verbeek, Matthijs Douze
QINCo recently addressed this inefficiency by using a neural network to determine the quantization codebook in RQ based on the vector reconstruction from previous steps.
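QINCo's contribution is the neural prediction of the codebook at each step; as background, plain residual quantization (RQ) with fixed codebooks and no neural adaptation can be sketched in NumPy. The function name `rq_encode` and the random codebooks are illustrative, not from the paper's code:

```python
import numpy as np

def rq_encode(x, codebooks):
    """Plain residual quantization: each step quantizes the residual
    left over by the previous steps using that step's codebook."""
    residual = x.astype(float).copy()
    codes, recon = [], np.zeros_like(residual)
    for C in codebooks:  # C has shape (K, d): K codewords of dim d
        idx = int(np.argmin(((residual[None, :] - C) ** 2).sum(axis=1)))
        codes.append(idx)
        recon += C[idx]
        residual -= C[idx]
    return codes, recon

rng = np.random.default_rng(0)
books = [rng.normal(size=(16, 4)) for _ in range(3)]  # 3 steps, 16 codewords
x = rng.normal(size=4)
codes, recon = rq_encode(x, books)
```

QINCo replaces the fixed codebook `C` at each step with one produced by a neural network conditioned on the reconstruction so far.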
1 code implementation • 13 Dec 2024 • Melissa Hall, Oscar Mañas, Reyhane Askari-Hemmat, Mark Ibrahim, Candace Ross, Pietro Astolfi, Tariq Berrada Ifriqi, Marton Havasi, Yohann Benchetrit, Karen Ullrich, Carolina Braga, Abhishek Charnalia, Maeve Ryan, Mike Rabbat, Michal Drozdzal, Jakob Verbeek, Adriana Romero-Soriano
To enable actionable evaluation insights, we introduce "Evaluation Exercises" that highlight takeaways for specific evaluation questions.
no code implementations • 6 Nov 2024 • Tariq Berrada, Pietro Astolfi, Melissa Hall, Marton Havasi, Yohann Benchetrit, Adriana Romero-Soriano, Karteek Alahari, Michal Drozdzal, Jakob Verbeek
LDMs learn the data distribution in the latent space of an autoencoder (AE) and produce images by mapping the generated latents into RGB image space using the AE decoder.
no code implementations • 5 Nov 2024 • Tariq Berrada Ifriqi, Pietro Astolfi, Melissa Hall, Reyhane Askari-Hemmat, Yohann Benchetrit, Marton Havasi, Matthew Muckley, Karteek Alahari, Adriana Romero-Soriano, Jakob Verbeek, Michal Drozdzal
In this work, we perform an in-depth study of LDM training recipes focusing on the performance of models and their training efficiency.
1 code implementation • 14 Jun 2024 • Pietro Astolfi, Marlene Careil, Melissa Hall, Oscar Mañas, Matthew Muckley, Jakob Verbeek, Adriana Romero-Soriano, Michal Drozdzal
Building world models that accurately and comprehensively represent the real world is the utmost aspiration for conditional image generative models as it would enable their use as world simulators.
no code implementations • 20 Mar 2024 • Théophane Vallaeys, Mustafa Shukor, Matthieu Cord, Jakob Verbeek
The abilities of large language models (LLMs) have recently progressed to unprecedented levels, paving the way to novel applications in a wide variety of areas.
no code implementations • 18 Mar 2024 • François Porcher, Camille Couprie, Marc Szafraniec, Jakob Verbeek
Despite the availability of large datasets for tasks like image classification and image-text alignment, labeled data for more complex recognition tasks, such as detection and segmentation, is less abundant.
1 code implementation • 26 Jan 2024 • Iris A. M. Huijben, Matthijs Douze, Matthew Muckley, Ruud J. G. van Sloun, Jakob Verbeek
For example, QINCo achieves better nearest-neighbor search accuracy using 12-byte codes than the state-of-the-art UNQ using 16 bytes on the BigANN1M and Deep1M datasets.
no code implementations • CVPR 2024 • Tariq Berrada, Jakob Verbeek, Camille Couprie, Karteek Alahari
Semantic image synthesis, i.e., generating images from user-provided semantic label maps, is an important conditional image generation task, as it allows control of both the content and the spatial layout of generated images.
1 code implementation • 16 Oct 2023 • Marlène Careil, Matthew J. Muckley, Jakob Verbeek, Stéphane Lathuilière
We find that our model leads to reconstructions with state-of-the-art visual quality as measured by FID and KID.
1 code implementation • 3 Aug 2023 • Tariq Berrada, Camille Couprie, Karteek Alahari, Jakob Verbeek
Although instance segmentation methods have improved considerably, the dominant paradigm is to rely on fully-annotated training images, which are tedious to obtain.
no code implementations • 17 Jul 2023 • Ekaterina Iakovleva, Karteek Alahari, Jakob Verbeek
Deep convolutional networks are ubiquitous in computer vision, due to their excellent performance across different tasks for various domains.
no code implementations • ICCV 2023 • Guillaume Couairon, Marlène Careil, Matthieu Cord, Stéphane Lathuilière, Jakob Verbeek
Large-scale text-to-image diffusion models have significantly improved the state of the art in generative image modelling and allow for an intuitive and powerful user interface to drive the image generation process.
1 code implementation • 15 May 2023 • Enrico Fini, Pietro Astolfi, Adriana Romero-Soriano, Jakob Verbeek, Michal Drozdzal
Indeed, we find that a simple CLIP baseline can also be improved substantially, up to a 25% relative improvement on downstream zero-shot tasks, by using well-known training techniques that are popular in other subfields.
no code implementations • 26 Apr 2023 • Arantxa Casanova, Marlène Careil, Adriana Romero-Soriano, Christopher J. Pal, Jakob Verbeek, Michal Drozdzal
Our experiments on the OI dataset show that M&Ms outperforms baselines in terms of fine-grained scene controllability while being very competitive in terms of image quality and sample diversity.
no code implementations • 10 Apr 2023 • João Maria Janeiro, Stanislav Frolov, Alaaeldin El-Nouby, Jakob Verbeek
For example, for segmentation, mIoU drops from 44.5 to 30.5 when compressing to 0.1 bpp using the best compression model we evaluated.
no code implementations • CVPR 2023 • Marlène Careil, Jakob Verbeek, Stéphane Lathuilière
The class affinity matrix is introduced as a first layer to the source model to make it compatible with the target label maps, and the source model is then further finetuned for the target domain.
no code implementations • 16 Mar 2023 • Pietro Astolfi, Arantxa Casanova, Jakob Verbeek, Pascal Vincent, Adriana Romero-Soriano, Michal Drozdzal
We showcase the benefits of DA_IC-GAN by plugging it out-of-the-box into the supervised training of ResNets and DeiT models on the ImageNet dataset, achieving accuracy boosts of up to 1-2 percentage points with the highest-capacity models.
no code implementations • 26 Jan 2023 • Matthew J. Muckley, Alaaeldin El-Nouby, Karen Ullrich, Hervé Jégou, Jakob Verbeek
Lossy image compression aims to represent images in as few bits as possible while maintaining fidelity to the original.
1 code implementation • CVPR 2023 • Hugo Touvron, Matthieu Cord, Maxime Oquab, Piotr Bojanowski, Jakob Verbeek, Hervé Jégou
Given a neural network to be trained, for each sample we implicitly instantiate two altered networks, "submodels", with stochastic depth: i.e., activating only a subset of the layers and skipping others.
no code implementations • 14 Dec 2022 • Alaaeldin El-Nouby, Matthew J. Muckley, Karen Ullrich, Ivan Laptev, Jakob Verbeek, Hervé Jégou
In this work, we attempt to bring these lines of research closer by revisiting vector quantization for image compression.
1 code implementation • 9 Dec 2022 • Hugo Touvron, Matthieu Cord, Maxime Oquab, Piotr Bojanowski, Jakob Verbeek, Hervé Jégou
We introduce submodel co-training, a regularization method related to co-training, self-distillation and stochastic depth.
Ranked #70 on Image Classification on ImageNet
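The stochastic-depth "submodel" idea above can be sketched in NumPy; `submodel_forward` is a hypothetical helper, and the toy layers stand in for the trained residual layers of the real network:

```python
import numpy as np

def submodel_forward(x, layers, keep_prob, rng):
    """Stochastic-depth forward pass: each residual layer is kept
    with probability keep_prob and skipped (identity) otherwise."""
    for f in layers:
        if rng.random() < keep_prob:
            x = x + f(x)  # residual layer is active
        # else: layer skipped, x passes through unchanged
    return x

rng = np.random.default_rng(0)
# toy residual layers standing in for transformer blocks
layers = [lambda x, w=w: np.tanh(w * x) for w in (0.5, 1.0, 1.5)]
x = np.ones(4)
# two independently sampled "submodels" of the same network
y1 = submodel_forward(x, layers, keep_prob=0.8, rng=rng)
y2 = submodel_forward(x, layers, keep_prob=0.8, rng=rng)
```

A co-training regularizer would then penalize disagreement between the two submodel outputs, e.g. `((y1 - y2) ** 2).mean()`.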
no code implementations • 25 Nov 2022 • Marlène Careil, Stéphane Lathuilière, Camille Couprie, Jakob Verbeek
To allow for more control, image synthesis can be conditioned on semantic segmentation maps that indicate to the generator where objects should be placed in the image.
4 code implementations • 20 Oct 2022 • Guillaume Couairon, Jakob Verbeek, Holger Schwenk, Matthieu Cord
Semantic image editing is an extension of image generation, with the additional constraint that the generated image should be as similar as possible to a given input image.
7 code implementations • 18 Mar 2022 • Hugo Touvron, Matthieu Cord, Alaaeldin El-Nouby, Jakob Verbeek, Hervé Jégou
(2) Fine-tuning the weights of the attention layers is sufficient to adapt vision transformers to a higher resolution and to other classification tasks.
Ranked #9 on Image Classification on CIFAR-10 (using extra training data)
1 code implementation • CVPR 2022 • Guillaume Couairon, Asya Grechka, Jakob Verbeek, Holger Schwenk, Matthieu Cord
Via the latent space of an auto-encoder, we iteratively transform the input image toward the target point, ensuring coherence and quality with a variety of novel regularization terms.
1 code implementation • NeurIPS 2021 • Arantxa Casanova, Marlène Careil, Jakob Verbeek, Michal Drozdzal, Adriana Romero-Soriano
Generative Adversarial Networks (GANs) can generate near photo realistic images in narrow domains such as human faces.
Ranked #1 on Conditional Image Generation on ImageNet 64x64
11 code implementations • NeurIPS 2021 • Alaaeldin El-Nouby, Hugo Touvron, Mathilde Caron, Piotr Bojanowski, Matthijs Douze, Armand Joulin, Ivan Laptev, Natalia Neverova, Gabriel Synnaeve, Jakob Verbeek, Hervé Jégou
We propose a "transposed" version of self-attention that operates across feature channels rather than tokens, where the interactions are based on the cross-covariance matrix between keys and queries.
Ranked #58 on Instance Segmentation on COCO minival
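The transposed attention above can be illustrated with a simplified single-head NumPy sketch (no learned projections or per-head temperature; the name `xca` is chosen here for illustration):

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def xca(q, k, v, tau=1.0):
    """Cross-covariance attention: a (d x d) attention map over feature
    channels replaces the usual (n x n) token-token map, so cost scales
    with feature dimension rather than with the number of tokens."""
    # L2-normalize queries and keys along the token axis
    qn = q / np.linalg.norm(q, axis=0, keepdims=True)
    kn = k / np.linalg.norm(k, axis=0, keepdims=True)
    attn = softmax((qn.T @ kn) / tau, axis=-1)  # (d, d) channel map
    return v @ attn.T                           # (n, d) output

n, d = 8, 4  # n tokens, d channels
rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(n, d)) for _ in range(3))
out = xca(q, k, v)
```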
17 code implementations • NeurIPS 2021 • Hugo Touvron, Piotr Bojanowski, Mathilde Caron, Matthieu Cord, Alaaeldin El-Nouby, Edouard Grave, Gautier Izacard, Armand Joulin, Gabriel Synnaeve, Jakob Verbeek, Hervé Jégou
We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification.
Ranked #1 on Image Classification on Certificate Verification
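A heavily simplified ResMLP-style block can be sketched in NumPy (the real model uses GELU and learnable affine normalizations, omitted here; ReLU stands in for brevity):

```python
import numpy as np

def resmlp_block(x, w_tok, w1, w2):
    """Simplified ResMLP block: a linear cross-patch (token-mixing)
    layer, then a two-layer per-patch MLP, each with a residual
    connection. x has shape (n patches, d channels)."""
    x = x + w_tok @ x                        # token mixing: (n, n) @ (n, d)
    return x + np.maximum(x @ w1, 0.0) @ w2  # channel MLP: d -> h -> d

n, d, h = 6, 4, 8  # patches, channels, hidden width
rng = np.random.default_rng(0)
x = rng.normal(size=(n, d))
out = resmlp_block(x,
                   0.1 * rng.normal(size=(n, n)),
                   rng.normal(size=(d, h)),
                   0.1 * rng.normal(size=(h, d)))
```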
no code implementations • 23 Mar 2021 • Nitika Verma, Adnane Boukhayma, Jakob Verbeek, Edmond Boyer
Convolutional networks have been extremely successful for regular data structures such as 2D images and 3D voxel grids.
1 code implementation • ICML 2020 • Ekaterina Iakovleva, Jakob Verbeek, Karteek Alahari
We propose a novel amortized variational inference scheme for an empirical Bayes meta-learning model, where model parameters are treated as latent variables.
1 code implementation • ECCV 2020 • Roman Klokov, Edmond Boyer, Jakob Verbeek
Generative models have proven effective at modeling 3D shapes and their statistical variations.
1 code implementation • COLING 2020 • Maha Elbayad, Michael Ustaszewski, Emmanuelle Esperança-Rodier, Francis Brunet Manquat, Jakob Verbeek, Laurent Besacier
In this work, we conduct an evaluation study comparing offline and online neural machine translation architectures.
1 code implementation • 18 May 2020 • Maha Elbayad, Laurent Besacier, Jakob Verbeek
We also show that the 2D-convolution architecture is competitive with Transformers for simultaneous translation of spoken language.
1 code implementation • 3 Mar 2020 • Adria Ruiz, Jakob Verbeek
We propose Hierarchical Neural Ensembles (HNE), a novel framework to embed an ensemble of multiple networks in a hierarchical tree structure, sharing intermediate layers.
no code implementations • 25 Sep 2019 • Maha Elbayad, Laurent Besacier, Jakob Verbeek
We investigate the sensitivity of such models to the value of k that is used during training and when deploying the model, and the effect of updating the hidden states in transformer models as new source tokens are read.
Automatic Speech Recognition (ASR) +3
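The wait-k policy referenced above (read k source tokens before emitting the first target, then alternate) can be sketched as a simple schedule; `wait_k_schedule` is a hypothetical helper, not the paper's code:

```python
def wait_k_schedule(k, src_len, tgt_len):
    """Wait-k simultaneous decoding policy: read k source tokens,
    then alternate one WRITE per READ; once the source is exhausted,
    write the remaining target tokens."""
    actions = []
    read = write = 0
    while write < tgt_len:
        if read < min(k + write, src_len):
            actions.append("READ")
            read += 1
        else:
            actions.append("WRITE")
            write += 1
    return actions

acts = wait_k_schedule(k=2, src_len=5, tgt_len=5)
# yields READ, READ, then alternating WRITE/READ until the source ends
```

The sensitivity studied in the paper corresponds to training with one value of k and decoding with another.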
no code implementations • CVPR 2020 • Xiaotian Li, Shuzhe Wang, Yi Zhao, Jakob Verbeek, Juho Kannala
In this work, we present a new hierarchical scene coordinate network to predict pixel scene coordinates in a coarse-to-fine manner from a single RGB image.
1 code implementation • 20 Aug 2019 • Roman Klokov, Jakob Verbeek, Edmond Boyer
We study end-to-end learning strategies for 3D shape inference from images, in particular from a single image.
no code implementations • ICCV 2019 • Adria Ruiz, Jakob Verbeek
Despite the outstanding performance of convolutional neural networks (CNNs) for many vision tasks, the required computational cost during inference is problematic when resources are limited.
no code implementations • ICLR Workshop LLD 2019 • Adrià Ruiz, Oriol Martinez, Xavier Binefa, Jakob Verbeek
Given a pool of unlabelled images, the goal is to learn a representation where a set of target factors are disentangled from others.
no code implementations • 24 Jan 2019 • Adria Ruiz, Oriol Martinez, Xavier Binefa, Jakob Verbeek
Given a pool of unlabeled images, the goal is to learn a representation where a set of target factors are disentangled from others.
no code implementations • NeurIPS 2019 • Thomas Lucas, Konstantin Shmelkov, Karteek Alahari, Cordelia Schmid, Jakob Verbeek
We show that our model significantly improves over existing hybrid models: offering GAN-like samples, IS and FID scores that are competitive with fully adversarial models, and improved likelihood scores.
no code implementations • 11 Oct 2018 • Mariia Vladimirova, Jakob Verbeek, Pablo Mesejo, Julyan Arbel
We investigate deep Bayesian neural networks with Gaussian weight priors and a class of ReLU-like nonlinearities.
no code implementations • 27 Sep 2018 • Thomas Lucas, Konstantin Shmelkov, Karteek Alahari, Cordelia Schmid, Jakob Verbeek
First, we propose a model that extends variational autoencoders by using deterministic invertible transformation layers to map samples from the decoder to the image space.
no code implementations • 15 Aug 2018 • Xiaotian Li, Juha Ylioinas, Jakob Verbeek, Juho Kannala
Image-based camera relocalization is an important problem in computer vision and robotics.
3 code implementations • CONLL 2018 • Maha Elbayad, Laurent Besacier, Jakob Verbeek
Current state-of-the-art machine translation systems are based on encoder-decoder architectures that first encode the input sequence, and then generate an output sequence based on the input encoding.
Ranked #2 on Machine Translation on IWSLT2015 German-English
no code implementations • ICML 2018 • Thomas Lucas, Corentin Tallec, Jakob Verbeek, Yann Ollivier
We propose to feed the discriminator with mixed batches of true and fake samples, and train it to predict the ratio of true samples in the batch.
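The mixed-batch construction described above can be sketched in NumPy; `mixed_batch` is an illustrative helper, and the discriminator (not shown) would be trained to regress the returned ratio target:

```python
import numpy as np

def mixed_batch(real, fake, ratio, rng):
    """Assemble a shuffled batch containing a `ratio` fraction of real
    samples (the rest fake); the discriminator target is the ratio
    itself, not per-sample real/fake labels."""
    b = len(real)
    n_real = int(round(ratio * b))
    batch = np.concatenate([real[:n_real], fake[: b - n_real]])
    rng.shuffle(batch)  # shuffle rows so real/fake are interleaved
    return batch, n_real / b

rng = np.random.default_rng(0)
real = rng.normal(loc=2.0, size=(8, 3))   # stand-in "true" samples
fake = rng.normal(loc=-2.0, size=(8, 3))  # stand-in generator samples
batch, target = mixed_batch(real, fake, ratio=0.25, rng=rng)
```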
1 code implementation • ACL 2018 • Maha Elbayad, Laurent Besacier, Jakob Verbeek
We extend this approach to token-level loss smoothing, and propose improvements to the sequence-level smoothing approach.
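As background for the token-level smoothing mentioned above, standard uniform label smoothing of a per-token loss can be sketched as follows (the paper proposes smoothing toward sequences similar to the target, which this simpler uniform variant does not capture):

```python
import numpy as np

def smoothed_nll(log_probs, target, eps):
    """Token-level label smoothing: mix the one-hot target with a
    uniform distribution over the vocabulary, with weight eps."""
    V = log_probs.shape[-1]
    smooth = np.full(V, eps / V)
    smooth[target] += 1.0 - eps
    return -(smooth * log_probs).sum()

logits = np.array([2.0, 0.5, -1.0])           # toy 3-word vocabulary
log_probs = logits - np.log(np.exp(logits).sum())
loss = smoothed_nll(log_probs, target=0, eps=0.1)
```

With `eps = 0` this reduces to the usual negative log-likelihood of the target token.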
1 code implementation • ECCV 2018 • Pauline Luc, Camille Couprie, Yann Lecun, Jakob Verbeek
We apply the "detection head" of Mask R-CNN on the predicted features to produce the instance segmentation of future frames.
no code implementations • ICLR 2018 • Thomas Lucas, Jakob Verbeek
Our contribution is a training procedure relying on an auxiliary loss function that controls which information is captured by the latent variables and what is left to the autoregressive decoder.
1 code implementation • CVPR 2018 • Nitika Verma, Edmond Boyer, Jakob Verbeek
Convolutional neural networks (CNNs) have massively impacted visual recognition in 2D images, and are now ubiquitous in state-of-the-art approaches.
2 code implementations • ICCV 2017 • Pauline Luc, Natalia Neverova, Camille Couprie, Jakob Verbeek, Yann Lecun
The ability to predict and therefore to anticipate the future is an important attribute of intelligence.
no code implementations • ICCV 2017 • Marco Pedersoli, Thomas Lucas, Cordelia Schmid, Jakob Verbeek
We propose "Areas of Attention", a novel attention-based model for automatic image captioning.
1 code implementation • 25 Nov 2016 • Pauline Luc, Camille Couprie, Soumith Chintala, Jakob Verbeek
Adversarial training has been shown to produce state-of-the-art results for generative image modeling.
no code implementations • NeurIPS 2016 • Shreyas Saxena, Jakob Verbeek
Despite the success of CNNs, selecting the optimal architecture for a given task remains an open problem.
no code implementations • 21 Mar 2016 • Guosheng Hu, Xiaojiang Peng, Yongxin Yang, Timothy Hospedales, Jakob Verbeek
To train such networks, very large training sets are needed with millions of labeled images.
no code implementations • 3 Oct 2015 • Ramazan Gokberk Cinbis, Jakob Verbeek, Cordelia Schmid
It has been experimentally observed that the performance of BoW and FV representations can be improved by employing discounting transformations such as power normalization.
1 code implementation • 8 Jun 2015 • Matthijs Douze, Jérôme Revaud, Jakob Verbeek, Hervé Jégou, Cordelia Schmid
We address the problem of specific video event retrieval.
no code implementations • 21 Apr 2015 • Heng Wang, Dan Oneata, Jakob Verbeek, Cordelia Schmid
We also use the homography to cancel out camera motion from the optical flow.
no code implementations • 3 Mar 2015 • Ramazan Gokberk Cinbis, Jakob Verbeek, Cordelia Schmid
In this case, the supervised information is restricted to binary labels that indicate the absence/presence of object instances in the image, without their locations.
no code implementations • CVPR 2014 • Dan Oneata, Jakob Verbeek, Cordelia Schmid
Transformation of the FV by power and L2 normalizations has been shown to significantly improve its performance, and led to state-of-the-art results for a range of image and video classification and retrieval tasks.
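The power and L2 normalizations applied to Fisher vectors can be sketched in a few lines of NumPy (a signed power transform with exponent alpha, followed by L2 normalization; `power_l2_normalize` is a name chosen here for illustration):

```python
import numpy as np

def power_l2_normalize(v, alpha=0.5):
    """Signed power ('discounting') transform followed by L2
    normalization, as commonly applied to Fisher vectors."""
    v = np.sign(v) * np.abs(v) ** alpha  # dampen large components
    return v / np.linalg.norm(v)         # project onto the unit sphere

fv = np.array([4.0, -9.0, 0.0, 1.0])  # toy Fisher vector
out = power_l2_normalize(fv)
```

Smaller values of alpha discount dominant components more aggressively, which is the "discounting" effect referenced above.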
no code implementations • CVPR 2014 • Ramazan Gokberk Cinbis, Jakob Verbeek, Cordelia Schmid
In this case, the supervised information is restricted to binary labels that indicate the absence/presence of object instances in the image, without their locations.