no code implementations • ICCV 2023 • Animesh Karnewar, Niloy J. Mitra, Andrea Vedaldi, David Novotny
Diffusion-based image generators can now produce high-quality and diverse samples, but their success has yet to fully translate to 3D generation: existing diffusion methods can either generate low-resolution but 3D consistent outputs, or detailed 2D views of 3D objects but with potential structural defects and lacking view consistency or realism.
1 code implementation • ICCV 2023 • Chuanxia Zheng, Andrea Vedaldi
Vector Quantisation (VQ) is experiencing a comeback in machine learning, where it is increasingly used in representation learning.
1 code implementation • ICCV 2023 • Roman Shapovalov, Yanir Kleiman, Ignacio Rocco, David Novotny, Andrea Vedaldi, Changan Chen, Filippos Kokkinos, Ben Graham, Natalia Neverova
We introduce Replay, a collection of multi-view, multi-modal videos of humans interacting socially.
1 code implementation • 14 Jul 2023 • Nikita Karaev, Ignacio Rocco, Benjamin Graham, Natalia Neverova, Andrea Vedaldi, Christian Rupprecht
In this paper, we propose CoTracker, an architecture that jointly tracks multiple points throughout an entire video.
no code implementations • 15 Jun 2023 • Laurynas Karazija, Iro Laina, Andrea Vedaldi, Christian Rupprecht
This provides a distribution of appearances for a given text, circumventing the ambiguity problem.
no code implementations • ICCV 2023 • Stanislaw Szymanowicz, Christian Rupprecht, Andrea Vedaldi
We fit a diffusion model to a large number of viewsets for a given category of objects.
1 code implementation • CVPR 2023 • Nikita Karaev, Ignacio Rocco, Benjamin Graham, Natalia Neverova, Andrea Vedaldi, Christian Rupprecht
The network learns to pool information from neighboring frames to improve the temporal consistency of its predictions.
no code implementations • 20 Apr 2023 • Tomas Jakab, Ruining Li, Shangzhe Wu, Christian Rupprecht, Andrea Vedaldi
We propose a framework using an image generator like Stable Diffusion to generate virtual training data for learning such a reconstruction network from scratch.
no code implementations • ICCV 2023 • Aleksandar Shtedritski, Christian Rupprecht, Andrea Vedaldi
Large-scale Vision-Language Models, such as CLIP, learn powerful image-text representations that have found numerous applications, from zero-shot classification to text-to-image generation.
no code implementations • CVPR 2023 • Yaoyao Liu, Bernt Schiele, Andrea Vedaldi, Christian Rupprecht
Incremental object detection (IOD) aims to train an object detector in phases, each with annotations for new object categories.
no code implementations • 6 Apr 2023 • Minghao Chen, Iro Laina, Andrea Vedaldi
We thoroughly evaluate our approach on three benchmarks and provide several qualitative examples and a comparative analysis of the two strategies that demonstrate the superiority of backward guidance compared to forward guidance, as well as prior work.
no code implementations • CVPR 2023 • Animesh Karnewar, Andrea Vedaldi, David Novotny, Niloy Mitra
We show that our diffusion models are scalable, train robustly, and are competitive in terms of sample quality and fidelity to existing approaches for 3D generative modeling.
1 code implementation • 21 Mar 2023 • Ignacio Rocco, Iurii Makarov, Filippos Kokkinos, David Novotny, Benjamin Graham, Natalia Neverova, Andrea Vedaldi
We present a method for fast 3D reconstruction and real-time rendering of dynamic humans from monocular videos with accompanying parametric body fits.
3 code implementations • 21 Feb 2023 • Luke Melas-Kyriazi, Christian Rupprecht, Iro Laina, Andrea Vedaldi
We consider the problem of reconstructing a full 360° photographic model of an object from a single image of it.
no code implementations • 21 Feb 2023 • Luke Melas-Kyriazi, Christian Rupprecht, Andrea Vedaldi
Reconstructing the 3D shape of an object from a single RGB image is a long-standing and highly challenging problem in computer vision.
no code implementations • 26 Jan 2023 • Uriel Singer, Shelly Sheynin, Adam Polyak, Oron Ashual, Iurii Makarov, Filippos Kokkinos, Naman Goyal, Andrea Vedaldi, Devi Parikh, Justin Johnson, Yaniv Taigman
We present MAV3D (Make-A-Video3D), a method for generating three-dimensional dynamic scenes from text descriptions.
no code implementations • CVPR 2023 • Changan Chen, Alexander Richard, Roman Shapovalov, Vamsi Krishna Ithapu, Natalia Neverova, Kristen Grauman, Andrea Vedaldi
We introduce the novel-view acoustic synthesis (NVAS) task: given the sight and sound observed at a source viewpoint, can we synthesize the sound of that scene from an unseen target viewpoint?
no code implementations • CVPR 2023 • Luke Melas-Kyriazi, Christian Rupprecht, Andrea Vedaldi
Reconstructing the 3D shape of an object from a single RGB image is a long-standing problem in computer vision.
no code implementations • CVPR 2023 • Luke Melas-Kyriazi, Iro Laina, Christian Rupprecht, Andrea Vedaldi
We consider the problem of reconstructing a full 360° photographic model of an object from a single image of it.
1 code implementation • 6 Dec 2022 • Mohamed El Banani, Ignacio Rocco, David Novotny, Andrea Vedaldi, Natalia Neverova, Justin Johnson, Benjamin Graham
To address this, we propose a self-supervised approach for correspondence estimation that learns from multiview consistency in short RGB-D video sequences.
no code implementations • CVPR 2023 • Shangzhe Wu, Ruining Li, Tomas Jakab, Christian Rupprecht, Andrea Vedaldi
We consider the problem of predicting the 3D shape, articulation, viewpoint, texture, and lighting of an articulated animal like a horse given a single test image as input.
no code implementations • CVPR 2023 • Samarth Sinha, Roman Shapovalov, Jeremy Reizenstein, Ignacio Rocco, Natalia Neverova, Andrea Vedaldi, David Novotny
Obtaining photorealistic reconstructions of objects from sparse views is inherently ambiguous and can only be achieved by learning suitable reconstruction priors.
no code implementations • 21 Oct 2022 • Laurynas Karazija, Subhabrata Choudhury, Iro Laina, Christian Rupprecht, Andrea Vedaldi
We propose a new approach to learn to segment multiple image objects without manual supervision.
no code implementations • 7 Sep 2022 • Iro Laina, Yuki M. Asano, Andrea Vedaldi
Self-supervised visual representation learning has recently attracted significant research interest.
no code implementations • 7 Sep 2022 • Vadim Tschernezki, Iro Laina, Diane Larlus, Andrea Vedaldi
We present Neural Feature Fusion Fields (N3F), a method that improves dense 2D image feature extractors when the latter are applied to the analysis of multiple images reconstructible as a 3D scene.
no code implementations • 13 Jun 2022 • Eldar Insafutdinov, Dylan Campbell, João F. Henriques, Andrea Vedaldi
We present a method for the accurate 3D reconstruction of partly-symmetric objects.
1 code implementation • CVPR 2022 • Luke Melas-Kyriazi, Christian Rupprecht, Iro Laina, Andrea Vedaldi
We find that these eigenvectors already decompose an image into meaningful segments, and can be readily used to localize objects in a scene.
no code implementations • 16 May 2022 • Subhabrata Choudhury, Laurynas Karazija, Iro Laina, Andrea Vedaldi, Christian Rupprecht
Motion, measured via optical flow, provides a powerful cue to discover and learn objects in images and videos.
Ranked #4 on Unsupervised Object Segmentation on SegTrack-v2
no code implementations • 3 May 2022 • Andrew Brown, Cheng-Yang Fu, Omkar Parkhi, Tamara L. Berg, Andrea Vedaldi
We consider the targeted image editing problem: blending a region in a source image with a driver image that specifies the desired change.
1 code implementation • CVPR 2022 • Sagar Vaze, Kai Han, Andrea Vedaldi, Andrew Zisserman
Here, the unlabelled images may come from labelled classes or from novel ones.
Ranked #1 on Open-World Semi-Supervised Learning on CIFAR-10 (Seen accuracy (50% Labeled) metric)
Tasks: Fine-Grained Visual Recognition, Open-World Semi-Supervised Learning
no code implementations • CVPR 2022 • David Novotny, Ignacio Rocco, Samarth Sinha, Alexandre Carlier, Gael Kerchenbaum, Roman Shapovalov, Nikita Smetanin, Natalia Neverova, Benjamin Graham, Andrea Vedaldi
Compared to weaker deformation models, this significantly reduces the reconstruction ambiguity and, for dynamic objects, allows Keypoint Transporter to obtain reconstructions of a quality superior or at least comparable to prior approaches while being much faster and reliant on a pre-trained monocular depth estimator network.
1 code implementation • CVPR 2022 • Gengshan Yang, Minh Vo, Natalia Neverova, Deva Ramanan, Andrea Vedaldi, Hanbyul Joo
Our key insight is to merge three schools of thought: (1) classic deformable shape models that make use of articulated bones and blend skinning, (2) volumetric neural radiance fields (NeRFs) that are amenable to gradient-based optimization, and (3) canonical embeddings that generate correspondences between pixels and an articulated model.
no code implementations • 8 Dec 2021 • Honglie Chen, Weidi Xie, Triantafyllos Afouras, Arsha Nagrani, Andrea Vedaldi, Andrew Zisserman
Finally, we set the first benchmark for general audio-visual synchronisation with over 160 diverse classes in the new VGG-Sound Sync video dataset.
1 code implementation • NeurIPS 2021 • Subhabrata Choudhury, Iro Laina, Christian Rupprecht, Andrea Vedaldi
First, we construct a proxy task through a set of objectives that encourages the model to learn a meaningful decomposition of the image into its parts.
Ranked #1 on Unsupervised Keypoint Estimation on CUB
1 code implementation • 5 Nov 2021 • Subhabrata Choudhury, Iro Laina, Christian Rupprecht, Andrea Vedaldi
We then train a fine-grained textual similarity model that matches image descriptions with documents on a sentence-level basis.
no code implementations • 19 Oct 2021 • Vadim Tschernezki, Diane Larlus, Andrea Vedaldi
Given a raw video sequence taken from a freely-moving camera, we study the problem of decomposing the observed 3D scene into a static background and a dynamic foreground containing the objects that move in the video sequence.
2 code implementations • ICLR 2022 • Sagar Vaze, Kai Han, Andrea Vedaldi, Andrew Zisserman
In this paper, we first demonstrate that the ability of a classifier to make the 'none-of-above' decision is highly correlated with its accuracy on the closed-set classes.
Ranked #10 on Out-of-Distribution Detection on CIFAR-100 vs CIFAR-10
no code implementations • ICLR 2022 • Iro Laina, Yuki M Asano, Andrea Vedaldi
Self-supervised visual representation learning has attracted significant research interest.
Ranked #89 on Image Classification on ObjectNet (using extra training data)
1 code implementation • NeurIPS Workshop ImageNet_PPF 2021 • Yuki M. Asano, Christian Rupprecht, Andrew Zisserman, Andrea Vedaldi
On the other hand, state-of-the-art pretraining is nowadays obtained with unsupervised methods, meaning that labelled datasets such as ImageNet may not be necessary, or perhaps not even optimal, for model pretraining.
no code implementations • 16 Sep 2021 • Robert McCraith, Lukas Neumann, Andrea Vedaldi
Vision is one of the primary sensing modalities in autonomous driving.
1 code implementation • 16 Sep 2021 • Robert McCraith, Eldar Insafutdinov, Lukas Neumann, Andrea Vedaldi
We present a system for automatically converting 2D mask object predictions and raw LiDAR point clouds into full 3D bounding boxes of objects.
no code implementations • ICCV 2021 • Roman Shapovalov, David Novotny, Benjamin Graham, Patrick Labatut, Andrea Vedaldi
The method learns, in an end-to-end fashion, a soft partition of a given category-specific 3D template mesh into rigid parts together with a monocular reconstruction network that predicts the part motions such that they reproject correctly onto 2D DensePose-like surface annotations of the object.
no code implementations • 19 Aug 2021 • Matan Atzmon, David Novotny, Andrea Vedaldi, Yaron Lipman
Implicit neural representation is a recent approach to learn shape collections as zero level-sets of neural networks, where each shape is represented by a latent code.
no code implementations • 22 Jul 2021 • Shangzhe Wu, Tomas Jakab, Christian Rupprecht, Andrea Vedaldi
In this paper, we present DOVE, a method that learns textured 3D models of deformable object categories from monocular videos available online, without keypoint, viewpoint or template shape supervision.
1 code implementation • 29 Jun 2021 • Kai Han, Sylvestre-Alvise Rebuffi, Sébastien Ehrhardt, Andrea Vedaldi, Andrew Zisserman
We present a new approach called AutoNovel to address this problem by combining three ideas: (1) we suggest that the common approach of bootstrapping an image representation using the labelled data only introduces an unwanted bias, and that this can be avoided by using self-supervised learning to train the representation from scratch on the union of labelled and unlabelled data; (2) we use ranking statistics to transfer the model's knowledge of the labelled classes to the problem of clustering the unlabelled images; and, (3) we train the data representation by optimizing a joint objective function on the labelled and unlabelled subsets of the data, improving both the supervised classification of the labelled data, and the clustering of the unlabelled data.
Ranked #1 on Novel Class Discovery on SVHN
no code implementations • CVPR 2021 • Lukas Neumann, Andrea Vedaldi
Predicting future pedestrian trajectory is a crucial component of autonomous driving systems, as recognizing critical situations based only on current pedestrian position may come too late for any meaningful corrective action (e.g. braking) to take place.
no code implementations • CVPR 2021 • Marvin Eisenberger, David Novotny, Gael Kerchenbaum, Patrick Labatut, Natalia Neverova, Daniel Cremers, Andrea Vedaldi
We present NeuroMorph, a new neural network architecture that takes as input two 3D shapes and produces in one go, i.e. in a single feed forward pass, a smooth interpolation and point-to-point correspondences between them.
no code implementations • CVPR 2021 • Natalia Neverova, Artsiom Sanakoyeu, Patrick Labatut, David Novotny, Andrea Vedaldi
Recent work has shown that it is possible to learn a unified dense pose predictor for several categories of related objects.
1 code implementation • 15 Jun 2021 • Xu Ji, Razvan Pascanu, Devon Hjelm, Balaji Lakshminarayanan, Andrea Vedaldi
Intuitively, one would expect accuracy of a trained neural network's prediction on test samples to correlate with how densely the samples are surrounded by seen training samples in representation space.
2 code implementations • NeurIPS 2021 • Mandela Patrick, Dylan Campbell, Yuki M. Asano, Ishan Misra, Florian Metze, Christoph Feichtenhofer, Andrea Vedaldi, João F. Henriques
In video transformers, the time dimension is often treated in the same way as the two spatial dimensions.
Ranked #14 on Action Recognition on EPIC-KITCHENS-100 (using extra training data)
1 code implementation • ICLR 2022 • Luke Melas-Kyriazi, Christian Rupprecht, Iro Laina, Andrea Vedaldi
Recent research has shown that numerous human-interpretable directions exist in the latent space of GANs.
no code implementations • 5 May 2021 • Dan Xu, Andrea Vedaldi, Joao F. Henriques
We build on the idea of view synthesis, which uses classical camera geometry to re-render a source image from a different point-of-view, specified by a predicted relative pose and depth map.
no code implementations • CVPR 2022 • Triantafyllos Afouras, Yuki M. Asano, Francois Fagan, Andrea Vedaldi, Florian Metze
We tackle the problem of learning object detectors without supervision.
1 code implementation • CVPR 2021 • Honglie Chen, Weidi Xie, Triantafyllos Afouras, Arsha Nagrani, Andrea Vedaldi, Andrew Zisserman
We show that our algorithm achieves state-of-the-art performance on the popular Flickr SoundNet dataset.
no code implementations • CVPR 2021 • Philipp Henzler, Jeremy Reizenstein, Patrick Labatut, Roman Shapovalov, Tobias Ritschel, Andrea Vedaldi, David Novotny
Our goal is to learn a deep network that, given a small number of images of an object of a given category, reconstructs it in 3D.
1 code implementation • ICCV 2021 • Mandela Patrick, Yuki M. Asano, Bernie Huang, Ishan Misra, Florian Metze, Joao Henriques, Andrea Vedaldi
First, for space, we show that spatial augmentations such as cropping do work well for videos too, but that previous implementations, due to the high processing and memory cost, could not do this at a scale sufficient for it to work well.
1 code implementation • NeurIPS 2020 • Natalia Neverova, David Novotny, Vasil Khalidov, Marc Szafraniec, Patrick Labatut, Andrea Vedaldi
In this work, we focus on the task of learning and representing dense correspondences in deformable object categories.
no code implementations • NeurIPS 2020 • Benjamin Biggs, Sébastien Ehrhardt, Hanbyul Joo, Benjamin Graham, Andrea Vedaldi, David Novotny
We consider the problem of obtaining dense 3D reconstructions of humans from single and partially occluded views.
no code implementations • NeurIPS 2020 • Iro Laina, Ruth C. Fong, Andrea Vedaldi
The increasing impact of black box models, and particularly of unsupervised ones, comes with an increasing interest in tools to understand and interpret them.
no code implementations • ICLR 2021 • Mandela Patrick, Po-Yao Huang, Yuki Asano, Florian Metze, Alexander Hauptmann, João Henriques, Andrea Vedaldi
The dominant paradigm for learning video-text representations -- noise contrastive learning -- increases the similarity of the representations of pairs of samples that are known to be related, such as text and video from the same sample, and pushes away the representations of all other pairs.
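The contrastive objective described in this abstract can be sketched in a few lines. This is a minimal illustration of a standard InfoNCE-style loss, not the authors' implementation: the function names, the temperature value, and the use of cosine similarity are assumptions made for the example. Row i of the video embeddings is treated as the positive match for row i of the text embeddings, and every other pairing in the batch is treated as a negative.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit L2 norm so that dot products are cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def contrastive_loss(video_emb, text_emb, temperature=0.1):
    """Mean cross-entropy of matching video i to text i among all texts in the batch.

    Minimising this raises the similarity of related (video, text) pairs and
    lowers the similarity of all other pairs. `temperature` is a hypothetical
    default chosen for illustration.
    """
    videos = [normalize(v) for v in video_emb]
    texts = [normalize(t) for t in text_emb]
    total = 0.0
    for i, v in enumerate(videos):
        logits = [dot(v, t) / temperature for t in texts]
        # Numerically stable log-sum-exp over all candidate texts.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - logits[i]
    return total / len(videos)
```

With correctly paired embeddings the loss is close to zero, while pairing each video with the wrong text drives it up, which is the behaviour the abstract attributes to this training paradigm.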
no code implementations • 28 Sep 2020 • Mandela Patrick, Yuki Asano, Polina Kuznetsova, Ruth Fong, Joao F. Henriques, Geoffrey Zweig, Andrea Vedaldi
In this paper, we show that, for videos, the answer is more complex, and that better results can be obtained by accounting for the interplay between invariance, distinctiveness, multiple modalities and time.
no code implementations • 16 Sep 2020 • Robert McCraith, Lukas Neumann, Andrea Vedaldi
In recent years, many methods have demonstrated the ability of neural networks to learn depth and pose changes in a sequence of images, using only self-supervision as the training signal.
1 code implementation • NeurIPS 2020 • David Novotny, Roman Shapovalov, Andrea Vedaldi
We propose the Canonical 3D Deformer Map, a new representation of the 3D shape of common object categories that can be learned from a collection of 2D images of independent objects.
1 code implementation • NeurIPS 2020 • Sebastien Ehrhardt, Oliver Groth, Aron Monszpart, Martin Engelcke, Ingmar Posner, Niloy Mitra, Andrea Vedaldi
We present RELATE, a model that learns to generate physically plausible scenes and videos of multiple interacting objects.
1 code implementation • NeurIPS 2020 • Yuki M. Asano, Mandela Patrick, Christian Rupprecht, Andrea Vedaldi
A large part of the current success of deep learning lies in the effectiveness of data -- more precisely: labelled data.
1 code implementation • 22 Jun 2020 • Xu Ji, Joao Henriques, Tinne Tuytelaars, Andrea Vedaldi
Replay in neural networks involves training on sequential data with memorized samples, which counteracts forgetting of previous behavior caused by non-stationarity.
1 code implementation • 17 Jun 2020 • Sylvestre-Alvise Rebuffi, Sebastien Ehrhardt, Kai Han, Andrea Vedaldi, Andrew Zisserman
We present LSD-C, a novel method to identify clusters in an unlabeled dataset.
2 code implementations • 29 Apr 2020 • Honglie Chen, Weidi Xie, Andrea Vedaldi, Andrew Zisserman
Our goal is to collect a large-scale audio-visual dataset with low label noise from videos in the wild using computer vision techniques.
no code implementations • 13 Apr 2020 • Robert McCraith, Lukas Neumann, Andrew Zisserman, Andrea Vedaldi
Recent advances in self-supervised learning have demonstrated that it is possible to learn accurate monocular depth reconstruction from raw video data, without using any 3D ground truth for supervision.
1 code implementation • 7 Apr 2020 • Hanbyul Joo, Natalia Neverova, Andrea Vedaldi
Remarkably, the resulting annotations are sufficient to train from scratch 3D pose regressor networks that outperform the current state-of-the-art on in-the-wild benchmarks such as 3DPW.
Ranked #20 on 3D Human Pose Estimation on MPI-INF-3DHP (PA-MPJPE metric)
1 code implementation • CVPR 2020 • Sylvestre-Alvise Rebuffi, Ruth Fong, Xu Ji, Andrea Vedaldi
Saliency methods seek to explain the predictions of a model by producing an importance map across each input sample.
1 code implementation • 19 Mar 2020 • Oliver Groth, Chia-Man Hung, Andrea Vedaldi, Ingmar Posner
Visuomotor control (VMC) is an effective means of achieving basic manipulation tasks such as pushing or pick-and-place from raw images.
1 code implementation • 18 Mar 2020 • Hugo Touvron, Andrea Vedaldi, Matthijs Douze, Hervé Jégou
An EfficientNet-L2 pre-trained with weak supervision on 300M unlabeled images and further optimized with FixRes achieves 88.5% top-1 accuracy (top-5: 98.7%), which establishes the new state of the art for ImageNet with a single crop.
Ranked #8 on Image Classification on ImageNet ReaL (using extra training data)
1 code implementation • ICCV 2021 • Mandela Patrick, Yuki M. Asano, Polina Kuznetsova, Ruth Fong, João F. Henriques, Geoffrey Zweig, Andrea Vedaldi
In the image domain, excellent representations can be learned by inducing invariance to content-preserving transformations via noise contrastive learning.
1 code implementation • CVPR 2020 • Artsiom Sanakoyeu, Vasil Khalidov, Maureen S. McCarthy, Andrea Vedaldi, Natalia Neverova
Recent contributions have demonstrated that it is possible to recognize the pose of humans densely and accurately given a large dataset of poses annotated in detail.
1 code implementation • ICLR 2020 • Kai Han, Sylvestre-Alvise Rebuffi, Sebastien Ehrhardt, Andrea Vedaldi, Andrew Zisserman
In this work we address this problem by combining three ideas: (1) we suggest that the common approach of bootstrapping an image representation using the labeled data only introduces an unwanted bias, and that this can be avoided by using self-supervised learning to train the representation from scratch on the union of labelled and unlabelled data; (2) we use rank statistics to transfer the model's knowledge of the labelled classes to the problem of clustering the unlabelled images; and, (3) we train the data representation by optimizing a joint objective function on the labelled and unlabelled subsets of the data, improving both the supervised classification of the labelled data, and the clustering of the unlabelled data.
no code implementations • NeurIPS 2019 • Natalia Neverova, David Novotny, Andrea Vedaldi
We show that these models, by understanding uncertainty better, can solve the original DensePose task more accurately, thus setting the new state-of-the-art accuracy in this benchmark.
1 code implementation • CVPR 2020 • Shangzhe Wu, Christian Rupprecht, Andrea Vedaldi
We propose a method to learn 3D deformable object categories from raw single-view images, without external supervision.
4 code implementations • ICLR 2020 • Yuki Markus Asano, Christian Rupprecht, Andrea Vedaldi
Combining clustering and representation learning is one of the most promising approaches for unsupervised learning of deep neural networks.
Ranked #6 on Image Clustering on ImageNet
no code implementations • 23 Oct 2019 • Ruth Fong, Andrea Vedaldi
Deep networks for visual recognition are known to leverage "easy to recognise" portions of objects such as faces and distinctive texture patterns.
no code implementations • 19 Oct 2019 • Sylvestre-Alvise Rebuffi, Ruth Fong, Xu Ji, Hakan Bilen, Andrea Vedaldi
In this paper, we are instead interested in the locations of an image that contribute to the model's training.
2 code implementations • ICCV 2019 • Ruth Fong, Mandela Patrick, Andrea Vedaldi
In this paper, we discuss some of the shortcomings of existing approaches to perturbation analysis and address them by introducing the concept of extremal perturbations, which are theoretically grounded and interpretable.
2 code implementations • ICCV 2019 • David Novotny, Nikhila Ravi, Benjamin Graham, Natalia Neverova, Andrea Vedaldi
We propose C3DPO, a method for extracting 3D models of deformable objects from 2D keypoint annotations in unconstrained images.
1 code implementation • ICCV 2019 • Kai Han, Andrea Vedaldi, Andrew Zisserman
The second contribution is a method to estimate the number of classes in the unlabelled data.
1 code implementation • ICCV 2019 • James Thewlis, Samuel Albanie, Hakan Bilen, Andrea Vedaldi
Equivariance to random image transformations is an effective method to learn landmarks of object categories, such as the eyes and the nose in faces, without manual supervision.
Ranked #1 on Unsupervised Facial Landmark Detection on 300W
no code implementations • 14 Aug 2019 • Honglie Chen, Weidi Xie, Andrea Vedaldi, Andrew Zisserman
We propose AutoCorrect, a method to automatically learn object-annotation alignments from a dataset with annotations affected by geometric noise.
no code implementations • CVPR 2020 • Tomas Jakab, Ankush Gupta, Hakan Bilen, Andrea Vedaldi
We propose KeypointGAN, a new method for recognizing the pose of objects from a single image that, for learning, uses only unlabelled videos and a weak empirical prior on the object poses.
3 code implementations • NeurIPS 2019 • Hugo Touvron, Andrea Vedaldi, Matthijs Douze, Hervé Jégou
Conversely, when training a ResNeXt-101 32x48d pre-trained in weakly-supervised fashion on 940 million public images at resolution 224x224 and further optimizing for test resolution 320x320, we obtain a test top-1 accuracy of 86.4% (top-5: 98.0%) (single-crop).
Ranked #2 on Fine-Grained Image Classification on Birdsnap (using extra training data)
no code implementations • CVPR 2019 • Natalia Neverova, James Thewlis, Riza Alp Güler, Iasonas Kokkinos, Andrea Vedaldi
DensePose supersedes traditional landmark detectors by densely mapping image pixels to body surface coordinates.
no code implementations • 4 Jun 2019 • Shangzhe Wu, Christian Rupprecht, Andrea Vedaldi
Specifically, given a single image of the object seen from an arbitrary viewpoint, our model predicts a symmetric canonical view, the corresponding 3D shape and a viewpoint transformation, and trains with the goal of reconstructing the input view, resembling an auto-encoder.
no code implementations • 26 May 2019 • Sébastien Ehrhardt, Aron Monszpart, Niloy J. Mitra, Andrea Vedaldi
We are interested in learning models of intuitive physics similar to the ones that animals use for navigation, manipulation and planning.
1 code implementation • 21 May 2019 • Sylvestre-Alvise Rebuffi, Sebastien Ehrhardt, Kai Han, Andrea Vedaldi, Andrew Zisserman
The first is a simple but effective one: we leverage the power of transfer learning among different tasks and self-supervision to initialize a good representation of the data without making use of any label.
no code implementations • ICLR 2019 • Fabian Fuchs, Oliver Groth, Adam Kosiorek, Alex Bewley, Markus Wulfmeier, Andrea Vedaldi, Ingmar Posner
Using an adversarial stethoscope, the network is successfully de-biased, leading to a performance increase from 66% to 88%.
2 code implementations • ICLR 2020 • Yuki M. Asano, Christian Rupprecht, Andrea Vedaldi
We look critically at popular self-supervision techniques for learning deep convolutional neural networks without manual labels.
3 code implementations • 14 Feb 2019 • Maxim Berman, Hervé Jégou, Andrea Vedaldi, Iasonas Kokkinos, Matthijs Douze
When fed to a linear classifier, the learned embeddings provide state-of-the-art classification accuracy.
Ranked #1 on Image Retrieval on INRIA Holidays
no code implementations • NeurIPS 2018 • James Thewlis, Hakan Bilen, Andrea Vedaldi
We propose a new approach to model and learn, without manual supervision, the symmetries of natural objects, such as faces or flowers, given only images as input.
8 code implementations • NeurIPS 2018 • Jie Hu, Li Shen, Samuel Albanie, Gang Sun, Andrea Vedaldi
We also propose a parametric gather-excite operator pair which yields further performance gains, relate it to the recently-introduced Squeeze-and-Excitation Networks, and analyse the effects of these changes to the CNN feature activation statistics.
no code implementations • 23 Sep 2018 • Ankush Gupta, Andrea Vedaldi, Andrew Zisserman
This work presents a method for visual text recognition without using any paired supervisory data.
no code implementations • ECCV 2018 • Maria Klodt, Andrea Vedaldi
First, since such self-supervised approaches are based on the brightness constancy assumption, which is valid only for a subset of pixels, we propose a probabilistic learning formulation where the network predicts distributions over variables rather than specific values.
no code implementations • 16 Aug 2018 • Samuel Albanie, Arsha Nagrani, Andrea Vedaldi, Andrew Zisserman
We make the following contributions: (i) we develop a strong teacher network for facial emotion recognition that achieves the state of the art on a standard benchmark; (ii) we use the teacher to train a student, tabula rasa, to learn representations (embeddings) for speech emotion recognition without access to labelled audio data; and (iii) we show that the speech emotion embedding can be used for speech emotion recognition on external benchmark datasets.
Ranked #3 on Facial Expression Recognition (FER) on FERPlus
Tasks: Facial Emotion Recognition, Facial Expression Recognition (FER)
no code implementations • ECCV 2018 • David Novotny, Samuel Albanie, Diane Larlus, Andrea Vedaldi
Object detection and instance segmentation are dominated by region-based methods such as Mask RCNN.
no code implementations • 21 Jul 2018 • Ankush Gupta, Andrea Vedaldi, Andrew Zisserman
End-to-end trained Recurrent Neural Networks (RNNs) have been successfully applied to numerous problems that require processing sequences, such as image captioning, machine translation, and text recognition.
1 code implementation • 20 Jul 2018 • Karel Lenc, Andrea Vedaldi
The new protocol is better for assessment on a large number of images and reduces the dependency of the results on unwanted distractors such as the number of detected features and the feature magnification factor.
6 code implementations • ICCV 2019 • Xu Ji, João F. Henriques, Andrea Vedaldi
The method is not specialised to computer vision and operates on any paired dataset samples; in our experiments we use random transforms to obtain a pair from each image.
Ranked #1 on Unsupervised MNIST on MNIST
no code implementations • 15 Jul 2018 • Aravindh Mahendran, James Thewlis, Andrea Vedaldi
We propose a novel method for learning convolutional neural image representations without manual supervision.
2 code implementations • NeurIPS 2018 • Tomas Jakab, Ankush Gupta, Hakan Bilen, Andrea Vedaldi
We propose a method for learning landmark detectors for visual objects (such as the eyes and the nose in a face) without any manual supervision.
Tasks: Conditional Image Generation, Unsupervised Facial Landmark Detection
no code implementations • 14 Jun 2018 • Fabian B. Fuchs, Oliver Groth, Adam R. Kosiorek, Alex Bewley, Markus Wulfmeier, Andrea Vedaldi, Ingmar Posner
Conversely, training on an easy dataset where visual cues are positively correlated with stability, the baseline model learns a bias leading to poor performance on a harder dataset.
no code implementations • CVPR 2018 • João F. Henriques, Andrea Vedaldi
The module contains an allocentric spatial memory that can be accessed associatively by feeding to it the current sensory input, resulting in localization, and then updated using an LSTM or similar mechanism.
1 code implementation • 21 May 2018 • João F. Henriques, Sebastien Ehrhardt, Samuel Albanie, Andrea Vedaldi
We propose a fast second-order method that can be used as a drop-in replacement for current deep learning solvers.
6 code implementations • ICLR 2019 • João F. Henriques, Sebastien Ehrhardt, Samuel Albanie, Andrea Vedaldi
Instead, we propose to keep a single estimate of the gradient projected by the inverse Hessian matrix, and update it once per iteration.
5 code implementations • ICLR 2019 • Luca Bertinetto, João F. Henriques, Philip H. S. Torr, Andrea Vedaldi
The main idea is to teach a deep network to use standard machine learning tools, such as ridge regression, as part of its own internal model, enabling it to quickly adapt to novel data.
no code implementations • 14 May 2018 • Sebastien Ehrhardt, Aron Monszpart, Niloy Mitra, Andrea Vedaldi
While learning models of intuitive physics is an increasingly active area of research, current approaches still fall short of natural intelligences in one important regard: they require external supervision, such as explicit access to physical states, at training and sometimes even at test time.
1 code implementation • ECCV 2018 • Oliver Groth, Fabian B. Fuchs, Ingmar Posner, Andrea Vedaldi
Physical intuition is pivotal for intelligent agents to perform complex tasks.
no code implementations • CVPR 2018 • David Novotny, Samuel Albanie, Diane Larlus, Andrea Vedaldi
Self-supervision can dramatically cut back the amount of manually-labelled data required to train deep neural networks.
3 code implementations • CVPR 2018 • Sylvestre-Alvise Rebuffi, Hakan Bilen, Andrea Vedaldi
A practical limitation of deep neural networks is their high degree of specialization to a single task and visual domain.
no code implementations • ECCV 2018 • Jack Valmadre, Luca Bertinetto, João F. Henriques, Ran Tao, Andrea Vedaldi, Arnold Smeulders, Philip Torr, Efstratios Gavves
We introduce the OxUvA dataset and benchmark for evaluating single-object tracking algorithms.
1 code implementation • CVPR 2018 • Ruth Fong, Andrea Vedaldi
By studying such embeddings, we are able to show that (1) in most cases, multiple filters are required to code for a concept; (2) filters are often not concept-specific and help encode multiple concepts; and (3) compared to single filter activations, filter embeddings better characterize the meaning of a representation and its relationship to other concepts.
no code implementations • 22 Dec 2017 • Sebastien Ehrhardt, Aron Monszpart, Niloy Mitra, Andrea Vedaldi
To leverage the approximation capabilities of artificial intelligence techniques in such physics-related contexts, researchers have handcrafted the relevant states, and then used neural networks to learn the state transitions using simulation runs as training data.
14 code implementations • CVPR 2018 • Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky
In this paper, we show that, on the contrary, the structure of a generator network is sufficient to capture a great deal of low-level image statistics prior to any learning.
no code implementations • 26 Nov 2017 • Jameson Merkow, Robert Lufkin, Kim Nguyen, Stefano Soatto, Zhuowen Tu, Andrea Vedaldi
Thus, DeepRadiologyNet enables significant reduction in the workload of human radiologists by automatically filtering studies and reporting on the high-confidence ones at an operating point well below the literal error rate for US Board Certified radiologists, estimated at 0.82%.
no code implementations • NeurIPS 2017 • James Thewlis, Hakan Bilen, Andrea Vedaldi
One of the key challenges of visual perception is to extract abstract models of 3D objects and object categories from visual measurements, which are affected by complex nuisance factors such as viewpoint, occlusion, motion, and deformations.
Ranked #3 on Unsupervised Facial Landmark Detection on AFLW-MTFL
Optical Flow Estimation
Unsupervised Facial Landmark Detection
no code implementations • 6 Jun 2017 • Sébastien Ehrhardt, Aron Monszpart, Andrea Vedaldi, Niloy Mitra
While the basic laws of Newtonian mechanics are well understood, explaining a physical scenario still requires manually modeling the problem with suitable equations and associated parameters.
2 code implementations • NeurIPS 2017 • Sylvestre-Alvise Rebuffi, Hakan Bilen, Andrea Vedaldi
There is a growing interest in learning data representations that work well for many different types of problems and data.
no code implementations • ICCV 2017 • David Novotny, Diane Larlus, Andrea Vedaldi
Traditional approaches for learning 3D object categories use either synthetic data or manual supervision.
1 code implementation • ICCV 2017 • James Thewlis, Hakan Bilen, Andrea Vedaldi
Automatically learning the structure of object categories remains an important open problem in computer vision.
Ranked #2 on Unsupervised Facial Landmark Detection on AFLW-MTFL
Unsupervised Facial Landmark Detection
Unsupervised Human Pose Estimation
no code implementations • CVPR 2017 • Jack Valmadre, Luca Bertinetto, João F. Henriques, Andrea Vedaldi, Philip H. S. Torr
The Correlation Filter is an algorithm that trains a linear template to discriminate between images and their translations.
Ranked #3 on Visual Object Tracking on OTB-50
no code implementations • CVPR 2017 • Vassileios Balntas, Karel Lenc, Andrea Vedaldi, Krystian Mikolajczyk
In this paper, we propose a novel benchmark for evaluating local image descriptors.
no code implementations • CVPR 2017 • David Novotny, Diane Larlus, Andrea Vedaldi
Despite significant progress of deep learning in recent years, state-of-the-art semantic matching methods still rely on legacy features such as SIFT or HoG.
6 code implementations • ICCV 2017 • Ruth Fong, Andrea Vedaldi
As machine learning algorithms are increasingly applied to high impact yet high risk tasks, such as medical diagnosis or autonomous driving, it is critical that researchers can explain how such algorithms arrived at their predictions.
1 code implementation • 7 Apr 2017 • Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky
Unlike previous hybrids of autoencoders and adversarial networks, the adversarial game in our approach is set up directly between the encoder and the generator, and no external mappings are trained in the process of learning.
no code implementations • 1 Mar 2017 • Sebastien Ehrhardt, Aron Monszpart, Niloy J. Mitra, Andrea Vedaldi
Evolution has resulted in highly developed abilities in many natural intelligences to quickly and accurately predict mechanical phenomena.
no code implementations • 25 Jan 2017 • Hakan Bilen, Andrea Vedaldi
With the advent of large labelled datasets and high-capacity models, the performance of machine vision systems has been improving rapidly.
Ranked #14 on Continual Learning on visual domain decathlon (10 tasks)
1 code implementation • CVPR 2017 • Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky
The recent work of Gatys et al., who characterized the style of an image by the statistics of convolutional neural network filters, ignited a renewed interest in the texture generation and image stylization problems.
3 code implementations • 2 Dec 2016 • Hakan Bilen, Basura Fernando, Efstratios Gavves, Andrea Vedaldi
This is a powerful idea because it allows any video to be converted into an image, so that existing CNN models pre-trained for the analysis of still images can be immediately extended to videos.
no code implementations • 7 Oct 2016 • Samuel Albanie, Andrea Vedaldi
As a starting point, we consider the problem of relating facial expressions to objectively measurable events occurring in videos.
no code implementations • ICML 2017 • João F. Henriques, Andrea Vedaldi
Convolutional Neural Networks (CNNs) are extremely efficient, since they exploit the inherent translation-invariance of natural images.
1 code implementation • 12 Sep 2016 • James Thewlis, Shuai Zheng, Philip H. S. Torr, Andrea Vedaldi
Deep Matching (DM) is a popular high-quality method for quasi-dense image matching.
21 code implementations • 27 Jul 2016 • Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky
In this paper we revisit the fast stylization method introduced in Ulyanov et al.
no code implementations • 5 Jul 2016 • David Novotny, Diane Larlus, Andrea Vedaldi
While recent research in image understanding has often focused on recognizing more types of objects, understanding more about the objects is just as important.
10 code implementations • 30 Jun 2016 • Luca Bertinetto, Jack Valmadre, João F. Henriques, Andrea Vedaldi, Philip H. S. Torr
The problem of arbitrary object tracking has traditionally been tackled by learning a model of the object's appearance exclusively online, using as sole training data the video itself.
Ranked #3 on Visual Object Tracking on OTB-50
no code implementations • NeurIPS 2016 • Luca Bertinetto, João F. Henriques, Jack Valmadre, Philip H. S. Torr, Andrea Vedaldi
In this paper, we propose a method to learn the parameters of a deep model in one shot.
no code implementations • NeurIPS 2016 • Hakan Bilen, Andrea Vedaldi
Modern discriminative predictors have been shown to match natural intelligences in specific perceptual tasks in image classification, object and part detection, boundary extraction, etc.
1 code implementation • CVPR 2016 • Hakan Bilen, Basura Fernando, Efstratios Gavves, Andrea Vedaldi, Stephen Gould
We introduce the concept of dynamic image, a novel compact representation of videos useful for video analysis especially when convolutional neural networks (CNNs) are used.
Ranked #61 on Action Recognition on HMDB-51
1 code implementation • 4 May 2016 • Karel Lenc, Andrea Vedaldi
We support these ideas theoretically, proposing a novel analysis of local features in terms of geometric transformations, and we show that all common and many uncommon detectors can be derived in this framework.
3 code implementations • CVPR 2016 • Ankush Gupta, Andrea Vedaldi, Andrew Zisserman
In this paper we introduce a new method for text detection in natural images.
Ranked #12 on Scene Text Detection on ICDAR 2013 (1015)
11 code implementations • 10 Mar 2016 • Dmitry Ulyanov, Vadim Lebedev, Andrea Vedaldi, Victor Lempitsky
Gatys et al. recently demonstrated that deep networks can generate beautiful textures and stylized images from a single texture example.
no code implementations • 7 Dec 2015 • Aravindh Mahendran, Andrea Vedaldi
Image representations, from SIFT and bag of visual words to Convolutional Neural Networks (CNNs), are a crucial component of almost all computer vision systems.
3 code implementations • CVPR 2016 • Hakan Bilen, Andrea Vedaldi
Weakly supervised learning of object detection is an important problem in image understanding that still does not have a satisfactory solution.
Ranked #3 on Weakly Supervised Object Detection on HICO-DET
no code implementations • 9 Jul 2015 • Mircea Cimpoi, Subhransu Maji, Iasonas Kokkinos, Andrea Vedaldi
Visual textures have played a key role in image understanding because they convey important semantics of images, and because texture representations that pool local image descriptors in an orderless manner have had a tremendous impact in diverse applications.
no code implementations • 23 Jun 2015 • Karel Lenc, Andrea Vedaldi
In object detection, methods such as R-CNN have obtained excellent results by integrating CNNs with region proposal generation algorithms such as selective search.
1 code implementation • CVPR 2015 • Mircea Cimpoi, Subhransu Maji, Andrea Vedaldi
Research in texture recognition often concentrates on the problem of material recognition in uncluttered conditions, an assumption rarely met by applications.
no code implementations • 18 Apr 2015 • David Novotný, Diane Larlus, Florent Perronnin, Andrea Vedaldi
Fisher Vectors and related orderless visual statistics have demonstrated excellent performance in object detection, sometimes superior to established approaches such as the Deformable Part Models.
no code implementations • 20 Dec 2014 • Sobhan Naderi Parizi, Andrea Vedaldi, Andrew Zisserman, Pedro Felzenszwalb
First, a collection of informative parts is discovered, using heuristics that promote part distinctiveness and diversity, and then classifiers are trained on the vector of part responses.
no code implementations • 18 Dec 2014 • Max Jaderberg, Karen Simonyan, Andrea Vedaldi, Andrew Zisserman
We develop a representation suitable for the unconstrained recognition of words in natural images: the general case of no fixed lexicon and unknown length.
no code implementations • 15 Dec 2014 • Andrea Vedaldi, Karel Lenc
MatConvNet is an implementation of Convolutional Neural Networks (CNNs) for MATLAB.
no code implementations • 4 Dec 2014 • Max Jaderberg, Karen Simonyan, Andrea Vedaldi, Andrew Zisserman
In this work we present an end-to-end system for text spotting -- localising and recognising text in natural scene images -- and text-based image retrieval.
Ranked #15 on Scene Text Detection on ICDAR 2013 (1015)
8 code implementations • CVPR 2015 • Aravindh Mahendran, Andrea Vedaldi
Image representations, from SIFT and Bag of Visual Words to Convolutional Neural Networks (CNNs), are a crucial component of almost any image understanding system.
no code implementations • 25 Nov 2014 • Mircea Cimpoi, Subhransu Maji, Andrea Vedaldi
Research in texture recognition often concentrates on the problem of material recognition in uncluttered conditions, an assumption rarely met by applications.
no code implementations • CVPR 2015 • Karel Lenc, Andrea Vedaldi
Despite the importance of image representations such as histograms of oriented gradients and deep Convolutional Neural Networks (CNN), our theoretical understanding of them remains limited.
1 code implementation • 9 Jun 2014 • Max Jaderberg, Karen Simonyan, Andrea Vedaldi, Andrew Zisserman
In this work we present a framework for the recognition of natural scene text.
Ranked #32 on Scene Text Recognition on SVT
no code implementations • CVPR 2014 • Omkar M. Parkhi, Karen Simonyan, Andrea Vedaldi, Andrew Zisserman
Our goal is to learn a compact, discriminative vector representation of a face track, suitable for the face recognition tasks of verification and classification.
no code implementations • CVPR 2014 • Andrea Vedaldi, Siddharth Mahendran, Stavros Tsogkas, Subhransu Maji, Ross Girshick, Juho Kannala, Esa Rahtu, Iasonas Kokkinos, Matthew B. Blaschko, David Weiss, Ben Taskar, Karen Simonyan, Naomi Saphra, Sammy Mohamed
We show that the collected data can be used to study the relation between part detection and attribute prediction by diagnosing the performance of classifiers that pool information from different parts of an object.
no code implementations • 15 May 2014 • Max Jaderberg, Andrea Vedaldi, Andrew Zisserman
The focus of this paper is speeding up the evaluation of convolutional neural networks.
1 code implementation • 14 May 2014 • Ken Chatfield, Karen Simonyan, Andrea Vedaldi, Andrew Zisserman
In particular, we show that the data augmentation techniques commonly applied to CNN-based methods can also be applied to shallow methods, and result in an analogous performance boost.
21 code implementations • 20 Dec 2013 • Karen Simonyan, Andrea Vedaldi, Andrew Zisserman
This paper addresses the visualisation of image classification models, learnt using deep Convolutional Networks (ConvNets).
no code implementations • NeurIPS 2013 • Karen Simonyan, Andrea Vedaldi, Andrew Zisserman
As massively parallel computations have become broadly available with modern GPUs, deep architectures trained on very large datasets have risen in popularity.
11 code implementations • CVPR 2014 • Mircea Cimpoi, Subhransu Maji, Iasonas Kokkinos, Sammy Mohamed, Andrea Vedaldi
Patterns and textures are defining characteristics of many natural objects: a shirt can be striped, the wings of a butterfly can be veined, and the skin of an animal can be scaly.
1 code implementation • 21 Jun 2013 • Subhransu Maji, Esa Rahtu, Juho Kannala, Matthew Blaschko, Andrea Vedaldi
This paper introduces FGVC-Aircraft, a new dataset containing 10,000 images of aircraft spanning 100 aircraft models, organised in a three-level hierarchy.
no code implementations • CVPR 2013 • Mayank Juneja, Andrea Vedaldi, C. V. Jawahar, Andrew Zisserman
The automatic discovery of distinctive parts for an object or scene class is challenging since it requires simultaneously learning the part appearance and identifying the part occurrences in images.
no code implementations • NeurIPS 2011 • Victor Lempitsky, Andrea Vedaldi, Andrew Zisserman
Often, the random field is applied over a flat partitioning of the image into non-intersecting elements, such as pixels or super-pixels.
no code implementations • NeurIPS 2010 • Matthew Blaschko, Andrea Vedaldi, Andrew Zisserman
A standard approach to learning object category detectors is to provide strong supervision in the form of a region of interest (ROI) specifying each instance of the object in the training images.
no code implementations • NeurIPS 2009 • Andrea Vedaldi, Andrew Zisserman
We develop a structured output model for object category detection that explicitly accounts for alignment, multiple aspects and partial truncation in both training and inference.