no code implementations • 22 Apr 2024 • Sophia Sirko-Galouchenko, Alexandre Boulch, Spyros Gidaris, Andrei Bursuc, Antonin Vobecky, Patrick Pérez, Renaud Marlet
Models pretrained with our method exhibit improved BEV semantic segmentation performance, particularly in low-data scenarios.
no code implementations • NeurIPS 2023 • Antonin Vobecky, Oriane Siméoni, David Hurych, Spyros Gidaris, Andrei Bursuc, Patrick Pérez, Josef Sivic
We describe an approach to predict an open-vocabulary 3D semantic voxel occupancy map from input 2D images, with the objective of enabling 3D grounding, segmentation and retrieval of free-form language queries.
3D Semantic Occupancy Prediction • 3D Semantic Segmentation
1 code implementation • 1 Dec 2023 • Ioannis Kakogeorgiou, Spyros Gidaris, Konstantinos Karantzalos, Nikos Komodakis
Unsupervised object-centric learning aims to decompose scenes into interpretable object entities, termed slots.
1 code implementation • 26 Oct 2023 • Gilles Puy, Spyros Gidaris, Alexandre Boulch, Oriane Siméoni, Corentin Sautier, Patrick Pérez, Andrei Bursuc, Renaud Marlet
In particular, thanks to our scalable distillation method named ScaLR, we show that scaling the 2D and 3D backbones and pretraining on diverse datasets lead to a substantial improvement of the feature quality.
1 code implementation • 19 Oct 2023 • Oriane Siméoni, Éloi Zablocki, Spyros Gidaris, Gilles Puy, Patrick Pérez
We propose here a survey of unsupervised object localization methods that discover objects in images without requiring any manual annotation in the era of self-supervised ViTs.
no code implementations • 18 Jul 2023 • Spyros Gidaris, Andrei Bursuc, Oriane Siméoni, Antonin Vobecky, Nikos Komodakis, Matthieu Cord, Patrick Pérez
Self-supervised learning can be used for mitigating the greedy needs of Vision Transformer networks for very large fully-annotated datasets.
1 code implementation • CVPR 2023 • Angelika Ando, Spyros Gidaris, Andrei Bursuc, Gilles Puy, Alexandre Boulch, Renaud Marlet
(c) We refine pixel-wise predictions with a convolutional decoder and a skip connection from the convolutional stem to combine low-level but fine-grained features of the convolutional stem with the high-level but coarse predictions of the ViT encoder.
1 code implementation • 25 Jul 2022 • Huy V. Vo, Oriane Siméoni, Spyros Gidaris, Andrei Bursuc, Patrick Pérez, Jean Ponce
On COCO, using on average 10 fully-annotated images per class, or equivalently 1% of the training set, BiB also reduces the performance gap (in AP) between the weakly-supervised detector and the fully-supervised Fast RCNN by over 70%, showing a good trade-off between performance and data efficiency.
1 code implementation • CVPR 2022 • Corentin Sautier, Gilles Puy, Spyros Gidaris, Alexandre Boulch, Andrei Bursuc, Renaud Marlet
In this context, we propose a self-supervised pre-training method for 3D perception models that is tailored to autonomous driving data.
1 code implementation • 23 Mar 2022 • Ioannis Kakogeorgiou, Spyros Gidaris, Bill Psomas, Yannis Avrithis, Andrei Bursuc, Konstantinos Karantzalos, Nikos Komodakis
In this work, we argue that image token masking differs from token masking in text, due to the amount and correlation of tokens in an image.
1 code implementation • 21 Mar 2022 • Antonin Vobecky, David Hurych, Oriane Siméoni, Spyros Gidaris, Andrei Bursuc, Patrick Pérez, Josef Sivic
This work investigates learning pixel-wise semantic image segmentation in urban scenes without any manual annotation, just from the raw non-curated data collected by cars which, equipped with cameras and LiDAR sensors, drive around a city.
2 code implementations • 29 Sep 2021 • Oriane Siméoni, Gilles Puy, Huy V. Vo, Simon Roburin, Spyros Gidaris, Andrei Bursuc, Patrick Pérez, Renaud Marlet, Jean Ponce
We also show that training a class-agnostic detector on the discovered objects boosts results by another 7 points.
Ranked #4 on Weakly-Supervised Object Localization on CUB-200-2011 (Top-1 Localization Accuracy metric)
2 code implementations • CVPR 2021 • Spyros Gidaris, Andrei Bursuc, Gilles Puy, Nikos Komodakis, Matthieu Cord, Patrick Pérez
With this in mind, we propose a teacher-student scheme to learn representations by training a convolutional net to reconstruct a bag-of-visual-words (BoW) representation of an image, given as input a perturbed version of that same image.
Ranked #18 on Semi-Supervised Image Classification on ImageNet - 1% labeled data (Top 5 Accuracy metric)
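The bag-of-visual-words (BoW) target described in the entry above can be illustrated with a minimal sketch. It assumes hard nearest-word assignment of teacher features to a fixed vocabulary and a cross-entropy loss on the student's predicted word distribution; the function names (`bow_target`, `cross_entropy`), the toy vocabulary, and all numeric values are hypothetical and are not taken from the paper.

```python
import math

def bow_target(features, vocabulary):
    """Hard-assign each local feature to its nearest visual word and
    return the L1-normalized histogram of word counts."""
    counts = [0] * len(vocabulary)
    for f in features:
        # Squared Euclidean distance from this feature to every visual word.
        dists = [sum((a - b) ** 2 for a, b in zip(f, w)) for w in vocabulary]
        counts[dists.index(min(dists))] += 1
    total = sum(counts)
    return [c / total for c in counts]

def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy between a target distribution and a predicted one."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))

# Hypothetical 2-D vocabulary of three visual words.
vocabulary = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

# Hypothetical local features produced by the teacher on the clean image.
teacher_features = [(0.1, 0.0), (0.9, 0.1), (1.1, -0.1), (0.0, 0.8)]
target = bow_target(teacher_features, vocabulary)

# Hypothetical word distribution predicted by the student from a
# perturbed version of the same image.
student_prediction = [0.2, 0.6, 0.2]
loss = cross_entropy(target, student_prediction)
```

In a real system the teacher features, the vocabulary, and the student prediction would all come from trained networks; the sketch only shows how a BoW histogram can serve as a reconstruction target.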
1 code implementation • CVPR 2020 • Spyros Gidaris, Andrei Bursuc, Nikos Komodakis, Patrick Pérez, Matthieu Cord
Inspired by the success of NLP methods in this area, in this work we propose a self-supervised approach based on spatially dense image descriptions that encode discrete visual concepts, here called visual words.
1 code implementation • ECCV 2020 • Himalaya Jain, Spyros Gidaris, Nikos Komodakis, Patrick Pérez, Matthieu Cord
Knowledge distillation refers to the process of training a compact student network to achieve better accuracy by learning from a high capacity teacher network.
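As an illustration of knowledge distillation in general, the sketch below uses the widely known soft-target formulation: the student is trained to match the teacher's temperature-softened class distribution. This is an assumption chosen for illustration and is not necessarily the method of the paper listed above; the function names and logits are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Hypothetical logits for a three-class problem.
teacher_logits = [4.0, 1.0, 0.5]
student_logits = [2.0, 1.5, 0.5]
kd_loss = distillation_loss(student_logits, teacher_logits)
```

A higher temperature flattens both distributions, so the student also receives signal from the relative scores the teacher gives to non-target classes.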
1 code implementation • 27 Aug 2019 • Xi Shen, Ilaria Pastrolin, Oumayma Bounou, Spyros Gidaris, Marc Smith, Olivier Poncet, Mathieu Aubry
Historical watermark recognition is a highly practical, yet unsolved challenge for archivists and historians.
1 code implementation • ICCV 2019 • Spyros Gidaris, Andrei Bursuc, Nikos Komodakis, Patrick Pérez, Matthieu Cord
Few-shot learning and self-supervised learning address different facets of the same problem: how to train a model with little or no labeled data.
1 code implementation • CVPR 2019 • Spyros Gidaris, Nikos Komodakis
The meta-model, given as input some novel classes with few training examples per class, must properly adapt the existing recognition model into a new model that can correctly classify in a unified way both the novel and the base classes.
4 code implementations • CVPR 2018 • Spyros Gidaris, Nikos Komodakis
In this context, the goal of our work is to devise a few-shot visual learning system that, at test time, can efficiently learn novel categories from only a few training examples while not forgetting the initial categories on which it was trained (here called base categories).
Ranked #2 on Few-Shot Image Classification on ImageNet (1-shot)
3 code implementations • ECCV 2018 • George Papandreou, Tyler Zhu, Liang-Chieh Chen, Spyros Gidaris, Jonathan Tompson, Kevin Murphy
We present a box-free bottom-up approach for the tasks of pose estimation and instance segmentation of people in multi-person images using an efficient single-shot model.
Ranked #8 on Multi-Person Pose Estimation on COCO test-dev
20 code implementations • ICLR 2018 • Spyros Gidaris, Praveer Singh, Nikos Komodakis
However, in order to successfully learn those features, they usually require massive amounts of manually labeled data, which is both expensive and impractical to scale.
Ranked #126 on Self-Supervised Image Classification on ImageNet
1 code implementation • CVPR 2017 • Spyros Gidaris, Nikos Komodakis
Instead, we propose a generic architecture that decomposes the label improvement task into three steps: 1) detecting the initial label estimates that are incorrect, 2) replacing the incorrect labels with new ones, and finally 3) refining the renewed labels by predicting residual corrections w.r.t.
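The three steps named above (detect, replace, refine) can be sketched on a toy 1-D label map. The sketch assumes that the detection step yields a per-element error probability used to blend initial and replacement labels, and that refinement adds a residual to the blended result; these modeling choices, the function name `detect_replace_refine`, and all values are hypothetical. In practice each step would be a learned component, whereas here their outputs are hard-coded.

```python
def detect_replace_refine(initial, error_prob, replacement, residual):
    """Combine the three steps on a flat list of continuous labels.

    initial     -- initial label estimates
    error_prob  -- step 1: probability that each initial label is wrong
    replacement -- step 2: new labels proposed for the erroneous positions
    residual    -- step 3: corrections added to the renewed labels
    """
    # Keep the initial label where it is judged correct, swap in the
    # replacement where it is judged incorrect.
    renewed = [
        e * u + (1.0 - e) * y
        for y, e, u in zip(initial, error_prob, replacement)
    ]
    # Refine the renewed labels with residual corrections.
    return [r + d for r, d in zip(renewed, residual)]

# Hypothetical values: the middle label is a gross outlier.
initial = [1.0, 5.0, 1.2]
error_prob = [0.0, 1.0, 0.0]
replacement = [0.0, 1.1, 0.0]
residual = [0.05, -0.1, 0.0]

refined = detect_replace_refine(initial, error_prob, replacement, residual)
```

Separating detection from replacement means labels judged correct pass through unchanged, so only the erroneous positions receive entirely new values before the residual refinement is applied.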
1 code implementation • 14 Jun 2016 • Spyros Gidaris, Nikos Komodakis
We extensively evaluate our AttractioNet approach on several image datasets (i.e., COCO, PASCAL, ImageNet detection and NYU-Depth V2), reporting state-of-the-art results on all of them that surpass previous work in the field by a significant margin, and also providing strong empirical evidence that our approach is capable of generalizing to unseen categories.
1 code implementation • ICCV 2015 • Spyros Gidaris, Nikos Komodakis
We propose an object detection system that relies on a multi-region deep convolutional neural network (CNN) that also encodes semantic segmentation-aware features.
1 code implementation • CVPR 2016 • Spyros Gidaris, Nikos Komodakis
We propose a novel object localization methodology with the purpose of boosting the localization accuracy of state-of-the-art object detection systems.
1 code implementation • 7 May 2015 • Spyros Gidaris, Nikos Komodakis
We propose an object detection system that relies on a multi-region deep convolutional neural network (CNN) that also encodes semantic segmentation-aware features.