Search Results for author: Yannis Kalantidis

Found 34 papers, 21 papers with code

Label Propagation for Zero-shot Classification with Vision-Language Models

2 code implementations • 5 Apr 2024 • Vladan Stojnić, Yannis Kalantidis, Giorgos Tolias

We leverage the graph structure of the unlabeled data and introduce ZLaP, a method based on label propagation (LP) that utilizes geodesic distances for classification.

Classification Zero-Shot Learning

271

Weatherproofing Retrieval for Localization with Generative AI and Geometric Consistency

no code implementations • 14 Feb 2024 • Yannis Kalantidis, Mert Bülent Sarıyıldız, Rafael S. Rezende, Philippe Weinzaepfel, Diane Larlus, Gabriela Csurka

After expanding the training set, we propose a training approach that leverages the specificities and the underlying geometry of this mix of real and synthetic images.

Image Retrieval Retrieval +1

Rethinking matching-based few-shot action recognition

no code implementations • 28 Mar 2023 • Juliette Bertrand, Yannis Kalantidis, Giorgos Tolias

Few-shot action recognition, i.e., recognizing new action classes given only a few examples, benefits from incorporating temporal information.

Few-Shot action recognition Few Shot Action Recognition

Fake it till you make it: Learning transferable representations from synthetic ImageNet clones

no code implementations • CVPR 2023 • Mert Bulent Sariyildiz, Karteek Alahari, Diane Larlus, Yannis Kalantidis

We show that with minimal and class-agnostic prompt engineering, ImageNet clones are able to close a large part of the gap between models produced by synthetic images and models trained with real images, for the several standard classification benchmarks that we consider in this study.

Classification Image Generation +1

Granularity-aware Adaptation for Image Retrieval over Multiple Tasks

no code implementations • 5 Oct 2022 • Jon Almazán, Byungsoo Ko, Geonmo Gu, Diane Larlus, Yannis Kalantidis

We address it with the proposed Grappa, an approach that starts from a strong pretrained model, and adapts it to tackle multiple retrieval tasks concurrently, using only unlabeled images from the different task domains.

Image Retrieval Pseudo Label +2

PoseBERT: A Generic Transformer Module for Temporal 3D Human Modeling

1 code implementation • 22 Aug 2022 • Fabien Baradel, Romain Brégier, Thibault Groueix, Philippe Weinzaepfel, Yannis Kalantidis, Grégory Rogez

It is simple, generic and versatile, as it can be plugged on top of any image-based model to transform it into a video-based model leveraging temporal information.

Pose Estimation Pose Prediction

No Reason for No Supervision: Improved Generalization in Supervised Models

1 code implementation • 30 Jun 2022 • Mert Bulent Sariyildiz, Yannis Kalantidis, Karteek Alahari, Diane Larlus

We consider the problem of training a deep neural network on a given classification task, e.g., ImageNet-1K (IN1K), so that it excels at both the training task and other (future) transfer tasks.

Data Augmentation Self-Supervised Learning +1

Learning Super-Features for Image Retrieval

1 code implementation • ICLR 2022 • Philippe Weinzaepfel, Thomas Lucas, Diane Larlus, Yannis Kalantidis

Second, they are typically trained with a global loss that only acts on top of an aggregation of local features; by contrast, testing is based on local feature matching, which creates a discrepancy between training and testing.

Ranked #3 on Image Retrieval on ROxford (Medium)

Image Retrieval Retrieval

123

TLDR: Twin Learning for Dimensionality Reduction

1 code implementation • 18 Oct 2021 • Yannis Kalantidis, Carlos Lassance, Jon Almazan, Diane Larlus

Dimensionality reduction methods are unsupervised approaches which learn low-dimensional spaces where some properties of the initial space, typically the notion of "neighborhood", are preserved.

Dimensionality Reduction Representation Learning +2

122

Leveraging MoCap Data for Human Mesh Recovery

1 code implementation • 18 Oct 2021 • Fabien Baradel, Thibault Groueix, Philippe Weinzaepfel, Romain Brégier, Yannis Kalantidis, Grégory Rogez

In fact, we show that simply fine-tuning the batch normalization layers of the model is enough to achieve large gains.

Ranked #65 on 3D Human Pose Estimation on 3DPW

3D Human Pose Estimation 3D Human Reconstruction +2

Probabilistic Embeddings for Cross-Modal Retrieval

4 code implementations • CVPR 2021 • Sanghyuk Chun, Seong Joon Oh, Rafael Sampaio de Rezende, Yannis Kalantidis, Diane Larlus

Instead, we propose to use Probabilistic Cross-Modal Embedding (PCME), where samples from the different modalities are represented as probabilistic distributions in the common embedding space.

Cross-Modal Retrieval Retrieval

119

Concept Generalization in Visual Representation Learning

1 code implementation • ICCV 2021 • Mert Bulent Sariyildiz, Yannis Kalantidis, Diane Larlus, Karteek Alahari

In this paper, we argue that the semantic relationships between seen and unseen concepts affect generalization performance and propose ImageNet-CoG, a novel benchmark on the ImageNet-21K (IN-21K) dataset that enables measuring concept generalization in a principled way.

Representation Learning Self-Supervised Learning

Hard Negative Mixing for Contrastive Learning

1 code implementation • NeurIPS 2020 • Yannis Kalantidis, Mert Bulent Sariyildiz, Noe Pion, Philippe Weinzaepfel, Diane Larlus

Based on these observations, and motivated by the success of data mixing, we propose hard negative mixing strategies at the feature level, that can be computed on-the-fly with a minimal computational overhead.

Contrastive Learning Data Augmentation +5

Proceedings of the ICLR Workshop on Computer Vision for Agriculture (CV4A) 2020

no code implementations • 23 Apr 2020 • Yannis Kalantidis, Laura Sevilla-Lara, Ernest Mwebaze, Dina Machuve, Hamed Alemohammad, David Guerena

The workshop was held in conjunction with the International Conference on Learning Representations (ICLR) 2020.

Decoupling Representation and Classifier for Long-Tailed Recognition

4 code implementations • ICLR 2020 • Bingyi Kang, Saining Xie, Marcus Rohrbach, Zhicheng Yan, Albert Gordo, Jiashi Feng, Yannis Kalantidis

The long-tail distribution of the visual world poses great challenges for deep learning based classification models on how to handle the class imbalance problem.

Ranked #3 on Long-tail learning with class descriptors on CUB-LT

Classification General Classification +3

926

Learning to Generate Grounded Visual Captions without Localization Supervision

2 code implementations • 1 Jun 2019 • Chih-Yao Ma, Yannis Kalantidis, Ghassan AlRegib, Peter Vajda, Marcus Rohrbach, Zsolt Kira

When automatically generating a sentence description for an image or video, it often remains unclear how well the generated caption is grounded, that is, whether the model uses the correct image regions to output particular words, or whether it is hallucinating based on priors in the dataset and/or the language model.

Image Captioning Language Modelling +2

157

Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution

28 code implementations • ICCV 2019 • Yunpeng Chen, Haoqi Fan, Bing Xu, Zhicheng Yan, Yannis Kalantidis, Marcus Rohrbach, Shuicheng Yan, Jiashi Feng

Similarly, the output feature maps of a convolution layer can also be seen as a mixture of information at different frequencies.

Ranked #147 on Action Classification on Kinetics-400

Action Classification Image Classification +1

2,926

Less is More: Learning Highlight Detection from Video Duration

no code implementations • CVPR 2019 • Bo Xiong, Yannis Kalantidis, Deepti Ghadiyaram, Kristen Grauman

Highlight detection has the potential to significantly ease video browsing, but existing methods often suffer from expensive supervision requirements, where human viewers must manually identify highlights in training videos.

Highlight Detection

DMC-Net: Generating Discriminative Motion Cues for Fast Compressed Video Action Recognition

no code implementations • CVPR 2019 • Zheng Shou, Xudong Lin, Yannis Kalantidis, Laura Sevilla-Lara, Marcus Rohrbach, Shih-Fu Chang, Zhicheng Yan

Motion has been shown to be useful for video understanding, where motion is typically represented by optical flow.

Ranked #1 on Action Recognition on UCF-101

Action Classification Action Recognition In Videos +3

Grounded Video Description

2 code implementations • CVPR 2019 • Luowei Zhou, Yannis Kalantidis, Xinlei Chen, Jason J. Corso, Marcus Rohrbach

Our dataset, ActivityNet-Entities, augments the challenging ActivityNet Captions dataset with 158k bounding box annotations, each grounding a noun phrase.

Sentence Video Description

319

Focal Visual-Text Attention for Memex Question Answering

1 code implementation • IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2018 • Junwei Liang, Lu Jiang, Liangliang Cao, Yannis Kalantidis, Li-Jia Li, Alexander Hauptmann

In addition to a text answer, a few grounding photos are also given to justify the answer.

Ranked #1 on Memex Question Answering on MemexQA

Memex Question Answering Question Answering +1

A^2-Nets: Double Attention Networks

2 code implementations • NeurIPS 2018 • Yunpeng Chen, Yannis Kalantidis, Jianshu Li, Shuicheng Yan, Jiashi Feng

Learning to capture long-range relations is fundamental to image/video recognition.

Action Classification Action Recognition +2

Graph-Based Global Reasoning Networks

9 code implementations • CVPR 2019 • Yunpeng Chen, Marcus Rohrbach, Zhicheng Yan, Shuicheng Yan, Jiashi Feng, Yannis Kalantidis

In this work, we propose a new approach for reasoning globally in which a set of features are globally aggregated over the coordinate space and then projected to an interaction space where relational reasoning can be efficiently computed.

Action Classification Action Recognition +4

337

$A^2$-Nets: Double Attention Networks

no code implementations • 27 Oct 2018 • Yunpeng Chen, Yannis Kalantidis, Jianshu Li, Shuicheng Yan, Jiashi Feng

Learning to capture long-range relations is fundamental to image/video recognition.

Ranked #35 on Action Recognition on UCF101

3D Absolute Human Pose Estimation Action Classification +3

Multi-Fiber Networks for Video Recognition

no code implementations • ECCV 2018 • Yunpeng Chen, Yannis Kalantidis, Jianshu Li, Shuicheng Yan, Jiashi Feng

In this paper, we aim to reduce the computational cost of spatio-temporal deep neural networks, making them run as fast as their 2D counterparts while preserving state-of-the-art accuracy on video recognition benchmarks.

Ranked #36 on Action Recognition on UCF101 (using extra training data)

Action Classification Action Recognition +1

Large-Scale Visual Relationship Understanding

2 code implementations • 27 Apr 2018 • Ji Zhang, Yannis Kalantidis, Marcus Rohrbach, Manohar Paluri, Ahmed Elgammal, Mohamed Elhoseiny

Large scale visual understanding is challenging, as it requires a model to handle the widely-spread and imbalanced distribution of <subject, relation, object> triples.

Relationship Detection

112

MemexQA: Visual Memex Question Answering

1 code implementation • 4 Aug 2017 • Lu Jiang, Junwei Liang, Liangliang Cao, Yannis Kalantidis, Sachin Farfade, Alexander Hauptmann

This paper proposes a new task, MemexQA: given a collection of photos or videos from a user, the goal is to automatically answer questions that help users recover their memory about events captured in the collection.

Memex Question Answering Question Answering +1

Tag Prediction at Flickr: a View from the Darkroom

no code implementations • 6 Dec 2016 • Kofi Boakye, Sachin Farfade, Hamid Izadinia, Yannis Kalantidis, Pierre Garrigues

Our results demonstrate that, for real-world datasets, training exclusively with this noisy data yields performance on par with the standard paradigm of first pre-training on clean data and then fine-tuning.

TAG

LOH and behold: Web-scale visual search, recommendation and clustering using Locally Optimized Hashing

no code implementations • 21 Apr 2016 • Yannis Kalantidis, Lyndon Kennedy, Huy Nguyen, Clayton Mellina, David A. Shamma

We propose a novel hashing-based matching scheme, called Locally Optimized Hashing (LOH), based on a state-of-the-art quantization algorithm that can be used for efficient, large-scale search, recommendation, clustering, and deduplication.

Clustering Distributed Computing +1

Visual Congruent Ads for Image Search

no code implementations • 21 Apr 2016 • Yannis Kalantidis, Ayman Farahat, Lyndon Kennedy, Ricardo Baeza-Yates, David A. Shamma

The quality of user experience online is affected by the relevance and placement of advertisements.

Image Retrieval

Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations

1 code implementation • 23 Feb 2016 • Ranjay Krishna, Yuke Zhu, Oliver Groth, Justin Johnson, Kenji Hata, Joshua Kravitz, Stephanie Chen, Yannis Kalantidis, Li-Jia Li, David A. Shamma, Michael S. Bernstein, Fei-Fei Li

Despite progress in perceptual tasks such as image classification, computers still perform poorly on cognitive tasks such as image description and question answering.

Image Classification Question Answering

225

Cross-dimensional Weighting for Aggregated Deep Convolutional Features

1 code implementation • 13 Dec 2015 • Yannis Kalantidis, Clayton Mellina, Simon Osindero

We propose a simple and straightforward way of creating powerful image representations via cross-dimensional weighting and aggregation of deep convolutional neural network layer outputs.

Ranked #13 on Image Retrieval on RParis (Medium)

Image Retrieval

105

Web-Scale Image Clustering Revisited

1 code implementation • ICCV 2015 • Yannis Avrithis, Yannis Kalantidis, Evangelos Anagnostopoulos, Ioannis Z. Emiris

Large scale duplicate detection, clustering and mining of documents or images has been conventionally treated with seed detection via hashing, followed by seed growing heuristics using fast search.

Clustering Image Clustering +1

Locally Optimized Product Quantization for Approximate Nearest Neighbor Search

no code implementations • CVPR 2014 • Yannis Kalantidis, Yannis Avrithis

We present a simple vector quantizer that combines low distortion with fast search and apply it to approximate nearest neighbor (ANN) search in high dimensional spaces.

Quantization
