Search Results for author: Yannis Kalantidis

Found 33 papers, 20 papers with code

Weatherproofing Retrieval for Localization with Generative AI and Geometric Consistency

no code implementations 14 Feb 2024 Yannis Kalantidis, Mert Bülent Sarıyıldız, Rafael S. Rezende, Philippe Weinzaepfel, Diane Larlus, Gabriela Csurka

After expanding the training set, we propose a training approach that leverages the specificities and the underlying geometry of this mix of real and synthetic images.

Image Retrieval Retrieval +1

Rethinking matching-based few-shot action recognition

no code implementations 28 Mar 2023 Juliette Bertrand, Yannis Kalantidis, Giorgos Tolias

Few-shot action recognition, i.e., recognizing new action classes given only a few examples, benefits from incorporating temporal information.

Few-Shot action recognition Few Shot Action Recognition

Fake it till you make it: Learning transferable representations from synthetic ImageNet clones

no code implementations CVPR 2023 Mert Bulent Sariyildiz, Karteek Alahari, Diane Larlus, Yannis Kalantidis

We show that with minimal and class-agnostic prompt engineering, ImageNet clones are able to close a large part of the gap between models produced by synthetic images and models trained with real images, for the several standard classification benchmarks that we consider in this study.

Classification Image Generation +1

Granularity-aware Adaptation for Image Retrieval over Multiple Tasks

no code implementations 5 Oct 2022 Jon Almazán, Byungsoo Ko, Geonmo Gu, Diane Larlus, Yannis Kalantidis

We address it with the proposed Grappa, an approach that starts from a strong pretrained model, and adapts it to tackle multiple retrieval tasks concurrently, using only unlabeled images from the different task domains.

Image Retrieval Pseudo Label +2

PoseBERT: A Generic Transformer Module for Temporal 3D Human Modeling

1 code implementation 22 Aug 2022 Fabien Baradel, Romain Brégier, Thibault Groueix, Philippe Weinzaepfel, Yannis Kalantidis, Grégory Rogez

It is simple, generic and versatile, as it can be plugged on top of any image-based model to transform it into a video-based model leveraging temporal information.

Pose Estimation Pose Prediction

No Reason for No Supervision: Improved Generalization in Supervised Models

1 code implementation 30 Jun 2022 Mert Bulent Sariyildiz, Yannis Kalantidis, Karteek Alahari, Diane Larlus

We consider the problem of training a deep neural network on a given classification task, e.g., ImageNet-1K (IN1K), so that it excels at both the training task and other (future) transfer tasks.

Data Augmentation Self-Supervised Learning +1

Learning Super-Features for Image Retrieval

1 code implementation ICLR 2022 Philippe Weinzaepfel, Thomas Lucas, Diane Larlus, Yannis Kalantidis

Second, they are typically trained with a global loss that only acts on top of an aggregation of local features; by contrast, testing is based on local feature matching, which creates a discrepancy between training and testing.

Image Retrieval Retrieval

TLDR: Twin Learning for Dimensionality Reduction

1 code implementation 18 Oct 2021 Yannis Kalantidis, Carlos Lassance, Jon Almazan, Diane Larlus

Dimensionality reduction methods are unsupervised approaches which learn low-dimensional spaces where some properties of the initial space, typically the notion of "neighborhood", are preserved.

Dimensionality Reduction Representation Learning +2

Probabilistic Embeddings for Cross-Modal Retrieval

3 code implementations CVPR 2021 Sanghyuk Chun, Seong Joon Oh, Rafael Sampaio de Rezende, Yannis Kalantidis, Diane Larlus

Instead, we propose to use Probabilistic Cross-Modal Embedding (PCME), where samples from the different modalities are represented as probabilistic distributions in the common embedding space.

Cross-Modal Retrieval Retrieval

Concept Generalization in Visual Representation Learning

1 code implementation ICCV 2021 Mert Bulent Sariyildiz, Yannis Kalantidis, Diane Larlus, Karteek Alahari

In this paper, we argue that the semantic relationships between seen and unseen concepts affect generalization performance and propose ImageNet-CoG, a novel benchmark on the ImageNet-21K (IN-21K) dataset that enables measuring concept generalization in a principled way.

Representation Learning Self-Supervised Learning

Hard Negative Mixing for Contrastive Learning

1 code implementation NeurIPS 2020 Yannis Kalantidis, Mert Bulent Sariyildiz, Noe Pion, Philippe Weinzaepfel, Diane Larlus

Based on these observations, and motivated by the success of data mixing, we propose hard negative mixing strategies at the feature level, that can be computed on-the-fly with a minimal computational overhead.

Contrastive Learning Data Augmentation +5

Proceedings of the ICLR Workshop on Computer Vision for Agriculture (CV4A) 2020

no code implementations 23 Apr 2020 Yannis Kalantidis, Laura Sevilla-Lara, Ernest Mwebaze, Dina Machuve, Hamed Alemohammad, David Guerena

The workshop was held in conjunction with the International Conference on Learning Representations (ICLR) 2020.

Learning to Generate Grounded Visual Captions without Localization Supervision

2 code implementations 1 Jun 2019 Chih-Yao Ma, Yannis Kalantidis, Ghassan AlRegib, Peter Vajda, Marcus Rohrbach, Zsolt Kira

When automatically generating a sentence description for an image or video, it often remains unclear how well the generated caption is grounded, that is, whether the model uses the correct image regions to output particular words, or if the model is hallucinating based on priors in the dataset and/or the language model.

Image Captioning Language Modelling +2

Less is More: Learning Highlight Detection from Video Duration

no code implementations CVPR 2019 Bo Xiong, Yannis Kalantidis, Deepti Ghadiyaram, Kristen Grauman

Highlight detection has the potential to significantly ease video browsing, but existing methods often suffer from expensive supervision requirements, where human viewers must manually identify highlights in training videos.

Highlight Detection

Grounded Video Description

2 code implementations CVPR 2019 Luowei Zhou, Yannis Kalantidis, Xinlei Chen, Jason J. Corso, Marcus Rohrbach

Our dataset, ActivityNet-Entities, augments the challenging ActivityNet Captions dataset with 158k bounding box annotations, each grounding a noun phrase.

Sentence Video Description

Graph-Based Global Reasoning Networks

9 code implementations CVPR 2019 Yunpeng Chen, Marcus Rohrbach, Zhicheng Yan, Shuicheng Yan, Jiashi Feng, Yannis Kalantidis

In this work, we propose a new approach for reasoning globally in which a set of features are globally aggregated over the coordinate space and then projected to an interaction space where relational reasoning can be efficiently computed.

Action Classification Action Recognition +4

Multi-Fiber Networks for Video Recognition

no code implementations ECCV 2018 Yunpeng Chen, Yannis Kalantidis, Jianshu Li, Shuicheng Yan, Jiashi Feng

In this paper, we aim to reduce the computational cost of spatio-temporal deep neural networks, making them run as fast as their 2D counterparts while preserving state-of-the-art accuracy on video recognition benchmarks.

Ranked #36 on Action Recognition on UCF101 (using extra training data)

Action Classification Action Recognition +1

Large-Scale Visual Relationship Understanding

2 code implementations 27 Apr 2018 Ji Zhang, Yannis Kalantidis, Marcus Rohrbach, Manohar Paluri, Ahmed Elgammal, Mohamed Elhoseiny

Large scale visual understanding is challenging, as it requires a model to handle the widely-spread and imbalanced distribution of <subject, relation, object> triples.

Relationship Detection

MemexQA: Visual Memex Question Answering

1 code implementation 4 Aug 2017 Lu Jiang, Junwei Liang, Liangliang Cao, Yannis Kalantidis, Sachin Farfade, Alexander Hauptmann

This paper proposes a new task, MemexQA: given a collection of photos or videos from a user, the goal is to automatically answer questions that help users recover their memory about events captured in the collection.

Memex Question Answering Question Answering +1

Tag Prediction at Flickr: a View from the Darkroom

no code implementations 6 Dec 2016 Kofi Boakye, Sachin Farfade, Hamid Izadinia, Yannis Kalantidis, Pierre Garrigues

Our results demonstrate that, for real-world datasets, training exclusively with this noisy data yields performance on par with the standard paradigm of first pre-training on clean data and then fine-tuning.

TAG

Visual Congruent Ads for Image Search

no code implementations 21 Apr 2016 Yannis Kalantidis, Ayman Farahat, Lyndon Kennedy, Ricardo Baeza-Yates, David A. Shamma

The quality of user experience online is affected by the relevance and placement of advertisements.

Image Retrieval

LOH and behold: Web-scale visual search, recommendation and clustering using Locally Optimized Hashing

no code implementations 21 Apr 2016 Yannis Kalantidis, Lyndon Kennedy, Huy Nguyen, Clayton Mellina, David A. Shamma

We propose a novel hashing-based matching scheme, called Locally Optimized Hashing (LOH), based on a state-of-the-art quantization algorithm that can be used for efficient, large-scale search, recommendation, clustering, and deduplication.

Clustering Distributed Computing +1

Cross-dimensional Weighting for Aggregated Deep Convolutional Features

1 code implementation 13 Dec 2015 Yannis Kalantidis, Clayton Mellina, Simon Osindero

We propose a simple and straightforward way of creating powerful image representations via cross-dimensional weighting and aggregation of deep convolutional neural network layer outputs.

Image Retrieval

Web-Scale Image Clustering Revisited

1 code implementation ICCV 2015 Yannis Avrithis, Yannis Kalantidis, Evangelos Anagnostopoulos, Ioannis Z. Emiris

Large scale duplicate detection, clustering and mining of documents or images has been conventionally treated with seed detection via hashing, followed by seed growing heuristics using fast search.

Clustering Image Clustering +1

Locally Optimized Product Quantization for Approximate Nearest Neighbor Search

no code implementations CVPR 2014 Yannis Kalantidis, Yannis Avrithis

We present a simple vector quantizer that combines low distortion with fast search and apply it to approximate nearest neighbor (ANN) search in high dimensional spaces.

Quantization
