Search Results for author: Yannis Kalantidis

Found 26 papers, 14 papers with code

Leveraging MoCap Data for Human Mesh Recovery

no code implementations 18 Oct 2021 Fabien Baradel, Thibault Groueix, Philippe Weinzaepfel, Romain Brégier, Yannis Kalantidis, Grégory Rogez

In fact, we show that simply fine-tuning the batch normalization layers of the model is enough to achieve large gains.

TLDR: Twin Learning for Dimensionality Reduction

1 code implementation 18 Oct 2021 Yannis Kalantidis, Carlos Lassance, Jon Almazan, Diane Larlus

In this paper, we unify these two families of approaches from the angle of manifold learning and propose TLDR, a dimensionality reduction method for generic input spaces that is porting the simple self-supervised learning framework of Barlow Twins to a setting where it is hard or impossible to define an appropriate set of distortions by hand.

Dimensionality Reduction Representation Learning +1

Probabilistic Embeddings for Cross-Modal Retrieval

1 code implementation CVPR 2021 Sanghyuk Chun, Seong Joon Oh, Rafael Sampaio de Rezende, Yannis Kalantidis, Diane Larlus

Instead, we propose to use Probabilistic Cross-Modal Embedding (PCME), where samples from the different modalities are represented as probabilistic distributions in the common embedding space.

Cross-Modal Retrieval

Concept Generalization in Visual Representation Learning

1 code implementation ICCV 2021 Mert Bulent Sariyildiz, Yannis Kalantidis, Diane Larlus, Karteek Alahari

In this paper, we argue that the semantic relationships between seen and unseen concepts affect generalization performance and propose ImageNet-CoG, a novel benchmark on the ImageNet-21K (IN-21K) dataset that enables measuring concept generalization in a principled way.

Representation Learning Self-Supervised Learning

Hard Negative Mixing for Contrastive Learning

no code implementations NeurIPS 2020 Yannis Kalantidis, Mert Bulent Sariyildiz, Noe Pion, Philippe Weinzaepfel, Diane Larlus

Based on these observations, and motivated by the success of data mixing, we propose hard negative mixing strategies at the feature level, that can be computed on-the-fly with a minimal computational overhead.

Contrastive Learning Data Augmentation +4

Proceedings of the ICLR Workshop on Computer Vision for Agriculture (CV4A) 2020

no code implementations 23 Apr 2020 Yannis Kalantidis, Laura Sevilla-Lara, Ernest Mwebaze, Dina Machuve, Hamed Alemohammad, David Guerena

The workshop was held in conjunction with the International Conference on Learning Representations (ICLR) 2020.

Learning to Generate Grounded Visual Captions without Localization Supervision

2 code implementations 1 Jun 2019 Chih-Yao Ma, Yannis Kalantidis, Ghassan AlRegib, Peter Vajda, Marcus Rohrbach, Zsolt Kira

When automatically generating a sentence description for an image or video, it often remains unclear how well the generated caption is grounded, that is whether the model uses the correct image regions to output particular words, or if the model is hallucinating based on priors in the dataset and/or the language model.

Image Captioning Language Modelling +1

Less is More: Learning Highlight Detection from Video Duration

no code implementations CVPR 2019 Bo Xiong, Yannis Kalantidis, Deepti Ghadiyaram, Kristen Grauman

Highlight detection has the potential to significantly ease video browsing, but existing methods often suffer from expensive supervision requirements, where human viewers must manually identify highlights in training videos.

Grounded Video Description

2 code implementations CVPR 2019 Luowei Zhou, Yannis Kalantidis, Xinlei Chen, Jason J. Corso, Marcus Rohrbach

Our dataset, ActivityNet-Entities, augments the challenging ActivityNet Captions dataset with 158k bounding box annotations, each grounding a noun phrase.

Video Description

Graph-Based Global Reasoning Networks

4 code implementations CVPR 2019 Yunpeng Chen, Marcus Rohrbach, Zhicheng Yan, Shuicheng Yan, Jiashi Feng, Yannis Kalantidis

In this work, we propose a new approach for reasoning globally in which a set of features are globally aggregated over the coordinate space and then projected to an interaction space where relational reasoning can be efficiently computed.

Action Classification Action Recognition +3

Multi-Fiber Networks for Video Recognition

no code implementations ECCV 2018 Yunpeng Chen, Yannis Kalantidis, Jianshu Li, Shuicheng Yan, Jiashi Feng

In this paper, we aim to reduce the computational cost of spatio-temporal deep neural networks, making them run as fast as their 2D counterparts while preserving state-of-the-art accuracy on video recognition benchmarks.

Ranked #30 on Action Recognition on UCF101 (using extra training data)

Action Classification Action Recognition +1

Large-Scale Visual Relationship Understanding

2 code implementations 27 Apr 2018 Ji Zhang, Yannis Kalantidis, Marcus Rohrbach, Manohar Paluri, Ahmed Elgammal, Mohamed Elhoseiny

Large scale visual understanding is challenging, as it requires a model to handle the widely-spread and imbalanced distribution of <subject, relation, object> triples.

MemexQA: Visual Memex Question Answering

1 code implementation 4 Aug 2017 Lu Jiang, Junwei Liang, Liangliang Cao, Yannis Kalantidis, Sachin Farfade, Alexander Hauptmann

This paper proposes a new task, MemexQA: given a collection of photos or videos from a user, the goal is to automatically answer questions that help users recover their memory about events captured in the collection.

Memex Question Answering Question Answering +1

Tag Prediction at Flickr: a View from the Darkroom

no code implementations 6 Dec 2016 Kofi Boakye, Sachin Farfade, Hamid Izadinia, Yannis Kalantidis, Pierre Garrigues

Our results demonstrate that, for real-world datasets, training exclusively with this noisy data yields performance on par with the standard paradigm of first pre-training on clean data and then fine-tuning.


LOH and behold: Web-scale visual search, recommendation and clustering using Locally Optimized Hashing

no code implementations 21 Apr 2016 Yannis Kalantidis, Lyndon Kennedy, Huy Nguyen, Clayton Mellina, David A. Shamma

We propose a novel hashing-based matching scheme, called Locally Optimized Hashing (LOH), based on a state-of-the-art quantization algorithm that can be used for efficient, large-scale search, recommendation, clustering, and deduplication.

Distributed Computing Quantization

Visual Congruent Ads for Image Search

no code implementations 21 Apr 2016 Yannis Kalantidis, Ayman Farahat, Lyndon Kennedy, Ricardo Baeza-Yates, David A. Shamma

The quality of user experience online is affected by the relevance and placement of advertisements.

Image Retrieval

Cross-dimensional Weighting for Aggregated Deep Convolutional Features

1 code implementation 13 Dec 2015 Yannis Kalantidis, Clayton Mellina, Simon Osindero

We propose a simple and straightforward way of creating powerful image representations via cross-dimensional weighting and aggregation of deep convolutional neural network layer outputs.

Image Retrieval

Web-Scale Image Clustering Revisited

1 code implementation ICCV 2015 Yannis Avrithis, Yannis Kalantidis, Evangelos Anagnostopoulos, Ioannis Z. Emiris

Large scale duplicate detection, clustering and mining of documents or images has been conventionally treated with seed detection via hashing, followed by seed growing heuristics using fast search.

Image Clustering Quantization

Locally Optimized Product Quantization for Approximate Nearest Neighbor Search

no code implementations CVPR 2014 Yannis Kalantidis, Yannis Avrithis

We present a simple vector quantizer that combines low distortion with fast search and apply it to approximate nearest neighbor (ANN) search in high dimensional spaces.

