no code implementations • ICLR 2019 • Pascal Mettes, Elise van der Pol, Cees G. M. Snoek
The structure is defined by polar prototypes, points on the hypersphere of the output space.
no code implementations • 29 Nov 2024 • Wenfang Sun, Yingjun Du, Gaowen Liu, Cees G. M. Snoek
We tackle the problem of quantifying the number of objects by a generative text-to-image model.
no code implementations • 18 Nov 2024 • Piyush Bagad, Makarand Tapaswi, Cees G. M. Snoek, Andrew Zisserman
We study the connection between audio-visual observations and the underlying physics of a mundane yet intriguing everyday activity: pouring liquids.
no code implementations • 7 Nov 2024 • Jie Liu, Pan Zhou, Yingjun Du, Ah-Hwee Tan, Cees G. M. Snoek, Jan-Jakob Sonke, Efstratios Gavves
To solve this issue, we propose Cooperative Plan Optimization (CaPo) to enhance the cooperation efficiency of LLM-based embodied agents.
1 code implementation • 6 Nov 2024 • Zehao Xiao, Cees G. M. Snoek
Machine learning algorithms have achieved remarkable success across various disciplines, use cases and applications, under the prevailing assumption that training and test samples are drawn from the same distribution.
no code implementations • 26 Oct 2024 • Yingjun Du, Gaowen Liu, Yuzhang Shang, Yuguang Yao, Ramana Kompella, Cees G. M. Snoek
This paper introduces prompt diffusion, which uses a diffusion model to gradually refine the prompts to obtain a customized prompt for each sample.
1 code implementation • 20 Oct 2024 • Yingjun Du, Wenfang Sun, Cees G. M. Snoek
Pre-trained vision-language models like CLIP have adapted remarkably well to various downstream tasks.
no code implementations • 16 Oct 2024 • Aozhu Chen, Hazel Doughty, Xirong Li, Cees G. M. Snoek
We perform comprehensive experiments using four state-of-the-art models across two standard benchmarks (MSR-VTT and VATEX) and two specially curated datasets enriched with detailed descriptions (VLN-UVO and VLN-OOPS), resulting in a number of novel insights: 1) our analyses show that the current evaluation benchmarks fall short in detecting a model's ability to perceive subtle single-word differences; 2) our fine-grained evaluation highlights the difficulty models face in distinguishing such subtle variations.
no code implementations • 15 Oct 2024 • Hazel Doughty, Fida Mohammad Thoker, Cees G. M. Snoek
Furthermore, we propose verb-variation paraphrasing to increase the caption variety and learn the link between primitive motions and high-level verbs.
no code implementations • 14 Oct 2024 • Aritra Bhowmik, Mohammad Mahdi Derakhshani, Dennis Koelma, Martin R. Oswald, Yuki M. Asano, Cees G. M. Snoek
Yet, without vast amounts of spatial supervision, current Visual Language Models (VLMs) struggle at this task.
no code implementations • 13 Oct 2024 • Ivona Najdenkoska, Mohammad Mahdi Derakhshani, Yuki M. Asano, Nanne van Noord, Marcel Worring, Cees G. M. Snoek
By effectively encoding captions longer than the default 77 tokens, our model outperforms baselines on cross-modal tasks such as retrieval and text-to-image generation.
no code implementations • 10 Oct 2024 • Daniel Cores, Michael Dorkenwald, Manuel Mucientes, Cees G. M. Snoek, Yuki M. Asano
Large language models have demonstrated impressive performance when integrated with vision models even enabling video understanding.
2 code implementations • 26 Aug 2024 • Sarah Rastegar, Mohammadreza Salehi, Yuki M. Asano, Hazel Doughty, Cees G. M. Snoek
In this paper, we address Generalized Category Discovery, aiming to simultaneously uncover novel categories and accurately classify known ones.
no code implementations • 22 Jul 2024 • Mohammadreza Salehi, Michael Dorkenwald, Fida Mohammad Thoker, Efstratios Gavves, Cees G. M. Snoek, Yuki M. Asano
To tackle this, we present Sinkhorn-guided Masked Video Modelling (SIGMA), a novel video pretraining method that jointly learns the video model and a target feature space using a projection network.
1 code implementation • 17 Jul 2024 • Luc P. J. Sträter, Mohammadreza Salehi, Efstratios Gavves, Cees G. M. Snoek, Yuki M. Asano
These features are fed to an attention-based discriminator, which is trained to score every patch in the image.
Ranked #1 on Anomaly Detection on One-class CIFAR-100
no code implementations • 13 Jun 2024 • Duy-Kien Nguyen, Mahmoud Assran, Unnat Jain, Martin R. Oswald, Cees G. M. Snoek, Xinlei Chen
This work does not introduce a new method.
no code implementations • 31 Mar 2024 • Wenfang Sun, Yingjun Du, Gaowen Liu, Ramana Kompella, Cees G. M. Snoek
Additionally, we propose an assembly that merges the segmentation maps from the various subclass descriptors to ensure a more comprehensive representation of the different aspects in the test images.
1 code implementation • 18 Mar 2024 • Miltiadis Kofinas, Boris Knyazev, Yan Zhang, Yunlu Chen, Gertjan J. Burghouts, Efstratios Gavves, Cees G. M. Snoek, David W. Zhang
Neural networks that process the parameters of other neural networks find applications in domains as diverse as classifying implicit neural representations, generating neural network weights, and predicting generalization errors.
no code implementations • CVPR 2024 • Zehao Xiao, Jiayi Shen, Mohammad Mahdi Derakhshani, Shengcai Liao, Cees G. M. Snoek
To effectively encode the distribution information and their relationships, we further introduce a transformer inference network with a pseudo-shift training mechanism.
no code implementations • CVPR 2024 • Michael Dorkenwald, Nimrod Barazani, Cees G. M. Snoek, Yuki M. Asano
Vision-Language Models (VLMs), such as Flamingo and GPT-4V, have shown immense potential by integrating large language models with vision systems.
no code implementations • CVPR 2024 • Yunhua Zhang, Hazel Doughty, Cees G. M. Snoek
Low-resource settings are well-established in natural language processing, where many languages lack sufficient data for deep learning at scale.
no code implementations • 17 Dec 2023 • Vincent Tao Hu, David W Zhang, Pascal Mettes, Meng Tang, Deli Zhao, Cees G. M. Snoek
Flow Matching is an emerging generative modeling technique that offers the advantage of simple and efficient training.
no code implementations • 14 Dec 2023 • Vincent Tao Hu, Yunlu Chen, Mathilde Caron, Yuki M. Asano, Cees G. M. Snoek, Bjorn Ommer
However, recent studies have revealed that the feature representation derived from the diffusion model itself is discriminative for numerous downstream tasks as well, which prompts us to propose a framework to extract guidance from, and specifically for, diffusion models.
no code implementations • 14 Dec 2023 • Vincent Tao Hu, Wenzhe Yin, Pingchuan Ma, Yunlu Chen, Basura Fernando, Yuki M Asano, Efstratios Gavves, Pascal Mettes, Bjorn Ommer, Cees G. M. Snoek
In this paper, we propose Motion Flow Matching, a novel generative model designed for human motion generation featuring efficient sampling and effectiveness in motion editing applications.
no code implementations • 30 Nov 2023 • Aritra Bhowmik, Martin R. Oswald, Pascal Mettes, Cees G. M. Snoek
For proposal regression, we solve a simpler problem where we regress to the area of intersection between proposal and ground truth.
no code implementations • 28 Nov 2023 • Mohammad Mahdi Derakhshani, Menglin Xia, Harkirat Behl, Cees G. M. Snoek, Victor Rühle
We propose CompFuser, an image generation pipeline that enhances spatial comprehension and attribute assignment in text-to-image generative models.
1 code implementation • 23 Nov 2023 • Tao Hu, William Thong, Pascal Mettes, Cees G. M. Snoek
In this paper, we propose a visual-semantic embedding network that explicitly deals with the imbalanced scenario for activity retrieval.
no code implementations • 15 Nov 2023 • Aviv Shamsian, David W. Zhang, Aviv Navon, Yan Zhang, Miltiadis Kofinas, Idan Achituve, Riccardo Valperga, Gertjan J. Burghouts, Efstratios Gavves, Cees G. M. Snoek, Ethan Fetaya, Gal Chechik, Haggai Maron
Learning in weight spaces, where neural networks process the weights of other deep neural networks, has emerged as a promising research direction with applications in various fields, from analyzing and editing neural fields and implicit neural representations, to network pruning and quantization.
2 code implementations • NeurIPS 2023 • Sarah Rastegar, Hazel Doughty, Cees G. M. Snoek
In the quest for unveiling novel categories at test time, we confront the inherent limitations of traditional supervised recognition models that are restricted by a predefined category set.
no code implementations • 9 Oct 2023 • Duy-Kien Nguyen, Martin R. Oswald, Cees G. M. Snoek
The ability to detect objects in images at varying scales has played a pivotal role in the design of modern object detectors.
no code implementations • 30 Sep 2023 • Mohammad Mahdi Derakhshani, Ivona Najdenkoska, Cees G. M. Snoek, Marcel Worring, Yuki M. Asano
We present Self-Context Adaptation (SeCAt), a self-supervised approach that unlocks few-shot abilities for open-ended classification with small visual language models.
1 code implementation • ICCV 2023 • Mohammadreza Salehi, Efstratios Gavves, Cees G. M. Snoek, Yuki M. Asano
Our paper aims to address this gap by proposing a novel approach that incorporates temporal consistency in dense self-supervised learning.
no code implementations • 8 Jul 2023 • Sameer Ambekar, Zehao Xiao, Jiayi Shen, XianTong Zhen, Cees G. M. Snoek
We formulate the generalization at test time as a variational inference problem, by modeling pseudo labels as distributions, to consider the uncertainty during generalization and alleviate the misleading signal of inaccurate pseudo labels.
1 code implementation • 16 Jun 2023 • Shuo Chen, Yingjun Du, Pascal Mettes, Cees G. M. Snoek
This paper investigates the problem of scene graph generation in videos with the aim of capturing semantic relations between subjects and objects in the form of ⟨subject, predicate, object⟩ triplets.
no code implementations • 8 Jun 2023 • Yingjun Du, Jiayi Shen, XianTong Zhen, Cees G. M. Snoek
By learning to retain and recall the learning process of past training tasks, EMO nudges parameter updates in the right direction, even when the gradients provided by a limited number of examples are uninformative.
1 code implementation • 8 Jun 2023 • Zenglin Shi, Pascal Mettes, Cees G. M. Snoek
Where density-based counting methods typically use the point annotations only to create Gaussian-density maps, which act as the supervision signal, the starting point of this work is that point annotations have counting potential beyond density map generation.
1 code implementation • 8 Jun 2023 • Duy-Kien Nguyen, Vaibhav Aggarwal, Yanghao Li, Martin R. Oswald, Alexander Kirillov, Cees G. M. Snoek, Xinlei Chen
In this work, we explore regions as a potential visual analogue of words for self-supervised image representation learning.
1 code implementation • 17 May 2023 • Wenfang Sun, Yingjun Du, XianTong Zhen, Fan Wang, Ling Wang, Cees G. M. Snoek
To account for the uncertainty caused by the limited training tasks, we propose a variational MetaModulation where the modulation parameters are treated as latent variables.
no code implementations • ICCV 2023 • Pengwan Yang, Cees G. M. Snoek, Yuki M. Asano
In this paper we address the task of finding representative subsets of points in a 3D point cloud by means of a point-wise ordering.
no code implementations • CVPR 2023 • Yingjun Du, Jiayi Shen, XianTong Zhen, Cees G. M. Snoek
Modern image classifiers perform well on populated classes, while degrading considerably on tail classes with only a few instances.
1 code implementation • 10 Mar 2023 • Tom van Sonsbeek, Mohammad Mahdi Derakhshani, Ivona Najdenkoska, Cees G. M. Snoek, Marcel Worring
Most existing methods approach it as a multi-class classification problem, which restricts the outcome to a predefined closed-set of curated answers.
Ranked #1 on Medical Visual Question Answering on OVQA
1 code implementation • 22 Feb 2023 • Zehao Xiao, XianTong Zhen, Shengcai Liao, Cees G. M. Snoek
In this paper, we propose energy-based sample adaptation at test time for domain generalization.
1 code implementation • 30 Jan 2023 • Yan Zhang, David W. Zhang, Simon Lacoste-Julien, Gertjan J. Burghouts, Cees G. M. Snoek
Slot attention is a powerful method for object-centric modeling in images and videos.
1 code implementation • CVPR 2023 • Piyush Bagad, Makarand Tapaswi, Cees G. M. Snoek
Our work serves as a first step towards probing and instilling a sense of time in existing video-language models without the need for data and compute-intense training from scratch.
Ranked #3 on Video-Text Retrieval on Test-of-Time (using extra training data)
no code implementations • ICCV 2023 • Aritra Bhowmik, Yu Wang, Nora Baka, Martin R. Oswald, Cees G. M. Snoek
Contrary to existing methods, which learn objects and relations separately, our key idea is to learn the object-relation distribution jointly.
no code implementations • 5 Dec 2022 • Yunhua Zhang, Hazel Doughty, Cees G. M. Snoek
The main causes are the limited availability of labeled dark videos to learn from, as well as the distribution shift towards the lower color contrast at test-time.
1 code implementation • 19 Oct 2022 • Mengmeng Jing, XianTong Zhen, Jingjing Li, Cees G. M. Snoek
Our model perturbation provides a new probabilistic way for domain adaptation which enables efficient adaptation to target domains while maximally preserving knowledge in source models.
1 code implementation • CVPR 2023 • Vincent Tao Hu, David W Zhang, Yuki M. Asano, Gertjan J. Burghouts, Cees G. M. Snoek
Diffusion models have demonstrated remarkable progress in image generation quality, especially when guidance is used to control the generative process.
1 code implementation • 10 Oct 2022 • Jiayi Shen, Zehao Xiao, XianTong Zhen, Cees G. M. Snoek, Marcel Worring
To generalize to such test data, it is crucial for individual tasks to leverage knowledge from related tasks.
1 code implementation • ICCV 2023 • Mohammad Mahdi Derakhshani, Enrique Sanchez, Adrian Bulat, Victor Guilherme Turrisi da Costa, Cees G. M. Snoek, Georgios Tzimiropoulos, Brais Martinez
Our approach regularizes the prompt space, reduces overfitting to the seen prompts and improves the prompt generalization on unseen prompts.
Ranked #1 on Few-Shot Learning on food101
1 code implementation • 28 May 2022 • Hossein Mirzaei, Mohammadreza Salehi, Sajjad Shahabi, Efstratios Gavves, Cees G. M. Snoek, Mohammad Sabokrou, Mohammad Hossein Rohban
We assess the effectiveness of our method for both near-distribution and standard novelty detection through extensive experiments on datasets from diverse applications such as medical images, object classification, and quality control.
Ranked #3 on Anomaly Detection on One-class CIFAR-10 (using extra training data)
no code implementations • 19 Apr 2022 • Pengwan Yang, Yuki M. Asano, Pascal Mettes, Cees G. M. Snoek
The goal of this paper is to bypass the need for labelled examples in few-shot video understanding at run time.
1 code implementation • 12 Apr 2022 • Mohammad Mahdi Derakhshani, Ivona Najdenkoska, Tom van Sonsbeek, XianTong Zhen, Dwarikanath Mahapatra, Marcel Worring, Cees G. M. Snoek
Task and class incremental learning of diseases address the issue of classifying new samples without re-training the models from scratch, while cross-domain incremental learning addresses the issue of dealing with datasets originating from different institutions while retaining the previously obtained knowledge.
1 code implementation • CVPR 2022 • Yunhua Zhang, Hazel Doughty, Ling Shao, Cees G. M. Snoek
This paper strives for activity recognition under domain shift, for example caused by change of scenery or camera viewpoint.
1 code implementation • CVPR 2022 • Hazel Doughty, Cees G. M. Snoek
We aim to understand how actions are performed and identify subtle differences, such as 'fold firmly' vs. 'fold gently'.
1 code implementation • ICLR 2022 • Zehao Xiao, XianTong Zhen, Ling Shao, Cees G. M. Snoek
We leverage a meta-learning paradigm to train our model to adapt from single samples at training time, so that it can further adapt itself to each individual test sample at test time.
Ranked #1 on Domain Adaptation on PACS
no code implementations • 26 Dec 2021 • Mohammad Mahdi Derakhshani, XianTong Zhen, Ling Shao, Cees G. M. Snoek
Kernel continual learning by Derakhshani et al. (2021) has recently emerged as a strong continual learner due to its non-parametric ability to tackle task interference and catastrophic forgetting.
1 code implementation • ICLR 2022 • Yingjun Du, XianTong Zhen, Ling Shao, Cees G. M. Snoek
To explore and exploit the importance of different semantic levels, we further propose to learn the weights associated with the prototype at each level in a data-driven way, which enables the model to adaptively choose the most generalizable features.
1 code implementation • CVPR 2022 • Duy-Kien Nguyen, Jihong Ju, Olaf Booij, Martin R. Oswald, Cees G. M. Snoek
Specifically, we present BoxeR, short for Box Transformer, which attends to a set of boxes by predicting their transformation from a reference window on an input feature map.
1 code implementation • ICLR 2022 • Yan Zhang, David W. Zhang, Simon Lacoste-Julien, Gertjan J. Burghouts, Cees G. M. Snoek
Most set prediction models in deep learning use set-equivariant operations, but they actually operate on multisets.
1 code implementation • 27 Oct 2021 • William Thong, Cees G. M. Snoek
This paper strives to address image classifier bias, with a focus on both feature and label embedding spaces.
1 code implementation • 25 Oct 2021 • Shuo Chen, Pascal Mettes, Cees G. M. Snoek
Video relation detection forms a new and challenging problem in computer vision, where subjects and objects need to be localized spatio-temporally and a predicate label needs to be assigned if and only if there is an interaction between the two.
1 code implementation • ICCV 2021 • Shuo Chen, Zenglin Shi, Pascal Mettes, Cees G. M. Snoek
We also propose Social Fabric: an encoding that represents a pair of object tubelets as a composition of interaction primitives.
Ranked #1 on Video Visual Relation Detection on VidOR
1 code implementation • 8 Aug 2021 • Fida Mohammad Thoker, Hazel Doughty, Cees G. M. Snoek
In particular, we propose inter-skeleton contrastive learning, which learns from multiple different input skeleton representations in a cross-contrastive manner.
no code implementations • 6 Aug 2021 • Fida Mohammad Thoker, Cees G. M. Snoek
This paper strives for action recognition and detection in video modalities like RGB, depth maps or 3D-skeleton sequences when only limited modality-specific labeled examples are available.
1 code implementation • 12 Jul 2021 • Mohammad Mahdi Derakhshani, XianTong Zhen, Ling Shao, Cees G. M. Snoek
We further introduce variational random features to learn a data-driven kernel for each task.
1 code implementation • 2 Jul 2021 • Zenglin Shi, Pascal Mettes, Subhransu Maji, Cees G. M. Snoek
The deep image prior showed that a randomly initialized network with a suitable architecture can be trained to solve inverse imaging problems by simply optimizing its parameters to reconstruct a single degraded image.
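The core idea above — fitting a randomly initialized network to a single degraded signal — can be sketched on a 1-D toy problem; the network size, learning rate, and noise level below are illustrative choices of ours, not the paper's setup:

```python
import numpy as np

# Deep-image-prior sketch on a 1-D toy "image": a small randomly
# initialized network is optimized to reconstruct a single degraded
# (noisy) signal from a fixed random input.
rng = np.random.default_rng(0)
clean = np.sin(np.linspace(0, 3 * np.pi, 64))
noisy = clean + 0.3 * rng.normal(size=64)    # the single degraded image

z = rng.normal(size=(64, 8))                 # fixed random input code
W1 = 0.1 * rng.normal(size=(8, 32))          # random initialization
W2 = 0.1 * rng.normal(size=(32, 1))

mse0 = np.mean(((np.tanh(z @ W1) @ W2).ravel() - noisy) ** 2)

lr = 0.01  # illustrative learning rate
for _ in range(3000):
    h = np.tanh(z @ W1)
    out = (h @ W2).ravel()
    err = out - noisy                        # fit the degraded signal only
    g_out = (2 * err / len(err))[:, None]    # gradient of mean squared error
    g_W2 = h.T @ g_out
    g_h = g_out @ W2.T * (1 - h ** 2)        # backprop through tanh
    g_W1 = z.T @ g_h
    W1 -= lr * g_W1
    W2 -= lr * g_W2

recon = (np.tanh(z @ W1) @ W2).ravel()       # reconstruction after fitting
```

Early stopping of this optimization is what yields denoising in practice: the network fits the low-frequency structure before the noise.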
1 code implementation • 26 Jun 2021 • David W. Zhang, Gertjan J. Burghouts, Cees G. M. Snoek
We address two common scaling problems encountered in set-to-hypergraph tasks that limit the size of the input set: the exponentially growing number of hyperedges and the run-time complexity, both leading to higher memory requirements.
no code implementations • ACL 2021 • Yingjun Du, Nithin Holla, XianTong Zhen, Cees G. M. Snoek, Ekaterina Shutova
A critical challenge faced by supervised word sense disambiguation (WSD) is the lack of large annotated datasets with sufficient coverage of words in their diversity of senses.
1 code implementation • 2 Jun 2021 • Zenglin Shi, Yunlu Chen, Efstratios Gavves, Pascal Mettes, Cees G. M. Snoek
The state-of-the-art leverages deep networks to estimate the two core coefficients of the guided filter.
1 code implementation • 14 May 2021 • Haoliang Sun, Xiankai Lu, Haochen Wang, Yilong Yin, XianTong Zhen, Cees G. M. Snoek, Ling Shao
We define a global latent variable to represent the prototype of each object category, which we model as a probabilistic distribution.
1 code implementation • 9 May 2021 • Zehao Xiao, Jiayi Shen, XianTong Zhen, Ling Shao, Cees G. M. Snoek
Domain generalization is challenging due to the domain shift and the uncertainty caused by the inaccessibility of target domain data.
1 code implementation • 8 May 2021 • Yingjun Du, Haoliang Sun, XianTong Zhen, Jun Xu, Yilong Yin, Ling Shao, Cees G. M. Snoek
Specifically, we propose learning variational random features in a data-driven manner to obtain task-specific kernels by leveraging the shared knowledge provided by related tasks in a meta-learning setting.
no code implementations • ICCV 2021 • Kirill Gavrilyuk, Mihir Jain, Ilia Karmanov, Cees G. M. Snoek
With the motion model we generate pseudo-labels for a large unlabeled video collection, which enables us to transfer knowledge by learning to predict these pseudo-labels with an appearance model.
no code implementations • 23 Apr 2021 • Sander R. Klomp, Matthew van Rijn, Rob G. J. Wijnhoven, Cees G. M. Snoek, Peter H. N. de With
Our experiments investigate the suitability of anonymization methods for maintaining face detector performance, the effect of detectors overtraining on anonymization artefacts, dataset size for training an anonymizer, and the effect of training time of anonymization GANs.
1 code implementation • 10 Apr 2021 • Pascal Mettes, William Thong, Cees G. M. Snoek
This work strives for the classification and localization of human actions in videos, without the need for any labeled video training examples.
no code implementations • CVPR 2021 • Pengwan Yang, Pascal Mettes, Cees G. M. Snoek
This paper introduces the task of few-shot common action localization in time and space.
1 code implementation • CVPR 2022 • Jiaojiao Zhao, Yanyi Zhang, Xinyu Li, Hao Chen, Shuai Bing, Mingze Xu, Chunhui Liu, Kaustav Kundu, Yuanjun Xiong, Davide Modolo, Ivan Marsic, Cees G. M. Snoek, Joseph Tighe
We propose TubeR: a simple solution for spatio-temporal video action detection.
no code implementations • ICLR 2021 • Jiaojiao Zhao, Cees G. M. Snoek
Pooling is a critical operation in convolutional neural networks for increasing receptive fields and improving robustness to input variations.
1 code implementation • CVPR 2021 • Yunhua Zhang, Ling Shao, Cees G. M. Snoek
We also introduce a variant of this dataset for repetition counting under challenging vision conditions.
no code implementations • 1 Jan 2021 • Zehao Xiao, Jiayi Shen, XianTong Zhen, Ling Shao, Cees G. M. Snoek
In the probabilistic modeling framework, we introduce a domain-invariant principle to explore invariance across domains in a unified way.
no code implementations • ICLR 2021 • Yingjun Du, XianTong Zhen, Ling Shao, Cees G. M. Snoek
Batch normalization plays a crucial role when training deep neural networks.
1 code implementation • NeurIPS 2020 • XianTong Zhen, Yingjun Du, Huan Xiong, Qiang Qiu, Cees G. M. Snoek, Ling Shao
The variational semantic memory accrues and stores semantic information for the probabilistic inference of class prototypes in a hierarchical Bayesian framework.
1 code implementation • ICLR 2021 • David W. Zhang, Gertjan J. Burghouts, Cees G. M. Snoek
In this paper, we propose an alternative to training via set losses by viewing learning as conditional density estimation.
1 code implementation • 25 Aug 2020 • William Thong, Cees G. M. Snoek
We propose a bias-aware learner to map inputs to a semantic embedding space for generalized zero-shot learning.
1 code implementation • ECCV 2020 • Yunlu Chen, Vincent Tao Hu, Efstratios Gavves, Thomas Mensink, Pascal Mettes, Pengwan Yang, Cees G. M. Snoek
In this paper, we define data augmentation between point clouds as a shortest path linear interpolation.
Ranked #3 on 3D Point Cloud Data Augmentation on ModelNet40
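The shortest-path interpolation between point clouds can be sketched as follows; this is a minimal illustration in which a greedy nearest-neighbour matching stands in for the optimal assignment between clouds, and all names are ours:

```python
import numpy as np

def interpolate_clouds(a, b, lam):
    """Match points of cloud `a` one-to-one with points of cloud `b`,
    then linearly interpolate each matched pair with weight `lam`.
    Greedy nearest-neighbour matching is a stand-in for the optimal
    assignment used for the true shortest path."""
    # Pairwise squared distances between the two clouds.
    cost = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    matched = np.empty_like(a)
    free = list(range(len(b)))               # b-points not yet matched
    for i, row in enumerate(cost):
        j = free[int(np.argmin(row[free]))]  # closest unmatched partner
        free.remove(j)
        matched[i] = b[j]
    return (1 - lam) * a + lam * matched

rng = np.random.default_rng(0)
a, b = rng.normal(size=(16, 3)), rng.normal(size=(16, 3))
mid = interpolate_clouds(a, b, 0.5)          # an intermediate "virtual" cloud
```

At lam = 0 the function returns cloud `a` unchanged, and at lam = 1 a permutation of cloud `b`, so intermediate values trace a path between the two clouds.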
1 code implementation • ECCV 2020 • Pengwan Yang, Vincent Tao Hu, Pascal Mettes, Cees G. M. Snoek
The start and end of an action in a long untrimmed video are determined based on just a handful of trimmed video examples containing the same action, without knowing their common class label.
no code implementations • ECCV 2020 • Ying-Jun Du, Jun Xu, Huan Xiong, Qiang Qiu, Xian-Tong Zhen, Cees G. M. Snoek, Ling Shao
Domain generalization models learn to generalize to previously unseen domains, but suffer from prediction uncertainty and domain shift.
no code implementations • CVPR 2020 • Kirill Gavrilyuk, Ryan Sanford, Mehrsan Javan, Cees G. M. Snoek
This paper strives to recognize individual actions and group activities from videos.
1 code implementation • ECCV 2020 • Sanath Narayan, Akshita Gupta, Fahad Shahbaz Khan, Cees G. M. Snoek, Ling Shao
We propose to enforce semantic consistency at all stages of (generalized) zero-shot learning: training, feature synthesis and classification.
Ranked #2 on Generalized Zero-Shot Learning on Oxford 102 Flower
no code implementations • CVPR 2020 • Tom F. H. Runia, Kirill Gavrilyuk, Cees G. M. Snoek, Arnold W. M. Smeulders
For many of the physical phenomena around us, we have developed sophisticated models explaining their behavior.
2 code implementations • 19 Nov 2019 • William Thong, Pascal Mettes, Cees G. M. Snoek
In this paper, we make the step towards an open setting where multiple visual domains are available.
no code implementations • 17 Oct 2019 • Tom F. H. Runia, Kirill Gavrilyuk, Cees G. M. Snoek, Arnold W. M. Smeulders
Nevertheless, inferring specifics from visual observations is challenging due to the high number of causally underlying physical parameters -- including material properties and external forces.
no code implementations • Proceedings of the AAAI Conference on Artificial Intelligence 2019 • Tao Hu, Pengwan Yang, Chiliang Zhang, Gang Yu, Yadong Mu, Cees G. M. Snoek
Few-shot learning is a nascent research topic, motivated by the fact that traditional deep learning methods require tremendous amounts of data.
Ranked #1 on Few-Shot Semantic Segmentation on Pascal5i
2 code implementations • CVPR 2019 • Shuai Liao, Efstratios Gavves, Cees G. M. Snoek
We observe that many continuous output problems in computer vision are naturally contained in closed geometrical manifolds, like the Euler angles in viewpoint estimation or the normals in surface normal estimation.
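A minimal illustration of such a manifold constraint is to map a raw regression output onto the unit sphere, so the prediction is always a valid direction; the helper below is our generic sketch, and the paper's own construction differs:

```python
import numpy as np

def to_unit_sphere(raw):
    """Constrain a raw regression output to the unit n-sphere by L2
    normalization, so the prediction is a valid direction (e.g. a
    surface normal). A generic sketch of the manifold constraint,
    not the paper's method."""
    raw = np.asarray(raw, dtype=float)
    return raw / np.linalg.norm(raw)

normal = to_unit_sphere([0.3, -1.2, 0.4])    # lies exactly on the sphere
```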
no code implementations • 2 Apr 2019 • William Thong, Cees G. M. Snoek, Arnold W. M. Smeulders
These relationships enable them to cooperate for mutual benefit in image retrieval.
1 code implementation • CVPR 2019 • Jiaojiao Zhao, Cees G. M. Snoek
With only half the computation and parameters of the state-of-the-art two-stream methods, our two-in-one stream still achieves impressive results on UCF101-24, UCFSports and J-HMDB.
Ranked #1 on Action Detection on UCF Sports (Video-mAP 0.5 metric)
1 code implementation • ICCV 2019 • Zenglin Shi, Pascal Mettes, Cees G. M. Snoek
To assist both the density estimation and the focus from segmentation, we also introduce an improved kernel size estimator for the point annotations.
no code implementations • 29 Jan 2019 • Federico Landi, Cees G. M. Snoek, Rita Cucchiara
This paper strives for the detection of real-world anomalies such as burglaries and assaults in surveillance videos.
1 code implementation • NeurIPS 2019 • Pascal Mettes, Elise van der Pol, Cees G. M. Snoek
This paper introduces hyperspherical prototype networks, which unify classification and regression with prototypes on hyperspherical output spaces.
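The prototype idea can be sketched in a few lines: class prototypes are unit vectors on the hypersphere, and a sample is assigned to the prototype with the largest cosine similarity. The randomly drawn prototypes below are illustrative only; the paper positions them a priori for large pairwise separation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup: 5 classes in a 3-dimensional output space.
# Prototypes are unit vectors on the hypersphere (random here;
# the paper optimizes them for maximal separation).
prototypes = rng.normal(size=(5, 3))
prototypes /= np.linalg.norm(prototypes, axis=1, keepdims=True)

def classify(embedding):
    """Assign the class whose prototype has the largest cosine
    similarity with the embedded sample."""
    z = embedding / np.linalg.norm(embedding)
    return int(np.argmax(prototypes @ z))

label = classify(prototypes[2])  # the exact prototype maps back to class 2
```

Regression fits the same scheme by mapping target values to positions along a great circle on the sphere, which is what lets one output space serve both tasks.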
no code implementations • 27 Jan 2019 • Jiaojiao Zhao, Jungong Han, Ling Shao, Cees G. M. Snoek
We propose two ways to incorporate object semantics into the colorization model: through a pixelated semantic embedding and a pixelated semantic generator.
no code implementations • 5 Aug 2018 • Jiaojiao Zhao, Li Liu, Cees G. M. Snoek, Jungong Han, Ling Shao
While many image colorization algorithms have recently shown the capability of producing plausible color versions from gray-scale photographs, they still suffer from the problems of context confusion and edge color bleeding.
no code implementations • 18 Jul 2018 • Amir Ghodrati, Efstratios Gavves, Cees G. M. Snoek
Time-aware encoding of frame sequences in a video is a fundamental problem in video understanding.
no code implementations • 8 Jul 2018 • Pascal Mettes, Cees G. M. Snoek
Rather than disconnecting the spatio-temporal learning from the training, we propose Spatio-Temporal Instance Learning, which enables action localization directly from box proposals in video frames.
1 code implementation • 18 Jun 2018 • Tom F. H. Runia, Cees G. M. Snoek, Arnold W. M. Smeulders
Estimating visual repetition from realistic video is challenging as periodic motion is rarely perfectly static and stationary.
no code implementations • 29 May 2018 • Pascal Mettes, Cees G. M. Snoek
Experimental evaluation on three action localization datasets shows our pointly-supervised approach (i) is as effective as traditional box-supervision at a fraction of the annotation cost, (ii) is robust to sparse and noisy point annotations, (iii) benefits from pseudo-points during inference, and (iv) outperforms recent weakly-supervised alternatives.
1 code implementation • CVPR 2018 • Kirill Gavrilyuk, Amir Ghodrati, Zhenyang Li, Cees G. M. Snoek
This paper strives for pixel-level segmentation of actors and their actions in video content.
Ranked #13 on Referring Expression Segmentation on J-HMDB
no code implementations • CVPR 2018 • Tom F. H. Runia, Cees G. M. Snoek, Arnold W. M. Smeulders
We consider the problem of estimating repetition in video, such as performing push-ups, cutting a melon or playing violin.
no code implementations • 30 Jan 2018 • Spencer Cappallo, Stacey Svetlichnaya, Pierre Garrigues, Thomas Mensink, Cees G. M. Snoek
Over the past decade, emoji have emerged as a new and widespread form of digital communication, spanning diverse social networks and spoken languages.
1 code implementation • 5 Sep 2017 • Jianfeng Dong, Xirong Li, Cees G. M. Snoek
This paper strives to find amidst a set of sentences the one best describing the content of a given image or video.
no code implementations • ICCV 2017 • Pascal Mettes, Cees G. M. Snoek
Action localization and classification experiments on four contemporary action video datasets support our proposal.
no code implementations • 28 Jul 2017 • Pascal Mettes, Cees G. M. Snoek, Shih-Fu Chang
The goal of this paper is to determine the spatio-temporal location of actions in video.
no code implementations • CVPR 2017 • Zhenyang Li, Ran Tao, Efstratios Gavves, Cees G. M. Snoek, Arnold W. M. Smeulders
This paper strives to track a target object in a video.
Ranked #17 on Referring Expression Segmentation on J-HMDB
no code implementations • 6 Oct 2016 • Svetlana Kordumova, Jan C. van Gemert, Cees G. M. Snoek, Arnold W. M. Smeulders
Second, we propose translating the things syntax into abstract linguistic statements and study their descriptive effect for scene retrieval.
no code implementations • 7 Jul 2016 • Mihir Jain, Jan van Gemert, Hervé Jégou, Patrick Bouthemy, Cees G. M. Snoek
First, inspired by selective search for object proposals, we introduce an approach to generate action proposals from spatiotemporal super-voxels in an unsupervised manner; we call them Tubelets.
1 code implementation • 6 Jul 2016 • Zhenyang Li, Efstratios Gavves, Mihir Jain, Cees G. M. Snoek
We present a new architecture for end-to-end sequence learning of actions in video, which we call VideoLSTM.
no code implementations • 26 Apr 2016 • Pascal Mettes, Jan C. van Gemert, Cees G. M. Snoek
Rather than annotating boxes, we propose to annotate actions in video with points on a sparse subset of frames only.
no code implementations • 23 Apr 2016 • Jianfeng Dong, Xirong Li, Cees G. M. Snoek
This paper strives to find the sentence best describing the content of an image or video.
no code implementations • 23 Feb 2016 • Pascal Mettes, Dennis C. Koelma, Cees G. M. Snoek
To deal with the problems of over-specific classes and classes with few images, we introduce a bottom-up and top-down approach for reorganization of the ImageNet hierarchy based on all its 21,814 classes and more than 14 million images.
no code implementations • 8 Nov 2015 • Amirhossein Habibian, Thomas Mensink, Cees G. M. Snoek
In our proposed embedding, which we call VideoStory, the correlations between the terms are utilized to learn a more effective representation by optimizing a joint objective balancing descriptiveness and predictability. We show how learning the VideoStory using a multimodal predictability loss, including appearance, motion and audio features, results in a more predictable representation.
no code implementations • ICCV 2015 • Mihir Jain, Jan C. van Gemert, Thomas Mensink, Cees G. M. Snoek
Our key contribution is objects2action, a semantic word embedding that is spanned by a skip-gram model of thousands of object categories.
Ranked #24 on Zero-Shot Action Recognition on UCF101
no code implementations • 16 Oct 2015 • Pascal Mettes, Jan C. van Gemert, Cees G. M. Snoek
This work aims for image categorization using a representation of distinctive parts.
no code implementations • 10 Oct 2015 • Masoud Mazloom, Xirong Li, Cees G. M. Snoek
We consider the problem of event detection in video for scenarios where only few, or even zero examples are available for training.
no code implementations • ICCV 2015 • Efstratios Gavves, Thomas Mensink, Tatiana Tommasi, Cees G. M. Snoek, Tinne Tuytelaars
How can we reuse existing knowledge, in the form of available datasets, when solving a new and apparently unrelated target task from a set of unlabeled data?
no code implementations • CVPR 2015 • Mihir Jain, Jan C. van Gemert, Cees G. M. Snoek
This paper contributes to automatic classification and localization of human actions in video.
1 code implementation • 28 Mar 2015 • Xirong Li, Tiberio Uricchio, Lamberto Ballan, Marco Bertini, Cees G. M. Snoek, Alberto del Bimbo
Where previous reviews on content-based image retrieval emphasize on what can be seen in an image to bridge the semantic gap, this survey considers what people tag about an image.
no code implementations • CVPR 2014 • Koen E. A. van de Sande, Cees G. M. Snoek, Arnold W. M. Smeulders
Finally, by multiple codeword assignments, we achieve exact and approximate Fisher vectors with FLAIR.
no code implementations • CVPR 2014 • Mihir Jain, Jan van Gemert, Herve Jegou, Patrick Bouthemy, Cees G. M. Snoek
Our approach significantly outperforms the state-of-the-art on both datasets, while restricting the search of actions to a fraction of possible bounding box sequences.
no code implementations • CVPR 2014 • Thomas Mensink, Efstratios Gavves, Cees G. M. Snoek
In this paper we aim for zero-shot classification, that is visual recognition of an unseen class by using knowledge transfer from known classes.
no code implementations • CVPR 2014 • Ran Tao, Efstratios Gavves, Cees G. M. Snoek, Arnold W. M. Smeulders
This paper aims for generic instance search from a single example.