Search Results for author: Pascal Mettes

Found 47 papers, 26 papers with code

Find the Cliffhanger: Multi-Modal Trailerness in Soap Operas

1 code implementation • 29 Jan 2024 • Carlo Bretti, Pascal Mettes, Hendrik Vincent Koops, Daan Odijk, Nanne van Noord

Creating a trailer requires carefully picking out and piecing together brief enticing moments out of a longer video, making it a challenging and time-consuming task.

Latent Space Editing in Transformer-Based Flow Matching

no code implementations • 17 Dec 2023 • Vincent Tao Hu, David W Zhang, Pascal Mettes, Meng Tang, Deli Zhao, Cees G. M. Snoek

Flow Matching is an emerging generative modeling technique that offers the advantage of simple and efficient training.

Motion Flow Matching for Human Motion Synthesis and Editing

no code implementations • 14 Dec 2023 • Vincent Tao Hu, Wenzhe Yin, Pingchuan Ma, Yunlu Chen, Basura Fernando, Yuki M Asano, Efstratios Gavves, Pascal Mettes, Bjorn Ommer, Cees G. M. Snoek

In this paper, we propose Motion Flow Matching, a novel generative model designed for human motion generation featuring efficient sampling and effectiveness in motion editing applications.

Motion Interpolation, motion prediction +1

Revisiting Proposal-based Object Detection

no code implementations • 30 Nov 2023 • Aritra Bhowmik, Martin R. Oswald, Pascal Mettes, Cees G. M. Snoek

For proposal regression, we solve a simpler problem where we regress to the area of intersection between proposal and ground truth.

Instance Segmentation, Object +4

Query by Activity Video in the Wild

1 code implementation • 23 Nov 2023 • Tao Hu, William Thong, Pascal Mettes, Cees G. M. Snoek

In this paper, we propose a visual-semantic embedding network that explicitly deals with the imbalanced scenario for activity retrieval.


Hyperbolic Random Forests

1 code implementation • 25 Aug 2023 • Lars Doorenbos, Pablo Márquez-Neila, Raphael Sznitman, Pascal Mettes

To make hyperbolic random forests work on multi-class data and imbalanced experiments, we furthermore outline a new method for combining classes based on their lowest common ancestor and a class-balanced version of the large-margin loss.

Multi-Label Meta Weighting for Long-Tailed Dynamic Scene Graph Generation

1 code implementation • 16 Jun 2023 • Shuo Chen, Yingjun Du, Pascal Mettes, Cees G. M. Snoek

This paper investigates the problem of scene graph generation in videos with the aim of capturing semantic relations between subjects and objects in the form of ⟨subject, predicate, object⟩ triplets.

Graph Generation, Meta-Learning +1

HypLL: The Hyperbolic Learning Library

1 code implementation • 9 Jun 2023 • Max van Spengler, Philipp Wirth, Pascal Mettes

Deep learning in hyperbolic space is quickly gaining traction in the fields of machine learning, multimedia, and computer vision.

Focus for Free in Density-Based Counting

no code implementations • 8 Jun 2023 • Zenglin Shi, Pascal Mettes, Cees G. M. Snoek

Where density-based counting methods typically use the point annotations only to create Gaussian-density maps, which act as the supervision signal, the starting point of this work is that point annotations have counting potential beyond density map generation.

Infinite Class Mixup

1 code implementation • 17 May 2023 • Thomas Mensink, Pascal Mettes

To make optimisation tractable, we propose a dual-contrastive Infinite Class Mixup loss, where we contrast the classifier of a mixed pair to both the classifiers and the predicted outputs of other mixed pairs in a batch.

Hyperbolic Deep Learning in Computer Vision: A Survey

no code implementations • 11 May 2023 • Pascal Mettes, Mina Ghadimi Atigh, Martin Keller-Ressel, Jeffrey Gu, Serena Yeung

In this paper, we provide a categorization and in-depth overview of current literature on hyperbolic learning for computer vision.

Representation Learning

Poincaré ResNet

1 code implementation • 24 Mar 2023 • Max van Spengler, Erwin Berkhout, Pascal Mettes

This paper introduces an end-to-end residual network that operates entirely on the Poincaré ball model of hyperbolic space.

Poincaré ResNet

1 code implementation • ICCV 2023 • Max van Spengler, Erwin Berkhout, Pascal Mettes

(iii) Due to the many intermediate operations in Poincaré layers, the computation graphs of deep learning libraries blow up, limiting our ability to train on deep hyperbolic networks.

Multi-Task Edge Prediction in Temporally-Dynamic Video Graphs

no code implementations • 6 Dec 2022 • Osman Ülger, Julian Wiederer, Mohsen Ghafoorian, Vasileios Belagiannis, Pascal Mettes

In such temporally-dynamic graphs, a core problem is inferring the future state of spatio-temporal edges, which can constitute multiple types of relations.

Graph Attention, object-detection +2

Self-Contained Entity Discovery from Captioned Videos

1 code implementation • 13 Aug 2022 • Melika Ayoughi, Pascal Mettes, Paul Groth

This paper introduces the task of visual named entity discovery in videos without the need for task-specific supervision or task-specific external knowledge sources.

Maximum Class Separation as Inductive Bias in One Matrix

1 code implementation • 17 Jun 2022 • Tejaswi Kasarla, Gertjan J. Burghouts, Max van Spengler, Elise van der Pol, Rita Cucchiara, Pascal Mettes

This paper proposes a simple alternative: encoding maximum separation as an inductive bias in the network by adding one fixed matrix multiplication before computing the softmax activations.

Inductive Bias, Long-tail Learning +3

Less than Few: Self-Shot Video Instance Segmentation

no code implementations • 19 Apr 2022 • Pengwan Yang, Yuki M. Asano, Pascal Mettes, Cees G. M. Snoek

The goal of this paper is to bypass the need for labelled examples in few-shot video understanding at run time.

Few-Shot Learning, Instance Segmentation +5

Hyperbolic Image Segmentation

1 code implementation • CVPR 2022 • Mina Ghadimi Atigh, Julian Schoep, Erman Acar, Nanne van Noord, Pascal Mettes

For image segmentation, the current standard is to perform pixel-level optimization and inference in Euclidean output embedding spaces through linear hyperplanes.

Image Segmentation, Segmentation +1

Universal Prototype Transport for Zero-Shot Action Recognition and Localization

no code implementations • 8 Mar 2022 • Pascal Mettes

For universal object models, we outline a variant that defines target prototypes based on an optimal transport between unseen action prototypes and object prototypes.

Action Recognition, Object +4

Zero-Shot Action Recognition from Diverse Object-Scene Compositions

1 code implementation • 26 Oct 2021 • Carlo Bretti, Pascal Mettes

This paper investigates the problem of zero-shot action recognition, in the setting where no training videos with seen actions are available.

Action Recognition, Object +2

Diagnosing Errors in Video Relation Detectors

1 code implementation • 25 Oct 2021 • Shuo Chen, Pascal Mettes, Cees G. M. Snoek

Video relation detection forms a new and challenging problem in computer vision, where subjects and objects need to be localized spatio-temporally and a predicate label needs to be assigned if and only if there is an interaction between the two.

Action Localization, Object +3

Transductive Universal Transport for Zero-Shot Action Recognition

no code implementations • 29 Sep 2021 • Pascal Mettes

For universal object models, we outline a weighted transport variant from unseen action embeddings to object embeddings directly.

Action Recognition, Object +4

Social Fabric: Tubelet Compositions for Video Relation Detection

1 code implementation • ICCV 2021 • Shuo Chen, Zenglin Shi, Pascal Mettes, Cees G. M. Snoek

We also propose Social Fabric: an encoding that represents a pair of object tubelets as a composition of interaction primitives.

Object, Relation +2

Frequency-Supervised MR-to-CT Image Synthesis

1 code implementation • 19 Jul 2021 • Zenglin Shi, Pascal Mettes, Guoyan Zheng, Cees Snoek

In this paper, we find that all existing approaches share a common limitation: reconstruction breaks down in and around the high-frequency parts of CT images.

Computed Tomography (CT), Image Generation +1

On Measuring and Controlling the Spectral Bias of the Deep Image Prior

1 code implementation • 2 Jul 2021 • Zenglin Shi, Pascal Mettes, Subhransu Maji, Cees G. M. Snoek

The deep image prior showed that a randomly initialized network with a suitable architecture can be trained to solve inverse imaging problems by simply optimizing its parameters to reconstruct a single degraded image.

Denoising, Super-Resolution

Unsharp Mask Guided Filtering

1 code implementation • 2 Jun 2021 • Zenglin Shi, Yunlu Chen, Efstratios Gavves, Pascal Mettes, Cees G. M. Snoek

The state-of-the-art leverages deep networks to estimate the two core coefficients of the guided filter.


Object Priors for Classifying and Localizing Unseen Actions

1 code implementation • 10 Apr 2021 • Pascal Mettes, William Thong, Cees G. M. Snoek

This work strives for the classification and localization of human actions in videos, without the need for any labeled video training examples.

Action Classification, Action Localization +5

Localizing the Common Action Among a Few Videos

1 code implementation • ECCV 2020 • Pengwan Yang, Vincent Tao Hu, Pascal Mettes, Cees G. M. Snoek

The start and end of an action in a long untrimmed video are determined based on just a handful of trimmed video examples containing the same action, without knowing their common class label.

Action Localization

Open Cross-Domain Visual Search

2 code implementations • 19 Nov 2019 • William Thong, Pascal Mettes, Cees G. M. Snoek

In this paper, we make the step towards an open setting where multiple visual domains are available.

Domain Adaptation

4-Connected Shift Residual Networks

1 code implementation • 22 Oct 2019 • Andrew Brown, Pascal Mettes, Marcel Worring

Interestingly, when incorporating shifts to all point-wise convolutions in residual networks, 4-connected shifts outperform 8-connected shifts.

Counting with Focus for Free

1 code implementation • ICCV 2019 • Zenglin Shi, Pascal Mettes, Cees G. M. Snoek

To assist both the density estimation and the focus from segmentation, we also introduce an improved kernel size estimator for the point annotations.

Density Estimation

Hyperspherical Prototype Networks

1 code implementation • NeurIPS 2019 • Pascal Mettes, Elise van der Pol, Cees G. M. Snoek

This paper introduces hyperspherical prototype networks, which unify classification and regression with prototypes on hyperspherical output spaces.

Classification, General Classification +1

Spatio-Temporal Instance Learning: Action Tubes from Class Supervision

no code implementations • 8 Jul 2018 • Pascal Mettes, Cees G. M. Snoek

Rather than disconnecting the spatio-temporal learning from the training, we propose Spatio-Temporal Instance Learning, which enables action localization directly from box proposals in video frames.

Multiple Instance Learning, Spatio-Temporal Action Localization +1

Pointly-Supervised Action Localization

no code implementations • 29 May 2018 • Pascal Mettes, Cees G. M. Snoek

Experimental evaluation on three action localization datasets shows our pointly-supervised approach (i) is as effective as traditional box-supervision at a fraction of the annotation cost, (ii) is robust to sparse and noisy point annotations, (iii) benefits from pseudo-points during inference, and (iv) outperforms recent weakly-supervised alternatives.

Action Localization, Multiple Instance Learning +1

Localizing Actions from Video Labels and Pseudo-Annotations

no code implementations • 28 Jul 2017 • Pascal Mettes, Cees G. M. Snoek, Shih-Fu Chang

The goal of this paper is to determine the spatio-temporal location of actions in video.

Action Localization

Spot On: Action Localization from Pointly-Supervised Proposals

no code implementations • 26 Apr 2016 • Pascal Mettes, Jan C. van Gemert, Cees G. M. Snoek

Rather than annotating boxes, we propose to annotate actions in video with points on a sparse subset of frames only.

Action Localization, Multiple Instance Learning +1

The ImageNet Shuffle: Reorganized Pre-training for Video Event Detection

no code implementations • 23 Feb 2016 • Pascal Mettes, Dennis C. Koelma, Cees G. M. Snoek

To deal with the problems of over-specific classes and classes with few images, we introduce a bottom-up and top-down approach for reorganization of the ImageNet hierarchy based on all its 21,814 classes and more than 14 million images.

Event Detection, Object Recognition

Water Detection through Spatio-Temporal Invariant Descriptors

no code implementations • 2 Nov 2015 • Pascal Mettes, Robby T. Tan, Remco C. Veltkamp

Experimental evaluation on the Video Water Database and the DynTex database indicates the effectiveness of the proposed algorithm, outperforming multiple algorithms for dynamic texture recognition and material recognition by ca.

Dynamic Texture Recognition, Material Recognition
