no code implementations • ECCV 2020 • He Zhao, Richard P. Wildes
We investigate the joint anticipation of long-term activity labels and their corresponding times with the aim of improving both the naturalness and diversity of predictions.
no code implementations • 2 Apr 2024 • Matthew Kowal, Richard P. Wildes, Konstantinos G. Derpanis
Understanding what deep network models capture in their learned representations is a fundamental challenge in computer vision.
no code implementations • 19 Mar 2024 • Filip Ilic, He Zhao, Thomas Pock, Richard P. Wildes
Global obfuscation hides privacy-sensitive regions, but also contextual regions that are important for action recognition.
no code implementations • 18 Oct 2023 • Rezaul Karim, Richard P. Wildes
In this survey, we address the above with a thorough discussion of various categories of video segmentation, a component-wise discussion of the state-of-the-art transformer-based models, and a review of related interpretability methods.
no code implementations • CVPR 2023 • Nikita Dvornik, Isma Hadji, Ran Zhang, Konstantinos G. Derpanis, Animesh Garg, Richard P. Wildes, Allan D. Jepson
This motivates the need to temporally localize the instruction steps in such videos, i.e., the task known as key-step localization.
no code implementations • CVPR 2023 • Rezaul Karim, He Zhao, Richard P. Wildes, Mennatullah Siam
In this paper, we present an end-to-end trainable unified multiscale encoder-decoder transformer that is focused on dense prediction tasks in video.
no code implementations • 3 Nov 2022 • Matthew Kowal, Mennatullah Siam, Md Amirul Islam, Neil D. B. Bruce, Richard P. Wildes, Konstantinos G. Derpanis
(ii) Some datasets that are assumed to be biased toward dynamics are actually biased toward static information.
1 code implementation • 9 Aug 2022 • Dekun Wu, He Zhao, Xingce Bao, Richard P. Wildes
In this paper, we propose a novel large-scale NBA dataset for Sports Video Analysis (NSVA) with a focus on captioning, to address the above challenges.
1 code implementation • 13 Jul 2022 • Filip Ilic, Thomas Pock, Richard P. Wildes
Presently, a methodology and corresponding dataset to isolate the effects of dynamic information in video are missing.
1 code implementation • CVPR 2022 • Matthew Kowal, Mennatullah Siam, Md Amirul Islam, Neil D. B. Bruce, Richard P. Wildes, Konstantinos G. Derpanis
To show the efficacy of our approach, we analyse two widely studied tasks, action recognition and video object segmentation.
1 code implementation • CVPR 2022 • He Zhao, Isma Hadji, Nikita Dvornik, Konstantinos G. Derpanis, Richard P. Wildes, Allan D. Jepson
Our model is based on a transformer equipped with a memory module, which maps the start and goal observations to a sequence of plausible actions.
1 code implementation • 27 Mar 2022 • Mennatullah Siam, Konstantinos G. Derpanis, Richard P. Wildes
In this paper, we present a simple but effective temporal transductive inference (TTI) approach that leverages temporal consistency in the unlabelled video frames during few-shot inference.
no code implementations • 11 Jul 2021 • He Zhao, Richard P. Wildes
Early action recognition (action prediction) from limited preliminary observations plays a critical role in streaming vision systems that demand real-time inference, as video actions often span extended temporal intervals, which causes undesired latency.
no code implementations • 11 Jul 2021 • He Zhao, Richard P. Wildes
Action prediction is a major sub-area of video predictive understanding and is the focus of this review.
no code implementations • 26 May 2021 • Soo Min Kang, Richard P. Wildes
An algorithm is developed to measure the presence of these traits in tracked objects to determine if they correspond to a biological entity in locomotion.
1 code implementation • ICCV 2021 • He Zhao, Richard P. Wildes
Goal-conditioned approaches have recently proven very useful for human trajectory prediction when adequate goal estimates are provided.
Ranked #5 on Trajectory Prediction on ETH
no code implementations • 30 Nov 2020 • Isma Hadji, Richard P. Wildes
A standard explanation of this result is that these filters reflect the structure of the images to which they have been exposed during training: natural images typically are locally composed of oriented contours at various scales, and oriented bandpass filters are matched to such structure.
no code implementations • 25 Sep 2019 • Richard P. Wildes
A standard explanation of this result is that these filters reflect the structure of the images to which they have been exposed during training: natural images typically are locally composed of oriented contours at various scales, and oriented bandpass filters are matched to such structure.
no code implementations • ECCV 2018 • Isma Hadji, Richard P. Wildes
This paper introduces a new large-scale dynamic texture dataset.
3 code implementations • 23 Mar 2018 • Isma Hadji, Richard P. Wildes
This document reviews the most prominent proposals using multilayer convolutional architectures.
no code implementations • CVPR 2018 • Christoph Feichtenhofer, Axel Pinz, Richard P. Wildes, Andrew Zisserman
In this paper, we shed light on deep spatiotemporal representations by visualizing what two-stream models have learned in order to recognize actions in video.
1 code implementation • ICCV 2017 • Isma Hadji, Richard P. Wildes
Another key aspect of the network is its recurrent nature, whereby the output of each layer of processing feeds back to the input.
1 code implementation • CVPR 2017 • Christoph Feichtenhofer, Axel Pinz, Richard P. Wildes
This paper presents a general ConvNet architecture for video action recognition based on multiplicative interactions of spacetime features.
Ranked #47 on Action Recognition on HMDB-51
1 code implementation • CVPR 2017 • Christoph Feichtenhofer, Axel Pinz, Richard P. Wildes
Finally, our temporal ResNet boosts recognition performance and establishes a new state of the art on dynamic scene recognition, as well as on the complementary task of action recognition.
1 code implementation • NeurIPS 2016 • Christoph Feichtenhofer, Axel Pinz, Richard P. Wildes
Two-stream Convolutional Networks (ConvNets) have shown strong performance for human action recognition in videos.
Ranked #48 on Action Recognition on UCF101
1 code implementation • 21 Oct 2016 • Soo Min Kang, Richard P. Wildes
In this report, a thorough review of various action recognition and detection algorithms in computer vision is provided by analyzing the two-step process of a typical such algorithm: (i) extraction and encoding of features, and (ii) classification of features into action classes.
no code implementations • CVPR 2015 • Christoph Feichtenhofer, Axel Pinz, Richard P. Wildes
By using the resulting definition of saliency during feature pooling we show that action recognition performance achieves state-of-the-art levels on three widely considered action recognition datasets.
no code implementations • CVPR 2014 • Christoph Feichtenhofer, Axel Pinz, Richard P. Wildes
This paper presents a unified bag of visual words (BoW) framework for dynamic scene recognition.