Online Spatiotemporal Action Detection and Prediction via Causal Representations

1 code implementation 31 Aug 2020 Gurkirt Singh

In this thesis, we focus on video action understanding problems from an online and real-time processing point of view.

Action Detection Action Recognition +3

Two-Stream AMTnet for Action Detection

1 code implementation 3 Apr 2020 Suman Saha, Gurkirt Singh, Fabio Cuzzolin

In this paper, we propose a new deep neural network architecture for online action detection, termed Two-Stream AMTnet. This is achieved by augmenting the previous Action Micro-Tube (AMTnet) action detection framework in three distinct ways: (1) by adding a parallel motion stream to the original appearance one in AMTnet; (2) in opposition to state-of-the-art action detectors, which train appearance and motion streams separately and use a test-time late fusion scheme to fuse RGB and flow cues, by jointly training both streams in an end-to-end fashion and merging RGB and optical flow features at training time; (3) by introducing an online action tube generation algorithm which works at video level and in real time (when exploiting only appearance features).

Action Detection Autonomous Driving +2

End-to-End Video Captioning

no code implementations 4 Apr 2019 Silvio Olivastri, Gurkirt Singh, Fabio Cuzzolin

The decoder is then optimised on such static features to generate the video's description.

Action Recognition Machine Translation +3

Recurrent Convolutions for Causal 3D CNNs

no code implementations 17 Nov 2018 Gurkirt Singh, Fabio Cuzzolin

Recently, three dimensional (3D) convolutional neural networks (CNNs) have emerged as dominant methods to capture spatiotemporal representations in videos, by adding to pre-existing 2D CNNs a third, temporal dimension.

Action Detection

Predicting Action Tubes

no code implementations 23 Aug 2018 Gurkirt Singh, Suman Saha, Fabio Cuzzolin

In this work, we present a method to predict an entire 'action tube' (a set of temporally linked bounding boxes) in a trimmed video just by observing a smaller subset of it.

Action Classification Action Detection +1

TraMNet - Transition Matrix Network for Efficient Action Tube Proposals

1 code implementation 1 Aug 2018 Gurkirt Singh, Suman Saha, Fabio Cuzzolin

At training time, transitions are specific to cell locations of the feature maps, so that a sparse (efficient) transition matrix is used to train the network.

Action Detection from a Robot-Car Perspective

no code implementations 30 Jul 2018 Valentina Fontana, Gurkirt Singh, Stephen Akrigg, Manuele Di Maio, Suman Saha, Fabio Cuzzolin

We present the new Road Event and Activity Detection (READ) dataset, designed and created from an autonomous vehicle perspective to take action detection challenges to autonomous driving.

Action Detection Activity Detection +2

Spatio-temporal Human Action Localisation and Instance Segmentation in Temporally Untrimmed Videos

no code implementations 22 Jul 2017 Suman Saha, Gurkirt Singh, Michael Sapienza, Philip H. S. Torr, Fabio Cuzzolin

Current state-of-the-art human action recognition is focused on the classification of temporally trimmed videos in which only one action occurs per frame.

Action Recognition Instance Segmentation +1

AMTnet: Action-Micro-Tube Regression by End-to-end Trainable Deep Architecture

no code implementations ICCV 2017 Suman Saha, Gurkirt Singh, Fabio Cuzzolin

As such, our 3D-RPN net is able to effectively encode the temporal aspect of actions by purely exploiting appearance, as opposed to methods which heavily rely on expensive flow maps.

Action Detection Region Proposal

Incremental Tube Construction for Human Action Detection

1 code implementation 5 Apr 2017 Harkirat Singh Behl, Michael Sapienza, Gurkirt Singh, Suman Saha, Fabio Cuzzolin, Philip H. S. Torr

In this work, we introduce a real-time and online joint-labelling and association algorithm for action detection that can incrementally construct space-time action tubes on the most challenging action videos in which different action categories occur concurrently.
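
The idea of growing tubes incrementally, one frame at a time, can be illustrated with a minimal sketch. This is not the paper's joint-labelling and association algorithm: it swaps in a much simpler greedy, per-label IoU matching, and the names `Tube`, `update_tubes`, and the threshold `min_iou` are hypothetical choices made for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


@dataclass
class Tube:
    """Hypothetical container: one action label plus (frame, box) links."""
    label: str
    boxes: List[Tuple[int, Box]] = field(default_factory=list)


def update_tubes(tubes: List[Tube], frame_idx: int,
                 detections: List[Tuple[Box, str]],
                 min_iou: float = 0.3) -> List[Tube]:
    """Extend existing tubes with the current frame's detections (greedy)."""
    used = set()
    for tube in tubes:
        _, last_box = tube.boxes[-1]
        best_j, best_overlap = None, min_iou
        for j, (box, label) in enumerate(detections):
            if j in used or label != tube.label:
                continue
            overlap = iou(last_box, box)
            if overlap >= best_overlap:
                best_j, best_overlap = j, overlap
        if best_j is not None:
            tube.boxes.append((frame_idx, detections[best_j][0]))
            used.add(best_j)
    # Any detection left unmatched starts a new tube.
    for j, (box, label) in enumerate(detections):
        if j not in used:
            tubes.append(Tube(label, [(frame_idx, box)]))
    return tubes
```

Because each call only looks at the tubes built so far and the current frame, the sketch is online in the sense used above; concurrent actions of different categories are kept apart by the per-label matching.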

Action Detection Human robot interaction

Online Real-time Multiple Spatiotemporal Action Localisation and Prediction

4 code implementations ICCV 2017 Gurkirt Singh, Suman Saha, Michael Sapienza, Philip Torr, Fabio Cuzzolin

To the best of our knowledge, ours is the first real-time (up to 40fps) system able to perform online S/T action localisation and early action prediction on the untrimmed videos of UCF101-24.

Deep Learning for Detecting Multiple Space-Time Action Tubes in Videos

no code implementations 4 Aug 2016 Suman Saha, Gurkirt Singh, Michael Sapienza, Philip H. S. Torr, Fabio Cuzzolin

In stage 2, the appearance network detections are boosted by combining them with the motion detection scores, in proportion to their respective spatial overlap.
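
One way to read this boosting step is sketched below. It is an illustrative assumption, not the paper's exact formula: each appearance score is increased by the score of the best-overlapping motion detection, weighted by their IoU. The function names `iou` and `boost_scores` and the additive form are hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def boost_scores(app_boxes: Sequence[Box], app_scores: Sequence[float],
                 mot_boxes: Sequence[Box], mot_scores: Sequence[float]) -> List[float]:
    """Assumed rule: boosted = appearance score + IoU * motion score,
    using the motion detection with the highest overlap."""
    boosted = []
    for box, score in zip(app_boxes, app_scores):
        best_overlap, best_motion = 0.0, 0.0
        for m_box, m_score in zip(mot_boxes, mot_scores):
            overlap = iou(box, m_box)
            if overlap > best_overlap:
                best_overlap, best_motion = overlap, m_score
        boosted.append(score + best_overlap * best_motion)
    return boosted
```

Under this assumed rule, an appearance detection with no overlapping motion detection keeps its original score, while a perfectly overlapping one gains the full motion score.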

Action Detection Motion Detection +1

Untrimmed Video Classification for Activity Detection: submission to ActivityNet Challenge

1 code implementation 7 Jul 2016 Gurkirt Singh, Fabio Cuzzolin

We propose a simple, yet effective, method for the temporal detection of activities in temporally untrimmed videos with the help of untrimmed classification.

Action Detection Activity Detection +4

