Search Results for author: Swathikiran Sudhakaran

Found 15 papers, 6 papers with code

Relevance-based Margin for Contrastively-trained Video Retrieval Models

1 code implementation • 27 Apr 2022 • Alex Falcon, Swathikiran Sudhakaran, Giuseppe Serra, Sergio Escalera, Oswald Lanz

We show that even if we carefully tuned the fixed margin, our technique (which does not have the margin as a hyper-parameter) would still achieve better performance.

Multi-Instance Retrieval • Natural Language Queries +2
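
The excerpt above contrasts a fixed contrastive margin with one derived from relevance. As a minimal sketch of the idea (my own illustration, not the paper's code, with a hypothetical `scale` factor): when the "negative" caption is itself highly relevant to the video, the margin shrinks, so near-duplicates are pushed away less than unrelated samples.

```python
def fixed_margin_loss(sim_pos, sim_neg, margin=0.2):
    """Standard triplet-style contrastive loss with a fixed margin."""
    return max(0.0, margin + sim_neg - sim_pos)

def relevance_margin_loss(sim_pos, sim_neg, rel_pos, rel_neg, scale=0.5):
    """Illustrative relevance-based margin: the margin is proportional to
    how much MORE relevant the positive is than the negative, so a
    semantically close 'negative' incurs a smaller margin."""
    margin = scale * max(0.0, rel_pos - rel_neg)
    return max(0.0, margin + sim_neg - sim_pos)

# A near-duplicate negative (high relevance) yields a smaller loss
# than an unrelated one at the same similarity scores.
loss_related = relevance_margin_loss(0.8, 0.7, rel_pos=1.0, rel_neg=0.9)
loss_unrelated = relevance_margin_loss(0.8, 0.7, rel_pos=1.0, rel_neg=0.0)
```

Note how the margin stops being a single tuned hyper-parameter and instead adapts per triplet, which is the property the abstract highlights.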

Gate-Shift-Fuse for Video Action Recognition

1 code implementation • 16 Mar 2022 • Swathikiran Sudhakaran, Sergio Escalera, Oswald Lanz

3D kernel factorization approaches have been proposed to reduce the complexity of 3D CNNs.

Ranked #17 on Action Recognition on EPIC-KITCHENS-100 (using extra training data)

Action Recognition • Temporal Action Localization +1
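
The abstract excerpt mentions 3D kernel factorization as a way to reduce 3D CNN complexity. A quick parameter-count sketch of a generic (2+1)D-style factorization (spatial 2D kernel followed by a temporal 1D kernel; this illustrates factorization in general, not the Gate-Shift-Fuse module itself):

```python
def conv3d_params(c_in, c_out, kt, kh, kw):
    # Parameter count of a dense 3D convolution (bias ignored).
    return c_in * c_out * kt * kh * kw

def factorized_params(c_in, c_out, kt, kh, kw):
    # (2+1)D-style factorization: a 1 x kh x kw spatial convolution
    # followed by a kt x 1 x 1 temporal convolution.
    spatial = c_in * c_out * kh * kw
    temporal = c_out * c_out * kt
    return spatial + temporal

full = conv3d_params(64, 64, 3, 3, 3)       # 110592 parameters
split = factorized_params(64, 64, 3, 3, 3)  # 49152 parameters
```

The factorized form here uses less than half the parameters of the dense 3x3x3 kernel at these channel widths, which is the complexity reduction the excerpt alludes to.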

Space-time Mixing Attention for Video Transformer

1 code implementation • NeurIPS 2021 • Adrian Bulat, Juan-Manuel Perez-Rua, Swathikiran Sudhakaran, Brais Martinez, Georgios Tzimiropoulos

In this work, we propose a Video Transformer model the complexity of which scales linearly with the number of frames in the video sequence and hence induces no overhead compared to an image-based Transformer model.

Action Classification • Action Recognition In Videos +1
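
The linear-in-frames claim above can be illustrated by counting attention pairs. The paper's exact space-time mixing mechanism differs; this sketch only shows why restricting each token's attention to a small temporal window (here a hypothetical window of +/- `w` frames) makes cost grow linearly with the number of frames T instead of quadratically:

```python
def full_attention_pairs(T, S):
    # Joint space-time attention over T frames of S tokens each:
    # every token attends to every token -> (T*S)^2 pairs.
    n = T * S
    return n * n

def local_attention_pairs(T, S, w=1):
    # Each frame's S tokens attend only to tokens within a temporal
    # window of +/- w frames -> cost grows linearly with T.
    pairs = 0
    for t in range(T):
        window = sum(1 for u in range(T) if abs(u - t) <= w)
        pairs += S * window * S
    return pairs

full_8 = full_attention_pairs(8, 196)    # quadruples when T doubles
local_8 = local_attention_pairs(8, 196)  # roughly doubles when T doubles
```

Doubling T from 8 to 16 quadruples the full-attention count but only about doubles the windowed one, matching the "no overhead compared to an image-based Transformer" scaling argument.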

Learning to Recognize Actions on Objects in Egocentric Video with Attention Dictionaries

no code implementations • 16 Feb 2021 • Swathikiran Sudhakaran, Sergio Escalera, Oswald Lanz

We present EgoACO, a deep neural architecture for video action recognition that learns to pool action-context-object descriptors from frame level features by leveraging the verb-noun structure of action labels in egocentric video datasets.

Action Recognition • Object +1

FBK-HUPBA Submission to the EPIC-Kitchens Action Recognition 2020 Challenge

no code implementations • 24 Jun 2020 • Swathikiran Sudhakaran, Sergio Escalera, Oswald Lanz

In this report we describe the technical details of our submission to the EPIC-Kitchens Action Recognition 2020 Challenge.

Action Recognition

Gate-Shift Networks for Video Action Recognition

2 code implementations • CVPR 2020 • Swathikiran Sudhakaran, Sergio Escalera, Oswald Lanz

Deep 3D CNNs for video action recognition are designed to learn powerful representations in the joint spatio-temporal feature space.

Ranked #26 on Action Recognition on Something-Something V1 (using extra training data)

Action Recognition
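
Gate-Shift modules build on the idea of exchanging features between neighbouring frames inside a 2D CNN. A simplified numpy sketch in the spirit of a temporal channel shift (the parameter-free TSM-style operation; the actual Gate-Shift module additionally learns a spatial gate that decides what gets shifted, which is omitted here):

```python
import numpy as np

def temporal_shift(x):
    """Shift a quarter of the channels forward in time and a quarter
    backward, mixing information across neighbouring frames at zero
    parameter cost. x has shape (T, C, H, W)."""
    out = np.zeros_like(x)
    fold = x.shape[1] // 4
    out[1:, :fold] = x[:-1, :fold]              # take features from frame t-1
    out[:-1, fold:2 * fold] = x[1:, fold:2 * fold]  # take features from frame t+1
    out[:, 2 * fold:] = x[:, 2 * fold:]         # remaining channels untouched
    return out

x = np.random.randn(8, 16, 14, 14)  # (frames, channels, H, W)
y = temporal_shift(x)
```

Inserting such an operation into a 2D CNN gives it a temporal receptive field without any 3D convolution parameters, which is the design space this paper operates in.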

An Analysis of Deep Neural Networks with Attention for Action Recognition from a Neurophysiological Perspective

no code implementations • 2 Jul 2019 • Swathikiran Sudhakaran, Oswald Lanz

We review three recent deep learning based methods for action recognition and present a brief comparative analysis of the methods from a neurophysiological point of view.

Action Recognition

FBK-HUPBA Submission to the EPIC-Kitchens 2019 Action Recognition Challenge

no code implementations • 21 Jun 2019 • Swathikiran Sudhakaran, Sergio Escalera, Oswald Lanz

In this report we describe the technical details of our submission to the EPIC-Kitchens 2019 action recognition challenge.

Action Recognition

Hierarchical Feature Aggregation Networks for Video Action Recognition

no code implementations • 29 May 2019 • Swathikiran Sudhakaran, Sergio Escalera, Oswald Lanz

Most action recognition methods are based on either a) late aggregation of frame-level CNN features using average pooling, max pooling, or an RNN, among others, or b) spatio-temporal aggregation via 3D convolutions.

Ranked #51 on Action Recognition on HMDB-51 (using extra training data)

Action Recognition • Temporal Action Localization
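
The late-aggregation baseline described in the excerpt (option a) is straightforward to sketch: per-frame CNN features are collapsed into a single clip descriptor by a temporal pooling operation. A minimal illustration, assuming precomputed frame features:

```python
import numpy as np

def late_aggregate(frame_feats, mode="avg"):
    """Collapse per-frame CNN features of shape (T, D) into one
    fixed-size clip descriptor of shape (D,) by late temporal pooling."""
    if mode == "avg":
        return frame_feats.mean(axis=0)
    if mode == "max":
        return frame_feats.max(axis=0)
    raise ValueError(f"unknown pooling mode: {mode}")

feats = np.random.randn(16, 512)  # 16 frames, 512-d feature each
clip = late_aggregate(feats)      # order-invariant clip descriptor
```

Note that average and max pooling are invariant to frame order, which is exactly the limitation that hierarchical or spatio-temporal aggregation schemes aim to address.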

Top-down Attention Recurrent VLAD Encoding for Action Recognition in Videos

no code implementations • 29 Aug 2018 • Swathikiran Sudhakaran, Oswald Lanz

Most recent approaches for action recognition from video leverage deep architectures to encode the video clip into a fixed length representation vector that is then used for classification.

Action Recognition In Videos • General Classification +2
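
The fixed-length encoding mentioned in the excerpt is, per the title, a VLAD-style one. A minimal VLAD sketch (standard textbook formulation; the paper's recurrent, attention-weighted variant differs): local descriptors are assigned to their nearest cluster center, residuals are accumulated per center, and the result is flattened and L2-normalized, giving a vector whose length is independent of the number of frames.

```python
import numpy as np

def vlad_encode(descriptors, centers):
    """Minimal VLAD encoding: (N, D) local descriptors + (K, D) cluster
    centers -> L2-normalized fixed-length vector of size K * D."""
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    assign = d2.argmin(axis=1)          # nearest center per descriptor
    v = np.zeros_like(centers)
    for i, k in enumerate(assign):
        v[k] += descriptors[i] - centers[k]  # accumulate residuals
    v = v.ravel()
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

desc = np.random.randn(100, 32)  # e.g. 100 frame-level descriptors
cent = np.random.randn(8, 32)    # 8 cluster centers
code = vlad_encode(desc, cent)   # always 8 * 32 = 256-dimensional
```

Whether the clip has 10 frames or 1000, the output stays 256-dimensional, which is what makes it usable as the classification input the excerpt describes.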

Learning to Detect Violent Videos using Convolutional Long Short-Term Memory

no code implementations • 19 Sep 2017 • Swathikiran Sudhakaran, Oswald Lanz

A convolutional neural network is used to extract frame level features from a video.

Convolutional Long Short-Term Memory Networks for Recognizing First Person Interactions

no code implementations • 19 Sep 2017 • Swathikiran Sudhakaran, Oswald Lanz

The proposed approach uses a pair of convolutional neural networks, whose parameters are shared, for extracting frame level features from successive frames of the video.
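
The key property in the excerpt is weight sharing: the same network parameters embed both frames, so their features live in a common space and can be compared. A toy stand-in, using a single dense projection in place of the shared CNN (illustrative only; the paper uses convolutional networks):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((512, 128))  # ONE shared parameter set ("CNN")

def extract(frame_feat, W):
    """Stand-in for the shared network: the same W processes every
    frame, so successive frames are embedded consistently."""
    return np.tanh(frame_feat @ W)

f_t = rng.standard_normal(512)       # frame t
f_t1 = rng.standard_normal(512)      # frame t+1
e_t, e_t1 = extract(f_t, W), extract(f_t1, W)
motion = e_t1 - e_t                  # change between successive embeddings
```

Because both branches reuse `W`, identical frames map to identical embeddings, and the difference of embeddings can carry motion information to downstream recurrent layers.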
