1 code implementation • ECCV 2020 • Kara Marie Schatz, Erik Quintanilla, Shruti Vyas, Yogesh S Rawat
The transformed action is integrated with the target appearance using the proposed recurrent transformer network, which provides a transformed appearance for each time-step in the action sequence.
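The per-time-step fusion described above can be sketched as follows. This is purely an illustration: the recurrent update, weight names, and dimensions are assumptions for the sketch, not the paper's recurrent transformer architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def fuse_action_appearance(action_seq, appearance, Wh, Wx, Wo):
    """Hypothetical recurrent fusion: each action code updates a hidden
    state initialized from the target appearance, and the hidden state is
    projected to a transformed appearance for that time-step."""
    h = appearance.copy()                 # state starts as target appearance
    outputs = []
    for a_t in action_seq:                # one action code per time-step
        h = np.tanh(Wh @ h + Wx @ a_t)    # simple recurrent update (assumed)
        outputs.append(Wo @ h)            # transformed appearance at step t
    return np.stack(outputs)

d, T = 8, 5                               # toy feature dim and sequence length
action_seq = rng.normal(size=(T, d))
appearance = rng.normal(size=d)
Wh, Wx, Wo = (0.1 * rng.normal(size=(d, d)) for _ in range(3))
out = fuse_action_appearance(action_seq, appearance, Wh, Wx, Wo)
print(out.shape)  # one d-dimensional transformed appearance per time-step
```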
1 code implementation • ECCV 2020 • Shruti Vyas, Yogesh S Rawat, Mubarak Shah
We evaluate the effectiveness of the learned representation for multi-view video action recognition in a supervised setting.
no code implementations • 12 Dec 2023 • Ayush Singh, Aayush J Rana, Akash Kumar, Shruti Vyas, Yogesh Singh Rawat
First, we demonstrate its effectiveness on video action detection where the proposed approach outperforms prior works in semi-supervised and weakly-supervised learning along with several baseline approaches in both UCF101-24 and JHMDB-21.
no code implementations • CVPR 2023 • Madeline Chantry Schiappa, Naman Biyani, Prudvi Kamtam, Shruti Vyas, Hamid Palangi, Vibhav Vineet, Yogesh S. Rawat
In this work, we perform a large-scale robustness analysis of these existing models for video action recognition.
1 code implementation • 6 Jul 2022 • Shruti Vyas, Chen Chen, Mubarak Shah
There are no existing datasets for this problem; we therefore propose the GAMa dataset, a large-scale dataset of ground videos and corresponding aerial images.
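Ground-to-aerial geo-localization of this kind is commonly cast as cross-view retrieval: embed the ground video and all aerial images, then return the nearest aerial match. The sketch below illustrates only that retrieval step; the embedding model and cosine-similarity choice are assumptions, not GAMa's actual pipeline.

```python
import numpy as np

def localize(ground_emb, aerial_embs):
    """Return the index of the aerial image whose embedding is most
    similar (by cosine similarity) to the ground-video embedding."""
    g = ground_emb / np.linalg.norm(ground_emb)
    a = aerial_embs / np.linalg.norm(aerial_embs, axis=1, keepdims=True)
    return int(np.argmax(a @ g))

rng = np.random.default_rng(1)
aerial_embs = rng.normal(size=(10, 32))   # toy gallery of aerial embeddings
ground_emb = aerial_embs[3] + 0.01 * rng.normal(size=32)  # near-copy of entry 3
match = localize(ground_emb, aerial_embs)
print(match)  # index of the best-matching aerial image
```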
1 code implementation • 5 Jul 2022 • Madeline C. Schiappa, Shruti Vyas, Hamid Palangi, Yogesh S. Rawat, Vibhav Vineet
Joint visual and language modeling on large-scale datasets has recently shown good progress on multi-modal tasks when compared to single-modal learning.
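A common objective in such joint visual-language modeling is a CLIP-style symmetric contrastive loss over paired video and text embeddings. The sketch below shows one standard formulation of that loss; it is an assumed example, not necessarily the objective used by the models studied in this work.

```python
import numpy as np

def clip_style_loss(video_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss: matching video/text pairs sit on the
    diagonal of the similarity matrix and are pushed to dominate their
    row and column."""
    v = video_emb / np.linalg.norm(video_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = v @ t.T / temperature        # pairwise cosine similarities
    idx = np.arange(len(v))               # matched pairs on the diagonal

    def xent(lg):
        lg = lg - lg.max(axis=1, keepdims=True)          # stabilize softmax
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -logp[idx, idx].mean()     # NLL of the correct pairing

    return (xent(logits) + xent(logits.T)) / 2

rng = np.random.default_rng(0)
v = rng.normal(size=(4, 16))              # toy batch of video embeddings
t = rng.normal(size=(4, 16))              # toy batch of text embeddings
loss = clip_style_loss(v, t)
print(float(loss))
```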
no code implementations • 17 Apr 2022 • Rajat Modi, Aayush Jung Rana, Akash Kumar, Praveen Tirupattur, Shruti Vyas, Yogesh Singh Rawat, Mubarak Shah
Beyond being large enough to feed data-hungry machines (e.g., transformers), what attributes measure the quality of a dataset?
1 code implementation • 21 Oct 2021 • Naman Biyani, Aayush J Rana, Shruti Vyas, Yogesh S Rawat
We present LARNet, a novel end-to-end approach for generating human action videos.
no code implementations • 15 Oct 2021 • Xianhang Li, Junhao Zhang, Kunchang Li, Shruti Vyas, Yogesh S Rawat
We focus on the problem of novel-view human action synthesis.
1 code implementation • 13 Oct 2021 • Mohit Sharma, Raj Patra, Harshal Desai, Shruti Vyas, Yogesh Rawat, Rajiv Ratn Shah
We present this as a benchmark dataset in noisy learning for video understanding.
1 code implementation • 24 Jul 2021 • Praveen Tirupattur, Aayush J Rana, Tushar Sangam, Shruti Vyas, Yogesh S Rawat, Mubarak Shah
While various approaches in recent works have proven effective for the recognition task, they often do not handle lower-resolution videos where the action occupies only a tiny region.
no code implementations • 7 Jun 2021 • Sarah Shiraz, Krishna Regmi, Shruti Vyas, Yogesh S. Rawat, Mubarak Shah
We address the problem of novel view video prediction: given a set of input video clips from one or more views, our network predicts the video from a novel view.
no code implementations • 1 Sep 2020 • Yogesh S Rawat, Shruti Vyas
The research in view-invariant action recognition addresses this problem and focuses on recognizing human actions from unseen viewpoints.
no code implementations • 26 Nov 2018 • Shruti Vyas, Yogesh S Rawat, Mubarak Shah
We demonstrate the effectiveness of the proposed method in rendering view-aware as well as time-aware video clips on two real-world datasets, UCF-101 and NTU RGB+D.