no code implementations • 29 Jan 2024 • Ketul Shah, Robert Crandall, Jie Xu, Peng Zhou, Marian George, Mayank Bansal, Rama Chellappa
We report state-of-the-art results on the NTU-60, NTU-120 and ETRI datasets, as well as in the transfer learning setting on NUCLA, PKU-MMD-II and ROCOG-v2 datasets, demonstrating the robustness of our approach.
no code implementations • 5 Dec 2023 • Arun Reddy, William Paul, Corban Rivera, Ketul Shah, Celso M. de Melo, Rama Chellappa
In this work, we tackle the problem of unsupervised domain adaptation (UDA) for video action recognition.
no code implementations • 16 Nov 2023 • Aniket Roy, Maitreya Suin, Anshul Shah, Ketul Shah, Jiang Liu, Rama Chellappa
Diffusion models have advanced generative AI significantly in terms of editing and creating naturalistic images.
1 code implementation • CVPR 2023 • Anshul Shah, Aniket Roy, Ketul Shah, Shlok Kumar Mishra, David Jacobs, Anoop Cherian, Rama Chellappa
In this work, we propose a new contrastive learning approach to train models for skeleton-based action recognition without labels.
1 code implementation • 17 Mar 2023 • Arun V. Reddy, Ketul Shah, William Paul, Rohita Mocharla, Judy Hoffman, Kapil D. Katyal, Dinesh Manocha, Celso M. de Melo, Rama Chellappa
The dataset is composed of both real and synthetic videos from seven gesture classes, and is intended to support the study of synthetic-to-real domain shift for video-based action recognition.
1 code implementation • IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023 • Ketul Shah, Anshul Shah, Chun Pong Lau, Celso M. de Melo, Rama Chellappa
We present a supervised contrastive learning framework to learn a feature embedding robust to changes in viewpoint, by effectively leveraging multi-view data.
Ranked #11 on Action Recognition on NTU RGB+D 120
no code implementations • 11 Dec 2022 • Aniket Roy, Anshul Shah, Ketul Shah, Anirban Roy, Rama Chellappa
We generate captions from the limited training images, then use these captions to edit the training images with an image-to-image stable diffusion model, producing semantically meaningful augmentations.
1 code implementation • 7 Sep 2020 • Kamal Gupta, Susmija Jabbireddy, Ketul Shah, Abhinav Shrivastava, Matthias Zwicker
Our simple encoder-decoder framework, comprising a novel identity encoder and a class-conditional viewpoint generator, produces 3D-consistent depth maps.