Search Results for author: Hans Peter Graf

Found 8 papers, 2 papers with code

COMPOSER: Compositional Reasoning of Group Activity in Videos with Keypoint-Only Modality

no code implementations11 Dec 2021 Honglu Zhou, Asim Kadav, Aviv Shamsian, Shijie Geng, Farley Lai, Long Zhao, Ting Liu, Mubbasir Kapadia, Hans Peter Graf

Group Activity Recognition detects the activity collectively performed by a group of actors, which requires compositional reasoning of actors and objects.

Group Activity Recognition Relational Reasoning

Hopper: Multi-hop Transformer for Spatiotemporal Reasoning

1 code implementation ICLR 2021 Honglu Zhou, Asim Kadav, Farley Lai, Alexandru Niculescu-Mizil, Martin Renqiang Min, Mubbasir Kapadia, Hans Peter Graf

We evaluate on the CATER dataset and find that Hopper achieves 73.2% Top-1 accuracy using just 1 FPS by hopping through just a few critical frames.

S3VAE: Self-Supervised Sequential VAE for Representation Disentanglement and Data Generation

no code implementations CVPR 2020 Yizhe Zhu, Martin Renqiang Min, Asim Kadav, Hans Peter Graf

We propose a sequential variational autoencoder to learn disentangled representations of sequential data (e.g., videos and audio) under self-supervision.


15 Keypoints Is All You Need

no code implementations CVPR 2020 Michael Snower, Asim Kadav, Farley Lai, Hans Peter Graf

Keypoints are tracked using our Pose Entailment method, in which a pair of pose estimates is first sampled from different frames in a video and tokenized.

Optical Flow Estimation Pose Estimation +1
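The pairwise sampling-and-tokenization step described above can be sketched roughly as follows. This is an illustrative assumption, not the paper's actual design: the coordinate binning scheme, special tokens, and sequence layout are all hypothetical choices made here to show the general idea of turning two poses into one discrete token sequence.

```python
import random

# Hypothetical sketch: turn a pair of 15-keypoint poses sampled from two
# different frames into a single token sequence, as a transformer-style
# model might consume it. Bin count and special tokens are assumptions.

NUM_BINS = 100  # spatial quantization per coordinate (assumed)
CLS, SEP = "[CLS]", "[SEP]"

def tokenize_pose(keypoints, num_bins=NUM_BINS):
    """Quantize (x, y) keypoints in [0, 1) into discrete coordinate tokens."""
    tokens = []
    for x, y in keypoints:
        tokens.append(f"x{int(x * num_bins)}")
        tokens.append(f"y{int(y * num_bins)}")
    return tokens

def pose_pair_sequence(pose_a, pose_b):
    """Concatenate two tokenized poses with separator tokens."""
    return [CLS] + tokenize_pose(pose_a) + [SEP] + tokenize_pose(pose_b) + [SEP]

# Two 15-keypoint poses sampled from different frames of a video.
random.seed(0)
pose_t = [(random.random(), random.random()) for _ in range(15)]
pose_t_plus_k = [(random.random(), random.random()) for _ in range(15)]
seq = pose_pair_sequence(pose_t, pose_t_plus_k)
print(len(seq))  # 1 CLS + 30 tokens + 1 SEP + 30 tokens + 1 SEP = 63
```

A downstream classifier over such a sequence could then decide whether the two poses belong to the same person, which is the entailment-style framing the snippet suggests.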

Tripping through time: Efficient Localization of Activities in Videos

no code implementations22 Apr 2019 Meera Hahn, Asim Kadav, James M. Rehg, Hans Peter Graf

Localizing moments in untrimmed videos via language queries is a new and interesting task that requires the ability to accurately ground language into video.

Grounded Objects and Interactions for Video Captioning

no code implementations16 Nov 2017 Chih-Yao Ma, Asim Kadav, Iain Melvin, Zsolt Kira, Ghassan AlRegib, Hans Peter Graf

We address the problem of video captioning by grounding language generation on object interactions in the video.

Scene Understanding Text Generation +2

Pruning Filters for Efficient ConvNets

21 code implementations31 Aug 2016 Hao Li, Asim Kadav, Igor Durdanovic, Hanan Samet, Hans Peter Graf

However, magnitude-based pruning of weights reduces a significant number of parameters from the fully connected layers and may not adequately reduce the computation costs in the convolutional layers due to irregular sparsity in the pruned networks.

Image Classification Network Pruning
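The snippet above contrasts irregular weight-level sparsity with removing whole filters, which shrinks convolutional compute directly. A minimal sketch of ranking and removing conv filters by their L1 norm, assuming a NumPy weight tensor of shape (out_channels, in_channels, kH, kW); the function names and the random example weights are illustrative, not from the paper:

```python
import numpy as np

def rank_filters_by_l1(conv_weight):
    """Rank output filters of a conv layer by L1 norm.

    conv_weight has shape (out_channels, in_channels, kH, kW).
    Returns filter indices sorted from smallest to largest L1 norm;
    the smallest-norm filters are the candidates for removal.
    """
    l1_norms = np.abs(conv_weight).sum(axis=(1, 2, 3))
    return np.argsort(l1_norms)

def prune_filters(conv_weight, n_prune):
    """Drop the n_prune filters with the smallest L1 norm."""
    order = rank_filters_by_l1(conv_weight)
    keep = np.sort(order[n_prune:])  # preserve the original filter ordering
    return conv_weight[keep]

# Example: a conv layer with 8 filters of size 3x3 over 3 input channels;
# prune the 2 weakest filters.
w = np.random.randn(8, 3, 3, 3)
pruned = prune_filters(w, n_prune=2)
print(pruned.shape)  # (6, 3, 3, 3)
```

Because entire output channels disappear, the next layer's corresponding input channels can be removed too, so the saving is dense and hardware-friendly rather than scattered.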
