no code implementations • 16 May 2023 • Yi Huang, Asim Kadav, Farley Lai, Deep Patel, Hans Peter Graf
Specifically, KeyNet introduces the use of object-based keypoint information to capture context in the scene.
1 code implementation • 20 Jan 2022 • Cheng-En Wu, Farley Lai, Yu Hen Hu, Asim Kadav
Implementation-wise, CPR is complementary to pretext tasks and can be easily applied to previous work.
no code implementations • 31 Dec 2021 • Farley Lai, Asim Kadav, Erik Kruus
The recent success of deep learning applications has coincided with the wide availability of powerful computational resources for training sophisticated machine learning models on huge datasets.
1 code implementation • 11 Dec 2021 • Honglu Zhou, Asim Kadav, Aviv Shamsian, Shijie Geng, Farley Lai, Long Zhao, Ting Liu, Mubbasir Kapadia, Hans Peter Graf
Group Activity Recognition detects the activity collectively performed by a group of actors, which requires compositional reasoning of actors and objects.
Ranked #2 on Group Activity Recognition on Collective Activity
1 code implementation • ICCV 2021 • Ligong Han, Martin Renqiang Min, Anastasis Stathopoulos, Yu Tian, Ruijiang Gao, Asim Kadav, Dimitris Metaxas
We then propose an improved cGAN model with Auxiliary Classification that directly aligns the fake and real conditionals $P(\text{class}|\text{image})$ by minimizing their $f$-divergence.
1 code implementation • ICLR 2021 • Honglu Zhou, Asim Kadav, Farley Lai, Alexandru Niculescu-Mizil, Martin Renqiang Min, Mubbasir Kapadia, Hans Peter Graf
We evaluate on the CATER dataset and find that Hopper achieves 73.2% Top-1 accuracy using just 1 FPS by hopping through just a few critical frames.
Ranked #5 on Video Object Tracking on CATER
no code implementations • CVPR 2020 • Yizhe Zhu, Martin Renqiang Min, Asim Kadav, Hans Peter Graf
We propose a sequential variational autoencoder to learn disentangled representations of sequential data (e.g., video and audio) under self-supervision.
no code implementations • CVPR 2020 • Michael Snower, Asim Kadav, Farley Lai, Hans Peter Graf
Keypoints are tracked using our Pose Entailment method, in which a pair of pose estimates is first sampled from different frames in a video and tokenized.
Ranked #2 on Pose Tracking on PoseTrack2017
1 code implementation • 5 Nov 2019 • Farley Lai, Ning Xie, Derek Doran, Asim Kadav
Next, the model learns the contextual representations of the text tokens and image objects through layers of high-order interaction respectively.
no code implementations • 22 Apr 2019 • Meera Hahn, Asim Kadav, James M. Rehg, Hans Peter Graf
Localizing moments in untrimmed videos via language queries is a new and interesting task that requires the ability to accurately ground language into video.
1 code implementation • 20 Jan 2019 • Ning Xie, Farley Lai, Derek Doran, Asim Kadav
We evaluate various existing VQA baselines and build a model, the Explainable Visual Entailment (EVE) system, to address the VE task.
Ranked #8 on Visual Entailment on SNLI-VE test
1 code implementation • 26 Nov 2018 • Ning Xie, Farley Lai, Derek Doran, Asim Kadav
We introduce a new inference task, Visual Entailment (VE), which differs from traditional Textual Entailment (TE) tasks in that the premise is defined by an image rather than a natural language sentence.
no code implementations • WS 2018 • Ju-ho Kim, Christopher Malon, Asim Kadav
Existing entailment datasets mainly pose problems which can be answered without attention to grammar or word order.
no code implementations • ICLR 2018 • Daniel Li, Asim Kadav
We present Adaptive Memory Networks (AMN), which process input-question pairs to dynamically construct a network architecture optimized for lower inference times on Question Answering (QA) tasks.
no code implementations • 11 Dec 2017 • Christopher Streiffer, Huan Chen, Theophilus Benson, Asim Kadav
In recent years, many techniques have been developed to improve the performance and efficiency of data center networks.
no code implementations • CVPR 2018 • Chih-Yao Ma, Asim Kadav, Iain Melvin, Zsolt Kira, Ghassan AlRegib, Hans Peter Graf
Human actions often involve complex interactions across several inter-related objects in the scene.
no code implementations • 16 Nov 2017 • Chih-Yao Ma, Asim Kadav, Iain Melvin, Zsolt Kira, Ghassan AlRegib, Hans Peter Graf
We address the problem of video captioning by grounding language generation on object interactions in the video.
no code implementations • 27 Dec 2016 • Asim Kadav, Erik Kruus
Emerging workloads, such as graph processing and machine learning, are approximate because of the scale of data involved and the stochastic nature of the underlying algorithms.
no code implementations • 22 Dec 2016 • Huayu Li, Martin Renqiang Min, Yong Ge, Asim Kadav
Employing these attention mechanisms, our model accurately understands when it can output an answer or when it requires generating a supplementary question for additional input depending on different contexts.
21 code implementations • 31 Aug 2016 • Hao Li, Asim Kadav, Igor Durdanovic, Hanan Samet, Hans Peter Graf
However, magnitude-based pruning of weights reduces a significant number of parameters from the fully connected layers and may not adequately reduce the computation costs in the convolutional layers due to irregular sparsity in the pruned networks.
Ranked #1 on Network Pruning on ImageNet