no code implementations • 16 May 2023 • Yi Huang, Asim Kadav, Farley Lai, Deep Patel, Hans Peter Graf
Specifically, KeyNet introduces the use of object-based keypoint information to capture context in the scene.
1 code implementation • 20 Jan 2022 • Cheng-En Wu, Farley Lai, Yu Hen Hu, Asim Kadav
Implementation-wise, CPR is complementary to pretext tasks and can be easily applied to previous work.
no code implementations • 31 Dec 2021 • Farley Lai, Asim Kadav, Erik Kruus
The recent success of deep learning applications has coincided with the wide availability of powerful computational resources for training sophisticated machine learning models on huge datasets.
1 code implementation • 11 Dec 2021 • Honglu Zhou, Asim Kadav, Aviv Shamsian, Shijie Geng, Farley Lai, Long Zhao, Ting Liu, Mubbasir Kapadia, Hans Peter Graf
Group Activity Recognition detects the activity collectively performed by a group of actors, which requires compositional reasoning over actors and objects.
Ranked #2 on Group Activity Recognition on Collective Activity
1 code implementation • ICLR 2021 • Honglu Zhou, Asim Kadav, Farley Lai, Alexandru Niculescu-Mizil, Martin Renqiang Min, Mubbasir Kapadia, Hans Peter Graf
We evaluate on the CATER dataset and find that Hopper achieves 73.2% Top-1 accuracy using just 1 FPS by hopping through just a few critical frames.
Ranked #5 on Video Object Tracking on CATER
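The "hopping" idea above can be illustrated with a minimal sketch: subsample a video to 1 FPS and keep only the few most informative frames. The function names and the per-frame saliency scores below are illustrative assumptions, not Hopper's actual selection mechanism.

```python
# Hypothetical sketch of frame "hopping": subsample a 30 FPS video down to
# roughly 1 FPS, then keep only the k highest-scoring sampled frames.
# `frame_scores` stands in for some per-frame importance signal (assumed).

def hop_frames(frame_scores, fps=30, target_fps=1, k=3):
    """Return indices of k critical frames from the 1-FPS-subsampled video."""
    step = fps // target_fps
    # Subsample: keep every `step`-th frame.
    sampled = [(i, s) for i, s in enumerate(frame_scores) if i % step == 0]
    # Pick the k highest-scoring sampled frames, returned in temporal order.
    top = sorted(sampled, key=lambda p: p[1], reverse=True)[:k]
    return sorted(i for i, _ in top)

# Example: 90 frames at 30 FPS -> only frames 0, 30, 60 are sampled.
scores = [0.0] * 90
scores[0], scores[30], scores[60] = 0.2, 0.9, 0.5
critical = hop_frames(scores, k=2)  # -> [30, 60]
```

The point of the sketch is the cost model: downstream reasoning only ever touches the handful of selected frames, which is how a 1 FPS budget becomes feasible.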
no code implementations • CVPR 2020 • Michael Snower, Asim Kadav, Farley Lai, Hans Peter Graf
Keypoints are tracked using our Pose Entailment method, in which a pair of pose estimates is first sampled from different frames of a video and tokenized.
Ranked #2 on Pose Tracking on PoseTrack2017
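The sampling-and-tokenization step described above can be sketched as follows. This is a hedged illustration: the quantization scheme, bin count, and helper names are assumptions for exposition, not the paper's actual tokenizer.

```python
import random

# Hypothetical sketch of Pose Entailment's input preparation: sample pose
# estimates from two different frames and quantize their (x, y) keypoints
# into discrete token ids. Bin size and token layout are assumptions.

def tokenize_pose(keypoints, bins=32, size=256):
    """Quantize (x, y) keypoint coordinates into discrete token ids."""
    scale = size / bins
    return [int(y // scale) * bins + int(x // scale) for x, y in keypoints]

def sample_pose_pair(poses_per_frame, rng=random):
    """Sample one pose from each of two distinct frames and tokenize both."""
    f1, f2 = rng.sample(range(len(poses_per_frame)), 2)
    return tokenize_pose(poses_per_frame[f1]), tokenize_pose(poses_per_frame[f2])
```

Discretizing coordinates this way lets the pose pair be fed to a transformer-style model as ordinary token sequences.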
1 code implementation • 5 Nov 2019 • Farley Lai, Ning Xie, Derek Doran, Asim Kadav
Next, the model learns the contextual representations of the text tokens and image objects, respectively, through layers of high-order interaction.
1 code implementation • 20 Jan 2019 • Ning Xie, Farley Lai, Derek Doran, Asim Kadav
We evaluate various existing VQA baselines and build a model, the Explainable Visual Entailment (EVE) system, to address the VE task.
Ranked #8 on Visual Entailment on SNLI-VE test
1 code implementation • 26 Nov 2018 • Ning Xie, Farley Lai, Derek Doran, Asim Kadav
We introduce a new inference task, Visual Entailment (VE), which differs from traditional Textual Entailment (TE) in that the premise is defined by an image rather than a natural language sentence.
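A VE instance can be pictured with a minimal data sketch: an image premise, a text hypothesis, and a three-way entailment label. The field names and the example record below are illustrative, not the dataset's actual schema.

```python
from dataclasses import dataclass

# Standard three-way entailment labels, as in textual entailment.
LABELS = ("entailment", "neutral", "contradiction")

@dataclass
class VEExample:
    premise_image: str  # the premise is an image (here just a file path)
    hypothesis: str     # natural language hypothesis about the image
    label: str          # one of LABELS

# Illustrative instance (path and sentence are made up).
ex = VEExample("images/dogs.jpg", "Two dogs are playing in the snow.", "entailment")
```

The model must decide whether the hypothesis is entailed by, neutral to, or contradicted by the image alone, which is what distinguishes VE from TE.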