Search Results for author: Nakul Agarwal

Found 12 papers, 2 papers with code

Disentangled Neural Relational Inference for Interpretable Motion Prediction

no code implementations · 7 Jan 2024 · Victoria M. Dax, Jiachen Li, Enna Sachdeva, Nakul Agarwal, Mykel J. Kochenderfer

The results show superior performance compared to existing methods in modeling spatio-temporal relations, predicting motion, and identifying time-invariant latent features.

Motion Planning · Motion Prediction

Vamos: Versatile Action Models for Video Understanding

no code implementations · 22 Nov 2023 · Shijie Wang, Qi Zhao, Minh Quan Do, Nakul Agarwal, Kwonjoon Lee, Chen Sun

What makes good video representations for video understanding, such as anticipating future activities, or answering video-conditioned questions?

Language Modelling · Large Language Model · +2

Object-centric Video Representation for Long-term Action Anticipation

1 code implementation · 31 Oct 2023 · Ce Zhang, Changcheng Fu, Shijie Wang, Nakul Agarwal, Kwonjoon Lee, Chiho Choi, Chen Sun

To recognize and predict human-object interactions, we use a Transformer-based neural architecture which allows the "retrieval" of relevant objects for action anticipation at various time scales.
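The snippet above only sketches the mechanism at a high level. Below is a minimal PyTorch illustration of the general idea of a learned anticipation query cross-attending to detected object features so that relevant objects can be "retrieved". This is not the released implementation; the module name, dimensions, and head count are invented for illustration.

```python
import torch
import torch.nn as nn

class ObjectRetrievalAnticipator(nn.Module):  # hypothetical module, not the authors' code
    def __init__(self, obj_dim=256, num_heads=4, num_actions=100):
        super().__init__()
        # Learned anticipation query that attends over object features.
        self.query = nn.Parameter(torch.randn(1, 1, obj_dim))
        self.cross_attn = nn.MultiheadAttention(obj_dim, num_heads, batch_first=True)
        self.classifier = nn.Linear(obj_dim, num_actions)

    def forward(self, object_feats):
        # object_feats: (batch, num_objects, obj_dim) features of objects
        # detected in the observed video segment.
        q = self.query.expand(object_feats.size(0), -1, -1)
        attended, attn_weights = self.cross_attn(q, object_feats, object_feats)
        logits = self.classifier(attended.squeeze(1))  # scores for future actions
        return logits, attn_weights  # attention weights indicate "retrieved" objects

# Toy usage with random features standing in for object detections.
feats = torch.randn(2, 10, 256)
logits, weights = ObjectRetrievalAnticipator()(feats)
```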

Action Anticipation · Human-Object Interaction Detection · +4

Rank2Tell: A Multimodal Driving Dataset for Joint Importance Ranking and Reasoning

1 code implementation · 12 Sep 2023 · Enna Sachdeva, Nakul Agarwal, Suhas Chundi, Sean Roelofs, Jiachen Li, Mykel Kochenderfer, Chiho Choi, Behzad Dariush

The widespread adoption of commercial autonomous vehicles (AVs) and advanced driver assistance systems (ADAS) may largely depend on their acceptance by society, for which their perceived trustworthiness and interpretability to riders are crucial.

Autonomous Vehicles · Question Answering · +2

AntGPT: Can Large Language Models Help Long-term Action Anticipation from Videos?

no code implementations · 31 Jul 2023 · Qi Zhao, Shijie Wang, Ce Zhang, Changcheng Fu, Minh Quan Do, Nakul Agarwal, Kwonjoon Lee, Chen Sun

We propose to formulate the LTA task from two perspectives: a bottom-up approach that predicts the next actions autoregressively by modeling temporal dynamics; and a top-down approach that infers the goal of the actor and plans the needed procedure to accomplish the goal.
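As a rough, non-authoritative sketch of how those two perspectives could be posed to a language model, the code below uses a hypothetical helper `query_llm` standing in for any chat-completion interface; the prompts and parsing are invented for illustration and are not the paper's method.

```python
def bottom_up_anticipation(observed_actions, num_future, query_llm):
    """Predict the next actions autoregressively from the observed sequence."""
    history = list(observed_actions)
    for _ in range(num_future):
        prompt = (
            "Observed actions so far: " + ", ".join(history) + ". "
            "Predict the single most likely next action as a verb-noun pair."
        )
        next_action = query_llm(prompt).strip()
        history.append(next_action)  # feed the prediction back in
    return history[len(observed_actions):]

def top_down_anticipation(observed_actions, num_future, query_llm):
    """First infer the actor's goal, then plan the remaining steps toward it."""
    goal = query_llm(
        "Observed actions: " + ", ".join(observed_actions) + ". "
        "What overall goal is the person most likely pursuing?"
    ).strip()
    plan = query_llm(
        f"Goal: {goal}. Already done: " + ", ".join(observed_actions) + ". "
        f"List the next {num_future} actions needed to finish the goal, one per line."
    )
    return [line.strip() for line in plan.splitlines() if line.strip()][:num_future]
```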

Action Anticipation · counterfactual · +1

Latency Matters: Real-Time Action Forecasting Transformer

no code implementations · CVPR 2023 · Harshayu Girase, Nakul Agarwal, Chiho Choi, Karttikeya Mangalam

We present RAFTformer, a real-time action forecasting transformer for latency-aware, real-world action forecasting applications.

AdamsFormer for Spatial Action Localization in the Future

no code implementations · CVPR 2023 · Hyung-gun Chi, Kwonjoon Lee, Nakul Agarwal, Yi Xu, Karthik Ramani, Chiho Choi

Spatial action localization in the future (SALF) is challenging because it requires understanding the underlying physics of video observations to predict future action locations accurately.

Action Localization

Weakly-Supervised Online Action Segmentation in Multi-View Instructional Videos

no code implementations · CVPR 2022 · Reza Ghoddoosian, Isht Dwivedi, Nakul Agarwal, Chiho Choi, Behzad Dariush

Experimental results show the efficacy of the proposed methods, both qualitatively and quantitatively, in the two domains of cooking and assembly.

Action Segmentation · Segmentation

Unsupervised Domain Adaptation for Spatio-Temporal Action Localization

no code implementations · 19 Oct 2020 · Nakul Agarwal, Yi-Ting Chen, Behzad Dariush, Ming-Hsuan Yang

Spatio-temporal action localization is an important problem in computer vision that involves detecting where and when activities occur, and therefore requires modeling of both spatial and temporal features.
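As a minimal illustration of the "where and when" aspect (not taken from the paper), spatio-temporal action instances are often represented as tubes, i.e. per-frame bounding boxes spanning a start and end frame; the class name and fields below are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

@dataclass
class ActionTube:  # hypothetical helper for illustration
    label: str                      # action category, e.g. "open door"
    start_frame: int                # when the action begins
    end_frame: int                  # when the action ends
    boxes: Dict[int, Tuple[float, float, float, float]]  # frame -> (x1, y1, x2, y2)

    def duration(self) -> int:
        return self.end_frame - self.start_frame + 1

# Toy instance: a single action spanning frames 10-25.
tube = ActionTube("open door", 10, 25, {10: (0.1, 0.2, 0.4, 0.8)})
```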

Object Detection · +3

Connecting Visual Experiences using Max-flow Network with Application to Visual Localization

no code implementations · 1 Aug 2018 · A. H. Abdul Hafez, Nakul Agarwal, C. V. Jawahar

This problem is solved by finding the maximum flow in a directed flow network whose vertices represent matches between frames in the test and reference sequences.
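For readers unfamiliar with the formulation, here is a toy sketch of computing maximum flow over a small match graph with networkx; the node names and capacities are invented and do not reflect the paper's actual graph construction.

```python
import networkx as nx

G = nx.DiGraph()
# Each intermediate node stands for a candidate match between a test frame
# and a reference frame; the capacities here are arbitrary similarity scores.
G.add_edge("source", "t1_r1", capacity=0.9)
G.add_edge("source", "t1_r2", capacity=0.4)
G.add_edge("t1_r1", "t2_r2", capacity=0.8)
G.add_edge("t1_r2", "t2_r3", capacity=0.5)
G.add_edge("t2_r2", "sink", capacity=0.8)
G.add_edge("t2_r3", "sink", capacity=0.5)

flow_value, flow_dict = nx.maximum_flow(G, "source", "sink")
print(flow_value)  # total flow through the network
print(flow_dict)   # per-edge flow, indicating which matches are retained
```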

Autonomous Navigation Visual Localization
