Search Results for author: Shengxin Zha

Found 6 papers, 1 paper with code

Building a Mind Palace: Structuring Environment-Grounded Semantic Graphs for Effective Long Video Analysis with LLMs

no code implementations • 8 Jan 2025 • Zeyi Huang, Yuyang Ji, Xiaofang Wang, Nikhil Mehta, Tong Xiao, DongHyun Lee, Sigmund Vanvalkenburgh, Shengxin Zha, Bolin Lai, Licheng Yu, Ning Zhang, Yong Jae Lee, Miao Liu

Long-form video understanding with Large Vision Language Models is challenged by the need to analyze temporally dispersed yet spatially concentrated key moments within limited context windows.

EgoSchema, Object Tracking, +1

Human Action Anticipation: A Survey

no code implementations • 17 Oct 2024 • Bolin Lai, Sam Toyer, Tushar Nagarajan, Rohit Girdhar, Shengxin Zha, James M. Rehg, Kris Kitani, Kristen Grauman, Ruta Desai, Miao Liu

Predicting future human behavior is an increasingly popular topic in computer vision, driven by interest in applications such as autonomous vehicles, digital assistants, and human-robot interaction.

Action Anticipation, Autonomous Vehicles, +2

Only Time Can Tell: Discovering Temporal Data for Temporal Modeling

no code implementations • 19 Jul 2019 • Laura Sevilla-Lara, Shengxin Zha, Zhicheng Yan, Vedanuj Goswami, Matt Feiszli, Lorenzo Torresani

However, it has been observed that in current video datasets, action classes can often be recognized from a single frame, without any temporal information.

Benchmarking, Motion Estimation, +1

Exploiting Image-trained CNN Architectures for Unconstrained Video Classification

no code implementations • 13 Mar 2015 • Shengxin Zha, Florian Luisier, Walter Andrews, Nitish Srivastava, Ruslan Salakhutdinov

Our proposed late fusion of CNN- and motion-based features can further increase the mean average precision (mAP) on MED'14 from 34.95% to 38.74%.

Classification, Event Detection, +3
