no code implementations • 28 Dec 2024 • Xijun Wang, Pedro Sandoval-Segura, ChengYuan Zhang, Junyun Huang, Tianrui Guan, Ruiqi Xian, Fuxiao Liu, Rohan Chandra, Boqing Gong, Dinesh Manocha
Addressing this gap, we present a new dataset, DAVE, designed for evaluating perception methods with high representation of Vulnerable Road Users (VRUs, e.g. pedestrians, animals, motorbikes, and bicycles) in complex and unpredictable environments.
no code implementations • 26 Sep 2024 • Ruiqi Xian, Xiyang Wu, Tianrui Guan, Xijun Wang, Boqing Gong, Dinesh Manocha
We introduce SOAR, a novel Self-supervised pretraining algorithm for aerial footage captured by Unmanned Aerial Vehicles (UAVs).
2 code implementations • 16 Jun 2024 • Xiyang Wu, Tianrui Guan, Dianqi Li, Shuaiyi Huang, Xiaoyu Liu, Xijun Wang, Ruiqi Xian, Abhinav Shrivastava, Furong Huang, Jordan Lee Boyd-Graber, Tianyi Zhou, Dinesh Manocha
This motivates the development of AutoHallusion, the first automated benchmark generation approach that employs several key strategies to create a diverse range of hallucination examples.
Ranked #1 on Visual Question Answering (VQA) on AutoHallusion
1 code implementation • 4 Apr 2024 • Tianrui Guan, Ruiqi Xian, Xijun Wang, Xiyang Wu, Mohamed Elnoor, Daeun Song, Dinesh Manocha
We present AGL-NET, a novel learning-based method for global localization using LiDAR point clouds and satellite maps.
1 code implementation • 15 Feb 2024 • Xiyang Wu, Souradip Chakraborty, Ruiqi Xian, Jing Liang, Tianrui Guan, Fuxiao Liu, Brian M. Sadler, Dinesh Manocha, Amrit Singh Bedi
In this paper, we highlight the critical issues of robustness and safety associated with integrating large language models (LLMs) and vision-language models (VLMs) into robotics applications.
7 code implementations • CVPR 2024 • Tianrui Guan, Fuxiao Liu, Xiyang Wu, Ruiqi Xian, Zongxia Li, Xiaoyu Liu, Xijun Wang, Lichang Chen, Furong Huang, Yaser Yacoob, Dinesh Manocha, Tianyi Zhou
Our comprehensive case studies within HallusionBench shed light on the challenges of hallucination and illusion in LVLMs.
Ranked #1 on Visual Question Answering (VQA) on HallusionBench
no code implementations • 21 May 2023 • Xijun Wang, Ruiqi Xian, Tianrui Guan, Fuxiao Liu, Dinesh Manocha
In practice, we observe a 3.17-10.2% accuracy improvement on the aerial video datasets (Okutama, NECDrone), which consist of scenes with single-agent and multi-agent actions.
Ranked #1 on Action Recognition on Okutama-Action
1 code implementation • 14 Apr 2023 • Ruiqi Xian, Xijun Wang, Divya Kothandaraman, Dinesh Manocha
Our algorithm utilizes the motion bias within aerial videos, which enables the selection of motion-salient frames.
Ranked #1 on Action Recognition on UAV-Human
1 code implementation • 5 Mar 2023 • Ruiqi Xian, Xijun Wang, Dinesh Manocha
We present a novel approach for action recognition in UAV videos.
Ranked #2 on Action Recognition on UAV-Human
no code implementations • 2 Mar 2023 • Xijun Wang, Ruiqi Xian, Tianrui Guan, Celso M. de Melo, Stephen M. Nogar, Aniket Bera, Dinesh Manocha
We propose a novel approach for aerial video action recognition.
Ranked #1 on Action Recognition on RoCoG-v2