no code implementations • 11 Dec 2024 • Zhuoran Yang, Xi Guo, Chenjing Ding, Chiyu Wang, Wei Wu
Autonomous driving requires robust perception models trained on high-quality, large-scale multi-view driving videos for tasks like 3D object detection, segmentation, and trajectory prediction.
no code implementations • 2 Dec 2024 • Xi Guo, Chenjing Ding, Haoxuan Dou, Xin Zhang, Weixuan Tang, Wei Wu
Comprehensive experiments across multiple datasets validate InfinityDrive's ability to generate complex and varied scenarios, highlighting its potential as a next-generation driving world model built for the evolving demands of autonomous driving.
no code implementations • 10 Sep 2024 • Yining Yao, Xi Guo, Chenjing Ding, Wei Wu
High-quality driving video generation is crucial for providing training data for autonomous driving models.
no code implementations • 9 Sep 2024 • Chenjing Ding, Chiyu Wang, Boshi Liu, Xi Guo, Weixuan Tang, Wei Wu
Utilizing inference results from a segmentation model, our approach constructs a temporospatially consistent semantic codebook, addressing the issues of codebook collapse and imbalanced token semantics.
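As a rough illustration of that idea, the sketch below groups vector-quantization codebook entries by semantic class so that each token keeps a stable meaning, one simple way to counter codebook collapse. All names, shapes, and the per-class nearest-neighbour lookup are assumptions for illustration, not the paper's implementation.

```python
# Hedged sketch: semantics-aware vector quantization, where each semantic
# class owns a contiguous block of codebook rows (hypothetical design).
import torch

def semantic_quantize(features, seg_labels, codebook, codes_per_class):
    """
    features:   (N, D) encoder features, one per spatial location
    seg_labels: (N,)   semantic class id per location, from a segmentation model
    codebook:   (C * codes_per_class, D); class c owns rows
                [c*codes_per_class : (c+1)*codes_per_class]
    Returns quantized features and the chosen token indices.
    """
    quantized = torch.empty_like(features)
    indices = torch.empty(features.size(0), dtype=torch.long,
                          device=features.device)
    for c in seg_labels.unique():
        mask = seg_labels == c
        sub = codebook[c * codes_per_class : (c + 1) * codes_per_class]
        # Nearest-neighbour search restricted to this class's codes.
        d = torch.cdist(features[mask], sub)   # (n_c, codes_per_class)
        idx = d.argmin(dim=1)
        quantized[mask] = sub[idx]
        indices[mask] = idx + c * codes_per_class
    # Straight-through estimator so gradients still reach the encoder.
    quantized = features + (quantized - features).detach()
    return quantized, indices
```

Restricting each location's lookup to its own class's codes keeps every code in use for a fixed semantic role, which is one plausible way to balance token semantics.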
no code implementations • 9 Sep 2024 • Wei Wu, Xi Guo, Weixuan Tang, Tingxuan Huang, Chiyu Wang, Dongyue Chen, Chenjing Ding
However, existing approaches often struggle with multi-view video generation due to the challenges of integrating 3D information while maintaining spatial-temporal consistency and effectively learning from a unified model.
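One common pattern for this kind of multi-view, multi-frame consistency is factorized attention that alternates a cross-view pass with a temporal pass. The PyTorch sketch below illustrates that generic pattern under assumed token shapes; it is not the paper's actual architecture.

```python
# Hedged sketch: factorized attention over multi-view video tokens.
# Shapes and module names are illustrative assumptions.
import torch
import torch.nn as nn

class FactorizedViewTimeAttention(nn.Module):
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.view_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.time_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):
        # x: (B, V, T, N, D) = batch, views, frames, spatial tokens, channels
        b, v, t, n, d = x.shape
        # Cross-view pass: tokens at the same frame/location attend across views.
        xv = x.permute(0, 2, 3, 1, 4).reshape(b * t * n, v, d)
        xv, _ = self.view_attn(xv, xv, xv)
        x = xv.reshape(b, t, n, v, d).permute(0, 3, 1, 2, 4)
        # Temporal pass: tokens at the same view/location attend over time.
        xt = x.permute(0, 1, 3, 2, 4).reshape(b * v * n, t, d)
        xt, _ = self.time_attn(xt, xt, xt)
        return xt.reshape(b, v, n, t, d).permute(0, 1, 3, 2, 4)
```

Factorizing attention this way keeps the cost linear in the number of view-frame pairs rather than quadratic in all tokens at once, which is why it is a popular choice for multi-view video models.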
no code implementations • 1 Apr 2024 • Yunze Liu, Changxi Chen, Chenjing Ding, Li Yi
Humanoid Reaction Synthesis is pivotal for creating highly interactive and empathetic robots that can seamlessly integrate into human environments, enhancing the way we live, work, and communicate.
1 code implementation • 8 Jun 2023 • Jianfei Guo, Nianchen Deng, Xinyang Li, Yeqi Bai, Botian Shi, Chiyu Wang, Chenjing Ding, Dongliang Wang, Yikang Li
We present a novel multi-view implicit surface reconstruction technique, termed StreetSurf, that is readily applicable to street view images in widely-used autonomous driving datasets, such as Waymo-perception sequences, without necessarily requiring LiDAR data.
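At the heart of implicit surface reconstruction is a neural signed-distance field. Below is a toy PyTorch sketch of an SDF MLP queried by sphere tracing, given purely as a generic illustration; the class names, layer sizes, and tracing loop are assumptions, not StreetSurf's design.

```python
# Toy neural signed-distance field queried by sphere tracing; a generic
# illustration of implicit surface reconstruction, not StreetSurf itself.
import torch
import torch.nn as nn

class SDFNet(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.Softplus(beta=100),
            nn.Linear(hidden, hidden), nn.Softplus(beta=100),
            nn.Linear(hidden, 1),  # signed distance to the surface
        )

    def forward(self, x):  # x: (N, 3) points in space
        return self.mlp(x).squeeze(-1)

@torch.no_grad()
def sphere_trace(sdf, origins, dirs, steps=64, eps=1e-3):
    """March each ray forward by the current SDF value until |sdf| < eps."""
    t = torch.zeros(origins.size(0), device=origins.device)
    for _ in range(steps):
        p = origins + t.unsqueeze(-1) * dirs
        d = sdf(p)
        t = t + d  # safe step: the SDF bounds the distance to the surface
        if (d.abs() < eps).all():
            break
    return origins + t.unsqueeze(-1) * dirs  # estimated surface points
```

In a full multi-view pipeline, such a field would be optimized so that renderings of the implied surface match the posed street-view images.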
no code implementations • ICCV 2023 • Liang Xu, Ziyang Song, Dongliang Wang, Jing Su, Zhicheng Fang, Chenjing Ding, Weihao Gan, Yichao Yan, Xin Jin, Xiaokang Yang, Wenjun Zeng, Wei Wu
We present a GAN-based Transformer for general action-conditioned 3D human motion generation, including not only single-person actions but also multi-person interactive actions.
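To make the conditioning mechanism concrete, here is a hedged PyTorch sketch of a Transformer generator that maps an action label plus a latent noise vector to a pose sequence. The module names, dimensions, and single-person scope are illustrative assumptions rather than the paper's exact GAN-based model.

```python
# Hedged sketch: action-conditioned motion generator (illustrative only).
import torch
import torch.nn as nn

class ActionConditionedGenerator(nn.Module):
    def __init__(self, n_actions, pose_dim=72, seq_len=60, d_model=256):
        super().__init__()
        self.action_emb = nn.Embedding(n_actions, d_model)
        self.noise_proj = nn.Linear(d_model, d_model)
        self.pos_emb = nn.Parameter(torch.randn(seq_len, d_model) * 0.02)
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=4)
        self.to_pose = nn.Linear(d_model, pose_dim)
        self.d_model = d_model

    def forward(self, action_ids, noise=None):
        b = action_ids.size(0)
        if noise is None:
            noise = torch.randn(b, self.d_model, device=action_ids.device)
        # Condition every timestep on the action embedding plus latent noise.
        cond = self.action_emb(action_ids) + self.noise_proj(noise)
        tokens = self.pos_emb.unsqueeze(0).expand(b, -1, -1) + cond.unsqueeze(1)
        return self.to_pose(self.backbone(tokens))  # (B, seq_len, pose_dim)
```

In a GAN setup, a generator like this would be trained against a discriminator that scores (sequence, action) pairs for realism; extending it to multi-person interactions would require additional cross-person conditioning not shown here.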