1 code implementation • 5 Jun 2024 • Xingrun Xing, Zheng Zhang, Ziyi Ni, Shitao Xiao, Yiming Ju, Siqi Fan, Yequan Wang, Jiajun Zhang, Guoqi Li
We plug this elastic bi-spiking mechanism into language modeling, naming the resulting model SpikeLM.
2 code implementations • 31 Mar 2024 • Haibao Yu, Wenxian Yang, Jiaru Zhong, Zhenwei Yang, Siqi Fan, Ping Luo, Zaiqing Nie
Cooperatively utilizing both ego-vehicle and infrastructure sensor data via V2X communication has emerged as a promising approach for advanced autonomous driving.
1 code implementation • CVPR 2024 • Ruiyang Hao, Siqi Fan, Yingru Dai, Zhenlin Zhang, Chenxi Li, Yuntian Wang, Haibao Yu, Wenxian Yang, Jirui Yuan, Zaiqing Nie
The value of roadside perception, which could extend the boundaries of autonomous driving and traffic management, has gradually become more prominent and acknowledged in recent years.
no code implementations • 4 Mar 2024 • Siqi Fan, Xin Jiang, Xiang Li, Xuying Meng, Peng Han, Shuo Shang, Aixin Sun, Yequan Wang, Zhongyuan Wang
That is, not all layers of LLMs are necessary during inference.
2 code implementations • 23 Feb 2024 • Zhe Wang, Siqi Fan, Xiaoliang Huo, Tongda Xu, Yan Wang, Jingjing Liu, Yilun Chen, Ya-Qin Zhang
In autonomous driving, cooperative perception makes use of multi-view cameras from both vehicles and infrastructure, providing a global vantage point with rich semantic context of road conditions beyond a single vehicle viewpoint.
2 code implementations • 1 Nov 2023 • Hongzhi Ruan, Haibao Yu, Wenxian Yang, Siqi Fan, Yingjuan Tang, Zaiqing Nie
Specifically, we present V2X-Graph, the first interpretable and end-to-end learning framework for cooperative motion forecasting.
no code implementations • 7 Sep 2023 • Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Xuying Meng, Siqi Fan, Peng Han, Jing Li, Li Du, Bowen Qin, Zheng Zhang, Aixin Sun, Yequan Wang
We demonstrate that a 101B-parameter LLM with 0.31T tokens can be trained with a budget of 100K US dollars.
1 code implementation • 3 Aug 2023 • Siqi Fan, Haibao Yu, Wenxian Yang, Jirui Yuan, Zaiqing Nie
In this paper, we propose the concept of query cooperation to enable interpretable instance-level flexible feature interaction.
Ranked #1 on 3D Object Detection on V2X-SIM
1 code implementation • 14 Apr 2023 • Yiqun Yao, Siqi Fan, Xiusheng Huang, Xuezhi Fang, Xiang Li, Ziyi Ni, Xin Jiang, Xuying Meng, Peng Han, Shuo Shang, Kang Liu, Aixin Sun, Yequan Wang
With around 14% of the one-time pre-training cost, we can accurately forecast the loss for models up to 52B.
2 code implementations • 20 Mar 2023 • Zhe Wang, Siqi Fan, Xiaoliang Huo, Tongda Xu, Yan Wang, Jingjing Liu, Yilun Chen, Ya-Qin Zhang
In autonomous driving, Vehicle-Infrastructure Cooperative 3D Object Detection (VIC3D) makes use of multi-view cameras from both vehicles and traffic infrastructure, providing a global vantage point with rich semantic context of road conditions beyond a single vehicle viewpoint.
1 code implementation • 15 Mar 2023 • Siqi Fan, Zhe Wang, Yan Wang, Jingjing Liu
For semantic segmentation in urban scene understanding, RGB cameras alone often fail to capture a clear holistic topology in challenging lighting conditions.
Ranked #10 on Thermal Image Segmentation on PST900
1 code implementation • 7 Mar 2023 • Siqi Fan, Zhe Wang, Xiaoliang Huo, Yan Wang, Jingjing Liu
Effective BEV object detection on infrastructure can greatly improve traffic scene understanding and vehicle-to-infrastructure (V2I) cooperative perception.
Ranked #5 on 3D Object Detection on DAIR-V2X-I
1 code implementation • 30 Nov 2022 • Siqi Fan, Fenghua Zhu, Zunlei Feng, Yisheng Lv, Mingli Song, Fei-Yue Wang
Pseudo supervision is regarded as the core idea in semi-supervised learning for semantic segmentation, and there is always a tradeoff between utilizing only the high-quality pseudo labels and leveraging all the pseudo labels.
1 code implementation • 9 Apr 2022 • Xin Hu, Zhenyu Wu, Hao-Yu Miao, Siqi Fan, Taiyu Long, Zhenyu Hu, Pengcheng Pi, Yi Wu, Zhou Ren, Zhangyang Wang, Gang Hua
Video action detection (spatio-temporal action localization) is usually the starting point for human-centric intelligent analysis of videos nowadays.
1 code implementation • 17 Sep 2021 • Zhaorun Chen, Siqi Fan, Yuan Tan, Liang Gong, Binhao Chen, Te Sun, David Filliat, Natalia Díaz-Rodríguez, Chengliang Liu
Firstly, we engage the RL loss to assist in updating the SRL model so that the states can evolve to meet the demands of RL and maintain a good physical interpretation.
1 code implementation • CVPR 2021 • Siqi Fan, Qiulei Dong, Fenghua Zhu, Yisheng Lv, Peijun Ye, Fei-Yue Wang
For each 3D point, the local polar representation block first constructs a spatial representation that is invariant to z-axis rotation. The dual-distance attentive pooling block then uses the representations of the point's neighbors to learn more discriminative local features according to both the geometric and feature distances among them. Finally, the global contextual feature block learns a global context for each 3D point from its spatial location and the volume ratio of the neighborhood to the global point cloud.
Ranked #2 on 3D Semantic Segmentation on STPLS3D
1 code implementation • IEEE Transactions on Vehicular Technology 2021 • Siqi Fan, Fenghua Zhu, Shichao Chen, Hui Zhang, Bin Tian, Yisheng Lv, Fei-Yue Wang
Most successful object detectors are anchor-based, which makes them difficult to adapt to the diversity of traffic objects.