1 code implementation • 2 Apr 2025 • Yingyan Li, Yuqi Wang, Yang Liu, JiaWei He, Lue Fan, Zhaoxiang Zhang
Therefore, we propose an end-to-end driving framework WoTE, which leverages a BEV World model to predict future BEV states for Trajectory Evaluation.
1 code implementation • 12 Jun 2024 • Yingyan Li, Lue Fan, JiaWei He, Yuqi Wang, Yuntao Chen, Zhaoxiang Zhang, Tieniu Tan
Specifically, our framework LAW uses a LAtent World model to predict future latent features based on the predicted ego actions and the latent feature of the current frame.
1 code implementation • 24 Apr 2023 • Yingyan Li, Lue Fan, Yang Liu, Zehao Huang, Yuntao Chen, Naiyan Wang, Zhaoxiang Zhang
In this paper, we study how to effectively leverage image modality in the emerging fully sparse architecture.
1 code implementation • 20 Jul 2022 • Yingyan Li, Yuntao Chen, JiaWei He, Zhaoxiang Zhang
So these methods only use a small number of projection constraints and produce insufficient depth candidates, leading to inaccurate depth estimation.
1 code implementation • 20 Apr 2021 • Xinyu Ma, Jiafeng Guo, Ruqing Zhang, Yixing Fan, Yingyan Li, Xueqi Cheng
The basic idea of PROP is to construct the representative words prediction (ROP) task for pre-training inspired by the query likelihood model.
2 code implementations • 11 Mar 2021 • Yuqi Huo, Manli Zhang, Guangzhen Liu, Haoyu Lu, Yizhao Gao, Guoxing Yang, Jingyuan Wen, Heng Zhang, Baogui Xu, Weihao Zheng, Zongzheng Xi, Yueqian Yang, Anwen Hu, Jinming Zhao, Ruichen Li, Yida Zhao, Liang Zhang, Yuqing Song, Xin Hong, Wanqing Cui, Danyang Hou, Yingyan Li, Junyi Li, Peiyu Liu, Zheng Gong, Chuhao Jin, Yuchong Sun, ShiZhe Chen, Zhiwu Lu, Zhicheng Dou, Qin Jin, Yanyan Lan, Wayne Xin Zhao, Ruihua Song, Ji-Rong Wen
We further construct a large Chinese multi-source image-text dataset called RUC-CAS-WenLan for pre-training our BriVL model.
Ranked #1 on Image Retrieval on RUC-CAS-WenLan