1 code implementation • 13 Mar 2024 • Hao Shi, Song Wang, Jiaming Zhang, Xiaoting Yin, Zhongdao Wang, Guangming Wang, Jianke Zhu, Kailun Yang, Kaiwei Wang
Vision-based occupancy prediction, also known as 3D Semantic Scene Completion (SSC), presents a significant challenge in computer vision.
1 code implementation • 30 Jan 2024 • Jianbin Jiao, Xina Cheng, WeiJie Chen, Xiaoting Yin, Hao Shi, Kailun Yang
Due to the challenges in data collection, mainstream datasets of 3D human pose estimation are primarily composed of multi-view video data collected in laboratory environments, which contains rich spatial-temporal correlation information besides the image frame content.
1 code implementation • 8 Nov 2023 • Xiaoting Yin, Hao Shi, Jiaan Chen, Ze Wang, Yaozu Ye, Huajian Ni, Kailun Yang, Kaiwei Wang
Experiments on EV-3DPW demonstrate the robustness of our proposed 3D representation methods compared to traditional RGB images and event frame techniques under the same backbones.
1 code implementation • 23 Jul 2023 • Yongkun Du, Zhineng Chen, Caiyan Jia, Xiaoting Yin, Chenxia Li, Yuning Du, Yu-Gang Jiang
We first present an empirical study of AR decoding in STR, and discover that the AR decoder not only models linguistic context, but also provides guidance on visual context perception.
Ranked #1 on Scene Text Recognition on CUTE80 (using extra training data)
1 code implementation • 11 Jul 2023 • Yaozu Ye, Hao Shi, Kailun Yang, Ze Wang, Xiaoting Yin, Yining Lin, Mao Liu, Yaonan Wang, Kaiwei Wang
We then propose EVA-Flow, an EVent-based Anytime Flow estimation network to produce high-frame-rate event optical flow with only low-frame-rate optical flow ground truth for supervision.
3 code implementations • 21 Nov 2022 • Hao Shi, Qi Jiang, Kailun Yang, Xiaoting Yin, Ze Wang, Kaiwei Wang
In this paper, we propose the concept of online video inpainting for autonomous vehicles to expand the field of view, thereby enhancing scene visibility, perception, and system safety.
Ranked #1 on Seeing Beyond the Visible on KITTI360-EX
1 code implementation • 7 Jun 2022 • Chenxia Li, Weiwei Liu, Ruoyu Guo, Xiaoting Yin, Kaitao Jiang, Yongkun Du, Yuning Du, Lingfeng Zhu, Baohua Lai, Xiaoguang Hu, Dianhai Yu, Yanjun Ma
For the text recognizer, the base model is changed from CRNN to SVTR, and we introduce the lightweight text recognition network SVTR LCNet, guided training of CTC by attention, the data augmentation strategy TextConAug, a better pre-trained model obtained with self-supervised TextRotNet, UDML, and UIM to accelerate the model and improve the effect.
3 code implementations • 30 Apr 2022 • Yongkun Du, Zhineng Chen, Caiyan Jia, Xiaoting Yin, Tianlun Zheng, Chenxia Li, Yuning Du, Yu-Gang Jiang
Dominant scene text recognition models commonly contain two building blocks, a visual model for feature extraction and a sequence model for text transcription.
Ranked #16 on Scene Text Recognition on ICDAR2013
1 code implementation • 27 Feb 2022 • Hao Shi, Yifan Zhou, Kailun Yang, Xiaoting Yin, Ze Wang, Yaozu Ye, Zhe Yin, Shi Meng, Peng Li, Kaiwei Wang
PanoFlow achieves state-of-the-art performance on the public OmniFlowNet and the established FlowScape benchmarks.
1 code implementation • 2 Feb 2022 • Hao Shi, Yifan Zhou, Kailun Yang, Xiaoting Yin, Kaiwei Wang
In this paper, we propose CSFlow, a new deep network architecture for optical flow estimation in autonomous driving, which consists of two novel modules: the Cross Strip Correlation module (CSC) and the Correlation Regression Initialization module (CRI).
10 code implementations • 21 Sep 2020 • Yuning Du, Chenxia Li, Ruoyu Guo, Xiaoting Yin, Weiwei Liu, Jun Zhou, Yifan Bai, Zilin Yu, Yehua Yang, Qingqing Dang, Haoshuang Wang
Meanwhile, several pre-trained models for Chinese and English recognition are released, including a text detector (97K images are used), a direction classifier (600K images are used), and a text recognizer (17.9M images are used).