Search Results for author: Xiaoting Yin

Found 11 papers, 10 papers with code

PP-OCR: A Practical Ultra Lightweight OCR System

9 code implementations • 21 Sep 2020 • Yuning Du, Chenxia Li, Ruoyu Guo, Xiaoting Yin, Weiwei Liu, Jun Zhou, Yifan Bai, Zilin Yu, Yehua Yang, Qingqing Dang, Haoshuang Wang

Meanwhile, several pre-trained models for Chinese and English recognition are released, including a text detector (97K images are used), a direction classifier (600K images are used) as well as a text recognizer (17.9M images are used).

Computational Efficiency Optical Character Recognition +1

SVTR: Scene Text Recognition with a Single Visual Model

2 code implementations • 30 Apr 2022 • Yongkun Du, Zhineng Chen, Caiyan Jia, Xiaoting Yin, Tianlun Zheng, Chenxia Li, Yuning Du, Yu-Gang Jiang

Dominant scene text recognition models commonly contain two building blocks, a visual model for feature extraction and a sequence model for text transcription.

Scene Text Recognition

PP-OCRv3: More Attempts for the Improvement of Ultra Lightweight OCR System

1 code implementation • 7 Jun 2022 • Chenxia Li, Weiwei Liu, Ruoyu Guo, Xiaoting Yin, Kaitao Jiang, Yongkun Du, Yuning Du, Lingfeng Zhu, Baohua Lai, Xiaoguang Hu, Dianhai Yu, Yanjun Ma

For the text recognizer, the base model is replaced from CRNN to SVTR, and we introduce lightweight text recognition network SVTR_LCNet, guided training of CTC by attention, data augmentation strategy TextConAug, better pre-trained model by self-supervised TextRotNet, UDML, and UIM to accelerate the model and improve the effect.

Data Augmentation Optical Character Recognition +2

Context Perception Parallel Decoder for Scene Text Recognition

1 code implementation • 23 Jul 2023 • Yongkun Du, Zhineng Chen, Caiyan Jia, Xiaoting Yin, Chenxia Li, Yuning Du, Yu-Gang Jiang

We first present an empirical study of AR decoding in STR, and discover that the AR decoder not only models linguistic context, but also provides guidance on visual context perception.

Ranked #1 on Scene Text Recognition on CUTE80 (using extra training data)

Language Modelling Scene Text Recognition

Rethinking Event-based Human Pose Estimation with 3D Event Representations

1 code implementation • 8 Nov 2023 • Xiaoting Yin, Hao Shi, Jiaan Chen, Ze Wang, Yaozu Ye, Huajian Ni, Kailun Yang, Kaiwei Wang

Experiments on EV-3DPW demonstrate the robustness of our proposed 3D representation methods compared to traditional RGB images and event frame techniques under the same backbones.

Autonomous Driving Pose Estimation

CSFlow: Learning Optical Flow via Cross Strip Correlation for Autonomous Driving

1 code implementation • 2 Feb 2022 • Hao Shi, Yifan Zhou, Kailun Yang, Xiaoting Yin, Kaiwei Wang

In this paper, we propose a new deep network architecture for optical flow estimation in autonomous driving--CSFlow, which consists of two novel modules: Cross Strip Correlation module (CSC) and Correlation Regression Initialization module (CRI).

Autonomous Driving Optical Flow Estimation

Beyond the Field-of-View: Enhancing Scene Visibility and Perception with Clip-Recurrent Transformer

3 code implementations • 21 Nov 2022 • Hao Shi, Qi Jiang, Kailun Yang, Xiaoting Yin, Huajian Ni, Kaiwei Wang

In this paper, we propose the concept of online video inpainting for autonomous vehicles to expand the field of view, thereby enhancing scene visibility, perception, and system safety.

Autonomous Vehicles object-detection +4

Towards Anytime Optical Flow Estimation with Event Cameras

1 code implementation • 11 Jul 2023 • Yaozu Ye, Hao Shi, Kailun Yang, Ze Wang, Xiaoting Yin, Yining Lin, Mao Liu, Yaonan Wang, Kaiwei Wang

We then propose EVA-Flow, an EVent-based Anytime Flow estimation network to produce high-frame-rate event optical flow with only low-frame-rate optical flow ground truth for supervision.

Autonomous Driving Motion Estimation +1

Towards Precise 3D Human Pose Estimation with Multi-Perspective Spatial-Temporal Relational Transformers

1 code implementation • 30 Jan 2024 • Jianbin Jiao, Xina Cheng, WeiJie Chen, Xiaoting Yin, Hao Shi, Kailun Yang

Due to the challenges in data collection, mainstream datasets of 3D human pose estimation are primarily composed of multi-view video data collected in laboratory environments, which contain rich spatial-temporal correlation information besides the image frame content.

3D Human Pose Estimation Scene Understanding

OccFiner: Offboard Occupancy Refinement with Hybrid Propagation

No code implementations • 13 Mar 2024 • Hao Shi, Song Wang, Jiaming Zhang, Xiaoting Yin, Zhongdao Wang, Zhijian Zhao, Guangming Wang, Jianke Zhu, Kailun Yang, Kaiwei Wang

Vision-based occupancy prediction, also known as 3D Semantic Scene Completion (SSC), presents a significant challenge in computer vision.

3D Semantic Scene Completion
