no code implementations • 25 Nov 2022 • Zhao Zhou, Xiangcheng Du, Yingbin Zheng, Cheng Jin
We present the Aggregated Text TRansformer (ATTR), which is designed to represent texts in scene images with a multi-scale self-attention mechanism.
no code implementations • 23 Jul 2022 • Xiangcheng Du, Zhao Zhou, Yingbin Zheng, Xingjiao Wu, Tianlong Ma, Cheng Jin
Scene text erasing seeks to erase text contents from scene images, and current state-of-the-art text erasing models are trained on large-scale synthetic data.
no code implementations • 24 Jan 2022 • Xingjiao Wu, Luwei Xiao, Xiangcheng Du, Yingbin Zheng, Xin Li, Tianlong Ma, Liang He
Ours is an unsupervised framework for document layout analysis.
no code implementations • 27 Nov 2021 • Tianlong Ma, Xingjiao Wu, Xin Li, Xiangcheng Du, Zhao Zhou, Liang Xue, Cheng Jin
To evaluate the proposed image layer modeling method, we propose a manually labeled, fine-grained segmentation dataset of non-Manhattan layouts named FPD.
no code implementations • 7 Apr 2021 • Xingjiao Wu, Ziling Hu, Xiangcheng Du, Jing Yang, Liang He
Document layout analysis (DLA) aims to split a document image into regions of interest and understand the role of each region, with wide applications such as optical character recognition (OCR) systems and document retrieval.
Document Layout Analysis
Optical Character Recognition (OCR)
no code implementations • 4 Nov 2019 • Xiangcheng Du, Tianlong Ma, Yingbin Zheng, Hao Ye, Xingjiao Wu, Liang He
In this paper, we study a text recognition framework that considers long-term temporal dependencies in the encoder stage.