Read Extensively, Focus Smartly: A Cross-document Semantic Enhancement Method for Visual Documents NER

no code implementations COLING 2022 Jun Zhao, Xin Zhao, WenYu Zhan, Tao Gui, Qi Zhang, Liang Qiao, Zhanzhan Cheng, ShiLiang Pu

To deal with this problem, this work proposes a cross-document semantic enhancement method, which consists of two modules: 1) To prevent distractions from irrelevant regions in the current document, we design a learnable attention mask mechanism, which is used to adaptively filter redundant information in the current document.


DavarOCR: A Toolbox for OCR and Multi-Modal Document Understanding

1 code implementation 14 Jul 2022 Liang Qiao, Hui Jiang, Ying Chen, Can Li, Pengfei Li, Zaisheng Li, Baorui Zou, Dashan Guo, Yingda Xu, Yunlu Xu, Zhanzhan Cheng, Yi Niu

Compared with previous open-source OCR toolboxes, DavarOCR has relatively more complete support for the sub-tasks of cutting-edge document understanding technology.

Dynamic Low-Resolution Distillation for Cost-Efficient End-to-End Text Spotting

1 code implementation 14 Jul 2022 Ying Chen, Liang Qiao, Zhanzhan Cheng, ShiLiang Pu, Yi Niu, Xi Li

In this paper, to address this problem, we propose a novel cost-efficient Dynamic Low-resolution Distillation (DLD) text spotting framework, which aims to infer images in different small but recognizable resolutions and achieve a better balance between accuracy and efficiency.

A Strong Baseline for Semi-Supervised Incremental Few-Shot Learning

no code implementations 21 Oct 2021 Linlan Zhao, Dashan Guo, Yunlu Xu, Liang Qiao, Zhanzhan Cheng, ShiLiang Pu, Yi Niu, Xiangzhong Fang

Few-shot learning (FSL) aims to learn models that generalize to novel classes with limited training samples.

VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations

no code implementations 13 May 2021 Peng Zhang, Can Li, Liang Qiao, Zhanzhan Cheng, ShiLiang Pu, Yi Niu, Fei Wu

To address the above limitations, we propose a unified framework VSR for document layout analysis, combining vision, semantics and relations.

LGPMA: Complicated Table Structure Recognition with Local and Global Pyramid Mask Alignment

1 code implementation 13 May 2021 Liang Qiao, Zaisheng Li, Zhanzhan Cheng, Peng Zhang, ShiLiang Pu, Yi Niu, Wenqi Ren, Wenming Tan, Fei Wu

In this paper, we aim to obtain more reliable aligned bounding boxes by fully utilizing the visual information from both text regions in proposed local features and cell relations in global features.

Hero: On the Chaos When PATH Meets Modules

no code implementations 24 Feb 2021 Ying Wang, Liang Qiao, Chang Xu, Yepang Liu, Shing-Chi Cheung, Na Meng, Hai Yu, Zhiliang Zhu

The results showed that Hero achieved a high detection rate of 98.5% on a DM issue benchmark and found 2,422 new DM issues in 2,356 popular Golang projects.

MANGO: A Mask Attention Guided One-Stage Scene Text Spotter

1 code implementation 8 Dec 2020 Liang Qiao, Ying Chen, Zhanzhan Cheng, Yunlu Xu, Yi Niu, ShiLiang Pu, Fei Wu

Recently, end-to-end scene text spotting has become a popular research topic due to its advantages of global optimization and high maintainability in real applications.

TRIE: End-to-End Text Reading and Information Extraction for Document Understanding

1 code implementation 27 May 2020 Peng Zhang, Yunlu Xu, Zhanzhan Cheng, ShiLiang Pu, Jing Lu, Liang Qiao, Yi Niu, Fei Wu

Since real-world ubiquitous documents (e.g., invoices, tickets, resumes and leaflets) contain rich information, automatic document image understanding has become a hot topic.

Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting

1 code implementation 17 Feb 2020 Liang Qiao, Sanli Tang, Zhanzhan Cheng, Yunlu Xu, Yi Niu, ShiLiang Pu, Fei Wu

Many approaches have recently been proposed to detect irregular scene text and have achieved promising results.

