Search Results for author: Qiang Huo

Found 16 papers, 0 papers with code

Detect-Order-Construct: A Tree Construction based Approach for Hierarchical Document Structure Analysis

no code implementations • 22 Jan 2024 • Jiawei Wang, Kai Hu, Zhuoyao Zhong, Lei Sun, Qiang Huo

Our end-to-end system achieves state-of-the-art performance on two large-scale document layout analysis datasets (PubLayNet and DocLayNet), a high-quality hierarchical document structure reconstruction dataset (HRDoc), and our Comp-HRDoc benchmark.

Document Layout Analysis Document Summarization +4

Dynamic Relation Transformer for Contextual Text Block Detection

no code implementations • 17 Jan 2024 • Jiawei Wang, Shunchi Zhang, Kai Hu, Chixiang Ma, Zhuoyao Zhong, Lei Sun, Qiang Huo

Contextual Text Block Detection (CTBD) is the task of identifying coherent text blocks within the complexity of natural scenes.

Graph Generation Relation +1

UniVIE: A Unified Label Space Approach to Visual Information Extraction from Form-like Documents

no code implementations • 17 Jan 2024 • Kai Hu, Jiawei Wang, WeiHong Lin, Zhuoyao Zhong, Lei Sun, Qiang Huo

This unified approach allows for the definition of various relation types and effectively tackles hierarchical relationships in form-like documents.

Key Information Extraction Relation

Improving Handwritten OCR with Training Samples Generated by Glyph Conditional Denoising Diffusion Probabilistic Model

no code implementations • 31 May 2023 • Haisong Ding, Bozhi Luan, Dongnan Gui, Kai Chen, Qiang Huo

This model conditions on a printed glyph image and creates mappings between printed characters and handwritten images, thus enabling the generation of photo-realistic handwritten samples with diverse styles and unseen text contents.

Denoising Optical Character Recognition (OCR)

Zero-shot Generation of Training Data with Denoising Diffusion Probabilistic Model for Handwritten Chinese Character Recognition

no code implementations • 25 May 2023 • Dongnan Gui, Kai Chen, Haisong Ding, Qiang Huo

Training from handwritten samples of a small character set, the DDPM is capable of mapping printed strokes to handwritten ones, which makes it possible to generate photo-realistic and diverse style handwritten samples of unseen character categories.

Denoising

A Question-Answering Approach to Key Value Pair Extraction from Form-like Document Images

no code implementations • 17 Apr 2023 • Kai Hu, Zhuoyuan Wu, Zhuoyao Zhong, WeiHong Lin, Lei Sun, Qiang Huo

In this paper, we present a new question-answering (QA) based key-value pair extraction approach, called KVPFormer, to robustly extract key-value relationships between entities from form-like document images.

Question Answering

Robust Table Structure Recognition with Dynamic Queries Enhanced Detection Transformer

no code implementations • 21 Mar 2023 • Jiawei Wang, WeiHong Lin, Chixiang Ma, Mingze Li, Zheng Sun, Lei Sun, Qiang Huo

Unlike previous methods, we formulate table separation line prediction as a line regression problem instead of an image segmentation problem and propose a new two-stage dynamic queries enhanced DETR based separation line regression approach, named DQ-DETR, to predict separation lines from table images directly.

Image Segmentation regression +2

TSRFormer: Table Structure Recognition with Transformers

no code implementations • 9 Aug 2022 • WeiHong Lin, Zheng Sun, Chixiang Ma, Mingze Li, Jiawei Wang, Lei Sun, Qiang Huo

We present a new table structure recognition (TSR) approach, called TSRFormer, to robustly recognize the structures of complex tables with geometrical distortions from various table images.

Ranked #2 on Table Recognition on PubTabNet (TEDS-Struct metric)

Image Segmentation Relation Network +2

Robust Table Detection and Structure Recognition from Heterogeneous Document Images

no code implementations • 17 Mar 2022 • Chixiang Ma, WeiHong Lin, Lei Sun, Qiang Huo

We introduce a new table detection and structure recognition approach named RobusTabNet to detect the boundaries of tables and reconstruct the cellular structure of each table from heterogeneous document images.

Ranked #5 on Table Recognition on PubTabNet (TEDS-Struct metric)

Region Proposal Table Detection +1

APRNet: Attention-based Pixel-wise Rendering Network for Photo-Realistic Text Image Generation

no code implementations • 15 Mar 2022 • Yangming Shi, Haisong Ding, Kai Chen, Qiang Huo

Style-guided text image generation tries to synthesize a text image by imitating a reference image's appearance while keeping the text content unaltered.

Image Generation

ViBERTgrid: A Jointly Trained Multi-Modal 2D Document Representation for Key Information Extraction from Documents

no code implementations • 25 May 2021 • WeiHong Lin, Qifang Gao, Lei Sun, Zhuoyao Zhong, Kai Hu, Qin Ren, Qiang Huo

In this paper, we propose a new multi-modal backbone network by concatenating a BERTgrid to an intermediate layer of a CNN model, where the input of CNN is a document image and the BERTgrid is a grid of word embeddings, to generate a more powerful grid-based document representation, named ViBERTgrid.

Image Segmentation Key Information Extraction +4

A Study on Effects of Implicit and Explicit Language Model Information for DBLSTM-CTC Based Handwriting Recognition

no code implementations • 31 Jul 2020 • Qi Liu, Lijuan Wang, Qiang Huo

Deep Bidirectional Long Short-Term Memory (D-BLSTM) with a Connectionist Temporal Classification (CTC) output layer has been established as one of the state-of-the-art solutions for handwriting recognition.

Handwriting Recognition Language Modelling

ReLaText: Exploiting Visual Relationships for Arbitrary-Shaped Scene Text Detection with Graph Convolutional Networks

no code implementations • 16 Mar 2020 • Chixiang Ma, Lei Sun, Zhuoyao Zhong, Qiang Huo

The key idea is to decompose text detection into two subproblems, namely detection of text primitives and prediction of link relationships between nearby text primitive pairs.

Link Prediction Region Proposal +4

Mask R-CNN with Pyramid Attention Network for Scene Text Detection

no code implementations • 22 Nov 2018 • Zhida Huang, Zhuoyao Zhong, Lei Sun, Qiang Huo

In this paper, we present a new Mask R-CNN based text detection approach which can robustly detect multi-oriented and curved text from natural scene images in a unified manner.

Curved Text Detection Text Detection

An Anchor-Free Region Proposal Network for Faster R-CNN based Text Detection Approaches

no code implementations • 24 Apr 2018 • Zhuoyao Zhong, Lei Sun, Qiang Huo

The anchor mechanism of the Faster R-CNN and SSD frameworks is considered not effective enough for scene text detection, which can be attributed to its IoU-based matching criterion between anchors and ground-truth boxes.

Region Proposal Scene Text Detection +1
