Search Results for author: Yinsong Liu

Found 7 papers, 0 papers with code

HRVDA: High-Resolution Visual Document Assistant

no code implementations10 Apr 2024 Chaohu Liu, Kun Yin, Haoyu Cao, Xinghua Jiang, Xin Li, Yinsong Liu, Deqiang Jiang, Xing Sun, Linli Xu

In addition, we construct a document-oriented visual instruction tuning dataset and apply a multi-stage training strategy to enhance the model's document modeling capabilities.

document understanding

Enhancing Visual Document Understanding with Contrastive Learning in Large Visual-Language Models

no code implementations29 Feb 2024 Xin Li, Yunfei Wu, Xinghua Jiang, Zhihao Guo, Mingming Gong, Haoyu Cao, Yinsong Liu, Deqiang Jiang, Xing Sun

It can represent that the contrastive learning between the visual holistic representations and the multimodal fine-grained features of document objects can assist the vision encoder in acquiring more effective visual cues, thereby enhancing the comprehension of text-rich documents in LVLMs.

Contrastive Learning document understanding

Attention Where It Matters: Rethinking Visual Document Understanding with Selective Region Concentration

no code implementations ICCV 2023 Haoyu Cao, Changcun Bao, Chaohu Liu, Huang Chen, Kun Yin, Hao liu, Yinsong Liu, Deqiang Jiang, Xing Sun

We propose a novel end-to-end document understanding model called SeRum (SElective Region Understanding Model) for extracting meaningful information from document images, including document analysis, retrieval, and office automation.

document understanding Retrieval +1

Grab What You Need: Rethinking Complex Table Structure Recognition with Flexible Components Deliberation

no code implementations16 Mar 2023 Hao liu, Xin Li, Mingming Gong, Bing Liu, Yunfei Wu, Deqiang Jiang, Yinsong Liu, Xing Sun

Recently, Table Structure Recognition (TSR) task, aiming at identifying table structure into machine readable formats, has received increasing interest in the community.

Relational Representation Learning in Visually-Rich Documents

no code implementations5 May 2022 Xin Li, Yan Zheng, Yiqing Hu, Haoyu Cao, Yunfei Wu, Deqiang Jiang, Yinsong Liu, Bo Ren

To deal with the unpredictable definition of relations, we propose a novel contrastive learning task named Relational Consistency Modeling (RCM), which harnesses the fact that existing relations should be consistent in differently augmented positive views.

Contrastive Learning Key Information Extraction +3

Neural Collaborative Graph Machines for Table Structure Recognition

no code implementations CVPR 2022 Hao liu, Xin Li, Bing Liu, Deqiang Jiang, Yinsong Liu, Bo Ren

We also show that the proposed NCGM can modulate collaborative pattern of different modalities conditioned on the context of intra-modality cues, which is vital for diversified table cases.

Table Recognition

Cannot find the paper you are looking for? You can Submit a new open access paper.