Search Results for author: Yihao Ding

Found 8 papers, 2 papers with code

M3-VRD: Multimodal Multi-task Multi-teacher Visually-Rich Form Document Understanding

no code implementations 28 Feb 2024 Yihao Ding, Lorenzo Vaiani, Caren Han, Jean Lee, Paolo Garza, Josiah Poon, Luca Cagliero

This paper presents a groundbreaking multimodal, multi-task, multi-teacher joint-grained knowledge distillation model for visually-rich form document understanding.

document understanding, Knowledge Distillation

Workshop on Document Intelligence Understanding

no code implementations 31 Jul 2023 Soyeon Caren Han, Yihao Ding, Siwen Luo, Josiah Poon, HeeGuen Yoon, Zhe Huang, Paul Duuring, Eun Jung Holden

Document understanding and information extraction include different tasks to understand a document and extract valuable information automatically.

document understanding, Visual Question Answering (VQA)

Graph Neural Networks for Text Classification: A Survey

no code implementations 23 Apr 2023 Kunze Wang, Yihao Ding, Soyeon Caren Han

Text Classification is the most essential and fundamental problem in Natural Language Processing.

graph construction, text-classification, +1

PDFVQA: A New Dataset for Real-World VQA on PDF Documents

no code implementations 13 Apr 2023 Yihao Ding, Siwen Luo, Hyunsuk Chung, Soyeon Caren Han

Document-based Visual Question Answering examines the document understanding of document images in conditions of natural language questions.

document understanding, Key Information Extraction, +2

Form-NLU: Dataset for the Form Natural Language Understanding

1 code implementation 4 Apr 2023 Yihao Ding, Siqu Long, Jiabin Huang, Kaixuan Ren, Xingxiang Luo, Hyunsuk Chung, Soyeon Caren Han

Compared to general document analysis tasks, form document structure understanding and retrieval are challenging.

4k Key Information Extraction +4

Doc-GCN: Heterogeneous Graph Convolutional Networks for Document Layout Analysis

1 code implementation COLING 2022 Siwen Luo, Yihao Ding, Siqu Long, Josiah Poon, Soyeon Caren Han

Recognizing the layout of unstructured digital documents is crucial when parsing the documents into the structured, machine-readable format for downstream applications.

Component Classification, Document Layout Analysis

V-Doc : Visual questions answers with Documents

no code implementations 27 May 2022 Yihao Ding, Zhe Huang, Runlin Wang, Yanhang Zhang, Xianru Chen, Yuzhong Ma, Hyunsuk Chung, Soyeon Caren Han

We propose V-Doc, a question-answering tool using document images and PDF, mainly for researchers and general non-deep learning experts looking to generate, process, and understand the document visual question answering tasks.

Question Answering, Question Generation, +2

V-Doc: Visual Questions Answers With Documents

no code implementations CVPR 2022 Yihao Ding, Zhe Huang, Runlin Wang, Yanhang Zhang, Xianru Chen, Yuzhong Ma, Hyunsuk Chung, Soyeon Caren Han

We propose V-Doc, a question-answering tool using document images and PDF, mainly for researchers and general non-deep learning experts looking to generate, process, and understand the document visual question answering tasks.

Question Answering, Question Generation, +2
