Search Results for author: Feiyu Gao

Found 10 papers, 4 papers with code

A Simple yet Effective Layout Token in Large Language Models for Document Understanding

no code implementations24 Mar 2025 Zhaoqing Zhu, Chuwei Luo, Zirui Shao, Feiyu Gao, Hangdi Xing, Qi Zheng, Ji Zhang

Due to the constraint on max position IDs, assigning them to layout information reduces those available for text content, limiting the model's capacity to learn from the text during training. It also introduces a large number of potentially untrained position IDs during long-context inference, which can hinder performance on document understanding tasks.
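The trade-off described in the abstract can be made concrete with simple arithmetic: every position ID reserved for layout information is one fewer available for text. The sketch below is a hypothetical illustration only; the constants (a 4096-ID budget, four IDs per bounding box) are assumptions for the example, not values taken from the paper.

```python
# Hypothetical illustration of the position-ID budget trade-off:
# if layout tokens share the position-ID space with text tokens,
# IDs spent on layout are unavailable for text content.
# max_position_ids and ids_per_box are illustrative assumptions.

def text_position_budget(max_position_ids: int,
                         num_layout_boxes: int,
                         ids_per_box: int) -> int:
    """Return how many position IDs remain for text tokens after
    reserving IDs for layout information."""
    used_by_layout = num_layout_boxes * ids_per_box
    if used_by_layout > max_position_ids:
        raise ValueError("layout alone exhausts the position-ID budget")
    return max_position_ids - used_by_layout

# e.g. a 4096-position budget with 500 boxes at 4 IDs each
# (x1, y1, x2, y2) leaves 4096 - 2000 = 2096 positions for text.
```

Under these assumed numbers, roughly half the positional capacity would go to layout rather than text, which is the kind of squeeze the abstract points to.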

Tasks: Document Understanding, Position

SceneVTG++: Controllable Multilingual Visual Text Generation in the Wild

no code implementations6 Jan 2025 Jiawei Liu, Yuanzhi Zhu, Feiyu Gao, Zhibo Yang, Peng Wang, Junyang Lin, Xinggang Wang, Wenyu Liu

The text in natural scene images needs to meet the following four key criteria: (1) Fidelity: the generated text should appear as realistic as a photograph and be completely accurate, with no errors in any of the strokes.

Tasks: Attribute, Optical Character Recognition, +4

WebRPG: Automatic Web Rendering Parameters Generation for Visual Presentation

1 code implementation22 Jul 2024 Zirui Shao, Feiyu Gao, Hangdi Xing, Zepeng Zhu, Zhi Yu, Jiajun Bu, Qi Zheng, Cong Yao

In the era of a content-creation revolution propelled by advancements in generative models, the field of web design remains largely unexplored despite its critical role in modern digital communication.

Visual Text Generation in the Wild

1 code implementation19 Jul 2024 Yuanzhi Zhu, Jiawei Liu, Feiyu Gao, Wenyu Liu, Xinggang Wang, Peng Wang, Fei Huang, Cong Yao, Zhibo Yang

However, it is still challenging to render high-quality text images in real-world scenarios, as three critical criteria should be satisfied: (1) Fidelity: the generated text images should be photo-realistic and the contents are expected to be the same as specified in the given conditions; (2) Reasonability: the regions and contents of the generated text should cohere with the scene; (3) Utility: the generated text images can facilitate related tasks (e.g., text detection and recognition).

Tasks: Language Modelling, Large Language Model, +3

LORE: Logical Location Regression Network for Table Structure Recognition

2 code implementations7 Mar 2023 Hangdi Xing, Feiyu Gao, Rujiao Long, Jiajun Bu, Qi Zheng, Liangcheng Li, Cong Yao, Zhi Yu

Table structure recognition (TSR) aims at extracting tables in images into machine-understandable formats.

Tasks: Regression, Table Recognition

SentiPrompt: Sentiment Knowledge Enhanced Prompt-Tuning for Aspect-Based Sentiment Analysis

no code implementations17 Sep 2021 Chengxi Li, Feiyu Gao, Jiajun Bu, Lu Xu, Xiang Chen, Yu Gu, Zirui Shao, Qi Zheng, Ningyu Zhang, Yongpan Wang, Zhi Yu

We inject sentiment knowledge regarding aspects, opinions, and polarities into the prompt and explicitly model term relations by constructing consistency and polarity judgment templates from the ground-truth triplets.

Tasks: Aspect-Based Sentiment Analysis (ABSA), +5

Parsing Table Structures in the Wild

3 code implementations ICCV 2021 Rujiao Long, Wen Wang, Nan Xue, Feiyu Gao, Zhibo Yang, Yongpan Wang, Gui-Song Xia

In contrast to existing studies that mainly focus on parsing well-aligned tabular images with simple layouts from scanned PDF documents, we aim to establish a practical table structure parsing system for real-world scenarios where tabular input images are taken or scanned with severe deformation, bending or occlusions.

Tasks: Object Detection

Graph Convolution for Multimodal Information Extraction from Visually Rich Documents

no code implementations NAACL 2019 Xiaojing Liu, Feiyu Gao, Qiong Zhang, Huasha Zhao

In VRDs, visual and layout information is critical for document understanding, and texts in such documents cannot be serialized into a one-dimensional sequence without losing information.

Tasks: Document Understanding, Entity Extraction using GAN
