no code implementations • 10 Apr 2024 • Chaohu Liu, Kun Yin, Haoyu Cao, Xinghua Jiang, Xin Li, Yinsong Liu, Deqiang Jiang, Xing Sun, Linli Xu
In addition, we construct a document-oriented visual instruction tuning dataset and apply a multi-stage training strategy to enhance the model's document modeling capabilities.
no code implementations • ICCV 2023 • Haoyu Cao, Changcun Bao, Chaohu Liu, Huang Chen, Kun Yin, Hao liu, Yinsong Liu, Deqiang Jiang, Xing Sun
We propose a novel end-to-end document understanding model called SeRum (SElective Region Understanding Model) for extracting meaningful information from document images, including document analysis, retrieval, and office automation.