Search Results for author: Ya-Qi Yu

Found 1 paper, 1 paper with code

TextHawk: Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models

1 code implementation • 14 Apr 2024 • Ya-Qi Yu, Minghui Liao, Jihao Wu, Yongxin Liao, Xiaoyu Zheng, Wei Zeng

We conduct extensive experiments on both general and document-oriented MLLM benchmarks, and show that TextHawk outperforms the state-of-the-art methods, demonstrating its effectiveness and superiority in fine-grained document perception and general abilities.
