Search Results for author: Lingyu Kong

Found 4 papers, 2 papers with code

OneChart: Purify the Chart Structural Extraction via One Auxiliary Token

1 code implementation • 15 Apr 2024 • Jinyue Chen, Lingyu Kong, Haoran Wei, Chenglong Liu, Zheng Ge, Liang Zhao, Jianjian Sun, Chunrui Han, Xiangyu Zhang

To address this, we propose OneChart: a reliable agent specifically devised for the structural extraction of chart information.

Small Language Model Meets with Reinforced Vision Vocabulary

no code implementations • 23 Jan 2024 • Haoran Wei, Lingyu Kong, Jinyue Chen, Liang Zhao, Zheng Ge, En Yu, Jianjian Sun, Chunrui Han, Xiangyu Zhang

In Vary-toy, we introduce an improved vision vocabulary, allowing the model to not only possess all features of Vary but also gather more generality.

Language Modelling, Large Language Model, +3 more

Merlin: Empowering Multimodal LLMs with Foresight Minds

no code implementations • 30 Nov 2023 • En Yu, Liang Zhao, Yana Wei, Jinrong Yang, Dongming Wu, Lingyu Kong, Haoran Wei, Tiancai Wang, Zheng Ge, Xiangyu Zhang, Wenbing Tao

Then, FIT requires MLLMs to first predict trajectories of related objects and then reason about potential future events based on them.

Visual Question Answering
