1 code implementation • 9 Mar 2025 • Wenxuan Huang, Bohan Jia, Zijie Zhai, Shaosheng Cao, Zheyu Ye, Fei Zhao, Yao Hu, Shaohui Lin
However, direct training with RL struggles to activate complex reasoning capabilities such as questioning and reflection in MLLMs, due to the absence of substantial high-quality multimodal reasoning data.
1 code implementation • 1 Dec 2024 • Wenxuan Huang, Zijie Zhai, Yunhang Shen, Shaosheng Cao, Fei Zhao, Xiangfeng Xu, Zheyu Ye, Shaohui Lin
To address this problem, we propose Dynamic-LLaVA, a dynamic vision-language context sparsification framework that reduces the redundancy of vision context in the prefill stage and decreases the memory and computation overhead of the generated language context during decoding.
Ranked #133 on Visual Question Answering on MM-Vet
no code implementations • 18 Oct 2024 • Chen Zhang, Meizhi Zhong, Qimeng Wang, Xuantao Lu, Zheyu Ye, Chengqiang Lu, Yan Gao, Yao Hu, Kehai Chen, Min Zhang, Dawei Song
Long-context efficiency has recently become a trending topic in serving large language models (LLMs).
1 code implementation • 18 Jun 2024 • Zhouhong Gu, Lin Zhang, Xiaoxuan Zhu, Jiangjie Chen, Wenhao Huang, Yikai Zhang, Shusen Wang, Zheyu Ye, Yan Gao, Hongwei Feng, Yanghua Xiao
This paper proposes a benchmark called DetectBench for verifying the ability to detect and piece together implicit evidence within a long context.
1 code implementation • 20 Mar 2024 • Zhouhong Gu, Xiaoxuan Zhu, Haoran Guo, Lin Zhang, Yin Cai, Hao Shen, Jiangjie Chen, Zheyu Ye, Yifei Dai, Yan Gao, Yao Hu, Hongwei Feng, Yanghua Xiao
Language significantly influences the formation and evolution of human emergent behavior, which is crucial to understanding collective intelligence within human societies.
1 code implementation • 13 Nov 2023 • Chen Zhang, Dawei Song, Zheyu Ye, Yan Gao
Language model (LM) distillation is a trending area that aims to distil the knowledge residing in a large teacher LM to a small student one.
no code implementations • 11 Jul 2023 • Zhouhong Gu, Lin Zhang, Jiangjie Chen, Haoning Ye, Xiaoxuan Zhu, Zihan Li, Zheyu Ye, Yan Gao, Yao Hu, Yanghua Xiao, Hongwei Feng
We introduce DetectBench, a reading comprehension dataset designed to assess a model's joint ability in key information detection and multi-hop reasoning when facing complex and implicit information.
no code implementations • 29 Nov 2021 • Zheyu Ye, Jiangning Liu, Qian Yu, Jianxun Ju
Conversation question answering requires the ability to interpret a question correctly.