Search Results for author: Shunian Chen

Found 13 papers, 9 papers with code

BlenderLLM: Training Large Language Models for Computer-Aided Design with Self-improvement

1 code implementation • 16 Dec 2024 • Yuhao Du, Shunian Chen, Wenbo Zan, Peizhao Li, Mingxuan Wang, Dingjie Song, Bo Li, Yan Hu, Benyou Wang

In this paper, we present BlenderLLM, a novel framework for training LLMs specifically for CAD tasks, leveraging a self-improvement methodology (a schematic sketch of such a loop follows this entry).

Script Generation • Text to 3D
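The abstract only names the self-improvement methodology, so the following is a minimal toy sketch of what such a loop typically looks like: sample CAD scripts, keep the ones that pass an automatic check, and fine-tune on the retained outputs. Every function and data structure here is a hypothetical stand-in, not BlenderLLM's actual code or API.

```python
"""Hedged sketch of a generic self-improvement loop for a script-generating
model. NOT BlenderLLM's implementation; the model, validity check, and
fine-tuning step are toy stand-ins."""
import random


def generate_script(model: dict, prompt: str) -> str:
    # Toy stand-in for LLM sampling: fill one of the model's templates.
    return random.choice(model["templates"]).format(prompt=prompt)


def script_is_valid(script: str) -> bool:
    # Stand-in for an automatic check (e.g. "does the script run in Blender").
    return "bpy" in script


def fine_tune(model: dict, accepted: list[str]) -> dict:
    # Stand-in for fine-tuning: bias the model toward its accepted outputs.
    if accepted:
        model["templates"] = accepted
    return model


def self_improve(model: dict, prompts: list[str], rounds: int = 3) -> dict:
    for _ in range(rounds):
        candidates = [generate_script(model, p) for p in prompts]
        accepted = [s for s in candidates if script_is_valid(s)]
        model = fine_tune(model, accepted)
    return model


model = {"templates": ["import bpy  # make: {prompt}", "print('{prompt}')"]}
model = self_improve(model, ["a red cube", "a gear with 12 teeth"])
print(model["templates"])
```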

Both Text and Images Leaked! A Systematic Analysis of Multimodal LLM Data Contamination

1 code implementation • 6 Nov 2024 • Dingjie Song, Sicheng Lai, Shunian Chen, Lichao Sun, Benyou Wang

The rapid progression of multimodal large language models (MLLMs) has led to superior performance on various multimodal benchmarks.

VLFeedback: A Large-Scale AI Feedback Dataset for Large Vision-Language Models Alignment

no code implementations • 12 Oct 2024 • Lei LI, Zhihui Xie, Mukai Li, Shunian Chen, Peiyi Wang, Liang Chen, Yazheng Yang, Benyou Wang, Lingpeng Kong, Qi Liu

As large vision-language models (LVLMs) evolve rapidly, the demand for high-quality and diverse data to align these models becomes increasingly crucial.

Diversity • Hallucination +3

LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via a Hybrid Architecture

1 code implementation • 4 Sep 2024 • Xidong Wang, Dingjie Song, Shunian Chen, Chen Zhang, Benyou Wang

Expanding the long-context capabilities of Multi-modal Large Language Models (MLLMs) is crucial for video understanding, high-resolution image understanding, and multi-modal agents.

Mamba • Video Understanding

MileBench: Benchmarking MLLMs in Long Context

no code implementations • 29 Apr 2024 • Dingjie Song, Shunian Chen, Guiming Hardy Chen, Fei Yu, Xiang Wan, Benyou Wang

Despite the advancements and impressive performance of Multimodal Large Language Models (MLLMs) on benchmarks, their effectiveness in real-world, long-context, and multi-image tasks is unclear due to the benchmarks' limited scope.

Benchmarking • Diagnostic

ALLaVA: Harnessing GPT4V-Synthesized Data for Lite Vision-Language Models

1 code implementation • 18 Feb 2024 • Guiming Hardy Chen, Shunian Chen, Ruifei Zhang, Junying Chen, Xiangbo Wu, Zhiyi Zhang, Zhihong Chen, Jianquan Li, Xiang Wan, Benyou Wang

Large vision-language models (LVLMs) have shown promise in a broad range of vision-language tasks with their strong reasoning and generalization capabilities.

Language Modelling • Question Answering +1

Silkie: Preference Distillation for Large Visual Language Models

no code implementations • 17 Dec 2023 • Lei LI, Zhihui Xie, Mukai Li, Shunian Chen, Peiyi Wang, Liang Chen, Yazheng Yang, Benyou Wang, Lingpeng Kong

This paper explores preference distillation for large vision language models (LVLMs), improving their ability to generate helpful and faithful responses anchored in the visual context.

Hallucination • MME +1

MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria

1 code implementation • 23 Nov 2023 • Wentao Ge, Shunian Chen, Guiming Hardy Chen, Junying Chen, Zhihong Chen, Nuo Chen, Wenya Xie, Shuo Yan, Chenghao Zhu, Ziyue Lin, Song Dingjie, Xidong Wang, Anningzhe Gao, Zhang Zhiyi, Jianquan Li, Xiang Wan, Benyou Wang

To this end, we propose a new evaluation paradigm for MLLMs: evaluating them against per-sample criteria, using a potent MLLM as the judge.
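As a rough illustration of the per-sample-criteria idea, each benchmark sample can carry its own rubric, and a judge scores a candidate answer against that rubric. The toy keyword-matching judge below is a hypothetical stand-in for the "potent MLLM as the judge"; MLLM-Bench's actual judge, prompts, and criteria are not reproduced here.

```python
"""Hedged sketch of per-sample-criteria judging. The judge is a trivial
keyword matcher standing in for an MLLM judge; a real judge would also
see the image and the full rubric text."""
from dataclasses import dataclass


@dataclass
class Sample:
    question: str
    candidate_answer: str    # answer from the model under evaluation
    criteria: list[str]      # per-sample rubric unique to this sample


def toy_judge(sample: Sample) -> float:
    # Score = fraction of rubric points mentioned in the candidate answer.
    hits = sum(c.lower() in sample.candidate_answer.lower() for c in sample.criteria)
    return hits / len(sample.criteria)


samples = [
    Sample("What is happening in the image?",
           "A dog runs along a beach at sunset",
           criteria=["dog", "beach", "sunset"]),
    Sample("Read the text on the sign.",
           "The sign says STOP",
           criteria=["stop"]),
]
print(sum(toy_judge(s) for s in samples) / len(samples))  # mean per-sample score
```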

HuatuoGPT-II, One-stage Training for Medical Adaption of LLMs

1 code implementation • 16 Nov 2023 • Junying Chen, Xidong Wang, Ke Ji, Anningzhe Gao, Feng Jiang, Shunian Chen, Hongbo Zhang, Dingjie Song, Wenya Xie, Chuyi Kong, Jianquan Li, Xiang Wan, Haizhou Li, Benyou Wang

We validate the new protocol in the domains where proprietary LLMs like ChatGPT perform relatively poorly, such as Traditional Chinese Medicine.

Domain Adaptation • Language Modeling +1
