1 code implementation • 16 Feb 2024 • Shengzhi Li, Rongyu Lin, Shichao Pei
In conclusion, we propose a distillation-based multi-modal alignment model that uses fine-grained annotations on a small dataset to reconcile the textual and visual performance of MLLMs, restoring and boosting language capability after visual instruction tuning.
1 code implementation • 7 Aug 2023 • Shengzhi Li, Nima Tajbakhsh
We asked GPT-4 to assess the matching quality of our question-answer turns given the paper's context, obtaining an average rating of 8.7/10 on our 3K test set.
Ranked #1 on Visual Question Answering (VQA) on SciGraphQA