no code implementations • 22 Apr 2024 • Mengzhao Jia, Zhihan Zhang, Wenhao Yu, Fangkai Jiao, Meng Jiang
Open-source multimodal large language models (MLLMs) excel in various tasks involving textual and visual inputs but still struggle with complex multimodal mathematical reasoning, lagging behind proprietary models like GPT-4V(ision) and Gemini-Pro.
no code implementations • 16 Dec 2023 • Mengzhao Jia, Can Xie, Liqiang Jing
Moreover, we propose a novel debiasing multimodal sarcasm detection framework with contrastive learning, which aims to mitigate the harmful effects of biased textual factors and achieve robust out-of-distribution (OOD) generalization.
1 code implementation • 15 Nov 2023 • Zhihan Zhang, Dong-Ho Lee, Yuwei Fang, Wenhao Yu, Mengzhao Jia, Meng Jiang, Francesco Barbieri
Instruction tuning has remarkably advanced large language models (LLMs) in understanding and responding to diverse human instructions.
1 code implementation • 2 Nov 2023 • Liqiang Jing, Ruosen Li, Yunmo Chen, Mengzhao Jia, Xinya Du
We introduce FAITHSCORE (Faithfulness to Atomic Image Facts Score), a reference-free and fine-grained evaluation metric that measures the faithfulness of the generated free-form answers from large vision-language models (LVLMs).
no code implementations • 11 Oct 2023 • Mengzhao Jia, Qianglong Chen, Liqiang Jing, Dawei Fu, Renyu Li
The prevalence of mental disorders has become a significant issue, leading to increased focus on Emotional Support Conversation as an effective supplement to mental health support.
1 code implementation • 29 Jun 2023 • Liqiang Jing, Xuemeng Song, Kun Ouyang, Mengzhao Jia, Liqiang Nie
Multimodal Sarcasm Explanation (MuSE) is a new and challenging task that aims to generate a natural language sentence for a multimodal social post (an image together with its caption) explaining why the post contains sarcasm.