Search Results for author: Mianzhi Pan

Found 2 papers, 1 papers with code

Food-500 Cap: A Fine-Grained Food Caption Benchmark for Evaluating Vision-Language Models

1 code implementation6 Aug 2023 Zheng Ma, Mianzhi Pan, Wenhan Wu, Kanzhi Cheng, Jianbing Zhang, ShuJian Huang, Jiajun Chen

Experiments on our proposed datasets demonstrate that popular VLMs underperform in the food domain compared with their performance in the general domain.

Probing Cross-modal Semantics Alignment Capability from the Textual Perspective

no code implementations18 Oct 2022 Zheng Ma, Shi Zong, Mianzhi Pan, Jianbing Zhang, ShuJian Huang, Xinyu Dai, Jiajun Chen

In recent years, vision and language pre-training (VLP) models have advanced the state-of-the-art results in a variety of cross-modal downstream tasks.

Image Captioning Sentence

Cannot find the paper you are looking for? You can Submit a new open access paper.