Search Results for author: Zechen Bai

Found 9 papers, 6 papers with code

Hallucination of Multimodal Large Language Models: A Survey

1 code implementation • 29 Apr 2024 • Zechen Bai, Pichao Wang, Tianjun Xiao, Tong He, Zongbo Han, Zheng Zhang, Mike Zheng Shou

By drawing the granular classification and landscapes of hallucination causes, evaluation benchmarks, and mitigation methods, this survey aims to deepen the understanding of hallucinations in MLLMs and inspire further advancements in the field.

Hallucination

Bring Your Own Character: A Holistic Solution for Automatic Facial Animation Generation of Customized Characters

1 code implementation • 21 Feb 2024 • Zechen Bai, Peng Chen, Xiaolan Peng, Lu Liu, Hui Chen, Mike Zheng Shou, Feng Tian

In our solution, a deep learning model was first trained to retarget the facial expression from input face images to virtual human faces by estimating the blendshape coefficients.

Unity

Skip \n: A Simple Method to Reduce Hallucination in Large Vision-Language Models

2 code implementations • 2 Feb 2024 • Zongbo Han, Zechen Bai, Haiyang Mei, Qianli Xu, Changqing Zhang, Mike Zheng Shou

Recent advancements in large vision-language models (LVLMs) have demonstrated impressive capability in visual information understanding with human language.

Hallucination

ASSISTGUI: Task-Oriented Desktop Graphical User Interface Automation

no code implementations • 20 Dec 2023 • Difei Gao, Lei Ji, Zechen Bai, Mingyu Ouyang, Peiran Li, Dongxing Mao, Qinchen Wu, Weichen Zhang, Peiyi Wang, Xiangwu Guo, Hengxu Wang, Luowei Zhou, Mike Zheng Shou

Graphical User Interface (GUI) automation holds significant promise for assisting users with complex tasks, thereby boosting human productivity.

Language Modelling, Large Language Model

Unsupervised Open-Vocabulary Object Localization in Videos

no code implementations • ICCV 2023 • Ke Fan, Zechen Bai, Tianjun Xiao, Dominik Zietlow, Max Horn, Zixu Zhao, Carl-Johann Simon-Gabriel, Mike Zheng Shou, Francesco Locatello, Bernt Schiele, Thomas Brox, Zheng Zhang, Yanwei Fu, Tong He

In this paper, we show that recent advances in video representation learning and pre-trained vision-language models allow for substantial improvements in self-supervised video object localization.

Object, Object Localization, +1 more

Object-Centric Multiple Object Tracking

1 code implementation • ICCV 2023 • Zixu Zhao, Jiaze Wang, Max Horn, Yizhuo Ding, Tong He, Zechen Bai, Dominik Zietlow, Carl-Johann Simon-Gabriel, Bing Shuai, Zhuowen Tu, Thomas Brox, Bernt Schiele, Yanwei Fu, Francesco Locatello, Zheng Zhang, Tianjun Xiao

Unsupervised object-centric learning methods allow the partitioning of scenes into entities without additional localization information and are excellent candidates for reducing the annotation burden of multiple-object tracking (MOT) pipelines.

Multiple Object Tracking, Object, +3 more

Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation

1 code implementation • ICCV 2021 • Zechen Bai, Yuta Nakashima, Noa Garcia

Have you ever looked at a painting and wondered what is the story behind it?

Unsupervised Multi-Source Domain Adaptation for Person Re-Identification

1 code implementation • CVPR 2021 • Zechen Bai, Zhigang Wang, Jian Wang, Di Hu, Errui Ding

Although achieving great success, most of them only use limited data from a single-source domain for model pre-training, making the rich labeled data insufficiently exploited.

Person Re-Identification, Unsupervised Domain Adaptation

Show, Recall, and Tell: Image Captioning with Recall Mechanism

no code implementations • 15 Jan 2020 • Li Wang, Zechen Bai, Yonghua Zhang, Hongtao Lu

SG and RWS are designed for the best use of recalled words.

Image Captioning, Retrieval, +2 more
