Search Results for author: Guoming Wang

Found 4 papers, 2 papers with code

WorldGPT: Empowering LLM as Multimodal World Model

1 code implementation • 28 Apr 2024 • Zhiqi Ge, Hongzhe Huang, Mingze Zhou, Juncheng Li, Guoming Wang, Siliang Tang, Yueting Zhuang

As for evaluation, we build WorldNet, a multimodal state transition prediction benchmark encompassing varied real-life scenarios.

Language Modeling, Language Modelling, +2

De-fine: Decomposing and Refining Visual Programs with Auto-Feedback

no code implementations • 21 Nov 2023 • Minghe Gao, Juncheng Li, Hao Fei, Liang Pang, Wei Ji, Guoming Wang, Zheqi Lv, Wenqiao Zhang, Siliang Tang, Yueting Zhuang

Visual programming, a modular and generalizable paradigm, integrates different modules and Python operators to solve various vision-language tasks.

Logical Reasoning

Improving Vision Anomaly Detection with the Guidance of Language Modality

1 code implementation • 4 Oct 2023 • Dong Chen, Kaihang Pan, Guoming Wang, Yueting Zhuang, Siliang Tang

To learn a more compact latent space for the vision anomaly detector, CMLE learns a correlation structure matrix from the language modality and then uses that matrix to guide the learning of the vision modality's latent space.

Anomaly Detection, Defect Detection, +1
