no code implementations • 22 Apr 2024 • Weili Zeng, Yichao Yan, Qi Zhu, Zhuo Chen, Pengzhi Chu, Weiming Zhao, Xiaokang Yang
Text-to-image (T2I) customization aims to create images that embody specific visual concepts delineated in textual descriptions.
1 code implementation • 27 Jun 2023 • Keqin Chen, Zhao Zhang, Weili Zeng, Richong Zhang, Feng Zhu, Rui Zhao
Referential dialogue is a superset of various vision-language (VL) tasks.
Ranked #10 on Visual Question Answering on ViP-Bench
no code implementations • 5 Mar 2023 • Weili Zeng
Energy-based models parameterize the unnormalized log-probability of data samples, but little guidance exists on how to construct the "energy".