1 code implementation • 12 Mar 2024 • Shihao Zhao, Shaozhe Hao, Bojia Zi, Huaizhe Xu, Kwan-Yee K. Wong
In this paper, we propose LaVi-Bridge, a pipeline that enables the integration of diverse pre-trained language models and generative vision models for text-to-image generation.
1 code implementation • 1 Jun 2023 • Shaozhe Hao, Kai Han, Shihao Zhao, Kwan-Yee K. Wong
Personalized text-to-image generation using diffusion models has recently emerged and garnered significant interest.
1 code implementation • NeurIPS 2023 • Shihao Zhao, Dongdong Chen, Yen-Chun Chen, Jianmin Bao, Shaozhe Hao, Lu Yuan, Kwan-Yee K. Wong
Text-to-image diffusion models have made tremendous progress over the past two years, enabling the generation of highly realistic images based on open-domain text descriptions.
1 code implementation • 14 Apr 2023 • Shaozhe Hao, Kai Han, Kwan-Yee K. Wong
Generalized category discovery (GCD) considers the open-world problem of automatically clustering a partially labelled dataset, in which the unlabelled data may contain instances from both novel categories and labelled classes.
1 code implementation • CVPR 2023 • Shaozhe Hao, Kai Han, Kwan-Yee K. Wong
The key to compositional zero-shot learning (CZSL) is learning the disentanglement of the attribute-object composition.
1 code implementation • 15 Feb 2022 • Shaozhe Hao, Chaofeng Chen, Zhenfang Chen, Kwan-Yee K. Wong
We introduce rectification blocks to rectify features extracted by a state-of-the-art recognition model, in both spatial and channel dimensions, to minimize the distance between a masked face and its mask-free counterpart in the rectified feature space.