1 code implementation • 5 May 2025 • Nikolay Safonov, Alexey Bryncev, Andrey Moskalenko, Dmitry Kulikov, Dmitry Vatolin, Radu Timofte, Haibo Lei, Qifan Gao, Qing Luo, Yaqing Li, Jie Song, Shaozhe Hao, Meisong Zheng, Jingyi Xu, Chengbin Wu, Jiahui Liu, Ying Chen, Xin Deng, Mai Xu, Peipei Liang, Jie Ma, Junjie Jin, Yingxue Pang, Fangzhou Luo, Kai Chen, Shijie Zhao, Mingyang Wu, Renjie Li, Yushen Zuo, Shengyun Zhong, Zhengzhong Tu
This paper presents an overview of the NTIRE 2025 Challenge on UGC Video Enhancement.
no code implementations • 10 Feb 2025 • Bojia Zi, Penghui Ruan, Marco Chen, Xianbiao Qi, Shaozhe Hao, Shihao Zhao, Youze Huang, Bin Liang, Rong Xiao, Kam-Fai Wong
On the other hand, end-to-end methods, which rely on edited video pairs for training, offer faster inference speeds but often produce poor editing results due to a lack of high-quality training video pairs.
1 code implementation • 21 Oct 2024 • Xuantong Liu, Shaozhe Hao, Xianbiao Qi, Tianyang Hu, Jun Wang, Rong Xiao, Yuan YAO
However, considering the essential differences between text and image modalities, the design space of language models for image generation remains underexplored.
Ranked #23 on Image Generation on ImageNet 256x256
1 code implementation • 18 Oct 2024 • Shaozhe Hao, Xuantong Liu, Xianbiao Qi, Shihao Zhao, Bojia Zi, Rong Xiao, Kai Han, Kwan-Yee K. Wong
We introduce BiGR, a novel conditional image generation model using compact binary latent codes for generative training, focusing on enhancing both generation and representation capabilities.
1 code implementation • 1 Oct 2024 • Zhi Xu, Shaozhe Hao, Kai Han
In this paper, we study a new and challenging task, customized concept decomposition, wherein the objective is to leverage diffusion models to decompose a single image and generate visual concepts from various perspectives.
no code implementations • CVPR 2025 • Shuya Yang, Shaozhe Hao, Yukang Cao, Kwan-Yee K. Wong
In this paper, we introduce ArtiFade to tackle this issue and successfully generate high-quality artifact-free images from blemished datasets.
1 code implementation • 9 Jul 2024 • Shaozhe Hao, Kai Han, Zhengyao Lv, Shihao Zhao, Kwan-Yee K. Wong
To achieve this, we present ConceptExpress that tackles UCE by unleashing the inherent capabilities of pretrained diffusion models in two aspects.
2 code implementations • 12 Mar 2024 • Shihao Zhao, Shaozhe Hao, Bojia Zi, Huaizhe Xu, Kwan-Yee K. Wong
In this paper, we explore this objective and propose LaVi-Bridge, a pipeline that enables the integration of diverse pre-trained language models and generative vision models for text-to-image generation.
1 code implementation • 1 Jun 2023 • Shaozhe Hao, Kai Han, Shihao Zhao, Kwan-Yee K. Wong
Personalized text-to-image generation using diffusion models has recently emerged and garnered significant interest.
1 code implementation • NeurIPS 2023 • Shihao Zhao, Dongdong Chen, Yen-Chun Chen, Jianmin Bao, Shaozhe Hao, Lu Yuan, Kwan-Yee K. Wong
Text-to-Image diffusion models have made tremendous progress over the past two years, enabling the generation of highly realistic images based on open-domain text descriptions.
1 code implementation • 14 Apr 2023 • Shaozhe Hao, Kai Han, Kwan-Yee K. Wong
GCD considers the open-world problem of automatically clustering a partially labelled dataset, in which the unlabelled data may contain instances from both novel categories and labelled classes.
1 code implementation • CVPR 2023 • Shaozhe Hao, Kai Han, Kwan-Yee K. Wong
The key to CZSL is learning the disentanglement of the attribute-object composition.
1 code implementation • 15 Feb 2022 • Shaozhe Hao, Chaofeng Chen, Zhenfang Chen, Kwan-Yee K. Wong
We introduce rectification blocks to rectify features extracted by a state-of-the-art recognition model, in both spatial and channel dimensions, to minimize the distance between a masked face and its mask-free counterpart in the rectified feature space.