no code implementations • 28 Nov 2024 • Yingying Deng, Xiangyu He, Fan Tang, WeiMing Dong
In contrast to existing approaches, we have discovered that latent features in vanilla diffusion models inherently contain natural style and content distributions.
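As a hedged illustration of how such latent features can be inspected in a vanilla diffusion model, the sketch below registers a forward hook on the UNet mid block using the diffusers library; the checkpoint and the choice of block are illustrative assumptions, not the authors' setup.

```python
# Illustrative only: cache intermediate UNet features of a vanilla
# Stable Diffusion model via a forward hook (checkpoint and block
# choice are assumptions, not the paper's configuration).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")  # assumes a CUDA device

features = {}

def grab(module, inputs, output):
    # Store the feature map produced by the UNet mid block.
    features["mid"] = output.detach()

pipe.unet.mid_block.register_forward_hook(grab)

_ = pipe("an oil painting of a mountain lake", num_inference_steps=20)
print(features["mid"].shape)  # e.g. (2, 1280, 8, 8) with classifier-free guidance
```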
no code implementations • 30 Oct 2024 • Yuxin Zhang, Dandan Zheng, Biao Gong, Jingdong Chen, Ming Yang, WeiMing Dong, Changsheng Xu
Lighting plays a pivotal role in ensuring the naturalness of video generation, significantly influencing the aesthetic quality of the generated content.
1 code implementation • ACM MM 2024 • Zhenyu Yang, Shengsheng Qian, Dizhan Xue, JiaHong Wu, Fan Yang, WeiMing Dong, Changsheng Xu
To address this limitation, this paper proposes Semantic Editing Increment for ZS-CIR (SEIZE), a training-free method that retrieves the target image based on the query image and text.
Ranked #2 on Zero-Shot Composed Image Retrieval (ZS-CIR) on CIRR
no code implementations • 20 Jul 2024 • Yuyang Wanyan, Xiaoshan Yang, WeiMing Dong, Changsheng Xu
Few-shot action recognition aims to address the high cost and impracticality of manually labeling complex and variable video data in action recognition.
1 code implementation • SIGIR 2024 • Zhenyu Yang, Dizhan Xue, Shengsheng Qian, WeiMing Dong, Changsheng Xu
To conduct ZS-CIR, the prevailing methods employ pre-trained image-to-text models to transform the query image and text into a single text, which is then projected into the common feature space by CLIP to retrieve the target image.
Ranked #4 on Zero-Shot Composed Image Retrieval (ZS-CIR) on CIRCO
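A minimal sketch of the retrieval step described in this entry is given below, assuming the open_clip library; the merged caption and the random gallery features are placeholders for the image-to-text model's output and a precomputed image index.

```python
# Sketch of CLIP-based retrieval for ZS-CIR; the image-to-text step
# that produces `merged_text` is mocked, and the gallery features are
# random placeholders for a precomputed index.
import torch
import open_clip

model, _, _ = open_clip.create_model_and_transforms("ViT-B-32", pretrained="openai")
tokenizer = open_clip.get_tokenizer("ViT-B-32")

# Assume an image-to-text model already fused the query image caption
# with the modification text into one description.
merged_text = "the same dress but in red"

with torch.no_grad():
    q = model.encode_text(tokenizer([merged_text]))
    q = q / q.norm(dim=-1, keepdim=True)

gallery = torch.randn(1000, q.shape[-1])            # placeholder features
gallery = gallery / gallery.norm(dim=-1, keepdim=True)

scores = (q @ gallery.T).squeeze(0)   # cosine similarities to the query
top5 = scores.topk(5).indices         # indices of the retrieved images
```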
no code implementations • 28 Apr 2024 • Yunbing Jia, Xiaoyu Kong, Fan Tang, Yixing Gao, WeiMing Dong, Yi Yang
In this paper, we reveal the two sides of data augmentation: enhancements in closed-set recognition correlate with a significant decrease in open-set recognition performance.
no code implementations • 28 Mar 2024 • Yu Xu, Fan Tang, Juan Cao, Yuxin Zhang, Oliver Deussen, WeiMing Dong, Jintao Li, Tong-Yee Lee
Based on adapters that are broken apart to train content and style separately, we then construct the entity parameter space by reconstructing the content and style PLP matrices, followed by fine-tuning the combined adapter to generate the target object with the desired appearance.
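The combination step can be pictured with a small, purely illustrative sketch that merges two separately trained low-rank adapters into one weight update; the merge rule and names below are assumptions, not the paper's PLP matrix reconstruction.

```python
# Purely illustrative: sum the updates of separately trained content
# and style low-rank adapters (not the paper's PLP construction).
import torch

def merge_adapters(base_w, content_ab, style_ab, alpha=1.0, beta=1.0):
    """base_w: (out, in); each adapter is an (A, B) pair with
    A: (out, r) and B: (r, in)."""
    ca, cb = content_ab
    sa, sb = style_ab
    return base_w + alpha * (ca @ cb) + beta * (sa @ sb)

# Toy usage with rank-4 adapters on a 64x64 weight matrix.
w = torch.randn(64, 64)
content = (torch.randn(64, 4), torch.randn(4, 64))
style = (torch.randn(64, 4), torch.randn(4, 64))
merged = merge_adapters(w, content, style)
```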
1 code implementation • 25 Jan 2024 • Nisha Huang, WeiMing Dong, Yuxin Zhang, Fan Tang, Ronghui Li, Chongyang Ma, Xiu Li, Changsheng Xu
Large-scale text-to-image generative models have made impressive strides, showcasing their ability to synthesize a vast array of high-quality images.
1 code implementation • CVPR 2024 • Yingying Deng, Xiangyu He, Fan Tang, WeiMing Dong
Despite the remarkable progress in image style transfer, formulating style in the context of art is inherently subjective and challenging.
1 code implementation • 25 Dec 2023 • Chengcheng Ma, Ismail Elezi, Jiankang Deng, WeiMing Dong, Changsheng Xu
For instance, on CIFAR-10-LT, CPE improves test accuracy by over 2.22% compared to baselines.
1 code implementation • 8 Dec 2023 • Yuxin Zhang, Fan Tang, Nisha Huang, Haibin Huang, Chongyang Ma, WeiMing Dong, Changsheng Xu
The essence of a video lies in its dynamic motions, including character actions, object movements, and camera movements.
1 code implementation • 25 Nov 2023 • Yingying Deng, Xiangyu He, Fan Tang, WeiMing Dong
Despite the remarkable progress in image style transfer, formulating style in the context of art is inherently subjective and challenging.
3 code implementations • 25 May 2023 • Yuxin Zhang, WeiMing Dong, Fan Tang, Nisha Huang, Haibin Huang, Chongyang Ma, Tong-Yee Lee, Oliver Deussen, Changsheng Xu
We apply ProSpect in various personalized attribute-aware image generation applications, such as image-guided or text-driven manipulations of materials, style, and layout, achieving previously unattainable results from a single image input without fine-tuning the diffusion models.
1 code implementation • 9 May 2023 • Nisha Huang, Yuxin Zhang, WeiMing Dong
Large-scale text-to-video diffusion models have demonstrated an exceptional ability to synthesize diverse videos.
1 code implementation • 9 Mar 2023 • Yuxin Zhang, Fan Tang, WeiMing Dong, Haibin Huang, Chongyang Ma, Tong-Yee Lee, Changsheng Xu
Our framework consists of three key components, i.e., a parallel contrastive learning scheme for style representation and style transfer, a domain enhancement module for effective learning of style distribution, and a generative network for style transfer.
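One common way to realize such a contrastive scheme for style representation is an InfoNCE loss over pairs of views sharing a style, sketched below; the temperature and batch layout are assumptions, not the paper's exact formulation.

```python
# Minimal InfoNCE sketch for contrastive style representation learning
# (temperature and pairing scheme are illustrative assumptions).
import torch
import torch.nn.functional as F

def style_contrastive_loss(anchor, positive, temperature=0.07):
    """anchor/positive: (B, D) style codes of two views of the same style;
    codes from other batch items act as negatives."""
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    logits = a @ p.T / temperature           # (B, B) similarity matrix
    labels = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, labels)   # matching pairs lie on the diagonal

loss = style_contrastive_loss(torch.randn(8, 128), torch.randn(8, 128))
```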
1 code implementation • 23 Feb 2023 • Nisha Huang, Fan Tang, WeiMing Dong, Tong-Yee Lee, Changsheng Xu
Different from current mask-based image editing methods, we propose a novel region-aware diffusion model (RDM) for entity-level image editing, which can automatically locate the region of interest and replace it following the given text prompts.
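The two-stage idea (locate the entity from text, then replace it) can be sketched with off-the-shelf parts, shown below; CLIPSeg and the inpainting checkpoint are illustrative stand-ins, not the paper's RDM.

```python
# Hedged sketch of text-driven, region-aware editing with stand-in
# models (open-vocabulary segmentation + diffusion inpainting); this
# is not the paper's RDM implementation.
import torch
from PIL import Image
from transformers import CLIPSegProcessor, CLIPSegForImageSegmentation
from diffusers import StableDiffusionInpaintPipeline

image = Image.open("input.jpg").convert("RGB").resize((512, 512))

# Stage 1: locate the entity named in the prompt.
proc = CLIPSegProcessor.from_pretrained("CIDAS/clipseg-rd64-refined")
seg = CLIPSegForImageSegmentation.from_pretrained("CIDAS/clipseg-rd64-refined")
inputs = proc(text=["the dog"], images=[image], return_tensors="pt")
with torch.no_grad():
    logits = seg(**inputs).logits
mask_arr = ((logits.sigmoid() > 0.5).squeeze().numpy() * 255).astype("uint8")
mask = Image.fromarray(mask_arr).resize(image.size)

# Stage 2: replace the located region following the edit prompt.
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting")
edited = pipe(prompt="a corgi wearing sunglasses",
              image=image, mask_image=mask).images[0]
```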
1 code implementation • CVPR 2023 • Yuxin Zhang, Nisha Huang, Fan Tang, Haibin Huang, Chongyang Ma, WeiMing Dong, Changsheng Xu
Our key idea is to learn artistic style directly from a single painting and then guide the synthesis without providing complex textual descriptions.
Ranked #5 on Style Transfer on StyleBench
1 code implementation • 19 Nov 2022 • Nisha Huang, Yuxin Zhang, Fan Tang, Chongyang Ma, Haibin Huang, Yong Zhang, WeiMing Dong, Changsheng Xu
Beyond the impressive results of arbitrary image-guided style transfer methods, text-driven image stylization has recently been proposed for transferring a natural image into a stylized one according to textual descriptions of the target style provided by the user.
1 code implementation • 4 Nov 2022 • Chengcheng Ma, Yang Liu, Jiankang Deng, Lingxi Xie, WeiMing Dong, Changsheng Xu
Pretrained vision-language models (VLMs) such as CLIP have shown impressive generalization capability in downstream vision tasks with appropriate text prompts.
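A minimal sketch of such prompt-based zero-shot classification with the original CLIP package is shown below; the class names and the prompt template are illustrative, and the template's effect on accuracy is what motivates prompt learning.

```python
# Minimal zero-shot classification with a hand-crafted prompt template
# (class names and template are illustrative).
import torch
import clip
from PIL import Image

model, preprocess = clip.load("ViT-B/32")
image = preprocess(Image.open("photo.jpg")).unsqueeze(0)
texts = clip.tokenize([f"a photo of a {c}" for c in ("cat", "dog", "car")])

with torch.no_grad():
    img_f = model.encode_image(image)
    txt_f = model.encode_text(texts)
    img_f = img_f / img_f.norm(dim=-1, keepdim=True)
    txt_f = txt_f / txt_f.norm(dim=-1, keepdim=True)
    probs = (100.0 * img_f @ txt_f.T).softmax(dim=-1)
print(probs)  # changing the template changes these scores
```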
1 code implementation • 27 Sep 2022 • Nisha Huang, Fan Tang, WeiMing Dong, Changsheng Xu
Extensive qualitative and quantitative experimental results on the generated digital art paintings confirm the effectiveness of combining the diffusion model with multimodal guidance.
1 code implementation • 19 May 2022 • Yuxin Zhang, Fan Tang, WeiMing Dong, Haibin Huang, Chongyang Ma, Tong-Yee Lee, Changsheng Xu
Our framework consists of three key components, i.e., a multi-layer style projector for style code encoding, a domain enhancement module for effective learning of style distribution, and a generative network for image style transfer.
Ranked #4 on Style Transfer on StyleBench
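One way to picture the multi-layer style projector is sketched below: pooled features from several VGG-19 layers are concatenated and projected to a style code; the layer indices and code size are assumptions, not the paper's exact design.

```python
# Illustrative multi-layer style projector: pool several VGG-19 layer
# activations and project them to a compact style code.
import torch
import torch.nn as nn
from torchvision.models import vgg19

class StyleProjector(nn.Module):
    def __init__(self, layers=(3, 8, 17, 26), code_dim=128):
        super().__init__()
        self.vgg = vgg19(weights="IMAGENET1K_V1").features.eval()
        for p in self.vgg.parameters():
            p.requires_grad_(False)
        self.layers = set(layers)
        # Channel counts at the chosen ReLU layers: 64 + 128 + 256 + 512.
        self.proj = nn.Linear(960, code_dim)

    def forward(self, x):
        feats = []
        for i, m in enumerate(self.vgg):
            x = m(x)
            if i in self.layers:
                feats.append(x.mean(dim=(2, 3)))  # global average pooling
        return self.proj(torch.cat(feats, dim=1))

code = StyleProjector()(torch.randn(2, 3, 256, 256))  # (2, 128) style codes
```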
1 code implementation • 26 Jan 2022 • Chengcheng Ma, Xingjia Pan, Qixiang Ye, Fan Tang, WeiMing Dong, Changsheng Xu
Semi-supervised object detection has recently achieved substantial progress.
3 code implementations • CVPR 2022 • Yingying Deng, Fan Tang, WeiMing Dong, Chongyang Ma, Xingjia Pan, Lei Wang, Changsheng Xu
The goal of image style transfer is to render an image with artistic features guided by a style reference while maintaining the original content.
Ranked #3 on Style Transfer on StyleBench
1 code implementation • 3 Aug 2021 • Yifan Xu, Zhijie Zhang, Mengdan Zhang, Kekai Sheng, Ke Li, WeiMing Dong, Liqing Zhang, Changsheng Xu, Xing Sun
Vision transformers (ViTs) have recently gained explosive popularity, but their huge computational cost remains a severe issue.
Ranked #11 on Efficient ViTs on ImageNet-1K (with DeiT-T)
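A toy sketch of one common remedy, dropping patch tokens that receive little attention from the class token, is shown below; it is a generic efficiency trick, not necessarily this paper's mechanism.

```python
# Toy token pruning by [CLS] attention scores (a generic ViT
# efficiency trick, not this paper's exact method).
import torch

def prune_tokens(tokens, cls_attn, keep_ratio=0.5):
    """tokens: (B, N, D) patch tokens (excluding [CLS]);
    cls_attn: (B, N) attention of [CLS] to each patch token."""
    k = max(1, int(tokens.size(1) * keep_ratio))
    idx = cls_attn.topk(k, dim=1).indices                 # most-attended patches
    idx = idx.unsqueeze(-1).expand(-1, -1, tokens.size(-1))
    return tokens.gather(1, idx)                          # (B, k, D)

kept = prune_tokens(torch.randn(2, 196, 768), torch.rand(2, 196))
```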
no code implementations • 14 Jun 2021 • Pei Lv, Jianqi Fan, Xixi Nie, WeiMing Dong, Xiaoheng Jiang, Bing Zhou, Mingliang Xu, Changsheng Xu
This framework leverages user interactions to retouch and rank images for aesthetic assessment based on deep reinforcement learning (DRL), and generates a personalized aesthetic distribution that is more in line with the aesthetic preferences of different users.
4 code implementations • 30 May 2021 • Yingying Deng, Fan Tang, WeiMing Dong, Chongyang Ma, Xingjia Pan, Lei Wang, Changsheng Xu
The goal of image style transfer is to render an image with artistic features guided by a style reference while maintaining the original content.
no code implementations • 21 Apr 2021 • Yifan Xu, Kekai Sheng, WeiMing Dong, Baoyuan Wu, Changsheng Xu, Bao-Gang Hu
However, due to unpredictable corruptions (e.g., noise and blur) in real data such as web images, domain adaptation methods are increasingly required to be corruption-robust on target domains.
no code implementations • 25 Mar 2021 • Kekai Sheng, Ke Li, Xiawu Zheng, Jian Liang, WeiMing Dong, Feiyue Huang, Rongrong Ji, Xing Sun
However, considering that the configuration of attention, i.e., the type and position of the attention module, significantly affects performance, it is more general to automatically optimize the attention configuration so that it is specialized for an arbitrary UDA scenario.
Ranked #1 on Partial Domain Adaptation on Office-Home
1 code implementation • CVPR 2021 • Xingjia Pan, Yingguo Gao, Zhiwen Lin, Fan Tang, WeiMing Dong, Haolei Yuan, Feiyue Huang, Changsheng Xu
Weakly supervised object localization (WSOL) remains an open problem, given the difficulty of finding object extent information using only a classification network.
no code implementations • 4 Dec 2020 • Zhiyong Huang, Kekai Sheng, WeiMing Dong, Xing Mei, Chongyang Ma, Feiyue Huang, Dengwen Zhou, Changsheng Xu
For intra-domain propagation, we propose an effective self-training strategy to mitigate the noise in pseudo-labeled target-domain data and improve feature discriminability in the target domain.
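A minimal sketch of such noise mitigation is confidence-thresholded pseudo-labeling, shown below; the threshold value is an assumption, and the paper's actual strategy may differ.

```python
# Minimal confidence-thresholded pseudo-labeling for self-training
# (the 0.9 threshold is an illustrative assumption).
import torch

def select_pseudo_labels(logits, threshold=0.9):
    """logits: (N, C) model outputs on unlabeled target-domain samples."""
    probs = logits.softmax(dim=-1)
    conf, labels = probs.max(dim=-1)
    keep = conf >= threshold          # discard low-confidence (noisy) labels
    return labels[keep], keep

labels, mask = select_pseudo_labels(torch.randn(8, 10))
```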
no code implementations • 26 Mar 2014 • WeiMing Dong, Fuzhang Wu, Yan Kong, Xing Mei, Tong-Yee Lee, Xiaopeng Zhang
We propose to retarget the textural regions by content-aware synthesis and non-textural regions by fast multi-operators.