1 code implementation • 15 Oct 2024 • Qizhang Li, Xiaochen Yang, WangMeng Zuo, Yiwen Guo
Our method also achieves over 90% attack success rates against Llama-2-Chat models on AdvBench, despite their outstanding resistance to jailbreak attacks.
1 code implementation • 13 Oct 2024 • Lan Yao, Chaofeng Chen, Xiaoming Li, Zifei Yan, WangMeng Zuo
In this work, we propose encapsulating the generative face prior as a guided natural manifold to facilitate the correction of facial regions.
no code implementations • 4 Oct 2024 • Minheng Ni, Yutao Fan, Lei Zhang, WangMeng Zuo
As large-scale models evolve, language instructions are increasingly utilized in multi-modal tasks.
1 code implementation • 2 Oct 2024 • Kailai Feng, Yabo Zhang, Haodong Yu, Zhilong Ji, Jinfeng Bai, Hongzhi Zhang, WangMeng Zuo
Artistic typography is a technique to visualize the meaning of input character in an imaginable and readable manner.
no code implementations • 26 Sep 2024 • Xinya Shu, Yu Li, Dongwei Ren, Xiaohe Wu, Jin Li, WangMeng Zuo
Then, to effectively learn the baseline defocus deblurring network with misaligned training pairs, our reblurring module ensures spatial consistency between the deblurred image, the reblurred image and the input blurry image by reconstructing spatially variant isotropic blur kernels.
no code implementations • 17 Sep 2024 • Bowen Dong, Pan Zhou, WangMeng Zuo
We introduce LPT++, a comprehensive framework for long-tailed classification that combines parameter-efficient fine-tuning (PEFT) with a learnable model ensemble.
1 code implementation • 25 Aug 2024 • Wenrui Li, Fucheng Cai, Yapeng Mi, Zhe Yang, WangMeng Zuo, Xingtao Wang, Xiaopeng Fan
Our proposed method leverages a text-driven panoramic image generation model as a prior for 3D scene generation and employs 3D Gaussian Splatting (3DGS) to ensure consistency across multi-view panoramic images.
1 code implementation • 21 Aug 2024 • Wei Shang, Dongwei Ren, Wanying Zhang, Qilong Wang, Pengfei Zhu, WangMeng Zuo
Subsequently, to effectively train the DRSC network, we propose a self-supervised learning strategy that ensures cycle consistency between input and reconstructed dual reversed RS images.
no code implementations • 21 Aug 2024 • Minheng Ni, Chenfei Wu, Huaying Yuan, Zhengyuan Yang, Ming Gong, Lijuan Wang, Zicheng Liu, WangMeng Zuo, Nan Duan
With the advancement of generative models, the synthesis of different sensory elements such as music, visuals, and speech has achieved significant realism.
1 code implementation • 17 Aug 2024 • Tianyi Zhu, Wei Shang, Dongwei Ren, WangMeng Zuo
Motivated by this observation, we propose a simple yet effective interpolation method for animation line inbetweening that adopts thin-plate spline-based transformation to estimate coarse motion more accurately by modeling the keypoint correspondence between two key frames, particularly for large motion scenarios.
1 code implementation • 13 Jul 2024 • Wei Shang, Dongwei Ren, Wanying Zhang, Yuming Fang, WangMeng Zuo, Kede Ma
Arbitrary-scale video super-resolution (AVSR) aims to enhance the resolution of video frames, potentially at various scaling factors, which presents several challenges regarding spatial detail reproduction, temporal consistency, and computational complexity.
1 code implementation • 10 Jul 2024 • Haoliang Meng, Xiaopeng Hong, Chenhao Wang, Miao Shang, WangMeng Zuo
Multi-modal crowd counting involves estimating crowd density from both visual and thermal/depth images.
1 code implementation • 1 Jul 2024 • Yuanyang He, Zitong Huang, Xinxing Xu, Rick Siow Mong Goh, Salman Khan, WangMeng Zuo, Yong liu, Chun-Mei Feng
This consistency benefits Proxy-tuning and enhances model performance.
1 code implementation • 1 Jul 2024 • Mingxiang Liao, Hannan Lu, Xinyu Zhang, Fang Wan, Tianyu Wang, Yuzhong Zhao, WangMeng Zuo, Qixiang Ye, Jingdong Wang
For this purpose, we establish a new benchmark comprising text prompts that fully reflect multiple dynamics grades, and define a set of dynamics scores corresponding to various temporal granularities to comprehensively evaluate the dynamics of each generated video.
no code implementations • 20 Jun 2024 • Chaoqi Liang, Guanglei Yang, Lifeng Qiao, Zitong Huang, Hongliang Yan, Yunchao Wei, WangMeng Zuo
Our approach, LayerMatch, which integrates these two strategies, can avoid the severe interference of noisy pseudo-labels in the linear classification layer while accelerating the clustering capability of the feature extraction layer.
1 code implementation • 17 Jun 2024 • Chunming He, Yuqi Shen, Chengyu Fang, Fengyang Xiao, Longxiang Tang, Yulun Zhang, WangMeng Zuo, Zhenhua Guo, Xiu Li
Deep generative models have garnered significant attention in low-level vision tasks due to their generative capabilities.
1 code implementation • 11 Jun 2024 • Hang Yao, Ming Liu, Haolin Wang, Zhicun Yin, Zifei Yan, Xiaopeng Hong, WangMeng Zuo
Therefore, instead of utilizing the same setting for all samples, we propose to predict a particular denoising step for each sample by evaluating the difference between image contents and the priors extracted from diffusion models.
Ranked #1 on Anomaly Detection on VisA
1 code implementation • 3 Jun 2024 • Tianyu Huang, Haoze Zhang, Yihan Zeng, Zhilu Zhang, Hui Li, WangMeng Zuo, Rynson W. H. Lau
In this work, combining the strengths and complementing shortcomings of the above two solutions, we propose to learn the physical properties of a material field with video diffusion priors, and then utilize a physics-based Material-Point-Method (MPM) simulator to generate 4D content with realistic motions.
1 code implementation • 30 May 2024 • Zixian Guo, Ming Liu, Zhilong Ji, Jinfeng Bai, Yiwen Guo, WangMeng Zuo
Learning a skill generally relies on both practical experience by doer and insightful high-level guidance by instructor.
1 code implementation • 28 May 2024 • Qizhang Li, Yiwen Guo, WangMeng Zuo, Hao Chen
In addition, without introducing obvious cost, the combination achieves >30% absolute increase in attack success rates compared with GCG when generating both query-specific (38% -> 68%) and universal adversarial prompts (26. 68% -> 60. 32%) for attacking the Llama-2-7B-Chat model on AdvBench.
no code implementations • 14 May 2024 • Wei Lian, Zhesen Cui, Fei Ma, Hang Pan, WangMeng Zuo
In many applications, the demand arises for algorithms capable of aligning partially overlapping point sets while remaining invariant to the corresponding transformations.
1 code implementation • 9 May 2024 • Yuxiang Wei, Zhilong Ji, Jinfeng Bai, Hongzhi Zhang, Lei Zhang, WangMeng Zuo
In this work, we present MasterWeaver, a test-time tuning-free method designed to generate personalized images with both faithful identity fidelity and flexible editability.
2 code implementations • 3 May 2024 • Zhilu Zhang, Ruohao Wang, Hongzhi Zhang, WangMeng Zuo
In addition, we further take multiple zoomed observations to explore self-supervised RefSR, and present a progressive fusion scheme for the effective utilization of reference images.
1 code implementation • 26 Apr 2024 • Haoyu Wang, Zhilu Zhang, Donglin Di, Shiliang Zhang, WangMeng Zuo
This ensures that the clothing features are roughly fit to the person's view.
no code implementations • 25 Apr 2024 • Zitong Huang, Ze Chen, Bowen Dong, Chaoqi Liang, Erjin Zhou, WangMeng Zuo
Model Weight Averaging (MWA) is a technique that seeks to enhance model's performance by averaging the weights of multiple trained models.
1 code implementation • 17 Apr 2024 • Xin Li, Kun Yuan, Yajing Pei, Yiting Lu, Ming Sun, Chao Zhou, Zhibo Chen, Radu Timofte, Wei Sun, HaoNing Wu, ZiCheng Zhang, Jun Jia, Zhichao Zhang, Linhan Cao, Qiubo Chen, Xiongkuo Min, Weisi Lin, Guangtao Zhai, Jianhui Sun, Tianyi Wang, Lei LI, Han Kong, Wenxuan Wang, Bing Li, Cheng Luo, Haiqiang Wang, Xiangguang Chen, Wenhui Meng, Xiang Pan, Huiying Shi, Han Zhu, Xiaozhong Xu, Lei Sun, Zhenzhong Chen, Shan Liu, Fangyuan Kong, Haotian Fan, Yifang Xu, Haoran Xu, Mengduo Yang, Jie zhou, Jiaze Li, Shijie Wen, Mai Xu, Da Li, Shunyu Yao, Jiazhi Du, WangMeng Zuo, Zhibo Li, Shuai He, Anlong Ming, Huiyuan Fu, Huadong Ma, Yong Wu, Fie Xue, Guozhi Zhao, Lina Du, Jie Guo, Yu Zhang, huimin zheng, JunHao Chen, Yue Liu, Dulan Zhou, Kele Xu, Qisheng Xu, Tao Sun, Zhixiang Ding, Yuhang Hu
This paper reviews the NTIRE 2024 Challenge on Shortform UGC Video Quality Assessment (S-UGC VQA), where various excellent solutions are submitted and evaluated on the collected dataset KVQ from popular short-form video platform, i. e., Kuaishou/Kwai Platform.
1 code implementation • 12 Apr 2024 • Rongjian Xu, Zhilu Zhang, Renlong Wu, WangMeng Zuo
Despite the significant progress in image denoising, it is still challenging to restore fine-scale details while removing noise, especially in extremely low-light environments.
1 code implementation • 11 Apr 2024 • Junyi Li, Zhilu Zhang, WangMeng Zuo
For channel self-attention, we observe that it may leak the blind-spot information when the channel number is greater than spatial size in the deep layers of multi-scale architectures.
1 code implementation • 9 Apr 2024 • Xiaoyu Liu, Yuxiang Wei, Ming Liu, Xianhui Lin, Peiran Ren, Xuansong Xie, WangMeng Zuo
The key idea of our SmartControl is to relax the visual condition on the areas that are conflicted with text prompts.
1 code implementation • 8 Apr 2024 • Jiaxiu Jiang, Yabo Zhang, Kailai Feng, Xiaohe Wu, WangMeng Zuo
Customized text-to-image generation aims to synthesize instantiations of user-specified concepts and has achieved unprecedented progress in handling individual concept.
1 code implementation • 8 Apr 2024 • Minheng Ni, Yeli Shen, Lei Zhang, WangMeng Zuo
To mitigate the negative implications of harmful images on research, we create a transparent and public dataset, AltBear, which expresses harmful information using teddy bears instead of humans.
no code implementations • 7 Apr 2024 • Binghui Chen, Wenyu Li, Yifeng Geng, Xuansong Xie, WangMeng Zuo
Specifically, we propose a shoe-wearing system, called Shoe-Model, to generate plausible images of human legs interacting with the given shoes.
1 code implementation • 7 Apr 2024 • Renlong Wu, Zhilu Zhang, Yu Yang, WangMeng Zuo
In this work, we introduce a new task, ie, dual-camera smooth zoom (DCSZ) to achieve a smooth zoom preview.
1 code implementation • 17 Mar 2024 • Renlong Wu, Zhilu Zhang, Shuohao Zhang, Longfei Gou, Haobin Chen, Lei Zhang, Hao Chen, WangMeng Zuo
On the other hand, in order to enhance the desmoking performance, we further feed the valuable information from PS frame into models, where a masking strategy and a regularization term are presented to avoid trivial solutions.
no code implementations • 12 Mar 2024 • Runmin Cong, Ronghui Sheng, Hao Wu, Yulan Guo, Yunchao Wei, WangMeng Zuo, Yao Zhao, Sam Kwong
On the one hand, the low-level detail embedding module is designed to supplement high-frequency color information of depth features in a residual mask manner at the low-level stages.
1 code implementation • 9 Mar 2024 • Chunwei Tian, Menghua Zheng, Tiancai Jiao, WangMeng Zuo, Yanning Zhang, Chia-Wen Lin
Popular convolutional neural networks mainly use paired images in a supervised way for image watermark removal.
1 code implementation • 8 Mar 2024 • Yabo Zhang, Yuxiang Wei, Xianhui Lin, Zheng Hui, Peiran Ren, Xuansong Xie, Xiangyang Ji, WangMeng Zuo
Different from conventional T2V sampling (i. e., temporal and spatial modeling), VideoElevator explicitly decomposes each sampling step into temporal motion refining and spatial quality elevating.
1 code implementation • CVPR 2024 • Zhengyao Lv, Yuxiang Wei, WangMeng Zuo, Kwan-Yee K. Wong
Extensive experiments demonstrate that our approach performs favorably in terms of visual quality, semantic consistency, and layout alignment.
no code implementations • 26 Feb 2024 • Bowen Dong, Guanglei Yang, WangMeng Zuo, Lei Zhang
Empirical investigations on the adaptation of existing frameworks to vanilla ViT reveal that incorporating visual adapters into ViTs or fine-tuning ViTs with distillation terms is advantageous for enhancing the segmentation capability of novel classes.
1 code implementation • 7 Feb 2024 • Lijun Li, Bowen Dong, Ruohui Wang, Xuhao Hu, WangMeng Zuo, Dahua Lin, Yu Qiao, Jing Shao
In the rapidly evolving landscape of Large Language Models (LLMs), ensuring robust safety measures is paramount.
1 code implementation • 2 Feb 2024 • Jian Liu, Xiaoshui Huang, Tianyu Huang, Lu Chen, Yuenan Hou, Shixiang Tang, Ziwei Liu, Wanli Ouyang, WangMeng Zuo, Junjun Jiang, Xianming Liu
Recent years have witnessed remarkable advances in artificial intelligence generated content(AIGC), with diverse input modalities, e. g., text, image, video, audio and 3D.
1 code implementation • 3 Jan 2024 • Zitong Huang, Ze Chen, Zhixing Chen, Erjin Zhou, Xinxing Xu, Rick Siow Mong Goh, Yong liu, WangMeng Zuo, ChunMei Feng
When progressing to a new session, pseudo-features are sampled from old-class distributions combined with training images of the current session to optimize the prompt, thus enabling the model to learn new knowledge while retaining old knowledge.
class-incremental learning Few-Shot Class-Incremental Learning +2
2 code implementations • 1 Jan 2024 • Zhilu Zhang, Shuohao Zhang, Renlong Wu, Zifei Yan, WangMeng Zuo
It is highly desired but challenging to acquire high-quality photos with clear content in low-light environments.
1 code implementation • CVPR 2024 • Jingbo Lin, Zhilu Zhang, Yuxiang Wei, Dongwei Ren, Dongsheng Jiang, WangMeng Zuo
To address the cross-modal assistance, we propose to map the degraded images into textual representations for removing the degradations, and then convert the restored textual representations into a guidance image for assisting image restoration.
1 code implementation • 28 Dec 2023 • Wan Xu, Tianyu Huang, Tianyu Qu, Guanglei Yang, Yiwen Guo, WangMeng Zuo
Few-shot class-incremental learning (FSCIL) aims to mitigate the catastrophic forgetting issue when a model is incrementally trained on limited data.
1 code implementation • 26 Dec 2023 • Zixian Guo, Yuxiang Wei, Ming Liu, Zhilong Ji, Jinfeng Bai, Yiwen Guo, WangMeng Zuo
Parameter-efficient fine-tuning (PEFT) methods have provided an effective way for adapting large vision-language models to specific tasks or scenarios.
1 code implementation • 19 Dec 2023 • Yufei Cai, Yuxiang Wei, Zhilong Ji, Jinfeng Bai, Hu Han, WangMeng Zuo
To decouple irrelevant attributes (i. e., background and pose) from the subject embedding, we further present several attribute mappers that encode each image as several image-specific subject-unrelated embeddings.
1 code implementation • 19 Dec 2023 • Chun-Mei Feng, Yang Bai, Tao Luo, Zhen Li, Salman Khan, WangMeng Zuo, Xinxing Xu, Rick Siow Mong Goh, Yong liu
By feeding the retrieved image and question to the VQA model, one can find the images inconsistent with relative caption when the answer by VQA is inconsistent with the answer in the QA pair.
1 code implementation • CVPR 2024 • Tianyu Huang, Yihan Zeng, Zhilu Zhang, Wan Xu, Hang Xu, Songcen Xu, Rynson W. H. Lau, WangMeng Zuo
The priors are then regarded as input conditions to maintain reasonable geometries, in which conditional LoRA and weighted score are further proposed to optimize detailed textures.
1 code implementation • 29 Oct 2023 • Yuanze Li, Haolin Wang, Shihao Yuan, Ming Liu, Debin Zhao, Yiwen Guo, Chen Xu, Guangming Shi, WangMeng Zuo
Existing industrial anomaly detection (IAD) methods predict anomaly scores for both anomaly detection and localization.
no code implementations • 24 Oct 2023 • Qing Miao, Xiaohe Wu, Chao Xu, Yanli Ji, WangMeng Zuo, Yiwen Guo, Zhaopeng Meng
By incorporating auxiliary information from CLIP and utilizing prompt fine-tuning, we effectively eliminate noisy samples from the clean set and mitigate confirmation bias during training.
1 code implementation • 23 Oct 2023 • Xiaohui Liu, Zhilu Zhang, Xiaohe Wu, Chaoyu Feng, Xiaotao Wang, Lei Lei, WangMeng Zuo
Real-world image de-weathering aims at removing various undesirable weather-related artifacts.
1 code implementation • 16 Oct 2023 • Chunwei Tian, Menghua Zheng, WangMeng Zuo, Shichao Zhang, Yanning Zhang, Chia-Wen Ling
To avoid loss of key information, PB uses three heterogeneous networks to implement multiple interactions of multi-level features to broadly search for extra information for improving the adaptability of an obtained denoiser for complex scenes.
1 code implementation • 12 Oct 2023 • Zehao Wang, Yiwen Guo, Qizhang Li, Guanglei Yang, WangMeng Zuo
Most existing data augmentation methods tend to find a compromise in augmenting the data, \textit{i. e.}, increasing the amplitude of augmentation carefully to avoid degrading some data too much and doing harm to the model performance.
no code implementations • 11 Oct 2023 • Chaoqi Liang, Lifeng Qiao, Peng Ye, Nanqing Dong, Jianle Sun, Weiqiang Bai, Yuchen Ren, Xinzhu Ma, Hongliang Yan, Chunfeng Song, Wanli Ouyang, WangMeng Zuo
However, existing pre-training methods for DNA sequences largely rely on direct adoptions of BERT pre-training from NLP, lacking a comprehensive understanding and a specifically tailored approach.
1 code implementation • 9 Oct 2023 • Yang Bai, Xinxing Xu, Yong liu, Salman Khan, Fahad Khan, WangMeng Zuo, Rick Siow Mong Goh, Chun-Mei Feng
Composed image retrieval (CIR) is the task of retrieving specific images by using a query that involves both a reference image and a relative caption.
Ranked #2 on Image Retrieval on Fashion IQ (using extra training data)
1 code implementation • 3 Oct 2023 • Zhilu Zhang, Haoyu Wang, Shuai Liu, Xiaotao Wang, Lei Lei, WangMeng Zuo
The color component is estimated from aligned multi-exposure images, while the structure one is generated through a structure-focused network that is supervised by the color component and an input reference (\eg, medium-exposure) image.
no code implementations • 29 Sep 2023 • Tianyu Huang, Yihan Zeng, Bowen Dong, Hang Xu, Songcen Xu, Rynson W. H. Lau, WangMeng Zuo
To this end, an NTFGen module is proposed to model general text latent code in noisy fields.
1 code implementation • ICCV 2023 • Xiaoyu Liu, Ming Liu, Junyi Li, Shuai Liu, Xiaotao Wang, Lei Lei, WangMeng Zuo
In this paper, we circumvent this issue by presenting a joint framework for both unbounded recommendation of camera view and image composition (i. e., UNIC).
1 code implementation • ICCV 2023 • Zhicun Yin, Ming Liu, Xiaoming Li, Hui Yang, Longan Xiao, WangMeng Zuo
To evaluate our proposed MetaF2N, we have collected a real-world low-quality dataset with one or multiple faces in each image, and our MetaF2N achieves superior performance on both synthetic and real-world datasets.
1 code implementation • 13 Sep 2023 • Dongwei Ren, Wei Shang, Yi Yang, WangMeng Zuo
To aggregate long-term sharp features from detected sharp frames, we utilize a global Transformer with multi-scale matching capability.
no code implementations • 4 Sep 2023 • Yuanshuo Cheng, Mingwen Shao, Yecong Wan, Yuanjian Qiao, WangMeng Zuo, Deyu Meng
To empower the framework for eliminating diverse degradations, we devise a Sequence-wise Adaptive Degradation Estimator (SADE) to estimate degradation features for the input corrupted video.
no code implementations • 31 Aug 2023 • Minheng Ni, Yabo Zhang, Kailai Feng, Xiaoming Li, Yiwen Guo, WangMeng Zuo
In this work, we introduce a novel Referring Diffusional segmentor (Ref-Diff) for this task, which leverages the fine-grained multi-modal information from generative models.
1 code implementation • 27 Aug 2023 • Mingshuai Yao, Yabo Zhang, Xianhui Lin, Xiaoming Li, WangMeng Zuo
In this paper, we propose a VQGAN-based framework (i. e., VQ-Font) to enhance glyph fidelity through token prior refinement and structure-aware enhancement.
1 code implementation • 21 Aug 2023 • Jian Zou, Tianyu Huang, Guanglei Yang, Zhenhua Guo, Tao Luo, Chun-Mei Feng, WangMeng Zuo
First, it projects the features from both modalities into a cohesive 3D volume space to intricately marry the bird's eye view (BEV) with the height dimension.
no code implementations • 20 Aug 2023 • Yunlu Yan, Chun-Mei Feng, Mang Ye, WangMeng Zuo, Ping Li, Rick Siow Mong Goh, Lei Zhu, C. L. Philip Chen
Concretely, FedCSD introduces a class prototype similarity distillation to align the local logits with the refined global logits that are weighted by the similarity between local logits and the global prototype.
1 code implementation • ICCV 2023 • Chun-Mei Feng, Kai Yu, Yong liu, Salman Khan, WangMeng Zuo
In this paper, we focus on a particular setting of learning adaptive prompts on the fly for each test sample from an unseen new domain, which is known as test-time prompt tuning (TPT).
1 code implementation • ICCV 2023 • Chun-Mei Feng, Kai Yu, Nian Liu, Xinxing Xu, Salman Khan, WangMeng Zuo
However, the performance of the global model is often hampered by non-i. i. d.
1 code implementation • 24 Jul 2023 • Mingwen Shao, Lingzhuang Meng, Yuanjian Qiao, Lixu Zhang, WangMeng Zuo
Specifically, we augment the latent codes of the inferred member data with LCA and use them as guidance for SD.
no code implementations • 21 Jul 2023 • Qizhang Li, Yiwen Guo, Xiaochen Yang, WangMeng Zuo, Hao Chen
Our ICLR work advocated for enhancing transferability in adversarial examples by incorporating a Bayesian formulation into model parameters, which effectively emulates the ensemble of infinitely many deep neural networks, while, in this paper, we introduce a novel extension by incorporating the Bayesian formulation into the model input as well, enabling the joint diversification of both the model input and model parameters.
1 code implementation • 30 Jun 2023 • Renlong Wu, Zhilu Zhang, Shuohao Zhang, Hongzhi Zhang, WangMeng Zuo
The main challenge of BurstSR is to effectively combine the complementary information from input frames, while existing methods still struggle with it.
no code implementations • 28 Jun 2023 • Jie Ning, Jiebao Sun, Yao Li, Zhichang Guo, WangMeng Zuo
Thus, we further propose an indicator to measure the local similarity of models, called robustness similitude.
1 code implementation • CVPR 2023 • Yuxiang Wei, Zhilong Ji, Xiaohe Wu, Jinfeng Bai, Lei Zhang, WangMeng Zuo
Despite the progress in semantic image synthesis, it remains a challenging problem to generate photo-realistic parts from input semantic map.
2 code implementations • ICCV 2023 • Wei Shang, Dongwei Ren, Chaoyu Feng, Xiaotao Wang, Lei Lei, WangMeng Zuo
In this paper, we propose a Self-supervised learning framework for Dual reversed RS distortions Correction (SelfDRSC), where a DRSC network can be learned to generate a high framerate GS video only based on dual RS images with reversed distortions.
1 code implementation • 22 May 2023 • Yabo Zhang, Yuxiang Wei, Dongsheng Jiang, Xiaopeng Zhang, WangMeng Zuo, Qi Tian
Text-driven diffusion models have unlocked unprecedented abilities in image generation, whereas their video counterpart still lags behind due to the excessive training cost of temporal modeling.
2 code implementations • NeurIPS 2023 • Qizhang Li, Yiwen Guo, WangMeng Zuo, Hao Chen
In particular, the proposed method, named intermediate-level perturbation decay (ILPD), encourages the intermediate-level perturbation to be in an effective adversarial direction and to possess a great magnitude simultaneously.
no code implementations • 3 Apr 2023 • Yabo Zhang, ZiHao Wang, Jun Hao Liew, Jingjia Huang, Manyu Zhu, Jiashi Feng, WangMeng Zuo
In this work, we investigate performing semantic segmentation solely through the training on image-sentence pairs.
1 code implementation • CVPR 2023 • Chun-Mei Feng, Bangjun Li, Xinxing Xu, Yong liu, Huazhu Fu, WangMeng Zuo
Federated Magnetic Resonance Imaging (MRI) reconstruction enables multiple hospitals to collaborate distributedly without aggregating local data, thereby protecting patient privacy.
1 code implementation • CVPR 2023 • Junyi Li, Zhilu Zhang, Xiaoyu Liu, Chaoyu Feng, Xiaotao Wang, Lei Lei, WangMeng Zuo
And we extend the blind-spot network to a blind-neighborhood network (BNN) for providing supervision on flat areas.
2 code implementations • CVPR 2023 • Wei Shang, Dongwei Ren, Yi Yang, Hongzhi Zhang, Kede Ma, WangMeng Zuo
Moreover, on the seemingly implausible x16 interpolation task, our method outperforms existing methods by more than 1. 5 dB in terms of PSNR.
1 code implementation • CVPR 2023 • Xiaoming Li, WangMeng Zuo, Chen Change Loy
To restrict the generative space of StyleGAN so that it obeys the structure of characters yet remains flexible in handling different font styles, we store the discrete features for each character in a codebook.
no code implementations • 12 Mar 2023 • Bowen Dong, Jiaxi Gu, Jianhua Han, Hang Xu, WangMeng Zuo
To improve the open-world segmentation ability, we leverage omni-supervised data (i. e., panoptic segmentation data, object detection data, and image-text pairs data) into training, thus enriching the open-world segmentation ability and achieving better segmentation accuracy.
1 code implementation • ICCV 2023 • Yuxiang Wei, Yabo Zhang, Zhilong Ji, Jinfeng Bai, Lei Zhang, WangMeng Zuo
In addition to the unprecedented ability in imaginary creation, large text-to-image models are expected to take customized concepts in image generation.
1 code implementation • 10 Feb 2023 • Qizhang Li, Yiwen Guo, WangMeng Zuo, Hao Chen
In this paper, by contrast, we opt for the diversity in substitute models and advocate to attack a Bayesian model for achieving desirable transferability.
no code implementations • CVPR 2023 • Minheng Ni, Xiaoming Li, WangMeng Zuo
Language-guided image inpainting aims to fill the defective regions of an image under the guidance of text while keeping the non-defective regions unchanged.
1 code implementation • CVPR 2023 • Yue Cao, Ming Liu, Shuai Liu, Xiaotao Wang, Lei Lei, WangMeng Zuo
Although deep neural networks have achieved astonishing performance in many vision tasks, existing learning-based methods are far inferior to the physical model-based solutions in extreme low-light sensor noise modeling.
Ranked #1 on Image Denoising on ELD SonyA7S2 x200
no code implementations • 27 Dec 2022 • Bo Chen, Zhiwei Hu, Zhilong Ji, Jinfeng Bai, WangMeng Zuo
The main challenge of this task is to understand the visual and linguistic content simultaneously and to find the referred object accurately among all instances in the image.
1 code implementation • IEEE Transactions on Image Processing 2022 • Shuguang Dou, Cairong Zhao, Xinyang Jiang, Shanshan Zhang, Wei-Shi Zheng, WangMeng Zuo
Most supervised methods propose to train an extra human parsing model aside from the ReID model with cross-domain human parts annotation, suffering from expensive annotation cost and domain gap; Unsupervised methods integrate a feature clustering-based human parsing process into the ReID model, but lacking supervision signals brings less satisfactory segmentation results.
Ranked #5 on Person Re-Identification on Occluded-DukeMTMC
1 code implementation • 13 Dec 2022 • Qinghe Wang, Lijie Liu, Miao Hua, Pengfei Zhu, WangMeng Zuo, QinGhua Hu, Huchuan Lu, Bing Cao
We blend the semantic layouts of source head and source body, and then inpaint the transition region by the semantic layout generator, achieving a coarse-grained head swapping.
1 code implementation • 10 Dec 2022 • Ruohao Wang, Xiaohui Liu, Zhilu Zhang, Xiaohe Wu, Chun-Mei Feng, Lei Zhang, WangMeng Zuo
On the other hand, alignment algorithms in existing VSR methods perform poorly for real-world videos, leading to unsatisfactory results.
no code implementations • 8 Dec 2022 • Wenxin Wang, Boyun Li, Yuanbiao Gou, Peng Hu, WangMeng Zuo, Xi Peng
To tackle the first challenge, we proposed a Degradation Relationship Index (DRI) which is defined as the mean drop rate difference in the validation loss between two models which are respectively trained using the anchor degradation and the mixture of the anchor and the auxiliary degradations.
2 code implementations • 26 Nov 2022 • Yu Li, Dongwei Ren, Xinya Shu, WangMeng Zuo
First, in the deblurring module, a bi-directional optical flow-based deformation is introduced to tolerate spatial misalignment between deblurred and ground-truth images.
1 code implementation • CVPR 2023 • Zixian Guo, Bowen Dong, Zhilong Ji, Jinfeng Bai, Yiwen Guo, WangMeng Zuo
Nonetheless, visual data (e. g., images) is by default prerequisite for learning prompts in existing methods.
1 code implementation • 14 Nov 2022 • Zhilu Zhang, Rongjian Xu, Ming Liu, Zifei Yan, WangMeng Zuo
By learning in a collaborative manner, the deblurring and denoising tasks in our method can benefit each other.
1 code implementation • 20 Oct 2022 • Marcos V. Conde, Radu Timofte, Yibin Huang, Jingyang Peng, Chang Chen, Cheng Li, Eduardo Pérez-Pellitero, Fenglong Song, Furui Bai, Shuai Liu, Chaoyu Feng, Xiaotao Wang, Lei Lei, Yu Zhu, Chenghua Li, Yingying Jiang, Yong A, Peisong Wang, Cong Leng, Jian Cheng, Xiaoyu Liu, Zhicun Yin, Zhilu Zhang, Junyi Li, Ming Liu, WangMeng Zuo, Jun Jiang, Jinha Kim, Yue Zhang, Beiji Zou, Zhikai Zong, Xiaoxiao Liu, Juan Marín Vega, Michael Sloth, Peter Schneider-Kamp, Richard Röttger, Furkan Kınlı, Barış Özcan, Furkan Kıraç, Li Leyi, SM Nadim Uddin, Dipon Kumar Ghosh, Yong Ju Jung
Cameras capture sensor RAW images and transform them into pleasant RGB images, suitable for the human eyes, using their integrated Image Signal Processor (ISP).
1 code implementation • 15 Oct 2022 • Xiaoming Li, Shiguang Zhang, Shangchen Zhou, Lei Zhang, WangMeng Zuo
Generally, it is a challenging and intractable task to improve the photo-realistic performance of blind restoration and adaptively handle the generic and specific restoration scenarios with a single unified model.
1 code implementation • 13 Oct 2022 • Minheng Ni, Zitong Huang, Kailai Feng, WangMeng Zuo
Given a class label, the language model is used to generate a full description of a scene with a target object, and the text-to-image model deployed to generate a photo-realistic image.
1 code implementation • 3 Oct 2022 • Xiaoming Li, Chaofeng Chen, Xianhui Lin, WangMeng Zuo, Lei Zhang
Notably, LQ face images, which may have the same degradation process as natural images, can be robustly restored with photo-realistic textures by exploiting their strong structural priors.
1 code implementation • 3 Oct 2022 • Bowen Dong, Pan Zhou, Shuicheng Yan, WangMeng Zuo
For better effectiveness, we divide prompts into two groups: 1) a shared prompt for the whole long-tailed dataset to learn general features and to adapt a pretrained model into target domain; and 2) group-specific prompts to gather group-specific features for the samples which have similar features and also to empower the pretrained model with discrimination ability.
Ranked #1 on Long-tail Learning on CIFAR-100-LT (ρ=100) (using extra training data)
1 code implementation • ICCV 2023 • Tianyu Huang, Bowen Dong, Yunhan Yang, Xiaoshui Huang, Rynson W. H. Lau, Wanli Ouyang, WangMeng Zuo
To address this issue, we propose CLIP2Point, an image-depth pre-training method by contrastive learning to transfer CLIP to the 3D domain, and adapt it to point cloud classification.
Ranked #3 on Training-free 3D Point Cloud Classification on ScanObjectNN (using extra training data)
1 code implementation • 26 Sep 2022 • Chunwei Tian, Menghua Zheng, WangMeng Zuo, Bob Zhang, Yanning Zhang, David Zhang
In this paper, we propose a multi-stage image denoising CNN with the wavelet transform (MWDCNN) via three stages, i. e., a dynamic convolutional block (DCB), two cascaded wavelet transform and enhancement blocks (WEBs) and a residual block (RB).
1 code implementation • 26 Sep 2022 • Chunwei Tian, Yanning Zhang, WangMeng Zuo, Chia-Wen Lin, David Zhang, Yixuan Yuan
To prevent loss of original information, a multi-level enhancement mechanism guides a CNN to achieve a symmetric architecture for promoting expressive ability of HGSRCNN.
1 code implementation • ACMMM 2022 • Yudong Liang, Bin Wang, Wenqi Ren, Jiaying Liu, Wenjian Wang, WangMeng Zuo
In various real-world image enhancement applications, the degradations are always non-uniform or non-homogeneous and diverse, which challenges most deep networks with fixed parameters during the inference phase.
Ranked #16 on Image Dehazing on SOTS Indoor
no code implementations • 8 Aug 2022 • Hannan Lu, Zhi Tian, Lirong Yang, Haibing Ren, WangMeng Zuo
The compact instance stream effectively improves the segmentation accuracy of the unseen pixels, while fusing two streams with the adaptive routing map leads to an overall performance boost.
1 code implementation • 25 Jul 2022 • Zitong Huang, Yiping Bao, Bowen Dong, Erjin Zhou, WangMeng Zuo
Generally, with given pseudo ground-truths generated from the well-trained WSOD network, we propose a two-module iterative training algorithm to refine pseudo labels and supervise better object detector progressively.
1 code implementation • 21 Jul 2022 • Ming Liu, Yuxiang Wei, Xiaohe Wu, WangMeng Zuo, Lei Zhang
Generative adversarial networks (GANs) have drawn enormous attention due to the simple yet effective training mechanism and superior image generation quality.
1 code implementation • 18 Jul 2022 • Qiying Yu, Jieming Lou, Xianyuan Zhan, Qizhang Li, WangMeng Zuo, Yang Liu, Jingjing Liu
Contrastive learning (CL) has recently been applied to adversarial learning tasks.
1 code implementation • 18 Jul 2022 • Yabo Zhang, Mingshuai Yao, Yuxiang Wei, Zhilong Ji, Jinfeng Bai, WangMeng Zuo
In this paper, we present a novel one-shot generative domain adaption method, i. e., DiFa, for diverse generation and faithful adaptation.
1 code implementation • 13 Jul 2022 • Tianyu Huang, Bowen Dong, Jiaying Lin, Xiaohui Liu, Rynson W. H. Lau, WangMeng Zuo
Mirror detection aims to identify the mirror regions in the given input image.
1 code implementation • 12 Jul 2022 • Haolin Wang, Jiawei Zhang, Ming Liu, Xiaohe Wu, WangMeng Zuo
In particular, the style encoder predicts the target style representation of an input image, which serves as the conditional information in the RetouchNet for retouching, while the TSFlow maps the style representation vector into a Gaussian distribution in the forward pass.
1 code implementation • 16 Jun 2022 • Xin Zhong, Zhaoyi Yan, Jing Qin, WangMeng Zuo, Weigang Lu
However, the heads are not uniformly covered by the sampling points in the deformable convolution, resulting in loss of head information.
1 code implementation • 8 Jun 2022 • Pengju Liu, Hongzhi Zhang, Jinghui Wang, Yuzhi Wang, Dongwei Ren, WangMeng Zuo
In particular, we take well-trained CBDNet, NBNet, HINet, Uformer and GMSNet into denoiser pool, and a U-Net is adopted to predict pixel-wise weighting maps to fuse these denoisers.
1 code implementation • 29 May 2022 • Chunwei Tian, Yixuan Yuan, Shichao Zhang, Chia-Wen Lin, WangMeng Zuo, David Zhang
In this paper, we present an enhanced super-resolution group CNN (ESRGCNN) with a shallow architecture by fully fusing deep and wide channel features to extract more accurate low-frequency information in terms of correlations of different channels in single image super-resolution (SISR).
1 code implementation • 23 May 2022 • Qizhang Li, Yiwen Guo, WangMeng Zuo, Hao Chen
The vulnerability of deep neural networks (DNNs) to adversarial examples has attracted great attention in the machine learning community.
no code implementations • 28 Apr 2022 • Chunwei Tian, Xuanyu Zhang, Jerry Chun-Wei Lin, WangMeng Zuo, Yanning Zhang, Chia-Wen Lin
Second, we present popular architectures for GANs in big and small samples for image applications.
1 code implementation • 26 Apr 2022 • Yu Li, Yaling Yi, Dongwei Ren, Qince Li, WangMeng Zuo
Generally, DPANet is an encoder-decoder with skip-connections, where two branches with shared parameters in the encoder are employed to extract and align deep features from left and right views, and one decoder is adopted to fuse aligned features for predicting the sharp image.
2 code implementations • 20 Apr 2022 • Ren Yang, Radu Timofte, Meisong Zheng, Qunliang Xing, Minglang Qiao, Mai Xu, Lai Jiang, Huaida Liu, Ying Chen, Youcheng Ben, Xiao Zhou, Chen Fu, Pei Cheng, Gang Yu, Junyi Li, Renlong Wu, Zhilu Zhang, Wei Shang, Zhengyao Lv, Yunjin Chen, Mingcai Zhou, Dongwei Ren, Kai Zhang, WangMeng Zuo, Pavel Ostyakov, Vyal Dmitry, Shakarim Soltanayev, Chervontsev Sergey, Zhussip Magauiya, Xueyi Zou, Youliang Yan, Pablo Navarrete Michelini, Yunhua Lu, Diankai Zhang, Shaoli Liu, Si Gao, Biao Wu, Chengjian Zheng, Xiaofeng Zhang, Kaidi Lu, Ning Wang, Thuong Nguyen Canh, Thong Bach, Qing Wang, Xiaopeng Sun, Haoyu Ma, Shijie Zhao, Junlin Li, Liangbin Xie, Shuwei Shi, Yujiu Yang, Xintao Wang, Jinjin Gu, Chao Dong, Xiaodi Shi, Chunmei Nian, Dong Jiang, Jucai Lin, Zhihuai Xie, Mao Ye, Dengyan Luo, Liuhan Peng, Shengjie Chen, Qian Wang, Xin Liu, Boyang Liang, Hang Dong, Yuhao Huang, Kai Chen, Xingbei Guo, Yujing Sun, Huilei Wu, Pengxu Wei, Yulin Huang, Junying Chen, Ik Hyun Lee, Sunder Ali Khowaja, Jiseok Yoon
This challenge includes three tracks.
no code implementations • CVPR 2022 • Yue Cao, Zhaolin Wan, Dongwei Ren, Zifei Yan, WangMeng Zuo
Particularly, by treating all labeled data as positive samples, PU learning is leveraged to identify negative samples (i. e., outliers) from unlabeled data.
1 code implementation • 12 Apr 2022 • Junyi Li, Xiaohe Wu, Zhenxing Niu, WangMeng Zuo
However, BiRNN is intrinsically offline because it uses backward recurrent modules to propagate from the last to current frames, which causes high latency and large memory consumption.
1 code implementation • 12 Apr 2022 • Zhaohui Zheng, Rongguang Ye, Qibin Hou, Dongwei Ren, Ping Wang, WangMeng Zuo, Ming-Ming Cheng
Combining these two new components, for the first time, we show that logit mimicking can outperform feature imitation and the absence of localization distillation is a critical reason for why logit mimicking underperforms for years.
1 code implementation • CVPR 2022 • Yupeng Shi, Xiao Liu, Yuxiang Wei, Zhongqin Wu, WangMeng Zuo
Semantic image synthesis is a challenging task with many practical applications.
1 code implementation • 4 Apr 2022 • Ming Liu, Jianan Pan, Zifei Yan, WangMeng Zuo, Lei Zhang
Meanwhile, diverse testing sets are also provided with different types of reflection and scenes.
1 code implementation • CVPR 2022 • Zhengyao Lv, Xiaoming Li, Zhenxing Niu, Bing Cao, WangMeng Zuo
Obviously, a fine-grained part-level semantic layout will benefit object details generation, and it can be roughly inferred from an object's shape.
1 code implementation • 21 Mar 2022 • Yiwen Guo, Qizhang Li, WangMeng Zuo, Hao Chen
This paper substantially extends our work published at ECCV, in which an intermediate-level attack was proposed to improve the transferability of some baseline adversarial examples.
1 code implementation • 14 Mar 2022 • Bowen Dong, Pan Zhou, Shuicheng Yan, WangMeng Zuo
The few-shot learning ability of vision transformers (ViTs) is rarely investigated though heavily desired.
no code implementations • 6 Mar 2022 • Yuanze Li, Yiwen Guo, Qizhang Li, Hongzhi Zhang, WangMeng Zuo
Despite the remarkable progress, the challenge of optimally learning different tasks simultaneously remains to be explored.
2 code implementations • 2 Mar 2022 • Zhilu Zhang, Ruohao Wang, Hongzhi Zhang, Yunjin Chen, WangMeng Zuo
For this purpose, we take the telephoto image instead of an additional high-resolution image as the supervision information and select a center patch from it as the reference to super-resolve the corresponding short-focus image patch.
no code implementations • 10 Feb 2022 • Minheng Ni, Chenfei Wu, Haoyang Huang, Daxin Jiang, WangMeng Zuo, Nan Duan
Language guided image inpainting aims to fill in the defective regions of an image under the guidance of text while keeping non-defective regions unchanged.
no code implementations • 24 Dec 2021 • Jize Zhang, Haolin Wang, Xiaohe Wu, WangMeng Zuo
Existing unpaired low-light image enhancement approaches prefer to employ the two-way GAN framework, in which two CNN generators are deployed for enhancement and degradation separately.
no code implementations • 29 Sep 2021 • Fangcen Liu, Chenqiang Gao, Fang Chen, Deyu Meng, WangMeng Zuo, Xinbo Gao
We adopt the self-attention mechanism of the transformer to learn the interaction information of image features in a larger range.
1 code implementation • ICCV 2021 • Zhilu Zhang, Haolin Wang, Ming Liu, Ruohao Wang, Jiawei Zhang, WangMeng Zuo
To diminish the effect of color inconsistency in image alignment, we introduce to use a global color mapping (GCM) module to generate an initial sRGB image given the input raw image, which can keep the spatial location of the pixels unchanged, and the target sRGB image is utilized to guide GCM for converting the color towards it.
1 code implementation • ICCV 2021 • Binghui Chen, Zhaoyi Yan, Ke Li, Pengyu Li, Biao Wang, WangMeng Zuo, Lei Zhang
In crowd counting, due to the problem of laborious labelling, it is perceived intractability of collecting a new large-scale dataset which has plentiful images with large diversity in density, scene, etc.
1 code implementation • ICCV 2021 • Yuxiang Wei, Yupeng Shi, Xiao Liu, Zhilong Ji, Yuan Gao, Zhongqin Wu, WangMeng Zuo
It simply encourages the variation of output caused by perturbations on different latent dimensions to be orthogonal, and the Jacobian with respect to the input is calculated to represent this variation.
1 code implementation • 13 Aug 2021 • Fang Chen, Chenqiang Gao, Fangcen Liu, Yue Zhao, Yuxi Zhou, Deyu Meng, WangMeng Zuo
A local patch network (LPNet) with global attention is proposed in this paper to detect small targets by jointly considering the global and local properties of infrared small target images.
1 code implementation • ICCV 2021 • Bowen Dong, Zitong Huang, Yuelin Guo, Qilong Wang, Zhenxing Niu, WangMeng Zuo
In this paper, we defend the problem setting for improving localization performance by leveraging the bounding box regression knowledge from a well-annotated auxiliary dataset.
1 code implementation • 8 Jul 2021 • Zhaoyi Yan, Ruimao Zhang, Hongzhi Zhang, Qingfu Zhang, WangMeng Zuo
One of the main issues in this task is how to handle the dramatic scale variations of pedestrians caused by the perspective effect.
no code implementations • CVPR 2021 • Yuanchao Bai, Xianming Liu, WangMeng Zuo, YaoWei Wang, Xiangyang Ji
To achieve scalable compression with the error bound larger than zero, we derive the probability model of the quantized residual by quantizing the learned probability model of the original residual, instead of training multiple networks.
no code implementations • CVPR 2021 • Wenyu Li, Tianchu Guo, Pengyu Li, Binghui Chen, Biao Wang, WangMeng Zuo, Lei Zhang
In this paper, we propose a novel face recognition method, named VirFace, to effectively apply the unlabeled shallow data for face recognition.
1 code implementation • 25 Apr 2021 • Dongsheng Wang, Chaohao Xie, Shaohui Liu, Zhenxing Niu, WangMeng Zuo
In this paper, we present an edge-guided learnable bidirectional attention map (Edge-LBAM) for improving image inpainting of irregular holes with several distinct merits.