no code implementations • 21 Jan 2025 • Yiyang Wang, Xi Chen, Xiaogang Xu, Sihui Ji, Yu Liu, Yujun Shen, Hengshuang Zhao
In spite of the recent progress, image diffusion models still produce artifacts.
no code implementations • 15 Jan 2025 • Fan Yuan, Xiaoyuan Fang, Rong Quan, Jing Li, Wei Bi, Xiaogang Xu, Piji Li
Visual Commonsense Reasoning, regarded as a challenging task for pursuing advanced visual scene comprehension, has been used to diagnose the reasoning ability of AI systems.
no code implementations • 22 Dec 2024 • Huiwen Wu, Deyi Zhang, Xiaohan Li, Xiaogang Xu, Jiafei Wu, Zhe Liu
The second stage is to fine-tune the overall LLM with a differential privacy guarantee by adding appropriately calibrated Gaussian noise.
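The entry does not give implementation details, but a differential-privacy guarantee of this kind is commonly obtained with DP-SGD-style updates that clip per-sample gradients and add calibrated Gaussian noise. A minimal PyTorch sketch under that assumption (names and hyperparameters are illustrative, not the paper's code):

```python
import torch

def dp_sgd_step(model, loss_fn, inputs, targets, optimizer, clip_norm=1.0, noise_multiplier=1.0):
    """One DP-SGD-style update: per-sample gradient clipping plus calibrated Gaussian noise.

    Illustrative sketch only; the paper's actual fine-tuning recipe may differ.
    """
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]

    for x, y in zip(inputs, targets):                 # per-sample gradients
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = torch.clamp(clip_norm / (norm + 1e-6), max=1.0)  # bound per-sample sensitivity
        for s, g in zip(summed, grads):
            s.add_(g * scale)

    for p, s in zip(params, summed):
        noise = torch.randn_like(s) * noise_multiplier * clip_norm  # Gaussian mechanism
        p.grad = (s + noise) / len(inputs)

    optimizer.step()
    optimizer.zero_grad()
```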
no code implementations • 18 Dec 2024 • Sihui Ji, Yiyang Wang, Xi Chen, Xiaogang Xu, Hao Luo, Hengshuang Zhao
We present FashionComposer for compositional fashion image generation.
no code implementations • 15 Dec 2024 • Liyuan Cui, Xiaogang Xu, Wenqi Dong, Zesong Yang, Hujun Bao, Zhaopeng Cui
Human video synthesis aims to create lifelike characters in various environments, with wide applications in VR, storytelling, and content creation.
no code implementations • 31 Oct 2024 • Chiyu Zhang, Xiaogang Xu, Jiafei Wu, Zhe Liu, Lu Zhou
Adversarial attacks, which manipulate input data to undermine model availability and integrity, pose significant security threats during machine learning inference.
no code implementations • 24 Oct 2024 • Yating Ma, Xiaogang Xu, Liming Fang, Zhe Liu
Current Transferable Adversarial Examples (TAE) are primarily generated by adding Adversarial Noise (AN).
no code implementations • 16 Oct 2024 • Huiwen Wu, Xiaohan Li, Xiaogang Xu, Jiafei Wu, Deyi Zhang, Zhe Liu
By leveraging the differences between these two models, we create a more straightforward pathway to eliminate hallucinations, and the iterative nature of contrastive learning further enhances performance.
1 code implementation • 30 Sep 2024 • Fan Yuan, Chi Qin, Xiaogang Xu, Piji Li
Large Vision-Language Models (LVLMs) have shown remarkable performance on many visual-language tasks.
1 code implementation • 3 Sep 2024 • Kun Zhou, Xinyu Lin, Wenbo Li, Xiaogang Xu, Yuanhao Cai, Zhonghang Liu, Xiaoguang Han, Jiangbo Lu
Previous low-light image enhancement (LLIE) approaches, while employing frequency decomposition techniques to address the intertwined challenges of low frequency (e.g., illumination recovery) and high frequency (e.g., noise reduction), primarily focused on the development of dedicated and complex networks to achieve improved performance.
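For context, a generic low/high-frequency split of this kind can be approximated with a Gaussian blur; the sketch below is a common proxy, not the exact decomposition used by any method cited here:

```python
import torch
import torch.nn.functional as F

def frequency_decompose(image, kernel=9, sigma=2.0):
    """Generic low/high-frequency split via Gaussian blur (a common proxy, not any paper's exact scheme).

    Returns (low_freq, high_freq); illumination-related content tends to live in the
    low-frequency part, while noise and fine detail sit in the high-frequency residual.
    """
    coords = torch.arange(kernel, dtype=image.dtype, device=image.device) - kernel // 2
    g = torch.exp(-coords ** 2 / (2 * sigma ** 2))
    g = g / g.sum()
    k2d = (g[:, None] @ g[None, :]).view(1, 1, kernel, kernel).repeat(image.shape[1], 1, 1, 1)

    low = F.conv2d(image, k2d, padding=kernel // 2, groups=image.shape[1])
    high = image - low
    return low, high
```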
2 code implementations • 13 Jun 2024 • Lihe Yang, Bingyi Kang, Zilong Huang, Zhen Zhao, Xiaogang Xu, Jiashi Feng, Hengshuang Zhao
This work presents Depth Anything V2.
1 code implementation • 13 Jun 2024 • Baiang Li, Sizhuo Ma, Yanhong Zeng, Xiaogang Xu, Youqing Fang, Zhao Zhang, Jian Wang, Kai Chen
Capturing High Dynamic Range (HDR) scenery using 8-bit cameras often suffers from over-/underexposure, loss of fine details due to low bit-depth compression, skewed color distributions, and strong noise in dark areas.
1 code implementation • 4 Jun 2024 • Muzhi Zhu, Chengxiang Fan, Hao Chen, Yang Liu, Weian Mao, Xiaogang Xu, Chunhua Shen
However, not all generated data can positively impact downstream models, and these methods do not thoroughly explore how to better select and utilize generated data.
no code implementations • 27 May 2024 • Zhuoling Li, Xiaogang Xu, Zhenhua Xu, SerNam Lim, Hengshuang Zhao
Due to the need to interact with the real world, embodied agents are required to possess comprehensive prior knowledge, long-horizon planning capability, and a swift response speed.
1 code implementation • 27 May 2024 • Jiaqi Tang, Hao Lu, Ruizheng Wu, Xiaogang Xu, Ke Ma, Cheng Fang, Bin Guo, Jiangbo Lu, Qifeng Chen, Ying-Cong Chen
Video Anomaly Detection (VAD) systems can autonomously monitor and identify disturbances, reducing the need for manual labor and associated costs.
no code implementations • 24 May 2024 • Lichuan Ji, Yingqi Lin, Zhenhua Huang, Yan Han, Xiaogang Xu, Jiafei Wu, Chong Wang, Zhe Liu
Current datasets lack a varied and comprehensive repository of real and generated content for effective discrimination.
no code implementations • 24 May 2024 • Xiaogang Xu, Kun Zhou, Tao Hu, RuiXing Wang, Hujun Bao
We leverage dynamic cross-frame correspondences for intrinsic appearance and enforce a scene-level continuity constraint on the illumination field to yield satisfactory consistent decomposition results.
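A hypothetical rendering of these two constraints as training losses (tensor shapes and the weighting are assumptions for illustration only):

```python
import torch.nn.functional as F

def decomposition_losses(reflectance_t, reflectance_t1, corr_grid, illumination, w_cont=0.1):
    """Hypothetical losses mirroring the two constraints described in the entry above.

    reflectance_t / reflectance_t1: (B, 3, H, W) intrinsic appearance of frames t and t+1
    corr_grid: (B, H, W, 2) cross-frame correspondences from frame t to t+1, in [-1, 1]
    illumination: (B, 1, H, W) illumination field of frame t
    """
    # Cross-frame consistency: intrinsic appearance should agree at corresponding pixels.
    warped = F.grid_sample(reflectance_t1, corr_grid, align_corners=True)
    l_corr = F.l1_loss(warped, reflectance_t)

    # Scene-level continuity: penalize abrupt spatial variation of the illumination field.
    dx = illumination[..., :, 1:] - illumination[..., :, :-1]
    dy = illumination[..., 1:, :] - illumination[..., :-1, :]
    l_cont = dx.abs().mean() + dy.abs().mean()

    return l_corr + w_cont * l_cont
```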
no code implementations • 22 May 2024 • Huiwen Wu, Xiaohan Li, Deyi Zhang, Xiaogang Xu, Jiafei Wu, Puning Zhao, Zhe Liu
The success of current Large Language Models (LLMs) hinges on extensive training data that is collected and stored centrally, a paradigm known as Centralized Learning (CL).
no code implementations • 22 Apr 2024 • Yiming Liu, Kezhao Liu, Yao Xiao, Ziyi Dong, Xiaogang Xu, Pengxu Wei, Liang Lin
To further enhance the robustness of DBP models, we introduce Adversarial Denoising Diffusion Training (ADDT), which incorporates classifier-guided adversarial perturbations into diffusion training, thereby strengthening the DBP models' ability to purify adversarial perturbations.
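As a rough illustration of how classifier-guided adversarial perturbations could enter a diffusion training step (the interfaces below are assumptions, not the released ADDT code):

```python
import torch
import torch.nn.functional as F

def addt_step(diffusion, classifier, x0, y, t, eps_scale=2.0 / 255):
    """One illustrative ADDT-style training step (interfaces are assumptions, not the released code).

    diffusion: exposes q_sample(x0, t, noise) and eps_model(x_t, t)
    classifier: a differentiable classifier used to craft the guided perturbation
    """
    noise = torch.randn_like(x0)
    x_t = diffusion.q_sample(x0, t, noise)

    # Classifier-guided perturbation: ascend the classification loss on the noisy sample.
    x_adv = x_t.detach().clone().requires_grad_(True)
    cls_loss = F.cross_entropy(classifier(x_adv), y)
    grad = torch.autograd.grad(cls_loss, x_adv)[0]
    x_adv = (x_t + eps_scale * grad.sign()).detach()

    # Train the denoiser to recover the injected noise from the perturbed input.
    pred = diffusion.eps_model(x_adv, t)
    return F.mse_loss(pred, noise)
```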
3 code implementations • 16 Apr 2024 • Bin Ren, Nancy Mehta, Radu Timofte, Hongyuan Yu, Cheng Wan, Yuxin Hong, Bingnan Han, Zhuoyuan Wu, Yajun Zou, Yuqing Liu, Jizhe Li, Keji He, Chao Fan, Heng Zhang, Xiaolin Zhang, Xuanwu Yin, Kunlong Zuo, Bohao Liao, Peizhe Xia, Long Peng, Zhibo Du, Xin Di, Wangkai Li, Yang Wang, Wei Zhai, Renjing Pei, Jiaming Guo, Songcen Xu, Yang Cao, ZhengJun Zha, Yan Wang, Yi Liu, Qing Wang, Gang Zhang, Liou Zhang, Shijie Zhao, Long Sun, Jinshan Pan, Jiangxin Dong, Jinhui Tang, Xin Liu, Min Yan, Menghan Zhou, Yiqiang Yan, Yixuan Liu, Wensong Chan, Dehua Tang, Dong Zhou, Li Wang, Lu Tian, Barsoum Emad, Bohan Jia, Junbo Qiao, Yunshuai Zhou, Yun Zhang, Wei Li, Shaohui Lin, Shenglong Zhou, Binbin Chen, Jincheng Liao, Suiyi Zhao, Zhao Zhang, Bo wang, Yan Luo, Yanyan Wei, Feng Li, Mingshen Wang, Yawei Li, Jinhan Guan, Dehua Hu, Jiawei Yu, Qisheng Xu, Tao Sun, Long Lan, Kele Xu, Xin Lin, Jingtong Yue, Lehan Yang, Shiyi Du, Lu Qi, Chao Ren, Zeyu Han, YuHan Wang, Chaolin Chen, Haobo Li, Mingjun Zheng, Zhongbao Yang, Lianhong Song, Xingzhuo Yan, Minghan Fu, Jingyi Zhang, Baiang Li, Qi Zhu, Xiaogang Xu, Dan Guo, Chunle Guo, Jiadi Chen, Huanhuan Long, Chunjiang Duanmu, Xiaoyan Lei, Jie Liu, Weilin Jia, Weifeng Cao, Wenlong Zhang, Yanyu Mao, Ruilong Guo, Nihao Zhang, Qian Wang, Manoj Pandey, Maksym Chernozhukov, Giang Le, Shuli Cheng, Hongyuan Wang, Ziyan Wei, Qingting Tang, Liejun Wang, Yongming Li, Yanhui Guo, Hao Xu, Akram Khatami-Rizi, Ahmad Mahmoudi-Aznaveh, Chih-Chung Hsu, Chia-Ming Lee, Yi-Shiuan Chou, Amogh Joshi, Nikhil Akalwadi, Sampada Malagi, Palani Yashaswini, Chaitra Desai, Ramesh Ashok Tabib, Ujwala Patil, Uma Mudenagudi
In sub-track 1, the practical runtime performance of the submissions was evaluated, and the corresponding score was used to determine the ranking.
1 code implementation • 15 Apr 2024 • Zheng Chen, Zongwei Wu, Eduard Zamfir, Kai Zhang, Yulun Zhang, Radu Timofte, Xiaokang Yang, Hongyuan Yu, Cheng Wan, Yuxin Hong, Zhijuan Huang, Yajun Zou, Yuan Huang, Jiamin Lin, Bingnan Han, Xianyu Guan, Yongsheng Yu, Daoan Zhang, Xuanwu Yin, Kunlong Zuo, Jinhua Hao, Kai Zhao, Kun Yuan, Ming Sun, Chao Zhou, Hongyu An, Xinfeng Zhang, Zhiyuan Song, Ziyue Dong, Qing Zhao, Xiaogang Xu, Pengxu Wei, Zhi-chao Dou, Gui-ling Wang, Chih-Chung Hsu, Chia-Ming Lee, Yi-Shiuan Chou, Cansu Korkmaz, A. Murat Tekalp, Yubin Wei, Xiaole Yan, Binren Li, Haonan Chen, Siqi Zhang, Sihan Chen, Amogh Joshi, Nikhil Akalwadi, Sampada Malagi, Palani Yashaswini, Chaitra Desai, Ramesh Ashok Tabib, Ujwala Patil, Uma Mudenagudi, Anjali Sarvaiya, Pooja Choksy, Jagrit Joshi, Shubh Kawa, Kishor Upla, Sushrut Patwardhan, Raghavendra Ramachandra, Sadat Hossain, Geongi Park, S. M. Nadim Uddin, Hao Xu, Yanhui Guo, Aman Urumbekov, Xingzhuo Yan, Wei Hao, Minghan Fu, Isaac Orais, Samuel Smith, Ying Liu, Wangwang Jia, Qisheng Xu, Kele Xu, Weijun Yuan, Zhan Li, Wenqin Kuang, Ruijin Guan, Ruting Deng, Zhao Zhang, Bo wang, Suiyi Zhao, Yan Luo, Yanyan Wei, Asif Hussain Khan, Christian Micheloni, Niki Martinel
This paper reviews the NTIRE 2024 challenge on image super-resolution ($\times$4), highlighting the solutions proposed and the outcomes obtained.
no code implementations • CVPR 2024 • Xiaogang Xu, Shu Kong, Tao Hu, Zhe Liu, Hujun Bao
Pre-trained models with large-scale training data, such as CLIP and Stable Diffusion, have demonstrated remarkable performance in various high-level computer vision tasks such as image understanding and generation from language descriptions.
1 code implementation • CVPR 2024 • Jiaqi Tang, Ruizheng Wu, Xiaogang Xu, Sixing Hu, Ying-Cong Chen
We aim to remove interference from the film (specular highlights and other degradations) with an end-to-end framework.
no code implementations • CVPR 2024 • Zhuoling Li, Xiaogang Xu, SerNam Lim, Hengshuang Zhao
In this work, we propose to address the challenges from two perspectives, the algorithm perspective and data perspective.
1 code implementation • 25 Feb 2024 • Baiang Li, Zhao Zhang, Huan Zheng, Xiaogang Xu, Yanyan Wei, Jingyi Zhang, Jicong Fan, Meng Wang
Our RTB performs attention selection over rain-affected and unaffected regions and local modeling at mixed scales.
5 code implementations • CVPR 2024 • Lihe Yang, Bingyi Kang, Zilong Huang, Xiaogang Xu, Jiashi Feng, Hengshuang Zhao
To this end, we scale up the dataset by designing a data engine to collect and automatically annotate large-scale unlabeled data (~62M), which significantly enlarges the data coverage and thus is able to reduce the generalization error.
Ranked #4 on Monocular Depth Estimation on ETH3D
no code implementations • 26 Dec 2023 • Yan Han, Xiaogang Xu, Yingqi Lin, Jiafei Wu, Zhe Liu
In existing Video Frame Interpolation (VFI) approaches, the motion estimation between neighboring frames plays a crucial role.
no code implementations • 26 Dec 2023 • Yingqi Lin, Xiaogang Xu, Yan Han, Jiafei Wu, Zhe Liu
First, a depth-aware feature extraction module is designed to inject depth priors into the image representation.
no code implementations • 19 Dec 2023 • Jiarong Guo, Xiaogang Xu, Hengshuang Zhao
To address this, we present a Self-Supervised Learning (SSL) technique tailored as an auxiliary loss for any 3D-GAN, designed to improve its 3D geometrical modeling capabilities.
1 code implementation • 14 Dec 2023 • Jiaqi Tang, Hao Lu, Xiaogang Xu, Ruizheng Wu, Sixing Hu, Tong Zhang, Tsz Wa Cheng, Ming Ge, Ying-Cong Chen, Fugee Tsung
Artificial Intelligence (AI)-driven defect inspection is pivotal in industrial manufacturing.
1 code implementation • NeurIPS 2023 • Yixing Lao, Xiaogang Xu, Zhipeng Cai, Xihui Liu, Hengshuang Zhao
We present CorresNeRF, a novel method that leverages image correspondence priors computed by off-the-shelf methods to supervise NeRF training.
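A hypothetical form such a correspondence-based supervision term could take (the camera interface is assumed for illustration; see the paper for the actual losses):

```python
def correspondence_loss(depth_a, depth_b, pix_a, pix_b, cam_a, cam_b):
    """A hypothetical correspondence-supervision term in the spirit of the entry above.

    depth_a, depth_b: rendered depths (N,) at matched pixels of views A and B
    pix_a, pix_b: matched pixel coordinates (N, 2) from an off-the-shelf matcher
    cam_a, cam_b: assumed camera objects with unproject(pix, depth) -> world points (N, 3)
    """
    # Matched pixels should lift to (approximately) the same 3D surface point,
    # so penalize the discrepancy of the unprojected points.
    pts_a = cam_a.unproject(pix_a, depth_a)
    pts_b = cam_b.unproject(pix_b, depth_b)
    return (pts_a - pts_b).norm(dim=-1).mean()
```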
1 code implementation • 5 Dec 2023 • Yichi Zhang, Xiaogang Xu
DNF is extracted from the estimated noise generated during the inverse diffusion process.
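A minimal sketch of extracting such a noise feature with an assumed DDIM-style inversion interface (none of these names come from the paper):

```python
import torch

@torch.no_grad()
def diffusion_noise_feature(eps_model, scheduler, x, num_steps=10):
    """Sketch of extracting a noise-based feature during inverse (DDIM-style) diffusion.

    eps_model(x_t, t) predicts the noise; scheduler.inversion_timesteps(n) and
    scheduler.invert_step(x_t, eps, t) are assumed interfaces for illustration.
    """
    feats = []
    x_t = x
    for t in scheduler.inversion_timesteps(num_steps):
        eps = eps_model(x_t, t)                  # estimated noise at this inversion step
        feats.append(eps)
        x_t = scheduler.invert_step(x_t, eps, t)
    # Aggregate the per-step noise estimates into a single feature map.
    return torch.stack(feats, dim=0).mean(dim=0)
```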
1 code implementation • 1 Dec 2023 • Shuchi Wu, Chuan Ma, Kang Wei, Xiaogang Xu, Ming Ding, Yuwen Qian, Tao Xiang
This paper introduces RDA, a pioneering approach designed to address two primary deficiencies prevalent in previous endeavors aiming at stealing pre-trained encoders: (1) suboptimal performances attributed to biased optimization objectives, and (2) elevated query costs stemming from the end-to-end paradigm that necessitates querying the target encoder every epoch.
no code implementations • 20 Nov 2023 • Yanyan Wei, Zhao Zhang, Jiahuan Ren, Xiaogang Xu, Richang Hong, Yi Yang, Shuicheng Yan, Meng Wang
The generalization capability of existing image restoration and enhancement (IRE) methods is constrained by the limited pre-trained datasets, making it difficult to handle agnostic inputs such as different degradation levels and scenarios beyond their design scopes.
1 code implementation • CVPR 2024 • Yixun Liang, Xin Yang, Jiantao Lin, Haodong Li, Xiaogang Xu, Yingcong Chen
The recent advancements in text-to-3D generation mark a significant milestone in generative models, unlocking new possibilities for creating imaginative 3D assets across various real-world scenarios.
no code implementations • 12 Nov 2023 • Ruijun Wang, YuAn Liu, Zhixia Fan, Xiaogang Xu, Huijie Wang
However, it remains challenging to understand how the model's structure and function correspond to the diagnosis process.
1 code implementation • NeurIPS 2023 • Lihe Yang, Xiaogang Xu, Bingyi Kang, Yinghuan Shi, Hengshuang Zhao
Then, we investigate the role of synthetic images by joint training with real images, or pre-training for real images.
no code implementations • 9 Aug 2023 • Xiaobei Li, Changchun Yin, Liyue Zhu, Xiaogang Xu, Liming Fang, Run Wang, Chenhao Lin
Self-supervised learning (SSL), a paradigm harnessing unlabeled datasets to train robust encoders, has recently witnessed substantial success.
1 code implementation • 31 Jul 2023 • Jiaqi Tang, Xiaogang Xu, Sixing Hu, Ying-Cong Chen
Besides, since no current dataset provides the correspondence between the tone-mapping function and the LDR image, we construct a new dataset with both synthetic and real images.
1 code implementation • ICCV 2023 • Haoyuan Wang, Xiaogang Xu, Ke Xu, Rynson W. H. Lau
Neural Radiance Field (NeRF) is a promising approach for synthesizing novel views, given a set of images and the corresponding camera poses of a scene.
1 code implementation • CVPR 2023 • Xiaogang Xu, RuiXing Wang, Jiangbo Lu
Moreover, to improve the appearance modeling, which is implemented with a simple U-Net, we propose a novel structure-guided enhancement module with structure-guided feature synthesis layers.
no code implementations • 5 May 2023 • Yiyi Zhang, Zhiwen Ying, Ying Zheng, Cuiling Wu, Nannan Li, Jun Wang, Xianzhong Feng, Xiaogang Xu
Plant leaf identification is crucial for biodiversity protection and conservation, and has gradually attracted academic attention in recent years.
no code implementations • CVPR 2023 • Tao Hu, Xiaogang Xu, Shu Liu, Jiaya Jia
Also, we present Point Encoding to build Multi-scale Radiance Fields that provide discriminative 3D point features.
1 code implementation • CVPR 2023 • Tao Hu, Xiaogang Xu, Ruihang Chu, Jiaya Jia
However, artifacts still appear in rendered images, due to the challenges in extracting continuous and discriminative 3D features from point clouds.
1 code implementation • ICCV 2023 • Xin Yang, Xiaogang Xu, Yingcong Chen
In this paper, we propose a novel framework that enhances the fidelity of human face inversion by designing a new module to decompose the input images to ID and OOD partitions with invertibility masks.
no code implementations • 11 Dec 2022 • Xiaogang Xu, Hengshuang Zhao, Philip Torr, Jiaya Jia
In this paper, we use Deep Generative Networks (DGNs) with a novel training mechanism to eliminate the distribution gap.
1 code implementation • 22 Oct 2022 • Chiyu Zhang, Xiaogang Xu, Lei Wang, Zaiyan Dai, Jun Yang
Transformer's recent integration into style transfer leverages its proficiency in establishing long-range dependencies, albeit at the expense of attenuated local modeling.
no code implementations • 16 Aug 2022 • Shihurong Yao, Yizhan Huang, Xiaogang Xu
RANLEN uses a dynamically designed mask-based normalization operation, which enhances an image in a spatially varying manner, ensuring that the enhancement results are consistent with the requirements specified by the input mask.
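A minimal sketch of what a mask-based, spatially varying normalization of this kind might look like (tensor shapes and the modulation form are assumptions, not the paper's definition):

```python
import torch

def region_aware_normalization(feat, mask, gamma, beta, eps=1e-5):
    """Minimal sketch of mask-based, spatially varying normalization (assumed form).

    feat: (B, C, H, W) features; mask: (B, K, H, W) soft region masks summing to 1;
    gamma, beta: (B, K, C, 1, 1) per-region modulation parameters.
    """
    out = torch.zeros_like(feat)
    for k in range(mask.shape[1]):
        m = mask[:, k:k + 1]                                  # (B, 1, H, W)
        w = m.sum(dim=(2, 3), keepdim=True) + eps
        mean = (feat * m).sum(dim=(2, 3), keepdim=True) / w   # region-wise statistics
        var = ((feat - mean) ** 2 * m).sum(dim=(2, 3), keepdim=True) / w
        normed = (feat - mean) / torch.sqrt(var + eps)
        out = out + m * (gamma[:, k] * normed + beta[:, k])   # spatially varying modulation
    return out
```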
1 code implementation • 20 Jul 2022 • Xin Lai, Zhuotao Tian, Xiaogang Xu, Yingcong Chen, Shu Liu, Hengshuang Zhao, LiWei Wang, Jiaya Jia
Unsupervised domain adaptation in semantic segmentation has been raised to alleviate the reliance on expensive pixel-wise annotations.
no code implementations • 14 Jul 2022 • Xiaogang Xu, Hengshuang Zhao
Different from existing methods, UADA adaptively updates the DA parameters according to the target model's gradient information during training: given a pre-defined set of DA operations, we randomly decide the types and magnitudes of the DA operations for every data batch, and adaptively update the DA parameters along the gradient direction of the loss with respect to those parameters.
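A simplified sketch of that adaptive update, assuming differentiable DA operations (illustrative only, not the official implementation):

```python
import random
import torch

def uada_batch(model, loss_fn, images, labels, da_ops, magnitudes, lr_da=0.01):
    """Simplified sketch of the adaptive DA update described above (not the official code).

    da_ops: list of differentiable ops op(images, magnitude)
    magnitudes: (num_ops,) tensor holding the current magnitude of each op
    """
    idx = random.randrange(len(da_ops))                    # random DA type for this batch
    mag = magnitudes[idx].detach().clone().requires_grad_(True)
    augmented = da_ops[idx](images, mag)                   # magnitude enters the graph

    loss = loss_fn(model(augmented), labels)
    grad_mag = torch.autograd.grad(loss, mag, retain_graph=True)[0]

    with torch.no_grad():                                  # move the magnitude along the loss gradient
        magnitudes[idx] += lr_da * grad_mag
    return loss                                            # backprop this loss for the model as usual
```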
1 code implementation • 5 Jul 2022 • Xiaogang Xu, RuiXing Wang, Chi-Wing Fu, Jiaya Jia
Despite the quality improvement brought by the recent methods, video super-resolution (SR) is still very challenging, especially for videos that are low-light and noisy.
no code implementations • 4 Jul 2022 • Xiaogang Xu, Yitong Yu, Nianjuan Jiang, Jiangbo Lu, Bei Yu, Jiaya Jia
Moreover, we propose a new video denoising framework, called Recurrent Video Denoising Transformer (RVDT), which achieves SOTA performance on PVDD and other current video denoising benchmarks.
no code implementations • 18 Feb 2022 • Yiyi Zhang, Ying Zheng, Xiaogang Xu, Jun Wang
In this paper, we investigate the role of self-supervised representation learning in the context of CDFSL via a thorough evaluation of existing methods.
1 code implementation • CVPR 2022 • Xiaogang Xu, RuiXing Wang, Chi-Wing Fu, Jiaya Jia
They are long-range operations for image regions of extremely low Signal-to-Noise-Ratio (SNR) and short-range operations for other regions.
Ranked #3 on Low-Light Image Enhancement on LIME
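The entry above describes an SNR-aware split between long- and short-range operations; a much-simplified sketch of how such a fusion could be wired (the SNR proxy, threshold, and branch interfaces are all assumptions):

```python
import torch
import torch.nn.functional as F

def snr_guided_fusion(image, conv_branch, transformer_branch, kernel=5, thresh=0.5):
    """Much-simplified sketch of SNR-guided fusion of short- and long-range branches.

    conv_branch / transformer_branch: callables mapping the image to same-shaped features;
    the SNR estimate below is a common denoise-and-compare proxy, not the paper's exact one.
    """
    gray = image.mean(dim=1, keepdim=True)
    denoised = F.avg_pool2d(gray, kernel, stride=1, padding=kernel // 2)
    noise = (gray - denoised).abs()
    snr = denoised / (noise + 1e-6)
    snr = snr / (snr.amax(dim=(2, 3), keepdim=True) + 1e-6)   # normalize to [0, 1]

    mask = (snr > thresh).float()           # high-SNR regions take the short-range (conv) path
    short_feat = conv_branch(image)
    long_feat = transformer_branch(image)   # long-range path for extremely low-SNR regions
    return mask * short_feat + (1.0 - mask) * long_feat
```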
1 code implementation • 15 Oct 2021 • Yinpeng Dong, Qi-An Fu, Xiao Yang, Wenzhao Xiang, Tianyu Pang, Hang Su, Jun Zhu, Jiayu Tang, Yuefeng Chen, Xiaofeng Mao, Yuan He, Hui Xue, Chao Li, Ye Liu, Qilong Zhang, Lianli Gao, Yunrui Yu, Xitong Gao, Zhe Zhao, Daquan Lin, Jiadong Lin, Chuanbiao Song, ZiHao Wang, Zhennan Wu, Yang Guo, Jiequan Cui, Xiaogang Xu, Pengguang Chen
Due to the vulnerability of deep neural networks (DNNs) to adversarial examples, a large number of defense techniques have been proposed to alleviate this problem in recent years.
no code implementations • 12 Aug 2021 • Xiaogang Xu, Yi Wang, LiWei Wang, Bei Yu, Jiaya Jia
To synthesize a realistic action sequence based on a single human image, it is crucial to model both motion patterns and diversity in the action video.
no code implementations • CVPR 2021 • Tao Hu, LiWei Wang, Xiaogang Xu, Shu Liu, Jiaya Jia
Recent single-view 3D reconstruction methods reconstruct an object's shape and texture from a single image with only 2D image-level annotation.
1 code implementation • ICCV 2021 • RuiXing Wang, Xiaogang Xu, Chi-Wing Fu, Jiangbo Lu, Bei Yu, Jiaya Jia
Low-light video enhancement is an important task.
no code implementations • 1 Jan 2021 • Xiaogang Xu, Hengshuang Zhao, Philip Torr, Jiaya Jia
Specifically, compared with previous methods, we propose a more efficient pixel-level training constraint that eases the difficulty of aligning adversarial samples with clean samples, thereby noticeably enhancing robustness against adversarial samples.
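A minimal sketch of a pixel-level alignment constraint of this kind (the generator name and weighting are assumptions; the entry does not specify the exact formulation):

```python
import torch.nn.functional as F

def pixel_alignment_loss(generator, x_adv, x_clean, lambda_pix=1.0):
    """Pixel-level constraint pulling mapped adversarial samples toward their clean counterparts.

    generator: an assumed network mapping adversarial inputs back toward the clean
    distribution; the entry does not specify the exact formulation or weighting.
    """
    x_rec = generator(x_adv)
    # Simple pixel-wise alignment between the mapped adversarial sample and the clean image.
    return lambda_pix * F.l1_loss(x_rec, x_clean)
```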
2 code implementations • ICCV 2021 • Xiaogang Xu, Hengshuang Zhao, Jiaya Jia
Adversarial training is promising for improving robustness of deep neural networks towards adversarial perturbations, especially on the classification task.
no code implementations • ICCV 2019 • Xiaogang Xu, Ying-Cong Chen, Jiaya Jia
Synthesizing novel views from a 2D image requires inferring the 3D structure and projecting it back to 2D from a new viewpoint.