1 code implementation • 26 Nov 2024 • Libo Zhu, Jianze Li, Haotong Qin, Wenbo Li, Yulun Zhang, Yong Guo, Xiaokang Yang
Diffusion-based image super-resolution (SR) models have shown superior performance at the cost of multiple denoising steps.
1 code implementation • 26 Nov 2024 • Zheng Chen, Xun Zhang, Wenbo Li, Renjing Pei, Fenglong Song, Xiongkuo Min, Xiaohong Liu, Xin Yuan, Yong Guo, Yulun Zhang
Experiments demonstrate that our proposed task paradigm, dataset, and benchmark facilitate more fine-grained IQA applications.
1 code implementation • 22 Nov 2024 • Hang Guo, Yong Guo, Yaohua Zha, Yulun Zhang, Wenbo Li, Tao Dai, Shu-Tao Xia, Yawei Li
Mamba-based image restoration backbones have recently demonstrated significant potential in balancing global receptive fields and computational efficiency.
1 code implementation • 15 Nov 2024 • Rui Yin, Haotong Qin, Yulun Zhang, Wenbo Li, Yong Guo, Jianjun Zhu, Cheng Wang, Biao Jia
BiDense incorporates two key techniques: the Distribution-adaptive Binarizer (DAB) and the Channel-adaptive Full-precision Bypass (CFB).
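A minimal PyTorch sketch of what a distribution-adaptive binarizer could look like, using per-channel mean and mean absolute deviation as the adaptive threshold and scale; this is an illustrative reading of the DAB idea, not the released BiDense code.

```python
import torch
import torch.nn as nn

class DistributionAdaptiveBinarizer(nn.Module):
    """Binarize activations with a per-channel threshold and scale derived
    from the input distribution (illustrative sketch, not the official DAB)."""

    def forward(self, x):
        # Per-channel mean acts as an adaptive threshold; the mean absolute
        # deviation recovers the magnitude lost by taking the sign.
        mean = x.mean(dim=(2, 3), keepdim=True)
        scale = (x - mean).abs().mean(dim=(2, 3), keepdim=True)
        x_bin = torch.sign(x - mean)
        # Straight-through estimator: forward uses the binary value,
        # backward passes gradients through the centered input.
        x_ste = x_bin.detach() + (x - mean) - (x - mean).detach()
        return scale * x_ste

x = torch.randn(2, 8, 16, 16)
print(DistributionAdaptiveBinarizer()(x).shape)  # torch.Size([2, 8, 16, 16])
```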
no code implementations • 14 Oct 2024 • Junbo Qiao, Jincheng Liao, Wei Li, Yulun Zhang, Yong Guo, Yi Wen, Zhangxizi Qiu, Jiao Xie, Jie Hu, Shaohui Lin
State Space Models (SSMs), such as Mamba, have shown strong representation ability in modeling long-range dependencies with linear complexity, with successful applications ranging from high-level to low-level vision tasks.
no code implementations • 5 Oct 2024 • Yong Guo, Shulian Zhang, Haolin Pan, Jing Liu, Yulun Zhang, Jian Chen
To address this, we propose a Gap Preserving Distillation (GPD) method that trains an additional dynamic teacher model from scratch along with training the student to bridge this gap.
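As a rough sketch of the co-training idea, the step below trains a dynamic teacher from scratch alongside the student while the student additionally distills from it; the losses, temperature, and update order are illustrative assumptions, not the paper's exact GPD objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def gpd_style_step(student, dynamic_teacher, x, y, opt_s, opt_t, T=4.0):
    """One illustrative co-training step: the dynamic teacher is trained from
    scratch together with the student, and the student also matches its logits."""
    logits_t = dynamic_teacher(x)
    logits_s = student(x)

    loss_t = F.cross_entropy(logits_t, y)                      # teacher learns the task
    kd = F.kl_div(F.log_softmax(logits_s / T, dim=1),
                  F.softmax(logits_t.detach() / T, dim=1),
                  reduction="batchmean") * T * T
    loss_s = F.cross_entropy(logits_s, y) + kd                 # student: task + distillation

    opt_t.zero_grad(); loss_t.backward(); opt_t.step()
    opt_s.zero_grad(); loss_s.backward(); opt_s.step()
    return loss_s.item(), loss_t.item()

student, teacher = nn.Linear(32, 10), nn.Linear(32, 10)
opt_s = torch.optim.SGD(student.parameters(), lr=0.1)
opt_t = torch.optim.SGD(teacher.parameters(), lr=0.1)
x, y = torch.randn(8, 32), torch.randint(0, 10, (8,))
print(gpd_style_step(student, teacher, x, y, opt_s, opt_t))
```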
1 code implementation • 5 Oct 2024 • Jianze Li, JieZhang Cao, Zichen Zou, Xiongfei Su, Xin Yuan, Yulun Zhang, Yong Guo, Xiaokang Yang
However, these methods incur substantial training costs and may constrain the performance of the student model by the teacher's limitations.
1 code implementation • 29 Sep 2024 • Kun Cheng, Lei Yu, Zhijun Tu, Xiao He, Liyu Chen, Yong Guo, Mingrui Zhu, Nannan Wang, Xinbo Gao, Jie Hu
In this work, we design an effective diffusion transformer for image super-resolution (DiT-SR) that achieves the visual quality of prior-based methods, yet is trained entirely from scratch.
1 code implementation • 14 Aug 2024 • Xiao He, Huaao Tang, Zhijun Tu, Junchao Zhang, Kun Cheng, Hanting Chen, Yong Guo, Mingrui Zhu, Nannan Wang, Xinbo Gao, Jie Hu
Specifically, we introduce a novel score distillation strategy to align the data distribution between the outputs of the student and teacher models after minor noise perturbation.
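A hedged sketch of this flavor of alignment: both outputs are perturbed with the same minor noise and a (dummy) denoiser's predictions on them are matched; the actual score distillation loss in the paper may differ.

```python
import torch
import torch.nn.functional as F

def noisy_alignment_loss(student_img, teacher_img, denoiser, t, noise_std=0.05):
    """Illustrative loss: perturb both outputs with the same minor noise and
    align the denoiser's predictions on them (not the paper's exact strategy)."""
    noise = noise_std * torch.randn_like(teacher_img)
    pred_s = denoiser(student_img + noise, t)
    with torch.no_grad():
        pred_t = denoiser(teacher_img + noise, t)
    return F.mse_loss(pred_s, pred_t)

# Dummy denoiser just for demonstration; a real one would be a diffusion U-Net.
denoiser = lambda x, t: 0.9 * x
s_out = torch.rand(1, 3, 64, 64, requires_grad=True)
t_out = torch.rand(1, 3, 64, 64)
print(noisy_alignment_loss(s_out, t_out, denoiser, t=10))
```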
2 code implementations • 6 Jul 2024 • Haolin Pan, Yong Guo, Mianjie Yu, Jian Chen
Real-world data often follows a long-tailed distribution, where a few head classes occupy most of the data and a large number of tail classes only contain very limited samples.
Ranked #6 on Long-tail Learning on CIFAR-10-LT (ρ=50)
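For context, a common way to construct such an exponentially imbalanced split (e.g., CIFAR-10-LT with imbalance ratio ρ) is sketched below; the per-class counts are illustrative.

```python
def long_tailed_counts(num_classes=10, max_per_class=5000, rho=50):
    """Exponentially decaying sample counts so that the head/tail ratio equals rho."""
    return [int(max_per_class * rho ** (-i / (num_classes - 1)))
            for i in range(num_classes)]

print(long_tailed_counts())  # head class keeps 5000 samples, tail class only 100
```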
no code implementations • 2 Jul 2024 • Jingjing Ren, Wenbo Li, Haoyu Chen, Renjing Pei, Bin Shao, Yong Guo, Long Peng, Fenglong Song, Lei Zhu
Ultra-high-resolution image generation poses great challenges, such as increased semantic planning complexity and detail synthesis difficulties, alongside substantial training resource demands.
1 code implementation • 10 Jun 2024 • Kai Liu, Haotong Qin, Yong Guo, Xin Yuan, Linghe Kong, Guihai Chen, Yulun Zhang
Low-bit quantization has become widespread for compressing image super-resolution (SR) models for edge deployment, which allows advanced SR models to enjoy compact low-bit parameters and efficient integer/bitwise constructions for storage compression and inference acceleration, respectively.
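A generic sketch of symmetric uniform low-bit quantization of the kind used to compress SR weights; the clipping range and rounding scheme here are standard choices, not the paper's specific quantizer.

```python
import torch

def uniform_quantize(w, bits=4):
    """Symmetric uniform quantization to `bits` bits (generic sketch)."""
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max() / qmax
    w_q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax)
    return w_q * scale, scale  # de-quantized weights and the scale factor

w = torch.randn(64, 64, 3, 3)
w_hat, s = uniform_quantize(w, bits=4)
print((w - w_hat).abs().max() <= s / 2 + 1e-6)  # error bounded by half a quantization step
```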
1 code implementation • 9 Jun 2024 • Zheng Chen, Haotong Qin, Yong Guo, Xiongfei Su, Xin Yuan, Linghe Kong, Yulun Zhang
Nonetheless, due to the model structure and the multi-step iterative attribute of DMs, existing binarization methods result in significant performance degradation.
1 code implementation • CVPR 2024 • Runhao Zeng, Xiaoyong Chen, Jiaming Liang, Huisi Wu, Guangzhong Cao, Yong Guo
In this paper, we extensively analyze the robustness of seven leading TAD methods and obtain some interesting findings: 1) Existing methods are particularly vulnerable to temporal corruptions, and end-to-end methods are often more susceptible than those with a pre-trained feature extractor; 2) Vulnerability mainly comes from localization error rather than classification error; 3) When corruptions occur in the middle of an action instance, TAD models tend to yield the largest performance drop.
no code implementations • 18 Jan 2024 • Zhao Wang, Aoxue Li, Lingting Zhu, Yong Guo, Qi Dou, Zhenguo Li
Customized text-to-video generation aims to generate high-quality videos guided by text prompts and subject references.
no code implementations • CVPR 2024 • Enxuan Gu, Hongwei Ge, Yong Guo
To address this issue, we propose an explicit Content Decoupling framework for IR, dubbed CoDe, which models the restoration process end-to-end by utilizing decoupled content components in a divide-and-conquer-like architecture.
no code implementations • 16 Oct 2023 • Yingwei Ma, Yue Yu, Shanshan Li, Yu Jiang, Yong Guo, Yuanliang Zhang, Yutao Xie, Xiangke Liao
Whereas traditional techniques leveraging such semantic information require complex static or dynamic code analysis to obtain features such as data flow and control flow, SeCoT demonstrates that this process can be fully automated via the intrinsic capabilities of LLMs (i.e., in-context learning), while remaining generalizable and applicable to challenging domains.
no code implementations • 2 Oct 2023 • Zhenhua Xu, Yujia Zhang, Enze Xie, Zhen Zhao, Yong Guo, Kwan-Yee K. Wong, Zhenguo Li, Hengshuang Zhao
Multimodal large language models (MLLMs) have emerged as a prominent area of interest within the research community, given their proficiency in handling and reasoning with non-textual data, including images and videos.
1 code implementation • ICCV 2023 • Yong Guo, David Stutz, Bernt Schiele
Interestingly, we observe that the attention mechanism of ViTs tends to rely on a few important tokens, a phenomenon we call token overfocusing.
1 code implementation • CVPR 2023 • Yong Guo, David Stutz, Bernt Schiele
Despite their success, vision transformers still remain vulnerable to image corruptions, such as noise or blur.
no code implementations • 13 Dec 2022 • Qinyi Deng, Yong Guo, Zhibang Yang, Haolin Pan, Jian Chen
In this way, such data can also be very informative if we can effectively exploit the complementary labels, i.e., the classes that a sample does not belong to.
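As an illustration, one simple way to exploit a complementary label is to push down the predicted probability of the class a sample is known not to belong to; the loss below is a generic sketch, not the paper's formulation.

```python
import torch
import torch.nn.functional as F

def complementary_label_loss(logits, comp_labels):
    """Illustrative loss for complementary labels: minimize the probability of
    the class each sample is known NOT to belong to."""
    probs = F.softmax(logits, dim=1)
    p_not = probs.gather(1, comp_labels.unsqueeze(1)).squeeze(1)
    return -torch.log(1.0 - p_not + 1e-8).mean()

logits = torch.randn(4, 10, requires_grad=True)
comp = torch.tensor([3, 0, 7, 1])  # classes these samples do NOT belong to
complementary_label_loss(logits, comp).backward()
```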
no code implementations • ICCV 2023 • Bingna Xu, Yong Guo, Luoqian Jiang, Mianjie Yu, Jian Chen
Inspired by this, we propose a Hierarchical Collaborative Downscaling (HCD) method that performs gradient descent in both HR and LR domains to improve the downscaled representations.
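The snippet below sketches the LR-domain half of this idea, refining a bicubically downscaled image by gradient descent through a fixed upscaler; the actual HCD objective and its HR-domain branch are not reproduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def refine_lr_by_gradient_descent(hr, upscaler, scale=4, steps=20, lr=0.01):
    """Optimize the downscaled image so that upscaling it reproduces the HR
    image (an LR-domain sketch only; the full HCD method also works in HR)."""
    lr_img = F.interpolate(hr, scale_factor=1.0 / scale, mode="bicubic",
                           align_corners=False).clone().requires_grad_(True)
    opt = torch.optim.Adam([lr_img], lr=lr)
    for _ in range(steps):
        loss = F.l1_loss(upscaler(lr_img), hr)
        opt.zero_grad(); loss.backward(); opt.step()
    return lr_img.detach()

upscaler = nn.Upsample(scale_factor=4, mode="bicubic", align_corners=False)
hr = torch.rand(1, 3, 128, 128)
print(refine_lr_by_gradient_descent(hr, upscaler).shape)  # torch.Size([1, 3, 32, 32])
```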
1 code implementation • 31 Oct 2022 • Yaofo Chen, Yong Guo, Daihai Liao, Fanbing Lv, Hengjie Song, James Tin-Yau Kwok, Mingkui Tan
Then, we perform a local search within the mined subspace to find effective architectures.
Ranked #2 on Neural Architecture Search on NAS-Bench-201, ImageNet-16-120 (Accuracy (Val) metric)
1 code implementation • 14 Oct 2022 • Yong Guo, Yaofo Chen, Yin Zheng, Qi Chen, Peilin Zhao, Jian Chen, Junzhou Huang, Mingkui Tan
More critically, these independent search processes cannot share their learned knowledge (i.e., the distribution of good architectures) with each other and thus often yield limited search results.
1 code implementation • 30 Jul 2022 • Haolin Pan, Yong Guo, Qinyi Deng, Haomin Yang, Yiqun Chen, Jian Chen
Self-supervised learning (SSL) has achieved remarkable performance in pretraining the models that can be further used in downstream tasks via fine-tuning.
2 code implementations • 16 Jul 2022 • Yong Guo, Mingkui Tan, Zeshuai Deng, Jingdong Wang, Qi Chen, JieZhang Cao, Yanwu Xu, Jian Chen
Nevertheless, it is hard for existing model compression methods to accurately identify the redundant components due to the extremely large SR mapping space.
1 code implementation • 30 Jan 2022 • Yong Guo, David Stutz, Bernt Schiele
We show that EWS greatly improves both robustness against corrupted images and accuracy on clean data.
no code implementations • 12 Jul 2021 • Xinyu Gao, Yi Li, Yanqing Qiu, Bangning Mao, Miaogen Chen, Yanlong Meng, Chunliu Zhao, Juan Kang, Yong Guo, Changyu Shen
Multiple optical scattering occurs when light propagates in a non-uniform medium.
1 code implementation • 1 Jul 2021 • Shuaicheng Niu, Jiaxiang Wu, Guanghui Xu, Yifan Zhang, Yong Guo, Peilin Zhao, Peng Wang, Mingkui Tan
To address this, we present a neural architecture adaptation method, namely Adaptation eXpert (AdaXpert), to efficiently adjust previous architectures on the growing data.
1 code implementation • 30 Jun 2021 • Yong Guo, Yaofo Chen, Mingkui Tan, Kui Jia, Jian Chen, Jingdong Wang
In practice, the convolutional operation on some of the windows (e.g., smooth windows that contain very similar pixels) can be very redundant and may introduce noise into the computation.
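A toy sketch of detecting such smooth windows via local variance, which could then be skipped or approximated; the paper's actual criterion and mechanism are likely different.

```python
import torch
import torch.nn.functional as F

def smooth_window_mask(x, kernel_size=3, threshold=1e-3):
    """Mark windows whose local variance is below a threshold as 'smooth'
    (illustrative only; such windows contribute little new information)."""
    mean = F.avg_pool2d(x, kernel_size, stride=1, padding=kernel_size // 2)
    sq_mean = F.avg_pool2d(x * x, kernel_size, stride=1, padding=kernel_size // 2)
    var = (sq_mean - mean * mean).mean(dim=1, keepdim=True)  # averaged over channels
    return var < threshold  # True where the convolution output is largely redundant

x = torch.randn(1, 3, 32, 32)
print(smooth_window_mask(x).float().mean())  # fraction of smooth locations
```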
1 code implementation • CVPR 2021 • Yaofo Chen, Yong Guo, Qi Chen, Minli Li, Wei Zeng, YaoWei Wang, Mingkui Tan
One of the key steps in Neural Architecture Search (NAS) is to estimate the performance of candidate architectures.
no code implementations • 27 Feb 2021 • Yong Guo, Yaofo Chen, Yin Zheng, Qi Chen, Peilin Zhao, Jian Chen, Junzhou Huang, Mingkui Tan
To this end, we propose a Pareto-Frontier-aware Neural Architecture Generator (NAG) which takes an arbitrary budget as input and produces the Pareto optimal architecture for the target budget.
2 code implementations • 20 Feb 2021 • Yong Guo, Yin Zheng, Mingkui Tan, Qi Chen, Zhipeng Li, Jian Chen, Peilin Zhao, Junzhou Huang
To address this issue, we propose a Neural Architecture Transformer++ (NAT++) method which further enlarges the set of candidate transitions to improve the performance of architecture optimization.
1 code implementation • 19 Jan 2021 • Zhuoman Liu, Wei Jia, Ming Yang, Peiyao Luo, Yong Guo, Mingkui Tan
To address the above issues, in this paper, we propose a novel deep generative model, called Self-Consistent Generative Network (SCGN), which synthesizes novel views from the given input views without explicitly exploiting the geometric information.
no code implementations • 1 Jan 2021 • Yong Guo, Yaofo Chen, Yin Zheng, Peilin Zhao, Jian Chen, Junzhou Huang, Mingkui Tan
To find promising architectures under different budgets, existing methods may have to perform an independent search for each budget, which is very inefficient and unnecessary.
no code implementations • 10 Oct 2020 • Yong Guo, Qingyao Wu, Chaorui Deng, Jian Chen, Mingkui Tan
Although the standard BN can significantly accelerate the training of DNNs and improve the generalization performance, it has several underlying limitations which may hamper the performance in both training and inference.
no code implementations • 21 Sep 2020 • Yixin Liu, Yong Guo, Zichang Liu, Haohua Liu, Jingjie Zhang, Zejun Chen, Jing Liu, Jian Chen
To address this issue, given a target compression rate for the whole model, one can search for the optimal compression rate for each layer.
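A naive brute-force sketch of that search, allocating per-layer keep-ratios under a global budget; the paper's actual search strategy is more sophisticated.

```python
import itertools

def search_layer_rates(layer_sizes, target_rate, candidates=(0.25, 0.5, 0.75, 1.0)):
    """Brute-force sketch: pick per-layer keep-ratios whose overall keep-ratio
    is as close as possible to the target budget without exceeding it."""
    total = sum(layer_sizes)
    best, best_kept = None, -1.0
    for rates in itertools.product(candidates, repeat=len(layer_sizes)):
        kept = sum(r * s for r, s in zip(rates, layer_sizes)) / total
        if kept <= target_rate and kept > best_kept:
            best, best_kept = rates, kept
    return best, best_kept

print(search_layer_rates([1000, 4000, 16000], target_rate=0.5))
```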
1 code implementation • 28 Jul 2020 • Jiezhang Cao, Yong Guo, Qingyao Wu, Chunhua Shen, Junzhou Huang, Mingkui Tan
In this paper, rather than sampling from the predefined prior distribution, we propose an LCCGAN model with local coordinate coding (LCC) to improve the performance of generating data.
1 code implementation • ICML 2020 • Yong Guo, Yaofo Chen, Yin Zheng, Peilin Zhao, Jian Chen, Junzhou Huang, Mingkui Tan
With the proposed search strategy, our Curriculum Neural Architecture Search (CNAS) method significantly improves the search efficiency and finds better architectures than existing NAS methods.
no code implementations • 29 Mar 2020 • Shuaicheng Niu, Jiaxiang Wu, Yifan Zhang, Yong Guo, Peilin Zhao, Junzhou Huang, Mingkui Tan
To alleviate the performance disturbance issue, we propose a new disturbance-immune update strategy for model updating.
3 code implementations • CVPR 2020 • Yong Guo, Jian Chen, Jingdong Wang, Qi Chen, JieZhang Cao, Zeshuai Deng, Yanwu Xu, Mingkui Tan
Extensive experiments with paired training data and unpaired real-world data demonstrate our superiority over existing methods.
1 code implementation • 10 Mar 2020 • Yong Guo, Yongsheng Luo, Zhenhao He, Jin Huang, Jian Chen
To this end, we design a hierarchical SR search space and propose a hierarchical controller for architecture search.
no code implementations • 1 Mar 2020 • JieZhang Cao, Langyuan Mo, Qing Du, Yong Guo, Peilin Zhao, Junzhou Huang, Mingkui Tan
However, the resultant optimization problem is still intractable.
1 code implementation • 4 Jan 2020 • Jing Liu, Bohan Zhuang, Zhuangwei Zhuang, Yong Guo, Junzhou Huang, Jinhui Zhu, Mingkui Tan
In this paper, we propose a simple-yet-effective method called discrimination-aware channel pruning (DCP) to choose the channels that actually contribute to the discriminative power.
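A simplified sketch of scoring channels by how much they contribute to a discriminative (classification) loss and keeping the top ones; DCP's actual formulation with additional discrimination-aware losses is more involved.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def rank_channels_by_discriminative_power(feat, classifier, labels, keep_ratio=0.5):
    """Score each channel by the gradient of a classification loss w.r.t. its
    activations and keep the top-scoring ones (a rough sketch, not full DCP)."""
    feat = feat.detach().requires_grad_(True)
    loss = F.cross_entropy(classifier(feat), labels)
    grad, = torch.autograd.grad(loss, feat)
    scores = (grad * feat).abs().sum(dim=(0, 2, 3))   # per-channel importance
    k = max(1, int(keep_ratio * feat.size(1)))
    return scores.topk(k).indices

classifier = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10))
feat = torch.randn(8, 16, 14, 14)
labels = torch.randint(0, 10, (8,))
print(rank_channels_by_discriminative_power(feat, classifier, labels))
```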
1 code implementation • NeurIPS 2019 • Yong Guo, Yin Zheng, Mingkui Tan, Qi Chen, Jian Chen, Peilin Zhao, Junzhou Huang
To verify the effectiveness of the proposed strategies, we apply NAT to both hand-crafted and NAS-based architectures.
1 code implementation • 27 Mar 2019 • Yong Guo, Qi Chen, Jian Chen, Qingyao Wu, Qinfeng Shi, Mingkui Tan
To address this issue, we develop a novel GAN called Auto-Embedding Generative Adversarial Network (AEGAN), which simultaneously encodes the global structure features and captures the fine-grained details.
1 code implementation • NeurIPS 2018 • Zhuangwei Zhuang, Mingkui Tan, Bohan Zhuang, Jing Liu, Yong Guo, Qingyao Wu, Junzhou Huang, Jinhui Zhu
Channel pruning is one of the predominant approaches for deep model compression.
no code implementations • 27 Sep 2018 • JieZhang Cao, Yong Guo, Langyuan Mo, Peilin Zhao, Junzhou Huang, Mingkui Tan
We study the joint distribution matching problem which aims at learning bidirectional mappings to match the joint distribution of two domains.
Open-Ended Question Answering • Unsupervised Image-To-Image Translation • +2
no code implementations • 19 Sep 2018 • Yong Guo, Qi Chen, Jian Chen, Junzhou Huang, Yanwu Xu, JieZhang Cao, Peilin Zhao, Mingkui Tan
However, most deep learning methods employ feed-forward architectures, and thus the dependencies between LR and HR images are not fully exploited, leading to limited learning performance.
no code implementations • ICML 2018 • Jiezhang Cao, Yong Guo, Qingyao Wu, Chunhua Shen, Junzhou Huang, Mingkui Tan
Generative adversarial networks (GANs) aim to generate realistic data from some prior distribution (e.g., Gaussian noise).
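For reference, a minimal illustration of that standard setup, with a toy generator mapping Gaussian prior samples to data; the dimensions here are arbitrary.

```python
import torch
import torch.nn as nn

# Toy generator: maps Gaussian prior samples z ~ N(0, I) to data space.
generator = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
z = torch.randn(16, 64)          # samples from the prior distribution
fake = generator(z)              # generated data, here 16 flattened 28x28 "images"
print(fake.shape)                # torch.Size([16, 784])
```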
1 code implementation • 6 Nov 2016 • Yong Guo, Jian Chen, Qing Du, Anton Van Den Hengel, Qinfeng Shi, Mingkui Tan
As a result, the representation power of intermediate layers can be very weak and the model becomes very redundant with limited performance.
no code implementations • 15 Jun 2016 • Qiang Guo, Hongwei Chen, Yuxi Wang, Yong Guo, Peng Liu, Xiurui Zhu, Zheng Cheng, Zhenming Yu, Minghua Chen, Sigang Yang, Shizhong Xie
However, according to CS theory, image reconstruction is an iterative process that consumes enormous amounts of computational time and cannot be performed in real time.