no code implementations • ECCV 2020 • Shuchen Weng, Wenbo Li, Dawei Li, Hongxia Jin, Boxin Shi
We study conditional image repainting where a model is trained to generate visual content conditioned on user inputs, and composite the generated content seamlessly onto a user provided image while preserving the semantics of users' inputs.
1 code implementation • 3 Oct 2024 • Kai Liu, Ziqing Zhang, Wenbo Li, Renjing Pei, Fenglong Song, Xiaohong Liu, Linghe Kong, Yulun Zhang
Image quality assessment (IQA) serves as the gold standard for all models' performance in nearly all computer vision fields.
1 code implementation • 20 Sep 2024 • Zhibin Lan, LiQiang Niu, Fandong Meng, Wenbo Li, Jie Zhou, Jinsong Su
Recently, when dealing with high-resolution images, dominant LMMs usually divide them into multiple local images and one global image, which will lead to a large number of visual tokens.
1 code implementation • 3 Sep 2024 • Kun Zhou, Xinyu Lin, Wenbo Li, Xiaogang Xu, Yuanhao Cai, Zhonghang Liu, Xiaoguang Han, Jiangbo Lu
Previous low-light image enhancement (LLIE) approaches, while employing frequency decomposition techniques to address the intertwined challenges of low frequency (e.g., illumination recovery) and high frequency (e.g., noise reduction), primarily focused on the development of dedicated and complex networks to achieve improved performance.
no code implementations • 16 Aug 2024 • Xin Di, Long Peng, Peizhe Xia, Wenbo Li, Renjing Pei, Yang Cao, Yang Wang, Zheng-Jun Zha
Moreover, AdaUp is designed to dynamically adjust the upsampling kernel based on the spatial distribution of multi-frame sub-pixel information in the different burst scenes, thereby facilitating the reconstruction of the spatial arrangement of high-resolution details.
1 code implementation • 12 Aug 2024 • Bohao Peng, Jian Wang, Yuechen Zhang, Wenbo Li, Ming-Chang Yang, Jiaya Jia
In this paper, we propose ControlNeXt: a powerful and efficient method for controllable image and video generation.
no code implementations • 25 Jul 2024 • Haoyu Chen, Wenbo Li, Jinjin Gu, Jingjing Ren, Sixiang Chen, Tian Ye, Renjing Pei, Kaiwen Zhou, Fenglong Song, Lei Zhu
RestoreAgent autonomously assesses the type and extent of degradation in input images and performs restoration through (1) determining the appropriate restoration tasks, (2) optimizing the task sequence, (3) selecting the most suitable models, and (4) executing the restoration.
no code implementations • 2 Jul 2024 • Jingjing Ren, Wenbo Li, Haoyu Chen, Renjing Pei, Bin Shao, Yong Guo, Long Peng, Fenglong Song, Lei Zhu
Ultra-high-resolution image generation poses great challenges, such as increased semantic planning complexity and detail synthesis difficulties, alongside substantial training resource demands.
no code implementations • 24 Jun 2024 • Shuwei Shi, Wenbo Li, Yuechen Zhang, Jingwen He, Biao Gong, Yinqiang Zheng
Diffusion models excel at producing high-quality images; however, scaling to higher resolutions, such as 4K, often results in over-smoothed content, structural distortions, and repetitive patterns.
1 code implementation • 24 Jun 2024 • Aiwen Jiang, Zhi Wei, Long Peng, Feiqiang Liu, Wenbo Li, Mingwen Wang
Specifically, on the one hand, an image-restoration prompt alignment decoder is proposed to automatically discern the degradation degree of LR images, thereby generating beneficial degradation priors for image restoration.
no code implementations • 11 Jun 2024 • Long Peng, Wenbo Li, Renjing Pei, Jingjing Ren, Xueyang Fu, Yang Wang, Yang Cao, Zheng-Jun Zha
Existing image super-resolution (SR) techniques often fail to generalize effectively in complex real-world settings due to the significant divergence between training data and practical scenarios.
no code implementations • 6 Jun 2024 • Minghao Yang, Linlin Gao, Pengyuan Li, Wenbo Li, Yihong Dong, Zhiying Cui
Current structured pruning methods often result in considerable accuracy drops due to abrupt network changes and loss of information from pruned structures.
1 code implementation • 1 Jun 2024 • Jiahua Dong, Hui Yin, Hongliu Li, Wenbo Li, Yulun Zhang, Salman Khan, Fahad Shahbaz Khan
Experiments verify the benefits of our DHM for HSI reconstruction.
no code implementations • 21 May 2024 • Huangjun Shen, Liangying Shao, Wenbo Li, Zhibin Lan, Zhanyu Liu, Jinsong Su
In recent years, multi-modal machine translation has attracted significant interest in both academia and industry due to its superior performance.
1 code implementation • 11 May 2024 • Long Peng, Yang Cao, Renjing Pei, Wenbo Li, Jiaming Guo, Xueyang Fu, Yang Wang, Zheng-Jun Zha
These convolutions are integrated in parallel with a novel linear weighting mechanism to form an Adaptive Directional Gradient Convolution (DGConv), which adaptively weights and fuses the basic directional gradients to improve the gradient arrangement perception capability for both regular and irregular textures.
no code implementations • CVPR 2024 • Haoyu Chen, Wenbo Li, Jinjin Gu, Jingjing Ren, Haoze Sun, Xueyi Zou, Zhensong Zhang, Youliang Yan, Lei Zhu
Leveraging unseen LR images for self-supervised learning guides the model to adapt its modeling space to the target domain, facilitating fine-tuning of SR models without requiring paired high-resolution (HR) images.
1 code implementation • CVPR 2024 • Haoze Sun, Wenbo Li, Jianzhuang Liu, Haoyu Chen, Renjing Pei, Xueyi Zou, Youliang Yan, Yujiu Yang
We achieve this by marrying image appearance and language understanding to generate a cognitive embedding, which not only activates prior information from large text-to-image diffusion models but also facilitates the generation of high-quality reference images to optimize the SR process.
1 code implementation • ICCV 2023 • Xin Yu, Peng Dai, Wenbo Li, Lan Ma, Zhengzhe Liu, Xiaojuan Qi
In this work, we focus on synthesizing high-quality textures on 3D meshes.
1 code implementation • 10 Jun 2023 • Kun Zhou, Wenbo Li, Nianjuan Jiang, Xiaoguang Han, Jiangbo Lu
To address this, we propose NeRFLiX, a general NeRF-agnostic restorer paradigm that learns a degradation-driven inter-viewpoint mixer.
Ranked #1 on Novel View Synthesis on Tanks and Temples
1 code implementation • 12 Apr 2023 • Wei Ji, Jingjing Li, Qi Bi, TingWei Liu, Wenbo Li, Li Cheng
Recently, Meta AI Research introduced a general, promptable Segment Anything Model (SAM) pre-trained on an unprecedentedly large segmentation dataset (SA-1B).
no code implementations • 16 Mar 2023 • Wenqian Zhao, Qi Sun, Yang Bai, Wenbo Li, Haisheng Zheng, Bei Yu, Martin D. F. Wong
Recent years have witnessed impressive progress in super-resolution (SR) processing.
2 code implementations • 14 Mar 2023 • Yiming Tan, Dehai Min, Yu Li, Wenbo Li, Nan Hu, Yongrui Chen, Guilin Qi
ChatGPT is a powerful large language model (LLM) that covers knowledge resources such as Wikipedia and supports natural language question answering using its own knowledge.
Ranked #1 on Knowledge Base Question Answering on WebQuestionsSP (Accuracy metric)
1 code implementation • CVPR 2023 • Kun Zhou, Wenbo Li, Yi Wang, Tao Hu, Nianjuan Jiang, Xiaoguang Han, Jiangbo Lu
Neural radiance fields (NeRF) show great success in novel view synthesis.
Ranked #1 on Novel View Synthesis on LLFF
1 code implementation • CVPR 2024 • Shaoteng Liu, Yuechen Zhang, Wenbo Li, Zhe Lin, Jiaya Jia
This paper presents Video-P2P, a novel framework for real-world video editing with cross-attention control.
no code implementations • ICCV 2023 • Lu Qi, Jason Kuen, Tiancheng Shen, Jiuxiang Gu, Wenbo Li, Weidong Guo, Jiaya Jia, Zhe Lin, Ming-Hsuan Yang
Given the high-quality and -resolution nature of the dataset, we propose CropFormer which is designed to tackle the intractability of instance-level segmentation on high-resolution images.
no code implementations • 21 Dec 2022 • Shengju Qian, Yi Zhu, Wenbo Li, Mu Li, Jiaya Jia
The architecture of transformers, which have recently seen booming applications in vision tasks, has pivoted against the widespread convolutional paradigm.
2 code implementations • 6 Dec 2022 • Wenbo Li, Xin Yu, Kun Zhou, Yibing Song, Zhe Lin, Jiaya Jia
To achieve high-quality results with low computational cost, we present a novel pixel spread model (PSM) that iteratively employs decoupled probabilistic modeling, combining the optimization efficiency of GANs with the prediction tractability of probabilistic models.
no code implementations • 25 Nov 2022 • Kun Zhou, Kenkun Liu, Wenbo Li, Xiaoguang Han, Jiangbo Lu
To address those issues, we propose a novel mutual guidance network (MGN) to perform effective bidirectional global-local information exchange while keeping a compact architecture.
1 code implementation • 20 Jul 2022 • Xin Yu, Peng Dai, Wenbo Li, Lan Ma, Jiajun Shen, Jia Li, Xiaojuan Qi
With the rapid development of mobile devices, modern widely-used mobile phones typically allow users to capture 4K resolution (i.e., ultra-high-definition) images.
Ranked #1 on Image Restoration on UHDM
1 code implementation • CVPR 2022 • Peng Dai, Xin Yu, Lan Ma, Baoheng Zhang, Jia Li, Wenbo Li, Jiajun Shen, Xiaojuan Qi
Moire patterns, appearing as color distortions, severely degrade image and video qualities when filming a screen with digital cameras.
1 code implementation • CVPR 2022 • Wenbo Li, Zhe Lin, Kun Zhou, Lu Qi, Yi Wang, Jiaya Jia
Recent studies have shown the importance of modeling long-range interactions in the inpainting problem.
Ranked #1 on Image Inpainting on CelebA-HQ
no code implementations • CVPR 2023 • Kun Zhou, Wenbo Li, Xiaoguang Han, Jiangbo Lu
Without the bells and whistles, our plug-and-play TCL is capable of improving the performance of existing VFI frameworks.
Ranked #1 on Video Frame Interpolation on Vimeo90K
no code implementations • CVPR 2022 • Luwei Yang, Rakesh Shrestha, Wenbo Li, Shuaicheng Liu, Guofeng Zhang, Zhaopeng Cui, Ping Tan
Standard visual localization methods build an a priori 3D model of a scene, which is used to establish correspondences against the 2D keypoints in a query image.
1 code implementation • 19 Dec 2021 • Wenbo Li, Xin Lu, Shengju Qian, Jiangbo Lu, Xiangyu Zhang, Jiaya Jia
Pre-training has marked numerous state of the arts in high-level computer vision, while few attempts have ever been made to investigate how pre-training acts in image processing systems.
Ranked #10 on Image Super-Resolution on Set5 - 2x upscaling (using extra training data)
1 code implementation • CVPR 2022 • Kun Zhou, Wenbo Li, Liying Lu, Xiaoguang Han, Jiangbo Lu
Long-range temporal alignment is critical yet challenging for video restoration tasks.
Ranked #1 on Video Super-Resolution on Vimeo-90K
no code implementations • 23 Nov 2021 • Yifan Chang, Wenbo Li, Jian Peng, Bo Tang, Yu Kang, Yinjie Lei, Yuanmiao Gui, Qing Zhu, Yu Liu, Haifeng Li
Different from previous reviews that mainly focus on the catastrophic forgetting phenomenon in CL, this paper surveys CL from a more macroscopic perspective based on the Stability Versus Plasticity mechanism.
no code implementations • 21 Nov 2021 • Jian Peng, Xian Sun, Min Deng, Chao Tao, Bo Tang, Wenbo Li, Guohua Wu, Qing Zhu, Yu Liu, Tao Lin, Haifeng Li
This paper presents a learning model by active forgetting mechanism with artificial neural networks.
1 code implementation • CVPR 2021 • Liying Lu, Wenbo Li, Xin Tao, Jiangbo Lu, Jiaya Jia
Therefore, high-quality correspondence matching is critical.
2 code implementations • NeurIPS 2020 • Wenbo Li, Kun Zhou, Lu Qi, Nianjuan Jiang, Jiangbo Lu, Jiaya Jia
Single image super-resolution (SISR) deals with a fundamental problem of upsampling a low-resolution (LR) image to its high-resolution (HR) version.
no code implementations • 30 Apr 2021 • Yichen Zhang, Zeyang Song, Wenbo Li
Data augmentation has always been an effective way to overcome the overfitting issue when the dataset is small.
2 code implementations • 29 Mar 2021 • Wenbo Li, Kun Zhou, Lu Qi, Liying Lu, Nianjuan Jiang, Jiangbo Lu, Jiaya Jia
We consider the single image super-resolution (SISR) problem, where a high-resolution (HR) image is generated based on a low-resolution (LR) input.
no code implementations • 3 Oct 2020 • Yi Wei, Zhe Gan, Wenbo Li, Siwei Lyu, Ming-Ching Chang, Lei Zhang, Jianfeng Gao, Pengchuan Zhang
We present Mask-guided Generative Adversarial Network (MagGAN) for high-resolution face attribute editing, in which semantic facial masks from a pre-trained face parser are used to guide the fine-grained image editing process.
1 code implementation • ECCV 2020 • Wenbo Li, Xin Tao, Taian Guo, Lu Qi, Jiangbo Lu, Jiaya Jia
Motivated by these findings, we propose a temporal multi-correspondence aggregation strategy to leverage similar patches across frames, and a cross-scale nonlocal-correspondence aggregation scheme to explore self-similarity of images across scales.
no code implementations • 16 Jul 2020 • Teng Liu, Xiaolin Tang, Jinwei Zhang, Wenbo Li, Zejian Deng, Yalian Yang
As a typical vehicle-cyber-physical-system (V-CPS), connected automated vehicles have attracted more and more attention in recent years.
no code implementations • 22 May 2020 • Yuhang Song, Wenbo Li, Lei Zhang, Jianwei Yang, Emre Kiciman, Hamid Palangi, Jianfeng Gao, C.-C. Jay Kuo, Pengchuan Zhang
We study in this paper the problem of novel human-object interaction (HOI) detection, aiming at improving the generalization ability of the model to unseen scenarios.
no code implementations • 26 Apr 2020 • Wenbo Li, Yaodong Cui, Yintao Ma, Xingxin Chen, Guofa Li, Gang Guo, Dongpu Cao
In this paper, we introduce a new dataset, the driver emotion facial expression (DEFE) dataset, for driver spontaneous emotions analysis.
no code implementations • 27 Jan 2020 • Jie Chen, Haozhe Huang, Jian Peng, Jiawei Zhu, Li Chen, Wenbo Li, Binyu Sun, Haifeng Li
The feature-learning procedure of CNN largely depends on the architecture of CNN.
1 code implementation • CVPR 2019 • Wenbo Li, Pengchuan Zhang, Lei Zhang, Qiuyuan Huang, Xiaodong He, Siwei Lyu, Jianfeng Gao
In this paper, we propose Object-driven Attentive Generative Adversarial Networks (Obj-GANs) that allow object-centered text-to-image synthesis for complex scenes.
7 code implementations • 1 Jan 2019 • Wenbo Li, Zhicheng Wang, Binyi Yin, Qixiang Peng, Yuming Du, Tianzi Xiao, Gang Yu, Hongtao Lu, Yichen Wei, Jian Sun
Existing pose estimation approaches fall into two categories: single-stage and multi-stage methods.
Ranked #1 on Pose Estimation on COCO minival
no code implementations • 6 Nov 2018 • Wenbo Li, Longyin Wen, Xiao Bian, Siwei Lyu
Video style transfer is a useful component for applications such as augmented reality, non-photorealistic rendering, and interactive games.
no code implementations • 3 Jul 2018 • Wenbo Li, Ming-Ching Chang, Siwei Lyu
We present a bootstrapping framework to simultaneously improve multi-person tracking and activity recognition at individual, interaction and social group activity levels.
no code implementations • 20 May 2018 • Shuchen Weng, Wenbo Li, Yi Zhang, Siwei Lyu
Inspired by the dual-stream hypothesis in neural science, we propose a novel dual-stream framework for modeling the interweaved spatiotemporal dependency, and develop a convolutional neural network within this framework that aims to achieve high adaptability and flexibility in STS configurations from various diagonals, i.e., sequential order, dependency range and features.
no code implementations • ICCV 2017 • Wenbo Li, Longyin Wen, Ming-Ching Chang, Ser Nam Lim, Siwei Lyu
The RNNs in RNN-T are co-trained with the action category hierarchy, which determines the structure of RNN-T.
Ranked #109 on Skeleton Based Action Recognition on NTU RGB+D
no code implementations • 19 Oct 2016 • Fengwei Yu, Wenbo Li, Quanquan Li, Yu Liu, Xiaohua Shi, Junjie Yan
In this paper, we explore high-performance detection and deep learning based appearance features, and show that they lead to significantly better MOT results in both online and offline settings.
no code implementations • ICCV 2015 • Wenbo Li, Longyin Wen, Mooi Choo Chuah, Siwei Lyu
In this paper, we propose the category-blind human recognition method (CHARM) which can recognize a human action without making assumptions of the action category.
no code implementations • CVPR 2014 • Longyin Wen, Wenbo Li, Junjie Yan, Zhen Lei, Dong Yi, Stan Z. Li
Multi-target tracking is an interesting but challenging task in the computer vision field.