Search Results for author: Wenbo Li

Found 56 papers, 26 papers with code

Conditional Image Repainting via Semantic Bridge and Piecewise Value Function

no code implementations ECCV 2020 Shuchen Weng, Wenbo Li, Dawei Li, Hongxia Jin, Boxin Shi

We study conditional image repainting where a model is trained to generate visual content conditioned on user inputs, and composite the generated content seamlessly onto a user provided image while preserving the semantics of users' inputs.

Dog-IQA: Standard-guided Zero-shot MLLM for Mix-grained Image Quality Assessment

1 code implementation 3 Oct 2024 Kai Liu, Ziqing Zhang, Wenbo Li, Renjing Pei, Fenglong Song, Xiaohong Liu, Linghe Kong, Yulun Zhang

Image quality assessment (IQA) serves as the golden standard for all models' performance in nearly all computer vision fields.

AVG-LLaVA: A Large Multimodal Model with Adaptive Visual Granularity

1 code implementation 20 Sep 2024 Zhibin Lan, LiQiang Niu, Fandong Meng, Wenbo Li, Jie Zhou, Jinsong Su

Recently, when dealing with high-resolution images, dominant LMMs usually divide them into multiple local images and one global image, which will lead to a large number of visual tokens.

Unveiling Advanced Frequency Disentanglement Paradigm for Low-Light Image Enhancement

1 code implementation 3 Sep 2024 Kun Zhou, Xinyu Lin, Wenbo Li, Xiaogang Xu, Yuanhao Cai, Zhonghang Liu, Xiaoguang Han, Jiangbo Lu

Previous low-light image enhancement (LLIE) approaches, while employing frequency decomposition techniques to address the intertwined challenges of low frequency (e.g., illumination recovery) and high frequency (e.g., noise reduction), primarily focused on the development of dedicated and complex networks to achieve improved performance.

Disentanglement Low-Light Image Enhancement

QMambaBSR: Burst Image Super-Resolution with Query State Space Model

no code implementations 16 Aug 2024 Xin Di, Long Peng, Peizhe Xia, Wenbo Li, Renjing Pei, Yang Cao, Yang Wang, Zheng-Jun Zha

Moreover, AdaUp is designed to dynamically adjust the upsampling kernel based on the spatial distribution of multi-frame sub-pixel information in the different burst scenes, thereby facilitating the reconstruction of the spatial arrangement of high-resolution details.

Burst Image Super-Resolution

ControlNeXt: Powerful and Efficient Control for Image and Video Generation

1 code implementation 12 Aug 2024 Bohao Peng, Jian Wang, Yuechen Zhang, Wenbo Li, Ming-Chang Yang, Jiaya Jia

In this paper, we propose ControlNeXt: a powerful and efficient method for controllable image and video generation.

Video Generation

RestoreAgent: Autonomous Image Restoration Agent via Multimodal Large Language Models

no code implementations 25 Jul 2024 Haoyu Chen, Wenbo Li, Jinjin Gu, Jingjing Ren, Sixiang Chen, Tian Ye, Renjing Pei, Kaiwen Zhou, Fenglong Song, Lei Zhu

RestoreAgent autonomously assesses the type and extent of degradation in input images and performs restoration through (1) determining the appropriate restoration tasks, (2) optimizing the task sequence, (3) selecting the most suitable models, and (4) executing the restoration.

Image Restoration

UltraPixel: Advancing Ultra-High-Resolution Image Synthesis to New Peaks

no code implementations 2 Jul 2024 Jingjing Ren, Wenbo Li, Haoyu Chen, Renjing Pei, Bin Shao, Yong Guo, Long Peng, Fenglong Song, Lei Zhu

Ultra-high-resolution image generation poses great challenges, such as increased semantic planning complexity and detail synthesis difficulties, alongside substantial training resource demands.

Computational Efficiency Denoising +1

ResMaster: Mastering High-Resolution Image Generation via Structural and Fine-Grained Guidance

no code implementations 24 Jun 2024 Shuwei Shi, Wenbo Li, Yuechen Zhang, Jingwen He, Biao Gong, Yinqiang Zheng

Diffusion models excel at producing high-quality images; however, scaling to higher resolutions, such as 4K, often results in over-smoothed content, structural distortions, and repetitive patterns.

4k Denoising +2

DaLPSR: Leverage Degradation-Aligned Language Prompt for Real-World Image Super-Resolution

1 code implementation 24 Jun 2024 Aiwen Jiang, Zhi Wei, Long Peng, Feiqiang Liu, Wenbo Li, Mingwen Wang

Specifically, on one hand, image-restoration prompt alignment decoder is proposed to automatically discern the degradation degree of LR images, thereby generating beneficial degradation priors for image restoration.

Image Restoration Image Super-Resolution +3

Towards Realistic Data Generation for Real-World Super-Resolution

no code implementations 11 Jun 2024 Long Peng, Wenbo Li, Renjing Pei, Jingjing Ren, Xueyang Fu, Yang Wang, Yang Cao, Zheng-Jun Zha

Existing image super-resolution (SR) techniques often fail to generalize effectively in complex real-world settings due to the significant divergence between training data and practical scenarios.

Image Super-Resolution

Decay Pruning Method: Smooth Pruning With a Self-Rectifying Procedure

no code implementations 6 Jun 2024 Minghao Yang, Linlin Gao, Pengyuan Li, Wenbo Li, Yihong Dong, Zhiying Cui

Current structured pruning methods often result in considerable accuracy drops due to abrupt network changes and loss of information from pruned structures.

A Survey on Multi-modal Machine Translation: Tasks, Methods and Challenges

no code implementations 21 May 2024 Huangjun Shen, Liangying Shao, Wenbo Li, Zhibin Lan, Zhanyu Liu, Jinsong Su

In recent years, multi-modal machine translation has attracted significant interest in both academia and industry due to its superior performance.

Machine Translation Translation

Efficient Real-world Image Super-Resolution Via Adaptive Directional Gradient Convolution

1 code implementation 11 May 2024 Long Peng, Yang Cao, Renjing Pei, Wenbo Li, Jiaming Guo, Xueyang Fu, Yang Wang, Zheng-Jun Zha

These convolutions are integrated in parallel with a novel linear weighting mechanism to form an Adaptive Directional Gradient Convolution (DGConv), which adaptively weights and fuses the basic directional gradients to improve the gradient arrangement perception capability for both regular and irregular textures.

Image Super-Resolution

Low-Res Leads the Way: Improving Generalization for Super-Resolution by Self-Supervised Learning

no code implementations CVPR 2024 Haoyu Chen, Wenbo Li, Jinjin Gu, Jingjing Ren, Haoze Sun, Xueyi Zou, Zhensong Zhang, Youliang Yan, Lei Zhu

Leveraging unseen LR images for self-supervised learning guides the model to adapt its modeling space to the target domain, facilitating fine-tuning of SR models without requiring paired high-resolution (HR) images.

Image Super-Resolution Self-Supervised Learning

CoSeR: Bridging Image and Language for Cognitive Super-Resolution

1 code implementation CVPR 2024 Haoze Sun, Wenbo Li, Jianzhuang Liu, Haoyu Chen, Renjing Pei, Xueyi Zou, Youliang Yan, Yujiu Yang

We achieve this by marrying image appearance and language understanding to generate a cognitive embedding, which not only activates prior information from large text-to-image diffusion models but also facilitates the generation of high-quality reference images to optimize the SR process.

Super-Resolution

From NeRFLiX to NeRFLiX++: A General NeRF-Agnostic Restorer Paradigm

1 code implementation 10 Jun 2023 Kun Zhou, Wenbo Li, Nianjuan Jiang, Xiaoguang Han, Jiangbo Lu

To address this, we propose NeRFLiX, a general NeRF-agnostic restorer paradigm that learns a degradation-driven inter-viewpoint mixer.

Computational Efficiency Novel View Synthesis

Segment Anything Is Not Always Perfect: An Investigation of SAM on Different Real-world Applications

1 code implementation 12 Apr 2023 Wei Ji, Jingjing Li, Qi Bi, TingWei Liu, Wenbo Li, Li Cheng

Recently, Meta AI Research approaches a general, promptable Segment Anything Model (SAM) pre-trained on an unprecedentedly large segmentation dataset (SA-1B).

Image Segmentation Segmentation +1

Can ChatGPT Replace Traditional KBQA Models? An In-depth Analysis of the Question Answering Performance of the GPT LLM Family

2 code implementations 14 Mar 2023 Yiming Tan, Dehai Min, Yu Li, Wenbo Li, Nan Hu, Yongrui Chen, Guilin Qi

ChatGPT is a powerful large language model (LLM) that covers knowledge resources such as Wikipedia and supports natural language question answering using its own knowledge.

Knowledge Base Question Answering Language Modelling +3

Video-P2P: Video Editing with Cross-attention Control

1 code implementation CVPR 2024 Shaoteng Liu, Yuechen Zhang, Wenbo Li, Zhe Lin, Jiaya Jia

This paper presents Video-P2P, a novel framework for real-world video editing with cross-attention control.

Image Generation Video Editing +1

High Quality Entity Segmentation

no code implementations ICCV 2023 Lu Qi, Jason Kuen, Tiancheng Shen, Jiuxiang Gu, Wenbo Li, Weidong Guo, Jiaya Jia, Zhe Lin, Ming-Hsuan Yang

Given the high-quality and -resolution nature of the dataset, we propose CropFormer which is designed to tackle the intractability of instance-level segmentation on high-resolution images.

Image Segmentation Panop +2

What Makes for Good Tokenizers in Vision Transformer?

no code implementations 21 Dec 2022 Shengju Qian, Yi Zhu, Wenbo Li, Mu Li, Jiaya Jia

The architecture of transformers, which recently witness booming applications in vision tasks, has pivoted against the widespread convolutional paradigm.

Image Inpainting via Iteratively Decoupled Probabilistic Modeling

2 code implementations 6 Dec 2022 Wenbo Li, Xin Yu, Kun Zhou, Yibing Song, Zhe Lin, Jiaya Jia

To achieve high-quality results with low computational cost, we present a novel pixel spread model (PSM) that iteratively employs decoupled probabilistic modeling, combining the optimization efficiency of GANs with the prediction tractability of probabilistic models.

Denoising Image Inpainting

Mutual Guidance and Residual Integration for Image Enhancement

no code implementations 25 Nov 2022 Kun Zhou, Kenkun Liu, Wenbo Li, Xiaoguang Han, Jiangbo Lu

To address those issues, we propose a novel mutual guidance network (MGN) to perform effective bidirectional global-local information exchange while keeping a compact architecture.

Computational Efficiency Image Enhancement +1

Towards Efficient and Scale-Robust Ultra-High-Definition Image Demoireing

1 code implementation 20 Jul 2022 Xin Yu, Peng Dai, Wenbo Li, Lan Ma, Jiajun Shen, Jia Li, Xiaojuan Qi

With the rapid development of mobile devices, modern widely-used mobile phones typically allow users to capture 4K resolution (i.e., ultra-high-definition) images.

4k Image Enhancement +2

Video Demoireing with Relation-Based Temporal Consistency

1 code implementation CVPR 2022 Peng Dai, Xin Yu, Lan Ma, Baoheng Zhang, Jia Li, Wenbo Li, Jiajun Shen, Xiaojuan Qi

Moire patterns, appearing as color distortions, severely degrade image and video qualities when filming a screen with digital cameras.

Relation

SceneSqueezer: Learning To Compress Scene for Camera Relocalization

no code implementations CVPR 2022 Luwei Yang, Rakesh Shrestha, Wenbo Li, Shuaicheng Liu, Guofeng Zhang, Zhaopeng Cui, Ping Tan

Standard visual localization methods build a priori 3D model of a scene which is used to establish correspondences against the 2D keypoints in a query image.

Camera Relocalization Image Registration +3

On Efficient Transformer-Based Image Pre-training for Low-Level Vision

1 code implementation 19 Dec 2021 Wenbo Li, Xin Lu, Shengju Qian, Jiangbo Lu, Xiangyu Zhang, Jiaya Jia

Pre-training has marked numerous state of the arts in high-level computer vision, while few attempts have ever been made to investigate how pre-training acts in image processing systems.

Ranked #10 on Image Super-Resolution on Set5 - 2x upscaling (using extra training data)

Denoising Image Super-Resolution

Reviewing continual learning from the perspective of human-level intelligence

no code implementations 23 Nov 2021 Yifan Chang, Wenbo Li, Jian Peng, Bo Tang, Yu Kang, Yinjie Lei, Yuanmiao Gui, Qing Zhu, Yu Liu, Haifeng Li

Different from previous reviews that mainly focus on the catastrophic forgetting phenomenon in CL, this paper surveys CL from a more macroscopic perspective based on the Stability Versus Plasticity mechanism.

Continual Learning

Learning by Active Forgetting for Neural Networks

no code implementations 21 Nov 2021 Jian Peng, Xian Sun, Min Deng, Chao Tao, Bo Tang, Wenbo Li, Guohua Wu, Qing Zhu, Yu Liu, Tao Lin, Haifeng Li

This paper presents a learning model by active forgetting mechanism with artificial neural networks.

LAPAR: Linearly-Assembled Pixel-Adaptive Regression Network for Single Image Super-Resolution and Beyond

2 code implementations NeurIPS 2020 Wenbo Li, Kun Zhou, Lu Qi, Nianjuan Jiang, Jiangbo Lu, Jiaya Jia

Single image super-resolution (SISR) deals with a fundamental problem of upsampling a low-resolution (LR) image to its high-resolution (HR) version.

Image Deblocking Image Denoising +2

Unsupervised data augmentation for object detection

no code implementations 30 Apr 2021 Yichen Zhang, Zeyang Song, Wenbo Li

Data augmentation has always been an effective way to overcome overfitting issue when the dataset is small.

Data Augmentation Image Classification +4

Best-Buddy GANs for Highly Detailed Image Super-Resolution

2 code implementations 29 Mar 2021 Wenbo Li, Kun Zhou, Lu Qi, Liying Lu, Nianjuan Jiang, Jiangbo Lu, Jiaya Jia

We consider the single image super-resolution (SISR) problem, where a high-resolution (HR) image is generated based on a low-resolution (LR) input.

4k Image Super-Resolution

MagGAN: High-Resolution Face Attribute Editing with Mask-Guided Generative Adversarial Network

no code implementations 3 Oct 2020 Yi Wei, Zhe Gan, Wenbo Li, Siwei Lyu, Ming-Ching Chang, Lei Zhang, Jianfeng Gao, Pengchuan Zhang

We present Mask-guided Generative Adversarial Network (MagGAN) for high-resolution face attribute editing, in which semantic facial masks from a pre-trained face parser are used to guide the fine-grained image editing process.

Attribute Generative Adversarial Network +1

MuCAN: Multi-Correspondence Aggregation Network for Video Super-Resolution

1 code implementation ECCV 2020 Wenbo Li, Xin Tao, Taian Guo, Lu Qi, Jiangbo Lu, Jiaya Jia

Motivated by these findings, we propose a temporal multi-correspondence aggregation strategy to leverage similar patches across frames, and a cross-scale nonlocal-correspondence aggregation scheme to explore self-similarity of images across scales.

Optical Flow Estimation Video Super-Resolution

Novel Human-Object Interaction Detection via Adversarial Domain Generalization

no code implementations 22 May 2020 Yuhang Song, Wenbo Li, Lei Zhang, Jianwei Yang, Emre Kiciman, Hamid Palangi, Jianfeng Gao, C.-C. Jay Kuo, Pengchuan Zhang

We study in this paper the problem of novel human-object interaction (HOI) detection, aiming at improving the generalization ability of the model to unseen scenarios.

Domain Generalization Human-Object Interaction Detection +2

A Spontaneous Driver Emotion Facial Expression (DEFE) Dataset for Intelligent Vehicles

no code implementations 26 Apr 2020 Wenbo Li, Yaodong Cui, Yintao Ma, Xingxin Chen, Guofa Li, Gang Guo, Dongpu Cao

In this paper, we introduce a new dataset, the driver emotion facial expression (DEFE) dataset, for driver spontaneous emotions analysis.

Emotion Recognition

Object-driven Text-to-Image Synthesis via Adversarial Training

1 code implementation CVPR 2019 Wenbo Li, Pengchuan Zhang, Lei Zhang, Qiuyuan Huang, Xiaodong He, Siwei Lyu, Jianfeng Gao

In this paper, we propose Object-driven Attentive Generative Adversarial Networks (Obj-GANs) that allow object-centered text-to-image synthesis for complex scenes.

Image Generation Object

Evolvement Constrained Adversarial Learning for Video Style Transfer

no code implementations 6 Nov 2018 Wenbo Li, Longyin Wen, Xiao Bian, Siwei Lyu

Video style transfer is a useful component for applications such as augmented reality, non-photorealistic rendering, and interactive games.

Generative Adversarial Network Optical Flow Estimation +2

Who did What at Where and When: Simultaneous Multi-Person Tracking and Activity Recognition

no code implementations 3 Jul 2018 Wenbo Li, Ming-Ching Chang, Siwei Lyu

We present a bootstrapping framework to simultaneously improve multi-person tracking and activity recognition at individual, interaction and social group activity levels.

Activity Recognition Visual Tracking

STS Classification with Dual-stream CNN

no code implementations 20 May 2018 Shuchen Weng, Wenbo Li, Yi Zhang, Siwei Lyu

Inspired by the dual-stream hypothesis in neural science, we propose a novel dual-stream framework for modeling the interweaved spatiotemporal dependency, and develop a convolutional neural network within this framework that aims to achieve high adaptability and flexibility in STS configurations from various diagonals, i.e., sequential order, dependency range and features.

Activity Recognition Classification +4

POI: Multiple Object Tracking with High Performance Detection and Appearance Feature

no code implementations 19 Oct 2016 Fengwei Yu, Wenbo Li, Quanquan Li, Yu Liu, Xiaohua Shi, Junjie Yan

In this paper, we explore the high-performance detection and deep learning based appearance feature, and show that they lead to significantly better MOT results in both online and offline setting.

Multiple Object Tracking Vocal Bursts Intensity Prediction

Category-Blind Human Action Recognition: A Practical Recognition System

no code implementations ICCV 2015 Wenbo Li, Longyin Wen, Mooi Choo Chuah, Siwei Lyu

In this paper, we propose the category-blind human action recognition method (CHARM) which can recognize a human action without making assumptions of the action category.

Action Recognition Temporal Action Localization
