Search Results for author: Jianzhuang Liu

Found 93 papers, 42 papers with code

Wavelet-Based Dual-Branch Network for Image Demoiréing

no code implementations • ECCV 2020 • Lin Liu, Jianzhuang Liu, Shanxin Yuan, Gregory Slabaugh, Aleš Leonardis, Wengang Zhou, Qi Tian

When smartphone cameras are used to take photos of digital screens, moiré patterns usually result, severely degrading photo quality.

Image Restoration Rain Removal

API-Net: Robust Generative Classifier via a Single Discriminator

1 code implementation • ECCV 2020 • Xinshuai Dong, Hong Liu, Rongrong Ji, Liujuan Cao, Qixiang Ye, Jianzhuang Liu, Qi Tian

On the contrary, a discriminative classifier only models the conditional distribution of labels given inputs, but benefits from effective optimization owing to its succinct structure.

Robust classification

Large-Scale Few-Shot Learning via Multi-Modal Knowledge Discovery

no code implementations • ECCV 2020 • Shuo Wang, Jun Yue, Jianzhuang Liu, Qi Tian, Meng Wang

It is a challenging problem since (1) the identifying process is susceptible to over-fitting with limited samples of an object, and (2) the sample imbalance between a base (known knowledge) category and a novel category can easily bias the recognition results.

Few-Shot Learning

MVEB: Self-Supervised Learning with Multi-View Entropy Bottleneck

no code implementations • 28 Mar 2024 • Liangjian Wen, Xiasi Wang, Jianzhuang Liu, Zenglin Xu

One can learn this representation by maximizing the mutual information between the representation and the supervised view while eliminating superfluous information.

Self-Supervised Learning

Language-Driven Visual Consensus for Zero-Shot Semantic Segmentation

no code implementations • 13 Mar 2024 • ZiCheng Zhang, Tong Zhang, Yi Zhu, Jianzhuang Liu, Xiaodan Liang, Qixiang Ye, Wei Ke

To mitigate these issues, we propose a Language-Driven Visual Consensus (LDVC) approach, fostering improved alignment of semantic and visual information. Specifically, we leverage class embeddings as anchors due to their discrete and abstract nature, steering vision features toward class embeddings.

Language Modelling Semantic Segmentation +1

VastGaussian: Vast 3D Gaussians for Large Scene Reconstruction

no code implementations • 27 Feb 2024 • Jiaqi Lin, Zhihao LI, Xiao Tang, Jianzhuang Liu, Shiyong Liu, Jiayue Liu, Yangdi Lu, Xiaofei Wu, Songcen Xu, Youliang Yan, Wenming Yang

Existing NeRF-based methods for large scene reconstruction often have limitations in visual quality and rendering speed.

Diffusion Model-Based Image Editing: A Survey

1 code implementation • 27 Feb 2024 • Yi Huang, Jiancheng Huang, Yifan Liu, Mingfu Yan, Jiaxi Lv, Jianzhuang Liu, Wei Xiong, He Zhang, Shifeng Chen, Liangliang Cao

In this survey, we provide an exhaustive overview of existing methods using diffusion models for image editing, covering both theoretical and practical aspects in the field.

Denoising Image Inpainting +1

ZONE: Zero-Shot Instruction-Guided Local Editing

1 code implementation • 28 Dec 2023 • Shanglin Li, Bohan Zeng, Yutang Feng, Sicheng Gao, Xuhui Liu, Jiaming Liu, Li Lin, Xu Tang, Yao Hu, Jianzhuang Liu, Baochang Zhang

We then propose a Region-IoU scheme for precise image layer extraction from an off-the-shelf segment model.

Image Generation

Learning Unorthogonalized Matrices for Rotation Estimation

no code implementations • 1 Dec 2023 • Kerui Gu, Zhihao LI, Shiyong Liu, Jianzhuang Liu, Songcen Xu, Youliang Yan, Michael Bi Mi, Kenji Kawaguchi, Angela Yao

Estimating 3D rotations is a common procedure for 3D computer vision.

Pose Estimation

CoSeR: Bridging Image and Language for Cognitive Super-Resolution

1 code implementation • 27 Nov 2023 • Haoze Sun, Wenbo Li, Jianzhuang Liu, Haoyu Chen, Renjing Pei, Xueyi Zou, Youliang Yan, Yujiu Yang

We achieve this by marrying image appearance and language understanding to generate a cognitive embedding, which not only activates prior information from large text-to-image diffusion models but also facilitates the generation of high-quality reference images to optimize the SR process.

Super-Resolution

GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning

no code implementations • 21 Nov 2023 • Jiaxi Lv, Yi Huang, Mingfu Yan, Jiancheng Huang, Jianzhuang Liu, Yifan Liu, Yafei Wen, Xiaoxin Chen, Shifeng Chen

To tackle these issues, we propose GPT4Motion, a training-free framework that leverages the planning capability of large language models such as GPT, the physical simulation strength of Blender, and the excellent image generation ability of text-to-image diffusion models to enhance the quality of video synthesis.

Image Generation Text-to-Video Generation +1

Cross-Level Distillation and Feature Denoising for Cross-Domain Few-Shot Classification

1 code implementation • 4 Nov 2023 • Hao Zheng, Runqi Wang, Jianzhuang Liu, Asako Kanezaki

The conventional few-shot classification aims at learning a model on a large labeled base dataset and rapidly adapting to a target dataset that is from the same distribution as the base dataset.

Classification Cross-Domain Few-Shot +2

IPDreamer: Appearance-Controllable 3D Object Generation with Image Prompts

1 code implementation • 9 Oct 2023 • Bohan Zeng, Shanglin Li, Yutang Feng, Hong Li, Sicheng Gao, Jiaming Liu, Huaxia Li, Xu Tang, Jianzhuang Liu, Baochang Zhang

Recent advances in 3D generation have been remarkable, with methods such as DreamFusion leveraging large-scale text-to-image diffusion-based models to supervise 3D generation.

3D Generation Image to 3D +2

Low-Light Image Enhancement with Illumination-Aware Gamma Correction and Complete Image Modelling Network

no code implementations • ICCV 2023 • Yinglong Wang, Zhen Liu, Jianzhuang Liu, Songcen Xu, Shuaicheng Liu

We propose to integrate the effectiveness of gamma correction with the strong modelling capacities of deep networks, which enables the correction factor gamma to be learned in a coarse to elaborate manner via adaptively perceiving the deviated illumination.

Low-Light Image Enhancement

Generalizing Event-Based Motion Deblurring in Real-World Scenarios

1 code implementation • ICCV 2023 • Xiang Zhang, Lei Yu, Wen Yang, Jianzhuang Liu, Gui-Song Xia

Event-based motion deblurring has shown promising results by exploiting low-latency events.

Deblurring Self-Supervised Learning

MixReorg: Cross-Modal Mixed Patch Reorganization is a Good Mask Learner for Open-World Semantic Segmentation

no code implementations • ICCV 2023 • Kaixin Cai, Pengzhen Ren, Yi Zhu, Hang Xu, Jianzhuang Liu, Changlin Li, Guangrun Wang, Xiaodan Liang

To address this issue, we propose MixReorg, a novel and straightforward pre-training paradigm for semantic segmentation that enhances a model's ability to reorganize patches mixed across images, exploring both local visual relevance and global semantic coherence.

Segmentation Semantic Segmentation +1

Video Frame Interpolation with Stereo Event and Intensity Camera

no code implementations • 17 Jul 2023 • Chao Ding, Mingyuan Lin, Haijian Zhang, Jianzhuang Liu, Lei Yu

The stereo event-intensity camera setup is widely applied to leverage the advantages of both event cameras with low latency and intensity cameras that capture accurate brightness and texture information.

Disparity Estimation Optical Flow Estimation +1

WaveDM: Wavelet-Based Diffusion Models for Image Restoration

1 code implementation • 23 May 2023 • Yi Huang, Jiancheng Huang, Jianzhuang Liu, Mingfu Yan, Yu Dong, Jiaxi Lv, Chaoqi Chen, Shifeng Chen

Latest diffusion-based methods for many image restoration tasks outperform traditional models, but they encounter the long-time inference problem.

Deblurring Denoising +2

AttriCLIP: A Non-Incremental Learner for Incremental Knowledge Learning

no code implementations • CVPR 2023 • Runqi Wang, Xiaoyue Duan, Guoliang Kang, Jianzhuang Liu, Shaohui Lin, Songcen Xu, Jinhu Lv, Baochang Zhang

Text consists of a category name and a fixed number of learnable parameters which are selected from our designed attribute word bank and serve as attributes.

Attribute Continual Learning +1

Few-Shot Learning with Visual Distribution Calibration and Cross-Modal Distribution Alignment

1 code implementation • CVPR 2023 • Runqi Wang, Hao Zheng, Xiaoyue Duan, Jianzhuang Liu, Yuning Lu, Tian Wang, Songcen Xu, Baochang Zhang

However, with only a few training images, there exist two crucial problems: (1) the visual feature distributions are easily distracted by class-irrelevant information in images, and (2) the alignment between the visual and language feature distributions is difficult.

Few-Shot Learning

Controllable Mind Visual Diffusion Model

1 code implementation • 17 May 2023 • Bohan Zeng, Shanglin Li, Xuhui Liu, Sicheng Gao, XiaoLong Jiang, Xu Tang, Yao Hu, Jianzhuang Liu, Baochang Zhang

Brain signal visualization has emerged as an active research area, serving as a critical interface between the human visual system and computer vision models.

Attribute Image Generation

AsConvSR: Fast and Lightweight Super-Resolution Network with Assembled Convolutions

no code implementations • 5 May 2023 • Jiaming Guo, Xueyi Zou, Yuyi Chen, Yi Liu, Jia Hao, Jianzhuang Liu, Youliang Yan

In recent years, videos and images in 720p (HD), 1080p (FHD) and 4K (UHD) resolution have become more popular for display devices such as TVs, mobile phones and VR.

4k Super-Resolution

Towards Medical Artificial General Intelligence via Knowledge-Enhanced Multimodal Pretraining

1 code implementation • 26 Apr 2023 • Bingqian Lin, Zicong Chen, Mingjie Li, Haokun Lin, Hang Xu, Yi Zhu, Jianzhuang Liu, Wenjia Cai, Lei Yang, Shen Zhao, Chenfei Wu, Ling Chen, Xiaojun Chang, Yi Yang, Lei Xing, Xiaodan Liang

In MOTOR, we combine two kinds of basic medical knowledge, i.e., general and specific knowledge, in a complementary manner to boost the general pretraining process.

Medical Visual Question Answering Question Answering +1

Face Animation with an Attribute-Guided Diffusion Model

1 code implementation • 6 Apr 2023 • Bohan Zeng, Xuhui Liu, Sicheng Gao, Boyu Liu, Hong Li, Jianzhuang Liu, Baochang Zhang

Face animation has achieved much progress in computer vision.

3D Face Reconstruction Attribute +1

Recovering Continuous Scene Dynamics from A Single Blurry Image with Events

no code implementations • 5 Apr 2023 • Zhangyi Cheng, Xiang Zhang, Lei Yu, Jianzhuang Liu, Wen Yang, Gui-Song Xia

This paper aims at demystifying a single motion-blurred image with events and revealing temporally continuous scene dynamics encrypted behind motion blurs.

Image Restoration SSIM

Implicit Diffusion Models for Continuous Super-Resolution

1 code implementation • CVPR 2023 • Sicheng Gao, Xuhui Liu, Bohan Zeng, Sheng Xu, Yanjing Li, Xiaoyan Luo, Jianzhuang Liu, XianTong Zhen, Baochang Zhang

IDM integrates an implicit neural representation and a denoising diffusion model in a unified end-to-end framework, where the implicit neural representation is adopted in the decoding process to learn continuous-resolution representation.

Ranked #1 on Image Super-Resolution on CelebA-HQ 128x128

Denoising Image Super-Resolution

Learning to Super-Resolve Blurry Images with Events

1 code implementation • 27 Feb 2023 • Lei Yu, Bishan Wang, Xiang Zhang, Haijian Zhang, Wen Yang, Jianzhuang Liu, Gui-Song Xia

Super-Resolution from a single motion Blurred image (SRB) is a severely ill-posed problem due to the joint degradation of motion blurs and low spatial resolution.

Sparse Learning Super-Resolution

Actional Atomic-Concept Learning for Demystifying Vision-Language Navigation

no code implementations • 13 Feb 2023 • Bingqian Lin, Yi Zhu, Xiaodan Liang, Liang Lin, Jianzhuang Liu

Vision-Language Navigation (VLN) is a challenging task which requires an agent to align complex visual observations to language instructions to reach the goal position.

Re-Ranking Vision-Language Navigation

ViewCo: Discovering Text-Supervised Segmentation Masks via Multi-View Semantic Consistency

1 code implementation • 31 Jan 2023 • Pengzhen Ren, Changlin Li, Hang Xu, Yi Zhu, Guangrun Wang, Jianzhuang Liu, Xiaojun Chang, Xiaodan Liang

Specifically, we first propose text-to-views consistency modeling to learn correspondence for multiple views of the same input image.

Segmentation Semantic Segmentation

SmartAssign: Learning a Smart Knowledge Assignment Strategy for Deraining and Desnowing

no code implementations • CVPR 2023 • Yinglong Wang, Chao Ma, Jianzhuang Liu

Extensive experiments on seven benchmark datasets verify that the proposed SmartAssign explores the effective connection between rain and snow and clearly improves the performance of both deraining and desnowing.

Multi-Task Learning Rain Removal

PIDRo: Parallel Isomeric Attention with Dynamic Routing for Text-Video Retrieval

no code implementations • ICCV 2023 • Peiyan Guan, Renjing Pei, Bin Shao, Jianzhuang Liu, Weimian Li, Jiaxi Gu, Hang Xu, Songcen Xu, Youliang Yan, Edmund Y. Lam

The parallel isomeric attention module is used as the video encoder, which consists of two parallel branches modeling the spatial-temporal information of videos from both patch and frame levels.

Ranked #3 on Video Retrieval on MSR-VTT-1kA

Representation Learning Retrieval +3

HiVLP: Hierarchical Interactive Video-Language Pre-Training

no code implementations • ICCV 2023 • Bin Shao, Jianzhuang Liu, Renjing Pei, Songcen Xu, Peng Dai, Juwei Lu, Weimian Li, Youliang Yan

However, compared to image-language pre-training, VLP has lagged far behind due to the lack of large amounts of video-text pairs.

Retrieval Self-Supervised Learning +3

CLIPPING: Distilling CLIP-Based Models With a Student Base for Video-Language Retrieval

no code implementations • CVPR 2023 • Renjing Pei, Jianzhuang Liu, Weimian Li, Bin Shao, Songcen Xu, Peng Dai, Juwei Lu, Youliang Yan

Pre-training a vision-language model and then fine-tuning it on downstream tasks has become a popular paradigm.

Knowledge Distillation Language Modelling +1

Feature Calibration Network for Occluded Pedestrian Detection

no code implementations • 12 Dec 2022 • Tianliang Zhang, Qixiang Ye, Baochang Zhang, Jianzhuang Liu, Xiaopeng Zhang, Qi Tian

FC-Net is based on the observation that the visible parts of pedestrians are selective and decisive for detection, and is implemented as a self-paced feature learning framework with a self-activation (SA) module and a feature calibration (FC) module.

Pedestrian Detection

CoupAlign: Coupling Word-Pixel with Sentence-Mask Alignments for Referring Image Segmentation

no code implementations • 4 Dec 2022 • ZiCheng Zhang, Yi Zhu, Jianzhuang Liu, Xiaodan Liang, Wei Ke

Then in the Sentence-Mask Alignment (SMA) module, the masks are weighted by the sentence embedding to localize the referred object, and finally projected back to aggregate the pixels for the target.

Image Segmentation Semantic Segmentation +3

FNeVR: Neural Volume Rendering for Face Animation

1 code implementation • 21 Sep 2022 • Bohan Zeng, Boyu Liu, Hong Li, Xuhui Liu, Jianzhuang Liu, Dapeng Chen, Wei Peng, Baochang Zhang

In FNeVR, we design a 3D Face Volume Rendering (FVR) module to enhance the facial details for image rendering.

Talking Face Generation

Structure-Preserving Graph Representation Learning

1 code implementation • 2 Sep 2022 • Ruiyi Fang, Liangjian Wen, Zhao Kang, Jianzhuang Liu

To this end, we propose a novel Structure-Preserving Graph Representation Learning (SPGRL) method, to fully capture the structure information of graphs.

Graph Representation Learning Node Classification

Removing Rain Streaks via Task Transfer Learning

no code implementations • 28 Aug 2022 • Yinglong Wang, Chao Ma, Jianzhuang Liu

Inspired by our studies, we propose to remove rain by learning favorable deraining representations from other connected tasks.

Knowledge Distillation Rain Removal +1

Anti-Retroactive Interference for Lifelong Learning

1 code implementation • 27 Aug 2022 • Runqi Wang, Yuxiang Bao, Baochang Zhang, Jianzhuang Liu, Wentao Zhu, Guodong Guo

Second, according to the similarity between incremental knowledge and base knowledge, we design an adaptive fusion of incremental knowledge, which helps the model allocate capacity to the knowledge of different difficulties.

Meta-Learning

Low-Light Video Enhancement with Synthetic Event Guidance

no code implementations • 23 Aug 2022 • Lin Liu, Junfeng An, Jianzhuang Liu, Shanxin Yuan, Xiangyu Chen, Wengang Zhou, Houqiang Li, Yanfeng Wang, Qi Tian

Low-light video enhancement (LLVE) is an important yet challenging task with many applications such as photographing and autonomous driving.

Autonomous Driving Image Enhancement +1

CLIFF: Carrying Location Information in Full Frames into Human Pose and Shape Estimation

6 code implementations • 1 Aug 2022 • Zhihao LI, Jianzhuang Liu, Zhensong Zhang, Songcen Xu, Youliang Yan

Top-down methods dominate the field of 3D human pose and shape estimation, because they are decoupled from human detection and allow researchers to focus on the core problem.

Ranked #1 on Unsupervised 3D Human Pose Estimation on Human3.6M (PA-MPJPE metric)

3D human pose and shape estimation Human Detection +1

Self-Supervision Can Be a Good Few-Shot Learner

3 code implementations • 19 Jul 2022 • Yuning Lu, Liangjian Wen, Jianzhuang Liu, Yajing Liu, Xinmei Tian

Specifically, we maximize the mutual information (MI) of instances and their representations with a low-bias MI estimator to perform self-supervised pre-training.

Ranked #2 on Unsupervised Few-Shot Image Classification on Tiered ImageNet 5-way (5-shot)

cross-domain few-shot learning Unsupervised Few-Shot Image Classification +1

ADAPT: Vision-Language Navigation with Modality-Aligned Action Prompts

no code implementations • CVPR 2022 • Bingqian Lin, Yi Zhu, Zicong Chen, Xiwen Liang, Jianzhuang Liu, Xiaodan Liang

Vision-Language Navigation (VLN) is a challenging task that requires an embodied agent to perform action-level modality alignment, i.e., make instruction-asked actions sequentially in complex visual environments.

Vision-Language Navigation

Diversity Matters: Fully Exploiting Depth Clues for Reliable Monocular 3D Object Detection

no code implementations • CVPR 2022 • Zhuoling Li, Zhan Qu, Yang Zhou, Jianzhuang Liu, Haoqian Wang, Lihui Jiang

To tackle this problem, we propose a depth solving system that fully explores the visual clues from the subtasks in M3OD and generates multiple estimations for the depth of each target.

Depth Estimation Monocular 3D Object Detection +2

Prompt Distribution Learning

no code implementations • CVPR 2022 • Yuning Lu, Jianzhuang Liu, Yonggang Zhang, Yajing Liu, Xinmei Tian

We present prompt distribution learning for effectively adapting a pre-trained vision-language model to address downstream recognition tasks.

Language Modelling

Learning Enriched Illuminants for Cross and Single Sensor Color Constancy

no code implementations • 21 Mar 2022 • Xiaodong Cun, Zhendong Wang, Chi-Man Pun, Jianzhuang Liu, Wengang Zhou, Xu Jia, Houqiang Li

Color constancy aims to restore the constant colors of a scene under different illuminants.

Color Constancy

Differentiated Relevances Embedding for Group-based Referring Expression Comprehension

no code implementations • 12 Mar 2022 • Fuhai Chen, Xuri Ge, Xiaoshuai Sun, Yue Gao, Jianzhuang Liu, Fufeng Chen, Wenjie Li

The key of referring expression comprehension lies in capturing the cross-modal visual-linguistic relevance.

Attribute Object +2

Neural Architecture Search With Representation Mutual Information

1 code implementation • CVPR 2022 • Xiawu Zheng, Xiang Fei, Lei Zhang, Chenglin Wu, Fei Chao, Jianzhuang Liu, Wei Zeng, Yonghong Tian, Rongrong Ji

Building upon RMI, we further propose a new search algorithm termed RMI-NAS, supported by a theorem that guarantees the global optimality of the searched architecture.

Neural Architecture Search

SiamTrans: Zero-Shot Multi-Frame Image Restoration with Pre-Trained Siamese Transformers

no code implementations • 17 Dec 2021 • Lin Liu, Shanxin Yuan, Jianzhuang Liu, Xin Guo, Youliang Yan, Qi Tian

For zero-shot image restoration, we design a novel model, termed SiamTrans, which is constructed by Siamese transformers, encoders, and decoders.

Denoising Image Restoration +1

IntraQ: Learning Synthetic Images with Intra-Class Heterogeneity for Zero-Shot Network Quantization

1 code implementation • CVPR 2022 • Yunshan Zhong, Mingbao Lin, Gongrui Nan, Jianzhuang Liu, Baochang Zhang, Yonghong Tian, Rongrong Ji

In this paper, we observe an interesting phenomenon of intra-class heterogeneity in real data and show that existing methods fail to retain this property in their synthetic images, which causes a limited performance increase.

Quantization

Motion Deblurring with Real Events

no code implementations • ICCV 2021 • Fang Xu, Lei Yu, Bishan Wang, Wen Yang, Gui-Song Xia, Xu Jia, Zhendong Qiao, Jianzhuang Liu

In this paper, we propose an end-to-end learning framework for event-based motion deblurring in a self-supervised manner, where real-world events are exploited to alleviate the performance degradation caused by data inconsistency.

Deblurring

Wavelet-Based Network For High Dynamic Range Imaging

1 code implementation • 3 Aug 2021 • Tianhong Dai, Wei Li, Xilei Cao, Jianzhuang Liu, Xu Jia, Ales Leonardis, Youliang Yan, Shanxin Yuan

The frequency-guided upsampling module reconstructs details from multiple frequency-specific components with rich details.

Optical Flow Estimation Vocal Bursts Intensity Prediction

Multi-Target Domain Adaptation with Collaborative Consistency Learning

no code implementations • CVPR 2021 • Takashi Isobe, Xu Jia, Shuaijun Chen, Jianzhong He, Yongjie Shi, Jianzhuang Liu, Huchuan Lu, Shengjin Wang

To obtain a single model that works across multiple target domains, we propose to simultaneously learn a student model which is trained not only to imitate the output of each expert on the corresponding target domain, but also to pull different experts close to each other with regularization on their weights.

Ranked #4 on Domain Adaptation on GTAV to Cityscapes+Mapillary

Multi-target Domain Adaptation Semantic Segmentation +1

Uformer: A General U-Shaped Transformer for Image Restoration

4 code implementations • CVPR 2022 • Zhendong Wang, Xiaodong Cun, Jianmin Bao, Wengang Zhou, Jianzhuang Liu, Houqiang Li

Powered by these two designs, Uformer enjoys a high capability for capturing both local and global dependencies for image restoration.

Ranked #2 on Deblurring on RealBlur-R (trained on GoPro)

Deblurring Image Deblurring +5

Towards Compact CNNs via Collaborative Compression

1 code implementation • CVPR 2021 • Yuchao Li, Shaohui Lin, Jianzhuang Liu, Qixiang Ye, Mengdi Wang, Fei Chao, Fan Yang, Jincheng Ma, Qi Tian, Rongrong Ji

Channel pruning and tensor decomposition have received extensive attention in convolutional neural network compression.

Neural Network Compression Tensor Decomposition

Multiple instance active learning for object detection

1 code implementation • CVPR 2021 • Tianning Yuan, Fang Wan, Mengying Fu, Jianzhuang Liu, Songcen Xu, Xiangyang Ji, Qixiang Ye

Despite the substantial progress of active learning for image recognition, an instance-level active learning method designed specifically for object detection is still lacking.

Ranked #1 on Active Object Detection on MS COCO

Active Object Detection Multiple Instance Learning +3

ReCU: Reviving the Dead Weights in Binary Neural Networks

3 code implementations • ICCV 2021 • Zihan Xu, Mingbao Lin, Jianzhuang Liu, Jie Chen, Ling Shao, Yue Gao, Yonghong Tian, Rongrong Ji

We prove that reviving the "dead weights" by ReCU can result in a smaller quantization error.

Binarization Quantization

Semi-supervised Domain Adaptation based on Dual-level Domain Mixing for Semantic Segmentation

1 code implementation • CVPR 2021 • Shuaijun Chen, Xu Jia, Jianzhong He, Yongjie Shi, Jianzhuang Liu

To address the task of SSDA, a novel framework based on dual-level domain mixing is proposed.

Semantic Segmentation Semi-supervised Domain Adaptation +1

Multi-Source Domain Adaptation with Collaborative Learning for Semantic Segmentation

no code implementations • CVPR 2021 • Jianzhong He, Xu Jia, Shuaijun Chen, Jianzhuang Liu

Multi-source unsupervised domain adaptation (MSDA) aims at adapting models trained on multiple labeled source domains to an unlabeled target domain.

Ranked #1 on Domain Adaptation on GTA5+Synscapes to Cityscapes

Multi-Source Unsupervised Domain Adaptation Semantic Segmentation +1

Neighbor2Neighbor: Self-Supervised Denoising from Single Noisy Images

12 code implementations • CVPR 2021 • Tao Huang, Songjiang Li, Xu Jia, Huchuan Lu, Jianzhuang Liu

In this paper, we present a very simple yet effective method named Neighbor2Neighbor to train an effective image denoising model with only noisy images.

Image Denoising Self-Supervised Learning

TRAR: Routing the Attention Spans in Transformer for Visual Question Answering

1 code implementation • ICCV 2021 • Yiyi Zhou, Tianhe Ren, Chaoyang Zhu, Xiaoshuai Sun, Jianzhuang Liu, Xinghao Ding, Mingliang Xu, Rongrong Ji

Due to the superior ability of global dependency modeling, Transformer and its variants have become the primary choice of many vision-and-language tasks.

Question Answering Referring Expression +2

Occlude Them All: Occlusion-Aware Attention Network for Occluded Person Re-ID

no code implementations • ICCV 2021 • Peixian Chen, Wenfeng Liu, Pingyang Dai, Jianzhuang Liu, Qixiang Ye, Mingliang Xu, Qi'an Chen, Rongrong Ji

To avoid such problematic models in occluded person ReID, we propose the Occlusion-Aware Mask Network (OAMN).

Person Re-Identification

Active Learning for Lane Detection: A Knowledge Distillation Approach

no code implementations • ICCV 2021 • Fengchao Peng, Chao Wang, Jianzhuang Liu, Zhen Yang

The experiments show that our method achieves new state-of-the-art results on the lane detection benchmarks.

Active Learning Autonomous Driving +4

Self-Adaptively Learning to Demoiré from Focused and Defocused Image Pairs

no code implementations • NeurIPS 2020 • Lin Liu, Shanxin Yuan, Jianzhuang Liu, Liping Bao, Gregory Slabaugh, Qi Tian

In this paper, we propose a self-adaptive learning method for demoiréing a high-frequency image, with the help of an additional defocused moiré-free blur image.

Test-time Adaptation

Self-Adaptively Learning to Demoire from Focused and Defocused Image Pairs

1 code implementation • 3 Nov 2020 • Lin Liu, Shanxin Yuan, Jianzhuang Liu, Liping Bao, Gregory Slabaugh, Qi Tian

In this paper, we propose a self-adaptive learning method for demoireing a high-frequency image, with the help of an additional defocused moire-free blur image.

Demoire Test-time Adaptation

Binarized Neural Architecture Search for Efficient Object Recognition

no code implementations • 8 Sep 2020 • Hanlin Chen, Li'an Zhuo, Baochang Zhang, Xiawu Zheng, Jianzhuang Liu, Rongrong Ji, David Doermann, Guodong Guo

In this paper, binarized neural architecture search (BNAS), with a search space of binarized convolutions, is introduced to produce extremely compressed models to reduce huge computational cost on embedded devices for edge computing.

Edge-computing Face Recognition +3

Light Field View Synthesis via Aperture Disparity and Warping Confidence Map

no code implementations • 7 Sep 2020 • Nan Meng, Kai Li, Jianzhuang Liu, Edmund Y. Lam

This paper presents a learning-based approach to synthesize the view from an arbitrary camera position given a sparse set of images.

Novel View Synthesis Position

Dual Distribution Alignment Network for Generalizable Person Re-Identification

1 code implementation • 27 Jul 2020 • Peixian Chen, Pingyang Dai, Jianzhuang Liu, Feng Zheng, Qi Tian, Rongrong Ji

Domain generalization (DG) serves as a promising solution to handle person Re-Identification (Re-ID), which trains the model using labels from the source domain alone, and then directly adopts the trained model to the target domain without model updating.

Domain Generalization Generalizable Person Re-identification

Wavelet-Based Dual-Branch Network for Image Demoireing

1 code implementation • 14 Jul 2020 • Lin Liu, Jianzhuang Liu, Shanxin Yuan, Gregory Slabaugh, Ales Leonardis, Wengang Zhou, Qi Tian

When smartphone cameras are used to take photos of digital screens, usually moire patterns result, severely degrading photo quality.

Demoire Image Restoration +1

Projection & Probability-Driven Black-Box Attack

1 code implementation • CVPR 2020 • Jie Li, Rongrong Ji, Hong Liu, Jianzhuang Liu, Bineng Zhong, Cheng Deng, Qi Tian

For reducing the solution space, we first model the adversarial perturbation optimization problem as a process of recovering frequency-sparse perturbations with compressed sensing, under the setting that random noise in the low-frequency space is more likely to be adversarial.

High-Order Residual Network for Light Field Super-Resolution

1 code implementation • 29 Mar 2020 • Nan Meng, Xiaofei Wu, Jianzhuang Liu, Edmund Y. Lam

In this paper, we propose a novel high-order residual network to learn the geometric features hierarchically from the LF for reconstruction.

Super-Resolution Vocal Bursts Intensity Prediction

Context-Transformer: Tackling Object Confusion for Few-Shot Detection

1 code implementation • 16 Mar 2020 • Ze Yang, Yali Wang, Xianyu Chen, Jianzhuang Liu, Yu Qiao

Few-shot object detection is a challenging but realistic scenario, where only a few annotated training images are available for training detectors.

Few-Shot Learning Few-Shot Object Detection +3

SketchyCOCO: Image Generation from Freehand Scene Sketches

2 code implementations • CVPR 2020 • Chengying Gao, Qi Liu, Qi Xu, Li-Min Wang, Jianzhuang Liu, Changqing Zou

We introduce the first method for automatic image generation from scene-level freehand sketches.

Ranked #2 on Sketch-to-Image Translation on SketchyCOCO

Attribute Generative Adversarial Network +3

Filter Sketch for Network Pruning

1 code implementation • 23 Jan 2020 • Mingbao Lin, Liujuan Cao, Shaojie Li, Qixiang Ye, Yonghong Tian, Jianzhuang Liu, Qi Tian, Rongrong Ji

Our approach, referred to as FilterSketch, encodes the second-order information of pre-trained weights, which enables the representation capacity of pruned networks to be recovered with a simple fine-tuning procedure.

Network Pruning

Multiple Anchor Learning for Visual Object Detection

3 code implementations • CVPR 2020 • Wei Ke, Tianliang Zhang, Zeyi Huang, Qixiang Ye, Jianzhuang Liu, Dong Huang

In this paper, we propose a Multiple Instance Learning (MIL) approach that selects anchors and jointly optimizes the two modules of a CNN-based object detector.

Ranked #116 on Object Detection on COCO test-dev

General Classification Multiple Instance Learning +3

GBCNs: Genetic Binary Convolutional Networks for Enhancing the Performance of 1-bit DCNNs

no code implementations • 25 Nov 2019 • Chunlei Liu, Wenrui Ding, Yuan Hu, Baochang Zhang, Jianzhuang Liu, Guodong Guo

The BGA method is proposed to modify the binary process of GBCNs to alleviate the local minima problem, which can significantly improve the performance of 1-bit DCNNs.

Face Recognition Object Recognition +1

Binarized Neural Architecture Search

no code implementations • 25 Nov 2019 • Hanlin Chen, Li'an Zhuo, Baochang Zhang, Xiawu Zheng, Jianzhuang Liu, David Doermann, Rongrong Ji

A variant, binarized neural architecture search (BNAS), with a search space of binarized convolutions, can produce extremely compressed models.

Neural Architecture Search

An End-to-End Foreground-Aware Network for Person Re-Identification

no code implementations • 25 Oct 2019 • Yiheng Liu, Wengang Zhou, Jianzhuang Liu, Guo-Jun Qi, Qi Tian, Houqiang Li

By presenting a target attention loss, the pedestrian features extracted from the foreground branch become less sensitive to backgrounds, which greatly reduces the negative impact of changing backgrounds on matching the same identity across different camera views.

Person Re-Identification

Circulant Binary Convolutional Networks: Enhancing the Performance of 1-bit DCNNs with Circulant Back Propagation

no code implementations • CVPR 2019 • Chunlei Liu, Wenrui Ding, Xin Xia, Baochang Zhang, Jiaxin Gu, Jianzhuang Liu, Rongrong Ji, David Doermann

The CiFs can be easily incorporated into existing deep convolutional neural networks (DCNNs), which leads to new Circulant Binary Convolutional Networks (CBCNs).

Unsupervised Image Super-Resolution with an Indirect Supervised Path

no code implementations • 7 Oct 2019 • Zhen Han, Enyan Dai, Xu Jia, Xiaoying Ren, Shuaijun Chen, Chunjing Xu, Jianzhuang Liu, Qi Tian

The task of single image super-resolution (SISR) aims at reconstructing a high-resolution (HR) image from a low-resolution (LR) image.

Image Super-Resolution Translation

RBCN: Rectified Binary Convolutional Networks for Enhancing the Performance of 1-bit DCNNs

no code implementations • 21 Aug 2019 • Chunlei Liu, Wenrui Ding, Xin Xia, Yuan Hu, Baochang Zhang, Jianzhuang Liu, Bohan Zhuang, Guodong Guo

Binarized convolutional neural networks (BCNNs) are widely used to improve the memory and computation efficiency of deep convolutional neural networks (DCNNs) for mobile and AI-chip-based applications.

Binarization Object Tracking

Bayesian Optimized 1-Bit CNNs

no code implementations • ICCV 2019 • Jiaxin Gu, Junhe Zhao, Xiao-Long Jiang, Baochang Zhang, Jianzhuang Liu, Guodong Guo, Rongrong Ji

Deep convolutional neural networks (DCNNs) have dominated the recent developments in computer vision through making various record-breaking models.

Multinomial Distribution Learning for Effective Neural Architecture Search

1 code implementation • ICCV 2019 • Xiawu Zheng, Rongrong Ji, Lang Tang, Baochang Zhang, Jianzhuang Liu, Qi Tian

Therefore, NAS can be transformed to a multinomial distribution learning problem, i.e., the distribution is optimized to have a high expectation of the performance.

Neural Architecture Search

Exploiting Kernel Sparsity and Entropy for Interpretable CNN Compression

1 code implementation • CVPR 2019 • Yuchao Li, Shaohui Lin, Baochang Zhang, Jianzhuang Liu, David Doermann, Yongjian Wu, Feiyue Huang, Rongrong Ji

The relationship between the input feature maps and 2D kernels is revealed in a theoretical framework, based on which a kernel sparsity and entropy (KSE) indicator is proposed to quantitate the feature map importance in a feature-agnostic manner to guide model compression.

Clustering Model Compression

Projection Convolutional Neural Networks for 1-bit CNNs via Discrete Back Propagation

no code implementations • 30 Nov 2018 • Jiaxin Gu, Ce Li, Baochang Zhang, Jungong Han, Xian-Bin Cao, Jianzhuang Liu, David Doermann

The advancement of deep convolutional neural networks (DCNNs) has driven significant improvement in the accuracy of recognition systems for many computer vision tasks.

Modulated Convolutional Networks

no code implementations • CVPR 2018 • Xiaodi Wang, Baochang Zhang, Ce Li, Rongrong Ji, Jungong Han, Xian-Bin Cao, Jianzhuang Liu

In this paper, we propose new Modulated Convolutional Networks (MCNs) to improve the portability of CNNs via binarized filters.

Memory Attention Networks for Skeleton-based Action Recognition

1 code implementation • 23 Apr 2018 • Chunyu Xie, Ce Li, Baochang Zhang, Chen Chen, Jungong Han, Changqing Zou, Jianzhuang Liu

Specifically, the TARM is deployed in a residual learning module that employs a novel attention learning network to recalibrate the temporal attention of frames in a skeleton sequence.

Ranked #89 on Skeleton Based Action Recognition on NTU RGB+D

Action Recognition Skeleton Based Action Recognition +1

One-Two-One Networks for Compression Artifacts Reduction in Remote Sensing

no code implementations • 1 Apr 2018 • Baochang Zhang, Jiaxin Gu, Chen Chen, Jungong Han, Xiangbo Su, Xian-Bin Cao, Jianzhuang Liu

Compression artifacts reduction (CAR) is a challenging problem in the field of remote sensing.

Blocking Image Compression +1

Gabor Convolutional Networks

no code implementations • 3 May 2017 • Shangzhen Luan, Baochang Zhang, Chen Chen, Xian-Bin Cao, Jungong Han, Jianzhuang Liu

Steerable properties dominate the design of traditional filters, e.g., Gabor filters, and endow features with the capability of dealing with spatial transformations.

A Maximum Entropy Feature Descriptor for Age Invariant Face Recognition

no code implementations • CVPR 2015 • Dihong Gong, Zhifeng Li, DaCheng Tao, Jianzhuang Liu, Xuelong Li

In this paper, we propose a new approach to overcome the representation and matching problems in age invariant face recognition.

Age-Invariant Face Recognition MORPH

Separation of Line Drawings Based on Split Faces for 3D Object Reconstruction

no code implementations • CVPR 2014 • Changqing Zou, Heng Yang, Jianzhuang Liu

Reconstructing 3D objects from single line drawings is often desirable in computer vision and graphics applications.

3D Object Reconstruction Object

Transitive Distance Clustering with K-Means Duality

no code implementations • CVPR 2014 • Zhiding Yu, Chunjing Xu, Deyu Meng, Zhuo Hui, Fanyi Xiao, Wenbo Liu, Jianzhuang Liu

We propose a very intuitive and simple approximation for the conventional spectral clustering methods.

Clustering Image Segmentation +1

The BeiHang Keystroke Dynamics Authentication System

no code implementations • 15 Oct 2013 • Juan Liu, Baochang Zhang, Linlin Shen, Jianzhuang Liu, Jason Zhao

Keystroke Dynamics is an important biometric solution for person authentication.

General Classification
