Search Results for author: Jianzhuang Liu

Found 93 papers, 42 papers with code

Wavelet-Based Dual-Branch Network for Image Demoiréing

no code implementations ECCV 2020 Lin Liu, Jianzhuang Liu, Shanxin Yuan, Gregory Slabaugh, Aleš Leonardis, Wengang Zhou, Qi Tian

When smartphone cameras are used to take photos of digital screens, usually moiré patterns result, severely degrading photo quality.

Image Restoration Rain Removal

API-Net: Robust Generative Classifier via a Single Discriminator

1 code implementation ECCV 2020 Xinshuai Dong, Hong Liu, Rongrong Ji, Liujuan Cao, Qixiang Ye, Jianzhuang Liu, Qi Tian

On the contrary, a discriminative classifier only models the conditional distribution of labels given inputs, but benefits from effective optimization owing to its succinct structure.

Robust classification

Large-Scale Few-Shot Learning via Multi-Modal Knowledge Discovery

no code implementations ECCV 2020 Shuo Wang, Jun Yue, Jianzhuang Liu, Qi Tian, Meng Wang

It is a challenging problem since (1) the identifying process is susceptible to over-fitting with limited samples of an object, and (2) the sample imbalance between a base (known knowledge) category and a novel category can easily bias the recognition results.

Few-Shot Learning

MVEB: Self-Supervised Learning with Multi-View Entropy Bottleneck

no code implementations 28 Mar 2024 Liangjian Wen, Xiasi Wang, Jianzhuang Liu, Zenglin Xu

One can learn this representation by maximizing the mutual information between the representation and the supervised view while eliminating superfluous information.

Self-Supervised Learning

Language-Driven Visual Consensus for Zero-Shot Semantic Segmentation

no code implementations 13 Mar 2024 ZiCheng Zhang, Tong Zhang, Yi Zhu, Jianzhuang Liu, Xiaodan Liang, Qixiang Ye, Wei Ke

To mitigate these issues, we propose a Language-Driven Visual Consensus (LDVC) approach, fostering improved alignment of semantic and visual information. Specifically, we leverage class embeddings as anchors due to their discrete and abstract nature, steering vision features toward class embeddings.

Language Modelling Semantic Segmentation +1

VastGaussian: Vast 3D Gaussians for Large Scene Reconstruction

no code implementations 27 Feb 2024 Jiaqi Lin, Zhihao LI, Xiao Tang, Jianzhuang Liu, Shiyong Liu, Jiayue Liu, Yangdi Lu, Xiaofei Wu, Songcen Xu, Youliang Yan, Wenming Yang

Existing NeRF-based methods for large scene reconstruction often have limitations in visual quality and rendering speed.

Diffusion Model-Based Image Editing: A Survey

1 code implementation 27 Feb 2024 Yi Huang, Jiancheng Huang, Yifan Liu, Mingfu Yan, Jiaxi Lv, Jianzhuang Liu, Wei Xiong, He Zhang, Shifeng Chen, Liangliang Cao

In this survey, we provide an exhaustive overview of existing methods using diffusion models for image editing, covering both theoretical and practical aspects in the field.

Denoising Image Inpainting +1

ZONE: Zero-Shot Instruction-Guided Local Editing

1 code implementation 28 Dec 2023 Shanglin Li, Bohan Zeng, Yutang Feng, Sicheng Gao, Xuhui Liu, Jiaming Liu, Li Lin, Xu Tang, Yao Hu, Jianzhuang Liu, Baochang Zhang

We then propose a Region-IoU scheme for precise image layer extraction from an off-the-shelf segment model.

Image Generation

CoSeR: Bridging Image and Language for Cognitive Super-Resolution

1 code implementation 27 Nov 2023 Haoze Sun, Wenbo Li, Jianzhuang Liu, Haoyu Chen, Renjing Pei, Xueyi Zou, Youliang Yan, Yujiu Yang

We achieve this by marrying image appearance and language understanding to generate a cognitive embedding, which not only activates prior information from large text-to-image diffusion models but also facilitates the generation of high-quality reference images to optimize the SR process.

Super-Resolution

GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning

no code implementations 21 Nov 2023 Jiaxi Lv, Yi Huang, Mingfu Yan, Jiancheng Huang, Jianzhuang Liu, Yifan Liu, Yafei Wen, Xiaoxin Chen, Shifeng Chen

To tackle these issues, we propose GPT4Motion, a training-free framework that leverages the planning capability of large language models such as GPT, the physical simulation strength of Blender, and the excellent image generation ability of text-to-image diffusion models to enhance the quality of video synthesis.

Image Generation Text-to-Video Generation +1

Cross-Level Distillation and Feature Denoising for Cross-Domain Few-Shot Classification

1 code implementation 4 Nov 2023 Hao Zheng, Runqi Wang, Jianzhuang Liu, Asako Kanezaki

The conventional few-shot classification aims at learning a model on a large labeled base dataset and rapidly adapting to a target dataset that is from the same distribution as the base dataset.

Classification Cross-Domain Few-Shot +2

IPDreamer: Appearance-Controllable 3D Object Generation with Image Prompts

1 code implementation 9 Oct 2023 Bohan Zeng, Shanglin Li, Yutang Feng, Hong Li, Sicheng Gao, Jiaming Liu, Huaxia Li, Xu Tang, Jianzhuang Liu, Baochang Zhang

Recent advances in 3D generation have been remarkable, with methods such as DreamFusion leveraging large-scale text-to-image diffusion-based models to supervise 3D generation.

3D Generation Image to 3D +2

Low-Light Image Enhancement with Illumination-Aware Gamma Correction and Complete Image Modelling Network

no code implementations ICCV 2023 Yinglong Wang, Zhen Liu, Jianzhuang Liu, Songcen Xu, Shuaicheng Liu

We propose to integrate the effectiveness of gamma correction with the strong modelling capacities of deep networks, which enables the correction factor gamma to be learned in a coarse to elaborate manner via adaptively perceiving the deviated illumination.

Low-Light Image Enhancement

MixReorg: Cross-Modal Mixed Patch Reorganization is a Good Mask Learner for Open-World Semantic Segmentation

no code implementations ICCV 2023 Kaixin Cai, Pengzhen Ren, Yi Zhu, Hang Xu, Jianzhuang Liu, Changlin Li, Guangrun Wang, Xiaodan Liang

To address this issue, we propose MixReorg, a novel and straightforward pre-training paradigm for semantic segmentation that enhances a model's ability to reorganize patches mixed across images, exploring both local visual relevance and global semantic coherence.

Segmentation Semantic Segmentation +1

Video Frame Interpolation with Stereo Event and Intensity Camera

no code implementations 17 Jul 2023 Chao Ding, Mingyuan Lin, Haijian Zhang, Jianzhuang Liu, Lei Yu

The stereo event-intensity camera setup is widely applied to leverage the advantages of both event cameras with low latency and intensity cameras that capture accurate brightness and texture information.

Disparity Estimation Optical Flow Estimation +1

WaveDM: Wavelet-Based Diffusion Models for Image Restoration

1 code implementation 23 May 2023 Yi Huang, Jiancheng Huang, Jianzhuang Liu, Mingfu Yan, Yu Dong, Jiaxi Lv, Chaoqi Chen, Shifeng Chen

Latest diffusion-based methods for many image restoration tasks outperform traditional models, but they encounter the long-time inference problem.

Deblurring Denoising +2

AttriCLIP: A Non-Incremental Learner for Incremental Knowledge Learning

no code implementations CVPR 2023 Runqi Wang, Xiaoyue Duan, Guoliang Kang, Jianzhuang Liu, Shaohui Lin, Songcen Xu, Jinhu Lv, Baochang Zhang

Text consists of a category name and a fixed number of learnable parameters which are selected from our designed attribute word bank and serve as attributes.

Attribute Continual Learning +1

Few-Shot Learning with Visual Distribution Calibration and Cross-Modal Distribution Alignment

1 code implementation CVPR 2023 Runqi Wang, Hao Zheng, Xiaoyue Duan, Jianzhuang Liu, Yuning Lu, Tian Wang, Songcen Xu, Baochang Zhang

However, with only a few training images, there exist two crucial problems: (1) the visual feature distributions are easily distracted by class-irrelevant information in images, and (2) the alignment between the visual and language feature distributions is difficult.

Few-Shot Learning

Controllable Mind Visual Diffusion Model

1 code implementation 17 May 2023 Bohan Zeng, Shanglin Li, Xuhui Liu, Sicheng Gao, XiaoLong Jiang, Xu Tang, Yao Hu, Jianzhuang Liu, Baochang Zhang

Brain signal visualization has emerged as an active research area, serving as a critical interface between the human visual system and computer vision models.

Attribute Image Generation

AsConvSR: Fast and Lightweight Super-Resolution Network with Assembled Convolutions

no code implementations 5 May 2023 Jiaming Guo, Xueyi Zou, Yuyi Chen, Yi Liu, Jia Hao, Jianzhuang Liu, Youliang Yan

In recent years, videos and images in 720p (HD), 1080p (FHD) and 4K (UHD) resolution have become more popular for display devices such as TVs, mobile phones and VR.

4k Super-Resolution

Recovering Continuous Scene Dynamics from A Single Blurry Image with Events

no code implementations 5 Apr 2023 Zhangyi Cheng, Xiang Zhang, Lei Yu, Jianzhuang Liu, Wen Yang, Gui-Song Xia

This paper aims at demystifying a single motion-blurred image with events and revealing temporally continuous scene dynamics encrypted behind motion blurs.

Image Restoration SSIM

Implicit Diffusion Models for Continuous Super-Resolution

1 code implementation CVPR 2023 Sicheng Gao, Xuhui Liu, Bohan Zeng, Sheng Xu, Yanjing Li, Xiaoyan Luo, Jianzhuang Liu, XianTong Zhen, Baochang Zhang

IDM integrates an implicit neural representation and a denoising diffusion model in a unified end-to-end framework, where the implicit neural representation is adopted in the decoding process to learn continuous-resolution representation.

Denoising Image Super-Resolution

Learning to Super-Resolve Blurry Images with Events

1 code implementation 27 Feb 2023 Lei Yu, Bishan Wang, Xiang Zhang, Haijian Zhang, Wen Yang, Jianzhuang Liu, Gui-Song Xia

Super-Resolution from a single motion Blurred image (SRB) is a severely ill-posed problem due to the joint degradation of motion blurs and low spatial resolution.

Sparse Learning Super-Resolution

Actional Atomic-Concept Learning for Demystifying Vision-Language Navigation

no code implementations 13 Feb 2023 Bingqian Lin, Yi Zhu, Xiaodan Liang, Liang Lin, Jianzhuang Liu

Vision-Language Navigation (VLN) is a challenging task which requires an agent to align complex visual observations to language instructions to reach the goal position.

Re-Ranking Vision-Language Navigation

ViewCo: Discovering Text-Supervised Segmentation Masks via Multi-View Semantic Consistency

1 code implementation 31 Jan 2023 Pengzhen Ren, Changlin Li, Hang Xu, Yi Zhu, Guangrun Wang, Jianzhuang Liu, Xiaojun Chang, Xiaodan Liang

Specifically, we first propose text-to-views consistency modeling to learn correspondence for multiple views of the same input image.

Segmentation Semantic Segmentation

SmartAssign: Learning a Smart Knowledge Assignment Strategy for Deraining and Desnowing

no code implementations CVPR 2023 Yinglong Wang, Chao Ma, Jianzhuang Liu

Extensive experiments on seven benchmark datasets verify that the proposed SmartAssign explores the effective connection between rain and snow, and clearly improves the performance of both deraining and desnowing.

Multi-Task Learning Rain Removal

PIDRo: Parallel Isomeric Attention with Dynamic Routing for Text-Video Retrieval

no code implementations ICCV 2023 Peiyan Guan, Renjing Pei, Bin Shao, Jianzhuang Liu, Weimian Li, Jiaxi Gu, Hang Xu, Songcen Xu, Youliang Yan, Edmund Y. Lam

The parallel isomeric attention module is used as the video encoder, which consists of two parallel branches modeling the spatial-temporal information of videos from both patch and frame levels.

Representation Learning Retrieval +3

Feature Calibration Network for Occluded Pedestrian Detection

no code implementations 12 Dec 2022 Tianliang Zhang, Qixiang Ye, Baochang Zhang, Jianzhuang Liu, Xiaopeng Zhang, Qi Tian

FC-Net is based on the observation that the visible parts of pedestrians are selective and decisive for detection, and is implemented as a self-paced feature learning framework with a self-activation (SA) module and a feature calibration (FC) module.

Pedestrian Detection

CoupAlign: Coupling Word-Pixel with Sentence-Mask Alignments for Referring Image Segmentation

no code implementations 4 Dec 2022 ZiCheng Zhang, Yi Zhu, Jianzhuang Liu, Xiaodan Liang, Wei Ke

Then in the Sentence-Mask Alignment (SMA) module, the masks are weighted by the sentence embedding to localize the referred object, and finally projected back to aggregate the pixels for the target.

Image Segmentation Semantic Segmentation +3

FNeVR: Neural Volume Rendering for Face Animation

1 code implementation 21 Sep 2022 Bohan Zeng, Boyu Liu, Hong Li, Xuhui Liu, Jianzhuang Liu, Dapeng Chen, Wei Peng, Baochang Zhang

In FNeVR, we design a 3D Face Volume Rendering (FVR) module to enhance the facial details for image rendering.

Talking Face Generation

Structure-Preserving Graph Representation Learning

1 code implementation 2 Sep 2022 Ruiyi Fang, Liangjian Wen, Zhao Kang, Jianzhuang Liu

To this end, we propose a novel Structure-Preserving Graph Representation Learning (SPGRL) method, to fully capture the structure information of graphs.

Graph Representation Learning Node Classification

Removing Rain Streaks via Task Transfer Learning

no code implementations 28 Aug 2022 Yinglong Wang, Chao Ma, Jianzhuang Liu

Inspired by our studies, we propose to remove rain by learning favorable deraining representations from other connected tasks.

Knowledge Distillation Rain Removal +1

Anti-Retroactive Interference for Lifelong Learning

1 code implementation 27 Aug 2022 Runqi Wang, Yuxiang Bao, Baochang Zhang, Jianzhuang Liu, Wentao Zhu, Guodong Guo

Second, according to the similarity between incremental knowledge and base knowledge, we design an adaptive fusion of incremental knowledge, which helps the model allocate capacity to the knowledge of different difficulties.

Meta-Learning

Low-Light Video Enhancement with Synthetic Event Guidance

no code implementations 23 Aug 2022 Lin Liu, Junfeng An, Jianzhuang Liu, Shanxin Yuan, Xiangyu Chen, Wengang Zhou, Houqiang Li, Yanfeng Wang, Qi Tian

Low-light video enhancement (LLVE) is an important yet challenging task with many applications such as photographing and autonomous driving.

Autonomous Driving Image Enhancement +1

CLIFF: Carrying Location Information in Full Frames into Human Pose and Shape Estimation

6 code implementations 1 Aug 2022 Zhihao LI, Jianzhuang Liu, Zhensong Zhang, Songcen Xu, Youliang Yan

Top-down methods dominate the field of 3D human pose and shape estimation, because they are decoupled from human detection and allow researchers to focus on the core problem.

3D human pose and shape estimation Human Detection +1

Self-Supervision Can Be a Good Few-Shot Learner

3 code implementations 19 Jul 2022 Yuning Lu, Liangjian Wen, Jianzhuang Liu, Yajing Liu, Xinmei Tian

Specifically, we maximize the mutual information (MI) of instances and their representations with a low-bias MI estimator to perform self-supervised pre-training.

cross-domain few-shot learning Unsupervised Few-Shot Image Classification +1

ADAPT: Vision-Language Navigation with Modality-Aligned Action Prompts

no code implementations CVPR 2022 Bingqian Lin, Yi Zhu, Zicong Chen, Xiwen Liang, Jianzhuang Liu, Xiaodan Liang

Vision-Language Navigation (VLN) is a challenging task that requires an embodied agent to perform action-level modality alignment, i.e., make instruction-asked actions sequentially in complex visual environments.

Vision-Language Navigation

Diversity Matters: Fully Exploiting Depth Clues for Reliable Monocular 3D Object Detection

no code implementations CVPR 2022 Zhuoling Li, Zhan Qu, Yang Zhou, Jianzhuang Liu, Haoqian Wang, Lihui Jiang

To tackle this problem, we propose a depth solving system that fully explores the visual clues from the subtasks in M3OD and generates multiple estimations for the depth of each target.

Depth Estimation Monocular 3D Object Detection +2

Prompt Distribution Learning

no code implementations CVPR 2022 Yuning Lu, Jianzhuang Liu, Yonggang Zhang, Yajing Liu, Xinmei Tian

We present prompt distribution learning for effectively adapting a pre-trained vision-language model to address downstream recognition tasks.

Language Modelling

Neural Architecture Search With Representation Mutual Information

1 code implementation CVPR 2022 Xiawu Zheng, Xiang Fei, Lei Zhang, Chenglin Wu, Fei Chao, Jianzhuang Liu, Wei Zeng, Yonghong Tian, Rongrong Ji

Building upon RMI, we further propose a new search algorithm termed RMI-NAS, accompanied by a theorem that guarantees the global optimum of the searched architecture.

Neural Architecture Search

SiamTrans: Zero-Shot Multi-Frame Image Restoration with Pre-Trained Siamese Transformers

no code implementations 17 Dec 2021 Lin Liu, Shanxin Yuan, Jianzhuang Liu, Xin Guo, Youliang Yan, Qi Tian

For zero-shot image restoration, we design a novel model, termed SiamTrans, which is constructed by Siamese transformers, encoders, and decoders.

Denoising Image Restoration +1

IntraQ: Learning Synthetic Images with Intra-Class Heterogeneity for Zero-Shot Network Quantization

1 code implementation CVPR 2022 Yunshan Zhong, Mingbao Lin, Gongrui Nan, Jianzhuang Liu, Baochang Zhang, Yonghong Tian, Rongrong Ji

In this paper, we observe an interesting phenomenon of intra-class heterogeneity in real data and show that existing methods fail to retain this property in their synthetic images, which causes a limited performance increase.

Quantization

Motion Deblurring with Real Events

no code implementations ICCV 2021 Fang Xu, Lei Yu, Bishan Wang, Wen Yang, Gui-Song Xia, Xu Jia, Zhendong Qiao, Jianzhuang Liu

In this paper, we propose an end-to-end learning framework for event-based motion deblurring in a self-supervised manner, where real-world events are exploited to alleviate the performance degradation caused by data inconsistency.

Deblurring

Wavelet-Based Network For High Dynamic Range Imaging

1 code implementation 3 Aug 2021 Tianhong Dai, Wei Li, Xilei Cao, Jianzhuang Liu, Xu Jia, Ales Leonardis, Youliang Yan, Shanxin Yuan

The frequency-guided upsampling module reconstructs details from multiple frequency-specific components with rich details.

Optical Flow Estimation Vocal Bursts Intensity Prediction

Multi-Target Domain Adaptation with Collaborative Consistency Learning

no code implementations CVPR 2021 Takashi Isobe, Xu Jia, Shuaijun Chen, Jianzhong He, Yongjie Shi, Jianzhuang Liu, Huchuan Lu, Shengjin Wang

To obtain a single model that works across multiple target domains, we propose to simultaneously learn a student model which is trained to not only imitate the output of each expert on the corresponding target domain, but also to pull different experts close to each other with regularization on their weights.

Multi-target Domain Adaptation Semantic Segmentation +1

Multiple instance active learning for object detection

1 code implementation CVPR 2021 Tianning Yuan, Fang Wan, Mengying Fu, Jianzhuang Liu, Songcen Xu, Xiangyang Ji, Qixiang Ye

Despite the substantial progress of active learning for image recognition, an instance-level active learning method designed specifically for object detection is still lacking.

Active Object Detection Multiple Instance Learning +3

Neighbor2Neighbor: Self-Supervised Denoising from Single Noisy Images

12 code implementations CVPR 2021 Tao Huang, Songjiang Li, Xu Jia, Huchuan Lu, Jianzhuang Liu

In this paper, we present a very simple yet effective method named Neighbor2Neighbor to train an effective image denoising model with only noisy images.

Image Denoising Self-Supervised Learning

TRAR: Routing the Attention Spans in Transformer for Visual Question Answering

1 code implementation ICCV 2021 Yiyi Zhou, Tianhe Ren, Chaoyang Zhu, Xiaoshuai Sun, Jianzhuang Liu, Xinghao Ding, Mingliang Xu, Rongrong Ji

Due to the superior ability of global dependency modeling, Transformer and its variants have become the primary choice of many vision-and-language tasks.

Question Answering Referring Expression +2

Self-Adaptively Learning to Demoiré from Focused and Defocused Image Pairs

no code implementations NeurIPS 2020 Lin Liu, Shanxin Yuan, Jianzhuang Liu, Liping Bao, Gregory Slabaugh, Qi Tian

In this paper, we propose a self-adaptive learning method for demoiréing a high-frequency image, with the help of an additional defocused moiré-free blur image.

Test-time Adaptation

Self-Adaptively Learning to Demoire from Focused and Defocused Image Pairs

1 code implementation 3 Nov 2020 Lin Liu, Shanxin Yuan, Jianzhuang Liu, Liping Bao, Gregory Slabaugh, Qi Tian

In this paper, we propose a self-adaptive learning method for demoireing a high-frequency image, with the help of an additional defocused moire-free blur image.

Demoire Test-time Adaptation

Binarized Neural Architecture Search for Efficient Object Recognition

no code implementations 8 Sep 2020 Hanlin Chen, Li'an Zhuo, Baochang Zhang, Xiawu Zheng, Jianzhuang Liu, Rongrong Ji, David Doermann, Guodong Guo

In this paper, binarized neural architecture search (BNAS), with a search space of binarized convolutions, is introduced to produce extremely compressed models to reduce huge computational cost on embedded devices for edge computing.

Edge-computing Face Recognition +3

Light Field View Synthesis via Aperture Disparity and Warping Confidence Map

no code implementations 7 Sep 2020 Nan Meng, Kai Li, Jianzhuang Liu, Edmund Y. Lam

This paper presents a learning-based approach to synthesize the view from an arbitrary camera position given a sparse set of images.

Novel View Synthesis Position

Dual Distribution Alignment Network for Generalizable Person Re-Identification

1 code implementation 27 Jul 2020 Peixian Chen, Pingyang Dai, Jianzhuang Liu, Feng Zheng, Qi Tian, Rongrong Ji

Domain generalization (DG) serves as a promising solution to handle person Re-Identification (Re-ID), which trains the model using labels from the source domain alone, and then directly adopts the trained model to the target domain without model updating.

Domain Generalization Generalizable Person Re-identification

Wavelet-Based Dual-Branch Network for Image Demoireing

1 code implementation 14 Jul 2020 Lin Liu, Jianzhuang Liu, Shanxin Yuan, Gregory Slabaugh, Ales Leonardis, Wengang Zhou, Qi Tian

When smartphone cameras are used to take photos of digital screens, usually moire patterns result, severely degrading photo quality.

Demoire Image Restoration +1

Projection & Probability-Driven Black-Box Attack

1 code implementation CVPR 2020 Jie Li, Rongrong Ji, Hong Liu, Jianzhuang Liu, Bineng Zhong, Cheng Deng, Qi Tian

For reducing the solution space, we first model the adversarial perturbation optimization problem as a process of recovering frequency-sparse perturbations with compressed sensing, under the setting that random noise in the low-frequency space is more likely to be adversarial.

High-Order Residual Network for Light Field Super-Resolution

1 code implementation 29 Mar 2020 Nan Meng, Xiaofei Wu, Jianzhuang Liu, Edmund Y. Lam

In this paper, we propose a novel high-order residual network to learn the geometric features hierarchically from the LF for reconstruction.

Super-Resolution Vocal Bursts Intensity Prediction

Context-Transformer: Tackling Object Confusion for Few-Shot Detection

1 code implementation 16 Mar 2020 Ze Yang, Yali Wang, Xianyu Chen, Jianzhuang Liu, Yu Qiao

Few-shot object detection is a challenging but realistic scenario, where only a few annotated training images are available for training detectors.

Few-Shot Learning Few-Shot Object Detection +3

Filter Sketch for Network Pruning

1 code implementation 23 Jan 2020 Mingbao Lin, Liujuan Cao, Shaojie Li, Qixiang Ye, Yonghong Tian, Jianzhuang Liu, Qi Tian, Rongrong Ji

Our approach, referred to as FilterSketch, encodes the second-order information of pre-trained weights, which enables the representation capacity of pruned networks to be recovered with a simple fine-tuning procedure.

Network Pruning

Multiple Anchor Learning for Visual Object Detection

3 code implementations CVPR 2020 Wei Ke, Tianliang Zhang, Zeyi Huang, Qixiang Ye, Jianzhuang Liu, Dong Huang

In this paper, we propose a Multiple Instance Learning (MIL) approach that selects anchors and jointly optimizes the two modules of a CNN-based object detector.

General Classification Multiple Instance Learning +3

GBCNs: Genetic Binary Convolutional Networks for Enhancing the Performance of 1-bit DCNNs

no code implementations 25 Nov 2019 Chunlei Liu, Wenrui Ding, Yuan Hu, Baochang Zhang, Jianzhuang Liu, Guodong Guo

The BGA method is proposed to modify the binary process of GBCNs to alleviate the local minima problem, which can significantly improve the performance of 1-bit DCNNs.

Face Recognition Object Recognition +1

Binarized Neural Architecture Search

no code implementations 25 Nov 2019 Hanlin Chen, Li'an Zhuo, Baochang Zhang, Xiawu Zheng, Jianzhuang Liu, David Doermann, Rongrong Ji

A variant, binarized neural architecture search (BNAS), with a search space of binarized convolutions, can produce extremely compressed models.

Neural Architecture Search

An End-to-End Foreground-Aware Network for Person Re-Identification

no code implementations 25 Oct 2019 Yiheng Liu, Wengang Zhou, Jianzhuang Liu, Guo-Jun Qi, Qi Tian, Houqiang Li

By presenting a target attention loss, the pedestrian features extracted from the foreground branch become more insensitive to the backgrounds, which greatly reduces the negative impacts of changing backgrounds on matching an identical person across different camera views.

Person Re-Identification

Circulant Binary Convolutional Networks: Enhancing the Performance of 1-bit DCNNs with Circulant Back Propagation

no code implementations CVPR 2019 Chunlei Liu, Wenrui Ding, Xin Xia, Baochang Zhang, Jiaxin Gu, Jianzhuang Liu, Rongrong Ji, David Doermann

The CiFs can be easily incorporated into existing deep convolutional neural networks (DCNNs), which leads to new Circulant Binary Convolutional Networks (CBCNs).

Unsupervised Image Super-Resolution with an Indirect Supervised Path

no code implementations 7 Oct 2019 Zhen Han, Enyan Dai, Xu Jia, Xiaoying Ren, Shuaijun Chen, Chunjing Xu, Jianzhuang Liu, Qi Tian

The task of single image super-resolution (SISR) aims at reconstructing a high-resolution (HR) image from a low-resolution (LR) image.

Image Super-Resolution Translation

RBCN: Rectified Binary Convolutional Networks for Enhancing the Performance of 1-bit DCNNs

no code implementations 21 Aug 2019 Chunlei Liu, Wenrui Ding, Xin Xia, Yuan Hu, Baochang Zhang, Jianzhuang Liu, Bohan Zhuang, Guodong Guo

Binarized convolutional neural networks (BCNNs) are widely used to improve the memory and computation efficiency of deep convolutional neural networks (DCNNs) for mobile and AI-chip-based applications.

Binarization Object Tracking

Bayesian Optimized 1-Bit CNNs

no code implementations ICCV 2019 Jiaxin Gu, Junhe Zhao, Xiao-Long Jiang, Baochang Zhang, Jianzhuang Liu, Guodong Guo, Rongrong Ji

Deep convolutional neural networks (DCNNs) have dominated the recent developments in computer vision through making various record-breaking models.

Multinomial Distribution Learning for Effective Neural Architecture Search

1 code implementation ICCV 2019 Xiawu Zheng, Rongrong Ji, Lang Tang, Baochang Zhang, Jianzhuang Liu, Qi Tian

Therefore, NAS can be transformed to a multinomial distribution learning problem, i.e., the distribution is optimized to have a high expectation of the performance.

Neural Architecture Search

Exploiting Kernel Sparsity and Entropy for Interpretable CNN Compression

1 code implementation CVPR 2019 Yuchao Li, Shaohui Lin, Baochang Zhang, Jianzhuang Liu, David Doermann, Yongjian Wu, Feiyue Huang, Rongrong Ji

The relationship between the input feature maps and 2D kernels is revealed in a theoretical framework, based on which a kernel sparsity and entropy (KSE) indicator is proposed to quantitate the feature map importance in a feature-agnostic manner to guide model compression.

Clustering Model Compression

Projection Convolutional Neural Networks for 1-bit CNNs via Discrete Back Propagation

no code implementations 30 Nov 2018 Jiaxin Gu, Ce Li, Baochang Zhang, Jungong Han, Xian-Bin Cao, Jianzhuang Liu, David Doermann

The advancement of deep convolutional neural networks (DCNNs) has driven significant improvement in the accuracy of recognition systems for many computer vision tasks.

Modulated Convolutional Networks

no code implementations CVPR 2018 Xiaodi Wang, Baochang Zhang, Ce Li, Rongrong Ji, Jungong Han, Xian-Bin Cao, Jianzhuang Liu

In this paper, we propose new Modulated Convolutional Networks (MCNs) to improve the portability of CNNs via binarized filters.

Memory Attention Networks for Skeleton-based Action Recognition

1 code implementation 23 Apr 2018 Chunyu Xie, Ce Li, Baochang Zhang, Chen Chen, Jungong Han, Changqing Zou, Jianzhuang Liu

Specifically, the TARM is deployed in a residual learning module that employs a novel attention learning network to recalibrate the temporal attention of frames in a skeleton sequence.

Action Recognition Skeleton Based Action Recognition +1

Gabor Convolutional Networks

no code implementations 3 May 2017 Shangzhen Luan, Baochang Zhang, Chen Chen, Xian-Bin Cao, Jungong Han, Jianzhuang Liu

Steerable properties dominate the design of traditional filters, e.g., Gabor filters, and endow features with the capability of dealing with spatial transformations.

A Maximum Entropy Feature Descriptor for Age Invariant Face Recognition

no code implementations CVPR 2015 Dihong Gong, Zhifeng Li, DaCheng Tao, Jianzhuang Liu, Xuelong Li

In this paper, we propose a new approach to overcome the representation and matching problems in age invariant face recognition.

Age-Invariant Face Recognition MORPH
