Search Results for author: Baochang Zhang

Found 110 papers, 37 papers with code

The 1st Tiny Object Detection Challenge: Methods and Results

1 code implementation 16 Sep 2020 Xuehui Yu, Zhenjun Han, Yuqi Gong, Nan Jiang, Jian Zhao, Qixiang Ye, Jie Chen, Yuan Feng, Bin Zhang, Xiaodi Wang, Ying Xin, Jingwei Liu, Mingyuan Mao, Sheng Xu, Baochang Zhang, Shumin Han, Cheng Gao, Wei Tang, Lizuo Jin, Mingbo Hong, Yuchao Yang, Shuiwang Li, Huan Luo, Qijun Zhao, Humphrey Shi

The 1st Tiny Object Detection (TOD) Challenge aims to encourage research in developing novel and accurate methods for tiny object detection in images which have wide views, with a current focus on tiny person detection.

Human Detection Object +2

Implicit Diffusion Models for Continuous Super-Resolution

1 code implementation CVPR 2023 Sicheng Gao, Xuhui Liu, Bohan Zeng, Sheng Xu, Yanjing Li, Xiaoyan Luo, Jianzhuang Liu, XianTong Zhen, Baochang Zhang

IDM integrates an implicit neural representation and a denoising diffusion model in a unified end-to-end framework, where the implicit neural representation is adopted in the decoding process to learn continuous-resolution representation.

Denoising Image Super-Resolution

HRank: Filter Pruning using High-Rank Feature Map

2 code implementations CVPR 2020 Mingbao Lin, Rongrong Ji, Yan Wang, Yichen Zhang, Baochang Zhang, Yonghong Tian, Ling Shao

The principle behind our pruning is that low-rank feature maps contain less information, and thus pruned results can be easily reproduced.

Network Pruning Vocal Bursts Intensity Prediction
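
As a rough illustration of the rank criterion described in the HRank entry above, the sketch below scores each filter by the average matrix rank of the feature maps it produces on a small batch and keeps the highest-ranked filters. The layer size, batch, and keep count are arbitrary stand-ins, not the authors' implementation.

```python
# Minimal sketch of a rank-based filter-importance score in the spirit of HRank:
# feature maps whose matrices have low average rank are assumed to carry less
# information and are pruned first. Names and sizes are illustrative only.
import torch
import torch.nn as nn

def average_feature_map_rank(conv: nn.Conv2d, images: torch.Tensor) -> torch.Tensor:
    """Return one score per output filter: the mean matrix rank of its feature maps."""
    with torch.no_grad():
        fmaps = conv(images)                                  # (batch, out_channels, H, W)
        b, c, h, w = fmaps.shape
        ranks = torch.linalg.matrix_rank(fmaps.reshape(b * c, h, w).float())
        return ranks.reshape(b, c).float().mean(dim=0)        # (out_channels,)

conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)
scores = average_feature_map_rank(conv, torch.randn(8, 3, 32, 32))
keep = scores.argsort(descending=True)[:8]                    # keep the 8 highest-rank filters
print(keep.tolist())
```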

Multinomial Distribution Learning for Effective Neural Architecture Search

1 code implementation ICCV 2019 Xiawu Zheng, Rongrong Ji, Lang Tang, Baochang Zhang, Jianzhuang Liu, Qi Tian

Therefore, NAS can be transformed to a multinomial distribution learning problem, i.e., the distribution is optimized to have a high expectation of the performance.

Neural Architecture Search
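
The toy sketch below illustrates the general idea behind the entry above: learn a multinomial distribution over candidate operations so that its expected performance rises. The candidate set, proxy reward, and update rule are made-up stand-ins, not the paper's algorithm.

```python
# Toy sketch of treating architecture search as multinomial distribution learning:
# keep a probability vector over candidate operations, sample, and shift probability
# mass toward operations that returned above-average reward. evaluate() is a
# hypothetical proxy for training/validating a sampled architecture.
import numpy as np

ops = ["conv3x3", "conv5x5", "skip", "maxpool"]
probs = np.full(len(ops), 1.0 / len(ops))            # multinomial over operations
rng = np.random.default_rng(0)

def evaluate(op_index: int) -> float:                # stand-in reward, not a real benchmark
    base = {"conv3x3": 0.9, "conv5x5": 0.8, "skip": 0.6, "maxpool": 0.5}
    return base[ops[op_index]] + rng.normal(0.0, 0.05)

for step in range(200):
    sampled = rng.choice(len(ops), p=probs)
    reward = evaluate(sampled)
    expected = sum(probs[i] * evaluate(i) for i in range(len(ops)))   # expected performance
    probs[sampled] += 0.05 * (reward - expected)                      # raise/lower the sampled op
    probs = np.clip(probs, 1e-3, None)
    probs /= probs.sum()                                              # renormalize

print(ops[int(probs.argmax())])                       # the distribution concentrates on the best op
```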

CCMB: A Large-scale Chinese Cross-modal Benchmark

1 code implementation 8 May 2022 Chunyu Xie, Heng Cai, Jincheng Li, Fanjing Kong, Xiaoyu Wu, Jianfei Song, Henrique Morimitsu, Lin Yao, Dexin Wang, Xiangzheng Zhang, Dawei Leng, Baochang Zhang, Xiangyang Ji, Yafeng Deng

In this work, we build a large-scale high-quality Chinese Cross-Modal Benchmark named CCMB for the research community, which contains the currently largest public pre-training dataset Zero and five human-annotated fine-tuning datasets for downstream tasks.

Image Classification Image Retrieval +7

Channel Pruning via Automatic Structure Search

1 code implementation 23 Jan 2020 Mingbao Lin, Rongrong Ji, Yuxin Zhang, Baochang Zhang, Yongjian Wu, Yonghong Tian

In this paper, we propose a new channel pruning method based on the artificial bee colony algorithm (ABC), dubbed ABCPruner, which aims to efficiently find the optimal pruned structure, i.e., the channel number in each layer, rather than selecting "important" channels as previous works did.

Self-Supervised Monocular Depth and Ego-Motion Estimation in Endoscopy: Appearance Flow to the Rescue

1 code implementation 15 Dec 2021 Shuwei Shao, Zhongcai Pei, Weihai Chen, Wentao Zhu, Xingming Wu, Dianmin Sun, Baochang Zhang

Recently, self-supervised learning technology has been applied to calculate depth and ego-motion from monocular videos, achieving remarkable performance in autonomous driving scenarios.

Depth Estimation Motion Estimation +1

Rotated Binary Neural Network

2 code implementations NeurIPS 2020 Mingbao Lin, Rongrong Ji, Zihan Xu, Baochang Zhang, Yan Wang, Yongjian Wu, Feiyue Huang, Chia-Wen Lin

In this paper, for the first time, we explore the influence of angular bias on the quantization error and then introduce a Rotated Binary Neural Network (RBNN), which considers the angle alignment between the full-precision weight vector and its binarized version.

Binarization Quantization
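
The snippet below only measures the angular bias that the RBNN entry above refers to, i.e. the angle between a full-precision weight vector and its sign binarization; RBNN additionally learns a rotation that shrinks this angle, which is not reproduced here.

```python
# Sketch of the angular bias targeted by RBNN: the angle between a full-precision
# weight vector w and its binarized counterpart sign(w). Illustrative measurement only.
import torch

def angular_bias(w: torch.Tensor) -> float:
    b = torch.sign(w)
    cos = torch.dot(w, b) / (w.norm() * b.norm() + 1e-12)
    return torch.rad2deg(torch.arccos(cos.clamp(-1.0, 1.0))).item()

w = torch.randn(4096)
print(f"angle(w, sign(w)) = {angular_bias(w):.2f} degrees")
```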

SiMaN: Sign-to-Magnitude Network Binarization

2 code implementations 16 Feb 2021 Mingbao Lin, Rongrong Ji, Zihan Xu, Baochang Zhang, Fei Chao, Chia-Wen Lin, Ling Shao

In this paper, we show that our weight binarization provides an analytical solution by encoding high-magnitude weights into +1s, and 0s otherwise.

Binarization
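
A minimal sketch of a sign-to-magnitude style encoding, in the spirit of the SiMaN entry above, is given below: the largest-magnitude weights map to +1 and the rest to 0. The fraction kept here is an arbitrary illustrative choice, not the analytical solution derived in the paper.

```python
# Sketch of sign-to-magnitude encoding: the highest-magnitude weights become +1 and
# the rest 0 (here simply the top half by magnitude; the keep ratio is illustrative).
import torch

def sign_to_magnitude(w: torch.Tensor, keep_ratio: float = 0.5) -> torch.Tensor:
    k = max(1, int(keep_ratio * w.numel()))
    threshold = w.abs().flatten().kthvalue(w.numel() - k + 1).values
    return (w.abs() >= threshold).to(w.dtype)        # +1 for high magnitude, 0 otherwise

w = torch.randn(8)
print(w)
print(sign_to_magnitude(w))
```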

Q-ViT: Accurate and Fully Quantized Low-bit Vision Transformer

1 code implementation 13 Oct 2022 Yanjing Li, Sheng Xu, Baochang Zhang, Xianbin Cao, Peng Gao, Guodong Guo

The large pre-trained vision transformers (ViTs) have demonstrated remarkable performance on various visual tasks, but suffer from expensive computational and memory cost problems when deployed on resource-constrained devices.

Quantization

PAMS: Quantized Super-Resolution via Parameterized Max Scale

1 code implementation ECCV 2020 Huixia Li, Chenqian Yan, Shaohui Lin, Xiawu Zheng, Yuchao Li, Baochang Zhang, Fan Yang, Rongrong Ji

Specifically, most state-of-the-art SR models without batch normalization have a large dynamic quantization range, which also serves as another cause of performance drop.

Quantization Super-Resolution +1
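
To make the idea of a parameterized max scale concrete, the sketch below shows a generic quantizer whose clipping value alpha is a trainable parameter updated through a straight-through estimator. It is a simplified stand-in inspired by the PAMS entry above, not the PAMS layer itself.

```python
# Sketch of a quantizer with a learnable clipping value ("max scale"): activations are
# clipped to [-alpha, alpha] and uniformly quantized, and alpha is trained with the rest
# of the network via a straight-through estimator. Generic illustration only.
import torch
import torch.nn as nn

class LearnableMaxScaleQuant(nn.Module):
    def __init__(self, bits: int = 4, init_alpha: float = 6.0):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(init_alpha))   # trainable max scale
        self.levels = 2 ** (bits - 1) - 1

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        clipped = torch.minimum(torch.maximum(x, -self.alpha), self.alpha)
        step = self.alpha / self.levels
        q = torch.round(clipped / step) * step
        return clipped + (q - clipped).detach()               # straight-through estimator

quant = LearnableMaxScaleQuant(bits=4)
out = quant(torch.randn(2, 8) * 10)
out.sum().backward()                                          # gradients also reach alpha
print(quant.alpha.grad)
```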

Towards Optimal Structured CNN Pruning via Generative Adversarial Learning

1 code implementation CVPR 2019 Shaohui Lin, Rongrong Ji, Chenqian Yan, Baochang Zhang, Liujuan Cao, Qixiang Ye, Feiyue Huang, David Doermann

In this paper, we propose an effective structured pruning approach that jointly prunes filters as well as other structures in an end-to-end manner.

DDPNAS: Efficient Neural Architecture Search via Dynamic Distribution Pruning

1 code implementation 28 May 2019 Xiawu Zheng, Chenyi Yang, Shaokun Zhang, Yan Wang, Baochang Zhang, Yongjian Wu, Yunsheng Wu, Ling Shao, Rongrong Ji

With the proposed efficient network generation method, we directly obtain the optimal neural architectures on given constraints, which is practical for on-device models across diverse search spaces and constraints.

Neural Architecture Search

Exploiting Kernel Sparsity and Entropy for Interpretable CNN Compression

1 code implementation CVPR 2019 Yuchao Li, Shaohui Lin, Baochang Zhang, Jianzhuang Liu, David Doermann, Yongjian Wu, Feiyue Huang, Rongrong Ji

The relationship between the input feature maps and 2D kernels is revealed in a theoretical framework, based on which a kernel sparsity and entropy (KSE) indicator is proposed to quantitate the feature map importance in a feature-agnostic manner to guide model compression.

Clustering Model Compression

FNeVR: Neural Volume Rendering for Face Animation

1 code implementation 21 Sep 2022 Bohan Zeng, Boyu Liu, Hong Li, Xuhui Liu, Jianzhuang Liu, Dapeng Chen, Wei Peng, Baochang Zhang

In FNeVR, we design a 3D Face Volume Rendering (FVR) module to enhance the facial details for image rendering.

Talking Face Generation

IntraQ: Learning Synthetic Images with Intra-Class Heterogeneity for Zero-Shot Network Quantization

1 code implementation CVPR 2022 Yunshan Zhong, Mingbao Lin, Gongrui Nan, Jianzhuang Liu, Baochang Zhang, Yonghong Tian, Rongrong Ji

In this paper, we observe an interesting phenomenon of intra-class heterogeneity in real data and show that existing methods fail to retain this property in their synthetic images, which causes a limited performance increase.

Quantization

Memory Attention Networks for Skeleton-based Action Recognition

1 code implementation 23 Apr 2018 Chunyu Xie, Ce Li, Baochang Zhang, Chen Chen, Jungong Han, Changqing Zou, Jianzhuang Liu

Specifically, the TARM is deployed in a residual learning module that employs a novel attention learning network to recalibrate the temporal attention of frames in a skeleton sequence.

Action Recognition Skeleton Based Action Recognition +1

Q-DETR: An Efficient Low-Bit Quantized Detection Transformer

1 code implementation CVPR 2023 Sheng Xu, Yanjing Li, Mingbao Lin, Peng Gao, Guodong Guo, Jinhu Lu, Baochang Zhang

At the upper level, we introduce a new foreground-aware query matching scheme to effectively transfer the teacher information to distillation-desired features to minimize the conditional information entropy.

object-detection Object Detection +1

IPDreamer: Appearance-Controllable 3D Object Generation with Image Prompts

1 code implementation 9 Oct 2023 Bohan Zeng, Shanglin Li, Yutang Feng, Hong Li, Sicheng Gao, Jiaming Liu, Huaxia Li, Xu Tang, Jianzhuang Liu, Baochang Zhang

Recent advances in 3D generation have been remarkable, with methods such as DreamFusion leveraging large-scale text-to-image diffusion-based models to supervise 3D generation.

3D Generation Image to 3D +2

Recurrent Bilinear Optimization for Binary Neural Networks

2 code implementations 4 Sep 2022 Sheng Xu, Yanjing Li, Tiancheng Wang, Teli Ma, Baochang Zhang, Peng Gao, Yu Qiao, Jinhu Lv, Guodong Guo

To address this issue, Recurrent Bilinear Optimization is proposed to improve the learning process of BNNs (RBONNs) by associating the intrinsic bilinear variables in the back propagation process.

object-detection Object Detection

iffDetector: Inference-aware Feature Filtering for Object Detection

1 code implementation 23 Jun 2020 Mingyuan Mao, Yuxin Tian, Baochang Zhang, Qixiang Ye, Wanquan Liu, Guodong Guo, David Doermann

In this paper, we propose a new feature optimization approach to enhance features and suppress background noise in both the training and inference stages.

Object object-detection +1

IDa-Det: An Information Discrepancy-aware Distillation for 1-bit Detectors

1 code implementation 7 Oct 2022 Sheng Xu, Yanjing Li, Bohan Zeng, Teli Ma, Baochang Zhang, Xianbin Cao, Peng Gao, Jinhu Lv

This explains why existing KD methods are less effective for 1-bit detectors, caused by a significant information discrepancy between the real-valued teacher and the 1-bit student.

Knowledge Distillation object-detection +1

Resilient Binary Neural Network

1 code implementation 2 Feb 2023 Sheng Xu, Yanjing Li, Teli Ma, Mingbao Lin, Hao Dong, Baochang Zhang, Peng Gao, Jinhu Lv

In this paper, we introduce a Resilient Binary Neural Network (ReBNN) to mitigate the frequent oscillation for better BNNs' training.

Few-Shot Learning with Visual Distribution Calibration and Cross-Modal Distribution Alignment

1 code implementation CVPR 2023 Runqi Wang, Hao Zheng, Xiaoyue Duan, Jianzhuang Liu, Yuning Lu, Tian Wang, Songcen Xu, Baochang Zhang

However, with only a few training images, there exist two crucial problems: (1) the visual feature distributions are easily distracted by class-irrelevant information in images, and (2) the alignment between the visual and language feature distributions is difficult.

Few-Shot Learning

Aha! Adaptive History-Driven Attack for Decision-Based Black-Box Models

1 code implementation ICCV 2021 Jie Li, Rongrong Ji, Peixian Chen, Baochang Zhang, Xiaopeng Hong, Ruixin Zhang, Shaoxin Li, Jilin Li, Feiyue Huang, Yongjian Wu

A common practice is to start from a large perturbation and then iteratively reduce it with a deterministic direction and a random one while keeping it adversarial.

Dimensionality Reduction
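
The loop below sketches the common practice the entry above describes for decision-based attacks: shrink a large perturbation by mixing a deterministic step toward the clean image with a random direction, keeping a candidate only while the black-box model still misclassifies it. The is_adversarial oracle and step sizes are hypothetical, and the paper's adaptive history-driven sampling is not modeled.

```python
# Generic sketch of a decision-based black-box attack loop: start from a large
# adversarial perturbation, then repeatedly mix a deterministic step toward the clean
# image with random exploration, accepting a candidate only if it stays adversarial.
# is_adversarial() is a hypothetical oracle that queries the target model's decision.
import numpy as np

def reduce_perturbation(x_clean, x_adv, is_adversarial,
                        steps=1000, toward=0.05, sigma=0.02, seed=0):
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        direction = x_clean - x_adv                       # deterministic: shrink the perturbation
        noise = rng.normal(0.0, sigma, size=x_adv.shape)  # random exploration direction
        candidate = np.clip(x_adv + toward * direction + noise, 0.0, 1.0)
        if is_adversarial(candidate):                     # keep only adversarial candidates
            x_adv = candidate
    return x_adv
```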

Controllable Mind Visual Diffusion Model

1 code implementation 17 May 2023 Bohan Zeng, Shanglin Li, Xuhui Liu, Sicheng Gao, XiaoLong Jiang, Xu Tang, Yao Hu, Jianzhuang Liu, Baochang Zhang

Brain signal visualization has emerged as an active research area, serving as a critical interface between the human visual system and computer vision models.

Attribute Image Generation

A General and Efficient Training for Transformer via Token Expansion

1 code implementation 31 Mar 2024 Wenxuan Huang, Yunhang Shen, Jiao Xie, Baochang Zhang, Gaoqi He, Ke Li, Xing Sun, Shaohui Lin

The remarkable performance of Vision Transformers (ViTs) typically requires an extremely large training cost.

ZONE: Zero-Shot Instruction-Guided Local Editing

1 code implementation 28 Dec 2023 Shanglin Li, Bohan Zeng, Yutang Feng, Sicheng Gao, Xuhui Liu, Jiaming Liu, Li Lin, Xu Tang, Yao Hu, Jianzhuang Liu, Baochang Zhang

We then propose a Region-IoU scheme for precise image layer extraction from an off-the-shelf segmentation model.

Image Generation

Filter Pruning for Efficient CNNs via Knowledge-driven Differential Filter Sampler

1 code implementation 1 Jul 2023 Shaohui Lin, Wenxuan Huang, Jiao Xie, Baochang Zhang, Yunhang Shen, Zhou Yu, Jungong Han, David Doermann

In this paper, we propose a novel Knowledge-driven Differential Filter Sampler~(KDFS) with Masked Filter Modeling~(MFM) framework for filter pruning, which globally prunes the redundant filters based on the prior knowledge of a pre-trained model in a differential and non-alternative optimization.

Image Classification Network Pruning

The Structure Transfer Machine Theory and Applications

1 code implementation 1 Apr 2018 Baochang Zhang, Lian Zhuo, Ze Wang, Jungong Han, Xian-Tong Zhen

Representation learning is a fundamental but challenging problem, especially when the distribution of data is unknown.

Image Classification Object Tracking +1

Federated Learning via Input-Output Collaborative Distillation

1 code implementation 22 Dec 2023 Xuan Gong, Shanglin Li, Yuxiang Bao, Barry Yao, Yawen Huang, Ziyan Wu, Baochang Zhang, Yefeng Zheng, David Doermann

Federated learning (FL) is a machine learning paradigm in which distributed local nodes collaboratively train a central model without sharing individually held private data.

Federated Learning Image Classification

Anti-Retroactive Interference for Lifelong Learning

1 code implementation 27 Aug 2022 Runqi Wang, Yuxiang Bao, Baochang Zhang, Jianzhuang Liu, Wentao Zhu, Guodong Guo

Second, according to the similarity between incremental knowledge and base knowledge, we design an adaptive fusion of incremental knowledge, which helps the model allocate capacity to the knowledge of different difficulties.

Meta-Learning

Object detection and tracking benchmark in industry based on improved correlation filter

no code implementations 11 Jun 2018 Shangzhen Luan, Yan Li, Xiaodi Wang, Baochang Zhang

Real-time object detection and tracking have been shown to be the basis of intelligent production for Industry 4.0 applications.

object-detection Real-Time Object Detection

Gabor Convolutional Networks

no code implementations 3 May 2017 Shangzhen Luan, Baochang Zhang, Chen Chen, Xian-Bin Cao, Jungong Han, Jianzhuang Liu

Steerable properties dominate the design of traditional filters, e.g., Gabor filters, and endow features with the capability of dealing with spatial transformations.

Latent Constrained Correlation Filter

no code implementations 11 Nov 2017 Baochang Zhang, Shangzhen Luan, Chen Chen, Jungong Han, Wei Wang, Alessandro Perina, Ling Shao

In this paper, we introduce an intermediate step -- solution sampling -- after the data sampling step to form a subspace, in which an optimal solution can be estimated.

Object Recognition Object Tracking

Manifold Constrained Low-Rank Decomposition

no code implementations 6 Aug 2017 Chen Chen, Baochang Zhang, Alessio Del Bue, Vittorio Murino

Low-rank decomposition (LRD) is a state-of-the-art method for visual data reconstruction and modelling.

Latent Constrained Correlation Filters for Object Localization

no code implementations 7 Jun 2016 Shangzhen Luan, Baochang Zhang, Jungong Han, Chen Chen, Ling Shao, Alessandro Perina, Linlin Shen

A neglected fact in traditional machine learning methods is that data sampling can actually lead to solution sampling.

Object Object Localization

Self-learning Scene-specific Pedestrian Detectors using a Progressive Latent Model

no code implementations CVPR 2017 Qixiang Ye, Tianliang Zhang, Qiang Qiu, Baochang Zhang, Jie Chen, Guillermo Sapiro

In this paper, a self-learning approach is proposed towards solving the scene-specific pedestrian detection problem without any human annotation involved.

Object Object Discovery +5

Boosting-like Deep Learning For Pedestrian Detection

no code implementations 26 May 2015 Lei Wang, Baochang Zhang

This paper proposes boosting-like deep learning (BDL) framework for pedestrian detection.

Pedestrian Detection

Projection Convolutional Neural Networks for 1-bit CNNs via Discrete Back Propagation

no code implementations 30 Nov 2018 Jiaxin Gu, Ce Li, Baochang Zhang, Jungong Han, Xian-Bin Cao, Jianzhuang Liu, David Doermann

The advancement of deep convolutional neural networks (DCNNs) has driven significant improvement in the accuracy of recognition systems for many computer vision tasks.

Modulated Convolutional Networks

no code implementations CVPR 2018 Xiaodi Wang, Baochang Zhang, Ce Li, Rongrong Ji, Jungong Han, Xian-Bin Cao, Jianzhuang Liu

In this paper, we propose new Modulated Convolutional Networks (MCNs) to improve the portability of CNNs via binarized filters.

Cross-Modality Binary Code Learning via Fusion Similarity Hashing

no code implementations CVPR 2017 Hong Liu, Rongrong Ji, Yongjian Wu, Feiyue Huang, Baochang Zhang

In this paper, we propose a hashing scheme, termed Fusion Similarity Hashing (FSH), which explicitly embeds the graph-based fusion similarity across modalities into a common Hamming space.

Retrieval

Crowd Counting and Density Estimation by Trellis Encoder-Decoder Network

no code implementations 3 Mar 2019 Xiaolong Jiang, Zehao Xiao, Baochang Zhang, Xian-Tong Zhen, Xian-Bin Cao, David Doermann, Ling Shao

In this paper, we propose a trellis encoder-decoder network (TEDnet) for crowd counting, which focuses on generating high-quality density estimation maps.

Crowd Counting Density Estimation

Supervised Online Hashing via Similarity Distribution Learning

no code implementations 31 May 2019 Mingbao Lin, Rongrong Ji, Shen Chen, Feng Zheng, Xiaoshuai Sun, Baochang Zhang, Liujuan Cao, Guodong Guo, Feiyue Huang

In this paper, we propose to model the similarity distributions between the input data and the hashing codes, upon which a novel supervised online hashing method, dubbed as Similarity Distribution based Online Hashing (SDOH), is proposed, to keep the intrinsic semantic relationship in the produced Hamming space.

Retrieval

Interpretable Neural Network Decoupling

no code implementations ECCV 2020 Yuchao Li, Rongrong Ji, Shaohui Lin, Baochang Zhang, Chenqian Yan, Yongjian Wu, Feiyue Huang, Ling Shao

More specifically, we introduce a novel architecture controlling module in each layer to encode the network architecture by a vector.

Network Interpretation

Bayesian Optimized 1-Bit CNNs

no code implementations ICCV 2019 Jiaxin Gu, Junhe Zhao, Xiao-Long Jiang, Baochang Zhang, Jianzhuang Liu, Guodong Guo, Rongrong Ji

Deep convolutional neural networks (DCNNs) have dominated the recent developments in computer vision through making various record-breaking models.

RBCN: Rectified Binary Convolutional Networks for Enhancing the Performance of 1-bit DCNNs

no code implementations 21 Aug 2019 Chunlei Liu, Wenrui Ding, Xin Xia, Yuan Hu, Baochang Zhang, Jianzhuang Liu, Bohan Zhuang, Guodong Guo

Binarized convolutional neural networks (BCNNs) are widely used to improve memory and computation efficiency of deep convolutional neural networks (DCNNs) for mobile and AI chips based applications.

Binarization Object Tracking

Semantic-aware Image Deblurring

no code implementations 9 Oct 2019 Fuhai Chen, Rongrong Ji, Chengpeng Dai, Xiaoshuai Sun, Chia-Wen Lin, Jiayi Ji, Baochang Zhang, Feiyue Huang, Liujuan Cao

Specifically, we propose a novel Structured-Spatial Semantic Embedding model for image deblurring (termed S3E-Deblur), which introduces a novel Structured-Spatial Semantic tree model (S3-tree) to bridge two basic tasks in computer vision: image deblurring (ImD) and image captioning (ImC).

Deblurring Image Captioning +1

Aggregation Signature for Small Object Tracking

no code implementations 24 Oct 2019 Chunlei Liu, Wenrui Ding, Jinyu Yang, Vittorio Murino, Baochang Zhang, Jungong Han, Guodong Guo

In this paper, we propose a novel aggregation signature suitable for small object tracking, especially aiming for the challenge of sudden and large drift.

Object Object Tracking

Circulant Binary Convolutional Networks: Enhancing the Performance of 1-bit DCNNs with Circulant Back Propagation

no code implementations CVPR 2019 Chunlei Liu, Wenrui Ding, Xin Xia, Baochang Zhang, Jiaxin Gu, Jianzhuang Liu, Rongrong Ji, David Doermann

The CiFs can be easily incorporated into existing deep convolutional neural networks (DCNNs), which leads to new Circulant Binary Convolutional Networks (CBCNs).

Variational Structured Semantic Inference for Diverse Image Captioning

no code implementations NeurIPS 2019 Fuhai Chen, Rongrong Ji, Jiayi Ji, Xiaoshuai Sun, Baochang Zhang, Xuri Ge, Yongjian Wu, Feiyue Huang, Yan Wang

To model these two inherent diversities in image captioning, we propose a Variational Structured Semantic Inferring model (termed VSSI-cap) executed in a novel structured encoder-inferer-decoder schema.

Image Captioning

Binarized Neural Architecture Search

no code implementations 25 Nov 2019 Hanlin Chen, Li'an Zhuo, Baochang Zhang, Xiawu Zheng, Jianzhuang Liu, David Doermann, Rongrong Ji

A variant, binarized neural architecture search (BNAS), with a search space of binarized convolutions, can produce extremely compressed models.

Neural Architecture Search

GBCNs: Genetic Binary Convolutional Networks for Enhancing the Performance of 1-bit DCNNs

no code implementations 25 Nov 2019 Chunlei Liu, Wenrui Ding, Yuan Hu, Baochang Zhang, Jianzhuang Liu, Guodong Guo

The BGA method is proposed to modify the binary process of GBCNs to alleviate the local minima problem, which can significantly improve the performance of 1-bit DCNNs.

Face Recognition Object Recognition +1

NAS-Count: Counting-by-Density with Neural Architecture Search

no code implementations ECCV 2020 Yutao Hu, Xiao-Long Jiang, Xuhui Liu, Baochang Zhang, Jungong Han, Xian-Bin Cao, David Doermann

Most of the recent advances in crowd counting have evolved from hand-designed density estimation networks, where multi-scale features are leveraged to address the scale variation problem, but at the expense of demanding design efforts.

Crowd Counting Density Estimation +1

CP-NAS: Child-Parent Neural Architecture Search for Binary Neural Networks

no code implementations 30 Apr 2020 Li'an Zhuo, Baochang Zhang, Hanlin Chen, Linlin Yang, Chen Chen, Yanjun Zhu, David Doermann

To this end, a Child-Parent (CP) model is introduced to a differentiable NAS to search the binarized architecture (Child) under the supervision of a full-precision model (Parent).

Neural Architecture Search

Cogradient Descent for Bilinear Optimization

no code implementations CVPR 2020 Li'an Zhuo, Baochang Zhang, Linlin Yang, Hanlin Chen, Qixiang Ye, David Doermann, Guodong Guo, Rongrong Ji

Conventional learning methods simplify the bilinear model by regarding two intrinsically coupled factors independently, which degrades the optimization procedure.

Image Reconstruction Network Pruning

Anti-Bandit Neural Architecture Search for Model Defense

no code implementations ECCV 2020 Hanlin Chen, Baochang Zhang, Song Xue, Xuan Gong, Hong Liu, Rongrong Ji, David Doermann

Deep convolutional neural networks (DCNNs) have dominated as the best performers in machine learning, but can be challenged by adversarial attacks.

Denoising Neural Architecture Search

Binarized Neural Architecture Search for Efficient Object Recognition

no code implementations 8 Sep 2020 Hanlin Chen, Li'an Zhuo, Baochang Zhang, Xiawu Zheng, Jianzhuang Liu, Rongrong Ji, David Doermann, Guodong Guo

In this paper, binarized neural architecture search (BNAS), with a search space of binarized convolutions, is introduced to produce extremely compressed models to reduce huge computational cost on embedded devices for edge computing.

Edge-computing Face Recognition +3

A Review of Recent Advances of Binary Neural Networks for Edge Computing

no code implementations 24 Nov 2020 Wenyu Zhao, Teli Ma, Xuan Gong, Baochang Zhang, David Doermann

Edge computing promises to become one of the hottest topics in artificial intelligence because it benefits various evolving domains such as real-time unmanned aerial systems, industrial applications, and the demand for privacy protection.

Edge-computing Neural Architecture Search +3

Fast Class-wise Updating for Online Hashing

no code implementations 1 Dec 2020 Mingbao Lin, Rongrong Ji, Xiaoshuai Sun, Baochang Zhang, Feiyue Huang, Yonghong Tian, DaCheng Tao

To achieve fast online adaptivity, a class-wise updating method is developed to decompose the binary code learning and alternatively renew the hash functions in a class-wise fashion, which well addresses the burden on large amounts of training batches.

Multi-UAV Mobile Edge Computing and Path Planning Platform based on Reinforcement Learning

no code implementations 3 Feb 2021 Huan Chang, Yicheng Chen, Baochang Zhang, David Doermann

Unmanned Aerial vehicles (UAVs) are widely used as network processors in mobile networks, but more recently, UAVs have been used in Mobile Edge Computing as mobile servers.

Edge-computing reinforcement-learning +1

Dual-stream Network for Visual Recognition

no code implementations NeurIPS 2021 Mingyuan Mao, Renrui Zhang, Honghui Zheng, Peng Gao, Teli Ma, Yan Peng, Errui Ding, Baochang Zhang, Shumin Han

Transformers with remarkable global representation capacities achieve competitive results for visual tasks, but fail to consider high-level local pattern information in input images.

Image Classification Instance Segmentation +3

Oriented Object Detection with Transformer

no code implementations 6 Jun 2021 Teli Ma, Mingyuan Mao, Honghui Zheng, Peng Gao, Xiaodi Wang, Shumin Han, Errui Ding, Baochang Zhang, David Doermann

Object detection with Transformers (DETR) has achieved a competitive performance over traditional detectors, such as Faster R-CNN.

Object object-detection +2

Cogradient Descent for Dependable Learning

no code implementations 20 Jun 2021 Runqi Wang, Baochang Zhang, Li'an Zhuo, Qixiang Ye, David Doermann

Conventional gradient descent methods compute the gradients for multiple variables through the partial derivative.

Image Inpainting Image Reconstruction +1

Layer-Wise Searching for 1-Bit Detectors

no code implementations CVPR 2021 Sheng Xu, Junhe Zhao, Jinhu Lu, Baochang Zhang, Shumin Han, David Doermann

At each layer, it exploits a differentiable binarization search (DBS) to minimize the angular error in a student-teacher framework.

Binarization

IDARTS: Interactive Differentiable Architecture Search

no code implementations ICCV 2021 Song Xue, Runqi Wang, Baochang Zhang, Tian Wang, Guodong Guo, David Doermann

Differentiable Architecture Search (DARTS) improves the efficiency of architecture search by learning the architecture and network parameters end-to-end.

Towards Comprehensive Monocular Depth Estimation: Multiple Heads Are Better Than One

no code implementations 16 Nov 2021 Shuwei Shao, Ran Li, Zhongcai Pei, Zhong Liu, Weihai Chen, Wentao Zhu, Xingming Wu, Baochang Zhang

In this work, we investigate this phenomenon and propose to integrate the strengths of multiple weak depth predictors to build a comprehensive and accurate depth predictor, which is critical for many real-world applications, e.g., 3D reconstruction.

3D Reconstruction Ensemble Learning +2

POEM: 1-bit Point-wise Operations based on Expectation-Maximization for Efficient Point Cloud Processing

no code implementations 26 Nov 2021 Sheng Xu, Yanjing Li, Junhe Zhao, Baochang Zhang, Guodong Guo

Real-time point cloud processing is fundamental for lots of computer vision tasks, while still challenged by the computational problem on resource-limited edge devices.

Associative Adversarial Learning Based on Selective Attack

no code implementations 28 Dec 2021 Runqi Wang, Xiaoyue Duan, Baochang Zhang, Song Xue, Wentao Zhu, David Doermann, Guodong Guo

We show that our method improves the recognition accuracy of adversarial training on ImageNet by 8.32% compared with the baseline.

Adversarial Robustness Few-Shot Learning +2

TerViT: An Efficient Ternary Vision Transformer

no code implementations 20 Jan 2022 Sheng Xu, Yanjing Li, Teli Ma, Bohan Zeng, Baochang Zhang, Peng Gao, Jinhu Lv

Vision transformers (ViTs) have demonstrated great potential in various visual tasks, but suffer from expensive computational and memory cost problems when deployed on resource-constrained devices.

Bi-level Doubly Variational Learning for Energy-based Latent Variable Models

no code implementations CVPR 2022 Ge Kan, Jinhu Lü, Tian Wang, Baochang Zhang, Aichun Zhu, Lei Huang, Guodong Guo, Hichem Snoussi

In this paper, we propose Bi-level doubly variational learning (BiDVL), which is based on a new bi-level optimization framework and two tractable variational distributions to facilitate learning EBLVMs.

Image Generation Image Reconstruction +1

MAFormer: A Transformer Network with Multi-scale Attention Fusion for Visual Recognition

no code implementations 31 Aug 2022 Yunhao Wang, Huixin Sun, Xiaodi Wang, Bin Zhang, Chao Li, Ying Xin, Baochang Zhang, Errui Ding, Shumin Han

We develop a simple but effective module to explore the full potential of transformers for visual representation by learning fine-grained and coarse-grained features at a token level and dynamically fusing them.

Instance Segmentation object-detection +2

Rethinking the Number of Shots in Robust Model-Agnostic Meta-Learning

no code implementations 28 Nov 2022 Xiaoyue Duan, Guoliang Kang, Runqi Wang, Shumin Han, Song Xue, Tian Wang, Baochang Zhang

Based on this observation, we propose a simple strategy, i.e., increasing the number of training shots, to mitigate the loss of intrinsic dimension caused by robustness-promoting regularization.

Meta-Learning

Feature Calibration Network for Occluded Pedestrian Detection

no code implementations 12 Dec 2022 Tianliang Zhang, Qixiang Ye, Baochang Zhang, Jianzhuang Liu, Xiaopeng Zhang, Qi Tian

FC-Net is based on the observation that the visible parts of pedestrians are selective and decisive for detection, and is implemented as a self-paced feature learning framework with a self-activation (SA) module and a feature calibration (FC) module.

Pedestrian Detection

CircleNet: Reciprocating Feature Adaptation for Robust Pedestrian Detection

no code implementations 12 Dec 2022 Tianliang Zhang, Zhenjun Han, Huijuan Xu, Baochang Zhang, Qixiang Ye

In this paper we propose a novel feature learning model, referred to as CircleNet, to achieve feature adaptation by mimicking the process of humans looking at low-resolution and occluded objects: focusing on them again, at a finer scale, if the object cannot be identified clearly the first time.

object-detection Object Detection +1

MVP-SEG: Multi-View Prompt Learning for Open-Vocabulary Semantic Segmentation

no code implementations 14 Apr 2023 Jie Guo, Qimeng Wang, Yan Gao, XiaoLong Jiang, Xu Tang, Yao Hu, Baochang Zhang

CLIP (Contrastive Language-Image Pretraining) is well-developed for open-vocabulary zero-shot image-level recognition, while its applications in pixel-level tasks are less investigated, where most efforts directly adopt CLIP features without deliberative adaptations.

GPR Open Vocabulary Semantic Segmentation +3

AttriCLIP: A Non-Incremental Learner for Incremental Knowledge Learning

no code implementations CVPR 2023 Runqi Wang, Xiaoyue Duan, Guoliang Kang, Jianzhuang Liu, Shaohui Lin, Songcen Xu, Jinhu Lv, Baochang Zhang

Text consists of a category name and a fixed number of learnable parameters which are selected from our designed attribute word bank and serve as attributes.

Attribute Continual Learning +1

Bi-ViT: Pushing the Limit of Vision Transformer Quantization

no code implementations 21 May 2023 Yanjing Li, Sheng Xu, Mingbao Lin, Xianbin Cao, Chuanjian Liu, Xiao Sun, Baochang Zhang

Vision transformers (ViTs) quantization offers a promising prospect to facilitate deploying large pre-trained networks on resource-limited devices.

Binarization Quantization

Decom-CAM: Tell Me What You See, In Details! Feature-Level Interpretation via Decomposition Class Activation Map

no code implementations 27 May 2023 Yuguang Yang, Runtang Guo, Sheng Wu, Yimi Wang, Juan Zhang, Xuan Gong, Baochang Zhang

Although the Class Activation Map (CAM) is widely used to interpret deep model predictions by highlighting object location, it fails to provide insight into the salient features used by the model to make decisions.

Decision Making

DCP-NAS: Discrepant Child-Parent Neural Architecture Search for 1-bit CNNs

no code implementations 27 Jun 2023 Yanjing Li, Sheng Xu, Xianbin Cao, Li'an Zhuo, Baochang Zhang, Tian Wang, Guodong Guo

One natural approach is to use 1-bit CNNs to reduce the computation and memory cost of NAS by taking advantage of the strengths of each in a unified framework, while searching the 1-bit CNNs is more challenging due to the more complicated processes involved.

Neural Architecture Search object-detection +2

Heterogeneous Generative Knowledge Distillation with Masked Image Modeling

no code implementations 18 Sep 2023 ZiMing Wang, Shumin Han, Xiaodi Wang, Jing Hao, Xianbin Cao, Baochang Zhang

Masked image modeling (MIM) methods achieve great success in various visual tasks but remain largely unexplored in knowledge distillation for heterogeneous deep models.

Image Classification Knowledge Distillation +3

LatentWarp: Consistent Diffusion Latents for Zero-Shot Video-to-Video Translation

no code implementations 1 Nov 2023 Yuxiang Bao, Di Qiu, Guoliang Kang, Baochang Zhang, Bo Jin, Kaiye Wang, Pengfei Yan

As a result, the corresponding regions across the adjacent frames can share closely-related query tokens and attention outputs, which can further improve latent-level consistency to enhance visual temporal coherence of generated videos.

Denoising Optical Flow Estimation +1

Tuning-Free Inversion-Enhanced Control for Consistent Image Editing

no code implementations 22 Dec 2023 Xiaoyue Duan, Shuhao Cui, Guoliang Kang, Baochang Zhang, Zhengcong Fei, Mingyuan Fan, Junshi Huang

Consistent editing of real images is a challenging task, as it requires performing non-rigid edits (e.g., changing postures) to the main objects in the input image without changing their identity or attributes.

Denoising

Class-Imbalanced Semi-Supervised Learning for Large-Scale Point Cloud Semantic Segmentation via Decoupling Optimization

no code implementations 13 Jan 2024 Mengtian Li, Shaohui Lin, Zihan Wang, Yunhang Shen, Baochang Zhang, Lizhuang Ma

Semi-supervised learning (SSL), thanks to the significant reduction of data annotation costs, has been an active research topic for large-scale 3D scene understanding.

Pseudo Label Representation Learning +2

Push Quantization-Aware Training Toward Full Precision Performances via Consistency Regularization

no code implementations 21 Feb 2024 Junbiao Pang, Tianyang Cai, Baochang Zhang, Jiaqi Wu, Ye Tao

Existing Quantization-Aware Training (QAT) methods intensively depend on the complete labeled dataset or knowledge distillation to guarantee the performances toward Full Precision (FP) accuracies.

Knowledge Distillation Quantization

Effective Gradient Sample Size via Variation Estimation for Accelerating Sharpness aware Minimization

no code implementations 24 Feb 2024 Jiaxin Deng, Junbiao Pang, Baochang Zhang, Tian Wang

Concretely, we discover that the gradient of SAM is a combination of the gradient of SGD and the Projection of the Second-order gradient matrix onto the First-order gradient (PSF).
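
For context, the sketch below performs one plain SAM update, i.e. the "gradient of SAM" that the decomposition above refers to: perturb the weights along the normalized gradient, recompute the gradient at the perturbed point, and descend. This is vanilla SAM on a toy objective, not the paper's accelerated variant.

```python
# One vanilla SAM step: climb to a nearby worst-case point along the normalized
# gradient, recompute the gradient there, then descend with it. Toy objective only.
import torch

def sam_step(params, loss_fn, lr=0.1, rho=0.05):
    loss = loss_fn()
    loss.backward()
    grad_norm = torch.sqrt(sum((p.grad ** 2).sum() for p in params)) + 1e-12
    eps = [rho * p.grad / grad_norm for p in params]
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.add_(e)                       # move to the local worst case w + eps
    for p in params:
        p.grad = None
    loss_fn().backward()                    # SAM gradient, evaluated at w + eps
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.sub_(e)                       # return to the original weights
        for p in params:
            p.sub_(lr * p.grad)             # descend with the SAM gradient
            p.grad = None

w = torch.tensor([2.0, -1.0], requires_grad=True)
sam_step([w], lambda: ((w - 1.0) ** 2).sum())
print(w)
```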

$\mathrm{F^2Depth}$: Self-supervised Indoor Monocular Depth Estimation via Optical Flow Consistency and Feature Map Synthesis

no code implementations 27 Mar 2024 Xiaotong Guo, Huijie Zhao, Shuwei Shao, Xudong Li, Baochang Zhang

To evaluate the generalization ability of our $\mathrm{F^2Depth}$, we collect a Campus Indoor depth dataset composed of approximately 1500 points selected from 99 images in 18 scenes.

Indoor Monocular Depth Estimation Monocular Depth Estimation +2

Fusion-Mamba for Cross-modality Object Detection

no code implementations 14 Apr 2024 Wenhao Dong, Haodong Zhu, Shaohui Lin, Xiaoyan Luo, Yunhang Shen, Xuhui Liu, Juan Zhang, Guodong Guo, Baochang Zhang

In this paper, we investigate cross-modality fusion by associating cross-modal features in a hidden state space based on an improved Mamba with a gating mechanism.

Object object-detection +1

Real-time guidewire tracking and segmentation in intraoperative x-ray

no code implementations 12 Apr 2024 Baochang Zhang, Mai Bui, Cheng Wang, Felix Bourier, Heribert Schunkert, Nassir Navab

For this purpose, real-time and accurate guidewire segmentation and tracking can enhance the visualization of guidewires and provide visual feedback for physicians during the intervention as well as for robot-assisted interventions.
