Search Results for author: Chunhua Shen

Found 340 papers, 138 papers with code

Instance-Aware Embedding for Point Cloud Instance Segmentation

no code implementations ECCV 2020 Tong He, Yifan Liu, Chunhua Shen, Xinlong Wang, Changming Sun

However, these methods are unaware of the instance context and fail to realize the boundary and geometric information of an instance, which are critical to separate adjacent objects.

Instance Segmentation Semantic Segmentation

A Geometric Perspective on Diffusion Models

no code implementations 31 May 2023 Defang Chen, Zhenyu Zhou, Jian-Ping Mei, Chunhua Shen, Chun Chen, Can Wang

Recent years have witnessed significant progress in developing efficient training and fast sampling approaches for diffusion models.


StyleAvatar3D: Leveraging Image-Text Diffusion Models for High-Fidelity 3D Avatar Generation

1 code implementation 30 May 2023 Chi Zhang, YiWen Chen, Yijun Fu, Zhenglin Zhou, Gang Yu, Billzb Wang, Bin Fu, Tao Chen, Guosheng Lin, Chunhua Shen

The recent advancements in image-text diffusion models have stimulated research interest in large-scale 3D generative models.

Learning Conditional Attributes for Compositional Zero-Shot Learning

1 code implementation CVPR 2023 Qingsheng Wang, Lingqiao Liu, Chenchen Jing, Hao Chen, Guoqiang Liang, Peng Wang, Chunhua Shen

Compositional Zero-Shot Learning (CZSL) aims to train models to recognize novel compositional concepts based on learned concepts such as attribute-object combinations.

Compositional Zero-Shot Learning

Pruning Meets Low-Rank Parameter-Efficient Fine-Tuning

no code implementations 28 May 2023 Mingyang Zhang, Hao Chen, Chunhua Shen, Zhen Yang, Linlin Ou, Xinyi Yu, Bohan Zhuang

We first design a PEFT-aware pruning criterion, which utilizes the values and gradients of Low-Rank Adaption (LoRA), rather than the gradients of pre-trained parameters for importance estimation.

Model Compression Network Pruning

Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching

1 code implementation 22 May 2023 Yang Liu, Muzhi Zhu, Hengtao Li, Hao Chen, Xinlong Wang, Chunhua Shen

Naively connecting the models results in unsatisfying performance, e.g., the models tend to generate matching outliers and false-positive mask fragments.

Semantic Segmentation

SegGPT: Segmenting Everything In Context

1 code implementation 6 Apr 2023 Xinlong Wang, Xiaosong Zhang, Yue Cao, Wen Wang, Chunhua Shen, Tiejun Huang

We unify various segmentation tasks into a generalist in-context learning framework that accommodates different kinds of segmentation data by transforming them into the same format of images.

Few-Shot Semantic Segmentation Panoptic Segmentation +3

Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion Models

1 code implementation 30 Mar 2023 Wen Wang, Kangyang Xie, Zide Liu, Hao Chen, Yue Cao, Xinlong Wang, Chunhua Shen

Our vid2vid-zero leverages off-the-shelf image diffusion models, and doesn't require training on any video.

Image Generation Video Alignment +1

Zolly: Zoom Focal Length Correctly for Perspective-Distorted Human Mesh Reconstruction

1 code implementation 24 Mar 2023 Wenjia Wang, Yongtao Ge, Haiyi Mei, Zhongang Cai, Qingping Sun, Yanjun Wang, Chunhua Shen, Lei Yang, Taku Komura

As it is hard to calibrate single-view RGB images in the wild, existing 3D human mesh reconstruction (3DHMR) methods either use a constant large focal length or estimate one based on the background environment context, which cannot tackle the problem of torso, limb, hand or face distortion caused by perspective camera projection when the camera is close to the human body.

3D Reconstruction

Background Matters: Enhancing Out-of-distribution Detection with Domain Features

no code implementations 15 Mar 2023 Choubo Ding, Guansong Pang, Chunhua Shen

To this end, we propose a novel generic framework that can learn the domain features from the ID training samples by a dense prediction approach, with which different existing semantic-feature-based OOD detection methods can be seamlessly combined to jointly learn the in-distribution features from both the semantic and domain dimensions.

Object Recognition Out-of-Distribution Detection

Traffic Scene Parsing through the TSP6K Dataset

no code implementations 6 Mar 2023 Peng-Tao Jiang, YuQi Yang, Yang Cao, Qibin Hou, Ming-Ming Cheng, Chunhua Shen

Traffic scene parsing is one of the most important tasks to achieve intelligent cities.

Scene Parsing

A Survey on Efficient Training of Transformers

no code implementations 2 Feb 2023 Bohan Zhuang, Jing Liu, Zizheng Pan, Haoyu He, Yuetian Weng, Chunhua Shen

Recent advances in Transformers have come with a huge requirement on computing resources, highlighting the importance of developing efficient training techniques to make Transformer training faster, at lower cost, and to higher accuracy by the efficient use of computation and memory resources.

SPTS v2: Single-Point Scene Text Spotting

2 code implementations 4 Jan 2023 Yuliang Liu, Jiaxin Zhang, Dezhi Peng, Mingxin Huang, Xinyu Wang, Jingqun Tang, Can Huang, Dahua Lin, Chunhua Shen, Xiang Bai, Lianwen Jin

End-to-end scene text spotting has made significant progress due to its intrinsic synergy between text detection and recognition.

Text Spotting

Images Speak in Images: A Generalist Painter for In-Context Visual Learning

1 code implementation CVPR 2023 Xinlong Wang, Wen Wang, Yue Cao, Chunhua Shen, Tiejun Huang

In this work, we present Painter, a generalist model which addresses these obstacles with an "image"-centric solution, that is, to redefine the output of core vision tasks as images, and specify task prompts as also images.

Keypoint Detection Personalized Segmentation +1

FoPro: Few-Shot Guided Robust Webly-Supervised Prototypical Learning

1 code implementation 1 Dec 2022 Yulei Qin, Xingyu Chen, Chao Chen, Yunhang Shen, Bo Ren, Yun Gu, Jie Yang, Chunhua Shen

Most existing methods focus on learning noise-robust models from web images while neglecting the performance drop caused by the differences between web domain and real-world domain.

Contrastive Learning Representation Learning

Learning from partially labeled data for multi-organ and tumor segmentation

no code implementations 13 Nov 2022 Yutong Xie, Jianpeng Zhang, Yong Xia, Chunhua Shen

To address this, we propose a Transformer based dynamic on-demand network (TransDoDNet) that learns to segment organs and tumors on multiple partially labeled datasets.

Image Segmentation Medical Image Segmentation +2

Adv-Attribute: Inconspicuous and Transferable Adversarial Attack on Face Recognition

no code implementations 13 Oct 2022 Shuai Jia, Bangjie Yin, Taiping Yao, Shouhong Ding, Chunhua Shen, Xiaokang Yang, Chao Ma

For face recognition attacks, existing methods typically generate l_p-norm perturbations on pixels, which, however, results in low attack transferability and high vulnerability to denoising defense models.

Adversarial Attack Denoising +1

SegViT: Semantic Segmentation with Plain Vision Transformers

1 code implementation 12 Oct 2022 BoWen Zhang, Zhi Tian, Quan Tang, Xiangxiang Chu, Xiaolin Wei, Chunhua Shen, Yifan Liu

We explore the capability of plain Vision Transformers (ViTs) for semantic segmentation and propose SegViT.

Semantic Segmentation

Text-Adaptive Multiple Visual Prototype Matching for Video-Text Retrieval

no code implementations 27 Sep 2022 Chengzhi Lin, AnCong Wu, Junwei Liang, Jun Zhang, Wenhang Ge, Wei-Shi Zheng, Chunhua Shen

To address this problem, we propose a Text-Adaptive Multiple Visual Prototype Matching model, which automatically captures multiple prototypes to describe a video by adaptive aggregation of video token features.

Cross-Modal Retrieval Retrieval +2

Multi-dataset Training of Transformers for Robust Action Recognition

1 code implementation 26 Sep 2022 Junwei Liang, Enwei Zhang, Jun Zhang, Chunhua Shen

We study the task of robust feature representations, aiming to generalize well on multiple datasets for action recognition.

Action Recognition Temporal Action Localization

Towards Accurate Reconstruction of 3D Scene Shape from A Single Monocular Image

1 code implementation 28 Aug 2022 Wei Yin, Jianming Zhang, Oliver Wang, Simon Niklaus, Simon Chen, Yifan Liu, Chunhua Shen

To do so, we propose a two-stage framework that first predicts depth up to an unknown scale and shift from a single monocular image, and then exploits 3D point cloud data to predict the depth shift and the camera's focal length that allow us to recover 3D scene shapes.

Depth Estimation Depth Prediction

Towards Domain-agnostic Depth Completion

1 code implementation 29 Jul 2022 Wei Yin, Jianming Zhang, Oliver Wang, Simon Niklaus, Simon Chen, Chunhua Shen

Our method leverages a data driven prior in the form of a single image depth prediction network trained on large-scale datasets, the output of which is used as an input to our model.

Depth Completion Depth Estimation +2

Real-time End-to-End Video Text Spotter with Contrastive Representation Learning

no code implementations 18 Jul 2022 Wejia Wu, Zhuang Li, Jiahong Li, Chunhua Shen, Hong Zhou, Size Li, Zhongyuan Wang, Ping Luo

Our contributions are three-fold: 1) CoText simultaneously addresses the three tasks (e.g., text detection, tracking, recognition) in a real-time end-to-end trainable framework.

Contrastive Learning Representation Learning +1

Efficient Decoder-free Object Detection with Transformers

1 code implementation 14 Jun 2022 Peixian Chen, Mengdan Zhang, Yunhang Shen, Kekai Sheng, Yuting Gao, Xing Sun, Ke Li, Chunhua Shen

A natural usage of ViTs in detection is to replace the CNN-based backbone with a transformer-based backbone, which is straightforward and effective, with the price of bringing considerable computation burden for inference.

Object Detection

Fully Convolutional One-Stage 3D Object Detection on LiDAR Range Images

no code implementations 27 May 2022 Zhi Tian, Xiangxiang Chu, Xiaoming Wang, Xiaolin Wei, Chunhua Shen

In this work, we tackle this challenging issue with a novel range view projection mechanism, and for the first time demonstrate the benefits of fusing multi-frame point clouds for a range-view based detector.

3D Object Detection Autonomous Driving +2

Super Vision Transformer

1 code implementation 23 May 2022 Mingbao Lin, Mengzhao Chen, Yuxin Zhang, Chunhua Shen, Rongrong Ji, Liujuan Cao

Experimental results on ImageNet demonstrate that our SuperViT can considerably reduce the computational costs of ViT models while even increasing performance.

PointInst3D: Segmenting 3D Instances by Points

no code implementations 25 Apr 2022 Tong He, Wei Yin, Chunhua Shen, Anton Van Den Hengel

The current state-of-the-art methods in 3D instance segmentation typically involve a clustering step, despite the tendency towards heuristics, greedy algorithms, and a lack of robustness to the changes in data statistics.

3D Instance Segmentation Semantic Segmentation

TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation

3 code implementations CVPR 2022 Wenqiang Zhang, Zilong Huang, Guozhong Luo, Tao Chen, Xinggang Wang, Wenyu Liu, Gang Yu, Chunhua Shen

Although vision transformers (ViTs) have achieved great success in computer vision, the heavy computational cost hampers their applications to dense prediction tasks such as semantic segmentation on mobile devices.

Semantic Segmentation

Improving Monocular Visual Odometry Using Learned Depth

no code implementations 4 Apr 2022 Libo Sun, Wei Yin, Enze Xie, Zhengrong Li, Changming Sun, Chunhua Shen

The core of our framework is a monocular depth estimation module with a strong generalization capability for diverse scenes.

Monocular Depth Estimation Monocular Visual Odometry

Catching Both Gray and Black Swans: Open-set Supervised Anomaly Detection

1 code implementation CVPR 2022 Choubo Ding, Guansong Pang, Chunhua Shen

Although most existing anomaly detection studies assume the availability of normal training samples only, a few labeled anomaly examples are often available in many real-world applications, such as defect samples identified during random quality inspection, lesion images confirmed by radiologists in daily medical screening, etc.

Ranked #2 on supervised anomaly detection on MVTec AD (using extra training data)

supervised anomaly detection

End-to-End Video Text Spotting with Transformer

1 code implementation 20 Mar 2022 Weijia Wu, Yuanqiang Cai, Chunhua Shen, Debing Zhang, Ying Fu, Hong Zhou, Ping Luo

Recent video text spotting methods usually require a three-stage pipeline, i.e., detecting text in individual images, recognizing localized text, and tracking text streams with post-processing to generate final results.

Text Spotting

PointAttN: You Only Need Attention for Point Cloud Completion

1 code implementation 16 Mar 2022 Jun Wang, Ying Cui, Dongyan Guo, Junxia Li, Qingshan Liu, Chunhua Shen

To solve these problems, we leverage the cross-attention and self-attention mechanisms to design a novel neural network that processes point clouds in a per-point manner, eliminating kNNs.

Point Cloud Completion

Training Protocol Matters: Towards Accurate Scene Text Recognition via Training Protocol Searching

2 code implementations 13 Mar 2022 Xiaojie Chu, Yongtao Wang, Chunhua Shen, Jingdong Chen, Wei Chu

The development of scene text recognition (STR) in the era of deep learning has been mainly focused on novel architectures of STR models.

Scene Text Recognition

FreeSOLO: Learning to Segment Objects without Annotations

1 code implementation CVPR 2022 Xinlong Wang, Zhiding Yu, Shalini De Mello, Jan Kautz, Anima Anandkumar, Chunhua Shen, Jose M. Alvarez

FreeSOLO further demonstrates superiority as a strong pre-training method, outperforming state-of-the-art self-supervised pre-training methods by +9.8% AP when fine-tuning instance segmentation with only 5% COCO masks.

Instance Segmentation object-detection +3

The devil is in the labels: Semantic segmentation from sentences

no code implementations 4 Feb 2022 Wei Yin, Yifan Liu, Chunhua Shen, Anton Van Den Hengel, Baichuan Sun

The resulting merged semantic segmentation dataset of over 2 Million images enables training a model that achieves performance equal to that of state-of-the-art supervised methods on 7 benchmark datasets, despite not using any images therefrom.

Instance Segmentation Monocular Depth Estimation +1

Towards 3D Scene Reconstruction from Locally Scale-Aligned Monocular Video Depth

1 code implementation 3 Feb 2022 Guangkai Xu, Wei Yin, Hao Chen, Chunhua Shen, Kai Cheng, Feng Wu, Feng Zhao

However, in some video-based scenarios such as video depth estimation and 3D scene reconstruction from a video, the unknown scale and shift residing in per-frame prediction may cause the depth inconsistency.

3D Scene Reconstruction Depth Completion +1

DENSE: Data-Free One-Shot Federated Learning

1 code implementation 23 Dec 2021 Jie Zhang, Chen Chen, Bo Li, Lingjuan Lyu, Shuang Wu, Shouhong Ding, Chunhua Shen, Chao Wu

One-shot Federated Learning (FL) has recently emerged as a promising approach, which allows the central server to learn a model in a single communication round.

Federated Learning

SPTS: Single-Point Text Spotting

1 code implementation 15 Dec 2021 Dezhi Peng, Xinyu Wang, Yuliang Liu, Jiaxin Zhang, Mingxin Huang, Songxuan Lai, Shenggao Zhu, Jing Li, Dahua Lin, Chunhua Shen, Xiang Bai, Lianwen Jin

For the first time, we demonstrate that training scene text spotting models can be achieved with an extremely low-cost annotation of a single-point for each instance.

Language Modelling Text Spotting

NAS-FCOS: Efficient Search for Object Detection Architectures

1 code implementation 24 Oct 2021 Ning Wang, Yang Gao, Hao Chen, Peng Wang, Zhi Tian, Chunhua Shen, Yanning Zhang

Neural Architecture Search (NAS) has shown great potential in effectively reducing manual effort in network design by automatically discovering optimal architectures.

Neural Architecture Search object-detection +1

TSGB: Target-Selective Gradient Backprop for Probing CNN Visual Saliency

1 code implementation 11 Oct 2021 Lin Cheng, Pengfei Fang, Yanjie Liang, Liao Zhang, Chunhua Shen, Hanzi Wang

Inspired by those observations, we propose a novel visual saliency method, termed Target-Selective Gradient Backprop (TSGB), which leverages rectification operations to effectively emphasize target classes and further efficiently propagate the saliency to the image space, thereby generating target-selective and fine-grained saliency maps.

Meta Navigator: Search for a Good Adaptation Policy for Few-shot Learning

no code implementations ICCV 2021 Chi Zhang, Henghui Ding, Guosheng Lin, Ruibo Li, Changhu Wang, Chunhua Shen

Inspired by the recent success in Automated Machine Learning literature (AutoML), in this paper, we present Meta Navigator, a framework that attempts to solve the aforementioned limitation in few-shot learning by seeking a higher-level strategy and proffer to automate the selection from various few-shot learning designs.

AutoML Few-Shot Learning

Explainable Deep Few-shot Anomaly Detection with Deviation Networks

1 code implementation 1 Aug 2021 Guansong Pang, Choubo Ding, Chunhua Shen, Anton Van Den Hengel

Here, we study the problem of few-shot anomaly detection, in which we aim at using a few labeled anomaly examples to train sample-efficient discriminative detection models.

Ranked #3 on supervised anomaly detection on MVTec AD (using extra training data)

Multiple Instance Learning supervised anomaly detection

Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation

1 code implementation NeurIPS 2021 BoWen Zhang, Yifan Liu, Zhi Tian, Chunhua Shen

This neural representation enables our decoder to leverage the smoothness prior in the semantic label space, and thus makes our decoder more efficient.

Semantic Segmentation Vocal Bursts Intensity Prediction

Dynamic Convolution for 3D Point Cloud Instance Segmentation

1 code implementation 18 Jul 2021 Tong He, Chunhua Shen, Anton Van Den Hengel

The proposed approach is proposal-free, and instead exploits a convolution process that adapts to the spatial and semantic characteristics of each instance.

Instance Segmentation Semantic Segmentation

SOLO: A Simple Framework for Instance Segmentation

no code implementations 30 Jun 2021 Xinlong Wang, Rufeng Zhang, Chunhua Shen, Tao Kong, Lei LI

Besides instance segmentation, our method yields state-of-the-art results in object detection (from our mask byproduct) and panoptic segmentation.

Image Matting Instance Segmentation +3

Learning Spatial-Semantic Relationship for Facial Attribute Recognition With Limited Labeled Data

no code implementations CVPR 2021 Ying Shu, Yan Yan, Si Chen, Jing-Hao Xue, Chunhua Shen, Hanzi Wang

First, three auxiliary tasks, consisting of a Patch Rotation Task (PRT), a Patch Segmentation Task (PST), and a Patch Classification Task (PCT), are jointly developed to learn the spatial-semantic relationship from large-scale unlabeled facial data.

Self-Supervised Learning

Unsupervised Scale-consistent Depth Learning from Video

2 code implementations 25 May 2021 Jia-Wang Bian, Huangying Zhan, Naiyan Wang, Zhichao Li, Le Zhang, Chunhua Shen, Ming-Ming Cheng, Ian Reid

We propose a monocular depth estimator SC-Depth, which requires only unlabelled videos for training and enables the scale-consistent prediction at inference time.

Monocular Depth Estimation Monocular Visual Odometry +1

ABCNet v2: Adaptive Bezier-Curve Network for Real-time End-to-end Text Spotting

1 code implementation 8 May 2021 Yuliang Liu, Chunhua Shen, Lianwen Jin, Tong He, Peng Chen, Chongyu Liu, Hao Chen

Previous methods can be roughly categorized into two groups: character-based and segmentation-based, which often require character-level annotations and/or complex post-processing due to the unstructured output.

Text Spotting

PAN++: Towards Efficient and Accurate End-to-End Spotting of Arbitrarily-Shaped Text

1 code implementation 2 May 2021 Wenhai Wang, Enze Xie, Xiang Li, Xuebo Liu, Ding Liang, Zhibo Yang, Tong Lu, Chunhua Shen

By systematically comparing with existing scene text representations, we show that our kernel representation can not only describe arbitrarily-shaped text but also well distinguish adjacent text.

Scene Text Detection Text Spotting

Twins: Revisiting the Design of Spatial Attention in Vision Transformers

8 code implementations NeurIPS 2021 Xiangxiang Chu, Zhi Tian, Yuqing Wang, Bo Zhang, Haibing Ren, Xiaolin Wei, Huaxia Xia, Chunhua Shen

Very recently, a variety of vision transformer architectures for dense prediction tasks have been proposed and they show that the design of spatial attention is critical to their success in these tasks.

Image Classification Semantic Segmentation

DisCo: Remedy Self-supervised Learning on Lightweight Models with Distilled Contrastive Learning

2 code implementations 19 Apr 2021 Yuting Gao, Jia-Xin Zhuang, Shaohui Lin, Hao Cheng, Xing Sun, Ke Li, Chunhua Shen

Specifically, we find the final embedding obtained by the mainstream SSL methods contains the most fruitful information, and propose to distill the final embedding to maximally transmit a teacher's knowledge to a lightweight model by constraining the last embedding of the student to be consistent with that of the teacher.

Contrastive Learning Representation Learning +1

A Simple Baseline for Semi-supervised Semantic Segmentation with Strong Data Augmentation

1 code implementation ICCV 2021 Jianlong Yuan, Yifan Liu, Chunhua Shen, Zhibin Wang, Hao Li

Previous works [3, 27] fail to employ strong augmentation in pseudo label learning efficiently, as the large distribution change caused by strong augmentation harms the batch normalisation statistics.

Data Augmentation Image Classification +2

Feature Decomposition and Reconstruction Learning for Effective Facial Expression Recognition

no code implementations CVPR 2021 Delian Ruan, Yan Yan, Shenqi Lai, Zhenhua Chai, Chunhua Shen, Hanzi Wang

In this paper, we propose a novel Feature Decomposition and Reconstruction Learning (FDRL) method for effective facial expression recognition.

Facial Expression Recognition (FER)

An Adversarial Human Pose Estimation Network Injected with Graph Structure

no code implementations 29 Mar 2021 Lei Tian, Guoqiang Liang, Peng Wang, Chunhua Shen

Because of the invisible human keypoints in images caused by illumination, occlusion and overlap, it is likely to produce unreasonable human pose prediction for most of the current human pose estimation methods.

Pose Estimation Pose Prediction

Generic Perceptual Loss for Modeling Structured Output Dependencies

no code implementations CVPR 2021 Yifan Liu, Hao Chen, Yu Chen, Wei Yin, Chunhua Shen

We hope that this simple, extended perceptual loss may serve as a generic structured-output loss that is applicable to most structured output learning tasks.

Depth Estimation Image Generation +4

FastFlowNet: A Lightweight Network for Fast Optical Flow Estimation

2 code implementations 8 Mar 2021 Lingtong Kong, Chunhua Shen, Jie Yang

Experiments on both synthetic Sintel data and real-world KITTI datasets demonstrate the effectiveness of the proposed approach, which needs only 1/10 computation of comparable networks to achieve on par accuracy.

Optical Flow Estimation

CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image Segmentation

1 code implementation 4 Mar 2021 Yutong Xie, Jianpeng Zhang, Chunhua Shen, Yong Xia

Convolutional neural networks (CNNs) have been the de facto standard for 3D medical image segmentation.

Image Segmentation Inductive Bias +3

Instance and Panoptic Segmentation Using Conditional Convolutions

no code implementations 5 Feb 2021 Zhi Tian, BoWen Zhang, Hao Chen, Chunhua Shen

In the literature, top-performing instance segmentation methods typically follow the paradigm of Mask R-CNN and rely on ROI operations (typically ROIAlign) to attend to each instance.

Instance Segmentation Panoptic Segmentation

Object Detection Made Simpler by Eliminating Heuristic NMS

no code implementations 28 Jan 2021 Qiang Zhou, Chaohui Yu, Chunhua Shen, Zhibin Wang, Hao Li

On the COCO dataset, our simple design achieves superior performance compared to both the FCOS baseline detector with NMS post-processing and the recent end-to-end NMS-free detectors.

object-detection Object Detection

Multi-intersection Traffic Optimisation: A Benchmark Dataset and a Strong Baseline

no code implementations 24 Jan 2021 Hu Wang, Hao Chen, Qi Wu, Congbo Ma, Yidong Li, Chunhua Shen

To address these issues, in this work we carefully design our settings and propose a new dataset including both synthetic and real traffic data in more complex scenarios.

Single-path Bit Sharing for Automatic Loss-aware Model Compression

no code implementations 13 Jan 2021 Jing Liu, Bohan Zhuang, Peng Chen, Chunhua Shen, Jianfei Cai, Mingkui Tan

By jointly training the binary gates in conjunction with network parameters, the compression configurations of each layer can be automatically determined.

Model Compression Network Pruning +1

BV-Person: A Large-Scale Dataset for Bird-View Person Re-Identification

no code implementations ICCV 2021 Cheng Yan, Guansong Pang, Lei Wang, Jile Jiao, Xuetao Feng, Chunhua Shen, Jingjing Li

In this work we introduce a new ReID task, bird-view person ReID, which aims at searching for a person in a gallery of horizontal-view images with the query images taken from a bird's-eye view, i.e., an elevated view of an object from above.

Person Re-Identification

Occluded Person Re-Identification With Single-Scale Global Representations

no code implementations ICCV 2021 Cheng Yan, Guansong Pang, Jile Jiao, Xiao Bai, Xuetao Feng, Chunhua Shen

However, real-world ReID applications typically have highly diverse occlusions and involve a hybrid of occluded and non-occluded pedestrians.

Graph Matching Person Re-Identification +1

Memory-Efficient Hierarchical Neural Architecture Search for Image Restoration

1 code implementation 24 Dec 2020 Haokui Zhang, Ying Li, Hao Chen, Chengrong Gong, Zongwen Bai, Chunhua Shen

For the inner search space, we propose a layer-wise architecture sharing strategy (LWAS), resulting in more flexible architectures and better performance.

Image Denoising Image Restoration +2

Learning to Recover 3D Scene Shape from a Single Image

1 code implementation CVPR 2021 Wei Yin, Jianming Zhang, Oliver Wang, Simon Niklaus, Long Mai, Simon Chen, Chunhua Shen

Despite significant progress in monocular depth estimation in the wild, recent state-of-the-art methods cannot be used to recover accurate 3D scene shape due to an unknown depth shift induced by shift-invariant reconstruction losses used in mixed-data depth prediction training, and possible unknown camera focal length.

3D Scene Reconstruction Depth Prediction +3

Hyperspectral Classification Based on Lightweight 3-D-CNN With Transfer Learning

1 code implementation 7 Dec 2020 Haokui Zhang, Ying Li, Yenan Jiang, Peng Wang, Qiang Shen, Chunhua Shen

In contrast to previous approaches, we do not impose restrictions on the source data sets: they do not have to be collected by the same sensors as the target data sets.

Classification General Classification +1

End-to-End Video Instance Segmentation with Transformers

3 code implementations CVPR 2021 Yuqing Wang, Zhaoliang Xu, Xinlong Wang, Chunhua Shen, Baoshan Cheng, Hao Shen, Huaxia Xia

Here, we propose a new video instance segmentation framework built upon Transformers, termed VisTR, which views the VIS task as a direct end-to-end parallel sequence decoding/prediction problem.

Instance Segmentation Semantic Segmentation +2

Fully Quantized Image Super-Resolution Networks

1 code implementation 29 Nov 2020 Hu Wang, Peng Chen, Bohan Zhuang, Chunhua Shen

With the rising popularity of intelligent mobile devices, it is of great practical significance to develop accurate, real-time and energy-efficient image Super-Resolution (SR) inference methods.

Image Super-Resolution Quantization

Learning Affinity-Aware Upsampling for Deep Image Matting

1 code implementation CVPR 2021 Yutong Dai, Hao Lu, Chunhua Shen

By looking at existing upsampling operators from a unified mathematical perspective, we generalize them into a second-order form and introduce Affinity-Aware Upsampling (A2U) where upsampling kernels are generated using a light-weight low-rank bilinear model and are conditioned on second-order features.

Image Matting Image Reconstruction

Channel-wise Knowledge Distillation for Dense Prediction

2 code implementations ICCV 2021 Changyong Shu, Yifan Liu, Jianfei Gao, Zheng Yan, Chunhua Shen

Observing that in semantic segmentation, some layers' feature activations of each channel tend to encode saliency of scene categories (analogous to class activation mapping), we propose to align features channel-wise between the student and teacher networks.

Knowledge Distillation Semantic Segmentation

DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution

1 code implementation CVPR 2021 Tong He, Chunhua Shen, Anton Van Den Hengel

Previous top-performing approaches for point cloud instance segmentation involve a bottom-up strategy, which often includes inefficient operations or complex pipelines, such as grouping over-segmented components, introducing additional steps for refining, or designing complicated loss functions.

Instance Segmentation Semantic Segmentation

PGL: Prior-Guided Local Self-supervised Learning for 3D Medical Image Segmentation

no code implementations 25 Nov 2020 Yutong Xie, Jianpeng Zhang, Zehui Liao, Yong Xia, Chunhua Shen

In this paper, we propose a Prior-Guided Local (PGL) self-supervised model that learns the region-wise local consistency in the latent feature space.

Image Segmentation Medical Image Segmentation +2

Graph Attention Tracking

no code implementations CVPR 2021 Dongyan Guo, Yanyan Shao, Ying Cui, Zhenhua Wang, Liyan Zhang, Chunhua Shen

We propose to establish part-to-part correspondence between the target and the search region with a complete bipartite graph, and apply the graph attention mechanism to propagate target information from the template feature to the search feature.

Graph Attention Object Tracking +1

Robust Data Hiding Using Inverse Gradient Attention

1 code implementation 21 Nov 2020 Honglei Zhang, Hu Wang, Yuanzhouhan Cao, Chunhua Shen, Yidong Li

In deep data hiding models, to maximize the encoding capacity, each pixel of the cover image ought to be treated differently since they have different sensitivities w.r.t.

DoDNet: Learning to segment multi-organ and tumors from multiple partially labeled datasets

1 code implementation CVPR 2021 Jianpeng Zhang, Yutong Xie, Yong Xia, Chunhua Shen

To address this, we propose a dynamic on-demand network (DoDNet) that learns to segment multiple organs and tumors on partially labeled datasets.

Image Segmentation Medical Image Segmentation +2

Unifying Instance and Panoptic Segmentation with Dynamic Rank-1 Convolutions

no code implementations 19 Nov 2020 Hao Chen, Chunhua Shen, Zhi Tian

To our knowledge, DR1Mask is the first panoptic segmentation framework that exploits a shared feature map for both instance and semantic segmentation by considering both efficacy and efficiency.

Instance Segmentation Multi-Task Learning +1

Dense Contrastive Learning for Self-Supervised Visual Pre-Training

6 code implementations CVPR 2021 Xinlong Wang, Rufeng Zhang, Chunhua Shen, Tao Kong, Lei LI

Compared to the baseline method MoCo-v2, our method introduces negligible computation overhead (only <1% slower), but demonstrates consistently superior performance when transferring to downstream dense prediction tasks including object detection, semantic segmentation and instance segmentation; and outperforms the state-of-the-art methods by a large margin.

Contrastive Learning Image Classification +5

FATNN: Fast and Accurate Ternary Neural Networks

no code implementations ICCV 2021 Peng Chen, Bohan Zhuang, Chunhua Shen

Ternary Neural Networks (TNNs) have received much attention due to being potentially orders of magnitude faster in inference, as well as more power efficient, than full-precision counterparts.

Image Classification Quantization

Representative Graph Neural Network

no code implementations ECCV 2020 Changqian Yu, Yifan Liu, Changxin Gao, Chunhua Shen, Nong Sang

In this paper, we present a Representative Graph (RepGraph) layer to dynamically sample a few representative features, which dramatically reduces redundancy.

object-detection Object Detection +1

Pairwise Relation Learning for Semi-supervised Gland Segmentation

no code implementations 6 Aug 2020 Yutong Xie, Jianpeng Zhang, Zhibin Liao, Chunhua Shen, Johan Verjans, Yong Xia

In this paper, we propose the pairwise relation-based semi-supervised (PRS^2) model for gland segmentation on histology images.

AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting

2 code implementations ECCV 2020 Wenhai Wang, Xuebo Liu, Xiaozhong Ji, Enze Xie, Ding Liang, Zhibo Yang, Tong Lu, Chunhua Shen, Ping Luo

Unlike previous works that merely employed visual features for text detection, this work proposes a novel text spotter, named Ambiguity Eliminating Text Spotter (AE TextSpotter), which learns both visual and linguistic features to significantly reduce ambiguity in text detection.

Language Modelling Text Spotting

Improving Generative Adversarial Networks with Local Coordinate Coding

1 code implementation 28 Jul 2020 Jiezhang Cao, Yong Guo, Qingyao Wu, Chunhua Shen, Junzhou Huang, Mingkui Tan

In this paper, rather than sampling from the predefined prior distribution, we propose an LCCGAN model with local coordinate coding (LCC) to improve the performance of generating data.

Soft Expert Reward Learning for Vision-and-Language Navigation

no code implementations ECCV 2020 Hu Wang, Qi Wu, Chunhua Shen

In this paper, we introduce a Soft Expert Reward Learning (SERL) model to overcome the reward engineering and generalisation problems of the VLN task.

Reinforcement Learning (RL) Vision and Language Navigation

AQD: Towards Accurate Quantized Object Detection

no code implementations CVPR 2021 Peng Chen, Jing Liu, Bohan Zhuang, Mingkui Tan, Chunhua Shen

Network quantization allows inference to be conducted using low-precision arithmetic for improved inference efficiency of deep neural networks on edge devices.

Image Classification object-detection +2

Deep Learning for Anomaly Detection: A Review

no code implementations 6 Jul 2020 Guansong Pang, Chunhua Shen, Longbing Cao, Anton Van Den Hengel

This paper surveys the research of deep anomaly detection with a comprehensive taxonomy, covering advancements in three high-level categories and 11 fine-grained categories of the methods.

Anomaly Detection Outlier Detection

FCOS: A simple and strong anchor-free object detector

no code implementations 14 Jun 2020 Zhi Tian, Chunhua Shen, Hao Chen, Tong He

In computer vision, object detection is one of the most important tasks, which underpins a few instance-level recognition tasks and many downstream applications.

Object Detection Semantic Segmentation

A Robust Attentional Framework for License Plate Recognition in the Wild

no code implementations 6 Jun 2020 Linjiang Zhang, Peng Wang, Hui Li, Zhen Li, Chunhua Shen, Yanning Zhang

On the other hand, the 2D attentional based license plate recognizer with an Xception-based CNN encoder is capable of recognizing license plates with different patterns under various scenarios accurately and robustly.

Image Generation License Plate Recognition

Auto-Rectify Network for Unsupervised Indoor Depth Estimation

1 code implementation 4 Jun 2020 Jia-Wang Bian, Huangying Zhan, Naiyan Wang, Tat-Jun Chin, Chunhua Shen, Ian Reid

However, excellent results have mostly been obtained in street-scene driving scenarios, and such methods often fail in other settings, particularly indoor videos taken by handheld devices.

Monocular Depth Estimation Self-Supervised Learning +1

Scope Head for Accurate Localization in Object Detection

no code implementations 11 May 2020 Geng Zhan, Dan Xu, Guo Lu, Wei Wu, Chunhua Shen, Wanli Ouyang

Existing anchor-based and anchor-free object detectors in multi-stage or one-stage pipelines have achieved very promising detection performance.

object-detection Object Detection +1

BiSeNet V2: Bilateral Network with Guided Aggregation for Real-time Semantic Segmentation

6 code implementations 5 Apr 2020 Changqian Yu, Changxin Gao, Jingbo Wang, Gang Yu, Chunhua Shen, Nong Sang

We propose to treat these spatial details and categorical semantics separately to achieve high accuracy and high efficiency for real-time semantic segmentation.

Real-Time Semantic Segmentation

Context Prior for Scene Segmentation

2 code implementations CVPR 2020 Changqian Yu, Jingbo Wang, Changxin Gao, Gang Yu, Chunhua Shen, Nong Sang

Given an input image and corresponding ground truth, Affinity Loss constructs an ideal affinity map to supervise the learning of Context Prior.

Scene Segmentation Scene Understanding
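
As a rough illustration of the Context Prior snippet above, an "ideal affinity map" can be pictured as a binary matrix that marks every pair of pixels sharing a ground-truth class. The sketch below is only that picture in pure Python. The function name `ideal_affinity_map` and the flattened-label input format are assumptions made for illustration, not the paper's implementation.

```python
def ideal_affinity_map(labels):
    """Return an N x N binary map: entry (i, j) is 1 when pixels i and j share a class.

    `labels` is a flat list of per-pixel class ids (hypothetical input format).
    """
    n = len(labels)
    return [[1 if labels[i] == labels[j] else 0 for j in range(n)] for i in range(n)]


# A 2x2 label image flattened row by row: two pixels of class 0, two of class 1.
flat_labels = [0, 0, 1, 1]
affinity = ideal_affinity_map(flat_labels)
```

The resulting map is symmetric with ones on the diagonal, so a predicted affinity map could in principle be compared against it entry by entry.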

Segmenting Transparent Objects in the Wild

1 code implementation ECCV 2020 Enze Xie, Wenjia Wang, Wenhai Wang, Mingyu Ding, Chunhua Shen, Ping Luo

To address this important problem, this work proposes a large-scale dataset for transparent object segmentation, named Trans10K, consisting of 10,428 images of real scenarios with careful manual annotations, which is 10 times larger than the existing datasets.

Semantic Segmentation Transparent objects

Viral Pneumonia Screening on Chest X-ray Images Using Confidence-Aware Anomaly Detection

1 code implementation 27 Mar 2020 Jianpeng Zhang, Yutong Xie, Guansong Pang, Zhibin Liao, Johan Verjans, Wenxin Li, Zongji Sun, Jian He, Yi Li, Chunhua Shen, Yong Xia

In this paper, we formulate the task of differentiating viral pneumonia from non-viral pneumonia and healthy controls as a one-class classification-based anomaly detection problem, and thus propose the confidence-aware anomaly detection (CAAD) model, which consists of a shared feature extractor, an anomaly detection module, and a confidence prediction module.

Binary Classification Classification +2

SOLOv2: Dynamic and Fast Instance Segmentation

18 code implementations NeurIPS 2020 Xinlong Wang, Rufeng Zhang, Tao Kong, Lei LI, Chunhua Shen

Importantly, we take one step further by dynamically learning the mask head of the object segmenter such that the mask head is conditioned on the location.

object-detection Object Detection +3

DeepEMD: Differentiable Earth Mover's Distance for Few-Shot Learning

3 code implementations 15 Mar 2020 Chi Zhang, Yujun Cai, Guosheng Lin, Chunhua Shen

We employ the Earth Mover's Distance (EMD) as a metric to compute a structural distance between dense image representations to determine image relevance.

Classification Few-Shot Image Classification +4

Self-trained Deep Ordinal Regression for End-to-End Video Anomaly Detection

no code implementations CVPR 2020 Guansong Pang, Cheng Yan, Chunhua Shen, Anton Van Den Hengel, Xiao Bai

Video anomaly detection is of critical practical importance to a variety of real applications because it allows human attention to be focused on events that are likely to be of interest, in spite of an otherwise overwhelming volume of video.

Anomaly Detection regression +2

Conditional Convolutions for Instance Segmentation

7 code implementations ECCV 2020 Zhi Tian, Chunhua Shen, Hao Chen

We propose a simple yet effective instance segmentation framework, termed CondInst (conditional convolutions for instance segmentation).

Instance Segmentation Semantic Segmentation

Real-Time High-Performance Semantic Image Segmentation of Urban Street Scenes

no code implementations 11 Mar 2020 Genshun Dong, Yan Yan, Chunhua Shen, Hanzi Wang

Meanwhile, a Spatial detail-Preserving Network (SPN) with shallow convolutional layers is designed to generate high-resolution feature maps preserving the detailed spatial information.

Image Segmentation Semantic Segmentation +1

Efficient Semantic Video Segmentation with Per-frame Inference

1 code implementation ECCV 2020 Yifan Liu, Chunhua Shen, Changqian Yu, Jingdong Wang

For semantic segmentation, most existing real-time deep models trained with each frame independently may produce inconsistent results for a video sequence.

Knowledge Distillation Optical Flow Estimation +3

ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network

7 code implementations CVPR 2020 Yuliang Liu, Hao Chen, Chunhua Shen, Tong He, Lianwen Jin, Liangwei Wang

Our contributions are three-fold: 1) For the first time, we adaptively fit arbitrarily-shaped text by a parameterized Bezier curve.

Scene Text Detection Text Spotting
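
To make the "parameterized Bezier curve" in the ABCNet snippet above concrete, the sketch below evaluates a Bezier curve from its control points using Bernstein polynomials. The cubic degree, the function names `bezier_point` and `sample_curve`, and the example control points are assumptions for illustration. It shows only curve evaluation, not how the paper fits curves to text.

```python
def bezier_point(control_points, t):
    """Evaluate a cubic Bezier curve at t in [0, 1] using Bernstein polynomials."""
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = control_points
    b0 = (1 - t) ** 3
    b1 = 3 * t * (1 - t) ** 2
    b2 = 3 * t ** 2 * (1 - t)
    b3 = t ** 3
    x = b0 * x0 + b1 * x1 + b2 * x2 + b3 * x3
    y = b0 * y0 + b1 * y1 + b2 * y2 + b3 * y3
    return (x, y)


def sample_curve(control_points, num_points=5):
    """Sample evenly spaced parameter values along the curve, endpoints included."""
    return [bezier_point(control_points, i / (num_points - 1)) for i in range(num_points)]


# Hypothetical control points for one curved text boundary.
controls = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]
curve = sample_curve(controls)
```

Four control points are enough to describe a smooth arc, which is why a curved boundary can be stored far more compactly than a dense polygon.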

Joint Deep Learning of Facial Expression Synthesis and Recognition

no code implementations 6 Feb 2020 Yan Yan, Ying Huang, Si Chen, Chunhua Shen, Hanzi Wang

Firstly, a facial expression synthesis generative adversarial network (FESGAN) is pre-trained to generate facial images with different facial expressions.

Facial Expression Recognition (FER)

DiverseDepth: Affine-invariant Depth Prediction Using Diverse Data

2 code implementations 3 Feb 2020 Wei Yin, Xinlong Wang, Chunhua Shen, Yifan Liu, Zhi Tian, Songcen Xu, Changming Sun, Dou Renyin

Compared with previous learning objectives, i.e., learning metric depth or relative depth, we propose to learn the affine-invariant depth using our diverse dataset to ensure both generalization and high-quality geometric shapes of scenes.

Depth Estimation Depth Prediction

Separating Content from Style Using Adversarial Learning for Recognizing Text in the Wild

no code implementations 13 Jan 2020 Canjie Luo, Qingxiang Lin, Yuliang Liu, Lianwen Jin, Chunhua Shen

Furthermore, to tackle the issue of lacking paired training samples, we design an interactive joint training scheme, which shares attention masks from the recognizer to the discriminator, and enables the discriminator to extract the features of each character for further adversarial training.

Style Transfer

Memorizing Comprehensively to Learn Adaptively: Unsupervised Cross-Domain Person Re-ID with Multi-level Memory

no code implementations 13 Jan 2020 Xin-Yu Zhang, Dong Gong, Jiewei Cao, Chunhua Shen

Due to the lack of supervision in the target domain, it is crucial to identify the underlying similarity-and-dissimilarity relationships among the unlabelled samples in the target domain.

Person Re-Identification

From Open Set to Closed Set: Supervised Spatial Divide-and-Conquer for Object Counting

3 code implementations 7 Jan 2020 Haipeng Xiong, Hao Lu, Chengxin Liu, Liang Liu, Chunhua Shen, Zhiguo Cao

Visual counting, a task that aims to estimate the number of objects from an image/video, is an open-set problem by nature, i.e., the object count can vary in [0, inf) in theory.

Object Counting

BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation

9 code implementations CVPR 2020 Hao Chen, Kunyang Sun, Zhi Tian, Chunhua Shen, Yongming Huang, Youliang Yan

The proposed BlendMask can effectively predict dense per-pixel position-sensitive instance features with very few channels, and learn attention maps for each instance with merely one convolution layer, thus being fast in inference.

Real-time Instance Segmentation Semantic Segmentation

Ordered or Orderless: A Revisit for Video based Person Re-Identification

no code implementations 24 Dec 2019 Le Zhang, Zenglin Shi, Joey Tianyi Zhou, Ming-Ming Cheng, Yun Liu, Jia-Wang Bian, Zeng Zeng, Chunhua Shen

Specifically, with a diagnostic analysis, we show that the recurrent structure may be less effective at learning temporal dependencies than expected and implicitly yields an orderless representation.

Video-Based Person Re-Identification

Unsupervised Representation Learning by Predicting Random Distances

2 code implementations 22 Dec 2019 Hu Wang, Guansong Pang, Chunhua Shen, Congbo Ma

To enable unsupervised learning on those domains, in this work we propose to learn features without using any labelled data by training neural networks to predict data distances in a randomly projected space.

Anomaly Detection Representation Learning
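
The snippet above describes training a network to predict data distances in a randomly projected space. The sketch below shows only how such distance targets might be constructed, with no network or training loop. The Gaussian projection, the Euclidean distance, and all names (`random_projection`, `project`, `distance_targets`) are assumptions for illustration rather than the paper's exact setup.

```python
import math
import random


def random_projection(dim_in, dim_out, rng):
    """Draw a fixed random matrix (Gaussian entries are an assumption here)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim_in)] for _ in range(dim_out)]


def project(matrix, x):
    """Multiply the projection matrix by the feature vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in matrix]


def distance_targets(data, matrix):
    """Map each unordered pair (i, j) to its distance in the projected space."""
    projected = [project(matrix, x) for x in data]
    targets = {}
    for i in range(len(data)):
        for j in range(i + 1, len(data)):
            targets[(i, j)] = math.dist(projected[i], projected[j])
    return targets


rng = random.Random(0)  # fixed seed so the projection is reproducible
data = [[0.0, 1.0, 2.0], [1.0, 0.0, 2.0], [0.0, 1.0, 2.0]]
matrix = random_projection(3, 2, rng)
targets = distance_targets(data, matrix)
```

No labels are involved at any point: the supervision signal comes entirely from the data and the fixed random matrix.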

Exploring the Capacity of an Orderless Box Discretization Network for Multi-orientation Scene Text Detection

1 code implementation 20 Dec 2019 Yuliang Liu, Tong He, Hao Chen, Xinyu Wang, Canjie Luo, Shuaitao Zhang, Chunhua Shen, Lianwen Jin

More importantly, based on OBD, we provide a detailed analysis of the impact of a collection of refinements, which may inspire others to build state-of-the-art text detectors.

Scene Text Detection

To Balance or Not to Balance: A Simple-yet-Effective Approach for Learning with Long-Tailed Distributions

no code implementations 10 Dec 2019 Jun-Jie Zhang, Lingqiao Liu, Peng Wang, Chunhua Shen

Such an imbalanced distribution poses a great challenge for learning a deep neural network, which can be boiled down to a dilemma: on the one hand, we prefer to increase the exposure of tail class samples to avoid the excessive dominance of head classes in the classifier training.

Auxiliary Learning Self-Supervised Learning

Unified Multifaceted Feature Learning for Person Re-Identification

no code implementations 20 Nov 2019 Cheng Yan, Guansong Pang, Xiao Bai, Chunhua Shen

The loss structures the augmented images resulting from the two types of image erasing in a two-level hierarchy and enforces multifaceted attention to different parts.

Person Re-Identification

Deep Anomaly Detection with Deviation Networks

6 code implementations 19 Nov 2019 Guansong Pang, Chunhua Shen, Anton Van Den Hengel

Instead of representation learning, our method fulfills an end-to-end learning of anomaly scores by a neural deviation learning, in which we leverage a few (e.g., multiple to dozens) labeled anomalies and a prior probability to enforce statistically significant deviations of the anomaly scores of anomalies from those of normal data objects in the upper tail.

Anomaly Detection Cyber Attack Detection +3

DirectPose: Direct End-to-End Multi-Person Pose Estimation

7 code implementations 18 Nov 2019 Zhi Tian, Hao Chen, Chunhua Shen

We propose the first direct end-to-end multi-person pose estimation framework, termed DirectPose.

Multi-Person Pose Estimation

Multi-marginal Wasserstein GAN

3 code implementations NeurIPS 2019 Jiezhang Cao, Langyuan Mo, Yifan Zhang, Kui Jia, Chunhua Shen, Mingkui Tan

The multiple marginal matching problem aims at learning mappings to match a source domain to multiple target domains, and it has attracted great attention in many applications, such as multi-domain image translation.

Image Generation Translation

Deep Weakly-supervised Anomaly Detection

3 code implementations 30 Oct 2019 Guansong Pang, Chunhua Shen, Huidong Jin, Anton Van Den Hengel

To detect both seen and unseen anomalies, we introduce a novel deep weakly-supervised approach, namely Pairwise Relation prediction Network (PReNet), that learns pairwise relation features and anomaly scores by predicting the relation of any two randomly sampled training instances, in which the pairwise relation can be anomaly-anomaly, anomaly-unlabeled, or unlabeled-unlabeled.

Semi-supervised Anomaly Detection supervised anomaly detection +1

PolarMask: Single Shot Instance Segmentation with Polar Representation

2 code implementations CVPR 2020 Enze Xie, Peize Sun, Xiaoge Song, Wenhai Wang, Ding Liang, Chunhua Shen, Ping Luo

In this paper, we introduce an anchor-box free and single shot instance segmentation method, which is conceptually simple, fully convolutional and can be used as a mask prediction module for instance segmentation, by easily embedding it into most off-the-shelf detection methods.

Instance Segmentation Object Detection +2

Structured Binary Neural Networks for Image Recognition

no code implementations 22 Sep 2019 Bohan Zhuang, Chunhua Shen, Mingkui Tan, Peng Chen, Lingqiao Liu, Ian Reid

Experiments on classification, semantic segmentation and object detection tasks demonstrate the superior performance of the proposed methods over various quantized networks in the literature.

object-detection Object Detection +2

Task-Aware Monocular Depth Estimation for 3D Object Detection

1 code implementation 17 Sep 2019 Xinlong Wang, Wei Yin, Tao Kong, Yuning Jiang, Lei LI, Chunhua Shen

In this paper, we first analyse the data distributions and interaction of foreground and background, then propose the foreground-background separated monocular depth estimation (ForeSeE) method, to estimate the foreground depth and background depth using separate optimization objectives and depth decoders.

3D Object Detection 3D Object Recognition +3

TextSR: Content-Aware Text Super-Resolution Guided by Recognition

1 code implementation 16 Sep 2019 Wenjia Wang, Enze Xie, Peize Sun, Wenhai Wang, Lixun Tian, Chunhua Shen, Ping Luo

Nonetheless, most of the previous methods may not work well in recognizing text with low resolution which is often seen in natural scene images.

Scene Text Recognition Super-Resolution

Auxiliary Learning for Deep Multi-task Learning

no code implementations 5 Sep 2019 Yifan Liu, Bohan Zhuang, Chunhua Shen, Hao Chen, Wei Yin

Most current methods can be categorized as either: (i) hard parameter sharing where a subset of the parameters is shared among tasks while other parameters are task-specific; or (ii) soft parameter sharing where all parameters are task-specific but they are jointly regularized.

Auxiliary Learning Depth Estimation +3

Unsupervised Scale-consistent Depth and Ego-motion Learning from Monocular Video

2 code implementations NeurIPS 2019 Jia-Wang Bian, Zhichao Li, Naiyan Wang, Huangying Zhan, Chunhua Shen, Ming-Ming Cheng, Ian Reid

To the best of our knowledge, this is the first work to show that deep networks trained using unlabelled monocular videos can predict globally scale-consistent camera trajectories over a long video sequence.

Depth And Camera Motion Monocular Depth Estimation +1

Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network

6 code implementations ICCV 2019 Wenhai Wang, Enze Xie, Xiaoge Song, Yuhang Zang, Wenjia Wang, Tong Lu, Gang Yu, Chunhua Shen

Recently, some methods have been proposed to tackle arbitrary-shaped text detection, but they rarely take the speed of the entire pipeline into consideration, which may fall short in practical applications. In this paper, we propose an efficient and accurate arbitrary-shaped text detector, termed Pixel Aggregation Network (PAN), which is equipped with a low computational-cost segmentation head and a learnable post-processing.

Scene Text Detection

Index Network

2 code implementations 11 Aug 2019 Hao Lu, Yutong Dai, Chunhua Shen, Songcen Xu

By viewing the indices as a function of the feature map, we introduce the concept of "learning to index", and present a novel index-guided encoder-decoder framework where indices are self-learned adaptively from data and are used to guide the downsampling and upsampling stages, without extra training supervision.

Grayscale Image Denoising Image Denoising +3

MobileFAN: Transferring Deep Hidden Representation for Face Alignment

no code implementations 11 Aug 2019 Yang Zhao, Yifan Liu, Chunhua Shen, Yongsheng Gao, Shengwu Xiong

To this end, we propose an effective lightweight model, namely Mobile Face Alignment Network (MobileFAN), using a simple backbone MobileNetV2 as the encoder and three deconvolutional layers as the decoder.

Face Alignment Facial Landmark Detection

Effective Training of Convolutional Neural Networks with Low-bitwidth Weights and Activations

no code implementations 10 Aug 2019 Bohan Zhuang, Jing Liu, Mingkui Tan, Lingqiao Liu, Ian Reid, Chunhua Shen

Furthermore, we propose a second progressive quantization scheme which gradually decreases the bit-width from high-precision to low-precision during training.

Knowledge Distillation Quantization
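
The progressive scheme mentioned in the snippet above, which lowers the bit-width gradually during training, can be pictured with a toy schedule and a uniform quantizer. This sketch assumes the bit-width drops by one bit per fixed-length stage and that values lie in [0, 1]. Both choices, and the names `quantize` and `bitwidth_schedule`, are illustrative assumptions, not the paper's actual procedure.

```python
def quantize(x, bits):
    """Uniformly quantize x (clipped to [0, 1]) onto 2**bits evenly spaced levels."""
    levels = 2 ** bits - 1
    x = min(max(x, 0.0), 1.0)
    return round(x * levels) / levels


def bitwidth_schedule(start_bits, end_bits, epochs_per_stage, epoch):
    """Drop one bit per stage (an assumed schedule), never going below end_bits."""
    stage = epoch // epochs_per_stage
    return max(end_bits, start_bits - stage)


# Hypothetical run: start at 8 bits, finish at 2 bits, 10 epochs per stage.
bits_by_epoch = [bitwidth_schedule(8, 2, 10, epoch) for epoch in (0, 25, 1000)]
coarse = quantize(0.3, 2)  # a 2-bit quantizer keeps only four levels
```

Starting at high precision lets the early stages behave almost like full-precision training before the representable levels become coarse.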

V-PROM: A Benchmark for Visual Reasoning Using Visual Progressive Matrices

no code implementations 29 Jul 2019 Damien Teney, Peng Wang, Jiewei Cao, Lingqiao Liu, Chunhua Shen, Anton Van Den Hengel

One of the primary challenges faced by deep learning is the degree to which current methods exploit superficial statistics and dataset bias, rather than learning to generalise over the specific representations they have experienced.