Search Results for author: Errui Ding

Found 64 papers, 33 papers with code

An Information Theory-inspired Strategy for Automatic Network Pruning

no code implementations 19 Aug 2021 Xiawu Zheng, Yuexiao Ma, Teng Xi, Gang Zhang, Errui Ding, Yuchao Li, Jie Chen, Yonghong Tian, Rongrong Ji

This practically limits the application of model compression when the model needs to be deployed on a wide range of devices.

AutoML Model Compression +1

Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based Action Recognition

1 code implementation 10 Aug 2021 Tailin Chen, Desen Zhou, Jian Wang, Shidong Wang, Yu Guan, Xuming He, Errui Ding

The task of skeleton-based action recognition remains a core challenge in human-centred scene understanding due to the multiple granularities and large variation in human motion.

Action Recognition Scene Understanding +1

Weakly-Supervised Spatio-Temporal Anomaly Detection in Surveillance Video

no code implementations 9 Aug 2021 Jie Wu, Wei Zhang, Guanbin Li, Wenhao Wu, Xiao Tan, YingYing Li, Errui Ding, Liang Lin

In this paper, we introduce a novel task, referred to as Weakly-Supervised Spatio-Temporal Anomaly Detection (WSSTAD) in surveillance video.

Anomaly Detection

Paint Transformer: Feed Forward Neural Painting with Stroke Prediction

2 code implementations ICCV 2021 Songhua Liu, Tianwei Lin, Dongliang He, Fu Li, Ruifeng Deng, Xin Li, Errui Ding, Hao Wang

Neural painting refers to the procedure of producing a series of strokes for a given image and non-photo-realistically recreating it using neural networks.

Learning to Paint Object Detection +1

AdaAttN: Revisit Attention Mechanism in Arbitrary Neural Style Transfer

2 code implementations ICCV 2021 Songhua Liu, Tianwei Lin, Dongliang He, Fu Li, Meiling Wang, Xin Li, Zhengxing Sun, Qian Li, Errui Ding

Finally, the content feature is normalized so that it demonstrates the same local feature statistics as the calculated per-point weighted style feature statistics.

Style Transfer Video Style Transfer
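
The normalization step described in the AdaAttN summary above can be illustrated with a minimal single-channel sketch. It assumes the attention weights between content and style points are already given and that each row sums to 1; the function names (`weighted_style_stats`, `normalize`, `adaptive_normalize`) are hypothetical and are not taken from the authors' code.

```python
import math

def weighted_style_stats(attn_row, style):
    """Attention-weighted mean and standard deviation of style values for one content point."""
    mean = sum(a * s for a, s in zip(attn_row, style))
    second_moment = sum(a * s * s for a, s in zip(attn_row, style))
    # Clamp at zero to guard against tiny negative values from rounding.
    variance = max(second_moment - mean * mean, 0.0)
    return mean, math.sqrt(variance)

def normalize(values, eps=1e-5):
    """Zero-mean, unit-variance normalization of the content values."""
    n = len(values)
    mu = sum(values) / n
    var = sum((v - mu) ** 2 for v in values) / n
    return [(v - mu) / math.sqrt(var + eps) for v in values]

def adaptive_normalize(content, style, attn):
    """Rescale each normalized content value with its own weighted style statistics.

    attn[i][j] is the (assumed, precomputed) weight of style point j for content point i.
    """
    normed = normalize(content)
    out = []
    for c, row in zip(normed, attn):
        mean, std = weighted_style_stats(row, style)
        out.append(std * c + mean)
    return out

result = adaptive_normalize([1.0, 3.0], [2.0, 4.0], [[1.0, 0.0], [0.5, 0.5]])
```

Each content point receives its own mean and standard deviation, which is what distinguishes this per-point scheme from a single global statistic shared by the whole feature map.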

Oriented Object Detection with Transformer

no code implementations 6 Jun 2021 Teli Ma, Mingyuan Mao, Honghui Zheng, Peng Gao, Xiaodi Wang, Shumin Han, Errui Ding, Baochang Zhang, David Doermann

Object detection with Transformers (DETR) has achieved a competitive performance over traditional detectors, such as Faster R-CNN.

Object Detection Oriented Object Detection

Dual-stream Network for Visual Recognition

no code implementations NeurIPS 2021 Mingyuan Mao, Renrui Zhang, Honghui Zheng, Peng Gao, Teli Ma, Yan Peng, Errui Ding, Baochang Zhang, Shumin Han

Transformers with remarkable global representation capacities achieve competitive results for visual tasks, but fail to consider high-level local pattern information in input images.

Image Classification Instance Segmentation +2

Image Inpainting by End-to-End Cascaded Refinement with Mask Awareness

1 code implementation 28 Apr 2021 Manyu Zhu, Dongliang He, Xin Li, Chao Li, Fu Li, Xiao Liu, Errui Ding, Zhaoxiang Zhang

Inpainting arbitrary missing regions is challenging because learning valid features for various masked regions is nontrivial.

Image Inpainting

PAFNet: An Efficient Anchor-Free Object Detector Guidance

1 code implementation 28 Apr 2021 Ying Xin, Guanzhong Wang, Mingyuan Mao, Yuan Feng, Qingqing Dang, Yanjun Ma, Errui Ding, Shumin Han

Therefore, a trade-off between effectiveness and efficiency is necessary in practical scenarios.

Ranked #1 on Object Detection on COCO test-dev (Hardware Burden metric)

Object Detection

Unsupervised Multi-Source Domain Adaptation for Person Re-Identification

no code implementations CVPR 2021 Zechen Bai, Zhigang Wang, Jian Wang, Di Hu, Errui Ding

Although achieving great success, most of them only use limited data from a single-source domain for model pre-training, making the rich labeled data insufficiently exploited.

Person Re-Identification Rectification +1

PGNet: Real-time Arbitrarily-Shaped Text Spotting with Point Gathering Network

1 code implementation 12 Apr 2021 Pengfei Wang, Chengquan Zhang, Fei Qi, Shanshan Liu, Xiaoqiang Zhang, Pengyuan Lyu, Junyu Han, Jingtuo Liu, Errui Ding, Guangming Shi

With a PG-CTC decoder, we gather high-level character classification vectors from two-dimensional space and decode them into text symbols without NMS and RoI operations involved, which guarantees high efficiency.

Ranked #1 on Scene Text Detection on ICDAR 2015 (Accuracy metric)

Scene Text Detection Text Spotting

Drafting and Revision: Laplacian Pyramid Network for Fast High-Quality Artistic Style Transfer

2 code implementations CVPR 2021 Tianwei Lin, Zhuoqi Ma, Fu Li, Dongliang He, Xin Li, Errui Ding, Nannan Wang, Jie Li, Xinbo Gao

Inspired by the common painting process of drawing a draft and revising the details, we introduce a novel feed-forward method named Laplacian Pyramid Network (LapStyle).

Style Transfer

Student-Teacher Feature Pyramid Matching for Anomaly Detection

2 code implementations 7 Mar 2021 Guodong Wang, Shumin Han, Errui Ding, Di Huang

Anomaly detection is a challenging task and is usually formulated as a one-class learning problem because of the unexpectedness of anomalies.

Ranked #12 on Anomaly Detection on MVTec AD (using extra training data)

Image Classification Unsupervised Anomaly Detection

FaceController: Controllable Attribute Editing for Face in the Wild

no code implementations 23 Feb 2021 Zhiliang Xu, Xiyu Yu, Zhibin Hong, Zhen Zhu, Junyu Han, Jingtuo Liu, Errui Ding, Xiang Bai

By simply employing some existing and easily obtainable prior information, our method can control, transfer, and edit diverse attributes of faces in the wild.

Ranked #1 on Face Swapping on FaceForensics++ (FID metric)

Face Swapping GAN inversion

EC-DARTS: Inducing Equalized and Consistent Optimization Into DARTS

no code implementations ICCV 2021 Qinqin Zhou, Xiawu Zheng, Liujuan Cao, Bineng Zhong, Teng Xi, Gang Zhang, Errui Ding, Mingliang Xu, Rongrong Ji

EC-DARTS decouples different operations based on their categories to optimize the operation weights so that the operation gap between them is shrunk.

Revealing the Reciprocal Relations Between Self-Supervised Stereo and Monocular Depth Estimation

no code implementations ICCV 2021 Zhi Chen, Xiaoqing Ye, Wei Yang, Zhenbo Xu, Xiao Tan, Zhikang Zou, Errui Ding, Xinming Zhang, Liusheng Huang

Second, we introduce an occlusion-aware distillation (OA Distillation) module, which leverages the predicted depths from StereoNet in non-occluded regions to train our monocular depth estimation network named SingleNet.

Monocular Depth Estimation Stereo Matching

Understanding Image Retrieval Re-Ranking: A Graph Neural Network Perspective

1 code implementation 14 Dec 2020 Xuanmeng Zhang, Minyue Jiang, Zhedong Zheng, Xiao Tan, Errui Ding, Yi Yang

We argue that the first phase equals building the k-nearest neighbor graph, while the second phase can be viewed as spreading the message within the graph.

Image Retrieval Re-Ranking
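
The two phases named in the re-ranking summary above (building a k-nearest neighbor graph, then spreading messages within it) can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: it uses Euclidean distance, a single round of mean aggregation, and a hypothetical mixing weight `alpha`; the function names are also hypothetical. It requires Python 3.8 or later for `math.dist` and assumes 1 <= k < number of items.

```python
import math

def knn_graph(features, k):
    """Phase 1: for each item, list the indices of its k nearest neighbors (Euclidean)."""
    neighbors = []
    for i, f in enumerate(features):
        dists = [(math.dist(f, g), j) for j, g in enumerate(features) if j != i]
        dists.sort()
        neighbors.append([j for _, j in dists[:k]])
    return neighbors

def propagate(features, neighbors, alpha=0.5):
    """Phase 2: one round of message passing, mixing each feature with its neighbors' mean."""
    updated = []
    for f, nbrs in zip(features, neighbors):
        dim = len(f)
        mean = [sum(features[j][d] for j in nbrs) / len(nbrs) for d in range(dim)]
        updated.append([(1 - alpha) * f[d] + alpha * mean[d] for d in range(dim)])
    return updated

features = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
graph = knn_graph(features, k=1)
refined = propagate(features, graph)
```

After propagation, items in the same neighborhood move closer together in feature space, so a ranking computed on `refined` tends to favor candidates that share neighbors with the query.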

MVFNet: Multi-View Fusion Network for Efficient Video Recognition

2 code implementations 13 Dec 2020 Wenhao Wu, Dongliang He, Tianwei Lin, Fu Li, Chuang Gan, Errui Ding

Existing state-of-the-art methods have achieved excellent accuracy regardless of complexity, while efficient spatiotemporal modeling solutions are slightly inferior in performance.

Action Classification Action Recognition +1

Coherent Loss: A Generic Framework for Stable Video Segmentation

no code implementations 25 Oct 2020 Mingyang Qian, Yi Fu, Xiao Tan, YingYing Li, Jinqing Qi, Huchuan Lu, Shilei Wen, Errui Ding

Video segmentation approaches are of great importance for numerous vision tasks especially in video manipulation for entertainment.

Semantic Segmentation Video Segmentation +1

HS-ResNet: Hierarchical-Split Block on Convolutional Neural Network

1 code implementation 15 Oct 2020 Pengcheng Yuan, Shufei Lin, Cheng Cui, Yuning Du, Ruoyu Guo, Dongliang He, Errui Ding, Shumin Han

Moreover, Hierarchical-Split block is very flexible and efficient, which provides a large space of potential network architectures for different applications.

Image Classification Instance Segmentation +2

Discriminative Sounding Objects Localization via Self-supervised Audiovisual Matching

1 code implementation NeurIPS 2020 Di Hu, Rui Qian, Minyue Jiang, Xiao Tan, Shilei Wen, Errui Ding, Weiyao Lin, Dejing Dou

First, we propose to learn robust object representations by aggregating the candidate sound localization results in the single source scenes.

Object Localization

Real Image Super Resolution Via Heterogeneous Model Ensemble using GP-NAS

no code implementations 2 Sep 2020 Zhihong Pan, Baopu Li, Teng Xi, Yanwen Fan, Gang Zhang, Jingtuo Liu, Junyu Han, Errui Ding

With advances in deep neural networks (DNNs), recent state-of-the-art (SOTA) image super-resolution (SR) methods have achieved impressive performance using deep residual networks with dense skip connections.

Image Super-Resolution Neural Architecture Search

Learning Global Structure Consistency for Robust Object Tracking

no code implementations 26 Aug 2020 Bi Li, Chengquan Zhang, Zhibin Hong, Xu Tang, Jingtuo Liu, Junyu Han, Errui Ding, Wenyu Liu

Unlike many existing trackers that focus on modeling only the target, in this work, we consider the transient variations of the whole scene.

Visual Object Tracking

PP-YOLO: An Effective and Efficient Implementation of Object Detector

5 code implementations 23 Jul 2020 Xiang Long, Kaipeng Deng, Guanzhong Wang, Yang Zhang, Qingqing Dang, Yuan Gao, Hui Shen, Jianguo Ren, Shumin Han, Errui Ding, Shilei Wen

We mainly try to combine various existing tricks that barely increase the number of model parameters and FLOPs, to improve the accuracy of the detector as much as possible while keeping the speed almost unchanged.

Object Detection

Graph-PCNN: Two Stage Human Pose Estimation with Graph Pose Refinement

no code implementations ECCV 2020 Jian Wang, Xiang Long, Yuan Gao, Errui Ding, Shilei Wen

In the first stage, a heatmap regression network is applied to obtain a rough localization result, and a set of proposal keypoints, called guided points, is sampled.

Pose Estimation

Segment as Points for Efficient Online Multi-Object Tracking and Segmentation

1 code implementation ECCV 2020 Zhenbo Xu, Wei Zhang, Xiao Tan, Wei Yang, Huan Huang, Shilei Wen, Errui Ding, Liusheng Huang

The resulting online MOTS framework, named PointTrack, surpasses all the state-of-the-art methods including 3D tracking methods by large margins (5.4% higher MOTSA and 18 times faster over MOTSFusion) with the near real-time speed (22 FPS).

Multi-Object Tracking Multi-Object Tracking and Segmentation +1

PointTrack++ for Effective Online Multi-Object Tracking and Segmentation

1 code implementation 3 Jul 2020 Zhenbo Xu, Wei Zhang, Xiao Tan, Wei Yang, Xiangbo Su, Yuchen Yuan, Hongwu Zhang, Shilei Wen, Errui Ding, Liusheng Huang

In this work, we present PointTrack++, an effective on-line framework for MOTS, which remarkably extends our recently proposed PointTrack framework.

Data Augmentation Instance Segmentation +5

Learning Generalized Spoof Cues for Face Anti-spoofing

4 code implementations 8 May 2020 Haocheng Feng, Zhibin Hong, Haixiao Yue, Yang Chen, Keyao Wang, Junyu Han, Jingtuo Liu, Errui Ding

In this paper, we reformulate FAS from an anomaly detection perspective and propose a residual-learning framework to learn the discriminative live-spoof differences, which are defined as the spoof cues.

Anomaly Detection Face Anti-Spoofing

ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object Detection

1 code implementation 1 Mar 2020 Zhenbo Xu, Wei Zhang, Xiaoqing Ye, Xiao Tan, Wei Yang, Shilei Wen, Errui Ding, Ajin Meng, Liusheng Huang

The pipeline of ZoomNet begins with an ordinary 2D object detection model which is used to obtain pairs of left-right bounding boxes.

2D Object Detection 3D Object Detection +2

HAMBox: Delving into Online High-quality Anchors Mining for Detecting Outer Faces

no code implementations 19 Dec 2019 Yang Liu, Xu Tang, Xiang Wu, Junyu Han, Jingtuo Liu, Errui Ding

In this paper, we propose an Online High-quality Anchor Mining Strategy (HAMBox), which explicitly helps outer faces compensate with high-quality anchors.

Face Detection Multi-Task Learning

Dynamic Instance Normalization for Arbitrary Style Transfer

no code implementations 16 Nov 2019 Yongcheng Jing, Xiao Liu, Yukang Ding, Xinchao Wang, Errui Ding, Mingli Song, Shilei Wen

Prior normalization methods rely on affine transformations to produce arbitrary image style transfers, of which the parameters are computed in a pre-defined way.

Style Transfer

EATEN: Entity-aware Attention for Single Shot Visual Text Extraction

1 code implementation 20 Sep 2019 He Guo, Xiameng Qin, Jiaming Liu, Junyu Han, Jingtuo Liu, Errui Ding

Extracting entities from images is a crucial part of many OCR applications, such as entity recognition of cards, invoices, and receipts.

Entity Extraction using GAN Optical Character Recognition

ACFNet: Attentional Class Feature Network for Semantic Segmentation

1 code implementation ICCV 2019 Fan Zhang, Yanqin Chen, Zhihang Li, Zhibin Hong, Jingtuo Liu, Feifei Ma, Junyu Han, Errui Ding

Recent works have made great progress in semantic segmentation by exploiting richer context, most of which are designed from a spatial perspective.

Semantic Segmentation

Chinese Street View Text: Large-scale Chinese Text Reading with Partially Supervised Learning

no code implementations ICCV 2019 Yipeng Sun, Jiaming Liu, Wei Liu, Junyu Han, Errui Ding, Jingtuo Liu

Most existing text reading benchmarks make it difficult to evaluate the performance of more advanced deep learning models in large vocabularies due to the limited amount of training data.

Perspective-Guided Convolution Networks for Crowd Counting

1 code implementation ICCV 2019 Zhaoyi Yan, Yuchen Yuan, WangMeng Zuo, Xiao Tan, Yezhen Wang, Shilei Wen, Errui Ding

In this paper, we propose a novel perspective-guided convolution (PGC) for convolutional neural network (CNN) based crowd counting (i.e., PGCNet), which aims to overcome the dramatic intra-scene scale variations of people due to the perspective effect.

Crowd Counting

ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text (RRC-ArT)

1 code implementation 16 Sep 2019 Chee-Kheng Chng, Yuliang Liu, Yipeng Sun, Chun Chet Ng, Canjie Luo, Zihan Ni, ChuanMing Fang, Shuaitao Zhang, Junyu Han, Errui Ding, Jingtuo Liu, Dimosthenis Karatzas, Chee Seng Chan, Lianwen Jin

This paper reports the ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text (RRC-ArT) that consists of three major challenges: i) scene text detection, ii) scene text recognition, and iii) scene text spotting.

Scene Text Scene Text Detection +2

Image Inpainting with Learnable Bidirectional Attention Maps

1 code implementation ICCV 2019 Chaohao Xie, Shaohui Liu, Chao Li, Ming-Ming Cheng, WangMeng Zuo, Xiao Liu, Shilei Wen, Errui Ding

Most convolutional network (CNN)-based inpainting methods adopt standard convolution to indistinguishably treat valid pixels and holes, making them limited in handling irregular holes and more likely to generate inpainting results with color discrepancy and blurriness.

Image Inpainting

An End-to-end Video Text Detector with Online Tracking

no code implementations 20 Aug 2019 Hongyuan Yu, Chengquan Zhang, Xuan Li, Junyu Han, Errui Ding, Liang Wang

Most existing methods attempt to enhance the performance of video text detection by cooperating with video text tracking, but treat these two tasks separately.

Editing Text in the Wild

2 code implementations 8 Aug 2019 Liang Wu, Chengquan Zhang, Jiaming Liu, Junyu Han, Jingtuo Liu, Errui Ding, Xiang Bai

Specifically, we propose an end-to-end trainable style retention network (SRNet) that consists of three modules: text conversion module, background inpainting module and fusion module.

Image Inpainting Image-to-Image Translation +1

BMN: Boundary-Matching Network for Temporal Action Proposal Generation

10 code implementations ICCV 2019 Tianwei Lin, Xiao Liu, Xin Li, Errui Ding, Shilei Wen

To address these difficulties, we introduce the Boundary-Matching (BM) mechanism to evaluate confidence scores of densely distributed proposals, which denote a proposal as a matching pair of starting and ending boundaries and combine all densely distributed BM pairs into the BM confidence map.

Action Detection Action Recognition +1

STGAN: A Unified Selective Transfer Network for Arbitrary Image Attribute Editing

3 code implementations CVPR 2019 Ming Liu, Yukang Ding, Min Xia, Xiao Liu, Errui Ding, WangMeng Zuo, Shilei Wen

Arbitrary attribute editing generally can be tackled by incorporating encoder-decoder and generative adversarial networks.


Detecting Text in the Wild with Deep Character Embedding Network

no code implementations 2 Jan 2019 Jiaming Liu, Chengquan Zhang, Yipeng Sun, Junyu Han, Errui Ding

However, text in the wild is usually perspectively distorted or curved, which cannot be easily tackled by existing approaches.

TextNet: Irregular Text Reading from Images with an End-to-End Trainable Network

no code implementations 24 Dec 2018 Yipeng Sun, Chengquan Zhang, Zuming Huang, Jiaming Liu, Junyu Han, Errui Ding

Reading text from images remains challenging due to multi-orientation, perspective distortion and especially the curved nature of irregular text.

Optical Character Recognition

Compact Generalized Non-local Network

2 code implementations NeurIPS 2018 Kaiyu Yue, Ming Sun, Yuchen Yuan, Feng Zhou, Errui Ding, Fuxin Xu

The non-local module is designed for capturing long-range spatio-temporal dependencies in images and videos.

Object Detection Object Recognition +1

Fine-grained Video Categorization with Redundancy Reduction Attention

no code implementations ECCV 2018 Chen Zhu, Xiao Tan, Feng Zhou, Xiao Liu, Kaiyu Yue, Errui Ding, Yi Ma

Specifically, it firstly summarizes the video by weight-summing all feature vectors in the feature maps of selected frames with a spatio-temporal soft attention, and then predicts which channels to suppress or to enhance according to this summary with a learned non-linear transform.

Video Classification

Multi-Attention Multi-Class Constraint for Fine-grained Image Recognition

1 code implementation ECCV 2018 Ming Sun, Yuchen Yuan, Feng Zhou, Errui Ding

Attention-based learning for fine-grained image recognition remains a challenging task, where most of the existing methods treat each object part in isolation, while neglecting the correlations among them.

Fine-Grained Image Recognition Metric Learning

WordSup: Exploiting Word Annotations for Character based Text Detection

no code implementations ICCV 2017 Han Hu, Chengquan Zhang, Yuxuan Luo, Yuzhuo Wang, Junyu Han, Errui Ding

When applied in scene text detection, we are thus able to train a robust character detector by exploiting word annotations in the rich large-scale real scene text datasets, e.g., ICDAR15 and COCO-text.

Scene Text Scene Text Detection

Localizing by Describing: Attribute-Guided Attention Localization for Fine-Grained Recognition

no code implementations 20 May 2016 Xiao Liu, Jiang Wang, Shilei Wen, Errui Ding, Yuanqing Lin

By designing a novel reward strategy, we are able to learn to locate regions that are spatially and semantically distinctive with a reinforcement learning algorithm.
