Search Results for author: ShiLiang Pu

Found 110 papers, 37 papers with code

Read Extensively, Focus Smartly: A Cross-document Semantic Enhancement Method for Visual Documents NER

no code implementations COLING 2022 Jun Zhao, Xin Zhao, WenYu Zhan, Tao Gui, Qi Zhang, Liang Qiao, Zhanzhan Cheng, ShiLiang Pu

To deal with this problem, this work proposes a cross-document semantic enhancement method, which consists of two modules: 1) To prevent distractions from irrelevant regions in the current document, we design a learnable attention mask mechanism, which is used to adaptively filter redundant information in the current document.


Single Domain Dynamic Generalization for Iris Presentation Attack Detection

no code implementations 22 May 2023 Yachun Li, Jingjing Wang, Yuhui Chen, Di Xie, ShiLiang Pu

To tackle the above issues, we propose a Single Domain Dynamic Generalization (SDDG) framework, which simultaneously exploits domain-invariant and domain-specific features on a per-sample basis and learns to generalize to various unseen domains with numerous natural images.

Domain Generalization Meta-Learning

Taxonomy Completion with Probabilistic Scorer via Box Embedding

1 code implementation 18 May 2023 Wei Xue, Yongliang Shen, Wenqi Ren, Jietian Guo, ShiLiang Pu, Weiming Lu

Specifically, TaxBox consists of three components: (1) a graph aggregation module to leverage the structural information of the taxonomy and two lightweight decoders that map features to box embedding and capture complex relationships between concepts; (2) two probabilistic scorers that correspond to attachment and insertion operations and ensure the avoidance of pseudo-leaves; and (3) three learning objectives that assist the model in mapping concepts more granularly onto the box embedding space.

Multi-view Adversarial Discriminator: Mine the Non-causal Factors for Object Detection in Unseen Domains

1 code implementation CVPR 2023 Mingjun Xu, Lingyun Qin, WeiJie Chen, ShiLiang Pu, Lei Zhang

In this work, we present an idea to remove non-causal factors from common features by multi-view adversarial training on source domains, because we observe that such insignificant non-causal factors may still be significant in other latent spaces (views) due to the multi-mode structure of data.

Domain Generalization object-detection +1

Rethinking the Approximation Error in 3D Surface Fitting for Point Cloud Normal Estimation

1 code implementation CVPR 2023 Hang Du, Xuejun Yan, Jingjing Wang, Di Xie, ShiLiang Pu

Most existing approaches for point cloud normal estimation aim to locally fit a geometric surface and calculate the normal from the fitted surface.

1st Place Solution for ECCV 2022 OOD-CV Challenge Object Detection Track

no code implementations 12 Jan 2023 Wei Zhao, Binbin Chen, WeiJie Chen, Shicai Yang, Di Xie, ShiLiang Pu, Yueting Zhuang

The domain adaptation part is implemented as a Source-Free Domain Adaptation paradigm, which only uses the pre-trained model and the unlabeled target data to further optimize in a self-supervised training manner.

Domain Generalization object-detection +3

1st Place Solution for ECCV 2022 OOD-CV Challenge Image Classification Track

no code implementations 12 Jan 2023 Yilu Guo, Xingyue Shi, WeiJie Chen, Shicai Yang, Di Xie, ShiLiang Pu, Yueting Zhuang

In the test-time training stage, we use the pre-trained model to assign noisy labels to the unlabeled target data, and propose a Label-Periodically-Updated DivideMix method for noisy label learning.

Data Augmentation Domain Generalization +2

PHA: Patch-Wise High-Frequency Augmentation for Transformer-Based Person Re-Identification

no code implementations CVPR 2023 Guiwei Zhang, Yongfei Zhang, Tianyu Zhang, Bo Li, ShiLiang Pu

Although recent studies empirically show that injecting Convolutional Neural Networks (CNNs) into Vision Transformers (ViTs) can improve the performance of person re-identification, the rationale behind it remains elusive.

Person Re-Identification

SAViT: Structure-Aware Vision Transformer Pruning via Collaborative Optimization

no code implementations NeurIPS 2022 Zheng Chuanyang, Zheyang Li, Kai Zhang, Zhi Yang, Wenming Tan, Jun Xiao, Ye Ren, ShiLiang Pu

In this paper, we introduce joint importance, which integrates essential structural-aware interactions between components for the first time, to perform collaborative pruning.

Object Detection

Attention Diversification for Domain Generalization

1 code implementation 9 Oct 2022 Rang Meng, Xianfeng Li, WeiJie Chen, Shicai Yang, Jie Song, Xinchao Wang, Lei Zhang, Mingli Song, Di Xie, ShiLiang Pu

Under this guidance, a novel Attention Diversification framework is proposed, in which Intra-Model and Inter-Model Attention Diversification Regularization are collaborated to reassign appropriate attention to diverse task-related features.

Domain Generalization

Multi-Scale Wavelet Transformer for Face Forgery Detection

no code implementations 8 Oct 2022 Jie Liu, Jingjing Wang, Peng Zhang, Chunmao Wang, Di Xie, ShiLiang Pu

To overcome these limitations, we propose a multi-scale wavelet transformer framework for face forgery detection.

Point Cloud Upsampling via Cascaded Refinement Network

1 code implementation 8 Oct 2022 Hang Du, Xuejun Yan, Jingjing Wang, Di Xie, ShiLiang Pu

In this manner, the proposed cascaded refinement network can be easily optimized without extra learning strategies.

FBNet: Feedback Network for Point Cloud Completion

1 code implementation 8 Oct 2022 Xuejun Yan, Hongyu Yan, Jingjing Wang, Hang Du, Zhihong Wu, Di Xie, ShiLiang Pu, Li Lu

The rapid development of point cloud learning has driven point cloud completion into a new era.

Point Cloud Completion

Unified Normalization for Accelerating and Stabilizing Transformers

1 code implementation 2 Aug 2022 Qiming Yang, Kai Zhang, Chaoxiang Lan, Zhi Yang, Zheyang Li, Wenming Tan, Jun Xiao, ShiLiang Pu

To tackle these issues, we propose Unified Normalization (UN), which can speed up inference by being fused with other linear operations and achieve performance on par with LN.

TRIE++: Towards End-to-End Information Extraction from Visually Rich Documents

no code implementations 14 Jul 2022 Zhanzhan Cheng, Peng Zhang, Can Li, Qiao Liang, Yunlu Xu, Pengfei Li, ShiLiang Pu, Yi Niu, Fei Wu

Most existing methods divide this task into two subparts: the text reading part for obtaining the plain text from the original document images and the information extraction part for extracting key contents.

Language Modelling

E2-AEN: End-to-End Incremental Learning with Adaptively Expandable Network

no code implementations 14 Jul 2022 Guimei Cao, Zhanzhan Cheng, Yunlu Xu, Duo Li, ShiLiang Pu, Yi Niu, Fei Wu

In this paper, we propose an end-to-end trainable adaptively expandable network named E2-AEN, which dynamically generates lightweight structures for new tasks without any accuracy drop in previous tasks.

Incremental Learning

Dynamic Low-Resolution Distillation for Cost-Efficient End-to-End Text Spotting

1 code implementation 14 Jul 2022 Ying Chen, Liang Qiao, Zhanzhan Cheng, ShiLiang Pu, Yi Niu, Xi Li

In this paper, to address this problem, we propose a novel cost-efficient Dynamic Low-resolution Distillation (DLD) text spotting framework, which aims to infer images in different small but recognizable resolutions and achieve a better balance between accuracy and efficiency.

Knowledge Distillation Optical Character Recognition (OCR) +1

Semi-supervised Ranking for Object Image Blur Assessment

1 code implementation 13 Jul 2022 Qiang Li, Zhaoliang Yao, Jingjing Wang, Ye Tian, Pengju Yang, Di Xie, ShiLiang Pu

Based on this dataset, we propose a method to obtain the blur scores only with the pairwise rank labels as supervision.

Object Recognition Retrieval

Universal Domain Adaptive Object Detector

no code implementations 5 Jul 2022 Wenxu Shi, Lei Zhang, WeiJie Chen, ShiLiang Pu

Universal domain adaptive object detection (UniDAOD) is more challenging than domain adaptive object detection (DAOD) since the label space of the source domain may not be the same as that of the target and the scale of objects in the universal scenarios can vary dramatically (i.e., category shift and scale shift).

Multi-Label Learning object-detection +1

Slimmable Domain Adaptation

1 code implementation CVPR 2022 Rang Meng, WeiJie Chen, Shicai Yang, Jie Song, Luojun Lin, Di Xie, ShiLiang Pu, Xinchao Wang, Mingli Song, Yueting Zhuang

In this paper, we introduce a simple framework, Slimmable Domain Adaptation, to improve cross-domain generalization with a weight-sharing model bank, from which models of different capacities can be sampled to accommodate different accuracy-efficiency trade-offs.

Domain Generalization Unsupervised Domain Adaptation

Label Matching Semi-Supervised Object Detection

3 code implementations CVPR 2022 Binbin Chen, WeiJie Chen, Shicai Yang, Yunyi Xuan, Jie Song, Di Xie, ShiLiang Pu, Mingli Song, Yueting Zhuang

To remedy this issue, we present a novel label assignment mechanism for the self-training framework, namely proposal self-assignment, which injects the proposals from the student into the teacher and generates accurate pseudo labels to match each proposal in the student model accordingly.

Object Detection +1

Transductive CLIP with Class-Conditional Contrastive Learning

no code implementations 13 Jun 2022 Junchu Huang, WeiJie Chen, Shicai Yang, Di Xie, ShiLiang Pu, Yueting Zhuang

This framework can effectively reduce the impact of noisy labels from the CLIP model by combining both techniques.

Contrastive Learning Pseudo Label

Learning Domain Adaptive Object Detection with Probabilistic Teacher

1 code implementation 13 Jun 2022 Meilin Chen, WeiJie Chen, Shicai Yang, Jie Song, Xinchao Wang, Lei Zhang, Yunfeng Yan, Donglian Qi, Yueting Zhuang, Di Xie, ShiLiang Pu

In addition, we conduct anchor adaptation in parallel with localization adaptation, since the anchor can be regarded as a learnable parameter.

Object Detection

Self-distilled Knowledge Delegator for Exemplar-free Class Incremental Learning

no code implementations 23 May 2022 Fanfan Ye, Liang Ma, Qiaoyong Zhong, Di Xie, ShiLiang Pu

The knowledge extracted by the delegator is then utilized to maintain the performance of the model on old tasks in incremental learning.

class-incremental learning Class Incremental Learning +1

KRNet: Towards Efficient Knowledge Replay

no code implementations 23 May 2022 Yingying Zhang, Qiaoyong Zhong, Di Xie, ShiLiang Pu

However, the number of stored latent codes in the autoencoder increases linearly with the scale of the data, and the trained encoder is redundant for the replaying stage.

Continual Learning Domain Adaptation

High-Efficiency Lossy Image Coding Through Adaptive Neighborhood Information Aggregation

1 code implementation 25 Apr 2022 Ming Lu, Fangdong Chen, ShiLiang Pu, Zhan Ma

To this end, an Integrated Convolution and Self-Attention (ICSA) unit is first proposed to form a content-adaptive transform that dynamically characterizes and embeds the neighborhood information of any input.

Vocal Bursts Intensity Prediction

Few-shot One-class Domain Adaptation Based on Frequency for Iris Presentation Attack Detection

no code implementations 1 Apr 2022 Yachun Li, Ying Lian, Jingjing Wang, Yuhui Chen, Chunmao Wang, ShiLiang Pu

We thus define a new domain adaptation setting called Few-shot One-class Domain Adaptation (FODA), where adaptation only relies on a limited number of target bonafide samples.

Domain Adaptation Iris Recognition

Few-Shot Class-Incremental Learning by Sampling Multi-Phase Tasks

1 code implementation 31 Mar 2022 Da-Wei Zhou, Han-Jia Ye, Liang Ma, Di Xie, ShiLiang Pu, De-Chuan Zhan

In this work, we propose a new paradigm for FSCIL based on meta-learning by LearnIng Multi-phase Incremental Tasks (LIMIT), which synthesizes fake FSCIL tasks from the base dataset.

class-incremental learning Few-Shot Class-Incremental Learning +2

End-to-End Modeling via Information Tree for One-Shot Natural Language Spatial Video Grounding

no code implementations ACL 2022 Mengze Li, Tianbao Wang, Haoyu Zhang, Shengyu Zhang, Zhou Zhao, Jiaxu Miao, Wenqiao Zhang, Wenming Tan, Jin Wang, Peng Wang, ShiLiang Pu, Fei Wu

To achieve effective grounding under a limited annotation budget, we investigate one-shot video grounding, and learn to ground natural language in all video frames with solely one frame labeled, in an end-to-end manner.

Representation Learning Video Grounding

Forward Compatible Few-Shot Class-Incremental Learning

1 code implementation CVPR 2022 Da-Wei Zhou, Fu-Yun Wang, Han-Jia Ye, Liang Ma, ShiLiang Pu, De-Chuan Zhan

Forward compatibility requires future new classes to be easily incorporated into the current model based on the current stage data, and we seek to realize it by reserving embedding space for future new classes.

class-incremental learning Few-Shot Class-Incremental Learning +1

CAKE: A Scalable Commonsense-Aware Framework For Multi-View Knowledge Graph Completion

1 code implementation ACL 2022 Guanglin Niu, Bo Li, Yongfei Zhang, ShiLiang Pu

The previous knowledge graph embedding (KGE) techniques suffer from invalid negative sampling and the uncertainty of fact-view link prediction, limiting KGC's performance.

Knowledge Graph Embedding Link Prediction

Learning Multiple Explainable and Generalizable Cues for Face Anti-spoofing

no code implementations 21 Feb 2022 Ying Bian, Peng Zhang, Jingjing Wang, Chunmao Wang, ShiLiang Pu

However, many other generalizable cues are unexplored for face anti-spoofing, which limits their performance under cross-dataset testing.

Face Anti-Spoofing

UWC: Unit-wise Calibration Towards Rapid Network Compression

no code implementations 17 Jan 2022 Chen Lin, Zheyang Li, Bo Peng, Haoji Hu, Wenming Tan, Ye Ren, ShiLiang Pu

This paper introduces a post-training quantization (PTQ) method achieving highly efficient Convolutional Neural Network (CNN) quantization with high performance.


Cross-Modal ASR Post-Processing System for Error Correction and Utterance Rejection

no code implementations 10 Jan 2022 Jing Du, ShiLiang Pu, Qinbo Dong, Chao Jin, Xin Qi, Dian Gu, Ru Wu, Hongwei Zhou

Although modern automatic speech recognition (ASR) systems can achieve high performance, they may produce errors that weaken readers' experience and do harm to downstream tasks.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Perform Like an Engine: A Closed-Loop Neural-Symbolic Learning Framework for Knowledge Graph Inference

no code implementations COLING 2022 Guanglin Niu, Bo Li, Yongfei Zhang, ShiLiang Pu

Knowledge graph (KG) inference aims to address the natural incompleteness of KGs, including rule learning-based and KG embedding (KGE) models.

Link Prediction

STEP: Out-of-Distribution Detection in the Presence of Limited In-Distribution Labeled Data

no code implementations NeurIPS 2021 Zhi Zhou, Lan-Zhe Guo, Zhanzhan Cheng, Yu-Feng Li, ShiLiang Pu

However, in many real-world applications, it is desirable to have SSL algorithms that not only classify the samples drawn from the same distribution of labeled data but also detect out-of-distribution (OOD) samples drawn from an unknown distribution.

Out-of-Distribution (OOD) Detection

A Strong Baseline for Semi-Supervised Incremental Few-Shot Learning

no code implementations 21 Oct 2021 Linlan Zhao, Dashan Guo, Yunlu Xu, Liang Qiao, Zhanzhan Cheng, ShiLiang Pu, Yi Niu, Xiangzhong Fang

Few-shot learning (FSL) aims to learn models that generalize to novel classes with limited training samples.

Few-Shot Learning

C+1 Loss: Learn to Classify C Classes of Interest and the Background Class Differentially

no code implementations 29 Sep 2021 Changhuai Chen, Xile Shen, Mengyu Ye, Yi Lu, Jun Che, ShiLiang Pu

We figure out that the background class should be treated differently from the classes of interest during training.

Classification Human Parsing +3

Hindsight Reward Tweaking via Conditional Deep Reinforcement Learning

no code implementations 6 Sep 2021 Ning Wei, Jiahua Liang, Di Xie, ShiLiang Pu

Designing optimal reward functions has been desired but extremely difficult in reinforcement learning (RL).

Reinforcement Learning (RL)

TransForensics: Image Forgery Localization with Dense Self-Attention

no code implementations ICCV 2021 Jing Hao, Zhixin Zhang, Shicai Yang, Di Xie, ShiLiang Pu

Nowadays advanced image editing tools and technical skills produce tampered images more realistically, which can easily evade image forensic systems and make authenticity verification of images more difficult.

Self-Supervised Regional and Temporal Auxiliary Tasks for Facial Action Unit Recognition

no code implementations 30 Jul 2021 Jingwei Yan, Jingjing Wang, Qiang Li, Chunmao Wang, ShiLiang Pu

Based on these two self-supervised auxiliary tasks, local features, mutual relation and motion cues of AUs are better captured in the backbone network with the proposed regional and temporal based auxiliary task learning (RTATL) framework.

Facial Action Unit Detection Optical Flow Estimation

Divide-and-Assemble: Learning Block-wise Memory for Unsupervised Anomaly Detection

no code implementations ICCV 2021 Jinlei Hou, Yingying Zhang, Qiaoyong Zhong, Di Xie, ShiLiang Pu, Hong Zhou

Surprisingly, by varying the granularity of division on feature maps, we are able to modulate the reconstruction capability of the model for both normal and abnormal samples.

Unsupervised Anomaly Detection

RangeIoUDet: Range Image Based Real-Time 3D Object Detector Optimized by Intersection Over Union

no code implementations CVPR 2021 Zhidong Liang, Zehan Zhang, Ming Zhang, Xian Zhao, ShiLiang Pu

Benefiting from the dense representation of the range image, RangeIoUDet is entirely constructed based on 2D convolution, making it possible to have a fast inference speed.

3D Object Detection Autonomous Driving +2

Entity Concept-enhanced Few-shot Relation Extraction

1 code implementation ACL 2021 Shan Yang, Yongfei Zhang, Guanglin Niu, Qinghua Zhao, ShiLiang Pu

Few-shot relation extraction (FSRE) is of great importance in long-tail distribution problems, especially in specialized domains with low-resource data.

Relation Extraction Semantic Similarity +3

VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations

no code implementations 13 May 2021 Peng Zhang, Can Li, Liang Qiao, Zhanzhan Cheng, ShiLiang Pu, Yi Niu, Fei Wu

To address the above limitations, we propose a unified framework VSR for document layout analysis, combining vision, semantics and relations.

Document Layout Analysis

LGPMA: Complicated Table Structure Recognition with Local and Global Pyramid Mask Alignment

1 code implementation 13 May 2021 Liang Qiao, Zaisheng Li, Zhanzhan Cheng, Peng Zhang, ShiLiang Pu, Yi Niu, Wenqi Ren, Wenming Tan, Fei Wu

In this paper, we aim to obtain more reliable aligned bounding boxes by fully utilizing the visual information from both text regions in proposed local features and cell relations in global features.

Table Recognition

Reciprocal Feature Learning via Explicit and Implicit Tasks in Scene Text Recognition

1 code implementation 13 May 2021 Hui Jiang, Yunlu Xu, Zhanzhan Cheng, ShiLiang Pu, Yi Niu, Wenqi Ren, Fei Wu, Wenming Tan

In this work, we excavate the implicit task, character counting within the traditional text recognition, without additional labor annotation cost.

Optical Character Recognition (OCR) Scene Text Recognition

Modulating Localization and Classification for Harmonized Object Detection

no code implementations 16 Mar 2021 Taiheng Zhang, Qiaoyong Zhong, ShiLiang Pu, Di Xie

Object detection involves two sub-tasks, i.e., localizing objects in an image and classifying them into various categories.

Classification General Classification +2

Self-Domain Adaptation for Face Anti-Spoofing

no code implementations 24 Feb 2021 Jingjing Wang, Jingyi Zhang, Ying Bian, Youyi Cai, Chunmao Wang, ShiLiang Pu

In this paper, we propose a self-domain adaptation framework to leverage the unlabeled test domain data at inference.

Domain Generalization Face Anti-Spoofing +1

Multi-Level Adaptive Region of Interest and Graph Learning for Facial Action Unit Recognition

no code implementations 24 Feb 2021 Jingwei Yan, Boyuan Jiang, Jingjing Wang, Qiang Li, Chunmao Wang, ShiLiang Pu

In order to incorporate the intra-level AU relation and inter-level AU regional relevance simultaneously, a multi-level AU relation graph is constructed and graph convolution is performed to further enhance AU regional features of each level.

Facial Action Unit Detection Graph Learning

Self-Supervised Noisy Label Learning for Source-Free Unsupervised Domain Adaptation

no code implementations 23 Feb 2021 WeiJie Chen, Luojun Lin, Shicai Yang, Di Xie, ShiLiang Pu, Yueting Zhuang, Wenqi Ren

Usually, the given source domain pre-trained model is expected to optimize with only unlabeled target data, which is termed as source-free unsupervised domain adaptation.

Self-Supervised Learning Unsupervised Domain Adaptation

SGMNet: Learning Rotation-Invariant Point Cloud Representations via Sorted Gram Matrix

no code implementations ICCV 2021 Jianyun Xu, Xin Tang, Yushi Zhu, Jie Sun, ShiLiang Pu

Recently, various works that attempted to introduce rotation invariance to point cloud analysis have devised point-pair features, such as angles and distances.

Rethinking Pseudo-labeled Sample Mining for Semi-Supervised Object Detection

no code implementations 1 Jan 2021 Duo Li, Sanli Tang, Zhanzhan Cheng, ShiLiang Pu, Yi Niu, Wenming Tan, Fei Wu, Xiaokang Yang

However, the impact of the pseudo-labeled samples' quality as well as the mining strategies for high-quality training samples have rarely been studied in SSL.

Object Detection +1

MANGO: A Mask Attention Guided One-Stage Scene Text Spotter

1 code implementation 8 Dec 2020 Liang Qiao, Ying Chen, Zhanzhan Cheng, Yunlu Xu, Yi Niu, ShiLiang Pu, Fei Wu

Recently end-to-end scene text spotting has become a popular research topic due to its advantages of global optimization and high maintainability in real applications.

Text Spotting

Learning Open Set Network with Discriminative Reciprocal Points

1 code implementation ECCV 2020 Guangyao Chen, Limeng Qiao, Yemin Shi, Peixi Peng, Jia Li, Tiejun Huang, ShiLiang Pu, Yonghong Tian

In this process, one of the key challenges is to reduce the risk of generalizing the inherent characteristics of numerous unknown samples learned from a small amount of known data.

Open Set Learning

MAFF-Net: Filter False Positive for 3D Vehicle Detection with Multi-modal Adaptive Feature Fusion

no code implementations 23 Sep 2020 Zehan Zhang, Ming Zhang, Zhidong Liang, Xian Zhao, Ming Yang, Wenming Tan, ShiLiang Pu

Experimental results on the KITTI dataset demonstrate significant improvement in filtering false positives over the approach using only point cloud data.

Autonomous Driving

Two Step Joint Model for Drug Drug Interaction Extraction

no code implementations 28 Aug 2020 Siliang Tang, Qi Zhang, Tianpeng Zheng, Mengdi Zhou, Zhan Chen, Lixing Shen, Xiang Ren, Yueting Zhuang, ShiLiang Pu, Fei Wu

When patients need to take medicine, particularly more than one kind of drug simultaneously, they should be alerted to possible drug-drug interactions.

Drug–drug Interaction Extraction named-entity-recognition +4

Topic Adaptation and Prototype Encoding for Few-Shot Visual Storytelling

no code implementations 11 Aug 2020 Jiacheng Li, Siliang Tang, Juncheng Li, Jun Xiao, Fei Wu, ShiLiang Pu, Yueting Zhuang

In this paper, we focus on enhancing the generalization ability of the VIST model by considering the few-shot setting.

Meta-Learning Visual Storytelling

Learning a Domain Classifier Bank for Unsupervised Adaptive Object Detection

no code implementations 6 Jul 2020 Sanli Tang, Zhanzhan Cheng, ShiLiang Pu, Dashan Guo, Yi Niu, Fei Wu

To tackle this issue, we develop a fine-grained domain alignment approach with a well-designed domain classifier bank that achieves instance-level alignment with respect to their categories.

Object Detection

Text Recognition in Real Scenarios with a Few Labeled Samples

no code implementations 22 Jun 2020 Jinghuang Lin, Zhanzhan Cheng, Fan Bai, Yi Niu, ShiLiang Pu, Shuigeng Zhou

Scene text recognition (STR) is still a hot research topic in the computer vision field due to its various applications.

Domain Adaptation Scene Text Recognition

Unsupervised Image Classification for Deep Representation Learning

1 code implementation 20 Jun 2020 Wei-Jie Chen, ShiLiang Pu, Di Xie, Shicai Yang, Yilu Guo, Luojun Lin

Extensive experiments on ImageNet dataset have been conducted to prove the effectiveness of our method.

Classification Contrastive Learning +12

Object-QA: Towards High Reliable Object Quality Assessment

no code implementations 27 May 2020 Jing Lu, Baorui Zou, Zhanzhan Cheng, ShiLiang Pu, Shuigeng Zhou, Yi Niu, Fei Wu

In this paper, we define the problem of object quality assessment for the first time and propose an effective approach named Object-QA to assess highly reliable quality scores for object images.

Object Recognition Vocal Bursts Intensity Prediction

TRIE: End-to-End Text Reading and Information Extraction for Document Understanding

1 code implementation 27 May 2020 Peng Zhang, Yunlu Xu, Zhanzhan Cheng, ShiLiang Pu, Jing Lu, Liang Qiao, Yi Niu, Fei Wu

Since real-world ubiquitous documents (e.g., invoices, tickets, resumes and leaflets) contain rich information, automatic document image understanding has become a hot topic.

Counterfactual Samples Synthesizing for Robust Visual Question Answering

2 code implementations CVPR 2020 Long Chen, Xin Yan, Jun Xiao, Hanwang Zhang, ShiLiang Pu, Yueting Zhuang

To reduce the language biases, several recent works introduce an auxiliary question-only model to regularize the training of targeted VQA model, and achieve dominating performance on VQA-CP.

 Ranked #1 on Visual Question Answering (VQA) on VQA-CP (using extra training data)

Question Answering Visual Question Answering

Neural Inheritance Relation Guided One-Shot Layer Assignment Search

no code implementations 28 Feb 2020 Rang Meng, Wei-Jie Chen, Di Xie, Yuan Zhang, ShiLiang Pu

In this paper, for the first time, we systematically investigate the impact of different layer assignments on network performance by building an architecture dataset of layer assignment on CIFAR-100.

Neural Architecture Search

Refined Gate: A Simple and Effective Gating Mechanism for Recurrent Units

no code implementations 26 Feb 2020 Zhanzhan Cheng, Yunlu Xu, Mingjian Cheng, Yu Qiao, ShiLiang Pu, Yi Niu, Fei Wu

Recurrent neural networks (RNNs) have been widely studied in sequence learning tasks, while the mainstream models (e.g., LSTM and GRU) rely on the gating mechanism (which controls how information flows between hidden states).

Language Modelling Scene Text Recognition

Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting

1 code implementation 17 Feb 2020 Liang Qiao, Sanli Tang, Zhanzhan Cheng, Yunlu Xu, Yi Niu, ShiLiang Pu, Fei Wu

Many approaches have recently been proposed to detect irregular scene text and achieved promising results.

Text Spotting

Fast Task Adaptation for Few-Shot Learning

no code implementations 25 Sep 2019 Yingying Zhang, Qiaoyong Zhong, Di Xie, ShiLiang Pu

The key lies in generalization of prior knowledge learned from large-scale base classes and fast adaptation of the classifier to novel classes.

Few-Shot Learning

Adversarial Seeded Sequence Growing for Weakly-Supervised Temporal Action Localization

no code implementations 7 Aug 2019 Chengwei Zhang, Yunlu Xu, Zhanzhan Cheng, Yi Niu, ShiLiang Pu, Fei Wu, Futai Zou

The second module is a specific classifier for mining trivial or incomplete action regions, which is trained on the shared features after erasing the seeded regions activated by SSG.

Action Detection Weakly-supervised Temporal Action Localization +1

Learned Quality Enhancement via Multi-Frame Priors for HEVC Compliant Low-Delay Applications

no code implementations 3 May 2019 Ming Lu, Ming Cheng, Yiling Xu, ShiLiang Pu, Qiu Shen, Zhan Ma

Networked video applications, e.g., video conferencing, often suffer from poor visual quality due to unexpected network fluctuation and limited bandwidth.

Video Compression

All You Need is a Few Shifts: Designing Efficient Convolutional Neural Networks for Image Classification

3 code implementations CVPR 2019 Wei-Jie Chen, Di Xie, Yuan Zhang, ShiLiang Pu

In this family of architectures, the basic block is composed only of 1x1 convolutional layers with only a few shift operations applied to the intermediate feature maps.

General Classification Image Classification +1

You Only Recognize Once: Towards Fast Video Text Spotting

1 code implementation 8 Mar 2019 Zhanzhan Cheng, Jing Lu, Yi Niu, ShiLiang Pu, Fei Wu, Shuigeng Zhou

Video text spotting is still an important research topic due to its various real-world applications.

Text Spotting

Collaborative Spatio-temporal Feature Learning for Video Action Recognition

1 code implementation 4 Mar 2019 Chao Li, Qiaoyong Zhong, Di Xie, ShiLiang Pu

By sharing the convolution kernels of different views, spatial and temporal features are collaboratively learned and thus benefit from each other.

Action Recognition In Videos Temporal Action Localization +1

Cross-relation Cross-bag Attention for Distantly-supervised Relation Extraction

1 code implementation 27 Dec 2018 Yujin Yuan, Liyuan Liu, Siliang Tang, Zhongfei Zhang, Yueting Zhuang, ShiLiang Pu, Fei Wu, Xiang Ren

Distant supervision leverages knowledge bases to automatically label instances, thus allowing us to train a relation extractor without human annotations.

Relation Extraction

Learning Incremental Triplet Margin for Person Re-identification

no code implementations 17 Dec 2018 Yingying Zhang, Qiaoyong Zhong, Liang Ma, Di Xie, ShiLiang Pu

In particular, we propose a novel multi-stage training strategy which learns incremental triplet margin and improves triplet loss effectively.

Metric Learning Person Re-Identification

A Layer Decomposition-Recomposition Framework for Neuron Pruning towards Accurate Lightweight Networks

no code implementations 17 Dec 2018 Wei-Jie Chen, Yuan Zhang, Di Xie, ShiLiang Pu

A better alternative is to propagate the entire useful information to reconstruct the pruned layer instead of directly discarding the less important neurons.

Counterfactual Critic Multi-Agent Training for Scene Graph Generation

no code implementations ICCV 2019 Long Chen, Hanwang Zhang, Jun Xiao, Xiangnan He, ShiLiang Pu, Shih-Fu Chang

CMAT is a multi-agent policy gradient method that frames objects as cooperative agents, and then directly maximizes a graph-level metric as the reward.

Graph Generation Scene Graph Generation +1

Small-scale Pedestrian Detection Based on Topological Line Localization and Temporal Feature Aggregation

no code implementations ECCV 2018 Tao Song, Leiyu Sun, Di Xie, Haiming Sun, ShiLiang Pu

A critical issue in pedestrian detection is to detect small-scale objects that will introduce feeble contrast and motion blur in images and videos, which in our opinion should partially resort to deep-rooted annotation bias.

Pedestrian Detection

Extreme Network Compression via Filter Group Approximation

no code implementations ECCV 2018 Bo Peng, Wenming Tan, Zheyang Li, Shun Zhang, Di Xie, ShiLiang Pu

In this paper we propose a novel decomposition method based on filter group approximation, which can significantly reduce the redundancy of deep convolutional neural networks (CNNs) while maintaining the majority of feature representation.

General Classification Image Classification

Small-scale Pedestrian Detection Based on Somatic Topology Localization and Temporal Feature Aggregation

no code implementations 4 Jul 2018 Tao Song, Leiyu Sun, Di Xie, Haiming Sun, ShiLiang Pu

A critical issue in pedestrian detection is to detect small-scale objects that will introduce feeble contrast and motion blur in images and videos, which in our opinion should partially resort to deep-rooted annotation bias.

Pedestrian Detection

A practical convolutional neural network as loop filter for intra frame

no code implementations 16 May 2018 Xiaodan Song, Jiabao Yao, Lulu Zhou, Li Wang, Xiaoyang Wu, Di Xie, ShiLiang Pu

It aims to design a single CNN model with low redundancy to adapt to decoded frames with different qualities and ensure consistency.


Edit Probability for Scene Text Recognition

no code implementations CVPR 2018 Fan Bai, Zhanzhan Cheng, Yi Niu, ShiLiang Pu, Shuigeng Zhou

The advantage lies in that the training process can focus on the missing, superfluous and unrecognized characters, and thus the impact of the misalignment problem can be alleviated or even overcome.

Scene Text Recognition

AON: Towards Arbitrarily-Oriented Text Recognition

1 code implementation CVPR 2018 Zhanzhan Cheng, Yangliu Xu, Fan Bai, Yi Niu, ShiLiang Pu, Shuigeng Zhou

Existing methods on text recognition mainly work with regular (horizontal and frontal) texts and cannot be trivially generalized to handle irregular texts.

Optical Character Recognition (OCR) Scene Text Recognition

Focusing Attention: Towards Accurate Text Recognition in Natural Images

no code implementations ICCV 2017 Zhanzhan Cheng, Fan Bai, Yunlu Xu, Gang Zheng, ShiLiang Pu, Shuigeng Zhou

FAN consists of two major components: an attention network (AN) that is responsible for recognizing character targets as in the existing methods, and a focusing network (FN) that is responsible for adjusting attention by evaluating whether AN pays attention properly on the target areas in the images.

Scene Text Recognition

Skeleton-based Action Recognition with Convolutional Neural Networks

1 code implementation 25 Apr 2017 Chao Li, Qiaoyong Zhong, Di Xie, ShiLiang Pu

Current state-of-the-art approaches to skeleton-based action recognition are mostly based on recurrent neural networks (RNN).

Action Classification Action Recognition +3

Mixed context networks for semantic segmentation

no code implementations 19 Oct 2016 Haiming Sun, Di Xie, ShiLiang Pu

Semantic segmentation is challenging as it requires both object-level information and pixel-level accuracy.

General Classification Semantic Segmentation
