no code implementations • COLING 2022 • Jun Zhao, Xin Zhao, WenYu Zhan, Tao Gui, Qi Zhang, Liang Qiao, Zhanzhan Cheng, ShiLiang Pu
To deal with this problem, this work proposes a cross-document semantic enhancement method, which consists of two modules: 1) To prevent distractions from irrelevant regions in the current document, we design a learnable attention mask mechanism, which is used to adaptively filter redundant information in the current document.
no code implementations • 22 May 2023 • Yachun Li, Jingjing Wang, Yuhui Chen, Di Xie, ShiLiang Pu
To tackle the above issues, we propose a Single Domain Dynamic Generalization (SDDG) framework, which simultaneously exploits domain-invariant and domain-specific features on a per-sample basis and learns to generalize to various unseen domains with numerous natural images.
1 code implementation • 18 May 2023 • Wei Xue, Yongliang Shen, Wenqi Ren, Jietian Guo, ShiLiang Pu, Weiming Lu
Specifically, TaxBox consists of three components: (1) a graph aggregation module to leverage the structural information of the taxonomy and two lightweight decoders that map features to box embedding and capture complex relationships between concepts; (2) two probabilistic scorers that correspond to attachment and insertion operations and ensure the avoidance of pseudo-leaves; and (3) three learning objectives that assist the model in mapping concepts more granularly onto the box embedding space.
1 code implementation • CVPR 2023 • Mingjun Xu, Lingyun Qin, WeiJie Chen, ShiLiang Pu, Lei Zhang
In this work, we present an idea to remove non-causal factors from common features by multi-view adversarial training on source domains, because we observe that such insignificant non-causal factors may still be significant in other latent spaces (views) due to the multi-mode structure of data.
1 code implementation • CVPR 2023 • Hang Du, Xuejun Yan, Jingjing Wang, Di Xie, ShiLiang Pu
Most existing approaches for point cloud normal estimation aim to locally fit a geometric surface and calculate the normal from the fitted surface.
no code implementations • 12 Jan 2023 • Wei Zhao, Binbin Chen, WeiJie Chen, Shicai Yang, Di Xie, ShiLiang Pu, Yueting Zhuang
The domain adaptation part is implemented as a Source-Free Domain Adaptation paradigm, which only uses the pre-trained model and the unlabeled target data to further optimize in a self-supervised training manner.
no code implementations • 12 Jan 2023 • Yilu Guo, Xingyue Shi, WeiJie Chen, Shicai Yang, Di Xie, ShiLiang Pu, Yueting Zhuang
In the test-time training stage, we use the pre-trained model to assign noisy labels to the unlabeled target data, and propose a Label-Periodically-Updated DivideMix method for noisy label learning.
no code implementations • CVPR 2023 • Chen Lin, Bo Peng, Zheyang Li, Wenming Tan, Ye Ren, Jun Xiao, ShiLiang Pu
To this end, we detach a sharpness term from the loss which reflects the impact of quantization noise.
no code implementations • CVPR 2023 • Guiwei Zhang, Yongfei Zhang, Tianyu Zhang, Bo Li, ShiLiang Pu
Although recent studies empirically show that injecting Convolutional Neural Networks (CNNs) into Vision Transformers (ViTs) can improve the performance of person re-identification, the rationale behind it remains elusive.
no code implementations • 30 Dec 2022 • Pengwei Yin, Jiawu Dai, Jingjing Wang, Di Xie, ShiLiang Pu
Gaze estimation is the fundamental basis for many visual tasks.
no code implementations • NeurIPS 2022 • Zheng Chuanyang, Zheyang Li, Kai Zhang, Zhi Yang, Wenming Tan, Jun Xiao, Ye Ren, ShiLiang Pu
In this paper, we introduce joint importance, which integrates essential structural-aware interactions between components for the first time, to perform collaborative pruning.
no code implementations • ECCV 2022 • Jingyuan Ma, Xiangyu Lei, Nan Liu, Xian Zhao, ShiLiang Pu
Semantics-guided self-supervised monocular depth estimation has been widely researched, owing to the strong cross-task correlation of depth and semantics.
1 code implementation • 9 Oct 2022 • Rang Meng, Xianfeng Li, WeiJie Chen, Shicai Yang, Jie Song, Xinchao Wang, Lei Zhang, Mingli Song, Di Xie, ShiLiang Pu
Under this guidance, a novel Attention Diversification framework is proposed, in which Intra-Model and Inter-Model Attention Diversification Regularization collaborate to reassign appropriate attention to diverse task-related features.
no code implementations • 8 Oct 2022 • Jie Liu, Jingjing Wang, Peng Zhang, Chunmao Wang, Di Xie, ShiLiang Pu
To overcome these limitations, we propose a multi-scale wavelet transformer framework for face forgery detection.
1 code implementation • 8 Oct 2022 • Hang Du, Xuejun Yan, Jingjing Wang, Di Xie, ShiLiang Pu
In this manner, the proposed cascaded refinement network can be easily optimized without extra learning strategies.
1 code implementation • 8 Oct 2022 • Xuejun Yan, Hongyu Yan, Jingjing Wang, Hang Du, Zhihong Wu, Di Xie, ShiLiang Pu, Li Lu
The rapid development of point cloud learning has driven point cloud completion into a new era.
1 code implementation • 2 Aug 2022 • Qiming Yang, Kai Zhang, Chaoxiang Lan, Zhi Yang, Zheyang Li, Wenming Tan, Jun Xiao, ShiLiang Pu
To tackle these issues, we propose Unified Normalization (UN), which can speed up inference by being fused with other linear operations and achieve performance on par with LN.
no code implementations • 14 Jul 2022 • Zhanzhan Cheng, Peng Zhang, Can Li, Qiao Liang, Yunlu Xu, Pengfei Li, ShiLiang Pu, Yi Niu, Fei Wu
Most existing methods divide this task into two subparts: the text reading part for obtaining the plain text from the original document images and the information extraction part for extracting key contents.
no code implementations • 14 Jul 2022 • Guimei Cao, Zhanzhan Cheng, Yunlu Xu, Duo Li, ShiLiang Pu, Yi Niu, Fei Wu
In this paper, we propose an end-to-end trainable adaptively expandable network named E2-AEN, which dynamically generates lightweight structures for new tasks without any accuracy drop in previous tasks.
1 code implementation • 14 Jul 2022 • Ying Chen, Liang Qiao, Zhanzhan Cheng, ShiLiang Pu, Yi Niu, Xi Li
In this paper, to address this problem, we propose a novel cost-efficient Dynamic Low-resolution Distillation (DLD) text spotting framework, which aims to infer images in different small but recognizable resolutions and achieve a better balance between accuracy and efficiency.
Knowledge Distillation
Optical Character Recognition (OCR)
1 code implementation • 13 Jul 2022 • Qiang Li, Zhaoliang Yao, Jingjing Wang, Ye Tian, Pengju Yang, Di Xie, ShiLiang Pu
Based on this dataset, we propose a method to obtain the blur scores only with the pairwise rank labels as supervision.
no code implementations • 5 Jul 2022 • Wenxu Shi, Lei Zhang, WeiJie Chen, ShiLiang Pu
Universal domain adaptive object detection (UniDAOD) is more challenging than domain adaptive object detection (DAOD), since the label space of the source domain may not be the same as that of the target, and the scale of objects in universal scenarios can vary dramatically (i.e., category shift and scale shift).
1 code implementation • CVPR 2022 • Rang Meng, WeiJie Chen, Shicai Yang, Jie Song, Luojun Lin, Di Xie, ShiLiang Pu, Xinchao Wang, Mingli Song, Yueting Zhuang
In this paper, we introduce a simple framework, Slimmable Domain Adaptation, to improve cross-domain generalization with a weight-sharing model bank, from which models of different capacities can be sampled to accommodate different accuracy-efficiency trade-offs.
3 code implementations • CVPR 2022 • Binbin Chen, WeiJie Chen, Shicai Yang, Yunyi Xuan, Jie Song, Di Xie, ShiLiang Pu, Mingli Song, Yueting Zhuang
To remedy this issue, we present a novel label assignment mechanism for the self-training framework, namely proposal self-assignment, which injects the proposals from the student into the teacher and generates accurate pseudo labels to match each proposal in the student model accordingly.
no code implementations • 13 Jun 2022 • Yilu Guo, Shicai Yang, WeiJie Chen, Liang Ma, Di Xie, ShiLiang Pu
Therefore, it is crucial to study how to learn more discriminative representations while avoiding over-fitting.
no code implementations • 13 Jun 2022 • Junchu Huang, WeiJie Chen, Shicai Yang, Di Xie, ShiLiang Pu, Yueting Zhuang
This framework can effectively reduce the impact of noisy labels from the CLIP model by combining both techniques.
1 code implementation • 13 Jun 2022 • Meilin Chen, WeiJie Chen, Shicai Yang, Jie Song, Xinchao Wang, Lei Zhang, Yunfeng Yan, Donglian Qi, Yueting Zhuang, Di Xie, ShiLiang Pu
In addition, we conduct anchor adaptation in parallel with localization adaptation, since anchors can be regarded as learnable parameters.
no code implementations • 23 May 2022 • Fanfan Ye, Liang Ma, Qiaoyong Zhong, Di Xie, ShiLiang Pu
The knowledge extracted by the delegator is then utilized to maintain the performance of the model on old tasks in incremental learning.
no code implementations • 23 May 2022 • Yingying Zhang, Qiaoyong Zhong, Di Xie, ShiLiang Pu
However, the number of stored latent codes in the autoencoder increases linearly with the scale of the data, and the trained encoder is redundant for the replaying stage.
1 code implementation • 25 Apr 2022 • Ming Lu, Fangdong Chen, ShiLiang Pu, Zhan Ma
To this end, an Integrated Convolution and Self-Attention (ICSA) unit is first proposed to form a content-adaptive transform that dynamically characterizes and embeds the neighborhood information of any input.
no code implementations • 1 Apr 2022 • Yachun Li, Ying Lian, Jingjing Wang, Yuhui Chen, Chunmao Wang, ShiLiang Pu
We thus define a new domain adaptation setting called Few-shot One-class Domain Adaptation (FODA), where adaptation only relies on a limited number of target bonafide samples.
no code implementations • CVPR 2022 • Qiang Li, Jingjing Wang, Zhaoliang Yao, Yachun Li, Pengju Yang, Jingwei Yan, Chunmao Wang, ShiLiang Pu
In this paper, we emphatically summarize that learning an adaptive label distribution on ordinal regression tasks should follow three principles.
no code implementations • 1 Apr 2022 • Jingwei Yan, Jingjing Wang, Qiang Li, Chunmao Wang, ShiLiang Pu
Automatic facial action unit (AU) recognition is a challenging task due to the scarcity of manual annotations.
1 code implementation • 31 Mar 2022 • Da-Wei Zhou, Han-Jia Ye, Liang Ma, Di Xie, ShiLiang Pu, De-Chuan Zhan
In this work, we propose a new paradigm for FSCIL based on meta-learning by LearnIng Multi-phase Incremental Tasks (LIMIT), which synthesizes fake FSCIL tasks from the base dataset.
Ranked #4 on Few-Shot Class-Incremental Learning on CIFAR-100
class-incremental learning
Few-Shot Class-Incremental Learning
no code implementations • ACL 2022 • Mengze Li, Tianbao Wang, Haoyu Zhang, Shengyu Zhang, Zhou Zhao, Jiaxu Miao, Wenqiao Zhang, Wenming Tan, Jin Wang, Peng Wang, ShiLiang Pu, Fei Wu
To achieve effective grounding under a limited annotation budget, we investigate one-shot video grounding, and learn to ground natural language in all video frames with solely one frame labeled, in an end-to-end manner.
1 code implementation • CVPR 2022 • Da-Wei Zhou, Fu-Yun Wang, Han-Jia Ye, Liang Ma, ShiLiang Pu, De-Chuan Zhan
Forward compatibility requires future new classes to be easily incorporated into the current model based on the current stage data, and we seek to realize it by reserving embedding space for future new classes.
Ranked #3 on Few-Shot Class-Incremental Learning on CIFAR-100
class-incremental learning
Few-Shot Class-Incremental Learning
1 code implementation • ACL 2022 • Guanglin Niu, Bo Li, Yongfei Zhang, ShiLiang Pu
The previous knowledge graph embedding (KGE) techniques suffer from invalid negative sampling and the uncertainty of fact-view link prediction, limiting KGC's performance.
no code implementations • 21 Feb 2022 • Ying Bian, Peng Zhang, Jingjing Wang, Chunmao Wang, ShiLiang Pu
However, many other generalizable cues are unexplored for face anti-spoofing, which limits their performance under cross-dataset testing.
no code implementations • 17 Jan 2022 • Chen Lin, Zheyang Li, Bo Peng, Haoji Hu, Wenming Tan, Ye Ren, ShiLiang Pu
This paper introduces a post-training quantization (PTQ) method achieving highly efficient Convolutional Neural Network (CNN) quantization with high performance.
no code implementations • 10 Jan 2022 • Jing Du, ShiLiang Pu, Qinbo Dong, Chao Jin, Xin Qi, Dian Gu, Ru Wu, Hongwei Zhou
Although modern automatic speech recognition (ASR) systems can achieve high performance, they may produce errors that weaken readers' experience and do harm to downstream tasks.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
no code implementations • CVPR 2022 • Yuxi Wu, Changhuai Chen, Jun Che, ShiLiang Pu
To handle this task, we propose a novel visual explanation paradigm called Feature Activation Mapping (FAM) in this paper.
no code implementations • COLING 2022 • Guanglin Niu, Bo Li, Yongfei Zhang, ShiLiang Pu
Knowledge graph (KG) inference aims to address the natural incompleteness of KGs, with approaches including rule learning-based and KG embedding (KGE) models.
no code implementations • NeurIPS 2021 • Zhi Zhou, Lan-Zhe Guo, Zhanzhan Cheng, Yu-Feng Li, ShiLiang Pu
However, in many real-world applications, it is desirable to have SSL algorithms that not only classify the samples drawn from the same distribution of labeled data but also detect out-of-distribution (OOD) samples drawn from an unknown distribution.
Out-of-Distribution Detection
Out of Distribution (OOD) Detection
no code implementations • 21 Oct 2021 • Linlan Zhao, Dashan Guo, Yunlu Xu, Liang Qiao, Zhanzhan Cheng, ShiLiang Pu, Yi Niu, Xiangzhong Fang
Few-shot learning (FSL) aims to learn models that generalize to novel classes with limited training samples.
no code implementations • 29 Sep 2021 • Changhuai Chen, Xile Shen, Mengyu Ye, Yi Lu, Jun Che, ShiLiang Pu
We figure out that the background class should be treated differently from the classes of interest during training.
no code implementations • 6 Sep 2021 • Ning Wei, Jiahua Liang, Di Xie, ShiLiang Pu
Designing optimal reward functions has been desired but extremely difficult in reinforcement learning (RL).
no code implementations • ICCV 2021 • Jing Hao, Zhixin Zhang, Shicai Yang, Di Xie, ShiLiang Pu
Nowadays, advanced image editing tools and technical skills produce more realistic tampered images, which can easily evade image forensic systems and make authenticity verification of images more difficult.
no code implementations • 30 Jul 2021 • Jingwei Yan, Jingjing Wang, Qiang Li, Chunmao Wang, ShiLiang Pu
Based on these two self-supervised auxiliary tasks, local features, mutual relation and motion cues of AUs are better captured in the backbone network with the proposed regional and temporal based auxiliary task learning (RTATL) framework.
no code implementations • ICCV 2021 • Jinlei Hou, Yingying Zhang, Qiaoyong Zhong, Di Xie, ShiLiang Pu, Hong Zhou
Surprisingly, by varying the granularity of division on feature maps, we are able to modulate the reconstruction capability of the model for both normal and abnormal samples.
1 code implementation • 19 Jul 2021 • Dahu Shi, Xing Wei, Xiaodong Yu, Wenming Tan, Ye Ren, ShiLiang Pu
Multi-person pose estimation is an attractive and challenging task.
Ranked #4 on Multi-Person Pose Estimation on COCO minival
no code implementations • CVPR 2021 • Zhidong Liang, Zehan Zhang, Ming Zhang, Xian Zhao, ShiLiang Pu
Benefiting from the dense representation of the range image, RangeIoUDet is entirely constructed based on 2D convolution, making it possible to have a fast inference speed.
1 code implementation • ACL 2021 • Shan Yang, Yongfei Zhang, Guanglin Niu, Qinghua Zhao, ShiLiang Pu
Few-shot relation extraction (FSRE) is of great importance in long-tail distribution problems, especially in specialized domains with low-resource data.
no code implementations • 13 May 2021 • Peng Zhang, Can Li, Liang Qiao, Zhanzhan Cheng, ShiLiang Pu, Yi Niu, Fei Wu
To address the above limitations, we propose a unified framework VSR for document layout analysis, combining vision, semantics and relations.
1 code implementation • 13 May 2021 • Liang Qiao, Zaisheng Li, Zhanzhan Cheng, Peng Zhang, ShiLiang Pu, Yi Niu, Wenqi Ren, Wenming Tan, Fei Wu
In this paper, we aim to obtain more reliable aligned bounding boxes by fully utilizing the visual information from both text regions in proposed local features and cell relations in global features.
Ranked #6 on Table Recognition on PubTabNet
1 code implementation • 13 May 2021 • Hui Jiang, Yunlu Xu, Zhanzhan Cheng, ShiLiang Pu, Yi Niu, Wenqi Ren, Fei Wu, Wenming Tan
In this work, we excavate an implicit task, character counting, within traditional text recognition, without additional annotation cost.
no code implementations • ICCV 2021 • Jianyun Xu, Ruixiang Zhang, Jian Dou, Yushi Zhu, Jie Sun, ShiLiang Pu
The voxel-based view is regular, but sparse, and computation grows cubically when voxel resolution increases.
Ranked #15 on Robust 3D Semantic Segmentation on SemanticKITTI-C
no code implementations • 16 Mar 2021 • Taiheng Zhang, Qiaoyong Zhong, ShiLiang Pu, Di Xie
Object detection involves two sub-tasks, i.e., localizing objects in an image and classifying them into various categories.
no code implementations • 24 Feb 2021 • Jingjing Wang, Jingyi Zhang, Ying Bian, Youyi Cai, Chunmao Wang, ShiLiang Pu
In this paper, we propose a self-domain adaptation framework to leverage the unlabeled test domain data at inference.
no code implementations • 24 Feb 2021 • Jingwei Yan, Boyuan Jiang, Jingjing Wang, Qiang Li, Chunmao Wang, ShiLiang Pu
In order to incorporate the intra-level AU relation and inter-level AU regional relevance simultaneously, a multi-level AU relation graph is constructed and graph convolution is performed to further enhance AU regional features of each level.
no code implementations • 23 Feb 2021 • WeiJie Chen, Luojun Lin, Shicai Yang, Di Xie, ShiLiang Pu, Yueting Zhuang, Wenqi Ren
Usually, the given source domain pre-trained model is expected to be optimized with only unlabeled target data, which is termed source-free unsupervised domain adaptation.
no code implementations • 1 Feb 2021 • WeiJie Chen, Yilu Guo, Shicai Yang, Zhaoyang Li, Zhenxin Ma, Binbin Chen, Long Zhao, Di Xie, ShiLiang Pu, Yueting Zhuang
Therefore, we turn our attention to suppressing false positives in each target domain in an unsupervised way.
no code implementations • ICCV 2021 • Jianyun Xu, Xin Tang, Yushi Zhu, Jie Sun, ShiLiang Pu
Recently, various works that attempted to introduce rotation invariance to point cloud analysis have devised point-pair features, such as angles and distances.
no code implementations • 1 Jan 2021 • Duo Li, Sanli Tang, Zhanzhan Cheng, ShiLiang Pu, Yi Niu, Wenming Tan, Fei Wu, Xiaokang Yang
However, the impact of the pseudo-labeled samples' quality, as well as the mining strategies for high-quality training samples, has rarely been studied in SSL.
no code implementations • 10 Dec 2020 • Xianfeng Li, WeiJie Chen, Di Xie, Shicai Yang, Peng Yuan, ShiLiang Pu, Yueting Zhuang
However, it is difficult to evaluate the quality of pseudo labels since no labels are available in target domain.
1 code implementation • 8 Dec 2020 • Liang Qiao, Ying Chen, Zhanzhan Cheng, Yunlu Xu, Yi Niu, ShiLiang Pu, Fei Wu
Recently end-to-end scene text spotting has become a popular research topic due to its advantages of global optimization and high maintainability in real applications.
1 code implementation • ECCV 2020 • Guangyao Chen, Limeng Qiao, Yemin Shi, Peixi Peng, Jia Li, Tiejun Huang, ShiLiang Pu, Yonghong Tian
In this process, one of the key challenges is to reduce the risk of generalizing the inherent characteristics of numerous unknown samples learned from a small amount of known data.
no code implementations • 17 Oct 2020 • Pengbo Zhao, Zhenshen Qu, Yingjia Bu, Wenming Tan, Ye Ren, ShiLiang Pu
Fast and precise object detection for high-resolution aerial images has been a challenging task over the years.
Ranked #32 on Object Detection In Aerial Images on DOTA
no code implementations • 6 Oct 2020 • Guanglin Niu, Bo Li, Yongfei Zhang, Yongpan Sheng, Chuan Shi, Jingyang Li, ShiLiang Pu
Inference on a large-scale knowledge graph (KG) is of great importance for KG applications like question answering.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Guanglin Niu, Bo Li, Yongfei Zhang, ShiLiang Pu, Jingyang Li
Recent advances in Knowledge Graph Embedding (KGE) allow for representing entities and relations in continuous vector spaces.
no code implementations • 23 Sep 2020 • Zehan Zhang, Ming Zhang, Zhidong Liang, Xian Zhao, Ming Yang, Wenming Tan, ShiLiang Pu
Experimental results on the KITTI dataset demonstrate significant improvement in filtering false positives over the approach using only point cloud data.
1 code implementation • 1 Sep 2020 • Zhidong Liang, Ming Zhang, Zehan Zhang, Xian Zhao, ShiLiang Pu
We present RangeRCNN, a novel and effective 3D object detection framework based on the range image representation.
no code implementations • 28 Aug 2020 • Siliang Tang, Qi Zhang, Tianpeng Zheng, Mengdi Zhou, Zhan Chen, Lixing Shen, Xiang Ren, Yueting Zhuang, ShiLiang Pu, Fei Wu
When patients need to take medicine, particularly more than one kind of drug simultaneously, they should be alerted to possible drug-drug interactions.
Drug–drug Interaction Extraction
named-entity-recognition
no code implementations • 11 Aug 2020 • Jiacheng Li, Siliang Tang, Juncheng Li, Jun Xiao, Fei Wu, ShiLiang Pu, Yueting Zhuang
In this paper, we focus on enhancing the generalization ability of the VIST model by considering the few-shot setting.
1 code implementation • 29 Jul 2020 • Fanfan Ye, ShiLiang Pu, Qiaoyong Zhong, Chao Li, Di Xie, Huiming Tang
The key lies in the design of the graph structure, which encodes skeleton topology information.
no code implementations • 6 Jul 2020 • Sanli Tang, Zhanzhan Cheng, ShiLiang Pu, Dashan Guo, Yi Niu, Fei Wu
To tackle this issue, we develop a fine-grained domain alignment approach with a well-designed domain classifier bank that achieves instance-level alignment with respect to their categories.
no code implementations • 22 Jun 2020 • Jinghuang Lin, Zhanzhan Cheng, Fan Bai, Yi Niu, ShiLiang Pu, Shuigeng Zhou
Scene text recognition (STR) is still a hot research topic in the computer vision field due to its various applications.
1 code implementation • 20 Jun 2020 • Wei-Jie Chen, ShiLiang Pu, Di Xie, Shicai Yang, Yilu Guo, Luojun Lin
Extensive experiments on the ImageNet dataset have been conducted to demonstrate the effectiveness of our method.
3 code implementations • 27 May 2020 • Chengwei Zhang, Yunlu Xu, Zhanzhan Cheng, ShiLiang Pu, Yi Niu, Fei Wu, Futai Zou
Arbitrary text appearance poses a great challenge in scene text recognition tasks.
no code implementations • 27 May 2020 • Jing Lu, Baorui Zou, Zhanzhan Cheng, ShiLiang Pu, Shuigeng Zhou, Yi Niu, Fei Wu
In this paper, we define the problem of object quality assessment for the first time and propose an effective approach named Object-QA to assess highly reliable quality scores for object images.
1 code implementation • 27 May 2020 • Peng Zhang, Yunlu Xu, Zhanzhan Cheng, ShiLiang Pu, Jing Lu, Liang Qiao, Yi Niu, Fei Wu
Since real-world ubiquitous documents (e.g., invoices, tickets, resumes and leaflets) contain rich information, automatic document image understanding has become a hot topic.
no code implementations • 26 Apr 2020 • Qi She, Fan Feng, Qi Liu, Rosa H. M. Chan, Xinyue Hao, Chuanlin Lan, Qihan Yang, Vincenzo Lomonaco, German I. Parisi, Heechul Bae, Eoin Brophy, Baoquan Chen, Gabriele Graffieti, Vidit Goel, Hyonyoung Han, Sathursan Kanagarajah, Somesh Kumar, Siew-Kei Lam, Tin Lun Lam, Liang Ma, Davide Maltoni, Lorenzo Pellegrini, Duvindu Piyasena, ShiLiang Pu, Debdoot Sheet, Soonyong Song, Youngsung Son, Zhengwei Wang, Tomas E. Ward, Jianwen Wu, Meiqing Wu, Di Xie, Yangsheng Xu, Lin Yang, Qiaoyong Zhong, Liguang Zhou
This report summarizes IROS 2019-Lifelong Robotic Vision Competition (Lifelong Object Recognition Challenge) with methods and results from the top 8 finalists (out of over 150 teams).
2 code implementations • CVPR 2020 • Long Chen, Xin Yan, Jun Xiao, Hanwang Zhang, ShiLiang Pu, Yueting Zhuang
To reduce language biases, several recent works introduce an auxiliary question-only model to regularize the training of the targeted VQA model, and achieve dominating performance on VQA-CP.
Ranked #1 on Visual Question Answering (VQA) on VQA-CP (using extra training data)
no code implementations • 28 Feb 2020 • Rang Meng, Wei-Jie Chen, Di Xie, Yuan Zhang, ShiLiang Pu
In this paper, for the first time, we systematically investigate the impact of different layer assignments on network performance by building an architecture dataset of layer assignment on CIFAR-100.
no code implementations • 26 Feb 2020 • Zhanzhan Cheng, Yunlu Xu, Mingjian Cheng, Yu Qiao, ShiLiang Pu, Yi Niu, Fei Wu
Recurrent neural networks (RNNs) have been widely studied in sequence learning tasks, while the mainstream models (e.g., LSTM and GRU) rely on the gating mechanism (in control of how information flows between hidden states).
1 code implementation • 17 Feb 2020 • Liang Qiao, Sanli Tang, Zhanzhan Cheng, Yunlu Xu, Yi Niu, ShiLiang Pu, Fei Wu
Many approaches have recently been proposed to detect irregular scene text and have achieved promising results.
no code implementations • 21 Nov 2019 • Jiaxu Chen, Jing Hao, Kai Chen, Di Xie, Shicai Yang, ShiLiang Pu
This paper introduces an end-to-end audio classification system based on raw waveforms and a mix-training strategy.
no code implementations • 25 Sep 2019 • Yingying Zhang, Qiaoyong Zhong, Di Xie, ShiLiang Pu
The key lies in generalization of prior knowledge learned from large-scale base classes and fast adaptation of the classifier to novel classes.
no code implementations • 7 Aug 2019 • Chengwei Zhang, Yunlu Xu, Zhanzhan Cheng, Yi Niu, ShiLiang Pu, Fei Wu, Futai Zou
The second module is a specific classifier for mining trivial or incomplete action regions, which is trained on the shared features after erasing the seeded regions activated by SSG.
Action Detection
Weakly-supervised Temporal Action Localization
no code implementations • 3 May 2019 • Ming Lu, Ming Cheng, Yiling Xu, ShiLiang Pu, Qiu Shen, Zhan Ma
Networked video applications, e.g., video conferencing, often suffer from poor visual quality due to unexpected network fluctuations and limited bandwidth.
1 code implementation • NAACL 2019 • Qi Zhang, Siliang Tang, Xiang Ren, Fei Wu, ShiLiang Pu, Yueting Zhuang
This paper provides a new way to improve the efficiency of the REINFORCE training process.
3 code implementations • CVPR 2019 • Wei-Jie Chen, Di Xie, Yuan Zhang, ShiLiang Pu
In this family of architectures, the basic block is composed only of 1x1 convolutional layers, with only a few shift operations applied to the intermediate feature maps.
1 code implementation • 8 Mar 2019 • Zhanzhan Cheng, Jing Lu, Yi Niu, ShiLiang Pu, Fei Wu, Shuigeng Zhou
Video text spotting is still an important research topic due to its various real-world applications.
1 code implementation • 4 Mar 2019 • Chao Li, Qiaoyong Zhong, Di Xie, ShiLiang Pu
By sharing the convolution kernels of different views, spatial and temporal features are collaboratively learned and thus benefit from each other.
Action Recognition In Videos
Temporal Action Localization
1 code implementation • 27 Dec 2018 • Yujin Yuan, Liyuan Liu, Siliang Tang, Zhongfei Zhang, Yueting Zhuang, ShiLiang Pu, Fei Wu, Xiang Ren
Distant supervision leverages knowledge bases to automatically label instances, thus allowing us to train relation extractors without human annotations.
no code implementations • 17 Dec 2018 • Yingying Zhang, Qiaoyong Zhong, Liang Ma, Di Xie, ShiLiang Pu
In particular, we propose a novel multi-stage training strategy which learns an incremental triplet margin and improves triplet loss effectively.
no code implementations • 17 Dec 2018 • Wei-Jie Chen, Yuan Zhang, Di Xie, ShiLiang Pu
A better alternative is to propagate the entire useful information to reconstruct the pruned layer instead of directly discarding the less important neurons.
no code implementations • ICCV 2019 • Long Chen, Hanwang Zhang, Jun Xiao, Xiangnan He, ShiLiang Pu, Shih-Fu Chang
CMAT is a multi-agent policy gradient method that frames objects as cooperative agents, and then directly maximizes a graph-level metric as the reward.
no code implementations • 19 Nov 2018 • Yunlu Xu, Chengwei Zhang, Zhanzhan Cheng, Jianwen Xie, Yi Niu, ShiLiang Pu, Fei Wu
Finally, we transform the output of the recurrent neural network into the corresponding action distribution.
no code implementations • ECCV 2018 • Tao Song, Leiyu Sun, Di Xie, Haiming Sun, ShiLiang Pu
A critical issue in pedestrian detection is to detect small-scale objects that will introduce feeble contrast and motion blur in images and videos, which in our opinion should be partially attributed to deep-rooted annotation bias.
no code implementations • ECCV 2018 • Bo Peng, Wenming Tan, Zheyang Li, Shun Zhang, Di Xie, ShiLiang Pu
In this paper we propose a novel decomposition method based on filter group approximation, which can significantly reduce the redundancy of deep convolutional neural networks (CNNs) while maintaining the majority of feature representation.
no code implementations • 4 Jul 2018 • Tao Song, Leiyu Sun, Di Xie, Haiming Sun, ShiLiang Pu
A critical issue in pedestrian detection is to detect small-scale objects that will introduce feeble contrast and motion blur in images and videos, which in our opinion should be partially attributed to deep-rooted annotation bias.
Ranked #17 on Pedestrian Detection on CityPersons
no code implementations • 16 May 2018 • Xiaodan Song, Jiabao Yao, Lulu Zhou, Li Wang, Xiaoyang Wu, Di Xie, ShiLiang Pu
It aims to design a single CNN model with low redundancy to adapt to decoded frames with different qualities and ensure consistency.
Multimedia
no code implementations • CVPR 2018 • Fan Bai, Zhanzhan Cheng, Yi Niu, ShiLiang Pu, Shuigeng Zhou
The advantage lies in that the training process can focus on the missing, superfluous and unrecognized characters, and thus the impact of the misalignment problem can be alleviated or even overcome.
6 code implementations • 17 Apr 2018 • Chao Li, Qiaoyong Zhong, Di Xie, ShiLiang Pu
Skeleton-based human action recognition has recently drawn increasing attention with the availability of large-scale skeleton datasets.
Ranked #2 on Skeleton Based Action Recognition on PKU-MMD
1 code implementation • CVPR 2018 • Zhanzhan Cheng, Yangliu Xu, Fan Bai, Yi Niu, ShiLiang Pu, Shuigeng Zhou
Existing methods on text recognition mainly work with regular (horizontal and frontal) texts and cannot be trivially generalized to handle irregular texts.
Ranked #8 on Scene Text Recognition on ICDAR 2003
no code implementations • 30 Oct 2017 • Qiaoyong Zhong, Chao Li, Yingying Zhang, Di Xie, Shicai Yang, ShiLiang Pu
A deep region-based object detector consists of a region proposal step and a deep object recognition step.
no code implementations • ICCV 2017 • Zhanzhan Cheng, Fan Bai, Yunlu Xu, Gang Zheng, ShiLiang Pu, Shuigeng Zhou
FAN consists of two major components: an attention network (AN) that is responsible for recognizing character targets as in the existing methods, and a focusing network (FN) that is responsible for adjusting attention by evaluating whether AN pays attention properly on the target areas in the images.
1 code implementation • 25 Apr 2017 • Chao Li, Qiaoyong Zhong, Di Xie, ShiLiang Pu
Current state-of-the-art approaches to skeleton-based action recognition are mostly based on recurrent neural networks (RNNs).
Ranked #3 on Skeleton Based Action Recognition on PKU-MMD
no code implementations • CVPR 2017 • Di Xie, Jiang Xiong, ShiLiang Pu
Moreover, we can successfully train plain CNNs to match the performance of the residual counterparts.
no code implementations • 19 Oct 2016 • Haiming Sun, Di Xie, ShiLiang Pu
Semantic segmentation is challenging as it requires both object-level information and pixel-level accuracy.