Search Results for author: Xu-Cheng Yin

Found 26 papers, 11 papers with code

RAPIDFlow: Recurrent Adaptable Pyramids with Iterative Decoding for Efficient Optical Flow Estimation

1 code implementation • IEEE International Conference on Robotics and Automation (ICRA) 2024 • Henrique Morimitsu, Xiaobin Zhu, Roberto M. Cesar-Jr., Xiangyang Ji, Xu-Cheng Yin

Extracting motion information from videos with optical flow estimation is vital in multiple practical robot applications.

Ranked #6 on Optical Flow Estimation on KITTI 2015

Optical Flow Estimation

197

Inverse-like Antagonistic Scene Text Spotting via Reading-Order Estimation and Dynamic Sampling

no code implementations • 8 Jan 2024 • Shi-Xue Zhang, Chun Yang, Xiaobin Zhu, Hongyang Zhou, Hongfa Wang, Xu-Cheng Yin

Specifically, we propose an innovative reading-order estimation module (REM) that extracts reading-order information from the initial text boundary generated by an initial boundary module (IBM).

Text Detection, Text Spotting

InterFormer: Interactive Local and Global Features Fusion for Automatic Speech Recognition

no code implementations • 24 May 2023 • Zhi-Hao Lai, Tian-Hao Zhang, Qi Liu, Xinyuan Qian, Li-Fang Wei, Song-Lu Chen, Feng Chen, Xu-Cheng Yin

To address these issues, this paper proposes InterFormer for interactive local and global features fusion to learn a better representation for ASR.

Automatic Speech Recognition, Automatic Speech Recognition (ASR) +1

Rethinking Speech Recognition with A Multimodal Perspective via Acoustic and Semantic Cooperative Decoding

no code implementations • 23 May 2023 • Tian-Hao Zhang, Hai-Bo Qin, Zhi-Hao Lai, Song-Lu Chen, Qi Liu, Feng Chen, Xinyuan Qian, Xu-Cheng Yin

The experimental results show that ASCD significantly improves the performance by leveraging both the acoustic and semantic information cooperatively.

speech-recognition, Speech Recognition

Unsupervised Multi-view Pedestrian Detection

no code implementations • 21 May 2023 • Mengyin Liu, Chao Zhu, Shiqi Ren, Xu-Cheng Yin

1) Firstly, Semantic-aware Iterative Segmentation (SIS) is proposed to extract unsupervised representations of multi-view images, which are converted into 2D pedestrian masks as pseudo labels, via our proposed iterative PCA and zero-shot semantic classes from vision-language models.

Camera Calibration, Pedestrian Detection

VLPD: Context-Aware Pedestrian Detection via Vision-Language Semantic Self-Supervision

1 code implementation • CVPR 2023 • Mengyin Liu, Jie Jiang, Chao Zhu, Xu-Cheng Yin

Firstly, we propose a self-supervised Vision-Language Semantic (VLS) segmentation method, which learns both fully-supervised pedestrian detection and contextual segmentation via self-generated explicit labels of semantic classes by vision-language models.

Ranked #5 on Pedestrian Detection on Caltech

Autonomous Driving, Pedestrian Detection

Learning Correction Filter via Degradation-Adaptive Regression for Blind Single Image Super-Resolution

1 code implementation • ICCV 2023 • Hongyang Zhou, Xiaobin Zhu, Jianqing Zhu, Zheng Han, Shi-Xue Zhang, Jingyan Qin, Xu-Cheng Yin

Instead of assuming degradations are spatially invariant across the whole image, we learn correction filters to adjust degradations to known degradations in a spatially variant way by a novel linearly-assembled pixel degradation-adaptive regression module (DARM).

Image Super-Resolution, regression

Arbitrary Shape Text Detection via Segmentation with Probability Maps

1 code implementation • 26 Aug 2022 • Shi-Xue Zhang, Xiaobin Zhu, Lei Chen, Jie-Bo Hou, Xu-Cheng Yin

To be concrete, we adopt a Sigmoid Alpha Function (SAF) to transfer the distances between boundaries and their inside pixels to a probability map.

Scene Text Detection, Segmentation +1

Boosting Multi-Modal E-commerce Attribute Value Extraction via Unified Learning Scheme and Dynamic Range Minimization

no code implementations • 15 Jul 2022 • Mengyin Liu, Chao Zhu, Hongyu Gao, Weibo Gu, Hongfa Wang, Wei Liu, Xu-Cheng Yin

2) Secondly, a text-guided information range minimization method is proposed to adaptively encode descriptive parts of each modality into an identical space with a powerful pretrained linguistic model.

Attribute, Attribute Value Extraction +2

Arbitrary Shape Text Detection via Boundary Transformer

2 code implementations • 11 May 2022 • Shi-Xue Zhang, Chun Yang, Xiaobin Zhu, Xu-Cheng Yin

In our method, we explicitly model the text boundary via an innovative iterative boundary transformer in a coarse-to-fine manner.

Text Detection

157

Graph Fusion Network for Multi-Oriented Object Detection

no code implementations • 7 May 2022 • Shi-Xue Zhang, Xiaobin Zhu, Jie-Bo Hou, Xu-Cheng Yin

Then, we propose a graph-based fusion network via Graph Convolutional Network (GCN) to learn to reason and fuse the detection boxes for generating final instance boxes.

Object, object-detection +2

Open-set Text Recognition via Character-Context Decoupling

1 code implementation • CVPR 2022 • Chang Liu, Chun Yang, Xu-Cheng Yin

Contextual information can be decomposed into temporal information and linguistic information.

Kernel Proposal Network for Arbitrary Shape Text Detection

1 code implementation • 12 Mar 2022 • Shi-Xue Zhang, Xiaobin Zhu, Jie-Bo Hou, Chun Yang, Xu-Cheng Yin

In this paper, we propose an innovative Kernel Proposal Network (dubbed KPN) for arbitrary shape text detection.

Text Detection

Towards Open-Set Text Recognition via Label-to-Prototype Learning

no code implementations • 10 Mar 2022 • Chang Liu, Chun Yang, Hai-Bo Qin, Xiaobin Zhu, Cheng-Lin Liu, Xu-Cheng Yin

Scene text recognition is a popular topic and is extensively used in industry.

Scene Text Recognition

Learning Aligned Cross-Modal Representation for Generalized Zero-Shot Classification

no code implementations • 24 Dec 2021 • Zhiyu Fang, Xiaobin Zhu, Chun Yang, Zheng Han, Jingyan Qin, Xu-Cheng Yin

Learning a common latent embedding by aligning the latent spaces of cross-modal autoencoders is an effective strategy for Generalized Zero-Shot Classification (GZSC).

Classification, Zero-Shot Learning

Non-autoregressive Transformer with Unified Bidirectional Decoder for Automatic Speech Recognition

no code implementations • 14 Sep 2021 • Chuan-Fei Zhang, Yan Liu, Tian-Hao Zhang, Song-Lu Chen, Feng Chen, Xu-Cheng Yin

To tackle the above problems, we propose a new non-autoregressive transformer with a unified bidirectional decoder (NAT-UBD), which can simultaneously utilize left-to-right and right-to-left contexts.

Automatic Speech Recognition, Automatic Speech Recognition (ASR) +1

Adaptive Boundary Proposal Network for Arbitrary Shape Text Detection

1 code implementation • ICCV 2021 • Shi-Xue Zhang, Xiaobin Zhu, Chun Yang, Hongfa Wang, Xu-Cheng Yin

In this work, we propose a novel adaptive boundary proposal network for arbitrary shape text detection, which can learn to directly produce accurate boundaries for arbitrary shape text without any post-processing.

Text Detection

108

End-to-end trainable network for degraded license plate detection via vehicle-plate relation mining

1 code implementation • 27 Oct 2020 • Song-Lu Chen, Shu Tian, Jia-Wei Ma, Qi Liu, Chun Yang, Feng Chen, Xu-Cheng Yin

Second, we propose to predict the quadrilateral bounding box in the local region by regressing the four corners of the license plate to robustly detect oblique license plates.

License Plate Detection, License Plate Recognition +1

RUArt: A Novel Text-Centered Solution for Text-Based Visual Question Answering

1 code implementation • 24 Oct 2020 • Zan-Xia Jin, Heran Wu, Chun Yang, Fang Zhou, Jingyan Qin, Lei Xiao, Xu-Cheng Yin

Text-based visual question answering (VQA) requires reading and understanding text in an image to correctly answer a given question.

Optical Character Recognition, Optical Character Recognition (OCR) +2

Mutual-Supervised Feature Modulation Network for Occluded Pedestrian Detection

no code implementations • 21 Oct 2020 • Ye He, Chao Zhu, Xu-Cheng Yin

These two branches are trained in a mutual-supervised way with full body annotations and visible body annotations, respectively.

Body Detection, Occlusion Handling +1

Deep Relational Reasoning Graph Network for Arbitrary Shape Text Detection

2 code implementations • CVPR 2020 • Shi-Xue Zhang, Xiaobin Zhu, Jie-Bo Hou, Chang Liu, Chun Yang, Hongfa Wang, Xu-Cheng Yin

In this paper, we propose a novel unified relational reasoning graph network for arbitrary shape text detection.

graph construction, Optical Character Recognition (OCR) +2

4,065

Semantic Bilinear Pooling for Fine-Grained Recognition

no code implementations • 3 Apr 2019 • Xinjie Li, Chun Yang, Songlu Chen, Chao Zhu, Xu-Cheng Yin

Specifically, we design a generalized cross-entropy loss for training the proposed framework, which fully exploits the semantic priors by considering the relevance between adjacent levels and enlarges the distance between samples of different coarse classes.

General Classification, Multi-Label Learning

AdaDNNs: Adaptive Ensemble of Deep Neural Networks for Scene Text Recognition

no code implementations • 10 Oct 2017 • Chun Yang, Xu-Cheng Yin, Zejun Li, Jianwei Wu, Chunchao Guo, Hongfa Wang, Lei Xiao

Recognizing text in the wild is a really challenging task because of complex backgrounds, various illuminations and diverse distortions, even with deep neural networks (convolutional neural networks and recurrent neural networks).

Scene Text Recognition

A Multi-strategy Query Processing Approach for Biomedical Question Answering: USTB_PRIR at BioASQ 2017 Task 5B

no code implementations • WS 2017 • Zan-Xia Jin, Bo-Wen Zhang, Fan Fang, Le-Le Zhang, Xu-Cheng Yin

This paper describes the participation of the USTB_PRIR team in the 2017 BioASQ 5B on question answering, including the document retrieval, snippet retrieval, and concept retrieval tasks.

Information Retrieval, Question Answering +1

Learning to Diversify via Weighted Kernels for Classifier Ensemble

no code implementations • 4 Jun 2014 • Xu-Cheng Yin, Chun Yang, Hong-Wei Hao

In this paper, we argue that diversity, not direct diversity on samples but adaptive diversity with data, is highly correlated to ensemble accuracy, and we propose a novel technology for classifier ensemble, learning to diversify, which learns to adaptively combine classifiers by considering both accuracy and diversity.

Ensemble Learning, Ensemble Pruning

Robust Text Detection in Natural Scene Images

no code implementations • 11 Jan 2013 • Xu-Cheng Yin, Xuwang Yin, Kai-Zhu Huang, Hong-Wei Hao

Text detection in natural scene images is an important prerequisite for many content-based image analysis tasks.

Clustering, Metric Learning +2

