Search Results for author: Yuliang Liu

Found 38 papers, 31 papers with code

Turning a CLIP Model into a Scene Text Spotter

1 code implementation 21 Aug 2023 Wenwen Yu, Yuliang Liu, Xingkui Zhu, Haoyu Cao, Xing Sun, Xiang Bai

Utilizing only 10% of the supervised data, FastTCM-CR50 improves performance by an average of 26.5% and 5.5% for text detection and spotting tasks, respectively.

object-detection, Object Detection, +3

ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer

2 code implementations ICCV 2023 Mingxin Huang, Jiaxin Zhang, Dezhi Peng, Hao Lu, Can Huang, Yuliang Liu, Xiang Bai, Lianwen Jin

To this end, we introduce a new model named Explicit Synergy-based Text Spotting Transformer framework (ESTextSpotter), which achieves explicit synergy by modeling discriminative and interactive features for text detection and recognition within a single decoder.

Text Detection, Text Spotting

Box-DETR: Understanding and Boxing Conditional Spatial Queries

1 code implementation 17 Jul 2023 Wenze Liu, Hao Lu, Yuliang Liu, Zhiguo Cao

In DAB-DETR, such queries are modulated by the so-called conditional linear projection at each decoder stage, aiming to search for positions of interest such as the four extremities of the box.

ViTEraser: Harnessing the Power of Vision Transformers for Scene Text Removal with SegMIM Pretraining

1 code implementation 21 Jun 2023 Dezhi Peng, Chongyu Liu, Yuliang Liu, Lianwen Jin

As ViTEraser implicitly integrates text localization and inpainting, we propose a novel end-to-end pretraining method, termed SegMIM, which focuses the encoder and decoder on the text box segmentation and masked image modeling tasks, respectively.

Scene Text Detection, Text Detection

ICDAR 2023 Competition on Reading the Seal Title

no code implementations 24 Apr 2023 Wenwen Yu, MingYu Liu, Mingrui Chen, Ning Lu, Yinlong Wen, Yuliang Liu, Dimosthenis Karatzas, Xiang Bai

To promote research in this area, we organized ICDAR 2023 competition on reading the seal title (ReST), which included two tasks: seal title text detection (Task 1) and end-to-end seal title recognition (Task 2).

Optical Character Recognition (OCR), Text Detection

Accountable Textual-Visual Chat Learns to Reject Human Instructions in Image Re-creation

1 code implementation 10 Mar 2023 Zhiwei Zhang, Yuliang Liu

The recent success of ChatGPT and GPT-4 has drawn widespread attention to multimodal dialogue systems.

multimodal generation, Visual Reasoning

Turning a CLIP Model into a Scene Text Detector

1 code implementation CVPR 2023 Wenwen Yu, Yuliang Liu, Wei Hua, Deqiang Jiang, Bo Ren, Xiang Bai

Recently, pretraining approaches based on vision language models have made effective progress in the field of text detection.

Domain Adaptation, Scene Text Detection, +1

Colossal-Auto: Unified Automation of Parallelization and Activation Checkpoint for Large-scale Models

1 code implementation 6 Feb 2023 Yuliang Liu, Shenggui Li, Jiarui Fang, Yanjun Shao, Boyuan Yao, Yang You

To address these challenges, we introduce a system that can jointly optimize distributed execution and gradient checkpointing plans.

Scheduling

SPTS v2: Single-Point Scene Text Spotting

3 code implementations 4 Jan 2023 Yuliang Liu, Jiaxin Zhang, Dezhi Peng, Mingxin Huang, Xinyu Wang, Jingqun Tang, Can Huang, Dahua Lin, Chunhua Shen, Xiang Bai, Lianwen Jin

Within the context of our SPTS v2 framework, our experiments suggest a potential preference for single-point representation in scene text spotting when compared to other representations.

Text Detection, Text Spotting

Towards Robust Tampered Text Detection in Document Image: New Dataset and New Solution

1 code implementation CVPR 2023 Chenfan Qu, Chongyu Liu, Yuliang Liu, Xinhong Chen, Dezhi Peng, Fengjun Guo, Lianwen Jin

In this paper, we propose a novel framework to capture more fine-grained clues in complex scenarios for tampered text detection, termed Document Tampering Detector (DTD), which consists of a Frequency Perception Head (FPH) to compensate for the deficiencies caused by inconspicuous visual features, and a Multi-view Iterative Decoder (MID) to fully utilize the information of features at different scales.

Image and Video Forgery Detection, Image Compression, +1

MSDS: A Large-Scale Chinese Signature and Token Digit String Dataset for Handwriting Verification

1 code implementation 17 Oct 2022 Peirong Zhang, Jiajia Jiang, Yuliang Liu, Lianwen Jin

MSDS-ChS consists of handwritten Chinese signatures, which, to the best of our knowledge, is the largest publicly available Chinese signature dataset for handwriting verification, at least eight times larger than existing online datasets.

Handwriting Verification

SAPA: Similarity-Aware Point Affiliation for Feature Upsampling

2 code implementations 26 Sep 2022 Hao Lu, Wenze Liu, Zixuan Ye, Hongtao Fu, Yuliang Liu, Zhiguo Cao

We introduce point affiliation into feature upsampling, a notion that describes the affiliation of each upsampled point to a semantic cluster formed by local decoder feature points with semantic similarity.

Depth Estimation, Image Matting, +5

PageNet: Towards End-to-End Weakly Supervised Page-Level Handwritten Chinese Text Recognition

2 code implementations 29 Jul 2022 Dezhi Peng, Lianwen Jin, Yuliang Liu, Canjie Luo, Songxuan Lai

Utilizing the proposed weakly supervised learning framework, PageNet requires only transcripts to be annotated for real data; however, it can still output detection and recognition results at both the character and line levels, avoiding the labor and cost of labeling bounding boxes of characters and text lines.

Handwritten Chinese Text Recognition, Line Detection, +1

DELTA: Dynamically Optimizing GPU Memory beyond Tensor Recomputation

1 code implementation 30 Mar 2022 Yu Tang, Chenyu Wang, Yufan Zhang, Yuliang Liu, Xingcheng Zhang, Linbo Qiao, Zhiquan Lai, Dongsheng Li

To the best of our knowledge, we are the first to make a reasonable dynamic runtime scheduler on the combination of tensor swapping and tensor recomputation without user oversight.

SPTS: Single-Point Text Spotting

1 code implementation 15 Dec 2021 Dezhi Peng, Xinyu Wang, Yuliang Liu, Jiaxin Zhang, Mingxin Huang, Songxuan Lai, Shenggao Zhu, Jing Li, Dahua Lin, Chunhua Shen, Xiang Bai, Lianwen Jin

For the first time, we demonstrate that training scene text spotting models can be achieved with an extremely low-cost annotation of a single-point for each instance.

Language Modelling, Text Detection, +1

Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training

1 code implementation 28 Oct 2021 Shenggui Li, Jiarui Fang, Zhengda Bian, Hongxin Liu, Yuliang Liu, Haichen Huang, Boxiang Wang, Yang You

The success of Transformer models has pushed the deep learning model scale to billions of parameters.

ICDAR 2021 Competition on Integrated Circuit Text Spotting and Aesthetic Assessment

1 code implementation 12 Jul 2021 Chun Chet Ng, Akmalul Khairi Bin Nazaruddin, Yeong Khang Lee, Xinyu Wang, Yuliang Liu, Chee Seng Chan, Lianwen Jin, Yipeng Sun, Lixin Fan

With hundreds of thousands of electronic chip components being manufactured every day, chip manufacturers have seen an increasing demand for a more efficient and effective way of inspecting the quality of printed texts on chip components.

Text Spotting

ABCNet v2: Adaptive Bezier-Curve Network for Real-time End-to-end Text Spotting

1 code implementation 8 May 2021 Yuliang Liu, Chunhua Shen, Lianwen Jin, Tong He, Peng Chen, Chongyu Liu, Hao Chen

Previous methods can be roughly categorized into two groups: character-based and segmentation-based, which often require character-level annotations and/or complex post-processing due to the unstructured output.

Text Spotting

Structured Multimodal Attentions for TextVQA

2 code implementations 1 Jun 2020 Chenyu Gao, Qi Zhu, Peng Wang, Hui Li, Yuliang Liu, Anton Van Den Hengel, Qi Wu

In this paper, we propose an end-to-end structured multimodal attention (SMA) neural network to mainly solve the first two issues above.

Graph Attention, Optical Character Recognition (OCR), +3

Separating Content from Style Using Adversarial Learning for Recognizing Text in the Wild

no code implementations 13 Jan 2020 Canjie Luo, Qingxiang Lin, Yuliang Liu, Lianwen Jin, Chunhua Shen

Furthermore, to tackle the issue of lacking paired training samples, we design an interactive joint training scheme, which shares attention masks from the recognizer to the discriminator, and enables the discriminator to extract the features of each character for further adversarial training.

Style Transfer

Exploring the Capacity of an Orderless Box Discretization Network for Multi-orientation Scene Text Detection

1 code implementation 20 Dec 2019 Yuliang Liu, Tong He, Hao Chen, Xinyu Wang, Canjie Luo, Shuaitao Zhang, Chunhua Shen, Lianwen Jin

More importantly, based on OBD, we provide a detailed analysis of the impact of a collection of refinements, which may inspire others to build state-of-the-art text detectors.

Scene Text Detection, Text Detection

ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text (RRC-ArT)

1 code implementation 16 Sep 2019 Chee-Kheng Chng, Yuliang Liu, Yipeng Sun, Chun Chet Ng, Canjie Luo, Zihan Ni, ChuanMing Fang, Shuaitao Zhang, Junyu Han, Errui Ding, Jingtuo Liu, Dimosthenis Karatzas, Chee Seng Chan, Lianwen Jin

This paper reports the ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text (RRC-ArT) that consists of three major challenges: i) scene text detection, ii) scene text recognition, and iii) scene text spotting.

Scene Text Detection, Scene Text Recognition, +2

Aggregation Cross-Entropy for Sequence Recognition

2 code implementations CVPR 2019 Zecheng Xie, Yaoxiong Huang, Yuanzhi Zhu, Lianwen Jin, Yuliang Liu, Lele Xie

In this paper, we propose a novel method, aggregation cross-entropy (ACE), for sequence recognition from a brand new perspective.

EnsNet: Ensconce Text in the Wild

3 code implementations 3 Dec 2018 Shuaitao Zhang, Yuliang Liu, Lianwen Jin, Yaoxiong Huang, Songxuan Lai

The feature of the former is first enhanced by a novel lateral connection structure and then refined by four carefully designed losses: multiscale regression loss and content loss, which capture the global discrepancy of different level features; texture loss and total variation loss, which primarily target filling the text region and preserving the reality of the background.

Image Text Removal

Feature Enhancement Network: A Refined Scene Text Detector

no code implementations 12 Nov 2017 Sheng Zhang, Yuliang Liu, Lianwen Jin, Canjie Luo

In this paper, we propose a refined scene text detector with a novel Feature Enhancement Network (FEN) for Region Proposal and Text Detection Refinement.

object-detection, Object Detection, +2

Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection

no code implementations CVPR 2017 Yuliang Liu, Lianwen Jin

The effectiveness of our approach is evaluated on a public word-level, multi-oriented scene text database, ICDAR 2015 Robust Reading Competition Challenge 4 "Incidental scene text localization".

Text Detection
