Search Results for author: Yuliang Liu

Found 52 papers, 37 papers with code

TextSquare: Scaling up Text-Centric Visual Instruction Tuning

no code implementations • 19 Apr 2024 • Jingqun Tang, Chunhui Lin, Zhen Zhao, Shu Wei, Binghong Wu, Qi Liu, Hao Feng, Yang Li, Siqi Wang, Lei Liao, Wei Shi, Yuliang Liu, Hao Liu, Yuan Xie, Xiang Bai, Can Huang

Text-centric visual question answering (VQA) has made great strides with the development of Multimodal Large Language Models (MLLMs), yet open-source models still fall short of leading models like GPT4V and Gemini, partly due to a lack of extensive, high-quality instruction tuning data.

Hallucination Hallucination Evaluation +2


Bridging the Gap Between End-to-End and Two-Step Text Spotting

3 code implementations • 6 Apr 2024 • Mingxin Huang, Hongliang Li, Yuliang Liu, Xiang Bai, Lianwen Jin

Subsequently, we introduce a Bridge that connects the locked detector and recognizer through a zero-initialized neural network.

Text Spotting

256


OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition

1 code implementation • 28 Mar 2024 • Jianqiang Wan, Sibo Song, Wenwen Yu, Yuliang Liu, Wenqing Cheng, Fei Huang, Xiang Bai, Cong Yao, Zhibo Yang

Recently, visually-situated text parsing (VsTP) has experienced notable advancements, driven by the increasing demand for automated document understanding and the emergence of Generative Large Language Models (LLMs) capable of processing document-based questions.

document understanding Key Information Extraction +3

926


TextMonkey: An OCR-Free Large Multimodal Model for Understanding Document

1 code implementation • 7 Mar 2024 • Yuliang Liu, Biao Yang, Qiang Liu, Zhang Li, Zhiyin Ma, Shuo Zhang, Xiang Bai

We present TextMonkey, a large multimodal model (LMM) tailored for text-centric tasks.

document understanding Key Information Extraction +4

1,377


Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization

1 code implementation • 5 Feb 2024 • Yang Jin, Zhicheng Sun, Kun Xu, Liwei Chen, Hao Jiang, Quzhe Huang, Chengru Song, Yuliang Liu, Di Zhang, Yang Song, Kun Gai, Yadong Mu

In light of recent advances in multimodal Large Language Models (LLMs), there is increasing attention to scaling them from image-text data to more informative real-world videos.

Ranked #59 on Visual Question Answering on MM-Vet

Video Understanding Visual Question Answering

348


An open dataset for oracle bone script recognition and decipherment

no code implementations • 27 Jan 2024 • Pengjie Wang, Kaile Zhang, Xinyu Wang, Shengwei Han, Yongge Liu, Jinpeng Wan, Haisu Guan, Zhebin Kuang, Lianwen Jin, Xiang Bai, Yuliang Liu

Additionally, all images and labels have been reviewed and corrected by experts in oracle bone studies.

Decipherment


An open dataset for the evolution of oracle bone characters: EVOBC

no code implementations • 23 Jan 2024 • Haisu Guan, Jinpeng Wan, Yuliang Liu, Pengjie Wang, Kaile Zhang, Zhebin Kuang, Xinyu Wang, Xiang Bai, Lianwen Jin

We conducted validation and simulated deciphering on the constructed dataset, and the results demonstrate its high efficacy in aiding the study of oracle bone script.

Decipherment


SwinTextSpotter v2: Towards Better Synergy for Scene Text Spotting

no code implementations • 15 Jan 2024 • Mingxin Huang, Dezhi Peng, Hongliang Li, Zhenghao Peng, Chongyu Liu, Dahua Lin, Yuliang Liu, Xiang Bai, Lianwen Jin

In this paper, we propose a new end-to-end scene text spotting framework termed SwinTextSpotter v2, which seeks to find a better synergy between text detection and recognition.

Text Detection Text Spotting


Progressive Evolution from Single-Point to Polygon for Scene Text

no code implementations • 21 Dec 2023 • Linger Deng, Mingxin Huang, Xudong Xie, Yuliang Liu, Lianwen Jin, Xiang Bai

We demonstrate the accuracy of the generated polygons through extensive experiments: 1) By creating polygons from ground truth points, we achieved an accuracy of 82.0% on ICDAR 2015; 2) In training detectors with polygons generated by our method, we attained 86% of the accuracy relative to training with ground truth (GT); 3) Additionally, the proposed Point2Polygon can be seamlessly integrated to empower single-point spotters to generate polygons.

Text Detection


Toward Real Text Manipulation Detection: New Dataset and New Solution

no code implementations • 12 Dec 2023 • Dongliang Luo, Yuliang Liu, Rui Yang, Xianjin Liu, Jishen Zeng, Yu Zhou, Xiang Bai

With the surge in realistic text tampering, detecting fraudulent text in images has gained prominence for maintaining information security.

Contrastive Learning


Enhancing Scene Text Detectors with Realistic Text Image Synthesis Using Diffusion Models

no code implementations • 28 Nov 2023 • Ling Fu, Zijie Wu, Yingying Zhu, Yuliang Liu, Xiang Bai

We contend that one main limitation of existing generation methods is the insufficient integration of foreground text with the background.

Image Generation Scene Text Detection +1


ML-Bench: Evaluating Large Language Models for Code Generation in Repository-Level Machine Learning Tasks

1 code implementation • 16 Nov 2023 • Yuliang Liu, Xiangru Tang, Zefan Cai, Junjie Lu, Yichi Zhang, Yanjun Shao, Zexuan Deng, Helan Hu, Kaikai An, Ruijun Huang, Shuzheng Si, Sheng Chen, Haozhe Zhao, Liang Chen, Yan Wang, Tianyu Liu, Zhiwei Jiang, Baobao Chang, Yujia Qin, Wangchunshu Zhou, Yilun Zhao, Arman Cohan, Mark Gerstein

While Large Language Models (LLMs) have demonstrated proficiency in code generation benchmarks, translating these results into practical development scenarios - where leveraging existing repository-level libraries is the norm - remains challenging.

Code Generation Navigate


Monkey: Image Resolution and Text Label Are Important Things for Large Multi-modal Models

1 code implementation • 11 Nov 2023 • Zhang Li, Biao Yang, Qiang Liu, Zhiyin Ma, Shuo Zhang, Jingxu Yang, Yabo Sun, Yuliang Liu, Xiang Bai

Additionally, experiments on 18 datasets further demonstrate that Monkey surpasses existing LMMs in many tasks like Image Captioning and various Visual Question Answering formats.

Image Captioning Question Answering +2

1,377


KwaiYiiMath: Technical Report

no code implementations • 11 Oct 2023 • Jiayi Fu, Lei Lin, Xiaoyang Gao, Pengli Liu, Zhengzong Chen, Zhirui Yang, ShengNan Zhang, Xue Zheng, Yan Li, Yuliang Liu, Xucheng Ye, Yiqiao Liao, Chao Liao, Bin Chen, Chengru Song, Junchen Wan, Zijia Lin, Fuzheng Zhang, Zhongyuan Wang, Di Zhang, Kun Gai

Recent advancements in large language models (LLMs) have demonstrated remarkable abilities in handling a variety of natural language processing (NLP) downstream tasks, even on mathematical tasks requiring multi-step reasoning.

Ranked #87 on Arithmetic Reasoning on GSM8K (using extra training data)

Arithmetic Reasoning GSM8K +1


Turning a CLIP Model into a Scene Text Spotter

1 code implementation • 21 Aug 2023 • Wenwen Yu, Yuliang Liu, Xingkui Zhu, Haoyu Cao, Xing Sun, Xiang Bai

Utilizing only 10% of the supervised data, FastTCM-CR50 improves performance by an average of 26.5% and 5.5% for text detection and spotting tasks, respectively.

object-detection Object Detection +3

148


ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer

3 code implementations • ICCV 2023 • Mingxin Huang, Jiaxin Zhang, Dezhi Peng, Hao Lu, Can Huang, Yuliang Liu, Xiang Bai, Lianwen Jin

To this end, we introduce a new model named Explicit Synergy-based Text Spotting Transformer framework (ESTextSpotter), which achieves explicit synergy by modeling discriminative and interactive features for text detection and recognition within a single decoder.

Text Detection Text Spotting

256


On Point Affiliation in Feature Upsampling

2 code implementations • 17 Jul 2023 • Wenze Liu, Hao Lu, Yuliang Liu, Zhiguo Cao

We introduce the notion of point affiliation into feature upsampling.

Depth Estimation Feature Upsampling +6


Box-DETR: Understanding and Boxing Conditional Spatial Queries

1 code implementation • 17 Jul 2023 • Wenze Liu, Hao Lu, Yuliang Liu, Zhiguo Cao

In DAB-DETR, such queries are modulated by the so-called conditional linear projection at each decoder stage, aiming to search for positions of interest such as the four extremities of the box.


ViTEraser: Harnessing the Power of Vision Transformers for Scene Text Removal with SegMIM Pretraining

1 code implementation • 21 Jun 2023 • Dezhi Peng, Chongyu Liu, Yuliang Liu, Lianwen Jin

As ViTEraser implicitly integrates text localization and inpainting, we propose a novel end-to-end pretraining method, termed SegMIM, which focuses the encoder and decoder on the text box segmentation and masked image modeling tasks, respectively.

Long-range modeling Scene Text Detection +1


Looking and Listening: Audio Guided Text Recognition

1 code implementation • 6 Jun 2023 • Wenwen Yu, MingYu Liu, Biao Yang, Enming Zhang, Deqiang Jiang, Xing Sun, Yuliang Liu, Xiang Bai

Text recognition in the wild is a long-standing problem in computer vision.

Scene Text Recognition


ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images

no code implementations • 5 Jun 2023 • Wenwen Yu, Chengquan Zhang, Haoyu Cao, Wei Hua, Bohan Li, Huang Chen, MingYu Liu, Mingrui Chen, Jianfeng Kuang, Mengjun Cheng, Yuning Du, Shikun Feng, Xiaoguang Hu, Pengyuan Lyu, Kun Yao, Yuechen Yu, Yuliang Liu, Wanxiang Che, Errui Ding, Cheng-Lin Liu, Jiebo Luo, Shuicheng Yan, Min Zhang, Dimosthenis Karatzas, Xing Sun, Jingdong Wang, Xiang Bai

It is hoped that this competition will attract many researchers in the field of CV and NLP, and bring some new thoughts to the field of Document AI.

Document AI Entity Linking +1


On the Hidden Mystery of OCR in Large Multimodal Models

1 code implementation • 13 May 2023 • Yuliang Liu, Zhang Li, Biao Yang, Chunyuan Li, XuCheng Yin, Cheng-Lin Liu, Lianwen Jin, Xiang Bai

In this paper, we conducted a comprehensive evaluation of Large Multimodal Models, such as GPT4V and Gemini, in various text-related visual tasks including Text Recognition, Scene Text-Centric Visual Question Answering (VQA), Document-Oriented VQA, Key Information Extraction (KIE), and Handwritten Mathematical Expression Recognition (HMER).

Key Information Extraction Nutrition +4

283


ICDAR 2023 Competition on Reading the Seal Title

no code implementations • 24 Apr 2023 • Wenwen Yu, MingYu Liu, Mingrui Chen, Ning Lu, Yinlong Wen, Yuliang Liu, Dimosthenis Karatzas, Xiang Bai

To promote research in this area, we organized ICDAR 2023 competition on reading the seal title (ReST), which included two tasks: seal title text detection (Task 1) and end-to-end seal title recognition (Task 2).

Optical Character Recognition (OCR) Task 2 +1


Accountable Textual-Visual Chat Learns to Reject Human Instructions in Image Re-creation

2 code implementations • 10 Mar 2023 • Zhiwei Zhang, Yuliang Liu

This stream is subsequently fed into the decoder-based transformer to generate visual re-creations and textual feedback in the second stage.

multimodal generation Text-to-Image Generation +1


Turning a CLIP Model into a Scene Text Detector

1 code implementation • CVPR 2023 • Wenwen Yu, Yuliang Liu, Wei Hua, Deqiang Jiang, Bo Ren, Xiang Bai

Recently, pretraining approaches based on vision language models have made effective progress in the field of text detection.

Domain Adaptation Scene Text Detection +1

148


Colossal-Auto: Unified Automation of Parallelization and Activation Checkpoint for Large-scale Models

1 code implementation • 6 Feb 2023 • Yuliang Liu, Shenggui Li, Jiarui Fang, Yanjun Shao, Boyuan Yao, Yang You

To address these challenges, we introduce a system that can jointly optimize distributed execution and gradient checkpointing plans.

Scheduling

37,854


SPTS v2: Single-Point Scene Text Spotting

3 code implementations • 4 Jan 2023 • Yuliang Liu, Jiaxin Zhang, Dezhi Peng, Mingxin Huang, Xinyu Wang, Jingqun Tang, Can Huang, Dahua Lin, Chunhua Shen, Xiang Bai, Lianwen Jin

Within the context of our SPTS v2 framework, our experiments suggest a potential preference for single-point representation in scene text spotting when compared to other representations.

Ranked #15 on Text Spotting on ICDAR 2015

Text Detection Text Spotting

128


Towards Robust Tampered Text Detection in Document Image: New Dataset and New Solution

1 code implementation • CVPR 2023 • Chenfan Qu, Chongyu Liu, Yuliang Liu, Xinhong Chen, Dezhi Peng, Fengjun Guo, Lianwen Jin

In this paper, we propose a novel framework to capture more fine-grained clues in complex scenarios for tampered text detection, termed as Document Tampering Detector (DTD), which consists of a Frequency Perception Head (FPH) to compensate the deficiencies caused by the inconspicuous visual features, and a Multi-view Iterative Decoder (MID) for fully utilizing the information of features in different scales.

Image and Video Forgery Detection Image Compression +1


MSDS: A Large-Scale Chinese Signature and Token Digit String Dataset for Handwriting Verification

1 code implementation • 17 Oct 2022 • Peirong Zhang, Jiajia Jiang, Yuliang Liu, Lianwen Jin

MSDS-ChS consists of handwritten Chinese signatures, which, to the best of our knowledge, is the largest publicly available Chinese signature dataset for handwriting verification, at least eight times larger than existing online datasets.

Handwriting Verification


SAPA: Similarity-Aware Point Affiliation for Feature Upsampling

2 code implementations • 26 Sep 2022 • Hao Lu, Wenze Liu, Zixuan Ye, Hongtao Fu, Yuliang Liu, Zhiguo Cao

We introduce point affiliation into feature upsampling, a notion that describes the affiliation of each upsampled point to a semantic cluster formed by local decoder feature points with semantic similarity.

Ranked #5 on Feature Upsampling on ImageNet

Depth Estimation Feature Upsampling +6


PageNet: Towards End-to-End Weakly Supervised Page-Level Handwritten Chinese Text Recognition

4 code implementations • 29 Jul 2022 • Dezhi Peng, Lianwen Jin, Yuliang Liu, Canjie Luo, Songxuan Lai

Utilizing the proposed weakly supervised learning framework, PageNet requires only transcripts to be annotated for real data; however, it can still output detection and recognition results at both the character and line levels, avoiding the labor and cost of labeling bounding boxes of characters and text lines.

Handwritten Chinese Text Recognition Line Detection +1


Don't Forget Me: Accurate Background Recovery for Text Removal via Modeling Local-Global Context

1 code implementation • 21 Jul 2022 • Chongyu Liu, Lianwen Jin, Yuliang Liu, Canjie Luo, Bangdong Chen, Fengjun Guo, Kai Ding

To address this issue, we propose a Contextual-guided Text Removal Network, termed as CTRNet.


DELTA: Dynamically Optimizing GPU Memory beyond Tensor Recomputation

1 code implementation • 30 Mar 2022 • Yu Tang, Chenyu Wang, Yufan Zhang, Yuliang Liu, Xingcheng Zhang, Linbo Qiao, Zhiquan Lai, Dongsheng Li

To the best of our knowledge, we are the first to make a reasonable dynamic runtime scheduler on the combination of tensor swapping and tensor recomputation without user oversight.


SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition

2 code implementations • CVPR 2022 • Mingxin Huang, Yuliang Liu, Zhenghao Peng, Chongyu Liu, Dahua Lin, Shenggao Zhu, Nicholas Yuan, Kai Ding, Lianwen Jin

End-to-end scene text spotting has attracted great attention in recent years due to the success of excavating the intrinsic synergy of the scene text detection and recognition.

Ranked #3 on Text Spotting on Inverse-Text

Scene Text Detection Text Detection +1

256


SPTS: Single-Point Text Spotting

1 code implementation • 15 Dec 2021 • Dezhi Peng, Xinyu Wang, Yuliang Liu, Jiaxin Zhang, Mingxin Huang, Songxuan Lai, Shenggao Zhu, Jing Li, Dahua Lin, Chunhua Shen, Xiang Bai, Lianwen Jin

For the first time, we demonstrate that training scene text spotting models can be achieved with an extremely low-cost annotation of a single-point for each instance.

Ranked #3 on Text Spotting on SCUT-CTW1500

Language Modelling Text Detection +1

128


Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training

1 code implementation • 28 Oct 2021 • Shenggui Li, Hongxin Liu, Zhengda Bian, Jiarui Fang, Haichen Huang, Yuliang Liu, Boxiang Wang, Yang You

The success of Transformer models has pushed the deep learning model scale to billions of parameters.

37,854


ICDAR 2021 Competition on Integrated Circuit Text Spotting and Aesthetic Assessment

1 code implementation • 12 Jul 2021 • Chun Chet Ng, Akmalul Khairi Bin Nazaruddin, Yeong Khang Lee, Xinyu Wang, Yuliang Liu, Chee Seng Chan, Lianwen Jin, Yipeng Sun, Lixin Fan

With hundreds of thousands of electronic chip components being manufactured every day, chip manufacturers have seen an increasing demand for a more efficient and effective way of inspecting the quality of printed texts on chip components.

Text Spotting


ABCNet v2: Adaptive Bezier-Curve Network for Real-time End-to-end Text Spotting

1 code implementation • 8 May 2021 • Yuliang Liu, Chunhua Shen, Lianwen Jin, Tong He, Peng Chen, Chongyu Liu, Hao Chen

Previous methods can be roughly categorized into two groups: character-based and segmentation-based, which often require character-level annotations and/or complex post-processing due to the unstructured output.

Ranked #7 on Text Spotting on Inverse-Text

Text Spotting


Structured Multimodal Attentions for TextVQA

2 code implementations • 1 Jun 2020 • Chenyu Gao, Qi Zhu, Peng Wang, Hui Li, Yuliang Liu, Anton Van Den Hengel, Qi Wu

In this paper, we propose an end-to-end structured multimodal attention (SMA) neural network to mainly solve the first two issues above.

Graph Attention Optical Character Recognition (OCR) +3


ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network

15 code implementations • CVPR 2020 • Yuliang Liu, Hao Chen, Chunhua Shen, Tong He, Lianwen Jin, Liangwei Wang

Our contributions are three-fold: 1) For the first time, we adaptively fit arbitrarily-shaped text by a parameterized Bezier curve.

Ranked #9 on Text Spotting on Inverse-Text

Scene Text Detection Text Detection +1

3,324


On the General Value of Evidence, and Bilingual Scene-Text Visual Question Answering

no code implementations • CVPR 2020 • Xinyu Wang, Yuliang Liu, Chunhua Shen, Chun Chet Ng, Canjie Luo, Lianwen Jin, Chee Seng Chan, Anton Van Den Hengel, Liangwei Wang

Visual Question Answering (VQA) methods have made incredible progress, but suffer from a failure to generalize.

Question Answering Referring Expression +1


Separating Content from Style Using Adversarial Learning for Recognizing Text in the Wild

no code implementations • 13 Jan 2020 • Canjie Luo, Qingxiang Lin, Yuliang Liu, Lianwen Jin, Chunhua Shen

Furthermore, to tackle the issue of lacking paired training samples, we design an interactive joint training scheme, which shares attention masks from the recognizer to the discriminator, and enables the discriminator to extract the features of each character for further adversarial training.

Style Transfer


Exploring the Capacity of an Orderless Box Discretization Network for Multi-orientation Scene Text Detection

1 code implementation • 20 Dec 2019 • Yuliang Liu, Tong He, Hao Chen, Xinyu Wang, Canjie Luo, Shuaitao Zhang, Chunhua Shen, Lianwen Jin

More importantly, based on OBD, we provide a detailed analysis of the impact of a collection of refinements, which may inspire others to build state-of-the-art text detectors.

Ranked #3 on Scene Text Detection on ICDAR 2017 MLT

Scene Text Detection Text Detection

271


ICDAR 2019 Competition on Large-scale Street View Text with Partial Labeling -- RRC-LSVT

no code implementations • 17 Sep 2019 • Yipeng Sun, Zihan Ni, Chee-Kheng Chng, Yuliang Liu, Canjie Luo, Chun Chet Ng, Junyu Han, Errui Ding, Jingtuo Liu, Dimosthenis Karatzas, Chee Seng Chan, Lianwen Jin

Robust text reading from street view images provides valuable information for various applications.

Text Detection Text Spotting +1


ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text (RRC-ArT)

1 code implementation • 16 Sep 2019 • Chee-Kheng Chng, Yuliang Liu, Yipeng Sun, Chun Chet Ng, Canjie Luo, Zihan Ni, ChuanMing Fang, Shuaitao Zhang, Junyu Han, Errui Ding, Jingtuo Liu, Dimosthenis Karatzas, Chee Seng Chan, Lianwen Jin

This paper reports the ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text (RRC-ArT) that consists of three major challenges: i) scene text detection, ii) scene text recognition, and iii) scene text spotting.

Scene Text Detection Scene Text Recognition +2

720


Omnidirectional Scene Text Detection with Sequential-free Box Discretization

1 code implementation • 6 Jun 2019 • Yuliang Liu, Sheng Zhang, Lianwen Jin, Lele Xie, Yaqiang Wu, Zhepeng Wang

Scene text in the wild is commonly presented with high variant characteristics.

Ranked #1 on Scene Text Detection on IC19-ReCTs (using extra training data)

Scene Text Detection Text Detection

271


Aggregation Cross-Entropy for Sequence Recognition

2 code implementations • CVPR 2019 • Zecheng Xie, Yaoxiong Huang, Yuanzhi Zhu, Lianwen Jin, Yuliang Liu, Lele Xie

In this paper, we propose a novel method, aggregation cross-entropy (ACE), for sequence recognition from a brand new perspective.

301


Tightness-aware Evaluation Protocol for Scene Text Detection

1 code implementation • CVPR 2019 • Yuliang Liu, Lianwen Jin, Zecheng Xie, Canjie Luo, Shuaitao Zhang, Lele Xie

Evaluation protocols play a key role in the developmental progress of text detection methods.

object-detection Object Detection +2

209


EnsNet: Ensconce Text in the Wild

3 code implementations • 3 Dec 2018 • Shuaitao Zhang, Yuliang Liu, Lianwen Jin, Yaoxiong Huang, Songxuan Lai

The feature of the former is first enhanced by a novel lateral connection structure and then refined by four carefully designed losses: multiscale regression loss and content loss, which capture the global discrepancy of different level features; texture loss and total variation loss, which primarily target filling the text region and preserving the reality of the background.

Generative Adversarial Network Image Text Removal

220


DeRPN: Taking a further step toward more general object detection

1 code implementation • 16 Nov 2018 • Lele Xie, Yuliang Liu, Lianwen Jin, Zecheng Xie

However, the detection performance is sensitive to the setting of the anchor boxes.

Object object-detection +4

155


Feature Enhancement Network: A Refined Scene Text Detector

no code implementations • 12 Nov 2017 • Sheng Zhang, Yuliang Liu, Lianwen Jin, Canjie Luo

In this paper, we propose a refined scene text detector with a novel Feature Enhancement Network (FEN) for Region Proposal and Text Detection Refinement.

object-detection Object Detection +3


Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection

no code implementations • CVPR 2017 • Yuliang Liu, Lianwen Jin

The effectiveness of our approach is evaluated on a public word-level, multi-oriented scene text database, ICDAR 2015 Robust Reading Competition Challenge 4 "Incidental scene text localization".

Text Detection

