no code implementations • ECCV 2020 • Yaoxiong Huang, Mengchao He, Lianwen Jin, Yongpan Wang
Style transfer has attracted much interest owing to its various applications.
no code implementations • 3 Dec 2024 • Zhibo Yang, Jun Tang, Zhaohai Li, Pengfei Wang, Jianqiang Wan, Humen Zhong, Xuejing Liu, Mingkun Yang, Peng Wang, Yuliang Liu, Lianwen Jin, Xiang Bai, Shuai Bai, Junyang Lin
The current landscape lacks a comprehensive benchmark to effectively measure the literate capabilities of LMMs.
no code implementations • 22 Nov 2024 • Chenfan Qu, Yiwu Zhong, Fengjun Guo, Lianwen Jin
To this end, we propose Omni-IML, the first generalist model to unify diverse IML tasks.
no code implementations • 1 Oct 2024 • Jiapeng Wang, Chengyu Wang, Kunzhe Huang, Jun Huang, Lianwen Jin
Contrastive Language-Image Pre-training (CLIP) has been widely studied and applied in numerous applications.
no code implementations • 27 Aug 2024 • Wenhui Liao, Jiapeng Wang, Hongliang Li, Chengyu Wang, Jun Huang, Lianwen Jin
Text-rich document understanding (TDU) refers to analyzing and comprehending documents containing substantial textual content.
1 code implementation • 4 Aug 2024 • Mingxin Huang, Yuliang Liu, Dingkang Liang, Lianwen Jin, Xiang Bai
To address this issue, we introduce a Complementary Image Pyramid (CIP), a simple, effective, and plug-and-play solution designed to mitigate semantic discontinuity during high-resolution image processing.
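The exact CIP construction is specific to the paper; as a rough, hypothetical illustration of the general idea (pairing high-resolution local tiles with a complementary downscaled global view so that tile boundaries keep whole-image context), a minimal Pillow sketch might look as follows. The tiling scheme and function name are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch only: pair high-resolution tiles with a downscaled
# global view so each tile keeps access to whole-image context.
from PIL import Image

def complementary_views(path, tile=448, global_size=448):
    img = Image.open(path).convert("RGB")
    w, h = img.size
    # Global view: the whole image, downscaled (the "complementary" signal).
    global_view = img.resize((global_size, global_size))
    # Local views: non-overlapping high-resolution tiles.
    tiles = []
    for top in range(0, h, tile):
        for left in range(0, w, tile):
            box = (left, top, min(left + tile, w), min(top + tile, h))
            tiles.append(img.crop(box))
    return global_view, tiles

# Usage (hypothetical file name):
# global_view, tiles = complementary_views("document.png")
```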
no code implementations • 4 Aug 2024 • Yujin Ren, Jiaxin Zhang, Lianwen Jin
To tackle this problem, a highly promising approach is to utilize massive amounts of unlabeled real data for self-supervised training, which has been widely proven effective in many NLP and CV tasks.
no code implementations • 31 Jul 2024 • Chenfan Qu, Yiwu Zhong, Fengjun Guo, Lianwen Jin
To tackle this, we propose a novel task: open-set tampered scene text detection, which evaluates forensics models on their ability to identify both seen and previously unseen forgery types.
1 code implementation • 4 Jul 2024 • Jiahuan Cao, Dezhi Peng, Peirong Zhang, Yongxin Shi, Yang Liu, Kai Ding, Lianwen Jin
Classical Chinese is a gateway to the rich heritage and wisdom of ancient China, yet its complexities pose formidable comprehension barriers for most modern people without specialized knowledge.
no code implementations • 27 Jun 2024 • Jiaxin Zhang, Wentao Yang, Songxuan Lai, Zecheng Xie, Lianwen Jin
Current multimodal large language models (MLLMs) face significant challenges in visual document understanding (VDU) tasks due to the high resolution, dense text, and complex layouts typical of document images.
1 code implementation • 5 Jun 2024 • Pengjie Wang, Kaile Zhang, Xinyu Wang, Shengwei Han, Yongge Liu, Lianwen Jin, Xiang Bai, Yuliang Liu
Oracle Bone Inscriptions are one of the oldest existing forms of writing in the world.

1 code implementation • 2 Jun 2024 • Haisu Guan, Huanxin Yang, Xinyu Wang, Shengwei Han, Yongge Liu, Lianwen Jin, Xiang Bai, Yuliang Liu
Originating from China's Shang Dynasty approximately 3,000 years ago, the Oracle Bone Script (OBS) is a cornerstone in the annals of linguistic history, predating many established writing systems.
no code implementations • 28 May 2024 • Jiahuan Cao, Yongxin Shi, Dezhi Peng, Yang Liu, Lianwen Jin
To fill this gap, this paper introduces C$^{3}$bench, a Comprehensive Classical Chinese understanding benchmark, which comprises 50,000 text pairs for five primary CCU tasks, including classification, retrieval, named entity recognition, punctuation, and translation.
1 code implementation • CVPR 2024 • Jiaxin Zhang, Dezhi Peng, Chongyu Liu, Peirong Zhang, Lianwen Jin
This underscores the potential of DocRes across a broader spectrum of document image restoration tasks.
1 code implementation • 30 Apr 2024 • Yuliang Liu, Mingxin Huang, Hao Yan, Linger Deng, Weijia Wu, Hao Lu, Chunhua Shen, Lianwen Jin, Xiang Bai
Typically, we propose a Prompt Queries Generation Module and a Tasks-aware Adapter to effectively convert the original single-task model into a multi-task model suitable for both image and video scenarios with minimal additional parameters.
4 code implementations • CVPR 2024 • Mingxin Huang, Hongliang Li, Yuliang Liu, Xiang Bai, Lianwen Jin
Subsequently, we introduce a Bridge that connects the locked detector and recognizer through a zero-initialized neural network.
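The Bridge itself is defined in the paper; the sketch below only illustrates the generic zero-initialization trick the excerpt mentions, a projection whose weights start at zero so the frozen recognizer initially receives unmodified features. The module name and layer choice are assumptions.

```python
import torch
import torch.nn as nn

class ZeroInitBridge(nn.Module):
    """Generic sketch: a 1x1 conv whose weights start at zero, so the frozen
    recognizer initially sees exactly the features it was trained on."""
    def __init__(self, det_channels, rec_channels):
        super().__init__()
        self.proj = nn.Conv2d(det_channels, rec_channels, kernel_size=1)
        nn.init.zeros_(self.proj.weight)
        nn.init.zeros_(self.proj.bias)

    def forward(self, detector_feat, recognizer_feat):
        # At step 0 the bridge contributes nothing; it grows during training.
        return recognizer_feat + self.proj(detector_feat)

# bridge = ZeroInitBridge(det_channels=256, rec_channels=256)
```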
no code implementations • 20 Mar 2024 • Yuyi Zhang, Yuanzhi Zhu, Dezhi Peng, Peirong Zhang, Zhenhua Yang, Zhibo Yang, Cong Yao, Lianwen Jin
Text recognition, especially for complex scripts like Chinese, faces unique challenges due to its intricate character structures and vast vocabulary.
no code implementations • 8 Mar 2024 • Jiapeng Wang, Chengyu Wang, Tingfeng Cao, Jun Huang, Lianwen Jin
We present DiffChat, a novel method to align Large Language Models (LLMs) to "chat" with prompt-as-input Text-to-Image Synthesis (TIS) models (e.g., Stable Diffusion) for interactive image creation.
1 code implementation • 28 Feb 2024 • Yang Liu, Jiahuan Cao, Chongyu Liu, Kai Ding, Lianwen Jin
Additionally, a comprehensive review of the existing available dataset resources is also provided, including statistics from 444 datasets, covering 8 language categories and spanning 32 domains.
2 code implementations • 27 Jan 2024 • Pengjie Wang, Kaile Zhang, Xinyu Wang, Shengwei Han, Yongge Liu, Jinpeng Wan, Haisu Guan, Zhebin Kuang, Lianwen Jin, Xiang Bai, Yuliang Liu
Oracle bone script, one of the earliest known forms of ancient Chinese writing, presents invaluable research materials for scholars studying the humanities and geography of the Shang Dynasty, dating back 3,000 years.
no code implementations • 23 Jan 2024 • Haisu Guan, Jinpeng Wan, Yuliang Liu, Pengjie Wang, Kaile Zhang, Zhebin Kuang, Xinyu Wang, Xiang Bai, Lianwen Jin
We conducted validation and simulated deciphering on the constructed dataset, and the results demonstrate its high efficacy in aiding the study of oracle bone script.
no code implementations • 15 Jan 2024 • Mingxin Huang, Dezhi Peng, Hongliang Li, Zhenghao Peng, Chongyu Liu, Dahua Lin, Yuliang Liu, Xiang Bai, Lianwen Jin
In this paper, we propose a new end-to-end scene text spotting framework termed SwinTextSpotter v2, which seeks to find a better synergy between text detection and recognition.
1 code implementation • 7 Jan 2024 • Zening Lin, Jiapeng Wang, Teng Li, Wenhui Liao, Dayi Huang, Longfei Xiong, Lianwen Jin
However, simply concatenating SER and RE serially can lead to severe error propagation, and it fails to handle cases like multi-line entities in real scenarios.
Ranked #1 on Key-value Pair Extraction on SIBR
1 code implementation • CVPR 2024 • Chenfan Qu, Yiwu Zhong, Chongyu Liu, Guitao Xu, Dezhi Peng, Fengjun Guo, Lianwen Jin
We further propose a novel metric termed as QES to assist in filtering out unreliable annotations.
no code implementations • 21 Dec 2023 • Linger Deng, Mingxin Huang, Xudong Xie, Yuliang Liu, Lianwen Jin, Xiang Bai
We demonstrate the accuracy of the generated polygons through extensive experiments: 1) By creating polygons from ground truth points, we achieved an accuracy of 82.0% on ICDAR 2015; 2) In training detectors with polygons generated by our method, we attained 86% of the accuracy relative to training with ground truth (GT); 3) Additionally, the proposed Point2Polygon can be seamlessly integrated to empower single-point spotters to generate polygons.
1 code implementation • 19 Dec 2023 • Zhenhua Yang, Dezhi Peng, Yuxin Kong, Yuyi Zhang, Cong Yao, Lianwen Jin
Automatic font generation is an imitation task, which aims to create a font library that mimics the style of reference images while preserving the content from source images.
no code implementations • 5 Dec 2023 • Dezhi Peng, Zhenhua Yang, Jiaxin Zhang, Chongyu Liu, Yongxin Shi, Kai Ding, Fengjun Guo, Lianwen Jin
Without bells and whistles, the experimental results showcase that the proposed method can simultaneously achieve state-of-the-art performance on three tasks with a unified single model, which provides valuable strategies and insights for future research on generalist OCR models.
1 code implementation • 25 Oct 2023 • Yongxin Shi, Dezhi Peng, Wenhui Liao, Zening Lin, Xinhong Chen, Chongyu Liu, Yuyi Zhang, Lianwen Jin
We assess the model's performance across a range of OCR tasks, including scene text recognition, handwritten text recognition, handwritten mathematical expression recognition, table structure recognition, and information extraction from visually-rich documents.
no code implementations • 9 Oct 2023 • Weifeng Lin, Ziheng Wu, Wentao Yang, Mingxin Huang, Jun Huang, Lianwen Jin
In this paper, we introduce Hierarchical Side-Tuning (HST), an innovative PETL method facilitating the transfer of ViT models to diverse downstream tasks.
3 code implementations • ICCV 2023 • Mingxin Huang, Jiaxin Zhang, Dezhi Peng, Hao Lu, Can Huang, Yuliang Liu, Xiang Bai, Lianwen Jin
To this end, we introduce a new model named Explicit Synergy-based Text Spotting Transformer framework (ESTextSpotter), which achieves explicit synergy by modeling discriminative and interactive features for text detection and recognition within a single decoder.
1 code implementation • ICCV 2023 • Qing Jiang, Jiapeng Wang, Dezhi Peng, Chongyu Liu, Lianwen Jin
To this end, we consolidate a large-scale real STR dataset, namely Union14M, which comprises 4 million labeled images and 10 million unlabeled images, to assess the performance of STR models in more complex real-world scenarios.
1 code implementation • ICCV 2023 • Weifeng Lin, Ziheng Wu, Jiayu Chen, Jun Huang, Lianwen Jin
Specifically, SMT with 11.5M / 2.4 GFLOPs and 32M / 7.7 GFLOPs can achieve 82.2% and 84.3% top-1 accuracy on ImageNet-1K, respectively.
1 code implementation • 21 Jun 2023 • Dezhi Peng, Chongyu Liu, Yuliang Liu, Lianwen Jin
As ViTEraser implicitly integrates text localization and inpainting, we propose a novel end-to-end pretraining method, termed SegMIM, which focuses the encoder and decoder on the text box segmentation and masked image modeling tasks, respectively.
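As a loose sketch of the stated idea — combining a text-box segmentation objective on the encoder side with a masked-image-modeling reconstruction objective on the decoder side — the joint loss could be written roughly as below; the heads, masking strategy, and weights are assumptions rather than the paper's SegMIM recipe.

```python
import torch
import torch.nn.functional as F

def segmim_style_loss(seg_logits, seg_target, recon, image, mask, w_seg=1.0, w_mim=1.0):
    """Sketch of a SegMIM-style objective (exact formulation may differ):
    - seg_logits: encoder-side text-box segmentation prediction (B, 1, H, W)
    - seg_target: binary text-box map, float (B, 1, H, W)
    - recon:      decoder-side reconstruction of the image (B, 3, H, W)
    - mask:       1.0 where pixels were masked out (B, 1, H, W)
    """
    seg_loss = F.binary_cross_entropy_with_logits(seg_logits, seg_target)
    # MIM loss is computed only on masked pixels, as in typical masked image modeling.
    mim_loss = (F.l1_loss(recon, image, reduction="none") * mask).sum() / mask.sum().clamp(min=1)
    return w_seg * seg_loss + w_mim * mim_loss
```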
no code implementations • 9 Jun 2023 • Jiaxin Zhang, Bangdong Chen, Hiuyi Cheng, Fengjun Guo, Kai Ding, Lianwen Jin
Furthermore, considering the importance of fine-grained elements in document images, we present a details recurrent refinement module to enhance the output in a high-resolution space.
no code implementations • 28 May 2023 • Jiapeng Wang, Chengyu Wang, Xiaodan Wang, Jun Huang, Lianwen Jin
Large-scale pre-trained text-image models with dual-encoder architectures (such as CLIP) are typically adopted for various vision-language applications, including text-image retrieval.
no code implementations • 15 May 2023 • Hiuyi Cheng, Peirong Zhang, Sihang Wu, Jiaxin Zhang, Qiyuan Zhu, Zecheng Xie, Jing Li, Kai Ding, Lianwen Jin
Document layout analysis is a crucial prerequisite for document understanding, including document retrieval and conversion.
1 code implementation • 13 May 2023 • Yuliang Liu, Zhang Li, Mingxin Huang, Biao Yang, Wenwen Yu, Chunyuan Li, XuCheng Yin, Cheng-Lin Liu, Lianwen Jin, Xiang Bai
In this paper, we conducted a comprehensive evaluation of Large Multimodal Models, such as GPT4V and Gemini, in various text-related visual tasks including Text Recognition, Scene Text-Centric Visual Question Answering (VQA), Document-Oriented VQA, Key Information Extraction (KIE), and Handwritten Mathematical Expression Recognition (HMER).
3 code implementations • 4 Jan 2023 • Yuliang Liu, Jiaxin Zhang, Dezhi Peng, Mingxin Huang, Xinyu Wang, Jingqun Tang, Can Huang, Dahua Lin, Chunhua Shen, Xiang Bai, Lianwen Jin
Within the context of our SPTS v2 framework, our experiments suggest a potential preference for single-point representation in scene text spotting when compared to other representations.
Ranked #15 on Text Spotting on ICDAR 2015
1 code implementation • CVPR 2023 • Hiuyi Cheng, Peirong Zhang, Sihang Wu, Jiaxin Zhang, Qiyuan Zhu, Zecheng Xie, Jing Li, Kai Ding, Lianwen Jin
Document layout analysis is a crucial prerequisite for document understanding, including document retrieval and conversion.
1 code implementation • CVPR 2023 • Chenfan Qu, Chongyu Liu, Yuliang Liu, Xinhong Chen, Dezhi Peng, Fengjun Guo, Lianwen Jin
In this paper, we propose a novel framework to capture more fine-grained clues in complex scenarios for tampered text detection, termed Document Tampering Detector (DTD), which consists of a Frequency Perception Head (FPH) to compensate for the deficiencies caused by inconspicuous visual features, and a Multi-view Iterative Decoder (MID) for fully utilizing the information of features at different scales.
1 code implementation • 17 Oct 2022 • Peirong Zhang, Jiajia Jiang, Yuliang Liu, Lianwen Jin
MSDS-ChS consists of handwritten Chinese signatures and is, to the best of our knowledge, the largest publicly available Chinese signature dataset for handwriting verification, at least eight times larger than existing online datasets.
no code implementations • 29 Jul 2022 • Dezhi Peng, Lianwen Jin, Weihong Ma, Canyu Xie, Hesuo Zhang, Shenggao Zhu, Jing Li
A novel weakly supervised learning method is proposed to enable the network to be trained using only transcript annotations; thus, the expensive character segmentation annotations required by previous segmentation-based methods can be avoided.
4 code implementations • 29 Jul 2022 • Dezhi Peng, Lianwen Jin, Yuliang Liu, Canjie Luo, Songxuan Lai
Utilizing the proposed weakly supervised learning framework, PageNet requires only transcripts to be annotated for real data; however, it can still output detection and recognition results at both the character and line levels, avoiding the labor and cost of labeling bounding boxes of characters and text lines.
1 code implementation • 23 Jul 2022 • Jiaxin Zhang, Canjie Luo, Lianwen Jin, Fengjun Guo, Kai Ding
To address this issue, we propose a novel approach called Marior (Margin Removal and Iterative Content Rectification).
1 code implementation • 21 Jul 2022 • Chongyu Liu, Lianwen Jin, Yuliang Liu, Canjie Luo, Bangdong Chen, Fengjun Guo, Kai Ding
To address this issue, we propose a Contextual-guided Text Removal Network, termed as CTRNet.
no code implementations • 27 Jun 2022 • Chuwei Luo, Guozhi Tang, Qi Zheng, Cong Yao, Lianwen Jin, Chenliang Li, Yang Xue, Luo Si
Multi-modal document pre-trained models have proven to be very effective in a variety of visually-rich document understanding (VrDU) tasks.
1 code implementation • CVPR 2022 • Yuxin Kong, Canjie Luo, Weihong Ma, Qiyuan Zhu, Shenggao Zhu, Nicholas Yuan, Lianwen Jin
Automatic font generation remains a challenging research issue due to the large amounts of characters with complicated structures.
1 code implementation • CVPR 2022 • Canjie Luo, Lianwen Jin, Jingdong Chen
Motivated by this common sense, we augment one image patch and use its neighboring patch as guidance to recover itself.
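A schematic of that training signal — recovering an augmented patch with its neighboring patch as guidance — is sketched below; the encoder/decoder and the augmentation are placeholder assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class NeighborGuidedRecovery(nn.Module):
    """Schematic only: recover an augmented patch using its neighbor as guidance."""
    def __init__(self, dim=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(3, dim, 3, padding=1), nn.ReLU(),
                                     nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU())
        self.decoder = nn.Conv2d(2 * dim, 3, 3, padding=1)

    def forward(self, augmented_patch, neighbor_patch):
        f_aug = self.encoder(augmented_patch)
        f_nb = self.encoder(neighbor_patch)           # guidance features
        return self.decoder(torch.cat([f_aug, f_nb], dim=1))

# Training signal (reconstruction of the clean patch):
# loss = torch.nn.functional.l1_loss(model(aug, neighbor), clean_patch)
```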
2 code implementations • CVPR 2022 • Mingxin Huang, Yuliang Liu, Zhenghao Peng, Chongyu Liu, Dahua Lin, Shenggao Zhu, Nicholas Yuan, Kai Ding, Lianwen Jin
End-to-end scene text spotting has attracted great attention in recent years due to the success of excavating the intrinsic synergy between scene text detection and recognition.
Ranked #3 on Text Spotting on Inverse-Text
4 code implementations • ACL 2022 • Jiapeng Wang, Lianwen Jin, Kai Ding
LiLT can be pre-trained on the structured documents of a single language and then directly fine-tuned on other languages with the corresponding off-the-shelf monolingual/multilingual pre-trained textual models.
Ranked #5 on Key-value Pair Extraction on SIBR
no code implementations • 23 Feb 2022 • Canjie Luo, Yuanzhi Zhu, Lianwen Jin, Zhe Li, Dezhi Peng
Specifically, we propose a style bank to parameterize the specific handwriting styles as latent vectors, which are input to a generator as style priors to achieve the corresponding handwritten styles.
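A minimal sketch of such a style bank — one learnable latent vector per writer, used as a style prior for a (placeholder) generator — could look like this; the dimensions and generator interface are assumptions.

```python
import torch
import torch.nn as nn

class StyleBank(nn.Module):
    """Sketch: one learnable latent style vector per writer, used as a style prior."""
    def __init__(self, num_writers, style_dim=128):
        super().__init__()
        self.bank = nn.Embedding(num_writers, style_dim)

    def forward(self, writer_ids):
        return self.bank(writer_ids)          # (B, style_dim)

# During generation, the style vector conditions a (placeholder) generator:
# style = StyleBank(num_writers=1000)(writer_ids)
# fake_images = generator(content_codes, style)   # generator not defined here
```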
1 code implementation • 15 Dec 2021 • Dezhi Peng, Xinyu Wang, Yuliang Liu, Jiaxin Zhang, Mingxin Huang, Songxuan Lai, Shenggao Zhu, Jing Li, Dahua Lin, Chunhua Shen, Xiang Bai, Lianwen Jin
For the first time, we demonstrate that training scene text spotting models can be achieved with an extremely low-cost annotation of a single-point for each instance.
Ranked #3 on Text Spotting on SCUT-CTW1500
1 code implementation • 13 Aug 2021 • Ruben Tolosana, Ruben Vera-Rodriguez, Carlos Gonzalez-Garcia, Julian Fierrez, Aythami Morales, Javier Ortega-Garcia, Juan Carlos Ruiz-Garcia, Sergio Romero-Tapiador, Santiago Rengifo, Miguel Caruana, Jiajia Jiang, Songxuan Lai, Lianwen Jin, Yecheng Zhu, Javier Galbally, Moises Diaz, Miguel Angel Ferrer, Marta Gomez-Barrero, Ilya Hodashinsky, Konstantin Sarin, Artem Slezkin, Marina Bardamova, Mikhail Svetlakov, Mohammad Saleem, Cintia Lia Szucs, Bence Kovari, Falk Pulsmeyer, Mohamad Wehbi, Dario Zanca, Sumaiya Ahmad, Sarthak Mishra, Suraiya Jabin
This article presents SVC-onGoing, an on-going competition for on-line signature verification where researchers can easily benchmark their systems against the state of the art in an open common platform using large-scale public databases, such as DeepSignDB and SVC2021_EvalDB, and standard experimental protocols.
1 code implementation • 12 Jul 2021 • Chun Chet Ng, Akmalul Khairi Bin Nazaruddin, Yeong Khang Lee, Xinyu Wang, Yuliang Liu, Chee Seng Chan, Lianwen Jin, Yipeng Sun, Lixin Fan
With hundreds of thousands of electronic chip components being manufactured every day, chip manufacturers face an increasing demand for more efficient and effective ways of inspecting the quality of printed text on chip components.
no code implementations • 24 Jun 2021 • Guozhi Tang, Lele Xie, Lianwen Jin, Jiapeng Wang, Jingdong Chen, Zhen Xu, Qianying Wang, Yaqiang Wu, Hui Li
Through key-value matching based on relevancy evaluation, the proposed MatchVIE can bypass recognizing various semantics and simply focus on the strong relevancy between entities.
no code implementations • 20 Jun 2021 • Jiapeng Wang, Tianwei Wang, Guozhi Tang, Lianwen Jin, Weihong Ma, Kai Ding, Yichao Huang
Visual information extraction (VIE) has attracted increasing attention in recent years.
1 code implementation • CVPR 2021 • Tianwei Wang, Yuanzhi Zhu, Lianwen Jin, Dezhi Peng, Zhe Li, Mengchao He, Yongpan Wang, Canjie Luo
Specifically, we integrate IFA into the two most prevailing text recognition streams (attention-based and CTC-based) and propose attention-guided dense prediction (ADP) and Extended CTC (ExCTC).
Optical Character Recognition (OCR) +1
1 code implementation • 1 Jun 2021 • Ruben Tolosana, Ruben Vera-Rodriguez, Carlos Gonzalez-Garcia, Julian Fierrez, Santiago Rengifo, Aythami Morales, Javier Ortega-Garcia, Juan Carlos Ruiz-Garcia, Sergio Romero-Tapiador, Jiajia Jiang, Songxuan Lai, Lianwen Jin, Yecheng Zhu, Javier Galbally, Moises Diaz, Miguel Angel Ferrer, Marta Gomez-Barrero, Ilya Hodashinsky, Konstantin Sarin, Artem Slezkin, Marina Bardamova, Mikhail Svetlakov, Mohammad Saleem, Cintia Lia Szücs, Bence Kovari, Falk Pulsmeyer, Mohamad Wehbi, Dario Zanca, Sumaiya Ahmad, Sarthak Mishra, Suraiya Jabin
This paper describes the experimental framework and results of the ICDAR 2021 Competition on On-Line Signature Verification (SVC 2021).
1 code implementation • 8 May 2021 • Yuliang Liu, Chunhua Shen, Lianwen Jin, Tong He, Peng Chen, Chongyu Liu, Hao Chen
Previous methods can be roughly categorized into two groups: character-based and segmentation-based, which often require character-level annotations and/or complex post-processing due to the unstructured output.
Ranked #7 on Text Spotting on Inverse-Text
no code implementations • 5 May 2021 • Weihong Ma, Hesuo Zhang, Shuang Yan, Guangshun Yao, Yichao Huang, Hui Li, Yaqiang Wu, Lianwen Jin
For building a robust point detector, a fully convolutional network with a feature fusion module is adopted, which can distinguish close points better than traditional methods.
8 code implementations • CVPR 2021 • Yiqin Zhu, Jianyong Chen, Lingyu Liang, Zhanghui Kuang, Lianwen Jin, Wayne Zhang
One of the main challenges for arbitrary-shaped text detection is to design a good text instance representation that allows networks to learn diverse text geometry variances.
1 code implementation • 24 Jan 2021 • Jiapeng Wang, Chongyu Liu, Lianwen Jin, Guozhi Tang, Jiaxin Zhang, Shuaitao Zhang, Qianying Wang, Yaqiang Wu, Mingxiang Cai
Visual information extraction (VIE) has attracted considerable attention recently owing to its various advanced applications such as document understanding, automatic marking and intelligent education.
no code implementations • 20 Jul 2020 • Zhe Li, Lianwen Jin, Songxuan Lai, Yecheng Zhu
Handwritten mathematical expression recognition (HMER) is an important research direction in handwriting recognition.
1 code implementation • 14 Jul 2020 • Weihong Ma, Hesuo Zhang, Lianwen Jin, Sihang Wu, Jiapeng Wang, Yongpan Wang
In this framework, two branches named character branch and layout branch are added behind the feature extraction network.
1 code implementation • 7 May 2020 • Xiaoxue Chen, Lianwen Jin, Yuanzhi Zhu, Canjie Luo, Tianwei Wang
This paper aims to (1) summarize the fundamental problems and the state-of-the-art associated with scene text recognition; (2) introduce new insights and ideas; (3) provide a comprehensive review of publicly available resources; (4) point out directions for future work.
3 code implementations • CVPR 2020 • Canjie Luo, Yuanzhi Zhu, Lianwen Jin, Yongpan Wang
An agent network learns from the output of the recognition network and controls the fiducial points to generate more proper training samples for the recognition network.
15 code implementations • CVPR 2020 • Yuliang Liu, Hao Chen, Chunhua Shen, Tong He, Lianwen Jin, Liangwei Wang
Our contributions are three-fold: 1) For the first time, we adaptively fit arbitrarily-shaped text by a parameterized Bezier curve.
Ranked #9 on Text Spotting on Inverse-Text
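For reference, the cubic Bezier parameterization underlying this representation can be evaluated with a few lines of NumPy; the fitting/regression of control points, which is the paper's contribution, is not shown.

```python
import numpy as np

def cubic_bezier(ctrl, num=50):
    """Evaluate a cubic Bezier curve from 4 control points of shape (4, 2)."""
    t = np.linspace(0.0, 1.0, num)[:, None]
    return (  (1 - t) ** 3         * ctrl[0]
            + 3 * (1 - t) ** 2 * t * ctrl[1]
            + 3 * (1 - t) * t ** 2 * ctrl[2]
            + t ** 3               * ctrl[3])   # (num, 2) points along the boundary

# Example: a gently curved top boundary of a text instance.
pts = cubic_bezier(np.array([[0, 0], [30, -10], [70, -10], [100, 0]], float))
```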
no code implementations • CVPR 2020 • Xinyu Wang, Yuliang Liu, Chunhua Shen, Chun Chet Ng, Canjie Luo, Lianwen Jin, Chee Seng Chan, Anton Van Den Hengel, Liangwei Wang
Visual Question Answering (VQA) methods have made incredible progress, but suffer from a failure to generalize.
no code implementations • 13 Jan 2020 • Canjie Luo, Qingxiang Lin, Yuliang Liu, Lianwen Jin, Chunhua Shen
Furthermore, to tackle the issue of lacking paired training samples, we design an interactive joint training scheme, which shares attention masks from the recognizer to the discriminator, and enables the discriminator to extract the features of each character for further adversarial training.
4 code implementations • 21 Dec 2019 • Tianwei Wang, Yuanzhi Zhu, Lianwen Jin, Canjie Luo, Xiaoxue Chen, Yaqiang Wu, Qianying Wang, Mingxiang Cai
To remedy this issue, we propose a decoupled attention network (DAN), which decouples the alignment operation from using historical decoding results.
Ranked #4 on Scene Text Recognition on ICDAR 2003
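A simplified stand-in for the decoupling idea described above — predicting per-step attention maps from visual features alone rather than from historical decoding results — might look as follows; the actual convolutional alignment module in the paper is more elaborate.

```python
import torch
import torch.nn as nn

class ConvAlignment(nn.Module):
    """Simplified stand-in: attention maps are predicted from visual features
    alone, never from previously decoded characters."""
    def __init__(self, in_channels, max_steps=25):
        super().__init__()
        self.to_maps = nn.Conv2d(in_channels, max_steps, kernel_size=3, padding=1)

    def forward(self, feat):                       # feat: (B, C, H, W)
        b, c, h, w = feat.shape
        maps = self.to_maps(feat).view(b, -1, h * w).softmax(dim=-1)
        maps = maps.view(b, -1, h, w)              # one spatial map per decoding step
        # Context vectors handed to the decoder: attention-weighted feature sums.
        return torch.einsum("bthw,bchw->btc", maps, feat)
```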
1 code implementation • 20 Dec 2019 • Yuliang Liu, Tong He, Hao Chen, Xinyu Wang, Canjie Luo, Shuaitao Zhang, Chunhua Shen, Lianwen Jin
More importantly, based on OBD, we provide a detailed analysis of the impact of a collection of refinements, which may inspire others to build state-of-the-art text detectors.
Ranked #3 on Scene Text Detection on ICDAR 2017 MLT
no code implementations • NeurIPS 2019 • Lingyu Liang, Lianwen Jin, Yong Xu
In practical verification, we design a new regularization structure with a guided feature to produce GNN-based filtering and propagation diffusion, tackling the ill-posed inverse problems of quotient image analysis (QIA), which recovers the reflectance ratio as a signature for image analysis or adjustment.
no code implementations • 13 Nov 2019 • Songxuan Lai, Lianwen Jin, Luojun Lin, Yecheng Zhu, Huiyun Mao
To tackle this issue, this paper proposes to learn dynamic signature representations through ranking synthesized signatures.
no code implementations • 1 Nov 2019 • Songbin Xu, Yang Xue, Xin Zhang, Lianwen Jin
As a new way of human-computer interaction, inertial sensor based in-air handwriting can provide a natural and unconstrained interaction to express more complex and richer information in 3D space.
1 code implementation • 17 Sep 2019 • Yipeng Sun, Zihan Ni, Chee-Kheng Chng, Yuliang Liu, Canjie Luo, Chun Chet Ng, Junyu Han, Errui Ding, Jingtuo Liu, Dimosthenis Karatzas, Chee Seng Chan, Lianwen Jin
Robust text reading from street view images provides valuable information for various applications.
1 code implementation • 16 Sep 2019 • Chee-Kheng Chng, Yuliang Liu, Yipeng Sun, Chun Chet Ng, Canjie Luo, Zihan Ni, ChuanMing Fang, Shuaitao Zhang, Junyu Han, Errui Ding, Jingtuo Liu, Dimosthenis Karatzas, Chee Seng Chan, Lianwen Jin
This paper reports the ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text (RRC-ArT) that consists of three major challenges: i) scene text detection, ii) scene text recognition, and iii) scene text spotting.
no code implementations • 26 Aug 2019 • Xiaoxue Chen, Tianwei Wang, Yuanzhi Zhu, Lianwen Jin, Canjie Luo
Scene text recognition has attracted particular research interest because it is a very challenging problem and has various applications.
1 code implementation • 6 Jun 2019 • Yuliang Liu, Sheng Zhang, Lianwen Jin, Lele Xie, Yaqiang Wu, Zhepeng Wang
Scene text in the wild is commonly presented with high variant characteristics.
Ranked #1 on Scene Text Detection on IC19-ReCTs (using extra training data)
no code implementations • 3 May 2019 • Songxuan Lai, Lianwen Jin
In this paper, we propose a novel set of features for offline writer identification based on the path signature approach, which provides a principled way to express information contained in a path.
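For context, low-order path signature terms of a piecewise-linear path (e.g., a pen trajectory) can be computed iteratively via Chen's identity; the NumPy sketch below covers only levels 1 and 2, whereas the paper's feature set is richer.

```python
import numpy as np

def signature_level2(points):
    """Level-1 and level-2 signature of a piecewise-linear path via Chen's identity.
    points: array of shape (n, d). Returns (S1 of shape (d,), S2 of shape (d, d))."""
    points = np.asarray(points, dtype=float)
    d = points.shape[1]
    s1 = np.zeros(d)
    s2 = np.zeros((d, d))
    for k in range(1, len(points)):
        delta = points[k] - points[k - 1]
        # Chen's identity for appending one linear segment.
        s2 += np.outer(s1, delta) + 0.5 * np.outer(delta, delta)
        s1 += delta
    return s1, s2

# Example: signature features of a small 2-D pen trajectory.
s1, s2 = signature_level2([[0, 0], [1, 0], [1, 1], [0, 1]])
```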
2 code implementations • CVPR 2019 • Zecheng Xie, Yaoxiong Huang, Yuanzhi Zhu, Lianwen Jin, Yuliang Liu, Lele Xie
In this paper, we propose a novel method, aggregation cross-entropy (ACE), for sequence recognition from a brand new perspective.
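A compact sketch of the aggregation cross-entropy idea — comparing time-aggregated class probabilities against normalized character counts — is given below; normalization details may differ from the paper's exact formulation.

```python
import torch

def ace_loss(probs, label_counts):
    """Sketch of aggregation cross-entropy for sequence recognition.
    probs:        (B, T, K) per-timestep class probabilities (class 0 = blank).
    label_counts: (B, K) occurrences of each non-blank class in the transcription;
                  the blank column is filled in below."""
    b, t, k = probs.shape
    counts = label_counts.clone().float()
    counts[:, 0] = t - counts[:, 1:].sum(dim=1)   # blanks absorb the remaining steps
    pred = probs.sum(dim=1) / t                   # aggregate predictions over time
    target = counts / t
    return -(target * torch.log(pred.clamp(min=1e-10))).sum(dim=1).mean()
```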
1 code implementation • CVPR 2019 • Yuliang Liu, Lianwen Jin, Zecheng Xie, Canjie Luo, Shuaitao Zhang, Lele Xie
Evaluation protocols play a key role in the developmental progress of text detection methods.
no code implementations • 9 Mar 2019 • Pengwei Wang, Dejing Dou, Fangzhao Wu, Nisansa de Silva, Lianwen Jin
Then, to put both triples and mined logic rules within the same semantic space, all triples in the knowledge graph are represented in first-order logic.
6 code implementations • 10 Jan 2019 • Canjie Luo, Lianwen Jin, Zenghui Sun
It decreases the difficulty of recognition and enables the attention-based sequence recognition network to more easily read irregular text.
3 code implementations • 3 Dec 2018 • Shuaitao Zhang, Yuliang Liu, Lianwen Jin, Yaoxiong Huang, Songxuan Lai
The feature of the former is first enhanced by a novel lateral connection structure and then refined by four carefully designed losses: multiscale regression loss and content loss, which capture the global discrepancy of different level features; texture loss and total variation loss, which primarily target filling the text region and preserving the reality of the background.
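Generic forms of two of the listed losses — the Gram-matrix texture loss and the total variation loss — are sketched below; the feature layers, weights, and the remaining losses from the paper are not reproduced.

```python
import torch

def gram_matrix(feat):                       # feat: (B, C, H, W)
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def texture_loss(pred_feat, target_feat):
    """Texture discrepancy measured between Gram matrices of deep features."""
    return torch.mean((gram_matrix(pred_feat) - gram_matrix(target_feat)) ** 2)

def total_variation_loss(img):               # img: (B, C, H, W)
    """Encourages spatial smoothness in the inpainted output."""
    dh = (img[:, :, 1:, :] - img[:, :, :-1, :]).abs().mean()
    dw = (img[:, :, :, 1:] - img[:, :, :, :-1]).abs().mean()
    return dh + dw
```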
1 code implementation • 17 Nov 2018 • Chenyang Li, Xin Zhang, Lufan Liao, Lianwen Jin, Weixin Yang
In this paper, we first leverage a robust feature descriptor, path signature (PS), and propose three PS features to explicitly represent the spatial and temporal motion characteristics, i. e., spatial PS (S_PS), temporal PS (T_PS) and temporal spatial PS (T_S_PS).
Ranked #1 on Gesture Recognition on ChaLearn 2013
1 code implementation • 16 Nov 2018 • Lele Xie, Yuliang Liu, Lianwen Jin, Zecheng Xie
However, the detection performance is sensitive to the setting of the anchor boxes.
no code implementations • 25 Mar 2018 • Dezhi Peng, Zikai Sun, Zirong Chen, Zirui Cai, Lele Xie, Lianwen Jin
To improve the performance of small head detection, we propose a cascaded multi-scale architecture which has two detectors.
5 code implementations • 19 Jan 2018 • Lingyu Liang, Luojun Lin, Lianwen Jin, Duorui Xie, Mengru Li
Previous works have formulated the recognition of facial beauty as a specific supervised learning problem of classification, regression or ranking, which indicates that FBP is intrinsically a computation problem with multiple paradigms.
Ranked #2 on Facial Beauty Prediction on SCUT-FBP
no code implementations • 12 Nov 2017 • Sheng Zhang, Yuliang Liu, Lianwen Jin, Canjie Luo
In this paper, we propose a refined scene text detector with a \textit{novel} Feature Enhancement Network (FEN) for Region Proposal and Text Detection Refinement.
no code implementations • 13 Jul 2017 • Weixin Yang, Terry Lyons, Hao Ni, Cordelia Schmid, Lianwen Jin
To this end, we regard the evolving landmark data as a high-dimensional path and apply non-linear path signature techniques to provide an expressive, robust, non-linear, and interpretable representation for the sequential events.
no code implementations • 19 May 2017 • Songxuan Lai, Lianwen Jin, Weixin Yang
Inspired by the great success of recurrent neural networks (RNNs) in sequential modeling, we introduce a novel RNN system to improve the performance of online signature verification.
no code implementations • 15 May 2017 • Xuefeng Xiao, Yafeng Yang, Tasweer Ahmad, Lianwen Jin, Tianhai Chang
Currently, owing to the ubiquity of mobile devices, online handwritten Chinese character recognition (HCCR) has become one of the most suitable choices for text input on cell phones and tablet devices.
no code implementations • 17 Mar 2017 • Shuangping Huang, Zhuoyao Zhong, Lianwen Jin, Shuye Zhang, Haobin Wang
Chinese font recognition (CFR) has gained significant attention in recent years.
no code implementations • CVPR 2017 • Yuliang Liu, Lianwen Jin
The effectiveness of our approach is evaluated on a public word-level, multi-oriented scene text database, ICDAR 2015 Robust Reading Competition Challenge 4 "Incidental scene text localization".
no code implementations • 26 Feb 2017 • Xuefeng Xiao, Lianwen Jin, Yafeng Yang, Weixin Yang, Jun Sun, Tianhai Chang
We design a nine-layer CNN for HCCR covering 3,755 classes, and devise an algorithm that reduces the network's computational cost by a factor of nine and compresses the network to 1/18 of the baseline model's original size, with only a 0.21% drop in accuracy.
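The full compression pipeline is not reproduced here; as a generic illustration of one common ingredient (magnitude-based weight pruning), a hypothetical PyTorch sketch on a toy HCCR-sized classifier follows. The architecture and pruning ratio are assumptions.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# Generic illustration only: the paper's compression pipeline is more involved.
model = nn.Sequential(
    nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(128, 3755),                 # 3,755 HCCR classes
)

for module in model.modules():
    if isinstance(module, (nn.Conv2d, nn.Linear)):
        prune.l1_unstructured(module, name="weight", amount=0.8)  # drop 80% smallest weights
        prune.remove(module, "weight")                            # make pruning permanent
```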
no code implementations • 24 Feb 2017 • Songxuan Lai, Lianwen Jin, Weixin Yang
This paper presents an investigation of several techniques that increase the accuracy of online handwritten Chinese character recognition (HCCR).
no code implementations • 9 Oct 2016 • Zecheng Xie, Zenghui Sun, Lianwen Jin, Hao Ni, Terry Lyons
Online handwritten Chinese text recognition (OHCTR) is a challenging problem as it involves a large-scale character set, ambiguous segmentation, and variable-length input sequences.
5 code implementations • 24 May 2016 • Zhuoyao Zhong, Lianwen Jin, Shuye Zhang, Ziyong Feng
In this paper, we develop a novel unified framework called DeepText for text region proposal generation and text detection in natural images via a fully convolutional neural network (CNN).
no code implementations • 18 Apr 2016 • Zecheng Xie, Zenghui Sun, Lianwen Jin, Ziyong Feng, Shuye Zhang
This paper proposes an end-to-end framework, namely fully convolutional recurrent network (FCRN) for handwritten Chinese text recognition (HCTR).
Handwriting Recognition • Handwritten Chinese Text Recognition +1
no code implementations • 13 Feb 2016 • Shuye Zhang, Mude Lin, Tianshui Chen, Lianwen Jin, Liang Lin
Maximally stable extremal regions (MSER), which is a popular method to generate character proposals/candidates, has shown superior performance in scene text detection.
1 code implementation • 8 Nov 2015 • Duorui Xie, Lingyu Liang, Lianwen Jin, Jie Xu, Mengru Li
In this paper, a novel face dataset with attractiveness ratings, namely, the SCUT-FBP dataset, is developed for automatic facial beauty perception.
no code implementations • 8 Nov 2015 • Jie Xu, Lianwen Jin, Lingyu Liang, Ziyong Feng, Duorui Xie
This paper proposes a deep learning method to address the challenging facial attractiveness prediction problem.
no code implementations • 7 Nov 2015 • Xiaorui Liu, Yichao Huang, Xin Zhang, Lianwen Jin
We introduce a new pipeline for hand localization and fingertip detection.
no code implementations • 20 Aug 2015 • Weixin Yang, Lianwen Jin, Manfei Liu
A key feature of DeepWriterID is a new method we are proposing, called DropSegment.
no code implementations • 30 May 2015 • Liquan Qiu, Lianwen Jin, Ruifen Dai, Yuxiang Zhang, Lei LI
This paper presents an open source tool for testing the recognition accuracy of Chinese handwriting input methods.
no code implementations • 28 May 2015 • Weixin Yang, Lianwen Jin, Zecheng Xie, Ziyong Feng
Deep convolutional neural networks (DCNNs) have achieved great success in various computer vision and pattern recognition applications, including those for handwritten Chinese character recognition (HCCR).
no code implementations • 25 May 2015 • Meijun He, Shuye Zhang, Huiyun Mao, Lianwen Jin
In this paper, we present an effective method to analyze the recognition confidence of handwritten Chinese characters, based on the softmax regression score of a high-performance convolutional neural network (CNN).
no code implementations • 20 May 2015 • Weixin Yang, Lianwen Jin, DaCheng Tao, Zecheng Xie, Ziyong Feng
Inspired by the theory of Leitner's learning box from the field of psychology, we propose DropSample, a new method for training deep convolutional neural networks (DCNNs), and apply it to large-scale online handwritten Chinese character recognition (HCCR).
1 code implementation • 19 May 2015 • Zhuoyao Zhong, Lianwen Jin, Zecheng Xie
We design a streamlined version of GoogLeNet [13], which was originally proposed for image classification with a very deep architecture, for HCCR (denoted as HCCR-GoogLeNet).
Image Classification • Offline Handwritten Chinese Character Recognition
no code implementations • 19 May 2015 • Weixin Yang, Lianwen Jin, Manfei Liu
The results reveal that the path-signature feature is useful for writer identification, and the proposed DropStroke technique enhances the generalization and significantly improves performance.