Search Results for author: Yasuhisa Fujii

Found 17 papers, 5 papers with code

Sequence-to-Label Script Identification for Multilingual OCR

no code implementations • 15 Aug 2017 • Yasuhisa Fujii, Karel Driesen, Jonathan Baccash, Ash Hurst, Ashok C. Popat

Therefore we reframe line script identification as a sequence-to-label problem and solve it using two components, trained end-to-end: Encoder and Summarizer.

Optical Character Recognition (OCR)
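
The sequence-to-label framing described in the abstract above can be illustrated with a toy sketch. This is not the paper's trained model: `encode` and `summarize` are hypothetical stand-ins (a Unicode-name lookup and mean pooling) for the learned Encoder and Summarizer, and the three-script label set is an assumption chosen for the example.

```python
import unicodedata

# Assumed label set, for illustration only.
SCRIPTS = ["LATIN", "CYRILLIC", "GREEK"]

def encode(line):
    """Toy 'encoder': one score vector per character.

    Each vector is one-hot on the script whose name prefixes the
    character's Unicode name; other characters (spaces, digits) get zeros.
    """
    features = []
    for ch in line:
        name = unicodedata.name(ch, "")
        features.append([1.0 if name.startswith(s) else 0.0 for s in SCRIPTS])
    return features

def summarize(features):
    """Toy 'summarizer': mean-pool over the sequence, then take the top label.

    Ties (including an all-zero pooled vector) fall back to the first label.
    """
    if not features:
        return None
    pooled = [sum(column) / len(features) for column in zip(*features)]
    best = max(range(len(SCRIPTS)), key=pooled.__getitem__)
    return SCRIPTS[best]

def identify_script(line):
    """Map a whole text line (a sequence) to a single script label."""
    return summarize(encode(line))
```

The point of the split is that the encoder works per position while the summarizer collapses a variable-length sequence into one label, so a line of any length yields exactly one prediction.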

A Scalable Handwritten Text Recognition System

no code implementations • 19 Apr 2019 • R. Reeve Ingle, Yasuhisa Fujii, Thomas Deselaers, Jonathan Baccash, Ashok C. Popat

These constitute a solution to bring HTR capability into a large scale OCR system.

Handwriting Recognition, Handwritten Text Recognition, +1

Towards Unconstrained End-to-End Text Spotting

no code implementations • ICCV 2019 • Siyang Qin, Alessandro Bissacco, Michalis Raptis, Yasuhisa Fujii, Ying Xiao

We propose an end-to-end trainable network that can simultaneously detect and recognize text of arbitrary shape, making substantial progress on the open problem of reading scene text of irregular shape.

Instance Segmentation, Optical Character Recognition (OCR), +3

Post-OCR Paragraph Recognition by Graph Convolutional Networks

no code implementations • 29 Jan 2021 • Renshen Wang, Yasuhisa Fujii, Ashok C. Popat

We propose a new approach for paragraph recognition in document images by spatial graph convolutional networks (GCN) applied on OCR text boxes.

Clustering, Optical Character Recognition (OCR)

Rethinking Text Line Recognition Models

1 code implementation • 15 Apr 2021 • Daniel Hernandez Diaz, Siyang Qin, Reeve Ingle, Yasuhisa Fujii, Alessandro Bissacco

Unlike the more common Transformer-based models, this architecture can handle inputs of arbitrary length, a requirement for universal line recognition.

Ranked #2 on Handwritten Text Recognition on IAM (using extra training data)

Handwritten Text Recognition, Language Modelling

ROPE: Reading Order Equivariant Positional Encoding for Graph-based Document Information Extraction

no code implementations • ACL 2021 • Chen-Yu Lee, Chun-Liang Li, Chu Wang, Renshen Wang, Yasuhisa Fujii, Siyang Qin, Ashok Popat, Tomas Pfister

Natural reading orders of words are crucial for information extraction from form-like documents.

FormNet: Structural Encoding beyond Sequential Modeling in Form Document Information Extraction

no code implementations • ACL 2022 • Chen-Yu Lee, Chun-Liang Li, Timothy Dozat, Vincent Perot, Guolong Su, Nan Hua, Joshua Ainslie, Renshen Wang, Yasuhisa Fujii, Tomas Pfister

Sequence modeling has demonstrated state-of-the-art performance on natural language and document understanding tasks.

document understanding

Unified Line and Paragraph Detection by Graph Convolutional Networks

no code implementations • 17 Mar 2022 • Shuang Liu, Renshen Wang, Michalis Raptis, Yasuhisa Fujii

We formulate the task of detecting lines and paragraphs in a document into a unified two-level clustering problem.

Clustering, Text Detection
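
The two-level clustering formulation in the abstract above can be sketched as follows. This is a minimal illustration, assuming the pairwise "same line" and "same paragraph" links have already been predicted; in the paper those predictions come from graph convolutional networks, whereas here they are hand-specified inputs, and the helper names `cluster` and `two_level_clusters` are hypothetical.

```python
def cluster(n, edges):
    """Group items 0..n-1 into connected components using union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

def two_level_clusters(num_words, word_links, line_links):
    """Level 1 clusters words into lines; level 2 clusters lines into paragraphs.

    word_links: pairs of word indices assumed to share a line.
    line_links: pairs of line indices (positions in the sorted level-1
                output) assumed to share a paragraph.
    """
    lines = cluster(num_words, word_links)
    paragraphs = cluster(len(lines), line_links)
    return lines, [[lines[i] for i in para] for para in paragraphs]

# Five words: 0-1 share a line, 2-3 share a line, 4 stands alone;
# the first two lines belong to the same paragraph.
lines, paragraphs = two_level_clusters(5, [(0, 1), (2, 3)], [(0, 1)])
```

Running both levels through the same clustering routine is what makes the formulation unified: only the items and the link predictions change between levels.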

Towards End-to-End Unified Scene Text Detection and Layout Analysis

2 code implementations • CVPR 2022 • Shangbang Long, Siyang Qin, Dmitry Panteleev, Alessandro Bissacco, Yasuhisa Fujii, Michalis Raptis

In this paper, we bring them together and introduce the task of unified scene text detection and layout analysis.

Document Layout Analysis, Scene Text Detection, +1

Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes

1 code implementation • 3 May 2023 • Cheng-Yu Hsieh, Chun-Liang Li, Chih-Kuan Yeh, Hootan Nakhost, Yasuhisa Fujii, Alexander Ratner, Ranjay Krishna, Chen-Yu Lee, Tomas Pfister

Third, we reduce both the model size and the amount of data required to outperform LLMs; our finetuned 770M T5 model outperforms the few-shot prompted 540B PaLM model using only 80% of available data on a benchmark, whereas standard finetuning of the same T5 model struggles to match it even when using 100% of the dataset.

Text Reading Order in Uncontrolled Conditions by Sparse Graph Segmentation

no code implementations • 4 May 2023 • Renshen Wang, Yasuhisa Fujii, Alessandro Bissacco

Text reading order is a crucial aspect in the output of an OCR engine, with a large impact on downstream tasks.

Optical Character Recognition (OCR)

FormNetV2: Multimodal Graph Contrastive Learning for Form Document Information Extraction

no code implementations • 4 May 2023 • Chen-Yu Lee, Chun-Liang Li, Hao Zhang, Timothy Dozat, Vincent Perot, Guolong Su, Xiang Zhang, Kihyuk Sohn, Nikolai Glushnev, Renshen Wang, Joshua Ainslie, Shangbang Long, Siyang Qin, Yasuhisa Fujii, Nan Hua, Tomas Pfister

In FormNetV2, we introduce a centralized multimodal graph contrastive learning strategy to unify self-supervised pre-training for all modalities in one loss.

Contrastive Learning, document understanding, +1