2 code implementations • CVPR 2022 • Shangbang Long, Siyang Qin, Dmitry Panteleev, Alessandro Bissacco, Yasuhisa Fujii, Michalis Raptis
In this paper, we bring them together and introduce the task of unified scene text detection and layout analysis.
no code implementations • 17 Mar 2022 • Shuang Liu, Renshen Wang, Michalis Raptis, Yasuhisa Fujii
We formulate the task of detecting lines and paragraphs in a document into a unified two-level clustering problem.
no code implementations • ACL 2022 • Chen-Yu Lee, Chun-Liang Li, Timothy Dozat, Vincent Perot, Guolong Su, Nan Hua, Joshua Ainslie, Renshen Wang, Yasuhisa Fujii, Tomas Pfister
Sequence modeling has demonstrated state-of-the-art performance on natural language and document understanding tasks.
no code implementations • ACL 2021 • Chen-Yu Lee, Chun-Liang Li, Chu Wang, Renshen Wang, Yasuhisa Fujii, Siyang Qin, Ashok Popat, Tomas Pfister
Natural reading orders of words are crucial for information extraction from form-like documents.
no code implementations • 15 Apr 2021 • Daniel Hernandez Diaz, Siyang Qin, Reeve Ingle, Yasuhisa Fujii, Alessandro Bissacco
Unlike the more common Transformer-based models, this architecture can handle inputs of arbitrary length, a requirement for universal line recognition.
no code implementations • 29 Jan 2021 • Renshen Wang, Yasuhisa Fujii, Ashok C. Popat
We propose a new approach for paragraph recognition in document images by spatial graph convolutional networks (GCN) applied on OCR text boxes.
no code implementations • ICCV 2019 • Siyang Qin, Alessandro Bissacco, Michalis Raptis, Yasuhisa Fujii, Ying Xiao
We propose an end-to-end trainable network that can simultaneously detect and recognize text of arbitrary shape, making substantial progress on the open problem of reading scene text of irregular shape.
Instance Segmentation
Optical Character Recognition (OCR)
+2
no code implementations • 19 Apr 2019 • R. Reeve Ingle, Yasuhisa Fujii, Thomas Deselaers, Jonathan Baccash, Ashok C. Popat
These constitute a solution to bring HTR capability into a large scale OCR system.
no code implementations • 15 Aug 2017 • Yasuhisa Fujii, Karel Driesen, Jonathan Baccash, Ash Hurst, Ashok C. Popat
Therefore we reframe line script identification as a sequence-to-label problem and solve it using two components, trained end-toend: Encoder and Summarizer.