Search Results for author: Yasuhisa Fujii

Found 17 papers, 5 papers with code

Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding

no code implementations • 9 Jan 2024 • Zilong Wang, Hao Zhang, Chun-Liang Li, Julian Martin Eisenschlos, Vincent Perot, Zifeng Wang, Lesly Miculicich, Yasuhisa Fujii, Jingbo Shang, Chen-Yu Lee, Tomas Pfister

We propose the Chain-of-Table framework, where tabular data is explicitly used in the reasoning chain as a proxy for intermediate thoughts.

Ranked #3 on Table-based Fact Verification on TabFact

Fact Verification In-Context Learning +3

Hierarchical Text Spotter for Joint Text Spotting and Layout Analysis

1 code implementation • 25 Oct 2023 • Shangbang Long, Siyang Qin, Yasuhisa Fujii, Alessandro Bissacco, Michalis Raptis

We propose Hierarchical Text Spotter (HTS), a novel method for the joint task of word-level text spotting and geometric layout analysis.

Text Spotting

OCR Language Models with Custom Vocabularies

no code implementations • 18 Aug 2023 • Peter Garst, Reeve Ingle, Yasuhisa Fujii

Language models are useful adjuncts to optical models for producing accurate optical character recognition (OCR) results.

Decoder Language Modelling +2

Tool Documentation Enables Zero-Shot Tool-Usage with Large Language Models

no code implementations • 1 Aug 2023 • Cheng-Yu Hsieh, Si-An Chen, Chun-Liang Li, Yasuhisa Fujii, Alexander Ratner, Chen-Yu Lee, Ranjay Krishna, Tomas Pfister

Today, large language models (LLMs) are taught to use new tools by providing a few demonstrations of the tool's usage.

Image Generation

ICDAR 2023 Competition on Hierarchical Text Detection and Recognition

1 code implementation • 16 May 2023 • Shangbang Long, Siyang Qin, Dmitry Panteleev, Alessandro Bissacco, Yasuhisa Fujii, Michalis Raptis

We organize a competition on hierarchical text detection and recognition.

Text Detection

Text Reading Order in Uncontrolled Conditions by Sparse Graph Segmentation

no code implementations • 4 May 2023 • Renshen Wang, Yasuhisa Fujii, Alessandro Bissacco

Text reading order is a crucial aspect in the output of an OCR engine, with a large impact on downstream tasks.

Optical Character Recognition (OCR)

FormNetV2: Multimodal Graph Contrastive Learning for Form Document Information Extraction

no code implementations • 4 May 2023 • Chen-Yu Lee, Chun-Liang Li, Hao Zhang, Timothy Dozat, Vincent Perot, Guolong Su, Xiang Zhang, Kihyuk Sohn, Nikolai Glushnev, Renshen Wang, Joshua Ainslie, Shangbang Long, Siyang Qin, Yasuhisa Fujii, Nan Hua, Tomas Pfister

In FormNetV2, we introduce a centralized multimodal graph contrastive learning strategy to unify self-supervised pre-training for all modalities in one loss.

Contrastive Learning document understanding +1

Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes

1 code implementation • 3 May 2023 • Cheng-Yu Hsieh, Chun-Liang Li, Chih-Kuan Yeh, Hootan Nakhost, Yasuhisa Fujii, Alexander Ratner, Ranjay Krishna, Chen-Yu Lee, Tomas Pfister

Third, we reduce both the model size and the amount of data required to outperform LLMs; our finetuned 770M T5 model outperforms the few-shot prompted 540B PaLM model using only 80% of available data on a benchmark, whereas standard finetuning the same T5 model struggles to match even by using 100% of the dataset.

Towards End-to-End Unified Scene Text Detection and Layout Analysis

2 code implementations • CVPR 2022 • Shangbang Long, Siyang Qin, Dmitry Panteleev, Alessandro Bissacco, Yasuhisa Fujii, Michalis Raptis

In this paper, we bring them together and introduce the task of unified scene text detection and layout analysis.

Document Layout Analysis Scene Text Detection +1

Unified Line and Paragraph Detection by Graph Convolutional Networks

no code implementations • 17 Mar 2022 • Shuang Liu, Renshen Wang, Michalis Raptis, Yasuhisa Fujii

We formulate the task of detecting lines and paragraphs in a document into a unified two-level clustering problem.

Clustering Text Detection

FormNet: Structural Encoding beyond Sequential Modeling in Form Document Information Extraction

no code implementations • ACL 2022 • Chen-Yu Lee, Chun-Liang Li, Timothy Dozat, Vincent Perot, Guolong Su, Nan Hua, Joshua Ainslie, Renshen Wang, Yasuhisa Fujii, Tomas Pfister

Sequence modeling has demonstrated state-of-the-art performance on natural language and document understanding tasks.

document understanding

ROPE: Reading Order Equivariant Positional Encoding for Graph-based Document Information Extraction

no code implementations • ACL 2021 • Chen-Yu Lee, Chun-Liang Li, Chu Wang, Renshen Wang, Yasuhisa Fujii, Siyang Qin, Ashok Popat, Tomas Pfister

Natural reading orders of words are crucial for information extraction from form-like documents.

Rethinking Text Line Recognition Models

1 code implementation • 15 Apr 2021 • Daniel Hernandez Diaz, Siyang Qin, Reeve Ingle, Yasuhisa Fujii, Alessandro Bissacco

Unlike the more common Transformer-based models, this architecture can handle inputs of arbitrary length, a requirement for universal line recognition.

Ranked #2 on Handwritten Text Recognition on IAM (using extra training data)

Decoder Handwritten Text Recognition +1

Post-OCR Paragraph Recognition by Graph Convolutional Networks

no code implementations • 29 Jan 2021 • Renshen Wang, Yasuhisa Fujii, Ashok C. Popat

We propose a new approach for paragraph recognition in document images by spatial graph convolutional networks (GCN) applied on OCR text boxes.

Clustering Optical Character Recognition (OCR)

Towards Unconstrained End-to-End Text Spotting

no code implementations • ICCV 2019 • Siyang Qin, Alessandro Bissacco, Michalis Raptis, Yasuhisa Fujii, Ying Xiao

We propose an end-to-end trainable network that can simultaneously detect and recognize text of arbitrary shape, making substantial progress on the open problem of reading scene text of irregular shape.

Instance Segmentation Optical Character Recognition (OCR) +3

A Scalable Handwritten Text Recognition System

no code implementations • 19 Apr 2019 • R. Reeve Ingle, Yasuhisa Fujii, Thomas Deselaers, Jonathan Baccash, Ashok C. Popat

These constitute a solution to bring HTR capability into a large scale OCR system.

Handwriting Recognition Handwritten Text Recognition +1

Sequence-to-Label Script Identification for Multilingual OCR

no code implementations • 15 Aug 2017 • Yasuhisa Fujii, Karel Driesen, Jonathan Baccash, Ash Hurst, Ashok C. Popat

Therefore, we reframe line script identification as a sequence-to-label problem and solve it using two components, trained end-to-end: Encoder and Summarizer.

Optical Character Recognition (OCR)
