Search Results for author: Masato Fujitake

Found 8 papers, 1 paper with code

JSTR: Judgment Improves Scene Text Recognition

no code implementations 9 Apr 2024 Masato Fujitake

In this paper, we present a method for enhancing the accuracy of scene text recognition tasks by judging whether the image and text match each other.

Scene Text Recognition

LayoutLLM: Large Language Model Instruction Tuning for Visually Rich Document Understanding

no code implementations 21 Mar 2024 Masato Fujitake

By leveraging the strengths of existing research in document image understanding and the superior language understanding capabilities of LLMs, the proposed model, fine-tuned with multimodal instruction datasets, understands document images within a single model.

Document Image Classification document understanding +2

RL-LOGO: Deep Reinforcement Learning Localization for Logo Recognition

no code implementations 28 Dec 2023 Masato Fujitake

Therefore, we propose a deep reinforcement learning localization method for logo recognition (RL-LOGO).

2k Image Classification +3

FA Team at the NTCIR-17 UFO Task

no code implementations 31 Oct 2023 Yuki Okumura, Masato Fujitake

The FA team participated in the Table Data Extraction (TDE) and Text-to-Table Relationship Extraction (TTRE) tasks of the NTCIR-17 Understanding of Non-Financial Objects in Financial Reports (UFO).

Language Modelling

DTrOCR: Decoder-only Transformer for Optical Character Recognition

no code implementations 30 Aug 2023 Masato Fujitake

Typical text recognition methods rely on an encoder-decoder structure, in which the encoder extracts features from an image, and the decoder produces recognized text from these features.

Handwritten Text Recognition Language Modelling +4

DiffusionSTR: Diffusion Model for Scene Text Recognition

no code implementations 29 Jun 2023 Masato Fujitake

This paper presents Diffusion Model for Scene Text Recognition (DiffusionSTR), an end-to-end text recognition framework using diffusion models for recognizing text in the wild.

Scene Text Recognition

A3S: Adversarial learning of semantic representations for Scene-Text Spotting

no code implementations 21 Feb 2023 Masato Fujitake

Scene-text spotting is a task that predicts a text area on natural scene images and recognizes its text characters simultaneously.

Text Spotting

Video Sparse Transformer With Attention-Guided Memory for Video Object Detection

1 code implementation IEEE Access 2022 Masato Fujitake, Akihiro Sugimoto

In this paper, we enhance features element-wise before object candidate region detection, proposing Video Sparse Transformer with Attention-guided Memory (VSTAM).

Object object-detection +3
