1 code implementation • 24 Jan 2024 • Ryota Tanaka, Taichi Iki, Kyosuke Nishida, Kuniko Saito, Jun Suzuki
We study the problem of completing various visual document understanding (VDU) tasks, e.g., question answering and information extraction, on real-world documents through human-written instructions.
1 code implementation • 15 Mar 2022 • Taichi Iki, Akiko Aizawa
We develop task pages with and without page transitions and propose a BERT extension for the framework.
1 code implementation • EMNLP 2021 • Taichi Iki, Akiko Aizawa
A method for creating a vision-and-language (V&L) model is to extend a language model through structural modifications and V&L pre-training.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Taichi Iki, Akiko Aizawa
However, few models consider the fusion of linguistic features with multiple visual features with different sizes of receptive fields, though the proper size of the receptive field of visual features intuitively varies depending on expressions.