Document Image Classification

25 papers with code • 8 benchmarks • 4 datasets

Document image classification is the task of classifying documents based on images of their contents.




Latest papers with no code

DocXplain: A Novel Model-Agnostic Explainability Method for Document Image Classification

no code yet • 4 Jul 2024

We introduce DocXplain, a novel model-agnostic explainability method specifically designed to generate highly interpretable feature attribution maps for document image classification.

DistilDoc: Knowledge Distillation for Visually-Rich Document Applications

no code yet • 12 Jun 2024

This work explores knowledge distillation (KD) for visually-rich document (VRD) applications such as document layout analysis (DLA) and document image classification (DIC).
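As a rough illustration of the knowledge distillation idea named in this entry, the sketch below implements the commonly used soft-target loss, in which a student classifier is trained to match a teacher's temperature-softened class probabilities in addition to the ground-truth label. It is not taken from DistilDoc; the function names, the temperature of 2.0, and the weighting `alpha` are assumptions chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of hard-label cross-entropy and soft-target KL divergence.

    `temperature` and `alpha` are illustrative defaults, not values from any paper.
    """
    # Hard-label term: cross-entropy of the student at temperature 1.
    p_student = softmax(student_logits)
    cross_entropy = -math.log(p_student[label])

    # Soft-target term: KL(teacher || student) at the given temperature.
    q_teacher = softmax(teacher_logits, temperature)
    q_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(q_teacher, q_student) if t > 0)

    # The T^2 factor keeps the soft-target gradients on a comparable scale.
    return alpha * cross_entropy + (1 - alpha) * temperature ** 2 * kl


# Hypothetical logits for a 3-class document classifier (e.g. form, email, letter).
teacher = [0.5, 3.0, 1.0]
student = [0.2, 1.5, 1.2]
print(distillation_loss(student, teacher, label=1))
```

When the student's logits equal the teacher's, the KL term vanishes and only the weighted cross-entropy remains, which is a quick sanity check for an implementation of this kind.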

CICA: Content-Injected Contrastive Alignment for Zero-Shot Document Image Classification

no code yet • 6 May 2024

We provide a comprehensive analysis of document image classification in Zero-Shot Learning (ZSL) and Generalized Zero-Shot Learning (GZSL) settings.

LayoutLLM: Large Language Model Instruction Tuning for Visually Rich Document Understanding

no code yet • 21 Mar 2024

By leveraging existing research in document image understanding and the superior language understanding capabilities of LLMs, the proposed model, fine-tuned with multimodal instruction datasets, understands document images within a single model.

Automatic Recognition of Learning Resource Category in a Digital Library

no code yet • 28 Nov 2023

Digital libraries often face the challenge of processing a large volume of diverse document types.

A Multi-Modal Multilingual Benchmark for Document Image Classification

no code yet • 25 Oct 2023

Document image classification differs from plain-text document classification: it classifies a document by understanding both the content and the structure of documents such as forms and emails.

TransferDoc: A Self-Supervised Transferable Document Representation Learning Model Unifying Vision and Language

no code yet • 11 Sep 2023

The field of visual document understanding has witnessed a rapid growth in emerging challenges and powerful multi-modal strategies.

LayoutMask: Enhance Text-Layout Interaction in Multi-modal Pre-training for Document Understanding

no code yet • 30 May 2023

LayoutMask can enhance the interactions between text and layout modalities in a unified model and produce adaptive and robust multi-modal representations for downstream tasks.

EAML: Ensemble Self-Attention-based Mutual Learning Network for Document Image Classification

no code yet • IJDAR 2021

To the best of our knowledge, this is the first work to leverage a mutual learning approach together with a self-attention-based fusion module for document image classification.

Evaluating Adversarial Robustness on Document Image Classification

no code yet • 24 Apr 2023

Adversarial attacks on and defenses for computer vision systems have attracted increasing interest in recent years, but as of today, most investigations are limited to images.