MNIST-MIX: A Multi-language Handwritten Digit Recognition Dataset

8 Apr 2020 · Weiwei Jiang

In this letter, we contribute a multi-language handwritten digit recognition dataset named MNIST-MIX, which is the largest dataset of its type in terms of both the number of languages and the number of data samples. Because it shares the same data format as MNIST, MNIST-MIX can be applied seamlessly in existing studies on handwritten digit recognition. By introducing digits from 10 different languages, MNIST-MIX becomes a more challenging dataset, and its imbalanced classes call for better model design. We also present baseline results obtained by applying a LeNet model that is pre-trained on MNIST.
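To make the imbalance point concrete, the following is a minimal sketch of how per-class weights might be derived for a mixed-language label set. It is not taken from the paper: the combined label scheme (language index times 10 plus digit) and the helper names `combined_label` and `balanced_class_weights` are assumptions made purely for illustration, and the weighting rule is the common inverse-frequency "balanced" heuristic rather than anything the author prescribes.

```python
from collections import Counter


def combined_label(language_index: int, digit: int, digits_per_language: int = 10) -> int:
    """Map a (language, digit) pair to one class id.

    Assumed scheme for illustration only: each language owns a
    contiguous block of `digits_per_language` class ids.
    """
    return language_index * digits_per_language + digit


def balanced_class_weights(labels):
    """Return inverse-frequency weights: n_samples / (n_classes * count).

    Rare classes get weights above 1 and frequent classes get weights
    below 1, so a weighted loss treats every class more evenly.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Toy example: class 0 appears three times, class 34 appears once.
toy_labels = [
    combined_label(0, 0),
    combined_label(0, 0),
    combined_label(0, 0),
    combined_label(3, 4),
]
toy_weights = balanced_class_weights(toy_labels)
```

Weights of this kind could be passed to a weighted loss function when training a classifier. Whether that is enough to handle the imbalance in MNIST-MIX is an open question that this sketch does not answer.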

Datasets


Introduced in the Paper:

MNIST-MIX

Used in the Paper:

MNIST, Kannada-MNIST, BanglaLekha-Isolated

Methods