A Hybrid Swarm and Gravitation based feature selection algorithm for Handwritten Indic Script Classification problem

10 May 2020  ·  Ritam Guha, Manosij Ghosh, Pawan Kumar Singh, Ram Sarkar, Mita Nasipuri

In any multi-script environment, handwritten script classification is of paramount importance before document images are fed to their respective Optical Character Recognition (OCR) engines. Over the years, researchers have approached this complex pattern classification problem by proposing various feature vectors, most of them high-dimensional, which increases the computational complexity of the whole classification model. Feature Selection (FS) can serve as an intermediate step that reduces the size of the feature vectors by restricting them to the essential and relevant features. In this paper, we address this issue by introducing a new FS algorithm called Hybrid Swarm and Gravitation based FS (HSGFS). The algorithm is run on three feature vectors recently introduced in the literature: Distance-Hough Transform (DHT), Histogram of Oriented Gradients (HOG) and Modified log-Gabor (MLG) filter Transform. Three state-of-the-art classifiers, namely Multi-Layer Perceptron (MLP), K-Nearest Neighbour (KNN) and Support Vector Machine (SVM), are used for the handwritten script classification. Handwritten datasets prepared at block, text-line and word level, covering the 12 officially recognized Indic scripts, are used to evaluate our method. Classification accuracy improves by 2-5% on average while using only about 75-80% of the original feature vectors on all three datasets. The proposed methodology also performs better than some popularly used FS models.
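
To make the idea of FS as an intermediate step concrete, the following is a minimal sketch of how a wrapper-style method might score one candidate feature subset with a nearest-neighbour classifier. It is not the HSGFS algorithm: the abstract does not specify the search procedure or the objective function, so the `fitness` definition, the `alpha` weight, the 1-nearest-neighbour simplification and the toy data are all assumptions made for illustration.

```python
import random


def knn_accuracy(train, test, mask):
    """1-nearest-neighbour accuracy using only features where mask[i] == 1."""
    idx = [i for i, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0  # an empty subset cannot classify anything
    correct = 0
    for x, y in test:
        # Squared Euclidean distance restricted to the selected features.
        nearest = min(train, key=lambda s: sum((s[0][i] - x[i]) ** 2 for i in idx))
        correct += nearest[1] == y
    return correct / len(test)


def fitness(train, test, mask, alpha=0.99):
    """Hypothetical objective: reward accuracy, lightly reward smaller subsets."""
    kept = sum(mask) / len(mask)
    return alpha * knn_accuracy(train, test, mask) + (1 - alpha) * (1 - kept)


def make_toy_data(n, rng):
    """Toy samples: feature 0 carries the class signal, features 1-3 are noise."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        x = [label + rng.gauss(0, 0.1)] + [rng.gauss(0, 1) for _ in range(3)]
        data.append((x, label))
    return data


rng = random.Random(0)
train, test = make_toy_data(60, rng), make_toy_data(30, rng)
full_mask = [1, 1, 1, 1]         # keep every feature
informative_mask = [1, 0, 0, 0]  # keep only the informative feature

print("all features:   ", fitness(train, test, full_mask))
print("one feature kept:", fitness(train, test, informative_mask))
```

A search algorithm such as a swarm or gravitation based optimizer would propose many such binary masks and keep those with the highest fitness; only the scoring step is shown here.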
