Multistage Curvilinear Coordinate Transform Based Document Image Dewarping using a Novel Quality Estimator

15 Mar 2020  ·  Tanmoy Dasgupta, Nibaran Das, Mita Nasipuri ·

The present work demonstrates a fast and improved technique for dewarping nonlinearly warped document images. The images are first dewarped at the page-level by estimating optimum inverse projections using curvilinear homography. The quality of the process is then estimated by evaluating a set of metrics related to the characteristics of the text lines and rectilinear objects for measuring parallelism, orthogonality, etc. These are designed specifically to estimate the quality of the dewarping process without the need of any ground truth. If the quality is estimated to be unsatisfactory, the page-level dewarping process is repeated with finer approximations. This is followed by a line-level dewarping process that makes granular corrections to the warps in individual text-lines. The methodology has been tested on the CBDAR 2007 / IUPR 2011 document image dewarping dataset and is seen to yield the best OCR accuracy in the shortest amount of time, till date. The usefulness of the methodology has also been evaluated on the DocUNet 2018 dataset with some minor tweaks, and is seen to produce comparable results.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here