TLHDIBD2021 (Tai Le historical document image binarization dataset)

Hybrid-CBF: A hybrid classification and binarization framework for historical Tai Le document image binarization

Binarizing historical documents is important and more challenging than binarizing ordinary documents. Because historical Tai Le documents suffer from serious noise pollution, a new hybrid classification and binarization framework (Hybrid-CBF) is proposed for binarizing historical Tai Le document images. The Tai Le historical document image binarization dataset (TLHDIBD2021), containing 2,780 image pairs, is constructed. Because the degree of background pollution varies from document to document, any single method binarizes historical Tai Le documents poorly. First, Hybrid-CBF clusters the historical Tai Le document images according to their estimated noise level, yielding groups of document images with different noise levels. Second, each group is binarized with the optimal method for its noise level. Within Hybrid-CBF, two deep-neural-network-based binarization methods for historical Tai Le documents are proposed.
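The two-step idea described above (group images by estimated noise level, then binarize each group with a method suited to it) can be illustrated with a minimal sketch. This is not the authors' implementation: the Laplacian-based `noise_score`, the scalar k-means in `kmeans_1d`, and the two threshold-based binarizers are simple stand-ins chosen for illustration, whereas the paper uses deep-neural-network binarizers. All function names are hypothetical, and images are assumed to be grayscale lists of rows with values in 0..255.

```python
from statistics import mean


def noise_score(img):
    """Mean absolute 4-neighbour Laplacian over interior pixels.
    A crude stand-in for noise level estimation (assumption)."""
    h, w = len(img), len(img[0])
    total, count = 0.0, 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (img[y - 1][x] + img[y + 1][x] + img[y][x - 1]
                   + img[y][x + 1] - 4 * img[y][x])
            total += abs(lap)
            count += 1
    return total / count if count else 0.0


def kmeans_1d(values, k, iters=20):
    """Lloyd's algorithm on scalars. Centres start in increasing order,
    so label 0 is the lowest-noise group."""
    lo, hi = min(values), max(values)
    centres = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign each value to its nearest centre.
        labels = [min(range(k), key=lambda c: abs(v - centres[c]))
                  for v in values]
        # Move each non-empty cluster's centre to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centres[c] = mean(members)
    return labels


def binarize_global(img):
    """Threshold at the image mean: 1 = background (bright), 0 = ink."""
    t = mean(p for row in img for p in row)
    return [[1 if p > t else 0 for p in row] for row in img]


def binarize_smoothed(img):
    """3x3 mean filter, then global threshold.
    Stand-in for a method aimed at noisier pages (assumption)."""
    h, w = len(img), len(img[0])
    blurred = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(mean(window))
        blurred.append(row)
    return binarize_global(blurred)


def hybrid_binarize(images, methods):
    """Cluster images by noise score, then apply methods[label] to each.
    `methods` is ordered from lowest-noise to highest-noise group."""
    scores = [noise_score(im) for im in images]
    labels = kmeans_1d(scores, len(methods))
    outputs = [methods[lab](im) for im, lab in zip(images, labels)]
    return outputs, labels
```

Calling `hybrid_binarize(images, [binarize_global, binarize_smoothed])` would route low-noise pages to the plain threshold and high-noise pages to the smoothed variant. In a setting closer to the paper, each entry in `methods` would presumably be a trained network rather than a threshold rule.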

Source: TLHDIBD2021

License


  • Unknown
