Search Results for author: Jonathan Boarman

Found 2 papers, 1 paper with code

ShabbyPages: A Reproducible Document Denoising and Binarization Dataset

no code implementations • 16 Mar 2023 • Alexander Groleau, Kok Wei Chee, Stefan Larson, Samay Maini, Jonathan Boarman

Document denoising and binarization are fundamental problems in the document processing space, but current datasets are often too small and lack sufficient complexity to effectively train and benchmark modern data-driven machine learning models.

Benchmarking, Binarization (+1 more)

Augraphy: A Data Augmentation Library for Document Images

2 code implementations • 30 Aug 2022 • Alexander Groleau, Kok Wei Chee, Stefan Larson, Samay Maini, Jonathan Boarman

This paper introduces Augraphy, a Python library for constructing data augmentation pipelines which produce distortions commonly seen in real-world document image datasets.

Data Augmentation, Denoising
