Search Results for author: Michal Hradiš

Found 11 papers, 4 papers with code

OCR, Classification & Machine Translation (OCCAM)

no code implementations • EAMT 2020 • Joachim Van den Bogaert, Arne Defauw, Frederic Everaert, Koen Van Winckel, Alina Kramchaninova, Anna Bardadym, Tom Vanallemeersch, Pavel Smrž, Michal Hradiš

The OCCAM project (Optical Character recognition, ClassificAtion & Machine Translation) aims at integrating the CEF (Connecting Europe Facility) Automated Translation service with image classification, Translation Memories (TMs), Optical Character Recognition (OCR), and Machine Translation (MT).

Classification Image Classification +4

Self-supervised Pre-training of Text Recognizers

1 code implementation • 1 May 2024 • Martin Kišš, Michal Hradiš

The evaluation shows that the self-supervised pre-training on data from the target domain is very effective, but it struggles to outperform transfer learning from closely related domains.

Quantization Transfer Learning

Towards Writing Style Adaptation in Handwriting Recognition

no code implementations • 13 Feb 2023 • Jan Kohút, Michal Hradiš, Martin Kišš

We experimented with various placements and settings of the Writer Style Block (WSB) and contrastively pre-trained embeddings.

Domain Adaptation Handwriting Recognition

Finetuning Is a Surprisingly Effective Domain Adaptation Baseline in Handwriting Recognition

no code implementations • 13 Feb 2023 • Jan Kohút, Michal Hradiš

In many machine learning tasks, a large general dataset and a small specialized dataset are available.

Data Augmentation Domain Adaptation +1

SoftCTC -- Semi-Supervised Learning for Text Recognition using Soft Pseudo-Labels

1 code implementation • 5 Dec 2022 • Martin Kišš, Michal Hradiš, Karel Beneš, Petr Buchal, Michal Kula

This paper explores semi-supervised training for sequence tasks, such as Optical Character Recognition or Automatic Speech Recognition.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Importance of Textlines in Historical Document Classification

no code implementations • 24 Jan 2022 • Martin Kišš, Jan Kohút, Karel Beneš, Michal Hradiš

The line-level system significantly improves results in script and font classification and in the dating task.

Classification Document Classification +1

AT-ST: Self-Training Adaptation Strategy for OCR in Domains with Limited Transcriptions

1 code implementation • 27 Apr 2021 • Martin Kišš, Karel Beneš, Michal Hradiš

This paper addresses text recognition for domains with limited manual annotations by a simple self-training strategy.

Optical Character Recognition (OCR)

TS-Net: OCR Trained to Switch Between Text Transcription Styles

no code implementations • 9 Mar 2021 • Jan Kohút, Michal Hradiš

Users of OCR systems, from different institutions and scientific disciplines, prefer and produce different transcription styles.

Optical Character Recognition (OCR)

Page Layout Analysis System for Unconstrained Historic Documents

no code implementations • 23 Feb 2021 • Oldřich Kodym, Michal Hradiš

Extraction of text regions and individual text lines from historic documents is necessary for automatic transcription.

Brno Mobile OCR Dataset

1 code implementation • 2 Jul 2019 • Martin Kišš, Michal Hradiš, Oldřich Kodym

We introduce the Brno Mobile OCR Dataset (B-MOD) for document Optical Character Recognition from low-quality images captured by handheld mobile devices.

Binarization Denoising +3

Technical Report: Image Captioning with Semantically Similar Images

no code implementations • 12 Jun 2015 • Martin Kolář, Michal Hradiš, Pavel Zemčík

This report presents our submission to the MS COCO Captioning Challenge 2015.

Image Captioning
