Search Results for author: Milan Šulc

Found 6 papers, 4 papers with code

DocILE Benchmark for Document Information Localization and Extraction

1 code implementation • 11 Feb 2023 • Štěpán Šimsa, Milan Šulc, Michal Uřičář, Yash Patel, Ahmed Hamdi, Matěj Kocián, Matyáš Skalický, Jiří Matas, Antoine Doucet, Mickaël Coustaty, Dimosthenis Karatzas

This paper introduces the DocILE benchmark with the largest dataset of business documents for the tasks of Key Information Localization and Extraction and Line Item Recognition.

Key Information Extraction, Unsupervised Pre-training

DocILE 2023 Teaser: Document Information Localization and Extraction

no code implementations • 29 Jan 2023 • Štěpán Šimsa, Milan Šulc, Matyáš Skalický, Yash Patel, Ahmed Hamdi

The DocILE 2023 competition, hosted as a lab at the CLEF 2023 conference and as an ICDAR 2023 competition, will run the first major benchmark for the tasks of Key Information Localization and Extraction (KILE) and Line Item Recognition (LIR) from business documents.

Information Retrieval, Retrieval

GLAMI-1M: A Multilingual Image-Text Fashion Dataset

1 code implementation • BMVC 2022 • Vaclav Kosar, Antonín Hoskovec, Milan Šulc, Radek Bartyzal

We introduce GLAMI-1M: the largest multilingual image-text classification dataset and benchmark.

Ranked #1 on Multilingual Image-Text Classification on GLAMI-1M (using extra training data)

Image Generation, Multilingual Image-Text Classification

Text Detection Forgot About Document OCR

2 code implementations • 14 Oct 2022 • Krzysztof Olejniczak, Milan Šulc

While the state-of-the-art methods for in-the-wild text recognition are typically evaluated on complex scenes, their performance in the domain of documents is typically not published, and a comprehensive comparison with methods for document OCR is missing.

Optical Character Recognition, Optical Character Recognition (OCR)

Business Document Information Extraction: Towards Practical Benchmarks

no code implementations • 20 Jun 2022 • Matyáš Skalický, Štěpán Šimsa, Michal Uřičář, Milan Šulc

Information extraction from semi-structured documents is crucial for frictionless business-to-business (B2B) communication.

Danish Fungi 2020 -- Not Just Another Image Recognition Dataset

1 code implementation • 18 Mar 2021 • Lukáš Picek, Milan Šulc, Jiří Matas, Jacob Heilmann-Clausen, Thomas S. Jeppesen, Thomas Læssøe, Tobias Frøslev

Interestingly, ViT achieves results superior to CNN baselines with 80.45% accuracy and 0.743 macro F1 score, reducing the CNN error by 9% and 12% respectively.

Ranked #1 on Image Classification on DF20

Classifier calibration, Fine-Grained Image Classification
