1 code implementation • 11 Feb 2023 • Štěpán Šimsa, Milan Šulc, Michal Uřičář, Yash Patel, Ahmed Hamdi, Matěj Kocián, Matyáš Skalický, Jiří Matas, Antoine Doucet, Mickaël Coustaty, Dimosthenis Karatzas
This paper introduces the DocILE benchmark with the largest dataset of business documents for the tasks of Key Information Localization and Extraction and Line Item Recognition.
1 code implementation • 3 Dec 2021 • Matěj Kocián, Jakub Náplava, Daniel Štancl, Vladimír Kadlec
For further research and evaluation, we release DaReCzech, a unique data set of 1.6 million Czech user query-document pairs with manually assigned relevance levels.
Ranked #1 on Document Ranking on DaReCzech