The aim of the paper is to apply to historical texts the methodology commonly used to solve various NLP tasks defined for contemporary data, i.e. pre-training and fine-tuning large Transformer models.
In recent years, the field of document understanding has progressed considerably.
The Key Information Extraction (KIE) task is increasingly important in natural language processing.
This paper investigates various Transformer architectures on the WikiReading Information Extraction and Machine Reading Comprehension dataset.
The paper presents a novel method of finding a fragment in a long temporal sequence that is similar to a given set of shorter sequences.
Ranked #2 on Semantic Retrieval on Contract Discovery
In this paper, we investigate the Dual-source Transformer architecture on the WikiReading information extraction and machine reading comprehension dataset.
This paper presents the winning system for the propaganda Technique Classification (TC) task and the second-placed system for the propaganda Span Identification (SI) task.
State-of-the-art solutions for Natural Language Processing (NLP) are able to capture a broad range of contexts, like the sentence-level context or document-level context for short documents.
We introduce a simple new approach to the problem of understanding documents where non-trivial layout influences the local semantics.
Ranked #3 on Key Information Extraction on Kleister NDA
1 code implementation • Łukasz Borchmann, Dawid Wiśniewski, Andrzej Gretkowski, Izabela Kosmala, Dawid Jurkiewicz, Łukasz Szałkiewicz, Gabriela Pałka, Karol Kaczmarek, Agnieszka Kaliska, Filip Graliński
We propose a new shared task of semantic retrieval from legal texts, called contract discovery, in which legal clauses are extracted from documents given a few examples of similar clauses from other legal acts.
Ranked #1 on Semantic Retrieval on Contract Discovery