1 code implementation • 30 Oct 2024 • Michał Pietruszka, Łukasz Borchmann, Aleksander Jędrosz, Paweł Morawiecki
We present a benchmark for large language models designed to tackle one of the most knowledge-intensive tasks in data science: writing feature engineering code, which requires domain knowledge in addition to a deep understanding of the underlying problem and data structure.
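To make the task concrete, here is a hedged sketch of the kind of feature-engineering code the benchmark asks a model to write; the dataset and all column names are invented for illustration and do not come from the benchmark itself.

```python
import pandas as pd

def engineer_features(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical model output for a housing-price table;
    all column names are illustrative assumptions."""
    out = df.copy()
    # Domain knowledge: price per square meter compares listings
    # of different sizes more fairly than raw price does.
    out["price_per_sqm"] = out["price"] / out["area_sqm"].clip(lower=1)
    # Data understanding: derive property age from the year built.
    out["age_years"] = 2024 - out["year_built"]
    out["is_new"] = (out["age_years"] <= 5).astype(int)
    return out
```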
no code implementations • 8 Aug 2024 • Łukasz Borchmann, Michał Pietruszka, Wojciech Jaśkowski, Dawid Jurkiewicz, Piotr Halama, Paweł Józiak, Łukasz Garncarek, Paweł Liskowski, Karolina Szyndler, Andrzej Gretkowski, Julita Ołtusek, Gabriela Nowakowska, Artur Zawłocki, Łukasz Duhr, Paweł Dyda, Michał Turski
A vast portion of workloads employing LLMs involves answering questions grounded in the content of PDFs or scanned documents.
1 code implementation • ICCV 2023 • Jordy Van Landeghem, Rubén Tito, Łukasz Borchmann, Michał Pietruszka, Paweł Józiak, Rafał Powalski, Dawid Jurkiewicz, Mickaël Coustaty, Bertrand Ackaert, Ernest Valveny, Matthew Blaschko, Sien Moens, Tomasz Stanisławek
We call on the Document AI (DocAI) community to reevaluate current methodologies and embrace the challenge of creating more practically oriented benchmarks.
no code implementations • 8 Jun 2022 • Michał Pietruszka, Michał Turski, Łukasz Borchmann, Tomasz Dwojak, Gabriela Pałka, Karolina Szyndler, Dawid Jurkiewicz, Łukasz Garncarek
The output structure of database-like tables, with values arranged in horizontal rows and vertical columns identifiable by name, can cover a wide range of NLP tasks.
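As a rough illustration of this claim (values and column names invented), two otherwise unrelated tasks share the same table-shaped output:

```python
# Hypothetical examples: both tasks reduce to rows under named columns.
line_items = {
    "columns": ["item", "quantity", "unit_price"],
    "rows": [["paper", "2", "9.99"], ["toner", "1", "49.00"]],
}
named_entities = {
    "columns": ["entity", "type"],
    "rows": [["Warsaw", "LOC"], ["ACME Corp.", "ORG"]],
}
```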
1 code implementation • 18 Feb 2021 • Rafał Powalski, Łukasz Borchmann, Dawid Jurkiewicz, Tomasz Dwojak, Michał Pietruszka, Gabriela Pałka
We address the challenging problem of Natural Language Comprehension beyond plain-text documents by introducing the TILT neural network architecture, which simultaneously learns layout information, visual features, and textual semantics; a toy sketch of this fusion follows below.
Ranked #7 on Visual Question Answering (VQA) on InfographicVQA (using extra training data)
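A minimal sketch of the input fusion the abstract describes, assuming per-token text, layout, and visual embeddings are combined by summation; the dimensions, the visual-feature source, and fusion-by-sum are illustrative assumptions rather than the exact TILT design.

```python
import torch
import torch.nn as nn

class MultimodalTokenEmbedding(nn.Module):
    """Toy TILT-style fusion: each token embedding is combined with an
    embedding of its bounding box (layout) and a visual feature for the
    token's image region. Sizes and fusion-by-sum are assumptions."""

    def __init__(self, vocab_size=32000, d_model=512, visual_dim=2048):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.bbox = nn.Linear(4, d_model)          # (x0, y0, x1, y1) in [0, 1]
        self.vis = nn.Linear(visual_dim, d_model)  # pooled per-token image feature

    def forward(self, token_ids, boxes, visual_feats):
        return self.tok(token_ids) + self.bbox(boxes) + self.vis(visual_feats)

emb = MultimodalTokenEmbedding()
fused = emb(torch.randint(0, 32000, (1, 16)),  # token ids
            torch.rand(1, 16, 4),              # normalized bounding boxes
            torch.randn(1, 16, 2048))          # visual features -> (1, 16, 512)
```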
1 code implementation • CoNLL 2020 • Tomasz Dwojak, Michał Pietruszka, Łukasz Borchmann, Jakub Chłędowski, Filip Graliński
This paper investigates various Transformer architectures on the WikiReading Information Extraction and Machine Reading Comprehension dataset.
1 code implementation • 8 Oct 2020 • Michał Pietruszka, Łukasz Borchmann, Filip Graliński
We propose a differentiable successive halving method of relaxing the top-k operator, rendering gradient-based optimization possible.
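For intuition, here is a generic soft top-k sketch built on an iterated masked softmax; it is not the paper's successive-halving construction, but it shows why relaxing the operator makes gradient-based optimization possible.

```python
import torch

def soft_top_k(scores: torch.Tensor, k: int, tau: float = 0.1) -> torch.Tensor:
    """Differentiable relaxation of the hard top-k indicator: run a
    softmax k times, suppressing mass already selected, so the weights
    approach a 0/1 mask as tau -> 0 while staying differentiable."""
    weights = torch.zeros_like(scores)
    logits = scores.clone()
    for _ in range(k):
        p = torch.softmax(logits / tau, dim=-1)
        weights = weights + p
        # Down-weight already-selected positions for the next pass.
        logits = logits + torch.log1p(-p.clamp(max=1 - 1e-6))
    return weights.clamp(max=1.0)

s = torch.tensor([0.1, 2.0, -1.0, 3.0, 0.5], requires_grad=True)
w = soft_top_k(s, k=2)        # soft indicator of the 2 largest scores
(w * s).sum().backward()      # gradients flow through the selection
```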
1 code implementation • ACL 2022 • Michał Pietruszka, Łukasz Borchmann, Łukasz Garncarek
Quadratic time and memory complexity is reduced to sublinear thanks to a robust, trainable top-k operator; a generic sketch follows below.
Ranked #2 on Text Summarization on arXiv Summarization Dataset
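A hedged sketch of the general recipe: score token representations and keep only the best k before the expensive attention layers, so cost grows with k rather than with the full sequence length. The hard `topk` below is for illustration only; training the scorer requires a differentiable relaxation such as the operator sketched above.

```python
import torch
import torch.nn as nn

def topk_pool(hidden: torch.Tensor, scorer: nn.Module, k: int) -> torch.Tensor:
    """Keep the k highest-scoring token representations (illustrative)."""
    scores = scorer(hidden).squeeze(-1)                      # (batch, seq)
    idx = scores.topk(k, dim=-1).indices                     # (batch, k)
    idx = idx.unsqueeze(-1).expand(-1, -1, hidden.size(-1))  # (batch, k, d)
    return hidden.gather(1, idx)

hidden = torch.randn(2, 1024, 256)                   # long input sequence
pooled = topk_pool(hidden, nn.Linear(256, 1), k=64)  # -> (2, 64, 256)
```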
no code implementations • 15 Jun 2020 • Tomasz Dwojak, Michał Pietruszka, Łukasz Borchmann, Filip Graliński, Jakub Chłędowski
In this paper, we investigate the Dual-source Transformer architecture on the WikiReading information extraction and machine reading comprehension dataset.
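A minimal sketch of what "dual-source" can mean here, assuming the decoder cross-attends to two encoded inputs (e.g., the article and the queried property) and sums the results; the combination strategy is an illustrative assumption, not necessarily the paper's exact design.

```python
import torch
import torch.nn as nn

class DualSourceCrossAttention(nn.Module):
    """Decoder-layer fragment attending to two sources; summing the two
    attention outputs is an illustrative assumption."""

    def __init__(self, d_model: int = 512, heads: int = 8):
        super().__init__()
        self.attn_doc = nn.MultiheadAttention(d_model, heads, batch_first=True)
        self.attn_qry = nn.MultiheadAttention(d_model, heads, batch_first=True)

    def forward(self, tgt, doc_memory, qry_memory):
        out_doc, _ = self.attn_doc(tgt, doc_memory, doc_memory)
        out_qry, _ = self.attn_qry(tgt, qry_memory, qry_memory)
        return out_doc + out_qry

layer = DualSourceCrossAttention()
tgt = torch.randn(1, 8, 512)   # decoder states
doc = torch.randn(1, 300, 512) # encoded article
qry = torch.randn(1, 5, 512)   # encoded property name
out = layer(tgt, doc, qry)     # -> (1, 8, 512)
```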