Search Results for author: Maciej Ogrodniczuk

Found 28 papers, 1 paper with code

Error Correction Environment for the Polish Parliamentary Corpus

no code implementations ParlaCLARIN (LREC) 2022 Maciej Ogrodniczuk, Michał Rudolf, Beata Wójtowicz, Sonia Janicka

The paper introduces the environment for detecting and correcting various kinds of errors in the Polish Parliamentary Corpus.

Language Modelling

ParlaMint II: The Show Must Go On

no code implementations ParlaCLARIN (LREC) 2022 Maciej Ogrodniczuk, Petya Osenova, Tomaž Erjavec, Darja Fišer, Nikola Ljubešić, Çağrı Çöltekin, Matyáš Kopp, Katja Meden

In ParlaMint I, a CLARIN-ERIC supported project in pandemic times, a set of comparable and uniformly annotated multilingual corpora for 17 national parliaments was developed and released in 2021.

Multisłownik: Linking plWordNet-based Lexical Data for Lexicography and Educational Purposes

no code implementations GWC 2018 Maciej Ogrodniczuk, Joanna Bilińska, Zbigniew Bronk, Witold Kieraś

Multisłownik is an automated integrator of Polish lexical data retrieved from multiple available online sources intended to be used in various scenarios requiring access to such data, most prominently dictionary creation, linguistic studies and education.

PolQA: Polish Question Answering Dataset

no code implementations 17 Dec 2022 Piotr Rybak, Piotr Przybyła, Maciej Ogrodniczuk

Recently proposed systems for open-domain question answering (OpenQA) require large amounts of training data to achieve state-of-the-art performance.

Open-Domain Question Answering Passage Retrieval +1

Keyword Extraction from Short Texts with a Text-To-Text Transfer Transformer

no code implementations 28 Sep 2022 Piotr Pęzik, Agnieszka Mikołajczyk-Bareła, Adam Wawrzyński, Bartłomiej Nitoń, Maciej Ogrodniczuk

The paper explores the relevance of the Text-To-Text Transfer Transformer language model (T5) for Polish (plT5) to the task of intrinsic and extrinsic keyword extraction from short text passages.

Keyword Extraction Language Modelling

New Developments in the Polish Parliamentary Corpus

no code implementations LREC 2020 Maciej Ogrodniczuk, Bartłomiej Nitoń

This short paper presents the current (as of February 2020) state of preparation of the Polish Parliamentary Corpus (PPC), an extensive collection of transcripts of Polish parliamentary proceedings dating from 1919 to the present.

Lexical Correction of Polish Twitter Political Data

no code implementations WS 2017 Maciej Ogrodniczuk, Mateusz Kopeć

Language processing architectures are often evaluated in near-to-perfect conditions with respect to processed content.

Entity Extraction using GAN Lemmatization +1

Improving Polish Mention Detection with Valency Dictionary

no code implementations WS 2017 Maciej Ogrodniczuk, Bartłomiej Nitoń

This paper presents the results of an experiment integrating information from a valency dictionary of Polish into a mention detection system.

Coreference Resolution

Web services and data mining: combining linguistic tools for Polish with an analytical platform

no code implementations WS 2016 Maciej Ogrodniczuk

The setting is verified in a simple task of counting frequencies of unknown words in a small corpus.

Digital Library 2.0: Source of Knowledge and Research Collaboration Platform

no code implementations LREC 2014 Włodzimierz Gruszczyński, Maciej Ogrodniczuk

Digital libraries are frequently treated just as a new method of storage of digitized artifacts, with all consequences of transferring long-established ways of dealing with physical objects into the digital world.

Polish Coreference Corpus in Numbers

no code implementations LREC 2014 Maciej Ogrodniczuk, Mateusz Kopeć, Agata Savary

Correlation between cluster and mention count within a text is investigated, with short characteristics of outlier cases.

Clustering coreference-resolution +2

The Polish Summaries Corpus

no code implementations LREC 2014 Maciej Ogrodniczuk, Mateusz Kopeć

This article presents the Polish Summaries Corpus, a new resource created to support the development and evaluation of the tools for automated single-document summarization of Polish.

Document Summarization

The Polish Sejm Corpus

no code implementations LREC 2012 Maciej Ogrodniczuk

This document presents the first edition of the Polish Sejm Corpus -- a new specialized resource containing transcribed, automatically annotated utterances of the Members of Polish Sejm (lower chamber of the Polish Parliament).

Sentence Word Sense Disambiguation

Web Service integration platform for Polish linguistic resources

no code implementations LREC 2012 Maciej Ogrodniczuk, Michał Lenart

This paper presents a robust linguistic Web service framework for Polish, combining several mature offline linguistic tools in a common online platform.

Sentence

Creating a Coreference Resolution System for Polish

no code implementations LREC 2012 Mateusz Kopeć, Maciej Ogrodniczuk

Although the availability of natural language processing tools and the development of metrics to evaluate them are increasing, there is a certain gap to fill in that field for less-resourced languages, such as Polish.

coreference-resolution

PoliMorf: a (not so) new open morphological dictionary for Polish

no code implementations LREC 2012 Marcin Woliński, Marcin Miłkowski, Maciej Ogrodniczuk, Adam Przepiórkowski

This paper presents preliminary results of an effort aiming at the creation of a morphological dictionary of Polish, PoliMorf, available under a very liberal BSD-style license.

Morphological Analysis

Towards a comprehensive open repository of Polish language resources

no code implementations LREC 2012 Maciej Ogrodniczuk, Piotr Pęzik, Adam Przepiórkowski

The aim of this paper is to present current efforts towards the creation of a comprehensive open repository of Polish language resources and tools (LRTs).
