Search Results for author: Maciej Ogrodniczuk

This article presents the Polish Summaries Corpus, a new resource created to support the development and evaluation of the tools for automated single-document summarization of Polish.

Document Summarization

Digital Library 2.0: Source of Knowledge and Research Collaboration Platform

no code implementations • LREC 2014 • Włodzimierz Gruszczyński, Maciej Ogrodniczuk

Digital libraries are frequently treated just as a new method of storage of digitized artifacts, with all consequences of transferring long-established ways of dealing with physical objects into the digital world.

The Strategic Impact of META-NET on the Regional, National and International Level

no code implementations • LREC 2014 • Georg Rehm, Hans Uszkoreit, Sophia Ananiadou, Núria Bel, Audronė Bielevičienė, Lars Borin, António Branco, Gerhard Budin, Nicoletta Calzolari, Walter Daelemans, Radovan Garabík, Marko Grobelnik, Carmen García-Mateo, Josef van Genabith, Jan Hajič, Inma Hernáez, John Judge, Svetla Koeva, Simon Krek, Cvetana Krstev, Krister Lindén, Bernardo Magnini, Joseph Mariani, John McNaught, Maite Melero, Monica Monachini, Asunción Moreno, Jan Odijk, Maciej Ogrodniczuk, Piotr Pęzik, Stelios Piperidis, Adam Przepiórkowski, Eiríkur Rögnvaldsson, Michael Rosner, Bolette Pedersen, Inguna Skadiņa, Koenraad De Smedt, Marko Tadić, Paul Thompson, Dan Tufiş, Tamás Váradi, Andrejs Vasiļjevs, Kadri Vider, Jolanta Zabarskaite

This article provides an overview of the dissemination work carried out in META-NET from 2010 until early 2014; we describe its impact on the regional, national and international level, mainly with regard to politics and the situation of funding for LT topics.

Machine Translation

Measuring Readability of Polish Texts: Baseline Experiments

no code implementations • LREC 2014 • Bartosz Broda, Bartłomiej Nitoń, Włodzimierz Gruszczyński, Maciej Ogrodniczuk

In this paper we present an overview of the most common approaches to automatic measuring of readability.

Language Modelling

Towards a comprehensive open repository of Polish language resources

no code implementations • LREC 2012 • Maciej Ogrodniczuk, Piotr Pęzik, Adam Przepiórkowski

The aim of this paper is to present current efforts towards the creation of a comprehensive open repository of Polish language resources and tools (LRTs).

PoliMorf: a (not so) new open morphological dictionary for Polish

no code implementations • LREC 2012 • Marcin Woliński, Marcin Miłkowski, Maciej Ogrodniczuk, Adam Przepiórkowski

This paper presents preliminary results of an effort aiming at the creation of a morphological dictionary of Polish, PoliMorf, available under a very liberal BSD-style license.

Morphological Analysis

Web Service integration platform for Polish linguistic resources

no code implementations • LREC 2012 • Maciej Ogrodniczuk, Michał Lenart

This paper presents a robust linguistic Web service framework for Polish, combining several mature offline linguistic tools in a common online platform.

Sentence

The Polish Sejm Corpus

no code implementations • LREC 2012 • Maciej Ogrodniczuk

This document presents the first edition of the Polish Sejm Corpus, a new specialized resource containing transcribed, automatically annotated utterances of the Members of Polish Sejm (lower chamber of the Polish Parliament).

Sentence, Word Sense Disambiguation

Creating a Coreference Resolution System for Polish

no code implementations • LREC 2012 • Mateusz Kopeć, Maciej Ogrodniczuk

Although the availability of the natural language processing tools and the development of metrics to evaluate them increases, there is a certain gap to fill in that field for the less-resourced languages, such as Polish.

coreference-resolution

The European Language Technology Landscape in 2020: Language-Centric and Human-Centric AI for Cross-Cultural Communication in Multilingual Europe

no code implementations • LREC 2020 • Georg Rehm, Katrin Marheinecke, Stefanie Hegele, Stelios Piperidis, Kalina Bontcheva, Jan Hajič, Khalid Choukri, Andrejs Vasiļjevs, Gerhard Backfried, Christoph Prinz, José Manuel Gómez Pérez, Luc Meertens, Paul Lukowicz, Josef van Genabith, Andrea Lösch, Philipp Slusallek, Morten Irgens, Patrick Gatellier, Joachim Köhler, Laure Le Bars, Dimitra Anastasiou, Albina Auksoriūtė, Núria Bel, António Branco, Gerhard Budin, Walter Daelemans, Koenraad De Smedt, Radovan Garabík, Maria Gavriilidou, Dagmar Gromann, Svetla Koeva, Simon Krek, Cvetana Krstev, Krister Lindén, Bernardo Magnini, Jan Odijk, Maciej Ogrodniczuk, Eiríkur Rögnvaldsson, Mike Rosner, Bolette Sandford Pedersen, Inguna Skadiņa, Marko Tadić, Dan Tufiş, Tamás Váradi, Kadri Vider, Andy Way, François Yvon

Multilingualism is a cultural cornerstone of Europe and firmly anchored in the European treaties including full language equality.

Misconceptions

New Developments in the Polish Parliamentary Corpus

no code implementations • LREC 2020 • Maciej Ogrodniczuk, Bartłomiej Nitoń

This short paper presents the current (as of February 2020) state of preparation of the Polish Parliamentary Corpus (PPC), an extensive collection of transcripts of Polish parliamentary proceedings dating from 1919 to present.

The MARCELL Legislative Corpus

no code implementations • LREC 2020 • Tamás Váradi, Svetla Koeva, Martin Yamalov, Marko Tadić, Bálint Sass, Bartłomiej Nitoń, Maciej Ogrodniczuk, Piotr Pęzik, Verginica Barbu Mititelu, Radu Ion, Elena Irimia, Maria Mitrofan, Vasile Păiș, Dan Tufiș, Radovan Garabík, Simon Krek, Andraž Repar, Matjaž Rihtar, Janez Brank

This article presents the current outcomes of the MARCELL CEF Telecom project aiming to collect and deeply annotate a large comparable corpus of legal documents.

Sentence

Multisłownik: Linking plWordNet-based Lexical Data for Lexicography and Educational Purposes

no code implementations • GWC 2018 • Maciej Ogrodniczuk, Joanna Bilińska, Zbigniew Bronk, Witold Kieraś

Multisłownik is an automated integrator of Polish lexical data retrieved from multiple available online sources intended to be used in various scenarios requiring access to such data, most prominently dictionary creation, linguistic studies and education.

Curated Multilingual Language Resources for CEF AT (CURLICAT): overall view

no code implementations • EAMT 2022 • Tamás Váradi, Marko Tadić, Svetla Koeva, Maciej Ogrodniczuk, Dan Tufiş, Radovan Garabík, Simon Krek, Andraž Repar

The work in progress on the CEF Action CURLICAT is presented.

Keyword Extraction from Short Texts with a Text-To-Text Transfer Transformer

no code implementations • 28 Sep 2022 • Piotr Pęzik, Agnieszka Mikołajczyk-Bareła, Adam Wawrzyński, Bartłomiej Nitoń, Maciej Ogrodniczuk

The paper explores the relevance of the Text-To-Text Transfer Transformer language model (T5) for Polish (plT5) to the task of intrinsic and extrinsic keyword extraction from short text passages.

Keyword Extraction, Language Modelling

ParlaMint II: The Show Must Go On

no code implementations • ParlaCLARIN (LREC) 2022 • Maciej Ogrodniczuk, Petya Osenova, Tomaž Erjavec, Darja Fišer, Nikola Ljubešić, Çağrı Çöltekin, Matyáš Kopp, Katja Meden

In ParlaMint I, a CLARIN-ERIC supported project in pandemic times, a set of comparable and uniformly annotated multilingual corpora for 17 national parliaments were developed and released in 2021.

Error Correction Environment for the Polish Parliamentary Corpus

no code implementations • ParlaCLARIN (LREC) 2022 • Maciej Ogrodniczuk, Michał Rudolf, Beata Wójtowicz, Sonia Janicka

The paper introduces the environment for detecting and correcting various kinds of errors in the Polish Parliamentary Corpus.

Language Modelling

Introducing the CURLICAT Corpora: Seven-language Domain Specific Annotated Corpora from Curated Sources

no code implementations • LREC 2022 • Tamás Váradi, Bence Nyéki, Svetla Koeva, Marko Tadić, Vanja Štefanec, Maciej Ogrodniczuk, Bartłomiej Nitoń, Piotr Pęzik, Verginica Barbu Mititelu, Elena Irimia, Maria Mitrofan, Dan Tufiș, Radovan Garabík, Simon Krek, Andraž Repar

This article presents the current outcomes of the CURLICAT CEF Telecom project, which aims to collect and deeply annotate a set of large corpora from selected domains.

NMT

PolQA: Polish Question Answering Dataset

no code implementations • 17 Dec 2022 • Piotr Rybak, Piotr Przybyła, Maciej Ogrodniczuk

Recently proposed systems for open-domain question answering (OpenQA) require large amounts of training data to achieve state-of-the-art performance.

Open-Domain Question Answering, Passage Retrieval, +1

Silver Retriever: Advancing Neural Passage Retrieval for Polish Question Answering

no code implementations • 15 Sep 2023 • Piotr Rybak, Maciej Ogrodniczuk

However, most of the work concerns popular languages such as English or Chinese.

Open-Domain Question Answering, Passage Retrieval, +1
