no code implementations • LREC 2022 • Tamás Váradi, Bence Nyéki, Svetla Koeva, Marko Tadić, Vanja Štefanec, Maciej Ogrodniczuk, Bartłomiej Nitoń, Piotr Pęzik, Verginica Barbu Mititelu, Elena Irimia, Maria Mitrofan, Dan Tufiș, Radovan Garabík, Simon Krek, Andraž Repar
This article presents the current outcomes of the CURLICAT CEF Telecom project, which aims to collect and deeply annotate a set of large corpora from selected domains.
no code implementations • ParlaCLARIN (LREC) 2022 • Maciej Ogrodniczuk, Michał Rudolf, Beata Wójtowicz, Sonia Janicka
The paper introduces the environment for detecting and correcting various kinds of errors in the Polish Parliamentary Corpus.
no code implementations • ParlaCLARIN (LREC) 2022 • Maciej Ogrodniczuk, Petya Osenova, Tomaž Erjavec, Darja Fišer, Nikola Ljubešić, Çağrı Çöltekin, Matyáš Kopp, Katja Meden
In ParlaMint I, a CLARIN-ERIC supported project in pandemic times, a set of comparable and uniformly annotated multilingual corpora for 17 national parliaments was developed and released in 2021.
no code implementations • EAMT 2022 • Tamás Váradi, Marko Tadić, Svetla Koeva, Maciej Ogrodniczuk, Dan Tufiş, Radovan Garabík, Simon Krek, Andraž Repar
The work in progress on the CEF Action CURLICAT is presented.
no code implementations • GWC 2018 • Maciej Ogrodniczuk, Joanna Bilińska, Zbigniew Bronk, Witold Kieraś
Multisłownik is an automated integrator of Polish lexical data retrieved from multiple available online sources intended to be used in various scenarios requiring access to such data, most prominently dictionary creation, linguistic studies and education.
no code implementations • 15 Sep 2023 • Piotr Rybak, Maciej Ogrodniczuk
However, most of the work concerns popular languages such as English or Chinese.
no code implementations • 17 Dec 2022 • Piotr Rybak, Piotr Przybyła, Maciej Ogrodniczuk
Recently proposed systems for open-domain question answering (OpenQA) require large amounts of training data to achieve state-of-the-art performance.
no code implementations • 28 Sep 2022 • Piotr Pęzik, Agnieszka Mikołajczyk-Bareła, Adam Wawrzyński, Bartłomiej Nitoń, Maciej Ogrodniczuk
The paper explores the relevance of the Text-To-Text Transfer Transformer language model (T5) for Polish (plT5) to the task of intrinsic and extrinsic keyword extraction from short text passages.
1 code implementation • CRAC (ACL) 2022 • Zdeněk Žabokrtský, Miloslav Konopík, Anna Nedoluzhko, Michal Novák, Maciej Ogrodniczuk, Martin Popel, Ondřej Pražák, Jakub Sido, Daniel Zeman, Yilun Zhu
The public edition of CorefUD 1.0, which contains 13 datasets for 10 languages, was used as the source of training and evaluation data.
no code implementations • LREC 2020 • Tamás Váradi, Svetla Koeva, Martin Yamalov, Marko Tadić, Bálint Sass, Bartłomiej Nitoń, Maciej Ogrodniczuk, Piotr Pęzik, Verginica Barbu Mititelu, Radu Ion, Elena Irimia, Maria Mitrofan, Vasile Păiș, Dan Tufiș, Radovan Garabík, Simon Krek, Andraž Repar, Matjaž Rihtar, Janez Brank
This article presents the current outcomes of the MARCELL CEF Telecom project aiming to collect and deeply annotate a large comparable corpus of legal documents.
no code implementations • LREC 2020 • Maciej Ogrodniczuk, Bartłomiej Nitoń
This short paper presents the current (as of February 2020) state of preparation of the Polish Parliamentary Corpus (PPC), an extensive collection of transcripts of Polish parliamentary proceedings dating from 1919 to the present.
no code implementations • LREC 2020 • Georg Rehm, Katrin Marheinecke, Stefanie Hegele, Stelios Piperidis, Kalina Bontcheva, Jan Hajič, Khalid Choukri, Andrejs Vasiļjevs, Gerhard Backfried, Christoph Prinz, José Manuel Gómez Pérez, Luc Meertens, Paul Lukowicz, Josef van Genabith, Andrea Lösch, Philipp Slusallek, Morten Irgens, Patrick Gatellier, Joachim Köhler, Laure Le Bars, Dimitra Anastasiou, Albina Auksoriūtė, Núria Bel, António Branco, Gerhard Budin, Walter Daelemans, Koenraad De Smedt, Radovan Garabík, Maria Gavriilidou, Dagmar Gromann, Svetla Koeva, Simon Krek, Cvetana Krstev, Krister Lindén, Bernardo Magnini, Jan Odijk, Maciej Ogrodniczuk, Eiríkur Rögnvaldsson, Mike Rosner, Bolette Sandford Pedersen, Inguna Skadiņa, Marko Tadić, Dan Tufiş, Tamás Váradi, Kadri Vider, Andy Way, François Yvon
Multilingualism is a cultural cornerstone of Europe and firmly anchored in the European treaties including full language equality.
no code implementations • WS 2018 • Anna Nedoluzhko, Michal Novák, Maciej Ogrodniczuk
We present PAWS, a multi-lingual parallel treebank with coreference annotation.
no code implementations • WS 2017 • Maciej Ogrodniczuk, Mateusz Kopeć
Language processing architectures are often evaluated in near-to-perfect conditions with respect to processed content.
no code implementations • WS 2017 • Maciej Ogrodniczuk, Bartłomiej Nitoń
This paper presents results of an experiment integrating information from valency dictionary of Polish into a mention detection system.
no code implementations • WS 2016 • Maciej Ogrodniczuk
The setting is verified in a simple task of counting frequencies of unknown words in a small corpus.
no code implementations • LREC 2014 • Włodzimierz Gruszczyński, Maciej Ogrodniczuk
Digital libraries are frequently treated just as a new method of storage of digitized artifacts, with all consequences of transferring long-established ways of dealing with physical objects into the digital world.
no code implementations • LREC 2014 • Georg Rehm, Hans Uszkoreit, Sophia Ananiadou, Núria Bel, Audronė Bielevičienė, Lars Borin, António Branco, Gerhard Budin, Nicoletta Calzolari, Walter Daelemans, Radovan Garabík, Marko Grobelnik, Carmen García-Mateo, Josef van Genabith, Jan Hajič, Inma Hernáez, John Judge, Svetla Koeva, Simon Krek, Cvetana Krstev, Krister Lindén, Bernardo Magnini, Joseph Mariani, John McNaught, Maite Melero, Monica Monachini, Asunción Moreno, Jan Odijk, Maciej Ogrodniczuk, Piotr Pęzik, Stelios Piperidis, Adam Przepiórkowski, Eiríkur Rögnvaldsson, Michael Rosner, Bolette Pedersen, Inguna Skadiņa, Koenraad De Smedt, Marko Tadić, Paul Thompson, Dan Tufiş, Tamás Váradi, Andrejs Vasiļjevs, Kadri Vider, Jolanta Zabarskaite
This article provides an overview of the dissemination work carried out in META-NET from 2010 until early 2014; we describe its impact on the regional, national and international level, mainly with regard to politics and the situation of funding for LT topics.
no code implementations • LREC 2014 • Bartosz Broda, Bartłomiej Nitoń, Włodzimierz Gruszczyński, Maciej Ogrodniczuk
In this paper we present an overview of the most common approaches to automatic measuring of readability.
no code implementations • LREC 2014 • Maciej Ogrodniczuk, Mateusz Kopeć, Agata Savary
Correlation between cluster and mention count within a text is investigated, with short characteristics of outlier cases.
no code implementations • LREC 2014 • Maciej Ogrodniczuk, Mateusz Kopeć
This article presents the Polish Summaries Corpus, a new resource created to support the development and evaluation of the tools for automated single-document summarization of Polish.
no code implementations • LREC 2012 • Maciej Ogrodniczuk
This document presents the first edition of the Polish Sejm Corpus -- a new specialized resource containing transcribed, automatically annotated utterances of the Members of Polish Sejm (lower chamber of the Polish Parliament).
no code implementations • LREC 2012 • Maciej Ogrodniczuk, Michał Lenart
This paper presents a robust linguistic Web service framework for Polish, combining several mature offline linguistic tools in a common online platform.
no code implementations • LREC 2012 • Mateusz Kopeć, Maciej Ogrodniczuk
Although the availability of the natural language processing tools and the development of metrics to evaluate them increases, there is a certain gap to fill in that field for the less-resourced languages, such as Polish.
no code implementations • LREC 2012 • Marcin Woliński, Marcin Miłkowski, Maciej Ogrodniczuk, Adam Przepiórkowski
This paper presents preliminary results of an effort aiming at the creation of a morphological dictionary of Polish, PoliMorf, available under a very liberal BSD-style license.
no code implementations • LREC 2012 • Maciej Ogrodniczuk, Piotr Pęzik, Adam Przepiórkowski
The aim of this paper is to present current efforts towards the creation of a comprehensive open repository of Polish language resources and tools (LRTs).