no code implementations • 6 Oct 2024 • Piotr Gramacki, Bruno Martins, Piotr Szymański
We share our dataset and reproducible evaluation code on a public GitHub repository, arguing that this can serve as an evaluation benchmark for new LLMs in the future.
1 code implementation • 19 Oct 2023 • Piotr Gramacki, Kacper Leśniara, Kamil Raczycki, Szymon Woźniak, Marcin Przymus, Piotr Szymański
Spatial Representations for Artificial Intelligence (srai) is a Python library for working with geospatial data.
1 code implementation • 2 Jun 2023 • Piotr Kawa, Marcin Plata, Michał Czuba, Piotr Szymański, Piotr Syga
With a recent influx of voice generation methods, the threat introduced by audio DeepFake (DF) is ever-increasing.
1 code implementation • 26 Apr 2023 • Kacper Leśniara, Piotr Szymański
Recent years brought advancements in using neural networks for representation learning of various language or visual phenomena.
1 code implementation • 23 Nov 2022 • Łukasz Augustyniak, Kamil Tagowski, Albert Sawczyn, Denis Janiak, Roman Bartusiak, Adrian Szymczak, Marcin Wątroba, Arkadiusz Janz, Piotr Szymański, Mikołaj Morzy, Tomasz Kajdanowicz, Maciej Piasecki
In this paper, we introduce LEPISZCZE (the Polish word for glew, the Middle English predecessor of glue), a new, comprehensive benchmark for Polish NLP with a large variety of tasks and high-quality operationalization of the benchmark.
1 code implementation • 1 Nov 2021 • Kamil Raczycki, Piotr Szymański
Bicycle-sharing systems (BSS) have become a daily reality for many citizens of larger, wealthier cities in developed regions.
1 code implementation • 1 Nov 2021 • Szymon Woźniak, Piotr Szymański
In this paper we propose the first approach to learning vector representations of OpenStreetMap regions with respect to urban functions and land-use in a micro-region grid.
1 code implementation • 1 Nov 2021 • Piotr Gramacki, Szymon Woźniak, Piotr Szymański
We selected 48 European cities and gathered their public transport timetables in the GTFS format.
1 code implementation • 11 Oct 2021 • Kamil Raczycki, Marcin Szymański, Yahor Yeliseyenka, Piotr Szymański, Tomasz Kajdanowicz
We successfully build an information type classifier for social media posts, detect stop names in posts, and relate them to GPS coordinates, obtaining a spatial understanding of long-term aggregated phenomena.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Piotr Szymański, Piotr Żelasko, Mikolaj Morzy, Adrian Szymczak, Marzena Żyła-Hoppe, Joanna Banaszczak, Lukasz Augustyniak, Jan Mizgajski, Yishay Carmiel
Natural language processing of conversational speech requires the availability of high-quality transcripts.
Automatic Speech Recognition (ASR)
no code implementations • EMNLP 2020 • Piotr Szymański, Kyle Gorman
Recent work raises concerns about the use of standard splits to compare natural language processing models.
no code implementations • 2 Sep 2019 • Jan Mizgajski, Adrian Szymczak, Robert Głowski, Piotr Szymański, Piotr Żelasko, Łukasz Augustyniak, Mikołaj Morzy, Yishay Carmiel, Jeff Hodson, Łukasz Wójciak, Daniel Smoczyk, Adam Wróbel, Bartosz Borowik, Adam Artajew, Marcin Baran, Cezary Kwiatkowski, Marzena Żyła-Hoppe
Avaya Conversational Intelligence (ACI) is an end-to-end, cloud-based solution for real-time Spoken Language Understanding for call centers.
no code implementations • 21 Aug 2019 • Piotr Żelasko, Jan Mizgajski, Mikołaj Morzy, Adrian Szymczak, Piotr Szymański, Łukasz Augustyniak, Yishay Carmiel
In this paper, we present a method for correcting automatic speech recognition (ASR) errors using a finite state transducer (FST) intent recognition framework.
Automatic Speech Recognition (ASR)
1 code implementation • 7 Dec 2018 • Piotr Szymański, Tomasz Kajdanowicz, Nitesh Chawla
Multi-label classification aims to classify instances with discrete non-exclusive labels.
no code implementations • 2 Jul 2018 • Piotr Żelasko, Piotr Szymański, Jan Mizgajski, Adrian Szymczak, Yishay Carmiel, Najim Dehak
The models are trained on the Fisher corpus, which includes punctuation annotation.
no code implementations • 27 Apr 2017 • Piotr Szymański, Tomasz Kajdanowicz
We present a new approach to stratifying multi-label data for classification purposes based on the iterative stratification approach proposed by Sechidis et al.
no code implementations • 13 Feb 2017 • Piotr Szymański, Tomasz Kajdanowicz
For F1 scores and Subset Accuracy, data-driven approaches were, even in the worst case, more likely to outperform random approaches than not.
2 code implementations • 5 Feb 2017 • Piotr Szymański, Tomasz Kajdanowicz
It provides native Python implementations of popular multi-label classification methods alongside a novel framework for label space partitioning and division.
no code implementations • 7 Jun 2016 • Piotr Szymański, Tomasz Kajdanowicz, Kristian Kersting
We show that fastgreedy and walktrap community detection methods on weighted label co-occurrence graphs are 85-92% more likely to yield better F1 scores than random partitioning.
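For readers unfamiliar with the term, a weighted label co-occurrence graph can be sketched as follows. This is an illustrative assumption, not the authors' code: the function name `label_cooccurrence_graph` and the toy samples are hypothetical, and the edge weight is taken to be the number of samples in which two labels appear together. Community detection (e.g. fastgreedy or walktrap) would then be run on the resulting graph with a graph library, which is not shown here.

```python
from collections import Counter
from itertools import combinations


def label_cooccurrence_graph(label_sets):
    """Return {(label_a, label_b): weight} for an undirected graph.

    Assumption: the weight of an edge is the number of samples
    that carry both labels.
    """
    edges = Counter()
    for labels in label_sets:
        # Sorting gives every undirected edge a single canonical key.
        for a, b in combinations(sorted(set(labels)), 2):
            edges[(a, b)] += 1
    return dict(edges)


# Hypothetical toy data: each set holds the labels of one sample.
samples = [
    {"sports", "news"},
    {"sports", "news", "politics"},
    {"politics"},
    {"news", "politics"},
]
graph = label_cooccurrence_graph(samples)
```

Samples with a single label contribute no edges, so labels that never co-occur with others end up as isolated nodes that any partitioning step has to handle separately.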