no code implementations • SemEval (NAACL) 2022 • Matej Klemen, Marko Robnik-Šikonja
We describe the ULFRI system used in Subtask 1 of SemEval-2022 Task 4: Patronizing and Condescending Language Detection.
no code implementations • EACL (Hackashop) 2021 • Enja Kokalj, Blaž Škrlj, Nada Lavrač, Senja Pollak, Marko Robnik-Šikonja
Transformer-based neural networks offer very good classification performance across a wide range of domains, but do not provide explanations of their predictions.
no code implementations • EACL (Hackashop) 2021 • Senja Pollak, Marko Robnik-Šikonja, Matthew Purver, Michele Boggia, Ravi Shekhar, Marko Pranjić, Salla Salmela, Ivar Krustok, Tarmo Paju, Carl-Gustav Linden, Leo Leppänen, Elaine Zosa, Matej Ulčar, Linda Freienthal, Silver Traat, Luis Adrián Cabrera-Diego, Matej Martinc, Nada Lavrač, Blaž Škrlj, Martin Žnidaršič, Andraž Pelicon, Boshko Koloski, Vid Podpečan, Janez Kranjc, Shane Sheehan, Emanuela Boros, Jose G. Moreno, Antoine Doucet, Hannu Toivonen
This paper presents tools and data sources collected and released by the EMBEDDIA project, supported by the European Union’s Horizon 2020 research and innovation program.
1 code implementation • EACL (Hackashop) 2021 • Aleš Žagar, Marko Robnik-Šikonja
The research on the summarization of user comments is still in its infancy, and human-created summarization datasets are scarce, especially for less-resourced languages.
1 code implementation • EACL (Hackashop) 2021 • Blaž Škrlj, Shane Sheehan, Nika Eržen, Marko Robnik-Šikonja, Saturnino Luz, Senja Pollak
Large pretrained language models using the transformer neural network architecture are becoming a dominant methodology for many natural language processing tasks, such as question answering, text classification, word sense disambiguation, text completion and machine translation.
no code implementations • 9 Aug 2024 • Marko Hostnik, Marko Robnik-Šikonja
We evaluate in-context retrieval-augmented generation on larger models and conclude that, despite its simplicity, the approach is more suitable than using the RETRO architecture.
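A minimal sketch of the in-context variant of retrieval-augmented generation described here: retrieved passages are simply prepended to the model's prompt, with no architectural changes to the language model (unlike RETRO's chunked cross-attention). The toy corpus, TF-IDF retriever, and prompt template below are illustrative placeholders, not the paper's actual setup.

```python
# Minimal sketch of in-context retrieval-augmented generation:
# retrieved passages are prepended to the prompt of an unmodified LM.
# Corpus, retriever, and prompt format are illustrative placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "RETRO augments a transformer with a chunked cross-attention retriever.",
    "In-context RAG prepends retrieved text to the input of an unmodified LM.",
    "Slovene is a less-resourced South Slavic language.",
]

def retrieve(query, k=2):
    """Return the k corpus passages most similar to the query (TF-IDF)."""
    vec = TfidfVectorizer().fit(corpus + [query])
    sims = cosine_similarity(vec.transform([query]), vec.transform(corpus))[0]
    return [corpus[i] for i in sims.argsort()[::-1][:k]]

def build_prompt(query):
    """Prepend the retrieved passages to the query; the LM itself is unchanged."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

print(build_prompt("How does in-context RAG differ from RETRO?"))
```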
no code implementations • 12 Sep 2023 • Boshko Koloski, Blaž Škrlj, Marko Robnik-Šikonja, Senja Pollak
As cross-lingual transfer strategies, we compare intermediate-training (IT), which uses each language sequentially, with cross-lingual validation (CLV), which uses the target language already in the validation phase of fine-tuning.
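The difference between the two strategies can be summarised with a schematic training loop. The fine_tune() helper and the data dictionaries below are dummy stand-ins, not the authors' training code, and only contrast where the target language enters each procedure.

```python
# Schematic contrast of intermediate-training (IT) and cross-lingual
# validation (CLV). fine_tune() and the data dictionaries are placeholders
# standing in for real fine-tuning on real corpora.

train_data = {"en": "EN-train", "hr": "HR-train", "sl": "SL-train"}
dev_data = {"en": "EN-dev", "hr": "HR-dev", "sl": "SL-dev"}

def fine_tune(model, train_split, dev_split):
    """Placeholder: fine-tune on train_split, select/early-stop on dev_split."""
    return model + [f"train={train_split}, dev={dev_split}"]

def intermediate_training(source_langs, target_lang):
    # IT: each source language is used sequentially, validating within the
    # same language; the target language enters only in the final step.
    model = []
    for lang in source_langs:
        model = fine_tune(model, train_data[lang], dev_data[lang])
    return fine_tune(model, train_data[target_lang], dev_data[target_lang])

def cross_lingual_validation(source_langs, target_lang):
    # CLV: training still uses the source languages, but model selection
    # (the validation phase) already uses the target language.
    model = []
    for lang in source_langs:
        model = fine_tune(model, train_data[lang], dev_data[target_lang])
    return model

print(intermediate_training(["en", "hr"], "sl"))
print(cross_lingual_validation(["en", "hr"], "sl"))
```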
no code implementations • 20 Jun 2023 • Aleš Žagar, Marko Robnik-Šikonja
We propose a system that recommends the most suitable summarization model for a given text.
1 code implementation • 9 May 2023 • Ilija Tavchioski, Marko Robnik-Šikonja, Senja Pollak
As the impact of technology on our lives increases, we witness increased use of social media, which has become an essential tool not only for communication but also for sharing our thoughts and feelings with the community.
no code implementations • 23 Jan 2023 • Boštjan Vouk, Matej Guid, Marko Robnik-Šikonja
Feature construction can contribute to comprehensibility and performance of machine learning models.
1 code implementation • 16 Nov 2022 • Katja Logar, Marko Robnik-Šikonja
Most approaches are developed for English, while less-resourced languages are much less researched.
no code implementations • 22 Aug 2022 • Dimitar Trajanov, Vangel Trajkovski, Makedonka Dimitrieva, Jovana Dobreva, Milos Jovanovik, Matej Klemen, Aleš Žagar, Marko Robnik-Šikonja
As our work shows, NLP is a highly relevant information extraction and processing approach for pharmacology.
1 code implementation • 28 Jul 2022 • Matej Ulčar, Marko Robnik-Šikonja
Large pretrained language models have recently conquered the area of natural language processing.
no code implementations • LREC 2022 • Aleš Žagar, Marko Robnik-Šikonja
We present a Slovene combined machine-human translated SuperGLUE benchmark.
no code implementations • 20 Dec 2021 • Matej Ulčar, Marko Robnik-Šikonja
To analyze the importance of focusing on a single language and the importance of a large training set, we compare created models with existing monolingual and multilingual BERT models for Estonian, Latvian, and Lithuanian.
1 code implementation • 13 Nov 2021 • Matej Klemen, Marko Robnik-Šikonja
We propose a novel methodology for extracting paraphrasing datasets from NLI datasets and for cleaning existing paraphrasing datasets.
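One widely used heuristic for mining paraphrase pairs from NLI-style data is bidirectional entailment: keep a sentence pair only if each sentence entails the other. The sketch below illustrates that idea with toy judgements; it is an assumption for illustration, not necessarily the exact filtering procedure used in this paper.

```python
# Illustrative bidirectional-entailment filter for mining paraphrase pairs
# from NLI-style data. The hard-coded judgements stand in for the output of
# a trained NLI classifier.

judgements = {
    ("A man is playing a guitar.", "A man plays a guitar."): True,
    ("A man plays a guitar.", "A man is playing a guitar."): True,
    ("A dog runs on the beach.", "An animal is outside."): True,
    ("An animal is outside.", "A dog runs on the beach."): False,
}

def entails(premise, hypothesis):
    """Placeholder NLI judgement; replace with a real NLI model."""
    return judgements.get((premise, hypothesis), False)

nli_pairs = [
    ("A man is playing a guitar.", "A man plays a guitar."),
    ("A dog runs on the beach.", "An animal is outside."),
]

# Keep only pairs where entailment holds in both directions.
paraphrases = [(a, b) for a, b in nli_pairs if entails(a, b) and entails(b, a)]
print(paraphrases)  # only the mutually entailing pair survives
```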
2 code implementations • 20 Oct 2021 • Boshko Koloski, Timen Stepišnik-Perdih, Marko Robnik-Šikonja, Senja Pollak, Blaž Škrlj
Increasing amounts of freely available data, in both textual and relational form, offer the opportunity to explore richer document representations, potentially improving model performance and robustness.
no code implementations • 22 Jul 2021 • Matej Ulčar, Aleš Žagar, Carlos S. Armendariz, Andraž Repar, Senja Pollak, Matthew Purver, Marko Robnik-Šikonja
The current dominance of deep neural networks in natural language processing is based on contextual embeddings such as ELMo, BERT, and BERT derivatives.
no code implementations • 30 Jun 2021 • Matej Ulčar, Marko Robnik-Šikonja
Building machine learning prediction models for a specific NLP task requires sufficient training data, which can be difficult to obtain for less-resourced languages.
no code implementations • 8 Dec 2020 • Aleš Žagar, Marko Robnik-Šikonja
Automatic evaluation shows that the summaries of our best cross-lingual model are useful and of similar quality to those of a model trained only on the target language.
2 code implementations • 24 Nov 2020 • Matej Klemen, Luka Krsnik, Marko Robnik-Šikonja
We analyse the effect of adding morphological features to LSTM and BERT models.
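A common way to inject such features, sketched below under illustrative assumptions (dimensions, tag inventory, and the downstream head are invented, and this is not necessarily the paper's exact architecture), is to embed each token's morphological tags and concatenate them with the contextual token representation before the classification layer.

```python
# Sketch: concatenating morphological-feature embeddings with contextual
# token embeddings (e.g. from an LSTM or BERT). Dimensions and the task
# head are illustrative assumptions.
import torch
import torch.nn as nn

class TokenTaggerWithMorph(nn.Module):
    def __init__(self, context_dim=768, n_morph_tags=50, morph_dim=32, n_labels=10):
        super().__init__()
        self.morph_emb = nn.Embedding(n_morph_tags, morph_dim)
        self.classifier = nn.Linear(context_dim + morph_dim, n_labels)

    def forward(self, contextual_repr, morph_tag_ids):
        # contextual_repr: (batch, seq_len, context_dim), e.g. BERT/LSTM output
        # morph_tag_ids:   (batch, seq_len) indices of morphological tags
        morph = self.morph_emb(morph_tag_ids)
        combined = torch.cat([contextual_repr, morph], dim=-1)
        return self.classifier(combined)

model = TokenTaggerWithMorph()
logits = model(torch.randn(2, 7, 768), torch.randint(0, 50, (2, 7)))
print(logits.shape)  # torch.Size([2, 7, 10])
```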
1 code implementation • 13 Aug 2020 • Tadej Škvorc, Polona Gantar, Marko Robnik-Šikonja
Idiomatic expressions can be problematic for natural language processing applications as their meaning cannot be inferred from their constituent words.
no code implementations • 14 Jun 2020 • Matej Ulčar, Marko Robnik-Šikonja
Large pretrained masked language models have become state-of-the-art solutions for many NLP problems.
2 code implementations • 8 Jun 2020 • Nada Lavrač, Blaž Škrlj, Marko Robnik-Šikonja
This paper outlines some of the modern data processing techniques used in relational learning that enable data fusion from different input data types and formats into a single table data representation, focusing on the propositionalization and embedding data transformation approaches.
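As a toy illustration of propositionalization, i.e. flattening a one-to-many relational schema into a single table by aggregating the related rows into features, here is a small pandas sketch; the tables and aggregates are invented for the example and do not come from the paper.

```python
# Toy propositionalization: aggregate a one-to-many relation
# (customers -> orders) into a single-table representation.
import pandas as pd

customers = pd.DataFrame({"customer_id": [1, 2], "country": ["SI", "EE"]})
orders = pd.DataFrame({
    "customer_id": [1, 1, 2],
    "amount": [10.0, 25.0, 7.5],
})

# Aggregate the "many" side into per-customer features ...
order_feats = orders.groupby("customer_id")["amount"].agg(["count", "sum", "mean"])
order_feats.columns = [f"order_{c}" for c in order_feats.columns]

# ... and join them onto the "one" side to obtain a single table.
single_table = customers.merge(order_feats, on="customer_id", how="left")
print(single_table)
```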
no code implementations • 13 May 2020 • Kristian Miok, Dong Nguyen-Doan, Marko Robnik-Šikonja, Daniela Zaharie
Due to complex experimental settings, missing values are common in biomedical data.
1 code implementation • 12 May 2020 • Blaž Škrlj, Nika Eržen, Shane Sheehan, Saturnino Luz, Marko Robnik-Šikonja, Senja Pollak
Neural language models are becoming the prevailing methodology for the tasks of query answering, text classification, disambiguation, completion and translation.
1 code implementation • LREC 2020 • Carlos Santos Armendariz, Matthew Purver, Matej Ulčar, Senja Pollak, Nikola Ljubešić, Marko Robnik-Šikonja, Mark Granroth-Wilding, Kristiina Vaik
State-of-the-art natural language processing tools are built on context-dependent word embeddings, but no direct method for evaluating these representations currently exists.
no code implementations • LREC 2020 • Matej Ulčar, Kristiina Vaik, Jessica Lindström, Milda Dailidėnaitė, Marko Robnik-Šikonja
In text processing, deep neural networks mostly use word embeddings as an input.
no code implementations • 22 Nov 2019 • Matej Ulčar, Marko Robnik-Šikonja
Recent results show that deep neural networks using contextual embeddings significantly outperform non-contextual embeddings on a majority of text classification tasks.
no code implementations • 16 Sep 2019 • Kristian Miok, Dong Nguyen-Doan, Blaž Škrlj, Daniela Zaharie, Marko Robnik-Šikonja
As a result of the popularity of social networks, the phenomenon of hate speech has significantly increased in recent years.
1 code implementation • 12 Sep 2019 • Kristian Miok, Dong Nguyen-Doan, Daniela Zaharie, Marko Robnik-Šikonja
In many such cases, generators of synthetic data with the same statistical and predictive properties as the actual data allow efficient simulations and development of tools and applications.
2 code implementations • CL (ACL) 2021 • Matej Martinc, Senja Pollak, Marko Robnik-Šikonja
We present a set of novel neural supervised and unsupervised approaches for determining the readability of documents.
1 code implementation • 11 Feb 2019 • Blaž Škrlj, Jan Kralj, Janez Konc, Marko Robnik-Šikonja, Nada Lavrač
Network node embedding is an active research subfield of complex network analysis.
no code implementations • 17 Jun 2014 • Andreja Čufar, Aleš Mrhar, Marko Robnik-Šikonja
Next, we build a model for predicting the successful introduction of clinical pharmacy into clinical departments.
no code implementations • 28 Mar 2014 • Marko Robnik-Šikonja
The proposed generator is based on RBF networks, which learn sets of Gaussian kernels.
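A simplified stand-in for the described idea: learn a set of Gaussian kernels from the data and sample new instances from them. The sketch below uses scikit-learn's Gaussian mixture model rather than an actual RBF network, so it only approximates the described generator; the data are synthetic and illustrative.

```python
# Simplified stand-in for a kernel-based data generator: fit a set of
# Gaussian components to the data and sample synthetic instances from them.
# Uses a Gaussian mixture model instead of an actual RBF network.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
real_data = np.vstack([
    rng.normal(loc=[0, 0], scale=0.5, size=(100, 2)),
    rng.normal(loc=[3, 3], scale=0.7, size=(100, 2)),
])

gmm = GaussianMixture(n_components=2, random_state=0).fit(real_data)
synthetic, _ = gmm.sample(50)  # 50 new instances drawn from the learned kernels
print(synthetic[:3])
```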