no code implementations • EACL (Hackashop) 2021 • Senja Pollak, Marko Robnik-Šikonja, Matthew Purver, Michele Boggia, Ravi Shekhar, Marko Pranjić, Salla Salmela, Ivar Krustok, Tarmo Paju, Carl-Gustav Linden, Leo Leppänen, Elaine Zosa, Matej Ulčar, Linda Freienthal, Silver Traat, Luis Adrián Cabrera-Diego, Matej Martinc, Nada Lavrač, Blaž Škrlj, Martin Žnidaršič, Andraž Pelicon, Boshko Koloski, Vid Podpečan, Janez Kranjc, Shane Sheehan, Emanuela Boros, Jose G. Moreno, Antoine Doucet, Hannu Toivonen
This paper presents tools and data sources collected and released by the EMBEDDIA project, supported by the European Union’s Horizon 2020 research and innovation program.
no code implementations • EACL (Hackashop) 2021 • Matej Martinc, Nina Perger, Andraž Pelicon, Matej Ulčar, Andreja Vezovnik, Senja Pollak
We conduct automatic sentiment and viewpoint analysis of the newly created Slovenian news corpus containing articles related to the topic of LGBTIQ+ by employing the state-of-the-art news sentiment classifier and a system for semantic change detection.
no code implementations • LREC (BUCC) 2022 • Andraz Repar, Senja Pollak, Matej Ulčar, Boshko Koloski
The cross-lingual terminology alignment task has many practical applications.
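The core idea of terminology alignment can be illustrated with a toy sketch: match each source-language term to its nearest target-language term in a shared cross-lingual embedding space. The vectors and matching rule below are purely illustrative assumptions, not the paper's actual method or data.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy cross-lingual embedding space (vectors are made up for illustration).
en_terms = {"neural network": np.array([0.9, 0.1, 0.2]),
            "machine translation": np.array([0.1, 0.8, 0.3])}
sl_terms = {"nevronska mreža": np.array([0.88, 0.12, 0.18]),
            "strojno prevajanje": np.array([0.12, 0.79, 0.31])}

def align(src, tgt):
    """Greedy alignment: each source term maps to its most similar target term."""
    return {s: max(tgt, key=lambda t: cosine(v, tgt[t]))
            for s, v in src.items()}

pairs = align(en_terms, sl_terms)
```

With these toy vectors, "neural network" aligns with "nevronska mreža"; real systems rely on learned cross-lingual representations rather than hand-set vectors.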
1 code implementation • 28 Jul 2022 • Matej Ulčar, Marko Robnik-Šikonja
Large pretrained language models have recently come to dominate natural language processing.
no code implementations • 20 Dec 2021 • Matej Ulčar, Marko Robnik-Šikonja
To analyze the importance of focusing on a single language and the importance of a large training set, we compare created models with existing monolingual and multilingual BERT models for Estonian, Latvian, and Lithuanian.
no code implementations • 22 Jul 2021 • Matej Ulčar, Aleš Žagar, Carlos S. Armendariz, Andraž Repar, Senja Pollak, Matthew Purver, Marko Robnik-Šikonja
The current dominance of deep neural networks in natural language processing is based on contextual embeddings such as ELMo, BERT, and BERT derivatives.
no code implementations • 30 Jun 2021 • Matej Ulčar, Marko Robnik-Šikonja
Building machine learning prediction models for a specific NLP task requires sufficient training data, which can be difficult to obtain for less-resourced languages.
no code implementations • 14 Jun 2020 • Matej Ulčar, Marko Robnik-Šikonja
Large pretrained masked language models have become state-of-the-art solutions for many NLP problems.
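The masked-language-model objective these systems train on can be sketched in a few lines: hide a fraction of input tokens and ask the model to predict them. The snippet below shows only the masking step, simplified (the full BERT recipe also randomly keeps or replaces some selected tokens); it is a generic illustration, not code from the paper.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """BERT-style masking sketch: hide roughly `mask_prob` of the tokens and
    record which positions (and original tokens) the model must predict."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets[i] = tok  # the model's prediction target at position i
        else:
            masked.append(tok)
    return masked, targets

tokens = "large pretrained masked language models work well".split()
masked, targets = mask_tokens(tokens)
```

Training then minimises the prediction loss over the recorded target positions only, which is what lets these models learn from raw unlabelled text.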
1 code implementation • LREC 2020 • Carlos Santos Armendariz, Matthew Purver, Matej Ulčar, Senja Pollak, Nikola Ljubešić, Marko Robnik-Šikonja, Mark Granroth-Wilding, Kristiina Vaik
State-of-the-art natural language processing tools are built on context-dependent word embeddings, but no direct method for evaluating these representations currently exists.
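A common recipe for evaluating similarity judgments of this kind is rank correlation between model scores and human ratings. The sketch below uses hypothetical scores and a minimal Spearman implementation (no tie handling); it illustrates the general evaluation idea, not the dataset or protocol introduced in the paper.

```python
import numpy as np

def spearman(a, b):
    """Spearman rank correlation between two score arrays
    (simplified: assumes no tied values)."""
    ra = np.argsort(np.argsort(a)).astype(float)  # ranks of a
    rb = np.argsort(np.argsort(b)).astype(float)  # ranks of b
    ra -= ra.mean()
    rb -= rb.mean()
    return float(ra @ rb / (np.linalg.norm(ra) * np.linalg.norm(rb)))

# Hypothetical human similarity judgements for word pairs in context,
# and a model's similarity scores for the same pairs.
human = np.array([0.9, 0.2, 0.6, 0.4])
model = np.array([0.8, 0.1, 0.7, 0.3])
rho = spearman(human, model)  # closer to 1.0 means better agreement
```

Here the model ranks the pairs in the same order as the annotators, so the correlation is perfect; real evaluations report this statistic over much larger pair sets.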
no code implementations • LREC 2020 • Matej Ulčar, Kristiina Vaik, Jessica Lindström, Milda Dailidėnaitė, Marko Robnik-Šikonja
In text processing, deep neural networks mostly use word embeddings as input.
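Concretely, "embeddings as input" means the first layer of the network is a lookup table mapping token ids to vectors. The sketch below uses a randomly initialised table for illustration; in practice the rows would come from pretrained vectors (the vocabulary and dimensions here are assumptions, not from the paper).

```python
import numpy as np

# Toy vocabulary and a randomly initialised |V| x d embedding table.
vocab = {"<unk>": 0, "deep": 1, "neural": 2, "networks": 3}
rng = np.random.default_rng(0)
emb = rng.normal(size=(len(vocab), 4))

def embed(tokens):
    """Map tokens to their embedding rows: the typical first layer of a
    neural text model. Unknown tokens fall back to the <unk> row."""
    ids = [vocab.get(t, vocab["<unk>"]) for t in tokens]
    return emb[ids]  # shape: (len(tokens), d)

x = embed(["deep", "neural", "networks"])
```

The resulting matrix `x` is what the rest of the network (an LSTM, CNN, or transformer) consumes.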
no code implementations • 22 Nov 2019 • Matej Ulčar, Marko Robnik-Šikonja
Recent results show that deep neural networks using contextual embeddings significantly outperform non-contextual embeddings on a majority of text classification tasks.