no code implementations • LTEDI (ACL) 2022 • Ilija Tavchioski, Boshko Koloski, Blaž Škrlj, Senja Pollak
Depression is a mental illness that negatively affects a person’s well-being and can, if left untreated, lead to serious consequences such as suicide.
no code implementations • LREC (BUCC) 2022 • Andraz Repar, Senja Pollak, Matej Ulčar, Boshko Koloski
The crosslingual terminology alignment task has many practical applications.
1 code implementation • SemEval (NAACL) 2022 • Elaine Zosa, Emanuela Boros, Boshko Koloski, Lidia Pivovarova
In this paper, we present the participation of the EMBEDDIA team in the SemEval-2022 Task 8 (Multilingual News Article Similarity).
no code implementations • EACL (Hackashop) 2021 • Senja Pollak, Marko Robnik-Šikonja, Matthew Purver, Michele Boggia, Ravi Shekhar, Marko Pranjić, Salla Salmela, Ivar Krustok, Tarmo Paju, Carl-Gustav Linden, Leo Leppänen, Elaine Zosa, Matej Ulčar, Linda Freienthal, Silver Traat, Luis Adrián Cabrera-Diego, Matej Martinc, Nada Lavrač, Blaž Škrlj, Martin Žnidaršič, Andraž Pelicon, Boshko Koloski, Vid Podpečan, Janez Kranjc, Shane Sheehan, Emanuela Boros, Jose G. Moreno, Antoine Doucet, Hannu Toivonen
This paper presents tools and data sources collected and released by the EMBEDDIA project, supported by the European Union’s Horizon 2020 research and innovation program.
no code implementations • EACL (Hackashop) 2021 • Boshko Koloski, Elaine Zosa, Timen Stepišnik-Perdih, Blaž Škrlj, Tarmo Paju, Senja Pollak
Team Name: team-8. Embeddia Tool: Cross-Lingual Document Retrieval (Zosa et al.). Dataset: Estonian and Latvian news datasets. Abstract: Contemporary news media face increasing amounts of available data that can be of use when prioritizing, selecting and discovering new news.
1 code implementation • 18 Dec 2024 • Matej Martinc, Hanh Thi Hong Tran, Senja Pollak, Boshko Koloski
Relying on the insight that real-world keyword detection often requires handling of diverse content, we propose a novel supervised keyword extraction approach based on the mixture of experts (MoE) technique.
no code implementations • 30 Sep 2024 • Luka Andrenšek, Boshko Koloski, Andraž Pelicon, Nada Lavrač, Senja Pollak, Matthew Purver
We investigate zero-shot cross-lingual news sentiment detection, aiming to develop robust sentiment classifiers that can be deployed across multiple languages without target-language training data.
no code implementations • 8 Sep 2024 • Guillermo Bernárdez, Lev Telyatnikov, Marco Montagna, Federica Baccini, Mathilde Papillon, Miquel Ferriol-Galmés, Mustafa Hajij, Theodore Papamarkou, Maria Sofia Bucarelli, Olga Zaghen, Johan Mathe, Audun Myers, Scott Mahan, Hansen Lillemark, Sharvaree Vadgama, Erik Bekkers, Tim Doster, Tegan Emerson, Henry Kvinge, Katrina Agate, Nesreen K Ahmed, Pengfei Bai, Michael Banf, Claudio Battiloro, Maxim Beketov, Paul Bogdan, Martin Carrasco, Andrea Cavallo, Yun Young Choi, George Dasoulas, Matouš Elphick, Giordan Escalona, Dominik Filipiak, Halley Fritze, Thomas Gebhart, Manel Gil-Sorribes, Salvish Goomanee, Victor Guallar, Liliya Imasheva, Andrei Irimia, Hongwei Jin, Graham Johnson, Nikos Kanakaris, Boshko Koloski, Veljko Kovač, Manuel Lecha, Minho Lee, Pierrick Leroy, Theodore Long, German Magai, Alvaro Martinez, Marissa Masden, Sebastian Mežnar, Bertran Miquel-Oliver, Alexis Molina, Alexander Nikitin, Marco Nurisso, Matt Piekenbrock, Yu Qin, Patryk Rygiel, Alessandro Salatiello, Max Schattauer, Pavel Snopov, Julian Suk, Valentina Sánchez, Mauricio Tec, Francesco Vaccarino, Jonas Verhellen, Frederic Wantiez, Alexander Weers, Patrik Zajec, Blaž Škrlj, Nina Miolane
This paper describes the 2nd edition of the ICML Topological Deep Learning Challenge that was hosted within the ICML 2024 ELLIS Workshop on Geometry-grounded Representation Learning and Generative Modeling (GRaM).
no code implementations • 19 Aug 2024 • Boshko Koloski, Senja Pollak, Roberto Navigli, Blaž Škrlj
This work demonstrates that injecting embedded information from knowledge bases can augment the performance of contemporary Large Language Model (LLM)-based representations for the task of text classification.
no code implementations • 10 Apr 2024 • Jaya Caporusso, Damar Hoogland, Mojca Brglez, Boshko Koloski, Matthew Purver, Senja Pollak
Dehumanisation involves the perception and/or treatment of a social group's members as less than human.
1 code implementation • 8 Apr 2024 • Syrielle Montariol, Matej Martinc, Andraž Pelicon, Senja Pollak, Boshko Koloski, Igor Lončarski, Aljoša Valentinčič
For assessing various performance indicators of companies, the focus is shifting from strictly financial (quantitative) publicly disclosed information to qualitative (textual) information.
no code implementations • 25 Dec 2023 • Boshko Koloski, Nada Lavrač, Bojan Cestnik, Senja Pollak, Blaž Škrlj, Andrej Kastrin
Our system aims to reduce both the ratio of outlier topics to the total number of topics and the similarity between topic definitions.
no code implementations • 27 Sep 2023 • Boshko Koloski, Nada Lavrač, Senja Pollak, Blaž Škrlj
In the domain of semi-supervised learning, the current approaches insufficiently exploit the potential of considering inter-instance relationships among (un)labeled data.
no code implementations • 12 Sep 2023 • Boshko Koloski, Blaž Škrlj, Marko Robnik-Šikonja, Senja Pollak
As cross-lingual transfer strategies, we compare intermediate training (IT), which uses each language sequentially, and cross-lingual validation (CLV), which uses a target language already in the validation phase of fine-tuning.
no code implementations • 15 Aug 2022 • Blaž Škrlj, Boshko Koloski, Senja Pollak
Efficiently identifying keyphrases that represent a given document is a challenging task.
no code implementations • LREC 2022 • Boshko Koloski, Senja Pollak, Blaž Škrlj, Matej Martinc
We find that the pretrained models fine-tuned on a multilingual corpus covering languages that do not appear in the test set (i.e., in a zero-shot setting) consistently outscore unsupervised models in all six languages.
2 code implementations • 20 Oct 2021 • Boshko Koloski, Timen Stepišnik-Perdih, Marko Robnik-Šikonja, Senja Pollak, Blaž Škrlj
Increasing amounts of freely available data, in both textual and relational form, offer the opportunity to explore richer document representations, potentially improving model performance and robustness.
1 code implementation • EACL (Hackashop) 2021 • Boshko Koloski, Senja Pollak, Blaž Škrlj, Matej Martinc
Keyword extraction is the task of identifying words (or multi-word expressions) that best describe a given document and serve in news portals to link articles of similar topics.
no code implementations • 11 Jan 2021 • Boshko Koloski, Timen Stepišnik Perdih, Senja Pollak, Blaž Škrlj
Identification of Fake News plays a prominent role in the ongoing pandemic, impacting multiple aspects of day-to-day life.