Search Results for author: Blaž Škrlj

Found 41 papers, 19 papers with code

RaKUn: Rank-based Keyword extraction via Unsupervised learning and Meta vertex aggregation

1 code implementation • 15 Jul 2019 • Blaž Škrlj, Andraž Repar, Senja Pollak

Keyword extraction is used for summarizing the content of a document and supports efficient document retrieval, and is as such an indispensable part of modern text-based systems.

Keyword Extraction Retrieval

tax2vec: Constructing Interpretable Features from Taxonomies for Short Text Classification

1 code implementation • 1 Feb 2019 • Blaž Škrlj, Matej Martinc, Jan Kralj, Nada Lavrač, Senja Pollak

The use of background knowledge is largely unexploited in text classification tasks.

Few-Shot Learning General Classification +3

SNoRe: Scalable Unsupervised Learning of Symbolic Node Representations

1 code implementation • 8 Sep 2020 • Sebastian Mežnar, Nada Lavrač, Blaž Škrlj

Learning from complex real-life networks is a lively research area, with recent advances in learning information-rich, low-dimensional network node representations.

Ranked #17 on Node Classification on Coauthor CS

Node Classification Structural Node Embedding

OutRank: Speeding up AutoML-based Model Search for Large Sparse Data sets with Cardinality-aware Feature Ranking

1 code implementation • 4 Sep 2023 • Blaž Škrlj, Blaž Mramor

The proposed approach's feasibility is demonstrated by speeding up the state-of-the-art AutoML system on a synthetic data set with no performance loss.

Anomaly Detection AutoML +2

Knowledge Graph informed Fake News Classification via Heterogeneous Representation Ensembles

2 code implementations • 20 Oct 2021 • Boshko Koloski, Timen Stepišnik-Perdih, Marko Robnik-Šikonja, Senja Pollak, Blaž Škrlj

Increasing amounts of freely available data in both textual and relational form offer exploration of richer document representations, potentially improving model performance and robustness.

Classification Fake News Detection +4

Ensemble- and Distance-Based Feature Ranking for Unsupervised Learning

1 code implementation • 23 Nov 2020 • Matej Petković, Dragi Kocev, Blaž Škrlj, Sašo Džeroski

In this work, we propose two novel (groups of) methods for unsupervised feature ranking and selection.

Clustering

Unsupervised Feature Ranking via Attribute Networks

1 code implementation • 25 Nov 2021 • Urh Primožič, Blaž Škrlj, Sašo Džeroski, Matej Petković

The need for learning from unlabeled data is increasing in contemporary machine learning.

Attribute Recommendation Systems

Embedding-based Silhouette Community Detection

1 code implementation • 17 Jul 2019 • Blaž Škrlj, Jan Kralj, Nada Lavrač

Mining complex data in the form of networks is of increasing interest in many scientific disciplines.

Clustering Community Detection +1

AttViz: Online exploration of self-attention for transparent neural language modeling

1 code implementation • 12 May 2020 • Blaž Škrlj, Nika Eržen, Shane Sheehan, Saturnino Luz, Marko Robnik-Šikonja, Senja Pollak

Neural language models are becoming the prevailing methodology for the tasks of query answering, text classification, disambiguation, completion and translation.

Language Modelling text-classification +2

ReliefE: Feature Ranking in High-dimensional Spaces via Manifold Embeddings

1 code implementation • 23 Jan 2021 • Blaž Škrlj, Sašo Džeroski, Nada Lavrač, Matej Petković

The utility of ReliefE for high-dimensional data sets is ensured by its implementation that utilizes sparse matrix algebraic operations.

Multi-Label Classification Vocal Bursts Intensity Prediction

Exploring Neural Language Models via Analysis of Local and Global Self-Attention Spaces

1 code implementation • EACL (Hackashop) 2021 • Blaž Škrlj, Shane Sheehan, Nika Eržen, Marko Robnik-Šikonja, Saturnino Luz, Senja Pollak

Large pretrained language models using the transformer neural network architecture are becoming a dominant methodology for many natural language processing tasks, such as question answering, text classification, word sense disambiguation, text completion and machine translation.

Machine Translation Question Answering +4

Link Analysis meets Ontologies: Are Embeddings the Answer?

1 code implementation • 23 Nov 2021 • Sebastian Mežnar, Matej Bevec, Nada Lavrač, Blaž Škrlj

The increasing amounts of semantic resources offer valuable storage of human knowledge; however, the probability of wrong entries increases with the increased size.

Anomaly Detection

TNT-KID: Transformer-based Neural Tagger for Keyword Identification

1 code implementation • 20 Mar 2020 • Matej Martinc, Blaž Škrlj, Senja Pollak

With growing amounts of available textual data, development of algorithms capable of automatic analysis, categorization and summarization of these data has become a necessity.

Keyword Extraction Language Modelling

Fuzzy Jaccard Index: A robust comparison of ordered lists

2 code implementations • 5 Aug 2020 • Matej Petković, Blaž Škrlj, Dragi Kocev, Nikola Simidjievski

In real-life, and in particular high-dimensional domains, where only a small percentage of the whole feature space might be relevant, a robust and confident feature ranking leads to interpretable findings as well as efficient computation and good predictive performance.

BIG-bench Machine Learning

Extending Neural Keyword Extraction with TF-IDF tagset matching

1 code implementation • EACL (Hackashop) 2021 • Boshko Koloski, Senja Pollak, Blaž Škrlj, Matej Martinc

Keyword extraction is the task of identifying words (or multi-word expressions) that best describe a given document and serve in news portals to link articles of similar topics.

Keyword Extraction

Propositionalization and Embeddings: Two Sides of the Same Coin

2 code implementations • 8 Jun 2020 • Nada Lavrač, Blaž Škrlj, Marko Robnik-Šikonja

This paper outlines some of the modern data processing techniques used in relational learning that enable data fusion from different input data types and formats into a single table data representation, focusing on the propositionalization and embedding data transformation approaches.

Relational Reasoning Vocal Bursts Valence Prediction

Transfer Learning for Node Regression Applied to Spreading Prediction

1 code implementation • 31 Mar 2021 • Sebastian Mežnar, Nada Lavrač, Blaž Škrlj

This work is one of the first to explore transferability of the learned representations for the task of node regression; we show there exist pairs of networks with similar structure between which the trained models can be transferred (zero-shot), and demonstrate their competitive performance.

Misinformation regression +1

Deep Node Ranking for Neuro-symbolic Structural Node Embedding and Classification

1 code implementation • 11 Feb 2019 • Blaž Škrlj, Jan Kralj, Janez Konc, Marko Robnik-Šikonja, Nada Lavrač

Network node embedding is an active research subfield of complex network analysis.

Classification General Classification +3

Language comparison via network topology

1 code implementation • 16 Jul 2019 • Blaž Škrlj, Senja Pollak

In our experiments, we employ eight different network topology metrics and empirically showcase, on a parallel corpus, how the methods can be used for modeling the relations between nine selected languages.

Prediction Uncertainty Estimation for Hate Speech Classification

no code implementations • 16 Sep 2019 • Kristian Miok, Dong Nguyen-Doan, Blaž Škrlj, Daniela Zaharie, Marko Robnik-Šikonja

As a result of social network popularity, the hate speech phenomenon has increased significantly in recent years.

Bayesian Inference General Classification +3

Feature Importance Estimation with Self-Attention Networks

no code implementations • 11 Feb 2020 • Blaž Škrlj, Sašo Džeroski, Nada Lavrač, Matej Petković

Black-box neural network models are widely used in industry and science, yet are hard to understand and interpret.

Feature Importance

COVID-19 therapy target discovery with context-aware literature mining

no code implementations • 30 Jul 2020 • Matej Martinc, Blaž Škrlj, Sergej Pirkmajer, Nada Lavrač, Bojan Cestnik, Martin Marzidovšek, Senja Pollak

The abundance of literature related to the widespread COVID-19 pandemic is beyond manual inspection by a single expert.

Domain Adaptation Language Modelling +2

Predicting Generalization in Deep Learning via Metric Learning -- PGDL Shared task

no code implementations • 16 Dec 2020 • Sebastian Mežnar, Blaž Škrlj

The competition "Predicting Generalization in Deep Learning (PGDL)" aims to provide a platform for rigorous study of generalization of deep learning models and offer insight into the progress of understanding and explaining these models.

Metric Learning

Identification of COVID-19 related Fake News via Neural Stacking

no code implementations • 11 Jan 2021 • Boshko Koloski, Timen Stepišnik Perdih, Senja Pollak, Blaž Škrlj

Identification of Fake News plays a prominent role in the ongoing pandemic, impacting multiple aspects of day-to-day life.

Fake News Detection General Classification

Semantic Reasoning from Model-Agnostic Explanations

no code implementations • 29 Jun 2021 • Timen Stepišnik Perdih, Nada Lavrač, Blaž Škrlj

The derived semantic explanations are potentially more informative, as they describe the key attributes in the context of more general background knowledge, e.g., at the biological process level.

Compressibility of Distributed Document Representations

no code implementations • 14 Oct 2021 • Blaž Škrlj, Matej Petković

Contemporary natural language processing (NLP) revolves around learning from latent document representations, generated either implicitly by neural language models or explicitly by methods such as doc2vec or similar.

Text Classification

Prioritization of COVID-19-related literature via unsupervised keyphrase extraction and document representation learning

no code implementations • 17 Oct 2021 • Blaž Škrlj, Marko Jukič, Nika Eržen, Senja Pollak, Nada Lavrač

The COVID-19 pandemic triggered a wave of novel scientific literature that is impossible to inspect and study manually in a reasonable time frame.

Keyphrase Extraction Representation Learning

BERT meets Shapley: Extending SHAP Explanations to Transformer-based Classifiers

no code implementations • EACL (Hackashop) 2021 • Enja Kokalj, Blaž Škrlj, Nada Lavrač, Senja Pollak, Marko Robnik-Šikonja

Transformer-based neural networks offer very good classification performance across a wide range of domains, but do not provide explanations of their predictions.

Zero-shot Cross-lingual Content Filtering: Offensive Language and Hate Speech Detection

no code implementations • EACL (Hackashop) 2021 • Andraž Pelicon, Ravi Shekhar, Matej Martinc, Blaž Škrlj, Matthew Purver, Senja Pollak

We present a system for zero-shot cross-lingual offensive language and hate speech classification.

Hate Speech Detection

EMBEDDIA Tools, Datasets and Challenges: Resources and Hackathon Contributions

no code implementations • EACL (Hackashop) 2021 • Senja Pollak, Marko Robnik-Šikonja, Matthew Purver, Michele Boggia, Ravi Shekhar, Marko Pranjić, Salla Salmela, Ivar Krustok, Tarmo Paju, Carl-Gustav Linden, Leo Leppänen, Elaine Zosa, Matej Ulčar, Linda Freienthal, Silver Traat, Luis Adrián Cabrera-Diego, Matej Martinc, Nada Lavrač, Blaž Škrlj, Martin Žnidaršič, Andraž Pelicon, Boshko Koloski, Vid Podpečan, Janez Kranjc, Shane Sheehan, Emanuela Boros, Jose G. Moreno, Antoine Doucet, Hannu Toivonen

This paper presents tools and data sources collected and released by the EMBEDDIA project, supported by the European Union’s Horizon 2020 research and innovation program.

Interesting cross-border news discovery using cross-lingual article linking and document similarity

no code implementations • EACL (Hackashop) 2021 • Boshko Koloski, Elaine Zosa, Timen Stepišnik-Perdih, Blaž Škrlj, Tarmo Paju, Senja Pollak

Team Name: team-8. Embeddia Tool: Cross-Lingual Document Retrieval (Zosa et al.). Dataset: Estonian and Latvian news datasets. Abstract: Contemporary news media face increasing amounts of available data that can be of use when prioritizing, selecting and discovering new news.

Retrieval

Out of Thin Air: Is Zero-Shot Cross-Lingual Keyword Detection Better Than Unsupervised?

no code implementations • LREC 2022 • Boshko Koloski, Senja Pollak, Blaž Škrlj, Matej Martinc

We find that the pretrained models fine-tuned on a multilingual corpus covering languages that do not appear in the test set (i.e., in a zero-shot setting) consistently outscore unsupervised models in all six languages.

Keyword Extraction Pretrained Multilingual Language Models

E8-IJS@LT-EDI-ACL2022 - BERT, AutoML and Knowledge-graph backed Detection of Depression

no code implementations • LTEDI (ACL) 2022 • Ilija Tavchioski, Boshko Koloski, Blaž Škrlj, Senja Pollak

Depression is a mental illness that negatively affects a person’s well-being and can, if left untreated, lead to serious consequences such as suicide.

AutoML

Retrieval-efficiency trade-off of Unsupervised Keyword Extraction

no code implementations • 15 Aug 2022 • Blaž Škrlj, Boshko Koloski, Senja Pollak

Efficiently identifying keyphrases that represent a given document is a challenging task.

Keyword Extraction Retrieval

Dynamic Surrogate Switching: Sample-Efficient Search for Factorization Machine Configurations in Online Recommendations

no code implementations • 29 Sep 2022 • Blaž Škrlj, Adi Schwartz, Jure Ferlež, Davorin Kopič, Naama Ziporin

The main idea underlying this paradigm considers an incrementally updated model of the relation between the hyperparameter space and the output (target) space; the data for this model are obtained by evaluating the main learning engine, which is, for example, a factorization machine-based model.

Hyperparameter Optimization

Sentiment Classification by Incorporating Background Knowledge from Financial Ontologies

no code implementations • FNP (LREC) 2022 • Timen Stepišnik-Perdih, Andraž Pelicon, Blaž Škrlj, Martin Žnidaršič, Igor Lončarski, Senja Pollak

Ontologies have been increasingly used for machine reasoning over the last few years.

Classification Sentiment Analysis +1

DDeMON: Ontology-based function prediction by Deep Learning from Dynamic Multiplex Networks

no code implementations • 8 Feb 2023 • Jan Kralj, Blaž Škrlj, Živa Ramšak, Nada Lavrač, Kristina Gruden

Biological systems can be studied at multiple levels of information, including the gene, protein, RNA and different interaction network levels.

Measuring Catastrophic Forgetting in Cross-Lingual Transfer Paradigms: Exploring Tuning Strategies

no code implementations • 12 Sep 2023 • Boshko Koloski, Blaž Škrlj, Marko Robnik-Šikonja, Senja Pollak

As cross-lingual transfer strategies, we compare the intermediate-training (IT) that uses each language sequentially and cross-lingual validation (CLV) that uses a target language already in the validation phase of fine-tuning.

Cross-Lingual Transfer Hate Speech Detection

Drifter: Efficient Online Feature Monitoring for Improved Data Integrity in Large-Scale Recommendation Systems

no code implementations • 4 Sep 2023 • Blaž Škrlj, Nir Ki-Tov, Lee Edelist, Natalia Silberstein, Hila Weisman-Zohar, Blaž Mramor, Davorin Kopič, Naama Ziporin

Real-world production systems often grapple with maintaining data quality in large-scale, dynamic streams.

Anomaly Detection Recommendation Systems

Latent Graphs for Semi-Supervised Learning on Biomedical Tabular Data

no code implementations • 27 Sep 2023 • Boshko Koloski, Nada Lavrač, Senja Pollak, Blaž Škrlj

In the domain of semi-supervised learning, the current approaches insufficiently exploit the potential of considering inter-instance relationships among (un)labeled data.

AHAM: Adapt, Help, Ask, Model -- Harvesting LLMs for literature mining

no code implementations • 25 Dec 2023 • Boshko Koloski, Nada Lavrač, Bojan Cestnik, Senja Pollak, Blaž Škrlj, Andrej Kastrin

Our system aims to reduce both the ratio of outlier topics to the total number of topics and the similarity between topic definitions.

Domain Adaptation Language Modelling +6
