Search Results for author: Matej Martinc

Found 27 papers, 9 papers with code

JSI at SemEval-2022 Task 1: CODWOE - Reverse Dictionary: Monolingual and cross-lingual approaches

1 code implementation SemEval (NAACL) 2022 Thi Hong Hanh Tran, Matej Martinc, Matthew Purver, Senja Pollak

The reverse dictionary task is a sequence-to-vector task in which a gloss is provided as input, and the output must be a semantically matching word vector.

Reverse Dictionary Zero-Shot Learning

IJS at TextGraphs-16 Natural Language Premise Selection Task: Will Contextual Information Improve Natural Language Premise Selection?

no code implementations COLING (TextGraphs) 2022 Thi Hong Hanh Tran, Matej Martinc, Antoine Doucet, Senja Pollak

The results demonstrate that the contextual representation captures meaningful information better than the statistical approach (e.g., TF-IDF), despite not being pretrained on the mathematical domain, with a boost of around 3.00% MAP@500.

Embeddings models for Buddhist Sanskrit

no code implementations LREC 2022 Ligeia Lugli, Matej Martinc, Andraž Pelicon, Senja Pollak

We release a novel corpus of Buddhist texts, a novel corpus of general Sanskrit and word similarity and word analogy datasets for intrinsic evaluation of Buddhist Sanskrit embeddings models.

Semantic Similarity Semantic Textual Similarity +2

EMBEDDIA hackathon report: Automatic sentiment and viewpoint analysis of Slovenian news corpus on the topic of LGBTIQ+

no code implementations EACL (Hackashop) 2021 Matej Martinc, Nina Perger, Andraž Pelicon, Matej Ulčar, Andreja Vezovnik, Senja Pollak

We conduct automatic sentiment and viewpoint analysis of the newly created Slovenian news corpus containing articles related to the topic of LGBTIQ+ by employing the state-of-the-art news sentiment classifier and a system for semantic change detection.

Change Detection

Multi-Task Learning for Features Extraction in Financial Annual Reports

1 code implementation 8 Apr 2024 Syrielle Montariol, Matej Martinc, Andraž Pelicon, Senja Pollak, Boshko Koloski, Igor Lončarski, Aljoša Valentinčič

For assessing various performance indicators of companies, the focus is shifting from strictly financial (quantitative) publicly disclosed information to qualitative (textual) information.

Multi-Task Learning Sentence +2

Semantic change detection for Slovene language: a novel dataset and an approach based on optimal transport

1 code implementation 26 Feb 2024 Marko Pranjić, Kaja Dobrovoljc, Senja Pollak, Matej Martinc

In this paper, we focus on the detection of semantic changes in Slovene, a less resourced Slavic language with two million speakers.

Change Detection Sentence

The Recent Advances in Automatic Term Extraction: A survey

no code implementations 17 Jan 2023 Hanh Thi Hong Tran, Matej Martinc, Jaya Caporusso, Antoine Doucet, Senja Pollak

Automatic term extraction (ATE) is a Natural Language Processing (NLP) task that eases the effort of manually identifying terms from domain-specific corpora by providing a list of candidate terms.

Feature Engineering Information Retrieval +4

Ensembling Transformers for Cross-domain Automatic Term Extraction

no code implementations 12 Dec 2022 Hanh Thi Hong Tran, Matej Martinc, Andraz Pelicon, Antoine Doucet, Senja Pollak

Automatic term extraction plays an essential role in domain language understanding and several natural language processing downstream tasks.

Term Extraction

Out of Thin Air: Is Zero-Shot Cross-Lingual Keyword Detection Better Than Unsupervised?

no code implementations LREC 2022 Boshko Koloski, Senja Pollak, Blaž Škrlj, Matej Martinc

We find that the pretrained models fine-tuned on a multilingual corpus covering languages that do not appear in the test set (i.e., in a zero-shot setting) consistently outscore unsupervised models in all six languages.

Keyword Extraction Pretrained Multilingual Language Models

Scalable and Interpretable Semantic Change Detection

1 code implementation NAACL 2021 Syrielle Montariol, Matej Martinc, Lidia Pivovarova

We propose a novel scalable method for word usage-change detection that offers large gains in processing time and significant memory savings while offering the same interpretability and better performance than unscalable methods.

Change Detection

Extending Neural Keyword Extraction with TF-IDF tagset matching

1 code implementation EACL (Hackashop) 2021 Boshko Koloski, Senja Pollak, Blaž Škrlj, Matej Martinc

Keyword extraction is the task of identifying words (or multi-word expressions) that best describe a given document; in news portals, these keywords serve to link articles on similar topics.

Keyword Extraction

TNT-KID: Transformer-based Neural Tagger for Keyword Identification

1 code implementation 20 Mar 2020 Matej Martinc, Blaž Škrlj, Senja Pollak

With growing amounts of available textual data, development of algorithms capable of automatic analysis, categorization and summarization of these data has become a necessity.

Keyword Extraction Language Modelling

Capturing Evolution in Word Usage: Just Add More Clusters?

no code implementations 18 Jan 2020 Matej Martinc, Syrielle Montariol, Elaine Zosa, Lidia Pivovarova

The way words are used evolves over time, mirroring the cultural or technological evolution of society.

Change Detection

Leveraging Contextual Embeddings for Detecting Diachronic Semantic Shift

no code implementations LREC 2020 Matej Martinc, Petra Kralj Novak, Senja Pollak

We propose a new method that leverages contextual embeddings for the task of diachronic semantic shift detection by generating time specific word representations from BERT embeddings.

Domain Adaptation Semantic Shift Detection

Embeddia at SemEval-2019 Task 6: Detecting Hate with Neural Network and Transfer Learning Approaches

1 code implementation SEMEVAL 2019 Andraž Pelicon, Matej Martinc, Petra Kralj Novak

For the first sub-task, we used a BERT model fine-tuned on the OLID dataset, while for the second and third tasks we developed a custom neural network architecture which combines bag-of-words features and automatically generated sequence-based features.

Language Identification Transfer Learning

Er ... well, it matters, right? On the role of data representations in spoken language dependency parsing

no code implementations WS 2018 Kaja Dobrovoljc, Matej Martinc

Despite the significant improvement of data-driven dependency parsing systems in recent years, they still achieve a considerably lower performance in parsing spoken language data in comparison to written data.

Dependency Parsing Language Modelling
