Search Results for author: Grigorios Tsoumakas

Found 42 papers, 24 papers with code

AUTH @ CLSciSumm 20, LaySumm 20, LongSumm 20

no code implementations EMNLP (sdp) 2020 Alexios Gidiotis, Stefanos Stefanidis, Grigorios Tsoumakas

We present the systems we submitted for the shared tasks of the Workshop on Scholarly Document Processing at EMNLP 2020.

Keyphrase Extraction from Scientific Articles via Extractive Summarization

1 code implementation NAACL (sdp) 2021 Chrysovalantis Giorgos Kontoulis, Eirini Papagiannopoulou, Grigorios Tsoumakas

Automatically extracting keyphrases from scholarly documents leads to a valuable concise representation that humans can understand and machines can process for tasks such as information retrieval, article clustering and article classification.

Extractive Summarization Information Retrieval +1

Keyword Extraction Using Unsupervised Learning on the Document’s Adjacency Matrix

no code implementations NAACL (TextGraphs) 2021 Eirini Papagiannopoulou, Grigorios Tsoumakas, Apostolos Papadopoulos

This work revisits the information given by the graph-of-words and its typical utilization through graph-based ranking approaches in the context of keyword extraction.

Keyword Extraction

Bayesian Active Summarization

no code implementations 9 Oct 2021 Alexios Gidiotis, Grigorios Tsoumakas

Bayesian Active Learning has had a significant impact on various NLP problems, but its application to text summarization has been explored very little.

Active Learning Text Summarization

Harvesting the Public MeSH Note field

no code implementations 1 Jun 2021 Anastasios Nentidis, Anastasia Krithara, Grigorios Tsoumakas, Georgios Paliouras

In this document, we report an analysis of the Public MeSH Note field of the new descriptors introduced in the MeSH thesaurus between 2006 and 2020.

Optimizing Area Under the Curve Measures via Matrix Factorization for Predicting Drug-Target Interaction with Multiple Similarities

1 code implementation 1 May 2021 Bin Liu, Grigorios Tsoumakas

In drug discovery, identifying drug-target interactions (DTIs) via experimental approaches is a tedious and expensive procedure.

Drug Discovery

Conclusive Local Interpretation Rules for Random Forests

1 code implementation 13 Apr 2021 Ioannis Mollas, Nick Bassiliades, Grigorios Tsoumakas

LionForests is a random forest-specific interpretation technique, which provides rules as explanations.

Decision Making Multi-class Classification

VisioRed: A Visualisation Tool for Interpretable Predictive Maintenance

1 code implementation 31 Mar 2021 Spyridon Paraschos, Ioannis Mollas, Nick Bassiliades, Grigorios Tsoumakas

The use of machine learning rapidly increases in high-risk scenarios where decisions are required, for example in healthcare or industrial monitoring equipment.

Decision Making Time Series

Improving Zero-Shot Entity Retrieval through Effective Dense Representations

no code implementations 6 Mar 2021 Eleni Partalidou, Despina Christou, Grigorios Tsoumakas

We achieve a new state-of-the-art 84.28% accuracy on top-50 candidates on the Zeshel dataset, compared to the previous 82.06% on the top-64 of (Wu et al., 2020).

Entity Linking Entity Retrieval

Improving Distantly-Supervised Relation Extraction through BERT-based Label & Instance Embeddings

1 code implementation 1 Feb 2021 Despina Christou, Grigorios Tsoumakas

We propose REDSandT (Relation Extraction with Distant Supervision and Transformers), a novel distantly-supervised transformer-based RE method, that manages to capture a wider set of relations through highly informative instance and label embeddings for RE, by exploiting BERT's pre-trained model, and the relationship between labels and entities, respectively.

Relationship Extraction (Distant Supervised)

What is all this new MeSH about? Exploring the semantic provenance of new descriptors in the MeSH thesaurus

1 code implementation 20 Jan 2021 Anastasios Nentidis, Anastasia Krithara, Grigorios Tsoumakas, Georgios Paliouras

To this end, we propose a framework to categorize new descriptors based on their current relation to older descriptors.

Drug-Target Interaction Prediction via an Ensemble of Weighted Nearest Neighbors with Interaction Recovery

1 code implementation 22 Dec 2020 Bin Liu, Konstantinos Pliakos, Celine Vens, Grigorios Tsoumakas

In addition, WkNNIR exploits local imbalance to promote the influence of more reliable similarities on the interaction recovery and prediction processes.

Drug Discovery

From Protocol to Screening: A Hybrid Learning Approach for Technology-Assisted Systematic Literature Reviews

no code implementations 19 Nov 2020 Athanasios Lagopoulos, Grigorios Tsoumakas

We present a novel method for TAR that implements a full pipeline from the research protocol to the screening of the relevant papers.

Learning-To-Rank

Keywords lie far from the mean of all words in local vector space

1 code implementation 21 Aug 2020 Eirini Papagiannopoulou, Grigorios Tsoumakas, Apostolos N. Papadopoulos

Keyword extraction is an important document process that aims at finding a small set of terms that concisely describe a document's topics.

Keyword Extraction

ETHOS: an Online Hate Speech Detection Dataset

1 code implementation 11 Jun 2020 Ioannis Mollas, Zoe Chrysopoulou, Stamatis Karlos, Grigorios Tsoumakas

Online hate speech is a recent problem in our society that is rising at a steady pace by leveraging the vulnerabilities of the corresponding regimes that characterise most social media platforms.

Hate Speech Detection

Beyond MeSH: Fine-Grained Semantic Indexing of Biomedical Literature based on Weak Supervision

1 code implementation 15 May 2020 Anastasios Nentidis, Anastasia Krithara, Grigorios Tsoumakas, Georgios Paliouras

To this end, we propose a new method that uses weak supervision to train a concept annotator on the literature available for a particular disease.

Multi-Label Sampling based on Local Label Imbalance

1 code implementation 7 May 2020 Bin Liu, Konstantinos Blekas, Grigorios Tsoumakas

Experimental results on 13 multi-label datasets demonstrate the effectiveness of the proposed measure and sampling approaches for a variety of evaluation metrics, particularly in the case of an ensemble of classifiers trained on repeated samples of the original data.

Multi-Label Learning

A Divide-and-Conquer Approach to the Summarization of Long Documents

1 code implementation 13 Apr 2020 Alexios Gidiotis, Grigorios Tsoumakas

With this approach we can decompose the problem of long document summarization into smaller and simpler problems, reducing computational complexity and creating more training examples, which at the same time contain less noise in the target summaries compared to the standard approach.

Ranked #6 on Text Summarization on Pubmed (using extra training data)

Document Summarization Sentence Similarity

LionForests: Local Interpretation of Random Forests

no code implementations 20 Nov 2019 Ioannis Mollas, Nick Bassiliades, Ioannis Vlahavas, Grigorios Tsoumakas

Towards a future where machine learning systems will integrate into every aspect of people's lives, researching methods to interpret such systems is necessary, instead of focusing exclusively on enhancing their performance.

LioNets: Local Interpretation of Neural Networks through Penultimate Layer Decoding

1 code implementation 15 Jun 2019 Ioannis Mollas, Nikolaos Bassiliades, Grigorios Tsoumakas

Technological breakthroughs on smart homes, self-driving cars, health care and robotic assistants, in addition to reinforced law regulations, have critically influenced academic research on explainable machine learning.

General Classification Self-Driving Cars

Structured Summarization of Academic Publications

1 code implementation 19 May 2019 Alexios Gidiotis, Grigorios Tsoumakas

We propose SUSIE, a novel summarization method that can work with state-of-the-art summarization models in order to produce structured scientific summaries for academic articles.

A Review of Keyphrase Extraction

2 code implementations 13 May 2019 Eirini Papagiannopoulou, Grigorios Tsoumakas

Keyphrase extraction is a textual information processing task concerned with the automatic extraction of representative and characteristic phrases from a document that express all the key aspects of its content.

Keyphrase Extraction

Synthetic Oversampling of Multi-Label Data based on Local Label Distribution

2 code implementations 2 May 2019 Bin Liu, Grigorios Tsoumakas

Class-imbalance is an inherent characteristic of multi-label data which affects the prediction accuracy of most multi-label learning methods.

Multi-Label Learning

Unsupervised Keyphrase Extraction from Scientific Publications

1 code implementation 10 Aug 2018 Eirini Papagiannopoulou, Grigorios Tsoumakas

It then uses the minimum covariance determinant estimator to model the distribution of non-keyphrase word vectors, under the assumption that these vectors come from the same distribution, indicative of their irrelevance to the semantics expressed by the dimensions of the learned vector representation.

Keyphrase Extraction Outlier Detection +1

Web Robot Detection in Academic Publishing

no code implementations 14 Nov 2017 Athanasios Lagopoulos, Grigorios Tsoumakas, Georgios Papadopoulos

In this paper, we present our approach to detecting web robots in academic publishing websites.

Local Word Vectors Guiding Keyphrase Extraction

1 code implementation 20 Oct 2017 Eirini Papagiannopoulou, Grigorios Tsoumakas

Automated keyphrase extraction is a fundamental textual information processing task concerned with the selection of representative phrases from a document that summarize its content.

Keyphrase Extraction Word Embeddings

Subset Labeled LDA for Large-Scale Multi-Label Classification

no code implementations 16 Sep 2017 Yannis Papanikolaou, Grigorios Tsoumakas

We conduct extensive experiments on eight data sets, with label set sizes ranging from hundreds to hundreds of thousands, comparing our proposed algorithm with the previously proposed LLDA algorithms (Prior-LDA, Dep-LDA), as well as the state of the art in extreme multi-label classification.

Classification Extreme Multi-Label Classification +3

Large-Scale Online Semantic Indexing of Biomedical Articles via an Ensemble of Multi-Label Classification Models

no code implementations 18 Apr 2017 Yannis Papanikolaou, Grigorios Tsoumakas, Manos Laliotis, Nikos Markantonatos, Ioannis Vlahavas

Background: In this paper we present the approaches and methods employed in order to deal with a large scale multi-label semantic indexing task of biomedical papers.

General Classification Multi-Label Classification

Hierarchical Partitioning of the Output Space in Multi-label Data

no code implementations 19 Dec 2016 Yannis Papanikolaou, Ioannis Katakis, Grigorios Tsoumakas

Hierarchy Of Multi-label classifiers (HOMER) is a multi-label learning algorithm that breaks the initial learning task into several easier sub-tasks by first constructing a hierarchy of labels from a given label set and then applying a given base multi-label classifier (MLC) to the resulting sub-problems.

Multi-Label Classification Multi-Label Learning

Dense Distributions from Sparse Samples: Improved Gibbs Sampling Parameter Estimators for LDA

1 code implementation 8 May 2015 Yannis Papanikolaou, James R. Foulds, Timothy N. Rubin, Grigorios Tsoumakas

We introduce a novel approach for estimating Latent Dirichlet Allocation (LDA) parameters from collapsed Gibbs samples (CGS), by leveraging the full conditional distributions over the latent variable assignments to efficiently average over multiple samples, for little more computational cost than drawing a single additional collapsed Gibbs sample.

Multi-Label Classification

Multi-Target Regression via Random Linear Target Combinations

no code implementations 20 Apr 2014 Grigorios Tsoumakas, Eleftherios Spyromitros-Xioufis, Aikaterini Vrekou, Ioannis Vlahavas

Multi-target regression is concerned with the simultaneous prediction of multiple continuous target variables based on the same set of input variables.

General Classification Multi-Label Classification +1

Discovering and Exploiting Entailment Relationships in Multi-Label Learning

no code implementations 15 Apr 2014 Christina Papagiannopoulou, Grigorios Tsoumakas, Ioannis Tsamardinos

Marginal probabilities are entered as soft evidence in the network and adjusted through probabilistic inference.

Multi-Label Learning

Multi-Target Regression via Input Space Expansion: Treating Targets as Inputs

no code implementations 28 Nov 2012 Eleftherios Spyromitros-Xioufis, Grigorios Tsoumakas, William Groves, Ioannis Vlahavas

When the prediction targets are binary, the task is called multi-label classification; when the targets are continuous, it is called multi-target regression.

General Classification Multi-Label Classification +2
