1 code implementation • LREC 2022 • Tatiana Passali, Thanassis Mavropoulos, Grigorios Tsoumakas, Georgios Meditskos, Stefanos Vrochidis
In addition, we release a new large-scale dataset with disfluencies that can be used on four different tasks: disfluency detection, classification, extraction, and correction.
no code implementations • EACL (HCINLP) 2021 • Tatiana Passali, Alexios Gidiotis, Efstathios Chatzikyriakidis, Grigorios Tsoumakas
In order to overcome these issues, we reconsider the task of summarization from a human-centered perspective.
no code implementations • NLP4DH (ICON) 2021 • Fotini Koidaki, Despina Christou, Katerina Tiktopoulou, Grigorios Tsoumakas
How may the construction of national consciousness be captured in the literary production of a whole century?
1 code implementation • NAACL (sdp) 2021 • Chrysovalantis Giorgos Kontoulis, Eirini Papagiannopoulou, Grigorios Tsoumakas
Automatically extracting keyphrases from scholarly documents leads to a valuable concise representation that humans can understand and machines can process for tasks such as information retrieval, article clustering and article classification.
no code implementations • NAACL (TextGraphs) 2021 • Eirini Papagiannopoulou, Grigorios Tsoumakas, Apostolos Papadopoulos
This work revisits the information given by the graph-of-words and its typical utilization through graph-based ranking approaches in the context of keyword extraction.
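To make the graph-of-words idea concrete, here is a minimal sketch, not the method of the paper. It assumes a sliding co-occurrence window and ranks words by node degree, one of the simplest graph-based scores. The function names, the window size, and the toy sentence are all illustrative assumptions.

```python
from collections import defaultdict


def graph_of_words(tokens, window=2):
    """Link every word to the words that follow it within `window` positions."""
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window + 1]:
            if other != word:  # skip self-loops
                neighbors[word].add(other)
                neighbors[other].add(word)
    return neighbors


def rank_by_degree(neighbors):
    """Order words by number of distinct neighbours, ties broken alphabetically."""
    return sorted(neighbors, key=lambda w: (-len(neighbors[w]), w))


# Hypothetical toy input; real pipelines would also filter stopwords.
tokens = "graph based ranking of words in a graph of words".split()
graph = graph_of_words(tokens, window=2)
ranking = rank_by_degree(graph)
print(ranking[:3])
```

Degree is only a stand-in here; other centrality scores such as PageRank could be swapped in at the ranking step without changing the graph construction.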
no code implementations • EMNLP (sdp) 2020 • Alexios Gidiotis, Stefanos Stefanidis, Grigorios Tsoumakas
We present the systems we submitted for the shared tasks of the Workshop on Scholarly Document Processing at EMNLP 2020.
1 code implementation • 2 Mar 2025 • Petros Stylianos Giouroukis, Alexios Gidiotis, Grigorios Tsoumakas
Active learning methods typically prioritize either uncertainty or diversity but have shown limited effectiveness in summarization, often being outperformed by random sampling.
1 code implementation • 21 Dec 2024 • Ao Zhou, Bin Liu, Jin Wang, Grigorios Tsoumakas
The accuracy of deep neural networks is significantly influenced by the effectiveness of mini-batch construction during training.
no code implementations • Association for Computational Linguistics 2024 • Loukritia Stefanou, Tatiana Passali, Grigorios Tsoumakas
The BioLaySumm 2024 shared task at the ACL 2024 BioNLP workshop aims to transform biomedical research articles into lay summaries suitable for a broad audience, including children.
no code implementations • 27 Mar 2024 • Ao Zhou, Bin Liu, Jin Wang, Grigorios Tsoumakas
However, the intrinsic class imbalance in multi-label data may bias the model towards majority labels, since samples relevant to minority labels may be underrepresented in each mini-batch.
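The underrepresentation effect can be illustrated with a small sketch. It assumes a toy binary label matrix and sequential mini-batches; the data, the batch size, and the helper name `label_counts` are hypothetical and not taken from the paper.

```python
def label_counts(batch):
    """Count the positive samples of each label in a batch of binary label vectors."""
    n_labels = len(batch[0])
    return [sum(row[j] for row in batch) for j in range(n_labels)]


# Toy data: 8 samples, 3 labels. Label 0 is the majority, label 2 the minority.
labels = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
]

batch_size = 4
batches = [labels[i : i + batch_size] for i in range(0, len(labels), batch_size)]
per_batch = [label_counts(batch) for batch in batches]

for index, counts in enumerate(per_batch):
    print(f"batch {index}: positives per label = {counts}")
```

In this toy split the first mini-batch contains no positive sample for the minority label at all, so a gradient step on that batch carries no positive signal for it.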
no code implementations • 8 Dec 2023 • Tatiana Passali, Efstathios Chatzikyriakidis, Stelios Andreadis, Thanos G. Stavropoulos, Anastasia Matonaki, Anestis Fachantidis, Grigorios Tsoumakas
This survey, conducted using the PRISMA guidelines, systematically reviews two main strategies for addressing the issue of long sentences: a) sentence compression and b) sentence splitting.
1 code implementation • 29 Mar 2023 • Avraam Bardos, Nikolaos Mylonas, Ioannis Mollas, Grigorios Tsoumakas
Although model-agnostic techniques exist for multi-target regression, specific techniques tailored to random forest models are not available.
1 code implementation • 25 Feb 2023 • Georgios Kamtziridis, Dimitris Vrakas, Grigorios Tsoumakas
Real estate markets depend on various methods to predict housing prices, including models that have been trained on datasets of residential or commercial properties.
2 code implementations • 23 Jan 2023 • Anastasios Nentidis, Thomas Chatzopoulos, Anastasia Krithara, Grigorios Tsoumakas, Georgios Paliouras
Conclusion: The results suggest that concept occurrence is a strong heuristic for refining the coarse-grained labels at the level of MeSH concepts and the proposed method improves it further.
1 code implementation • 7 Dec 2022 • Ioannis Mollas, Nick Bassiliades, Grigorios Tsoumakas
As a result, the demand for a selection tool, a meta-explanation technique based on a high-quality evaluation metric, is apparent.
1 code implementation • 1 Dec 2022 • Bin Liu, Jin Wang, Kaiwei Sun, Grigorios Tsoumakas
Recently, with the availability of abundant heterogeneous biological information from diverse data sources, computational methods have been able to leverage multiple drug and target similarities to boost the performance of DTI prediction.
no code implementations • 22 Sep 2022 • Nikolaos Mylonas, Ioannis Mollas, Grigorios Tsoumakas
Transformers are widely used in natural language processing, where they consistently achieve state-of-the-art performance.
1 code implementation • 5 Jul 2022 • Nikolaos Mylonas, Ioannis Mollas, Nick Bassiliades, Grigorios Tsoumakas
Random Forest falls short on this property, especially when a large number of tree predictors are used.
1 code implementation • 9 Jun 2022 • Tatiana Passali, Grigorios Tsoumakas
Topic-controllable summarization is an emerging research area with a wide range of potential applications.
1 code implementation • 29 Apr 2022 • Avraam Bardos, Ioannis Mollas, Nick Bassiliades, Grigorios Tsoumakas
Dimensionality reduction (DR) is a popular method for preparing and analyzing high-dimensional data.
1 code implementation • 24 Jan 2022 • Bin Liu, Dimitrios Papadopoulos, Fragkiskos D. Malliaros, Grigorios Tsoumakas, Apostolos N. Papadopoulos
Moreover, the validation of highly ranked non-interacting pairs also demonstrates the potential of MDMF2A to discover novel DTIs.
no code implementations • 9 Oct 2021 • Alexios Gidiotis, Grigorios Tsoumakas
Bayesian Active Learning has had a significant impact on various NLP problems, but its application to text summarization has been explored very little.
1 code implementation • 8 Jul 2021 • Argyrios Vartholomaios, Stamatis Karlos, Eleftherios Kouloumpris, Grigorios Tsoumakas
Energy production using renewable sources exhibits inherent uncertainties due to their intermittent nature.
no code implementations • 1 Jun 2021 • Anastasios Nentidis, Anastasia Krithara, Grigorios Tsoumakas, Georgios Paliouras
In this document, we report an analysis of the Public MeSH Note field of the new descriptors introduced in the MeSH thesaurus between 2006 and 2020.
no code implementations • Findings (ACL) 2022 • Alexios Gidiotis, Grigorios Tsoumakas
We explore the notion of uncertainty in the context of modern abstractive summarization models, using the tools of Bayesian Deep Learning.
1 code implementation • 1 May 2021 • Bin Liu, Grigorios Tsoumakas
In drug discovery, identifying drug-target interactions (DTIs) via experimental approaches is a tedious and expensive procedure.
1 code implementation • 13 Apr 2021 • Ioannis Mollas, Nick Bassiliades, Grigorios Tsoumakas
LionForests is a random forest-specific interpretation technique, which provides rules as explanations.
2 code implementations • 13 Apr 2021 • Ioannis Mollas, Nick Bassiliades, Grigorios Tsoumakas
Artificial Intelligence (AI) has a tremendous impact on the unexpected growth of technology in almost every aspect.
1 code implementation • 31 Mar 2021 • Spyridon Paraschos, Ioannis Mollas, Nick Bassiliades, Grigorios Tsoumakas
The use of machine learning rapidly increases in high-risk scenarios where decisions are required, for example in healthcare or industrial monitoring equipment.
no code implementations • 6 Mar 2021 • Eleni Partalidou, Despina Christou, Grigorios Tsoumakas
We achieve a new state-of-the-art 84.28% accuracy on top-50 candidates on the Zeshel dataset, compared to the previous 82.06% on the top-64 of (Wu et al., 2020).
1 code implementation • 1 Feb 2021 • Despina Christou, Grigorios Tsoumakas
We propose REDSandT (Relation Extraction with Distant Supervision and Transformers), a novel distantly-supervised transformer-based RE method, that manages to capture a wider set of relations through highly informative instance and label embeddings for RE, by exploiting BERT's pre-trained model, and the relationship between labels and entities, respectively.
Ranked #2 on Relationship Extraction (Distant Supervised) on New York Times Corpus (AUC metric)
1 code implementation • 20 Jan 2021 • Anastasios Nentidis, Anastasia Krithara, Grigorios Tsoumakas, Georgios Paliouras
To this end, we propose a framework to categorize new descriptors based on their current relation to older descriptors.
1 code implementation • 22 Dec 2020 • Bin Liu, Konstantinos Pliakos, Celine Vens, Grigorios Tsoumakas
In addition, WkNNIR exploits local imbalance to promote the influence of more reliable similarities on the interaction recovery and prediction processes.
no code implementations • 19 Nov 2020 • Athanasios Lagopoulos, Grigorios Tsoumakas
We present a novel method for TAR that implements a full pipeline from the research protocol to the screening of the relevant papers.
1 code implementation • 15 Oct 2020 • Ioannis Mollas, Nick Bassiliades, Grigorios Tsoumakas
Lack of evaluation and selection criteria also makes it difficult for the end user to choose the most suitable technique.
1 code implementation • 21 Aug 2020 • Eirini Papagiannopoulou, Grigorios Tsoumakas, Apostolos N. Papadopoulos
Keyword extraction is an important document process that aims at finding a small set of terms that concisely describe a document's topics.
1 code implementation • 11 Jun 2020 • Ioannis Mollas, Zoe Chrysopoulou, Stamatis Karlos, Grigorios Tsoumakas
Online hate speech is a recent problem in our society that is rising at a steady pace by leveraging the vulnerabilities of the corresponding regimes that characterise most social media platforms.
Ranked #1 on Hate Speech Detection on Ethos MultiLabel
1 code implementation • 15 May 2020 • Anastasios Nentidis, Anastasia Krithara, Grigorios Tsoumakas, Georgios Paliouras
To this end, we propose a new method that uses weak supervision to train a concept annotator on the literature available for a particular disease.
1 code implementation • 7 May 2020 • Bin Liu, Konstantinos Blekas, Grigorios Tsoumakas
Experimental results on 13 multi-label datasets demonstrate the effectiveness of the proposed measure and sampling approaches for a variety of evaluation metrics, particularly in the case of an ensemble of classifiers trained on repeated samples of the original data.
1 code implementation • 13 Apr 2020 • Alexios Gidiotis, Grigorios Tsoumakas
With this approach we can decompose the problem of long document summarization into smaller and simpler problems, reducing computational complexity and creating more training examples, which at the same time contain less noise in the target summaries compared to the standard approach.
Ranked #14 on Text Summarization on Pubmed (using extra training data)
no code implementations • 20 Nov 2019 • Ioannis Mollas, Nick Bassiliades, Ioannis Vlahavas, Grigorios Tsoumakas
Towards a future where machine learning systems will integrate into every aspect of people's lives, researching methods to interpret such systems is necessary, instead of focusing exclusively on enhancing their performance.
1 code implementation • 15 Jun 2019 • Ioannis Mollas, Nikolaos Bassiliades, Grigorios Tsoumakas
Technological breakthroughs in smart homes, self-driving cars, health care and robotic assistants, in addition to stricter legal regulations, have critically influenced academic research on explainable machine learning.
1 code implementation • 19 May 2019 • Alexios Gidiotis, Grigorios Tsoumakas
We propose SUSIE, a novel summarization method that can work with state-of-the-art summarization models in order to produce structured scientific summaries for academic articles.
2 code implementations • 13 May 2019 • Eirini Papagiannopoulou, Grigorios Tsoumakas
Keyphrase extraction is a textual information processing task concerned with the automatic extraction of representative and characteristic phrases from a document that express all the key aspects of its content.
2 code implementations • 2 May 2019 • Bin Liu, Grigorios Tsoumakas
Class-imbalance is an inherent characteristic of multi-label data which affects the prediction accuracy of most multi-label learning methods.
1 code implementation • 10 Aug 2018 • Eirini Papagiannopoulou, Grigorios Tsoumakas
It then uses the minimum covariance determinant estimator to model the distribution of non-keyphrase word vectors, under the assumption that these vectors come from the same distribution, indicative of their irrelevance to the semantics expressed by the dimensions of the learned vector representation.
no code implementations • 30 Jul 2018 • Bin Liu, Grigorios Tsoumakas
Class imbalance is an intrinsic characteristic of multi-label data.
no code implementations • 14 Nov 2017 • Athanasios Lagopoulos, Grigorios Tsoumakas, Georgios Papadopoulos
In this paper, we present our approach to detecting web robots on academic publishing websites.
1 code implementation • 20 Oct 2017 • Eirini Papagiannopoulou, Grigorios Tsoumakas
Automated keyphrase extraction is a fundamental textual information processing task concerned with the selection of representative phrases from a document that summarize its content.
no code implementations • 16 Sep 2017 • Yannis Papanikolaou, Grigorios Tsoumakas
We conduct extensive experiments on eight data sets, with label set sizes ranging from hundreds to hundreds of thousands, comparing our proposed algorithm with the previously proposed LLDA algorithms (Prior-LDA, Dep-LDA), as well as the state of the art in extreme multi-label classification.
no code implementations • 18 Apr 2017 • Yannis Papanikolaou, Grigorios Tsoumakas, Manos Laliotis, Nikos Markantonatos, Ioannis Vlahavas
Background: In this paper we present the approaches and methods employed to address a large-scale multi-label semantic indexing task of biomedical papers.
no code implementations • 19 Dec 2016 • Yannis Papanikolaou, Ioannis Katakis, Grigorios Tsoumakas
Hierarchy Of Multi-label classifiers (HOMER) is a multi-label learning algorithm that breaks the initial learning task into several easier sub-tasks by first constructing a hierarchy of labels from a given label set and then applying a given base multi-label classifier (MLC) to the resulting sub-problems.
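The two-step idea, building a label hierarchy and then defining one sub-problem per node, can be sketched as follows. This is only an illustration under stated assumptions: the even split of the label list is a placeholder and not the partitioning used by the authors, no base classifier is trained, and the names `build_label_hierarchy` and `meta_label` are hypothetical.

```python
def build_label_hierarchy(labels, k=2):
    """Recursively split a label list into at most k child groups until single labels remain."""
    if k < 2:
        raise ValueError("k must be at least 2 so that every split shrinks the label set")
    node = {"labels": list(labels), "children": []}
    if len(labels) <= 1:
        return node  # leaf: a single original label
    size = -(-len(labels) // k)  # ceiling division: labels per child
    for start in range(0, len(labels), size):
        node["children"].append(build_label_hierarchy(labels[start : start + size], k))
    return node


def meta_label(sample_labels, node):
    """A sample is positive for a node if it carries any label grouped under that node."""
    return any(label in node["labels"] for label in sample_labels)


# Hypothetical label set, used only for illustration.
tree = build_label_hierarchy(["sports", "politics", "music", "film"], k=2)
left, right = tree["children"]
print(left["labels"], right["labels"])
print(meta_label({"music"}, left), meta_label({"music"}, right))
```

In a full pipeline, each internal node would hold a base multi-label classifier whose targets are the meta-labels of its children, so every classifier sees only a handful of targets instead of the whole label set.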
1 code implementation • 8 May 2015 • Yannis Papanikolaou, James R. Foulds, Timothy N. Rubin, Grigorios Tsoumakas
We introduce a novel approach for estimating Latent Dirichlet Allocation (LDA) parameters from collapsed Gibbs samples (CGS), by leveraging the full conditional distributions over the latent variable assignments to efficiently average over multiple samples, for little more computational cost than drawing a single additional collapsed Gibbs sample.
no code implementations • 20 Apr 2014 • Grigorios Tsoumakas, Eleftherios Spyromitros-Xioufis, Aikaterini Vrekou, Ioannis Vlahavas
Multi-target regression is concerned with the simultaneous prediction of multiple continuous target variables based on the same set of input variables.
no code implementations • 15 Apr 2014 • Christina Papagiannopoulou, Grigorios Tsoumakas, Ioannis Tsamardinos
Marginal probabilities are entered as soft evidence in the network and adjusted through probabilistic inference.
no code implementations • 28 Nov 2012 • Eleftherios Spyromitros-Xioufis, Grigorios Tsoumakas, William Groves, Ioannis Vlahavas
When the prediction targets are binary the task is called multi-label classification, while when the targets are continuous the task is called multi-target regression.
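The distinction can be shown with a tiny sketch that inspects a target matrix. The helper `task_type` and the toy targets are illustrative assumptions. The check is purely value-based, so continuous targets that happen to contain only 0 and 1 would be misread as binary.

```python
def task_type(targets):
    """Name the multi-output task implied by the values found in a target matrix."""
    values = {value for row in targets for value in row}
    if values <= {0, 1}:
        return "multi-label classification"
    return "multi-target regression"


# Each row holds the targets of one sample; each column is one target variable.
binary_targets = [[1, 0, 1], [0, 0, 1]]
continuous_targets = [[0.5, 2.3, 7.1], [1.0, 0.1, 3.4]]

print(task_type(binary_targets))
print(task_type(continuous_targets))
```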