Search Results for author: Fabrizio Sebastiani

Found 31 papers, 15 papers with code

Explainable Authorship Identification in Cultural Heritage Applications: Analysis of a New Perspective

no code implementations • 3 Nov 2023 • Mattia Setzu, Silvia Corbara, Anna Monreale, Alejandro Moreo, Fabrizio Sebastiani

While a substantial amount of work has recently been devoted to enhancing the performance of computational Authorship Identification (AId) systems, little to no attention has been paid to endowing AId systems with the ability to explain the reasons behind their predictions.

Authorship Attribution Authorship Verification +3

Regularization-Based Methods for Ordinal Quantification

1 code implementation • 13 Oct 2023 • Mirko Bunse, Alejandro Moreo, Fabrizio Sebastiani, Martin Senz

Quantification, i.e., the task of training predictors of the class prevalence values in sets of unlabeled data items, has received increased attention in recent years.

Binary Quantification and Dataset Shift: An Experimental Investigation

1 code implementation • 6 Oct 2023 • Pablo González, Alejandro Moreo, Fabrizio Sebastiani

One finding that results from this investigation is that many existing quantification methods that had been found robust to prior probability shift are not necessarily robust to other types of dataset shift.

Binary Quantification

Same or Different? Diff-Vectors for Authorship Analysis

1 code implementation • 24 Jan 2023 • Silvia Corbara, Alejandro Moreo, Fabrizio Sebastiani

We investigate the effects on authorship identification tasks of a fundamental shift in how to conceive the vectorial representations of documents that are given as input to a supervised learner.

Authorship Attribution Authorship Verification

Multi-Label Quantification

1 code implementation • 15 Nov 2022 • Alejandro Moreo, Manuel Francisco, Fabrizio Sebastiani

While many quantification methods have been proposed in the past for binary problems and, to a lesser extent, single-label multiclass problems, the multi-label setting (i.e., the scenario in which the classes of interest are not mutually exclusive) remains by and large unexplored.

Binary Quantification

Unravelling Interlanguage Facts via Explainable Machine Learning

no code implementations • 2 Aug 2022 • Barbara Berti, Andrea Esuli, Fabrizio Sebastiani

We focus on a different facet of the NLI task, i.e., that of analysing the internals of an NLI classifier trained by an explainable machine learning algorithm, in order to obtain explanations of its classification decisions, with the ultimate goal of gaining insight into which linguistic phenomena "give a speaker's native language away".

BIG-bench Machine Learning Native Language Identification

LeQua@CLEF2022: Learning to Quantify

no code implementations • 22 Nov 2021 • Andrea Esuli, Alejandro Moreo, Fabrizio Sebastiani

LeQua 2022 is a new lab for the evaluation of methods for "learning to quantify" in textual datasets, i.e., for training predictors of the relative frequencies of the classes of interest in sets of unlabelled textual documents.

Binary Quantification Multiclass Quantification

Syllabic Quantity Patterns as Rhythmic Features for Latin Authorship Attribution

1 code implementation • 27 Oct 2021 • Silvia Corbara, Alejandro Moreo, Fabrizio Sebastiani

It is well known that, within the Latin production of written text, peculiar metric schemes were followed not only in poetic compositions, but also in many prose works.

Authorship Attribution

Measuring Fairness Under Unawareness of Sensitive Attributes: A Quantification-Based Approach

1 code implementation • 17 Sep 2021 • Alessandro Fabris, Andrea Esuli, Alejandro Moreo, Fabrizio Sebastiani

In more detail, we show that fairness under unawareness can be cast as a quantification problem and solved with proven methods from the quantification literature.

Fairness

Generalized Funnelling: Ensemble Learning and Heterogeneous Document Embeddings for Cross-Lingual Text Classification

no code implementations • 17 Sep 2021 • Alejandro Moreo, Andrea Pedrotti, Fabrizio Sebastiani

In this ensemble method, 1st-tier classifiers, each working on a different and language-dependent feature space, return a vector of calibrated posterior probabilities (with one dimension for each class) for each document, and the final classification decision is taken by a metaclassifier that uses this vector as its input.

Ensemble Learning Multilabel Text Classification +3

QuaPy: A Python-Based Framework for Quantification

1 code implementation • 18 Jun 2021 • Alejandro Moreo, Andrea Esuli, Fabrizio Sebastiani

prevalence values) of the classes of interest in a sample of unlabelled data.

Model Selection

Re-Assessing the "Classify and Count" Quantification Method

1 code implementation • 4 Nov 2020 • Alejandro Moreo, Fabrizio Sebastiani

This task originated with the observation that "Classify and Count" (CC), the trivial method of obtaining class prevalence estimates, is often a biased estimator, and thus delivers suboptimal quantification accuracy; following this observation, several methods for learning to quantify have been proposed that have been shown to outperform CC.

General Classification Sentiment Analysis +1

Tweet Sentiment Quantification: An Experimental Re-Evaluation

1 code implementation • 4 Nov 2020 • Alejandro Moreo, Fabrizio Sebastiani

It is well known that solving quantification by means of "classify and count" (i.e., by classifying all unlabelled items by means of a standard classifier and counting the items that have been assigned to a given class) is less than optimal in terms of accuracy, and that more accurate quantification methods exist.

Sentiment Analysis Sentiment Classification

MedLatinEpi and MedLatinLit: Two Datasets for the Computational Authorship Analysis of Medieval Latin Texts

no code implementations • 22 Jun 2020 • Silvia Corbara, Alejandro Moreo, Fabrizio Sebastiani, Mirko Tavoni

We present and make available MedLatinEpi and MedLatinLit, two datasets of medieval Latin texts to be used in research on computational authorship analysis.

Authorship Attribution Authorship Verification

SemEval-2016 Task 4: Sentiment Analysis in Twitter

no code implementations • SEMEVAL 2016 • Preslav Nakov, Alan Ritter, Sara Rosenthal, Fabrizio Sebastiani, Veselin Stoyanov

The three new subtasks focus on two variants of the basic "sentiment classification in Twitter" task.

General Classification Sentiment Analysis +1

Word-Class Embeddings for Multiclass Text Classification

2 code implementations • 26 Nov 2019 • Alejandro Moreo, Andrea Esuli, Fabrizio Sebastiani

Pre-trained word embeddings encode general word semantics and lexical regularities of natural language, and have proven useful across many NLP tasks, including word sense disambiguation, machine translation, and sentiment analysis, to name a few.

General Classification Machine Translation +6

Evaluating Variable-Length Multiple-Option Lists in Chatbots and Mobile Search

no code implementations • 25 May 2019 • Pepa Atanasova, Georgi Karadzhov, Yasen Kiprov, Preslav Nakov, Fabrizio Sebastiani

While typically a user would expect a single response at any utterance, a system could also return multiple options for the user to select from, based on different system understandings of the user's intent.

Question Answering

Cross-Lingual Sentiment Quantification

3 code implementations • 16 Apr 2019 • Andrea Esuli, Alejandro Moreo, Fabrizio Sebastiani

Cross-lingual sentiment quantification (and cross-lingual text quantification in general) has never been discussed before in the literature; we establish baseline results for the binary case by combining state-of-the-art quantification methods with methods capable of generating cross-lingual vectorial representations of the source and target documents involved.

Cross-Lingual Sentiment Classification General Classification +2

Building Automated Survey Coders via Interactive Machine Learning

no code implementations • 28 Mar 2019 • Andrea Esuli, Alejandro Moreo, Fabrizio Sebastiani

We will show that, for the same amount of training effort, interactive learning delivers much better coding accuracy than standard "non-interactive" learning.

BIG-bench Machine Learning

Learning to Weight for Text Classification

1 code implementation • 28 Mar 2019 • Alejandro Moreo Fernández, Andrea Esuli, Fabrizio Sebastiani

In information retrieval (IR) and related tasks, term weighting approaches typically consider the frequency of the term in the document and in the collection in order to compute a score reflecting the importance of the term for the document.

General Classification Information Retrieval +3

Funnelling: A New Ensemble Method for Heterogeneous Transfer Learning and its Application to Cross-Lingual Text Classification

1 code implementation • 31 Jan 2019 • Andrea Esuli, Alejandro Moreo, Fabrizio Sebastiani

Funnelling consists of generating a two-tier classification system where all documents, irrespective of language, are classified by the same (2nd-tier) classifier.

Ensemble Learning General Classification +3

Revisiting Distributional Correspondence Indexing: A Python Reimplementation and New Experiments

1 code implementation • 19 Oct 2018 • Alejandro Moreo, Andrea Esuli, Fabrizio Sebastiani

This paper introduces PyDCI, a new implementation of Distributional Correspondence Indexing (DCI) written in Python.

Ranked #2 on Sentiment Analysis on Multi-Domain Sentiment Dataset

Domain Adaptation General Classification +4

Evaluation Measures for Quantification: An Axiomatic Approach

no code implementations • 6 Sep 2018 • Fabrizio Sebastiani

While the scientific community has devoted a lot of attention to devising more accurate quantification methods, it has not devoted much to discussing what properties an evaluation measure for quantification (EMQ) should enjoy, and which EMQs should be adopted as a result.

A Recurrent Neural Network for Sentiment Quantification

1 code implementation • 4 Sep 2018 • Andrea Esuli, Alejandro Moreo Fernández, Fabrizio Sebastiani

Quantification is a supervised learning task that consists in predicting, given a set of classes C and a set D of unlabelled items, the prevalence (or relative frequency) p(c|D) of each class c in C. Quantification can in principle be solved by classifying all the unlabelled items and counting how many of them have been attributed to each class.
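
As a rough illustration of the "classify and count" idea described in this abstract (this is not the authors' code; the keyword-based `toy_classifier` and the sample set `D` are hypothetical stand-ins for a trained classifier and real data), a minimal Python sketch could look like this:

```python
from collections import Counter
from typing import Callable, Dict, Hashable, Iterable, Sequence


def classify_and_count(
    items: Iterable[str],
    classifier: Callable[[str], Hashable],
    classes: Sequence[Hashable],
) -> Dict[Hashable, float]:
    """Estimate the prevalence p(c|D) of each class c by labelling every
    item in D with a classifier and counting the assigned labels."""
    items = list(items)
    if not items:
        raise ValueError("D must contain at least one item")
    counts = Counter(classifier(x) for x in items)
    # Relative frequency of each class among the predicted labels.
    return {c: counts[c] / len(items) for c in classes}


def toy_classifier(text: str) -> str:
    # Hypothetical stand-in for a trained classifier: a keyword rule.
    return "positive" if "good" in text else "negative"


# Hypothetical unlabelled set D.
D = ["good film", "bad film", "good plot", "dull", "good cast"]
prevalences = classify_and_count(D, toy_classifier, ["positive", "negative"])
print(prevalences)
```

The estimate is only as good as the classifier's predicted labels, which is consistent with the observation in the abstracts listed here that this baseline is often a biased estimator.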

Optimizing Non-decomposable Measures with Deep Networks

no code implementations • 31 Jan 2018 • Amartya Sanyal, Pawan Kumar, Purushottam Kar, Sanjay Chawla, Fabrizio Sebastiani

We present a class of algorithms capable of directly training deep neural networks with respect to large families of task-specific performance measures such as the F-measure and the Kullback-Leibler divergence that are structured and non-decomposable.

QCRI at SemEval-2016 Task 4: Probabilistic Methods for Binary and Ordinal Quantification

no code implementations • SEMEVAL 2016 • Giovanni Da San Martino, Wei Gao, Fabrizio Sebastiani

Sentiment Analysis

The Challenge of Sentiment Quantification

no code implementations • WS 2016 • Fabrizio Sebastiani

Sentiment Analysis

Online Optimization Methods for the Quantification Problem

no code implementations • 13 May 2016 • Purushottam Kar, Shuai Li, Harikrishna Narasimhan, Sanjay Chawla, Fabrizio Sebastiani

The estimation of class prevalence, i.e., the fraction of a population that belongs to a certain class, is a very useful tool in data analytics and learning, and finds applications in many domains such as sentiment analysis, epidemiology, etc.

Epidemiology Sentiment Analysis

Utility-Theoretic Ranking for Semi-Automated Text Classification

no code implementations • 2 Mar 2015 • Giacomo Berardi, Andrea Esuli, Fabrizio Sebastiani

Semi-Automated Text Classification (SATC) may be defined as the task of ranking a set $\mathcal{D}$ of automatically labelled textual documents in such a way that, if a human annotator validates (i.e., inspects and corrects where appropriate) the documents in a top-ranked portion of $\mathcal{D}$ with the goal of increasing the overall labelling accuracy of $\mathcal{D}$, the expected increase is maximized.

General Classification text-classification +1

Optimizing Text Quantifiers for Multivariate Loss Functions

no code implementations • 19 Feb 2015 • Andrea Esuli, Fabrizio Sebastiani

We address the problem of quantification, a supervised learning task whose goal is, given a class, to estimate the relative frequency (or prevalence) of the class in a dataset of unlabelled items.

Structured Prediction

On the Effects of Low-Quality Training Data on Information Extraction from Clinical Reports

no code implementations • 19 Feb 2015 • Diego Marcheggiani, Fabrizio Sebastiani

While a lot of work has been devoted to devising learning methods that generate more and more accurate information extractors, no work has been devoted to investigating the effect of the quality of training data on the learning process.
