Search Results for author: Andrea Pietracaprina

Found 5 papers, 3 papers with code

Distributed k-Means with Outliers in General Metrics

no code implementations • 16 Feb 2022 • Enrico Dandolo, Andrea Pietracaprina, Geppino Pucci

A more general formulation, known as $k$-means with $z$ outliers, was introduced to deal with noisy datasets: it features a further parameter $z$ and allows up to $z$ points of $P$ (outliers) to be disregarded when computing the sum of squared distances of the points from their closest centers.

k-Center Clustering with Outliers in Sliding Windows

1 code implementation • 7 Jan 2022 • Paolo Pellizzoni, Andrea Pietracaprina, Geppino Pucci

We provide efficient algorithms for $k$-center clustering with outliers in the streaming model under the sliding window setting, where, at each time step, the dataset to be clustered is the window $W$ of the most recent data items.

Clustering

Scalable Distributed Approximation of Internal Measures for Clustering Evaluation

1 code implementation • 3 Mar 2020 • Federico Altieri, Andrea Pietracaprina, Geppino Pucci, Fabio Vandin

The experiments provide evidence that, unlike other heuristics, our estimation strategy not only provides tight theoretical guarantees but is also able to return highly accurate estimations while running in a fraction of the time required by the exact computation, and that its distributed implementation is highly scalable, thus enabling the computation of internal measures for very large datasets for which the exact computation is prohibitive.

Clustering

Coreset-based Strategies for Robust Center-type Problems

no code implementations • 18 Feb 2020 • Andrea Pietracaprina, Geppino Pucci, Federico Soldà

Given a dataset $V$ of points from some metric space, the popular $k$-center problem asks for a subset of $k$ points (centers) in $V$ that minimizes the maximum distance of any point of $V$ from its closest center.

Vocal Bursts Type Prediction

MapReduce and Streaming Algorithms for Diversity Maximization in Metric Spaces of Bounded Doubling Dimension

1 code implementation • 18 May 2016 • Matteo Ceccarello, Andrea Pietracaprina, Geppino Pucci, Eli Upfal

Given a dataset of points in a metric space and an integer $k$, a diversity maximization problem requires determining a subset of $k$ points maximizing some diversity objective measure, e.g., the minimum or the average distance between two points in the subset.

Distributed, Parallel, and Cluster Computing
