Search Results for author: Silvio Lattanzi

Found 37 papers, 7 papers with code

A Scalable Algorithm for Individually Fair K-means Clustering

1 code implementation 9 Feb 2024 Mohammadhossein Bateni, Vincent Cohen-Addad, Alessandro Epasto, Silvio Lattanzi

We present a scalable algorithm for the individually fair ($p$, $k$)-clustering problem introduced by Jung et al. and Mahabadi et al.

Clustering

A quasi-polynomial time algorithm for Multi-Dimensional Scaling via LP hierarchies

no code implementations 29 Nov 2023 Ainesh Bakshi, Vincent Cohen-Addad, Samuel B. Hopkins, Rajesh Jayaram, Silvio Lattanzi

Multi-dimensional Scaling (MDS) is a family of methods for embedding pair-wise dissimilarities between $n$ objects into low-dimensional space.

Data Visualization

Multi-Swap $k$-Means++

no code implementations 28 Sep 2023 Lorenzo Beretta, Vincent Cohen-Addad, Silvio Lattanzi, Nikos Parotsidis

The $k$-means++ algorithm of Arthur and Vassilvitskii (SODA 2007) is often the practitioners' choice algorithm for optimizing the popular $k$-means clustering objective and is known to give an $O(\log k)$-approximation in expectation.

Clustering

Fully Dynamic Submodular Maximization over Matroids

no code implementations 31 May 2023 Paul Dütting, Federico Fusco, Silvio Lattanzi, Ashkan Norouzi-Fard, Morteza Zadimoghaddam

Maximizing monotone submodular functions under a matroid constraint is a classic algorithmic problem with multiple applications in data mining and machine learning.

On Classification Thresholds for Graph Attention with Edge Features

no code implementations 18 Oct 2022 Kimon Fountoulakis, Dake He, Silvio Lattanzi, Bryan Perozzi, Anton Tsitsulin, Shenghao Yang

In CSBM, the node and edge features are obtained from a mixture of Gaussians and the edges from a stochastic block model.

Classification Graph Attention +2

Active Learning of Classifiers with Label and Seed Queries

no code implementations 8 Sep 2022 Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice, Maximilian Thiessen

In this work we show that, by carefully combining the two types of queries, a binary classifier can be learned in time $\operatorname{poly}(n+m)$ using only $O(m^2 \log n)$ label queries and $O\big(m \log \frac{m}{\gamma}\big)$ seed queries; the result extends to $k$-class classifiers at the price of a $k! k^2$ multiplicative overhead.

Active Learning

Deletion Robust Non-Monotone Submodular Maximization over Matroids

no code implementations 16 Aug 2022 Paul Dütting, Federico Fusco, Silvio Lattanzi, Ashkan Norouzi-Fard, Morteza Zadimoghaddam

Maximizing a submodular function is a fundamental task in machine learning, and in this paper we study the deletion-robust version of the problem under the classic matroid constraint.

Near-Optimal Correlation Clustering with Privacy

no code implementations 2 Mar 2022 Vincent Cohen-Addad, Chenglin Fan, Silvio Lattanzi, Slobodan Mitrović, Ashkan Norouzi-Fard, Nikos Parotsidis, Jakub Tarnawski

Correlation clustering is a central problem in unsupervised learning, with applications spanning community detection, duplicate detection, automated labelling and many more.

Clustering Community Detection

Efficient and Local Parallel Random Walks

no code implementations NeurIPS 2021 Michael Kapralov, Silvio Lattanzi, Navid Nouri, Jakab Tardos

Random walks are a fundamental primitive used in many machine learning algorithms with several applications in clustering and semi-supervised learning.

Clustering

Robust Online Correlation Clustering

no code implementations NeurIPS 2021 Silvio Lattanzi, Benjamin Moseley, Sergei Vassilvitskii, Yuyan Wang, Rudy Zhou

In correlation clustering we are given a set of points along with recommendations whether each pair of points should be placed in the same cluster or into separate clusters.

Clustering

Online Facility Location with Multiple Advice

no code implementations NeurIPS 2021 Matteo Almanza, Flavio Chierichetti, Silvio Lattanzi, Alessandro Panconesi, Giuseppe Re

Clustering is a central topic in unsupervised learning and its online formulation has received a lot of attention in recent years.

Clustering

Parallel and Efficient Hierarchical k-Median Clustering

no code implementations NeurIPS 2021 Vincent Cohen-Addad, Silvio Lattanzi, Ashkan Norouzi-Fard, Christian Sohler, Ola Svensson

In this paper we introduce a new parallel algorithm for the Euclidean hierarchical $k$-median problem that, when using machines with memory $s$ (for $s\in \Omega(\log^2 (n+\Delta+d))$), outputs a hierarchical clustering such that for every fixed value of $k$ the cost of the solution is at most an $O(\min\{d, \log n\} \log \Delta)$ factor larger in expectation than that of an optimal solution.

Clustering

Correlation Clustering in Constant Many Parallel Rounds

no code implementations 15 Jun 2021 Vincent Cohen-Addad, Silvio Lattanzi, Slobodan Mitrović, Ashkan Norouzi-Fard, Nikos Parotsidis, Jakub Tarnawski

Correlation clustering is a central topic in unsupervised learning, with many applications in ML and data mining.

Clustering

On Margin-Based Cluster Recovery with Oracle Queries

no code implementations NeurIPS 2021 Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice

We study an active cluster recovery problem where, given a set of $n$ points and an oracle answering queries like "are these two points in the same cluster?

Exact Recovery of Clusters in Finite Metric Spaces Using Oracle Queries

no code implementations 31 Jan 2021 Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice

Previous results show that clusters in Euclidean spaces that are convex and separated with a margin can be reconstructed exactly using only $O(\log n)$ same-cluster queries, where $n$ is the number of input points.

Spectral Clustering Oracles in Sublinear Time

no code implementations 14 Jan 2021 Grzegorz Gluch, Michael Kapralov, Silvio Lattanzi, Aida Mousavifar, Christian Sohler

The main technical contribution is a sublinear time oracle that provides dot product access to the spectral embedding of $G$ by estimating distributions of short random walks from vertices in $G$.

Data Structures and Algorithms

Fast and Accurate $k$-means++ via Rejection Sampling

no code implementations NeurIPS 2020 Vincent Cohen-Addad, Silvio Lattanzi, Ashkan Norouzi-Fard, Christian Sohler, Ola Svensson

$k$-means++ (Arthur and Vassilvitskii, 2007) is a widely used clustering algorithm that is easy to implement, has nice theoretical guarantees and strong empirical performance.

Clustering

Online MAP Inference of Determinantal Point Processes

no code implementations NeurIPS 2020 Aditya Bhaskara, Amin Karbasi, Silvio Lattanzi, Morteza Zadimoghaddam

In this paper, we provide an efficient approximation algorithm for finding the most likely configuration (MAP) of size $k$ for Determinantal Point Processes (DPP) in the online setting where the data points arrive in an arbitrary order and the algorithm cannot discard the selected elements from its local memory.

Point Processes

On Mean Estimation for Heteroscedastic Random Variables

no code implementations 22 Oct 2020 Luc Devroye, Silvio Lattanzi, Gabor Lugosi, Nikita Zhivotovskiy

We study the problem of estimating the common mean $\mu$ of $n$ independent symmetric random variables with different and unknown standard deviations $\sigma_1 \le \sigma_2 \le \cdots \le\sigma_n$.

InstantEmbedding: Efficient Local Node Representations

no code implementations 14 Oct 2020 Ştefan Postăvaru, Anton Tsitsulin, Filipe Miguel Gonçalves de Almeida, Yingtao Tian, Silvio Lattanzi, Bryan Perozzi

In this paper, we introduce InstantEmbedding, an efficient method for generating single-node representations using local PageRank computations.

Link Prediction Node Classification +1

Sliding Window Algorithms for k-Clustering Problems

1 code implementation NeurIPS 2020 Michele Borassi, Alessandro Epasto, Silvio Lattanzi, Sergei Vassilvitskii, Morteza Zadimoghaddam

The sliding window model of computation captures scenarios in which data is arriving continuously, but only the latest $w$ elements should be used for analysis.

Data Structures and Algorithms

Exact Recovery of Mangled Clusters with Same-Cluster Queries

no code implementations NeurIPS 2020 Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice

Given a finite set of input points, and an oracle revealing whether any two points lie in the same cluster, our goal is to recover all clusters exactly using as few queries as possible.

Clustering

Submodular Streaming in All its Glory: Tight Approximation, Minimum Memory and Low Adaptive Complexity

no code implementations 2 May 2019 Ehsan Kazemi, Marko Mitrovic, Morteza Zadimoghaddam, Silvio Lattanzi, Amin Karbasi

We show how one can achieve the tight $(1/2)$-approximation guarantee with $O(k)$ shared memory while minimizing not only the required rounds of computations but also the total number of communicated bits.

Data Summarization

Mallows Models for Top-k Lists

no code implementations NeurIPS 2018 Flavio Chierichetti, Anirban Dasgupta, Shahrzad Haddadan, Ravi Kumar, Silvio Lattanzi

The classic Mallows model is a widely-used tool to realize distributions on permutations.

Parallel and Streaming Algorithms for K-Core Decomposition

no code implementations ICML 2018 Hossein Esfandiari, Silvio Lattanzi, Vahab Mirrokni

The $k$-core decomposition is a fundamental primitive in many machine learning and data mining applications.

Fair Clustering Through Fairlets

2 code implementations NeurIPS 2017 Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi, Sergei Vassilvitskii

We show that any fair clustering problem can be decomposed into first finding good fairlets, and then using existing machinery for traditional clustering algorithms.

Clustering

Algorithms for $\ell_p$ Low-Rank Approximation

no code implementations ICML 2017 Flavio Chierichetti, Sreenivas Gollapudi, Ravi Kumar, Silvio Lattanzi, Rina Panigrahy, David P. Woodruff

We consider the problem of approximating a given matrix by a low-rank matrix so as to minimize the entrywise $\ell_p$-approximation error, for any $p \geq 1$; the case $p = 2$ is the classical SVD problem.

Consistent k-Clustering

no code implementations ICML 2017 Silvio Lattanzi, Sergei Vassilvitskii

The study of online algorithms and competitive analysis provides a solid foundation for studying the quality of irrevocable decision making when the data arrives in an online manner.

Clustering Decision Making

Community Detection on Evolving Graphs

no code implementations NeurIPS 2016 Aris Anagnostopoulos, Jakub Łącki, Silvio Lattanzi, Stefano Leonardi, Mohammad Mahdian

In many of these applications, the input graph evolves over time in a continual and decentralized manner, and, to maintain a good clustering, the clustering algorithm needs to repeatedly probe the graph.

Clustering Community Detection +3

Local Graph Clustering Beyond Cheeger's Inequality

no code implementations 30 Apr 2013 Zeyuan Allen Zhu, Silvio Lattanzi, Vahab Mirrokni

We also prove that our analysis is tight, and perform empirical evaluation to support our theory on both synthetic and real data.

Clustering Graph Clustering
