no code implementations • 3 Dec 2024 • Paul Dütting, Federico Fusco, Silvio Lattanzi, Ashkan Norouzi-Fard, Ola Svensson, Morteza Zadimoghaddam
In this work, we study online submodular maximization, and how the requirement of maintaining a stable solution impacts the approximation.
1 code implementation • 13 Jun 2024 • Vincent Cohen-Addad, Silvio Lattanzi, Andreas Maggiori, Nikos Parotsidis
We study the classic problem of correlation clustering in dynamic node streams.
no code implementations • 7 Jun 2024 • Vincent Cohen-Addad, Tommaso d'Orsi, Silvio Lattanzi, Rajai Nasser
Graph clustering is a central topic in unsupervised learning with a multitude of practical applications.
1 code implementation • 30 May 2024 • Paul Dütting, Federico Fusco, Silvio Lattanzi, Ashkan Norouzi-Fard, Morteza Zadimoghaddam
Maximizing monotone submodular functions under cardinality constraints is a classic optimization task with several applications in data mining and machine learning.
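For readers unfamiliar with the setting, the sketch below shows the textbook greedy baseline for this task on a coverage objective. It is an illustration only, not the algorithm of the paper above; the names `coverage` and `greedy_max_coverage` and the dictionary-of-sets input format are assumptions made for this example. Greedy selection of this kind is known to give a $(1 - 1/e)$-approximation for monotone submodular functions under a cardinality constraint.

```python
from typing import Dict, Hashable, List, Set


def coverage(sets: Dict[Hashable, Set[int]], chosen: List[Hashable]) -> int:
    """Monotone submodular objective: number of items covered by the chosen sets."""
    covered: Set[int] = set()
    for name in chosen:
        covered |= sets[name]
    return len(covered)


def greedy_max_coverage(sets: Dict[Hashable, Set[int]], k: int) -> List[Hashable]:
    """Pick at most k sets, each time taking the one with the largest marginal gain."""
    chosen: List[Hashable] = []
    covered: Set[int] = set()
    for _ in range(min(k, len(sets))):
        best, best_gain = None, 0
        for name, items in sets.items():
            if name in chosen:
                continue
            gain = len(items - covered)  # marginal gain of adding this set
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:
            # No remaining set adds anything new; stop early.
            break
        chosen.append(best)
        covered |= sets[best]
    return chosen
```

With the hypothetical input `{"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1, 7}}` and `k = 2`, the first pick is `"c"` (gain 4) and the second is `"a"` (gain 3).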
1 code implementation • 9 Feb 2024 • Mohammadhossein Bateni, Vincent Cohen-Addad, Alessandro Epasto, Silvio Lattanzi
We present a scalable algorithm for the individually fair ($p$, $k$)-clustering problem introduced by Jung et al. and Mahabadi et al.
no code implementations • 29 Nov 2023 • Ainesh Bakshi, Vincent Cohen-Addad, Samuel B. Hopkins, Rajesh Jayaram, Silvio Lattanzi
Multi-dimensional Scaling (MDS) is a family of methods for embedding an $n$-point metric into low-dimensional Euclidean space.
1 code implementation • 28 Sep 2023 • Lorenzo Beretta, Vincent Cohen-Addad, Silvio Lattanzi, Nikos Parotsidis
The $k$-means++ algorithm of Arthur and Vassilvitskii (SODA 2007) is often the practitioners' choice algorithm for optimizing the popular $k$-means clustering objective and is known to give an $O(\log k)$-approximation in expectation.
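As background, the seeding step of $k$-means++ can be sketched as follows: the first center is drawn uniformly at random, and each later center is drawn with probability proportional to its squared distance from the nearest center chosen so far. This is a minimal sketch of that standard seeding rule, not of the paper's contribution; the function names and the tuple-of-floats point representation are assumptions for illustration.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def sq_dist(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seeding(points: Sequence[Point], k: int, rng: random.Random) -> List[Point]:
    """Choose k initial centers by D^2 sampling."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centers = [rng.choice(points)]
    # d2[i] is the squared distance from points[i] to its nearest chosen center.
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a center; fall back to uniform sampling.
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=d2, k=1)[0])
        d2 = [min(old, sq_dist(p, centers[-1])) for old, p in zip(d2, points)]
    return centers
```

Passing an explicit `random.Random(seed)` keeps the sampling reproducible. The returned centers would normally be handed to Lloyd's iterations, which are omitted here.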
no code implementations • 31 May 2023 • Paul Dütting, Federico Fusco, Silvio Lattanzi, Ashkan Norouzi-Fard, Morteza Zadimoghaddam
Maximizing monotone submodular functions under a matroid constraint is a classic algorithmic problem with multiple applications in data mining and machine learning.
no code implementations • 18 Oct 2022 • Kimon Fountoulakis, Dake He, Silvio Lattanzi, Bryan Perozzi, Anton Tsitsulin, Shenghao Yang
In CSBM the node and edge features are obtained from a mixture of Gaussians and the edges from a stochastic block model.
no code implementations • 8 Sep 2022 • Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice, Maximilian Thiessen
In this work we show that, by carefully combining the two types of queries, a binary classifier can be learned in time $\operatorname{poly}(n+m)$ using only $O(m^2 \log n)$ label queries and $O\big(m \log \frac{m}{\gamma}\big)$ seed queries; the result extends to $k$-class classifiers at the price of a $k! k^2$ multiplicative overhead.
no code implementations • 16 Aug 2022 • Paul Dütting, Federico Fusco, Silvio Lattanzi, Ashkan Norouzi-Fard, Morteza Zadimoghaddam
Maximizing a submodular function is a fundamental task in machine learning, and in this paper we study the deletion-robust version of the problem under the classic matroid constraint.
1 code implementation • 7 Jul 2022 • Oleksandr Ferludin, Arno Eigenwillig, Martin Blais, Dustin Zelle, Jan Pfeifer, Alvaro Sanchez-Gonzalez, Wai Lok Sibon Li, Sami Abu-El-Haija, Peter Battaglia, Neslihan Bulut, Jonathan Halcrow, Filipe Miguel Gonçalves de Almeida, Pedro Gonnet, Liangze Jiang, Parth Kothari, Silvio Lattanzi, André Linhares, Brandon Mayer, Vahab Mirrokni, John Palowitch, Mihir Paradkar, Jennifer She, Anton Tsitsulin, Kevin Villela, Lisa Wang, David Wong, Bryan Perozzi
TensorFlow-GNN (TF-GNN) is a scalable library for Graph Neural Networks in TensorFlow.
1 code implementation • 17 Jun 2022 • Vincent Cohen-Addad, Alessandro Epasto, Silvio Lattanzi, Vahab Mirrokni, Andres Munoz, David Saulpic, Chris Schwiegelshohn, Sergei Vassilvitskii
We study the private $k$-median and $k$-means clustering problem in $d$-dimensional Euclidean space.
no code implementations • 2 Mar 2022 • Vincent Cohen-Addad, Chenglin Fan, Silvio Lattanzi, Slobodan Mitrović, Ashkan Norouzi-Fard, Nikos Parotsidis, Jakub Tarnawski
Correlation clustering is a central problem in unsupervised learning, with applications spanning community detection, duplicate detection, automated labelling and many more.
no code implementations • 31 Jan 2022 • Paul Dütting, Federico Fusco, Silvio Lattanzi, Ashkan Norouzi-Fard, Morteza Zadimoghaddam
Maximizing a monotone submodular function is a fundamental task in machine learning.
no code implementations • NeurIPS 2021 • Vincent Cohen-Addad, Silvio Lattanzi, Ashkan Norouzi-Fard, Christian Sohler, Ola Svensson
In this paper we introduce a new parallel algorithm for the Euclidean hierarchical $k$-median problem that, when using machines with memory $s$ (for $s\in \Omega(\log^2 (n+\Delta+d))$), outputs a hierarchical clustering such that for every fixed value of $k$ the cost of the solution is at most an $O(\min\{d, \log n\} \log \Delta)$ factor larger in expectation than that of an optimal solution.
no code implementations • NeurIPS 2021 • Michael Kapralov, Silvio Lattanzi, Navid Nouri, Jakab Tardos
Random walks are a fundamental primitive used in many machine learning algorithms with several applications in clustering and semi-supervised learning.
no code implementations • NeurIPS 2021 • Silvio Lattanzi, Benjamin Moseley, Sergei Vassilvitskii, Yuyan Wang, Rudy Zhou
In correlation clustering we are given a set of points along with recommendations whether each pair of points should be placed in the same cluster or into separate clusters.
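One common way to score a clustering in this setting is to count disagreements: pairs recommended to be together that end up separated, plus pairs recommended to be apart that end up together. The sketch below computes that count. It assumes a recommendation is available for every pair, and the `labels` and `similar` input formats are hypothetical choices for this example rather than anything taken from the paper.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Hashable


def disagreements(
    labels: Dict[Hashable, int],
    similar: Dict[FrozenSet[Hashable], bool],
) -> int:
    """Count pairs whose clustering contradicts the pairwise recommendation.

    labels[u] is the cluster id of point u; similar[frozenset((u, v))] is True
    when u and v are recommended to share a cluster.
    """
    cost = 0
    for u, v in combinations(labels, 2):
        same_cluster = labels[u] == labels[v]
        if similar[frozenset((u, v))] != same_cluster:
            cost += 1
    return cost
```

For three points where `a`-`b` and `b`-`c` are marked similar but `a`-`c` is not, no clustering has zero disagreements, which shows why the problem is posed as an optimization.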
no code implementations • NeurIPS 2021 • Matteo Almanza, Flavio Chierichetti, Silvio Lattanzi, Alessandro Panconesi, Giuseppe Re
Clustering is a central topic in unsupervised learning and its online formulation has received a lot of attention in recent years.
no code implementations • 15 Jun 2021 • Vincent Cohen-Addad, Silvio Lattanzi, Slobodan Mitrović, Ashkan Norouzi-Fard, Nikos Parotsidis, Jakub Tarnawski
Correlation clustering is a central topic in unsupervised learning, with many applications in ML and data mining.
no code implementations • NeurIPS 2021 • Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice
We study an active cluster recovery problem where, given a set of $n$ points and an oracle answering queries like "are these two points in the same cluster?
no code implementations • 31 Jan 2021 • Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice
Previous results show that clusters in Euclidean spaces that are convex and separated with a margin can be reconstructed exactly using only $O(\log n)$ same-cluster queries, where $n$ is the number of input points.
no code implementations • 14 Jan 2021 • Grzegorz Gluch, Michael Kapralov, Silvio Lattanzi, Aida Mousavifar, Christian Sohler
The main technical contribution is a sublinear time oracle that provides dot product access to the spectral embedding of $G$ by estimating distributions of short random walks from vertices in $G$.
Data Structures and Algorithms
no code implementations • NeurIPS 2020 • Vincent Cohen-Addad, Silvio Lattanzi, Ashkan Norouzi-Fard, Christian Sohler, Ola Svensson
$k$-means++ (Arthur and Vassilvitskii, 2007) is a widely used clustering algorithm that is easy to implement, has nice theoretical guarantees and strong empirical performance.
no code implementations • NeurIPS 2020 • Aditya Bhaskara, Amin Karbasi, Silvio Lattanzi, Morteza Zadimoghaddam
In this paper, we provide an efficient approximation algorithm for finding the most likely (MAP) configuration of size $k$ for Determinantal Point Processes (DPP) in the online setting where the data points arrive in an arbitrary order and the algorithm cannot discard the selected elements from its local memory.
no code implementations • 22 Oct 2020 • Luc Devroye, Silvio Lattanzi, Gabor Lugosi, Nikita Zhivotovskiy
We study the problem of estimating the common mean $\mu$ of $n$ independent symmetric random variables with different and unknown standard deviations $\sigma_1 \le \sigma_2 \le \cdots \le\sigma_n$.
no code implementations • 14 Oct 2020 • Ştefan Postăvaru, Anton Tsitsulin, Filipe Miguel Gonçalves de Almeida, Yingtao Tian, Silvio Lattanzi, Bryan Perozzi
In this paper, we introduce InstantEmbedding, an efficient method for generating single-node representations using local PageRank computations.
1 code implementation • NeurIPS 2020 • Michele Borassi, Alessandro Epasto, Silvio Lattanzi, Sergei Vassilvitskii, Morteza Zadimoghaddam
The sliding window model of computation captures scenarios in which data is arriving continuously, but only the latest $w$ elements should be used for analysis.
Data Structures and Algorithms
no code implementations • NeurIPS 2020 • Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice
Given a finite set of input points, and an oracle revealing whether any two points lie in the same cluster, our goal is to recover all clusters exactly using as few queries as possible.
no code implementations • 2 May 2019 • Ehsan Kazemi, Marko Mitrovic, Morteza Zadimoghaddam, Silvio Lattanzi, Amin Karbasi
We show how one can achieve the tight $(1/2)$-approximation guarantee with $O(k)$ shared memory while minimizing not only the required rounds of computations but also the total number of communicated bits.
no code implementations • NeurIPS 2018 • Flavio Chierichetti, Anirban Dasgupta, Shahrzad Haddadan, Ravi Kumar, Silvio Lattanzi
The classic Mallows model is a widely-used tool to realize distributions on permutations.
no code implementations • ICML 2018 • Hossein Esfandiari, Silvio Lattanzi, Vahab Mirrokni
The $k$-core decomposition is a fundamental primitive in many machine learning and data mining applications.
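For context, the $k$-core of a graph is its maximal subgraph in which every vertex has degree at least $k$, and the core number of a vertex is the largest $k$ for which it belongs to the $k$-core. The sketch below computes core numbers with the standard sequential peeling procedure (repeatedly remove a minimum-degree vertex). It is a simple quadratic-time illustration, not the method of the paper above, and it assumes an undirected simple graph given as a symmetric adjacency dictionary, a format chosen for this example.

```python
from typing import Dict, Hashable, Set


def core_numbers(adj: Dict[Hashable, Set[Hashable]]) -> Dict[Hashable, int]:
    """Return the core number of every vertex by peeling minimum-degree vertices."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core: Dict[Hashable, int] = {}
    k = 0
    while remaining:
        # Remove a vertex of minimum remaining degree.
        v = min(remaining, key=degree.__getitem__)
        # The running maximum of removal degrees is the core number.
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core
```

On a triangle `a`, `b`, `c` with an extra vertex `d` attached only to `a`, the triangle vertices get core number 2 and `d` gets core number 1.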
2 code implementations • NeurIPS 2017 • Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi, Sergei Vassilvitskii
We show that any fair clustering problem can be decomposed into first finding good fairlets, and then using existing machinery for traditional clustering algorithms.
1 code implementation • NeurIPS 2017 • Mohammadhossein Bateni, Soheil Behnezhad, Mahsa Derakhshan, Mohammadtaghi Hajiaghayi, Raimondas Kiveris, Silvio Lattanzi, Vahab Mirrokni
In particular, we propose affinity, a novel hierarchical clustering based on Boruvka's MST algorithm.
no code implementations • 27 Nov 2017 • Olivier Bachem, Mario Lucic, Silvio Lattanzi
Scaling clustering algorithms to massive data sets is a challenging task.
2 code implementations • KDD 2017 • Alessandro Epasto, Silvio Lattanzi, Renato Paes Leme
More precisely, our framework works in two steps: a local ego-net analysis phase, and a global graph partitioning phase.
Ranked #3 on Community Detection on Amazon
no code implementations • ICML 2017 • Flavio Chierichetti, Sreenivas Gollapudi, Ravi Kumar, Silvio Lattanzi, Rina Panigrahy, David P. Woodruff
We consider the problem of approximating a given matrix by a low-rank matrix so as to minimize the entrywise $\ell_p$-approximation error, for any $p \geq 1$; the case $p = 2$ is the classical SVD problem.
no code implementations • ICML 2017 • Silvio Lattanzi, Sergei Vassilvitskii
The study of online algorithms and competitive analysis provides a solid foundation for studying the quality of irrevocable decision making when the data arrives in an online manner.
no code implementations • NeurIPS 2016 • Aris Anagnostopoulos, Jakub Łącki, Silvio Lattanzi, Stefano Leonardi, Mohammad Mahdian
In many of these applications, the input graph evolves over time in a continual and decentralized manner, and, to maintain a good clustering, the clustering algorithm needs to repeatedly probe the graph.
no code implementations • NeurIPS 2014 • Mohammadhossein Bateni, Aditya Bhaskara, Silvio Lattanzi, Vahab Mirrokni
Large-scale clustering of data points in metric spaces is an important problem in mining big data sets.
no code implementations • 30 Apr 2013 • Zeyuan Allen Zhu, Silvio Lattanzi, Vahab Mirrokni
We also prove that our analysis is tight, and perform empirical evaluation to support our theory on both synthetic and real data.