no code implementations • 3 Aug 2024 • Yanfei Chen, Jinsung Yoon, Devendra Singh Sachan, Qingze Wang, Vincent Cohen-Addad, Mohammadhossein Bateni, Chen-Yu Lee, Tomas Pfister

Recent advances in large language models (LLMs) have enabled autonomous agents with complex reasoning and task-fulfillment capabilities using a wide range of tools.

1 code implementation • 13 Jun 2024 • Vincent Cohen-Addad, Silvio Lattanzi, Andreas Maggiori, Nikos Parotsidis

We study the classic problem of correlation clustering in dynamic node streams.

no code implementations • 7 Jun 2024 • Vincent Cohen-Addad, Tommaso d'Orsi, Aida Mousavifar

We consider the semi-random graph model of [Makarychev, Makarychev and Vijayaraghavan, STOC'12], where, given a random bipartite graph with $\alpha$ edges and an unknown bipartition $(A, B)$ of the vertex set, an adversary can add arbitrary edges inside each community and remove arbitrary edges from the cut $(A, B)$ (i. e. all adversarial changes are \textit{monotone} with respect to the bipartition).

no code implementations • 7 Jun 2024 • Vincent Cohen-Addad, Tommaso d'Orsi, Silvio Lattanzi, Rajai Nasser

Graph clustering is a central topic in unsupervised learning with a multitude of practical applications.

no code implementations • 7 Jun 2024 • Vincent Cohen-Addad, Tommaso d'Orsi, Alessandro Epasto, Vahab Mirrokni, Peilin Zhong

We revisit the input perturbations framework for differential privacy where noise is added to the input $A\in \mathcal{S}$ and the result is then projected back to the space of admissible datasets $\mathcal{S}$.

no code implementations • 27 Feb 2024 • Kyriakos Axiotis, Vincent Cohen-Addad, Monika Henzinger, Sammy Jerome, Vahab Mirrokni, David Saulpic, David Woodruff, Michael Wunder

We study the data selection problem, whose aim is to select a small representative subset of data that can be used to efficiently train a machine learning model.

1 code implementation • 9 Feb 2024 • Mohammadhossein Bateni, Vincent Cohen-Addad, Alessandro Epasto, Silvio Lattanzi

We present a scalable algorithm for the individually fair ($p$, $k$)-clustering problem introduced by Jung et al. and Mahabadi et al.

no code implementations • 29 Nov 2023 • Ainesh Bakshi, Vincent Cohen-Addad, Samuel B. Hopkins, Rajesh Jayaram, Silvio Lattanzi

Multi-dimensional Scaling (MDS) is a family of methods for embedding an $n$-point metric into low-dimensional Euclidean space.

1 code implementation • 28 Sep 2023 • Lorenzo Beretta, Vincent Cohen-Addad, Silvio Lattanzi, Nikos Parotsidis

The $k$-means++ algorithm of Arthur and Vassilvitskii (SODA 2007) is often the practitioners' choice algorithm for optimizing the popular $k$-means clustering objective and is known to give an $O(\log k)$-approximation in expectation.

1 code implementation • 31 Jan 2023 • Jacob Imola, Alessandro Epasto, Mohammad Mahdian, Vincent Cohen-Addad, Vahab Mirrokni

Then, we exhibit a polynomial-time approximation algorithm with $O(|V|^{2. 5}/ \epsilon)$-additive error, and an exponential-time algorithm that meets the lower bound.

no code implementations • 15 Nov 2022 • Vincent Cohen-Addad, Kasper Green Larsen, David Saulpic, Chris Schwiegelshohn, Omar Ali Sheikh-Omar

the Euclidean $k$-median problem) consists of finding $k$ centers such that the sum of squared distances (resp.

1 code implementation • 17 Jun 2022 • Vincent Cohen-Addad, Alessandro Epasto, Silvio Lattanzi, Vahab Mirrokni, Andres Munoz, David Saulpic, Chris Schwiegelshohn, Sergei Vassilvitskii

We study the private $k$-median and $k$-means clustering problem in $d$ dimensional Euclidean space.

no code implementations • 24 May 2022 • Limor Gultchin, Vincent Cohen-Addad, Sophie Giffard-Roisin, Varun Kanade, Frederik Mallmann-Trenn

Among the various aspects of algorithmic fairness studied in recent years, the tension between satisfying both \textit{sufficiency} and \textit{separation} -- e. g. the ratios of positive or negative predictive values, and false positive or false negative rates across groups -- has received much attention.

no code implementations • 11 Apr 2022 • Vincent Cohen-Addad, Hossein Esfandiari, Vahab Mirrokni, Shyam Narayanan

Motivated by data analysis and machine learning applications, we consider the popular high-dimensional Euclidean $k$-median and $k$-means problems.

no code implementations • 2 Mar 2022 • Vincent Cohen-Addad, Chenglin Fan, Silvio Lattanzi, Slobodan Mitrović, Ashkan Norouzi-Fard, Nikos Parotsidis, Jakub Tarnawski

Correlation clustering is a central problem in unsupervised learning, with applications spanning community detection, duplicate detection, automated labelling and many more.

no code implementations • 25 Feb 2022 • Vincent Cohen-Addad, Kasper Green Larsen, David Saulpic, Chris Schwiegelshohn

Given a set of points in a metric space, the $(k, z)$-clustering problem consists of finding a set of $k$ points called centers, such that the sum of distances raised to the power of $z$ of every data point to its closest center is minimized.

no code implementations • 6 Dec 2021 • Amir Abboud, Mohammad Hossein Bateni, Vincent Cohen-Addad, Karthik C. S., Saeed Seddighin

Moreover, we extend one of our hardness results to rule out subquartic algorithms for the well-studied 1-median problem in the edit metric, where given a set of $n$ strings each of length $n$, the goal is to find a string in the set that minimizes the sum of the edit distances to the rest of the strings in the set.

no code implementations • NeurIPS 2021 • Vincent Cohen-Addad, David Saulpic, Chris Schwiegelshohn

Special cases of problem include the well-known Fermat-Weber problem -- or geometric median problem -- where $z = 1$, the mean or centroid where $z=2$, and the Minimum Enclosing Ball problem, where $z = \infty$. We consider these problem in the big data regime. Here, we are interested in sampling as few points as possible such that we can accurately estimate $m$. More specifically, we consider sublinear algorithms as well as coresets for these problems. Sublinear algorithms have a random query access to the $A$ and the goal is to minimize the number of queries. Here, we show that $\tilde{O}(\varepsilon^{-z-3})$ samples are sufficient to achieve a $(1+\varepsilon)$ approximation, generalizing the results from Cohen, Lee, Miller, Pachocki, and Sidford [STOC '16] and Inaba, Katoh, and Imai [SoCG '94] to arbitrary $z$.

no code implementations • NeurIPS 2021 • Vincent Cohen-Addad, Silvio Lattanzi, Ashkan Norouzi-Fard, Christian Sohler, Ola Svensson

In this paper we introduce a new parallel algorithm for the Euclidean hierarchical $k$-median problem that, when using machines with memory $s$ (for $s\in \Omega(\log^2 (n+\Delta+d))$), outputs a hierarchical clustering such that for every fixed value of $k$ the cost of the solution is at most an $O(\min\{d, \log n\} \log \Delta)$ factor larger in expectation than that of an optimal solution.

no code implementations • 21 Nov 2021 • Vincent Cohen-Addad, Karthik C. S, Euiwoong Lee

We then show that together with generalizations of the embedding techniques introduced by Cohen-Addad and Karthik (FOCS '19), JCH implies hardness of approximation results for k-median and k-means in $\ell_p$-metrics for factors which are close to the ones obtained for general metrics.

no code implementations • 15 Jun 2021 • Vincent Cohen-Addad, Silvio Lattanzi, Slobodan Mitrović, Ashkan Norouzi-Fard, Nikos Parotsidis, Jakub Tarnawski

Correlation clustering is a central topic in unsupervised learning, with many applications in ML and data mining.

no code implementations • NeurIPS 2020 • Vincent Cohen-Addad, Silvio Lattanzi, Ashkan Norouzi-Fard, Christian Sohler, Ola Svensson

$k$-means++ \cite{arthur2007k} is a widely used clustering algorithm that is easy to implement, has nice theoretical guarantees and strong empirical performance.

no code implementations • NeurIPS 2020 • Vincent Cohen-Addad, Adrian Kosowski, Frederik Mallmann-Trenn, David Saulpic

A classic problem in machine learning and data analysis is to partition the vertices of a network in such a way that vertices in the same set are densely connected and vertices in different sets are loosely connected.

no code implementations • 30 Sep 2020 • Vincent Cohen-Addad, C. S. Karthik, Euiwoong Lee

In practice and historically, k-means is thought of in a continuous setting, namely where the centers can be located anywhere in the metric space.

no code implementations • ICML 2020 • Vincent Cohen-Addad, Karthik C. S., Guillaume Lagarde

In this paper, we provide a new algorithm which takes as input a set of points $P$ in $\mathbb{R}^d$, and for every $c\ge 1$, runs in time $n^{1+\frac{\rho}{c^2}}$ (for some universal constant $\rho>1$) to output an ultrametric $\Delta$ such that for any two points $u, v$ in $P$, we have $\Delta(u, v)$ is within a multiplicative factor of $5c$ to the distance between $u$ and $v$ in the "best" ultrametric representation of $P$.

1 code implementation • NeurIPS 2019 • Vincent Cohen-Addad, Niklas Oskar D. Hjuler, Nikos Parotsidis, David Saulpic, Chris Schwiegelshohn

This improves over the naive algorithm which consists in recomputing a solution at each time step and that can take up to $O(n^2)$ update time, and $O(n^2)$ total recourse.

no code implementations • NeurIPS 2019 • Amir Abboud, Vincent Cohen-Addad, Hussein Houdrouge

We consider the widely-used average-linkage, single-linkage, and Ward's methods for computing hierarchical clusterings of high-dimensional Euclidean inputs.

no code implementations • 15 Sep 2019 • Vincent Cohen-Addad, Benjamin Guedj, Varun Kanade, Guy Rom

The specific formulation we use is the $k$-means objective: At each time step the algorithm has to maintain a set of k candidate centers and the loss incurred is the squared distance between the new point and the closest center.

no code implementations • NeurIPS 2018 • Vincent Cohen-Addad, Varun Kanade, Frederik Mallmann-Trenn

In this work, we take a different approach, based on the observation that the consistency axiom fails to be satisfied when the “correct” number of clusters changes.

no code implementations • 21 Jun 2018 • Vincent Cohen-Addad, Frederik Mallmann-Trenn, Claire Mathieu

In this paper, we show optimal worst-case query complexity for the \textsc{max},\textsc{threshold-$v$} and \textsc{Top}-$k$ problems.

no code implementations • NeurIPS 2017 • Vincent Cohen-Addad, Varun Kanade, Frederik Mallmann-Trenn

Hiererachical clustering, that is computing a recursive partitioning of a dataset to obtain clusters at increasingly finer granularity is a fundamental problem in data analysis.

no code implementations • 7 Apr 2017 • Vincent Cohen-Addad, Varun Kanade, Frederik Mallmann-Trenn, Claire Mathieu

For similarity-based hierarchical clustering, Dasgupta showed that the divisive sparsest-cut approach achieves an $O(\log^{3/2} n)$-approximation.

no code implementations • 29 Jan 2017 • Vincent Cohen-Addad, Chris Schwiegelshohn

We study the classic $k$-median and $k$-means clustering objectives in the beyond-worst-case scenario.

no code implementations • 7 Apr 2016 • Vincent Cohen-Addad, Varun Kanade

We study online optimization of smoothed piecewise constant functions over the domain [0, 1).

Cannot find the paper you are looking for? You can
Submit a new open access paper.

Contact us on:
hello@paperswithcode.com
.
Papers With Code is a free resource with all data licensed under CC-BY-SA.