Search Results for author: Chris Schwiegelshohn

Found 13 papers, 4 papers with code

Settling Time vs. Accuracy Tradeoffs for Clustering Big Data

1 code implementation • 2 Apr 2024 • Andrew Draganov, David Saulpic, Chris Schwiegelshohn

We study the theoretical and practical runtime limits of k-means and k-median clustering on large datasets.

Optimal Sketching Bounds for Sparse Linear Regression

no code implementations • 5 Apr 2023 • Tung Mai, Alexander Munteanu, Cameron Musco, Anup B. Rao, Chris Schwiegelshohn, David P. Woodruff

For this problem, under the $\ell_2$ norm, we observe an upper bound of $O(k \log (d)/\varepsilon + k\log(k/\varepsilon)/\varepsilon^2)$ rows, showing that sparse recovery is strictly easier to sketch than sparse regression.

regression

Sparse Dimensionality Reduction Revisited

no code implementations • 13 Feb 2023 • Mikael Møller Høgsgaard, Lion Kamma, Kasper Green Larsen, Jelani Nelson, Chris Schwiegelshohn

In this work, we revisit sparse embeddings and identify a loophole in the lower bound.

Dimensionality Reduction

Improved Coresets for Euclidean $k$-Means

no code implementations • 15 Nov 2022 • Vincent Cohen-Addad, Kasper Green Larsen, David Saulpic, Chris Schwiegelshohn, Omar Ali Sheikh-Omar

the Euclidean $k$-median problem) consists of finding $k$ centers such that the sum of squared distances (resp.

An Empirical Evaluation of $k$-Means Coresets

1 code implementation • 3 Jul 2022 • Chris Schwiegelshohn, Omar Ali Sheikh-Omar

Using this benchmark and real-world data sets, we conduct an exhaustive evaluation of the most commonly used coreset algorithms from theory and practice.

Clustering

Scalable Differentially Private Clustering via Hierarchically Separated Trees

1 code implementation • 17 Jun 2022 • Vincent Cohen-Addad, Alessandro Epasto, Silvio Lattanzi, Vahab Mirrokni, Andres Munoz, David Saulpic, Chris Schwiegelshohn, Sergei Vassilvitskii

We study the private $k$-median and $k$-means clustering problem in $d$ dimensional Euclidean space.

Clustering, Dimensionality Reduction

Towards Optimal Lower Bounds for k-median and k-means Coresets

no code implementations • 25 Feb 2022 • Vincent Cohen-Addad, Kasper Green Larsen, David Saulpic, Chris Schwiegelshohn

Given a set of points in a metric space, the $(k, z)$-clustering problem consists of finding a set of $k$ points called centers, such that the sum of distances raised to the power of $z$ of every data point to its closest center is minimized.
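
As a hedged illustration of the objective described above, the cost can be written as follows. The symbols $P$ (the input points), $C$ (the set of centers), and $d$ (the metric) are notational assumptions introduced here and do not appear in the abstract:

$$\mathrm{cost}_z(P, C) = \sum_{p \in P} \min_{c \in C} d(p, c)^z, \qquad |C| = k.$$

Under this notation, $z = 1$ corresponds to $k$-median and $z = 2$ to $k$-means.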

Clustering

Improved Coresets and Sublinear Algorithms for Power Means in Euclidean Spaces

no code implementations • NeurIPS 2021 • Vincent Cohen-Addad, David Saulpic, Chris Schwiegelshohn

Special cases of the problem include the well-known Fermat-Weber problem -- or geometric median problem -- where $z = 1$, the mean or centroid where $z=2$, and the Minimum Enclosing Ball problem, where $z = \infty$. We consider these problems in the big data regime. Here, we are interested in sampling as few points as possible such that we can accurately estimate $m$. More specifically, we consider sublinear algorithms as well as coresets for these problems. Sublinear algorithms have random query access to $A$, and the goal is to minimize the number of queries. Here, we show that $\tilde{O}(\varepsilon^{-z-3})$ samples are sufficient to achieve a $(1+\varepsilon)$ approximation, generalizing the results from Cohen, Lee, Miller, Pachocki, and Sidford [STOC '16] and Inaba, Katoh, and Imai [SoCG '94] to arbitrary $z$.

Algorithms for Fair Team Formation in Online Labour Marketplaces

no code implementations • 14 Feb 2020 • Giorgio Barnabò, Adriano Fazzone, Stefano Leonardi, Chris Schwiegelshohn

In this short paper, we define the Fair Team Formation problem in the following way: given an online labour marketplace where each worker possesses one or more skills, and where all workers are divided into two or more non-overlapping classes (for example, men and women), we want to design an algorithm that is able to find a team with all the skills needed to complete a given task, and that has the same number of people from all classes.

Fairness

Fully Dynamic Consistent Facility Location

1 code implementation • NeurIPS 2019 • Vincent Cohen-Addad, Niklas Oskar D. Hjuler, Nikos Parotsidis, David Saulpic, Chris Schwiegelshohn

This improves over the naive algorithm, which recomputes a solution at each time step and can take up to $O(n^2)$ update time and $O(n^2)$ total recourse.

Clustering

Principal Fairness: Removing Bias via Projections

no code implementations • 31 May 2019 • Aris Anagnostopoulos, Luca Becchetti, Adriano Fazzone, Cristina Menghini, Chris Schwiegelshohn

Reducing hidden bias in the data and ensuring fairness in algorithmic data analysis have recently received significant attention.

Clustering, Fairness

On Coresets for Logistic Regression

no code implementations • NeurIPS 2018 • Alexander Munteanu, Chris Schwiegelshohn, Christian Sohler, David P. Woodruff

For data sets with bounded $\mu(X)$-complexity, we show that a novel sensitivity sampling scheme produces the first provably sublinear $(1\pm\varepsilon)$-coreset.

regression

On the Local Structure of Stable Clustering Instances

no code implementations • 29 Jan 2017 • Vincent Cohen-Addad, Chris Schwiegelshohn

We study the classic $k$-median and $k$-means clustering objectives in the beyond-worst-case scenario.

Clustering
