no code implementations • 18 Dec 2023 • Lucas Rosenblatt, Julia Stoyanovich, Christopher Musco
Our theoretical results center on the private mean estimation problem, while our empirical results demonstrate, through extensive experiments on private data synthesis, the effectiveness of stratification for a variety of private mechanisms.
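As a toy illustration of stratified private mean estimation (not the paper's mechanism), the sketch below computes a Laplace-noised mean per stratum and combines them by stratum size; the function name and parameters are ours:

```python
import numpy as np

def private_stratified_mean(strata, epsilon, lo=0.0, hi=1.0):
    """Combine per-stratum Laplace-noised means, weighted by stratum size.

    Each stratum holds values clipped to [lo, hi]; the mean of n such
    values has sensitivity (hi - lo) / n, so Laplace noise at that scale
    gives epsilon-DP per stratum (strata are disjoint). Toy sketch only.
    """
    rng = np.random.default_rng(0)
    total = sum(len(s) for s in strata)
    est = 0.0
    for s in strata:
        s = np.clip(np.asarray(s, dtype=float), lo, hi)
        scale = (hi - lo) / (len(s) * epsilon)
        noisy_mean = s.mean() + rng.laplace(scale=scale)
        est += (len(s) / total) * noisy_mean
    return est

strata = [np.full(500, 0.2), np.full(500, 0.8)]
est = private_stratified_mean(strata, epsilon=1.0)
print(est)  # close to the true overall mean of 0.5
```

With 500 points per stratum the noise scale is only 0.002, so the private estimate lands very near the true mean; stratification pays off when per-stratum means differ but within-stratum variance is low.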
no code implementations • 8 Oct 2023 • Atsushi Shimizu, Xiaoou Cheng, Christopher Musco, Jonathan Weare
We show how to obtain improved active learning methods in the agnostic (adversarial noise) setting by combining marginal leverage score sampling with non-independent sampling strategies that promote spatial coverage.
no code implementations • 2 Jul 2023 • Yujia Jin, Christopher Musco, Aaron Sidford, Apoorv Vikram Singh
We study lower bounds for the problem of approximating a one-dimensional distribution given (noisy) measurements of its moments.
no code implementations • 30 May 2023 • Xinyu Luo, Christopher Musco, Cas Widdershoven
There has been particular interest in efficient methods for solving the problem when $D$ is represented as a mixture model or kernel density estimate, although few algorithmic results with worst-case approximation and runtime guarantees are known.
no code implementations • 24 Oct 2022 • Aarshvi Gajjar, Chinmay Hegde, Christopher Musco
Namely, we can collect samples via statistical \emph{leverage score sampling}, which has been shown to be near-optimal in other active learning scenarios.
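A minimal sketch of leverage score sampling for active linear regression: leverage scores come from the thin SVD of the design matrix, sampled rows are reweighted so the subsampled problem is an unbiased sketch of the full one. This illustrates the general technique, not the paper's specific algorithm:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 2000, 5
X = rng.standard_normal((n, d))
y = X @ np.arange(1.0, d + 1) + 0.01 * rng.standard_normal(n)

# Leverage score of row i is the squared norm of row i of U, where
# X = U S V^T is a thin SVD; scores sum to d.
U, _, _ = np.linalg.svd(X, full_matrices=False)
scores = (U ** 2).sum(axis=1)
probs = scores / scores.sum()

# Sample m rows with probability proportional to leverage and rescale
# each sampled row by 1 / sqrt(m * p_i), so that the sketched normal
# equations match the full ones in expectation.
m = 200
idx = rng.choice(n, size=m, p=probs)
w = 1.0 / np.sqrt(m * probs[idx])
beta = np.linalg.lstsq(X[idx] * w[:, None], y[idx] * w, rcond=None)[0]
print(beta)  # close to [1, 2, 3, 4, 5]
```

Only the m sampled labels y[idx] are ever queried, which is the "active" aspect: label cost scales with m, not n.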
no code implementations • 9 Nov 2021 • Cameron Musco, Christopher Musco, David P. Woodruff, Taisuke Yasuda
By combining this with our techniques for $\ell_p$ regression, we obtain an active regression algorithm making $\tilde O(d^{1+\max\{1, p/2\}}/\mathrm{poly}(\epsilon))$ queries for such loss functions, including the Tukey and Huber losses, answering another question of [CD21].
no code implementations • 7 Apr 2021 • Aécio Santos, Aline Bessa, Fernando Chirigati, Christopher Musco, Juliana Freire
The increasing availability of structured datasets, from Web tables and open-data portals to enterprise data, opens up opportunities to enrich analytics and improve machine learning models through relational data augmentation.
no code implementations • NeurIPS 2020 • Raphael Meyer, Christopher Musco
This paper studies the statistical complexity of kernel hyperparameter tuning in the setting of active regression under adversarial noise.
1 code implementation • 19 Oct 2020 • Raphael A. Meyer, Cameron Musco, Christopher Musco, David P. Woodruff
This improves on the ubiquitous Hutchinson's estimator, which requires $O(1/\epsilon^2)$ matrix-vector products.
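For reference, Hutchinson's estimator itself is a few lines: average Rademacher quadratic forms g^T A g, touching A only through matrix-vector products. A minimal sketch:

```python
import numpy as np

def hutchinson_trace(matvec, n, num_queries, rng):
    """Estimate tr(A) from matrix-vector products alone.

    For a Rademacher vector g (i.i.d. +/-1 entries), E[g^T A g] = tr(A),
    so averaging num_queries independent quadratic forms gives an
    unbiased estimate with error shrinking like 1/sqrt(num_queries).
    """
    total = 0.0
    for _ in range(num_queries):
        g = rng.choice([-1.0, 1.0], size=n)
        total += g @ matvec(g)
    return total / num_queries

rng = np.random.default_rng(0)
B = rng.standard_normal((100, 100))
A = B @ B.T  # the estimator only ever sees A through A @ v
est = hutchinson_trace(lambda v: A @ v, 100, 500, rng)
print(est, np.trace(A))
```

The $O(1/\epsilon^2)$ query count in the abstract is exactly this averaging rate; the paper's improvement replaces plain averaging with a variance-reduced scheme.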
no code implementations • 4 Aug 2020 • Arun Jambulapati, Jerry Li, Christopher Musco, Aaron Sidford, Kevin Tian
In this paper, we revisit the decades-old problem of how to best improve $\mathbf{A}$'s condition number by left or right diagonal rescaling.
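The classical baseline for this problem is Jacobi (symmetric diagonal) rescaling, which the sketch below demonstrates; it is a standard heuristic, not the optimal rescaling the paper studies:

```python
import numpy as np

rng = np.random.default_rng(0)
# An SPD matrix with wildly mismatched row/column scales.
d = np.array([1.0, 1e3, 1e6])
B = rng.standard_normal((3, 3))
M = B @ B.T + 3.0 * np.eye(3)      # well-conditioned SPD core
A = np.diag(d) @ M @ np.diag(d)    # badly scaled version of M

# Jacobi rescaling: D = diag(A)^{-1/2}, so that D A D has unit
# diagonal. The scaling artifacts in d cancel, leaving only the
# conditioning of the underlying core matrix.
D = np.diag(1.0 / np.sqrt(np.diag(A)))
print(np.linalg.cond(A), np.linalg.cond(D @ A @ D))
```

Here the condition number drops by many orders of magnitude; the paper's question is how close such diagonal rescalings can get to the best achievable condition number.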
no code implementations • 22 Jun 2020 • Prathamesh Dharangutte, Christopher Musco
Our main contribution is an efficient algorithm for \emph{inverse landscape genetics}, which is the task of inferring this graph from measurements of genetic similarity at different locations (graph nodes).
no code implementations • 14 Jun 2020 • Raphael A. Meyer, Christopher Musco
This paper studies the statistical complexity of kernel hyperparameter tuning in the setting of active regression under adversarial noise.
no code implementations • NeurIPS 2020 • Tamás Erdélyi, Cameron Musco, Christopher Musco
Bounding Fourier sparse leverage scores under various measures is of pure mathematical interest in approximation theory, and our work extends existing results for the uniform measure [Erd17, CP19a].
no code implementations • 17 Apr 2020 • Cameron Musco, Christopher Musco
In this note we illustrate how common matrix approximation methods, such as random projection and random sampling, yield projection-cost-preserving sketches, as introduced in [FSS13, CEM+15].
no code implementations • 14 May 2019 • Yonina C. Eldar, Jerry Li, Cameron Musco, Christopher Musco
In addition to results that hold for any Toeplitz $T$, we further study the important setting when $T$ is close to low-rank, which is often the case in practice.
no code implementations • 22 Apr 2019 • Cameron Musco, Christopher Musco, David P. Woodruff
In particular, for rank $k' > k$ depending on the \emph{public coin partition number} of $W$, the heuristic outputs rank-$k'$ $L$ with cost$(L) \leq OPT + \epsilon \|A\|_F^2$.
no code implementations • 20 Dec 2018 • Haim Avron, Michael Kapralov, Cameron Musco, Christopher Musco, Ameya Velingker, Amir Zandieh
We formalize this intuition by showing that, roughly, a continuous signal from a given class can be approximately reconstructed using a number of samples proportional to the *statistical dimension* of the allowed power spectrum of that class.
1 code implementation • NeurIPS 2018 • Jeremy Hoskins, Cameron Musco, Christopher Musco, Babis Tsourakakis
In this work we consider a privacy threat to a social network in which an attacker has access to a subset of random walk-based node similarities, such as effective resistances (i.e., commute times) or personalized PageRank scores.
no code implementations • ICML 2017 • Haim Avron, Michael Kapralov, Cameron Musco, Christopher Musco, Ameya Velingker, Amir Zandieh
Qualitatively, our results are twofold: on the one hand, we show that random Fourier feature approximation can provably speed up kernel ridge regression under reasonable assumptions.
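The speedup in question comes from replacing the n x n kernel system with a regression over m random Fourier features. A minimal sketch for the Gaussian kernel (standard RFF construction, not the paper's refined analysis):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 500, 2, 400          # samples, input dim, random features
X = rng.uniform(-3, 3, size=(n, d))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)

# Random Fourier features for the kernel exp(-||x - z||^2 / 2):
# z(x) = sqrt(2/m) * cos(W^T x + b) with W ~ N(0, I), b ~ Unif[0, 2pi],
# so z(x) . z(x') approximates the kernel value in expectation.
W = rng.standard_normal((d, m))
b = rng.uniform(0, 2 * np.pi, size=m)
Z = np.sqrt(2.0 / m) * np.cos(X @ W + b)

# Ridge regression on the features: an m x m linear solve replaces the
# n x n kernel system -- the source of the provable speedup.
lam = 1e-2
alpha = np.linalg.solve(Z.T @ Z + lam * np.eye(m), Z.T @ y)
mse = np.mean((Z @ alpha - y) ** 2)
print(mse)
```

The paper's contribution is pinning down how large m must be for this approximation to provably preserve the statistical performance of exact kernel ridge regression.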
1 code implementation • 23 Jan 2018 • Jeremy G. Hoskins, Cameron Musco, Christopher Musco, Charalampos E. Tsourakakis
In this work we consider a privacy threat to a social network in which an attacker has access to a subset of random walk-based node similarities, such as effective resistances (i.e., commute times) or personalized PageRank scores.
3 code implementations • 28 Dec 2017 • Cameron Musco, Christopher Musco, Charalampos E. Tsourakakis
We perform an empirical study of our proposed methods on synthetic and real-world data, verifying their value as mining tools for better understanding the trade-off between disagreement and polarization.
no code implementations • NeurIPS 2017 • Cameron Musco, Christopher Musco
We give the first algorithm for kernel Nyström approximation that runs in linear time in the number of training points and is provably accurate for all kernel matrices, without dependence on regularity or incoherence conditions.
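For orientation, the basic Nyström construction approximates a kernel matrix as K ≈ C W⁺ Cᵀ from a subset of "landmark" columns. The sketch below uses uniform landmark sampling for simplicity; the paper's algorithm instead samples by ridge leverage scores to get its worst-case guarantee. Function names are ours:

```python
import numpy as np

def rbf(X, Y, gamma=0.1):
    """Gaussian (RBF) kernel matrix between the rows of X and of Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def nystrom(kernel, X, landmark_idx):
    """Nystrom approximation K ~= C W^+ C^T from landmark columns."""
    C = kernel(X, X[landmark_idx])                 # n x m
    W = kernel(X[landmark_idx], X[landmark_idx])   # m x m
    return C @ np.linalg.pinv(W) @ C.T

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 2))
K = rbf(X, X)
K_hat = nystrom(rbf, X, rng.choice(300, size=80, replace=False))
rel = np.linalg.norm(K - K_hat, 'fro') / np.linalg.norm(K, 'fro')
print(rel)
```

Only the sampled columns of K are ever formed, which is what makes the overall method linear in the number of training points.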
1 code implementation • 25 Aug 2017 • Cameron Musco, Christopher Musco, Aaron Sidford
In exact arithmetic, the method's error after $k$ iterations is bounded by the error of the best degree-$k$ polynomial uniformly approximating $f(x)$ on the range $[\lambda_{min}(A), \lambda_{max}(A)]$.
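The method being analyzed is Lanczos approximation of $f(A)b$: build a Krylov basis $Q$ with tridiagonal $T = Q^\top A Q$ and return $\|b\|\, Q f(T) e_1$. A textbook sketch without reorthogonalization (not the paper's analyzed variant), tested on $f(x) = 1/x$:

```python
import numpy as np

def lanczos_fAb(A, b, k, f):
    """Approximate f(A) @ b with k Lanczos iterations (no reorth)."""
    n = len(b)
    Q = np.zeros((n, k))
    alpha, beta = np.zeros(k), np.zeros(k)
    q, q_prev = b / np.linalg.norm(b), np.zeros(n)
    for j in range(k):
        Q[:, j] = q
        w = A @ q
        alpha[j] = q @ w
        w = w - alpha[j] * q - beta[j - 1] * q_prev  # beta[-1] == 0 at j = 0
        beta[j] = np.linalg.norm(w)
        q_prev, q = q, w / beta[j]
    T = np.diag(alpha) + np.diag(beta[:k - 1], 1) + np.diag(beta[:k - 1], -1)
    evals, evecs = np.linalg.eigh(T)
    fT_e1 = evecs @ (f(evals) * evecs[0])  # f(T) applied to e_1
    return np.linalg.norm(b) * (Q @ fT_e1)

rng = np.random.default_rng(0)
n = 200
M = rng.standard_normal((n, n))
A = M @ M.T / n + np.eye(n)            # SPD, eigenvalues roughly in [1, 5]
b = rng.standard_normal(n)
exact = np.linalg.solve(A, b)          # f(x) = 1/x, i.e. A^{-1} b
approx = lanczos_fAb(A, b, 30, lambda x: 1.0 / x)
print(np.linalg.norm(approx - exact) / np.linalg.norm(exact))
```

The abstract's bound explains the rapid convergence seen here: after $k$ iterations the error is controlled by the best degree-$k$ polynomial approximation to $f$ on $A$'s spectral range.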
Data Structures and Algorithms • Numerical Analysis
2 code implementations • 24 May 2016 • Cameron Musco, Christopher Musco
We give the first algorithm for kernel Nyström approximation that runs in *linear time in the number of training points* and is provably accurate for all kernel matrices, without dependence on regularity or incoherence conditions.
no code implementations • 22 Feb 2016 • Roy Frostig, Cameron Musco, Christopher Musco, Aaron Sidford
To achieve our results, we first observe that ridge regression can be used to obtain a "smooth projection" onto the top principal components.
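The "smooth projection" can be seen directly in the SVD basis: the ridge operator $A(A^\top A + \lambda I)^{-1}A^\top$ scales the $i$-th singular direction by $s_i^2/(s_i^2+\lambda)$, which is near 1 for directions with large singular values and near 0 for small ones. A small numerical check (our own construction, with three dominant directions):

```python
import numpy as np

rng = np.random.default_rng(0)
# Six columns with three large scales and three small ones, so the
# spectrum has a clear "top principal components" group.
A = rng.standard_normal((300, 6)) * np.array([10.0, 9.0, 8.0, 1.0, 0.5, 0.2])

# In the SVD basis the ridge smoothing operator A (A^T A + lam I)^{-1} A^T
# multiplies singular direction i by s_i^2 / (s_i^2 + lam): approximately
# a projection onto top components, but with a smooth roll-off.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
lam = 1000.0
shrink = s ** 2 / (s ** 2 + lam)
print(np.round(shrink, 3))  # near 1 for top directions, near 0 for the rest
```

Because the roll-off is smooth rather than a hard threshold, the operator can be applied with fast ridge regression solvers without ever computing the principal components exactly, which is the observation the abstract builds on.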
no code implementations • 23 Nov 2015 • Michael B. Cohen, Cameron Musco, Christopher Musco
Our method is based on a recursive sampling scheme for computing a representative subset of $A$'s columns, which is then used to find a low-rank approximation.
no code implementations • NeurIPS 2015 • Cameron Musco, Christopher Musco
We give the first provable runtime improvement on Simultaneous Iteration: a simple randomized block Krylov method, closely related to the classic Block Lanczos algorithm, gives the same guarantees in just $\tilde{O}(1/\sqrt{\epsilon})$ iterations and performs substantially better experimentally.
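A simplified sketch of the randomized block Krylov idea: expand a random starting block through powers of $AA^\top$, orthonormalize, and take the best rank-$k$ approximation within that subspace. This is a bare-bones version for illustration, not the exact analyzed method:

```python
import numpy as np

def block_krylov_lowrank(A, k, q, rng):
    """Rank-k approximation from a randomized block Krylov subspace.

    Builds [A S, (A A^T) A S, ..., (A A^T)^q A S] for a random n x k
    block S, orthonormalizes the concatenation, and returns the best
    rank-k approximation of A within that subspace.
    """
    S = rng.standard_normal((A.shape[1], k))
    block = A @ S
    K = [block]
    for _ in range(q):
        block = A @ (A.T @ block)
        K.append(block)
    Q, _ = np.linalg.qr(np.hstack(K))
    # Best rank-k approximation of A restricted to range(Q).
    U, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ U[:, :k]) * s[:k] @ Vt[:k]

rng = np.random.default_rng(0)
A = rng.standard_normal((500, 300))
Ak = block_krylov_lowrank(A, k=10, q=4, rng=rng)
U, s, Vt = np.linalg.svd(A, full_matrices=False)
opt = np.linalg.norm(A - (U[:, :10] * s[:10]) @ Vt[:10], 'fro')
err = np.linalg.norm(A - Ak, 'fro')
print(err / opt)  # close to 1, i.e. near-optimal
```

Even on a matrix with a flat spectrum (the hard case for power iteration), a handful of Krylov expansions already yields near-optimal Frobenius error, matching the abstract's $\tilde{O}(1/\sqrt{\epsilon})$ iteration claim qualitatively.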
no code implementations • 24 Oct 2014 • Michael B. Cohen, Sam Elder, Cameron Musco, Christopher Musco, Madalina Persu
We show how to approximate a data matrix $\mathbf{A}$ with a much smaller sketch $\mathbf{\tilde A}$ that can be used to solve a general class of constrained k-rank approximation problems to within $(1+\epsilon)$ error.
no code implementations • 21 Aug 2014 • Michael B. Cohen, Yin Tat Lee, Cameron Musco, Christopher Musco, Richard Peng, Aaron Sidford
In addition to an improved understanding of uniform sampling, our main proof introduces a structural result of independent interest: we show that every matrix can be made to have low coherence by reweighting a small subset of its rows.