Search Results for author: David P. Woodruff

Found 56 papers, 5 papers with code

Fast Sketching of Polynomial Kernels of Polynomial Degree

no code implementations • 21 Aug 2021 • Zhao Song, David P. Woodruff, Zheng Yu, Lichen Zhang

Recent techniques in oblivious sketching reduce the running time's dependence on the degree $q$ of the polynomial kernel from exponential to polynomial, which is useful for the Gaussian kernel, where $q$ can be chosen to be polylogarithmic.

Near-Optimal Algorithms for Linear Algebra in the Current Matrix Multiplication Time

no code implementations • 16 Jul 2021 • Nadiia Chepurko, Kenneth L. Clarkson, Praneeth Kacham, David P. Woodruff

Currently, in the numerical linear algebra community, it is thought that to obtain nearly-optimal bounds for various problems such as rank computation, finding a maximal linearly independent subset of columns, regression, low rank approximation, maximum matching on general graphs and linear matroid union, one would need to resolve the main open question of Nelson and Nguyen (FOCS, 2013) regarding the logarithmic factors in the sketching dimension for existing constant factor approximation oblivious subspace embeddings.

Single Pass Entrywise-Transformed Low Rank Approximation

no code implementations • 16 Jul 2021 • Yifei Jiang, Yi Li, Yiming Sun, Jiaxin Wang, David P. Woodruff

A natural way to do this would be to simply apply $f$ to each entry of $A$, and then compute the matrix decomposition, but this requires storing all of $A$ as well as multiple passes over its entries.

Average-Case Communication Complexity of Statistical Problems

no code implementations • 3 Jul 2021 • Cyrus Rashtchian, David P. Woodruff, Peng Ye, Hanlin Zhu

Our motivation is to understand the statistical-computational trade-offs in streaming, sketching, and query-based models.

Non-PSD Matrix Sketching with Applications to Regression and Optimization

no code implementations • 16 Jun 2021 • Zhili Feng, Fred Roosta, David P. Woodruff

In this paper, we present novel dimensionality reduction methods for non-PSD matrices, as well as their "square-roots", which involve matrices with complex entries.

Dimensionality Reduction

Learning a Latent Simplex in Input-Sparsity Time

no code implementations • 17 May 2021 • Ainesh Bakshi, Chiranjib Bhattacharyya, Ravi Kannan, David P. Woodruff, Samson Zhou

We consider the problem of learning a latent $k$-vertex simplex $K\subset\mathbb{R}^d$, given access to $A\in\mathbb{R}^{d\times n}$, which can be viewed as a data matrix with $n$ points that are obtained by randomly perturbing latent points in the simplex $K$ (potentially beyond $K$).

Latent Variable Models • Topic Models

Learning-Augmented Sketches for Hessians

no code implementations • 24 Feb 2021 • Yi Li, Honghao Lin, David P. Woodruff

Sketching is a dimensionality reduction technique where one compresses a matrix by linear combinations that are typically chosen at random.

Dimensionality Reduction
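As a reminder of the basic primitive this paper augments with learning, here is a minimal Gaussian-sketch illustration in NumPy; it is a generic example, not the paper's learned sketch for Hessians:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 10_000, 20, 400            # tall n x d input, sketch dimension m << n

A = rng.standard_normal((n, d))
S = rng.standard_normal((m, n)) / np.sqrt(m)   # random Gaussian sketching matrix
SA = S @ A                                      # the m x d compressed matrix

# Norms of vectors in the column space are approximately preserved.
for _ in range(3):
    x = rng.standard_normal(d)
    print(np.linalg.norm(SA @ x) / np.linalg.norm(A @ x))   # close to 1
```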

Revisiting the Sample Complexity of Sparse Spectrum Approximation of Gaussian Processes

1 code implementation • NeurIPS 2020 • Quang Minh Hoang, Trong Nghia Hoang, Hai Pham, David P. Woodruff

We introduce a new scalable approximation for Gaussian processes with provable guarantees which hold simultaneously over its entire parameter space.

Gaussian Processes

Quantum-Inspired Algorithms from Randomized Numerical Linear Algebra

no code implementations • 9 Nov 2020 • Nadiia Chepurko, Kenneth L. Clarkson, Lior Horesh, David P. Woodruff

We create classical (non-quantum) dynamic data structures supporting queries for recommender systems and least-squares regression that are comparable to their quantum analogues.

Recommendation Systems

Hutch++: Optimal Stochastic Trace Estimation

1 code implementation • 19 Oct 2020 • Raphael A. Meyer, Cameron Musco, Christopher Musco, David P. Woodruff

This improves on the ubiquitous Hutchinson's estimator, which requires $O(1/\epsilon^2)$ matrix-vector products.
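For reference, Hutchinson's estimator itself is only a few lines; the sketch below is the textbook Rademacher version, not Hutch++ (which additionally spends some matrix-vector products on an approximate low-rank projection):

```python
import numpy as np

def hutchinson_trace(matvec, n, num_queries, rng):
    """Estimate tr(A) using only matrix-vector products with A."""
    est = 0.0
    for _ in range(num_queries):
        g = rng.choice([-1.0, 1.0], size=n)   # Rademacher probe vector
        est += g @ matvec(g)                  # E[g^T A g] = tr(A)
    return est / num_queries

rng = np.random.default_rng(0)
A = rng.standard_normal((500, 500))
A = A @ A.T                                   # make A PSD
print(hutchinson_trace(lambda v: A @ v, 500, 200, rng), np.trace(A))
```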

Learning the Positions in CountSketch

no code implementations • 20 Jul 2020 • Simin Liu, Tianrui Liu, Ali Vakilian, Yulin Wan, David P. Woodruff

Despite the growing body of work on this paradigm, a noticeable omission is that the locations of the non-zero entries of previous algorithms were fixed, and only their values were learned.
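To make the fixed-positions baseline concrete: in a classical CountSketch, each column of the sketching matrix has a single non-zero entry at a random position with a random sign. A minimal sketch of that construction (illustrative only, not the paper's learned variant):

```python
import numpy as np

def countsketch(n, m, rng):
    """Return a sparse m x n CountSketch matrix as (rows, signs) arrays."""
    rows = rng.integers(0, m, size=n)         # random non-zero position per column
    signs = rng.choice([-1.0, 1.0], size=n)   # random sign per column
    return rows, signs

def apply_countsketch(rows, signs, m, A):
    """Compute S @ A without ever materializing S."""
    SA = np.zeros((m, A.shape[1]))
    np.add.at(SA, rows, signs[:, None] * A)   # scatter-add signed rows of A
    return SA

rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 10))
rows, signs = countsketch(1000, 50, rng)
print(apply_countsketch(rows, signs, 50, A).shape)   # (50, 10)
```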

Optimal $\ell_1$ Column Subset Selection and a Fast PTAS for Low Rank Approximation

no code implementations • 20 Jul 2020 • Arvind V. Mahankali, David P. Woodruff

We give the first polynomial time column subset selection-based $\ell_1$ low rank approximation algorithm sampling $\tilde{O}(k)$ columns and achieving an $\tilde{O}(k^{1/2})$-approximation for any $k$, improving upon the previous best $\tilde{O}(k)$-approximation and matching a prior lower bound for column subset selection-based $\ell_1$-low rank approximation which holds for any $\text{poly}(k)$ number of columns.

WOR and $p$'s: Sketches for $\ell_p$-Sampling Without Replacement

no code implementations • NeurIPS 2020 • Edith Cohen, Rasmus Pagh, David P. Woodruff

We design novel composable sketches for WOR $\ell_p$ sampling, weighted sampling of keys according to a power $p\in[0, 2]$ of their frequency (or for signed data, sum of updates).

Streaming Complexity of SVMs

no code implementations • 7 Jul 2020 • Alexandr Andoni, Collin Burns, Yi Li, Sepideh Mahabadi, David P. Woodruff

We show that, for both problems, for dimensions $d=1, 2$, one can obtain streaming algorithms with space polynomially smaller than $\frac{1}{\lambda\epsilon}$, which is the complexity of SGD for strongly convex functions like the bias-regularized SVM, and which is known to be tight in general, even for $d=1$.

Vector-Matrix-Vector Queries for Solving Linear Algebra, Statistics, and Graph Problems

no code implementations • 24 Jun 2020 • Cyrus Rashtchian, David P. Woodruff, Hanlin Zhu

We consider the general problem of learning about a matrix through vector-matrix-vector queries.

Approximation Algorithms for Sparse Principal Component Analysis

no code implementations • 23 Jun 2020 • Agniva Chowdhury, Petros Drineas, David P. Woodruff, Samson Zhou

To improve the interpretability of PCA, various approaches to obtain sparse principal direction loadings have been proposed, which are termed Sparse Principal Component Analysis (SPCA).

Dimensionality Reduction

Learning-Augmented Data Stream Algorithms

no code implementations • ICLR 2020 • Tanqiu Jiang, Yi Li, Honghao Lin, Yisong Ruan, David P. Woodruff

For estimating the $p$-th frequency moment for $0 < p < 2$ we obtain the first algorithms with optimal update time.

Non-Adaptive Adaptive Sampling on Turnstile Streams

no code implementations • 23 Apr 2020 • Sepideh Mahabadi, Ilya Razenshteyn, David P. Woodruff, Samson Zhou

Adaptive sampling is a useful algorithmic tool for data summarization problems in the classical centralized setting, where the entire dataset is available to the single processor performing the computation.

Data Summarization

Average Case Column Subset Selection for Entrywise $\ell_1$-Norm Loss

no code implementations • 16 Apr 2020 • Zhao Song, David P. Woodruff, Peilin Zhong

We show that if the input matrix $A$ has entries drawn from any distribution $\mu$ for which the $(1+\gamma)$-th moment exists, for an arbitrarily small constant $\gamma > 0$, then it is possible to obtain a $(1+\epsilon)$-approximate column subset selection to the entrywise $\ell_1$-norm in nearly linear time.

LSF-Join: Locality Sensitive Filtering for Distributed All-Pairs Set Similarity Under Skew

no code implementations • 6 Mar 2020 • Cyrus Rashtchian, Aneesh Sharma, David P. Woodruff

Theoretically, we show that LSF-Join efficiently finds most close pairs, even for small similarity thresholds and for skewed input sets.

Recommendation Systems

Sublinear Time Numerical Linear Algebra for Structured Matrices

no code implementations • 12 Dec 2019 • Xiaofei Shi, David P. Woodruff

For example, in the overconstrained $(1+\epsilon)$-approximate polynomial interpolation problem, $A$ is a Vandermonde matrix and $T(A) = O(n \log n)$; in this case our running time is $n \cdot \mathrm{poly}(\log n) + \mathrm{poly}(d/\epsilon)$ and we recover the results of \cite{avron2013sketching} as a special case.

Robust and Sample Optimal Algorithms for PSD Low-Rank Approximation

no code implementations • 9 Dec 2019 • Ainesh Bakshi, Nadiia Chepurko, David P. Woodruff

Our main result is to resolve this question by obtaining an optimal algorithm that queries $O(nk/\epsilon)$ entries of $A$ and outputs a relative-error low-rank approximation in $O(n(k/\epsilon)^{\omega-1})$ time.

Optimal Sketching for Kronecker Product Regression and Low Rank Approximation

no code implementations • NeurIPS 2019 • Huaian Diao, Rajesh Jayaram, Zhao Song, Wen Sun, David P. Woodruff

For input $\mathcal{A}$ as above, we give $O(\sum_{i=1}^q \text{nnz}(A_i))$ time algorithms, which is much faster than computing $\mathcal{A}$.

Total Least Squares Regression in Input Sparsity Time

1 code implementation • NeurIPS 2019 • Huaian Diao, Zhao Song, David P. Woodruff, Xin Yang

In the total least squares problem, one is given an $m \times n$ matrix $A$, and an $m \times d$ matrix $B$, and one seeks to "correct" both $A$ and $B$, obtaining matrices $\hat{A}$ and $\hat{B}$, so that there exists an $X$ satisfying the equation $\hat{A}X = \hat{B}$.
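For comparison with the input-sparsity-time algorithm developed in the paper, the classical dense route for a single right-hand side solves total least squares via one SVD of the concatenated matrix $[A\ b]$; a minimal NumPy version of that textbook method:

```python
import numpy as np

def tls(A, b):
    """Textbook total least squares for a single right-hand side."""
    Z = np.column_stack([A, b])
    _, _, Vt = np.linalg.svd(Z)
    v = Vt[-1]                       # right singular vector of smallest singular value
    return -v[:-1] / v[-1]           # x solving (A + E) x = b + f with minimal correction

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 3))
x_true = np.array([1.0, -2.0, 0.5])
b = A @ x_true + 0.01 * rng.standard_normal(100)
print(tls(A, b))                     # close to x_true
```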

The Communication Complexity of Optimization

no code implementations • 13 Jun 2019 • Santosh S. Vempala, Ruosong Wang, David P. Woodruff

We first resolve the randomized and deterministic communication complexity in the point-to-point model of communication, showing it is $\tilde{\Theta}(d^2L + sd)$ and $\tilde{\Theta}(sd^2L)$, respectively.

Distributed Optimization

Tight Kernel Query Complexity of Kernel Ridge Regression and Kernel $k$-means Clustering

no code implementations • 15 May 2019 • Manuel Fernandez, David P. Woodruff, Taisuke Yasuda

We present tight lower bounds on the number of kernel evaluations required to approximately solve kernel ridge regression (KRR) and kernel $k$-means clustering (KKMC) on $n$ input points.

Dimensionality Reduction for Tukey Regression

no code implementations • 14 May 2019 • Kenneth L. Clarkson, Ruosong Wang, David P. Woodruff

We give the first dimensionality reduction methods for the overconstrained Tukey regression problem.

Dimensionality Reduction

Simple Heuristics Yield Provable Algorithms for Masked Low-Rank Approximation

no code implementations • 22 Apr 2019 • Cameron Musco, Christopher Musco, David P. Woodruff

In particular, for rank $k' > k$ depending on the public coin partition number of $W$, the heuristic outputs a rank-$k'$ matrix $L$ with $\mathrm{cost}(L) \leq \mathrm{OPT} + \epsilon \|A\|_F^2$.

Low-Rank Matrix Completion • Tensor Decomposition

Learning Two Layer Rectified Neural Networks in Polynomial Time

no code implementations • 5 Nov 2018 • Ainesh Bakshi, Rajesh Jayaram, David P. Woodruff

Given $n$ samples as a matrix $\mathbf{X} \in \mathbb{R}^{d \times n}$ and the (possibly noisy) labels $\mathbf{U}^* f(\mathbf{V}^* \mathbf{X}) + \mathbf{E}$ of the network on these samples, where $\mathbf{E}$ is a noise matrix, our goal is to recover the weight matrices $\mathbf{U}^*$ and $\mathbf{V}^*$.

Towards a Zero-One Law for Column Subset Selection

1 code implementation • NeurIPS 2019 • Zhao Song, David P. Woodruff, Peilin Zhong

Our approximation algorithms handle functions which are not even scale-invariant, such as the Huber loss function; we show such losses have very different structural properties than $\ell_p$-norms, e.g., the lack of scale-invariance causes any column subset selection algorithm to provably require a $\sqrt{\log n}$ factor larger number of columns than for $\ell_p$-norms. Nevertheless, we design the first efficient column subset selection algorithms for such error measures.

Testing Matrix Rank, Optimally

no code implementations • 18 Oct 2018 • Maria-Florina Balcan, Yi Li, David P. Woodruff, Hongyang Zhang

This improves upon the previous $O(d^2/\epsilon^2)$ bound (SODA'03), and bypasses an $\Omega(d^2/\epsilon^2)$ lower bound of (KDD'14) which holds if the algorithm is required to read a submatrix.

A PTAS for $\ell_p$-Low Rank Approximation

no code implementations • 16 Jul 2018 • Frank Ban, Vijay Bhattiprolu, Karl Bringmann, Pavel Kolev, Euiwoong Lee, David P. Woodruff

On the algorithmic side, for $p \in (0, 2)$, we give the first $(1+\epsilon)$-approximation algorithm running in time $n^{\text{poly}(k/\epsilon)}$.

On Coresets for Logistic Regression

no code implementations • NeurIPS 2018 • Alexander Munteanu, Chris Schwiegelshohn, Christian Sohler, David P. Woodruff

For data sets with bounded $\mu(X)$-complexity, we show that a novel sensitivity sampling scheme produces the first provably sublinear $(1\pm\varepsilon)$-coreset.

Sketching for Kronecker Product Regression and P-splines

no code implementations • 27 Dec 2017 • Huaian Diao, Zhao Song, Wen Sun, David P. Woodruff

That is, TensorSketch only provides input sparsity time for Kronecker product regression with respect to the $2$-norm.

Is Input Sparsity Time Possible for Kernel Low-Rank Approximation?

no code implementations • NeurIPS 2017 • Cameron Musco, David P. Woodruff

Low-rank approximation is a common tool used to accelerate kernel methods: the $n \times n$ kernel matrix $K$ is approximated via a rank-$k$ matrix $\tilde K$ which can be stored in much less space and processed more quickly.
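One standard way to obtain such a surrogate $\tilde K$ without forming all of $K$ is the Nyström method built from sampled landmark columns; the sketch below is that generic method with uniform sampling, shown purely for illustration, not the input-sparsity-time question the paper investigates:

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and Y."""
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 5))
idx = rng.choice(500, size=50, replace=False)   # sampled landmark points

C = rbf_kernel(X, X[idx])        # n x s block of K
W = rbf_kernel(X[idx], X[idx])   # s x s core block
K_tilde = C @ np.linalg.pinv(W) @ C.T           # Nystrom approximation of K

K = rbf_kernel(X, X)
print(np.linalg.norm(K - K_tilde) / np.linalg.norm(K))   # small relative error
```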

Near Optimal Sketching of Low-Rank Tensor Regression

no code implementations • NeurIPS 2017 • Jarvis Haupt, Xingguo Li, David P. Woodruff

We study the least squares regression problem \begin{align*} \min_{\Theta \in \mathcal{S}_{\odot D, R}} \|A\Theta-b\|_2, \end{align*} where $\mathcal{S}_{\odot D, R}$ is the set of $\Theta$ for which $\Theta = \sum_{r=1}^{R} \theta_1^{(r)} \circ \cdots \circ \theta_D^{(r)}$ for vectors $\theta_d^{(r)} \in \mathbb{R}^{p_d}$ for all $r \in [R]$ and $d \in [D]$, and $\circ$ denotes the outer product of vectors.

Dimensionality Reduction

Algorithms for $\ell_p$ Low-Rank Approximation

no code implementations • ICML 2017 • Flavio Chierichetti, Sreenivas Gollapudi, Ravi Kumar, Silvio Lattanzi, Rina Panigrahy, David P. Woodruff

We consider the problem of approximating a given matrix by a low-rank matrix so as to minimize the entrywise $\ell_p$-approximation error, for any $p \geq 1$; the case $p = 2$ is the classical SVD problem.

Fast Regression with an $\ell_\infty$ Guarantee

no code implementations • 30 May 2017 • Eric Price, Zhao Song, David P. Woodruff

Our main result is that, when $S$ is the subsampled randomized Fourier/Hadamard transform, the error $x' - x^*$ behaves as if it lies in a "random" direction within this bound: for any fixed direction $a\in \mathbb{R}^d$, we have with $1 - d^{-c}$ probability that \[ \langle a, x'-x^*\rangle \lesssim \frac{\|a\|_2\|x'-x^*\|_2}{d^{\frac{1}{2}-\gamma}}, \quad (1) \] where $c, \gamma > 0$ are arbitrary constants.

Matrix Completion and Related Problems via Strong Duality

no code implementations • 27 Apr 2017 • Maria-Florina Balcan, Yingyu Liang, David P. Woodruff, Hongyang Zhang

This work studies the strong duality of non-convex matrix factorization problems: we show that under certain dual conditions, these problems and their duals have the same optimum.

Matrix Completion

Relative Error Tensor Low Rank Approximation

no code implementations • 26 Apr 2017 • Zhao Song, David P. Woodruff, Peilin Zhong

Despite the success on obtaining relative error low rank approximations for matrices, no such results were known for tensors.

Spectrum Approximation Beyond Fast Matrix Multiplication: Algorithms and Hardness

no code implementations • 13 Apr 2017 • Cameron Musco, Praneeth Netrapalli, Aaron Sidford, Shashanka Ubaru, David P. Woodruff

We thus effectively compute a histogram of the spectrum, which can stand in for the true singular values in many applications.

Sublinear Time Low-Rank Approximation of Positive Semidefinite Matrices

no code implementations • 11 Apr 2017 • Cameron Musco, David P. Woodruff

We show how to compute a relative-error low-rank approximation to any positive semidefinite (PSD) matrix in sublinear time, i.e., for any $n \times n$ PSD matrix $A$, in $\tilde O(n \cdot \mathrm{poly}(k/\epsilon))$ time we output a rank-$k$ matrix $B$, in factored form, for which $\|A-B\|_F^2 \leq (1+\epsilon)\|A-A_k\|_F^2$, where $A_k$ is the best rank-$k$ approximation to $A$.

Communication-Optimal Distributed Clustering

no code implementations • NeurIPS 2016 • Jiecao Chen, He Sun, David P. Woodruff, Qin Zhang

We would like the quality of the clustering in the distributed setting to match that in the centralized setting for which all the data resides on a single site.

Faster Kernel Ridge Regression Using Sketching and Preconditioning

no code implementations • 10 Nov 2016 • Haim Avron, Kenneth L. Clarkson, David P. Woodruff

The preconditioner is based on random feature maps, such as random Fourier features, which have recently emerged as a powerful technique for speeding up and scaling the training of kernel-based methods, such as kernel ridge regression, by resorting to approximations.
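The random Fourier features mentioned here follow the Rahimi–Recht construction: sample frequencies from the kernel's spectral density and use cosine features. A minimal version for the Gaussian kernel, showing the building block only, not the paper's preconditioned KRR solver:

```python
import numpy as np

def rff(X, num_features, gamma, rng):
    """Random Fourier features approximating the Gaussian kernel
    exp(-gamma * ||x - y||^2) via z(x)^T z(y)."""
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, num_features))
    b = rng.uniform(0, 2 * np.pi, size=num_features)
    return np.sqrt(2.0 / num_features) * np.cos(X @ W + b)

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 4))
Z = rff(X, 1000, gamma=0.5, rng=rng)

# The feature inner product concentrates around the exact kernel value.
K_exact = np.exp(-0.5 * ((X[0] - X[1]) ** 2).sum())
print(Z[0] @ Z[1], K_exact)
```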

Sharper Bounds for Regularized Data Fitting

no code implementations • 10 Nov 2016 • Haim Avron, Kenneth L. Clarkson, David P. Woodruff

We study regularization both in a fairly broad setting, and in the specific context of the popular and widely used technique of ridge regularization; for the latter, as applied to each of these problems, we show algorithmic resource bounds in which the statistical dimension appears in places where in previous bounds the rank would appear.

Low Rank Approximation with Entrywise $\ell_1$-Norm Error

no code implementations • 3 Nov 2016 • Zhao Song, David P. Woodruff, Peilin Zhong

We give the first provable approximation algorithms for $\ell_1$-low rank approximation, showing that it is possible to achieve approximation factor $\alpha = (\log d) \cdot \mathrm{poly}(k)$ in $\mathrm{nnz}(A) + (n+d) \mathrm{poly}(k)$ time, where $\mathrm{nnz}(A)$ denotes the number of non-zero entries of $A$.

Distributed Low Rank Approximation of Implicit Functions of a Matrix

no code implementations • 28 Jan 2016 • David P. Woodruff, Peilin Zhong

For example, each of $s$ servers may have an $n \times d$ matrix $A^t$, and we may be interested in computing a low rank approximation to $A = f(\sum_{t=1}^s A^t)$, where $f$ is a function which is applied entrywise to the matrix $\sum_{t=1}^s A^t$.

Optimal approximate matrix product in terms of stable rank

no code implementations • 8 Jul 2015 • Michael B. Cohen, Jelani Nelson, David P. Woodruff

We prove, using the subspace embedding guarantee in a black box way, that one can achieve the spectral norm guarantee for approximate matrix multiplication with a dimensionality-reducing map having $m = O(\tilde{r}/\varepsilon^2)$ rows.

Dimensionality Reduction

Communication Lower Bounds for Statistical Estimation Problems via a Distributed Data Processing Inequality

no code implementations • 24 Jun 2015 • Mark Braverman, Ankit Garg, Tengyu Ma, Huy L. Nguyen, David P. Woodruff

We study the tradeoff between the statistical error and communication cost of distributed statistical estimation problems in high dimensions.

Frequent Directions : Simple and Deterministic Matrix Sketching

no code implementations • 8 Jan 2015 • Mina Ghashami, Edo Liberty, Jeff M. Phillips, David P. Woodruff

It performs $O(d \times \ell)$ operations per row and maintains a sketch matrix $B \in \mathbb{R}^{\ell \times d}$ such that for any $k < \ell$, $\|A^TA - B^TB\|_2 \leq \|A - A_k\|_F^2 / (\ell-k)$ and $\|A - \pi_{B_k}(A)\|_F^2 \leq \big(1 + \frac{k}{\ell-k}\big)\|A-A_k\|_F^2$.

Data Structures and Algorithms
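Since the update rule is short, here is a direct transcription of the basic Frequent Directions loop (the one-row-at-a-time variant; buffered variants amortize the SVD cost):

```python
import numpy as np

def frequent_directions(A, ell):
    """Deterministic sketch B (ell x d) with ||A^T A - B^T B||_2 small,
    per the guarantee quoted in the abstract above."""
    _, d = A.shape
    B = np.zeros((ell, d))
    next_free = 0
    for row in A:
        if next_free == ell:                        # buffer full: shrink
            _, s, Vt = np.linalg.svd(B, full_matrices=False)
            s = np.sqrt(np.maximum(s**2 - s[-1]**2, 0.0))
            B = s[:, None] * Vt                     # last row becomes zero
            next_free = ell - 1
        B[next_free] = row
        next_free += 1
    return B

rng = np.random.default_rng(0)
A = rng.standard_normal((2000, 30))
B = frequent_directions(A, ell=10)
print(np.linalg.norm(A.T @ A - B.T @ B, 2))         # small covariance error
```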

Sketching as a Tool for Numerical Linear Algebra

no code implementations • 17 Nov 2014 • David P. Woodruff

This survey highlights the recent advances in algorithms for numerical linear algebra that have come from the technique of linear sketching, whereby given a matrix, one first compresses it to a much smaller matrix by multiplying it by a (usually) random matrix with certain properties.

Data Structures and Algorithms
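The survey's core recipe, in its simplest "sketch-and-solve" form for least-squares regression; a Gaussian sketch is used here for brevity, whereas the survey's point is that structured and sparse sketches make the compression step fast:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 20_000, 10, 500
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

S = rng.standard_normal((m, n)) / np.sqrt(m)             # sketching matrix
x_sketch = np.linalg.lstsq(S @ A, S @ b, rcond=None)[0]  # solve the small problem
x_exact = np.linalg.lstsq(A, b, rcond=None)[0]

# Residual of the sketched solution is close to the optimal residual.
print(np.linalg.norm(A @ x_sketch - b) / np.linalg.norm(A @ x_exact - b))
```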

Optimal CUR Matrix Decompositions

no code implementations • 30 May 2014 • Christos Boutsidis, David P. Woodruff

The CUR decomposition of an $m \times n$ matrix $A$ finds an $m \times c$ matrix $C$ with a subset of $c < n$ columns of $A,$ together with an $r \times n$ matrix $R$ with a subset of $r < m$ rows of $A,$ as well as a $c \times r$ low-rank matrix $U$ such that the matrix $CUR$ approximates the matrix $A,$ that is, $\|A - CUR\|_F^2 \le (1+\epsilon) \|A - A_k\|_F^2$, where $\|\cdot\|_F$ denotes the Frobenius norm and $A_k$ is the best $m \times n$ matrix of rank $k$ constructed via the SVD.
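A useful fact behind this setup: once $C$ and $R$ are fixed, the Frobenius-optimal middle factor is $U = C^+ A R^+$. A small demonstration with uniform sampling, which suffices here only because the test matrix is exactly low rank; the paper's guarantees require more careful column and row selection:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 20)) @ rng.standard_normal((20, 150))  # rank 20

cols = rng.choice(150, size=40, replace=False)
rows = rng.choice(200, size=40, replace=False)
C, R = A[:, cols], A[rows, :]
U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)    # optimal U given C and R

print(np.linalg.norm(A - C @ U @ R) / np.linalg.norm(A))   # near zero here
```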

Low Rank Approximation and Regression in Input Sparsity Time

1 code implementation • 26 Jul 2012 • Kenneth L. Clarkson, David P. Woodruff

We design a new distribution over $\mathrm{poly}(r\epsilon^{-1}) \times n$ matrices $S$ so that for any fixed $n \times d$ matrix $A$ of rank $r$, with probability at least 9/10, $\|SAx\|_2 = (1 \pm \epsilon)\|Ax\|_2$ simultaneously for all $x \in \mathbb{R}^d$.

Data Structures and Algorithms
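The distribution in question is the sparse embedding (CountSketch) matrix, with one random $\pm 1$ entry per column; an empirical check of the advertised $(1 \pm \epsilon)$ norm preservation, written without materializing $S$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 50_000, 5, 2000
A = rng.standard_normal((n, d))

# Sparse embedding: one random +-1 entry per column of S.
rows = rng.integers(0, m, size=n)
signs = rng.choice([-1.0, 1.0], size=n)
SA = np.zeros((m, d))
np.add.at(SA, rows, signs[:, None] * A)   # SA = S @ A via scatter-add

for _ in range(3):
    x = rng.standard_normal(d)
    print(np.linalg.norm(SA @ x) / np.linalg.norm(A @ x))   # within 1 +- eps
```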

The Fast Cauchy Transform and Faster Robust Linear Regression

no code implementations • 19 Jul 2012 • Kenneth L. Clarkson, Petros Drineas, Malik Magdon-Ismail, Michael W. Mahoney, Xiangrui Meng, David P. Woodruff

We provide fast algorithms for overconstrained $\ell_p$ regression and related problems: for an $n\times d$ input matrix $A$ and vector $b\in\mathbb{R}^n$, in $O(nd\log n)$ time we reduce the problem $\min_{x\in\mathbb{R}^d} \|Ax-b\|_p$ to the same problem with input matrix $\tilde A$ of dimension $s \times d$ and corresponding $\tilde b$ of dimension $s\times 1$.

Fast Moment Estimation in Data Streams in Optimal Space

no code implementations • 23 Jul 2010 • Daniel M. Kane, Jelani Nelson, Ely Porat, David P. Woodruff

We give a space-optimal algorithm with update time $O(\log^2(1/\epsilon) \log\log(1/\epsilon))$ for $(1+\epsilon)$-approximating the $p$-th frequency moment, $0 < p < 2$, of a length-$n$ vector updated in a data stream.

Data Structures and Algorithms
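For orientation, the classical approach this line of work refines is p-stable sketching in the style of Indyk: for $p = 1$ the projections are Cauchy, and the median of the absolute sketch entries estimates the moment. A minimal illustration only, not the paper's space-optimal algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 10_000, 400
x = rng.standard_normal(n)            # frequency vector after stream updates

S = rng.standard_cauchy((m, n))       # 1-stable random projections
estimate = np.median(np.abs(S @ x))   # median(|Cauchy|) = 1, so this ~ ||x||_1

print(estimate, np.abs(x).sum())
```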
