Search Results for author: Peilin Zhong

Found 18 papers, 6 papers with code

PolySketchFormer: Fast Transformers via Sketching Polynomial Kernels

no code implementations • 2 Oct 2023 • Praneeth Kacham, Vahab Mirrokni, Peilin Zhong

For context lengths of 32k and GPT-2 style models, our model achieves a 2.5-4x speedup in training compared to FlashAttention, with no observed degradation in quality across our experiments.

Language Modelling
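The speedup comes from replacing softmax attention with a polynomial kernel that sketching makes linear in context length. As a rough illustration (not the paper's sketching algorithm, which avoids materializing the large feature map; all function names here are hypothetical), a degree-2 polynomial kernel factorizes exactly, so attention can be reassociated to avoid forming the $n \times n$ matrix:

```python
import numpy as np

def phi(x):
    # Explicit feature map for the degree-2 polynomial kernel:
    # phi(q) . phi(k) = (q . k)^2
    return np.einsum('...i,...j->...ij', x, x).reshape(*x.shape[:-1], -1)

def poly_attention(Q, K, V):
    # Quadratic-time reference: weights (QK^T)^2, row-normalized.
    S = (Q @ K.T) ** 2
    return (S / S.sum(axis=1, keepdims=True)) @ V

def linear_poly_attention(Q, K, V):
    # Linear in sequence length n: reassociate (phi(Q) phi(K)^T) V
    # as phi(Q) (phi(K)^T V), never forming the n x n matrix.
    Qf, Kf = phi(Q), phi(K)
    num = Qf @ (Kf.T @ V)
    den = Qf @ Kf.sum(axis=0)
    return num / den[:, None]

rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 8, 4))
assert np.allclose(poly_attention(Q, K, V), linear_poly_attention(Q, K, V))
```

For head dimension $d$ the explicit map costs $d^2$ per token, which is why the paper sketches the polynomial kernel instead of expanding it.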

Differentially Private Clustering in Data Streams

no code implementations • 14 Jul 2023 • Alessandro Epasto, Tamalika Mukherjee, Peilin Zhong

In this work, we provide the first differentially private streaming algorithms for $k$-means and $k$-median clustering of $d$-dimensional Euclidean data points over a stream of length at most $T$, using $\mathrm{poly}(k, d, \log T)$ space to achieve a constant multiplicative error and a $\mathrm{poly}(k, d, \log T)$ additive error.

Clustering
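The $\mathrm{poly}(k, d, \log T)$ additive error is characteristic of differential privacy. The snippet below is not the paper's streaming algorithm; it is a toy Laplace-mechanism sketch (the function name and parameters are hypothetical) showing why DP clustering incurs an additive error independent of the stream length:

```python
import numpy as np

def dp_cluster_sizes(labels, k, epsilon, rng):
    # Toy Laplace mechanism: under add/remove adjacency, one point
    # changes a single cluster count by 1, so the L1 sensitivity is 1
    # and Laplace noise of scale 1/epsilon gives epsilon-DP.
    counts = np.bincount(labels, minlength=k).astype(float)
    return counts + rng.laplace(scale=1.0 / epsilon, size=k)

rng = np.random.default_rng(1)
labels = rng.integers(0, 3, size=10_000)
noisy = dp_cluster_sizes(labels, k=3, epsilon=0.5, rng=rng)
# Additive error is O(1/epsilon) per count, independent of stream length.
assert np.all(np.abs(noisy - np.bincount(labels, minlength=3)) < 50)
```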

Measuring Re-identification Risk

3 code implementations • 12 Apr 2023 • CJ Carey, Travis Dick, Alessandro Epasto, Adel Javanmard, Josh Karlin, Shankar Kumar, Andres Munoz Medina, Vahab Mirrokni, Gabriel Henrique Nunes, Sergei Vassilvitskii, Peilin Zhong

In this work, we present a new theoretical framework to measure re-identification risk in such user representations.

Stars: Tera-Scale Graph Building for Clustering and Graph Learning

no code implementations • 5 Dec 2022 • CJ Carey, Jonathan Halcrow, Rajesh Jayaram, Vahab Mirrokni, Warren Schudy, Peilin Zhong

We evaluate the performance of Stars for clustering and graph learning, and demonstrate 10- to 1000-fold improvements in the number of pairwise similarity comparisons relative to different baselines, and a 2- to 10-fold improvement in running time without quality loss.

Clustering • Graph Learning

Differentially Private Graph Learning via Sensitivity-Bounded Personalized PageRank

1 code implementation • 14 Jul 2022 • Alessandro Epasto, Vahab Mirrokni, Bryan Perozzi, Anton Tsitsulin, Peilin Zhong

Personalized PageRank (PPR) is a fundamental tool in unsupervised learning of graph representations such as node ranking, labeling, and graph embedding.

Graph Embedding • Graph Learning +1
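PPR itself is the standard recursion $\pi = \alpha e_s + (1-\alpha)\pi P$. The sketch below is plain power iteration, not the sensitivity-bounded, differentially private variant the paper introduces (the function name is illustrative):

```python
import numpy as np

def personalized_pagerank(P, s, alpha=0.15, iters=100):
    # P: row-stochastic transition matrix; s: source node index.
    # Iterates toward the fixed point pi = alpha*e_s + (1-alpha)*pi P.
    n = P.shape[0]
    e_s = np.zeros(n)
    e_s[s] = 1.0
    pi = e_s.copy()
    for _ in range(iters):
        pi = alpha * e_s + (1 - alpha) * pi @ P
    return pi

# Toy 3-cycle graph: 0 -> 1 -> 2 -> 0
P = np.array([[0., 1., 0.], [0., 0., 1.], [1., 0., 0.]])
pi = personalized_pagerank(P, s=0)
assert abs(pi.sum() - 1.0) < 1e-9
assert pi[0] == pi.max()  # probability mass concentrates near the source
```

The concentration of mass around the source node is what makes PPR useful for node ranking and labeling; the paper's contribution is bounding each node's influence on the output so that Laplace-style noise yields privacy.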

Planning with General Objective Functions: Going Beyond Total Rewards

no code implementations • NeurIPS 2020 • Ruosong Wang, Peilin Zhong, Simon S. Du, Russ R. Salakhutdinov, Lin Yang

Standard sequential decision-making paradigms aim to maximize the cumulative reward when interacting with an unknown environment, i.e., maximize $\sum_{h=1}^H r_h$, where $H$ is the planning horizon.

Decision Making
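The paper's point is that many natural objectives are not of this additive form. A minimal illustration (hypothetical trajectories, not from the paper) of how a non-additive objective such as the maximum reward along a trajectory can rank trajectories differently:

```python
# Two reward trajectories over horizon H = 4. The total-reward
# objective and a general objective (here the max over the
# trajectory, as one example) can prefer different trajectories.
traj_a = [1, 1, 1, 1]   # steady rewards
traj_b = [0, 0, 0, 3]   # one large spike

total = lambda r: sum(r)   # standard objective: sum_{h=1}^H r_h
spike = lambda r: max(r)   # a general, non-additive objective

assert total(traj_a) > total(traj_b)   # 4 > 3: sum prefers traj_a
assert spike(traj_b) > spike(traj_a)   # 3 > 1: max prefers traj_b
```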

Average Case Column Subset Selection for Entrywise $\ell_1$-Norm Loss

no code implementations • 16 Apr 2020 • Zhao Song, David P. Woodruff, Peilin Zhong

If the matrix entries are drawn from any distribution $\mu$ for which the $(1+\gamma)$-th moment exists, for an arbitrarily small constant $\gamma > 0$, then it is possible to obtain a $(1+\epsilon)$-approximate column subset selection to the entrywise $\ell_1$-norm in nearly linear time.

Average Case Column Subset Selection for Entrywise $\ell_1$-Norm Loss

1 code implementation • NeurIPS 2019 • Zhao Song, David Woodruff, Peilin Zhong

If the matrix entries are drawn from any distribution $\mu$ for which the $(1+\gamma)$-th moment exists, for an arbitrarily small constant $\gamma > 0$, then it is possible to obtain a $(1+\epsilon)$-approximate column subset selection to the entrywise $\ell_1$-norm in nearly linear time.

Efficient Symmetric Norm Regression via Linear Sketching

no code implementations • NeurIPS 2019 • Zhao Song, Ruosong Wang, Lin F. Yang, Hongyang Zhang, Peilin Zhong

When the loss function is a general symmetric norm, our algorithm produces a $\sqrt{d} \cdot \mathrm{polylog} n \cdot \mathrm{mmc}(\ell)$-approximate solution in input-sparsity time, where $\mathrm{mmc}(\ell)$ is a quantity related to the symmetric norm under consideration.

regression

Enhancing Adversarial Defense by k-Winners-Take-All

1 code implementation • ICLR 2020 • Chang Xiao, Peilin Zhong, Changxi Zheng

In all cases, the robustness of k-WTA networks outperforms that of traditional networks under white-box attacks.

Adversarial Defense
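The k-WTA activation is simple to state: keep the k largest activations in a layer and zero out the rest. A minimal sketch of the activation itself (not the paper's full training setup or network architecture):

```python
import numpy as np

def k_wta(x, k):
    # k-Winners-Take-All: keep the k largest activations, zero the rest.
    out = np.zeros_like(x)
    top = np.argsort(x)[-k:]   # indices of the k largest entries
    out[top] = x[top]
    return out

x = np.array([0.3, -1.2, 2.0, 0.7, -0.1])
assert np.allclose(k_wta(x, 2), [0.0, 0.0, 2.0, 0.7, 0.0])
```

The discontinuity of this activation (a small input change can swap which units "win") is what makes gradient-based white-box attacks harder to mount against k-WTA networks.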

Rethinking Generative Mode Coverage: A Pointwise Guaranteed Approach

no code implementations • NeurIPS 2019 • Peilin Zhong, Yuchen Mo, Chang Xiao, Peng-Yu Chen, Changxi Zheng

The conventional wisdom to this end is to reduce, through training, a statistical distance (such as an $f$-divergence) between the generated distribution and the provided data distribution.

Towards a Zero-One Law for Column Subset Selection

1 code implementation • NeurIPS 2019 • Zhao Song, David P. Woodruff, Peilin Zhong

Our approximation algorithms handle functions which are not even scale-invariant, such as the Huber loss function, which we show have very different structural properties than $\ell_p$-norms; e.g., the lack of scale-invariance causes any column subset selection algorithm to provably require a $\sqrt{\log n}$ factor more columns than for $\ell_p$-norms. Nevertheless, we design the first efficient column subset selection algorithms for such error measures.
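To see why the Huber loss is not scale-invariant, note that it is quadratic near zero and linear in the tails, so no single exponent $p$ relates the loss of $c \cdot x$ to that of $x$. A small demonstration (the threshold $\delta = 1$ is the standard choice, not taken from the paper):

```python
import numpy as np

def huber(x, delta=1.0):
    # Huber loss: quadratic for |x| <= delta, linear beyond.
    a = np.abs(x)
    return np.where(a <= delta, 0.5 * a**2, delta * (a - 0.5 * delta))

x = np.array([0.5, 3.0])
c = 10.0
# For an l_p norm, loss(c*x) = c^p * loss(x) with one fixed p.
# Huber mixes p = 2 (small entries) and p = 1 (large entries),
# so scaling multiplies different entries by different factors:
ratios = huber(c * x) / huber(x)
assert not np.allclose(ratios[0], ratios[1])
```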

Subspace Embedding and Linear Regression with Orlicz Norm

no code implementations • ICML 2018 • Alexandr Andoni, Chengyu Lin, Ying Sheng, Peilin Zhong, Ruiqi Zhong

An Orlicz norm is parameterized by a non-negative convex function $G:\mathbb{R}_+\rightarrow\mathbb{R}_+$ with $G(0)=0$: the Orlicz norm of a vector $x\in\mathbb{R}^n$ is defined as $\|x\|_G=\inf\left\{\alpha>0 \,\middle|\, \sum_{i=1}^n G(|x_i|/\alpha)\leq 1\right\}$.

regression
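Since $\sum_i G(|x_i|/\alpha)$ is non-increasing in $\alpha$, the infimum in the definition above can be found by bisection. A minimal sketch (the function name and search bounds are illustrative; the paper is about sketching for regression in this norm, not norm evaluation):

```python
import numpy as np

def orlicz_norm(x, G, lo=1e-12, hi=1e12, iters=200):
    # ||x||_G = inf{ alpha > 0 : sum_i G(|x_i| / alpha) <= 1 }.
    # G is non-decreasing on R_+, so the constraint sum is
    # non-increasing in alpha and bisection converges to the infimum.
    for _ in range(iters):
        mid = (lo + hi) / 2
        if np.sum(G(np.abs(x) / mid)) <= 1:
            hi = mid
        else:
            lo = mid
    return hi

x = np.array([3.0, 4.0])
# With G(t) = t^2, the Orlicz norm is exactly the l2 norm.
assert abs(orlicz_norm(x, lambda t: t**2) - 5.0) < 1e-6
```

Choosing $G(t) = t^p$ recovers the $\ell_p$ norms, which is why Orlicz norms are a natural common generalization for regression.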

BourGAN: Generative Networks with Metric Embeddings

1 code implementation • NeurIPS 2018 • Chang Xiao, Peilin Zhong, Changxi Zheng

This paper addresses the mode collapse for generative adversarial networks (GANs).

Nearly Optimal Dynamic $k$-Means Clustering for High-Dimensional Data

no code implementations • 1 Feb 2018 • Wei Hu, Zhao Song, Lin F. Yang, Peilin Zhong

We consider the $k$-means clustering problem in the dynamic streaming setting, where points from a discrete Euclidean space $\{1, 2, \ldots, \Delta\}^d$ can be dynamically inserted to or deleted from the dataset.

Clustering

Relative Error Tensor Low Rank Approximation

no code implementations • 26 Apr 2017 • Zhao Song, David P. Woodruff, Peilin Zhong

Despite the success on obtaining relative error low rank approximations for matrices, no such results were known for tensors.

Low Rank Approximation with Entrywise $\ell_1$-Norm Error

no code implementations • 3 Nov 2016 • Zhao Song, David P. Woodruff, Peilin Zhong

We give the first provable approximation algorithms for $\ell_1$-low rank approximation, showing that it is possible to achieve approximation factor $\alpha = (\log d) \cdot \mathrm{poly}(k)$ in $\mathrm{nnz}(A) + (n+d) \mathrm{poly}(k)$ time, where $\mathrm{nnz}(A)$ denotes the number of non-zero entries of $A$.

Distributed Low Rank Approximation of Implicit Functions of a Matrix

no code implementations • 28 Jan 2016 • David P. Woodruff, Peilin Zhong

For example, each of $s$ servers may have an $n \times d$ matrix $A^t$, and we may be interested in computing a low rank approximation to $A = f(\sum_{t=1}^s A^t)$, where $f$ is a function which is applied entrywise to the matrix $\sum_{t=1}^s A^t$.
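A naive reference computation (not the paper's communication-efficient protocol; the function name is illustrative) would ship every $A^t$ to one server, apply $f$ entrywise to the sum, and truncate the SVD:

```python
import numpy as np

def implicit_low_rank(parts, f, k):
    # Naive, non-distributed reference: materialize A = f(sum_t A^t)
    # entrywise, then truncate its SVD to rank k. The paper's point is
    # to approximate this without any server materializing A.
    A = f(np.sum(parts, axis=0))
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]

rng = np.random.default_rng(2)
parts = rng.normal(size=(3, 20, 10))           # s = 3 servers, each with a 20 x 10 matrix
A_k = implicit_low_rank(parts, f=np.abs, k=5)  # f applied entrywise
assert np.linalg.matrix_rank(A_k) <= 5
```

The difficulty is that $f$ applied to the *sum* does not decompose across servers, so each server's local low-rank approximation says little about the global one; communicating all of $A^t$ as above costs $O(snd)$, which is what the paper's protocols avoid.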
