Search Results for author: Zhao Song

Found 80 papers, 17 papers with code

Instance-hiding Schemes for Private Distributed Learning

no code implementations ICML 2020 Yangsibo Huang, Zhao Song, Sanjeev Arora, Kai Li

The new ideas in the current paper are: (a) new variants of mixup with negative as well as positive coefficients, and (b) an extension of sample-wise mixup to pixel-wise mixup.

Federated Learning
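
A minimal sketch of the two ideas named above, assuming images and labels as NumPy arrays; the coefficient range and the rule for mixing labels in the pixel-wise case are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

def signed_mixup(x1, x2, y1, y2, rng=np.random.default_rng(0)):
    """Sample-wise mixup whose coefficient may be negative."""
    lam = rng.uniform(-0.5, 1.5)      # illustrative range; weights still sum to 1
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

def pixelwise_mixup(x1, x2, y1, y2, rng=np.random.default_rng(0)):
    """Pixel-wise mixup: an independent coefficient per pixel."""
    lam = rng.uniform(0.0, 1.0, size=x1.shape)
    x = lam * x1 + (1 - lam) * x2
    m = lam.mean()                    # labels mixed by average weight (assumed rule)
    return x, m * y1 + (1 - m) * y2
```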

Bounding the Width of Neural Networks via Coupled Initialization -- A Worst Case Analysis

no code implementations26 Jun 2022 Alexander Munteanu, Simon Omlor, Zhao Song, David P. Woodruff

A common method in training neural networks is to initialize all the weights to be independent Gaussian vectors.
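
For contrast with that independent-Gaussian baseline, here is a minimal sketch of the coupled-initialization idea under the common reading that neurons come in pairs with identical weights and opposite output signs, so the network outputs exactly zero at initialization; the details are assumptions, not the paper's precise construction.

```python
import numpy as np

def coupled_init(m, d, rng=np.random.default_rng(0)):
    """2m hidden neurons as m coupled pairs: each Gaussian weight vector
    appears twice, with opposite output-layer signs."""
    W_half = rng.normal(size=(m, d))               # independent Gaussian weights
    W = np.vstack([W_half, W_half])                # duplicate each weight vector
    a = np.concatenate([np.ones(m), -np.ones(m)])  # paired signs cancel
    return W, a

def forward(W, a, x):
    return a @ np.maximum(W @ x, 0.0)              # two-layer ReLU network

W, a = coupled_init(8, 5)
print(forward(W, a, np.ones(5)))                   # 0.0 at initialization, for any x
```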

Smoothed Online Combinatorial Optimization Using Imperfect Predictions

no code implementations23 Apr 2022 Kai Wang, Zhao Song, Georgios Theocharous, Sridhar Mahadevan

Smoothed online combinatorial optimization considers a learner who repeatedly chooses a combinatorial decision to minimize an unknown changing cost function with a penalty on switching decisions in consecutive rounds.

Combinatorial Optimization
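
In symbols, the setting can be written as the following objective (a standard formulation; the switching weight $\lambda$ and the choice of norm are notational assumptions here):

\[ \min_{x_1, \ldots, x_T \in \mathcal{X}} \ \sum_{t=1}^{T} f_t(x_t) + \lambda \sum_{t=2}^{T} \|x_t - x_{t-1}\|, \]

where $\mathcal{X}$ is the combinatorial decision set, each cost $f_t$ is revealed only after $x_t$ is chosen, and the second sum penalizes switching decisions in consecutive rounds.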

Training Multi-Layer Over-Parametrized Neural Network in Subquadratic Time

no code implementations14 Dec 2021 Zhao Song, Lichen Zhang, Ruizhe Zhang

We consider the problem of training a multi-layer over-parametrized neural network to minimize the empirical risk induced by a loss function.

On Convergence of Federated Averaging Langevin Dynamics

no code implementations9 Dec 2021 Wei Deng, Qian Zhang, Yi-An Ma, Zhao Song, Guang Lin

We develop theoretical guarantees for FA-LD for strongly log-concave distributions with non-i.i.d. data, and study how the injected noise, the stochastic-gradient noise, the heterogeneity of data, and the varying learning rates affect the convergence.

Fast Graph Neural Tangent Kernel via Kronecker Sketching

no code implementations4 Dec 2021 Shunhua Jiang, Yunze Man, Zhao Song, Zheng Yu, Danyang Zhuo

Given a kernel matrix of $n$ graphs, using sketching in solving kernel regression can reduce the running time to $o(n^3)$.
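
As one concrete illustration of how sketching beats the exact $O(n^3)$ solve, here is a Nyström-style approximation of kernel ridge regression built from $s \ll n$ sampled columns; this generic column-sampling route is an assumption for illustration and is not the paper's Kronecker-sketching construction.

```python
import numpy as np

def nystrom_krr(K, y, s, lam, rng=np.random.default_rng(0)):
    """Approximate kernel ridge regression from s sampled columns of K.
    The exact solve costs O(n^3); this costs O(n s^2 + s^3)."""
    n = K.shape[0]
    idx = rng.choice(n, size=s, replace=False)
    C = K[:, idx]                                  # n x s sampled columns
    W = K[np.ix_(idx, idx)]                        # s x s intersection block
    # normal equations of min_b ||C b - y||^2 + lam * b^T W b
    b = np.linalg.solve(C.T @ C + lam * W, C.T @ y)
    return idx, b                                  # predict at i via K[i, idx] @ b
```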

Pixelated Butterfly: Simple and Efficient Sparse training for Neural Network Models

1 code implementation ICLR 2022 Tri Dao, Beidi Chen, Kaizhao Liang, Jiaming Yang, Zhao Song, Atri Rudra, Christopher Ré

To address this, our main insight is to optimize over a continuous superset of sparse matrices with a fixed structure known as products of butterfly matrices.

Language Modelling

Evaluating Gradient Inversion Attacks and Defenses in Federated Learning

1 code implementation NeurIPS 2021 Yangsibo Huang, Samyak Gupta, Zhao Song, Kai Li, Sanjeev Arora

Gradient inversion attacks (or input recovery from gradients) are an emerging threat to the security and privacy of federated learning, whereby malicious eavesdroppers or participants in the protocol can partially recover the clients' private data.

Federated Learning
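
The attack family under evaluation can be sketched as gradient matching in the style of "deep leakage from gradients": optimize a dummy input until its gradients reproduce the gradients observed on the wire. The optimizer, step count, and the assumption that the label y is known are illustrative choices, not this paper's protocol.

```python
import torch

def gradient_inversion(model, loss_fn, observed_grads, x_shape, y, steps=200):
    """Recover a client's input by matching its gradients to observed ones."""
    params = tuple(model.parameters())
    x = torch.randn(x_shape, requires_grad=True)   # dummy input to optimize
    opt = torch.optim.Adam([x], lr=0.1)
    for _ in range(steps):
        opt.zero_grad()
        grads = torch.autograd.grad(loss_fn(model(x), y), params,
                                    create_graph=True)
        mismatch = sum(((g - og) ** 2).sum()       # distance between gradients
                       for g, og in zip(grads, observed_grads))
        mismatch.backward()
        opt.step()
    return x.detach()                              # reconstructed input
```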

Online MAP Inference and Learning for Nonsymmetric Determinantal Point Processes

no code implementations29 Nov 2021 Aravind Reddy, Ryan A. Rossi, Zhao Song, Anup Rao, Tung Mai, Nedim Lipka, Gang Wu, Eunyee Koh, Nesreen Ahmed

In this paper, we introduce the online and streaming MAP inference and learning problems for Non-symmetric Determinantal Point Processes (NDPPs) where data points arrive in an arbitrary order and the algorithms are constrained to use a single pass over the data as well as sub-linear memory.

Point Processes

An Interpretable Graph Generative Model with Heterophily

no code implementations4 Nov 2021 Sudhanshu Chanpuriya, Ryan A. Rossi, Anup Rao, Tung Mai, Nedim Lipka, Zhao Song, Cameron Musco

These models output the probabilities of edges existing between all pairs of nodes, and the probability of a link between two nodes increases with the dot product of vectors associated with the nodes.

Link Prediction
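
The model class referenced here scores a node pair by the dot product of their embedding vectors; a minimal version with a logistic link (the specific link function is an assumption):

```python
import numpy as np

def edge_prob(z_i, z_j):
    """Link probability increases monotonically with the dot product."""
    return 1.0 / (1.0 + np.exp(-(z_i @ z_j)))      # logistic link, for illustration

z_u, z_v = np.array([0.5, 1.0]), np.array([0.2, 0.8])
print(edge_prob(z_u, z_v))        # larger dot product -> larger probability
```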

Scatterbrain: Unifying Sparse and Low-rank Attention Approximation

1 code implementation NeurIPS 2021 Beidi Chen, Tri Dao, Eric Winsor, Zhao Song, Atri Rudra, Christopher Ré

Recent advances in efficient Transformers have exploited either the sparsity or low-rank properties of attention matrices to reduce the computational and memory bottlenecks of modeling long sequences.

Image Generation Language Modelling
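
The unification can be pictured as approximating the attention matrix by a sum of a sparse term and a low-rank term. The dense toy decomposition below shows only the shape of the idea; Scatterbrain itself builds the sparse part with locality sensitive hashing and the low-rank part with kernel feature maps, neither of which is attempted here.

```python
import numpy as np

def sparse_plus_low_rank(A, rank, keep):
    """Toy decomposition A ~ S + L: truncated SVD for the low-rank part,
    largest-magnitude residual entries for the sparse part."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    L = (U[:, :rank] * s[:rank]) @ Vt[:rank]       # low-rank component
    R = A - L
    thresh = np.quantile(np.abs(R), 1.0 - keep)    # keep a `keep` fraction of entries
    S = np.where(np.abs(R) >= thresh, R, 0.0)      # sparse component
    return S, L
```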

Does Preprocessing Help Training Over-parameterized Neural Networks?

no code implementations NeurIPS 2021 Zhao Song, Shuo Yang, Ruizhe Zhang

The classical training method requires paying $\Omega(mnd)$ cost for both forward computation and backward computation, where $m$ is the width of the neural network, and we are given $n$ training points in $d$-dimensional space.

Provable Federated Adversarial Learning via Min-max Optimization

no code implementations29 Sep 2021 Xiaoxiao Li, Zhao Song, Jiaming Yang

Unlike the convergence analysis in centralized training that relies on the gradient direction, it is significantly harder to analyze the convergence in FAL for two reasons: 1) the complexity of min-max optimization, and 2) the model not updating in the gradient direction due to the multi-local updates on the client-side before aggregation.

Federated Learning

Iterative Sketching and its Application to Federated Learning

no code implementations29 Sep 2021 Zhao Song, Zheng Yu, Lichen Zhang

Though most federated learning frameworks only require clients and the server to send gradient information over the network, they still face the challenges of communication efficiency and data privacy.

Federated Learning
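
A minimal sketch of the communication pattern this line of work studies: every client projects its gradient with a shared random sketch before sending, and the server de-sketches the average. The Gaussian sketch here is an assumed stand-in for whatever sketch the paper actually uses.

```python
import numpy as np

d, b = 10_000, 500                          # model dimension, sketch dimension
rng = np.random.default_rng(0)
S = rng.normal(size=(b, d)) / np.sqrt(b)    # shared sketching matrix, E[S^T S] = I

def client_message(grad):
    return S @ grad                         # transmit b numbers instead of d

def server_step(w, msgs, lr):
    avg = np.mean(msgs, axis=0)             # average the sketched gradients
    return w - lr * (S.T @ avg)             # de-sketch, then take a gradient step
```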

Sample Complexity of Deep Active Learning

no code implementations29 Sep 2021 Zhao Song, Baocheng Sun, Danyang Zhuo

In this paper, we present the first deep active learning algorithm which has a provable sample complexity.

Active Learning Fraud Detection +1

InstaHide’s Sample Complexity When Mixing Two Private Images

no code implementations29 Sep 2021 Baihe Huang, Zhao Song, Runzhou Tao, Ruizhe Zhang, Danyang Zhuo

Inspired by the InstaHide challenge [Huang, Song, Li and Arora'20], [Chen, Song and Zhuo'20] recently provided a mathematical formulation of the InstaHide attack problem under a Gaussian image distribution.

Fast Sketching of Polynomial Kernels of Polynomial Degree

no code implementations21 Aug 2021 Zhao Song, David P. Woodruff, Zheng Yu, Lichen Zhang

Recent techniques in oblivious sketching reduce the dependence in the running time on the degree $q$ of the polynomial kernel from exponential to polynomial, which is useful for the Gaussian kernel, for which $q$ can be chosen to be polylogarithmic.

Sublinear Least-Squares Value Iteration via Locality Sensitive Hashing

no code implementations18 May 2021 Anshumali Shrivastava, Zhao Song, Zhaozhuo Xu

We present the first provable Least-Squares Value Iteration (LSVI) algorithms that have runtime complexity sublinear in the number of actions.

reinforcement-learning

FL-NTK: A Neural Tangent Kernel-based Framework for Federated Learning Convergence Analysis

no code implementations11 May 2021 Baihe Huang, Xiaoxiao Li, Zhao Song, Xin Yang

Nevertheless, training analysis of neural networks in FL is non-trivial for two reasons: first, the objective loss function we are optimizing is non-smooth and non-convex, and second, we are not even updating in the gradient direction.

Federated Learning

Near-Optimal Two-Pass Streaming Algorithm for Sampling Random Walks over Directed Graphs

no code implementations22 Feb 2021 Lijie Chen, Gillat Kol, Dmitry Paramonov, Raghuvansh Saxena, Zhao Song, Huacheng Yu

In addition, we show a similar $\tilde{\Theta}(n \cdot \sqrt{L})$ bound on the space complexity of any algorithm (with any number of passes) for the related problem of sampling an $L$-step random walk from every vertex in the graph.

Data Structures and Algorithms Computational Complexity

Symmetric Sparse Boolean Matrix Factorization and Applications

no code implementations2 Feb 2021 Sitan Chen, Zhao Song, Runzhou Tao, Ruizhe Zhang

As this problem is hard in the worst-case, we study a natural average-case variant that arises in the context of these reconstruction attacks: $\mathbf{M} = \mathbf{W}\mathbf{W}^{\top}$ for $\mathbf{W}$ a random Boolean matrix with $k$-sparse rows, and the goal is to recover $\mathbf{W}$ up to column permutation.

Tensor Decomposition
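
A small generator for the average-case instances described above, assuming "random Boolean matrix with $k$-sparse rows" means each row has exactly $k$ uniformly placed ones, and with the product taken over the Boolean (OR-of-ANDs) semiring:

```python
import numpy as np

def random_instance(n, r, k, rng=np.random.default_rng(0)):
    """Sample W in {0,1}^{n x r} with k-sparse rows and form
    M = W W^T with M[i,j] = OR_t (W[i,t] AND W[j,t])."""
    W = np.zeros((n, r), dtype=np.int64)
    for i in range(n):
        W[i, rng.choice(r, size=k, replace=False)] = 1
    M = (W @ W.T) > 0          # integer product > 0 equals the Boolean product
    return M, W                # goal: recover W from M, up to column permutation
```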

Solving SDP Faster: A Robust IPM Framework and Efficient Implementation

no code implementations20 Jan 2021 Baihe Huang, Shunhua Jiang, Zhao Song, Runzhou Tao

This paper introduces a new robust interior point method analysis for semidefinite programming (SDP).

Optimization and Control Data Structures and Algorithms

Minimum Cost Flows, MDPs, and $\ell_1$-Regression in Nearly Linear Time for Dense Instances

no code implementations14 Jan 2021 Jan van den Brand, Yin Tat Lee, Yang P. Liu, Thatchaphol Saranurak, Aaron Sidford, Zhao Song, Di Wang

In the special case of the minimum cost flow problem on $n$-vertex $m$-edge graphs with integer polynomially-bounded costs and capacities we obtain a randomized method which solves the problem in $\tilde{O}(m+n^{1.5})$ time.

Data Structures and Algorithms Optimization and Control

What Can Phase Retrieval Tell Us About Private Distributed Learning?

no code implementations ICLR 2021 Sitan Chen, Xiaoxiao Li, Zhao Song, Danyang Zhuo

In this work, we examine the security of InstaHide, a scheme recently proposed by [Huang, Song, Li and Arora, ICML'20] for preserving the security of private datasets in the context of distributed learning.

MONGOOSE: A Learnable LSH Framework for Efficient Neural Network Training

no code implementations ICLR 2021 Beidi Chen, Zichang Liu, Binghui Peng, Zhaozhuo Xu, Jonathan Lingjie Li, Tri Dao, Zhao Song, Anshumali Shrivastava, Christopher Ré

Recent advances by practitioners in the deep learning community have breathed new life into Locality Sensitive Hashing (LSH), using it to reduce memory and time bottlenecks in neural network (NN) training.

Language Modelling Recommendation Systems

Oblivious Sketching-based Central Path Method for Solving Linear Programming Problems

no code implementations1 Jan 2021 Zhao Song, Zheng Yu

In this work, we propose a sketching-based central path method for solving linear programs, whose running time matches the state-of-the-art results [Cohen, Lee, Song STOC 19; Lee, Song, Zhang COLT 19].

Graph Neural Network Acceleration via Matrix Dimension Reduction

no code implementations1 Jan 2021 Shunhua Jiang, Yunze Man, Zhao Song, Danyang Zhuo

Theoretically, we present two techniques to speed up GNTK training while preserving the generalization error: (1) We use a novel matrix decoupling method to reduce matrix dimensions during the kernel solving.

Dimensionality Reduction

InstaHide's Sample Complexity When Mixing Two Private Images

no code implementations24 Nov 2020 Baihe Huang, Zhao Song, Runzhou Tao, Ruizhe Zhang, Danyang Zhuo

Inspired by the InstaHide challenge [Huang, Song, Li and Arora'20], [Chen, Song and Zhuo'20] recently provided a mathematical formulation of the InstaHide attack problem under a Gaussian image distribution.

On InstaHide, Phase Retrieval, and Sparse Matrix Factorization

no code implementations23 Nov 2020 Sitan Chen, Xiaoxiao Li, Zhao Song, Danyang Zhuo

In this work, we examine the security of InstaHide, a scheme recently proposed by [Huang, Song, Li and Arora, ICML'20] for preserving the security of private datasets in the context of distributed learning.

Algorithms and Hardness for Linear Algebra on Geometric Graphs

no code implementations4 Nov 2020 Josh Alman, Timothy Chu, Aaron Schild, Zhao Song

We investigate whether or not it is possible to solve the following problems in $n^{1+o(1)}$ time for a $\mathsf{K}$-graph $G_P$ when $d < n^{o(1)}$: (1) multiply a given vector by the adjacency matrix or Laplacian matrix of $G_P$; (2) find a spectral sparsifier of $G_P$; (3) solve a Laplacian system in $G_P$'s Laplacian matrix. For each of these problems, we consider all functions of the form $\mathsf{K}(u, v) = f(\|u-v\|_2^2)$ for a function $f:\mathbb{R} \rightarrow \mathbb{R}$.

MixCon: Adjusting the Separability of Data Representations for Harder Data Recovery

no code implementations22 Oct 2020 Xiaoxiao Li, Yangsibo Huang, Binghui Peng, Zhao Song, Kai Li

To address the issue that deep neural networks (DNNs) are vulnerable to model inversion attacks, we design an objective function, which adjusts the separability of the hidden data representations, as a way to control the trade-off between data utility and vulnerability to inversion attacks.

InstaHide: Instance-hiding Schemes for Private Distributed Learning

2 code implementations6 Oct 2020 Yangsibo Huang, Zhao Song, Kai Li, Sanjeev Arora

This paper introduces InstaHide, a simple encryption of training images, which can be plugged into existing distributed deep learning pipelines.

Generalized Leverage Score Sampling for Neural Networks

no code implementations NeurIPS 2020 Jason D. Lee, Ruoqi Shen, Zhao Song, Mengdi Wang, Zheng Yu

Leverage score sampling is a powerful technique originating in theoretical computer science that can be used to speed up a large number of fundamental problems, e.g., linear regression, linear programming, semi-definite programming, the cutting plane method, graph sparsification, maximum matching, and max-flow.

Learning Theory
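
The core primitive is short enough to state in full: the leverage score of row $i$ is $\ell_i = a_i^\top (A^\top A)^{+} a_i$, and sampling rows proportionally to these scores (with reweighting) gives an unbiased spectral sketch. The dense QR computation below is for clarity; fast versions approximate the scores instead.

```python
import numpy as np

def leverage_scores(A):
    """Row norms of Q from a thin QR equal the leverage scores of A."""
    Q, _ = np.linalg.qr(A)                 # columns of Q span col(A)
    return np.sum(Q * Q, axis=1)

def sample_rows(A, s, rng=np.random.default_rng(0)):
    """Sample s rows proportionally to leverage scores, with reweighting."""
    p = leverage_scores(A)
    p /= p.sum()
    idx = rng.choice(A.shape[0], size=s, p=p)
    return A[idx] / np.sqrt(s * p[idx])[:, None]   # unbiased sketch of A
```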

Training (Overparametrized) Neural Networks in Near-Linear Time

no code implementations20 Jun 2020 Jan van den Brand, Binghui Peng, Zhao Song, Omri Weinstein

The slow convergence rate and pathological curvature issues of first-order gradient methods for training deep neural networks initiated an ongoing effort for developing faster $\mathit{second}$-$\mathit{order}$ optimization algorithms beyond SGD, without compromising the generalization error.

Dimensionality Reduction

When is Particle Filtering Efficient for Planning in Partially Observed Linear Dynamical Systems?

no code implementations10 Jun 2020 Simon S. Du, Wei Hu, Zhiyuan Li, Ruoqi Shen, Zhao Song, Jiajun Wu

Though errors in past actions may affect the future, we are able to bound the number of particles needed so that the long-run reward of the policy based on particle filtering is close to that based on exact inference.

Decision Making

Average Case Column Subset Selection for Entrywise $\ell_1$-Norm Loss

no code implementations16 Apr 2020 Zhao Song, David P. Woodruff, Peilin Zhong

If the input matrix has entries drawn from any distribution $\mu$ for which the $(1+\gamma)$-th moment exists, for an arbitrarily small constant $\gamma > 0$, then it is possible to obtain a $(1+\epsilon)$-approximate column subset selection to the entrywise $\ell_1$-norm in nearly linear time.

An Improved Cutting Plane Method for Convex Optimization, Convex-Concave Games and its Applications

no code implementations8 Apr 2020 Haotian Jiang, Yin Tat Lee, Zhao Song, Sam Chiu-wai Wong

We propose a new cutting plane algorithm that uses an optimal $O(n \log (\kappa))$ number of oracle evaluations and an additional $O(n^2)$ time per evaluation, where $\kappa = nR/\epsilon$.

Privacy-preserving Learning via Deep Net Pruning

no code implementations4 Mar 2020 Yangsibo Huang, Yushan Su, Sachin Ravi, Zhao Song, Sanjeev Arora, Kai Li

This paper attempts to answer the question of whether neural network pruning can be used as a tool to achieve differential privacy without losing much data utility.

Network Pruning Privacy Preserving

Sketching Transformed Matrices with Applications to Natural Language Processing

no code implementations23 Feb 2020 Yingyu Liang, Zhao Song, Mengdi Wang, Lin F. Yang, Xin Yang

We show that our approach obtains small error and is efficient in both space and time.

Natural Language Processing

Meta-learning for mixed linear regression

no code implementations ICML 2020 Weihao Kong, Raghav Somani, Zhao Song, Sham Kakade, Sewoong Oh

In modern supervised learning, there are a large number of tasks, but many of them are associated with only a small amount of labeled data.

Meta-Learning Small Data Image Classification

Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality

no code implementations NeurIPS 2020 Yi Zhang, Orestis Plevrakis, Simon S. Du, Xingguo Li, Zhao Song, Sanjeev Arora

Our work proves convergence to low robust training loss for \emph{polynomial} width instead of exponential, under natural assumptions and with the ReLU activation.

online learning

Parallel Neural Text-to-Speech

no code implementations ICLR 2020 Kainan Peng, Wei Ping, Zhao Song, Kexin Zhao

In this work, we first propose ParaNet, a non-autoregressive seq2seq model that converts text to spectrogram.

Learning Mixtures of Linear Regressions in Subexponential Time via Fourier Moments

no code implementations16 Dec 2019 Sitan Chen, Jerry Li, Zhao Song

In this paper, we give the first algorithm for learning an MLR that runs in time which is sub-exponential in $k$.

Density Estimation

WaveFlow: A Compact Flow-based Model for Raw Audio

4 code implementations ICML 2020 Wei Ping, Kainan Peng, Kexin Zhao, Zhao Song

WaveFlow provides a unified view of likelihood-based models for 1-D data, including WaveNet and WaveGlow as special cases.

Provable Non-linear Inductive Matrix Completion

no code implementations NeurIPS 2019 Kai Zhong, Zhao Song, Prateek Jain, Inderjit S. Dhillon

The inductive matrix completion (IMC) method is a standard approach for this problem, where the given query as well as the items are embedded in a common low-dimensional space.

Matrix Completion

Average Case Column Subset Selection for Entrywise $\ell_1$-Norm Loss

1 code implementation NeurIPS 2019 Zhao Song, David Woodruff, Peilin Zhong

If the input matrix has entries drawn from any distribution $\mu$ for which the $(1+\gamma)$-th moment exists, for an arbitrarily small constant $\gamma > 0$, then it is possible to obtain a $(1+\epsilon)$-approximate column subset selection to the entrywise $\ell_1$-norm in nearly linear time.

Efficient Symmetric Norm Regression via Linear Sketching

no code implementations NeurIPS 2019 Zhao Song, Ruosong Wang, Lin F. Yang, Hongyang Zhang, Peilin Zhong

When the loss function is a general symmetric norm, our algorithm produces a $\sqrt{d} \cdot \mathrm{polylog} n \cdot \mathrm{mmc}(\ell)$-approximate solution in input-sparsity time, where $\mathrm{mmc}(\ell)$ is a quantity related to the symmetric norm under consideration.

Optimal Sketching for Kronecker Product Regression and Low Rank Approximation

no code implementations NeurIPS 2019 Huaian Diao, Rajesh Jayaram, Zhao Song, Wen Sun, David P. Woodruff

For an input $\mathcal{A}$ given as a Kronecker product of $q$ matrices $A_1, \ldots, A_q$, we give $O(\sum_{i=1}^q \text{nnz}(A_i))$ time algorithms, which is much faster than computing $\mathcal{A}$ explicitly.
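
The reason one avoids forming $\mathcal{A}$ explicitly is the classic vec-trick: for $q = 2$, $(A_1 \otimes A_2)\,\mathrm{vec}(X) = \mathrm{vec}(A_2 X A_1^\top)$ with column-major vectorization, so products with $\mathcal{A}$ never require materializing the Kronecker product. A small numerical check of this standard identity (background only, not the paper's sketching algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)
A1, A2 = rng.normal(size=(3, 4)), rng.normal(size=(5, 6))
X = rng.normal(size=(6, 4))                   # shape (cols(A2), cols(A1))

vec = lambda M: M.flatten(order="F")          # column-major vectorization
lhs = np.kron(A1, A2) @ vec(X)                # explicit Kronecker product
rhs = vec(A2 @ X @ A1.T)                      # vec-trick: no Kronecker formed
assert np.allclose(lhs, rhs)
```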

Total Least Squares Regression in Input Sparsity Time

1 code implementation NeurIPS 2019 Huaian Diao, Zhao Song, David P. Woodruff, Xin Yang

In the total least squares problem, one is given an $m \times n$ matrix $A$, and an $m \times d$ matrix $B$, and one seeks to "correct" both $A$ and $B$, obtaining matrices $\hat{A}$ and $\hat{B}$, so that there exists an $X$ satisfying the equation $\hat{A}X = \hat{B}$.
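
For orientation, the classical dense TLS solution via an SVD of the stacked matrix $[A\ B]$ (Golub and Van Loan); this runs in cubic time, and the paper's contribution, which this sketch does not attempt, is achieving input-sparsity time.

```python
import numpy as np

def total_least_squares(A, B):
    """Classical TLS: X = -V12 @ inv(V22), from the SVD of [A B]."""
    n, d = A.shape[1], B.shape[1]
    _, _, Vt = np.linalg.svd(np.hstack([A, B]))
    V = Vt.T
    V12, V22 = V[:n, n:], V[n:, n:]    # blocks of the right singular vectors
    return -V12 @ np.linalg.inv(V22)   # X satisfying hat(A) X = hat(B)
```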

Quadratic Suffices for Over-parametrization via Matrix Chernoff Bound

no code implementations9 Jun 2019 Zhao Song, Xin Yang

We improve the over-parametrization size over two beautiful results [Li and Liang 2018] and [Du, Zhai, Poczos and Singh 2019] in deep learning theory.

Learning Theory

Non-Autoregressive Neural Text-to-Speech

2 code implementations ICML 2020 Kainan Peng, Wei Ping, Zhao Song, Kexin Zhao

In this work, we propose ParaNet, a non-autoregressive seq2seq model that converts text to spectrogram.

Text-To-Speech Synthesis

Solving Empirical Risk Minimization in the Current Matrix Multiplication Time

no code implementations11 May 2019 Yin Tat Lee, Zhao Song, Qiuyi Zhang

Our result generalizes the very recent result of solving linear programs in the current matrix multiplication time [Cohen, Lee, Song'19] to a broader class of problems.

Efficient Model-free Reinforcement Learning in Metric Spaces

1 code implementation1 May 2019 Zhao Song, Wen Sun

Model-free Reinforcement Learning (RL) algorithms such as Q-learning [Watkins, Dayan 92] have been widely used in practice and can achieve human-level performance in applications such as video games [Mnih et al. 15].

Q-Learning reinforcement-learning

The Limitations of Adversarial Training and the Blind-Spot Attack

no code implementations ICLR 2019 Huan Zhang, Hongge Chen, Zhao Song, Duane Boning, Inderjit S. Dhillon, Cho-Jui Hsieh

In our paper, we shed some light on the practicality and the hardness of adversarial training by showing that the effectiveness (robustness on the test set) of adversarial training has a strong correlation with the distance between a test point and the manifold of training data embedded by the network.

Towards a Theoretical Understanding of Hashing-Based Neural Nets

no code implementations26 Dec 2018 Yibo Lin, Zhao Song, Lin F. Yang

In this paper, we provide provable guarantees on some hashing-based parameter reduction methods in neural nets.

Algorithmic Theory of ODEs and Sampling from Well-conditioned Logconcave Densities

no code implementations15 Dec 2018 Yin Tat Lee, Zhao Song, Santosh S. Vempala

We apply this to the sampling problem to obtain a nearly linear implementation of HMC for a broad class of smooth, strongly logconcave densities, with the number of iterations (parallel depth) and gradient evaluations being $\mathit{polylogarithmic}$ in the dimension (rather than polynomial as in previous work).

Revisiting the Softmax Bellman Operator: New Benefits and New Perspective

2 code implementations2 Dec 2018 Zhao Song, Ronald E. Parr, Lawrence Carin

The impact of softmax on the value function itself in reinforcement learning (RL) is often viewed as problematic because it leads to sub-optimal value (or Q) functions and interferes with the contraction properties of the Bellman operator.

Atari Games Q-Learning
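
The operator in question replaces the max in the Bellman backup with a softmax-weighted average over actions; a minimal tabular version (the temperature $\tau$ is a free parameter):

```python
import numpy as np

def softmax_bellman_backup(Q, rewards, P, gamma, tau):
    """One backup with V(s') = sum_a softmax(Q(s',.)/tau)_a * Q(s',a)
    in place of max_a Q(s',a). Q, rewards: (S, A); P: (S, A, S)."""
    z = Q / tau
    z = z - z.max(axis=1, keepdims=True)   # stabilize the exponentials
    w = np.exp(z)
    w /= w.sum(axis=1, keepdims=True)      # softmax over actions
    V = (w * Q).sum(axis=1)                # soft value of each state
    return rewards + gamma * (P @ V)       # new Q-table, shape (S, A)
```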

A Convergence Theory for Deep Learning via Over-Parameterization

no code implementations9 Nov 2018 Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song

In terms of network architectures, our theory at least applies to fully-connected neural networks, convolutional neural networks (CNN), and residual neural networks (ResNet).

Towards a Zero-One Law for Column Subset Selection

1 code implementation NeurIPS 2019 Zhao Song, David P. Woodruff, Peilin Zhong

Our approximation algorithms handle functions which are not even scale-invariant, such as the Huber loss function, which we show have very different structural properties than $\ell_p$-norms, e.g., one can show the lack of scale-invariance causes any column subset selection algorithm to provably require a $\sqrt{\log n}$ factor larger number of columns than $\ell_p$-norms; nevertheless we design the first efficient column subset selection algorithms for such error measures.

On the Convergence Rate of Training Recurrent Neural Networks

no code implementations NeurIPS 2019 Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song

In this paper, we focus on recurrent neural networks (RNNs) which are multi-layer networks widely used in natural language processing.

Natural Language Processing

Nonlinear Inductive Matrix Completion based on One-layer Neural Networks

no code implementations26 May 2018 Kai Zhong, Zhao Song, Prateek Jain, Inderjit S. Dhillon

A standard approach to modeling this problem is Inductive Matrix Completion where the predicted rating is modeled as an inner product of the user and the item features projected onto a latent space.

Matrix Completion Recommendation Systems

Towards Fast Computation of Certified Robustness for ReLU Networks

6 code implementations ICML 2018 Tsui-Wei Weng, Huan Zhang, Hongge Chen, Zhao Song, Cho-Jui Hsieh, Duane Boning, Inderjit S. Dhillon, Luca Daniel

Verifying the robustness property of a general Rectified Linear Unit (ReLU) network is an NP-complete problem [Katz, Barrett, Dill, Julian and Kochenderfer CAV17].

Learning Long Term Dependencies via Fourier Recurrent Units

2 code implementations ICML 2018 Jiong Zhang, Yibo Lin, Zhao Song, Inderjit S. Dhillon

In this paper we propose a simple recurrent architecture, the Fourier Recurrent Unit (FRU), that stabilizes the gradients that arise in its training while giving us stronger expressive power.

Nearly Optimal Dynamic $k$-Means Clustering for High-Dimensional Data

no code implementations1 Feb 2018 Wei Hu, Zhao Song, Lin F. Yang, Peilin Zhong

We consider the $k$-means clustering problem in the dynamic streaming setting, where points from a discrete Euclidean space $\{1, 2, \ldots, \Delta\}^d$ can be dynamically inserted to or deleted from the dataset.

Sketching for Kronecker Product Regression and P-splines

no code implementations27 Dec 2017 Huaian Diao, Zhao Song, Wen Sun, David P. Woodruff

TensorSketch, however, only provides input sparsity time for Kronecker product regression with respect to the $2$-norm.

Stochastic Multi-armed Bandits in Constant Space

no code implementations25 Dec 2017 David Liau, Eric Price, Zhao Song, Ger Yang

We consider the stochastic bandit problem in the sublinear space setting, where one cannot record the win-loss record for all $K$ arms.

Multi-Armed Bandits

Scalable Model Selection for Belief Networks

no code implementations NeurIPS 2017 Zhao Song, Yusuke Muraoka, Ryohei Fujimaki, Lawrence Carin

We propose a scalable algorithm for model selection in sigmoid belief networks (SBNs), based on the factorized asymptotic Bayesian (FAB) framework.

Model Selection

Learning Non-overlapping Convolutional Neural Networks with Multiple Kernels

no code implementations8 Nov 2017 Kai Zhong, Zhao Song, Inderjit S. Dhillon

In this paper, we consider parameter recovery for non-overlapping convolutional neural networks (CNNs) with multiple kernels.

Recovery Guarantees for One-hidden-layer Neural Networks

no code implementations ICML 2017 Kai Zhong, Zhao Song, Prateek Jain, Peter L. Bartlett, Inderjit S. Dhillon

For activation functions that are also smooth, we show $\mathit{local~linear~convergence}$ guarantees of gradient descent under a resampling rule.

Fast Regression with an $\ell_\infty$ Guarantee

no code implementations30 May 2017 Eric Price, Zhao Song, David P. Woodruff

Our main result is that, when $S$ is the subsampled randomized Fourier/Hadamard transform, the error $x' - x^*$ behaves as if it lies in a "random" direction within this bound: for any fixed direction $a\in \mathbb{R}^d$, we have with $1 - d^{-c}$ probability that \[ \langle a, x'-x^*\rangle \lesssim \frac{\|a\|_2\|x'-x^*\|_2}{d^{\frac{1}{2}-\gamma}}, \quad (1) \] where $c, \gamma > 0$ are arbitrary constants.

Relative Error Tensor Low Rank Approximation

no code implementations26 Apr 2017 Zhao Song, David P. Woodruff, Peilin Zhong

Despite the success on obtaining relative error low rank approximations for matrices, no such results were known for tensors.

Sublinear Time Orthogonal Tensor Decomposition

1 code implementation NeurIPS 2016 Zhao Song, David Woodruff, Huan Zhang

We show in a number of cases one can achieve the same theoretical guarantees in sublinear time, i.e., even without reading most of the input tensor.

Tensor Decomposition

Linear Feature Encoding for Reinforcement Learning

no code implementations NeurIPS 2016 Zhao Song, Ronald E. Parr, Xuejun Liao, Lawrence Carin

We then develop a supervised linear feature encoding method that is motivated by insights from linear value function approximation theory, as well as empirical successes from deep RL.

reinforcement-learning

Low Rank Approximation with Entrywise $\ell_1$-Norm Error

no code implementations3 Nov 2016 Zhao Song, David P. Woodruff, Peilin Zhong

We give the first provable approximation algorithms for $\ell_1$-low rank approximation, showing that it is possible to achieve approximation factor $\alpha = (\log d) \cdot \mathrm{poly}(k)$ in $\mathrm{nnz}(A) + (n+d) \mathrm{poly}(k)$ time, where $\mathrm{nnz}(A)$ denotes the number of non-zero entries of $A$.

A Max-Product EM Algorithm for Reconstructing Markov-tree Sparse Signals from Compressive Samples

no code implementations5 Sep 2012 Zhao Song, Aleksandar Dogandzic

Our signal reconstruction scheme is based on an EM iteration that aims at maximizing the posterior distribution of the signal and its state variables given the noise variance.

Image Reconstruction
