Search Results for author: Zhao Song

Found 144 papers, 19 papers with code

Instance-hiding Schemes for Private Distributed Learning

no code implementations ICML 2020 Yangsibo Huang, Zhao Song, Sanjeev Arora, Kai Li

The new ideas in the current paper are: (a) new variants of mixup with negative as well as positive coefficients, and (b) an extension of sample-wise mixup to pixel-wise mixup.

Federated Learning

Fourier Circuits in Neural Networks: Unlocking the Potential of Large Language Models in Mathematical Reasoning and Modular Arithmetic

no code implementations12 Feb 2024 Jiuxiang Gu, Chenyang Li, Yingyu Liang, Zhenmei Shi, Zhao Song, Tianyi Zhou

Our research presents a thorough analytical characterization of the features learned by stylized one-hidden layer neural networks and one-layer Transformers in addressing this task.

Mathematical Reasoning

Quantum Speedup for Spectral Approximation of Kronecker Products

no code implementations10 Feb 2024 Yeqi Gao, Zhao Song, Ruizhe Zhang

Given its widespread application in machine learning and optimization, the Kronecker product emerges as a pivotal linear algebra operator.

The Fine-Grained Complexity of Gradient Computation for Training Large Language Models

no code implementations7 Feb 2024 Josh Alman, Zhao Song

Large language models (LLMs) have made fundamental contributions over the last few years.

On Computational Limits of Modern Hopfield Models: A Fine-Grained Complexity Analysis

no code implementations7 Feb 2024 Jerry Yao-Chieh Hu, Thomas Lin, Zhao Song, Han Liu

Specifically, we establish an upper bound criterion for the norm of input query patterns and memory patterns.

Retrieval

Enhancing Stochastic Gradient Descent: A Unified Framework and Novel Acceleration Methods for Faster Convergence

no code implementations2 Feb 2024 Yichuan Deng, Zhao Song, Chiwun Yang

Based on SGD, previous works have proposed many algorithms that have improved convergence speed and generalization in stochastic optimization, such as SGDm, AdaGrad, Adam, etc.

Stochastic Optimization

Local Convergence of Approximate Newton Method for Two Layer Nonlinear Regression

no code implementations26 Nov 2023 Zhihang Li, Zhao Song, Zifan Wang, Junze Yin

Our main results involve analyzing the convergence properties of an approximate Newton method used to minimize the regularized training loss.

Question Answering regression +2

One Pass Streaming Algorithm for Super Long Token Attention Approximation in Sublinear Space

no code implementations24 Nov 2023 Raghav Addanki, Chenyang Li, Zhao Song, Chiwun Yang

Considering a single-layer self-attention with Query, Key, and Value matrices $Q, K, V \in \mathbb{R}^{n \times d}$, the polynomial method approximates the attention output $T \in \mathbb{R}^{n \times d}$.

Attribute

Revisiting Quantum Algorithms for Linear Regressions: Quadratic Speedups without Data-Dependent Parameters

no code implementations24 Nov 2023 Zhao Song, Junze Yin, Ruizhe Zhang

However, the running times of these algorithms depend on some quantum linear algebra-related parameters, such as $\kappa(A)$, the condition number of $A$.

regression

A Theoretical Insight into Attack and Defense of Gradient Leakage in Transformer

no code implementations22 Nov 2023 Chenyang Li, Zhao Song, Weixin Wang, Chiwun Yang

The Deep Leakage from Gradient (DLG) attack has emerged as a prevalent and highly effective method for extracting sensitive training data by inspecting exchanged gradients.

Privacy Preserving

Fast Heavy Inner Product Identification Between Weights and Inputs in Neural Network Training

no code implementations19 Nov 2023 Lianke Qin, Saayan Mitra, Zhao Song, Yuanyuan Yang, Tianyi Zhou

In this paper, we consider a heavy inner product identification problem, which generalizes the Light Bulb problem [prr89]: given two sets $A \subset \{-1,+1\}^d$ and $B \subset \{-1,+1\}^d$ with $|A|=|B| = n$, if there are exactly $k$ pairs whose inner product passes a certain threshold, i.e., $\{(a_1, b_1), \cdots, (a_k, b_k)\} \subset A \times B$ such that $\forall i \in [k], \langle a_i, b_i \rangle \geq \rho \cdot d$ for a threshold $\rho \in (0, 1)$, the goal is to identify those $k$ heavy inner products.

The Expressibility of Polynomial based Attention Scheme

no code implementations30 Oct 2023 Zhao Song, Guangyi Xu, Junze Yin

In this paper, we offer a theoretical analysis of the expressive capabilities of polynomial attention.

Decision Making

Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time

1 code implementation26 Oct 2023 Zichang Liu, Jue Wang, Tri Dao, Tianyi Zhou, Binhang Yuan, Zhao Song, Anshumali Shrivastava, Ce Zhang, Yuandong Tian, Christopher Re, Beidi Chen

We show that contextual sparsity exists, that it can be accurately predicted, and that we can exploit it to speed up LLM inference in wall-clock time without compromising the LLM's quality or in-context learning ability.

In-Context Learning

Unmasking Transformers: A Theoretical Approach to Data Recovery via Attention Weights

no code implementations19 Oct 2023 Yichuan Deng, Zhao Song, Shenghao Xie, Chiwun Yang

In the realm of deep learning, transformers have emerged as a dominant architecture, particularly in natural language processing tasks.

Superiority of Softmax: Unveiling the Performance Edge Over Linear Attention

no code implementations18 Oct 2023 Yichuan Deng, Zhao Song, Tianyi Zhou

Large transformer models have achieved state-of-the-art results in numerous natural language processing tasks.

An Automatic Learning Rate Schedule Algorithm for Achieving Faster Convergence and Steeper Descent

no code implementations17 Oct 2023 Zhao Song, Chiwun Yang

The delta-bar-delta algorithm is a learning rate adaptation technique that speeds up convergence by dynamically scheduling the learning rate based on the difference between the current and previous weight updates.

Scheduling

How to Capture Higher-order Correlations? Generalizing Matrix Softmax Attention to Kronecker Computation

no code implementations6 Oct 2023 Josh Alman, Zhao Song

Interestingly, the higher the order of the tensors, the lower the bound on the entries needs to be for an efficient algorithm.

Fine-tune Language Models to Approximate Unbiased In-context Learning

no code implementations5 Oct 2023 Timothy Chu, Zhao Song, Chiwun Yang

To address this issue, we introduce a reweighted algorithm called RICL (Reweighted In-context Learning).

In-Context Learning

A Unified Scheme of ResNet and Softmax

no code implementations23 Sep 2023 Zhao Song, Weixin Wang, Junze Yin

The Hessian is shown to be positive semidefinite, and its structure is characterized as the sum of a low-rank matrix and a diagonal matrix.

Image Classification object-detection +3

A Fast Optimization View: Reformulating Single Layer Attention in LLM Based on Tensor and SVM Trick, and Solving It in Matrix Multiplication Time

no code implementations14 Sep 2023 Yeqi Gao, Zhao Song, Weixin Wang, Junze Yin

$A_3$ is a matrix in $\mathbb{R}^{n \times d}$, $\mathsf{A}_{j_0} \in \mathbb{R}^{n \times d^2}$ is the $j_0$-th block of $\mathsf{A}$.

Is Solving Graph Neural Tangent Kernel Equivalent to Training Graph Neural Network?

no code implementations14 Sep 2023 Lianke Qin, Zhao Song, Baocheng Sun

A rising trend in theoretical deep learning is to understand why deep learning works through Neural Tangent Kernel (NTK) [jgh18], a kernel method that is equivalent to using gradient descent to train a multi-layer infinitely-wide neural network.

Graph Learning regression

Online Adaptive Mahalanobis Distance Estimation

no code implementations2 Sep 2023 Lianke Qin, Aravind Reddy, Zhao Song

Mahalanobis metrics are widely used in machine learning in conjunction with methods like $k$-nearest neighbors, $k$-means clustering, and $k$-medians clustering.

Clustering Dimensionality Reduction

Solving Attention Kernel Regression Problem via Pre-conditioner

no code implementations28 Aug 2023 Zhao Song, Junze Yin, Lichen Zhang

Large language models have shown impressive performance in many tasks.

regression

How to Protect Copyright Data in Optimization of Large Language Models?

no code implementations23 Aug 2023 Timothy Chu, Zhao Song, Chiwun Yang

Large language models (LLMs) and generative AI have played a transformative role in computer research and applications.

Language Modelling Large Language Model +1

Clustered Linear Contextual Bandits with Knapsacks

no code implementations21 Aug 2023 Yichuan Deng, Michalis Mamakos, Zhao Song

Thus, maximizing the total reward requires learning not only models about the reward and the resource consumption, but also cluster memberships.

Econometrics Multi-Armed Bandits

GradientCoin: A Peer-to-Peer Decentralized Large Language Models

no code implementations21 Aug 2023 Yeqi Gao, Zhao Song, Junze Yin

It is likely that only two types of people would be interested in setting up a practical system for it: $\bullet$ Those who prefer to use decentralized ChatGPT-like software.

Convergence of Two-Layer Regression with Nonlinear Units

no code implementations16 Aug 2023 Yichuan Deng, Zhao Song, Shenghao Xie

The softmax unit and the ReLU unit are the key structures in attention computation.

regression

Zero-th Order Algorithm for Softmax Attention Optimization

no code implementations17 Jul 2023 Yichuan Deng, Zhihang Li, Sridhar Mahadevan, Zhao Song

We demonstrate the convergence of our algorithm, highlighting its effectiveness in efficiently computing gradients for large-scale LLMs.

Fast Quantum Algorithm for Attention Computation

no code implementations16 Jul 2023 Yeqi Gao, Zhao Song, Xin Yang, Ruizhe Zhang

It is well known that quantum machines have certain computational advantages over classical machines.

Language Modelling Machine Translation +5

Faster Algorithms for Structured Linear and Kernel Support Vector Machines

no code implementations15 Jul 2023 Yuzhou Gu, Zhao Song, Lichen Zhang

Consequently, we obtain a variety of results for SVMs:
* For linear SVM, where the quadratic constraint matrix has treewidth $\tau$, we can solve the corresponding program in time $\widetilde O(n\tau^{(\omega+1)/2}\log(1/\epsilon))$;
* For linear SVM, where the quadratic constraint matrix admits a low-rank factorization of rank-$k$, we can solve the corresponding program in time $\widetilde O(nk^{(\omega+1)/2}\log(1/\epsilon))$;
* For Gaussian kernel SVM, where the data dimension $d = \Theta(\log n)$ and the squared dataset radius is small, we can solve it in time $O(n^{1+o(1)}\log(1/\epsilon))$.

Efficient SGD Neural Network Training via Sublinear Activated Neuron Identification

no code implementations13 Jul 2023 Lianke Qin, Zhao Song, Yuanyuan Yang

Deep learning has been widely used in many fields, but the model training process usually consumes massive computational resources and time.

Efficient Neural Network

In-Context Learning for Attention Scheme: from Single Softmax Regression to Multiple Softmax Regression via a Tensor Trick

no code implementations5 Jul 2023 Yeqi Gao, Zhao Song, Shenghao Xie

Given matrices $A_1 \in \mathbb{R}^{n \times d}$, $A_2 \in \mathbb{R}^{n \times d}$, and $B \in \mathbb{R}^{n \times n}$, the goal is to solve certain optimization problems: the normalized version $\min_{X} \| D(X)^{-1} \exp(A_1 X A_2^\top) - B \|_F^2$ and the rescaled version $\min_{X} \| \exp(A_1 X A_2^\top) - D(X) \cdot B \|_F^2$.

In-Context Learning Natural Language Understanding +1

H$_2$O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models

1 code implementation24 Jun 2023 Zhenyu Zhang, Ying Sheng, Tianyi Zhou, Tianlong Chen, Lianmin Zheng, Ruisi Cai, Zhao Song, Yuandong Tian, Christopher Ré, Clark Barrett, Zhangyang Wang, Beidi Chen

Based on these insights, we propose Heavy Hitter Oracle (H$_2$O), a KV cache eviction policy that dynamically retains a balance of recent and H$_2$ tokens.

Efficient Alternating Minimization with Applications to Weighted Low Rank Approximation

no code implementations7 Jun 2023 Zhao Song, Mingquan Ye, Junze Yin, Lichen Zhang

For weighted low rank approximation, this improves the runtime of [LLR16] from $n^2 k^2$ to $n^2k$.

Faster Robust Tensor Power Method for Arbitrary Order

no code implementations1 Jun 2023 Yichuan Deng, Zhao Song, Junze Yin

Tensor decomposition is a fundamental method used in various areas to deal with high-dimensional data.

Tensor Decomposition

Federated Empirical Risk Minimization via Second-Order Method

no code implementations27 May 2023 Song Bian, Zhao Song, Junze Yin

Many convex optimization problems with important applications in machine learning are formulated as empirical risk minimization (ERM).

Federated Learning regression +1

Fast Submodular Function Maximization

no code implementations15 May 2023 Lianke Qin, Zhao Song, Yitan Wang

We consider both the online and offline versions of the problem: in each iteration, the data set changes incrementally or is not changed, and a user can issue a query to maximize the function on a given subset of the data.

Document Summarization Image Segmentation +1

Fast and Efficient Matching Algorithm with Deadline Instances

no code implementations15 May 2023 Zhao Song, Weixin Wang, Chenbo Yin, Junze Yin

But in the \textsc{FastPostponedGreedy} algorithm, the status of each node is unknown at first.

Efficient Asynchronize Stochastic Gradient Algorithm with Structured Data

no code implementations13 May 2023 Zhao Song, Mingquan Ye

Deep learning has achieved impressive success in a variety of fields because of its good generalization.

Differentially Private Attention Computation

no code implementations8 May 2023 Yeqi Gao, Zhao Song, Xin Yang

Inspired by [Vyas, Kakade and Barak 2023], in this work we provide a provable result showing how to approximate the attention matrix in a differentially private way.

An Iterative Algorithm for Rescaled Hyperbolic Functions Regression

no code implementations1 May 2023 Yeqi Gao, Zhao Song, Junze Yin

LLMs have shown great promise in improving the accuracy and efficiency of these tasks, and have the potential to revolutionize the field of natural language processing (NLP) in the years to come.

In-Context Learning Language Modelling +4

The Closeness of In-Context Learning and Weight Shifting for Softmax Regression

no code implementations26 Apr 2023 Shuai Li, Zhao Song, Yu Xia, Tong Yu, Tianyi Zhou

Large language models (LLMs) are known for their exceptional performance in natural language processing, making them highly effective in many human life-related or even job-related tasks.

In-Context Learning regression

PVP: Pre-trained Visual Parameter-Efficient Tuning

no code implementations26 Apr 2023 Zhao Song, Ke Yang, Naiyang Guan, Junjie Zhu, Peng Qiao, Qingyong Hu

Large-scale pre-trained transformers have demonstrated remarkable success in various computer vision tasks.

Ranked #4 on Image Classification on VTAB-1k (using extra training data)

Fine-Grained Image Classification Visual Prompt Tuning

Solving Tensor Low Cycle Rank Approximation

no code implementations13 Apr 2023 Yichuan Deng, Yeqi Gao, Zhao Song

The classical tensor rank, Tucker rank, and tensor-train rank have been well studied in [Song, Woodruff, Zhong SODA 2019].

speech-recognition Speech Recognition

Randomized and Deterministic Attention Sparsification Algorithms for Over-parameterized Feature Dimension

no code implementations10 Apr 2023 Yichuan Deng, Sridhar Mahadevan, Zhao Song

It runs in $\widetilde{O}(\mathrm{nnz}(X) + n^{\omega})$ time, succeeds with probability $1-\delta$, and chooses $m = O(n \log(n/\delta))$.

Sentence

An Over-parameterized Exponential Regression

no code implementations29 Mar 2023 Yeqi Gao, Sridhar Mahadevan, Zhao Song

Mathematically, we define the neural function $F: \mathbb{R}^{d \times m} \times \mathbb{R}^d \rightarrow \mathbb{R}$ using an exponential activation function.

regression

Solving Regularized Exp, Cosh and Sinh Regression Problems

no code implementations28 Mar 2023 Zhihang Li, Zhao Song, Tianyi Zhou

In this paper, we make use of the input sparsity and propose an algorithm that uses $\log ( \|x_0 - x^*\|_2 / \epsilon)$ iterations and $\widetilde{O}(\mathrm{nnz}(A) + d^{\omega})$ time per iteration to solve the problem.

regression

A General Algorithm for Solving Rank-one Matrix Sensing

no code implementations22 Mar 2023 Lianke Qin, Zhao Song, Ruizhe Zhang

In this paper, we relax that rank-$k$ assumption and solve a much more general matrix sensing problem.

A Theoretical Analysis Of Nearest Neighbor Search On Approximate Near Neighbor Graph

no code implementations10 Mar 2023 Anshumali Shrivastava, Zhao Song, Zhaozhuo Xu

Current theoretical literature focuses on greedy search on exact near neighbor graph while practitioners use approximate near neighbor graph (ANN-Graph) to reduce the preprocessing time.

Streaming Kernel PCA Algorithm With Small Space

no code implementations8 Mar 2023 Yichuan Deng, Zhao Song, Zifan Wang, Han Zhang

The kernel method, which is commonly used in learning algorithms such as Support Vector Machines (SVMs), has also been applied in PCA algorithms.

Low Rank Matrix Completion via Robust Alternating Minimization in Nearly Linear Time

no code implementations21 Feb 2023 Yuzhou Gu, Zhao Song, Junze Yin, Lichen Zhang

Moreover, our algorithm runs in time $\widetilde O(|\Omega| k)$, which is nearly linear in the time to verify the solution while preserving the sample complexity.

Low-Rank Matrix Completion regression

A Nearly-Optimal Bound for Fast Regression with $\ell_\infty$ Guarantee

no code implementations1 Feb 2023 Zhao Song, Mingquan Ye, Junze Yin, Lichen Zhang

One popular approach for solving such an $\ell_2$ regression problem is via sketching: pick a structured random matrix $S\in \mathbb{R}^{m\times n}$ with $m\ll n$ for which $SA$ can be quickly computed, then solve the ``sketched'' regression problem $\arg\min_{x\in \mathbb{R}^d} \|SAx-Sb\|_2$.

regression

Exit options sustain altruistic punishment and decrease the second-order free-riders, but it is not a panacea

no code implementations12 Jan 2023 Chen Shen, Zhao Song, Lei Shi, Jun Tanimoto, Zhen Wang

Altruistic punishment, where individuals incur personal costs to punish others who have harmed third parties, presents an evolutionary conundrum as it undermines individual fitness.

Open-Ended Question Answering

Adaptive and Dynamic Multi-Resolution Hashing for Pairwise Summations

no code implementations21 Dec 2022 Lianke Qin, Aravind Reddy, Zhao Song, Zhaozhuo Xu, Danyang Zhuo

In this paper, we propose Adam-Hash: an adaptive and dynamic multi-resolution hashing data-structure for fast pairwise summation estimation.

A Faster $k$-means++ Algorithm

no code implementations28 Nov 2022 Jiehao Liang, Somdeb Sarkhel, Zhao Song, Chenbo Yin, Junze Yin, Danyang Zhuo

We propose a new algorithm \textsc{FastKmeans++} that only takes in $\widetilde{O}(nd + nk^2)$ time, in total.

Clustering

A Convergence Theory for Federated Average: Beyond Smoothness

no code implementations3 Nov 2022 Xiaoxiao Li, Zhao Song, Runzhou Tao, Guangyi Zhang

As a leading algorithm in this setting, Federated Averaging (FedAvg), which runs Stochastic Gradient Descent (SGD) in parallel on local devices and averages the sequences only once in a while, has been widely used due to its simplicity and low communication cost.

Edge-computing Federated Learning

Sketching for First Order Method: Efficient Algorithm for Low-Bandwidth Channel and Vulnerability

no code implementations15 Oct 2022 Zhao Song, Yitan Wang, Zheng Yu, Lichen Zhang

In this paper, we propose a novel sketching scheme for the first order method in large-scale distributed learning setting, such that the communication costs between distributed agents are saved while the convergence of the algorithms is still guaranteed.

Federated Learning

Dynamic Tensor Product Regression

no code implementations8 Oct 2022 Aravind Reddy, Zhao Song, Lichen Zhang

In this work, we initiate the study of \emph{Dynamic Tensor Product Regression}.

regression

A Sublinear Adversarial Training Algorithm

no code implementations10 Aug 2022 Yeqi Gao, Lianke Qin, Zhao Song, Yitan Wang

For a neural network of width $m$ and $n$ input training data points in $d$ dimensions, the forward and backward computation takes $\Omega(mnd)$ time per training iteration.

Training Overparametrized Neural Networks in Sublinear Time

no code implementations9 Aug 2022 Yichuan Deng, Hang Hu, Zhao Song, Omri Weinstein, Danyang Zhuo

The success of deep learning comes at a tremendous computational and energy cost, and the scalability of training massively overparametrized neural networks is becoming a real barrier to the progress of artificial intelligence (AI).

Dynamic Maintenance of Kernel Density Estimation Data Structure: From Practice to Theory

no code implementations8 Aug 2022 Jiehao Liang, Zhao Song, Zhaozhuo Xu, Junze Yin, Danyang Zhuo

In this work, we focus on the dynamic maintenance of KDE data structures with robustness to adversarial queries.

Density Estimation

Federated Adversarial Learning: A Framework with Convergence Analysis

no code implementations7 Aug 2022 Xiaoxiao Li, Zhao Song, Jiaming Yang

Unlike the convergence analysis in classical centralized training that relies on the gradient direction, it is significantly harder to analyze the convergence in FAL for three reasons: 1) the complexity of min-max optimization, 2) model not updating in the gradient direction due to the multi-local updates on the client-side before aggregation and 3) inter-client heterogeneity.

Federated Learning

Bounding the Width of Neural Networks via Coupled Initialization -- A Worst Case Analysis

no code implementations26 Jun 2022 Alexander Munteanu, Simon Omlor, Zhao Song, David P. Woodruff

A common method in training neural networks is to initialize all the weights to be independent Gaussian vectors.

Smoothed Online Combinatorial Optimization Using Imperfect Predictions

no code implementations23 Apr 2022 Kai Wang, Zhao Song, Georgios Theocharous, Sridhar Mahadevan

Smoothed online combinatorial optimization considers a learner who repeatedly chooses a combinatorial decision to minimize an unknown changing cost function with a penalty on switching decisions in consecutive rounds.

Combinatorial Optimization

Training Multi-Layer Over-Parametrized Neural Network in Subquadratic Time

no code implementations14 Dec 2021 Zhao Song, Lichen Zhang, Ruizhe Zhang

We consider the problem of training a multi-layer over-parametrized neural network to minimize the empirical risk induced by a loss function.

On Convergence of Federated Averaging Langevin Dynamics

no code implementations9 Dec 2021 Wei Deng, Qian Zhang, Yi-An Ma, Zhao Song, Guang Lin

We develop theoretical guarantees for FA-LD for strongly log-concave distributions with non-i.i.d. data and study how the injected noise and the stochastic-gradient noise, the heterogeneity of data, and the varying learning rates affect the convergence.

Uncertainty Quantification

Fast Graph Neural Tangent Kernel via Kronecker Sketching

no code implementations4 Dec 2021 Shunhua Jiang, Yunze Man, Zhao Song, Zheng Yu, Danyang Zhuo

Given a kernel matrix of $n$ graphs, using sketching in solving kernel regression can reduce the running time to $o(n^3)$.

regression

Pixelated Butterfly: Simple and Efficient Sparse training for Neural Network Models

1 code implementation ICLR 2022 Tri Dao, Beidi Chen, Kaizhao Liang, Jiaming Yang, Zhao Song, Atri Rudra, Christopher Ré

To address this, our main insight is to optimize over a continuous superset of sparse matrices with a fixed structure known as products of butterfly matrices.

Language Modelling

Evaluating Gradient Inversion Attacks and Defenses in Federated Learning

1 code implementation NeurIPS 2021 Yangsibo Huang, Samyak Gupta, Zhao Song, Kai Li, Sanjeev Arora

Gradient inversion attack (or input recovery from gradient) is an emerging threat to the security and privacy preservation of Federated learning, whereby malicious eavesdroppers or participants in the protocol can recover (partially) the clients' private data.

Federated Learning

Online MAP Inference and Learning for Nonsymmetric Determinantal Point Processes

no code implementations29 Nov 2021 Aravind Reddy, Ryan A. Rossi, Zhao Song, Anup Rao, Tung Mai, Nedim Lipka, Gang Wu, Eunyee Koh, Nesreen Ahmed

In this paper, we introduce the online and streaming MAP inference and learning problems for Non-symmetric Determinantal Point Processes (NDPPs) where data points arrive in an arbitrary order and the algorithms are constrained to use a single-pass over the data as well as sub-linear memory.

Point Processes valid

Scatterbrain: Unifying Sparse and Low-rank Attention Approximation

1 code implementation NeurIPS 2021 Beidi Chen, Tri Dao, Eric Winsor, Zhao Song, Atri Rudra, Christopher Ré

Recent advances in efficient Transformers have exploited either the sparsity or low-rank properties of attention matrices to reduce the computational and memory bottlenecks of modeling long sequences.

Image Generation Language Modelling

Does Preprocessing Help Training Over-parameterized Neural Networks?

no code implementations NeurIPS 2021 Zhao Song, Shuo Yang, Ruizhe Zhang

The classical training method requires paying $\Omega(mnd)$ cost for both forward computation and backward computation, where $m$ is the width of the neural network, and we are given $n$ training points in $d$-dimensional space.

Provable Federated Adversarial Learning via Min-max Optimization

no code implementations29 Sep 2021 Xiaoxiao Li, Zhao Song, Jiaming Yang

Unlike the convergence analysis in centralized training that relies on the gradient direction, it is significantly harder to analyze the convergence in FAL for two reasons: 1) the complexity of min-max optimization, and 2) model not updating in the gradient direction due to the multi-local updates on the client-side before aggregation.

Federated Learning

InstaHide's Sample Complexity When Mixing Two Private Images

no code implementations29 Sep 2021 Baihe Huang, Zhao Song, Runzhou Tao, Ruizhe Zhang, Danyang Zhuo

Inspired by the InstaHide challenge [Huang, Song, Li and Arora'20], [Chen, Song and Zhuo'20] recently provided a mathematical formulation of the InstaHide attack problem under a Gaussian image distribution.

Vocal Bursts Valence Prediction

Iterative Sketching and its Application to Federated Learning

no code implementations29 Sep 2021 Zhao Song, Zheng Yu, Lichen Zhang

Though most federated learning frameworks only require clients and the server to send gradient information over the network, they still face the challenges of communication efficiency and data privacy.

Federated Learning LEMMA

Sample Complexity of Deep Active Learning

no code implementations29 Sep 2021 Zhao Song, Baocheng Sun, Danyang Zhuo

In this paper, we present the first deep active learning algorithm which has a provable sample complexity.

Active Learning BIG-bench Machine Learning +2

Fast Sketching of Polynomial Kernels of Polynomial Degree

no code implementations21 Aug 2021 Zhao Song, David P. Woodruff, Zheng Yu, Lichen Zhang

Recent techniques in oblivious sketching reduce the dependence in the running time on the degree $q$ of the polynomial kernel from exponential to polynomial, which is useful for the Gaussian kernel, for which $q$ can be chosen to be polylogarithmic.

BIG-bench Machine Learning

Sublinear Least-Squares Value Iteration via Locality Sensitive Hashing

no code implementations18 May 2021 Anshumali Shrivastava, Zhao Song, Zhaozhuo Xu

We present the first provable Least-Squares Value Iteration (LSVI) algorithms that have runtime complexity sublinear in the number of actions.

reinforcement-learning Reinforcement Learning (RL)

FL-NTK: A Neural Tangent Kernel-based Framework for Federated Learning Convergence Analysis

no code implementations11 May 2021 Baihe Huang, Xiaoxiao Li, Zhao Song, Xin Yang

Nevertheless, training analysis of neural networks in FL is non-trivial for two reasons: first, the objective loss function we are optimizing is non-smooth and non-convex, and second, we are even not updating in the gradient direction.

Federated Learning

Near-Optimal Two-Pass Streaming Algorithm for Sampling Random Walks over Directed Graphs

no code implementations22 Feb 2021 Lijie Chen, Gillat Kol, Dmitry Paramonov, Raghuvansh Saxena, Zhao Song, Huacheng Yu

In addition, we show a similar $\tilde{\Theta}(n \cdot \sqrt{L})$ bound on the space complexity of any algorithm (with any number of passes) for the related problem of sampling an $L$-step random walk from every vertex in the graph.

Data Structures and Algorithms Computational Complexity

Symmetric Sparse Boolean Matrix Factorization and Applications

no code implementations2 Feb 2021 Sitan Chen, Zhao Song, Runzhou Tao, Ruizhe Zhang

As this problem is hard in the worst-case, we study a natural average-case variant that arises in the context of these reconstruction attacks: $\mathbf{M} = \mathbf{W}\mathbf{W}^{\top}$ for $\mathbf{W}$ a random Boolean matrix with $k$-sparse rows, and the goal is to recover $\mathbf{W}$ up to column permutation.

Tensor Decomposition

Solving SDP Faster: A Robust IPM Framework and Efficient Implementation

no code implementations20 Jan 2021 Baihe Huang, Shunhua Jiang, Zhao Song, Runzhou Tao

This paper introduces a new robust interior point method analysis for semidefinite programming (SDP).

Optimization and Control Data Structures and Algorithms

Minimum Cost Flows, MDPs, and $\ell_1$-Regression in Nearly Linear Time for Dense Instances

no code implementations14 Jan 2021 Jan van den Brand, Yin Tat Lee, Yang P. Liu, Thatchaphol Saranurak, Aaron Sidford, Zhao Song, Di Wang

In the special case of the minimum cost flow problem on $n$-vertex $m$-edge graphs with integer polynomially-bounded costs and capacities we obtain a randomized method which solves the problem in $\tilde{O}(m+n^{1.5})$ time.

Data Structures and Algorithms Optimization and Control

MONGOOSE: A Learnable LSH Framework for Efficient Neural Network Training

no code implementations ICLR 2021 Beidi Chen, Zichang Liu, Binghui Peng, Zhaozhuo Xu, Jonathan Lingjie Li, Tri Dao, Zhao Song, Anshumali Shrivastava, Christopher Re

Recent advances by practitioners in the deep learning community have breathed new life into Locality Sensitive Hashing (LSH), using it to reduce memory and time bottlenecks in neural network (NN) training.

Efficient Neural Network Language Modelling +2

Graph Neural Network Acceleration via Matrix Dimension Reduction

no code implementations1 Jan 2021 Shunhua Jiang, Yunze Man, Zhao Song, Danyang Zhuo

Theoretically, we present two techniques to speed up GNTK training while preserving the generalization error: (1) We use a novel matrix decoupling method to reduce matrix dimensions during the kernel solving.

Dimensionality Reduction

What Can Phase Retrieval Tell Us About Private Distributed Learning?

no code implementations ICLR 2021 Sitan Chen, Xiaoxiao Li, Zhao Song, Danyang Zhuo

In this work, we examine the security of InstaHide, a scheme recently proposed by [Huang, Song, Li and Arora, ICML'20] for preserving the security of private datasets in the context of distributed learning.

Retrieval

Oblivious Sketching-based Central Path Method for Solving Linear Programming Problems

no code implementations1 Jan 2021 Zhao Song, Zheng Yu

In this work, we propose a sketching-based central path method for solving linear programs, whose running time matches the state-of-the-art results [Cohen, Lee, Song STOC 19; Lee, Song, Zhang COLT 19].

InstaHide's Sample Complexity When Mixing Two Private Images

no code implementations24 Nov 2020 Baihe Huang, Zhao Song, Runzhou Tao, Junze Yin, Ruizhe Zhang, Danyang Zhuo

On the current InstaHide challenge setup, where each InstaHide image is a mixture of two private images, we present a new algorithm to recover all the private images with a provable guarantee and optimal sample complexity.

Vocal Bursts Valence Prediction

On InstaHide, Phase Retrieval, and Sparse Matrix Factorization

no code implementations23 Nov 2020 Sitan Chen, Xiaoxiao Li, Zhao Song, Danyang Zhuo

In this work, we examine the security of InstaHide, a scheme recently proposed by [Huang, Song, Li and Arora, ICML'20] for preserving the security of private datasets in the context of distributed learning.

Retrieval

Algorithms and Hardness for Linear Algebra on Geometric Graphs

no code implementations4 Nov 2020 Josh Alman, Timothy Chu, Aaron Schild, Zhao Song

We investigate whether or not it is possible to solve the following problems in $n^{1+o(1)}$ time for a $\mathsf{K}$-graph $G_P$ when $d < n^{o(1)}$:
$\bullet$ Multiply a given vector by the adjacency matrix or Laplacian matrix of $G_P$
$\bullet$ Find a spectral sparsifier of $G_P$
$\bullet$ Solve a Laplacian system in $G_P$'s Laplacian matrix
For each of these problems, we consider all functions of the form $\mathsf{K}(u, v) = f(\|u-v\|_2^2)$ for a function $f:\mathbb{R} \rightarrow \mathbb{R}$.

MixCon: Adjusting the Separability of Data Representations for Harder Data Recovery

no code implementations22 Oct 2020 Xiaoxiao Li, Yangsibo Huang, Binghui Peng, Zhao Song, Kai Li

To address the issue that deep neural networks (DNNs) are vulnerable to model inversion attacks, we design an objective function, which adjusts the separability of the hidden data representations, as a way to control the trade-off between data utility and vulnerability to inversion attacks.

InstaHide: Instance-hiding Schemes for Private Distributed Learning

3 code implementations6 Oct 2020 Yangsibo Huang, Zhao Song, Kai Li, Sanjeev Arora

This paper introduces InstaHide, a simple encryption of training images, which can be plugged into existing distributed deep learning pipelines.

Generalized Leverage Score Sampling for Neural Networks

no code implementations NeurIPS 2020 Jason D. Lee, Ruoqi Shen, Zhao Song, Mengdi Wang, Zheng Yu

Leverage score sampling is a powerful technique that originates from theoretical computer science, which can be used to speed up a large number of fundamental questions, e.g., linear regression, linear programming, semi-definite programming, the cutting plane method, graph sparsification, maximum matching, and max-flow.

Learning Theory regression

Training (Overparametrized) Neural Networks in Near-Linear Time

no code implementations20 Jun 2020 Jan van den Brand, Binghui Peng, Zhao Song, Omri Weinstein

The slow convergence rate and pathological curvature issues of first-order gradient methods for training deep neural networks initiated an ongoing effort for developing faster $\mathit{second}$-$\mathit{order}$ optimization algorithms beyond SGD, without compromising the generalization error.

Dimensionality Reduction regression

When is Particle Filtering Efficient for Planning in Partially Observed Linear Dynamical Systems?

no code implementations10 Jun 2020 Simon S. Du, Wei Hu, Zhiyuan Li, Ruoqi Shen, Zhao Song, Jiajun Wu

Though errors in past actions may affect the future, we are able to bound the number of particles needed so that the long-run reward of the policy based on particle filtering is close to that based on exact inference.

Decision Making

Average Case Column Subset Selection for Entrywise $\ell_1$-Norm Loss

no code implementations16 Apr 2020 Zhao Song, David P. Woodruff, Peilin Zhong

entries drawn from any distribution $\mu$ for which the $(1+\gamma)$-th moment exists, for an arbitrarily small constant $\gamma > 0$, then it is possible to obtain a $(1+\epsilon)$-approximate column subset selection to the entrywise $\ell_1$-norm in nearly linear time.

An Improved Cutting Plane Method for Convex Optimization, Convex-Concave Games and its Applications

no code implementations8 Apr 2020 Haotian Jiang, Yin Tat Lee, Zhao Song, Sam Chiu-wai Wong

We propose a new cutting plane algorithm that uses an optimal $O(n \log (\kappa))$ evaluations of the oracle and an additional $O(n^2)$ time per evaluation, where $\kappa = nR/\epsilon$.

Privacy-preserving Learning via Deep Net Pruning

no code implementations4 Mar 2020 Yangsibo Huang, Yushan Su, Sachin Ravi, Zhao Song, Sanjeev Arora, Kai Li

This paper attempts to answer the question whether neural network pruning can be used as a tool to achieve differential privacy without losing much data utility.

Network Pruning Privacy Preserving

Sketching Transformed Matrices with Applications to Natural Language Processing

no code implementations23 Feb 2020 Yingyu Liang, Zhao Song, Mengdi Wang, Lin F. Yang, Xin Yang

We show that our approach obtains small error and is efficient in both space and time.

Meta-learning for mixed linear regression

no code implementations ICML 2020 Weihao Kong, Raghav Somani, Zhao Song, Sham Kakade, Sewoong Oh

In modern supervised learning, there are a large number of tasks, but many of them are associated with only a small amount of labeled data.

Meta-Learning regression +1

Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality

no code implementations NeurIPS 2020 Yi Zhang, Orestis Plevrakis, Simon S. Du, Xingguo Li, Zhao Song, Sanjeev Arora

Our work proves convergence to low robust training loss for \emph{polynomial} width instead of exponential, under natural assumptions and with the ReLU activation.

Parallel Neural Text-to-Speech

no code implementations ICLR 2020 Kainan Peng, Wei Ping, Zhao Song, Kexin Zhao

In this work, we first propose ParaNet, a non-autoregressive seq2seq model that converts text to spectrogram.

Learning Mixtures of Linear Regressions in Subexponential Time via Fourier Moments

no code implementations16 Dec 2019 Sitan Chen, Jerry Li, Zhao Song

In this paper, we give the first algorithm for learning an MLR that runs in time which is sub-exponential in $k$.

Clustering Density Estimation

WaveFlow: A Compact Flow-based Model for Raw Audio

4 code implementations ICML 2020 Wei Ping, Kainan Peng, Kexin Zhao, Zhao Song

WaveFlow provides a unified view of likelihood-based models for 1-D data, including WaveNet and WaveGlow as special cases.

Speech Synthesis

Provable Non-linear Inductive Matrix Completion

no code implementations NeurIPS 2019 Kai Zhong, Zhao Song, Prateek Jain, Inderjit S. Dhillon

The inductive matrix completion (IMC) method is a standard approach for this problem, where the given query as well as the items are embedded in a common low-dimensional space.

Matrix Completion Retrieval

Average Case Column Subset Selection for Entrywise $\ell_1$-Norm Loss

1 code implementation NeurIPS 2019 Zhao Song, David Woodruff, Peilin Zhong

entries drawn from any distribution $\mu$ for which the $(1+\gamma)$-th moment exists, for an arbitrarily small constant $\gamma > 0$, then it is possible to obtain a $(1+\epsilon)$-approximate column subset selection to the entrywise $\ell_1$-norm in nearly linear time.

Efficient Symmetric Norm Regression via Linear Sketching

no code implementations NeurIPS 2019 Zhao Song, Ruosong Wang, Lin F. Yang, Hongyang Zhang, Peilin Zhong

When the loss function is a general symmetric norm, our algorithm produces a $\sqrt{d} \cdot \mathrm{polylog} n \cdot \mathrm{mmc}(\ell)$-approximate solution in input-sparsity time, where $\mathrm{mmc}(\ell)$ is a quantity related to the symmetric norm under consideration.

regression

Optimal Sketching for Kronecker Product Regression and Low Rank Approximation

no code implementations NeurIPS 2019 Huaian Diao, Rajesh Jayaram, Zhao Song, Wen Sun, David P. Woodruff

For input $\mathcal{A}$ as above, we give $O(\sum_{i=1}^q \text{nnz}(A_i))$ time algorithms, which is much faster than computing $\mathcal{A}$.

regression

Total Least Squares Regression in Input Sparsity Time

1 code implementation NeurIPS 2019 Huaian Diao, Zhao Song, David P. Woodruff, Xin Yang

In the total least squares problem, one is given an $m \times n$ matrix $A$, and an $m \times d$ matrix $B$, and one seeks to "correct" both $A$ and $B$, obtaining matrices $\hat{A}$ and $\hat{B}$, so that there exists an $X$ satisfying the equation $\hat{A}X = \hat{B}$.

regression

Quadratic Suffices for Over-parametrization via Matrix Chernoff Bound

no code implementations9 Jun 2019 Zhao Song, Xin Yang

We improve the over-parametrization size over two beautiful results [Li and Liang' 2018] and [Du, Zhai, Poczos and Singh' 2019] in deep learning theory.

Learning Theory

Non-Autoregressive Neural Text-to-Speech

2 code implementations ICML 2020 Kainan Peng, Wei Ping, Zhao Song, Kexin Zhao

In this work, we propose ParaNet, a non-autoregressive seq2seq model that converts text to spectrogram.

Text-To-Speech Synthesis

Solving Empirical Risk Minimization in the Current Matrix Multiplication Time

no code implementations11 May 2019 Yin Tat Lee, Zhao Song, Qiuyi Zhang

Our result generalizes the very recent result of solving linear programs in the current matrix multiplication time [Cohen, Lee, Song'19] to a more broad class of problems.

Efficient Model-free Reinforcement Learning in Metric Spaces

1 code implementation1 May 2019 Zhao Song, Wen Sun

Model-free Reinforcement Learning (RL) algorithms such as Q-learning [Watkins, Dayan 92] have been widely used in practice and can achieve human-level performance in applications such as video games [Mnih et al. 15].

Q-Learning reinforcement-learning +1

The Limitations of Adversarial Training and the Blind-Spot Attack

no code implementations ICLR 2019 Huan Zhang, Hongge Chen, Zhao Song, Duane Boning, Inderjit S. Dhillon, Cho-Jui Hsieh

In our paper, we shed some light on the practicality and the hardness of adversarial training by showing that the effectiveness (robustness on the test set) of adversarial training has a strong correlation with the distance between a test point and the manifold of training data embedded by the network.

valid

Towards a Theoretical Understanding of Hashing-Based Neural Nets

no code implementations26 Dec 2018 Yibo Lin, Zhao Song, Lin F. Yang

In this paper, we provide provable guarantees on some hashing-based parameter reduction methods in neural nets.

Algorithmic Theory of ODEs and Sampling from Well-conditioned Logconcave Densities

no code implementations15 Dec 2018 Yin Tat Lee, Zhao Song, Santosh S. Vempala

We apply this to the sampling problem to obtain a nearly linear implementation of HMC for a broad class of smooth, strongly logconcave densities, with the number of iterations (parallel depth) and gradient evaluations being $\mathit{polylogarithmic}$ in the dimension (rather than polynomial as in previous work).

Revisiting the Softmax Bellman Operator: New Benefits and New Perspective

2 code implementations2 Dec 2018 Zhao Song, Ronald E. Parr, Lawrence Carin

The impact of softmax on the value function itself in reinforcement learning (RL) is often viewed as problematic because it leads to sub-optimal value (or Q) functions and interferes with the contraction properties of the Bellman operator.

Atari Games Q-Learning +1

A Convergence Theory for Deep Learning via Over-Parameterization

no code implementations9 Nov 2018 Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song

In terms of network architectures, our theory at least applies to fully-connected neural networks, convolutional neural networks (CNN), and residual neural networks (ResNet).

Towards a Zero-One Law for Column Subset Selection

1 code implementation NeurIPS 2019 Zhao Song, David P. Woodruff, Peilin Zhong

Our approximation algorithms handle functions which are not even scale-invariant, such as the Huber loss function, which we show has very different structural properties than $\ell_p$-norms; e.g., one can show that the lack of scale-invariance causes any column subset selection algorithm to provably require a $\sqrt{\log n}$ factor larger number of columns than for $\ell_p$-norms. Nevertheless, we design the first efficient column subset selection algorithms for such error measures.

On the Convergence Rate of Training Recurrent Neural Networks

no code implementations NeurIPS 2019 Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song

In this paper, we focus on recurrent neural networks (RNNs) which are multi-layer networks widely used in natural language processing.

Nonlinear Inductive Matrix Completion based on One-layer Neural Networks

no code implementations26 May 2018 Kai Zhong, Zhao Song, Prateek Jain, Inderjit S. Dhillon

A standard approach to modeling this problem is Inductive Matrix Completion where the predicted rating is modeled as an inner product of the user and the item features projected onto a latent space.

Clustering Matrix Completion +1

Towards Fast Computation of Certified Robustness for ReLU Networks

6 code implementations ICML 2018 Tsui-Wei Weng, Huan Zhang, Hongge Chen, Zhao Song, Cho-Jui Hsieh, Duane Boning, Inderjit S. Dhillon, Luca Daniel

Verifying the robustness property of a general Rectified Linear Unit (ReLU) network is an NP-complete problem [Katz, Barrett, Dill, Julian and Kochenderfer CAV17].

Learning Long Term Dependencies via Fourier Recurrent Units

2 code implementations ICML 2018 Jiong Zhang, Yibo Lin, Zhao Song, Inderjit S. Dhillon

In this paper we propose a simple recurrent architecture, the Fourier Recurrent Unit (FRU), that stabilizes the gradients that arise in its training while giving us stronger expressive power.

Nearly Optimal Dynamic $k$-Means Clustering for High-Dimensional Data

no code implementations1 Feb 2018 Wei Hu, Zhao Song, Lin F. Yang, Peilin Zhong

We consider the $k$-means clustering problem in the dynamic streaming setting, where points from a discrete Euclidean space $\{1, 2, \ldots, \Delta\}^d$ can be dynamically inserted to or deleted from the dataset.

Clustering Vocal Bursts Intensity Prediction

Sketching for Kronecker Product Regression and P-splines

no code implementations27 Dec 2017 Huaian Diao, Zhao Song, Wen Sun, David P. Woodruff

That is, TensorSketch only provides input sparsity time for Kronecker product regression with respect to the $2$-norm.

regression

Stochastic Multi-armed Bandits in Constant Space

no code implementations25 Dec 2017 David Liau, Eric Price, Zhao Song, Ger Yang

We consider the stochastic bandit problem in the sublinear space setting, where one cannot record the win-loss record for all $K$ arms.

Multi-Armed Bandits

Scalable Model Selection for Belief Networks

no code implementations NeurIPS 2017 Zhao Song, Yusuke Muraoka, Ryohei Fujimaki, Lawrence Carin

We propose a scalable algorithm for model selection in sigmoid belief networks (SBNs), based on the factorized asymptotic Bayesian (FAB) framework.

Model Selection

Learning Non-overlapping Convolutional Neural Networks with Multiple Kernels

no code implementations8 Nov 2017 Kai Zhong, Zhao Song, Inderjit S. Dhillon

In this paper, we consider parameter recovery for non-overlapping convolutional neural networks (CNNs) with multiple kernels.

Recovery Guarantees for One-hidden-layer Neural Networks

no code implementations ICML 2017 Kai Zhong, Zhao Song, Prateek Jain, Peter L. Bartlett, Inderjit S. Dhillon

For activation functions that are also smooth, we show $\mathit{local~linear~convergence}$ guarantees of gradient descent under a resampling rule.

Fast Regression with an $\ell_\infty$ Guarantee

no code implementations30 May 2017 Eric Price, Zhao Song, David P. Woodruff

Our main result is that, when $S$ is the subsampled randomized Fourier/Hadamard transform, the error $x' - x^*$ behaves as if it lies in a "random" direction within this bound: for any fixed direction $a\in \mathbb{R}^d$, we have with $1 - d^{-c}$ probability that \[ \langle a, x'-x^*\rangle \lesssim \frac{\|a\|_2\|x'-x^*\|_2}{d^{\frac{1}{2}-\gamma}}, \quad (1) \] where $c, \gamma > 0$ are arbitrary constants.

regression

Relative Error Tensor Low Rank Approximation

no code implementations26 Apr 2017 Zhao Song, David P. Woodruff, Peilin Zhong

Despite the success on obtaining relative error low rank approximations for matrices, no such results were known for tensors.

Sublinear Time Orthogonal Tensor Decomposition

1 code implementation NeurIPS 2016 Zhao Song, David Woodruff, Huan Zhang

We show in a number of cases one can achieve the same theoretical guarantees in sublinear time, i.e., even without reading most of the input tensor.

Tensor Decomposition

Linear Feature Encoding for Reinforcement Learning

no code implementations NeurIPS 2016 Zhao Song, Ronald E. Parr, Xuejun Liao, Lawrence Carin

We then develop a supervised linear feature encoding method that is motivated by insights from linear value function approximation theory, as well as empirical successes from deep RL.

reinforcement-learning Reinforcement Learning (RL)

Low Rank Approximation with Entrywise $\ell_1$-Norm Error

no code implementations3 Nov 2016 Zhao Song, David P. Woodruff, Peilin Zhong

We give the first provable approximation algorithms for $\ell_1$-low rank approximation, showing that it is possible to achieve approximation factor $\alpha = (\log d) \cdot \mathrm{poly}(k)$ in $\mathrm{nnz}(A) + (n+d) \mathrm{poly}(k)$ time, where $\mathrm{nnz}(A)$ denotes the number of non-zero entries of $A$.

A Max-Product EM Algorithm for Reconstructing Markov-tree Sparse Signals from Compressive Samples

no code implementations5 Sep 2012 Zhao Song, Aleksandar Dogandzic

Our signal reconstruction scheme is based on an EM iteration that aims at maximizing the posterior distribution of the signal and its state variables given the noise variance.

Image Reconstruction
