Search Results for author: Tengyu Ma

Found 79 papers, 25 papers with code

Self-supervised Learning is More Robust to Dataset Imbalance

no code implementations 11 Oct 2021 Hong Liu, Jeff Z. HaoChen, Adrien Gaidon, Tengyu Ma

Third, inspired by the theoretical insights, we devise a re-weighted regularization technique that consistently improves the SSL representation quality on imbalanced datasets with several evaluation criteria, closing the small gap between balanced and imbalanced datasets with the same number of examples.

Self-Supervised Learning

On the Opportunities and Risks of Foundation Models

1 code implementation 16 Aug 2021 Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, Aditi Raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang

AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks.

Transfer Learning

Learning Barrier Certificates: Towards Safe Reinforcement Learning with Zero Training-time Violations

no code implementations 4 Aug 2021 Yuping Luo, Tengyu Ma

This paper explores the possibility of safe RL algorithms with zero training-time safety violations in the challenging setting where we are only given a safe but trivial-reward initial policy without any prior knowledge of the dynamics model and additional offline data.

Safe Reinforcement Learning

Statistically Meaningful Approximation: a Case Study on Approximating Turing Machines with Transformers

no code implementations 28 Jul 2021 Colin Wei, Yining Chen, Tengyu Ma

A common lens to theoretically study neural net architectures is to analyze the functions they can approximate.

Generalization Bounds

Calibrating Predictions to Decisions: A Novel Approach to Multi-Class Calibration

no code implementations 12 Jul 2021 Shengjia Zhao, Michael P. Kim, Roshni Sahoo, Tengyu Ma, Stefano Ermon

In this work, we introduce a new notion, decision calibration, that requires the predicted distribution and true distribution to be "indistinguishable" to a set of downstream decision-makers.

Decision Making

Iterative Feature Matching: Toward Provable Domain Generalization with Logarithmic Environments

no code implementations 18 Jun 2021 Yining Chen, Elan Rosenfeld, Mark Sellke, Tengyu Ma, Andrej Risteski

Domain generalization aims at performing well on unseen test environments with data from a limited number of training environments.

Domain Generalization

Why Do Pretrained Language Models Help in Downstream Tasks? An Analysis of Head and Prompt Tuning

no code implementations 17 Jun 2021 Colin Wei, Sang Michael Xie, Tengyu Ma

The generative model in our analysis is either a Hidden Markov Model (HMM) or an HMM augmented with a latent memory component, motivated by long-term dependencies in natural language.

Label Noise SGD Provably Prefers Flat Global Minimizers

no code implementations 11 Jun 2021 Alex Damian, Tengyu Ma, Jason Lee

In overparametrized models, the noise in stochastic gradient descent (SGD) implicitly regularizes the optimization trajectory and determines which local minimum SGD converges to.

Joint System-Wise Optimization for Pipeline Goal-Oriented Dialog System

no code implementations 9 Jun 2021 Zichuan Lin, Jing Huang, BoWen Zhou, Xiaodong He, Tengyu Ma

Recent work (Takanobu et al., 2020) proposed system-wise evaluation of dialog systems and found that improvements to individual components (e.g., NLU, policy) in prior work may not necessarily benefit pipeline systems in system-wise evaluation.

Data Augmentation Goal-Oriented Dialog

Provable Guarantees for Self-Supervised Deep Learning with Spectral Contrastive Loss

no code implementations 8 Jun 2021 Jeff Z. HaoChen, Colin Wei, Adrien Gaidon, Tengyu Ma

Despite the empirical successes, theoretical foundations are limited: prior analyses assume conditional independence of the positive pairs given the same class label, but recent empirical applications use heavily correlated positive pairs (i.e., data augmentations of the same image).

Contrastive Learning Generalization Bounds +1

Why Do Local Methods Solve Nonconvex Problems?

no code implementations 24 Mar 2021 Tengyu Ma

Non-convex optimization is ubiquitous in modern machine learning.

Fine-Grained Gap-Dependent Bounds for Tabular MDPs via Adaptive Multi-Step Bootstrap

no code implementations 9 Feb 2021 Haike Xu, Tengyu Ma, Simon S. Du

We further show that for general MDPs, AMB suffers an additional $\frac{|Z_{mul}|}{\Delta_{min}}$ regret, where $Z_{mul}$ is the set of state-action pairs $(s, a)$ such that $a$ is a non-unique optimal action for $s$.

Multi-Armed Bandits

Provable Model-based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature

no code implementations 8 Feb 2021 Kefan Dong, Jiaqi Yang, Tengyu Ma

This paper studies model-based bandit and reinforcement learning (RL) with nonlinear function approximations.

Improved Uncertainty Post-Calibration via Rank Preserving Transforms

no code implementations 1 Jan 2021 Yu Bai, Tengyu Ma, Huan Wang, Caiming Xiong

In this paper, we propose Neural Rank Preserving Transforms (NRPT), a new post-calibration method that adjusts the output probabilities of a trained classifier using a calibrator of higher capacity, while maintaining its prediction accuracy.

Text Classification

In-N-Out: Pre-Training and Self-Training using Auxiliary Information for Out-of-Distribution Robustness

1 code implementation ICLR 2021 Sang Michael Xie, Ananya Kumar, Robbie Jones, Fereshte Khani, Tengyu Ma, Percy Liang

To get the best of both worlds, we introduce In-N-Out, which first trains a model with auxiliary inputs and uses it to pseudolabel all the in-distribution inputs, then pre-trains a model on OOD auxiliary outputs and fine-tunes this model with the pseudolabels (self-training).

Time Series Unsupervised Domain Adaptation

Meta-learning Transferable Representations with a Single Target Domain

no code implementations 3 Nov 2020 Hong Liu, Jeff Z. HaoChen, Colin Wei, Tengyu Ma

Recent works found that fine-tuning and joint training, two popular approaches for transfer learning, do not always improve accuracy on downstream tasks.

Meta-Learning Representation Learning +1

Beyond Lazy Training for Over-parameterized Tensor Decomposition

no code implementations NeurIPS 2020 Xiang Wang, Chenwei Wu, Jason D. Lee, Tengyu Ma, Rong Ge

We show that in a lazy training regime (similar to the NTK regime for neural networks) one needs at least $m = \Omega(d^{l-1})$, while a variant of gradient descent can find an approximate tensor when $m = O^*(r^{2.5l}\log d)$.

Tensor Decomposition

Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling

1 code implementation 21 Oct 2020 Wenxuan Zhou, Kevin Huang, Tengyu Ma, Jing Huang

In this paper, we propose two novel techniques, adaptive thresholding and localized context pooling, to solve the multi-label and multi-entity problems.

Document-level Multi-Label Classification +1

Theoretical Analysis of Self-Training with Deep Networks on Unlabeled Data

no code implementations ICLR 2021 Colin Wei, Kendrick Shen, Yining Chen, Tengyu Ma

Self-training algorithms, which train a model to fit pseudolabels predicted by another previously-learned model, have been very successful for learning with unlabeled data using neural networks.

Generalization Bounds Unsupervised Domain Adaptation

Entity and Evidence Guided Relation Extraction for DocRED

no code implementations 27 Aug 2020 Kevin Huang, Guangtao Wang, Tengyu Ma, Jing Huang

Document-level relation extraction is a challenging task which requires reasoning over multiple sentences in order to predict relations in a document.

Document-level Language Modelling +1

Learning Over-Parametrized Two-Layer ReLU Neural Networks beyond NTK

no code implementations 9 Jul 2020 Yuanzhi Li, Tengyu Ma, Hongyang R. Zhang

We consider the dynamics of gradient descent for learning a two-layer neural network.

Composed Fine-Tuning: Freezing Pre-Trained Denoising Autoencoders for Improved Generalization

2 code implementations 29 Jun 2020 Sang Michael Xie, Tengyu Ma, Percy Liang

Empirically, we show that composed fine-tuning improves over standard fine-tuning on two pseudocode-to-code translation datasets (3% and 6% relative).

Code Translation Denoising +2

Heteroskedastic and Imbalanced Deep Learning with Adaptive Regularization

1 code implementation ICLR 2021 Kaidi Cao, Yining Chen, Junwei Lu, Nikos Arechiga, Adrien Gaidon, Tengyu Ma

Real-world large-scale datasets are heteroskedastic and imbalanced -- labels have varying levels of uncertainty and label distributions are long-tailed.

Image Classification

Active Online Learning with Hidden Shifting Domains

no code implementations 25 Jun 2020 Yining Chen, Haipeng Luo, Tengyu Ma, Chicheng Zhang

We propose a surprisingly simple algorithm that adaptively balances its regret and its number of label queries in settings where the data streams are from a mixture of hidden domains.

Domain Adaptation

Individual Calibration with Randomized Forecasting

no code implementations ICML 2020 Shengjia Zhao, Tengyu Ma, Stefano Ermon

We show that calibration for individual samples is possible in the regression setup if the predictions are randomized, i.e., by outputting randomized credible intervals.

Decision Making Fairness

Self-training Avoids Using Spurious Features Under Domain Shift

no code implementations NeurIPS 2020 Yining Chen, Colin Wei, Ananya Kumar, Tengyu Ma

In unsupervised domain adaptation, existing theory focuses on situations where the source and target domains are close.

Unsupervised Domain Adaptation

Federated Accelerated Stochastic Gradient Descent

1 code implementation NeurIPS 2020 Honglin Yuan, Tengyu Ma

We propose Federated Accelerated Stochastic Gradient Descent (FedAc), a principled acceleration of Federated Averaging (FedAvg, also known as Local SGD) for distributed optimization.

Distributed Optimization

Model-based Adversarial Meta-Reinforcement Learning

1 code implementation NeurIPS 2020 Zichuan Lin, Garrett Thomas, Guangwen Yang, Tengyu Ma

When the test task distribution is different from the training task distribution, the performance may degrade significantly.

Continuous Control Meta Reinforcement Learning

Shape Matters: Understanding the Implicit Bias of the Noise Covariance

1 code implementation 15 Jun 2020 Jeff Z. HaoChen, Colin Wei, Jason D. Lee, Tengyu Ma

We show that in an over-parameterized setting, SGD with label noise recovers the sparse ground-truth with an arbitrary initialization, whereas SGD with Gaussian noise or gradient descent overfits to dense solutions with large norms.

Improved Sample Complexities for Deep Neural Networks and Robust Classification via an All-Layer Margin

no code implementations ICLR 2020 Colin Wei, Tengyu Ma

For linear classifiers, the relationship between (normalized) output margin and generalization is captured in a clear and simple bound – a large output margin implies good generalization.

Generalization Bounds Robust classification

Robust and On-the-fly Dataset Denoising for Image Classification

no code implementations ECCV 2020 Jiaming Song, Lunjia Hu, Michael Auli, Yann Dauphin, Tengyu Ma

We address this problem by reasoning counterfactually about the loss distribution of examples with uniform random labels had they been trained with the real examples, and use this information to remove noisy examples from the training set.

Classification Denoising +2

Optimal Regularization Can Mitigate Double Descent

no code implementations ICLR 2021 Preetum Nakkiran, Prayaag Venkat, Sham Kakade, Tengyu Ma

Recent empirical and theoretical studies have shown that many learning algorithms -- from linear regression to neural networks -- can have test performance that is non-monotonic in quantities such as the sample size and model size.

The Implicit and Explicit Regularization Effects of Dropout

1 code implementation ICML 2020 Colin Wei, Sham Kakade, Tengyu Ma

This implicit regularization effect is analogous to the effect of stochasticity in small mini-batch stochastic gradient descent.

Understanding Self-Training for Gradual Domain Adaptation

2 code implementations ICML 2020 Ananya Kumar, Tengyu Ma, Percy Liang

Machine learning systems must adapt to data distributions that evolve over time, in applications ranging from sensor networks and self-driving car perception modules to brain-machine interfaces.

Unsupervised Domain Adaptation

Variable-Viewpoint Representations for 3D Object Recognition

no code implementations 8 Feb 2020 Tengyu Ma, Joel Michelson, James Ainooson, Deepayan Sanyal, Xiaohan Wang, Maithilee Kunda

For the problem of 3D object recognition, researchers using deep learning methods have developed several very different input representations, including "multi-view" snapshots taken from discrete viewpoints around an object, as well as "spherical" representations consisting of a dense map of essentially ray-traced samples of the object from all directions.

3D Object Recognition

On the Expressivity of Neural Networks for Deep Reinforcement Learning

1 code implementation ICML 2020 Kefan Dong, Yuping Luo, Tengyu Ma

We compare model-free reinforcement learning with model-based approaches through the lens of the expressive power of neural networks for policies, $Q$-functions, and dynamics.

Improved Sample Complexities for Deep Networks and Robust Classification via an All-Layer Margin

1 code implementation 9 Oct 2019 Colin Wei, Tengyu Ma

Unfortunately, for deep models, this relationship is less clear: existing analyses of the output margin give complicated bounds which sometimes depend exponentially on depth.

General Classification Generalization Bounds +1

Verified Uncertainty Calibration

3 code implementations NeurIPS 2019 Ananya Kumar, Percy Liang, Tengyu Ma

In these experiments, we also estimate the calibration error and ECE more accurately than the commonly used plugin estimators.

Weather Forecasting

Learning Self-Correctable Policies and Value Functions from Demonstrations with Negative Sampling

1 code implementation ICLR 2020 Yuping Luo, Huazhe Xu, Tengyu Ma

Imitation learning, followed by reinforcement learning algorithms, is a promising paradigm to solve complex control tasks sample-efficiently.

Imitation Learning

Towards Explaining the Regularization Effect of Initial Large Learning Rate in Training Neural Networks

2 code implementations NeurIPS 2019 Yuanzhi Li, Colin Wei, Tengyu Ma

This concept translates to a larger-scale setting: we demonstrate that one can add a small patch to CIFAR-10 images that is immediately memorizable by a model with small initial learning rate, but ignored by the model with large learning rate until after annealing.

Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss

4 code implementations NeurIPS 2019 Kaidi Cao, Colin Wei, Adrien Gaidon, Nikos Arechiga, Tengyu Ma

Deep learning algorithms can fare poorly when the training dataset suffers from heavy class-imbalance but the testing criterion requires good generalization on less frequent classes.

Long-tail learning with class descriptors

On the Performance of Thompson Sampling on Logistic Bandits

no code implementations 12 May 2019 Shi Dong, Tengyu Ma, Benjamin Van Roy

Specifically, we establish that, when the set of feasible actions is identical to the set of possible coefficient vectors, the Bayesian regret of Thompson sampling is $\tilde{O}(d\sqrt{T})$.

Data-dependent Sample Complexity of Deep Neural Networks via Lipschitz Augmentation

1 code implementation NeurIPS 2019 Colin Wei, Tengyu Ma

For feedforward neural nets as well as RNNs, we obtain tighter Rademacher complexity bounds by considering additional data-dependent properties of the network: the norms of the hidden layers of the network, and the norms of the Jacobians of each layer with respect to all previous layers.

On the Margin Theory of Feedforward Neural Networks

no code implementations ICLR 2019 Colin Wei, Jason Lee, Qiang Liu, Tengyu Ma

We establish: 1) for multi-layer feedforward ReLU networks, the global minimizer of a weakly-regularized cross-entropy loss has the maximum normalized margin among all networks, 2) as a result, increasing the over-parametrization improves the normalized margin and generalization error bounds for deep networks.

Better Generalization with On-the-fly Dataset Denoising

no code implementations ICLR 2019 Jiaming Song, Tengyu Ma, Michael Auli, Yann Dauphin

Memorization in over-parameterized neural networks can severely hurt generalization in the presence of mislabeled examples.


Explaining Adversarial Examples with Knowledge Representation

no code implementations ICLR 2019 Xingyu Zhou, Tengyu Ma, Huahong Zhang

This paper, in contrast, discusses the origin of adversarial examples from a more underlying knowledge representation point of view.

Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel

no code implementations NeurIPS 2019 Colin Wei, Jason D. Lee, Qiang Liu, Tengyu Ma

We prove that for infinite-width two-layer nets, noisy gradient descent optimizes the regularized neural net loss to a global minimum in polynomial iterations.

Approximability of Discriminators Implies Diversity in GANs

no code implementations ICLR 2019 Yu Bai, Tengyu Ma, Andrej Risteski

Our preliminary experiments show that on synthetic datasets the test IPM is well correlated with KL divergence or the Wasserstein distance, indicating that the lack of diversity in GANs may be caused by the sub-optimality in optimization instead of statistical inefficiency.

The Toybox Dataset of Egocentric Visual Object Transformations

no code implementations 15 Jun 2018 Xiaohan Wang, Tengyu Ma, James Ainooson, Seunghwan Cha, Xiaotian Wang, Azhar Molla, Maithilee Kunda

In object recognition research, many commonly used datasets (e.g., ImageNet and similar) contain relatively sparse distributions of object instances and views, e.g., one might see a thousand different pictures of a thousand different giraffes, mostly taken from a few conventionally photographed angles.

Object Recognition Translation

A La Carte Embedding: Cheap but Effective Induction of Semantic Feature Vectors

1 code implementation ACL 2018 Mikhail Khodak, Nikunj Saunshi, Yingyu Liang, Tengyu Ma, Brandon Stewart, Sanjeev Arora

Motivations like domain adaptation, transfer learning, and feature learning have fueled interest in inducing embeddings for rare or unseen words, n-grams, synsets, and other textual features.

Document Classification Domain Adaptation +1

Algorithmic Regularization in Over-parameterized Matrix Sensing and Neural Networks with Quadratic Activations

no code implementations 26 Dec 2017 Yuanzhi Li, Tengyu Ma, Hongyang Zhang

We show that the gradient descent algorithm provides an implicit regularization effect in the learning of over-parameterized matrix factorization models and one-hidden-layer neural networks with quadratic activations.

On the Optimization Landscape of Tensor Decompositions

no code implementations NeurIPS 2017 Rong Ge, Tengyu Ma

The landscape of many objective functions in learning has been conjectured to have the geometric property that "all local optima are (approximately) global optima", and thus they can be solved efficiently by local search algorithms.

Latent Variable Models Tensor Decomposition

Generalization and Equilibrium in Generative Adversarial Nets (GANs)

1 code implementation ICML 2017 Sanjeev Arora, Rong Ge, Yingyu Liang, Tengyu Ma, Yi Zhang

We show that training of generative adversarial networks (GANs) may not have good generalization properties; e.g., training may appear successful but the trained distribution may be far from the target distribution in standard metrics.

On the ability of neural nets to express distributions

no code implementations 22 Feb 2017 Holden Lee, Rong Ge, Tengyu Ma, Andrej Risteski, Sanjeev Arora

We take a first cut at explaining the expressivity of multilayer nets by giving a sufficient criterion for a function to be approximable by a neural network with $n$ hidden layers.

Provable learning of Noisy-or Networks

no code implementations 28 Dec 2016 Sanjeev Arora, Rong Ge, Tengyu Ma, Andrej Risteski

Many machine learning applications use latent variable models to explain structure in data, whereby visible variables (= coordinates of the given datapoint) are explained as a probabilistic function of some hidden variables.

Latent Variable Models Tensor Decomposition +1

Identity Matters in Deep Learning

no code implementations 14 Nov 2016 Moritz Hardt, Tengyu Ma

An emerging design principle in deep learning is that each layer of a deep artificial neural network should be able to easily express the identity transformation.

Finding Approximate Local Minima Faster than Gradient Descent

1 code implementation 3 Nov 2016 Naman Agarwal, Zeyuan Allen-Zhu, Brian Bullins, Elad Hazan, Tengyu Ma

We design a non-convex second-order optimization algorithm that is guaranteed to return an approximate local minimum in time which scales linearly in the underlying dimension and the number of training examples.

Polynomial-time Tensor Decompositions with Sum-of-Squares

no code implementations 6 Oct 2016 Tengyu Ma, Jonathan Shi, David Steurer

We give new algorithms based on the sum-of-squares method for tensor decomposition.

Tensor Decomposition

A Non-generative Framework and Convex Relaxations for Unsupervised Learning

no code implementations NeurIPS 2016 Elad Hazan, Tengyu Ma

We give a novel formal theoretical framework for unsupervised learning with two distinctive characteristics.

Gradient Descent Learns Linear Dynamical Systems

no code implementations 16 Sep 2016 Moritz Hardt, Tengyu Ma, Benjamin Recht

We prove that stochastic gradient descent efficiently converges to the global optimizer of the maximum likelihood objective of an unknown linear time-invariant dynamical system from a sequence of noisy observations generated by the system.

Matrix Completion has No Spurious Local Minimum

no code implementations NeurIPS 2016 Rong Ge, Jason D. Lee, Tengyu Ma

Matrix completion is a basic machine learning problem that has wide applications, especially in collaborative filtering and recommender systems.

Collaborative Filtering Matrix Completion +1

Linear Algebraic Structure of Word Senses, with Applications to Polysemy

1 code implementation TACL 2018 Sanjeev Arora, Yuanzhi Li, Yingyu Liang, Tengyu Ma, Andrej Risteski

A novel aspect of our technique is that each extracted word sense is accompanied by one of about 2000 "discourse atoms" that gives a succinct description of which other words co-occur with that word sense.

Information Retrieval Word Embeddings

Why are deep nets reversible: A simple theory, with implications for training

no code implementations 18 Nov 2015 Sanjeev Arora, Yingyu Liang, Tengyu Ma

Under this assumption, which is experimentally tested on real-life nets like AlexNet, it is formally proved that the feedforward net is a correct inference method for recovering the hidden layer.


Distributed Stochastic Variance Reduced Gradient Methods and A Lower Bound for Communication Complexity

no code implementations 27 Jul 2015 Jason D. Lee, Qihang Lin, Tengyu Ma, Tianbao Yang

We also prove a lower bound for the number of rounds of communication for a broad class of distributed first-order methods including the proposed algorithms in this paper.

Distributed Optimization

Sum-of-Squares Lower Bounds for Sparse PCA

no code implementations NeurIPS 2015 Tengyu Ma, Avi Wigderson

It was also known that this quadratic gap cannot be improved by the most basic semi-definite (SDP, aka spectral) relaxation, equivalent to a degree-2 SoS algorithm.

Communication Lower Bounds for Statistical Estimation Problems via a Distributed Data Processing Inequality

no code implementations 24 Jun 2015 Mark Braverman, Ankit Garg, Tengyu Ma, Huy L. Nguyen, David P. Woodruff

We study the tradeoff between the statistical error and communication cost of distributed statistical estimation problems in high dimensions.

Decomposing Overcomplete 3rd Order Tensors using Sum-of-Squares Algorithms

no code implementations 21 Apr 2015 Rong Ge, Tengyu Ma

We also give a polynomial time algorithm for certifying the injective norm of random low rank tensors.

Tensor Decomposition

Simple, Efficient, and Neural Algorithms for Sparse Coding

no code implementations 2 Mar 2015 Sanjeev Arora, Rong Ge, Tengyu Ma, Ankur Moitra

Its standard formulation is as a non-convex optimization problem which is solved in practice by heuristics based on alternating minimization.

A Latent Variable Model Approach to PMI-based Word Embeddings

4 code implementations TACL 2016 Sanjeev Arora, Yuanzhi Li, Yingyu Liang, Tengyu Ma, Andrej Risteski

Semantic word embeddings represent the meaning of a word via a vector, and are created by diverse methods.

Word Embeddings

On Communication Cost of Distributed Statistical Estimation and Dimensionality

no code implementations NeurIPS 2014 Ankit Garg, Tengyu Ma, Huy L. Nguyen

We conjecture that the tradeoff between communication and squared loss demonstrated by this protocol is essentially optimal up to a logarithmic factor.

More Algorithms for Provable Dictionary Learning

no code implementations 3 Jan 2014 Sanjeev Arora, Aditya Bhaskara, Rong Ge, Tengyu Ma

In dictionary learning, also known as sparse coding, the algorithm is given samples of the form $y = Ax$ where $x\in \mathbb{R}^m$ is an unknown random sparse vector and $A$ is an unknown dictionary matrix in $\mathbb{R}^{n\times m}$ (usually $m > n$, which is the overcomplete case).

Dictionary Learning

Provable Bounds for Learning Some Deep Representations

no code implementations 23 Oct 2013 Sanjeev Arora, Aditya Bhaskara, Rong Ge, Tengyu Ma

The analysis of the algorithm reveals interesting structure of neural networks with random edge weights.
