Search Results for author: Sham Kakade

Found 54 papers, 17 papers with code

Learning from Logged Implicit Exploration Data

no code implementations NeurIPS 2010 Alex Strehl, John Langford, Sham Kakade, Lihong Li

We provide a sound and consistent foundation for the use of \emph{nonrandom} exploration data in "contextual bandit" or "partially labeled" settings where only the value of a chosen action is learned.

When are Overcomplete Topic Models Identifiable? Uniqueness of Tensor Tucker Decompositions with Structured Sparsity

no code implementations NeurIPS 2013 Animashree Anandkumar, Daniel Hsu, Majid Janzamin, Sham Kakade

This set of higher-order expansion conditions allows for overcomplete models and requires the existence of a perfect matching from latent topics to higher-order observed words.

Topic Models

Minimal Realization Problems for Hidden Markov Models

no code implementations 13 Nov 2014 Qingqing Huang, Rong Ge, Sham Kakade, Munther Dahleh

Consider a stationary discrete random process with alphabet size d, which is assumed to be the output process of an unknown stationary Hidden Markov Model (HMM).

Tensor Decomposition

A Linear Dynamical System Model for Text

no code implementations 13 Feb 2015 David Belanger, Sham Kakade

Finally, the Kalman filter updates can be seen as a linear recurrent neural network.

Language Modelling Word Embeddings
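
To make the closing claim concrete, here is a minimal numpy sketch of a Kalman filter with a fixed (steady-state) gain; the dimensions and matrices are illustrative assumptions, not taken from the paper. With the gain held fixed, the update is exactly a linear recurrent cell.

    import numpy as np

    # Toy LDS: state h_t = A h_{t-1} + noise, observation x_t = C h_t + noise (assumed sizes).
    d_state, d_obs = 4, 8
    rng = np.random.default_rng(0)
    A = 0.9 * np.eye(d_state)                     # state transition (illustrative)
    C = rng.normal(size=(d_obs, d_state))         # emission matrix (illustrative)
    K = 0.1 * rng.normal(size=(d_state, d_obs))   # fixed steady-state Kalman gain (illustrative)

    def kalman_step(h_prev, x_t):
        # h_t = h_pred + K (x_t - C h_pred) = (A - K C A) h_prev + K x_t : a linear RNN cell.
        h_pred = A @ h_prev
        return h_pred + K @ (x_t - C @ h_pred)

    h = np.zeros(d_state)
    for x_t in rng.normal(size=(10, d_obs)):      # a toy observation sequence
        h = kalman_step(h, x_t)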

Learning Features of Music from Scratch

2 code implementations 29 Nov 2016 John Thickstun, Zaid Harchaoui, Sham Kakade

This paper introduces a new large-scale music dataset, MusicNet, to serve as a source of supervision and evaluation of machine learning methods for music research.

BIG-bench Machine Learning Multi-Label Classification +1

Prediction with a Short Memory

no code implementations 8 Dec 2016 Vatsal Sharan, Sham Kakade, Percy Liang, Gregory Valiant

For a Hidden Markov Model with $n$ hidden states, $I$ is bounded by $\log n$, a quantity that does not depend on the mixing time, and we show that the trivial prediction algorithm based on the empirical frequencies of length $O(\log n/\epsilon)$ windows of observations achieves this error, provided the length of the sequence is $d^{\Omega(\log n/\epsilon)}$, where $d$ is the size of the observation alphabet.
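
A minimal sketch of the "trivial" predictor referred to above: estimate the next-symbol distribution from empirical frequencies of fixed-length windows and predict the most frequent continuation. The window length and toy sequence are illustrative, not from the paper.

    from collections import Counter, defaultdict

    def fit_window_predictor(seq, ell):
        """Empirical next-symbol counts conditioned on the previous ell symbols."""
        counts = defaultdict(Counter)
        for i in range(ell, len(seq)):
            counts[tuple(seq[i - ell:i])][seq[i]] += 1
        return counts

    def predict_next(counts, recent, ell):
        """Most frequent continuation of the last ell observations (None if the window is unseen)."""
        hist = counts.get(tuple(recent[-ell:]))
        return max(hist, key=hist.get) if hist else None

    seq = [0, 1, 1, 0, 1, 1, 0, 1, 1, 0]          # toy binary sequence
    model = fit_window_predictor(seq, ell=2)
    print(predict_next(model, seq, ell=2))        # prints 1: the window (1, 0) was always followed by 1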

Towards Generalization and Simplicity in Continuous Control

1 code implementation NeurIPS 2017 Aravind Rajeswaran, Kendall Lowrey, Emanuel Todorov, Sham Kakade

This work shows that policies with simple linear and RBF parameterizations can be trained to solve a variety of continuous control tasks, including the OpenAI gym benchmarks.

Continuous Control OpenAI Gym

Learning Overcomplete HMMs

no code implementations NeurIPS 2017 Vatsal Sharan, Sham Kakade, Percy Liang, Gregory Valiant

On the other hand, we show that learning is impossible given only a polynomial number of samples for HMMs with a small output alphabet and whose transition matrices are random regular graphs with large degree.

Leverage Score Sampling for Faster Accelerated Regression and ERM

no code implementations 22 Nov 2017 Naman Agarwal, Sham Kakade, Rahul Kidambi, Yin Tat Lee, Praneeth Netrapalli, Aaron Sidford

Given a matrix $\mathbf{A}\in\mathbb{R}^{n\times d}$ and a vector $b \in\mathbb{R}^{n}$, we show how to compute an $\epsilon$-approximate solution to the regression problem $ \min_{x\in\mathbb{R}^{d}}\frac{1}{2} \|\mathbf{A} x - b\|_{2}^{2} $ in time $ \tilde{O} ((n+\sqrt{d\cdot\kappa_{\text{sum}}})\cdot s\cdot\log\epsilon^{-1}) $ where $\kappa_{\text{sum}}=\mathrm{tr}\left(\mathbf{A}^{\top}\mathbf{A}\right)/\lambda_{\min}(\mathbf{A}^{\top}\mathbf{A})$ and $s$ is the maximum number of non-zero entries in a row of $\mathbf{A}$.

regression
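
A minimal numpy sketch of plain leverage score row sampling for least squares, to illustrate the quantities in the abstract; the accelerated solver and the ERM extension from the paper are not reproduced, and all sizes below are illustrative.

    import numpy as np

    rng = np.random.default_rng(0)
    n, d, m = 2000, 20, 200                      # illustrative sizes; m = rows kept
    A = rng.normal(size=(n, d))
    b = rng.normal(size=n)

    kappa_sum = np.trace(A.T @ A) / np.linalg.eigvalsh(A.T @ A).min()   # the kappa_sum above

    # Statistical leverage scores: squared row norms of Q from a thin QR of A (they sum to d).
    Q, _ = np.linalg.qr(A)
    probs = (Q ** 2).sum(axis=1) / d

    # Sample rows with probability proportional to leverage and reweight by 1/sqrt(m p_i).
    idx = rng.choice(n, size=m, p=probs)
    w = 1.0 / np.sqrt(m * probs[idx])
    x_sketch = np.linalg.lstsq(w[:, None] * A[idx], w * b[idx], rcond=None)[0]
    x_exact = np.linalg.lstsq(A, b, rcond=None)[0]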

Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines

no code implementations ICLR 2018 Cathy Wu, Aravind Rajeswaran, Yan Duan, Vikash Kumar, Alexandre M. Bayen, Sham Kakade, Igor Mordatch, Pieter Abbeel

To mitigate this issue, we derive a bias-free action-dependent baseline for variance reduction which fully exploits the structural form of the stochastic policy itself and does not make any additional assumptions about the MDP.

Policy Gradient Methods reinforcement-learning +1
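
A rough sketch of the idea for a factorized Gaussian policy: the baseline for action dimension i is allowed to depend on the other dimensions a_{-i}, estimated here by Monte Carlo resampling of dimension i alone. The critic Q and the policy parameters are toy stand-ins, not the paper's construction.

    import numpy as np

    rng = np.random.default_rng(0)
    act_dim = 3
    mu, sigma = np.zeros(act_dim), np.ones(act_dim)     # toy factorized Gaussian policy

    def Q(s, a):                                        # toy critic, illustrative only
        return -np.sum((a - s) ** 2)

    def action_dependent_advantages(s, a, n_mc=64):
        """A_i = Q(s, a) - E_{a_i'}[Q(s, (a_i', a_-i))], one scalar per action dimension."""
        adv = np.empty(act_dim)
        for i in range(act_dim):
            resampled = np.repeat(a[None, :], n_mc, axis=0)
            resampled[:, i] = rng.normal(mu[i], sigma[i], size=n_mc)   # resample dim i only
            adv[i] = Q(s, a) - np.mean([Q(s, r) for r in resampled])
        return adv

    s, a = rng.normal(size=act_dim), rng.normal(mu, sigma)
    # Per-dimension score-function gradient w.r.t. mu: grad log pi_i(a_i|s) * A_i(s, a)
    grad_mu = ((a - mu) / sigma ** 2) * action_dependent_advantages(s, a)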

Stochastic subgradient method converges on tame functions

1 code implementation 20 Apr 2018 Damek Davis, Dmitriy Drusvyatskiy, Sham Kakade, Jason D. Lee

This work considers the question: what convergence guarantees does the stochastic subgradient method have in the absence of smoothness and convexity?

Provably Correct Automatic Subdifferentiation for Qualified Programs

no code implementations 23 Sep 2018 Sham Kakade, Jason D. Lee

The Cheap Gradient Principle (Griewank 2008) --- the computational cost of computing the gradient of a scalar-valued function is nearly the same (often within a factor of $5$) as that of simply computing the function itself --- is of central importance in optimization; it allows us to quickly obtain (high dimensional) gradients of scalar loss functions which are subsequently used in black box gradient-based optimization procedures.
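
As a small illustration of the Cheap Gradient Principle, reverse-mode automatic differentiation returns the full high-dimensional gradient of a scalar function at a cost comparable to one function evaluation. JAX is used here only as a convenient autodiff tool; it is not part of the paper.

    import jax
    import jax.numpy as jnp

    def f(x):                                   # a scalar-valued function of many inputs
        return jnp.sum(jnp.tanh(x) ** 2) + jnp.dot(x, x)

    x = jnp.linspace(0.0, 1.0, 10_000)
    value = f(x)                                # one forward evaluation
    grad = jax.grad(f)(x)                       # the full 10,000-dimensional gradient, one reverse pass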

Plan Online, Learn Offline: Efficient Learning and Exploration via Model-Based Control

no code implementations ICLR 2019 Kendall Lowrey, Aravind Rajeswaran, Sham Kakade, Emanuel Todorov, Igor Mordatch

We study how local trajectory optimization can cope with approximation errors in the value function, and can stabilize and accelerate value function learning.

Online Meta-Learning

no code implementations ICLR Workshop LLD 2019 Chelsea Finn, Aravind Rajeswaran, Sham Kakade, Sergey Levine

Meta-learning views this problem as learning a prior over model parameters that is amenable to fast adaptation on a new task, but typically assumes that the set of tasks is available together as a batch.

Meta-Learning

Model-Based Reinforcement Learning with a Generative Model is Minimax Optimal

no code implementations 10 Jun 2019 Alekh Agarwal, Sham Kakade, Lin F. Yang

In this work, we study the effectiveness of the most natural plug-in approach to model-based planning: we build the maximum likelihood estimate of the transition model in the MDP from observations and then find an optimal policy in this empirical MDP.

Model-based Reinforcement Learning reinforcement-learning +1
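
A minimal sketch of the plug-in approach described above: estimate each transition distribution by empirical counts from a generative model, then run value iteration on the empirical MDP. State/action counts, rewards, and sample sizes are illustrative.

    import numpy as np

    rng = np.random.default_rng(0)
    S, A_n, gamma, n_samples = 5, 3, 0.9, 200

    P_true = rng.dirichlet(np.ones(S), size=(S, A_n))    # unknown true dynamics (toy)
    R = rng.uniform(size=(S, A_n))                       # rewards, assumed known here

    # Maximum-likelihood transition model from n_samples generative-model calls per (s, a).
    P_hat = np.zeros_like(P_true)
    for s in range(S):
        for a in range(A_n):
            draws = rng.choice(S, size=n_samples, p=P_true[s, a])
            P_hat[s, a] = np.bincount(draws, minlength=S) / n_samples

    # Value iteration in the empirical MDP (P_hat, R), then the greedy plug-in policy.
    V = np.zeros(S)
    for _ in range(500):
        V = np.max(R + gamma * P_hat @ V, axis=1)
    pi_hat = np.argmax(R + gamma * P_hat @ V, axis=1)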

Meta-Learning with Implicit Gradients

6 code implementations NeurIPS 2019 Aravind Rajeswaran, Chelsea Finn, Sham Kakade, Sergey Levine

By drawing upon implicit differentiation, we develop the implicit MAML algorithm, which depends only on the solution to the inner level optimization and not the path taken by the inner loop optimizer.

Few-Shot Image Classification Few-Shot Learning
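
A small numpy sketch of the implicit meta-gradient alluded to above: at an (approximate) inner solution, the meta-gradient is (I + (1/lambda) * inner-loss Hessian)^{-1} applied to the outer-loss gradient, so no backpropagation through the inner optimization path is needed. The quadratic inner loss and dimensions are illustrative only; in practice this solve is done with conjugate gradient and Hessian-vector products.

    import numpy as np

    rng = np.random.default_rng(0)
    p, lam = 6, 1.0                                       # parameter dim, regularization strength

    M = rng.normal(size=(p, p))
    H_inner = M @ M.T + np.eye(p)                         # inner-loss Hessian at phi* (toy, PSD)
    g_outer = rng.normal(size=p)                          # outer-loss gradient at phi* (toy)

    meta_grad = np.linalg.solve(np.eye(p) + H_inner / lam, g_outer)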

Meta-learning for mixed linear regression

no code implementations ICML 2020 Weihao Kong, Raghav Somani, Zhao Song, Sham Kakade, Sewoong Oh

In modern supervised learning, there are a large number of tasks, but many of them are associated with only a small amount of labeled data.

Meta-Learning regression +1

Provable Representation Learning for Imitation Learning via Bi-level Optimization

no code implementations ICML 2020 Sanjeev Arora, Simon S. Du, Sham Kakade, Yuping Luo, Nikunj Saunshi

We formulate representation learning as a bi-level optimization problem where the "outer" optimization tries to learn the joint representation and the "inner" optimization encodes the imitation learning setup and tries to learn task-specific parameters.

Imitation Learning Representation Learning

The Implicit and Explicit Regularization Effects of Dropout

1 code implementation ICML 2020 Colin Wei, Sham Kakade, Tengyu Ma

This implicit regularization effect is analogous to the effect of stochasticity in small mini-batch stochastic gradient descent.

Optimal Regularization Can Mitigate Double Descent

no code implementations ICLR 2021 Preetum Nakkiran, Prayaag Venkat, Sham Kakade, Tengyu Ma

Recent empirical and theoretical studies have shown that many learning algorithms -- from linear regression to neural networks -- can have test performance that is non-monotonic in quantities such as the sample size and model size.

regression

Robust Meta-learning for Mixed Linear Regression with Small Batches

no code implementations NeurIPS 2020 Weihao Kong, Raghav Somani, Sham Kakade, Sewoong Oh

Together, this approach is robust against outliers and achieves a graceful statistical trade-off; the lack of $\Omega(k^{1/2})$-size tasks can be compensated for with smaller tasks, which can now be as small as $O(\log k)$.

Meta-Learning regression

FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs

no code implementations NeurIPS 2020 Alekh Agarwal, Sham Kakade, Akshay Krishnamurthy, Wen Sun

In order to deal with the curse of dimensionality in reinforcement learning (RL), it is common practice to make parametric assumptions where values or policies are functions of some low dimensional feature space.

reinforcement-learning Reinforcement Learning (RL) +1

Information Theoretic Regret Bounds for Online Nonlinear Control

1 code implementation NeurIPS 2020 Sham Kakade, Akshay Krishnamurthy, Kendall Lowrey, Motoya Ohnishi, Wen Sun

This work studies the problem of sequential control in an unknown, nonlinear dynamical system, where we model the underlying system dynamics as an unknown function in a known Reproducing Kernel Hilbert Space.

Continuous Control

PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning

1 code implementation NeurIPS 2020 Alekh Agarwal, Mikael Henaff, Sham Kakade, Wen Sun

Direct policy gradient methods for reinforcement learning are a successful approach for a variety of reasons: they are model free, they directly optimize the performance metric of interest, and they allow for richly parameterized policies.

Policy Gradient Methods Q-Learning

How Important is the Train-Validation Split in Meta-Learning?

no code implementations 12 Oct 2020 Yu Bai, Minshuo Chen, Pan Zhou, Tuo Zhao, Jason D. Lee, Sham Kakade, Huan Wang, Caiming Xiong

A common practice in meta-learning is to perform a train-validation split (\emph{train-val method}) where the prior adapts to the task on one split of the data, and the resulting predictor is evaluated on another split.

Meta-Learning

Is Long Horizon RL More Difficult Than Short Horizon RL?

no code implementations NeurIPS 2020 Ruosong Wang, Simon S. Du, Lin Yang, Sham Kakade

In a COLT 2018 open problem, Jiang and Agarwal conjectured that, for tabular, episodic reinforcement learning problems, there exists a sample complexity lower bound which exhibits a polynomial dependence on the horizon --- a conjecture which is consistent with all known sample complexity upper bounds.

reinforcement-learning Reinforcement Learning (RL)

Robust and Differentially Private Mean Estimation

1 code implementation NeurIPS 2021 Xiyang Liu, Weihao Kong, Sham Kakade, Sewoong Oh

In statistical learning and analysis from shared data, which is increasingly widely adopted in platforms such as federated learning and meta-learning, there are two major concerns: privacy and robustness.

Federated Learning Meta-Learning

Gone Fishing: Neural Active Learning with Fisher Embeddings

1 code implementation NeurIPS 2021 Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Sham Kakade

There is an increasing need for effective active learning algorithms that are compatible with deep neural networks.

Active Learning

Koopman Spectrum Nonlinear Regulator and Provably Efficient Online Learning

1 code implementation 30 Jun 2021 Motoya Ohnishi, Isao Ishikawa, Kendall Lowrey, Masahiro Ikeda, Sham Kakade, Yoshinobu Kawahara

In this work, we present a novel paradigm of controlling nonlinear systems via the minimization of the Koopman spectrum cost: a cost over the Koopman operator of the controlled dynamics.

reinforcement-learning Reinforcement Learning (RL)

Sparsity in Partially Controllable Linear Systems

no code implementations 12 Oct 2021 Yonathan Efroni, Sham Kakade, Akshay Krishnamurthy, Cyril Zhang

However, in practice, we often encounter systems in which a large set of state variables evolve exogenously and independently of the control inputs; such systems are only partially controllable.

Inductive Biases and Variable Creation in Self-Attention Mechanisms

no code implementations 19 Oct 2021 Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Cyril Zhang

Self-attention, an architectural motif designed to model long-range interactions in sequential data, has driven numerous recent breakthroughs in natural language processing and beyond.

Anti-Concentrated Confidence Bonuses for Scalable Exploration

no code implementations ICLR 2022 Jordan T. Ash, Cyril Zhang, Surbhi Goel, Akshay Krishnamurthy, Sham Kakade

Intrinsic rewards play a central role in handling the exploration-exploitation trade-off when designing sequential decision-making algorithms, in both foundational theory and state-of-the-art deep reinforcement learning.

Decision Making reinforcement-learning +1

Multi-Stage Episodic Control for Strategic Exploration in Text Games

1 code implementation ICLR 2022 Jens Tuyls, Shunyu Yao, Sham Kakade, Karthik Narasimhan

Text adventure games present unique challenges to reinforcement learning methods due to their combinatorially large action spaces and sparse rewards.

Understanding Contrastive Learning Requires Incorporating Inductive Biases

no code implementations 28 Feb 2022 Nikunj Saunshi, Jordan Ash, Surbhi Goel, Dipendra Misra, Cyril Zhang, Sanjeev Arora, Sham Kakade, Akshay Krishnamurthy

Contrastive learning is a popular form of self-supervised learning that encourages augmentations (views) of the same input to have more similar representations compared to augmentations of different inputs.

Contrastive Learning Self-Supervised Learning

A Complete Characterization of Linear Estimators for Offline Policy Evaluation

no code implementations 8 Mar 2022 Juan C. Perdomo, Akshay Krishnamurthy, Peter Bartlett, Sham Kakade

Offline policy evaluation is a fundamental statistical problem in reinforcement learning that involves estimating the value function of some decision-making policy given data collected by a potentially different policy.

Decision Making reinforcement-learning +1

Matryoshka Representation Learning

4 code implementations 26 May 2022 Aditya Kusupati, Gantavya Bhatt, Aniket Rege, Matthew Wallingford, Aditya Sinha, Vivek Ramanujan, William Howard-Snyder, KaiFeng Chen, Sham Kakade, Prateek Jain, Ali Farhadi

The flexibility within the learned Matryoshka Representations offers: (a) up to 14x smaller embedding size for ImageNet-1K classification at the same level of accuracy; (b) up to 14x real-world speed-ups for large-scale retrieval on ImageNet-1K and 4K; and (c) up to 2% accuracy improvements for long-tail few-shot classification, all while being as robust as the original representations.

Ranked #25 on Image Classification on ObjectNet (using extra training data)

4k Image Classification +2
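
A minimal sketch of the nesting idea: a single embedding whose leading coordinates already form usable lower-dimensional representations, so retrieval can simply truncate to trade accuracy for speed. The embeddings, prefix sizes, and query are illustrative; the MRL training objective (a sum of losses over nested prefixes) is only indicated in the final comment.

    import numpy as np

    rng = np.random.default_rng(0)
    d_full, nested = 64, [8, 16, 32, 64]              # nested prefix sizes (illustrative)
    db = rng.normal(size=(1000, d_full))              # toy database embeddings
    db /= np.linalg.norm(db, axis=1, keepdims=True)
    q = db[123] + 0.05 * rng.normal(size=d_full)      # a query close to item 123

    def retrieve(query, embeddings, m):
        """Nearest neighbour using only the first m coordinates of the same embedding."""
        return int(np.argmax(embeddings[:, :m] @ query[:m]))

    for m in nested:                                  # coarse-to-fine: cheaper search at small m
        print(m, retrieve(q, db, m))
    # At training time MRL optimizes sum over m of loss(head_m(z[:m]), y), so every prefix is useful.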

Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit

no code implementations 18 Jul 2022 Boaz Barak, Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Eran Malach, Cyril Zhang

There is mounting evidence of emergent phenomena in the capabilities of deep learning methods as we scale up datasets, model sizes, and training times.

Recurrent Convolutional Neural Networks Learn Succinct Learning Algorithms

no code implementations 1 Sep 2022 Surbhi Goel, Sham Kakade, Adam Tauman Kalai, Cyril Zhang

For example, on parity problems, the NN learns as well as Gaussian elimination, an efficient algorithm that can be succinctly described.
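
For context, learning a parity (the XOR of an unknown subset of input bits) from labeled examples reduces to solving a linear system over GF(2). A small sketch of that Gaussian-elimination baseline on a toy parity is below; sizes are illustrative.

    import numpy as np

    def solve_gf2(X, y):
        """Gaussian elimination over GF(2): return w with X @ w = y (mod 2) when consistent."""
        A = np.concatenate([X, y[:, None]], axis=1) % 2
        n, d = X.shape
        row = 0
        for col in range(d):
            pivot = next((r for r in range(row, n) if A[r, col]), None)
            if pivot is None:
                continue
            A[[row, pivot]] = A[[pivot, row]]
            for r in range(n):
                if r != row and A[r, col]:
                    A[r] ^= A[row]                   # eliminate column col everywhere else
            row += 1
        w = np.zeros(d, dtype=int)
        for r in range(row):
            w[np.argmax(A[r, :d] == 1)] = A[r, d]    # read off pivot variables
        return w

    rng = np.random.default_rng(0)
    d, n = 16, 64
    secret = rng.integers(0, 2, size=d)              # unknown parity support (toy)
    X = rng.integers(0, 2, size=(n, d))
    y = (X @ secret) % 2
    assert np.all((X @ solve_gf2(X, y)) % 2 == y)    # the recovered parity fits all examples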

On Provable Copyright Protection for Generative Models

no code implementations 21 Feb 2023 Nikhil Vyas, Sham Kakade, Boaz Barak

There is a growing concern that learned conditional generative models may output samples that are substantially similar to some copyrighted data $C$ that was in their training set.

Modified Gauss-Newton Algorithms under Noise

no code implementations 18 May 2023 Krishna Pillutla, Vincent Roulet, Sham Kakade, Zaid Harchaoui

Gauss-Newton methods and their stochastic version have been widely used in machine learning and signal processing.

Structured Prediction

AdANNS: A Framework for Adaptive Semantic Search

1 code implementation NeurIPS 2023 Aniket Rege, Aditya Kusupati, Sharan Ranjit S, Alan Fan, Qingqing Cao, Sham Kakade, Prateek Jain, Ali Farhadi

Finally, we demonstrate that AdANNS can enable inference-time adaptivity for compute-aware search on ANNS indices built non-adaptively on matryoshka representations.

Natural Questions Quantization +1

Beyond Implicit Bias: The Insignificance of SGD Noise in Online Learning

no code implementations 14 Jun 2023 Nikhil Vyas, Depen Morwani, Rosie Zhao, Gal Kaplun, Sham Kakade, Boaz Barak

The success of SGD in deep learning has been ascribed by prior works to the implicit bias induced by high learning rate or small batch size ("SGD noise").

Scaling Laws for Imitation Learning in Single-Agent Games

no code implementations 18 Jul 2023 Jens Tuyls, Dhruv Madeka, Kari Torkkola, Dean Foster, Karthik Narasimhan, Sham Kakade

Inspired by recent work in Natural Language Processing (NLP) where "scaling up" has resulted in increasingly more capable LLMs, we investigate whether carefully scaling up model and data size can bring similar improvements in the imitation learning setting for single-agent games.

Atari Games Imitation Learning +1

Pareto Frontiers in Neural Feature Learning: Data, Compute, Width, and Luck

no code implementations 7 Sep 2023 Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Eran Malach, Cyril Zhang

Finally, we show that the synthetic sparse parity task can be useful as a proxy for real problems requiring axis-aligned feature learning.

tabular-classification

MatFormer: Nested Transformer for Elastic Inference

2 code implementations 11 Oct 2023 Devvrit, Sneha Kudugunta, Aditya Kusupati, Tim Dettmers, KaiFeng Chen, Inderjit Dhillon, Yulia Tsvetkov, Hannaneh Hajishirzi, Sham Kakade, Ali Farhadi, Prateek Jain

Furthermore, we observe that smaller encoders extracted from a universal MatFormer-based ViT (MatViT) encoder preserve the metric-space structure for adaptive large-scale retrieval.

Language Modelling

Learning an Inventory Control Policy with General Inventory Arrival Dynamics

no code implementations 26 Oct 2023 Sohrab Andaz, Carson Eisenach, Dhruv Madeka, Kari Torkkola, Randy Jia, Dean Foster, Sham Kakade

In this paper we address the problem of learning and backtesting inventory control policies in the presence of general arrival dynamics -- which we term a quantity-over-time arrivals model (QOT).

Feature emergence via margin maximization: case studies in algebraic tasks

no code implementations 13 Nov 2023 Depen Morwani, Benjamin L. Edelman, Costin-Andrei Oncescu, Rosie Zhao, Sham Kakade

Understanding the internal representations learned by neural networks is a cornerstone challenge in the science of machine learning.

A Study on the Calibration of In-context Learning

no code implementations 7 Dec 2023 Hanlin Zhang, Yi-Fan Zhang, Yaodong Yu, Dhruv Madeka, Dean Foster, Eric Xing, Himabindu Lakkaraju, Sham Kakade

Accurate uncertainty quantification is crucial for the safe deployment of machine learning models, and prior research has demonstrated improvements in the calibration of modern language models (LMs).

In-Context Learning Natural Language Understanding +1

Q-Probe: A Lightweight Approach to Reward Maximization for Language Models

1 code implementation 22 Feb 2024 Kenneth Li, Samy Jelassi, Hugh Zhang, Sham Kakade, Martin Wattenberg, David Brandfonbrener

The idea is to learn a simple linear function on a model's embedding space that can be used to reweight candidate completions.

Code Generation Language Modelling
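
A minimal sketch of the reweighting step described above: a learned linear function scores the embedding of each candidate completion, and a softmax over those scores decides which completion to return. The probe weights, embeddings, and temperature here are illustrative stand-ins (the probe would normally be trained on reward or preference data).

    import numpy as np

    rng = np.random.default_rng(0)
    emb_dim, n_candidates, temperature = 128, 8, 1.0

    w = 0.1 * rng.normal(size=emb_dim)                     # linear probe weights (toy, untrained)
    cand_embs = rng.normal(size=(n_candidates, emb_dim))   # embeddings of sampled completions (toy)

    scores = cand_embs @ w                                 # probe score per candidate completion
    probs = np.exp(scores / temperature)
    probs /= probs.sum()                                   # softmax reweighting over candidates
    chosen = rng.choice(n_candidates, p=probs)             # sample which completion to return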

Follow My Instruction and Spill the Beans: Scalable Data Extraction from Retrieval-Augmented Generation Systems

no code implementations 27 Feb 2024 Zhenting Qi, Hanlin Zhang, Eric Xing, Sham Kakade, Himabindu Lakkaraju

Retrieval-Augmented Generation (RAG) improves pre-trained models by incorporating external knowledge at test time to enable customized adaptation.

Instruction Following Retrieval
