Search Results for author: Cyril Zhang

Found 18 papers, 4 papers with code

Inductive Biases and Variable Creation in Self-Attention Mechanisms

no code implementations, 19 Oct 2021, Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Cyril Zhang

Self-attention, an architectural motif designed to model long-range interactions in sequential data, has driven numerous recent breakthroughs in natural language processing and beyond.

Sparsity in Partially Controllable Linear Systems

no code implementations, 12 Oct 2021, Yonathan Efroni, Sham Kakade, Akshay Krishnamurthy, Cyril Zhang

However, in practice, we often encounter systems in which a large set of state variables evolve exogenously and independently of the control inputs; such systems are only partially controllable.

Acceleration via Fractal Learning Rate Schedules

1 code implementation, 1 Mar 2021, Naman Agarwal, Surbhi Goel, Cyril Zhang

In practical applications of iterative first-order optimization, the learning rate schedule remains notoriously difficult to understand and expensive to tune.

Deluca -- A Differentiable Control Library: Environments, Methods, and Benchmarking

1 code implementation, 19 Feb 2021, Paula Gradu, John Hallman, Daniel Suo, Alex Yu, Naman Agarwal, Udaya Ghai, Karan Singh, Cyril Zhang, Anirudha Majumdar, Elad Hazan

We present an open-source library of natively differentiable physics and robotics environments, accompanied by gradient-based control methods and a benchmarking suite.

OpenAI Gym

Machine Learning for Mechanical Ventilation Control

no code implementations, 12 Feb 2021, Daniel Suo, Cyril Zhang, Paula Gradu, Udaya Ghai, Xinyi Chen, Edgar Minasyan, Naman Agarwal, Karan Singh, Julienne LaChance, Tom Zajdel, Manuel Schottdorf, Daniel Cohen, Elad Hazan

We consider the problem of controlling an invasive mechanical ventilator for pressure-controlled ventilation: a controller must let air in and out of a sedated patient's lungs according to a trajectory of airway pressures specified by a clinician.

Stochastic Optimization with Laggard Data Pipelines

no code implementations NeurIPS 2020 Naman Agarwal, Rohan Anil, Tomer Koren, Kunal Talwar, Cyril Zhang

State-of-the-art optimization is steadily shifting towards massively parallel pipelines with extremely large batch sizes.

Stochastic Optimization

Disentangling Adaptive Gradient Methods from Learning Rates

1 code implementation, 26 Feb 2020, Naman Agarwal, Rohan Anil, Elad Hazan, Tomer Koren, Cyril Zhang

We investigate several confounding factors in the evaluation of optimization algorithms for deep learning.

No-Regret Prediction in Marginally Stable Systems

no code implementations, 6 Feb 2020, Udaya Ghai, Holden Lee, Karan Singh, Cyril Zhang, Yi Zhang

This requires a refined regret analysis, including a structural lemma showing the current state of the system to be a small linear combination of past states, even if the state grows polynomially.

Revisiting the Generalization of Adaptive Gradient Methods

no code implementations ICLR 2020 Naman Agarwal, Rohan Anil, Elad Hazan, Tomer Koren, Cyril Zhang

A commonplace belief in the machine learning community is that using adaptive gradient methods hurts generalization.

Calibration, Entropy Rates, and Memory in Language Models

no code implementations ICML 2020 Mark Braverman, Xinyi Chen, Sham M. Kakade, Karthik Narasimhan, Cyril Zhang, Yi Zhang

Building accurate language models that capture meaningful long-term dependencies is a core challenge in natural language processing.

Robust guarantees for learning an autoregressive filter

no code implementations, 23 May 2019, Holden Lee, Cyril Zhang

The optimal predictor for a linear dynamical system (with hidden state and Gaussian noise) takes the form of an autoregressive linear filter, namely the Kalman filter.

Time Series, Time Series Prediction

Extreme Tensoring for Low-Memory Preconditioning

no code implementations ICLR 2020 Xinyi Chen, Naman Agarwal, Elad Hazan, Cyril Zhang, Yi Zhang

State-of-the-art models are now trained with billions of parameters, reaching hardware limits in terms of memory consumption.

Stochastic Optimization

Efficient Full-Matrix Adaptive Regularization

no code implementations ICLR 2019 Naman Agarwal, Brian Bullins, Xinyi Chen, Elad Hazan, Karan Singh, Cyril Zhang, Yi Zhang

Due to the large number of parameters of machine learning problems, full-matrix preconditioning methods are prohibitively expensive.

Spectral Filtering for General Linear Dynamical Systems

no code implementations NeurIPS 2018 Elad Hazan, Holden Lee, Karan Singh, Cyril Zhang, Yi Zhang

We give a polynomial-time algorithm for learning latent-state linear dynamical systems without system identification, and without assumptions on the spectral radius of the system's transition matrix.

Towards Provable Control for Unknown Linear Dynamical Systems

no code implementations ICLR 2018 Sanjeev Arora, Elad Hazan, Holden Lee, Karan Singh, Cyril Zhang, Yi Zhang

We study the control of symmetric linear dynamical systems with unknown dynamics and a hidden state.

Learning Linear Dynamical Systems via Spectral Filtering

no code implementations NeurIPS 2017 Elad Hazan, Karan Singh, Cyril Zhang

We present an efficient and practical algorithm for the online prediction of discrete-time linear dynamical systems with a symmetric transition matrix.

Time Series

Not-So-Random Features

1 code implementation ICLR 2018 Brian Bullins, Cyril Zhang, Yi Zhang

We propose a principled method for kernel learning, which relies on a Fourier-analytic characterization of translation-invariant or rotation-invariant kernels.


Efficient Regret Minimization in Non-Convex Games

no code implementations ICML 2017 Elad Hazan, Karan Singh, Cyril Zhang

We consider regret minimization in repeated games with non-convex loss functions.
