Search Results for author: Krzysztof Choromanski

Found 51 papers, 16 papers with code

Hybrid Random Features

1 code implementation 8 Oct 2021 Krzysztof Choromanski, Haoxian Chen, Han Lin, Yuanzhe Ma, Arijit Sehanobish, Deepali Jain, Michael S Ryoo, Jake Varley, Andy Zeng, Valerii Likhosherstov, Dmitry Kalashnikov, Vikas Sindhwani, Adrian Weller

We propose a new class of random feature methods for linearizing softmax and Gaussian kernels called hybrid random features (HRFs) that automatically adapt the quality of kernel estimation to provide the most accurate approximation in the defined regions of interest.

Graph Kernel Attention Transformers

no code implementations 16 Jul 2021 Krzysztof Choromanski, Han Lin, Haoxian Chen, Jack Parker-Holder

We introduce a new class of graph neural networks (GNNs) by combining several concepts that were so far studied independently: graph kernels, attention-based networks with structural priors, and, more recently, efficient Transformer architectures applying small-memory-footprint implicit attention methods via low-rank decomposition techniques.

Graph Attention

On the Expressive Power of Self-Attention Matrices

no code implementations 7 Jun 2021 Valerii Likhosherstov, Krzysztof Choromanski, Adrian Weller

Our proof is constructive, enabling us to propose an algorithm for finding adaptive inputs and fixed self-attention parameters in order to approximate a given matrix.

Debiasing a First-order Heuristic for Approximate Bi-level Optimization

1 code implementation 4 Jun 2021 Valerii Likhosherstov, Xingyou Song, Krzysztof Choromanski, Jared Davis, Adrian Weller

Approximate bi-level optimization (ABLO) consists of (outer-level) optimization problems, involving numerical (inner-level) optimization loops.

ES-ENAS: Blackbox Optimization over Hybrid Spaces via Combinatorial and Continuous Evolution

1 code implementation 19 Jan 2021 Xingyou Song, Krzysztof Choromanski, Jack Parker-Holder, Yunhao Tang, Daiyi Peng, Deepali Jain, Wenbo Gao, Aldo Pacchiano, Tamas Sarlos, Yuxiang Yang

We consider the problem of efficient blackbox optimization over a large hybrid search space, consisting of a mixture of a high dimensional continuous space and a complex combinatorial space.

Combinatorial Optimization, Continuous Control +3

MLGO: a Machine Learning Guided Compiler Optimizations Framework

1 code implementation 13 Jan 2021 Mircea Trofin, Yundi Qian, Eugene Brevdo, Zinan Lin, Krzysztof Choromanski, David Li

Leveraging machine-learning (ML) techniques for compiler optimizations has been widely studied and explored in academia.

Sub-Linear Memory: How to Make Performers SLiM

2 code implementations 21 Dec 2020 Valerii Likhosherstov, Krzysztof Choromanski, Jared Davis, Xingyou Song, Adrian Weller

Recent works proposed various linear self-attention mechanisms, scaling only as $O(L)$ for serial computation.

Rethinking Attention with Performers

10 code implementations ICLR 2021 Krzysztof Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, Tamas Sarlos, Peter Hawkins, Jared Davis, Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy Colwell, Adrian Weller

We introduce Performers, Transformer architectures which can estimate regular (softmax) full-rank-attention Transformers with provable accuracy, but using only linear (as opposed to quadratic) space and time complexity, without relying on any priors such as sparsity or low-rankness.

Image Generation
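Since the abstract above compresses the whole mechanism into one sentence, here is a minimal NumPy sketch of the core Performer idea: positive random features approximate the softmax kernel, so attention factorizes and runs in time linear in the sequence length L. This is an illustrative sketch under stated assumptions, not the paper's FAVOR+ implementation; all names here are invented.

```python
import numpy as np

def positive_random_features(x, w):
    # phi(x) = exp(w @ x - ||x||^2 / 2) / sqrt(m) with Gaussian rows w,
    # chosen so that E[phi(q) . phi(k)] = exp(q . k): an unbiased,
    # always-positive estimate of the (unnormalized) softmax kernel.
    m = w.shape[0]
    return np.exp(x @ w.T - 0.5 * np.sum(x ** 2, axis=-1, keepdims=True)) / np.sqrt(m)

def performer_attention(Q, K, V, n_features=256, seed=0):
    # Compute phi(Q) @ (phi(K).T @ V) instead of softmax(Q K^T / sqrt(d)) V:
    # O(L * m * d) time and no L x L attention matrix is ever materialized.
    d = Q.shape[-1]
    w = np.random.default_rng(seed).standard_normal((n_features, d))
    Qf = positive_random_features(Q / d ** 0.25, w)  # split the 1/sqrt(d)
    Kf = positive_random_features(K / d ** 0.25, w)  # scaling over Q and K
    numerator = Qf @ (Kf.T @ V)
    normalizer = Qf @ Kf.sum(axis=0)[:, None]  # row-wise softmax denominator
    return numerator / normalizer
```

With enough random features the output tracks exact softmax attention closely while using only linear space and time in L.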

On Optimism in Model-Based Reinforcement Learning

no code implementations 21 Jun 2020 Aldo Pacchiano, Philip Ball, Jack Parker-Holder, Krzysztof Choromanski, Stephen Roberts

The principle of optimism in the face of uncertainty is prevalent throughout sequential decision making problems such as multi-armed bandits and reinforcement learning (RL), often coming with strong theoretical guarantees.

Decision Making, Model-based Reinforcement Learning +1

An Ode to an ODE

no code implementations NeurIPS 2020 Krzysztof Choromanski, Jared Quincy Davis, Valerii Likhosherstov, Xingyou Song, Jean-Jacques Slotine, Jacob Varley, Honglak Lee, Adrian Weller, Vikas Sindhwani

We present a new paradigm for Neural ODE algorithms, called ODEtoODE, where time-dependent parameters of the main flow evolve according to a matrix flow on the orthogonal group O(d).

Online Hyper-parameter Tuning in Off-policy Learning via Evolutionary Strategies

no code implementations 13 Jun 2020 Yunhao Tang, Krzysztof Choromanski

Off-policy learning algorithms have been known to be sensitive to the choice of hyper-parameters.

Continuous Control

UFO-BLO: Unbiased First-Order Bilevel Optimization

no code implementations 5 Jun 2020 Valerii Likhosherstov, Xingyou Song, Krzysztof Choromanski, Jared Davis, Adrian Weller

Bilevel optimization (BLO) is a popular approach with many applications including hyperparameter optimization, neural architecture search, adversarial robustness and model-agnostic meta-learning.

bilevel optimization, Hyperparameter Optimization +3

Demystifying Orthogonal Monte Carlo and Beyond

no code implementations NeurIPS 2020 Han Lin, Haoxian Chen, Tianyi Zhang, Clement Laroche, Krzysztof Choromanski

Orthogonal Monte Carlo (OMC) is a very effective sampling algorithm imposing structural geometric conditions (orthogonality) on samples for variance reduction.
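The OMC construction is simple enough to sketch directly: draw a block of exactly orthogonal directions via QR and rescale each row to a chi-distributed norm so every sample, viewed alone, is still Gaussian. A hedged NumPy sketch (function names are illustrative, not the authors' code):

```python
import numpy as np

def orthogonal_gaussian_block(d, rng):
    # One OMC block: d mutually orthogonal directions whose marginal
    # distributions match iid N(0, I_d). Take Q from a QR decomposition
    # of a Gaussian matrix, then rescale each row to a chi_d-distributed
    # norm (the norm of an independent Gaussian vector).
    G = rng.standard_normal((d, d))
    Q, _ = np.linalg.qr(G)
    norms = np.linalg.norm(rng.standard_normal((d, d)), axis=1)
    return Q * norms[:, None]

def omc_expectation(f, d, n_blocks, rng):
    # Estimate E_{g ~ N(0, I_d)}[f(g)] from stacked orthogonal blocks;
    # unbiased, and typically lower-variance than iid sampling.
    samples = np.vstack([orthogonal_gaussian_block(d, rng) for _ in range(n_blocks)])
    return np.mean([f(s) for s in samples])
```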

Time Dependence in Non-Autonomous Neural ODEs

no code implementations 5 May 2020 Jared Quincy Davis, Krzysztof Choromanski, Jake Varley, Honglak Lee, Jean-Jacques Slotine, Valerii Likhosherstov, Adrian Weller, Ameesh Makadia, Vikas Sindhwani

Neural Ordinary Differential Equations (ODEs) are elegant reinterpretations of deep networks where continuous time can replace the discrete notion of depth, ODE solvers perform forward propagation, and the adjoint method enables efficient, constant memory backpropagation.

Image Classification, Video Prediction
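The continuous-depth view in the abstract above is easy to make concrete: a fixed-step Euler loop is the simplest stand-in for the solvers used in this line of work (illustrative only; real Neural ODE implementations use adaptive solvers and the adjoint method for memory-efficient backpropagation).

```python
import numpy as np

def odeint_euler(f, h0, t0, t1, steps=1000):
    # Fixed-step Euler integration of dh/dt = f(h, t). The explicit time
    # argument t is exactly the "non-autonomous" dependence the paper studies.
    h, t = np.array(h0, dtype=float), t0
    dt = (t1 - t0) / steps
    for _ in range(steps):
        h = h + dt * f(h, t)  # one Euler step, the continuous analogue of a residual layer
        t += dt
    return h
```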

CWY Parametrization: a Solution for Parallelized Optimization of Orthogonal and Stiefel Matrices

no code implementations 18 Apr 2020 Valerii Likhosherstov, Jared Davis, Krzysztof Choromanski, Adrian Weller

We introduce an efficient approach for optimization over orthogonal groups on highly parallel computation units such as GPUs or TPUs.

Machine Translation, Translation +1
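As a reference point for the orthogonal-parametrization theme: a product of Householder reflections is always orthogonal, and the compact WY (CWY) representation rewrites that product as I - U T U^T so it can be applied with a few batched matmuls on GPUs/TPUs. The sketch below shows only the plain sequential product, as a hedged illustration (names are invented here):

```python
import numpy as np

def householder_orthogonal(V):
    # Compose k Householder reflections H_i = I - 2 v_i v_i^T / ||v_i||^2
    # (rows of V are the reflection vectors) into one orthogonal matrix.
    d = V.shape[1]
    Q = np.eye(d)
    for v in V:
        # Right-multiply by H_i: Q <- Q - 2 (Q v) v^T / (v . v)
        Q = Q - 2.0 * np.outer(Q @ v, v) / (v @ v)
    return Q
```

Each reflection is orthogonal, so the product is too; the CWY trick changes only how the product is stored and applied, not what it represents.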

Robotic Table Tennis with Model-Free Reinforcement Learning

no code implementations 31 Mar 2020 Wenbo Gao, Laura Graesser, Krzysztof Choromanski, Xingyou Song, Nevena Lazic, Pannag Sanketi, Vikas Sindhwani, Navdeep Jaitly

We propose a model-free algorithm for learning efficient policies capable of returning table tennis balls by controlling robot joints at a rate of 100 Hz.

Curriculum Learning

Stochastic Flows and Geometric Optimization on the Orthogonal Group

no code implementations ICML 2020 Krzysztof Choromanski, David Cheikhi, Jared Davis, Valerii Likhosherstov, Achille Nazaret, Achraf Bahamou, Xingyou Song, Mrugank Akarte, Jack Parker-Holder, Jacob Bergquist, Yuan Gao, Aldo Pacchiano, Tamas Sarlos, Adrian Weller, Vikas Sindhwani

We present a new class of stochastic, geometrically-driven optimization algorithms on the orthogonal group $O(d)$ and naturally reductive homogeneous manifolds obtained from the action of the rotation group $SO(d)$.

Metric Learning, Stochastic Optimization

Rapidly Adaptable Legged Robots via Evolutionary Meta-Learning

no code implementations 2 Mar 2020 Xingyou Song, Yuxiang Yang, Krzysztof Choromanski, Ken Caluwaerts, Wenbo Gao, Chelsea Finn, Jie Tan

Learning adaptable policies is crucial for robots to operate autonomously in our complex and quickly changing world.

Legged Robots, Meta-Learning

Ready Policy One: World Building Through Active Learning

no code implementations ICML 2020 Philip Ball, Jack Parker-Holder, Aldo Pacchiano, Krzysztof Choromanski, Stephen Roberts

Model-Based Reinforcement Learning (MBRL) offers a promising direction for sample efficient learning, often achieving state of the art results for continuous control tasks.

Active Learning, Continuous Control +1

Effective Diversity in Population Based Reinforcement Learning

2 code implementations NeurIPS 2020 Jack Parker-Holder, Aldo Pacchiano, Krzysztof Choromanski, Stephen Roberts

Exploration is a key problem in reinforcement learning, since agents can only learn from data they acquire in the environment.

Point Processes

ES-MAML: Simple Hessian-Free Meta Learning

1 code implementation ICLR 2020 Xingyou Song, Wenbo Gao, Yuxiang Yang, Krzysztof Choromanski, Aldo Pacchiano, Yunhao Tang

We introduce ES-MAML, a new framework for solving the model agnostic meta learning (MAML) problem based on Evolution Strategies (ES).

Meta-Learning

Reinforcement Learning with Chromatic Networks for Compact Architecture Search

no code implementations 10 Jul 2019 Xingyou Song, Krzysztof Choromanski, Jack Parker-Holder, Yunhao Tang, Wenbo Gao, Aldo Pacchiano, Tamas Sarlos, Deepali Jain, Yuxiang Yang

We present a neural architecture search algorithm to construct compact reinforcement learning (RL) policies, by combining ENAS and ES in a highly scalable and intuitive way.

Combinatorial Optimization, Neural Architecture Search

Learning to Score Behaviors for Guided Policy Optimization

1 code implementation ICML 2020 Aldo Pacchiano, Jack Parker-Holder, Yunhao Tang, Anna Choromanska, Krzysztof Choromanski, Michael. I. Jordan

We introduce a new approach for comparing reinforcement learning policies, using Wasserstein distances (WDs) in a newly defined latent behavioral space.

Efficient Exploration, Imitation Learning

Structured Monte Carlo Sampling for Nonisotropic Distributions via Determinantal Point Processes

no code implementations 29 May 2019 Krzysztof Choromanski, Aldo Pacchiano, Jack Parker-Holder, Yunhao Tang

We propose a new class of structured methods for Monte Carlo (MC) sampling, called DPPMC, designed for high-dimensional nonisotropic distributions where samples are correlated to reduce the variance of the estimator via determinantal point processes.

Point Processes

Linear interpolation gives better gradients than Gaussian smoothing in derivative-free optimization

no code implementations 29 May 2019 Albert S. Berahas, Liyuan Cao, Krzysztof Choromanski, Katya Scheinberg

We then demonstrate via rigorous analysis of the variance and by numerical comparisons on reinforcement learning tasks that the Gaussian sampling method used in [Salimans et al. 2016] is significantly inferior to the orthogonal sampling used in [Choromanski et al. 2018] as well as more general interpolation methods.
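To make the comparison concrete, here is a hedged sketch of the two estimator families the abstract contrasts: Monte Carlo Gaussian smoothing versus finite differences along orthonormal directions, the latter being exactly the gradient of the linear interpolant through the d+1 sampled points. Function names are illustrative, not from the paper.

```python
import numpy as np

def gaussian_smoothing_grad(f, x, sigma, n_samples, rng):
    # Monte Carlo gradient of the Gaussian-smoothed surrogate
    # f_sigma(x) = E_g[f(x + sigma * g)]; unbiased for the surrogate,
    # but noisy for any finite number of samples.
    fx = f(x)
    est = np.zeros_like(x)
    for _ in range(n_samples):
        g = rng.standard_normal(x.size)
        est += (f(x + sigma * g) - fx) / sigma * g
    return est / n_samples

def interpolation_grad(f, x, sigma, rng):
    # Finite differences along d orthonormal directions: for a linear f
    # this recovers the gradient exactly from d + 1 function evaluations.
    Q, _ = np.linalg.qr(rng.standard_normal((x.size, x.size)))
    fx = f(x)
    coeffs = np.array([(f(x + sigma * q) - fx) / sigma for q in Q])
    return Q.T @ coeffs  # rotate directional derivatives back to coordinates
```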

Variance Reduction for Evolution Strategies via Structured Control Variates

no code implementations 29 May 2019 Yunhao Tang, Krzysztof Choromanski, Alp Kucukelbir

Evolution Strategies (ES) are a powerful class of blackbox optimization techniques that recently became a competitive alternative to state-of-the-art policy gradient (PG) algorithms for reinforcement learning (RL).
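As background for this line of work, a bare-bones ES ascent step with antithetic pairs (a standard variance-reduction baseline; the paper's structured control variates go further than this sketch) can be written as:

```python
import numpy as np

def es_step(f, theta, sigma=0.1, lr=0.05, n_pairs=16, rng=None):
    # One Evolution Strategies ascent step on a blackbox objective f.
    # Antithetic pairs (+e, -e) cancel even-order terms of f, which
    # already reduces the variance of the gradient estimate.
    rng = rng or np.random.default_rng(0)
    eps = rng.standard_normal((n_pairs, theta.size))
    grad = np.zeros_like(theta)
    for e in eps:
        grad += (f(theta + sigma * e) - f(theta - sigma * e)) / (2 * sigma) * e
    return theta + lr * grad / n_pairs
```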

A Theoretical and Empirical Comparison of Gradient Approximations in Derivative-Free Optimization

no code implementations 3 May 2019 Albert S. Berahas, Liyuan Cao, Krzysztof Choromanski, Katya Scheinberg

To this end, we use the results in [Berahas et al., 2019] and show how each method can satisfy the sufficient conditions, possibly only with some sufficiently large probability at each iteration, as happens to be the case with Gaussian smoothing and smoothing on a sphere.

Optimization and Control

Orthogonal Estimation of Wasserstein Distances

no code implementations 9 Mar 2019 Mark Rowland, Jiri Hron, Yunhao Tang, Krzysztof Choromanski, Tamas Sarlos, Adrian Weller

Wasserstein distances are increasingly used in a wide variety of applications in machine learning.
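A common estimation route in this setting is the sliced estimator: project onto random directions, where the 1-D Wasserstein distance has a closed form via sorting. The sketch below uses iid projection directions; the paper studies coupling them (e.g. orthogonally) to reduce the estimator's variance. Names are illustrative.

```python
import numpy as np

def w1_1d(x, y):
    # Exact 1-Wasserstein distance between two equal-size empirical
    # 1-D distributions: mean absolute difference of sorted samples.
    return np.mean(np.abs(np.sort(x) - np.sort(y)))

def sliced_w1(X, Y, n_proj=128, rng=None):
    # Sliced Wasserstein: average the 1-D distance over random unit directions.
    rng = rng or np.random.default_rng(0)
    d = X.shape[1]
    dirs = rng.standard_normal((n_proj, d))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    return np.mean([w1_1d(X @ u, Y @ u) for u in dirs])
```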

Provably Robust Blackbox Optimization for Reinforcement Learning

no code implementations 7 Mar 2019 Krzysztof Choromanski, Aldo Pacchiano, Jack Parker-Holder, Yunhao Tang, Deepali Jain, Yuxiang Yang, Atil Iscen, Jasmine Hsu, Vikas Sindhwani

Interest in derivative-free optimization (DFO) and "evolutionary strategies" (ES) has recently surged in the Reinforcement Learning (RL) community, with growing evidence that they can match state of the art methods for policy optimization problems in Robotics.

Text-to-Image Generation

From Complexity to Simplicity: Adaptive ES-Active Subspaces for Blackbox Optimization

1 code implementation NeurIPS 2019 Krzysztof Choromanski, Aldo Pacchiano, Jack Parker-Holder, Yunhao Tang

ASEBO adapts to the geometry of the function and learns, on the fly, optimal sets of sensing directions used to probe it.

Multi-Armed Bandits

Structured Evolution with Compact Architectures for Scalable Policy Optimization

no code implementations ICML 2018 Krzysztof Choromanski, Mark Rowland, Vikas Sindhwani, Richard E. Turner, Adrian Weller

We present a new method of blackbox optimization via gradient approximation with the use of structured random orthogonal matrices, providing more accurate estimators than baselines and with provable theoretical guarantees.

OpenAI Gym, Text-to-Image Generation

On the Needs for Rotations in Hypercubic Quantization Hashing

no code implementations 12 Feb 2018 Anne Morvan, Antoine Souloumiac, Krzysztof Choromanski, Cédric Gouy-Pailler, Jamal Atif

The aim of this paper is to endow the well-known family of hypercubic quantization hashing methods with theoretical guarantees.

Dimensionality Reduction, Quantization

Initialization matters: Orthogonal Predictive State Recurrent Neural Networks

no code implementations ICLR 2018 Krzysztof Choromanski, Carlton Downey, Byron Boots

In this paper, we extend the theory of ORFs to Kernel Ridge Regression and show that ORFs can be used to obtain Orthogonal PSRNNs (OPSRNNs), which are smaller and faster than PSRNNs.

Time Series

Manifold Regularization for Kernelized LSTD

no code implementations 15 Oct 2017 Xinyan Yan, Krzysztof Choromanski, Byron Boots, Vikas Sindhwani

Policy evaluation, i.e., value-function or Q-function approximation, is a key procedure in reinforcement learning (RL).

Policy Gradient Methods

Explaining How a Deep Neural Network Trained with End-to-End Learning Steers a Car

8 code implementations 25 Apr 2017 Mariusz Bojarski, Philip Yeres, Anna Choromanska, Krzysztof Choromanski, Bernhard Firner, Lawrence Jackel, Urs Muller

This eliminates the need for human engineers to anticipate what is important in an image and foresee all the necessary rules for safe driving.

Autonomous Driving, Self-Driving Cars

Graph sketching-based Space-efficient Data Clustering

1 code implementation 7 Mar 2017 Anne Morvan, Krzysztof Choromanski, Cédric Gouy-Pailler, Jamal Atif

In this paper, we address the problem of recovering arbitrary-shaped data clusters from datasets under high space constraints, as is the case, for instance, in many real-world applications where analysis algorithms are deployed directly on resource-limited mobile devices collecting the data.

The Unreasonable Effectiveness of Structured Random Orthogonal Embeddings

1 code implementation NeurIPS 2017 Krzysztof Choromanski, Mark Rowland, Adrian Weller

We examine a class of embeddings based on structured random matrices with orthogonal rows which can be applied in many machine learning applications including dimensionality reduction and kernel approximation.

Dimensionality Reduction

VisualBackProp: efficient visualization of CNNs

4 code implementations 16 Nov 2016 Mariusz Bojarski, Anna Choromanska, Krzysztof Choromanski, Bernhard Firner, Larry Jackel, Urs Muller, Karol Zieba

We furthermore justify our approach with theoretical arguments and theoretically confirm that the proposed method identifies sets of input pixels, rather than individual pixels, that collaboratively contribute to the prediction.

Self-Driving Cars

Orthogonal Random Features

no code implementations NeurIPS 2016 Felix X. Yu, Ananda Theertha Suresh, Krzysztof Choromanski, Daniel Holtmann-Rice, Sanjiv Kumar

We present an intriguing discovery related to Random Fourier Features: in Gaussian kernel approximation, replacing the random Gaussian matrix by a properly scaled random orthogonal matrix significantly decreases kernel approximation error.
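The recipe in this abstract is short enough to sketch directly: standard random Fourier features for the Gaussian kernel, plus a drop-in replacement of the Gaussian projection matrix by a row-rescaled random orthogonal matrix. This is a hedged sketch under the stated assumptions, not the authors' code; names are invented.

```python
import numpy as np

def rff(X, W):
    # Random Fourier features for the Gaussian kernel k(x, y) = exp(-||x - y||^2 / 2):
    # z(x) = [cos(Wx); sin(Wx)] / sqrt(m), so z(x) . z(y) estimates k(x, y).
    m = W.shape[0]
    P = X @ W.T
    return np.hstack([np.cos(P), np.sin(P)]) / np.sqrt(m)

def orthogonal_W(m, d, rng):
    # Orthogonal Random Features: rows from a random orthogonal matrix,
    # rescaled to chi_d-distributed norms so each row's marginal matches N(0, I).
    assert m <= d  # one block; stack several blocks for m > d
    Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    norms = np.linalg.norm(rng.standard_normal((m, d)), axis=1)
    return Q[:m] * norms[:, None]
```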

Recycling Randomness with Structure for Sublinear time Kernel Expansions

no code implementations 29 May 2016 Krzysztof Choromanski, Vikas Sindhwani

We propose a scheme for recycling Gaussian random vectors into structured matrices to approximate various kernel functions in sublinear time via random embeddings.

TripleSpin - a generic compact paradigm for fast machine learning computations

no code implementations 29 May 2016 Krzysztof Choromanski, Francois Fagan, Cedric Gouy-Pailler, Anne Morvan, Tamas Sarlos, Jamal Atif

In particular, as a byproduct of the presented techniques and by using relatively new Berry-Esseen-type CLT for random vectors, we give the first theoretical guarantees for one of the most efficient existing LSH algorithms based on the $\textbf{HD}_{3}\textbf{HD}_{2}\textbf{HD}_{1}$ structured matrix ("Practical and Optimal LSH for Angular Distance").

Quantization

On the boosting ability of top-down decision tree learning algorithm for multiclass classification

no code implementations 17 May 2016 Anna Choromanska, Krzysztof Choromanski, Mariusz Bojarski

We analyze the performance of LOMtree, a top-down multiclass classification algorithm for decision tree learning recently proposed by Choromanska and Langford (2014) for efficiently solving classification problems with a very large number of classes.

General Classification

Fast nonlinear embeddings via structured matrices

no code implementations 25 Apr 2016 Krzysztof Choromanski, Francois Fagan

Our framework covers as special cases already known structured approaches such as the Fast Johnson-Lindenstrauss Transform, but is much more general since it can be applied also to highly nonlinear embeddings.

Binary embeddings with structured hashed projections

no code implementations 16 Nov 2015 Anna Choromanska, Krzysztof Choromanski, Mariusz Bojarski, Tony Jebara, Sanjiv Kumar, Yann Lecun

We prove several theoretical results showing that projections via various structured matrices followed by nonlinear mappings accurately preserve the angular distance between input high-dimensional vectors.

Quantization based Fast Inner Product Search

no code implementations 4 Sep 2015 Ruiqi Guo, Sanjiv Kumar, Krzysztof Choromanski, David Simcha

We propose a quantization based approach for fast approximate Maximum Inner Product Search (MIPS).

Quantization

Fast Online Clustering with Randomized Skeleton Sets

no code implementations 10 Jun 2015 Krzysztof Choromanski, Sanjiv Kumar, Xiaofeng Liu

To achieve fast clustering, we propose to represent each cluster by a skeleton set which is updated continuously as new data is seen.

Online Clustering

Notes on using Determinantal Point Processes for Clustering with Applications to Text Clustering

no code implementations 26 Oct 2014 Apoorv Agarwal, Anna Choromanska, Krzysztof Choromanski

In this paper, we compare three initialization schemes for the KMEANS clustering algorithm: 1) random initialization (KMEANSRAND), 2) KMEANS++, and 3) KMEANSD++.

Point Processes, Text Clustering
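Of the three seeding schemes compared, KMEANS++ is the one with a non-trivial sampling step; a compact sketch (illustrative names, not the paper's code):

```python
import numpy as np

def kmeanspp_init(X, k, rng=None):
    # KMEANS++ seeding: pick the first center uniformly at random, then
    # pick each subsequent center with probability proportional to its
    # squared distance from the nearest center chosen so far.
    rng = rng or np.random.default_rng(0)
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        d2 = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    return np.array(centers)
```

The D^2-weighting makes far-away (under-covered) regions much more likely to receive a seed than plain random initialization.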

Differentially- and non-differentially-private random decision trees

no code implementations 26 Oct 2014 Mariusz Bojarski, Anna Choromanska, Krzysztof Choromanski, Yann Lecun

We consider supervised learning with random decision trees, where the tree construction is completely random.

On Learning from Label Proportions

1 code implementation 24 Feb 2014 Felix X. Yu, Krzysztof Choromanski, Sanjiv Kumar, Tony Jebara, Shih-Fu Chang

Learning from Label Proportions (LLP) is a learning setting, where the training data is provided in groups, or "bags", and only the proportion of each class in each bag is known.
