Search Results for author: Krzysztof Choromanski

Found 72 papers, 23 papers with code

SARA-RT: Scaling up Robotics Transformers with Self-Adaptive Robust Attention

no code implementations4 Dec 2023 Isabel Leal, Krzysztof Choromanski, Deepali Jain, Avinava Dubey, Jake Varley, Michael Ryoo, Yao Lu, Frederick Liu, Vikas Sindhwani, Quan Vuong, Tamas Sarlos, Ken Oslund, Karol Hausman, Kanishka Rao

We present Self-Adaptive Robust Attention for Robotics Transformers (SARA-RT): a new paradigm for addressing the emerging challenge of scaling up Robotics Transformers (RT) for on-robot deployment.

Scalable Neural Network Kernels

1 code implementation20 Oct 2023 Arijit Sehanobish, Krzysztof Choromanski, Yunfan Zhao, Avinava Dubey, Valerii Likhosherstov

We introduce the concept of scalable neural network kernels (SNNKs), replacements for regular feedforward layers (FFLs) capable of approximating the latter, but with favorable computational properties.

Universal Graph Random Features

no code implementations7 Oct 2023 Isaac Reid, Krzysztof Choromanski, Eli Berger, Adrian Weller

This includes many of the most popular examples of kernels defined on the nodes of a graph.

Node Clustering

Repelling Random Walks

no code implementations7 Oct 2023 Isaac Reid, Eli Berger, Krzysztof Choromanski, Adrian Weller

We present a novel quasi-Monte Carlo mechanism to improve graph-based sampling, coined repelling random walks.

Augmenting conformers with structured state-space sequence models for online speech recognition

no code implementations15 Sep 2023 Haozhe Shan, Albert Gu, Zhong Meng, Weiran Wang, Krzysztof Choromanski, Tara Sainath

Online speech recognition, where the model only accesses context to the left, is an important and challenging use case for ASR systems.

Speech Recognition

Taming graph kernels with random features

no code implementations29 Apr 2023 Krzysztof Choromanski

We also introduce a (still unbiased) quasi-Monte Carlo variant of GRFs, q-GRFs, relying on the so-called reinforced random walks, that might be used to optimize the variance of GRFs.

Graph Clustering

Practical Conformer: Optimizing size, speed and flops of Conformer for on-Device and cloud ASR

no code implementations31 Mar 2023 Rami Botros, Anmol Gulati, Tara N. Sainath, Krzysztof Choromanski, Ruoming Pang, Trevor Strohman, Weiran Wang, Jiahui Yu

Conformer models maintain a large number of internal states, the vast majority of which are associated with self-attention layers.

FAVOR#: Sharp Attention Kernel Approximations via New Classes of Positive Random Features

no code implementations1 Feb 2023 Valerii Likhosherstov, Krzysztof Choromanski, Avinava Dubey, Frederick Liu, Tamas Sarlos, Adrian Weller

The problem of efficient approximation of a linear operator induced by the Gaussian or softmax kernel is often addressed using random features (RFs) which yield an unbiased approximation of the operator's result.

Simplex Random Features

1 code implementation31 Jan 2023 Isaac Reid, Krzysztof Choromanski, Valerii Likhosherstov, Adrian Weller

We present Simplex Random Features (SimRFs), a new random feature (RF) mechanism for unbiased approximation of the softmax and Gaussian kernels by geometrical correlation of random projection vectors.

Karyotype AI for Precision Oncology

no code implementations20 Nov 2022 Zahra Shamsi, Drew Bryant, Jacob Wilson, Xiaoyu Qu, Avinava Dubey, Konik Kothari, Mostafa Dehghani, Mariya Chavarha, Valerii Likhosherstov, Brian Williams, Michael Frumkin, Fred Appelbaum, Krzysztof Choromanski, Ali Bashir, Min Fang

These individual chromosomes were used to train and assess deep learning models for classifying the 24 human chromosomes and identifying chromosomal aberrations.

Few-Shot Learning Inductive Bias

Multiple View Performers for Shape Completion

no code implementations13 Sep 2022 David Watkins-Valls, Peter Allen, Krzysztof Choromanski, Jacob Varley, Nicholas Waytowich

We propose the Multiple View Performer (MVP) - a new architecture for 3D shape completion from a series of temporally sequential views.

Implicit Two-Tower Policies

no code implementations2 Aug 2022 Yunfan Zhao, Qingkai Pan, Krzysztof Choromanski, Deepali Jain, Vikas Sindhwani

We present a new class of structured reinforcement learning policy-architectures, Implicit Two-Tower (ITT) policies, where the actions are chosen based on the attention scores of their learnable latent representations with those of the input states.

OpenAI Gym Vocal Bursts Valence Prediction
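The one-line summary above already contains the gist of ITT action selection; the sketch below is a hypothetical illustration (the names, shapes, and greedy argmax are my assumptions, not the authors' code) of scoring candidate actions by the dot product of their latent representations with the state's representation.

```python
# Hypothetical sketch of two-tower action selection; not the authors' implementation.
import numpy as np

def itt_act(state, state_tower, action_latents):
    # state_tower: callable mapping a raw state to a latent vector (the "state tower").
    # action_latents: (num_actions, k) learnable latent representations of the actions.
    z = state_tower(state)              # (k,) latent representation of the input state
    scores = action_latents @ z         # attention-style scores, one per candidate action
    return int(np.argmax(scores))       # greedy choice; a softmax policy is equally natural
```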

Chefs' Random Tables: Non-Trigonometric Random Features

1 code implementation30 May 2022 Valerii Likhosherstov, Krzysztof Choromanski, Avinava Dubey, Frederick Liu, Tamas Sarlos, Adrian Weller

We introduce chefs' random tables (CRTs), a new class of non-trigonometric random features (RFs) to approximate Gaussian and softmax kernels.

PolyViT: Co-training Vision Transformers on Images, Videos and Audio

no code implementations25 Nov 2021 Valerii Likhosherstov, Anurag Arnab, Krzysztof Choromanski, Mario Lucic, Yi Tay, Adrian Weller, Mostafa Dehghani

Can we train a single transformer model capable of processing multiple modalities and datasets, whilst sharing almost all of its learnable parameters?

Audio Classification

Hybrid Random Features

1 code implementation ICLR 2022 Krzysztof Choromanski, Haoxian Chen, Han Lin, Yuanzhe Ma, Arijit Sehanobish, Deepali Jain, Michael S Ryoo, Jake Varley, Andy Zeng, Valerii Likhosherstov, Dmitry Kalashnikov, Vikas Sindhwani, Adrian Weller

We propose a new class of random feature methods for linearizing softmax and Gaussian kernels called hybrid random features (HRFs) that automatically adapt the quality of kernel estimation to provide the most accurate approximation in the defined regions of interest.

Benchmarking

From block-Toeplitz matrices to differential equations on graphs: towards a general theory for scalable masked Transformers

1 code implementation16 Jul 2021 Krzysztof Choromanski, Han Lin, Haoxian Chen, Tianyi Zhang, Arijit Sehanobish, Valerii Likhosherstov, Jack Parker-Holder, Tamas Sarlos, Adrian Weller, Thomas Weingarten

In this paper we provide, to the best of our knowledge, the first comprehensive approach for incorporating various masking mechanisms into Transformer architectures in a scalable way.

Graph Attention

On the Expressive Power of Self-Attention Matrices

no code implementations7 Jun 2021 Valerii Likhosherstov, Krzysztof Choromanski, Adrian Weller

Our proof is constructive, enabling us to propose an algorithm for finding adaptive inputs and fixed self-attention parameters in order to approximate a given matrix.

LEMMA

Debiasing a First-order Heuristic for Approximate Bi-level Optimization

1 code implementation4 Jun 2021 Valerii Likhosherstov, Xingyou Song, Krzysztof Choromanski, Jared Davis, Adrian Weller

Approximate bi-level optimization (ABLO) consists of (outer-level) optimization problems involving numerical (inner-level) optimization loops.

ES-ENAS: Efficient Evolutionary Optimization for Large Hybrid Search Spaces

2 code implementations19 Jan 2021 Xingyou Song, Krzysztof Choromanski, Jack Parker-Holder, Yunhao Tang, Qiuyi Zhang, Daiyi Peng, Deepali Jain, Wenbo Gao, Aldo Pacchiano, Tamas Sarlos, Yuxiang Yang

In this paper, we approach the problem of optimizing blackbox functions over large hybrid search spaces consisting of both combinatorial and continuous parameters.

Combinatorial Optimization Continuous Control +4

MLGO: a Machine Learning Guided Compiler Optimizations Framework

1 code implementation13 Jan 2021 Mircea Trofin, Yundi Qian, Eugene Brevdo, Zinan Lin, Krzysztof Choromanski, David Li

Leveraging machine-learning (ML) techniques for compiler optimizations has been widely studied and explored in academia.

BIG-bench Machine Learning

Sub-Linear Memory: How to Make Performers SLiM

2 code implementations NeurIPS 2021 Valerii Likhosherstov, Krzysztof Choromanski, Jared Davis, Xingyou Song, Adrian Weller

Recent works proposed various linear self-attention mechanisms, scaling only as $O(L)$ for serial computation.

Rethinking Attention with Performers

12 code implementations ICLR 2021 Krzysztof Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, Tamas Sarlos, Peter Hawkins, Jared Davis, Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy Colwell, Adrian Weller

We introduce Performers, Transformer architectures which can estimate regular (softmax) full-rank-attention Transformers with provable accuracy, but using only linear (as opposed to quadratic) space and time complexity, without relying on any priors such as sparsity or low-rankness.

D4RL Image Generation +2
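The key idea is that the softmax attention kernel exp(q·k/√d) can be estimated unbiasedly with positive random features, after which attention reduces to two matrix products computed in linear time and memory. The NumPy sketch below illustrates that mechanism with i.i.d. Gaussian features; the paper's FAVOR+ additionally uses orthogonal features and is not reproduced here.

```python
# Minimal sketch of softmax-attention linearization via positive random features.
import numpy as np

def positive_features(x, w):
    # phi(x) = exp(w.x - ||x||^2 / 2) / sqrt(m) is an unbiased feature map for
    # the kernel exp(x.y) when the rows of w are drawn i.i.d. from N(0, I).
    m = w.shape[0]
    return np.exp(x @ w.T - 0.5 * np.sum(x * x, axis=-1, keepdims=True)) / np.sqrt(m)

def performer_attention(Q, K, V, m=256, seed=0):
    # Q, K: (L, d); V: (L, d_v). Rescaling Q and K by d**-0.25 folds the usual
    # 1/sqrt(d) temperature into the kernel exp(q.k / sqrt(d)).
    L, d = Q.shape
    w = np.random.default_rng(seed).standard_normal((m, d))
    q = positive_features(Q * d ** -0.25, w)        # (L, m)
    k = positive_features(K * d ** -0.25, w)        # (L, m)
    num = q @ (k.T @ V)                             # O(L m d_v) instead of O(L^2 d_v)
    den = q @ k.sum(axis=0)                         # (L,) row normalizers
    return num / den[:, None]
```

Because the product φ(Q)(φ(K)ᵀV) is associative, the quadratic L×L attention matrix is never materialized.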

Towards Tractable Optimism in Model-Based Reinforcement Learning

no code implementations21 Jun 2020 Aldo Pacchiano, Philip J. Ball, Jack Parker-Holder, Krzysztof Choromanski, Stephen Roberts

The principle of optimism in the face of uncertainty is prevalent throughout sequential decision making problems such as multi-armed bandits and reinforcement learning (RL).

Continuous Control Decision Making +4

An Ode to an ODE

no code implementations NeurIPS 2020 Krzysztof Choromanski, Jared Quincy Davis, Valerii Likhosherstov, Xingyou Song, Jean-Jacques Slotine, Jacob Varley, Honglak Lee, Adrian Weller, Vikas Sindhwani

We present a new paradigm for Neural ODE algorithms, called ODEtoODE, where time-dependent parameters of the main flow evolve according to a matrix flow on the orthogonal group O(d).

Online Hyper-parameter Tuning in Off-policy Learning via Evolutionary Strategies

no code implementations13 Jun 2020 Yunhao Tang, Krzysztof Choromanski

Off-policy learning algorithms have been known to be sensitive to the choice of hyper-parameters.

Continuous Control

UFO-BLO: Unbiased First-Order Bilevel Optimization

no code implementations5 Jun 2020 Valerii Likhosherstov, Xingyou Song, Krzysztof Choromanski, Jared Davis, Adrian Weller

Bilevel optimization (BLO) is a popular approach with many applications including hyperparameter optimization, neural architecture search, adversarial robustness and model-agnostic meta-learning.

Adversarial Robustness Bilevel Optimization +4

Demystifying Orthogonal Monte Carlo and Beyond

no code implementations NeurIPS 2020 Han Lin, Haoxian Chen, Tianyi Zhang, Clement Laroche, Krzysztof Choromanski

Orthogonal Monte Carlo (OMC) is a very effective sampling algorithm imposing structural geometric conditions (orthogonality) on samples for variance reduction.

Time Dependence in Non-Autonomous Neural ODEs

no code implementations ICLR Workshop DeepDiffEq 2019 Jared Quincy Davis, Krzysztof Choromanski, Jake Varley, Honglak Lee, Jean-Jacques Slotine, Valerii Likhosterov, Adrian Weller, Ameesh Makadia, Vikas Sindhwani

Neural Ordinary Differential Equations (ODEs) are elegant reinterpretations of deep networks where continuous time can replace the discrete notion of depth, ODE solvers perform forward propagation, and the adjoint method enables efficient, constant memory backpropagation.

Image Classification Video Prediction

CWY Parametrization: a Solution for Parallelized Optimization of Orthogonal and Stiefel Matrices

no code implementations18 Apr 2020 Valerii Likhosherstov, Jared Davis, Krzysztof Choromanski, Adrian Weller

We introduce an efficient approach for optimization over orthogonal groups on highly parallel computation units such as GPUs or TPUs.

Machine Translation Translation +1

Robotic Table Tennis with Model-Free Reinforcement Learning

no code implementations31 Mar 2020 Wenbo Gao, Laura Graesser, Krzysztof Choromanski, Xingyou Song, Nevena Lazic, Pannag Sanketi, Vikas Sindhwani, Navdeep Jaitly

We propose a model-free algorithm for learning efficient policies capable of returning table tennis balls by controlling robot joints at a rate of 100Hz.

Reinforcement Learning (RL)

Stochastic Flows and Geometric Optimization on the Orthogonal Group

no code implementations ICML 2020 Krzysztof Choromanski, David Cheikhi, Jared Davis, Valerii Likhosherstov, Achille Nazaret, Achraf Bahamou, Xingyou Song, Mrugank Akarte, Jack Parker-Holder, Jacob Bergquist, Yuan Gao, Aldo Pacchiano, Tamas Sarlos, Adrian Weller, Vikas Sindhwani

We present a new class of stochastic, geometrically-driven optimization algorithms on the orthogonal group $O(d)$ and naturally reductive homogeneous manifolds obtained from the action of the rotation group $SO(d)$.

Metric Learning Stochastic Optimization

Rapidly Adaptable Legged Robots via Evolutionary Meta-Learning

no code implementations2 Mar 2020 Xingyou Song, Yuxiang Yang, Krzysztof Choromanski, Ken Caluwaerts, Wenbo Gao, Chelsea Finn, Jie Tan

Learning adaptable policies is crucial for robots to operate autonomously in our complex and quickly changing world.

Meta-Learning

Ready Policy One: World Building Through Active Learning

no code implementations ICML 2020 Philip Ball, Jack Parker-Holder, Aldo Pacchiano, Krzysztof Choromanski, Stephen Roberts

Model-Based Reinforcement Learning (MBRL) offers a promising direction for sample efficient learning, often achieving state of the art results for continuous control tasks.

Active Learning Continuous Control +1

ES-MAML: Simple Hessian-Free Meta Learning

1 code implementation ICLR 2020 Xingyou Song, Wenbo Gao, Yuxiang Yang, Krzysztof Choromanski, Aldo Pacchiano, Yunhao Tang

We introduce ES-MAML, a new framework for solving the model agnostic meta learning (MAML) problem based on Evolution Strategies (ES).

Meta-Learning

Behavior-Guided Reinforcement Learning

no code implementations25 Sep 2019 Aldo Pacchiano, Jack Parker-Holder, Yunhao Tang, Anna Choromanska, Krzysztof Choromanski, Michael I. Jordan

We introduce a new approach for comparing reinforcement learning policies, using Wasserstein distances (WDs) in a newly defined latent behavioral space.

Reinforcement Learning (RL)

Reinforcement Learning with Chromatic Networks

no code implementations25 Sep 2019 Xingyou Song, Krzysztof Choromanski, Jack Parker-Holder, Yunhao Tang, Wenbo Gao, Aldo Pacchiano, Tamas Sarlos, Deepali Jain, Yuxiang Yang

We present a neural architecture search algorithm to construct compact reinforcement learning (RL) policies, by combining ENAS and ES in a highly scalable and intuitive way.

Neural Architecture Search Reinforcement Learning (RL) +1

Reinforcement Learning with Chromatic Networks for Compact Architecture Search

no code implementations10 Jul 2019 Xingyou Song, Krzysztof Choromanski, Jack Parker-Holder, Yunhao Tang, Wenbo Gao, Aldo Pacchiano, Tamas Sarlos, Deepali Jain, Yuxiang Yang

We present a neural architecture search algorithm to construct compact reinforcement learning (RL) policies, by combining ENAS and ES in a highly scalable and intuitive way.

Combinatorial Optimization Neural Architecture Search +2

Learning to Score Behaviors for Guided Policy Optimization

1 code implementation ICML 2020 Aldo Pacchiano, Jack Parker-Holder, Yunhao Tang, Anna Choromanska, Krzysztof Choromanski, Michael. I. Jordan

We introduce a new approach for comparing reinforcement learning policies, using Wasserstein distances (WDs) in a newly defined latent behavioral space.

Efficient Exploration Imitation Learning +2

Linear interpolation gives better gradients than Gaussian smoothing in derivative-free optimization

no code implementations29 May 2019 Albert S. Berahas, Liyuan Cao, Krzysztof Choromanski, Katya Scheinberg

We then demonstrate via a rigorous analysis of the variance and by numerical comparisons on reinforcement learning tasks that the Gaussian sampling method used in [Salimans et al. 2016] is significantly inferior to the orthogonal sampling used in [Choromanski et al. 2018] as well as to more general interpolation methods.

Reinforcement Learning (RL)
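To make the two estimator families concrete, here is a hedged sketch (mine, not the paper's experimental code) of a Gaussian-smoothing gradient estimate built from random directions versus a simple linear-interpolation gradient obtained from coordinate finite differences.

```python
# Toy comparison of derivative-free gradient estimators; step sizes are illustrative.
import numpy as np

def gaussian_smoothing_grad(f, x, num_samples=50, sigma=1e-2, seed=0):
    # Monte Carlo estimate of the gradient of the Gaussian-smoothed objective.
    rng = np.random.default_rng(seed)
    fx, g = f(x), np.zeros_like(x)
    for _ in range(num_samples):
        u = rng.standard_normal(x.size)
        g += (f(x + sigma * u) - fx) / sigma * u
    return g / num_samples

def interpolation_grad(f, x, h=1e-4):
    # Gradient of the linear model interpolating f at x and the d points x + h*e_i,
    # i.e., the forward finite-difference vector.
    fx, g = f(x), np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x); e[i] = h
        g[i] = (f(x + e) - fx) / h
    return g
```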

Structured Monte Carlo Sampling for Nonisotropic Distributions via Determinantal Point Processes

no code implementations29 May 2019 Krzysztof Choromanski, Aldo Pacchiano, Jack Parker-Holder, Yunhao Tang

We propose a new class of structured methods for Monte Carlo (MC) sampling, called DPPMC, designed for high-dimensional nonisotropic distributions where samples are correlated to reduce the variance of the estimator via determinantal point processes.

Point Processes

Variance Reduction for Evolution Strategies via Structured Control Variates

no code implementations29 May 2019 Yunhao Tang, Krzysztof Choromanski, Alp Kucukelbir

Evolution Strategies (ES) are a powerful class of blackbox optimization techniques that recently became a competitive alternative to state-of-the-art policy gradient (PG) algorithms for reinforcement learning (RL).

Reinforcement Learning (RL)

A Theoretical and Empirical Comparison of Gradient Approximations in Derivative-Free Optimization

no code implementations3 May 2019 Albert S. Berahas, Liyuan Cao, Krzysztof Choromanski, Katya Scheinberg

To this end, we use the results in [Berahas et al., 2019] and show how each method can satisfy the sufficient conditions, possibly only with some sufficiently large probability at each iteration, as happens to be the case with Gaussian smoothing and smoothing on a sphere.

Optimization and Control

From Complexity to Simplicity: Adaptive ES-Active Subspaces for Blackbox Optimization

1 code implementation NeurIPS 2019 Krzysztof Choromanski, Aldo Pacchiano, Jack Parker-Holder, Yunhao Tang

ASEBO adapts to the geometry of the function and learns optimal sets of sensing directions, which are used to probe it on the fly.

Multi-Armed Bandits

Provably Robust Blackbox Optimization for Reinforcement Learning

no code implementations7 Mar 2019 Krzysztof Choromanski, Aldo Pacchiano, Jack Parker-Holder, Yunhao Tang, Deepali Jain, Yuxiang Yang, Atil Iscen, Jasmine Hsu, Vikas Sindhwani

Interest in derivative-free optimization (DFO) and "evolutionary strategies" (ES) has recently surged in the Reinforcement Learning (RL) community, with growing evidence that they can match state of the art methods for policy optimization problems in Robotics.

Reinforcement Learning (RL) +1

Structured Evolution with Compact Architectures for Scalable Policy Optimization

no code implementations ICML 2018 Krzysztof Choromanski, Mark Rowland, Vikas Sindhwani, Richard E. Turner, Adrian Weller

We present a new method of blackbox optimization via gradient approximation with the use of structured random orthogonal matrices, providing more accurate estimators than baselines and with provable theoretical guarantees.

OpenAI Gym Text-to-Image Generation
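For intuition, the sketch below gives a hedged NumPy version of an antithetic ES gradient estimator with orthogonal sensing directions (obtained here by a QR decomposition; the paper's faster structured constructions and compact policy architectures are not reproduced).

```python
# Antithetic ES gradient estimate with orthogonal perturbation directions (sketch).
import numpy as np

def orthogonal_es_gradient(f, theta, num_dirs=None, sigma=0.1, seed=0):
    d = theta.size
    num_dirs = num_dirs or d                      # at most d mutually orthogonal directions
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    eps = Q[:num_dirs] * np.sqrt(d)               # rescale rows to the typical Gaussian norm
    g = np.zeros(d)
    for e in eps:
        g += (f(theta + sigma * e) - f(theta - sigma * e)) * e
    return g / (2.0 * sigma * num_dirs)

# Usage (ascent on a blackbox reward): theta = theta + lr * orthogonal_es_gradient(reward, theta)
```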

On the Needs for Rotations in Hypercubic Quantization Hashing

no code implementations12 Feb 2018 Anne Morvan, Antoine Souloumiac, Krzysztof Choromanski, Cédric Gouy-Pailler, Jamal Atif

The aim of this paper is to endow the well-known family of hypercubic quantization hashing methods with theoretical guarantees.

Dimensionality Reduction Quantization

Initialization matters: Orthogonal Predictive State Recurrent Neural Networks

no code implementations ICLR 2018 Krzysztof Choromanski, Carlton Downey, Byron Boots

In this paper, we extend the theory of ORFs to Kernel Ridge Regression and show that ORFs can be used to obtain Orthogonal PSRNNs (OPSRNNs), which are smaller and faster than PSRNNs.

Regression Time Series +1

Manifold Regularization for Kernelized LSTD

no code implementations15 Oct 2017 Xinyan Yan, Krzysztof Choromanski, Byron Boots, Vikas Sindhwani

Policy evaluation, i.e., value function or Q-function approximation, is a key procedure in reinforcement learning (RL).

Policy Gradient Methods Reinforcement Learning (RL)

Explaining How a Deep Neural Network Trained with End-to-End Learning Steers a Car

8 code implementations25 Apr 2017 Mariusz Bojarski, Philip Yeres, Anna Choromanska, Krzysztof Choromanski, Bernhard Firner, Lawrence Jackel, Urs Muller

This eliminates the need for human engineers to anticipate what is important in an image and foresee all the necessary rules for safe driving.

Autonomous Driving Self-Driving Cars

Graph sketching-based Space-efficient Data Clustering

1 code implementation7 Mar 2017 Anne Morvan, Krzysztof Choromanski, Cédric Gouy-Pailler, Jamal Atif

In this paper, we address the problem of recovering arbitrary-shaped data clusters from datasets under high space constraints, as is the case, for instance, in many real-world applications where analysis algorithms are deployed directly on resource-limited mobile devices collecting the data.

Clustering

The Unreasonable Effectiveness of Structured Random Orthogonal Embeddings

2 code implementations NeurIPS 2017 Krzysztof Choromanski, Mark Rowland, Adrian Weller

We examine a class of embeddings based on structured random matrices with orthogonal rows which can be applied in many machine learning applications including dimensionality reduction and kernel approximation.

BIG-bench Machine Learning Dimensionality Reduction

VisualBackProp: efficient visualization of CNNs

4 code implementations16 Nov 2016 Mariusz Bojarski, Anna Choromanska, Krzysztof Choromanski, Bernhard Firner, Larry Jackel, Urs Muller, Karol Zieba

We furthermore justify our approach with theoretical arguments and theoretically confirm that the proposed method identifies sets of input pixels, rather than individual pixels, that collaboratively contribute to the prediction.

Self-Driving Cars

Orthogonal Random Features

no code implementations NeurIPS 2016 Felix X. Yu, Ananda Theertha Suresh, Krzysztof Choromanski, Daniel Holtmann-Rice, Sanjiv Kumar

We present an intriguing discovery related to Random Fourier Features: in Gaussian kernel approximation, replacing the random Gaussian matrix by a properly scaled random orthogonal matrix significantly decreases kernel approximation error.
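Concretely, the construction replaces the i.i.d. Gaussian projection of Random Fourier Features with an orthogonal matrix whose rows are rescaled to chi-distributed norms, keeping the kernel estimator unbiased. The sketch below (my own minimal version, not the paper's code) builds one d×d block; for more features one stacks independent blocks.

```python
# Orthogonal Random Features for the Gaussian kernel exp(-||x - y||^2 / (2 sigma^2)).
import numpy as np

def orf_block(d, sigma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((d, d)))   # orthonormal rows
    norms = np.sqrt(rng.chisquare(df=d, size=d))        # match the norms of i.i.d. Gaussian rows
    return (norms[:, None] * Q) / sigma                 # (d, d) projection block

def orf_features(X, W):
    # Unbiased feature map z(x) = [cos(Wx), sin(Wx)] / sqrt(m), m = number of rows of W.
    proj = X @ W.T
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=-1) / np.sqrt(W.shape[0])

# Kernel estimate: K_hat = orf_features(X, W) @ orf_features(Y, W).T
```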

Recycling Randomness with Structure for Sublinear time Kernel Expansions

no code implementations29 May 2016 Krzysztof Choromanski, Vikas Sindhwani

We propose a scheme for recycling Gaussian random vectors into structured matrices to approximate various kernel functions in sublinear time via random embeddings.

TripleSpin - a generic compact paradigm for fast machine learning computations

no code implementations29 May 2016 Krzysztof Choromanski, Francois Fagan, Cedric Gouy-Pailler, Anne Morvan, Tamas Sarlos, Jamal Atif

In particular, as a byproduct of the presented techniques and by using relatively new Berry-Esseen-type CLT for random vectors, we give the first theoretical guarantees for one of the most efficient existing LSH algorithms based on the $\textbf{HD}_{3}\textbf{HD}_{2}\textbf{HD}_{1}$ structured matrix ("Practical and Optimal LSH for Angular Distance").

BIG-bench Machine Learning Quantization

On the boosting ability of top-down decision tree learning algorithm for multiclass classification

no code implementations17 May 2016 Anna Choromanska, Krzysztof Choromanski, Mariusz Bojarski

We analyze the performance of the top-down multiclass classification algorithm for decision tree learning called LOMtree, recently proposed by Choromanska and Langford (2014) for efficiently solving classification problems with a very large number of classes.

General Classification

Fast nonlinear embeddings via structured matrices

no code implementations25 Apr 2016 Krzysztof Choromanski, Francois Fagan

Our framework covers as special cases already known structured approaches such as the Fast Johnson-Lindenstrauss Transform, but is much more general since it can be applied also to highly nonlinear embeddings.

Binary embeddings with structured hashed projections

no code implementations16 Nov 2015 Anna Choromanska, Krzysztof Choromanski, Mariusz Bojarski, Tony Jebara, Sanjiv Kumar, Yann Lecun

We prove several theoretical results showing that projections via various structured matrices followed by nonlinear mappings accurately preserve the angular distance between input high-dimensional vectors.

LEMMA
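As a minimal illustration (one particular structure chosen here; the paper analyzes a broader family and its guarantees), a binary embedding can be produced by a circulant projection preceded by random sign flips and followed by a sign nonlinearity, with the Hamming distance between codes approximating the angular distance.

```python
# Sketch of a structured (circulant) binary embedding; one instance of the family studied.
import numpy as np

def binary_embed(x, c, signs):
    # c: (d,) generating vector of the circulant matrix; signs: (d,) random +/-1 vector.
    C = np.stack([np.roll(c, i) for i in range(c.size)])   # circulant projection matrix
    return np.sign(C @ (signs * x))                         # (d,) vector of +/-1 bits

def angular_distance_estimate(bx, by):
    # Fraction of disagreeing bits, which approximates theta(x, y) / pi
    # as in random-hyperplane hashing.
    return np.mean(bx != by)
```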

Quantization based Fast Inner Product Search

no code implementations4 Sep 2015 Ruiqi Guo, Sanjiv Kumar, Krzysztof Choromanski, David Simcha

We propose a quantization based approach for fast approximate Maximum Inner Product Search (MIPS).

Quantization
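As a toy illustration of quantization-based MIPS (plain nearest-codeword assignment here; the paper learns codebooks with an objective tailored to inner-product error), inner products with every database item reduce to a table lookup of query-codeword products.

```python
# Sketch of approximate Maximum Inner Product Search with a quantized database.
import numpy as np

def assign_codes(X, codebook):
    # Assign each database vector to its nearest codeword (Euclidean).
    d2 = ((X[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)   # (n, K)
    return d2.argmin(axis=1)                                          # (n,)

def mips_search(query, codes, codebook, topk=10):
    lut = codebook @ query          # (K,) inner products of the query with each codeword
    scores = lut[codes]             # (n,) approximate <query, x_i> via table lookup
    return np.argsort(-scores)[:topk]
```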

Fast Online Clustering with Randomized Skeleton Sets

no code implementations10 Jun 2015 Krzysztof Choromanski, Sanjiv Kumar, Xiaofeng Liu

To achieve fast clustering, we propose to represent each cluster by a skeleton set which is updated continuously as new data is seen.

Clustering Nonparametric Clustering +1

Differentially- and non-differentially-private random decision trees

no code implementations26 Oct 2014 Mariusz Bojarski, Anna Choromanska, Krzysztof Choromanski, Yann Lecun

We consider supervised learning with random decision trees, where the tree construction is completely random.

Notes on using Determinantal Point Processes for Clustering with Applications to Text Clustering

no code implementations26 Oct 2014 Apoorv Agarwal, Anna Choromanska, Krzysztof Choromanski

In this paper, we compare three initialization schemes for the KMEANS clustering algorithm: 1) random initialization (KMEANSRAND), 2) KMEANS++, and 3) KMEANSD++.

Clustering Point Processes +1

On Learning from Label Proportions

1 code implementation24 Feb 2014 Felix X. Yu, Krzysztof Choromanski, Sanjiv Kumar, Tony Jebara, Shih-Fu Chang

Learning from Label Proportions (LLP) is a learning setting, where the training data is provided in groups, or "bags", and only the proportion of each class in each bag is known.

Marketing
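As a toy illustration of the setting (not the paper's algorithm or analysis), one can train any probabilistic classifier by penalizing the mismatch between its average predicted class distribution in each bag and that bag's known label proportions.

```python
# Sketch of a bag-level proportion-matching loss for Learning from Label Proportions.
import numpy as np

def bag_proportion_loss(probs, bag_ids, bag_props):
    # probs: (n, c) per-instance class probabilities from any model.
    # bag_ids: (n,) bag index of each instance; bag_props: (B, c) known proportions.
    loss = 0.0
    for b, target in enumerate(bag_props):
        avg = probs[bag_ids == b].mean(axis=0)      # average prediction inside bag b
        loss += np.sum((avg - target) ** 2)
    return loss / len(bag_props)
```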
