Search Results for author: Tamas Sarlos

Found 22 papers, 9 papers with code

SARA-RT: Scaling up Robotics Transformers with Self-Adaptive Robust Attention

no code implementations · 4 Dec 2023 · Isabel Leal, Krzysztof Choromanski, Deepali Jain, Avinava Dubey, Jake Varley, Michael Ryoo, Yao Lu, Frederick Liu, Vikas Sindhwani, Quan Vuong, Tamas Sarlos, Ken Oslund, Karol Hausman, Kanishka Rao

We present Self-Adaptive Robust Attention for Robotics Transformers (SARA-RT): a new paradigm for addressing the emerging challenge of scaling up Robotics Transformers (RT) for on-robot deployment.

Hot PATE: Private Aggregation of Distributions for Diverse Tasks

no code implementations · 4 Dec 2023 · Edith Cohen, Xin Lyu, Jelani Nelson, Tamas Sarlos, Uri Stemmer

Until now, PATE has primarily been explored with classification-like tasks, where each example possesses a ground-truth label, and knowledge is transferred to the student by labeling public examples.

Privacy Preserving

FAVOR#: Sharp Attention Kernel Approximations via New Classes of Positive Random Features

no code implementations · 1 Feb 2023 · Valerii Likhosherstov, Krzysztof Choromanski, Avinava Dubey, Frederick Liu, Tamas Sarlos, Adrian Weller

The problem of efficient approximation of a linear operator induced by the Gaussian or softmax kernel is often addressed using random features (RFs) which yield an unbiased approximation of the operator's result.
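The classical instance of this idea is random Fourier features, which estimate the Gaussian kernel from cosines of random projections. A minimal NumPy sketch of that baseline (the feature count and test vectors are illustrative, not from the paper):

```python
import numpy as np

def gaussian_rff(x, y, num_features=4096, sigma=1.0, seed=0):
    """Unbiased random Fourier feature estimate of the Gaussian kernel
    k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    rng = np.random.default_rng(seed)
    # Projections w ~ N(0, I / sigma^2) and phases b ~ U[0, 2*pi).
    w = rng.normal(scale=1.0 / sigma, size=(num_features, len(x)))
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
    phi = lambda v: np.sqrt(2.0 / num_features) * np.cos(w @ v + b)
    return float(phi(x) @ phi(y))

x = np.array([0.2, -0.1, 0.4])
y = np.array([0.1, 0.0, 0.3])
exact = float(np.exp(-np.sum((x - y) ** 2) / 2.0))
approx = gaussian_rff(x, y)
```

More features shrink the variance at the usual O(1/sqrt(m)) rate; the trigonometric features can go negative, which is the deficiency the positive-feature constructions address.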

Chefs' Random Tables: Non-Trigonometric Random Features

1 code implementation · 30 May 2022 · Valerii Likhosherstov, Krzysztof Choromanski, Avinava Dubey, Frederick Liu, Tamas Sarlos, Adrian Weller

We introduce chefs' random tables (CRTs), a new class of non-trigonometric random features (RFs) to approximate Gaussian and softmax kernels.

From block-Toeplitz matrices to differential equations on graphs: towards a general theory for scalable masked Transformers

1 code implementation · 16 Jul 2021 · Krzysztof Choromanski, Han Lin, Haoxian Chen, Tianyi Zhang, Arijit Sehanobish, Valerii Likhosherstov, Jack Parker-Holder, Tamas Sarlos, Adrian Weller, Thomas Weingarten

In this paper we provide, to the best of our knowledge, the first comprehensive approach for incorporating various masking mechanisms into Transformer architectures in a scalable way.

Graph Attention

ES-ENAS: Efficient Evolutionary Optimization for Large Hybrid Search Spaces

1 code implementation · 19 Jan 2021 · Xingyou Song, Krzysztof Choromanski, Jack Parker-Holder, Yunhao Tang, Qiuyi Zhang, Daiyi Peng, Deepali Jain, Wenbo Gao, Aldo Pacchiano, Tamas Sarlos, Yuxiang Yang

In this paper, we approach the problem of optimizing blackbox functions over large hybrid search spaces consisting of both combinatorial and continuous parameters.

Combinatorial Optimization, Continuous Control, +4

Differentially Private Weighted Sampling

no code implementations · 25 Oct 2020 · Edith Cohen, Ofir Geri, Tamas Sarlos, Uri Stemmer

A weighted sample of keys by (a function of) frequency is a highly versatile summary that provides a sparse set of representative keys and supports approximate evaluations of query statistics.
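A standard non-private building block for such summaries is bottom-k sampling with exponential keys (PPSWOR-style): each key draws a key value from Exp(rate = w_i) and the k smallest are kept, so heavy keys are included with higher probability. A minimal sketch of that non-private sampler (the weights are hypothetical; the paper's contribution is the differentially private version):

```python
import numpy as np

def weighted_bottom_k_sample(weights, k, seed=0):
    """Bottom-k sample with exponential keys: key_i ~ Exp(rate=w_i);
    keeping the k smallest keys favors heavier keys."""
    rng = np.random.default_rng(seed)
    # NumPy parametrizes the exponential by scale = 1 / rate.
    keys = rng.exponential(scale=1.0 / np.asarray(weights, dtype=float))
    return np.argsort(keys)[:k]

freqs = [50, 30, 10, 5, 3, 1, 1]        # hypothetical key frequencies
sample = weighted_bottom_k_sample(freqs, k=3)
```

The exponential-key trick makes the sample mergeable and composable, which is what makes such summaries attractive for large distributed datasets.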

Rethinking Attention with Performers

12 code implementations · ICLR 2021 · Krzysztof Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, Tamas Sarlos, Peter Hawkins, Jared Davis, Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy Colwell, Adrian Weller

We introduce Performers, Transformer architectures which can estimate regular (softmax) full-rank-attention Transformers with provable accuracy, but using only linear (as opposed to quadratic) space and time complexity, without relying on any priors such as sparsity or low-rankness.
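The core trick is replacing softmax(QKᵀ)V with a kernel feature map φ so attention factors as φ(Q)(φ(K)ᵀV), which never materializes the L×L attention matrix. A toy NumPy sketch compared against exact softmax attention (feature map, sizes, and scaling are illustrative):

```python
import numpy as np

def linear_attention(Q, K, V, phi):
    """Performer-style attention: phi(Q) @ (phi(K).T @ V), row-normalized.
    Cost is linear in sequence length L instead of quadratic."""
    Qp, Kp = phi(Q), phi(K)                   # (L, m) random feature maps
    out = Qp @ (Kp.T @ V)                     # (L, d_v), no L x L matrix formed
    normalizer = Qp @ Kp.sum(axis=0)          # row sums of the implicit kernel matrix
    return out / normalizer[:, None]

def make_positive_phi(d, m, seed=0):
    """Positive softmax-kernel features phi(v) = exp(Wv - ||v||^2/2) / sqrt(m)."""
    W = np.random.default_rng(seed).normal(size=(d, m))
    return lambda X: np.exp(X @ W - 0.5 * np.sum(X * X, axis=1, keepdims=True)) / np.sqrt(m)

L, d = 8, 4
rng = np.random.default_rng(1)
Q, K, V = (0.3 * rng.normal(size=(L, d)) for _ in range(3))
out = linear_attention(Q, K, V, make_positive_phi(d, m=4096))

# Reference: exact quadratic softmax attention.
A = np.exp(Q @ K.T)
exact = (A / A.sum(axis=1, keepdims=True)) @ V
```

With enough features the two outputs agree closely, while the linear version needs only O(Lmd) time and O(md) auxiliary memory.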

D4RL, Image Generation, +1

Stochastic Flows and Geometric Optimization on the Orthogonal Group

no code implementations · ICML 2020 · Krzysztof Choromanski, David Cheikhi, Jared Davis, Valerii Likhosherstov, Achille Nazaret, Achraf Bahamou, Xingyou Song, Mrugank Akarte, Jack Parker-Holder, Jacob Bergquist, Yuan Gao, Aldo Pacchiano, Tamas Sarlos, Adrian Weller, Vikas Sindhwani

We present a new class of stochastic, geometrically-driven optimization algorithms on the orthogonal group $O(d)$ and naturally reductive homogeneous manifolds obtained from the action of the rotation group $SO(d)$.
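A common primitive in optimization on $O(d)$ is a retraction that keeps iterates exactly orthogonal; one standard choice is the Cayley transform of a skew-symmetric update direction. A minimal sketch (the Cayley retraction is a generic manifold-optimization tool, not necessarily the paper's specific flow):

```python
import numpy as np

def cayley_step(X, G, lr=0.1):
    """One step on the orthogonal group O(d): build a skew-symmetric
    direction from the Euclidean gradient G, then retract with the
    Cayley transform, which is exactly orthogonal."""
    A = G @ X.T - X @ G.T                         # skew-symmetric: A = -A.T
    I = np.eye(X.shape[0])
    Q = np.linalg.solve(I + (lr / 2.0) * A, I - (lr / 2.0) * A)
    return Q @ X

d = 4
X = np.eye(d)
G = np.random.default_rng(0).normal(size=(d, d))  # arbitrary gradient, for illustration
X1 = cayley_step(X, G)
orth_err = np.linalg.norm(X1 @ X1.T - np.eye(d))  # stays near 0: X1 is orthogonal
```

Because $(I + A/2)^{-1}(I - A/2)$ is orthogonal for any skew-symmetric $A$, the iterate never drifts off the manifold, with no re-orthogonalization needed.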

Metric Learning Stochastic Optimization

Reinforcement Learning with Chromatic Networks

no code implementations · 25 Sep 2019 · Xingyou Song, Krzysztof Choromanski, Jack Parker-Holder, Yunhao Tang, Wenbo Gao, Aldo Pacchiano, Tamas Sarlos, Deepali Jain, Yuxiang Yang

We present a neural architecture search algorithm to construct compact reinforcement learning (RL) policies, by combining ENAS and ES in a highly scalable and intuitive way.

Neural Architecture Search, reinforcement-learning, +1

Reinforcement Learning with Chromatic Networks for Compact Architecture Search

no code implementations · 10 Jul 2019 · Xingyou Song, Krzysztof Choromanski, Jack Parker-Holder, Yunhao Tang, Wenbo Gao, Aldo Pacchiano, Tamas Sarlos, Deepali Jain, Yuxiang Yang

We present a neural architecture search algorithm to construct compact reinforcement learning (RL) policies, by combining ENAS and ES in a highly scalable and intuitive way.

Combinatorial Optimization, Neural Architecture Search, +2

Matrix-Free Preconditioning in Online Learning

no code implementations · 29 May 2019 · Ashok Cutkosky, Tamas Sarlos

We provide an online convex optimization algorithm with regret that interpolates between the regret of an algorithm using an optimal preconditioning matrix and one using a diagonal preconditioning matrix.
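The diagonal end of that spectrum is AdaGrad-style preconditioning, where each coordinate is scaled by the inverse root of its accumulated squared gradients. A minimal sketch of that diagonal baseline on a toy quadratic (the learning rate and problem are illustrative, not from the paper):

```python
import numpy as np

def adagrad(grad_fn, x0, lr=0.5, steps=200, eps=1e-8):
    """Online gradient descent with a diagonal preconditioner (AdaGrad):
    coordinate i is scaled by 1 / sqrt(sum of squared gradients of i)."""
    x = np.array(x0, dtype=float)
    g2 = np.zeros_like(x)
    for _ in range(steps):
        g = grad_fn(x)
        g2 += g * g
        x -= lr * g / (np.sqrt(g2) + eps)
    return x

# Ill-conditioned toy quadratic f(x) = 0.5 * x @ A @ x; minimum at the origin.
A = np.diag([100.0, 1.0])
x_star = adagrad(lambda x: A @ x, x0=[1.0, 1.0])
```

The per-coordinate scaling lets both the steep and the flat directions make comparable progress, which is exactly the effect a full preconditioning matrix generalizes.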

Benchmarking

Linear Additive Markov Processes

1 code implementation · 5 Apr 2017 · Ravi Kumar, Maithra Raghu, Tamas Sarlos, Andrew Tomkins

We introduce LAMP: the Linear Additive Markov Process.
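In a LAMP, the next state is generated by sampling a lag from a fixed weight distribution and then taking one step of an ordinary Markov chain from that past state, so the process mixes first-order dynamics with longer memory. A toy simulation sketch (the chain and lag weights are made-up examples):

```python
import numpy as np

def lamp_step(history, P, lag_weights, rng):
    """One LAMP transition: sample a lag from the fixed weight vector,
    then take one step of the base Markov chain P from that past state."""
    lag = rng.choice(len(lag_weights), p=lag_weights)
    past = history[-(lag + 1)]
    return int(rng.choice(P.shape[0], p=P[past]))

# Made-up 2-state base chain and lag weights over the last 3 states.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
lag_weights = np.array([0.7, 0.2, 0.1])

rng = np.random.default_rng(0)
history = [0, 0, 1]                      # seed trajectory, newest state last
for _ in range(5):
    history.append(lamp_step(history, P, lag_weights, rng))
```

With lag weight 1.0 on the most recent state this reduces to an ordinary Markov chain; spreading weight over older states adds long-range dependence while keeping only one transition matrix to learn.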

TripleSpin - a generic compact paradigm for fast machine learning computations

no code implementations · 29 May 2016 · Krzysztof Choromanski, Francois Fagan, Cedric Gouy-Pailler, Anne Morvan, Tamas Sarlos, Jamal Atif

In particular, as a byproduct of the presented techniques and by using relatively new Berry-Esseen-type CLT for random vectors, we give the first theoretical guarantees for one of the most efficient existing LSH algorithms based on the $\textbf{HD}_{3}\textbf{HD}_{2}\textbf{HD}_{1}$ structured matrix ("Practical and Optimal LSH for Angular Distance").

BIG-bench Machine Learning, Quantization

Fastfood: Approximate Kernel Expansions in Loglinear Time

1 code implementation · 13 Aug 2014 · Quoc Viet Le, Tamas Sarlos, Alexander Johannes Smola

These improvements, especially in terms of memory usage, make kernel methods more practical for applications that have large training sets and/or require real-time prediction.
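Fastfood replaces a dense Gaussian projection with products of diagonal and Walsh-Hadamard matrices, V = S H G Π H B / (σ√d), cutting the cost to O(d log d) time and O(d) storage. A hedged NumPy sketch of one such block (the normalization of S is approximate; this is an illustration, not a faithful reimplementation):

```python
import numpy as np

def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform, O(d log d);
    len(x) must be a power of 2."""
    x = x.copy()
    h = 1
    while h < len(x):
        for i in range(0, len(x), 2 * h):
            a, b = x[i:i + h].copy(), x[i + h:i + 2 * h].copy()
            x[i:i + h], x[i + h:i + 2 * h] = a + b, a - b
        h *= 2
    return x

def fastfood_block(x, sigma=1.0, seed=0):
    """One Fastfood projection: random signs B, Gaussian diagonal G,
    permutation Pi, and a row-length correction S stand in for a dense
    Gaussian matrix, using only diagonal and Hadamard multiplies."""
    d = len(x)
    rng = np.random.default_rng(seed)
    B = rng.choice([-1.0, 1.0], size=d)
    G = rng.normal(size=d)
    Pi = rng.permutation(d)
    S = np.sqrt(rng.chisquare(d, size=d)) / np.linalg.norm(G)
    v = fwht(B * x)
    v = G * v[Pi]
    v = fwht(v)
    return S * v / (sigma * np.sqrt(d))

x = np.array([0.5, -0.3, 0.2, 0.1])
z = fastfood_block(x)
```

Only the three diagonals and the permutation are stored, so the memory footprint is O(d) versus O(d²) for an explicit Gaussian matrix; that is the practical win the abstract refers to.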
