1 code implementation • NeurIPS 2021 • Aurelien Lucchi, Antonio Orvieto, Adamos Solomou
We prove that this approach converges to a second-order stationary point at a much faster rate than vanilla methods: namely, the complexity in terms of the number of function evaluations is only linear in the problem dimension.
1 code implementation • 15 Oct 2021 • Sankalan Pal Chowdhury, Adamos Solomou, Avinava Dubey, Mrinmaya Sachan
In this work, we introduce KERNELIZED TRANSFORMER, a generic, scalable, data-driven framework for learning the kernel function in Transformers.