no code implementations • 23 Oct 2024 • Artem Basharin, Andrei Chertkov, Ivan Oseledets
We propose a new model for multi-token prediction in transformers, aiming to enhance sampling efficiency without compromising accuracy.
no code implementations • 5 Feb 2024 • Gleb Ryzhakov, Andrei Chertkov, Artem Basharin, Ivan Oseledets
We develop a new method, HTBB, for multidimensional black-box approximation and gradient-free optimization, based on the low-rank hierarchical Tucker decomposition and the MaxVol index selection procedure.
1 code implementation • 28 Dec 2023 • Nikita Pospelov, Andrei Chertkov, Maxim Beketov, Ivan Oseledets, Konstantin Anokhin
For a living system, such as a neuron, whose response to a stimulus is unknown and not differentiable, the only way to reveal these features is through a feedback loop that exposes it to a large set of different stimuli.
1 code implementation • 20 Mar 2023 • Andrei Chertkov, Olga Tsymboi, Mikhail Pautov, Ivan Oseledets
Neural networks are widely deployed in natural language processing tasks on an industrial scale, perhaps most often as components of automatic machine translation systems.
1 code implementation • 9 May 2022 • Artyom Nikitin, Andrei Chertkov, Rafael Ballester-Ripoll, Ivan Oseledets, Evgeny Frolov
The problem is formulated as a Quadratic Unconstrained Binary Optimization (QUBO) problem, which, because it is NP-hard, is solved using quantum annealing on a quantum computer provided by D-Wave.
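For readers unfamiliar with the term, a QUBO asks for a binary vector x that minimizes x^T Q x for a given matrix Q. The sketch below illustrates only that general definition by exhaustive search on a tiny, hypothetical 3-variable instance. The matrix `Q` and the helper `solve_qubo_brute_force` are assumptions made for illustration; they are not the formulation or the solver used in the paper, which relies on quantum annealing.

```python
from itertools import product


def solve_qubo_brute_force(Q):
    """Minimize x^T Q x over all binary vectors x by exhaustive search.

    Illustrative only: the cost grows as 2^n, which is why heuristic or
    annealing-based solvers are used for realistic problem sizes.
    """
    n = len(Q)
    best_x, best_value = None, float("inf")
    for x in product((0, 1), repeat=n):
        # Evaluate the quadratic form x^T Q x for this candidate.
        value = sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
        if value < best_value:
            best_x, best_value = x, value
    return best_x, best_value


# Hypothetical instance (not from the paper): diagonal entries reward
# setting a variable to 1, off-diagonal entries penalize adjacent pairs.
Q = [
    [-1, 2, 0],
    [0, -1, 2],
    [0, 0, -1],
]

best_x, best_value = solve_qubo_brute_force(Q)
print(best_x, best_value)
```

Exhaustive search is chosen here only because it makes the objective explicit; it is feasible for a handful of variables at most.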
1 code implementation • 30 Apr 2022 • Konstantin Sozykin, Andrei Chertkov, Roman Schutski, Anh-Huy Phan, Andrzej Cichocki, Ivan Oseledets
We present a novel optimization procedure that combines an efficient quantized tensor train representation with a generalized maximum matrix volume principle.
no code implementations • 14 Feb 2022 • Valentin Khrulkov, Gleb Ryzhakov, Andrei Chertkov, Ivan Oseledets
Diffusion models have recently outperformed alternative approaches, such as GANs, at modeling the distribution of natural images.