22 Dec 2023 • Valérie Castin, Pierre Ablin, Gabriel Peyré
This allows us to generalize attention to inputs of infinite length and to derive upper and lower bounds on the Lipschitz constant of self-attention on compact sets.
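As a rough numerical illustration of what a lower bound on the Lipschitz constant of self-attention on a compact set means, the sketch below samples token sequences inside a Euclidean ball and records the largest finite-difference ratio it observes. This is not the authors' method: the single-head unmasked attention, the random weight matrices, the Frobenius norm, and the names `self_attention` and `empirical_lipschitz_lower_bound` are all assumptions made for illustration.

```python
import numpy as np


def softmax(z, axis=-1):
    # Subtract the row maximum for numerical stability.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(X, Wq, Wk, Wv):
    """Single-head, unmasked self-attention; the n tokens are the rows of X."""
    d_k = Wk.shape[1]
    scores = (X @ Wq) @ (X @ Wk).T / np.sqrt(d_k)
    return softmax(scores, axis=-1) @ (X @ Wv)


def empirical_lipschitz_lower_bound(n, d, radius, trials=200, eps=1e-4, seed=0):
    """Largest observed ratio ||f(X + D) - f(X)||_F / ||D||_F over random draws.

    Assumption: tokens lie in the ball of the given radius (a compact set).
    X + D may leave the ball by at most eps, so this is only a heuristic
    lower bound, and random sampling can miss the worst-case inputs.
    """
    rng = np.random.default_rng(seed)
    # Hypothetical weights: random Gaussian matrices, scaled by 1/sqrt(d).
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    best = 0.0
    for _ in range(trials):
        X = rng.standard_normal((n, d))
        # Shrink any token whose norm exceeds the radius back onto the ball.
        norms = np.linalg.norm(X, axis=1, keepdims=True)
        X = X * np.minimum(1.0, radius / norms)
        # Random perturbation with Frobenius norm exactly eps.
        D = rng.standard_normal((n, d))
        D *= eps / np.linalg.norm(D)
        diff = self_attention(X + D, Wq, Wk, Wv) - self_attention(X, Wq, Wk, Wv)
        best = max(best, np.linalg.norm(diff) / eps)
    return best
```

Calling `empirical_lipschitz_lower_bound(n, d, radius)` for increasing `n` gives a crude picture of how the observed ratio changes with sequence length. Any value it returns can only underestimate the true Lipschitz constant on the ball, so it says nothing about an upper bound.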