2 code implementations • ICLR 2022 • Paul-Aymeric McRae, Prasanna Parthasarathi, Mahmoud Assran, Sarath Chandar
Popular approaches for minimizing loss in data-driven learning often involve an abstraction or an explicit retention of the history of gradients for efficient parameter updates.