Search Results for author: Alena Shilova

Found 6 papers, 0 papers with code

AdaStop: adaptive statistical testing for sound comparisons of Deep RL agents

no code implementations • 19 Jun 2023 • Timothée Mathieu, Riccardo Della Vecchia, Alena Shilova, Matheus Medeiros Centa, Hector Kohler, Odalric-Ambrym Maillard, Philippe Preux

When comparing several RL algorithms, a major question is how many executions must be made and how we can ensure that the results of such a comparison are theoretically sound.

Reinforcement Learning (RL)
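A minimal sketch of the general idea, assuming numpy (this is not the AdaStop procedure itself, and every function name here is hypothetical): run both agents in small batches, apply a permutation test on mean returns after each batch, and stop as soon as a crude Bonferroni correction over interim looks lets us reject equality.

```python
# Hedged sketch: sequential comparison of two RL agents' episode returns.
# NOT the AdaStop algorithm; a simple group-sequential permutation test.
import numpy as np

rng = np.random.default_rng(0)

def permutation_pvalue(x, y, n_perm=2_000):
    """Two-sided permutation test on the difference of means."""
    pooled = np.concatenate([x, y])
    observed = abs(x.mean() - y.mean())
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        count += abs(pooled[:len(x)].mean() - pooled[len(x):].mean()) >= observed
    return (count + 1) / (n_perm + 1)

def sequential_compare(run_agent_a, run_agent_b, batch=5, max_batches=10, alpha=0.05):
    """Stop early once the (Bonferroni-corrected) test rejects equality."""
    scores_a, scores_b = [], []
    for k in range(1, max_batches + 1):
        scores_a += [run_agent_a() for _ in range(batch)]
        scores_b += [run_agent_b() for _ in range(batch)]
        p = permutation_pvalue(np.array(scores_a), np.array(scores_b))
        if p < alpha / max_batches:  # crude correction for repeated looks
            return "different", k * batch, p
    return "no detected difference", max_batches * batch, p

# Hypothetical agents: stand-ins that return noisy episode returns.
decision, runs_each, p = sequential_compare(
    lambda: rng.normal(100.0, 10.0),   # agent A
    lambda: rng.normal(108.0, 10.0),   # agent B
)
print(decision, runs_each, p)
```

The Bonferroni split over interim looks is deliberately conservative; the point of adaptive procedures like AdaStop is to control error rates more tightly while stopping with as few executions as possible.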

Survey on Large Scale Neural Network Training

no code implementations • 21 Feb 2022 • Julia Gusak, Daria Cherniuk, Alena Shilova, Alexander Katrutsa, Daniel Bershatsky, Xunyi Zhao, Lionel Eyraud-Dubois, Oleg Shlyazhko, Denis Dimitrov, Ivan Oseledets, Olivier Beaumont

Modern Deep Neural Networks (DNNs) require significant memory to store weights, activations, and other intermediate tensors during training.
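One of the memory-saving techniques such surveys cover is activation checkpointing. A minimal sketch assuming PyTorch's torch.utils.checkpoint, with a hypothetical stand-in model:

```python
# Hedged sketch: activation checkpointing discards intermediate
# activations in the forward pass and recomputes them during backward,
# trading extra compute for a lower peak memory footprint.
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class CheckpointedMLP(nn.Module):
    def __init__(self, width=1024, depth=8):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Linear(width, width), nn.ReLU())
            for _ in range(depth)
        )

    def forward(self, x):
        for block in self.blocks:
            # Activations inside `block` are not stored; they are
            # recomputed when backward() reaches this block.
            x = checkpoint(block, x, use_reentrant=False)
        return x

model = CheckpointedMLP()
x = torch.randn(32, 1024, requires_grad=True)
model(x).sum().backward()  # memory saved at the cost of extra compute
```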

Efficient Combination of Rematerialization and Offloading for Training DNNs

no code implementations • NeurIPS 2021 • Olivier Beaumont, Lionel Eyraud-Dubois, Alena Shilova

Rematerialization and offloading are two well-known strategies for saving memory during the training phase of deep neural networks, allowing data scientists to consider larger models, larger batch sizes, or higher-resolution data.

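A minimal sketch of the offloading half of this trade-off, assuming PyTorch's torch.autograd.graph.save_on_cpu context manager; how to combine offloading with rematerialization efficiently is the subject of the paper, not of this snippet:

```python
# Hedged sketch: tensors saved for backward are moved to host (CPU)
# memory during the forward pass and copied back to the GPU on demand
# when the backward pass needs them.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(1024, 1024), nn.ReLU(),
    nn.Linear(1024, 1024), nn.ReLU(),
)
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
x = torch.randn(64, 1024, device=device, requires_grad=True)

# Every tensor autograd saves for backward is offloaded to host memory.
with torch.autograd.graph.save_on_cpu(pin_memory=True):
    loss = model(x).sum()
loss.backward()  # saved tensors are brought back to the device on demand
```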

Optimal checkpointing for heterogeneous chains: how to train deep neural networks with limited memory

no code implementations • 27 Nov 2019 • Julien Herrmann, Olivier Beaumont, Lionel Eyraud-Dubois, Alexis Joly, Alena Shilova

This paper introduces a new activation checkpointing method that significantly decreases memory usage when training Deep Neural Networks with the back-propagation algorithm.
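For a chain of layers, PyTorch ships a simple uniform-segment heuristic, checkpoint_sequential; the sketch below uses it only to illustrate the memory/recompute trade-off on a chain, whereas the paper computes optimal (non-uniform) checkpoint placements for chains whose stages have heterogeneous compute and memory costs:

```python
# Hedged sketch: store activations only at a few segment boundaries of a
# layer chain and recompute inside each segment during backward.
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

chain = nn.Sequential(*[
    nn.Sequential(nn.Linear(512, 512), nn.ReLU()) for _ in range(16)
])
x = torch.randn(32, 512, requires_grad=True)

# Keep only the inputs of 4 uniform segments; activations inside each
# segment are recomputed during backward instead of being stored.
out = checkpoint_sequential(chain, segments=4, input=x, use_reentrant=False)
out.sum().backward()
```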

Training on the Edge: The why and the how

no code implementations • 13 Feb 2019 • Navjot Kukreja, Alena Shilova, Olivier Beaumont, Jan Hückelheim, Nicola Ferrier, Paul Hovland, Gerard Gorman

Edge computing is the natural progression from cloud computing: instead of collecting all data and processing it centrally, as in a cloud computing environment, we distribute the computing power and try to do as much processing as possible close to the source of the data.

Distributed, Parallel, and Cluster Computing
