Search Results for author: Aditya Rawal

Found 10 papers, 5 papers with code

From Nodes to Networks: Evolving Recurrent Neural Networks

no code implementations • 12 Mar 2018 • Aditya Rawal, Risto Miikkulainen

Gated recurrent networks such as those composed of Long Short-Term Memory (LSTM) nodes have recently been used to improve the state of the art in many sequential processing tasks such as speech recognition and machine translation.

Language Modelling • Machine Translation • +3
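For reference, a gated recurrent network of the kind this abstract describes can be written in a few lines of PyTorch. The layer sizes, vocabulary, and character-level setup below are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

class LSTMSequenceModel(nn.Module):
    """Minimal LSTM-based sequence model (illustrative sizes, not from the paper)."""
    def __init__(self, vocab_size=10000, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Gated recurrent layer: LSTM cells with input, forget, and output gates.
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):
        x = self.embed(tokens)      # (batch, seq_len, embed_dim)
        out, _ = self.lstm(x)       # (batch, seq_len, hidden_dim)
        return self.head(out)       # next-token logits

# Usage: next-token logits for a batch of token ids.
model = LSTMSequenceModel()
logits = model(torch.randint(0, 10000, (4, 32)))  # shape (4, 32, 10000)
```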

First-Order Preconditioning via Hypergradient Descent

1 code implementation • 18 Oct 2019 • Ted Moskovitz, Rui Wang, Janice Lan, Sanyam Kapoor, Thomas Miconi, Jason Yosinski, Aditya Rawal

Standard gradient descent methods are susceptible to a range of issues that can impede training, such as high correlations and different scaling in parameter space. These difficulties can be addressed by second-order approaches that apply a pre-conditioning matrix to the gradient to improve convergence.
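As a rough illustration of what applying a pre-conditioning matrix to the gradient means, the NumPy sketch below takes preconditioned gradient steps on a toy ill-conditioned quadratic and adapts the preconditioner with a generic hypergradient-style update. The objective, step sizes, and update rule are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

# Toy quadratic with badly scaled parameters: plain gradient descent is slow
# along the flat direction, which is exactly what preconditioning addresses.
A = np.diag([100.0, 1.0])           # ill-conditioned Hessian (assumed example)
loss = lambda w: 0.5 * w @ A @ w
grad = lambda w: A @ w

w = np.array([1.0, 1.0])
P = np.eye(2)                       # preconditioning matrix, adapted online
lr, hyper_lr = 1e-2, 1e-2           # illustrative step sizes

for _ in range(200):
    g = grad(w)
    w_next = w - lr * (P @ g)       # preconditioned gradient step
    # Hypergradient-style update of P: since w_next = w - lr * P @ g,
    # dL(w_next)/dP = -lr * outer(grad(w_next), g); descend that hypergradient.
    P += hyper_lr * lr * np.outer(grad(w_next), g)
    w = w_next

print(loss(w))                      # decreases toward 0 on this toy problem
```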

Enhanced POET: Open-Ended Reinforcement Learning through Unbounded Invention of Learning Challenges and their Solutions

1 code implementation • ICML 2020 • Rui Wang, Joel Lehman, Aditya Rawal, Jiale Zhi, Yulun Li, Jeff Clune, Kenneth O. Stanley

Creating open-ended algorithms, which generate their own never-ending stream of novel and appropriately challenging learning opportunities, could help to automate and accelerate progress in machine learning.

Reinforcement Learning (RL)

Synthetic Petri Dish: A Novel Surrogate Model for Rapid Architecture Search

1 code implementation • 27 May 2020 • Aditya Rawal, Joel Lehman, Felipe Petroski Such, Jeff Clune, Kenneth O. Stanley

Neural Architecture Search (NAS) explores a large space of architectural motifs -- a compute-intensive process that often involves ground-truth evaluation of each motif by instantiating it within a large network, and training and evaluating the network with thousands of domain-specific data samples.

Neural Architecture Search
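The ground-truth evaluation cost described above can be made concrete with a toy sketch: each candidate motif is instantiated inside a host network, which is then trained and scored on domain data. The helper names (`instantiate`, `ground_truth_score`), host size, and stand-in data are assumptions for illustration, not code from the paper.

```python
import torch
import torch.nn as nn

def instantiate(motif, width=64):
    """Place a candidate motif inside a larger host network
    (tiny illustrative host; real ground-truth networks are far bigger)."""
    return nn.Sequential(nn.Linear(16, width), motif(), nn.Linear(width, 2))

def ground_truth_score(motif, steps=500):
    """Compute-intensive 'ground truth' evaluation: train the host network on
    data and report its loss. This is the cost a surrogate is meant to avoid."""
    torch.manual_seed(0)
    x, y = torch.randn(2000, 16), torch.randint(0, 2, (2000,))  # stand-in data
    net = instantiate(motif)
    opt = torch.optim.SGD(net.parameters(), lr=0.1)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(net(x), y)
        loss.backward()
        opt.step()
    return loss.item()

# Compare two architectural motifs by full (expensive) evaluation.
for motif in (nn.ReLU, nn.Tanh):
    print(motif.__name__, ground_truth_score(motif))
```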

Memory Efficient Continual Learning with Transformers

no code implementations • 9 Mar 2022 • Beyza Ermis, Giovanni Zappella, Martin Wistuba, Aditya Rawal, Cedric Archambeau

Moreover, applications increasingly rely on large pre-trained neural networks, such as pre-trained Transformers, since the resources or data may not be available to practitioners in sufficiently large quantities to train a model from scratch.

Continual Learning • text-classification • +1

Continual Learning with Transformers for Image Classification

no code implementations • 28 Jun 2022 • Beyza Ermis, Giovanni Zappella, Martin Wistuba, Aditya Rawal, Cedric Archambeau

This phenomenon is known as catastrophic forgetting, and it is often difficult to prevent due to practical constraints, such as the amount of data that can be stored or the limited computational resources that can be used.

Continual Learning • Image Classification • +2
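One common way to work within the storage constraint mentioned above is a memory-bounded rehearsal buffer. The sketch below uses reservoir sampling and is a generic baseline for mitigating forgetting, not the method proposed in this paper.

```python
import random

class RehearsalBuffer:
    """Generic memory-bounded replay buffer (reservoir sampling), a common
    baseline against catastrophic forgetting; not this paper's method."""
    def __init__(self, capacity=1000):
        self.capacity = capacity
        self.seen = 0
        self.data = []

    def add(self, example):
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append(example)
        else:
            # Reservoir sampling keeps a uniform sample of everything seen
            # while respecting the fixed storage budget.
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.data[j] = example

    def sample(self, k):
        return random.sample(self.data, min(k, len(self.data)))

# Usage: mix a few stored old-task examples into each new-task batch.
buffer = RehearsalBuffer(capacity=500)
for example in range(10000):        # stand-in stream of training examples
    buffer.add(example)
replay_batch = buffer.sample(32)
```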

Extreme Miscalibration and the Illusion of Adversarial Robustness

no code implementations • 27 Feb 2024 • Vyas Raina, Samson Tan, Volkan Cevher, Aditya Rawal, Sheng Zha, George Karypis

Deep learning-based Natural Language Processing (NLP) models are vulnerable to adversarial attacks, where small perturbations can cause a model to misclassify.

Adversarial Attack • Adversarial Robustness
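A minimal sketch of the kind of small perturbation the abstract refers to: greedy word substitution against a stand-in classifier until the predicted label flips. The synonym table and `toy_predict` function are hypothetical, purely for illustration.

```python
# Hypothetical near-synonym table used to generate small text perturbations.
SYNONYMS = {"good": ["decent", "fine"], "great": ["okay", "passable"]}

def word_substitution_attack(sentence, predict):
    """Try single near-synonym swaps until the predicted label changes."""
    tokens = sentence.split()
    original = predict(sentence)
    for i, tok in enumerate(tokens):
        for sub in SYNONYMS.get(tok.lower(), []):
            perturbed = " ".join(tokens[:i] + [sub] + tokens[i + 1:])
            if predict(perturbed) != original:
                return perturbed      # small perturbation flipped the label
    return None                       # no successful perturbation found

# Stand-in classifier: a brittle keyword rule, purely for demonstration.
toy_predict = lambda s: "positive" if "good" in s or "great" in s else "negative"
print(word_substitution_attack("the movie was good", toy_predict))
```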
