ICLR 2019

Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning

ICLR 2019 uber-research/deep-neuroevolution

Here we demonstrate that gradient-free methods can compete: we evolve the weights of a DNN with a simple, gradient-free, population-based genetic algorithm (GA), and it performs well on hard deep RL problems, including Atari and humanoid locomotion.
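The setup described above can be sketched in a few lines. This is a minimal truncation-selection GA over flat parameter vectors, not the paper's implementation: the function names, population size, and mutation scale are illustrative choices.

```python
import numpy as np

def evolve(fitness_fn, dim, pop_size=20, elite=5, sigma=0.1, generations=50, seed=0):
    """Minimal gradient-free GA: each individual is a flat weight vector,
    children are elite parents plus Gaussian mutation (toy sketch)."""
    rng = np.random.default_rng(seed)
    pop = rng.normal(size=(pop_size, dim))
    for _ in range(generations):
        scores = np.array([fitness_fn(w) for w in pop])
        parents = pop[np.argsort(scores)[-elite:]]      # truncation selection
        children = parents[rng.integers(elite, size=pop_size - 1)]
        pop = np.vstack([parents[-1],                   # elitism: keep the champion
                         children + sigma * rng.normal(size=(pop_size - 1, dim))])
    return pop[0]

# Toy fitness: negative squared distance to a target weight vector.
target = np.ones(8)
best = evolve(lambda w: -np.sum((w - target) ** 2), dim=8, generations=100)
```

In the real setting, `fitness_fn` would be an episode return from rolling out the DNN parameterized by `w`; the point is that no gradient of it is ever taken.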


PPO-CMA: Proximal Policy Optimization with Covariance Matrix Adaptation

ICLR 2019 ppocma/ppocma

Drawing inspiration from CMA-ES, a black-box evolutionary optimization method designed for robustness in similar situations, we propose PPO-CMA, a proximal policy optimization approach that adaptively expands the exploration variance to speed up progress.
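The CMA-ES-flavored idea can be illustrated with a toy diagonal-Gaussian update. This is a hypothetical sketch, not the paper's algorithm: only positive-advantage actions contribute, and the variance is fit around the *old* mean before the mean moves, so it can expand toward promising regions instead of collapsing.

```python
import numpy as np

def update_gaussian_policy(mean, var, actions, advantages, lr=0.5):
    """Toy CMA-ES-style update of a diagonal Gaussian policy (illustrative).

    Only actions with positive advantage are kept; the variance update uses
    deviations from the old mean, allowing exploration variance to grow
    along directions where good actions were found.
    """
    mask = advantages > 0
    good = actions[mask]
    w = advantages[mask]
    w = w / w.sum()                                   # advantage-weighted
    new_var = (1 - lr) * var + lr * (w[:, None] * (good - mean) ** 2).sum(0)
    new_mean = (1 - lr) * mean + lr * (w[:, None] * good).sum(0)
    return new_mean, new_var
```

In PPO-CMA proper, the mean and variance are produced by neural networks and trained on minibatches; the sketch only shows the ordering trick (variance before mean) that drives the adaptive exploration.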


Exploring the Landscape of Spatial Robustness

ICLR 2019 MadryLab/adversarial_spatial

The study of adversarial robustness has so far largely focused on perturbations bounded in ℓp norms.
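A spatial perturbation, by contrast, is a small coordinate transform of the input rather than additive pixel noise. Below is a minimal worst-of-k sketch using translations only (rotations omitted for brevity); the function name and parameters are illustrative, not the paper's code.

```python
import numpy as np

def worst_of_k_translation(image, loss_fn, max_shift=3, k=20, seed=0):
    """Worst-of-k spatial attack sketch: sample k small integer translations
    and keep the one that maximizes the given loss."""
    rng = np.random.default_rng(seed)
    best, best_loss = image, loss_fn(image)
    for _ in range(k):
        dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
        shifted = np.roll(image, (dy, dx), axis=(0, 1))
        loss = loss_fn(shifted)
        if loss > best_loss:
            best, best_loss = shifted, loss
    return best
```

Against a classifier, `loss_fn` would be the cross-entropy of the model's prediction on the transformed image; no ℓp ball is involved.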


Supervised Policy Update for Deep Reinforcement Learning

ICLR 2019 quanvuong/Supervised_Policy_Update

We show how the Natural Policy Gradient and Trust Region Policy Optimization (NPG/TRPO) problems and the Proximal Policy Optimization (PPO) problem can be addressed by the Supervised Policy Update methodology.
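The two-step structure (solve for a non-parametric target policy, then fit the network to it by supervised learning) can be illustrated with one standard closed form for a KL-constrained improvement step: reweight the old policy by exponentiated advantages. This is a hedged sketch of that general pattern, not the paper's exact formulation; `lam` plays the role of the constraint's Lagrange multiplier.

```python
import numpy as np

def spu_targets(old_probs, advantages, lam=1.0):
    """Non-parametric target policy: pi*(a|s) proportional to
    pi_old(a|s) * exp(A(s,a) / lam), the closed-form solution of a
    KL-regularized improvement step (illustrative sketch)."""
    target = old_probs * np.exp(advantages / lam)
    return target / target.sum(axis=-1, keepdims=True)
```

The second step, omitted here, is ordinary supervised learning: minimize the KL divergence from the parametrized policy to these targets over a batch of states.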

Towards the Latent Transcriptome

ICLR 2019 TrofimovAssya/TheLatentTranscriptome

In this work we propose a method to compute continuous embeddings for kmers from raw RNA-seq data, without the need for alignment to a reference genome.
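The alignment-free flavor of the approach can be sketched with a toy pipeline: count k-mer occurrences per read directly from the sequences, then factorize the count matrix so that k-mers appearing in similar contexts get similar vectors. This is a simplified stand-in (co-occurrence counts plus SVD), not the paper's model, and all names are illustrative.

```python
import numpy as np
from itertools import product

def kmer_embeddings(reads, k=3, dim=4):
    """Toy alignment-free k-mer embeddings: build a k-mer x read count
    matrix straight from raw sequences, then take a truncated SVD."""
    vocab = [''.join(p) for p in product('ACGT', repeat=k)]
    index = {km: i for i, km in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(reads)))
    for j, read in enumerate(reads):
        for s in range(len(read) - k + 1):
            counts[index[read[s:s + k]], j] += 1
    u, sv, _ = np.linalg.svd(np.log1p(counts), full_matrices=False)
    d = min(dim, counts.shape[1])              # rank is bounded by #reads
    return vocab, u[:, :d] * sv[:d]
```

No reference genome appears anywhere: the embedding is derived purely from which k-mers co-occur in which reads.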

WAIC, but Why? Generative Ensembles for Robust Anomaly Detection

ICLR 2019 ericjang/odin

Machine learning models encounter Out-of-Distribution (OoD) errors when the data seen at test time are generated from a different stochastic generator than the one used to generate the training data.
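The WAIC score named in the title combines the ensemble's mean log-likelihood with a variance penalty, so inputs the ensemble disagrees on are flagged. A minimal sketch, assuming the per-model log-likelihoods have already been computed:

```python
import numpy as np

def waic_score(log_likelihoods):
    """WAIC-style score from an ensemble of generative models.

    `log_likelihoods` has shape (n_models, n_examples), holding
    log p_theta(x) for each ensemble member. Lower scores indicate
    likely OoD inputs: the variance term penalizes ensemble disagreement.
    """
    return log_likelihoods.mean(axis=0) - log_likelihoods.var(axis=0)
```

In practice each row would come from an independently trained generative model evaluated on the same test batch; thresholding the score gives the anomaly detector.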


Zero-Shot Dual Machine Translation

ICLR 2019 liernisestorain/zero-shot-dual-MT

Our method also obtains improvements in the setting where a small amount of parallel data for the zero-shot language pair is available.


Anomaly Detection With Multiple-Hypotheses Predictions

ICLR 2019 YeongHyeon/ConAD-PyTorch

In one-class-learning tasks, only the normal case (foreground) can be modeled with data, whereas the variation of all possible anomalies is too erratic to be described by samples.
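A common way to realize multiple-hypotheses predictions is a winner-takes-all loss over several prediction heads: only the hypothesis closest to the target carries the full reconstruction loss, so the heads specialize to different modes of the normal data. The sketch below is a generic version of that idea, with illustrative names; the small `eps` share keeps unused heads from receiving no signal at all.

```python
import numpy as np

def winner_takes_all_loss(predictions, target, eps=0.05):
    """Multiple-hypotheses (winner-takes-all) reconstruction loss sketch.

    `predictions` has shape (n_hypotheses, dim). The closest hypothesis
    gets weight 1 - eps; the rest split eps, so every head keeps a small
    gradient signal.
    """
    errs = ((predictions - target) ** 2).mean(axis=1)
    best = errs.argmin()
    weights = np.full(len(errs), eps / (len(errs) - 1))
    weights[best] = 1.0 - eps
    return float(weights @ errs)
```

At test time, an input is scored as anomalous when even the best hypothesis reconstructs it poorly.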