Search Results for author: Joel Lehman

Found 28 papers, 20 papers with code

Quality-Diversity through AI Feedback

no code implementations • 19 Oct 2023 • Herbie Bradley, Andrew Dai, Hannah Teufel, Jenny Zhang, Koen Oostermeijer, Marco Bellagente, Jeff Clune, Kenneth Stanley, Grégory Schott, Joel Lehman

In many text-generation problems, users may prefer not a single response but a diverse range of high-quality outputs from which to choose.

Diversity Text Generation

Quality Diversity through Human Feedback: Towards Open-Ended Diversity-Driven Optimization

1 code implementation • 18 Oct 2023 • Li Ding, Jenny Zhang, Jeff Clune, Lee Spector, Joel Lehman

Meanwhile, Quality Diversity (QD) algorithms excel at identifying diverse and high-quality solutions but often rely on manually crafted diversity metrics.

Diversity reinforcement-learning +4
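
For readers unfamiliar with QD, below is a minimal MAP-Elites-style sketch of the loop such algorithms run: keep an archive gridded by a behavior descriptor and retain the highest-quality solution per cell. The toy quality function, descriptor, and grid are illustrative assumptions; QDHF's contribution is precisely to learn the diversity metric from human feedback rather than hand-craft it as done here.

```python
import numpy as np

rng = np.random.default_rng(0)

def quality(x):
    # Hand-crafted quality measure on a toy 5-D search space.
    return -float(np.sum(x ** 2))

def descriptor(x):
    # Hand-crafted behavior descriptor: the first two coordinates, binned
    # onto a 10x10 grid. QDHF's point is to learn this metric from human
    # feedback rather than specify it by hand, as done here.
    cell = np.clip(((x[:2] + 2.0) / 4.0 * 10).astype(int), 0, 9)
    return tuple(cell)

archive = {}  # grid cell -> (quality, solution)
for _ in range(5000):
    if archive:
        _, parent = list(archive.values())[rng.integers(len(archive))]
        child = parent + 0.1 * rng.standard_normal(5)  # Gaussian mutation
    else:
        child = rng.uniform(-2.0, 2.0, size=5)
    q, cell = quality(child), descriptor(child)
    if cell not in archive or q > archive[cell][0]:
        archive[cell] = (q, child)  # keep the best solution found per cell

print(f"filled {len(archive)} cells; best quality "
      f"{max(q for q, _ in archive.values()):.3f}")
```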

OMNI: Open-endedness via Models of human Notions of Interestingness

1 code implementation • 2 Jun 2023 • Jenny Zhang, Joel Lehman, Kenneth Stanley, Jeff Clune

An Achilles' heel of open-endedness research is the inability to quantify (and thus prioritize) tasks that are not just learnable, but also interesting (e.g., worthwhile and novel).

Language Model Crossover: Variation through Few-Shot Prompting

1 code implementation • 23 Feb 2023 • Elliot Meyerson, Mark J. Nelson, Herbie Bradley, Adam Gaier, Arash Moradi, Amy K. Hoover, Joel Lehman

The promise of such language model crossover, which is simple to implement and can leverage many different open-source language models, is that it offers a straightforward mechanism for evolving semantically rich text representations with few domain-specific tweaks, one that naturally benefits from continuing progress in language models.

In-Context Learning Language Modelling
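
A minimal sketch of the crossover-by-prompting idea: concatenate parent genomes as few-shot examples and treat the model's continuation as the child. The `lm_complete` stand-in below is an assumption so the sketch runs without a model; in practice it would be any autoregressive language model's completion call, and this is not the paper's code.

```python
import random

random.seed(0)

def lm_complete(prompt: str) -> str:
    # Stand-in for a real language-model completion call (assumed interface:
    # prompt in, continuation out). This dummy shuffles words drawn from the
    # prompt so the sketch runs end-to-end without a model.
    words = [w for w in prompt.split() if w != "Example:"]
    random.shuffle(words)
    return " ".join(words[:4])

def lmx_crossover(parents: list[str]) -> str:
    # Few-shot prompting as crossover: list the parents as examples and let
    # the model's continuation of the pattern act as the child genome.
    prompt = "".join(f"Example: {p}\n" for p in parents) + "Example:"
    return lm_complete(prompt).strip()

population = ["a red fast car", "a blue slow boat", "a green quick plane"]
for gen in range(3):
    child = lmx_crossover(random.sample(population, 2))
    population.append(child)  # evaluation/selection omitted in this sketch
    print(f"gen {gen}: {child}")
```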

Machine Love

no code implementations • 18 Feb 2023 • Joel Lehman

While ML generates much economic value, many of us have problematic relationships with social media and other ML-powered applications.

Artificial Life Philosophy

Evolution through Large Models

no code implementations • 17 Jun 2022 • Joel Lehman, Jonathan Gordon, Shawn Jain, Kamal Ndousse, Cathy Yeh, Kenneth O. Stanley

This paper pursues the insight that large language models (LLMs) trained to generate code can vastly improve the effectiveness of mutation operators applied to programs in genetic programming (GP).

Language Modelling
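
A sketch of the skeleton this suggests: a genetic-programming loop whose mutation operator is an LLM rewrite of the program. The `llm_propose_patch` stand-in and the toy fitness are assumptions so the sketch runs without a model; the paper itself uses diff-trained code models and richer machinery.

```python
import random
import re

random.seed(0)

def llm_propose_patch(source: str) -> str:
    # Stand-in for a code-generating LLM mutation (the paper uses LLMs
    # trained on code diffs); this dummy perturbs one numeric literal so
    # the sketch runs without a model.
    return re.sub(r"\d+",
                  lambda m: str(int(m.group()) + random.randint(-5, 5)),
                  source, count=1)

def fitness(source: str) -> float:
    # Toy fitness: execute the candidate program and score how close the
    # function f() gets to a target value (illustrative only).
    env = {}
    try:
        exec(source, env)
        return -abs(env["f"]() - 42)
    except Exception:
        return float("-inf")

population = ["def f():\n    return 1\n"] * 8
for gen in range(30):
    children = [llm_propose_patch(random.choice(population)) for _ in range(8)]
    population = sorted(population + children, key=fitness)[-8:]  # truncation
print(f"best fitness: {max(fitness(p) for p in population)}")
```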

Open Questions in Creating Safe Open-ended AI: Tensions Between Control and Creativity

1 code implementation • 12 Jun 2020 • Adrien Ecoffet, Jeff Clune, Joel Lehman

This paper proposes that open-ended evolution and artificial life have much to contribute towards the understanding of open-ended AI, focusing here in particular on the safety of open-ended search.

Artificial Life

Reinforcement Learning Under Moral Uncertainty

1 code implementation • 8 Jun 2020 • Adrien Ecoffet, Joel Lehman

An ambitious goal for machine learning is to create agents that behave ethically: the capacity to abide by human moral norms would greatly expand the contexts in which autonomous agents could be practically and safely deployed; e.g., fully autonomous vehicles will encounter charged moral decisions that complicate their deployment.

Autonomous Vehicles BIG-bench Machine Learning +4

Synthetic Petri Dish: A Novel Surrogate Model for Rapid Architecture Search

1 code implementation • 27 May 2020 • Aditya Rawal, Joel Lehman, Felipe Petroski Such, Jeff Clune, Kenneth O. Stanley

Neural Architecture Search (NAS) explores a large space of architectural motifs -- a compute-intensive process that often involves ground-truth evaluation of each motif by instantiating it within a large network, and training and evaluating the network with thousands of domain-specific data samples.

Neural Architecture Search

First return, then explore

2 code implementations • 27 Apr 2020 • Adrien Ecoffet, Joost Huizinga, Joel Lehman, Kenneth O. Stanley, Jeff Clune

The promise of reinforcement learning is to solve complex sequential decision problems autonomously by specifying a high-level reward function only.

Montezuma's Revenge reinforcement-learning +2

Enhanced POET: Open-Ended Reinforcement Learning through Unbounded Invention of Learning Challenges and their Solutions

1 code implementation • ICML 2020 • Rui Wang, Joel Lehman, Aditya Rawal, Jiale Zhi, Yulun Li, Jeff Clune, Kenneth O. Stanley

Creating open-ended algorithms, which generate their own never-ending stream of novel and appropriately challenging learning opportunities, could help to automate and accelerate progress in machine learning.

Reinforcement Learning Reinforcement Learning (RL)

Learning to Continually Learn

5 code implementations • 21 Feb 2020 • Shawn Beaulieu, Lapo Frati, Thomas Miconi, Joel Lehman, Kenneth O. Stanley, Jeff Clune, Nick Cheney

Continual lifelong learning requires an agent or model to learn many sequentially ordered tasks, building on previous knowledge without catastrophically forgetting it.

Continual Learning Meta-Learning
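
The paper's ANML approach meta-learns a neuromodulatory network that multiplicatively gates the activations of a prediction network to limit forgetting. A minimal PyTorch-style sketch of that gating idea follows; layer sizes and architecture here are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class GatedNet(nn.Module):
    # ANML-style neuromodulation sketch: a small neuromodulatory net emits a
    # per-feature sigmoid gate that multiplicatively masks the prediction
    # net's hidden activations (sizes are illustrative, not the paper's).
    def __init__(self, in_dim=784, hidden=256, out_dim=10):
        super().__init__()
        self.pred_in = nn.Linear(in_dim, hidden)
        self.pred_out = nn.Linear(hidden, out_dim)
        self.neuromod = nn.Sequential(nn.Linear(in_dim, hidden), nn.Sigmoid())

    def forward(self, x):
        h = torch.relu(self.pred_in(x))
        gate = self.neuromod(x)          # input-conditioned gate in [0, 1]
        return self.pred_out(h * gate)   # gated activations feed the head

net = GatedNet()
print(net(torch.randn(4, 784)).shape)    # torch.Size([4, 10])
```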

Evolvability ES: Scalable and Direct Optimization of Evolvability

1 code implementation • 13 Jul 2019 • Alexander Gajewski, Jeff Clune, Kenneth O. Stanley, Joel Lehman

Designing evolutionary algorithms capable of uncovering highly evolvable representations is an open challenge; such evolvability is important because it accelerates evolution and enables fast adaptation to changing circumstances.

Diversity Evolutionary Algorithms +1
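
A sketch of the underlying objective: treat evolvability as the variance of offspring behaviors and optimize it directly. The toy behavior characterization below and the hill-climbing outer loop are illustrative assumptions; the paper derives a direct ES estimator for this objective rather than hill-climbing it.

```python
import numpy as np

rng = np.random.default_rng(0)

def behavior(theta):
    # Toy behavior characterization (BC) of a "policy" with parameters theta.
    return np.array([np.sin(theta).sum(), np.cos(theta).sum()])

def evolvability(theta, sigma=0.1, n=64):
    # MaxVar-style evolvability: total variance of offspring behaviors under
    # Gaussian parameter perturbation.
    offspring = theta + sigma * rng.standard_normal((n, theta.size))
    bcs = np.array([behavior(o) for o in offspring])
    return bcs.var(axis=0).sum()

theta = np.zeros(8)
for _ in range(200):  # hill climb on evolvability (paper: a direct ES update)
    cand = theta + 0.05 * rng.standard_normal(theta.size)
    if evolvability(cand) > evolvability(theta):
        theta = cand
print(f"evolvability reached: {evolvability(theta):.3f}")
```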

Towards Empathic Deep Q-Learning

1 code implementation • 26 Jun 2019 • Bart Bussmann, Jacqueline Heinerman, Joel Lehman

As reinforcement learning (RL) scales to solve increasingly complex tasks, interest continues to grow in the fields of AI safety and machine ethics.

Ethics Q-Learning +2

Evolutionary Computation and AI Safety: Research Problems Impeding Routine and Safe Real-world Application of Evolution

no code implementations • 24 Jun 2019 • Joel Lehman

Recent developments in artificial intelligence and machine learning have spurred interest in the growing field of AI safety, which studies how to prevent human-harming accidents when deploying AI systems.

BIG-bench Machine Learning

Learning Belief Representations for Imitation Learning in POMDPs

1 code implementation • 22 Jun 2019 • Tanmay Gangwani, Joel Lehman, Qiang Liu, Jian Peng

We consider the problem of imitation learning from expert demonstrations in partially observable Markov decision processes (POMDPs).

Continuous Control Imitation Learning +2

Go-Explore: a New Approach for Hard-Exploration Problems

3 code implementations • 30 Jan 2019 • Adrien Ecoffet, Joost Huizinga, Joel Lehman, Kenneth O. Stanley, Jeff Clune

Go-Explore can also harness human-provided domain knowledge and, when augmented with it, scores a mean of over 650k points on Montezuma's Revenge.

Imitation Learning Montezuma's Revenge +1
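
A minimal sketch of the archive-and-return loop at Go-Explore's core: remember every distinct "cell" visited, repeatedly return to a stored cell, then explore from it. The toy deterministic chain environment and cell definition below are assumptions, and the paper's robustification phase is omitted.

```python
import random

random.seed(0)

def step(state, action):
    # Toy deterministic chain: actions move +-1; the "hard" goal is state 20.
    nxt = state + (1 if action else -1)
    return nxt, (1.0 if nxt == 20 else 0.0)

# Archive maps a cell (here simply the state) to the shortest action
# sequence known to reach it. "First return": restore/replay to the cell;
# "then explore": take a few random actions from there.
archive = {0: []}
for _ in range(2000):
    cell = random.choice(list(archive))
    traj, state = list(archive[cell]), cell   # deterministic restore = return
    for _ in range(5):                        # short exploration burst
        a = random.random() < 0.5
        state, _ = step(state, a)
        traj.append(a)
        if state not in archive or len(traj) < len(archive[state]):
            archive[state] = list(traj)       # keep shortest path per cell
print(f"goal reached: {20 in archive}; archive size: {len(archive)}")
```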

Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions

2 code implementations • 7 Jan 2019 • Rui Wang, Joel Lehman, Jeff Clune, Kenneth O. Stanley

Our results show that POET produces a diverse range of sophisticated behaviors that solve a wide array of environmental challenges, many of which cannot be solved by direct optimization alone, or even by a direct-path curriculum-building control algorithm introduced to highlight the critical role of open-endedness in solving ambitious challenges.
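
A deliberately tiny caricature of the POET loop, under heavy assumptions (1-D "environments" and "agents" with a toy score): optimize each agent in its paired environment, let solved environments spawn mutated children, and seed each new environment by transferring the best existing agent. The real system co-evolves obstacle courses and neural controllers.

```python
import random

random.seed(0)

def score(agent, env):
    # Toy score: an agent "solves" an environment when its value is close
    # to the environment's difficulty parameter.
    return -abs(agent - env)

pairs = [(0.0, 0.0)]  # list of (environment, paired agent)
for _ in range(500):
    nxt = []
    for env, agent in pairs:
        cand = agent + 0.05 * random.gauss(0, 1)   # inner-loop optimization
        if score(cand, env) > score(agent, env):
            agent = cand
        nxt.append((env, agent))
        # Outer loop: a solved environment spawns a harder child, seeded by
        # transferring whichever current agent scores best in it.
        if score(agent, env) > -0.02 and len(nxt) < 8:
            child_env = env + abs(random.gauss(0, 0.3))
            best = max((a for _, a in pairs), key=lambda a: score(a, child_env))
            nxt.append((child_env, best))
    pairs = nxt
print([round(e, 2) for e, _ in pairs])  # a ladder of environment difficulties
```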

An Atari Model Zoo for Analyzing, Visualizing, and Comparing Deep Reinforcement Learning Agents

1 code implementation • 17 Dec 2018 • Felipe Petroski Such, Vashisht Madhavan, Rosanne Liu, Rui Wang, Pablo Samuel Castro, Yulun Li, Jiale Zhi, Ludwig Schubert, Marc G. Bellemare, Jeff Clune, Joel Lehman

We lessen this friction by (1) training several algorithms at scale and releasing trained models, (2) integrating with a previous Deep RL model release, and (3) releasing code that makes it easy for anyone to load, visualize, and analyze such models.

Atari Games Friction +3

An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution

22 code implementations • NeurIPS 2018 • Rosanne Liu, Joel Lehman, Piero Molino, Felipe Petroski Such, Eric Frank, Alex Sergeev, Jason Yosinski

In this paper we show a striking counterexample to this intuition via the seemingly trivial coordinate transform problem, which simply requires learning a mapping between coordinates in (x, y) Cartesian space and one-hot pixel space.

Atari Games Image Classification +1
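
The CoordConv fix itself is small: append normalized row/column coordinate channels to the input before an ordinary convolution, so filters can condition on where they are. A framework-free numpy sketch of just the channel construction follows (the paper's released code is in TensorFlow).

```python
import numpy as np

def add_coord_channels(x):
    # x: (batch, channels, H, W). CoordConv appends two extra channels
    # holding each pixel's normalized row and column coordinate in [-1, 1],
    # letting a subsequent ordinary convolution "see" where it is.
    b, _, h, w = x.shape
    i = np.linspace(-1, 1, h).reshape(1, 1, h, 1).repeat(w, axis=3)
    j = np.linspace(-1, 1, w).reshape(1, 1, 1, w).repeat(h, axis=2)
    coords = np.concatenate([i, j], axis=1)            # (1, 2, H, W)
    coords = np.broadcast_to(coords, (b, 2, h, w))
    return np.concatenate([x, coords], axis=1)         # (batch, C+2, H, W)

x = np.zeros((4, 3, 8, 8))
print(add_coord_channels(x).shape)   # (4, 5, 8, 8)
```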

ES Is More Than Just a Traditional Finite-Difference Approximator

no code implementations • 18 Dec 2017 • Joel Lehman, Jay Chen, Jeff Clune, Kenneth O. Stanley

However, this ES optimizes a different objective than pointwise reward: it optimizes the average reward of the entire population of perturbed parameter vectors, thereby seeking parameters that are robust to perturbation.

reinforcement-learning Reinforcement Learning +1
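
For reference, the vanilla ES estimator in question ascends the gradient of the expected reward under Gaussian parameter perturbations, i.e., a smoothed, population-averaged objective, rather than a finite-difference estimate of the reward gradient at a single point. A minimal numpy sketch with a toy objective and illustrative hyperparameters:

```python
import numpy as np

rng = np.random.default_rng(0)

def reward(theta):
    # Toy reward; in the paper's setting this is an RL episode return.
    return -np.sum(theta ** 2)

theta, sigma, lr, n = rng.standard_normal(10), 0.1, 0.03, 100
for _ in range(300):
    eps = rng.standard_normal((n, theta.size))
    returns = np.array([reward(theta + sigma * e) for e in eps])
    # Score-function estimate of grad_theta E[reward(theta + sigma * eps)]:
    # the update ascends population-average reward, favoring parameters
    # whose reward is robust to perturbation.
    grad = (returns[:, None] * eps).mean(axis=0) / sigma
    theta += lr * grad
print(f"final reward: {reward(theta):.4f}")
```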

Safe Mutations for Deep and Recurrent Neural Networks through Output Gradients

1 code implementation • 18 Dec 2017 • Joel Lehman, Jay Chen, Jeff Clune, Kenneth O. Stanley

While neuroevolution (evolving neural networks) has a successful track record across a variety of domains from reinforcement learning to artificial life, it is rarely applied to large, deep neural networks.

Artificial Life Reinforcement Learning
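
A sketch of the output-gradient idea: estimate how strongly the policy's outputs respond to each weight, and shrink that weight's random perturbation accordingly. Below, sensitivity is estimated by finite differences on a toy linear policy; the paper instead computes it by backpropagating output gradients through a deep network.

```python
import numpy as np

rng = np.random.default_rng(0)

def outputs(w, X):
    # Toy linear "policy": outputs over a fixed batch of archived inputs X.
    return X @ w

def safe_mutation(w, X, sigma=0.1, eps=1e-8):
    # Per-parameter sensitivity: RMS change of outputs per unit weight
    # change, estimated by finite differences here (the paper uses output
    # gradients). Perturbations are divided by this sensitivity, so weights
    # the outputs depend on strongly are mutated more gently.
    base = outputs(w, X)
    sens = np.empty_like(w)
    for j in range(w.size):
        dw = np.zeros_like(w)
        dw[j] = 1e-4
        sens[j] = np.sqrt(np.mean((outputs(w + dw, X) - base) ** 2)) / 1e-4
    return w + sigma * rng.standard_normal(w.size) / (sens + eps)

X = rng.standard_normal((32, 5)) * np.array([10, 1, 1, 1, 0.1])
w = rng.standard_normal(5)
print(safe_mutation(w, X))  # weight 0 (most sensitive) moves the least
```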

Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning

14 code implementations • 18 Dec 2017 • Felipe Petroski Such, Vashisht Madhavan, Edoardo Conti, Joel Lehman, Kenneth O. Stanley, Jeff Clune

Here we demonstrate they can: we evolve the weights of a DNN with a simple, gradient-free, population-based genetic algorithm (GA) and it performs well on hard deep RL problems, including Atari and humanoid locomotion.

Evolutionary Algorithms Q-Learning +2
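
The GA referred to is deliberately simple: truncation selection, Gaussian mutation of every weight, elitism, and no crossover. A toy-scale numpy sketch follows; the objective and sizes are illustrative stand-ins (the paper evolves multi-million-parameter policies with a compact seed-based encoding).

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(theta):
    # Stand-in for an episode return (the paper evaluates Atari / humanoid
    # policies); here just a toy objective over the "network weights".
    return -np.sum((theta - 1.0) ** 2)

pop_size, n_params, sigma, trunc = 64, 100, 0.02, 10
pop = [rng.standard_normal(n_params) for _ in range(pop_size)]
for gen in range(100):
    ranked = sorted(pop, key=fitness, reverse=True)
    elite, parents = ranked[0], ranked[:trunc]        # elitism + truncation
    pop = [elite] + [
        parents[rng.integers(trunc)] + sigma * rng.standard_normal(n_params)
        for _ in range(pop_size - 1)                  # mutation, no crossover
    ]
print(f"best fitness: {fitness(max(pop, key=fitness)):.4f}")
```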

Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents

2 code implementations • NeurIPS 2018 • Edoardo Conti, Vashisht Madhavan, Felipe Petroski Such, Joel Lehman, Kenneth O. Stanley, Jeff Clune

Evolution strategies (ES) are a family of black-box optimization algorithms able to train deep neural networks roughly as well as Q-learning and policy gradient methods on challenging deep reinforcement learning (RL) problems, but are much faster (e.g., hours vs. days) because they parallelize better.

Policy Gradient Methods Q-Learning +3
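
A sketch of the novelty-seeking variant: the same ES-style update, but the per-sample weights come from novelty, i.e., the mean distance of a behavior characterization to its nearest neighbors in an archive, instead of (or blended with) reward. The behavior function and hyperparameters below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def behavior(theta):
    # Toy behavior characterization (the paper uses, e.g., an agent's final
    # position); novelty is computed in this space, not in reward.
    return theta[:2]

def novelty(bc, archive, k=5):
    if not archive:
        return 1.0
    d = np.sort(np.linalg.norm(np.array(archive) - bc, axis=1))
    return float(d[:k].mean())        # mean distance to k nearest neighbors

theta, sigma, lr, n = np.zeros(10), 0.1, 0.05, 50
archive = []
for _ in range(100):
    eps = rng.standard_normal((n, theta.size))
    scores = np.array(
        [novelty(behavior(theta + sigma * e), archive) for e in eps]
    )
    scores = (scores - scores.mean()) / (scores.std() + 1e-8)  # normalize
    theta += lr * (scores[:, None] * eps).mean(axis=0) / sigma
    archive.append(behavior(theta).copy())
print(f"final behavior: {behavior(theta)}")
```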
