Search Results for author: Minqi Jiang

Found 31 papers, 23 papers with code

Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts

no code implementations • 26 Feb 2024 • Mikayel Samvelyan, Sharath Chandra Raparthy, Andrei Lupu, Eric Hambro, Aram H. Markosyan, Manish Bhatt, Yuning Mao, Minqi Jiang, Jack Parker-Holder, Jakob Foerster, Tim Rocktäschel, Roberta Raileanu

As large language models (LLMs) become increasingly prevalent across many real-world applications, understanding and enhancing their robustness to user inputs is of paramount importance.

Question Answering

Refining Minimax Regret for Unsupervised Environment Design

1 code implementation • 19 Feb 2024 • Michael Beukman, Samuel Coward, Michael Matthews, Mattie Fellows, Minqi Jiang, Michael Dennis, Jakob Foerster

In this work, we introduce Bayesian level-perfect MMR (BLP), a refinement of the minimax regret objective that overcomes this limitation.

Multi-Agent Diagnostics for Robustness via Illuminated Diversity

no code implementations • 24 Jan 2024 • Mikayel Samvelyan, Davide Paglieri, Minqi Jiang, Jack Parker-Holder, Tim Rocktäschel

In the rapidly advancing field of multi-agent systems, ensuring robustness in unfamiliar and adversarial settings is crucial.

Decision Making Multi-agent Reinforcement Learning

Learning to Act without Actions

1 code implementation • 17 Dec 2023 • Dominik Schmidt, Minqi Jiang

LAPO takes a first step towards pre-training powerful, generalist policies and world models on the vast amounts of videos readily available on the web.

Reinforcement Learning (RL)

The Generalization Gap in Offline Reinforcement Learning

1 code implementation • 10 Dec 2023 • Ishita Mediratta, Qingfei You, Minqi Jiang, Roberta Raileanu

Our experiments reveal that existing offline learning algorithms struggle to match the performance of online RL on both train and test environments.

Offline RL reinforcement-learning +1

Learning Curricula in Open-Ended Worlds

1 code implementation • 3 Dec 2023 • Minqi Jiang

Deep reinforcement learning (RL) provides powerful methods for training optimal sequential decision-making agents.

Decision Making Reinforcement Learning (RL)

minimax: Efficient Baselines for Autocurricula in JAX

1 code implementation • 21 Nov 2023 • Minqi Jiang, Michael Dennis, Edward Grefenstette, Tim Rocktäschel

This compute requirement is a major obstacle to rapid innovation for the field.

Decision Making

JaxMARL: Multi-Agent RL Environments in JAX

2 code implementations • 16 Nov 2023 • Alexander Rutherford, Benjamin Ellis, Matteo Gallici, Jonathan Cook, Andrei Lupu, Gardar Ingvarsson, Timon Willi, Akbir Khan, Christian Schroeder de Witt, Alexandra Souly, Saptarashmi Bandyopadhyay, Mikayel Samvelyan, Minqi Jiang, Robert Tjarko Lange, Shimon Whiteson, Bruno Lacerda, Nick Hawes, Tim Rocktäschel, Chris Lu, Jakob Nicolaus Foerster

This not only enables GPU acceleration, but also provides a more flexible MARL environment, unlocking the potential for self-play, meta-learning, and other future applications in MARL.

Meta-Learning Multi-agent Reinforcement Learning +3

Discovering General Reinforcement Learning Algorithms with Adversarial Environment Design

1 code implementation • NeurIPS 2023 • Matthew Thomas Jackson, Minqi Jiang, Jack Parker-Holder, Risto Vuorio, Chris Lu, Gregory Farquhar, Shimon Whiteson, Jakob Nicolaus Foerster

Recently, it has been shown that it is possible to meta-learn update rules, with the hope of discovering algorithms that can perform well on a wide range of RL tasks.

General Reinforcement Learning reinforcement-learning +1

Stabilizing Unsupervised Environment Design with a Learned Adversary

1 code implementation • 21 Aug 2023 • Ishita Mediratta, Minqi Jiang, Jack Parker-Holder, Michael Dennis, Eugene Vinitsky, Tim Rocktäschel

As a result, we make it possible for PAIRED to match or exceed state-of-the-art methods, producing robust agents in several established challenging procedurally-generated environments, including a partially-observed maze navigation task and a continuous-control car racing environment.

Car Racing Reinforcement Learning (RL)

Anomaly Detection with Score Distribution Discrimination

1 code implementation • 26 Jun 2023 • Minqi Jiang, Songqiao Han, Hailiang Huang

In this paper, we propose to optimize the anomaly scoring function from the view of score distribution, thus better retaining the diversity and more fine-grained information of input data, especially when the unlabeled data contains anomaly noises in more practical AD scenarios.

Anomaly Detection

Reward-Free Curricula for Training Robust World Models

1 code implementation • 15 Jun 2023 • Marc Rigter, Minqi Jiang, Ingmar Posner

We consider robustness in terms of minimax regret over all environment instantiations and show that the minimax regret can be connected to minimising the maximum error in the world model across environment instances.

A Study of Global and Episodic Bonuses for Exploration in Contextual MDPs

2 code implementations • 5 Jun 2023 • Mikael Henaff, Minqi Jiang, Roberta Raileanu

This results in an algorithm which sets a new state of the art across 16 tasks from the MiniHack suite used in prior work, and also performs robustly on Habitat and Montezuma's Revenge.

Montezuma's Revenge

MAESTRO: Open-Ended Environment Design for Multi-Agent Reinforcement Learning

no code implementations • 6 Mar 2023 • Mikayel Samvelyan, Akbir Khan, Michael Dennis, Minqi Jiang, Jack Parker-Holder, Jakob Foerster, Roberta Raileanu, Tim Rocktäschel

Open-ended learning methods that automatically generate a curriculum of increasingly challenging tasks serve as a promising avenue toward generally capable reinforcement learning agents.

Continuous Control Multi-agent Reinforcement Learning +2

Weakly Supervised Anomaly Detection: A Survey

2 code implementations • 9 Feb 2023 • Minqi Jiang, Chaochuan Hou, Ao Zheng, Xiyang Hu, Songqiao Han, Hailiang Huang, Xiangnan He, Philip S. Yu, Yue Zhao

Anomaly detection (AD) is a crucial task in machine learning with various applications, such as detecting emerging diseases, identifying financial frauds, and detecting fake news.

Supervised Anomaly Detection Time Series +2

How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection

1 code implementation • 18 Jan 2023 • Biyang Guo, Xin Zhang, Ziyuan Wang, Minqi Jiang, Jinran Nie, Yuxuan Ding, Jianwei Yue, Yupeng Wu

We call the collected dataset the Human ChatGPT Comparison Corpus (HC3).

General Intelligence Requires Rethinking Exploration

no code implementations • 15 Nov 2022 • Minqi Jiang, Tim Rocktäschel, Edward Grefenstette

We are at the cusp of a transition from "learning from data" to "learning what data to learn from" as a central focus of artificial intelligence (AI) research.

reinforcement-learning Reinforcement Learning (RL)

Exploration via Elliptical Episodic Bonuses

3 code implementations • 11 Oct 2022 • Mikael Henaff, Roberta Raileanu, Minqi Jiang, Tim Rocktäschel

In recent years, a number of reinforcement learning (RL) methods have been proposed to explore complex environments which differ across episodes.

Reinforcement Learning (RL)

GriddlyJS: A Web IDE for Reinforcement Learning

no code implementations • 13 Jul 2022 • Christopher Bamford, Minqi Jiang, Mikayel Samvelyan, Tim Rocktäschel

Progress in reinforcement learning (RL) research is often driven by the design of new, challenging environments -- a costly undertaking requiring skills orthogonal to that of a typical machine learning researcher.

Offline RL reinforcement-learning +1

Grounding Aleatoric Uncertainty for Unsupervised Environment Design

1 code implementation • 11 Jul 2022 • Minqi Jiang, Michael Dennis, Jack Parker-Holder, Andrei Lupu, Heinrich Küttler, Edward Grefenstette, Tim Rocktäschel, Jakob Foerster

Problematically, in partially-observable or stochastic settings, optimal policies may depend on the ground-truth distribution over aleatoric parameters of the environment in the intended deployment setting, while curriculum learning necessarily shifts the training distribution.

Reinforcement Learning (RL)

Insights From the NeurIPS 2021 NetHack Challenge

1 code implementation • 22 Mar 2022 • Eric Hambro, Sharada Mohanty, Dmitrii Babaev, Minwoo Byeon, Dipam Chakraborty, Edward Grefenstette, Minqi Jiang, DaeJin Jo, Anssi Kanervisto, Jongmin Kim, Sungwoong Kim, Robert Kirk, Vitaly Kurin, Heinrich Küttler, Taehwon Kwon, Donghoon Lee, Vegard Mella, Nantas Nardelli, Ivan Nazarov, Nikita Ovsov, Jack Parker-Holder, Roberta Raileanu, Karolis Ramanauskas, Tim Rocktäschel, Danielle Rothermel, Mikayel Samvelyan, Dmitry Sorokin, Maciej Sypetkowski, Michał Sypetkowski

In this report, we summarize the takeaways from the first NeurIPS 2021 NetHack Challenge.

NetHack Reinforcement Learning (RL)

Evolving Curricula with Regret-Based Environment Design

3 code implementations • 2 Mar 2022 • Jack Parker-Holder, Minqi Jiang, Michael Dennis, Mikayel Samvelyan, Jakob Foerster, Edward Grefenstette, Tim Rocktäschel

Our approach, which we call Adversarially Compounding Complexity by Editing Levels (ACCEL), seeks to constantly produce levels at the frontier of an agent's capabilities, resulting in curricula that start simple but become increasingly complex.

Reinforcement Learning (RL)

Improving Intrinsic Exploration with Language Abstractions

1 code implementation • 17 Feb 2022 • Jesse Mu, Victor Zhong, Roberta Raileanu, Minqi Jiang, Noah Goodman, Tim Rocktäschel, Edward Grefenstette

Reinforcement learning (RL) agents are particularly hard to train when rewards are sparse.

reinforcement-learning Reinforcement Learning (RL)

Replay-Guided Adversarial Environment Design

4 code implementations • NeurIPS 2021 • Minqi Jiang, Michael Dennis, Jack Parker-Holder, Jakob Foerster, Edward Grefenstette, Tim Rocktäschel

Furthermore, our theory suggests a highly counterintuitive improvement to PLR: by stopping the agent from updating its policy on uncurated levels (training on less data), we can improve the convergence to Nash equilibria.

Reinforcement Learning (RL)

That Escalated Quickly: Compounding Complexity by Editing Levels at the Frontier of Agent Capabilities

no code implementations • 29 Sep 2021 • Jack Parker-Holder, Minqi Jiang, Michael D Dennis, Mikayel Samvelyan, Jakob Nicolaus Foerster, Edward Grefenstette, Tim Rocktäschel

Deep Reinforcement Learning (RL) has recently produced impressive results in a series of settings such as games and robotics.

Reinforcement Learning (RL)

MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research

1 code implementation • 27 Sep 2021 • Mikayel Samvelyan, Robert Kirk, Vitaly Kurin, Jack Parker-Holder, Minqi Jiang, Eric Hambro, Fabio Petroni, Heinrich Küttler, Edward Grefenstette, Tim Rocktäschel

By leveraging the full set of entities and environment dynamics from NetHack, one of the richest grid-based video games, MiniHack allows designing custom RL testbeds that are fast and convenient to use.

NetHack reinforcement-learning +2

Return Dispersion as an Estimator of Learning Potential for Prioritized Level Replay

no code implementations • NeurIPS Workshop ICBINB 2021 • Iryna Korshunova, Minqi Jiang, Jack Parker-Holder, Tim Rocktäschel, Edward Grefenstette

Prioritized Level Replay (PLR) has been shown to induce adaptive curricula that improve the sample-efficiency and generalization of reinforcement learning policies in environments featuring multiple tasks or levels.

reinforcement-learning Reinforcement Learning (RL)

Resolving Causal Confusion in Reinforcement Learning via Robust Exploration

no code implementations • ICLR Workshop SSL-RL 2021 • Clare Lyle, Amy Zhang, Minqi Jiang, Joelle Pineau, Yarin Gal

To address this, we present a robust exploration strategy which enables causal hypothesis-testing by interaction with the environment.

reinforcement-learning Reinforcement Learning (RL)

Grid-to-Graph: Flexible Spatial Relational Inductive Biases for Reinforcement Learning

1 code implementation • 8 Feb 2021 • Zhengyao Jiang, Pasquale Minervini, Minqi Jiang, Tim Rocktäschel

In this work, we show that we can incorporate relational inductive biases, encoded in the form of relational graphs, into agents.

reinforcement-learning Reinforcement Learning (RL)

Prioritized Level Replay

4 code implementations • 8 Oct 2020 • Minqi Jiang, Edward Grefenstette, Tim Rocktäschel

Environments with procedurally generated content serve as important benchmarks for testing systematic generalization in deep reinforcement learning.

Systematic Generalization

WordCraft: An Environment for Benchmarking Commonsense Agents

1 code implementation • ICML Workshop LaReL 2020 • Minqi Jiang, Jelena Luketina, Nantas Nardelli, Pasquale Minervini, Philip H. S. Torr, Shimon Whiteson, Tim Rocktäschel

This is partly due to the lack of lightweight simulation environments that sufficiently reflect the semantics of the real world and provide knowledge sources grounded with respect to observations in an RL environment.

Benchmarking Knowledge Graphs +2
