1 code implementation • 30 Oct 2024 • Michael Matthews, Michael Beukman, Chris Lu, Jakob Foerster
While large models trained with self-supervised learning on offline datasets have shown remarkable capabilities in text and image domains, achieving the same generalisation for agents that act in sequential decision problems remains an open challenge.
General Reinforcement Learning
Reinforcement Learning (RL)
+1
1 code implementation • 1 Sep 2024 • Chris Lu, Michael Beukman, Michael Matthews, Jakob Foerster
Towards this, we present JaxLife: an artificial life simulator in which embodied agents, parameterized by deep neural networks, must learn to survive in an expressive world containing programmable systems.
1 code implementation • 27 Aug 2024 • Alexander Rutherford, Michael Beukman, Timon Willi, Bruno Lacerda, Nick Hawes, Jakob Foerster
What data or environments to use for training to improve downstream performance is a longstanding and very topical question in reinforcement learning.
1 code implementation • 3 Jul 2024 • Michael Beukman, Branden Ingram, Geraud Nangue Tasse, Benjamin Rosman, Pravesh Ranchod
Our environment enables the creation of high-dimensional continuous control tasks within a robotics football simulation.
1 code implementation • 19 Mar 2024 • Samuel Coward, Michael Beukman, Jakob Foerster
We present JaxUED, an open-source library providing minimal dependency implementations of modern Unsupervised Environment Design (UED) algorithms in Jax.
1 code implementation • 26 Feb 2024 • Michael Matthews, Michael Beukman, Benjamin Ellis, Mikayel Samvelyan, Matthew Jackson, Samuel Coward, Jakob Foerster
Either they are too slow for meaningful research to be performed without enormous computational resources, like Crafter, NetHack and Minecraft, or they are not complex enough to pose a significant challenge, like Minigrid and Procgen.
1 code implementation • 19 Feb 2024 • Michael Beukman, Samuel Coward, Michael Matthews, Mattie Fellows, Minqi Jiang, Michael Dennis, Jakob Foerster
In this work, we introduce Bayesian level-perfect MMR (BLP), a refinement of the minimax regret objective that overcomes this limitation.
2 code implementations • NeurIPS 2023 • Michael Beukman, Devon Jarvis, Richard Klein, Steven James, Benjamin Rosman
To this end, we introduce a neural network architecture, the Decision Adapter, which generates the weights of an adapter module and conditions the behaviour of an agent on the context information.
1 code implementation • 11 Sep 2023 • Michael Beukman, Manuel Fokam
Transfer learning has led to large gains in performance for nearly all NLP tasks while making downstream models easier and faster to train.
1 code implementation • 3 Feb 2023 • Michael Beukman, Manuel Fokam, Marcel Kruger, Guy Axelrod, Muhammad Nasir, Branden Ingram, Benjamin Rosman, Steven James
Procedural content generation (PCG) is a growing field, with numerous applications in the video game industry and great potential to help create better games at a fraction of the cost of manual creation.
1 code implementation • 22 Oct 2022 • David Ifeoluwa Adelani, Graham Neubig, Sebastian Ruder, Shruti Rijhwani, Michael Beukman, Chester Palen-Michel, Constantine Lignos, Jesujoba O. Alabi, Shamsuddeen H. Muhammad, Peter Nabende, Cheikh M. Bamba Dione, Andiswa Bukula, Rooweither Mabuya, Bonaventure F. P. Dossou, Blessing Sibanda, Happy Buzaaba, Jonathan Mukiibi, Godson Kalipe, Derguene Mbaye, Amelia Taylor, Fatoumata Kabore, Chris Chinenye Emezue, Anuoluwapo Aremu, Perez Ogayo, Catherine Gitau, Edwin Munkoh-Buabeng, Victoire M. Koagne, Allahsera Auguste Tapo, Tebogo Macucwa, Vukosi Marivate, Elvis Mboning, Tajuddeen Gwadabe, Tosin Adewumi, Orevaoghene Ahia, Joyce Nakatumba-Nabende, Neo L. Mokono, Ignatius Ezeani, Chiamaka Chukwuneke, Mofetoluwa Adeyemi, Gilles Q. Hacheme, Idris Abdulmumin, Odunayo Ogundepo, Oreen Yousuf, Tatiana Moteu Ngoli, Dietrich Klakow
African languages are spoken by over a billion people, but are underrepresented in NLP research and development.
2 code implementations • 20 Oct 2022 • Muhammad Umair Nasir, Michael Beukman, Steven James, Christopher Wesley Cleghorn
In this work, we tackle the problem of open-ended learning by introducing a method that simultaneously evolves agents and increasingly challenging environments.
no code implementations • 19 Oct 2022 • Idris Abdulmumin, Michael Beukman, Jesujoba O. Alabi, Chris Emezue, Everlyn Asiko, Tosin Adewumi, Shamsuddeen Hassan Muhammad, Mofetoluwa Adeyemi, Oreen Yousuf, Sahib Singh, Tajuddeen Rabiu Gwadabe
We participated in the WMT 2022 Large-Scale Machine Translation Evaluation for the African Languages Shared Task.
no code implementations • 9 Aug 2022 • Manuel Fokam, Michael Beukman
Data availability and quality are major challenges in natural language processing for low-resourced languages.
1 code implementation • NAACL 2022 • David Ifeoluwa Adelani, Jesujoba Oluwadara Alabi, Angela Fan, Julia Kreutzer, Xiaoyu Shen, Machel Reid, Dana Ruiter, Dietrich Klakow, Peter Nabende, Ernie Chang, Tajuddeen Gwadabe, Freshia Sackey, Bonaventure F. P. Dossou, Chris Chinenye Emezue, Colin Leong, Michael Beukman, Shamsuddeen Hassan Muhammad, Guyo Dub Jarso, Oreen Yousuf, Andre Niyongabo Rubungo, Gilles Hacheme, Eric Peter Wairagala, Muhammad Umair Nasir, Benjamin Ayoade Ajibade, Tunde Oluwaseyi Ajayi, Yvonne Wambui Gitau, Jade Abbott, Mohamed Ahmed, Millicent Ochieng, Anuoluwapo Aremu, Perez Ogayo, Jonathan Mukiibi, Fatoumata Ouoba Kabore, Godson Koffi Kalipe, Derguene Mbaye, Allahsera Auguste Tapo, Victoire Memdjokam Koagne, Edwin Munkoh-Buabeng, Valencia Wagner, Idris Abdulmumin, Ayodele Awokoya, Happy Buzaaba, Blessing Sibanda, Andiswa Bukula, Sam Manthalu
We focus on two questions: 1) How can pre-trained models be used for languages not included in the initial pre-training?
1 code implementation • 22 Apr 2022 • Michael Beukman, Michael Mitchley, Dean Wookey, Steven James, George Konidaris
We further demonstrate that a fixed wavelet basis set performs comparably against the high-performing Fourier basis on Mountain Car and Acrobot, and that the adaptive methods provide a convenient approach to addressing an oversized initial basis set, while demonstrating performance comparable to, or greater than, the fixed wavelet basis.
1 code implementation • 14 Apr 2022 • Michael Beukman, Christopher W Cleghorn, Steven James
Procedurally generated video game content has the potential to drastically reduce the content creation budget of game developers and large studios.
1 code implementation • 25 Jan 2022 • Michael Beukman, Steven James, Christopher Cleghorn
With increasing interest in procedural content generation by academia and game developers alike, it is vital that different approaches can be compared fairly.