Search Results for author: Anssi Kanervisto

Found 29 papers, 19 papers with code

Diffusion for World Modeling: Visual Details Matter in Atari

1 code implementation 20 May 2024 Eloi Alonso, Adam Jelley, Vincent Micheli, Anssi Kanervisto, Amos Storkey, Tim Pearce, François Fleuret

Motivated by this paradigm shift, we introduce DIAMOND (DIffusion As a Model Of eNvironment Dreams), a reinforcement learning agent trained in a diffusion world model.

Image Generation, reinforcement-learning

Toward Human-AI Alignment in Large-Scale Multi-Player Games

no code implementations 5 Feb 2024 Sugandha Sharma, Guy Davidson, Khimya Khetarpal, Anssi Kanervisto, Udit Arora, Katja Hofmann, Ida Momennejad

First, we analyze extensive human gameplay data from Xbox's Bleeding Edge (100K+ games), uncovering behavioral patterns in a complex task space.

BEDD: The MineRL BASALT Evaluation and Demonstrations Dataset for Training and Benchmarking Agents that Solve Fuzzy Tasks

1 code implementation NeurIPS 2023 Stephanie Milani, Anssi Kanervisto, Karolis Ramanauskas, Sander Schulhoff, Brandon Houghton, Rohin Shah

Given the completion of two years of BASALT competitions, we offer to the community a formalized benchmark through the BASALT Evaluation and Demonstrations Dataset (BEDD), which serves as a resource for algorithm development and performance assessment.


Visual Encoders for Data-Efficient Imitation Learning in Modern Video Games

no code implementations 4 Dec 2023 Lukas Schäfer, Logan Jones, Anssi Kanervisto, Yuhan Cao, Tabish Rashid, Raluca Georgescu, Dave Bignell, Siddhartha Sen, Andrea Treviño Gavito, Sam Devlin

Video games have served as useful benchmarks for the decision making community, but going beyond Atari games towards training agents in modern games has been prohibitively expensive for the vast majority of the research community.

Atari Games, Imitation Learning

Imitating Human Behaviour with Diffusion Models

1 code implementation 25 Jan 2023 Tim Pearce, Tabish Rashid, Anssi Kanervisto, Dave Bignell, Mingfei Sun, Raluca Georgescu, Sergio Valcarcel Macua, Shan Zheng Tan, Ida Momennejad, Katja Hofmann, Sam Devlin

This paper studies their application as observation-to-action models for imitating human behaviour in sequential environments.

A2C is a special case of PPO

1 code implementation 18 May 2022 Shengyi Huang, Anssi Kanervisto, Antonin Raffin, Weixun Wang, Santiago Ontañón, Rousslan Fernand Julien Dossa

Advantage Actor-critic (A2C) and Proximal Policy Optimization (PPO) are popular deep reinforcement learning algorithms used for game AI in recent years.

reinforcement-learning, Reinforcement Learning (RL)

GAN-Aimbots: Using Machine Learning for Cheating in First Person Shooters

1 code implementation 14 May 2022 Anssi Kanervisto, Tomi Kinnunen, Ville Hautamäki

Playing games with cheaters is not fun, and in a multi-billion-dollar video game industry with hundreds of millions of players, game developers aim to improve the security and, consequently, the user experience of their games by preventing cheating.

BIG-bench Machine Learning

MineRL Diamond 2021 Competition: Overview, Results, and Lessons Learned

no code implementations 17 Feb 2022 Anssi Kanervisto, Stephanie Milani, Karolis Ramanauskas, Nicholay Topin, Zichuan Lin, Junyou Li, Jianing Shi, Deheng Ye, Qiang Fu, Wei Yang, Weijun Hong, Zhongyue Huang, Haicheng Chen, Guangjun Zeng, Yue Lin, Vincent Micheli, Eloi Alonso, François Fleuret, Alexander Nikulin, Yury Belousov, Oleg Svidchenko, Aleksei Shpilman

With this in mind, we hosted the third edition of the MineRL ObtainDiamond competition, MineRL Diamond 2021, with a separate track in which we permitted any solution to promote the participation of newcomers.

Optimizing Tandem Speaker Verification and Anti-Spoofing Systems

no code implementations 24 Jan 2022 Anssi Kanervisto, Ville Hautamäki, Tomi Kinnunen, Junichi Yamagishi

As automatic speaker verification (ASV) systems are vulnerable to spoofing attacks, they are typically used in conjunction with spoofing countermeasure (CM) systems to improve security.

Speaker Verification

The MineRL BASALT Competition on Learning from Human Feedback

no code implementations 5 Jul 2021 Rohin Shah, Cody Wild, Steven H. Wang, Neel Alex, Brandon Houghton, William Guss, Sharada Mohanty, Anssi Kanervisto, Stephanie Milani, Nicholay Topin, Pieter Abbeel, Stuart Russell, Anca Dragan

Rather than training AI systems using a predefined reward function or using a labeled dataset with a predefined set of categories, we instead train the AI system using a learning signal derived from some form of human feedback, which can evolve over time as the understanding of the task changes, or as the capabilities of the AI system improve.

Imitation Learning

Distilling Reinforcement Learning Tricks for Video Games

1 code implementation 1 Jul 2021 Anssi Kanervisto, Christian Scheller, Yanick Schraner, Ville Hautamäki

Reinforcement learning (RL) research focuses on general solutions that can be applied across different domains.

Q-Learning, reinforcement-learning +1

Multi-task Learning with Attention for End-to-end Autonomous Driving

no code implementations 21 Apr 2021 Keishi Ishihara, Anssi Kanervisto, Jun Miura, Ville Hautamäki

This improves not only the success rate on standard benchmarks but also the ability to react to traffic lights, which we show with standard benchmarks.

Autonomous Driving, Imitation Learning +1

Back to Square One: Superhuman Performance in Chutes and Ladders Through Deep Neural Networks and Tree Search

1 code implementation 1 Apr 2021 Dylan Ashley, Anssi Kanervisto, Brendan Bennett

We present AlphaChute: a state-of-the-art algorithm that achieves superhuman performance in the ancient game of Chutes and Ladders.

General Characterization of Agents by States they Visit

1 code implementation 2 Dec 2020 Anssi Kanervisto, Tomi Kinnunen, Ville Hautamäki

Behavioural characterizations (BCs) of decision-making agents, or their policies, are used to study outcomes of training algorithms and as part of the algorithms themselves to encourage unique policies, match expert policy or restrict changes to policy per update.

Imitation Learning

Playing Minecraft with Behavioural Cloning

1 code implementation 7 May 2020 Anssi Kanervisto, Janne Karttunen, Ville Hautamäki

The MineRL 2019 competition challenged participants to train sample-efficient agents to play Minecraft, using a dataset of human gameplay and a limited number of environment steps.

Behavioural cloning

Action Space Shaping in Deep Reinforcement Learning

1 code implementation 2 Apr 2020 Anssi Kanervisto, Christian Scheller, Ville Hautamäki

In this work, we aim to gain insight on these action space modifications by conducting extensive experiments in video-game environments.

reinforcement-learning, Reinforcement Learning (RL)

Benchmarking End-to-End Behavioural Cloning on Video Games

1 code implementation 2 Apr 2020 Anssi Kanervisto, Joonas Pussinen, Ville Hautamäki

We take a step towards a general approach and study the general applicability of behavioural cloning on twelve video games, including six modern video games (published after 2010), by using human demonstrations as training data.

Behavioural cloning, Benchmarking

Towards Debugging Deep Neural Networks by Generating Speech Utterances

1 code implementation 6 Jul 2019 Bilal Soomro, Anssi Kanervisto, Trung Ngo Trong, Ville Hautamäki

One such debugging method used with image classification DNNs is activation maximization, which generates example-images that are classified as one of the classes.

General Classification, Image Classification

Do Autonomous Agents Benefit from Hearing?

no code implementations 10 May 2019 Abraham Woubie, Anssi Kanervisto, Janne Karttunen, Ville Hautamäki

In this work, we propose the use of audio as complementary information to visual only in state representation.

reinforcement-learning, Reinforcement Learning (RL)

From Video Game to Real Robot: The Transfer between Action Spaces

1 code implementation 2 May 2019 Janne Karttunen, Anssi Kanervisto, Ville Kyrki, Ville Hautamäki

Deep reinforcement learning has proven successful for learning tasks in simulated environments, but applying the same techniques to robots in the real world is more challenging, as they require hours of training.

Transfer Learning

Who Do I Sound Like? Showcasing Speaker Recognition Technology by YouTube Voice Search

1 code implementation 8 Nov 2018 Ville Vestman, Bilal Soomro, Anssi Kanervisto, Ville Hautamäki, Tomi Kinnunen

The popularization of science is often disregarded by scientists, as it may be challenging to put highly sophisticated research into words that the general public can understand.

Audio and Speech Processing, Sound

ToriLLE: Learning Environment for Hand-to-Hand Combat

1 code implementation 26 Jul 2018 Anssi Kanervisto, Ville Hautamäki

We present Toribash Learning Environment (ToriLLE), a learning environment for machine learning agents based on the video game Toribash.

BIG-bench Machine Learning

Image-to-Markup Generation with Coarse-to-Fine Attention

14 code implementations ICML 2017 Yuntian Deng, Anssi Kanervisto, Jeffrey Ling, Alexander M. Rush

We present a neural encoder-decoder model to convert images into presentational markup based on a scalable coarse-to-fine attention mechanism.

Decoder, Optical Character Recognition (OCR)
