Search Results for author: Akifumi Wachi

Found 20 papers, 5 papers with code

A Provable Approach for End-to-End Safe Reinforcement Learning

no code implementations • 28 May 2025 • Akifumi Wachi, Kohei Miyaguchi, Takumi Tanabe, Rei Sato, Youhei Akimoto

We propose a method, called Provably Lifetime Safe RL (PLS), that integrates offline safe RL with safe policy deployment to address this challenge.

Gaussian Processes Reinforcement Learning (RL) +1

Offline Guarded Safe Reinforcement Learning for Medical Treatment Optimization Strategies

no code implementations • 22 May 2025 • Runze Yan, Xun Shen, Akifumi Wachi, Sebastien Gros, Anni Zhao, Xiao Hu

When applying offline reinforcement learning (RL) in healthcare scenarios, the out-of-distribution (OOD) issues pose significant risks, as inappropriate generalization beyond clinical expertise can result in potentially harmful recommendations.

Offline RL Q-Learning +4

Target Return Optimizer for Multi-Game Decision Transformer

no code implementations • 4 Mar 2025 • Kensuke Tatematsu, Akifumi Wachi

Achieving autonomous agents with robust generalization capabilities across diverse games and tasks remains one of the ultimate goals in AI research.

Atari Games

Flipping-based Policy for Chance-Constrained Markov Decision Processes

no code implementations • 9 Oct 2024 • Xun Shen, Shuo Jiang, Akifumi Wachi, Kazumune Hashimoto, Sebastien Gros

We demonstrate that the flipping-based policy can improve the performance of the existing safe RL algorithms under the same limits of safety constraints on Safety Gym benchmarks.

Reinforcement Learning (RL) Safe Reinforcement Learning

Stepwise Alignment for Constrained Language Model Policy Optimization

1 code implementation • 17 Apr 2024 • Akifumi Wachi, Thien Q. Tran, Rei Sato, Takumi Tanabe, Youhei Akimoto

This paper formulates human value alignment as an optimization problem of the language model policy to maximize reward under a safety constraint, and then proposes an algorithm, Stepwise Alignment for Constrained Policy Optimization (SACPO).

Computational Efficiency Language Modeling +2

Long-term Safe Reinforcement Learning with Binary Feedback

no code implementations • 8 Jan 2024 • Akifumi Wachi, Wataru Hashimoto, Kazumune Hashimoto

Our theoretical results show that LoBiSaRL guarantees the long-term safety constraint, with high probability.

reinforcement-learning Reinforcement Learning +2

Verbosity Bias in Preference Labeling by Large Language Models

no code implementations • 16 Oct 2023 • Keita Saito, Akifumi Wachi, Koki Wataoka, Youhei Akimoto

In recent years, Large Language Models (LLMs) have witnessed a remarkable surge in prevalence, altering the landscape of natural language processing and machine learning.

reinforcement-learning Reinforcement Learning

Bayesian Meta-Learning on Control Barrier Functions with Data from On-Board Sensors

no code implementations • 10 Aug 2023 • Wataru Hashimoto, Kazumune Hashimoto, Akifumi Wachi, Xun Shen, Masako Kishida, Shigemasa Takai

The proposed scheme realizes efficient online synthesis of the controller as shown in the simulation study and provides probabilistic safety guarantees on the resulting controller.

Meta-Learning Navigate

LOA: Logical Optimal Actions for Text-based Interaction Games

1 code implementation • ACL 2021 • Daiki Kimura, Subhajit Chaudhury, Masaki Ono, Michiaki Tatsubori, Don Joven Agravante, Asim Munawar, Akifumi Wachi, Ryosuke Kohita, Alexander Gray

We present Logical Optimal Actions (LOA), an action decision architecture of reinforcement learning applications with a neuro-symbolic framework which is a combination of neural network and symbolic knowledge acquisition approach for natural language interaction games.

reinforcement-learning Reinforcement Learning +2

Reinforcement Learning with External Knowledge by using Logical Neural Networks

no code implementations • 3 Mar 2021 • Daiki Kimura, Subhajit Chaudhury, Akifumi Wachi, Ryosuke Kohita, Asim Munawar, Michiaki Tatsubori, Alexander Gray

Specifically, we propose an integrated method that enables model-free reinforcement learning from external knowledge sources in an LNNs-based logical constrained framework such as action shielding and guide.

Deep Reinforcement Learning reinforcement-learning +1

Polar Embedding

no code implementations • CoNLL (EMNLP) 2021 • Ran Iwamoto, Ryosuke Kohita, Akifumi Wachi

Particularly, the latest approaches such as hyperbolic embeddings showed significant performance by representing essential meanings in a hierarchy (generality and similarity of objects) with spatial properties (distance from the origin and difference of angles).

Link Prediction

Safe Reinforcement Learning in Constrained Markov Decision Processes

1 code implementation • ICML 2020 • Akifumi Wachi, Yanan Sui

Safe reinforcement learning has been a promising approach for optimizing the policy of an agent that operates in safety-critical applications.

reinforcement-learning Reinforcement Learning +2

Failure-Scenario Maker for Rule-Based Agent using Multi-agent Adversarial Reinforcement Learning and its Application to Autonomous Driving

no code implementations • 26 Mar 2019 • Akifumi Wachi

We propose a method for efficiently finding failure scenarios; this method trains the adversarial agents using multi-agent reinforcement learning such that the tested rule-based agent fails.

Autonomous Driving Multi-agent Reinforcement Learning +3

Safe Exploration in Markov Decision Processes with Time-Variant Safety using Spatio-Temporal Gaussian Process

no code implementations • 12 Sep 2018 • Akifumi Wachi, Hiroshi Kajino, Asim Munawar

This paper presents a learning algorithm called ST-SafeMDP for exploring Markov decision processes (MDPs) that is based on the assumption that the safety features are a priori unknown and time-variant.

Reinforcement Learning Robot Navigation +1
