Search Results for author: Akifumi Wachi

Found 15 papers, 4 papers with code

Stepwise Alignment for Constrained Language Model Policy Optimization

no code implementations 17 Apr 2024 Akifumi Wachi, Thien Q Tran, Rei Sato, Takumi Tanabe, Youhei Akimoto

This paper formulates human value alignment as a language model policy optimization problem that maximizes reward under a safety constraint, and then proposes an algorithm called Stepwise Alignment for Constrained Policy Optimization (SACPO).

Computational Efficiency Language Modelling

Verbosity Bias in Preference Labeling by Large Language Models

no code implementations 16 Oct 2023 Keita Saito, Akifumi Wachi, Koki Wataoka, Youhei Akimoto

In recent years, Large Language Models (LLMs) have witnessed a remarkable surge in prevalence, altering the landscape of natural language processing and machine learning.

reinforcement-learning

Bayesian Meta-Learning on Control Barrier Functions with Data from On-Board Sensors

no code implementations 10 Aug 2023 Wataru Hashimoto, Kazumune Hashimoto, Akifumi Wachi, Xun Shen, Masako Kishida, Shigemasa Takai

The proposed scheme realizes efficient online synthesis of the controller as shown in the simulation study and provides probabilistic safety guarantees on the resulting controller.

Meta-Learning Navigate

LOA: Logical Optimal Actions for Text-based Interaction Games

1 code implementation ACL 2021 Daiki Kimura, Subhajit Chaudhury, Masaki Ono, Michiaki Tatsubori, Don Joven Agravante, Asim Munawar, Akifumi Wachi, Ryosuke Kohita, Alexander Gray

We present Logical Optimal Actions (LOA), an action-decision architecture for reinforcement learning applications, built on a neuro-symbolic framework that combines neural networks with a symbolic knowledge acquisition approach for natural language interaction games.

reinforcement-learning Reinforcement Learning (RL) +1

Reinforcement Learning with External Knowledge by using Logical Neural Networks

no code implementations 3 Mar 2021 Daiki Kimura, Subhajit Chaudhury, Akifumi Wachi, Ryosuke Kohita, Asim Munawar, Michiaki Tatsubori, Alexander Gray

Specifically, we propose an integrated method that enables model-free reinforcement learning from external knowledge sources in an LNN-based, logically constrained framework, using mechanisms such as action shielding and guidance.

reinforcement-learning Reinforcement Learning (RL)

Polar Embedding

no code implementations CoNLL (EMNLP) 2021 Ran Iwamoto, Ryosuke Kohita, Akifumi Wachi

In particular, recent approaches such as hyperbolic embeddings have shown strong performance by representing essential meanings in a hierarchy (generality and similarity of objects) with spatial properties (distance from the origin and difference of angles).

Link Prediction

Safe Reinforcement Learning in Constrained Markov Decision Processes

1 code implementation ICML 2020 Akifumi Wachi, Yanan Sui

Safe reinforcement learning has been a promising approach for optimizing the policy of an agent that operates in safety-critical applications.

reinforcement-learning Reinforcement Learning (RL) +1

Failure-Scenario Maker for Rule-Based Agent using Multi-agent Adversarial Reinforcement Learning and its Application to Autonomous Driving

no code implementations 26 Mar 2019 Akifumi Wachi

We propose a method for efficiently finding failure scenarios; this method trains the adversarial agents using multi-agent reinforcement learning such that the tested rule-based agent fails.

Autonomous Driving Multi-agent Reinforcement Learning +2

Safe Exploration in Markov Decision Processes with Time-Variant Safety using Spatio-Temporal Gaussian Process

no code implementations 12 Sep 2018 Akifumi Wachi, Hiroshi Kajino, Asim Munawar

This paper presents a learning algorithm called ST-SafeMDP for exploring Markov decision processes (MDPs) that is based on the assumption that the safety features are a priori unknown and time-variant.

Robot Navigation Safe Exploration
