Search Results for author: Wataru Kumagai

Found 14 papers, 3 papers with code

A Policy Gradient Primal-Dual Algorithm for Constrained MDPs with Uniform PAC Guarantees

1 code implementation • 31 Jan 2024 • Toshinori Kitamura, Tadashi Kozuno, Masahiro Kato, Yuki Ichihara, Soichiro Nishimori, Akiyoshi Sannai, Sho Sonoda, Wataru Kumagai, Yutaka Matsuo

We study a primal-dual reinforcement learning (RL) algorithm for the online constrained Markov decision process (CMDP) problem, in which the agent seeks an optimal policy that maximizes return while satisfying constraints.

Reinforcement Learning (RL)
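
As an illustration of the constrained setting described in this entry, the sketch below shows a generic Lagrangian primal-dual policy-gradient update: ascend in the policy parameters, descend (with projection) in the multiplier. This is a minimal schematic under standard assumptions, not the paper's Uniform-PAC algorithm; all names and step sizes are illustrative.

```python
def primal_dual_step(theta, lam, grad_return, grad_cost,
                     cost_estimate, budget, eta_theta=0.1, eta_lam=0.1):
    """One generic primal-dual update for: max J_r(theta) s.t. J_c(theta) <= budget.

    Works on the Lagrangian L(theta, lam) = J_r(theta) - lam * (J_c(theta) - budget).
    Schematic only; not the Uniform-PAC algorithm of the paper.
    """
    # Primal step: policy-gradient ascent on the Lagrangian.
    theta = theta + eta_theta * (grad_return - lam * grad_cost)
    # Dual step: move the multiplier toward penalizing constraint violation,
    # projected back onto [0, inf).
    lam = max(0.0, lam + eta_lam * (cost_estimate - budget))
    return theta, lam
```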

Towards Autonomous Hypothesis Verification via Language Models with Minimal Guidance

no code implementations • 16 Nov 2023 • Shiro Takagi, Ryutaro Yamauchi, Wataru Kumagai

Research automation efforts usually employ AI as a tool to automate specific tasks within the research process.

LPML: LLM-Prompting Markup Language for Mathematical Reasoning

no code implementations • 21 Sep 2023 • Ryutaro Yamauchi, Sho Sonoda, Akiyoshi Sannai, Wataru Kumagai

In this paper, we propose a novel framework that integrates the Chain-of-Thought (CoT) method with an external tool (Python REPL).

Mathematical Reasoning
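
The sketch below illustrates the kind of control loop such a framework implies: extract code wrapped in markup tags from the model's output, run it in a Python REPL, and feed the result back. The `<PYTHON>` tag name and the loop itself are assumptions for illustration, not the exact LPML specification.

```python
import re, io, contextlib

def run_python_blocks(llm_output: str) -> str:
    """Execute code found in <PYTHON>...</PYTHON> tags and return captured stdout.

    Illustrative of a CoT + external Python REPL pipeline; the tag schema of
    LPML itself may differ.
    """
    outputs = []
    for code in re.findall(r"<PYTHON>(.*?)</PYTHON>", llm_output, re.DOTALL):
        buf = io.StringIO()
        with contextlib.redirect_stdout(buf):
            exec(code, {})  # toy sandbox; real use needs proper isolation
        outputs.append(buf.getvalue())
    return "\n".join(outputs)
```

For example, `run_python_blocks("<PYTHON>print(2**10)</PYTHON>")` returns `"1024\n"`, which would then be appended to the prompt for the next reasoning step.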

Langevin Autoencoders for Learning Deep Latent Variable Models

1 code implementation • 15 Sep 2022 • Shohei Taniguchi, Yusuke Iwasawa, Wataru Kumagai, Yutaka Matsuo

Based on amortized Langevin dynamics (ALD), we also present a new deep latent variable model named the Langevin autoencoder (LAE).

Image Generation · valid +1
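
For orientation, the sketch below is one unadjusted Langevin step on the latent variables, the basic ingredient behind Langevin-based posterior sampling. It is a generic update under standard assumptions, not the amortized scheme proposed in the paper, and `energy_fn` is an illustrative name.

```python
import torch

def langevin_step(z, energy_fn, step_size=1e-3):
    """One unadjusted Langevin update on latents z.

    energy_fn(z) should return U(z) = -log p(x, z) up to a constant, so the
    update drifts z toward the posterior p(z | x) while injecting noise.
    Generic sketch only, not the paper's amortized Langevin dynamics.
    """
    z = z.detach().requires_grad_(True)
    (grad,) = torch.autograd.grad(energy_fn(z).sum(), z)
    noise = torch.randn_like(z)
    return (z - 0.5 * step_size * grad + step_size ** 0.5 * noise).detach()
```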

Equivariant and Invariant Reynolds Networks

no code implementations • 15 Oct 2021 • Akiyoshi Sannai, Makoto Kawano, Wataru Kumagai

We construct learning models based on the reductive Reynolds operator, called equivariant and invariant Reynolds networks (ReyNets), and prove that they have the universal approximation property.
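
For reference, the Reynolds operator over a finite group G is the group average, and applying it to any function produces a G-invariant function. This is the standard finite-group statement, with notation that may differ from the paper's:

```latex
\mathcal{R}(f)(x) \;=\; \frac{1}{|G|} \sum_{g \in G} f(g \cdot x),
\qquad
\mathcal{R}(f)(h \cdot x) \;=\; \mathcal{R}(f)(x) \quad \text{for all } h \in G.
```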

Reynolds Equivariant and Invariant Networks

no code implementations • 29 Sep 2021 • Akiyoshi Sannai, Makoto Kawano, Wataru Kumagai

To overcome this difficulty, we consider representing the Reynolds operator as a sum over a subset instead of a sum over the whole group.
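
A minimal sketch of that computational idea, assuming the group elements are given as a list of callables: replace the full group average with an average over a small subset. Random subsampling here is purely illustrative; the paper constructs a specific subset for which the restricted sum still yields exact invariance.

```python
import random

def reynolds_full(f, x, group):
    """Exact Reynolds average: sum over every group element (intractable for large groups)."""
    return sum(f(g(x)) for g in group) / len(group)

def reynolds_subset(f, x, group, k=32, seed=0):
    """Approximate the Reynolds average with a sum over a subset of the group.

    Illustrative only: the subset is sampled at random here, whereas the paper
    chooses a structured subset with provable guarantees.
    """
    rng = random.Random(seed)
    subset = rng.sample(list(group), min(k, len(group)))
    return sum(f(g(x)) for g in subset) / len(subset)
```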

Group Equivariant Conditional Neural Processes

no code implementations • ICLR 2021 • Makoto Kawano, Wataru Kumagai, Akiyoshi Sannai, Yusuke Iwasawa, Yutaka Matsuo

We present the group equivariant conditional neural process (EquivCNP), a meta-learning method that, like conventional conditional neural processes (CNPs), is permutation invariant over the data set, and that is additionally equivariant to transformations of the data space.

Meta-Learning · Translation +1
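
The permutation invariance shared with CNPs can be illustrated with a mean-pooled set encoder, as in the sketch below. This shows only that invariance property and is not the group-equivariant EquivCNP architecture; module sizes are arbitrary.

```python
import torch
import torch.nn as nn

class SetEncoder(nn.Module):
    """Mean-pooled encoder: its output does not change if the (x, y) context
    pairs are reordered. Illustrates CNP-style permutation invariance only."""

    def __init__(self, in_dim=2, hidden=64):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden))

    def forward(self, xy):                 # xy: (batch, num_points, in_dim)
        return self.phi(xy).mean(dim=1)    # averaging over points removes order
```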

Bayesian Neural Networks with Variance Propagation for Uncertainty Evaluation

no code implementations • 1 Jan 2021 • Yuki Mae, Wataru Kumagai, Takafumi Kanamori

We report the computational efficiency and statistical reliability of our method in numerical experiments on language modeling with RNNs and on out-of-distribution detection with DNNs.

Bayesian Inference · Computational Efficiency +2

Universal Approximation Theorem for Equivariant Maps by Group CNNs

no code implementations • 27 Dec 2020 • Wataru Kumagai, Akiyoshi Sannai

However, universal approximation theorems for CNNs have so far been derived separately, with individual techniques tailored to each group and setting.
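
For context, the group convolution these CNNs are built from, together with the equivariance property the approximation theorems concern, can be written in the finite-group case as follows (notation may differ from the paper's):

```latex
(f \ast \psi)(g) \;=\; \sum_{h \in G} f(h)\, \psi(h^{-1} g),
\qquad
(L_u f) \ast \psi \;=\; L_u (f \ast \psi),
\quad \text{where } (L_u f)(g) := f(u^{-1} g).
```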

Regret Analysis for Continuous Dueling Bandit

no code implementations • NeurIPS 2017 • Wataru Kumagai

The dueling bandit is a learning framework wherein the feedback information in the learning process is restricted to a noisy comparison between a pair of actions.
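
This feedback model can be made concrete as a comparison oracle, sketched below: the learner never sees the actions' utilities, only which action won a noisy duel. The logistic link on a latent utility is an illustrative choice, not the paper's specific setting.

```python
import math, random

def duel(a, b, utility, noise=1.0, rng=None):
    """Noisy pairwise comparison: True iff action a beats action b.

    Only the binary outcome is observed, never utility(a) or utility(b).
    The logistic link below is an illustrative preference model.
    """
    rng = rng or random.Random(0)
    p_a_wins = 1.0 / (1.0 + math.exp(-(utility(a) - utility(b)) / noise))
    return rng.random() < p_a_wins
```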

Learning Bound for Parameter Transfer Learning

no code implementations • NeurIPS 2016 • Wataru Kumagai

We consider a transfer learning problem under the parameter transfer approach, in which a suitable parameter of a feature mapping is learned on one task and then applied to another target task.

Transfer Learning
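
A minimal sketch of that setting, assuming a PyTorch-style pipeline: the feature-map parameters come from the source task and are frozen, and only a simple predictor on top is fit to the target task. All module shapes and the random stand-in data are illustrative.

```python
import torch
import torch.nn as nn

# Feature map whose parameters were learned on the source task ...
feature_map = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))
# ... and are now frozen (transferred) for the target task.
for p in feature_map.parameters():
    p.requires_grad_(False)

# On the target task, only a lightweight predictor on top is learned.
head = nn.Linear(8, 1)
optimizer = torch.optim.SGD(head.parameters(), lr=1e-2)

x, y = torch.randn(64, 16), torch.randn(64, 1)   # stand-in target-task data
optimizer.zero_grad()
loss = nn.functional.mse_loss(head(feature_map(x)), y)
loss.backward()
optimizer.step()
```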

Parallel Distributed Block Coordinate Descent Methods based on Pairwise Comparison Oracle

no code implementations • 13 Sep 2014 • Kota Matsui, Wataru Kumagai, Takafumi Kanamori

Our algorithm consists of two steps: a direction-estimation step and a search step.
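
That two-step structure can be sketched as below, assuming only a pairwise comparison oracle over function values is available: compare f at x ± δe_i to estimate a descent direction along a coordinate, then grow the step while the oracle keeps approving. This is a schematic of the idea, not the paper's exact parallel distributed method.

```python
def compare(f, x, y):
    """Pairwise comparison oracle: reveals only whether f(x) <= f(y)."""
    return f(x) <= f(y)

def coordinate_step(f, x, i, delta=1e-2, max_doublings=10):
    """One illustrative coordinate update driven purely by comparisons.

    Direction-estimation step: compare f at x +/- delta * e_i to pick a sign.
    Search step: double the step length while the comparison oracle approves.
    Schematic only, not the paper's exact method.
    """
    x_plus, x_minus = list(x), list(x)
    x_plus[i] += delta
    x_minus[i] -= delta
    sign = 1.0 if compare(f, x_plus, x_minus) else -1.0

    best, step = list(x), delta
    for _ in range(max_doublings):
        cand = list(best)
        cand[i] += sign * step
        if compare(f, cand, best):   # candidate is no worse: accept, try a bigger step
            best, step = cand, 2 * step
        else:
            break
    return best
```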
