no code implementations • 22 Feb 2023 • Honglin Shu, Pei Gao, Lingwei Zhu, Zheng Chen
In this paper, we propose a novel framework for rapid clinical intervention that views health records as graphs whose nodes are mapped from medical events and whose edges encode correspondences between events within a given time window.
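A minimal sketch of this kind of construction, assuming hypothetical event records with timestamps (the event names, fields, and window size below are illustrative, not the paper's actual schema):

```python
from itertools import combinations

import networkx as nx

# Hypothetical medical events as (event_id, timestamp in hours).
events = [("lab:glucose", 0.0), ("drug:insulin", 1.5), ("dx:sepsis", 30.0)]

WINDOW = 24.0  # illustrative time window (hours)

G = nx.Graph()
G.add_nodes_from(event_id for event_id, _ in events)

# Add an edge whenever two events co-occur within the time window.
for (e1, t1), (e2, t2) in combinations(events, 2):
    if abs(t1 - t2) <= WINDOW:
        G.add_edge(e1, e2)
```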
no code implementations • 27 Jan 2023 • Lingwei Zhu, Zheng Chen, Takamitsu Matsubara, Martha White
Many policy optimization approaches in reinforcement learning incorporate a Kullback-Leibler (KL) divergence with respect to the previous policy, to prevent the policy from changing too quickly.
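A common instantiation of this idea (notation assumed here, not taken from the paper) regularizes the greedy step toward the previous policy $\pi_k$:

$$\pi_{k+1} = \arg\max_{\pi} \; \mathbb{E}_{a \sim \pi(\cdot\mid s)}\big[ Q^{\pi_k}(s, a) \big] \;-\; \tau\, D_{\mathrm{KL}}\big( \pi(\cdot \mid s) \,\big\|\, \pi_k(\cdot \mid s) \big),$$

where the temperature $\tau > 0$ controls how far the new policy may move from the old one.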
no code implementations • 29 Jul 2022 • Yuki Kadokawa, Lingwei Zhu, Yoshihisa Tsurumine, Takamitsu Matsubara
Deep reinforcement learning with domain randomization learns a control policy across many simulations with randomized physical and sensor model parameters, so that the policy transfers to the real world in a zero-shot setting.
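A minimal sketch of the domain-randomization loop, with made-up parameter ranges and placeholder simulator/training functions (`make_sim` and `train_step` are hypothetical names, not an existing API):

```python
import random

def make_sim(mass, friction, sensor_noise):
    """Placeholder: build a simulator with these physics/sensor parameters."""

def train_step(policy, sim):
    """Placeholder: run one RL update of the policy inside the simulator."""

policy = object()  # stands in for any deep RL policy

for episode in range(10_000):
    # Resample physical and sensor model parameters every episode, so the
    # policy cannot overfit to any single simulated dynamics.
    sim = make_sim(
        mass=random.uniform(0.8, 1.2),
        friction=random.uniform(0.5, 1.5),
        sensor_noise=random.uniform(0.0, 0.05),
    )
    train_step(policy, sim)
```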
no code implementations • 20 Jul 2022 • Zheng Chen, Ziwei Yang, Lingwei Zhu, Guang Shi, Kun Yue, Takashi Matsubara, Shigehiko Kanaya, MD Altaf-Ul-Amin
As such, existing methods often impose unrealistic assumptions to extract useful features from the data while avoiding overfitting to spurious correlations.
1 code implementation • 22 Jun 2022 • Zheng Chen, Lingwei Zhu, Ziwei Yang, Takashi Matsubara
Cancer subtyping is crucial for understanding the nature of tumors and providing suitable therapy.
no code implementations • 16 May 2022 • Lingwei Zhu, Zheng Chen, Eiji Uchibe, Takamitsu Matsubara
The recently successful Munchausen Reinforcement Learning (M-RL) features implicit Kullback-Leibler (KL) regularization by augmenting the reward function with the logarithm of the current stochastic policy.
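Concretely, in Munchausen DQN the regression target adds a scaled log-policy bonus to the reward (written here with the usual notation from the original M-RL paper):

$$\hat{q} = r_t + \alpha \tau \ln \pi(a_t \mid s_t) + \gamma \sum_{a'} \pi(a' \mid s_{t+1}) \big[ q(s_{t+1}, a') - \tau \ln \pi(a' \mid s_{t+1}) \big],$$

where $\pi$ is the softmax policy at temperature $\tau$ and $\alpha \in [0, 1]$ scales the Munchausen term; this $\alpha \tau \ln \pi$ bonus is what implicitly yields the KL regularization.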
no code implementations • 16 May 2022 • Lingwei Zhu, Zheng Chen, Eiji Uchibe, Takamitsu Matsubara
The maximum Tsallis entropy (MTE) framework in reinforcement learning has recently gained popularity by virtue of its flexible modeling choices, which include the widely used Shannon entropy and the sparse entropy.
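The Tsallis entropy of a policy, with entropic index $q$, is

$$S_q\big(\pi(\cdot \mid s)\big) = \frac{1}{q - 1} \left( 1 - \sum_{a} \pi(a \mid s)^{q} \right),$$

which recovers the Shannon entropy in the limit $q \to 1$ and, at $q = 2$, the entropy underlying sparse-policy (Tsallis) regularization.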
no code implementations • 21 Apr 2022 • Zheng Chen, Lingwei Zhu, Ziwei Yang, Renyuan Zhang
A spiking neural network (SNN) based tier is designed to distill the principal information, in the form of spike streams, from the raw features, preserving the temporal structure inherent in EEGs.
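One standard way to turn real-valued features into spike streams is Poisson-style rate coding; the sketch below is a generic illustration, not the paper's actual encoder:

```python
import numpy as np

def rate_code(features, n_steps=100, seed=0):
    """Encode features in [0, 1] as Bernoulli spike trains over n_steps."""
    rng = np.random.default_rng(seed)
    features = np.clip(features, 0.0, 1.0)
    # At each time step a feature spikes with probability equal to its
    # value, so firing rates preserve relative magnitudes over time.
    return (rng.random((n_steps, features.shape[0])) < features).astype(np.uint8)

spikes = rate_code(np.array([0.1, 0.8, 0.5]))  # shape (100, 3)
```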
no code implementations • 7 Apr 2022 • Zheng Chen, Ziwei Yang, Lingwei Zhu, Wei Chen, Toshiyo Tamura, Naoaki Ono, MD Altaf-Ul-Amin, Shigehiko Kanaya, Ming Huang
This paper proposes a novel framework for automatically capturing the time-frequency nature of electroencephalogram (EEG) signals of human sleep, based on authoritative sleep medicine guidelines.
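A minimal sketch of one such time-frequency representation, a short-time Fourier transform over a 30-second sleep epoch (the sampling rate and window length here are illustrative, not the paper's settings):

```python
import numpy as np
from scipy.signal import stft

fs = 100  # sampling rate in Hz (illustrative)
eeg = np.random.randn(30 * fs)  # one 30-second epoch; synthetic stand-in

# STFT: how the frequency content of the epoch evolves over time.
freqs, times, Z = stft(eeg, fs=fs, nperseg=2 * fs)
power = np.abs(Z) ** 2  # power per (frequency, time) bin
```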
no code implementations • 2 Apr 2022 • Lingwei Zhu, Koki Odani, Ziwei Yang, Guang Shi, Yirong Kan, Zheng Chen, Renyuan Zhang
Recently, promising results have been reported on automatic sleep stage scoring by extracting spatio-temporal features from electroencephalogram (EEG) signals.
no code implementations • 2 Apr 2022 • Ziwei Yang, Lingwei Zhu, Zheng Chen, Ming Huang, Naoaki Ono, MD Altaf-Ul-Amin, Shigehiko Kanaya
In this paper, we propose to investigate automatic subtyping from an unsupervised learning perspective by directly constructing the underlying data distribution itself, so that sufficient data can be generated to alleviate overfitting.
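A generic stand-in for this idea, using a Gaussian mixture as the density model (the paper's actual generative model differs):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))  # synthetic stand-in for omics features

# Fit a density model to the observed data, then sample from it to
# augment the training set and reduce overfitting on small cohorts.
gmm = GaussianMixture(n_components=5, random_state=0).fit(X)
X_aug, _ = gmm.sample(1000)
```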
no code implementations • 16 Jul 2021 • Toshinori Kitamura, Lingwei Zhu, Takamitsu Matsubara
The recent boom in the entropy-regularized reinforcement learning (RL) literature reveals that Kullback-Leibler (KL) regularization brings advantages to RL algorithms by canceling out errors under mild assumptions.
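Schematically (constants suppressed, following the error-propagation analyses this line of work builds on), KL regularization makes per-iteration approximation errors $\epsilon_j$ average rather than accumulate:

$$\big\| Q^{*} - Q^{\pi_k} \big\|_{\infty} \;\le\; \frac{2}{1-\gamma} \left\| \frac{1}{k} \sum_{j=1}^{k} \epsilon_j \right\|_{\infty} + O\!\left(\frac{1}{k}\right),$$

so zero-mean errors cancel, whereas the unregularized bound instead involves a discounted sum of the individual error norms $\|\epsilon_j\|_{\infty}$.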
no code implementations • 13 Jul 2021 • Lingwei Zhu, Toshinori Kitamura, Takamitsu Matsubara
In this paper, we propose cautious policy programming (CPP), a novel value-based reinforcement learning (RL) algorithm that can ensure monotonic policy improvement during learning.
no code implementations • 12 Jul 2021 • Lingwei Zhu, Toshinori Kitamura, Takamitsu Matsubara
The oscillating performance of off-policy learning and persistent errors in the actor-critic (AC) setting call for algorithms that can learn conservatively, so as to better suit stability-critical applications.
no code implementations • 25 Aug 2020 • Lingwei Zhu, Takamitsu Matsubara
We propose a novel reinforcement learning algorithm that exploits this lower bound as a criterion for adjusting the degree of a policy update, thereby alleviating policy oscillation.
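A representative instance of this scheme (notation assumed): interpolate between the greedy policy and the current one,

$$\pi_{k+1} = \zeta\, \pi_{\text{greedy}} + (1 - \zeta)\, \pi_k, \qquad \zeta \in [0, 1],$$

and choose $\zeta$ to maximize a lower bound on the improvement $J(\pi_{k+1}) - J(\pi_k)$, so that each update is only as aggressive as the bound can guarantee.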