Search Results for author: Changnan Xiao

Found 13 papers, 4 papers with code

A Theory for Length Generalization in Learning to Reason

no code implementations • 31 Mar 2024 • Changnan Xiao, Bing Liu

Length generalization (LG) is a challenging problem in learning to reason.

Conditions for Length Generalization in Learning Reasoning Skills

no code implementations • 22 Nov 2023 • Changnan Xiao, Bing Liu

However, numerous evaluations of the reasoning capabilities of LLMs have also shown some limitations.

Open-World Continual Learning: Unifying Novelty Detection and Continual Learning

no code implementations • 20 Apr 2023 • Gyuhak Kim, Changnan Xiao, Tatsuya Konishi, Zixuan Ke, Bing Liu

The key theoretical result is that, regardless of whether within-task prediction (WP) and out-of-distribution (OOD) detection (or task prediction, TP) are defined explicitly or implicitly by a CIL algorithm, good WP and good OOD detection are necessary and sufficient conditions for good CIL, which unifies novelty/OOD detection and continual learning (CIL in particular).
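The decomposition behind this result can be sketched numerically: class-incremental probabilities factor into a within-task prediction term times a task-prediction (OOD-detection) term. The function and example values below are illustrative, not taken from the paper.

```python
import numpy as np

def cil_probabilities(wp_probs, tp_probs):
    """Combine within-task prediction (WP) and task prediction (TP)
    into class-incremental (CIL) class probabilities:
    P(class c of task t | x) = P(task t | x) * P(class c | task t, x).

    wp_probs: list of arrays, wp_probs[t][c] = P(class c | task t, x)
    tp_probs: array, tp_probs[t] = P(task t | x), e.g. from an OOD detector
    """
    return np.concatenate(
        [tp_probs[t] * wp_probs[t] for t in range(len(wp_probs))]
    )

# Two tasks with two classes each: WP is confident within each task,
# and TP (OOD-based task detection) favors task 0.
wp = [np.array([0.9, 0.1]), np.array([0.2, 0.8])]
tp = np.array([0.7, 0.3])
probs = cil_probabilities(wp, tp)  # one score per class seen so far
```

If both WP and TP are accurate, the product concentrates on the correct class; if either degrades, so does the combined CIL prediction, which is the intuition behind the necessity-and-sufficiency claim.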

Class Incremental Learning, Incremental Learning, +2

Mastering Strategy Card Game (Hearthstone) with Improved Techniques

no code implementations • 9 Mar 2023 • Changnan Xiao, Yongxin Zhang, Xuefeng Huang, Qinhan Huang, Jie Chen, Peng Sun

Strategy card games are a well-known genre that demands intelligent game-play and can serve as an ideal test-bench for AI.

Decision Making

Mastering Strategy Card Game (Legends of Code and Magic) via End-to-End Policy and Optimistic Smooth Fictitious Play

no code implementations • 7 Mar 2023 • Wei Xi, Yongxin Zhang, Changnan Xiao, Xuefeng Huang, Shihong Deng, Haowei Liang, Jie Chen, Peng Sun

Deep Reinforcement Learning combined with Fictitious Play shows impressive results on many benchmark games, most of which are, however, single-stage.

Decision Making

Generalized Data Distribution Iteration

no code implementations • 7 Jun 2022 • Jiajun Fan, Changnan Xiao

Then, we cast these two problems as a single training data distribution optimization problem, namely obtaining the desired training data within limited interactions, and address them concurrently via i) explicit modeling and control of the capacity and diversity of the behavior policy and ii) finer-grained, adaptive control of the selective/sampling distribution of the behavior policy using monotonic data distribution optimization.
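One way to read "capacity and diversity of the behavior policy" plus a "selective/sampling distribution" is as a parameterized family of behavior policies together with a distribution over which member to act with. The sketch below is an illustration under that reading, not GDI's actual algorithm; the function names, temperatures, and weights are all hypothetical.

```python
import numpy as np

def behavior_policy_family(q_values, temperatures):
    """A capacity/diversity-controlled family of behavior policies:
    one softmax policy per temperature (low T ~ exploitative,
    high T ~ exploratory)."""
    policies = []
    for T in temperatures:
        z = q_values / T
        z -= z.max()                 # numerical stability
        p = np.exp(z)
        policies.append(p / p.sum())
    return np.array(policies)

def sample_action(policies, selection_weights, rng):
    """Selective/sampling distribution over the family: first pick a
    member policy, then sample an action from it."""
    k = rng.choice(len(policies), p=selection_weights)
    return rng.choice(policies.shape[1], p=policies[k])

rng = np.random.default_rng(0)
fam = behavior_policy_family(np.array([1.0, 2.0, 0.5]), [0.1, 1.0, 10.0])
a = sample_action(fam, np.array([0.5, 0.3, 0.2]), rng)
```

Adapting `selection_weights` over training, rather than fixing them, is the kind of fine-grained control the abstract alludes to.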

Atari Games

Continual Learning Based on OOD Detection and Task Masking

1 code implementation • 17 Mar 2022 • Gyuhak Kim, Sepideh Esmaeilpour, Changnan Xiao, Bing Liu

Existing continual learning techniques focus on either task incremental learning (TIL) or class incremental learning (CIL) problem, but not both.

Class Incremental Learning, Incremental Learning, +1

GDI: Rethinking What Makes Reinforcement Learning Different From Supervised Learning

no code implementations • 11 Jun 2021 • Jiajun Fan, Changnan Xiao, Yue Huang

Deep Q-Network (DQN) first opened the door to deep reinforcement learning (DRL) by combining deep learning (DL) with reinforcement learning (RL), and it already observed that the distribution of the acquired data changes during training.

Atari Games, reinforcement-learning, +1

An Entropy Regularization Free Mechanism for Policy-based Reinforcement Learning

no code implementations • 1 Jun 2021 • Changnan Xiao, Haosen Shi, Jiajun Fan, Shihong Deng

We find that value-based reinforcement learning methods with an ε-greedy mechanism enjoy three characteristics, Closed-form Diversity, Objective-invariant Exploration, and Adaptive Trade-off, which help value-based methods avoid the policy collapse problem.
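The ε-greedy mechanism the abstract refers to is standard and has a closed-form action distribution, which is what makes its diversity (entropy) known without adding an entropy bonus to the objective. A minimal sketch (the helper itself is illustrative):

```python
import numpy as np

def epsilon_greedy(q_values, epsilon, rng):
    """ε-greedy policy: act greedily w.p. 1-ε, uniformly otherwise.
    The resulting action distribution has a closed form, so its
    entropy/diversity is known exactly from ε alone."""
    n = len(q_values)
    probs = np.full(n, epsilon / n)              # exploration mass, uniform
    probs[np.argmax(q_values)] += 1.0 - epsilon  # greedy mass
    return rng.choice(n, p=probs), probs

rng = np.random.default_rng(0)
action, probs = epsilon_greedy(np.array([1.0, 3.0, 2.0]), epsilon=0.1, rng=rng)
```

Note how exploration here does not change the Q-learning objective itself (it only perturbs action selection), which is presumably what "Objective-invariant Exploration" points at.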

Atari Games, reinforcement-learning, +1

CASA: Bridging the Gap between Policy Improvement and Policy Evaluation with Conflict Averse Policy Iteration

no code implementations • 9 May 2021 • Changnan Xiao, Haosen Shi, Jiajun Fan, Shihong Deng, Haiyan Yin

We study the problem of model-free reinforcement learning, which is often solved following the principle of Generalized Policy Iteration (GPI).
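Generalized Policy Iteration (GPI) is the textbook principle of interleaving policy evaluation with greedy policy improvement. A tabular sketch (the variant below takes one evaluation step per improvement, i.e. value iteration, which is one instance of GPI; the toy MDP is illustrative):

```python
import numpy as np

def generalized_policy_iteration(P, R, gamma=0.9, sweeps=50):
    """Tabular GPI sketch: alternate (partial) policy evaluation
    with greedy policy improvement.
    P[a, s, s']: transition probabilities; R[a, s]: expected rewards."""
    n_actions, n_states, _ = P.shape
    V = np.zeros(n_states)
    for _ in range(sweeps):
        Q = R + gamma * P @ V           # one-step lookahead: Q[a, s]
        pi = Q.argmax(axis=0)           # improvement: greedy policy
        V = Q[pi, np.arange(n_states)]  # evaluation step toward V^pi
    return pi, V

# Toy 2-state MDP: action 1 moves to state 1, where reward 1 repeats.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],   # action 0: stay put
              [[0.0, 1.0], [0.0, 1.0]]])  # action 1: go to state 1
R = np.array([[0.0, 1.0],                 # reward of action 0 per state
              [1.0, 1.0]])                # reward of action 1 per state
pi, V = generalized_policy_iteration(P, R)
```

The CASA paper's concern, a conflict between the evaluation and improvement steps, arises precisely because these two interleaved updates pull the shared estimates in different directions when approximated.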

Atari Games

Hierarchical Meta Reinforcement Learning for Multi-Task Environments

1 code implementation • 1 Jan 2021 • Dongyang Zhao, Yue Huang, Changnan Xiao, Yue Li, Shihong Deng

To address the problem posed by the environment, we propose a Meta Soft Hierarchical reinforcement learning framework (MeSH), in which each low-level sub-policy focuses on a specific sub-task and a high-level policy automatically learns to utilize the low-level sub-policies through meta-gradients.
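A common way to realize "a high-level policy utilizing low-level sub-policies" is a soft mixture: the high-level policy gates the sub-policies' action distributions. The sketch below illustrates that general pattern under assumed shapes; it is not MeSH's actual architecture, and the function name and values are hypothetical.

```python
import numpy as np

def hierarchical_action_probs(sub_policy_probs, high_level_logits):
    """Soft mixture of low-level sub-policies under a high-level gate:
    pi(a|s) = sum_k w_k(s) * pi_k(a|s), with w = softmax(logits).
    Shapes: sub_policy_probs[k, a], high_level_logits[k]."""
    z = high_level_logits - high_level_logits.max()  # numerical stability
    w = np.exp(z)
    w /= w.sum()
    return w @ sub_policy_probs  # weighted average over sub-policies

# Two sub-policies over two actions; the gate favors sub-policy 0.
subs = np.array([[0.8, 0.2],
                 [0.1, 0.9]])
mixed = hierarchical_action_probs(subs, np.array([2.0, 0.0]))
```

Because the mixture weights are differentiable, the gate can be trained end-to-end, e.g. with the meta-gradients the abstract mentions.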

Hierarchical Reinforcement Learning, Meta Reinforcement Learning, +2
