Search Results for author: Jiajun Fan

Found 9 papers, 0 papers with code

ConvFormer: Revisiting Transformer for Sequential User Modeling

no code implementations • 5 Aug 2023 • Hao Wang, Jianxun Lian, Mingqi Wu, Haoxuan Li, Jiajun Fan, Wanyue Xu, Chaozhuo Li, Xing Xie

Sequential user modeling, a critical task in personalized recommender systems, focuses on predicting the next item a user would prefer, requiring a deep understanding of user behavior sequences.

Recommendation Systems

Entire Space Counterfactual Learning: Tuning, Analytical Properties and Industrial Applications

no code implementations • 20 Oct 2022 • Hao Wang, Zhichao Chen, Jiajun Fan, Yuxin Huang, Weiming Liu, Xinggao Liu

As a basic research problem for building effective recommender systems, post-click conversion rate (CVR) estimation has long been plagued by sample selection bias and data sparsity issues.

Auxiliary Learning, counterfactual, +2

Generalized Data Distribution Iteration

no code implementations • 7 Jun 2022 • Jiajun Fan, Changnan Xiao

Then, we cast these two problems into the training data distribution optimization problem, namely to obtain desired training data within limited interactions, and address them concurrently via i) explicit modeling and control of the capacity and diversity of behavior policy and ii) more fine-grained and adaptive control of selective/sampling distribution of the behavior policy using a monotonic data distribution optimization.

Atari Games

GDI: Rethinking What Makes Reinforcement Learning Different From Supervised Learning

no code implementations • 11 Jun 2021 • Jiajun Fan, Changnan Xiao, Yue Huang

Deep Q Network (DQN) first opened the door to deep reinforcement learning (DRL) by combining deep learning (DL) with reinforcement learning (RL), and noted that the distribution of the acquired data changes during the training process.

Atari Games, reinforcement-learning, +1

An Entropy Regularization Free Mechanism for Policy-based Reinforcement Learning

no code implementations • 1 Jun 2021 • Changnan Xiao, Haosen Shi, Jiajun Fan, Shihong Deng

We find that value-based reinforcement learning methods with the ε-greedy mechanism enjoy three characteristics (Closed-form Diversity, Objective-invariant Exploration, and Adaptive Trade-off) that help value-based methods avoid the policy collapse problem.

Atari Games, reinforcement-learning, +1

CASA: Bridging the Gap between Policy Improvement and Policy Evaluation with Conflict Averse Policy Iteration

no code implementations • 9 May 2021 • Changnan Xiao, Haosen Shi, Jiajun Fan, Shihong Deng, Haiyan Yin

We study the problem of model-free reinforcement learning, which is often solved following the principle of Generalized Policy Iteration (GPI).

Atari Games
