Search Results for author: Lixin Zou

Found 17 papers, 5 papers with code

Evolutionary Reinforcement Learning: A Systematic Review and Future Directions

no code implementations20 Feb 2024 Yuanguo Lin, Fan Lin, Guorong Cai, Hong Chen, Lixin Zou, Pengcheng Wu

In response to the limitations of reinforcement learning and evolutionary algorithms (EAs) in complex problem-solving, Evolutionary Reinforcement Learning (EvoRL) has emerged as a synergistic solution.

Adversarial Robustness Evolutionary Algorithms +2

Unconfounded Propensity Estimation for Unbiased Ranking

no code implementations17 May 2023 Dan Luo, Lixin Zou, Qingyao Ai, Zhiyu Chen, Chenliang Li, Dawei Yin, Brian D. Davison

The goal of unbiased learning to rank (ULTR) is to leverage implicit user feedback for optimizing learning-to-rank systems.


User Retention-oriented Recommendation with Decision Transformer

1 code implementation11 Mar 2023 Kesen Zhao, Lixin Zou, Xiangyu Zhao, Maolin Wang, Dawei Yin

However, deploying the DT in recommendation is a non-trivial problem because of the following challenges: (1) deficiency in modeling the numerical reward value; (2) data discrepancy between the policy learning and recommendation generation; (3) unreliable offline performance evaluation.

Contrastive Learning counterfactual +1

Whole Page Unbiased Learning to Rank

no code implementations19 Oct 2022 Haitao Mao, Lixin Zou, Yujia Zheng, Jiliang Tang, Xiaokai Chu, Jiashu Zhao, Qian Wang, Dawei Yin

To address the above challenges, we propose a Bias Agnostic whole-page unbiased Learning to rank algorithm, named BAL, to automatically find the user behavior model with causal discovery and mitigate the biases induced by multiple SERP features with no specific design.

Causal Discovery Information Retrieval +2

Approximated Doubly Robust Search Relevance Estimation

no code implementations16 Aug 2022 Lixin Zou, Changying Hao, Hengyi Cai, Suqi Cheng, Shuaiqiang Wang, Wenwen Ye, Zhicong Cheng, Simiu Gu, Dawei Yin

We further instantiate the proposed unbiased relevance estimation framework in Baidu search, with comprehensive practical solutions designed regarding the data pipeline for click behavior tracking and online relevance estimation with an approximated deep neural network.


Model-based Unbiased Learning to Rank

1 code implementation24 Jul 2022 Dan Luo, Lixin Zou, Qingyao Ai, Zhiyu Chen, Dawei Yin, Brian D. Davison

Existing methods in unbiased learning to rank typically rely on click modeling or inverse propensity weighting (IPW).

Information Retrieval Learning-To-Rank +1

A Large Scale Search Dataset for Unbiased Learning to Rank

1 code implementation7 Jul 2022 Lixin Zou, Haitao Mao, Xiaokai Chu, Jiliang Tang, Wenwen Ye, Shuaiqiang Wang, Dawei Yin

The unbiased learning to rank (ULTR) problem has been greatly advanced by recent deep learning techniques and well-designed debias algorithms.

Causal Discovery Language Modelling +3

PReGAN: Answer Oriented Passage Ranking with Weakly Supervised GAN

no code implementations5 Jul 2022 Pan Du, Jian-Yun Nie, Yutao Zhu, Hao Jiang, Lixin Zou, Xiaohui Yan

Beyond topical relevance, passage ranking for open-domain factoid question answering also requires a passage to contain an answer (answerability).

Passage Ranking Question Answering

A Survey on Reinforcement Learning for Recommender Systems

no code implementations22 Sep 2021 Yuanguo Lin, Yong liu, Fan Lin, Lixin Zou, Pengcheng Wu, Wenhua Zeng, Huanhuan Chen, Chunyan Miao

To understand the challenges and relevant solutions, there should be a reference for researchers and practitioners working on RL-based recommender systems.

Explainable Recommendation reinforcement-learning +2

Enhanced Doubly Robust Learning for Debiasing Post-click Conversion Rate Estimation

1 code implementation28 May 2021 Siyuan Guo, Lixin Zou, Yiding Liu, Wenwen Ye, Suqi Cheng, Shuaiqiang Wang, Hechang Chen, Dawei Yin, Yi Chang

Based on it, a more robust doubly robust (MRDR) estimator has been proposed to further reduce its variance while retaining its double robustness.

counterfactual Imputation +2

Pre-trained Language Model based Ranking in Baidu Search

no code implementations24 May 2021 Lixin Zou, Shengqiang Zhang, Hengyi Cai, Dehong Ma, Suqi Cheng, Daiting Shi, Zhifan Zhu, Weiyue Su, Shuaiqiang Wang, Zhicong Cheng, Dawei Yin

However, it is nontrivial to directly apply these PLM-based rankers to the large-scale web search system due to the following challenging issues:(1) the prohibitively expensive computations of massive neural PLMs, especially for long texts in the web-document, prohibit their deployments in an online ranking system that demands extremely low latency;(2) the discrepancy between existing ranking-agnostic pre-training objectives and the ad-hoc retrieval scenarios that demand comprehensive relevance modeling is another main barrier for improving the online ranking system;(3) a real-world search engine typically involves a committee of ranking components, and thus the compatibility of the individually fine-tuned ranking model is critical for a cooperative ranking system.

Language Modelling Retrieval

Data-Efficient Reinforcement Learning for Malaria Control

no code implementations4 May 2021 Lixin Zou, Long Xia, Linfang Hou, Xiangyu Zhao, Dawei Yin

This work introduces a practical, data-efficient policy learning method, named Variance-Bonus Monte Carlo Tree Search~(VB-MCTS), which can copy with very little data and facilitate learning from scratch in only a few trials.

Decision Making Model-based Reinforcement Learning +2

Optimal Mixture Weights for Off-Policy Evaluation with Multiple Behavior Policies

no code implementations29 Nov 2020 Jinlin Lai, Lixin Zou, Jiaxing Song

Off-policy evaluation is a key component of reinforcement learning which evaluates a target policy with offline data collected from behavior policies.

Off-policy evaluation Recommendation Systems +2

Meta-Learning for Neural Relation Classification with Distant Supervision

no code implementations26 Oct 2020 Zhenzhen Li, Jian-Yun Nie, Benyou Wang, Pan Du, Yuhan Zhang, Lixin Zou, Dongsheng Li

Distant supervision provides a means to create a large number of weakly labeled data at low cost for relation classification.

Classification General Classification +3

Neural Interactive Collaborative Filtering

1 code implementation4 Jul 2020 Lixin Zou, Long Xia, Yulong Gu, Xiangyu Zhao, Weidong Liu, Jimmy Xiangji Huang, Dawei Yin

Therefore, the proposed exploration policy, to balance between learning the user profile and making accurate recommendations, can be directly optimized by maximizing users' long-term satisfaction with reinforcement learning.

Collaborative Filtering Meta-Learning +2

Toward Simulating Environments in Reinforcement Learning Based Recommendations

no code implementations27 Jun 2019 Xiangyu Zhao, Long Xia, Lixin Zou, Dawei Yin, Jiliang Tang

Thus, it calls for a user simulator that can mimic real users' behaviors where we can pre-train and evaluate new recommendation algorithms.

Generative Adversarial Network Recommendation Systems +2

Reinforcement Learning to Optimize Long-term User Engagement in Recommender Systems

no code implementations13 Feb 2019 Lixin Zou, Long Xia, Zhuoye Ding, Jiaxing Song, Weidong Liu, Dawei Yin

Though reinforcement learning~(RL) naturally fits the problem of maximizing the long term rewards, applying RL to optimize long-term user engagement is still facing challenges: user behaviors are versatile and difficult to model, which typically consists of both instant feedback~(e. g. clicks, ordering) and delayed feedback~(e. g. dwell time, revisit); in addition, performing effective off-policy learning is still immature, especially when combining bootstrapping and function approximation.

Recommendation Systems reinforcement-learning +1

Cannot find the paper you are looking for? You can Submit a new open access paper.