MoTiAC: Multi-Objective Actor-Critics for Real-Time Bidding

18 Feb 2020  ·  Haolin Zhou, Chaoqi Yang, Xiaofeng Gao, Qiong Chen, Gongshen Liu, Guihai Chen

Online Real-Time Bidding (RTB) is a complex auction game in which advertisers compete to bid for ad impressions when a user request occurs. Considering display cost, Return on Investment (ROI), and other influential Key Performance Indicators (KPIs), large ad platforms try to balance the trade-off among various goals dynamically. To address this challenge, we propose a Multi-ObjecTive Actor-Critics algorithm based on reinforcement learning (RL), named MoTiAC, for the problem of bidding optimization with various goals. In MoTiAC, objective-specific agents update the global network asynchronously with different goals and perspectives, leading to a robust bidding policy. Unlike previous RL models, the proposed MoTiAC can simultaneously fulfill multi-objective tasks in complicated bidding environments. In addition, we mathematically prove that our model will converge to Pareto optimality. Finally, experiments on a large-scale real-world commercial dataset from Tencent verify the effectiveness of MoTiAC versus a set of recent approaches.
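The idea of objective-specific agents updating one shared set of parameters can be illustrated with a toy sketch. This is not the MoTiAC algorithm and contains no actor-critic networks. It assumes a single scalar parameter `theta` (a hypothetical bid factor) and two made-up quadratic losses, one preferring low cost and one preferring high revenue. A random choice of worker at each step stands in for asynchronous arrival order. All names and constants below are assumptions for illustration only.

```python
import random

# Hypothetical toy losses, not taken from the paper.
def grad_cost(theta):
    # Gradient of the cost loss theta^2, which is minimized at theta = 0.
    return 2.0 * theta

def grad_revenue(theta):
    # Gradient of the revenue loss (theta - 2)^2, which is minimized at theta = 2.
    return 2.0 * (theta - 2.0)

def train(steps=2000, lr=0.05, seed=0):
    rng = random.Random(seed)
    global_theta = 5.0                    # shared "global network" (one scalar here)
    workers = [grad_cost, grad_revenue]   # one worker per objective
    for _ in range(steps):
        grad_fn = rng.choice(workers)     # whichever worker happens to report next
        local_theta = global_theta        # worker copies the current global value
        global_theta -= lr * grad_fn(local_theta)  # worker pushes its own gradient
    return global_theta

theta = train()
print(theta)
```

For these two convex losses, every value in the interval [0, 2] is Pareto optimal, since moving toward one minimizer worsens the other loss. The interleaved updates pull `theta` into that interval and keep it there, which is a much weaker statement than the convergence result claimed in the abstract.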
