Search Results for author: Xierui Song

Found 2 papers, 1 paper with code

ULMA: Unified Language Model Alignment with Human Demonstration and Point-wise Preference

1 code implementation • 5 Dec 2023 • Tianchi Cai, Xierui Song, Jiyan Jiang, Fei Teng, Jinjie Gu, Guannan Zhang

Aligning language models to human expectations, e.g., being helpful and harmless, has become a pressing challenge for large language models.

Language Modelling • Large Language Model


Marketing Budget Allocation with Offline Constrained Deep Reinforcement Learning

no code implementations • 6 Sep 2023 • Tianchi Cai, Jiyan Jiang, Wenpeng Zhang, Shiji Zhou, Xierui Song, Li Yu, Lihong Gu, Xiaodong Zeng, Jinjie Gu, Guannan Zhang

We further show that this method is guaranteed to converge to the optimal policy, which cannot be achieved by previous value-based reinforcement learning methods for marketing budget allocation.

Marketing • reinforcement-learning

