no code implementations • COLING 2022 • Xuantao Lu, Jingping Liu, Zhouhong Gu, Hanwen Tong, Chenhao Xie, Junyang Huang, Yanghua Xiao, Wenguang Wang
In this paper, we propose a scoring model to automatically learn a model-based reward, and an effective training strategy based on curriculum learning is further proposed to stabilize the training process.