Sequence Generation Modeling for Continuous Value Prediction
Continuous value prediction (CVP) plays a crucial role in short video recommendation, capturing user preferences through precise numerical estimates. However, traditional regression-based methods often struggle with challenges such as wide value ranges and imbalanced data, leading to prediction bias. While ordinal classification approaches have been introduced to address these issues, their reliance on discretization reduces accuracy and overlooks the inherent relationships between intervals. To overcome these limitations, we introduce a novel Generative Regression (GR) framework for CVP, inspired by sequence generation techniques in language modeling. Our method transforms numerical values into token sequences through structural discretization, preserving original data fidelity while improving prediction precision. Leveraging a carefully crafted vocabulary and label encoding, GR employs curriculum learning with an embedding mixup strategy to bridge the training-inference gap. Experimental evaluations on four public datasets and one large-scale industrial dataset validate the superiority of GR over existing methods. Real-world A/B tests on Kuaishou, a leading video platform, further demonstrate its practical effectiveness. Additionally, GR proves adaptable to other regression tasks, such as Lifetime Value (LTV) prediction, showcasing its potential as a robust solution for diverse CVP challenges.
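To make the core idea concrete, the following is a minimal sketch of how a continuous target (e.g., watch time in seconds) could be mapped to a token sequence over a small vocabulary and decoded back. The digit-level scheme, fixed precision, and vocabulary shown here are illustrative assumptions, not the paper's exact structural discretization or label encoding.

```python
# Minimal sketch: map a continuous value to a fixed-length token sequence and back.
# The digit-level tokenization, fixed precision, and vocabulary are assumptions
# for illustration only, not the GR paper's exact "structural discretization".

VOCAB = {tok: i for i, tok in enumerate(list("0123456789") + ["<pad>"])}
INV_VOCAB = {i: tok for tok, i in VOCAB.items()}

INT_DIGITS = 4   # assumed integer-part length
FRAC_DIGITS = 2  # assumed fractional-part length (fixed precision)


def encode(value: float) -> list[int]:
    """Encode a non-negative value as a fixed-length sequence of digit-token ids."""
    scaled = round(value * 10 ** FRAC_DIGITS)             # shift decimal point away
    digits = str(scaled).zfill(INT_DIGITS + FRAC_DIGITS)  # left-pad with zeros
    return [VOCAB[d] for d in digits]


def decode(tokens: list[int]) -> float:
    """Invert encode(): token ids -> digit string -> float."""
    digits = "".join(INV_VOCAB[t] for t in tokens)
    return int(digits) / 10 ** FRAC_DIGITS


if __name__ == "__main__":
    seq = encode(37.25)   # e.g. a watch time of 37.25 seconds
    print(seq)            # token ids for the digit string "003725"
    print(decode(seq))    # 37.25 recovered exactly at this precision
```

Because the value is recoverable from the generated tokens up to the chosen precision, such a sequence target avoids the hard bucketing of ordinal classification while still letting a language-model-style decoder predict it token by token.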