no code implementations • 10 Jun 2023 • Kexuan Wang, An Liu, Baishuo Liu
Despite the biased policy gradient estimation incurred by the single-loop design and observation reuse, we prove that SLDAC with a feasible initial point converges almost surely to a Karush-Kuhn-Tucker (KKT) point of the original problem.
no code implementations • 4 Dec 2022 • Shu Liu, Enquan Huang, Yan Xu, Kexuan Wang, Xiaoyan Kui, Tao Lei, Hongying Meng
To make the best use of the dataset, the manual ratings, attractiveness score, and standard deviation are aggregated explicitly to construct a dual label distribution, including the attractiveness distribution and the rating distribution.
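The excerpt does not say how the two distributions are built, so the following is only a minimal sketch under stated assumptions: the rating distribution is taken to be a normalized histogram of the raw manual ratings, and the attractiveness distribution is taken to be a Gaussian centered on the mean score with the given standard deviation, discretized over score bins. The function names, the 1-5 rating scale, and the bin grid are hypothetical and are not taken from the paper.

```python
import math


def rating_distribution(ratings, levels=(1, 2, 3, 4, 5)):
    """Normalized histogram of raw manual ratings (assumed 1-5 scale)."""
    counts = [sum(1 for r in ratings if r == level) for level in levels]
    total = sum(counts)
    if total == 0:
        raise ValueError("no rating falls on any of the given levels")
    return [c / total for c in counts]


def attractiveness_distribution(mean, std, bins):
    """Gaussian around the mean score, discretized over `bins` (assumption)."""
    if std <= 0:
        raise ValueError("std must be positive")
    weights = [math.exp(-0.5 * ((b - mean) / std) ** 2) for b in bins]
    total = sum(weights)
    if total == 0:
        raise ValueError("std is too small relative to the bin spacing")
    # Normalize so the discretized values form a probability distribution.
    return [w / total for w in weights]


# Hypothetical example: four raters, score bins every 0.5 between 1 and 5.
ratings = [3, 4, 4, 5]
bins = [1 + 0.5 * i for i in range(9)]
mean = sum(ratings) / len(ratings)
std = math.sqrt(sum((r - mean) ** 2 for r in ratings) / len(ratings))

dual_label = {
    "rating": rating_distribution(ratings),
    "attractiveness": attractiveness_distribution(mean, std, bins),
}
```

Both outputs sum to one, so either can serve as a soft target for a divergence-based loss; the paper's actual construction may differ.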