no code implementations • 8 Dec 2023 • Aradhya N. Mathur, Phu Pham, Aniket Bera, Ojaswa Sharma
Further, the recent work of Denoising Diffusion Policy Optimization (DDPO) demonstrates that the diffusion process is compatible with policy gradient methods and has been demonstrated to improve the 2D diffusion models using an aesthetic scoring function.