Search Results for author: David D. Yao

Found 6 papers, 0 papers with code

Fine-Tuning Diffusion Generative Models via Rich Preference Optimization

no code implementations • 13 Mar 2025 • Hanyang Zhao, Haoxian Chen, Yucheng Guo, Genta Indra Winata, Tingting Ou, ZiYu Huang, David D. Yao, Wenpin Tang

We demonstrate the effectiveness of our pipeline and the resulting datasets in fine-tuning state-of-the-art diffusion models.

Score as Action: Fine-Tuning Diffusion Generative Models by Continuous-time Reinforcement Learning

no code implementations • 3 Feb 2025 • Hanyang Zhao, Haoxian Chen, Ji Zhang, David D. Yao, Wenpin Tang

Reinforcement learning from human feedback (RLHF), which aligns a diffusion model with the input prompt, has become a crucial step in building reliable generative AI models.

RainbowPO: A Unified Framework for Combining Improvements in Preference Optimization

no code implementations • 5 Oct 2024 • Hanyang Zhao, Genta Indra Winata, Anirban Das, Shi-Xiong Zhang, David D. Yao, Wenpin Tang, Sambit Sahu

Recently, numerous preference optimization algorithms have been introduced as extensions to the Direct Preference Optimization (DPO) family.

Scores as Actions: a framework of fine-tuning diffusion models by continuous-time reinforcement learning

no code implementations • 12 Sep 2024 • Hanyang Zhao, Haoxian Chen, Ji Zhang, David D. Yao, Wenpin Tang

Reinforcement learning from human feedback (RLHF) has been shown to be a promising direction for aligning generative models with human intent, and recent work has also explored it for aligning diffusion generative models.

Trading under the Proof-of-Stake Protocol -- a Continuous-Time Control Approach

no code implementations • 26 Jul 2022 • Wenpin Tang, David D. Yao

In particular, we show that when a participant is risk-neutral or risk-seeking, corresponding to the risk-adjusted valuation being a martingale or a sub-martingale, the optimal strategy must be to buy all the time, sell all the time, or first buy and then sell, with both buying and selling executed at full capacity.
