no code implementations • 17 Feb 2025 • Ruofei Ma, Wenpin Tang, David Yao
In this paper, we consider the impact of the order flow auction (OFA) in the context of the proposer-builder separation (PBS) mechanism from a game-theoretic perspective.
no code implementations • 3 Feb 2025 • Hanyang Zhao, Haoxian Chen, Ji Zhang, David D. Yao, Wenpin Tang
Reinforcement learning from human feedback (RLHF), which aligns a diffusion model with the input prompt, has become a crucial step in building reliable generative AI models.
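A common way this alignment step is formalized (a generic sketch; the symbols $r$, $\beta$, $c$, and $p_{\mathrm{ref}}$ below are illustrative rather than taken from this paper) is reward maximization with a KL penalty toward the pretrained model:

$$\max_{\theta}\; \mathbb{E}_{x \sim p_{\theta}(\cdot \mid c)}\big[r(x, c)\big] \;-\; \beta\, \mathrm{KL}\big(p_{\theta}(\cdot \mid c)\,\|\, p_{\mathrm{ref}}(\cdot \mid c)\big),$$

where $c$ is the prompt, $r$ is a reward model trained from human feedback, $p_{\mathrm{ref}}$ is the pretrained diffusion model, and $\beta > 0$ trades off reward against staying close to the reference.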
no code implementations • 2 Nov 2024 • Wenpin Tang, Xun Yu Zhou
We study the convergence of $q$-learning and related algorithms introduced by Jia and Zhou (J. Mach. Learn. Res.).
no code implementations • 5 Oct 2024 • Hanyang Zhao, Genta Indra Winata, Anirban Das, Shi-Xiong Zhang, David D. Yao, Wenpin Tang, Sambit Sahu
Recently, numerous preference optimization algorithms have been introduced as extensions to the Direct Preference Optimization (DPO) family.
no code implementations • 17 Sep 2024 • Genta Indra Winata, Hanyang Zhao, Anirban Das, Wenpin Tang, David D. Yao, Shi-Xiong Zhang, Sambit Sahu
Preference tuning is a crucial process for aligning deep generative models with human preferences.
no code implementations • 12 Sep 2024 • Hanyang Zhao, Haoxian Chen, Ji Zhang, David D. Yao, Wenpin Tang
Reinforcement learning from human feedback (RLHF) has been shown to be a promising direction for aligning generative models with human intent, and recent works have also explored it for aligning diffusion generative models.
no code implementations • 23 May 2024 • Haoxian Chen, Hanyang Zhao, Henry Lam, David Yao, Wenpin Tang
Direct Preference Optimization (DPO) has recently emerged as a popular approach to improve reinforcement learning from human feedback (RLHF), leading to better techniques for fine-tuning large language models (LLMs).
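For reference, the DPO objective is a logistic loss on the difference of policy-versus-reference log-ratios between preferred and dispreferred responses. Below is a minimal PyTorch sketch, assuming the per-response log-probabilities have already been computed (function and argument names are illustrative):

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss: a logistic loss on the difference of
    policy-vs-reference log-ratios for chosen and rejected responses."""
    chosen_logratio = policy_logp_chosen - ref_logp_chosen
    rejected_logratio = policy_logp_rejected - ref_logp_rejected
    logits = beta * (chosen_logratio - rejected_logratio)
    # loss = -log sigmoid(logits), averaged over the batch
    return -F.logsigmoid(logits).mean()
```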
no code implementations • 10 Mar 2024 • Wenpin Tang
This paper aims to develop a rigorous treatment of the problem of entropy-regularized fine-tuning in the context of continuous-time diffusion models, which was recently proposed by Uehara et al. (arXiv:2402.15194, 2024).
no code implementations • 12 Feb 2024 • Wenpin Tang, Hanyang Zhao
This is an expository article on the score-based diffusion models, with a particular focus on the formulation via stochastic differential equations (SDE).
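The SDE formulation in question is the standard one from score-based generative modeling: a forward noising SDE paired with a reverse-time sampling SDE driven by the score. In the usual notation,

$$dX_t = f(X_t, t)\,dt + g(t)\,dW_t, \qquad t \in [0, T],$$

with the corresponding reverse-time (sampling) SDE

$$dX_t = \big[f(X_t, t) - g(t)^2 \nabla_x \log p_t(X_t)\big]\,dt + g(t)\,d\bar{W}_t,$$

where $p_t$ is the marginal density of the forward process and the score $\nabla_x \log p_t$ is approximated by a learned network.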
no code implementations • 23 Jan 2024 • Wenpin Tang, Hanyang Zhao
Since accurate score matching is not guaranteed, we propose a new criterion -- the contraction of backward sampling -- in the design of DPMs, leading to a novel class of contractive DPMs (CDPMs).
no code implementations • 3 Aug 2023 • Wenpin Tang
With the increasing adoption of Proof of Stake (PoS) blockchains, it is timely to study the economy created by such blockchains.
no code implementations • 26 Jul 2022 • Wenpin Tang, David D. Yao
In particular, we show that when a participant is risk-neutral or risk-seeking, corresponding to the risk-adjusted valuation being a martingale or a sub-martingale, the optimal strategy must be to either buy all the time, sell all the time, or first buy and then sell, with both buying and selling executed at full capacity.
no code implementations • 5 Jun 2022 • Wenpin Tang
This paper is concerned with the stability of shares in a cryptocurrency where the new coins are issued according to the Proof of Stake protocol.
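As a minimal illustration of the dynamics at stake, the following sketch assumes the simplest PoS rule, in which each new coin is awarded to one stakeholder chosen with probability proportional to current holdings (a Pólya-urn-type scheme); all names and parameters are illustrative:

```python
import numpy as np

def simulate_pos_shares(initial_coins, n_rounds, reward=1.0, seed=0):
    """Each round, one stakeholder receives `reward` new coins with
    probability proportional to its current holdings (Polya-urn-like)."""
    rng = np.random.default_rng(seed)
    coins = np.array(initial_coins, dtype=float)
    for _ in range(n_rounds):
        winner = rng.choice(len(coins), p=coins / coins.sum())
        coins[winner] += reward
    return coins / coins.sum()  # final fractional shares

# e.g. two stakeholders starting with 60% / 40% of all coins
print(simulate_pos_shares([60.0, 40.0], n_rounds=10_000))
```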
no code implementations • 26 Mar 2021 • Wenpin Tang, Xiao Xu, Xun Yu Zhou
Finally, we conduct an empirical analysis to verify the performance of the algorithm.
no code implementations • 3 Feb 2021 • Wenpin Tang, Xun Yu Zhou
… (in cumulative step size), and provide an explicit rate as a function of the model parameters.
no code implementations • 1 Jul 2020 • Wenpin Tang
In this paper, we consider mixtures of multinomial logistic models (MNL), which are known to $\epsilon$-approximate any random utility model.
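For context, under an MNL model with utilities $u_1, \dots, u_n$, the probability of choosing item $i$ from an offer set $S$ is the softmax of the utilities, and a $K$-component mixture averages such models (standard definitions; the notation here is chosen for illustration):

$$\mathbb{P}(i \mid S) = \frac{e^{u_i}}{\sum_{j \in S} e^{u_j}}, \qquad \mathbb{P}_{\mathrm{mix}}(i \mid S) = \sum_{k=1}^{K} \alpha_k\, \frac{e^{u_i^{(k)}}}{\sum_{j \in S} e^{u_j^{(k)}}}, \quad \alpha_k \ge 0,\; \sum_{k} \alpha_k = 1.$$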
no code implementations • 9 May 2020 • Xin Guo, Jiequn Han, Mahan Tajrobehkar, Wenpin Tang
Motivated by the super-diffusivity of self-repelling random walk, which has roots in statistical physics, this paper develops a new perturbation mechanism for optimization algorithms.
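As a purely generic illustration (this is not the specific mechanism developed in the paper), a perturbation scheme for optimization can be as simple as injecting noise into gradient descent when progress stalls; everything below, including the stall rule, is hypothetical:

```python
import numpy as np

def perturbed_gradient_descent(grad, x0, lr=0.01, n_steps=1000,
                               noise_scale=0.1, stall_tol=1e-3, seed=0):
    """Plain gradient descent that adds isotropic Gaussian noise to the
    update whenever the gradient is small (a generic perturbation rule)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_steps):
        g = grad(x)
        if np.linalg.norm(g) < stall_tol:
            g = g + noise_scale * rng.standard_normal(x.shape)
        x = x - lr * g
    return x
```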
1 code implementation • 15 Aug 2019 • Wenpin Tang, Lu Zhang, Sudipto Banerjee
We formally establish results on the identifiability and consistency of the nugget in spatial models based upon the Gaussian process, within the framework of in-fill asymptotics, i.e., the sample size increases within a bounded sampling domain.
Spatial Interpolation
Statistics Theory
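The nugget referred to in the entry above is the i.i.d. noise variance in the standard geostatistical model (a generic formulation; the notation here is chosen for illustration):

$$y(s) = \mu(s) + w(s) + \varepsilon(s), \qquad w(\cdot) \sim \mathrm{GP}\big(0,\, \sigma^2 R(\cdot, \cdot\,;\phi)\big), \qquad \varepsilon(s) \overset{\mathrm{iid}}{\sim} N(0, \tau^2),$$

where $\tau^2$ is the nugget, $\sigma^2$ the partial sill, and $R$ a correlation function with decay parameter $\phi$; in-fill asymptotics keep the domain fixed while the number of observed locations grows.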
no code implementations • 10 Sep 2018 • Xin Guo, Wenpin Tang, Renyuan Xu
In this paper, we propose and analyze a class of $N$-player stochastic games that includes finite-fuel stochastic games as a special case.
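For orientation, a single-player finite-fuel problem of the type these games generalize takes the following standard form (the notation here is illustrative): the player steers a diffusion with a bounded-variation control whose total variation cannot exceed the available fuel $y$,

$$dX_t = b(X_t)\,dt + \sigma(X_t)\,dW_t + d\xi^+_t - d\xi^-_t, \qquad \xi^+_T + \xi^-_T \le y,$$

$$\inf_{\xi}\; \mathbb{E}\Big[\int_0^T h(X_t)\,dt + \kappa\,(\xi^+_T + \xi^-_T)\Big],$$

where $\xi^{\pm}$ are nondecreasing processes and $\kappa$ is a proportional cost of control; in the $N$-player version each player has its own state and fuel, with costs typically coupled across players.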