no code implementations • ICLR 2019 • Huizhuo Yuan, Chris Junchi Li, Yuhao Tang, Yuren Zhou
In this paper, we propose the StochAstic Recursive grAdient Policy Optimization (SARAPO) algorithm, a novel variance-reduction method built on Trust Region Policy Optimization (TRPO).
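The variance reduction in question is of the SARAH (StochAstic Recursive grAdient) type. As a hedged illustration only — not the paper's actual policy-gradient method — the sketch below shows the SARAH recursive estimator on a toy finite-sum least-squares problem; the problem, step size, and epoch counts are all illustrative assumptions:

```python
import numpy as np

# Toy finite-sum objective: f(x) = (1/n) * sum_i 0.5 * (a_i @ x - b_i)^2
rng = np.random.default_rng(0)
n, d = 200, 5
A = rng.normal(size=(n, d))
x_star = rng.normal(size=d)
b = A @ x_star  # noiseless targets, so the optimum is exactly x_star

def grad_i(x, i):
    """Gradient of the i-th component function."""
    return (A[i] @ x - b[i]) * A[i]

def full_grad(x):
    """Full-batch gradient, computed once per epoch as the anchor."""
    return A.T @ (A @ x - b) / n

def sarah(x0, lr=0.05, epochs=20, inner=n):
    x_prev = x0.copy()
    for _ in range(epochs):
        v = full_grad(x_prev)        # anchor: exact gradient at epoch start
        x = x_prev - lr * v
        for _ in range(inner):
            i = rng.integers(n)
            # SARAH recursion: v_t = grad_i(x_t) - grad_i(x_{t-1}) + v_{t-1}
            v = grad_i(x, i) - grad_i(x_prev, i) + v
            x_prev, x = x, x - lr * v
    return x

x_hat = sarah(np.zeros(d))
```

Unlike SVRG, the correction term is taken against the previous iterate rather than a fixed snapshot, so the estimator is recursive; this is the core device that SARAPO transplants onto the TRPO objective.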
no code implementations • 21 Mar 2024 • Yan Wang, Lihao Wang, Yuning Shen, Yiqun Wang, Huizhuo Yuan, Yue Wu, Quanquan Gu
The conformational landscape of proteins is crucial to understanding their functionality in complex biological processes.
no code implementations • 15 Feb 2024 • Huizhuo Yuan, Zixiang Chen, Kaixuan Ji, Quanquan Gu
Fine-tuning diffusion models remains an underexplored frontier in generative artificial intelligence (GenAI), especially when compared with the remarkable progress made in fine-tuning large language models (LLMs).
2 code implementations • 2 Jan 2024 • Zixiang Chen, Yihe Deng, Huizhuo Yuan, Kaixuan Ji, Quanquan Gu
In this paper, we delve into the prospect of growing a strong LLM out of a weak one without the need for acquiring additional human-annotated data.
no code implementations • 14 Dec 2023 • Zixiang Chen, Huizhuo Yuan, YongQian Li, Yiwen Kou, Junkai Zhang, Quanquan Gu
Despite the success of diffusion models in continuous spaces, discrete diffusion models, which apply to domains such as text and natural language, remain under-studied and often suffer from slow generation speed.
no code implementations • 9 Mar 2020 • Huizhuo Yuan, Xiangru Lian, Ji Liu, Yuren Zhou
In this paper, we propose a novel algorithm named STOchastic Recursive Momentum for Policy Gradient (STORM-PG), which maintains a SARAH-type stochastic recursive variance-reduced policy-gradient estimator in an exponential-moving-average fashion.
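The recursive-momentum idea behind STORM is to blend the SARAH correction into an exponential moving average, which removes the need for periodic full-gradient anchors. As a hedged sketch only — a plain stochastic quadratic rather than a policy-gradient objective, with illustrative constants throughout — the STORM estimator can be written as:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 5
x_star = rng.normal(size=d)

def storm(x0, lr=0.1, a=0.2, steps=2000, sigma=0.5):
    """Minimize f(x) = 0.5 * ||x - x_star||^2 from noisy gradients."""
    x_prev = x0.copy()
    d_est = (x_prev - x_star) + rng.normal(scale=sigma, size=d)  # one-sample init
    x = x_prev - lr * d_est
    for _ in range(steps):
        noise = rng.normal(scale=sigma, size=d)
        g_new = (x - x_star) + noise       # the SAME noise sample is used
        g_old = (x_prev - x_star) + noise  # at both iterates (key to STORM)
        # STORM recursion: d_t = g(x_t) + (1 - a) * (d_{t-1} - g(x_{t-1}))
        d_est = g_new + (1 - a) * (d_est - g_old)
        x_prev, x = x, x - lr * d_est
    return x

x_hat = storm(np.zeros(d))
```

Setting a = 1 recovers plain SGD, while a = 0 recovers a pure SARAH recursion; intermediate values give the exponential-moving-average behavior that lets the estimator's variance shrink without full-batch restarts.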
no code implementations • 7 Mar 2020 • Xiang Zhou, Huizhuo Yuan, Chris Junchi Li, Qingyun Sun
In this work, we put different variants of stochastic ADMM into a unified form, which includes standard, linearized and gradient-based ADMM with relaxation, and study their dynamics via a continuous-time model approach.
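For reference, the "standard" ADMM that anchors this family of variants alternates a primal minimization, a proximal step, and a dual update. The sketch below runs deterministic ADMM on a lasso instance — a common testbed, chosen here as an assumption rather than taken from the paper; the stochastic and linearized variants studied in the work replace the exact x-update with cheaper approximations:

```python
import numpy as np

# Lasso: minimize 0.5*||A x - b||^2 + lam*||z||_1  subject to  x = z
rng = np.random.default_rng(2)
m, n = 50, 20
A = rng.normal(size=(m, n))
x_true = np.zeros(n)
x_true[:3] = [1.0, -2.0, 1.5]          # sparse ground truth
b = A @ x_true + 0.01 * rng.normal(size=m)

def soft_threshold(v, k):
    """Proximal operator of k * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - k, 0.0)

def admm_lasso(lam=0.5, rho=1.0, iters=200):
    x = np.zeros(n); z = np.zeros(n); u = np.zeros(n)
    Atb = A.T @ b
    M = np.linalg.inv(A.T @ A + rho * np.eye(n))  # cached x-update factor
    for _ in range(iters):
        x = M @ (Atb + rho * (z - u))             # exact quadratic x-update
        z = soft_threshold(x + u, lam / rho)      # prox step on the l1 term
        u = u + x - z                             # scaled dual ascent
    return z

z_hat = admm_lasso()
```

Linearized ADMM replaces the matrix inverse in the x-update with a single gradient step on the quadratic, which is the discrete ancestor of the continuous-time dynamics the paper analyzes.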
no code implementations • 31 Dec 2019 • Huizhuo Yuan, Xiangru Lian, Ji Liu
This complexity is the best known among IFO complexity results for non-convex stochastic compositional optimization, and is believed to be optimal.
no code implementations • NeurIPS 2019 • Huizhuo Yuan, Xiangru Lian, Chris Junchi Li, Ji Liu, Wenqing Hu
Stochastic compositional optimization arises in many important machine learning tasks such as reinforcement learning and portfolio management.