1 code implementation • 2 Feb 2025 • Mingyu Chen, Yiding Chen, Wen Sun, Xuezhou Zhang
Reinforcement Learning from Human Feedback (RLHF) has emerged as a pivotal technique for large language model (LLM) alignment.
no code implementations • 27 Sep 2024 • Mingyu Chen, Aldo Pacchiano, Xuezhou Zhang
In this work, we study the state-free RL problem, where the algorithm does not have the state information before interacting with the environment.
no code implementations • 1 May 2024 • Zexin Sun, Mingyu Chen, John Baillieul
Nonlinear differential equations are encountered as models of fluid flow, spiking neurons, and many other systems of interest in the real world.
no code implementations • 1 Mar 2024 • Mingyu Chen, Xuezhou Zhang
This paper initiates the study of scale-free learning in Markov Decision Processes (MDPs), where the scale of rewards/losses is unknown to the learner.
no code implementations • 3 Oct 2023 • Mingyu Chen, Xuezhou Zhang
We consider the Adversarial Multi-Armed Bandits (MAB) problem with unbounded losses, where the algorithms have no prior knowledge of the sizes of the losses.
1 code implementation • 26 May 2023 • Yao Fu, Litu Ou, Mingyu Chen, Yuhao Wan, Hao Peng, Tushar Khot
As large language models (LLMs) are continuously being developed, their evaluation becomes increasingly important yet challenging.
no code implementations • 26 Feb 2021 • Kazumasa Iida, Jens Hänisch, Keisuke Kondo, Mingyu Chen, Takafumi Hatano, Chao Wang, Hikaru Saito, Satoshi Hata, Hiroshi Ikuta
The anisotropic Ginzburg-Landau scaling for the angle dependence of $J_{\rm c}$ yielded temperature-dependent scaling parameters $\gamma_{\rm J}$ that decreased from 1.6 at 30 K to 1.3 at 5 K. This is opposite to the behaviour of NdFeAs(O,F).
Superconductivity
1 code implementation • CVPR 2018 • Hao-Min Liu, Mingyu Chen, Guofeng Zhang, Hujun Bao, Yingze Bao
However, jointly using visual and inertial measurements to optimize SLAM objective functions is a problem of high computational complexity.