Search Results for author: Mingyu Chen

Found 8 papers, 3 papers with code

Avoiding $\mathbf{exp(R_{max})}$ scaling in RLHF through Preference-based Exploration

1 code implementation • 2 Feb 2025 • Mingyu Chen, Yiding Chen, Wen Sun, Xuezhou Zhang

Reinforcement Learning from Human Feedback (RLHF) has emerged as a pivotal technique for large language model (LLM) alignment.

Language Modelling +1

State-free Reinforcement Learning

no code implementations • 27 Sep 2024 • Mingyu Chen, Aldo Pacchiano, Xuezhou Zhang

In this work, we study the state-free RL problem, where the algorithm does not have the state information before interacting with the environment.

Reinforcement Learning

Koopman-based Deep Learning for Nonlinear System Estimation

no code implementations • 1 May 2024 • Zexin Sun, Mingyu Chen, John Baillieul

Nonlinear differential equations are encountered as models of fluid flow, spiking neurons, and many other systems of interest in the real world.

Deep Learning, Transfer Learning

Scale-free Adversarial Reinforcement Learning

no code implementations • 1 Mar 2024 • Mingyu Chen, Xuezhou Zhang

This paper initiates the study of scale-free learning in Markov Decision Processes (MDPs), where the scale of rewards/losses is unknown to the learner.

Reinforcement Learning

Improved Algorithms for Adversarial Bandits with Unbounded Losses

no code implementations • 3 Oct 2023 • Mingyu Chen, Xuezhou Zhang

We consider the Adversarial Multi-Armed Bandits (MAB) problem with unbounded losses, where the algorithms have no prior knowledge on the sizes of the losses.

Multi-Armed Bandits

Chain-of-Thought Hub: A Continuous Effort to Measure Large Language Models' Reasoning Performance

1 code implementation • 26 May 2023 • Yao Fu, Litu Ou, Mingyu Chen, Yuhao Wan, Hao Peng, Tushar Khot

As large language models (LLMs) are continuously being developed, their evaluation becomes increasingly important yet challenging.

High $J_{\rm c}$ and low anisotropy of hydrogen doped NdFeAsO superconducting thin film

no code implementations • 26 Feb 2021 • Kazumasa Iida, Jens Hänisch, Keisuke Kondo, Mingyu Chen, Takafumi Hatano, Chao Wang, Hikaru Saito, Satoshi Hata, Hiroshi Ikuta

The anisotropic Ginzburg-Landau scaling for the angle dependence of $J_{\rm c}$ yielded temperature-dependent scaling parameters $\gamma_{\rm J}$ that decreased from 1.6 at 30 K to 1.3 at 5 K. This is opposite to the behaviour of NdFeAs(O,F).

Superconductivity

ICE-BA: Incremental, Consistent and Efficient Bundle Adjustment for Visual-Inertial SLAM

1 code implementation • CVPR 2018 • Hao-Min Liu, Mingyu Chen, Guofeng Zhang, Hujun Bao, Yingze Bao

However, jointly using visual and inertial measurements to optimize SLAM objective functions is a problem of high computational complexity.

Computational Efficiency, Pose Estimation
