1 code implementation • 21 Apr 2025 • Jie Cheng, Ruixi Qiao, Lijun Li, Chao Guo, Junle Wang, Gang Xiong, Yisheng Lv, Fei-Yue Wang
In this paper, we identify the main cause of PRM-induced reward hacking: the canonical summation-form credit assignment in reinforcement learning (RL), which defines the value as cumulative gamma-decayed future rewards, easily induces LLMs to hack steps with high rewards.
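For reference, the summation-form value described in this abstract can be sketched as below. This is a minimal illustration of the standard discounted return, G_t = r_t + gamma * G_(t+1), and is not taken from the paper's code. The function name `discounted_returns` and the sample rewards are assumptions made for illustration.

```python
from typing import List


def discounted_returns(rewards: List[float], gamma: float = 0.99) -> List[float]:
    """Compute G_t = sum_{k>=0} gamma**k * r_{t+k} for every step t.

    `rewards` stands in for per-step rewards (e.g. from a process reward
    model); the name and signature are hypothetical, for illustration only.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


if __name__ == "__main__":
    # Hypothetical per-step rewards for a three-step solution.
    print(discounted_returns([1.0, 2.0, 4.0], gamma=0.5))
```

Under this form, a single high-reward step raises the value of every step that precedes it, which is the property the abstract links to reward hacking.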
no code implementations • 10 Nov 2024 • Liuyue Xie, Jiancong Guo, Laszlo A. Jeni, Zhiheng Jia, Mingyang Li, Yunwen Zhou, Chao Guo
To motivate further study of this problem, we provide a benchmark dataset containing real and synthetic walkable scenes captured with optical aberrations from a protective cover.
no code implementations • 14 Jul 2023 • Chao Guo, Ning Yao
In this paper, we study option pricing under the Vasicek model using a Hamiltonian approach.
no code implementations • 15 Feb 2023 • Qi Chen, Chao Guo
The path integral method in quantum mechanics provides a new way of thinking about barrier option pricing.
no code implementations • 26 Sep 2022 • Qi Chen, Hong-tao Wang, Chao Guo
The Hamiltonian approach in quantum mechanics provides a new way of thinking about barrier option pricing.
no code implementations • 8 Sep 2022 • Zeyu Liu, Yi Wang, Jing Wen, Yong Zhang, Hao Yin, Chao Guo, Zhongyu Wang
In addition, to improve segmentation performance, we adopt a multi-view and multi-window-level method, and we employ a fine-tuning strategy to mitigate the impact of inconsistent labeling.
no code implementations • 16 Dec 2021 • Qi Chen, Chao Guo
The path integral method in quantum mechanics provides a new way of thinking about barrier option pricing.