no code implementations • 10 Feb 2024 • Nam Phuong Tran, The Anh Ta, Shuqing Shi, Debmalya Mandal, Yali Du, Long Tran-Thanh
Reward allocation, also known as the credit assignment problem, has been an important topic in economics, engineering, and machine learning.
no code implementations • 21 Feb 2022 • Shuqing Shi, Xiaobin Wang, Zhiyou Yang, Fan Zhang, Hong Qu
This algorithm achieves a total regret bound of $\tilde{\mathcal{O}}(D\sqrt{SAT})$ over a time horizon $T$ with $S$ states, $A$ actions, and diameter $D$.
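To get a feel for how a bound of this form scales, the following minimal sketch evaluates only the leading term $D\sqrt{SAT}$. It is not the paper's algorithm: the function name `regret_bound_scale` and the numeric values are hypothetical, and the constants and logarithmic factors hidden by $\tilde{\mathcal{O}}$ are dropped.

```python
import math


def regret_bound_scale(diameter: float, n_states: int, n_actions: int, horizon: int) -> float:
    """Leading term D * sqrt(S * A * T); constants and log factors are omitted."""
    return diameter * math.sqrt(n_states * n_actions * horizon)


# Hypothetical problem sizes chosen only for illustration.
base = regret_bound_scale(diameter=4.0, n_states=10, n_actions=5, horizon=20_000)
longer = regret_bound_scale(diameter=4.0, n_states=10, n_actions=5, horizon=80_000)

print(base)            # 4000.0
print(longer / base)   # 2.0: quadrupling T only doubles the leading term
print(base / 20_000)   # 0.2: the leading term per step shrinks as T grows
```

Because the leading term grows as $\sqrt{T}$, the average regret per step scales as $1/\sqrt{T}$ and vanishes as the horizon grows, up to the omitted logarithmic factors.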