5 Mar 2024 • Liangzhou Wang, Kaiwen Zhu, Fengming Zhu, Xinghu Yao, Shujie Zhang, Deheng Ye, Haobo Fu, Qiang Fu, Wei Yang
The common goal is an achievable state with high value, which is obtained by sampling from the distribution of future states.
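A minimal sketch of the idea of picking a high-value goal from sampled future states follows. It is an illustration under stated assumptions, not the authors' method: the Gaussian sampler, the toy value function, the sample count, and the names `sample_future_states`, `value`, and `pick_common_goal` are all hypothetical stand-ins. Because every candidate is drawn from the (assumed) future-state distribution, the selected goal is a state that distribution considers reachable.

```python
import random


def sample_future_states(rng, n_samples, dim=2):
    """Hypothetical stand-in for a learned future-state distribution.

    Here it is just an isotropic Gaussian; a real system would use a
    learned generative model conditioned on the current state.
    """
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_samples)]


def value(state):
    """Toy value function (an assumption): higher when closer to (1, 1)."""
    return -sum((x - 1.0) ** 2 for x in state)


def pick_common_goal(rng, n_samples=64):
    """Sample candidate future states and return the one with the highest value."""
    candidates = sample_future_states(rng, n_samples)
    return max(candidates, key=value)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
goal = pick_common_goal(rng)
print("selected goal:", goal, "value:", value(goal))
```

Taking the best of several samples is only one simple way to bias the goal toward high value; the source does not specify the selection rule or the number of samples.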