no code implementations • 31 Mar 2024 • Dongsheng Zuo, Jiadong Zhu, Yikang Ouyang, Yuzhe Ma
The agent learns to optimize the multiplier structure using a Pareto-driven reward, which is customized to accommodate the trade-off between area and delay.
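
The excerpt does not give the reward formula, so the following is only a minimal sketch of one common way to build a reward around an area/delay trade-off: a weighted sum of the improvements in each metric, with a dominance check to collect the Pareto front. The names `dominates`, `pareto_front`, `tradeoff_reward`, `w_area`, and `w_delay` are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# A design point is (area, delay); both metrics are minimized.
# This representation is an assumption made for illustration.
Design = Tuple[float, float]


def dominates(a: Design, b: Design) -> bool:
    """Return True if a is no worse than b in both metrics and strictly better in one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def pareto_front(designs: List[Design]) -> List[Design]:
    """Keep only the designs that no other design dominates."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]


def tradeoff_reward(prev: Design, new: Design, w_area: float, w_delay: float) -> float:
    """Hypothetical reward: weighted sum of the area and delay reductions.

    Positive when the new design improves on the previous one under the
    chosen weights; the weights set the preferred point on the trade-off.
    """
    return w_area * (prev[0] - new[0]) + w_delay * (prev[1] - new[1])


if __name__ == "__main__":
    candidates = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0), (3.0, 3.0)]
    print(pareto_front(candidates))
    print(tradeoff_reward((10.0, 5.0), (8.0, 6.0), w_area=1.0, w_delay=1.0))
```

Under these assumptions, sweeping `w_area` and `w_delay` over several settings would steer the search toward different regions of the area/delay front; the actual reward used in the work may differ.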