no code implementations • 13 Jul 2022 • Mukul Gagrani, Corrado Rainone, Yang Yang, Harris Teague, Wonseok Jeon, Herke van Hoof, Weiliang Will Zeng, Piero Zappi, Christopher Lott, Roberto Bondesan
Recent work on machine learning for combinatorial optimization has shown that learning-based approaches can outperform heuristic methods in terms of speed and performance.
no code implementations • 19 Aug 2021 • Mukul Gagrani, Sagar Sudhakara, Aditya Mahajan, Ashutosh Nayyar, Yi Ouyang
The regret bound of the algorithm was derived under a technical assumption on the induced norm of the closed-loop system.
no code implementations • 9 Nov 2020 • Mukul Gagrani, Sagar Sudhakara, Aditya Mahajan, Ashutosh Nayyar, Yi Ouyang
We consider optimal control of an unknown multi-agent linear quadratic (LQ) system where the dynamics and the cost are coupled across the agents through the mean-field (i.e., empirical mean) of the states and controls.
no code implementations • NeurIPS 2017 • Yi Ouyang, Mukul Gagrani, Ashutosh Nayyar, Rahul Jain
This regret bound matches the best available bound for weakly communicating MDPs.