1 code implementation • 21 Feb 2024 • Jiayu Chen, Bhargav Ganguly, Yang Xu, Yongsheng Mei, Tian Lan, Vaneet Aggarwal
This work offers a hands-on reference for research progress on deep generative models for offline policy learning, and aims to inspire improved DGM-based offline RL and IL algorithms.
no code implementations • 18 Oct 2023 • Bhargav Ganguly, Yang Xu, Vaneet Aggarwal
Through a thorough theoretical analysis, we demonstrate that the quantum advantage in mean estimation leads to exponential improvements in the regret guarantees for infinite-horizon reinforcement learning.
no code implementations • 16 Feb 2023 • Bhargav Ganguly, Yulian Wu, Di Wang, Vaneet Aggarwal
This improvement is key to the significantly tighter regret guarantees in quantum reinforcement learning.
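The mean-estimation advantage referred to above can be illustrated with a back-of-the-envelope comparison: classically, estimating a bounded mean to additive error $\epsilon$ requires on the order of $1/\epsilon^2$ samples (e.g. by Hoeffding's inequality), while quantum amplitude/mean estimation achieves the same accuracy with on the order of $1/\epsilon$ oracle queries. The sketch below is purely illustrative of these scalings; the constants and the `quantum_queries` function are assumptions for exposition, not taken from the paper.

```python
import math

def classical_samples(eps, delta=0.05):
    # Hoeffding bound: n >= ln(2/delta) / (2*eps^2) i.i.d. samples suffice
    # to estimate a [0, 1]-bounded mean to error eps w.p. at least 1 - delta.
    return math.ceil(math.log(2 / delta) / (2 * eps ** 2))

def quantum_queries(eps, delta=0.05, c=1.0):
    # Quantum mean estimation: O(log(1/delta) / eps) oracle queries for the
    # same accuracy -- a quadratic speedup in 1/eps. The constant c is an
    # illustrative assumption, not a bound from the paper.
    return math.ceil(c * math.log(2 / delta) / eps)

# The gap between the two scalings widens as the target accuracy tightens.
for eps in (1e-1, 1e-2, 1e-3):
    print(f"eps={eps}: classical {classical_samples(eps)}, "
          f"quantum {quantum_queries(eps)}")
```

As `eps` shrinks by a factor of 10, the classical sample count grows by roughly 100x while the quantum query count grows by roughly 10x, which is the gap the regret analysis exploits.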
no code implementations • 22 Nov 2022 • Bhargav Ganguly, Vaneet Aggarwal
Federated Learning (FL) is an emerging domain in the broader context of artificial intelligence research.
no code implementations • 26 Mar 2022 • Bhargav Ganguly, Seyyedali Hosseinalipour, Kwang Taik Kim, Christopher G. Brinton, Vaneet Aggarwal, David J. Love, Mung Chiang
CE-FL also introduces a floating aggregation point: the local models generated at the devices and the servers are aggregated at an edge server that varies from one model-training round to the next, so as to cope with network evolution in terms of data distribution and user mobility.
no code implementations • 22 Oct 2021 • Alec Koppel, Amrit Singh Bedi, Bhargav Ganguly, Vaneet Aggarwal
We establish that the sample complexity to obtain near-globally optimal solutions matches tight dependencies on the cardinality of the state and action spaces, and exhibits classical scalings with respect to the network in accordance with multi-agent optimization.
Multi-agent Reinforcement Learning • Reinforcement Learning (RL)
no code implementations • 22 Feb 2021 • Mridul Agarwal, Bhargav Ganguly, Vaneet Aggarwal
We provide \NAM\, which runs at each agent, and prove that the total cumulative regret of $M$ agents is upper bounded by $\tilde{O}(DS\sqrt{MAT})$ for a Markov Decision Process with diameter $D$, $S$ states, and $A$ actions.
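The $\tilde{O}(DS\sqrt{MAT})$ bound implies a $\sqrt{M}$ benefit from cooperation: dividing the total regret among $M$ agents, each agent's share scales as $DS\sqrt{AT/M}$, shrinking as more agents participate. The sketch below evaluates the order-level bound (log factors and constants dropped) to make that scaling concrete; the function names are illustrative, not from the paper.

```python
import math

def total_regret_bound(D, S, A, M, T):
    # Order-level total regret across M agents (log factors dropped):
    # D * S * sqrt(M * A * T), per the paper's stated bound.
    return D * S * math.sqrt(M * A * T)

def per_agent_regret_bound(D, S, A, M, T):
    # Each agent's share: D * S * sqrt(A * T / M),
    # i.e. a sqrt(M) improvement over learning alone.
    return total_regret_bound(D, S, A, M, T) / M

# With 4 cooperating agents, per-agent regret halves relative to M = 1.
solo = per_agent_regret_bound(D=2, S=10, A=5, M=1, T=10_000)
coop = per_agent_regret_bound(D=2, S=10, A=5, M=4, T=10_000)
print(solo, coop)
```

Quadrupling the number of agents halves the per-agent bound, which is the cooperative speedup the regret guarantee captures.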