no code implementations • 30 May 2024 • Yashaswini Murthy, Isaac Grosof, Siva Theja Maguluri, R. Srikant
We consider policy optimization methods in reinforcement learning settings where the state space is arbitrarily large, or even countably infinite.
no code implementations • 11 Mar 2024 • Navdeep Kumar, Yashaswini Murthy, Itai Shufaro, Kfir Y. Levy, R. Srikant, Shie Mannor
We present the first finite-time global convergence analysis of policy gradient in the context of infinite-horizon average-reward Markov decision processes (MDPs).
no code implementations • 8 Feb 2023 • Yashaswini Murthy, Mehrdad Moharrami, R. Srikant
Since the exponential cost formulation deals with the multiplicative Bellman equation, our main contribution is a convergence proof that is quite different from existing results for discounted and risk-neutral average-cost problems, as well as from risk-sensitive value and policy iteration approaches.
no code implementations • 8 Feb 2022 • Mehrdad Moharrami, Yashaswini Murthy, Arghyadip Roy, R. Srikant
We study the risk-sensitive exponential cost MDP formulation and develop a trajectory-based gradient algorithm to find the stationary point of the cost associated with a set of parameterized policies.
no code implementations • 8 Nov 2020 • Taha Ameen ur Rahman, Alton S. Barbehenn, Xinan Chen, Hassan Dbouk, James A. Douglas, Yuncong Geng, Ian George, John B. Harvill, Sung Woo Jeon, Kartik K. Kansal, Kiwook Lee, Kelly A. Levick, Bochao Li, Ziyue Li, Yashaswini Murthy, Adarsh Muthuveeru-Subramaniam, S. Yagiz Olmez, Matthew J. Tomei, Tanya Veeravalli, Xuechao Wang, Eric A. Wayman, Fan Wu, Peng Xu, Shen Yan, Heling Zhang, Yibo Zhang, Yifan Zhang, Yibo Zhao, Sourya Basu, Lav R. Varshney
Many information sources are not just sequences of distinguishable symbols but rather have invariances governed by alternative counting paradigms such as permutations, combinations, and partitions.
Information Theory
no code implementations • 25 Nov 2018 • Yashaswini Murthy
When both of our eyes are presented with the same image, the brain processes the two views and perceives them as a single coherent image.