Search Results for author: Shivaram Kalyanakrishnan

Found 6 papers, 1 paper with code

Regret Minimisation in Multi-Armed Bandits Using Bounded Arm Memory

no code implementations • 24 Jan 2019 • Arghya Roy Chaudhuri, Shivaram Kalyanakrishnan

We present a conceptually simple and efficient algorithm that needs to remember statistics of at most $M$ arms, and for any $K$-armed finite bandit instance it enjoys an $O(KM + K^{1.5}\sqrt{T\log(T/MK)}/M)$ upper bound on regret.

Multi-Armed Bandits

PAC Identification of Many Good Arms in Stochastic Multi-Armed Bandits

no code implementations • 24 Jan 2019 • Arghya Roy Chaudhuri, Shivaram Kalyanakrishnan

The problem of identifying $k > 1$ distinct arms from the best $\rho$ fraction is not always well-defined; for a special class of this problem, we present lower and upper bounds.

Multi-Armed Bandits

Lower Bounds for Policy Iteration on Multi-action MDPs

no code implementations • 16 Sep 2020 • Kumar Ashutosh, Sarthak Consul, Bhishma Dedhia, Parthasarathi Khirwadkar, Sahil Shah, Shivaram Kalyanakrishnan

An important theoretical question is how many iterations a specified PI variant will take to terminate as a function of the number of states $n$ and the number of actions $k$ in the input MDP.

An Analysis of Frame-skipping in Reinforcement Learning

no code implementations • 7 Feb 2021 • Shivaram Kalyanakrishnan, Siddharth Aravindan, Vishwajeet Bagdawat, Varun Bhatt, Harshith Goka, Archit Gupta, Kalpesh Krishna, Vihari Piratla

In this paper, we investigate the role of the parameter $d$ in RL; $d$ is called the "frame-skip" parameter, since states in the Atari domain are images.

Decision Making, Reinforcement Learning

PAC Mode Estimation using PPR Martingale Confidence Sequences

1 code implementation • 10 Sep 2021 • Shubham Anand Jain, Rohan Shah, Sanit Gupta, Denil Mehta, Inderjeet Jayakumar Nair, Jian Vora, Sushil Khyalia, Sourav Das, Vinay J. Ribeiro, Shivaram Kalyanakrishnan

This problem reduces to the estimation of a single parameter when $\mathcal{P}$ has a support set of size $K = 2$.

Artificial Intelligence and Life in 2030: The One Hundred Year Study on Artificial Intelligence

no code implementations • 31 Oct 2022 • Peter Stone, Rodney Brooks, Erik Brynjolfsson, Ryan Calo, Oren Etzioni, Greg Hager, Julia Hirschberg, Shivaram Kalyanakrishnan, Ece Kamar, Sarit Kraus, Kevin Leyton-Brown, David Parkes, William Press, AnnaLee Saxenian, Julie Shah, Milind Tambe, Astro Teller

In September 2016, Stanford's "One Hundred Year Study on Artificial Intelligence" project (AI100) issued the first report of its planned long-term periodic assessment of artificial intelligence (AI) and its impact on society.
