Search Results for author: Mehdi Jafarnia-Jahromi

Found 9 papers, 2 papers with code

A Bayesian Learning Algorithm for Unknown Zero-sum Stochastic Games with an Arbitrary Opponent

no code implementations • 8 Sep 2021 • Mehdi Jafarnia-Jahromi, Rahul Jain, Ashutosh Nayyar

In this paper, we propose Posterior Sampling Reinforcement Learning for Zero-sum Stochastic Games (PSRL-ZSG), the first online learning algorithm that achieves a Bayesian regret bound of $O(HS\sqrt{AT})$ in infinite-horizon zero-sum stochastic games with the average-reward criterion.

Reinforcement Learning (RL)

Online Learning for Cooperative Multi-Player Multi-Armed Bandits

no code implementations • 7 Sep 2021 • William Chang, Mehdi Jafarnia-Jahromi, Rahul Jain

For the first setting, we propose a UCB-inspired algorithm that achieves $O(\log T)$ regret whether the rewards are IID or Markovian.

Multi-Armed Bandits

Implicit Finite-Horizon Approximation and Efficient Optimal Algorithms for Stochastic Shortest Path

no code implementations • NeurIPS 2021 • Liyu Chen, Mehdi Jafarnia-Jahromi, Rahul Jain, Haipeng Luo

We introduce a generic template for developing regret minimization algorithms in the Stochastic Shortest Path (SSP) model, which achieves minimax optimal regret as long as certain properties are ensured.

Online Learning for Stochastic Shortest Path Model via Posterior Sampling

no code implementations • 9 Jun 2021 • Mehdi Jafarnia-Jahromi, Liyu Chen, Rahul Jain, Haipeng Luo

We consider the problem of online reinforcement learning for the Stochastic Shortest Path (SSP) problem modeled as an unknown MDP with an absorbing state.

Reinforcement Learning (RL)

Online Learning for Unknown Partially Observable MDPs

no code implementations • 25 Feb 2021 • Mehdi Jafarnia-Jahromi, Rahul Jain, Ashutosh Nayyar

Learning optimal controllers for POMDPs when the model is unknown is harder.

Learning Infinite-horizon Average-reward MDPs with Linear Function Approximation

no code implementations • 23 Jul 2020 • Chen-Yu Wei, Mehdi Jafarnia-Jahromi, Haipeng Luo, Rahul Jain

We develop several new algorithms for learning Markov Decision Processes in an infinite-horizon average-reward setting with linear function approximation.

A Model-free Learning Algorithm for Infinite-horizon Average-reward MDPs with Near-optimal Regret

no code implementations • 8 Jun 2020 • Mehdi Jafarnia-Jahromi, Chen-Yu Wei, Rahul Jain, Haipeng Luo

Recently, model-free reinforcement learning has attracted research attention due to its simplicity, memory and computation efficiency, and the flexibility to combine with function approximation.

Q-Learning reinforcement-learning +1

Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes

1 code implementation • ICML 2020 • Chen-Yu Wei, Mehdi Jafarnia-Jahromi, Haipeng Luo, Hiteshi Sharma, Rahul Jain

Model-free reinforcement learning is known to be memory- and computation-efficient and more amenable to large-scale problems.

Multi-Armed Bandits reinforcement-learning +1

PPD: Permutation Phase Defense Against Adversarial Examples in Deep Learning

1 code implementation • 25 Dec 2018 • Mehdi Jafarnia-Jahromi, Tasmin Chowdhury, Hsin-Tai Wu, Sayandev Mukherjee

In this paper, Permutation Phase Defense (PPD) is proposed as a novel method to resist adversarial attacks.

Adversarial Defense General Classification
