no code implementations • 27 Apr 2024 • Mohammad Pedramfar, Vaneet Aggarwal
This paper introduces the notion of upper linearizable/quadratizable functions, a class that extends concavity and DR-submodularity in various settings, including monotone and non-monotone cases over different convex sets.
1 code implementation • 15 Mar 2024 • Mohammad Pedramfar, Yididiya Y. Nadew, Christopher J. Quinn, Vaneet Aggarwal
This paper introduces unified projection-free Frank-Wolfe-type algorithms for adversarial continuous DR-submodular optimization, spanning scenarios such as full-information and (semi-)bandit feedback, monotone and non-monotone functions, different constraints, and types of stochastic queries.
no code implementations • 13 Feb 2024 • Mohammad Pedramfar, Vaneet Aggarwal
We also show that any such algorithm that requires full-information feedback can be transformed into an algorithm with semi-bandit feedback and a comparable regret bound.
no code implementations • NeurIPS 2023 • Ahmadreza Moradipari, Mohammad Pedramfar, Modjtaba Shokrian Zini, Vaneet Aggarwal
In this paper, we prove the first Bayesian regret bounds for Thompson Sampling in reinforcement learning in a multitude of settings.
no code implementations • NeurIPS 2023 • Mohammad Pedramfar, Christopher John Quinn, Vaneet Aggarwal
This paper presents a unified approach for maximizing continuous DR-submodular functions that encompasses a range of settings and oracle access types.
no code implementations • 23 Mar 2023 • Mohammad Pedramfar, Vaneet Aggarwal
This paper investigates the problem of combinatorial multi-armed bandits with stochastic submodular (in expectation) rewards and full-bandit delayed feedback, where the delayed feedback is assumed to be composite and anonymous.
1 code implementation • 28 Jan 2020 • Modjtaba Shokrian Zini, Mohammad Pedramfar, Matthew Riemer, Ahmadreza Moradipari, Miao Liu
Coagent networks formalize the concept of arbitrary networks of stochastic agents that collaborate to take actions in a reinforcement learning environment.