no code implementations • 30 Dec 2024 • Emilio Jorge, Christos Dimitrakakis, Debabrota Basu
First, we show that Posterior Sampling-based RL (PSRL) yields sublinear regret if the data distributions satisfy the logarithmic Sobolev inequality (LSI), under some mild additional assumptions.
no code implementations • 4 Dec 2024 • Apurv Shukla, Debabrota Basu
First, to quantify the impact of preferences, we derive a novel lower bound on sample complexity for identifying the most preferred policy with a confidence level $1-\delta$.
no code implementations • 12 Nov 2024 • Chao Han, Debabrota Basu, Michael Mangan, Eleni Vasilaki, Aditya Gilra
Learning representations of underlying environmental dynamics from partial observations is a critical challenge in machine learning.
no code implementations • 24 Oct 2024 • Udvas Das, Debabrota Basu
First, we propose a Lagrangian relaxation of the sample complexity lower bound for pure exploration under constraints.
no code implementations • 10 Oct 2024 • Ayoub Ajarra, Bishwamittra Ghosh, Debabrota Basu
For this purpose, we develop a new framework that quantifies different properties in terms of the Fourier coefficients of the ML model under audit but does not parametrically reconstruct it.
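As a minimal sketch of what "quantifying properties via Fourier coefficients without reconstructing the model" can mean, the snippet below estimates a single Fourier coefficient of a black-box Boolean function from random queries alone. The `majority` target and the plain Monte Carlo estimator are illustrative assumptions, not the paper's actual auditing framework.

```python
import random

def parity(x, S):
    """Character chi_S(x) = prod_{i in S} x_i for x in {-1,+1}^n."""
    p = 1
    for i in S:
        p *= x[i]
    return p

def estimate_fourier_coefficient(f, n, S, num_samples=50000, seed=0):
    """Estimate hat{f}(S) = E_x[f(x) * chi_S(x)] under the uniform
    distribution, using only query access to f (the function itself
    is never reconstructed)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(num_samples):
        x = [rng.choice((-1, 1)) for _ in range(n)]
        total += f(x) * parity(x, S)
    return total / num_samples

# Toy audited model: a 3-bit majority vote, queried as a black box.
def majority(x):
    return 1 if sum(x) > 0 else -1

# Each degree-1 Fourier coefficient of 3-bit majority equals 1/2.
coeff = estimate_fourier_coefficient(majority, n=3, S=(0,))
print(coeff)
```

With enough samples the estimate concentrates around the true coefficient, which is what makes query-only auditing of such properties feasible.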
no code implementations • 7 Oct 2024 • Debabrota Basu, Sourav Chakraborty, Debarshi Chanda, Buddha Dev Das, Arijit Ghosh, Arnab Ray
We prove the theoretical correctness of our algorithms while aiming to reduce the sample complexity for both public and private surveys.
1 code implementation • 21 Sep 2024 • Naheed Anjum Arafat, Debabrota Basu, Yulia Gel, Yuzhou Chen
We introduce the concept of the witness complex to adversarial analysis on graphs, which allows us to focus only on the salient shape characteristics of graphs, captured by a subset of the most essential nodes (i.e., landmarks), with minimal loss of topological information about the whole graph.
no code implementations • 5 Jul 2024 • Mahdi Kallel, Debabrota Basu, Riad Akrour, Carlo D'Eramo
The resulting algorithm combines the convenience of direct policy search with the scalability of reinforcement learning.
1 code implementation • 10 Jun 2024 • Achraf Azize, Marc Jourdan, Aymen Al Marjani, Debabrota Basu
First, to quantify the cost of privacy, we derive lower bounds on the sample complexity of any $\delta$-correct BAI algorithm satisfying $\epsilon$-global DP or $\epsilon$-local DP.
no code implementations • 22 May 2024 • Sunrit Chakraborty, Saptarshi Roy, Debabrota Basu
High-dimensional sparse linear bandits serve as an efficient model for sequential decision-making problems (e.g., personalized medicine), where high-dimensional features (e.g., genomic data) on the users are available, but only a small subset of them is relevant.
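The setting above can be sketched with a toy environment: many observed features, but rewards generated by a linear function whose parameter vector is sparse. The dimensions, support, and noise level below are illustrative assumptions, not values from the paper.

```python
import random

def make_sparse_bandit(dim, support, seed=0):
    """Toy high-dimensional sparse linear bandit: the reward is linear
    in the feature vector, but only a few coordinates carry signal."""
    rng = random.Random(seed)
    theta = [0.0] * dim
    for i in support:            # only these coordinates matter
        theta[i] = 1.0
    def pull(x):
        mean = sum(t * xi for t, xi in zip(theta, x))
        return mean + rng.gauss(0, 0.1)   # noisy linear reward
    return pull

# 100 observed features, but only features 3 and 7 are relevant.
pull = make_sparse_bandit(dim=100, support=[3, 7])
x = [0.0] * 100
x[3] = 1.0                      # activate one relevant feature
reward = pull(x)                # mean reward 1.0 plus small noise
print(reward)
```

The learner's difficulty is that the support is unknown: it must identify the few relevant coordinates while acting, which is what makes the high-dimensional sparse regime distinct from ordinary linear bandits.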
no code implementations • 11 Mar 2024 • Bishwamittra Ghosh, Debabrota Basu, Fu Huazhu, Wang Yuan, Renuga Kanagavelu, Jiang Jin Peng, Liu Yong, Goh Siow Mong Rick, Wei Qingsong
Additionally, to assess client contribution under a limited computational budget, we propose a scheduling procedure that considers a two-sided fairness criterion to perform the expensive Shapley value computation only in a subset of training epochs.
no code implementations • 15 Feb 2024 • Achraf Azize, Debabrota Basu
We study per-datum Membership Inference Attacks (MIAs), where an attacker aims to infer whether a fixed target datum has been included in the input dataset of an algorithm, thereby violating privacy.
no code implementations • 14 Feb 2024 • Reabetswe M. Nkhumise, Debabrota Basu, Tony J. Prescott, Aditya Gilra
Using an optimal transport-based metric, we measure the length of the paths induced by the policy sequence yielded by an RL algorithm between an initial policy and a final optimal policy.
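The idea of measuring the length of a policy path can be sketched with stateless policies over a small discrete action set, using the 1-Wasserstein distance between consecutive action distributions. This is a simplified stand-in for the paper's optimal-transport metric; the trajectory below is invented for illustration.

```python
def w1_discrete(p, q):
    """1-Wasserstein distance between two distributions over the
    ordered support {0, 1, ..., k-1}: sum of |CDF_p - CDF_q| gaps."""
    cp = cq = 0.0
    dist = 0.0
    for pi, qi in zip(p, q):
        cp += pi
        cq += qi
        dist += abs(cp - cq)
    return dist

def path_length(policy_sequence):
    """Length of the path traced by a sequence of policies, measured
    as the sum of transport distances between consecutive ones."""
    return sum(w1_discrete(p, q)
               for p, q in zip(policy_sequence, policy_sequence[1:]))

# A toy learning trajectory over 3 actions: probability mass moves
# gradually from action 0 to action 2.
trajectory = [
    [1.0, 0.0, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],
]
print(path_length(trajectory))   # → 2.0
```

Here the path length (2.0) equals the direct distance between the initial and final policies, i.e., this toy trajectory takes no detours; a less direct learning trajectory would yield a strictly longer path for the same endpoints.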
no code implementations • 28 Sep 2023 • Shubhada Agrawal, Timothée Mathieu, Debabrota Basu, Odalric-Ambrym Maillard
In this setting, accommodating potentially unbounded corruptions, we establish a problem-dependent lower bound on regret for a given family of arm distributions.
no code implementations • 1 Sep 2023 • Achraf Azize, Debabrota Basu
Next, we complement our regret upper bounds with the first minimax lower bounds on the regret of bandits with zCDP.
1 code implementation • 22 Jun 2023 • Emil Carlsson, Debabrota Basu, Fredrik D. Johansson, Devdatt Dubhashi
Both algorithms track an optimal allocation, derived from the lower bound and computed by a weighted projection onto the boundary of a normal cone.
1 code implementation • 24 Feb 2023 • Edwige Cyffers, Aurélien Bellet, Debabrota Basu
We study differentially private (DP) machine learning algorithms as instances of noisy fixed-point iterations, in order to derive privacy and utility results from this well-studied framework.
no code implementations • 18 Feb 2023 • Riccardo Della Vecchia, Debabrota Basu
Endogeneity, i.e., statistical dependence between the noise and the covariates, is a common phenomenon in real data, arising from omitted variables, strategic behaviours, measurement errors, etc.
no code implementations • 18 Feb 2023 • Hannes Eriksson, Debabrota Basu, Tommy Tram, Mina Alibeigi, Christos Dimitrakakis
Then, we propose a generic two-stage algorithm, MLEMTRL, to address the MTRL problem in discrete and continuous settings.
1 code implementation • 16 Feb 2023 • Pratik Karmakar, Debabrota Basu
We study the design of black-box model extraction attacks that send a minimal number of queries from a publicly available dataset to a target ML model through a predictive API, with the aim of creating an informative and distributionally equivalent replica of the target.
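The extraction loop described above can be sketched in a few lines: draw points from a public pool, label them through the (black-box) prediction API, and fit a surrogate on those labels. The `target_api` function, the logistic surrogate, and all hyperparameters below are hypothetical stand-ins, not the attack from the paper.

```python
import math
import random

# Hypothetical black-box target: we only observe its predictions, as
# through a remote API (the internals stand in for the remote model).
def target_api(x):
    score = 2.0 * x[0] - 1.0 * x[1]
    return 1 if score > 0 else 0

def extract(query_pool, num_queries=200, lr=0.5, epochs=300, seed=0):
    """Fit a logistic-regression replica using only labels obtained by
    querying the target on points drawn from a public pool."""
    rng = random.Random(seed)
    queries = rng.sample(query_pool, num_queries)
    labels = [target_api(x) for x in queries]    # the only target access
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(queries, labels):        # plain SGD on log-loss
            z = w[0] * x[0] + w[1] * x[1] + b
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - y                            # logistic-loss gradient
            w[0] -= lr * g * x[0]
            w[1] -= lr * g * x[1]
            b -= lr * g
    return w, b

rng = random.Random(1)
pool = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(2000)]
w, b = extract(pool)

# Agreement of the replica with the target on held-out public points:
held_out = pool[1000:]
agree = sum((w[0] * x[0] + w[1] * x[1] + b > 0) == (target_api(x) == 1)
            for x in held_out) / len(held_out)
print(agree)
```

High agreement on held-out points is what "distributionally equivalent replica" amounts to in this toy: the surrogate imitates the target on the query distribution, despite never seeing its parameters.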
no code implementations • 5 Oct 2022 • Reda Ouhamma, Debabrota Basu, Odalric-Ambrym Maillard
Our regret bound is order-optimal with respect to $H$ and $K$.
no code implementations • 6 Sep 2022 • Achraf Azize, Debabrota Basu
First, we prove the minimax and problem-dependent regret lower bounds for stochastic and linear bandits that quantify the hardness of bandits with $\epsilon$-global DP.
1 code implementation • 1 Jun 2022 • Bishwamittra Ghosh, Debabrota Basu, Kuldeep S. Meel
In this paper, we aim to quantify the influence of different features in a dataset on the bias of a classifier.
no code implementations • 20 Apr 2022 • Yannis Flet-Berliac, Debabrota Basu
In SAAC, the adversary aims to break the safety constraint while the RL agent aims to maximize the constrained value function given the adversary's policy.
no code implementations • 18 Mar 2022 • Hannes Eriksson, Debabrota Basu, Mina Alibeigi, Christos Dimitrakakis
In existing literature, the risk in stochastic games has been studied in terms of the inherent uncertainty evoked by the variability of transitions and actions.
no code implementations • 7 Mar 2022 • Debabrota Basu, Odalric-Ambrym Maillard, Timothée Mathieu
We study the corrupted bandit problem, i.e., a stochastic multi-armed bandit problem with $k$ unknown reward distributions that are heavy-tailed and corrupted by a history-independent adversary or Nature.
1 code implementation • 14 Oct 2021 • Junxiong Wang, Debabrota Basu, Immanuel Trummer
In black-box optimization problems, we aim to maximize an unknown objective function, which is accessible only through the feedback of an evaluation or simulation oracle.
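The defining constraint of this setting, access only through oracle evaluations, can be illustrated with a simple hill-climbing random search; this is a generic sketch, not the paper's method, and the quadratic `oracle` below is an invented toy objective.

```python
import random

def black_box_optimize(oracle, dim, iters=500, step=0.2, seed=0):
    """Maximize an unknown function using only oracle evaluations:
    hill-climbing random search (no gradients are available)."""
    rng = random.Random(seed)
    x = [0.0] * dim
    best = oracle(x)
    for _ in range(iters):
        cand = [xi + rng.gauss(0, step) for xi in x]
        val = oracle(cand)       # the only access to the objective
        if val > best:           # keep the move only if it improves
            x, best = cand, val
    return x, best

# Toy unknown objective: peak value 3.0 at the point (1, -2).
def oracle(x):
    return 3.0 - (x[0] - 1.0) ** 2 - (x[1] + 2.0) ** 2

x_best, f_best = black_box_optimize(oracle, dim=2)
print(x_best, f_best)
```

Every piece of information the optimizer has about the objective flows through `oracle(cand)` calls, which is exactly the evaluation-feedback regime described above.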
1 code implementation • 20 Sep 2021 • Bishwamittra Ghosh, Debabrota Basu, Kuldeep S. Meel
In recent years, machine learning (ML) algorithms have been deployed in safety-critical and high-stakes decision-making, where the fairness of algorithms is of paramount importance.
no code implementations • 23 Feb 2021 • Thomas Kleine Buening, Meirav Segal, Debabrota Basu, Christos Dimitrakakis, Anne-Marie George
Typically, merit is defined with respect to some intrinsic measure of worth.
no code implementations • 22 Feb 2021 • Hannes Eriksson, Debabrota Basu, Mina Alibeigi, Christos Dimitrakakis
In this paper, we consider risk-sensitive sequential decision-making in Reinforcement Learning (RL).
1 code implementation • 14 Sep 2020 • Bishwamittra Ghosh, Debabrota Basu, Kuldeep S. Meel
We instantiate Justicia on multiple classification and bias mitigation algorithms, and datasets to verify different fairness metrics, such as disparate impact, statistical parity, and equalized odds.
1 code implementation • 2 Mar 2020 • Ashish Dandekar, Debabrota Basu, Stephane Bressan
We also propose a cost model that bridges the gap between the privacy level and the compensation budget estimated by a GDPR compliant business entity.
no code implementations • NeurIPS Workshop ICBINB 2020 • Hannes Eriksson, Emilio Jorge, Christos Dimitrakakis, Debabrota Basu, Divya Grover
Bayesian reinforcement learning (BRL) offers a decision-theoretic solution for reinforcement learning.
no code implementations • 20 Jun 2019 • Aristide Tossou, Christos Dimitrakakis, Debabrota Basu
We derive the first polynomial-time Bayesian algorithm, BUCRL, that achieves, up to logarithmic factors, a regret (i.e., the difference between the accumulated rewards of the optimal policy and of our algorithm) of the optimal order $\tilde{\mathcal{O}}(\sqrt{DSAT})$.
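The parenthetical definition of regret can be made concrete with a few lines of code; the reward sequences below are invented for illustration.

```python
def regret(optimal_rewards, algo_rewards):
    """Regret over T steps: accumulated reward of the optimal policy
    minus the accumulated reward of the algorithm."""
    assert len(optimal_rewards) == len(algo_rewards)
    return sum(optimal_rewards) - sum(algo_rewards)

# Toy example: the optimal policy earns 1 per step; the algorithm
# earns less while it is still exploring.
opt = [1] * 6
alg = [0, 0, 1, 1, 1, 1]
print(regret(opt, alg))   # → 2
```

Sublinear regret, as in the $\tilde{\mathcal{O}}(\sqrt{DSAT})$ bound above, means this gap grows slower than the horizon $T$, so the algorithm's average per-step reward approaches the optimal one.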
no code implementations • 29 May 2019 • Debabrota Basu, Christos Dimitrakakis, Aristide Tossou
We derive and contrast lower bounds on the regret of bandit algorithms satisfying these definitions.
no code implementations • 27 May 2019 • Aristide Tossou, Debabrota Basu, Christos Dimitrakakis
We study model-based reinforcement learning in an unknown finite communicating Markov decision process.
1 code implementation • 7 Feb 2019 • Divya Grover, Debabrota Basu, Christos Dimitrakakis
We address the problem of Bayesian reinforcement learning using efficient model-based online planning.
1 code implementation • 4 May 2018 • Debabrota Basu, Pierre Senellart, Stéphane Bressan
BelMan alternates \emph{information projection} and \emph{reverse information projection}, i.e., projection of the pseudobelief-reward onto beliefs-rewards to choose the arm to play, and projection of the resulting beliefs-rewards onto the pseudobelief-reward.