Search Results for author: Debabrota Basu

Found 38 papers, 12 papers with code

Isoperimetry is All We Need: Langevin Posterior Sampling for RL with Sublinear Regret

no code implementations • 30 Dec 2024 • Emilio Jorge, Christos Dimitrakakis, Debabrota Basu

First, we show that Posterior Sampling-based RL (PSRL) yields sublinear regret if the data distributions satisfy the log-Sobolev inequality (LSI) under some mild additional assumptions.

Reinforcement Learning (RL)

Preference-based Pure Exploration

no code implementations • 4 Dec 2024 • Apurv Shukla, Debabrota Basu

First, to quantify the impact of preferences, we derive a novel lower bound on sample complexity for identifying the most preferred policy with a confidence level $1-\delta$.

Dynamical-VAE-based Hindsight to Learn the Causal Dynamics of Factored-POMDPs

no code implementations • 12 Nov 2024 • Chao Han, Debabrota Basu, Michael Mangan, Eleni Vasilaki, Aditya Gilra

Learning representations of underlying environmental dynamics from partial observations is a critical challenge in machine learning.

Learning to Explore with Lagrangians for Bandits under Unknown Linear Constraints

no code implementations • 24 Oct 2024 • Udvas Das, Debabrota Basu

First, we propose a Lagrangian relaxation of the sample complexity lower bound for pure exploration under constraints.

Fairness, Multi-Armed Bandits

Active Fourier Auditor for Estimating Distributional Properties of ML Models

no code implementations • 10 Oct 2024 • Ayoub Ajarra, Bishwamittra Ghosh, Debabrota Basu

For this purpose, we develop a new framework that quantifies different properties in terms of the Fourier coefficients of the ML model under audit but does not parametrically reconstruct it.


Testing Credibility of Public and Private Surveys through the Lens of Regression

no code implementations • 7 Oct 2024 • Debabrota Basu, Sourav Chakraborty, Debarshi Chanda, Buddha Dev Das, Arijit Ghosh, Arnab Ray

We prove the theoretical correctness of our algorithms while trying to reduce the sample complexity for both public and private surveys.

Regression, Survey

When Witnesses Defend: A Witness Graph Topological Layer for Adversarial Graph Learning

1 code implementation • 21 Sep 2024 • Naheed Anjum Arafat, Debabrota Basu, Yulia Gel, Yuzhou Chen

We introduce the concept of witness complex to adversarial analysis on graphs, which allows us to focus only on the salient shape characteristics of graphs, yielded by the subset of the most essential nodes (i.e., landmarks), with minimal loss of topological information on the whole graph.

Graph Learning

Augmented Bayesian Policy Search

no code implementations • 5 Jul 2024 • Mahdi Kallel, Debabrota Basu, Riad Akrour, Carlo D'Eramo

The resulting algorithm combines the convenience of direct policy search with the scalability of reinforcement learning.

Bayesian Optimization, LEMMA

Differentially Private Best-Arm Identification

1 code implementation • 10 Jun 2024 • Achraf Azize, Marc Jourdan, Aymen Al Marjani, Debabrota Basu

First, to quantify the cost of privacy, we derive lower bounds on the sample complexity of any $\delta$-correct BAI algorithm satisfying $\epsilon$-global DP or $\epsilon$-local DP.

FLIPHAT: Joint Differential Privacy for High Dimensional Sparse Linear Bandits

no code implementations • 22 May 2024 • Sunrit Chakraborty, Saptarshi Roy, Debabrota Basu

High-dimensional sparse linear bandits serve as an efficient model for sequential decision-making problems (e.g., personalized medicine), where high-dimensional features (e.g., genomic data) on the users are available, but only a small subset of them is relevant.

Decision Making, Sequential Decision Making

Don't Forget What I did?: Assessing Client Contributions in Federated Learning

no code implementations • 11 Mar 2024 • Bishwamittra Ghosh, Debabrota Basu, Fu Huazhu, Wang Yuan, Renuga Kanagavelu, Jiang Jin Peng, Liu Yong, Goh Siow Mong Rick, Wei Qingsong

Additionally, to assess client contribution under a limited computational budget, we propose a scheduling procedure that considers a two-sided fairness criterion to perform expensive Shapley value computation only in a subset of training epochs.

Data Poisoning, Fairness

How Much Does Each Datapoint Leak Your Privacy? Quantifying the Per-datum Membership Leakage

no code implementations • 15 Feb 2024 • Achraf Azize, Debabrota Basu

We study per-datum Membership Inference Attacks (MIAs), where an attacker aims to infer whether a fixed target datum has been included in the input dataset of an algorithm and thus violates privacy.

How does Your RL Agent Explore? An Optimal Transport Analysis of Occupancy Measure Trajectories

no code implementations • 14 Feb 2024 • Reabetswe M. Nkhumise, Debabrota Basu, Tony J. Prescott, Aditya Gilra

Using an optimal transport-based metric, we measure the length of the paths induced by the policy sequence yielded by an RL algorithm between an initial policy and a final optimal policy.

Reinforcement Learning

CRIMED: Lower and Upper Bounds on Regret for Bandits with Unbounded Stochastic Corruption

no code implementations • 28 Sep 2023 • Shubhada Agrawal, Timothée Mathieu, Debabrota Basu, Odalric-Ambrym Maillard

In this setting, accommodating potentially unbounded corruptions, we establish a problem-dependent lower bound on regret for a given family of arm distributions.

Concentrated Differential Privacy for Bandits

no code implementations • 1 Sep 2023 • Achraf Azize, Debabrota Basu

Next, we complement our regret upper bounds with the first minimax lower bounds on the regret of bandits with zCDP.

Multi-Armed Bandits, Recommendation Systems

Pure Exploration in Bandits with Linear Constraints

1 code implementation • 22 Jun 2023 • Emil Carlsson, Debabrota Basu, Fredrik D. Johansson, Devdatt Dubhashi

Both of these algorithms try to track an optimal allocation that is based on the lower bound and computed by a weighted projection onto the boundary of a normal cone.

From Noisy Fixed-Point Iterations to Private ADMM for Centralized and Federated Learning

1 code implementation • 24 Feb 2023 • Edwige Cyffers, Aurélien Bellet, Debabrota Basu

We study differentially private (DP) machine learning algorithms as instances of noisy fixed-point iterations, in order to derive privacy and utility results from this well-studied framework.

Federated Learning

Stochastic Online Instrumental Variable Regression: Regrets for Endogeneity and Bandit Feedback

no code implementations • 18 Feb 2023 • Riccardo Della Vecchia, Debabrota Basu

Endogeneity, i.e., the dependence between noise and covariates, is a common phenomenon in real data due to omitted variables, strategic behaviours, measurement errors, etc.

Causal Inference, Regression

Marich: A Query-efficient Distributionally Equivalent Model Extraction Attack using Public Data

1 code implementation • 16 Feb 2023 • Pratik Karmakar, Debabrota Basu

We study the design of black-box model extraction attacks that can send a minimal number of queries from a publicly available dataset to a target ML model through a predictive API, with the aim of creating an informative and distributionally equivalent replica of the target.

Model extraction

When Privacy Meets Partial Information: A Refined Analysis of Differentially Private Bandits

no code implementations • 6 Sep 2022 • Achraf Azize, Debabrota Basu

First, we prove the minimax and problem-dependent regret lower bounds for stochastic and linear bandits that quantify the hardness of bandits with $\epsilon$-global DP.

Multi-Armed Bandits

SAAC: Safe Reinforcement Learning as an Adversarial Game of Actor-Critics

no code implementations • 20 Apr 2022 • Yannis Flet-Berliac, Debabrota Basu

In SAAC, the adversary aims to break the safety constraint while the RL agent aims to maximize the constrained value function given the adversary's policy.

Continuous Control

Risk-Sensitive Bayesian Games for Multi-Agent Reinforcement Learning under Policy Uncertainty

no code implementations • 18 Mar 2022 • Hannes Eriksson, Debabrota Basu, Mina Alibeigi, Christos Dimitrakakis

In existing literature, the risk in stochastic games has been studied in terms of the inherent uncertainty evoked by the variability of transitions and actions.

Multi-agent Reinforcement Learning, Reinforcement Learning

Bandits Corrupted by Nature: Lower Bounds on Regret and Robust Optimistic Algorithm

no code implementations • 7 Mar 2022 • Debabrota Basu, Odalric-Ambrym Maillard, Timothée Mathieu

We study the corrupted bandit problem, i.e., a stochastic multi-armed bandit problem with $k$ unknown reward distributions, which are heavy-tailed and corrupted by a history-independent adversary or Nature.

Procrastinated Tree Search: Black-box Optimization with Delayed, Noisy, and Multi-Fidelity Feedback

1 code implementation • 14 Oct 2021 • Junxiong Wang, Debabrota Basu, Immanuel Trummer

In black-box optimization problems, we aim to maximize an unknown objective function, where the function is only accessible through feedback from an evaluation or simulation oracle.

Algorithmic Fairness Verification with Graphical Models

1 code implementation • 20 Sep 2021 • Bishwamittra Ghosh, Debabrota Basu, Kuldeep S. Meel

In recent years, machine learning (ML) algorithms have been deployed in safety-critical and high-stakes decision-making, where the fairness of algorithms is of paramount importance.

Decision Making, Fairness

Justicia: A Stochastic SAT Approach to Formally Verify Fairness

1 code implementation • 14 Sep 2020 • Bishwamittra Ghosh, Debabrota Basu, Kuldeep S. Meel

We instantiate Justicia on multiple classification and bias mitigation algorithms and datasets to verify different fairness metrics, such as disparate impact, statistical parity, and equalized odds.


Differential Privacy at Risk: Bridging Randomness and Privacy Budget

1 code implementation • 2 Mar 2020 • Ashish Dandekar, Debabrota Basu, Stephane Bressan

We also propose a cost model that bridges the gap between the privacy level and the compensation budget estimated by a GDPR-compliant business entity.

Privacy Preserving

Near-optimal Bayesian Solution For Unknown Discrete Markov Decision Process

no code implementations • 20 Jun 2019 • Aristide Tossou, Christos Dimitrakakis, Debabrota Basu

We derive the first polynomial-time Bayesian algorithm, BUCRL, that achieves, up to logarithmic factors, a regret (i.e., the difference between the accumulated rewards of the optimal policy and our algorithm) of the optimal order $\tilde{\mathcal{O}}(\sqrt{DSAT})$.

Differential Privacy for Multi-armed Bandits: What Is It and What Is Its Cost?

no code implementations • 29 May 2019 • Debabrota Basu, Christos Dimitrakakis, Aristide Tossou

We derive and contrast lower bounds on the regret of bandit algorithms satisfying these definitions.

Multi-Armed Bandits

BelMan: Bayesian Bandits on the Belief–Reward Manifold

1 code implementation • 4 May 2018 • Debabrota Basu, Pierre Senellart, Stéphane Bressan

BelMan alternates information projection and reverse information projection, i.e., projection of the pseudobelief-reward onto beliefs-rewards to choose the arm to play, and projection of the resulting beliefs-rewards onto the pseudobelief-reward.
