Search Results for author: Debangshu Banerjee

Found 9 papers, 2 papers with code

Bad Values but Good Behavior: Learning Highly Misspecified Bandits and MDPs

no code implementations • 13 Oct 2023 • Debangshu Banerjee, Aditya Gopalan

Parametric, feature-based reward models are employed by a variety of algorithms in decision-making settings such as bandits and Markov decision processes (MDPs).

Decision Making Multi-Armed Bandits +1

Incremental Randomized Smoothing Certification

1 code implementation • 31 May 2023 • Shubham Ugare, Tarun Suresh, Debangshu Banerjee, Gagandeep Singh, Sasa Misailovic

We experimentally demonstrate the effectiveness of our approach, showing up to 3x certification speedup over the certification that applies randomized smoothing of the approximate model from scratch.

Incremental Verification of Neural Networks

2 code implementations • 4 Apr 2023 • Shubham Ugare, Debangshu Banerjee, Sasa Misailovic, Gagandeep Singh

Complete verification of deep neural networks (DNNs) can exactly determine whether or not the DNN satisfies a desired trustworthy property (e.g., robustness, fairness) on an infinite set of inputs.

Fairness

Interpreting Robustness Proofs of Deep Neural Networks

no code implementations • 31 Jan 2023 • Debangshu Banerjee, Avaljot Singh, Gagandeep Singh

In recent years numerous methods have been developed to formally verify the robustness of deep neural networks (DNNs).

On the Minimax Regret for Linear Bandits in a wide variety of Action Spaces

no code implementations • 9 Jan 2023 • Debangshu Banerjee, Aditya Gopalan

As noted in \cite{lattimore2020bandit}, it is an open problem to characterize the minimax regret of linear bandits in a wide variety of action spaces.

Markov Chain Concentration with an Application in Reinforcement Learning

no code implementations • 7 Jan 2023 • Debangshu Banerjee

Given random variables $X_1, \ldots, X_N$ whose joint distribution is $\mu$, we use the martingale method to show that any Lipschitz function $f$ of these random variables is subgaussian.

Reinforcement Learning (RL)

Exploration in Linear Bandits with Rich Action Sets and its Implications for Inference

no code implementations • 23 Jul 2022 • Debangshu Banerjee, Avishek Ghosh, Sayak Ray Chowdhury, Aditya Gopalan

Furthermore, while the previous result is shown to hold only in the asymptotic regime (as $n \to \infty$), our result for these "locally rich" action spaces is any-time.

Clustering Model Selection

Critic Algorithms using Cooperative Networks

no code implementations • 19 Jan 2022 • Debangshu Banerjee, Kavita Wagh

This algorithm tracks the Projected Bellman Algorithm and is therefore different from the class of residual algorithms.

HEX and Neurodynamic Programming

no code implementations • 11 Aug 2020 • Debangshu Banerjee

Instead, we use reinforcement learning through self-play and approximation through neural networks to bypass the problems of a high branching factor and of maintaining large tables for state-action evaluations.
