no code implementations • 13 Oct 2023 • Debangshu Banerjee, Aditya Gopalan
Parametric, feature-based reward models are employed by a variety of algorithms in decision-making settings such as bandits and Markov decision processes (MDPs).
1 code implementation • 31 May 2023 • Shubham Ugare, Tarun Suresh, Debangshu Banerjee, Gagandeep Singh, Sasa Misailovic
We experimentally demonstrate the effectiveness of our approach, showing up to a 3x certification speedup over applying randomized smoothing to the approximate model from scratch.
2 code implementations • 4 Apr 2023 • Shubham Ugare, Debangshu Banerjee, Sasa Misailovic, Gagandeep Singh
Complete verification of deep neural networks (DNNs) can exactly determine whether the DNN satisfies a desired trustworthy property (e.g., robustness, fairness) on an infinite set of inputs.
no code implementations • 31 Jan 2023 • Debangshu Banerjee, Avaljot Singh, Gagandeep Singh
In recent years numerous methods have been developed to formally verify the robustness of deep neural networks (DNNs).
no code implementations • 9 Jan 2023 • Debangshu Banerjee, Aditya Gopalan
As noted in \cite{lattimore2020bandit}, characterizing the minimax regret of linear bandits over a wide variety of action spaces remains an open problem.
no code implementations • 7 Jan 2023 • Debangshu Banerjee
Given random variables $X_1, \dots, X_N$ with joint distribution $\mu$, we use the martingale method to show that any Lipschitz function $f$ of these random variables is sub-Gaussian.
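The martingale method referenced in this abstract typically proceeds via a Doob martingale and the Azuma–Hoeffding inequality; the sketch below is a generic reconstruction of that standard route (assuming bounded differences with constants $c_i$, and stated for the independent case), not the paper's exact statement.

```latex
% Doob martingale associated with f(X_1,\dots,X_N):
M_i = \mathbb{E}\!\left[ f(X_1,\dots,X_N) \,\middle|\, X_1,\dots,X_i \right],
\qquad M_0 = \mathbb{E}[f], \quad M_N = f.

% If f has bounded differences, i.e.
% |f(\dots,x_i,\dots) - f(\dots,x_i',\dots)| \le c_i for each coordinate i,
% then the martingale increments satisfy |M_i - M_{i-1}| \le c_i, and
% Azuma--Hoeffding gives the sub-Gaussian tail bound
\Pr\bigl( \lvert f - \mathbb{E}[f] \rvert \ge t \bigr)
\le 2 \exp\!\left( - \frac{2 t^2}{\sum_{i=1}^{N} c_i^2} \right).
```

For dependent coordinates under a general joint law $\mu$, the same telescoping decomposition applies, but the increment bounds then depend on how much conditioning on $X_1,\dots,X_i$ shifts the conditional law of the remaining coordinates.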
no code implementations • 23 Jul 2022 • Debangshu Banerjee, Avishek Ghosh, Sayak Ray Chowdhury, Aditya Gopalan
Furthermore, while the previous result is shown to hold only in the asymptotic regime (as $n \to \infty$), our result for these "locally rich" action spaces is any-time.
no code implementations • 19 Jan 2022 • Debangshu Banerjee, Kavita Wagh
This algorithm tracks the Projected Bellman Algorithm and is therefore different from the class of residual algorithms.
no code implementations • 11 Aug 2020 • Debangshu Banerjee
Instead, we use reinforcement learning through self-play, with neural-network approximations, to bypass the problems of a high branching factor and of maintaining large tables for state-action evaluations.