Search Results for author: Siddhartha Banerjee

Found 21 papers, 7 papers with code

The SMART approach to instance-optimal online learning

no code implementations 27 Feb 2024 Siddhartha Banerjee, Alankrita Bhatt, Christina Lee Yu

We devise an online learning algorithm, titled Switching via Monotone Adapted Regret Traces (SMART), that adapts to the data and achieves regret that is instance optimal, i.e., simultaneously competitive on every input sequence compared to the performance of the follow-the-leader (FTL) policy and the worst case guarantee of any other input policy.

Artificial Replay: A Meta-Algorithm for Harnessing Historical Data in Bandits

1 code implementation 30 Sep 2022 Siddhartha Banerjee, Sean R. Sinclair, Milind Tambe, Lily Xu, Christina Lee Yu

How best to incorporate historical data to "warm start" bandit algorithms is an open question: naively initializing reward estimates using all historical samples can suffer from spurious data and imbalanced data coverage, leading to computational and storage issues, particularly salient in continuous action spaces.

Open-Ended Question Answering

Adaptive Discretization in Online Reinforcement Learning

no code implementations 29 Oct 2021 Sean R. Sinclair, Siddhartha Banerjee, Christina Lee Yu

In this paper we provide a unified theoretical analysis of tree-based hierarchical partitioning methods for online reinforcement learning, providing model-free and model-based algorithms.

Management reinforcement-learning +1

Real-Time Approximate Routing for Smart Transit Systems

2 code implementations 10 Mar 2021 Siddhartha Banerjee, Chamsi Hssaine, Noémie Périvier, Samitha Samaranayake

We study real-time routing policies in smart transit systems, where the platform has a combination of cars and high-capacity vehicles (e.g., buses or shuttles) and seeks to serve a set of incoming trip requests.

Optimization and Control

Explainable AI for Robot Failures: Generating Explanations that Improve User Assistance in Fault Recovery

no code implementations 5 Jan 2021 Devleena Das, Siddhartha Banerjee, Sonia Chernova

In order for error explanations to be meaningful, we investigate what types of information within a set of hand-scripted explanations are most helpful to non-experts for failure and solution identification.

Decision Making

Explainable AI for System Failures: Generating Explanations that Improve Human Assistance in Fault Recovery

no code implementations 18 Nov 2020 Devleena Das, Siddhartha Banerjee, Sonia Chernova

With the growing capabilities of intelligent systems, the integration of artificial intelligence (AI) and robots in everyday life is increasing.

Adaptive Discretization for Model-Based Reinforcement Learning

1 code implementation NeurIPS 2020 Sean R. Sinclair, Tianyu Wang, Gauri Jain, Siddhartha Banerjee, Christina Lee Yu

We introduce the technique of adaptive discretization to design an efficient model-based episodic reinforcement learning algorithm in large (potentially continuous) state-action spaces.

Model-based Reinforcement Learning reinforcement-learning +1

Adaptive Discretization for Episodic Reinforcement Learning in Metric Spaces

1 code implementation 17 Oct 2019 Sean R. Sinclair, Siddhartha Banerjee, Christina Lee Yu

We present an efficient algorithm for model-free episodic reinforcement learning on large (potentially continuous) state-action spaces.

Q-Learning reinforcement-learning +1

Hierarchical Transfer Learning for Multi-label Text Classification

no code implementations ACL 2019 Siddhartha Banerjee, Cem Akkaya, Francisco Perez-Sorrosal, Kostas Tsioutsiouliklis

Compared to binary classifiers trained from scratch, our HTrans approach results in significant improvements of 1% on micro-F1 and 3% on macro-F1 on the RCV1 dataset.

Binary Classification General Classification +4

Online Allocation and Pricing: Constant Regret via Bellman Inequalities

no code implementations 14 Jun 2019 Alberto Vera, Siddhartha Banerjee, Itai Gurvich

We develop a framework for designing simple and efficient policies for a family of online allocation and pricing problems that includes online packing, budget-constrained probing, dynamic pricing, and online contextual bandits with knapsacks.

Multi-Armed Bandits

The Bayesian Prophet: A Low-Regret Framework for Online Decision Making

1 code implementation 15 Jan 2019 Alberto Vera, Siddhartha Banerjee

We develop a new framework for designing online policies given access to an oracle providing statistical information about an offline benchmark.

Decision Making

Generating Abstractive Summaries from Meeting Transcripts

no code implementations 22 Sep 2016 Siddhartha Banerjee, Prasenjit Mitra, Kazunari Sugiyama

The most informative and well-formed sub-graph obtained by integer linear programming (ILP) is selected to generate a one-sentence summary for each topic segment.

Document Summarization Sentence +1

Multi-document abstractive summarization using ILP based multi-sentence compression

no code implementations 22 Sep 2016 Siddhartha Banerjee, Prasenjit Mitra, Kazunari Sugiyama

The sentences in the most important document are aligned to sentences in other documents to generate clusters of similar sentences.

Abstractive Text Summarization Document Summarization +4

Abstractive Meeting Summarization Using Dependency Graph Fusion

no code implementations 22 Sep 2016 Siddhartha Banerjee, Prasenjit Mitra, Kazunari Sugiyama

Automatic summarization techniques on meeting conversations developed so far have been primarily extractive, resulting in poor summaries.

Meeting Summarization Sentence +1

Unbounded Human Learning: Optimal Scheduling for Spaced Repetition

1 code implementation 23 Feb 2016 Siddharth Reddy, Igor Labutov, Siddhartha Banerjee, Thorsten Joachims

Second, we use this memory model to develop a stochastic model for spaced repetition systems.

Scheduling

Fast Bidirectional Probability Estimation in Markov Models

no code implementations NeurIPS 2015 Siddhartha Banerjee, Peter Lofgren

We develop a new bidirectional algorithm for estimating Markov chain multi-step transition probabilities: given a Markov chain, we want to estimate the probability of hitting a given target state in $\ell$ steps after starting from a given source distribution.

Personalized PageRank Estimation and Search: A Bidirectional Approach

1 code implementation 21 Jul 2015 Peter Lofgren, Siddhartha Banerjee, Ashish Goel

First, for the problem of estimating Personalized PageRank (PPR) from a source distribution to a target node, we present a new bidirectional estimator with simple yet strong guarantees on correctness and performance, and 3x to 8x speedup over existing estimators in experiments on a diverse set of networks.

Online Collaborative-Filtering on Graphs

no code implementations 7 Nov 2014 Siddhartha Banerjee, Sujay Sanghavi, Sanjay Shakkottai

We consider this problem under a simple natural model, wherein the number of items and the number of item-views are of the same order, and an 'access-graph' constrains which user is allowed to see which item.

Collaborative Filtering Recommendation Systems

The Price of Privacy in Untrusted Recommendation Engines

no code implementations 13 Jul 2012 Siddhartha Banerjee, Nidhi Hegde, Laurent Massoulié

In the information-rich regime, where each user rates at least a constant fraction of items, a spectral clustering approach is shown to achieve a sample-complexity lower bound derived from a simple information-theoretic argument based on Fano's inequality.

Clustering Recommendation Systems
