no code implementations • 15 Jun 2023 • Alexia Atsidakou, Branislav Kveton, Sumeet Katariya, Constantine Caramanis, Sujay Sanghavi
In a multi-armed bandit, we obtain $O(c_\Delta \log n)$ and $O(c_h \log^2 n)$ upper bounds for an upper confidence bound algorithm, where $c_h$ and $c_\Delta$ are constants depending on the prior distribution and the gaps of bandit instances sampled from it, respectively.
no code implementations • 1 Feb 2023 • Sanath Kumar Krishnamurthy, Tanmay Gangwani, Sumeet Katariya, Branislav Kveton, Shrey Modi, Anshuka Rangi
We consider the finite-horizon offline reinforcement learning (RL) setting, and are motivated by the challenge of learning the policy at any step $h$ in dynamic programming (DP) algorithms.
no code implementations • 9 Dec 2022 • Joey Hong, Branislav Kveton, Sumeet Katariya, Manzil Zaheer, Mohammad Ghavamzadeh
We prove per-task bounds on the suboptimality of the learned policies, which show a clear improvement over not using the hierarchical model.
no code implementations • 15 Nov 2022 • Alexia Atsidakou, Sumeet Katariya, Sujay Sanghavi, Branislav Kveton
We also provide a lower bound on the probability of misidentification in a $2$-armed Bayesian bandit and show that our upper bound (almost) matches it for any budget.
1 code implementation • 30 May 2022 • Imad Aouali, Branislav Kveton, Sumeet Katariya
The regret bound has two terms, one for learning the action parameters and the other for learning the shared effect parameters.
1 code implementation • 25 Feb 2022 • MohammadJavad Azizi, Branislav Kveton, Mohammad Ghavamzadeh, Sumeet Katariya
The Bayesian algorithm has access to a prior distribution over the meta-parameters, and its meta simple regret over $m$ bandit tasks with horizon $n$ is a mere $\tilde{O}(m / \sqrt{n})$.
1 code implementation • 16 Feb 2022 • Yaochen Xie, Sumeet Katariya, Xianfeng Tang, Edward Huang, Nikhil Rao, Karthik Subbian, Shuiwang Ji
They are also unable to provide explanations in cases where the GNN is trained in a self-supervised manner, and the resulting representations are used in future downstream tasks.
no code implementations • 3 Feb 2022 • Joey Hong, Branislav Kveton, Sumeet Katariya, Manzil Zaheer, Mohammad Ghavamzadeh
We use this exact posterior to analyze the Bayes regret of HierTS in Gaussian bandits.
2 code implementations • ICLR 2022 • Wenqing Zheng, Edward W Huang, Nikhil Rao, Sumeet Katariya, Zhangyang Wang, Karthik Subbian
We propose Cold Brew, a teacher-student distillation approach to address the SCS and noisy-neighbor challenges for GNNs.
1 code implementation • NeurIPS 2021 • Nurendra Choudhary, Nikhil Rao, Sumeet Katariya, Karthik Subbian, Chandan K. Reddy
Current approaches employ spatial geometries such as boxes to learn query representations that encompass the answer entities and model the logical operations of projection and intersection.
no code implementations • 29 Sep 2021 • Yaochen Xie, Sumeet Katariya, Xianfeng Tang, Edward W Huang, Nikhil Rao, Karthik Subbian, Shuiwang Ji
TAGE enables the explanation of GNN embedding models without downstream tasks and allows efficient explanation of multitask models.
no code implementations • 12 Apr 2021 • Shubham Gupta, Aadirupa Saha, Sumeet Katariya
We consider the problem of pure exploration with subset-wise preference feedback, which contains $N$ arms with features.
1 code implementation • 23 Dec 2020 • Nurendra Choudhary, Nikhil Rao, Sumeet Katariya, Karthik Subbian, Chandan K. Reddy
Promising approaches to tackle this problem include embedding the KG units (e.g., entities and relations) in a Euclidean space such that the query embedding contains the information relevant to its results.
1 code implementation • ICML 2020 • Yinglun Zhu, Sumeet Katariya, Robert Nowak
We study the problem of Robust Outlier Arm Identification (ROAI), where the goal is to identify arms whose expected rewards deviate substantially from the majority, by adaptively sampling from their reward distributions.
1 code implementation • NeurIPS 2019 • Sumeet Katariya, Ardhendu Tripathy, Robert Nowak
This paper studies the problem of adaptively sampling from $K$ distributions (arms) in order to identify the largest gap between any two adjacent means.
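To make the target quantity concrete, here is a minimal sketch that finds the largest gap between adjacent means when the means are already known. It only illustrates the objective; it is not the paper's adaptive sampling algorithm, and the function name `largest_adjacent_gap` is a hypothetical name chosen for this illustration.

```python
def largest_adjacent_gap(means):
    """Return (gap, lower_mean, upper_mean) for the widest gap
    between adjacent means once they are sorted.

    Assumes the means are known exactly; in the bandit setting they
    would have to be estimated from samples.
    """
    if len(means) < 2:
        raise ValueError("need at least two means")
    ordered = sorted(means)
    # Pair each adjacent difference with its index and keep the largest.
    gap, i = max(
        (ordered[j + 1] - ordered[j], j) for j in range(len(ordered) - 1)
    )
    return gap, ordered[i], ordered[i + 1]
```

For example, with means `[0.1, 0.9, 0.2, 0.8]` the widest adjacent gap lies between 0.2 and 0.8, which splits the arms into a low cluster and a high cluster.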
no code implementations • 3 Jun 2018 • Sumeet Katariya, Branislav Kveton, Zheng Wen, Vamsi K. Potluru
In many practical problems, a learning agent may want to learn the best action in hindsight without ever taking a bad action, that is, one that is significantly worse than the default production action.
1 code implementation • 20 Feb 2018 • Sumeet Katariya, Lalit Jain, Nandana Sengupta, James Evans, Robert Nowak
We consider the problem of active coarse ranking, where the goal is to sort items according to their means into clusters of pre-specified sizes, by adaptively sampling from their reward distributions.
no code implementations • 19 Mar 2017 • Sumeet Katariya, Branislav Kveton, Csaba Szepesvári, Claire Vernade, Zheng Wen
The probability that a user will click a search result depends both on its relevance and its position on the results page.
no code implementations • 10 Aug 2016 • Sumeet Katariya, Branislav Kveton, Csaba Szepesvári, Claire Vernade, Zheng Wen
The main challenge of the problem is that the individual values of the row and column are unobserved.
1 code implementation • 9 Feb 2016 • Sumeet Katariya, Branislav Kveton, Csaba Szepesvári, Zheng Wen
This work presents the first practical and regret-optimal online algorithm for learning to rank with multiple clicks in a cascade-like click model.
no code implementations • 31 Jan 2015 • Kevin Jamieson, Sumeet Katariya, Atul Deshpande, Robert Nowak
We prove that in the absence of structural assumptions, the sample complexity of this problem is proportional to the sum of the inverse squared gaps between the Borda scores of each suboptimal arm and the best arm.
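In symbols, the quantity described in that sentence can be written as follows. The notation is assumed for illustration and is not taken from the paper: $s_i$ denotes the Borda score of arm $i$, $i^\ast$ the best arm, and $H$ the resulting complexity term.

$$H = \sum_{i \neq i^\ast} \frac{1}{(s_{i^\ast} - s_i)^2}$$

Arms whose Borda scores are close to that of the best arm contribute the largest terms, so they dominate the number of samples needed.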