Search Results for author: Arpit Agarwal

Found 17 papers, 5 papers with code

Reinforcement Learning of Active Vision for Manipulating Objects under Occlusions

1 code implementation 20 Nov 2018 Ricson Cheng, Arpit Agarwal, Katerina Fragkiadaki

We propose hand/eye controllers that learn to move the camera to keep the object within the field of view and visible, in coordination with manipulating it to achieve the desired goal, e.g., pushing it to a target location.

Object reinforcement-learning +1

Simulation of Vision-based Tactile Sensors using Physics based Rendering

1 code implementation 24 Dec 2020 Arpit Agarwal, Tim Man, Wenzhen Yuan

Tactile sensing has seen a rapid adoption with the advent of vision-based tactile sensors.

Robotics Graphics

Model Learning for Look-ahead Exploration in Continuous Control

1 code implementation 20 Nov 2018 Arpit Agarwal, Katharina Muelling, Katerina Fragkiadaki

We propose an exploration method that incorporates look-ahead search over basic learnt skills and their dynamics, and uses it for reinforcement learning (RL) of manipulation policies.
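
A minimal sketch of the look-ahead idea: each skill has a learned one-step dynamics model, short skill sequences are scored by rolling those models forward, and only the first skill of the best sequence is executed. The skill set, dynamics models, and goal-distance scoring below are illustrative stand-ins, not the paper's learned models.

    import itertools
    import numpy as np

    # Hypothetical skills with learnt one-step dynamics models (state -> next state).
    skills = {
        "push_left":  lambda s: s + np.array([-0.1, 0.0]),
        "push_right": lambda s: s + np.array([0.1, 0.0]),
        "push_up":    lambda s: s + np.array([0.0, 0.1]),
    }
    goal = np.array([0.3, 0.2])

    def plan_first_skill(state, depth=3):
        best_seq, best_dist = None, float("inf")
        for seq in itertools.product(skills, repeat=depth):   # enumerate skill sequences
            s = state
            for name in seq:
                s = skills[name](s)                           # roll the learnt dynamics forward
            dist = np.linalg.norm(s - goal)                   # score by distance to the goal
            if dist < best_dist:
                best_seq, best_dist = seq, dist
        return best_seq[0]                                    # execute only the first skill

    next_skill = plan_first_skill(np.zeros(2))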

Continuous Control Reinforcement Learning (RL)

Accelerated Spectral Ranking

1 code implementation ICML 2018 Arpit Agarwal, Prathamesh Patil, Shivani Agarwal

In this paper, we design a provably faster spectral ranking algorithm, which we call accelerated spectral ranking (ASR), that is also consistent under the MNL/BTL models.
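
For intuition, here is a minimal sketch of plain spectral ranking from pairwise comparisons (scores read off the stationary distribution of a comparison-based Markov chain), not the accelerated ASR algorithm itself; the win-count matrix is illustrative.

    import numpy as np

    def spectral_scores(win_counts, n_iter=1000):
        """win_counts[i, j] = number of times item i beat item j (zero diagonal)."""
        n = win_counts.shape[0]
        totals = win_counts + win_counts.T                    # comparisons per pair
        with np.errstate(invalid="ignore", divide="ignore"):
            p_hat = np.where(totals > 0, win_counts / totals, 0.0)
        d_max = max((totals[i] > 0).sum() for i in range(n))  # max comparison degree
        P = p_hat.T / d_max                                   # step from i toward items that beat i
        np.fill_diagonal(P, 0.0)
        np.fill_diagonal(P, 1.0 - P.sum(axis=1))              # lazy self-loops keep rows stochastic
        pi = np.full(n, 1.0 / n)
        for _ in range(n_iter):                               # power iteration -> stationary distribution
            pi = pi @ P
        return pi / pi.sum()                                  # higher score = stronger item under BTL/MNL

    wins = np.array([[0, 8, 9],
                     [2, 0, 6],
                     [1, 4, 0]], dtype=float)
    print(spectral_scores(wins))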

Recommendation Systems

Choice Bandits

no code implementations NeurIPS 2020 Arpit Agarwal, Nicholas Johnson, Shivani Agarwal

Here we study a natural generalization, that we term \emph{choice bandits}, where the learner plays a set of up to $k \geq 2$ arms and receives limited relative feedback in the form of a single multiway choice among the pulled arms, drawn from an underlying multiway choice model.
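
A minimal sketch of this feedback model, assuming a multinomial-logit (MNL) choice model; the arm utilities and the pulled set are illustrative.

    import numpy as np

    rng = np.random.default_rng(0)
    utilities = np.array([1.0, 0.2, -0.5, 0.8, 0.0])   # hypothetical latent arm utilities

    def multiway_choice(pulled_arms):
        """Return the single arm chosen among the pulled set, drawn from an MNL model."""
        w = np.exp(utilities[pulled_arms])
        return rng.choice(pulled_arms, p=w / w.sum())

    # One round: the learner plays a set of k = 3 arms and observes one multiway choice.
    feedback = multiway_choice(np.array([0, 3, 4]))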

Machine learning models for prediction of droplet collision outcomes

no code implementations 1 Oct 2021 Arpit Agarwal

Another key question we try to answer in this paper is whether existing knowledge of physics-based models can be exploited to boost the accuracy of the ML classifiers.
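
As one generic illustration of that question (not the paper's models or data), a physics-based prediction can simply be appended as an extra feature for the ML classifier; the features, labels, and physics_model stub below are hypothetical.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(0)
    X = rng.random((200, 3))                  # stand-in collision features
    y = rng.integers(0, 3, size=200)          # stand-in outcome labels (3 classes)

    def physics_model(x):
        """Stub for a physics-based outcome prediction (e.g. a regime-map rule)."""
        return float(x[0] > 0.5) + float(x[1] > 0.5)

    # Hybrid features: raw inputs plus the physics model's predicted outcome.
    X_hybrid = np.hstack([X, np.array([[physics_model(x)] for x in X])])
    clf = RandomForestClassifier(n_estimators=100).fit(X_hybrid, y)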

BIG-bench Machine Learning

Batched Dueling Bandits

no code implementations 22 Feb 2022 Arpit Agarwal, Rohan Ghuge, Viswanath Nagarajan

The $K$-armed dueling bandit problem, where the feedback is in the form of noisy pairwise comparisons, has been widely studied.
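
A minimal sketch of that feedback: each duel between two arms returns a noisy winner drawn from a pairwise preference matrix, and in the batched setting a round of duels is committed to before any outcomes are observed. The matrix and batch below are illustrative.

    import numpy as np

    rng = np.random.default_rng(0)
    # P[i, j] = probability that arm i beats arm j in a duel (P[j, i] = 1 - P[i, j]).
    P = np.array([[0.5, 0.6, 0.7, 0.8],
                  [0.4, 0.5, 0.6, 0.7],
                  [0.3, 0.4, 0.5, 0.6],
                  [0.2, 0.3, 0.4, 0.5]])

    def duel(i, j):
        """Compare arms i and j once; return the index of the noisy winner."""
        return i if rng.random() < P[i, j] else j

    batch = [(0, 1), (2, 3), (0, 3)]              # duels committed to for this batch
    outcomes = [duel(i, j) for i, j in batch]     # results only seen after the whole batch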

Recommendation Systems

A Sharp Memory-Regret Trade-Off for Multi-Pass Streaming Bandits

no code implementations 2 May 2022 Arpit Agarwal, Sanjeev Khanna, Prathamesh Patil

In this paper we study the trade-off between memory and regret when $B$ passes over the stream are allowed, for any $B \geq 1$, and establish tight regret upper and lower bounds for any $B$-pass algorithm.

Sublinear Algorithms for Hierarchical Clustering

no code implementations 15 Jun 2022 Arpit Agarwal, Sanjeev Khanna, Huan Li, Prathamesh Patil

At the heart of our algorithmic results is a view of the objective in terms of cuts in the graph, which allows us to use a relaxed notion of cut sparsifiers to do hierarchical clustering while introducing only a small distortion in the objective function.
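
To make the cut-based view concrete, here is a minimal sketch that evaluates a hierarchy cut by cut, assuming the standard Dasgupta-style cost (each split is charged its cut weight times the size of the subtree); the toy graph and nested-tuple tree format are illustrative, and this is not the paper's sublinear algorithm.

    import itertools

    def leaves(tree):
        return [tree] if isinstance(tree, int) else leaves(tree[0]) + leaves(tree[1])

    def cut_cost(tree, w):
        """Sum over internal nodes of |leaves(node)| times the weight of the cut it induces."""
        if isinstance(tree, int):
            return 0.0
        left, right = leaves(tree[0]), leaves(tree[1])
        cut = sum(w.get((min(u, v), max(u, v)), 0.0)
                  for u, v in itertools.product(left, right))
        return (len(left) + len(right)) * cut + cut_cost(tree[0], w) + cut_cost(tree[1], w)

    # Toy graph on 4 vertices and the hierarchy ((0, 1), (2, 3)).
    w = {(0, 1): 1.0, (2, 3): 1.0, (1, 2): 0.1}
    print(cut_cost(((0, 1), (2, 3)), w))          # 4*0.1 + 2*1.0 + 2*1.0 = 4.4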

Clustering Information Retrieval +1

An Asymptotically Optimal Batched Algorithm for the Dueling Bandit Problem

no code implementations 25 Sep 2022 Arpit Agarwal, Rohan Ghuge, Viswanath Nagarajan

We answer this in the affirmative $\textit{under the Condorcet condition}$, a standard setting of the $K$-armed dueling bandit problem.

Recommendation Systems

Diversified Recommendations for Agents with Adaptive Preferences

no code implementations 20 Sep 2022 Arpit Agarwal, William Brown

For this class, we give an algorithm for the Recommender which obtains $\tilde{O}(T^{3/4})$ regret against all item distributions satisfying two conditions: they are sufficiently diversified, and they are instantaneously realizable at any history by some distribution over menus.

When Can We Track Significant Preference Shifts in Dueling Bandits?

1 code implementation NeurIPS 2023 Joe Suk, Arpit Agarwal

Specifically, we study the recent notion of significant shifts (Suk and Kpotufe, 2022), and ask whether one can design an adaptive algorithm for the dueling problem with $O(\sqrt{K\tilde{L}T})$ dynamic regret, where $\tilde{L}$ is the (unknown) number of significant shifts in preferences.

Information Retrieval Recommendation Systems +1

Online Recommendations for Agents with Discounted Adaptive Preferences

no code implementations 12 Feb 2023 Arpit Agarwal, William Brown

In each round, we show a menu of $k$ items (out of $n$ total) to the agent, who then chooses a single item, and we aim to minimize regret with respect to some $\textit{target set}$ (a subset of the item simplex) for adversarial losses over the agent's choices.
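
A minimal sketch of this interaction protocol with a stand-in agent whose preferences are discounted and reinforced by its own past choices; the losses, menu rule, and agent update below are illustrative placeholders, not the paper's algorithm.

    import numpy as np

    rng = np.random.default_rng(0)
    n, k, T, gamma = 6, 3, 100, 0.9
    prefs = np.ones(n)                                  # agent's adaptive preference weights

    total_loss = 0.0
    for t in range(T):
        losses = rng.random(n)                          # adversarial losses (stub: random)
        menu = rng.choice(n, size=k, replace=False)     # platform shows a k-item menu
        p = prefs[menu] / prefs[menu].sum()             # agent picks in proportion to preferences
        item = rng.choice(menu, p=p)
        total_loss += losses[item]
        prefs *= gamma                                  # discount past preferences ...
        prefs[item] += 1.0                              # ... and reinforce the chosen item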

Semi-Bandit Learning for Monotone Stochastic Optimization

no code implementations 24 Dec 2023 Arpit Agarwal, Rohan Ghuge, Viswanath Nagarajan

Stochastic optimization is a widely used approach for optimization under uncertainty, where uncertain input parameters are modeled by random variables.

Stochastic Optimization

Misalignment, Learning, and Ranking: Harnessing Users' Limited Attention

no code implementations 21 Feb 2024 Arpit Agarwal, Rad Niazadeh, Prathamesh Patil

Each user selects an item by first considering a prefix window of these ranked items and then picking their most preferred item in that window (and the platform observes its payoff for this item).
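
A minimal sketch of this prefix-window choice model; the window size, ranking, and user preferences below are illustrative.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 8
    user_prefs = rng.random(n)                          # user's (hidden) value for each item

    def prefix_window_choice(ranking, window):
        """The user scans only the first `window` ranked items and picks the best of them."""
        considered = ranking[:window]
        return max(considered, key=lambda item: user_prefs[item])

    ranking = list(rng.permutation(n))                  # platform's ranked list of the n items
    chosen = prefix_window_choice(ranking, window=3)    # platform then observes its payoff for `chosen`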

Recommendation Systems

Optimal and Adaptive Non-Stationary Dueling Bandits Under a Generalized Borda Criterion

no code implementations 19 Mar 2024 Joe Suk, Arpit Agarwal

In dueling bandits, the learner receives preference feedback between arms, and the regret of an arm is defined in terms of its suboptimality to a winner arm.
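
For reference, a minimal sketch of the Borda notion behind the title: an arm's Borda score is its average probability of beating a uniformly random other arm, and its Borda regret is the gap to the best such score. The preference matrix below is illustrative (and deliberately has no Condorcet winner).

    import numpy as np

    # P[i, j] = probability that arm i beats arm j (P[i, j] + P[j, i] = 1).
    P = np.array([[0.5, 0.6, 0.2],
                  [0.4, 0.5, 0.9],
                  [0.8, 0.1, 0.5]])
    K = P.shape[0]

    borda = (P.sum(axis=1) - 0.5) / (K - 1)   # average win probability, excluding self-comparisons
    borda_regret = borda.max() - borda        # per-arm gap to the Borda winner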
