no code implementations • 12 Apr 2024 • Nived Rajaraman, Jiantao Jiao, Kannan Ramchandran
In this paper, we investigate tokenization from a theoretical point of view by studying the behavior of transformers on simple data generating processes.
no code implementations • 12 Feb 2023 • Nived Rajaraman, Yanjun Han, Jiantao Jiao, Kannan Ramchandran
We consider the sequential decision-making problem where the mean outcome is a non-linear function of the chosen action.
no code implementations • 29 Jan 2023 • Dong Yin, Sridhar Thiagarajan, Nevena Lazic, Nived Rajaraman, Botao Hao, Csaba Szepesvari
One useful property of simulators is that it is typically easy to reset the environment to a previously observed state.
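The reset property described above can be sketched as a simulator whose full state is checkpointed and restored; the class and method names below are hypothetical, a minimal illustration rather than the paper's implementation.

```python
import copy

class ResettableSim:
    """Toy simulator whose state can be checkpointed and later restored,
    allowing a reset to any previously observed state."""
    def __init__(self, state=0):
        self.state = state

    def step(self, action):
        # Toy dynamics: the action is simply added to the state.
        self.state += action
        return self.state

    def checkpoint(self):
        # Snapshot the full simulator state.
        return copy.deepcopy(self.state)

    def restore(self, saved):
        # Reset the environment to a previously observed state.
        self.state = copy.deepcopy(saved)
```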
no code implementations • 5 Oct 2022 • Amirali Aghazadeh, Nived Rajaraman, Tony Tu, Kannan Ramchandran
Data-driven machine learning models are being increasingly employed in several important inference problems in biology, chemistry, and physics which require learning over combinatorial spaces.
1 code implementation • 30 May 2022 • Gokul Swamy, Nived Rajaraman, Matthew Peng, Sanjiban Choudhury, J. Andrew Bagnell, Zhiwei Steven Wu, Jiantao Jiao, Kannan Ramchandran
In the tabular setting or with linear function approximation, our meta theorem shows that the performance gap incurred by our approach achieves the optimal $\widetilde{O}\left(\min\left(H^{3/2}/N,\, H/\sqrt{N}\right)\right)$ dependency, under significantly weaker assumptions compared to prior work.
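As a quick sanity check (a derivation of ours, not a claim from the abstract), one can see which term of the min is active by comparing the two rates:

```latex
\frac{H^{3/2}}{N} \le \frac{H}{\sqrt{N}}
\iff H^{3/2}\sqrt{N} \le H N
\iff H^{1/2} \le N^{1/2}
\iff N \ge H,
```

so once the number of expert trajectories $N$ exceeds the horizon $H$, the faster $H^{3/2}/N$ rate dominates; below that threshold the bound behaves like $H/\sqrt{N}$.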
no code implementations • NeurIPS 2021 • Nived Rajaraman, Yanjun Han, Lin Yang, Jingbo Liu, Jiantao Jiao, Kannan Ramchandran
In contrast, when the MDP transition structure is known to the learner, as is the case with simulators, we demonstrate that the performance of an optimal algorithm, Mimic-MD (Rajaraman et al., 2020), extended to the function approximation setting differs fundamentally from the tabular setting.
no code implementations • 12 Jun 2021 • Fnu Devvrit, Nived Rajaraman, Pranjal Awasthi
In this setting, the learner has access to a dataset $X \in \mathbb{R}^{(n_1+n_2) \times d}$ which is composed of $n_1$ unlabelled examples that an algorithm can actively query, and $n_2$ examples labelled a priori.
no code implementations • 25 Feb 2021 • Nived Rajaraman, Yanjun Han, Lin F. Yang, Kannan Ramchandran, Jiantao Jiao
We establish an upper bound of $O(|\mathcal{S}|H^{3/2}/N)$ on the suboptimality of the Mimic-MD algorithm of Rajaraman et al. (2020), which we prove to be computationally efficient.
no code implementations • 3 Feb 2021 • Prafulla Chandra, Andrew Thangaraj, Nived Rajaraman
In this work, we study the convergence of the GT estimator for the missing stationary mass (i.e., the total stationary probability of missing symbols) of Markov samples on an alphabet $\mathcal{X}$ with stationary distribution $[\pi_x : x \in \mathcal{X}]$ and transition probability matrix (t.p.m.)
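For the classical i.i.d. case, the Good-Turing estimate of the missing mass is simply the fraction of samples whose symbol occurs exactly once; a minimal sketch (function name is ours, and this is the plain GT estimator, not the Markov-chain analysis of the paper):

```python
from collections import Counter

def good_turing_missing_mass(samples):
    """Good-Turing estimate of the total probability of unseen symbols:
    (number of samples whose symbol appears exactly once) / (sample size)."""
    counts = Counter(samples)
    n1 = sum(1 for c in counts.values() if c == 1)  # singleton symbols
    return n1 / len(samples)
```

For example, on the sample `a a b b c` the only singleton is `c`, giving an estimated missing mass of 1/5.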
no code implementations • 23 Sep 2020 • Swanand Kadhe, Nived Rajaraman, O. Ozan Koyluoglu, Kannan Ramchandran
In this paper, we propose a secure aggregation protocol, FastSecAgg, that is efficient in terms of computation and communication, and robust to client dropouts.
no code implementations • NeurIPS 2020 • Nived Rajaraman, Lin F. Yang, Jiantao Jiao, Kannan Ramchandran
Here, we show that the policy which mimics the expert whenever possible is in expectation $\lesssim \frac{|\mathcal{S}| H^2 \log (N)}{N}$ suboptimal compared to the value of the expert, even when the expert follows an arbitrary stochastic policy.
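The "mimic the expert whenever possible" policy can be sketched as a lookup over the demonstration dataset, replaying the expert's recorded action in any visited state and acting arbitrarily (here, uniformly at random) elsewhere; all names are hypothetical, and this toy version keeps one action per state rather than the full empirical conditional a stochastic expert would warrant.

```python
import random

def mimic_policy(expert_data, fallback_actions):
    """expert_data: list of (state, action) pairs from expert demonstrations.
    Returns a policy that replays the expert's action in any state seen in
    the demonstrations and falls back to a random action otherwise."""
    seen = {s: a for (s, a) in expert_data}  # last recorded action per state

    def policy(state):
        if state in seen:
            return seen[state]              # mimic the expert where possible
        return random.choice(fallback_actions)  # arbitrary action elsewhere

    return policy
```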