no code implementations • 12 Apr 2024 • Nived Rajaraman, Jiantao Jiao, Kannan Ramchandran
In this paper, we investigate tokenization from a theoretical point of view by studying the behavior of transformers on simple data generating processes.
no code implementations • 12 Feb 2023 • Nived Rajaraman, Yanjun Han, Jiantao Jiao, Kannan Ramchandran
We consider the sequential decision-making problem where the mean outcome is a non-linear function of the chosen action.
no code implementations • 29 Jan 2023 • Dong Yin, Sridhar Thiagarajan, Nevena Lazic, Nived Rajaraman, Botao Hao, Csaba Szepesvari
One useful property of simulators is that it is typically easy to reset the environment to a previously observed state.
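The reset property described above can be sketched as a simulator whose full state is checkpointed and restored; the class and method names below are hypothetical, a minimal illustration rather than the paper's implementation.

```python
import copy

class ResettableSim:
    """Toy simulator whose state can be checkpointed and later restored,
    allowing a reset to any previously observed state."""
    def __init__(self, state=0):
        self.state = state

    def step(self, action):
        # Toy dynamics: the action is simply added to the state.
        self.state += action
        return self.state

    def checkpoint(self):
        # Snapshot the full simulator state.
        return copy.deepcopy(self.state)

    def restore(self, saved):
        # Reset the environment to a previously observed state.
        self.state = copy.deepcopy(saved)
```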
no code implementations • 5 Oct 2022 • Amirali Aghazadeh, Nived Rajaraman, Tony Tu, Kannan Ramchandran
Data-driven machine learning models are being increasingly employed in several important inference problems in biology, chemistry, and physics which require learning over combinatorial spaces.
1 code implementation • 30 May 2022 • Gokul Swamy, Nived Rajaraman, Matthew Peng, Sanjiban Choudhury, J. Andrew Bagnell, Zhiwei Steven Wu, Jiantao Jiao, Kannan Ramchandran
In the tabular setting or with linear function approximation, our meta theorem shows that the performance gap incurred by our approach achieves the optimal $\widetilde{O}\left(\min\left(H^{3/2}/N,\, H/\sqrt{N}\right)\right)$ dependency, under significantly weaker assumptions compared to prior work.
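As a quick sanity check (a derivation of ours, not a claim from the abstract), one can see which term of the min is active by comparing the two rates:

```latex
\frac{H^{3/2}}{N} \le \frac{H}{\sqrt{N}}
\iff H^{3/2}\sqrt{N} \le H N
\iff H^{1/2} \le N^{1/2}
\iff N \ge H,
```

so once the number of expert trajectories $N$ exceeds the horizon $H$, the faster $H^{3/2}/N$ rate dominates; below that threshold the bound behaves like $H/\sqrt{N}$.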
no code implementations • NeurIPS 2021 • Nived Rajaraman, Yanjun Han, Lin Yang, Jingbo Liu, Jiantao Jiao, Kannan Ramchandran
In contrast, when the MDP transition structure is known to the learner, as is the case with simulators, we demonstrate that the performance of an optimal algorithm, Mimic-MD (Rajaraman et al., 2020), extended to the function approximation setting differs fundamentally from the tabular setting.
no code implementations • 12 Jun 2021 • Fnu Devvrit, Nived Rajaraman, Pranjal Awasthi
In this setting, the learner has access to a dataset $X \in \mathbb{R}^{(n_1+n_2) \times d}$ which is composed of $n_1$ unlabelled examples that an algorithm can actively query, and $n_2$ examples labelled a priori.
no code implementations • 25 Feb 2021 • Nived Rajaraman, Yanjun Han, Lin F. Yang, Kannan Ramchandran, Jiantao Jiao
We establish an upper bound of $O(|\mathcal{S}|H^{3/2}/N)$ on the suboptimality of the Mimic-MD algorithm of Rajaraman et al. (2020), which we prove to be computationally efficient.
no code implementations • 3 Feb 2021 • Prafulla Chandra, Andrew Thangaraj, Nived Rajaraman
In this work, we study the convergence of the GT estimator for the missing stationary mass (i.e., the total stationary probability of missing symbols) of Markov samples on an alphabet $\mathcal{X}$ with stationary distribution $[\pi_x : x \in \mathcal{X}]$ and transition probability matrix (t.p.m.)
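For the classical i.i.d. case, the Good-Turing estimate of the missing mass is simply the fraction of samples whose symbol occurs exactly once; a minimal sketch (function name is ours, and this is the plain GT estimator, not the Markov-chain analysis of the paper):

```python
from collections import Counter

def good_turing_missing_mass(samples):
    """Good-Turing estimate of the total probability of unseen symbols:
    (number of samples whose symbol appears exactly once) / (sample size)."""
    counts = Counter(samples)
    n1 = sum(1 for c in counts.values() if c == 1)  # singleton symbols
    return n1 / len(samples)
```

For example, on the sample `a a b b c` the only singleton is `c`, giving an estimated missing mass of 1/5.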
no code implementations • 23 Sep 2020 • Swanand Kadhe, Nived Rajaraman, O. Ozan Koyluoglu, Kannan Ramchandran
In this paper, we propose a secure aggregation protocol, FastSecAgg, that is efficient in terms of computation and communication, and robust to client dropouts.
no code implementations • NeurIPS 2020 • Nived Rajaraman, Lin F. Yang, Jiantao Jiao, Kannan Ramchandran
Here, we show that the policy which mimics the expert whenever possible is in expectation $\lesssim \frac{|\mathcal{S}| H^2 \log (N)}{N}$ suboptimal compared to the value of the expert, even when the expert follows an arbitrary stochastic policy.
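The "mimic the expert whenever possible" policy can be sketched as a lookup over the demonstration dataset, replaying the expert's recorded action in any visited state and acting arbitrarily (here, uniformly at random) elsewhere; all names are hypothetical, and this toy version keeps one action per state rather than the full empirical conditional a stochastic expert would warrant.

```python
import random

def mimic_policy(expert_data, fallback_actions):
    """expert_data: list of (state, action) pairs from expert demonstrations.
    Returns a policy that replays the expert's action in any state seen in
    the demonstrations and falls back to a random action otherwise."""
    seen = {s: a for (s, a) in expert_data}  # last recorded action per state

    def policy(state):
        if state in seen:
            return seen[state]              # mimic the expert where possible
        return random.choice(fallback_actions)  # arbitrary action elsewhere

    return policy
```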