no code implementations • 13 Nov 2014 • Qingqing Huang, Rong Ge, Sham Kakade, Munther Dahleh
Consider a stationary discrete random process with alphabet size d, which is assumed to be the output process of an unknown stationary Hidden Markov Model (HMM).
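Such an output process is easy to simulate, which gives a concrete picture of the setup. In this sketch the hidden-state count k and the random transition and emission matrices are illustrative placeholders, not anything specified by the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
k, d, n = 3, 4, 1000                                 # hidden states, alphabet size, sample length

T = rng.random((k, k)); T /= T.sum(axis=1, keepdims=True)   # hidden-state transition matrix
O = rng.random((k, d)); O /= O.sum(axis=1, keepdims=True)   # emission probabilities per hidden state

# stationary distribution: left eigenvector of T for eigenvalue 1
w, V = np.linalg.eig(T.T)
pi = np.real(V[:, np.argmax(np.real(w))])
pi /= pi.sum()

# sample the observed process, starting the chain from its stationary law
h = rng.choice(k, p=pi)
obs = np.empty(n, dtype=int)
for t in range(n):
    obs[t] = rng.choice(d, p=O[h])                   # emit a symbol from the current state
    h = rng.choice(k, p=T[h])                        # advance the hidden chain
```

Starting from the stationary distribution makes the observed symbol process itself stationary, which is the regime the paper studies.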
no code implementations • NeurIPS 2014 • Shreya Saxena, Munther Dahleh
Neuronal encoding models range from the detailed biophysically based Hodgkin-Huxley model to the statistical linear time-invariant model, which specifies firing rates in terms of the extrinsic signal.
no code implementations • 3 May 2021 • Chengzhuo Ni, Yaqi Duan, Munther Dahleh, Anru Zhang, Mengdi Wang
The transition kernel of a continuous-state-action Markov decision process (MDP) admits a natural tensor structure.
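One concrete way to see this structure: discretizing the state and action spaces yields a 3-way array P[s, a, s'], and a low-rank model expresses each transition distribution as a mixture of a few base distributions. The rank r and random factors below are illustrative assumptions, not the paper's construction:

```python
import numpy as np

rng = np.random.default_rng(0)
S, A, r = 6, 3, 2                               # discretized sizes; rank r is an assumption

U = rng.random((S, r))
V = rng.random((A, r))
W = rng.random((S, r))
W /= W.sum(axis=0, keepdims=True)               # each column of W: a distribution over next states

M = np.einsum('ik,jk->ijk', U, V)               # raw mixing weights, shape (S, A, r)
M /= M.sum(axis=2, keepdims=True)               # normalize weights per (state, action) pair
P = np.einsum('ijk,lk->ijl', M, W)              # transition tensor, shape (S, A, S)
```

Every slice P[s, a] is then a valid probability distribution, and the unfolding of P along the next-state mode has rank at most r — the kind of structure low-rank estimators can exploit.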
no code implementations • 30 Jun 2021 • Luis Lopez, Alvaro Gonzalez-Castellanos, David Pozo, Mardavij Roozbehani, Munther Dahleh
To unlock such capabilities, it is essential to understand the aggregate flexibility that can be harvested from the large population of new technologies in distribution grids.
no code implementations • 30 Sep 2021 • Anish Agarwal, Munther Dahleh, Devavrat Shah, Dennis Shen
In particular, we establish entry-wise, i.e., max-norm, finite-sample consistency and asymptotic normality results for matrix completion with MNAR data.
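For intuition, the simplest version of this setup can be sketched with an inverse-propensity-weighted, truncated-SVD estimator. The rank-1 ground truth, the particular propensity model, and the estimator itself are illustrative stand-ins, not the paper's (more general) MNAR model:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
u, v = rng.random(n), rng.random(n)
M = np.outer(u, v)                                   # rank-1 ground-truth matrix

p = 0.8 + 0.15 * np.outer(u, v)                      # observation probability depends on the
mask = rng.random((n, n)) < p                        #   entry's value: missing-not-at-random

Y = np.where(mask, M, 0.0) / p                       # inverse-propensity-weighted fill-in
U, s, Vt = np.linalg.svd(Y)
M_hat = s[0] * np.outer(U[:, 0], Vt[0])              # rank-1 truncation of the weighted matrix
max_err = np.abs(M_hat - M).max()                    # entry-wise (max-norm) error
```

The weighting makes Y an unbiased estimate of M entry by entry, and max_err is exactly the entry-wise error that the paper's finite-sample results control.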
no code implementations • NeurIPS 2023 • Abdullah Alomar, Munther Dahleh, Sean Mann, Devavrat Shah
However, despite their pervasiveness, multi-stage learning algorithms involving both deterministic and stationary components have lacked a theoretical underpinning in the literature.
no code implementations • 4 Oct 2023 • Maryann Rui, Thibaut Horel, Munther Dahleh
Modern data sets, such as those in healthcare and e-commerce, are often derived from many individuals or systems but have insufficient data from each source alone to separately estimate individual, often high-dimensional, model parameters.
1 code implementation • 19 Dec 2023 • Meshal Alharbi, Mardavij Roozbehani, Munther Dahleh
In the setting of finite episodic Markov decision processes with $S$ states, $A$ actions, and episode length $H$, we present an optimistic Q-learning algorithm that achieves $\tilde{\mathcal{O}}(\text{Poly}(H)\sqrt{T})$ regret under perfect knowledge of $f$, where $T$ is the total number of interactions with the system.
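The flavor of such an algorithm can be illustrated with a minimal optimistic Q-learning loop: initialize Q at the maximum return H and add a count-based exploration bonus that decays with visits. The environment interface, learning-rate schedule, and bonus constant below are assumptions for illustration; this does not reproduce the paper's algorithm (in particular, the role of the known function $f$ is omitted):

```python
import numpy as np

def optimistic_q_learning(reset, step, S, A, H, episodes, c=0.1):
    """Sketch of optimistic Q-learning for a finite episodic MDP.

    `reset()` returns an initial state; `step(s, a)` returns (next_state, reward).
    The learning rate (H+1)/(H+t) and bonus c*sqrt(H/t) are illustrative choices.
    """
    Q = np.full((H, S, A), float(H))        # optimistic initialization at the max return
    N = np.zeros((H, S, A), dtype=int)      # visit counts per (step, state, action)
    for _ in range(episodes):
        s = reset()
        for h in range(H):
            a = int(np.argmax(Q[h, s]))     # act greedily w.r.t. optimistic estimates
            s_next, r = step(s, a)
            N[h, s, a] += 1
            t = N[h, s, a]
            alpha = (H + 1) / (H + t)       # step-size schedule
            bonus = c * np.sqrt(H / t)      # decaying exploration bonus
            v_next = min(H, Q[h + 1, s_next].max()) if h + 1 < H else 0.0
            Q[h, s, a] += alpha * (r + bonus + v_next - Q[h, s, a])
            s = s_next
    return Q

# Toy check: one state, two actions, action a yields reward a at every step.
Q = optimistic_q_learning(lambda: 0, lambda s, a: (0, float(a)),
                          S=1, A=2, H=2, episodes=200)
```

Once the bonus has decayed, the greedy action at each step settles on the rewarding one, which is the mechanism behind the $\sqrt{T}$-type regret guarantees.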