Search Results for author: Garvesh Raskutti

Found 26 papers, 2 papers with code

Fast, Distribution-free Predictive Inference for Neural Networks with Coverage Guarantees

1 code implementation · 11 Jun 2023 · Yue Gao, Garvesh Raskutti, Rebecca Willett

This paper introduces a novel, computationally-efficient algorithm for predictive inference (PI) that requires no distributional assumptions on the data and can be computed faster than existing bootstrap-type methods for neural networks.
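For orientation, the standard distribution-free baseline in this area is split conformal prediction. The sketch below uses a toy least-squares fit as a stand-in for the neural network, on synthetic data of my own; it illustrates the generic coverage mechanism, not the paper's faster algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y = 2x + noise.  Any point predictor works; a simple
# least-squares fit stands in for the neural network here.
x = rng.uniform(-1, 1, size=400)
y = 2 * x + rng.normal(scale=0.3, size=400)

# Split the data: fit on one half, calibrate on the other.
x_fit, y_fit = x[:200], y[:200]
x_cal, y_cal = x[200:], y[200:]
slope, intercept = np.polyfit(x_fit, y_fit, 1)

def predict(t):
    return slope * t + intercept

# Calibration residuals give a distribution-free quantile.
alpha = 0.1
scores = np.abs(y_cal - predict(x_cal))
n = len(scores)
q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")

# 90% prediction interval at a new point.
x_new = 0.5
lo, hi = predict(x_new) - q, predict(x_new) + q
```

The interval width `2 * q` is the same at every test point; the paper's interest is in getting such guarantees faster than bootstrap-type alternatives for neural networks.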

Lazy Estimation of Variable Importance for Large Neural Networks

1 code implementation · 19 Jul 2022 · Yue Gao, Abby Stevens, Rebecca Willett, Garvesh Raskutti

Recently, there has been a proliferation of model-agnostic methods to measure variable importance (VI) that analyze the difference in predictive power between a full model trained on all variables and a reduced model that excludes the variable(s) of interest.
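The full-versus-reduced comparison can be sketched with plain least squares on synthetic data (everything below is illustrative; the paper's contribution is precisely avoiding the naive retraining this version does):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: only the first of three features matters.
X = rng.normal(size=(500, 3))
y = 3 * X[:, 0] + rng.normal(scale=0.5, size=500)

def mse_of_fit(X, y):
    """Least-squares fit, in-sample mean squared error."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return np.mean(resid ** 2)

full_err = mse_of_fit(X, y)

# VI of feature j = error of the reduced model (feature j dropped)
# minus error of the full model: a large drop-out cost means the
# variable is important.
vi = [mse_of_fit(np.delete(X, j, axis=1), y) - full_err for j in range(3)]
```

Here `vi[0]` is large (dropping the signal feature costs roughly its explained variance) while `vi[1]` and `vi[2]` are near zero.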

Gaussian Process Inference Using Mini-batch Stochastic Gradient Descent: Convergence Guarantees and Empirical Benefits

no code implementations · 19 Nov 2021 · Hao Chen, Lili Zheng, Raed Al Kontar, Garvesh Raskutti

Stochastic gradient descent (SGD) and its variants have established themselves as the go-to algorithms for large-scale machine learning problems with independent samples due to their generalization performance and intrinsic computational advantage.

Prediction in the presence of response-dependent missing labels

no code implementations · 25 Mar 2021 · Hyebin Song, Garvesh Raskutti, Rebecca Willett

In a variety of settings, limitations of sensing technologies or other sampling mechanisms result in missing labels, where the likelihood of a missing label in the training set is an unknown function of the data.

Stochastic Gradient Descent in Correlated Settings: A Study on Gaussian Processes

no code implementations · NeurIPS 2020 · Hao Chen, Lili Zheng, Raed Al Kontar, Garvesh Raskutti

Stochastic gradient descent (SGD) and its variants have established themselves as the go-to algorithms for large-scale machine learning problems with independent samples due to their generalization performance and intrinsic computational advantage.

Gaussian Processes

A Sharp Blockwise Tensor Perturbation Bound for Orthogonal Iteration

no code implementations · 6 Aug 2020 · Yuetian Luo, Garvesh Raskutti, Ming Yuan, Anru R. Zhang

A rate-matching deterministic lower bound for tensor reconstruction, which demonstrates the optimality of HOOI, is also provided.

Clustering · Denoising

Context-dependent self-exciting point processes: models, methods, and risk bounds in high dimensions

no code implementations · 16 Mar 2020 · Lili Zheng, Garvesh Raskutti, Rebecca Willett, Benjamin Mark

High-dimensional autoregressive point processes model how current events trigger or inhibit future events; for example, activity by one member of a social network can affect the future activity of his or her neighbors.

Point Processes · Time Series +1

ISLET: Fast and Optimal Low-rank Tensor Regression via Importance Sketching

no code implementations · 9 Nov 2019 · Anru Zhang, Yuetian Luo, Garvesh Raskutti, Ming Yuan

In this paper, we develop a novel procedure for low-rank tensor regression, namely Importance Sketching Low-rank Estimation for Tensors (ISLET).

Distributed Computing · Regression

Minimizing Negative Transfer of Knowledge in Multivariate Gaussian Processes: A Scalable and Regularized Approach

no code implementations · 31 Jan 2019 · Raed Kontar, Garvesh Raskutti, Shiyu Zhou

The proposed method has excellent scalability when the number of outputs is large and minimizes the negative transfer of knowledge between uncorrelated outputs.

Gaussian Processes

Estimating Network Structure from Incomplete Event Data

no code implementations · 7 Nov 2018 · Benjamin Mark, Garvesh Raskutti, Rebecca Willett

Multivariate Bernoulli autoregressive (BAR) processes model time series of events in which the likelihood of current events is determined by the times and locations of past events.

Time Series · Time Series Analysis
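A minimal simulation of a multivariate Bernoulli autoregressive process, with a made-up 3-node chain network in which each node raises its neighbor's firing probability one step later (illustrative only; the paper is about estimating such networks from incomplete data, not simulating them):

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Hypothetical network: node 0 excites node 1, node 1 excites node 2.
A = np.array([[0.0, 0.0, 0.0],
              [2.0, 0.0, 0.0],
              [0.0, 2.0, 0.0]])
nu = np.full(3, -2.0)  # baseline log-odds of an event at each node

T = 2000
events = np.zeros((T, 3))
events[0] = rng.random(3) < sigmoid(nu)
for t in range(1, T):
    # The probability of an event now depends on who fired at t-1.
    p = sigmoid(nu + A @ events[t - 1])
    events[t] = rng.random(3) < p
```

In the resulting trace, nodes 1 and 2 fire more often than node 0, since they inherit excitation through the chain.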

Graph-based regularization for regression problems with alignment and highly-correlated designs

no code implementations · 20 Mar 2018 · Yuan Li, Benjamin Mark, Garvesh Raskutti, Rebecca Willett, Hyebin Song, David Neiman

This work considers a high-dimensional regression setting in which a graph governs both correlations among the covariates and the similarity among regression coefficients -- meaning there is alignment between the covariates and regression coefficients.

Model Selection · Regression

Network Estimation from Point Process Data

no code implementations · 13 Feb 2018 · Benjamin Mark, Garvesh Raskutti, Rebecca Willett

Using our general framework, we provide a number of novel theoretical guarantees for high-dimensional self-exciting point processes that reflect the role played by the underlying network structure and long-term memory.

Point Processes

Non-parametric Sparse Additive Auto-regressive Network Models

no code implementations · 23 Jan 2018 · Hao Henry Zhou, Garvesh Raskutti

Using a combination of $\beta$ and $\phi$-mixing properties of Markov chains and empirical process techniques for reproducing kernel Hilbert spaces (RKHSs), we provide upper bounds on mean-squared error in terms of the sparsity $s$, logarithm of the dimension $\log d$, number of time points $T$, and the smoothness of the RKHSs.

Time Series · Time Series Analysis

Learning Quadratic Variance Function (QVF) DAG models via OverDispersion Scoring (ODS)

no code implementations · 28 Apr 2017 · Gunwoong Park, Garvesh Raskutti

We prove that this class of QVF DAG models is identifiable, and introduce a new algorithm, the OverDispersion Scoring (ODS) algorithm, for learning large-scale QVF DAG models.

Causal Inference

Non-Convex Projected Gradient Descent for Generalized Low-Rank Tensor Regression

no code implementations · 30 Nov 2016 · Han Chen, Garvesh Raskutti, Ming Yuan

The two main differences between the convex and non-convex approaches are: (i) from a computational perspective, whether the non-convex projection operator is computable and whether the projection has desirable contraction properties; and (ii) from a statistical upper-bound perspective, the non-convex approach achieves a superior rate for a number of examples.


Inference of High-dimensional Autoregressive Generalized Linear Models

no code implementations · 9 May 2016 · Eric C. Hall, Garvesh Raskutti, Rebecca Willett

For instance, each element of an observation vector could correspond to a different node in a network, and the parameters of an autoregressive model would correspond to the impact of the network structure on the time series evolution.

Time Series · Time Series Analysis +1

Identifiability Assumptions and Algorithm for Directed Graphical Models with Feedback

no code implementations · 14 Feb 2016 · Gunwoong Park, Garvesh Raskutti

Our simulation study supports our theoretical results, showing that the algorithms based on our two new principles generally outperform algorithms based on the faithfulness assumption in terms of selecting the true skeleton for DCG models.

Learning Large-Scale Poisson DAG Models based on OverDispersion Scoring

no code implementations · NeurIPS 2015 · Gunwoong Park, Garvesh Raskutti

In this paper, we address the question of identifiability and learning algorithms for large-scale Poisson Directed Acyclic Graphical (DAG) models.

Statistical and Algorithmic Perspectives on Randomized Sketching for Ordinary Least-Squares (ICML)

no code implementations · 25 May 2015 · Garvesh Raskutti, Michael Mahoney

We then consider the statistical prediction efficiency (PE) and the statistical residual efficiency (RE) of the sketched LS estimator; and we use our framework to provide upper bounds for several types of random projection and random sampling algorithms.

A Statistical Perspective on Randomized Sketching for Ordinary Least-Squares

no code implementations · 23 Jun 2014 · Garvesh Raskutti, Michael Mahoney

Prior results show that, when using sketching matrices such as random projections and leverage-score sampling algorithms, with $p < r \ll n$, the WC error is the same as solving the original problem, up to a small constant.

The Information Geometry of Mirror Descent

no code implementations · 29 Oct 2013 · Garvesh Raskutti, Sayan Mukherjee

Using this equivalence, it follows that (1) mirror descent is the steepest descent direction along the Riemannian manifold of the exponential family; (2) mirror descent with log-likelihood loss applied to parameter estimation in exponential families asymptotically achieves the classical Cramér-Rao lower bound; and (3) natural gradient descent for manifolds corresponding to exponential families can be implemented as a first-order method through mirror descent.
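Point (3) can be illustrated with the textbook special case: mirror descent under the negative-entropy mirror map reduces to multiplicative (exponentiated-gradient) updates on the probability simplex. The objective and step size below are made up for the demo:

```python
import numpy as np

def mirror_descent_simplex(grad, x0, steps=200, eta=0.1):
    """Mirror descent with the negative-entropy mirror map: the update
    is multiplicative in the gradient, so the iterate stays on the
    probability simplex without an explicit Euclidean projection."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x * np.exp(-eta * grad(x))  # gradient step in the dual space
        x = x / x.sum()                 # Bregman projection onto the simplex
    return x

# Minimize the linear objective f(x) = c @ x over the simplex; the
# minimizer puts all mass on the coordinate with the smallest cost.
c = np.array([3.0, 1.0, 2.0])
x_star = mirror_descent_simplex(lambda x: c, np.ones(3) / 3)
```

After 200 steps `x_star` concentrates almost entirely on index 1, the cheapest coordinate, while every iterate remained a valid probability vector.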

Learning directed acyclic graphs based on sparsest permutations

no code implementations · 1 Jul 2013 · Garvesh Raskutti, Caroline Uhler

However, there is only limited work on consistency guarantees for score-based and hybrid algorithms and it has been unclear whether consistency guarantees can be proven under weaker conditions than the faithfulness assumption.

Early stopping and non-parametric regression: An optimal data-dependent stopping rule

no code implementations · 15 Jun 2013 · Garvesh Raskutti, Martin J. Wainwright, Bin Yu

The strategy of early stopping is a regularization technique based on choosing a stopping time for an iterative algorithm.
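A minimal illustration of early stopping as regularization: gradient descent on an over-parameterized least-squares problem, tracking error on a held-out set. Note the paper derives a data-dependent stopping rule that needs no hold-out set; the hold-out rule below is just the familiar stand-in, with all sizes and constants made up:

```python
import numpy as np

rng = np.random.default_rng(3)

# Over-parameterized least squares: more features than samples,
# only the first three features carry signal.
n, d = 60, 80
X = rng.normal(size=(n, d))
beta_true = np.zeros(d)
beta_true[:3] = 1.0
y = X @ beta_true + rng.normal(scale=0.5, size=n)

X_tr, y_tr = X[:40], y[:40]
X_val, y_val = X[40:], y[40:]

eta = 1e-3
beta = np.zeros(d)
best_val, best_beta, best_t = np.inf, beta.copy(), 0
for t in range(2000):
    grad = X_tr.T @ (X_tr @ beta - y_tr) / len(y_tr)
    beta = beta - eta * grad
    val_err = np.mean((X_val @ beta - y_val) ** 2)
    if val_err < best_val:
        # Running gradient descent longer only fits noise; keep the
        # iterate with the best held-out error.
        best_val, best_beta, best_t = val_err, beta.copy(), t
```

The stopping time `best_t` acts like a regularization parameter: earlier iterates are smoother, later ones interpolate the training noise.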


Lower bounds on minimax rates for nonparametric regression with additive sparsity and smoothness

no code implementations · NeurIPS 2009 · Garvesh Raskutti, Bin Yu, Martin J. Wainwright

With components drawn from some distribution $\mathbb{P}$, we determine tight lower bounds on the minimax rate for estimating the regression function with respect to squared $L^2(\mathbb{P})$ error.

