no code implementations • ICML 2020 • Kinjal Basu, Amol Ghoting, Rahul Mazumder, Yao Pan
Experiments on real-world data show that our proposed LP solver, ECLIPSE, can solve problems with $10^{12}$ decision variables -- well beyond the capabilities of current solvers.
1 code implementation • 11 Mar 2024 • Xiang Meng, Wenyu Chen, Riade Benbaki, Rahul Mazumder
In this paper, we propose FALCON, a novel combinatorial-optimization-based framework for network pruning that jointly takes into account model accuracy (fidelity), FLOPs, and sparsity constraints.
no code implementations • 20 Feb 2024 • Brian Liu, Rahul Mazumder
We study the often-overlooked phenomenon, first noted by Breiman (2001), that random forests appear to reduce bias compared to bagging.
no code implementations • 20 Feb 2024 • Brian Liu, Rahul Mazumder
We present FAST, an optimization framework for fast additive segmentation.
no code implementations • 8 Jan 2024 • Zirui Liu, Qingquan Song, Qiang Charles Xiao, Sathiya Keerthi Selvaraj, Rahul Mazumder, Aman Gupta, Xia Hu
This usually results in a trade-off between model accuracy and efficiency.
no code implementations • 28 Oct 2023 • Shibal Ibrahim, Kayhan Behdin, Rahul Mazumder
Skinny Trees lead to superior feature selection compared to many existing toolkits: e.g., in terms of AUC under a $25\%$ feature budget, Skinny Trees outperform LightGBM by $10.2\%$ (up to $37.7\%$) and Random Forests by $3\%$ (up to $12.5\%$).
no code implementations • 5 Sep 2023 • Kayhan Behdin, Ayan Acharya, Aman Gupta, Qingquan Song, Siyu Zhu, Sathiya Keerthi, Rahul Mazumder
Particularly noteworthy is our outlier-aware algorithm's capability to achieve near or sub-3-bit quantization of LLMs with an acceptable drop in accuracy, obviating the need for non-uniform quantization or grouping techniques, improving upon methods such as SpQR by up to two times in terms of perplexity.
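As a point of reference for what sub-3-bit quantization means here, below is a minimal sketch of plain symmetric uniform quantization -- the baseline scheme the entry's outlier-aware algorithm improves on, not the paper's method; all names are illustrative.

```python
import numpy as np

def quantize_uniform(w, bits=3):
    """Symmetric uniform quantization of a weight tensor to `bits` bits.

    Illustrative baseline only -- not the paper's outlier-aware algorithm.
    """
    qmax = 2 ** (bits - 1) - 1                 # e.g. 3 for signed 3-bit
    scale = np.abs(w).max() / qmax             # one scale for the whole tensor
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale                           # dequantized weights
```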
no code implementations • 18 Jul 2023 • Kayhan Behdin, Wenyu Chen, Rahul Mazumder
To solve the MIP, we propose a custom nonlinear branch-and-bound (BnB) framework that solves node relaxations with tailored first-order methods.
1 code implementation • 12 Jun 2023 • Brian Liu, Rahul Mazumder
We present FIRE, Fast Interpretable Rule Extraction, an optimization-based framework to extract a small but useful collection of decision rules from tree ensembles.
1 code implementation • 5 Jun 2023 • Shibal Ibrahim, Wenyu Chen, Hussein Hazimeh, Natalia Ponomareva, Zhe Zhao, Rahul Mazumder
To deal with this challenge, we propose a novel, permutation-based local search method that can complement first-order methods in training any sparse gate, e.g., Hash routing, Top-k, DSelect-k, and COMET.
no code implementations • 4 Jun 2023 • Hanbyul Lee, Rahul Mazumder, Qifan Song, Jean Honorio
Most existing works on provable guarantees for low-rank matrix completion algorithms rely on unrealistic assumptions, such as the matrix entries being sampled randomly or the sampling pattern having a specific structure.
no code implementations • 28 Feb 2023 • Riade Benbaki, Wenyu Chen, Xiang Meng, Hussein Hazimeh, Natalia Ponomareva, Zhe Zhao, Rahul Mazumder
Our approach, CHITA, extends the classical Optimal Brain Surgeon framework and results in significant improvements in speed, memory, and performance over existing optimization-based approaches for network pruning.
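For context, the classical Optimal Brain Surgeon quantities that the entry says CHITA extends are (standard formulas from Hassibi and Stork, not the paper's notation):

```latex
L_q = \frac{w_q^2}{2\,[H^{-1}]_{qq}},
\qquad
\delta w^{*} = -\frac{w_q}{[H^{-1}]_{qq}}\, H^{-1} e_q,
```

where $H$ is the loss Hessian at the trained weights, $e_q$ is the $q$-th standard basis vector, $L_q$ is the saliency used to rank which weight $w_q$ to prune, and $\delta w^{*}$ is the compensating update applied to the remaining weights.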
no code implementations • 23 Feb 2023 • Kayhan Behdin, Rahul Mazumder
As SAM has been numerically successful, recent papers have studied the theoretical aspects of the framework and have shown SAM solutions are indeed flat.
no code implementations • 19 Feb 2023 • Kayhan Behdin, Qingquan Song, Aman Gupta, Sathiya Keerthi, Ayan Acharya, Borja Ocejo, Gregory Dexter, Rajiv Khanna, David Durfee, Rahul Mazumder
Modern deep learning models are over-parameterized, and different optima can result in widely varying generalization performance.
1 code implementation • 16 Dec 2022 • Gabriel Loewinger, Kayhan Behdin, Kenneth T. Kishida, Giovanni Parmigiani, Rahul Mazumder
Allowing the regression coefficients of tasks to have different sparsity patterns (i.e., different supports), we propose a modeling framework for MTL that encourages models to share information across tasks, for a given covariate, through separately 1) shrinking the coefficient supports together, and/or 2) shrinking the coefficient values together.
no code implementations • 7 Dec 2022 • Kayhan Behdin, Qingquan Song, Aman Gupta, David Durfee, Ayan Acharya, Sathiya Keerthi, Rahul Mazumder
To that end, this paper presents a thorough empirical evaluation of mSAM on various tasks and datasets.
no code implementations • 24 Aug 2022 • Brian Hsu, Rahul Mazumder, Preetam Nandy, Kinjal Basu
The impossibility theorem of fairness is a foundational result in the algorithmic fairness literature.
no code implementations • 23 Jun 2022 • Rahul Mazumder, Xiang Meng, Haoyue Wang
Recently there has been significant interest in learning optimal decision trees using various approaches (e.g., based on integer programming or dynamic programming) -- to achieve computational scalability, most of these approaches focus on classification tasks with binary features.
no code implementations • 31 May 2022 • Brian Liu, Rahul Mazumder
We present ForestPrune, a novel optimization framework to post-process tree ensembles by pruning depth layers from individual trees.
no code implementations • 19 May 2022 • Shibal Ibrahim, Hussein Hazimeh, Rahul Mazumder
We therefore propose a novel tensor-based formulation of differentiable trees that allows for efficient vectorization on GPUs.
1 code implementation • 10 Feb 2022 • Hussein Hazimeh, Rahul Mazumder, Tim Nonet
We present L0Learn: an open-source package for sparse linear regression and classification using $\ell_0$ regularization.
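For intuition about what $\ell_0$ regularization enforces, here is a minimal iterative-hard-thresholding sketch of cardinality-constrained least squares. It is not L0Learn's API (the package uses coordinate descent with combinatorial local search); all names are illustrative.

```python
import numpy as np

def iht_l0(X, y, k, n_iter=200):
    """Iterative hard thresholding for min ||y - Xb||^2 s.t. ||b||_0 <= k.

    Illustrative sketch only -- L0Learn itself is far more sophisticated.
    """
    n, p = X.shape
    b = np.zeros(p)
    step = 1.0 / np.linalg.norm(X, 2) ** 2     # 1/L for the quadratic loss
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y)               # gradient of 0.5||y - Xb||^2
        b = b - step * grad
        idx = np.argsort(np.abs(b))[:-k]       # zero out all but k largest
        b[idx] = 0.0
    return b

# toy usage: recover a 3-sparse signal
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
beta = np.zeros(20); beta[:3] = [2.0, -1.5, 1.0]
y = X @ beta + 0.1 * rng.standard_normal(100)
print(np.nonzero(iht_l0(X, y, k=3))[0])
```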
no code implementations • 13 Oct 2021 • Shibal Ibrahim, Natalia Ponomareva, Rahul Mazumder
In this paper, we show that statistical problems with covariance estimation drive the poor performance of H-score -- a common baseline for newer metrics -- and propose a shrinkage-based estimator.
1 code implementation • 19 Sep 2021 • Gabriel Loewinger, Rolando Acosta Nunez, Rahul Mazumder, Giovanni Parmigiani
Importantly, our approach outperforms multi-study stacking and other standard methods in this application.
1 code implementation • 24 Aug 2021 • Shibal Ibrahim, Peter Radchenko, Emanuel Ben-David, Rahul Mazumder
We discuss and interpret findings from our model on the US Census Planning Database.
3 code implementations • NeurIPS 2021 • Hussein Hazimeh, Zhe Zhao, Aakanksha Chowdhery, Maheswaran Sathiamoorthy, Yihua Chen, Rahul Mazumder, Lichan Hong, Ed H. Chi
State-of-the-art MoE models use a trainable sparse gate to select a subset of the experts for each input example.
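For concreteness, below is a minimal sketch of the Top-k softmax gate the entry mentions as the standard trainable sparse gate; it is illustrative NumPy, not the paper's routing method.

```python
import numpy as np

def topk_gate(x, W_g, k=2):
    """Top-k softmax gate for a mixture-of-experts layer.

    x: (d,) input; W_g: (d, n_experts) gating weights. Returns mixture
    weights with exactly k nonzeros. Illustrative sketch of the standard
    sparse gate, not this paper's method.
    """
    logits = x @ W_g                          # one score per expert
    top = np.argsort(logits)[-k:]             # indices of the k best experts
    gates = np.zeros_like(logits)
    z = np.exp(logits[top] - logits[top].max())
    gates[top] = z / z.sum()                  # softmax over selected experts
    return gates
```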
no code implementations • 3 Jun 2021 • Rahul Mazumder, Haoyue Wang
We prove that, under a suitable scaling of the number of mismatched pairs relative to the number of samples and features, and certain assumptions on the problem data, our local search algorithm converges to a nearly optimal solution at a linear rate.
1 code implementation • 14 Apr 2021 • Hussein Hazimeh, Rahul Mazumder, Peter Radchenko
Our algorithmic framework consists of approximate and exact algorithms.
1 code implementation • 8 Apr 2021 • Kayhan Behdin, Rahul Mazumder
We consider the problem of sparse nonnegative matrix factorization (NMF) using archetypal regularization.
1 code implementation • 23 May 2020 • Wenyu Chen, Rahul Mazumder
We present new large-scale algorithms for fitting a subgradient regularized multivariate convex regression function to $n$ samples in $d$ dimensions -- a key problem in shape constrained nonparametric regression with applications in statistics, engineering and the applied sciences.
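The underlying estimator can be written as a quadratic program with $O(n^2)$ subgradient constraints; here is a direct cvxpy sketch of that small-scale formulation (the paper's contribution is algorithms that scale far beyond it; names are illustrative).

```python
import cvxpy as cp
import numpy as np

def convex_regression(X, y):
    """Least-squares convex regression: fit values theta_i and subgradients
    xi_i so the piecewise-affine interpolant is convex. Direct O(n^2)
    formulation -- a sketch, not the paper's large-scale algorithm.
    """
    n, d = X.shape
    theta = cp.Variable(n)
    xi = cp.Variable((n, d))
    cons = [theta[j] >= theta[i] + xi[i] @ (X[j] - X[i])
            for i in range(n) for j in range(n) if i != j]
    prob = cp.Problem(cp.Minimize(cp.sum_squares(y - theta)), cons)
    prob.solve()
    return theta.value, xi.value
```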
2 code implementations • 13 Apr 2020 • Hussein Hazimeh, Rahul Mazumder, Ali Saab
In this work, we present a new exact MIP framework for $\ell_0\ell_2$-regularized regression that can scale to $p \sim 10^7$, achieving speedups of at least $5000$x, compared to state-of-the-art exact methods.
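The entry's MIP can be written in the standard big-M form (a textbook formulation consistent with the abstract; $M$ is an assumed bound on $|\beta_i|$, and $\lambda_0, \lambda_2$ control sparsity and shrinkage):

```latex
\min_{\beta \in \mathbb{R}^p,\; z \in \{0,1\}^p}
  \tfrac{1}{2}\|y - X\beta\|_2^2
  + \lambda_0 \sum_{i=1}^{p} z_i
  + \lambda_2 \|\beta\|_2^2
\quad \text{s.t.} \quad -M z_i \le \beta_i \le M z_i,\; i = 1, \dots, p.
```

Setting $z_i = 0$ forces $\beta_i = 0$, so $\sum_i z_i$ counts the nonzero coefficients.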
2 code implementations • ICML 2020 • Hussein Hazimeh, Natalia Ponomareva, Petros Mol, Zhenyu Tan, Rahul Mazumder
We aim to combine these advantages by introducing a new layer for neural networks, composed of an ensemble of differentiable decision trees (a.k.a. soft trees).
1 code implementation • 17 Jan 2020 • Antoine Dedieu, Hussein Hazimeh, Rahul Mazumder
We aim to bridge this gap in computation times by developing new MIP-based algorithms for $\ell_0$-regularized classification.
1 code implementation • 18 Aug 2019 • Rahul Mazumder, Stephen Wright, Andrew Zheng
We consider a class of linear-programming based estimators in reconstructing a sparse signal from linear measurements.
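One classical member of this LP-based family is basis pursuit, $\min \|\beta\|_1$ s.t. $X\beta = y$. Below is a minimal sketch via scipy's linprog, using the standard split $\beta = u - v$ with $u, v \ge 0$ (illustrative, not necessarily the paper's estimator).

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(X, y):
    """Basis pursuit as an LP: min ||b||_1 subject to X b = y.

    With b = u - v and u, v >= 0, the objective is sum(u) + sum(v).
    """
    n, p = X.shape
    c = np.ones(2 * p)                        # objective: sum(u) + sum(v)
    A_eq = np.hstack([X, -X])                 # encodes X(u - v) = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=[(0, None)] * (2 * p))
    u, v = res.x[:p], res.x[p:]
    return u - v
```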
1 code implementation • 5 Feb 2019 • Hussein Hazimeh, Rahul Mazumder
In addition, we introduce a specialized active-set strategy with gradient screening for avoiding costly gradient computations.
2 code implementations • 6 Jan 2019 • Antoine Dedieu, Rahul Mazumder, Haoyue Wang
The linear Support Vector Machine (SVM) is a classic classification technique in machine learning.
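As background for this entry, a textbook sketch of the linear SVM via subgradient descent on the regularized hinge loss -- not the paper's large-scale solver; names and hyperparameters are illustrative.

```python
import numpy as np

def linear_svm_sgd(X, y, lam=0.1, epochs=200, lr=0.01):
    """Subgradient descent on the l2-regularized hinge loss:
    min_w (1/n) sum_i max(0, 1 - y_i w^T x_i) + (lam/2) ||w||^2,
    with labels y_i in {-1, +1}. Textbook sketch only.
    """
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(epochs):
        margins = y * (X @ w)
        active = margins < 1                  # points violating the margin
        grad = lam * w - (y[active] @ X[active]) / n
        w -= lr * grad
    return w
```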
1 code implementation • 24 Oct 2018 • Haihao Lu, Rahul Mazumder
Gradient Boosting Machine (GBM), introduced by Friedman, is a powerful supervised learning algorithm that is very widely used in practice -- it routinely features as a leading algorithm in machine learning competitions such as Kaggle and the KDDCup.
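To make the procedure concrete, a minimal squared-loss GBM loop: each round fits a shallow tree to the current residuals (the negative gradient) and adds it with a small learning rate. A textbook sketch of Friedman's procedure, not this paper's randomized variant.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gbm_fit(X, y, n_rounds=100, lr=0.1):
    """Gradient boosting for squared loss with depth-2 regression trees."""
    pred = np.full(len(y), y.mean())          # start from the mean prediction
    trees = []
    for _ in range(n_rounds):
        residual = y - pred                   # negative gradient of 0.5(y - f)^2
        tree = DecisionTreeRegressor(max_depth=2).fit(X, residual)
        pred += lr * tree.predict(X)          # shrunken additive update
        trees.append(tree)
    return y.mean(), trees
```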
no code implementations • 20 Oct 2018 • Robert M. Freund, Paul Grigas, Rahul Mazumder
When the training data is non-separable, we show that the degree of non-separability naturally enters the analysis and informs the properties and convergence guarantees of two standard first-order methods: steepest descent (for any given norm) and stochastic gradient descent.
1 code implementation • 5 Mar 2018 • Hussein Hazimeh, Rahul Mazumder
In spite of the usefulness of $L_0$-based estimators and generic MIO solvers, there is a steep computational price to pay when compared to popular sparse learning algorithms (e.g., based on $L_1$ regularization).
no code implementations • 4 Mar 2018 • Antoine Dedieu, Rahul Mazumder, Zhen Zhu, Hossein Vahabi
In this work we present a novel framework inspired by hierarchical Bayesian modeling to predict, at the moment of login, the amount of time a user will spend in the streaming service.
no code implementations • 24 Jan 2018 • Rahul Mazumder, Diego F. Saldana, Haolei Weng
Our contributions herein enhance our prior work on nuclear-norm-regularized problems for matrix completion (Mazumder et al., 2010) by incorporating a continuum of nonconvex penalty functions between the convex nuclear norm and the nonconvex rank function.
no code implementations • 18 Jan 2018 • Koulik Khamaru, Rahul Mazumder
Factor analysis, a classical multivariate statistical technique, is widely used as a fundamental tool for dimensionality reduction in statistics, econometrics, and data science.
1 code implementation • 15 Aug 2017 • Dimitris Bertsimas, Martin S. Copenhaver, Rahul Mazumder
Nonconvex penalty methods for sparse modeling in linear regression have been a topic of fervent interest in recent years.
1 code implementation • 10 Aug 2017 • Rahul Mazumder, Peter Radchenko, Antoine Dedieu
We conduct an extensive theoretical analysis of the predictive properties of the proposed approach and provide justification for its superior predictive performance relative to best subset selection when the noise level is high.
no code implementations • 6 Nov 2015 • Robert M. Freund, Paul Grigas, Rahul Mazumder
Motivated principally by the low-rank matrix completion problem, we present an extension of the Frank-Wolfe method that is designed to induce near-optimal solutions on low-dimensional faces of the feasible region.
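For reference, the base method this entry extends: vanilla Frank-Wolfe for matrix completion over a nuclear-norm ball, where each linear-minimization step is a rank-one update (which is why the iterates stay low-rank). A sketch under the standard setup, with illustrative names.

```python
import numpy as np

def frank_wolfe_completion(M_obs, mask, delta, n_iter=100):
    """Frank-Wolfe for min 0.5||P_Omega(Z - M)||_F^2 over {||Z||_* <= delta}.

    mask is a 0/1 array of observed entries. Sketch of the vanilla method,
    not the paper's in-face extension.
    """
    Z = np.zeros_like(M_obs)
    for t in range(n_iter):
        G = mask * (Z - M_obs)                    # gradient on observed entries
        U, s, Vt = np.linalg.svd(G)               # only the top pair is needed
        S = -delta * np.outer(U[:, 0], Vt[0])     # argmin_{||S||_* <= delta} <G, S>
        gamma = 2.0 / (t + 2.0)                   # standard FW step size
        Z = (1 - gamma) * Z + gamma * S
    return Z
```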
no code implementations • 8 Aug 2015 • Rahul Mazumder, Peter Radchenko
We propose new discrete first-order methods, which when paired with state-of-the-art MILO solvers, lead to good solutions for the Discrete Dantzig Selector problem for a given computational budget.
no code implementations • 11 Jul 2015 • Dimitris Bertsimas, Angela King, Rahul Mazumder
In the last twenty-five years (1990-2014), algorithmic advances in integer optimization, combined with hardware improvements, have resulted in an astonishing 200-billion-factor speedup in solving Mixed Integer Optimization (MIO) problems.
no code implementations • 16 May 2015 • Robert M. Freund, Paul Grigas, Rahul Mazumder
Furthermore, we show that these new algorithms for the Lasso may also be interpreted as the same master algorithm (subgradient descent), applied to a regularized version of the maximum absolute correlation loss function.
5 code implementations • 9 Oct 2014 • Trevor Hastie, Rahul Mazumder, Jason Lee, Reza Zadeh
The matrix-completion problem has attracted a lot of attention, largely as a result of the celebrated Netflix competition.
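This paper accelerates the Soft-Impute algorithm of Mazumder et al. (2010) via alternating least squares; below is a minimal sketch of the original iteration it speeds up (illustrative, using a full SVD for brevity).

```python
import numpy as np

def soft_impute(M_obs, mask, lam, n_iter=100):
    """Soft-Impute: fill the missing entries with the current estimate,
    then soft-threshold the singular values. mask is a 0/1 array of
    observed entries; lam is the nuclear-norm regularization weight.
    """
    Z = np.zeros_like(M_obs)
    for _ in range(n_iter):
        filled = mask * M_obs + (1 - mask) * Z    # impute missing entries
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        s = np.maximum(s - lam, 0.0)              # soft-threshold the spectrum
        Z = (U * s) @ Vt
    return Z
```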
no code implementations • 20 Aug 2013 • William Fithian, Rahul Mazumder
We propose a general framework for reduced-rank modeling of matrix-valued data.
no code implementations • 4 Jul 2013 • Robert M. Freund, Paul Grigas, Rahul Mazumder
Boosting methods are highly popular and effective supervised learning procedures that combine weak learners into a single accurate model with good statistical performance.