no code implementations • ICML 2020 • Kinjal Basu, Amol Ghoting, Rahul Mazumder, Yao Pan
Experiments on real-world data show that our proposed LP solver, ECLIPSE, can solve problems with $10^{12}$ decision variables -- well beyond the capabilities of current solvers.
1 code implementation • 11 Mar 2024 • Xiang Meng, Wenyu Chen, Riade Benbaki, Rahul Mazumder
In this paper, we propose FALCON, a novel combinatorial-optimization-based framework for network pruning that jointly takes into account model accuracy (fidelity), FLOPs, and sparsity constraints.
no code implementations • 20 Feb 2024 • Brian Liu, Rahul Mazumder
We study the often-overlooked phenomenon, first noted by Breiman (2001), that random forests appear to reduce bias compared to bagging.
no code implementations • 20 Feb 2024 • Brian Liu, Rahul Mazumder
We present FAST, an optimization framework for fast additive segmentation.
no code implementations • 8 Jan 2024 • Zirui Liu, Qingquan Song, Qiang Charles Xiao, Sathiya Keerthi Selvaraj, Rahul Mazumder, Aman Gupta, Xia Hu
This usually results in a trade-off between model accuracy and efficiency.
no code implementations • 28 Oct 2023 • Shibal Ibrahim, Kayhan Behdin, Rahul Mazumder
Skinny Trees lead to superior feature selection compared to many existing toolkits: e.g., in terms of AUC under a $25\%$ feature budget, Skinny Trees outperform LightGBM by $10.2\%$ (up to $37.7\%$) and Random Forests by $3\%$ (up to $12.5\%$).
no code implementations • 5 Sep 2023 • Kayhan Behdin, Ayan Acharya, Aman Gupta, Qingquan Song, Siyu Zhu, Sathiya Keerthi, Rahul Mazumder
Particularly noteworthy is our outlier-aware algorithm's capability to achieve near or sub-3-bit quantization of LLMs with an acceptable drop in accuracy, obviating the need for non-uniform quantization or grouping techniques, improving upon methods such as SpQR by up to two times in terms of perplexity.
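As a point of reference for what sub-3-bit quantization means here, below is a minimal sketch of plain symmetric uniform quantization -- the baseline scheme the entry's outlier-aware algorithm improves on, not the paper's method; all names are illustrative.

```python
import numpy as np

def quantize_uniform(w, bits=3):
    """Symmetric uniform quantization of a weight tensor to `bits` bits.

    Illustrative baseline only -- not the paper's outlier-aware algorithm.
    """
    qmax = 2 ** (bits - 1) - 1                 # e.g. 3 for signed 3-bit
    scale = np.abs(w).max() / qmax             # one scale for the whole tensor
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale                           # dequantized weights
```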
no code implementations • 18 Jul 2023 • Kayhan Behdin, Wenyu Chen, Rahul Mazumder
To solve the MIP, we propose a custom nonlinear branch-and-bound (BnB) framework that solves node relaxations with tailored first-order methods.
1 code implementation • 12 Jun 2023 • Brian Liu, Rahul Mazumder
We present FIRE, Fast Interpretable Rule Extraction, an optimization-based framework to extract a small but useful collection of decision rules from tree ensembles.
1 code implementation • 5 Jun 2023 • Shibal Ibrahim, Wenyu Chen, Hussein Hazimeh, Natalia Ponomareva, Zhe Zhao, Rahul Mazumder
To deal with this challenge, we propose a novel, permutation-based local search method that can complement first-order methods in training any sparse gate, e.g., Hash routing, Top-k, DSelect-k, and COMET.
no code implementations • 4 Jun 2023 • Hanbyul Lee, Rahul Mazumder, Qifan Song, Jean Honorio
Most existing works on provable guarantees for low-rank matrix completion algorithms rely on unrealistic assumptions, such as the matrix entries being sampled randomly or the sampling pattern having a specific structure.
no code implementations • 28 Feb 2023 • Riade Benbaki, Wenyu Chen, Xiang Meng, Hussein Hazimeh, Natalia Ponomareva, Zhe Zhao, Rahul Mazumder
Our approach, CHITA, extends the classical Optimal Brain Surgeon framework and results in significant improvements in speed, memory, and performance over existing optimization-based approaches for network pruning.
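For context, the classical Optimal Brain Surgeon quantities that the entry says CHITA extends are (standard formulas from Hassibi and Stork, not the paper's notation):

```latex
L_q = \frac{w_q^2}{2\,[H^{-1}]_{qq}},
\qquad
\delta w^{*} = -\frac{w_q}{[H^{-1}]_{qq}}\, H^{-1} e_q,
```

where $H$ is the loss Hessian at the trained weights, $e_q$ is the $q$-th standard basis vector, $L_q$ is the saliency used to rank which weight $w_q$ to prune, and $\delta w^{*}$ is the compensating update applied to the remaining weights.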
no code implementations • 23 Feb 2023 • Kayhan Behdin, Rahul Mazumder
As SAM has been numerically successful, recent papers have studied the theoretical aspects of the framework and have shown SAM solutions are indeed flat.
no code implementations • 19 Feb 2023 • Kayhan Behdin, Qingquan Song, Aman Gupta, Sathiya Keerthi, Ayan Acharya, Borja Ocejo, Gregory Dexter, Rajiv Khanna, David Durfee, Rahul Mazumder
Modern deep learning models are over-parameterized, and different optima can result in widely varying generalization performance.
1 code implementation • 16 Dec 2022 • Gabriel Loewinger, Kayhan Behdin, Kenneth T. Kishida, Giovanni Parmigiani, Rahul Mazumder
Allowing the regression coefficients of tasks to have different sparsity patterns (i.e., different supports), we propose a modeling framework for MTL that encourages models to share information across tasks, for a given covariate, through separately 1) shrinking the coefficient supports together, and/or 2) shrinking the coefficient values together.
no code implementations • 7 Dec 2022 • Kayhan Behdin, Qingquan Song, Aman Gupta, David Durfee, Ayan Acharya, Sathiya Keerthi, Rahul Mazumder
To that end, this paper presents a thorough empirical evaluation of mSAM on various tasks and datasets.
no code implementations • 24 Aug 2022 • Brian Hsu, Rahul Mazumder, Preetam Nandy, Kinjal Basu
The impossibility theorem of fairness is a foundational result in the algorithmic fairness literature.
no code implementations • 23 Jun 2022 • Rahul Mazumder, Xiang Meng, Haoyue Wang
Recently there has been significant interest in learning optimal decision trees using various approaches (e.g., based on integer programming or dynamic programming) -- to achieve computational scalability, most of these approaches focus on classification tasks with binary features.
no code implementations • 31 May 2022 • Brian Liu, Rahul Mazumder
We present ForestPrune, a novel optimization framework to post-process tree ensembles by pruning depth layers from individual trees.
no code implementations • 19 May 2022 • Shibal Ibrahim, Hussein Hazimeh, Rahul Mazumder
We therefore propose a novel tensor-based formulation of differentiable trees that allows for efficient vectorization on GPUs.
1 code implementation • 10 Feb 2022 • Hussein Hazimeh, Rahul Mazumder, Tim Nonet
We present L0Learn: an open-source package for sparse linear regression and classification using $\ell_0$ regularization.
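For intuition about what $\ell_0$ regularization enforces, here is a minimal iterative-hard-thresholding sketch of cardinality-constrained least squares. It is not L0Learn's API (the package uses coordinate descent with combinatorial local search); all names are illustrative.

```python
import numpy as np

def iht_l0(X, y, k, n_iter=200):
    """Iterative hard thresholding for min ||y - Xb||^2 s.t. ||b||_0 <= k.

    Illustrative sketch only -- L0Learn itself is far more sophisticated.
    """
    n, p = X.shape
    b = np.zeros(p)
    step = 1.0 / np.linalg.norm(X, 2) ** 2     # 1/L for the quadratic loss
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y)               # gradient of 0.5||y - Xb||^2
        b = b - step * grad
        idx = np.argsort(np.abs(b))[:-k]       # zero out all but k largest
        b[idx] = 0.0
    return b

# toy usage: recover a 3-sparse signal
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
beta = np.zeros(20); beta[:3] = [2.0, -1.5, 1.0]
y = X @ beta + 0.1 * rng.standard_normal(100)
print(np.nonzero(iht_l0(X, y, k=3))[0])
```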
no code implementations • 13 Oct 2021 • Shibal Ibrahim, Natalia Ponomareva, Rahul Mazumder
In this paper, we show that statistical problems with covariance estimation drive the poor performance of H-score -- a common baseline for newer metrics -- and propose a shrinkage-based estimator.
1 code implementation • 19 Sep 2021 • Gabriel Loewinger, Rolando Acosta Nunez, Rahul Mazumder, Giovanni Parmigiani
Importantly, our approach outperforms multi-study stacking and other standard methods in this application.
1 code implementation • 24 Aug 2021 • Shibal Ibrahim, Peter Radchenko, Emanuel Ben-David, Rahul Mazumder
We discuss and interpret findings from our model on the US Census Planning Database.
3 code implementations • NeurIPS 2021 • Hussein Hazimeh, Zhe Zhao, Aakanksha Chowdhery, Maheswaran Sathiamoorthy, Yihua Chen, Rahul Mazumder, Lichan Hong, Ed H. Chi
State-of-the-art MoE models use a trainable sparse gate to select a subset of the experts for each input example.
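For concreteness, below is a minimal sketch of the Top-k softmax gate the entry mentions as the standard trainable sparse gate; it is illustrative NumPy, not the paper's routing method.

```python
import numpy as np

def topk_gate(x, W_g, k=2):
    """Top-k softmax gate for a mixture-of-experts layer.

    x: (d,) input; W_g: (d, n_experts) gating weights. Returns mixture
    weights with exactly k nonzeros. Illustrative sketch of the standard
    sparse gate, not this paper's method.
    """
    logits = x @ W_g                          # one score per expert
    top = np.argsort(logits)[-k:]             # indices of the k best experts
    gates = np.zeros_like(logits)
    z = np.exp(logits[top] - logits[top].max())
    gates[top] = z / z.sum()                  # softmax over selected experts
    return gates
```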
no code implementations • 3 Jun 2021 • Rahul Mazumder, Haoyue Wang
We prove that, under a suitable scaling of the number of mismatched pairs relative to the number of samples and features, and certain assumptions on the problem data, our local search algorithm converges to a nearly optimal solution at a linear rate.
1 code implementation • 14 Apr 2021 • Hussein Hazimeh, Rahul Mazumder, Peter Radchenko
Our algorithmic framework consists of approximate and exact algorithms.
1 code implementation • 8 Apr 2021 • Kayhan Behdin, Rahul Mazumder
We consider the problem of sparse nonnegative matrix factorization (NMF) using archetypal regularization.
1 code implementation • 23 May 2020 • Wenyu Chen, Rahul Mazumder
We present new large-scale algorithms for fitting a subgradient regularized multivariate convex regression function to $n$ samples in $d$ dimensions -- a key problem in shape constrained nonparametric regression with applications in statistics, engineering and the applied sciences.
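The underlying estimator can be written as a quadratic program with $O(n^2)$ subgradient constraints; here is a direct cvxpy sketch of that small-scale formulation (the paper's contribution is algorithms that scale far beyond it; names are illustrative).

```python
import cvxpy as cp
import numpy as np

def convex_regression(X, y):
    """Least-squares convex regression: fit values theta_i and subgradients
    xi_i so the piecewise-affine interpolant is convex. Direct O(n^2)
    formulation -- a sketch, not the paper's large-scale algorithm.
    """
    n, d = X.shape
    theta = cp.Variable(n)
    xi = cp.Variable((n, d))
    cons = [theta[j] >= theta[i] + xi[i] @ (X[j] - X[i])
            for i in range(n) for j in range(n) if i != j]
    prob = cp.Problem(cp.Minimize(cp.sum_squares(y - theta)), cons)
    prob.solve()
    return theta.value, xi.value
```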
2 code implementations • 13 Apr 2020 • Hussein Hazimeh, Rahul Mazumder, Ali Saab
In this work, we present a new exact MIP framework for $\ell_0\ell_2$-regularized regression that can scale to $p \sim 10^7$, achieving speedups of at least $5000$x, compared to state-of-the-art exact methods.
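The entry's MIP can be written in the standard big-M form (a textbook formulation consistent with the abstract; $M$ is an assumed bound on $|\beta_i|$, and $\lambda_0, \lambda_2$ control sparsity and shrinkage):

```latex
\min_{\beta \in \mathbb{R}^p,\; z \in \{0,1\}^p}
  \tfrac{1}{2}\|y - X\beta\|_2^2
  + \lambda_0 \sum_{i=1}^{p} z_i
  + \lambda_2 \|\beta\|_2^2
\quad \text{s.t.} \quad -M z_i \le \beta_i \le M z_i,\; i = 1, \dots, p.
```

Setting $z_i = 0$ forces $\beta_i = 0$, so $\sum_i z_i$ counts the nonzero coefficients.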
2 code implementations • ICML 2020 • Hussein Hazimeh, Natalia Ponomareva, Petros Mol, Zhenyu Tan, Rahul Mazumder
We aim to combine these advantages by introducing a new layer for neural networks, composed of an ensemble of differentiable decision trees (a.k.a. soft trees).
1 code implementation • 17 Jan 2020 • Antoine Dedieu, Hussein Hazimeh, Rahul Mazumder
We aim to bridge this gap in computation times by developing new MIP-based algorithms for $\ell_0$-regularized classification.
1 code implementation • 18 Aug 2019 • Rahul Mazumder, Stephen Wright, Andrew Zheng
We consider a class of linear-programming based estimators in reconstructing a sparse signal from linear measurements.
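One classical member of this LP-based family is basis pursuit, $\min \|\beta\|_1$ s.t. $X\beta = y$. Below is a minimal sketch via scipy's linprog, using the standard split $\beta = u - v$ with $u, v \ge 0$ (illustrative, not necessarily the paper's estimator).

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(X, y):
    """Basis pursuit as an LP: min ||b||_1 subject to X b = y.

    With b = u - v and u, v >= 0, the objective is sum(u) + sum(v).
    """
    n, p = X.shape
    c = np.ones(2 * p)                        # objective: sum(u) + sum(v)
    A_eq = np.hstack([X, -X])                 # encodes X(u - v) = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=[(0, None)] * (2 * p))
    u, v = res.x[:p], res.x[p:]
    return u - v
```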
1 code implementation • 5 Feb 2019 • Hussein Hazimeh, Rahul Mazumder
In addition, we introduce a specialized active-set strategy with gradient screening for avoiding costly gradient computations.
2 code implementations • 6 Jan 2019 • Antoine Dedieu, Rahul Mazumder, Haoyue Wang
The linear Support Vector Machine (SVM) is a classic classification technique in machine learning.
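As background for this entry, a textbook sketch of the linear SVM via subgradient descent on the regularized hinge loss -- not the paper's large-scale solver; names and hyperparameters are illustrative.

```python
import numpy as np

def linear_svm_sgd(X, y, lam=0.1, epochs=200, lr=0.01):
    """Subgradient descent on the l2-regularized hinge loss:
    min_w (1/n) sum_i max(0, 1 - y_i w^T x_i) + (lam/2) ||w||^2,
    with labels y_i in {-1, +1}. Textbook sketch only.
    """
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(epochs):
        margins = y * (X @ w)
        active = margins < 1                  # points violating the margin
        grad = lam * w - (y[active] @ X[active]) / n
        w -= lr * grad
    return w
```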
1 code implementation • 24 Oct 2018 • Haihao Lu, Rahul Mazumder
Gradient Boosting Machine (GBM), introduced by Friedman, is a powerful supervised learning algorithm that is very widely used in practice -- it routinely features as a leading algorithm in machine learning competitions such as Kaggle and the KDDCup.
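To make the procedure concrete, a minimal squared-loss GBM loop: each round fits a shallow tree to the current residuals (the negative gradient) and adds it with a small learning rate. A textbook sketch of Friedman's procedure, not this paper's randomized variant.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gbm_fit(X, y, n_rounds=100, lr=0.1):
    """Gradient boosting for squared loss with depth-2 regression trees."""
    pred = np.full(len(y), y.mean())          # start from the mean prediction
    trees = []
    for _ in range(n_rounds):
        residual = y - pred                   # negative gradient of 0.5(y - f)^2
        tree = DecisionTreeRegressor(max_depth=2).fit(X, residual)
        pred += lr * tree.predict(X)          # shrunken additive update
        trees.append(tree)
    return y.mean(), trees
```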
no code implementations • 20 Oct 2018 • Robert M. Freund, Paul Grigas, Rahul Mazumder
When the training data is non-separable, we show that the degree of non-separability naturally enters the analysis and informs the properties and convergence guarantees of two standard first-order methods: steepest descent (for any given norm) and stochastic gradient descent.
1 code implementation • 5 Mar 2018 • Hussein Hazimeh, Rahul Mazumder
In spite of the usefulness of $L_0$-based estimators and generic MIO solvers, there is a steep computational price to pay when compared to popular sparse learning algorithms (e.g., based on $L_1$ regularization).
no code implementations • 4 Mar 2018 • Antoine Dedieu, Rahul Mazumder, Zhen Zhu, Hossein Vahabi
In this work we present a novel framework inspired by hierarchical Bayesian modeling to predict, at the moment of login, the amount of time a user will spend in the streaming service.
no code implementations • 24 Jan 2018 • Rahul Mazumder, Diego F. Saldana, Haolei Weng
Our contributions herein enhance our prior work on nuclear-norm-regularized problems for matrix completion (Mazumder et al., 2010) by incorporating a continuum of nonconvex penalty functions between the convex nuclear norm and the nonconvex rank function.
no code implementations • 18 Jan 2018 • Koulik Khamaru, Rahul Mazumder
Factor analysis, a classical multivariate statistical technique, is widely used as a fundamental tool for dimensionality reduction in statistics, econometrics, and data science.
1 code implementation • 15 Aug 2017 • Dimitris Bertsimas, Martin S. Copenhaver, Rahul Mazumder
Nonconvex penalty methods for sparse modeling in linear regression have been a topic of fervent interest in recent years.
1 code implementation • 10 Aug 2017 • Rahul Mazumder, Peter Radchenko, Antoine Dedieu
We conduct an extensive theoretical analysis of the predictive properties of the proposed approach and provide justification for its superior predictive performance relative to best subset selection when the noise level is high.
no code implementations • 6 Nov 2015 • Robert M. Freund, Paul Grigas, Rahul Mazumder
Motivated principally by the low-rank matrix completion problem, we present an extension of the Frank-Wolfe method that is designed to induce near-optimal solutions on low-dimensional faces of the feasible region.
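For reference, the base method this entry extends: vanilla Frank-Wolfe for matrix completion over a nuclear-norm ball, where each linear-minimization step is a rank-one update (which is why the iterates stay low-rank). A sketch under the standard setup, with illustrative names.

```python
import numpy as np

def frank_wolfe_completion(M_obs, mask, delta, n_iter=100):
    """Frank-Wolfe for min 0.5||P_Omega(Z - M)||_F^2 over {||Z||_* <= delta}.

    mask is a 0/1 array of observed entries. Sketch of the vanilla method,
    not the paper's in-face extension.
    """
    Z = np.zeros_like(M_obs)
    for t in range(n_iter):
        G = mask * (Z - M_obs)                    # gradient on observed entries
        U, s, Vt = np.linalg.svd(G)               # only the top pair is needed
        S = -delta * np.outer(U[:, 0], Vt[0])     # argmin_{||S||_* <= delta} <G, S>
        gamma = 2.0 / (t + 2.0)                   # standard FW step size
        Z = (1 - gamma) * Z + gamma * S
    return Z
```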
no code implementations • 8 Aug 2015 • Rahul Mazumder, Peter Radchenko
We propose new discrete first-order methods, which when paired with state-of-the-art MILO solvers, lead to good solutions for the Discrete Dantzig Selector problem for a given computational budget.
no code implementations • 11 Jul 2015 • Dimitris Bertsimas, Angela King, Rahul Mazumder
In the last twenty-five years (1990-2014), algorithmic advances in integer optimization, combined with hardware improvements, have resulted in an astonishing 200-billion-factor speedup in solving Mixed Integer Optimization (MIO) problems.
no code implementations • 16 May 2015 • Robert M. Freund, Paul Grigas, Rahul Mazumder
Furthermore, we show that these new algorithms for the Lasso may also be interpreted as the same master algorithm (subgradient descent), applied to a regularized version of the maximum absolute correlation loss function.
5 code implementations • 9 Oct 2014 • Trevor Hastie, Rahul Mazumder, Jason Lee, Reza Zadeh
The matrix-completion problem has attracted a lot of attention, largely as a result of the celebrated Netflix competition.
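This paper accelerates the Soft-Impute algorithm of Mazumder et al. (2010) via alternating least squares; below is a minimal sketch of the original iteration it speeds up (illustrative, using a full SVD for brevity).

```python
import numpy as np

def soft_impute(M_obs, mask, lam, n_iter=100):
    """Soft-Impute: fill the missing entries with the current estimate,
    then soft-threshold the singular values. mask is a 0/1 array of
    observed entries; lam is the nuclear-norm regularization weight.
    """
    Z = np.zeros_like(M_obs)
    for _ in range(n_iter):
        filled = mask * M_obs + (1 - mask) * Z    # impute missing entries
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        s = np.maximum(s - lam, 0.0)              # soft-threshold the spectrum
        Z = (U * s) @ Vt
    return Z
```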
no code implementations • 20 Aug 2013 • William Fithian, Rahul Mazumder
We propose a general framework for reduced-rank modeling of matrix-valued data.
no code implementations • 4 Jul 2013 • Robert M. Freund, Paul Grigas, Rahul Mazumder
Boosting methods are highly popular and effective supervised learning procedures that combine weak learners into a single accurate model with good statistical performance.