Search Results for author: Madeleine Udell

Found 54 papers, 18 papers with code

Interpretable Survival Analysis for Heart Failure Risk Prediction

no code implementations 24 Oct 2023 Mike Van Ness, Tomas Bosschieter, Natasha Din, Andrew Ambrosy, Alexander Sandhu, Madeleine Udell

Specifically, we use an improved version of survival stacking to transform a survival analysis problem to a classification problem, ControlBurn to perform feature selection, and Explainable Boosting Machines to generate interpretable predictions.

feature selection Survival Analysis

OptiMUS: Optimization Modeling Using MIP Solvers and large language models

1 code implementation 9 Oct 2023 Ali AhmadiTeshnizi, Wenzhi Gao, Madeleine Udell

Optimization problems are pervasive across various sectors, from manufacturing and distribution to healthcare.

Language Modelling Large Language Model

PROMISE: Preconditioned Stochastic Optimization Methods by Incorporating Scalable Curvature Estimates

no code implementations 5 Sep 2023 Zachary Frangella, Pratik Rathore, Shipu Zhao, Madeleine Udell

This paper introduces PROMISE ($\textbf{Pr}$econditioned Stochastic $\textbf{O}$ptimization $\textbf{M}$ethods by $\textbf{I}$ncorporating $\textbf{S}$calable Curvature $\textbf{E}$stimates), a suite of sketching-based preconditioned stochastic gradient algorithms for solving large-scale convex optimization problems arising in machine learning.

Stochastic Optimization

Exploring the MIT Mathematics and EECS Curriculum Using Large Language Models

no code implementations 15 Jun 2023 Sarah J. Zhang, Samuel Florin, Ariel N. Lee, Eamon Niknafs, Andrei Marginean, Annie Wang, Keith Tyser, Zad Chin, Yann Hicke, Nikhil Singh, Madeleine Udell, Yoon Kim, Tonio Buonassisi, Armando Solar-Lezama, Iddo Drori

We curate a comprehensive dataset of 4,550 questions and solutions from problem sets, midterm exams, and final exams across all MIT Mathematics and Electrical Engineering and Computer Science (EECS) courses required for obtaining a degree.

Electrical Engineering Few-Shot Learning +3

SketchySGD: Reliable Stochastic Optimization via Randomized Curvature Estimates

no code implementations 16 Nov 2022 Zachary Frangella, Pratik Rathore, Shipu Zhao, Madeleine Udell

Numerical experiments on both ridge and logistic regression problems with dense and sparse data show that SketchySGD equipped with its default hyperparameters can achieve comparable or better results than popular stochastic gradient methods, even when they have been tuned to yield their best performance.

regression Stochastic Optimization

The Missing Indicator Method: From Low to High Dimensions

1 code implementation 16 Nov 2022 Mike Van Ness, Tomas M. Bosschieter, Roberto Halpin-Gregorio, Madeleine Udell

In this paper, we show empirically and theoretically that MIM improves performance for informative missing values, and we prove that MIM does not hurt linear models asymptotically for uninformative missing values.

Imputation Vocal Bursts Intensity Prediction

ControlBurn: Nonlinear Feature Selection with Sparse Tree Ensembles

1 code implementation 8 Jul 2022 Brian Liu, Miaolan Xie, Haoyue Yang, Madeleine Udell

ControlBurn is a Python package to construct feature-sparse tree ensembles that support nonlinear feature selection and interpretable machine learning.

Additive models feature selection +1

TabNAS: Rejection Sampling for Neural Architecture Search on Tabular Datasets

1 code implementation 15 Apr 2022 Chengrun Yang, Gabriel Bender, Hanxiao Liu, Pieter-Jan Kindermans, Madeleine Udell, Yifeng Lu, Quoc Le, Da Huang

The best neural architecture for a given machine learning problem depends on many factors: not only the complexity and structure of the dataset, but also on resource constraints including latency, compute, energy consumption, etc.

Image Retrieval Neural Architecture Search +1

Data-Efficient and Interpretable Tabular Anomaly Detection

no code implementations 3 Mar 2022 Chun-Hao Chang, Jinsung Yoon, Sercan Arik, Madeleine Udell, Tomas Pfister

In addition, the proposed framework, DIAD, can incorporate a small amount of labeled data to further boost anomaly detection performances in semi-supervised settings.

Additive models Anomaly Detection

Towards Group Robustness in the presence of Partial Group Labels

no code implementations 10 Jan 2022 Vishnu Suresh Lokhande, Kihyuk Sohn, Jinsung Yoon, Madeleine Udell, Chen-Yu Lee, Tomas Pfister

Such a requirement is impractical in situations where the data labeling efforts for minority or rare groups are significantly laborious or where the individuals comprising the dataset choose to conceal sensitive information.

Invariant Learning with Partial Group Labels

no code implementations 29 Sep 2021 Vishnu Suresh Lokhande, Kihyuk Sohn, Jinsung Yoon, Madeleine Udell, Chen-Yu Lee, Tomas Pfister

Such a requirement is impractical in situations where the data labelling efforts for minority or rare groups are significantly laborious or where the individuals comprising the dataset choose to conceal sensitive information.

CDF Normalization for Controlling the Distribution of Hidden Nodes

no code implementations NeurIPS Workshop ICBINB 2021 Mike Van Ness, Madeleine Udell

Batch Normalization (BN) is a normalization method for deep neural networks that has been shown to accelerate training.

Can we globally optimize cross-validation loss? Quasiconvexity in ridge regression

no code implementations NeurIPS 2021 William T. Stephenson, Zachary Frangella, Madeleine Udell, Tamara Broderick

In the present paper, we show that, in the case of ridge regression, the CV loss may fail to be quasiconvex and thus may have multiple local optima.


ControlBurn: Feature Selection by Sparse Forests

1 code implementation 1 Jul 2021 Brian Liu, Miaolan Xie, Madeleine Udell

Like the linear LASSO, ControlBurn assigns all the feature importance of a correlated group of features to a single feature.

Feature Importance feature selection

Privileged Zero-Shot AutoML

no code implementations 25 Jun 2021 Nikhil Singh, Brandon Kates, Jeff Mentch, Anant Kharkar, Madeleine Udell, Iddo Drori

This work improves the quality of automated machine learning (AutoML) systems by using dataset and function descriptions while significantly decreasing computation time from minutes to milliseconds by using a zero-shot approach.

AutoML BIG-bench Machine Learning +1

Tensor Random Projection for Low Memory Dimension Reduction

no code implementations 30 Apr 2021 Yiming Sun, Yang Guo, Joel A. Tropp, Madeleine Udell

The TRP map is formed as the Khatri-Rao product of several smaller random projections, and is compatible with any base random projection including sparse maps, which enable dimension reduction with very low query cost and no floating point operations.

Dimensionality Reduction

Real-Time AutoML

no code implementations 1 Jan 2021 Iddo Drori, Brandon Kates, Anant Kharkar, Lu Liu, Qiang Ma, Jonah Deykin, Nihar Sidhu, Madeleine Udell

We train a graph neural network in which each node represents a dataset to predict the best machine learning pipeline for a new test dataset.

AutoML BIG-bench Machine Learning +1

Euclidean-Norm-Induced Schatten-p Quasi-Norm Regularization for Low-Rank Tensor Completion and Tensor Robust Principal Component Analysis

no code implementations 7 Dec 2020 Jicong Fan, Lijun Ding, Chengrun Yang, Zhao Zhang, Madeleine Udell

The theorems show that a relatively sharper regularizer leads to a tighter error bound, which is consistent with our numerical results.

Impact of Accuracy on Model Interpretations

no code implementations 17 Nov 2020 Brian Liu, Madeleine Udell

Model interpretations are often used in practice to extract real world insights from machine learning models.

GalaxyTSP: A New Billion-Node Benchmark for TSP

no code implementations NeurIPS Workshop LMCA 2020 Iddo Drori, Brandon J Kates, William R. Sickinger, Anant Girish Kharkar, Brenda Dietrich, Avi Shporer, Madeleine Udell

We approximate a Traveling Salesman Problem (TSP) three orders of magnitude larger than the largest known benchmark, increasing the number of nodes from millions to billions.

Scheduling Traveling Salesman Problem

An Information-Theoretic Approach to Persistent Environment Monitoring Through Low Rank Model Based Planning and Prediction

no code implementations 2 Sep 2020 Elizabeth A. Ricci, Madeleine Udell, Ross A. Knepper

We combine a low rank model of a target attribute with an information-maximizing path planner to predict the state of the attribute throughout a region.

Approximate Cross-Validation with Low-Rank Data in High Dimensions

no code implementations NeurIPS 2020 William T. Stephenson, Madeleine Udell, Tamara Broderick

Our second key insight is that, in the presence of ALR data, error in existing ACV methods roughly grows with the (approximate, low) rank rather than with the (full, high) dimension.

Vocal Bursts Intensity Prediction

$k$FW: A Frank-Wolfe style algorithm with stronger subproblem oracles

no code implementations 29 Jun 2020 Lijun Ding, Jicong Fan, Madeleine Udell

This paper proposes a new variant of Frank-Wolfe (FW), called $k$FW.

Matrix Completion with Quantified Uncertainty through Low Rank Gaussian Copula

2 code implementations NeurIPS 2020 Yuxuan Zhao, Madeleine Udell

The time required to fit the model scales linearly with the number of rows and the number of columns in the dataset.

Imputation Matrix Completion +1

Efficient AutoML Pipeline Search with Matrix and Tensor Factorization

1 code implementation 7 Jun 2020 Chengrun Yang, Jicong Fan, Ziyang Wu, Madeleine Udell

Data scientists seeking a good supervised learning model on a new dataset have many choices to make: they must preprocess the data, select features, possibly reduce the dimension, select an estimation algorithm, and choose hyperparameters for each of these pipeline components.


Robust Non-Linear Matrix Factorization for Dictionary Learning, Denoising, and Clustering

no code implementations 4 May 2020 Jicong Fan, Chengrun Yang, Madeleine Udell

RNLMF constructs a dictionary for the data space by factoring a kernelized feature space; a noisy matrix can then be decomposed as the sum of a sparse noise matrix and a clean data matrix that lies in a low dimensional nonlinear manifold.

Clustering Denoising +2

On the simplicity and conditioning of low rank semidefinite programs

no code implementations 25 Feb 2020 Lijun Ding, Madeleine Udell

It is more challenging to show that an approximate solution to the SDP formulated with noisy problem data acceptably solves the original problem; arguments are usually ad hoc for each problem setting, and can be complex.

Matrix Completion Stochastic Block Model

Online high rank matrix completion

no code implementations CVPR 2019 Jicong Fan, Madeleine Udell

Recent advances in matrix completion enable data imputation in full-rank matrices by exploiting low dimensional (nonlinear) latent structure.

Imputation Matrix Completion +1

Polynomial Matrix Completion for Missing Data Imputation and Transductive Learning

no code implementations 15 Dec 2019 Jicong Fan, Yuqian Zhang, Madeleine Udell

This paper develops new methods to recover the missing entries of a high-rank or even full-rank matrix when the intrinsic dimension of the data is low compared to the ambient dimension.

Clustering Imputation +2

Factor Group-Sparse Regularization for Efficient Low-Rank Matrix Recovery

no code implementations NeurIPS 2019 Jicong Fan, Lijun Ding, Yudong Chen, Madeleine Udell

Compared to the max norm and the factored formulation of the nuclear norm, factor group-sparse regularizers are more efficient, accurate, and robust to the initial guess of rank.

Low-Rank Matrix Completion

AutoML using Metadata Language Embeddings

2 code implementations 8 Oct 2019 Iddo Drori, Lu Liu, Yi Nian, Sharath C. Koorathota, Jie S. Li, Antonio Khalil Moretti, Juliana Freire, Madeleine Udell

We use these embeddings in a neural architecture to learn the distance between best-performing pipelines.


"Why Should You Trust My Explanation?" Understanding Uncertainty in LIME Explanations

no code implementations 29 Apr 2019 Yu-jia Zhang, Kuangyan Song, Yiming Sun, Sarah Tan, Madeleine Udell

Methods for interpreting machine learning black-box models increase the outcomes' transparency and in turn generate insight into the reliability and fairness of the algorithms.

Fairness General Classification +3

Low-Rank Tucker Approximation of a Tensor From Streaming Data

2 code implementations 24 Apr 2019 Yiming Sun, Yang Guo, Charlene Luo, Joel Tropp, Madeleine Udell

This paper describes a new algorithm for computing a low-Tucker-rank approximation of a tensor.

An Optimal-Storage Approach to Semidefinite Programming using Approximate Complementarity

no code implementations 9 Feb 2019 Lijun Ding, Alp Yurtsever, Volkan Cevher, Joel A. Tropp, Madeleine Udell

This paper develops a new storage-optimal algorithm that provably solves generic semidefinite programs (SDPs) in standard form.

Fairness Under Unawareness: Assessing Disparity When Protected Class Is Unobserved

1 code implementation 27 Nov 2018 Jiahao Chen, Nathan Kallus, Xiaojie Mao, Geoffry Svacha, Madeleine Udell

We also propose an alternative weighted estimator that uses soft classification, and show that its bias arises simply from the conditional covariance of the outcome with the true class membership.

Decision Making Fairness +1

Frank-Wolfe Style Algorithms for Large Scale Optimization

no code implementations 15 Aug 2018 Lijun Ding, Madeleine Udell

We introduce a few variants on Frank-Wolfe style algorithms suitable for large scale optimization.

OBOE: Collaborative Filtering for AutoML Model Selection

1 code implementation 9 Aug 2018 Chengrun Yang, Yuji Akimoto, Dae Won Kim, Madeleine Udell

Algorithm selection and hyperparameter tuning remain two of the most challenging tasks in machine learning.

Active Learning AutoML +5

Fixed-Rank Approximation of a Positive-Semidefinite Matrix from Streaming Data

no code implementations NeurIPS 2017 Joel A. Tropp, Alp Yurtsever, Madeleine Udell, Volkan Cevher

Several important applications, such as streaming PCA and semidefinite programming, involve a large-scale positive-semidefinite (psd) matrix that is presented as a sequence of linear updates.

Why are Big Data Matrices Approximately Low Rank?

no code implementations 21 May 2017 Madeleine Udell, Alex Townsend

Here, we explain the effectiveness of low rank models in data science by considering a simple generative model for these matrices: we suppose that each row or column is associated to a (possibly high dimensional) bounded latent variable, and entries of the matrix are generated by applying a piecewise analytic function to these latent variables.

Recommendation Systems Topic Models

The Sound of APALM Clapping: Faster Nonsmooth Nonconvex Optimization with Stochastic Asynchronous PALM

no code implementations NeurIPS 2016 Damek Davis, Brent Edmunds, Madeleine Udell

We introduce the Stochastic Asynchronous Proximal Alternating Linearized Minimization (SAPALM) method, a block coordinate stochastic proximal-gradient method for solving nonconvex, nonsmooth optimization problems.

Dynamic Assortment Personalization in High Dimensions

no code implementations 18 Oct 2016 Nathan Kallus, Madeleine Udell

In the dynamic setting, we show that structure-aware dynamic assortment personalization can have regret that is an order of magnitude smaller than structure-ignorant approaches.

Management Vocal Bursts Intensity Prediction

Disciplined Multi-Convex Programming

3 code implementations 12 Sep 2016 Xinyue Shen, Steven Diamond, Madeleine Udell, Yuantao Gu, Stephen Boyd

A multi-convex optimization problem is one in which the variables can be partitioned into sets over which the problem is convex when the other variables are fixed.

Optimization and Control

Practical sketching algorithms for low-rank matrix approximation

no code implementations 31 Aug 2016 Joel A. Tropp, Alp Yurtsever, Madeleine Udell, Volkan Cevher

This paper describes a suite of algorithms for constructing low-rank approximations of an input matrix from a random linear image of the matrix, called a sketch.

Revealed Preference at Scale: Learning Personalized Preferences from Assortment Choices

no code implementations 17 Sep 2015 Nathan Kallus, Madeleine Udell

In our model, the preferences of each customer or segment follow a separate parametric choice model, but the underlying structure of these parameters over all the models has low dimension.

Convex Optimization in Julia

1 code implementation 17 Oct 2014 Madeleine Udell, Karanveer Mohan, David Zeng, Jenny Hong, Steven Diamond, Stephen Boyd

This paper describes Convex, a convex optimization modeling framework in Julia.

Generalized Low Rank Models

1 code implementation 1 Oct 2014 Madeleine Udell, Corinne Horn, Reza Zadeh, Stephen Boyd

Here, we extend the idea of PCA to handle arbitrary data sets consisting of numerical, Boolean, categorical, ordinal, and other data types.

Clustering Denoising +1
