Search Results for author: Inderjit S. Dhillon

Found 89 papers, 18 papers with code

LASER: Attention with Exponential Transformation

no code implementations 5 Nov 2024 Sai Surya Duvvuri, Inderjit S. Dhillon

To this end, we introduce a new attention mechanism called LASER, which we analytically show to admit a larger gradient signal.

LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization

no code implementations 27 Oct 2024 Jui-Nan Yen, Si Si, Zhao Meng, Felix Yu, Sai Surya Duvvuri, Inderjit S. Dhillon, Cho-Jui Hsieh, Sanjiv Kumar

Low-rank adaptation (LoRA) is a widely used parameter-efficient finetuning method for LLMs that reduces memory requirements.

GSM8K HellaSwag

Retraining with Predicted Hard Labels Provably Increases Model Accuracy

no code implementations 17 Jun 2024 Rudrajit Das, Inderjit S. Dhillon, Alessandro Epasto, Adel Javanmard, Jieming Mao, Vahab Mirrokni, Sujay Sanghavi, Peilin Zhong

In this paper, we theoretically analyze retraining in a linearly separable setting with randomly corrupted labels given to us and prove that retraining can improve the population accuracy obtained by initially training with the given (noisy) labels.

Towards Quantifying the Preconditioning Effect of Adam

no code implementations 11 Feb 2024 Rudrajit Das, Naman Agarwal, Sujay Sanghavi, Inderjit S. Dhillon

Specifically, for a $d$-dimensional quadratic with a diagonal Hessian having condition number $\kappa$, we show that the effective condition number-like quantity controlling the iteration complexity of Adam without momentum is $\mathcal{O}(\min(d, \kappa))$.

Automatic Engineering of Long Prompts

no code implementations 16 Nov 2023 Cho-Jui Hsieh, Si Si, Felix X. Yu, Inderjit S. Dhillon

Large language models (LLMs) have demonstrated remarkable capabilities in solving complex open-domain tasks, guided by comprehensive instructions and demonstrations provided in the form of prompts.

Prompt Engineering

FINGER: Fast Inference for Graph-based Approximate Nearest Neighbor Search

no code implementations 22 Jun 2022 Patrick H. Chen, Wei-Cheng Chang, Hsiang-Fu Yu, Inderjit S. Dhillon, Cho-Jui Hsieh

Approximate K-Nearest Neighbor Search (AKNNS) has now become ubiquitous in modern applications, for example, as a fast search procedure with two-tower deep learning models.

Counterfactual Learning To Rank for Utility-Maximizing Query Autocompletion

no code implementations 22 Apr 2022 Adam Block, Rahul Kidambi, Daniel N. Hill, Thorsten Joachims, Inderjit S. Dhillon

A shortcoming of this approach is that users often do not know which query will provide the best retrieval performance on the current information retrieval system, meaning that any query autocompletion methods trained to mimic user behavior can lead to suboptimal query suggestions.

counterfactual Information Retrieval +2

Sample Efficiency of Data Augmentation Consistency Regularization

no code implementations 24 Feb 2022 Shuo Yang, Yijun Dong, Rachel Ward, Inderjit S. Dhillon, Sujay Sanghavi, Qi Lei

Data augmentation is popular in the training of large neural networks; currently, however, there is no clear theoretical comparison between different algorithmic choices on how to use augmented data.

Data Augmentation Generalization Bounds

Cluster-and-Conquer: A Framework For Time-Series Forecasting

no code implementations 26 Oct 2021 Reese Pathak, Rajat Sen, Nikhil Rao, N. Benjamin Erichson, Michael I. Jordan, Inderjit S. Dhillon

Our framework -- which we refer to as "cluster-and-conquer" -- is highly general, allowing for any time-series forecasting and clustering method to be used in each step.

Time Series Time Series Forecasting

Fast Multi-Resolution Transformer Fine-tuning for Extreme Multi-label Text Classification

1 code implementation NeurIPS 2021 Jiong Zhang, Wei-Cheng Chang, Hsiang-Fu Yu, Inderjit S. Dhillon

Despite leveraging pre-trained transformer models for text representation, the fine-tuning procedure of transformer models on large label space still has lengthy computational time even with powerful GPUs.

Multi Label Text Classification Multi-Label Text Classification +2

Label Disentanglement in Partition-based Extreme Multilabel Classification

no code implementations NeurIPS 2021 Xuanqing Liu, Wei-Cheng Chang, Hsiang-Fu Yu, Cho-Jui Hsieh, Inderjit S. Dhillon

Partition-based methods are increasingly used in extreme multi-label classification (XMC) problems due to their scalability to large output spaces (e.g., millions or more).

Classification Disentanglement +1

Robust Training in High Dimensions via Block Coordinate Geometric Median Descent

2 code implementations 16 Jun 2021 Anish Acharya, Abolfazl Hashemi, Prateek Jain, Sujay Sanghavi, Inderjit S. Dhillon, Ufuk Topcu

Geometric median (GM) is a classical method in statistics for achieving a robust estimation of the uncorrupted data; under gross corruption, it achieves the optimal breakdown point of 0.5.

Ranked #22 on Image Classification on MNIST (Accuracy metric)

Image Classification Vocal Bursts Intensity Prediction

On the Convergence of Differentially Private Federated Learning on Non-Lipschitz Objectives, and with Normalized Client Updates

no code implementations 13 Jun 2021 Rudrajit Das, Abolfazl Hashemi, Sujay Sanghavi, Inderjit S. Dhillon

The primary reason for this is that the clipping operation (i.e., projection onto an $\ell_2$ ball of a fixed radius called the clipping threshold) for bounding the sensitivity of the average update to each client's update introduces bias depending on the clipping threshold and the number of local steps in FL, and analyzing this is not easy.

Benchmarking Federated Learning +1
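The clipping operation mentioned in the abstract above (projection onto an $\ell_2$ ball of a fixed radius) can be illustrated with a minimal sketch. The function name `clip_l2` and the example vectors are hypothetical and are not taken from the paper; the sketch shows only the projection step, not the full differentially private federated learning procedure.

```python
import math

def clip_l2(update, threshold):
    """Project `update` onto the l2 ball of radius `threshold`.

    Hypothetical helper for illustration: vectors inside the ball are
    returned unchanged; vectors outside are rescaled to the boundary.
    """
    norm = math.sqrt(sum(x * x for x in update))
    if norm <= threshold:
        return list(update)
    scale = threshold / norm
    return [x * scale for x in update]

# A vector of norm 5 is rescaled so that its norm equals the threshold.
clipped = clip_l2([3.0, 4.0], 1.0)
# A vector already inside the ball is left as it is.
unchanged = clip_l2([0.3, 0.4], 1.0)
```

Because vectors outside the ball are shrunk while those inside are not, the average of clipped updates generally differs from the average of the raw updates, which is the source of the bias the abstract refers to.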

Enabling Efficiency-Precision Trade-offs for Label Trees in Extreme Classification

no code implementations 1 Jun 2021 Tavor Z. Baharav, Daniel L. Jiang, Kedarnath Kolluri, Sujay Sanghavi, Inderjit S. Dhillon

For such applications, a common approach is to organize these labels into a tree, enabling training and inference times that are logarithmic in the number of labels.

Extreme Multi-Label Classification TAG

Combinatorial Bandits without Total Order for Arms

no code implementations 3 Mar 2021 Shuo Yang, Tongzheng Ren, Inderjit S. Dhillon, Sujay Sanghavi

Specifically, we focus on a challenging setting where 1) the reward distribution of an arm depends on the set $s$ it is part of, and crucially 2) there is \textit{no total order} for the arms in $\mathcal{A}$.

Linear Bandit Algorithms with Sublinear Time Complexity

no code implementations 3 Mar 2021 Shuo Yang, Tongzheng Ren, Sanjay Shakkottai, Eric Price, Inderjit S. Dhillon, Sujay Sanghavi

For sufficiently large $K$, our algorithms have sublinear per-step complexity and $\tilde O(\sqrt{T})$ regret.

Movie Recommendation

Session-Aware Query Auto-completion using Extreme Multi-label Ranking

1 code implementation 9 Dec 2020 Nishant Yadav, Rajat Sen, Daniel N. Hill, Arya Mazumdar, Inderjit S. Dhillon

Previous queries in the user session can provide useful context for the user's intent and can be leveraged to suggest auto-completions that are more relevant while adhering to the user's prefix.

Faster Non-Convex Federated Learning via Global and Local Momentum

no code implementations 7 Dec 2020 Rudrajit Das, Anish Acharya, Abolfazl Hashemi, Sujay Sanghavi, Inderjit S. Dhillon, Ufuk Topcu

We propose \texttt{FedGLOMO}, a novel federated learning (FL) algorithm with an iteration complexity of $\mathcal{O}(\epsilon^{-1.5})$ to converge to an $\epsilon$-stationary point (i.e., $\mathbb{E}[\|\nabla f(\bm{x})\|^2] \leq \epsilon$) for smooth non-convex functions -- under arbitrary client heterogeneity and compressed communication -- compared to the $\mathcal{O}(\epsilon^{-2})$ complexity of most prior works.

Federated Learning

PECOS: Prediction for Enormous and Correlated Output Spaces

no code implementations 12 Oct 2020 Hsiang-Fu Yu, Kai Zhong, Jiong Zhang, Wei-Cheng Chang, Inderjit S. Dhillon

In this paper, we propose the Prediction for Enormous and Correlated Output Spaces (PECOS) framework, a versatile and modular machine learning framework for solving prediction problems for very large output spaces, and apply it to the eXtreme Multilabel Ranking (XMR) problem: given an input instance, find and rank the most relevant items from an enormous but fixed and finite output space.

Learning from eXtreme Bandit Feedback

no code implementations 27 Sep 2020 Romain Lopez, Inderjit S. Dhillon, Michael I. Jordan

In POXM, the selected actions for the sIS estimator are the top-p actions of the logging policy, where p is adjusted from the data and is significantly smaller than the size of the action space.

Extreme Multi-Label Classification Recommendation Systems

Non-Exhaustive, Overlapping Co-Clustering: An Extended Analysis

no code implementations 24 Apr 2020 Joyce Jiyoung Whang, Inderjit S. Dhillon

To solve this problem, we propose intuitive objective functions, and develop an efficient iterative algorithm which we call the NEO-CC algorithm.

Clustering

Provable Non-linear Inductive Matrix Completion

no code implementations NeurIPS 2019 Kai Zhong, Zhao Song, Prateek Jain, Inderjit S. Dhillon

The inductive matrix completion (IMC) method is a standard approach for this problem where the given query as well as the items are embedded in a common low-dimensional space.

Matrix Completion Retrieval

Multiresolution Transformer Networks: Recurrence is Not Essential for Modeling Hierarchical Structure

no code implementations 27 Aug 2019 Vikas K. Garg, Inderjit S. Dhillon, Hsiang-Fu Yu

The architecture of Transformer is based entirely on self-attention, and has been shown to outperform models that employ recurrence on sequence transduction tasks such as machine translation.

Machine Translation Translation

Inverting Deep Generative models, One layer at a time

1 code implementation NeurIPS 2019 Qi Lei, Ajil Jalal, Inderjit S. Dhillon, Alexandros G. Dimakis

For generative models of arbitrary depth, we show that exact recovery is possible in polynomial time with high probability, if the layers are expanding and the weights are randomly selected.

Primal-Dual Block Frank-Wolfe

1 code implementation 6 Jun 2019 Qi Lei, Jiacheng Zhuo, Constantine Caramanis, Inderjit S. Dhillon, Alexandros G. Dimakis

We propose a variant of the Frank-Wolfe algorithm for solving a class of sparse/low-rank optimization problems.

General Classification Multi-class Classification +1

The Limitations of Adversarial Training and the Blind-Spot Attack

no code implementations ICLR 2019 Huan Zhang, Hongge Chen, Zhao Song, Duane Boning, Inderjit S. Dhillon, Cho-Jui Hsieh

In our paper, we shed some light on the practicality and the hardness of adversarial training by showing that the effectiveness (robustness on test set) of adversarial training has a strong correlation with the distance between a test point and the manifold of training data embedded by the network.

valid

Nonlinear Inductive Matrix Completion based on One-layer Neural Networks

no code implementations 26 May 2018 Kai Zhong, Zhao Song, Prateek Jain, Inderjit S. Dhillon

A standard approach to modeling this problem is Inductive Matrix Completion where the predicted rating is modeled as an inner product of the user and the item features projected onto a latent space.

Clustering Matrix Completion +1

Towards Fast Computation of Certified Robustness for ReLU Networks

6 code implementations ICML 2018 Tsui-Wei Weng, Huan Zhang, Hongge Chen, Zhao Song, Cho-Jui Hsieh, Duane Boning, Inderjit S. Dhillon, Luca Daniel

Verifying the robustness property of a general Rectified Linear Unit (ReLU) network is an NP-complete problem [Katz, Barrett, Dill, Julian and Kochenderfer CAV17].

Stabilizing Gradients for Deep Neural Networks via Efficient SVD Parameterization

1 code implementation ICML 2018 Jiong Zhang, Qi Lei, Inderjit S. Dhillon

Theoretically, we demonstrate that our parameterization does not lose any expressive power, and show how it controls generalization of RNN for the classification task.

Learning Long Term Dependencies via Fourier Recurrent Units

2 code implementations ICML 2018 Jiong Zhang, Yibo Lin, Zhao Song, Inderjit S. Dhillon

In this paper we propose a simple recurrent architecture, the Fourier Recurrent Unit (FRU), that stabilizes the gradients that arise in its training while giving us stronger expressive power.

Realtime query completion via deep language models

no code implementations ICLR 2018 Po-Wei Wang, J. Zico Kolter, Vijai Mohan, Inderjit S. Dhillon

Search engine users nowadays heavily depend on query completion and correction to shape their queries.

Language Modelling

Learning Non-overlapping Convolutional Neural Networks with Multiple Kernels

no code implementations 8 Nov 2017 Kai Zhong, Zhao Song, Inderjit S. Dhillon

In this paper, we consider parameter recovery for non-overlapping convolutional neural networks (CNNs) with multiple kernels.

Doubly Greedy Primal-Dual Coordinate Descent for Sparse Empirical Risk Minimization

no code implementations ICML 2017 Qi Lei, Ian En-Hsu Yen, Chao-yuan Wu, Inderjit S. Dhillon, Pradeep Ravikumar

We consider the popular problem of sparse empirical risk minimization with linear predictors and a large number of both features and observations.

Recovery Guarantees for One-hidden-layer Neural Networks

no code implementations ICML 2017 Kai Zhong, Zhao Song, Prateek Jain, Peter L. Bartlett, Inderjit S. Dhillon

For activation functions that are also smooth, we show local linear convergence guarantees of gradient descent under a resampling rule.

Structured Sparse Regression via Greedy Hard Thresholding

no code implementations NeurIPS 2016 Prateek Jain, Nikhil Rao, Inderjit S. Dhillon

Several learning applications require solving high-dimensional regression problems where the relevant features belong to a small number of (overlapping) groups.

regression

Coordinate-wise Power Method

no code implementations NeurIPS 2016 Qi Lei, Kai Zhong, Inderjit S. Dhillon

The vanilla power method simultaneously updates all the coordinates of the iterate, which is essential for its convergence analysis.
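For readers unfamiliar with the baseline, the vanilla power method described above can be sketched as follows. This is a minimal illustration of the classical full-update iteration only, not of the coordinate-wise variant the paper proposes; the function name `power_method`, the iteration count, and the example matrix are assumptions made for this sketch.

```python
import math

def power_method(matrix, num_iters=200):
    """Estimate the dominant eigenvalue/eigenvector of a square matrix.

    Illustrative sketch: every coordinate of the iterate is updated
    simultaneously in each step, then the iterate is renormalized.
    """
    n = len(matrix)
    x = [1.0 / math.sqrt(n)] * n  # uniform unit-norm starting vector
    for _ in range(num_iters):
        # Full matrix-vector product: all coordinates change at once.
        y = [sum(matrix[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    # Rayleigh quotient of the final unit-norm iterate.
    ax = [sum(matrix[i][j] * x[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(x[i] * ax[i] for i in range(n))
    return eigenvalue, x

# Symmetric 2x2 example whose eigenvalues are (5 ± sqrt(5)) / 2.
eigenvalue, vector = power_method([[2.0, 1.0], [1.0, 3.0]])
```

Each iteration costs a full matrix-vector product, which is the cost a coordinate-wise scheme aims to reduce by updating only selected coordinates.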

Dual Decomposed Learning with Factorwise Oracle for Structural SVM of Large Output Domain

no code implementations NeurIPS 2016 Ian En-Hsu Yen, Xiangru Huang, Kai Zhong, Ruohan Zhang, Pradeep K. Ravikumar, Inderjit S. Dhillon

In this work, we show that, by decomposing training of a Structural Support Vector Machine (SVM) into a series of multiclass SVM problems connected through messages, one can replace the expensive structured oracle with a Factorwise Maximization Oracle (FMO) that allows an efficient implementation with complexity sublinear in the size of the factor domain.

Asynchronous Parallel Greedy Coordinate Descent

no code implementations NeurIPS 2016 Yang You, Xiangru Lian, Ji Liu, Hsiang-Fu Yu, Inderjit S. Dhillon, James Demmel, Cho-Jui Hsieh

In this paper, we propose and study an Asynchronous parallel Greedy Coordinate Descent (Asy-GCD) algorithm for minimizing a smooth function with bounded constraints.

Temporal Regularized Matrix Factorization for High-dimensional Time Series Prediction

no code implementations NeurIPS 2016 Hsiang-Fu Yu, Nikhil Rao, Inderjit S. Dhillon

We develop novel regularization schemes and use scalable matrix factorization methods that are eminently suited for high-dimensional time series data that has many missing values.

Demand Forecasting Missing Values +3

Mixed Linear Regression with Multiple Components

no code implementations NeurIPS 2016 Kai Zhong, Prateek Jain, Inderjit S. Dhillon

Furthermore, our empirical results indicate that even with random initialization, our approach converges to the global optima in linear time, providing speed-up of up to two orders of magnitude.

Clustering regression

A Greedy Approach for Budgeted Maximum Inner Product Search

no code implementations NeurIPS 2017 Hsiang-Fu Yu, Cho-Jui Hsieh, Qi Lei, Inderjit S. Dhillon

Maximum Inner Product Search (MIPS) is an important task in many machine learning applications such as the prediction phase of a low-rank matrix factorization model for a recommender system.

Recommendation Systems

Communication-Efficient Parallel Block Minimization for Kernel Machines

no code implementations 5 Aug 2016 Cho-Jui Hsieh, Si Si, Inderjit S. Dhillon

Kernel machines often yield superior predictive performance on various tasks; however, they suffer from severe computational challenges.

Generalized Root Models: Beyond Pairwise Graphical Models for Univariate Exponential Families

1 code implementation 2 Jun 2016 David I. Inouye, Pradeep Ravikumar, Inderjit S. Dhillon

As in the recent work with square root graphical (SQR) models [Inouye et al. 2016]---which was restricted to pairwise dependencies---we give the conditions of the parameters that are needed for normalization using the radial conditionals similar to the pairwise case [Inouye et al. 2016].

PD-Sparse: A Primal and Dual Sparse Approach to Extreme Multiclass and Multilabel Classification

1 code implementation ICML 2016 Ian En-Hsu Yen, Xiangru Huang, Pradeep Ravikumar, Kai Zhong, Inderjit S. Dhillon

In this work, we show that a margin-maximizing loss with an l1 penalty, in the case of Extreme Classification, yields an extremely sparse solution both in the primal and in the dual without sacrificing the expressive power of the predictor.

General Classification Text Classification

Extreme Stochastic Variational Inference: Distributed and Asynchronous

no code implementations 31 May 2016 Jiong Zhang, Parameswaran Raman, Shihao Ji, Hsiang-Fu Yu, S. V. N. Vishwanathan, Inderjit S. Dhillon

Moreover, it requires the parameters to fit in the memory of a single processor; this is problematic when the number of parameters is in billions.

Variational Inference

Square Root Graphical Models: Multivariate Generalizations of Univariate Exponential Families that Permit Positive Dependencies

no code implementations 11 Mar 2016 David I. Inouye, Pradeep Ravikumar, Inderjit S. Dhillon

With this motivation, we give an example of our model class derived from the univariate exponential distribution that allows for almost arbitrary positive and negative dependencies with only a mild condition on the parameter matrix---a condition akin to the positive definiteness of the Gaussian covariance matrix.

Fast Multiplier Methods to Optimize Non-exhaustive, Overlapping Clustering

no code implementations 5 Feb 2016 Yangyang Hou, Joyce Jiyoung Whang, David F. Gleich, Inderjit S. Dhillon

In this paper, we consider two fast multiplier methods to accelerate the convergence of an augmented Lagrangian scheme: a proximal method of multipliers and an alternating direction method of multipliers (ADMM).

Clustering

Collaborative Filtering with Graph Information: Consistency and Scalable Methods

2 code implementations NeurIPS 2015 Nikhil Rao, Hsiang-Fu Yu, Pradeep K. Ravikumar, Inderjit S. Dhillon

Low rank matrix completion plays a fundamental role in collaborative filtering applications, the key idea being that the variables lie in a smaller subspace than the ambient space.

Ranked #1 on Recommendation Systems on Flixster (using extra training data)

Collaborative Filtering Low-Rank Matrix Completion +1

Consistent Multilabel Classification

no code implementations NeurIPS 2015 Oluwasanmi O. Koyejo, Nagarajan Natarajan, Pradeep K. Ravikumar, Inderjit S. Dhillon

In particular, we show that for multilabel metrics constructed as instance-, micro- and macro-averages, the population optimal classifier can be decomposed into binary classifiers based on the marginal instance-conditional distribution of each label, with a weak association between labels via the threshold.

Classification General Classification

Fixed-Length Poisson MRF: Adding Dependencies to the Multinomial

no code implementations NeurIPS 2015 David I. Inouye, Pradeep K. Ravikumar, Inderjit S. Dhillon

We show the effectiveness of our LPMRF distribution over Multinomial models by evaluating the test set perplexity on a dataset of abstracts and Wikipedia.

Topic Models

Sparse Linear Programming via Primal and Dual Augmented Coordinate Descent

no code implementations NeurIPS 2015 Ian En-Hsu Yen, Kai Zhong, Cho-Jui Hsieh, Pradeep K. Ravikumar, Inderjit S. Dhillon

Over the past decades, Linear Programming (LP) has been widely used in different areas and considered as one of the mature technologies in numerical optimization.

Matrix Completion with Noisy Side Information

no code implementations NeurIPS 2015 Kai-Yang Chiang, Cho-Jui Hsieh, Inderjit S. Dhillon

Moreover, we study the effect of general features in theory, and show that by using our model, the sample complexity can still be lower than matrix completion as long as features are sufficiently informative.

Clustering Matrix Completion

High-dimensional Time Series Prediction with Missing Values

no code implementations 28 Sep 2015 Hsiang-Fu Yu, Nikhil Rao, Inderjit S. Dhillon

High-dimensional time series prediction is needed in applications as diverse as demand forecasting and climatology.

Demand Forecasting Matrix Completion +4

Preference Completion: Large-scale Collaborative Ranking from Pairwise Comparisons

1 code implementation 16 Jul 2015 Dohyung Park, Joe Neeman, Jin Zhang, Sujay Sanghavi, Inderjit S. Dhillon

In this paper we consider the collaborative ranking setting: a pool of users each provides a small number of pairwise preferences between $d$ possible items; from these we need to predict preferences of the users for items they have not yet seen.

Collaborative Filtering Collaborative Ranking +1

Optimal Decision-Theoretic Classification Using Non-Decomposable Performance Metrics

no code implementations 7 May 2015 Nagarajan Natarajan, Oluwasanmi Koyejo, Pradeep Ravikumar, Inderjit S. Dhillon

We provide a general theoretical analysis of expected out-of-sample utility, also referred to as decision-theoretic classification, for non-decomposable binary classification metrics such as F-measure and Jaccard coefficient.

Binary Classification Classification +1

A Scalable Asynchronous Distributed Algorithm for Topic Modeling

1 code implementation 16 Dec 2014 Hsiang-Fu Yu, Cho-Jui Hsieh, Hyokun Yun, S. V. N. Vishwanathan, Inderjit S. Dhillon

Learning meaningful topic models with massive document collections which contain millions of documents and billions of tokens is challenging for two reasons: first, one needs to deal with a large number of topics (typically on the order of thousands).

Topic Models

Multi-Scale Spectral Decomposition of Massive Graphs

no code implementations NeurIPS 2014 Si Si, Donghyuk Shin, Inderjit S. Dhillon, Beresford N. Parlett

Thus, eigenvectors of the clusters serve as good initializations to a block Lanczos algorithm that is used to compute spectral decomposition of the original graph.

Clustering

Capturing Semantically Meaningful Word Dependencies with an Admixture of Poisson MRFs

no code implementations NeurIPS 2014 David I. Inouye, Pradeep K. Ravikumar, Inderjit S. Dhillon

We develop a fast algorithm for the Admixture of Poisson MRFs (APM) topic model and propose a novel metric to directly evaluate this model.

Topic Models

Sparse Random Feature Algorithm as Coordinate Descent in Hilbert Space

no code implementations NeurIPS 2014 Ian En-Hsu Yen, Ting-Wei Lin, Shou-De Lin, Pradeep K. Ravikumar, Inderjit S. Dhillon

In this paper, we propose a Sparse Random Feature algorithm, which learns a sparse non-linear predictor by minimizing an $\ell_1$-regularized objective function over the Hilbert Space induced from kernel function.

Fast Prediction for Large-Scale Kernel Machines

no code implementations NeurIPS 2014 Cho-Jui Hsieh, Si Si, Inderjit S. Dhillon

Second, we provide a new theoretical analysis on bounding the error of the solution computed by using the Nyström kernel approximation method, and show that the error is related to the weighted k-means objective function where the weights are given by the model computed from the original kernel.

General Classification regression

QUIC & DIRTY: A Quadratic Approximation Approach for Dirty Statistical Models

no code implementations NeurIPS 2014 Cho-Jui Hsieh, Inderjit S. Dhillon, Pradeep K. Ravikumar, Stephen Becker, Peder A. Olsen

In this paper, we develop a family of algorithms for optimizing “superposition-structured” or “dirty” statistical estimators for high-dimensional problems involving the minimization of the sum of a smooth loss function with a hybrid regularization.

Model Selection Multi-Task Learning +1

Constant Nullspace Strong Convexity and Fast Convergence of Proximal Methods under High-Dimensional Settings

no code implementations NeurIPS 2014 Ian En-Hsu Yen, Cho-Jui Hsieh, Pradeep K. Ravikumar, Inderjit S. Dhillon

State-of-the-art statistical estimators for high-dimensional problems take the form of regularized, and hence non-smooth, convex programs.

PU Learning for Matrix Completion

no code implementations 22 Nov 2014 Cho-Jui Hsieh, Nagarajan Natarajan, Inderjit S. Dhillon

For the first case, we propose a "shifted matrix completion" method that recovers M using only a subset of indices corresponding to ones, while for the second case, we propose a "biased matrix completion" method that recovers the (thresholded) binary matrix.

Binary Classification Clustering +3

Proximal Quasi-Newton for Computationally Intensive L1-regularized M-estimators

no code implementations NeurIPS 2014 Kai Zhong, Ian E. H. Yen, Inderjit S. Dhillon, Pradeep Ravikumar

We consider the class of optimization problems arising from computationally intensive L1-regularized M-estimators, where the function or gradient values are very expensive to compute.

General Classification Structured Prediction

Learning with Noisy Labels

no code implementations NeurIPS 2013 Nagarajan Natarajan, Inderjit S. Dhillon, Pradeep K. Ravikumar, Ambuj Tewari

In this paper, we theoretically study the problem of binary classification in the presence of random classification noise --- the learner, instead of seeing the true labels, sees labels that have independently been flipped with some small probability.

Binary Classification General Classification +1

Large Scale Distributed Sparse Precision Estimation

no code implementations NeurIPS 2013 Huahua Wang, Arindam Banerjee, Cho-Jui Hsieh, Pradeep K. Ravikumar, Inderjit S. Dhillon

We consider the problem of sparse precision matrix estimation in high dimensions using the CLIME estimator, which has several desirable theoretical properties.

BIG & QUIC: Sparse Inverse Covariance Estimation for a Million Variables

no code implementations NeurIPS 2013 Cho-Jui Hsieh, Matyas A. Sustik, Inderjit S. Dhillon, Pradeep K. Ravikumar, Russell Poldrack

The l1-regularized Gaussian maximum likelihood estimator (MLE) has been shown to have strong statistical guarantees in recovering a sparse inverse covariance matrix even under high-dimensional settings.

Clustering

A Divide-and-Conquer Solver for Kernel Support Vector Machines

no code implementations 4 Nov 2013 Cho-Jui Hsieh, Si Si, Inderjit S. Dhillon

We show theoretically that the support vectors identified by the subproblem solution are likely to be support vectors of the entire kernel SVM problem, provided that the problem is partitioned appropriately by kernel clustering.

Clustering

Large-scale Multi-label Learning with Missing Labels

no code implementations 18 Jul 2013 Hsiang-Fu Yu, Prateek Jain, Purushottam Kar, Inderjit S. Dhillon

The multi-label classification problem has generated significant interest in recent years.

Missing Labels

Sparse Inverse Covariance Matrix Estimation Using Quadratic Approximation

no code implementations NeurIPS 2011 Cho-Jui Hsieh, Matyas A. Sustik, Inderjit S. Dhillon, Pradeep Ravikumar

The L1-regularized Gaussian maximum likelihood estimator (MLE) has been shown to have strong statistical guarantees in recovering a sparse inverse covariance matrix, or alternatively the underlying graph structure of a Gaussian Markov Random Field, from very limited samples.

Provable Inductive Matrix Completion

no code implementations 4 Jun 2013 Prateek Jain, Inderjit S. Dhillon

In addition to inductive matrix completion, we show that two other low-rank estimation problems can be studied in our framework: a) general low-rank matrix sensing using rank-1 measurements, and b) multi-label regression with missing labels.

Matrix Completion Missing Labels +1

Nearest Neighbor based Greedy Coordinate Descent

no code implementations NeurIPS 2011 Inderjit S. Dhillon, Pradeep K. Ravikumar, Ambuj Tewari

In particular, we investigate the greedy coordinate descent algorithm, and note that performing the greedy step efficiently weakens the costly dependence on the problem size provided the solution is sparse.

Greedy Algorithms for Structurally Constrained High Dimensional Problems

no code implementations NeurIPS 2011 Ambuj Tewari, Pradeep K. Ravikumar, Inderjit S. Dhillon

A hallmark of modern machine learning is its ability to deal with high dimensional problems by exploiting structural assumptions that limit the degrees of freedom in the underlying model.

Vocal Bursts Intensity Prediction

Orthogonal Matching Pursuit with Replacement

no code implementations NeurIPS 2011 Prateek Jain, Ambuj Tewari, Inderjit S. Dhillon

Our proof techniques are novel and flexible enough to also permit the tightest known analysis of popular iterative algorithms such as CoSaMP and Subspace Pursuit.

Inductive Regularized Learning of Kernel Functions

no code implementations NeurIPS 2010 Prateek Jain, Brian Kulis, Inderjit S. Dhillon

Our result shows that the learned kernel matrices parameterize a linear transformation kernel function and can be applied inductively to new data points.

Dimensionality Reduction General Classification +1

Matrix Completion from Power-Law Distributed Samples

no code implementations NeurIPS 2009 Raghu Meka, Prateek Jain, Inderjit S. Dhillon

In this paper, we propose a graph theoretic approach to matrix completion that solves the problem for more realistic sampling models.

Low-Rank Matrix Completion

Guaranteed Rank Minimization via Singular Value Projection

1 code implementation NeurIPS 2010 Raghu Meka, Prateek Jain, Inderjit S. Dhillon

Minimizing the rank of a matrix subject to affine constraints is a fundamental problem with many important applications in machine learning and statistics.

Low-Rank Matrix Completion

Online Metric Learning and Fast Similarity Search

no code implementations NeurIPS 2008 Prateek Jain, Brian Kulis, Inderjit S. Dhillon, Kristen Grauman

Metric learning algorithms can provide useful distance functions for a variety of domains, and recent work has shown good accuracy for problems where the learner can access all distance constraints at once.

Metric Learning
