Search Results for author: Inderjit S. Dhillon

Found 86 papers, 18 papers with code

Online Metric Learning and Fast Similarity Search

no code implementations • NeurIPS 2008 • Prateek Jain, Brian Kulis, Inderjit S. Dhillon, Kristen Grauman

Metric learning algorithms can provide useful distance functions for a variety of domains, and recent work has shown good accuracy for problems where the learner can access all distance constraints at once.

Metric Learning

Guaranteed Rank Minimization via Singular Value Projection

1 code implementation • NeurIPS 2010 • Raghu Meka, Prateek Jain, Inderjit S. Dhillon

Minimizing the rank of a matrix subject to affine constraints is a fundamental problem with many important applications in machine learning and statistics.

Low-Rank Matrix Completion

Matrix Completion from Power-Law Distributed Samples

no code implementations • NeurIPS 2009 • Raghu Meka, Prateek Jain, Inderjit S. Dhillon

In this paper, we propose a graph theoretic approach to matrix completion that solves the problem for more realistic sampling models.

Low-Rank Matrix Completion

Inductive Regularized Learning of Kernel Functions

no code implementations • NeurIPS 2010 • Prateek Jain, Brian Kulis, Inderjit S. Dhillon

Our result shows that the learned kernel matrices parameterize a linear transformation kernel function and can be applied inductively to new data points.

Dimensionality Reduction General Classification +1

Greedy Algorithms for Structurally Constrained High Dimensional Problems

no code implementations • NeurIPS 2011 • Ambuj Tewari, Pradeep K. Ravikumar, Inderjit S. Dhillon

A hallmark of modern machine learning is its ability to deal with high dimensional problems by exploiting structural assumptions that limit the degrees of freedom in the underlying model.

Vocal Bursts Intensity Prediction

Orthogonal Matching Pursuit with Replacement

no code implementations • NeurIPS 2011 • Prateek Jain, Ambuj Tewari, Inderjit S. Dhillon

Our proof techniques are novel and flexible enough to also permit the tightest known analysis of popular iterative algorithms such as CoSaMP and Subspace Pursuit.

Nearest Neighbor based Greedy Coordinate Descent

no code implementations • NeurIPS 2011 • Inderjit S. Dhillon, Pradeep K. Ravikumar, Ambuj Tewari

In particular, we investigate the greedy coordinate descent algorithm, and note that performing the greedy step efficiently weakens the costly dependence on the problem size provided the solution is sparse.

A Divide-and-Conquer Method for Sparse Inverse Covariance Estimation

no code implementations • NeurIPS 2012 • Cho-Jui Hsieh, Arindam Banerjee, Inderjit S. Dhillon, Pradeep K. Ravikumar

We derive a bound on the distance of the approximate solution to the true solution.

Clustering

Provable Inductive Matrix Completion

no code implementations • 4 Jun 2013 • Prateek Jain, Inderjit S. Dhillon

In addition to inductive matrix completion, we show that two other low-rank estimation problems can be studied in our framework: a) general low-rank matrix sensing using rank-1 measurements, and b) multi-label regression with missing labels.

Matrix Completion Missing Labels +1

Sparse Inverse Covariance Matrix Estimation Using Quadratic Approximation

no code implementations • NeurIPS 2011 • Cho-Jui Hsieh, Matyas A. Sustik, Inderjit S. Dhillon, Pradeep Ravikumar

The L1-regularized Gaussian maximum likelihood estimator (MLE) has been shown to have strong statistical guarantees in recovering a sparse inverse covariance matrix, or alternatively the underlying graph structure of a Gaussian Markov Random Field, from very limited samples.

Large-scale Multi-label Learning with Missing Labels

no code implementations • 18 Jul 2013 • Hsiang-Fu Yu, Prateek Jain, Purushottam Kar, Inderjit S. Dhillon

The multi-label classification problem has generated significant interest in recent years.

Missing Labels

A Divide-and-Conquer Solver for Kernel Support Vector Machines

no code implementations • 4 Nov 2013 • Cho-Jui Hsieh, Si Si, Inderjit S. Dhillon

We show theoretically that the support vectors identified by the subproblem solution are likely to be support vectors of the entire kernel SVM problem, provided that the problem is partitioned appropriately by kernel clustering.

Clustering

Large Scale Distributed Sparse Precision Estimation

no code implementations • NeurIPS 2013 • Huahua Wang, Arindam Banerjee, Cho-Jui Hsieh, Pradeep K. Ravikumar, Inderjit S. Dhillon

We consider the problem of sparse precision matrix estimation in high dimensions using the CLIME estimator, which has several desirable theoretical properties.

Learning with Noisy Labels

no code implementations • NeurIPS 2013 • Nagarajan Natarajan, Inderjit S. Dhillon, Pradeep K. Ravikumar, Ambuj Tewari

In this paper, we theoretically study the problem of binary classification in the presence of random classification noise --- the learner, instead of seeing the true labels, sees labels that have independently been flipped with some small probability.

Binary Classification General Classification +1

BIG & QUIC: Sparse Inverse Covariance Estimation for a Million Variables

no code implementations • NeurIPS 2013 • Cho-Jui Hsieh, Matyas A. Sustik, Inderjit S. Dhillon, Pradeep K. Ravikumar, Russell Poldrack

The l1-regularized Gaussian maximum likelihood estimator (MLE) has been shown to have strong statistical guarantees in recovering a sparse inverse covariance matrix even under high-dimensional settings.

Clustering

Proximal Quasi-Newton for Computationally Intensive L1-regularized M-estimators

no code implementations • NeurIPS 2014 • Kai Zhong, Ian E. H. Yen, Inderjit S. Dhillon, Pradeep Ravikumar

We consider the class of optimization problems arising from computationally intensive L1-regularized M-estimators, where the function or gradient values are very expensive to compute.

General Classification Structured Prediction

PU Learning for Matrix Completion

no code implementations • 22 Nov 2014 • Cho-Jui Hsieh, Nagarajan Natarajan, Inderjit S. Dhillon

For the first case, we propose a "shifted matrix completion" method that recovers M using only a subset of indices corresponding to ones, while for the second case, we propose a "biased matrix completion" method that recovers the (thresholded) binary matrix.

Binary Classification Clustering +3

Constant Nullspace Strong Convexity and Fast Convergence of Proximal Methods under High-Dimensional Settings

no code implementations • NeurIPS 2014 • Ian En-Hsu Yen, Cho-Jui Hsieh, Pradeep K. Ravikumar, Inderjit S. Dhillon

State-of-the-art statistical estimators for high-dimensional problems take the form of regularized, and hence non-smooth, convex programs.

Fast Prediction for Large-Scale Kernel Machines

no code implementations • NeurIPS 2014 • Cho-Jui Hsieh, Si Si, Inderjit S. Dhillon

Second, we provide a new theoretical analysis on bounding the error of the solution computed by using the Nyström kernel approximation method, and show that the error is related to the weighted k-means objective function, where the weights are given by the model computed from the original kernel.

General Classification regression

Sparse Random Feature Algorithm as Coordinate Descent in Hilbert Space

no code implementations • NeurIPS 2014 • Ian En-Hsu Yen, Ting-Wei Lin, Shou-De Lin, Pradeep K. Ravikumar, Inderjit S. Dhillon

In this paper, we propose a Sparse Random Feature algorithm, which learns a sparse non-linear predictor by minimizing an $\ell_1$-regularized objective function over the Hilbert space induced by the kernel function.

Capturing Semantically Meaningful Word Dependencies with an Admixture of Poisson MRFs

no code implementations • NeurIPS 2014 • David I. Inouye, Pradeep K. Ravikumar, Inderjit S. Dhillon

We develop a fast algorithm for the Admixture of Poisson MRFs (APM) topic model and propose a novel metric to directly evaluate this model.

Topic Models

QUIC & DIRTY: A Quadratic Approximation Approach for Dirty Statistical Models

no code implementations • NeurIPS 2014 • Cho-Jui Hsieh, Inderjit S. Dhillon, Pradeep K. Ravikumar, Stephen Becker, Peder A. Olsen

In this paper, we develop a family of algorithms for optimizing “superposition-structured” or “dirty” statistical estimators for high-dimensional problems involving the minimization of the sum of a smooth loss function with a hybrid regularization.

Model Selection Multi-Task Learning +1

Consistent Binary Classification with Generalized Performance Metrics

no code implementations • NeurIPS 2014 • Oluwasanmi O. Koyejo, Nagarajan Natarajan, Pradeep K. Ravikumar, Inderjit S. Dhillon

We consider a fairly large family of performance metrics given by ratios of linear combinations of the four fundamental population quantities.

Binary Classification Classification +1

Multi-Scale Spectral Decomposition of Massive Graphs

no code implementations • NeurIPS 2014 • Si Si, Donghyuk Shin, Inderjit S. Dhillon, Beresford N. Parlett

Thus, eigenvectors of the clusters serve as good initializations to a block Lanczos algorithm that is used to compute spectral decomposition of the original graph.

Clustering

A Scalable Asynchronous Distributed Algorithm for Topic Modeling

1 code implementation • 16 Dec 2014 • Hsiang-Fu Yu, Cho-Jui Hsieh, Hyokun Yun, S. V. N. Vishwanathan, Inderjit S. Dhillon

Learning meaningful topic models with massive document collections, which contain millions of documents and billions of tokens, is challenging for two reasons: first, one needs to deal with a large number of topics (typically on the order of thousands).

Topic Models

PASSCoDe: Parallel ASynchronous Stochastic dual Co-ordinate Descent

no code implementations • 6 Apr 2015 • Cho-Jui Hsieh, Hsiang-Fu Yu, Inderjit S. Dhillon

In this paper, we parallelize the SDCD algorithms in LIBLINEAR.

Optimal Decision-Theoretic Classification Using Non-Decomposable Performance Metrics

no code implementations • 7 May 2015 • Nagarajan Natarajan, Oluwasanmi Koyejo, Pradeep Ravikumar, Inderjit S. Dhillon

We provide a general theoretical analysis of expected out-of-sample utility, also referred to as decision-theoretic classification, for non-decomposable binary classification metrics such as F-measure and Jaccard coefficient.

Binary Classification Classification +1

Preference Completion: Large-scale Collaborative Ranking from Pairwise Comparisons

1 code implementation • 16 Jul 2015 • Dohyung Park, Joe Neeman, Jin Zhang, Sujay Sanghavi, Inderjit S. Dhillon

In this paper we consider the collaborative ranking setting: a pool of users each provides a small number of pairwise preferences between $d$ possible items; from these we need to predict preferences of the users for items they have not yet seen.

Collaborative Filtering Collaborative Ranking +1

High-dimensional Time Series Prediction with Missing Values

no code implementations • 28 Sep 2015 • Hsiang-Fu Yu, Nikhil Rao, Inderjit S. Dhillon

High-dimensional time series prediction is needed in applications as diverse as demand forecasting and climatology.

Matrix Completion Time Series +2

Collaborative Filtering with Graph Information: Consistency and Scalable Methods

2 code implementations • NeurIPS 2015 • Nikhil Rao, Hsiang-Fu Yu, Pradeep K. Ravikumar, Inderjit S. Dhillon

Low rank matrix completion plays a fundamental role in collaborative filtering applications, the key idea being that the variables lie in a smaller subspace than the ambient space.

Ranked #1 on Recommendation Systems on Flixster (using extra training data)

Collaborative Filtering Low-Rank Matrix Completion +1

Consistent Multilabel Classification

no code implementations • NeurIPS 2015 • Oluwasanmi O. Koyejo, Nagarajan Natarajan, Pradeep K. Ravikumar, Inderjit S. Dhillon

In particular, we show that for multilabel metrics constructed as instance-, micro- and macro-averages, the population optimal classifier can be decomposed into binary classifiers based on the marginal instance-conditional distribution of each label, with a weak association between labels via the threshold.

Classification General Classification

Fixed-Length Poisson MRF: Adding Dependencies to the Multinomial

no code implementations • NeurIPS 2015 • David I. Inouye, Pradeep K. Ravikumar, Inderjit S. Dhillon

We show the effectiveness of our LPMRF distribution over Multinomial models by evaluating the test set perplexity on a dataset of abstracts and Wikipedia.

Topic Models

Sparse Linear Programming via Primal and Dual Augmented Coordinate Descent

no code implementations • NeurIPS 2015 • Ian En-Hsu Yen, Kai Zhong, Cho-Jui Hsieh, Pradeep K. Ravikumar, Inderjit S. Dhillon

Over the past decades, Linear Programming (LP) has been widely used in different areas and considered as one of the mature technologies in numerical optimization.

Matrix Completion with Noisy Side Information

no code implementations • NeurIPS 2015 • Kai-Yang Chiang, Cho-Jui Hsieh, Inderjit S. Dhillon

Moreover, we study the effect of general features in theory, and show that by using our model, the sample complexity can still be lower than matrix completion as long as features are sufficiently informative.

Clustering Matrix Completion

Fast Multiplier Methods to Optimize Non-exhaustive, Overlapping Clustering

no code implementations • 5 Feb 2016 • Yangyang Hou, Joyce Jiyoung Whang, David F. Gleich, Inderjit S. Dhillon

In this paper, we consider two fast multiplier methods to accelerate the convergence of an augmented Lagrangian scheme: a proximal method of multipliers and an alternating direction method of multipliers (ADMM).

Clustering

Square Root Graphical Models: Multivariate Generalizations of Univariate Exponential Families that Permit Positive Dependencies

no code implementations • 11 Mar 2016 • David I. Inouye, Pradeep Ravikumar, Inderjit S. Dhillon

With this motivation, we give an example of our model class derived from the univariate exponential distribution that allows for almost arbitrary positive and negative dependencies with only a mild condition on the parameter matrix---a condition akin to the positive definiteness of the Gaussian covariance matrix.

Extreme Stochastic Variational Inference: Distributed and Asynchronous

no code implementations • 31 May 2016 • Jiong Zhang, Parameswaran Raman, Shihao Ji, Hsiang-Fu Yu, S. V. N. Vishwanathan, Inderjit S. Dhillon

Moreover, it requires the parameters to fit in the memory of a single processor; this is problematic when the number of parameters is in billions.

Variational Inference

PD-Sparse : A Primal and Dual Sparse Approach to Extreme Multiclass and Multilabel Classification

1 code implementation • ICML 2016 • Ian En-Hsu Yen, Xiangru Huang, Pradeep Ravikumar, Kai Zhong, Inderjit S. Dhillon

In this work, we show that a margin-maximizing loss with an l1 penalty, in the case of Extreme Classification, yields extremely sparse solutions in both the primal and the dual without sacrificing the expressive power of the predictor.

General Classification Text Classification

Generalized Root Models: Beyond Pairwise Graphical Models for Univariate Exponential Families

1 code implementation • 2 Jun 2016 • David I. Inouye, Pradeep Ravikumar, Inderjit S. Dhillon

As in the recent work with square root graphical (SQR) models [Inouye et al. 2016]---which was restricted to pairwise dependencies---we give the conditions of the parameters that are needed for normalization using the radial conditionals similar to the pairwise case [Inouye et al. 2016].

Communication-Efficient Parallel Block Minimization for Kernel Machines

no code implementations • 5 Aug 2016 • Cho-Jui Hsieh, Si Si, Inderjit S. Dhillon

Kernel machines often yield superior predictive performance on various tasks; however, they suffer from severe computational challenges.

A Greedy Approach for Budgeted Maximum Inner Product Search

no code implementations • NeurIPS 2017 • Hsiang-Fu Yu, Cho-Jui Hsieh, Qi Lei, Inderjit S. Dhillon

Maximum Inner Product Search (MIPS) is an important task in many machine learning applications such as the prediction phase of a low-rank matrix factorization model for a recommender system.

Recommendation Systems

Structured Sparse Regression via Greedy Hard Thresholding

no code implementations • NeurIPS 2016 • Prateek Jain, Nikhil Rao, Inderjit S. Dhillon

Several learning applications require solving high-dimensional regression problems where the relevant features belong to a small number of (overlapping) groups.

regression

Asynchronous Parallel Greedy Coordinate Descent

no code implementations • NeurIPS 2016 • Yang You, Xiangru Lian, Ji Liu, Hsiang-Fu Yu, Inderjit S. Dhillon, James Demmel, Cho-Jui Hsieh

In this paper, we propose and study an Asynchronous parallel Greedy Coordinate Descent (Asy-GCD) algorithm for minimizing a smooth function with bounded constraints.

Dual Decomposed Learning with Factorwise Oracle for Structural SVM of Large Output Domain

no code implementations • NeurIPS 2016 • Ian En-Hsu Yen, Xiangru Huang, Kai Zhong, Ruohan Zhang, Pradeep K. Ravikumar, Inderjit S. Dhillon

In this work, we show that, by decomposing the training of a Structural Support Vector Machine (SVM) into a series of multiclass SVM problems connected through messages, one can replace the expensive structured oracle with a Factorwise Maximization Oracle (FMO) that allows an efficient implementation with complexity sublinear in the size of the factor domain.

Temporal Regularized Matrix Factorization for High-dimensional Time Series Prediction

no code implementations • NeurIPS 2016 • Hsiang-Fu Yu, Nikhil Rao, Inderjit S. Dhillon

We develop novel regularization schemes and use scalable matrix factorization methods that are eminently suited for high-dimensional time series data that has many missing values.

Time Series Time Series Prediction +1

Coordinate-wise Power Method

no code implementations • NeurIPS 2016 • Qi Lei, Kai Zhong, Inderjit S. Dhillon

The vanilla power method simultaneously updates all the coordinates of the iterate, which is essential for its convergence analysis.

Mixed Linear Regression with Multiple Components

no code implementations • NeurIPS 2016 • Kai Zhong, Prateek Jain, Inderjit S. Dhillon

Furthermore, our empirical results indicate that even with random initialization, our approach converges to the global optima in linear time, providing speed-up of up to two orders of magnitude.

Clustering regression

Similarity Preserving Representation Learning for Time Series Clustering

no code implementations • 12 Feb 2017 • Qi Lei, Jin-Feng Yi, Roman Vaculin, Lingfei Wu, Inderjit S. Dhillon

A considerable number of clustering algorithms take instance-feature matrices as their inputs.

Clustering Representation Learning +2

Recovery Guarantees for One-hidden-layer Neural Networks

no code implementations • ICML 2017 • Kai Zhong, Zhao Song, Prateek Jain, Peter L. Bartlett, Inderjit S. Dhillon

For activation functions that are also smooth, we show $\mathit{local~linear~convergence}$ guarantees of gradient descent under a resampling rule.

Gradient Boosted Decision Trees for High Dimensional Sparse Output

no code implementations • ICML 2017 • Si Si, Huan Zhang, S. Sathiya Keerthi, Dhruv Mahajan, Inderjit S. Dhillon, Cho-Jui Hsieh

In this paper, we study the gradient boosted decision trees (GBDT) when the output space is high dimensional and sparse.

General Classification Vocal Bursts Intensity Prediction

Doubly Greedy Primal-Dual Coordinate Descent for Sparse Empirical Risk Minimization

no code implementations • ICML 2017 • Qi Lei, Ian En-Hsu Yen, Chao-yuan Wu, Inderjit S. Dhillon, Pradeep Ravikumar

We consider the popular problem of sparse empirical risk minimization with linear predictors and a large number of both features and observations.

Learning Non-overlapping Convolutional Neural Networks with Multiple Kernels

no code implementations • 8 Nov 2017 • Kai Zhong, Zhao Song, Inderjit S. Dhillon

In this paper, we consider parameter recovery for non-overlapping convolutional neural networks (CNNs) with multiple kernels.

Realtime query completion via deep language models

no code implementations • ICLR 2018 • Po-Wei Wang, J. Zico Kolter, Vijai Mohan, Inderjit S. Dhillon

Search engine users nowadays heavily depend on query completion and correction to shape their queries.

Language Modelling

Learning Long Term Dependencies via Fourier Recurrent Units

2 code implementations • ICML 2018 • Jiong Zhang, Yibo Lin, Zhao Song, Inderjit S. Dhillon

In this paper we propose a simple recurrent architecture, the Fourier Recurrent Unit (FRU), that stabilizes the gradients that arise in its training while giving us stronger expressive power.

Stabilizing Gradients for Deep Neural Networks via Efficient SVD Parameterization

1 code implementation • ICML 2018 • Jiong Zhang, Qi Lei, Inderjit S. Dhillon

Theoretically, we demonstrate that our parameterization does not lose any expressive power, and show how it controls generalization of RNN for the classification task.

Towards Fast Computation of Certified Robustness for ReLU Networks

6 code implementations • ICML 2018 • Tsui-Wei Weng, Huan Zhang, Hongge Chen, Zhao Song, Cho-Jui Hsieh, Duane Boning, Inderjit S. Dhillon, Luca Daniel

Verifying the robustness property of a general Rectified Linear Unit (ReLU) network is an NP-complete problem [Katz, Barrett, Dill, Julian and Kochenderfer CAV17].

Nonlinear Inductive Matrix Completion based on One-layer Neural Networks

no code implementations • 26 May 2018 • Kai Zhong, Zhao Song, Prateek Jain, Inderjit S. Dhillon

A standard approach to modeling this problem is Inductive Matrix Completion where the predicted rating is modeled as an inner product of the user and the item features projected onto a latent space.

Clustering Matrix Completion +1

SeCSeq: Semantic Coding for Sequence-to-Sequence based Extreme Multi-label Classification

no code implementations • NIPS Workshop CDNNRIA 2018 • Wei-Cheng Chang, Hsiang-Fu Yu, Inderjit S. Dhillon, Yiming Yang

To circumvent the softmax bottleneck, SeCSeq compresses labels into sequences of semantic-aware compact codes, on which Seq2Seq models are trained.

Extreme Multi-Label Classification

Discrete Adversarial Attacks and Submodular Optimization with Applications to Text Classification

1 code implementation • 1 Dec 2018 • Qi Lei, Lingfei Wu, Pin-Yu Chen, Alexandros G. Dimakis, Inderjit S. Dhillon, Michael Witbrock

In this paper we formulate the attacks with discrete input on a set function as an optimization task.

Adversarial Text General Classification +3

The Limitations of Adversarial Training and the Blind-Spot Attack

no code implementations • ICLR 2019 • Huan Zhang, Hongge Chen, Zhao Song, Duane Boning, Inderjit S. Dhillon, Cho-Jui Hsieh

In our paper, we shed some light on the practicality and the hardness of adversarial training by showing that the effectiveness (robustness on test set) of adversarial training has a strong correlation with the distance between a test point and the manifold of training data embedded by the network.

valid

MLSys: The New Frontier of Machine Learning Systems

no code implementations • 29 Mar 2019 • Alexander Ratner, Dan Alistarh, Gustavo Alonso, David G. Andersen, Peter Bailis, Sarah Bird, Nicholas Carlini, Bryan Catanzaro, Jennifer Chayes, Eric Chung, Bill Dally, Jeff Dean, Inderjit S. Dhillon, Alexandros Dimakis, Pradeep Dubey, Charles Elkan, Grigori Fursin, Gregory R. Ganger, Lise Getoor, Phillip B. Gibbons, Garth A. Gibson, Joseph E. Gonzalez, Justin Gottschlich, Song Han, Kim Hazelwood, Furong Huang, Martin Jaggi, Kevin Jamieson, Michael I. Jordan, Gauri Joshi, Rania Khalaf, Jason Knight, Jakub Konečný, Tim Kraska, Arun Kumar, Anastasios Kyrillidis, Aparna Lakshmiratan, Jing Li, Samuel Madden, H. Brendan McMahan, Erik Meijer, Ioannis Mitliagkas, Rajat Monga, Derek Murray, Kunle Olukotun, Dimitris Papailiopoulos, Gennady Pekhimenko, Theodoros Rekatsinas, Afshin Rostamizadeh, Christopher Ré, Christopher De Sa, Hanie Sedghi, Siddhartha Sen, Virginia Smith, Alex Smola, Dawn Song, Evan Sparks, Ion Stoica, Vivienne Sze, Madeleine Udell, Joaquin Vanschoren, Shivaram Venkataraman, Rashmi Vinayak, Markus Weimer, Andrew Gordon Wilson, Eric Xing, Matei Zaharia, Ce Zhang, Ameet Talwalkar

Machine learning (ML) techniques are enjoying rapidly increasing adoption.

BIG-bench Machine Learning

AutoAssist: A Framework to Accelerate Training of Deep Neural Networks

1 code implementation • NeurIPS 2019 • Jiong Zhang, Hsiang-Fu Yu, Inderjit S. Dhillon

In this paper, we propose AutoAssist, a simple framework to accelerate training of a deep neural network.

Image Classification

Primal-Dual Block Frank-Wolfe

1 code implementation • 6 Jun 2019 • Qi Lei, Jiacheng Zhuo, Constantine Caramanis, Inderjit S. Dhillon, Alexandros G. Dimakis

We propose a variant of the Frank-Wolfe algorithm for solving a class of sparse/low-rank optimization problems.

General Classification Multi-class Classification +1

Inverting Deep Generative models, One layer at a time

1 code implementation • NeurIPS 2019 • Qi Lei, Ajil Jalal, Inderjit S. Dhillon, Alexandros G. Dimakis

For generative models of arbitrary depth, we show that exact recovery is possible in polynomial time with high probability, if the layers are expanding and the weights are randomly selected.

Multiresolution Transformer Networks: Recurrence is Not Essential for Modeling Hierarchical Structure

no code implementations • 27 Aug 2019 • Vikas K. Garg, Inderjit S. Dhillon, Hsiang-Fu Yu

The architecture of Transformer is based entirely on self-attention, and has been shown to outperform models that employ recurrence on sequence transduction tasks such as machine translation.

Machine Translation Translation

Provable Non-linear Inductive Matrix Completion

no code implementations • NeurIPS 2019 • Kai Zhong, Zhao Song, Prateek Jain, Inderjit S. Dhillon

The inductive matrix completion (IMC) method is a standard approach for this problem, where the given query as well as the items are embedded in a common low-dimensional space.

Matrix Completion Retrieval

Primal-Dual Block Generalized Frank-Wolfe

1 code implementation • NeurIPS 2019 • Qi Lei, Jiacheng Zhuo, Constantine Caramanis, Inderjit S. Dhillon, Alexandros G. Dimakis

We propose a generalized variant of the Frank-Wolfe algorithm for solving a class of sparse/low-rank optimization problems.

Multi-class Classification Retrieval

Non-Exhaustive, Overlapping Co-Clustering: An Extended Analysis

no code implementations • 24 Apr 2020 • Joyce Jiyoung Whang, Inderjit S. Dhillon

To solve this problem, we propose intuitive objective functions, and develop an efficient iterative algorithm which we call the NEO-CC algorithm.

Clustering

Learning from eXtreme Bandit Feedback

no code implementations • 27 Sep 2020 • Romain Lopez, Inderjit S. Dhillon, Michael I. Jordan

In POXM, the selected actions for the sIS estimator are the top-p actions of the logging policy, where p is adjusted from the data and is significantly smaller than the size of the action space.

Extreme Multi-Label Classification Recommendation Systems

PECOS: Prediction for Enormous and Correlated Output Spaces

no code implementations • 12 Oct 2020 • Hsiang-Fu Yu, Kai Zhong, Jiong Zhang, Wei-Cheng Chang, Inderjit S. Dhillon

In this paper, we propose the Prediction for Enormous and Correlated Output Spaces (PECOS) framework, a versatile and modular machine learning framework for solving prediction problems for very large output spaces, and apply it to the eXtreme Multilabel Ranking (XMR) problem: given an input instance, find and rank the most relevant items from an enormous but fixed and finite output space.

Faster Non-Convex Federated Learning via Global and Local Momentum

no code implementations • 7 Dec 2020 • Rudrajit Das, Anish Acharya, Abolfazl Hashemi, Sujay Sanghavi, Inderjit S. Dhillon, Ufuk Topcu

We propose \texttt{FedGLOMO}, a novel federated learning (FL) algorithm with an iteration complexity of $\mathcal{O}(\epsilon^{-1.5})$ to converge to an $\epsilon$-stationary point (i.e., $\mathbb{E}[\|\nabla f(\bm{x})\|^2] \leq \epsilon$) for smooth non-convex functions -- under arbitrary client heterogeneity and compressed communication -- compared to the $\mathcal{O}(\epsilon^{-2})$ complexity of most prior works.

Federated Learning

Session-Aware Query Auto-completion using Extreme Multi-label Ranking

1 code implementation • 9 Dec 2020 • Nishant Yadav, Rajat Sen, Daniel N. Hill, Arya Mazumdar, Inderjit S. Dhillon

Previous queries in the user session can provide useful context for the user's intent and can be leveraged to suggest auto-completions that are more relevant while adhering to the user's prefix.

Linear Bandit Algorithms with Sublinear Time Complexity

no code implementations • 3 Mar 2021 • Shuo Yang, Tongzheng Ren, Sanjay Shakkottai, Eric Price, Inderjit S. Dhillon, Sujay Sanghavi

For sufficiently large $K$, our algorithms have sublinear per-step complexity and $\tilde O(\sqrt{T})$ regret.

Movie Recommendation

Combinatorial Bandits without Total Order for Arms

no code implementations • 3 Mar 2021 • Shuo Yang, Tongzheng Ren, Inderjit S. Dhillon, Sujay Sanghavi

Specifically, we focus on a challenging setting where 1) the reward distribution of an arm depends on the set $s$ it is part of, and crucially 2) there is \textit{no total order} for the arms in $\mathcal{A}$.

Enabling Efficiency-Precision Trade-offs for Label Trees in Extreme Classification

no code implementations • 1 Jun 2021 • Tavor Z. Baharav, Daniel L. Jiang, Kedarnath Kolluri, Sujay Sanghavi, Inderjit S. Dhillon

For such applications, a common approach is to organize these labels into a tree, enabling training and inference times that are logarithmic in the number of labels.

Extreme Multi-Label Classification TAG

On the Convergence of Differentially Private Federated Learning on Non-Lipschitz Objectives, and with Normalized Client Updates

no code implementations • 13 Jun 2021 • Rudrajit Das, Abolfazl Hashemi, Sujay Sanghavi, Inderjit S. Dhillon

The primary reason for this is that the clipping operation (i.e., projection onto an $\ell_2$ ball of a fixed radius called the clipping threshold) for bounding the sensitivity of the average update to each client's update introduces bias depending on the clipping threshold and the number of local steps in FL, and analyzing this is not easy.

Benchmarking Federated Learning +1

Robust Training in High Dimensions via Block Coordinate Geometric Median Descent

2 code implementations • 16 Jun 2021 • Anish Acharya, Abolfazl Hashemi, Prateek Jain, Sujay Sanghavi, Inderjit S. Dhillon, Ufuk Topcu

Geometric median (\textsc{Gm}) is a classical method in statistics for achieving a robust estimation of the uncorrupted data; under gross corruption, it achieves the optimal breakdown point of 0.5.

Ranked #19 on Image Classification on MNIST (Accuracy metric)

Image Classification Vocal Bursts Intensity Prediction

Paper
Code
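
To make the robustness claim in the entry above concrete, here is a minimal sketch of a geometric median computed with Weiszfeld's iteration, a standard textbook method. It is not the block coordinate algorithm the paper proposes, and the function name, iteration count, and `eps` guard are assumptions made for illustration.

```python
import math


def geometric_median(points, iters=200, eps=1e-9):
    """Approximate the geometric median with Weiszfeld's iteration.

    Each step re-weights every point by the inverse of its distance to the
    current estimate; `eps` guards against division by zero.
    """
    dim = len(points[0])
    # Start from the coordinate-wise mean.
    y = [sum(p[i] for p in points) / len(points) for i in range(dim)]
    for _ in range(iters):
        num = [0.0] * dim
        den = 0.0
        for p in points:
            dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(p, y)))
            w = 1.0 / max(dist, eps)
            den += w
            for i in range(dim):
                num[i] += w * p[i]
        y = [v / den for v in num]
    return y


if __name__ == "__main__":
    clean = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
             (-1.0, 0.0), (0.0, -1.0), (0.0, 0.0)]
    corrupted = [(1000.0, 1000.0)] * 4  # 40% of the points are outliers
    print(geometric_median(clean + corrupted))
```

With fewer than half of the points corrupted, the estimate stays near the clean cluster, whereas the coordinate-wise mean is dragged far toward the outliers.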

Extreme Multi-label Learning for Semantic Matching in Product Search

1 code implementation • 23 Jun 2021 • Wei-Cheng Chang, Daniel Jiang, Hsiang-Fu Yu, Choon-Hui Teo, Jiong Zhang, Kai Zhong, Kedarnath Kolluri, Qie Hu, Nikhil Shandilya, Vyacheslav Ievgrafov, Japinder Singh, Inderjit S. Dhillon

In this paper, we aim to improve semantic product search by using tree-based XMC models where inference time complexity is logarithmic in the number of products.

Extreme Multi-Label Classification Multi-Label Learning

Paper
Code

Label Disentanglement in Partition-based Extreme Multilabel Classification

no code implementations • NeurIPS 2021 • Xuanqing Liu, Wei-Cheng Chang, Hsiang-Fu Yu, Cho-Jui Hsieh, Inderjit S. Dhillon

Partition-based methods are increasingly used in extreme multi-label classification (XMC) problems due to their scalability to large output spaces (e.g., millions or more).

Classification Disentanglement +1

Paper
Add Code

Fast Multi-Resolution Transformer Fine-tuning for Extreme Multi-label Text Classification

1 code implementation • NeurIPS 2021 • Jiong Zhang, Wei-Cheng Chang, Hsiang-Fu Yu, Inderjit S. Dhillon

Despite leveraging pre-trained transformer models for text representation, the fine-tuning procedure of transformer models on large label space still has lengthy computational time even with powerful GPUs.

Multi Label Text Classification Multi-Label Text Classification +2

Paper
Code

Cluster-and-Conquer: A Framework For Time-Series Forecasting

no code implementations • 26 Oct 2021 • Reese Pathak, Rajat Sen, Nikhil Rao, N. Benjamin Erichson, Michael I. Jordan, Inderjit S. Dhillon

Our framework -- which we refer to as "cluster-and-conquer" -- is highly general, allowing for any time-series forecasting and clustering method to be used in each step.

Time Series Time Series Forecasting

Paper
Add Code

Sample Efficiency of Data Augmentation Consistency Regularization

no code implementations • 24 Feb 2022 • Shuo Yang, Yijun Dong, Rachel Ward, Inderjit S. Dhillon, Sujay Sanghavi, Qi Lei

Data augmentation is popular in the training of large neural networks; currently, however, there is no clear theoretical comparison between different algorithmic choices on how to use augmented data.

Data Augmentation Generalization Bounds

Paper
Add Code

Counterfactual Learning To Rank for Utility-Maximizing Query Autocompletion

no code implementations • 22 Apr 2022 • Adam Block, Rahul Kidambi, Daniel N. Hill, Thorsten Joachims, Inderjit S. Dhillon

A shortcoming of this approach is that users often do not know which query will provide the best retrieval performance on the current information retrieval system, meaning that any query autocompletion methods trained to mimic user behavior can lead to suboptimal query suggestions.

counterfactual Information Retrieval +2

Paper
Add Code

FINGER: Fast Inference for Graph-based Approximate Nearest Neighbor Search

no code implementations • 22 Jun 2022 • Patrick H. Chen, Wei-Cheng Chang, Hsiang-Fu Yu, Inderjit S. Dhillon, Cho-Jui Hsieh

Approximate K-Nearest Neighbor Search (AKNNS) has now become ubiquitous in modern applications, for example, as a fast search procedure with two tower deep learning models.

Paper
Add Code

Automatic Engineering of Long Prompts

no code implementations • 16 Nov 2023 • Cho-Jui Hsieh, Si Si, Felix X. Yu, Inderjit S. Dhillon

Large language models (LLMs) have demonstrated remarkable capabilities in solving complex open-domain tasks, guided by comprehensive instructions and demonstrations provided in the form of prompts.

Prompt Engineering

Paper
Add Code

Towards Quantifying the Preconditioning Effect of Adam

no code implementations • 11 Feb 2024 • Rudrajit Das, Naman Agarwal, Sujay Sanghavi, Inderjit S. Dhillon

Specifically, for a $d$-dimensional quadratic with a diagonal Hessian having condition number $\kappa$, we show that the effective condition number-like quantity controlling the iteration complexity of Adam without momentum is $\mathcal{O}(\min(d, \kappa))$.

Paper
Add Code
