no code implementations • 5 Nov 2024 • Sai Surya Duvvuri, Inderjit S. Dhillon
To this end, we introduce a new attention mechanism called LASER, which we analytically show to admit a larger gradient signal.
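The excerpt does not spell out the mechanism, so the sketch below is only a guess at the general idea: apply attention in an exponentiated value space and map back to log space afterwards, computed stably with a max-shift. All function names, shapes, and the exact formula are assumptions, not the paper's reference implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def laser_attention(Q, K, V):
    """Hypothetical sketch: attention applied to exp(V), returned in log space.

    Computes log(softmax(Q K^T / sqrt(d)) @ exp(V)) with a max-shift on V for
    numerical stability. An illustration of attending in exponentiated value
    space, not the authors' implementation.
    """
    d = Q.shape[-1]
    A = softmax(Q @ K.T / np.sqrt(d))      # (n, n) attention weights
    m = V.max(axis=0, keepdims=True)       # stabilize exp
    return m + np.log(A @ np.exp(V - m))   # (n, d_v)

rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 5, 8))
print(laser_attention(Q, K, V).shape)      # (5, 8)
```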
no code implementations • 27 Oct 2024 • Jui-Nan Yen, Si Si, Zhao Meng, Felix Yu, Sai Surya Duvvuri, Inderjit S. Dhillon, Cho-Jui Hsieh, Sanjiv Kumar
Low-rank adaptation (LoRA) is a widely used parameter-efficient finetuning method for LLMs that reduces memory requirements.
no code implementations • 17 Jun 2024 • Rudrajit Das, Inderjit S. Dhillon, Alessandro Epasto, Adel Javanmard, Jieming Mao, Vahab Mirrokni, Sujay Sanghavi, Peilin Zhong
In this paper, we theoretically analyze retraining in a linearly separable setting where the given labels are randomly corrupted, and prove that retraining can improve the population accuracy over that obtained by initially training with the given (noisy) labels.
no code implementations • 11 Feb 2024 • Rudrajit Das, Naman Agarwal, Sujay Sanghavi, Inderjit S. Dhillon
Specifically, for a $d$-dimensional quadratic with a diagonal Hessian having condition number $\kappa$, we show that the effective condition number-like quantity controlling the iteration complexity of Adam without momentum is $\mathcal{O}(\min(d, \kappa))$.
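A hedged toy experiment illustrating the flavor of that claim: Adam without momentum ($\beta_1 = 0$) on an ill-conditioned diagonal quadratic, compared with plain gradient descent. The dimensions, step sizes, and decay schedule below are my own choices, not the paper's setup.

```python
import numpy as np

# Diagonal quadratic f(x) = 0.5 * sum(h_i * x_i^2) with condition number 1e4.
d = 50
h = np.logspace(0, 4, d)              # Hessian diagonal, eigenvalues in [1, 1e4]

def grad(x):
    return h * x

def gd(x, lr, steps):
    for _ in range(steps):
        x = x - lr * grad(x)
    return x

def adam_no_momentum(x, lr0, steps, beta2=0.999, eps=1e-8):
    v = np.zeros_like(x)
    for t in range(1, steps + 1):
        g = grad(x)
        v = beta2 * v + (1 - beta2) * g**2
        vhat = v / (1 - beta2**t)                       # bias correction
        x = x - (lr0 / np.sqrt(t)) * g / (np.sqrt(vhat) + eps)
    return x

x0, steps = np.ones(d), 2000
# GD's step size is capped by the largest eigenvalue; Adam's per-coordinate
# normalization adapts to the spread of curvatures.
print("GD   final loss:", 0.5 * np.sum(h * gd(x0, 1.9 / h.max(), steps)**2))
print("Adam final loss:", 0.5 * np.sum(h * adam_no_momentum(x0, 0.05, steps)**2))
```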
no code implementations • 16 Nov 2023 • Cho-Jui Hsieh, Si Si, Felix X. Yu, Inderjit S. Dhillon
Large language models (LLMs) have demonstrated remarkable capabilities in solving complex open-domain tasks, guided by comprehensive instructions and demonstrations provided in the form of prompts.
no code implementations • 22 Jun 2022 • Patrick H. Chen, Wei-Cheng Chang, Hsiang-Fu Yu, Inderjit S. Dhillon, Cho-Jui Hsieh
Approximate K-Nearest Neighbor Search (AKNNS) has now become ubiquitous in modern applications, for example as a fast search procedure with two-tower deep learning models.
no code implementations • 22 Apr 2022 • Adam Block, Rahul Kidambi, Daniel N. Hill, Thorsten Joachims, Inderjit S. Dhillon
A shortcoming of this approach is that users often do not know which query will provide the best retrieval performance on the current information retrieval system, meaning that any query autocompletion methods trained to mimic user behavior can lead to suboptimal query suggestions.
no code implementations • 24 Feb 2022 • Shuo Yang, Yijun Dong, Rachel Ward, Inderjit S. Dhillon, Sujay Sanghavi, Qi Lei
Data augmentation is popular in the training of large neural networks; currently, however, there is no clear theoretical comparison between different algorithmic choices on how to use augmented data.
no code implementations • 26 Oct 2021 • Reese Pathak, Rajat Sen, Nikhil Rao, N. Benjamin Erichson, Michael I. Jordan, Inderjit S. Dhillon
Our framework -- which we refer to as "cluster-and-conquer" -- is highly general, allowing for any time-series forecasting and clustering method to be used in each step.
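A minimal sketch of such a two-step pipeline, with plain k-means for the clustering step and a per-cluster AR(1) fit for the forecasting step; both components are stand-ins that the framework would let you swap out, and none of this is the paper's code.

```python
import numpy as np

def cluster_and_conquer_forecast(Y, k=3, iters=20, seed=0):
    """Hedged sketch of a cluster-then-forecast pipeline.

    Y: (num_series, T) array. Step 1: k-means on the raw series (any
    clustering method could be substituted). Step 2: fit a per-cluster AR(1)
    coefficient by least squares and forecast one step ahead for each member.
    """
    rng = np.random.default_rng(seed)
    centers = Y[rng.choice(len(Y), size=k, replace=False)]
    for _ in range(iters):                          # plain Lloyd's k-means
        dists = ((Y[:, None, :] - centers[None]) ** 2).sum(-1)
        labels = dists.argmin(1)
        for c in range(k):
            if (labels == c).any():
                centers[c] = Y[labels == c].mean(0)
    forecasts = np.empty(len(Y))
    for c in range(k):                              # per-cluster AR(1) fit
        members = Y[labels == c]
        if len(members) == 0:
            continue
        past, future = members[:, :-1].ravel(), members[:, 1:].ravel()
        phi = past @ future / (past @ past + 1e-12)
        forecasts[labels == c] = phi * members[:, -1]
    return forecasts

Y = np.cumsum(np.random.default_rng(1).normal(size=(12, 40)), axis=1)
print(cluster_and_conquer_forecast(Y))
```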
1 code implementation • NeurIPS 2021 • Jiong Zhang, Wei-Cheng Chang, Hsiang-Fu Yu, Inderjit S. Dhillon
Despite leveraging pre-trained transformer models for text representation, the fine-tuning procedure of transformer models on a large label space still has a lengthy computational time, even with powerful GPUs.
Multi-Label Text Classification +2
no code implementations • NeurIPS 2021 • Xuanqing Liu, Wei-Cheng Chang, Hsiang-Fu Yu, Cho-Jui Hsieh, Inderjit S. Dhillon
Partition-based methods are increasingly used in extreme multi-label classification (XMC) problems due to their scalability to large output spaces (e.g., millions or more).
1 code implementation • 23 Jun 2021 • Wei-Cheng Chang, Daniel Jiang, Hsiang-Fu Yu, Choon-Hui Teo, Jiong Zhang, Kai Zhong, Kedarnath Kolluri, Qie Hu, Nikhil Shandilya, Vyacheslav Ievgrafov, Japinder Singh, Inderjit S. Dhillon
In this paper, we aim to improve semantic product search by using tree-based XMC models where inference time complexity is logarithmic in the number of products.
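A hedged sketch of why the inference cost is logarithmic: with labels at the leaves of a tree and a scorer at every node, beam search only expands a constant number of nodes per level. The toy scorer and tree layout below are assumptions for illustration, not the production system.

```python
import numpy as np

def beam_search_label_tree(x, tree, node_weights, beam=4):
    """Hedged sketch of tree-based XMC inference.

    tree: dict node id -> list of child ids (leaves map to []).
    node_weights: dict node id -> weight vector; score = sigmoid(w @ x).
    Starting from the root (node 0), keep the `beam` best frontier nodes at
    each level, so inference touches O(beam * depth) nodes instead of every
    label. Toy scorer, not the authors' implementation.
    """
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    frontier, leaves = [(0, 1.0)], []               # (node, path probability)
    while frontier:
        expanded = []
        for node, p in frontier:
            for child in tree[node]:
                expanded.append((child, p * sigmoid(node_weights[child] @ x)))
        if not expanded:
            break
        expanded.sort(key=lambda t: -t[1])
        frontier = [t for t in expanded if tree[t[0]]][:beam]   # internal nodes
        leaves += [t for t in expanded if not tree[t[0]]]       # collect labels
    leaves.sort(key=lambda t: -t[1])
    return leaves[:beam]

# Tiny 2-level tree: root 0 -> {1, 2}; 1 -> {3, 4}; 2 -> {5, 6} (labels 3..6).
tree = {0: [1, 2], 1: [3, 4], 2: [5, 6], 3: [], 4: [], 5: [], 6: []}
rng = np.random.default_rng(0)
node_weights = {n: rng.normal(size=8) for n in tree}
print(beam_search_label_tree(rng.normal(size=8), tree, node_weights))
```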
2 code implementations • 16 Jun 2021 • Anish Acharya, Abolfazl Hashemi, Prateek Jain, Sujay Sanghavi, Inderjit S. Dhillon, Ufuk Topcu
Geometric median (\textsc{Gm}) is a classical method in statistics for achieving a robust estimation of the uncorrupted data; under gross corruption, it achieves the optimal breakdown point of 0.5.
Ranked #22 on Image Classification on MNIST (Accuracy metric)
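Weiszfeld's fixed-point iteration is the classical way to compute the geometric median; a minimal sketch follows (the smoothing constant eps is an implementation detail I added to avoid division by zero, and this plain variant is not necessarily the one analyzed in the paper).

```python
import numpy as np

def geometric_median(X, iters=100, eps=1e-8):
    """Weiszfeld's algorithm: iteratively re-weighted average of the rows of X.

    Each point is weighted by the inverse of its distance to the current
    estimate, which down-weights outliers; eps guards against division by
    zero when the estimate lands exactly on a data point.
    """
    z = X.mean(axis=0)                       # start from the (non-robust) mean
    for _ in range(iters):
        w = 1.0 / np.maximum(np.linalg.norm(X - z, axis=1), eps)
        z = (w[:, None] * X).sum(0) / w.sum()
    return z

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
X[:10] += 100.0                              # gross corruption of 20% of points
print("mean:            ", X.mean(0))        # dragged toward the outliers
print("geometric median:", geometric_median(X))  # stays near the inliers
```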
no code implementations • 13 Jun 2021 • Rudrajit Das, Abolfazl Hashemi, Sujay Sanghavi, Inderjit S. Dhillon
The primary reason is that the clipping operation (i.e., projection onto an $\ell_2$ ball of a fixed radius called the clipping threshold) used to bound the sensitivity of the average update to each client's update introduces a bias that depends on the clipping threshold and the number of local steps in FL, and analyzing this bias is not easy.
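The clipping operation itself is easy to state in code; a minimal sketch (variable names are mine) of projecting a client update onto an $\ell_2$ ball of radius equal to the clipping threshold, and of the bias that appears when clipped updates are averaged:

```python
import numpy as np

def clip_update(update, clip_threshold):
    """Project `update` onto the l2 ball of radius `clip_threshold`.

    An update already inside the ball is returned unchanged; otherwise it is
    rescaled to have norm exactly `clip_threshold`. The bias discussed above
    arises because this shrinks large client updates more than small ones
    before they are averaged.
    """
    norm = np.linalg.norm(update)
    return update * min(1.0, clip_threshold / max(norm, 1e-12))

updates = [np.array([3.0, 4.0]), np.array([0.3, 0.4])]   # norms 5.0 and 0.5
clipped = [clip_update(u, clip_threshold=1.0) for u in updates]
print(np.mean(clipped, axis=0))  # average of clipped != clip of the average
```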
no code implementations • 1 Jun 2021 • Tavor Z. Baharav, Daniel L. Jiang, Kedarnath Kolluri, Sujay Sanghavi, Inderjit S. Dhillon
For such applications, a common approach is to organize these labels into a tree, enabling training and inference times that are logarithmic in the number of labels.
no code implementations • 3 Mar 2021 • Shuo Yang, Tongzheng Ren, Inderjit S. Dhillon, Sujay Sanghavi
Specifically, we focus on a challenging setting where 1) the reward distribution of an arm depends on the set $s$ it is part of, and crucially 2) there is \textit{no total order} for the arms in $\mathcal{A}$.
no code implementations • 3 Mar 2021 • Shuo Yang, Tongzheng Ren, Sanjay Shakkottai, Eric Price, Inderjit S. Dhillon, Sujay Sanghavi
For sufficiently large $K$, our algorithms have sublinear per-step complexity and $\tilde O(\sqrt{T})$ regret.
1 code implementation • 9 Dec 2020 • Nishant Yadav, Rajat Sen, Daniel N. Hill, Arya Mazumdar, Inderjit S. Dhillon
Previous queries in the user session can provide useful context for the user's intent and can be leveraged to suggest auto-completions that are more relevant while adhering to the user's prefix.
no code implementations • 7 Dec 2020 • Rudrajit Das, Anish Acharya, Abolfazl Hashemi, Sujay Sanghavi, Inderjit S. Dhillon, Ufuk Topcu
We propose \texttt{FedGLOMO}, a novel federated learning (FL) algorithm with an iteration complexity of $\mathcal{O}(\epsilon^{-1.5})$ to converge to an $\epsilon$-stationary point (i.e., $\mathbb{E}[\|\nabla f(\bm{x})\|^2] \leq \epsilon$) for smooth non-convex functions -- under arbitrary client heterogeneity and compressed communication -- compared to the $\mathcal{O}(\epsilon^{-2})$ complexity of most prior works.
no code implementations • 12 Oct 2020 • Hsiang-Fu Yu, Kai Zhong, Jiong Zhang, Wei-Cheng Chang, Inderjit S. Dhillon
In this paper, we propose the Prediction for Enormous and Correlated Output Spaces (PECOS) framework, a versatile and modular machine learning framework for solving prediction problems for very large output spaces, and apply it to the eXtreme Multilabel Ranking (XMR) problem: given an input instance, find and rank the most relevant items from an enormous but fixed and finite output space.
no code implementations • 27 Sep 2020 • Romain Lopez, Inderjit S. Dhillon, Michael I. Jordan
In POXM, the selected actions for the sIS estimator are the top-p actions of the logging policy, where p is adjusted from the data and is significantly smaller than the size of the action space.
no code implementations • 24 Apr 2020 • Joyce Jiyoung Whang, Inderjit S. Dhillon
To solve this problem, we propose intuitive objective functions, and develop an efficient iterative algorithm which we call the NEO-CC algorithm.
1 code implementation • NeurIPS 2019 • Qi Lei, Jiacheng Zhuo, Constantine Caramanis, Inderjit S. Dhillon, Alexandros G. Dimakis
We propose a generalized variant of Frank-Wolfe algorithm for solving a class of sparse/low-rank optimization problems.
no code implementations • NeurIPS 2019 • Kai Zhong, Zhao Song, Prateek Jain, Inderjit S. Dhillon
The inductive matrix completion (IMC) method is a standard approach for this problem, where the given query as well as the items are embedded in a common low-dimensional space.
no code implementations • 27 Aug 2019 • Vikas K. Garg, Inderjit S. Dhillon, Hsiang-Fu Yu
The Transformer architecture is based entirely on self-attention and has been shown to outperform models that employ recurrence on sequence transduction tasks such as machine translation.
1 code implementation • NeurIPS 2019 • Qi Lei, Ajil Jalal, Inderjit S. Dhillon, Alexandros G. Dimakis
For generative models of arbitrary depth, we show that exact recovery is possible in polynomial time with high probability, if the layers are expanding and the weights are randomly selected.
1 code implementation • 6 Jun 2019 • Qi Lei, Jiacheng Zhuo, Constantine Caramanis, Inderjit S. Dhillon, Alexandros G. Dimakis
We propose a variant of the Frank-Wolfe algorithm for solving a class of sparse/low-rank optimization problems.
1 code implementation • NeurIPS 2019 • Jiong Zhang, Hsiang-Fu Yu, Inderjit S. Dhillon
In this paper, we propose AutoAssist, a simple framework to accelerate training of a deep neural network.
no code implementations • 29 Mar 2019 • Alexander Ratner, Dan Alistarh, Gustavo Alonso, David G. Andersen, Peter Bailis, Sarah Bird, Nicholas Carlini, Bryan Catanzaro, Jennifer Chayes, Eric Chung, Bill Dally, Jeff Dean, Inderjit S. Dhillon, Alexandros Dimakis, Pradeep Dubey, Charles Elkan, Grigori Fursin, Gregory R. Ganger, Lise Getoor, Phillip B. Gibbons, Garth A. Gibson, Joseph E. Gonzalez, Justin Gottschlich, Song Han, Kim Hazelwood, Furong Huang, Martin Jaggi, Kevin Jamieson, Michael I. Jordan, Gauri Joshi, Rania Khalaf, Jason Knight, Jakub Konečný, Tim Kraska, Arun Kumar, Anastasios Kyrillidis, Aparna Lakshmiratan, Jing Li, Samuel Madden, H. Brendan McMahan, Erik Meijer, Ioannis Mitliagkas, Rajat Monga, Derek Murray, Kunle Olukotun, Dimitris Papailiopoulos, Gennady Pekhimenko, Theodoros Rekatsinas, Afshin Rostamizadeh, Christopher Ré, Christopher De Sa, Hanie Sedghi, Siddhartha Sen, Virginia Smith, Alex Smola, Dawn Song, Evan Sparks, Ion Stoica, Vivienne Sze, Madeleine Udell, Joaquin Vanschoren, Shivaram Venkataraman, Rashmi Vinayak, Markus Weimer, Andrew Gordon Wilson, Eric Xing, Matei Zaharia, Ce Zhang, Ameet Talwalkar
Machine learning (ML) techniques are enjoying rapidly increasing adoption.
no code implementations • ICLR 2019 • Huan Zhang, Hongge Chen, Zhao Song, Duane Boning, Inderjit S. Dhillon, Cho-Jui Hsieh
In our paper, we shed some light on the practicality and the hardness of adversarial training by showing that the effectiveness (robustness on the test set) of adversarial training has a strong correlation with the distance between a test point and the manifold of training data embedded by the network.
1 code implementation • 1 Dec 2018 • Qi Lei, Lingfei Wu, Pin-Yu Chen, Alexandros G. Dimakis, Inderjit S. Dhillon, Michael Witbrock
In this paper we formulate the attacks with discrete input on a set function as an optimization task.
no code implementations • NIPS Workshop CDNNRIA 2018 • Wei-Cheng Chang, Hsiang-Fu Yu, Inderjit S. Dhillon, Yiming Yang
To circumvent the softmax bottleneck, SeCSeq compresses labels into sequences of semantic-aware compact codes, on which Seq2Seq models are trained.
no code implementations • 26 May 2018 • Kai Zhong, Zhao Song, Prateek Jain, Inderjit S. Dhillon
A standard approach to modeling this problem is Inductive Matrix Completion where the predicted rating is modeled as an inner product of the user and the item features projected onto a latent space.
6 code implementations • ICML 2018 • Tsui-Wei Weng, Huan Zhang, Hongge Chen, Zhao Song, Cho-Jui Hsieh, Duane Boning, Inderjit S. Dhillon, Luca Daniel
Verifying the robustness property of a general Rectified Linear Unit (ReLU) network is an NP-complete problem [Katz, Barrett, Dill, Julian and Kochenderfer CAV17].
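As a hedged, much looser relative of the certification methods in this line of work, the sketch below propagates interval bounds through a small ReLU network to soundly certify robustness within an $\ell_\infty$ ball; this is plain interval bound propagation, not the paper's (tighter) Fast-Lin-style algorithm.

```python
import numpy as np

def interval_bounds(W_list, b_list, x, eps):
    """Propagate elementwise interval bounds through a ReLU network.

    For each affine layer z = W a + b, lower/upper bounds follow from the
    signs of W; ReLU clamps the bounds at zero. Looser than Fast-Lin, but
    sound: the true outputs always lie inside [lo, hi].
    """
    lo, hi = x - eps, x + eps
    for i, (W, b) in enumerate(zip(W_list, b_list)):
        mid, rad = (lo + hi) / 2, (hi - lo) / 2
        center = W @ mid + b
        radius = np.abs(W) @ rad
        lo, hi = center - radius, center + radius
        if i < len(W_list) - 1:                 # ReLU on hidden layers only
            lo, hi = np.maximum(lo, 0), np.maximum(hi, 0)
    return lo, hi

rng = np.random.default_rng(0)
W_list = [rng.normal(size=(16, 4)), rng.normal(size=(3, 16))]
b_list = [rng.normal(size=16), rng.normal(size=3)]
x = rng.normal(size=4)
lo, hi = interval_bounds(W_list, b_list, x, eps=0.01)
pred = (W_list[1] @ np.maximum(W_list[0] @ x + b_list[0], 0) + b_list[1]).argmax()
# Certified if the predicted logit's lower bound beats every other upper bound.
print("certified robust:", all(lo[pred] > hi[j] for j in range(3) if j != pred))
```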
1 code implementation • ICML 2018 • Jiong Zhang, Qi Lei, Inderjit S. Dhillon
Theoretically, we demonstrate that our parameterization does not lose any expressive power, and show how it controls generalization of RNN for the classification task.
2 code implementations • ICML 2018 • Jiong Zhang, Yibo Lin, Zhao Song, Inderjit S. Dhillon
In this paper we propose a simple recurrent architecture, the Fourier Recurrent Unit (FRU), that stabilizes the gradients that arise in its training while giving us stronger expressive power.
no code implementations • ICLR 2018 • Po-Wei Wang, J. Zico Kolter, Vijai Mohan, Inderjit S. Dhillon
Search engine users today depend heavily on query completion and correction to shape their queries.
no code implementations • 8 Nov 2017 • Kai Zhong, Zhao Song, Inderjit S. Dhillon
In this paper, we consider parameter recovery for non-overlapping convolutional neural networks (CNNs) with multiple kernels.
no code implementations • ICML 2017 • Si Si, Huan Zhang, S. Sathiya Keerthi, Dhruv Mahajan, Inderjit S. Dhillon, Cho-Jui Hsieh
In this paper, we study gradient boosted decision trees (GBDT) when the output space is high-dimensional and sparse.
no code implementations • ICML 2017 • Qi Lei, Ian En-Hsu Yen, Chao-yuan Wu, Inderjit S. Dhillon, Pradeep Ravikumar
We consider the popular problem of sparse empirical risk minimization with linear predictors and a large number of both features and observations.
no code implementations • ICML 2017 • Kai Zhong, Zhao Song, Prateek Jain, Peter L. Bartlett, Inderjit S. Dhillon
For activation functions that are also smooth, we show $\mathit{local~linear~convergence}$ guarantees of gradient descent under a resampling rule.
no code implementations • 12 Feb 2017 • Qi Lei, Jin-Feng Yi, Roman Vaculin, Lingfei Wu, Inderjit S. Dhillon
Many clustering algorithms take instance-feature matrices as their inputs.
no code implementations • NeurIPS 2016 • Prateek Jain, Nikhil Rao, Inderjit S. Dhillon
Several learning applications require solving high-dimensional regression problems where the relevant features belong to a small number of (overlapping) groups.
no code implementations • NeurIPS 2016 • Qi Lei, Kai Zhong, Inderjit S. Dhillon
The vanilla power method simultaneously updates all the coordinates of the iterate, which is essential for its convergence analysis.
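A hedged sketch of the contrast drawn above: update one coordinate of the power-method iterate at a time, keeping $y = Ax$ up to date incrementally so each step costs $O(n)$ instead of the $O(n^2)$ of a full power step. The greedy selection rule below is my guess at the spirit of the method, not the paper's exact algorithm.

```python
import numpy as np

def coordinatewise_power_method(A, iters=5000, seed=0):
    """Hedged sketch: single-coordinate updates of the power-method iterate.

    Maintains the invariant y == A @ x incrementally, so each step is O(n).
    The coordinate updated is the one where the normalized target Ax/||Ax||
    differs most from the current iterate.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=A.shape[0])
    x /= np.linalg.norm(x)
    y = A @ x
    for _ in range(iters):
        target = y / np.linalg.norm(y)        # where a full power step would go
        i = np.argmax(np.abs(target - x))     # most out-of-date coordinate
        y += A[:, i] * (target[i] - x[i])     # O(n) incremental update of y
        x[i] = target[i]
        s = np.linalg.norm(x)
        x /= s
        y /= s                                # rescaling preserves y == A @ x
    return x

n = 100
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.normal(size=(n, n)))
vals = np.concatenate([[10.0], rng.uniform(0.0, 1.0, n - 1)])
A = (Q * vals) @ Q.T                          # symmetric PSD with a 10:1 gap
x = coordinatewise_power_method(A)
print("Rayleigh quotient:", x @ A @ x, "(top eigenvalue is 10.0)")
```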
no code implementations • NeurIPS 2016 • Ian En-Hsu Yen, Xiangru Huang, Kai Zhong, Ruohan Zhang, Pradeep K. Ravikumar, Inderjit S. Dhillon
In this work, we show that, by decomposing the training of a Structural Support Vector Machine (SVM) into a series of multiclass SVM problems connected through messages, one can replace the expensive structured oracle with a Factorwise Maximization Oracle (FMO) that allows an efficient implementation with complexity sublinear in the size of the factor domain.
no code implementations • NeurIPS 2016 • Yang You, Xiangru Lian, Ji Liu, Hsiang-Fu Yu, Inderjit S. Dhillon, James Demmel, Cho-Jui Hsieh
In this paper, we propose and study an Asynchronous parallel Greedy Coordinate Descent (Asy-GCD) algorithm for minimizing a smooth function with bounded constraints.
no code implementations • NeurIPS 2016 • Hsiang-Fu Yu, Nikhil Rao, Inderjit S. Dhillon
We develop novel regularization schemes and use scalable matrix factorization methods that are eminently suited for high-dimensional time series data that has many missing values.
no code implementations • NeurIPS 2016 • Kai Zhong, Prateek Jain, Inderjit S. Dhillon
Furthermore, our empirical results indicate that even with random initialization, our approach converges to the global optima in linear time, providing speed-up of up to two orders of magnitude.
no code implementations • NeurIPS 2017 • Hsiang-Fu Yu, Cho-Jui Hsieh, Qi Lei, Inderjit S. Dhillon
Maximum Inner Product Search (MIPS) is an important task in many machine learning applications such as the prediction phase of a low-rank matrix factorization model for a recommender system.
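For reference, a brute-force MIPS baseline plus the classic norm-augmentation reduction of MIPS to nearest-neighbor search (a widely known trick, not specific to this paper):

```python
import numpy as np

def mips_bruteforce(queries, items, k=5):
    """Exact MIPS by scoring every item; the baseline fast methods approximate.

    For a low-rank recommender, `queries` are user factors and `items` are
    item factors; the top-k inner products give the recommendations.
    """
    scores = queries @ items.T                        # (num_queries, num_items)
    topk = np.argpartition(-scores, k, axis=1)[:, :k]
    order = np.take_along_axis(scores, topk, 1).argsort(1)[:, ::-1]
    return np.take_along_axis(topk, order, 1)

def mips_to_nns(items):
    """Classic reduction of MIPS to nearest-neighbor search: append
    sqrt(M^2 - ||v||^2) to each item so all augmented items share one norm
    (queries get a 0 appended); maximizing the inner product then equals
    minimizing Euclidean distance.
    """
    norms = np.linalg.norm(items, axis=1)
    M = norms.max()
    return np.hstack([items, np.sqrt(M**2 - norms**2)[:, None]])

rng = np.random.default_rng(0)
U, V = rng.normal(size=(3, 16)), rng.normal(size=(1000, 16))
print(mips_bruteforce(U, V, k=5))
V_aug = mips_to_nns(V)
print(np.allclose(np.linalg.norm(V_aug, axis=1),
                  np.linalg.norm(V_aug, axis=1)[0]))  # equal norms: True
```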
no code implementations • 5 Aug 2016 • Cho-Jui Hsieh, Si Si, Inderjit S. Dhillon
Kernel machines often yield superior predictive performance on various tasks; however, they suffer from severe computational challenges.
1 code implementation • 2 Jun 2016 • David I. Inouye, Pradeep Ravikumar, Inderjit S. Dhillon
As in the recent work on square root graphical (SQR) models [Inouye et al. 2016], which was restricted to pairwise dependencies, we give the conditions on the parameters that are needed for normalization using radial conditionals, similar to the pairwise case [Inouye et al. 2016].
1 code implementation • ICML 2016 • Ian En-Hsu Yen, Xiangru Huang, Pradeep Ravikumar, Kai Zhong, Inderjit S. Dhillon
In this work, we show that a margin-maximizing loss with an l1 penalty, in the case of extreme classification, yields an extremely sparse solution in both the primal and the dual without sacrificing the expressive power of the predictor.
no code implementations • 31 May 2016 • Jiong Zhang, Parameswaran Raman, Shihao Ji, Hsiang-Fu Yu, S. V. N. Vishwanathan, Inderjit S. Dhillon
Moreover, it requires the parameters to fit in the memory of a single processor; this is problematic when the number of parameters is in the billions.
no code implementations • 11 Mar 2016 • David I. Inouye, Pradeep Ravikumar, Inderjit S. Dhillon
With this motivation, we give an example of our model class derived from the univariate exponential distribution that allows for almost arbitrary positive and negative dependencies with only a mild condition on the parameter matrix, a condition akin to the positive definiteness of the Gaussian covariance matrix.
no code implementations • 5 Feb 2016 • Yangyang Hou, Joyce Jiyoung Whang, David F. Gleich, Inderjit S. Dhillon
In this paper, we consider two fast multiplier methods to accelerate the convergence of an augmented Lagrangian scheme: a proximal method of multipliers and an alternating direction method of multipliers (ADMM).
2 code implementations • NeurIPS 2015 • Nikhil Rao, Hsiang-Fu Yu, Pradeep K. Ravikumar, Inderjit S. Dhillon
Low rank matrix completion plays a fundamental role in collaborative filtering applications, the key idea being that the variables lie in a smaller subspace than the ambient space.
Ranked #1 on Recommendation Systems on Flixster (using extra training data)
no code implementations • NeurIPS 2015 • Oluwasanmi O. Koyejo, Nagarajan Natarajan, Pradeep K. Ravikumar, Inderjit S. Dhillon
In particular, we show that for multilabel metrics constructed as instance-, micro- and macro-averages, the population optimal classifier can be decomposed into binary classifiers based on the marginal instance-conditional distribution of each label, with a weak association between labels via the threshold.
no code implementations • NeurIPS 2015 • David I. Inouye, Pradeep K. Ravikumar, Inderjit S. Dhillon
We show the effectiveness of our LPMRF distribution over Multinomial models by evaluating the test set perplexity on a dataset of abstracts and Wikipedia.
no code implementations • NeurIPS 2015 • Ian En-Hsu Yen, Kai Zhong, Cho-Jui Hsieh, Pradeep K. Ravikumar, Inderjit S. Dhillon
Over the past decades, Linear Programming (LP) has been widely used in different areas and considered as one of the mature technologies in numerical optimization.
no code implementations • NeurIPS 2015 • Kai-Yang Chiang, Cho-Jui Hsieh, Inderjit S. Dhillon
Moreover, we study the effect of general features in theory, and show that by using our model, the sample complexity can still be lower than that of matrix completion as long as the features are sufficiently informative.
no code implementations • 28 Sep 2015 • Hsiang-Fu Yu, Nikhil Rao, Inderjit S. Dhillon
High-dimensional time series prediction is needed in applications as diverse as demand forecasting and climatology.
1 code implementation • 16 Jul 2015 • Dohyung Park, Joe Neeman, Jin Zhang, Sujay Sanghavi, Inderjit S. Dhillon
In this paper we consider the collaborative ranking setting: a pool of users each provides a small number of pairwise preferences between $d$ possible items; from these we need to predict preferences of the users for items they have not yet seen.
no code implementations • 7 May 2015 • Nagarajan Natarajan, Oluwasanmi Koyejo, Pradeep Ravikumar, Inderjit S. Dhillon
We provide a general theoretical analysis of expected out-of-sample utility, also referred to as decision-theoretic classification, for non-decomposable binary classification metrics such as F-measure and Jaccard coefficient.
no code implementations • 6 Apr 2015 • Cho-Jui Hsieh, Hsiang-Fu Yu, Inderjit S. Dhillon
In this paper, we parallelize the SDCD algorithms in LIBLINEAR.
1 code implementation • 16 Dec 2014 • Hsiang-Fu Yu, Cho-Jui Hsieh, Hyokun Yun, S. V. N. Vishwanathan, Inderjit S. Dhillon
Learning meaningful topic models from massive document collections which contain millions of documents and billions of tokens is challenging for two reasons: first, one needs to deal with a large number of topics (typically on the order of thousands).
no code implementations • NeurIPS 2014 • Si Si, Donghyuk Shin, Inderjit S. Dhillon, Beresford N. Parlett
Thus, eigenvectors of the clusters serve as good initializations to a block Lanczos algorithm that is used to compute spectral decomposition of the original graph.
no code implementations • NeurIPS 2014 • David I. Inouye, Pradeep K. Ravikumar, Inderjit S. Dhillon
We develop a fast algorithm for the Admixture of Poisson MRFs (APM) topic model and propose a novel metric to directly evaluate this model.
no code implementations • NeurIPS 2014 • Oluwasanmi O. Koyejo, Nagarajan Natarajan, Pradeep K. Ravikumar, Inderjit S. Dhillon
We consider a fairly large family of performance metrics given by ratios of linear combinations of the four fundamental population quantities.
no code implementations • NeurIPS 2014 • Ian En-Hsu Yen, Ting-Wei Lin, Shou-De Lin, Pradeep K. Ravikumar, Inderjit S. Dhillon
In this paper, we propose a Sparse Random Feature algorithm, which learns a sparse non-linear predictor by minimizing an $\ell_1$-regularized objective function over the Hilbert space induced by the kernel function.
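A minimal sketch of that recipe, assuming random Fourier features for the RBF kernel and scikit-learn's Lasso for the $\ell_1$-regularized fit; the hyperparameters are illustrative, not the paper's.

```python
import numpy as np
from sklearn.linear_model import Lasso

def sparse_random_features(X, y, num_features=500, gamma=1.0, alpha=0.01, seed=0):
    """Hedged sketch of the sparse random feature idea.

    Random Fourier features approximate the RBF kernel (Rahimi & Recht); an
    l1-regularized least-squares fit on top then keeps only the useful random
    features, giving a sparse non-linear predictor.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(X.shape[1], num_features))
    b = rng.uniform(0, 2 * np.pi, num_features)
    Z = np.sqrt(2.0 / num_features) * np.cos(X @ W + b)   # random features
    model = Lasso(alpha=alpha).fit(Z, y)                  # sparse linear fit
    return model, W, b

X = np.random.default_rng(1).uniform(-3, 3, size=(200, 2))
y = np.sin(X[:, 0]) * np.cos(X[:, 1])                     # non-linear target
model, W, b = sparse_random_features(X, y)
print("nonzero features:", np.count_nonzero(model.coef_), "of", len(model.coef_))
```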
no code implementations • NeurIPS 2014 • Cho-Jui Hsieh, Si Si, Inderjit S. Dhillon
Second, we provide a new theoretical analysis bounding the error of the solution computed using the Nyström kernel approximation method, and show that the error is related to the weighted k-means objective function, where the weights are given by the model computed from the original kernel.
no code implementations • NeurIPS 2014 • Cho-Jui Hsieh, Inderjit S. Dhillon, Pradeep K. Ravikumar, Stephen Becker, Peder A. Olsen
In this paper, we develop a family of algorithms for optimizing “superposition-structured” or “dirty” statistical estimators for high-dimensional problems involving the minimization of the sum of a smooth loss function and a hybrid regularizer.
no code implementations • NeurIPS 2014 • Ian En-Hsu Yen, Cho-Jui Hsieh, Pradeep K. Ravikumar, Inderjit S. Dhillon
State of the art statistical estimators for high-dimensional problems take the form of regularized, and hence non-smooth, convex programs.
no code implementations • 22 Nov 2014 • Cho-Jui Hsieh, Nagarajan Natarajan, Inderjit S. Dhillon
For the first case, we propose a "shifted matrix completion" method that recovers M using only a subset of indices corresponding to ones, while for the second case, we propose a "biased matrix completion" method that recovers the (thresholded) binary matrix.
no code implementations • NeurIPS 2014 • Kai Zhong, Ian E. H. Yen, Inderjit S. Dhillon, Pradeep Ravikumar
We consider the class of optimization problems arising from computationally intensive L1-regularized M-estimators, where the function or gradient values are very expensive to compute.
no code implementations • NeurIPS 2013 • Nagarajan Natarajan, Inderjit S. Dhillon, Pradeep K. Ravikumar, Ambuj Tewari
In this paper, we theoretically study the problem of binary classification in the presence of random classification noise: the learner, instead of seeing the true labels, sees labels that have independently been flipped with some small probability.
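One of the paper's approaches is a method of unbiased estimators: with known flip rates, any loss can be corrected so that its expectation under the noisy label equals the clean loss. A sketch with logistic loss (flip rates are assumed known here):

```python
import numpy as np

def logistic_loss(t, y):
    return np.log1p(np.exp(-y * t))       # y in {-1, +1}, t = classifier score

def unbiased_loss(t, y, rho_pos, rho_neg):
    """Noise-corrected loss from the method of unbiased estimators:

        l~(t, y) = ((1 - rho_{-y}) * l(t, y) - rho_y * l(t, -y))
                   / (1 - rho_pos - rho_neg)

    where rho_pos = P(flip | y = +1) and rho_neg = P(flip | y = -1). Its
    expectation over the noisy label equals the clean loss l(t, y).
    """
    rho_y = np.where(y > 0, rho_pos, rho_neg)     # flip prob of class y
    rho_my = np.where(y > 0, rho_neg, rho_pos)    # flip prob of class -y
    num = (1 - rho_my) * logistic_loss(t, y) - rho_y * logistic_loss(t, -y)
    return num / (1 - rho_pos - rho_neg)

# Numerical check of unbiasedness at a fixed score t for true label y = +1:
t, rho_pos, rho_neg = 0.7, 0.2, 0.3
noisy_expectation = ((1 - rho_pos) * unbiased_loss(t, +1, rho_pos, rho_neg)
                     + rho_pos * unbiased_loss(t, -1, rho_pos, rho_neg))
print(noisy_expectation, "==", logistic_loss(t, +1))
```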
no code implementations • NeurIPS 2013 • Huahua Wang, Arindam Banerjee, Cho-Jui Hsieh, Pradeep K. Ravikumar, Inderjit S. Dhillon
We consider the problem of sparse precision matrix estimation in high dimensions using the CLIME estimator, which has several desirable theoretical properties.
no code implementations • NeurIPS 2013 • Cho-Jui Hsieh, Matyas A. Sustik, Inderjit S. Dhillon, Pradeep K. Ravikumar, Russell Poldrack
The l1-regularized Gaussian maximum likelihood estimator (MLE) has been shown to have strong statistical guarantees in recovering a sparse inverse covariance matrix even under high-dimensional settings.
no code implementations • 4 Nov 2013 • Cho-Jui Hsieh, Si Si, Inderjit S. Dhillon
We show theoretically that the support vectors identified by the subproblem solution are likely to be support vectors of the entire kernel SVM problem, provided that the problem is partitioned appropriately by kernel clustering.
no code implementations • 18 Jul 2013 • Hsiang-Fu Yu, Prateek Jain, Purushottam Kar, Inderjit S. Dhillon
The multi-label classification problem has generated significant interest in recent years.
no code implementations • NeurIPS 2011 • Cho-Jui Hsieh, Matyas A. Sustik, Inderjit S. Dhillon, Pradeep Ravikumar
The L1-regularized Gaussian maximum likelihood estimator (MLE) has been shown to have strong statistical guarantees in recovering a sparse inverse covariance matrix, or alternatively the underlying graph structure of a Gaussian Markov Random Field, from very limited samples.
no code implementations • 4 Jun 2013 • Prateek Jain, Inderjit S. Dhillon
In addition to inductive matrix completion, we show that two other low-rank estimation problems can be studied in our framework: a) general low-rank matrix sensing using rank-1 measurements, and b) multi-label regression with missing labels.
no code implementations • NeurIPS 2012 • Cho-Jui Hsieh, Arindam Banerjee, Inderjit S. Dhillon, Pradeep K. Ravikumar
We derive a bound on the distance of the approximate solution to the true solution.
no code implementations • NeurIPS 2011 • Inderjit S. Dhillon, Pradeep K. Ravikumar, Ambuj Tewari
In particular, we investigate the greedy coordinate descent algorithm, and note that performing the greedy step efficiently weakens the costly dependence on the problem size provided the solution is sparse.
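A hedged sketch of greedy coordinate descent on the lasso: each step updates only the coordinate whose proximal update moves the most. The $O(d)$ selection scan below is naive; the point above is that this step can be performed cheaply (e.g., via nearest-neighbor-style data structures) when the solution is sparse.

```python
import numpy as np

def soft_threshold(z, lam):
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def greedy_cd_lasso(A, b, lam, steps=200):
    """Hedged sketch of greedy (Gauss-Southwell-style) coordinate descent for
    min_x 0.5 * ||Ax - b||^2 + lam * ||x||_1.

    Each step applies the one-coordinate proximal update that moves the most.
    """
    x = np.zeros(A.shape[1])
    r = -b                                   # residual r = Ax - b
    L = (A ** 2).sum(axis=0) + 1e-12         # per-coordinate Lipschitz constants
    for _ in range(steps):
        g = A.T @ r                          # full gradient (naive selection)
        proposal = soft_threshold(x - g / L, lam / L)
        j = np.argmax(np.abs(proposal - x))  # greedy coordinate choice
        delta = proposal[j] - x[j]
        if abs(delta) < 1e-12:
            break
        x[j] += delta
        r += A[:, j] * delta                 # keep the residual up to date
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 50))
x_true = np.zeros(50)
x_true[[3, 17, 41]] = [2.0, -1.5, 3.0]       # sparse ground truth
b = A @ x_true + 0.01 * rng.normal(size=100)
print("support found:", np.flatnonzero(greedy_cd_lasso(A, b, lam=5.0)))
```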
no code implementations • NeurIPS 2011 • Ambuj Tewari, Pradeep K. Ravikumar, Inderjit S. Dhillon
A hallmark of modern machine learning is its ability to deal with high dimensional problems by exploiting structural assumptions that limit the degrees of freedom in the underlying model.
no code implementations • NeurIPS 2011 • Prateek Jain, Ambuj Tewari, Inderjit S. Dhillon
Our proof techniques are novel and flexible enough to also permit the tightest known analysis of popular iterative algorithms such as CoSaMP and Subspace Pursuit.
no code implementations • NeurIPS 2010 • Prateek Jain, Brian Kulis, Inderjit S. Dhillon
Our result shows that the learned kernel matrices parameterize a linear transformation kernel function and can be applied inductively to new data points.
no code implementations • NeurIPS 2009 • Raghu Meka, Prateek Jain, Inderjit S. Dhillon
In this paper, we propose a graph theoretic approach to matrix completion that solves the problem for more realistic sampling models.
1 code implementation • NeurIPS 2010 • Raghu Meka, Prateek Jain, Inderjit S. Dhillon
Minimizing the rank of a matrix subject to affine constraints is a fundamental problem with many important applications in machine learning and statistics.
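A classical approach to this problem, and the one I believe this paper develops, is singular value projection: a projected-gradient method whose projection step is a truncated SVD. A sketch for the matrix-completion special case, with a step size set by the common $1/p$ heuristic (details may differ from the paper's):

```python
import numpy as np

def svp_matrix_completion(M_obs, mask, rank, eta=1.0, iters=100):
    """Sketch of singular value projection for matrix completion: take a
    gradient step on the observed entries, then project back onto the rank-k
    set via truncated SVD. p = fraction of observed entries.
    """
    p = mask.mean()
    X = np.zeros_like(M_obs)
    for _ in range(iters):
        G = mask * (X - M_obs)              # gradient of 0.5*||P_Omega(X - M)||^2
        Y = X - (eta / p) * G
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        X = (U[:, :rank] * s[:rank]) @ Vt[:rank]   # rank-k projection
    return X

rng = np.random.default_rng(0)
M = rng.normal(size=(60, 2)) @ rng.normal(size=(2, 40))   # true rank 2
mask = (rng.uniform(size=M.shape) < 0.5).astype(float)    # observe 50% of entries
X = svp_matrix_completion(M * mask, mask, rank=2)
print("relative error:", np.linalg.norm(X - M) / np.linalg.norm(M))
```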
no code implementations • NeurIPS 2008 • Prateek Jain, Brian Kulis, Inderjit S. Dhillon, Kristen Grauman
Metric learning algorithms can provide useful distance functions for a variety of domains, and recent work has shown good accuracy for problems where the learner can access all distance constraints at once.