Search Results for author: Inderjit Dhillon

Found 23 papers, 10 papers with code

Dual-Encoders for Extreme Multi-Label Classification

1 code implementation • 16 Oct 2023 • Nilesh Gupta, Devvrit Khatri, Ankit S Rawat, Srinadh Bhojanapalli, Prateek Jain, Inderjit Dhillon

We propose decoupled softmax loss, a simple modification to the InfoNCE loss, that overcomes the limitations of existing contrastive losses.

Classification, Extreme Multi-Label Classification (+2 more)

EHI: End-to-end Learning of Hierarchical Index for Efficient Dense Retrieval

no code implementations • 13 Oct 2023 • Ramnath Kumar, Anshul Mittal, Nilesh Gupta, Aditya Kusupati, Inderjit Dhillon, Prateek Jain

Such techniques use a two-stage process: (a) contrastive learning to train a dual encoder to embed both the query and documents and (b) approximate nearest neighbor search (ANNS) for finding similar documents for a given query.

Contrastive Learning, Retrieval
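
The two-stage process described in the EHI entry above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method: the embeddings are hard-coded stand-ins for the output of a trained dual encoder, the function names `normalize` and `top_k` are hypothetical, and an exact brute-force cosine search takes the place of a real ANNS index.

```python
import math


def normalize(vec):
    """Scale a non-zero vector to unit length so dot products are cosine scores."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def top_k(query_emb, doc_embs, k):
    """Return indices of the k documents most similar to the query.

    Stage (a) is assumed to have happened already: query_emb and doc_embs
    stand in for dual-encoder outputs. Stage (b) is done here by exact
    search, which an ANNS index would approximate at larger scale.
    """
    q = normalize(query_emb)
    scored = []
    for idx, doc in enumerate(doc_embs):
        score = sum(a * b for a, b in zip(q, normalize(doc)))
        scored.append((score, idx))
    # Highest score first; ties broken by the lower document index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:k]]


# Hypothetical 2-dimensional embeddings for three documents.
docs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
nearest = top_k([1.0, 0.1], docs, 2)
```

With these toy vectors the query points almost along the first axis, so document 0 ranks first and the diagonal document 2 ranks second.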

MatFormer: Nested Transformer for Elastic Inference

2 code implementations • 11 Oct 2023 • Devvrit, Sneha Kudugunta, Aditya Kusupati, Tim Dettmers, KaiFeng Chen, Inderjit Dhillon, Yulia Tsvetkov, Hannaneh Hajishirzi, Sham Kakade, Ali Farhadi, Prateek Jain

Furthermore, we observe that smaller encoders extracted from a universal MatFormer-based ViT (MatViT) encoder preserve the metric-space structure for adaptive large-scale retrieval.

Language Modelling

Bayesian regularization of empirical MDPs

no code implementations • 3 Aug 2022 • Samarth Gupta, Daniel N. Hill, Lexing Ying, Inderjit Dhillon

Due to noise, the policy learned from the estimated model is often far from the optimal policy of the underlying model.

Positive Unlabeled Contrastive Learning

no code implementations • 1 Jun 2022 • Anish Acharya, Sujay Sanghavi, Li Jing, Bhargav Bhushanam, Dhruv Choudhary, Michael Rabbat, Inderjit Dhillon

We extend this paradigm to the classical positive unlabeled (PU) setting, where the task is to learn a binary classifier given only a few labeled positive samples and (often) a large number of unlabeled samples (which could be positive or negative).

Contrastive Learning, Pseudo Label

Extreme Zero-Shot Learning for Extreme Text Classification

1 code implementation • NAACL 2022 • Yuanhao Xiong, Wei-Cheng Chang, Cho-Jui Hsieh, Hsiang-Fu Yu, Inderjit Dhillon

To learn the semantic embeddings of instances and labels with raw text, we propose to pre-train Transformer-based encoders with self-supervised contrastive losses.

Multi-Label Text Classification (+2 more)

DRONE: Data-aware Low-rank Compression for Large NLP Models

no code implementations • NeurIPS 2021 • Pei-Hung Chen, Hsiang-Fu Yu, Inderjit Dhillon, Cho-Jui Hsieh

In addition to compressing standard models, our method can also be used on distilled BERT models to further improve the compression rate.

Low-rank compression, MRPC (+1 more)

Approximate Newton policy gradient algorithms

no code implementations • 5 Oct 2021 • Haoya Li, Samarth Gupta, HsiangFu Yu, Lexing Ying, Inderjit Dhillon

This paper proposes an approximate Newton method for the policy gradient algorithm with entropy regularization.

Top-$k$ eXtreme Contextual Bandits with Arm Hierarchy

1 code implementation • 15 Feb 2021 • Rajat Sen, Alexander Rakhlin, Lexing Ying, Rahul Kidambi, Dean Foster, Daniel Hill, Inderjit Dhillon

We show that our algorithm has a regret guarantee of $O(k\sqrt{(A-k+1)T \log (|\mathcal{F}|T)})$, where $A$ is the total number of arms and $\mathcal{F}$ is the class containing the regression function, while only requiring $\tilde{O}(A)$ computation per time step.

Computational Efficiency, Extreme Multi-Label Classification (+2 more)

Voting based ensemble improves robustness of defensive models

no code implementations • 28 Nov 2020 • Devvrit, Minhao Cheng, Cho-Jui Hsieh, Inderjit Dhillon

Several previous attempts tackled this problem by ensembling the soft-label predictions and have been proven vulnerable to the latest attack methods.

On the Benefits of Multiple Gossip Steps in Communication-Constrained Decentralized Optimization

1 code implementation • 20 Nov 2020 • Abolfazl Hashemi, Anish Acharya, Rudrajit Das, Haris Vikalo, Sujay Sanghavi, Inderjit Dhillon

In this paper, we show that, in such compressed decentralized optimization settings, there are benefits to having multiple gossip steps between subsequent gradient iterations, even when the cost of doing so is appropriately accounted for, e.g., by means of reducing the precision of compressed information.

Extreme Multi-label Classification from Aggregated Labels

no code implementations • ICML 2020 • Yanyao Shen, Hsiang-Fu Yu, Sujay Sanghavi, Inderjit Dhillon

Current XMC approaches are not built for such multi-instance multi-label (MIML) training data, and MIML approaches do not scale to XMC sizes.

Classification, Extreme Multi-Label Classification (+1 more)

Learning to Encode Position for Transformer with Continuous Dynamical Model

1 code implementation • ICML 2020 • Xuanqing Liu, Hsiang-Fu Yu, Inderjit Dhillon, Cho-Jui Hsieh

The main reason is that position information among input units is not inherently encoded, i.e., the models are permutation equivalent; this problem justifies why all of the existing models are accompanied by a sinusoidal encoding/embedding layer at the input.

Inductive Bias, Linguistic Acceptability (+4 more)

CAT: Customized Adversarial Training for Improved Robustness

no code implementations • 17 Feb 2020 • Minhao Cheng, Qi Lei, Pin-Yu Chen, Inderjit Dhillon, Cho-Jui Hsieh

Adversarial training has become one of the most effective methods for improving the robustness of neural networks.

Taming Pretrained Transformers for Extreme Multi-label Text Classification

2 code implementations • 7 May 2019 • Wei-Cheng Chang, Hsiang-Fu Yu, Kai Zhong, Yiming Yang, Inderjit Dhillon

However, naively applying deep transformer models to the XMC problem leads to sub-optimal performance due to the large output space and the label sparsity issue.

Extreme Multi-Label Classification, General Classification (+4 more)

Online Embedding Compression for Text Classification using Low Rank Matrix Factorization

no code implementations • 1 Nov 2018 • Anish Acharya, Rahul Goel, Angeliki Metallinou, Inderjit Dhillon

Empirically, we show that the proposed method can achieve 90% compression with minimal impact on accuracy for sentence classification tasks, and outperforms alternative methods like fixed-point quantization or offline word embedding compression.

General Classification, Quantization (+3 more)
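
To make the 90% figure in the entry above concrete, here is a rough parameter-count sketch. It assumes (the listing does not say so) that compression is measured as the reduction in embedding parameters when a `vocab_size x dim` matrix is replaced by two factors of shapes `vocab_size x rank` and `rank x dim`. The vocabulary size, embedding dimension, and function names are hypothetical.

```python
def factorized_params(vocab_size, dim, rank):
    """Parameters in the two low-rank factors (vocab_size x rank and rank x dim)."""
    return rank * (vocab_size + dim)


def compression_ratio(vocab_size, dim, rank):
    """Fraction of the original vocab_size * dim parameters that is removed."""
    return 1.0 - factorized_params(vocab_size, dim, rank) / (vocab_size * dim)


def max_rank_for_target(vocab_size, dim, target):
    """Largest rank whose factors still remove at least `target` of the parameters."""
    budget = (1.0 - target) * vocab_size * dim
    return int(budget // (vocab_size + dim))


# Hypothetical sizes: a 10,000-word vocabulary with 300-dimensional embeddings.
rank = max_rank_for_target(10_000, 300, 0.9)
ratio = compression_ratio(10_000, 300, rank)
```

Under these assumed sizes the rank has to drop to 29 to remove at least 90% of the embedding parameters; a rank of 30 already falls slightly short.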

Kernel Ridge Regression via Partitioning

no code implementations • 5 Aug 2016 • Rashish Tandon, Si Si, Pradeep Ravikumar, Inderjit Dhillon

In this paper, we investigate a divide-and-conquer approach to Kernel Ridge Regression (KRR).

Clustering, Generalization Bounds (+1 more)

Structured Sparse Regression via Greedy Hard-Thresholding

no code implementations • 19 Feb 2016 • Prateek Jain, Nikhil Rao, Inderjit Dhillon

Several learning applications require solving high-dimensional regression problems where the relevant features belong to a small number of (overlapping) groups.

Regression

Coordinate Descent Methods for Symmetric Nonnegative Matrix Factorization

no code implementations • 4 Sep 2015 • Arnaud Vandaele, Nicolas Gillis, Qi Lei, Kai Zhong, Inderjit Dhillon

Given a symmetric nonnegative matrix $A$, symmetric nonnegative matrix factorization (symNMF) is the problem of finding a nonnegative matrix $H$, usually with far fewer columns than $A$, such that $A \approx HH^T$.

Clustering
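
The approximation $A \approx HH^T$ in the symNMF entry above can be checked numerically. The sketch below only evaluates the squared Frobenius residual for a given $H$; measuring the fit this way is an assumption of the example, and it does not implement the coordinate descent methods the paper proposes. The matrices and function names are made up for illustration.

```python
def gram(H):
    """Compute H @ H^T for a matrix stored as a list of rows."""
    return [[sum(a * b for a, b in zip(row_i, row_j)) for row_j in H] for row_i in H]


def symnmf_error(A, H):
    """Squared Frobenius norm of A - H H^T."""
    G = gram(H)
    n = len(A)
    return sum((A[i][j] - G[i][j]) ** 2 for i in range(n) for j in range(n))


# A 3 x 3 symmetric nonnegative matrix and a nonnegative H with only 2 columns.
H = [[1.0, 0.0],
     [1.0, 0.0],
     [0.0, 2.0]]
A_exact = [[1.0, 1.0, 0.0],
           [1.0, 1.0, 0.0],
           [0.0, 0.0, 4.0]]
# Same matrix with a symmetric perturbation of 0.5 in entries (0, 2) and (2, 0).
A_noisy = [[1.0, 1.0, 0.5],
           [1.0, 1.0, 0.0],
           [0.5, 0.0, 4.0]]

exact_error = symnmf_error(A_exact, H)
noisy_error = symnmf_error(A_noisy, H)
```

Here `A_exact` equals $HH^T$ exactly, so its residual is zero, while the perturbed matrix leaves a residual of $0.5^2 + 0.5^2 = 0.5$. The block structure of $H$ also hints at why the listing tags this paper with clustering: rows 0 and 1 load on the first column, row 2 on the second.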

NOMAD: Non-locking, stOchastic Multi-machine algorithm for Asynchronous and Decentralized matrix completion

1 code implementation • 1 Dec 2013 • Hyokun Yun, Hsiang-Fu Yu, Cho-Jui Hsieh, S. V. N. Vishwanathan, Inderjit Dhillon

One of the key features of NOMAD is that the ownership of a variable is asynchronously transferred between processors in a decentralized fashion.

Distributed, Parallel, and Cluster Computing
