1 code implementation • 5 Dec 2023 • Wei-Cheng Chang, Jyun-Yu Jiang, Jiong Zhang, Mutasem Al-Darabsah, Choon Hui Teo, Cho-Jui Hsieh, Hsiang-Fu Yu, S. V. N. Vishwanathan
For product search, PEFA improves the Recall@100 of fine-tuned ERMs by an average of 5.3% and 14.5% for PEFA-XS and PEFA-XL, respectively.
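Recall@100, the metric reported above, measures the fraction of relevant items that appear among the top 100 retrieved results. A minimal sketch (function and product names are hypothetical):

```python
def recall_at_k(retrieved, relevant, k=100):
    """Fraction of relevant items found in the top-k retrieved list."""
    if not relevant:
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k & set(relevant)) / len(relevant)

# Toy example: 2 of the 3 relevant products appear in the retrieved list.
print(recall_at_k(["p1", "p2", "p4"], ["p1", "p2", "p3"], k=100))  # 0.666...
```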
no code implementations • 8 Oct 2023 • Xiusi Chen, Jyun-Yu Jiang, Wei-Cheng Chang, Cho-Jui Hsieh, Hsiang-Fu Yu, Wei Wang
Recent advances in few-shot question answering (QA) mostly rely on the power of pre-trained large language models (LLMs) and fine-tuning in specific settings.
no code implementations • 31 May 2023 • Che-Ping Tsai, Jiong Zhang, Eli Chien, Hsiang-Fu Yu, Cho-Jui Hsieh, Pradeep Ravikumar
We introduce a novel class of sample-based explanations, termed high-dimensional representers, which explain the predictions of a regularized high-dimensional model in terms of importance weights for each training sample.
1 code implementation • 21 May 2023 • Eli Chien, Jiong Zhang, Cho-Jui Hsieh, Jyun-Yu Jiang, Wei-Cheng Chang, Olgica Milenkovic, Hsiang-Fu Yu
Unlike most existing XMC frameworks that treat labels and input instances as featureless indicators and independent entries, PINA extracts information from the label metadata and the correlations among training instances.
no code implementations • 18 Oct 2022 • Jyun-Yu Jiang, Wei-Cheng Chang, Jiong Zhang, Cho-Jui Hsieh, Hsiang-Fu Yu
Uncertainty quantification is one of the most crucial tasks to obtain trustworthy and reliable machine learning models for decision making.
1 code implementation • 16 Oct 2022 • Nilesh Gupta, Patrick H. Chen, Hsiang-Fu Yu, Cho-Jui Hsieh, Inderjit S Dhillon
A popular approach for dealing with the large label space is to arrange the labels into a shallow tree-based index and then learn an ML model to efficiently search this index via beam search.
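The beam search described above can be sketched as follows: at each level of a shallow label tree, only the highest-scoring nodes are kept and expanded. The tree layout and scoring function here are hypothetical stand-ins for a learned index and model:

```python
import heapq

def beam_search(tree, score_fn, x, beam_size=2):
    """Search a shallow label tree: at each level keep only the beam_size
    highest-scoring nodes, then expand their children."""
    beam = [(0.0, "root")]
    while True:
        children = [(s + score_fn(x, c), c)
                    for s, n in beam for c in tree.get(n, [])]
        if not children:  # beam nodes are leaves, i.e., labels
            return [label for _, label in sorted(beam, reverse=True)]
        beam = heapq.nlargest(beam_size, children)

# Hypothetical 2-level tree with 4 leaf labels and fixed node scores.
tree = {"root": ["c0", "c1"], "c0": ["l0", "l1"], "c1": ["l2", "l3"]}
scores = {"c0": 1.0, "c1": 0.5, "l0": 0.9, "l1": 0.1, "l2": 0.8, "l3": 0.2}
print(beam_search(tree, lambda x, n: scores[n], x=None, beam_size=2))  # ['l0', 'l2']
```

Because only `beam_size` nodes are expanded per level, the cost per query grows with tree depth rather than with the full label count.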
no code implementations • 21 Feb 2022 • Haoya Li, Hsiang-Fu Yu, Lexing Ying, Inderjit Dhillon
Entropy regularized Markov decision processes have been widely used in reinforcement learning.
1 code implementation • NAACL 2022 • Yuanhao Xiong, Wei-Cheng Chang, Cho-Jui Hsieh, Hsiang-Fu Yu, Inderjit Dhillon
To learn the semantic embeddings of instances and labels with raw text, we propose to pre-train Transformer-based encoders with self-supervised contrastive losses.
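Self-supervised contrastive pre-training of this kind typically uses an InfoNCE-style objective, where each instance's matching label in the batch is the positive and the rest act as negatives. A minimal sketch, assuming precomputed instance-label similarities (not the paper's exact loss):

```python
import math

def info_nce(sim_matrix, temperature=0.1):
    """In-batch contrastive loss: row i's positive is column i; every
    other column in the row acts as a negative."""
    loss = 0.0
    for i, row in enumerate(sim_matrix):
        logits = [s / temperature for s in row]
        m = max(logits)  # stabilize the log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_denom - logits[i]
    return loss / len(sim_matrix)

# Toy 2x2 similarity matrix between instance and label embeddings.
print(info_nce([[0.9, 0.1], [0.2, 0.8]]))
```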
Multi-Label Text Classification +2
no code implementations • NeurIPS 2021 • Pei-Hung Chen, Hsiang-Fu Yu, Inderjit Dhillon, Cho-Jui Hsieh
In addition to compressing standard models, our method can also be used on distilled BERT models to further improve the compression rate.
4 code implementations • ICLR 2022 • Eli Chien, Wei-Cheng Chang, Cho-Jui Hsieh, Hsiang-Fu Yu, Jiong Zhang, Olgica Milenkovic, Inderjit S Dhillon
We also provide a theoretical analysis that justifies the use of XMC over link prediction and motivates integrating XR-Transformers, a powerful method for solving XMC problems, into the GIANT framework.
Ranked #2 on Node Property Prediction on ogbn-papers100M
1 code implementation • NeurIPS 2021 • Jiong Zhang, Wei-Cheng Chang, Hsiang-Fu Yu, Inderjit S. Dhillon
Despite leveraging pre-trained transformer models for text representation, fine-tuning them on a large label space remains computationally lengthy even with powerful GPUs.
Multi-Label Text Classification +2
no code implementations • NeurIPS 2021 • Xuanqing Liu, Wei-Cheng Chang, Hsiang-Fu Yu, Cho-Jui Hsieh, Inderjit S. Dhillon
Partition-based methods are increasingly used in extreme multi-label classification (XMC) problems due to their scalability to large output spaces (e.g., millions or more).
1 code implementation • 23 Jun 2021 • Wei-Cheng Chang, Daniel Jiang, Hsiang-Fu Yu, Choon-Hui Teo, Jiong Zhang, Kai Zhong, Kedarnath Kolluri, Qie Hu, Nikhil Shandilya, Vyacheslav Ievgrafov, Japinder Singh, Inderjit S. Dhillon
In this paper, we aim to improve semantic product search by using tree-based XMC models where inference time complexity is logarithmic in the number of products.
1 code implementation • 4 Jun 2021 • Philip A. Etter, Kai Zhong, Hsiang-Fu Yu, Lexing Ying, Inderjit Dhillon
In industrial applications, these models operate at extreme scales, where every bit of performance is critical.
no code implementations • 1 Jan 2021 • Patrick Chen, Hsiang-Fu Yu, Inderjit S Dhillon, Cho-Jui Hsieh
In this paper, we observe that the learned representation of each layer lies in a low-dimensional space.
no code implementations • 12 Oct 2020 • Hsiang-Fu Yu, Kai Zhong, Jiong Zhang, Wei-Cheng Chang, Inderjit S. Dhillon
In this paper, we propose the Prediction for Enormous and Correlated Output Spaces (PECOS) framework, a versatile and modular machine learning framework for solving prediction problems for very large output spaces, and apply it to the eXtreme Multilabel Ranking (XMR) problem: given an input instance, find and rank the most relevant items from an enormous but fixed and finite output space.
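The "find and rank" structure of XMR can be sketched as a two-stage pipeline: first match the input to a few label clusters, then rank only the labels inside those clusters. All names and vectors below are hypothetical toy data, not the PECOS API:

```python
def match_then_rank(query, cluster_reps, label_vecs, cluster_members,
                    n_clusters=1, k=2):
    """Two-stage prediction over a huge label space: (1) match the query
    to a few label clusters, (2) rank only the labels in those clusters."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    matched = sorted(range(len(cluster_reps)),
                     key=lambda c: -dot(query, cluster_reps[c]))[:n_clusters]
    candidates = [l for c in matched for l in cluster_members[c]]
    return sorted(candidates, key=lambda l: -dot(query, label_vecs[l]))[:k]

# 4 labels grouped into 2 hypothetical clusters.
reps = [[1.0, 0.0], [0.0, 1.0]]
labels = {0: [0.9, 0.1], 1: [0.8, 0.0], 2: [0.0, 0.9], 3: [0.1, 0.8]}
members = {0: [0, 1], 1: [2, 3]}
print(match_then_rank([1.0, 0.1], reps, labels, members))  # [0, 1]
```

Only labels in the matched clusters are ever scored, which is what makes prediction feasible when the output space is enormous.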
no code implementations • ICML 2020 • Yanyao Shen, Hsiang-Fu Yu, Sujay Sanghavi, Inderjit Dhillon
Current XMC approaches are not built for such multi-instance multi-label (MIML) training data, and MIML approaches do not scale to XMC sizes.
1 code implementation • ICML 2020 • Xuanqing Liu, Hsiang-Fu Yu, Inderjit Dhillon, Cho-Jui Hsieh
The main reason is that position information among input units is not inherently encoded, i.e., the models are permutation equivariant; this explains why all existing models are accompanied by a sinusoidal encoding/embedding layer at the input.
Ranked #5 on Semantic Textual Similarity on MRPC
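The sinusoidal encoding layer mentioned above is the fixed scheme from the original Transformer: even dimensions use sine, odd dimensions use cosine, with geometrically increasing wavelengths. A minimal sketch:

```python
import math

def sinusoidal_encoding(position, d_model):
    """Fixed sinusoidal position vector: PE(pos, 2i) = sin(pos / 10000^(2i/d)),
    PE(pos, 2i+1) = cos(pos / 10000^(2i/d))."""
    return [
        math.sin(position / 10000 ** (i / d_model)) if i % 2 == 0
        else math.cos(position / 10000 ** ((i - 1) / d_model))
        for i in range(d_model)
    ]

# Position 0 encodes to alternating [sin(0), cos(0), ...].
print(sinusoidal_encoding(0, 4))  # [0.0, 1.0, 0.0, 1.0]
```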
no code implementations • 27 Aug 2019 • Vikas K. Garg, Inderjit S. Dhillon, Hsiang-Fu Yu
The architecture of Transformer is based entirely on self-attention, and has been shown to outperform models that employ recurrence on sequence transduction tasks such as machine translation.
no code implementations • 29 May 2019 • Liwei Wu, Hsiang-Fu Yu, Nikhil Rao, James Sharpnack, Cho-Jui Hsieh
In this paper, we propose using Graph DNA, a novel Deep Neighborhood Aware graph encoding algorithm, for exploiting deeper neighborhood information.
1 code implementation • NeurIPS 2019 • Rajat Sen, Hsiang-Fu Yu, Inderjit Dhillon
Forecasting high-dimensional time series plays a crucial role in many applications such as demand forecasting and financial predictions.
1 code implementation • NeurIPS 2019 • Jiong Zhang, Hsiang-Fu Yu, Inderjit S. Dhillon
In this paper, we propose AutoAssist, a simple framework to accelerate training of a deep neural network.
2 code implementations • 7 May 2019 • Wei-Cheng Chang, Hsiang-Fu Yu, Kai Zhong, Yiming Yang, Inderjit Dhillon
However, naively applying deep transformer models to the XMC problem leads to sub-optimal performance due to the large output space and the label sparsity issue.
Extreme Multi-Label Classification General Classification +4
no code implementations • NIPS Workshop CDNNRIA 2018 • Wei-Cheng Chang, Hsiang-Fu Yu, Inderjit S. Dhillon, Yiming Yang
To circumvent the softmax bottleneck, SeCSeq compresses labels into sequences of semantic-aware compact codes, on which Seq2Seq models are trained.
no code implementations • NAACL 2018 • Chao Jiang, Hsiang-Fu Yu, Cho-Jui Hsieh, Kai-Wei Chang
In such a situation, the co-occurrence matrix is sparse as the co-occurrences of many word pairs are unobserved.
1 code implementation • 9 May 2018 • Chao Jiang, Hsiang-Fu Yu, Cho-Jui Hsieh, Kai-Wei Chang
In such a situation, the co-occurrence matrix is sparse as the co-occurrences of many word pairs are unobserved.
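The sparsity described above arises naturally when co-occurrences are counted within a sliding window: word pairs that never fall in the same window are simply absent. A minimal sketch with a toy corpus:

```python
from collections import Counter

def cooccurrence_counts(sentences, window=2):
    """Count word co-occurrences within a sliding window; pairs that
    never co-occur are simply absent (the matrix is sparse)."""
    counts = Counter()
    for words in sentences:
        for i, w in enumerate(words):
            for v in words[i + 1:i + 1 + window]:
                counts[tuple(sorted((w, v)))] += 1
    return counts

corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
counts = cooccurrence_counts(corpus)
print(counts[("cat", "the")])    # observed pair -> 1
print(("cat", "dog") in counts)  # unobserved pair -> False
```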
no code implementations • NeurIPS 2016 • Hsiang-Fu Yu, Nikhil Rao, Inderjit S. Dhillon
We develop novel regularization schemes and use scalable matrix factorization methods that are eminently suited for high-dimensional time series data that has many missing values.
no code implementations • NeurIPS 2016 • Yang You, Xiangru Lian, Ji Liu, Hsiang-Fu Yu, Inderjit S. Dhillon, James Demmel, Cho-Jui Hsieh
In this paper, we propose and study an Asynchronous parallel Greedy Coordinate Descent (Asy-GCD) algorithm for minimizing a smooth function with bounded constraints.
no code implementations • NeurIPS 2017 • Hsiang-Fu Yu, Cho-Jui Hsieh, Qi Lei, Inderjit S. Dhillon
Maximum Inner Product Search (MIPS) is an important task in many machine learning applications such as the prediction phase of a low-rank matrix factorization model for a recommender system.
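As a baseline for the MIPS task described above, the exact answer can be computed by brute force: score every candidate with a dot product and keep the top-k. (Fast MIPS methods aim to avoid exactly this linear scan; the data here is hypothetical.)

```python
def mips(query, items, k=2):
    """Exact Maximum Inner Product Search by brute force: score every
    candidate with a dot product and return the top-k item indices."""
    scores = [sum(q * v for q, v in zip(query, item)) for item in items]
    return sorted(range(len(items)), key=lambda i: -scores[i])[:k]

# Rows of a hypothetical item-factor matrix from a low-rank model.
items = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
print(mips([1.0, 0.2], items, k=2))  # [0, 1]
```

In the recommender-system setting, `query` is a user's latent factors and each row of `items` is an item's latent factors, so the top inner products are the top predicted ratings.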
no code implementations • 31 May 2016 • Jiong Zhang, Parameswaran Raman, Shihao Ji, Hsiang-Fu Yu, S. V. N. Vishwanathan, Inderjit S. Dhillon
Moreover, it requires the parameters to fit in the memory of a single processor; this is problematic when the number of parameters is in the billions.
2 code implementations • NeurIPS 2015 • Nikhil Rao, Hsiang-Fu Yu, Pradeep K. Ravikumar, Inderjit S. Dhillon
Low rank matrix completion plays a fundamental role in collaborative filtering applications, the key idea being that the variables lie in a smaller subspace than the ambient space.
Ranked #1 on Recommendation Systems on Flixster (using extra training data)
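The low-rank idea above can be sketched with plain SGD on a factorization R ≈ U Vᵀ fit only to the observed entries; unobserved entries are then predicted from the learned factors. This is a toy illustration of matrix completion in general, not the paper's graph-regularized method:

```python
import random

def complete(observed, n_rows, n_cols, rank=2, lr=0.05, epochs=500):
    """Fit R ~= U V^T using only the observed (i, j) -> value entries,
    via SGD on the squared error over those entries."""
    rng = random.Random(0)
    U = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_rows)]
    V = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_cols)]
    for _ in range(epochs):
        for (i, j), r in observed.items():
            pred = sum(U[i][f] * V[j][f] for f in range(rank))
            err = r - pred
            for f in range(rank):  # simultaneous update of both factors
                U[i][f], V[j][f] = (U[i][f] + lr * err * V[j][f],
                                    V[j][f] + lr * err * U[i][f])
    return U, V

# Rank-1 ground truth with one missing entry at (1, 1).
obs = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 2.0}
U, V = complete(obs, 2, 2)
pred = sum(U[1][f] * V[1][f] for f in range(2))
print(pred)  # the rank-1 completion of this matrix would give 4.0 here
```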
no code implementations • 28 Sep 2015 • Hsiang-Fu Yu, Nikhil Rao, Inderjit S. Dhillon
High-dimensional time series prediction is needed in applications as diverse as demand forecasting and climatology.
no code implementations • 6 Apr 2015 • Cho-Jui Hsieh, Hsiang-Fu Yu, Inderjit S. Dhillon
In this paper, we parallelize the SDCD algorithms in LIBLINEAR.
1 code implementation • 16 Dec 2014 • Hsiang-Fu Yu, Cho-Jui Hsieh, Hyokun Yun, S. V. N. Vishwanathan, Inderjit S. Dhillon
Learning meaningful topic models with massive document collections containing millions of documents and billions of tokens is challenging for two reasons: first, one needs to deal with a large number of topics (typically on the order of thousands).
1 code implementation • 1 Dec 2013 • Hyokun Yun, Hsiang-Fu Yu, Cho-Jui Hsieh, S. V. N. Vishwanathan, Inderjit Dhillon
One of the key features of NOMAD is that the ownership of a variable is asynchronously transferred between processors in a decentralized fashion.
Distributed, Parallel, and Cluster Computing
no code implementations • 18 Jul 2013 • Hsiang-Fu Yu, Prateek Jain, Purushottam Kar, Inderjit S. Dhillon
The multi-label classification problem has generated significant interest in recent years.