Search Results for author: Hsiang-Fu Yu

Found 36 papers, 16 papers with code

PEFA: Parameter-Free Adapters for Large-scale Embedding-based Retrieval Models

1 code implementation 5 Dec 2023 Wei-Cheng Chang, Jyun-Yu Jiang, Jiong Zhang, Mutasem Al-Darabsah, Choon Hui Teo, Cho-Jui Hsieh, Hsiang-Fu Yu, S. V. N. Vishwanathan

For product search, PEFA improves the Recall@100 of the fine-tuned ERMs by an average of 5.3% and 14.5%, for PEFA-XS and PEFA-XL, respectively.

Text Retrieval

MinPrompt: Graph-based Minimal Prompt Data Augmentation for Few-shot Question Answering

no code implementations 8 Oct 2023 Xiusi Chen, Jyun-Yu Jiang, Wei-Cheng Chang, Cho-Jui Hsieh, Hsiang-Fu Yu, Wei Wang

Recent advances in few-shot question answering (QA) mostly rely on the power of pre-trained large language models (LLMs) and fine-tuning in specific settings.

Data Augmentation Question Answering +3

Representer Point Selection for Explaining Regularized High-dimensional Models

no code implementations 31 May 2023 Che-Ping Tsai, Jiong Zhang, Eli Chien, Hsiang-Fu Yu, Cho-Jui Hsieh, Pradeep Ravikumar

We introduce a novel class of sample-based explanations we term high-dimensional representers, that can be used to explain the predictions of a regularized high-dimensional model in terms of importance weights for each of the training samples.

Binary Classification Collaborative Filtering +1

PINA: Leveraging Side Information in eXtreme Multi-label Classification via Predicted Instance Neighborhood Aggregation

1 code implementation 21 May 2023 Eli Chien, Jiong Zhang, Cho-Jui Hsieh, Jyun-Yu Jiang, Wei-Cheng Chang, Olgica Milenkovic, Hsiang-Fu Yu

Unlike most existing XMC frameworks that treat labels and input instances as featureless indicators and independent entries, PINA extracts information from the label metadata and the correlations among training instances.

Extreme Multi-Label Classification Recommendation Systems

Uncertainty in Extreme Multi-label Classification

no code implementations 18 Oct 2022 Jyun-Yu Jiang, Wei-Cheng Chang, Jiong Zhong, Cho-Jui Hsieh, Hsiang-Fu Yu

Uncertainty quantification is one of the most crucial tasks to obtain trustworthy and reliable machine learning models for decision making.

Classification Decision Making +4

ELIAS: End-to-End Learning to Index and Search in Large Output Spaces

1 code implementation 16 Oct 2022 Nilesh Gupta, Patrick H. Chen, Hsiang-Fu Yu, Cho-Jui Hsieh, Inderjit S Dhillon

A popular approach for dealing with the large label space is to arrange the labels into a shallow tree-based index and then learn an ML model to efficiently search this index via beam search.

Extreme Multi-Label Classification

Extreme Zero-Shot Learning for Extreme Text Classification

1 code implementation NAACL 2022 Yuanhao Xiong, Wei-Cheng Chang, Cho-Jui Hsieh, Hsiang-Fu Yu, Inderjit Dhillon

To learn the semantic embeddings of instances and labels with raw text, we propose to pre-train Transformer-based encoders with self-supervised contrastive losses.

Multi Label Text Classification Multi-Label Text Classification +2

DRONE: Data-aware Low-rank Compression for Large NLP Models

no code implementations NeurIPS 2021 Pei-Hung Chen, Hsiang-Fu Yu, Inderjit Dhillon, Cho-Jui Hsieh

In addition to compressing standard models, our method can also be used on distilled BERT models to further improve compression rate.

Low-rank compression MRPC +1

Node Feature Extraction by Self-Supervised Multi-scale Neighborhood Prediction

4 code implementations ICLR 2022 Eli Chien, Wei-Cheng Chang, Cho-Jui Hsieh, Hsiang-Fu Yu, Jiong Zhang, Olgica Milenkovic, Inderjit S Dhillon

We also provide a theoretical analysis that justifies the use of XMC over link prediction and motivates integrating XR-Transformers, a powerful method for solving XMC problems, into the GIANT framework.

Extreme Multi-Label Classification Language Modelling +3

Fast Multi-Resolution Transformer Fine-tuning for Extreme Multi-label Text Classification

1 code implementation NeurIPS 2021 Jiong Zhang, Wei-Cheng Chang, Hsiang-Fu Yu, Inderjit S. Dhillon

Despite leveraging pre-trained transformer models for text representation, the fine-tuning procedure of transformer models on large label space still has lengthy computational time even with powerful GPUs.

Multi Label Text Classification Multi-Label Text Classification +2

Label Disentanglement in Partition-based Extreme Multilabel Classification

no code implementations NeurIPS 2021 Xuanqing Liu, Wei-Cheng Chang, Hsiang-Fu Yu, Cho-Jui Hsieh, Inderjit S. Dhillon

Partition-based methods are increasingly used in extreme multi-label classification (XMC) problems due to their scalability to large output spaces (e.g., millions or more).

Classification Disentanglement +1

Data-aware Low-Rank Compression for Large NLP Models

no code implementations 1 Jan 2021 Patrick Chen, Hsiang-Fu Yu, Inderjit S Dhillon, Cho-Jui Hsieh

In this paper, we observe that the learned representation of each layer lies in a low-dimensional space.

Low-rank compression MRPC +1

PECOS: Prediction for Enormous and Correlated Output Spaces

no code implementations 12 Oct 2020 Hsiang-Fu Yu, Kai Zhong, Jiong Zhang, Wei-Cheng Chang, Inderjit S. Dhillon

In this paper, we propose the Prediction for Enormous and Correlated Output Spaces (PECOS) framework, a versatile and modular machine learning framework for solving prediction problems for very large output spaces, and apply it to the eXtreme Multilabel Ranking (XMR) problem: given an input instance, find and rank the most relevant items from an enormous but fixed and finite output space.

Extreme Multi-label Classification from Aggregated Labels

no code implementations ICML 2020 Yanyao Shen, Hsiang-Fu Yu, Sujay Sanghavi, Inderjit Dhillon

Current XMC approaches are not built for such multi-instance multi-label (MIML) training data, and MIML approaches do not scale to XMC sizes.

Classification Extreme Multi-Label Classification +1

Learning to Encode Position for Transformer with Continuous Dynamical Model

1 code implementation ICML 2020 Xuanqing Liu, Hsiang-Fu Yu, Inderjit Dhillon, Cho-Jui Hsieh

The main reason is that position information among input units is not inherently encoded, i.e., the models are permutation equivalent; this problem justifies why all of the existing models are accompanied by a sinusoidal encoding/embedding layer at the input.

Inductive Bias Linguistic Acceptability +4

Multiresolution Transformer Networks: Recurrence is Not Essential for Modeling Hierarchical Structure

no code implementations 27 Aug 2019 Vikas K. Garg, Inderjit S. Dhillon, Hsiang-Fu Yu

The architecture of Transformer is based entirely on self-attention, and has been shown to outperform models that employ recurrence on sequence transduction tasks such as machine translation.

Machine Translation Translation

Graph DNA: Deep Neighborhood Aware Graph Encoding for Collaborative Filtering

no code implementations 29 May 2019 Liwei Wu, Hsiang-Fu Yu, Nikhil Rao, James Sharpnack, Cho-Jui Hsieh

In this paper, we propose using Graph DNA, a novel Deep Neighborhood Aware graph encoding algorithm, for exploiting deeper neighborhood information.

Collaborative Filtering Recommendation Systems

Taming Pretrained Transformers for Extreme Multi-label Text Classification

2 code implementations 7 May 2019 Wei-Cheng Chang, Hsiang-Fu Yu, Kai Zhong, Yiming Yang, Inderjit Dhillon

However, naively applying deep transformer models to the XMC problem leads to sub-optimal performance due to the large output space and the label sparsity issue.

Extreme Multi-Label Classification General Classification +4

Learning Word Embeddings for Low-resource Languages by PU Learning

1 code implementation 9 May 2018 Chao Jiang, Hsiang-Fu Yu, Cho-Jui Hsieh, Kai-Wei Chang

In such a situation, the co-occurrence matrix is sparse as the co-occurrences of many word pairs are unobserved.

Temporal Regularized Matrix Factorization for High-dimensional Time Series Prediction

no code implementations NeurIPS 2016 Hsiang-Fu Yu, Nikhil Rao, Inderjit S. Dhillon

We develop novel regularization schemes and use scalable matrix factorization methods that are eminently suited for high-dimensional time series data that has many missing values.

Missing Values Time Series +2

Asynchronous Parallel Greedy Coordinate Descent

no code implementations NeurIPS 2016 Yang You, Xiangru Lian, Ji Liu, Hsiang-Fu Yu, Inderjit S. Dhillon, James Demmel, Cho-Jui Hsieh

In this paper, we propose and study an Asynchronous parallel Greedy Coordinate Descent (Asy-GCD) algorithm for minimizing a smooth function with bounded constraints.

A Greedy Approach for Budgeted Maximum Inner Product Search

no code implementations NeurIPS 2017 Hsiang-Fu Yu, Cho-Jui Hsieh, Qi Lei, Inderjit S. Dhillon

Maximum Inner Product Search (MIPS) is an important task in many machine learning applications such as the prediction phase of a low-rank matrix factorization model for a recommender system.

Recommendation Systems

Extreme Stochastic Variational Inference: Distributed and Asynchronous

no code implementations 31 May 2016 Jiong Zhang, Parameswaran Raman, Shihao Ji, Hsiang-Fu Yu, S. V. N. Vishwanathan, Inderjit S. Dhillon

Moreover, it requires the parameters to fit in the memory of a single processor; this is problematic when the number of parameters is in billions.

Variational Inference

Collaborative Filtering with Graph Information: Consistency and Scalable Methods

2 code implementations NeurIPS 2015 Nikhil Rao, Hsiang-Fu Yu, Pradeep K. Ravikumar, Inderjit S. Dhillon

Low rank matrix completion plays a fundamental role in collaborative filtering applications, the key idea being that the variables lie in a smaller subspace than the ambient space.

Ranked #1 on Recommendation Systems on Flixster (using extra training data)

Collaborative Filtering Low-Rank Matrix Completion +1

High-dimensional Time Series Prediction with Missing Values

no code implementations 28 Sep 2015 Hsiang-Fu Yu, Nikhil Rao, Inderjit S. Dhillon

High-dimensional time series prediction is needed in applications as diverse as demand forecasting and climatology.

Matrix Completion Missing Values +3

A Scalable Asynchronous Distributed Algorithm for Topic Modeling

1 code implementation 16 Dec 2014 Hsiang-Fu Yu, Cho-Jui Hsieh, Hyokun Yun, S. V. N. Vishwanathan, Inderjit S. Dhillon

Learning meaningful topic models with massive document collections which contain millions of documents and billions of tokens is challenging because of two reasons: First, one needs to deal with a large number of topics (typically in the order of thousands).

Topic Models

NOMAD: Non-locking, stOchastic Multi-machine algorithm for Asynchronous and Decentralized matrix completion

1 code implementation 1 Dec 2013 Hyokun Yun, Hsiang-Fu Yu, Cho-Jui Hsieh, S. V. N. Vishwanathan, Inderjit Dhillon

One of the key features of NOMAD is that the ownership of a variable is asynchronously transferred between processors in a decentralized fashion.

Distributed, Parallel, and Cluster Computing

Large-scale Multi-label Learning with Missing Labels

no code implementations 18 Jul 2013 Hsiang-Fu Yu, Prateek Jain, Purushottam Kar, Inderjit S. Dhillon

The multi-label classification problem has generated significant interest in recent years.

Missing Labels
