no code implementations • 25 Jun 2024 • Anish Acharya, Inderjit S Dhillon, Sujay Sanghavi
Large-scale data collections in the wild are invariably noisy.
no code implementations • 1 Nov 2022 • Yihan Wang, Si Si, Daliang Li, Michal Lukasik, Felix Yu, Cho-Jui Hsieh, Inderjit S Dhillon, Sanjiv Kumar
Pretrained large language models (LLMs) are general-purpose problem solvers applicable to a diverse set of tasks via prompts.
1 code implementation • 16 Oct 2022 • Nilesh Gupta, Patrick H. Chen, Hsiang-Fu Yu, Cho-Jui Hsieh, Inderjit S Dhillon
A popular approach for dealing with the large label space is to arrange the labels into a shallow tree-based index and then learn an ML model to efficiently search this index via beam search (a minimal sketch of this two-stage search follows below).
Extreme Multi-Label Classification • Multi-Label Classification
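The two-stage search described in the abstract above is easy to sketch. The snippet below is a minimal illustration rather than the paper's method: it assumes a 2-level label tree with hypothetical linear cluster and label scorers, whereas the real system learns the index and the model jointly.

```python
# Minimal sketch of tree-based search for extreme multi-label classification:
# a matcher scores label clusters, then a ranker scores only the labels
# inside the top-`beam` clusters. Weights and cluster assignments here are
# random stand-ins, not a trained model.
import numpy as np

rng = np.random.default_rng(0)
n_labels, n_clusters, dim, beam = 10_000, 100, 64, 5

cluster_of = rng.integers(0, n_clusters, size=n_labels)   # label -> cluster id
W_match = rng.normal(size=(n_clusters, dim))              # stage-1 cluster scorer
W_rank = rng.normal(size=(n_labels, dim))                 # stage-2 label scorer

def beam_search(x: np.ndarray, k: int = 10) -> np.ndarray:
    """Return top-k label ids while scoring only labels in the top-`beam` clusters."""
    top_clusters = np.argsort(W_match @ x)[-beam:]        # beam over clusters
    candidates = np.flatnonzero(np.isin(cluster_of, top_clusters))
    scores = W_rank[candidates] @ x                       # rank candidates only
    return candidates[np.argsort(scores)[-k:][::-1]]

print(beam_search(rng.normal(size=dim)))
```

The point of the tree is visible in the candidate set: only roughly `beam / n_clusters` of the label space is ever scored per query.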
4 code implementations • ICLR 2022 • Eli Chien, Wei-Cheng Chang, Cho-Jui Hsieh, Hsiang-Fu Yu, Jiong Zhang, Olgica Milenkovic, Inderjit S Dhillon
We also provide a theoretical analysis that justifies the use of XMC over link prediction and motivates integrating XR-Transformers, a powerful method for solving XMC problems, into the GIANT framework (a toy illustration of the XMC formulation follows below).
Ranked #2 on Node Property Prediction on ogbn-papers100M
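The XMC formulation referenced above can be illustrated on a toy graph: each node's raw text is the input, and its set of graph neighbors serves as the multi-label target. In the sketch below, the toy graph, TF-IDF features, and one-vs-rest classifier are illustrative stand-ins for the XR-Transformer models the actual framework uses.

```python
# Hedged sketch: casting self-supervised node feature learning as extreme
# multi-label classification. Y[i, j] = 1 iff node j is a neighbor of node i
# ("neighborhood prediction"); a text model is then trained on (text, Y).
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

texts = ["graph neural nets", "label trees", "beam search", "node features"]
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]                  # toy undirected graph

rows, cols = zip(*(edges + [(j, i) for i, j in edges]))   # symmetrize edges
Y = csr_matrix((np.ones(len(rows)), (rows, cols)), shape=(len(texts), len(texts)))

X = TfidfVectorizer().fit_transform(texts)                # node text -> features
clf = OneVsRestClassifier(LogisticRegression()).fit(X, Y.toarray())
# In a GIANT-style pipeline, the trained encoder's representations would then
# be reused as node features for downstream graph learning tasks.
```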
no code implementations • 29 Sep 2021 • Shuo Yang, Yijun Dong, Rachel Ward, Inderjit S Dhillon, Sujay Sanghavi, Qi Lei
Data augmentation is popular in the training of large neural networks; however, there is currently no clear theoretical comparison among the different algorithmic choices for how augmented data is used.
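Two such algorithmic choices can be written down as losses. The sketch below follows a common framing rather than this paper's exact setup: (a) treating augmented copies as additional labeled samples, versus (b) consistency regularization, which supervises only the originals and penalizes disagreement between a sample and its augmentation. The weighting `lam` is an illustrative hyperparameter.

```python
# Hedged sketch of two ways to consume augmented data during training.
import torch
import torch.nn.functional as F

def da_erm_loss(model, x, x_aug, y):
    # (a) empirical risk minimization on the augmented set:
    # augmented copies are just extra labeled samples.
    inputs = torch.cat([x, x_aug], dim=0)
    targets = torch.cat([y, y], dim=0)
    return F.cross_entropy(model(inputs), targets)

def consistency_loss(model, x, x_aug, y, lam=1.0):
    # (b) consistency regularization: supervise the originals, and force
    # each augmentation's output toward its source sample's output.
    sup = F.cross_entropy(model(x), y)
    cons = F.mse_loss(model(x_aug), model(x).detach())
    return sup + lam * cons
```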
no code implementations • 1 Jan 2021 • Patrick Chen, Hsiang-Fu Yu, Inderjit S Dhillon, Cho-Jui Hsieh
In this paper, we observe that the learned representation of each layer lies in a low-dimensional space.
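This observation is straightforward to test numerically. The sketch below uses synthetic, approximately rank-32 activations rather than a real network's, and measures how many singular directions of a layer's activation matrix capture 99% of the energy; the toy data and the threshold are illustrative assumptions, not the paper's setup.

```python
# Hedged sketch: if a layer's representations lie near a low-dimensional
# subspace, the singular values of the activation matrix decay fast, so a
# small number of directions captures almost all of the energy.
import numpy as np

rng = np.random.default_rng(0)
# H: (n_samples x hidden_dim) activations, constructed to be ~rank 32.
H = rng.normal(size=(4096, 32)) @ rng.normal(size=(32, 768))
H += 0.01 * rng.normal(size=H.shape)                      # small noise

s = np.linalg.svd(H, compute_uv=False)
energy = np.cumsum(s**2) / np.sum(s**2)
eff_rank = int(np.searchsorted(energy, 0.99)) + 1
print(f"directions capturing 99% of the energy: {eff_rank} / {H.shape[1]}")
```

When the effective rank is far below the hidden dimension, the layer admits a low-rank approximation, which is what makes the observation useful for compression.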