no code implementations • 20 Dec 2023 • Rahul Chand, Yashoteja Prabhu, Pratyush Kumar
Extensive experiments on multiple natural language understanding benchmarks demonstrate that DSFormer obtains up to 40% better compression than state-of-the-art low-rank factorizers, leading semi-structured sparsity baselines, and popular knowledge distillation approaches.
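For context on the low-rank factorization baselines mentioned above, a minimal sketch of compressing a weight matrix via truncated SVD (this illustrates the baseline family only, not DSFormer's method; all names are illustrative):

```python
import numpy as np

def low_rank_compress(W, rank):
    """Approximate W with a rank-`rank` factorization W ~ A @ B,
    shrinking the parameter count from m*n to rank*(m+n)."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # shape (m, rank)
    B = Vt[:rank, :]             # shape (rank, n)
    return A, B

# Toy example: compress a 64x64 matrix to rank 8
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
A, B = low_rank_compress(W, 8)
compression = W.size / (A.size + B.size)  # 4x fewer parameters here
```

Truncated SVD gives the best rank-k approximation in Frobenius norm, which is why it is a common compression baseline.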
no code implementations • 7 Oct 2022 • Parikshit Bansal, Yashoteja Prabhu, Emre Kiciman, Amit Sharma
To explain this generalization failure, we consider an intervention-based importance metric, which shows that a fine-tuned model captures spurious correlations and fails to learn the causal features that determine the relevance between any two text inputs.
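An intervention-based importance metric can be illustrated with a minimal sketch, assuming a generic relevance scorer and a token-masking intervention (this is an illustration of the general idea, not the paper's exact metric; `overlap_score` is a toy stand-in for a fine-tuned model):

```python
def intervention_importance(score_fn, tokens_a, tokens_b, mask="[MASK]"):
    """Estimate each token's importance to a relevance score by
    intervening: replace the token with a mask and measure how much
    the score drops. Tokens carrying spurious correlations can show
    high importance despite being causally irrelevant."""
    base = score_fn(tokens_a, tokens_b)
    importances = []
    for i in range(len(tokens_a)):
        intervened = tokens_a[:i] + [mask] + tokens_a[i + 1:]
        importances.append(base - score_fn(intervened, tokens_b))
    return importances

# Toy relevance scorer: Jaccard word overlap between two token lists
def overlap_score(a, b):
    return len(set(a) & set(b)) / max(len(set(a) | set(b)), 1)

imp = intervention_importance(
    overlap_score,
    ["cheap", "flights", "paris"],
    ["flights", "to", "paris"],
)
# "cheap" gets zero importance; "flights" and "paris" drive the score
```

Comparing such importance scores across environments is one way to surface features a model relies on that do not determine true relevance.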
no code implementations • 15 Jan 2020 • Yashoteja Prabhu, Aditya Kusupati, Nilesh Gupta, Manik Varma
This paper also introduces a new labelwise prediction algorithm in XReg that is useful for DSA and other recommendation tasks.