Search Results for author: Madhura Pande

Found 4 papers, 1 paper with code

On the weak link between importance and prunability of attention heads

no code implementations • EMNLP 2020 • Aakriti Budhraja, Madhura Pande, Preksha Nema, Pratyush Kumar, Mitesh M. Khapra

Given the success of Transformer-based models, two directions of study have emerged: interpreting the roles of individual attention heads and down-sizing the models for efficiency.

On the Prunability of Attention Heads in Multilingual BERT

no code implementations • 26 Sep 2021 • Aakriti Budhraja, Madhura Pande, Pratyush Kumar, Mitesh M. Khapra

Large multilingual models, such as mBERT, have shown promise in crosslingual transfer.

The heads hypothesis: A unifying statistical approach towards understanding multi-headed attention in BERT

1 code implementation • 22 Jan 2021 • Madhura Pande, Aakriti Budhraja, Preksha Nema, Pratyush Kumar, Mitesh M. Khapra

There are two main challenges with existing methods for classifying attention heads into functional roles: (a) there are no standard scores across studies or across functional roles, and (b) these scores are often averages measured across sentences that do not capture statistical significance.
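The statistical-significance point lends itself to a short illustration. Below is a minimal sketch, not the paper's actual procedure: the per-sentence scores, the 0.5 threshold, and the 50% chance level are all invented for the example. Rather than reporting only a mean score for a head, it tests whether the head clears the threshold on significantly more sentences than chance would predict.

```python
# Minimal sketch: instead of averaging a head's functional-role score across
# sentences, test whether the head "passes" on significantly more sentences
# than chance. All numbers below are illustrative, not taken from the paper.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical per-sentence scores for one attention head (e.g., fraction of
# attention mass placed on the relevant token in each sentence).
per_sentence_scores = rng.beta(a=6, b=4, size=200)

threshold = 0.5  # score above which the head counts as performing the role
hits = int((per_sentence_scores > threshold).sum())
n = per_sentence_scores.size

# A mean alone hides how consistently the head behaves across sentences...
print(f"mean score: {per_sentence_scores.mean():.3f}")

# ...so run a one-sided binomial test: does the head beat the threshold on
# more sentences than the 50% expected by chance?
result = stats.binomtest(hits, n, p=0.5, alternative="greater")
print(f"hits: {hits}/{n}, p-value: {result.pvalue:.4g}")
```

A per-head p-value of this kind can then be compared across heads and studies, which is precisely what a bare average cannot do.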
