Search Results for author: Sandipan Dandapat

Found 15 papers, 2 papers with code

A Case Study of Efficacy and Challenges in Practical Human-in-Loop Evaluation of NLP Systems Using Checklist

no code implementations EACL (HumEval) 2021 Shaily Bhatt, Rahul Jain, Sandipan Dandapat, Sunayana Sitaram

We conduct experiments for evaluating an offensive content detection system and use a data augmentation technique for improving the model using insights from Checklist.

Data Augmentation

"Diversity and Uncertainty in Moderation" are the Key to Data Selection for Multilingual Few-shot Transfer

no code implementations Findings (NAACL) 2022 Shanu Kumar, Sandipan Dandapat, Monojit Choudhury

Few-shot transfer often shows substantial gain over zero-shot transfer (Lauscher et al., 2020), which is a practically useful trade-off between fully supervised and unsupervised learning approaches for multilingual pretrained model-based systems.

Language Modelling NER +2

DiTTO: A Feature Representation Imitation Approach for Improving Cross-Lingual Transfer

no code implementations 4 Mar 2023 Shanu Kumar, Abbaraju Soujanya, Sandipan Dandapat, Sunayana Sitaram, Monojit Choudhury

Zero-shot cross-lingual transfer is promising; however, it has been shown to be sub-optimal, with inferior transfer performance across low-resource languages.

Zero-Shot Cross-Lingual Transfer

Too Brittle To Touch: Comparing the Stability of Quantization and Distillation Towards Developing Lightweight Low-Resource MT Models

1 code implementation 27 Oct 2022 Harshita Diddee, Sandipan Dandapat, Monojit Choudhury, Tanuja Ganu, Kalika Bali

Leveraging shared learning through Massively Multilingual Models, state-of-the-art machine translation models are often able to adapt to the paucity of data for low-resource languages.

Knowledge Distillation Machine Translation +1

On the Calibration of Massively Multilingual Language Models

1 code implementation 21 Oct 2022 Kabir Ahuja, Sunayana Sitaram, Sandipan Dandapat, Monojit Choudhury

Massively Multilingual Language Models (MMLMs) have recently gained popularity due to their surprising effectiveness in cross-lingual transfer.

Cross-Lingual Transfer

"Diversity and Uncertainty in Moderation" are the Key to Data Selection for Multilingual Few-shot Transfer

no code implementations 30 Jun 2022 Shanu Kumar, Sandipan Dandapat, Monojit Choudhury

Few-shot transfer often shows substantial gain over zero-shot transfer (Lauscher et al., 2020), which is a practically useful trade-off between fully supervised and unsupervised learning approaches for multilingual pretrained model-based systems.

Language Modelling NER +2

On the Economics of Multilingual Few-shot Learning: Modeling the Cost-Performance Trade-offs of Machine Translated and Manual Data

no code implementations NAACL 2022 Kabir Ahuja, Monojit Choudhury, Sandipan Dandapat

Borrowing ideas from production functions in micro-economics, in this paper we introduce a framework to systematically evaluate the performance and cost trade-offs between machine-translated and manually-created labelled data for task-specific fine-tuning of massively multilingual language models.

Few-Shot Learning Machine Translation +1
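As a rough illustration of the production-function framing (a minimal sketch only; this snippet does not give the paper's actual functional form, and all symbols below are hypothetical), task performance P could be modelled as a Cobb-Douglas production function of n_m manually-created and n_t machine-translated labelled examples:

    P(n_m, n_t) = A · n_m^α · n_t^β,   maximised subject to the budget c_m·n_m + c_t·n_t ≤ B,

where c_m and c_t are per-example costs of manual annotation and machine translation, and the exponents α and β capture the diminishing returns of each data source.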

Multi Task Learning For Zero Shot Performance Prediction of Multilingual Models

no code implementations ACL 2022 Kabir Ahuja, Shanu Kumar, Sandipan Dandapat, Monojit Choudhury

Massively Multilingual Transformer based Language Models have been observed to be surprisingly effective on zero-shot transfer across languages, though the performance varies from language to language depending on the pivot language(s) used for fine-tuning.

feature selection Multi-Task Learning

Beyond Static Models and Test Sets: Benchmarking the Potential of Pre-trained Models Across Tasks and Languages

no code implementations nlppower (ACL) 2022 Kabir Ahuja, Sandipan Dandapat, Sunayana Sitaram, Monojit Choudhury

Although recent Massively Multilingual Language Models (MMLMs) like mBERT and XLMR support around 100 languages, most existing multilingual NLP benchmarks provide evaluation data in only a handful of these languages with little linguistic diversity.

Benchmarking Multilingual NLP +1

Predicting the Performance of Multilingual NLP Models

no code implementations 17 Oct 2021 Anirudh Srinivasan, Sunayana Sitaram, Tanuja Ganu, Sandipan Dandapat, Kalika Bali, Monojit Choudhury

Recent advancements in NLP have given us models like mBERT and XLMR that can serve over 100 languages.

Multilingual NLP

On the Universality of Deep Contextual Language Models

no code implementations ICON 2021 Shaily Bhatt, Poonam Goyal, Sandipan Dandapat, Monojit Choudhury, Sunayana Sitaram

Deep Contextual Language Models (LMs) like ELMo, BERT, and their successors dominate the landscape of Natural Language Processing due to their ability to scale across multiple tasks rapidly by pre-training a single model, followed by task-specific fine-tuning.

XLM-R Zero-Shot Cross-Lingual Transfer

A New Dataset for Natural Language Inference from Code-mixed Conversations

no code implementations LREC 2020 Simran Khanuja, Sandipan Dandapat, Sunayana Sitaram, Monojit Choudhury

Code-mixing is the use of more than one language in the same conversation or utterance, and is prevalent in multilingual communities all over the world.

Natural Language Inference
