Search Results for author: Simran Khanuja

Found 18 papers, 4 papers with code

What Is Missing in Multilingual Visual Reasoning and How to Fix It

1 code implementation • 3 Mar 2024 • Yueqi Song, Simran Khanuja, Graham Neubig

NLP models today strive to support multiple languages and modalities, improving accessibility for diverse users.

Image Captioning, Visual Reasoning

DeMuX: Data-efficient Multilingual Learning

no code implementations • 10 Nov 2023 • Simran Khanuja, Srinivas Gowriraj, Lucio Dery, Graham Neubig

In this paper, we introduce DeMuX, a framework that prescribes the exact data points to label from vast amounts of unlabelled multilingual data, having unknown degrees of overlap with the target set.

Active Learning

Evaluating the Diversity, Equity and Inclusion of NLP Technology: A Case Study for Indian Languages

no code implementations • 25 May 2022 • Simran Khanuja, Sebastian Ruder, Partha Talukdar

In order for NLP technology to be widely applicable, fair, and useful, it needs to serve a diverse set of speakers across the world's languages, be equitable, i.e., not unduly biased towards any particular language, and be inclusive of all users, particularly in low-resource settings where compute constraints are common.

XTREME-S: Evaluating Cross-lingual Speech Representations

no code implementations • 21 Mar 2022 • Alexis Conneau, Ankur Bapna, Yu Zhang, Min Ma, Patrick von Platen, Anton Lozhkov, Colin Cherry, Ye Jia, Clara Rivera, Mihir Kale, Daan van Esch, Vera Axelrod, Simran Khanuja, Jonathan H. Clark, Orhan Firat, Michael Auli, Sebastian Ruder, Jason Riesa, Melvin Johnson

Covering 102 languages from 10+ language families, 3 different domains and 4 task families, XTREME-S aims to simplify multilingual speech representation evaluation, as well as catalyze research in "universal" speech representation learning.

Representation Learning, Retrieval, +4

mSLAM: Massively multilingual joint pre-training for speech and text

no code implementations • 3 Feb 2022 • Ankur Bapna, Colin Cherry, Yu Zhang, Ye Jia, Melvin Johnson, Yong Cheng, Simran Khanuja, Jason Riesa, Alexis Conneau

We present mSLAM, a multilingual Speech and LAnguage Model that learns cross-lingual cross-modal representations of speech and text by pre-training jointly on large amounts of unlabeled speech and text in multiple languages.

Intent Classification, +4

MergeDistill: Merging Pre-trained Language Models using Distillation

no code implementations • 5 Jun 2021 • Simran Khanuja, Melvin Johnson, Partha Talukdar

Pre-trained multilingual language models (LMs) have achieved state-of-the-art results in cross-lingual transfer, but they often lead to an inequitable representation of languages due to limited capacity, skewed pre-training data, and sub-optimal vocabularies.

Cross-Lingual Transfer, Knowledge Distillation

MuRIL: Multilingual Representations for Indian Languages

1 code implementation • 19 Mar 2021 • Simran Khanuja, Diksha Bansal, Sarvesh Mehtani, Savya Khosla, Atreyee Dey, Balaji Gopalan, Dilip Kumar Margam, Pooja Aggarwal, Rajiv Teja Nagipogu, Shachi Dave, Shruti Gupta, Subhash Chandra Bose Gali, Vish Subramanian, Partha Talukdar

Today's state-of-the-art multilingual systems perform suboptimally on Indian (IN) languages. This can be explained by the fact that multilingual language models (LMs) are often trained on 100+ languages together, leading to a small representation of IN languages in their vocabulary and training data.

Cross-lingual and Multilingual Spoken Term Detection for Low-Resource Indian Languages

no code implementations • 12 Nov 2020 • Sanket Shah, Satarupa Guha, Simran Khanuja, Sunayana Sitaram

Since no publicly available dataset exists for Spoken Term Detection in these low-resource Indian languages, we create a new dataset using a publicly available TTS dataset.

A New Dataset for Natural Language Inference from Code-mixed Conversations

no code implementations • LREC 2020 • Simran Khanuja, Sandipan Dandapat, Sunayana Sitaram, Monojit Choudhury

Code-mixing is the use of more than one language in the same conversation or utterance, and is prevalent in multilingual communities all over the world.

Natural Language Inference
