Search Results for author: Ananya B. Sai

Found 12 papers, 6 papers with code

How Good is Zero-Shot MT Evaluation for Low Resource Indian Languages?

1 code implementation • 6 Jun 2024 Anushka Singh, Ananya B. Sai, Raj Dabre, Ratish Puduppully, Anoop Kunchukuttan, Mitesh M Khapra

While machine translation evaluation has been studied primarily for high-resource languages, there has been a recent interest in evaluation for low-resource languages due to the increasing availability of data and models.

Machine Translation

Closing the Gap in the Trade-off between Fair Representations and Accuracy

no code implementations • 15 Apr 2024 Biswajit Rout, Ananya B. Sai, Arun Rajkumar

The rapid development of various machine learning models and their deployment in several applications has led to discussions around the importance of looking beyond the accuracy of these models.

Fairness

IndicMT Eval: A Dataset to Meta-Evaluate Machine Translation metrics for Indian Languages

1 code implementation • 20 Dec 2022 Ananya B. Sai, Vignesh Nagarajan, Tanay Dixit, Raj Dabre, Anoop Kunchukuttan, Pratyush Kumar, Mitesh M. Khapra

In this paper, we fill this gap by creating an MQM dataset consisting of 7000 fine-grained annotations, spanning 5 Indian languages and 7 MT systems, and using it to establish correlations between annotator scores and scores obtained using existing automatic metrics (an illustrative correlation computation is sketched after this entry).

Machine Translation
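The meta-evaluation described in the IndicMT Eval abstract above comes down to correlating human scores derived from MQM annotations with the scores of automatic metrics on the same segments. Below is a minimal, illustrative sketch of that correlation step in Python, using hypothetical score lists rather than the actual dataset or the authors' code.

```python
# Illustrative only: hypothetical scores, not the IndicMT Eval data or code.
from scipy.stats import kendalltau, pearsonr

# Human scores (e.g., derived from MQM error annotations) and scores from an
# automatic metric for the same set of translated segments.
human_scores  = [0.9, 0.4, 0.7, 0.2, 0.8, 0.5]    # hypothetical
metric_scores = [0.85, 0.5, 0.6, 0.1, 0.9, 0.45]  # hypothetical

tau, tau_p = kendalltau(human_scores, metric_scores)
r, r_p = pearsonr(human_scores, metric_scores)

print(f"Kendall tau = {tau:.3f} (p = {tau_p:.3f})")
print(f"Pearson r   = {r:.3f} (p = {r_p:.3f})")
```

A metric whose scores track the human scores more closely yields a higher correlation; comparing such coefficients across metrics is how a dataset of this kind is used to rank them.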

NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation

2 code implementations • 6 Dec 2021 Kaustubh D. Dhole, Varun Gangal, Sebastian Gehrmann, Aadesh Gupta, Zhenhao Li, Saad Mahamood, Abinaya Mahendiran, Simon Mille, Ashish Shrivastava, Samson Tan, Tongshuang Wu, Jascha Sohl-Dickstein, Jinho D. Choi, Eduard Hovy, Ondrej Dusek, Sebastian Ruder, Sajant Anand, Nagender Aneja, Rabin Banjade, Lisa Barthe, Hanna Behnke, Ian Berlot-Attwell, Connor Boyle, Caroline Brun, Marco Antonio Sobrevilla Cabezudo, Samuel Cahyawijaya, Emile Chapuis, Wanxiang Che, Mukund Choudhary, Christian Clauss, Pierre Colombo, Filip Cornell, Gautier Dagan, Mayukh Das, Tanay Dixit, Thomas Dopierre, Paul-Alexis Dray, Suchitra Dubey, Tatiana Ekeinhor, Marco Di Giovanni, Tanya Goyal, Rishabh Gupta, Louanes Hamla, Sang Han, Fabrice Harel-Canada, Antoine Honore, Ishan Jindal, Przemyslaw K. Joniak, Denis Kleyko, Venelin Kovatchev, Kalpesh Krishna, Ashutosh Kumar, Stefan Langer, Seungjae Ryan Lee, Corey James Levinson, Hualou Liang, Kaizhao Liang, Zhexiong Liu, Andrey Lukyanenko, Vukosi Marivate, Gerard de Melo, Simon Meoni, Maxime Meyer, Afnan Mir, Nafise Sadat Moosavi, Niklas Muennighoff, Timothy Sum Hon Mun, Kenton Murray, Marcin Namysl, Maria Obedkova, Priti Oli, Nivranshu Pasricha, Jan Pfister, Richard Plant, Vinay Prabhu, Vasile Pais, Libo Qin, Shahab Raji, Pawan Kumar Rajpoot, Vikas Raunak, Roy Rinberg, Nicolas Roberts, Juan Diego Rodriguez, Claude Roux, Vasconcellos P. H. S., Ananya B. Sai, Robin M. Schmidt, Thomas Scialom, Tshephisho Sefara, Saqib N. Shamsi, Xudong Shen, Haoyue Shi, Yiwen Shi, Anna Shvets, Nick Siegel, Damien Sileo, Jamie Simon, Chandan Singh, Roman Sitelew, Priyank Soni, Taylor Sorensen, William Soto, Aman Srivastava, KV Aditya Srivatsa, Tony Sun, Mukund Varma T, A Tabassum, Fiona Anting Tan, Ryan Teehan, Mo Tiwari, Marie Tolkiehn, Athena Wang, Zijian Wang, Gloria Wang, Zijie J. Wang, Fuxuan Wei, Bryan Wilie, Genta Indra Winata, Xinyi Wu, Witold Wydmański, Tianbao Xie, Usama Yaseen, Michael A. Yee, Jing Zhang, Yue Zhang

Data augmentation is an important component in the robustness evaluation of models in natural language processing (NLP) and in enhancing the diversity of the data they are trained on.

Data Augmentation • Diversity

Perturbation CheckLists for Evaluating NLG Evaluation Metrics

1 code implementation • EMNLP 2021 Ananya B. Sai, Tanay Dixit, Dev Yashpal Sheth, Sreyas Mohan, Mitesh M. Khapra

Natural Language Generation (NLG) evaluation is a multifaceted task requiring assessment of multiple desirable criteria, e.g., fluency, coherency, coverage, relevance, adequacy, overall quality, etc.

Data-to-Text Generation • nlg evaluation

Improving Dialog Evaluation with a Multi-reference Adversarial Dataset and Large Scale Pretraining

1 code implementation • 23 Sep 2020 Ananya B. Sai, Akash Kumar Mohankumar, Siddhartha Arora, Mitesh M. Khapra

However, no such data is publicly available, and hence existing models are usually trained using a single relevant response and multiple randomly selected responses from other contexts (random negatives); a minimal data-construction sketch follows this entry.

Dialogue Evaluation
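The abstract above refers to the common practice of training dialogue evaluation models on one relevant response per context plus random negatives sampled from other contexts. Below is a minimal sketch of that data construction, with a hypothetical toy corpus and field names.

```python
import random

# Hypothetical toy corpus: each item pairs a dialogue context with its reference response.
corpus = [
    {"context": "How was your day?", "response": "Pretty good, thanks for asking."},
    {"context": "Any plans for the weekend?", "response": "Probably just hiking."},
    {"context": "Did you watch the match?", "response": "Yes, it went to penalties!"},
]

def build_training_pairs(corpus, num_negatives=2, seed=0):
    """Pair each context with its relevant response (label 1) and with responses
    randomly sampled from other contexts as random negatives (label 0)."""
    rng = random.Random(seed)
    pairs = []
    for i, item in enumerate(corpus):
        pairs.append((item["context"], item["response"], 1))  # positive
        others = [c["response"] for j, c in enumerate(corpus) if j != i]
        for neg in rng.sample(others, k=min(num_negatives, len(others))):
            pairs.append((item["context"], neg, 0))  # random negative
    return pairs

for context, response, label in build_training_pairs(corpus):
    print(label, "|", context, "->", response)
```

Such random negatives are usually easy to tell apart from the relevant response, which motivates the multi-reference adversarial dataset named in the paper's title.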

A Survey of Evaluation Metrics Used for NLG Systems

no code implementations • 27 Aug 2020 Ananya B. Sai, Akash Kumar Mohankumar, Mitesh M. Khapra

The expanding number of NLG models and the shortcomings of the current metrics have led to a rapid surge in the number of evaluation metrics proposed since 2014.

Image Captioning • nlg evaluation • +2

Frustratingly Poor Performance of Reading Comprehension Models on Non-adversarial Examples

no code implementations • 4 Apr 2019 Soham Parikh, Ananya B. Sai, Preksha Nema, Mitesh M. Khapra

We believe that the non-adversarial dataset created as a part of this work would complement the research on adversarial evaluation and give a more realistic assessment of the ability of RC models.

Reading Comprehension

ElimiNet: A Model for Eliminating Options for Reading Comprehension with Multiple Choice Questions

1 code implementation • ICLR 2018 Soham Parikh, Ananya B. Sai, Preksha Nema, Mitesh M. Khapra

Specifically, it has gates that decide whether an option can be eliminated given the passage-question pair and, if so, it tries to make the passage representation orthogonal to this eliminated option (akin to ignoring the portions of the passage corresponding to the eliminated option); an interpretation of this step is sketched after this entry.

Multiple-choice Reading Comprehension
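One way to read the gating-and-orthogonalization idea in the ElimiNet abstract above: a gate scores how strongly an option should be eliminated, and the passage representation is pushed toward the component orthogonal to that option's representation. The PyTorch snippet below is a hedged interpretation sketch under those assumptions, not the authors' implementation; the module name, gating form, and tensor shapes are illustrative.

```python
import torch
import torch.nn as nn

class EliminationGate(nn.Module):
    """Sketch: a gate decides how strongly to eliminate an option, and the
    passage representation is moved toward the component orthogonal to the
    eliminated option's representation (i.e., that option is 'ignored')."""

    def __init__(self, hidden_dim):
        super().__init__()
        # The gate looks at passage, question, and option representations together.
        self.gate = nn.Linear(3 * hidden_dim, 1)

    def forward(self, passage, question, option):
        # passage, question, option: (batch, hidden_dim)
        g = torch.sigmoid(self.gate(torch.cat([passage, question, option], dim=-1)))

        # Component of the passage along the option direction.
        coeff = (passage * option).sum(-1, keepdim=True) / (
            (option * option).sum(-1, keepdim=True) + 1e-8
        )
        orthogonal = passage - coeff * option  # passage with the option projected out

        # Eliminate (use the orthogonal part) only to the extent the gate fires.
        return g * orthogonal + (1 - g) * passage

# Hypothetical usage with random vectors.
layer = EliminationGate(hidden_dim=64)
p, q, o = torch.randn(2, 64), torch.randn(2, 64), torch.randn(2, 64)
print(layer(p, q, o).shape)  # torch.Size([2, 64])
```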

Re-evaluating ADEM: A Deeper Look at Scoring Dialogue Responses

no code implementations • 23 Feb 2019 Ananya B. Sai, Mithun Das Gupta, Mitesh M. Khapra, Mukundhan Srinivasan

ADEM (Lowe et al. 2017) formulated the automatic evaluation of dialogue systems as a learning problem and showed that such a model was able to predict responses which correlate significantly with human judgements, both at utterance and system level.

Dialogue Evaluation • Response Generation
