no code implementations • WMT (EMNLP) 2021 • Jyotsana Khatri, Rudra Murthy, Pushpak Bhattacharyya
This paper describes our submission for the shared task on Unsupervised MT and Very Low Resource Supervised MT at WMT 2021.
no code implementations • 27 Feb 2025 • Parul Awasthy, Aashka Trivedi, Yulong Li, Mihaela Bornea, David Cox, Abraham Daniels, Martin Franz, Gabe Goodhart, Bhavani Iyer, Vishwajeet Kumar, Luis Lastras, Scott McCarley, Rudra Murthy, Vignesh P, Sara Rosenthal, Salim Roukos, Jaydeep Sen, Sukriti Sharma, Avirup Sil, Kate Soule, Arafat Sultan, Radu Florian
We introduce the Granite Embedding models, a family of encoder-based embedding models designed for retrieval tasks, spanning dense and sparse retrieval architectures, with both English and multilingual capabilities.
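A minimal sketch of how an encoder-based embedding model of this kind is typically used for dense retrieval, assuming the sentence-transformers library; the checkpoint name below is a placeholder, not a confirmed identifier from the Granite Embedding release.

```python
# Hedged sketch: dense retrieval with an encoder-based embedding model.
# The model name is an assumption for illustration, not a confirmed identifier.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("ibm-granite/granite-embedding-english")  # hypothetical checkpoint

queries = ["What is multilingual retrieval?"]
passages = [
    "Multilingual retrieval finds relevant documents across languages.",
    "Sparse retrievers score documents by weighted term overlap.",
]

# Encode queries and passages into dense vectors, then rank passages by cosine similarity.
q_emb = model.encode(queries, normalize_embeddings=True)
p_emb = model.encode(passages, normalize_embeddings=True)
scores = util.cos_sim(q_emb, p_emb)  # shape: (num_queries, num_passages)
print(scores)
```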
1 code implementation • 4 Nov 2024 • Sshubam Verma, Mohammed Safi Ur Rahman Khan, Vishwajeet Kumar, Rudra Murthy, Jaydeep Sen
Models also perform better on high-resource languages than on low-resource ones.
no code implementations • 16 Oct 2024 • Rudra Murthy, Prince Kumar, Praveen Venkateswaran, Danish Contractor
In this work, we focus on developing a benchmark for instruction-following where it is easy to verify both task performance and instruction-following capabilities.
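To make the idea of verifiable instruction-following concrete, here is a small illustrative checker, not taken from the benchmark itself, for a constraint whose satisfaction can be tested programmatically (e.g., "answer in exactly three bullet points"):

```python
# Illustrative only: a programmatic check for a simple formatting instruction.
# The benchmark's actual tasks and verifiers may differ.
def follows_bullet_instruction(response: str, expected_bullets: int = 3) -> bool:
    """Return True if the response consists of exactly `expected_bullets`
    non-empty lines that each start with a bullet marker."""
    lines = [line.strip() for line in response.strip().splitlines() if line.strip()]
    return len(lines) == expected_bullets and all(line.startswith(("-", "*")) for line in lines)

print(follows_bullet_instruction("- one\n- two\n- three"))  # True
print(follows_bullet_instruction("one\ntwo\nthree"))        # False
```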
no code implementations • 9 Sep 2024 • Arkadeep Acharya, Rudra Murthy, Vishwajeet Kumar, Jaydeep Sen
Given the large number of Hindi speakers worldwide, there is a pressing need for robust and efficient information retrieval systems for Hindi.
1 code implementation • 20 Aug 2024 • Meet Doshi, Vishwajeet Kumar, Rudra Murthy, Vignesh P, Jaydeep Sen
We use Mistral as the backbone to develop our learned sparse retriever, similar to SPLADE, and train it on a subset of the sentence-transformers data that is commonly used for training text-embedding models.
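A hedged sketch of the SPLADE-style sparse representation alluded to above: vocabulary-sized logits from a language-model head are passed through log(1 + ReLU) and max-pooled over the sequence to give a sparse term-weight vector. The shapes and the random inputs are placeholders, not the paper's actual implementation.

```python
# Sketch of a SPLADE-style learned sparse representation (illustrative, not the paper's code).
import torch
import torch.nn.functional as F

def splade_representation(logits: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """logits: (batch, seq_len, vocab_size) from a language-model head.
    attention_mask: (batch, seq_len) with 1 for real tokens, 0 for padding.
    Returns a (batch, vocab_size) sparse term-weight vector."""
    # log-saturated ReLU keeps weights non-negative and encourages sparsity
    weights = torch.log1p(F.relu(logits))
    # mask out padding positions before pooling
    weights = weights * attention_mask.unsqueeze(-1)
    # max-pool over the sequence dimension to get one weight per vocabulary term
    return weights.max(dim=1).values

# Example with random logits standing in for a real backbone such as Mistral.
logits = torch.randn(2, 16, 32000)
mask = torch.ones(2, 16)
rep = splade_representation(logits, mask)
print(rep.shape)  # torch.Size([2, 32000])
```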
no code implementations • 18 Jul 2024 • Abhishek Kumar Singh, Rudra Murthy, Vishwajeet Kumar, Jaydeep Sen, Ganesh Ramakrishnan
Large Language Models (LLMs) have demonstrated remarkable zero-shot and few-shot capabilities in unseen tasks, including context-grounded question answering (QA) in English.
1 code implementation • 26 Jan 2024 • Jay Gala, Thanmay Jayakumar, Jaavid Aktar Husain, Aswanth Kumar M, Mohammed Safi Ur Rahman Khan, Diptesh Kanojia, Ratish Puduppully, Mitesh M. Khapra, Raj Dabre, Rudra Murthy, Anoop Kunchukuttan
We announce the initial release of "Airavata," an instruction-tuned LLM for Hindi.
no code implementations • 13 Jan 2024 • Settaluri Lakshmi Sravanthi, Meet Doshi, Tankala Pavan Kalyan, Rudra Murthy, Pushpak Bhattacharyya, Raj Dabre
To demonstrate this, we release the Pragmatics Understanding Benchmark (PUB), a dataset consisting of fourteen tasks across four pragmatics phenomena: implicature, presupposition, reference, and deixis.
4 code implementations • 9 May 2023 • Raymond Li, Loubna Ben allal, Yangtian Zi, Niklas Muennighoff, Denis Kocetkov, Chenghao Mou, Marc Marone, Christopher Akiki, Jia Li, Jenny Chim, Qian Liu, Evgenii Zheltonozhskii, Terry Yue Zhuo, Thomas Wang, Olivier Dehaene, Mishig Davaadorj, Joel Lamy-Poirier, João Monteiro, Oleh Shliazhko, Nicolas Gontier, Nicholas Meade, Armel Zebaze, Ming-Ho Yee, Logesh Kumar Umapathi, Jian Zhu, Benjamin Lipkin, Muhtasham Oblokulov, Zhiruo Wang, Rudra Murthy, Jason Stillerman, Siva Sankalp Patel, Dmitry Abulkhanov, Marco Zocca, Manan Dey, Zhihan Zhang, Nour Fahmy, Urvashi Bhattacharyya, Wenhao Yu, Swayam Singh, Sasha Luccioni, Paulo Villegas, Maxim Kunakov, Fedor Zhdanov, Manuel Romero, Tony Lee, Nadav Timor, Jennifer Ding, Claire Schlesinger, Hailey Schoelkopf, Jan Ebert, Tri Dao, Mayank Mishra, Alex Gu, Jennifer Robinson, Carolyn Jane Anderson, Brendan Dolan-Gavitt, Danish Contractor, Siva Reddy, Daniel Fried, Dzmitry Bahdanau, Yacine Jernite, Carlos Muñoz Ferrandis, Sean Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, Harm de Vries
The BigCode community, an open-scientific collaboration working on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder and StarCoderBase: 15.5B parameter models with 8K context length, infilling capabilities, and fast large-batch inference enabled by multi-query attention.
Ranked #53 on Code Generation on MBPP
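As a hedged illustration of the infilling capability mentioned above, StarCoder-family models are typically prompted with fill-in-the-middle sentinel tokens; the model identifier and sentinel token names below are assumptions based on the BigCode release and should be checked against the model card.

```python
# Hedged sketch: fill-in-the-middle (FIM) prompting for a StarCoder-style model.
# Model id and sentinel tokens are assumptions; verify against the official model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoderbase"  # assumed identifier
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

prefix = "def fibonacci(n):\n    "
suffix = "\n    return a"
prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```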
1 code implementation • LREC 2022 • Rudra Murthy, Pallab Bhattacharjee, Rahul Sharnagat, Jyotsana Khatri, Diptesh Kanojia, Pushpak Bhattacharyya
We use different language models to perform the sequence-labelling task for NER and demonstrate the efficacy of our data through a comparative evaluation against models trained on another available Hindi NER dataset.
Ranked #1 on Named Entity Recognition (NER) on HiNER-original
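A minimal sketch of the sequence-labelling setup described above, using the Hugging Face token-classification pipeline; the checkpoint name is a placeholder rather than the released HiNER model.

```python
# Hedged sketch: NER as token classification with a fine-tuned checkpoint.
# The model name is a placeholder; substitute the actual HiNER-trained model.
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="your-org/hiner-ner-model",   # hypothetical checkpoint
    aggregation_strategy="simple",       # merge word-piece predictions into entity spans
)

print(ner("रुद्र मूर्ति मुंबई में रहते हैं"))  # example Hindi sentence
```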
no code implementations • ICON 2020 • Sandeep Mathias, Rudra Murthy, Diptesh Kanojia, Pushpak Bhattacharyya
Automatic essay grading (AEG) is a process in which machines assign a grade to an essay written in response to a topic, called the prompt.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Girishkumar Ponkiya, Rudra Murthy, Pushpak Bhattacharyya, Girish Palshikar
Our approach uses templates to prepare the input sequence for the language model.
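As a hedged illustration of template-based input preparation (the paper's actual templates and model may differ), one common pattern is to recast a noun compound as a cloze-style paraphrase and let a masked language model fill in the relation word:

```python
# Illustrative sketch: a template turns a noun compound into a cloze query for a masked LM.
# The template wording and model choice are assumptions, not the paper's exact setup.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

def compound_template(head: str, modifier: str) -> str:
    # e.g. "olive oil" -> "oil [MASK] of olive", hoping for a relation word like "made"
    return f"{head} [MASK] of {modifier}"

for prediction in fill(compound_template("oil", "olive"))[:3]:
    print(prediction["token_str"], round(prediction["score"], 3))
```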
1 code implementation • Asian Chapter of the Association for Computational Linguistics 2020 • Sandeep Mathias, Rudra Murthy, Diptesh Kanojia, Abhijit Mishra, Pushpak Bhattacharyya
To demonstrate the efficacy of this multi-task learning approach to automatic essay grading, we collect gaze behaviour for 48 essays across 4 essay sets and learn gaze behaviour for the remaining essays, numbering over 7,000.
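A hedged sketch of the multi-task idea: a shared encoder feeds two heads, one predicting the essay grade and one predicting gaze-behaviour features, with the two losses combined during training. The architecture and feature dimensions here are placeholders, not the paper's model.

```python
# Illustrative multi-task model: shared encoder, one head for the essay grade,
# one auxiliary head for gaze-behaviour features. Dimensions are placeholders.
import torch
import torch.nn as nn

class MultiTaskEssayGrader(nn.Module):
    def __init__(self, input_dim: int = 768, hidden_dim: int = 256, gaze_dim: int = 5):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.grade_head = nn.Linear(hidden_dim, 1)        # essay score (regression)
        self.gaze_head = nn.Linear(hidden_dim, gaze_dim)  # auxiliary gaze features

    def forward(self, essay_repr: torch.Tensor):
        shared = self.encoder(essay_repr)
        return self.grade_head(shared), self.gaze_head(shared)

model = MultiTaskEssayGrader()
essay_repr = torch.randn(4, 768)  # e.g. pooled text-encoder representations
grade_pred, gaze_pred = model(essay_repr)
# weighted sum of the main (grade) and auxiliary (gaze) losses
loss = nn.MSELoss()(grade_pred.squeeze(-1), torch.rand(4)) + \
       0.5 * nn.MSELoss()(gaze_pred, torch.rand(4, 5))
print(loss.item())
```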
1 code implementation • ACL 2018 • Rudra Murthy, Anoop Kunchukuttan, Pushpak Bhattacharyya
Multilingual learning for Neural Named Entity Recognition (NNER) involves jointly training a neural network for multiple languages.