no code implementations • MTSummit 2021 • Vandan Mujadia, Dipti Misra Sharma
In this paper, we (team - oneNLP-IIITH) describe our Neural Machine Translation approaches for English-Marathi (both directions) for LoResMT-2021.
no code implementations • ICON 2019 • Priyank Gupta, Manish Shrivastava, Dipti Misra Sharma, Rashid Ahmad
Similarly, translators working on Computer Aided Translation workbenches also require help from various kinds of resources (glossaries, terminologies, concordances and translation memory) in the workbenches in order to increase their productivity.
no code implementations • 5 Dec 2024 • Vandan Mujadia, Dipti Misra Sharma
This paper focuses on developing translation models and related applications for 36 Indian languages, including Assamese, Awadhi, Bengali, Bhojpuri, Braj, Bodo, Dogri, English, Konkani, Gondi, Gujarati, Hindi, Hinglish, Ho, Kannada, Kangri, Kashmiri (Arabic and Devanagari), Khasi, Mizo, Magahi, Maithili, Malayalam, Marathi, Manipuri (Bengali and Meitei), Nepali, Oriya, Punjabi, Sanskrit, Santali, Sinhala, Sindhi (Arabic and Devanagari), Tamil, Tulu, Telugu, and Urdu.
no code implementations • 8 May 2024 • Sankalp Bahad, Pruthwik Mishra, Karunesh Arora, Rakesh Chandra Balabantaray, Dipti Misra Sharma, Parameswari Krishnamurthy
We present human-annotated named entity corpora of 40K sentences for 4 Indian languages from two of the major Indian language families.
no code implementations • 3 Apr 2024 • Vandan Mujadia, Pruthwik Mishra, Arafat Ahsan, Dipti Misra Sharma
We constructed a translation evaluation task where we performed zero-shot learning, in-context example-driven learning, and fine-tuning of large language models to provide a score out of 100, where 100 represents a perfect translation and 1 represents a poor translation.
no code implementations • 22 Dec 2023 • Nikhilesh Bhatnagar, Ashok Urlana, Vandan Mujadia, Pruthwik Mishra, Dipti Misra Sharma
We analyze the data and propose methods to match articles to video descriptions that serve as document and summary pairs.
1 code implementation • 18 Dec 2023 • Harshita Sharma, Pruthwik Mishra, Dipti Misra Sharma
Verbs are very important for solving word problems with addition/subtraction operations as they help us identify the set of operations required to solve the word problems.
no code implementations • 15 Nov 2023 • Vandan Mujadia, Ashok Urlana, Yash Bhaskar, Penumalla Aditya Pavani, Kukkapalli Shravya, Parameswari Krishnamurthy, Dipti Misra Sharma
In this work, our aim is to explore the multilingual capabilities of large language models by using machine translation as a task involving English and 22 Indian languages.
no code implementations • 21 Oct 2022 • Akshat Gahoi, Jayant Duneja, Anshul Padhi, Shivam Mangale, Saransh Rajput, Tanvi Kamble, Dipti Misra Sharma, Vasudeva Varma
The first task dealt with both Roman and Devanagari script as we had monolingual data in both English and Hindi whereas the second task only had data in Roman script.
2 code implementations • 19 Apr 2022 • Pruthwik Mishra, Dipti Misra Sharma
Shallow parsing is an essential task for many NLP applications like machine translation, summarization, sentiment analysis, aspect identification and many more.
no code implementations • ICON 2021 • Arafat Ahsan, Vandan Mujadia, Dipti Misra Sharma
We present findings from a first in-depth post-editing effort estimation study in the English-Hindi direction along multiple effort indicators.
no code implementations • ACL 2021 • Sourav Kumar, Salil Aggarwal, Dipti Misra Sharma, Radhika Mamidi
India is one of the most linguistically diverse nations of the world and is culturally very rich.
1 code implementation • ACL (SIGMORPHON) 2021 • Saujas Vaduguru, Aalok Sathe, Monojit Choudhury, Dipti Misra Sharma
Neural models excel at extracting statistical patterns from large amounts of data, but struggle to learn patterns or reason about language from only a few examples.
no code implementations • ACL 2020 • Vinay Pandramish, Dipti Misra Sharma
After training a Neural Machine Translation (NMT) baseline system, it has been observed that these iteration outputs have an oracle score higher than baseline up to 1.01 BLEU points compared to the last iteration of the trained system. We come up with a ranking mechanism by solely focusing on the decoder's ability to generate distinct tokens and without the usage of any language model or data.
no code implementations • ACL 2020 • Vikrant Goyal, Sourav Kumar, Dipti Misra Sharma
Since the condition of large parallel corpora is not met for Indian-English language pairs, we present our efforts towards building efficient NMT systems between Indian languages (specifically Indo-Aryan languages) and English via efficiently exploiting parallel data from the related languages.
no code implementations • WS 2020 • Aamir Farhan, Mashrukh Islam, Dipti Misra Sharma
Thus, Urdu not only has space omission but also space insertion issues which make the word segmentation task challenging.
no code implementations • LREC 2020 • Vikrant Goyal, Pruthwik Mishra, Dipti Misra Sharma
Hindi-English Machine Translation is a challenging problem, owing to multiple factors including the morphological complexity and relatively free word order of Hindi, in addition to the lack of sufficient parallel training data.
no code implementations • WS 2019 • Vikrant Goyal, Dipti Misra Sharma
This paper describes the Neural Machine Translation systems of IIIT-Hyderabad (LTRC-MT) for WAT 2019 Hindi-English shared task.
no code implementations • WS 2019 • Vikrant Goyal, Dipti Misra Sharma
This paper describes the Neural Machine Translation system of IIIT-Hyderabad for the Gujarati→English news translation shared task of WMT19.
no code implementations • 13 Feb 2019 • Riyaz Ahmad Bhat, Irshad Ahmad Bhat, Dipti Misra Sharma
We investigate the problem of parsing conversational data of morphologically-rich languages such as Hindi where argument scrambling occurs frequently.
2 code implementations • 9 Aug 2018 • Ketan Kumar Todi, Pruthwik Mishra, Dipti Misra Sharma
POS Tagging serves as a preliminary task for many NLP applications.
Ranked #1 on POS on Kannada Treebank
no code implementations • 9 Aug 2018 • Pruthwik Mishra, Litton J Kurisinkel, Dipti Misra Sharma
The frame identification is dependent on the verb in a sentence.
2 code implementations • NAACL 2018 • Irshad Ahmad Bhat, Riyaz Ahmad Bhat, Manish Shrivastava, Dipti Misra Sharma
We present a treebank of Hindi-English code-switching tweets under Universal Dependencies scheme and propose a neural stacking model for parsing that efficiently leverages part-of-speech tag and syntactic tree annotations in the code-switching treebank and the preexisting Hindi and English treebanks.
no code implementations • EACL 2017 • Irshad Ahmad Bhat, Riyaz Ahmad Bhat, Manish Shrivastava, Dipti Misra Sharma
In this paper, we propose efficient and less resource-intensive strategies for parsing of code-mixed data.
no code implementations • COLING 2016 • Riyaz A. Bhat, Irshad A. Bhat, Naman Jain, Dipti Misra Sharma
With respect to text processing, addressing the differences between the Hindi and Urdu texts would be beneficial in the following ways: (a) instead of training separate models, their individual resources can be augmented to train single, unified models for better generalization, and (b) their individual text processing applications can be used interchangeably under varied resource conditions.
no code implementations • LREC 2016 • Vandan Mujadia, Palash Gupta, Dipti Misra Sharma
This paper describes a coreference annotation scheme, coreference annotation specific issues and their solutions through our proposed annotation scheme for Hindi.
no code implementations • LREC 2014 • Jayendra Rakesh Yeka, Prasanth Kolachina, Dipti Misra Sharma
We conclude the paper by presenting evaluation scores of different statistical MT systems on the corpora detailed in this paper for English-Hindi and present the proposed plans for future work.
no code implementations • LREC 2014 • Riyaz Ahmad Bhat, Shahid Mushtaq Bhat, Dipti Misra Sharma
As the main contribution of this paper, we present an initial version of the Kashmiri Dependency Treebank.
no code implementations • LREC 2012 • Sudheer Kolachina, Rashmi Prasad, Dipti Misra Sharma, Aravind Joshi
While the proposed modifications were driven by the desire to introduce greater conceptual clarity in the PDTB scheme and to facilitate better annotation quality, our findings indicate that overall, some of the changes render the annotation task much more difficult for the annotators, as also reflected in lower inter-annotator agreement for the relevant sub-tasks.