1 code implementation • EMNLP 2021 • Anirudh Mittal, Pranav Jeevan P, Prerak Gandhi, Diptesh Kanojia, Pushpak Bhattacharyya
The normalized duration (laughter duration divided by the clip duration) of laughter in each clip is used to compute this humour coefficient score on a five-point scale (0-4).
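The scoring described above can be illustrated with a minimal sketch. The function names and the equal-width binning of the normalized duration into five levels are assumptions made for illustration only; the snippet does not state how the paper maps the ratio onto the 0-4 scale.

```python
def normalized_laughter(laughter_seconds: float, clip_seconds: float) -> float:
    """Return laughter duration divided by clip duration, clamped to [0, 1]."""
    if clip_seconds <= 0:
        raise ValueError("clip_seconds must be positive")
    return min(max(laughter_seconds / clip_seconds, 0.0), 1.0)


def humour_score(laughter_seconds: float, clip_seconds: float) -> int:
    """Map the normalized duration onto an integer score from 0 to 4.

    Assumption: five equal-width bins over [0, 1]. The thresholds actually
    used in the paper are not given in this listing.
    """
    ratio = normalized_laughter(laughter_seconds, clip_seconds)
    # A ratio of exactly 1.0 would give 5, so cap the result at 4.
    return min(int(ratio * 5), 4)
```

Under these assumed bins, a clip with no laughter scores 0 and a clip that is laughter throughout scores 4.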
no code implementations • GWC 2016 • Meghna Singh, Rajita Shukla, Jaya Saraswati, Laxmi Kashyap, Diptesh Kanojia, Pushpak Bhattacharyya
This paper reports the work of creating bilingual mappings in English for certain synsets of Hindi wordnet, the need for doing this, the methods adopted and the tools created for the task.
no code implementations • GWC 2016 • Diptesh Kanojia, Shehzaad Dhuliawala, Pushpak Bhattacharyya
Our contribution is threefold: (1) We develop a system that, given a synset in English, finds an appropriate image for the synset.
no code implementations • GWC 2016 • Diptesh Kanojia, Raj Dabre, Pushpak Bhattacharyya
India is a country with 22 officially recognized languages and 17 of these have WordNets, a crucial resource.
no code implementations • WASSA (ACL) 2022 • Shenbin Qian, Constantin Orasan, Diptesh Kanojia, Hadeel Saadany, Félix do Carmo
This paper summarises the submissions that our team, SURREY-CTS-NLP, made to the WASSA 2022 Shared Task on the prediction of empathy, distress and emotion.
no code implementations • GWC 2018 • Hanumant Redkar, Rajita Shukla, Sandhya Singh, Jaya Saraswati, Laxmi Kashyap, Diptesh Kanojia, Preethi Jyothi, Malhar Kulkarni, Pushpak Bhattacharyya
This aid is based on modern pedagogical axioms and is aligned with the learning objectives of school education syllabi in India.
no code implementations • GWC 2018 • Diptesh Kanojia, Preethi Jyothi, Pushpak Bhattacharyya
We also develop voices using the existing implementations of the aforementioned systems, and (2) we use these voices to generate sample audio for randomly chosen words, manually evaluate the generated audio, and produce audio for all WordNet words using the winning voice model.
no code implementations • GWC 2018 • Ritesh Panjwani, Diptesh Kanojia, Pushpak Bhattacharyya
Indian language WordNets have their individual web-based browsing interfaces along with a common interface for IndoWordNet.
no code implementations • 1 Dec 2023 • Archchana Sindhujan, Diptesh Kanojia, Constantin Orasan, Tharindu Ranasinghe
Quality Estimation (QE) systems are important in situations where it is necessary to assess the quality of translations, but there is no reference available.
1 code implementation • 30 Oct 2023 • Heather Lent, Kushal Tatariya, Raj Dabre, Yiyi Chen, Marcell Fekete, Esther Ploeger, Li Zhou, Hans Erik Heje, Diptesh Kanojia, Paul Belony, Marcel Bollmann, Loïc Grobol, Miryam de Lhoneux, Daniel Hershcovich, Michel DeGraff, Anders Søgaard, Johannes Bjerva
Creoles represent an under-explored and marginalized group of languages, with few available resources for NLP research.
no code implementations • 29 Sep 2023 • Swapnil Bhosale, Abhra Chaudhuri, Alex Lee Robert Williams, Divyank Tiwari, Anjan Dutta, Xiatian Zhu, Pushpak Bhattacharyya, Diptesh Kanojia
The introduction of the MUStARD dataset, and of its emotion recognition extension MUStARD++, has identified sarcasm as a multi-modal phenomenon, expressed not only in natural language text but also through manners of speech (such as tonality and intonation) and visual cues (such as facial expression).
no code implementations • 13 Sep 2023 • Swapnil Bhosale, Haosen Yang, Diptesh Kanojia, Xiatian Zhu
Particularly, in situations where existing supervised AVS methods struggle with overlapping foreground objects, our models still excel in accurately segmenting overlapped auditory objects.
no code implementations • 14 Aug 2023 • Swapnil Bhosale, Sauradip Nag, Diptesh Kanojia, Jiankang Deng, Xiatian Zhu
In this work, we reformulate the SED problem by taking a generative learning perspective.
1 code implementation • 20 Jun 2023 • Shenbin Qian, Constantin Orasan, Felix Do Carmo, Qiuliang Li, Diptesh Kanojia
In this paper, we examine how current Machine Translation (MT) tools perform on emotion-loaded texts by evaluating outputs from Google Translate against an evaluation framework that we propose.
no code implementations • 24 Jan 2023 • Diptesh Kanojia, Aditya Joshi
Sentiment analysis has benefited from the availability of lexicons and benchmark datasets created over decades of research.
1 code implementation • COLING 2022 • Varad Bhatnagar, Diptesh Kanojia, Kameswari Chebrolu
We propose a new workflow for efficiently detecting previously fact-checked claims that uses abstractive summarization to generate crisp queries.
1 code implementation • LREC 2022 • Rudra Murthy, Pallab Bhattacharjee, Rahul Sharnagat, Jyotsana Khatri, Diptesh Kanojia, Pushpak Bhattacharyya
We use different language models to perform the sequence labelling task for NER and show the efficacy of our data by performing a comparative evaluation with models trained on another dataset available for the Hindi NER task.
Ranked #1 on Named Entity Recognition (NER) on HiNER-original
1 code implementation • LREC 2022 • Leonardo Zilio, Hadeel Saadany, Prashant Sharma, Diptesh Kanojia, Constantin Orăsan
This paper presents PLOD, a large-scale dataset for abbreviation detection and extraction that contains 160k+ segments automatically annotated with abbreviations and their long forms.
Ranked #1 on Abbreviation Detection on PLOD-unfiltered
no code implementations • LREC 2018 • Diptesh Kanojia, Kevin Patel, Pushpak Bhattacharyya
Linked wordnets are extensions of wordnets, which link similar concepts in wordnets of different languages.
1 code implementation • 9 Jan 2022 • Prashant Sharma, Hadeel Saadany, Leonardo Zilio, Diptesh Kanojia, Constantin Orăsan
Acronyms are abbreviated units of a phrase constructed by using initial components of the phrase in a text.
no code implementations • GWC 2018 • Kevin Patel, Diptesh Kanojia, Pushpak Bhattacharyya
Thus, techniques that can aid the experts are desirable.
no code implementations • 5 Jan 2022 • Diptesh Kanojia, Malhar Kulkarni, Sayali Ghodekar, Eivind Kahrs, Pushpak Bhattacharyya
We use the text of the Kāśikāvṛtti (KV) as a sample text, and with the help of philologists, we digitize the commentaries available to us.
no code implementations • 5 Jan 2022 • Swaraja Salaskar, Diptesh Kanojia, Malhar Kulkarni
Our paper attempts to show the implications of creating our tool for this area.
no code implementations • GWC 2019 • Diptesh Kanojia, Kevin Patel, Pushpak Bhattacharyya, Malhar Kulkarni, Gholamreza Haffari
Automatic Cognate Detection (ACD) is a challenging task which has been utilized to help NLP applications like Machine Translation, Information Retrieval and Computational Phylogenetics.
no code implementations • 27 Dec 2021 • Kumar Saurav, Kumar Saunack, Diptesh Kanojia, Pushpak Bhattacharyya
In this paper, we use various existing approaches to create multiple word embeddings for 14 Indian languages.
no code implementations • 21 Dec 2021 • Sandeep Mathias, Diptesh Kanojia, Abhijit Mishra, Pushpak Bhattacharyya
Gaze behaviour has been used as a way to gather cognitive information for a number of years.
1 code implementation • LREC 2020 • Diptesh Kanojia, Pushpak Bhattacharyya, Malhar Kulkarni, Gholamreza Haffari
In this paper, we describe the creation of two cognate datasets for twelve Indian languages, namely Sanskrit, Hindi, Assamese, Oriya, Kannada, Gujarati, Tamil, Telugu, Punjabi, Bengali, Marathi, and Malayalam.
1 code implementation • COLING 2020 • Diptesh Kanojia, Raj Dabre, Shubham Dewangan, Pushpak Bhattacharyya, Gholamreza Haffari, Malhar Kulkarni
We then evaluate the impact of our cognate detection mechanism on neural machine translation (NMT) as a downstream task.
Cross-Lingual Information Retrieval, Cross-Lingual Word Embeddings
1 code implementation • EACL 2021 • Diptesh Kanojia, Prashant Sharma, Sayali Ghodekar, Pushpak Bhattacharyya, Gholamreza Haffari, Malhar Kulkarni
We collect gaze behaviour data for a small sample of cognates and show that extracted cognitive features help the task of cognate detection.
1 code implementation • ICON 2021 • Mrinal Rawat, Diptesh Kanojia
The results show that our approach outperforms the state-of-the-art methods in fake news detection to achieve an F1-score of 99.25 over the dataset provided for the CONSTRAINT-2021 Shared Task.
1 code implementation • 25 Oct 2021 • Anirudh Mittal, Pranav Jeevan, Prerak Gandhi, Diptesh Kanojia, Pushpak Bhattacharyya
We devise a novel scoring mechanism to annotate the training data with a humour quotient score using the audience's laughter.
1 code implementation • WMT (EMNLP) 2021 • Diptesh Kanojia, Marina Fomicheva, Tharindu Ranasinghe, Frédéric Blain, Constantin Orăsan, Lucia Specia
However, this ability is yet to be tested in the current evaluation practices, where QE systems are assessed only in terms of their correlation with human judgements.
no code implementations • ICON 2020 • Sandeep Mathias, Rudra Murthy, Diptesh Kanojia, Pushpak Bhattacharyya
Automatic essay grading (AEG) is a process in which machines assign a grade to an essay written in response to a topic, called the prompt.
1 code implementation • Asian Chapter of the Association for Computational Linguistics 2020 • Sandeep Mathias, Rudra Murthy, Diptesh Kanojia, Abhijit Mishra, Pushpak Bhattacharyya
To demonstrate the efficacy of this multi-task learning based approach to automatic essay grading, we collect gaze behaviour for 48 essays across 4 essay sets, and learn gaze behaviour for the rest of the essays, numbering over 7000 essays.
no code implementations • LREC 2020 • Saurav Kumar, Saunack Kumar, Diptesh Kanojia, Pushpak Bhattacharyya
In this paper, we use various existing approaches to create multiple word embeddings for 14 Indian languages.
no code implementations • LREC 2020 • Akash Sheoran, Diptesh Kanojia, Aditya Joshi, Pushpak Bhattacharyya
Cross-domain sentiment analysis (CDSA) helps to address the problem of data scarcity in scenarios where labelled data for a domain (known as the target domain) is unavailable or insufficient.
no code implementations • 9 Apr 2020 • Akash Sheoran, Diptesh Kanojia, Aditya Joshi, Pushpak Bhattacharyya
Cross-domain sentiment analysis (CDSA) helps to address the problem of data scarcity in scenarios where labelled data for a domain (known as the target domain) is unavailable or insufficient.
no code implementations • ACL 2018 • Sandeep Mathias, Diptesh Kanojia, Kevin Patel, Samarth Agarwal, Abhijit Mishra, Pushpak Bhattacharyya
Such subjective aspects are better handled using cognitive information.
no code implementations • 10 Oct 2018 • Jayashree Gajjam, Diptesh Kanojia, Malhar Kulkarni
The notions of a sentence and a word as meaningful linguistic units in the language have been a subject of discussion in many later works.
no code implementations • WS 2017 • Diptesh Kanojia, Nikhil Wani, Pushpak Bhattacharyya
We present a quantitative, data-driven machine learning approach to mitigate the problem of unpredictability of Computer Science Graduate School Admissions.
no code implementations • ACL 2016 • Abhijit Mishra, Diptesh Kanojia, Seema Nagar, Kuntal Dey, Pushpak Bhattacharyya
In this paper, we propose a novel mechanism for enriching the feature vector, for the task of sarcasm detection, with cognitive features extracted from eye-movement patterns of human readers.
no code implementations • CONLL 2016 • Abhijit Mishra, Diptesh Kanojia, Seema Nagar, Kuntal Dey, Pushpak Bhattacharyya
Sentiments expressed in user-generated short text and sentences are nuanced by subtleties at lexical, syntactic, semantic and pragmatic levels.
no code implementations • 14 Oct 2016 • Diptesh Kanojia, Vishwajeet Kumar, Krithi Ramamritham
We present the Civique system for emergency detection in urban areas by monitoring microblogs such as tweets.
no code implementations • LREC 2016 • Diptesh Kanojia, Aditya Joshi, Pushpak Bhattacharyya, Mark James Carman
As demonstrated by the quality of our coarse lexical resource and its benefit to MT, we believe that our sentential approach to creating such a resource will help MT for resource-constrained languages.
no code implementations • LREC 2016 • Shehzaad Dhuliawala, Diptesh Kanojia, Pushpak Bhattacharyya
We present a WordNet-like structured resource for slang words and neologisms on the internet.