no code implementations • TERM (LREC) 2022 • Shubhanker Banerjee, Bharathi Raja Chakravarthi, John Philip McCrae
Automatic Term Extraction (ATE) is one of the core problems in natural language processing and forms a key component of text mining pipelines of domain specific corpora.
no code implementations • LTEDI (ACL) 2022 • Bharathi B, Bharathi Raja Chakravarthi, Subalalitha Cn, Sripriya N, Arunaggiri Pandian, Swetha Valli
This paper presents an overview of the shared task on automatic speech recognition in the Tamil language.
no code implementations • LTEDI (ACL) 2022 • Pradeep Roy, Snehaan Bhawal, Abhinav Kumar, Bharathi Raja Chakravarthi
This paper addresses the issue of Hope Speech detection using machine learning techniques.
1 code implementation • ICON 2021 • Sean Benhur, Roshan Nayak, Kanchana Sivanraju, Adeep Hande, Cn Subalalitha, Ruba Priyadharshini, Bharathi Raja Chakravarthi
Due to the exponentially increasing reach of social media, it is essential to focus on its negative aspects as it can potentially divide society and incite people into violence.
no code implementations • LTEDI (ACL) 2022 • Adeep Hande, Siddhanth U Hegde, Sangeetha S, Ruba Priyadharshini, Bharathi Raja Chakravarthi
In recent years, various methods have been developed to control the spread of negativity by removing profane, aggressive, and offensive comments from social media platforms.
no code implementations • EACL (LTEDI) 2021 • Senthil Kumar B, Aravindan Chandrabose, Bharathi Raja Chakravarthi
Data in general encodes human biases by default; being aware of this is a good start, and the research around how to handle it is ongoing.
no code implementations • EACL (LTEDI) 2021 • Bharathi Raja Chakravarthi, Vigneshwaran Muralidaran
This paper reports on the shared task of hope speech detection for Tamil, English, and Malayalam languages.
no code implementations • EACL (DravidianLangTech) 2021 • Bharathi Raja Chakravarthi, Ruba Priyadharshini, Shubhanker Banerjee, Richard Saldanha, John P. McCrae, Anand Kumar M, Parameswari Krishnamurthy, Melvin Johnson
This paper describes the datasets used, the methodology used for the evaluation of participants, and the experiments’ overall results.
no code implementations • EACL (DravidianLangTech) 2021 • Shardul Suryawanshi, Bharathi Raja Chakravarthi
On the other hand, this freedom of expression or free speech can be abused by its user or a troll to demean an individual or a group.
no code implementations • EACL (DravidianLangTech) 2021 • Bharathi Raja Chakravarthi, Ruba Priyadharshini, Navya Jose, Anand Kumar M, Thomas Mandl, Prasanna Kumar Kumaresan, Rahul Ponnusamy, Hariharan R L, John P. McCrae, Elizabeth Sherly
Detecting offensive language in social media in local languages is critical for moderating user-generated content.
1 code implementation • EACL (DravidianLangTech) 2021 • Konthala Yasaswini, Karthik Puranik, Adeep Hande, Ruba Priyadharshini, Sajeetha Thavareesan, Bharathi Raja Chakravarthi
This paper demonstrates our work for the shared task on Offensive Language Identification in Dravidian Languages-EACL 2021.
no code implementations • NAACL (SMM4H) 2021 • Atul Kr. Ojha, Priya Rani, Koustava Goswami, Bharathi Raja Chakravarthi, John P. McCrae
Social media platforms such as Twitter and Facebook have been utilised for various research studies, from the cohort-level discussion to community-driven approaches to address the challenges in utilizing social media data for health, clinical and biomedical information.
no code implementations • COLING (PEOPLES) 2020 • Adeep Hande, Ruba Priyadharshini, Bharathi Raja Chakravarthi
We introduce Kannada CodeMixed Dataset (KanCMD), a multi-task learning dataset for sentiment analysis and offensive language identification.
no code implementations • EACL (VarDial) 2021 • Bharathi Raja Chakravarthi, Gaman Mihaela, Radu Tudor Ionescu, Heidi Jauhiainen, Tommi Jauhiainen, Krister Lindén, Nikola Ljubešić, Niko Partanen, Ruba Priyadharshini, Christoph Purschke, Eswari Rajagopal, Yves Scherrer, Marcos Zampieri
This paper describes the results of the shared tasks organized as part of the VarDial Evaluation Campaign 2021.
1 code implementation • ACL (CASE) 2021 • Pawan Kalyan, Duddukunta Reddy, Adeep Hande, Ruba Priyadharshini, Ratnasingam Sakuntharaj, Bharathi Raja Chakravarthi
In a world abounding in constant protests resulting from events like a global pandemic, climate change, religious or political conflicts, there has always been a need to detect events/protests before getting amplified by news media or social media.
no code implementations • LTEDI (ACL) 2022 • Bharathi Raja Chakravarthi, Vigneshwaran Muralidaran, Ruba Priyadharshini, Subalalitha Cn, John McCrae, Miguel Ángel García, Salud María Jiménez-Zafra, Rafael Valencia-García, Prasanna Kumaresan, Rahul Ponnusamy, Daniel García-Baena, José García-Díaz
Hope Speech detection is the task of classifying a sentence as hope speech or non-hope speech given a corpus of sentences.
no code implementations • SIGUL (LREC) 2022 • Asha Hegde, Mudoor Devadas Anusha, Sharal Coelho, Hosahalli Lakshmaiah Shashirekha, Bharathi Raja Chakravarthi
The lack of annotated code-mixed data for SA in a low-resource language like Tulu makes the SA a challenging task.
no code implementations • DravidianLangTech (ACL) 2022 • Bharathi Raja Chakravarthi, Ruba Priyadharshini, Subalalitha Cn, Sangeetha S, Malliga Subramanian, Kogilavani Shanmugavadivel, Parameswari Krishnamurthy, Adeep Hande, Siddhanth U Hegde, Roshan Nayak, Swetha Valli
It is one of the first shared tasks that focuses on Multi-task Learning for closely related tasks, especially for a very low-resourced language family such as the Dravidian language family.
no code implementations • DravidianLangTech (ACL) 2022 • Ruba Priyadharshini, Bharathi Raja Chakravarthi, Subalalitha Cn, Thenmozhi Durairaj, Malliga Subramanian, Kogilavani Shanmugavadivel, Siddhanth U Hegde, Prasanna Kumaresan
Social media is one of the significant digital platforms that create a huge impact on people at all levels.
no code implementations • SemEval (NAACL) 2022 • Shankar Mahadevan, Sean Benhur, Roshan Nayak, Malliga Subramanian, Kogilavani Shanmugavadivel, Kanchana Sivanraju, Bharathi Raja Chakravarthi
Social media is an idea created to make the world smaller and more connected.
no code implementations • LREC 2022 • Shankar Mahadevan, Rahul Ponnusamy, Prasanna Kumar Kumaresan, Prabakaran Chandran, Ruba Priyadharshini, Sangeetha S, Bharathi Raja Chakravarthi
Thirumurai, also known as Panniru Thirumurai, is a collection of Tamil Shaivite poems dating back to the Hindu revival period between the 6th and the 10th century.
no code implementations • GWC 2018 • Bharathi Raja Chakravarthi, Mihael Arcan, John P. McCrae
In addition to that, we carried out a manual evaluation of the translations for the Tamil language, where we demonstrate that our approach can aid in improving wordnet resources for under-resourced Dravidian languages.
no code implementations • VarDial (COLING) 2020 • Bharathi Raja Chakravarthi, Navaneethan Rajasekaran, Mihael Arcan, Kevin McGuinness, Noel E. O’Connor, John P. McCrae
Bilingual lexicons are a vital tool for under-resourced languages and recent state-of-the-art approaches to this leverage pretrained monolingual word embeddings using supervised or semi-supervised approaches.
no code implementations • WMT (EMNLP) 2020 • Atul Kr. Ojha, Priya Rani, Akanksha Bansal, Bharathi Raja Chakravarthi, Ritesh Kumar, John P. McCrae
NUIG-Panlingua-KMI submission to WMT 2020 seeks to push the state-of-the-art in Similar Language Translation Task for Hindi↔Marathi language pair.
no code implementations • LTEDI (ACL) 2022 • Kayalvizhi S, Thenmozhi Durairaj, Bharathi Raja Chakravarthi, Jerin Mahibha C
Social media is considered a platform where users express themselves.
no code implementations • LTEDI (ACL) 2022 • Bharathi Raja Chakravarthi, Ruba Priyadharshini, Thenmozhi Durairaj, John McCrae, Paul Buitelaar, Prasanna Kumaresan, Rahul Ponnusamy
This shared task focused on three sub-tasks for Tamil, English, and Tamil-English (code-mixed) languages.
no code implementations • BioNLP (ACL) 2022 • Usman Naseem, Ajay Bandi, Shaina Raza, Junaid Rashid, Bharathi Raja Chakravarthi
In this study, we propose a new method that addresses the challenges of medical dialogue generation by incorporating medical knowledge into transformer-based language models.
no code implementations • DravidianLangTech (ACL) 2022 • Vasanth Palanikumar, Sean Benhur, Adeep Hande, Bharathi Raja Chakravarthi
With the rise of social media and the internet, there is a necessity to provide an inclusive space and prevent abusive topics against any gender, race or community.
no code implementations • DravidianLangTech (ACL) 2022 • Premjith B, Bharathi Raja Chakravarthi, Malliga Subramanian, Bharathi B, Soman Kp, Dhanalakshmi V, Sreelakshmi K, Arunaggiri Pandian, Prasanna Kumaresan
The only team that participated in the Multimodal Sentiment Analysis shared task obtained an F1-score of 0.24.
no code implementations • DravidianLangTech (ACL) 2022 • Manikandan Ravikiran, Bharathi Raja Chakravarthi, Anand Kumar Madasamy, Sangeetha S, Ratnavel Rajalakshmi, Sajeetha Thavareesan, Rahul Ponnusamy, Shankar Mahadevan
Offensive content moderation is vital in social media platforms to support healthy online discussions.
no code implementations • DravidianLangTech (ACL) 2022 • Anand Kumar Madasamy, Asha Hegde, Shubhanker Banerjee, Bharathi Raja Chakravarthi, Ruba Priyadharshini, Hosahalli Shashirekha, John McCrae
This paper presents an outline of the shared task on translation of under-resourced Dravidian languages at DravidianLangTech-2022 workshop to be held jointly with ACL 2022.
no code implementations • DravidianLangTech (ACL) 2022 • Anbukkarasi Sampath, Thenmozhi Durairaj, Bharathi Raja Chakravarthi, Ruba Priyadharshini, Subalalitha Cn, Kogilavani Shanmugavadivel, Sajeetha Thavareesan, Sathiyaraj Thangasamy, Parameswari Krishnamurthy, Adeep Hande, Sean Benhur, Kishore Ponnusamy, Santhiya Pandiyan
This paper presents the dataset used in the shared task, task description, and the methodology used by the participants and the evaluation results of the submission.
no code implementations • 12 May 2022 • Manikandan Ravikiran, Bharathi Raja Chakravarthi, Anand Kumar Madasamy, Sangeetha Sivanesan, Ratnavel Rajalakshmi, Sajeetha Thavareesan, Rahul Ponnusamy, Shankar Mahadevan
Offensive content moderation is vital in social media platforms to support healthy online discussions.
1 code implementation • DravidianLangTech (ACL) 2022 • Manikandan Ravikiran, Bharathi Raja Chakravarthi
This paper investigates the effectiveness of sentence-level transformers for zero-shot offensive span identification on a code-mixed Tamil dataset.
1 code implementation • 19 Apr 2022 • Md. Rezaul Karim, Sumon Kanti Dey, Tanhim Islam, Md. Shajalal, Bharathi Raja Chakravarthi
This paper is about hate speech detection from multimodal Bengali memes and texts.
no code implementations • 12 Apr 2022 • Hariharan RamakrishnaIyer LekshmiAmmal, Manikandan Ravikiran, Gayathri Nisha, Navyasree Balamuralidhar, Adithya Madhusoodanan, Anand Kumar Madasamy, Bharathi Raja Chakravarthi
Our work revisits this issue in hope-speech detection by introducing focal loss, data augmentation, and pre-processing strategies.
no code implementations • 9 Feb 2022 • Charangan Vasantharajan, Sean Benhur, Prasanna Kumar Kumarasen, Rahul Ponnusamy, Sathiyaraj Thangasamy, Ruba Priyadharshini, Thenmozhi Durairaj, Kanchana Sivanraju, Anbukkarasi Sampath, Bharathi Raja Chakravarthi, John Phillip McCrae
Our MURIL-base model has achieved a 0.60 macro average F1-score across our 3-class group dataset.
1 code implementation • 31 Dec 2021 • Sean Benhur, Roshan Nayak, Kanchana Sivanraju, Adeep Hande, Subalalitha Chinnaudayar Navaneethakrishnan, Ruba Priyadharshini, Bharathi Raja Chakravarthi
Due to the exponentially increasing reach of social media, it is essential to focus on its negative aspects as it can potentially divide society and incite people into violence.
no code implementations • 18 Nov 2021 • Bharathi Raja Chakravarthi, Ruba Priyadharshini, Sajeetha Thavareesan, Dhivya Chinnappa, Durairaj Thenmozhi, Elizabeth Sherly, John P. McCrae, Adeep Hande, Rahul Ponnusamy, Shubhanker Banerjee, Charangan Vasantharajan
We received 22 systems for Tamil-English, 15 systems for Malayalam-English, and 15 for Kannada-English.
no code implementations • 5 Nov 2021 • Bharathi Raja Chakravarthi, Dhivya Chinnappa, Ruba Priyadharshini, Anand Kumar Madasamy, Sangeetha Sivanesan, Subalalitha Chinnaudayar Navaneethakrishnan, Sajeetha Thavareesan, Dhanalakshmi Vadivel, Rahul Ponnusamy, Prasanna Kumar Kumaresan
With the fast growth of mobile computing and Web technologies, offensive language has become more prevalent on social networking platforms.
no code implementations • 8 Sep 2021 • Shardul Suryawanshi, Bharathi Raja Chakravarthi, Mihael Arcan, Suzanne Little, Paul Buitelaar
To enable this analysis, we enhanced an existing dataset by annotating the data with our defined classes, resulting in a dataset of 8,881 IWT or multimodal memes in the English language (TrollsWithOpinion dataset).
no code implementations • 1 Sep 2021 • Bharathi Raja Chakravarthi, Ruba Priyadharshini, Rahul Ponnusamy, Prasanna Kumar Kumaresan, Kayalvizhi Sampath, Durairaj Thenmozhi, Sathiyaraj Thangasamy, Rajendran Nallathambi, John Phillip McCrae
We provide a new hierarchical taxonomy for online homophobia and transphobia, as well as an expert-labelled dataset that will allow homophobic/transphobic content to be automatically identified.
1 code implementation • 27 Aug 2021 • Adeep Hande, Karthik Puranik, Konthala Yasaswini, Ruba Priyadharshini, Sajeetha Thavareesan, Anbukkarasi Sampath, Kogilavani Shanmugavadivel, Durairaj Thenmozhi, Bharathi Raja Chakravarthi
We fine-tune several recent pretrained language models on the newly constructed dataset.
1 code implementation • MTSummit 2021 • Karthik Puranik, Adeep Hande, Ruba Priyadharshini, Thenmozhi Durairaj, Anbukkarasi Sampath, Kingston Pal Thamburaj, Bharathi Raja Chakravarthi
This paper reports the Machine Translation (MT) systems submitted by the IIITT team for the English->Marathi and English->Irish language pairs of the LoResMT 2021 shared task.
2 code implementations • 10 Aug 2021 • Adeep Hande, Ruba Priyadharshini, Anbukkarasi Sampath, Kingston Pal Thamburaj, Prabakaran Chandran, Bharathi Raja Chakravarthi
Numerous methods have been developed in recent years to monitor the spread of negativity by eliminating vulgar, offensive, and fierce comments from social media platforms.
Ranked #1 on Hope Speech Detection on KanHope
1 code implementation • 9 Aug 2021 • Siddhanth U Hegde, Adeep Hande, Ruba Priyadharshini, Sajeetha Thavareesan, Ratnasingam Sakuntharaj, Sathiyaraj Thangasamy, B Bharathi, Bharathi Raja Chakravarthi
Our work illustrates different textual analysis methods and contrasting multimodal methods, ranging from simple merging to cross attention, to utilise the best of both worlds: visual and textual features.
1 code implementation • 17 Jun 2021 • Bharathi Raja Chakravarthi, Ruba Priyadharshini, Vigneshwaran Muralidaran, Navya Jose, Shardul Suryawanshi, Elizabeth Sherly, John P. McCrae
This paper describes the development of a multilingual, manually annotated dataset for three under-resourced Dravidian languages generated from social media comments.
no code implementations • 9 Jun 2021 • Bharathi Raja Chakravarthi, Jishnu Parameswaran P. K, Premjith B, K. P Soman, Rahul Ponnusamy, Prasanna Kumar Kumaresan, Kingston Pal Thamburaj, John P. McCrae
This is the first multimodal sentiment analysis dataset for Tamil and Malayalam by volunteer annotators.
1 code implementation • 19 Apr 2021 • Nikhil Ghanghor, Rahul Ponnusamy, Prasanna Kumar Kumaresan, Ruba Priyadharshini, Sajeetha Thavareesan, Bharathi Raja Chakravarthi
This paper describes the IIITK team’s submissions to the hope speech detection for equality, diversity and inclusion in Dravidian languages shared task organized by the LT-EDI 2021 workshop@EACL 2021.
1 code implementation • 19 Apr 2021 • Karthik Puranik, Adeep Hande, Ruba Priyadharshini, Sajeetha Thavareesan, Bharathi Raja Chakravarthi
In a world filled with serious challenges like climate change, religious and political conflicts, global pandemics, terrorism, and racial discrimination, an internet full of hate speech and abusive and offensive content is the last thing we desire.
1 code implementation • EACL (DravidianLangTech) 2021 • Siddhanth U Hegde, Adeep Hande, Ruba Priyadharshini, Sajeetha Thavareesan, Bharathi Raja Chakravarthi
We propose an ingenious model comprising a transformer-transformer architecture that tries to attain state-of-the-art by using attention as its main component.
Ranked #2 on Meme Classification on Tamil Memes
1 code implementation • 17 Apr 2021 • Nikhil Ghanghor, Parameswari Krishnamurthy, Sajeetha Thavareesan, Ruba Priyadharshini, Bharathi Raja Chakravarthi
This paper describes the IIITK team’s submissions to the offensive language identification and troll memes classification shared tasks for Dravidian languages at the DravidianLangTech 2021 workshop@EACL 2021.
1 code implementation • 28 Dec 2020 • Md. Rezaul Karim, Sumon Kanti Dey, Tanhim Islam, Sagor Sarker, Mehadi Hasan Menon, Kabir Hossain, Bharathi Raja Chakravarthi, Md. Azam Hossain, Stefan Decker
The exponential growth of social media and micro-blogging sites not only provides platforms for empowering freedom of expression and individual voices, but also enables people to express anti-social behaviour like online harassment, cyberbullying, and hate speech.
no code implementations • COLING 2020 • Koustava Goswami, Rajdeep Sarkar, Bharathi Raja Chakravarthi, Theodorus Fransen, John P. McCrae
Automatic Language Identification (LI) or Dialect Identification (DI) of short texts of closely related languages or dialects is one of the primary steps in many natural language processing pipelines.
no code implementations • 1 Dec 2020 • Bharathi Raja Chakravarthi
To our knowledge, this is the first research of its kind to annotate hope speech for equality, diversity and inclusion in a multilingual setting.
Ranked #2 on Hope Speech Detection for English on HopeEDI
no code implementations • 28 Sep 2020 • Daniel Torregrosa, Nivranshu Pasricha, Maraim Masoud, Bharathi Raja Chakravarthi, Juan Alonso, Noe Casas, Mihael Arcan
Rule-based machine translation is a machine translation paradigm where linguistic knowledge is encoded by an expert in the form of rules that translate text from source to target language.
no code implementations • 4 Aug 2020 • Bharathi Raja Chakravarthi, Priya Rani, Mihael Arcan, John P. McCrae
It introduces under-resourced languages in terms of machine translation and how orthographic information can be utilised to improve machine translation.
no code implementations • SEMEVAL 2020 • Koustava Goswami, Priya Rani, Bharathi Raja Chakravarthi, Theodorus Fransen, John P. McCrae
Code mixing is a common phenomenon in multilingual societies where people switch from one language to another for various reasons.
1 code implementation • LREC 2020 • Bharathi Raja Chakravarthi, Vigneshwaran Muralidaran, Ruba Priyadharshini, John P. McCrae
One such application is to analyse the popular sentiments of videos on social media based on viewer comments.
1 code implementation • LREC 2020 • Bharathi Raja Chakravarthi, Navya Jose, Shardul Suryawanshi, Elizabeth Sherly, John P. McCrae
However, very few resources are available for code-mixed data to create models specific for this data.
no code implementations • LREC 2020 • Priya Rani, Shardul Suryawanshi, Koustava Goswami, Bharathi Raja Chakravarthi, Theodorus Fransen, John Philip McCrae
Hate speech detection in social media communication has become one of the primary concerns to avoid conflicts and curb undesired activities.
1 code implementation • LREC 2020 • Shardul Suryawanshi, Bharathi Raja Chakravarthi, Mihael Arcan, Paul Buitelaar
Since there was no publicly available dataset for multimodal offensive meme content detection, we leveraged the memes related to the 2016 U.S. presidential election and created the MultiOFF multimodal meme dataset for offensive content detection.
Ranked #2 on Meme Classification on MultiOFF (F1 metric)
no code implementations • LREC 2020 • Shardul Suryawanshi, Bharathi Raja Chakravarthi, Pranav Verma, Mihael Arcan, John Philip McCrae, Paul Buitelaar
Social media are interactive platforms that facilitate the creation or sharing of information, ideas or other forms of expression among people.
1 code implementation • 11 Apr 2020 • Md. Rezaul Karim, Bharathi Raja Chakravarthi, John P. McCrae, Michael Cochez
Evaluations against several baseline embedding models, e.g., Word2Vec and GloVe yield up to 92.30%, 82.25%, and 90.45% F1-scores in case of document classification, sentiment analysis, and hate speech detection, respectively during 5-fold cross-validation tests.