Search Results for author: Ritesh Kumar

Found 39 papers, 6 papers with code

Multilingual Protest News Detection - Shared Task 1, CASE 2021

no code implementations ACL (CASE) 2021 Ali Hürriyetoğlu, Osman Mutlu, Erdem Yörük, Farhana Ferdousi Liza, Ritesh Kumar, Shyam Ratan

Task 1, which is the focus of this report, is on multilingual protest news detection and comprises four subtasks: document classification (subtask 1), sentence classification (subtask 2), event sentence coreference identification (subtask 3), and event extraction (subtask 4).

Decision Making, Document Classification +2

UniMorph 4.0: Universal Morphology

no code implementations 7 May 2022 Khuyagbaatar Batsuren, Omer Goldman, Salam Khalifa, Nizar Habash, Witold Kieraś, Gábor Bella, Brian Leonard, Garrett Nicolai, Kyle Gorman, Yustinus Ghanggo Ate, Maria Ryskina, Sabrina J. Mielke, Elena Budianskaya, Charbel El-Khaissi, Tiago Pimentel, Michael Gasser, William Lane, Mohit Raj, Matt Coler, Jaime Rafael Montoya Samame, Delio Siticonatzi Camaiteri, Esaú Zumaeta Rojas, Didier López Francis, Arturo Oncevay, Juan López Bautista, Gema Celeste Silva Villegas, Lucas Torroba Hennigen, Adam Ek, David Guriel, Peter Dirix, Jean-Philippe Bernardy, Andrey Scherbakov, Aziyana Bayyr-ool, Antonios Anastasopoulos, Roberto Zariquiey, Karina Sheifer, Sofya Ganieva, Hilaria Cruz, Ritván Karahóǧa, Stella Markantonatou, George Pavlidis, Matvey Plugaryov, Elena Klyachko, Ali Salehi, Candy Angulo, Jatayu Baxi, Andrew Krizhanovsky, Natalia Krizhanovskaya, Elizabeth Salesky, Clara Vania, Sardana Ivanova, Jennifer White, Rowan Hall Maudslay, Josef Valvoda, Ran Zmigrod, Paula Czarnowska, Irene Nikkarinen, Aelita Salchak, Brijesh Bhatt, Christopher Straughn, Zoey Liu, Jonathan North Washington, Yuval Pinter, Duygu Ataman, Marcin Wolinski, Totok Suhardijanto, Anna Yablonskaya, Niklas Stoehr, Hossep Dolatian, Zahroh Nuriah, Shyam Ratan, Francis M. Tyers, Edoardo M. Ponti, Grant Aiton, Aryaman Arora, Richard J. Hatcher, Ritesh Kumar, Jeremiah Young, Daria Rodionova, Anastasia Yemelina, Taras Andrushko, Igor Marchenko, Polina Mashkovtseva, Alexandra Serova, Emily Prud'hommeaux, Maria Nepomniashchaya, Fausto Giunchiglia, Eleanor Chodroff, Mans Hulden, Miikka Silfverberg, Arya D. McCarthy, David Yarowsky, Ryan Cotterell, Reut Tsarfaty, Ekaterina Vylomova

The project comprises two major thrusts: a language-independent feature schema for rich morphological annotation and a type-level resource of annotated data in diverse languages realizing that schema.

Morphological Inflection

Developing Universal Dependency Treebanks for Magahi and Braj

no code implementations 26 Apr 2022 Mohit Raj, Shyam Ratan, Deepak Alok, Ritesh Kumar, Atul Kr. Ojha

In this paper, we discuss the development of treebanks for two low-resourced Indian languages, Magahi and Braj, based on the Universal Dependencies framework.

Aggression in Hindi and English Speech: Acoustic Correlates and Automatic Identification

no code implementations 6 Apr 2022 Ritesh Kumar, Atul Kr. Ojha, Bornini Lahiri, Chingrimnng Lungleng

The study is based on a corpus of slightly over 10 hours of political discourse and includes debates on news channels and political speeches.

Language Resources and Technologies for Non-Scheduled and Endangered Indian Languages

no code implementations 6 Apr 2022 Ritesh Kumar, Bornini Lahiri

In this paper, we give a summary of the resources and technologies for those Indian languages which are not included in the 8th schedule of the Indian Constitution and/or which are endangered.

Demo of the Linguistic Field Data Management and Analysis System -- LiFE

1 code implementation 22 Mar 2022 Siddharth Singh, Ritesh Kumar, Shyam Ratan, Sonal Sinha

The interface allows the creation of multiple projects that can be shared with other users.

Translating Politeness Across Cultures: Case of Hindi and English

no code implementations 3 Dec 2021 Ritesh Kumar, Girish Nath Jha

In this paper, we present a corpus-based study of politeness across two languages, English and Hindi.

Machine Translation, Translation

Creating and Managing a large annotated parallel corpora of Indian languages

no code implementations 3 Dec 2021 Ritesh Kumar, Shiv Bhusan Kaushik, Pinkey Nainwani, Girish Nath Jha

This paper presents the challenges in creating and managing large parallel corpora of 12 major Indian languages (which is soon to be extended to 23 languages) as part of a major consortium project funded by the Department of Information Technology (DIT), Govt.

POS

Challenges in Developing LRs for Non-Scheduled Languages: A Case of Magahi

no code implementations 30 Nov 2021 Ritesh Kumar

Magahi is an Indo-Aryan Language, spoken mainly in the Eastern parts of India.

POS

Towards automatic identification of linguistic politeness in Hindi texts

no code implementations 30 Nov 2021 Ritesh Kumar

In this paper I present a classifier for automatic identification of linguistic politeness in Hindi texts.

The ComMA Dataset V0.2: Annotating Aggression and Bias in Multilingual Social Media Discourse

no code implementations 19 Nov 2021 Ritesh Kumar, Enakshi Nandi, Laishram Niranjana Devi, Shyam Ratan, Siddharth Singh, Akash Bhagat, Yogesh Dawer

In this paper, we discuss the development of a multilingual dataset annotated with a hierarchical, fine-grained tagset marking different types of aggression and the "context" in which they occur.

Aggression Identification

Diagnosing Data from ICTs to Provide Focused Assistance in Agricultural Adoptions

no code implementations 29 Oct 2021 Ashwin Singh, Mallika Subramanian, Anmol Agarwal, Pratyush Priyadarshi, Shrey Gupta, Kiran Garimella, Sanjeev Kumar, Ritesh Kumar, Lokesh Garg, Erica Arya, Ponnurangam Kumaraguru

Our classifier achieves accuracies ranging from 79% to 90% across the five states, demonstrating its potential for assisting future ethnographic investigations.

What a million Indian farmers say?: A crowdsourcing-based method for pest surveillance

no code implementations 7 Aug 2021 Poonam Adhikari, Ritesh Kumar, S. R. S Iyengar, Rishemjit Kaur

Many different technologies are used to detect pests in the crops, such as manual sampling, sensors, and radar.

MSTREAM: Fast Anomaly Detection in Multi-Aspect Streams

1 code implementation 17 Sep 2020 Siddharth Bhatia, Arjit Jain, Pan Li, Ritesh Kumar, Bryan Hooi

Given a stream of entries in a multi-aspect data setting, i.e., entries having multiple dimensions, how can we detect anomalous activities in an unsupervised manner?

Group Anomaly Detection, Intrusion Detection

Evaluating Aggression Identification in Social Media

no code implementations LREC 2020 Ritesh Kumar, Atul Kr. Ojha, Shervin Malmasi, Marcos Zampieri

The task consisted of two sub-tasks - aggression identification (sub-task A) and gendered identification (sub-task B) - in three languages - Bangla, Hindi and English.

Aggression Identification

Developing a Multilingual Annotated Corpus of Misogyny and Aggression

no code implementations LREC 2020 Shiladitya Bhattacharya, Siddharth Singh, Ritesh Kumar, Akanksha Bansal, Akash Bhagat, Yogesh Dawer, Bornini Lahiri, Atul Kr. Ojha

In this paper, we discuss the development of a multilingual annotated corpus of misogyny and aggression in Indian English, Hindi, and Indian Bangla as part of a project on studying and automatically identifying misogyny and communalism on social media (the ComMA Project).

Tale of tails using rule augmented sequence labeling for event extraction

no code implementations 19 Aug 2019 Ayush Maheshwari, Hrishikesh Patel, Nandan Rathod, Ritesh Kumar, Ganesh Ramakrishnan, Pushpak Bhattacharyya

The problem of event extraction is a relatively difficult task for low resource languages due to the non-availability of sufficient annotated data.

Event Extraction

Panlingua-KMI MT System for Similar Language Translation Task at WMT 2019

no code implementations WS 2019 Atul Kr. Ojha, Ritesh Kumar, Akanksha Bansal, Priya Rani

The present paper describes the development of Panlingua-KMI Machine Translation (MT) systems for the Hindi ↔ Nepali language pair, designed as part of the Similar Language Translation Task at the WMT 2019 Shared Task.

Machine Translation, Translation

Alzheimer's Disease Brain MRI Classification: Challenges and Insights

1 code implementation 10 Jun 2019 Yi Ren Fung, Ziqiang Guan, Ritesh Kumar, Joie Yeahuay Wu, Madalina Fiterau

In recent years, many papers have reported state-of-the-art performance on Alzheimer's Disease classification with MRI scans from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset using convolutional neural networks.

Classification, General Classification

bhanodaig at SemEval-2019 Task 6: Categorizing Offensive Language in social media

no code implementations SEMEVAL 2019 Ritesh Kumar, Guggilla Bhanodai, Rajendra Pamula, Maheswara Reddy Chennuru

This paper describes the work that our team bhanodaig did at the Indian Institute of Technology (ISM) towards OffensEval, i.e., identifying and categorizing offensive language in social media.

General Classification

A Comprehensive Study of Alzheimer's Disease Classification Using Convolutional Neural Networks

no code implementations 16 Apr 2019 Ziqiang Guan, Ritesh Kumar, Yi Ren Fung, Yeahuay Wu, Madalina Fiterau

A plethora of deep learning models have been developed for the task of Alzheimer's disease classification from brain MRI scans.

General Classification

SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval)

1 code implementation SEMEVAL 2019 Marcos Zampieri, Shervin Malmasi, Preslav Nakov, Sara Rosenthal, Noura Farra, Ritesh Kumar

We present the results and the main findings of SemEval-2019 Task 6 on Identifying and Categorizing Offensive Language in Social Media (OffensEval).

Language Identification

Benchmarking Aggression Identification in Social Media

no code implementations COLING 2018 Ritesh Kumar, Atul Kr. Ojha, Shervin Malmasi, Marcos Zampieri

For this task, the participants were provided with a dataset of 15,000 aggression-annotated Facebook Posts and Comments each in Hindi (in both Roman and Devanagari script) and English for training and validation.

Aggression Identification

TRAC-1 Shared Task on Aggression Identification: IIT(ISM)@COLING'18

no code implementations COLING 2018 Ritesh Kumar, Guggilla Bhanodai, Rajendra Pamula, Maheshwar Reddy Chennuru

This paper describes the work that our team bhanodaig did at the Indian Institute of Technology (ISM) towards the TRAC-1 Shared Task on Aggression Identification in Social Media for COLING 2018.

Aggression Identification, Transfer Learning +1

Part-of-Speech Annotation of English-Assamese code-mixed texts: Two Approaches

no code implementations COLING 2018 Ritesh Kumar, Manas Jyoti Bora

In this paper, we discuss the development of a part-of-speech tagger for English-Assamese code-mixed texts.

Automatic Identification of Closely-related Indian Languages: Resources and Experiments

no code implementations 26 Mar 2018 Ritesh Kumar, Bornini Lahiri, Deepak Alok, Atul Kr. Ojha, Mayank Jain, Abdul Basit, Yogesh Dawer

In this paper, we discuss an attempt to develop an automatic language identification system for 5 closely-related Indo-Aryan languages of India: Awadhi, Bhojpuri, Braj, Hindi and Magahi.

Language Identification

Aggression-annotated Corpus of Hindi-English Code-mixed Data

no code implementations LREC 2018 Ritesh Kumar, Aishwarya N. Reganti, Akshit Bhatia, Tushar Maheshwari

As the interaction over the web has increased, incidents of aggression and related events like trolling, cyberbullying, flaming, hate speech, etc.

Developing Politeness Annotated Corpus of Hindi Blogs

no code implementations LREC 2014 Ritesh Kumar

In this paper I discuss the creation and annotation of a corpus of Hindi blogs.

Challenges in the development of annotated corpora of computer-mediated communication in Indian Languages: A Case of Hindi

no code implementations LREC 2012 Ritesh Kumar

The present paper describes an ongoing effort to compile and annotate a large corpus of computer-mediated communication (CMC) in Hindi.

POS, Sentiment Analysis
