Search Results for author: Raviraj Joshi

Found 51 papers, 15 papers with code

Domain Adaptation of NMT models for English-Hindi Machine Translation Task : AdapMT Shared Task ICON 2020

no code implementations ICON 2020 Ramchandra Joshi, Rushabh Karnavat, Kaustubh Jirapure, Raviraj Joshi

We train these models primarily using the out of domain data and employ simple domain adaptation techniques based on the characteristics of the in-domain dataset.

Domain Adaptation Machine Translation +2

TextGram: Towards a better domain-adaptive pretraining

no code implementations 28 Apr 2024 Sharayu Hiwarkhedkar, Saloni Mittal, Vidula Magdum, Omkar Dhekane, Raviraj Joshi, Geetanjali Kale, Arnav Ladkat

Thus, it is important that we select the correct data in the form of domain-specific data from this vast corpus to achieve optimum results aligned with our domain-specific tasks.

text-classification Text Classification

MahaSQuAD: Bridging Linguistic Divides in Marathi Question-Answering

1 code implementation 20 Apr 2024 Ruturaj Ghatage, Aditya Kulkarni, Rajlaxmi Patil, Sharvi Endait, Raviraj Joshi

Hence, to address this challenge, we also present a generic approach for translating SQuAD into any low-resource language.

Information Retrieval Question Answering +1

L3Cube-IndicNews: News-based Short Text and Long Document Classification Datasets in Indic Languages

1 code implementation 4 Jan 2024 Aishwarya Mirashi, Srushti Sonavane, Purva Lingayat, Tejas Padhiyar, Raviraj Joshi

This research contributes significantly to expanding the pool of available text classification datasets and also makes it possible to develop topic classification models for Indian regional languages.

Document Classification Multilingual text classification +3

L3Cube-MahaSocialNER: A Social Media based Marathi NER Dataset and BERT models

1 code implementation 30 Dec 2023 Harsh Chaudhari, Anuja Patil, Dhanashree Lavekar, Pranav Khairnar, Raviraj Joshi

This work introduces the L3Cube-MahaSocialNER dataset, the first and largest social media dataset specifically designed for Named Entity Recognition (NER) in the Marathi language.

Marketing named-entity-recognition +2

Code-Mixed Text to Speech Synthesis under Low-Resource Constraints

no code implementations 2 Dec 2023 Raviraj Joshi, Nikesh Garera

We further present an exhaustive evaluation of single-speaker adaptation and multi-speaker training with Tacotron2 + Waveglow setup to show that the former approach works better.

Speech Synthesis Text-To-Speech Synthesis +2

Rapid Speaker Adaptation in Low Resource Text to Speech Systems using Synthetic Data and Transfer learning

no code implementations 2 Dec 2023 Raviraj Joshi, Nikesh Garera

Using transfer learning from high-resource language and synthetic corpus we present a low-cost solution to train a custom TTS model.

Decoder Transfer Learning

mahaNLP: A Marathi Natural Language Processing Library

1 code implementation 5 Nov 2023 Vidula Magdum, Omkar Dhekane, Sharayu Hiwarkhedkar, Saloni Mittal, Raviraj Joshi

We present mahaNLP, an open-source natural language processing (NLP) library specifically built for the Marathi language.

Hate Speech Detection NER +3

Harnessing Pre-Trained Sentence Transformers for Offensive Language Detection in Indian Languages

no code implementations 3 Oct 2023 Ananya Joshi, Raviraj Joshi

In our increasingly interconnected digital world, social media platforms have emerged as powerful channels for the dissemination of hate speech and offensive content.

Hate Speech Detection Sentence +2

Robust Sentiment Analysis for Low Resource languages Using Data Augmentation Approaches: A Case Study in Marathi

no code implementations 1 Oct 2023 Aabha Pingle, Aditya Vyawahare, Isha Joshi, Rahul Tangsali, Geetanjali Kale, Raviraj Joshi

While sentiment analysis research has been extensively conducted in English and other Western languages, there exists a significant gap in research efforts for sentiment analysis in low-resource languages.

Data Augmentation Pseudo Label +3

L3Cube-MahaSent-MD: A Multi-domain Marathi Sentiment Analysis Dataset and Transformer Models

1 code implementation 24 Jun 2023 Aabha Pingle, Aditya Vyawahare, Isha Joshi, Rahul Tangsali, Raviraj Joshi

The exploration of sentiment analysis in low-resource languages, such as Marathi, has been limited due to the limited availability of suitable datasets.

Sentiment Analysis

Enhancing Low Resource NER Using Assisting Language And Transfer Learning

no code implementations 10 Jun 2023 Maithili Sabane, Aparna Ranade, Onkar Litake, Parth Patil, Raviraj Joshi, Dipali Kadam

Named Entity Recognition (NER) is a fundamental task in NLP that is used to locate the key information in text and is primarily applied in conversational and search systems.

named-entity-recognition Named Entity Recognition +4

Leveraging Language Identification to Enhance Code-Mixed Text Classification

no code implementations 8 Jun 2023 Gauri Takawane, Abhishek Phaltankar, Varad Patwardhan, Aryan Patil, Raviraj Joshi, Mukta S. Takalikar

We propose a pipeline to improve code-mixed systems that comprise data preprocessing, word-level language identification, language augmentation, and model training on downstream tasks like sentiment analysis.

Hate Speech Detection Language Identification +4

L3Cube-IndicSBERT: A simple approach for learning cross-lingual sentence representations using multilingual BERT

no code implementations 22 Apr 2023 Samruddhi Deode, Janhavi Gadre, Aditi Kajale, Ananya Joshi, Raviraj Joshi

We propose a simple yet effective approach to convert vanilla multilingual BERT models into multilingual sentence BERT models using synthetic corpus.

Sentence Sentence Similarity +1

A Twitter BERT Approach for Offensive Language Detection in Marathi

no code implementations 20 Dec 2022 Tanmay Chavan, Shantanu Patankar, Aditya Kane, Omkar Gokhale, Raviraj Joshi

MahaTweetBERT, a BERT model pre-trained on Marathi tweets, when fine-tuned on the combined dataset (HASOC 2021 + HASOC 2022 + MahaHate), outperforms all models with an F1 score of 98.43 on the HASOC 2022 test set.

Data Augmentation Language Identification +2

Implementing Deep Learning-Based Approaches for Article Summarization in Indian Languages

no code implementations 12 Dec 2022 Rahul Tangsali, Aabha Pingle, Aditya Vyawahare, Isha Joshi, Raviraj Joshi

Research on text summarization for low-resource Indian languages has been limited due to the limited availability of relevant datasets.

Text Summarization

L3Cube-MahaSBERT and HindSBERT: Sentence BERT Models and Benchmarking BERT Sentence Representations for Hindi and Marathi

1 code implementation 21 Nov 2022 Ananya Joshi, Aditi Kajale, Janhavi Gadre, Samruddhi Deode, Raviraj Joshi

We evaluate these models on real text classification datasets to show embeddings obtained from synthetic data training are generalizable to real datasets as well and thus represent an effective training strategy for low-resource languages.

Benchmarking Machine Translation +7

Spread Love Not Hate: Undermining the Importance of Hateful Pre-training for Hate Speech Detection

1 code implementation 9 Oct 2022 Omkar Gokhale, Aditya Kane, Shantanu Patankar, Tanmay Chavan, Raviraj Joshi

Pre-training large neural language models, such as BERT, has led to impressive gains on many natural language processing (NLP) tasks.

Hate Speech Detection

A Review of Challenges in Machine Learning based Automated Hate Speech Detection

no code implementations 12 Sep 2022 Abhishek Velankar, Hrushikesh Patil, Raviraj Joshi

In this work, we deeply explore a wide range of challenges in automatic hate speech detection by presenting a hierarchical organization of these problems.

Hate Speech Detection

L3Cube-MahaNLP: Marathi Natural Language Processing Datasets, Models, and Library

1 code implementation 29 May 2022 Raviraj Joshi

With L3Cube-MahaNLP, we aim to build resources and a library for Marathi natural language processing.

Hate Speech Detection Language Modelling +4

Mono vs Multilingual BERT for Hate Speech Detection and Text Classification: A Case Study in Marathi

no code implementations 19 Apr 2022 Abhishek Velankar, Hrushikesh Patil, Raviraj Joshi

We focus on the Marathi language and evaluate the models on the datasets for hate speech detection, sentiment analysis and simple text classification in Marathi.

Hate Speech Detection Sentence +5

Hierarchical Neural Network Approaches for Long Document Classification

no code implementations 18 Jan 2022 Snehal Khandve, Vedangi Wagh, Apurva Wani, Isha Joshi, Raviraj Joshi

Along with the hierarchical approaches, this work also provides a comparison of different deep learning algorithms like USE, BERT, HAN, Longformer, and BigBird for long document classification.

Document Classification Sentence +2

On Sensitivity of Deep Learning Based Text Classification Algorithms to Practical Input Perturbations

no code implementations 2 Jan 2022 Aamir Miyajiwala, Arnav Ladkat, Samiksha Jagadale, Raviraj Joshi

In this work, we carry out a data-focused study evaluating the impact of systematic practical perturbations on the performance of the deep learning based text classification models like CNN, LSTM, and BERT-based algorithms.

text-classification Text Classification

Comparative Study of Long Document Classification

no code implementations 1 Nov 2021 Vedangi Wagh, Snehal Khandve, Isha Joshi, Apurva Wani, Geetanjali Kale, Raviraj Joshi

We re-iterate that long document classification is a simpler task and even basic algorithms perform competitively with BERT-based approaches on most of the datasets.

BIG-bench Machine Learning Document Classification +1

SISA: Securing Images by Selective Alteration

no code implementations 20 Jun 2021 Prutha Gaherwar, Shraddha Joshi, Raviraj Joshi, Rahul Khengare

While encryption is the best way to ensure image security, full encryption and decryption is a computationally-intensive process.

Object Recognition

ICodeNet -- A Hierarchical Neural Network Approach for Source Code Author Identification

no code implementations 30 Jan 2021 Pranali Bora, Tulika Awalgaonkar, Himanshu Palve, Raviraj Joshi, Purvi Goel

We have also compared our image-based hierarchical neural network model with simple image-based CNN architecture and text-based CNN and LSTM models to highlight its novelty and efficiency.

Evaluation of Deep Learning Models for Hostility Detection in Hindi Text

no code implementations 11 Jan 2021 Ramchandra Joshi, Rushabh Karnavat, Kaustubh Jirapure, Raviraj Joshi

The pre-trained Hindi fastText word embeddings by IndicNLP and Facebook are used in conjunction with CNN and LSTM models.

Multi-Label Classification Text Detection +1

Domain Adaptation of NMT models for English-Hindi Machine Translation Task at AdapMT ICON 2020

no code implementations 22 Dec 2020 Ramchandra Joshi, Rushabh Karnavat, Kaustubh Jirapure, Raviraj Joshi

The shared task aims to build a translation system for Indian languages in specific domains like Artificial Intelligence (AI) and Chemistry using a small in-domain parallel corpus.

Domain Adaptation Machine Translation +2

Deep Learning for Hindi Text Classification: A Comparison

no code implementations 19 Jan 2020 Ramchandra Joshi, Purvi Goel, Raviraj Joshi

Usage of deep learning in text processing has revolutionized the techniques for text processing and achieved remarkable results.

General Classification Sentence +3
