Multilingual text classification

14 papers with code • 0 benchmarks • 2 datasets

Latest papers with no code

L3Cube-IndicNews: News-based Short Text and Long Document Classification Datasets in Indic Languages

no code yet • 4 Jan 2024

This research contributes significantly to expanding the pool of available text classification datasets and also makes it possible to develop topic classification models for Indian regional languages.

Comparative Analysis of Multilingual Text Classification & Identification through Deep Learning and Embedding Visualization

no code yet • 6 Dec 2023

This research conducts a comparative study on multilingual text classification methods, utilizing deep learning and embedding visualization.

Model and Evaluation: Towards Fairness in Multilingual Text Classification

no code yet • 28 Mar 2023

The multilingual text representation module uses a multilingual pre-trained language model to represent the text, the language fusion module makes the semantic spaces of different languages tend to be consistent through contrastive learning, and the text debiasing module uses contrastive learning to make the model unable to identify sensitive attributes' information.

MiLMo: Minority Multilingual Pre-trained Language Model

no code yet • 4 Dec 2022

To solve the problem of scarcity of datasets on minority languages and verify the effectiveness of the MiLMo model, this paper constructs a minority multilingual text classification dataset named MiTC, and trains a word2vec model for each language.

muBoost: An Effective Method for Solving Indic Multilingual Text Classification Problem

no code yet • 21 Jun 2022

In this paper, we present our solution to the Multilingual Abusive Comment Identification Problem on Moj, an Indian video-sharing social networking service, powered by ShareChat.

Graph Neural Network Enhanced Language Models for Efficient Multilingual Text Classification

no code yet • 6 Mar 2022

To overcome these challenges, we propose a multilingual disaster-related text classification system that can work under {mono, cross and multi} lingual scenarios and under limited supervision.

Multilingual Text Classification for Dravidian Languages

no code yet • 3 Dec 2021

On the other hand, in view of the problem that the model cannot well recognize and utilize the correlation among languages, we further proposed a language-specific representation module to enrich semantic information for the model.

A Primer on Pretrained Multilingual Language Models

no code yet • 1 Jul 2021

Multilingual Language Models (MLLMs) such as mBERT, XLM, XLM-R, etc.

Multilingual Epidemiological Text Classification: A Comparative Study

no code yet • COLING 2020

We conduct a comparative study of different machine and deep learning text classification models using a dataset comprising news articles related to epidemic outbreaks from six languages, four low-resourced and two high-resourced, in order to analyze the influence of the nature of the language, the structure of the document, and the size of the data.

Evaluating Transformer-Based Multilingual Text Classification

no code yet • 29 Apr 2020

As NLP tools become ubiquitous in today's technological landscape, they are increasingly applied to languages with a variety of typological structures.