Topic Classification

73 papers with code • 2 benchmarks • 10 datasets

Most implemented papers

Active learning in annotating micro-blogs dealing with e-reputation

ungeimer/FLAT-TextTagger 16 Jun 2017

This paper intends to develop a so-called active learning process for automatically annotating French-language tweets that deal with the image (i.e., representation, web reputation) of politicians.

KLUE: Korean Language Understanding Evaluation

KLUE-benchmark/KLUE 20 May 2021

We introduce the Korean Language Understanding Evaluation (KLUE) benchmark.

Hierarchical Transformers for Long Document Classification

helmy-elrais/RoBERT_Recurrence_over_BERT 23 Oct 2019

BERT, which stands for Bidirectional Encoder Representations from Transformers, is a recently introduced language representation model based upon the transfer learning paradigm.

Entailment as Few-Shot Learner

PaddlePaddle/PaddleNLP 29 Apr 2021

Large pre-trained language models (LMs) have demonstrated remarkable ability as few-shot learners.

Cross-Lingual Adaptation using Structural Correspondence Learning

pprett/bolt 4 Aug 2010

From these correspondences a cross-lingual representation is created that enables the transfer of classification knowledge from the source to the target language.

Controlling the Interaction Between Generation and Inference in Semi-Supervised Variational Autoencoders Using Importance Weighting

ghazi-f/SSPIWO 13 Oct 2020

Even though Variational Autoencoders (VAEs) are widely used for semi-supervised learning, the reason why they work remains unclear.

Leveraging QA Datasets to Improve Generative Data Augmentation

dheeraj7596/conda 25 May 2022

The ability of generative language models (GLMs) to generate text has improved considerably in the last few years, enabling their use for generative data augmentation.

SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects

dadelani/sib-200 14 Sep 2023

Despite the progress we have recorded in the last few years in multilingual natural language processing, evaluation is typically limited to a small set of languages with available datasets, which excludes a large number of low-resource languages.

LexC-Gen: Generating Data for Extremely Low-Resource Languages with Large Language Models and Bilingual Lexicons

BatsResearch/LexC-Gen 21 Feb 2024

Data scarcity in low-resource languages can be addressed with word-to-word translations from labeled task data in high-resource languages using bilingual lexicons.
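
As a rough illustration of the word-to-word translation idea mentioned above, the sketch below maps each token of a labeled example through a bilingual lexicon and carries the label over unchanged. It is a minimal sketch, not the LexC-Gen method itself: the function names, the toy lexicon, and the placeholder target tokens are all hypothetical, and whitespace tokenization with lowercased lookup is an assumption made for brevity.

```python
from typing import Dict, List, Tuple


def translate_word_by_word(text: str, lexicon: Dict[str, str]) -> str:
    """Replace each whitespace-separated token with its lexicon entry.

    Tokens without an entry are kept as they are (an assumed fallback).
    """
    return " ".join(lexicon.get(token.lower(), token) for token in text.split())


def translate_dataset(
    examples: List[Tuple[str, str]], lexicon: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Translate the text of each (text, label) pair; the label is unchanged."""
    return [(translate_word_by_word(text, lexicon), label) for text, label in examples]


# Hypothetical toy lexicon: source words mapped to placeholder target tokens.
toy_lexicon = {"the": "tgt_the", "team": "tgt_team", "won": "tgt_won"}

# Hypothetical labeled example in the high-resource language.
labeled = [("The team won the match", "sports")]

translated = translate_dataset(labeled, toy_lexicon)
print(translated)
```

Here "match" has no lexicon entry, so it passes through untranslated; a real pipeline would need its own policy for such out-of-lexicon words.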

Topic-based Evaluation for Conversational Bots

knights207210/Deep-Learning-for-VUI 11 Jan 2018

Dialog evaluation is a challenging problem, especially for non-task-oriented dialogs where conversational success is not well-defined.