News Classification
27 papers with code • 3 benchmarks • 11 datasets
Most implemented papers
Classifying the Ideological Orientation of User-Submitted Texts in Social Media
With the long-term goal of understanding how language is used and evolves within online communities, this work explores the application of natural language processing techniques to classify text articles according to their ideological orientation (i.e., conservative or liberal).
MasakhaNEWS: News Topic Classification for African languages
Furthermore, we explore several alternatives to full fine-tuning of language models that are better suited for zero-shot and few-shot learning, such as cross-lingual parameter-efficient fine-tuning (like MAD-X), pattern-exploiting training (PET), prompting language models (like ChatGPT), and prompt-free sentence transformer fine-tuning (SetFit and Cohere Embedding API).
A Dataset and Strong Baselines for Classification of Czech News Texts
Pre-trained models for Czech Natural Language Processing are often evaluated on purely linguistic tasks (POS tagging, parsing, NER) and relatively simple classification tasks such as sentiment classification or article classification from a single news source.
Benchmarking Multilabel Topic Classification in the Kyrgyz Language
Kyrgyz is a very underrepresented language in terms of modern natural language processing resources.
InterpretCC: Conditional Computation for Inherently Interpretable Neural Networks
Real-world interpretability for neural networks is a tradeoff between three concerns: 1) it requires humans to trust the explanation approximation (e.g., post-hoc approaches), 2) it compromises the understandability of the explanation (e.g., automatically identified feature masks), and 3) it compromises the model performance (e.g., decision trees).
Improving Black-box Robustness with In-Context Rewriting
Most techniques for improving out-of-distribution (OOD) robustness are not applicable to settings where the model is effectively a black box, such as when the weights are frozen, retraining is costly, or the model is leveraged via an API.