News Classification
27 papers with code • 3 benchmarks • 11 datasets
Latest papers with no code
EthioMT: Parallel Corpus for Low-resource Ethiopian Languages
Recent research in natural language processing (NLP) has achieved impressive performance in tasks such as machine translation (MT), news classification, and question-answering in high-resource languages.
Exploring Tokenization Strategies and Vocabulary Sizes for Enhanced Arabic Language Models
This paper presents a comprehensive examination of the impact of tokenization strategies and vocabulary sizes on the performance of Arabic language models in downstream natural language processing tasks.
The effect of stemming and lemmatization on Portuguese fake news text classification
With the popularization of the internet, smartphones, and social media, information spreads quickly and easily, which increases the volume of information traffic worldwide; however, the dissemination of fake news is harming society.
Developing and Evaluating Tiny to Medium-Sized Turkish BERT Models
This study introduces and evaluates tiny, mini, small, and medium-sized uncased Turkish BERT models, aiming to bridge the research gap in less-resourced languages.
Don't Retrain, Just Rewrite: Countering Adversarial Perturbations by Rewriting Text
For example, on sentiment classification using the SST-2 dataset, our method improves the adversarial accuracy over the best existing defense approach by more than 4% with a smaller decrease in task accuracy (0.5% vs 2.5%).
Analyzing the Generalizability of Deep Contextualized Language Representations For Text Classification
This study evaluates the robustness of two state-of-the-art deep contextual language representations, ELMo and DistilBERT, on supervised learning of binary protest news classification and sentiment analysis of product reviews.
Machine and Deep Learning Methods with Manual and Automatic Labelling for News Classification in Bangla Language
This paper introduces several machine and deep learning methods with manual and automatic labelling for news classification in the Bangla language.
Potrika: Raw and Balanced Newspaper Datasets in the Bangla Language with Eight Topics and Five Attributes
Moreover, using NLP augmentation techniques, we create from the raw (unbalanced) dataset another (balanced) dataset comprising 320,000 news articles with 40,000 articles in each of the eight news categories.
BaIT: Barometer for Information Trustworthiness
This paper presents a new approach to the FNC-1 fake news classification task that employs pre-trained encoder models from similar NLP tasks, namely sentence similarity and natural language inference, and proposes two neural network architectures that use this approach.
Lifelong Learning Natural Language Processing Approach for Multilingual Data Classification
The abundance of information in digital media, which in today's world is the main source of knowledge about current events for the masses, makes it possible to spread disinformation on a larger scale than ever before.