SB10k

Introduced by Cieliebak et al. in A Twitter Corpus and Benchmark Resources for German Sentiment Analysis

SB10k is a corpus of German tweets annotated for sentiment analysis. Key details:

Corpus Size: It contains approximately 10,000 German tweets¹.
Language: German.
Task: Text classification, specifically sentiment analysis.
Multilinguality: Monolingual (German only).
Size Category: Falls within the range of 1K to 10K examples.
Tags: Sentiment analysis.
License: CC-BY-4.0.

The dataset was created by annotating German tweets, with each tweet labeled by three annotators. Researchers have used SB10k to benchmark various machine learning classifiers, including convolutional neural networks (CNNs) and feature-based support vector machines (SVMs) for sentiment analysis²³.
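With three annotators per tweet, the individual judgments have to be combined into a single label before a classifier can be trained. The description above does not say how this was done for SB10k, so the following is only a minimal sketch that assumes a strict majority vote. The function name `majority_label`, the tweet IDs, and the label strings are all hypothetical and chosen for illustration.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_label(labels: Sequence[str]) -> Optional[str]:
    """Return the label chosen by a strict majority of annotators.

    Returns None when no label is chosen by more than half of the
    annotators (for example, three annotators who all disagree).
    """
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else None


# Hypothetical annotations: three labels per tweet, as in the description.
annotations = {
    "tweet_1": ["positive", "positive", "neutral"],
    "tweet_2": ["negative", "neutral", "positive"],
    "tweet_3": ["neutral", "neutral", "neutral"],
}

# Aggregate to one label per tweet; tweets without a majority map to None.
gold = {tweet_id: majority_label(labels) for tweet_id, labels in annotations.items()}
```

Tweets that map to `None` have no majority label. Whether such tweets are dropped, given a separate class, or resolved in some other way is a design choice that this sketch leaves open.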

(1) Alienmaster/SB10k · Datasets at Hugging Face. https://huggingface.co/datasets/Alienmaster/SB10k
(2) A Twitter Corpus and Benchmark Resources for German Sentiment Analysis. https://aclanthology.org/W17-1106/
(3) A Twitter Corpus and Benchmark Resources for German Sentiment Analysis. https://aclanthology.org/W17-1106.pdf
