AWARE: Aspect-Based Sentiment Analysis Dataset of Apps Reviews for Requirements Elicitation

The smartphone app market is growing rapidly, which challenges app owners to keep improving their products and to compete in the market. The analysis of user feedback is a key enabler for improvement, as stakeholders can use it to gain a broad understanding of the successes and failures of their own products as well as those of competitors. This leads to evidence-based requirements and enhances requirements elicitation activities. Aspect-Based Sentiment Analysis (ABSA) is a branch of Sentiment Analysis that identifies aspects and assigns a sentiment to each aspect. Having the aspect information gives a more accurate understanding of opinions and addresses the limited usefulness of overall sentiment. However, the ABSA task has not yet been investigated in the context of smartphone app reviews and requirements elicitation. In this paper, we introduce AWARE, a benchmark dataset of 11,323 app reviews annotated with aspect terms, aspect categories, and sentiment. Reviews were collected from three domains: productivity, social networking, and games. We derived the aspect categories for each domain using content analysis and validated them with domain experts in terms of importance, comprehensiveness, overlap, and granularity level. We crowdsourced the annotations of aspect categories and sentiment polarities and performed quality control procedures. The aspect terms were annotated using a partially automated Natural Language Processing (NLP) approach and validated by annotators, which resulted in 98% correct aspect terms. Lastly, we built machine learning baselines for three tasks, namely (i) aspect term extraction using a POS tagger, (ii) aspect category classification, and (iii) aspect sentiment classification, using both Support Vector Machine (SVM) and Multi-layer Perceptron (MLP) classifiers.

Datasets


Introduced in the Paper:

AWARE

Results from the Paper


Task                       Dataset  Model     Metric Name   Metric Value  Global Rank
Term Extraction            AWARE    Baseline  F1-score      0.82          # 1
Aspect Category Polarity   AWARE    Baseline  Accuracy (%)  67            # 1
Aspect Category Detection  AWARE    Baseline  F1-score      0.32          # 1
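
For readers unfamiliar with the F1-score reported for term extraction, the following is a minimal sketch of how such a score can be computed from predicted and gold aspect terms. It assumes exact-match comparison over sets of terms; the matching criterion actually used in the paper is not stated on this page, and the function name `f1_score` and the example terms are hypothetical.

```python
def f1_score(predicted, gold):
    """Return (precision, recall, F1) for exact-match aspect terms.

    Assumption: a predicted term counts as correct only if it is
    identical to a gold term. The paper may use a different criterion.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical aspect terms, not taken from the AWARE dataset.
gold_terms = {"battery", "login", "notifications", "sync"}
predicted_terms = {"battery", "login", "sync", "price"}

precision, recall, f1 = f1_score(predicted_terms, gold_terms)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Here three of the four predicted terms match the gold set, so precision and recall are both 0.75 and the F1-score, their harmonic mean, is also 0.75.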

Methods