Building a fine-grained subjectivity lexicon from a web corpus

LREC 2012  ·  Isa Maks, Piek Vossen ·

In this paper we propose a method to build fine-grained subjectivity lexicons including nouns, verbs and adjectives. The method, which is applied for Dutch, is based on the comparison of word frequencies of three corpora: Wikipedia, News and News comments. Comparison of the corpora is carried out with two measures: log-likelihood ratio and a percentage difference calculation. The first step of the method involves subjectivity identification, i.e. determining if a word is subjective or not. The second step aims at the identification of more fine-grained subjectivity which is the distinction between actor subjectivity and speaker / writer subjectivity. The results suggest that this approach can be usefully applied producing subjectivity lexicons of high quality.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here