Sentiment Analysis of Tweets in Three Indian Languages

WS 2016 · Shanta Phani, Shibamouli Lahiri, Arindam Biswas

In this paper, we describe the results of sentiment analysis on tweets in three Indian languages: Bengali, Hindi, and Tamil. We used the recently released SAIL dataset (Patra et al., 2015) and obtained state-of-the-art results in all three languages. Our features are simple, robust, scalable, and language-independent. Further, we show that these simple features provide better results than more complex, language-specific features in two separate classification tasks. We also report detailed feature analysis and error analysis, along with learning curves for Hindi and Bengali.
