Using machine learning and information visualisation for discovering latent topics in Twitter news

21 Oct 2019 · Vladimir Vargas-Calderón, Marlon Steibeck Dominguez, N. Parra-A., Herbert Vinck-Posada, Jorge E. Camargo

We propose a method to discover latent topics in large collections of tweets and to visualise them so that the topics are easy to identify and interpret. We exemplify its use with tweets from a Colombian mass media giant in the period 2014–2019. The latent topic analysis is performed in two ways: by training a Latent Dirichlet Allocation model, and by combining the FastText unsupervised model, which represents tweets as vectors, with K-means clustering, which groups the tweets into topics. Using a classification task, we found that people respond differently to different news topics. The classification task consists of the following: given a reply to a news tweet, we train a supervised algorithm to predict the topic of the news tweet solely from the reply. Furthermore, we show how the Colombian peace treaty has had a profound impact on Colombian society, as it is the topic on which most people engage to express their opinions.
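
The second route described above (tweets as vectors, then K-means) can be illustrated with a minimal sketch. It is not the authors' implementation: a normalised bag-of-words vector stands in for the FastText representation, the K-means loop is a plain standard-library version, and the six toy tweets, the function names `embed` and `kmeans`, and the choice of two clusters are all assumptions made for illustration.

```python
import math
import random
from collections import Counter

def embed(tweet, vocab):
    """Bag-of-words stand-in for a FastText tweet vector (assumption), L2-normalised."""
    counts = Counter(tweet.lower().split())
    vec = [float(counts[word]) for word in vocab]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iters=20, seed=0):
    """Plain K-means: returns one cluster label per vector, plus the centroids."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: each vector joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical toy tweets (invented for this sketch, not from the dataset).
tweets = [
    "peace treaty vote",
    "peace treaty signed",
    "treaty peace talks",
    "football match goal",
    "football goal final",
    "match final football",
]
vocab = sorted({word for tweet in tweets for word in tweet.split()})
vectors = [embed(tweet, vocab) for tweet in tweets]
labels, centroids = kmeans(vectors, k=2)
```

Each entry of `labels` is the cluster index of the corresponding tweet, which plays the role of a discovered topic. With real data, the bag-of-words step would be replaced by vectors from a trained FastText model, and the number of clusters would have to be chosen for the collection at hand.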
