This work presents the Sociolinguistic Corpus of WhatsApp Chats in Spanish among College Students, a corpus of raw data for general use.
Indigenous languages of the American continent are highly diverse.
Results showed that our model outperformed the state of the art on well-known Semantic Textual Similarity (STS) benchmarks.
The purpose of this corpus is to automatically assess the similarity between a pair of texts and to evaluate different similarity measures, both for whole documents and for individual sentences.
In this paper we describe a dynamic normalization process applied to multilingual social network documents (Facebook and Twitter) to improve the performance of the author profiling task for short texts.
This paper describes the Axolotl project, which comprises a Spanish-Nahuatl parallel corpus and its search interface.
This article focuses on the description and evaluation of a new unsupervised learning method for clustering Spanish definitions according to their semantics.