Semantics and Homothetic Clustering of Hafez Poetry

WS 2019 · Arya Rahgozar, Diana Inkpen

We have created two sets of labels for the poems of Hafez (1315-1390) using unsupervised learning. Our labels are the only semantic clustering alternative to the previously existing, hand-labeled, gold-standard classification of Hafez poems, and are intended for use in literary research. We have cross-referenced, measured and analyzed the agreement of our clustering labels with Houman's chronological classes. Our features are based on topic modeling and word embeddings. We also introduced "similarity of similarities" features, an approach we call homothetic clustering, which proved effective for Hafez's small corpus of ghazals. Although all our experiments produced clusters that differ from Houman's classes, we believe they are valid in their own right: they provide further insights and have proved useful as a contrasting alternative to Houman's classes. Our homothetic clusterer and its feature design and engineering framework can be used for further semantic analysis of Hafez's poetry and for other similar literary research.
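The abstract does not spell out how the "similarity of similarities" features are computed. As a rough illustration only, the sketch below shows one possible reading: each poem is first described by its cosine similarities to every poem in the corpus, and those similarity profiles are then compared with each other. The function names, the choice of cosine similarity, and the toy vectors are all assumptions made for this example and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_matrix(vectors):
    """Pairwise cosine similarities between all rows of `vectors`."""
    return [[cosine(u, v) for v in vectors] for u in vectors]


def second_order_similarity(vectors):
    """Hypothetical 'similarity of similarities'.

    Row i of the first-order matrix is poem i's similarity profile
    against the whole corpus; the result compares those profiles.
    """
    first_order = similarity_matrix(vectors)
    return similarity_matrix(first_order)


# Made-up feature vectors standing in for topic or embedding features.
poems = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0],
]
second = second_order_similarity(poems)
```

Under this reading, the rows of `second` (or of the first-order matrix) could be passed to any standard clustering algorithm in place of the raw features, which may help when the corpus is small and the raw features are sparse.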
