sCAKE: Semantic Connectivity Aware Keyword Extraction
Keyword Extraction is an important task in several text analysis endeavors. In this paper, we present a critical discussion of the issues and challenges ingraph-based keyword extraction methods, along with comprehensive empirical analysis. We propose a parameterless method for constructing graph of text that captures the contextual relation between words. A novel word scoring method is also proposed based on the connection between concepts. We demonstrate that both proposals are individually superior to those followed by the state-of-the-art graph-based keyword extraction algorithms. Combination of the proposed graph construction and scoring methods leads to a novel, parameterless keyword extraction method (sCAKE) based on semantic connectivity of words in the document. Motivated by limited availability of NLP tools for several languages, we also design and present a language-agnostic keyword extraction (LAKE) method. We eliminate the need of NLP tools by using a statistical filter to identify candidate keywords before constructing the graph. We show that the resulting method is a competent solution for extracting keywords from documents oflanguages lacking sophisticated NLP support.
PDF Abstract