Long Concept Query on Conceptual Taxonomies

29 Nov 2015 · Zhang Yi, Xiao Yanghua, Hwang Seung-won, Wang Wei

This paper studies the problem of finding typical entities when a concept is given as a query. For a short concept such as university, this is a well-studied problem: the answer can be retrieved from a knowledge base such as Microsoft's Probase or Google's isA database, which pre-materializes the entities found for the concept by applying Hearst patterns to a web corpus. However, we find that most real-life queries are long concept queries (LCQs), such as top American private university, which cannot and should not be pre-materialized. Our goal is to construct entity retrieval results for LCQs online. We argue that a naive baseline, which rewrites an LCQ into an intersection of an expanded set of its component short concepts, yields highly precise results but extremely low recall. Instead, we propose to augment the concept list by identifying concepts related to the query concept. Because this increase in recall often invites false positives and thus lowers precision, we propose two techniques. First, we identify concepts with different degrees of relatedness to generate linear orderings and pairwise ordering constraints. Second, we rank entities so as to avoid conflicts with these constraints and prune the low-ranked ones, which are likely false positives. With these techniques, our approach significantly outperforms state-of-the-art methods.
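
The precision/recall trade-off described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy `ISA` table, the way the query is split into short concepts, and the function names `intersect_baseline` and `vote_ranking` are hypothetical and not taken from the paper. In particular, `vote_ranking` is a plain vote count used as a stand-in; it is not the authors' constraint-based ranking.

```python
from collections import Counter

# Toy isA knowledge base: short concept -> entities.
# All entries are made up for illustration; a real knowledge base is noisy and incomplete.
ISA = {
    "top university": {"Harvard", "Stanford", "Oxford"},
    "American university": {"Harvard", "UC Berkeley"},  # "Stanford" is missing on purpose
    "private university": {"Harvard", "Stanford"},
}


def intersect_baseline(short_concepts):
    """Naive baseline: keep only entities listed under every short concept."""
    sets = [ISA.get(c, set()) for c in short_concepts]
    return set.intersection(*sets) if sets else set()


def vote_ranking(concepts):
    """Stand-in ranking (assumption): order entities by how many concepts list them."""
    votes = Counter()
    for c in concepts:
        votes.update(ISA.get(c, set()))
    # Sort by descending vote count, then by name for a deterministic order.
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))


query_parts = ["top university", "American university", "private university"]
print(intersect_baseline(query_parts))
print(vote_ranking(query_parts))
```

In this toy setting the intersection returns only Harvard, because one missing isA edge is enough to drop Stanford; that is the low-recall behaviour of the baseline. Relaxing the intersection to a ranking recovers Stanford but also admits Oxford and UC Berkeley, which is the kind of false positive the ordering constraints in the paper are meant to prune.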
