Paper

Long Concept Query on Conceptual Taxonomies

This paper studies the problem of finding typical entities when a concept is given as a query. For a short concept such as university, this is a well-studied problem of retrieval from knowledge bases such as Microsoft's Probase and Google's isA database, which pre-materialize the entities found for each concept through Hearst patterns in a web corpus. However, we find that most real-life queries are long concept queries (LCQs), such as top American private university, which cannot and should not be pre-materialized. Our goal is online entity retrieval for LCQs. We argue that a naive baseline, which rewrites an LCQ into an intersection of the expanded entity sets of its composing short concepts, yields highly precise results with extremely low recall. Instead, we propose to augment the concept list by identifying concepts related to the query concept. However, because this increase in recall often invites false positives and lowers precision in return, we propose two techniques. First, we identify concepts with different degrees of relatedness to generate linear orderings and pairwise ordering constraints. Second, we rank entities so as to avoid conflicts with these constraints, and prune the low-ranked ones, which are likely false positives. With these novel techniques, our approach significantly outperforms state-of-the-art methods.
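As a rough illustration of the recall trade-off described above, the following minimal Python sketch contrasts the intersection baseline with an augmented concept list. The toy `ISA` store, the function names, and the relatedness weights are all hypothetical. The weighted-sum scoring is a deliberately simplified stand-in, not the constraint-based ranking the paper proposes.

```python
from collections import defaultdict

# Hypothetical toy isA store (short concept -> entities); not real Probase data.
ISA = {
    "top university": {"MIT", "Oxford"},
    "american university": {"Stanford", "MIT", "UC Berkeley"},
    "private university": {"Stanford", "MIT"},
    "ivy league school": {"Harvard", "Yale"},
}

def intersect_baseline(short_concepts):
    """Naive baseline: keep only entities listed under every short concept."""
    sets = [ISA.get(c, set()) for c in short_concepts]
    return set.intersection(*sets) if sets else set()

def augmented_ranking(related):
    """Assumed simplification: score each entity by the summed relatedness
    weight of the concepts that list it, then sort by descending score."""
    scores = defaultdict(float)
    for concept, weight in related.items():
        for entity in ISA.get(concept, set()):
            scores[entity] += weight
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

query_parts = ["top university", "american university", "private university"]
baseline = intersect_baseline(query_parts)

# Add one related concept with a lower, made-up relatedness weight.
related = {c: 1.0 for c in query_parts}
related["ivy league school"] = 0.5
ranked = augmented_ranking(related)
```

In this toy data the baseline returns a single entity, while the augmented list also surfaces entities such as Harvard that never appear under the composing short concepts. Entities supported by only one concept, such as Oxford, land low in the ranking, which is where a pruning step would remove likely false positives.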
