Toward Tweet Entity Linking with Heterogeneous Information Networks

Twitter, a microblogging platform, has become an increasingly valuable information source, where millions of users post a large number of tweets on various topics every day. Heterogeneous information networks, which consist of multi-type objects and relations, are becoming increasingly prevalent as a way of organizing knowledge and information. Linking an entity mention in a tweet with its corresponding entity in a heterogeneous information network is an important task, because it allows such networks to be enriched with the abundant and fresh knowledge embedded in tweets. However, entity mentions are often ambiguous. Additionally, tweets are short and informal, which makes it difficult to mine enough information from a single tweet for entity linking. In this paper, we propose TELHIN, an unsupervised iterative clustering framework that jointly links multiple similar tweets with a heterogeneous information network. Our framework takes three dimensions of tweet similarity into consideration: (1) content similarity, (2) temporal similarity, and (3) user similarity. The appropriate weights of the different similarity dimensions for each entity mention are learned iteratively with a metric learning algorithm that leverages automatically generated pairwise constraints. Experiments on real data demonstrate the effectiveness of our framework in comparison with the baselines.
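
To make the idea of combining the three similarity dimensions concrete, the following minimal sketch computes a weighted tweet similarity. It is an illustration under stated assumptions, not the TELHIN implementation: the `Tweet` class, the Jaccard token overlap for content, the exponential time decay with a one-hour `scale`, the same-author indicator for user similarity, and the fixed weights are all hypothetical choices. In the framework described above, the weights would instead be learned per entity mention through metric learning.

```python
import math
from dataclasses import dataclass


@dataclass
class Tweet:
    """Hypothetical minimal tweet record used only for this illustration."""
    text: str
    timestamp: float  # seconds since some reference time
    user: str


def content_similarity(a: Tweet, b: Tweet) -> float:
    # Assumption: Jaccard overlap of lowercased whitespace tokens.
    tokens_a = set(a.text.lower().split())
    tokens_b = set(b.text.lower().split())
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def temporal_similarity(a: Tweet, b: Tweet, scale: float = 3600.0) -> float:
    # Assumption: similarity decays exponentially with the time gap.
    return math.exp(-abs(a.timestamp - b.timestamp) / scale)


def user_similarity(a: Tweet, b: Tweet) -> float:
    # Assumption: 1.0 if both tweets share an author, otherwise 0.0.
    return 1.0 if a.user == b.user else 0.0


def tweet_similarity(a: Tweet, b: Tweet, weights: tuple) -> float:
    # Weighted average of the three dimensions; the weights are given
    # here, whereas the paper learns them iteratively.
    w_content, w_time, w_user = weights
    total = w_content + w_time + w_user
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    score = (
        w_content * content_similarity(a, b)
        + w_time * temporal_similarity(a, b)
        + w_user * user_similarity(a, b)
    )
    return score / total
```

A pairwise score of this kind could then feed a clustering step that groups similar tweets before linking them jointly; how TELHIN performs that step is not specified in the abstract.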


