PubSE: A Hierarchical Model for Publication Extraction from Academic Homepages

EMNLP 2018  ·  Yiqing Zhang, Jianzhong Qi, Rui Zhang, Chuandong Yin

Publication information in a researcher's academic homepage provides insights about the researcher's expertise, research interests, and collaboration networks. We aim to extract all the publication strings from a given academic homepage. This is a challenging task because the publication strings in different academic homepages may be located at different positions with different structures. To capture the positional and structural diversity, we propose an end-to-end hierarchical model named PubSE based on Bi-LSTM-CRF. We further propose an alternating training method for training the model. Experiments on real data show that PubSE outperforms the state-of-the-art models by up to 11.8% in F1-score.
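To make the CRF part of a Bi-LSTM-CRF tagger more concrete, the following is a minimal sketch of Viterbi decoding over line-level labels. It is not the authors' implementation: the label set (`O`, `PUB`), the emission scores (which a Bi-LSTM would normally produce for each homepage line), and the transition scores are all hypothetical values chosen for illustration.

```python
from typing import Dict, List


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    labels: List[str],
) -> List[str]:
    """Return the highest-scoring label sequence under a linear-chain CRF.

    emissions[t][y]   : score of label y at position t (assumed Bi-LSTM output)
    transitions[p][y] : score of moving from label p to label y
    """
    if not emissions:
        return []

    # best[t][y] is the score of the best path that ends in label y at step t.
    best = [{y: emissions[0][y] for y in labels}]
    back = []  # back[t - 1][y] is the best previous label for y at step t

    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for y in labels:
            prev = max(labels, key=lambda p: best[-1][p] + transitions[p][y])
            scores[y] = best[-1][prev] + transitions[prev][y] + emissions[t][y]
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)

    # Follow the back-pointers from the best final label to the start.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical scores for four homepage lines; line 3 weakly prefers "O".
LABELS = ["O", "PUB"]
EMISSIONS = [
    {"O": 3.0, "PUB": 0.0},
    {"O": 0.0, "PUB": 2.0},
    {"O": 0.6, "PUB": 0.4},
    {"O": 0.0, "PUB": 2.0},
]
# Hypothetical transitions that reward staying in the same label.
TRANSITIONS = {
    "O": {"O": 1.0, "PUB": -1.0},
    "PUB": {"O": -1.0, "PUB": 1.0},
}

decoded = viterbi_decode(EMISSIONS, TRANSITIONS, LABELS)
```

In this toy input, choosing the best label for each line independently would mark the third line as `O`, while joint decoding with the transition scores keeps it inside the surrounding `PUB` run. This is the kind of sequence-level consistency a CRF layer is generally used for.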
