no code implementations • 23 May 2023 • Yu Zhang, Hao Cheng, Zhihong Shen, Xiaodong Liu, Ye-Yi Wang, Jianfeng Gao
Scientific literature understanding tasks have gained significant attention due to their potential to accelerate scientific discovery.
1 code implementation • 11 Feb 2022 • Yu Zhang, Zhihong Shen, Chieh-Han Wu, Boya Xie, Junheng Hao, Ye-Yi Wang, Kuansan Wang, Jiawei Han
Large-scale multi-label text classification (LMTC) aims to associate a document with its relevant labels from a large candidate set.
no code implementations • ACL 2021 • Zhihong Shen, Chieh-Han Wu, Li Ma, Chien-Pang Chen, Kuansan Wang
In this paper, we introduce a self-supervised end-to-end system, SciConceptMiner, for the automatic capture of emerging scientific concepts from both independent knowledge sources (semi-structured data) and academic publications (unstructured documents).
no code implementations • 25 Jun 2021 • Yu Wang, Jinchao Li, Tristan Naumann, Chenyan Xiong, Hao Cheng, Robert Tinn, Cliff Wong, Naoto Usuyama, Richard Rogahn, Zhihong Shen, Yang Qin, Eric Horvitz, Paul N. Bennett, Jianfeng Gao, Hoifung Poon
A prominent case in point is the explosion of the biomedical literature on COVID-19, which swelled to hundreds of thousands of papers in a matter of months.
1 code implementation • 15 Feb 2021 • Yu Zhang, Zhihong Shen, Yuxiao Dong, Kuansan Wang, Jiawei Han
Multi-label text classification refers to the problem of assigning each given document its most relevant labels from the label set.
no code implementations • COLING 2020 • Keng-Te Liao, Zhihong Shen, Chiyuan Huang, Chieh-Han Wu, PoChun Chen, Kuansan Wang, Shou-De Lin
Provided with the interpretable concepts and knowledge encoded in a pre-trained neural model, we investigate whether the tagged concepts can be applied to a broader class of applications.
4 code implementations • ACL 2020 • Lucy Lu Wang, Kyle Lo, Yoganand Chandrasekhar, Russell Reas, Jiangjiang Yang, Doug Burdick, Darrin Eide, Kathryn Funk, Yannis Katsis, Rodney Kinney, Yunyao Li, Ziyang Liu, William Merrill, Paul Mooney, Dewey Murdick, Devvret Rishi, Jerry Sheehan, Zhihong Shen, Brandon Stilson, Alex Wade, Kuansan Wang, Nancy Xin Ru Wang, Chris Wilhelm, Boya Xie, Douglas Raymond, Daniel S. Weld, Oren Etzioni, Sebastian Kohlmeier
The COVID-19 Open Research Dataset (CORD-19) is a growing resource of scientific papers on COVID-19 and related historical coronavirus research.
3 code implementations • 26 Jan 2020 • Jiaming Shen, Zhihong Shen, Chenyan Xiong, Chi Wang, Kuansan Wang, Jiawei Han
Taxonomies consist of machine-interpretable semantics and provide valuable knowledge for many web applications.
1 code implementation • 21 May 2019 • Anshul Kanakia, Zhihong Shen, Darrin Eide, Kuansan Wang
We present the design and methodology for the large scale hybrid paper recommender system used by Microsoft Academic.
no code implementations • ACL 2018 • Zhihong Shen, Hao Ma, Kuansan Wang
To enable efficient exploration of Web-scale scientific knowledge, it is necessary to organize scientific publications into a hierarchical concept structure.
no code implementations • 17 Apr 2017 • Yuxiao Dong, Hao Ma, Zhihong Shen, Kuansan Wang
We find that science has benefited from the shift from individual work to collaborative effort, with over 90% of the world-leading innovations generated by collaborations in this century, nearly four times higher than they were in the 1900s.
Digital Libraries • Social and Information Networks • Physics and Society
no code implementations • WWW 2015 • Arnab Sinha, Zhihong Shen, Yang Song, Hao Ma, Darrin Eide, Bo-June (Paul) Hsu, Kuansan Wang
In addition to obtaining these entities from the publisher feeds as in the previous effort, we in this version include data mining results from the Web index and an in-house knowledge base from Bing, a major commercial search engine.