no code implementations • 4 Mar 2024 • Si Sun, Hanqing Zhang, Zhiyuan Liu, Jie Bao, Dawei Song
Dense Retrieval (DR) is now considered a promising tool to enhance the memorization capacity of Large Language Models (LLMs) such as GPT-3 and GPT-4 by incorporating external memories.
1 code implementation • 5 Feb 2024 • Junjie Fang, Likai Tang, Hongzhe Bi, Yujia Qin, Si Sun, Zhenyu Li, Haolun Li, Yongjian Li, Xin Cong, Yankai Lin, Yukun Yan, Xiaodong Shi, Sen Song, Zhiyuan Liu, Maosong Sun
Distinguished by its four core dimensions (Memory Management, Memory Writing, Memory Reading, and Memory Injection), UniMem empowers researchers to conduct systematic exploration of long-context methods.
1 code implementation • 12 Apr 2023 • Si Sun, Yida Lu, Shi Yu, Xiangyang Li, Zhonghua Li, Zhao Cao, Zhiyuan Liu, Deiming Ye, Jie Bao
Moreover, the dataset is split into disjoint base and novel classes, allowing DR models to be continuously trained on ample data from base classes and a few samples from novel classes.
1 code implementation • 31 Oct 2022 • Si Sun, Chenyan Xiong, Yue Yu, Arnold Overwijk, Zhiyuan Liu, Jie Bao
In this paper, we investigate the instability in standard dense retrieval training, which iterates between model training and hard negative selection using the model being trained.
1 code implementation • 27 Oct 2022 • Yue Yu, Chenyan Xiong, Si Sun, Chao Zhang, Arnold Overwijk
We present a new zero-shot dense retrieval (ZeroDR) method, COCO-DR, to improve the generalization ability of dense retrieval by combating the distribution shifts between source training tasks and target scenarios.
Ranked #1 on Zero-shot Text Search on CQADupStack
1 code implementation • ACL 2021 • Si Sun, Yingzhuo Qian, Zhenghao Liu, Chenyan Xiong, Kaitao Zhang, Jie Bao, Zhiyuan Liu, Paul Bennett
To democratize the benefits of Neu-IR, this paper presents MetaAdaptRank, a domain adaptive learning method that generalizes Neu-IR models from label-rich source domains to few-shot target domains.
no code implementations • 17 Dec 2020 • Haile Liu, Yonghui Li, Si Sun, Qi Xin, Shuhu Liu, Xiaoyu Mu, Xun Yuan, Ke Chen, Hao Wang, Kalman Varga, Wenbo Mi, Jiang Yang, Xiao-Dong Zhang
Emerging artificial enzymes with reprogrammed and augmented catalytic activity and substrate selectivity have long been pursued with sustained efforts.
Biological Physics, Medical Physics
3 code implementations • 3 Nov 2020 • Chenyan Xiong, Zhenghao Liu, Si Sun, Zhuyun Dai, Kaitao Zhang, Shi Yu, Zhiyuan Liu, Hoifung Poon, Jianfeng Gao, Paul Bennett
Neural rankers based on deep pretrained language models (LMs) have been shown to improve many information retrieval benchmarks.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Zhenghao Liu, Chenyan Xiong, Zhuyun Dai, Si Sun, Maosong Sun, Zhiyuan Liu
With the COVID-19 epidemic, verifying scientifically false online information, such as fake news and maliciously fabricated statements, has become crucial.
2 code implementations • 28 Apr 2020 • Si Sun, Zhenghao Liu, Chenyan Xiong, Zhiyuan Liu, Jie Bao
Open-domain KeyPhrase Extraction (KPE) aims to extract keyphrases from documents without domain or quality restrictions, e.g., web pages of varying domains and quality.