1 code implementation • 27 Oct 2023 • Houwen Peng, Kan Wu, Yixuan Wei, Guoshuai Zhao, Yuxiang Yang, Ze Liu, Yifan Xiong, Ziyue Yang, Bolin Ni, Jingcheng Hu, Ruihang Li, Miaosen Zhang, Chen Li, Jia Ning, Ruizhe Wang, Zheng Zhang, Shuguang Liu, Joe Chau, Han Hu, Peng Cheng
In this paper, we explore FP8 low-bit data formats for efficient training of large language models (LLMs).
1 code implementation • 19 Jun 2023 • Ye Wang, Yaxiong Wang, Guoshuai Zhao, Xueming Qian
Concretely, RESA mimics the real incremental setting and constructs pseudo-incremental tasks both globally and locally: the global pseudo-incremental tasks are designed to coincide with the learning objective of FSCIL, while the local pseudo-incremental tasks are designed to improve the model's plasticity.
no code implementations • 6 Jun 2023 • Xiaoying Xie, Biao Gong, Yiliang Lv, Zhen Han, Guoshuai Zhao, Xueming Qian
Most recent works focus on answering first-order logical queries to explore knowledge graph reasoning via multi-hop logic predictions.
1 code implementation • 6 Jun 2022 • Yujing Wang, Yingyan Hou, Haonan Wang, Ziming Miao, Shibin Wu, Hao Sun, Qi Chen, Yuqing Xia, Chengmin Chi, Guoshuai Zhao, Zheng Liu, Xing Xie, Hao Allen Sun, Weiwei Deng, Qi Zhang, Mao Yang
To this end, we propose Neural Corpus Indexer (NCI), a sequence-to-sequence network that generates relevant document identifiers directly for a designated query.
1 code implementation • 1 Sep 2021 • Hao Tang, Guoshuai Zhao, Yuxia Wu, Xueming Qian
Therefore, we propose a Multi-Sample based Contrastive Loss (MSCL) function, which solves the two problems by balancing the importance of positive and negative samples and data augmentation.
no code implementations • 27 Sep 2018 • Guoshuai Zhao, Jun Li, Lu Wang, Xueming Qian, Yun Fu
In this paper, we propose a Graph-Sequence-to-Sequence (GraphSeq2Seq) model to fuse the dependency graph among words into the traditional Seq2Seq framework.