no code implementations • 18 May 2023 • Tong Ye, Lingfei Wu, Tengfei Ma, Xuhong Zhang, Yangkai Du, Peiyu Liu, Wenhai Wang, Shouling Ji
Most previous approaches rely on a sentence-level retrieval-and-combination paradigm on the encoder side: similar code snippets are retrieved, and the corresponding code-summary pairs are combined with the input.
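As a hedged illustration of that paradigm, the sketch below uses a TF-IDF retriever to fetch the most similar code snippet and its summary, then concatenates them with the input on the encoder side. The retriever choice, the `[SEP]` joining format, and every name here are assumptions for illustration, not details from the paper.

```python
# Illustrative sketch of sentence-level retrieval-and-combination for code
# summarization. TF-IDF stands in for the retriever; names are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus_code = ["def add(a, b): return a + b", "def sub(a, b): return a - b"]
corpus_summaries = ["add two numbers", "subtract two numbers"]

vectorizer = TfidfVectorizer()
corpus_vecs = vectorizer.fit_transform(corpus_code)

def retrieve_pair(query_code: str):
    """Return the (code, summary) pair most similar to the query."""
    query_vec = vectorizer.transform([query_code])
    best = cosine_similarity(query_vec, corpus_vecs).argmax()
    return corpus_code[best], corpus_summaries[best]

# Combine the retrieved pair with the input on the encoder side.
query = "def plus(x, y): return x + y"
sim_code, sim_summary = retrieve_pair(query)
encoder_input = f"{query} [SEP] {sim_code} [SEP] {sim_summary}"
print(encoder_input)
```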
1 code implementation • 31 Mar 2023 • Wayne Xin Zhao, Kun Zhou, Junyi Li, Tianyi Tang, Xiaolei Wang, Yupeng Hou, Yingqian Min, Beichen Zhang, Junjie Zhang, Zican Dong, Yifan Du, Chen Yang, Yushuo Chen, Zhipeng Chen, Jinhao Jiang, Ruiyang Ren, YiFan Li, Xinyu Tang, Zikang Liu, Peiyu Liu, Jian-Yun Nie, Ji-Rong Wen
To distinguish language models by parameter scale, the research community has coined the term large language models (LLMs) for PLMs of significant size.
no code implementations • 27 Mar 2023 • Peiyu Liu, Ze-Feng Gao, Yushuo Chen, Wayne Xin Zhao, Ji-Rong Wen
Based on such a decomposition, our architecture shares the central tensor across all layers to reduce the model size, while keeping layer-specific auxiliary tensors (together with adapters) to enhance adaptation flexibility.
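A minimal sketch of that sharing scheme, assuming a simplified three-factor product W_l ≈ A_l · C · B_l in place of the full MPO decomposition; the class and parameter names are hypothetical:

```python
# Sketch: one central factor shared across layers, small layer-specific
# auxiliary factors kept per layer (simplified stand-in for the MPO case).
import torch
import torch.nn as nn

class SharedCoreLinear(nn.Module):
    """Approximates W_l = A_l @ C @ B_l with C shared by every layer."""
    def __init__(self, core: nn.Parameter, d_in: int, d_out: int, rank: int):
        super().__init__()
        self.core = core                                        # shared central tensor
        self.A = nn.Parameter(torch.randn(d_out, rank) * 0.02)  # layer-specific
        self.B = nn.Parameter(torch.randn(rank, d_in) * 0.02)   # layer-specific

    def forward(self, x):
        weight = self.A @ self.core @ self.B   # (d_out, d_in)
        return x @ weight.T

rank, d = 16, 64
shared_core = nn.Parameter(torch.randn(rank, rank) * 0.02)
layers = nn.ModuleList([SharedCoreLinear(shared_core, d, d, rank) for _ in range(12)])
print(layers[0](torch.randn(2, d)).shape)  # torch.Size([2, 64])
```

Only the small per-layer factors (and any adapters) differ across layers, so the parameter count grows slowly with depth.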
1 code implementation • 14 Jan 2023 • Hongpeng Lin, Ludan Ruan, Wenke Xia, Peiyu Liu, Jingyuan Wen, Yixin Xu, Di Hu, Ruihua Song, Wayne Xin Zhao, Qin Jin, Zhiwu Lu
Compared with previous image-based dialogue datasets, the richer sources of context in TikTalk lead to a greater diversity of conversations.
no code implementations • 2 Jun 2022 • Peiyu Liu, Junping Du, Zhe Xue, Ang Li
With the rapid development of information technology, "information overload" has become a defining problem of people's online lives.
no code implementations • 22 Mar 2022 • Sha Yuan, Shuai Zhao, Jiahong Leng, Zhao Xue, Hanyu Zhao, Peiyu Liu, Zheng Gong, Wayne Xin Zhao, Junyi Li, Jie Tang
The results show that WuDaoMM can serve as an efficient dataset for VLPMs, especially for models on the text-to-image generation task.
2 code implementations • COLING 2022 • Ze-Feng Gao, Peiyu Liu, Wayne Xin Zhao, Zhong-Yi Lu, Ji-Rong Wen
Recently, the Mixture-of-Experts (MoE) architecture has achieved remarkable success in increasing the capacity of large-scale language models.
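For concreteness, here is a minimal top-1 MoE layer in PyTorch; the routing scheme and sizes are illustrative assumptions, not the specific MoE variant studied in the paper:

```python
# Minimal top-1 Mixture-of-Experts layer: a router picks one expert FFN
# per token, so capacity grows with expert count at similar per-token cost.
import torch
import torch.nn as nn

class MoELayer(nn.Module):
    def __init__(self, d_model: int, n_experts: int):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.ReLU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                       # x: (tokens, d_model)
        gates = self.router(x).softmax(dim=-1)  # routing probabilities
        top1 = gates.argmax(dim=-1)             # chosen expert per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = top1 == i
            if mask.any():
                # scale by the gate value so routing stays differentiable
                out[mask] = gates[mask, i:i+1] * expert(x[mask])
        return out

x = torch.randn(10, 32)
print(MoELayer(32, 4)(x).shape)  # torch.Size([10, 32])
```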
no code implementations • 29 Sep 2021 • Ze-Feng Gao, Peiyu Liu, Xiao-Hui Zhang, Xin Zhao, Z. Y. Xie, Zhong-Yi Lu, Ji-Rong Wen
Based on the MPS structure, we propose a new dataset compression method that filters out long-range correlation information in task-agnostic scenarios and uses dataset distillation to supplement the information in task-specific scenarios.
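A toy sketch of the underlying intuition, assuming a plain truncated SVD as a one-site stand-in for an MPS: discarding small singular values removes long-range correlation information and yields a compressed representation. Nothing here is the paper's actual pipeline.

```python
# Toy stand-in for MPS-based compression: truncate small singular values
# of a reshaped data matrix, keeping only the dominant correlations.
import numpy as np

rng = np.random.default_rng(0)
data = rng.standard_normal((256, 64))    # stand-in "dataset" matrix

U, s, Vt = np.linalg.svd(data, full_matrices=False)
k = 16                                   # truncation rank ("bond dimension")
compressed = (U[:, :k], s[:k], Vt[:k])   # much smaller than the original

reconstructed = U[:, :k] * s[:k] @ Vt[:k]
err = np.linalg.norm(data - reconstructed) / np.linalg.norm(data)
print(f"relative reconstruction error at rank {k}: {err:.3f}")
```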
1 code implementation • ACL 2021 • Peiyu Liu, Ze-Feng Gao, Wayne Xin Zhao, Z. Y. Xie, Zhong-Yi Lu, Ji-Rong Wen
This paper presents a novel pre-trained language model (PLM) compression approach based on the matrix product operator (MPO) from quantum many-body physics.
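A hedged sketch of the MPO idea: reshape a weight matrix into a higher-order tensor and split it into local tensors with a truncated SVD. The two-tensor split and all shapes below are simplifications for illustration; the paper's MPO uses more local tensors.

```python
# Illustrative MPO-style factorization of a weight matrix into two local
# tensors via reshape + truncated SVD (a simplification of the full MPO).
import numpy as np

def mpo_decompose(W, i1, i2, j1, j2, rank):
    """Factor W of shape (i1*i2, j1*j2) into two local tensors."""
    T = W.reshape(i1, i2, j1, j2).transpose(0, 2, 1, 3)  # group (i1,j1), (i2,j2)
    M = T.reshape(i1 * j1, i2 * j2)
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    left = (U[:, :rank] * s[:rank]).reshape(i1, j1, rank)  # local tensor 1
    right = Vt[:rank].reshape(rank, i2, j2)                # local tensor 2
    return left, right

W = np.random.default_rng(1).standard_normal((32, 32))    # 32 = 4*8 = 8*4
left, right = mpo_decompose(W, 4, 8, 8, 4, rank=8)        # 512 params vs. 1024

# Reconstruct and measure the compression error.
T_hat = np.einsum('pqr,rst->pqst', left, right)           # (i1, j1, i2, j2)
W_hat = T_hat.transpose(0, 2, 1, 3).reshape(32, 32)
print(np.linalg.norm(W - W_hat) / np.linalg.norm(W))
```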
2 code implementations • 11 Mar 2021 • Yuqi Huo, Manli Zhang, Guangzhen Liu, Haoyu Lu, Yizhao Gao, Guoxing Yang, Jingyuan Wen, Heng Zhang, Baogui Xu, Weihao Zheng, Zongzheng Xi, Yueqian Yang, Anwen Hu, Jinming Zhao, Ruichen Li, Yida Zhao, Liang Zhang, Yuqing Song, Xin Hong, Wanqing Cui, Danyang Hou, Yingyan Li, Junyi Li, Peiyu Liu, Zheng Gong, Chuhao Jin, Yuchong Sun, ShiZhe Chen, Zhiwu Lu, Zhicheng Dou, Qin Jin, Yanyan Lan, Wayne Xin Zhao, Ruihua Song, Ji-Rong Wen
We further construct a large Chinese multi-source image-text dataset called RUC-CAS-WenLan for pre-training our BriVL model.
Ranked #1 on Image Retrieval on RUC-CAS-WenLan
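The ranking above is for cross-modal retrieval with a dual-tower model; a toy sketch of that scoring step follows (cosine similarity between text and image embeddings). Shapes and names are illustrative assumptions, not BriVL's actual interface.

```python
# Toy cross-modal image retrieval: rank images by cosine similarity of
# their embeddings to a text query embedding (illustrative shapes/names).
import numpy as np

def retrieve_images(text_emb, image_embs, top_k=5):
    """Return indices of the top_k images for a text query."""
    t = text_emb / np.linalg.norm(text_emb)
    v = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    return np.argsort(-(v @ t))[:top_k]

rng = np.random.default_rng(2)
text_emb = rng.standard_normal(128)            # stand-in text-tower output
image_embs = rng.standard_normal((1000, 128))  # stand-in image-tower outputs
print(retrieve_images(text_emb, image_embs))
```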