Search Results for author: Deli Chen

Found 18 papers, 9 papers with code

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

1 code implementation • 7 May 2024 • DeepSeek-AI, Aixin Liu, Bei Feng, Bin Wang, Bingxuan Wang, Bo Liu, Chenggang Zhao, Chengqi Dengr, Chong Ruan, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Hanwei Xu, Hao Yang, Haowei Zhang, Honghui Ding, Huajian Xin, Huazuo Gao, Hui Li, Hui Qu, J. L. Cai, Jian Liang, JianZhong Guo, Jiaqi Ni, Jiashi Li, Jin Chen, Jingyang Yuan, Junjie Qiu, Junxiao Song, Kai Dong, Kaige Gao, Kang Guan, Lean Wang, Lecong Zhang, Lei Xu, Leyi Xia, Liang Zhao, Liyue Zhang, Meng Li, Miaojun Wang, Mingchuan Zhang, Minghua Zhang, Minghui Tang, Mingming Li, Ning Tian, Panpan Huang, Peiyi Wang, Peng Zhang, Qihao Zhu, Qinyu Chen, Qiushi Du, R. J. Chen, R. L. Jin, Ruiqi Ge, Ruizhe Pan, Runxin Xu, Ruyi Chen, S. S. Li, Shanghao Lu, Shangyan Zhou, Shanhuang Chen, Shaoqing Wu, Shengfeng Ye, Shirong Ma, Shiyu Wang, Shuang Zhou, Shuiping Yu, Shunfeng Zhou, Size Zheng, T. Wang, Tian Pei, Tian Yuan, Tianyu Sun, W. L. Xiao, Wangding Zeng, Wei An, Wen Liu, Wenfeng Liang, Wenjun Gao, Wentao Zhang, X. Q. Li, Xiangyue Jin, Xianzu Wang, Xiao Bi, Xiaodong Liu, Xiaohan Wang, Xiaojin Shen, Xiaokang Chen, Xiaosha Chen, Xiaotao Nie, Xiaowen Sun, Xiaoxiang Wang, Xin Liu, Xin Xie, Xingkai Yu, Xinnan Song, Xinyi Zhou, Xinyu Yang, Xuan Lu, Xuecheng Su, Y. Wu, Y. K. Li, Y. X. Wei, Y. X. Zhu, Yanhong Xu, Yanping Huang, Yao Li, Yao Zhao, Yaofeng Sun, Yaohui Li, Yaohui Wang, Yi Zheng, Yichao Zhang, Yiliang Xiong, Yilong Zhao, Ying He, Ying Tang, Yishi Piao, Yixin Dong, Yixuan Tan, Yiyuan Liu, Yongji Wang, Yongqiang Guo, Yuchen Zhu, Yuduan Wang, Yuheng Zou, Yukun Zha, Yunxian Ma, Yuting Yan, Yuxiang You, Yuxuan Liu, Z. Z. Ren, Zehui Ren, Zhangli Sha, Zhe Fu, Zhen Huang, Zhen Zhang, Zhenda Xie, Zhewen Hao, Zhihong Shao, Zhiniu Wen, Zhipeng Xu, Zhongyu Zhang, Zhuoshu Li, Zihan Wang, Zihui Gu, Zilin Li, Ziwei Xie

Multi-head Latent Attention (MLA) guarantees efficient inference by significantly compressing the Key-Value (KV) cache into a latent vector, while DeepSeekMoE enables training strong models at an economical cost through sparse computation.

Language Modelling • Reinforcement Learning (RL)

Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations

1 code implementation • 14 Dec 2023 • Peiyi Wang, Lei Li, Zhihong Shao, R. X. Xu, Damai Dai, Yifei Li, Deli Chen, Y. Wu, Zhifang Sui

In this paper, we present an innovative process-oriented math process reward model called Math-Shepherd, which assigns a reward score to each step of math problem solutions.

Ranked #14 on Arithmetic Reasoning on GSM8K (using extra training data)

Arithmetic Reasoning • GSM8K • +2

Towards Codable Watermarking for Injecting Multi-bits Information to LLMs

1 code implementation • 29 Jul 2023 • Lean Wang, Wenkai Yang, Deli Chen, Hao Zhou, Yankai Lin, Fandong Meng, Jie Zhou, Xu Sun

As large language models (LLMs) generate texts with increasing fluency and realism, there is a growing need to identify the source of texts to prevent the abuse of LLMs.

Language Modelling

Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning

1 code implementation • 23 May 2023 • Lean Wang, Lei Li, Damai Dai, Deli Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun

In-context learning (ICL) emerges as a promising capability of large language models (LLMs) by providing them with demonstration examples to perform diverse tasks.

In-Context Learning

Diffusion Theory as a Scalpel: Detecting and Purifying Poisonous Dimensions in Pre-trained Language Models Caused by Backdoor or Bias

no code implementations • 8 May 2023 • Zhiyuan Zhang, Deli Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun

To settle this issue, we propose the Fine-purifying approach, which utilizes the diffusion theory to study the dynamic process of fine-tuning for finding potentially poisonous dimensions.

Integrating Local Real Data with Global Gradient Prototypes for Classifier Re-Balancing in Federated Long-Tailed Learning

no code implementations • 25 Jan 2023 • Wenkai Yang, Deli Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun

Federated Learning (FL) has become a popular distributed learning paradigm that involves multiple clients training a global model collaboratively in a data privacy-preserving manner.

Federated Learning • Privacy Preserving

Topology-Imbalance Learning for Semi-Supervised Node Classification

1 code implementation • NeurIPS 2021 • Deli Chen, Yankai Lin, Guangxiang Zhao, Xuancheng Ren, Peng Li, Jie Zhou, Xu Sun

The class imbalance problem, as an important issue in learning node representations, has drawn increasing attention from the community.

Classification • Node Classification

Rethinking the Promotion Brought by Contrastive Learning to Semi-Supervised Node Classification

no code implementations • 14 Dec 2020 • Deli Chen, Yankai Lin, Lei Li, Xuancheng Ren, Peng Li, Jie Zhou, Xu Sun

Graph Contrastive Learning (GCL) has proven highly effective in promoting the performance of Semi-Supervised Node Classification (SSNC).

Contrastive Learning • Graph Learning • +1

Modeling the Stock Relation with Graph Network for Overnight Stock Movement Prediction

no code implementations • 26 Jun 2020 • Wei Li, Ruihan Bao, Keiko Harimoto, Deli Chen, Jingjing Xu, Qi Su

Further analysis shows that the introduction of the graph enables our model to predict the movement of stocks that are not directly associated with news as well as the whole market, which is not available in most previous methods.

Relation

HighwayGraph: Modelling Long-distance Node Relations for Improving General Graph Neural Network

no code implementations • 10 Nov 2019 • Deli Chen, Xiaoqian Liu, Yankai Lin, Peng Li, Jie Zhou, Qi Su, Xu Sun

To address this issue, we propose to model long-distance node relations by simply relying on shallow GNN architectures, with two solutions: (1) implicit modelling, by learning to predict node pair relations; and (2) explicit modelling, by adding edges between nodes that potentially have the same label.

General Classification • Graph Neural Network • +1

Identifying High-Quality Chinese News Comments Based on Multi-Target Text Matching Model

no code implementations • 22 Aug 2018 • Deli Chen, Shuming Ma, Pengcheng Yang, Xu Sun

In this work, we introduce a novel task: high-quality comment identification (HQCI), which aims to automatically assess the quality of online comments.

Informativeness • Text Matching
