Search Results for author: Tiandeng Wu

Found 3 papers, 1 paper with code

P/D-Serve: Serving Disaggregated Large Language Model at Scale

no code implementations · 15 Aug 2024 · Yibo Jin, Tao Wang, Huimin Lin, Mingyang Song, Peiyang Li, Yipeng Ma, Yicheng Shan, Zhengfan Yuan, Cailong Li, Yajing Sun, Tiandeng Wu, Xing Chu, Ruizhi Huan, Li Ma, Xiao You, Wenting Zhou, Yunpeng Ye, Wen Liu, Xiangkun Xu, Yongsheng Zhang, Tiantian Dong, Jiawei Zhu, Zhe Wang, Xijian Ju, Jianxun Song, Haoliang Cheng, Xiaojing Li, Jiandong Ding, Hefei Guo, Zhengyong Zhang

To overcome previous problems, this paper proposes an end-to-end system, P/D-Serve, complying with the paradigm of MLOps (machine learning operations), which models end-to-end (E2E) P/D performance and enables: 1) fine-grained P/D organization, mapping the service with RoCE (RDMA over Converged Ethernet) as needed, to facilitate similar processing and dynamic adjustments on P/D ratios; 2) on-demand forwarding upon rejections for idle prefill, decoupling the scheduler from regular inaccurate reports and local queues, to avoid timeouts in prefill; and 3) efficient KVCache transfer via optimized D2D access.

Language Modeling · Language Modelling · +1

Continual Graph Convolutional Network for Text Classification

no code implementations · 9 Apr 2023 · Tiandeng Wu, Qijiong Liu, Yi Cao, Yao Huang, Xiao-Ming Wu, Jiandong Ding

Graph convolutional networks (GCNs) have been successfully applied to capture global non-consecutive and long-distance semantic information for text classification.

Contrastive Learning · text-classification · +1

FANS: Fast Non-Autoregressive Sequence Generation for Item List Continuation

1 code implementation · 2 Apr 2023 · Qijiong Liu, Jieming Zhu, Jiahao Wu, Tiandeng Wu, Zhenhua Dong, Xiao-Ming Wu

Item list continuation is proposed to model the overall trend of a list and predict subsequent items.
