Search Results for author: Hongxuan Zhang

Found 4 papers, 1 paper with code

GreenFlow: A Computation Allocation Framework for Building Environmentally Sound Recommendation System

no code implementations • 15 Dec 2023 • Xingyu Lu, Zhining Liu, Yanchu Guan, Hongxuan Zhang, Chenyi Zhuang, Wenqi Ma, Yize Tan, Jinjie Gu, Guannan Zhang

of a cascade RS, when a user triggers a request, we define two actions that determine the computation: (1) the trained instances of models with different computational complexity; and (2) the number of items to be inferred in the stage.

Recommendation Systems

Fast Chain-of-Thought: A Glance of Future from Parallel Decoding Leads to Answers Faster

1 code implementation • 14 Nov 2023 • Hongxuan Zhang, Zhining Liu, Jiaqi Zheng, Chenyi Zhuang, Jinjie Gu, Guihai Chen

In this work, we propose FastCoT, a model-agnostic framework based on parallel decoding without any further training of an auxiliary model or modification to the LLM itself.

Position

Dynamic DNN Decomposition for Lossless Synergistic Inference

no code implementations • 15 Jan 2021 • Beibei Zhang, Tian Xiang, Hongxuan Zhang, Te Li, Shiqiang Zhu, Jianjun Gu

The algorithm can partially adjust the partitions at run time according to processing time and network conditions.

Edge-computing
