Search Results for author: Chenyi Zhuang

Found 11 papers, 3 papers with code

MoDE: A Mixture-of-Experts Model with Mutual Distillation among the Experts

no code implementations31 Jan 2024 Zhitian Xie, Yinger Zhang, Chenyi Zhuang, Qitao Shi, Zhining Liu, Jinjie Gu, Guannan Zhang

However, the gate's routing mechanism also gives rise to narrow vision: the individual MoE's expert fails to use more samples in learning the allocated sub-task, which in turn limits the MoE to further improve its generalization ability.

Lookahead: An Inference Acceleration Framework for Large Language Model with Lossless Generation Accuracy

1 code implementation20 Dec 2023 Yao Zhao, Zhitian Xie, Chenyi Zhuang, Jinjie Gu

Hence, this paper presents a generic framework for accelerating the inference process, resulting in a substantial increase in speed and cost reduction for our RAG system, with lossless generation accuracy.

Language Modelling Large Language Model +3

GreenFlow: A Computation Allocation Framework for Building Environmentally Sound Recommendation System

no code implementations15 Dec 2023 Xingyu Lu, Zhining Liu, Yanchu Guan, Hongxuan Zhang, Chenyi Zhuang, Wenqi Ma, Yize Tan, Jinjie Gu, Guannan Zhang

of a cascade RS, when a user triggers a request, we define two actions that determine the computation: (1) the trained instances of models with different computational complexity; and (2) the number of items to be inferred in the stage.

Recommendation Systems

Large Multimodal Model Compression via Efficient Pruning and Distillation at AntGroup

no code implementations10 Dec 2023 Maolin Wang, Yao Zhao, Jiajia Liu, Jingdong Chen, Chenyi Zhuang, Jinjie Gu, Ruocheng Guo, Xiangyu Zhao

In our research, we constructed a dataset, the Multimodal Advertisement Audition Dataset (MAAD), from real-world scenarios within Alipay, and conducted experiments to validate the reliability of our proposed strategy.

Model Compression

Intelligent Virtual Assistants with LLM-based Process Automation

no code implementations4 Dec 2023 Yanchu Guan, Dong Wang, Zhixuan Chu, Shiyu Wang, Feiyue Ni, Ruihua Song, Longfei Li, Jinjie Gu, Chenyi Zhuang

This paper proposes a novel LLM-based virtual assistant that can automatically perform multi-step operations within mobile apps based on high-level user requests.

Language Modelling Large Language Model

Fast Chain-of-Thought: A Glance of Future from Parallel Decoding Leads to Answers Faster

1 code implementation14 Nov 2023 Hongxuan Zhang, Zhining Liu, Jiaqi Zheng, Chenyi Zhuang, Jinjie Gu, Guihai Chen

In this work, we propose FastCoT, a model-agnostic framework based on parallel decoding without any further training of an auxiliary model or modification to the LLM itself.

Position

StylePrompter: All Styles Need Is Attention

1 code implementation30 Jul 2023 Chenyi Zhuang, Pan Gao, Aljosa Smolic

We then prove that StylePrompter lies in a more disentangled $\mathcal{W^+}$ and show the controllability of SMART.

Attribute Image Manipulation

Tensorized Hypergraph Neural Networks

no code implementations5 Jun 2023 Maolin Wang, Yaoming Zhen, Yu Pan, Yao Zhao, Chenyi Zhuang, Zenglin Xu, Ruocheng Guo, Xiangyu Zhao

THNN is a faithful hypergraph modeling framework through high-order outer product feature message passing and is a natural tensor extension of the adjacency-matrix-based graph neural networks.

Cannot find the paper you are looking for? You can Submit a new open access paper.