Search Results for author: Chenyi Zhuang

Found 23 papers, 9 papers with code

FunReason: Enhancing Large Language Models' Function Calling via Self-Refinement Multiscale Loss and Automated Data Refinement

2 code implementations • 26 May 2025 • Bingguang Hao, Maolin Wang, Zengzhuang Xu, Cunyin Peng, Yicheng Chen, Xiangyu Zhao, Jinjie Gu, Chenyi Zhuang

FunReason provides a comprehensive solution for enhancing LLMs' function calling capabilities by introducing a balanced training methodology and a data refinement pipeline.

PiCo: Enhancing Text-Image Alignment with Improved Noise Selection and Precise Mask Control in Diffusion Models

no code implementations • 6 May 2025 • Chang Xie, Chenyi Zhuang, Pan Gao

In this work, we highlight two factors that affect this alignment: the quality of the randomly initialized noise and the reliability of the generated controlling mask.

PICO

Uni4D: A Unified Self-Supervised Learning Framework for Point Cloud Videos

no code implementations • 7 Apr 2025 • Zhi Zuo, Chenyi Zhuang, Pan Gao, Jie Qin, Hao Feng, Nicu Sebe

Self-supervised representation learning for point cloud videos remains a challenging problem with two key limitations: (1) existing methods rely on explicit knowledge to learn motion, resulting in suboptimal representations; (2) prior Masked AutoEncoder (MAE) frameworks struggle to bridge the gap between low-level geometry and high-level dynamics in 4D data.

Action Segmentation, Representation Learning, +1

CSR: Achieving 1 Bit Key-Value Cache via Sparse Representation

no code implementations • 16 Dec 2024 • Hongxuan Zhang, Yao Zhao, Jiaqi Zheng, Chenyi Zhuang, Jinjie Gu, Guihai Chen

The emergence of long-context text applications utilizing large language models (LLMs) has presented significant scalability challenges, particularly in memory footprint.

Quantization

Explainable Behavior Cloning: Teaching Large Language Model Agents through Learning by Demonstration

no code implementations • 30 Oct 2024 • Yanchu Guan, Dong Wang, Yan Wang, Haiqing Wang, Renen Sun, Chenyi Zhuang, Jinjie Gu, Zhixuan Chu

In this paper, we propose an Explainable Behavior Cloning LLM Agent (EBC-LLMAgent), a novel approach that combines large language models (LLMs) with behavior cloning by learning demonstrations to create intelligent and explainable agents for autonomous mobile app interaction.

Code Generation, Language Modeling, +3

DiffuseST: Unleashing the Capability of the Diffusion Model for Style Transfer

1 code implementation • 19 Oct 2024 • Ying Hu, Chenyi Zhuang, Pan Gao

Style transfer aims to fuse the artistic representation of a style image with the structural information of a content image.

Style Transfer

Magnet: We Never Know How Text-to-Image Diffusion Models Work, Until We Learn How Vision-Language Models Function

1 code implementation • 30 Sep 2024 • Chenyi Zhuang, Ying Hu, Pan Gao

In this work, we critically examine the limitations of the CLIP text encoder in understanding attributes and investigate how this affects diffusion models.

Attribute Disentanglement

Mitigate Position Bias with Coupled Ranking Bias on CTR Prediction

no code implementations • 29 May 2024 • Yao Zhao, Zhining Liu, Tianchi Cai, Haipeng Zhang, Chenyi Zhuang, Jinjie Gu

Using both synthetic and industrial datasets, we first show how this widely coexisted ranking bias deteriorates the performance of the existing position bias estimation methods.

Click-Through Rate Prediction, Position, +1

CDFormer: When Degradation Prediction Embraces Diffusion Model for Blind Image Super-Resolution

1 code implementation • 13 May 2024 • Qingguo Liu, Chenyi Zhuang, Pan Gao, Jie Qin

Existing Blind image Super-Resolution (BSR) methods focus on estimating either kernel or degradation information, but have long overlooked the essential content details.

Diversity, Image Super-Resolution

MoDE: A Mixture-of-Experts Model with Mutual Distillation among the Experts

no code implementations • 31 Jan 2024 • Zhitian Xie, Yinger Zhang, Chenyi Zhuang, Qitao Shi, Zhining Liu, Jinjie Gu, Guannan Zhang

However, the gate's routing mechanism also gives rise to narrow vision: the individual MoE's expert fails to use more samples in learning the allocated sub-task, which in turn limits the MoE to further improve its generalization ability.

Mixture-of-Experts

CDFormer: When Degradation Prediction Embraces Diffusion Model for Blind Image Super-Resolution

1 code implementation • CVPR 2024 • Qingguo Liu, Chenyi Zhuang, Pan Gao, Jie Qin

Existing Blind image Super-Resolution (BSR) methods focus on estimating either kernel or degradation information but have long overlooked the essential content details.

Diversity, Image Super-Resolution

Lookahead: An Inference Acceleration Framework for Large Language Model with Lossless Generation Accuracy

1 code implementation • 20 Dec 2023 • Yao Zhao, Zhitian Xie, Chen Liang, Chenyi Zhuang, Jinjie Gu

Instead of generating a single token at a time, we propose a Trie-based retrieval and verification mechanism to be able to accept several tokens at a forward step.

Language Modeling, Language Modelling, +4

GreenFlow: A Computation Allocation Framework for Building Environmentally Sound Recommendation System

no code implementations • 15 Dec 2023 • Xingyu Lu, Zhining Liu, Yanchu Guan, Hongxuan Zhang, Chenyi Zhuang, Wenqi Ma, Yize Tan, Jinjie Gu, Guannan Zhang

of a cascade RS, when a user triggers a request, we define two actions that determine the computation: (1) the trained instances of models with different computational complexity; and (2) the number of items to be inferred in the stage.

Recommendation Systems

Large Multimodal Model Compression via Efficient Pruning and Distillation at AntGroup

no code implementations • 10 Dec 2023 • Maolin Wang, Yao Zhao, Jiajia Liu, Jingdong Chen, Chenyi Zhuang, Jinjie Gu, Ruocheng Guo, Xiangyu Zhao

In our research, we constructed a dataset, the Multimodal Advertisement Audition Dataset (MAAD), from real-world scenarios within Alipay, and conducted experiments to validate the reliability of our proposed strategy.

Model Compression

Intelligent Virtual Assistants with LLM-based Process Automation

no code implementations • 4 Dec 2023 • Yanchu Guan, Dong Wang, Zhixuan Chu, Shiyu Wang, Feiyue Ni, Ruihua Song, Longfei Li, Jinjie Gu, Chenyi Zhuang

This paper proposes a novel LLM-based virtual assistant that can automatically perform multi-step operations within mobile apps based on high-level user requests.

Language Modelling, Large Language Model

Fast Chain-of-Thought: A Glance of Future from Parallel Decoding Leads to Answers Faster

1 code implementation • 14 Nov 2023 • Hongxuan Zhang, Zhining Liu, Yao Zhao, Jiaqi Zheng, Chenyi Zhuang, Jinjie Gu, Guihai Chen

In this work, we propose FastCoT, a model-agnostic framework based on parallel decoding without any further training of an auxiliary model or modification to the LLM itself.

Position

StylePrompter: All Styles Need Is Attention

1 code implementation • 30 Jul 2023 • Chenyi Zhuang, Pan Gao, Aljosa Smolic

We then prove that StylePrompter lies in a more disentangled $\mathcal{W}^+$ and show the controllability of SMART.

All, Attribute, +1

Tensorized Hypergraph Neural Networks

no code implementations • 5 Jun 2023 • Maolin Wang, Yaoming Zhen, Yu Pan, Yao Zhao, Chenyi Zhuang, Zenglin Xu, Ruocheng Guo, Xiangyu Zhao

THNN is a faithful hypergraph modeling framework through high-order outer product feature message passing and is a natural tensor extension of the adjacency-matrix-based graph neural networks.
