Search Results for author: Zhanhui Kang

Found 37 papers, 12 papers with code

PatchRec: Multi-Grained Patching for Efficient LLM-based Sequential Recommendation

no code implementations • 25 Jan 2025 • Jiayi Liao, Ruobing Xie, Sihang Li, Xiang Wang, Xingwu Sun, Zhanhui Kang, Xiangnan He

The framework consists of two stages: (1) Patch Pre-training, which familiarizes LLMs with item-level compression patterns, and (2) Patch Fine-tuning, which teaches LLMs to model sequences at multiple granularities.

Language Modeling • Language Modelling • +1
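
A minimal sketch of the patching idea the abstract describes, purely for illustration and not the paper's exact method: each historical item's token embeddings are compressed into one "patch" embedding (here by mean pooling) so a long interaction history occupies far fewer LLM input positions. All names below are hypothetical.

```python
# Illustrative sketch (not PatchRec's exact method): compress each item's
# token embeddings into a single patch embedding by mean pooling.
import torch

def patch_items(item_token_embs: list[torch.Tensor]) -> torch.Tensor:
    """item_token_embs: one [num_tokens, hidden] tensor per historical item.
    Returns [num_items, hidden], one patch embedding per item."""
    return torch.stack([tokens.mean(dim=0) for tokens in item_token_embs])

# Example: 3 items whose titles take 7, 12, and 5 tokens in a 16-dim space
# become 3 patch positions instead of 24 token positions.
history = [torch.randn(n, 16) for n in (7, 12, 5)]
patches = patch_items(history)
print(patches.shape)  # torch.Size([3, 16])
```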

Autonomy-of-Experts Models

no code implementations • 22 Jan 2025 • Ang Lv, Ruobing Xie, Yining Qian, Songhao Wu, Xingwu Sun, Zhanhui Kang, Di Wang, Rui Yan

We argue that the separation between the router's decision-making and the experts' execution is a critical yet overlooked issue, leading to suboptimal expert selection and ineffective learning.

Decision Making

Enhancing Contrastive Learning Inspired by the Philosophy of "The Blind Men and the Elephant"

1 code implementation • 21 Dec 2024 • Yudong Zhang, Ruobing Xie, Jiansheng Chen, Xingwu Sun, Zhanhui Kang, Yu Wang

Contrastive learning is a prevalent technique in self-supervised vision representation learning, typically generating positive pairs by applying two data augmentations to the same image.

Contrastive Learning • Data Augmentation • +2
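
The setup the abstract refers to is the standard two-view contrastive recipe; the following SimCLR-style sketch (an illustration, not this paper's contribution) shows how two augmented views of the same image form a positive pair while other images in the batch act as negatives.

```python
# Standard InfoNCE / NT-Xent loss over two augmented views of the same images.
import torch
import torch.nn.functional as F

def info_nce(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1):
    """z1, z2: [batch, dim] projections of two views of the same images."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature      # cosine similarities
    targets = torch.arange(z1.size(0))      # positives sit on the diagonal
    return F.cross_entropy(logits, targets)

loss = info_nce(torch.randn(8, 128), torch.randn(8, 128))
```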

More Expressive Attention with Negative Weights

1 code implementation • 11 Nov 2024 • Ang Lv, Ruobing Xie, Shuaipeng Li, Jiayi Liao, Xingwu Sun, Zhanhui Kang, Di Wang, Rui Yan

We propose a novel attention mechanism, named Cog Attention, that enables attention weights to be negative for enhanced expressiveness, which stems from two key factors: (1) Cog Attention enhances parameter flexibility.

Decoder • Image Generation • +2
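
To make the "negative attention weights" idea concrete, here is one hedged way to obtain signed weights: normalize raw scores by the sum of their absolute values instead of applying softmax. This is an illustrative variant only and not necessarily Cog Attention's formulation.

```python
# Signed attention sketch: weights can be negative, so a head can subtract
# a value vector as well as add it. Not Cog Attention's exact form.
import torch

def signed_attention(q, k, v, eps: float = 1e-6):
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)   # [*, Lq, Lk]
    weights = scores / (scores.abs().sum(dim=-1, keepdim=True) + eps)
    return weights @ v   # rows of `weights` may contain negative entries

q, k, v = (torch.randn(2, 4, 16) for _ in range(3))
out = signed_attention(q, k, v)   # [2, 4, 16]
```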

Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent

3 code implementations • 4 Nov 2024 • Xingwu Sun, Yanfeng Chen, Yiqing Huang, Ruobing Xie, Jiaqi Zhu, Kai Zhang, Shuaipeng Li, Zhen Yang, Jonny Han, Xiaobo Shu, Jiahao Bu, Zhongzhi Chen, Xuemeng Huang, Fengzong Lian, Saiyong Yang, Jianfeng Yan, Yuyuan Zeng, Xiaoqin Ren, Chao Yu, Lulu Wu, Yue Mao, Jun Xia, Tao Yang, Suncong Zheng, Kan Wu, Dian Jiao, Jinbao Xue, Xipeng Zhang, Decheng Wu, Kai Liu, Dengpeng Wu, Guanghui Xu, Shaohua Chen, Shuang Chen, Xiao Feng, Yigeng Hong, Junqiang Zheng, Chengcheng Xu, Zongwei Li, Xiong Kuang, Jianglu Hu, Yiqi Chen, Yuchi Deng, Guiyang Li, Ao Liu, Chenchen Zhang, Shihui Hu, Zilong Zhao, Zifan Wu, Yao Ding, Weichao Wang, Han Liu, Roberts Wang, Hao Fei, Peijie Yu, Ze Zhao, Xun Cao, Hai Wang, Fusheng Xiang, Mengyuan Huang, Zhiyuan Xiong, Bin Hu, Xuebin Hou, Lei Jiang, Jianqiang Ma, Jiajia Wu, Yaping Deng, Yi Shen, Qian Wang, Weijie Liu, Jie Liu, Meng Chen, Liang Dong, Weiwen Jia, Hu Chen, Feifei Liu, Rui Yuan, Huilin Xu, Zhenxiang Yan, Tengfei Cao, Zhichao Hu, Xinhua Feng, Dong Du, TingHao Yu, Yangyu Tao, Feng Zhang, Jianchen Zhu, Chengzhong Xu, Xirui Li, Chong Zha, Wen Ouyang, Yinben Xia, Xiang Li, Zekun He, Rongpeng Chen, Jiawei Song, Ruibin Chen, Fan Jiang, Chongqing Zhao, Bo wang, Hao Gong, Rong Gan, Winston Hu, Zhanhui Kang, Yong Yang, Yuhong Liu, Di Wang, Jie Jiang

In this paper, we introduce Hunyuan-Large, which is currently the largest open-source Transformer-based mixture of experts model, with a total of 389 billion parameters and 52 billion activated parameters, capable of handling up to 256K tokens.

Logical Reasoning • Mathematical Problem-Solving
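
The distinction between 389B total and 52B activated parameters comes from mixture-of-experts routing: each token only runs through a few experts. The toy layer below is a generic top-k MoE sketch (not Hunyuan-Large's architecture); `TopKMoE` and its sizes are hypothetical.

```python
# Toy top-k MoE layer: most expert weights stay idle for any given token,
# which is why "activated" parameters are far fewer than total parameters.
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    def __init__(self, dim=64, num_experts=16, k=2):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_experts))
        self.k = k

    def forward(self, x):                      # x: [tokens, dim]
        gate = self.router(x).softmax(dim=-1)  # [tokens, num_experts]
        topv, topi = gate.topk(self.k, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):             # run only the selected experts
            for e in topi[:, slot].unique():
                mask = topi[:, slot] == e
                out[mask] += topv[mask, slot:slot + 1] * self.experts[e](x[mask])
        return out

y = TopKMoE()(torch.randn(10, 64))             # 2 of 16 experts used per token
```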

Continuous Speech Tokenizer in Text To Speech

no code implementations • 22 Oct 2024 • Yixing Li, Ruobing Xie, Xingwu Sun, Yu Cheng, Zhanhui Kang

Our results show that the speech language model based on the continuous speech tokenizer has better continuity and a higher estimated Mean Opinion Score (MOS).

Language Modeling • Language Modelling • +1

Exploring Forgetting in Large Language Model Pre-Training

no code implementations • 22 Oct 2024 • Chonghua Liao, Ruobing Xie, Xingwu Sun, Haowen Sun, Zhanhui Kang

Catastrophic forgetting remains a formidable obstacle to building an omniscient model in large language models (LLMs).

Language Modeling • Language Modelling • +2

Lossless KV Cache Compression to 2%

no code implementations • 20 Oct 2024 • Zhen Yang, J. N. Han, Kan Wu, Ruobing Xie, An Wang, Xingwu Sun, Zhanhui Kang

Large language models have revolutionized data processing in numerous domains, with their ability to handle extended context reasoning receiving notable recognition.

Dimensionality Reduction • Quantization
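
For context on why KV cache compression matters at long context lengths, here is a back-of-the-envelope helper (not the paper's method): the cache grows linearly with layers, KV heads, and sequence length. The example model dimensions are hypothetical.

```python
# Rough KV cache footprint: 2x for keys and values, fp16/bf16 storage assumed.
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch=1, bytes_per_elem=2):
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_elem

# e.g. a hypothetical 80-layer model with 8 KV heads of dim 128 at 256K tokens:
print(kv_cache_bytes(80, 8, 128, 256_000) / 2**30, "GiB per sequence")
```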

RosePO: Aligning LLM-based Recommenders with Human Values

no code implementations • 16 Oct 2024 • Jiayi Liao, Xiangnan He, Ruobing Xie, Jiancan Wu, Yancheng Yuan, Xingwu Sun, Zhanhui Kang, Xiang Wang

Recently, there has been a growing interest in leveraging Large Language Models (LLMs) for recommendation systems, which usually adapt a pre-trained LLM to the recommendation scenario through supervised fine-tuning (SFT).

Hallucination • Recommendation Systems

Language Models "Grok" to Copy

no code implementations • 14 Sep 2024 • Ang Lv, Ruobing Xie, Xingwu Sun, Zhanhui Kang, Rui Yan

We examine the pre-training dynamics of language models, focusing on their ability to copy text from preceding context, a fundamental skill for various LLM applications, including in-context learning (ICL) and retrieval-augmented generation (RAG).

In-Context Learning • Language Modelling • +1

Negative Sampling in Recommendation: A Survey and Future Directions

no code implementations • 11 Sep 2024 • Haokai Ma, Ruobing Xie, Lei Meng, Fuli Feng, Xiaoyu Du, Xingwu Sun, Zhanhui Kang, Xiangxu Meng

Recommender systems aim to capture users' personalized preferences from the vast amount of user behaviors, making them pivotal in the era of information explosion.

Recommendation Systems • Survey

HMoE: Heterogeneous Mixture of Experts for Language Modeling

no code implementations • 20 Aug 2024 • An Wang, Xingwu Sun, Ruobing Xie, Shuaipeng Li, Jiaqi Zhu, Zhen Yang, Pinxue Zhao, J. N. Han, Zhanhui Kang, Di Wang, Naoaki Okazaki, Cheng-Zhong Xu

To address the imbalance in expert activation, we propose a novel training objective that encourages the frequent activation of smaller experts, enhancing computational efficiency and parameter utilization.

Computational Efficiency • Language Modeling • +1
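
As a rough illustration of a training objective "in the spirit" of what the abstract describes, the sketch below penalizes routing probability mass in proportion to each expert's size, nudging the router toward smaller experts. This is an assumption for illustration, not necessarily HMoE's exact objective; all names are hypothetical.

```python
# Size-aware auxiliary loss sketch: bigger experts incur a larger penalty for
# the routing mass they receive, encouraging activation of smaller experts.
import torch

def size_weighted_load_loss(gate_probs: torch.Tensor, expert_params: torch.Tensor):
    """gate_probs: [tokens, num_experts] router softmax.
    expert_params: [num_experts] parameter count of each heterogeneous expert."""
    load = gate_probs.mean(dim=0)                       # routing mass per expert
    size_weight = expert_params / expert_params.sum()   # bigger expert -> bigger penalty
    return (load * size_weight).sum()

probs = torch.softmax(torch.randn(32, 4), dim=-1)
loss = size_weighted_load_loss(probs, torch.tensor([1e6, 2e6, 4e6, 8e6]))
```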

Improving Multi-modal Recommender Systems by Denoising and Aligning Multi-modal Content and User Feedback

1 code implementation • 18 Jun 2024 • Guipeng Xv, Xinyu Li, Ruobing Xie, Chen Lin, Chong Liu, Feng Xia, Zhanhui Kang, Leyu Lin

Multi-modal recommender systems (MRSs) are pivotal in diverse online web platforms and have garnered considerable attention in recent years.

Denoising • Recommendation Systems

DFGNN: Dual-frequency Graph Neural Network for Sign-aware Feedback

no code implementations • 24 May 2024 • Yiqing Wu, Ruobing Xie, Zhao Zhang, Xu Zhang, Fuzhen Zhuang, Leyu Lin, Zhanhui Kang, Yongjun Xu

Based on the two observations, we propose a novel model that models positive and negative feedback from a frequency filter perspective called Dual-frequency Graph Neural Network for Sign-aware Recommendation (DFGNN).

Graph Neural Network
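
To unpack the "frequency filter" intuition: on a graph, a low-pass filter smooths node signals over edges while a high-pass filter emphasizes differences, which is one common way to treat positive versus negative feedback. The sketch below illustrates that intuition and is not DFGNN's exact layer design.

```python
# Dual-frequency graph filters sketch: low-pass (I + A_norm)/2 smooths,
# high-pass (I - A_norm)/2 sharpens. Illustrative only.
import torch

def normalized_adj(adj: torch.Tensor) -> torch.Tensor:
    deg = adj.sum(dim=1).clamp(min=1)
    d_inv_sqrt = deg.pow(-0.5)
    return d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]

def dual_frequency_filters(adj, x):
    a = normalized_adj(adj)
    i = torch.eye(adj.size(0))
    low = 0.5 * (i + a) @ x    # smooth component (positive feedback)
    high = 0.5 * (i - a) @ x   # sharp component (negative feedback)
    return low, high

adj = torch.tensor([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
low, high = dual_frequency_filters(adj, torch.randn(3, 8))
```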

ID-centric Pre-training for Recommendation

no code implementations • 6 May 2024 • Yiqing Wu, Ruobing Xie, Zhao Zhang, Fuzhen Zhuang, Xu Zhang, Leyu Lin, Zhanhui Kang, Yongjun Xu

Specifically, in the pre-training stage, besides the ID-based sequential model for recommendation, we also build a Cross-domain ID-matcher (CDIM) learned from both behavioral and modality information.

Language Modelling • Sequential Recommendation

PhD: A ChatGPT-Prompted Visual hallucination Evaluation Dataset

1 code implementation • 17 Mar 2024 • Jiazhen Liu, Yuhan Fu, Ruobing Xie, Runquan Xie, Xingwu Sun, Fengzong Lian, Zhanhui Kang, Xirong Li

This paper contributes a ChatGPT-Prompted visual hallucination evaluation Dataset (PhD) for objective VHE at a large scale.

Attribute • Common Sense Reasoning • +4

EasyQuant: An Efficient Data-free Quantization Algorithm for LLMs

no code implementations • 5 Mar 2024 • Hanlin Tang, Yifu Sun, Decheng Wu, Kai Liu, Jianchen Zhu, Zhanhui Kang

To the best of our knowledge, this is the first work to achieve almost lossless quantization performance for LLMs in a data-independent setting, and our algorithm runs over 10 times faster than data-dependent methods.

Data Free Quantization
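
As background for the data-independent setting the abstract mentions, the sketch below shows the generic data-free baseline: weight-only, per-output-channel symmetric quantization whose scales come from the weights themselves, with no calibration set. It is not EasyQuant's full algorithm.

```python
# Data-free per-channel symmetric int8 quantize/dequantize of a weight matrix.
import torch

def quantize_dequantize_int8(w: torch.Tensor):
    """w: [out_features, in_features] weight matrix."""
    scale = (w.abs().amax(dim=1, keepdim=True) / 127.0).clamp(min=1e-8)
    q = torch.clamp(torch.round(w / scale), -128, 127)   # int8 grid
    return q * scale                                      # dequantized weights

w = torch.randn(256, 512)
w_hat = quantize_dequantize_int8(w)
print((w - w_hat).abs().max())   # small, data-independent reconstruction error
```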

Plug-in Diffusion Model for Sequential Recommendation

1 code implementation • 5 Jan 2024 • Haokai Ma, Ruobing Xie, Lei Meng, Xin Chen, Xu Zhang, Leyu Lin, Zhanhui Kang

To address this issue, this paper presents a novel Plug-in Diffusion Model for Recommendation (PDRec) framework, which employs the diffusion model as a flexible plugin to jointly take full advantage of the diffusion-generated user preferences on all items.

Image Generation • model • +2

Truth Forest: Toward Multi-Scale Truthfulness in Large Language Models through Intervention without Tuning

1 code implementation • 29 Dec 2023 • Zhongzhi Chen, Xingwu Sun, Xianfeng Jiao, Fengzong Lian, Zhanhui Kang, Di Wang, Cheng-Zhong Xu

We introduce Truth Forest, a method that enhances truthfulness in LLMs by uncovering hidden truth representations using multi-dimensional orthogonal probes.

TruthfulQA
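
To make "multi-dimensional orthogonal probes" concrete, here is a hedged sketch of one way such probes could be trained: several linear probes classify truthfulness from a hidden state, with a penalty that keeps the probe directions mutually orthogonal. This is an illustration under assumptions, not the paper's training recipe; sizes and names are hypothetical.

```python
# Linear probes on hidden states with an orthogonality penalty between
# probe directions, so each probe captures a different facet.
import torch
import torch.nn.functional as F

probes = torch.nn.Linear(768, 4, bias=False)        # 4 probe directions

def probe_loss(hidden: torch.Tensor, labels: torch.Tensor, ortho_weight=0.1):
    """hidden: [batch, 768] activations; labels: [batch] in {0, 1}."""
    logits = probes(hidden)                          # [batch, 4]
    cls = F.binary_cross_entropy_with_logits(
        logits, labels.float().unsqueeze(1).expand_as(logits))
    w = F.normalize(probes.weight, dim=1)            # [4, 768] unit directions
    gram = w @ w.t() - torch.eye(4)                  # off-diagonal overlaps
    return cls + ortho_weight * gram.pow(2).sum()

loss = probe_loss(torch.randn(16, 768), torch.randint(0, 2, (16,)))
```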

E-Sparse: Boosting the Large Language Model Inference through Entropy-based N:M Sparsity

no code implementations • 24 Oct 2023 • Yun Li, Lin Niu, Xipeng Zhang, Kai Liu, Jianchen Zhu, Zhanhui Kang

Traditional pruning methods are known to be difficult to apply to Large Language Models (LLMs) for Generative AI because of their unaffordable training process and large computational demands.

Language Modeling • Language Modelling • +1
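
For readers unfamiliar with N:M sparsity (named in the title), the sketch below applies the generic pattern: within every group of M consecutive weights, keep the N largest by some saliency score and zero the rest. Plain magnitude is used here; E-Sparse's contribution is an entropy-based saliency, which this sketch does not reproduce.

```python
# Generic 2:4 structured sparsity by magnitude (illustrative only).
import torch

def nm_prune(w: torch.Tensor, n: int = 2, m: int = 4) -> torch.Tensor:
    rows, cols = w.shape                       # cols must be divisible by m
    groups = w.reshape(rows, cols // m, m)
    keep = groups.abs().topk(n, dim=-1).indices
    mask = torch.zeros_like(groups).scatter_(-1, keep, 1.0)
    return (groups * mask).reshape(rows, cols)

w_sparse = nm_prune(torch.randn(8, 16))        # exactly 2 nonzeros per group of 4
```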

Thoroughly Modeling Multi-domain Pre-trained Recommendation as Language

no code implementations • 20 Oct 2023 • Zekai Qu, Ruobing Xie, Chaojun Xiao, Yuan YAO, Zhiyuan Liu, Fengzong Lian, Zhanhui Kang, Jie zhou

With pre-trained language models (PLMs) widely verified across various NLP tasks, pioneering efforts attempt to explore the possible cooperation of the general textual information in PLMs with the personalized behavioral information in user historical behavior sequences to enhance sequential recommendation (SR).

Informativeness • Language Modeling • +2

TeachCLIP: Multi-Grained Teaching for Efficient Text-to-Video Retrieval

no code implementations • 2 Aug 2023 • Kaibin Tian, Ruixiang Zhao, Hu Hu, Runquan Xie, Fengzong Lian, Zhanhui Kang, Xirong Li

For efficient T2VR, we propose TeachCLIP with multi-grained teaching to let a CLIP4Clip-based student network learn from more advanced yet computationally heavy models such as X-CLIP, TS2-Net, and X-Pool.

Retrieval • text similarity • +2
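
A hedged sketch of one common way to distill a heavy retrieval teacher into a lightweight student, which is the general setting described above (not necessarily TeachCLIP's exact losses): the student matches the teacher's softmax over candidate videos for each text query.

```python
# Similarity-distribution distillation for text-to-video retrieval.
import torch
import torch.nn.functional as F

def distill_loss(student_sim, teacher_sim, tau: float = 2.0):
    """student_sim, teacher_sim: [num_texts, num_videos] similarity matrices."""
    t = F.softmax(teacher_sim / tau, dim=-1)
    s = F.log_softmax(student_sim / tau, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * tau * tau

loss = distill_loss(torch.randn(4, 100), torch.randn(4, 100))
```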

Multi-Feature Integration for Perception-Dependent Examination-Bias Estimation

1 code implementation • 27 Feb 2023 • Xiaoshu Chen, Xiangsheng Li, Kunliang Wei, Bin Hu, Lei Jiang, Zeqian Huang, Zhanhui Kang

Accurately eliminating examination bias is pivotal for applying click-through data to train an unbiased ranking model.
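
For context, estimated examination propensities are typically consumed by an inverse-propensity-weighted training objective like the sketch below; the paper's contribution is a better multi-feature estimate of those propensities, not this loss itself, and the function here is a hypothetical illustration.

```python
# Inverse propensity weighting: clicks are up-weighted by 1 / P(examined).
import torch
import torch.nn.functional as F

def ipw_ranking_loss(scores, clicks, examination_prob, clip: float = 0.05):
    """scores: [n] model relevance scores; clicks: [n] 0/1 observed clicks;
    examination_prob: [n] estimated probability each position was examined."""
    weights = clicks / examination_prob.clamp(min=clip)   # inverse propensity
    return (weights * F.softplus(-scores)).sum() / max(clicks.sum().item(), 1.0)

loss = ipw_ranking_loss(torch.randn(10), torch.randint(0, 2, (10,)).float(),
                        torch.rand(10))
```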

BagFormer: Better Cross-Modal Retrieval via bag-wise interaction

no code implementations • 29 Dec 2022 • Haowen Hou, Xiaopeng Yan, Yigeng Zhang, Fengzong Lian, Zhanhui Kang

In the field of cross-modal retrieval, single encoder models tend to perform better than dual encoder models, but they suffer from high latency and low throughput.

Cross-Modal Retrieval • Retrieval

TencentPretrain: A Scalable and Flexible Toolkit for Pre-training Models of Different Modalities

3 code implementations • 13 Dec 2022 • Zhe Zhao, Yudong Li, Cheng Hou, Jing Zhao, Rong Tian, Weijie Liu, Yiren Chen, Ningyuan Sun, Haoyan Liu, Weiquan Mao, Han Guo, Weigang Guo, Taiqiang Wu, Tao Zhu, Wenhang Shi, Chen Chen, Shan Huang, Sihong Chen, Liqun Liu, Feifei Li, Xiaoshuai Chen, Xingwu Sun, Zhanhui Kang, Xiaoyong Du, Linlin Shen, Kimmo Yan

The proposed pre-training models of different modalities are showing a rising trend of homogeneity in their model structures, which brings the opportunity to implement different pre-training models within a uniform framework.

Decoder

MKQ-BERT: Quantized BERT with 4-bits Weights and Activations

no code implementations • 25 Mar 2022 • Hanlin Tang, Xipeng Zhang, Kai Liu, Jianchen Zhu, Zhanhui Kang

In this work, we propose MKQ-BERT, which further improves the compression level and uses 4-bits for quantization.

Quantization
