Search Results for author: Pengyu Cheng

Found 22 papers, 13 papers with code

Kimi-VL Technical Report

1 code implementation 10 Apr 2025 Kimi Team, Angang Du, Bohong Yin, Bowei Xing, Bowen Qu, Bowen Wang, Cheng Chen, Chenlin Zhang, Chenzhuang Du, Chu Wei, Congcong Wang, Dehao Zhang, Dikang Du, Dongliang Wang, Enming Yuan, Enzhe Lu, Fang Li, Flood Sung, Guangda Wei, Guokun Lai, Han Zhu, Hao Ding, Hao Hu, Hao Yang, Hao Zhang, HaoNing Wu, Haotian Yao, Haoyu Lu, Heng Wang, Hongcheng Gao, Huabin Zheng, Jiaming Li, Jianlin Su, Jianzhou Wang, Jiaqi Deng, Jiezhong Qiu, Jin Xie, Jinhong Wang, Jingyuan Liu, Junjie Yan, Kun Ouyang, Liang Chen, Lin Sui, Longhui Yu, Mengfan Dong, Mengnan Dong, Nuo Xu, Pengyu Cheng, Qizheng Gu, Runjie Zhou, Shaowei Liu, Sihan Cao, Tao Yu, Tianhui Song, Tongtong Bai, Wei Song, Weiran He, Weixiao Huang, Weixin Xu, Xiaokun Yuan, Xingcheng Yao, Xingzhe Wu, Xinxing Zu, Xinyu Zhou, Xinyuan Wang, Y. Charles, Yan Zhong, Yang Li, Yangyang Hu, Yanru Chen, Yejie Wang, Yibo Liu, Yibo Miao, Yidao Qin, Yimin Chen, Yiping Bao, Yiqin Wang, Yongsheng Kang, Yuanxin Liu, Yulun Du, Yuxin Wu, Yuzhi Wang, Yuzi Yan, Zaida Zhou, Zhaowei Li, Zhejun Jiang, Zheng Zhang, Zhilin Yang, Zhiqi Huang, Zihao Huang, Zijia Zhao, Ziwei Chen, Zongyu Lin

We present Kimi-VL, an efficient open-source Mixture-of-Experts (MoE) vision-language model (VLM) that offers advanced multimodal reasoning, long-context understanding, and strong agent capabilities, all while activating only 2.8B parameters in its language decoder (Kimi-VL-A3B).

Long-Context Understanding Mathematical Reasoning +4
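The "activating only 2.8B parameters" figure reflects standard sparse MoE routing: a gating network picks a few experts per token, so only a small fraction of the model's total parameters run in each forward pass. A minimal numpy sketch of generic top-k gating (sizes and names are illustrative, not Kimi-VL's actual architecture):

```python
import numpy as np

def top_k_gating(x, gate_w, k=2):
    """Route one token to its top-k experts; only those experts run."""
    logits = x @ gate_w                       # one score per expert
    top = np.argsort(logits)[-k:]             # indices of the k best experts
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()                  # softmax over selected experts
    return top, weights

rng = np.random.default_rng(0)
x = rng.normal(size=16)                       # one token embedding
gate_w = rng.normal(size=(16, 8))             # gate for 8 experts
experts, weights = top_k_gating(x, gate_w, k=2)
# Only 2 of the 8 expert FFNs would be evaluated for this token.
```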

Simplify RLHF as Reward-Weighted SFT: A Variational Method

no code implementations 16 Feb 2025 Yuhao Du, Zhuo Li, Pengyu Cheng, Zhihong Chen, Yuejiao Xie, Xiang Wan, Anningzhe Gao

More specifically, by directly minimizing the distribution gap between the learning LLM policy and the optimal solution of RLHF, we transform the alignment objective into a reward-driven re-weighted supervised fine-tuning (SFT) form, which requires only a minor adjustment to the SFT loss to obtain a noticeable improvement in training stability and effectiveness.

Variational Inference
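The reward-weighted SFT form follows from the closed-form optimum of the KL-regularized RLHF objective, π*(y|x) ∝ π_ref(y|x)·exp(r(x,y)/β): each response's log-likelihood is weighted by its exponentiated reward. A toy numpy sketch of such a re-weighted loss (the batch-level normalization here is an illustrative choice, not the paper's exact formulation):

```python
import numpy as np

def reward_weighted_sft_loss(log_probs, rewards, beta=1.0):
    """Weighted negative log-likelihood: higher-reward responses count more."""
    w = np.exp(rewards / beta)
    w = w / w.sum()                           # normalize weights over the batch
    return -(w * log_probs).sum()

log_probs = np.array([-2.0, -1.5, -3.0])      # per-response log-likelihoods
rewards = np.array([1.0, 0.0, 2.0])           # reward-model scores
loss = reward_weighted_sft_loss(log_probs, rewards)
```

With β → ∞ the weights become uniform and the loss reduces to plain SFT, which is one way to see why only a minor adjustment to the SFT loss is needed.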

Atoxia: Red-teaming Large Language Models with Target Toxic Answers

no code implementations 27 Aug 2024 Yuhao Du, Zhuo Li, Pengyu Cheng, Xiang Wan, Anningzhe Gao

Given a particular harmful answer, Atoxia generates a corresponding user query and a misleading answer opening to examine the internal defects of a given LLM.

Prompt Engineering Red Teaming

Self-playing Adversarial Language Game Enhances LLM Reasoning

1 code implementation 16 Apr 2024 Pengyu Cheng, Tianhao Hu, Han Xu, Zhisong Zhang, Zheng Yuan, Yong Dai, Lei Han, Nan Du, Xiaolong Li

In this game, an attacker and a defender communicate around a target word only visible to the attacker.

On Diversified Preferences of Large Language Model Alignment

1 code implementation 12 Dec 2023 Dun Zeng, Yong Dai, Pengyu Cheng, Longyue Wang, Tianhao Hu, Wanshun Chen, Nan Du, Zenglin Xu

Through experiments on four models and five human preference datasets, we find the calibration error can be adopted as a key metric for evaluating RMs and MORE can obtain superior alignment performance.

Language Modeling Language Modelling +1
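Calibration error for a reward model can be measured like standard expected calibration error: bucket pairwise preference predictions by confidence and compare each bucket's accuracy to its average confidence. A generic sketch (the binning scheme is illustrative; the paper's exact metric may differ):

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Mean |accuracy - confidence| over equal-width confidence bins."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece, n = 0.0, len(confidences)
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.sum() / n * gap       # weight bins by their size
    return ece

conf = np.array([0.9, 0.8, 0.95, 0.6])        # RM confidence: chosen > rejected
hit = np.array([1.0, 1.0, 0.0, 1.0])          # whether the RM matched the label
ece = expected_calibration_error(conf, hit)
```

A perfectly calibrated model scores 0: within every confidence bucket, its accuracy equals its stated confidence.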

Adversarial Preference Optimization: Enhancing Your Alignment via RM-LLM Game

1 code implementation 14 Nov 2023 Pengyu Cheng, Yifan Yang, Jian Li, Yong Dai, Tianhao Hu, Peixin Cao, Nan Du, Xiaolong Li

Targeting more efficient human preference optimization, we propose an Adversarial Preference Optimization (APO) framework, in which the LLM and the reward model update alternately via a min-max game.
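The min-max structure means the two players take turns: one descends its loss, the other ascends. A toy sketch of alternating gradient descent-ascent on a regularized bilinear game, purely to illustrate the update pattern (the scalar payoff and the damping terms are stand-ins, not APO's actual objectives):

```python
def alternating_minmax(steps=500, lr=0.05, reg=0.2):
    """Toy min-max on f = p*r + reg/2*(p^2 - r^2): p minimizes, r maximizes."""
    p, r = 1.0, 1.0
    for _ in range(steps):
        p -= lr * (r + reg * p)   # minimizer's gradient step
        r += lr * (p - reg * r)   # maximizer's step, using the updated p
    return p, r

p, r = alternating_minmax()
# Alternating updates spiral the pair toward the equilibrium (0, 0).
```

The regularization terms play the role that KL penalties typically play in adversarial alignment setups: without them, pure bilinear alternating updates orbit instead of converging.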

Everyone Deserves A Reward: Learning Customized Human Preferences

1 code implementation 6 Sep 2023 Pengyu Cheng, Jiawen Xie, Ke Bai, Yong Dai, Nan Du

In addition, from the perspective of data efficiency, we propose a three-stage customized RM learning scheme, then empirically verify its effectiveness on both general preference datasets and our DSP set.

Diversity Imitation Learning

Chunk, Align, Select: A Simple Long-sequence Processing Method for Transformers

1 code implementation 25 Aug 2023 Jiawen Xie, Pengyu Cheng, Xiao Liang, Yong Dai, Nan Du

Although dominant in natural language processing, transformer-based models remain challenged by the task of long-sequence processing, because the computational cost of self-attention operations in transformers swells quadratically with the input sequence length.

Reading Comprehension Text Summarization
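The quadratic cost is exactly what chunking attacks: if a length-n sequence is split into chunks of size c that attend only within themselves, the attention cost drops from O(n²) to O(n·c). A numpy sketch of block-local attention, illustrating the cost argument only (not the paper's full chunk-align-select pipeline):

```python
import numpy as np

def block_local_attention(q, k, v, chunk=64):
    """Each chunk attends only within itself: O(n*chunk) instead of O(n^2)."""
    n, d = q.shape
    out = np.empty_like(v)
    for s in range(0, n, chunk):
        e = min(s + chunk, n)
        scores = q[s:e] @ k[s:e].T / np.sqrt(d)   # (c, c) block, not (n, n)
        scores -= scores.max(axis=-1, keepdims=True)
        probs = np.exp(scores)
        probs /= probs.sum(axis=-1, keepdims=True)  # softmax within the chunk
        out[s:e] = probs @ v[s:e]
    return out

rng = np.random.default_rng(0)
q = rng.normal(size=(256, 32))
k = rng.normal(size=(256, 32))
v = rng.normal(size=(256, 32))
y = block_local_attention(q, k, v, chunk=64)      # 4 blocks of 64x64 scores
```

With chunk=64 and n=256 this materializes four 64x64 score blocks instead of one 256x256 matrix, a 4x reduction that grows linearly with sequence length.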

Toward Fairness in Text Generation via Mutual Information Minimization based on Importance Sampling

no code implementations 25 Feb 2023 Rui Wang, Pengyu Cheng, Ricardo Henao

To improve the fairness of PLMs in text generation, we propose to minimize the mutual information between the semantics in the generated text sentences and their demographic polarity, i.e., the demographic group to which the sentence is referring.

Fairness Language Modeling +3

Replacing Language Model for Style Transfer

1 code implementation 14 Nov 2022 Pengyu Cheng, Ruineng Li

The new span is generated via a non-autoregressive masked language model, which can better preserve the local-contextual meaning of the replaced token.

Disentanglement Language Modeling +6

Semi-constraint Optimal Transport for Entity Alignment with Dangling Cases

1 code implementation 11 Mar 2022 Shengxuan Luo, Pengyu Cheng, Sheng Yu

To improve EA with dangling entities, we propose an unsupervised method called Semi-constraint Optimal Transport for Entity Alignment in Dangling cases (SoTead).

Entity Alignment Knowledge Graphs +2
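Optimal-transport entity alignment treats the two knowledge graphs' entity embeddings as two distributions and seeks a low-cost transport plan between them; dangling entities are those that should transport nowhere. A generic Sinkhorn sketch for balanced entropic OT with uniform marginals (the semi-constraint relaxation that handles dangling cases is the paper's contribution and is not reproduced here):

```python
import numpy as np

def sinkhorn(cost, eps=0.5, iters=200):
    """Entropic OT between two uniform distributions via Sinkhorn iterations."""
    n, m = cost.shape
    a, b = np.ones(n) / n, np.ones(m) / m     # uniform source/target marginals
    K = np.exp(-cost / eps)                   # Gibbs kernel
    v = np.ones(m)
    for _ in range(iters):
        u = a / (K @ v)                       # scale rows to match a
        v = b / (K.T @ u)                     # scale columns to match b
    return u[:, None] * K * v[None, :]        # transport plan

rng = np.random.default_rng(0)
cost = rng.random((5, 5))                     # e.g., embedding distances
plan = sinkhorn(cost)
```

Each row of the plan says how an entity in one KG distributes its mass over candidate entities in the other; sharp rows indicate confident alignments.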

Speaker Adaption with Intuitive Prosodic Features for Statistical Parametric Speech Synthesis

no code implementations 2 Mar 2022 Pengyu Cheng, ZhenHua Ling

In this paper, we propose a method of speaker adaptation with intuitive prosodic features for statistical parametric speech synthesis.

Speech Synthesis

Improving Zero-shot Voice Style Transfer via Disentangled Representation Learning

1 code implementation ICLR 2021 Siyang Yuan, Pengyu Cheng, Ruiyi Zhang, Weituo Hao, Zhe Gan, Lawrence Carin

Voice style transfer, also called voice conversion, seeks to modify one speaker's voice to generate speech as if it came from another (target) speaker.

Decoder Representation Learning +2

FairFil: Contrastive Neural Debiasing Method for Pretrained Text Encoders

no code implementations ICLR 2021 Pengyu Cheng, Weituo Hao, Siyang Yuan, Shijing Si, Lawrence Carin

Pretrained text encoders, such as BERT, have been applied increasingly in various natural language processing (NLP) tasks, and have recently demonstrated significant performance gains.

Contrastive Learning Fairness +1

WAFFLe: Weight Anonymized Factorization for Federated Learning

no code implementations 13 Aug 2020 Weituo Hao, Nikhil Mehta, Kevin J Liang, Pengyu Cheng, Mostafa El-Khamy, Lawrence Carin

Experiments on MNIST, FashionMNIST, and CIFAR-10 demonstrate WAFFLe's significant improvement to local test performance and fairness while simultaneously providing an extra layer of security.

Fairness Federated Learning

Improving Disentangled Text Representation Learning with Information-Theoretic Guidance

no code implementations ACL 2020 Pengyu Cheng, Martin Renqiang Min, Dinghan Shen, Christopher Malon, Yizhe Zhang, Yitong Li, Lawrence Carin

Learning disentangled representations of natural language is essential for many NLP tasks, e.g., conditional text generation, style transfer, personalized dialogue systems, etc.

Conditional Text Generation Representation Learning +2

Straight-Through Estimator as Projected Wasserstein Gradient Flow

no code implementations 5 Oct 2019 Pengyu Cheng, Chang Liu, Chunyuan Li, Dinghan Shen, Ricardo Henao, Lawrence Carin

The Straight-Through (ST) estimator is a widely used technique for back-propagating gradients through discrete random variables.
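The ST trick: the forward pass applies a discrete operation (e.g., thresholding), while the backward pass replaces its zero-almost-everywhere gradient with that of a smooth surrogate, often the identity or the underlying sigmoid. A minimal numpy sketch with explicit forward/backward functions (autograd frameworks express the same idea as hard + soft - stop_gradient(soft)):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def st_forward(logits):
    """Forward: hard 0/1 decision (non-differentiable)."""
    probs = sigmoid(logits)
    return (probs > 0.5).astype(float), probs

def st_backward(grad_out, probs):
    """Backward: pretend the forward pass was the smooth sigmoid."""
    return grad_out * probs * (1.0 - probs)   # surrogate gradient

logits = np.array([-2.0, 0.5, 3.0])
hard, probs = st_forward(logits)              # hard == [0., 1., 1.]
grad = st_backward(np.ones(3), probs)         # nonzero despite the threshold
```

The true gradient of the threshold is zero almost everywhere, so without the surrogate no learning signal would flow to the logits at all.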

Learning Compressed Sentence Representations for On-Device Text Processing

1 code implementation ACL 2019 Dinghan Shen, Pengyu Cheng, Dhanasekar Sundararaman, Xinyuan Zhang, Qian Yang, Meng Tang, Asli Celikyilmaz, Lawrence Carin

Vector representations of sentences, trained on massive text corpora, are widely used as generic sentence embeddings across a variety of NLP problems.

Retrieval Sentence +1
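One simple way to compress sentence embeddings for on-device use is to binarize them, e.g. by thresholding each dimension at zero, replacing cosine similarity with a cheap bit-matching score. A toy sketch of this generic idea (the paper studies several learned compression strategies; sign-binarization here is just an illustration):

```python
import numpy as np

def binarize(emb):
    """Compress a float embedding to bits: 32x smaller than float32."""
    return (emb > 0).astype(np.uint8)

def hamming_sim(a, b):
    """Similarity = fraction of matching bits."""
    return (a == b).mean()

rng = np.random.default_rng(0)
e1 = rng.normal(size=512)
e2 = e1 + 0.1 * rng.normal(size=512)     # embedding of a near-duplicate sentence
e3 = rng.normal(size=512)                # embedding of an unrelated sentence
b1, b2, b3 = binarize(e1), binarize(e2), binarize(e3)
# Near-duplicates share far more bits than unrelated embeddings.
```

Unrelated random embeddings agree on about half their bits by chance, so the useful signal is how far above 0.5 the match rate rises.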

Understanding and Accelerating Particle-Based Variational Inference

1 code implementation 4 Jul 2018 Chang Liu, Jingwei Zhuo, Pengyu Cheng, Ruiyi Zhang, Jun Zhu, Lawrence Carin

Particle-based variational inference methods (ParVIs) have gained attention in the Bayesian inference literature, for their capacity to yield flexible and accurate approximations.

Bayesian Inference Variational Inference
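The best-known ParVI is Stein variational gradient descent (SVGD): particles follow a kernel-smoothed gradient of the log-density plus a kernel repulsion term that keeps them spread out over the target's mass. A compact numpy sketch targeting a 1-D standard normal (the kernel bandwidth and step size are ad hoc choices):

```python
import numpy as np

def svgd_step(x, grad_logp, lr=0.1, h=1.0):
    """One SVGD update: attractive (gradient) + repulsive (kernel) terms."""
    diff = x[:, None] - x[None, :]                 # pairwise differences
    K = np.exp(-diff**2 / (2 * h))                 # RBF kernel matrix
    attract = K @ grad_logp(x)                     # kernel-smoothed log-density gradient
    repulse = (K * diff / h).sum(axis=1)           # pushes nearby particles apart
    return x + lr * (attract + repulse) / len(x)

grad_logp = lambda x: -x                           # target: standard normal
x = np.linspace(3.0, 5.0, 20)                      # particles start far off-target
for _ in range(300):
    x = svgd_step(x, grad_logp)
# Particles drift toward 0 and spread out to cover the target's mass.
```

Without the repulsion term all particles would collapse onto the mode; with it, SVGD yields a deterministic particle approximation of the whole distribution.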
