Search Results for author: Shuohuan Wang

Found 30 papers, 15 papers with code

Curiosity-Driven Reinforcement Learning from Human Feedback

1 code implementation20 Jan 2025 Haoran Sun, Yekun Chai, Shuohuan Wang, Yu Sun, Hua Wu, Haifeng Wang

Reinforcement learning from human feedback (RLHF) has proven effective in aligning large language models (LLMs) with human preferences, but often at the cost of reduced output diversity.

Diversity Instruction Following +3
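A minimal sketch of one way to read the title above: shape the RLHF reward with a curiosity (novelty) bonus so that alignment does not collapse output diversity. The count-based n-gram bonus, the `beta` weight, and all function names are illustrative assumptions, not the paper's formulation.

```python
# Hedged sketch: combine a preference reward with a simple novelty bonus.
# The count-based n-gram bonus below is a stand-in for whatever curiosity
# signal the paper actually uses; names and weights are illustrative.
from collections import Counter

ngram_counts = Counter()  # running counts over previously generated samples

def novelty_bonus(tokens, n=3):
    """Reward responses whose n-grams have rarely been seen so far."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    bonus = sum(1.0 / (1 + ngram_counts[g]) for g in ngrams) / len(ngrams)
    ngram_counts.update(ngrams)
    return bonus

def shaped_reward(preference_score, tokens, beta=0.1):
    """Total reward = human-preference reward + beta * curiosity bonus."""
    return preference_score + beta * novelty_bonus(tokens)

# Example: a more novel response receives a larger shaping term.
print(shaped_reward(0.8, "the cat sat on the mat".split()))
print(shaped_reward(0.8, "a purple giraffe recited poetry".split()))
```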

Mixture of Hidden-Dimensions Transformer

no code implementations7 Dec 2024 Yilong Chen, Junyuan Shang, Zhengyu Zhang, Jiawei Sheng, Tingwen Liu, Shuohuan Wang, Yu Sun, Hua Wu, Haifeng Wang

MOHD offers a new perspective on model scaling, showcasing the potential of hidden-dimension sparsity to boost efficiency.

MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions

1 code implementation3 Oct 2024 Yekun Chai, Haoran Sun, Huang Fang, Shuohuan Wang, Yu Sun, Hua Wu

However, token-level RLHF suffers from the credit assignment problem over long sequences, where delayed rewards make it challenging for the model to discern which actions contributed to successful outcomes.

Code Generation Dialogue Generation +5
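A minimal sketch of the macro-action idea named in the title, assuming macro actions are simply fixed-length chunks of tokens: discounting a terminal reward over a few dozen macro actions instead of hundreds of tokens keeps credit from vanishing for early decisions. The chunk size and return computation are illustrative, not the paper's estimator.

```python
# Hedged sketch: coarsen token-level credit assignment by treating fixed-size
# chunks of tokens as single "macro actions" when discounting a terminal reward.
def token_level_returns(num_tokens, terminal_reward, gamma=0.99):
    """Discounted return credited to each token position."""
    return [terminal_reward * gamma ** (num_tokens - 1 - t) for t in range(num_tokens)]

def macro_level_returns(num_tokens, terminal_reward, chunk=8, gamma=0.99):
    """Discount per macro action (chunk of tokens), then broadcast to tokens."""
    num_macros = (num_tokens + chunk - 1) // chunk
    macro_returns = [terminal_reward * gamma ** (num_macros - 1 - m) for m in range(num_macros)]
    return [macro_returns[t // chunk] for t in range(num_tokens)]

# With 512 tokens, early tokens receive vanishing credit at token level,
# but substantially more when discounting over ~64 macro actions instead.
print(token_level_returns(512, 1.0)[0])   # ~0.006
print(macro_level_returns(512, 1.0)[0])   # ~0.53
```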

Upcycling Instruction Tuning from Dense to Mixture-of-Experts via Parameter Merging

no code implementations2 Oct 2024 Tingfeng Hui, Zhenyu Zhang, Shuohuan Wang, Yu Sun, Hua Wu, Sen Su

To ensure that each specialized expert in the MoE model works as expected, we select a small amount of seed data at which each expert excels in order to pre-optimize the router.

Diversity
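A minimal sketch of the upcycling recipe described above, under the assumption that each expert starts as a copy of the dense FFN and only the router is pre-optimized on seed examples tagged with a preferred expert; the dimensions, the seed-labelling scheme, and the top-1 routing are illustrative.

```python
# Hedged sketch: "upcycle" a dense FFN into an MoE layer by copying its weights
# into every expert, then pre-optimize only the router on a handful of seed
# examples for which the preferred expert is known.
import torch
import torch.nn as nn

d_model, d_ff, num_experts = 64, 256, 4
dense_ffn = nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))

# 1) Initialize every expert as a copy of the dense FFN (parameter "upcycling").
experts = nn.ModuleList([nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                                        nn.Linear(d_ff, d_model)) for _ in range(num_experts)])
for expert in experts:
    expert.load_state_dict(dense_ffn.state_dict())

# 2) Pre-optimize the router alone on seed data tagged with a target expert.
router = nn.Linear(d_model, num_experts)
seed_x = torch.randn(32, d_model)                     # stand-in seed inputs
seed_expert = torch.randint(0, num_experts, (32,))    # stand-in expert labels
opt = torch.optim.Adam(router.parameters(), lr=1e-3)
for _ in range(100):
    loss = nn.functional.cross_entropy(router(seed_x), seed_expert)
    opt.zero_grad(); loss.backward(); opt.step()

# 3) Top-1 routing at inference: each token goes to its highest-scoring expert.
x = torch.randn(8, d_model)
chosen = router(x).argmax(dim=-1)
y = torch.stack([experts[e](t) for t, e in zip(x, chosen)])
print(y.shape)  # torch.Size([8, 64])
```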

NACL: A General and Effective KV Cache Eviction Framework for LLMs at Inference Time

2 code implementations7 Aug 2024 Yilong Chen, Guoxia Wang, Junyuan Shang, Shiyao Cui, Zhenyu Zhang, Tingwen Liu, Shuohuan Wang, Yu Sun, dianhai yu, Hua Wu

Large Language Models (LLMs) have ignited an innovative surge of AI applications, and models equipped with extended context windows are marking a new era of exciting possibilities.

HFT: Half Fine-Tuning for Large Language Models

no code implementations29 Apr 2024 Tingfeng Hui, Zhenyu Zhang, Shuohuan Wang, Weiran Xu, Yu Sun, Hua Wu

Fine-tuning large language models (LLMs) in one or more phases has become a necessary step to unlock various capabilities, enabling LLMs to follow natural language instructions or align with human preferences.

Continual Learning
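A minimal sketch of one plausible reading of "half fine-tuning", consistent with the Continual Learning tag: freeze roughly half of the parameter tensors so they retain pre-trained behaviour while the rest are fine-tuned. The random tensor-level split below is an assumption, not the paper's selection rule.

```python
# Hedged sketch: freeze roughly half of a model's parameter tensors before
# fine-tuning, so the frozen half retains pre-trained behaviour.
import random
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 128), nn.ReLU(),
                      nn.Linear(128, 2))

params = list(model.named_parameters())
frozen = set(random.sample(range(len(params)), k=len(params) // 2))
for i, (name, p) in enumerate(params):
    p.requires_grad = i not in frozen          # train only the unfrozen half

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable}/{total} parameters")
# The optimizer then only updates the trainable half, e.g.:
# optimizer = torch.optim.AdamW((p for p in model.parameters() if p.requires_grad), lr=1e-5)
```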

Autoregressive Pre-Training on Pixels and Texts

1 code implementation16 Apr 2024 Yekun Chai, Qingyi Liu, Jingwu Xiao, Shuohuan Wang, Yu Sun, Hua Wu

Our extensive evaluation across a wide range of benchmarks shows that incorporating both visual and textual data significantly improves the performance of pixel-based language models.

Language Modeling

On Training Data Influence of GPT Models

2 code implementations11 Apr 2024 Yekun Chai, Qingyi Liu, Shuohuan Wang, Yu Sun, Qiwei Peng, Hua Wu

This paper presents GPTfluence, a novel approach that leverages a featurized simulation to assess the impact of training examples on the training dynamics of GPT models.

Natural Language Understanding
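A heavily simplified sketch of the "featurized simulation" idea: fit a small model that predicts per-step loss movement from features of the training example used at that step, then replay the fit to estimate a single example's influence. The linear simulator, synthetic features, and counterfactual "removal" below are toy assumptions, not GPTfluence's parameterization.

```python
# Hedged sketch: a toy featurized simulator of training dynamics.
import numpy as np

rng = np.random.default_rng(0)
steps, feat_dim = 200, 8
train_feats = rng.normal(size=(steps, feat_dim))        # features of the example seen at each step
true_effect = rng.normal(size=feat_dim) * 0.05          # hidden "influence" direction
observed_drops = train_feats @ true_effect + rng.normal(scale=0.01, size=steps)

# Fit the simulator: per-step loss drop ~ <features, theta> (least squares).
theta, *_ = np.linalg.lstsq(train_feats, observed_drops, rcond=None)

# Simulate a counterfactual run that skips one training example, to estimate
# that example's influence on the final test loss.
loss0 = 3.0
full_run = loss0 - np.cumsum(train_feats @ theta)
without_k = train_feats.copy(); without_k[42] = 0.0     # "remove" example 42
counterfactual = loss0 - np.cumsum(without_k @ theta)
print("estimated influence of example 42:", full_run[-1] - counterfactual[-1])
```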

Tool-Augmented Reward Modeling

1 code implementation2 Oct 2023 Lei LI, Yekun Chai, Shuohuan Wang, Yu Sun, Hao Tian, Ningyu Zhang, Hua Wu

We validate our approach across a wide range of domains, incorporating seven distinct external tools.

TruthfulQA
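A minimal sketch of a reward model that consults an external tool before scoring, here with a toy calculator as the tool; the tool interface, the trigger heuristic, and the scoring rule are all illustrative stand-ins rather than the paper's design.

```python
# Hedged sketch: a reward function that may call an external tool before scoring.
import re

def calculator_tool(expression: str) -> str:
    """Toy external tool: evaluate a simple arithmetic expression."""
    if re.fullmatch(r"[\d\s+\-*/().]+", expression):
        return str(eval(expression))            # safe enough for this restricted charset
    return "unsupported"

def tool_augmented_reward(question: str, response: str) -> float:
    """Score a response, calling the tool when the question looks numerical."""
    match = re.search(r"[\d\s+\-*/().]{3,}", question)
    if match:
        tool_answer = calculator_tool(match.group().strip())
        return 1.0 if tool_answer in response else -1.0   # tool-verified reward
    return 0.0                                            # would fall back to a learned scorer

print(tool_augmented_reward("What is 17 * 24?", "17 * 24 = 408"))   #  1.0
print(tool_augmented_reward("What is 17 * 24?", "It is 398."))      # -1.0
```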

ERNIE-Music: Text-to-Waveform Music Generation with Diffusion Models

no code implementations9 Feb 2023 Pengfei Zhu, Chao Pang, Yekun Chai, Lei LI, Shuohuan Wang, Yu Sun, Hao Tian, Hua Wu

To address this gap, this paper introduces a text-to-waveform music generation model built on diffusion models.

Diversity Music Generation +1
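A generic DDPM-style sampling loop over a raw waveform, conditioned on a stand-in text embedding, to illustrate what text-to-waveform generation with diffusion models involves; the stub denoiser, the linear noise schedule, and the sample rate are assumptions, and nothing here reproduces ERNIE-Music's architecture.

```python
# Hedged sketch: generic conditional DDPM sampling over a raw waveform.
import torch

T, wav_len = 50, 16000                        # diffusion steps, 1 s at 16 kHz
betas = torch.linspace(1e-4, 2e-2, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

def denoiser(x_t, t, text_emb):
    """Stub for a text-conditioned noise-prediction network eps_theta."""
    return 0.1 * x_t + 0.0 * text_emb.mean()  # placeholder, not a trained model

text_emb = torch.randn(256)                   # stand-in text encoder output
x = torch.randn(wav_len)                      # start from pure Gaussian noise
for t in reversed(range(T)):
    eps = denoiser(x, t, text_emb)
    coef = (1 - alphas[t]) / torch.sqrt(1 - alpha_bars[t])
    x = (x - coef * eps) / torch.sqrt(alphas[t])
    if t > 0:                                 # add noise except at the last step
        x = x + torch.sqrt(betas[t]) * torch.randn_like(x)

print(x.shape)                                # torch.Size([16000]) "waveform"
```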

ERNIE-Code: Beyond English-Centric Cross-lingual Pretraining for Programming Languages

1 code implementation13 Dec 2022 Yekun Chai, Shuohuan Wang, Chao Pang, Yu Sun, Hao Tian, Hua Wu

Extensive results show that ERNIE-Code outperforms previous multilingual LLMs for PL or NL across a wide range of end tasks of code intelligence, including multilingual code-to-text, text-to-code, code-to-code, and text-to-text generation.

Code Summarization Language Modeling +3

X-PuDu at SemEval-2022 Task 6: Multilingual Learning for English and Arabic Sarcasm Detection

no code implementations SemEval (NAACL) 2022 Yaqian Han, Yekun Chai, Shuohuan Wang, Yu Sun, Hongyi Huang, Guanghao Chen, Yitong Xu, Yang Yang

Detecting sarcasm and verbal irony in people's subjective statements is crucial to understanding their intended meanings, real sentiments, and positions in social scenarios.

Multi-Label Classification +3

ERNIE-UniX2: A Unified Cross-lingual Cross-modal Framework for Understanding and Generation

no code implementations9 Nov 2022 Bin Shan, Yaqian Han, Weichong Yin, Shuohuan Wang, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

Recent cross-lingual cross-modal works attempt to extend Vision-Language Pre-training (VLP) models to non-English inputs and achieve impressive performance.

Contrastive Learning Decoder +6

ERNIE-SAT: Speech and Text Joint Pretraining for Cross-Lingual Multi-Speaker Text-to-Speech

2 code implementations7 Nov 2022 Xiaoran Fan, Chao Pang, Tian Yuan, He Bai, Renjie Zheng, Pengfei Zhu, Shuohuan Wang, Junkun Chen, Zeyu Chen, Liang Huang, Yu Sun, Hua Wu

In this paper, we extend the pretraining method for cross-lingual multi-speaker speech synthesis tasks, including cross-lingual multi-speaker voice cloning and cross-lingual multi-speaker speech editing.

Representation Learning Speech Representation Learning +4

Clip-Tuning: Towards Derivative-free Prompt Learning with a Mixture of Rewards

no code implementations21 Oct 2022 Yekun Chai, Shuohuan Wang, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

Derivative-free prompt learning has emerged as a lightweight alternative to prompt tuning; it requires only model inference to optimize the prompts.
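A minimal sketch of derivative-free prompt learning in general: a random-search loop that optimizes a continuous prompt using only a black-box score obtained from model inference. The toy scorer and search hyperparameters are assumptions, and Clip-Tuning's mixture of rewards over thinned sub-networks is not reproduced here.

```python
# Hedged sketch: gradient-free (random-search) optimization of a continuous prompt.
import numpy as np

rng = np.random.default_rng(0)
target = rng.normal(size=32)                  # pretend this prompt direction is optimal

def black_box_score(prompt_vec):
    """Stand-in for running the frozen LM and measuring task reward/accuracy."""
    return -np.sum((prompt_vec - target) ** 2)

prompt = np.zeros(32)                         # continuous prompt to be optimized
best = black_box_score(prompt)
for step in range(500):
    candidate = prompt + rng.normal(scale=0.1, size=32)   # perturb, forward pass only
    score = black_box_score(candidate)
    if score > best:                          # keep the candidate if it scores better
        prompt, best = candidate, score

print("final score:", round(best, 4))
```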

Nebula-I: A General Framework for Collaboratively Training Deep Learning Models on Low-Bandwidth Cloud Clusters

1 code implementation19 May 2022 Yang Xiang, Zhihua Wu, Weibao Gong, Siyu Ding, Xianjie Mo, Yuang Liu, Shuohuan Wang, Peng Liu, Yongshuai Hou, Long Li, Bin Wang, Shaohuai Shi, Yaqian Han, Yue Yu, Ge Li, Yu Sun, Yanjun Ma, dianhai yu

We took natural language processing (NLP) as an example to show how Nebula-I works in different training phases that include: a) pre-training a multilingual language model using two remote clusters; and b) fine-tuning a machine translation model using knowledge distilled from pre-trained models, which run through the most popular paradigm of recent deep learning.

Cross-Lingual Natural Language Inference Deep Learning +3

ERNIE-M: Enhanced Multilingual Representation by Aligning Cross-lingual Semantics with Monolingual Corpora

2 code implementations EMNLP 2021 Xuan Ouyang, Shuohuan Wang, Chao Pang, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

In this paper, we propose ERNIE-M, a new training method that encourages the model to align the representation of multiple languages with monolingual corpora, to overcome the constraint that the parallel corpus size places on the model performance.

Sentence Translation

ERNIE-Doc: A Retrospective Long-Document Modeling Transformer

3 code implementations ACL 2021 Siyu Ding, Junyuan Shang, Shuohuan Wang, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

Transformers are not suited for processing long documents, due to their quadratically increasing memory and time consumption.

Language Modeling +3
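A back-of-the-envelope illustration of the quadratic cost mentioned above: the size of a single attention score matrix as the sequence length grows (assuming fp16 scores, one head, one layer).

```python
# Hedged arithmetic sketch: memory for one L x L attention score matrix.
for seq_len in (512, 4096, 32768):
    scores = seq_len * seq_len * 2            # bytes for an L x L fp16 matrix
    print(f"L={seq_len:>6}: {scores / 2**20:>8.1f} MiB per head per layer")
```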

ERNIE at SemEval-2020 Task 10: Learning Word Emphasis Selection by Pre-trained Language Model

no code implementations SEMEVAL 2020 Zhengjie Huang, Shikun Feng, Weiyue Su, Xuyi Chen, Shuohuan Wang, Jiaxiang Liu, Xuan Ouyang, Yu Sun

This paper describes the system designed by the ERNIE Team, which achieved first place in SemEval-2020 Task 10: Emphasis Selection For Written Text in Visual Media.

Data Augmentation Feature Engineering +4

ERNIE 2.0: A Continual Pre-training Framework for Language Understanding

3 code implementations29 Jul 2019 Yu Sun, Shuohuan Wang, Yukun Li, Shikun Feng, Hao Tian, Hua Wu, Haifeng Wang

Recently, pre-trained models have achieved state-of-the-art results in various language understanding tasks, which indicates that pre-training on large-scale corpora may play a crucial role in natural language processing.

Chinese Named Entity Recognition Chinese Reading Comprehension +8

OleNet at SemEval-2019 Task 9: BERT based Multi-Perspective Models for Suggestion Mining

no code implementations SEMEVAL 2019 Jiaxiang Liu, Shuohuan Wang, Yu Sun

This paper describes our system that participated in Task 9 of SemEval-2019: the task focuses on suggestion mining and aims to classify given sentences into suggestion and non-suggestion classes in domain-specific and cross-domain training settings, respectively.

Sentence Suggestion mining
