Search Results for author: Yekun Chai

Found 22 papers, 11 papers with code

Counter-Contrastive Learning for Language GANs

no code implementations • Findings (EMNLP) 2021 • Yekun Chai, Haidong Zhang, Qiyue Yin, Junge Zhang

Generative Adversarial Networks (GANs) have achieved great success in image synthesis, but have proven difficult to apply to natural language generation.

Contrastive Learning • Image Generation

Curiosity-Driven Reinforcement Learning from Human Feedback

1 code implementation • 20 Jan 2025 • Haoran Sun, Yekun Chai, Shuohuan Wang, Yu Sun, Hua Wu, Haifeng Wang

Reinforcement learning from human feedback (RLHF) has proven effective in aligning large language models (LLMs) with human preferences, but often at the cost of reduced output diversity.

Diversity • Instruction Following • +3

MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions

1 code implementation • 3 Oct 2024 • Yekun Chai, Haoran Sun, Huang Fang, Shuohuan Wang, Yu Sun, Hua Wu

Token-level RLHF suffers from the credit assignment problem over long sequences: delayed rewards make it hard for the model to discern which actions contributed to successful outcomes.

Code Generation • Dialogue Generation • +5
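
The macro-action remedy can be sketched in a few lines: group consecutive tokens into spans and assign credit at the span level rather than per token. This is a minimal illustration only; the fixed span size and function name are assumptions for the sketch, not the paper's actual implementation (a fixed-length span is just one possible segmentation).

    import torch
    import torch.nn.functional as F

    def macro_action_credit(token_log_probs, token_rewards, span=5):
        """Aggregate per-token quantities into fixed-length macro actions.

        token_log_probs, token_rewards: 1-D tensors of length seq_len.
        Returns per-macro-action log-probs and rewards, so that credit
        is assigned over spans of tokens instead of individual tokens.
        """
        seq_len = token_log_probs.shape[0]
        n_macros = -(-seq_len // span)  # ceiling division
        pad = n_macros * span - seq_len
        lp = F.pad(token_log_probs, (0, pad)).view(n_macros, span)
        rw = F.pad(token_rewards, (0, pad)).view(n_macros, span)
        return lp.sum(dim=1), rw.sum(dim=1)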

Tokenization Falling Short: On Subword Robustness in Large Language Models

1 code implementation • 17 Jun 2024 • Yekun Chai, Yewei Fang, Qiwei Peng, Xuhong Li

Our findings reveal that scaling model parameters can mitigate tokenization issues; however, LLMs still suffer from biases induced by typos and other text-format variations.
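
As an aside (not from the paper), the sensitivity to typos is easy to observe with any subword tokenizer; the snippet below assumes the tiktoken package and shows how a misspelling changes a word's subword segmentation:

    import tiktoken  # assumption: the tiktoken package is installed

    enc = tiktoken.get_encoding("cl100k_base")
    for word in ["tokenization", "tokeniaztion"]:  # correct vs. typo'd spelling
        ids = enc.encode(word)
        print(f"{word!r} -> {len(ids)} tokens: {ids}")
    # The typo'd form typically fragments into more, rarer subwords,
    # one source of the typo-induced biases the paper measures.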

Autoregressive Pre-Training on Pixels and Texts

1 code implementation • 16 Apr 2024 • Yekun Chai, Qingyi Liu, Jingwu Xiao, Shuohuan Wang, Yu Sun, Hua Wu

Our extensive evaluation across a wide range of benchmarks shows that incorporating both visual and textual data significantly improves the performance of pixel-based language models.

Language Modeling • Language Modelling

On Training Data Influence of GPT Models

2 code implementations • 11 Apr 2024 • Yekun Chai, Qingyi Liu, Shuohuan Wang, Yu Sun, Qiwei Peng, Hua Wu

This paper presents GPTfluence, a novel approach that leverages a featurized simulation to assess the impact of training examples on the training dynamics of GPT models.

Natural Language Understanding

HumanEval-XL: A Multilingual Code Generation Benchmark for Cross-lingual Natural Language Generalization

1 code implementation • 26 Feb 2024 • Qiwei Peng, Yekun Chai, Xuhong Li

Existing benchmarks have overlooked the vast landscape of massively multilingual natural language (NL) to multilingual code generation, leaving a critical gap in the evaluation of multilingual LLMs.

Code Generation • HumanEval

Tool-Augmented Reward Modeling

1 code implementation • 2 Oct 2023 • Lei Li, Yekun Chai, Shuohuan Wang, Yu Sun, Hao Tian, Ningyu Zhang, Hua Wu

We validate our approach across a wide range of domains, incorporating seven distinct external tools.

TruthfulQA

Improved Training of Mixture-of-Experts Language GANs

no code implementations • 23 Feb 2023 • Yekun Chai, Qiyue Yin, Junge Zhang

In this work, we (1) empirically show that a mixture-of-experts approach enhances the representation capacity of the generator in language GANs, and (2) harness the Feature Statistics Alignment (FSA) paradigm to provide fine-grained learning signals that advance generator training.

Adversarial Text • Image Generation • +1
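
A rough sketch of what a feature-statistics-alignment signal can look like follows; it assumes FSA matches the batch mean and standard deviation of features (e.g., from the discriminator's hidden layer) between real and generated text, which may differ from the paper's exact formulation:

    import torch

    def fsa_loss(real_feats, fake_feats):
        """Align first- and second-order feature statistics (mean, std)
        between real and generated batches of shape (batch, feat_dim)."""
        mean_gap = (real_feats.mean(dim=0) - fake_feats.mean(dim=0)).pow(2).sum()
        std_gap = (real_feats.std(dim=0) - fake_feats.std(dim=0)).pow(2).sum()
        # The generator minimizes this as a fine-grained signal, rather
        # than relying only on a coarse binary real/fake loss.
        return mean_gap + std_gap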

ERNIE-Music: Text-to-Waveform Music Generation with Diffusion Models

no code implementations • 9 Feb 2023 • Pengfei Zhu, Chao Pang, Yekun Chai, Lei Li, Shuohuan Wang, Yu Sun, Hao Tian, Hua Wu

To fill this gap, this paper introduces a text-to-waveform music generation model built on diffusion models.

Diversity • Music Generation • +1

ERNIE-Code: Beyond English-Centric Cross-lingual Pretraining for Programming Languages

1 code implementation • 13 Dec 2022 • Yekun Chai, Shuohuan Wang, Chao Pang, Yu Sun, Hao Tian, Hua Wu

Extensive results show that ERNIE-Code outperforms previous multilingual LLMs for PL or NL across a wide range of end tasks of code intelligence, including multilingual code-to-text, text-to-code, code-to-code, and text-to-text generation.

Code Summarization • Language Modeling • +3

X-PuDu at SemEval-2022 Task 6: Multilingual Learning for English and Arabic Sarcasm Detection

no code implementations • SemEval (NAACL) 2022 • Yaqian Han, Yekun Chai, Shuohuan Wang, Yu Sun, Hongyi Huang, Guanghao Chen, Yitong Xu, Yang Yang

Detecting sarcasm and verbal irony in people's subjective statements is crucial to understanding their intended meanings, true sentiments, and positions in social scenarios.

Multi-Label Classification • +3

Clip-Tuning: Towards Derivative-free Prompt Learning with a Mixture of Rewards

no code implementations • 21 Oct 2022 • Yekun Chai, Shuohuan Wang, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

Derivative-free prompt learning has emerged as a lightweight alternative to prompt tuning, which only requires model inference to optimize the prompts.
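
To make "optimization by inference alone" concrete, here is a generic derivative-free search loop over a continuous prompt. This is plain random search under a hypothetical black-box score_fn, shown only to illustrate the setting; it is not Clip-Tuning's mixture-of-rewards method:

    import torch

    def derivative_free_prompt_search(score_fn, prompt_dim, n_iters=100, sigma=0.1):
        """Optimize a continuous prompt using only forward passes.

        score_fn: hypothetical black-box reward, e.g. dev-set accuracy
        obtained by running the frozen model with the candidate prompt.
        """
        prompt = torch.zeros(prompt_dim)
        best = score_fn(prompt)
        for _ in range(n_iters):
            candidate = prompt + sigma * torch.randn(prompt_dim)  # gradient-free perturbation
            score = score_fn(candidate)
            if score > best:  # keep perturbations that improve the reward
                prompt, best = candidate, score
        return prompt, best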

RefineCap: Concept-Aware Refinement for Image Captioning

no code implementations • 8 Sep 2021 • Yekun Chai, Shuo Jin, Junliang Xing

Automatically translating images to texts involves image scene understanding and language modeling.

Decoder • Descriptive • +5

Improving Sequence Generative Adversarial Networks with Feature Statistics Alignment

no code implementations • 1 Jan 2021 • Yekun Chai, Qiyue Yin, Junge Zhang

Generative Adversarial Networks (GANs) face serious challenges in synthesizing sequences of discrete elements, including mode dropping and unstable training.

Binary Classification

Neural Text Classification by Jointly Learning to Cluster and Align

no code implementations • 24 Nov 2020 • Yekun Chai, Haidong Zhang, Shuo Jin

Distributional text clustering delivers semantically informative representations and captures the relevance between each word and semantic clustering centroids.

Clustering • General Classification • +4

Highway Transformer: Self-Gating Enhanced Self-Attentive Networks

1 code implementation • ACL 2020 • Yekun Chai, Shuo Jin, Xinwen Hou

Self-attention mechanisms have made striking state-of-the-art (SOTA) progress in various sequence learning tasks, built on multi-headed dot-product attention that attends to all global contexts at different locations.
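
The mechanism the snippet refers to is standard scaled dot-product attention, sketched below; the paper's self-gating addition is not shown:

    import torch
    import torch.nn.functional as F

    def dot_product_attention(q, k, v):
        """q, k, v: (batch, heads, seq_len, head_dim).
        Every position attends to all positions, weighted by
        softmax-normalized query-key similarity."""
        scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)
        return F.softmax(scores, dim=-1) @ v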

How to Evaluate Word Representations of Informal Domain?

1 code implementation • 12 Nov 2019 • Yekun Chai, Naomi Saphra, Adam Lopez

Diverse word representations have surged in popularity across state-of-the-art natural language processing (NLP) applications.

Word Embeddings
