Search Results for author: Jiayi Fu

Found 7 papers, 3 papers with code

Iterative Predictor-Critic Code Decoding for Real-World Image Dehazing

no code implementations • 17 Mar 2025 • Jiayi Fu, Siyu Liu, Zikun Liu, Chun-Le Guo, Hyunhee Park, Ruiqi Wu, Guoqing Wang, Chongyi Li

Unlike previous codebook-based methods, which rely on one-shot decoding, our method uses the high-quality codes obtained in the previous iteration to guide the Code-Predictor's prediction in the subsequent iteration, improving code-prediction accuracy and ensuring stable dehazing performance. A hedged sketch of this iterative loop follows the task tag below.

Image Dehazing
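
The iterative loop described in the abstract can be sketched roughly as follows; `predictor`, `critic`, and `codebook` stand in for the paper's Code-Predictor, Code-Critic, and VQ codebook, and their names, shapes, and the keep-ratio policy are illustrative assumptions, not the authors' implementation:

```python
import torch

def iterative_code_decoding(features, predictor, critic, codebook,
                            n_iters=4, keep_ratio=0.5):
    # A minimal sketch, assuming a predictor that emits per-position
    # logits over K codebook entries and a critic that scores each
    # predicted code's reliability.
    trusted_codes = None  # nothing to condition on in the first pass
    for _ in range(n_iters):
        # Predict one codebook index per spatial location, conditioned
        # on the codes accepted in the previous iteration.
        logits = predictor(features, trusted_codes)   # (B, HW, K)
        codes = logits.argmax(dim=-1)                 # (B, HW)

        # Score how reliable each predicted code looks.
        scores = critic(features, codes)              # (B, HW)

        # Keep only the highest-confidence codes to guide the next pass;
        # the rest are marked -1 ("unknown") and re-predicted.
        k = max(1, int(scores.shape[-1] * keep_ratio))
        mask = torch.zeros_like(scores, dtype=torch.bool)
        mask.scatter_(-1, scores.topk(k, dim=-1).indices, True)
        trusted_codes = torch.where(mask, codes, torch.full_like(codes, -1))

    # Look up the final indices to obtain the decoded features.
    return codebook[codes]                            # (B, HW, D)
```

Re-predicting only the low-confidence positions each round gives later iterations reliable context to condition on, in contrast to committing to a single one-shot decode.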

Reward Shaping to Mitigate Reward Hacking in RLHF

1 code implementation • 26 Feb 2025 • Jiayi Fu, Xuandong Zhao, Chengyuan Yao, Heng Wang, Qi Han, Yanghua Xiao

Reinforcement Learning from Human Feedback (RLHF) is essential for aligning large language models (LLMs) with human values.
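
The title's central idea, shaping the reward signal so it is harder to game, can be illustrated with a generic bounded transform. This is a hedged sketch of one common shaping strategy, not the paper's specific method:

```python
import torch

def shaped_reward(raw_reward, reference_reward, temperature=1.0):
    # Illustrative shaping transform (an assumption, not necessarily the
    # paper's formulation): bound the reward model's raw score with a
    # sigmoid centered on a reference score, e.g. the reward assigned to
    # a reference policy's own response.
    # Past the reference point, additional raw reward earns rapidly
    # diminishing shaped reward, weakening the incentive to over-optimize
    # ("hack") the reward model during RL fine-tuning.
    return torch.sigmoid((raw_reward - reference_reward) / temperature)
```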

GumbelSoft: Diversified Language Model Watermarking via the GumbelMax-trick

1 code implementation • 20 Feb 2024 • Jiayi Fu, Xuandong Zhao, Ruihan Yang, Yuansen Zhang, Jiangjie Chen, Yanghua Xiao

Large language models (LLMs) excel at generating human-like text, but they also raise concerns about misuse in fake news and academic dishonesty. A sketch of the GumbelMax sampling trick named in the title follows the task tags below.

Diversity Language Modeling +1
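
The GumbelMax trick in the title exploits the identity that argmax(logits + Gumbel noise) is an exact sample from softmax(logits); seeding the noise with a keyed hash of recent context makes the draw reproducible for a detector. The seeding scheme below is an assumed illustration, not GumbelSoft's exact keying:

```python
import hashlib
import numpy as np

def gumbelmax_watermark_sample(logits, context_ids, key):
    # Hedged sketch of GumbelMax-trick watermarked sampling; the keyed
    # hash of the last few tokens is an illustrative assumption.
    seed_material = key + b"|" + b",".join(str(t).encode() for t in context_ids[-4:])
    seed = int.from_bytes(hashlib.sha256(seed_material).digest()[:8], "big")

    # Pseudo-random uniforms that a detector holding the key can
    # regenerate from the same context.
    u = np.random.default_rng(seed).random(len(logits))
    u = np.clip(u, 1e-12, 1.0 - 1e-12)

    # GumbelMax trick: argmax(logits + Gumbel noise) samples exactly from
    # softmax(logits), so output quality is preserved while the keyed
    # noise leaves a detectable statistical trace.
    gumbel = -np.log(-np.log(u))
    return int(np.argmax(logits + gumbel))
```

A detector with the key recomputes the uniforms at each position and checks whether the emitted tokens sit at suspiciously high noise values.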

Just Ask One More Time! Self-Agreement Improves Reasoning of Language Models in (Almost) All Scenarios

no code implementations • 14 Nov 2023 • Lei Lin, Jiayi Fu, Pengli Liu, Qingyang Li, Yan Gong, Junchen Wan, Fuzheng Zhang, Zhongyuan Wang, Di Zhang, Kun Gai

Although chain-of-thought (CoT) prompting combined with language models has achieved encouraging results on complex reasoning tasks, the naive greedy decoding used in CoT prompting often causes repetitiveness and local optimality. A sketch of the sample-then-agree pattern follows the task tags below.

Decoder +1
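
The sample-then-aggregate pattern such methods share can be sketched as follows; `generate` and `extract_answer` are hypothetical helpers, not the paper's code:

```python
from collections import Counter

def sample_then_agree(generate, extract_answer, prompt,
                      n_samples=10, temperature=0.7):
    # Temperature sampling yields diverse reasoning paths, avoiding the
    # repetitiveness and local optimality of a single greedy decode.
    answers = [extract_answer(generate(prompt, temperature=temperature))
               for _ in range(n_samples)]
    # Aggregate with a plain majority vote; per its title, the paper's
    # Self-Agreement instead asks the model one more time to select the
    # most agreed-upon answer, which also works for free-form answers.
    return Counter(answers).most_common(1)[0][0]
```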

KwaiYiiMath: Technical Report

no code implementations • 11 Oct 2023 • Jiayi Fu, Lei Lin, Xiaoyang Gao, Pengli Liu, Zhengzong Chen, Zhirui Yang, ShengNan Zhang, Xue Zheng, Yan Li, Yuliang Liu, Xucheng Ye, Yiqiao Liao, Chao Liao, Bin Chen, Chengru Song, Junchen Wan, Zijia Lin, Fuzheng Zhang, Zhongyuan Wang, Di Zhang, Kun Gai

Recent advancements in large language models (LLMs) have demonstrated remarkable abilities in handling a variety of natural language processing (NLP) downstream tasks, even on mathematical tasks requiring multi-step reasoning.

Ranked #97 on Arithmetic Reasoning on GSM8K (using extra training data)

Arithmetic Reasoning GSM8K +1
