no code implementations • 7 Apr 2024 • Bowen Qin, Duanyu Feng, Xi Yang
Reinforcement Learning from Human Feedback (RLHF) is a widely used framework for training language models.
no code implementations • 6 Apr 2024 • Duanyu Feng, Bowen Qin, Chen Huang, Zheng Zhang, Wenqiang Lei
Direct Preference Optimization (DPO), which derives reward signals directly from pairwise preference data, has proven effective at aligning Large Language Models (LLMs) with human preferences.
2 code implementations • 20 Feb 2024 • Qianqian Xie, Weiguang Han, Zhengyu Chen, Ruoyu Xiang, Xiao Zhang, Yueru He, Mengxi Xiao, Dong Li, Yongfu Dai, Duanyu Feng, Yijing Xu, Haoqiang Kang, Ziyan Kuang, Chenhan Yuan, Kailai Yang, Zheheng Luo, Tianlin Zhang, Zhiwei Liu, Guojun Xiong, Zhiyang Deng, Yuechen Jiang, Zhiyuan Yao, Haohang Li, Yangyang Yu, Gang Hu, Jiajia Huang, Xiao-Yang Liu, Alejandro Lopez-Lira, Benyou Wang, Yanzhao Lai, Hao Wang, Min Peng, Sophia Ananiadou, Jimin Huang
This, along with the rapid development of LLMs, highlights the urgent need for a systematic financial evaluation benchmark for LLMs.
2 code implementations • 12 Feb 2024 • Xiao Zhang, Ruoyu Xiang, Chenhan Yuan, Duanyu Feng, Weiguang Han, Alejandro Lopez-Lira, Xiao-Yang Liu, Sophia Ananiadou, Min Peng, Jimin Huang, Qianqian Xie
We evaluate our model and existing LLMs using FLARE-ES, the first comprehensive bilingual evaluation benchmark with 21 datasets covering 9 tasks.
1 code implementation • 23 Jan 2024 • Chen Huang, Duanyu Feng, Wenqiang Lei, Jiancheng Lv
Motivated by this, we develop a time-efficient approach called DREditor to edit the matching rule of an off-the-shelf dense retrieval model to suit a specific domain.
1 code implementation • 9 Oct 2023 • Yongfu Dai, Duanyu Feng, Jimin Huang, Haochen Jia, Qianqian Xie, Yifang Zhang, Weiguang Han, Wei Tian, Hao Wang
Through automated evaluation of current general and legal domain LLMs on our benchmark, we show that these LLMs may not align with the logic of legal practice.
1 code implementation • 1 Oct 2023 • Duanyu Feng, Yongfu Dai, Jimin Huang, Yifang Zhang, Qianqian Xie, Weiguang Han, Zhengyu Chen, Alejandro Lopez-Lira, Hao Wang
We then propose the first Credit and Risk Assessment Large Language Model (CALM) by instruction tuning, tailored to the nuanced demands of various financial risk assessment tasks.