Search Results for author: Duanyu Feng

Found 13 papers, 6 papers with code

A Hybrid Loss Framework for Decomposition-based Time Series Forecasting Methods: Balancing Global and Component Errors

no code implementations18 Nov 2024 Ronghui Han, Duanyu Feng, Hongyu Du, Hao Wang

To investigate this, we conduct a study of the impact of the overall loss on existing time series methods that use sequence decomposition.

Time Series · Time Series Forecasting
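No implementation is listed for this entry; purely as an illustration of the idea in the title above, the sketch below combines a global forecasting error with per-component errors from an assumed trend/seasonal decomposition. The weighting scheme, component names, and the `hybrid_loss` helper are hypothetical, not the authors' actual loss.

```python
# Hypothetical sketch: weight a global forecast error against per-component
# errors (e.g. trend and seasonal parts from a sequence decomposition).
import numpy as np

def mse(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.mean((a - b) ** 2))

def hybrid_loss(y_pred, y_true, comp_pred, comp_true, alpha=0.5):
    """comp_pred / comp_true map component names (e.g. 'trend', 'seasonal')
    to arrays; alpha balances global vs. component errors (both assumed)."""
    global_err = mse(y_pred, y_true)
    comp_err = float(np.mean([mse(comp_pred[k], comp_true[k]) for k in comp_true]))
    return alpha * global_err + (1.0 - alpha) * comp_err
```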

HARMONIC: Harnessing LLMs for Tabular Data Synthesis and Privacy Protection

no code implementations6 Aug 2024 Yuxin Wang, Duanyu Feng, Yongfu Dai, Zhengyu Chen, Jimin Huang, Sophia Ananiadou, Qianqian Xie, Hao Wang

In this paper, we take a step forward in exploring LLMs for tabular data synthesis and privacy protection by introducing HARMONIC, a new framework for tabular data generation and evaluation.

Privacy Preserving · Synthetic Data Generation +1

Legend: Leveraging Representation Engineering to Annotate Safety Margin for Preference Datasets

1 code implementation12 Jun 2024 Duanyu Feng, Bowen Qin, Chen Huang, Youcheng Huang, Zheng Zhang, Wenqiang Lei

By leveraging this safety direction, Legend then uses the semantic distances of paired responses along this direction to annotate margins automatically.
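The linked code is the authoritative implementation; as a minimal, hypothetical illustration of the sentence above, the sketch below annotates a margin as the signed distance between two response embeddings projected onto a given safety direction. The embedding source, how the direction is obtained, and the function name are assumptions.

```python
# Hypothetical sketch: margin = signed distance between the chosen and
# rejected response embeddings along a (unit-normalized) safety direction.
import numpy as np

def annotate_margin(emb_chosen: np.ndarray,
                    emb_rejected: np.ndarray,
                    safety_direction: np.ndarray) -> float:
    d = safety_direction / np.linalg.norm(safety_direction)
    return float(np.dot(emb_chosen - emb_rejected, d))

# Toy usage with random vectors standing in for real response embeddings:
rng = np.random.default_rng(0)
margin = annotate_margin(rng.normal(size=768), rng.normal(size=768),
                         rng.normal(size=768))
```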

Dishonesty in Helpful and Harmless Alignment

no code implementations4 Jun 2024 Youcheng Huang, Jingkun Tang, Duanyu Feng, Zheng Zhang, Wenqiang Lei, Jiancheng Lv, Anthony G. Cohn

We find that this also induces dishonesty in helpful and harmless alignment, where LLMs tell lies when generating harmless responses.

Towards Understanding the Influence of Reward Margin on Preference Model Performance

no code implementations7 Apr 2024 Bowen Qin, Duanyu Feng, Xi Yang

Reinforcement Learning from Human Feedback (RLHF) is a widely used framework for training language models.

Language Modelling
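For background on what a reward margin is in this setting (not necessarily the exact objective studied in the paper): a common Bradley-Terry-style preference loss asks the reward model to score the chosen response above the rejected one by at least a margin, as in the PyTorch sketch below.

```python
# Standard margin-augmented pairwise preference loss: the reward model is
# penalized unless r_chosen exceeds r_rejected by at least `margin`.
import torch
import torch.nn.functional as F

def preference_loss(r_chosen: torch.Tensor,
                    r_rejected: torch.Tensor,
                    margin: torch.Tensor) -> torch.Tensor:
    return -F.logsigmoid(r_chosen - r_rejected - margin).mean()

# Toy usage with scalar rewards for a batch of four preference pairs:
loss = preference_loss(torch.tensor([1.2, 0.3, 0.8, 2.0]),
                       torch.tensor([0.5, 0.1, 1.0, 0.0]),
                       torch.tensor([0.2, 0.2, 0.2, 0.2]))
```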

Towards Analyzing and Understanding the Limitations of DPO: A Theoretical Perspective

no code implementations6 Apr 2024 Duanyu Feng, Bowen Qin, Chen Huang, Zheng Zhang, Wenqiang Lei

Direct Preference Optimization (DPO), which derives reward signals directly from pairwise preference data, has shown its effectiveness in aligning Large Language Models (LLMs) with human preferences.
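As background for this theoretical analysis, the standard DPO objective (from the original DPO work, not a result of this paper) is a logistic loss on the gap between the policy/reference log-ratios of the preferred and dispreferred responses; a minimal PyTorch sketch follows.

```python
# Standard DPO loss:
#   -log sigmoid(beta * [(logp_w - ref_logp_w) - (logp_l - ref_logp_l)])
# where *_w is the preferred response and *_l the dispreferred one, and the
# log-probabilities are summed over each response's tokens.
import torch
import torch.nn.functional as F

def dpo_loss(policy_logp_w, policy_logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    logits = (policy_logp_w - ref_logp_w) - (policy_logp_l - ref_logp_l)
    return -F.logsigmoid(beta * logits).mean()

# Toy usage with sequence log-probabilities for a batch of three pairs:
loss = dpo_loss(torch.tensor([-12.0, -8.5, -20.1]),
                torch.tensor([-14.2, -9.0, -19.8]),
                torch.tensor([-13.0, -8.8, -20.5]),
                torch.tensor([-13.5, -8.9, -20.0]))
```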

FinBen: A Holistic Financial Benchmark for Large Language Models

2 code implementations20 Feb 2024 Qianqian Xie, Weiguang Han, Zhengyu Chen, Ruoyu Xiang, Xiao Zhang, Yueru He, Mengxi Xiao, Dong Li, Yongfu Dai, Duanyu Feng, Yijing Xu, Haoqiang Kang, Ziyan Kuang, Chenhan Yuan, Kailai Yang, Zheheng Luo, Tianlin Zhang, Zhiwei Liu, Guojun Xiong, Zhiyang Deng, Yuechen Jiang, Zhiyuan Yao, Haohang Li, Yangyang Yu, Gang Hu, Jiajia Huang, Xiao-Yang Liu, Alejandro Lopez-Lira, Benyou Wang, Yanzhao Lai, Hao Wang, Min Peng, Sophia Ananiadou, Jimin Huang

Our evaluation of 15 representative LLMs, including GPT-4, ChatGPT, and the latest Gemini, reveals several key findings: while LLMs excel in information extraction (IE) and textual analysis, they struggle with advanced reasoning and complex tasks such as text generation and forecasting.

Question Answering · RAG +2

Dólares or Dollars? Unraveling the Bilingual Prowess of Financial LLMs Between Spanish and English

1 code implementation12 Feb 2024 Xiao Zhang, Ruoyu Xiang, Chenhan Yuan, Duanyu Feng, Weiguang Han, Alejandro Lopez-Lira, Xiao-Yang Liu, Sophia Ananiadou, Min Peng, Jimin Huang, Qianqian Xie

We evaluate our model and existing LLMs using FLARE-ES, the first comprehensive bilingual evaluation benchmark with 21 datasets covering 9 tasks.

DREditor: A Time-efficient Approach for Building a Domain-specific Dense Retrieval Model

1 code implementation23 Jan 2024 Chen Huang, Duanyu Feng, Wenqiang Lei, Jiancheng Lv

Motivated by this, we develop a time-efficient approach called DREditor to edit the matching rule of an off-the-shelf dense retrieval model to suit a specific domain.

Retrieval
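The linked repository contains the actual method; purely as an illustration of what "editing the matching rule" of a dense retriever could look like, the sketch below fits a linear map on query embeddings by closed-form ridge regression so that edited queries move toward known in-domain positives. The closed-form solve, variable names, and scoring convention are assumptions, not necessarily DREditor's procedure.

```python
# Hypothetical sketch: learn a linear edit W for query embeddings via ridge
# regression against in-domain (query, positive passage) pairs, then score
# passages with the edited query.
import numpy as np

def fit_edit_matrix(Q: np.ndarray, P: np.ndarray, lam: float = 1.0) -> np.ndarray:
    """Q: (n, d) query embeddings; P: (n, d) embeddings of their positives."""
    d = Q.shape[1]
    # Closed-form ridge solution of  min_W ||Q W - P||^2 + lam ||W||^2.
    return np.linalg.solve(Q.T @ Q + lam * np.eye(d), Q.T @ P)

def edited_scores(query: np.ndarray, passages: np.ndarray, W: np.ndarray) -> np.ndarray:
    return passages @ (query @ W)  # dot-product scores with the edited query
```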

LAiW: A Chinese Legal Large Language Models Benchmark

1 code implementation9 Oct 2023 Yongfu Dai, Duanyu Feng, Jimin Huang, Haochen Jia, Qianqian Xie, Yifang Zhang, Weiguang Han, Wei Tian, Hao Wang

Through automated evaluation of current general and legal-domain LLMs on our benchmark, we find that these LLMs may not align with the logic of legal practice.

Information Retrieval

Empowering Many, Biasing a Few: Generalist Credit Scoring through Large Language Models

1 code implementation1 Oct 2023 Duanyu Feng, Yongfu Dai, Jimin Huang, Yifang Zhang, Qianqian Xie, Weiguang Han, Zhengyu Chen, Alejandro Lopez-Lira, Hao Wang

We then propose the first Credit and Risk Assessment Large Language Model (CALM), built via instruction tuning and tailored to the nuanced demands of various financial risk assessment tasks.

Decision Making · Language Modelling +1
