no code implementations • 4 Jul 2024 • Shujun Liu, Xiaoyu Shen, Yuhang Lai, Siyuan Wang, Shengbin Yue, Zengfeng Huang, Xuanjing Huang, Zhongyu Wei
By decoupling the reward modeling procedure and incorporating hybrid supervision, our HaF-RM framework offers a principled and effective approach to enhancing the performance and alignment of reward models, a critical component in the responsible development of powerful language models.
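The "hybrid supervision" can be pictured as jointly training the reward model at the sequence level (pairwise preference comparison) and at the token level. The sketch below is a minimal, hypothetical illustration of such a hybrid objective in PyTorch; the class and function names, the toy backbone, and the weighting factor `alpha` are assumptions for illustration, not the paper's actual implementation.

```python
# Minimal sketch of hybrid reward-model training: a sequence-level preference
# (Bradley-Terry) loss combined with a token-level language-modeling loss.
# All names and the toy architecture are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HybridRewardModel(nn.Module):
    def __init__(self, vocab_size=32000, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)   # stand-in for a transformer backbone
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.lm_head = nn.Linear(hidden, vocab_size)    # token-level supervision
        self.reward_head = nn.Linear(hidden, 1)         # sequence-level scalar reward

    def forward(self, input_ids):
        h, _ = self.encoder(self.embed(input_ids))
        return self.lm_head(h), self.reward_head(h[:, -1])  # (logits, reward)

def hybrid_loss(model, chosen, rejected, alpha=0.5):
    """Preference loss on (chosen, rejected) pairs plus an LM loss on the chosen response."""
    logits_c, r_c = model(chosen)
    _, r_r = model(rejected)
    pref_loss = -F.logsigmoid(r_c - r_r).mean()
    lm_loss = F.cross_entropy(
        logits_c[:, :-1].reshape(-1, logits_c.size(-1)),
        chosen[:, 1:].reshape(-1),
    )
    return pref_loss + alpha * lm_loss

# toy usage with random token ids
model = HybridRewardModel()
chosen = torch.randint(0, 32000, (4, 16))
rejected = torch.randint(0, 32000, (4, 16))
hybrid_loss(model, chosen, rejected).backward()
```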
1 code implementation • 11 Mar 2024 • Yuhang Lai, Siyuan Wang, Shujun Liu, Xuanjing Huang, Zhongyu Wei
We introduce ALaRM, the first framework modeling hierarchical rewards in reinforcement learning from human feedback (RLHF), which is designed to enhance the alignment of large language models (LLMs) with human preferences.
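One way to picture hierarchical rewards is to let a holistic reward gate a set of aspect-specific rewards, so the policy only receives fine-grained signals once the overall response is acceptable. The sketch below illustrates that idea; the gating threshold, the weights, and the function name `hierarchical_reward` are illustrative assumptions rather than ALaRM's exact combination scheme.

```python
# Sketch of combining a holistic reward with aspect-specific rewards into a
# single scalar for the RL step. The gating rule and weights are assumptions.
from typing import Dict

def hierarchical_reward(
    holistic: float,
    aspects: Dict[str, float],
    weights: Dict[str, float],
    threshold: float = 0.0,
) -> float:
    if holistic < threshold:
        # low-quality samples are judged by the holistic signal alone
        return holistic
    # otherwise refine the signal with weighted aspect-specific rewards
    return holistic + sum(weights[k] * aspects[k] for k in aspects)

# toy usage: aspect scores from auxiliary reward models
r = hierarchical_reward(
    holistic=0.7,
    aspects={"factuality": 0.4, "coherence": 0.6},
    weights={"factuality": 0.5, "coherence": 0.3},
)
print(r)  # 1.08
```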
2 code implementations • 20 Sep 2023 • Shengbin Yue, Wei Chen, Siyuan Wang, Bingxuan Li, Chenchen Shen, Shujun Liu, Yuxuan Zhou, Yao Xiao, Song Yun, Xuanjing Huang, Zhongyu Wei
We propose DISC-LawLLM, an intelligent legal system utilizing large language models (LLMs) to provide a wide range of legal services.
no code implementations • 11 Oct 2021 • Shujun Liu, Hai Zhu, Kun Wang, Huajun Wang
For the phoneme encoder, based on the observation that the same phoneme sung at different pitches yields a similar pronunciation, the encoder is followed by an adversarially trained pitch classifier that forces identical phonemes with different pitches to map into the same phoneme feature space.
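A common way to realize such an adversarial constraint is a gradient-reversal layer between the phoneme encoder and the pitch classifier, so that training the classifier pushes the encoder toward pitch-invariant phoneme features. The PyTorch sketch below shows that mechanism; the module names, dimensions, and toy training step are assumptions, not the paper's exact architecture.

```python
# Sketch of adversarial pitch removal via gradient reversal: the pitch classifier
# is trained normally, while reversed gradients push the phoneme encoder to drop
# pitch information. Architecture and sizes are illustrative assumptions.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # flip the gradient sign so the encoder works *against* the pitch classifier
        return -ctx.lambd * grad_output, None

class PhonemeEncoderWithPitchAdversary(nn.Module):
    def __init__(self, n_phonemes=70, n_pitches=128, hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Embedding(n_phonemes, hidden),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
        )
        self.pitch_classifier = nn.Linear(hidden, n_pitches)

    def forward(self, phoneme_ids):
        feat = self.encoder(phoneme_ids)                       # target: pitch-invariant features
        pitch_logits = self.pitch_classifier(GradReverse.apply(feat, 1.0))
        return feat, pitch_logits

# toy usage: the adversarial loss is cross-entropy on the reversed branch
model = PhonemeEncoderWithPitchAdversary()
phonemes = torch.randint(0, 70, (2, 10))
pitches = torch.randint(0, 128, (2, 10))
feat, pitch_logits = model(phonemes)
adv_loss = nn.functional.cross_entropy(pitch_logits.reshape(-1, 128), pitches.reshape(-1))
adv_loss.backward()
```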
no code implementations • 21 Apr 2016 • Xichuan Zhou, Shengli Li, Kai Qin, Kunping Li, Fang Tang, Shengdong Hu, Shujun Liu, Zhi Lin
Deep neural networks are state-of-the-art models for understanding the content of images, videos, and other raw input data.