no code implementations • 3 Feb 2024 • Yifan Zhong, Chengdong Ma, Xiaoyuan Zhang, Ziran Yang, Qingfu Zhang, Siyuan Qi, Yaodong Yang
Our work marks a step forward in effectively and efficiently aligning models to diverse and intricate human preferences in a controllable and Pareto-optimal manner.
no code implementations • 30 Sep 2023 • Chengdong Ma, Ziran Yang, Minquan Gao, Hai Ci, Jun Gao, Xuehai Pan, Yaodong Yang
In this paper, we present Red-teaming Game (RTG), a general game-theoretic framework without manual annotation.