Search Results for author: Juntao Dai

Found 8 papers, 4 papers with code

Aligner: Achieving Efficient Alignment through Weak-to-Strong Correction

no code implementations • 4 Feb 2024 • Jiaming Ji, Boyuan Chen, Hantao Lou, Donghai Hong, Borong Zhang, Xuehai Pan, Juntao Dai, Yaodong Yang

Here we introduce Aligner, a new, efficient alignment paradigm that bypasses the whole RLHF process by learning the correctional residuals between aligned and unaligned answers.
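
The snippet above describes learning a correction module that sits on top of an upstream model's output. As a minimal, hedged sketch of that inference pattern (the model path, prompt format, and helper function below are illustrative placeholders, not the paper's released artifacts):

```python
# Hedged sketch of the residual-correction idea at inference time: a small
# seq2seq "corrector" conditions on the query plus an upstream model's draft
# answer and emits a corrected answer. Names are placeholders.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("path/to/corrector")  # placeholder path
corrector = AutoModelForSeq2SeqLM.from_pretrained("path/to/corrector")

def correct(query: str, draft_answer: str) -> str:
    # Condition on both the user query and the (possibly misaligned) draft.
    prompt = f"Question: {query}\nDraft answer: {draft_answer}\nCorrected answer:"
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = corrector.generate(**inputs, max_new_tokens=256)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```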

AI Alignment: A Comprehensive Survey

no code implementations • 30 Oct 2023 • Jiaming Ji, Tianyi Qiu, Boyuan Chen, Borong Zhang, Hantao Lou, Kaile Wang, Yawen Duan, Zhonghao He, Jiayi Zhou, Zhaowei Zhang, Fanzhi Zeng, Kwan Yee Ng, Juntao Dai, Xuehai Pan, Aidan O'Gara, Yingshan Lei, Hua Xu, Brian Tse, Jie Fu, Stephen McAleer, Yaodong Yang, Yizhou Wang, Song-Chun Zhu, Yike Guo, Wen Gao

The former aims to make AI systems aligned via alignment training, while the latter aims to gain evidence about the systems' alignment and govern them appropriately to avoid exacerbating misalignment risks.

Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark

no code implementations • 19 Oct 2023 • Jiaming Ji, Borong Zhang, Jiayi Zhou, Xuehai Pan, Weidong Huang, Ruiyang Sun, Yiran Geng, Yifan Zhong, Juntao Dai, Yaodong Yang

By introducing this benchmark, we aim to facilitate the evaluation and comparison of safety performance, thus fostering the development of reinforcement learning for safer, more reliable, and responsible real-world applications.

Tasks: reinforcement-learning, Safe Reinforcement Learning
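
Safety-Gymnasium follows the Gymnasium interface but, per its documentation, step() additionally returns a per-step safety cost. A minimal random-interaction loop might look like the sketch below; the environment id is one of the benchmark's standard tasks, though exact names may vary by version.

```python
# Minimal interaction loop; note the extra `cost` element returned by
# step() relative to plain Gymnasium. Env id may vary across versions.
import safety_gymnasium

env = safety_gymnasium.make("SafetyPointGoal1-v0")
obs, info = env.reset(seed=0)
episode_cost = 0.0
for _ in range(1000):
    action = env.action_space.sample()  # random policy as a stand-in
    obs, reward, cost, terminated, truncated, info = env.step(action)
    episode_cost += cost  # accumulated constraint violation
    if terminated or truncated:
        obs, info = env.reset()
env.close()
```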

OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research

1 code implementation • 16 May 2023 • Jiaming Ji, Jiayi Zhou, Borong Zhang, Juntao Dai, Xuehai Pan, Ruiyang Sun, Weidong Huang, Yiran Geng, Mickel Liu, Yaodong Yang

AI systems empowered by reinforcement learning (RL) algorithms harbor the immense potential to catalyze societal advancement, yet their deployment is often impeded by significant safety concerns.

Tasks: Philosophy, reinforcement-learning, +2
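
OmniSafe wraps algorithm training behind a single high-level Agent object. A hedged quick-start sketch, with names taken from the project's examples (they may differ across versions):

```python
# Quick-start sketch: 'PPOLag' is PPO with a Lagrangian cost penalty.
# Algorithm and environment names follow the project's examples and
# may differ across OmniSafe versions.
import omnisafe

agent = omnisafe.Agent("PPOLag", "SafetyPointGoal1-v0")
agent.learn()  # trains under the configured safety constraint
```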

Constrained Update Projection Approach to Safe Policy Optimization

3 code implementations • 15 Sep 2022 • Long Yang, Jiaming Ji, Juntao Dai, Linrui Zhang, Binbin Zhou, Pengfei Li, Yaodong Yang, Gang Pan

Compared to previous safe RL methods, CUP enjoys several benefits: 1) CUP generalizes the surrogate functions to the generalized advantage estimator (GAE), leading to strong empirical performance (a GAE sketch follows below).

Tasks: Reinforcement Learning (RL), Safe Reinforcement Learning
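
Both CUP papers in this list rest on extending surrogate functions to the generalized advantage estimator (GAE). For reference, here is a self-contained sketch of the standard GAE recursion (not the authors' code); episode-termination masking is omitted for brevity.

```python
# Generalized advantage estimation: A_t = sum_l (gamma*lam)^l * delta_{t+l},
# with TD residuals delta_t = r_t + gamma*V(s_{t+1}) - V(s_t).
import numpy as np

def gae(rewards, values, last_value, gamma=0.99, lam=0.95):
    values = np.append(values, last_value)  # bootstrap value for the final state
    advantages = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages

print(gae(np.array([1.0, 0.0, 1.0]), np.array([0.5, 0.4, 0.6]), last_value=0.0))
```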

CUP: A Conservative Update Policy Algorithm for Safe Reinforcement Learning

1 code implementation • 15 Feb 2022 • Long Yang, Jiaming Ji, Juntao Dai, Yu Zhang, Pengfei Li, Gang Pan

Although using bounds as surrogate functions to design safe RL algorithms has appeared in some existing works, we develop them in at least three aspects: (i) we provide a rigorous theoretical analysis that extends the surrogate functions to the generalized advantage estimator (GAE); a toy sketch of the projection step that completes CUP's update follows below.

Tasks: reinforcement-learning, Reinforcement Learning (RL), +2
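
CUP's conservative update proceeds in two steps: improve the reward surrogate, then project the intermediate policy back toward the constraint set. Below is a toy, closed-form illustration of that projection on a categorical policy, minimizing KL(p || pi_mid) plus a cost-advantage penalty; it is an illustrative stand-in, not the authors' parametric implementation.

```python
# Toy projection step: argmin_p KL(p || pi_mid) + nu * E_p[cost_adv] over the
# probability simplex has the closed form p_i proportional to
# pi_mid_i * exp(-nu * c_i) (a standard KKT result).
import numpy as np

def project_step(pi_mid, cost_adv, nu=1.0):
    logits = np.log(pi_mid) - nu * cost_adv
    p = np.exp(logits - logits.max())  # subtract max for numerical stability
    return p / p.sum()

pi_mid = np.array([0.5, 0.3, 0.2])      # policy after the reward-improvement step
cost_adv = np.array([1.0, 0.0, -1.0])   # per-action cost advantages
print(project_step(pi_mid, cost_adv))   # mass shifts toward low-cost actions
```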
