no code implementations • 29 May 2024 • Andrew Zhao, Quentin Xu, Matthieu Lin, Shenzhi Wang, Yong-Jin Liu, Zilong Zheng, Gao Huang
Recent advances in large language models (LLMs) have made them indispensable, raising significant concerns over managing their safety.
no code implementations • 19 Feb 2024 • Qisen Yang, Zekun Wang, Honghui Chen, Shenzhi Wang, Yifan Pu, Xin Gao, Wenhao Huang, Shiji Song, Gao Huang
Psychological measurement is essential for mental health, self-understanding, and personal development.
1 code implementation • NeurIPS 2023 • Shenzhi Wang, Qisen Yang, Jiawei Gao, Matthieu Gaetan Lin, Hao Chen, Liwei Wu, Ning Jia, Shiji Song, Gao Huang
Existing solutions tackle this problem by imposing a policy constraint on the policy improvement objective in both offline and online learning.
no code implementations • 2 Oct 2023 • Shenzhi Wang, Chang Liu, Zilong Zheng, Siyuan Qi, Shuo Chen, Qisen Yang, Andrew Zhao, Chaofei Wang, Shiji Song, Gao Huang
This study utilizes the intricate Avalon game as a testbed to explore LLMs' potential in deceptive environments.
no code implementations • 4 Sep 2023 • Qisen Yang, Shenzhi Wang, Qihang Zhang, Gao Huang, Shiji Song
Offline reinforcement learning (RL) optimizes the policy on a previously collected dataset without any interactions with the environment, yet usually suffers from the distributional shift problem.
no code implementations • 6 Jun 2023 • Qisen Yang, Shenzhi Wang, Matthieu Gaetan Lin, Shiji Song, Gao Huang
In particular, online fine-tuning has become a commonly used method to correct the erroneous estimates of out-of-distribution data learned in the offline training phase.
no code implementations • CVPR 2021 • Shenzhi Wang, Liwei Wu, Lei Cui, Yujun Shen
More concretely, we employ a Local-Net and a Global-Net to extract features from any individual patch and its surroundings, respectively.