no code implementations • 13 Jul 2023 • Qiuyi Zhang, Michael S. Lee, Sherol Chen
Beliefs and values are increasingly being incorporated into our AI systems through alignment processes, such as carefully curating data collection principles or regularizing the loss function used for training.
no code implementations • 8 Mar 2022 • Ashok Cutkosky, Chris Dann, Abhimanyu Das, Qiuyi Zhang
We study optimization with bandit feedback when the learner is given additional prior knowledge in the form of an initial hint of the optimal action.