1 code implementation • 6 Mar 2024 • Xiaolin Sun, Zizhan Zheng
Existing solutions either introduce a regularization term to improve the smoothness of the trained policy against perturbations or alternately train the agent's policy and the attacker's policy.
no code implementations • 4 Mar 2024 • Zixuan Liu, Xiaolin Sun, Zizhan Zheng
Empirically, our approach provides a safety guarantee to LLMs that is missing in DPO while achieving significantly higher rewards under the same safety constraint compared to a recently proposed safe RLHF approach.
no code implementations • 18 Nov 2022 • Xiaolin Sun, Jacob Masur, Ben Abramowitz, Nicholas Mattei, Zizhan Zheng
We introduce a novel formal model of pandering, or strategic preference reporting by candidates seeking to be elected, and examine the resilience of two democratic voting systems to pandering within a single round and across multiple rounds.
no code implementations • 28 Oct 2022 • Xiaolin Sun
We propose a new estimator for heterogeneous treatment effects in a partially linear model (PLM) with many exogenous covariates and a possibly endogenous treatment variable.
no code implementations • 15 Dec 2020 • Yuan YAO, Xiaolin Sun
We first review the exact solution of conventional linear quadratic regulation with a linear transition and Gaussian noise, whose optimal policy does not depend on the noise, an undesirable feature when the noise is significant.
1 code implementation • 25 Oct 2019 • Xiaolin Sun, Zhufeng Hou, Masato Sumita, Shinsuke Ishihara, Ryo Tamura, Koji Tsuda
Machine learning applications in materials science are often hampered by a shortage of experimental data.