Search Results for author: Xiaolin Sun

Found 6 papers, 2 papers with code

Belief-Enriched Pessimistic Q-Learning against Adversarial State Perturbations

1 code implementation • 6 Mar 2024 • Xiaolin Sun, Zizhan Zheng

Existing solutions either introduce a regularization term to improve the smoothness of the trained policy against perturbations or alternately train the agent's policy and the attacker's policy.

Q-Learning • Reinforcement Learning (RL)

Enhancing LLM Safety via Constrained Direct Preference Optimization

no code implementations • 4 Mar 2024 • Zixuan Liu, Xiaolin Sun, Zizhan Zheng

Empirically, our approach provides a safety guarantee to LLMs that is missing in DPO while achieving significantly higher rewards under the same safety constraint compared to a recently proposed safe RLHF approach.

reinforcement-learning

Pandering in a Flexible Representative Democracy

no code implementations • 18 Nov 2022 • Xiaolin Sun, Jacob Masur, Ben Abramowitz, Nicholas Mattei, Zizhan Zheng

We introduce a novel formal model of pandering, or strategic preference reporting by candidates seeking to be elected, and examine the resilience of two democratic voting systems to pandering within a single round and across multiple rounds.

Estimation of Heterogeneous Treatment Effects Using a Conditional Moment Based Approach

no code implementations • 28 Oct 2022 • Xiaolin Sun

We propose a new estimator for heterogeneous treatment effects in a partially linear model (PLM) with many exogenous covariates and a possibly endogenous treatment variable.

An exact solution in Markov decision process with multiplicative rewards as a general framework

no code implementations • 15 Dec 2020 • Yuan YAO, Xiaolin Sun

We first review the exact solution of conventional linear quadratic regulation with a linear transition and Gaussian noise, whose optimal policy does not depend on the noise, an undesirable feature when the noise is significant.

Leveraging Legacy Data to Accelerate Materials Design via Preference Learning

1 code implementation • 25 Oct 2019 • Xiaolin Sun, Zhufeng Hou, Masato Sumita, Shinsuke Ishihara, Ryo Tamura, Koji Tsuda

Machine learning applications in materials science are often hampered by a shortage of experimental data.

Bayesian Optimization • Data Integration +1
