Search Results for author: Luxi He

Found 2 papers, 0 papers with code

What's in Your "Safe" Data?: Identifying Benign Data that Breaks Safety

no code implementations • 1 Apr 2024 • Luxi He, Mengzhou Xia, Peter Henderson

Current Large Language Models (LLMs), even those tuned for safety and alignment, are susceptible to jailbreaking.

Math

Aleatoric and Epistemic Discrimination: Fundamental Limits of Fairness Interventions

no code implementations • NeurIPS 2023 • Hao Wang, Luxi He, Rui Gao, Flavio P. Calmon

We categorize sources of discrimination in the ML pipeline into two classes: aleatoric discrimination, which is inherent in the data distribution, and epistemic discrimination, which is due to decisions made during model development.

Fairness
