no code implementations • 22 Apr 2024 • Jesse Thibodeau, Hadi Nekoei, Afaf Taïk, Janarthanan Rajendran, Golnoosh Farnadi
We find that, upon deploying a learned tax and redistribution policy, social welfare improves on that of the fairness-agnostic baseline, and approaches that of the analytically optimal fairness-aware baseline for the multi-armed and contextual bandit settings, and surpassing it by 13. 19% in the full RL setting.