no code implementations • 2 Apr 2024 • Joy Qiping Yang, Salman Salamatian, Ziteng Sun, Ananda Theertha Suresh, Ahmad Beirami
The goal of language model alignment is to alter $p$ to a new distribution $\phi$ that results in a higher expected reward while keeping $\phi$ close to $p$. A popular alignment method is KL-constrained reinforcement learning (RL), which chooses a distribution $\phi_\Delta$ that maximizes $E_{\phi_{\Delta}} r(y)$ subject to a relative entropy constraint $KL(\phi_\Delta || p) \leq \Delta$. Another simple alignment method is best-of-$N$, where $N$ samples are drawn from $p$ and the one with the highest reward is selected.
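The best-of-$N$ procedure described above can be illustrated with a minimal sketch. The toy distribution, the reward table, and the names `best_of_n` and `sample_p` are hypothetical and chosen only for illustration; they do not come from the paper.

```python
import random

def best_of_n(sample, reward, n, rng):
    """Draw n samples from the base model and return the one with highest reward."""
    candidates = [sample(rng) for _ in range(n)]
    return max(candidates, key=reward)

# Toy stand-in for the base distribution p (assumed values, not from the paper).
outcomes = ["a", "b", "c"]
probs = [0.5, 0.3, 0.2]
rewards = {"a": 0.0, "b": 1.0, "c": 2.0}

def sample_p(rng):
    """Sample one outcome from the toy base distribution."""
    return rng.choices(outcomes, weights=probs, k=1)[0]

rng = random.Random(0)
picked = best_of_n(sample_p, rewards.get, n=4, rng=rng)
```

Larger values of `n` push the selected sample toward higher reward, at the cost of drifting further from $p$.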
2 code implementations • 21 Feb 2019 • Hsiang Hsu, Salman Salamatian, Flavio P. Calmon
Correspondence analysis (CA) is a multivariate statistical tool used to visualize and interpret data dependencies.
no code implementations • 29 Nov 2018 • Hsiang Hsu, Flavio P. Calmon, José Cândido Silveira Santos Filho, Andre P. Calmon, Salman Salamatian
We analyze expenditure patterns of discretionary funds by Brazilian congress members.
no code implementations • 21 Jun 2018 • Hsiang Hsu, Salman Salamatian, Flavio P. Calmon
In this paper, we provide a novel interpretation of correspondence analysis (CA) in terms of an information-theoretic quantity called principal inertia components.
no code implementations • 16 Feb 2018 • Hsiang Hsu, Shahab Asoodeh, Salman Salamatian, Flavio P. Calmon
Given a pair of random variables $(X, Y)\sim P_{XY}$ and two convex functions $f_1$ and $f_2$, we introduce two bottleneck functionals as the lower and upper boundaries of the two-dimensional convex set that consists of the pairs $\left(I_{f_1}(W; X), I_{f_2}(W; Y)\right)$, where $I_f$ denotes $f$-information and $W$ varies over the set of all discrete random variables satisfying the Markov condition $W \to X \to Y$.
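As a rough illustration of the $f$-information terms in this abstract, the sketch below assumes the standard definition $I_f(W; X) = D_f(P_{WX} \,\|\, P_W P_X)$ for a finite joint distribution; the abstract itself does not state the definition, and the function names are hypothetical.

```python
import math

def f_information(joint, f):
    """Compute I_f(W; X) = sum_{w,x} P_W(w) P_X(x) f(P_WX(w,x) / (P_W(w) P_X(x))).

    `joint` is a list of rows, with joint[w][x] = P_WX(w, x).
    """
    p_w = [sum(row) for row in joint]          # marginal of W
    p_x = [sum(col) for col in zip(*joint)]    # marginal of X
    total = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            q = p_w[i] * p_x[j]
            if q > 0:
                total += q * f(p / q)
    return total

def kl_generator(t):
    """f(t) = t log t, with f(0) = 0; this choice recovers mutual information."""
    return t * math.log(t) if t > 0 else 0.0

independent = [[0.25, 0.25], [0.25, 0.25]]
correlated = [[0.5, 0.0], [0.0, 0.5]]
mi_independent = f_information(independent, kl_generator)
mi_correlated = f_information(correlated, kl_generator)
```

With $f(t) = t \log t$, the independent table gives zero and the perfectly correlated binary table gives $\log 2$ nats.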