Search Results for author: Salman Salamatian

Found 5 papers, 1 paper with code

Asymptotics of Language Model Alignment

no code implementations • 2 Apr 2024 • Joy Qiping Yang, Salman Salamatian, Ziteng Sun, Ananda Theertha Suresh, Ahmad Beirami

The goal of language model alignment is to alter $p$ to a new distribution $\phi$ that results in a higher expected reward while keeping $\phi$ close to $p$. A popular alignment method is KL-constrained reinforcement learning (RL), which chooses a distribution $\phi_\Delta$ that maximizes $E_{\phi_{\Delta}} r(y)$ subject to a relative entropy constraint $KL(\phi_\Delta \| p) \leq \Delta$. Another simple alignment method is best-of-$N$, where $N$ samples are drawn from $p$ and the one with the highest reward is selected.
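
A minimal sketch of the best-of-$N$ procedure described above, on a toy discrete response space with a hypothetical reward function (the response set, probabilities, and rewards are illustrative assumptions, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (assumptions for illustration): a finite response space,
# a reference distribution p over responses, and a scalar reward r(y).
responses = ["y0", "y1", "y2", "y3"]
p = np.array([0.4, 0.3, 0.2, 0.1])                      # reference policy p(y)
reward = {"y0": 0.1, "y1": 0.5, "y2": 0.9, "y3": 0.3}   # hypothetical r(y)

def best_of_n(n: int) -> str:
    """Draw n i.i.d. samples from p and return the one with the highest reward."""
    draws = rng.choice(responses, size=n, p=p)
    return max(draws, key=reward.get)

# The induced best-of-N distribution concentrates on high-reward responses
# as N grows, even though each individual draw still comes from p.
samples = [best_of_n(n=4) for _ in range(10_000)]
values, counts = np.unique(samples, return_counts=True)
print(dict(zip(values, counts / len(samples))))
```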

Language Modelling • Reinforcement Learning (RL)

Correspondence Analysis Using Neural Networks

2 code implementations • 21 Feb 2019 • Hsiang Hsu, Salman Salamatian, Flavio P. Calmon

Correspondence analysis (CA) is a multivariate statistical tool used to visualize and interpret data dependencies.

Epidemiology

Generalizing Correspondence Analysis for Applications in Machine Learning

no code implementations • 21 Jun 2018 • Hsiang Hsu, Salman Salamatian, Flavio P. Calmon

In this paper, we provide a novel interpretation of CA in terms of an information-theoretic quantity called the principal inertia components.
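
A short sketch of the principal inertia components under their standard definition for a discrete joint pmf, i.e. the squared singular values of $D_X^{-1/2} P_{XY} D_Y^{-1/2}$ after dropping the trivial singular value 1 (the 3x3 joint pmf below is a made-up example, and this is a generic computation rather than the method proposed in the paper):

```python
import numpy as np

# Hypothetical joint pmf P_{XY} over a 3x3 alphabet (illustration only).
P = np.array([[0.20, 0.05, 0.05],
              [0.05, 0.20, 0.05],
              [0.05, 0.05, 0.30]])

px = P.sum(axis=1)   # marginal of X
py = P.sum(axis=0)   # marginal of Y

# Principal inertia components: squared singular values of
# D_X^{-1/2} P D_Y^{-1/2}, excluding the trivial singular value 1.
Q = P / np.sqrt(np.outer(px, py))
sigma = np.linalg.svd(Q, compute_uv=False)
pics = sigma[1:] ** 2
print("principal inertia components:", pics)

# Sanity check: their sum equals the chi-squared divergence between
# P_{XY} and the product of marginals (the "total inertia" in CA).
chi2 = ((P - np.outer(px, py)) ** 2 / np.outer(px, py)).sum()
print("sum of PICs:", pics.sum(), " chi^2:", chi2)
```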

BIG-bench Machine Learning • Dimensionality Reduction +2

Generalizing Bottleneck Problems

no code implementations • 16 Feb 2018 • Hsiang Hsu, Shahab Asoodeh, Salman Salamatian, Flavio P. Calmon

Given a pair of random variables $(X, Y)\sim P_{XY}$ and two convex functions $f_1$ and $f_2$, we introduce two bottleneck functionals as the lower and upper boundaries of the two-dimensional convex set that consists of the pairs $\left(I_{f_1}(W; X), I_{f_2}(W; Y)\right)$, where $I_f$ denotes $f$-information and $W$ varies over the set of all discrete random variables satisfying the Markov condition $W \to X \to Y$.
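
To make the building block of these functionals concrete, here is a small sketch of $f$-information for a discrete joint pmf, $I_f(X;Y) = E_{P_X P_Y}[f(dP_{XY}/d(P_X P_Y))]$; the 2x2 joint pmf is a made-up example, and the code only illustrates the quantity appearing in the functionals, not the authors' optimization over $W$:

```python
import numpy as np

def f_information(P, f):
    """I_f(X;Y) = E_{P_X P_Y}[ f( dP_{XY} / d(P_X P_Y) ) ] for a discrete joint pmf P."""
    px = P.sum(axis=1, keepdims=True)
    py = P.sum(axis=0, keepdims=True)
    prod = px * py
    return float((prod * f(P / prod)).sum())

# Hypothetical joint pmf over a 2x2 alphabet (illustration only).
P = np.array([[0.30, 0.10],
              [0.10, 0.50]])

# f(t) = t*log(t) recovers the usual mutual information; f(t) = (t-1)^2
# gives chi-squared information, another admissible choice of convex f.
mi   = f_information(P, lambda t: t * np.log(t))
chi2 = f_information(P, lambda t: (t - 1.0) ** 2)
print("I(X;Y) in nats:", mi, " chi^2-information:", chi2)
```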

LEMMA
