Search Results for author: Joy Qiping Yang

Found 4 papers, 0 papers with code

Asymptotics of Language Model Alignment

no code implementations • 2 Apr 2024 • Joy Qiping Yang, Salman Salamatian, Ziteng Sun, Ananda Theertha Suresh, Ahmad Beirami

The goal of language model alignment is to alter the base language model distribution $p$ to a new distribution $\phi$ that results in a higher expected reward while keeping $\phi$ close to $p$. A popular alignment method is KL-constrained reinforcement learning (RL), which chooses a distribution $\phi_\Delta$ that maximizes $E_{\phi_\Delta} r(y)$ subject to a relative entropy constraint $KL(\phi_\Delta \| p) \leq \Delta$. Another simple alignment method is best-of-$N$, where $N$ samples are drawn from $p$ and the one with the highest reward is selected.

Language Modelling • Reinforcement Learning (RL)
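The best-of-$N$ policy described in the abstract is simple enough to sketch directly. Below is a minimal illustration; the uniform string sampler and length-based reward are toy stand-ins for a real language model $p$ and reward model $r$, not anything from the paper:

```python
import random

def best_of_n(sample_from_p, reward, n):
    """Draw n candidates from the base model p and return the one with
    the highest reward -- the best-of-N alignment method from the abstract."""
    return max((sample_from_p() for _ in range(n)), key=reward)

# Toy stand-in for a language model: uniform over a few strings,
# with a reward that simply favors longer outputs.
rng = random.Random(0)
vocab = ["a", "bb", "ccc", "dddd"]
best = best_of_n(lambda: rng.choice(vocab), reward=len, n=8)
assert best in vocab
```

The paper's question is how the reward and KL divergence of this simple policy compare, asymptotically, to the KL-constrained RL optimum $\phi_\Delta$.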

Learning bounded-degree polytrees with known skeleton

no code implementations • 10 Oct 2023 • Davin Choo, Joy Qiping Yang, Arnab Bhattacharyya, Clément L. Canonne

We establish finite-sample guarantees for efficient proper learning of bounded-degree polytrees, a rich class of high-dimensional probability distributions and a subclass of Bayesian networks, a widely-studied type of graphical model.
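Proper learning here means outputting a polytree together with its parameters. The paper's contribution is the finite-sample analysis and edge orientation from the known skeleton; the sketch below shows only the routine counting step that estimates conditional probability tables once a directed structure over binary variables is fixed. The `parents` mapping is an assumed representation, not the paper's interface:

```python
from collections import Counter

def fit_cpts(samples, parents):
    """Estimate conditional probability tables for a Bayes net over binary
    variables with a known directed structure, via empirical counts.
    `samples` is a list of 0/1 tuples; `parents[i]` is the tuple of parent
    indices of variable i.  Returns cpts[i][pa] = Pr[X_i = 1 | parents = pa].
    This is only the parameter-estimation step, not the paper's algorithm."""
    cpts = {}
    for i, pa in parents.items():
        num = Counter()  # counts of samples with X_i = 1, per parent assignment
        den = Counter()  # total counts per parent assignment
        for s in samples:
            key = tuple(s[j] for j in pa)
            den[key] += 1
            num[key] += s[i]
        cpts[i] = {k: num[k] / den[k] for k in den}
    return cpts
```

For a bounded-degree polytree, each parent set is small, so each table has few rows and the counts concentrate quickly; quantifying exactly how many samples suffice is the subject of the paper.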

Near-Optimal Degree Testing for Bayes Nets

no code implementations • 13 Apr 2023 • Vipul Arora, Arnab Bhattacharyya, Clément L. Canonne, Joy Qiping Yang

This paper considers the problem of testing the maximum in-degree of the Bayes net underlying an unknown probability distribution $P$ over $\{0, 1\}^n$, given sample access to $P$.

Independence Testing for Bounded Degree Bayesian Network

no code implementations • 19 Apr 2022 • Arnab Bhattacharyya, Clément L. Canonne, Joy Qiping Yang

We study the following independence testing problem: given access to samples from a distribution $P$ over $\{0, 1\}^n$, decide whether $P$ is a product distribution or whether it is $\varepsilon$-far in total variation distance from any product distribution.
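To make the distance in question concrete: the sketch below computes the total variation distance between the empirical joint distribution of $n$-bit samples and the product of their empirical marginals. This is a brute-force plug-in computation, feasible only for small $n$ since it sums over all $2^n$ outcomes; it illustrates the quantity being tested, not the paper's sample-efficient tester:

```python
from collections import Counter
from itertools import product

def tv_to_product(samples, n):
    """Total variation distance between the empirical joint distribution of
    n-bit samples (list of 0/1 tuples) and the product of the empirical
    marginals.  Brute force over all 2^n outcomes -- small n only."""
    m = len(samples)
    joint = Counter(samples)
    p1 = [sum(s[i] for s in samples) / m for i in range(n)]  # Pr[X_i = 1]
    tv = 0.0
    for x in product((0, 1), repeat=n):
        p_joint = joint.get(x, 0) / m
        p_prod = 1.0
        for i, b in enumerate(x):
            p_prod *= p1[i] if b else 1 - p1[i]
        tv += abs(p_joint - p_prod)
    return tv / 2
```

Samples from a genuine product distribution drive this quantity toward 0, while perfectly correlated bits (e.g., half the samples all-zeros, half all-ones) yield distance 1/2; the testing problem asks to distinguish the 0 case from distance at least $\varepsilon$ with far fewer samples than a plug-in estimate requires.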
