Search Results for author: Manley Roberts

Found 4 papers, 4 papers with code

Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive

1 code implementation • 20 Feb 2024 • Arka Pal, Deep Karkhanis, Samuel Dooley, Manley Roberts, Siddartha Naidu, Colin White

In this work, first we show theoretically that the standard DPO loss can lead to a reduction of the model's likelihood of the preferred examples, as long as the relative probability between the preferred and dispreferred classes increases.
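The failure mode described above can be checked numerically: the standard DPO loss depends only on the relative margin between the chosen and rejected log-probabilities, so the loss can decrease even while the preferred example becomes less likely. A minimal sketch, with made-up log-probability values (the function and numbers below are illustrative, not the paper's implementation):

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard sigmoid DPO loss on one preference pair (log-probs).

    Note the loss depends only on the *relative* margin, so it can fall
    even while the absolute likelihood of the preferred example falls.
    """
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid(margin)

# Reference-policy log-probs for the chosen and rejected responses.
ref_c, ref_r = -10.0, -10.0

# Step A: policy matches the reference; margin is 0, loss is log 2.
loss_a = dpo_loss(-10.0, -10.0, ref_c, ref_r)

# Step B: the *chosen* log-prob drops (-10 -> -12), but the rejected
# log-prob drops more (-10 -> -15), so the margin grows and loss falls.
loss_b = dpo_loss(-12.0, -15.0, ref_c, ref_r)

assert loss_b < loss_a  # lower loss despite a less likely preferred answer
```

This is exactly the behavior the paper flags: optimizing the relative margin says nothing about the absolute likelihood of the preferred completion.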

Data Contamination Through the Lens of Time

1 code implementation • 16 Oct 2023 • Manley Roberts, Himanshu Thakur, Christine Herlihy, Colin White, Samuel Dooley

Recent claims about the impressive abilities of large language models (LLMs) are often supported by evaluating publicly available benchmarks.

Giraffe: Adventures in Expanding Context Lengths in LLMs

1 code implementation • 21 Aug 2023 • Arka Pal, Deep Karkhanis, Manley Roberts, Samuel Dooley, Arvind Sundararajan, Siddartha Naidu

To use these models on sequences longer than the train-time context length, one might employ techniques from the growing family of context length extrapolation methods -- most of which focus on modifying the system of positional encodings used in the attention mechanism to indicate where tokens or activations are located in the input sequence.
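One widely used member of this family modifies rotary positional encodings by linear position interpolation: positions beyond the train-time context are rescaled so they map back into the range seen during training. A minimal sketch, assuming a 2048-token train-time context extended to 8192 (the `rope_angles` helper is illustrative; real implementations apply these angles as rotations to query/key pairs):

```python
def rope_angles(position, dim=8, base=10000.0, scale=1.0):
    """Rotary-embedding angles for one token position.

    scale > 1 implements linear position interpolation: positions are
    compressed by `scale` so that sequences longer than the train-time
    context reuse the position range the model was trained on.
    """
    pos = position / scale
    return [pos / (base ** (2 * i / dim)) for i in range(dim // 2)]

# Train-time context 2048; to run at 8192 tokens, scale positions by 4.
angles_long = rope_angles(8191, scale=4.0)

# The interpolated position 8191/4 gets the same angles a position of
# ~2048 would have received at train time.
angles_short = rope_angles(8191 / 4, scale=1.0)
assert angles_long == angles_short
```

The design choice here is to trade positional resolution (nearby tokens get closer angles) for staying inside the distribution of positions the attention mechanism has seen.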


Unsupervised Learning under Latent Label Shift

2 code implementations • 26 Jul 2022 • Manley Roberts, Pranav Mani, Saurabh Garg, Zachary C. Lipton

Thus motivated, we introduce a practical algorithm that leverages domain-discriminative models as follows: (i) push examples through domain discriminator $p(d|\mathbf{x})$; (ii) discretize the data by clustering examples in $p(d|\mathbf{x})$ space; (iii) perform non-negative matrix factorization on the discrete data; (iv) combine the recovered $p(y|d)$ with the discriminator outputs $p(d|\mathbf{x})$ to compute $p_d(y|\mathbf{x}) \; \forall d$.
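The four steps above can be sketched end to end. Everything below is an assumption-laden toy: the discriminator outputs are simulated rather than learned, the "clustering" is a crude nearest-anchor assignment, the NMF uses plain multiplicative updates, and the final combination is a simplified marginalization over domains, not the paper's exact correction:

```python
import numpy as np

rng = np.random.default_rng(0)

# (i) Simulated domain-discriminator outputs p(d|x): n examples, k domains.
n, k = 600, 3
p_d_given_x = rng.dirichlet(alpha=[1.0] * k, size=n)  # each row sums to 1

# (ii) Discretize by "clustering" in p(d|x) space: assign each example
# to its nearest of m randomly chosen anchor rows (stand-in for k-means).
m = 5
anchors = p_d_given_x[rng.choice(n, size=m, replace=False)]
dists = ((p_d_given_x[:, None, :] - anchors[None, :, :]) ** 2).sum(-1)
cluster = dists.argmin(axis=1)

# (iii) NMF on the (domain x cluster) soft-count matrix C ≈ A @ B,
# rank r = assumed number of latent classes, multiplicative updates.
C = np.zeros((k, m))
for c in range(m):
    C[:, c] = p_d_given_x[cluster == c].sum(axis=0)
r = 2
A = rng.random((k, r)) + 0.1  # will hold unnormalized p(y|d)
B = rng.random((r, m)) + 0.1
for _ in range(200):
    A *= (C @ B.T) / (A @ B @ B.T + 1e-9)
    B *= (A.T @ C) / (A.T @ A @ B + 1e-9)

# (iv) Normalize A into p(y|d) and combine with p(d|x). Here we use the
# simple mixture sum_d p(y|d) p(d|x) as a stand-in for the per-domain
# posteriors p_d(y|x) computed in the paper.
p_y_given_d = A / A.sum(axis=1, keepdims=True)
p_y_given_x = p_d_given_x @ p_y_given_d  # shape (n, r), rows sum to 1
```

Because both factors are row-stochastic after normalization, the combined scores form valid per-example distributions over the latent classes.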
