1 code implementation • 20 Feb 2024 • Arka Pal, Deep Karkhanis, Samuel Dooley, Manley Roberts, Siddartha Naidu, Colin White
In this work, we first show theoretically that the standard DPO loss can lead to a *reduction* of the model's likelihood of the preferred examples, as long as the relative probability between the preferred and dispreferred classes increases.
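This failure mode can be illustrated numerically with the standard DPO loss, $-\log\sigma(\beta[\Delta_w - \Delta_l])$, where $\Delta_w, \Delta_l$ are the preferred and dispreferred log-probability ratios against a reference model. The log-probabilities below are hypothetical, chosen only to show that the loss can keep decreasing even as the preferred example's likelihood falls:

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO loss for one preference pair (all log-probs hypothetical)."""
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid(margin)

ref_w, ref_l = -10.0, -10.0                  # reference-model log-probs
before = dpo_loss(-10.0, -10.0, ref_w, ref_l)  # margin 0 -> loss = log 2

# Preferred log-prob DROPS (-10 -> -12), but the dispreferred one drops
# more (-10 -> -16), so the relative margin grows and the loss decreases:
after = dpo_loss(-12.0, -16.0, ref_w, ref_l)
assert after < before
```

The optimizer is rewarded for widening the gap between the two responses, not for raising the preferred response's absolute likelihood, which is exactly the gap the theoretical result above points at.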
1 code implementation • 16 Oct 2023 • Manley Roberts, Himanshu Thakur, Christine Herlihy, Colin White, Samuel Dooley
Recent claims about the impressive abilities of large language models (LLMs) are often supported by evaluations on publicly available benchmarks.
1 code implementation • 21 Aug 2023 • Arka Pal, Deep Karkhanis, Manley Roberts, Samuel Dooley, Arvind Sundararajan, Siddartha Naidu
To use these models on sequences longer than the train-time context length, one might employ techniques from the growing family of context length extrapolation methods -- most of which focus on modifying the system of positional encodings used in the attention mechanism to indicate where tokens or activations are located in the input sequence.
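One simple member of this family of positional-encoding modifications is linear position interpolation for rotary embeddings, which rescales positions so that a longer input maps back into the position range seen at train time. The sketch below is illustrative, not necessarily the method proposed in this paper; the function name, dimensions, and scale factor are assumptions:

```python
import math

def rope_angles(position, dim, base=10000.0, scale=1.0):
    """Rotary-embedding rotation angles for one position.

    With scale < 1, positions are linearly interpolated: a sequence twice
    the train-time length reuses the angle range the model was trained on.
    """
    pos = position * scale
    return [pos / base ** (2 * i / dim) for i in range(dim // 2)]

# Train-time context 2048; evaluating position 4096 with scale 0.5 yields
# exactly the angles the model saw at position 2048 during training:
assert rope_angles(4096, 64, scale=0.5) == rope_angles(2048, 64)
```

Other methods in the family instead modify the frequency `base` or the attention computation itself; all share the goal of keeping test-time positional signals inside (or close to) the train-time distribution.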
2 code implementations • 26 Jul 2022 • Manley Roberts, Pranav Mani, Saurabh Garg, Zachary C. Lipton
Thus motivated, we introduce a practical algorithm that leverages domain-discriminative models as follows: (i) push examples through domain discriminator $p(d|\mathbf{x})$; (ii) discretize the data by clustering examples in $p(d|\mathbf{x})$ space; (iii) perform non-negative matrix factorization on the discrete data; (iv) combine the recovered $p(y|d)$ with the discriminator outputs $p(d|\mathbf{x})$ to compute $p_d(y|\mathbf{x}) \; \forall d$.
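The data flow of steps (i)–(iv) can be sketched end to end. This is a toy illustration under simplifying assumptions, not the paper's exact estimator: the discriminator outputs are simulated, the clustering is plain k-means, the NMF uses textbook multiplicative updates, and step (iv) applies the standard label-shift correction $p_d(y|\mathbf{x}) \propto \frac{p_d(y)}{p(y)}\,p(y|\mathbf{x})$ with a uniform label prior; all sizes and names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
n, D, k = 300, 3, 4                      # examples, domains, clusters / latent labels

# (i) push examples through a domain discriminator p(d|x) -- simulated here
P = rng.dirichlet(np.ones(D), size=n)    # each row is a probability vector over domains

# (ii) discretize by clustering examples in p(d|x) space (plain k-means)
C = P[rng.choice(n, k, replace=False)]   # initial centroids
for _ in range(25):
    assign = ((P[:, None, :] - C[None]) ** 2).sum(-1).argmin(1)
    C = np.stack([P[assign == j].mean(0) if (assign == j).any() else C[j]
                  for j in range(k)])

# cluster-conditional domain profile: M[j, d] approximates p(d | cluster j)
M = np.stack([P[assign == j].mean(0) if (assign == j).any() else np.ones(D) / D
              for j in range(k)])

# (iii) non-negative matrix factorization M ~ W @ H (Lee-Seung multiplicative updates)
W = rng.random((k, k)) + 0.1             # cluster -> latent-label weights
H = rng.random((k, D)) + 0.1             # latent label -> domain profile, ~ p(d|y)
for _ in range(500):
    H *= (W.T @ M) / (W.T @ W @ H + 1e-12)
    W *= (M @ H.T) / (W @ H @ H.T + 1e-12)

# (iv) read p(y|d) off the factorization (uniform-prior normalization), then
# combine with the discriminator side via the label-shift correction
p_y_given_d = (H / H.sum(0, keepdims=True)).T                # shape (D, k)
p_y_given_x = W[assign] / W[assign].sum(1, keepdims=True)    # shape (n, k)
p_y = p_y_given_x.mean(0)                                    # marginal label prior
scores = p_y_given_x[:, None, :] * (p_y_given_d / p_y)[None] # (n, D, k)
p_d_y_given_x = scores / scores.sum(-1, keepdims=True)       # p_d(y|x) for all d
```

The result is one label posterior per example per domain, matching the $p_d(y|\mathbf{x}) \; \forall d$ output of step (iv); the paper's actual recovery guarantees rest on identifiability conditions this toy version does not check.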