no code implementations • 7 Feb 2024 • Shangmin Guo, Biao Zhang, Tianlin Liu, Tianqi Liu, Misha Khalman, Felipe Llinares, Alexandre Rame, Thomas Mesnard, Yao Zhao, Bilal Piot, Johan Ferret, Mathieu Blondel
Moreover, responses in these datasets are often sampled from a language model distinct from the one being aligned, and since the model evolves over training, the alignment phase is inevitably off-policy.
no code implementations • 5 Feb 2024 • Tianlin Liu, Shangmin Guo, Leonardo Bianco, Daniele Calandriello, Quentin Berthet, Felipe Llinares, Jessica Hoffmann, Lucas Dixon, Michal Valko, Mathieu Blondel
Aligning language models with human preferences is crucial for reducing errors and biases in these models.
no code implementations • 4 Jul 2014 • Felipe Llinares, Mahito Sugiyama, Karsten M. Borgwardt
Finding statistically significant interactions between binary variables is computationally and statistically challenging in high-dimensional settings, due to the combinatorial explosion in the number of hypotheses.