no code implementations • 22 Jul 2024 • Jiaming Shen, Ran Xu, Yennie Jun, Zhen Qin, Tianqi Liu, Carl Yang, Yi Liang, Simon Baumgartner, Michael Bendersky
Unlike traditional methods, which generate two responses before obtaining the preference label, RMBoost first generates one response and selects a preference label, and then generates a second, more (or less) preferred response conditioned on the pre-selected label and the first response.
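A minimal sketch of this generation order, assuming a hypothetical `llm` client with a `generate(prompt) -> str` method (all names here are illustrative, not the paper's actual implementation):

```python
import random

def rmboost_pair(llm, prompt):
    """Sketch of RMBoost-style preference data generation
    (names are illustrative, not the paper's API)."""
    # Step 1: generate the first response normally.
    response_1 = llm.generate(prompt)

    # Step 2: pre-select whether the second response should be
    # better ("preferred") or worse ("dispreferred").
    label = random.choice(["preferred", "dispreferred"])

    # Step 3: ask for a second response that is explicitly more
    # (or less) preferred, conditioned on the first response.
    instruction = (
        f"Given the prompt and the response below, write a new response "
        f"that is clearly {'better' if label == 'preferred' else 'worse'} "
        f"in quality.\n\nPrompt: {prompt}\nResponse: {response_1}"
    )
    response_2 = llm.generate(instruction)

    chosen, rejected = (
        (response_2, response_1) if label == "preferred"
        else (response_1, response_2)
    )
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}
```

Because the label is chosen before the second generation, each pair carries its preference label by construction rather than requiring a separate labeling pass.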
no code implementations • 22 Jul 2024 • Jiaming Shen, Tianqi Liu, Jialu Liu, Zhen Qin, Jay Pavagadhi, Simon Baumgartner, Michael Bendersky
In this study, we introduce the first multilingual, fine-grained news headline hallucination detection dataset, containing over 11,000 pairs in five languages, each annotated with detailed hallucination types by experts.
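One plausible way to represent a single example, assuming the pairs are <article, headline> pairs (a guess from the task description; the field names below are hypothetical, not the released schema):

```python
from dataclasses import dataclass, field

@dataclass
class HallucinationExample:
    """Hypothetical record for one annotated pair; field names are
    illustrative, not the dataset's released format."""
    article: str                 # source news article text
    headline: str                # headline to check against the article
    language: str                # one of the five covered languages, e.g. "de"
    hallucination_types: list[str] = field(default_factory=list)
    # e.g. ["entity", "time"] -- fine-grained types assigned by experts;
    # an empty list would mean the headline is faithful to the article.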
no code implementations • 5 Jun 2024 • Rongzhi Zhang, Jiaming Shen, Tianqi Liu, Haorui Wang, Zhen Qin, Feng Han, Jialu Liu, Simon Baumgartner, Michael Bendersky, Chao Zhang
Through extensive experiments on two sequence generation tasks and with various LLMs, we demonstrate the effectiveness of our proposed PLaD framework.
1 code implementation • 2 Feb 2024 • Tianqi Liu, Zhen Qin, Junru Wu, Jiaming Shen, Misha Khalman, Rishabh Joshi, Yao Zhao, Mohammad Saleh, Simon Baumgartner, Jialu Liu, Peter J. Liu, Xuanhui Wang
In this work, we formulate LM alignment as a listwise ranking problem and describe the LiPO framework, where the policy can potentially learn more effectively from a ranked list of plausible responses given the prompt.
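As an illustration, here is one simple listwise objective over DPO-style implicit rewards, a softmax cross-entropy in the spirit of ListNet; the paper studies a family of listwise objectives, and this sketch is not necessarily the variant it recommends:

```python
import torch.nn.functional as F

def listwise_alignment_loss(policy_logps, ref_logps, labels, beta=0.1):
    """Sketch of a listwise alignment objective (illustrative only).

    policy_logps, ref_logps: (batch, list_size) summed log-probs of each
        candidate response under the policy / reference model.
    labels: (batch, list_size) graded preference of each response,
        higher = more preferred.
    """
    # Implicit reward of each response, as in DPO-style methods:
    # beta * log(pi(y|x) / pi_ref(y|x)).
    rewards = beta * (policy_logps - ref_logps)

    # Softmax listwise loss: match the reward distribution over the
    # list to the (normalized) preference-label distribution.
    target = F.softmax(labels, dim=-1)
    return -(target * F.log_softmax(rewards, dim=-1)).sum(-1).mean()
```

Compared with pairwise objectives such as DPO, a listwise loss consumes the whole ranked list in one step instead of decomposing it into independent pairs.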
no code implementations • 12 Nov 2023 • Vasilisa Bashlovkina, Zhaobin Kuang, Riley Matthews, Edward Clifford, Yennie Jun, William W. Cohen, Simon Baumgartner
Large language models (LLMs) are trained on web-scale corpora that inevitably include contradictory factual information from sources of varying reliability.
no code implementations • 30 Jun 2023 • Vasilisa Bashlovkina, Riley Matthews, Zhaobin Kuang, Simon Baumgartner, Michael Bendersky
We study the ability of transformer-based language models (LMs) to understand social media language.
no code implementations • 21 Dec 2022 • Xiang Deng, Vasilisa Bashlovkina, Feng Han, Simon Baumgartner, Michael Bendersky
Market sentiment analysis on social media content requires knowledge of both financial markets and social media jargon, which makes it a challenging task for human raters.
2 code implementations • ICLR 2022 • Yi Tay, Vinh Q. Tran, Sebastian Ruder, Jai Gupta, Hyung Won Chung, Dara Bahri, Zhen Qin, Simon Baumgartner, Cong Yu, Donald Metzler
In this paper, we propose a new model inductive bias that learns a subword tokenization end-to-end as part of the model.
Ranked #3 on Paraphrase Identification on Quora Question Pairs
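A toy version of such a learned, differentiable tokenization, loosely inspired by the paper's gradient-based subword approach (heavily simplified; not the actual Charformer module):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftSubwordBlocks(nn.Module):
    """Minimal sketch of learning subword blocks end-to-end.

    For each character position we form candidate blocks of several
    sizes by mean-pooling, score them, and take a softmax-weighted
    mixture so that block selection stays differentiable.
    """
    def __init__(self, dim, block_sizes=(1, 2, 3, 4)):
        super().__init__()
        self.block_sizes = block_sizes
        self.score = nn.Linear(dim, 1)  # scores each candidate block

    def forward(self, char_embeds):                  # (batch, seq, dim)
        candidates = []
        for b in self.block_sizes:
            # Mean-pool characters within blocks of size b, then
            # upsample back to character resolution.
            pooled = F.avg_pool1d(char_embeds.transpose(1, 2),
                                  kernel_size=b, stride=b, ceil_mode=True)
            up = F.interpolate(pooled, size=char_embeds.size(1))
            candidates.append(up.transpose(1, 2))    # (batch, seq, dim)
        cand = torch.stack(candidates, dim=2)        # (batch, seq, K, dim)
        weights = F.softmax(self.score(cand), dim=2) # soft block choice
        return (weights * cand).sum(dim=2)           # (batch, seq, dim)
```

Because the block choice is a softmax mixture rather than a hard segmentation, gradients flow through the tokenization decision during training.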
no code implementations • WS 2020 • Anjalie Field, Sascha Rothe, Simon Baumgartner, Cong Yu, Abe Ittycheriah
We evaluate the performance of transformer encoders with various decoders for information organization through a new task: generation of section headings for Wikipedia articles.
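For illustration, a generic off-the-shelf encoder-decoder can be applied to this kind of input-to-short-text task; the snippet below uses T5's summarization prefix as a stand-in (the paper's encoder/decoder combinations and training setup differ, and a real heading generator would be fine-tuned on Wikipedia section data):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# Example section body (made up for illustration).
section_text = (
    "The city's tram network opened in 1896 and was electrified in 1901. "
    "Today it carries roughly 200,000 passengers per day."
)

# T5's built-in summarization prefix yields a short summary-like
# output; a fine-tuned model would produce true section headings.
inputs = tokenizer("summarize: " + section_text, return_tensors="pt")
ids = model.generate(**inputs, max_new_tokens=8, num_beams=4)
print(tokenizer.decode(ids[0], skip_special_tokens=True))
```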