1 code implementation • 2 Mar 2024 • Weizhe Yuan, PengFei Liu, Matthias Gallé
In particular, we present a model-in-the-loop framework that semi-automatically derives criteria from collected guidelines for different writing tasks and constructs in-context demonstrations for each criterion.
2 code implementations • 18 Jan 2024 • Weizhe Yuan, Richard Yuanzhe Pang, Kyunghyun Cho, Xian Li, Sainbayar Sukhbaatar, Jing Xu, Jason Weston
We posit that to achieve superhuman agents, future models require superhuman feedback in order to provide an adequate training signal.
1 code implementation • 9 Jan 2024 • Shichao Sun, Junlong Li, Weizhe Yuan, Ruifeng Yuan, Wenjie Li, PengFei Liu
In this paper, we pioneer the critique of critique, termed MetaCritique, which is a framework to evaluate the critique from two aspects, i. e., factuality as precision score and comprehensiveness as recall score.
1 code implementation • 9 Oct 2023 • Junlong Li, Shichao Sun, Weizhe Yuan, Run-Ze Fan, Hai Zhao, PengFei Liu
The rapid development of Large Language Models (LLMs) has substantially expanded the range of tasks they can address.
3 code implementations • 25 Jul 2023 • I-Chun Chern, Steffi Chern, Shiqi Chen, Weizhe Yuan, Kehua Feng, Chunting Zhou, Junxian He, Graham Neubig, PengFei Liu
With the above challenges in mind, in this paper, we propose FacTool, a task and domain agnostic framework for detecting factual errors of texts generated by large language models (e. g., ChatGPT).
1 code implementation • 23 Jun 2023 • Weizhe Yuan, Kyunghyun Cho, Jason Weston
Natural language (NL) feedback offers rich insights into user experience.
1 code implementation • 12 Dec 2022 • Yiwei Qin, Weizhe Yuan, Graham Neubig, PengFei Liu
Both have their advantages; discriminative metrics are able to directly optimize for the problem of distinguishing between good and bad outputs, while generative metrics can be trained using abundant raw text.
2 code implementations • 22 Jun 2022 • Weizhe Yuan, PengFei Liu
In addition, we test our model in the 2022 College Entrance Examination English that happened a few days ago (2022. 06. 08), and it gets a total score of 134 (v. s.
no code implementations • ACL 2022 • Yang Xiao, Jinlan Fu, Weizhe Yuan, Vijay Viswanathan, Zhoumianze Liu, Yixin Liu, Graham Neubig, PengFei Liu
Despite data's crucial role in machine learning, most existing tools and research tend to focus on systems on top of existing data rather than how to interpret and manipulate data.
1 code implementation • 28 Jul 2021 • PengFei Liu, Weizhe Yuan, Jinlan Fu, Zhengbao Jiang, Hiroaki Hayashi, Graham Neubig
This paper surveys and organizes research works in a new paradigm in natural language processing, which we dub "prompt-based learning".
1 code implementation • NeurIPS 2021 • Weizhe Yuan, Graham Neubig, PengFei Liu
In this work, we conceptualize the evaluation of generated text as a text generation problem, modeled using pre-trained sequence-to-sequence models.
1 code implementation • ACL 2021 • PengFei Liu, Jinlan Fu, Yang Xiao, Weizhe Yuan, Shuaicheng Chang, Junqi Dai, Yixin Liu, Zihuiwen Ye, Zi-Yi Dou, Graham Neubig
In this paper, we present a new conceptualization and implementation of NLP evaluation: the ExplainaBoard, which in addition to inheriting the functionality of the standard leaderboard, also allows researchers to (i) diagnose strengths and weaknesses of a single system (e. g.~what is the best-performing system bad at?)
1 code implementation • 30 Jan 2021 • Weizhe Yuan, PengFei Liu, Graham Neubig
The rapid development of science and technology has been accompanied by an exponential growth in peer-reviewed scientific publications.