no code implementations • 28 Jan 2024 • Zhumin Chu, Qingyao Ai, Yiteng Tu, Haitao Li, Yiqun Liu
Existing paradigms rely on either human annotators or model-based evaluators to assess the performance of LLMs on different tasks.
no code implementations • 19 Oct 2022 • Tetsuya Sakai, Sijie Tao, Maria Maistro, Zhumin Chu, Yujing Li, Nuo Chen, Nicola Ferro, Junjie Wang, Ian Soboroff, Yiqun Liu
The noise is due to a fatal bug in the backend of our relevance assessment interface.
1 code implementation • 6 Apr 2022 • Zhumin Chu, Qingyao Ai, Zhihong Wang, Yiqun Liu, Yingye Huang, Rui Zhang, Min Zhang, Shaoping Ma
This raises the question of how to accurately model user satisfaction in conversational search scenarios.