no code implementations • 16 Feb 2024 • Minsuk Kahng, Ian Tenney, Mahima Pushkarna, Michael Xieyang Liu, James Wexler, Emily Reif, Krystal Kallarackal, Minsuk Chang, Michael Terry, Lucas Dixon
Automatic side-by-side evaluation has emerged as a promising approach to evaluating the quality of responses from large language models (LLMs).
no code implementations • 3 Oct 2023 • Michael Xieyang Liu, Tongshuang Wu, Tianying Chen, Franklin Mingzhe Li, Aniket Kittur, Brad A. Myers
Sensemaking in unfamiliar domains can be challenging, demanding considerable user effort to compare different options with respect to various criteria.