1 code implementation • 17 Feb 2024 • Zongxia Li, Ishani Mondal, Yijun Liang, Huy Nghiem, Jordan Lee Boyd-Graber
Question answering (QA) can only make progress if we know if an answer is correct, but for many of the most challenging and interesting QA examples, current answer correctness (AC) metrics do not align with human judgments, particularly verbose, free form answers from large language models (LLM).
1 code implementation • 3 Dec 2023 • Hyojung Han, Jordan Lee Boyd-Graber, Marine Carpuat
Translations help people understand content written in another language.
1 code implementation • NeurIPS 2021 • Alexander Hoyle, Pranav Goel, Andrew Hian-Cheong, Denis Peskov, Jordan Lee Boyd-Graber, Philip Resnik
To address the standardization gap, we systematically evaluate a dominant classical model and two state-of-the-art neural models on two commonly used datasets.