no code implementations • 16 Feb 2024 • Johnny Tian-Zheng Wei, Ryan Yixiang Wang, Robin Jia
Detecting whether copyright holders' works were used in LLM pretraining is poised to be an important problem.
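A common baseline for this kind of pretraining-data detection is a loss-based membership heuristic such as min-k% prob: texts the model has seen tend to receive uniformly high token probabilities, so averaging the *lowest* log-probabilities separates members from non-members. The sketch below is illustrative only and is not necessarily the method proposed in this paper; the per-token log-probs are made up.

```python
def min_k_percent_score(token_logprobs, k=0.2):
    """Average the lowest k fraction of token log-probabilities.

    Very negative scores suggest the text is unfamiliar to the model;
    higher scores are weak evidence the text appeared in pretraining data.
    """
    n = max(1, int(len(token_logprobs) * k))
    lowest = sorted(token_logprobs)[:n]
    return sum(lowest) / n

# Hypothetical per-token log-probs from some LM (not real model output):
member_like = [-0.1, -0.3, -0.2, -0.5, -0.1]      # fluent, well-memorized
nonmember_like = [-2.0, -4.5, -3.1, -5.2, -2.8]   # surprising to the model
print(min_k_percent_score(member_like) > min_k_percent_score(nonmember_like))  # True
```

In practice the threshold separating members from non-members is calibrated on held-out data, and detection accuracy degrades as texts get shorter.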
no code implementations • 20 Oct 2022 • Johnny Tian-Zheng Wei, Tom Kocmi, Christian Federmann
In MT evaluation, pairwise comparisons are conducted to identify the better system.
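Pairwise comparisons like these are typically backed by a paired significance test over segment-level scores. The following paired-bootstrap sketch is illustrative only, not the protocol from the paper; the `paired_bootstrap` helper and the segment scores are hypothetical.

```python
import random

def paired_bootstrap(scores_a, scores_b, n_resamples=1000, seed=0):
    """Resample test segments with replacement and count how often
    system A's mean score exceeds system B's on the resampled set.
    A fraction near 1.0 is strong evidence that A is the better system."""
    rng = random.Random(seed)
    n = len(scores_a)
    wins = 0
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        if sum(scores_a[i] for i in idx) > sum(scores_b[i] for i in idx):
            wins += 1
    return wins / n_resamples

# Hypothetical segment-level quality scores for two MT systems:
sys_a = [0.70, 0.80, 0.60, 0.90, 0.75, 0.85]
sys_b = [0.60, 0.70, 0.65, 0.80, 0.70, 0.80]
print(paired_bootstrap(sys_a, sys_b))
```

Because the same segment indices are drawn for both systems, per-segment difficulty is held fixed and only the systems' relative quality drives the win count.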
1 code implementation • ACL 2021 • Johnny Tian-Zheng Wei, Robin Jia
Our analysis compares the adjusted error of metrics to humans and a derived, perfect segment-level annotator, both of which are unbiased estimators whose precision depends on the number of judgments collected.
no code implementations • 31 Jul 2019 • Johnny Tian-Zheng Wei
This document is intended for those validating existing metrics or proposing new ones in the broad context of NLG. We 1) begin with a write-up of best practices in validation studies, 2) outline how to adopt these practices, 3) conduct analyses on the WMT'17 metrics shared task\footnote{Our Jupyter notebook containing the analyses is available at \url{https://github.com}.}, 4) highlight promising approaches to NLG metrics, and 5) conclude with our opinions on the future of this area.
no code implementations • WS 2019 • Sarik Ghazarian, Johnny Tian-Zheng Wei, Aram Galstyan, Nanyun Peng
Despite advances in open-domain dialogue systems, automatic evaluation of such systems is still a challenging problem.
no code implementations • 6 Sep 2018 • Johnny Tian-Zheng Wei, Khiem Pham, Brian Dillon, Brendan O'Connor
We explore whether such output belongs to a formal and realistic grammar by employing the English Resource Grammar (ERG), a broad-coverage, linguistically precise, HPSG-based grammar of English.