no code implementations • 16 Feb 2024 • Johnny Tian-Zheng Wei, Ryan Yixiang Wang, Robin Jia
Detecting whether copyright holders' works were used in LLM pretraining is poised to be an important problem.
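A common baseline for this kind of pretraining-data detection is a loss-based membership heuristic such as min-k% prob: texts the model has seen tend to receive uniformly high token probabilities, so averaging the *lowest* log-probabilities separates members from non-members. The sketch below is illustrative only and is not necessarily the method proposed in this paper; the per-token log-probs are made up.

```python
def min_k_percent_score(token_logprobs, k=0.2):
    """Average the lowest k fraction of token log-probabilities.

    Very negative scores suggest the text is unfamiliar to the model;
    higher scores are weak evidence the text appeared in pretraining data.
    """
    n = max(1, int(len(token_logprobs) * k))
    lowest = sorted(token_logprobs)[:n]
    return sum(lowest) / n

# Hypothetical per-token log-probs from some LM (not real model output):
member_like = [-0.1, -0.3, -0.2, -0.5, -0.1]      # fluent, well-memorized
nonmember_like = [-2.0, -4.5, -3.1, -5.2, -2.8]   # surprising to the model
print(min_k_percent_score(member_like) > min_k_percent_score(nonmember_like))  # True
```

In practice the threshold separating members from non-members is calibrated on held-out data, and detection accuracy degrades as texts get shorter.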
no code implementations • 20 Oct 2022 • Johnny Tian-Zheng Wei, Tom Kocmi, Christian Federmann
In MT evaluation, pairwise comparisons are conducted to identify the better system.
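Pairwise comparisons like these are typically backed by a paired significance test over segment-level scores. The following paired-bootstrap sketch is illustrative only, not the protocol from the paper; the `paired_bootstrap` helper and the segment scores are hypothetical.

```python
import random

def paired_bootstrap(scores_a, scores_b, n_resamples=1000, seed=0):
    """Resample test segments with replacement and count how often
    system A's mean score exceeds system B's on the resampled set.
    A fraction near 1.0 is strong evidence that A is the better system."""
    rng = random.Random(seed)
    n = len(scores_a)
    wins = 0
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        if sum(scores_a[i] for i in idx) > sum(scores_b[i] for i in idx):
            wins += 1
    return wins / n_resamples

# Hypothetical segment-level quality scores for two MT systems:
sys_a = [0.70, 0.80, 0.60, 0.90, 0.75, 0.85]
sys_b = [0.60, 0.70, 0.65, 0.80, 0.70, 0.80]
print(paired_bootstrap(sys_a, sys_b))
```

Because the same segment indices are drawn for both systems, per-segment difficulty is held fixed and only the systems' relative quality drives the win count.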
1 code implementation • ACL 2021 • Johnny Tian-Zheng Wei, Robin Jia
Our analysis compares the adjusted error of metrics to humans and a derived, perfect segment-level annotator, both of which are unbiased estimators whose precision depends on the number of judgments collected.
no code implementations • 31 Jul 2019 • Johnny Tian-Zheng Wei
This document is intended for those validating existing metrics or proposing new ones in the broad context of NLG. We 1) begin with a write-up of best practices in validation studies, 2) outline how to adopt these practices, 3) conduct analyses on the WMT'17 metrics shared task\footnote{Our Jupyter notebook containing the analyses is available at \url{https://github.com}.}, 4) highlight promising approaches to NLG metrics, and 5) conclude with our opinions on the future of this area.
no code implementations • WS 2019 • Sarik Ghazarian, Johnny Tian-Zheng Wei, Aram Galstyan, Nanyun Peng
Despite advances in open-domain dialogue systems, automatic evaluation of such systems is still a challenging problem.
no code implementations • 6 Sep 2018 • Johnny Tian-Zheng Wei, Khiem Pham, Brian Dillon, Brendan O'Connor
We explore whether such output belongs to a formal and realistic grammar by employing the English Resource Grammar (ERG), a broad-coverage, linguistically precise, HPSG-based grammar of English.