no code implementations • 18 Dec 2022 • Chen Zhang, Luis Fernando D'Haro, Qiquan Zhang, Thomas Friedrichs, Haizhou Li
To tackle the multi-domain dialogue evaluation task, we propose a Panel of Experts (PoE), a multitask network that consists of a shared transformer encoder and a collection of lightweight adapters.
2 code implementations • 25 Oct 2022 • Chen Zhang, Luis Fernando D'Haro, Qiquan Zhang, Thomas Friedrichs, Haizhou Li
Recent model-based reference-free metrics for open-domain dialogue evaluation exhibit promising correlations with human judgment.
1 code implementation • 14 Dec 2021 • Chen Zhang, Luis Fernando D'Haro, Thomas Friedrichs, Haizhou Li
Chatbots are designed to carry out human-like conversations across different domains, such as general chit-chat, knowledge exchange, and persona-grounded conversations.
Ranked #1 on Dialogue Evaluation on USR-TopicalChat
no code implementations • 5 Oct 2021 • Chen Zhang, Luis Fernando D'Haro, Yiming Chen, Thomas Friedrichs, Haizhou Li
Yet, the impact of different Pr-LMs on the performance of automatic metrics is not well-understood.
1 code implementation • ACL 2021 • Chen Zhang, Yiming Chen, Luis Fernando D'Haro, Yan Zhang, Thomas Friedrichs, Grandee Lee, Haizhou Li
Effective evaluation metrics should reflect the dynamics of such interaction.