1 code implementation • 19 Mar 2024 • Zhixue Zhao, Nikolaos Aletras
Previous studies have explored how different factors affect faithfulness, mainly in the context of monolingual English models.
1 code implementation • 1 Feb 2024 • Zhixue Zhao, Boxuan Shan
Our method updates the token importance distribution in a recursive manner.
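The snippet above only hints at the mechanism, so here is a heavily hedged illustrative sketch (not the paper's actual algorithm): an importance distribution over tokens is refined recursively by masking random token subsets, measuring the drop in a model's predicted probability, and crediting the masked tokens. The `predict_proba` callable and all parameter names are hypothetical stand-ins for a real model.

```python
import numpy as np

def recursive_importance(predict_proba, tokens, n_steps=50, mask_frac=0.3,
                         rate=0.3, seed=0):
    """Illustrative sketch of a recursive token-importance update.

    predict_proba: hypothetical callable mapping a token list to the
    model's predicted-class probability (assumed for illustration).
    Returns a probability distribution over the input tokens.
    """
    rng = np.random.default_rng(seed)
    n = len(tokens)
    importance = np.full(n, 1.0 / n)          # start from a uniform distribution
    full_prob = predict_proba(tokens)
    for _ in range(n_steps):
        mask = rng.random(n) < mask_frac      # sample a subset of tokens to remove
        if not mask.any():
            continue
        kept = [t for t, m in zip(tokens, mask) if not m]
        drop = full_prob - predict_proba(kept)  # how much the prediction suffers
        # Credit the masked tokens in proportion to the observed drop,
        # then blend the credit into the running distribution.
        update = np.where(mask, max(drop, 0.0), 0.0)
        importance = (1 - rate) * importance + rate * update
        importance /= importance.sum()        # renormalize to a distribution
    return importance
```

With a toy model whose prediction depends on a single token, the distribution concentrates on that token after a few dozen steps, which is the qualitative behavior a recursive update of this kind aims for.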
1 code implementation • 15 Nov 2023 • George Chrysostomou, Zhixue Zhao, Miles Williams, Nikolaos Aletras
Despite the remarkable performance of generative large language models (LLMs) on abstractive summarization, they face two significant challenges: their considerable size and tendency to hallucinate.
1 code implementation • 17 May 2023 • Zhixue Zhao, Nikolaos Aletras
Widely used faithfulness metrics, such as sufficiency and comprehensiveness, use a hard erasure criterion, i.e., entirely removing or retaining the most important tokens (as ranked by a given FA method) and observing the resulting change in predictive likelihood.
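The hard erasure criterion described above can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical `predict_proba` callable that maps a token list to the model's predicted-class probability; it is not tied to any specific model or library.

```python
import numpy as np

def hard_erasure_metrics(predict_proba, tokens, importance, top_k):
    """Sufficiency and comprehensiveness via hard erasure.

    predict_proba: hypothetical callable returning the predicted-class
    probability for a token list (an assumption for illustration).
    importance: per-token scores from a feature attribution (FA) method.
    """
    full_prob = predict_proba(tokens)
    # Rank tokens by FA importance, highest first, and take the top-k.
    ranked = np.argsort(importance)[::-1]
    top = set(ranked[:top_k].tolist())
    kept_only = [t for i, t in enumerate(tokens) if i in top]
    removed = [t for i, t in enumerate(tokens) if i not in top]
    # Comprehensiveness: probability drop after removing the top tokens
    # (large drop = the FA found tokens the prediction relies on).
    comprehensiveness = full_prob - predict_proba(removed)
    # Sufficiency: probability drop when retaining only the top tokens
    # (small drop = the top tokens alone suffice for the prediction).
    sufficiency = full_prob - predict_proba(kept_only)
    return sufficiency, comprehensiveness
```

The "hard" aspect is that each token is either fully kept or fully removed; a softer criterion would instead perturb or down-weight tokens by degrees.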
1 code implementation • 17 Oct 2022 • Zhixue Zhao, George Chrysostomou, Kalina Bontcheva, Nikolaos Aletras
Explanation faithfulness of model predictions in natural language processing is typically evaluated on held-out data from the same temporal distribution as the training data (i.e., synchronous settings).
no code implementations • 6 Sep 2021 • Zhixue Zhao, Ziqi Zhang, Frank Hopfgartner
Toxic comment classification models are often found to be biased toward identity terms, i.e., terms that characterize a specific group of people, such as "Muslim" and "black".