FIB (Factual Inconsistency Benchmark)

Introduced by Tam et al. in Evaluating the Factual Consistency of Large Language Models Through News Summarization

Factual Inconsistency Benchmark (FIB) is a benchmark that focuses on the task of summarization. Specifically, the benchmark involves comparing the scores an LLM assigns to a factually consistent summary versus a factually inconsistent summary of an input news article. The factually consistent summaries are human-written reference summaries that were manually verified to be factually consistent.

Source: Evaluating the Factual Consistency of Large Language Models Through News Summarization
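
Below is a minimal sketch of the pairwise scoring protocol, assuming a causal LM accessed through Hugging Face transformers. The paper considers several scoring functions; for brevity this sketch uses length-normalized log-likelihood of the summary conditioned on the article. The model name, prompt format, and helper names (`summary_score`, `prefers_consistent`) are illustrative assumptions, not the benchmark's reference implementation.

```python
# Sketch of FIB-style pairwise scoring: does the model assign a higher
# score to the factually consistent summary than to the inconsistent one?
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; the paper evaluates many LLMs
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def summary_score(article: str, summary: str) -> float:
    """Length-normalized log-likelihood of `summary` given `article`."""
    prompt = f"{article}\nSummary: "  # illustrative prompt format
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    summary_ids = tokenizer(summary, return_tensors="pt").input_ids
    input_ids = torch.cat([prompt_ids, summary_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # Token log-probabilities, shifted so position i predicts token i+1.
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    targets = input_ids[:, 1:]
    token_lp = log_probs.gather(2, targets.unsqueeze(-1)).squeeze(-1)
    # Keep only the summary tokens, then average over their count.
    summary_lp = token_lp[:, prompt_ids.shape[1] - 1:]
    return summary_lp.mean().item()

def prefers_consistent(article: str, consistent: str, inconsistent: str) -> bool:
    """True if the model scores the factually consistent summary higher."""
    return summary_score(article, consistent) > summary_score(article, inconsistent)
```

Aggregated over the benchmark, the figure of interest is the proportion of article pairs for which the model scores the factually consistent summary higher.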
