Hallucination Pair-wise Detection (1-ref)

3 papers with code • 1 benchmark • 0 datasets

Benchmarks

These leaderboards are used to track progress in Hallucination Pair-wise Detection (1-ref).

Dataset	Best Model
FOIL	CLIP-S

Most implemented papers

CLIPScore: A Reference-free Evaluation Metric for Image Captioning

jmhessel/clipscore • EMNLP 2021

Image captioning has conventionally relied on reference-based automatic evaluations, where machine captions are compared against captions written by humans.

Mutual Information Divergence: A Unified Metric for Multimodal Generative Models

naver-ai/mid.metric • 25 May 2022

Following the recent trend of multimodal generative evaluations that exploit a vision-and-language pre-trained model, we propose the negative Gaussian cross-mutual information using the CLIP features as a unified metric, which we call Mutual Information Divergence (MID).

Self-contradictory Hallucinations of Large Language Models: Evaluation, Detection and Mitigation

eth-sri/chatprotect • 25 May 2023

Large language models (large LMs) are susceptible to producing text that contains hallucinated content.
