NeuralNews is a dataset for machine-generated news detection. It consists of human-generated and machine-generated articles. The human-generated articles are drawn from the GoodNews dataset, which was collected from the New York Times. NeuralNews contains four types of articles, defined by whether the article and its captions are real or generated:

  • Real Articles and Real Captions
  • Real Articles and Generated Captions
  • Generated Articles and Real Captions
  • Generated Articles and Generated Captions

The dataset contains about 32K samples of each article type, or about 128K samples in total.
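
The four article types above are the combinations of two sources (real or generated) for the article and for its captions. The following is a minimal sketch of that structure and of the approximate size arithmetic. The names `SOURCES`, `article_types`, and `approximate_total` are hypothetical and chosen for illustration; they are not part of any official NeuralNews loader, and the dataset's actual field names may differ.

```python
from itertools import product

# Hypothetical labels for illustration; not the dataset's actual field names.
SOURCES = ("Real", "Generated")

# "About 32K" samples per type, as stated in the description (approximate).
APPROX_SAMPLES_PER_TYPE = 32_000


def article_types():
    """Return the (article source, caption source) combinations."""
    return list(product(SOURCES, SOURCES))


def approximate_total(per_type=APPROX_SAMPLES_PER_TYPE):
    """Approximate dataset size: samples per type times number of types."""
    return per_type * len(article_types())


if __name__ == "__main__":
    for article, caption in article_types():
        print(f"{article} Articles and {caption} Captions")
    print(f"Approximate total samples: {approximate_total()}")
```

Because the counts are approximate, the computed total should be read as an estimate rather than an exact sample count.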

Source: Detecting Cross-Modal Inconsistency to Defend Against Neural Fake News


