BalitaNLP

BalitaNLP is a Filipino multi-modal language dataset for image-conditional language generation and text-conditional image generation. It consists of 351,755 Filipino news articles gathered from Filipino news outlets.

Each entry contains:

  • body - Article text
  • title - Article title
  • website - Name of the news outlet
  • category - News category given by the news outlet
  • date - Date published
  • author - Article author
  • url - URL of the article
  • img_url - URL of the article image
  • img_path - Filename of the image in the dataset
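
The field list above can be sketched as a typed record, which makes it easy to check that an entry carries all nine fields. This is a minimal illustration, not the dataset's official loader: the names `BalitaEntry`, `FIELDS`, and `missing_fields` are hypothetical, the sample values are placeholders, and treating every field (including `date`) as a string is an assumption, since the source does not state the storage format.

```python
from typing import TypedDict


class BalitaEntry(TypedDict):
    """One BalitaNLP entry. All types are assumed to be strings."""
    body: str       # Article text
    title: str      # Article title
    website: str    # Name of the news outlet
    category: str   # News category given by the news outlet
    date: str       # Date published (format not specified in the source)
    author: str     # Article author
    url: str        # URL of the article
    img_url: str    # URL of the article image
    img_path: str   # Filename of the image in the dataset


# Field names in the order they are documented.
FIELDS = tuple(BalitaEntry.__annotations__)


def missing_fields(record: dict) -> list[str]:
    """Return the documented field names that are absent from `record`."""
    return [name for name in FIELDS if name not in record]


# Placeholder record for illustration only; not real dataset content.
example: BalitaEntry = {
    "body": "placeholder body",
    "title": "placeholder title",
    "website": "placeholder outlet",
    "category": "placeholder category",
    "date": "placeholder date",
    "author": "placeholder author",
    "url": "https://example.com/article",
    "img_url": "https://example.com/image.jpg",
    "img_path": "image.jpg",
}

print(missing_fields(example))
print(missing_fields({"title": "only a title"}))
```

A check like this is useful before pairing `body` or `title` with the image at `img_path`, because an entry missing either side cannot be used for image-text generation.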

License


  • Unknown
