A Filipino multimodal language dataset for text+visual tasks. It consists of 351,755 Filipino news articles gathered from Filipino news outlets.
Each entry contains:
- body - Article text
- title - Article title
- website - Name of the news outlet
- category - News category given by the news outlet
- date - Date published
- author - Article author
- url - URL of the article
- img_url - URL of the article image
- img_path - Filename of the image in the dataset
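
As a rough illustration of how one entry could be handled in code, the sketch below models the fields listed above as a Python dataclass. It makes several assumptions that the description does not confirm: that every field can be treated as a string, that a missing field can be represented by an empty string, and that images sit in a local directory (the hypothetical `image_dir` argument) under the filename given in img_path. The names `Article`, `parse_entry`, and `image_file` are made up for this example, and the sample record is invented, not taken from the dataset.

```python
from dataclasses import dataclass, fields
from pathlib import Path


@dataclass
class Article:
    """One dataset entry; field names follow the list above."""
    body: str       # Article text
    title: str      # Article title
    website: str    # Name of the news outlet
    category: str   # News category given by the news outlet
    date: str       # Date published (kept as a string; the format is not specified)
    author: str     # Article author
    url: str        # URL of the article
    img_url: str    # URL of the article image
    img_path: str   # Filename of the image in the dataset


def parse_entry(raw: dict) -> Article:
    """Build an Article from a dict.

    Assumption: missing or null fields become empty strings.
    """
    return Article(**{f.name: str(raw.get(f.name) or "") for f in fields(Article)})


def image_file(article: Article, image_dir: str) -> Path:
    """Join a hypothetical local image directory with the entry's img_path."""
    return Path(image_dir) / article.img_path


# Invented record for illustration only.
sample = {
    "body": "Example body text.",
    "title": "Example title",
    "website": "example-outlet",
    "category": "news",
    "date": "2020-01-01",
    "url": "https://example.com/article",
    "img_url": "https://example.com/image.jpg",
    "img_path": "example.jpg",
}

article = parse_entry(sample)
print(article.title)
print(image_file(article, "images"))
```

Keeping the record as a typed object makes it easy to pair the text fields (title, body) with the image file for text+visual tasks; adjust the types once the actual storage format is known.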