Multimedia Goal-oriented Generative Script Learning Dataset

[Multimedia Goal-oriented Generative Script Learning Dataset](https://drive.google.com/file/d/1lSo-Kr4edNas0_uTl1SvDnEGuPYl0Or9/view?usp=sharing) 
The link above points to a dataset of multimedia steps in two categories: gardening and crafts. In total, the dataset contains 79,089 multimedia steps across 5,652 tasks.

The dataset is split into three sets: training, development, and testing. The gardening category has 20,258 training tasks, 2,428 development tasks, and 2,684 testing tasks. The crafts category has 32,082 training tasks, 4,064 development tasks, and 3,937 testing tasks. Each task is associated with a set of multimedia steps, which pair the step text with the corresponding step images.

The `*_data` folder contains the full dataset, which will be released after the paper is published. Each `*_data` folder includes three files: `train.json`, `valid.json`, and `test.json`. These files hold the training, development (validation), and testing splits, respectively.

Each file contains one JSON object per line (JSON Lines style). Each line represents a single instance and follows the schema described below:

```python
{
    "title":        # goal of the activity
    "method":       # subgoal of the activity
    "steps":        # list of step texts
    "captions":     # list of captions corresponding to the steps
    "target":       # text of the next step
    "img":          # image id of the last step
    "target_img":   # image id of the next step
    "retrieve":     # 20 retrieved relevant historical steps
    "retrieve_neg": # top-20 retrieved steps most similar to the last step; they serve as retrieve-negatives
}
```
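
As a rough illustration of the schema above, the following sketch reads one split file line by line and turns an instance into an (input text, target step) pair. The helper names `read_instances` and `to_example` and the `|` separator are hypothetical choices made for this example; they are not part of the dataset or of the authors' code.

```python
import json
from pathlib import Path
from typing import Dict, Iterator, Tuple


def read_instances(path: Path) -> Iterator[Dict]:
    """Yield one instance per non-empty line of a split file."""
    with path.open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines, e.g. a trailing newline
                yield json.loads(line)


def to_example(instance: Dict) -> Tuple[str, str]:
    """Join goal, subgoal, and step history into one input string.

    The " | " separator is an arbitrary choice for illustration.
    Returns (input text, target next-step text).
    """
    history = " ".join(instance["steps"])
    source = f'{instance["title"]} | {instance["method"]} | {history}'
    return source, instance["target"]
```

A call such as `read_instances(Path("gardening_data") / "train.json")` would iterate over the training instances, assuming the gardening folder follows the `*_data` naming pattern; the exact folder name is an assumption.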

The `img` subfolder of each `*_data` folder contains all images for that category (gardening or crafts) together with the corresponding wikiHow task JSON file.
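
The description does not say how the `img` and `target_img` ids map to files in the `img` subfolder. The sketch below assumes, purely for illustration, that an image id equals the file name without its extension, and builds a lookup table on that basis. The function name `index_images` and the set of image suffixes are hypothetical and may need adjusting once the released files can be inspected.

```python
from pathlib import Path
from typing import Dict

# Assumed image extensions; adjust to match the actual files.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def index_images(img_dir: Path) -> Dict[str, Path]:
    """Map each image file's stem to its path, searching img_dir recursively.

    Assumption: the stem (file name without extension) is the image id.
    Non-image files, such as the task JSON file, are skipped.
    """
    return {
        p.stem: p
        for p in sorted(img_dir.rglob("*"))
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    }
```

If the assumption holds, `index_images(img_dir).get(instance["img"])` would return the path of the last step's image, or `None` when no matching file exists.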
