VILT (Video Instructions Linking for Complex Tasks)

Introduced by Fischer et al. in VILT: Video Instructions Linking for Complex Tasks

VILT is a new benchmark collection of tasks and multimodal video content. The video linking collection includes annotations from 10 (recipe) tasks, which the annotators chose from a random subset of the collection of 2,275 high-quality 'Wholefoods' recipes. There are linking annotations for 61 query steps across these tasks which contain cooking techniques, chosen from the 189 total recipe steps. As each method results in approximately 10 videos to annotate, the collection consists of 831 linking judgments.

Source: VILT: Video Instructions Linking for Complex Tasks

Papers


Paper Code Results Date Stars

Dataset Loaders


No data loaders found. You can submit your data loader here.

Tasks


License


  • Unknown

Modalities


Languages