BURST is a benchmark suite built on TAO that requires tracking and segmenting multiple objects in camera video. The benchmark contains 6 sub-tasks, divided into 2 groups, that all share the same data for training, validation, and testing.

Class-guided
  1. Common: Track and segment all objects belonging to a set of 78 common classes (based on the COCO class set)
  2. Long-tail: Track and segment all objects belonging to an extended set of 482 object classes (based on the LVIS class set)
  3. Open-world: Methods are only allowed to use the annotations of the 78 common classes during training, but during inference they are expected to track and segment all 482 object classes (class label predictions are not required)
Exemplar-guided
  1. Mask: Track and segment all objects in the video for which the first-frame object masks are given. This task is identical to Video Object Segmentation (VOS).
  2. Box: Track and segment all objects in the video for which the first-frame object bounding-boxes are given.
  3. Point: Track and segment all objects in the video for which we are given only the (x,y) point coordinates of the mask centroid in the first frame in which each object appears.
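
The three exemplar types carry progressively less information about the same first-frame object: a full mask, a box, and a single point. As a minimal sketch of how they relate, the snippet below derives a box and a centroid point from a binary mask. The function names, the nested-list mask format, and the inclusive (x_min, y_min, x_max, y_max) box convention are assumptions made for illustration; they are not BURST's annotation format or tooling.

```python
from typing import List, Tuple

# Hypothetical in-memory mask: rows of 0/1 values (not BURST's on-disk encoding).
Mask = List[List[int]]


def foreground_pixels(mask: Mask) -> List[Tuple[int, int]]:
    """Return the (x, y) coordinates of all non-zero pixels."""
    return [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]


def mask_to_box(mask: Mask) -> Tuple[int, int, int, int]:
    """Tight box around the mask as inclusive (x_min, y_min, x_max, y_max)."""
    pixels = foreground_pixels(mask)
    if not pixels:
        raise ValueError("mask has no foreground pixels")
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    return min(xs), min(ys), max(xs), max(ys)


def mask_to_point(mask: Mask) -> Tuple[float, float]:
    """Mask centroid (x, y): the mean of the foreground pixel coordinates."""
    pixels = foreground_pixels(mask)
    if not pixels:
        raise ValueError("mask has no foreground pixels")
    n = len(pixels)
    return sum(x for x, _ in pixels) / n, sum(y for _, y in pixels) / n


# Toy first-frame mask: a 3x2 block of foreground pixels.
first_frame_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

box = mask_to_box(first_frame_mask)      # (1, 1, 3, 2)
point = mask_to_point(first_frame_mask)  # (2.0, 1.5)
```

Note that a plain coordinate mean can fall outside the object for non-convex shapes; how the benchmark handles such cases is not stated here.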

An illustration of the task hierarchy is given here, and a detailed explanation is given in Sec. 5 of the dataset paper.
