BURST is a benchmark suite built upon TAO that requires tracking and segmenting multiple objects in camera video. Its tasks fall into two groups.

Class-guided
  Common: Track and segment all objects belonging to a set of 78 common classes (based on the COCO class set).
  Long-tail: Track and segment all objects belonging to an extended set of 482 object classes.
  Open-world: Track and segment all objects belonging to all 482 object classes (class label predictions are not required).

Exemplar-guided
  Mask: Track and segment all objects in the video for which the first-frame object masks are given. This task is identical to Video Object Segmentation (VOS).
  Box: Track and segment all objects in the video for which the first-frame object bounding boxes are given.
  Point: Track and segment all objects in the video for which only the (x, y) coordinates of the mask centroid are given, in the first frame in which each object appears.
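
The three exemplar-guided tasks differ only in how much first-frame information is given: a full mask, a bounding box, or a single centroid point. As a rough illustration of how the weaker cues relate to the mask, the sketch below derives a tight box and a centroid from a binary mask. It assumes the mask is a plain nested list of 0/1 values; the names Mask, mask_to_box and mask_to_centroid are hypothetical and are not part of any BURST tooling, and the benchmark's own annotation format and box convention may differ.

```python
from typing import List, Tuple

# Hypothetical in-memory format: rows of 0/1 values, indexed as mask[y][x].
Mask = List[List[int]]


def foreground_pixels(mask: Mask) -> List[Tuple[int, int]]:
    """Collect (x, y) coordinates of all non-zero pixels."""
    return [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]


def mask_to_box(mask: Mask) -> Tuple[int, int, int, int]:
    """Tight box as (x_min, y_min, x_max, y_max), inclusive pixel indices."""
    pixels = foreground_pixels(mask)
    if not pixels:
        raise ValueError("mask has no foreground pixels")
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    return min(xs), min(ys), max(xs), max(ys)


def mask_to_centroid(mask: Mask) -> Tuple[float, float]:
    """Mean (x, y) position of the foreground pixels."""
    pixels = foreground_pixels(mask)
    if not pixels:
        raise ValueError("mask has no foreground pixels")
    n = len(pixels)
    return sum(x for x, _ in pixels) / n, sum(y for _, y in pixels) / n


if __name__ == "__main__":
    example = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
    ]
    print(mask_to_box(example))
    print(mask_to_centroid(example))
```

Note that for a non-convex object the centroid can fall outside the mask itself, which is one reason the point cue is weaker than the box or mask cue.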