Argoverse-HD

Introduced by Li et al. in Towards Streaming Perception

Argoverse-HD is a dataset built for streaming object detection, which encompasses real-time object detection, video object detection, tracking, and short-term forecasting. It contains the video data from Argoverse 1.1 with our own MS COCO-style bounding box annotations, extended with track IDs. The annotations are backward-compatible with COCO: models pre-trained on COCO can be evaluated directly on this dataset to estimate their efficiency or cross-dataset generalization capability. The dataset contains high-quality, temporally dense annotations for high-resolution videos (1920 × 1200 at 30 FPS). Overall, there are 70,000 image frames and 1.3 million bounding boxes.
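To make the "COCO-style annotations with track IDs" idea concrete, here is a minimal sketch of grouping boxes by track. It runs on a tiny hand-made sample, not on real Argoverse-HD data. The standard COCO keys (`images`, `categories`, `annotations`, `image_id`, `category_id`, `bbox`) are used as in COCO; the key name `track_id` is an assumption made for illustration and may differ in the released annotation files.

```python
from collections import defaultdict

# Hand-made sample in COCO layout (bbox = [x, y, width, height]).
# "track_id" is an ASSUMED key name for the per-box track ID.
sample = {
    "images": [
        {"id": 0, "width": 1920, "height": 1200},
        {"id": 1, "width": 1920, "height": 1200},
    ],
    "categories": [{"id": 1, "name": "person"}, {"id": 3, "name": "car"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 3,
         "bbox": [110.0, 200.0, 50.0, 40.0], "track_id": 7},
        {"id": 11, "image_id": 0, "category_id": 3,
         "bbox": [100.0, 200.0, 50.0, 40.0], "track_id": 7},
        {"id": 12, "image_id": 0, "category_id": 1,
         "bbox": [900.0, 500.0, 30.0, 80.0], "track_id": 9},
    ],
}


def boxes_by_track(coco):
    """Group (image_id, bbox) pairs by track ID, ordered by image ID."""
    tracks = defaultdict(list)
    for ann in coco["annotations"]:
        tracks[ann["track_id"]].append((ann["image_id"], ann["bbox"]))
    for boxes in tracks.values():
        boxes.sort(key=lambda item: item[0])  # assumes image IDs follow time
    return dict(tracks)


tracks = boxes_by_track(sample)
for track_id, boxes in sorted(tracks.items()):
    print(track_id, [image_id for image_id, _ in boxes])
```

Because the layout stays COCO-compatible, a tool that ignores the extra key would still read the boxes as ordinary per-frame detections.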

Argoverse-HD is the dataset used in the Streaming Perception Challenge, which includes two tracks:

  • Detection-only (real-time object detection). In this track, participants develop single-frame object detectors as they would for the COCO and LVIS challenges. The crucial distinction is that the evaluation accounts for latency by scoring streaming accuracy rather than offline accuracy alone.
  • Full-stack. In this track, the method is unrestricted. However, tracking and forecasting will most likely be used to compensate for the latency of the detectors.
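The way latency enters a streaming score can be sketched as follows: each ground-truth frame is compared against the most recent result the method had finished producing by that frame's timestamp, so a slow detector is scored on stale output. The snippet below shows only this pairing step under simplified assumptions. The function names and the timing numbers are hypothetical, and this is not the official toolkit.

```python
from bisect import bisect_right


def latest_finished_index(finish_times, query_time):
    """Index of the latest result finished at or before query_time.

    finish_times must be sorted in ascending order.
    Returns None if no result is available yet.
    """
    pos = bisect_right(finish_times, query_time)
    return pos - 1 if pos > 0 else None


def pair_frames_with_results(frame_times, finish_times):
    """For every frame timestamp, pick the result to be scored against it."""
    return [latest_finished_index(finish_times, t) for t in frame_times]


# Six frames of a 30 FPS video (timestamps in seconds).
frame_times = [i / 30 for i in range(6)]

# Hypothetical detector with 50 ms latency that can only process
# frames 0, 2 and 4; these are the times its three results become ready.
finish_times = [0.05, 0.1167, 0.1833]

print(pair_frames_with_results(frame_times, finish_times))
# → [None, None, 0, 0, 1, 1]
```

In this toy run the first two frames have no result to compare against, and the third result is never used because it finishes after the last frame. This is the gap that forecasting in the full-stack track is meant to close.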

By default, latency for all submissions is measured on a V100 GPU with the official toolkit.
