The GOT-10k dataset contains more than 10,000 video segments of real-world moving objects and over 1.5 million manually labelled bounding boxes. The dataset contains more than 560 classes of real-world moving objects and 80+ classes of motion patterns.
204 PAPERS • 2 BENCHMARKS
TrackingNet is a large-scale tracking dataset consisting of videos in the wild. It has a total of 30,643 videos split into 30,132 training videos and 511 testing videos, with an average of 470.9 frames per video.
182 PAPERS • 2 BENCHMARKS
YouTube-VOS is a Video Object Segmentation dataset that contains 4,453 videos: 3,471 for training, 474 for validation, and 508 for testing. The training and validation videos have pixel-level ground truth annotations for every 5th frame (6 fps). It also contains Instance Segmentation annotations. It has more than 7,800 unique objects, 190k high-quality manual annotations, and more than 340 minutes of video.
175 PAPERS • 10 BENCHMARKS
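The YouTube-VOS figures above can be checked with a little arithmetic. The sketch below is illustrative only: it assumes a 30 fps source video (implied by "every 5th frame (6 fps)" but not stated outright) and that annotation starts at frame index 0. The function names are hypothetical and are not part of any official YouTube-VOS tooling.

```python
# Hypothetical helpers; assumptions: 30 fps source video, annotations start at frame 0.

def annotated_frame_indices(num_frames: int, stride: int = 5) -> list[int]:
    """Return the indices of frames that carry annotations when every
    `stride`-th frame is labelled."""
    return list(range(0, num_frames, stride))


def annotation_rate(video_fps: float, stride: int = 5) -> float:
    """Effective annotation rate in frames per second."""
    return video_fps / stride


# Split sizes as quoted in the dataset description.
splits = {"train": 3471, "val": 474, "test": 508}
total_videos = sum(splits.values())

print(annotated_frame_indices(12))  # indices of labelled frames in a 12-frame clip
print(annotation_rate(30.0))        # annotation rate under the 30 fps assumption
print(total_videos)                 # should match the quoted total of 4,453
```

Under these assumptions the annotation rate works out to the quoted 6 fps, and the three splits sum to the quoted total.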
OTB-2015, also referred to as the Visual Tracker Benchmark, is a visual tracking dataset. It contains 100 commonly used video sequences for evaluating visual tracking. Image Source: http://cvlab.hanyang.ac.kr/tracker_benchmark/datasets.html
174 PAPERS • 1 BENCHMARK
VOT2018 is a dataset for visual object tracking. It consists of 60 challenging videos collected from real-life datasets.
124 PAPERS • 1 BENCHMARK
OTB2013 is the previous version of the current OTB2015 Visual Tracker Benchmark. It contains only 50 tracking sequences, as opposed to the 100 sequences in the current version of the benchmark.
110 PAPERS • 2 BENCHMARKS
The CDTB (color-and-depth visual object tracking) dataset is recorded by several passive and active RGB-D setups and contains indoor as well as outdoor sequences acquired in direct sunlight. The sequences were recorded to contain significant object pose change, clutter, occlusion, and periods of long-term target absence to enable tracker evaluation under realistic conditions. Sequences are per-frame annotated with 13 visual attributes for detailed analysis. It contains around 100,000 samples. Image Source: https://www.vicos.si/Projects/CDTB
16 PAPERS • NO BENCHMARKS YET
The dataset comprises 25 short sequences showing various objects in challenging backgrounds. Eight sequences are from the VOT2013 challenge (bolt, bicycle, david, diving, gymnastics, hand, sunshade, woman). The new sequences show complementary objects and backgrounds, for example a fish underwater or a surfer riding a big wave. The sequences were chosen from a large pool using a methodology based on clustering visual features of object and background, so that the 25 sequences sample the existing pool evenly.
12 PAPERS • 1 BENCHMARK
VOT2020 is a Visual Object Tracking benchmark for short-term tracking in RGB.
6 PAPERS • 1 BENCHMARK
VideoCube is a high-quality and large-scale benchmark to create a challenging real-world experimental environment for Global Instance Tracking (GIT). MGIT is a high-quality and multi-modal benchmark based on VideoCube-Tiny to fully represent the complex spatio-temporal and causal relationships coupled in longer narrative content.
5 PAPERS • NO BENCHMARKS YET
The evaluation of object detection models is usually performed by optimizing a single metric, e.g. mAP, on a fixed set of datasets, e.g. Microsoft COCO and Pascal VOC. Due to image retrieval and annotation costs, these datasets consist largely of images found on the web and do not represent many real-life domains that are being modelled in practice, e.g. satellite, microscopic and gaming, making it difficult to assess the degree of generalization learned by the model.
4 PAPERS • 1 BENCHMARK
VOT2019 is a Visual Object Tracking benchmark for short-term tracking in RGB.
AU-AIR is a multi-modal aerial dataset captured by a UAV. By combining visual data, object annotations, and flight data (time, GPS, altitude, IMU sensor data, velocities), AU-AIR brings together vision and robotics for UAVs.
2 PAPERS • NO BENCHMARKS YET
BioDrone is the first bionic drone-based single object tracking benchmark. It features videos captured from a flapping-wing UAV system, with major camera shake due to its aerodynamics. BioDrone highlights the tracking of tiny targets with drastic changes between consecutive frames, providing a new robust vision benchmark for SOT. Key features: (1) large-scale and high-quality benchmark with robust vision challenges; (2) rich challenging factor annotation; (3) videos from a bionic-based UAV; (4) tracking baselines with comprehensive experimental analyses.
1 PAPER • NO BENCHMARKS YET
SOTVerse is a user-defined task space of single object tracking. It allows users to customize SOT tasks according to their research purposes, which on the one hand makes research more targeted, and on the other hand can significantly improve the efficiency of research.