We introduce the problem of weakly supervised Multi-Object Tracking and Segmentation, i.e., joint weakly supervised instance segmentation and multi-object tracking, in which we do not provide any kind of mask annotation.
Existing data on building destruction in conflict zones rely on eyewitness reports or manual detection, which makes them generally scarce, incomplete, and potentially biased.
Training MOTSNet with our automatically extracted data leads to significantly improved sMOTSA scores on the novel KITTI MOTS dataset (+1.9%/+7.5% on cars/pedestrians), and MOTSNet improves by +4.1% over previously best methods on the MOTSChallenge dataset.
When neural networks process images that do not resemble the distribution seen during training, so-called out-of-distribution images, they often make wrong predictions and do so with too much confidence.
The major challenges of road detection are dealing with shadows, lighting variations, and the presence of other objects in the scene.