To demonstrate the applicability of the synthetic trajectory data, we show that an RNN-based prediction model solely trained on the generated data can outperform classic reference models on a real-world UAV tracking dataset.
By providing missing tokens, i.e., binary-encoded missing events, the model learns to in-attend to missing data and infers a complete trajectory conditioned on the remaining inputs.
To provide a full temporal filtering cycle, a basic RNN is extended to take observations and the associated belief about their accuracy into account when updating the current state.
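As a minimal sketch of what binary-encoded missing events could look like at the input level, the snippet below turns a track with gaps into fixed-size tokens. The function name `encode_with_missing_flags`, the `(x, y, flag)` layout, and the zero placeholder for missing positions are assumptions made for illustration; the text above does not specify the actual encoding.

```python
from typing import List, Optional, Tuple

# A 2D position, or None when the observation is missing (assumed convention).
Point = Optional[Tuple[float, float]]


def encode_with_missing_flags(track: List[Point]) -> List[Tuple[float, float, int]]:
    """Turn a track with gaps into tokens of the form (x, y, missing_flag).

    missing_flag is 1 when the observation is absent and 0 otherwise.
    Missing positions are filled with a 0.0 placeholder (an assumption),
    so that every token has the same size.
    """
    tokens = []
    for point in track:
        if point is None:
            tokens.append((0.0, 0.0, 1))
        else:
            tokens.append((point[0], point[1], 0))
    return tokens


# Example track in which the third observation is missing.
track = [(0.0, 0.0), (1.0, 0.5), None, (3.0, 1.5)]
tokens = encode_with_missing_flags(track)
```

With such an encoding, a sequence model receives an explicit signal about which time steps carry no measurement, rather than having to guess this from the placeholder values.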
The 3D reconstruction of the scene is computed with an image-based Structure-from-Motion (SfM) component that enables us to leverage a state estimator in the corresponding 3D scene during tracking.
Methods to quantify the complexity of trajectory datasets are still a missing piece in benchmarking human trajectory prediction models.
The analysis and quantification of sequence complexity is an open problem frequently encountered when defining trajectory prediction benchmarks.
The problem of varying dynamics of tracked objects, such as pedestrians, is traditionally tackled with approaches like the Interacting Multiple Model (IMM) filter using a Bayesian formulation.
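To illustrate how an IMM filter handles varying dynamics, the following sketch shows only the model probability update, not the full filter (the mixing, per-model filtering, and state combination steps are omitted). The function name `imm_mode_update` and the numeric values are hypothetical; the per-model measurement likelihoods are assumed to come from the individual filters.

```python
from typing import List


def imm_mode_update(
    mu: List[float],
    trans: List[List[float]],
    likelihoods: List[float],
) -> List[float]:
    """Update IMM model probabilities for one time step.

    mu:          prior model probabilities, one entry per motion model
    trans:       Markov transition matrix, trans[i][j] = P(model j | model i)
    likelihoods: measurement likelihood under each model's filter
    """
    n = len(mu)
    # Predicted model probabilities after the Markov transition.
    predicted = [sum(trans[i][j] * mu[i] for i in range(n)) for j in range(n)]
    # Bayes update: weight each predicted probability by its likelihood.
    unnormalized = [likelihoods[j] * predicted[j] for j in range(n)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Two hypothetical models, e.g. "constant velocity" and "maneuvering".
mu = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
likelihoods = [0.2, 0.8]  # the measurement fits the second model better
new_mu = imm_mode_update(mu, trans, likelihoods)
```

In this example the probability mass shifts towards the second model, which is how the filter adapts when the tracked object changes its motion pattern.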
In recent years, there has been a shift from modeling the tracking problem with a Bayesian formulation towards using deep neural networks.
Recurrent neural networks can learn complex long-term relationships from sequential data and output a probability density function (pdf) over the state space.
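One common way for a network to output a pdf is to predict the parameters of a distribution and train with the negative log-likelihood. The sketch below assumes a diagonal Gaussian parameterized by a mean and a log-variance per state dimension; this parameterization and the function name `gaussian_nll` are assumptions for illustration, since the text only states that a pdf is produced.

```python
import math
from typing import Sequence


def gaussian_nll(
    target: Sequence[float],
    mean: Sequence[float],
    log_var: Sequence[float],
) -> float:
    """Negative log-likelihood of `target` under a diagonal Gaussian.

    `mean` and `log_var` stand in for the values a network would output
    at one time step (hypothetical output layout).
    """
    nll = 0.0
    for t, m, lv in zip(target, mean, log_var):
        # Per-dimension Gaussian NLL: 0.5 * (log(2*pi) + log(var) + (t - m)^2 / var)
        nll += 0.5 * (math.log(2.0 * math.pi) + lv + (t - m) ** 2 / math.exp(lv))
    return nll


# A prediction centered on the target scores better than an offset one.
nll_good = gaussian_nll([1.0, 2.0], [1.0, 2.0], [0.0, 0.0])
nll_bad = gaussian_nll([1.0, 2.0], [3.0, 0.0], [0.0, 0.0])
```

Minimizing this quantity encourages the predicted mean to match the observed state while letting the predicted variance express how uncertain the model is.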