The dataset used in this work is the Multispectral Object Detection Dataset, where each scene is available in the far-infrared (FIR), mid-infrared (MIR) and near-infrared (NIR) as well as the visual spectrum.
To demonstrate the applicability of the synthetic trajectory data, we show that an RNN-based prediction model solely trained on the generated data can outperform classic reference models on a real-world UAV tracking dataset.
By providing missing tokens, i.e., binary-encoded missing events, the model learns to in-attend to missing data and infers a complete trajectory conditioned on the remaining inputs.
We present the current state of development of the sensor-equipped car MODISSA, with which Fraunhofer IOSB realizes a configurable experimental platform for hardware evaluation and software development in the context of mobile mapping and vehicle-related safety and protection.
To provide a full temporal filtering cycle, a basic RNN is extended to take observations and the associated belief about their accuracy into account when updating the current state.
The reconstruction of accurate three-dimensional environment models is one of the most fundamental goals in the field of photogrammetry.
We propose a framework that extends Blender to exploit Structure from Motion (SfM) and Multi-View Stereo (MVS) techniques for image-based modeling tasks such as sculpting or camera and motion tracking.
The 3D reconstruction of the scene is computed with an image-based Structure-from-Motion (SfM) component that enables us to leverage a state estimator in the corresponding 3D scene during tracking.
Methods to quantify the complexity of trajectory datasets are still a missing piece in benchmarking human trajectory prediction models.
The analysis and quantification of sequence complexity is an open problem frequently encountered when defining trajectory prediction benchmarks.
Therefore, such a learning-based system operating in a real-world environment must be aware of its own capabilities and limits, and must be able to distinguish between confident and unconfident inference results, especially if the sample cannot be explained by the underlying distribution.
Towards this end, a neural network model for continuous-time stochastic processes usable for sequence prediction is proposed.
In this paper, several variants of two-stream architectures for temporal action proposal generation in long, untrimmed videos are presented.
The problem of varying dynamics of tracked objects, such as pedestrians, is traditionally tackled with approaches like the Interacting Multiple Model (IMM) filter using a Bayesian formulation.
To demonstrate the applicability of SMIL, we fit the model to RGB-D sequences of freely moving infants and show, with a case study, that our method captures enough motion detail for General Movements Assessment (GMA), a method used in clinical practice for early detection of neurodevelopmental disorders in infants.
We apply Structure from Motion techniques to vehicle and background images to determine for each frame camera poses relative to vehicle instances and background structures.
We compute the object trajectory by combining object and background camera pose information.
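One way to read this combination step is as a composition of rigid transforms: for each frame, the camera pose relative to the background is inverted and chained with the camera pose relative to the object, which gives the object pose in background coordinates. The following is a minimal sketch under stated assumptions, not the authors' implementation. The function names (`make_pose`, `invert_pose`, `object_trajectory`) are hypothetical, poses are assumed to be 4x4 homogeneous matrices mapping object or background coordinates into the camera frame, and both reconstructions are assumed to share a consistent scale, which Structure from Motion alone does not guarantee.

```python
import numpy as np


def make_pose(rotation, translation):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector."""
    pose = np.eye(4)
    pose[:3, :3] = rotation
    pose[:3, 3] = translation
    return pose


def invert_pose(pose):
    """Invert a rigid transform using R^T and -R^T t (no general matrix inverse)."""
    rotation = pose[:3, :3]
    translation = pose[:3, 3]
    inverse = np.eye(4)
    inverse[:3, :3] = rotation.T
    inverse[:3, 3] = -rotation.T @ translation
    return inverse


def object_trajectory(cam_from_obj, cam_from_bg):
    """Return the object origin in background coordinates for every frame.

    cam_from_obj[i] maps object coordinates to camera coordinates in frame i.
    cam_from_bg[i] maps background coordinates to camera coordinates in frame i.
    Assumption: both pose sets are expressed at the same metric scale.
    """
    positions = []
    for pose_co, pose_cb in zip(cam_from_obj, cam_from_bg):
        # background <- camera <- object
        bg_from_obj = invert_pose(pose_cb) @ pose_co
        positions.append(bg_from_obj[:3, 3])
    return np.array(positions)


if __name__ == "__main__":
    # Camera rotated 90 degrees about z relative to the background.
    rot_z = np.array([[0.0, -1.0, 0.0],
                      [1.0, 0.0, 0.0],
                      [0.0, 0.0, 1.0]])
    cam_from_bg = [make_pose(rot_z, np.zeros(3))]
    cam_from_obj = [make_pose(np.eye(3), np.array([0.0, 1.0, 0.0]))]
    print(object_trajectory(cam_from_obj, cam_from_bg))
```

Keeping the rotation part of `bg_from_obj` as well would give the full object orientation per frame; the sketch only returns positions because a trajectory is what the sentence above refers to.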
In recent years, there has been a shift from modeling the tracking problem with a Bayesian formulation towards using deep neural networks.
Recurrent neural networks are able to learn complex long-term relationships from sequential data and output a probability density function (pdf) over the state space.
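A common way for a network to output a pdf is to let its last layer emit the parameters of a parametric density. The sketch below illustrates this for a bivariate Gaussian over a 2D position and is an assumption for illustration only, not necessarily the parameterization used in this work. The names `gaussian_params` and `gaussian_nll` are hypothetical, and the five raw values stand in for the output of an RNN at one time step.

```python
import numpy as np


def gaussian_params(raw):
    """Map five unconstrained outputs to the mean and covariance of a 2D Gaussian.

    raw[0:2] -> mean, raw[2:4] -> log standard deviations, raw[4] -> correlation logit.
    """
    mean = raw[:2]
    sigma = np.exp(raw[2:4])   # exp keeps the standard deviations positive
    rho = np.tanh(raw[4])      # tanh keeps the correlation inside (-1, 1)
    off_diag = rho * sigma[0] * sigma[1]
    cov = np.array([[sigma[0] ** 2, off_diag],
                    [off_diag, sigma[1] ** 2]])
    return mean, cov


def gaussian_nll(x, mean, cov):
    """Negative log-likelihood of x under a multivariate Gaussian N(mean, cov)."""
    diff = x - mean
    dim = len(mean)
    _, log_det = np.linalg.slogdet(cov)
    mahalanobis = diff @ np.linalg.solve(cov, diff)
    return 0.5 * (dim * np.log(2.0 * np.pi) + log_det + mahalanobis)


if __name__ == "__main__":
    raw_output = np.array([0.5, -0.2, 0.0, 0.0, 0.3])  # stand-in for an RNN output
    mean, cov = gaussian_params(raw_output)
    print(gaussian_nll(np.array([0.4, 0.0]), mean, cov))
```

Minimizing this negative log-likelihood over observed next positions is the usual training objective for such a density output.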
We apply Structure from Motion techniques to object and background images to determine for each frame camera poses relative to object instances and background structures.
The evaluation shows that our tracking approach is able to track objects with high relative motions.