We first propose the Fusion Transformer, an attention-based model for multimodal and multi-sensor fusion.
In the second stage, the generative model serves both as a reconstruction prior and as the search manifold for the sensor fusion tasks.
This further reduces the semantic gap between feature channels at different layers.
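The abstract does not specify the Fusion Transformer's internals, but the core mechanism it names, attention-based fusion of multiple sensor streams, can be illustrated with a minimal sketch. The helper below is a hypothetical illustration, not the paper's architecture: one modality supplies the queries, the other supplies keys and values, and scaled dot-product cross-attention produces fused features. The `radar`/`wifi` names and dimensions are assumptions for the example only.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fuse(tokens_a, tokens_b):
    """Fuse modality B into modality A via scaled dot-product cross-attention.

    tokens_a: (Na, d) feature tokens from sensor A (queries)
    tokens_b: (Nb, d) feature tokens from sensor B (keys/values)
    returns:  (Na, d) fused features, one per query token
    """
    d = tokens_a.shape[-1]
    scores = tokens_a @ tokens_b.T / np.sqrt(d)   # (Na, Nb) similarities
    weights = softmax(scores, axis=-1)            # each A-token attends over B's tokens
    return weights @ tokens_b                     # convex combination of B's tokens

rng = np.random.default_rng(0)
radar = rng.normal(size=(6, 16))   # hypothetical radar feature tokens
wifi = rng.normal(size=(9, 16))    # hypothetical WiFi CSI feature tokens
fused = cross_attention_fuse(radar, wifi)
print(fused.shape)  # (6, 16)
```

In a full transformer this block would be repeated with learned query/key/value projections and multiple heads; the sketch keeps only the attention arithmetic that makes the fusion modality-agnostic.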
1 code implementation • 8 Oct 2021 • Mohammud J. Bocus, Wenda Li, Shelly Vishwakarma, Roget Kou, Chong Tang, Karl Woodbridge, Ian Craddock, Ryan McConville, Raul Santos-Rodriguez, Kevin Chetty, Robert Piechocki
This dataset can be exploited to advance WiFi and vision-based HAR, for example, using pattern recognition, skeletal representation, deep learning algorithms or other novel approaches to accurately recognize human activities.
Traditional approaches to human activity recognition rely on wearable sensors or cameras.
In this paper, we introduce a novel suspect-and-investigate framework, which can be easily embedded in a drone for automated parking violation detection (PVD).
The experimental results demonstrate that, firstly, the transformed disparity (or inverse depth) images are more informative; secondly, AA-UNet and AA-RTFNet, our best-performing implementations, outperform all other state-of-the-art single-modal and data-fusion networks, respectively, for road pothole detection; and finally, the training-set augmentation technique based on adversarial domain adaptation not only improves the accuracy of state-of-the-art semantic segmentation networks but also accelerates their convergence.
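Why a transformed disparity image is "more informative" can be illustrated with a toy sketch. For a planar road, disparity (inverse depth) varies roughly linearly with image row, so subtracting a fitted linear road model leaves a residual that is near zero on the road and strongly negative in a pothole (locally greater depth, hence smaller disparity). This is only an assumed simplification of the general idea, not the paper's transformation algorithm; the synthetic disparity map below is fabricated for the demonstration.

```python
import numpy as np

def flatten_road_disparity(disp):
    """Subtract a per-row linear road model from a disparity map so that
    deviations from the road surface (e.g. potholes) stand out.

    disp: (H, W) disparity image (inverse depth up to a scale factor).
    """
    H, _ = disp.shape
    rows = np.arange(H, dtype=float)
    row_mean = disp.mean(axis=1)            # average disparity per image row
    # least-squares fit: disparity ~ a*row + b, the planar-road assumption
    a, b = np.polyfit(rows, row_mean, 1)
    road = (a * rows + b)[:, None]          # modelled road disparity per row
    return disp - road                      # residual: ~0 on road, negative in holes

# synthetic example: linear road ramp plus one pothole region
H, W = 64, 48
rows = np.arange(H, dtype=float)[:, None]
disp = 0.5 * rows + 10.0 + np.zeros((H, W))
disp[30:40, 20:30] -= 5.0                   # pothole: smaller disparity than the road
residual = flatten_road_disparity(disp)
```

After flattening, `residual` is close to zero on the road and around -5 inside the pothole, which is the kind of contrast a segmentation network can exploit far more easily than the raw ramp.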