Camera sensors are increasingly being combined with machine learning to perform various tasks such as intelligent surveillance.
We find that our framework can generate suitable cover art for most genres, and that the visual features adapt themselves to audio feature changes.
Deploying machine learning applications on edge devices can bring clear benefits such as improved reliability, latency, and privacy, but it also introduces its own set of challenges.
Automated anomaly detection in surveillance videos has attracted much interest as it provides a scalable alternative to manual monitoring.
For this, we train a variational autoencoder on high-quality face images from a publicly available dataset and use the reconstruction probability as a metric to estimate the quality of each face crop.
Automating the analysis of surveillance video footage is of great interest when urban environments or industrial sites are monitored by a large number of cameras.
In this work, we explore the generalization characteristics of unsupervised representation learning by leveraging disentangled VAEs to learn a useful latent space on a set of relational reasoning problems derived from Raven's Progressive Matrices.
Deep neural networks require large amounts of resources, which makes them hard to use on resource-constrained devices such as Internet-of-Things devices.
Deep residual networks (ResNets) made a recent breakthrough in deep learning.
Binary neural networks are attractive in this case because the logical operations are very fast and efficient when implemented in hardware.
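As a minimal sketch of why logical operations suffice, the dot product of two vectors with entries in {-1, +1} can be computed with one XOR and one bit count once the signs are packed into machine words. The helper names `pack_signs` and `binary_dot` are hypothetical and chosen only for illustration; the bit encoding (1 for +1, 0 for -1) is an assumption, not something the text specifies.

```python
def pack_signs(values):
    """Pack a sequence of +1/-1 values into an int; bit i is set when values[i] is +1."""
    bits = 0
    for i, v in enumerate(values):
        if v > 0:
            bits |= 1 << i
    return bits


def binary_dot(a_bits, b_bits, n):
    """Dot product of two length-n {-1, +1} vectors given in packed form.

    Positions where the bits differ contribute -1 and matching positions
    contribute +1, so the result is n - 2 * (number of differing bits).
    """
    mismatches = bin(a_bits ^ b_bits).count("1")
    return n - 2 * mismatches


a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
result = binary_dot(pack_signs(a), pack_signs(b), len(a))
reference = sum(x * y for x, y in zip(a, b))
```

Here `result` equals `reference`, the ordinary multiply-and-add dot product. In hardware, the XOR and bit count map to single cheap operations per word.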
However, when learning a task using reinforcement learning, the agent cannot distinguish the characteristics of the environment from those of the task.
In this paper, we propose a technique that avoids the evaluation of certain convolutional filters in a deep neural network.
We present four training and prediction schedules from the same character-level recurrent neural network.