Additionally, due to crowdedness and occlusion in the videos, aligning the identity of runners across multiple disjoint cameras is a challenge.
We present the first edition of "VIPriors: Visual Inductive Priors for Data-Efficient Deep Learning" challenges.
We make the observation that pruning weights adds the value 0 as an additional symbol and thus increases the information capacity of the network.
We introduce a refined and efficient real-time rPPG pipeline with novel filtering and motion suppression that not only estimates heart rates, but also extracts the pulse waveform to time heart beats and measure heart rate variability.
This layer is shown to minimize a penalized term of the Wasserstein distance between the learned continuous image features and the optimal half-half bit distribution.
1 code implementation • 1 Dec 2020 • Burak Yildiz, Hayley Hung, Jesse H. Krijthe, Cynthia C. S. Liem, Marco Loog, Gosia Migut, Frans Oliehoek, Annibale Panichella, Przemyslaw Pawelczak, Stjepan Picek, Mathijs de Weerdt, Jan van Gemert
We present ReproducedPapers. org: an open online repository for teaching and structuring machine learning reproducibility.
Current weakly supervised object localization and segmentation rely on class-discriminative visualization techniques to generate pseudo-labels for pixel-level training.
In this work, we define DIRs employed by existing works in probabilistic terms and show that by learning DIRs, overly strict requirements are imposed concerning the invariance.
Such methods are less stable than BN as they critically depend on the statistics of a single input sample.
Our study investigates the subjective human factor in comparisons of state of the art results and scientific reproducibility in deep learning.
In this paper, we run two methods of explanation, namely LIME and Grad-CAM, on a convolutional neural network trained to label images with the LEGO bricks that are visible in them.
Our results confirm the problems of the previous evaluation protocols, and suggest that an IA-based protocol is more adequate to the online scenario.
The problem of Online Human Behaviour Recognition in untrimmed videos, aka Online Action Detection (OAD), needs to be revisited.
Different from conventional VPR settings where the query images and gallery images come from the same domain, we propose a more common but challenging setup where the query images are collected under a new unseen condition.
To this end, we identify two methods for runner identification at different points of the event, for determining their trajectory.
Remote photo-plethysmography (rPPG) uses a remotely placed camera to estimating a person's heart rate (HR).
(ii) A pretrained semantic segmentation model is used to label objects in pixel level, and then we introduce statistical measures to quantitatively evaluate the interpretability of discriminate objects.
We propose ViDeNN: a CNN for Video Denoising without prior knowledge on the noise distribution (blind denoising).
Ranked #2 on Color Image Denoising on CBSD68 sigma35
To facilitate this, we propose a novel global pooling technique called Spatial Pyramid Averaged Max (SPAM) pooling for training this CAM-based network for object extent localisation with only weak image-level supervision.
There is an inherent need for autonomous cars, drones, and other robots to have a notion of how their environment behaves and to anticipate changes in the near future.
First, inspired by selective search for object proposals, we introduce an approach to generate action proposals from spatiotemporal super-voxels in an unsupervised manner, we call them Tubelets.
Our approach significantly outperforms the state-of-the-art on both datasets, while restricting the search of actions to a fraction of possible bounding box sequences.