Modern Earth observation satellites capture multi-exposure bursts of push-frame images that can be super-resolved via computational means.
In this paper, we propose a study aiming to determine which is the best approach to train denoising networks for real raw videos: supervision on synthetic realistic data or self-supervision on real data.
We argue that in doing so, the challenge ranks the proposed methods not only by their MISR performance, but mainly by the heuristics used to guess which image in the series is the most similar to the high-resolution target.
We propose a self-supervised approach for training multi-frame video denoising networks.
VBM3D is an extension to video of the well known image denoising algorithm BM3D, which takes advantage of the sparse representation of stacks of similar patches in a transform domain.
Due to the unavailability of ground truth data these networks cannot be currently trained using real RAW images.
Modeling the processing chain that has produced a video is a difficult reverse engineering task, even when the camera is available.