Pre-training video transformers on very large-scale datasets is generally required to achieve top performance on relatively small datasets.
We present a method that decomposes, or "unwraps", an input video into a set of layered 2D atlases, each providing a unified representation of the appearance of an object (or background) over the video.
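A minimal sketch of what such a layered-atlas decomposition can look like, assuming MLP-based mappings from video pixels to 2D atlas coordinates and a two-layer (foreground/background) split; the class and function names below are hypothetical illustrations of the general idea, not the paper's exact model:

```python
import torch
import torch.nn as nn

def mlp(din, dout, width=64, depth=3):
    """Small fully connected network used for all mappings in this toy model."""
    layers, d = [], din
    for _ in range(depth - 1):
        layers += [nn.Linear(d, width), nn.ReLU()]
        d = width
    layers += [nn.Linear(d, dout)]
    return nn.Sequential(*layers)

class TwoLayerAtlasModel(nn.Module):
    """Toy two-layer (foreground/background) atlas decomposition of a video."""
    def __init__(self):
        super().__init__()
        self.map_fg = mlp(3, 2)    # (x, y, t) -> foreground atlas coords (u, v)
        self.map_bg = mlp(3, 2)    # (x, y, t) -> background atlas coords (u, v)
        self.atlas_fg = mlp(2, 3)  # (u, v) -> RGB stored in the foreground atlas
        self.atlas_bg = mlp(2, 3)  # (u, v) -> RGB stored in the background atlas
        self.alpha = mlp(3, 1)     # (x, y, t) -> foreground opacity

    def forward(self, xyt):
        rgb_fg = torch.sigmoid(self.atlas_fg(torch.tanh(self.map_fg(xyt))))
        rgb_bg = torch.sigmoid(self.atlas_bg(torch.tanh(self.map_bg(xyt))))
        a = torch.sigmoid(self.alpha(xyt))
        return a * rgb_fg + (1 - a) * rgb_bg  # composited pixel color

# Training would minimize reconstruction error against the video's own pixels,
# plus regularizers that keep the atlases consistent and editable.
model = TwoLayerAtlasModel()
colors = model(torch.rand(1024, 3))  # 1024 random (x, y, t) samples
print(colors.shape)                  # torch.Size([1024, 3])
```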
Moreover, on the seemingly implausible ×16 interpolation task, our method outperforms existing methods by more than 1.5 dB in terms of PSNR.
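For context, PSNR is computed from the mean squared error between a reconstructed frame and its reference; a minimal sketch, assuming 8-bit frames with a peak value of 255:

```python
import numpy as np

def psnr(reference: np.ndarray, reconstruction: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two frames of equal shape."""
    mse = np.mean((reference.astype(np.float64) - reconstruction.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10(peak ** 2 / mse)

# A gain of 1.5 dB corresponds to roughly a 29% reduction in MSE, since
# 10 * log10(mse_old / mse_new) = 1.5  =>  mse_new ≈ 0.708 * mse_old.
```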
In this paper, we propose a novel encoder-decoder neural network model referred to as DeepBinaryMask for video compressive sensing.
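A short sketch of the standard mask-based temporal compressive sensing measurement that such encoder-decoder models are typically trained to invert; the random masks and array names here are illustrative assumptions, not the paper's learned masks:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy video: T frames of size H x W.
T, H, W = 8, 64, 64
video = rng.random((T, H, W))

# One binary mask per frame; in learned approaches the mask pattern is the
# quantity being optimized (here it is simply random for illustration).
masks = rng.integers(0, 2, size=(T, H, W)).astype(np.float64)

# A single compressive measurement: the mask-modulated frames summed over time.
measurement = (masks * video).sum(axis=0)  # shape (H, W)

# A decoder network would then be trained to recover the T frames from
# `measurement` (and, for a learned mask, jointly with `masks`).
print(measurement.shape)
```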
Based on abundant event data alongside low-frame-rate, easily blurred images, we propose a simple yet effective approach to reconstruct high-quality, high-frame-rate sharp videos.
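One common way to relate a blurred frame to its latent sharp frames through accumulated events is the event-based double integral relation; the sketch below illustrates that relation on toy data and is an assumption about the general setting, not necessarily this paper's exact formulation:

```python
import numpy as np

def latent_frames_from_blur(blurred, event_sums, c=0.2):
    """Recover latent frames from one blurred frame and accumulated events.

    blurred    : (H, W) blurred frame, modeled as the temporal mean of T latent frames
    event_sums : (T, H, W) net event polarity accumulated from the reference time
                 to each latent frame's timestamp
    c          : event contrast threshold (a sensor-dependent assumption)
    """
    # Each latent frame relates to the unknown reference frame by
    # L_t = L_ref * exp(c * E_t); averaging over t and solving for L_ref:
    ratios = np.exp(c * event_sums)        # (T, H, W)
    l_ref = blurred / ratios.mean(axis=0)  # reference latent frame
    return l_ref[None] * ratios            # all T latent frames

# Toy usage with purely synthetic values.
T, H, W = 5, 32, 32
events = np.random.default_rng(1).normal(0, 1, (T, H, W)).round()
latents = latent_frames_from_blur(np.ones((H, W)), events)
print(latents.shape)  # (5, 32, 32)
```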