Recently, multi-scale autoregressive models have been proposed to address this limitation.
One possible solution is to adapt current human-targeted image and video coding standards to the use case of machine consumption.
Over recent years, deep learning-based computer vision systems have been applied to images at an ever-increasing pace, oftentimes representing the only type of consumption for those images.
In a second phase, the Model-Agnostic Meta-Learning (MAML) approach is adapted to the specific case of image compression, where the inner loop performs latent tensor overfitting, and the outer loop updates both the encoder and decoder neural networks based on the overfitting performance.
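As a rough illustration of the inner/outer loop structure described above, the following toy sketch uses a scalar linear encoder and decoder. Everything in it is an assumption made for illustration rather than the authors' implementation: the names `overfit_latent`, `adapted_loss` and `meta_train`, the squared-error distortion with no rate term, the learning rates, and the first-order approximation of the outer-loop gradient.

```python
def overfit_latent(z, x, d, lr=0.1, steps=3):
    """Inner loop: refine the latent z for one sample x; decoder weight d is frozen."""
    for _ in range(steps):
        grad_z = 2.0 * d * (d * z - x)  # derivative of (d*z - x)^2 with respect to z
        z -= lr * grad_z
    return z


def adapted_loss(x, e, d):
    """Distortion after encoding x, overfitting the latent, and decoding."""
    z = overfit_latent(e * x, x, d)
    return (d * z - x) ** 2


def meta_train(samples, e=0.5, d=0.5, meta_lr=0.05, epochs=200):
    """Outer loop: update encoder weight e and decoder weight d
    from the loss measured after latent overfitting."""
    for _ in range(epochs):
        for x in samples:
            z = overfit_latent(e * x, x, d)
            err = d * z - x
            grad_d = 2.0 * err * z      # treats the refined latent as a constant
            grad_e = 2.0 * err * d * x  # first-order: dz/de approximated by dz0/de = x
            d -= meta_lr * grad_d
            e -= meta_lr * grad_e
    return e, d


samples = [0.5, 1.0]
before = sum(adapted_loss(x, 0.5, 0.5) for x in samples)
e, d = meta_train(samples)
after = sum(adapted_loss(x, e, d) for x in samples)
print(f"adapted loss before: {before:.4f}, after: {after:.6f}")
```

The point of the sketch is only the division of labour: the inner loop changes the latent for one input while the networks stay fixed, and the outer loop changes the networks so that the overfitted latent yields a lower loss.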
We present an efficient finetuning methodology for neural-network filters which are applied as a postprocessing artifact-removal step in video coding pipelines.
One of the core components of conventional (i.e., non-learned) video codecs consists of predicting a frame from a previously-decoded frame, by leveraging temporal correlations.
In this manuscript we propose two objective terms for neural image compression: a compression objective and a cycle loss.
In this paper, we present a novel approach for fine-tuning a decoder-side neural network in the context of image compression, such that the weight updates are more compressible.
In this work, we propose an end-to-end block-based auto-encoder system for image compression.
We introduce a high-resolution equirectangular panorama (360-degree, virtual reality) dataset for object detection and propose a multi-projection variant of the YOLO detector.
Depth information provides a strong cue for occlusion detection and handling, but has been largely omitted in generic object tracking until recently due to a lack of suitable benchmark datasets and applications.
In this work, we propose an improvement over DCF-based trackers by combining saliency-based and other feature-based filter responses.
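One simple way to picture combining two filter responses is a weighted sum of the response maps followed by a peak search. The sketch below assumes exactly that; the fusion rule, the weight, the function names `fuse_responses` and `peak_location`, and the toy maps are all hypothetical and are not taken from the work itself.

```python
def fuse_responses(resp_a, resp_b, weight_a=0.5):
    """Weighted sum of two equally sized 2-D response maps (lists of lists)."""
    weight_b = 1.0 - weight_a
    return [[weight_a * a + weight_b * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(resp_a, resp_b)]


def peak_location(response):
    """Return (row, col) of the maximum response, i.e. the estimated target position."""
    best = max((val, r, c)
               for r, row in enumerate(response)
               for c, val in enumerate(row))
    return best[1], best[2]


# Toy 3x3 response maps (made-up values).
feature_response = [[0.1, 0.6, 0.1],
                    [0.1, 0.5, 0.2],
                    [0.0, 0.1, 0.1]]
saliency_response = [[0.0, 0.1, 0.0],
                     [0.1, 0.9, 0.1],
                     [0.0, 0.1, 0.0]]

fused = fuse_responses(feature_response, saliency_response, weight_a=0.5)
print(peak_location(feature_response), peak_location(fused))
```

In this toy case the feature-only map peaks in the top row, while the fused map peaks at the centre, where the saliency response is strong.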
We show the effect of l2 normalization on anomaly detection accuracy.
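For readers unfamiliar with the operation, l2 normalization rescales a feature vector to unit Euclidean length, so a distance-based score then depends on direction rather than magnitude. The sketch below illustrates this with a hypothetical distance-to-centre anomaly score; the scoring rule, the function names and the sample vectors are assumptions for illustration, not the setup used in the work.

```python
import math


def l2_normalize(vec, eps=1e-12):
    """Scale vec to unit Euclidean length; eps guards against division by zero."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / max(norm, eps) for v in vec]


def anomaly_score(sample, center):
    """Euclidean distance from sample to a reference centre (hypothetical scoring rule)."""
    return math.sqrt(sum((s - c) ** 2 for s, c in zip(sample, center)))


center = l2_normalize([1.0, 1.0])
same_direction = [10.0, 10.0]   # large magnitude, same direction as the centre
other_direction = [1.0, -1.0]   # small magnitude, different direction

raw_same = anomaly_score(same_direction, [1.0, 1.0])
norm_same = anomaly_score(l2_normalize(same_direction), center)
norm_other = anomaly_score(l2_normalize(other_direction), center)
print(raw_same, norm_same, norm_other)
```

Without normalization the large-magnitude sample looks far from the centre; after normalization it coincides with the centre, and only the sample pointing in a different direction receives a high score.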
To gain an in-depth theoretical understanding, in this manuscript we investigate the graph degree from the spectral graph clustering and kernel-based points of view and draw connections to a recent kernel method for the two-sample problem.
By using these encoded images, we train a memory-efficient network using only 0.048% of the number of parameters that other deep salient object detection networks have.