Standard video codecs rely on optical flow to guide inter-frame prediction: pixels from reference frames are moved via motion vectors to predict target video frames.
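To make the prediction step concrete, the following is a minimal sketch of motion-compensated prediction, assuming integer-valued, block-wise motion vectors and border clamping. The function name `motion_compensate`, the block layout, and the toy frame are hypothetical illustrations, not taken from any particular codec.

```python
def motion_compensate(reference, motion_vectors, block_size):
    """Predict a frame by copying displaced pixels from a reference frame.

    reference: 2D list of pixel values (height x width).
    motion_vectors: dict mapping (block_row, block_col) -> (dy, dx), the
        offset from a target pixel to its source pixel in the reference.
        Blocks without an entry are assumed to have zero motion.
    block_size: side length of the square blocks (an assumption of this sketch).
    """
    height, width = len(reference), len(reference[0])
    predicted = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            block = (y // block_size, x // block_size)
            dy, dx = motion_vectors.get(block, (0, 0))
            # Clamp source coordinates so vectors that point outside the
            # frame reuse the nearest border pixel.
            src_y = min(max(y + dy, 0), height - 1)
            src_x = min(max(x + dx, 0), width - 1)
            predicted[y][x] = reference[src_y][src_x]
    return predicted


reference = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
# Hypothetical vectors: the top-left 2x2 block is fetched from one pixel to the right.
vectors = {(0, 0): (0, 1)}
predicted = motion_compensate(reference, vectors, block_size=2)
```

In this toy case only the top-left block changes; a codec would then typically encode the difference between the true target frame and `predicted` rather than the target frame itself.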
In this work, we focus on precise 3D track state estimation and propose a learning-based approach for object-centric relative motion estimation of partially observed objects.
Effectively fusing temporal information from consecutive frames plays an important role in video super-resolution (SR), yet most previous fusion strategies either fail to fully utilize temporal information or are too slow.
We present collections of images of the same rotating plastic object captured in the X-ray and visible spectra.
Our tracker achieves leading performance on OTB2013, OTB2015, VOT2015, VOT2016 and LaSOT while operating at a real-time speed of 26 FPS, which indicates that our method is both effective and practical.
Current image translation methods, although effective at producing high-quality results in various applications, still give little consideration to geometric transformations.