Disentangling Physical Dynamics from Unknown Factors for Unsupervised Video Prediction

CVPR 2020  ·  Vincent Le Guen, Nicolas Thome

Leveraging physical knowledge described by partial differential equations (PDEs) is an appealing way to improve unsupervised video prediction methods. Since physics is too restrictive for describing the full visual content of generic videos, we introduce PhyDNet, a two-branch deep architecture that explicitly disentangles PDE dynamics from unknown complementary information. A second contribution is a new recurrent physical cell (PhyCell), inspired by data assimilation techniques, for performing PDE-constrained prediction in latent space. Extensive experiments conducted on four datasets show that PhyDNet outperforms state-of-the-art methods. Ablation studies also highlight the important gain brought by both disentanglement and PDE-constrained prediction. Finally, we show that PhyDNet has useful properties for dealing with missing data and long-term forecasting.
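
The two-branch idea in the abstract can be illustrated with a toy sketch. This is not the authors' PhyDNet or PhyCell: the heat-equation Laplacian, the fixed scalar `gain`, the `decay` placeholder for the non-physical branch, and all function names are assumptions chosen only to show a predict-then-correct step (in the spirit of data assimilation) running alongside a separate residual state.

```python
import numpy as np

def laplacian(h):
    """5-point finite-difference Laplacian with periodic boundaries."""
    return (np.roll(h, 1, 0) + np.roll(h, -1, 0)
            + np.roll(h, 1, 1) + np.roll(h, -1, 1) - 4.0 * h)

def physical_step(h, obs=None, dt=0.1, gain=0.5):
    """Predict with a PDE step, then optionally correct toward an observation."""
    # Prediction: explicit Euler step on dh/dt = Laplacian(h) (assumed stand-in PDE).
    h_pred = h + dt * laplacian(h)
    if obs is None:
        # Missing input: fall back on the physical prediction alone.
        return h_pred
    # Correction: blend the prediction with the observed latent (assumed fixed gain).
    return h_pred + gain * (obs - h_pred)

def residual_step(r, decay=0.9):
    """Placeholder for the non-physical branch (a learned recurrent cell in practice)."""
    return decay * r

def two_branch_step(h, r, obs=None):
    """Advance both branches and return their sum as the combined latent."""
    h_next = physical_step(h, obs)
    r_next = residual_step(r)
    return h_next, r_next, h_next + r_next
```

Calling `physical_step(h, obs=None)` repeatedly mimics forecasting without new observations, which is the situation the abstract refers to as missing data or long-term forecasting.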

Results from the Paper

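Task              Dataset        Model    Metric  Value   Global Rank
Video Prediction  Human3.6M      PhyDNet  SSIM    0.901   # 2
Video Prediction  Human3.6M      PhyDNet  MSE     369     # 1
Video Prediction  Human3.6M      PhyDNet  MAE     1620    # 1
Video Prediction  Moving MNIST   PhyDNet  MSE     24.4    # 3
Video Prediction  Moving MNIST   PhyDNet  MAE     70.3    # 1
Video Prediction  Moving MNIST   PhyDNet  SSIM    0.947   # 4
Video Prediction  SynpickVP      PhyDNet  MSE     57.31   # 4
Video Prediction  SynpickVP      PhyDNet  PSNR    26.84   # 4
Video Prediction  SynpickVP      PhyDNet  SSIM    0.877   # 5
Video Prediction  SynpickVP      PhyDNet  LPIPS   0.053   # 2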
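
For readers unfamiliar with the pixel-level metrics above, the following is a minimal sketch of how MSE, MAE, and PSNR might be computed on predicted frames. The aggregation convention is an assumption: MSE and MAE values well above 1 suggest errors summed over pixels (or a non-unit intensity scale) rather than per-pixel means, but the exact convention and scaling used for each benchmark are not stated here. The function names are hypothetical.

```python
import numpy as np

def frame_mse(pred, target):
    """Squared error summed over each frame's pixels, averaged over frames (assumed convention)."""
    diff = pred.astype(np.float64) - target.astype(np.float64)
    return float((diff ** 2).sum(axis=(-2, -1)).mean())

def frame_mae(pred, target):
    """Absolute error summed over each frame's pixels, averaged over frames (assumed convention)."""
    diff = pred.astype(np.float64) - target.astype(np.float64)
    return float(np.abs(diff).sum(axis=(-2, -1)).mean())

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB, using the per-pixel mean squared error."""
    diff = pred.astype(np.float64) - target.astype(np.float64)
    mse = np.mean(diff ** 2)
    return float(10.0 * np.log10(max_val ** 2 / mse))
```

Lower is better for MSE, MAE, and LPIPS; higher is better for SSIM and PSNR.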

