The majority of prior monocular depth estimation methods without groundtruth depth guidance focus on driving scenarios.
Our synergy process leverages a representation cycle for 3DMM parameters and 3D landmarks.
Ranked #1 on Head Pose Estimation on AFLW2000
This work analyzes whether 3D face models can be learned from only the speech inputs of speakers.
This work focuses on complete 3D facial geometry prediction, including 3D facial alignment via 3D face modeling and face orientation estimation using the proposed multi-task, multi-modal, and multi-representation landmark refinement network (M$^3$-LRN).
Mask regression is based on 2D, 2.5D, and 3D ROI using the pseudo-lidar and image-based representations.
Ranked #1 on Instance Segmentation on Cityscapes val (using extra training data)
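The pseudo-lidar representation mentioned above is commonly built by back-projecting a depth map into 3D points with a pinhole camera model. The following is a minimal sketch under that assumption, not the authors' pipeline; the function name `depth_to_points` and the intrinsics `fx`, `fy`, `cx`, `cy` are illustrative and do not come from the source.

```python
def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map into camera-frame 3D points (pinhole model).

    depth: list of rows, where depth[v][u] is the depth at pixel (u, v).
    fx, fy, cx, cy: assumed pinhole intrinsics (focal lengths, principal point).
    Pixels with non-positive depth are treated as missing and skipped.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # no valid depth measurement at this pixel
            x = (u - cx) * z / fx  # horizontal offset scaled by depth
            y = (v - cy) * z / fy  # vertical offset scaled by depth
            points.append((x, y, z))
    return points


# Usage: a 2x2 depth map with one missing pixel.
cloud = depth_to_points([[2.0, 0.0], [4.0, 2.0]], fx=2.0, fy=2.0, cx=0.0, cy=0.0)
```

Each valid pixel yields one 3D point, so the resulting cloud can be processed with point-based operators alongside the image-based features.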
Recent sparse depth completion methods for lidar focus only on the lower part of the scene and produce irregular estimates in the upper part, because existing datasets such as KITTI do not provide groundtruth for upper areas.
Compared with popular sampling methods such as Farthest Point Sampling (FPS) and Ball Query, CAGQ achieves up to 50X speed-up.
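For context on the baseline named above, Farthest Point Sampling greedily picks each new point as the one farthest from all points chosen so far, which requires a full pass over the cloud per sample. The sketch below illustrates that baseline only; it is not CAGQ, and the function name and `start` parameter are illustrative assumptions.

```python
def farthest_point_sampling(points, k, start=0):
    """Return indices of k points chosen by greedy farthest point sampling.

    points: sequence of equal-length coordinate tuples.
    k: number of samples to draw (1 <= k <= len(points)).
    start: index of the first sample (an arbitrary choice in this sketch).
    """
    n = len(points)
    if not 0 < k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(points)")
    selected = [start]
    # min_dist[i] holds the squared distance from point i to its nearest sample.
    min_dist = [float("inf")] * n
    for _ in range(k - 1):
        last = points[selected[-1]]
        for i, p in enumerate(points):
            d = sum((a - b) ** 2 for a, b in zip(p, last))
            if d < min_dist[i]:
                min_dist[i] = d
        # The next sample is the point farthest from the current sample set.
        selected.append(max(range(n), key=min_dist.__getitem__))
    return selected


# Usage: four collinear points, three samples.
idx = farthest_point_sampling([(0, 0, 0), (1, 0, 0), (10, 0, 0), (0.5, 0, 0)], k=3)
```

The loop costs O(nk) distance evaluations, which is the kind of per-sample scan that faster grouping schemes aim to avoid.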
Such a transformation enables CFCNet to predict features and reconstruct data of missing depth measurements according to their corresponding, transformed RGB features.
We first propose a novel nonconvex rank surrogate for the general rank minimization problem and apply it to the corrupted image completion problem.
This paper proposes an effective method for recognizing faces under contiguous occlusion.
Also, we propose to create building masks from semantic segmentation using an encoder-decoder network.