MODNet is a lightweight Matting Objective Decomposition Network that performs trimap-free portrait matting from a single input image in real time. Its design benefits from optimizing a series of correlated sub-objectives simultaneously via explicit constraints. To overcome the domain shift problem, MODNet introduces a self-supervised strategy based on sub-objective consistency (SOC), together with a one-frame delay (OFD) trick that smooths the results when MODNet is applied to portrait video sequences.
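The one-frame delay trick is only named above, not specified. A minimal NumPy sketch of one plausible reading (replace frame t when its two neighbors agree but t deviates from both, using an assumed per-pixel tolerance `tol`; both the function name and the tolerance are illustrative, not from the source):

```python
import numpy as np

def one_frame_delay(alphas, tol=0.1):
    """Flicker-smoothing sketch for a sequence of alpha mattes.

    If frames t-1 and t+1 agree at a pixel (within `tol`) but frame t
    differs from both, treat frame t as a flicker there and replace it
    with the average of its neighbors. `tol` is an assumed threshold.
    """
    alphas = [np.asarray(a, dtype=np.float64) for a in alphas]
    out = [a.copy() for a in alphas]
    for t in range(1, len(alphas) - 1):
        prev_a, cur_a, next_a = alphas[t - 1], alphas[t], alphas[t + 1]
        flicker = (
            (np.abs(prev_a - next_a) <= tol)    # neighbors agree
            & (np.abs(cur_a - prev_a) > tol)    # current deviates from prev
            & (np.abs(cur_a - next_a) > tol)    # current deviates from next
        )
        out[t][flicker] = 0.5 * (prev_a + next_a)[flicker]
    return out
```

Note the first and last frames are left untouched, since they lack one neighbor; in a streaming setting this check introduces a latency of exactly one frame, which is where the trick's name comes from.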
Given an input image $I$, MODNet predicts human semantics $s_{p}$, boundary details $d_{p}$, and the final alpha matte $\alpha_{p}$ through three interdependent branches, $S$, $D$, and $F$, each constrained by a specific supervision generated from the ground-truth matte $\alpha_{g}$. Since the decomposed sub-objectives are correlated and reinforce one another, MODNet can be optimized end-to-end.
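As a rough illustration of how the three branch supervisions could combine into one end-to-end objective, here is a hedged NumPy sketch. The average-pool stand-in for a downsample-and-blur operator on $\alpha_{g}$, the loss weights, and the transition-region mask `m_d` are all simplifying assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def modnet_objective(s_p, d_p, a_p, a_g, m_d,
                     lambda_s=1.0, lambda_d=10.0, lambda_a=1.0):
    """Sketch of a decomposed matting objective (assumed weights).

    s_p, d_p, a_p : predicted semantics, details, and alpha matte
    a_g           : ground-truth alpha matte
    m_d           : binary mask of the boundary/transition region
    """
    def pool4(x):
        # Crude 4x4 average pooling as a stand-in for a
        # downsample-and-blur coarsening of the matte.
        h, w = x.shape
        x = x[:h - h % 4, :w - w % 4]
        return x.reshape(x.shape[0] // 4, 4, x.shape[1] // 4, 4).mean(axis=(1, 3))

    # Semantic loss: L2 between coarsened prediction and coarsened a_g.
    loss_s = 0.5 * np.mean((pool4(s_p) - pool4(a_g)) ** 2)
    # Detail loss: L1 against a_g, restricted to the transition region.
    loss_d = np.sum(m_d * np.abs(d_p - a_g)) / max(np.sum(m_d), 1)
    # Matte loss: L1 over the whole image (compositional terms omitted).
    loss_a = np.mean(np.abs(a_p - a_g))
    return lambda_s * loss_s + lambda_d * loss_d + lambda_a * loss_a
```

Because each term supervises a different granularity of the same matte, gradients from all three flow through the shared backbone at once, which is what "optimize MODNet end-to-end" amounts to in practice.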
Source: MODNet: Real-Time Trimap-Free Portrait Matting via Objective Decomposition