Image Matting
96 papers with code • 8 benchmarks • 8 datasets
Image Matting is the process of accurately estimating the foreground object in images and videos. It is an important technique in image and video editing applications, particularly in film production for creating visual effects. Image segmentation partitions an image into foreground and background by labeling each pixel, producing a binary map in which every pixel belongs either to the foreground or to the background. Image matting differs in that some pixels, called partial or mixed pixels, may belong to both the foreground and the background. To fully separate the foreground from the background, the alpha values of these partial or mixed pixels must be estimated accurately.
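The role of the alpha value can be made concrete with the standard compositing equation, I = αF + (1 − α)B, where each observed pixel I is a blend of foreground F and background B weighted by alpha. A minimal NumPy sketch (toy arrays and the `composite` helper are illustrative, not from any particular matting library):

```python
import numpy as np

def composite(foreground, background, alpha):
    """Blend foreground over background: I = alpha*F + (1 - alpha)*B.

    alpha is 1.0 for pure foreground pixels, 0.0 for pure background,
    and fractional for partial/mixed pixels (e.g., along soft edges).
    """
    alpha = alpha[..., np.newaxis]  # broadcast the matte over RGB channels
    return alpha * foreground + (1.0 - alpha) * background

# 2x2 toy image: white foreground composited over a black background
F = np.ones((2, 2, 3))   # foreground colors
B = np.zeros((2, 2, 3))  # background colors
alpha = np.array([[1.0, 0.5],
                  [0.5, 0.0]])  # mixed pixels along a soft boundary

I = composite(F, B, alpha)
print(I[0, 1])  # a 50% mixed pixel blends to [0.5 0.5 0.5]
```

Matting is the inverse problem: given only I (and typically a trimap or other guidance), recover α, and often F as well, for every pixel.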
Source: Automatic Trimap Generation for Image Matting
Image Source: Real-Time High-Resolution Background Matting
Libraries
Use these libraries to find Image Matting models and implementations.
Latest papers with no code
Towards Label-Efficient Human Matting: A Simple Baseline for Weakly Semi-Supervised Trimap-Free Human Matting
To address this challenge, we introduce a new learning paradigm, weakly semi-supervised human matting (WSSHM), which leverages a small amount of expensive matte labels and a large amount of budget-friendly segmentation labels to reduce annotation cost and address the domain generalization problem.
Learning Multiple Representations with Inconsistency-Guided Detail Regularization for Mask-Guided Matting
Our framework and model introduce the following key aspects: (1) to learn real-world adaptive semantic representations for objects with diverse and complex structures in real-world scenes, we introduce extra semantic segmentation and edge detection tasks on more diverse real-world data with segmentation annotations; (2) to avoid overfitting on low-level details, we propose a module that uses the inconsistency between the learned segmentation and matting representations to regularize detail refinement; (3) we incorporate a novel background line detection task into our auxiliary learning framework to suppress interference from background lines and textures.
DiffuMatting: Synthesizing Arbitrary Objects with Matting-level Annotation
Our DiffuMatting shows several potential applications (e.g., matting-data generator, community-friendly art design and controllable generation).
Diffusion Models Trained with Large Data Are Transferable Visual Models
We show that simply initializing image understanding models with a pre-trained UNet (or transformer) from a diffusion model yields remarkable transfer performance on fundamental vision perception tasks with a moderate amount of target data (even synthetic data only), including monocular depth, surface normal estimation, image segmentation, matting, and human pose estimation, among many others.
DART: Depth-Enhanced Accurate and Real-Time Background Matting
In this paper, we leverage the rich depth information provided by the RGB-Depth (RGB-D) cameras to enhance background matting performance in real-time, dubbed DART.
POBEVM: Real-time Video Matting via Progressively Optimize the Target Body and Edge
Approaches based on deep convolutional neural networks (CNNs) have achieved strong performance in video matting.
Lightweight high-resolution Subject Matting in the Real World
To alleviate these issues, we construct a salient object matting dataset, HRSOM, and a lightweight network, PSUNet.
DiffusionMat: Alpha Matting as Sequential Refinement Learning
In this paper, we introduce DiffusionMat, a novel image matting framework that employs a diffusion model for the transition from coarse to refined alpha mattes.
Lightweight Portrait Matting via Regional Attention and Refinement
We present a lightweight model for high resolution portrait matting.
EFormer: Enhanced Transformer towards Semantic-Contour Features of Foreground for Portraits Matting
Based on a cross-attention module, we further build a semantic and contour detector (SCD) to accurately capture both low-frequency semantic and high-frequency contour features.