We further show that GIST and RIST can be combined with existing semi-supervised learning methods to boost performance.
We propose Latent Optimization of Hairstyles via Orthogonalization (LOHO), an optimization-based approach that uses GAN inversion to infill missing hair structure details in latent space during hairstyle transfer.
SST extracts per-pixel representations for each object in a video using sparse attention over spatiotemporal features.
Ranked #1 on Video Object Segmentation on DAVIS-2017 validation
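As a rough illustration of what sparse attention over spatiotemporal features can look like, the sketch below flattens per-pixel features from several frames into one token list and lets each query attend only to its k highest-scoring keys. This is not SST's implementation: the top-k sparsity pattern, the function name `topk_sparse_attention`, and the toy feature values are all assumptions made for the example, and the excerpt above does not say which sparsity scheme SST uses.

```python
import math

def topk_sparse_attention(queries, keys, values, k):
    """Scaled dot-product attention restricted to the k best keys per query.

    queries, keys: lists of equal-length float vectors.
    values: list of float vectors, one per key.
    Returns (outputs, weights); weights has one row per query and is
    exactly zero for every key outside that query's top-k set.
    NOTE: top-k selection is an assumed sparsity pattern, chosen for brevity.
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    outputs, all_weights = [], []
    for q in queries:
        # Scaled dot-product score between this query and every key.
        scores = [scale * sum(a * b for a, b in zip(q, key)) for key in keys]
        # Indices of the k highest-scoring keys for this query.
        keep = sorted(range(len(keys)), key=lambda j: scores[j], reverse=True)[:k]
        # Softmax over the kept scores only (shifted by the max for stability).
        top = max(scores[j] for j in keep)
        exps = {j: math.exp(scores[j] - top) for j in keep}
        total = sum(exps.values())
        weights = [exps.get(j, 0.0) / total for j in range(len(keys))]
        # Weighted sum of the value vectors.
        dim = len(values[0])
        out = [sum(w * v[c] for w, v in zip(weights, values)) for c in range(dim)]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical toy input: 2 frames x 2 pixels, each with a 2-dim feature,
# flattened into 4 spatiotemporal tokens that attend to one another.
frames = [
    [[1.0, 0.0], [0.0, 1.0]],    # frame 0
    [[0.9, 0.1], [-1.0, 0.0]],   # frame 1
]
tokens = [pixel for frame in frames for pixel in frame]
per_pixel, attn = topk_sparse_attention(tokens, tokens, tokens, k=2)
```

With `k` much smaller than the number of tokens, each pixel mixes information from only a handful of locations across frames. A practical implementation would also avoid computing the full score matrix, which this sketch does not attempt.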
We also provide a postprocessing and rendering algorithm for nail polish try-on, which integrates with our semantic segmentation and fingernail base-tip direction predictions.
Recent works on convolutional neural networks (CNNs) for facial alignment have demonstrated unprecedented accuracy on a variety of large, publicly available datasets.
We propose a generalized class of multimodal fusion operators for the task of visual question answering (VQA).