A reverse dictionary takes descriptions of words as input and outputs words semantically matching the input descriptions.
While only the semantics of each task differ, current research focuses on designing specialized architectures for each task.
Ranked #1 on Panoptic Segmentation on COCO minival
Blind face restoration usually relies on facial priors, such as a facial geometry prior or a reference prior, to restore realistic and faithful details.
Ranked #1 on Blind Face Restoration on CelebA-Test
To further improve the performance of the proposed method, we introduce a skeleton-based search space that reduces false-positive detections.
Despite the initial belief that Convolutional Neural Networks (CNNs) are driven by shapes to perform visual recognition tasks, recent evidence suggests that texture bias in CNNs yields higher-performing models when learning on large labeled training datasets.
Ranked #2 on Few-Shot Semantic Segmentation on FSS-1000
We approach text-to-image generation by combining the power of the pretrained CLIP representation with an off-the-shelf image generator (a GAN), optimizing in the GAN's latent space to find images that achieve the maximum CLIP score for the given input text.
Ranked #3 on Text-to-Image Generation on COCO
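The latent-space search described in the text-to-image entry above can be illustrated with a toy sketch. Everything here is an assumption for illustration: `W_gen` stands in for a GAN generator, `W_enc` for a CLIP-style image encoder, `text_feat` for a text embedding, and the backtracking gradient ascent is a generic optimizer, not the procedure used in the paper. No real CLIP or GAN model is loaded.

```python
import numpy as np

def clip_like_score(image_feat, text_feat):
    """Cosine similarity between an image embedding and a text embedding."""
    return float(image_feat @ text_feat /
                 (np.linalg.norm(image_feat) * np.linalg.norm(text_feat)))

def optimize_latent(W_gen, W_enc, text_feat, z0, steps=200, lr=0.5):
    """Search the latent space for a z whose 'image' best matches the text.

    Toy stand-ins (hypothetical, not real models):
      generator      G(z) = tanh(W_gen @ z)
      image encoder  E(x) = W_enc @ x
    """
    def score_and_grad(z):
        h = np.tanh(W_gen @ z)            # toy "image"
        f = W_enc @ h                     # toy "image embedding"
        nf = np.linalg.norm(f)
        nt = np.linalg.norm(text_feat)
        s = f @ text_feat / (nf * nt)     # cosine similarity
        # Gradient of the cosine similarity with respect to f.
        ds_df = text_feat / (nf * nt) - s * f / nf**2
        # Chain rule back through the encoder and the tanh generator.
        grad = W_gen.T @ ((1.0 - h**2) * (W_enc.T @ ds_df))
        return s, grad

    z = z0.copy()
    s, g = score_and_grad(z)
    for _ in range(steps):
        step = lr
        # Backtracking: halve the step until the score improves.
        while step > 1e-8:
            z_new = z + step * g
            s_new, g_new = score_and_grad(z_new)
            if s_new > s:
                z, s, g = z_new, s_new, g_new
                break
            step *= 0.5
        else:
            break  # no improving step found; treat as converged
    return z, float(s)

# Random toy "models" and a random "text embedding".
rng = np.random.default_rng(0)
W_gen = 0.5 * rng.normal(size=(16, 4))
W_enc = rng.normal(size=(8, 16))
text_feat = rng.normal(size=8)
z0 = rng.normal(size=4)

initial = clip_like_score(W_enc @ np.tanh(W_gen @ z0), text_feat)
z_best, final = optimize_latent(W_gen, W_enc, text_feat, z0)
print(f"score before: {initial:.3f}, after: {final:.3f}")
```

In a real setting the two matrices would be replaced by a frozen generator and a frozen CLIP image encoder, with gradients supplied by an autodiff framework. Only the latent vector is updated, which is what makes an off-the-shelf generator usable without retraining.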
In this work, we introduce this approach into the realm of encoder-based inversion.
Ranked #1 on Fine-tuning on 2021 Hotel-ID
YourTTS brings the power of a multilingual approach to the task of zero-shot multi-speaker TTS.