YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone

edresson/yourtts 4 Dec 2021

YourTTS brings the power of a multilingual approach to the task of zero-shot multi-speaker TTS.

Speech Synthesis Voice Conversion

WantWords: An Open-source Online Reverse Dictionary System

thunlp/WantWords EMNLP 2020

A reverse dictionary takes descriptions of words as input and outputs words semantically matching the input descriptions.
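
As a rough illustration of the input/output contract described above, the sketch below ranks words by how closely their definitions match a query description. It is not the WantWords model: it swaps in plain bag-of-words cosine similarity, and `TOY_DICTIONARY`, `tokenize`, `cosine`, and `reverse_lookup` are hypothetical names made up for this example.

```python
import math
import re
from collections import Counter

# Hypothetical toy dictionary; entries are illustrative only.
TOY_DICTIONARY = {
    "river": "a large natural stream of water flowing to the sea",
    "mountain": "a large natural elevation of the earth rising above the land",
    "library": "a building containing collections of books for people to read",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def reverse_lookup(description, dictionary=TOY_DICTIONARY, top_k=1):
    """Return the top_k words whose definitions best match the description."""
    query = Counter(tokenize(description))
    scored = [
        (cosine(query, Counter(tokenize(definition))), word)
        for word, definition in dictionary.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [word for _, word in scored[:top_k]]
```

Calling `reverse_lookup("a place with many books to read")` ranks `library` first in this toy setup. A real system would replace the word-overlap score with a learned semantic matcher.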

Masked-attention Mask Transformer for Universal Image Segmentation

facebookresearch/Mask2Former 2 Dec 2021

While only the semantics of each task differ, current research focuses on designing specialized architectures for each task.

Instance Segmentation Panoptic Segmentation

Towards Real-World Blind Face Restoration with Generative Facial Prior


Blind face restoration usually relies on facial priors, such as facial geometry prior or reference prior, to restore realistic and faithful details.

Blind Face Restoration GAN inversion

Text2Mesh: Text-Driven Neural Stylization for Meshes

threedle/text2mesh 6 Dec 2021

In order to modify style, we obtain a similarity score between a text prompt (describing style) and a stylized mesh by harnessing the representational power of CLIP.

Neural Stylization

Stacked Hourglass Network with a Multi-level Attention Mechanism: Where to Look for Intervertebral Disc Labeling

rezazad68/deep-intervertebral-disc-labeling 14 Aug 2021

To further improve the performance of the proposed method, we propose a skeleton-based search space to reduce false positive detection.

Pose Estimation Semantic Segmentation

VocBench: A Neural Vocoder Benchmark for Speech Synthesis

facebookresearch/vocoder-benchmark 6 Dec 2021

We perform subjective and objective evaluations to compare the performance of each vocoder along different axes.

Speech Synthesis

On the Texture Bias for Few-Shot CNN Segmentation

rezazad68/fewshot-segmentation 9 Mar 2020

Despite the initial belief that Convolutional Neural Networks (CNNs) are driven by shapes to perform visual recognition tasks, recent evidence suggests that texture bias in CNNs provides higher performing models when learning on large labeled training datasets.

Few-Shot Semantic Segmentation Semantic Segmentation

timeseriesAI/tsai 24 Feb 2020

State-of-the-art deep learning library for time series and sequences in PyTorch / fastai.

Classification General Classification

FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization

gnobitab/fusedream 2 Dec 2021

We approach text-to-image generation by combining the power of the pretrained CLIP representation with an off-the-shelf image generator (a GAN), optimizing in the latent space of the GAN to find images that achieve the maximum CLIP score with the given input text.

Zero-Shot Text-to-Image Generation
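
The idea of searching a generator's latent space for the output that scores highest against a text can be sketched with toy stand-ins. Nothing here is FuseDream's actual procedure: `toy_generator` replaces the GAN, `toy_score` replaces the CLIP score with cosine similarity, and simple hill climbing replaces the paper's optimizer. All names and parameters are assumptions made for this example.

```python
import math
import random


def toy_generator(z):
    """Stand-in for a GAN generator: maps a 2-D latent to 3 'image' features."""
    return [z[0] + z[1], z[0] - z[1], 0.5 * z[0]]


def toy_score(features, text_embedding):
    """Stand-in for a CLIP score: cosine similarity of features and text."""
    dot = sum(f * t for f, t in zip(features, text_embedding))
    norm_f = math.sqrt(sum(f * f for f in features))
    norm_t = math.sqrt(sum(t * t for t in text_embedding))
    if norm_f == 0 or norm_t == 0:
        return 0.0
    return dot / (norm_f * norm_t)


def optimize_latent(text_embedding, steps=200, step_size=0.1, seed=0):
    """Hill-climb in latent space, keeping only moves that raise the score."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1), rng.gauss(0, 1)]
    initial = toy_score(toy_generator(z), text_embedding)
    best = initial
    for _ in range(steps):
        # Propose a small random perturbation of the current latent.
        candidate = [v + rng.gauss(0, step_size) for v in z]
        score = toy_score(toy_generator(candidate), text_embedding)
        if score > best:
            z, best = candidate, score
    return z, initial, best
```

Because a candidate is accepted only when it improves the score, the final score returned by `optimize_latent` is never lower than the initial one. A real pipeline would use gradients through the generator and the scoring model instead of random proposals.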

