Search Results for author: Gautham J. Mysore

Found 6 papers, 4 papers with code

Emotion Embedding Spaces for Matching Music to Stories

1 code implementation • 26 Nov 2021 • Minz Won, Justin Salamon, Nicholas J. Bryan, Gautham J. Mysore, Xavier Serra

Content creators often use music to enhance their stories, as it can be a powerful tool to convey emotion.

Cross-Modal Retrieval, Metric Learning

Controllable Neural Prosody Synthesis

no code implementations • 7 Aug 2020 • Max Morrison, Zeyu Jin, Justin Salamon, Nicholas J. Bryan, Gautham J. Mysore

Speech synthesis has recently seen significant improvements in fidelity, driven by the advent of neural vocoders and neural prosody generators.

Speech Synthesis

F0-consistent many-to-many non-parallel voice conversion via conditional autoencoder

1 code implementation • 15 Apr 2020 • Kaizhi Qian, Zeyu Jin, Mark Hasegawa-Johnson, Gautham J. Mysore

Recently, AutoVC, a method based on conditional autoencoders (CAEs), achieved state-of-the-art results by disentangling speaker identity and speech content using information-constraining bottlenecks; it achieves zero-shot conversion by swapping in a different speaker's identity embedding to synthesize a new voice.

Style Transfer, Voice Conversion

A Differentiable Perceptual Audio Metric Learned from Just Noticeable Differences

1 code implementation • 13 Jan 2020 • Pranay Manocha, Adam Finkelstein, Zeyu Jin, Nicholas J. Bryan, Richard Zhang, Gautham J. Mysore

Assessment of many audio processing tasks relies on subjective evaluation, which is time-consuming and expensive.

Denoising

B-Script: Transcript-based B-roll Video Editing with Recommendations

no code implementations • 28 Feb 2019 • Bernd Huber, Hijung Valentina Shin, Bryan Russell, Oliver Wang, Gautham J. Mysore

In video production, inserting B-roll is a widely used technique to enrich the story and make a video more engaging.

Video Editing

A Generative Product-of-Filters Model of Audio

1 code implementation • 20 Dec 2013 • Dawen Liang, Matthew D. Hoffman, Gautham J. Mysore

We propose the product-of-filters (PoF) model, a generative model that decomposes audio spectra as sparse linear combinations of "filters" in the log-spectral domain.

Speaker Identification
