Search Results for author: Christopher Song

Found 1 papers, 0 papers with code

Text-Free Image-to-Speech Synthesis Using Learned Segmental Units

no code implementations ACL 2021 Wei-Ning Hsu, David Harwath, Christopher Song, James Glass

In this paper we present the first model for directly synthesizing fluent, natural-sounding spoken audio captions for images that does not require natural language text as an intermediate representation or source of supervision.

Image Captioning Speech Synthesis +1

Cannot find the paper you are looking for? You can Submit a new open access paper.