1 code implementation • 31 Jul 2021 • Uttaran Bhattacharya, Elizabeth Childs, Nicholas Rewkowski, Dinesh Manocha
Our network consists of two components: a generator to synthesize gestures from a joint embedding space of features encoded from the input speech and the seed poses, and a discriminator to distinguish between the synthesized pose sequences and real 3D pose sequences.
Ranked #4 on Gesture Generation on TED Gesture Dataset
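
The two-component layout in the abstract excerpt above (a generator fed by a joint embedding of speech and seed-pose features, plus a discriminator that scores pose sequences) can be illustrated with a toy, untrained sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the class names, the dimensions, the single tanh layers, the use of concatenation to form the joint embedding, and the mean-pooling discriminator are all hypothetical.

```python
import math
import random


def linear(in_dim, out_dim, rng):
    """Random weight matrix: out_dim rows, each holding in_dim weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]


def dense(weights, vec):
    """Matrix-vector product followed by a tanh nonlinearity."""
    return [math.tanh(sum(w * x for w, x in zip(row, vec))) for row in weights]


class Generator:
    """Hypothetical generator: encodes speech and seed poses, then decodes poses."""

    def __init__(self, speech_dim, pose_dim, embed_dim, rng):
        self.speech_enc = linear(speech_dim, embed_dim, rng)
        self.pose_enc = linear(pose_dim, embed_dim, rng)
        self.decoder = linear(2 * embed_dim, pose_dim, rng)

    def __call__(self, speech_frames, seed_poses):
        # Assumption: the seed poses are summarized by encoding the last one.
        seed_feat = dense(self.pose_enc, seed_poses[-1])
        poses = []
        for frame in speech_frames:
            # Assumption: the joint embedding is a plain concatenation.
            joint = dense(self.speech_enc, frame) + seed_feat
            poses.append(dense(self.decoder, joint))
        return poses


class Discriminator:
    """Hypothetical discriminator: maps a pose sequence to a score in (0, 1)."""

    def __init__(self, pose_dim, rng):
        self.weights = linear(pose_dim, 1, rng)[0]

    def __call__(self, poses):
        # Average each pose coordinate over time, then apply a sigmoid.
        mean_pose = [sum(col) / len(poses) for col in zip(*poses)]
        logit = sum(w * x for w, x in zip(self.weights, mean_pose))
        return 1.0 / (1.0 + math.exp(-logit))


# Toy usage with made-up sizes: 5 speech frames, 2 seed poses.
rng = random.Random(0)
speech_dim, pose_dim, embed_dim = 4, 6, 8
gen = Generator(speech_dim, pose_dim, embed_dim, rng)
disc = Discriminator(pose_dim, rng)

speech = [[rng.random() for _ in range(speech_dim)] for _ in range(5)]
seed = [[rng.random() for _ in range(pose_dim)] for _ in range(2)]

fake_poses = gen(speech, seed)   # one synthesized pose per speech frame
score = disc(fake_poses)         # closer to 1 would mean "looks real"
```

In an adversarial training setup the discriminator would also be shown real 3D pose sequences, and the two components would be optimized against each other; the sketch only shows how data could flow through them.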