3 dataset results for art AND Audio AND English

CVSS

…CVSS is derived from the Common Voice speech corpus and the CoVoST 2 speech-to-text translation (ST) corpus, by synthesizing the translation text from CoVoST 2 into speech using state-of-the-art TTS systems.

18 PAPERS • 1 BENCHMARK

BEAT (Body-Expression-Audio-Text)

…Qualitative and quantitative experiments demonstrate metrics' validness, ground truth data quality, and baseline's state-of-the-art performance.

37 PAPERS • 1 BENCHMARK

BEAT2 (BEAT-SMPLX-FLAME)

…Experiments demonstrate that EMAGE generates holistic gestures with state-of-the-art performance and is flexible in accepting predefined spatial-temporal gesture inputs, generating complete, audio-synchronized

8 PAPERS • 2 BENCHMARKS
