7 dataset results for art AND Audio

…CVSS is derived from the Common Voice speech corpus and the CoVoST 2 speech-to-text translation (ST) corpus, by synthesizing the translation text from CoVoST 2 into speech using state-of-the-art TTS systems

18 PAPERS • 1 BENCHMARK

ASR-RAMC-BIGCCSC: A CHINESE CONVERSATIONAL SPEECH CORPUS

…It covers 15 topics, including humanities, entertainment, sports, military, finance, religion, family life, politics, education, digital devices, environment, science, professional development, art and

1 PAPER • NO BENCHMARKS YET

AVASpeech-SMAD

AVASpeech-SMAD (AVASpeech-SMAD: A Strongly Labelled Speech and Music Activity Detection Dataset with Label Co-Occurrence)

…Evaluation results from two state-of-the-art SMAD systems are also provided as a benchmark for future reference.

1 PAPER • NO BENCHMARKS YET

ARTE

ARTE (Ambisonics Recordings of Typical Environments)

The ARTE database, so far, contains 13 acoustic environments that were recorded with a purpose-built 62-channel microphone array in various locations around Sydney (Australia), and were decoded into the … Apart from the files specific to each acoustic environment, the ARTE database includes a number of Matlab™ functions that help decode the provided HOA files into a format that can be played back via a given … This structure is generated automatically when downloading (and unzipping) the main zip-file (ARTE database downloas.7z). Acknowledgement: The development of the ARTE database was financially supported by the HEARing CRC, established and supported under the Cooperative Research Centres Program – an initiative of the Australian … The Ambisonics Recordings of Typical Environments (ARTE) database. Acta Acustica united with Acustica. (see provided pdf-file)

1 PAPER • NO BENCHMARKS YET

BEAT (Body-Expression-Audio-Text)

…Qualitative and quantitative experiments demonstrate the validity of the metrics, the quality of the ground-truth data, and the state-of-the-art performance of the baseline.

37 PAPERS • 1 BENCHMARK

BEAT2 (BEAT-SMPLX-FLAME)

…Experiments demonstrate that EMAGE generates holistic gestures with state-of-the-art performance and is flexible in accepting predefined spatial-temporal gesture inputs, generating complete, audio-synchronized

8 PAPERS • 2 BENCHMARKS

SingFake

SingFake (SingFake: Singing Voice Deepfake Detection)

…We then use SingFake to evaluate four state-of-the-art speech countermeasure systems trained on speech utterances.

1 PAPER • NO BENCHMARKS YET
