3 dataset results for segmentation AND Speech AND Chinese

DISRPT2021

DISRPT2021 (DISRPT2021 shared task on Discourse Unit Segmentation, Connective Detection and Discourse Relation Classification)

The DISRPT 2021 shared task, co-located with CODI 2021 at EMNLP, introduces the second iteration of a cross-formalism shared task on discourse unit segmentation and connective detection, as well as the…

3 PAPERS • NO BENCHMARKS YET

AVSpeech

…The segments are of varying length, between 3 and 10 seconds long, and in each clip the only visible face in the video and audible sound in the soundtrack belong to a single speaking person. In total, the dataset contains roughly 4700 hours of video segments with approximately 150,000 distinct speakers, spanning a wide variety of people, languages and face poses.

35 PAPERS • NO BENCHMARKS YET

aidatatang_200zh

…Segmented transcripts are also provided. The corpus aims to support researchers in speech recognition, machine translation, voiceprint recognition, and other speech-related fields.

0 PAPERS • NO BENCHMARKS YET
