14 dataset results for segmentation AND Speech

DISRPT2019

DISRPT2019 (DISRPT2019 shared task on Discourse Unit Segmentation and Connective Detection)

The DISRPT 2019 workshop introduces the first iteration of a cross-formalism shared task on discourse unit segmentation. Since all major discourse parsing frameworks imply a segmentation of texts into segments, learning segmentations for and from diverse resources is a promising area for converging methods and insights. Because different corpora, languages and frameworks use different guidelines for segmentation, the shared task is meant to promote design of flexible methods for dealing with various guidelines, and help

4 PAPERS • NO BENCHMARKS YET

DISRPT2021

DISRPT2021 (DISRPT2021 shared task on Discourse Unit Segmentation, Connective Detection and Discourse Relation Classification)

The DISRPT 2021 shared task, co-located with CODI 2021 at EMNLP, introduces the second iteration of a cross-formalism shared task on discourse unit segmentation and connective detection, as well as the

3 PAPERS • NO BENCHMARKS YET

AVSpeech

…The segments are of varying length, between 3 and 10 seconds long, and in each clip the only visible face in the video and audible sound in the soundtrack belong to a single speaking person. In total, the dataset contains roughly 4700 hours of video segments with approximately 150,000 distinct speakers, spanning a wide variety of people, languages and face poses.

35 PAPERS • NO BENCHMARKS YET

TR_AR_S2S

…This work proposes an unsupervised approach to constructing a speech-to-speech corpus, aligned at the short-segment level, that yields parallel speech in the source and target languages. The methodology uses video frames, speech recognition, machine translation, and noisy-frame removal algorithms to match segments in both languages.

1 PAPER • NO BENCHMARKS YET

VoxClamantis

A large-scale corpus for phonetic typology, with aligned segments and estimated phoneme-level labels in 690 readings spanning 635 languages, along with acoustic-phonetic measures of vowels and sibilants

4 PAPERS • NO BENCHMARKS YET

Common Phone

…It comprises around 116 hours of speech enriched with automatically generated phonetic segmentation.

2 PAPERS • NO BENCHMARKS YET

Multimodal Opinionlevel Sentiment Intensity (MOSI)

…Opinionlevel Sentiment Intensity (MOSI) contains: (1) multimodal observations including transcribed speech and visual gestures as well as automatic audio and visual features, (2) opinion-level subjectivity segmentation

59 PAPERS • 1 BENCHMARK

DIHARD II

…The development set includes reference diarization and speech segmentation and may be used for any purpose including system development or training.

31 PAPERS • 1 BENCHMARK

MRDA

MRDA (ICSI Meeting Recorder Dialog Act Corpus)

…It is annotated with three types of information: marking of the dialogue act segment boundaries, marking of the dialogue acts and marking of correspondences between dialogue acts.

8 PAPERS • 1 BENCHMARK

MediaSpeech

…The dataset consists of short speech segments automatically extracted from media videos available on YouTube and manually transcribed, with some pre- and post-processing.

4 PAPERS • 1 BENCHMARK

GUM (Georgetown University Multilayer corpus)

…Annotations include: multiple POS tags, morphological features and lemmatization; sentence segmentation and rough speech act; document structure in TEI XML (paragraphs, headings, figures, etc.)

8 PAPERS • 1 BENCHMARK

WenetSpeech

…An optical character recognition (OCR) based method is introduced to generate the audio/text segmentation candidates for the YouTube data on its corresponding video captions.

38 PAPERS • 1 BENCHMARK

Open Images V7

…A subset of 1.9M includes diverse annotation types: 15,851,536 boxes on 600 classes; 2,785,498 instance segmentations on 350 classes; 3,284,280 relationship annotations on 1,466 relationships; 675,155

3 PAPERS • NO BENCHMARKS YET

aidatatang_200zh

…Segmented transcripts are also provided. The corpus aims to support researchers in speech recognition, machine translation, voiceprint recognition, and other speech-related fields.

0 PAPERS • NO BENCHMARKS YET
