Skit-S2I (Skit-S2I: An Indian Accented Speech to Intent dataset)

Introduced by Rajaa et al. in Skit-S2I: An Indian Accented Speech to Intent dataset

This dataset for intent classification from human speech covers 14 coarse-grained intents from the banking domain. The work is inspired by a similar release, the MInDS-14 dataset; here, we restrict ourselves to Indian English but provide a much larger training set. The data was generated by 11 Indian English speakers and recorded over a telephony line. We also provide anonymized speaker information, such as gender, languages spoken, and native language, to allow more structured discussions around robustness and bias in the models you train.

Source: Skit-S2I: An Indian Accented Speech to Intent dataset
