Datasets

11,071 machine learning datasets
Filter by Language
Votic English 3426 Chinese 394 German 184 French 175 Spanish 142 Russian 126 Japanese 98 Arabic 95 Italian 91 Portuguese 84 Hindi 75 Vietnamese 66 Korean 62 Bengali 57 Turkish 56 Persian 51 Dutch 45 Tamil 44 Polish 41 Czech 39 Indonesian 38 Danish 34 Finnish 33 Romanian 33 Telugu 31 Thai 29 Multilingual 28 Swedish 28 Urdu 28 Marathi 27 Hungarian 26 Greek 24 Estonian 23 Gujarati 23 Mandarin Chinese 23 Hebrew 22 Bulgarian 21 Malayalam 21 Basque 19 Kannada 18 Ukrainian 18 Catalan 17 Punjabi 17 Slovak 17 Swahili 17 Latvian 16 Slovenian 16 Croatian 15 Lithuanian 15 Kazakh 14 Norwegian 14 Serbian 14 Amharic 13 Iranian Persian 13 Kurdish 12 Albanian 11 Assamese 11 Irish 10 Welsh 10 Yoruba 10 American Sign Language 9 Armenian 9 Macedonian 9 Maltese 9 Mongolian 9 Sanskrit 9 Tagalog 9 Azerbaijani 8 Breton 8 Burmese 8 Hausa 8 Igbo 8 Odia 8 Oriya (macrolanguage) 8 Sinhala 8 Esperanto 7 Filipino 7 Galician 7 Georgian 7 Nepali (macrolanguage) 7 Bambara 6 Guarani 6 Icelandic 6 Malagasy 6 Nigerian Pidgin 6 Oromo 6 Serbo-Croatian 6 Somali 6 Uzbek 6 Western Panjabi 6 Wolof 6 Afrikaans 5 Belarusian 5 Bosnian 5 Central Khmer 5 Central Kurdish 5 Fon 5 Ganda 5 Haitian 5 Javanese 5 Latin 5 Norwegian Nynorsk 5 Quechua 5 Scottish Gaelic 5 Sindhi 5 Sundanese 5 Tibetan 5 Tigrinya 5 Aymara 4 Bangala 4 Chechen 4 Dhivehi 4 Egyptian Arabic 4 Ewe 4 Kabyle 4 Lingala 4 Malay (individual language) 4 Norwegian Bokmål 4 Standard Arabic 4 Tatar 4 Tetum 4 Tswana 4 Twi 4 Upper Sorbian 4 Aragonese 3 Bashkir 3 Bavarian 3 Bishnupriya 3 Cebuano 3 Chuvash 3 Erzya 3 Faroese 3 Fulah 3 German Sign Language 3 Goan Konkani 3 Iloko 3 Interlingue 3 Kinyarwanda 3 Kirghiz 3 Lao 3 Luo (Kenya and Tanzania) 3 Maithili 3 Nyanja 3 Occitan (post 1500) 3 Romansh 3 Rundi 3 Russia Buriat 3 Sardinian 3 South Azerbaijani 3 Swiss German 3 Turkmen 3 Uighur 3 Xhosa 3 Yiddish 3 Zulu 3 Argentine Sign Language 2 Asturian 2 Avaric 2 Bangladeshi Sign Language 2 Bhojpuri 2 Central Bikol 2 Cherokee 2 Church Slavic 2 Cornish 2 Corsican 2 
Dimli (individual language) 2 Eastern Mari 2 Gothic 2 Ido 2 Inuktitut 2 Jamaican Creole English 2 Jejueo 2 Kalaallisut 2 Kalmyk 2 Karachay-Balkar 2 Komi 2 Komi-Permyak 2 Lezghian 2 Limburgan 2 Livvi 2 Lojban 2 Lombard 2 Low German 2 Lower Sorbian 2 Luxembourgish 2 Malay (macrolanguage) 2 Manipuri 2 Manx 2 Mazanderani 2 Minangkabau 2 Mingrelian 2 Mirandese 2 Modern Greek 2 Moksha 2 Mossi 2 Naxi 2 Neapolitan 2 Newari 2 Northern Frisian 2 Northern Kurdish 2 Northern Luri 2 Northern Sami 2 Old Spanish 2 Ossetian 2 Pampanga 2 Piemontese 2 Pushto 2 Shona 2 Sichuan Yi 2 Sicilian 2 Swati 2 Swiss-German Sign Language 2 Tai 2 Tajik 2 Tsonga 2 Turkish Sign Language 2 Tuvinian 2 Udmurt 2 Venetian 2 Volapük 2 Walloon 2 Waray (Philippines) 2 Western Frisian 2 Western Mari 2 Wu Chinese 2 Yakut 2 Yue Chinese 2 Abkhazian 1 Achinese 1 Adyghe 1 Afar 1 Akan 1 Akkadian 1 Akuntsu 1 Ambonese Malay 1 Ancient Greek 1 Ancient Hebrew 1 Apurinã 1 Arpitan 1 Assyrian Neo-Aramaic 1 Banjar 1 Bemba (Zambia) 1 Bislama 1 Bodo (India) 1 Buginese 1 Central Pashto 1 Chamorro 1 Chavacano 1 Cheyenne 1 Choctaw 1 Chukot 1 Congo Swahili 1 Coptic 1 Cree 1 Creek 1 Crimean Tatar 1 Dogri (macrolanguage) 1 Dzongkha 1 Extremaduran 1 Fiji Hindi 1 Fijian 1 French Sign Language 1 Friulian 1 Gagauz 1 Gan Chinese 1 Geez 1 Gilaki 1 Greek Sign Language 1 Gulf Arabic 1 Hakha Chin 1 Hakka Chinese 1 Halh Mongolian 1 Hawaiian 1 Herero 1 Hiri Motu 1 Interlingua (International Auxiliary Language Association) 1 Inupiaq 1 Kabardian 1 Kanuri 1 Kara-Kalpak 1 Karelian 1 Kashmiri 1 Kashubian 1 Khunsari 1 Kikuyu 1 Komi-Zyrian 1 Kongo 1 Krio 1 Kuanyama 1 Kupang Malay 1 Kölsch 1 Ladino 1 Lak 1 Latgalian 1 Ligurian 1 Literary Chinese 1 Lozi 1 Lunda 1 Luo (Cameroon) 1 Lushai 1 Makasar 1 Malayic Dayak 1 Maori 1 Marshallese 1 Mbyá Guaraní 1 Min Dong Chinese 1 Modern Greek (1453-) 1 Moroccan Arabic 1 Mundurukú 1 Narom 1 Nauru 1 Navajo 1 Nayini 1 Ndonga 1 Nepali (individual language) 1 Novial 1 Official Aramaic (700-300 BCE) 1 Old English (ca. 450-1100) 1
Old French 1 Old Russian 1 Old Turkish 1 Pali 1 Pangasinan 1 Papiamento 1 Pedi 1 Pennsylvania German 1 Pfaelzisch 1 Picard 1 Pitcairn-Norfolk 1 Pontic 1 Rajasthani 1 Rusyn 1 Samoan 1 Sango 1 Saterfriesisch 1 Scots 1 Silesian 1 Skolt Sami 1 Soi 1 South Levantine Arabic 1 Southern Sotho 1 Sranan Tongo 1 Swahili (macrolanguage) 1 Swedish Sign Language 1 Tahitian 1 Tok Pisin 1 Tonga (Tonga Islands) 1 Tonga (Zambia) 1 Tosk Albanian 1 Tulu 1 Tumbuka 1 Tunisian Arabic 1 Tupinambá 1 Uab Meto 1 Venda 1 Veps 1 Vlaams 1 Vlax Romani 1 Warlpiri 1 Zaza 1 Zeeuws 1 Zhuang 1 Dogri (individual language) 0 Kabuverdianu 0 Kachin 0 Lingua Franca 0 Mesopotamian Arabic 0 Najdi Arabic 0 Nigerian Fulfulde 0 North Azerbaijani 0 North Levantine Arabic 0 Northern Huishui Hmong 0 Northern Uzbek 0 Plateau Malagasy 0 Portuguse 0 Saidi Arabic 0 Santali 0 Shan 0 Southern Pashto 0 Standard Latvian 0 Thai Song 0 Tunisian Sign Language 0 West Central Oromo 0