Datasets

10,559 machine learning datasets
Filter by Language
Halh Mongolian English 3210 Chinese 359 German 177 French 167 Spanish 136 Russian 121 Japanese 96 Arabic 90 Italian 88 Portuguese 84 Hindi 74 Vietnamese 66 Korean 61 Turkish 54 Bengali 50 Persian 50 Dutch 46 Tamil 44 Polish 41 Czech 39 Indonesian 38 Danish 35 Finnish 34 Romanian 34 Telugu 31 Multilingual 29 Swedish 29 Urdu 29 Hungarian 27 Thai 27 Marathi 26 Greek 25 Estonian 24 Gujarati 23 Hebrew 23 Mandarin Chinese 23 Bulgarian 22 Malayalam 22 Basque 20 Catalan 18 Kannada 18 Slovak 18 Swahili 18 Ukrainian 18 Latvian 17 Punjabi 17 Slovenian 17 Croatian 16 Lithuanian 16 Kazakh 15 Norwegian 15 Serbian 15 Amharic 13 Albanian 12 Assamese 12 Iranian Persian 12 Kurdish 12 Armenian 10 Irish 10 Macedonian 10 Maltese 10 Welsh 10 Yoruba 10 Burmese 9 Hausa 9 Igbo 9 Mongolian 9 Oriya (macrolanguage) 9 Sanskrit 9 Tagalog 9 American Sign Language 8 Azerbaijani 8 Breton 8 Georgian 8 Odia 8 Sinhala 8 Bambara 7 Esperanto 7 Filipino 7 Galician 7 Guarani 7 Icelandic 7 Malagasy 7 Nepali (macrolanguage) 7 Oromo 7 Serbo-Croatian 7 Somali 7 Uzbek 7 Wolof 7 Afrikaans 6 Central Khmer 6 Central Kurdish 6 Ganda 6 Haitian 6 Nigerian Pidgin 6 Sindhi 6 Tibetan 6 Tigrinya 6 Western Panjabi 6 Belarusian 5 Bosnian 5 Egyptian Arabic 5 Fon 5 Javanese 5 Latin 5 Lingala 5 Malay (individual language) 5 Norwegian Bokmål 5 Norwegian Nynorsk 5 Quechua 5 Scottish Gaelic 5 Standard Arabic 5 Sundanese 5 Tswana 5 Aymara 4 Bangala 4 Cebuano 4 Chechen 4 Dhivehi 4 Ewe 4 Fulah 4 Iloko 4 Kabyle 4 Kinyarwanda 4 Kirghiz 4 Lao 4 Luo (Kenya and Tanzania) 4 Nyanja 4 South Azerbaijani 4 Tatar 4 Tetum 4 Twi 4 Upper Sorbian 4 Xhosa 4 Zulu 4 Aragonese 3 Bashkir 3 Bavarian 3 Bishnupriya 3 Chuvash 3 Erzya 3 Faroese 3 Goan Konkani 3 Interlingue 3 Maithili 3 Malay (macrolanguage) 3 Occitan (post 1500) 3 Romansh 3 Rundi 3 Russia Buriat 3 Shona 3 Swati 3 Swiss German 3 Tajik 3 Tsonga 3 Turkmen 3 Uighur 3 Waray (Philippines) 3 Yiddish 3 Argentine Sign Language 2 Asturian 2 Avaric 2 Bangladeshi Sign Language 2 Bhojpuri 2 Central 
Bikol 2 Cherokee 2 Church Slavic 2 Cornish 2 Corsican 2 Dimli (individual language) 2 Eastern Mari 2 German Sign Language 2 Gothic 2 Gulf Arabic 2 Ido 2 Inuktitut 2 Jejueo 2 Kalaallisut 2 Kalmyk 2 Karachay-Balkar 2 Komi 2 Komi-Permyak 2 Lezghian 2 Limburgan 2 Livvi 2 Lojban 2 Lombard 2 Low German 2 Lower Sorbian 2 Luxembourgish 2 Manipuri 2 Manx 2 Maori 2 Mazanderani 2 Minangkabau 2 Mingrelian 2 Mirandese 2 Modern Greek 2 Moksha 2 Moroccan Arabic 2 Mossi 2 Naxi 2 Neapolitan 2 Nepali (individual language) 2 Newari 2 Northern Frisian 2 Northern Kurdish 2 Northern Luri 2 Northern Sami 2 Old Spanish 2 Ossetian 2 Pampanga 2 Pedi 2 Piemontese 2 Pushto 2 Sardinian 2 Sichuan Yi 2 Sicilian 2 Southern Sotho 2 Swiss-German Sign Language 2 Tai 2 Tosk Albanian 2 Turkish Sign Language 2 Tuvinian 2 Udmurt 2 Venetian 2 Volapük 2 Walloon 2 Western Frisian 2 Western Mari 2 Wu Chinese 2 Yakut 2 Yue Chinese 2 Abkhazian 1 Achinese 1 Adyghe 1 Afar 1 Akan 1 Akkadian 1 Akuntsu 1 Ambonese Malay 1 Ancient Greek 1 Ancient Hebrew 1 Apurinã 1 Arpitan 1 Assyrian Neo-Aramaic 1 Banjar 1 Bemba (Zambia) 1 Bislama 1 Bodo (India) 1 Buginese 1 Central Pashto 1 Chamorro 1 Chavacano 1 Cheyenne 1 Choctaw 1 Chukot 1 Congo Swahili 1 Coptic 1 Cree 1 Creek 1 Crimean Tatar 1 Dogri (macrolanguage) 1 Dzongkha 1 Extremaduran 1 Fiji Hindi 1 Fijian 1 French Sign Language 1 Friulian 1 Gagauz 1 Gan Chinese 1 Geez 1 Gilaki 1 Greek Sign Language 1 Hakha Chin 1 Hakka Chinese 1 Hawaiian 1 Herero 1 Hiri Motu 1 Interlingua (International Auxiliary Language Association) 1 Inupiaq 1 Jamaican Creole English 1 Kabardian 1 Kabuverdianu 1 Kachin 1 Kanuri 1 Kara-Kalpak 1 Karelian 1 Kashmiri 1 Kashubian 1 Khunsari 1 Kikuyu 1 Komi-Zyrian 1 Kongo 1 Krio 1 Kuanyama 1 Kupang Malay 1 Kölsch 1 Ladino 1 Lak 1 Latgalian 1 Ligurian 1 Literary Chinese 1 Lozi 1 Lunda 1 Luo (Cameroon) 1 Lushai 1 Makasar 1 Malayic Dayak 1 Marshallese 1 Mbyá Guaraní 1 Mesopotamian Arabic 1 Min Dong Chinese 1 Modern Greek (1453-) 1 Mundurukú 1 Najdi Arabic 1 
Narom 1 Nauru 1 Navajo 1 Nayini 1 Ndonga 1 Nigerian Fulfulde 1 North Azerbaijani 1 North Levantine Arabic 1 Northern Uzbek 1 Novial 1 Official Aramaic (700-300 BCE) 1 Old English (ca. 450-1100) 1 Old French 1 Old Russian 1 Old Turkish 1 Pali 1 Pangasinan 1 Papiamento 1 Pennsylvania German 1 Pfaelzisch 1 Picard 1 Pitcairn-Norfolk 1 Plateau Malagasy 1 Pontic 1 Rajasthani 1 Rusyn 1 Samoan 1 Sango 1 Saterfriesisch 1 Scots 1 Shan 1 Silesian 1 Skolt Sami 1 Soi 1 South Levantine Arabic 1 Southern Pashto 1 Sranan Tongo 1 Standard Latvian 1 Swahili (macrolanguage) 1 Swedish Sign Language 1 Tahitian 1 Tok Pisin 1 Tonga (Tonga Islands) 1 Tonga (Zambia) 1 Tulu 1 Tumbuka 1 Tunisian Arabic 1 Tupinambá 1 Uab Meto 1 Venda 1 Veps 1 Vlaams 1 Vlax Romani 1 Votic 1 Warlpiri 1 West Central Oromo 1 Zaza 1 Zeeuws 1 Zhuang 1 Dogri (individual language) 0 Northern Huishui Hmong 0 Portuguse 0 Saidi Arabic 0 Santali 0 Thai Song 0 Tunisian Sign Language 0