Datasets

10,086 machine learning datasets
Filter by Language
North Levantine Arabic, English 3019, Chinese 337, German 172, French 153, Spanish 129, Russian 116, Japanese 88, Arabic 87, Italian 83, Portuguese 80, Hindi 71, Vietnamese 61, Korean 56, Turkish 52, Bengali 49, Persian 46, Dutch 44, Tamil 42, Polish 40, Czech 37, Indonesian 36, Danish 34, Finnish 33, Romanian 32, Telugu 30, Multilingual 29, Urdu 29, Hungarian 26, Swedish 26, Thai 26, Marathi 25, Greek 24, Gujarati 23, Estonian 22, Hebrew 22, Malayalam 22, Mandarin Chinese 22, Bulgarian 21, Basque 19, Kannada 18, Punjabi 17, Slovak 17, Swahili 17, Catalan 16, Latvian 16, Ukrainian 16, Croatian 15, Lithuanian 15, Slovenian 15, Kazakh 14, Norwegian 14, Serbian 14, Amharic 13, Assamese 12, Iranian Persian 12, Kurdish 12, Albanian 11, Irish 10, Maltese 10, Yoruba 10, Armenian 9, Burmese 9, Hausa 9, Igbo 9, Macedonian 9, Oriya (macrolanguage) 9, Welsh 9, American Sign Language 8, Georgian 8, Mongolian 8, Odia 8, Sanskrit 8, Sinhala 8, Tagalog 8, Azerbaijani 7, Bambara 7, Breton 7, Filipino 7, Icelandic 7, Malagasy 7, Nepali (macrolanguage) 7, Oromo 7, Serbo-Croatian 7, Somali 7, Uzbek 7, Wolof 7, Afrikaans 6, Central Khmer 6, Central Kurdish 6, Esperanto 6, Galician 6, Ganda 6, Guarani 6, Haitian 6, Nigerian Pidgin 6, Sindhi 6, Tigrinya 6, Western Panjabi 6, Belarusian 5, Egyptian Arabic 5, Fon 5, Javanese 5, Latin 5, Lingala 5, Malay (individual language) 5, Norwegian Bokmål 5, Norwegian Nynorsk 5, Quechua 5, Scottish Gaelic 5, Standard Arabic 5, Sundanese 5, Tibetan 5, Tswana 5, Aymara 4, Bangala 4, Bosnian 4, Cebuano 4, Chechen 4, Dhivehi 4, Ewe 4, Fulah 4, Iloko 4, Kabyle 4, Kinyarwanda 4, Kirghiz 4, Lao 4, Luo (Kenya and Tanzania) 4, Nyanja 4, South Azerbaijani 4, Tatar 4, Twi 4, Upper Sorbian 4, Xhosa 4, Zulu 4, Aragonese 3, Bashkir 3, Bavarian 3, Bishnupriya 3, Chuvash 3, Erzya 3, Faroese 3, Goan Konkani 3, Maithili 3, Malay (macrolanguage) 3, Romansh 3, Rundi 3, Russia Buriat 3, Shona 3, Swati 3, Swiss German 3, Tajik 3, Tetum 3, Tsonga 3, Uighur 3, Waray (Philippines) 3, Yiddish 3, Argentine Sign Language 2, Asturian 2, Avaric 2, Bangladeshi Sign Language 2, Bhojpuri 2, Central Bikol 2, Cherokee 2, Church Slavic 2, Cornish 2,
Dimli (individual language) 2, Eastern Mari 2, German Sign Language 2, Gothic 2, Gulf Arabic 2, Ido 2, Interlingue 2, Inuktitut 2, Jejueo 2, Kalaallisut 2, Kalmyk 2, Karachay-Balkar 2, Komi 2, Komi-Permyak 2, Lezghian 2, Limburgan 2, Livvi 2, Lojban 2, Lombard 2, Low German 2, Lower Sorbian 2, Luxembourgish 2, Manipuri 2, Manx 2, Maori 2, Mazanderani 2, Minangkabau 2, Mingrelian 2, Mirandese 2, Modern Greek 2, Moksha 2, Moroccan Arabic 2, Mossi 2, Naxi 2, Neapolitan 2, Nepali (individual language) 2, Newari 2, Northern Frisian 2, Northern Kurdish 2, Northern Luri 2, Northern Sami 2, Occitan (post 1500) 2, Ossetian 2, Pampanga 2, Pedi 2, Piemontese 2, Pushto 2, Sardinian 2, Sichuan Yi 2, Sicilian 2, Southern Sotho 2, Swiss-German Sign Language 2, Tai 2, Tosk Albanian 2, Turkish Sign Language 2, Turkmen 2, Tuvinian 2, Udmurt 2, Venetian 2, Volapük 2, Walloon 2, Western Frisian 2, Western Mari 2, Wu Chinese 2, Yakut 2, Yue Chinese 2, Abkhazian 1, Achinese 1, Adyghe 1, Afar 1, Akan 1, Akkadian 1, Akuntsu 1, Ancient Greek 1, Ancient Hebrew 1, Apurinã 1, Arpitan 1, Assyrian Neo-Aramaic 1, Banjar 1, Bemba (Zambia) 1, Bislama 1, Bodo (India) 1, Buginese 1, Central Pashto 1, Chamorro 1, Chavacano 1, Cheyenne 1, Choctaw 1, Chukot 1, Congo Swahili 1, Coptic 1, Corsican 1, Cree 1, Creek 1, Crimean Tatar 1, Dogri (macrolanguage) 1, Dzongkha 1, Extremaduran 1, Fiji Hindi 1, Fijian 1, French Sign Language 1, Friulian 1, Gagauz 1, Gan Chinese 1, Geez 1, Gilaki 1, Greek Sign Language 1, Hakha Chin 1, Hakka Chinese 1, Halh Mongolian 1, Hawaiian 1, Herero 1, Hiri Motu 1, Interlingua (International Auxiliary Language Association) 1, Inupiaq 1, Jamaican Creole English 1, Kabardian 1, Kabuverdianu 1, Kachin 1, Kanuri 1, Kara-Kalpak 1, Karelian 1, Kashmiri 1, Kashubian 1, Khunsari 1, Kikuyu 1, Komi-Zyrian 1, Kongo 1, Krio 1, Kuanyama 1, Kölsch 1, Ladino 1, Lak 1, Latgalian 1, Ligurian 1, Literary Chinese 1, Lozi 1, Lunda 1, Luo (Cameroon) 1, Lushai 1, Marshallese 1, Mbyá Guaraní 1, Mesopotamian Arabic 1, Min Dong Chinese 1, Modern Greek (1453-) 1, Mundurukú 1, Najdi Arabic 1, Narom 1, Nauru 1, Navajo 1, Nayini 1, Ndonga 1, Nigerian Fulfulde 1,
North Azerbaijani 1, Northern Uzbek 1, Novial 1, Official Aramaic (700-300 BCE) 1, Old English (ca. 450-1100) 1, Old French 1, Old Russian 1, Old Turkish 1, Pali 1, Pangasinan 1, Papiamento 1, Pennsylvania German 1, Pfaelzisch 1, Picard 1, Pitcairn-Norfolk 1, Plateau Malagasy 1, Pontic 1, Rajasthani 1, Rusyn 1, Samoan 1, Sango 1, Saterfriesisch 1, Scots 1, Shan 1, Silesian 1, Skolt Sami 1, Soi 1, South Levantine Arabic 1, Southern Pashto 1, Sranan Tongo 1, Standard Latvian 1, Swahili (macrolanguage) 1, Swedish Sign Language 1, Tahitian 1, Tok Pisin 1, Tonga (Tonga Islands) 1, Tonga (Zambia) 1, Tulu 1, Tumbuka 1, Tunisian Arabic 1, Tupinambá 1, Venda 1, Veps 1, Vlaams 1, Vlax Romani 1, Votic 1, Warlpiri 1, West Central Oromo 1, Zaza 1, Zeeuws 1, Zhuang 1, Dogri (individual language) 0, Northern Huishui Hmong 0, Portuguse 0, Saidi Arabic 0, Santali 0