Datasets

7,957 machine learning datasets

23 dataset results for Multilingual