Datasets

9,703 machine learning datasets

14 dataset results for Croatian