Datasets

9,496 machine learning datasets
Filter by Task
Question Answering 8, Machine Translation 6, Speech Recognition 4, Cross-Lingual Transfer 3, Data Augmentation 3, Misinformation 3, Text Categorization 3, Text Summarization 3, Token Classification 3, Abstractive Text Summarization 2, Cross-Lingual NER 2, Cross-Lingual POS Tagging 2, Document Summarization 2, Fake News Detection 2, Information Retrieval 2, Intent Detection 2, Language Identification 2, Language Modelling 2, Multilingual text classification 2, Named Entity Recognition (NER) 2, Natural Language Understanding 2, Part-Of-Speech Tagging 2, Reading Comprehension 2, Relation Extraction 2, Sentiment Analysis 2, Speech-to-Text Translation 2, Spoken Language Understanding 2, Translation 2, Word Embeddings 2, Accented Speech Recognition 1, Automatic Post-Editing 1, Automatic Speech Recognition 1, Automatic Speech Recognition (ASR) 1, Bias Detection 1, Chinese Reading Comprehension 1, Citation Recommendation 1, Code Generation 1, Cross-Lingual Abstractive Summarization 1, Cross-Lingual Document Classification 1, Cross-Lingual Natural Language Inference 1, Cross-Lingual Sentiment Classification 1, Cross-lingual zero-shot dependency parsing 1, Dependency Parsing 1, Document Classification 1, Document Translation 1, Domain Adaptation 1, Entity Embeddings 1, Entity Linking 1, Fact Verification 1, Few-shot NER 1, Food recommendation 1, Handwriting Recognition 1, Handwriting generation 1, Image Classification 1, Intent Classification 1, Keyword Spotting 1, Knowledge Graphs 1, LABELED_DEPENDENCIES 1, LEMMA 1, Language Acquisition 1, Linguistic Acceptability 1, MORPH 1, Machine Reading Comprehension 1, Multilingual Machine Comprehension in English Hindi 1, Multilingual NLP 1, Multilingual Named Entity Recognition 1, Multiple-choice 1, Multiview Clustering 1, Natural Language Inference 1, Natural Questions 1, News Summarization 1, Node Classification 1, Open-Domain Question Answering 1, POS 1, Pretrained Multilingual Language Models 1, Propaganda detection 1, Question Generation 1, Reading Comprehension (Few-Shot) 1, Reading Comprehension (One-Shot) 1, Reading Comprehension (Zero-Shot) 1, SENTS 1, Semantic Parsing 1, Sentence Embeddings 1, Slot Filling 1, Speech Synthesis 1, Speech-to-Speech Translation 1, Spoken language identification 1, TAG 1, Text Classification 1, Text Clustering 1, Text Generation 1, Text Style Transfer 1, Text-To-Speech Synthesis 1, UNLABELED_DEPENDENCIES 1, Unfairness Detection 1, Vietnamese Machine Reading Comprehension 1, XLM-R 1, Zero-Shot Cross-Lingual Transfer 1, Zero-shot Cross-lingual Fact-checking 1, text annotation 1, text similarity 1
Filter by Language (selected: Italian)
English 1443, Chinese 205, German 125, French 111, Spanish 93, Russian 88, Portuguese 61, Japanese 56, Arabic 54, Hindi 52, Korean 40, Turkish 39, Vietnamese 34, Dutch 33, Persian 31, Bengali 30, Czech 30, Tamil 30, Danish 28, Polish 28, Indonesian 27, Finnish 24, Romanian 24, Marathi 22, Multilingual 22, Telugu 21, Hungarian 19, Urdu 19, Swedish 18, Thai 18, Estonian 17, Greek 17, Bulgarian 15, Gujarati 15, Hebrew 15, Malayalam 15, Slovak 14, Swahili 14, Basque 13, Croatian 13, Punjabi 13, Ukrainian 13, Latvian 11, Mandarin Chinese 11, Norwegian 11, Slovenian 11, Amharic 10, Catalan 10, Kazakh 10, Lithuanian 10, Serbian 10, Kannada 9, Albanian 8, Armenian 8, Assamese 8, Irish 7, Oriya (macrolanguage) 7, Sanskrit 7, Sinhala 7, Tagalog 7, Welsh 7, Yoruba 7, Burmese 6, Georgian 6, Hausa 6, Icelandic 6, Igbo 6, Iranian Persian 6, Kurdish 6, Macedonian 6, Maltese 6, Mongolian 6, Somali 6, Afrikaans 5, Azerbaijani 5, Galician 5, Guarani 5, Haitian 5, Malay (individual language) 5, Norwegian Bokmål 5, Oromo 5, Sindhi 5, Uzbek 5, American Sign Language 4, Bambara 4, Belarusian 4, Breton 4, Egyptian Arabic 4, Filipino 4, Latin 4, Malagasy 4, Nigerian Pidgin 4, Norwegian Nynorsk 4, Odia 4, Scottish Gaelic 4, Serbo-Croatian 4, Tigrinya 4, Wolof 4, Bangala 3, Cebuano 3, Central Khmer 3, Central Kurdish 3, Chechen 3, Esperanto 3, Fulah 3, Ganda 3, Iloko 3, Javanese 3, Kirghiz 3, Lao 3, Lingala 3, Nepali (macrolanguage) 3, Quechua 3, South Azerbaijani 3, Standard Arabic 3, Sundanese 3, Upper Sorbian 3, Western Panjabi 3, Aragonese 2, Bashkir 2, Bavarian 2, Bhojpuri 2, Bishnupriya 2, Bosnian 2, Dhivehi 2, Erzya 2, Faroese 2, Goan Konkani 2, Jejueo 2, Kabyle 2, Kinyarwanda 2, Luo (Kenya and Tanzania) 2, Maithili 2, Malay (macrolanguage) 2, Modern Greek 2, Moroccan Arabic 2, Nepali (individual language) 2, Nyanja 2, Romansh 2, Russia Buriat 2, Swati 2, Tajik 2, Tatar 2, Tibetan 2, Tsonga 2, Tswana 2, Uighur 2, Waray (Philippines) 2, Xhosa 2, Yiddish 2, Yue Chinese 2, Akkadian 1, Akuntsu 1, Ancient Greek 1, Ancient Hebrew 1, Apurinã 1, Argentine Sign Language 1, Assyrian Neo-Aramaic 1, Asturian 1, Avaric 1, Aymara 1, Bemba (Zambia) 1, Central Bikol 1, Central Pashto 1, Chavacano 1, Chukot 1, Church Slavic 1, Chuvash 1, Congo Swahili 1, Coptic 1, Cornish 1, Dimli (individual language) 1, Dogri (macrolanguage) 1, Eastern Mari 1, Ewe 1, Fon 1, Geez 1, German Sign Language 1, Gothic 1, Gulf Arabic 1, Halh Mongolian 1, Ido 1, Interlingue 1, Inuktitut 1, Kabuverdianu 1, Kachin 1, Kalaallisut 1, Kalmyk 1, Karachay-Balkar 1, Karelian 1, Khunsari 1, Komi 1, Komi-Permyak 1, Komi-Zyrian 1, Krio 1, Lezghian 1, Limburgan 1, Literary Chinese 1, Livvi 1, Lojban 1, Lombard 1, Low German 1, Lower Sorbian 1, Lozi 1, Lunda 1, Luo (Cameroon) 1, Lushai 1, Luxembourgish 1, Manipuri 1, Manx 1, Maori 1, Mazanderani 1, Mbyá Guaraní 1, Mesopotamian Arabic 1, Minangkabau 1, Mingrelian 1, Mirandese 1, Moksha 1, Mundurukú 1, Najdi Arabic 1, Nayini 1, Neapolitan 1, Newari 1, Nigerian Fulfulde 1, North Azerbaijani 1, North Levantine Arabic 1, Northern Frisian 1, Northern Kurdish 1, Northern Luri 1, Northern Sami 1, Northern Uzbek 1, Occitan (post 1500) 1, Old French 1, Old Russian 1, Old Turkish 1, Ossetian 1, Pampanga 1, Pedi 1, Piemontese 1, Plateau Malagasy 1, Pushto 1, Rundi 1, Sardinian 1, Shan 1, Shona 1, Sicilian 1, Skolt Sami 1, Soi 1, South Levantine Arabic 1, Southern Pashto 1, Southern Sotho 1, Standard Latvian 1, Swedish Sign Language 1, Swiss German 1, Swiss-German Sign Language 1, Tonga (Zambia) 1, Tosk Albanian 1, Tupinambá 1, Turkmen 1, Tuvinian 1, Twi 1, Venetian 1, Volapük 1, Walloon 1, Warlpiri 1, West Central Oromo 1, Western Frisian 1, Western Mari 1, Wu Chinese 1, Yakut 1, Zulu 1, Abkhazian 0, Achinese 0, Adyghe 0, Afar 0, Akan 0, Arpitan 0, Bangladeshi Sign Language 0, Banjar 0, Bislama 0, Bodo (India) 0, Buginese 0, Chamorro 0, Cherokee 0, Cheyenne 0, Choctaw 0, Corsican 0, Cree 0, Creek 0, Crimean Tatar 0, Dogri (individual language) 0, Dzongkha 0, Extremaduran 0, Fiji Hindi 0, Fijian 0, Friulian 0, Gagauz 0, Gan Chinese 0, Gilaki 0, Greek Sign Language 0, Hakha Chin 0, Hakka Chinese 0, Hawaiian 0, Herero 0, Hiri Motu 0, Interlingua (International Auxiliary Language Association) 0, Inupiaq 0, Jamaican Creole English 0, Kabardian 0, Kanuri 0, Kara-Kalpak 0, Kashmiri 0, Kashubian 0, Kikuyu 0, Kongo 0, Kuanyama 0, Kölsch 0, Ladino 0, Lak 0, Latgalian 0, Ligurian 0, Marshallese 0, Min Dong Chinese 0, Modern Greek (1453-) 0, Narom 0, Nauru 0, Navajo 0, Naxi 0, Ndonga 0, Northern Huishui Hmong 0, Novial 0, Official Aramaic (700-300 BCE) 0, Old English (ca. 450-1100) 0, Pali 0, Pangasinan 0, Papiamento 0, Pennsylvania German 0, Pfaelzisch 0, Picard 0, Pitcairn-Norfolk 0, Pontic 0, Portuguse 0, Rajasthani 0, Rusyn 0, Saidi Arabic 0, Samoan 0, Sango 0, Santali 0, Saterfriesisch 0, Scots 0, Sichuan Yi 0, Silesian 0, Sranan Tongo 0, Swahili (macrolanguage) 0, Tahitian 0, Tai 0, Tetum 0, Tok Pisin 0, Tonga (Tonga Islands) 0, Tulu 0, Tumbuka 0, Tunisian Arabic 0, Turkish Sign Language 0, Udmurt 0, Venda 0, Veps 0, Vlaams 0, Vlax Romani 0, Votic 0, Zaza 0, Zeeuws 0, Zhuang 0

58 dataset results for Texts AND Italian