Datasets

11,845 machine learning datasets
Filter by Task
Question Answering 309 Text Generation 142 Language Modelling 122 Text Classification 118 Visual Question Answering (VQA) 103 Named Entity Recognition (NER) 98 Text Summarization 78 Reading Comprehension 77 Natural Language Inference 68 Sentiment Analysis 67 Information Retrieval 63 Machine Translation 61 Classification 59 Relation Extraction 58 Natural Language Understanding 53 Common Sense Reasoning 47 Image Captioning 44 Code Generation 43 Hate Speech Detection 36 Abstractive Text Summarization 35 Coreference Resolution 35 Machine Reading Comprehension 35 Entity Linking 34 Misinformation 32 Video Question Answering 32 Word Embeddings 30 Semantic Parsing 28 Visual Question Answering 28 Data Augmentation 27 Retrieval 27 Speech Recognition 27 Stance Detection 25 Video Captioning 25 Document Summarization 24 Image Retrieval 24 Text Retrieval 24 Video Retrieval 24 Open-Domain Question Answering 23 Dialogue Generation 22 Fake News Detection 22 Visual Reasoning 22 Knowledge Graphs 21 Optical Character Recognition (OCR) 21 Recommendation Systems 21 Automatic Speech Recognition (ASR) 20 Data-to-Text Generation 20 Part-Of-Speech Tagging 20 Video Understanding 20 Image Classification 19 NER 19 Slot Filling 19 Intent Detection 18 Question Generation 18 Relation Classification 18 Domain Adaptation 17 Emotion Recognition 17 Few-Shot Learning 17 Logical Reasoning 17 Mathematical Reasoning 17 Multi-Task Learning 17 Task-Oriented Dialogue Systems 17 Math Word Problem Solving 16 Paraphrase Identification 16 Text Simplification 16 Text-to-Image Generation 16 Language Identification 15 Link Prediction 15 Semantic Textual Similarity 15 Fact Verification 14 Object Detection 14 Zero-shot Text Search 14 Decision Making 13 Grammatical Error Correction 13 Multi-Label Classification 13 Named Entity Recognition 13 Sarcasm Detection 13 Sentiment Classification 13 Translation 13 Zero-Shot Learning 13 Aspect-Based Sentiment Analysis (ABSA) 12 Dialogue State Tracking 12 Emotion Classification 
12 Emotion Recognition in Conversation 12 Handwriting Recognition 12 Image Generation 12 Instruction Following 12 Joint Entity and Relation Extraction 12 Multi-Document Summarization 12 Paraphrase Generation 12 Text-To-SQL 12 Word Sense Disambiguation 12 Zero-Shot Video Question Answer 12 Cross-Lingual Transfer 11 Event Extraction 11 Knowledge Base Question Answering 11 Multiple-choice 11 Open Information Extraction 11 Semantic Segmentation 11 Semantic Similarity 11 Sentence Classification 11 Text-To-Speech Synthesis 11 UIE 11 Code Search 10 Conversational Response Selection 10 Cross-Modal Retrieval 10 Dependency Parsing 10 Document Classification 10 Handwritten Text Recognition 10 Intent Classification 10 Mathematical Question Answering 10 Medical Visual Question Answering 10 Multi-Label Text Classification 10 News Classification 10 Open-Domain Dialog 10 Semantic Role Labeling 10 Topic Models 10 Vision and Language Navigation 10 Automatic Post-Editing 9 Binary text classification 9 Entity Disambiguation 9 Entity Typing 9 Fact Checking 9 Multimodal Deep Learning 9 Nested Named Entity Recognition 9 Passage Retrieval 9 Scene Text Recognition 9 Sign Language Translation 9 Speech Synthesis 9 Text-to-Video Generation 9 Video Generation 9 Visual Dialog 9 Abusive Language 8 Answer Selection 8 Code Completion 8 Handwriting generation 8 Large Language Model 8 Multimodal Reasoning 8 Node Classification 8 Referring Expression Segmentation 8 Scene Text Detection 8 Sentence Embeddings 8 Speaker Verification 8 Spoken Language Understanding 8 Text-to-Code Generation 8 Topic Classification 8 Chatbot 7 Conversational Question Answering 7 Dense Video Captioning 7 Dialogue Understanding 7 Discourse Parsing 7 Explanation Generation 7 Extreme Summarization 7 Fairness 7 Medical Report Generation 7 Multilingual NLP 7 Multiple Choice Question Answering (MCQA) 7 Opinion Mining 7 Response Generation 7 Speech Emotion Recognition 7 Stance Classification 7 Story Generation 7 Temporal Tagging 7 
Toxic Comment Classification 7 Vision-Language Navigation 7 Visual Grounding 7 Ad-hoc video search 6 Arithmetic Reasoning 6 Attribute Value Extraction 6 Automatic Phoneme Recognition 6 Automatic Speech Recognition 6 Bandwidth Extension 6 Bias Detection 6 Binary Classification 6 Chart Question Answering 6 Chinese Reading Comprehension 6 Cross-Lingual NER 6 Dialogue Act Classification 6 Event Detection 6 Hierarchical Multi-label Classification 6 Image-to-Text Retrieval 6 Layout-to-Image Generation 6 Lip Reading 6 Lipreading 6 Masked Language Modeling 6 Medical Named Entity Recognition 6 Meeting Summarization 6 Moment Retrieval 6 Multimodal Emotion Recognition 6 Multimodal Sentiment Analysis 6 News Summarization 6 Phrase Grounding 6 Referring Expression Comprehension 6 Self-Supervised Learning 6 Sign Language Recognition 6 Speech Enhancement 6 Table-to-Text Generation 6 Twitter Sentiment Analysis 6 Video Grounding 6 Video Summarization 6 Visual Commonsense Reasoning 6 Zero-Shot Video Retrieval 6 regression 6 AI and Safety 5 AMR Parsing 5 AMR-to-Text Generation 5 Abstractive Dialogue Summarization 5 Anomaly Detection 5 Argument Mining 5 Audio captioning 5 Audio-Visual Speech Recognition 5 Automated Theorem Proving 5 Chinese Named Entity Recognition 5 Citation Recommendation 5 Code Repair 5 Code Summarization 5 Code Translation 5 Community Question Answering 5 Composed Image Retrieval (CoIR) 5 Conversational Response Generation 5 Dialogue Evaluation 5 Document Ranking 5 Extractive Text Summarization 5 Generative Question Answering 5 Gloss-free Sign Language Translation 5 Goal-Oriented Dialog 5 Grammatical Error Detection 5 Key Information Extraction 5 Keyword Extraction 5 Language Acquisition 5 Learning-To-Rank 5 Linguistic Acceptability 5 Medical Concept Normalization 5 Medical Diagnosis 5 Medical Relation Extraction 5 Motion Synthesis 5 Multi-class Classification 5 Multilabel Text Classification 5 Music Generation 5 Natural Language Visual Grounding 5 News
Generation 5 Open Intent Discovery 5 Reading Comprehension (Few-Shot) 5 Reading Comprehension (One-Shot) 5 Reading Comprehension (Zero-Shot) 5 SSTOD 5 Scene Graph Generation 5 Scientific Document Summarization 5 Speech Separation 5 Stochastic Optimization 5 Style Transfer 5 Text to Audio Retrieval 5 Token Classification 5 Transfer Learning 5 Visual Navigation 5 Vulnerability Detection 5 Zero-Shot Composed Image Retrieval (ZS-CIR) 5 multimodal generation 5 Abuse Detection 4 Action Recognition 4 Adversarial Robustness 4 Arabic Sentiment Analysis 4 Aspect Category Detection 4 Audio to Text Retrieval 4 Code Classification 4 Constituency Parsing 4 Cross-Lingual Question Answering 4 Discourse Segmentation 4 Elementary Mathematics 4 Entity Resolution 4 Entity Retrieval 4 Event Coreference Resolution 4 Few-Shot Relation Classification 4 Few-shot NER 4 Gender Bias Detection 4 Genre classification 4 Gesture Generation 4 Graph Classification 4 Graph Embedding 4 Image to Video Generation 4 Image-text Retrieval 4 KG-to-Text Generation 4 Keyphrase Extraction 4 Knowledge Probing 4 Logical Reasoning Question Answering 4 Long Form Question Answering 4 Low Resource Named Entity Recognition 4 Math 4 Memorization 4 Multi-task Language Understanding 4 Multimodal Recommendation 4 Music Captioning 4 Natural Language Moment Retrieval 4 Natural Questions 4 Nested Mention Recognition 4 News Recommendation 4 Paper generation 4 Person Re-Identification 4 Product Recommendation 4 Program Repair 4 RAG 4 Reinforcement Learning (RL) 4 Relational Reasoning 4 Sentence Embedding 4 Sequence-to-sequence Language Modeling 4 Spelling Correction 4 Systematic Generalization 4 Table Detection 4 Temporal Relation Classification 4 Temporal Relation Extraction 4 Term Extraction 4 Text Clustering 4 Text Matching 4 Text Segmentation 4 Text Style Transfer 4 Text to Video Retrieval 4 Text-to-Music Generation 4 Video Description 4 Video Segmentation 4 Video-Text Retrieval 4 Visual Relationship Detection 4 Visual 
Storytelling 4 Weakly-Supervised Named Entity Recognition 4 Zero-Shot Cross-Lingual Transfer 4 coreference-resolution 4 text annotation 4 2D Object Detection 3 3D Object Detection 3 Active Learning 3 Adversarial Attack 3 Answer Generation 3 Arabic Text Diacritization 3 Art Analysis 3 Aspect Category Polarity 3 Aspect Extraction 3 Aspect Term Extraction and Sentiment Classification 3 Aspect-Category-Opinion-Sentiment Quadruple Extraction 3 Attribute Mining 3 Audio Generation 3 Binary Relation Extraction 3 Biomedical Information Retrieval 3 Boundary Detection 3 Chinese Word Segmentation 3 Chunking 3 Citation Intent Classification 3 Click-Through Rate Prediction 3 Cloze (multi-choices) (Few-Shot) 3 Cloze (multi-choices) (One-Shot) 3 Cloze (multi-choices) (Zero-Shot) 3 Code Documentation Generation 3 College Mathematics 3 Conditional Text Generation 3 Continual Learning 3 Cross Document Coreference Resolution 3 Cross-Lingual Abstractive Summarization 3 Cross-modal retrieval with noisy correspondence 3 Data-free Knowledge Distillation 3 Definition Extraction 3 Depression Detection 3 Dialect Identification 3 Document Text Classification 3 Document-level Closed Information Extraction 3 Entity Alignment 3 Explainable artificial intelligence 3 Extractive Summarization 3 FG-1-PG-1 3 Few-Shot Image Classification 3 Financial Relation Extraction 3 Formal Logic 3 Game of Sudoku 3 Gender Prediction 3 Goal-Oriented Dialogue Systems 3 Humor Detection 3 Informal-to-formal Style Transfer 3 Instance Segmentation 3 Intent Discovery 3 Intent Recognition 3 Joint Event and Temporal Relation Extraction 3 Key-value Pair Extraction 3 LLM Jailbreak 3 Lemmatization 3 Lexical Entailment 3 Long-range modeling 3 Low-Resource Neural Machine Translation 3 Math Word Problem SolvingΩ 3 Meme Classification 3 Multi-Domain Recommender Systems 3 Multi-Hop Reading Comprehension 3 Multi-Label Learning 3 Multi-hop Question Answering 3 Multi-modal Dialogue Generation 3 Multimodal Abstractive Text 
Summarization 3 Multimodal Intent Recognition 3 Multimodal Machine Translation 3 Multiple Instance Learning 3 Music Recommendation 3 Negation Detection 3 News Annotation 3 Object Counting 3 Object Recognition 3 Open Intent Detection 3 Out of Distribution (OOD) Detection 3 Person Retrieval 3 Person Search 3 Point Processes 3 Recognizing Emotion Cause in Conversations 3 Referring Expression 3 Representation Learning 3 Review Generation 3 STS 3 Satire Detection 3 Scene Graph Detection 3 Science Question Answering 3 Segmentation 3 Semantic Image-Text Similarity 3 Sentence Retrieval 3 Sentence-Pair Classification 3 Sequential Recommendation 3 Sign Language Production 3 Source Code Summarization 3 Speaker Diarization 3 Speech-to-Text Translation 3 Spoken Dialogue Systems 3 Structured Prediction 3 Temporal Action Localization 3 Temporal/Casual QA 3 Text Categorization 3 Text Pair Classification 3 Text Reranking 3 Text based Person Retrieval 3 Text to 3D 3 Text-based Person Retrieval with Noisy Correspondence 3 Text-based de novo Molecule Generation 3 Text2text Generation 3 Time Series Forecasting 3 Transliteration 3 TruthfulQA 3 Unconstrained Lip-synchronization 3 Unsupervised Extractive Summarization 3 Unsupervised Machine Translation 3 Unsupervised Text Classification 3 Video Editing 3 Video Inpainting 3 Visual Entailment 3 Visual Speech Recognition 3 Voice Conversion 3 Weakly Supervised Classification 3 Word Alignment 3 Zero-Shot Text Classification 3 Zero-shot Image Retrieval 3 Zero-shot Named Entity Recognition (NER) 3 Zero-shot Text-to-Image Retrieval 3 automatic-speech-translation 3 knowledge editing 3 text similarity 3 2D Semantic Segmentation 2 3D Anomaly Detection 2 3D Face Animation 2 3D Shape Modeling 2 AbbreviationDetection 2 Action Anticipation 2 Ad-Hoc Information Retrieval 2 Age And Gender Classification 2 Aggression Identification 2 Argument Retrieval 2 Aspect Sentiment Triplet Extraction 2 Astronomy 2 Audio Classification 2 Author Attribution 2 AutoML 2 
Automated Essay Scoring 2 Autonomous Driving 2 Bayesian Inference 2 CAD Reconstruction 2 Caption Generation 2 Causal Inference 2 Cell Segmentation 2 Chinese Sentence Pair Classification 2 Citation Prediction 2 Claim Extraction with Stance Classification (CESC) 2 Claim Verification 2 Classification of toxic, engaging, fact-claiming comments 2 Clustering 2 Code Comment Generation 2 Common Sense Reasoning (Zero-Shot) 2 Commonsense Causal Reasoning 2 Commonsense Knowledge Base Construction 2 Community Detection 2 Computer Security 2 ContextNER 2 Continual Pretraining 2 Counterfactual Reasoning 2 Cross-Lingual Document Classification 2 Cross-Lingual Natural Language Inference 2 Cross-Lingual POS Tagging 2 Cross-Lingual Paraphrase Identification 2 Cross-Modal Person Re-Identification 2 Curved Text Detection 2 Neural Architecture Search 2
Filter by Language
English 1853 Chinese 247 German 143 French 141 Spanish 114 Russian 103 Portuguese 72 Italian 70 Japanese 69 Hindi 65 Arabic 64 Korean 47 Vietnamese 47 Bengali 46 Turkish 46 Dutch 39 Persian 38 Czech 36 Tamil 36 Indonesian 34 Danish 33 Polish 33 Romanian 29 Finnish 28 Marathi 28 Thai 27 Telugu 25 Hungarian 24 Swedish 24 Urdu 23 Greek 22 Estonian 21 Gujarati 21 Multilingual 21 Bulgarian 18 Hebrew 18 Ukrainian 18 Swahili 17 Croatian 16 Malayalam 16 Punjabi 15 Slovak 15 Basque 14 Catalan 14 Latvian 14 Lithuanian 14 Serbian 13 Slovenian 13 Amharic 12 Kazakh 12 Norwegian 12 Mandarin Chinese 11 Albanian 10 Kannada 10 Sinhala 10 Yoruba 10 Armenian 8 Burmese 8 Filipino 8 Irish 8 Kurdish 8 Macedonian 8 Sanskrit 8 Tagalog 8 Welsh 8 Assamese 7 Galician 7 Hausa 7 Igbo 7 Iranian Persian 7 Mongolian 7 Azerbaijani 6 Maltese 6 Nigerian Pidgin 6 Odia 6 Oriya (macrolanguage) 6 Afrikaans 5 American Sign Language 5 Bambara 5 Breton 5 Central Khmer 5 Georgian 5 Guarani 5 Icelandic 5 Malagasy 5 Nepali (individual language) 5 Oromo 5 Somali 5 Western Panjabi 5 Wolof 5 Belarusian 4 Bosnian 4 Esperanto 4 Ganda 4 Haitian 4 Javanese 4 Latin 4 Malay (individual language) 4 Nepali (macrolanguage) 4 Norwegian Bokmål 4 Norwegian Nynorsk 4 Scottish Gaelic 4 Sindhi 4 Sundanese 4 Tigrinya 4 Uzbek 4 Aymara 3 Bangala 3 Chechen 3 Egyptian Arabic 3 Ewe 3 Fon 3 Lingala 3 Luo (Kenya and Tanzania) 3 Quechua 3 Serbo-Croatian 3 Tetum 3 Tswana 3 Upper Sorbian 3 Xhosa 3 Aragonese 2 Bashkir 2 Bavarian 2 Bhojpuri 2 Bishnupriya 2 Cebuano 2 Central Kurdish 2 Central Pashto 2 Dhivehi 2 Erzya 2 Faroese 2 Fulah 2 Goan Konkani 2 Iloko 2 Interlingue 2 Jejueo 2 Kabyle 2 Kirghiz 2 Lao 2 Maithili 2 Modern Greek 2 Mossi 2 Occitan (post 1500) 2 Romansh 2 Rundi 2 Russia Buriat 2 Sardinian 2 South Azerbaijani 2 Southern Pashto 2 Standard Arabic 2 Tatar 2 Tibetan 2 Turkmen 2 Twi 2 Uighur 2 Yiddish 2 Yue Chinese 2 Zulu 2 Akkadian 1 Akuntsu 1 Ancient Greek 1 Ancient Hebrew 1 Apurinã 1 Argentine Sign Language 1 Assyrian 
Neo-Aramaic 1 Asturian 1 Avaric 1 Bemba (Zambia) 1 Central Bikol 1 Chavacano 1 Chukot 1 Church Slavic 1 Chuvash 1 Congo Swahili 1 Coptic 1 Cornish 1 Corsican 1 Cusco Quechua 1 Dimli (individual language) 1 Dogri (macrolanguage) 1 Eastern Mari 1 French Sign Language 1 Geez 1 German Sign Language 1 Gothic 1 Halh Mongolian 1 Ido 1 Inuktitut 1 Kalaallisut 1 Kalmyk 1 Karachay-Balkar 1 Karelian 1 Khunsari 1 Kinyarwanda 1 Komi 1 Komi-Permyak 1 Komi-Zyrian 1 Krio 1 Lezghian 1 Limburgan 1 Literary Chinese 1 Livvi 1 Lojban 1 Lombard 1 Low German 1 Lower Sorbian 1 Lozi 1 Lunda 1 Luo (Cameroon) 1 Lushai 1 Luxembourgish 1 Malay (macrolanguage) 1 Manipuri 1 Manx 1 Maori 1 Mazanderani 1 Mbyá Guaraní 1 Minangkabau 1 Mingrelian 1 Mirandese 1 Moksha 1 Moroccan Arabic 1 Mundurukú 1 Nayini 1 Neapolitan 1 Newari 1 Northern Frisian 1 Northern Kurdish 1 Northern Luri 1 Northern Sami 1 Nyanja 1 Old French 1 Old Russian 1 Old Spanish 1 Old Turkish 1 Ossetian 1 Pampanga 1 Piemontese 1 Pushto 1 Shona 1 Sicilian 1 Skolt Sami 1 Soi 1 South Levantine Arabic 1 Swati 1 Swedish Sign Language 1 Swiss German 1 Swiss-German Sign Language 1 Tajik 1 Tonga (Zambia) 1 Tsonga 1 Tupinambá 1 Tuvinian 1 Venda 1 Venetian 1 Volapük 1 Walloon 1 Waray (Philippines) 1 Warlpiri 1 Western Frisian 1 Western Mari 1 Wu Chinese 1 Yakut 1 Abkhazian 0 Achinese 0 Adyghe 0 Afar 0 Akan 0 Ambonese Malay 0 Andaman Creole Hindi 0 Arpitan 0 Bangladeshi Sign Language 0 Banjar 0 Bislama 0 Bodo (India) 0 Buginese 0 Chamorro 0 Cherokee 0 Cheyenne 0 Choctaw 0 Cree 0 Creek 0 Crimean Tatar 0 Dogri (individual language) 0 Dzongkha 0 Extremaduran 0 Fiji Hindi 0 Fijian 0 Friulian 0 Gagauz 0 Gan Chinese 0 Gilaki 0 Greek Sign Language 0 Gulf Arabic 0 Hakha Chin 0 Hakka Chinese 0 Hawaiian 0 Herero 0 Hiri Motu 0 Interlingua (International Auxiliary Language Association) 0 Inupiaq 0 Jamaican Creole English 0 Kabardian 0 Kabuverdianu 0 Kachin 0 Kanuri 0 Kara-Kalpak 0 Kashmiri 0 Kashubian 0 Kikuyu 0 Kongo 0 Kuanyama 0 Kupang Malay 0 Kölsch 0 
Ladino 0 Lak 0 Latgalian 0 Ligurian 0 Lingua Franca 0 Makasar 0 Malayic Dayak 0 Marshallese 0 Mesopotamian Arabic 0 Min Dong Chinese 0 Modern Greek (1453-) 0 Najdi Arabic 0 Narom 0 Nauru 0 Navajo 0 Naxi 0 Ndonga 0 Nigerian Fulfulde 0 North Azerbaijani 0 North Levantine Arabic 0 Northern Huishui Hmong 0 Northern Uzbek 0 Novial 0 Official Aramaic (700-300 BCE) 0 Old English (ca. 450-1100) 0 Pali 0 Pangasinan 0 Papiamento 0 Pedi 0 Pennsylvania German 0 Pfaelzisch 0 Picard 0 Pitcairn-Norfolk 0 Plateau Malagasy 0 Pontic 0 Portuguse 0 Rajasthani 0 Rusyn 0 Saidi Arabic 0 Samoan 0 Sango 0 Santali 0 Saterfriesisch 0 Scots 0 Shan 0 Sichuan Yi 0 Silesian 0 Southern Sotho 0 Sranan Tongo 0 Standard Latvian 0 Swahili (macrolanguage) 0 Tahitian 0 Tai 0 Thai Song 0 Tok Pisin 0 Tonga (Tonga Islands) 0 Tosk Albanian 0 Tulu 0 Tumbuka 0 Tunisian Arabic 0 Tunisian Sign Language 0 Turkish Sign Language 0 Uab Meto 0 Udmurt 0 Veps 0 Vlaams 0 Vlax Romani 0 Votic 0 West Central Oromo 0 Zaza 0 Zeeuws 0 Zhuang 0

3,041 dataset results for Texts