Datasets

5,437 machine learning datasets
Filter by Task
Question Answering 207 Language Modelling 94 Reading Comprehension 71 Named Entity Recognition 57 Natural Language Inference 53 Visual Question Answering 53 Text Classification 51 Text Generation 50 Text Summarization 44 Machine Translation 42 Sentiment Analysis 42 Natural Language Understanding 36 Information Retrieval 35 Relation Extraction 33 Coreference Resolution 29 Machine Reading Comprehension 29 Common Sense Reasoning 28 Data Augmentation 27 Word Embeddings 27 Abstractive Text Summarization 23 Image Captioning 23 Semantic Parsing 23 Document Summarization 21 Entity Linking 17 Data-to-Text Generation 16 Fake News Detection 16 Video Question Answering 16 Misinformation 15 Video Captioning 15 Video Retrieval 15 Code Generation 14 Knowledge Graphs 14 Open-Domain Question Answering 14 Paraphrase Identification 14 Semantic Textual Similarity 14 Domain Adaptation 13 Part-Of-Speech Tagging 13 Cross-Lingual Transfer 12 Decision Making 12 Dialogue Generation 12 Hate Speech Detection 12 Multi-Task Learning 12 Question Generation 12 Recommendation Systems 12 Speech Recognition 12 Word Sense Disambiguation 12 Handwriting Recognition 11 Multi-Document Summarization 11 Paraphrase Generation 10 Relation Classification 10 Slot Filling 10 Visual Reasoning 10 Conversational Response Selection 9 Dependency Parsing 9 Fact Verification 9 Link Prediction 9 Vision and Language Navigation 9 Visual Dialog 9 Automatic Post-Editing 8 Dialogue State Tracking 8 Emotion Recognition 8 Intent Detection 8 Language Identification 8 Referring Expression Segmentation 8 Sentence Embeddings 8 Task-Oriented Dialogue Systems 8 Text Simplification 8 Document Classification 7 Emotion Classification 7 Entity Disambiguation 7 Event Extraction 7 Image Classification 7 Image Retrieval 7 Multi-Label Classification 7 Nested Named Entity Recognition 7 Open Information Extraction 7 Sentence Classification 7 Speech Synthesis 7 Stance Detection 7 Code Search 6 Cross-Modal Retrieval 6 Emotion Recognition in 
Conversation 6 Entity Typing 6 Grammatical Error Correction 6 Handwriting generation 6 Object Detection 6 Optical Character Recognition 6 Scene Text Detection 6 Sequence-to-sequence Language Modeling 6 Spoken Language Understanding 6 Table-to-Text Generation 6 Token Classification 6 Video Understanding 6 Zero-Shot Learning 6 Abusive Language 5 Answer Selection 5 Aspect-Based Sentiment Analysis 5 Chinese Reading Comprehension 5 Goal-Oriented Dialog 5 Joint Entity and Relation Extraction 5 Knowledge Base Question Answering 5 Learning-To-Rank 5 Math Word Problem Solving 5 Mathematical Question Answering 5 Medical Named Entity Recognition 5 Medical Relation Extraction 5 Multi-Label Text Classification 5 Opinion Mining 5 Sarcasm Detection 5 Scene Text Recognition 5 Scientific Document Summarization 5 Semantic Role Labeling 5 Semantic Segmentation 5 Semantic Similarity 5 Stochastic Optimization 5 Text-To-Sql 5 Vision-Language Navigation 5 Visual Navigation 5 Action Recognition 4 Automated Theorem Proving 4 Chatbot 4 Chinese Named Entity Recognition 4 Community Question Answering 4 Constituency Parsing 4 Dialogue Understanding 4 Discourse Parsing 4 Discourse Segmentation 4 Document Ranking 4 Extractive Text Summarization 4 Fairness 4 Handwritten Text Recognition 4 Image Generation 4 Intent Classification 4 Language Acquisition 4 Multimodal Machine Translation 4 Multimodal Sentiment Analysis 4 Nested Mention Recognition 4 News Classification 4 Passage Retrieval 4 Relational Reasoning 4 Self-Supervised Learning 4 Sentence Embedding 4 Text-To-Speech Synthesis 4 Weakly-Supervised Named Entity Recognition 4 Abuse Detection 3 Anomaly Detection 3 Biomedical Information Retrieval 3 Code Documentation Generation 3 Code Summarization 3 Cross-Lingual Question Answering 3 Dense Video Captioning 3 Entity Retrieval 3 Explainable artificial intelligence 3 Extreme Summarization 3 Few-Shot NLI 3 Goal-Oriented Dialogue Systems 3 Graph Classification 3 KG-to-Text Generation 3 Lexical 
Entailment 3 Lipreading 3 Low Resource Named Entity Recognition 3 Multiple Instance Learning 3 Natural Language Moment Retrieval 3 Open Intent Discovery 3 Paper generation 3 Person Search 3 Phrase Grounding 3 Program Repair 3 Reading Comprehension (Few-Shot) 3 Reading Comprehension (One-Shot) 3 Reading Comprehension (Zero-Shot) 3 Source Code Summarization 3 Story Generation 3 Structured Prediction 3 Style Transfer 3 Systematic Generalization 3 Table Detection 3 Text Style Transfer 3 Text-Image Retrieval 3 Unsupervised Machine Translation 3 Weakly Supervised Classification 3 Word Alignment 3 Zero-Shot Cross-Lingual Transfer 3 Action Classification 2 Active Learning 2 Arabic Sentiment Analysis 2 Argument Mining 2 Art Analysis 2 Aspect-Category-Opinion-Sentiment Quadruple Extraction 2 Chinese Sentence Pair Classification 2 Chinese Word Segmentation 2 Chunking 2 Citation Intent Classification 2 Citation Prediction 2 Citation Recommendation 2 Cloze (multi-choices) (Few-Shot) 2 Cloze (multi-choices) (One-Shot) 2 Cloze (multi-choices) (Zero-Shot) 2 Code Comment Generation 2 Code Completion 2 Code Repair 2 Code Translation 2 Commonsense Knowledge Base Construction 2 Cross Document Coreference Resolution 2 Cross-Lingual Document Classification 2 Cross-Lingual NER 2 Cross-Lingual Natural Language Inference 2 Cross-Lingual POS Tagging 2 Depression Detection 2 Dialog Relation Extraction 2 Dialogue Act Classification 2 Distractor Generation 2 Duplicate-Question Retrieval 2 End-To-End Dialogue Modelling 2 Event Coreference Resolution 2 Extractive Document Summarization 2 Fact Checking 2 Fact-based Text Editing 2 Feature Engineering 2 Few-Shot Learning 2 Few-Shot Relation Classification 2 Few-shot NER 2 Gender Bias Detection 2 Gender Prediction 2 Genre classification 2 Grammatical Error Detection 2 Graph Embedding 2 Graph Generation 2 Humor Detection 2 Image Manipulation 2 Image-to-Text Retrieval 2 Imitation Learning 2 Interpretable Machine Learning 2 Keyphrase Extraction 2 
Keyword Extraction 2 Knowledge Graph Embeddings 2 Lemmatization 2 Linguistic Acceptability 2 Lip Reading 2 Logical Reasoning Question Answering 2 Long-range modeling 2 Low-Resource Neural Machine Translation 2 Meeting Summarization 2 Meta-Learning 2 Morphological Analysis 2 Multi-domain Dialogue State Tracking 2 NLP based Person Retrival 2 Native Language Identification 2 Natural Language Visual Grounding 2 Negation Detection 2 Network Embedding 2 Neural Architecture Search 2 News Annotation 2 News Generation 2 Node Classification 2 Open-Domain Dialog 2 Out-of-Distribution Detection 2 Paraphrase Identification within Bi-Encoder 2 Passage Re-Ranking 2 Person Re-Identification 2 Phrase Ranking 2 Phrase Tagging 2 Point Processes 2 Product Recommendation 2 Quantization 2 Recipe Generation 2 Recognizing Emotion Cause in Conversations 2 Referring Expression Comprehension 2 SQL Parsing 2 Scene Graph Detection 2 Scene Graph Generation 2 Scientific Concept Extraction 2 Scientific Results Extraction 2 Semantic Image-Text Similarity 2 Semantic Textual Similarity within Bi-Encoder 2 Sentence Fusion 2 Sign Language Recognition 2 Sign Language Translation 2 Speech-to-Text Translation 2 Spelling Correction 2 Spoken Dialogue Systems 2 Stock Market Prediction 2 Stock Prediction 2 Table-based Fact Verification 2 Text Categorization 2 Text Matching 2 Text based Person Retrieval 2 Text-to-Code Generation 2 Text-to-Image Retrieval 2 Timex normalization 2 Topic Classification 2 Topic Models 2 Transliteration 2 Tweet Retrieval 2 Twitter Sentiment Analysis 2 Unconstrained Lip-synchronization 2 Unsupervised KG-to-Text Generation 2 Unsupervised semantic parsing 2 Variational Inference 2 Video Description 2 Visual Commonsense Reasoning 2 Visual Keyword Spotting 2 Visual Relationship Detection 2 Visual Speech Recognition 2 Visual Storytelling 2 Weather Forecasting 2 3D Action Recognition 1 4-ary Relation Extraction 1 Abstractive Dialogue Summarization 1 Accented Speech Recognition 1 Action 
Anticipation 1 Action Quality Assessment 1 Action Recognition In Videos 1 Action Understanding 1 Adversarial Attack 1 Adversarial Robustness 1 Age And Gender Classification 1 Aggression Identification 1 Annotated Code Search 1 Arabic Text Diacritization 1 Argument Retrieval 1 Aspect Sentiment Triplet Extraction 1 Audio Super-Resolution 1 Audio-Visual Speech Recognition 1 Author Attribution 1 Authorship Verification 1 Autonomous Driving 1 Bayesian Inference 1 Bias Detection 1 Bidirectional Relationship Classification 1 Binary Relation Extraction 1 Blackout Poetry Generation 1 Bridging Anaphora Resolution 1 COVID-19 Diagnosis 1 Causal Discovery 1 Causal Emotion Entailment 1 Causal Identification 1 Click-Through Rate Prediction 1 Clinical Assertion Status Detection 1 Clinical Concept Extraction 1 Clinical Note Phenotyping 1 Clone Detection 1 Cloze Test 1 Code Classification 1 CodeSearchNet - Java 1 Combinatorial Optimization 1 Common Sense Reasoning (Few-Shot) 1 Common Sense Reasoning (One-Shot) 1 Common Sense Reasoning (Zero-Shot) 1 Community Detection 1 Complex Word Identification 1 Component Classification 1 Compositional Zero-Shot Learning 1 Computational Phenotyping 1 Computed Tomography (CT) 1 Concept-To-Text Generation 1 Conditional Text Generation 1 Constituency Grammar Induction 1 Context Query Reformulation 1 Contextual Embedding for Source Code 1 Continual Learning 1 Continuous Control 1 Conversation Disentanglement 1 Counterfactual Explanation 1 Croatian Text Diacritization 1 Cross-Document Language Modeling 1 Cross-Domain Named Entity Recognition 1 Cross-Lingual Abstractive Summarization 1 Cross-Lingual Bitext Mining 1 Cross-Lingual Entity Linking 1 Cross-Lingual Paraphrase Identification 1 Cross-Lingual Semantic Textual Similarity 1 Cross-Lingual Sentiment Classification 1 Cross-lingual zero-shot dependency parsing 1 Curved Text Detection 1 Czech Text Diacritization 1 Deblurring 1 Deception Detection 1 Decipherment 1 Defect Detection 1 Definition 
Extraction 1 Dialog Act Classification 1 Dialogue Management 1 Disaster Response 1 Distant Speech Recognition 1 Document Embedding 1 Document Layout Analysis 1 Document Translation 1 Document-level Event Extraction 1 Domain Generalization 1 Drug–drug Interaction Extraction 1 Email Thread Summarization 1 Emotion Recognition in Context 1 Emotional Dialogue Acts 1 Empathetic Response Generation 1 Entity Cross-Document Coreference Resolution 1 Entity Embeddings 1 Entity Extraction using GAN 1 Entity Resolution 1 Epidemiology 1 Event Cross-Document Coreference Resolution 1 Event Detection 1 Event Expansion 1 Event-Driven Trading 1 Evidence Selection 1 Extractive Summarization 1 Extreme Multi-Label Classification 1 Face Sketch Synthesis 1 Factual Visual Question Answering 1 Feature Importance 1 Federated Learning 1 Fill Mask 1 Fine-Grained Opinion Analysis 1 Fine-Grained Visual Categorization 1 Fine-Grained Visual Recognition 1 Fine-grained Action Recognition 1 Flowchart Grounded Dialog Response Generation 1 Food Recognition 1 French Text Diacritization 1 Generalized Zero-Shot Learning 1 Generative Question Answering 1 Graph Question Answering 1 Graph Representation Learning 1 Graph Similarity 1 Graph-to-Sequence 1 Hand Gesture Recognition 1 Handwritten Chinese Text Recognition 1 Handwritten Word Generation 1 Human Pose Forecasting 1 Human robot interaction 1 Hungarian Text Diacritization 1 Hypernym Discovery 1 Image Comprehension 1 Image Forensics 1 Image Inpainting 1 Image Paragraph Captioning 1 Image-to-Image Translation 1 Implicit Discourse Relation Classification 1 Incremental Learning 1 Informativeness 1 Intrusion Detection 1 Irish Text Diacritization 1 KB-to-Language Generation 1 Keyword Spotting 1 Knowledge Base Population 1 Knowledge Graph Completion 1 Latvian Text Diacritization 1 Layout-to-Image Generation 1 Length-of-Stay prediction 1 Lexical Simplification 1 Lip to Speech Synthesis 1 Logical Reasoning Reading Comprehension 1 Material Classification 1 
Material Recognition 1 Mathematical Proofs 1 Medical Concept Normalization 1 Medical Diagnosis 1 Medical Image Denoising 1 Medical Procedure 1 Medical Report Generation 1 Medical Visual Question Answering 1 Meme Classification 1 Memex Question Answering 1 Method name prediction 1 Metric Learning 1 Missing Elements 1 Moment Retrieval 1 Morphological Disambiguation 1 Mortality Prediction 1 Multi Class Text Classification 1 Multi Label Text Classification 1 Multi-Agent Path Finding 1 Multi-Choice MRC 1 Multi-Domain Sentiment Classification 1 Multi-Grained Named Entity Recognition 1 Multi-Hop Reading Comprehension 1 Multi-Label Learning 1 Multi-class Classification 1 Multi-hop Question Answering 1 Multi-modal Dialogue Generation 1 Multilingual NLP 1 Multilingual Named Entity Recognition 1 Multilingual text classification 1 Multimodal Abstractive Text Summarization 1 Multimodal GIF Dialog 1 Multimodal Lexical Translation 1 Multimodal Text Prediction 1 Multiview Learning 1 Music Information Retrieval 1 Music Source Separation 1 Myocardial infarction detection 1 NER 1 Natural Language Inference (Few-Shot) 1 Natural Language Inference (One-Shot) 1 Natural Language Inference (Zero-Shot) 1 Temporal Action Localization 1 connective detection 1 dialogue summary 1 graph construction 1 imbalanced classification 1
Filter by Language
English 791 Chinese 114 German 85 French 63 Spanish 57 Russian 54 Italian 37 Japanese 36 Portuguese 35 Arabic 32 Turkish 27 Dutch 23 Hindi 23 Czech 21 Korean 21 Vietnamese 20 Persian 19 Finnish 18 Romanian 17 Multilingual 16 Polish 16 Bengali 15 Indonesian 15 Tamil 15 Marathi 13 Telugu 13 Thai 12 Estonian 11 Urdu 11 Danish 10 Gujarati 10 Hungarian 10 Swedish 10 Basque 9 Bulgarian 9 Hebrew 9 Malayalam 9 Punjabi 9 Swahili 9 Amharic 8 Greek 8 Kazakh 8 Norwegian 8 Ukrainian 8 Croatian 7 Kannada 7 Mandarin Chinese 7 Serbian 7 Slovak 7 Albanian 6 Armenian 6 Catalan 6 Latvian 6 Sinhala 6 Slovenian 6 Welsh 6 Assamese 5 Lithuanian 5 Oriya (macrolanguage) 5 Sanskrit 5 Yoruba 5 Azerbaijani 4 Breton 4 Georgian 4 Icelandic 4 Igbo 4 Irish 4 Kurdish 4 Latin 4 Macedonian 4 Mongolian 4 Scottish Gaelic 4 Sindhi 4 Afrikaans 3 American Sign Language 3 Belarusian 3 Burmese 3 Chechen 3 Galician 3 Haitian 3 Hausa 3 Malagasy 3 Maltese 3 Somali 3 Tagalog 3 Upper Sorbian 3 Uzbek 3 Wolof 3 Aragonese 2 Bambara 2 Bavarian 2 Bishnupriya 2 Bosnian 2 Central Khmer 2 Egyptian Arabic 2 Erzya 2 Esperanto 2 Filipino 2 Guarani 2 Iranian Persian 2 Javanese 2 Jejueo 2 Kirghiz 2 Lao 2 Malay (individual language) 2 Modern Greek 2 Nepali (macrolanguage) 2 Nigerian Pidgin 2 Norwegian Nynorsk 2 Oromo 2 Quechua 2 Romansh 2 Russia Buriat 2 Serbo-Croatian 2 South Azerbaijani 2 Standard Arabic 2 Sundanese 2 Tatar 2 Uighur 2 Yiddish 2 Yue Chinese 2 Akkadian 1 Akuntsu 1 Ancient Greek 1 Apurinã 1 Assyrian Neo-Aramaic 1 Asturian 1 Avaric 1 Bashkir 1 Bhojpuri 1 Cebuano 1 Central Bikol 1 Central Kurdish 1 Central Pashto 1 Chavacano 1 Chukot 1 Church Slavic 1 Chuvash 1 Coptic 1 Cornish 1 Dhivehi 1 Dimli (individual language) 1 Eastern Mari 1 Faroese 1 Fon 1 Fulah 1 Ganda 1 Geez 1 Goan Konkani 1 Gothic 1 Ido 1 Iloko 1 Interlingue 1 Inuktitut 1 Kalmyk 1 Karachay-Balkar 1 Karelian 1 Khunsari 1 Kinyarwanda 1 Komi 1 Komi-Permyak 1 Komi-Zyrian 1 Lezghian 1 Limburgan 1 Lingala 1 Literary Chinese 1 Livvi 1 Lojban 1 Lombard 1 
Low German 1 Lower Sorbian 1 Luo (Cameroon) 1 Luo (Kenya and Tanzania) 1 Luxembourgish 1 Maithili 1 Manipuri 1 Manx 1 Mazanderani 1 Mbyá Guaraní 1 Minangkabau 1 Mingrelian 1 Mirandese 1 Moksha 1 Moroccan Arabic 1 Mundurukú 1 Nayini 1 Neapolitan 1 Nepali (individual language) 1 Newari 1 Northern Frisian 1 Northern Kurdish 1 Northern Luri 1 Northern Sami 1 Norwegian Bokmål 1 Occitan (post 1500) 1 Old French 1 Old Russian 1 Old Turkish 1 Ossetian 1 Pampanga 1 Piemontese 1 Portuguse 1 Pushto 1 Sardinian 1 Sicilian 1 Skolt Sami 1 Soi 1 South Levantine Arabic 1 Swati 1 Swedish Sign Language 1 Swiss German 1 Tajik 1 Tibetan 1 Tigrinya 1 Tswana 1 Tupinambá 1 Turkmen 1 Tuvinian 1 Venetian 1 Volapük 1 Walloon 1 Waray (Philippines) 1 Warlpiri 1 Western Frisian 1 Western Mari 1 Western Panjabi 1 Wu Chinese 1 Xhosa 1 Yakut 1 Abkhazian 0 Achinese 0 Adyghe 0 Afar 0 Akan 0 Argentine Sign Language 0 Arpitan 0 Aymara 0 Bangladeshi Sign Language 0 Banjar 0 Bislama 0 Bodo (India) 0 Buginese 0 Chamorro 0 Cherokee 0 Cheyenne 0 Choctaw 0 Corsican 0 Cree 0 Creek 0 Crimean Tatar 0 Dzongkha 0 Ewe 0 Extremaduran 0 Fiji Hindi 0 Fijian 0 Friulian 0 Gagauz 0 Gan Chinese 0 German Sign Language 0 Gilaki 0 Greek Sign Language 0 Gulf Arabic 0 Hakha Chin 0 Hakka Chinese 0 Hawaiian 0 Herero 0 Hiri Motu 0 Interlingua (International Auxiliary Language Association) 0 Inupiaq 0 Jamaican Creole English 0 Kabardian 0 Kabyle 0 Kalaallisut 0 Kanuri 0 Kara-Kalpak 0 Kashmiri 0 Kashubian 0 Kikuyu 0 Kongo 0 Kuanyama 0 Kölsch 0 Ladino 0 Lak 0 Latgalian 0 Ligurian 0 Malay (macrolanguage) 0 Maori 0 Marshallese 0 Min Dong Chinese 0 Modern Greek (1453-) 0 Narom 0 Nauru 0 Navajo 0 Naxi 0 Ndonga 0 Northern Huishui Hmong 0 Novial 0 Nyanja 0 Odia 0 Official Aramaic (700-300 BCE) 0 Old English (ca. 
450-1100) 0 Pali 0 Pangasinan 0 Papiamento 0 Pedi 0 Pennsylvania German 0 Pfaelzisch 0 Picard 0 Pitcairn-Norfolk 0 Pontic 0 Rajasthani 0 Rundi 0 Rusyn 0 Samoan 0 Sango 0 Santali 0 Saterfriesisch 0 Scots 0 Shona 0 Sichuan Yi 0 Silesian 0 Southern Sotho 0 Sranan Tongo 0 Swahili (macrolanguage) 0 Swiss-German Sign Language 0 Tahitian 0 Tai 0 Tetum 0 Tok Pisin 0 Tonga (Tonga Islands) 0 Tosk Albanian 0 Tsonga 0 Tulu 0 Tumbuka 0 Tunisian Arabic 0 Turkish Sign Language 0 Twi 0 Udmurt 0 Venda 0 Veps 0 Vlaams 0 Vlax Romani 0 Votic 0 Zeeuws 0 Zhuang 0 Zulu 0

1,509 dataset results for Texts