Datasets

6,203 machine learning datasets

1,704 dataset results for Texts