Datasets

10,027 machine learning datasets
Filter by Task
Question Answering 282 Language Modelling 116 Text Classification 100 Text Generation 95 Named Entity Recognition (NER) 93 Visual Question Answering (VQA) 82 Reading Comprehension 77 Text Summarization 71 Natural Language Inference 67 Sentiment Analysis 58 Machine Translation 57 Information Retrieval 56 Relation Extraction 53 Natural Language Understanding 52 Common Sense Reasoning 44 Classification 38 Image Captioning 38 Machine Reading Comprehension 35 Code Generation 34 Hate Speech Detection 34 Coreference Resolution 33 Abstractive Text Summarization 32 Entity Linking 32 Misinformation 29 Word Embeddings 29 Data Augmentation 28 Semantic Parsing 27 Video Question Answering 27 Document Summarization 24 Video Captioning 24 Open-Domain Question Answering 23 Speech Recognition 23 Stance Detection 23 Dialogue Generation 22 Video Retrieval 21 Fake News Detection 20 Image Retrieval 20 Part-Of-Speech Tagging 20 Retrieval 20 Visual Reasoning 20 Data-to-Text Generation 19 Knowledge Graphs 19 Recommendation Systems 19 Visual Question Answering 19 Question Generation 18 Relation Classification 18 Domain Adaptation 17 Few-Shot Learning 17 Multi-Task Learning 17 NER 17 Task-Oriented Dialogue Systems 17 Emotion Recognition 16 Image Classification 16 Slot Filling 16 Text Simplification 16 Intent Detection 15 Paraphrase Identification 15 Semantic Textual Similarity 15 Language Identification 14 Video Understanding 14 Handwriting Recognition 13 Mathematical Reasoning 13 Text-to-Image Generation 13 Zero-shot Text Search 13 Automatic Speech Recognition (ASR) 12 Decision Making 12 Emotion Classification 12 Emotion Recognition in Conversation 12 Grammatical Error Correction 12 Instruction Following 12 Joint Entity and Relation Extraction 12 Link Prediction 12 Math Word Problem Solving 12 Multi-Document Summarization 12 Optical Character Recognition (OCR) 12 Paraphrase Generation 12 Sarcasm Detection 12 Word Sense Disambiguation 12 Zero-Shot Learning 12 Cross-Lingual Transfer 11 
Dialogue State Tracking 11 Event Extraction 11 Fact Verification 11 Logical Reasoning 11 Multi-Label Classification 11 Semantic Segmentation 11 Sentence Classification 11 Translation 11 Aspect-Based Sentiment Analysis (ABSA) 10 Conversational Response Selection 10 Dependency Parsing 10 Document Classification 10 Multi-Label Text Classification 10 Object Detection 10 Open Information Extraction 10 Open-Domain Dialog 10 Semantic Role Labeling 10 Text-To-Speech Synthesis 10 Vision and Language Navigation 10 Automatic Post-Editing 9 Entity Disambiguation 9 Entity Typing 9 Image Generation 9 Intent Classification 9 Nested Named Entity Recognition 9 News Classification 9 Speech Synthesis 9 Text Retrieval 9 Visual Dialog 9 Zero-Shot Video Question Answer 9 Abusive Language 8 Answer Selection 8 Code Completion 8 Cross-Modal Retrieval 8 Handwriting generation 8 Handwritten Text Recognition 8 Passage Retrieval 8 Referring Expression Segmentation 8 Scene Text Recognition 8 Semantic Similarity 8 Sentence Embeddings 8 Sign Language Translation 8 Spoken Language Understanding 8 Text-To-SQL 8 Chatbot 7 Code Search 7 Conversational Question Answering 7 Dialogue Understanding 7 Explanation Generation 7 Extreme Summarization 7 Knowledge Base Question Answering 7 Mathematical Question Answering 7 Medical Visual Question Answering 7 Multimodal Deep Learning 7 Multiple-choice 7 Named Entity Recognition 7 Node Classification 7 Opinion Mining 7 Response Generation 7 Scene Text Detection 7 Sentiment Classification 7 Speech Emotion Recognition 7 Stance Classification 7 Temporal Tagging 7 Topic Classification 7 Topic Models 7 Toxic Comment Classification 7 Abstractive Dialogue Summarization 6 Ad-hoc video search 6 Chart Question Answering 6 Chinese Reading Comprehension 6 Cross-Lingual NER 6 Dialogue Act Classification 6 Event Detection 6 Fact Checking 6 Fairness 6 Image-to-Text Retrieval 6 Medical Named Entity Recognition 6 Meeting Summarization 6 Multimodal Emotion Recognition 6 
Multimodal Sentiment Analysis 6 Multiple Choice Question Answering (MCQA) 6 Phrase Grounding 6 Referring Expression Comprehension 6 Story Generation 6 Table-to-Text Generation 6 Text-to-Video Generation 6 Translation deu-eng 6 Translation eng-deu 6 Twitter Sentiment Analysis 6 Video Summarization 6 Vision-Language Navigation 6 Zero-Shot Video Retrieval 6 AMR Parsing 5 AMR-to-Text Generation 5 Automated Theorem Proving 5 Chinese Named Entity Recognition 5 Citation Recommendation 5 Code Repair 5 Code Translation 5 Community Question Answering 5 Composed Image Retrieval (CoIR) 5 Dense Video Captioning 5 Dialogue Evaluation 5 Discourse Parsing 5 Extractive Text Summarization 5 Generative Question Answering 5 Goal-Oriented Dialog 5 Language Acquisition 5 Large Language Model 5 Learning-To-Rank 5 Linguistic Acceptability 5 Medical Relation Extraction 5 Moment Retrieval 5 Open Intent Discovery 5 Reading Comprehension (Few-Shot) 5 Reading Comprehension (One-Shot) 5 Reading Comprehension (Zero-Shot) 5 SSTOD 5 Scientific Document Summarization 5 Self-Supervised Learning 5 Sequence-to-sequence Language Modeling 5 Sign Language Recognition 5 Stochastic Optimization 5 Text to Audio Retrieval 5 Text-to-Code Generation 5 Token Classification 5 Video Generation 5 Video Grounding 5 Visual Commonsense Reasoning 5 Visual Navigation 5 Abuse Detection 4 Action Recognition 4 Anomaly Detection 4 Argument Mining 4 Audio captioning 4 Audio to Text Retrieval 4 Audio-Visual Speech Recognition 4 Automatic Speech Recognition 4 Binary Classification 4 Code Summarization 4 Constituency Parsing 4 Conversational Response Generation 4 Cross-Lingual Question Answering 4 Discourse Segmentation 4 Document Ranking 4 Entity Resolution 4 Entity Retrieval 4 Few-Shot Relation Classification 4 Few-shot NER 4 Genre classification 4 Gesture Generation 4 Hierarchical Multi-label Classification 4 KG-to-Text Generation 4 Key Information Extraction 4 Lip Reading 4 Lipreading 4 Low Resource Named Entity Recognition 4
Medical Concept Normalization 4 Memorization 4 Multi-task Language Understanding 4 Multimodal Reasoning 4 Natural Language Visual Grounding 4 Natural Questions 4 Nested Mention Recognition 4 News Generation 4 Paper generation 4 Product Recommendation 4 Program Repair 4 Relational Reasoning 4 Sentence Embedding 4 Speech Enhancement 4 Speech Separation 4 Spelling Correction 4 Style Transfer 4 Systematic Generalization 4 Table Detection 4 Temporal Relation Classification 4 Term Extraction 4 Text Clustering 4 Text Matching 4 Text Segmentation 4 Text Style Transfer 4 Text to Video Retrieval 4 Transfer Learning 4 Video-Text Retrieval 4 Visual Grounding 4 Visual Relationship Detection 4 Visual Storytelling 4 Weakly-Supervised Named Entity Recognition 4 Zero-Shot Cross-Lingual Transfer 4 2D Object Detection 3 Active Learning 3 Answer Generation 3 Arabic Sentiment Analysis 3 Arithmetic Reasoning 3 Aspect Category Detection 3 Aspect Category Polarity 3 Aspect Term Extraction and Sentiment Classification 3 Aspect-Category-Opinion-Sentiment Quadruple Extraction 3 Bias Detection 3 Binary Relation Extraction 3 Binary text classification 3 Biomedical Information Retrieval 3 Chinese Word Segmentation 3 Chunking 3 Citation Intent Classification 3 Click-Through Rate Prediction 3 Cloze (multi-choices) (Few-Shot) 3 Cloze (multi-choices) (One-Shot) 3 Cloze (multi-choices) (Zero-Shot) 3 Code Classification 3 Code Documentation Generation 3 Conditional Text Generation 3 Continual Learning 3 Cross Document Coreference Resolution 3 Definition Extraction 3 Depression Detection 3 Dialect Identification 3 Document-level Closed Information Extraction 3 Entity Alignment 3 Event Coreference Resolution 3 Explainable artificial intelligence 3 FG-1-PG-1 3 Few-Shot Image Classification 3 Formal Logic 3 Game of Sudoku 3 Gender Bias Detection 3 Goal-Oriented Dialogue Systems 3 Grammatical Error Detection 3 Graph Classification 3 Graph Embedding 3 Humor Detection 3 Instance Segmentation 3
Intent Discovery 3 Joint Event and Temporal Relation Extraction 3 Keyphrase Extraction 3 Keyword Extraction 3 Lemmatization 3 Lexical Entailment 3 Logical Reasoning Question Answering 3 Long Form Question Answering 3 Long-range modeling 3 Low-Resource Neural Machine Translation 3 Medical Diagnosis 3 Medical Report Generation 3 Meme Classification 3 Motion Synthesis 3 Multi Label Text Classification 3 Multi-Domain Recommender Systems 3 Multi-Hop Reading Comprehension 3 Multi-Label Learning 3 Multi-class Classification 3 Multi-hop Question Answering 3 Multi-modal Dialogue Generation 3 Multilingual NLP 3 Multimodal Intent Recognition 3 Multimodal Machine Translation 3 Multiple Instance Learning 3 Music Generation 3 Natural Language Moment Retrieval 3 Negation Detection 3 News Recommendation 3 News Summarization 3 Object Counting 3 Open Intent Detection 3 Out of Distribution (OOD) Detection 3 Person Re-Identification 3 Person Search 3 Recognizing Emotion Cause in Conversations 3 Referring Expression 3 Representation Learning 3 Review Generation 3 Science Question Answering 3 Semantic Image-Text Similarity 3 Sentence-Pair Classification 3 Sign Language Production 3 Source Code Summarization 3 Speaker Diarization 3 Structured Prediction 3 Temporal Action Localization 3 Temporal Relation Extraction 3 Text Categorization 3 Text Reranking 3 Text based Person Retrieval 3 Text-based Person Retrieval with Noisy Correspondence 3 Text-to-Music Generation 3 Time Series Forecasting 3 Transliteration 3 Unconstrained Lip-synchronization 3 Unsupervised Extractive Summarization 3 Unsupervised Machine Translation 3 Unsupervised Text Classification 3 Video Description 3 Visual Entailment 3 Visual Speech Recognition 3 Weakly Supervised Classification 3 Word Alignment 3 Zero-Shot Composed Image Retrieval (ZS-CIR) 3 Zero-Shot Text Classification 3 Zero-shot Named Entity Recognition (NER) 3 regression 3 text similarity 3 2D Semantic Segmentation 2 3D Anomaly Detection 2 3D Face Animation 2 
3D Object Detection 2 AbbreviationDetection 2 Ad-Hoc Information Retrieval 2 Adversarial Attack 2 Adversarial Robustness 2 Aggression Identification 2 Argument Retrieval 2 Art Analysis 2 Aspect Extraction 2 Astronomy 2 Audio Generation 2 Autonomous Driving 2 Bayesian Inference 2 Causal Inference 2 Chinese Sentence Pair Classification 2 Citation Prediction 2 Claim Extraction with Stance Classification (CESC) 2 Claim Verification 2 Classification of toxic, engaging, fact-claiming comments 2 Clustering 2 Code Comment Generation 2 Common Sense Reasoning (Zero-Shot) 2 Commonsense Knowledge Base Construction 2 Computer Security 2 ContextNER 2 Continual Pretraining 2 Cross-Lingual Abstractive Summarization 2 Cross-Lingual Document Classification 2 Cross-Lingual Natural Language Inference 2 Cross-Lingual POS Tagging 2 Cross-Lingual Paraphrase Identification 2 Curved Text Detection 2 Dark Humor Detection 2 Deception Detection 2 Defect Detection 2 Dialog Act Classification 2 Dialog Relation Extraction 2 Distractor Generation 2 Document Layout Analysis 2 Document Text Classification 2 Document Translation 2 Domain Generalization 2 Dynamic Link Prediction 2 Elementary Mathematics 2 Embeddings Evaluation 2 Emotion Recognition in Context 2 Emotional Dialogue Acts 2 Empathetic Response Generation 2 End-To-End Dialogue Modelling 2 Ethics 2 Event Argument Extraction 2 Extractive Document Summarization 2 FLUE 2 Facial Expression Recognition 2 Fact-based Text Editing 2 Factual Visual Question Answering 2 Feature Engineering 2 Few-Shot NLI 2 Few-Shot Text Classification 2 Gender Prediction 2 General Knowledge 2 Gloss-free Sign Language Translation 2 Graph Generation 2 Handwriting Verification 2 Handwritten Digit Recognition 2 Hate Span Identification 2 Headline Generation 2 Human Activity Recognition 2 Human Judgment Correlation 2 Image Clustering 2 Image Manipulation 2 Imitation Learning 2 Implicit Discourse Relation Classification 2 Incremental Learning 2 Intent Classification and Slot Filling 2
Intent Recognition 2 Interpretable Machine Learning 2 Irony Identification 2 Irregular Text Recognition 2 Knowledge Graph Embeddings 2 Knowledge Probing 2 Layout-to-Image Generation 2 Lip to Speech Synthesis 2 Max-Shot Cross-Lingual Visual Reasoning 2 Medical Code Prediction 2 Medical Procedure 2 Meta-Learning 2 Model Compression 2 Moral Scenarios 2 Morphological Analysis 2 Morphological Tagging 2 Mortality Prediction 2 Motion Captioning 2 Multi-domain Dialogue State Tracking 2 Multilabel Text Classification 2 Multilingual Named Entity Recognition 2 Multilingual text classification 2 Multimodal Abstractive Text Summarization 2 Multimodal Text Prediction 2 Multiview Contextual Commonsense Inference 2 Native Language Identification 2 Natural Language Inference (Few-Shot) 2 Network Embedding 2 Neural Architecture Search 2 New Product Sales Forecasting 2 News Annotation 2 Node Clustering 2 Object Localization 2 Object Recognition 2 Open Vocabulary Object Detection 2 knowledge editing 2 legal outcome extraction 2 multimodal generation 2
Filter by Language
English 1519 Chinese 207 German 130 French 117 Spanish 99 Russian 86 Portuguese 62 Italian 60 Japanese 56 Arabic 55 Hindi 55 Vietnamese 42 Korean 40 Turkish 39 Dutch 34 Bengali 31 Czech 31 Persian 31 Danish 30 Tamil 30 Polish 29 Indonesian 27 Finnish 25 Romanian 25 Marathi 22 Multilingual 22 Telugu 21 Hungarian 20 Swedish 20 Thai 20 Greek 19 Urdu 19 Estonian 18 Bulgarian 16 Gujarati 16 Hebrew 15 Malayalam 15 Slovak 15 Croatian 14 Swahili 14 Basque 13 Punjabi 13 Ukrainian 13 Latvian 12 Slovenian 12 Lithuanian 11 Mandarin Chinese 11 Norwegian 11 Amharic 10 Catalan 10 Kazakh 10 Serbian 10 Kannada 9 Albanian 8 Armenian 8 Assamese 8 Irish 8 Maltese 7 Oriya (macrolanguage) 7 Sanskrit 7 Sinhala 7 Tagalog 7 Welsh 7 Yoruba 7 Burmese 6 Georgian 6 Hausa 6 Icelandic 6 Igbo 6 Iranian Persian 6 Kurdish 6 Macedonian 6 Mongolian 6 Somali 6 Afrikaans 5 Azerbaijani 5 Galician 5 Guarani 5 Haitian 5 Malay (individual language) 5 Norwegian Bokmål 5 Oromo 5 Sindhi 5 Uzbek 5 American Sign Language 4 Bambara 4 Belarusian 4 Breton 4 Egyptian Arabic 4 Filipino 4 Latin 4 Malagasy 4 Nigerian Pidgin 4 Norwegian Nynorsk 4 Odia 4 Scottish Gaelic 4 Serbo-Croatian 4 Tigrinya 4 Wolof 4 Bangala 3 Cebuano 3 Central Khmer 3 Central Kurdish 3 Chechen 3 Esperanto 3 Fulah 3 Ganda 3 Iloko 3 Javanese 3 Kirghiz 3 Lao 3 Lingala 3 Nepali (macrolanguage) 3 Quechua 3 South Azerbaijani 3 Standard Arabic 3 Sundanese 3 Upper Sorbian 3 Western Panjabi 3 Aragonese 2 Bashkir 2 Bavarian 2 Bhojpuri 2 Bishnupriya 2 Bosnian 2 Dhivehi 2 Erzya 2 Faroese 2 Goan Konkani 2 Jejueo 2 Kabyle 2 Kinyarwanda 2 Luo (Kenya and Tanzania) 2 Maithili 2 Malay (macrolanguage) 2 Modern Greek 2 Moroccan Arabic 2 Nepali (individual language) 2 Nyanja 2 Romansh 2 Russia Buriat 2 Swati 2 Tajik 2 Tatar 2 Tibetan 2 Tsonga 2 Tswana 2 Uighur 2 Waray (Philippines) 2 Xhosa 2 Yiddish 2 Yue Chinese 2 Akkadian 1 Akuntsu 1 Ancient Greek 1 Ancient Hebrew 1 Apurinã 1 Argentine Sign Language 1 Assyrian Neo-Aramaic 1 Asturian 1 Avaric 1 Aymara 1 Bemba (Zambia) 1
Central Bikol 1 Central Pashto 1 Chavacano 1 Chukot 1 Church Slavic 1 Chuvash 1 Congo Swahili 1 Coptic 1 Cornish 1 Dimli (individual language) 1 Dogri (macrolanguage) 1 Eastern Mari 1 Ewe 1 Fon 1 French Sign Language 1 Geez 1 German Sign Language 1 Gothic 1 Gulf Arabic 1 Halh Mongolian 1 Ido 1 Interlingue 1 Inuktitut 1 Kabuverdianu 1 Kachin 1 Kalaallisut 1 Kalmyk 1 Karachay-Balkar 1 Karelian 1 Khunsari 1 Komi 1 Komi-Permyak 1 Komi-Zyrian 1 Krio 1 Lezghian 1 Limburgan 1 Literary Chinese 1 Livvi 1 Lojban 1 Lombard 1 Low German 1 Lower Sorbian 1 Lozi 1 Lunda 1 Luo (Cameroon) 1 Lushai 1 Luxembourgish 1 Manipuri 1 Manx 1 Maori 1 Mazanderani 1 Mbyá Guaraní 1 Mesopotamian Arabic 1 Minangkabau 1 Mingrelian 1 Mirandese 1 Moksha 1 Mundurukú 1 Najdi Arabic 1 Nayini 1 Neapolitan 1 Newari 1 Nigerian Fulfulde 1 North Azerbaijani 1 North Levantine Arabic 1 Northern Frisian 1 Northern Kurdish 1 Northern Luri 1 Northern Sami 1 Northern Uzbek 1 Occitan (post 1500) 1 Old French 1 Old Russian 1 Old Turkish 1 Ossetian 1 Pampanga 1 Pedi 1 Piemontese 1 Plateau Malagasy 1 Pushto 1 Rundi 1 Sardinian 1 Shan 1 Shona 1 Sicilian 1 Skolt Sami 1 Soi 1 South Levantine Arabic 1 Southern Pashto 1 Southern Sotho 1 Standard Latvian 1 Swedish Sign Language 1 Swiss German 1 Swiss-German Sign Language 1 Tonga (Zambia) 1 Tosk Albanian 1 Tupinambá 1 Turkmen 1 Tuvinian 1 Twi 1 Venetian 1 Volapük 1 Walloon 1 Warlpiri 1 West Central Oromo 1 Western Frisian 1 Western Mari 1 Wu Chinese 1 Yakut 1 Zulu 1 Abkhazian 0 Achinese 0 Adyghe 0 Afar 0 Akan 0 Arpitan 0 Bangladeshi Sign Language 0 Banjar 0 Bislama 0 Bodo (India) 0 Buginese 0 Chamorro 0 Cherokee 0 Cheyenne 0 Choctaw 0 Corsican 0 Cree 0 Creek 0 Crimean Tatar 0 Dogri (individual language) 0 Dzongkha 0 Extremaduran 0 Fiji Hindi 0 Fijian 0 Friulian 0 Gagauz 0 Gan Chinese 0 Gilaki 0 Greek Sign Language 0 Hakha Chin 0 Hakka Chinese 0 Hawaiian 0 Herero 0 Hiri Motu 0 Interlingua (International Auxiliary Language Association) 0 Inupiaq 0 Jamaican Creole English 0
Kabardian 0 Kanuri 0 Kara-Kalpak 0 Kashmiri 0 Kashubian 0 Kikuyu 0 Kongo 0 Kuanyama 0 Kölsch 0 Ladino 0 Lak 0 Latgalian 0 Ligurian 0 Marshallese 0 Min Dong Chinese 0 Modern Greek (1453-) 0 Narom 0 Nauru 0 Navajo 0 Naxi 0 Ndonga 0 Northern Huishui Hmong 0 Novial 0 Official Aramaic (700-300 BCE) 0 Old English (ca. 450-1100) 0 Pali 0 Pangasinan 0 Papiamento 0 Pennsylvania German 0 Pfaelzisch 0 Picard 0 Pitcairn-Norfolk 0 Pontic 0 Portuguse 0 Rajasthani 0 Rusyn 0 Saidi Arabic 0 Samoan 0 Sango 0 Santali 0 Saterfriesisch 0 Scots 0 Sichuan Yi 0 Silesian 0 Sranan Tongo 0 Swahili (macrolanguage) 0 Tahitian 0 Tai 0 Tetum 0 Tok Pisin 0 Tonga (Tonga Islands) 0 Tulu 0 Tumbuka 0 Tunisian Arabic 0 Turkish Sign Language 0 Udmurt 0 Venda 0 Veps 0 Vlaams 0 Vlax Romani 0 Votic 0 Zaza 0 Zeeuws 0 Zhuang 0

2619 dataset results for Texts