Datasets
10,014
machine learning datasets
Filters
Filter by Modality
Texts (selected)
3D
0
3d meshes
0
6D
0
Actions
0
Audio
0
Biology
0
Biomedical
0
Cad
0
Dialog
0
EEG
0
Environment
0
Financial
0
Graphs
0
Hyperspectral images
0
Images
0
Interactive
0
LiDAR
0
Lyrics
0
MRI
0
Medical
0
Midi
0
Music
0
PSG
0
Parallel
0
Physics
0
Point cloud
0
RGB Video
0
RGB-D
0
Ranking
0
Replay data
0
Speech
0
Stereo
0
Tables
0
Tabular
0
Time series
0
Tracking
0
Videos
0
fMRI
0
Filter by Task
Cross-Lingual POS Tagging
1
Cross-lingual zero-shot dependency parsing
1
Dependency Parsing
1
LABELED_DEPENDENCIES
1
LEMMA
1
Language Identification
1
MORPH
1
POS
1
Part-Of-Speech Tagging
1
SENTS
1
TAG
1
UNLABELED_DEPENDENCIES
1
Filter by Language
Tupinambá (selected)
English
54
Chinese
14
German
7
Russian
6
Spanish
6
Malayalam
5
Turkish
5
Vietnamese
5
Czech
4
French
4
Gujarati
4
Hindi
4
Kannada
4
Marathi
4
Punjabi
4
Tamil
4
Telugu
4
Arabic
3
Bengali
3
Dutch
3
Estonian
3
Finnish
3
Indonesian
3
Japanese
3
Korean
3
Multilingual
3
Persian
3
Romanian
3
Sindhi
3
Sinhala
3
Swedish
3
Thai
3
Urdu
3
Afrikaans
2
Albanian
2
Amharic
2
Armenian
2
Assamese
2
Azerbaijani
2
Basque
2
Belarusian
2
Bosnian
2
Breton
2
Bulgarian
2
Burmese
2
Catalan
2
Croatian
2
Danish
2
Esperanto
2
Galician
2
Georgian
2
Greek
2
Guarani
2
Haitian
2
Hebrew
2
Hungarian
2
Icelandic
2
Irish
2
Italian
2
Javanese
2
Kazakh
2
Kurdish
2
Lao
2
Latin
2
Latvian
2
Lithuanian
2
Macedonian
2
Malagasy
2
Mongolian
2
Norwegian
2
Oriya (macrolanguage)
2
Polish
2
Portuguese
2
Quechua
2
Romansh
2
Sanskrit
2
Scottish Gaelic
2
Serbian
2
Slovak
2
Slovenian
2
Somali
2
Sundanese
2
Swahili
2
Tagalog
2
Tatar
2
Ukrainian
2
Uzbek
2
Welsh
2
Yiddish
2
Yoruba
2
Aragonese
1
Asturian
1
Avaric
1
Bashkir
1
Bavarian
1
Bishnupriya
1
Cebuano
1
Central Bikol
1
Central Khmer
1
Central Kurdish
1
Chavacano
1
Chechen
1
Chuvash
1
Cornish
1
Dhivehi
1
Dimli (individual language)
1
Eastern Mari
1
Egyptian Arabic
1
Erzya
1
Filipino
1
Fulah
1
Ganda
1
Goan Konkani
1
Hausa
1
Ido
1
Igbo
1
Iloko
1
Interlingue
1
Kabyle
1
Kalmyk
1
Karachay-Balkar
1
Kirghiz
1
Komi
1
Lezghian
1
Limburgan
1
Lingala
1
Lojban
1
Lombard
1
Low German
1
Lower Sorbian
1
Luxembourgish
1
Maithili
1
Malay (individual language)
1
Maltese
1
Mazanderani
1
Minangkabau
1
Mingrelian
1
Mirandese
1
Modern Greek
1
Neapolitan
1
Newari
1
Northern Frisian
1
Northern Luri
1
Norwegian Nynorsk
1
Occitan (post 1500)
1
Oromo
1
Ossetian
1
Pampanga
1
Piemontese
1
Pushto
1
Russia Buriat
1
Sardinian
1
Serbo-Croatian
1
Sicilian
1
South Azerbaijani
1
Swati
1
Tajik
1
Tibetan
1
Tswana
1
Turkmen
1
Tuvinian
1
Uighur
1
Upper Sorbian
1
Venetian
1
Volapük
1
Walloon
1
Waray (Philippines)
1
Western Frisian
1
Western Mari
1
Western Panjabi
1
Wolof
1
Wu Chinese
1
Xhosa
1
Yakut
1
Yue Chinese
1
Abkhazian
0
Achinese
0
Adyghe
0
Afar
0
Akan
0
Akkadian
0
Akuntsu
0
American Sign Language
0
Ancient Greek
0
Ancient Hebrew
0
Apurinã
0
Argentine Sign Language
0
Arpitan
0
Assyrian Neo-Aramaic
0
Aymara
0
Bambara
0
Bangala
0
Bangladeshi Sign Language
0
Banjar
0
Bemba (Zambia)
0
Bhojpuri
0
Bislama
0
Bodo (India)
0
Buginese
0
Central Pashto
0
Chamorro
0
Cherokee
0
Cheyenne
0
Choctaw
0
Chukot
0
Church Slavic
0
Congo Swahili
0
Coptic
0
Corsican
0
Cree
0
Creek
0
Crimean Tatar
0
Dogri (individual language)
0
Dogri (macrolanguage)
0
Dzongkha
0
Ewe
0
Extremaduran
0
Faroese
0
Fiji Hindi
0
Fijian
0
Fon
0
French Sign Language
0
Friulian
0
Gagauz
0
Gan Chinese
0
Geez
0
German Sign Language
0
Gilaki
0
Gothic
0
Greek Sign Language
0
Gulf Arabic
0
Hakha Chin
0
Hakka Chinese
0
Halh Mongolian
0
Hawaiian
0
Herero
0
Hiri Motu
0
Interlingua (International Auxiliary Language Association)
0
Inuktitut
0
Inupiaq
0
Iranian Persian
0
Jamaican Creole English
0
Jejueo
0
Kabardian
0
Kabuverdianu
0
Kachin
0
Kalaallisut
0
Kanuri
0
Kara-Kalpak
0
Karelian
0
Kashmiri
0
Kashubian
0
Khunsari
0
Kikuyu
0
Kinyarwanda
0
Komi-Permyak
0
Komi-Zyrian
0
Kongo
0
Krio
0
Kuanyama
0
Kölsch
0
Ladino
0
Lak
0
Latgalian
0
Ligurian
0
Literary Chinese
0
Livvi
0
Lozi
0
Lunda
0
Luo (Cameroon)
0
Luo (Kenya and Tanzania)
0
Lushai
0
Malay (macrolanguage)
0
Mandarin Chinese
0
Manipuri
0
Manx
0
Maori
0
Marshallese
0
Mbyá Guaraní
0
Mesopotamian Arabic
0
Min Dong Chinese
0
Modern Greek (1453-)
0
Moksha
0
Moroccan Arabic
0
Mundurukú
0
Najdi Arabic
0
Narom
0
Nauru
0
Navajo
0
Naxi
0
Nayini
0
Ndonga
0
Nepali (individual language)
0
Nepali (macrolanguage)
0
Nigerian Fulfulde
0
Nigerian Pidgin
0
North Azerbaijani
0
North Levantine Arabic
0
Northern Huishui Hmong
0
Northern Kurdish
0
Northern Sami
0
Northern Uzbek
0
Norwegian Bokmål
0
Novial
0
Nyanja
0
Odia
0
Official Aramaic (700-300 BCE)
0
Old English (ca. 450-1100)
0
Old French
0
Old Russian
0
Old Turkish
0
Pali
0
Pangasinan
0
Papiamento
0
Pedi
0
Pennsylvania German
0
Pfaelzisch
0
Picard
0
Pitcairn-Norfolk
0
Plateau Malagasy
0
Pontic
0
Portuguese
0
Rajasthani
0
Rundi
0
Rusyn
0
Saidi Arabic
0
Samoan
0
Sango
0
Santali
0
Saterfriesisch
0
Scots
0
Shan
0
Shona
0
Sichuan Yi
0
Silesian
0
Skolt Sami
0
Soi
0
South Levantine Arabic
0
Southern Pashto
0
Southern Sotho
0
Sranan Tongo
0
Standard Arabic
0
Standard Latvian
0
Swahili (macrolanguage)
0
Swedish Sign Language
0
Swiss German
0
Swiss-German Sign Language
0
Tahitian
0
Tai
0
Tetum
0
Tigrinya
0
Tok Pisin
0
Tonga (Tonga Islands)
0
Tonga (Zambia)
0
Tosk Albanian
0
Tsonga
0
Tulu
0
Tumbuka
0
Tunisian Arabic
0
Turkish Sign Language
0
Twi
0
Udmurt
0
Venda
0
Veps
0
Vlaams
0
Vlax Romani
0
Votic
0
Warlpiri
0
West Central Oromo
0
Zaza
0
Zeeuws
0
Zhuang
0
Zulu
0
0 dataset results for Language Modelling AND Texts AND Tupinambá