3 dataset results for Language Acquisition AND English

BabySLM

BabySLM is a language-acquisition-friendly benchmark that probes speech-based language models at the lexical and syntactic levels, using a vocabulary typical of children's language experiences.

1 PAPER • NO BENCHMARKS YET

Duolingo SLAM Shared Task

This repository contains gzipped files with more than 2 million tokens (words) from answers submitted by more than 6,000 students during their first 30 days of using Duolingo. It also includes baseline starter code written in Python. There are three data sets, one for each of three language courses. More details on the data sets and the task are available at: http://sharedtask.duolingo.com. (2018-01-10)

3 PAPERS • NO BENCHMARKS YET

Duolingo Spaced Repetition Data

This is a gzipped CSV file containing the 13 million Duolingo student learning traces used in experiments by Settles & Meeder (2016). For more details and replication source code, visit: https://github.com/duolingo/halflife-regression (2016-06-07)

3 PAPERS • NO BENCHMARKS YET
