Teaching precursors to data science in introductory and second courses in statistics

14 Jan 2014 · Nicholas J Horton, Benjamin S Baumer, Hadley Wickham

Statistics students need to develop the capacity to make sense of the staggering amount of information collected in our increasingly data-centered world. Data science is an important part of modern statistics, but our introductory and second statistics courses often neglect this fact. This paper discusses ways to provide a practical foundation for students to learn to "compute with data" as defined by Nolan and Temple Lang (2010), as well as develop "data habits of mind" (Finzer, 2013). We describe how introductory and second courses can integrate two key precursors to data science: the use of reproducible analysis tools and access to large databases. By introducing students to commonplace tools for data management, visualization, and reproducible analysis in data science and applying these to real-world scenarios, we prepare them to think statistically in the era of big data.

Code

beanumber/tidy-databases
