Nov 23, 2024
NTRES 6100 - Collaborative and Reproducible Data Science in R
Fall. 2-3 credits, variable. S/U grades only.
Forbidden Overlap: due to an overlap in content, students will receive credit for only one course in the following group: AEM 2850, GDEV 2295, GDEV 5290, NTRES 6100, STSCI 3040, STSCI 5040.
N.O. Therkildsen.
As datasets grow larger and more complex across all areas of science, computational skills are in increasingly high demand. This course introduces a series of practical tools that let researchers spend less time wrestling with software or repeating error-prone manual data processing, and more time getting research done in efficient, transparent ways that facilitate collaboration and reproducibility. We will work in R/RStudio. Topics covered include 1) tidy data formatting, 2) rearrangement, filtering, exploration, and visualization of complex datasets, 3) basic programming, 4) version control with Git and GitHub, and 5) using R Markdown to combine text, code, tables, and figures into reports, websites, and presentations. The course emphasizes practical skill development and is structured around hands-on, at-the-keyboard learning.
Outcome 1: Describe strategies for ensuring that their data analysis is reproducible.
Outcome 2: Demonstrate best practices for coding and project-oriented workflows in RStudio.
Outcome 3: Import and clean messy data files using a variety of packages and functions in R.
Outcome 4: Subset, reorganize, and merge diverse datasets in R.
Outcome 5: Effectively explore and visualize patterns in complex datasets with ggplot in R.
Outcome 6: Write simple functions/programs and data analysis pipelines in R.
Outcome 7: Automate repeated analysis tasks in R.
Outcome 8: Track the history of file changes (version control) and collaborate effectively with others on scripts using Git and GitHub.
Outcome 9: Use R Markdown to combine text, equations, code, tables, and figures into reports, websites, and presentations.