Tuesday 13 September 2022

#### **Required pre-session activities:**

- Work through [this R tutorial](https://canvas.ubc.ca/courses/97662/files/19784375?wrap=1 "LDP-SDM-2021-lesson3_tutorial.R") ([download](https://canvas.ubc.ca/courses/97662/files/19784375/download?download_frd=1)), familiarizing yourself with the functions that we will mention in class. Note that you will first need to install a few packages, including assertr, stringdist, GGally, taxize, and palmerpenguins (see the pre-class [Instructions for Students: R and RStudio](https://canvas.ubc.ca/courses/97662/pages/instructions-for-students-r-and-rstudio "Instructions for students: R and RStudio") for package installation instructions).

#### **Optional readings:**

- Broman, K.W., and Woo, K.H. (2018). Data organization in spreadsheets. *American Statistician*, 72, 2-10. [https://doi.org/10.1080/00031305.2017.1375989](https://doi.org/10.1080/00031305.2017.1375989) [**PDF**](https://canvas.ubc.ca/courses/97662/files/19784378?wrap=1 "Broman & Woo 2018 Am. Stat.pdf") ([download](https://canvas.ubc.ca/courses/97662/files/19784378/download?download_frd=1))
- de Jonge, E., and van der Loo, M. (2013). An introduction to data cleaning with R. *Statistics Netherlands*. [https://cran.r-project.org/doc/contrib/de_Jonge+van_der_Loo-Introduction_to_data_cleaning_with_R.pdf](https://cran.r-project.org/doc/contrib/de_Jonge+van_der_Loo-Introduction_to_data_cleaning_with_R.pdf) [**PDF**](https://canvas.ubc.ca/courses/97662/files/19784387?wrap=1 "de Jonge & van der Loo 2013 Stat. NL.pdf") ([download](https://canvas.ubc.ca/courses/97662/files/19784387/download?download_frd=1))

#### **In-session activities:**

1. **10 min wrap-up of last Thursday's activity**
2. **Lecture (20-30 min; Rachel Germain)**
   - Data cleaning and quality control
   - Identifying outliers, typos, etc. with functions and algorithms
   - Data standards for species, time, and space
   - See: [lesson 3 lecture slides](https://canvas.ubc.ca/courses/97662/files/22428437?wrap=1 "LDP-SDM-2022-lesson3_lecture_slides.pdf") ([download](https://canvas.ubc.ca/courses/97662/files/22428437/download?download_frd=1))
3. **Break (5 min)**
4. **Activity / discussion (40 min; Mike Lavender)**
   - See: [lesson 3 pre-tutorial code](https://canvas.ubc.ca/courses/97662/files/19784375?wrap=1 "LDP-SDM-2021-lesson3_tutorial.R") ([download](https://canvas.ubc.ca/courses/97662/files/19784375/download?download_frd=1)) and [Data Cleaning and Standards Assignment](https://canvas.ubc.ca/courses/97662/assignments/1189490 "Data Cleaning and Standards (individual/group assignment)")
   - Break-out groups of ~4-5 people, grouped according to R skill level
   - Choose a data cleaning task to work on
   - Discuss ideas for how to address the task(s), then write a short script
   - For references on some useful functions, see:
     - [assertr vignette](https://cran.r-project.org/web/packages/assertr/vignettes/assertr.html)
     - [lubridate cheat sheet](https://canvas.ubc.ca/courses/97662/files/19784383?wrap=1 "lubridate_cheat_sheet.pdf") ([download](https://canvas.ubc.ca/courses/97662/files/19784383/download?download_frd=1))
     - [stringr cheat sheet](https://canvas.ubc.ca/courses/97662/files/19784382?wrap=1 "stringr_cheat_sheet.pdf") ([download](https://canvas.ubc.ca/courses/97662/files/19784382/download?download_frd=1))

#### **Homework:**

- Complete and submit your individual data cleaning task
- You are encouraged to collaborate with your peers to find a solution, but everyone should submit their own assignment
- **Due**: Friday 16 September 2022
- See: [Data Cleaning & Standards Assignment](https://canvas.ubc.ca/courses/97662/assignments/1189490 "Data Cleaning and Standards (individual/group assignment)") for instructions and rubric