**An Introduction to Machine Learning for Psychologists in R**

The majority of psychological research aims to explain human experience and behavior using methods of inferential statistics. This is not always in line with the intention to predict psychological variables and associated outcomes with utmost precision (Yarkoni, 2017). Models and techniques from the field of machine learning were developed to maximize predictive performance. Although machine learning models have long been considered black boxes, recent developments have greatly increased their interpretability. For these reasons, the psychological research community shows increasing interest in adopting these methods. In this workshop, we will give a non-technical introduction to the basic concepts and ideas of machine learning. We will discuss the bias-variance tradeoff, overfitting, resampling techniques, model evaluation, and variable selection. Participants will be introduced to the Random Forest (Breiman, 2001), a powerful, nonlinear machine learning algorithm that is known for its high predictive performance in many application settings. To demonstrate the strengths of the Random Forest, we will compare its performance with that of linear regression models in a series of benchmark experiments. In addition to performance evaluation, researchers are often interested in the importance of single predictors. For this purpose, variable importance measures and partial dependence plots are useful. After this workshop, participants should be able to apply basic machine learning techniques to their own research.
Clemens & Florian

----------

Prior to the workshop, please prepare the following things:

- install R and RStudio:
    - https://cran.r-project.org/
    - https://www.rstudio.com/products/rstudio/download/
- install the following R packages from CRAN:
    - mlr, parallelMap, ggplot2, iml, glmnet, ranger, rpart
    - knitr, mlbench, mvtnorm, gridExtra, rpart.plot (optional if you want to reproduce our slides)
- make sure the installation of the packages was successful (especially if you work on a Mac)
- download the files in the OSF Storage folder
- charge the battery of your laptop (and don't forget your charger)
- for details about the time and the location of the workshop, check out the official homepage of the conference: https://www.dgpskongress.de/

----------

Description of the files in the OSF Storage of this repository:

- **ml_workshop_slides.pdf**
    - slides used in the workshop
- **phonedata.csv**
    - data file we use for most practical exercises in the workshop
    - see our slides for a description of the dataset
- **preprocessing.R**
    - sourcing this R script loads the workshop data, does some preprocessing steps necessary for the predictive modeling analyses, and returns a final dataframe named "phonedata"
    - this saves workshop time for the more interesting stuff
- **ml_workshop_slides.Rmd**
    - R Markdown file containing all the code to reproduce our slides
    - check out this file to see what goes on behind the scenes :)
    - knitting the slides in RStudio requires the knitr R package and a working version of LaTeX
- **ml_workshop.bib**
    - Bib file containing the references included in the presentation
- **Figures/**
    - folder containing the pictures included in the presentation

----------

For questions or comments, feel free to contact the authors at **Florian.Pargent@psy.lmu.de** and **Clemens.Stachl@psy.lmu.de**
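
----------

The package preparation steps listed above (install from CRAN, then make sure the installation was successful) could be checked with a short R snippet like the one below. This is only a minimal sketch and not part of the workshop materials: the helper name `missing_packages` is hypothetical, and the `install.packages()` call is left commented out so that nothing is installed without your consent.

```r
# Packages named in the preparation list above
required <- c("mlr", "parallelMap", "ggplot2", "iml", "glmnet", "ranger", "rpart")
optional <- c("knitr", "mlbench", "mvtnorm", "gridExtra", "rpart.plot")

# Run once to install everything from CRAN (uncomment to use):
# install.packages(c(required, optional))

# Hypothetical helper: returns the names of packages that cannot be loaded.
# requireNamespace() returns TRUE/FALSE without attaching the package.
missing_packages <- function(pkgs) {
  ok <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  unname(pkgs[!ok])
}

# An empty result (character(0)) means every package can be loaded
print(missing_packages(c(required, optional)))
```

If the printed vector is empty, all packages load correctly. After that, and with the OSF Storage files downloaded into the working directory, running `source("preprocessing.R")` should create the `phonedata` dataframe described in the file list.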