Intro to Machine Learning with R
This Advanced Quantitative Methods training is open to all ESRC and non-ESRC funded students within the seven WRDTP partner institutions. Students are welcome from all seven interdisciplinary Pathways.
This session will introduce libraries and functions in R for performing Machine Learning (ML). The most common forms of ML are: (i) supervised learning (e.g., prediction, regression and classification); (ii) unsupervised learning (e.g., clustering); and (iii) reinforcement learning. This session will focus on supervised ML. We will start by reviewing linear regression, which forms a good starting point for understanding machine learning and may be something you are already familiar with. Then we will explore further algorithms that can be used for regression and classification. We will mainly focus on using the caret package for ML, but (as usual) there are many ways of doing things in R and multiple packages that can be used for ML. For a complete overview of R ML libraries, see the CRAN Task View.
This session is very much a hands-on overview of supervised machine learning and some of the R functions that can be used. For a more theoretical overview you might find the book “An Introduction to Statistical Learning with Applications in R” and the accompanying videos helpful (found here).
Learning Objectives for attendees:
- To briefly introduce the concepts of Machine Learning and Linear Regression
- To demonstrate the use of the lm() function for creating linear regression models
- To use the predict() function to make predictions using a model
- To describe information about the linear models provided by the summary() function
- To prepare and transform the data (e.g., scale and handle missing values)
- To create training and test sets (including validation) using caret
- To create classification models using the caret, rpart and randomForest packages
- To evaluate and compare trained regression and classification models.
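As a taste of the first few objectives, the following is a minimal sketch of fitting, summarising and predicting with a linear model in base R. The built-in mtcars dataset and the choice of predictors (wt and hp) are assumptions made for illustration only; they are not necessarily the data or variables used in the workshop.

```r
# Fit a linear regression on the built-in mtcars data
# (an assumed example dataset, not the workshop's own data).
data(mtcars)
fit <- lm(mpg ~ wt + hp, data = mtcars)

# summary() reports coefficients, standard errors, p-values and R-squared.
fit_summary <- summary(fit)
print(fit_summary)

# predict() applies the fitted model to new observations.
# The two hypothetical cars below differ in weight and horsepower.
new_cars <- data.frame(wt = c(2.5, 3.5), hp = c(100, 150))
predicted_mpg <- predict(fit, newdata = new_cars)
print(predicted_mpg)
```

The new data frame passed to predict() must contain columns with the same names as the predictors in the model formula.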
Structure of the workshop
The workshop consists of the following topics:
- An introduction to machine learning
- An overview of prediction with regression analysis
- The machine learning pipeline (e.g. feature selection, feature engineering, modelling and training/test datasets)
- Classification and regression with other algorithms (e.g. decision trees, neural networks).
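The pipeline and classification topics above can be sketched roughly as follows with caret. This is an illustrative outline only: it assumes the caret and rpart packages are installed, and it uses the built-in iris dataset, an 80/20 split and 5-fold cross-validation as stand-in choices rather than the workshop's actual materials.

```r
library(caret)  # assumes caret (and rpart, used via method = "rpart") are installed

set.seed(42)    # arbitrary seed so the split is reproducible
data(iris)      # assumed example dataset

# Hold out 20% of the rows as a test set, stratified by class.
train_idx <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
train_set <- iris[train_idx, ]
test_set  <- iris[-train_idx, ]

# Train a decision tree, using 5-fold cross-validation on the training set
# to choose the tree's complexity parameter.
ctrl <- trainControl(method = "cv", number = 5)
tree_fit <- train(Species ~ ., data = train_set,
                  method = "rpart", trControl = ctrl)

# Evaluate the trained model on the held-out test set.
test_pred <- predict(tree_fit, newdata = test_set)
cm <- confusionMatrix(test_pred, test_set$Species)
print(cm$overall["Accuracy"])
```

Changing the method argument (for example to "rf" for a random forest, which requires the randomForest package) lets the same code train a different algorithm, which makes comparing models straightforward.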
What is provided
For the workshop you will be provided with the following:
- A handout to follow in self-study
- An R script containing all the R code for the workshop (and a script with answers to the exercises)
- Copies of relevant resources
- Datasets to use within the workshop
- Slides that introduce relevant topics
This training session will be delivered via Blackboard Collaborate.
PLEASE NOTE: Our online training sessions will be recorded and will be available on the VIRE in an edited format for those students who cannot attend. If you wish to join this session but do not wish for your contributions to be included in the edited VIRE resource, please ensure that you select NO when prompted in the online booking form regarding recording.