💻 Week 09 - Class Roadmap (90 min)

2022/23 Lent Term

Author

DS101L 2022/23 Teaching Team

Published

13 March 2023

📚 Learning Objectives

You will learn how to:

Note: this class covers classification, a supervised learning approach. We will use a dataset in which the response variable (also known as the target variable) is already known, and use it to predict the response (a class) for new observations.

๐Ÿ›ฃ๏ธ Roadmap

โš™๏ธ Setup (~ 10 min)

🎯 ACTION POINTS:

  1. Go to Moodle and download the files for this week's lab.
  2. Open RStudio and create a new project.
  3. Add the files you downloaded to your project folder.
  4. Open DS101L_2022_23_W09_lab.Rmd in RStudio.
  5. Run the code chunks in the ⚙️ Setup section to load the libraries you need for this lab.

Ask your colleagues and tutor for help if you get stuck.

1๏ธโƒฃ Load Data (~ 15 min)

๐Ÿง‘โ€๐Ÿซ INSTRUCTOR NOTES:

  1. Your instructor will explain what is in the data.
  2. Since we will take a supervised learning approach, your instructor will briefly explain what the input and output variables are.

🎯 ACTION POINTS:

Follow the action points in the markdown file.

2๏ธโƒฃ Training vs Test (~35 min)

๐Ÿ‘จโ€๐Ÿซ TEACHING POINT:

  1. Your instructor will tell you about a common strategy to validate Machine Learning models: training versus test splits.

🎯 ACTION POINTS:

Follow the action points in the markdown file. You will learn how to create a training and test split and what a confusion matrix looks like.
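As a rough illustration of the idea, here is a minimal sketch of a training/test split followed by a confusion matrix, using base R only. It is not the code from the lab file: the toy data frame `df`, the 70/30 split ratio, the logistic regression model, and the 0.5 probability cut-off are all assumptions made for this example.

```r
# Hypothetical toy data: the lab's real dataset is NOT used here.
set.seed(42)                               # make the random split reproducible
n  <- 200
df <- data.frame(x = rnorm(n))
df$y <- as.integer(df$x + rnorm(n) > 0)    # binary response variable (the class)

# Hold out 30% of the rows as a test set; train on the remaining 70%.
# (The 70/30 ratio is an assumption, not a rule.)
test_idx <- sample(seq_len(n), size = round(0.3 * n))
train <- df[-test_idx, ]
test  <- df[test_idx, ]

# Fit a logistic regression using the training rows only.
model <- glm(y ~ x, data = train, family = binomial)

# Predict classes for the unseen test rows (probability > 0.5 means class 1).
prob <- predict(model, newdata = test, type = "response")
pred <- as.integer(prob > 0.5)

# Confusion matrix: rows are predicted classes, columns are actual classes.
conf_mat <- table(predicted = factor(pred,   levels = c(0, 1)),
                  actual    = factor(test$y, levels = c(0, 1)))
print(conf_mat)
```

The model never sees the test rows while it is being fitted, so the confusion matrix reflects how it behaves on new observations rather than on data it has already memorised.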

There are also several points of discussion in the markdown file. You will discuss these points with your colleagues and tutor.

3๏ธโƒฃ Metrics (~ 30 min)

๐Ÿ‘จโ€๐Ÿซ TEACHING POINT:

  1. Each cell in the confusion matrix can be identified with a name. Your instructor will tell you about:

    • False Positives
    • False Negatives
    • True Positives
    • True Negatives
  2. Take a look at this table of metrics together.
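To make the four cell names concrete, the sketch below counts them by hand for ten made-up observations and derives three commonly used metrics. The vectors `actual` and `predicted` are hypothetical, and the choice of accuracy, precision and recall is an assumption: the table of metrics discussed in class may list others.

```r
# Hypothetical labels for ten observations (1 = positive, 0 = negative).
actual    <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
predicted <- c(1, 1, 1, 0, 1, 0, 0, 0, 0, 0)

# Each cell of the confusion matrix, counted directly.
tp <- sum(predicted == 1 & actual == 1)   # true positives
fp <- sum(predicted == 1 & actual == 0)   # false positives
fn <- sum(predicted == 0 & actual == 1)   # false negatives
tn <- sum(predicted == 0 & actual == 0)   # true negatives

# Metrics built from those four counts.
accuracy  <- (tp + tn) / (tp + fp + fn + tn)  # share of all predictions that are correct
precision <- tp / (tp + fp)                   # share of predicted positives that are correct
recall    <- tp / (tp + fn)                   # share of actual positives that were found

print(c(accuracy = accuracy, precision = precision, recall = recall))
```

With these made-up labels there are 3 true positives, 1 false positive, 1 false negative and 5 true negatives, giving an accuracy of 0.8 and a precision and recall of 0.75 each.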

👥 WORKING TOGETHER IN PAIRS:

Try to work out the solution to the questions posed in the markdown file. You can discuss these questions with your colleagues and tutor.

๐Ÿ‘จโ€๐Ÿซ TEACHING POINT:

  1. Your instructor will explain what the metrics reveal about the strengths and weaknesses of this classification model.

๐ŸกTake-home exercise

There is a take-home exercise in the markdown file. Can you answer the questions posed there?

If you can't work out the answers to the questions, send us a message on Slack!