💻 Week 07 Lab

Practice reshaping dataframes

Published: 14 November 2024

Image created with the AI embedded in MS Designer using the prompt 'abstract salmon pink light blue icon depicting the metaphysical experience of cleaning up, reshaping, pivoting, and manipulating data in search of the purest insights in data science.'

📍 Location: Friday, 15 November 2024. Time varies.

🥅 Learning Objectives

In this lab, we want you to learn and practise the following:

  • Practice using pd.json_normalize to flatten nested JSON data from a data source you have not seen before.

  • Practice using DataFrame.explode to expand lists in a column.

  • Practice using pd.merge to combine dataframes.

  • Learn about the groupby method in pandas.
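As a rough illustration of the first two objectives, here is a minimal sketch of flattening nested JSON and then expanding a list column. The records below (`id`, `user`, `tags`) are made up for this example and are not the data source used in the lab notebook.

```python
import pandas as pd

# Hypothetical nested records, invented purely for illustration;
# the lab notebook works with a different data source.
records = [
    {"id": 1, "user": {"name": "Ana", "city": "London"}, "tags": ["a", "b"]},
    {"id": 2, "user": {"name": "Ben", "city": "Leeds"}, "tags": ["c"]},
]

# Flatten the nested "user" dictionary into columns named
# "user.name" and "user.city" (the default separator is a dot).
flat = pd.json_normalize(records)

# Give each element of the "tags" lists its own row;
# the values in the other columns are repeated for each new row.
exploded = flat.explode("tags").reset_index(drop=True)

print(exploded)
```

Note that `explode` keeps the original index labels by default, which is why the sketch resets the index afterwards.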

📋 Preparation

NOTE: The skills you are practising today will be essential for the ✍️ W10 Summative assignment.

🛣️ Roadmap

Here is how we will achieve the goals of this lab:

Part I: ⚙️ Set Up (10 min)

Use VS Code either on your own machine or inside Nuvolos.

🎯 INDIVIDUAL ACTION POINTS

There are two options to download the lab notebook:

  1. (Simpler) Click on the link below to download the lab notebook directly.

  2. (GitHub pro) If you have already forked the course’s live repository, lse-ds105/ds105a-2024, cd to the ds105a-2024 folder and run git pull --force in your terminal to get the latest version of the course’s repository.

    • The --force here is to ensure that your local repository is updated with the latest changes from the course’s repository and it will completely ignore any local changes you have made (even if you had committed them). Don’t use --force in other contexts unless you know what you are doing.

    • Navigate to where the W07 Lab notebook is located and open it.

Part II: 📚 Practice (70-80 min)

Follow the instructions in the lab notebook to complete the exercises.

Notes:

  • You can work alone or in small groups for this.
  • If you want, feel free to play a game of 🧑‍✈️ Pilot and 🙋 Copilot(s) like we’ve done in the past.

What the exercises will cover:

  • Using pd.json_normalize to flatten nested JSON data.

  • Using DataFrame.explode to expand lists in a column.

  • Using pd.merge to combine dataframes.

  • Using the groupby method in pandas.
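To give a flavour of the last two items, here is a minimal sketch that combines two dataframes with pd.merge and then summarises the result with groupby. The `orders` and `customers` tables and their column names are hypothetical, chosen only for illustration; the notebook exercises use their own data.

```python
import pandas as pd

# Hypothetical tables, invented purely for illustration.
orders = pd.DataFrame(
    {
        "order_id": [1, 2, 3],
        "customer_id": [10, 10, 20],
        "amount": [5.0, 7.5, 3.0],
    }
)
customers = pd.DataFrame(
    {
        "customer_id": [10, 20],
        "name": ["Ana", "Ben"],
    }
)

# Left join: keep every order and attach the matching customer name.
merged = pd.merge(orders, customers, on="customer_id", how="left")

# Group the merged rows by customer name and sum the order amounts.
# as_index=False keeps "name" as a regular column instead of the index.
totals = merged.groupby("name", as_index=False)["amount"].sum()

print(totals)
```

A left join is used here so that no order is dropped if a customer is missing from the second table; other values of `how` ("inner", "outer", "right") change which rows are kept.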