DS105 2025-2026 Autumn Term

🖥️ Week 07 Lecture

JSON Normalisation and Data Reshaping

Author

Dr Jon Cardoso-Silva

Published

13 November 2025

📍 Logistics

Time and Location: Thursday, 13 November 2025, 16:00 - 18:00, CLM 5.02

Many of you discovered pd.json_normalize() organically while working on ✍️ Mini-Project 1. Today, we’ll formalise this knowledge and explore more powerful techniques for handling nested data structures.

📋 Preparation

  • Take a look at ✍️ Mini-Project 2, especially the ♟️ Tactical Plan section, to understand how this lecture will support your work on that assignment.

📓 Lecture Materials

Today’s lecture uses slides with a demonstration notebook for live coding. All materials will be available in your Nuvolos workspace under the week07/ folder.

🎬 Lecture Slides

Use keyboard arrows to navigate. Select the slides below or view fullscreen.

A PDF version of the slides is also available.

Lecture Demonstration Notebook

This notebook accompanies the slides with code examples and pre-prepared nested JSON objects for practice.

Available on Nuvolos after the lecture: week07/W07-NB01-Lecture-Data-Reshaping.ipynb

Model Solution

The curated model solution from Mini-Project 1 (with excellent reflections) is available for study:

Available on Nuvolos after the lecture: week07/mp1-model-solution/

💡 Key Concepts

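  • pd.json_normalize(): Flattens nested JSON structures into DataFrames, turning nested dictionary keys into dot-separated column names (for example, address.city). Accepts a dictionary, a list of dictionaries, or a pandas Series of dictionaries. Essential for handling API responses with nested data.
  • record_path and meta: record_path selects the nested list to expand into one row per element, and meta names the parent fields to copy onto each of those rows so the parent context is preserved. The record_path must point to a list within each record.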
  • pd.concat(): Combines multiple DataFrames vertically or horizontally (exposure-only in W07)
  • .explode(): Creates one row per element in list-type columns (exposure-only in W07)
  • .melt(): Transforms data from wide format to long format for plotting (exposure-only in W07)
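
The concepts above can be sketched in a few lines. This is a minimal illustration, not the lecture notebook's code: the records, field names (address, orders, tags) and the small wide table are all invented for the example, and only standard pandas calls are used.

```python
import pandas as pd

# Hypothetical nested records, invented for illustration only
records = [
    {
        "id": 1,
        "name": "Alice",
        "address": {"city": "London", "postcode": "WC2A"},
        "orders": [
            {"item": "book", "price": 12.5},
            {"item": "pen", "price": 1.2},
        ],
        "tags": ["student", "member"],
    },
    {
        "id": 2,
        "name": "Bob",
        "address": {"city": "Leeds", "postcode": "LS1"},
        "orders": [{"item": "laptop", "price": 900.0}],
        "tags": ["staff"],
    },
]

# 1. Default call: nested dictionaries become dot-separated columns
#    (address.city, address.postcode); the lists in "orders" and "tags"
#    are left as Python lists inside single cells.
flat = pd.json_normalize(records)

# 2. record_path expands the "orders" list into one row per order;
#    meta copies the listed parent fields onto each of those rows.
orders = pd.json_normalize(
    records,
    record_path="orders",
    meta=["id", "name", ["address", "city"]],
)

# 3. .explode() creates one row per element of a list-valued column.
tags = flat[["id", "tags"]].explode("tags")

# 4. pd.concat() stacks DataFrames vertically by default (axis=0).
extra = pd.DataFrame(
    {"item": ["mouse"], "price": [25.0], "id": [2],
     "name": ["Bob"], "address.city": ["Leeds"]}
)
all_orders = pd.concat([orders, extra], ignore_index=True)

# 5. .melt() reshapes a wide table (one column per year) into long format.
wide = pd.DataFrame({"city": ["London", "Leeds"], "2023": [10, 4], "2024": [12, 6]})
long_df = wide.melt(id_vars="city", var_name="year", value_name="count")

print(orders)
print(long_df)
```

Note that the default call in step 1 does not expand lists on its own; that is the job of record_path (step 2) or .explode() (step 3), depending on whether the list holds dictionaries or plain values.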

🔖 Appendix

Post-Lecture Actions

  • Review the lecture slides and notebook
  • Complete the 💻 W07 Lab (OpenSanctions practice)
  • Start exploring ✍️ Mini-Project 2 (released this week)
  • Attend W07 Lab and drop-in sessions

Useful Links

Looking Ahead

  • Tomorrow (W07 Friday): OpenSanctions lab practice
  • Mini-Project 2: Released this week, due Week 10
  • Week 08: Databases and SQL - storing and querying structured data
  • Key skill: These reshaping techniques are essential for MP2’s TfL API work