DS105 2025-2026 Winter Term

💻 Week 07 Lab

MP2 start: first collection and first normalisation pass

Author: Dr Jon Cardoso-Silva

Published: 19 March 2026

🥅 Learning Goals

By the end of this lab, you should be able to: i) run a working TfL Journey API call pattern in your own MP2 repo, ii) inspect nested JSON and unnest one response with json_normalize(), iii) save raw and normalised outputs into your repo structure, iv) adapt collection and normalisation choices to match your NB01 methodology.

This lab builds directly on the 🖥️ W07 Lecture, where you defined the methodology you will adopt in your MP2 NB01 and NB02. Today we implement the baseline together, then adapt it to your own methodological choices.

📋 Preparation

  • Attend or watch the 🖥️ W07 Lecture
  • Bring your MP2 repository open in Nuvolos
  • If you are not working on Nuvolos, download the lab notebook:

  • Have your current choices ready: Inner London, Outer London, peak, off-peak

đŸ›Ŗī¸ Lab Roadmap

| Part | Activity Type | Focus | Time | Outcome |
|---|---|---|---|---|
| Part 1 | 🗣️ TEACHING MOMENT | Setup and first request | 15 min | Everyone runs one working request and saves one raw JSON sample |
| Part 2 | 🗣️ TEACHING MOMENT | Guided JSON normalisation | 30 min | Everyone produces one normalised table from the response |
| Part 3 | 🎯 ACTION POINTS | Adapt to your own methodology | 45 min | You revise collection and unnesting choices in your own repo |
| Wrap-Up | 🗣️ TEACHING MOMENT | Report connection and next actions | 10 min | You leave with saved outputs plus REPORT.md method notes |

💡 Note: We all go through the same code and data collection in Part 1 and Part 2 so everyone has the same baseline. From Part 3 onwards, your own methodology decisions drive your implementation choices.

Part 1: API setup and first request (15 min)

Note to class teachers: Keep this fully synchronised. The goal is one reproducible request pattern across the room before students customise anything.

🎯 ACTION POINTS

  1. Confirm .env loading and API key access are working
  2. Run one low-risk API request with the class
  3. Save one raw JSON response to data/raw/ in your own repo
  4. Check you can identify where nested journey details live in the JSON
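
The request-and-save steps above can be sketched as follows. This is a minimal illustration, not the lab notebook's code: it uses only the standard library, it assumes the API key has already been loaded from `.env` into the environment (for example with `load_dotenv()` from python-dotenv), and the variable name `TFL_API_KEY`, the postcodes, the date and the output filename are all hypothetical placeholders.

```python
import json
import os
from pathlib import Path
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Journey Planner endpoint of the TfL Unified API.
BASE_URL = "https://api.tfl.gov.uk/Journey/JourneyResults"


def build_journey_url(origin, destination, app_key=None, **params):
    """Build the request URL; extra keyword arguments become query parameters."""
    query = dict(params)
    if app_key:
        query["app_key"] = app_key
    url = f"{BASE_URL}/{quote(origin, safe='')}/to/{quote(destination, safe='')}"
    return f"{url}?{urlencode(query)}" if query else url


def fetch_json(url, timeout=30):
    """Send one GET request and parse the JSON body."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def save_raw(payload, path):
    """Write the untouched response to disk, creating the folder if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return path


def collect_one(origin, destination, out_path, **params):
    """One request, one raw file. TFL_API_KEY is an assumed variable name."""
    app_key = os.environ.get("TFL_API_KEY")
    payload = fetch_json(build_journey_url(origin, destination, app_key, **params))
    return save_raw(payload, out_path)


# Example call (placeholder values, not run here because it needs network access):
# collect_one("SW1A 1AA", "WC2A 2AE", "data/raw/journey_sample.json",
#             date="20260323", time="0830", timeIs="Departing")
```

Keeping URL construction, fetching and saving in separate functions makes it easier to change one origin, destination or time window later without touching the rest of the pattern.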

Part 2: Guided json_normalize() on one response (30 min)

Note to class teachers: Keep the rhythm explicit, first no parameters, then record_path, then meta. Make students explain what shape changed after each pass.

As a class, inspect nested structure and unnest step by step.
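
One way to inspect the nesting before calling `json_normalize()` is to print an outline of keys, types and list lengths. The helper below is a hypothetical sketch, and `sample` is a hand-written stand-in that only mimics the general journeys-contain-legs shape; a real response has many more fields.

```python
def outline(node, indent=0, max_depth=4):
    """Return lines describing keys, value types and list lengths of nested JSON."""
    lines = []
    if indent >= max_depth or not isinstance(node, dict):
        return lines
    pad = "  " * indent
    for key, value in node.items():
        if isinstance(value, list):
            lines.append(f"{pad}{key}: list[{len(value)}]")
            if value:
                # Describe only the first element as a representative record.
                lines.extend(outline(value[0], indent + 1, max_depth))
        elif isinstance(value, dict):
            lines.append(f"{pad}{key}: dict")
            lines.extend(outline(value, indent + 1, max_depth))
        else:
            lines.append(f"{pad}{key}: {type(value).__name__}")
    return lines


# Hand-written illustration, not real API output.
sample = {
    "journeys": [
        {
            "duration": 34,
            "startDateTime": "2026-03-23T08:30:00",
            "legs": [
                {"duration": 10, "mode": {"id": "walking"}},
                {"duration": 24, "mode": {"id": "tube"}},
            ],
        }
    ]
}

print("\n".join(outline(sample)))
```

Each `list[...]` line in the outline is a candidate for `record_path`, and the scalar keys sitting above it are candidates for `meta`.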

🎯 ACTION POINTS

  1. Run a first pass without parameters and inspect the output shape
  2. Run a second pass with record_path and compare row expansion
  3. Run a third pass with meta and check context columns
  4. Save one normalised output to data/processed/ in your own repo
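
The three passes above can be sketched on a small hand-written sample. The field names here (`journeys`, `legs`, `duration`, `startDateTime`, `mode`, `instruction`) are assumed for illustration and the values are invented; check the paths against your own saved raw JSON. `save_processed` and its output path are hypothetical.

```python
from pathlib import Path

import pandas as pd

# Hand-written illustration, not real API output.
sample = {
    "journeys": [
        {
            "duration": 34,
            "startDateTime": "2026-03-23T08:30:00",
            "legs": [
                {"duration": 10, "mode": {"id": "walking"}, "instruction": {"summary": "Walk to station"}},
                {"duration": 24, "mode": {"id": "tube"}, "instruction": {"summary": "Tube to destination"}},
            ],
        },
        {
            "duration": 41,
            "startDateTime": "2026-03-23T08:35:00",
            "legs": [
                {"duration": 41, "mode": {"id": "bus"}, "instruction": {"summary": "Bus to destination"}},
            ],
        },
    ]
}

# Pass 1: no parameters. One row; the journeys list stays packed in a single cell.
flat = pd.json_normalize(sample)
print(flat.shape)  # (1, 1)

# Pass 2: record_path. One row per leg; nested dicts become dotted column names.
legs = pd.json_normalize(sample, record_path=["journeys", "legs"])
print(legs.shape)  # (3, 3)

# Pass 3: meta. Each leg row also carries context from its parent journey.
legs_with_context = pd.json_normalize(
    sample,
    record_path=["journeys", "legs"],
    meta=[["journeys", "duration"], ["journeys", "startDateTime"]],
)
print(legs_with_context.shape)  # (3, 5)


def save_processed(df, path):
    """Write a normalised table as CSV, creating the folder if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    df.to_csv(path, index=False)
    return path


# Example call (hypothetical filename):
# save_processed(legs_with_context, "data/processed/legs_sample.csv")
```

Comparing the three shapes makes the effect of each parameter visible: `record_path` adds rows, `meta` adds columns.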

Part 3: Adapt in your own MP2 repo (45 min)

Note to class teachers: Students now diverge based on their NB01 methodology. Circulate and prompt clear reasoning for changes in scope, windows, or normalisation strategy.

Now connect technical work back to your NB01 decisions.

🎯 ACTION POINTS

  1. Adjust origin and destination selections to match your proposal
  2. Adjust temporal parameters for your current peak/off-peak definitions
  3. Re-run collection if your response is empty or too thin for your question
  4. Revise normalisation choices if your structure demands a different pass
  5. Write at least two sentences in REPORT.md documenting your current peak/off-peak definition choices
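
One possible way to keep the choices in the steps above explicit is to hold routes and time windows as data and generate the request plan from them. Everything in this sketch is a hypothetical placeholder: the postcodes, the window labels and times, the `journeys` key used in the thinness check, and the one-journey threshold. Replace them with the definitions from your own NB01.

```python
from itertools import product

# Placeholder choices for illustration only.
ROUTES = [("SW1A 1AA", "WC2A 2AE")]
WINDOWS = {"peak": ["0800", "0830"], "off_peak": ["1100", "1130"]}


def build_request_plan(routes, windows, date):
    """Expand routes x windows x times into one dict of parameters per request."""
    plan = []
    for (origin, destination), (label, times) in product(routes, windows.items()):
        for time in times:
            plan.append(
                {
                    "origin": origin,
                    "destination": destination,
                    "window": label,
                    "date": date,
                    "time": time,
                    "timeIs": "Departing",
                }
            )
    return plan


def is_too_thin(payload, min_journeys=1):
    """Flag a response with fewer journeys than the question needs (assumed key: journeys)."""
    journeys = payload.get("journeys") or []
    return len(journeys) < min_journeys


plan = build_request_plan(ROUTES, WINDOWS, date="20260323")
print(len(plan))  # 4
```

Because the peak and off-peak definitions live in one dictionary, the two sentences you write in REPORT.md can describe exactly what the code collects.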

Tip

Your Scope and Confidence judgement from Lecture was made before seeing real API output. It is normal to revise your plan after Part 2.


Wrap-Up & Next Steps (10 min)

Note to class teachers: Close by checking evidence of work, not just code execution. Students should leave with saved artefacts and a short method rationale in REPORT.md.

Before You Leave:

  • one working API call in your own MP2 repo for at least one peak and one off-peak case
  • one saved raw JSON sample in data/raw/
  • one normalised output table saved in your repo
  • one short REPORT.md note documenting your current methodological choices
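
The checklist above can be turned into a quick self-check. This is a hypothetical helper, assuming raw files are `.json` files in `data/raw/`, normalised outputs live in `data/processed/`, and `REPORT.md` sits at the repo root; it cannot verify that your API call covers both a peak and an off-peak case.

```python
from pathlib import Path


def check_artefacts(repo_root="."):
    """Return a dict mapping each expected artefact to True or False."""
    root = Path(repo_root)
    report = root / "REPORT.md"
    return {
        "raw JSON in data/raw/": any((root / "data" / "raw").glob("*.json")),
        "normalised table in data/processed/": any(
            p.is_file() for p in (root / "data" / "processed").glob("*")
        ),
        "REPORT.md has content": report.is_file()
        and bool(report.read_text(encoding="utf-8").strip()),
    }


for item, ok in check_artefacts().items():
    print("OK     " if ok else "MISSING", item)
```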

Looking Ahead:

  • Revisit NB01 if the data you get forces scope changes
  • Keep iterating your MP2 pipeline before Week 08
  • Bring your current REPORT.md methodology notes to next contact hours if you want feedback

🆘 Getting Help

  • Slack: Post questions to #help channel
  • Office Hours: Book via StudentHub
  • Check staff availability on ✋ Contact Hours