DS105 2025-2026 Autumn Term

🖥️ Week 04 Lecture

From Loops to Vectorisation: NumPy and Pandas

Author: Dr Jon Cardoso-Silva

Published: 27 October 2025

πŸ“ Logistics

Time and Location: Thursday, 23 October 2025, 16:00 - 18:00, CLM 5.02

You’ve just submitted the 📝 Week 04 Practice, your first complete data science workflow using pure Python. Today’s lecture revisits the loops you had to write for that exercise. Then we’ll see how to scale that same logic to larger problems using NumPy and Pandas.

Note: Today’s lecture covered Sections 1-6 of the demonstration notebook (loading data, loops, NumPy vectorisation, Pandas DataFrames, and filtering). Sections 7-8 (custom functions and .apply()) will be introduced in Week 05.

📋 Preparation

  • Submit the 📝 W04 Practice by 12:00 (noon) today
  • Review your loop logic for detecting hot days and counting streaks
  • Reflect on where you used for loops and if/else statements
  • Individual feedback on your submission will arrive within a week

πŸ—£οΈ Lecture Overview

Part 1: W04 Practice Solution (40 min)

  • Walkthrough of the heatwave detection problem
  • Dynamic state management in loops: building lists, accumulating counts, tracking streaks
  • Scaling question: what happens to this loop logic with larger datasets or more complex conditions?
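As a reminder of what "state management" looks like in practice, here is a minimal sketch of a heatwave-style loop. It is not the official W04 Practice solution: the temperatures are made up, and the 28.0 °C threshold and all variable names are assumptions chosen for illustration.

```python
# Hypothetical daily maximum temperatures in °C (not the course dataset)
temps = [24.5, 29.1, 30.2, 27.8, 31.0, 32.4, 33.1, 26.0]
HOT_THRESHOLD = 28.0  # assumed cut-off for a "hot" day

hot_days = []        # list built up one item at a time
hot_count = 0        # running counter
current_streak = 0   # length of the hot streak we are in right now
longest_streak = 0   # longest streak seen so far

for day_index, temp in enumerate(temps):
    if temp > HOT_THRESHOLD:
        hot_days.append(day_index)
        hot_count += 1
        current_streak += 1
        # Update the record only when the current streak beats it
        if current_streak > longest_streak:
            longest_streak = current_streak
    else:
        # A cooler day breaks the streak, so the tracker resets
        current_streak = 0

print(hot_days)        # indices of the hot days
print(hot_count)       # how many hot days in total
print(longest_streak)  # longest run of consecutive hot days
```

Notice that the loop carries four pieces of state at once. That is manageable here, but every extra condition adds another variable to keep in sync, which is the scaling problem Part 2 picks up.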

Part 2: Vectorisation with NumPy and Pandas (45 min)

  • NumPy arrays for simple numerical operations
  • Nested np.where() for complex conditions (ugly!)
  • Pandas DataFrames with custom functions (cleaner)
  • Comparing all three approaches
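To make the NumPy bullets above concrete, here is a small sketch using made-up temperatures. The thresholds (28.0 and 32.0 °C) and the category labels are assumptions for illustration, not values from the lecture notebook.

```python
import numpy as np

# Hypothetical daily maximum temperatures in °C (not the course dataset)
temps = np.array([24.5, 29.1, 30.2, 27.8, 31.0, 32.4, 33.1, 26.0])

# Simple numerical operation: one comparison applied to every element
is_hot = temps > 28.0           # boolean array, same length as temps
hot_count = int(is_hot.sum())   # True counts as 1, False as 0

# Nested np.where for a three-way condition.
# It reads inside-out, which is why it gets hard to follow quickly.
labels = np.where(
    temps > 32.0,
    "extreme",
    np.where(temps > 28.0, "hot", "normal"),
)

print(hot_count)
print(labels.tolist())
```

Two categories need one `np.where()`, three need two nested calls, and each further category adds another level of nesting. That growth is what motivates moving to Pandas for more complex conditions.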

Part 3: Choosing Tools & Lab Preview (15 min)

  • Decision framework: loops, NumPy, or Pandas?
  • Tomorrow’s pair programming lab
  • Mini-Project 1 (released tomorrow)

📓 Lecture Materials

Today’s lecture uses Jupyter Notebooks instead of slides. All materials will be available in your Nuvolos workspace under the week04/ folder, or you can download them directly below.

Lecture Demonstration Notebook

This notebook walks through your W04 Practice solution and introduces NumPy/Pandas vectorisation. It includes Personal Reflection cells where you can take notes during the lecture.

Data Files

The lecture uses your W04 Practice data:

💡 Key Concepts

  • State management in loops: Building lists, counters, and tracking variables
  • Vectorisation: Operations on entire arrays at once
  • Tool selection: Loops, NumPy, or Pandas depending on the problem
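The same ideas carry over to a Pandas DataFrame. The sketch below assumes a tiny made-up table with hypothetical column names (`date`, `max_temp`) and an assumed 28.0 °C threshold; the real W04 Practice data will look different.

```python
import pandas as pd

# Hypothetical data: eight consecutive days with made-up temperatures
df = pd.DataFrame({
    "date": pd.date_range("2025-07-01", periods=8, freq="D"),
    "max_temp": [24.5, 29.1, 30.2, 27.8, 31.0, 32.4, 33.1, 26.0],
})

# Vectorised comparison: creates a whole new column in one step
df["is_hot"] = df["max_temp"] > 28.0

# Boolean filtering: keep only the rows where the mask is True
hot_df = df[df["is_hot"]]

# Summaries work on entire columns, with no explicit loop
mean_hot = hot_df["max_temp"].mean()

print(len(hot_df))
print(mean_hot)
```

Compared with a loop, there are no counters or lists to maintain by hand. The trade-off is that logic depending on what happened on previous rows, such as streak tracking, is less direct to express this way.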

🔖 Appendix

Post‑Lecture Actions

  • Review the lecture notebook and your personal notes
  • Skim the 💻 W04 Lab instructions (pair programming!)
  • Start exploring ✍️ Mini-Project 1 (released today or tomorrow)
  • Post questions in #help on Slack

Useful Links

Looking Ahead

  • Tomorrow: Pair programming lab (Pilot + Copilot roles)
  • Mini-Project 1: Released today, due Week 06 Thursday 8pm (20% of grade)
  • Next week (W05): Advanced data transformations and seaborn visualisation
  • Week 06: Reading Week – focus time for Mini-Project 1