DS105 2025-2026 Winter Term

🖥️ Week 05 Lecture

Summarising and Presenting Data

Author

Dr Jon Cardoso-Silva

Published

18 February 2026

πŸ“ Logistics

πŸ“Location: Thursday, 19 February 2026, 4-6 pm at CKK.LG.03

Last week you struggled with nested np.where() and boolean columns in the W04 Lab when classifying weather. Today, you’ll learn a much cleaner approach through custom functions, then discover how to summarise temporal data to reveal insights. Finally, you’ll learn to present summary tables using pandas Styler.

📋 Preparation

  • Complete the 💻 W04 Lab (pair programming!)
  • Continue working on ✍️ Mini-Project 1 (released last week)
  • The skills you learn today will directly support your Mini-Project 1 work

πŸ—£οΈ Lecture Overview

Part 1: From Loops to Functions (35 min)

  • Custom functions with def to replace nested np.where()
  • .apply() for processing entire datasets
  • πŸ† Quick challenge (if time permits)

Part 2: Temporal Data (25 min)

  • DateTime conversion and the .dt accessor
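
A minimal sketch of what Part 2 covers, assuming dates arrive as text in a column called date (a hypothetical name, as is the rest of the example data): pd.to_datetime() converts the column, after which the .dt accessor exposes the date components.

```python
import pandas as pd

# Hypothetical raw data where dates are stored as plain strings.
weather = pd.DataFrame({
    "date": ["2024-01-15", "2024-07-04", "2025-12-31"],
    "temp_max": [6.1, 27.3, 9.8],
})

# Convert text to a proper datetime column.
weather["date"] = pd.to_datetime(weather["date"])

# The .dt accessor extracts components from every row at once.
weather["year"] = weather["date"].dt.year
weather["month"] = weather["date"].dt.month
weather["day"] = weather["date"].dt.day
weather["weekday"] = weather["date"].dt.day_name()
print(weather)
```

Note that .dt is only available once the column has a datetime type; calling it on a column of strings raises an error.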

BREAK (10 min)

Part 3: GroupBy & Presenting Data with pandas Styler (40 min)

  • .groupby() aggregations by year, month, day
  • .style.format(), .background_gradient(), .bar(), .set_caption()
  • πŸ† Compression Challenge (if time permits)

📓 Lecture Materials

Today’s lecture uses slides with a demonstration notebook for live coding. All materials will be available in your Nuvolos workspace under the week05/ folder, or you can download them directly below.

🎬 Lecture Slides

The slides can be viewed on the course page or downloaded as a PDF.

Lecture Demonstration Notebook

This notebook accompanies the slides with code examples you can run yourself.

Data Files

The lecture uses an extended version of the W04 weather data (20 years of temperature and rainfall).

💡 Key Concepts

  • Custom functions: Extract complex logic into testable, reusable code
  • .apply() method: Process entire datasets without explicit loops
  • DateTime operations: Convert timestamps and extract date components
  • .groupby() aggregations: Summarise data by categories or time periods
  • pandas Styler: Format, colour, and caption summary tables for presentation
  • Narrative titles: State your findings rather than just describing the data

🔖 Appendix

Post‑Lecture Actions

  • Complete the 💻 W05 Lab (seaborn & matplotlib)
  • Start translating your loop-based NB02 code to vectorised operations
  • Consider using styled DataFrames for your NB03 insights
  • Post questions in #help on Slack

Looking Ahead

  • Tomorrow (W05 Friday): Matplotlib & seaborn lab
  • Mini-Project 1: Ongoing, keep collecting data and experimenting
  • Week 06: Reading Week, focus time for Mini-Project 1 completion
  • Deadline: W06 Thursday 8pm (submit via GitHub)