DS205 (2025/26 Winter Term)
Check this page every week, as some details may change.

Last updated: 03 March 2026, 18:00 GMT
Week 01 (19 Jan 2026 - 23 Jan 2026): Python Skills Refresher (Prerequisites & Recap Guide)
Lecture
Python Skills Refresher (Prerequisites & Recap Guide)
Slides
Introduction to the course, logistics, and environment setup. Working with the Open Food Facts API to refresh pandas skills and understand REST API fundamentals.
Lab
Hands-on Practice with Open Food Facts API
Roadmap Tutorial
Building foundational skills in API consumption and data processing with pandas.
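As a rough illustration of the kind of processing this lab involves, here is a minimal sketch of flattening a nested JSON payload into flat rows that are ready for tabular analysis. The payload is a hand-written, hypothetical sample: its field names are assumptions for illustration and not the actual Open Food Facts schema, and the `flatten` helper is likewise illustrative.

```python
import json

# Hypothetical sample payload; field names are assumptions for illustration,
# not the real Open Food Facts response schema.
SAMPLE_RESPONSE = """
{
  "products": [
    {"code": "001", "name": "Oat Drink", "nutriments": {"sugars_100g": 4.0, "salt_100g": 0.1}},
    {"code": "002", "name": "Granola", "nutriments": {"sugars_100g": 18.5}}
  ]
}
"""


def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into a single-level dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


payload = json.loads(SAMPLE_RESPONSE)
rows = [flatten(product) for product in payload["products"]]

for row in rows:
    print(row)
```

With pandas installed, `pandas.json_normalize(payload["products"])` should give a comparable flat table directly, with missing fields filled as `NaN`.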
Support
We love hearing from you! Don't hesitate to contact us for help.
Slack: Post questions in the #help channel. Jon will check messages daily.
Office Hours: Wednesdays, 2:00 pm - 5:00 pm (bookable via StudentHub).
We will announce additional support options and sessions as the term progresses.
Week 02 (26 Jan 2026 - 30 Jan 2026): Introduction to Web Scraping with Scrapy
Lecture
Introduction to Web Scraping with Scrapy
Slides | Live Demo
Understanding web document structure, XPath and CSS selectors. Introduction to the Scrapy framework for web scraping.
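To give a feel for how XPath addresses parts of a document, here is a minimal sketch using only the standard library's `xml.etree.ElementTree`, which supports a limited XPath subset. This is a stand-in and not Scrapy itself; the markup and class names below are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed listing markup; class names are made up for illustration.
LISTING_HTML = """
<html>
  <body>
    <ul id="products">
      <li class="product"><a href="/p/1">Oat Drink</a><span class="price">1.80</span></li>
      <li class="product"><a href="/p/2">Granola</a><span class="price">3.25</span></li>
    </ul>
  </body>
</html>
"""

root = ET.fromstring(LISTING_HTML)

# XPath: every <li> with class="product", anywhere under the root.
items = root.findall(".//li[@class='product']")

products = []
for item in items:
    link = item.find("./a")                      # direct child <a>
    price = item.find("./span[@class='price']")  # child <span> filtered by attribute
    products.append(
        {"name": link.text, "url": link.get("href"), "price": float(price.text)}
    )

print(products)
```

In Scrapy, comparable queries are written against the response object, for example `response.xpath("//li[@class='product']/a/text()").getall()` or, with CSS selectors, `response.css("li.product a::text").getall()`.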
Lab
Practising XPath and CSS Selectors with Scrapy (UK Supermarket)
Roadmap Tutorial
Hands-on scraping practice with live product data from the UK Supermarket e-commerce website. Goal: scrape a diverse set of information from a single webpage (first a listing page, then a single product page).
Note: in the W02 Lab and W03 Lab, your goal is to mirror what you saw in the previous day's lecture, this time on a different website. Rather than collecting Wikipedia pages (as I will be doing), you will collect product data from UK Supermarket. Do this in the dedicated GitHub repository for your Problem Set 1. That's right, you are already building your Problem Set 1 from Week 02!
Release
Problem Set 1 instructions will be released this week.
Due: Thursday, 26 February 2026, 8pm UK time (Week 06). Worth 20% of final grade.
Problem Set 1 is designed to be built incrementally throughout the term. Starting from Week 02, each lab session will guide you through completing a specific part of the project. You'll work on structured lab activities during class time, with take-home components to prepare for the following week. Each week builds on the previous one, so stay on track with the weekly lab work.
Week 03 (02 Feb 2026 - 06 Feb 2026): Crawling with Scrapy and Dynamic Content
Lecture
Crawling with Scrapy and Introduction to dynamic content with Selenium
Live Demo
Building scalable scrapers with Scrapy pipelines and handling pagination. Introducing Selenium for dynamic content scraping when Scrapy alone is not enough.
Lab
Building a Complete Scrapy Spider for UK Supermarket
Roadmap Tutorial
Implementing a full scraping pipeline with data cleaning and storage. Goal: navigate and scrape a whole website rather than a single page. It's up to you whether to use Scrapy alone or in combination with Selenium; it depends on how the website (UK Supermarket) is structured.
Week 04 (09 Feb 2026 - 13 Feb 2026): Building Collaborative APIs with FastAPI
Lecture
Building Collaborative APIs with FastAPI
Slides
Introduction to FastAPI, Pydantic v2, and building APIs for data sharing. Collaborative development patterns.
Lab
FastAPI Development and Docker Introduction
Roadmap Tutorial
Building APIs and using Docker to resolve environment conflicts.
Practice
Problem Set 1: Scrapy spider + FastAPI (peer hand-off)
Collaborative project where you build your own web scraper but write the API for another student's project.
Important: The web scraping component of Problem Set 1 (which you've been building in Weeks 02-03) will be graded as a formative exercise. This portion will be assessed directly by Jon.
Week 05 (16 Feb 2026 - 20 Feb 2026): Data Pipelines and API Design
Lecture
Data Pipelines and API Design
Live Demo | Super Tech Support
Data pipeline architecture (scraped, enriched, API layers), pipeline orchestration with click, and improving API schemas with Pydantic Field constraints and FastAPI Query parameters.
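The three layers named above can be sketched in plain Python. This is an illustrative stand-in using only the standard library, not the course implementation: the record fields, the `enrich` rule, the `ProductOut` class, and the `query_products` function are all hypothetical, and a dataclass with manual checks is swapped in where the lecture uses Pydantic and FastAPI.

```python
from dataclasses import dataclass

# Layer 1: records as they might come out of a scraper (hypothetical fields).
SCRAPED = [
    {"name": " Oat Drink ", "price": "1.80", "category": "drinks"},
    {"name": "Granola", "price": "3.25", "category": "cereal"},
]


def enrich(record):
    """Layer 2: clean types and add a derived field."""
    price = float(record["price"])
    return {
        "name": record["name"].strip(),
        "price": price,
        "category": record["category"],
        "price_band": "budget" if price < 2.0 else "standard",
    }


@dataclass
class ProductOut:
    """Layer 3: the shape the API exposes, with simple constraints."""
    name: str
    price: float
    price_band: str

    def __post_init__(self):
        if not self.name:
            raise ValueError("name must not be empty")
        if self.price <= 0:
            raise ValueError("price must be greater than 0")


def query_products(enriched, max_price=None):
    """Stand-in for an API endpoint with an optional query parameter."""
    selected = [r for r in enriched if max_price is None or r["price"] <= max_price]
    return [ProductOut(r["name"], r["price"], r["price_band"]) for r in selected]


enriched = [enrich(r) for r in SCRAPED]
print(query_products(enriched, max_price=2.0))
```

In the course stack, the checks in `__post_init__` correspond roughly to Pydantic constraints such as `Field(min_length=1)` and `Field(gt=0)`, and `max_price` to a FastAPI `Query` parameter.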
Lab
Super Tech Support: Problem Set 1 Working Session
Super Tech Support
Dedicated working time for Problem Set 1. Bring your code and your partner's repository; Jon will circulate to help with API design, enrichment logic, and collaboration workflow.
Week 06 (23 Feb 2026 - 27 Feb 2026): Reading Week
Reading Week
No lecture or lab this week.
Use this time to catch up on coursework and prepare for the second half of the term.
We will announce additional support sessions this week to help you finalise Problem Set 1.
Summative
Problem Set 1 Due: Thursday, 26 February 2026, 8pm UK time
Worth 20% of final grade. Submission via GitHub. Includes peer hand-off component.
Week 07 (02 Mar 2026 - 06 Mar 2026): From Food to Climate: Pipelines, Automation, and TPI
Lecture
From Food to Climate: Pipelines, Automation, and TPI
Live Demo | Guest Speakers
Naming what you built in W01-W05 (ETL/ELT vocabulary, pipeline design principles), automating pipelines with GitHub Actions, and debugging with VS Code. Guest speakers: Ruikai Liu (Jorb.ai, former DS205 student) on system design and vibe coding, and the TPI Centre team on their climate assessment workflows and the CLEAR RAG system.
Lab
Building a Click Pipeline and Wiring It to GitHub Actions
Roadmap Tutorial
Build a skeleton Click CLI that defines your pipeline stages for Problem Set 2, run it locally, then create a GitHub Actions workflow that runs the same commands on a remote machine. Read the PS2 brief, browse TPI corporate pages, and choose your sector and companies.
Release
Problem Set 2 released this week.
Build a RAG pipeline using TPI Centre corporate disclosure data. Choose a sector (Food Producers, Electrical Utilities, or Diversified Mining) and at least two companies. Worth 40% of final grade.
Due: Thursday, 26 March 2026, 8pm UK time (Week 10).
Week 08 (09 Mar 2026 - 13 Mar 2026): PDF Extraction and Introduction to Embeddings
Lecture
PDF Extraction and Introduction to Embeddings
Slides | Live Demo
Extracting text from PDFs with unstructured, handling tables and mixed layouts. Introduction to word embeddings (Word2Vec) and transformer-based sentence embeddings.
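To make the idea of comparing embeddings concrete, here is a minimal sketch of cosine similarity, the usual way to measure how close two embedding vectors are. The three-dimensional vectors are made-up toy values standing in for real sentence embeddings, which typically have hundreds of dimensions.

```python
import math

# Toy 3-dimensional vectors standing in for sentence embeddings.
# The values are made up for illustration.
EMBEDDINGS = {
    "emissions target": [0.9, 0.1, 0.0],
    "carbon reduction goal": [0.8, 0.2, 0.1],
    "board remuneration": [0.1, 0.9, 0.3],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


query = EMBEDDINGS["emissions target"]
for text, vector in EMBEDDINGS.items():
    print(f"{text!r}: {cosine_similarity(query, vector):.3f}")
```

With sentence-transformers, vectors like these would come from a model's `encode` method instead of being written by hand.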
Lab
Extracting and Embedding TPI Corporate Disclosures
Roadmap Tutorial
Extract text from your chosen companies' PDFs using unstructured. Inspect what comes out. Start generating embeddings with sentence-transformers. Directly applicable to Problem Set 2.
Week 09 (16 Mar 2026 - 20 Mar 2026): Chunking and Vector Search
Lecture
Chunking Strategies and Vector Search with ChromaDB
Slides | Live Demo
How to split extracted text into chunks suitable for retrieval. Storing and querying embeddings with ChromaDB. Evaluating retrieval quality.
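One common chunking strategy is a fixed-size window with overlap, so that a sentence cut at a boundary still appears whole in a neighbouring chunk. The sketch below is one possible approach under simple assumptions (whitespace-separated words, sizes counted in words); the function name, the default sizes, and the sample sentence are all illustrative.

```python
def chunk_words(text, chunk_size=8, overlap=2):
    """Split text into word windows of `chunk_size`, each sharing
    `overlap` words with the previous window."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


# Made-up sentence for illustration only.
text = (
    "The company aims to cut Scope 1 and 2 emissions by 2030 and reports "
    "progress annually in its sustainability disclosure document"
)
for chunk in chunk_words(text):
    print(chunk)
```

In the lab, chunks like these would be embedded and stored in a ChromaDB collection instead of being kept in a list.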
Lab
Building a Search System for Climate Disclosures
Roadmap Tutorial
Chunk your extracted text, store embeddings in ChromaDB, and build retrieval that finds relevant passages for your Problem Set 2 driving questions.
Week 10 (23 Mar 2026 - 27 Mar 2026): Retrieval-Augmented Generation
Lecture
Retrieval-Augmented Generation with Open-Source Models
Slides | Super Tech Support
Connecting retrieval to generation: prompt construction, using HuggingFace models for question answering, and evaluating RAG pipeline outputs. Dedicated support time for Problem Set 2 completion.
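Prompt construction can be sketched without any model at all: the retrieved passages are numbered, placed in a context block, and followed by the question. The template wording, the `build_prompt` function, and the sample passages below are assumptions for illustration, not a prescribed format.

```python
def build_prompt(question, passages, max_passages=3):
    """Assemble a grounded prompt from a question and retrieved passages."""
    selected = passages[:max_passages]
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(selected, start=1))
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Made-up passages standing in for retrieval results.
retrieved = [
    "The company targets a 40% cut in operational emissions by 2030.",
    "Progress against targets is reported in the annual sustainability report.",
]
print(build_prompt("What is the company's 2030 emissions target?", retrieved))
```

The resulting string is what would be passed to a generation model, for instance through a HuggingFace `transformers` text-generation pipeline. Numbering the passages makes it easier to check which source an answer drew on.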
Lab
RAG Pipeline Completion Workshop
Roadmap Tutorial | Super Tech Support
Add the generation step to your pipeline. Evaluate results against the driving questions. Polish documentation. Dedicated support time for Problem Set 2 submission.
Summative
Problem Set 2 Due: Thursday, 26 March 2026, 8pm UK time
RAG pipeline for TPI Centre Carbon Performance data. Worth 40% of final grade.
Week 11 (30 Mar 2026 - 03 Apr 2026): Final Project Launch: Capstone Projects with TPI
Lecture
Final Project Launch: Capstone Projects with TPI
Slides
Final project requirements and pre-defined capstone topics. Building on Problem Set 2 skills at group scale.
Lab
Final Project Planning and Q&A
Super Tech Support
Form project groups, discuss capstone topics, and plan your approach.
Final Project
Final Project: Group capstone project with TPI Centre
Due in Spring Term, Thursday 21 May 2026, 8pm. Group work worth 40% of final grade.