Week 02 Lecture
From APIs to Web Scraping
Last Updated: 25 January 2026
This lecture builds on Week 01 and moves from API requests to web scraping. We will focus on how the Web works, how HTML is structured, and how to choose between Scrapy and Selenium.
Session Details
- Date: Monday, 26 January 2026
- Time: 16:00 - 18:00
- Location: SAL.G.03
Preparation
- Review the Week 01 Lecture if the Web concepts feel rusty.
- Download the lecture notebook so you can follow the live demo.
Lecture Structure
We will cover:
- How the Web works: Internet vs Web, request and response, and why pages load in parts.
- HTML, CSS, JavaScript: What these layers do and how they shape scraping.
- Scrapy fundamentals: Why Scrapy is the default tool and how it differs from `requests`.
- Static vs dynamic content: Why some pages return empty HTML.
- Selenium on Nuvolos: Why setup matters in the two-container environment.
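To make the static vs dynamic point concrete, here is a minimal sketch using only Python's standard library. The two HTML strings are hand-written stand-ins (not real pages), and `TitleCollector` and `extract_titles` are hypothetical helper names chosen for illustration. The idea is that a plain HTTP client only sees the HTML the server sends; if the data is filled in later by JavaScript, the same extraction logic comes back empty.

```python
from html.parser import HTMLParser

# Hypothetical server responses, written by hand for illustration.
STATIC_HTML = """
<html><body>
  <ul id="books">
    <li class="title">Dune</li>
    <li class="title">Emma</li>
  </ul>
</body></html>
"""

DYNAMIC_HTML = """
<html><body>
  <ul id="books"></ul>
  <script>
    // In a browser, this script would fetch the data and fill the list.
    loadBooks();
  </script>
</body></html>
"""


class TitleCollector(HTMLParser):
    """Collect the text inside every <li class="title"> element."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._inside_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "li" and ("class", "title") in attrs:
            self._inside_title = True

    def handle_endtag(self, tag):
        if tag == "li":
            self._inside_title = False

    def handle_data(self, data):
        if self._inside_title and data.strip():
            self.titles.append(data.strip())


def extract_titles(html):
    """Return the title strings found in the raw HTML, without running any JavaScript."""
    parser = TitleCollector()
    parser.feed(html)
    parser.close()
    return parser.titles


static_titles = extract_titles(STATIC_HTML)
dynamic_titles = extract_titles(DYNAMIC_HTML)

print("static page: ", static_titles)
print("dynamic page:", dynamic_titles)
```

The second result is empty because nothing executed the script. That gap is what a browser-driving tool such as Selenium is meant to close.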
Lecture Slides
Use keyboard arrows to navigate. Select the slides below or view fullscreen.
Lecture Notebook
Download the lecture notebook:
Brief Practical Demonstration
During the lecture, I will:
- Inspect a Wikipedia page and build selectors step by step.
- Use `read_html` to pull a table, then discuss when that fails.
- Move the logic into a minimal Scrapy spider.
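The sketch below illustrates what "building selectors step by step" can look like. It is not the lecture demo: it swaps in Python's standard-library `xml.etree.ElementTree`, whose limited XPath subset is enough to show the idea without installing anything. The page snippet, class names, and table contents are invented for illustration. Real Wikipedia HTML is messier and usually not well-formed XML, which is one reason to reach for Scrapy's `response.xpath` / `response.css` or pandas' `read_html` in practice.

```python
import xml.etree.ElementTree as ET

# Hand-written, well-formed stand-in for a Wikipedia-style page (invented data).
PAGE = """
<html><body>
  <table class="infobox"><tr><td>Sidebar</td></tr></table>
  <table class="wikitable">
    <tr><th>City</th><th>Country</th></tr>
    <tr><td>Lisbon</td><td>Portugal</td></tr>
    <tr><td>Vienna</td><td>Austria</td></tr>
  </table>
</body></html>
""".strip()

root = ET.fromstring(PAGE)

# Step 1: start broad. This matches every table on the page.
all_tables = root.findall(".//table")

# Step 2: narrow by class to the one table we actually want.
target = root.find(".//table[@class='wikitable']")

# Step 3: read the header cells, then each row that contains data cells.
header = [th.text for th in target.findall("./tr/th")]
rows = [
    [td.text for td in tr.findall("./td")]
    for tr in target.findall("./tr")
    if tr.find("./td") is not None  # skip the header row, which has only <th>
]

# Step 4: pair each row with the header to get one record per row.
records = [dict(zip(header, row)) for row in rows]

print(len(all_tables), "tables found")
for record in records:
    print(record)
```

The same progression (broad match, narrow by attribute, then iterate over rows) carries over to a spider's `parse` method, where each record would typically be yielded instead of printed.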
Final Thoughts
Tomorrow's W02 Lab focuses on Selenium setup and a dynamic scraping task. Problem Set 1 follows after the lab.
Session Recording
Typically, the recordings are made available on Moodle in the afternoon. I'll update this section once the recording is available.
Recommended Reading

- Mitchell, R. (2024). Web Scraping with Python (3rd ed.). O'Reilly Media, Inc. Chapters 1-3 and 8.

- Berners-Lee, T. (with Witt, S.). (2025). This is for everyone. Macmillan.