🗣️ Week 04 Lecture
Introduction to Web Scraping

Last Updated: 9 February 2025, 23:30.
Welcome to Week 04 of DS205, where we will explore web scraping techniques and their ethical implications.
📍 Session Details
- Date: Monday, 10 February 2025
- Time: 10:00 am - 12:00 pm
- Location: KSW.1.01
🗣️ Lecture Content
1. Introduction to Web Scraping
- What is web scraping and why is it important?
- Real-world applications:
  - Market research
  - Academic research
  - Financial analysis
  - Job market analysis
  - News monitoring
2. How Websites are Structured
- HTML fundamentals
- CSS basics and styling
- The Document Object Model (DOM)
- Using browser developer tools for inspection
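To make the idea of the DOM concrete, here is a minimal sketch that turns a small HTML string into a tree of nested nodes, similar to what the Elements panel in browser developer tools displays. It uses only Python's standard-library `html.parser`. The sample page and the `Node`, `TreeBuilder`, and `outline` names are invented for illustration, and a real browser builds a far richer DOM than this.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so we must not descend into them.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class Node:
    """One element in our simplified tree (hypothetical helper class)."""

    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []
        self.text = ""


class TreeBuilder(HTMLParser):
    """Builds a tree of Node objects while the parser walks the markup."""

    def __init__(self):
        super().__init__()
        self.root = Node("document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, parent=self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node  # later content is nested inside this element

    def handle_endtag(self, tag):
        # Climb towards the root until we find the element being closed.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        self.current.text += data.strip()


def outline(node, depth=0):
    """Return the tree as indented lines, one per element."""
    label = node.tag
    if node.attrs.get("class"):
        label += "." + node.attrs["class"]
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


SAMPLE_HTML = """
<html><body>
<h1>Books</h1>
<ul class="catalogue">
<li class="book">Dune</li>
<li class="book">Emma</li>
</ul>
</body></html>
"""

builder = TreeBuilder()
builder.feed(SAMPLE_HTML)
tree_lines = outline(builder.root)
print("\n".join(tree_lines))
```

The indentation in the printed outline mirrors the parent-child nesting of the markup: the two `li.book` items sit under `ul.catalogue`, which sits under `body`.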
3. Selecting Elements
- HTML elements and attributes
- CSS selectors
- Live demonstration of element selection
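Element selection can also be sketched in code. The example below is a stand-in rather than the lecture's own demonstration: it uses the limited XPath support in the standard-library `xml.etree.ElementTree` module instead of CSS selectors, and each query is annotated with the roughly equivalent CSS selector. The sample page is invented and deliberately well-formed, because ElementTree expects valid XML and most real web pages are not. Dedicated scraping libraries accept CSS selectors directly, for example the `select()` method in Beautiful Soup.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment used only for this illustration.
PAGE = """
<html>
  <body>
    <div id="listing">
      <p class="title">Dune</p>
      <p class="price">9.99</p>
      <p class="title">Emma</p>
      <p class="price">7.50</p>
      <a href="/next">Next page</a>
    </div>
  </body>
</html>
"""

root = ET.fromstring(PAGE.strip())

# CSS: p.title  (here the class attribute must equal "title" exactly,
# whereas a CSS class selector also matches elements with several classes)
titles = [p.text for p in root.findall(".//p[@class='title']")]

# CSS: p.price
prices = [float(p.text) for p in root.findall(".//p[@class='price']")]

# CSS: a[href]  (any link that carries an href attribute)
links = [a.get("href") for a in root.findall(".//a[@href]")]

# CSS: div#listing  (select a single element by its id)
listing = root.find(".//div[@id='listing']")

print(titles)
print(prices)
print(links)
print(listing.tag, len(listing))
```

Each selection combines an element name with an attribute test, which is the same pairing of "HTML elements and attributes" listed above.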
4. Ethical and Industry Perspectives
- Legal framework (GDPR, CCPA, DMCA)
- Technical controls and best practices
- Industry cases:
  - News Publishers v. AI Companies
  - Small Websites v. AI Companies
- Industry responses to AI-powered data collection
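One commonly used technical control is a site's `robots.txt` file, which states which paths automated clients are asked not to fetch. The outline above does not name it, so treat this as an assumed example. The sketch below checks a URL against a set of rules using the standard-library `urllib.robotparser`. The rules, the `ds205-demo-bot` user agent, and the `example.com` URLs are all made up, and the rules are parsed from a string so that no network request is made.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real scraper would fetch this
# from the root of the site it intends to visit.
ROBOTS_TXT = """
User-agent: *
Disallow: /private/
"""

USER_AGENT = "ds205-demo-bot"  # invented name for this illustration

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.strip().splitlines())

# Paths under /private/ are disallowed for every user agent.
blocked = parser.can_fetch(USER_AGENT, "https://example.com/private/report.html")

# Everything else is allowed because no rule forbids it.
allowed = parser.can_fetch(USER_AGENT, "https://example.com/articles/1")

print("private page allowed?", blocked)
print("article page allowed?", allowed)
```

Respecting `robots.txt` is a convention rather than an enforcement mechanism, so it complements, and does not replace, the legal considerations listed above.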
5. Looking Ahead
- Introduction to Python web scraping libraries
- Preview of the lab session
- Overview of the formative exercise