π» Week 08 Lab
SQL Basics: From SELECT to JOIN
π Logistics
Time and Location: Friday, 21 November 2025. Check your timetable for your precise class time and location.
Todayβs lab focuses on learning SQL by querying a pre-built SQLite database. Youβll progressively build SQL skills, from simple SELECT statements to complex aggregations, and recreate the visualisation from W07 Lab using SQL instead of pandas.
π Preparation
Before coming to lab, ensure you have:
- β Attended the π₯οΈ W08 Lecture
- β Reviewed database concepts and SQL basics from the lecture
- β Completed the W07 Lab (youβll see how SQL achieves the same results as pandas)
π― Learning Strategy
This lab teaches SQL through progressive skill building. Each task adds ONE new SQL concept. By the end, youβll independently write a query combining SELECT, WHERE, COUNT DISTINCT, GROUP BY, ORDER BY, and LIMIT.
How to succeed:
- Do not skip skills - each builds on the previous one
- Write pandas translations by yourself and avoid over-relying on AI
- Ask questions immediately when SQL syntax confuses you
- Notice when SQL feels intuitive (SELECT, WHERE) vs when it requires more thought (GROUP BY, COUNT DISTINCT, JOIN)
Note to class teachers: Monitor student understanding at these critical points:
- Skill 3 (COUNT) vs Skill 4 (COUNT DISTINCT) - is the distinction between COUNT and COUNT DISTINCT clear to them?
- Skill 5 (GROUP BY) - is the notion of split-apply-combine clear (regardless of SQL)?
- Skill 6 (JOIN) - does the concept of JOINing tables make sense?
- Part 3 synthesis - can they combine pieces without help/over-reliance on AI?
π£οΈ Roadmap
Part 1: Database Schema Exploration (15 min)
Explore the OpenSanctions SQLite database structure. Use SQLite Viewer or PRAGMA queries to identify tables, examine schemas, and understand relationships between tables.
Part 2: Progressive SQL Skill Building (50 min)
Build SQL skills step by step:
- Skill 1: SELECT columns (5 min)
- Skill 2: WHERE filtering (8 min)
- Skill 3: COUNT aggregation (8 min)
- Skill 4: COUNT DISTINCT (8 min)
- Skill 5: GROUP BY (10 min)
- Skill 6: JOIN tables (10 min)
After each SQL skill, write the pandas equivalent to verify understanding.
Learning Objective: By Part 3, independently write a SQL query that combines SELECT, WHERE, COUNT DISTINCT, GROUP BY, ORDER BY, and LIMIT.
Part 3: Creating plot_df and Visualization (15 min)
Recreate the W07 Lab visualization (top 10 sanctioning countries) using SQL. Compare SQL approach with pandas to see how both achieve the same result.
Part 4: Reflection (classroom discussion)
Discuss when to use SQL vs pandas for data analysis, and whether SQLite would help your βοΈ Mini Project 2 work.
π Getting Started
Option 1: Using Nuvolos (Recommended)
On
Nuvolos, navigate to the Week 08 materials in your week08 folder. The notebook and database file will be there:
week08/
βββ data/
β βββ opensanctions.db
βββ W08-NB02-Lab-SQL-Basics.ipynb
Open the W08-NB02-Lab-SQL-Basics.ipynb file in VSCode to begin.
If you followed my guidance in W03 Extra advice, you can add this notebook to your my-ds105a-notes repository (for example under week08/).
Option 2: Download Lab Files Directly
Download the lab files to work on your own machine:
π½ Data Specification
Weβre using a pre-built SQLite database containing the same OpenSanctions data from W07 Lab, but stored in normalised tables instead of nested JSON.
Database structure:
targetstable: Sanctioned individuals/entities withtarget_id,caption,country,first_seen,last_seensanctionstable: Sanction events withsanction_id,target_id,country,status,start_date,end_datetarget_aliasestable: Alternative names for targets withtarget_idandalias
Key points:
- Same data as W07 Lab, different structure (normalised vs nested)
- ~4,000 target records with associated sanctions
- Uses foreign keys to connect tables (e.g.,
target_idlinkstargetsandsanctions)
π§ Optional Extension
If you want to see how this database was built from the nested JSON, explore:
W08-NB03-Lab-OpenSanctions-SQLite.ipynb - Shows the complete process of transforming nested JSON into normalised tables and creating the SQLite database
This is useful if you need extra practice with pd.json_normalize() for βοΈ Mini Project 2.
π References
Useful references for lab techniques:
The
sqlite3module for connecting to SQLite databasesThe
pandas.read_sql()function to execute SQL queries and return DataFramesThe SQL SELECT statement with WHERE, GROUP BY, ORDER BY, and LIMIT clauses
SQL COUNT and COUNT DISTINCT for aggregation
