erDiagram
    ARTISTS ||--o{ ALBUMS : creates
    ALBUMS ||--o{ TRACKS : contains
    ARTISTS {
        string artist_id PK
        string name
        int popularity
        int followers
        string genres
    }
    ALBUMS {
        string album_id PK
        string artist_id FK
        string name
        string release_date
        int total_tracks
    }
    TRACKS {
        string track_id PK
        string album_id FK
        string name
        int track_number
        int duration_ms
    }
Week 09 Lecture
Designing a Good Database Schema & Best Practices for Data Visualisation

Time and Location: Thursday, 20 March 2025, 4-6 pm at MAR.1.04
This week's lecture builds directly on our previous introduction to databases and focuses on designing effective database schemas and on best practices for data visualisation. Both skills apply directly to your Mini-Project 2, where you will create structured databases and communicate insights through visualisations.
Preparation
Before the lecture
- Review the database concepts covered in Week 08
- Ensure you can access Nuvolos (or that your own computer is fully set up)
- Bring your laptop to participate in the interactive demonstrations
- Be prepared to adapt the code you will see live in the lecture to your own data
Lecture Material
The lecture will be structured in two main parts:
Part 1: Designing a Good Database Schema
Building on last week's introduction to databases, we will explore how to design effective schemas that properly organise your data:
Database Fundamentals for Data Scientists
Understanding tables as collections of related data with unique identifiers that connect to each other
Tidy Data Principles for Databases
Applying the same principles we use for DataFrames to create well-structured database tables
Creating Efficient Database Tables
Converting your Reddit data into a database that maintains relationships between posts, comments, and subreddits
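As a rough illustration of the three ideas above, the sketch below splits nested records into one flat table per entity, with identifiers linking the rows. The record shape, the field names (`post_id`, `comment_id`, `subreddit_name`, and so on) and the helper `split_into_tables` are all hypothetical; your actual Reddit data will likely look different.

```python
# Hypothetical nested records; the real Reddit data may be shaped differently.
raw_posts = [
    {
        "post_id": "p1",
        "subreddit": "datascience",
        "title": "Schema design tips",
        "comments": [
            {"comment_id": "c1", "body": "Use foreign keys."},
            {"comment_id": "c2", "body": "Keep one table per entity."},
        ],
    },
    {
        "post_id": "p2",
        "subreddit": "python",
        "title": "sqlite3 or pandas?",
        "comments": [{"comment_id": "c3", "body": "Both work together."}],
    },
]


def split_into_tables(records):
    """Return (subreddits, posts, comments), each a list of flat row dicts."""
    # One row per distinct subreddit
    names = sorted({record["subreddit"] for record in records})
    subreddits = [{"subreddit_name": name} for name in names]

    posts, comments = [], []
    for record in records:
        # One row per post, pointing at its subreddit
        posts.append(
            {
                "post_id": record["post_id"],
                "subreddit_name": record["subreddit"],
                "title": record["title"],
            }
        )
        for comment in record["comments"]:
            # One row per comment, carrying the parent post's identifier
            comments.append(
                {
                    "comment_id": comment["comment_id"],
                    "post_id": record["post_id"],
                    "body": comment["body"],
                }
            )
    return subreddits, posts, comments


subreddits, posts, comments = split_into_tables(raw_posts)
```

Each resulting list holds one kind of entity with one row per observation, which is the same tidy-data idea used for DataFrames; the shared identifiers are what later become primary and foreign keys.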
Here's how the data structure might look for our Spotify example (which parallels what you'll need for your Reddit data):
This diagram illustrates the relationships between our three main entities. Your Reddit database will follow a similar structure, with subreddits, posts, and comments instead.
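One possible way to express the Spotify entities as tables is sketched below using Python's built-in `sqlite3` module. The table and column names come from the diagram; the mapping of `string` to `TEXT` and `int` to `INTEGER`, the in-memory database, and the sample rows are assumptions made purely for illustration, not the lecture's actual code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway database for the sketch
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

conn.executescript(
    """
    CREATE TABLE artists (
        artist_id  TEXT PRIMARY KEY,
        name       TEXT,
        popularity INTEGER,
        followers  INTEGER,
        genres     TEXT
    );
    CREATE TABLE albums (
        album_id     TEXT PRIMARY KEY,
        artist_id    TEXT REFERENCES artists(artist_id),
        name         TEXT,
        release_date TEXT,
        total_tracks INTEGER
    );
    CREATE TABLE tracks (
        track_id     TEXT PRIMARY KEY,
        album_id     TEXT REFERENCES albums(album_id),
        name         TEXT,
        track_number INTEGER,
        duration_ms  INTEGER
    );
    """
)

# Made-up sample rows, only to exercise the relationships
conn.execute("INSERT INTO artists VALUES ('a1', 'Example Artist', 50, 1000, 'indie')")
conn.execute("INSERT INTO albums VALUES ('al1', 'a1', 'Example Album', '2020-01-01', 2)")
conn.executemany(
    "INSERT INTO tracks VALUES (?, ?, ?, ?, ?)",
    [("t1", "al1", "Track One", 1, 180000), ("t2", "al1", "Track Two", 2, 200000)],
)

# Follow the keys from artists through albums to tracks
rows = conn.execute(
    """
    SELECT ar.name, COUNT(t.track_id), SUM(t.duration_ms)
    FROM artists AS ar
    JOIN albums AS al ON al.artist_id = ar.artist_id
    JOIN tracks AS t ON t.album_id = al.album_id
    GROUP BY ar.artist_id
    """
).fetchall()
```

With foreign keys switched on, inserting an album whose `artist_id` has no matching artist raises `sqlite3.IntegrityError`, so the database itself protects the relationships. A Reddit version would swap in subreddits, posts, and comments with their own identifiers.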
Part 2: Best Practices for Data Visualisation
In the second hour, we will analyse effective visualisations from the Office for National Statistics (ONS) and learn how to apply these principles to your own work:
Analysing the ONS Educational Attainment Article (link)
Learning from real-world examples of effective data storytelling in an official publication.
Principles of Effective Data Visualisation
Understanding how to maximise information while minimising visual clutter.
Whose responsibility is it to extract insights from data?
Explore the dashboard versus analysis approach we have in this course and how it is evidenced in the ONS article.
Lecture Notebooks
Download the notebooks and files for today's lecture:
MINI-PROJECT 2 CONNECTION:
Today's content is directly applicable to your Mini-Project 2:
- The database schema principles will help you organise your Reddit data effectively
- The visualisation best practices will improve how you communicate your findings
- Both elements are essential for creating a polished, professional final project
Post-Lecture Actions
- Review the Jupyter notebooks from today's lecture
- Apply the database schema principles to your Mini-Project 2
- Revise any visualisations you have created using the best practices discussed
- Finalise your Mini-Project 2 submission
- Use the #help channel on Slack if you need clarification or assistance