🗣️ Week 11 Lecture

Project Management and Technical Communication

Author

Dr Jon Cardoso-Silva

Published

03 April 2025

🥅 Learning Goals
By the end of this lecture, you should be able to:

  • Identify and fix common data visualisation issues
  • Create effective project websites with embedded media
  • Manage your project using GitHub’s tools
  • Prepare for your group project pitch presentation

📍 Time and Location: Thursday, 3 April 2025 from 4-6 pm at MAR.1.04

📋 Preparation

  • Bring your laptop with GitHub access
  • Review your team’s GitHub repository and project website
  • Think about your project pitch for tomorrow’s presentations

Hour 1: Tips for your final project due in May

Avoiding common visualisation mistakes

Let’s address some common issues I’ve observed in your Mini Project 2 submissions:

1. Aesthetic Issues

  • Font sizes too small: Always ensure text is readable at a glance
  • Overcrowded plots: Focus on one clear message per visualisation
  • Improper labelling: Every axis needs a clear label with units
  • Thoughtless colour choices: Use colour with purpose, not just for decoration
  • Bubble charts with tiny dots: Size elements appropriately for the message
  • Don’t just fit a line to everything: Not every relationship is linear!

🔗 LINK: Bad data visualisation examples

2. The “Average” Problem

Many of you love talking about “averages” in your plots, but which average do you mean?

  • Mean: Sum of values divided by count (strongly affected by outliers)
  • Median: Middle value when sorted (robust to outliers)
  • Mode: Most common value (useful for categorical data)
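
To see how far the three measures can drift apart, here is a minimal sketch using Python's built-in `statistics` module. The `hours` list is made-up data for illustration, with one deliberate outlier.

```python
from statistics import mean, median, mode

# Hypothetical weekly study hours for seven students.
# The last value (40) is a deliberate outlier.
hours = [4, 5, 5, 6, 7, 8, 40]

print(f"mean:   {mean(hours):.1f}")  # pulled upwards by the outlier
print(f"median: {median(hours)}")    # middle value of the sorted list
print(f"mode:   {mode(hours)}")      # most frequent value
```

The single outlier drags the mean well above what most students in this made-up sample report, while the median and mode barely notice it. If you write "average" on a plot, say which one you used.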

🔗 LINK: How to measure typical values

Key takeaway: Don’t just show averages. Show the variation in your data!

Instead of simple bar charts showing only the mean, consider:

  • Histograms (show full distribution)
  • Box plots (show quartiles, median, and outliers)
  • Violin plots (show distribution shape)
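
Why does this matter? A rough sketch, using invented scores for two groups that share the same mean: a bar chart of means would draw two identical bars, while the five-number summary (the values a box plot draws) shows how different the groups are. The helper `five_number_summary` is a hypothetical name, and the "inclusive" quantile method is just one of several conventions.

```python
from statistics import mean, quantiles

# Hypothetical scores: both groups have the same mean.
group_a = [48, 49, 50, 50, 50, 51, 52]
group_b = [20, 30, 40, 50, 60, 70, 80]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the values a box plot is built from."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)

for name, group in [("A", group_a), ("B", group_b)]:
    print(name, "mean:", mean(group), "summary:", five_number_summary(group))
```

Both groups report the same mean, yet the quartiles of group B are spread far wider than those of group A. That difference is exactly what a histogram, box plot, or violin plot makes visible.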

🔗 LINK: Histogram examples (with further guidance)

Case Study: Masterful RICH Data Storytelling

Try to be brave in your final project and produce a single rich data visualisation that tells a big story. You will have to be creative and chase the story to find the best way to tell it.

Let’s examine an exemplary piece of data journalism from the Columbia Journalism Review:

We compared eight AI search engines. They’re all bad at citing news.

What makes this visualisation effective:

  1. Progressive disclosure: Guides the reader step-by-step
  2. Clear annotations: Explains what you’re seeing
  3. Thoughtful colour use: Consistent meaning throughout
  4. Hierarchy of information: Most important insights are emphasised
  5. Multiple views: Shows the same data in different ways for deeper understanding

Key takeaway: Great visualisations tell a story and guide your audience through complex information.

Other rich data visualisation examples I love

🔗 LINK: Ali Wong’s Stand-Up Routine (The Pudding)
(how they made it)

🔗 LINK: Why do cats and dogs…
(design process)

Hour 2: Project Websites and Management

Creating Effective Project Websites

Your project website is your presentation platform for tomorrow and your final submission. Let’s make it effective:

Embedding Videos in GitHub Pages

For those who need to pre-record presentations:

<video width="100%" controls>
  <source src="./videos/presentation.mp4" type="video/mp4">
  Your browser does not support the video tag.
</video>

💡 Alternative: YouTube Embedding

If your video is on YouTube:

<iframe width="560" height="315" 
  src="https://www.youtube.com/embed/YOUR_VIDEO_ID" 
  frameborder="0" 
  allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" 
  allowfullscreen>
</iframe>

Replace YOUR_VIDEO_ID with the ID from your YouTube URL.
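
If you are unsure which part of the URL is the ID, the sketch below pulls it out with Python's standard library. The function name `youtube_embed_url` is hypothetical, and it only assumes the two common link shapes (`youtube.com/watch?v=...` and `youtu.be/...`); other URL formats are not handled.

```python
from urllib.parse import urlparse, parse_qs

def youtube_embed_url(url: str) -> str:
    """Build an embed URL from a watch link or a youtu.be short link."""
    parts = urlparse(url)
    if parts.hostname == "youtu.be":
        # Short links carry the ID in the path: https://youtu.be/<id>
        video_id = parts.path.lstrip("/")
    else:
        # Watch links carry the ID in the "v" query parameter
        video_id = parse_qs(parts.query).get("v", [""])[0]
    if not video_id:
        raise ValueError(f"No video ID found in {url!r}")
    return f"https://www.youtube.com/embed/{video_id}"

# Placeholder ID, not a real video
print(youtube_embed_url("https://www.youtube.com/watch?v=abc123XYZ_-&t=30s"))
```

The returned string is what goes in the `src` attribute of the iframe.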

Quarto Markdown for Professional Websites

Quarto extends Markdown with powerful features for data science communication:

---
title: "Our Amazing Project"
format:
  html:
    toc: true
    theme: cosmo
    code-fold: true
---

## Introduction

Our project explores...

💡 Quarto Tips for Project Websites
  • Use YAML headers to control page appearance
  • Create a consistent navigation structure
  • Use callouts for important information
  • Include interactive elements where appropriate
  • Balance text with visuals

Database Best Practices Refresher

Remember our Databases Cookbook? Let’s highlight key points:

  1. Design your schema first: Plan your tables and relationships before storing data
  2. Use appropriate data types: Choose the right type for each column
  3. Establish proper relationships: Use primary and foreign keys
  4. Write efficient queries: Only select the data you need
  5. Don’t commit large database files to GitHub: Add the file to .gitignore if it is too big (e.g. 40MB)
    • If your database .db file is too big to commit, put a link to the file in your repository’s README.md instead
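
As a reminder of what points 1 to 4 look like in practice, here is a minimal sketch using Python's built-in `sqlite3` module. The `stations` and `readings` tables, their columns, and the sample rows are all invented for illustration; your own schema will differ.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this is switched on
conn.execute("PRAGMA foreign_keys = ON")

# 1-3. Schema planned up front, typed columns, primary and foreign keys
conn.executescript("""
CREATE TABLE stations (
    station_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE readings (
    reading_id  INTEGER PRIMARY KEY,
    station_id  INTEGER NOT NULL REFERENCES stations(station_id),
    measured_on TEXT NOT NULL,  -- ISO 8601 date
    temperature REAL NOT NULL   -- degrees Celsius
);
""")

conn.execute("INSERT INTO stations (station_id, name) VALUES (1, 'Holborn')")
conn.executemany(
    "INSERT INTO readings (station_id, measured_on, temperature) VALUES (?, ?, ?)",
    [(1, "2025-04-01", 12.5), (1, "2025-04-02", 14.0)],
)

# 4. Select only the columns and rows you need, not SELECT *
rows = conn.execute("""
    SELECT s.name, r.measured_on, r.temperature
    FROM readings AS r
    JOIN stations AS s ON s.station_id = r.station_id
    WHERE r.temperature > 13
""").fetchall()
print(rows)
```

With foreign keys switched on, inserting a reading for a station that does not exist raises an error instead of silently storing an orphaned row.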

Project Management with GitHub

We will look at how to use a GitHub Projects board to manage your project.