DS105 2025-2026 Autumn Term

🖥️ Week 03 Lecture

Understanding File Systems, File Formats, and Version Control

Author

Dr Jon Cardoso-Silva

Published

15 October 2025

📍 Logistics

Time and Location: Thursday, 16 October 2025, 16:00 - 18:00, CLM 5.02

This is our third live session together. By now, you should have completed the 📝 Week 03 Practice with the DataQuest lessons on dictionaries, if/else statements, and for loops.

📋 Preparation

  • Complete the 📝 W03 Practice (DataQuest lessons)
  • Review Python control flow: if/else statements and for loops
  • Review dictionaries and nested data structures from last week
  • Have your GitHub account ready (you created this in W01)

🗣️ Lecture Overview

  • Operating systems and file systems: understanding how your computer organises files
  • Absolute vs relative paths: why this matters for reproducible code
  • Environment variables: the hidden settings that shape your coding environment
  • File formats in practice: when to use JSON vs CSV
  • Live demo: collecting API data and saving to files
  • Interactive discussion: sharing discoveries and challenges from W03 Practice
  • After the break: setting up Git and GitHub for version control
  • Creating your first personal repository: my-ds105a-notes
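
To make the first items on this list concrete, here is a minimal sketch of absolute paths, relative paths, and environment variables in Python. The file name data/weather.json is a hypothetical example, and which environment variables exist depends on your operating system, so treat the output as illustrative rather than something you should expect to match exactly.

```python
import os
from pathlib import Path

# An absolute path gives the full location, starting from the root of the file system.
current_folder = Path.cwd()
print(current_folder.is_absolute())   # True

# A relative path is interpreted from the current working directory.
# "data/weather.json" is a hypothetical file name used only for illustration.
relative_path = Path("data") / "weather.json"
print(relative_path.is_absolute())    # False

# Joining the working directory and the relative path gives the absolute
# location of that file on THIS machine (it will differ on someone else's).
absolute_path = Path.cwd() / relative_path
print(absolute_path)

# Environment variables are key-value settings provided by the operating system.
# .get() returns None (or a fallback) instead of raising an error when a variable is missing.
# HOME is typical on macOS/Linux, USERPROFILE on Windows.
home_folder = os.environ.get("HOME") or os.environ.get("USERPROFILE", "not set")
print(home_folder)
```

Because the relative path does not mention any folder that is specific to one computer, code that uses it is more likely to run unchanged on a classmate's machine.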

🎬 Lecture Slides

File I/O Demonstration: During the lecture, we’ll work through a complete example of fetching weather data from an API, saving it to both JSON and CSV formats, and reading it back. This demonstrates why file formats matter and how to choose the right one for your data.
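
As a rough preview of that demonstration, the sketch below saves the same data to JSON and to CSV and reads both files back. It is not the code from the lecture: to keep it runnable without an internet connection, the API call is replaced by a hard-coded dictionary, and the field names (location, daily, date, max_temp_c) and values are invented for illustration.

```python
import csv
import json
import tempfile
from pathlib import Path

# Hypothetical sample shaped like a weather API response (made-up values, not real data).
api_response = {
    "location": "London",
    "daily": [
        {"date": "2025-10-13", "max_temp_c": 16.2},
        {"date": "2025-10-14", "max_temp_c": 15.1},
        {"date": "2025-10-15", "max_temp_c": 14.8},
    ],
}

# Write into a temporary folder so the example does not clutter your project.
output_folder = Path(tempfile.mkdtemp())
json_path = output_folder / "weather.json"
csv_path = output_folder / "weather.csv"

# JSON keeps the nested structure (a dictionary containing a list of dictionaries).
with open(json_path, "w", encoding="utf-8") as f:
    json.dump(api_response, f, indent=2)

# CSV is a flat table, so only the list of daily records is saved.
with open(csv_path, "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["date", "max_temp_c"])
    writer.writeheader()
    writer.writerows(api_response["daily"])

# Read both files back.
with open(json_path, encoding="utf-8") as f:
    from_json = json.load(f)

with open(csv_path, newline="", encoding="utf-8") as f:
    from_csv = list(csv.DictReader(f))

print(from_json["daily"][0]["max_temp_c"])  # 16.2 (still a number)
print(from_csv[0]["max_temp_c"])            # 16.2 (but now a string)
```

The JSON file preserves nesting and data types, while the CSV file is easier to open in a spreadsheet but returns every value as text. That trade-off is one reason to choose JSON for nested API responses and CSV for flat, table-like data.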

Git and GitHub Setup: In the second hour, you’ll create your first personal Git repository. This is where you’ll store your lecture notes and practice code. Understanding the difference between Git (version control software) and GitHub (online platform) is essential for collaborative data science work.

🔖 Appendix

Post-Lecture Actions

  • Review today’s lecture slides
  • Skim the 💻 W03 Lab instructions
  • Make sure your my-ds105a-notes repository is set up
  • Post questions in #help on Slack

Looking Ahead

  • Next week: Your first formal project using GitHub Classroom
  • W04 Practice: Collecting 20 years of weather data and counting heatwaves
  • Key skill: Combining control flow with file I/O for real data analysis
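
As a small taste of that key skill, the sketch below combines a for loop and an if statement with reading a CSV file. It is not the W04 Practice solution: the temperatures are made up, and the 28.0 °C threshold is an arbitrary cut-off chosen for this example rather than an official heatwave definition.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical temperatures, written to a temporary CSV so the example runs on its own.
csv_path = Path(tempfile.mkdtemp()) / "temperatures.csv"
csv_path.write_text(
    "date,max_temp_c\n"
    "2025-07-01,24.5\n"
    "2025-07-02,29.1\n"
    "2025-07-03,31.0\n"
    "2025-07-04,27.9\n",
    encoding="utf-8",
)

HOT_DAY_THRESHOLD_C = 28.0  # arbitrary cut-off chosen for this sketch

hot_days = 0
with open(csv_path, newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        # CSV values are read as strings, so convert before comparing.
        if float(row["max_temp_c"]) > HOT_DAY_THRESHOLD_C:
            hot_days += 1

print(hot_days)  # 2
```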