💻 Week 03 Lab

Version Control & Data Storage

Published: 05 February 2025

🥅 Learning Goals
By the end of this lab, you should be able to:

Last Updated: 6 February 2025, 15:45 GMT

πŸ“Time and Location: Friday, 7 February 2025. Check your timetable for the precise time and location of your class.

This lab will help you build on your Terminal, File Navigation, and Git/GitHub skills, so you can track your work, structure your files properly, and collect and store weather data for analysis.

📋 Preparation

Before starting, make sure:

  • You have completed the 📝 Week 03 Formative Exercise.

  • Your notes about Python programming, Terminal and files are up-to-date on Nuvolos.

  • You are caught up with the 🗣️ Week 03 Lecture. You might feel a bit lost otherwise.

  • You have created a GitHub account and created a private repository for this course, called my-ds105w-notes (Sections 1 and 2 of our GitHub guide).

πŸ›£οΈ Lab Roadmap

Part I: Clone your repo to GitHub (20 min)

πŸ—£οΈ TEACHING MOMENT:

Your class teacher will reinforce the concepts of Git/GitHub introduced in the lecture and guide you through cloning your private repository to Nuvolos. Use this time to clarify any doubts you have about Git, GitHub, or the Terminal.

If you missed the lecture, read this instead
  • Create your GitHub account. Follow the steps in the guide carefully.

  • You will have to create a new repository on GitHub. Call it my-ds105w-notes.

    Click here for a step-by-step guide.

  • Set up your Git on Nuvolos and make sure you are authenticated with GitHub.

    Click here for a step-by-step guide.

Together with your class teacher, you will:

  1. Access your Nuvolos account and open the VS Code app.

  2. Open a Terminal inside VS Code. Click on the Menu icon then navigate to Terminal > New Terminal.

  3. Confirm that you are at /files by running pwd.

  4. Check that you are logged into GitHub by running gh auth status.

    • If gh reports that you are not logged in, run gh auth login.
  5. Clone your repository to Nuvolos.

    You can find details on it from the 4️⃣ Cloning a Repository section of the GitHub guide.

  6. Go inside the cloned folder:

    cd /files/my-ds105w-notes && pwd
  7. Create or modify your README.md file:

    nano README.md

    Files ending with .md are called Markdown files and they only contain text in Markdown formatting. Add just a few lines of text and save the file, for example:

    # DS105W (2024/25) Notes
    
    **Author:** YOUR NAME
    
    This repository contains all my notes for the DS105W course.
  8. Check the status of your repository:

    git status

    You will see that your README.md file is listed as untracked (or as modified, if the file already existed in the repository).

  9. Stage and commit your changes:

    git add README.md
    git commit -m "Add a README with basic info about the repo"
  10. Push your changes to GitHub:

    git push
  11. Visit your repository on GitHub to confirm that your README file is visible.

Part II: Organising Your Nuvolos Workspace (40 min)

The point of this section is to get you to practise your Terminal skills, first covered in the 📝 W03 Formative and the 🗣️ W03 Lecture.

The overall objective here is to move all of your existing Nuvolos files into the repository folder, and then commit and push the changes to GitHub.

🎯 ACTION POINT:

  1. Move your Week 01 files into the repo:

    On Nuvolos, you have a /Week 01 - Python Basics folder under your HOME (/files) directory. We want to move this folder into your repository.

    # Make sure you are in the right directory
    cd /files/my-ds105w-notes
    
    # Use the 'mv' command to move the folder
    mv "../Week 01 - Python Basics" ./

    The mv command above moves the folder, along with everything inside it, into ./ (that is, the current directory). Your W01 Lab - Python Basics.ipynb should have moved with it.

    Take a moment to check that the folder is now inside your repository. You should see your Nuvolos file structure like this:

    /files/
    ├── my-ds105w-notes/
    |      ├── README.md
    |      └── Week 01 - Python Basics/
    |             └── W01 Lab - Python Basics.ipynb
    ├── Week 02 - Python Collections & Loops/
    | 
    ...

    Raise your hand if you don’t quite understand what happened here.

  2. Check what changed:

    git status

    You should see that your Week 01 files are untracked.

    We will now follow all the steps in 5️⃣ The Ceremony of Uploading Changes to keep track of our work.

  3. Stage and commit:

    git add .
    git commit -m "Add Week 01 files"

    If you run git status again, you should see that your changes are ready to be pushed.

  4. Push changes to GitHub:

    git push
  5. Repeat for Week 02 and Week 03 files. You can visit your repository on GitHub to confirm that all files are there.

    You can also view your Jupyter notebooks directly on GitHub’s website. This is a great way to share your work with others and to quickly check the contents of your notebooks without having to download or run them.

  6. 🤔 What happens when you try to move empty folders to the repository? Give it a go. Move the Week 04 and Week 05 folders to your repository and see what happens when you try to commit and push the changes.

🗣️ TEACHING MOMENT

When the time dedicated for this section is coming to an end, your class teacher will answer any pending questions and provide guidance on how to proceed with the next steps.

Part III: Fetch & Store Historical Weather Data (30 min)

💡 PRO-TIP: This exercise is the starting point for your first graded assignment in this course, Mini Project I (worth 20%). The full instructions for this assignment will be released next week, but by working on this exercise you are already making progress towards your first summative assignment.

Time to go back to coding!

You now have all the necessary tools (Python, Terminal, Git, and Jupyter Notebooks) to start working on your first end-to-end data analysis project.

🎯 ACTION POINT

Your goal is to write the necessary Python code to answer the question:

> How many “hot days” were there in the last meteorological summer in London?

📌 DECISIONS, DECISIONS, DECISIONS

  • You decide what you consider a “hot day” to be. Is it a day with a maximum temperature above 30°C? Or does it have to do with the average temperature throughout the day? The range? It’s all up to you.
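
To make the choice concrete, here is a minimal sketch of three possible "hot day" rules written as Python functions. Everything in it is an assumption for illustration: the function names are hypothetical, the thresholds other than 30°C are arbitrary placeholders, and the daily mean is approximated as the midpoint of the minimum and maximum. It is not the expected answer, only a way to see that different definitions can disagree about the same day.

```python
# Hypothetical helper names and thresholds, for illustration only.

def is_hot_by_max(t_max, threshold=30.0):
    """Hot if the daily maximum temperature (°C) is above the threshold."""
    return t_max > threshold


def is_hot_by_mean(t_max, t_min, threshold=22.0):
    """Hot if the midpoint of min and max (a rough daily mean) is above the threshold."""
    return (t_max + t_min) / 2 > threshold


def is_hot_by_range(t_max, t_min, threshold=15.0):
    """Hot if the gap between the daily max and min is above the threshold."""
    return (t_max - t_min) > threshold


# One made-up day: maximum 31°C, minimum 12°C.
example_max, example_min = 31.0, 12.0
print(is_hot_by_max(example_max))                   # True
print(is_hot_by_mean(example_max, example_min))     # False
print(is_hot_by_range(example_max, example_min))    # True
```

Whichever rule you pick, state it clearly in your notebook so that a reader knows how your count was produced.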

🎯 ACTION POINTS:

(Our action points will start to get more high-level as we progress through the course. This is to give you more opportunities to demonstrate your Python and documentation skills.)

  1. Create a new Jupyter Notebook. Call it Weather Analysis - NB01 Data Collection.ipynb and place it under your Week 03 folder.

  2. Write code to fetch the relevant historical weather data from the Open-Meteo API.

  3. Store the data you collected in a JSON file (you choose what to call it) at the location:

    /files/my-ds105w-notes/Week 03 - File Formats and Directory Structure/data/
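
If you are unsure where to start, the steps above can be sketched roughly as follows. Treat this as one possible starting point rather than the intended solution. It assumes Open-Meteo's historical (archive) endpoint and its latitude, longitude, start_date, end_date, daily and timezone parameters, so check the API documentation before relying on it. The function names, the choice of daily variables, and the use of Python's built-in urllib are all illustrative choices.

```python
import json
from pathlib import Path
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: this is Open-Meteo's historical weather endpoint.
BASE_URL = "https://archive-api.open-meteo.com/v1/archive"


def build_url(latitude, longitude, start_date, end_date):
    """Build the request URL for daily max/min temperatures (dates as YYYY-MM-DD)."""
    params = {
        "latitude": latitude,
        "longitude": longitude,
        "start_date": start_date,
        "end_date": end_date,
        # Assumption: these two daily variables are enough for your definition.
        "daily": "temperature_2m_max,temperature_2m_min",
        "timezone": "Europe/London",
    }
    # safe="," keeps the comma-separated variable list readable in the URL.
    return f"{BASE_URL}?{urlencode(params, safe=',')}"


def fetch_json(url):
    """Send a GET request and parse the JSON response into a Python dict."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def save_json(data, folder, filename):
    """Write data to folder/filename, creating the folder if it is missing."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / filename
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2)
    return path
```

In a notebook saved in your Week 03 folder, you could then run something like data = fetch_json(build_url(51.5074, -0.1278, "2024-06-01", "2024-08-31")) followed by save_json(data, "data", "your-file-name.json"). The coordinates are an approximate location for central London, the dates assume meteorological summer runs from 1 June to 31 August, and the file name is a placeholder for whatever you choose.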

If you have time left, you can start analysing the data you collected. See the 🏡 take-home activity below for the instructions.


🏡 Take-Home Activity

Once you have collected the data above, you can start analysing it to answer the question you posed.

🎯 ACTION POINTS:

Create a separate Jupyter Notebook called Weather Analysis - NB02 Data Analysis.ipynb and place it under your Week 03 folder.

In this notebook, you will:

  • Read the JSON file you created in the previous notebook.

  • Write the necessary Python code to answer the question we stated in the previous section from the data you collected.
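
As a rough sketch of these two steps, the code below reads a JSON file and counts the days whose maximum temperature is above a threshold. It assumes the file holds an Open-Meteo style response in which data["daily"]["time"] and data["daily"]["temperature_2m_max"] are parallel lists. The function names and the 30°C default are illustrative, and the sample values are made up purely to show the shape of the data.

```python
import json


def load_json(path):
    """Read a JSON file from disk and return it as Python objects."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def count_hot_days(data, threshold=30.0):
    """Return (count, dates) for days whose maximum temperature exceeds threshold."""
    daily = data["daily"]
    hot_dates = [
        date
        for date, t_max in zip(daily["time"], daily["temperature_2m_max"])
        # Skip missing readings, which appear as None after loading the JSON.
        if t_max is not None and t_max > threshold
    ]
    return len(hot_dates), hot_dates


# Made-up sample in the assumed structure, NOT real measurements.
sample = {
    "daily": {
        "time": ["2024-07-29", "2024-07-30", "2024-07-31"],
        "temperature_2m_max": [28.4, 31.2, None],
    }
}
count, dates = count_hot_days(sample)
print(count, dates)  # 1 ['2024-07-30']
```

With your own file, you would replace the sample with data = load_json("data/your-file-name.json"), using the name you chose in the first notebook, and adapt the rule inside count_hot_days to your own definition of a hot day.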