📝 W04 Formative Exercise: your first full project

Published: 07 February 2025

🎯 Learning Goals
By the end of this exercise, you will:

  • Structure repositories following software engineering best practices
  • Apply API data collection techniques using Python
  • Transform raw JSON data into analytical insights
  • Practise Git workflows in a formal submission context

Briefing

⏳ DEADLINE: Thursday, 13 February, 15:50 GMT (just before the lecture)
📂 Repository Setup: GitHub Classroom Repository (link below)
💎 Key Learning Concept: GitHub workflows, structured repositories, API data collection, and concise analysis

💡 Your First Formal Submission:

This exercise overlaps with, and is a direct continuation of, the 💻 W03 Lab. This time, however, you will complete it in a new GitHub repository that we have set up for you.

You will receive individual feedback on your submission, so please ensure your work follows the expected structure and is pushed to GitHub before the deadline (13 February, 15:50 GMT). Please note that we will not give feedback on late submissions.

It’s also worth saying that this exercise is a taster for your first graded assignment, which will be released during the 🗣️ W04 Lecture. If you practise with this one, it’s likely that you will be super comfortable with the summative (due W06, worth 20% of your final grade).

🛣️ Task Overview

Your goal is to create a system of files and folders according to the specification provided and to write the necessary Python code (in two notebooks) to answer the question:

> “How many ‘hot days’ were there in the last meteorological summer in London?”

📌 DECISIONS, DECISIONS, DECISIONS

  • You decide what you consider a “hot day” to be. Is it a day with a maximum temperature above 30°C? Or does it have to do with the average temperature throughout the day? The range? It’s all up to you.
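As a minimal sketch of how such a decision could be written down in code, the two hypothetical functions below encode a maximum-temperature rule and a daily-mean rule. The function names and the 30 °C and 25 °C thresholds are illustrative assumptions, not recommended definitions.

```python
from statistics import mean


def is_hot_by_max(max_temp_c: float, threshold_c: float = 30.0) -> bool:
    """Hot if the day's maximum temperature is strictly above the threshold."""
    return max_temp_c > threshold_c


def is_hot_by_mean(hourly_temps_c: list[float], threshold_c: float = 25.0) -> bool:
    """Hot if the average of the day's readings is strictly above the threshold."""
    return mean(hourly_temps_c) > threshold_c


# The same day can count as "hot" under one rule and not under the other.
print(is_hot_by_max(31.2))                 # True: 31.2 is above 30.0
print(is_hot_by_mean([18.0, 24.0, 31.2]))  # False: the mean is about 24.4
```

Whichever rule you choose, stating it explicitly like this makes it easier to justify in your notebook.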

📁 Repository Structure

Ensure your repository follows this exact structure:

<github-repo-folder>/
│-- data/
│   │-- ... (your JSON files)
│-- notebooks/
│   │-- NB01 - Data Collection.ipynb
│   │-- NB02 - Analysis.ipynb
│-- README.md
│-- .gitignore
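One way to check this structure programmatically is sketched below. The helper name missing_paths is hypothetical, and the sketch assumes it is run from the root of the cloned repository.

```python
from pathlib import Path

# Paths taken from the structure above, relative to the repository root.
EXPECTED_PATHS = [
    "data",
    "notebooks/NB01 - Data Collection.ipynb",
    "notebooks/NB02 - Analysis.ipynb",
    "README.md",
    ".gitignore",
]


def missing_paths(repo_root: str) -> list[str]:
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


# An empty list means nothing is missing.
print(missing_paths("."))
```

This only checks that the paths exist. It does not check the contents of the files or whether they have been committed and pushed.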

🚀 Step-by-Step Instructions

Part 1: Setting Up Your Repository (15 min)

These instructions assume you are working on Nuvolos. If you are using your local machine, adjust the paths by replacing /files/ with your local working directory.

🎯 ACTION POINTS:

  1. Accept the GitHub Classroom assignment at this link (see footnote 1). You will be taken to a page where you must accept the assignment. Once you have accepted, a personalised GitHub repository will be created for you. Copy its SSH URL from there.

  2. Open a terminal window in VS Code.

  3. Navigate to /files/ and clone your assigned repository:

    git clone <your-github-classroom-repo-url>

    Replace the whole placeholder, including the < and > symbols, with the SSH URL provided by GitHub Classroom.

    📋 NOTE: This means that you will be working on a different GitHub repository than the one you created in Week 03. You are still encouraged to use your my-ds105w-notes repository and the “Week 04” folder that exists in there for your private notes, but we will only mark your formative based on what you have on this new repository.

  4. Navigate inside the cloned repository:

    cd <repo-folder-name>

  5. Confirm you are inside the correct directory using pwd.

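  6. Run ls -a to check that the README.md and .gitignore files exist. The -a flag is needed because plain ls hides files whose names start with a dot, such as .gitignore.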

Part 2: Organising Your Repository (20 min)

🎯 ACTION POINTS:

  1. Create the necessary folder structure:

    mkdir -p data notebooks

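    The -p flag creates any missing parent directories and does not raise an error if a directory already exists. (mkdir accepts several directory names at once with or without it.)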

  2. Create two empty Jupyter Notebooks inside the notebooks/ folder:

    touch "notebooks/NB01 - Data Collection.ipynb"
    touch "notebooks/NB02 - Analysis.ipynb"

    The touch command creates an empty file. You can also use the right-click menu in VS Code to create a new file. Either way, it is very important to name the notebooks precisely as shown.

  3. Verify your structure:

    tree .

    Your output should match the expected repository structure. Note that tree hides dotfiles by default, so .gitignore will not be listed.

  4. Commit your changes:

    git add .
    git commit -m "Set up project structure"
    git push

Part 3: Fetch & Store Historical Weather Data (60 min)

🎯 ACTION POINTS:

Using the requests package in Python, get the necessary data from the Open-Meteo API and store the JSON data in one or more .json files under the data/ folder.
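The sketch below shows one possible shape for this step, not a definitive implementation. The endpoint URL, the parameter names, and the London coordinates are assumptions to verify against the Open-Meteo documentation, and the helper names build_params, fetch_weather, and save_json are hypothetical.

```python
import json
from pathlib import Path

# Assumed endpoint of Open-Meteo's historical weather API; check the API docs.
ARCHIVE_URL = "https://archive-api.open-meteo.com/v1/archive"


def build_params(start_date: str, end_date: str) -> dict:
    """Query parameters for daily maximum temperature in London."""
    return {
        "latitude": 51.5074,   # approximate coordinates of central London (assumption)
        "longitude": -0.1278,
        "start_date": start_date,  # ISO format: YYYY-MM-DD
        "end_date": end_date,
        "daily": "temperature_2m_max",
        "timezone": "Europe/London",
    }


def fetch_weather(params: dict) -> dict:
    """Send the GET request and return the parsed JSON response."""
    import requests  # third-party; imported here so the other helpers work offline

    response = requests.get(ARCHIVE_URL, params=params, timeout=30)
    response.raise_for_status()  # stop early on HTTP errors (4xx/5xx)
    return response.json()


def save_json(data: dict, path: str) -> None:
    """Write data to path as indented JSON, creating parent folders if needed."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("w", encoding="utf-8") as f:
        json.dump(data, f, indent=2)
```

From a notebook inside notebooks/, these could be combined as `save_json(fetch_weather(build_params("2024-06-01", "2024-08-31")), "../data/london_summer_2024.json")`, taking meteorological summer to run from 1 June to 31 August.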

💡 Use Markdown in your notebook to document your process.

📌 Advice for Naming Data Files

When saving JSON files in the data/ folder, aim for a consistent naming convention. A good practice is to use all lowercase letters, with underscores to separate words, and to choose descriptive (yet concise) names. For example, you could use this naming pattern:

  • london_weather_YYYYMMDD.json for daily data
  • london_summer_2024.json for seasonal data

Here, YYYYMMDD should be replaced with the date on which you collected the data. This helps track when data was retrieved and makes the analysis more reproducible.
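A small sketch of how the date-stamped pattern could be generated in Python follows. The helper name dated_filename and its prefix parameter are hypothetical.

```python
from datetime import date


def dated_filename(collected_on: date, prefix: str = "london_weather") -> str:
    """Build a lowercase, underscore-separated name ending in the collection date."""
    return f"{prefix}_{collected_on:%Y%m%d}.json"


print(dated_filename(date(2025, 2, 7)))  # london_weather_20250207.json
print(dated_filename(date.today()))      # uses the date on which the code is run
```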

Every time you feel like you’ve written a significant amount of code, or before you take a break from this task, commit and push your changes to GitHub so they are saved.

Part 4: Analysing the Data (45 min)

🎯 ACTION POINTS:

Read the JSON file(s) you stored in the previous step as a Python object (mix of dictionaries and lists) and write the necessary code to answer the question provided at the top.
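As a hedged sketch of this step, the hypothetical function below reads one stored file and counts days above a threshold. It assumes the file has a "daily" dictionary holding parallel "time" and "temperature_2m_max" lists, and it uses a 30 °C maximum purely as an example criterion. Adapt both to your own data and definition.

```python
import json


def count_hot_days(path: str, threshold_c: float = 30.0) -> int:
    """Count days whose maximum temperature is strictly above threshold_c."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)

    # Assumed layout: {"daily": {"time": [...], "temperature_2m_max": [...]}}
    dates = data["daily"]["time"]
    max_temps = data["daily"]["temperature_2m_max"]

    hot_dates = [
        day
        for day, temp in zip(dates, max_temps)
        if temp is not None and temp > threshold_c  # skip missing readings
    ]
    return len(hot_dates)
```

Keeping the matching dates in a list before counting them makes it easy to print them as supporting evidence for your answer.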

Keep committing and pushing your changes to GitHub as you progress.

✅ Submission Checklist

Before the deadline (13 February, 15:50 GMT, just before the lecture), confirm:

🚨 It is very important to have everything pushed to your repository before the deadline. You will receive individual feedback based on what you have on your GitHub repository.

📢 Need help? Post your questions in #help on Slack! 🚀

📊 Assessment Criteria

We expect you to make use of everything you have learned so far in the course to complete this exercise (Dataquest lessons + lecture + labs + formative exercises). Ultimately, we want to see evidence that makes us go “ah yes! this student has been paying close attention to the course!”

You will be given a “fake mark” (this is not a graded assignment) out of 100 based on the following criteria:

Repository Structure (0-20 marks)
  • Correct folder hierarchy and file naming
  • Clear README.md explaining project purpose and structure
  • Well-organised notebooks with appropriate documentation and sensible use of Markdown
Data Collection (0-40 marks)
  • Successful API interaction and data retrieval
  • Proper JSON file storage and neat organisation
  • Clear and concise documentation of API usage and data processing steps
Analysis & Reasoning (0-40 marks)
  • Clear definition and justification of “hot day” criteria
  • Accurate data processing and calculations
  • Concise presentation of findings with supporting evidence

Footnotes

  1. This link is private to enrolled students. Visit the Moodle equivalent of this page to find the link.