W04 Formative Exercise: your first full project
Briefing
| DEADLINE | Thursday, 13 February, 15:50 GMT (just before the lecture) |
| Repository Setup | GitHub Classroom Repository (link below) |
| Key Learning Concept | GitHub workflows, structured repositories, API data collection, and concise analysis |
💡 Your First Formal Submission:
This exercise overlaps with, and directly continues, the 💻 W03 Lab. This time, however, you will complete it in a new GitHub repository that we have set up for you.
You will receive individual feedback on your submission, so please ensure your work follows the expected structure and is pushed to GitHub before the deadline (13 February, 15:50 GMT). Please note that we will not give feedback on late submissions.
It’s also worth saying that this exercise is a taster for your first graded assignment, which will be released during the 🗣️ W04 Lecture. If you practise with this one, it’s likely that you will be super comfortable with the summative (due W06, worth 20% of your final grade).
Task Overview
Your goal is to create a system of files and folders according to the specification provided and to write the necessary Python code (in two notebooks) to answer the question:
> “How many ‘hot days’ were there in the last meteorological summer in London?”
DECISIONS, DECISIONS, DECISIONS
- You decide what you consider a “hot day” to be. Is it a day with a maximum temperature above 30°C? Or does it have to do with the average temperature throughout the day? The range? It’s all up to you.
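To see why this choice matters, here is a minimal sketch using made-up numbers (not real London data). The function names, the thresholds of 30, 25 and 15 degrees, and the use of the max/min midpoint as a rough stand-in for the daily average are all illustrative assumptions, not recommended definitions:

```python
# Three hypothetical ways to define a "hot day". All thresholds are
# placeholders chosen only for illustration.

def hot_by_max(t_max, t_min, threshold=30.0):
    """Hot if the daily maximum exceeds the threshold."""
    return t_max > threshold

def hot_by_mean(t_max, t_min, threshold=25.0):
    """Hot if the midpoint of the daily max and min exceeds the threshold."""
    return (t_max + t_min) / 2 > threshold

def hot_by_range(t_max, t_min, threshold=15.0):
    """Hot if the gap between daily max and min exceeds the threshold."""
    return t_max - t_min > threshold

# Made-up (max, min) temperatures in °C for five days.
days = [(31.0, 22.0), (29.0, 16.0), (27.5, 24.0), (33.0, 16.0), (26.0, 25.0)]

for rule in (hot_by_max, hot_by_mean, hot_by_range):
    count = sum(rule(t_max, t_min) for t_max, t_min in days)
    print(f"{rule.__name__}: {count} hot day(s)")
```

Each rule gives a different count for the same five days, which is why your notebook should state and justify the definition you pick.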
Repository Structure
Ensure your repository follows this exact structure:
<github-repo-folder>/
├── data/
│   └── ... (your JSON files)
├── notebooks/
│   ├── NB01 - Data Collection.ipynb
│   └── NB02 - Analysis.ipynb
├── README.md
└── .gitignore
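If you would like to double-check the layout from Python, the sketch below may help. The helper name `missing_paths` and the list `EXPECTED_PATHS` are hypothetical; the paths themselves are copied from the structure above.

```python
from pathlib import Path

# Paths taken from the expected structure, relative to the repository root.
EXPECTED_PATHS = [
    "data",
    "notebooks/NB01 - Data Collection.ipynb",
    "notebooks/NB02 - Analysis.ipynb",
    "README.md",
    ".gitignore",
]

def missing_paths(repo_root):
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]

if __name__ == "__main__":
    # "." assumes the script is run from the repository root.
    missing = missing_paths(".")
    print("All expected paths found." if not missing else f"Missing: {missing}")
```

An empty list means every expected file and folder is present. This only checks names, not the contents of the files.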
Step-by-Step Instructions
Part 1: Setting Up Your Repository (15 min)
The instructions here assume you are working on Nuvolos. If you are using your local machine, you will have to adjust the paths by replacing /files/ with your local working directory.
🎯 ACTION POINTS:
1. Accept the GitHub Classroom assignment at this link [1]. You will be taken to a page where you have to accept the assignment. After accepting, a personalised GitHub repository will be created for you. Grab the SSH URL from there.
2. Open a terminal window in VS Code.
3. Navigate to /files/ and clone your assigned repository:

       git clone <your-github-classroom-repo-url>

   Replace the whole placeholder, including the < and > symbols, with the URL provided by GitHub Classroom.

   NOTE: This means that you will be working on a different GitHub repository from the one you created in Week 03. You are still encouraged to use your my-ds105w-notes repository and the “Week 04” folder that exists in there for your private notes, but we will only mark your formative based on what you have in this new repository.
4. Navigate inside the cloned repository:

       cd <repo-folder-name>

5. Confirm you are inside the correct directory using pwd.
6. Run ls -a to check that the README.md and .gitignore files exist (plain ls hides files whose names start with a dot).
Part 2: Organising Your Repository (20 min)
🎯 ACTION POINTS:
1. Create the necessary folder structure:

       mkdir -p data notebooks

   The -p flag tells mkdir to create any missing parent directories and not to raise an error if a directory already exists.
2. Create two empty Jupyter Notebooks inside the notebooks/ folder:

       touch "notebooks/NB01 - Data Collection.ipynb"
       touch "notebooks/NB02 - Analysis.ipynb"

   The touch command creates an empty file. You can also use the right-click menu in VS Code to create a new file; just note that it is very important to name the notebooks precisely as shown.
3. Verify your structure:

       tree .

   Your output should match the expected repository structure.
4. Commit your changes:

       git add .
       git commit -m "Set up project structure"
       git push
Part 3: Fetch & Store Historical Weather Data (60 min)
🎯 ACTION POINTS:
Using the requests package in Python, get the necessary data from the Open-Meteo API and store the JSON data in one or more .json files under the data/ folder.
💡 Use Markdown in your notebook to document your process.
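As a starting point, the sketch below shows one possible shape for the collection step. It is not the expected solution: the helper names are hypothetical, the coordinates are an approximate location for central London, and the choice of daily variables is an assumption you should revisit once you have decided what a “hot day” means. Check the Open-Meteo documentation for the full list of parameters.

```python
import json
from pathlib import Path

# Open-Meteo's historical weather endpoint.
ARCHIVE_URL = "https://archive-api.open-meteo.com/v1/archive"

def build_params(start_date, end_date):
    """Query parameters for daily temperatures in London."""
    return {
        "latitude": 51.5074,     # assumption: approximate central London
        "longitude": -0.1278,
        "start_date": start_date,  # ISO format, e.g. "2024-06-01"
        "end_date": end_date,
        "daily": "temperature_2m_max,temperature_2m_min",  # assumed variables
        "timezone": "Europe/London",
    }

def fetch_weather(params):
    """Request the data and return the parsed JSON (a dict)."""
    import requests  # imported here so the other helpers load without it
    response = requests.get(ARCHIVE_URL, params=params, timeout=30)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()

def save_json(data, path):
    """Write data to path as JSON, creating the parent folder if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2)

# Example use from a notebook inside notebooks/ (makes a network request).
# The dates assume meteorological summer runs from 1 June to 31 August.
# data = fetch_weather(build_params("2024-06-01", "2024-08-31"))
# save_json(data, "../data/london_summer_2024.json")
```

Saving the raw response before doing any processing means the analysis notebook can be rerun without calling the API again.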
Advice for Naming Data Files
When saving JSON files in the data/ folder, aim for a consistent naming convention. A good practice is to use all lowercase letters and underscores to separate words, and to find descriptive (yet concise) names. For example, you could use this naming pattern:
- london_weather_YYYYMMDD.json for daily data
- london_summer_2024.json for seasonal data
Here, YYYYMMDD should be replaced with the date you collected the data. This helps track when data was retrieved and makes the analysis more reproducible.
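A small sketch of how such a date-stamped name could be generated in Python. The function name `dated_filename` is hypothetical; the pattern follows the example above.

```python
from datetime import date

def dated_filename(prefix, collected_on=None):
    """Return a name like '<prefix>_YYYYMMDD.json' for the given date."""
    collected_on = collected_on or date.today()  # default: today's date
    return f"{prefix}_{collected_on:%Y%m%d}.json"

print(dated_filename("london_weather"))  # stamped with the day you run it
```

Passing an explicit date, for example `dated_filename("london_weather", date(2025, 1, 31))`, is useful when you want to refer back to a file collected on an earlier day.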
Every time you feel like you’ve written a significant amount of code, or you are about to take a break from this task, commit and push your changes to GitHub so they are saved.
Part 4: Analysing the Data (45 min)
🎯 ACTION POINTS:
Read the JSON file(s) you stored in the previous step into a Python object (a mix of dictionaries and lists) and write the necessary code to answer the question provided at the top.
Keep committing and pushing your changes to GitHub as you progress.
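One way the analysis could be organised is sketched below. It assumes the saved file has a top-level "daily" key holding parallel lists named "time" and "temperature_2m_max"; adapt the keys to whatever you actually stored. The function names and the 30.0 default threshold are hypothetical.

```python
import json

def load_daily_max(path):
    """Return (date, max_temp) pairs from a saved weather JSON file."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)   # nested dictionaries and lists
    daily = data["daily"]     # assumed layout: parallel lists
    return list(zip(daily["time"], daily["temperature_2m_max"]))

def count_hot_days(pairs, threshold=30.0):
    """Count days whose maximum exceeds threshold, skipping missing values."""
    return sum(1 for _, t in pairs if t is not None and t > threshold)

# Example (the path is hypothetical and relative to the notebooks/ folder):
# pairs = load_daily_max("../data/london_summer_2024.json")
# print(count_hot_days(pairs, threshold=30.0))
```

Keeping the loading and the counting in separate functions makes it easy to try a different definition of “hot day” without touching the file-reading code.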
Submission Checklist
Before the deadline (13 February, 15:50 GMT, just before the lecture), confirm:
🚨 It is very important to have everything pushed to your repository before the deadline. You will receive individual feedback based on what you have on your GitHub repository.
Need help? Post your questions in #help on Slack!
Assessment Criteria
We expect you to make use of everything you have learned so far in the course to complete this exercise (Dataquest lessons + lecture + labs + formative exercises). Ultimately, we want to see evidence that makes us go “ah yes! this student has been paying close attention to the course!”
You will be given a “fake mark” (this is not a graded assignment) out of 100 based on the following criteria:
Repository Structure (0-20 marks)
- Correct folder hierarchy and file naming
- Clear README.md explaining project purpose and structure
- Well-organised notebooks with appropriate documentation and sensible use of Markdown
Data Collection (0-40 marks)
- Successful API interaction and data retrieval
- Proper JSON file storage and neat organisation
- Clear and concise documentation of API usage and data processing steps
Analysis & Reasoning (0-40 marks)
- Clear definition and justification of “hot day” criteria
- Accurate data processing and calculations
- Concise presentation of findings with supporting evidence
Footnotes
[1] This link is private to enrolled students. Visit the Moodle equivalent of this page to find the link.