✍️ Mini-Project I: Summer Heat Study (20%)
2024/25 Winter Term
Last Updated: 18 February 2025. The wording of the marking guide has been modified. The essence is still the same, but the previous writing was too “checklisty”. The distinction between a strong (~70%) and an excellent submission should also be clearer now.
This assignment will use OpenMeteo’s API for data collection and analysis, with a focus on creativity within structured constraints.
Overview
This assignment builds upon the skills developed in 📝 W04 Formative Exercise, extending from single-city analysis to a multi-city comparison. While the Week 04 practice exercise focused on basic API usage and data organization, this project requires more sophisticated analysis and visualization techniques.
📚 Preparation
You must click on the GitHub Classroom link (see footnote 1) to create your designated repository. Do not create a separate repository.
Clone the repository to Nuvolos (or your local machine, if you are brave) and create the necessary folders and files according to the rest of the instructions.
Submission
📅 Due Date: 27 February 2025, 8pm UK time
📤 Submission Method: Push your work to your allocated GitHub repository. DO NOT submit via Moodle.
🤖 AI Usage: You are allowed to use AI tools for whatever you want in this assignment. You won’t lose any marks for copy-pasting code from AI but you might lose marks if your coding choices deviate from the course material without proper justification.
Remember what we’ve been discussing in lectures: those who rely too much on AI are not learning as much because, instead of relying on their own notes or coming to instructors for help, they are pasting AI’s output into their code without any understanding of it. You might not realise it, but this is very apparent to us markers 😬.
If you submit after this date without an authorised extension, you will receive a late submission penalty.
Need an Extension?
If you have extenuating circumstances that require an extension:
- Email 📧 (Kevin) with details of your situation
- Include the extensions form
- Submit your request before the cut-off time. You will typically get an answer within 24 hours.
⚠️ Note: Extensions are granted only for valid extenuating circumstances, not for technical difficulties with Git, Nuvolos or time management issues. Start early and use our support resources (Slack, drop-in sessions, office hours) if you need help.
📝 Research Question
You’ve been commissioned by the Office of Curious Analytics (OCA) to investigate summer heat stress in world capitals. Your task is to answer:
“Which world capital experienced the most extreme summer conditions for outdoor activities in their most recent meteorological summer?”
💡 Note: Your analysis should focus on quantifying and comparing the severity of heat stress conditions across your chosen cities. You will need to justify both your choice of cities and your method for measuring “extreme” conditions.
🤔 Key Decisions
Here are some decisions you will need to make:
City Selection
- Choose exactly 4 national capitals for analysis (e.g., London, Paris, Berlin, Rome)
- All cities must be from the same hemisphere (Northern or Southern)
- Explain why these cities make a good comparison set
- A subjective explanation is acceptable (e.g., “I think these cities represent diverse climates within Europe”)
🚨 Please stick to 4 cities, don’t add more. Because everyone will be using Nuvolos, which is essentially a single shared machine, we need to be mindful of API usage.
Define Extreme Summer
Propose a definition of what constitutes an "extreme summer" for outdoor activities.
Here you can either:
- Use the wet-bulb temperature variable and one of the thresholds indicated by the organisations mentioned in this The Atlantic article (see footnote 2). This variable is available on OpenMeteo inside the "Additional Variables And Options" section; it’s just not listed prominently. (We won’t tell you what the variable is called, but if you read the above carefully and search for it on the OpenMeteo website, you’ll easily find its name.)
- Propose your own variable and thresholds. Just make sure your rationale is grounded in reputable sources (e.g., academic publications or guidelines of professional bodies). (This time you can’t simply come up with a variable and thresholds out of thin air. You will need to do a tiny bit of research to find a justification for your choice.)
Time granularity
- Choose how to aggregate your data (hourly, daily, weekly, monthly or something else)
- Explain why this granularity is appropriate for your analysis
- Example: “I chose daily averages because extreme heat’s impact on outdoor activities is best assessed over full days rather than specific hours”
Define how you will compare cities
Will you count the number of hours exceeding the threshold per day or will you calculate averages? The type of analysis is up to you. Beyond the technical implementation and the quality of your code, we will assess how reasonable your choice is.
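As an illustration of the first option, the sketch below counts, for each city and day, how many hours exceed a threshold, using vectorised `pandas` operations rather than explicit loops. The column names (`city`, `time`, `value`), the threshold of 30.0 and the sample numbers are all made up for the example; they are not the variable or threshold you are expected to choose.

```python
import pandas as pd


def hours_above_threshold(df: pd.DataFrame, threshold: float) -> pd.DataFrame:
    """Count, per city and calendar day, the hours whose value exceeds `threshold`.

    Assumes (hypothetical) columns: `city`, `time` (datetime64) and `value`.
    """
    # Vectorised comparison: one boolean per row, no explicit loop
    exceeds = df["value"] > threshold
    # Normalise timestamps to midnight so that every hour maps to its day
    day = df["time"].dt.normalize()
    counts = (
        exceeds.groupby([df["city"], day])
        .sum()
        .rename("hours_above")
        .reset_index()
        .rename(columns={"time": "date"})
    )
    return counts


# Tiny made-up sample: two cities, three hourly readings each
sample = pd.DataFrame(
    {
        "city": ["Rome"] * 3 + ["Berlin"] * 3,
        "time": pd.to_datetime(
            ["2024-07-01 12:00", "2024-07-01 13:00", "2024-07-02 12:00"] * 2
        ),
        "value": [31.0, 32.5, 29.0, 28.0, 30.5, 27.0],
    }
)

daily_counts = hours_above_threshold(sample, threshold=30.0)
```

Whether a count like this or an average is the better measure depends on your definition of "extreme", which is exactly the choice you need to justify.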
📋 Other Requirements
Here are a few other things you must adhere to in your work:
Timeframe
Although the time granularity is up to you, the period of analysis must be the last meteorological summer:
- Northern Hemisphere: June 1st - August 31st, 2024
- Southern Hemisphere: December 1st, 2023 - February 29th, 2024 (2024 is a leap year)
💡 Note: You must collect data for the entire period, even if you later decide to focus on specific weeks or days for your analysis.
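One way to confirm that you really have the entire period is to compare the timestamps you received against the ones you expected. The sketch below does this for the Northern Hemisphere dates above. It assumes hourly data with timestamps formatted like `2024-06-01T00:00`, so check the format of your own response first; the function names are invented for this example.

```python
from datetime import datetime, timedelta


def expected_hourly_timestamps(start_date: str, end_date: str) -> list:
    """Return every hourly timestamp from 00:00 on start_date to 23:00 on end_date."""
    first = datetime.strptime(start_date, "%Y-%m-%d")
    last_day = datetime.strptime(end_date, "%Y-%m-%d")
    # Both the start and the end date are included in full
    n_hours = ((last_day - first).days + 1) * 24
    return [
        (first + timedelta(hours=i)).strftime("%Y-%m-%dT%H:%M")
        for i in range(n_hours)
    ]


def find_missing_timestamps(received: list, start_date: str, end_date: str) -> list:
    """List the expected timestamps that are absent from `received`."""
    received_set = set(received)
    return [
        stamp
        for stamp in expected_hourly_timestamps(start_date, end_date)
        if stamp not in received_set
    ]


# Northern Hemisphere summer 2024: 92 days of 24 hours each
expected = expected_hourly_timestamps("2024-06-01", "2024-08-31")
# Simulate a response that lost its final hour
missing = find_missing_timestamps(expected[:-1], "2024-06-01", "2024-08-31")
```

An empty `missing` list would suggest the period is fully covered; it says nothing about whether the values themselves are present or plausible.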
Coding Standards
- You must use the `requests` library to send API requests. (Building on W02 & W03 foundations)
- You must use the `json` library to parse and store the API responses to file. (Extending W02 & W03 skills)
- You must use relative paths to read/write data to the `data/raw` and `data/processed` folders. (A W03 concept)
- All pre-processing of the data must be done using vectorised operations with `pandas` (preferably) or `numpy`. (W04 & W05 concepts)
- All visualisations must be done using exclusively the `lets-plot` library. (A W05 concept)
- Any use of functions or programming concepts that did not feature on Dataquest or in lectures or in classes MUST be explicitly justified. What made this situation unique, forcing you to use something we didn’t cover in the course so far?
- Code must aspire to be readable and self-explanatory. Use meaningful variable names and include comments to explain complex operations or key decisions. (This will help future-you and others understand your code later)
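To make the first three standards concrete, here is a minimal sketch of a collection step that uses `requests` to call the API and `json` to store the response under a relative path. It assumes OpenMeteo's historical archive endpoint and its `latitude`, `longitude`, `start_date`, `end_date` and `hourly` parameters (check the OpenMeteo documentation for your own case), uses `temperature_2m` purely as a placeholder variable, and assumes the notebook runs from the `notebooks/` folder. The function names, the file naming and London's approximate coordinates are illustrative.

```python
import json
import os

import requests

# Assumed endpoint: OpenMeteo's historical weather (archive) API
ARCHIVE_URL = "https://archive-api.open-meteo.com/v1/archive"
# Relative path, assuming this notebook lives in notebooks/
RAW_DIR = os.path.join("..", "data", "raw")


def build_params(latitude, longitude, start_date, end_date, variable="temperature_2m"):
    """Assemble the query parameters for one city (placeholder variable by default)."""
    return {
        "latitude": latitude,
        "longitude": longitude,
        "start_date": start_date,
        "end_date": end_date,
        "hourly": variable,
    }


def fetch_weather(params, timeout=30):
    """Send the request and return the parsed JSON, raising on HTTP errors."""
    response = requests.get(ARCHIVE_URL, params=params, timeout=timeout)
    response.raise_for_status()
    return response.json()


def save_raw_json(payload, city, raw_dir=RAW_DIR):
    """Write the response to <raw_dir>/<city>.json and return the file path."""
    os.makedirs(raw_dir, exist_ok=True)
    file_path = os.path.join(raw_dir, f"{city.lower()}.json")
    with open(file_path, "w", encoding="utf-8") as f:
        json.dump(payload, f, indent=2)
    return file_path


london_params = build_params(51.51, -0.13, "2024-06-01", "2024-08-31")
# Uncomment in your notebook to actually call the API and store the response:
# london_data = fetch_weather(london_params)
# save_raw_json(london_data, "London")
```

Keeping parameter building, fetching and saving in separate functions makes each step easy to check on its own before any request is sent.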
📂 Repository Structure
Ensure your repository adheres to the following structure:
<github-repo-folder>/
|-- data/
| |-- raw/
| |-- processed/
|-- notebooks/
| |-- NB01 - Data Collection.ipynb
| |-- NB02 - Analysis.ipynb
|-- README.md
What goes in each notebook?
The suggested structures for each notebook are given below. While you have flexibility in how you organise your work, following these structures will help ensure your analysis is clear and complete.
NB01 - Data Collection.ipynb
This notebook focuses on gathering and storing the weather data. Its purpose is to document your data collection process, from API requests to file storage, ensuring your analysis is reproducible.
Section | Details | Suggested Level |
---|---|---|
Title and Overview | First Markdown cell that includes: i) Your name and LSE candidate number, ii) The notebook’s purpose, iii) A high-level summary of your approach | H1 |
Imports | Section where necessary Python packages are imported. All imports must be here and should not appear anywhere else in the notebook. | Not a heading |
City Selection | Document and justify your choice of cities, including any relevant background information. Aim for a concise yet well-grounded explanation. | H2 |
Data Collection | Code and explanation for fetching data from OpenMeteo API. | H2 |
Data Storage | Code for saving the JSON files to the `data/raw` folder. | H2 |
Next Steps | Preview of your analysis approach and how this data will help answer your research question. | H2 |
NB02 - Analysis.ipynb
This notebook transforms your raw data into insights. It should tell a clear story about summer heat conditions in your chosen cities, supported by data and visualisations.
Section | Details | Suggested Level |
---|---|---|
Title and Overview | First Markdown cell that includes: i) Your name and LSE candidate number, ii) The notebook’s purpose, iii) A high-level summary of your approach | H1 |
Imports | Section where necessary Python packages are imported. | H2 |
Data Loading | Load the JSON files from the `data/raw` folder. | H2 |
Data Processing | Transform your data into a tabular format that is suitable for the subsequent analysis (exploratory data analysis and plots). Save that cleaned data to the `data/processed` folder as a CSV file. | H2 |
Analysis | Systematic investigation of your research question, with clear yet concise explanations of your methodology and findings. | H2 |
Visualisations | Carefully designed plots that support your analysis, with meaningful titles and clear interpretations. | H2 |
Conclusions | Synthesise your findings, acknowledge limitations, and suggest potential areas for further investigation. | H2 |
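
As a rough sketch of the Data Processing step, the code below turns raw responses into one tidy table. It assumes each response stores parallel lists under an `hourly` key (a `time` list plus one list per variable), which you should verify against your own files; `temperature_2m`, the function name, the output file name and the sample payload are placeholders.

```python
import pandas as pd


def hourly_to_dataframe(payload: dict, city: str) -> pd.DataFrame:
    """Convert one response's `hourly` block into a table with a `city` column."""
    # Each key under "hourly" becomes a column; the lists must have equal length
    df = pd.DataFrame(payload["hourly"])
    # Parse the ISO-style strings into proper timestamps
    df["time"] = pd.to_datetime(df["time"])
    # Put the city name first so that tables can be stacked later
    df.insert(0, "city", city)
    return df


# Made-up payload standing in for a file loaded from ../data/raw with json.load
sample_payload = {
    "hourly": {
        "time": ["2024-06-01T00:00", "2024-06-01T01:00"],
        "temperature_2m": [14.2, 13.8],
    }
}

frames = [hourly_to_dataframe(sample_payload, city) for city in ["London", "Paris"]]
combined = pd.concat(frames, ignore_index=True)

# In NB02 you would then save the cleaned table, for example:
# combined.to_csv("../data/processed/weather.csv", index=False)
```

A long table with one row per city and timestamp tends to be convenient for both grouping and plotting.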
README
(Do this one last, after you’ve completed the notebooks.)
The point of a README is to act as a high-level overview of the project. It should be concise and to the point.
Section | Details | Heading Level |
---|---|---|
<Name of your project> | Give a name to your project and give us an overview of what it is about. (a few sentences) | H2 |
Methodology | Explain choices you’ve made and how you went about answering your research question. | H2 |
Usage | Tell us how to run the code and access the notebooks. Which packages do I need to install? How should I run the code? | H2 |
Results | Summarise your findings in a few sentences and include the plots here. | H2 |
(optional) AI Acknowledgment | A transparent statement on the use of AI tools, if applicable, including their impact on the project. | H2 |
✔️ Marking Guide
In line with the unwritten but widely-used UK marking conventions, grades must be awarded as follows:
- 40-49: Basic implementation with significant room for improvement
- 50-59: Working implementation meeting basic requirements
- 60-69: Good implementation demonstrating solid understanding
- 70+: Excellent implementation going beyond expectations, showing creativity and depth without over-engineering
Documentation and Repository Structure (0-25 marks)
A strong submission (~17-18 marks) will demonstrate:
- Professional repository organisation with the correct structure of folders and files
- Comprehensive yet ‘concise enough’ README that gives readers a good overview of the project and how to run the code
- Good documentation principles in the notebooks, with meaningful section headings and comments in places where code is not self-explanatory
- Thoughtful commit messages that tell a coherent story of the evolution of the project
Excellence in this category (beyond the basic requirements explicitly taught in the course) comes from:
- Documentation that shows deep understanding of software engineering principles
- Repository structure that demonstrates forward thinking about maintainability, showing that the project is organised to be scalable to many more cities and is future-proofed such that it can be easily extended to other types of analysis
Data Collection Code (0-25 marks)
A strong submission (~17-18 marks) will demonstrate:
- Well-structured notebooks where each notebook has a single, focused overall purpose
- Efficient and reliable interaction with the API, with code that handles errors and checks that the data is as expected
- Clear code organisation with meaningful variable and function names
- Thoughtful comments that explain “why” not just “what”
- Data is stored in the appropriate place, with the correct file name and extension
Excellence in this category (beyond the basic requirements explicitly taught in the course) comes from:
- Elegant solutions that anticipate and properly handle unexpected situations (like missing data or unusual API errors)
- Code that is both efficient and highly readable
- Creative yet practical approaches to data collection challenges without over-engineering the solution
Analysis and Visualizations (0-35 marks)
A strong submission (~25 marks) will demonstrate:
- Clear methodology that defines what constitutes “extreme” conditions. Even if using the wet-bulb temperature variable, you must explain what it represents and justify why you chose the threshold you did and what it means for the analysis
- Appropriate use of `pandas` (preferably) or `numpy` operations. Instead of explicit `for` loops, you should use vectorised operations.
- Effective use of `lets-plot` for visualization, with meaningful titles that convey a key takeaway of the plot rather than just describing its axes. We will harshly penalise the use of other data viz tools (unless you have a really good justification for why you couldn’t do that plot in lets-plot).
- Thoughtful interpretation of results
Excellence in this category comes from:
- Sophisticated analysis that shows deep understanding of the data
- Visualisations that effectively communicate complex patterns
- Critical examination of assumptions and limitations
- Novel approaches to answering the research question
Creativity and Originality (0-15 marks)
A strong submission (~10 marks) will demonstrate:
- Creative yet justified methodology choices
- Innovative use of course concepts
- Clear and engaging narrative
- Professional presentation
Excellence in this category (beyond the basic requirements explicitly taught in the course) comes from:
- Original approaches that enhance rather than complicate. Always remember: doing more is not better.
- Thoughtful innovations that demonstrate deep understanding of the course material.
- Engaging presentation that maintains professional standards
- Creative solutions that could be applied in real-world scenarios
⚠️ Important Note: While we encourage exploration and creativity, using concepts or libraries not covered in the course without proper justification will result in mark deductions. If you need to use something we haven’t covered, you must explain why it was necessary and demonstrate your understanding of it.
💡 Note: This assignment contributes 20% to your final grade. It builds on concepts from Weeks 01-05 and prepares you for the more complex group project later in the term.
Feedback
Feedback will include:
- Strengths and areas for improvement.
- Suggestions for enhancing your approach in future assignments.
💡 Tip: The OpenMeteo API has rate limits. You should implement a delay between requests (e.g., using `time.sleep()`) to avoid being blocked.
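A minimal sketch of that idea follows. The helper takes whatever function you use to fetch one city's data, so the names `fetch_all_cities`, `fetch_one` and `fake_fetch` are illustrative, and the delay values are arbitrary choices rather than documented OpenMeteo limits.

```python
import time


def fetch_all_cities(cities, fetch_one, delay_seconds=1.0):
    """Call `fetch_one(city)` for each city, pausing between consecutive requests."""
    results = {}
    for index, city in enumerate(cities):
        if index > 0:
            # Pause before every request except the first one
            time.sleep(delay_seconds)
        results[city] = fetch_one(city)
    return results


def fake_fetch(city):
    """Stand-in for a real API call, used only to demonstrate the helper."""
    return {"city": city}


demo = fetch_all_cities(["London", "Paris"], fetch_one=fake_fetch, delay_seconds=0.1)
```

With only four cities the total waiting time stays negligible, so there is little cost to being cautious.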
Footnotes
1. Visit the Moodle version of this page to get the link. The link is private and only available for formally enrolled students.
2. Wet-bulb temperature is a key metric for understanding heat stress, as it accounts for both temperature and humidity. High wet-bulb values reduce the human body’s ability to cool down and can make outdoor activities unsafe. This assignment will require you to analyse wet-bulb temperature data and benchmark it against established thresholds for heat stress.