Problem Set 1: Data Infrastructure Engineering (20%)
2024/25 Winter Term
Submission Details
Due Date: 7 March 2025, 8pm UK time
Submission Method: Push your work to your allocated GitHub repository.
This assignment offers two options, allowing you to focus on either API development or advanced web scraping. Choose the option that best aligns with your interests and career goals.
Tip: Start early and make regular commits. This helps track your progress and ensures you have a working solution by the deadline.
Note: Late submissions will receive penalties according to LSE's policy.
Overview
You will choose between:
- API Development: Build a comprehensive API for the Transition Pathway Initiative's Corporate Assessment data
- Advanced Web Scraping: Create a complete data collection solution for the Climate Action Tracker website
Preparation
Click on this GitHub Classroom link (see footnote 1) to create your designated repository.
Choose one of the two options below and follow its specific requirements.
Option A: Corporate Assessment API
Building on your experience with the ASCOR API (W02-W03 Practice Exercise), design an API for the Transition Pathway Initiative's Corporate Assessment data.
What we're looking for
Requirement | Details |
---|---|
1. Data Structure Design | - Design a clear and efficient data structure<br>- Document your design choices<br>- Make the data structure future-proof (for example, if we decide to add different endpoints in the future) |
2. API Implementation | - Create clear endpoint documentation<br>- Implement robust data validation (★)<br>- Include error handling<br>- Add unit tests (★) |
3. Research and Documentation | - Demonstrate understanding of the data<br>- Document use cases and potential users of the data you're serving (★)<br>- Consider future extensions (★) |

The items marked with a ★ are optional and, if implemented well, will put you on the path to earn 70+ marks (distinction).
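To make "robust data validation" and "error handling" more concrete, here is a minimal sketch using only the Python standard library. It is one possible shape, not the expected solution: the record type, the field names (`company_id`, `assessment_year`, `management_quality`) and the 0 to 5 range are all hypothetical placeholders, so check them against the actual Corporate Assessment data before relying on them.

```python
from dataclasses import dataclass

# Hypothetical field names; replace them with the columns you actually serve.
REQUIRED_FIELDS = ("company_id", "assessment_year", "management_quality")


class ValidationError(ValueError):
    """Raised when an incoming record fails validation."""


@dataclass(frozen=True)
class CompanyAssessment:
    company_id: str
    assessment_year: int
    management_quality: int  # assumed here to be an integer level from 0 to 5


def parse_assessment(raw: dict) -> CompanyAssessment:
    """Validate a raw dict and return a typed record, or raise ValidationError."""
    missing = [key for key in REQUIRED_FIELDS if key not in raw]
    if missing:
        raise ValidationError(f"missing fields: {', '.join(missing)}")

    company_id = str(raw["company_id"]).strip()
    if not company_id:
        raise ValidationError("company_id must not be empty")

    try:
        year = int(raw["assessment_year"])
        level = int(raw["management_quality"])
    except (TypeError, ValueError) as exc:
        raise ValidationError(f"expected integer values: {exc}") from exc

    if not 0 <= level <= 5:  # assumed range, see the note above
        raise ValidationError(f"management_quality out of range: {level}")

    return CompanyAssessment(company_id, year, level)
```

In an API, a handler could catch `ValidationError` and turn its message into a client-error response, which keeps validation logic testable on its own and separate from the web framework you choose.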
Option B: Climate Action Tracker Scraper
Building on your experience with web scraping (W04-W05 Practice Exercise), create a comprehensive data collection solution for the Climate Action Tracker website.
What we're looking for
Requirement | Details |
---|---|
1. Data Collection | - Scrape all available data from country pages (including text)<br>- Use your discretion to determine what is relevant and what is not<br>- Handle both structured content (data that can be put into a table) and unstructured content (raw text, images, etc.) appropriately<br>- Download and organise related media (★) |
2. Data Organisation | - Design a logical folder structure<br>- Implement proper file naming<br>- Create a data dictionary document (★) |
3. Research and Documentation | - Demonstrate understanding of the data<br>- Document your scraping strategy<br>- Consider and document the ethical implications of your scraping (★)<br>- Handle rate limiting professionally (★) |

The items marked with a ★ are optional and, if implemented well, will put you on the path to earn 70+ marks (distinction).
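As an illustration of what handling rate limiting could look like, the sketch below enforces a minimum delay between consecutive requests using only the standard library. The class name `RateLimiter` and its parameters are made up for this example, and a fixed delay is only one of several reasonable strategies (backing off on error responses is another).

```python
import time


class RateLimiter:
    """Enforce a minimum delay between consecutive calls to wait()."""

    def __init__(self, min_interval: float, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        # clock and sleep are injectable so the class can be tested without real waiting
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time the previous request was allowed, None before the first

    def wait(self) -> float:
        """Block until the next request is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            elapsed = now - self._last
            delay = max(0.0, self.min_interval - elapsed)
        if delay > 0:
            self._sleep(delay)
        self._last = now + delay
        return delay
```

A scraper would create one limiter, for instance `limiter = RateLimiter(2.0)`, and call `limiter.wait()` immediately before every page request. The two-second interval is an arbitrary example value; choose yours after checking what the site asks of automated clients.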
Common Requirements
Requirement | Details |
---|---|
1. Repository Structure | - Clear folder organisation of the repository<br>- Well-organised README (informative yet concise, not overly verbose)<br>- Complete documentation<br>- Follow the standard Python project layout for the type of project you are building |
2. Code Quality | - Clean, readable code<br>- Proper error handling<br>- Efficient implementations<br>- Follow the PEP 8 style guide (★) |
3. Testing | - Basic error case handling<br>- Performance considerations<br>- Comprehensive unit tests (★)<br>- Integration tests (★) |

The items marked with a ★ are optional and, if implemented well, will put you on the path to earn 70+ marks (distinction).
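For a sense of scale, a unit test that covers both a normal case and an error case can be very small. The sketch below uses the standard library's `unittest`; the helper `country_filename` is a hypothetical example written for this illustration, not something the assignment requires you to implement.

```python
import re
import unittest


def country_filename(name: str, suffix: str = ".json") -> str:
    """Turn a country name into a lowercase, filesystem-safe file name."""
    # Collapse every run of characters other than a-z and 0-9 into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", name.strip().lower()).strip("-")
    if not slug:
        raise ValueError(f"cannot build a file name from {name!r}")
    return slug + suffix


class CountryFilenameTests(unittest.TestCase):
    def test_spaces_and_case(self):
        self.assertEqual(country_filename("United Kingdom"), "united-kingdom.json")

    def test_punctuation_collapses(self):
        self.assertEqual(country_filename("  Korea, Rep. of "), "korea-rep-of.json")

    def test_empty_name_is_an_error(self):
        with self.assertRaises(ValueError):
            country_filename("   ")
```

Saved in a file such as `test_filenames.py`, these tests run with `python -m unittest`. Note that this helper replaces accented letters with hyphens, which is the kind of limitation worth documenting or covering with an additional test.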
Marking Guide
In line with the unwritten but widely used UK marking conventions, grades must be awarded as follows:
- 40-49: Basic implementation with significant room for improvement (typically missing core requirements)
- 50-59: Working implementation but one that meets only the very basic requirements (it looks incomplete)
- 60-69: Good implementation demonstrating solid understanding with small caveats and minor improvements possible
- 70+: Excellent implementation going beyond expectations, showing creativity and depth of understanding without being overly verbose or over-engineered
Note from Jon: I find this artificial "cap" at 70+ marks silly and unnecessary, and it clashes with what I understand to be the pedagogical purpose of an undergraduate course that is all about demonstrating hands-on experience. If I can show that your work is of a high standard and clearly demonstrates that you are truly and meaningfully engaged with the material beyond a shallow level, I'll be happy to award distinctions.
Core Requirements (0-70 marks)
Component | Details | Marks |
---|---|---|
Technical Implementation | - Working solution<br>- Proper error handling<br>- Clean, documented code | 30 |
Data Management | - Appropriate data structures<br>- Efficient processing<br>- Logical organisation | 25 |
Documentation | - Clear README<br>- Usage examples<br>- Implementation details | 15 |
Path to Distinction (70+ marks)
To achieve a distinction, submissions must demonstrate excellence either in the most challenging parts of the course materials or in going beyond them to apply the techniques in new, creative, and meaningful ways (just doing more stuff doesn't qualify):
Enhancement | Examples | Additional Marks |
---|---|---|
Technical Excellence | - Robust data validation<br>- Comprehensive test suite<br>- Professional rate limiting | +10 |
Architecture & Performance | - Future-proof design<br>- Efficient processing<br>- Optimised data structures | +8 |
Research & Documentation | - Data dictionary<br>- Use case analysis<br>- Ethical considerations | +7 |
Innovation | - Novel features<br>- Creative solutions<br>- Meaningful extensions | +5 |
Show you can act on feedback
When you get feedback on your work, I'll give you a list of things you can do to improve it and earn up to 15 extra marks. If you implement them by the new deadline, I'll award the extra marks. The new deadline will be given in the feedback message.
For example, if you get 60 marks and I give you a list of 5 things you can do to improve your work, your grade could go up to at most 75.
Feedback
You will receive:
- Detailed feedback on your implementation
- Suggestions for improvement
- Justification for marks awarded and specific suggestions for improvements that could earn you up to 15 additional marks
Footnotes
1. Visit the Moodle version of this page to get the link. The link is private and only available to formally enrolled students.