Step 4: The final part of the process
2023/24 Autumn Term
It is sad, but all good things must come to an end.
What we want to see in the final project.
What we want to see
- A web page that tells the story of your project. The text on your web page must not exceed 4000 words. It should answer:
    - What made you curious about this kind of data in the first place?
    - How did you gather the data?
    - What is in the data? What does it look like in general?
    - What did you find out about the data? (Exploratory Data Analysis)
- A GitHub repository that contains your source code.
    - We want to see that you used the tools and best practices we taught you in class (all eleven weeks of the course).
    - We will check the repo history to ensure all team members have contributed commits to the project (it does not have to be an equal contribution).
How we will mark your final project
While the final project is worth 25% of your final grade, you will be marked on a 0-100 scale.
When you submit your final project, make sure your web page and your GitHub repository address the criteria below:
Source | Criteria | Marks | Description |
---|---|---|---|
Webpage | Motivation | 5 | - The webpage explains what made the group curious about this data. |
Webpage | Data | 5 | - The webpage succinctly lists the data sources and the data collection challenges. |
Webpage | Exploratory Data Analysis (EDA) | 10 | - The webpage paints a vivid picture of the data (things like: the number of data points, what are the different data types and the most relevant columns, summaries and distributions, etc.) |
Webpage | Visualisation | 10 | - The plots look really nice. - All labels are clear and visible. - All variables are clearly identified. - The plots and tables paint a vivid picture of what the data looks like. - The group used ggplot (R) or plotnine (Python) to generate the plots. |
Webpage | Storytelling | 15 | - The text is engaging and clear. - There is no fluff. - The group described relevant technical steps without too many details. - There was a nice conclusion. |
Source code | Organisation | 10 | - The source code is available in the group’s GitHub repository. - The code is replicable. - There is a good structure of files and directories. |
Source code | Collaboration | 5 | - There is a list of everyone’s contributions to the project on the project’s webpage or in the README file. - All members contributed at least one commit to the group’s GitHub repository. Note: we do not expect all group members to do the same thing; each person could have a different contribution. For example, one person could focus more on data collection while another takes care of the visualisations, and the other member could focus more on documentation. |
Source code | Data cleaning | 20 | - We see a good use of pandas (Python) or tidyverse (R) to clean up data. - Data types of the variables are consistent and make sense. - Missing values were identified and dealt with. |
Source code | Data wrangling | 20 | - We see evidence of good use of pandas and/or tidyverse to filter, merge, reshape and pivot your data as needed for the analysis/plots. |
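
To give a rough idea of what the data cleaning and data wrangling criteria above could look like in practice, here is a minimal pandas sketch. The data frame, its column names (`city`, `year`, `visitors`) and its values are made up for illustration only. Dropping rows with missing values is just one possible way of dealing with them; your own data may call for a different approach (and you should say why on your web page).

```python
import pandas as pd

# Hypothetical raw data, invented purely for illustration.
# Everything arrives as text, and some values are missing.
raw = pd.DataFrame(
    {
        "city": ["London", "Leeds", "London", "Leeds", "Bath"],
        "year": ["2021", "2021", "2022", "2022", "2022"],
        "visitors": ["1200", "n/a", "1500", "430", None],
    }
)

# Data cleaning: make the data types consistent.
clean = raw.copy()
clean["year"] = clean["year"].astype(int)
# errors="coerce" turns entries that cannot be parsed (such as "n/a") into NaN.
clean["visitors"] = pd.to_numeric(clean["visitors"], errors="coerce")

# Data cleaning: identify the missing values, then deal with them.
# Here we simply drop the affected rows (an assumption, not a rule).
n_missing = int(clean["visitors"].isna().sum())
clean = clean.dropna(subset=["visitors"])

# Data wrangling: reshape from long to wide,
# with one row per city and one column per year.
wide = clean.pivot_table(
    index="city", columns="year", values="visitors", aggfunc="sum"
)

print(f"Rows with missing visitors: {n_missing}")
print(wide)
```

The same steps could be written with tidyverse in R; what matters is that the repository shows how the raw data became the tidy data behind your plots.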