Week 04 - Checklist
DS105 - Data for Data Science
Here is a suggested way to plan your week for this course:
Ready for a challenge? Before the lecture, why not take your knowledge of the Linux shell to the next level? Create an account on HackerRank and try to solve the Linux Shell challenges rated Medium difficulty.
🧑‍💻 Keep on practicing Python/R: If you are happy with your knowledge of the terminal, try completing the pre-sessional course or work through the Python challenges on HackerRank.
🧑‍🏫 Attend the lecture: This is an important week. It is when you will start putting your knowledge of programming to use.
Reserve some time before the labs on Friday to replicate the web scraping and API demos I showed in the lecture. Or, even better, reuse the code to collect data from other websites that interest you.
I will focus the demos on Python, but we will show you how to adapt the same techniques to R.
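As a warm-up before replicating the demos, here is a minimal sketch of the kind of link extraction a scraping exercise might start with. It is not the code from the lecture: the names `LinkCollector`, `extract_links`, and `SAMPLE_HTML` are hypothetical, the HTML snippet is made up for illustration, and it uses only the Python standard library so that it runs without a network connection.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (link text, href) pairs for every <a href=...> tag."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (text, href) pairs
        self._href = None    # href of the <a> tag currently open, if any
        self._text = []      # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an <a> tag with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_links(html):
    """Return a list of (text, href) tuples found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up HTML standing in for a downloaded page (an assumption for this sketch).
SAMPLE_HTML = """
<ul>
  <li><a href="/datasets/air-quality">Air quality</a></li>
  <li><a href="/datasets/transport">Transport</a></li>
</ul>
"""

for text, href in extract_links(SAMPLE_HTML):
    print(text, href)
```

To try this on a real page, you could pass `extract_links` the decoded text of a page downloaded with `urllib.request.urlopen`, after checking that the website permits scraping.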
Solve: The First Summative Problem Set will be released on Moodle on 18 October 2022. Reserve some time during the week to solve it before the lab on Friday.
🧑‍💻 Attend the labs: If you have practiced with the code I introduced in the lecture, the labs will be much easier. Bring your technical questions with you!
Topics of interest: Do you already have an idea of which websites you want to scrape, or which public APIs you want to use? Share it in the “#dataset-ideas-and-team-formation” channel on Slack.
At the end of the week, I will add a page to our website that will include a list of suggested APIs and websites suitable for web scraping. These will only be suggestions; when you form your teams, you can choose to use other websites or APIs, as long as they meet the required criteria of the final project.
This page will also explain what will be expected of you in the final project: the minimum volume of data you will need to obtain, what will be required in terms of data pre-processing (W05 and beyond), visualisations, the code you will need to provide, etc.