🗓️ Week 07 - Data summarisation and the grammar of graphics

Theme: Cleaning and reshaping data


Our teaching this week revolves around data summarisation and a concept known as the grammar of graphics.

But before creating any summaries or plots, I will show you how to collect data from an API. You have had plenty of exposure to web scraping, but scraping is rarely the easiest way to obtain data. If the developers behind the data source you want to use have built an interface that grants you access to their data, it is best to use that! The data will generally come in a highly structured and, hopefully, well-documented JSON format, so you no longer need to reverse engineer the styling of a webpage to get what you want. To illustrate this, I will use the Reddit API.
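To make the contrast with scraping concrete, here is a minimal sketch of what reading structured JSON can look like. It is not the code used in the lecture. It assumes Reddit's public listing URLs of the form `https://www.reddit.com/r/<subreddit>/top.json`, which may be rate-limited or require authentication, and the helper names (`fetch_listing`, `posts_to_records`), the user agent string and the sample payload are all made up for illustration.

```python
import json
import urllib.request

# Hypothetical user agent; Reddit asks clients to identify themselves.
USER_AGENT = "ds-course-demo/0.1 (illustrative example)"


def fetch_listing(subreddit, limit=5):
    """Download a listing of top posts as a Python dict.

    Assumes the public .json listing URL is reachable without OAuth,
    which may not hold for your network or for heavier usage.
    """
    url = f"https://www.reddit.com/r/{subreddit}/top.json?limit={limit}"
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def posts_to_records(listing):
    """Flatten a listing into a list of small dicts, one per post."""
    return [
        {
            "title": child["data"]["title"],
            "score": child["data"]["score"],
            "num_comments": child["data"]["num_comments"],
        }
        for child in listing["data"]["children"]
    ]


# Made-up payload with the same nesting as a listing response,
# so the parsing step can be tried without a network connection.
SAMPLE_LISTING = {
    "kind": "Listing",
    "data": {
        "children": [
            {"kind": "t3", "data": {"title": "First example post", "score": 120, "num_comments": 14}},
            {"kind": "t3", "data": {"title": "Second example post", "score": 40, "num_comments": 3}},
        ]
    },
}

records = posts_to_records(SAMPLE_LISTING)
mean_score = sum(record["score"] for record in records) / len(records)
print(records)
print(mean_score)
```

To try it against live data, replace `SAMPLE_LISTING` with `fetch_listing("datascience")` (the subreddit name is only an example). Because the fields arrive already named and nested, no HTML parsing or CSS selectors are needed, and the resulting records are ready to be summarised.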

📚 PREPARATION

To come well prepared for the lecture, clone the following GitHub repository:

🖇️ LINK TO REPOSITORY

📃 Lecture Schedule

📍 Location: Thursday 9 November 2023, 4 pm - 6 pm at CKK.1.04

👨‍🏫 Lecture Material

🎥 Looking for lecture recordings? You can only find those on Moodle, typically a day after the lecture. If you can’t find the recordings, please contact 📧 .

Material

This week’s lecture material is available under this dedicated GitHub repository:

🖇️ LINK TO REPOSITORY

Solutions to the exercises and live demos in the Jupyter Notebook of this lecture will NOT be posted here afterwards. We will create these solutions together during the lecture.