Week 07 - Data summarisation and the grammar of graphics
Theme: Cleaning and reshaping data
Our teaching this week revolves around data summarisation and a concept known as the grammar of graphics.
But before creating any summaries or plots, I will show you how to collect data from an API. We've already given you plenty of exposure to web scraping, but scraping is not the easiest way to obtain data. If the developers behind your data source provide an interface for accessing their data, it is best to use it! The data will generally arrive in a highly structured and, hopefully, well-documented JSON format, so you no longer need to reverse engineer the styling of a webpage to get what you want. To illustrate this, I will use the Reddit API.
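To give a feel for why structured JSON is easier to work with than scraped HTML, here is a minimal sketch. It is not necessarily what we will write in the lecture. The helper names `fetch_json` and `listing_to_rows`, the user-agent string, and the sample payload are all made up for illustration; only the `data` → `children` → `data` nesting is meant to mirror how Reddit listings are laid out.

```python
import json
from urllib.request import Request, urlopen


def fetch_json(url, user_agent="week07-demo/0.1"):
    """Download a URL and parse the response body as JSON.

    Hypothetical helper: the user-agent string is a placeholder, and a real
    API may also require authentication that is not handled here.
    """
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:
        return json.load(response)


def listing_to_rows(listing, fields=("title", "score", "num_comments")):
    """Flatten a Reddit-style listing into one dict per post."""
    rows = []
    for child in listing["data"]["children"]:
        post = child["data"]
        # .get() returns None when a field is absent instead of raising.
        rows.append({field: post.get(field) for field in fields})
    return rows


# Made-up payload that only mimics the nesting of a Reddit listing.
sample = json.loads("""
{
  "kind": "Listing",
  "data": {
    "children": [
      {"kind": "t3", "data": {"title": "First post", "score": 10, "num_comments": 2}},
      {"kind": "t3", "data": {"title": "Second post", "score": 7}}
    ]
  }
}
""")

rows = listing_to_rows(sample)
for row in rows:
    print(row)
```

No network request is made here: `fetch_json` is defined but never called. With a real listing URL and whatever credentials the API requires, its return value could be passed straight to `listing_to_rows`, and the resulting list of dicts is a convenient starting point for a table.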
PREPARATION
To come well prepared for the lecture, clone the following GitHub repository:
LINK TO REPOSITORY
Lecture Schedule
Location: Thursday 9 November 2023, 4 pm - 6 pm at CKK.1.04
👨‍🏫 Lecture Material
🎥 Looking for lecture recordings? You can only find those on Moodle, typically a day after the lecture. If you can't find the recordings, please contact 📧 .
Material
This week's lecture material is available in this dedicated GitHub repository:
LINK TO REPOSITORY
Solutions to the exercises and live demos in the Jupyter Notebook of this lecture will NOT be posted here afterwards. We will create these solutions together during the lecture.