Week 10 - Databases + Data reshaping + Basics of Text Mining
Theme: Cleaning and reshaping data
Yep, there's still more to learn about data cleaning and reshaping! This week, we'll look at combining data from multiple data frames, reshaping data so it is easier and faster to plot, and using regular expressions to extract information from text data. We'll also look at databases and SQL, and how to use them in Python.
Keeping up with the hands-on spirit of this course, the lecture material will be delivered via GitHub, and we'll use Jupyter Notebooks for the live demos.
🎯 Learning Objectives
- Combining data from multiple data frames (pd.merge())
- Reshaping data so it is easier/faster to plot (pd.pivot(), pd.melt())
- An introduction to databases and SQL - for when CSV files get too big (the basics of SQLite + pd.read_sql())
- An introduction to regular expressions, using the re module in Python
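
As a rough preview of the four objectives above, here is a minimal sketch, assuming pandas is installed. The toy data frames, the column names, and the table name `grades` are invented for illustration and are not taken from the lecture repository; the notebooks used in the live demos may approach each step differently.

```python
import re
import sqlite3

import pandas as pd

# Hypothetical toy data, invented for illustration only.
students = pd.DataFrame(
    {"student_id": [1, 2, 3], "name": ["Ana", "Ben", "Chloe"]}
)
grades = pd.DataFrame(
    {"student_id": [1, 2, 3], "week09": [70, 55, 82], "week10": [75, 60, 88]}
)

# 1. Combine two data frames on a shared key column.
merged = pd.merge(students, grades, on="student_id", how="left")

# 2. Reshape wide -> long (one row per student per week), which is
#    often the layout plotting libraries expect...
long_df = pd.melt(
    merged,
    id_vars=["student_id", "name"],
    value_vars=["week09", "week10"],
    var_name="week",
    value_name="grade",
)
# ...and long -> wide again (one row per student, one column per week).
wide_df = pd.pivot(long_df, index="name", columns="week", values="grade")

# 3. Store the long table in an in-memory SQLite database and query it
#    with SQL, reading the result straight back into a data frame.
conn = sqlite3.connect(":memory:")
long_df.to_sql("grades", conn, index=False)
avg_df = pd.read_sql(
    "SELECT name, AVG(grade) AS avg_grade "
    "FROM grades GROUP BY name ORDER BY name",
    conn,
)
conn.close()

# 4. Extract structured information from free text with a regular expression.
text = "Thursday 30 November 2023, 4 pm - 6 pm at CKK.1.04"
match = re.search(r"(\d{1,2}) (\w+) (\d{4})", text)
day, month, year = match.groups()

print(wide_df)
print(avg_df)
print(day, month, year)
```

The in-memory database (`":memory:"`) keeps the sketch self-contained; with a real SQLite file you would pass its path to `sqlite3.connect()` instead.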
PREPARATION
To come well prepared for the lecture, clone the following GitHub repository:
LINK TO REPOSITORY
Lecture Schedule
Location: Thursday 30 November 2023, 4 pm - 6 pm at CKK.1.04
👨‍🏫 Lecture Material
🎥 Looking for lecture recordings? You can only find those on Moodle, typically a day after the lecture. If you can't find the recordings, please contact 📧 .
Material
This week's lecture material is available under this dedicated GitHub repository:
LINK TO REPOSITORY
Solutions to the exercises and live demos in this lecture's Jupyter Notebooks will NOT be posted here afterwards. Everything will be in the GitHub repository.