DS205 – Advanced Data Manipulation
20 Jan 2025
Dr Jon Cardoso-Silva
@jonjoncardoso
Assistant Professor
LSE Data Science Institute
Current Focus:
Office Hours:
Thursdays, 11:00-13:00
Book via StudentHub
Recent Recognition:
LSESU Teaching Award for Feedback & Communication (2023)
Kevin Kittoe
Teaching & Assessment Administrator (DSI)
ADMINISTRATIVE SUPPORT
Contact 📧 DSI.ug@lse.ac.uk for:
Key Information:
Terry Zhou
@tz1211
Code Maintainer & Research Assistant
3rd-Year BSc in Politics and Data Science
CODE MAINTAINER
Terry has been working with me for the past ~2 years and has practical experience in:
As code maintainer, he'll:
Transition Pathway Initiative Centre
The TPI Centre evaluates companies' readiness for transition to a low-carbon economy. Their work involves a lot of analysis of data that is often messy and unstructured.
👉🏻 Everything you produce in this course has the potential to help TPI automate their data processing workflows.
Why this course exists:
By the end of this course, you should be able to:
pandas to automate and optimise data cleaning and data processing workflows.

| Weight | Type | Assessment | Key dates |
|--------|------|------------|-----------|
| 20% | Individual | ✍️ Problem Set 1: Web Scraping & API Development | Release: ~Week 04. Due: 5 March 2025, 8pm |
| 40% | Individual | ✍️ Problem Set 2: RAG System Implementation | Release: ~Week 06. Due: 26 March 2025, 8pm |
| 40% | Group Work | 👥 Final Project: TPI Data Pipeline Development | Details: Spring Term. Due: May/June 2025 (TBC) |
Weekly formative exercises in Weeks 01-04 will prepare you for the summative assessments. These include hands-on practice with GitHub workflows, API development, and web scraping techniques.
Communication
Don't like your laptop for coding?
We have a dedicated cloud environment on Nuvolos
Visit the Nuvolos - First Time Access page to learn how to get access to the DS205 environment.
Read the syllabus for week-by-week information on how we will cover the course content and assessments.
You will find TPI's slides on the course Moodle page later.
After the break:
Watch me as I load the ASCOR dataset and perform some basic operations with pandas. Take notes and ask questions as we go along.
I won't provide a step-by-step guide before the live coding session, as you will be replicating these tasks in class later. A model solution will be available after Tuesday's 💻 W01 lab.
LSE DS205 (2024/25)