🖥️ Week 02, Day 01 - Lecture

From Loops to Vectorisation: A More Efficient Data Workflow

Author

Dr Jon Cardoso-Silva

Last updated

31 July 2025

🥅 Learning Objectives

By the end of this session, you should be able to:

i) Use a list comprehension and pd.concat to efficiently process and combine multiple datasets.
ii) Explain the performance benefits of vectorised operations over Python loops.
iii) Replace complex loops with vectorised pandas equivalents, including .diff() and .cumsum().
iv) Recognise opportunities to apply vectorisation and list comprehensions in your own data analysis code.

⏰ Date and Time: Monday, 21 July 2025 | 10.00am - 1.00pm
📍 Location: CKK.2.06 (see LSE’s 🗺️ campus map)

Welcome back!

Today, our focus is on making your data workflow smarter, faster, and more professional. We’re going to upgrade your toolkit from for loops to a more powerful way of thinking: vectorisation.

🗣️ Lecture Overview

Today’s session is a journey from the workflow you already know to a more efficient way of executing it:

Confidence First: We’ll start by reviewing the data science workflow to reinforce how much you’ve already learned.
A New Trick: We’ll learn to use list comprehensions and pd.concat as an elegant way to replace common for loops when preparing data.
The Pandas Way: We’ll then dive into pandas vectorisation to see how it can solve complex pattern-matching problems without any loops at all.
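To give a flavour of the last two steps before the slides, here is a minimal sketch. It is not taken from the lecture material: the rainfall numbers, the build_frame helper and the column names are all made up for illustration. It builds one DataFrame per city with a list comprehension, stacks them with pd.concat, and then uses .diff() and .cumsum() per group in place of a hand-written loop.

```python
import pandas as pd

# Hypothetical daily rainfall readings (made-up numbers, illustration only).
raw_data = {
    "London": [0, 2, 3, 0, 1],
    "Lisbon": [0, 0, 5, 4, 0],
}


def build_frame(city, rain_mm):
    """Turn one city's readings into a tidy DataFrame (hypothetical helper)."""
    return pd.DataFrame({
        "city": city,                               # scalar is repeated on every row
        "day": list(range(1, len(rain_mm) + 1)),
        "rain_mm": rain_mm,
    })


# List comprehension + pd.concat replaces the usual
# "create an empty list, loop, append, then combine" pattern.
frames = [build_frame(city, rain) for city, rain in raw_data.items()]
df = pd.concat(frames, ignore_index=True)

# Vectorised day-over-day change within each city (first day of each city is NaN).
df["change_mm"] = df.groupby("city")["rain_mm"].diff()

# Vectorised running total within each city.
df["total_mm"] = df.groupby("city")["rain_mm"].cumsum()

print(df)
```

Grouping by city before calling .diff() and .cumsum() keeps each city's calculation separate, so the first Lisbon row is not compared against the last London row.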

🎬 Lecture Slides

Use your keyboard arrows to navigate the slides below. They will guide our discussion from the high-level concepts to the practical code. You can also view them in fullscreen.

Prefer a PDF? Download the slides here:

(Sometimes the PDF export is a bit buggy and some text may appear with different formatting.)

After the Lecture

You’ve seen the “why” and the “how”. The lab session this afternoon is your chance to get hands-on.

💻 Today’s Lab

Your main task for the afternoon.

➡️ Go to Lab Instructions

Questions?

➡️ Ask on Slack

🔗 Extra Resources: Vectorisation