🖥️ Week 02, Day 01 - Lecture
From Loops to Vectorisation: A More Efficient Data Workflow
By the end of this session, you should be able to:
i) Use a list comprehension and pd.concat to efficiently process and combine multiple datasets.
ii) Explain the performance benefits of vectorised operations over Python loops.
iii) Replace complex loops with vectorised pandas equivalents, including .diff() and .cumsum().
iv) Recognise opportunities to apply vectorisation and list comprehensions in your own data analysis code.
⏰ Date and Time: Monday, 21 July 2025 | 10.00am - 1.00pm 📍 Location: CKK.2.06 (see LSE’s 🗺️ campus map)
Welcome back!
Today our focus is on making our data workflow smarter, faster, and more professional. We’re going to upgrade your toolkit from for loops to a more powerful way of thinking: vectorisation.
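As a rough illustration of that shift, the sketch below computes day-to-day changes twice: once with a for loop and once with the pandas .diff() method. The temperature values and variable names are made up for this example and are not taken from the lecture materials.

```python
import pandas as pd

# Hypothetical daily temperatures (illustrative values only)
temps = pd.Series([18.0, 21.0, 19.5, 23.0, 22.0])

# Loop version: compare each value with the one before it.
# The first day has no previous value, so we start with None.
changes_loop = [None]
for i in range(1, len(temps)):
    changes_loop.append(temps.iloc[i] - temps.iloc[i - 1])

# Vectorised version: a single method call, no explicit loop.
# The first entry is NaN because there is nothing to compare it with.
changes_vec = temps.diff()

print(changes_loop)
print(changes_vec.tolist())
```

Both versions produce the same differences. The vectorised one is shorter and pushes the looping down into pandas' compiled internals, which is typically where the speed-up comes from on larger datasets.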
🗣️ Lecture Overview
Today’s session is a journey from the workflow you already know to a more efficient way of executing it:
- Confidence First: We’ll start by reviewing the data science workflow to reinforce how much you’ve already learned.
- A New Trick: We’ll learn to use list comprehensions and pd.concat as an elegant way to replace common for loops when preparing data.
- The Pandas Way: We’ll then dive into pandas vectorisation to see how it can solve complex pattern-matching problems without any loops at all.
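The last two steps above can be sketched roughly as follows. This is a minimal example under stated assumptions: the city data, the make_city_df helper, and the "hot day" threshold of 20 are all hypothetical and chosen only to show the pattern, not the dataset used in the lecture.

```python
import pandas as pd

# Hypothetical raw data: one list of daily temperatures per city
raw = {
    "London": [18.0, 21.0, 19.5],
    "Paris": [22.0, 24.5, 23.0],
}

def make_city_df(city, temps):
    """Build a tidy DataFrame for one city (hypothetical helper)."""
    return pd.DataFrame({
        "city": city,  # scalar is repeated for every row
        "day": range(1, len(temps) + 1),
        "temp": temps,
    })

# List comprehension builds every DataFrame; pd.concat combines them once
frames = [make_city_df(city, temps) for city, temps in raw.items()]
combined = pd.concat(frames, ignore_index=True)

# Vectorised day-to-day change, computed separately within each city
combined["change"] = combined.groupby("city")["temp"].diff()

# Vectorised running count of "hot" days per city (threshold is an assumption)
combined["is_hot"] = (combined["temp"] > 20).astype(int)
combined["hot_days_so_far"] = combined.groupby("city")["is_hot"].cumsum()

print(combined)
```

Collecting the DataFrames in a list and calling pd.concat once is generally preferable to concatenating inside a loop, since each concat call copies the data it combines.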
After the Lecture
You’ve seen the “why” and the “how”. The lab session this afternoon is your chance to get hands-on.
🔗 Extra Resources: Vectorisation
🔗 Extra Resources: Performance