🗓️ Week 02
Python Collections & First Steps with APIs

DS105W – Data for Data Science

30 Jan 2025

1️⃣ Recap & Markdown Tricks

16:03 – 16:15

🔁 Python Concepts

Through a combination of the 🗣️ W01 Lecture, the 💻 W01 Lab and the Dataquest lessons from the 📝 W01 Formative and 📝 W02 Formative, we have covered:

  • Python variables
    • Logical: bool
    • Numeric types: int, float
    • Text type: str
  • Conditional Statements:
    (Flow control)
    • if
    • elif
    • else
  • Collections: our focus today
    • Lists: ordered, indexed by integers
    • Dictionaries: key-value pairs, accessed by key rather than by position (they keep insertion order since Python 3.7)
  • Loops:
    • for loops to iterate over collections or to perform a task repeatedly
    • while loops to repeat a task until a condition is met
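As a quick refresher, here is a minimal sketch that touches each concept in the list above. The variable names and values are made up for illustration.

```python
# Variables of the basic types
is_raining = False        # bool
temperature = 12          # int
wind_speed = 8.5          # float
city = "London"           # str

# Conditional statement: exactly one branch runs
if temperature < 5:
    advice = "wear a coat"
elif temperature < 15:
    advice = "wear a jumper"
else:
    advice = "a t-shirt is fine"

print(f"{city}: {temperature}°C, {advice}")

# for loop: runs once per item in a collection
for hour in [9, 12, 15]:
    print(f"Checking the forecast at {hour}:00")

# while loop: repeats until the condition stops being true
countdown = 3
while countdown > 0:
    print(countdown)
    countdown -= 1
```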

Markdown Concepts

We also talked about Markdown here and there. We use it to format Slack messages, in our 📚 Jupyter Notebooks, these slides, and all the course webpages you see on Moodle.

If you type this:

This is a **bold** text.
This is an _italic_ text.
[This is a link](https://lse.ac.uk/dsi)
`print("Hello, World!")`
```python
# This is a code block
print("Hello, World!")
```

You get this:

This is a bold text.

This is an italic text.

This is a link

print("Hello, World!")

# This is a code block
print("Hello, World!")

Markdown Concepts (Headings)

There are also headings in Markdown. They help structure content—not just make text big! (Those won’t work in Slack, by the way.)

If you type this:

# Title (H1)
## Section (H2)
### Sub-section (H3)
#### Sub-sub-section (H4)

You get this:

Title (H1)

Section (H2)

Sub-section (H3)

Sub-sub-section (H4)

⚠️ Do not use # just to make text bigger! It’s not what it represents.
Use it to create hierarchical demarcations of sections instead.

Markdown Concepts (Blockquotes)

Sometimes you want to write a blockquote. Use the > symbol for that:

> We 💖 tables in this course

which renders as:

We 💖 tables in this course

Markdown Concepts (Lists)

You can also create lists with the - or * symbols. The list must be preceded by an empty line, and each item must start its line with the symbol followed by a space (indent the symbol to nest an item):

We like tables because:

- They are structured
- They are easy to read
- They are easy to generate
  - If you know how to code in Python

renders as:

We like tables because:

  • They are structured
  • They are easy to read
  • They are easy to generate
    • If you know how to code in Python

Markdown Concepts (Tables)

And there are tables:

If you type this:

| City  | Max Temp | Min Temp |
|-------|---------|---------|
| London | 12°C    | 5°C     |
| Paris  | 15°C    | 7°C     |

The columns do not need to be aligned perfectly but it helps readability.

You get this:

City     Max Temp   Min Temp
London   12°C       5°C
Paris    15°C       7°C

The precise styling of the table will depend on the platform you are using. On these slides, tables are always centred and have a red theme.

2️⃣ Python Collections: Lists & Dictionaries

16:15 – 16:40

But as I said, today we are going to focus on collections, data structures that store multiple values.

🧠 Memory Storage of Lists and Dictionaries

Understanding how Python stores lists and dictionaries in memory can help us write more efficient code.

  • Lists: Implemented as dynamic arrays; they store references to objects and can resize as needed.

  • Dictionaries: Implemented as hash tables; they store key-value pairs, allowing for efficient data retrieval based on keys.
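Two practical consequences of this storage model can be seen in a short sketch (the variable names are illustrative): a list holds references, so assigning it to a second name does not copy it, and dictionary keys must be hashable because the hash table uses the key's hash to locate its value.

```python
# A list stores references, so two names can point at the same list object
morning = [22, 21]
alias = morning            # no copy is made here
alias.append(19)
print(morning)             # the change is visible through both names

# A copy is a separate list object
snapshot = morning.copy()
snapshot.append(23)
print(len(morning), len(snapshot))   # → 3 4

# Dictionary keys are hashed to locate their value, so they must be hashable
weather = {"09:00": 22}
print(weather["09:00"])    # strings are hashable, so this lookup works

try:
    weather[["09", "00"]] = 22   # a list is mutable, hence unhashable
except TypeError as error:
    print("TypeError:", error)
```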

📋 Creating Lists and Dictionaries

While we often collect data from external sources, it’s essential to know how to create these structures manually.

List

# List of temperatures in Celsius
temperatures = [22, 21, 19, 23, 20]

Or

temperatures = [
    22,
    21,
    19,
    23,
    20
]

Dictionary

# Dictionary of temperatures
weather_data = {'09:00': 22, '12:00': 21, '15:00': 19, '18:00': 23, '21:00': 20}

Or

weather_data = {
    '09:00': 22,
    '12:00': 21,
    '15:00': 19,
    '18:00': 23,
    '21:00': 20
}

💡 You can use multiple lines for readability.

🔍 Accessing Elements

Accessing data differs between lists and dictionaries.

List

# Accessing the first temperature
first_temp = temperatures[0]

# Accessing the last temperature
last_temp = temperatures[-1]

Lists are indexed by integers, starting at 0.

To get an element, you need to know its position in the list.

Dictionaries

# Accessing temperature at 12:00
temp_at_noon = weather_data['12:00']

# Accessing temperature at 18:00
temp_at_evening = weather_data['18:00']

Dictionaries are accessed by keys, not by position.

You need to know the key to retrieve the value.
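What happens when the position or key does not exist? The sketch below, which redefines the two collections so it runs on its own, shows the errors Python raises and a gentler alternative for dictionaries.

```python
temperatures = [22, 21, 19, 23, 20]
weather_data = {'09:00': 22, '12:00': 21, '15:00': 19, '18:00': 23, '21:00': 20}

# Slicing returns a new list covering a range of positions
first_three = temperatures[:3]

# A position that does not exist raises IndexError
try:
    temperatures[10]
except IndexError:
    print("No element at position 10")

# A key that does not exist raises KeyError
try:
    weather_data['23:00']
except KeyError:
    print("No entry for 23:00")

# .get() returns a default (None unless you specify one) instead of raising
late_temp = weather_data.get('23:00')
noon_temp = weather_data.get('12:00', 0)

# 'in' checks the keys of a dictionary and the values of a list
print('12:00' in weather_data, 22 in temperatures)
```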

➕ Adding Elements

Adding new data to an existing collection.

List

# Adding a new temperature to the list
temperatures.append(18)

Dictionaries

# Adding a new time-temperature pair
weather_data['22:00'] = 18

💡 Notice how, in this example, the dictionary encodes a bit more information than the list: in the list we just add a new temperature, whereas in the dictionary we also specify the precise time associated with that temperature.
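A few other ways of adding data are worth knowing. This is a small sketch with shortened example collections. Note that assigning to a key that already exists overwrites its value rather than creating a duplicate.

```python
temperatures = [22, 21, 19, 23, 20]
weather_data = {'09:00': 22, '12:00': 21}

# Lists: insert at a given position, or add several values at once
temperatures.insert(0, 24)
temperatures.extend([18, 17])

# Dictionaries: an existing key is overwritten, not duplicated
weather_data['12:00'] = 25

# update() adds (or overwrites) several key-value pairs at once
weather_data.update({'15:00': 19, '18:00': 23})

print(temperatures)
print(weather_data)
```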

➖ Removing Elements

Removing data when it’s no longer needed.

List

# Removing the first temperature
temperatures.pop(0)

# Removing the last temperature
temperatures.pop()
# or 
del temperatures[-1]

Dictionary

# Removing the 09:00 entry
del weather_data['09:00']

# Removing the 21:00 entry
del weather_data['21:00']
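One difference between pop and del: pop hands back the value it removed, which is handy when you still need it. A minimal sketch, with the collections redefined so it runs on its own:

```python
temperatures = [22, 21, 19, 23, 20]
weather_data = {'09:00': 22, '12:00': 21, '15:00': 19}

# pop() returns the removed value
removed = temperatures.pop(0)

# remove() deletes by value (the first match) instead of by position
temperatures.remove(23)

# Dictionaries have pop() too; it returns the value stored under the key
morning = weather_data.pop('09:00')

# A default as second argument avoids a KeyError when the key is missing
missing = weather_data.pop('23:00', None)

print(removed, temperatures, morning, missing)
```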

🔄 Looping Through Data

Iterating over our collections to process data.

List

# Looping through temperatures
for temp in temperatures:
    print(f"Temp: {temp}°C")

Dictionary

# Looping through time-temperature pairs
for time, temp in weather_data.items():
    print(f"Time: {time}, Temp: {temp}")

And here is a neat trick 🪄.
If you need position information while looping through a list, you can use the enumerate() function:

List

# Now with position information
for i, temp in enumerate(temperatures):
    print(f"Entry {i+1}: ({temp}°C)")

Dictionary

# Now with position information
for i, (time, temp) in enumerate(weather_data.items()):
    print(f"Entry {i+1}:", end=" ")
    print(f"Time: {time} ({temp} °C)")
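Loops become more useful once they compute something rather than just print. Here is a small sketch, using the same example data, that averages the list and finds the warmest time in the dictionary.

```python
temperatures = [22, 21, 19, 23, 20]
weather_data = {'09:00': 22, '12:00': 21, '15:00': 19, '18:00': 23, '21:00': 20}

# Accumulate a running total to compute the average
total = 0
for temp in temperatures:
    total += temp
average = total / len(temperatures)

# Keep track of the warmest reading seen so far
hottest_time = None
hottest_temp = None
for time, temp in weather_data.items():
    if hottest_temp is None or temp > hottest_temp:
        hottest_time = time
        hottest_temp = temp

print(f"Average: {average}°C")
print(f"Warmest at {hottest_time} ({hottest_temp}°C)")
```

Python also offers built-ins such as sum() and max() for this, but writing the loop by hand shows what they do.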

3️⃣ Nested Lists and Nested Dictionaries

16:40 – 16:50

Lists and dictionaries can encode more than just primitive data types (the integers, floats, and strings we’ve seen so far). They can store other collections.
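To make this concrete, here is a minimal sketch of three common nestings. The names and values are invented for illustration. Each extra pair of square brackets goes one level deeper.

```python
# A list of lists: one inner list of readings per day
daily_readings = [
    [22, 21, 19],
    [18, 17, 20],
]
print(daily_readings[1][2])      # second day, third reading → 20

# A dictionary of dictionaries
cities = {
    "London": {"max": 12, "min": 5},
    "Paris": {"max": 15, "min": 7},
}
print(cities["Paris"]["min"])    # → 7

# A dictionary whose values are lists
forecast = {
    "time": ["09:00", "12:00"],
    "temperature": [22, 21],
}
print(forecast["temperature"][0])   # → 22
```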

Let me test your intuition with a quick question (we will open Mentimeter for this) ⏭️

🍵 Quick Coffee Break

16:50 – 17:00

After the break:

  • How to get data from Internet sources (APIs)
  • More on nested collections
  • What’s coming next in this course

4️⃣ Collecting Data with APIs

17:00 – 17:30

The Problem with Manual Data Entry

So far, we’ve created our own lists and dictionaries. But in practice, we don’t manually type in data—we collect it from sources like APIs.

Problem:
Typing weather data manually is tedious and error-prone.


Solution:
Use APIs (Application Programming Interfaces) to fetch live data dynamically.

What is an API?

An API is like a vending machine:

  1. You make a request (insert a coin & press a button).
  2. The API processes it (retrieves your snack).
  3. You get a response (your snack comes out).

In Python, we use a package called requests to “talk” to APIs.

The requests package does not come pre-installed with Python. You need to install it using pip.

  • In VS Code, click the Menu icon, then navigate to Terminal > New Terminal.

  • A window will pop up at the bottom of the screen.

  • In the terminal window, type pip install requests and press Enter.

  • Wait for the installation to complete.

  • The requests package is now installed and ready to use in Jupyter Notebooks.
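If you want to confirm the installation from inside a notebook, the following sketch is one way to do it using only the standard library. It also prints which Python interpreter the notebook is running, which helps when pip installed the package into a different environment.

```python
import importlib.util
import sys

# The interpreter this notebook (or script) is running on
print(sys.executable)

# find_spec returns None when the package cannot be found
installed = importlib.util.find_spec("requests") is not None
print("requests installed:", installed)
```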

The Open-Meteo API: Fetching Weather Data

Visit Open-Meteo

🔗 https://open-meteo.com/

Explore their API documentation.

🚀 We will:

  • Request hourly temperature for London
  • Receive structured weather data (it comes back in a format called JSON).

Constructing the Request

import requests

url = "https://api.open-meteo.com/v1/forecast"
params = {
    "latitude": 51.5085,
    "longitude": -0.1257,
    "hourly": "temperature_2m",
    "timezone": "Europe/London"
}

response = requests.get(url, params=params)

We now have real-time weather data!
⏭️ Let’s inspect the response (live demo).

🏗️ Understanding the API Request

The API expects these parameters:

  • Location (latitude, longitude)
  • The variable(s) we need (hourly=temperature_2m)
  • Timezone adjustments (timezone=Europe/London) (optional)

🔗 Equivalent URL:

You could also construct the URL manually:

https://api.open-meteo.com/v1/forecast
?latitude=51.5085
&longitude=-0.1257
&hourly=temperature_2m
&timezone=Europe/London
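To see how a dictionary of parameters turns into that URL, here is a sketch using urlencode from Python's standard library (no network connection needed). One detail to expect: the / in Europe/London comes out as %2F, a percent-encoded form of the same character.

```python
from urllib.parse import urlencode

base_url = "https://api.open-meteo.com/v1/forecast"
params = {
    "latitude": 51.5085,
    "longitude": -0.1257,
    "hourly": "temperature_2m",
    "timezone": "Europe/London",
}

# Join the key=value pairs with & and escape special characters
query_string = urlencode(params)

# The query string is attached to the base URL after a ?
full_url = f"{base_url}?{query_string}"
print(full_url)
```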

How Do We Know If It Worked?

print(response.status_code)
  • 200 ✅ Success
  • 404 ❌ Not Found
  • 500 ⚠️ Server Error
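Status codes follow a pattern: 2xx means success, 4xx points to a problem with the request, and 5xx to a problem on the server. The sketch below turns a code into a readable label. describe_status is a hypothetical helper written for this slide, not part of requests. It relies on HTTPStatus from the standard library.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short, human-readable label for a standard HTTP status code."""
    phrase = HTTPStatus(code).phrase   # e.g. "Not Found" for 404
    if 200 <= code < 300:
        return f"{code} {phrase}: success"
    if 400 <= code < 500:
        return f"{code} {phrase}: problem with the request"
    if 500 <= code < 600:
        return f"{code} {phrase}: problem on the server"
    return f"{code} {phrase}"


for code in [200, 404, 500]:
    print(describe_status(code))
```

With a real response you would call describe_status(response.status_code). The requests package also provides response.raise_for_status(), which raises an error for 4xx and 5xx codes.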

What’s Inside the Response?

print(response.text)  # Raw text data
print(type(response.text))  # It's a string!

JSON ≠ Python Dictionary
🔎 The response is a string—we need to convert it to a dictionary.

📦 Converting API Data to a Python Dictionary

Before we can use the data, we must convert JSON (text) into Python objects.

weather_data = response.json()

Now weather_data is a Python dictionary! We can access its values just like any other dictionary:

print(weather_data["hourly"]["temperature_2m"])

The code above prints a list of temperatures by hour.
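You can try the conversion without a network connection. The sketch below uses json.loads from the standard library, which does roughly what response.json() does, on a hand-written sample string. The sample is an assumption: it mimics the shape of an Open-Meteo response (an "hourly" dictionary holding a "time" list and a "temperature_2m" list), and its values are made up.

```python
import json

# Made-up sample mimicking the structure of the API response
sample_text = """
{
  "latitude": 51.5,
  "longitude": -0.12,
  "hourly": {
    "time": ["2025-01-30T00:00", "2025-01-30T01:00", "2025-01-30T02:00"],
    "temperature_2m": [4.1, 3.8, 3.5]
  }
}
"""

print(type(sample_text))           # a string, like response.text

weather_data = json.loads(sample_text)
print(type(weather_data))          # now a dictionary

# The two lists line up position by position, so zip pairs them
times = weather_data["hourly"]["time"]
temps = weather_data["hourly"]["temperature_2m"]
hourly_temps = dict(zip(times, temps))

for time, temp in hourly_temps.items():
    print(f"{time}: {temp}°C")
```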

5️⃣ Some more Live Coding

17:30 – 17:50

Here I will do a bit more live coding to show how to interact with the API and process the data.

🔜 What’s Next?

17:50 – 18:00

💻 W02 Lab (tomorrow)

  • More practice interacting with Jupyter Notebooks
  • Practice with the things we saw today
  • Fetch and process real-time weather data
  • Convert JSON data to Markdown tables (because why not?)

📝 W03 Formative

  • Expect to play a little 🎮 game in your 📝
  • Expect to ask yourself: 🤔 “what if I wanted to store API responses to process them at another time?”