DS105W – Data for Data Science
30 Jan 2025
16:03 – 16:15
So far, through a combination of content introduced in 🗣️ W01 Lecture, the 💻 W01 Lab and the Dataquest lessons from the 📝 W01 Formative and 📝 W02 Formative, we have covered:
- Data types: `bool`, `int`, `float`, `str`
- Conditionals: `if`, `elif`, `else`
- `for` loops to iterate over collections or to perform a task repeatedly
- `while` loops to repeat a task until a condition is met

We also talked about Markdown here and there. We use it to format Slack messages, in our 📚 Jupyter Notebooks, these slides, and all the course webpages you see on Moodle.
This is a **bold** text.
This is an _italic_ text.
[This is a link](https://lse.ac.uk/dsi)
`print("Hello, World!")`
```python
# This is a code block
print("Hello, World!")
```
The code blocks are just to represent code, not to execute it.
There are also headings in Markdown. They help structure content—not just make text big! (Those won’t work in Slack, by the way.)
# Title (H1)
## Section (H2)
### Sub-section (H3)
#### Sub-sub-section (H4)
Title (H1)
Section (H2)
Sub-section (H3)
Sub-sub-section (H4)
⚠️ Do not use `#` just to make text bigger! That is not what it is for.
Use it to mark the hierarchy of sections instead.
Sometimes you want to write a blockquote. Use the `>` symbol for that:
> We 💖 tables in this course
which renders as:
We 💖 tables in this course
You can also create lists with the `-` or `*` symbols, but the list needs to be preceded by an empty line, and each symbol must be at the beginning of the line and followed by a space:
We like tables because:

- They are structured
- They are easy to read
- They are easy to generate
    - If you know how to code in Python

renders as:
We like tables because:
You can do enumerated lists too, just use numbers instead of `-` or `*`.
And there are tables:
| City | Max Temp | Min Temp |
|---|---|---|
| London | 12°C | 5°C |
| Paris | 15°C | 7°C |
The precise styling of the table will depend on the platform you are using. On these slides, tables are always centred and have a red theme.
🤔 Would you know how to write Python code to `print()` out a bunch of lists as a Markdown table?
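One possible answer to that question, as a minimal sketch. The `header` and `rows` lists and the `to_markdown_table` helper are made up for illustration, and the data reuses the cities from the table above:

```python
# Hypothetical data: one list for the header, one list per table row
header = ["City", "Max Temp", "Min Temp"]
rows = [
    ["London", "12°C", "5°C"],
    ["Paris", "15°C", "7°C"],
]


def to_markdown_table(header, rows):
    """Return a Markdown table as a single string, one table row per line."""
    lines = []
    lines.append("| " + " | ".join(header) + " |")
    # The separator row needs one '---' cell per column
    lines.append("|" + "---|" * len(header))
    for row in rows:
        lines.append("| " + " | ".join(row) + " |")
    return "\n".join(lines)


table = to_markdown_table(header, rows)
print(table)
```

Pasting the printed text into a Markdown cell of a Jupyter Notebook should render it as a table. Note that `" | ".join(...)` only works on strings, so numeric values would need converting with `str()` first.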
16:15 – 16:40
But as I said, today we are going to focus on collections, data structures that store multiple values.
Understanding how Python stores lists and dictionaries in memory can help us write more efficient code.
Lists: Implemented as dynamic arrays; they store references to objects and can resize as needed.
Dictionaries: Implemented as hash tables; they store key-value pairs, allowing for efficient data retrieval based on keys.
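A small sketch of what these two points mean in practice. The variable names and values are invented for illustration:

```python
# Lists store references to objects, not copies of them.
temps = [12.5, 13.1]
alias = temps          # a second name for the SAME list object
alias.append(14.0)     # the list grows in place, no new list is created

print(temps)           # [12.5, 13.1, 14.0] (the change is visible via both names)
print(alias is temps)  # True

# Dictionaries find a value from its key, without scanning every item.
temps_by_hour = {"09:00": 12.5, "10:00": 13.1}

print(temps_by_hour["10:00"])    # 13.1
print("11:00" in temps_by_hour)  # False ('in' checks the keys)
```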
While we often collect data from external sources, it’s essential to know how to create these structures manually.
💡 You can use multiple lines for readability.
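As an illustration, here is how a list and a dictionary might be created by hand. The temperatures and times are made-up values, not real measurements:

```python
# A list: an ordered sequence of values (hypothetical temperatures in °C)
temperatures = [
    12.5,
    13.1,
    14.0,
]

# A dictionary: each value is stored under a key (here, the time of the reading)
temperatures_by_time = {
    "09:00": 12.5,
    "10:00": 13.1,
    "11:00": 14.0,
}

print(temperatures)
print(temperatures_by_time)
```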
Accessing data differs between lists and dictionaries.
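A quick sketch of the difference, using the same made-up temperature data: lists are accessed by position, dictionaries by key.

```python
temperatures = [12.5, 13.1, 14.0]
temperatures_by_time = {"09:00": 12.5, "10:00": 13.1, "11:00": 14.0}

# Lists: access by position, counting from 0
first = temperatures[0]
last = temperatures[-1]   # negative positions count from the end

# Dictionaries: access by key
at_ten = temperatures_by_time["10:00"]

# .get() returns None for a missing key instead of raising a KeyError
missing = temperatures_by_time.get("23:00")

print(first, last, at_ten, missing)
```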
Adding new data to an existing collection.
💡 Notice how dictionaries always encode a bit more information than lists. In our example, the list just gets a new temperature, whereas in the dictionary we also get to specify the precise time associated with that temperature.
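A minimal sketch of adding a new reading to each collection, assuming the same hypothetical temperature data as before:

```python
temperatures = [12.5, 13.1, 14.0]
temperatures_by_time = {"09:00": 12.5, "10:00": 13.1, "11:00": 14.0}

# List: the new value goes at the end
temperatures.append(14.8)

# Dictionary: we assign the new value to a new key
temperatures_by_time["12:00"] = 14.8

print(temperatures)
print(temperatures_by_time)
```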
Removing data when it’s no longer needed.
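A few common ways to remove items, sketched with the same made-up data. These are not the only options:

```python
temperatures = [12.5, 13.1, 14.0, 14.8]
temperatures_by_time = {"09:00": 12.5, "10:00": 13.1, "11:00": 14.0}

# List: pop() removes and returns the last item
removed_last = temperatures.pop()

# List: remove() deletes the first item equal to the given value
temperatures.remove(12.5)

# Dictionary: del removes a key and its value
del temperatures_by_time["09:00"]

# Dictionary: pop() removes a key and returns its value
removed_value = temperatures_by_time.pop("10:00")

print(temperatures)
print(temperatures_by_time)
```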
Iterating over our collections to process data.
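A sketch of looping over each kind of collection, again with hypothetical data. A `for` loop over a list yields its values, while `.items()` on a dictionary yields key-value pairs:

```python
temperatures = [12.5, 13.1, 14.0]
temperatures_by_time = {"09:00": 12.5, "10:00": 13.1, "11:00": 14.0}

# List: the loop variable takes each value in turn
for temp in temperatures:
    print(f"{temp}°C")

# Dictionary: .items() gives us the key and the value together
report = []
for time, temp in temperatures_by_time.items():
    report.append(f"{time}: {temp}°C")

print("\n".join(report))
```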
And here is a neat trick 🪄.
If you need position information while looping through a list, you can use the `enumerate()` function:
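For instance, a minimal sketch with made-up temperatures:

```python
temperatures = [12.5, 13.1, 14.0]

labelled = []
# enumerate() yields (position, value) pairs; positions start at 0 by default
for position, temp in enumerate(temperatures):
    labelled.append(f"Reading {position}: {temp}°C")

print("\n".join(labelled))

# Pass start=1 to count from 1 instead
for position, temp in enumerate(temperatures, start=1):
    print(position, temp)
```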
For a while, dictionaries didn’t preserve the order of elements. CPython 3.6 started keeping insertion order as an implementation detail, and since Python 3.7 the language guarantees it.
16:40 – 16:50
Lists and dictionaries can encode more than just primitive data types (the integers, floats, and strings we’ve seen so far). They can store other collections.
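A small sketch of nested collections. The structure and values below are invented for illustration:

```python
# A dictionary whose values include another dictionary holding lists
forecast = {
    "city": "London",
    "hourly": {
        "time": ["09:00", "10:00", "11:00"],
        "temperature": [12.5, 13.1, 14.0],
    },
}

# Chain the lookups from the outside in: key, then key, then position
second_temp = forecast["hourly"]["temperature"][1]
print(second_temp)

# A list of dictionaries is another common pattern
cities = [
    {"name": "London", "max_temp": 12},
    {"name": "Paris", "max_temp": 15},
]
print(cities[1]["name"])
```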
Let me test your intuition with a quick question (we will open Mentimeter for this) ⏭️
16:50 – 17:00
After the break:
17:00 – 17:30
So far, we’ve created our own lists and dictionaries. But in practice, we don’t manually type in data—we collect it from sources like APIs.
❌ Problem: Typing weather data manually is tedious and error-prone.
✅ Solution: Use APIs (Application Programming Interfaces) to fetch live data dynamically.
An API is like a vending machine:
In Python, we use a package called `requests` to “talk” to APIs.
The `requests` package does not come pre-installed with Python. You need to install it using `pip`.
On VS Code, click on the Menu icon then navigate to Terminal > New Terminal.
A window will pop up at the bottom of the screen.
In the terminal window, type `pip install requests` and press Enter.
Wait for the installation to complete.
The `requests` package is now installed and ready to use on Jupyter Notebooks.
We will use the Open-Meteo API. Explore their API documentation.
🚀 We will:
import requests

url = "https://api.open-meteo.com/v1/forecast"
params = {
    "latitude": 51.5085,
    "longitude": -0.1257,
    "hourly": "temperature_2m",
    "timezone": "Europe/London"
}
response = requests.get(url, params=params)
✅ We now have real-time weather data!
⏭️ Let’s inspect the response (live demo).
The API expects these parameters:
- Location (`latitude`, `longitude`)
- Variable to fetch (`hourly=temperature_2m`)
- Time zone (`timezone=Europe/London`) (optional)

🔗 Equivalent URL:
You could also construct the URL manually:
https://api.open-meteo.com/v1/forecast?latitude=51.5085&longitude=-0.1257&hourly=temperature_2m&timezone=Europe/London
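To see how the `params` dictionary maps onto that query string without sending any request, here is a sketch using only Python's standard library. This is an illustration, not what `requests` does internally. For example, `requests` would encode the `/` in `Europe/London` as `%2F`, which the sketch avoids by passing `safe="/"`.

```python
from urllib.parse import urlencode

url = "https://api.open-meteo.com/v1/forecast"
params = {
    "latitude": 51.5085,
    "longitude": -0.1257,
    "hourly": "temperature_2m",
    "timezone": "Europe/London",
}

# urlencode() turns the dictionary into key=value pairs joined by '&'.
# safe="/" keeps the slash in Europe/London readable.
query_string = urlencode(params, safe="/")
full_url = url + "?" + query_string

print(full_url)
```

Each key-value pair in the dictionary becomes one `key=value` piece of the URL, in the same order as in the dictionary.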
How Do We Know If It Worked?
- 200 ✅ Success
- 404 ❌ Not Found
- 500 ⚠️ Server Error

What’s Inside the Response?
JSON ≠ Python Dictionary
🔎 The response is a string—we need to convert it to a dictionary.
Before we can use the data, we must convert JSON (text) into Python objects.
Now `weather_data` is a Python dictionary! We can access its values just like any other dictionary:
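A sketch of the whole step, from checking the status code to reading the temperatures. So that it runs without an internet connection, it uses a short hand-written JSON string that only imitates the shape of an Open-Meteo response (an assumption, as the real response has more fields and many more hours). The `parse_weather` helper and the `sample_*` names are hypothetical.

```python
import json

# Hand-written stand-in for response.status_code and response.text
sample_status = 200
sample_text = (
    '{"latitude": 51.5, "longitude": -0.12, '
    '"hourly": {"time": ["2025-01-30T00:00", "2025-01-30T01:00"], '
    '"temperature_2m": [4.2, 3.9]}}'
)


def parse_weather(status_code, text):
    """Convert the JSON text into a dictionary, but only if the request succeeded."""
    if status_code != 200:
        raise RuntimeError(f"Request failed with status code {status_code}")
    return json.loads(text)


weather_data = parse_weather(sample_status, sample_text)

# Nested access: the "hourly" key holds a dictionary of parallel lists
hourly = weather_data["hourly"]
for time, temp in zip(hourly["time"], hourly["temperature_2m"]):
    print(time, temp)
```

With a live `response` object from `requests`, you could call `parse_weather(response.status_code, response.text)` in the same way, or use the built-in `response.json()` method, which also returns Python objects.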
The code above prints a list of temperatures by hour.
17:30 – 17:50
Here I will do a bit more live coding to show how to interact with the API and process the data.
17:50 – 18:00
💻 W02 Lab (tomorrow)
📝 W03 Formative
LSE DS105W (2024/25)