📝 W03 Formative Exercise: Paths, Files, and APIs in the Terminal
![Winter Term Icon for LSE DS105 - Created to reflect the journey of data transformation and insight discovery. Icon representing the themes of data transformation and insight discovery.](../../../figures/visual-identity/2024_2025_DS105W_icon_200px.png)
Briefing
⏳ | Duration | 3-4 hours. This week’s content is a bit more demanding! |
📂 | Folder to Use in Nuvolos | Week 03 - File Formats and Directory Structure/ |
💎 | Key Learning Concept | How to specify locations of files and how to read/write them |
💡 Learning Together:
This week’s exercise will be more demanding than the previous ones, and it’s completely natural to hit roadblocks. This is good - it means you’re learning! We expect you to encounter random errors, and we encourage you to:
- Read error messages carefully - they often contain helpful clues
- Try to solve the problem yourself first using what you’ve learned
- Post in the #help channel on Slack if you get stuck after trying
Posting your questions on Slack:
- Helps others who might have the same issue
- Creates opportunities for peer learning
- Builds our course community
- Often leads to faster solutions than struggling alone
Remember to include what you’ve tried and any error messages you see!
Part 1: Exploring Paths with “Shell-it: The London Episode”
Before diving into Terminal commands, let’s experience how paths work in a fun and interactive way.
🎯 ACTION POINTS:
Play “Shell-it: The London Episode” for at least 10 locations.
Observe: Each landmark has an absolute and relative path—just like files on your computer.
Reflect: How is navigating in this game similar to moving through folders in a real file system?
Take a screenshot of your final location in the game and upload it to Nuvolos, under the folder Week 03 - File Formats and Directory Structure. This will show us that you know how to handle and organise files.
🏁 Checkpoint
Before jumping into the Nuvolos Terminal, take one minute to reflect on what you just learned from the Shell-it game.
Write a brief reflection on your experience with the game. This can be a few sentences or bullet points and there are no wrong/right notes here. The idea is just to get you thinking about how the game relates to navigating files on a computer.
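If it helps your reflection, the difference between absolute and relative paths can be sketched in Python with the standard pathlib module. The location names below are invented for illustration and are not taken from the game. PurePosixPath is used so the snippet behaves the same on any operating system.

```python
from pathlib import PurePosixPath

# Hypothetical locations, invented for this sketch (not taken from the game)
current_location = PurePosixPath("/london/westminster")
destination = PurePosixPath("/london/westminster/big_ben")

# An absolute path starts from the root (/) and works from anywhere
print(destination.is_absolute())  # True

# A relative path only makes sense from a given starting point
relative = destination.relative_to(current_location)
print(relative)                # big_ben
print(relative.is_absolute())  # False

# Joining the starting point and the relative path gives back the absolute path
print(current_location / relative == destination)  # True
```

The relative path is shorter, but it only leads to the right place if you start from the right location.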
Part 3: Fetching and Saving API Data with curl
Last week, you learned to collect data from the Internet using APIs. We used Python for it, with the requests library. This week, we will use a different approach: the curl command in the Terminal. curl is a separate app altogether whose sole purpose is to fetch data from the Internet. It is very powerful and can be used to interact with APIs, download files from websites, and more.
🎯 ACTION POINTS:
Navigate to inside the data/ folder you created:
That is, your pwd should be /files/Week 03 - File Formats and Directory Structure/data/.
Use curl to fetch JSON data from Open-Meteo:
    curl -o london_forecast.json "https://api.open-meteo.com/v1/forecast?latitude=51.5085&longitude=-0.1257&daily=temperature_2m_max,temperature_2m_min&timezone=Europe%2FLondon"
Verify that the file was created. (Use ls to check.)
You should see london_forecast.json in the directory.
Peek inside the file. Your terminal comes with another app/command called cat that can be used to print out the contents of a file. It’s kind of similar to print() in Python, but it works from the Terminal and prints out the contents of a file. Try it out:
    cat london_forecast.json
This will show you the raw JSON data that was fetched from the API. This is just a printout, though. You see the content of the string that is stored in the file; you are not working with the 0’s and 1’s that are actually stored on the disk.
How to edit a file from the terminal:
Let’s say you wrote a file yourself or you don’t like how the file above is organised. You can use the nano command to edit the file. Try it out:
    nano london_forecast.json
Is nano not working?
The nano command should already be installed on your Nuvolos environment. If it’s not, you can install it by running:
    conda install --channel conda-forge nano -y
This will launch a plain text editor. It’s like a dinosaur version of a Google Docs/MS Word/VS Code text editor. You can use the arrow keys to move around, and you can use the keyboard to type.
Importantly: to exit nano, you need to press Ctrl + X. It will ask you if you want to save the changes you made. If you do, press Y. If you don’t, press N. If you want to save the changes to a different file, press Ctrl + O, type the new file name, and then press Enter.
Open the file, add a little space at the end of the file, and save it. Then, use cat to see the (small) change you made.
(Optional) Make the JSON pretty. If, like me, you find it sad that this JSON file is all clumped together in a single line, you can always clean it up a bit.
Use nano to open the file and then manually edit the file so that the end result looks like this:
    {
      "latitude": 51.5,
      "longitude": -0.120000124,
      "generationtime_ms": 0.06091594696044922,
      "utc_offset_seconds": 0,
      "timezone": "Europe/London",
      "timezone_abbreviation": "GMT",
      "elevation": 23.0,
      "daily_units": {
        "time": "iso8601",
        "temperature_2m_max": "°C",
        "temperature_2m_min": "°C"
      },
      "daily": {
        "time": [
          "2025-02-02",
          "2025-02-03",
          "2025-02-04",
          "2025-02-05",
          "2025-02-06",
          "2025-02-07",
          "2025-02-08"
        ],
        "temperature_2m_max": [7.6, 8.4, 10.3, 8.7, 7.3, 6.5, 5.4],
        "temperature_2m_min": [0.7, -0.4, 5.8, 2.7, 1.9, 1.0, 0.2]
      }
    }
At the end, save the output to a separate file called formatted_london_forecast.json when exiting nano.
This is a silly but very useful exercise to get exposure to, and practice with, the Terminal. This dark screen can be intimidating at first, but it’s a very powerful tool that can be used to do a lot of things very quickly.
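Once you have formatted the file by hand, it may help to know that Python’s built-in json library can produce a similar layout for you. The sketch below uses a shortened, made-up sample string rather than your real file, so the values are illustrative only.

```python
import json

# A shortened sample in the same shape as the API response (illustrative values)
raw = '{"timezone": "Europe/London", "daily": {"time": ["2025-02-02", "2025-02-03"], "temperature_2m_max": [7.6, 8.4]}}'

data = json.loads(raw)               # string -> Python dictionary
pretty = json.dumps(data, indent=2)  # dictionary -> indented string
print(pretty)
```

Note that json.dumps places every list item on its own line, so the result will not match the hand-formatted version exactly.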
Part 4: Reading JSON in Python (Directly from Terminal)
In this course so far, you have used Dataquest’s own platform to write Python code, and for code we wrote ourselves, we used Jupyter Notebooks from within the VS Code application.
In reality, though, VS Code (and Jupyter Notebook) is running Python in the background. If you want to run Python directly, you can do so from the Terminal. This is a very powerful tool that can be used to quickly test out code snippets, run scripts, and even run entire Python programs.
🎯 ACTION POINTS:
This section assumes you are still inside the “/files/Week 03 - File Formats and Directory Structure/data/” folder. If you are not, navigate back to it.
Start Python inside the Terminal:
python
This will start a Python session (we also call it the Python interpreter). You will know you are inside Python because the prompt will change from $ to >>>.
💡 Here, you are essentially running a different app and terminal commands will no longer work unless you exit the Python app.
Create a random variable.
You can now write any Python code you want. For example, you can create a variable and print it:
    x = 5
    print(x)
Exit any time. To exit Python, type:
exit()
and hit Enter. This will take you back to the Terminal.
💡 Once back in the Terminal, you can only use Terminal commands; Python code will not work anymore. You are in a different app again.
Python sessions are ephemeral. If you exit Python, you will lose all the variables you created. If you want to keep them, you need to write them to a file.
Confirm that this is true by starting Python again and trying to print x as if it was still there:
    python
    print(x)
You will get an error (a NameError) because x is not defined. Variable x existed only in the Python session you created earlier.
Read your JSON file into Python:
Let’s do something more interesting. We will start Python, read the JSON file we fetched earlier and try to interact with it.
We can’t simply load files like we do with variables. We need to open the file first, which establishes a connection to it. This is done with the open() function:
    file = open("london_forecast.json", "r", encoding="utf-8")
If you try to print file, you will see that it is a file object but it doesn’t actually print the contents of the file. To do that, you need to read the file:
    content = file.read()
    print(content)
This saves the content of the file into a variable called content. Now that you have done that, you can close the file:
    file.close()
This is good practice because it releases the file for other applications to use. What if you wanted to open it in a separate Terminal window? Or in a text editor? Depending on the operating system and the application, that can fail or behave unexpectedly while the file is still open in Python.
What is the data type of content? If you check type(content), you will see that it is a string. This should be a similar realisation to when we worked with the response object during the live demo of the 🗣️ W02 Lecture.
Files are read as strings by default. If you want to work with the data as a dictionary, you need to convert it to a Python dictionary. You could use eval like we did in the live demo, but the best practice is to use Python’s json library:
    import json
    weather_data = json.loads(content)
    type(weather_data)
This will print <class 'dict'>, which means that weather_data is now a dictionary. You can now do dictionary things with it:
    print(weather_data.keys())
    weather_data["daily"]["temperature_2m_max"]
⭐ Always use the with statement when opening files.
In the previous steps, I showed you the open() and close() functions just so you know what actually happens when you interact with files on a computer. You create a connection, do something with it (convert its 0s and 1s to a string, for example), and then close the connection.
However, Python has a very nice feature that does this for you automatically. It’s called the with statement. It’s a bit more advanced, but it’s good practice to use it because it makes your code cleaner and safer. Here’s how you would use it to read the file:
    with open("london_forecast.json", "r", encoding="utf-8") as file:
        content = file.read()
        weather_data = json.loads(content)

    # The file is automatically closed when you exit the `with` block
    # And the `weather_data` variable is still available to you
    print(weather_data.keys())
This will do the same thing as before, but it will automatically close the file for you when you are done with it. This is a good practice to use whenever you are working with files in Python.
Notice one important thing about the code above: we used indentation to define the block of code that is inside the with statement. Everything ‘inside’ that with block is executed with the file open, and everything ‘outside’ is executed after the file is closed. The good thing is that we don’t need to manage the file connection ourselves; Python does it for us.
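To see the automatic closing for yourself, here is a minimal sketch you could try. It is an illustration under a few assumptions: it writes a small made-up sample to a temporary folder (so your real london_forecast.json is left untouched), and it uses json.load (without the s), which reads and parses an open file in one step. The file name sample_forecast.json is hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Illustrative sample in the same shape as london_forecast.json (made-up subset)
sample = {
    "daily": {
        "time": ["2025-02-02", "2025-02-03", "2025-02-04"],
        "temperature_2m_max": [7.6, 8.4, 10.3],
        "temperature_2m_min": [0.7, -0.4, 5.8],
    }
}

# Write the sample to a temporary folder so the real file is left untouched
temp_dir = tempfile.TemporaryDirectory()
sample_path = Path(temp_dir.name) / "sample_forecast.json"  # hypothetical name
sample_path.write_text(json.dumps(sample), encoding="utf-8")

with open(sample_path, "r", encoding="utf-8") as file:
    print(file.closed)              # False: we are still inside the block
    weather_data = json.load(file)  # read and parse in one step

print(file.closed)                  # True: Python closed the file for us

# Dictionary things: pair each date with its maximum temperature
daily = weather_data["daily"]
max_by_date = dict(zip(daily["time"], daily["temperature_2m_max"]))
print(max_by_date)

# Remove the temporary folder now that we are done with it
temp_dir.cleanup()
```

The file.closed attribute is a quick way to check whether a file connection is still open.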
Part 5: Transitioning to VS Code and Jupyter
Now that we’ve worked extensively in the Terminal, let’s organise everything we’ve learned in a Jupyter notebook. This will serve as your reference guide for these essential concepts.
🎯 ACTION POINTS:
Exit the Terminal app entirely.
If you are still inside the Python shell, type exit() to exit Python. Then, type exit to exit the Terminal. The Terminal app will close and you will be back in the Nuvolos environment.
Launch VS Code on Nuvolos. Go to the Applications tab on the left and click on the VS Code application.
Open a Terminal inside VS Code:
- Click on the Menu icon then navigate to Terminal > New Terminal.
- Run ls, pwd, and cat to see that you can do the same things you did in the Nuvolos Terminal.
Create a new Notebook:
- Click on the Menu icon then navigate to File > New File.
- Save the file as W03 - Study Notes.ipynb inside the Week 03 - File Formats and Directory Structure folder.
Structure your notebook with these sections:
    # Week 03 - Study Notes

    ## Terminal Commands
    [Transfer your command notes here]

    ## Files and Directories
    [Include your game reflections and real-world connections]

    ## Editing Files in the Terminal
    [Document nano and cat usage]

    ## The Python Interpreter
    [Notes on Python in Terminal vs notebooks]

    ## Working with JSON

    ### Reading JSON Files
    [Document both methods with examples]

    ### Writing JSON Files
    [Include the dump() examples]
👉🏻 Even though I’m showing the sections all together here, remember to split them into separate Markdown cells to make editing it easier for you.
Under each section, transfer your notes from earlier parts of the exercise and add code examples from our Terminal work.
Write code to read the JSON file:
Under your ‘Reading JSON Files’ section, write the code you used to read the JSON file that already exists in your data/ folder. This is the code you used to convert the JSON string to a Python dictionary.
You will have to add the import json line at the top of your notebook (a cell below your big title) to make sure the json library is available to you.
Use relative paths to access the file. This is good practice because it makes your code more portable. If you ever move your project folder to a different computer, the code will still work.
Write code to collect and store the JSON data in a Python dictionary:
Under your ‘Writing JSON Files’ section, write the code to send a request to the Open-Meteo API.
Use your creativity and try to collect different data from the API. You can use the same URL you used in the Terminal, or you can try to collect different data.
Then, store the data in a Python dictionary, call it weather_data or something similar, and save it to a file in the data/ folder using the following code:
    with open("./data/collected_weather_data.json", "w", encoding="utf-8") as file:
        json.dump(weather_data, file)
Notice the differences between the code above (where we are writing to a file) and the code you used to read the file. The "w" argument in the open() function tells Python that you want to write to the file; it creates the file if it does not exist and overwrites it if it does. The json.dump() function is used to write the Python dictionary to the file.
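To check that writing and reading really mirror each other, you could try a round trip like the sketch below. It is only an illustration: weather_data here is a small made-up dictionary standing in for your API response, and the file is written to a temporary folder rather than your data/ folder.

```python
import json
import tempfile
from pathlib import Path

# Stand-in for the dictionary built from the API response (made-up values)
weather_data = {
    "timezone": "Europe/London",
    "daily": {"temperature_2m_max": [7.6, 8.4]},
}

with tempfile.TemporaryDirectory() as temp_dir:
    # Same file name as above, but inside a temporary folder for this sketch
    output_path = Path(temp_dir) / "collected_weather_data.json"

    # "w" creates the file, or overwrites it if it already exists
    with open(output_path, "w", encoding="utf-8") as file:
        json.dump(weather_data, file, indent=2)

    # "r" opens the same file for reading
    with open(output_path, "r", encoding="utf-8") as file:
        loaded = json.load(file)

# What we read back is equal to what we wrote
print(loaded == weather_data)  # True
```

The optional indent=2 argument makes the saved file easier to read with cat; leaving it out writes everything on a single line.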
🎯 Why Does This Matter?
By completing this whole formative exercise, you’ve now mastered a crucial workflow for working with real-world data:
Navigating and structuring files efficiently
Fetching and saving data from APIs using curl (command-line) and Python (automated scripting).
Understanding JSON structures and their role in APIs
Transitioning seamlessly between Terminal and VS Code
👨🏻💻 Feedback
I will take a look at everyone’s Nuvolos environment on the day of the 🗣️ W03 Lecture (Thursday, 6 February 2025) to help me make my lecture more structured. I will be looking mostly at common mistakes/misconceptions as well as best practices I’ve seen and I will share them in the lecture without identifying any particular individual.
⏭️ What’s Next?
This is perhaps the most important exercise you have done so far. The Terminal commands and file handling will appear in every single exercise from now on.
In coming weeks, we will start working with tabular data (in files as well as in databases) and we will spend a lot of time working with files, reading and writing data, and transforming data from one format to another.