💻 Week 08 Lab

Set up your SQLite database for the ✍️ W10 Summative

Published: 19 November 2024

Image created with the AI embedded in MS Designer using the prompt 'abstract salmon pink light blue icon depicting the metaphysical experience of cleaning up, reshaping, pivoting, and manipulating data in search of the purest insights in data science.'

📍Location: Friday, 22 November 2024. Time varies.

🥅 Learning Objectives

We want you to learn/practice the following goals in this lab:

📋 Preparation

Before coming to this lab, make sure you have done the following:

  • Start working on the ✍️ W10 Summative Exercise. Try to get as far as connecting to the Spotify API and saving the responses to simple JSON files.

  • Attend (or rewatch) the Week 08 lecture. Take lots of notes about the concepts that will be relevant for your mini-project with the Spotify API.

🛣️ Roadmap

Here is how we will achieve the goal for this lab:

Part I: 📋 Course Survey (10 min)

Your class teacher will wait outside the classroom to give you some privacy while you fill out the 📋 LSE Course Survey. (Note: in the form, Jon also counts as your teacher.)

📋 Why is the LSE Course Survey important?

LSE distributes a survey each academic term to collect essential student feedback about their courses.

Your feedback is absolutely crucial, as the LSE Course Survey is the most formal way to provide input to the School.

  • It lets you comment on course content, teaching quality, and resources.
  • The feedback informs Teaching Committee meetings, guiding improvements for future cohorts.
  • Teaching staff discuss the feedback and implement necessary changes.

🫸 Can I inspect your JSON files? (20 min)

Your class teacher will conduct a quick Menti poll/whiteboard tally to identify how many of you have followed the preparation steps and collected some JSON files from the Spotify API for the ✍️ W10 Summative Exercise. They will also go around inquiring about your initial ideas for the Spotify API endpoints you could use.

To make the most of the time in the lab, focus on getting at least two JSON files from the Spotify API – even if they are not the final ones you will use for the summative exercise. This way you can get help with the new and most complex part of the process: the creation of the database.

Part II: Pick your adventure (remaining time)

Starting now, select the adventure that fits you best. Your class teacher is here to help you at any step along the way.

🚀 I have no JSON files yet

If you did not collect any data from the Spotify API, your goal for the lab today is to set up the credentials and the .env file to connect to the Spotify API.

You can use the Week 08 Lecture notebook from the lse-ds105/ds105a-2024 GitHub repository as a guide.

Follow the roadmap below:

🎯 ACTION POINTS

  1. Follow the instructions on the Spotify API page to create a developer account and obtain your API credentials.

  2. Store your credentials in a .env file and use the dotenv library to load them in your code. Make sure that the .env file is listed in your .gitignore file so it never gets pushed to GitHub.

  3. Create a NB01 notebook in your notebooks folder and write code to connect to the Spotify API using the requests library.

  4. Make requests to two different endpoints and save the responses as JSON files in the data/raw folder. Choose good, memorable names for the files.

  5. Make sure to include the code to read the credentials from the .env file in the NB01 notebook.

  6. Commit your changes to your repository and push them to GitHub.

✋ I have some initial JSON files

Awesome! Try to follow the roadmap below. You can use the Week 08 Lecture notebook from the lse-ds105/ds105a-2024 GitHub repository as a guide.

🎯 ACTION POINTS

  1. Create a NB02 notebook in your notebooks folder and write code to read the JSON files you have collected.

  2. Use the JSON normalisation tricks you learned last week and convert the JSON files into nice and flat dataframes.

  3. Remove any columns that are clearly not useful for any of your intended analyses.

  4. Make sure that you have at least two dataframes that share a column you could use to link them in a merge/join operation. (Ask your class teacher for help if you are not sure about this step.)

  5. Create a new SQLite database under the data folder. Make sure to control the data types of the columns and to set up the appropriate foreign keys.

  6. Save the dataframes as tables in the database using the to_sql method.

  7. Read the data back from the database and check that the tables you saved are there and that their data types and foreign keys are correct.

  8. Commit your changes to your repository and push them to GitHub.
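The action points above could look roughly like the sketch below. It is an illustration under assumptions, not the expected solution: the records are made up (they are not real Spotify responses), the table and column names are hypothetical, and the database lives in memory so the example runs anywhere. In your project you would read your own JSON files and connect to a file under the data folder instead.

```python
import sqlite3

import pandas as pd

# Made-up records, loosely shaped like nested API responses
artists_raw = [
    {"id": "a1", "name": "Artist One", "followers": {"total": 1200}},
    {"id": "a2", "name": "Artist Two", "followers": {"total": 340}},
]
albums_raw = [
    {"id": "b1", "name": "First Album", "total_tracks": 10, "artist": {"id": "a1"}},
    {"id": "b2", "name": "Second Album", "total_tracks": 8, "artist": {"id": "a1"}},
    {"id": "b3", "name": "Debut", "total_tracks": 12, "artist": {"id": "a2"}},
]

# Flatten nested fields: with sep="_", followers.total becomes followers_total
artists = pd.json_normalize(artists_raw, sep="_").rename(
    columns={"id": "artist_id", "followers_total": "followers"}
)
albums = pd.json_normalize(albums_raw, sep="_").rename(columns={"id": "album_id"})

# In-memory database for this sketch; use a file path under data/ in your project
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

# Create the tables first so that you control the data types and keys
conn.executescript(
    """
    CREATE TABLE artists (
        artist_id TEXT PRIMARY KEY,
        name      TEXT NOT NULL,
        followers INTEGER
    );
    CREATE TABLE albums (
        album_id     TEXT PRIMARY KEY,
        name         TEXT NOT NULL,
        total_tracks INTEGER,
        artist_id    TEXT NOT NULL REFERENCES artists (artist_id)
    );
    """
)

# if_exists="append" inserts into the existing tables and keeps their schema.
# Insert the parent table (artists) before the child table (albums).
artists.to_sql("artists", conn, if_exists="append", index=False)
albums.to_sql("albums", conn, if_exists="append", index=False)

# Read the data back: a join, then the declared column types and foreign keys
merged = pd.read_sql(
    """
    SELECT al.name AS album, ar.name AS artist, al.total_tracks
    FROM albums AS al
    JOIN artists AS ar ON al.artist_id = ar.artist_id
    """,
    conn,
)
schema = pd.read_sql("PRAGMA table_info(albums)", conn)
foreign_keys = pd.read_sql("PRAGMA foreign_key_list(albums)", conn)

print(merged)
print(schema[["name", "type", "pk"]])
print(foreign_keys[["table", "from", "to"]])
```

Creating the tables with explicit SQL before calling to_sql is one way to keep control of the schema. If you let to_sql create the table on its own, pandas infers the column types and does not add foreign keys.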