🗓️ Week 01 – Day 01: Introduction

Course Logistics, basic programming concepts, and R vs Python

Published

08 July 2024

Welcome to ME204 - Data Engineering for the Social World!

The theme of this first week is KNOW YOUR DATA. Throughout the week, I want to get you thinking about how data is (or should be) stored, how to access it, and how to manipulate it to discover new insights or communicate findings to others.

Part 1: 👨‍🏫 Lecture Slides

These are the slides I will use in the first half of the lecture (10am-11.15am).

Either click on the slide area below or click here to view it in fullscreen. Use the arrow keys on your keyboard to navigate the slides.

🎥 Looking for lecture recordings? You can only find those on Moodle.


Part 2: 📋 Activity: Setting up your computer

💡 TIP: This part contains 🎯 ACTION POINTS. Whenever you see these words in the lab roadmap, you are to work independently or with others, and the instructor will go around the room to help you if you need assistance.

We will now set up our computers for the course. Because you will have the chance to choose between R and Python, today we will compare the two languages and set up the necessary software.

2.1 Install the Programming Language

You need to install either R or Python on your computer.

🎯 ACTION POINTS

  1. Mix it up! Ideally, you should work in groups of four, one group per table. If the numbers do not divide evenly, we might have one group with three or five people.

  2. Choose R or Python. Half of the group will set up R, and the other half will set up Python. Decide amongst yourselves who will do what.

  3. Install the programming language. Follow the instructions in Section 1 of the 📋 Getting Ready page to ensure you have the necessary software installed on your computer.

    • 💡 This might take some time, so please be patient and try to help each other out.
  4. ✋ Call me over if you have any questions or need help.

2.2 Experiment with the language

Let’s check if the programming language is installed correctly.

🧑🏻‍🏫 TEACHING MOMENT

Whenever you come across the text '🧑🏻‍🏫 TEACHING MOMENT', it means your instructor deserves your full attention.

  • Follow me as I demonstrate how to open a terminal and run simple commands in R and Python.
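If you chose Python, a quick sanity check after the demonstration could look something like the sketch below. It is only an illustration, not part of the Getting Ready instructions, and the function name `describe_python` is made up for this example. It simply reports which Python interpreter is running your code.

```python
import platform
import sys


def describe_python():
    """Return a short summary of the Python interpreter that is running."""
    return {
        "version": platform.python_version(),  # version string of this interpreter
        "major": sys.version_info.major,        # major version number as an integer
        "executable": sys.executable,           # path to the interpreter in use
    }


if __name__ == "__main__":
    info = describe_python()
    print(f"Python {info['version']} is running from {info['executable']}")
```

If the version or path printed is not the one you just installed, your terminal is probably picking up a different Python installation, which is worth flagging to your instructor.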

2.3 Install the additional packages

We need more than just the programming language to work with data. We need additional packages that will help us load, manipulate, and visualize data.

🎯 ACTION POINTS

  1. Install additional packages. Follow the instructions in Section 2 of the 📋 Getting Ready page to ensure you have the necessary packages installed on your computer.

    • 💡 This might take some time, so please be patient and try to help each other out.
  2. ✋ Call me over if you have any questions or need help.

  3. Check that the packages are installed correctly by running a simple script in R or Python:

    # If you are using R, run this in the R console:
    library(tidyverse)

    # If you are using Python, run this in the Python interpreter:
    import pandas as pd

    If running the code above does not return an error, you are good to go!
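For Python users, a slightly more informative version of the check above might look like this. It is a minimal sketch: `find_missing` is a hypothetical helper written for this page, and the list passed to it contains only `pandas` because that is the one package named above. It relies on the standard library's `importlib.util.find_spec`, which returns `None` when a top-level package cannot be found.

```python
import importlib.util


def find_missing(package_names):
    """Return the names from package_names that Python cannot locate."""
    missing = []
    for name in package_names:
        # find_spec returns None when a top-level package is not installed
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # "pandas" comes from the check above; extend this list with whatever
    # else the Getting Ready page asks you to install.
    missing = find_missing(["pandas"])
    if missing:
        print("Not installed yet:", ", ".join(missing))
    else:
        print("All packages found, you are good to go!")
```

Unlike a bare `import`, this reports every missing package in one go instead of stopping at the first error.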

2.4 Install an IDE

This step should be simpler than the previous ones. We will install an Integrated Development Environment (IDE) to make our coding experience more enjoyable.

🎯 ACTION POINTS

  1. Follow the instructions in Section 3 of the 📋 Getting Ready page to install VS Code (for Python) or RStudio (for R).

Typically, the steps above take up the remainder of the morning. However, if we have time left, I will give you some programming tips and tricks to get you started with R or Python.

⏭️ What’s Next?

In the afternoon, we will have our first lab session. We will compare how to load and view tabular data in R and Python.

As you work, keep two questions in mind: which programming language seems more intuitive to you? Which one do you think you will enjoy working with more? Tomorrow, we will decide which language we will use for the rest of the course.