Setting Up Your Local Development Environment

Continue your data engineering journey after ME204

Author

Dr Jon Cardoso-Silva

Last updated

31 July 2025

Congratulations on completing ME204! This guide will help you set up your own local development environment so you can continue working on data engineering projects after the course ends.

💡 Why Set Up Your Own Environment?

Now that you’ve mastered data engineering concepts in ME204, you can:

  • Continue working on your projects and portfolio
  • Explore new datasets and APIs independently
  • Build more complex data pipelines
  • Share your work with potential employers or collaborators
  • Develop your skills further with advanced tools

Before You Start

Make sure you have:

  1. Backed up your work: All your ME204 projects should be saved to your personal GitHub account
  2. Downloaded your data: Any datasets you collected during the course
  3. Documented your processes: Your notebooks and README files should contain all the setup instructions

🎯 ACTION POINTS

1️⃣ Install Python

Python is the foundation for all the data engineering work you’ve learned. We recommend installing Python 3.10 or newer.

For Windows Users

  1. Download the installer from the official Python website.

  2. Run the installer and make sure to check the box that says “Add Python to PATH” before clicking “Install Now”.

  3. Verify the installation by opening Command Prompt and typing:

    python --version

For macOS Users

Depending on your macOS version, the pre-installed Python is either outdated or missing altogether. We recommend using Homebrew to install a current version:

  1. Install Homebrew (if you don’t have it already).

    Go to the Homebrew website and follow the instructions to install it.

  2. Install Python using Homebrew:

    brew install python

  3. Verify the installation by opening Terminal and typing:

    python3 --version

For Linux Users

Most Linux distributions come with Python 3 pre-installed. To install or update it, together with pip, through your package manager:

  1. Update your package list:

    sudo apt update  # For Ubuntu/Debian
    # OR
    sudo dnf update  # For Fedora

  2. Install Python:

    sudo apt install python3 python3-pip  # For Ubuntu/Debian
    # OR
    sudo dnf install python3 python3-pip  # For Fedora

  3. Verify the installation:

    python3 --version
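
Whichever operating system you use, you can also confirm from inside Python that the interpreter meets the 3.10 recommendation above. The script below is a minimal sketch: the function name `meets_minimum` and the constant `MIN_VERSION` are illustrative choices, not part of any course material.

```python
import sys

# Minimum version recommended in this guide.
MIN_VERSION = (3, 10)


def meets_minimum(version_info=None, minimum=MIN_VERSION):
    """Return True if the (major, minor, ...) tuple is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    wanted = ".".join(str(part) for part in MIN_VERSION)
    if meets_minimum():
        print(f"Python {found} is installed (>= {wanted}).")
    else:
        print(f"Python {found} is older than the recommended {wanted}.")
```

Save it under any name you like and run it with the same command you used to check the version (`python` on Windows, `python3` on macOS and Linux).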

2️⃣ Install Data Engineering Packages

Install the packages you’ve been using in ME204 for data collection, processing, and analysis:

pip install numpy pandas requests beautifulsoup4 jupyter ipykernel ipywidgets plotly streamlit quarto

Note: sqlite3 is part of Python’s standard library, so it does not need to be installed with pip.

💡 Pro Tip: Create a requirements.txt file for each project:

pip freeze > requirements.txt

This records the exact package versions your project uses, which makes it reproducible. Others can install the same dependencies with:

pip install -r requirements.txt
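
To check that the installation worked, you can ask Python which modules it can find. This is a rough sketch rather than an official check: the list uses import names, which sometimes differ from the pip package names (beautifulsoup4 is imported as `bs4`), and `jupyter_core` is used here as a stand-in for the Jupyter installation, which is an assumption on our part. The helper name `missing_modules` is illustrative.

```python
from importlib.util import find_spec

# Import names, which can differ from the pip package names
# (for example, beautifulsoup4 is imported as bs4).
# sqlite3 ships with Python itself; jupyter_core is assumed to be
# present whenever Jupyter is installed.
MODULES = [
    "numpy", "pandas", "requests", "bs4", "sqlite3",
    "jupyter_core", "ipykernel", "ipywidgets", "plotly", "streamlit",
]


def missing_modules(names):
    """Return the names that the import system cannot locate."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(MODULES)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All listed modules are available.")
```

The check only looks the modules up without importing them, so it runs quickly even for large packages such as pandas.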

3️⃣ Install VS Code

Visual Studio Code (VS Code) is excellent for data engineering work with Python, Jupyter notebooks, and markdown files.

  1. Download VS Code from the official website.
  2. Run the installer and follow the prompts.
  3. Launch VS Code after installation.

Install Essential Extensions

Install these extensions from the Extensions view in VS Code to support data engineering work:

  1. Python extension by Microsoft (for Python development)
  2. Jupyter extension by Microsoft (for notebook support)
  3. GitHub Copilot (free with GitHub Education Pack) for AI-assisted coding
  4. Quarto extension for document creation and publishing

4️⃣ Install Git & GitHub CLI

You’re already familiar with Git from ME204. Now set it up on your local machine:

For Windows:

  1. Download the installer from git-scm.com.

  2. Run the installer and use the default settings.

  3. Verify the installation:

    git --version

  4. Install the GitHub CLI for easier GitHub workflows.

For macOS:

  1. If you have Homebrew:

    brew install git

  2. Install the GitHub CLI.

For Linux:

  1. Install Git:

    sudo apt install git  # For Ubuntu/Debian
    # OR
    sudo dnf install git  # For Fedora

  2. Install the GitHub CLI.

After installing Git, follow the instructions in our Git & GitHub guide to set up authentication.
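
If you want a quick cross-platform check that both tools are reachable from your terminal, the sketch below looks them up on your PATH. It assumes the commands are named `git` and `gh` (the usual command for the GitHub CLI); the helper name `locate_tools` is illustrative.

```python
import shutil

# Command names as typed in a terminal: git, and gh for the GitHub CLI.
TOOLS = ["git", "gh"]


def locate_tools(names):
    """Map each command name to its full path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in locate_tools(TOOLS).items():
        status = path if path else "not found on PATH"
        print(f"{name}: {status}")
```

If a tool is reported as not found right after installing it, opening a new terminal window is often enough for the updated PATH to take effect.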

5️⃣ Clone Your Projects

Now you can work on your ME204 projects locally:

  1. Clone your repositories from GitHub:

    git clone https://github.com/your-username/your-project-name.git

  2. Install project dependencies:

    cd your-project-name
    pip install -r requirements.txt  # if you have one

  3. Test your projects to make sure everything works locally.
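
One possible first test is to confirm that every package named in the project's requirements.txt is actually installed. The script below is a simplified sketch under several assumptions: it is run from the project folder, the file is UTF-8 encoded, and each requirement line starts with the package name. It does not compare version numbers, and the function names are illustrative.

```python
import re
from importlib.metadata import PackageNotFoundError, version
from pathlib import Path

# Matches the package name at the start of a requirement line.
NAME_PATTERN = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def requirement_names(text):
    """Extract package names from requirements.txt content.

    Simplified on purpose: blank lines, comments, and option lines
    such as "-r other.txt" are skipped.
    """
    names = []
    for line in text.splitlines():
        match = NAME_PATTERN.match(line.strip())
        if match:
            names.append(match.group(0))
    return names


def installed_versions(names):
    """Map each package name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    path = Path("requirements.txt")
    if path.exists():
        names = requirement_names(path.read_text(encoding="utf-8"))
        for name, ver in installed_versions(names).items():
            print(f"{name}: {ver or 'MISSING'}")
    else:
        print("No requirements.txt in the current folder.")
```

Any package reported as MISSING can be installed by rerunning the pip command from step 2.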

Next Steps for Your Data Engineering Journey

Build Your Portfolio

  • Enhance your ME204 projects with additional analysis or features
  • Create new projects using the skills you’ve learned
  • Document everything with clear README files and notebooks
  • Share your work on GitHub and LinkedIn

Explore Advanced Topics