Setting Up Your Local Development Environment
Continue your data engineering journey after ME204
Congratulations on completing ME204! This guide will help you set up your own local development environment so you can continue working on data engineering projects after the course ends.
💡 Why Set Up Your Own Environment?
Now that you’ve mastered data engineering concepts in ME204, you can:
- Continue working on your projects and portfolio
- Explore new datasets and APIs independently
- Build more complex data pipelines
- Share your work with potential employers or collaborators
- Develop your skills further with advanced tools
Before You Start
Make sure you have:
- Backed up your work: All your ME204 projects should be saved to your personal GitHub account
- Downloaded your data: Any datasets you collected during the course
- Documented your processes: Your notebooks and README files should contain all the setup instructions
🎯 ACTION POINTS
1️⃣ Install Python
Python is the foundation for all the data engineering work you’ve learned. We recommend installing Python 3.10 or newer.
For Windows Users
Download the installer from the official Python website.
Run the installer and make sure to check the box that says “Add Python to PATH” before clicking “Install Now”.
Verify the installation by opening Command Prompt and typing:
python --version
For macOS Users
macOS may already include a version of Python (for example, through Apple's command line developer tools), but it is usually an older release. We recommend using Homebrew to install a newer version:
Install Homebrew (if you don’t have it already).
Go to the Homebrew website and follow the instructions to install it.
Install Python using Homebrew:
brew install python
Verify the installation by opening Terminal and typing:
python3 --version
For Linux Users
Most Linux distributions come with Python pre-installed. To install or update it through your package manager:
Update your package list:
sudo apt update   # For Ubuntu/Debian
# OR
sudo dnf update   # For Fedora
Install Python:
sudo apt install python3 python3-pip   # For Ubuntu/Debian
# OR
sudo dnf install python3 python3-pip   # For Fedora
Verify the installation:
python3 --version
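The same version check can be done from inside Python, which is handy at the top of a project script. This is a minimal sketch, assuming the 3.10 minimum recommended above; the helper name `meets_minimum` is hypothetical and not part of any library.

```python
import sys

# Minimum version recommended in this guide (an assumption you can change).
MINIMUM_VERSION = (3, 10)


def meets_minimum(version_info=None, minimum=MINIMUM_VERSION):
    """Return True if a (major, minor, ...) version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only major and minor, e.g. (3, 11) >= (3, 10).
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if meets_minimum():
        print(f"Python {current} is new enough.")
    else:
        print(f"Python {current} is older than 3.10; consider upgrading.")
```

Save it as a file and run it with the same command you used to verify the installation (`python` on Windows, `python3` on macOS and Linux).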
2️⃣ Install Data Engineering Packages
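Install the packages you’ve been using in ME204 for data collection, processing, and analysis. The sqlite3 module is not in the list because it ships with Python’s standard library and cannot be installed with pip:
pip install numpy pandas requests beautifulsoup4 jupyter ipykernel ipywidgets plotly streamlit quarto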
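After the install finishes, you may want to confirm that Python can actually find the packages. The sketch below uses the standard library's `importlib.util.find_spec` to look for each one without importing it. The list of import names is an assumption based on the packages used in this guide: some import names differ from the pip names (beautifulsoup4 is imported as `bs4`, and `jupyter_core` stands in for the Jupyter install), and `sqlite3` comes with Python itself. The helper name `find_missing` is hypothetical.

```python
import importlib.util

# Import names for the packages used in this guide (assumed list; edit freely).
# The import name can differ from the pip name, e.g. beautifulsoup4 -> bs4.
# sqlite3 is part of Python's standard library, so pip is not involved.
IMPORT_NAMES = [
    "numpy", "pandas", "requests", "bs4", "sqlite3",
    "jupyter_core", "ipykernel", "ipywidgets", "plotly", "streamlit",
]


def find_missing(import_names):
    """Return the names from `import_names` that Python cannot locate."""
    return [
        name for name in import_names
        if importlib.util.find_spec(name) is None
    ]


if __name__ == "__main__":
    missing = find_missing(IMPORT_NAMES)
    if missing:
        print("Not found:", ", ".join(missing))
    else:
        print("All packages found.")
```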
💡 Pro Tip: Create a requirements.txt file for each project:
pip freeze > requirements.txt
This makes your projects reproducible and professional. Others can install all dependencies with:
pip install -r requirements.txt
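To see whether your current environment still matches a project's requirements.txt, you can compare the pinned versions against what is installed. This is a rough sketch, assuming the file contains simple `name==version` lines as `pip freeze` typically writes them; other line formats (editable installs, URLs) are skipped. The helper names `parse_pins` and `compare_with_installed` are hypothetical.

```python
from importlib import metadata
from pathlib import Path


def parse_pins(text):
    """Extract {name: version} from lines of the form `name==version`."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


def compare_with_installed(pins):
    """Return {name: (pinned, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


if __name__ == "__main__":
    requirements = Path("requirements.txt")
    if requirements.exists():
        pins = parse_pins(requirements.read_text(encoding="utf-8"))
        for name, (pinned, installed) in compare_with_installed(pins).items():
            print(f"{name}: pinned {pinned}, installed {installed}")
    else:
        print("No requirements.txt in this folder.")
```

Run it from the project folder. No output after the check means every pinned package is installed at the pinned version.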
3️⃣ Install VS Code
Visual Studio Code (VS Code) is excellent for data engineering work with Python, Jupyter notebooks, and markdown files.
- Download VS Code from the official website.
- Run the installer and follow the prompts.
- Launch VS Code after installation.
Install Essential Extensions
To get the most out of VS Code for data engineering, install these extensions:
- Python extension by Microsoft (for Python development)
- Jupyter extension by Microsoft (for notebook support)
- GitHub Copilot (free with GitHub Education Pack) for AI-assisted coding
- Quarto extension for document creation and publishing
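If you prefer to check your extensions from a script, VS Code's `code` command-line tool can list what is installed. The sketch below is one way to do that, assuming `code` is on your PATH (on macOS you may need to enable it from the Command Palette first). The extension IDs are the Marketplace identifiers as we understand them; treat them as assumptions and confirm them in the Extensions view. The helper names are hypothetical.

```python
import shutil
import subprocess

# Marketplace IDs for the extensions listed above (assumed; verify in VS Code).
WANTED_EXTENSIONS = [
    "ms-python.python",
    "ms-toolsai.jupyter",
    "GitHub.copilot",
    "quarto.quarto",
]


def missing_extensions(installed, wanted=WANTED_EXTENSIONS):
    """Return wanted IDs that are not in `installed` (case-insensitive)."""
    installed_lower = {ext.strip().lower() for ext in installed}
    return [ext for ext in wanted if ext.lower() not in installed_lower]


def list_installed_extensions():
    """Ask the `code` CLI for installed extensions; [] if `code` is not found."""
    code_path = shutil.which("code")
    if code_path is None:
        return []
    result = subprocess.run(
        [code_path, "--list-extensions"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout.split()


if __name__ == "__main__":
    for ext in missing_extensions(list_installed_extensions()):
        # Print the install command rather than running it automatically.
        print(f"code --install-extension {ext}")
```

The script only prints suggested install commands, so you stay in control of what gets installed.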
4️⃣ Install Git & GitHub CLI
You’re already familiar with Git from ME204. Now set it up on your local machine:
For Windows:
Download the installer from git-scm.com.
Run the installer and use the default settings.
Verify the installation:
git --version
Install the GitHub CLI for easier GitHub workflows.
For macOS:
If you have Homebrew:
brew install git
Install the GitHub CLI.
For Linux:
Install Git:
sudo apt install git   # For Ubuntu/Debian
# OR
sudo dnf install git   # For Fedora
Install the GitHub CLI.
After installing Git, follow the instructions in our Git & GitHub guide to set up authentication.
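A quick way to confirm that both tools are reachable from your terminal is to ask each one for its version. The sketch below does this from Python, assuming the commands are named `git` and `gh` (the GitHub CLI's executable); the helper name `tool_version` is hypothetical.

```python
import shutil
import subprocess


def tool_version(command):
    """Return the first line of `<command> --version`, or None if not on PATH."""
    path = shutil.which(command)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    # Most tools print the version to stdout; fall back to stderr if empty.
    output = result.stdout.strip() or result.stderr.strip()
    return output.splitlines()[0] if output else ""


if __name__ == "__main__":
    for command in ("git", "gh"):
        version = tool_version(command)
        if version is None:
            print(f"{command}: not found on PATH")
        else:
            print(f"{command}: {version}")
```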
5️⃣ Install Quarto (Recommended)
Quarto is well suited to creating professional data reports and websites, like the ones you built for your final project:
- Download Quarto from the official website.
- Follow the installation instructions for your operating system.
- Install the VS Code extension for Quarto to get syntax highlighting and preview features.
6️⃣ Clone Your Projects
Now you can work on your ME204 projects locally:
Clone your repositories from GitHub:
git clone https://github.com/your-username/your-project-name.git
Install project dependencies:
cd your-project-name
pip install -r requirements.txt   # if you have one
Test your projects to make sure everything works locally.
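One cheap first test after cloning is to check that every Python file in the project at least parses on your local Python version. The sketch below is only a syntax check, not a substitute for running your notebooks and scripts, and the helper names `check_source` and `find_syntax_errors` are hypothetical. It scans every `.py` file under the folder you give it, so a virtual environment stored inside the project would be scanned too.

```python
import ast
from pathlib import Path


def check_source(source, filename="<string>"):
    """Return None if `source` parses as Python, else the error message."""
    try:
        ast.parse(source, filename=filename)
    except (SyntaxError, ValueError) as exc:
        return str(exc)
    return None


def find_syntax_errors(project_dir):
    """Parse every .py file under `project_dir`; return {path: error message}."""
    errors = {}
    for path in sorted(Path(project_dir).rglob("*.py")):
        try:
            source = path.read_text(encoding="utf-8")
        except (OSError, UnicodeDecodeError) as exc:
            errors[str(path)] = str(exc)  # unreadable file
            continue
        message = check_source(source, filename=str(path))
        if message is not None:
            errors[str(path)] = message
    return errors


if __name__ == "__main__":
    problems = find_syntax_errors(".")
    for path, message in problems.items():
        print(f"{path}: {message}")
    if not problems:
        print("All Python files parsed successfully.")
```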
Next Steps for Your Data Engineering Journey
Build Your Portfolio
- Enhance your ME204 projects with additional analysis or features
- Create new projects using the skills you’ve learned
- Document everything with clear README files and notebooks
- Share your work on GitHub and LinkedIn
Explore Advanced Topics
- Explore a different database: for example, PostgreSQL or MongoDB
- Learn more advanced data automation: check out Apache Airflow, Prefect, and dbt
- Cloud platforms: the big cloud providers (Google Cloud, AWS, Azure) all offer courses and certifications you can take
- Get more technical: follow advanced blogs like the Netflix Tech Blog and Airbnb Engineering
- Join the community
- Take free courses
- Find local meetups