🗓️ Week 02
REST APIs and Introduction to FastAPI

DS205 – Advanced Data Manipulation

27 Jan 2025

About

10:00 – 10:03

This lecture will cover the transition from data exploration to serving data insights via APIs, focusing on the practical application of FastAPI. You can read more about these core concepts in these two key sources:

  1. The book 📕 FastAPI: Modern Python Web Development by Bill Lubanovic (sadly not available at the LSE Library)
  2. FastAPI’s 🌐 Official Tutorial (available for free online)

This session introduces the core concepts of RESTful APIs using the FastAPI package, along with the idea of data validation using Pydantic models. The lecture assumes you have a good understanding of the pandas library.

Recap of Week 01

10:03 – 10:10

(Only if anyone has done the W01 Exercise or has questions about pandas)

  • Review key takeaways from the pandas exploration of the ASCOR dataset.
  • Transition into how APIs make insights accessible for non-technical stakeholders.

Introduction to APIs

10:10 – 10:15

What are APIs and why do we need them?

  • API stands for Application Programming Interface. It is a set of rules that allows us to expose/serve data or services to other computers or digital applications.

    • Examples: Google Maps API, Twitter API, Spotify API.

    • To make this more concrete: think of APIs as a waiter in a restaurant. The waiter takes your order, sends it to the kitchen, and then brings your food back to you. In this analogy, the waiter is the API, the kitchen is the server, and the food is the data you requested.

  • APIs provide seamless access to data. You don’t need to know how the data is stored or processed, or which table in which database it comes from.

  • We can set APIs to communicate with each other, without direct human intervention. This is particularly useful when you want to automate a process or when you want to make data available to a wide audience. Think: Twitter bots.

Key Concepts

10:15 – 11:00

Here is what a typical API architecture looks like (at a conceptual level):

[Diagram: Web Client ⇄ Web ⇄ Service ⇄ Data ⇄ Database, with the Web and Service layers both pointing to Model]

Figure 1. Schematic diagram illustrating the flow of data on an API architecture.
Adapted from the FastAPI book, page 11.

Let’s look at each of these components in more detail 👉🏻

Brief Description of Layers: Web Client

Description

The Web Client is the interface that interacts with the API. You, as a user, can write code/click buttons to send requests and receive responses in structured formats.

Here’s an example of using Python as a client to fetch weather data from an API:

import requests

response = requests.get(
    "https://api.open-meteo.com/v1/forecast",
    params={"latitude": 51.51, 
            "longitude": -0.13, 
            "daily": "temperature_2m_min"}
)
print(response.json())  # The API response in JSON format

Brief Description of Layers: Web Server

Description

The Web Server receives the request from the client, routes it to the appropriate service, and prepares a response. It is the entry point to the API.

[Diagram: the Web Client sends a request as HTTP to the Web Server → the Router chooses a route → the request is sent to the Service at the /v1/forecast route → the Service generates a response → the response is sent back to the Web Client as JSON]

Figure 2. A conceptual depiction of a fictitious version of the OpenMeteo Web Server handling the request from a client.

It’s time to tell you about FastAPI

FastAPI is a modern, fast (high-performance) Python web framework for building APIs. It’s designed with Python type hints for simplicity and automatic validation.

Some highlights:

  • If you have tried to query data from APIs before, you will know that many APIs are poorly documented. FastAPI addresses this by automatically generating interactive API documentation (served at /docs by default).
  • Asynchronous support: Built to handle multiple requests efficiently.
  • Data validation built-in: Uses a package called Pydantic for data validation.

I also need to tell you about uvicorn!

FastAPI is not a web server on its own. It’s a framework, a set of tools and coding patterns to build APIs.

👉🏻 To run FastAPI, you need a web server. The most common one is Uvicorn.

Example: Running a FastAPI App

Server Side

Here’s a simple “Hello, World!” FastAPI app. Save it under main.py:

from fastapi import FastAPI

app = FastAPI()

@app.get("/")
async def read_root():
    return {"Hello": "World"}

To run this app now, you need to use Uvicorn:

uvicorn main:app

Client Side

Here are three ways to test this app:

  1. Open your browser and go to http://localhost:8000/ to see the response.

    • You should see {"Hello": "World"} displayed on the page.
    • It also works with http://127.0.0.1:8000/.
  2. Via the command line using curl:

curl http://localhost:8000/
  3. Using Python:
import requests

response = requests.get("http://localhost:8000/")
print(response.json())

About routing in FastAPI

We could add more routes to our FastAPI app. Here’s how you can do it:

from fastapi import FastAPI

app = FastAPI()

@app.get("/")
async def read_root():
    return {"Hello": "World"}

@app.get("/alternative")
async def read_alternative():
    return {"Hello": "Alternative World"}

Which you would then access via http://localhost:8000/alternative.

Another look at Figure 2

[Diagram: the Web Client sends a request as HTTP to the Web Server → the Router chooses a route → the request is sent to the Service at the /v1/forecast route → the Service generates a response → the response is sent back to the Web Client as JSON]

Figure 2. A conceptual depiction of a fictitious version of the OpenMeteo Web Server handling the request from a client.

Brief Description of Layers: Service

Description

The Service layer processes the request and handles business logic. It communicates with the data layer to fetch or manipulate data.

If the OpenMeteo API were built using FastAPI, the server code might look like this:

from fastapi import FastAPI
from typing import Dict

app = FastAPI()

@app.get("/v1/forecast")
async def get_forecast(latitude: float, longitude: float, daily: str) -> Dict:
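    # Some code here to fetch the forecast data.
    # Placeholder values (made up) so the example runs as written:
    list_of_min_temperatures = [2.5, 3.1, 1.8]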
    return {
        "latitude": latitude,
        "longitude": longitude,
        "daily": {
            "temperature_2m_min": list_of_min_temperatures
        }
    }

Brief Description of Layers: Model

Description

The Model layer defines the structure of the data being passed around. This ensures consistency and validation.

FastAPI relies on a library called Pydantic for defining models.

Examples:

Pydantic Model

from pydantic import BaseModel

class Forecast(BaseModel):
    latitude: float
    longitude: float
    daily: dict

Usage in FastAPI:

@app.get("/v1/forecast", response_model=Forecast)
async def get_forecast(latitude: float, longitude: float, daily: str) -> Forecast:
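    # Some code here to fetch the forecast data.
    # Placeholder values (made up) so the example runs as written:
    list_of_min_temperatures = [2.5, 3.1, 1.8]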
    return Forecast(
        latitude=latitude,
        longitude=longitude,
        daily={"temperature_2m_min": list_of_min_temperatures}
    )
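
To see what the validation does in practice, the sketch below builds the `Forecast` model directly, outside FastAPI. The input values are made up for illustration. Pydantic converts compatible input (the string "51.51") to the declared type and raises a `ValidationError` when it cannot.

```python
from pydantic import BaseModel, ValidationError


class Forecast(BaseModel):
    latitude: float
    longitude: float
    daily: dict


# A numeric string is converted to a float
ok = Forecast(latitude="51.51", longitude=-0.13, daily={})
print(ok.latitude)  # 51.51

# A non-numeric string cannot be converted, so validation fails
try:
    Forecast(latitude="north", longitude=-0.13, daily={})
    validation_failed = False
except ValidationError as error:
    validation_failed = True
    print(error)  # Describes which field failed and why
```

FastAPI runs the same kind of check on incoming parameters and, when `response_model` is set, on outgoing responses.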

Brief Description of Layers: Data

Description

The Data layer interacts with the database or file storage to read or write data.

Code Example: Querying Data from a Database

from sqlalchemy.orm import Session

# ForecastModel is the SQLAlchemy model defined in the Database layer
def fetch_forecast(db: Session, location_id: int):
    return db.query(ForecastModel).filter(ForecastModel.id == location_id).first()

But it could be as simple as reading a file. Say, an XLSX file:

import pandas as pd

def fetch_forecast(location_id: int):
    return pd.read_excel("forecasts.xlsx").loc[location_id]
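
The same idea applies to other file formats. Here is a dependency-free sketch using the standard library's `csv` module, where an in-memory string stands in for a hypothetical `forecasts.csv` file. The column names and values are invented for illustration.

```python
import csv
import io
from typing import Dict, Optional

# Stand-in for the contents of a hypothetical forecasts.csv file
FORECASTS_CSV = """location_id,city,temperature_2m_min
1,London,2.5
2,Paris,4.0
"""


def fetch_forecast(location_id: int) -> Optional[Dict[str, str]]:
    """Return the row for location_id, or None if there is no such row."""
    reader = csv.DictReader(io.StringIO(FORECASTS_CSV))
    for row in reader:
        # csv returns every value as a string, so convert before comparing
        if int(row["location_id"]) == location_id:
            return row
    return None


print(fetch_forecast(2))
print(fetch_forecast(99))  # None
```

Whatever the storage format, the rest of the API only sees the `fetch_forecast` function, so the file could later be swapped for a database without touching the routes.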

Brief Description of Layers: Database

The Database is the storage layer where the data resides. It could be a relational database, NoSQL database, or even flat files for simpler systems.

  • If using a relational database (SQLite, MySQL, PostgreSQL), FastAPI can use the SQLAlchemy ORM to interact with the database.
  • The database models might look quite similar to the Pydantic models. Even if they have the same attributes and types, they serve different purposes. One is for data validation and sending/receiving data, the other is for storing data.

Sample SQLAlchemy model:

from sqlalchemy import Column, Float, Integer, String
from sqlalchemy.orm import declarative_base  # import location since SQLAlchemy 1.4

Base = declarative_base()

class ForecastModel(Base):
    __tablename__ = "forecasts"
    id = Column(Integer, primary_key=True, index=True)
    latitude = Column(Float, index=True)  # numeric, matching the Pydantic model
    longitude = Column(Float, index=True)
    data = Column(String)  # JSON representation of forecast data

We will keep things simple for now. We won’t use a database this week. We’ll focus on implementing the Service and Model layers in FastAPI. But this layer might be useful in a future assignment.

Coffee Break ☕

11:00 – 11:10

Query and Path Parameters

11:10 – 11:20

I want to use this time to distinguish two common ways of passing data to APIs: query parameters and path parameters.

Use these slides later as a reference.

Query Parameters

  • Query parameters are key-value pairs which, from the client side, are appended to the URL after a ?

  • Example:

    https://api.open-meteo.com/v1/forecast?latitude=51.51&longitude=-0.13&daily=temperature_2m_min
  • They are used to pass optional data to the server.

    • For example, you might want to filter the data down to one or more specific locations.
    • They are not part of the URL path, but are appended to the URL.
    • Although typically optional, the API design might require that at least one query parameter is provided. This is the case in the OpenMeteo API we’re using as an example where latitude, longitude, and daily are all required.
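
To make the "appended after a ?" idea concrete, this sketch uses only Python's standard library to build the example URL above and then take it apart again. It does not contact the API. The variable names are chosen for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

base_url = "https://api.open-meteo.com/v1/forecast"
params = {"latitude": 51.51, "longitude": -0.13, "daily": "temperature_2m_min"}

# Client side: append the key-value pairs after a "?"
url = f"{base_url}?{urlencode(params)}"
print(url)
# https://api.open-meteo.com/v1/forecast?latitude=51.51&longitude=-0.13&daily=temperature_2m_min

# Server side: split the URL and recover the key-value pairs
parts = urlsplit(url)
query = {key: values[0] for key, values in parse_qs(parts.query).items()}
print(parts.path)  # /v1/forecast
print(query)       # Every value arrives as a string
```

Note that the values arrive at the server as strings. Converting them to `float` or `int` is part of what FastAPI does when you annotate the function parameters with types.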

Example: Query Parameters in FastAPI

Server Side

from fastapi import FastAPI

app = FastAPI()

@app.get("/v1/forecast")
async def get_forecast(latitude: float, 
                       longitude: float, 
                       daily: str):

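    # Query the database
    # Create a ForecastModel object
    # Return the object
    # (placeholder below so the example runs as written)
    forecast_object = {"latitude": latitude, "longitude": longitude, "daily": {daily: []}}

    return forecast_object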

Client Side

import requests

response = requests.get(
    "https://api.open-meteo.com/v1/forecast",
    params={"latitude": 51.51, 
            "longitude": -0.13, 
            "daily": "temperature_2m_min"}
)

# Do something with the response
print(response.json())

Path Parameters

Instead of ?key=value pairs, path parameters are part of the URL path itself. They are used to pass mandatory data to the server.

  • Path parameters are part of the URL and specify a resource directly.
  • They are defined by placeholders in the route.

For example, we could conceive of a path parameter like this (I don’t think this exists on this particular API we’re using as an example):
https://MY-OWN-API/v1/forecast/UK/London
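
To illustrate what "placeholders in the route" means, here is a simplified sketch of how a route template could be matched against an incoming path. This is not FastAPI's actual implementation, and `match_route` is a hypothetical helper written for this example.

```python
import re
from typing import Dict, Optional


def match_route(template: str, path: str) -> Optional[Dict[str, str]]:
    """Return the placeholder values if path fits template, else None."""
    # Turn "/v1/forecast/{country}/{city}" into a regex with named groups.
    # Literal parts of the template are not escaped, which is fine for this sketch.
    pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", template)
    match = re.fullmatch(pattern, path)
    return match.groupdict() if match else None


print(match_route("/v1/forecast/{country}/{city}", "/v1/forecast/UK/London"))
# {'country': 'UK', 'city': 'London'}

print(match_route("/v1/forecast/{country}/{city}", "/v1/forecast/UK"))
# None: the city is missing, so the path does not match this route
```

The second call shows why path parameters are effectively mandatory: leave one out and the URL no longer points at this route at all.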

Example: Path Parameters in FastAPI

Server Side

from fastapi import FastAPI

app = FastAPI()

@app.get("/v1/forecast/{country}/{city}")
async def get_forecast(country: str, city: str):

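    # Query the database
    # Create a ForecastModel object
    # Return the object
    # (placeholder below so the example runs as written)
    forecast_object = {"country": country, "city": city}

    return forecast_object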

Client Side

import requests

# This will not work!
# It's a fictional example 
response = requests.get(
    "https://MY-OWN-API/v1/forecast/UK/London"
)

# Do something with the response
print(response.json())

RESTful Design

11:20 – 11:25

  • REST (Representational State Transfer) is an architectural style for designing networked applications.
  • It is based on a stateless, client-server communication model and revolves around resources.

Key Principles

  1. Resources and Endpoints:
    • Resources are identified using URLs, with paths representing different endpoints.

    • Example:

      /v1/forecast   # this serves one type of data
      /v1/historical # this serves another type of data
  2. HTTP Methods:
    • Different methods perform different actions on resources:
      • GET: Retrieve data (read-only).
      • POST: Create a new resource.
      • PUT: Update an existing resource.
      • DELETE: Remove a resource.

Here’s a different way to explore this.

Key Principles (cont.)

  3. JSON as Data Format:
    • RESTful APIs typically use JSON for requests and responses.
  4. Statelessness:
    • Each request from the client contains all the information needed to process it.
  5. Status Codes:
    • Indicate the result of the HTTP request:
      • 200 OK: Request successful.
      • 404 Not Found: Resource does not exist.
      • 500 Internal Server Error: Something went wrong on the server.
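
Python's standard library already knows these codes, which is handy when you want readable names instead of bare numbers. In this short sketch, `is_success` is a helper written for illustration.

```python
from http import HTTPStatus

for code in (200, 404, 500):
    status = HTTPStatus(code)
    print(status.value, status.phrase)
# 200 OK
# 404 Not Found
# 500 Internal Server Error


def is_success(code: int) -> bool:
    """Codes are grouped by first digit: 2xx means the request succeeded."""
    return 200 <= code < 300


print(is_success(200))  # True
print(is_success(404))  # False
```

On the client side, a `requests` response exposes the same number as `response.status_code`.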

Read more about HTTP status codes here

Let’s design an ASCOR API

11:25 – 12:00

OBJECTIVE: Design API endpoints for ASCOR assessments.

Endpoints

Which endpoints seem to make sense for the ASCOR dataset?

🗣️ CLASSROOM DISCUSSION (if time allows)

We will now interact on the following GitHub repository: lse-ds205/ascor-api.

Endpoint Candidate I: /v1/country-data/{country}

Purpose: Retrieve country-level benchmark data for a specific year.

Input Parameters

Compulsory:

  • country: Name or ISO code of the country (path parameter).
  • year: Year for benchmark data (query parameter).

Optional:

If none are passed, the API should return all available data for the country.

  • pillar: Select only data from a specific pillar.
  • area: Select only data from a specific area.
  • indicators: Select only data from a specific set of indicators.
  • metric: Select only data from a specific metric.

Expected Response

{
    "country": "United Kingdom",
    "assessment_year": 2023,
    ...,  // Other metadata
    "data": {
        "EP.1": {
            "assessment": "Partial",
            "indicators": {
                "EP1.a": "Yes",
                "EP1.b": "No",
                "EP1.c": "No"
            }
        },
        "EP.2": {
            "assessment": "Partial",
            "indicators": {
                "EP2.a": "Yes",
                "EP2.b": "No",
                "EP2.c": "No",
                "EP2.d": "No"
            },
            "metrics": {
                "EP.2.a.i": "-25%",
                "EP2.b.i": "No or unsuitable disclosure",
                "EP2.c.i": "62%",
                "EP2.d.i": "822%"
            }
        },
        ...

    }
}

What’s Next ⏭️

💻 W02 Lab Session (Tuesday, 28 January 2025):

  • Create a FastAPI app skeleton.

  • Implement the /v1/country-data/{country} endpoint with query and path parameters.

  • Test the API using curl and Python.

✍🏻 W02 Formative Exercise

  • Practice GitFlow by creating a new branch for your API project.

  • Implement the logic for a separate endpoint /v1/indicator-data/.