DS205 2025-2026 Winter Term

✅ W01 Lab Solutions - Exploring the Open Food Facts API

Published: 19 January 2026

🎯 ACTION POINT 01

We expected you to see that once you search the Open Food Facts website for a brand or product you know, you get access to its nutritional information, ingredients, brands, and the other details you normally find when you buy that product in a supermarket or an online grocery store.

The API documentation page, on the other hand, may seem mysterious and overwhelming at first. As you work through the notebook and revisit the documentation, you will see that the API returns the same information (or more), just in a different format.

For example, if you search for “Weetabix” on the Open Food Facts website, you can find the Weetabix Family Pack page, note that it scores 3 on the NOVA group scale and contains 11 ingredients, and look at its nutritional table. When you run the Python code we provided to search for ‘breakfast cereals’ in the UK, that same product will be among the 50 products returned in the API response.

🎯 ACTION POINT 02 - Completed Table

Based on the API response structure from the Open Food Facts API v2 reference, here is the completed table:

API Response Structure - Data Types and Descriptions

Key        | Python Type | Description
count      | int         | Total number of products matching the search criteria (e.g., 50 for our breakfast cereals search)
page       | int         | Current page number in pagination (starts at 1)
page_count | int         | Total number of pages available given the page_size (e.g., 1 page if count ≤ page_size)
page_size  | int         | Number of products requested per page (we set this to 50)
products   | list        | List of product dictionaries, each containing product details (e.g., 50 dictionaries in our case)
skip       | int         | Number of products skipped for pagination (0 for first page, increases by page_size for subsequent pages)

Note: Understanding the Response Structure

The API returns pagination metadata (count, page, page_count, page_size, skip) along with the actual data (products). This is a common pattern in REST APIs to help you navigate through large datasets. If you were to collect the next page of results, you would need to send another request with the page parameter incremented by 1.
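The relationships between these pagination fields can be sketched in a few lines. This is a minimal illustration, not the notebook's code: `expected_page_count`, `expected_skip`, and `next_page_params` are hypothetical helper names, and the ceiling-division rule for `page_count` is an assumption inferred from the descriptions in the table rather than a documented guarantee of the API.

```python
import math


def expected_page_count(count, page_size):
    """Pages needed to hold `count` items at `page_size` per page (assumed rule)."""
    return math.ceil(count / page_size)


def expected_skip(page, page_size):
    """Items skipped before `page` begins: 0 on page 1, then one page_size per page."""
    return (page - 1) * page_size


def next_page_params(params, response):
    """Return a copy of `params` pointing at the next page, or None on the last page."""
    if response["page"] >= response["page_count"]:
        return None
    return {**params, "page": response["page"] + 1}


# Hypothetical values shaped like the response described in this section
response = {"count": 50, "page": 1, "page_count": 1, "page_size": 50, "skip": 0}
print(expected_page_count(response["count"], response["page_size"]))  # 1
print(expected_skip(response["page"], response["page_size"]))         # 0
print(next_page_params({"page": 1, "page_size": 50}, response))       # None
```

Returning `None` on the last page gives a calling loop a simple stopping condition, so it does not need to compare page numbers itself.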

🎯 ACTION POINT 03 - Solution Approaches

There are two valid ways to confirm your suspicions about the data types. Both approaches work, but they serve different purposes:

Approach 1: Systematic Loop with Conditional Logic

This approach is more efficient when you want to check multiple keys at once and get additional information about collections:

# Confirm your suspicions about data types
for key, value in data.items():
    value_type = type(value).__name__  # Pro-tip: .__name__ gives clean type name

    # Build description
    description = value_type

    # If it's a collection, show its length
    if isinstance(value, list):
        description += f" (length: {len(value)})"
    elif isinstance(value, dict):
        description += f" (keys: {len(value)})"
    elif isinstance(value, str):
        description += f" (length: {len(value)} chars)"

    print(f"{key:15} → {description:20} = {value}")

Output:

count           → int                  = 50
page            → int                  = 1
page_count      → int                  = 1
page_size       → int                  = 50
products        → list (length: 50)    = [{'product_name': 'Weetabix', ...}, ...]
skip            → int                  = 0

Pro-tip: Using __name__

Instead of type(value) which returns <class 'int'>, using type(value).__name__ gives you just the clean type name: 'int'. This is much more readable in output and easier to work with programmatically.
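To see the difference side by side, here is a small self-contained sketch. The values are made up for illustration and do not come from the API response.

```python
value = 50

print(type(value))           # <class 'int'>
print(type(value).__name__)  # int

# The same attribute works for any object, including collections
type_names = [type(v).__name__ for v in (50, "Weetabix", [1, 2], {"a": 1})]
print(type_names)  # ['int', 'str', 'list', 'dict']
```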

Why this approach?

  • Efficient: checks all keys in one loop
  • Informative: automatically shows length for collections
  • Systematic: ensures you don’t miss any keys
  • Good for: when you want a complete overview quickly

Approach 2: Manual One-by-One Inspection

This is the more natural, exploratory approach that most people use when first learning:

# Check each key manually, one at a time
type(data['count'])

Output in notebook:

int

Then continue with:

type(data['page'])
type(data['page_count'])
type(data['page_size'])
type(data['products'])
type(data['skip'])

For collections, you might also check:

data['products']  # Just leave it dangling - Jupyter will display it
len(data['products'])
data['products'][0]  # Inspect first product

Note: Why No Print?

In Jupyter notebooks, you don’t need print() for the last expression in a cell. Just leaving the object “dangling” (as the last line) will display it automatically. This is cleaner and more natural for exploration.

Why this approach?

  • Natural: mimics how you’d explore data interactively
  • Visual: you see the actual values, not just types
  • Flexible: you can pause and inspect interesting values
  • Good for: learning, debugging, when you want to see actual data

Important: Common Mistakes to Avoid

  1. Forgetting bracket notation: Use data['key_name'], not data.key_name. Python dictionaries are accessed by key with bracket notation; dot notation looks up attributes, not keys.

  2. Not checking nested structures: Remember that data['products'] is a list, so you’ll need data['products'][0] to see an individual product dictionary.

  3. Assuming all values are simple types: The nutriments field within each product is itself a dictionary - JSON is often deeply nested!
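The three mistakes above can be reproduced on a tiny stand-in for the response. This is only a sketch: the `data` dictionary below is hypothetical and heavily trimmed, and the sugar value is invented for illustration.

```python
# Hypothetical, heavily trimmed stand-in for the API response
data = {
    "count": 1,
    "products": [
        {"product_name": "Weetabix", "nutriments": {"sugars_100g": 4.2}},
    ],
}

# Mistake 1: dot notation looks for an attribute, not a dictionary key
try:
    data.products
    dot_access_worked = True
except AttributeError:
    dot_access_worked = False
print(dot_access_worked)  # False

# Mistake 2: products is a list, so index into it to reach one product
first_product = data["products"][0]
print(type(data["products"]).__name__)  # list
print(type(first_product).__name__)     # dict

# Mistake 3: values can themselves be dictionaries
print(type(first_product["nutriments"]).__name__)  # dict
```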

Verification Code

After filling in your table, you can verify your answers match the solution above. Both approaches will confirm that:

  • count, page, page_count, page_size, and skip are all integers
  • products is a list containing 50 product dictionaries
  • Each product dictionary contains fields like product_name, brands, categories, nutriments, nova_group, and ingredients_text
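If you would rather have the verification done for you, one possible sketch is a small checker that compares each top-level key against the type given in the table. The names `EXPECTED_TYPES` and `find_type_mismatches` are hypothetical, and `sample` is a made-up response with a single trimmed product rather than real API output.

```python
# Expected Python type for each top-level key, taken from the completed table
EXPECTED_TYPES = {
    "count": int,
    "page": int,
    "page_count": int,
    "page_size": int,
    "products": list,
    "skip": int,
}


def find_type_mismatches(data, expected=EXPECTED_TYPES):
    """Return {key: actual_type_name} for every key whose type differs from the table."""
    return {
        key: type(data.get(key)).__name__
        for key, expected_type in expected.items()
        if not isinstance(data.get(key), expected_type)
    }


# Hypothetical response with one trimmed product
sample = {
    "count": 50,
    "page": 1,
    "page_count": 1,
    "page_size": 50,
    "products": [{"product_name": "Weetabix"}],
    "skip": 0,
}
print(find_type_mismatches(sample))  # {}
```

An empty dictionary means every key matched. A missing key is reported as `NoneType`, because `data.get(key)` returns `None` when the key is absent.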