✅ W01 Lab Solutions - Exploring the Open Food Facts API
🎯 ACTION POINT 01
We expected you to see that searching the Open Food Facts website for a familiar brand or product gives you the nutritional information, ingredients, brands, and other details you normally find when you buy that product in a supermarket or an online grocery store.
The API documentation page, on the other hand, may seem mysterious and overwhelming at first. As you work through the notebook and revisit it, you will see that the API returns the same information (or more), just in a different format.
For example, if you search for "Weetabix" on the Open Food Facts website, you can find the Weetabix Family Pack page and note that it scores 3 on the NOVA group scale and contains 11 ingredients, and you can view its nutritional table. Once you run the Python code we provided to search for "breakfast cereals" in the UK, that same product will be among the 50 products returned in the API response.
🎯 ACTION POINT 02 - Completed Table
Based on the API response structure from the Open Food Facts API v2 reference, here is the completed table:
| Key | Python Type | Description |
|---|---|---|
| `count` | `int` | Total number of products matching the search criteria (e.g., 50 for our breakfast cereals search) |
| `page` | `int` | Current page number in pagination (starts at 1) |
| `page_count` | `int` | Total number of pages available given the page_size (e.g., 1 page if count ≤ page_size) |
| `page_size` | `int` | Number of products requested per page (we set this to 50) |
| `products` | `list` | List of product dictionaries, each containing product details (e.g., 50 dictionaries in our case) |
| `skip` | `int` | Number of products skipped for pagination (0 for first page, increases by page_size for subsequent pages) |

The API returns pagination metadata (count, page, page_count, page_size, skip) along with the actual data (products). This is a common pattern in REST APIs to help you navigate through large datasets. If you were to collect the next page of results, you would need to send another request with the page parameter incremented by 1.
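
The pagination logic described above can be sketched without touching the network. This is a minimal illustration, not the notebook's own code: the helper name `next_page_params`, the `params` dictionary, and the numbers in `first_page` are all hypothetical stand-ins. The sketch decides whether another page exists by comparing `skip` plus the number of products received against `count`.

```python
from typing import Any, Optional


def next_page_params(params: dict[str, Any], data: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Return the query parameters for the next page, or None on the last page.

    `params` holds the query parameters sent with the previous request
    (hypothetical name); `data` is the parsed JSON response.
    """
    # Products seen so far = those skipped + those on the current page
    seen = data["skip"] + len(data["products"])
    if seen >= data["count"]:
        return None  # nothing left to fetch
    # Build a new dictionary so the caller's parameters are not modified
    return {**params, "page": data["page"] + 1}


# Stand-in for a first-page response; the numbers are made up for illustration
first_page = {"count": 120, "page": 1, "page_size": 50, "skip": 0,
              "products": [{}] * 50}
params = {"page": 1, "page_size": 50}

print(next_page_params(params, first_page))
# {'page': 2, 'page_size': 50}
```

You would pass the returned dictionary as the query parameters of your next request and stop once the helper returns `None`.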
🎯 ACTION POINT 03 - Solution Approaches
There are two valid ways to confirm your suspicions about the data types. Both approaches work, but they serve different purposes:
Approach 1: Systematic Loop with Conditional Logic
This approach is more efficient when you want to check multiple keys at once and get additional information about collections:

    # Confirm your suspicions about data types
    for key in data.keys():
        value = data[key]
        value_type = type(value).__name__  # Pro-tip: .__name__ gives clean type name

        # Build description
        description = f"{value_type}"

        # If it's a collection, show its length
        if isinstance(value, list):
            description += f" (length: {len(value)})"
        elif isinstance(value, dict):
            description += f" (keys: {len(value)})"
        elif isinstance(value, str):
            description += f" (length: {len(value)} chars)"

        print(f"{key:15} → {description:20} = {value}")

Output:

    count           → int                  = 50
    page            → int                  = 1
    page_count      → int                  = 1
    page_size       → int                  = 50
    products        → list (length: 50)    = [{'product_name': 'Weetabix', ...}, ...]
    skip            → int                  = 0

Pro-tip: `__name__`
`type(value)` returns the type object itself, which displays as `<class 'int'>`. Using `type(value).__name__` gives you just the type name as a string: `'int'`. This is much more readable in output and easier to work with programmatically.
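
As a quick illustration of the difference, here is a small sketch you could run in any Python session. The variable `value` is just an example integer, not part of the API response.

```python
value = 50  # example value for illustration

print(type(value))           # <class 'int'>
print(type(value).__name__)  # int

# Because __name__ is a plain string, it can be compared or used as a label
print(type([]).__name__ == "list")  # True
```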
Why this approach?
- Efficient: checks all keys in one loop
- Informative: automatically shows length for collections
- Systematic: ensures you don't miss any keys
- Good for: when you want a complete overview quickly
Approach 2: Manual One-by-One Inspection
This is the more natural, exploratory approach that most people use when first learning:

    # Check each key manually, one at a time
    type(data['count'])

Output in notebook:

    int

Then continue with the remaining keys, running each line in its own cell:

    type(data['page'])
    type(data['page_count'])
    type(data['page_size'])
    type(data['products'])
    type(data['skip'])

For collections, you might also check the following, again one line per cell:

    data['products']       # Just leave it dangling - Jupyter will display it
    len(data['products'])
    data['products'][0]    # Inspect first product

In Jupyter notebooks, you don't need print() for the last expression in a cell. Just leaving the object "dangling" (as the last line) will display it automatically. Only the last expression in a cell is displayed, which is why each check above goes in its own cell. This is cleaner and more natural for exploration.
Why this approach?
- Natural: mimics how you'd explore data interactively
- Visual: you see the actual values, not just types
- Flexible: you can pause and inspect interesting values
- Good for: learning, debugging, when you want to see actual data
Common mistakes to avoid:
- Forgetting bracket notation: Use `data['key_name']` not `data.key_name` - Python dictionaries require bracket notation for string keys.
- Not checking nested structures: Remember that `data['products']` is a list, so you'll need `data['products'][0]` to see an individual product dictionary.
- Assuming all values are simple types: The `nutriments` field within each product is itself a dictionary - JSON is often deeply nested!
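
The three pitfalls above can be reproduced on a tiny stand-in for the response. This is only a sketch: the `data` dictionary below is made up for illustration (one simplified product with invented values), so it runs without calling the API.

```python
# Made-up stand-in for the API response, with a single simplified product
data = {
    "count": 1,
    "products": [
        {"product_name": "Example Cereal", "nutriments": {"sugars_100g": 4.4}}
    ],
}

# Pitfall 1: attribute access does not work on dictionaries
try:
    data.products
except AttributeError as error:
    print(f"AttributeError: {error}")

# Bracket notation is the way to look up a key
print(len(data["products"]))          # 1

# Pitfall 2: products is a list, so index into it before asking for a field
first_product = data["products"][0]
print(first_product["product_name"])  # Example Cereal

# Pitfall 3: some fields are dictionaries themselves
nutriments = first_product["nutriments"]
print(type(nutriments).__name__)      # dict
print(nutriments["sugars_100g"])      # 4.4
```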
Verification Code
After filling in your table, you can verify your answers match the solution above. Both approaches will confirm that:
- `count`, `page`, `page_count`, `page_size`, and `skip` are all integers
- `products` is a list containing 50 product dictionaries
- Each product dictionary contains fields like `product_name`, `brands`, `categories`, `nutriments`, `nova_group`, and `ingredients_text`
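
One way to turn these checks into code is with a few `assert` statements. The sketch below is an assumption about how you might do it, not code from the notebook: `check_response` is a hypothetical helper, and `sample` is a made-up stand-in so the block runs on its own. In the notebook you would pass your real `data` instead.

```python
def check_response(data: dict, expected_products: int = 50) -> None:
    """Raise AssertionError if the response does not match the completed table."""
    # The pagination metadata should all be integers
    for key in ["count", "page", "page_count", "page_size", "skip"]:
        assert isinstance(data[key], int), f"{key} should be an int"

    # products should be a list of dictionaries of the expected length
    assert isinstance(data["products"], list), "products should be a list"
    assert len(data["products"]) == expected_products, "unexpected number of products"
    assert all(isinstance(product, dict) for product in data["products"]), \
        "each product should be a dict"


# Made-up stand-in with the same shape as the response described above
sample = {
    "count": 50, "page": 1, "page_count": 1, "page_size": 50, "skip": 0,
    "products": [{"product_name": "Example Cereal"} for _ in range(50)],
}

check_response(sample)
print("All checks passed")
```

If any check fails, the assertion message tells you which key to look at again.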