📖 Data Dictionary: Tesco Grocery 1.0

A large-scale dataset of grocery purchases in London

Published: 23 June 2023

The content of this page was copied from the original data dictionary available in (Aiello 2020).

For each geographic aggregation (LSOA, MSOA, Ward, Borough), the authors provide a file containing the aggregated information on food purchases, enriched with information coming from the census.
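As a rough illustration of working with one of these per-aggregation files, the sketch below reads a comma-separated table and indexes its rows by the area_id column described in the field table. The function name load_area_table and the sample values are hypothetical; only area_id, energy, and weight are taken from the data dictionary.

```python
import csv
import io

def load_area_table(stream):
    """Read one aggregated file and index its rows by area_id.

    `stream` is any text stream, e.g. open(path, newline="").
    csv.DictReader returns every value as a string, so numeric
    fields still need an explicit float() conversion afterwards.
    """
    reader = csv.DictReader(stream)
    return {row["area_id"]: row for row in reader}

# Tiny in-memory stand-in for a real file; the values are made up.
sample = io.StringIO(
    "area_id,energy,weight\n"
    "A1,180.5,320.0\n"
    "A2,175.2,310.4\n"
)
areas = load_area_table(sample)
energy_a1 = float(areas["A1"]["energy"])
```

The same function can be pointed at each aggregation level in turn, since all files are described as sharing the same set of columns.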

For more information on how they aggregated the data, please refer to the original paper:

Aiello, Luca Maria, Daniele Quercia, Rossano Schifanella, and Lucia Del Prete. ‘Tesco Grocery 1.0, a Large-Scale Dataset of Grocery Purchases in London’. Scientific Data 7, no. 1 (18 February 2020): 57. https://doi.org/10.1038/s41597-020-0397-7. (Aiello et al. 2020)

Files are comma-separated and contain 202 columns in total. Fields include:
Field                      Description
area_id                    Identifier of the area
weight                     Weight of the average food product, in grams
volume                     Volume of the average drink product, in liters
energy                     Nutritional energy of the average product, in kcal
energy_density             Concentration of calories in the area's average product, in kcal/gram
{nutrient}                 Weight of {nutrient} in the average product, in grams. Possible nutrients are: carbs, sugar, fat, saturated fat, protein, fibre. The count of carbs includes sugars, and the count of fats includes saturated fats
energy_{nutrient}          Amount of energy from {nutrient} in the average product, in kcal
h_nutrients_weight         Diversity (entropy) of nutrients weight
h_nutrients_weight_norm    Diversity (entropy) of nutrients weight, normalized in [0,1]
h_nutrients_calories       Diversity (entropy) of energy from nutrients
h_nutrients_calories_norm  Diversity (entropy) of energy from nutrients, normalized in [0,1]
f_{category}               Fraction of products of type {category} purchased. Possible categories are: beer, dairy, eggs, fats & oils, fish, fruit & veg, grains, red meat, poultry, readymade, sauces, soft drinks, spirits, sweets, tea & coffee, water, and wine
f_{category}_weight        Fraction of total product weight given by products of type {category}
h_category                 Diversity (entropy) of food product categories
h_category_norm            Diversity (entropy) of food product categories, normalized in [0,1]
h_category_weight          Diversity (entropy) of weight of food product categories
h_category_weight_norm     Diversity (entropy) of weight of food product categories, normalized in [0,1]
representativeness_norm    Ratio between the number of unique customers in the area and the number of residents as measured by the census; values are min-max normalized in [0,1] across all areas
transaction_days           Number of unique dates on which at least one purchase was made by a resident of the area
num_transactions           Total number of products purchased by Clubcard owners who are resident in the area
man_day                    Cumulative number of man-days of purchase (number of distinct days on which a customer purchased something, summed over all customers)
population                 Total population of residents in the area according to the 2015 census
male                       Total male population in the area
female                     Total female population in the area
age_0_17                   Total number of residents between 0 and 17 years old
age_18_64                  Total number of residents between 18 and 64 years old
age_65+                    Total number of residents aged 65 years or more
avg_age                    Average age of residents according to the 2015 census
area_sq_km                 Surface of the area, in km^2
people_per_sq_km           Population density, in residents per km^2
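
To make the entropy fields (h_nutrients_weight, h_category, and their _norm variants) more concrete, here is a minimal sketch of Shannon entropy and its normalized form. It is not the authors' code: the base-2 logarithm, the division by log(n) to reach [0,1], and the choice of which nutrients enter the distribution are assumptions; the original paper gives the exact definitions. The sample weights are made up.

```python
import math

def shannon_entropy(values, base=2.0):
    """Shannon entropy of the distribution obtained by scaling
    `values` so they sum to 1. Zero entries contribute nothing."""
    total = sum(values)
    if total <= 0:
        raise ValueError("values must have a positive sum")
    probs = [v / total for v in values if v > 0]
    return -sum(p * math.log(p, base) for p in probs)

def normalized_entropy(values, base=2.0):
    """Entropy divided by its maximum, log(n), so the result is in [0, 1].
    Assumption: n is the number of categories, including empty ones."""
    n = len(values)
    if n < 2:
        return 0.0
    return shannon_entropy(values, base) / math.log(n, base)

# Hypothetical grams of carbs, fat, protein, and fibre in an average product.
nutrient_weights = [18.0, 9.0, 5.0, 1.5]
h = shannon_entropy(nutrient_weights)
h_norm = normalized_entropy(nutrient_weights)
```

A value of h_norm close to 1 means the nutrients are present in nearly equal amounts, while a value close to 0 means a single nutrient dominates.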

Where applicable, measures are accompanied by their standard deviation (fields with the suffix _std), the 95% confidence interval for the mean (suffix _ci95), and the values of the 2.5th, 25th, 50th, 75th, and 97.5th percentiles (suffix _perc{value}).
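
The suffix convention can be used to group related columns. The sketch below splits a column name into its base field and statistic. The helper names are hypothetical, and the exact spelling of the percentile suffixes (for example _perc2.5 or _perc25) is an assumption based on the _perc{value} pattern, so it should be checked against the real file headers.

```python
import re

# Matches <base>_std, <base>_ci95, and <base>_perc<value>.
# Assumption: percentile values are written as digits with an
# optional decimal part, e.g. "perc2.5" or "perc50".
SUFFIX_RE = re.compile(r"^(?P<base>.+?)_(?P<stat>std|ci95|perc\d+(?:\.\d+)?)$")

def split_stat_suffix(column):
    """Return (base_field, statistic). Columns without a recognised
    suffix are returned unchanged with the statistic "value"."""
    match = SUFFIX_RE.match(column)
    if match:
        return match.group("base"), match.group("stat")
    return column, "value"

def group_columns(columns):
    """Map each base field to a {statistic: column_name} dictionary."""
    groups = {}
    for column in columns:
        base, stat = split_stat_suffix(column)
        groups.setdefault(base, {})[stat] = column
    return groups

# Hypothetical header fragment for illustration.
header = ["area_id", "energy", "energy_std", "energy_ci95", "energy_perc2.5"]
grouped = group_columns(header)
```

With this grouping, grouped["energy"] collects the main energy column together with its dispersion statistics, which keeps the 202 columns manageable when exploring a file.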

References

Aiello, Luca Maria. 2020. “Area-Level Grocery Purchases.” figshare. https://doi.org/10.6084/M9.FIGSHARE.7796666.V1.
Aiello, Luca Maria, Daniele Quercia, Rossano Schifanella, and Lucia Del Prete. 2020. “Tesco Grocery 1.0, a Large-Scale Dataset of Grocery Purchases in London.” Scientific Data 7 (1): 57. https://doi.org/10.1038/s41597-020-0397-7.