Primitive Data Types

A visual guide

16 Oct 2024

How data is stored in a computer

Computers only understand 0s and 1s

Numbers, text, images, and sounds are all stored as sequences of 0s and 1s in your computer’s memory. Each 0 or 1 is called a bit.

Think of a bit as a tiny box:

\[ \require{color} \fcolorbox{black}{white}{$\phantom{0}$} \phantom{\leftarrow \text{a bit can have a value of $0$}} \]

\[ \require{color} \begin{array}{ccc} \fcolorbox{black}{#eeeeee}{0} & \leftarrow & \text{a bit can have a value of $0$} \end{array} \]

\[ \begin{array}{ccc} \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \leftarrow & \text{OR it can have a value of $1$} \end{array} \]

but nothing else!

Boolean data type (aka bool)

For everything that has a ‘Yes’ or ‘No’ answer, we can use a single bit.

\[ \textcolor{#9753b8}{\texttt{is_it_raining}} = \begin{cases} \fcolorbox{black}{#eeeeee}{$\textcolor{black}{0}$} & \text{if it is not raining} \\ \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \text{if it is raining} \end{cases} \]

In Python:

# if it is raining
is_it_raining = True

# if it is not raining
is_it_raining = False
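
As a quick check of the picture above, Python's True and False convert to the numbers 1 and 0. This is only a sketch of the correspondence; it says nothing about how many bits Python uses internally to store a bool. The names raining_as_number and not_raining_as_number are made up for this illustration.

```python
# True behaves like 1 and False behaves like 0 when converted to a number
is_it_raining = True
raining_as_number = int(is_it_raining)

is_it_raining = False
not_raining_as_number = int(is_it_raining)

print(raining_as_number, not_raining_as_number)  # 1 0
```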

What about numbers?

Positive whole numbers

Suppose we want to represent whole numbers that are zero or positive. A single bit only gives us two values (0 and 1), so we need more bits!

With \(2\) bits, we can represent \(4\) different numbers:

\[\begin{array}{ccc} \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#eeeeee}{0} & \rightarrow 0 \\ \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \rightarrow 1 \\ \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#eeeeee}{0} & \rightarrow 2 \\ \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \rightarrow 3 \\ \end{array}\]

With \(3\) bits, we can represent twice as many numbers: \(8\)

\[\begin{array}{ccccccc} \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#eeeeee}{0} & \rightarrow 0 & & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#eeeeee}{0} & \rightarrow 4 \\ \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \rightarrow 1 & & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \rightarrow 5 \\ \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#eeeeee}{0} & \rightarrow 2 & & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#eeeeee}{0} & \rightarrow 6 \\ \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \rightarrow 3 & & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \rightarrow 7 \\ \end{array}\]

With \(4\) bits, the count doubles yet again, and we can represent \(16\) different numbers:

\[\begin{array}{ccccccccccccccc} \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#eeeeee}{0} & \rightarrow 0 & & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#eeeeee}{0} & \rightarrow 8 \\ \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \rightarrow 1 & & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \rightarrow 9 \\ \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#eeeeee}{0} & \rightarrow 2 & & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#eeeeee}{0} & \rightarrow 10 \\ \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \rightarrow 3 & & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \rightarrow 11 \\ \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#eeeeee}{0} & \rightarrow 4 & & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#eeeeee}{0} & \rightarrow 12 \\ \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \rightarrow 5 & & 
\fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \rightarrow 13 \\ \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#eeeeee}{0} & \rightarrow 6 & & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#eeeeee}{0} & \rightarrow 14 \\ \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \rightarrow 7 & & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \rightarrow 15 \\ \end{array}\]
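
The doubling pattern in these tables can be reproduced with a few lines of Python. This is a minimal sketch; the helper name all_bit_patterns is an assumption made for this example and is not part of any library.

```python
def all_bit_patterns(n_bits):
    """Return every pattern of n_bits bits, in counting order, as strings."""
    return [format(value, f"0{n_bits}b") for value in range(2 ** n_bits)]

# each extra bit doubles how many different numbers we can write down
for n_bits in (1, 2, 3, 4):
    patterns = all_bit_patterns(n_bits)
    print(n_bits, "bits ->", len(patterns), "different numbers")

print(all_bit_patterns(2))  # ['00', '01', '10', '11']
```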

Here is another way of looking at it:

\[\begin{array}{ccccc} \fcolorbox{black}{white}{$\phantom{0}$} & \fcolorbox{black}{white}{$\phantom{0}$} & \fcolorbox{black}{white}{$\phantom{0}$} & \fcolorbox{black}{white}{$\phantom{0}$} \\ \downarrow & \downarrow & \downarrow & \downarrow \\ \fcolorbox{black}{white}{$\phantom{0}$} \times 2^3 & \fcolorbox{black}{white}{$\phantom{0}$} \times 2^2 & \fcolorbox{black}{white}{$\phantom{0}$} \times 2^1 & \fcolorbox{black}{white}{$\phantom{0}$} \times 2^0 \\ \end{array}\]

Suppose we have the following sequence of bits:

\[ \begin{array}{cccccc} \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} \\ \end{array} \]

We assign a weight to each bit according to its position:

\[ \begin{array}{cccccc} \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} \\ \downarrow & \downarrow & \downarrow & \downarrow \\ \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} \times 2^3 & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} \times 2^2 & \fcolorbox{black}{#eeeeee}{0} \times 2^1 & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} \times 2^0 \\ \end{array} \]

Multiplying each bit by its weight gives a value for every position:

\[ \begin{array}{cccccc} \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} \\ \downarrow & \downarrow & \downarrow & \downarrow \\ \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} \times 2^3 & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} \times 2^2 & \fcolorbox{black}{#eeeeee}{0} \times 2^1 & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} \times 2^0 \\ \downarrow & \downarrow & \downarrow & \downarrow \\ 8 & 4 & 0 & 1 \\ \end{array} \]

And this is why this sequence of bits represents the number 13:

\[ \begin{array}{cccccc} \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} \\ \downarrow & \downarrow & \downarrow & \downarrow \\ \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} \times 2^3 & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} \times 2^2 & \fcolorbox{black}{#eeeeee}{0} \times 2^1 & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} \times 2^0 \\ \downarrow & \downarrow & \downarrow & \downarrow \\ 8 & 4 & 0 & 1 \\ \downarrow & \downarrow & \downarrow & \downarrow \\ 8 & +\quad4 & +\quad0 & +\quad1 & = & 13 \end{array} \]
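
The same weighted sum can be written as a short Python function. This is a sketch for illustration only; bits_to_number is a hypothetical helper name, and the last line compares it against Python's built-in int(text, 2) conversion.

```python
def bits_to_number(bits):
    """Add up the weight (a power of 2) of every position that holds a 1."""
    total = 0
    # the rightmost bit has weight 2**0, the next one 2**1, and so on
    for position, bit in enumerate(reversed(bits)):
        total += int(bit) * 2 ** position
    return total

print(bits_to_number("1101"))  # 8 + 4 + 0 + 1 = 13
print(int("1101", 2))          # the built-in conversion agrees: 13
```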

But we need negative numbers too!

A simple way to do this is to reserve the first bit for the sign of the number: 0 means positive and 1 means negative. (Real hardware usually relies on a related scheme called two's complement, but the idea of spending one bit on the sign is the same.)

\[ \begin{array}{c|ccccc} \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} \\ \downarrow & \downarrow & \downarrow & \downarrow \\ \textcolor{green}{+} & 4 & 0 & 1 \\ \downarrow & \downarrow & \downarrow & \downarrow \\ \textcolor{green}{+} & 4 & +\quad0 & +\quad1 & = & +5 \\ \textcolor{green}{\text{sign}} & \text{value} & & & & \\ \end{array} \]

Flipping the first bit to 1 makes the same value negative:

\[ \begin{array}{c|ccccc} \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} \\ \downarrow & \downarrow & \downarrow & \downarrow \\ \textcolor{red}{-} & 4 & 0 & 1 \\ \downarrow & \downarrow & \downarrow & \downarrow \\ \textcolor{red}{-} & 4 & +\quad0 & +\quad1 & = & -5 \\ \textcolor{red}{\text{sign}} & \text{value} & & & & \\ \end{array} \]
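
Here is a rough sketch of how such a sign bit could be decoded. It assumes the convention that a first bit of 0 means positive and 1 means negative; signed_bits_to_number is a hypothetical helper, not something Python provides, and Python's own integers are not stored this way.

```python
def signed_bits_to_number(bits):
    """Decode a bit string whose first bit is the sign (assumed: 0 = +, 1 = -)."""
    sign = -1 if bits[0] == "1" else 1
    magnitude = int(bits[1:], 2)  # the remaining bits are read as a positive number
    return sign * magnitude

print(signed_bits_to_number("0101"))  # 5
print(signed_bits_to_number("1101"))  # -5
```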

Integers in Python

In Python, whole numbers are represented using the int data type:

# positive number
x = int(5)

# negative number
y = int(-5)

Or simply:

x = 5

y = -5

You don’t need to think about the number of bits used to represent an integer in Python.

  • Python will automatically use as many bits as needed to represent the number you assign to a variable.
  • As long as you have RAM available on your computer, you can store very large numbers in Python.
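
A small experiment makes this visible. The sketch below uses the built-in int.bit_length() method, which reports how many bits are needed to write a number in binary; the variable names small and huge are just for illustration.

```python
small = 5
huge = 2 ** 100  # far too large to fit in 64 bits

print(small.bit_length())  # 3, because 5 is 101 in binary
print(huge.bit_length())   # 101
print(huge + 1)            # still exact: Python simply uses more bits
```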

💡 However, we need to choose the number of bits (carefully) when we use other data manipulation libraries in Python, such as numpy and pandas. Integer types go by many names depending on the number of bits used (for example int8, int16, int32, and int64).

What if I need a decimal number?

Decimal numbers are represented using the floating-point data type.

\[ \textcolor{#9753b8}{\texttt{pi}} = 3.14159 \]

In Python:

pi = 3.14159

If you use a decimal point in a number, Python will automatically use the float data type.
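
You can confirm this yourself with type(). In this short sketch, the variables whole and decimal are made-up names; the only difference between them is the decimal point.

```python
whole = 5
decimal = 5.0  # the decimal point alone makes this a float

print(type(whole).__name__)    # int
print(type(decimal).__name__)  # float
print(type(3.14159).__name__)  # float
```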

How to represent floating-point numbers?

It gets more complicated…

We usually have a \(\textcolor{red}{sign}\) bit, an \(\textcolor{green}{exponent}\), and a \(\textcolor{blue}{mantissa}\). For example, if I only had 8 bits at my disposal (not a good idea), I could represent decimal numbers like this:

\[ \begin{array}{c|ccc|cccccc} \fcolorbox{red}{white}{$\phantom{0}$} & \fcolorbox{green}{white}{$\phantom{0}$} & \fcolorbox{green}{white}{$\phantom{0}$} & \fcolorbox{green}{white}{$\phantom{0}$} & \fcolorbox{blue}{white}{$\phantom{0}$} & \fcolorbox{blue}{white}{$\phantom{0}$} & \fcolorbox{blue}{white}{$\phantom{0}$} & \fcolorbox{blue}{white}{$\phantom{0}$} \\ \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow\\ \textcolor{red}{sign} & \textcolor{green}{\text{exp. sign}} & \fcolorbox{green}{white}{$\phantom{0}$} \times 2^1 & \fcolorbox{green}{white}{$\phantom{0}$} \times 2^0 & \fcolorbox{blue}{white}{$\phantom{0}$} \times 2^{-1} & \fcolorbox{blue}{white}{$\phantom{0}$} \times 2^{-2} & \fcolorbox{blue}{white}{$\phantom{0}$} \times 2^{-3} & \fcolorbox{blue}{white}{$\phantom{0}$} \times 2^{-4} \\ \end{array} \]

The value of the decimal number is retrieved using the formula:

\[ \textcolor{red}{\text{sign}} \times 10^{\textcolor{green}{\text{exp. sign}}\times(\textcolor{green}{\text{exponent number}})} \times \textcolor{blue}{\text{mantissa}} \]

For example:

\[ \begin{array}{c|ccc|cccccc} \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#eeeeee}{0} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} & \fcolorbox{black}{#eeeeee}{0} \\ \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow\\ \textcolor{red}{-} & \textcolor{green}{-} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} \times \textcolor{green}{2^1} & \fcolorbox{black}{#eeeeee}{0} \textcolor{green}{\times 2^0} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} \times \textcolor{blue}{2^{-1}} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} \times \textcolor{blue}{2^{-2}} & \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} \times \textcolor{blue}{2^{-3}} & \fcolorbox{black}{#eeeeee}{0} \times \textcolor{blue}{2^{-4}} \\ \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow\\ \textcolor{red}{-} & \textcolor{green}{-} & \textcolor{green}{2} & \textcolor{green}{0} & \textcolor{blue}{0.5} & \textcolor{blue}{0.25} & \textcolor{blue}{0.125} & \textcolor{blue}{0} \\ \end{array} \]

Now we do some calculations:

\[ \begin{array}{c|ll} \textcolor{red}{sign} & \textcolor{red}{-} & \text{(negative number)} \\ \textcolor{green}{exponent} & 10^{\textcolor{green}{-(2+0)}} = \textcolor{green}{\textbf{0.01}} & \text{(assume base 10)} \\ \textcolor{blue}{mantissa} & \textcolor{blue}{0.5} + \textcolor{blue}{0.25} + \textcolor{blue}{0.125} = \textcolor{blue}{\textbf{0.875}} & \\ \hline \textbf{Result} & \textcolor{red}{-} \quad \textcolor{green}{\textbf{0.01}} \times \textcolor{blue}{\textbf{0.875}} = -0.00875 \\ \end{array} \]
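
The calculation above can be sketched in Python. This decoder follows the 8-bit toy layout from these slides (1 sign bit, 1 exponent-sign bit, 2 exponent bits, 4 mantissa bits, base-10 exponent), not the format real computers use; decode_toy_float is a hypothetical name chosen for this example.

```python
def decode_toy_float(bits):
    """Decode the 8-bit toy format: sign, exponent sign, 2 exponent bits, 4 mantissa bits."""
    sign = -1 if bits[0] == "1" else 1
    exponent_sign = -1 if bits[1] == "1" else 1
    exponent = int(bits[2:4], 2)
    # the mantissa bits carry the weights 2^-1, 2^-2, 2^-3 and 2^-4
    mantissa = sum(int(bit) * 2 ** -(i + 1) for i, bit in enumerate(bits[4:]))
    # base 10 for the exponent, as assumed in the worked example
    return sign * 10 ** (exponent_sign * exponent) * mantissa

print(decode_toy_float("11101110"))  # approximately -0.00875
```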

When we start using numpy and pandas, we will have to consider the number of bits used to represent floating-point numbers. The most typical sizes are:

  • float16 (half precision): 16 bits
  • float32 (single precision): 32 bits
  • float64 (double precision): 64 bits

See the numpy documentation for the full list of floating-point data types.
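
To get a feel for what the number of bits changes, the sketch below stores the same value at each size and reads it back. It uses Python's standard struct module as a stand-in for numpy (the format codes "e", "f" and "d" are struct's 16-, 32- and 64-bit floats); store_and_read_back and sizes are names invented for this example.

```python
import struct

def store_and_read_back(value, code):
    """Pack value using the given struct format code, then unpack it again."""
    fmt = "<" + code
    return struct.unpack(fmt, struct.pack(fmt, value))[0]

sizes = {}
for name, code in [("float16", "e"), ("float32", "f"), ("float64", "d")]:
    sizes[name] = struct.calcsize("<" + code) * 8
    # fewer bits means the stored value drifts further from 3.14159
    print(name, sizes[name], "bits ->", store_and_read_back(3.14159, code))
```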

What about text?

The ASCII table

In the early days of computing, text was represented using the ASCII table. ASCII uses 7 bits to represent each individual character.

Here are some examples (each drawn with 8 boxes, where a leading 0 pads the 7-bit code to a full byte):

The letter ‘A’ is represented by the number 65 encoded in binary as:

\[ \fcolorbox{black}{#eeeeee}{0} \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} \fcolorbox{black}{#eeeeee}{0} \fcolorbox{black}{#eeeeee}{0} \fcolorbox{black}{#eeeeee}{0} \fcolorbox{black}{#eeeeee}{0} \fcolorbox{black}{#eeeeee}{0} \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} \]

The letter ‘a’ (lowercase) is represented by the number 97 encoded in binary as:

\[ \fcolorbox{black}{#eeeeee}{0} \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} \fcolorbox{black}{#eeeeee}{0} \fcolorbox{black}{#eeeeee}{0} \fcolorbox{black}{#eeeeee}{0} \fcolorbox{black}{#eeeeee}{0} \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} \]

The linebreak character ‘\n’ is represented by the number 10 encoded in binary as:

\[ \fcolorbox{black}{#eeeeee}{0} \fcolorbox{black}{#eeeeee}{0} \fcolorbox{black}{#eeeeee}{0} \fcolorbox{black}{#eeeeee}{0} \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} \fcolorbox{black}{#eeeeee}{0} \fcolorbox{black}{#111111}{$\textcolor{white}{1}$} \fcolorbox{black}{#eeeeee}{0} \]
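
These codes can be looked up in Python with the built-in ord() (character to number) and chr() (number to character) functions. The sketch below prints the three characters mentioned here together with their 8-digit binary form.

```python
for character in ["A", "a", "\n"]:
    code = ord(character)       # character -> number
    bits = format(code, "08b")  # number -> 8 binary digits
    print(repr(character), code, bits)

print(chr(65))  # number -> character: A
```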

UTF-8

The ASCII table is very limited: it can only represent 128 characters. UTF-8 is a more modern text representation that was built to be backward compatible with ASCII, yet it can represent over 1 million characters.

🤗 Emojis can be encoded in UTF-8 too!

The number of bits used to represent a character in UTF-8 can vary from 8 to 32 bits. The most common characters, like the English alphabet, are still represented using just 8 bits.
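
The varying size is easy to observe by encoding a few characters and counting the bytes. This is a small sketch; utf8_bits is a hypothetical helper name, and the sample characters are arbitrary choices.

```python
def utf8_bits(character):
    """Number of bits UTF-8 needs to store a single character."""
    return len(character.encode("utf-8")) * 8

# an English letter, an accented letter, a currency symbol and an emoji
for character in ["A", "é", "€", "🤗"]:
    print(character, "->", utf8_bits(character), "bits")
```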