🗓️ Week 02 – Day 01
Neat functions and tidy data

LSE ME204

17 Jul 2023

Programming Paradigms

Imperative Programming

  • When you first start learning to code, it’s likely that you’re introduced to imperative programming.
  • Imperative programming is a style of programming where you tell the computer what to do, step by step, line by line.
  • It is the style that most people associate with programming, built around control-flow constructs such as:
    • if statements
    • for loops
    • while loops
    • switch statements (in some languages, it’s case statements)
    • goto statements (many modern languages offer only restricted jumps such as break and continue)
  • It’s the style we associate with programming languages like C, Java, and also Python.

Comparison

Looping through the numbers 0 to 9 and printing them out:

Language: C

#include <stdio.h>

int main() {
    int i;
    for (i = 0; i < 10; i++) {
        printf("%d\n", i);
    }
    return 0;
}

Language: Python

for i in range(10):
    print(i)

Language: Java

public class Main {
    public static void main(String[] args) {
        for (int i = 0; i < 10; i++) {
            System.out.println(i);
        }
    }
}

Language: R

for (i in 0:9) {
    print(i)
}

Assembly Language

Now, the same loop in (x86-style) assembly language:

mov ecx, 0
print_loop:
    mov eax, ecx
    call print      ; a simplification: assumes a print routine that preserves ecx
    inc ecx
    cmp ecx, 10
    jl print_loop

Object-Oriented Programming (OOP)

  • Another common style of programming is object-oriented programming (OOP).
  • It’s common to see this pattern in the same languages that support imperative programming.
  • The core feature of OOP is the class, a blueprint for an object.
  • When you create an object, you’re creating an instance of a class, therefore you’re instantiating a class.
  • A class has attributes and methods.
    • Attributes are the data that describe an object (e.g. a name or an age).
    • Methods are the functions that an object of the class can perform.

Python vs R

A class in Python:

class Student:
    def __init__(self, name, age, height):
        self.name = name
        self.age = age
        self.height = height

    def print(self):
        print("Name:", self.name)
        print("Age:", self.age)
        print("Height:", self.height)

which can then be used like this:

specific_student = Student("John", 30, 1.8)
specific_student.print()

Python vs R (cont’d)

In R we can use S3 classes:

# Create a list (a pure R object)
specific_student <- 
    list(name = "John", age = 30, height = 1.8)

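# Assign a class to the list
class(specific_student) <- "student"

# Define a print method for that class
print.student <- function(x, ...) {
    cat("Name:", x$name, "\n")
    cat("Age:", x$age, "\n")
    cat("Height:", x$height, "\n")
    invisible(x)  # print methods conventionally return their input invisibly
}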

Or something called S4 classes:

# Create a class
setClass("student", 
    slots = c(name = "character", 
              age = "numeric", 
              height = "numeric"))

# Create an object
specific_student <- new("student", 
                        name = "John", 
                        age = 30, 
                        height = 1.8)

# Define a method
setMethod("show", "student", function(object) {
    cat("Name:", object@name, "\n")
    cat("Age:", object@age, "\n")
    cat("Height:", object@height, "\n")
})

which can be used to print the object like this:

print(specific_student) # or just `specific_student` or, if S4, `show(specific_student)`

to return:

Name: John 
Age: 30 
Height: 1.8 

More than just printing

  • We can use these objects to manipulate data.
  • For example, we can access the attributes of the object:

Python

# Get the name
specific_student.name

R (S3)

# Get the name
specific_student$name

R (S4)

# Get the name
specific_student@name

Altering the object

  • We can also alter the attributes of the object:

Python

# Change the name
specific_student.name = "Mary"

R (S3)

# Change the name
specific_student$name <- "Mary"

R (S4)

# Change the name
specific_student@name <- "Mary"

Manipulating multiple objects

  • For example, we can store several student objects in a list:
students <- list(
    new("student", name = "John", age = 30, height = 1.8),
    new("student", name = "Mary", age = 25, height = 1.6),
    new("student", name = "Peter", age = 35, height = 1.7)
)

Then we can perform operations on the list of objects:

# Get the average age
mean(sapply(students, function(x) x@age))
[1] 30
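
For comparison, here is a rough Python counterpart of the R snippet above. It is only a sketch: the Student class is a trimmed copy of the one shown earlier, and the variable name average_age is an assumption made for this illustration.

```python
from statistics import mean


class Student:
    """Trimmed copy of the Student class from the earlier slide."""

    def __init__(self, name, age, height):
        self.name = name
        self.age = age
        self.height = height


# Store several objects in a plain Python list
students = [
    Student("John", 30, 1.8),
    Student("Mary", 25, 1.6),
    Student("Peter", 35, 1.7),
]

# Get the average age by pulling the age attribute out of each object
average_age = mean([student.age for student in students])
print(average_age)  # 30
```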

Functional Programming

  • However, R is not primarily an object-oriented language.
  • R shines when it comes to functional programming.
  • Functional programming is a style where programs are built by composing functions rather than by changing state step by step.
    • Functions are first-class citizens of the language.
    • Functions can be passed around like any other object.
    • Functions can be nested within other functions.
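  • Most R objects behave as if they were immutable: modifying an object creates a modified copy instead of changing the original in place (copy-on-modify).
    • This is not the case in typical OOP languages, where objects are mutable.
  • Therefore, a function that does not depend on external state or randomness always returns the same output for the same input.
    • This is not guaranteed in OOP, where methods can change the object they belong to.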
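
The ideas in the bullets above can also be sketched in Python, the comparison language used earlier. All function names below (apply_twice, make_multiplier, with_extra_item, append_in_place) are hypothetical and exist only for this illustration.

```python
def apply_twice(func, value):
    """Take a function as an argument and apply it two times."""
    return func(func(value))


def make_multiplier(factor):
    """Return a new function that is defined inside this one."""
    def multiply(x):
        return x * factor
    return multiply


def with_extra_item(items, item):
    """Functional style: build a new list and leave the input untouched."""
    return items + [item]


def append_in_place(items, item):
    """Mutating style: changes the list that the caller passed in."""
    items.append(item)
    return items


double = make_multiplier(2)        # a function stored in a variable
print(apply_twice(double, 5))      # 20

original = [1, 2]
extended = with_extra_item(original, 3)
print(original, extended)          # [1, 2] [1, 2, 3]
```

Calling append_in_place(original, 3) instead would change original itself, which is the kind of side effect the functional style tries to avoid.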

Writing better functions with TDD

Good functions

  • Functions should be modular.
    • Functions should do one thing and do it well.
    • Functions should be relatively short.
    • Functions should have a clear purpose.
  • Functions help you avoid repetition (DRY principle).
    • If you find yourself copying and pasting your own code, it’s time to write a function.
  • Functions should be reproducible.
    • Functions should always return the same output for the same input.
    • Functions should be deterministic.
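
As a small illustration of these points, here is a Python sketch of a short, single-purpose function that replaces copy-pasted code. The helper rescale and the example data are assumptions made for this illustration, not part of the course material.

```python
def rescale(values):
    """Rescale numbers to the 0-1 range (hypothetical helper).

    Assumes at least two distinct values; the case where all values
    are equal (division by zero) is deliberately left unhandled here.
    """
    lowest = min(values)
    highest = max(values)
    return [(value - lowest) / (highest - lowest) for value in values]


ages = [25, 30, 35]
heights = [1.6, 1.7, 1.8]

# One function, reused for every column instead of copy-pasting the formula
print(rescale(ages))  # [0.0, 0.5, 1.0]
print(rescale(heights))
```

The function does one thing, is only a few lines long, and returns the same output whenever it receives the same input.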

Test-driven development

There are many approaches to writing functions. I am going to teach you the test-driven development (TDD) way.

  • Write a unit test for the function.
    • A unit test is a test that checks a small piece of code.
  • Write the function.
  • Run the test.
  • If the test fails, fix the function and re-run the test.
  • Repeat until the test passes.

Conceptual Example: mean()

  • Let’s write a function that calculates the mean of a vector of numbers.
  • We’ll call it my_mean().
my_mean <- function(x) {
    # Calculate the mean of x
}
  • Before filling in the body of the function, we write a test for it.
# Test 1
my_mean(c(1, 2, 3)) == 2
  • Now we can write the function.
my_mean <- function(x) {
    sum(x) / length(x)
}
  • Now we can run the test.
my_mean(c(1, 2, 3)) == 2
  • The test passes, so we’re done.
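
The same cycle can be sketched in Python. This is only one possible way to organise it: the wrapper test_my_mean and tests 2 and 3 are assumptions added for illustration, and only test 1 comes from the example above.

```python
def my_mean(x):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(x) / len(x)


def test_my_mean():
    # Test 1: the case from the example above
    assert my_mean([1, 2, 3]) == 2
    # Test 2 (assumed extra case): a single value is its own mean
    assert my_mean([5]) == 5
    # Test 3 (assumed extra case): symmetric values average to zero
    assert my_mean([-1, 1]) == 0


# Running the tests raises AssertionError as soon as one of them fails
test_my_mean()
print("all tests passed")
```

Collecting the checks in one function makes it easy to re-run all of them after every change to my_mean. An empty input would raise ZeroDivisionError here; in the TDD spirit, whether that is the desired behaviour would be decided by writing a test for it first.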

Real example: Scraping AZ Lyrics

Task: write a function that scrapes the lyrics of any given song (URL) from AZ Lyrics.

🧑‍💻 Live coding of unit tests + function + debugging + documentation