🗓️ Week 02
Operating Systems, Files & The Terminal

DS105 Data for Data Science

10/4/22

Operating Systems

What Operating Systems (OS) do

  • A computer can be divided into four parts:
    • hardware — provides the basic computing resources for the system
    • application programs — define how these resources are used
    • operating system — controls the hardware and coordinates its use among the various application programs for the various users
    • user — a person or a bot (a computer script) that requests actions from the computer.

[Diagram: User → Application Programs (compilers, web browsers, development kits, etc.) → Operating System → Computer Hardware (CPU, memory, I/O devices, etc.)]
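The layering can be observed from inside a program: an application does not talk to the hardware directly, it asks the operating system. A minimal Python sketch using only the standard library (the function name `describe_machine` is made up for this illustration):

```python
import os
import platform


def describe_machine():
    """Ask the operating system (not the hardware directly) about this computer."""
    return {
        "os_name": platform.system(),      # e.g. "Linux", "Darwin" (macOS) or "Windows"
        "os_release": platform.release(),  # version string reported by the OS
        "cpu_count": os.cpu_count(),       # logical CPUs the OS reports (may be None)
        "process_id": os.getpid(),         # ID the OS assigned to this running program
    }


if __name__ == "__main__":
    for key, value in describe_machine().items():
        print(f"{key}: {value}")
```

The values differ from machine to machine, which is the point: the program only sees what the OS chooses to report.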

Insight into operating systems


“An operating system is similar to a government. Like a government, it performs no useful function by itself. It simply provides an environment within which other programs can do useful work.”

(Silberschatz, Galvin, and Gagne 2005, chap. 1)


  • If this sounds a bit vague, it is because it is!
  • It is actually tricky to specify which programs are part of the OS and which ones are not.
  • Let’s try to define what an OS is anyway ⏭️

Definition of OS

  • The OS is the one program running at all times on the computer.
    • This program is usually called the kernel
  • There might be other programs running alongside the OS.
    • For example, the Terminal (more on that in a minute)
  • 📱 Mobile computers usually have more “additional” software alongside the kernel, which we call the middleware.
    • These applications support multimedia, graphics, internal app databases, etc.

Why bother with this?

  • It is improbable you will ever need to interact with the kernel directly.
  • But we often need to install custom software to perform some data analysis
    • This software might not come from the Apple App Store or the Microsoft Store.
    • Those are things you have to install “manually.”

Tip

Let’s face it, you will always encounter puzzling error messages when programming, no matter how senior or skilled you are.

Understanding a little about how everything is tied together will help you get to the core of the problem more quickly.

History of Operating Systems

History

  • In the early days of modern computing, when computers were not accessible to everyone, software (applications) typically came with its source code open.
  • Open source means you can read precisely which instructions the computer will follow when running the program.
  • As the industry grew, most software companies released only the binaries: files you can execute but cannot read as if they were text.
    • This includes Operating Systems! ⏭️

A computer from the 1950s.
(Computer History Museum n.d.)

UNIX

  • UNIX was one of the first widely adopted Operating Systems, developed at AT&T’s Bell Labs
  • It aimed to be simple* and easy to port to any hardware architecture
  • But, it required a license
  • During the 1980s and early 1990s, a group of hackers and activists developed free and open-source alternatives to UNIX.

What UNIX System III looks like.

GNU/Linux

  • This led to the birth of one of the most influential operating systems: GNU/Linux, or simply Linux.
  • Android, the most popular OS for phones worldwide, is based on Linux.
  • Two people were instrumental to the development of Linux
    • Richard Stallman
    • Linus Torvalds

Note

GNU stands for “GNU’s Not Unix”. Computer nerds love a recursive joke.

[Images: a picture of Richard Stallman; a picture of Linus Torvalds]

macOS

  • macOS is the Operating System of Apple computers
  • It is a hybrid system. It has a free, open-source component called Darwin, but it also includes proprietary, closed-source components.
  • iOS, Apple’s mobile operating system, is also based on Darwin
  • Darwin is based on BSD UNIX, a derivative of the original UNIX system.

Windows

  • Windows has its own history.
  • Microsoft and IBM co-developed the OS/2 operating system, a predecessor of Windows NT.
  • But then Microsoft went its own way and developed its own versions of the OS: Windows NT, Windows 95, Windows 98, Windows 2000, Windows XP, Windows Vista*, Windows 7, etc.
  • Windows’ popularity can be traced to the success of the Office suite

Virtualization

  • Virtualization is a technology that creates the illusion that you are running a separate private computer.
  • You decide how much of your CPU/RAM/Hard drive to share with the virtual machine

Emulators & Virtual Machines

  • You can install an emulator to run Windows inside macOS (and vice versa)
    • Provided you own a licence to install the other OS
  • You can share files to and from the virtual machine inside the emulator, but the internal machine will “think” it is a separate computer.

Note

  • In the 🖥️ labs on 🗓️ Week 03, you will access a virtual machine that lives in the cloud
  • Examples of commercial virtualization software

Windows Subsystem for Linux (WSL)

  • In an attempt to entice Linux users (especially developers), Microsoft added a feature to Windows for running a Linux environment, named “Windows Subsystem for Linux”
  • You install your preferred Linux distribution
    • Ubuntu is one of the most popular

Tip

  • Our 🖥️ labs on Weeks 2 & 3 will focus on Linux/UNIX-like commands.
  • Windows users will have to install WSL on their computers.

The Terminal

  • A terminal, or command prompt, is a screen or a window that lets you access the Operating System’s input and output.
  • There are no graphics (images/video) in the terminal, only text.

Shell

  • Typically, the terminal runs a program (app) called the shell.
  • The shell awaits, interprets, processes, executes, and responds to commands typed in by the user.
  • There are many shells, each with its own features.
  • Popular Linux shells:
    • sh or the Bourne shell: developed at AT&T’s Bell Labs in the 1970s by Stephen Bourne.
    • bash or the Bourne Again SHell: very popular, compatible with sh shell scripts.
      • Our 🖥️ labs will focus on bash
    • ksh or the Korn shell: provides enhancements over sh and is largely compatible with bash.
    • csh and tcsh: shells that have a syntax similar to the programming language C.
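To make the “interprets, executes, and responds” steps concrete, here is a rough Python sketch of a single shell step: split the typed line into words, run the named program, and hand back what it printed. This is not how bash is implemented; `run_command` is a hypothetical helper, and real shells do much more (variables, pipes, wildcards, and so on).

```python
import shlex
import subprocess
import sys


def run_command(line):
    """Interpret one command line and execute it, roughly as a shell would."""
    words = shlex.split(line)  # interpret: "ls -l /tmp" -> ["ls", "-l", "/tmp"]
    if not words:
        return ""              # empty line: nothing to do
    # execute: start the program named by the first word, passing the rest as arguments
    result = subprocess.run(words, capture_output=True, text=True)
    return result.stdout       # respond: the text the program printed


if __name__ == "__main__":
    # Use the Python interpreter itself as the program so the example runs on any OS.
    line = shlex.join([sys.executable, "-c", "print(2 + 3)"])
    print(run_command(line))
```

On a Linux or macOS machine you could also try `run_command("ls -l")`; a real shell would repeat this step in a loop, waiting for the next line each time.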

Windows CMD vs PowerShell

  • As mentioned before, Windows has its own thing.
  • There are two main terminals/shells on Windows these days

CMD

PowerShell

Files & Filesystems

What are files?

  • Ultimately, everything in a computer is just a bunch of 0s and 1s
  • Files are sets of 0s and 1s stored under agreed conventions (formats) that tell us how to extract information from them.
  • Let’s see where these ideas come from ⏭️
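One way to see that a file format is “just a convention”: the same bytes can be read as text, as numbers, or as raw bits, depending on the rule we apply. A small Python sketch (the three byte values are chosen purely for illustration):

```python
data = bytes([72, 105, 33])  # three bytes, i.e. 24 zeros and ones

# Convention 1: treat each byte as a character code (ASCII/UTF-8 text).
as_text = data.decode("utf-8")

# Convention 2: treat each byte as a small integer.
as_numbers = list(data)

# Convention 3: show the raw bits, eight per byte.
as_bits = " ".join(f"{byte:08b}" for byte in data)

print(as_text)     # Hi!
print(as_numbers)  # [72, 105, 33]
print(as_bits)     # 01001000 01101001 00100001
```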

Structured data: Index cards

  • Origins in the 18th century, with botanist Carl Linnaeus, who needed to record the species that he was studying

  • This was a form of database

    • each piece of information about a species formed a field
    • each species’ entry in the system formed a record
    • the records were indexed using some reference system
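The same three ideas (field, record, index) map directly onto everyday data structures. A minimal Python sketch, where the card contents and the `id` reference scheme are invented for illustration:

```python
# Each record is one "card"; each key inside it is a field.
records = [
    {"id": "A1", "species": "Bellis perennis", "common_name": "daisy"},
    {"id": "A2", "species": "Rosa canina", "common_name": "dog rose"},
]

# The index maps a reference (here the hypothetical "id" field) to its record,
# so a card can be found without reading every card in the box.
index = {record["id"]: record for record in records}

print(index["A2"]["species"])  # Rosa canina
```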

Heyday: Use in libraries to catalog books

[Images: a card catalog; a card catalog room]

A record looked like this

Dewey decimal system

  • a proprietary library classification system first published in the United States by Melvil Dewey in 1876
  • scheme is made up of ten classes, each divided into ten divisions, each having ten sections
  • the system’s notation uses Arabic numerals, with three digits making up the main classes and sub-classes and decimals creating further divisions
  • Example:

    500 Natural sciences and mathematics
        510 Mathematics
            516 Geometry
                516.3 Analytic geometries
                    516.37 Metric differential geometries
                        516.375 Finsler Geometry
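Because each digit narrows the topic, the full chain of classes can be recovered from the number alone. A short Python sketch, assuming the number always has a three-digit whole part (the function name `dewey_ancestors` is made up for this example):

```python
def dewey_ancestors(number):
    """Return the chain of classes from the main class down to `number`."""
    whole, _, decimals = number.partition(".")
    chain = []
    # Main class (e.g. 500), division (e.g. 510), section (e.g. 516).
    for code in (whole[0] + "00", whole[:2] + "0", whole):
        if code not in chain:  # e.g. "500" is both a class and its own first division
            chain.append(code)
    # Each extra decimal digit is one further subdivision.
    for i in range(1, len(decimals) + 1):
        chain.append(f"{whole}.{decimals[:i]}")
    return chain


print(dewey_ancestors("516.375"))
# ['500', '510', '516', '516.3', '516.37', '516.375']
```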

Hierarchical directory structure

  • This kind of hierarchical structure is still present in all modern OSes
  • A directory, or folder, is a place where many files are stored
  • In theory, a directory can contain any number of sub-directories and files

UNIX directory tree

    /
        bin
        dev
        etc
        home
            jonathan
                Documents
                    Workspace
                        lse-ds105-course-notes
                Images
                Videos
                Downloads
        lib
        mnt
        proc
        root
        sbin
        tmp
        usr
            lib
            bin
            include
        var
            log
            mail
            spool
            tmp
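A path is simply the list of directories you pass through on the way down from the root. A small Python sketch using the standard `pathlib` module and one path taken from the tree above:

```python
from pathlib import PurePosixPath

# PurePosixPath only manipulates the text of a UNIX-style path;
# the directory does not need to exist on your computer.
path = PurePosixPath("/home/jonathan/Documents/Workspace/lse-ds105-course-notes")

print(path.parts)
# ('/', 'home', 'jonathan', 'Documents', 'Workspace', 'lse-ds105-course-notes')
print(path.parent)  # /home/jonathan/Documents/Workspace
print(path.name)    # lse-ds105-course-notes
```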

After the ☕ break:

  • Let’s open the Terminal!
  • Basic shell commands
  • A tour around vim, a file editor
  • Common file formats
  • How to prepare for the lab this week

References

Computer History Museum. n.d. “1950 Timeline of Computer History.” 1950 Timeline of Computer History. Accessed September 16, 2022. https://www.computerhistory.org/timeline/1950/.
Ebrahim, Mokhtar, and Andrew Mallett. 2018. Mastering Linux Shell Scripting: A Practical Guide to Linux Command-Line, Bash Scripting, and Shell Programming, 2nd Edition. 2nd ed. Birmingham: Packt Publishing.
Pelz, Oliver. 2018. Fundamentals of Linux: Explore the Essentials of the Linux Command Line. Birmingham: Packt Publishing Ltd.
Silberschatz, Abraham, Peter B. Galvin, and Greg Gagne. 2005. Operating System Concepts. 7th ed. Hoboken, NJ: J. Wiley & Sons.