🛣️ Week 02 - Running commands on a remote computer

Lab Roadmap (90 min)

Published: 22 January 2024

Welcome to your second DS105 lab.

Last week, you navigated your computer using the bash shell. Now it is time to go beyond your machine and get to the cloud! As you saw in the lecture, accessing the cloud opens up various possibilities for us as data scientists. This week, you will access a remote machine without leaving your terminal.

🥅 Learning Objectives

  • Connect to a remote machine using SSH.
  • Identify when you are inside your own computer and when you are inside a remote machine.
  • Navigate the remote machine.
  • Transfer files between your local machine and the remote machine.

📋 Lab Tasks

No need to wait! Start reading the tasks and tackling the 🎯 ACTION POINTS below when you come to the classroom.

Part 0: Export your chat logs (~ 3 min)

As part of the GENIAL project, we ask that you fill out the following form as soon as you come to the lab:

🎯 ACTION POINTS

  1. 🔗 CLICK HERE to export your chat log.

    Thanks for being GENIAL! You are now one step closer to earning some prizes! 🎟️

👉 NOTE: You MUST complete the initial form.

If you really don’t want to participate in GENIAL [1], just answer ‘No’ to the Terms & Conditions question. Your e-mail address will then be deleted from GENIAL’s database the following week.

Part I: Acquire your credentials (15 min)

To access a remote computer, we will need to use a secure connection. For that, we will use the Secure Shell (SSH) protocol, a cryptographic network protocol for operating network services securely over an unsecured network. It is a standard for remote login and command execution on a remote machine.

🎯 ACTION POINTS

  1. You should have ssh installed on your computer. To check, open your terminal (bash, zsh or even PowerShell) and type:

    ssh -V

    This will print out the version of SSH you have installed. For example:

    (base) jon@LSE-DSI DS105 % ssh -V
    OpenSSH_9.0p1, LibreSSL 3.3.6

    Your version may be different, but as long as you see something like the message above, you’re good. This confirms your shell is ready to connect to the cloud machine we will use in the lab.

  2. Now, inform your class teacher of your candidate number. They will use this number to create your cloud credentials. Don’t share it with anyone else!

“Your candidate number is a unique five-digit number that ensures that your work is marked anonymously. It is different from your student number and will change every year. Candidate numbers can be accessed using LSE for You.”

Source: LSE

  3. 🧑🏻‍🏫 TEACHING MOMENT: Now, wait a bit! Your class teacher will input your candidate number into a script, granting you access to our cloud machine. To log in, you need a username in the following format: student_<CANDIDATE_NUMBER>, where <CANDIDATE_NUMBER> represents your LSE CANDIDATE NUMBER. For example, if your candidate number is 78451, your username will be student_78451.
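As a small illustration of the username format described above, the sketch below builds the username in bash from the example candidate number 78451. The variable names are made up for this example; only the student_ prefix comes from the instructions.

```shell
# Example candidate number from the text above; replace it with your own.
candidate_number="78451"

# Usernames follow the pattern student_<CANDIDATE_NUMBER>.
username="student_${candidate_number}"

echo "$username"   # prints student_78451
```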

Part II: Connect to the cloud (20 min)

Time to connect to the cloud! 🌩️

🎯 ACTION POINTS

  1. Launch your bash shell.

  2. Type the following command after replacing <username> with your username and <host_address> with the hostname of the cloud machine, which your class teacher will display on the screen.

    ssh <username>@<host_address>

  3. Enter your password (which is, at this point, the same as your username).

    The first time you log in, you must set a new password for your account. Read the instructions on the screen carefully and follow them to create it.

    💡 TIP: If you wish to close your connection to the remote machine at any point, type the command exit and hit Enter.
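One way to tell which machine your shell is currently on is to ask it who you are and what the machine is called. The sketch below uses whoami and uname -n, both standard commands; the variable names are only for illustration. Try it once before connecting and once after: on the cloud machine, the user should be your student_ username, and the machine name should differ from your own computer's.

```shell
# Who am I logged in as on this machine?
current_user="$(whoami)"

# What is this machine's network name?
current_machine="$(uname -n)"

echo "You are ${current_user} on ${current_machine}"
```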

Part III: Explore the cloud (20 min)

Now that you are connected to the cloud machine, let’s explore it a bit. Ask your class teacher if you are unsure if you’ve followed the instructions correctly.

🎯 ACTION POINTS

  1. Use the pwd command and the $HOME environment variable (for example, echo $HOME) to find the path to your home directory on this new machine.

  2. Does your cloud machine have any files stored? Maybe any hidden files?

    Feel free to consult the materials from last week.

  3. Can you access the root directory of the machine? If so, what are the folders inside the root directory?

  4. Go back to your home directory to continue the exercises.
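The exploration steps above could look roughly like the following sketch. It only uses commands from last week plus echo, and the output will depend on the machine you run it on. The variable names are made up for this example.

```shell
# Where am I right now?
pwd

# $HOME is a variable, not a command, so print its value with echo.
home_dir="$HOME"
echo "$home_dir"

# List everything in the home directory, including hidden entries.
ls -a "$home_dir"

# List the folders at the root of the file system.
root_listing="$(ls /)"
echo "$root_listing"

# Return to the home directory.
cd ~
```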

Let’s create some files

  1. Let’s create a hidden folder just for fun. In your home directory, create a directory called .week02 (the dot at the beginning of the name is important).

  2. Inside this folder, create a file called secret_restaurant.txt

  3. Open it with nano and add the name of your favourite place to eat in London in the file.

  4. Still inside .week02, create another file called secret_address.txt.

  5. Save the address of the same place there.

  6. Confirm that the .week02 folder, and with it the files inside, is hidden by comparing the output of ls and ls -a:

    cd ~
    ls
    ls -a

    Note: hidden files are not protected in any way. They are simply left out of default listings, such as a plain ls or a file explorer view.
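To see the hiding behaviour in isolation, here is a minimal sketch of the same steps. As an assumption for this example, it works in a throwaway directory created with mktemp -d rather than in your real home directory, and it creates the files empty with touch instead of editing them in nano.

```shell
# Throwaway directory standing in for your home directory (an assumption for this sketch).
sandbox="$(mktemp -d)"

# Create the hidden folder and the two empty files inside it.
mkdir "$sandbox/.week02"
touch "$sandbox/.week02/secret_restaurant.txt" "$sandbox/.week02/secret_address.txt"

# A plain ls skips names that start with a dot; ls -a includes them.
plain_listing="$(ls "$sandbox")"
full_listing="$(ls -a "$sandbox")"

echo "ls shows:    ${plain_listing:-nothing}"
echo "ls -a shows:"
echo "$full_listing"
```

Note that only the folder name starts with a dot. The two text files are out of sight simply because they live inside the hidden folder.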

Part IV: A bridge between worlds (~30 min)

🧑🏻‍🏫 TEACHING MOMENT: Your class teacher will explain the scp command.

In the first steps of this lab, we have established a secure connection between our computer and the cloud machine. We can use a secure connection for more than just sending commands; we can also send and receive files. Let’s see how it works:

Generally, the scp command works in the following way:

scp location_1/file location_2/file

This copies the file from location_1 to location_2.

Now, it’s time to get some practice!

🎯 ACTION POINTS

  1. Exit the cloud machine using the following command:

    exit

  2. To copy the secret_restaurant.txt file from your cloud machine, use the following command. Make sure to replace every <username> with your username and <server-url> with the address of the cloud machine you used in Part II.

    scp <username>@<server-url>:/home/<username>/.week02/secret_restaurant.txt ./my_secret_restaurant.txt

⚠️ IMPORTANT!

We encourage you to pay attention to all the little details here:

  • The : symbol separates the machine from the path. Left of the colon, we specify the username and the hostname. Right of the colon, we specify the full path to the file on the remote machine.

  • The scp command takes two locations: copy the file from the path indicated on the left to the place on the right, after the space. We don’t need a username or hostname on the right because we are copying the file to where we are - our local machine.

  • Remember last week when you typed ls .? The dot symbol (.) is a placeholder for the current directory.

  • Notice that the new file has a different name than the original one. When we copy a file, there is no need to keep the same name. We can rename it as we copy it.
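To make the anatomy of a remote location concrete, the sketch below takes one apart using bash parameter expansion. The hostname cloud.example.org is a made-up placeholder, the username is the example from Part I, and the variable names are only for illustration. Nothing is copied here; the snippet just splits the text at the @ and : symbols.

```shell
# Hypothetical remote location; cloud.example.org is a placeholder hostname.
remote="student_78451@cloud.example.org:/home/student_78451/.week02/secret_restaurant.txt"

# Everything left of the first colon says who and where.
login="${remote%%:*}"
# Everything right of the first colon is the path on that machine.
remote_path="${remote#*:}"

# The login part splits again at the @ sign.
remote_user="${login%%@*}"
remote_host="${login#*@}"

echo "user: $remote_user"
echo "host: $remote_host"
echo "path: $remote_path"
```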

  3. Use ls and cat to check if the file was copied correctly.

  4. Copy the secret_address.txt file from the cloud machine to your local machine:

    scp <username>@<server-url>:/home/<username>/.week02/secret_address.txt .

    This time, we only used the dot symbol (.) in the second location. This means we want to copy the file to the current directory and keep the same name.

  5. But what if we wanted to copy all the files from the folder, or the folder itself? In that case, we don’t need to spell out the path to each file. To copy the folder itself, point scp at the folder and add the -r option:

    scp -r <username>@<server-url>:~/.week02 .

    This way, we copy the entire .week02 folder. The -r option stands for recursive and indicates that we want to copy the folder and all its files.

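    Should you wish to copy the files, not the folder itself, end the path with an asterisk (*) instead of a file name:

    scp -r "<username>@<server-url>:~/.week02/*" .

    The quotes keep your local shell from trying to expand the asterisk itself (zsh, for example, would otherwise stop with a “no matches found” error).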

    Make sure both commands work for you.
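The difference between copying a folder and copying its contents can also be tried out without a network connection. The sketch below is an analogy, not the scp commands themselves: it uses cp -r between throwaway directories created with mktemp -d, and all directory names and file contents in it are made up for this example.

```shell
# Throwaway workspace; the "cloud" folder stands in for the remote home directory.
work="$(mktemp -d)"
mkdir -p "$work/cloud/.week02" "$work/copy_folder" "$work/copy_files"
echo "example restaurant" > "$work/cloud/.week02/secret_restaurant.txt"
echo "example address" > "$work/cloud/.week02/secret_address.txt"

# Copy the folder itself: the destination ends up containing .week02.
cp -r "$work/cloud/.week02" "$work/copy_folder/"

# Copy only the files: the asterisk expands to every file inside .week02.
cp -r "$work/cloud/.week02/"* "$work/copy_files/"

ls -a "$work/copy_folder"
ls -a "$work/copy_files"
```

The first listing should show the .week02 folder, while the second should show the two text files directly.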

Sending it back

We have just learned how to copy files from the cloud. Today’s last task is sending files to the cloud machine.

  1. Create a file called secret_dish.txt and save the name of your favourite dish from the restaurant you mentioned in the previous steps.

  2. 🤔 Stop for a second. Can you guess how to copy this file to your cloud machine using your already-acquired knowledge? Try it out before looking at the solution.

Solution

Use the command below to copy your file to the cloud machine. Make sure to replace every <username> with your actual username.

scp secret_dish.txt <username>@<server-url>:/home/<username>/.week02/

  3. Log in to your virtual machine and check if the file is there.

🏡 Take-home Tasks

It can wait

You are now able to connect to the cloud machine, navigate it, and exchange files. Now for the most exciting part: running code on the cloud! Suppose you need to process millions of rows of data; that could take ages on your computer. Instead of tying up your computer’s resources, you can run your scripts on the cloud.

Let’s do it, but before that, we will learn how to do an exciting trick.

🎯 ACTION POINTS

  1. Launch a shell on your local machine.

  2. Navigate to a directory of your choice or create one.

  3. Create a new file called waiting.py and include the following code inside:

    import time
    time.sleep(10)
    
    print('The waiting is complete.')

    This script waits for 10 seconds and then prints ‘The waiting is complete.’ That might not seem very useful yet. However, let us show you something…

  4. Run this Python script in the following way (on some systems, the command is python3 rather than python):

    python waiting.py

    What did it do? Hopefully, exactly what we expected: it waited for 10 seconds and then printed one sentence. You might have noticed that you could not execute commands while it was running. But what if we could?

  5. Try running the following command:

    python waiting.py &

    Do you see the difference? Can you now run whoami or ls while we wait for the code to run?

Well done! Now, you have learned how to create and run Python scripts in your terminal and do things in parallel. Shall we try it in the cloud?
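To see what the & operator does in a bit more detail, here is a minimal sketch. As an assumption, it uses sleep 2 as a stand-in for python waiting.py so that it runs without Python. $! and wait are standard shell features: the first holds the process ID of the most recent background job, and the second pauses until that job has finished. The variable names are made up for this example.

```shell
# Start a 2-second task in the background; the shell does not wait for it.
sleep 2 &
background_pid=$!

# This line runs immediately, while the task is still sleeping.
echo "Started background task with PID ${background_pid}"

# kill -0 sends no signal; it only checks whether the process still exists.
if kill -0 "$background_pid" 2>/dev/null; then
    still_running="yes"
else
    still_running="no"
fi
echo "Still running? ${still_running}"

# Block until the background task has finished, then record its exit status.
wait "$background_pid"
background_status=$?
echo "Background task finished with status ${background_status}"
```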

More principled script running

Let’s now explore the same operations in the cloud.

🎯 ACTION POINTS

  1. Connect to the virtual machine.

  2. Create a folder called test_code.

  3. Create a file called test.py or test.R depending on what language you want to use (for Python and R, respectively).

  4. In the file:

    • create a variable called age
    • assign your age to it
    • make the machine wait for 5 seconds (time.sleep(5) in Python, after import time, or Sys.sleep(5) in R)
    • use the print() function to print your age

  5. Use the following command to execute your script on the cloud (for an R script, assuming R is installed on the machine, the equivalent would be Rscript test.R):

    python3 test.py

  6. Does it print your age?

  7. Go ahead and experiment with using the & operator. It comes in handy if you want your cloud machine to continue running without you constantly monitoring it.

You can check out the tutorial on how to run Python scripts or R scripts to help you.

Footnotes

  1. We’re gonna cry a little bit, not gonna lie. But no hard feelings. We’ll get over it.