💻 Week 08 - Class Roadmap (90 min)

2022/23 Lent Term

Author

DS101L 2022/23 Teaching Team

Published

06 March 2023

📚 Learning Objectives

This week, we will explore more of the power of markdown and R. We will learn about Quarto, a recent development in the world of R that allows us to create documents that are both reproducible and interactive. We will also learn how to use Zotero to manage our references.

⚙️ Setup

You need the following installed:

🛣️ Roadmap

Step 1: Collecting and saving academic references

Zotero is a free, open-source reference management software that allows you to collect, organize, cite, and share your research sources. It lives as an extension in your browser and allows you to save references from online databases, websites, and books. It also has a desktop application that allows you to organize your references into collections and create bibliographies.

🎯 ACTION POINTS:

  1. Create a Zotero account.
  2. Install the Zotero extension in your browser.
  3. Save a few relevant references to your account. (Check out the 📔 Syllabus for some suggestions.)
  4. Export your references as a .bib file and save it in your DS101L folder.

Step 2: Get set up with RStudio and Quarto

🎯 ACTION POINTS:

  1. Install Quarto.
  2. Open RStudio and create a new project.
  3. Download DS101L_2022_23_W08_lab.qmd and open it in RStudio.

You will see a “chunk” of R code that loads all the libraries you need to finish the task. It also creates a function, avg_gdppercap. Don’t worry, you won’t need to understand what is going on “under the hood”. All you need to know is that if you give it a year, it returns the average global GDP per capita in that year. There is also a subset of the data containing China's population figures, chinese_pop.
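For the curious, here is a minimal sketch of how a function like avg_gdppercap could work. It is not the code from the lab file: the data frame toy_gdp, its column names, its made-up values, and the name avg_gdppercap_sketch are all assumptions chosen for illustration.

```r
# Toy stand-in data. The lab file loads its own data set;
# every value below is made up for illustration only.
toy_gdp <- data.frame(
  country   = c("A", "B", "C", "A", "B", "C"),
  year      = c(2002, 2002, 2002, 2007, 2007, 2007),
  gdpPercap = c(1000, 3000, 5000, 2000, 4000, 9000)
)

# Hypothetical re-implementation: keep the rows for the requested
# year, then average GDP per capita across those rows.
avg_gdppercap_sketch <- function(selected_year, data = toy_gdp) {
  rows <- data$year == selected_year
  if (!any(rows)) {
    stop("No observations for year ", selected_year)
  }
  mean(data$gdpPercap[rows])
}

avg_gdppercap_sketch(2007)  # the mean of 2000, 4000 and 9000
```

The sketch uses a different name on purpose, so that running it does not overwrite the avg_gdppercap function already defined in the lab file.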

Note

You will see that there is a series of short headings and sentences without any formatting. We are going to turn this into an appropriately formatted document with three sections.

There are many ways to format your documents in Quarto. Please refer to this page as a user guide throughout.

Step 3: Formatting your document - Section 1 (30 min)

Let’s start by formatting the first section of the document, Section 1: Understanding tidy data.

🎯 ACTION POINTS:

  1. Turn the line “Understanding tidy data” into a section header.

  2. Open Zotero and create a new collection to hold your bibliography.

  3. Add Wickham’s article to this collection:

    • https://www.jstatsoft.org/article/view/v059i10
    • Include the name of the author, title, journal, volume, issue, page range, and DOI.
  4. Right-click on the collection and select “Export Collection”. Choose the BibLaTeX format and save the file as references.bib in the same directory as the Quarto document.

  5. Add the following to the YAML header:

    bibliography: references.bib
  6. Click on references.bib to see the reference Zotero created for the Wickham article.

  7. Cite the article directly in the text using @ArticleReference, where ArticleReference stands for the citation key shown in references.bib.

  8. Turn the principles of tidy data into bullet points.

Step 4: Formatting your document - Section 2 (30 min)

Now, let’s format the second section of the document, Section 2: Global trends in GDP per capita.

🎯 ACTION POINTS:

  1. Turn the line “Global trends in GDP per capita” into a section header.
  2. Remember the function avg_gdppercap? Well, we can use this function (and any R function for that matter) when writing up our results.
  3. Check out the following documentation:
    • https://rmarkdown.rstudio.com/lesson-4.html
  4. Now use inline R code to replace the ellipses (…) with the dollar amounts using the avg_gdppercap function.
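As a rough sketch of the idea, the R expression you place inline can also format the number as a dollar amount. The helper format_dollars and the stand-in avg_gdppercap_demo (with made-up values) are assumptions for illustration only; in the lab file you would call the real avg_gdppercap instead.

```r
# Hypothetical stand-in so this snippet runs on its own.
# The values are made up; the lab file defines the real avg_gdppercap().
avg_gdppercap_demo <- function(year) {
  toy_values <- c("1952" = 1500.5, "2007" = 12000.25)
  unname(toy_values[as.character(year)])
}

# Hypothetical helper: two decimal places, comma separators, dollar sign.
format_dollars <- function(x) {
  paste0("$", formatC(x, format = "f", digits = 2, big.mark = ","))
}

format_dollars(avg_gdppercap_demo(2007))

# In the text of the .qmd file, an inline expression sits between
# backticks and starts with the letter r, for example:
#   The average was `r format_dollars(avg_gdppercap(2007))` in 2007.
```

When the document is rendered, the inline expression is replaced by its result, so the figure in the text always matches the data.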

Step 5: Formatting your document - Section 3 (15 min)

Finally, let’s format the third section of the document, Section 3: Population growth in China.

🎯 ACTION POINTS:

  1. Turn the line “Population growth in China” into a section header.
  2. As per the previous section, use chinese_pop to replace the ellipses (…) with the population figures.
  3. Create a new R code chunk.
  4. Check out the knitr::kable function by typing ?kable into the console in RStudio. This will open the documentation for the function, including all the parameters that can be used.
  5. Use chinese_pop as the first argument in kable. Technically, this is all you need for it to run, but the table looks a bit basic.
    • Change the variable names year and pop to something more professional-looking.
    • Left align the entries.
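One possible way to approach these last steps is sketched below. The data frame chinese_pop_demo is a stand-in with made-up values, and the assumption that it has exactly two columns named year and pop comes from the variable names mentioned above. The col.names and align arguments are described in the kable documentation.

```r
library(knitr)

# Toy stand-in for chinese_pop. The values are made up;
# the lab file provides the real data.
chinese_pop_demo <- data.frame(
  year = c(1997, 2002, 2007),
  pop  = c(1230075000, 1280400000, 1318683096)
)

pop_table <- kable(
  chinese_pop_demo,
  col.names = c("Year", "Population"),  # replaces the raw names year and pop
  align = "ll"                          # one letter per column: l = left
)

pop_table
```

In your own document, put the kable call inside the new R code chunk and swap chinese_pop_demo for chinese_pop.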