Week 05 - Exploratory data analysis and visualisation
2024/25 Autumn Term
The lecture slides have been updated to include some more explanations and clarifications.
This week, we'll dive a bit deeper into exploratory data analysis. During the classes, it'll be time for your group presentations (and the first summative for this course!).
Lecture Slides
Either click on the slide area below or click here to view it in fullscreen. Use your keypad to navigate the slides. You can also find a PDF version on Moodle.
Looking for lecture recordings? You can only find those on Moodle.
Quarto/Zotero tutorial (Preparation for the formative coursework)
Step 1: Install the necessary software and test your installation. See the installation guide
Step 2: Work through the Quarto/Zotero tutorial
Step 3: Check the solutions to the tutorial here
You're now ready to continue on to your formative.
Formative Coursework
- You can choose to write it as an individual or in pairs.
- This coursework is not graded but it is good practice for your Quarto skills and I will provide feedback on your notes if you submit them via Moodle.
- Create a Quarto Markdown file and name it LSE_DS101A_2024_25_W05_formative.qmd.
- Because this is a formative assessment, submission is not anonymous. Therefore, please include your name(s) in the document.
- Try to format this document with Quarto Markdown for a bit of practice.
Task 1:
- Read the following articles: (Aschwanden 2015) and (Greenland et al. 2016).
- Now answer the following questions:
  - How are p-values misused/misinterpreted?
  - What is p-hacking? What are the consequences of p-hacking?
  - How should researchers avoid p-hacking?
Task 2:
- Now, think back to the countries of the world discussed in the Week 05 lecture.
- Suppose you were to create a linear model to predict the dependent variable GDP per capita.
- Now answer the following questions:
  - What would you use as independent variables?
  - How would you handle missing data and outliers?
  - What would be your null and alternative hypotheses?
  - How would you avoid p-hacking?
Task 3:
- Read the following article: (Hohl 2009)
- Now answer the following questions:
  - According to the article, should linear regression be used as a matter of routine? Why or why not?
  - What does the article suggest as an alternative to linear regression? Why?
Submission
- After you are done writing, render your document to HTML and submit it on Moodle. See this guide for a quick tutorial on how to preview and render your documents to HTML in VSCode.
- The preferred submission is a single HTML document (rendered from Quarto Markdown) that contains the answers to all questions for this formative (including the code-related ones). However, if you are unsure how to embed Python code into a Quarto Markdown file, you may save the code in a Jupyter notebook (.ipynb extension), put both the HTML file with your non-code answers and the notebook into a .zip archive, and upload that archive to Moodle.
As shown in the Quarto/Zotero tutorial, you can use the VSCode terminal to preview or render your .qmd document:
- To open the VSCode terminal, go to the VSCode menu and click on Terminal > New Terminal. A new terminal will open.
- In the new terminal, check the contents of the folder you are currently in by typing the ls command. If your .qmd file does not appear, check which folder you are in by typing the pwd command.
- Use the cd command to change folders. For example, if pwd shows that you are currently in /home/users/Downloads but your .qmd file is in /home/users/Documents/DS101A, you could type cd /home/users/Documents/DS101A to go to the correct directory. Alternatively, you could type cd ../Documents/DS101A (../ is a special path that takes you back to the parent folder of the folder you are currently in; in this example, it would take you from /home/users/Downloads to /home/users/).
- Once you are in the correct folder (you can type ls or pwd again to check), you can:
  - preview your document by typing the command quarto preview name_of_quarto.qmd --no-browser
  - render your document to HTML by typing the command quarto render name_of_quarto.qmd
- If you want to produce a single HTML file (and not a folder of files), add the line self-contained: true to the YAML header of your Quarto document, i.e. the YAML header of your document should be similar to this:
---
title: Quarto document title
author: Your name
format:
  html:
    self-contained: true
bibliography: references.bib
---
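If the difference between the absolute and the relative cd commands described above is unclear, the following minimal Python sketch may help. It only mimics how a POSIX shell resolves the target of cd, using the example folders from the instructions. The helper name resolve_cd is hypothetical (it is not part of Quarto or VSCode), and symbolic links are ignored for simplicity.

```python
import posixpath


def resolve_cd(current_dir: str, target: str) -> str:
    """Return the folder a POSIX shell would end up in after `cd target`.

    An absolute target replaces the current folder; a relative target is
    joined onto it and any `..` steps are collapsed. Symbolic links are
    ignored, which is a simplification.
    """
    return posixpath.normpath(posixpath.join(current_dir, target))


start = "/home/users/Downloads"

# Absolute path: works no matter which folder you start from.
print(resolve_cd(start, "/home/users/Documents/DS101A"))

# Relative path: `..` first goes up to the parent folder, then down again.
print(resolve_cd(start, "../Documents/DS101A"))

# `..` on its own is simply the parent folder of the current one.
print(resolve_cd(start, ".."))
```

Both cd variants end up in the same folder; the relative form is just shorter when the target is close to where you already are.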
- Deadline: 14 November 2024 at 5pm.
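If you take the fallback route described above (an HTML file plus a Jupyter notebook in a .zip archive), one possible way to build the archive is with Python's standard zipfile module. This is only a sketch: the function name make_submission_zip and the file names in the usage comment are hypothetical, and you can just as well create the archive with your operating system's own tools.

```python
import zipfile
from pathlib import Path


def make_submission_zip(archive_path, files):
    """Bundle the given files into a .zip archive, stored flat (no folders)."""
    archive_path = Path(archive_path)
    files = [Path(f) for f in files]

    # Check every file exists before writing, so no partial archive is left behind.
    missing = [str(f) for f in files if not f.is_file()]
    if missing:
        raise FileNotFoundError(f"Missing file(s): {', '.join(missing)}")

    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for f in files:
            # arcname=f.name drops the folder part, keeping only the file name.
            zf.write(f, arcname=f.name)
    return archive_path


# Example usage (hypothetical file names, adjust to your own):
# make_submission_zip(
#     "W05_formative_submission.zip",
#     ["LSE_DS101A_2024_25_W05_formative.html", "W05_formative_code.ipynb"],
# )
```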
Recommended Reading
- Check the end of the slides for the list of references cited in the lecture.
- Check the Syllabus for this week's complete list of indicative and recommended readings.
Communication
- Post your reflections, questions, and links on Slack.