LSE DS202

Data Science for Social Scientists

Published: 13 November 2022

📓 Syllabus

A list of what happens every week.

Click on each week's link for more information (slides, lab instructions, recommended resources, etc.).

Intro
🗓️ W01 Lecture Introduction, Context & Key Concepts
(James et al. 2021, chap. 2)
Lab No class this week.
(Use this time to revisit basic R programming)
Supervised Learning
🗓️ W02 Lecture Simple and Multiple Linear Regression
(James et al. 2021, chap. 3)
Lab Revision of R: data structures, basic commands and some tidyverse
🗓️ W03 Lecture Classifiers (Logistic Regression & Naive Bayes)
(James et al. 2021, chap. 4)
Lab Linear Regression (James et al. 2021, chap. 3)
Formative
  • Worth: knowledge!
  • Problem set involving linear regression.
  • Release date: 15 October 2022
  • (link to Moodle)
  • Deadline: 25 October 2022 (Week 05)
🗓️ W04 Lecture Resampling methods
(James et al. 2021, chap. 5)
Lab Classification Methods (James et al. 2021, chap. 4)
🗓️ W05 Lecture Non-linear algorithms (SVM & tree-based models)
(James et al. 2021, chaps. 8–9)
Lab Cross-Validation and the Bootstrap (James et al. 2021, chap. 5)
Summative
  • Worth: 20% of final marks
  • Problem set involving regression, classification and resampling
  • Release date: 28 October 2022
  • Deadline: 9 November 2022 (Week 07)
Unsupervised Learning
🗓️ W07 Lecture Unsupervised Learning: Clustering
(James et al. 2021, chap. 12)
Lab Tree-based models (James et al. 2021, chap. 8)
🗓️ W08 Lecture Unsupervised Learning: Dimensionality Reduction
(James et al. 2021, chap. 12)
Lab Support Vector Machine + tidymodels + recap of cross-validation (James et al. 2021, chaps. 5, 9)
Summative
  • Worth: 20% of final marks
  • Problem set about unsupervised learning
  • Release date: 18 November 2022
  • Deadline: 29 November 2022 (Week 10)
Applications
🗓️ W09 Lecture Applications: Text as Data & Topic Modelling
Guest: Prof. Ken Benoit
Lab Unsupervised Learning: Clustering (James et al. 2021, chap. 12)
🗓️ W10 Lecture Applications: Predictive Modelling on Tabular Data
Lab Unsupervised Learning: Principal Component Analysis (James et al. 2021, chap. 12)
Summative
  • Worth: 20% of final marks
  • Problem set about ML applications (text mining/social media)
  • Release date: 2 December 2022
  • Deadline: 15 December 2022 (Week 11+1, i.e. the week after Week 11)
🗓️ W11 Lecture Applications: Social Media Data
Guests: Sara Luxmoore, MSc in Applied Social Data Science (LSE)
Anton Boychenko, MSc in Applied Social Data Science (LSE)
Lab We will explore unsupervised models using a couple of text datasets
          
🗓️ January 2023 Exam
  • Worth: 40% of final marks
  • Problem set about supervised + unsupervised learning + applications
  • 3 hours + 1 hour for submission
  • Online exam via Moodle
  • Date: during the exam period; the exact timeslot has not yet been confirmed by the LSE Exams team.

References

James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2021. An Introduction to Statistical Learning: With Applications in R. Second edition. Springer Texts in Statistics. New York, NY: Springer. https://www.statlearning.com/.