LSE DS202A - Data Science for Social Scientists

2024/25 Autumn Term

Here is how you will be assessed in this course.

Context

Your grade in this course consists of two main components:

  1. COURSEWORK (60%): This component consists of two individual problem sets, each worth 30% of your final grade. You’ll need to submit these assignments via GitHub Classroom by the deadlines specified in the 📓 Syllabus.

  2. GROUP PROJECT (40%): This will take place in Winter Term over a period of two weeks. It will be based on all the material covered in the course. More specific details will be provided in due course.

📝 Individual Problem Sets

We aim to provide feedback on your work within two to three weeks of the submission deadline (🤞).

These problem sets will be similar in style to the formative problem set and the lab exercises, although a bit more challenging. Typically, they involve a mix of R tasks and your written interpretation of the analyses.

Formative

Simple practice problem set
(due Week 04; to be announced in the Week 03 lecture)

We want you to practice:

  • Submitting work via GitHub Classroom
  • Using Quarto markdown
  • Writing simple R code (using tidyverse)
  • Fitting and applying basic supervised machine learning models for prediction (regression)


30%

Problem Set 01 — Supervised learning
(due Week 08)

See the 🎯 Learning Objectives involved


  • Know how to fit and apply supervised machine learning models for classification and prediction.
  • Apply the methods learned to real data through hands-on exercises.
  • Know how to evaluate and compare fitted models, and to improve model performance.


30%

Problem Set 02 — Unsupervised learning
(due Week 11+1)

See the 🎯 Learning Objectives involved


  • Understand when and where to apply unsupervised learning techniques (e.g., clustering or anomaly detection) and what differentiates them from supervised learning.
  • Practice dimensionality reduction techniques.
  • Apply the methods learned to real data through hands-on exercises.
  • Integrate the insights from data analytics into knowledge generation and decision-making.
  • Understand an introductory framework for working with natural language (text) data using machine learning techniques.

✍️ Group project (40%)

  • A group project that lasts two weeks.
  • Students are introduced to the datasets (and associated “research questions”) on the first day of the project.
    Students rank the projects in order of preference and, by the end of day 1, are assigned to a group (no more than 4 students per group).
  • There are two check-in points with the lecturer/teachers during the project to discuss its direction.
  • A Quarto project report should be submitted at the end of the two weeks.

40%

Group project
(Winter Term)

  • The group project will be based on all the material covered in the course, including the lectures, labs, problem sets and applications (W09-W11).