LSE DS202W - Data Science for Social Scientists

2024/25 Winter Term

Important

🚧 This website is under construction. Pages will continue to be updated over the next few weeks!

Here is how you will be assessed in this course.

Context

Your grade in this course consists of two main components:

  1. COURSEWORK (60%): This component consists of two individual problem sets, each worth 30% of your final grade. You’ll need to submit these assignments via GitHub Classroom, following the deadlines specified in the 📓 Syllabus.

  2. GROUP PROJECT (40%): This will take place in Spring Term over a period of two weeks. It will be based on all the material covered in the course. More specific details will be provided in due course.

📝 Individual Problem Sets

We aim to provide feedback on your work within two to three weeks after the submission deadline (🤞).

These problem sets will be similar in style to the formative problem sets and the exercises done in the labs, although a bit more challenging. Typically, they will involve a mix of Python tasks and your written interpretation of the analyses.

Important

Just so we are very clear from the start, if you want to do well in this course, simply writing code that works is not enough!

Lining up (minimally commented) code block after code block will never result in a good mark and will even be penalized.

We need to understand your thought process, e.g. why you chose particular models to solve the problem at hand, why you selected the parameters you trained your models with, why you’re evaluating your models the way you are, and how you are interpreting your evaluation metrics in the context of your dataset/problem.

Code on its own is meaningless and you’ll always be asked to comment it, explain it and interpret its results. Your explanations of your modeling process and interpretations of modeling results are much more important than the code blocks themselves! The code is simply the tool to help you do your modeling.

Formative

Simple practice problem set
(due Week 04 and to be announced in Week 03’s lecture)

We want you to practice:

  • Submitting work via GitHub Classroom
  • Using Quarto markdown
  • Writing simple Python code (using scikit-learn/pandas)
  • Fitting and applying basic supervised machine learning models for prediction (regression)
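To give a rough idea of the kind of code the last two items refer to, here is a minimal sketch of a regression fitted with pandas and scikit-learn. The data, the column names (`hours_studied`, `exam_score`) and the choice of linear regression are assumptions made up for illustration; they are not taken from any actual problem set.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Hypothetical toy data, invented for illustration only.
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "exam_score": [52, 55, 61, 64, 68, 74, 77, 83, 86, 91],
})

X = df[["hours_studied"]]  # features must be 2-dimensional
y = df["exam_score"]       # target we want to predict

# Hold out part of the data so the error is measured on rows
# the model has not seen during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

model = LinearRegression().fit(X_train, y_train)
predictions = model.predict(X_test)

slope = model.coef_[0]
mae = mean_absolute_error(y_test, predictions)

print(f"Estimated change in score per extra hour: {slope:.2f}")
print(f"Mean absolute error on the test set: {mae:.2f}")
```

In a submission, code like this would be accompanied by text explaining why a linear model is a reasonable choice for the data and what the error means in the units of the outcome.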


Formative

Supervised learning practice problem set
(due Week 06 and to be announced in Week 05’s lecture)

We want you to practice:

  • Submitting work via GitHub Classroom
  • Using Quarto markdown
  • Fitting and applying supervised machine learning models for classification
  • Evaluating and comparing fitted models, and selecting among them
  • Writing justifications for modeling choices and interpretations of contextualized evaluation metrics
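As an illustration of what interpreting metrics in context can mean, the sketch below compares two hypothetical classifiers on an imbalanced toy dataset using plain Python. The labels, the predictions and the helper `classification_metrics` are all invented for this example; in practice you would more likely use the equivalent functions in scikit-learn.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard against division by zero when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels: 1 is the rare class of interest, 0 is everything else.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# Classifier A always predicts the majority class.
always_negative = [0] * 10
# Classifier B finds both positives at the cost of one false alarm.
model_b = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]

baseline = classification_metrics(y_true, always_negative)
candidate = classification_metrics(y_true, model_b)

for name, scores in [("always negative", baseline), ("model B", candidate)]:
    summary = ", ".join(f"{k}={v:.2f}" for k, v in scores.items())
    print(f"{name}: {summary}")
```

The first classifier reaches 80% accuracy without ever identifying a positive case, so accuracy alone would be misleading here. Explaining why recall or F1 is the more informative metric for the problem at hand is the kind of written justification these problem sets ask for.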


30%

Problem Set 01 — Supervised learning
(due Week 09)

See the 🎯 Learning Objectives involved


  • Know how to fit and apply supervised machine learning models for classification and prediction.
  • Apply the methods learned to real data through hands-on exercises.
  • Know how to evaluate and compare fitted models, and to improve model performance.


30%

Problem Set 02 — Unsupervised learning
(Spring Term)

See the 🎯 Learning Objectives involved


  • Understand when and where to apply unsupervised learning techniques (e.g. clustering or anomaly detection) and what differentiates them from supervised learning.
  • Practice dimensionality reduction techniques.
  • Apply the methods learned to real data through hands-on exercises.
  • Integrate the insights from data analytics into knowledge generation and decision-making.
  • Understand an introductory framework for working with natural language (text) data using techniques of machine learning.

✍️ Group project (40%)

  • A group project that lasts two weeks.
  • Students are introduced to the datasets (and associated “research questions”) on the first day of the project.
    Students are asked to rank the projects in order of preference on that day and, by the end of day 1, are assigned to a group (no more than 4 students per group).
  • There are two check-in points with the lecturer/teachers during the project to discuss the direction of the project.
  • A Quarto project report should be submitted at the end of the two weeks.

40%

Group project
(Spring Term)

  • The group project will be based on all the material covered in the course, including the lectures, labs, problem sets and applications (W09-W11).