LSE DS105M

Data for Data Science

Published: 29 November 2022

⚠️ Important Update 29/11/2022
Older Updates

(25/10/2022)

We re-ordered the topics from Week 05 onwards. Here is what has changed:

  • Topic of Week 05 lecture is now “Notebooks + data frames”
  • The lecture “(Re)shaping data.” was moved from Week 05 to Week 08 (before transforming data, we need to see what data looks like!)
  • Week 05 lab is now “APIs & Data Frames” to allow for a smoother transition into programming
  • The lab “GitHub & Markdown” was moved from Week 05 to Week 07
  • The lecture “Data viz with the grammar of graphics” was moved from Week 10 to Week 07 (to help you include plots in your projects sooner)
  • The lecture “Unstructured data (text, audio & image)” was moved to Week 10
  • Weeks 09 & 11 were left untouched

📓 Syllabus

A list of what will happen every week.

Click on each Week’s link for more information (slides, lab instructions, recommended resources, etc.).

Note about formative assessment: besides the in-person lab exercises, we might give take-home assignments in certain weeks. We do not grade these problem sets, but you will get written feedback on them.

This course will familiarise you with the fundamental practical tools needed to gather (Weeks 02-04) and pre-process (Weeks 05-09) data, and will give you inspiration for fundamental analyses (Weeks 10-11) to perform on your selected datasets.

These skills are cumulative, so practise what you learn each week and make the most of lectures, labs and our Slack group. We believe these points of contact and integration will create a fertile environment of ideas for your project.

💡 Remember: collaboration is key to the success of a data science project!

Intro
🗓️ Week 01 Lecture Introduction and the Data Science Toolbox 🧰
Lab No class this week.
(Use this time to revisit basic R or Python programming)
Theme: Behind the scenes
🗓️ Week 02 Lecture Operating Systems, Files & The Terminal
Lab Navigating the command line on your own computer
🗓️ Week 03 Lecture The Cloud: accessing and getting data in and out.
Lab Connecting to the cloud via the command line
Formative
  • Worth: knowledge!
  • Problem set involving the terminal.
  • Release date: 7 October 2022
  • Deadline: 13 October 2022
🗓️ Week 04 Lecture The Internet: protocols + scraping + APIs.
Lab Web scraping exercise
Summative
  • Worth: 10% of final marks
  • Problem set involving the upload of data to the cloud.
  • Release date: 18 October 2022
  • Deadline: 24 October 2022
Group project
  • after class, students pitch their ideas for preferred APIs/datasets
  • there will be a designated channel on our Slack for this
Theme: Working with data
🗓️ Week 05 Lecture Computational notebooks + data frames
Lab APIs & Data Frames
Summative
  • Worth: 15% of final marks
  • Problem set involving web scraping.
  • Release date: 26 October 2022
  • Deadline: 9 November 2022
Group project
  • students form groups of 3
  • students must submit a team contract to Moodle (not graded)
  • Release date: 25 October 2022
  • Deadline: 9 November 2022
🗓️ Week 06 Reading Week
🗓️ Week 07 Lecture Data viz with the grammar of graphics
Lab GitHub & Markdown
🗓️ Week 08 Lecture (Re-)shaping data, data normalisation & databases
Lab We will have group presentations instead of a structured class this week.
Summative
  • Worth: 15% of final marks
  • each group will present their selected data
  • see instructions for marking criteria
🗓️ Week 09 Lecture Managing your data science workflow.
Lab CANCELLED
(We didn’t have lab sessions on W09 due to UCU Industrial Strike Action)
Theme: Applications
🗓️ Week 10 Lecture Unstructured data (text, audio & image)
Lab 🦸 Super tech-support (get help with your project)
Setting up GitHub for your group project & GitFlow
🗓️ Week 11 Lecture Sentiment analysis, topic modelling and social networks
Lab We will have group presentations instead of a structured class this week
Summative
  • Worth: 20% of final marks
  • each group will present their selected data
  • see instructions for marking criteria