Step 3: Prepare a progress presentation (W11 lab)

2023/24 Winter Term

It is your day to shine! Show us your cool data manipulation skills and interesting stuff you discovered about your data.

This page describes what we expect to see in your second presentation. It is graded and worth 15% of your final grade. Check the ✍️ Assessment page for more details.

This 12-minute presentation will take place in the DSI Visualisation Studio (COL.1.06) at the time allocated for your lab session.

What we want to see

Journey before destination, they say.

This is about showing us your progress. Some will have gathered data, some will have faced challenges and gathered none. That’s all fine.

We will reward effort more than a final dataset.

  1. A diagram illustrating the steps you’ve taken so far: detail the data sources you’ve explored (even those that didn’t work well), the steps within data cleaning, and the data pre-processing steps you need to take to create a plot or a summary table. Aim for the “right” balance: you don’t want to make a visually polluted slide or provide a generic data pipeline.

    • You can get some inspiration by searching online for images of ‘Big Data Pipeline’

    • There are ways to create diagrams using markdown, too, like those supported by Quarto

  2. Progress: We want to learn how you have progressed since last week.

    • Not everyone will have successfully collected data yet, and that’s fine. If that is your case, we want to see your efforts and the challenges you faced.

    • If you managed to collect some data, we want to see how you did it. Do you have data frames already? What do they look like? Why did you structure the tables that way?

  3. Data-manipulation opportunities: However far you have got, we want to see whether you can spot opportunities to apply the data manipulation best practices taught in DS105W.

    • If you create plots, they must follow a grammar-of-graphics style. You will be penalised if you use matplotlib (or, god forbid, Excel) instead of plotnine.¹
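
For the diagram in item 1, one option is to keep your pipeline steps in a short Python list and generate Mermaid flowchart text from it. The sketch below is only an illustration: the step names and the `to_mermaid` helper are hypothetical and not part of any course material.

```python
# Hypothetical pipeline steps: (node id, label, next node id or None).
# Replace these with the sources and steps of your own project.
STEPS = [
    ("api", "Public API (worked)", "raw"),
    ("scrape", "Web scraping (abandoned)", None),
    ("raw", "Raw JSON files", "clean"),
    ("clean", "Cleaning: types, duplicates", "tidy"),
    ("tidy", "Tidy data frame", "plot"),
    ("plot", "Summary table / plot", None),
]


def to_mermaid(steps, direction="LR"):
    """Return Mermaid flowchart source for a list of (id, label, next_id) steps."""
    lines = [f"flowchart {direction}"]
    # Declare every node first so that dead ends still appear in the diagram.
    for node_id, label, _ in steps:
        lines.append(f'    {node_id}["{label}"]')
    # Then draw an arrow for every step that leads somewhere.
    for node_id, _, next_id in steps:
        if next_id is not None:
            lines.append(f"    {node_id} --> {next_id}")
    return "\n".join(lines)


diagram = to_mermaid(STEPS)
print(diagram)
```

The printed text can be pasted into a `{mermaid}` code cell in a Quarto page, which Quarto renders as a diagram. Keeping the abandoned source as a node with no outgoing arrow is one way to show what did not work without cluttering the slide.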

You will receive feedback and ideas on other packages/software/methodologies from the teaching team and other invited Data Science Institute community members to explore further.
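
On the grammar-of-graphics requirement in item 3: most of the work is usually in reshaping the data so that each aesthetic maps to a single column. Below is a minimal sketch, assuming pandas and plotnine are installed. The `wide` data frame, its column names, and the `make_plot` helper are made up for illustration.

```python
import pandas as pd

# Made-up "wide" table: one row per week, one column per data source.
wide = pd.DataFrame(
    {
        "week": [1, 2, 3],
        "reddit_posts": [120, 150, 90],
        "news_articles": [30, 45, 60],
    }
)

# Reshape to "long" form: one row per (week, source) pair.
# Colour can then be mapped to the single `source` column.
long = wide.melt(id_vars="week", var_name="source", value_name="n_items")


def make_plot(data):
    """Build a line plot from the long table (requires plotnine)."""
    # Imported here so the reshaping above runs even without plotnine.
    from plotnine import aes, geom_line, geom_point, ggplot, labs

    return (
        ggplot(data, aes(x="week", y="n_items", color="source"))
        + geom_line()
        + geom_point()
        + labs(x="Week", y="Items collected", color="Source")
    )


print(long)
```

Calling `make_plot(long)` returns a plotnine plot object. Showing the `wide` and `long` tables side by side in your presentation is one way to make the reshaping step visible.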

What to do

  • You are asked to prepare a group presentation of 12 minutes to be presented during your lab sessions.
    • It is not a lot of time, so practise your presentation with your group members to make sure you finish on time.
    • Deciding what to leave out is as important as deciding what to include.
  • Your presentation must be hosted and visible on your website (which you created in Week 09 lab).
    • You don’t necessarily need to prepare a slide deck. If you want, you can show us your website and navigate through it.
    • You will be penalised if you show up with a PowerPoint or Canva.
    • If you want to use slides, consider RevealJS or, preferably, Quarto Markdown with the RevealJS format, which produces RevealJS presentations without you having to write any HTML or CSS by hand.

IMPORTANT: Attendance and late submissions

Live Presentations Required

Please do not embed a video of your presentation. We expect to see you present live!

If you cannot present in person due to exceptional reasons, submit an extension request form with appropriate evidence, as per LSE guidelines. Email the request to Kevin at 📧 .

  • Valid Medical or Exceptional Reasons: Absences for medical reasons or other exceptional circumstances will not attract any penalties. You will still be required to submit a video on Slack by the deadline specified in the approval email.

  • Absence Without Valid Reason: If you miss the presentation without a valid reason (per LSE guidelines), we will allow you to submit a video on Slack, albeit with a significant penalty of 25 marks.

    • Please make sure to be present for the group presentation, as your absence is likely to affect the group’s work. For example, a missing team member may lead to essential details of the project being omitted, lowering the group’s marks. Ultimately, each group is responsible for delivering a complete presentation on the day, regardless of the absence of one or more members.

Video Submission Details

Irrespective of the reason for your absence, if you are to submit a video, please adhere to these guidelines:

  • Video Recording: Create a 5-minute Loom video (it’s free) to present an aspect of your group project.

    • Why Loom?: It allows you to record from your browser and share your screen, and it doesn’t require software installation (though a desktop app is available). Additionally, Loom videos can be played directly in Slack, eliminating the need for downloads.
  • Sharing Your Video: Once uploaded to Loom, click ‘Copy the Link’ and post this link in the #general channel on Slack.

  • Deadline: The video submission deadline is the same as the live presentation day for all students. Your video link should be on Slack by Tuesday, 26 March 2024, 10.30am UK time. Refer to your approval email for the specific deadline if you were granted an extension. Late submissions will be penalised according to LSE guidelines.

Marking criteria

You will be marked on a 0-100 points scale:

Criterion 01: Time management and clear communication (30 marks)

Aim for a smooth presentation.

  • The group submitted the presentation file before the lab session.
  • All group members presented.
  • The group used their 12 minutes well.
  • The presentation had a nice “flow”. It was not repetitive or boring, and there was a storyline.
  • The group convincingly explained their progress from the previous week.
  • If there were changes in data sources since W10, the group explained the rationale.
  • There wasn’t a lot of unnecessary text in the presentation.
  • There weren’t blurry figures or blurry screenshots in the presentation.
  • The presentation was conversational, not too formal.
  • The group didn’t spend a lot of time on unnecessary/uninteresting details.

How to get a distinction here

If you are ambitious about getting a first on this criterion (>21/30 marks), here’s what you can do to WOW us:

  • Your website was converted to Quarto Markdown and it works!
  • Your presentation is a RevealJS slide deck created using Quarto and is embedded inside an <iframe> in your website.

Criterion 02: The Diagram & the Data (30 marks)

Your data sources should be clearly documented.

  • Data sources appear in a diagram.
  • The (tentative) volume of data is presented.
  • The group provided a rationale for the selection of the data sources.
  • If there were “unconventional” data types (not int, float, datetime, simple strings, etc.), the group showed them to us.
  • The group described their original plan and initial efforts to get the desired data.
  • Then, we were told what happened:
      1. If the group managed to collect substantial data, they showed us summary tables, visualisations, and any key insights derived.
      2. If data collection was a real challenge, we received a good account of their ‘data horror story’: the process was well documented, with screenshots or code snippets of attempts, even if unsuccessful.
  • If and when relying on domain expertise outside this course’s scope, the group explained and cited relevant academic references.
  • We were told of unusual aspects of the data, such as outliers or counter-intuitive findings.

How to get a distinction here

If you are ambitious about getting a first on this criterion (>21/30 marks), here’s what you can do to WOW us:

  • Your diagram was written in Mermaid or Graphviz and it looks awesome! Everything is clearly visible.
  • Your tables, showcasing your data frames, appeared in nice markdown formatting and were further customised so they don’t look too big or awkward.

Criterion 03: Data Wrangling Skills (40 marks)

Showcase your data manipulation and visualisation skills!

  • We were shown a diagram outlining all relevant data cleaning, manipulation, and pre-processing steps.
  • The data cleaning process was described with examples.
  • If the group had to reshape data (via merging, pivoting, etc.), they showed us before-and-after examples.
  • The data transformations made were justified. For example, when a group used a groupby-apply, they made it clear why that was necessary.
  • 💡 TIP: You don’t have to show us code, but if you do, make sure it appears as syntax-highlighted Markdown code blocks (NOT screenshots).

How to get a distinction here

If you are ambitious about getting a first on this criterion (>28/40 marks), here’s what you can do to WOW us:

  • We see actual interactive illustrations (and they are not blurred and don’t have weird fonts!) that demonstrate how you have reshaped, or plan to reshape, your data when summarising or visualising it.
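
The before-and-after and groupby-apply points in the data wrangling criterion could be demonstrated along these lines. This is only a sketch with invented data, assuming pandas is installed. The `posts` table, its columns, and the `top_post` helper are hypothetical.

```python
import pandas as pd

# Invented "before" table: one row per post.
posts = pd.DataFrame(
    {
        "subreddit": ["lse", "lse", "london", "london", "london"],
        "month": ["Jan", "Feb", "Jan", "Jan", "Feb"],
        "score": [10, 25, 5, 40, 15],
    }
)

# "After" table: one row per subreddit, one column per month.
monthly = posts.pivot_table(
    index="subreddit", columns="month", values="score", aggfunc="sum"
)


def top_post(group):
    """Return the highest-scoring row of one group."""
    return group.loc[group["score"].idxmax()]


# groupby-apply is used here because we want the whole row of the best
# post per group; taking the max of each column separately could mix
# values that come from different rows.
best = posts.groupby("subreddit")[["month", "score"]].apply(top_post)

print("Before:", posts, sep="\n")
print("After:", monthly, sep="\n")
print("Top post per subreddit:", best, sep="\n")
```

The same result can often be obtained without apply, for instance with `posts.loc[posts.groupby("subreddit")["score"].idxmax()]`. Mentioning that you considered a simpler alternative is part of justifying the choice.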

Footnotes

  1. I typically get asked why I ask for plotnine instead of the usual matplotlib in Python. This is because plotnine forces us to think about the shape of the data explicitly. You must transform the data to get the plot right; that is the skill we want you to practice (and demonstrate) most in this course!