Step 3: Prepare for your second group presentation (W11 lab)
2023/24 Autumn Term
What we expect to see in your second presentation. This one is graded and worth 15% of your final grade. Check the ✍️ Assessment page for more details.
What we want to see
A substantive analysis of your selected dataset and/or of the challenges you faced. Tell us about the discoveries you made while working with your data: what are the common and, more interestingly, the unusual aspects of your data? Additionally, share your data collection horror stories and the hurdles you encountered.
Show us how you spotted opportunities to use `pandas` (or `tidyverse` if done in R) and `plotnine` (`ggplot2` if done in R) in your project. `altair` and `bokeh` are also valid options that follow the grammar-of-graphics principle. We are looking for the data pre-processing and transformations you had to do to enable discoveries from your data (see the footnote).
Create a diagram to illustrate your analysis process: detail the data sources, the steps within data cleaning, and the data pre-processing steps you need to take to create a plot or a summary table. Aim for the “right” balance: you don’t want to create a visually polluted slide or provide a generic data pipeline.
- You can get some inspiration by searching online for images of ‘Big Data Pipeline’
- There are ways to create diagrams using HTML, too, like those supported by Quarto
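If it helps to see how a diagram can map onto code, here is a minimal sketch in which each box of a pipeline diagram corresponds to one small, named function. Everything in it is an assumption made for illustration: the toy data, the column names (`City`, `price`) and the function names are hypothetical, and your own steps will differ.

```python
import pandas as pd

# Hypothetical raw data, standing in for whatever your group collected.
raw = pd.DataFrame(
    {
        "City": ["London", "london ", "Leeds", "Leeds", None],
        "price": ["10.5", "11.0", "n/a", "9.0", "8.0"],
    }
)


def clean_city_names(df: pd.DataFrame) -> pd.DataFrame:
    """Cleaning step: drop rows without a city, then normalise the spelling."""
    out = df.dropna(subset=["City"]).copy()
    out["City"] = out["City"].str.strip().str.title()
    return out


def parse_prices(df: pd.DataFrame) -> pd.DataFrame:
    """Pre-processing step: convert price strings to floats, dropping unparseable ones."""
    out = df.copy()
    out["price"] = pd.to_numeric(out["price"], errors="coerce")
    return out.dropna(subset=["price"])


def summarise_by_city(df: pd.DataFrame) -> pd.DataFrame:
    """Summary step: one row per city with the mean price and the row count."""
    return df.groupby("City", as_index=False).agg(
        mean_price=("price", "mean"),
        n_rows=("price", "size"),
    )


# Each .pipe() call is one arrow in the diagram.
summary = raw.pipe(clean_city_names).pipe(parse_prices).pipe(summarise_by_city)
print(summary)
```

Naming the steps this way makes it easier to keep the diagram specific to your project: the boxes are the functions, and the arrows are the `.pipe()` calls.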
You will receive feedback and ideas on other packages/software/methodologies to explore further from the teaching team and other invited Data Science Institute community members.
What to do
- You are asked to prepare a group presentation of 15 minutes to be presented during your lab sessions.
- It is not a lot of time, so practice your presentation with your group members to ensure you are on time.
- Deciding what to leave out is as important as deciding what to include.
- Share your .ppt or .pdf or .html (revealjs) file on your Slack `#project-cgX-XXXXXXX` channel before the start of your lab session.
  - If you fail to do so, you will be penalised harshly on Criterion 01.
Live Presentations Required
Please do not embed a video of your presentation. We expect to see you present live!
If you cannot present in person due to exceptional reasons, submit an extension request form with appropriate evidence, as per LSE guidelines. Email the request to Kevin at 📧 .
Valid Medical or Exceptional Reasons: Absences for medical reasons or other exceptional circumstances will not attract any penalties. You will still be required to submit a video on Slack by the deadline specified in the approval email.
Absence Without Valid Reason: If you miss the presentation without a valid reason (per LSE guidelines), we will allow you to submit a video on Slack, albeit with a significant penalty of -25 marks.
- Please make sure to be present for the group presentation, as your absence is likely to affect the group’s work. For example, a missing team member may cause essential details of the project to be omitted, lowering the group’s marks. Ultimately, each group is responsible for delivering a complete presentation on the day, regardless of the absence of one or more members.
Video Submission Details
Irrespective of the reason for your absence, if you are to submit a video, please adhere to these guidelines:
Video Recording: Create a 5-minute Loom video (it’s free) to present an aspect of your group project.
- Why Loom?: It allows you to record from your browser, share your screen, and doesn’t require software installation (though a desktop app is available). Additionally, Loom videos can be played directly in Slack, eliminating the need for downloads.
Sharing Your Video: Once uploaded to Loom, click ‘Copy the Link’ and post this link in the `#help-assessments` channel on Slack.
Deadline: The video submission deadline is the same as the live presentation day for all students. Your video link should be on Slack by Friday, 8 December 2023, 23:59 UK time. If you’ve been granted an extension, refer to your approval email for the specific deadline. Late submissions will be penalised according to LSE guidelines.
Marking criteria
Let me warn you in advance that we will be strict when marking this presentation, not because we don’t like what you do but mostly because we have to mitigate grade inflation concerns.
Most submissions will likely fall in the 60-70 range, which signals that you did engage well with the project and put in a reasonable effort, although there were some minor mistakes or omissions. You can expect to achieve around 70/100 if you have done a very good job - you adhered to the marking criteria well, and we could not find much to criticise. You should expect to score above 80 only if your work has truly left us in a blissful state of awe and admiration that will not go away for days.
You will be marked on a 0-100 points scale:
Criteria | Description | Marks |
---|---|---|
01: Time management | Aim for a smooth presentation.<br>- The group submitted the presentation file before the lab session.<br>- All group members presented.<br>- The group used their 15 minutes well.<br>- The presentation had a nice “flow”. It was not repetitive or boring, and there was a storyline.<br>- The group convincingly explained how they progressed from Week 08.<br>- If there were changes in data sources since W08, the group explained the rationale. | 20 |
02: The Data | Your data sources should be clearly identified at this point.<br>- Data sources appear in a diagram.<br>- The volume of data is presented.<br>- The group provided a rationale for the selection of the data sources.<br>- If there were “unconventional” data types (not int, float, datetime, simple strings, etc.), the group showed them to us. | 10 |
03: Data Collection / Exploratory Analysis | Document your data collection and exploration efforts.<br>- The group described their original plan and initial efforts to get the desired data.<br>- Then, we were told what happened:<br>1. If the group managed to collect substantial data, they showed us summary tables, visualisations, and any key insights derived.<br>2. If data collection was a real challenge, we received a good account of their ‘data horror story’: the process was well documented, with screenshots or code snippets of attempts, even if unsuccessful.<br>- If and when relying on domain expertise outside this course’s scope, the group explained and cited relevant academic references.<br>- We were told of unusual aspects of the data, such as outliers or counter-intuitive findings. | 20 |
04: Data Wrangling Skills | Showcase your data manipulation and visualisation skills!<br>- We were shown a diagram outlining all relevant data cleaning, manipulation, and pre-processing steps.<br>- The data cleaning methods were described with examples.<br>- If the group had to reshape data (via merging, pivoting, etc.), they showed us before-and-after examples.<br>- The data transformations made were justified. For example, when a group used groupby-apply operations, they made it clear why that was necessary.<br>- 💡 TIP: You don’t have to show us code, but if you do, ensure it is clear, readable, and relevant. | 40 |
05: Clear Communication | It’s ok if you are shy. Just make it interesting and clear!<br>- There wasn’t a lot of unnecessary text in the presentation.<br>- There weren’t blurry figures or blurry screenshots in the presentation.<br>- The presentation was conversational, not too formal.<br>- The group didn’t spend a lot of time on unnecessary/uninteresting details. | 10 |
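To give a rough idea of what a before-and-after example for Criterion 04 could look like, here is a small sketch using `pandas`. The `sales` and `stores` tables and all their columns are hypothetical, invented only for this illustration. It is one possible way to show the effect of a merge and a pivot, not a required template.

```python
import pandas as pd

# Hypothetical tables, invented for illustration only.
sales = pd.DataFrame(
    {
        "store_id": [1, 1, 2, 2, 3],
        "month": ["Oct", "Nov", "Oct", "Nov", "Oct"],
        "revenue": [100, 120, 80, 90, 50],
    }
)
stores = pd.DataFrame({"store_id": [1, 2], "region": ["North", "South"]})

# Merge: a left join keeps every sale; indicator=True adds a _merge column
# that records whether each row found a matching store.
merged = sales.merge(stores, on="store_id", how="left", indicator=True)
unmatched = merged[merged["_merge"] == "left_only"]

# Pivot: from one row per (store, month) to one row per store.
wide = sales.pivot(index="store_id", columns="month", values="revenue")

# Shapes are a compact way to show "before" and "after" on a slide.
print("before:", sales.shape, "after merge:", merged.shape, "after pivot:", wide.shape)
print(unmatched[["store_id", "month", "revenue"]])
print(wide)
```

Showing the shapes and the unmatched rows next to the reshaped table is a quick way to justify a transformation: the audience sees what changed and what was lost or left unmatched along the way.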
Footnotes
I typically get asked why I ask for `plotnine` instead of the usual `matplotlib` in Python. This is because plotnine forces us to think about the shape of the data explicitly. You must transform the data to get the plot right; that is the skill we want you to practice (and demonstrate) most in this course!
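As a small illustration of the point in this footnote, the sketch below reshapes a wide table into long form before handing it to plotnine. The data and the names (`wide`, `tidy`, `make_plot`) are hypothetical. The plotnine import sits inside the function only so that the reshaping part still runs if plotnine is not installed.

```python
import pandas as pd

# Hypothetical wide table: one column per year. Easy to read, awkward to plot.
wide = pd.DataFrame(
    {
        "country": ["A", "B"],
        "2021": [1.0, 2.0],
        "2022": [1.5, 2.5],
    }
)

# Reshape to long form: one row per (country, year) observation.
tidy = wide.melt(id_vars="country", var_name="year", value_name="value")
tidy["year"] = tidy["year"].astype(int)


def make_plot(df: pd.DataFrame):
    """Build a line chart with one line per country from long-form data."""
    # Imported here so the reshaping above runs even without plotnine installed.
    from plotnine import aes, geom_line, geom_point, ggplot

    return (
        ggplot(df, aes(x="year", y="value", color="country"))
        + geom_line()
        + geom_point()
    )


print(tidy)
```

The aesthetic mapping refers to columns by name (`year`, `value`, `country`), so those columns have to exist first. This is why the `melt` step comes before the plot.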