LSE DS101A - Fundamentals of Data Science
2023/24 Autumn Term
Intro | |||
---|---|---|---|
Week 01 | 25 Sep 2023 - 29 Sep 2023 |
Lecture | Introduction, Context & Key Concepts |
Class | Discussions: the boundaries of personal data |
Coursework | |
Readings | Indicative |
Basic concepts from Computer Science and Statistics | |||
Week 02 | 02 Oct 2023 - 06 Oct 2023 |
Lecture | Data types and the concept of tidy data |
Class | Live Demo: How data scientists use programming to preprocess data |
Formative | |
Readings | Indicative |
Week 03 | 09 Oct 2023 - 13 Oct 2023 |
Lecture | Computational Thinking and Programming |
Class | Live Demo: How data scientists use programming to visualise data |
Summative | |
Readings | Indicative |
Week 04 | 16 Oct 2023 - 20 Oct 2023 |
Lecture | Statistical Inference I |
Class | Tutorial: Introduction to Zotero & Quarto Markdown |
Readings | Indicative |
Week 05 | 23 Oct 2023 - 27 Oct 2023 |
Lecture | Statistical Inference II |
Class | Group Presentations (worth 10% of final grade) |
Formative | |
Coursework | |
Readings | Indicative |
Week 06 | 30 Oct 2023 - 04 Nov 2023 |
Reading Week | |
Machine Learning & AI | |||
Week 07 | 06 Nov 2023 - 11 Nov 2023 |
Lecture | Machine Learning I: Supervised Learning |
Class | Live Demo: Supervised Learning |
Drop-in session | We will host a drop-in session in Week 07 to help answer any questions you have about Quarto Markdown and Zotero |
Formative | |
Readings | Indicative |
Week 08 | 13 Nov 2023 - 17 Nov 2023 |
Lecture | Machine Learning II: Unsupervised Learning |
Class | Peer-reviewing activity (details will be given in the Week 07 lecture) |
Formative | |
Readings | Indicative |
Week 09 | 20 Nov 2023 - 24 Nov 2023 |
Lecture | Unstructured Data (Text, Audio, Video) |
Class | In-class activity: exploring Machine Learning metrics (with a case study) |
Summative | |
Readings | Indicative |
Decisions and Implications | |||
Week 10 | 27 Nov 2023 - 01 Dec 2023 |
Lecture | Prediction vs. Explanation |
Class | Live Demo: Unsupervised Learning |
Drop-in session | We will host a drop-in session in Week 11+1 to help answer any questions you have about your summative essay |
Coursework | |
Readings | Indicative |
Week 11 | 04 Dec 2023 - 08 Dec 2023 |
Lecture | Ethical issues of AI and ethical AI: an overview |
Class | Exploring Generative AI |
Deadline Approaching | Keep working on your essays |
Readings | Indicative |
After the Term | |||
Week 11+1 | Deadline | Submit your essay by 20 December 2023 |
Summative | |
Dec 2023 - Jan 2024 | Winter break |
Winter Term (Jan & Feb 2024) | |||
Week 03 | Deadline Approaching | Keep working on your essays |
Week 06 | Deadline | Submit your essay by 22 February 2024 |
The End |
References
Aschwanden, Christie. 2015. “Science Isn’t Broken.” FiveThirtyEight. https://fivethirtyeight.com/features/science-isnt-broken/.
Bakir, Vian. 2020. “Psychological Operations in Digital Political Campaigns: Assessing Cambridge Analytica’s Psychographic Profiling and Targeting.” Frontiers in Communication 5 (September): 67. https://doi.org/10.3389/fcomm.2020.00067.
Bender, Emily M., Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. 2021. “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜.” In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610–23. Virtual Event Canada: ACM. https://doi.org/10.1145/3442188.3445922.
Bossman, Julia. 2016. “Top 9 Ethical Issues in Artificial Intelligence. World Economic Forum.” October 21, 2016. https://www.weforum.org/agenda/2016/10/top-10-ethical-issues-in-artificial-intelligence/.
Bridle, James. 2023. “The Stupidity of AI.” The Guardian, March. https://www.theguardian.com/technology/2023/mar/16/the-stupidity-of-ai-artificial-intelligence-dall-e-chatgpt.
Broman, Karl W., and Kara H. Woo. 2018. “Data Organization in Spreadsheets.” The American Statistician 72 (1): 2–10. https://doi.org/10.1080/00031305.2017.1375989.
Bruce, Peter C., and Andrew Bruce. 2017. Practical Statistics for Data Scientists: 50 Essential Concepts. First edition. Sebastopol, CA: O’Reilly. https://ebookcentral.proquest.com/lib/londonschoolecons/detail.action?docID=4857224.
D’Ignazio, Catherine, and Lauren F. Klein. 2020. Data Feminism. Strong Ideas Series. Cambridge, Massachusetts: The MIT Press. https://ebookcentral.proquest.com/lib/londonschoolecons/reader.action?docID=6120950.
Denning, Peter J., and Matti Tedre. 2019. Computational Thinking. The MIT Press Essential Knowledge Series. Cambridge, Massachusetts: The MIT Press.
Enders, Craig K. 2022. Applied Missing Data Analysis. Guilford Publications.
Flach, Peter A. 2012. Machine Learning: The Art and Science of Algorithms That Make Sense of Data. Cambridge: Cambridge University Press. https://doi-org.gate3.library.lse.ac.uk/10.1017/CBO9780511973000.
Floridi, Luciano, Josh Cowls, Monica Beltrametti, Raja Chatila, Patrice Chazerand, Virginia Dignum, Christoph Luetge, et al. 2018. “AI4People–an Ethical Framework for a Good AI Society: Opportunities, Risks, Principles, and Recommendations.” Minds and Machines (Dordrecht) 28 (4): 689–707.
Gimlet. n.d. “#177 Gleeks and Gurgles Reply All.” Accessed January 15, 2023. https://gimletmedia.com:443/shows/reply-all/z3h78d6.
Gramegna, Alex, and Paolo Giudici. 2021. “SHAP and LIME: An Evaluation of Discriminative Power in Credit Risk.” Frontiers in Artificial Intelligence 4. https://doi.org/10.3389/frai.2021.752558.
Greshake, Kai, Sahar Abdelnabi, Shailesh Mishra, Christoph Endres, Thorsten Holz, and Mario Fritz. 2023. “Not What You’ve Signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection.” https://arxiv.org/abs/2302.12173.
Guyan, Kevin. 2022. Queer Data: Using Gender, Sex and Sexuality Data for Action. Bloomsbury Studies in Digital Cultures. London: Bloomsbury Academic. https://web-s-ebscohost-com.gate3.library.lse.ac.uk/ehost/detail/detail?nobk=y&vid=2&sid=a8efeedd-6bfc-459a-9f0c-a67dabcc75d1@redis&bdata=JnNpdGU9ZWhvc3QtbGl2ZQ==#AN=3077276&db=nlebk.
Hofman, Jake M., Amit Sharma, and Duncan J. Watts. 2017. “Prediction and Explanation in Social Systems.” Science 355 (6324): 486–88. https://doi.org/10.1126/science.aal3856.
Hofman, Jake M., Duncan J. Watts, Susan Athey, Filiz Garip, Thomas L. Griffiths, Jon Kleinberg, Helen Margetts, et al. 2021. “Integrating Explanation and Prediction in Computational Social Science.” Nature 595 (7866): 181–88. https://doi.org/10.1038/s41586-021-03659-0.
Hullman, Jessica, Sayash Kapoor, Priyanka Nanayakkara, Andrew Gelman, and Arvind Narayanan. 2022. “The Worst of Both Worlds: A Comparative Analysis of Errors in Learning from Data in Psychology and Machine Learning.” In Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society, 335–48. Oxford United Kingdom: ACM. https://doi.org/10.1145/3514094.3534196.
Illowsky, Barbara, and Susan L. Dean. 2013. Introductory Statistics. Houston, Texas: OpenStax College. https://openstax.org/details/books/introductory-statistics.
Isaak, Jim, and Mina J. Hanna. 2018. “User Data Privacy: Facebook, Cambridge Analytica, and Privacy Protection.” Computer 51 (8): 56–59. https://doi.org/10.1109/MC.2018.3191268.
Jones, Darren. 2023. “Basic Data Types in Python.” Real Python. https://realpython.com/courses/python-data-types/.
Lazer, David, Ryan Kennedy, Gary King, and Alessandro Vespignani. 2014. “The Parable of Google Flu: Traps in Big Data Analysis.” Science 343 (6176): 1203–5. https://doi.org/10.1126/science.1248506.
Li, Bo, Peng Qi, Bo Liu, Shuai Di, Jingen Liu, Jiquan Pei, Jinfeng Yi, and Bowen Zhou. 2023. “Trustworthy AI: From Principles to Practices.” ACM Comput. Surv. 55 (9). https://doi.org/10.1145/3555803.
Lupton, Deborah. 2016. “The Diverse Domains of Quantified Selves: Self-Tracking Modes and Dataveillance.” Economy and Society 45 (1): 101–22. https://doi.org/10.1080/03085147.2016.1143726.
Lupton, Deborah. 2020. “Data Mattering and Self-Tracking: What Can Personal Data Do?” Continuum 34 (1): 1–13. https://doi.org/10.1080/10304312.2019.1691149.
Mehrabi, Ninareh, Fred Morstatter, Nripsuta Saxena, Kristina Lerman, and Aram Galstyan. 2021. “A Survey on Bias and Fairness in Machine Learning.” ACM Comput. Surv. 54 (6). https://doi.org/10.1145/3457607.
Nauta, Meike, Jan Trienes, Shreyasi Pathak, Elisa Nguyen, Michelle Peters, Yasmin Schmitt, Jörg Schlötterer, Maurice van Keulen, and Christin Seifert. 2023. “From Anecdotal Evidence to Quantitative Evaluation Methods: A Systematic Review on Evaluating Explainable AI.” ACM Comput. Surv. 55 (13s). https://doi.org/10.1145/3583558.
Parsons, Lian. 2020. “Ethical Concerns Mount as AI Takes Bigger Decision-Making Role. Harvard Gazette.” October 26, 2020. https://news.harvard.edu/gazette/story/2020/10/ethical-concerns-mount-as-ai-takes-bigger-decision-making-role/.
Perkel, Jeffrey M. 2022. “Six Tips for Better Spreadsheets.” Nature 608 (7921): 229–30. https://doi.org/10.1038/d41586-022-02076-1.
Perry, Neil, Megha Srivastava, Deepak Kumar, and Dan Boneh. 2022. “Do Users Write More Insecure Code with AI Assistants?” https://arxiv.org/abs/2211.03622.
Pessach, Dana, and Erez Shmueli. 2022. “A Review on Fairness in Machine Learning.” ACM Comput. Surv. 55 (3). https://doi.org/10.1145/3494672.
Pietsch, Wolfgang. 2022. On the Epistemology of Data Science: Conceptual Tools for a New Inductivism. Philosophical Studies Series, Volume 148. Cham: Springer.
Podoletz, Lena. 2022. “We Have to Talk about Emotional AI and Crime.” AI & SOCIETY, May. https://doi.org/10.1007/s00146-022-01435-w.
Prince, J. Dale. 2014. “The Quantified Self: Operationalizing the Quotidien.” Journal of Electronic Resources in Medical Libraries 11 (2): 91–99. https://doi.org/10.1080/15424065.2014.909145.
Python documentation. 2023. “Floating Point Arithmetic: Issues and Limitations.” Python Documentation. https://docs.python.org/3.10/tutorial/floatingpoint.html.
Rastogi, Charvi, Yunfeng Zhang, Dennis Wei, Kush R. Varshney, Amit Dhurandhar, and Richard Tomsett. 2022. “Deciding Fast and Slow: The Role of Cognitive Biases in AI-Assisted Decision-Making.” Proc. ACM Hum.-Comput. Interact. 6 (CSCW1). https://doi.org/10.1145/3512930.
Real Python. 2023. “Introduction to Python.” Real Python. https://realpython.com/learning-paths/python3-introduction/.
Rettberg, Jill Walker. 2022. “Algorithmic Failure as a Humanities Methodology: Machine Learning’s Mispredictions Identify Rich Cases for Qualitative Analysis.” Big Data & Society 9 (2): 205395172211312. https://doi.org/10.1177/20539517221131290.
Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. 2016. “"Why Should I Trust You?": Explaining the Predictions of Any Classifier.” https://arxiv.org/abs/1602.04938.
Sachs, Jeffrey, Rahshemah Wise, and Daniel Karell. 2021. “The TikTok Self: Music, Signaling, and Identity on Social Media.” Preprint. SocArXiv. https://doi.org/10.31235/osf.io/2rx46.
Scheffer, Judi. 2002. “Dealing with Missing Data.”
Schroeder, Stan. 2022. “TikTok’s in-App Browser Can Monitor Your Every Click and Keystroke.” Mashable. https://mashable.com/article/tiktok-browser-monitoring.
Schutt, Rachel, and Cathy O’Neil. 2013. Doing Data Science. First edition. Beijing ; Sebastopol: O’Reilly Media. https://ebookcentral.proquest.com/lib/londonschoolecons/detail.action?docID=1465965.
scikit-learn. 2023. “Scikit Learn User Guide: Clustering.” Scikit-Learn. https://scikit-learn.org/stable/modules/clustering.html.
Segura, Thomas L. 2023. “Yes, GitHub’s Copilot Can Leak (Real) Secrets. GitGuardian Blog - Automated Secrets Detection.” October 12, 2023. https://blog.gitguardian.com/yes-github-copilot-can-leak-secrets/.
Shafer, Douglas S., and Zhiyi Zhang. 2012. Introductory Statistics. Saylor Foundation. https://saylordotorg.github.io/text_introductory-statistics/.
Shah, Chirag. 2020. A Hands-on Introduction to Data Science. Cambridge, United Kingdom ; New York, NY, USA: Cambridge University Press. https://librarysearch.lse.ac.uk/permalink/f/1n2k4al/TN_cdi_askewsholts_vlebooks_9781108673907.
Sturz, John. 2023. “Basic Data Types in Python.” Real Python. https://realpython.com/python-data-types/.
Swan, Melanie. 2013. “The Quantified Self: Fundamental Disruption in Big Data Science and Biological Discovery.” Big Data 1 (2): 85–99. https://doi.org/10.1089/big.2012.0002.
Sweeney, Latanya. 2013. “Discrimination in Online Ad Delivery: Google Ads, Black Names and White Names, Racial Discrimination, and Click Advertising.” Queue 11 (3): 10–29. https://doi.org/10.1145/2460276.2460278.
Verhagen, Mark D. 2022. “A Pragmatist’s Guide to Using Prediction in the Social Sciences.” Socius: Sociological Research for a Dynamic World 8 (January): 237802312210817. https://doi.org/10.1177/23780231221081702.
Verma, Sahil, and Julia Rubin. 2018. “Fairness Definitions Explained.” In Proceedings of the International Workshop on Software Fairness, 1–7. FairWare ’18. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3194770.3194776.
Viswanathan, Giri. 2023. “ChatGPT Struggles to Answer Medical Questions, New Research Finds. CNN.” December 10, 2023. https://www.cnn.com/2023/12/10/health/chatgpt-medical-questions/index.html.
Warne, Russell T. 2021. Statistics for the Social Sciences: A General Linear Model Approach. Second edition. Cambridge, United Kingdom; New York, NY; Port Melbourne, Australia; New Delhi, India; Singapore: Cambridge University Press. https://doi.org/10.1017/9781108894319.
Wickham, Hadley. 2014. “Tidy Data.” Journal of Statistical Software 59 (10). https://doi.org/10.18637/jss.v059.i10.
Wong, Julia Carrie. 2019. “The Cambridge Analytica Scandal Changed the World – but It Didn’t Change Facebook.” The Guardian, March. https://www.theguardian.com/technology/2019/mar/17/the-cambridge-analytica-scandal-changed-the-world-but-it-didnt-change-facebook.
Zuboff, Shoshana. 2019. The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power. First edition. New York: PublicAffairs. https://www.publicaffairsbooks.com/titles/shoshana-zuboff/the-age-of-surveillance-capitalism/9781610395694/.