08 Jul 2024
nlp
network analysis
optimisation
data science workflow
generative AI for education
machine learning applications
VIMuRe (social network analysis)
📦 Python and R packages available
De Bacco, Caterina, Martina Contisciani, Jonathan Cardoso-Silva, Hadiseh Safdari, Gabriela Lima Borges, Diego Baptista, Tracy Sweet, et al. 2023. “Latent Network Models to Account for Noisy, Multiply Reported Social Network Data.” Journal of the Royal Statistical Society Series A: Statistics in Society, February, qnac004.
We developed a Bayesian statistical model to uncover the ‘true’ underlying network behind the social network ties reported by individuals.
What can the emojis you use in your social media profile reveal about your political values? 🤔
Collaboration with:
Joint project with Dr Marcos Barreto (LSE Statistics)
CONTEXT
In higher education, students are increasingly using generative AI tools such as ChatGPT, GitHub Copilot, Bard, and Bing AI to enhance their learning experience. These tools offer personalised and immediate assistance with tasks such as summarising literature, brainstorming, and writing code and text, although their outputs can be limited in transparency and accuracy. Some educators feel encouraged to incorporate these tools into their teaching and assessments to support students, but there is still limited evidence on how effective generative AI tools are at improving learning outcomes.
This focus group aims to help fill that gap by exploring the practical applications of these tools and their role in enhancing two skills in particular: programming and critical thinking.
OBJECTIVES
His research relies on novel ways of using, combining, and automating open data from finance and beyond, such as creating natural language processing algorithms to extract and disambiguate named entities from text. He is currently investigating applications of network analysis in banking and implementing them in his project.
Prior to his DPhil, Alex completed a dual MSc in Public Administration and Government from LSE and Peking University, graduating with distinction and receiving a Prize for Best Dissertation from the Department of Government at LSE, where he researched factionalism and regionalism in Russian banking using geospatial analysis and text mining.
Sign up for DSI events at lse.ac.uk/DSI/Events
Sign up for the DSI Newsletter
DSI offers accessible introductions to Data Science:
Fundamentals of Data Science
🎯 Focus: theoretical concepts of data science
📂 How: reflections through reading and writing
Data for Data Scientists
🎯 Focus: collection and handling of real data
📂 How: hands-on coding exercises and a group project
Data Science for Social Scientists
🎯 Focus: fundamental machine learning algorithms
📂 How: practical use of ML techniques and metrics
Let’s generate some data with mentimeter!
“[…] a field of study and practice that involves the collection, storage, and processing of data in order to derive important 💡 insights into a problem or a phenomenon. Such data may be generated by humans (surveys, logs, etc.) or machines (weather data, road vision, etc.), and could be in different formats (text, audio, video, augmented or virtual reality, etc.).”
knows everything about statistics
is able to communicate insights perfectly
fully understands businesses like no one
is a fluent computer programmer
We are all jugglers 🤹
It is often said that 80% of the time and effort spent on a data science project goes to the abovementioned tasks.
And this is what this course is about! You will learn some of the most common tools used during this process.
We add the term data engineering to the name of this course for this very reason.
“The struggle is real.” by u/ali_azg in r/dataengineering
But remember the unicorn 🦄! You don’t need to be an expert in all of these tools.
“DataEngineering 2021 in one pic” by u/Legitimate-Cry2837 in r/dataengineering
Let’s zoom in 🔎 here.
Let’s navigate our website:
ME204’s favicon was created using DALL-E 2
After the break:
Python
GitHub!
Use GitHub for everything related to your project!
Important
Don’t share code via e-mail, Dropbox, Google Drive, or anything like that!
It is a bad practice. Things get messy very quickly.
x means in a few weeks?
? to access the documentation of a function
?? to search for a function
help(package = "package_name") to list all functions in a package
Pro-tips
Loops (for or while) can be very inefficient in R
dput() to share your data with others
You can also consult the ‘Hands-on Programming with R’ book (Grolemund 2014) as a reference.
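The two pro-tips above can be sketched in a few lines of R. This is a minimal illustration only; the vector `x` and the data frame `df` are made-up examples, not data from the course.

```r
# Tip 1: prefer vectorised operations to explicit loops.
x <- 1:10

# Loop version: fills a pre-allocated vector one element at a time.
squares_loop <- numeric(length(x))
for (i in seq_along(x)) {
  squares_loop[i] <- x[i]^2
}

# Vectorised version: the same result in a single expression.
squares_vec <- x^2

# Tip 2: dput() prints R code that recreates an object, so others can
# paste it into their own session to reproduce your data exactly.
df <- data.frame(id = 1:3, score = c(2.5, 3.0, 4.5))  # hypothetical data
dput(df)
```

The text printed by `dput(df)` can be pasted straight into a question or a GitHub issue, which is usually easier for others to work with than a screenshot or an attached file.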
Embrace the pipe %>% mindset!
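As a rough sketch of what the pipe mindset looks like in practice, the snippet below computes the same quantity twice, once with nested base R indexing and once as a left-to-right pipeline. It assumes the dplyr package is installed and uses R's built-in `mtcars` dataset purely for illustration.

```r
library(dplyr)  # also makes the %>% pipe available

# Base R: subset and average in one nested expression, read inside-out.
base_result <- mean(mtcars$mpg[mtcars$cyl == 4])

# Pipe: the same steps, read top to bottom.
pipe_result <- mtcars %>%
  filter(cyl == 4) %>%                # keep only 4-cylinder cars
  summarise(mean_mpg = mean(mpg)) %>% # average fuel efficiency
  pull(mean_mpg)                      # extract the single number
```

Both versions give the same number; the piped one simply makes each step of the transformation explicit.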
Refer to Tavares (2018) for a side-by-side comparison of base R and tidyverse:
Tavares, Hugo. 2018. “Syntax Equivalents: Base R Vs Tidyverse.” Data Carpentry Extras.
Keep the dplyr cheat sheet handy
LSE ME204 (2024) – Data Engineering for the Social World