πŸ—£οΈ Week 08 Lecture

From Word Vectors to Transformers

Author

Dr Barry Ledeatte and Dr Jon Cardoso-Silva

Published

10 March 2025


📢 Important Update (10 March, 3:20pm): The lecture notebook has been updated to fix an incorrectly loaded model that affected the masked language modelling demos. See the full explanation on Slack.

Building on last week’s exploration of word embeddings, we’re now going to shift our attention to the revolutionary transformer architectures that power today’s most advanced AI systems. The lecture will provide the theoretical foundation as well as practical demonstrations of how to use these models for our use case in this course: the analysis of climate policy and climate finance documents.

πŸ“ Session Details

  • Date: Monday, 10 March 2025
  • Time: 10:00 am - 12:00 pm
  • Location: KSW.1.04

📚 Preparation

If you want to come super prepared for the second hour of the lecture, follow these steps:

  1. Continue using your embedding-env from last week’s lab, but update it with the new requirements:

    # Navigate to your environment directory
    cd path/to/parent/directory/of/embedding-env
    
    # If on Mac/Linux
    source embedding-env/bin/activate
    
    # If on Windows
    embedding-env\Scripts\activate
  2. Then, install the new requirements. Replace your requirements.txt with the contents below and run pip install -r requirements.txt from within the activated environment:

# Core data science packages
numpy==1.26.4
pandas==2.2.3
matplotlib==3.10.1
scikit-learn==1.6.1

# NLP and text processing
nltk==3.9.1
gensim==4.3.3
langdetect==1.0.9

# Transformers and deep learning
transformers==4.39.3
datasets==2.18.0
torch==2.2.1

# Visualization
lets-plot==4.6.0

# Utilities
tqdm==4.67.1
ipykernel==6.29.5
ipywidgets==8.1.5

Remember to activate the environment each time you open a new terminal or a notebook.
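If you want to check that everything installed correctly, a quick sanity check is to import the key packages from within the activated environment. This is just a minimal sketch; the expected versions are simply the ones pinned in the requirements.txt above:

    # Run inside the activated embedding-env (in a notebook cell or a Python shell)
    import torch
    import transformers
    import datasets

    print("torch:", torch.__version__)                 # expect 2.2.1
    print("transformers:", transformers.__version__)   # expect 4.39.3
    print("datasets:", datasets.__version__)           # expect 2.18.0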

There’s no reading ahead of this lecture. We will present the materials as we go along.

Things are about to get a bit more complicated.

I recommend that you set aside some time later in the week to go over the lecture materials again to reinforce your understanding.

πŸ—£οΈ Lecture Content

The lecture is organised into two parts:

Hour 1: Neural Foundations of Transformer Models (Barry)

🎞️ Slides

  • Biological Inspiration: Neural networks inspired by brain structure and function
  • Computation Fundamentals: Linear transformations, non-linear activations, and matrix operations (see the sketch after this list)
  • Learning Process: Loss functions, gradients, and the training lifecycle
  • Deep Learning Architecture: From simple perceptrons to complex neural networks
  • Transformer Evolution: How attention mechanisms revolutionised NLP
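To make the computation and learning bullets concrete, here is a minimal PyTorch sketch of a forward pass, a loss calculation and one gradient step. This is not the lecture's code; the layer sizes and data below are made up purely for illustration:

    # A tiny network: linear transformation -> non-linear activation -> linear output.
    # All sizes and data are arbitrary, for illustration only.
    import torch
    import torch.nn as nn

    model = nn.Sequential(
        nn.Linear(4, 8),   # multiplication by a weight matrix, plus a bias
        nn.ReLU(),         # non-linear activation
        nn.Linear(8, 1),
    )

    x = torch.randn(16, 4)   # a batch of 16 made-up input vectors
    y = torch.randn(16, 1)   # made-up targets

    loss_fn = nn.MSELoss()                                    # loss function
    optimiser = torch.optim.SGD(model.parameters(), lr=0.01)

    y_hat = model(x)          # forward pass: matrix operations + activations
    loss = loss_fn(y_hat, y)  # how far the predictions are from the targets
    loss.backward()           # gradients of the loss w.r.t. every parameter
    optimiser.step()          # one step of the training lifecycle

Stacking more of these linear + non-linear layers is essentially what takes us from a single perceptron to a deep network.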

Hour 1 Slides


Hour 2: Transformers for Climate Document Retrieval (Jon)

🖥️ Live Demo

  • From Static to Contextual Embeddings: Understanding how transformers capture word meaning in context (see the sketch after this list)
  • The HuggingFace Ecosystem: Navigating the model hub and using pre-trained models
  • Domain-Specific Models: Comparing general models with specialized ones like ClimateBERT
  • Building Retrieval Systems: Creating embeddings for climate documents and implementing similarity search
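To give a flavour of the first demo, here is a minimal sketch of contextual embeddings with the HuggingFace transformers library. The checkpoint name (distilroberta-base) and the example sentences are my own assumptions for illustration, not necessarily what we will use in the live demo:

    # The same word gets a different vector depending on its context.
    # Model checkpoint and sentences are illustrative assumptions.
    import torch
    from transformers import AutoTokenizer, AutoModel

    model_name = "distilroberta-base"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)

    sentences = [
        "The power plant increased its carbon emissions last year.",
        "The plant absorbs carbon dioxide as it grows.",
    ]

    inputs = tokenizer(sentences, padding=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state   # (batch, tokens, hidden_size)

    # Locate the token for " plant" in each sentence and compare its two vectors
    plant_id = tokenizer.encode(" plant", add_special_tokens=False)[0]
    pos = [(row == plant_id).nonzero(as_tuple=True)[0][0] for row in inputs["input_ids"]]

    similarity = torch.cosine_similarity(hidden[0, pos[0]], hidden[1, pos[1]], dim=0)
    print(f"Cosine similarity of 'plant' across contexts: {similarity.item():.3f}")

With the static word vectors from last week, "plant" would get exactly the same vector in both sentences; with a transformer, the two vectors differ because each token representation depends on the surrounding words.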

These concepts will form the foundation for tomorrow’s lab session, which will cover the key principles behind:

  1. Loading, saving and using pre-trained transformer models from 🤗 HuggingFace 1
  2. Creating contextual embeddings for climate-related text
  3. Comparing general-purpose and domain-specific models (DistilRoBERTa vs. ClimateBERT)
  4. Visualising embeddings and their similarity relationships
  5. Practical applications for climate policy analysis and research (a small retrieval sketch follows this list)
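As a rough preview of items 1, 2 and 5 (the lab and the problem set will go into much more detail), here is a sketch of similarity-based retrieval using mean-pooled transformer embeddings. The ClimateBERT checkpoint name and the snippets below are assumptions for illustration only; double-check the exact model id on the HuggingFace hub:

    # Rank a few made-up document snippets by similarity to a query.
    # Model id and texts are illustrative assumptions only.
    import torch
    from transformers import AutoTokenizer, AutoModel
    from sklearn.metrics.pairwise import cosine_similarity

    model_name = "climatebert/distilroberta-base-climate-f"  # assumed ClimateBERT checkpoint
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)

    def embed(texts):
        """Mean-pool the last hidden states into one vector per text."""
        inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**inputs).last_hidden_state        # (batch, tokens, dim)
        mask = inputs["attention_mask"].unsqueeze(-1)          # ignore padding tokens
        return ((hidden * mask).sum(dim=1) / mask.sum(dim=1)).numpy()

    documents = [
        "Parties shall communicate a nationally determined contribution every five years.",
        "The fund provides grants for climate adaptation projects in developing countries.",
        "Quarterly revenue rose on strong demand for consumer electronics.",
    ]
    query = ["How often must countries submit their NDCs?"]

    scores = cosine_similarity(embed(query), embed(documents))[0]
    for score, doc in sorted(zip(scores, documents), reverse=True):
        print(f"{score:.3f}  {doc}")

The idea is that the climate-related snippets end up ranked above the unrelated one; in the lab we will compare how a general-purpose model and ClimateBERT handle the same documents.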

This will lead up to the upcoming ✍️ Problem Set 2 (to be released very soon), where you will be asked to build a retrieval system and to explain its underlying principles.

🎥 Session Recording

The lecture recording will be available on Moodle by the afternoon of the lecture.

Footnotes

  1. The emoji IS part of the name of the organisation.