πŸ—£οΈ Week 08 Lecture

From Word Vectors to Transformers

Author

Dr Barry Ledeatte and Dr Jon Cardoso-Silva

Published

10 March 2025


📢 Important Update (10 March, 3:20pm): The lecture notebook has been updated to fix an incorrectly loaded model that affected the masked language modelling demos. See the full explanation on Slack.

Building on last week’s exploration of word embeddings, we’re now going to shift our attention to the revolutionary transformer architectures that power today’s most advanced AI systems. The lecture will provide the theoretical foundation as well as practical demonstrations of how to use these models for our use case in this course: the analysis of climate policy and climate finance documents.

πŸ“ Session Details

  • Date: Monday, 10 March 2025
  • Time: 10:00 am - 12:00 pm
  • Location: KSW.1.04

📚 Preparation

If you want to come super prepared for the second hour of the lecture, follow these steps:

  1. Continue using your embedding-env from last week’s lab, but update it with the new requirements:

    # Navigate to your environment directory
    cd path/to/parent/directory/of/embedding-env
    
    # If on Mac/Linux
    source embedding-env/bin/activate
    
    # If on Windows
    embedding-env\Scripts\activate
  2. Then, install the new requirements. Replace your requirements.txt with the contents below and run pip install -r requirements.txt from within the activated environment:

# Core data science packages
numpy==1.26.4
pandas==2.2.3
matplotlib==3.10.1
scikit-learn==1.6.1

# NLP and text processing
nltk==3.9.1
gensim==4.3.3
langdetect==1.0.9

# Transformers and deep learning
transformers==4.39.3
datasets==2.18.0
torch==2.2.1

# Visualization
lets-plot==4.6.0

# Utilities
tqdm==4.67.1
ipykernel==6.29.5
ipywidgets==8.1.5

Remember to activate the environment each time you open a new terminal or a notebook.
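If you want to check that everything installed correctly, a quick sanity check is to import the key packages from within the activated environment. This is just a minimal sketch; the expected versions are simply the ones pinned in the requirements.txt above:

    # Run inside the activated embedding-env (in a notebook cell or a Python shell)
    import torch
    import transformers
    import datasets

    print("torch:", torch.__version__)                 # expect 2.2.1
    print("transformers:", transformers.__version__)   # expect 4.39.3
    print("datasets:", datasets.__version__)           # expect 2.18.0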

There’s no reading ahead of this lecture. We will present the materials as we go along.

Things are about to get a bit more complicated.

I recommend that you set aside some time later in the week to go over the lecture materials again to reinforce your understanding.

πŸ—£οΈ Lecture Content

The lecture is organised into two parts:

Hour 1: Neural Foundations of Transformer Models (Barry)

🎞️ Slides

  • Biological Inspiration: Neural networks inspired by brain structure and function
  • Computation Fundamentals: Linear transformations, non-linear activations, and matrix operations (see the sketch after this list)
  • Learning Process: Loss functions, gradients, and the training lifecycle
  • Deep Learning Architecture: From simple perceptrons to complex neural networks
  • Transformer Evolution: How attention mechanisms revolutionised NLP
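To make the computation and learning bullets concrete, here is a minimal PyTorch sketch of a forward pass, a loss calculation and one gradient step. This is not the lecture's code; the layer sizes and data below are made up purely for illustration:

    # A tiny network: linear transformation -> non-linear activation -> linear output.
    # All sizes and data are arbitrary, for illustration only.
    import torch
    import torch.nn as nn

    model = nn.Sequential(
        nn.Linear(4, 8),   # multiplication by a weight matrix, plus a bias
        nn.ReLU(),         # non-linear activation
        nn.Linear(8, 1),
    )

    x = torch.randn(16, 4)   # a batch of 16 made-up input vectors
    y = torch.randn(16, 1)   # made-up targets

    loss_fn = nn.MSELoss()                                    # loss function
    optimiser = torch.optim.SGD(model.parameters(), lr=0.01)

    y_hat = model(x)          # forward pass: matrix operations + activations
    loss = loss_fn(y_hat, y)  # how far the predictions are from the targets
    loss.backward()           # gradients of the loss w.r.t. every parameter
    optimiser.step()          # one step of the training lifecycle

Stacking more of these linear + non-linear layers is essentially what takes us from a single perceptron to a deep network.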

Hour 1 Slides


Hour 2: Transformers for Climate Document Retrieval (Jon)

🖥️ Live Demo

  • From Static to Contextual Embeddings: Understanding how transformers capture word meaning in context (see the sketch after this list)
  • The HuggingFace Ecosystem: Navigating the model hub and using pre-trained models
  • Domain-Specific Models: Comparing general models with specialized ones like ClimateBERT
  • Building Retrieval Systems: Creating embeddings for climate documents and implementing similarity search
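To give a flavour of the first demo, here is a minimal sketch of contextual embeddings with the HuggingFace transformers library. The checkpoint name (distilroberta-base) and the example sentences are my own assumptions for illustration, not necessarily what we will use in the live demo:

    # The same word gets a different vector depending on its context.
    # Model checkpoint and sentences are illustrative assumptions.
    import torch
    from transformers import AutoTokenizer, AutoModel

    model_name = "distilroberta-base"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)

    sentences = [
        "The power plant increased its carbon emissions last year.",
        "The plant absorbs carbon dioxide as it grows.",
    ]

    inputs = tokenizer(sentences, padding=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state   # (batch, tokens, hidden_size)

    # Locate the token for " plant" in each sentence and compare its two vectors
    plant_id = tokenizer.encode(" plant", add_special_tokens=False)[0]
    pos = [(row == plant_id).nonzero(as_tuple=True)[0][0] for row in inputs["input_ids"]]

    similarity = torch.cosine_similarity(hidden[0, pos[0]], hidden[1, pos[1]], dim=0)
    print(f"Cosine similarity of 'plant' across contexts: {similarity.item():.3f}")

With the static word vectors from last week, "plant" would get exactly the same vector in both sentences; with a transformer, the two vectors differ because each token representation depends on the surrounding words.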

These concepts will form the foundation for tomorrow’s lab session, which will cover the key principles behind:

  1. Loading, saving and using pre-trained transformer models from 🤗 HuggingFace 1
  2. Creating contextual embeddings for climate-related text
  3. Comparing general-purpose and domain-specific models (DistilRoBERTa vs. ClimateBERT)
  4. Visualising embeddings and their similarity relationships
  5. Practical applications for climate policy analysis and research (a small retrieval sketch follows this list)
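As a rough preview of items 1, 2 and 5 (the lab and the problem set will go into much more detail), here is a sketch of similarity-based retrieval using mean-pooled transformer embeddings. The ClimateBERT checkpoint name and the snippets below are assumptions for illustration only; double-check the exact model id on the HuggingFace hub:

    # Rank a few made-up document snippets by similarity to a query.
    # Model id and texts are illustrative assumptions only.
    import torch
    from transformers import AutoTokenizer, AutoModel
    from sklearn.metrics.pairwise import cosine_similarity

    model_name = "climatebert/distilroberta-base-climate-f"  # assumed ClimateBERT checkpoint
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)

    def embed(texts):
        """Mean-pool the last hidden states into one vector per text."""
        inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**inputs).last_hidden_state        # (batch, tokens, dim)
        mask = inputs["attention_mask"].unsqueeze(-1)          # ignore padding tokens
        return ((hidden * mask).sum(dim=1) / mask.sum(dim=1)).numpy()

    documents = [
        "Parties shall communicate a nationally determined contribution every five years.",
        "The fund provides grants for climate adaptation projects in developing countries.",
        "Quarterly revenue rose on strong demand for consumer electronics.",
    ]
    query = ["How often must countries submit their NDCs?"]

    scores = cosine_similarity(embed(query), embed(documents))[0]
    for score, doc in sorted(zip(scores, documents), reverse=True):
        print(f"{score:.3f}  {doc}")

The idea is that the climate-related snippets end up ranked above the unrelated one; in the lab we will compare how a general-purpose model and ClimateBERT handle the same documents.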

This will lead up to the upcoming ✍️ Problem Set 2 (to be released very soon), where you will be asked to build a retrieval system and to explain its underlying principles.

🎥 Session Recording

The lecture recording will be available on Moodle by the afternoon of the lecture.

Footnotes

  1. The emoji IS part of the name of the organisation.