An interactive guide to LLMs
Step by Token
Understanding how large language models work, one interactive visualization at a time.
Table of contents
21 chapters · 189 min

- 01 · Predicting one word at a time (6 min)
What is a language model? Why predicting the next word is enough to make intelligence emerge.
- 02 · From text to tokens (8 min)
How text becomes numbers. BPE, subwords, and why LLMs struggle to count letters.
- 03 · The space of meaning (10 min)
Words in a geometric space. King − Man + Woman = Queen, and other vector miracles.
- 04 · Attention is all you need (12 min)
The mechanism that changes everything. How each token looks at all others to understand context.
- 05 · The Transformer, in full (14 min)
Putting the pieces together: multi-head attention, feed-forward, normalization, residual connections.
- 06 · How it learns (10 min)
Loss, gradient descent, backpropagation. And why billions of parameters are needed.
- 07 · Choosing the next word (7 min)
Temperature, top-k, top-p. The art of turning a probability distribution into text.
- 08 · From raw model to assistant (9 min)
Fine-tuning, RLHF, Constitutional AI. How we make an LLM useful and harmless.
- 09 · What the model remembers (8 min)
The context window: perfect but bounded memory. Why ChatGPT forgets and what it costs.
- 10 · Reading your documents (9 min)
How an LLM accesses thousands of pages without memorizing them. Embeddings, semantic search, injected context.
- 11 · From model that replies to model that acts (10 min)
Tool use, ReAct loop, multi-step tasks. How an LLM becomes an agent capable of acting in the world.
- 12 · The art of talking to an LLM (8 min)
Zero-shot, few-shot, chain-of-thought, self-consistency. Why prompt wording radically changes what a model produces.
- 13 · Why LLMs make things up (9 min)
Calibration, confident falsehoods, countermeasures. The structural mechanism behind the most common criticism — and what we can actually do about it.
- 14 · Specializing a model without retraining everything (9 min)
LoRA, QLoRA, SFT. How to adapt a generalist model to a specific domain by training only about 0.1% of its parameters.
- 15 · When the model reads images (8 min)
Patch embedding, ViT, CLIP. How a text Transformer becomes multimodal by treating an image as a grid of tokens.
- 16 · How do we know a model is better? (8 min)
MMLU, HumanEval, LMSYS Arena. Why measuring LLM intelligence is hard — and why no single benchmark is enough.
- 17 · Think before you answer (9 min)
Thinking tokens, extended reasoning, thinking budgets. How o1/o3-class models generate a hidden chain of thought before responding.
- 18 · Why the 2nd token is faster than the 1st (8 min)
The KV cache and autoregressive generation. Prefill vs decode, TTFT, and why the cache changes everything.
- 19 · Bigger, always better? (9 min)
Kaplan and Chinchilla scaling laws. Why GPT-3 was undertrained, and the optimal 20-tokens-per-parameter ratio.
- 20 · What's really going on inside? (9 min)
Circuits, polysemantic neurons, Sparse Autoencoders. How Anthropic and DeepMind are opening the black box.
- 21 · Generate an image by erasing noise (9 min)
Stable Diffusion, DALL-E, Midjourney. The reverse denoising process, the role of CLIP, and why U-Net is giving way to Transformers.