Jun 12, 2026
dL/d(LLM): The Full Backward Pass
A capstone walkthrough of the LLM backward pass, tracing gradients from the loss through the LM head, final norm, decoder layers (attention, FFN, residual splits), and the embedding scatter-add, then covering the training loop, AdamW, and C-Kernel-...