Jun 4, 2026
Matrix Wx+b: From Scalars To Transformers
A step-by-step explanation of how scalar wx+b becomes matrix Wx+b, how shape rules govern transformer layers, and why GEMM is the physical kernel underneath modern models.