tchatela · c80d329e
--- a/GPT2-Parallelization-and-porting.md
+++ b/GPT2-Parallelization-and-porting.md
@@ -81,9 +81,12 @@ In order to calculate our gradients for our backpropagation, we have to keep an
 - **encoded** : Output of the positional_encoding layer (B, T, C)
 - **ln1** : Output of the first layernorm inside the transformer block (L, B, T, C)
 - **ln1_mean** : 
- **ln1_mean** : 
- **ln1_mean** : 
- **ln1_mean** : 
+- **ln1_rstd** : 
+- **qkv** : 
+- **atty** : 
+- **preatt** : 
+- **att** : 
+- **** : 
 - **ln1_mean** :