... | ... | @@ -32,4 +32,5 @@ The particularity of this given order is that we don't have to fully complete a |
|
|
However, keep in mind that the backward pass mirrors the forward pass, so the first backward layers create data dependencies (the weight updates) towards the last forward layer.
|
|
|
|
|
|
|
|
|
With everything stated so far, we can now draw the following data flow diagram: |
|
|
\ No newline at end of file |
|
|
With everything stated so far, we can now draw the following data flow diagram:
|
|
|
![GPT-2_task_based_model](uploads/567b427c27a031be8a062ba81b93fd60/GPT-2_task_based_model.png) |
|
|
\ No newline at end of file |