This wiki's goal is to link the GPT-2 model with its implementation in C.

# Description of the model
## Introduction
The Generative Pre-trained Transformer 2 (GPT-2) is a Large Language Model (LLM) introduced by OpenAI. Its distinguishing feature is that it is built from a stack of Transformer layers, as shown in the sketch below.



However, the model implemented in [our reference code](https://github.com/karpathy/llm.c) differs from this sketch on two points:

- The second residual forward links the output of the masked multi-head attention, not the output of the normalization layer.
- The attention function does not include the two linear layers that the sketch suggests; these two layers are computed with separate matmul calls (see the forward-pass outline below).
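
To make these two points concrete, here is a simplified outline of the per-layer forward pass as it is organized in the reference code. The function and buffer names are illustrative and do not reproduce the exact llm.c signatures (buffer offsets and dimension arguments are omitted); only the ordering of the calls matters here.

```c
// Simplified per-layer forward pass (illustrative names, not the exact llm.c API).
for (int l = 0; l < L; l++) {
    layernorm_forward(ln1, residual, ln1w, ln1b);        // first LayerNorm
    matmul_forward(qkv, ln1, qkvw, qkvb);                // linear projection to Q, K, V
    attention_forward(atty, qkv);                        // masked multi-head attention only
    matmul_forward(attproj, atty, attprojw, attprojb);   // output projection, done outside the attention function
    residual_forward(residual2, residual, attproj);      // first residual connection
    layernorm_forward(ln2, residual2, ln2w, ln2b);       // second LayerNorm
    matmul_forward(fch, ln2, fcw, fcb);                  // MLP expansion
    gelu_forward(fch_gelu, fch);                         // GELU non-linearity
    matmul_forward(fcproj, fch_gelu, fcprojw, fcprojb);  // MLP projection back to C channels
    residual_forward(residual3, residual2, fcproj);      // second residual: adds residual2, not the ln2 output
    residual = residual3;                                // becomes the input of the next layer
}
```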
Therefore, here is a rectified sketch of the model implemented:

![gpt2-sketch_rectified](uploads/2e02a9af6a71fb80bb77b1c21c6b4356/gpt2-sketch_rectified.png)
## Tokens
Describe the token -> words and words -> tokens mappings.
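
As a minimal illustration of this mapping, the sketch below decodes a sequence of token ids back into text using a hypothetical four-entry vocabulary; the real GPT-2 tokenizer uses byte-pair encoding over a vocabulary of 50257 tokens.

```c
#include <stdio.h>

// Decoding a token sequence: a token is just an integer id, and the
// vocabulary maps every id back to a piece of text. The vocabulary below
// is a toy example, not the real GPT-2 vocabulary.
int main(void) {
    const char *vocab[] = { "Hello", ",", " world", "!" };  // hypothetical vocabulary
    int tokens[] = { 0, 1, 2, 3 };                          // an encoded sequence of token ids
    int n = sizeof(tokens) / sizeof(tokens[0]);

    for (int i = 0; i < n; i++) {
        printf("%s", vocab[tokens[i]]);                     // decode: id -> string
    }
    printf("\n");                                           // prints "Hello, world!"
    return 0;
}
```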
## Data formatting
Introduce the main variables (V, NH, L, ...).

Describe the matrices.

Keep it short.
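
As a sketch of the variables this section will introduce, the hypothetical struct below groups them in one place. The field names are illustrative and may not match the reference code; the values in the comments are those of the smallest GPT-2 model (about 124M parameters).

```c
// Illustrative configuration struct; names and layout may differ from the reference code.
typedef struct {
    int V;     // vocabulary size (50257)
    int maxT;  // maximum sequence length (1024)
    int L;     // number of Transformer layers (12)
    int NH;    // number of attention heads (12)
    int C;     // channels, i.e. the embedding dimension (768)
} GPT2Config;
```

With these variables, the token embedding matrix has shape (V, C) and the position embedding matrix has shape (maxT, C).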
## Advanced description
Explain the functions in detail: inputs, outputs, and variables.
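
As an example of the level of detail intended here, below is a simplified LayerNorm forward pass documented with its inputs and outputs. This is a sketch: the corresponding function in the reference code also keeps per-position statistics around for the backward pass, which is omitted here.

```c
#include <math.h>

// layernorm_forward: normalizes each (b, t) position over its C channels,
// then applies a learned scale (weight) and shift (bias).
//   inp    : input activations, shape (B, T, C)
//   weight : scale gamma, shape (C)
//   bias   : shift beta, shape (C)
//   out    : output activations, shape (B, T, C)
void layernorm_forward(float *out, const float *inp, const float *weight,
                       const float *bias, int B, int T, int C) {
    const float eps = 1e-5f;
    for (int b = 0; b < B; b++) {
        for (int t = 0; t < T; t++) {
            const float *x = inp + (b * T + t) * C;
            float *o = out + (b * T + t) * C;
            float mean = 0.0f;                 // mean over the channels
            for (int i = 0; i < C; i++) { mean += x[i]; }
            mean /= C;
            float var = 0.0f;                  // variance over the channels
            for (int i = 0; i < C; i++) { float d = x[i] - mean; var += d * d; }
            var /= C;
            float rstd = 1.0f / sqrtf(var + eps);
            for (int i = 0; i < C; i++) {      // normalize, then scale and shift
                o[i] = (x[i] - mean) * rstd * weight[i] + bias[i];
            }
        }
    }
}
```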
## Variable dictionary
Dictionary of the model's parameters
# Model performance
## Sequential
Performance of the sequential model
## OpenMP
Performance of the model with OpenMP
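
Most of the compute sits in matmul-like loops whose iterations are independent, so they can be distributed over threads with OpenMP pragmas. Below is a simplified sketch of the idea (not the exact kernel of the reference code); compile with `-fopenmp`.

```c
#include <stddef.h>

// Simplified matrix multiplication: out (B*T, OC) = inp (B*T, C) x weight^T + bias,
// with weight stored as (OC, C). Every (bt, o) output element is independent,
// so the two outer loops are shared among threads with a single pragma.
void matmul_forward_omp(float *out, const float *inp, const float *weight,
                        const float *bias, int B, int T, int C, int OC) {
    #pragma omp parallel for collapse(2)
    for (int bt = 0; bt < B * T; bt++) {
        for (int o = 0; o < OC; o++) {
            float val = (bias != NULL) ? bias[o] : 0.0f;
            for (int i = 0; i < C; i++) {
                val += inp[bt * C + i] * weight[o * C + i];
            }
            out[bt * OC + o] = val;
        }
    }
}
```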
## OpenMP/n-OS-V
Performance of the model with OpenMP/n-OS-V