|
|
|
|
|
* **GPT2Config config**: Contains all the parameters described just above

* **ParameterTensors params**: Contains the parameters described in the "Model forward parameters" section

* **size_t param_sizes\[NUM_PARAMETER_TENSORS\]**: The size of each parameter tensor inside _params_

* **float\* params_memory**: A pointer to the memory block backing the parameter tensors of _params_

* **size_t num_parameters**: The total number of parameters in our model (124,439,808 in our case)

* **ParameterTensors grads**: Contains the gradient tensors used for backpropagation

* **float\* grads_memory**: A pointer to the memory block backing _grads_

* **float\* m_memory**: The first-moment buffer of the AdamW optimizer; an array of size _num_parameters_

* **float\* v_memory**: The second-moment buffer of AdamW, same size as _m_memory_

* **ActivationTensors acts**: Contains the activations described in the "Variables for backward propagation" section

* **size_t act_sizes\[NUM_ACTIVATION_TENSORS\]**: The size of each activation tensor inside _acts_

* **float\* acts_memory**: A pointer to the memory block backing _acts_

* **size_t num_activations**: The total number of activation values needed for backward propagation (the sum of _act_sizes_)

* **ActivationTensors grads_acts**: Contains the activation gradients

* **float\* grads_acts_memory**: A pointer to the memory block backing _grads_acts_

* **int batch_size**: The size of the current batch

* **int seq_len**: The length of the current token sequence

* **int\* inputs**: The token array for the current batch
|
What is unexpected is that the OpenMP/nOS-V version is slower than the OpenMP version.
|
|
|
|
|

|
|

|
|
|
|
|
|
|
|

|
|
|
|
|
|
|
|

|
|
|
|
|
|
|
|
The Paraver trace shows that only about half of the threads are working simultaneously. A first step to increase efficiency would be to decrease the number of threads used by the application.
|
|
|
|
|
|
# Next steps
|
|
|
|
|
|
|
|
Here are the next steps I plan to work on:
|