# GPT2 Parallelization and porting
This part is dedicated to facilitating the comprehension of the code. The fields of the model structure are listed below; a sketch of how they fit together as a C struct follows the list.

* **GPT2Config config** : Contains all the parameters described just above
* **ParameterTensors params** : Contains the parameters described in the "Model forward parameters" section
* **size_t param_sizes\[NUM_PARAMETER_TENSORS\]** : A list of the size of each parameter array inside "params"
* **float\* params_memory** : A pointer to the first parameter array of "params"
* **size_t num_parameters** : The total number of parameters in our model (124,439,808 in our case)
* **ParameterTensors grads** : Contains all the gradient arrays for backpropagation
* **float\* grads_memory** : A pointer to the first array of "grads"
* **float\* m_memory** : Used by the AdamW optimizer (first-moment buffer); an array of size _num_parameters_
* **float\* v_memory** : Same as _m_memory_, but for the second moment
* **ActivationTensors acts** : Contains the parameters described in the "Variables for backward propagation" section
* **size_t act_sizes\[NUM_ACTIVATION_TENSORS\]** : A list of the size of each activation array inside "acts"
* **float\* acts_memory** : A pointer to the first activation array of "acts"
* **size_t num_activations** : The total number of activation values needed for backward propagation (the sum of _act_sizes_)
* **ActivationTensors grads_acts** : Contains the activation gradients
* **float\* grads_acts_memory** : A pointer to the first array of "grads_acts"
* **int batch_size** : The size of the current batch
* **int seq_len** : The length of the current token sequence
* **int\* inputs** : The token array for the current batch
What is unexpected is that the OpenMP/nOS-V version is slower than the OpenMP version.
![openmpv2](uploads/c01dc0a441e6854fdc4319ec2861be08/openmpv2.png)
![thread-state](uploads/ce8ff83559d65cc90c78d67c4cc32a29/thread-state.png)
![thread-state-legend](uploads/4626cbde639c30905d6b003db5f8d6d6/thread-state-legend.png)
The Paraver trace shows that only nearly half of the threads are working at any given time. A first way to increase efficiency would therefore be to decrease the number of threads used by the application.
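As a quick sanity check of that idea, one can simply cap the OpenMP thread count and re-run the measurement. The snippet below is a minimal, generic sketch: it assumes ordinary OpenMP parallel regions, and the halving factor is only an illustration suggested by the trace, not a tuned value. The same experiment can also be run without recompiling by setting `OMP_NUM_THREADS` before launching the binary.

```c
#include <omp.h>
#include <stdio.h>

int main(void) {
    // Roughly half of the threads were idle in the trace, so as a first
    // experiment try half of the default thread count (at least one thread).
    int n = omp_get_max_threads() / 2;
    if (n < 1) n = 1;
    omp_set_num_threads(n);

    #pragma omp parallel
    {
        #pragma omp single
        printf("running with %d threads\n", omp_get_num_threads());
        // ... the actual forward/backward work would go here ...
    }
    return 0;
}
```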
# Next steps
Here are the next steps I plan to work on:
…