
Distributed Model · Changes

Page history: "Update Distributed Model", authored Aug 08, 2024 by tchatela
Distributed-Model.md @ e844f715
@@ -22,6 +22,7 @@ Dt = 1.0 seconds
 - Broadcast takes about 60 to 230 ms
 - Reduce takes from 180 ms to 380 ms
+- Forward + backward pass takes about 150 ms
 - Time per iteration is 650 ms
 Data transfer is done all at once through one single blocking MPI instruction.
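Doing the transfer "all at once through one single blocking MPI instruction" means every parameter tensor is packed into one contiguous buffer so a single collective call (e.g. `MPI_Bcast` / `MPI_Reduce`) moves all of them, instead of one call per tensor. A minimal NumPy sketch of that packing idea — not the llm.c code, which is C/MPI; the function names here are hypothetical:

```python
# Sketch of packing all parameter tensors into one contiguous buffer so a
# single blocking collective can move them in one call. Hypothetical names;
# the actual llm.c implementation is C and calls MPI directly.
import numpy as np

def pack(params):
    """Flatten a list of parameter arrays into one contiguous buffer."""
    return np.concatenate([p.ravel() for p in params])

def unpack(buffer, params):
    """Copy a received buffer back into the parameter arrays in place."""
    offset = 0
    for p in params:
        n = p.size
        p.ravel()[:] = buffer[offset:offset + n]  # ravel() is a view here
        offset += n

# Toy "model": two weight matrices and a bias vector.
params = [np.ones((2, 3)), np.zeros((3,)), np.full((2, 2), 5.0)]
buf = pack(params)          # one buffer -> one broadcast/reduce call
unpack(buf * 2.0, params)   # pretend the collective returned doubled values
```

One large transfer amortizes per-call latency, which matters when broadcast and reduce already dominate the iteration time as measured above.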
@@ -32,6 +33,7 @@ Dt = 1.0 s
 - Broadcast takes about 95 ms
 - Reduce takes about 380 ms
+- Forward + backward pass takes about 143 ms
 - Time per iteration is 650 ms
 ## Using 8 workers and 1 server
@@ -43,6 +45,7 @@ Dt = 2.5 seconds
 ![base-9-priority](uploads/7193fd34722329b1c593279365ff7ef1/base-9-priority.png)
 - Broadcast takes about 570, 740 or 0 ms
 - Reduce takes about 0.9, 1.0, 1.7 seconds
+- Forward + backward pass takes about 120 to 160 ms
 - Time per iteration is about 2050 ms
@@ -53,4 +56,5 @@ Dt = 2.5 s
 ![base-9-priority.code_legend](uploads/fd7dfb7130bb685aa20d6636fe20d339/base-9-priority.code_legend.png)
 - Broadcast takes about 500 ms
 - Reduce takes about 115 ms
+- Forward + backward pass takes about 100 ms
 - Time per iteration is about 720 ms
\ No newline at end of file
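Because the collectives are blocking, broadcast, forward + backward, and reduce should roughly serialize, so their sum should approximate the measured time per iteration. A quick sanity check on the figures recorded in this change (a sketch; small gaps are plausibly overlap or other per-iteration work):

```python
# Consistency check: with blocking collectives, the three phases should
# roughly sum to the measured iteration time. Figures are the ones this
# wiki change records.
def predicted_iteration_ms(bcast_ms, fwd_bwd_ms, reduce_ms):
    return bcast_ms + fwd_bwd_ms + reduce_ms

# Dt = 1.0 s case: 95 + 143 + 380 = 618 ms vs ~650 ms measured.
dt_1s = predicted_iteration_ms(95, 143, 380)

# Dt = 2.5 s case with priority: 500 + 100 + 115 = 715 ms vs ~720 ms measured.
dt_2_5s = predicted_iteration_ms(500, 100, 115)
```

In both cases the predicted sum lands within a few tens of milliseconds of the measured iteration time, supporting the claim that the blocking transfers, not compute, dominate each iteration.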

GPT2 Parallelization and Porting

  • Model Description
  • Runtime and Performances
  • Improvements
  • Traces
  • Fork Join Model
  • Task Based Model
  • Distributed Model