
Distributed Model · Changes

Update Distributed Model · authored Aug 08, 2024 by tchatela
Distributed-Model.md
View page @ 973c2e72
...@@ -41,10 +41,16 @@ Dt = 1.0 s
Data transfer is done through tasks using high priority and TAMPI non-blocking mode.
Dt = 2.5 s
![base-9-priority](uploads/7193fd34722329b1c593279365ff7ef1/base-9-priority.png)
- Broadcast takes about 570 ms, 740 ms, or 0 ms
- Reduce takes about 0.9, 1.0, or 1.7 s
- Time per iteration is about
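The task-based, high-priority, non-blocking variant described above can be sketched roughly as follows. This is a minimal sketch, assuming an OmpSs-2 toolchain with TAMPI; the `#pragma oss task` syntax, the `priority` clause, and the `TAMPI_Iwait` call follow TAMPI's non-blocking mode, and the function and buffer names are illustrative, not taken from the repository:

```c
#include <mpi.h>
#include <TAMPI.h>

/* Sketch: broadcast one parameter shard inside a high-priority task.
 * In TAMPI's non-blocking mode, completion of the request is tied to
 * the task, so the worker thread is not pinned while the collective
 * is in flight and other tasks keep running. */
void broadcast_shard(float *params, int count, int root)
{
    #pragma oss task out(params[0;count]) priority(1)
    {
        MPI_Request req;
        MPI_Ibcast(params, count, MPI_FLOAT, root, MPI_COMM_WORLD, &req);
        TAMPI_Iwait(&req, MPI_STATUS_IGNORE); /* released when done */
    }
}
```

Issuing one such task per shard is what allows the broadcasts above to overlap with compute, which would explain why some of them cost close to 0 ms.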
Data transfer is done all at once through a single blocking MPI call.
Dt = 2.5 s
![all-nine](uploads/2d471a1e2cf8dc0bf30f957a4554d9e5/all-nine.png)
![base-9-priority.code_legend](uploads/fd7dfb7130bb685aa20d6636fe20d339/base-9-priority.code_legend.png)
- Broadcast takes about 500 ms
- Reduce takes about 115 ms
- Time per iteration is about
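For comparison, the single blocking transfer could look like this minimal MPI sketch. The function and buffer names, and the choice of `MPI_Bcast` plus an in-place `MPI_Allreduce` for the reduction, are assumptions for illustration, not the repository's actual code:

```c
#include <mpi.h>

/* Sketch: one blocking collective per phase; every rank waits here,
 * so communication cannot overlap with compute. */
void sync_model(float *params, float *grads, int n, int root)
{
    /* broadcast updated parameters from the root rank */
    MPI_Bcast(params, n, MPI_FLOAT, root, MPI_COMM_WORLD);
    /* sum gradients across all ranks, in place on every rank */
    MPI_Allreduce(MPI_IN_PLACE, grads, n, MPI_FLOAT, MPI_SUM,
                  MPI_COMM_WORLD);
}
```

The trade-off measured above is that each individual collective is cheap, but all ranks stall at these calls instead of hiding the transfer behind other tasks.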