Skip to content
GitLab
Projects Groups Topics Snippets
  • /
  • Help
    • Help
    • Support
    • Community forum
    • Submit feedback
  • Register
  • Sign in
  • L llm.c - GPT2
  • Project information
    • Project information
    • Activity
    • Labels
    • Members
  • Repository
    • Repository
    • Files
    • Commits
    • Branches
    • Tags
    • Contributor statistics
    • Graph
    • Compare revisions
  • Issues 4
    • Issues 4
    • List
    • Boards
    • Service Desk
    • Milestones
  • Merge requests 0
    • Merge requests 0
  • CI/CD
    • CI/CD
    • Pipelines
    • Jobs
    • Schedules
  • Deployments
    • Deployments
    • Environments
    • Releases
  • Packages and registries
    • Packages and registries
    • Package Registry
    • Container Registry
    • Terraform modules
  • Monitor
    • Monitor
    • Metrics
    • Incidents
  • Analytics
    • Analytics
    • Value stream
    • CI/CD
    • Repository
  • Wiki
    • Wiki
  • Snippets
    • Snippets
  • Activity
  • Graph
  • Create a new issue
  • Jobs
  • Commits
  • Issue Boards
Collapse sidebar
  • tchatela
  • llm.c - GPT2
  • Wiki
  • Task Based Model

Task Based Model · Changes

Page history
Update Task Based Model authored Jul 19, 2024 by tchatela's avatar tchatela
Hide whitespace changes
Inline Side-by-side
Task-Based-Model.md
View page @ c1479213
......@@ -40,6 +40,15 @@ To begin with, we will put a taskwait between each call of the main training loo
![Task_based_implementation](uploads/f9062c74a52ed79789ea0d4022624355/Task_based_implementation.png)
Which gives us these paraver traces :
![task-based-forward-pass](uploads/68803df82325ddebb8fba7165cec9896/task-based-forward-pass.png)
![task-based-forward-pass.code_legend](uploads/405e6082260df84f652c7fc323b83312/task-based-forward-pass.code_legend.png)
![task-based-backward-pass](uploads/36c076443d041b488c57a9deb96a0005/task-based-backward-pass.png)
![task-based-backward-pass.code_legend](uploads/b60c10db490f7110f22830a448929411/task-based-backward-pass.code_legend.png)
This lead us to the following performances :
Here we tried to slice according to the token sequence only (BATCH_SUBSIZE=4), or according to the sequence token and the sequence batch (BATCH_SUBSIZE=1). The slope obtained for the second version seems better so we will keep it for the following tests.
......
Clone repository

GPT2 Parallelization and Porting

  • Model Description
  • Runtime and Performances
  • Improvements
  • Traces
  • Fork Join Model
  • Task Based Model