
Task Based Model · Changes

Update Task Based Model authored Sep 20, 2024 by tchatela
Task-Based-Model.md
View page @ 32cf9d05
@@ -68,5 +68,13 @@ This way, we have transformed the outer loops and all collapsed inner loops into
NB: Currently, the collapse depth and grainsize can be set in _ompss-2_wrappers.h_. However, this grainsize is less powerful than a native OmpSs-2 grainsize, as it applies only to the innermost flattened loop.
# Implementation
To implement these task definitions, we created an abstract dependency-handler structure. It must be initialized with B, T, and the array of dependency sizes. You then define how many tokens each dependency should cover. This value can be changed multiple times during the model's computation, so you can set a different 'outer grainsize' for each layer (the current implementation does not use this, but it could be leveraged).
Then, for each call to a layer, you create a wrapper, which you declare in ompss-2_settings and implement with a simple macro at the end of the ompss-2_wrappers.c file. This makes it very easy to define new layers and to change the grainsize, length, and collapse depth of existing layers.
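To make the description of the handler more concrete, here is a minimal sketch in C. Every name in it (`dep_handler_t`, `dep_handler_init`, `dep_handler_set_tokens`, `MAX_DEPS`, and so on) is hypothetical and not taken from the repository, and it assumes that each entry of the dependency-size array is the number of elements per token of one tracked array.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPS 16 /* hypothetical upper bound on tracked arrays */

/* Hypothetical dependency handler; field and function names are illustrative. */
typedef struct {
    int B;                      /* batch size */
    int T;                      /* sequence length */
    size_t n_deps;              /* number of tracked arrays */
    size_t dep_sizes[MAX_DEPS]; /* assumed: elements per token, per array */
    int tokens_per_dep;         /* tokens covered by one dependency */
} dep_handler_t;

/* Initialize with B, T and the array of dependency sizes. */
void dep_handler_init(dep_handler_t *h, int B, int T,
                      const size_t *dep_sizes, size_t n_deps) {
    assert(n_deps <= MAX_DEPS);
    h->B = B;
    h->T = T;
    h->n_deps = n_deps;
    for (size_t i = 0; i < n_deps; i++) h->dep_sizes[i] = dep_sizes[i];
    h->tokens_per_dep = 1; /* finest granularity by default */
}

/* May be called several times, e.g. once before each layer. */
void dep_handler_set_tokens(dep_handler_t *h, int tokens_per_dep) {
    assert(tokens_per_dep > 0);
    h->tokens_per_dep = tokens_per_dep;
}

/* Number of dependency blocks needed to cover B*T tokens (rounded up). */
int dep_handler_num_blocks(const dep_handler_t *h) {
    int tokens = h->B * h->T;
    return (tokens + h->tokens_per_dep - 1) / h->tokens_per_dep;
}

/* Offset, in elements, of block `block` inside tracked array `dep`. */
size_t dep_handler_block_offset(const dep_handler_t *h, size_t dep, int block) {
    return (size_t)block * (size_t)h->tokens_per_dep * h->dep_sizes[dep];
}
```

In this sketch, calling `dep_handler_set_tokens` between two layers is what would give each layer its own 'outer grainsize'.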
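The wrapper-by-macro idea could look roughly like the sketch below. It is not the macro from ompss-2_wrappers.c: `DEFINE_LAYER_WRAPPER`, `scale_kernel`, `scale_layer`, and the kernel signature are all hypothetical, and the OmpSs-2 task annotation is left out (marked by a comment) so that the sketch compiles with a plain C compiler.

```c
#include <assert.h>

/* Hypothetical macro: generates a wrapper that splits the B*T tokens into
   blocks of `tokens_per_block` and calls `kernel` once per block. In a
   task-based version, each call would be annotated as one task. */
#define DEFINE_LAYER_WRAPPER(name, kernel)                                   \
    void name(float *out, const float *in, int B, int T, int C,              \
              int tokens_per_block) {                                         \
        int tokens = B * T;                                                   \
        for (int start = 0; start < tokens; start += tokens_per_block) {      \
            int n = tokens - start < tokens_per_block ? tokens - start        \
                                                      : tokens_per_block;     \
            /* one task per block would be created here */                    \
            kernel(out + (long)start * C, in + (long)start * C, n, C);        \
        }                                                                     \
    }

/* Toy stand-in for a layer kernel: doubles every element of its block. */
static void scale_kernel(float *out, const float *in, int n_tokens, int C) {
    for (int i = 0; i < n_tokens * C; i++) out[i] = 2.0f * in[i];
}

/* Defining a new layer wrapper then takes a single line. */
DEFINE_LAYER_WRAPPER(scale_layer, scale_kernel)
```

With this shape, changing the granularity of a layer only means passing a different `tokens_per_block` to its wrapper; the kernel itself is untouched.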
# Performances
The taskiter version performs better than the nested-tasks one: we measure a speedup of approximately 1.15, although runtimes can vary widely from one execution to another and sometimes seem to depend on how the model starts.
Moreover, although taskiter is more efficient, if `B*T` is too high it takes seconds or even minutes to compute the cyclic task graph. For example, with B=8 and T=1024, taskiter takes 54 seconds to compute the graph. One way to reduce this time would be to increase the taskloop grainsize defined in ompss-2_settings.h.
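As a rough illustration of why a larger grainsize should help, assume (this is an assumption, not something measured here) that the number of task blocks per layer in the graph is the number of tokens divided by a hypothetical grainsize $g$ expressed in tokens per task:

```math
N_{\text{blocks}} = \left\lceil \frac{B \cdot T}{g} \right\rceil,
\qquad B = 8,\ T = 1024:\quad
g = 1 \Rightarrow N_{\text{blocks}} = 8192,\quad
g = 64 \Rightarrow N_{\text{blocks}} = 128
```

Under that assumption, raising the grainsize from 1 to 64 tokens would shrink the per-layer block count by a factor of 64, at the cost of coarser parallelism.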