This wiki's goal is to link the GPT-2 model with its implementation in C.
# Description of the model
## Introduction
The Generative Pre-trained Transformer 2 (GPT-2) is a Large Language Model (LLM) introduced by OpenAI. Its particularity is that it is composed of a stack of many Transformer layers, as shown below.
![Basic structure of GPT-2](https://www.researchgate.net/publication/373352176/figure/fig1/AS:11431281202501967@1698856108167/GPT-2-model-architecture-The-GPT-2-model-contains-N-Transformer-decoder-blocks-as-shown.ppm)
However, the model implemented in [our reference code](https://github.com/karpathy/llm.c) differs from this sketch on a few points:
+ The second residual forward links the output of the multi-head masked attention path, and not the output of the normalization layer.
+ The attention layer's function does not include the two linear layers the sketch suggests; these two projections are computed with separate matmul functions.
Therefore, here is a rectified sketch of the model as implemented:
![Adapted structure of GPT-2](uploads/9055498e3cf0d4059418eab6fa8b1133/gpt2-cleaned.png)
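
To make the two differences above concrete, here is a schematic C-style outline of how one Transformer block is sequenced in the forward pass. Function and buffer names mirror the conventions of the reference code, but the signatures are simplified (for example, the mean/rstd side outputs of the layer norms are omitted), so treat this as a sketch rather than the verbatim implementation.

```c
/* Schematic outline of one Transformer block (simplified, not verbatim).
 * All buffers are pre-allocated float arrays; B = batch size, T = sequence
 * length, C = channels, NH = number of attention heads. */
for (int l = 0; l < L; l++) {
    // pre-attention LayerNorm, then the QKV projection done by a matmul
    layernorm_forward(ln1, residual, ln1w, ln1b, B, T, C);            // (B,T,C)
    matmul_forward(qkv, ln1, qkvw, qkvb, B, T, C, 3 * C);             // (B,T,3C)
    // multi-head masked attention; no linear layers inside this function
    attention_forward(atty, qkv, B, T, C, NH);                        // (B,T,C)
    // the output projection is a separate matmul, not part of attention_forward
    matmul_forward(attproj, atty, attprojw, attprojb, B, T, C, C);    // (B,T,C)
    // first residual: block input + attention projection
    residual_forward(residual2, residual, attproj, B * T * C);
    // pre-MLP LayerNorm, then the feed-forward: two matmuls around a GELU
    layernorm_forward(ln2, residual2, ln2w, ln2b, B, T, C);
    matmul_forward(fch, ln2, fcw, fcb, B, T, C, 4 * C);               // (B,T,4C)
    gelu_forward(fch_gelu, fch, B * T * 4 * C);
    matmul_forward(fcproj, fch_gelu, fcprojw, fcprojb, B, T, 4 * C, C);
    // second residual: it adds onto residual2 (the attention branch output),
    // not onto the normalized activations
    residual_forward(residual3, residual2, fcproj, B * T * C);
    residual = residual3; // becomes the input of the next layer
}
```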
## Tokens
Describe the mapping from tokens to words and from words to tokens.
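
As a minimal placeholder illustration (not the project's actual tokenizer code): a token is just an integer id in `[0, V)`, a vocabulary table maps each id back to a piece of text (decoding), and encoding goes the other way. The real GPT-2 tokenizer uses byte-pair encoding over a 50257-entry vocabulary; the toy table below only shows the id-to-string idea.

```c
#include <stdio.h>
#include <string.h>

// Toy illustration only: decode maps token ids to strings, and searching the
// table maps a string back to its id (encode).
static const char* toy_vocab[] = { "Hello", ",", " world", "!" };
static const int toy_V = 4;

const char* decode_token(int id) {
    return (id >= 0 && id < toy_V) ? toy_vocab[id] : "<unk>";
}

int encode_token(const char* piece) {
    for (int i = 0; i < toy_V; i++)
        if (strcmp(toy_vocab[i], piece) == 0) return i;
    return -1; // not in the vocabulary
}

int main(void) {
    int ids[] = {0, 1, 2, 3};                  // decodes to "Hello, world!"
    for (int i = 0; i < 4; i++) printf("%s", decode_token(ids[i]));
    printf("\n%d\n", encode_token(" world"));  // prints 2
    return 0;
}
```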
## Data formatting
Introduce the main variables (V, NH, L, ...) and describe the matrices. Keep this section short.
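
Until this section is filled in, here is a minimal sketch of the dimension variables that fix all tensor shapes. The names follow the conventions used in the C code (V, L, NH, C, plus B and T at runtime); the exact struct layout below is an approximation, not a copy of the source.

```c
// Hyperparameters that fix the tensor shapes (example values: GPT-2 small).
// Field names follow the conventions of the C code; treat the exact layout
// as an approximation of the real config struct.
typedef struct {
    int max_seq_len; // maximum sequence length the model supports, e.g. 1024
    int vocab_size;  // V:  vocabulary size, e.g. 50257
    int num_layers;  // L:  number of Transformer blocks, e.g. 12
    int num_heads;   // NH: attention heads per block, e.g. 12
    int channels;    // C:  embedding / hidden dimension, e.g. 768
} GPT2Config;

// At runtime two more dimensions appear:
//   B: batch size (number of sequences processed together)
//   T: length of the current sequences (T <= max_seq_len)
// so most activation matrices have shapes like (B, T, C) or (B, T, V).
```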
## Advanced description
Explain the functions in detail: inputs, outputs, and variables.
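
As an example of the level of detail intended here, below is one of the forward kernels, `matmul_forward`, written out in its naive form with the shape of every argument. The signature follows the reference code's conventions as far as we know, but the real kernel is more heavily optimized, so read this as an illustrative sketch.

```c
#include <stddef.h>

// matmul_forward: out[b,t,o] = bias[o] + sum_c inp[b,t,c] * weight[o,c]
// Shapes: inp (B,T,C), weight (OC,C), bias (OC) or NULL, out (B,T,OC).
// Naive reference version for clarity; the optimized kernel tiles and
// parallelizes these loops.
void matmul_forward(float* out, const float* inp, const float* weight,
                    const float* bias, int B, int T, int C, int OC) {
    for (int b = 0; b < B; b++) {
        for (int t = 0; t < T; t++) {
            const float* inp_bt = inp + ((size_t)b * T + t) * C;
            float* out_bt = out + ((size_t)b * T + t) * OC;
            for (int o = 0; o < OC; o++) {
                float val = (bias != NULL) ? bias[o] : 0.0f;
                const float* wrow = weight + (size_t)o * C;
                for (int c = 0; c < C; c++) {
                    val += inp_bt[c] * wrow[c];
                }
                out_bt[o] = val;
            }
        }
    }
}
```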
## Variable dictionary
Dictionary of the model's parameters
# Model performance
## Sequential
Performance of the sequential model
## OpenMP
Performance of the model with OpenMP
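
For context on where the OpenMP speedup comes from: the heavy kernels are loop nests over (batch, time) positions, and the straightforward way to parallelize them is a `parallel for` over those outer loops, which is the approach the reference code takes for its matmuls. The function below is the naive matmul from the "Advanced description" section with the pragma added; the exact placement and clauses are a sketch of the approach, not a guaranteed copy of the source.

```c
#include <stddef.h>

// Same naive matmul as above, with the batch and time loops shared across
// OpenMP threads. Build with -fopenmp; without it the pragma is ignored
// and the function runs sequentially.
void matmul_forward_omp(float* out, const float* inp, const float* weight,
                        const float* bias, int B, int T, int C, int OC) {
    #pragma omp parallel for collapse(2)
    for (int b = 0; b < B; b++) {
        for (int t = 0; t < T; t++) {
            const float* inp_bt = inp + ((size_t)b * T + t) * C;
            float* out_bt = out + ((size_t)b * T + t) * OC;
            for (int o = 0; o < OC; o++) {
                float val = (bias != NULL) ? bias[o] : 0.0f;
                const float* wrow = weight + (size_t)o * C;
                for (int c = 0; c < C; c++) {
                    val += inp_bt[c] * wrow[c];
                }
                out_bt[o] = val;
            }
        }
    }
}
```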
## OpenMP/n-OS-V
Performance of the model with OpenMP/n-OS-V