With everything that has been stated, we can now create the following data flow:
# First implementation
To begin with, we will put a taskwait between each call of the main training loop; the rest of the program, however, will be task-based, as shown by the following sketch:
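A minimal sketch of one such iteration, assuming OpenMP-style tasks with data dependencies; the kernel names `layer_forward`/`layer_backward` and the layer count are illustrative placeholders, not the actual implementation:

```c
#define N_LAYERS 3

/* Hypothetical per-layer kernels; the real forward/backward
 * computations of the network are assumed, not shown here. */
static void layer_forward(int l, float *acts)   { acts[l] += 1.0f; }
static void layer_backward(int l, float *grads) { grads[l] += 1.0f; }

/* One iteration of the main training loop: each layer becomes a task,
 * data dependencies order each backward task after its forward task,
 * and a taskwait closes the iteration, as described in the text. */
void train_step(float acts[N_LAYERS], float grads[N_LAYERS]) {
    for (int l = 0; l < N_LAYERS; l++) {
        #pragma omp task depend(inout: acts[l])
        layer_forward(l, acts);
    }
    for (int l = N_LAYERS - 1; l >= 0; l--) {
        #pragma omp task depend(in: acts[l]) depend(inout: grads[l])
        layer_backward(l, grads);
    }
    /* taskwait between each call of the main training loop */
    #pragma omp taskwait
}
```

The taskwait drains all tasks of the iteration before the next one starts, which is what later lets the passes be timed in isolation.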
This leads us to the following performance results:
Here we tried slicing along the token sequence only (BATCH_SUBSIZE=4), or along both the token sequence and the batch dimension (BATCH_SUBSIZE=1). The slope obtained for the second version seems better, so we will keep it for the following tests.
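The second strategy can be sketched as a pair of tiling loops with one task per tile; the sizes `B` and `T`, the kernel, and the element counting are illustrative assumptions, with only `BATCH_SUBSIZE` mirroring the parameter named above:

```c
#define B 8              /* batch size (assumed value) */
#define T 16             /* token sequence length (assumed value) */
#define BATCH_SUBSIZE 1  /* 1 => slice over both batch and tokens */

/* Hypothetical kernel operating on one (batch slice, token slice)
 * tile; here it just counts the elements it was given. */
static void process_tile(int b0, int b1, int t0, int t1, int *count) {
    int n = (b1 - b0) * (t1 - t0);
    #pragma omp atomic
    *count += n;
}

/* Create one task per tile: the token dimension is sliced by
 * t_slice, the batch dimension down to BATCH_SUBSIZE rows. */
int run_sliced(int t_slice) {
    int count = 0;
    for (int b = 0; b < B; b += BATCH_SUBSIZE)
        for (int t = 0; t < T; t += t_slice) {
            #pragma omp task firstprivate(b, t) shared(count)
            process_tile(b, b + BATCH_SUBSIZE, t, t + t_slice, &count);
        }
    #pragma omp taskwait
    return count;  /* every (b, t) element visited exactly once */
}
```

Slicing both dimensions yields more, finer-grained tasks (here B/BATCH_SUBSIZE times as many), which is consistent with the better scaling slope observed.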
Because we have a taskwait between each call of the main training loop, it is possible to measure the runtime, speedup, and efficiency of the forward and backward passes independently.
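The derived metrics are the standard ones; a small helper (hypothetical, not from the source) makes the definitions explicit, applied separately to the forward-pass and backward-pass timings that the taskwait lets us isolate:

```c
/* Speedup from measured runtimes: S(p) = T(1) / T(p). */
double speedup(double t_serial, double t_parallel) {
    return t_serial / t_parallel;
}

/* Parallel efficiency: E(p) = S(p) / p, i.e. the fraction of
 * ideal linear scaling achieved on p cores. */
double efficiency(double t_serial, double t_parallel, int n_cores) {
    return speedup(t_serial, t_parallel) / (double)n_cores;
}
```

For example, a pass that takes 8 s serially and 2 s on 4 cores has a speedup of 4 and an efficiency of 1 (perfect scaling).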
We can now compare the efficiency of this version of the task-based model against the fork-join model.
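For contrast, a fork-join pass can be sketched as one parallel loop per layer, each closed by an implicit barrier; all names and sizes here are illustrative:

```c
#define N_LAYERS 3
#define N_UNITS 8

/* Hypothetical per-unit kernel. */
static void unit_forward(int l, int u, float out[N_LAYERS][N_UNITS]) {
    out[l][u] += 1.0f;
}

/* Fork-join model: one parallel loop per layer, each ending in an
 * implicit barrier, so all threads synchronize N_LAYERS times per
 * pass -- the global synchronization that the task-based version
 * replaces with point-to-point data dependencies. */
void forward_fork_join(float out[N_LAYERS][N_UNITS]) {
    for (int l = 0; l < N_LAYERS; l++) {
        #pragma omp parallel for
        for (int u = 0; u < N_UNITS; u++)
            unit_forward(l, u, out);
    }
}
```

The repeated barriers are why one would expect the fork-join version to lose efficiency as the core count grows, which is what the comparison below examines.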