... | ... | @@ -40,6 +40,15 @@ To begin with, we will put a taskwait between each call of the main training loo |
![Task_based_implementation](uploads/f9062c74a52ed79789ea0d4022624355/Task_based_implementation.png)

This gives us the following Paraver traces:

![task-based-forward-pass](uploads/68803df82325ddebb8fba7165cec9896/task-based-forward-pass.png)

![task-based-forward-pass.code_legend](uploads/405e6082260df84f652c7fc323b83312/task-based-forward-pass.code_legend.png)

![task-based-backward-pass](uploads/36c076443d041b488c57a9deb96a0005/task-based-backward-pass.png)

![task-based-backward-pass.code_legend](uploads/b60c10db490f7110f22830a448929411/task-based-backward-pass.code_legend.png)

This led us to the following performance:

Here we tried slicing along the token sequence only (BATCH_SUBSIZE=4), or along both the token sequence and the batch (BATCH_SUBSIZE=1). The slope obtained for the second version looks better, so we will keep it for the following tests.
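
To make the two slicing schemes concrete, here is a minimal sketch of how a layer could be cut into tasks. It is not the code used for the traces above: it assumes OpenMP tasks (the real implementation may use another tasking model), and `scale_tile`, `sliced_scale` and `token_subsize` are hypothetical names introduced only for illustration. With a batch of 4 (also an assumption), `batch_subsize = 4` yields a single batch block, so tasks are split along the token sequence only, while `batch_subsize = 1` splits along both dimensions and creates four times as many tasks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-tile kernel: doubles every element of the tile
 * [b0, b1) x [t0, t1) of a B x T row-major array. */
static void scale_tile(float *out, const float *in, int T,
                       int b0, int b1, int t0, int t1) {
    for (int b = b0; b < b1; b++)
        for (int t = t0; t < t1; t++)
            out[(size_t)b * T + t] = 2.0f * in[(size_t)b * T + t];
}

/* Creates one task per (batch block, token block) tile and returns the
 * number of tasks created. Without OpenMP enabled the pragmas are ignored
 * and the tiles simply run one after another. */
int sliced_scale(float *out, const float *in, int B, int T,
                 int batch_subsize, int token_subsize) {
    assert(batch_subsize > 0 && token_subsize > 0);
    int ntasks = 0;
    #pragma omp parallel
    #pragma omp single
    {
        for (int b0 = 0; b0 < B; b0 += batch_subsize) {
            int b1 = (b0 + batch_subsize < B) ? b0 + batch_subsize : B;
            for (int t0 = 0; t0 < T; t0 += token_subsize) {
                int t1 = (t0 + token_subsize < T) ? t0 + token_subsize : T;
                #pragma omp task firstprivate(b0, b1, t0, t1)
                scale_tile(out, in, T, b0, b1, t0, t1);
                ntasks++; /* only the thread creating the tasks counts */
            }
        }
        /* Wait for all tiles before leaving the step. */
        #pragma omp taskwait
    }
    return ntasks;
}
```

Smaller tiles expose more parallelism but also add per-task overhead, so the best setting is something to measure rather than assume.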
... | ... | |