@@ -46,11 +46,11 @@ The paraver trace shows us that nearly half of the threads are working simultane
![legend](uploads/1fbc249856438649a150b86717133a6a/legend.png)

A test was run using OpenMP and Extrae, with 2 threads and 2 CPUs on a single NUMA node (**numactl -N 1 -m 1**), to create a Paraver trace of the execution. The trace shows that most of the runtime is spent in matmul_backward (shown in green and yellow, as there are two #pragma omp directives in this layer).
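To illustrate why a single layer can appear as two colors in the trace, here is a minimal sketch of a backward matrix multiplication with two separate OpenMP parallel loops. It is not the project's actual code: the function name `matmul_backward_sketch`, the argument order, and the row-major shapes (`inp` is N x C, `weight` is OC x C, `dout` is N x OC) are assumptions made for illustration only.

```c
/* Hypothetical sketch, not the project's actual code: backward pass of
 * out = inp * weight^T + bias. Gradients are accumulated (+=) into
 * dinp (N x C), dweight (OC x C) and dbias (OC). */
void matmul_backward_sketch(float *dinp, float *dweight, float *dbias,
                            const float *dout, const float *inp,
                            const float *weight, int N, int C, int OC) {
    /* Parallel loop 1: input gradient. Each iteration writes one row of
     * dinp, so threads never write to the same element. */
    #pragma omp parallel for
    for (int n = 0; n < N; n++) {
        for (int o = 0; o < OC; o++) {
            float d = dout[n * OC + o];
            for (int c = 0; c < C; c++) {
                dinp[n * C + c] += weight[o * C + c] * d;
            }
        }
    }

    /* Parallel loop 2: weight and bias gradients. Each iteration writes
     * one row of dweight and one element of dbias. */
    #pragma omp parallel for
    for (int o = 0; o < OC; o++) {
        for (int n = 0; n < N; n++) {
            float d = dout[n * OC + o];
            dbias[o] += d;
            for (int c = 0; c < C; c++) {
                dweight[o * C + c] += inp[n * C + c] * d;
            }
        }
    }
}
```

Under this assumption, each `#pragma omp parallel for` opens its own parallel region, which would be consistent with the layer showing up as two distinct colored phases in the trace.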
## Comparisons with NUMA system specification
The graph below shows the time taken per iteration for the test described in the previous section:

![test-2-openmp-tinyshakespeare-mean](uploads/80003c0cba5f8759a46e445151d8fcd4/test-2-openmp-tinyshakespeare-mean.png)