... | @@ -38,4 +38,18 @@ What is unexpected is that the OpenMP/nOS-V version is slower than OpenMP versio |
|
|
|
|
|
![thread-state-legend](uploads/4626cbde639c30905d6b003db5f8d6d6/thread-state-legend.png)
|
|
|
|
|
|
|
|
The Paraver trace shows that only about half of the threads are working at any one time. A first step towards better efficiency would be to reduce the number of threads used by the application.
|
|
|
\ No newline at end of file |
|
|
|
|
|
# Time spent in functions
|
|
|
|
|
|
|
|
![paraver](uploads/d86531a4e321e31bf262498c9116d3b4/paraver.png)
|
|
|
|
|
|
|
|
![legend](uploads/1fbc249856438649a150b86717133a6a/legend.png)
|
|
|
|
|
|
|
|
A test was run using OpenMP and Extrae, with 2 threads and 2 CPUs on a single NUMA node (**numactl -N 1 -m 1**), to create a Paraver trace of the execution. The trace shows that most of the runtime is spent in matmul backward. Below is a graph of the time taken per iteration for the same test:
|
|
|
|
|
|
|
|
![test-2-openmp-tinyshakespeare-mean](uploads/80003c0cba5f8759a46e445151d8fcd4/test-2-openmp-tinyshakespeare-mean.png)
|
|
|
|
|
|
|
|
As the graph shows, each iteration takes 7 seconds on average. However, in another test, whose runtime is shown just below, the average runtime when using a single CPU is 41 seconds. This is unexpected, as in theory the runtime should not be higher than
|
|
|
|
|
|
|
|
![test-5-openmp-tinyshakespeare-mean](uploads/d600df70538473235c9037630724a5a2/test-5-openmp-tinyshakespeare-mean.png)
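
To put numbers on the discrepancy, the two averages quoted above (7 s with 2 threads, 41 s with a single CPU) can be converted into a speedup and a parallel efficiency. The short Python sketch below is only illustrative: the function names are hypothetical, and it assumes ideal linear scaling as the reference, which is an assumption and not something measured here.

```python
# Illustrative only: the timings are the averages quoted in the text,
# the helper names are hypothetical.

def speedup(t_one_cpu: float, t_parallel: float) -> float:
    """Ratio of the single-CPU time to the parallel time."""
    return t_one_cpu / t_parallel


def efficiency(t_one_cpu: float, t_parallel: float, n_workers: int) -> float:
    """Speedup divided by the number of workers (1.0 means ideal scaling)."""
    return speedup(t_one_cpu, t_parallel) / n_workers


def ideal_one_cpu_time(t_parallel: float, n_workers: int) -> float:
    """Single-CPU time expected if scaling were perfectly linear (assumption)."""
    return t_parallel * n_workers


T_ONE_CPU = 41.0      # seconds, average with one CPU
T_TWO_THREADS = 7.0   # seconds per iteration, average with 2 threads
N_THREADS = 2

if __name__ == "__main__":
    s = speedup(T_ONE_CPU, T_TWO_THREADS)
    e = efficiency(T_ONE_CPU, T_TWO_THREADS, N_THREADS)
    bound = ideal_one_cpu_time(T_TWO_THREADS, N_THREADS)
    print(f"speedup: {s:.2f}x")            # speedup: 5.86x
    print(f"efficiency: {e:.2f}")          # efficiency: 2.93
    print(f"ideal 1-CPU time: {bound} s")  # ideal 1-CPU time: 14.0 s
```

An efficiency well above 1.0 is what makes the single-CPU result look suspicious: going from 1 to 2 workers would normally give at most about a 2x gain, so the 41 s measurement probably reflects something other than the thread count alone.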
|
|
|
\ No newline at end of file |