![GPT2_Distributed_Diagram](uploads/2bb14c6b0a33fea35de81ffcdb5859ba/GPT2_Distributed_Diagram.png)

# First benchmarking and comparisons

We will focus here on the transfer of the gradients from the 'worker' ranks to the 'server' rank (i.e. rank 0). A minimal sketch of this communication pattern is given after the notes below.

Note that:

- Transfer of gradients from the workers to the server is done through `MPI_Reduce` (with `MPI_SUM`).
- Transfer of updated parameters from the server to the workers is done through `MPI_Bcast`.
- The data sent in each of these operations is about 250 000 floats, i.e. about 1 Gb.
- According to the [MN5 overview](https://www.bsc.es/supportkc/docs/MareNostrum5/overview), the transfer speed of a node is about 1 Gb/s.
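
Taking the figures above at face value (about 1 Gb per collective over a link of about 1 Gb/s), a single `MPI_Reduce` or `MPI_Bcast` would take on the order of one second, which is why these transfers are worth benchmarking. The following is a minimal sketch of the communication pattern, assuming flat float buffers of `n` elements and rank 0 acting as the server; the function and variable names are illustrative, not taken from the actual code.

```c
#include <mpi.h>
#include <stdlib.h>

/* Illustrative sketch of the worker/server exchange described above.
 * Assumptions (not taken from the real code): flat float buffers `params`
 * and `grads` of `n` elements, rank 0 = server, all other ranks = workers. */
static void exchange_step(float *params, float *grads, int n, int rank, MPI_Comm comm)
{
    if (rank == 0) {
        /* Server: receive the sum of all gradients in place (its own grads
         * buffer is assumed to be zero-initialised, since it does no compute). */
        MPI_Reduce(MPI_IN_PLACE, grads, n, MPI_FLOAT, MPI_SUM, 0, comm);
        /* ... the optimizer update of `params` would happen here ... */
    } else {
        /* Workers: contribute their local gradients; only rank 0 needs the result. */
        MPI_Reduce(grads, NULL, n, MPI_FLOAT, MPI_SUM, 0, comm);
    }

    /* Server broadcasts the updated parameters back to every worker. */
    MPI_Bcast(params, n, MPI_FLOAT, 0, comm);
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int n = 250000;                    /* placeholder size from the notes above */
    float *params = calloc(n, sizeof(float));
    float *grads  = calloc(n, sizeof(float));

    exchange_step(params, grads, n, rank, MPI_COMM_WORLD);

    free(params);
    free(grads);
    MPI_Finalize();
    return 0;
}
```

With this blocking pattern every rank waits inside the collectives; the benchmarks below look at how long these transfers actually take relative to the rest of the step.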

Using 8 workers and 1 server, data transfer is done through tasks with high priority.

Dt = 2.5 s

![base-9-priority](uploads/7193fd34722329b1c593279365ff7ef1/base-9-priority.png)
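
The "high priority" mentioned above refers to the priority given to the communication tasks relative to the compute tasks, so that a free thread picks up the transfer as early as possible. The sketch below uses OpenMP task priorities purely as an illustration; the actual runtime and task structure used in the project may differ, and OpenMP priorities only take effect if `OMP_MAX_TASK_PRIORITY` is set at least as high as the values used.

```c
#include <mpi.h>
#include <omp.h>
#include <stdlib.h>

/* Illustration only: marking the gradient-transfer task as high priority so
 * the runtime schedules it ahead of pending compute tasks. */
int main(int argc, char **argv)
{
    int provided, rank;
    /* MPI is called from inside tasks, so full thread support is requested. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int n = 250000;                 /* placeholder buffer size */
    float *grads = calloc(n, sizeof(float));

    #pragma omp parallel
    #pragma omp single
    {
        #pragma omp task priority(0)
        {
            /* ... ordinary compute task (e.g. forward/backward of one block) ... */
        }

        #pragma omp task priority(100)
        {
            /* Communication task: with a high priority it is picked ahead of
             * pending low-priority tasks when a thread becomes free. */
            if (rank == 0)
                MPI_Reduce(MPI_IN_PLACE, grads, n, MPI_FLOAT, MPI_SUM, 0, MPI_COMM_WORLD);
            else
                MPI_Reduce(grads, NULL, n, MPI_FLOAT, MPI_SUM, 0, MPI_COMM_WORLD);
        }
    }

    free(grads);
    MPI_Finalize();
    return 0;
}
```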

Same setup: 8 workers and 1 server, with data transfer done through high-priority tasks.

Dt = 2.5 s

![all-nine](uploads/2d471a1e2cf8dc0bf30f957a4554d9e5/all-nine.png)