tchatela · 91159742
--- a/Distributed-Model.md
+++ b/Distributed-Model.md
@@ -140,11 +140,23 @@ As you can see, we have a better communication time with 4 ranks. This is becaus
 With B=16, T=1024, 1 rank/socket
-| Number of ranks | time/iteration |
+| Number of ranks | time/iteration | tokens/s | tokens/(s·CPU) |
-| ------ | ------ |
+| ------ | ------ | ------ | ------ |
- |    1    |   67000 ms     | 
+ |    1    |   67000 ms     |   |  |
-|    2    |    31500 ms   | 
+|    2    |    31739 ms   | 516 | 4.6 |
-| 4 |  15000 ms |
+| 4 |  15084 ms | 1086 | 4.9 |
+| 8 | 7624 ms | 2149 | 4.8 |
+| 16 | 4084 ms | 4011 | 4.48 |
+# Increasing the batch size per rank
+With 4 ranks, T=1024, 1 rank/socket
+| Batch size (B) | time/iteration | tokens/s | tokens/(s·CPU) |
+| ------ | ------ | ------ | ------ |
+ |    4    |        |   |  |
+|    8    |       |  |  |
+| 16 |   |  |  |
+| 32 |  |  |  |
+| 64 |  |  |  |
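
Review note: the throughput columns can be derived from the measured iteration time, which may help when filling in the empty batch-size table. The sketch below assumes tokens/s = B × T / time, which matches the reported values in the rank-scaling table when B=16 is taken as the total batch size. The per-CPU figure needs the total CPU count, which the page does not state; `CPUS_PER_RANK = 56` is a placeholder that roughly reproduces the reported numbers, not a confirmed hardware detail. The function names are hypothetical.

```python
def tokens_per_second(batch_size: int, seq_len: int, time_ms: float) -> float:
    """Tokens processed per second, given one iteration's wall time in ms."""
    if time_ms <= 0:
        raise ValueError("time_ms must be positive")
    return batch_size * seq_len / (time_ms / 1000.0)


def tokens_per_second_per_cpu(
    batch_size: int, seq_len: int, time_ms: float, total_cpus: int
) -> float:
    """Throughput normalised by the total number of CPUs used by all ranks."""
    if total_cpus <= 0:
        raise ValueError("total_cpus must be positive")
    return tokens_per_second(batch_size, seq_len, time_ms) / total_cpus


if __name__ == "__main__":
    # Timings copied from the rank-scaling table (B=16, T=1024).
    timings_ms = {2: 31739, 4: 15084, 8: 7624, 16: 4084}
    CPUS_PER_RANK = 56  # assumption: not stated in the source

    for ranks, time_ms in timings_ms.items():
        tps = tokens_per_second(16, 1024, time_ms)
        per_cpu = tokens_per_second_per_cpu(16, 1024, time_ms, ranks * CPUS_PER_RANK)
        # Emit a Markdown table row in the same column order as the tables above.
        print(f"| {ranks} | {time_ms} ms | {tps:.0f} | {per_cpu:.2f} |")
```

For the batch-size table, the same functions apply with the measured time for each B. If B there is meant per rank rather than in total, the token count should be multiplied by the number of ranks first.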