# MPI processes, Voxels and Divisibility

The 3-D simulation domain in PhysiCell-X is divided into Voxels (Volumetric Pixels). Voxels are generally cubic (but they can be cuboids as well). Thus, for example, if the domain in the X, Y and Z directions spans [-500,+500], [-500,+500] and [-500,+500], respectively, and the (cubic) voxel has a side of 20, then we have (500 - (-500))/20 = 1000/20 = 50 voxels in _each_ of the X, Y and Z directions. As described in the previous section, PhysiCell-X implements 1-D domain partitioning in the X-direction and, due to a limitation in our implementation (to be removed in future versions), the total number of voxels in the X-direction must be perfectly divisible by the total number of MPI processes. Thus, in the case above, we cannot have 3 MPI processes, as 3 does not divide 50 evenly. However, we can have 2, 10 or even 25 MPI processes, as these divide 50 voxels perfectly. Note that this limitation exists _only_ in the X-direction and _not_ in the Y or Z directions. There is no restriction on the number of OpenMP threads within a single MPI process. Further, there are two types of meshes in PhysiCell (or PhysiCell-X), i.e. a _diffusion mesh_ and a _mechanical mesh_. The voxel size for _each_ of these meshes is defined separately, and it is important that the perfect-divisibility condition holds for _both_ types of meshes. We will return to this concept when we show how to run examples and specify parameters in PhysiCell-X.
|
|
|
|
|
|
> **📝 IMPORTANT**
>
> 1. The total number of voxels in the X-direction must be perfectly divisible by the total number of MPI processes.
> 2. Condition 1 above applies to both Diffusion and Mechanical voxels.
> 3. The size of a Diffusion voxel must be less than or equal to the size of a Mechanical voxel.
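
A quick way to sanity-check these conditions before submitting a job is a few lines of shell arithmetic. The snippet below is only an illustrative sketch using the example values from the text above (the variable names are ours and are not part of PhysiCell-X):

```shell
X_MIN=-500; X_MAX=500    # domain extent in the X-direction (example values)
DX_DIFFUSION=20          # diffusion voxel size
DX_MECHANICS=20          # mechanical voxel size
NPROCS=2                 # intended number of MPI processes

# Condition 3: the diffusion voxel must not be larger than the mechanical voxel.
if (( DX_DIFFUSION > DX_MECHANICS )); then
    echo "ERROR: diffusion voxel ($DX_DIFFUSION) is larger than mechanical voxel ($DX_MECHANICS)"
    exit 1
fi

# Conditions 1 and 2: the X-voxel count of BOTH meshes must divide evenly by the MPI processes.
for DX in $DX_DIFFUSION $DX_MECHANICS; do
    NVOX=$(( (X_MAX - X_MIN) / DX ))    # e.g. (500 - (-500)) / 20 = 50 voxels in X
    if (( NVOX % NPROCS != 0 )); then
        echo "ERROR: $NVOX voxels in X are not divisible by $NPROCS MPI processes"
        exit 1
    fi
done
echo "All divisibility conditions are satisfied."
```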
|
|
|
|
|
|
# Code-base organization

We assume that the script above is saved as `script_tnf_mpi.sh`. The following describes what each line of this script does (a sketch of such a SLURM header is shown after the list):
|
2. Line 2 gives a name to this job - in this case "TNF_simulation".
3. Line 3 requests 150 HPC nodes for this job. (In our experiments we use HPC nodes that have 48 cores each, arranged as 2 sockets of 24 cores.)
4. Line 4 states that we need only 1 MPI process per node. This means we will have only 150 MPI processes (= the number of HPC nodes requested).
5. Line 5 says that 48 OpenMP threads should be spawned per MPI process. Each of these threads will "cling" (or, technically, bind) to a core (remember we have 48 cores and 48 threads). Thus, the total number of cores used by this job is 150 x 48 = 7200, i.e. 150 MPI processes times 48 OpenMP threads.
6. Line 6 means that this job can execute for a maximum of 72 hours.
7. Line 7 says that the output of the job should be written to a file named `output-[jobid]`, where `jobid` is a unique number provided by SLURM.
8. Line 8 says exactly the same thing as Line 7, but for the error file.
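
For reference, a SLURM header consistent with the description above would look roughly like the sketch below. This is a reconstruction from the line-by-line explanation, not a copy of the actual `script_tnf_mpi.sh`; in particular the interpreter line and the error-file name are assumptions:

```shell
#!/bin/bash                            # Line 1 (assumed): interpreter
#SBATCH --job-name=TNF_simulation      # Line 2: name of the job
#SBATCH --nodes=150                    # Line 3: request 150 HPC nodes
#SBATCH --ntasks-per-node=1            # Line 4: 1 MPI process per node (150 in total)
#SBATCH --cpus-per-task=48             # Line 5: 48 OpenMP threads per MPI process
#SBATCH --time=72:00:00                # Line 6: maximum wall-clock time of 72 hours
#SBATCH --output=output-%j             # Line 7: stdout file; %j is the SLURM job id
#SBATCH --error=error-%j               # Line 8: stderr file (name assumed)
```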
|
|
</domain>
```
|
|
|
|
|
|
First, this shows that the domain in the X, Y and Z directions spans [-200,+200], i.e. 400 units of length. Second, the length/width/height of the _diffusion_ voxel is 20. Please note that the length/breadth/height of the _mechanical_ voxel is specified through the `main.cpp` of the specific project. Further, since PhysiCell-X works _only_ for 3-D problems, the 2-D setting is set to `false`. The total number of diffusion voxels in the X, Y and Z directions is therefore (200 - (-200))/20 = 400/20 = 20 each. If we check the `script_physiboss_tnf_model_mpi.sh` file, we can see that the total number of MPI processes is only _two_, i.e. we have a single node with 2 MPI processes per node, as indicated by the lines below:
|
|
|
|
|
|
```shell
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=2
```
|
|
|
|
|
|
One of the conditions that must be fulfilled is that the total number of diffusion (or mechanical) voxels in the X-direction _must_ be perfectly divisible by the total number of MPI processes. In this case, we have a total of 20 diffusion voxels and 2 MPI processes, and 20/2 = 10 gives 10 _diffusion_ voxels per MPI process. This divisibility condition is not needed in the Y/Z directions, as we implement only a 1-dimensional decomposition in the X-direction. At this stage it is very important to check the size of the mechanical voxel in the `main.cpp` file in the top-level directory; the following line in this file shows that the size of the mechanical voxel is also set to 20 (i.e. the same as the size of the diffusion voxel):
|
|
|
|
|
|
```cpp
double mechanics_voxel_size = 20;
```
|
|
|
|
|
|
It is clear that the total number of mechanical voxels in the X-direction, i.e. (200 - (-200))/20 = 400/20 = 20, is also divisible by the total number of MPI processes (2 in this case). Thus, we have checked that:
|
|
|
|
|
|
1. The Mechanical voxel size is greater than or equal to the Diffusion voxel size.
2. The total number of voxels (both Mechanical and Diffusion) in the X-direction is perfectly divisible by the total number of MPI processes.
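
If you want to verify these two conditions for your own project, the values involved can be read directly from the project files. The commands below are only a sketch: they assume the settings file lives at `config/PhysiCell_settings.xml` with the standard `<x_min>`, `<x_max>` and `<dx>` elements, and that the job script and `main.cpp` sit in the current directory:

```shell
# Domain extent and diffusion voxel size (divide the extent by <dx> to get the voxel count).
grep -E "<(x_min|x_max|dx)" config/PhysiCell_settings.xml

# Mechanical voxel size (must be >= the diffusion voxel size).
grep "mechanics_voxel_size" main.cpp

# Number of MPI processes = nodes x ntasks-per-node.
grep -E "nodes=|ntasks-per-node=" script_physiboss_tnf_model_mpi.sh
```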
|
|
|
|
|
|
The `PhysiCell_settings.xml` file also indicates that the total number of OpenMP threads is just one, i.e.
|
|
#SBATCH --cpus-per-task=24
```
|
|
|
|
|
|
Here, the environment variable `SLURM_CPUS_PER_TASK` takes its value from the `--cpus-per-task=24` SLURM option. Thus, the number of OpenMP threads in our program is 24 per socket (and 2 x 24 = 48 per node).\
The initial part of `main.cpp` contains the lines of code to build a Cartesian topology. It can be noted that _this part of the code will remain exactly the same for all parallel programs_. Thus, the user can simply copy and paste this code into any new programs that they write. We show this part of the code below:
|
|
|
|
|
|
```cpp
|