... | @@ -158,9 +158,23 @@ In the end, a bithack trick was used to extend the array of 32 bit unsigned inte |
The trick is to load the array as if it had 64-bit elements, and then use a `vand` operation to clear the most significant half of each element, mimicking a zero extension.
To get the other half, a logical right shift by 32 bits must be performed before applying the `vand`.
This method is very low level and depends on the endianness of the system in order to work (TODO elaborate why).
Moreover, it also requires a bit of extra handling for the case in which the array has an odd number of elements.
To see the code in detail, check annex (TODO).
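
As a rough illustration of the idea (not the annex code), the following scalar C sketch mimics the load, mask and shift steps. The helper names `host_is_little_endian` and `widen_u32_to_u64` are hypothetical, and plain 64-bit integer operations stand in for the vector instructions. It also shows where endianness enters: on a little-endian host the even-indexed element lands in the low half of each 64-bit word, while on a big-endian host the roles are swapped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 when the lowest-addressed byte of a
 * multi-byte integer is its least significant byte. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first_byte;
    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

/* Hypothetical helper: zero-extend n 32-bit elements of src into dst.
 * Scalar stand-in for the vector version described in the text. */
void widen_u32_to_u64(const uint32_t *src, uint64_t *dst, size_t n)
{
    assert(src != NULL && dst != NULL);

    const uint64_t low_mask = 0xFFFFFFFFu;
    const int le = host_is_little_endian();
    size_t i = 0;

    for (; i + 1 < n; i += 2) {
        uint64_t packed;
        /* Load two adjacent 32-bit elements as one 64-bit value. */
        memcpy(&packed, &src[i], sizeof packed);

        /* "and" clears the upper half, keeping the low 32 bits. */
        const uint64_t lo = packed & low_mask;
        /* Logical shift right, then the same "and", gives the other half.
         * (For an unsigned shift the mask is redundant; it is kept to
         * mirror the sequence described in the text.) */
        const uint64_t hi = (packed >> 32) & low_mask;

        /* Which half holds the even-indexed element depends on endianness. */
        dst[i]     = le ? lo : hi;
        dst[i + 1] = le ? hi : lo;
    }

    /* Odd element count: the last element has no partner to be loaded with. */
    if (i < n)
        dst[i] = src[i];
}
```

The final `if` is the extra handling for an odd number of elements: a 64-bit load of the last element would read past the end of the array, so it is converted on its own.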
### Handling union_int_float_t
`PairLJCharmmCoulLong::compute` defines a union type `union_int_float_t` in order to operate on the bits of the 32-bit floating-point number `rsq` as if it were a 32-bit integer.
In particular, the least significant part of the exponent combined with the most significant part of the mantissa is extracted and used as an index into an array. The bitmask and the shift value are not static; they are generated by LAMMPS based on the input configuration.
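
A minimal sketch of this 32-bit lookup, assuming the union has one integer member and one float member; the helper name `table_index32` and the mask and shift values used in the note after the block are illustrative, not the ones LAMMPS generates:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of the union: one 32-bit word viewed as int or float. */
typedef union {
    int32_t i;
    float f;
} union_int_float_t;

/* Hypothetical helper: extract a table index from the bits of rsq.
 * mask selects the low exponent bits and the high mantissa bits;
 * shiftbits moves the selected field down to bit 0. Both are runtime
 * values, as in the text. */
int table_index32(float rsq, int32_t mask, int shiftbits)
{
    assert(sizeof(float) == sizeof(int32_t));

    union_int_float_t rsq_lookup;
    rsq_lookup.f = rsq;                       /* write as float  */
    return (rsq_lookup.i & mask) >> shiftbits; /* read as integer */
}
```

For example, with an illustrative mask `0x01FFE000` (bits 13 to 24) and a shift of 13, `rsq = 1.0f` (bit pattern `0x3F800000`) gives index 3072 and `rsq = 2.0f` (bit pattern `0x40000000`) gives index 0.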
In our optimized version, `rsq` is a 64-bit number, so a small extra piece of code is needed to convert the 32-bit bitmask and shift value (generated in `pair.cpp:init_bitmap`) to 64 bits.
This can be done easily by considering the differences
Since the extracted bit fields are centered at the start of the
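```
// DBL_MANT_DIG - FLT_MANT_DIG is 53 - 24 = 29 for IEEE 754 float and double:
// the number of extra mantissa bits in a double.
// Note: the cast assumes `long` is 64 bits wide (LP64).
ncoulmask64 = ((long) ncoulmask) << (DBL_MANT_DIG - FLT_MANT_DIG);
ncoulshiftbits64 = ncoulshiftbits + (DBL_MANT_DIG - FLT_MANT_DIG);
```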
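
The following self-contained sketch checks that the widened mask and shift select the same index from a `double` as the original pair does from a `float`. It assumes IEEE 754 formats and a mask that covers at most the 7 lowest exponent bits, since the float and double exponent biases (127 and 1023) only agree in those bits. The helper names `index_from_float` and `index_from_double` are hypothetical, and `uint64_t` is used instead of `long` so the width does not depend on the platform.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: index extracted from the 32-bit pattern of rsq. */
uint32_t index_from_float(float rsq, uint32_t mask32, int shift32)
{
    assert(sizeof(float) == sizeof(uint32_t));

    uint32_t bits;
    memcpy(&bits, &rsq, sizeof bits);
    return (bits & mask32) >> shift32;
}

/* Hypothetical helper: same index, extracted from the 64-bit pattern of
 * rsq using the mask and shift widened as in the snippet above. */
uint64_t index_from_double(double rsq, uint32_t mask32, int shift32)
{
    assert(sizeof(double) == sizeof(uint64_t));

    /* Extra mantissa bits in a double: 53 - 24 = 29 for IEEE 754. */
    const int extra = DBL_MANT_DIG - FLT_MANT_DIG;
    const uint64_t mask64 = (uint64_t)mask32 << extra;
    const int shift64 = shift32 + extra;

    uint64_t bits;
    memcpy(&bits, &rsq, sizeof bits);
    return (bits & mask64) >> shift64;
}
```

Converting a normal `float` to `double` is exact, so its 23 stored mantissa bits become the top 23 of the 52 stored mantissa bits of the `double`; shifting both the mask and the shift count by 29 therefore lines the field up again.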
atom_vec.h -> contains `**x` and `**f` (3D)
neigh_list.h -> contains `**firstneigh` (for each i, store array of neighbors j)