djurado · 8fc2599c
--- a/Home.md
+++ b/Home.md
@@ -131,19 +131,22 @@ It may be interesting to test how the performance of the modified input affects

 ### Managing 32 bit and 64 data types

+LAMMPS uses 64-bit `double` precision numbers for floating point calculations and 32-bit `int` numbers for integer computations.
+This becomes an issue for vectorization, specifically with the indexed memory instructions in 0.7.

+For example, in `i,j` interactions, the identifiers of atoms `j` are placed in `jlist`, an array of 32-bit integers.
+These 32-bit integers are later used as indexes for accessing atom properties such as position (`**x`), which are stored in an array of 64-bit floating point numbers.
+This access cannot be vectorized.

 atom_vec.h -> contains `**x` and `**f` (3D)
 neigh_list.h -> contains `**firstneigh` (for each i, store array of neighbors j)
 PairLJCharmmCoulLong::settings -> 

-LAMMPS uses 64-bit `double` precision numbers for floating point calculations and 32-bit `int` numbers for integer computations.
-This can become somewhat of an issue with the `jlist` array, which is an array of `int`s which act as indexes of `x` and 

 - **Specialization**: calls to subroutines inside compute only happen on the first and last iteration
 - Data structures: in which classes is the information about atoms (position, force) stored, how? array of pointer to array
 - **Structure of the code** - present the flowchart, show the different code paths: do-nothing, fast, slow...
-   - Abandoned idea: copied from the INTEL version - the "classify-loop" - store elements that can be computed vectorially in a buffer (in serial beacuse 0.7) and then process them vectorially.
+   - Abandoned idea: copied from the INTEL version - the "classify-loop" - store elements that can be computed vectorially in a buffer (in serial because 0.7) and then process them vectorially.
   - Implemented idea: use a combination of masked operations form elements that do not be processed and using a vmfirst loop to find the elements that need to be processed in serial
 - **Problems**
   - pointer to pointer - often requires two load indexed operations