
Home · Changes

Update Home authored Jan 10, 2023 by djurado's avatar djurado
Home.md
View page @ 4ac2d3d9
@@ -103,7 +103,7 @@ Considering that registers in the 0.7 vector unit can hold up to 256 64 bit elem
One can increase the number of iterations in the inner loop by increasing the neighbor distance threshold.
This neighbor threshold is set automatically according to the interaction distance thresholds specified in the `pair_style lj/charmm/coul/long X Y` command.
The largest interaction distance accepted by LAMMPS produced an average of 1290 inner loop iterations.
-It may be interesting to do some tests to see if the performance improves with higher inner loop iteration counts.
+It may be interesting to do some tests to see how higher inner loop iteration counts affect performance when comparing with the serial version.
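
To make the relation between inner loop iterations and register size concrete, the following is a minimal sketch of the strip count arithmetic. It assumes the 256-element register capacity quoted in the hunk context; `count_strips`, `StripCount` and `kMaxVectorLength` are hypothetical names used only for illustration and are not part of LAMMPS or of the vectorized code.

```cpp
#include <cstddef>

// Assumption: a vector register holds up to 256 64-bit elements,
// as quoted for the 0.7 vector unit in the text.
constexpr std::size_t kMaxVectorLength = 256;

struct StripCount {
    std::size_t full_strips;    // strips that fill the whole register
    std::size_t tail_elements;  // elements in the last, partially filled strip (0 if none)
};

// Split an inner loop of `iterations` into register-sized strips.
constexpr StripCount count_strips(std::size_t iterations,
                                  std::size_t vlen = kMaxVectorLength) {
    return StripCount{iterations / vlen, iterations % vlen};
}
```

Under these assumptions, the 1290 average iterations reported for the largest interaction distance would split into 5 full strips plus a tail of 10 elements.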
| input line | inner loop avg. iterations |
| ---------- | -------------------------- |
@@ -119,17 +119,20 @@ Vectorization is based on SIMD processing (single instruction, multiple data), b
With the RISC-V vector extension, this can be overcome with the help of masked instructions, which restrict writing the result of a vector instruction to only certain elements using a bitmask.
For instance, which proportion of the atom pair interactions (or inner loop iterations) belong to the *do nothing* group?
-Even when using masked instructions to avoid updating *do nothing* itneractions, instructions take some time to execute.
+Even when using masked instructions to avoid updating *do nothing* interactions, instructions take some time to execute.
So, as opposed to the serial version, a *do nothing* interaction costs as much time as any other interaction in the vectorized version with masked instructions.
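
The cost argument above can be illustrated with a scalar emulation of a masked update. This is only a sketch: it does not use RISC-V vector intrinsics, and `masked_accumulate` and `LaneCost` are hypothetical names introduced here, not code from the vectorized kernel.

```cpp
#include <cstddef>
#include <vector>

struct LaneCost {
    std::size_t serial_steps;  // iterations that do real work in a scalar loop
    std::size_t vector_lanes;  // lanes occupied by the masked vector version
};

// Scalar emulation of a masked update: force[k] is written only where mask[k]
// is true, but every lane still passes through the emulated vector instruction.
LaneCost masked_accumulate(const std::vector<double>& contribution,
                           const std::vector<bool>& mask,
                           std::vector<double>& force) {
    LaneCost cost{0, 0};
    for (std::size_t k = 0; k < contribution.size(); ++k) {
        ++cost.vector_lanes;              // the lane is processed regardless of the mask
        if (mask[k]) {
            force[k] += contribution[k];  // active lane: result is written back
            ++cost.serial_steps;          // a scalar loop would only pay for these
        }
        // masked-off (*do nothing*) lanes keep their previous value
    }
    return cost;
}
```

With a mask that disables half of the lanes, `vector_lanes` is still the full element count while `serial_steps` is half of it, which is the imbalance the flowchart percentages try to quantify.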
Before starting work on the vectorization, the code was modified to count the number of interactions that belong to each category.
The flowchart shows the average number of interactions (for a single `i` atom in a timestep) that belong to each category, and the arrows show the same information in percentage form.
-Black values show data for the default protein input, while red values correspond to the modified input described in section
+Black values show data for the default protein input, while red values correspond to the modified input described in section *Loop size*.
The modified input manages to reduce the proportion of interactions that belong to the *do nothing* and *slow* categories.
It may be interesting to test how the modified input affects the performance of the vectorized version relative to the serial version.
### Managing 32 bit and 64 bit data types
atom_vec.h -> contains `**x` and `**f` (3D)
neigh_list.h -> contains `**firstneigh` (for each i, store array of neighbors j)
PairLJCharmmCoulLong::settings ->
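
As a rough illustration of how the arrays noted above are traversed, here is a simplified sketch of a neighbor loop. It assumes a per-atom neighbor count array (`numneigh`) alongside `firstneigh`, and it leaves out details of the real LAMMPS loop such as the `ilist` indirection and the masking of special-bond bits in `j`. `count_pairs_within` is a hypothetical helper, not a LAMMPS function.

```cpp
#include <cstddef>

// Count (i, j) pairs whose squared distance is below cutsq.
// x[i] holds the 3D position of atom i; firstneigh[i] is the array of
// neighbor indices j of atom i; numneigh[i] is its length (assumed here).
std::size_t count_pairs_within(double** x, int** firstneigh, const int* numneigh,
                               int inum, double cutsq) {
    std::size_t count = 0;
    for (int i = 0; i < inum; ++i) {                // outer loop over i atoms
        const int* jlist = firstneigh[i];
        for (int jj = 0; jj < numneigh[i]; ++jj) {  // inner loop over neighbors j
            const int j = jlist[jj];
            const double dx = x[i][0] - x[j][0];
            const double dy = x[i][1] - x[j][1];
            const double dz = x[i][2] - x[j][2];
            const double rsq = dx * dx + dy * dy + dz * dz;
            if (rsq < cutsq) ++count;               // pair is within the cutoff
        }
    }
    return count;
}
```

The inner loop over `jj` is the one whose iteration count is discussed in the *Loop size* section.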