Home · Changes

Update Home authored Jan 10, 2023 by djurado's avatar djurado
Home.md
View page @ d64b8daf
......@@ -70,7 +70,23 @@ In this section we discuss the implementation of the optimized `PairLJCharmmCoul
The end of `PairLJCharmmCoulLong::compute` contains function calls to `ev_tally` or `virial_fdotr_compute`, which are run depending on the value of the `evflag` and `vflag_fdotr` variables.
An analysis using GDB breakpoints showed that, for the protein input, these functions are only called on the first and last timesteps of the execution.
This Paraver trace shows the weight of the first and last iterations compared to the rest.
TODO Paraver trace with two levels of events, one for compute and other for inner function calls.
Having function calls inside the function to be optimized can be troublesome because:
1. The compiler cannot autovectorize a loop that contains calls to non-inlined functions.
2. If not autovectorized, the called function would need to be vectorized by hand with intrinsics, which requires additional work.
3. If kept serial, then a mechanism to unpack data from the vector registers would still be needed.
After considering this, we decided that the specialized function should only target the case in which the functions are not called.
Now, there are two functions, `compute_loopi_original` and `compute_loopi_special`.
The specialized function is called when possible, and if not the execution falls back on the original function.
Another factor taken into account in the specialization is that the protein input script `in.protein` uses the form `pair_style lj/charmm/coul/long X Y` with only two parameters, which implies that `cut_ljsq = cut_coulsq`.
Targeting only this case in the specialized function leads to simpler code.
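
The selection between the two code paths can be sketched as follows. This is a minimal illustration and not the actual implementation: the `PairState` struct, the `Path` enum and the `select_path` helper are hypothetical names introduced here, and the sketch assumes that both squared cutoffs are derived from the same input value, so an exact floating-point comparison is meaningful.

```cpp
#include <cassert>

// Hypothetical stand-in for values that the real class keeps as members.
struct PairState {
  int evflag;        // nonzero when ev_tally would be called
  int vflag_fdotr;   // nonzero when virial_fdotr_compute would be called
  double cut_ljsq;   // squared LJ cutoff
  double cut_coulsq; // squared Coulomb cutoff
};

enum class Path { Original, Special };

// Decide which loop body to run for the current timestep.
// The specialized path is chosen only when no tally functions would be
// called and both cutoffs are identical; otherwise fall back to the original.
inline Path select_path(const PairState &s) {
  assert(s.cut_ljsq >= 0.0 && s.cut_coulsq >= 0.0);
  const bool no_tally = (s.evflag == 0) && (s.vflag_fdotr == 0);
  // Assumption: exact equality is safe because both values come from
  // the same input parameter in the two-argument pair_style form.
  const bool same_cutoff = (s.cut_ljsq == s.cut_coulsq);
  return (no_tally && same_cutoff) ? Path::Special : Path::Original;
}
```

With a check of this kind, `compute_loopi_special` would be taken on the intermediate timesteps of the protein input, and `compute_loopi_original` on the first and last ones.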
### Managing different code paths
atom_vec.h -> contains `**x` and `**f` (3D)
neigh_list.h -> contains `**firstneigh` (for each i, store array of neighbors j)
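
As a rough illustration of how these arrays are traversed, the sketch below loops over each atom `i` and its neighbors `j`. It is a simplified toy, not the LAMMPS code: the function name `accumulate_toy_forces` and the `numneigh`, `inum` and `cutsq` parameters are assumptions made for this example, the pair term is a placeholder rather than the LJ/Coulomb force, and details of the real neighbor list (such as the `ilist` indirection and the extra bits stored in neighbor indices) are omitted.

```cpp
#include <cassert>

// Toy traversal of a per-atom neighbor list.
//   x[i], f[i]    : 3-component position and force of atom i
//   firstneigh[i] : array with the numneigh[i] neighbor indices of atom i
// The "force" added here is just the displacement vector (placeholder).
void accumulate_toy_forces(int inum, const int *numneigh,
                           int *const *firstneigh,
                           const double *const *x, double **f,
                           double cutsq) {
  assert(cutsq >= 0.0);
  for (int i = 0; i < inum; ++i) {
    const int *jlist = firstneigh[i];
    for (int jj = 0; jj < numneigh[i]; ++jj) {
      const int j = jlist[jj];
      const double delx = x[i][0] - x[j][0];
      const double dely = x[i][1] - x[j][1];
      const double delz = x[i][2] - x[j][2];
      const double rsq = delx * delx + dely * dely + delz * delz;
      if (rsq < cutsq) {
        // Equal and opposite contributions for the pair (i, j).
        f[i][0] += delx; f[i][1] += dely; f[i][2] += delz;
        f[j][0] -= delx; f[j][1] -= dely; f[j][2] -= delz;
      }
    }
  }
}
```

The inner loop over `jj` is the part that a vectorized implementation would process several neighbors at a time.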
......