User guide · Changes

ear-4.3b commit authored Jul 05, 2023 by Oriol Vidal Teruel
User-guide.md
View page @ 447cd3f1
@@ -18,9 +18,9 @@ The EAR development team had worked also with OAR and PBSPro batch schedulers, b
 [[_TOC_]]
-## Use cases
+# Use cases
-### MPI applications
+## MPI applications
 EARL is automatically loaded with MPI applications when EAR is enabled by
 default (check `ear-info`). EAR supports the utilization of both
@@ -32,12 +32,12 @@ When using specific MPI flavour commands to start applications (e.g., `mpirun`,
 `mpiexec.hydra`), there are some key points you must take into account.
 See [next sections](#using-mpirunmpiexec-command) for examples and more details.
-#### Hybrid MPI + (OpenMP, CUDA, MKL) applications
+### Hybrid MPI + (OpenMP, CUDA, MKL) applications
 EARL automatically supports this use case.
 `mpirun`/`mpiexec` and `srun` are supported in the same manner as explained above.
-#### Python MPI applications
+### Python MPI applications
 EARL cannot automatically detect MPI symbols when Python is used.
 In that case, an environment variable is provided to specify the MPI flavour.
@@ -46,9 +46,9 @@ Export [`SLURM_EAR_LOAD_MPI_VERSION`](EAR-environment-variables#ear_load_mpi_ver
 values, e.g., `export SLURM_EAR_LOAD_MPI_VERSION="open mpi"`, which are the two MPI
 implementations fully supported by EAR.
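A minimal sketch of the Python MPI case described above, assuming OpenMPI underneath the Python bindings. The script name `my_mpi_script.py` and the node and task counts are hypothetical; the leading `echo` makes this a dry run, so the block can be tried without a SLURM allocation.

```shell
#!/bin/sh
# Tell the EAR Loader which MPI flavour the Python application uses,
# since it cannot detect MPI symbols on its own in this case.
export SLURM_EAR_LOAD_MPI_VERSION="open mpi"

# Hypothetical script name and task layout (assumptions, not from the guide).
APP="python my_mpi_script.py"

# Dry run: prints the command. Remove `echo` to launch through srun.
echo srun -N 1 -n 24 $APP
```

The variable has to be exported before the launcher is called so that it reaches the job environment.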
-#### Running MPI applications on SLURM systems
+### Running MPI applications on SLURM systems
-##### Using `srun` command
+#### Using `srun` command
 Running MPI applications with EARL on SLURM systems using the `srun` command is the most
 straightforward way to start using EAR.
@@ -62,12 +62,12 @@ They are provided by EAR's SLURM SPANK plug-in.
 When using SLURM commands for job submission, both Intel and OpenMPI implementations are
 supported.
-##### Using `mpirun`/`mpiexec` command
+#### Using `mpirun`/`mpiexec` command
 For the EAR library to be loaded automatically, the only requirement is that
 the MPI library is coordinated with the scheduler.
-###### Intel MPI
+##### Intel MPI
 Recent versions of Intel MPI offer two environment variables that can be used
 to guarantee the correct scheduler integration:
@@ -79,7 +79,7 @@ These arguments are passed to SLURM, and they can be all the same as EAR's SPANK
 You can read the Intel environment variables guide [here](https://www.intel.com/content/www/us/en/develop/documentation/mpi-developer-reference-linux/top/environment-variable-reference/hydra-environment-variables.html).
-###### OpenMPI
+##### OpenMPI
 To combine OpenMPI and EAR, it is highly recommended to use SLURM's `srun` command.
 When using `mpirun`, as OpenMPI is not fully coordinated with the scheduler, EARL
@@ -89,7 +89,7 @@ will be reported.
 To provide support for this workflow, EAR provides the [`erun`](EAR-commands#erun) command.
 Read the corresponding [examples section](User-guide.md#openmpi-1) for more information about how to use this command.
-###### MPI4PY
+##### MPI4PY
 When MPI is used with Python applications, the EAR Loader cannot automatically detect symbols to classify
 the application as Intel or OpenMPI. In order to specify it, the user has
@@ -97,9 +97,9 @@ to define the `SLURM_LOAD_MPI_VERSION` environment variable with the values _int
 _open mpi_. It is recommended to add it to Python modules to make it easy for
 end users.
-### Non-MPI applications
+## Non-MPI applications
-#### Python
+### Python
 Since version 4.1, EAR automatically executes the Library with Python applications,
 so no action is needed.
@@ -107,14 +107,14 @@ You must run the application with `srun` command to pass through the EAR's SLURM
 SPANK plug-in in order to enable, disable or tune EAR.
 See [EAR submission flags](#ear-job-submission-flags) provided by EAR SLURM integration.
-#### OpenMP, CUDA and Intel MKL
+### OpenMP, CUDA and Intel MKL
 To load EARL automatically with a non-MPI application, the application must be compiled
 with dynamic symbols and executed with the `srun` command.
 For example, for CUDA applications the `--cudart=shared` option must be used at compile time.
 EARL is loaded for OpenMP, MKL and CUDA programming models when symbols are dynamically detected.
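As a sketch of the CUDA case, the block below defines a small check that a binary links the CUDA runtime dynamically, which is what the `--cudart=shared` option is meant to achieve. The names `my_app.cu` and `my_app` are hypothetical, and `links_shared_cudart` is an illustrative helper, not part of EAR; it only inspects `ldd` output.

```shell
#!/bin/sh
# Succeeds if the ldd output read from stdin lists a shared CUDA runtime.
links_shared_cudart() {
    grep -q 'libcudart\.so'
}

# Typical use on a system with CUDA and SLURM (kept as comments so the
# block can be read without either installed):
#   nvcc --cudart=shared -o my_app my_app.cu
#   ldd ./my_app | links_shared_cudart && srun ./my_app
```

If the check fails, the binary was most likely linked against the static runtime, and the dynamic symbol detection described above would not see CUDA.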
-### Other application types or frameworks
+## Other application types or frameworks
 For other programming models or sequential apps not supported by default, EARL can
 be forced to be loaded by setting [`SLURM_EAR_LOADER_APPLICATION`](EAR-environment-variables#ear_loader_application)
@@ -128,7 +128,7 @@ export SLURM_EAR_LOADER_APPLICATION=my_app
 srun my_app
``` ```
-## Retrieving EAR data
+# Retrieving EAR data
 As a job accounting and monitoring tool, EARL collects metrics that you can
 retrieve to understand your application's behaviour.
@@ -185,7 +185,7 @@ how to generate Paraver traces.
 > Contact [ear-support@bsc.es](mailto:ear-support@bsc.es) if you want more details
 about how to work with EAR data in Paraver.
-## EAR job submission flags
+# EAR job submission flags
 The following EAR options can be specified with `srun`, `sbatch` and `salloc`:
@@ -217,7 +217,7 @@ the output files to be generated, it will be automatically created if needed.
 > You can always check the available EAR submission flags provided by EAR's SLURM SPANK
 plug-in by typing `srun --help`.
-### CPU frequency selection
+## CPU frequency selection
 The [EAR configuration file](www.example.org) supports the specification of *EAR authorized users*,
 who can ask for more privileged submission options. The most relevant ones are the possibility
@@ -227,7 +227,7 @@ with sysadmin or helpdesk team to become an authorized user.
 - The `--ear-policy=policy_name` flag asks for the _policy_name_ policy. Type `srun --help` to see the policies currently installed in your system.
 - The `--ear-cpufreq=value` flag (_value_ must be given in kHz) asks for a specific CPU frequency.
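Because `--ear-cpufreq` takes its value in kHz, a small conversion helper can avoid off-by-a-thousand mistakes. The sketch below assumes the target frequency is known in MHz; `mhz_to_khz` is a hypothetical helper, `application` stands for your binary, and `monitoring` is used only as an example policy. The final `echo` makes it a dry run.

```shell
#!/bin/sh
# Convert a frequency in MHz to the kHz value expected by --ear-cpufreq.
mhz_to_khz() {
    echo $(( $1 * 1000 ))
}

FREQ_KHZ=$(mhz_to_khz 2000)   # 2000 MHz (2.0 GHz) -> 2000000 kHz

# Dry run: prints the command. Remove `echo` to submit.
echo srun --ear-policy=monitoring --ear-cpufreq="$FREQ_KHZ" -N 1 -n 24 application
```

The policy flag is passed together with the frequency on purpose, since the guide states that `--ear-cpufreq` only takes effect when `--ear-policy` is also given.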
-### GPU frequency selection
+## GPU frequency selection
 EAR version 3.4 and upwards supports GPU monitoring for NVIDIA devices from the
 point of view of the application and node monitoring. GPU frequency optimization
@@ -242,9 +242,9 @@ To see the list of available frequencies of the GPU you will work on, you can ty
 nvidia-smi -q -d SUPPORTED_CLOCKS
``` ```
-## Examples
+# Examples
-### `srun` examples
+## `srun` examples
 For an MPI application requesting one node and 24 tasks, the following is a
 simple case of job submission.
@@ -288,7 +288,7 @@ srun --ear-cpufreq=2000000 --ear-policy=monitoring --ear-verbose=1 -J test -N 1
 For `--ear-cpufreq` to have any effect, you must specify the `--ear-policy` option even if you want to run your application with the default policy.
-### `sbatch` + EARL + srun
+## `sbatch` + EARL + srun
 When using `sbatch` EAR options can be specified in the same way. If more than one
 `srun` is included in the job submission, EAR options can be inherited from `sbatch` to the different `srun` instances or they can be specifically modified on each individual `srun`.
@@ -318,9 +318,9 @@ mkdir ear_metrics
 srun --ear-user-db=ear_metrics/app_metrics application
``` ```
-### EARL + `mpirun`
+## EARL + `mpirun`
-#### Intel MPI
+### Intel MPI
 When running EAR with `mpirun` rather than `srun`, we have to specify `srun` as the bootstrap server. Intel MPI version 2019 and newer offer two environment variables for bootstrap server specification and arguments.
``` ```
@@ -329,7 +329,7 @@ export I_MPI_HYDRA_BOOTSTRAP_EXEC_EXTRA_ARGS="--ear-policy=monitoring --ear-verb
 mpiexec.hydra -n 10 application
``` ```
-#### OpenMPI
+### OpenMPI
 Bootstrap is an Intel(R) MPI option but not an OpenMPI option. For OpenMPI,
 `srun` must be used for automatic EAR support.
@@ -355,7 +355,7 @@ time per job or node, like SLURM does with its plugins.
 **IMPORTANT NOTE** If you are going to launch `n` applications with the `erun` command through an sbatch job, you must set the environment variable `SLURM_STEP_ID` to values from `0` to `n-1` before each `mpirun` call.
 This way, `erun` will report the correct step ID to the EARD, which is then stored in the database.
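The note above can be sketched as a loop inside an sbatch script. The application names are hypothetical, and the `mpirun` and `erun` arguments are deliberately left out because they depend on your installation; the `echo` only prints what would be launched.

```shell
#!/bin/sh
# Launch n applications in sequence, giving each one its own step ID
# (0 .. n-1) so that erun reports the right step to the EARD.
step=0
for app in app_a app_b app_c; do      # hypothetical application names
    export SLURM_STEP_ID="$step"
    # Dry run: replace this echo with your real `mpirun ... erun ...` call.
    echo "step $SLURM_STEP_ID: would launch $app"
    step=$((step + 1))
done
```

Exporting the variable inside the loop, right before each launch, keeps the step ID and the application it belongs to next to each other in the script.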
-## EAR job Accounting (`eacct`)
+# EAR job Accounting (`eacct`)
 The [`eacct`](EAR-commands#ear-job-accounting-eacct) command shows accounting information stored in the EAR DB for
 job (and step) IDs.
@@ -363,7 +363,7 @@ The command uses EAR's configuration file to determine if the user running it is
 privileged or not, as **non-privileged users can only access their own information**.
 It provides the following options.
-### Usage examples
+## Usage examples
 The basic usage of `eacct` retrieves the last 20 applications (by default) of the
 user executing it.
@@ -428,7 +428,7 @@ Please, read the [commands section page](EAR-commands) to see which of them are
 Successfully written applications to csv. Only applications with EARL will have its information properly written.
``` ```
-## Job energy optimization: EARL policies
+# Job energy optimization: EARL policies
 The core component of EAR at the user's job level is the EAR Library (EARL).
 The Library deals with job monitoring and is the component which implements and applies
......