# User guide
…you can run your applications enabling/disabling/tuning EAR with the least effort for changing your workflow, e.g., submission scripts.
This is achieved by providing integrations (e.g., plug-ins, hooks) with system batch schedulers, which do all the work to set up EAR at job submission.
Currently, **[SLURM](https://slurm.schedmd.com/documentation.html) is the batch scheduler fully compatible with EAR**, thanks to EAR's SLURM SPANK plug-in.
With EAR's SLURM plug-in, running an application with EAR is as easy as submitting
a job with either `srun`, `sbatch` or `mpirun`. The EAR Library (EARL) is automatically loaded…
The EAR development team has also worked with the OAR and PBSPro batch schedulers, but…
# Use cases
Since EAR targets computational applications, some applications are automatically loaded and others are not, to avoid running EAR with, for example, bash processes.
The following list summarizes the application use cases where EARL can be loaded transparently:
- MPI applications: Intel MPI, Open MPI, Fujitsu and Cray versions.
- Non-MPI applications: OpenMP, CUDA, MKL and OneAPI.
- Python applications.
Other use cases not listed here might still be supported.
See the [dedicated section](#other-application-types-or-frameworks).
## MPI applications
EARL is automatically loaded with MPI applications when it is enabled by default (check `ear-info`).
EAR supports both the `mpirun`/`mpiexec` and `srun` commands.
When using `sbatch`/`srun` or `salloc`, [Intel MPI](https://www.intel.com/content/www/us/en/developer/tools/oneapi/mpi-library.html#gs.mufipm)
and [OpenMPI](https://www.open-mpi.org/) are fully supported.
When using specific MPI flavour commands to start applications (e.g., `mpirun`, `mpiexec.hydra`), there are some key points you must take into account.
See [next sections](#using-mpirunmpiexec-command) for examples and more details.
### Hybrid MPI + (OpenMP, CUDA, MKL) applications
EARL automatically supports this use case.
`mpirun`/`mpiexec` and `srun` are supported in the same manner as explained above.
### Python and Julia MPI applications
EARL cannot automatically detect MPI symbols when one of these languages is used.
In that case, an environment variable is provided to give EARL a hint about the MPI flavour being used.
Export the [`EAR_LOAD_MPI_VERSION`](EAR-environment-variables#ear_load_mpi_version) environment variable with the value from the following table, depending on the MPI implementation you are loading:
| MPI flavour | Value |
| ----------- | ----- |
| [Intel MPI](https://www.intel.com/content/www/us/en/developer/tools/oneapi/mpi-library.html) | _intel_ |
| [Open MPI](https://www.open-mpi.org/) | _open mpi_ or _ompi_ |
| [MVAPICH](https://mvapich.cse.ohio-state.edu/) | _mvapich_ |
| Fujitsu MPI | _fujitsu mpi_ |
| Cray MPICH | _cray mpich_ |
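For example, for an application linked against Open MPI:

```
# Give the EAR Loader a hint about the MPI flavour (value taken from the table above).
export EAR_LOAD_MPI_VERSION="open mpi"
```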
### Running MPI applications on SLURM systems
Running MPI applications with EARL on SLURM systems using the `srun` command is the most straightforward way to start using EAR.
All jobs are monitored by EAR and the Library is loaded by default depending on
the cluster configuration.
Even though it is automatic, there are a few [flags](#ear-job-submission-flags) that can be selected at job submission.
They are provided by EAR's SLURM SPANK plug-in. When using SLURM commands for job submission, both Intel and OpenMPI implementations are supported.
**There is no need to load the EAR module** to run a job with `srun` and get EARL loaded.
Review SLURM's [MPI Users Guide](https://slurm.schedmd.com/mpi_guide.html), read your cluster documentation or ask your system administrator to see how SLURM is integrated with the MPI Library in your system.
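For instance, a minimal job script could look like the following sketch (node and task counts and the `my_mpi_app` binary are placeholders; `--ear-verbose=1` is one of the optional flags described later):

```
#!/bin/bash
#SBATCH -N 2
#SBATCH --ntasks-per-node=48

# EARL is loaded automatically when the application is started with srun;
# the optional flag only enables runtime verbosity.
srun --ear-verbose=1 ./my_mpi_app
```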
#### Using `mpirun`/`mpiexec` command
For EARL to be loaded automatically, the only requirement on the MPI library is that it be coordinated with the scheduler.
Review SLURM's [MPI Users Guide](https://slurm.schedmd.com/mpi_guide.html), read your cluster documentation or ask your system administrator to see how SLURM is integrated with the MPI Library in your system.
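As a sketch only (how `mpirun` hands process management over to SLURM depends on your MPI library and site configuration; the module and binary names are placeholders):

```
#!/bin/bash
#SBATCH -N 2
#SBATCH --ntasks-per-node=48

module load my_mpi_module  # placeholder for your site's MPI module

# If mpirun launches its processes through SLURM, the EAR SPANK plug-in
# sets up EARL for every rank, just as with srun.
mpirun ./my_mpi_app
```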
##### Intel MPI
Read the corresponding [examples section](User-guide.md#openmpi-1) for more info.
##### MPI4PY
When using MPI with Python applications, the EAR Loader cannot automatically detect symbols to classify the application as Intel MPI or OpenMPI.
To specify it, the user has to define the `EAR_LOAD_MPI_VERSION` environment variable with one of the values from the [table](#python-and-julia-mpi-applications) above.
It is recommended to set this variable in Python modules to make it easy for final users.
Ask your system administrator or check your cluster documentation.
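As an illustration, assuming an mpi4py application built on top of Intel MPI (`my_mpi4py_app.py` is a placeholder):

```
#!/bin/bash
# Tell the EAR Loader which MPI flavour mpi4py is linked against (value from the table above).
export EAR_LOAD_MPI_VERSION="intel"
srun python my_mpi4py_app.py
```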
##### MPI.jl
According to the [documentation](https://juliaparallel.org/MPI.jl/stable/), the basic Julia wrapper for MPI is inspired by mpi4py.
Check the [MPI4PY section](#mpi4py) for running this kind of use case.
## Non-MPI applications
### Python
Since version 4.1, EAR automatically loads the Library with Python applications, so no action is needed.
You must run the application with the `srun` command so it passes through EAR's SLURM SPANK plug-in in order to enable/disable/tune EAR.
See the [EAR submission flags](#ear-job-submission-flags) provided by the EAR SLURM integration.
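A minimal sketch (`my_script.py` is a placeholder):

```
#!/bin/bash
# EARL is attached automatically because the Python interpreter is launched through srun.
srun python my_script.py
```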
### OpenMP, CUDA, Intel MKL and OneAPI
To load EARL automatically with non-MPI applications, the application must be compiled with dynamic symbols and executed with the `srun` command.
For example, for CUDA applications the `--cudart=shared` option must be used at compile time.
EARL is loaded for OpenMP, MKL and CUDA programming models when symbols are dynamically detected.
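For instance, a CUDA application could be built and launched as follows (`my_cuda_app.cu` is a placeholder):

```
# Link against the shared CUDA runtime so EARL can detect the CUDA symbols.
nvcc --cudart=shared -o my_cuda_app my_cuda_app.cu

# Launch through srun so EAR's SLURM SPANK plug-in can load EARL.
srun ./my_cuda_app
```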
## Other application types or frameworks
For other programming models or sequential apps not supported by default, EARL can be forced to be loaded by setting the [`EAR_LOADER_APPLICATION`](EAR-environment-variables#ear_loader_application) environment variable, which must be defined with the executable name.
For example:
```
#!/bin/bash
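# Force EARL to be loaded for this executable by telling the Loader its name.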
export EAR_LOADER_APPLICATION=my_app
srun my_app
```
## Using EARL inside Singularity containers
[Apptainer](https://apptainer.org/) (formerly Singularity) is an open source technology for containerization.
It is widely used in HPC contexts because the level of virtualization it offers enables access to local services.
…get to see or know your application's behaviour.
The Library is equipped with several modules and options to be able to provide different kinds of information.
As a very simple hint of your application workload, you can enable EARL verbosity (e.g., `--ear-verbose=1`) to get loop data at runtime.
The information is shown at _stderr_ by default.
Read how to set up verbosity at [submission time](ear-job-submission-flags) and the [verbosity environment variables](EAR-environment-variables#verbosity) provided for more advanced tuning of this EAR feature.
## Post-mortem application data
To get offline job data, EAR provides the [**eacct**](EAR-commands#ear-job-accounting-eacct) command, a tool that reports the monitored job data stored in the Database.
You can request information in different ways, so you can read aggregated job data, per-node or per-loop information, among other things.
See [eacct usage examples](User-guide#ear-job-accounting-eacct) for a better overview of what `eacct` provides.
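As a quick illustration (the job ID is a placeholder; see the `eacct` section for the full set of options), querying a single job usually looks like:

```
# Print the aggregated data stored in the Database for job 123456.
eacct -j 123456
```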
## Runtime report plug-ins
There is another way to get runtime and aggregated data while the job is running, without the need of calling `eacct` after job completion.
EAR implements a reporting mechanism which lets developers add new report plug-ins, so there is an unlimited set of ways to report EAR-collected data.
EAR releases come with a fully supported report plug-in (*csv_ts.so*) which provides the same runtime and aggregated data reported to the Database in CSV files, directly while the job is running.
You can load this plug-in in two ways:
1. By setting the [`--ear-user-db`](#ear-job-submission-flags) flag at submission time.
2. By [loading the report plug-in directly](EAR-environment-variables#ear_report_add) through an environment variable: `export EAR_REPORT_ADD=csv_ts.so`.
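A minimal sketch of both options (`my_app` is a placeholder, and the value given to `--ear-user-db` is assumed to be the prefix of the generated CSV files):

```
#!/bin/bash
# Option 1: request per-job CSV files through the submission flag.
srun --ear-user-db=my_app_metrics ./my_app

# Option 2: load the csv_ts report plug-in explicitly through the environment.
export EAR_REPORT_ADD=csv_ts.so
srun ./my_app
```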
> Contact [ear-support@bsc.es](mailto:ear-support@bsc.es) for more information about report plug-ins.
## Other EARL events
You can also request EAR to report **events** to the [Database](EAR-Databse).
They show more details about EARL's internal state and can be retrieved with the `eacct` command.
See how to enable [EAR events reporting](EAR-environment-variables#report_earl_events) and which kinds of events EAR reports.
## MPI stats
If it applies to your application, you can request EAR to report a [summary of its MPI behaviour](EAR-environment-variables#ear_get_mpi_stats) at the end of the execution.
The information is provided in two files and contains the aggregated data of each process of the application.
## Paraver traces
Finally, EARL can provide runtime data in the [Paraver](https://tools.bsc.es/paraver) trace format.
Paraver is a flexible performance analysis tool maintained by the [*Barcelona Supercomputing Center*](https://www.bsc.es/)'s tools team.
This tool provides an easy way to visualize runtime data, compute derived metrics and build histograms for a better understanding of your application behaviour.
See the [environment variables page](EAR-environment-variables#ear_trace_plugin) for how to generate Paraver traces.
> Contact [ear-support@bsc.es](mailto:ear-support@bsc.es) if you want more details about how to work with EAR data in Paraver.
Another way to see runtime information with Paraver is to use the open source tool [**ear-job-visualization**](https://github.com/eas4dc/ear-job-visualization), a CLI program written in Python which takes the CSV files generated by the `--ear-user-db` flag and converts their data to the Paraver trace format.
EAR metrics are reported as trace events.
Node information is stored as Paraver task information.
Node GPU data is stored as Paraver thread information.
# EAR job submission flags
We recommend splitting up SLURM's output (or error) file per node.
You can read SLURM's [filename pattern specification](https://slurm.schedmd.com/sbatch.html#lbAH) for more information.
If you still need to have the job output and EAR output separated, you can set the [`EARL_VERBOSE_PATH`](EAR-environment-variables#earl_verbose_path) environment variable, and one file per node will be generated containing only EAR output.
The environment variable must be set to the path (a directory) where you want the output files to be generated; it will be created automatically if needed.
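For example (the directory is a placeholder):

```
#!/bin/bash
# Per-node EAR output goes under this directory instead of the job output file.
export EARL_VERBOSE_PATH=$HOME/ear_logs
srun --ear-verbose=1 ./my_app
```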