Admin guide · Changes

v5.2 release authored Oct 23, 2025 by Oriol
Admin-guide.md
View page @ 168bb2b7
@@ -5,11 +5,12 @@
 <img src="./images/EAR_arch.png" align="right" width=500>
 EAR is composed of five main components:
 - **Node Manager (EARD):** A Linux service which provides basic node power monitoring and job accounting. It also offers an API that third parties (e.g., other EAR components) can use to perform privileged operations. It must have root access to the node where it runs (usually all compute nodes).
 - **Database Manager (EARDBD):** A Linux service (it normally runs in a service node) which caches data to be stored in a database, reducing the number of queries. We currently support [MariaDB](https://mariadb.org/) and [PostgreSQL](https://www.postgresql.org/). This component does not need to be enabled if you don't use such database services to report EAR data.
 - **Global Manager (EARGM):** A Linux service (it normally runs in a service node) which provides cluster-level support (e.g., powercap). It needs access to all nodes in the cluster where a Node Manager is running.
 - **EAR Library (EARL):** A Job Manager (distributed as a shared object) which provides job/application-level monitoring and optimization.
-- **SLURM plug-in:** A SLURM [SPANK](https://slurm.schedmd.com/spank.html) plug-in which provides support for using EAR job accounting and loading EARL transparently for users on systems using SLURM.
+- **Scheduler plug-in:** A SLURM [SPANK](https://slurm.schedmd.com/spank.html) plug-in and a PBS Pro [Hook](https://2025.help.altair.com/2025.2.0/PBS%20Professional/PBSHooks2025.2.0.pdf) which provide support for using EAR job accounting and loading EARL transparently for users.
 For more detailed information about EAR components, visit the [Architecture](Architecture) page.
@@ -44,7 +45,7 @@ This is an example to configure EAR to be compiled for both versions:
 ```
 > **The above example assumes your MPI Library is Intel MPI.**
-> If you want to compile EARL for another MPI flavour check out [this section](supporting-more-than-one-mpi-implementation).
+> If you want to compile EARL for another MPI flavour check out [this section](#supporting-more-than-one-mpi-implementation).
 EAR currently does not support GNU make parallel builds, so the above example must be run in the source code root directory.
 For the same reason, the `configure` script supports a variable called `MAKE_NAME`, which makes it generate a Makefile called `Makefile.<MAKE_NAME variable value>`.
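As a rough illustration of the `MAKE_NAME` mechanism described above, the sketch below prints (it does not run) a second configuration pass for a build without MPI symbols. Only `MAKE_NAME`, `--disable-mpi` and the `Makefile.<MAKE_NAME>` naming come from this guide. Passing `MAKE_NAME` as a `NAME=value` argument follows the usual Autoconf convention and is an assumption here, and the helper `ear_build_plan` and the value `gen` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints the commands for one build flavour without running them.
# $1 = value for MAKE_NAME (illustrative); remaining arguments = extra configure flags.
ear_build_plan() {
    make_name="$1"
    shift
    # Assumption: MAKE_NAME is handed to configure as a NAME=value argument.
    echo "./configure MAKE_NAME=${make_name} $*"
    # configure is documented to generate Makefile.<MAKE_NAME>.
    # No -j option is used because parallel builds are not supported.
    echo "make -f Makefile.${make_name}"
}

# Example: a flavour built without MPI symbols, using the documented --disable-mpi flag.
ear_build_plan gen --disable-mpi
```

Keeping one generated Makefile per flavour lets several configurations coexist in the same source tree, which is presumably why the variable exists.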
@@ -153,7 +154,7 @@ After compiling and installing following the previous step, you should have the
 Inside the `lib` directory, apart from plug-ins, you should see at least three files.
 - `libearld.so`: This is the EAR Loader.
-- `libear.so`: This is the EAR Library compiled with Intel MPI symbols. See the [next section](supporting-more-than-one-mpi-implementation) if you need support for other MPI implementations.
+- `libear.so`: This is the EAR Library compiled with Intel MPI symbols. See the [next section](#supporting-more-than-one-mpi-implementation) if you need support for other MPI implementations.
 - `libear.gen.so`: This is the EAR Library compiled without MPI symbols. The `.gen` extension is added automatically when setting the `--disable-mpi` flag.
 ## Supporting more than one MPI implementation
@@ -323,19 +324,25 @@ should report the total energy sent by each daemon since the setup.
 ### Monitoring: EAR plugin
+#### Slurm
 - Set up EAR's SLURM plugin (see the [configuration](Configuration) section for
 more information).
 > It is recommended to create a soft link to the `$EAR_ETC/slurm/ear.plugstack.conf`
 file in the `/etc/slurm/plugstack.conf.d` directory to simplify the EAR plugin management.
-> For a first test it is recommended to set `default=off` in the `ear.plugstack.conf`
-(to disable the automatic loading of the EAR library).
+> For a first test it is recommended to set `default=off` in the `ear.plugstack.conf` to disable the automatic loading of the EAR library.
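The soft link recommended above could be created along the following lines. This is a minimal sketch that only prints the `ln` command. The helper name `ear_plugstack_link_cmd` and the example value `/etc/ear` for `$EAR_ETC` are assumptions, while the two paths are the ones named in this guide.

```shell
#!/bin/sh
# Hypothetical helper: prints the command that links EAR's plugstack file into
# Slurm's plugstack.conf.d directory. Nothing is executed.
# $1 = EAR configuration root (defaults to $EAR_ETC), $2 = Slurm plugstack directory.
ear_plugstack_link_cmd() {
    ear_etc="${1:-$EAR_ETC}"
    target_dir="${2:-/etc/slurm/plugstack.conf.d}"
    echo "ln -s ${ear_etc}/slurm/ear.plugstack.conf ${target_dir}/ear.plugstack.conf"
}

# Example with an illustrative EAR_ETC value (an assumption, not a documented default).
ear_plugstack_link_cmd /etc/ear
```

With a link in place of a copy, later edits to `ear.plugstack.conf` (such as switching `default=off` to `default=on`) only have to be made in one file.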
+#### PBS
+- Set up EAR PBS Hook (see the [configuration](Configuration) section for more information).
+> For a first test it is recommended to set `default=off` in the `ear_hook_conf.ini` to disable the automatic loading of the EAR library.
-**EAR plugin validation**
+##### EAR scheduler plugins validation
 At this point you must be able to see EAR options when doing, for example, `srun --help`.
-You must see something like below as part of the output. The EAR plugin must be enabled at login
-and compute nodes.
+You must see something like below as part of the output. The EAR plugin must be enabled at login and compute nodes.
 ```
 [user@hostname ~]$ srun --help
@@ -383,22 +390,31 @@ Other options:
 ```
-- Submit one application via SLURM and check that it is correctly reported to the database with `eacct` command.
+In PBS, to see EAR options run `ear-hook-help`.
+You must see something like below as part of the output. The EAR module must be loaded.
+For PBS:
+```
+[user@hostname ~]$ module load ear
+[user@hostname ~]$ ear-hook-help
+```
+- Submit one application via the scheduler and check that it is correctly reported to the database with the `eacct` command.
 > Note that only privileged users can check other users’ applications.
-- Submit one MPI application (corresponding with the version you have compiled) with `--ear=on` and check that now the output of `eacct` includes the Library metrics.
+- Submit one MPI application (corresponding with the version you have compiled) with `sbatch --ear=on` or `qsub -v "EAR=on"` and check that now the output of `eacct` includes the Library metrics.
-- Set `default=on` to set the EAR Library loading by default at `ear.plugstack.conf`. If default is turned off, EARL can be explicitly loaded by setting the flag `--ear=off` at job submission.
+- Set `default=on` to set the EAR Library loading by default at `ear.plugstack.conf` or in `hook_config.ini`.
 At this point, you can use EAR for monitoring and accounting purposes but it cannot use the power policies offered by EARL.
-To enable them, first perform a learning phase and compute node coefficients. See the [EAR learning phase](www.example.org) wiki page.
+To enable them, first perform a learning phase and compute node coefficients. See the [EAR learning phase](Learning-phase) wiki page.
 For the coefficients to be active, restart daemons.
 > **Important** Reloading daemons will NOT make them load coefficients, restarting the service is the only way.
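The validation step with the EAR Library enabled can be sketched as below. The two submission forms, `sbatch --ear=on` and `qsub -v "EAR=on"`, are the ones given in this guide. The helper `ear_test_submit_cmd` and the script name `job.sh` are hypothetical, and the commands are printed rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: prints the submission command for a test job with EARL enabled.
# $1 = scheduler ("slurm" or "pbs"), $2 = job script (illustrative name).
ear_test_submit_cmd() {
    case "$1" in
        slurm) echo "sbatch --ear=on $2" ;;
        pbs)   echo "qsub -v \"EAR=on\" $2" ;;
        *)     echo "unknown scheduler: $1" >&2; return 1 ;;
    esac
}

ear_test_submit_cmd slurm job.sh
ear_test_submit_cmd pbs job.sh
# Once the job has finished, eacct should list it together with the Library metrics.
echo "eacct"
```

Submitting with the explicit flag keeps the test independent of the `default` setting, so it works while `default=off` is still in place.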
 # Installing from RPM
-EAR includes the specification files to create an rpm **from an already existing installation**.
+EAR includes the specification files to create an RPM **from an already existing installation**.
 Once created, it can be included in the compute nodes images.
 It is recommended only when no more changes are expected on the installation or when your compute fleet has ephemeral storage and EAR is installed in a non-shared file system.
@@ -406,7 +422,7 @@ The spec file is placed at `etc/rpms/specs/ear.spec` and it is generated from `e
 The RPM can be part of the system image.
 Visit the [Requirements](#rpm-requirements) page for a quick overview of the requirements.
-Execute the `rpmbuild.sh` script to create the EAR rpm file.
+Execute the `rpmbuild.sh` script to create the EAR RPM file.
 This script is located at `etc/rpms` and it is created from `etc/rpms/rpmbuild.sh.in` at configuration time.
 **Run it from its location**.
 The rpm file will be located at `$HOME/rpmbuild/RPMS`.
@@ -485,7 +501,7 @@ The best way to execute all EAR daemon components (EARD, EARDBD, EARGM) is by th
 > __NOTE__ EAR uses a MariaDB/MySQL server. The server must be started before EAR services are executed.
 The way to launch the EAR daemons is via unit services. The unit services for the EAR Daemon, EAR Global Manager Daemon and EAR Database Daemon are generated and installed in `$(EAR_ETC)/systemd`. You have to copy those unit service files to your `systemd` operating system folder and then use the `systemctl` command to run the daemons.
-Check the [EARD](Architecture#ear-node-manager), [EARDBD](Architecture#ear-database-manager), [EARGMD](Architecture#ear-global-manager) pages to find the precise execution commands.
+Check the [EARD](Architecture#ear-node-manager), [EARDBD](Architecture#ear-database-manager), [EARGMD](Architecture#ear-global-manager-system-power-manager) pages to find the precise execution commands.
 When using `systemctl` commands, you can check messages reported to `stderr` using `journalctl`. For instance:
 `journalctl -u eard -f`. Note that if `NodeUseLog` is set to 1 in `ear.conf`, the messages will not be printed to `stderr` but to `$EAR_TMP/eard.log` instead. The `DBDaemonUseLog` and `GlobalmanagerUseLog` options in `ear.conf` specify the output for EARDBD and EARGM, respectively.
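The unit-file procedure described above might look roughly like the following sketch, which prints the commands for the daemons that belong on a given node instead of running them. Only the `eard` unit name appears in this guide (in the `journalctl` example). Any other unit name, the default directory `/etc/systemd/system`, the variable `SYSTEMD_UNIT_DIR`, the example path `/etc/ear` and the helper `ear_units_plan` are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: prints the steps to install and start EAR unit files.
# $1 = EAR_ETC path; remaining arguments = unit names to handle on this node.
ear_units_plan() {
    ear_etc="$1"
    shift
    # Assumption: local unit files live in /etc/systemd/system unless overridden.
    unit_dir="${SYSTEMD_UNIT_DIR:-/etc/systemd/system}"
    for unit in "$@"; do
        echo "cp ${ear_etc}/systemd/${unit}.service ${unit_dir}/"
    done
    # systemd must re-read its unit files before the new services can be started.
    echo "systemctl daemon-reload"
    for unit in "$@"; do
        echo "systemctl start ${unit}"
    done
}

# Example for a compute node, which runs only the node daemon.
ear_units_plan /etc/ear eard
# Follow the node daemon's messages, as shown in the guide.
echo "journalctl -u eard -f"
```

On a service node the same helper would be called with the database and global manager unit names instead, after the MariaDB/MySQL server has been started.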