# EAR Components
EAR is composed of five main components:
- **Node Manager (EARD)**. The Node Manager must have root access to the node where it will be running.
- **Database Manager (EARDBD)**. The database manager requires access to the DB server (MariaDB and PostgreSQL are supported). Documentation for PostgreSQL is still under development.
- **Global Manager (EARGM)**. The global manager needs access to all node managers in the cluster as well as access to the database.
- **Library (EARL)**
- **SLURM plugin**
The following image shows the main interactions between components:
<img src="./images/EAR_arch.png" align="center" width=500>
For more detailed information about EAR components, visit the [Architecture](Architecture) page.
# Quick Installation Guide
This section provides a summarized, step-by-step installation and execution guide for EAR. For a more in-depth explanation of the necessary steps, see the [Installation from source](Installation from source) page or the [Installing from RPM](#installing-from-rpm) section, followed by the [Configuration](Configuration) guide, or contact us at ear-support@bsc.es.
## Requirements
Requirements to compile EAR are:
- C compiler.
- MPI compiler.
- CUDA installation path if NVIDIA is used.
- Likwid path if Likwid is used.
- Freeipmi path if freeipmi is used.
- GSL is needed for coefficient computations.
- Database client library: *mysqlclient* for MariaDB or the *postgresql* library. The MPI compiler and library are only needed if the MPI version of EAR is generated.
When EAR is installed from **rpm** (binaries only), all these dependencies have been removed except *mysqlclient*. However, they are needed when running EAR.
SLURM must also be present if the SLURM plugin is going to be used. Since the current EAR version only supports automatic execution of applications with the EAR library through the SLURM plugin, SLURM must be running whenever the EAR library is used (it is not needed for node monitoring).
Last but not least:
- The drivers for CPU frequency management (*acpi-cpufreq*) and Open IPMI must be present and loaded on compute nodes.
- The *msr* kernel module must be loaded on compute nodes.
- A MariaDB or PostgreSQL server must be up and running.
- Hardware counters must be accessible to normal users. Set */proc/sys/kernel/perf\_event\_paranoid* to 2 (or less), for example by running `sudo sh -c "echo 2 > /proc/sys/kernel/perf_event_paranoid"` on compute nodes.
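Two of the checks above can be scripted. The following is a minimal sketch and not part of EAR: the function names `check_perf_paranoid` and `module_loaded`, and their optional file arguments, are hypothetical helpers introduced here for illustration only.

```shell
# Sketch: check two of the compute-node prerequisites listed above.
# Function names and the optional path arguments are illustrative assumptions.

# Succeeds if the perf_event_paranoid value is 2 or less.
check_perf_paranoid() {
    paranoid_file="${1:-/proc/sys/kernel/perf_event_paranoid}"
    paranoid_value="$(cat "$paranoid_file")" || return 2
    [ "$paranoid_value" -le 2 ]
}

# Succeeds if the named kernel module appears in the loaded-module list.
# Built-in drivers are not listed in /proc/modules, and module names use
# underscores there (for example acpi_cpufreq).
module_loaded() {
    module_name="$1"
    module_list="${2:-/proc/modules}"
    grep -q "^${module_name} " "$module_list"
}
```

On a compute node, `check_perf_paranoid && module_loaded msr` would then succeed only when both conditions hold.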
## Installation, configuration and execution
1. Compile and install from source code or install via rpm. `$EAR_TMP` and `$EAR_ETC` are defined in the ear module. Until the module is loaded, define these environment variables manually to execute the next steps.
2. Create the `$EAR_TMP` folder. This folder must be local to each node, so we recommend creating it in `/var/ear`.
3. Whether installed from sources or rpm, EAR installs a template for the **ear.conf** file at `$EAR_ETC/ear/ear.conf.template`. Copy it to `$EAR_ETC/ear/ear.conf` and update it with the desired configuration. Go to the [ear.conf](Configuration#ear-configuration-file) page to see how to do it. The ear.conf file is used by all the services.
4. Load EAR module to enable commands. It can be found in `$EAR_ETC/module`. You can add ear module when it's not in standard paths by doing `module use $EAR_ETC/module` and then `module load ear`.
5. Create the EAR database with `edb_create`. The `edb_create -p` command will ask you for the DB root password. If you run into any problem here, first check whether the node where you are running the command can connect to the DB server. If problems persist, execute `edb_create -o` to report the specific SQL queries generated. If you still have trouble, contact [ear-support@bsc.es](mailto:ear-support@bsc.es).
6. EAR uses a power and performance model based on system signatures. These system signatures are stored in coefficient files. Before starting EARD, and just for testing, you need to create a dummy coefficient file and copy it to the coefficients path (by default `$EAR_ETC/coeffs`). See the *coeffs\_null* application in the [tools section](Tools).
7. Copy EAR service files to start/stop services using system commands such as systemctl. EAR service files are generated at `$EAR_ETC/systemd` and they can usually be placed in `$(ETC)/systemd`.
8. Start EARDs and EARDBDs via services (see [Launching the components through unit services](Execution#launching-the-components-through-unit-services)). EARDBD and EARD outputs can be found at `$EAR_TMP/eardbd.log` and `$EAR_TMP/eard.log`, respectively, when the *DBDaemonUseLog* and *NodeUseLog* options are set to 1 in the ear.conf file. Otherwise, their outputs are written to *stderr* and can be seen using the *journalctl* command. For instance, use `journalctl -u eard` to look at the eard output.
9. Check that EARDs are up and running correctly with `econtrol --status` (note that daemons will take around a minute to correctly report energy and not show up as an error in `econtrol`). EARDs create a per-node text file with values reported to the EARDBD. In case there are problems when running *econtrol*, you can also find this file at `$EAR_TMP/nodename.pm_periodic_data.txt`.
10. Check that EARDs are reporting metrics to the database with *ereport*. `ereport -n all` should report the total energy sent by each daemon since the setup.
11. Start EARGM via services.
12. Check if EARGM is reporting to database with `ereport -g`. Note that EARGM will take a period of time set by the admin in *ear.conf* (*GlobalManagerPeriodT1* option) to report for the first time.
13. Set up EAR's SLURM plugin (see our [Configuration](Configuration) page for more information).
14. Run an application via SLURM and check that it is correctly reported to database with `eacct`. Note that only privileged users can check other users' applications.
15. Run an MPI application with `--ear=on` and check that the report by `eacct` now includes the library metrics. EAR library depends on the MPI version: Intel, OpenMPI, etc. By default *libear.so* is used. Different names for different versions can be specified automatically by adding the EAR version name in the corresponding MPI module. For instance, for *libear.openmpi.4.0.0.so* library, define `SLURM_EAR_MPI_VERSION` environment variable as *openmpi.4.0.0*. When EAR has been installed from sources, this name is the same as it is specified in MPI_VERSION during the `configure`. When installed from rpm, look at `$EAR_INSTALL_PATH/lib` to see the available versions.
16. Set `default=on` in `plugstack.conf` to load the EAR library with all applications by default. If default is set to off, the EAR library can be loaded explicitly by passing `--ear=on` when submitting a job.
17. At this point you can use EAR for monitoring and accounting purposes, but it cannot use the power policies for EARL. To do that, first do a [learning phase](Learning-phase) and compute the coefficients.
18. For the coefficients to be active, restart the daemons. __IMPORTANT__: reloading the daemons will NOT make them load the coefficients, restarting is the only way.
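The library naming convention from step 15 can be illustrated with a small helper. This is only a sketch of the naming rule described there, not of how EAR resolves the library internally; the function name `ear_library_name` is a hypothetical helper introduced for illustration.

```shell
# Hypothetical helper: maps an MPI version name to the EAR library file name,
# following the naming convention described in step 15.
ear_library_name() {
    mpi_version="${1:-}"
    if [ -z "$mpi_version" ]; then
        echo "libear.so"                  # default library name
    else
        echo "libear.${mpi_version}.so"   # version-specific library name
    fi
}
```

For instance, `ear_library_name "$SLURM_EAR_MPI_VERSION"` prints `libear.openmpi.4.0.0.so` when the variable is set to `openmpi.4.0.0`, and `libear.so` when it is empty.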
Run `./configure --help` to see all the flags and options.
## Compiling and installing EAR
Once you have downloaded the code from the repository, execute:
- `autoreconf -i`.
./configure --prefix=ear-install-path \
EAR_TMP=ear-tmp-path \
EAR_ETC=ear-etc-path \
CC=c-compiler-path \
MPICC=mpi-compiler-path \
CC_FLAGS=c-flags-compiler \
MPICC_FLAGS=mpi-flags \
--with-cuda=path-to-cuda \
In addition to generating the Makefile, the `MAKE_NAME` option makes `configure` copy the generated Makefile with the name Makefile._make\_extension_.
This makes it easier to keep multiple configurations (one for each library version needed). Other relevant options are:
- The option `--disable-mpi` must be set to generate a configuration for non-MPI version of the library.
- Use `MPI_VERSION=ompi` for OpenMPI compatible version.
Before running `make`, review the Makefile and the configuration log to validate that all the requirements of your installation have been automatically detected, in particular if you need a specific library such as Likwid, FreeIPMI or CUDA. If the CUDA path is specified, EAR will be compiled with GPU support. Also check that the MySQL or PostgreSQL paths have been detected.
Use the `USER` and `GROUP` options if you want to install EAR with a specific user and group.
The following shows how to configure EAR to be compiled with Intel MPI:
autoreconf -i
./configure --prefix=/opt/ear CC=icc MPICC=mpiicc MAKE_NAME=impi
make -f Makefile.impi
make -f Makefile.impi install
make -f Makefile.impi doc.install
make -f Makefile.impi etc.install
At this point the EAR binaries are installed, including one version of the
EAR library for MPI (default), the EAR documentation, the service files for the EAR
daemons, and templates for the `ear.conf` file and the SLURM plugin. The configure
tool tries to automatically detect the paths to MySQL and/or PostgreSQL, the scheduler
sources, etc. Detecting the scheduler path is mandatory; SLURM is assumed by default.
After running configure, check in the Makefile that all the options have been
detected. After `make install`, you should have the following folders in the
ear-install-path: bin, sbin, etc, lib, include, man. The bin directory includes
commands and tools, sbin includes the EAR services, lib includes all the
libraries and plugins, and etc includes templates and examples for the EAR service
files, the ear.conf file, the EAR module, etc.
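The expected folder layout can be verified with a short script. The following is a minimal sketch; `ear_missing_dirs` is a hypothetical helper name, not an EAR command, and it only checks that the folders listed above exist under the given prefix.

```shell
# Hypothetical helper: reports which of the expected top-level folders
# are missing from an EAR installation prefix.
ear_missing_dirs() {
    install_prefix="$1"
    missing=""
    for dir in bin sbin etc lib include man; do
        [ -d "$install_prefix/$dir" ] || missing="$missing $dir"
    done
    # Print the missing folders (if any) and fail when the list is not empty.
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
}
```

With the example above, it would be called as `ear_missing_dirs /opt/ear`; it prints nothing and succeeds when all six folders are present.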
## Deployment and validation
### Monitoring: Compute node and DB
**Prepare the configuration**
Whether installed from sources or rpm, EAR installs two templates for the `ear.conf` file:
`$EAR_ETC/ear/ear.conf.template` and `$EAR_ETC/ear/ear.conf.full.template`.
The full version includes all fields. Copy only one of them to `$EAR_ETC/ear/ear.conf` and update
it with the desired configuration. Go to the [configuration](www.example.org) section to see how to do it.
The `ear.conf` file is used by all the services. It is recommended to keep it in a shared folder to simplify configuration changes.
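The copy step can be sketched as follows. `ear_conf_init` is a hypothetical helper introduced here for illustration; it assumes the basic template is the one wanted and refuses to overwrite an `ear.conf` that already exists.

```shell
# Hypothetical helper: creates ear.conf from the basic template unless
# ear.conf already exists, so an edited configuration is never overwritten.
ear_conf_init() {
    etc_dir="$1"                          # value of $EAR_ETC
    conf_file="$etc_dir/ear/ear.conf"
    if [ -e "$conf_file" ]; then
        echo "keeping existing $conf_file"
    else
        cp "$etc_dir/ear/ear.conf.template" "$conf_file" \
            && echo "created $conf_file"
    fi
}
```

It would typically be invoked as `ear_conf_init "$EAR_ETC"`, after which the new file still has to be edited with the desired configuration.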
**EAR module**
Install and load the EAR module to enable commands. It can be found at `$EAR_ETC/module`.
You can add the ear module when it is not in a standard path by doing `module use $EAR_ETC/module` and then
`module load ear`.
**EAR Database**
Create the EAR database with `edb_create`, installed at `$EAR_INSTALL_PATH/sbin`.
The `edb_create -p` command will ask you for the DB root password.
If you run into any problem here, first check whether the node where you are running the
command can connect to the DB server. If problems persist, execute `edb_create -o` to report the specific SQL
queries generated. If you still have trouble, contact ear-support@bsc.es or open an issue.
**Energy models**
EAR uses a power and performance model based on system signatures.
These system signatures are stored in coefficient files.
Before starting EARD, and just for testing, you need to create a dummy coefficient file and copy it to the coefficients path, by default `$EAR_ETC/coeffs`. Use the `coeffs_null` application from the tools section.
> EAR version 4.1 does not require null coefficients.
**EAR services**
Create soft links to the EAR service files in the services folder, or copy them there,
so that services can be started and stopped with system commands such as `systemctl`. EAR service files
are generated at `$EAR_ETC/systemd` and they can usually be placed in `$(ETC)/systemd`.
- EARD must be started on compute nodes.
- EARDBD must be started on service nodes (can be any node with DB access).
Enable and start EARDs and EARDBDs via services (e.g., `sudo systemctl start eard`, `sudo systemctl start eardbd`).
EARDBD and EARD outputs can be found at `$EAR_TMP/eardbd.server.log` and `$EAR_TMP/eard.log`, respectively, when the _DBDaemonUseLog_ and _NodeUseLog_ options are set to _1_ in the `ear.conf` file.
Otherwise, their outputs are written to _stderr_ and can be viewed with the `journalctl` command (e.g., `journalctl -u eard`).
By default, a certain level of verbosity is set. It is not recommended to modify
it but you can change it by modifying the value of constants in file `src/common/output/output_conf.h`.
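The soft-link step can be sketched as a small function. `ear_link_services` is a hypothetical helper, not an EAR tool, and the destination directory is an assumption that depends on your distribution (`/etc/systemd/system` is a common choice). Pass an absolute source path so the links stay valid.

```shell
# Hypothetical helper: soft-links every EAR unit file into a systemd directory.
ear_link_services() {
    unit_src_dir="$1"     # absolute path, e.g. the value of $EAR_ETC/systemd
    unit_dest_dir="$2"    # the systemd unit directory of the node
    for unit in "$unit_src_dir"/ear*.service; do
        [ -e "$unit" ] || continue        # the glob matched nothing
        ln -sf "$unit" "$unit_dest_dir/"
    done
}
```

After linking, `sudo systemctl daemon-reload` makes systemd pick up the new unit files before the services are started.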
**Quick validation**
Check that EARDs are up and running correctly with `econtrol --status`
(note that daemons take around a minute to report energy correctly and stop showing up as an error in `econtrol`).
EARDs create a per-node text file with the values reported to the EARDBD (local to compute nodes).
In case there are problems when running `econtrol`, you can also find this file at `$EAR_TMP/nodename.pm_periodic_data.txt`.
Check that EARDs are reporting metrics to the database with `ereport`. `ereport -n all`
should report the total energy sent by each daemon since the setup.
### Monitoring: EAR plugin
- Set up EAR's SLURM plugin (see the [configuration](www.example.org) section for
more information).
> It is recommended to create a soft link to the `$EAR_ETC/slurm/ear.plugstack.conf`
file in the `/etc/slurm/plugstack.conf.d` directory to simplify the EAR plugin management.
> For a first test it is recommended to set `default=off` in `ear.plugstack.conf`
(to disable the automatic loading of the EAR library).
**EAR plugin validation**
At this point you should be able to see the EAR options when running, for example, `srun --help`.
You should see something like the output below. The EAR plugin must be enabled on login
and compute nodes.
[user@hostname ~]$ srun --help
Usage: srun [OPTIONS(0)... [executable(0) [args(0)...]]] [ : [OPTIONS(N)...]] executable(N) [args(N)...]
Parallel run options:
Constraint options:
Consumable resources related options:
Affinity/Multi-core options: (when the task/affinity plugin is enabled)
Options provided by plugins:
--ear=on|off Enables/disables Energy Aware Runtime Library
--ear-policy=type Selects an energy policy for EAR
--ear-cpufreq=frequency Specifies the start frequency to be used by EAR
policy (in KHz)
--ear-policy-th=value Specifies the threshold to be used by EAR policy
(max 2 decimals) {value=[0..1]}
--ear-user-db=file Specifies the file to save the user applications
metrics summary 'file.nodename.csv' file will be
created per node. If not defined, these files
won't be generated.
--ear-verbose=value Specifies the level of the
verbosity{value=[0..1]}; default is 0
--ear-learning=value Enables the learning phase for a given P_STATE
--ear-tag=tag Sets an energy tag (max 32 chars)
Help options:
-h, --help show this help message
--usage display brief usage message
Other options:
-V, --version output version information and exit
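This check can also be automated. The sketch below defines `has_ear_options`, a hypothetical helper that only looks for the `--ear=on|off` option text exactly as it appears in the listing above.

```shell
# Hypothetical helper: reads `srun --help` output on standard input and
# succeeds only if the EAR plugin option --ear=on|off is listed.
# -F treats the pattern as a fixed string, so "|" is matched literally.
has_ear_options() {
    grep -qF -e '--ear=on|off'
}
```

A possible use is `srun --help | has_ear_options && echo "EAR plugin options found"`, run on both a login node and a compute node.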
- Submit one application via SLURM and check that it is correctly reported to the database with `eacct` command.
> Note that only privileged users can check other users’ applications.
- Submit one MPI application (corresponding with the version you have compiled) with `--ear=on` and check that now the output of `eacct` includes the Library metrics.
- Set `default=on` in `ear.plugstack.conf` to load the EAR Library by default. If default is turned off, EARL can be explicitly loaded by setting the flag `--ear=on` at job submission.
At this point, you can use EAR for monitoring and accounting purposes but it cannot use the power policies offered by EARL.
To enable them, first perform a learning phase and compute node coefficients. See the [EAR learning phase](www.example.org) wiki page.
For the coefficients to be active, restart daemons.
> **Important** Reloading daemons will NOT make them load coefficients, restarting the service is the only way.
## EAR Library versions: MPI vs. Non-MPI
As mentioned in the overview, the EAR Library is loaded next to the user MPI
application by the EAR Loader.
The Library uses MPI symbols, so it is compiled using the includes provided
by your MPI distribution. The library version is selected automatically at runtime,
but not during the compilation and installation steps.
Each compiled library version has its own file name, which has to be defined through the
`MPI_VERSION` variable when running `./configure` or by editing the
root Makefile.
The library name for each distribution is shown in the following table:
| **Distribution** | **Name** | **MPI_VERSION value** |
|:----------------: |--------------------- |----------------------- |
| Intel MPI | libear.so (default) | not required |
| MVAPICH | libear.so (default) | not required |
| OpenMPI | libear.ompi.so | ompi |
If different MPI distributions share the same library name, their
symbols are mutually compatible, so compiling and installing the library
once is enough.
However, if you provide incompatible MPI distributions to users,
you will have to compile and install the library multiple times.
EAR makefiles include a specific target for each [EAR component](#ear-components),
supporting full or partial updates:
| Command | Description |
|--------------------------------------------------- |------------------------------------------------- |
| `make -f Makefile.make_extension install` | Reinstall all the files except `etc` and `doc`. |
| `make -f Makefile.make_extension earl.install` | Reinstall only the EARL. |
| `make -f Makefile.make_extension eard.install` | Reinstall only the EARD. |
| `make -f Makefile.make_extension earplug.install` | Reinstall only the EAR SLURM plugin. |
| `make -f Makefile.make_extension eardbd.install` | Reinstall only the EARDBD. |
| `make -f Makefile.make_extension eargmd.install` | Reinstall only the EARGMD. |
| `make -f Makefile.make_extension reports.install` | Reinstall only report plugins. |
Before compiling new libraries, you have to complete a first installation by typing `make install`.
Then you can run `./configure` again, changing the `MPICC`, `MPICC_FLAGS`
and `MPI_VERSION` variables, or just open the root Makefile and edit the
same variables together with `MPI_BASE`, which sets the MPI installation root path.
Now type `make full` to perform a clean compilation and `make earl.install`
to install only the new version of the library.
If your MPI version is not fully compatible, please contact ear-support@bsc.es.
See the [User guide](User guide) to check the use cases supported and how to submit jobs with EAR.
# Installing from RPM
EAR includes the specification files to create an rpm from an already existing
installation. The spec file is placed at `etc/rpms`.
Creating the RPM requires a valid installation from source.
The RPM can be part of the system image.
Visit the [Requirements](RPM requirements) page for a quick overview of the requirements.
Execute the `rpmbuild.sh` script to create the EAR rpm file.
Once created, it can be included in the compute node images. This is recommended
only when no more changes are expected on the installation.
Once you have the rpm file, execute the following steps:
- Before the installation, make sure the installation path is accessible by all the compute nodes. Do the same for the folder where you want to store the temporary files (it will be called `$(EAR_TMP)` in this guide for simplicity).
- Default paths are `/usr` and `/etc`.
- Run `rpm -ivh --relocate /usr=/new/install/path --relocate /etc=/new/etc/path ear.version.rpm`.
> You can also use the `--nodeps` if your dependency test fails.
- During the installation, the `*.in` configuration files are compiled into their ready-to-use versions, replacing tags with the correct paths. Check the [next section](#installation-content)
for more information about those files.
- Type `rpm -e ear.version` to uninstall.
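When several nodes or images use different paths, it can help to generate the relocation command instead of typing it. The sketch below only prints the command shown above with the paths substituted; `ear_rpm_install_cmd` is a hypothetical helper name, and nothing is installed by calling it.

```shell
# Hypothetical helper: prints the rpm command that installs EAR with the
# default /usr and /etc paths relocated. It does not run rpm itself.
ear_rpm_install_cmd() {
    new_install_path="$1"   # replaces the default /usr
    new_etc_path="$2"       # replaces the default /etc
    rpm_file="$3"           # the EAR rpm file
    echo "rpm -ivh --relocate /usr=$new_install_path --relocate /etc=$new_etc_path $rpm_file"
}
```

Review the printed command before running it with the required privileges.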
## Installation content
The `*.in` configuration files are compiled into `etc/ear/ear.conf.template`
and `etc/ear/ear.full.conf.template`, `etc/module/ear`, `etc/slurm/ear.plugstack.conf`
and various `etc/systemd/ear*.service`. You can find more information in
the [configuration](Configuration) page.
The table below describes the complete hierarchy of the EAR installation:
| Directory | Content / description |
|-------------------- |---------------------------------------------- |
| `/usr/lib` | Libraries and the scheduler plugin. |
| `/usr/lib/plugins` | EAR plugins. |
| `/usr/bin` | EAR commands. |
| `/usr/bin/tools` | EAR tools for coefficients computation. |
| `/usr/sbin` | Privileged components: EARD, EARDBD, EARGMD. |
| `/etc/ear` | Configuration files templates. |
| `/etc/ear/coeffs` | Folder to store coefficient files. |
| `/etc/module` | EAR module. |
| `/etc/slurm` | EAR SLURM plugin configuration file. |
| `/etc/systemd` | EAR service files. |
# Next steps
For a better overview of the installation process, return to the [installation guide](#quick-installation-guide).
To continue the installation, visit the [configuration page](Configuration) to properly set up the EAR configuration file and the EAR SLURM plugin stack file.