# Components
EAR is composed of five main components:

- [Node Manager (EARD)](#daemon). The Node Manager must have root access to the node where it runs.
- [Database Manager (EARDBD)](#database-daemon). The Database Manager requires access to the DB server (MariaDB and PostgreSQL are supported). Documentation for PostgreSQL is still under development.
- [Global Manager (EARGM)](#global-manager). The Global Manager needs access to all Node Managers in the cluster as well as to the database.
- [Library (EARL)](#library)
- [SLURM plugin](#slurm-Plugin)

The following image shows the main interactions between components:

<img src="./images/EAR_arch.png" align="center" width=500>
# Quick Installation Guide
This section provides a condensed, step-by-step installation and execution guide for EAR. For a more in-depth explanation of each step, see [Installation from source](Installation-from-source) or [Installation from RPM](Installation-from-RPM), followed by the [Configuration](Configuration) and [Execution](Execution) guides, or contact us at ear-support@bsc.es.
## Requirements
- To install EAR from sources, the following libraries and environments are needed: a C compiler, PAPI, GSL, MPI, and mysqlclient for MariaDB.
- When installing EAR from RPM (binaries only), none of these dependencies except mysqlclient are required at installation time. However, they are needed when running EAR.
- SLURM must also be present if the SLURM plugin is going to be used. Since the current EAR version only supports automatic execution of applications with the EAR library through the SLURM plugin, SLURM must be running whenever the EAR library is used (it is not needed for node monitoring).
- The drivers for CPUFreq management (acpi-cpufreq) and Open IPMI must be present and loaded.
- MySQL server must be up and running.
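
Several of these requirements can be checked on a node with a short script. The following is only a sketch: the helper names (`have_cmd`, `module_loaded`, `report`, `check_requirements`) are hypothetical, the command names `cc`, `mpicc` and `srun` are assumed stand-ins for the C compiler, MPI and SLURM, and the kernel module names `acpi_cpufreq` and `ipmi_devintf` are assumptions about how the drivers appear in `/proc/modules`. Drivers built into the kernel are not listed there, so "not listed" does not necessarily mean the driver is absent.

```shell
#!/bin/sh
# Sketch of a requirements check; all helper names are hypothetical.

# Succeeds if the given command is found in PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Succeeds if the given kernel module is listed in /proc/modules.
# Assumption: drivers built into the kernel are not listed there.
module_loaded() {
    [ -r /proc/modules ] && grep -q "^$1 " /proc/modules
}

# Print one status line: $1 = label, $2 = status text.
report() {
    printf '%-24s %s\n' "$1" "$2"
}

check_requirements() {
    # Assumed command names for the C compiler, MPI, SLURM and systemd.
    for cmd in cc mpicc srun systemctl; do
        if have_cmd "$cmd"; then report "$cmd" ok; else report "$cmd" missing; fi
    done
    # Assumed kernel module names for CPUFreq and OpenIPMI.
    for mod in acpi_cpufreq ipmi_devintf; do
        if module_loaded "$mod"; then report "$mod" ok; else report "$mod" "not listed"; fi
    done
}

check_requirements
```

The script only reports what it finds and does not change the system. Database connectivity is not covered here and has to be checked separately.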
## Installation, configuration and execution
1. Compile and install from source code, or install via .rpm. `EAR_TMP` and `EAR_ETC` are defined in the ear module. Until the module is loaded, define these environment variables manually to execute the next steps.
2. Create the `$EAR_TMP` folder. This folder must be local to each node, so we recommend creating it in /var/ear.
3. Whether installed from sources or from RPM, EAR installs a template for the ear.conf file in `$EAR_ETC/ear/ear.conf.template`. Copy it to `$EAR_ETC/ear/ear.conf` and update it with the desired configuration. See our [ear.conf](Configuration#ear-configuration-file) page for details. The ear.conf file is used by all the services.
4. Load the EAR module to enable the commands. It can be found in `$EAR_ETC/module`. If the ear module is not in the standard module paths, add it with `module use $EAR_ETC/module` and then run `module load ear`.
5. Create the EAR database with `edb_create`. The `edb_create -p` command will ask you for the DB root password. If you run into any problem here, first check that the node where you are running the command can connect to the DB server. If problems persist, execute `edb_create -o` to print the specific SQL queries generated. If you still have trouble, contact [ear-support@bsc.es](mailto:ear-support@bsc.es).
6. EAR uses a power and performance model based on system signatures. These system signatures are stored in coefficient files. Before starting EARD, and only for testing purposes, you need to create a dummy coefficient file and copy it into the coefficients path (by default `$EAR_ETC/coeffs`). See the coeffs_null application in the [tools section](Tools).
7. Copy the EAR service files so that the services can be started and stopped with system commands such as systemctl. EAR service files are generated at `$EAR_ETC/systemd` and can usually be placed in `$(ETC)/systemd`.
8. Start the EARDs and EARDBDs via services (see [Launching the components with unit services](Execution#launching-the-components-through-unit-services)). EARDBD and EARD outputs can be found at `$EAR_TMP/eardbd.log` and `$EAR_TMP/eard.log` respectively when the DBDaemonUseLog and NodeUseLog options are set to 1 in the ear.conf file. Otherwise, their output goes to stderr and can be viewed with the journalctl command. For instance, use `journalctl -u eard` to look at the eard output.
9. Check that the EARDs are up and running correctly with `econtrol --status` (note that the daemons take around a minute to report energy correctly; until then they may show up as an error in `econtrol`). Each EARD creates a per-node text file with the values reported to the EARDBD. If there are problems when running econtrol, you can also find this file at `$EAR_TMP/nodename.pm_periodic_data.txt`.
10. Check that the EARDs are reporting metrics to the database with `ereport` (`ereport -n all` should report the total energy sent by each daemon since the setup).
11. Start EARGM via services.
12. Check that EARGM is reporting to the database with `ereport -g`. Note that EARGM takes a period of time, set by the admin in `ear.conf` with the GlobalManagerPeriodT1 option, to report for the first time.
13. Set up EAR's SLURM plugin (see our [Configuration](Configuration) page for more information).
14. Run an application via SLURM and check that it is correctly reported to the database with `eacct`. (Note that only privileged users can check other users' applications.)
15. Run an MPI application with `--ear=on` and check that the report by `eacct` now includes the library metrics. The EAR library depends on the MPI version: Intel, OpenMPI, etc. By default, libear.so is used. Different names for different versions can be selected automatically by setting the EAR version name in the corresponding MPI module. For instance, for the libear.openmpi.4.0.0.so library, define the **SLURM_EAR_MPI_VERSION** environment variable as openmpi.4.0.0. When EAR has been installed from sources, this name is the one specified in MPI_VERSION during configure. When installed from RPM, look at `$EAR_INSTALL_PATH/lib` to see the available versions.
16. Set `default=on` in `plugstack.conf` to have the EAR library loaded with all applications by default. If default is set to off, the EAR library can still be loaded explicitly with `--ear=on` when submitting a job.
17. At this point you can use EAR for monitoring and accounting purposes, but EARL cannot yet use the power policies. To enable them, first run a [learning phase](Learning-phase) and compute the coefficients.
18. For the coefficients to become active, restart the daemons. __IMPORTANT__: reloading the daemons will NOT make them load the coefficients; restarting is the only way.
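
Steps 2 and 3 of the list above can be sketched as a small shell helper. This is a minimal illustration, not part of EAR: the function name `ear_prepare_node` is hypothetical, and the two directories are passed as arguments because `EAR_TMP` and `EAR_ETC` are not defined until the ear module is loaded.

```shell
#!/bin/sh
# Sketch of steps 2 and 3; the helper name is hypothetical.

# Create the node-local temporary folder and, if no ear.conf exists yet,
# seed it from the installed template. An existing ear.conf is never overwritten.
#   $1 = directory to use as EAR_TMP (for example /var/ear)
#   $2 = directory to use as EAR_ETC
ear_prepare_node() {
    tmp_dir=$1
    etc_dir=$2
    mkdir -p "$tmp_dir" || return 1
    template="$etc_dir/ear/ear.conf.template"
    conf="$etc_dir/ear/ear.conf"
    if [ ! -f "$template" ]; then
        echo "template not found: $template" >&2
        return 1
    fi
    if [ -f "$conf" ]; then
        echo "keeping existing $conf"
    else
        cp "$template" "$conf" && echo "created $conf"
    fi
}

# Usage, once EAR_TMP and EAR_ETC have been defined by hand:
#   ear_prepare_node "$EAR_TMP" "$EAR_ETC"
```

The copied file still has to be edited by hand with the desired configuration. After that, `module use $EAR_ETC/module` followed by `module load ear` makes the EAR commands available for the remaining steps.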
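
The library naming rule described in step 15 can be illustrated with a short shell function that derives the value for `SLURM_EAR_MPI_VERSION` from a library file name. The function name `ear_mpi_version` is hypothetical and only mirrors the example given in the text; it is a sketch of the naming convention, not an EAR tool.

```shell
#!/bin/sh
# Sketch: map an EAR library file name to the value expected in
# SLURM_EAR_MPI_VERSION. The function name is hypothetical.

ear_mpi_version() {
    name=${1##*/}                      # drop any leading directory
    case $name in
        libear.so)
            echo ""                    # default library: no version name
            ;;
        libear.*.so)
            name=${name#libear.}       # strip the "libear." prefix
            echo "${name%.so}"         # strip the ".so" suffix
            ;;
        *)
            return 1                   # not an EAR library name
            ;;
    esac
}

# Example from the text: prints openmpi.4.0.0
ear_mpi_version libear.openmpi.4.0.0.so
```

The printed value is what the corresponding MPI module would export, for example `export SLURM_EAR_MPI_VERSION=openmpi.4.0.0`, before a job is submitted with `--ear=on`.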