... | @@ -2,17 +2,17 @@ Configuration requirements |
... | @@ -2,17 +2,17 @@ Configuration requirements |
|
-------------
|
|
-------------
|
|
The following requirements must be met for EAR to work properly:
|
|
The following requirements must be met for EAR to work properly:
|
|
- EAR folders: EAR uses two paths for EAR configuration.
|
|
- EAR folders: EAR uses two paths for EAR configuration.
|
|
- EAR_TMP=tmp_ear_path must be a private folder per compute node. It must have read/write permissions for normal users. Communication files are created here. `tmp_ear_path` must be created by the admin.
|
|
- **EAR_TMP:** *tmp_ear_path* must be a private folder per compute node. It must have read/write permissions for normal users. Communication files are created here. It must be created by the admin.
|
|
For instance: `mkdir /var/ear; chmod ugo+rwx /var/ear`
|
|
For instance: `mkdir /var/ear; chmod ugo+rwx /var/ear`
|
|
- EAR_ETC=etc_ear_path must be readable for normal users in all compute nodes. It can be a shared folder in “GPFS” (simple to manage) or replicated data because it is very few data and modified at a very low frequency (`ear.conf` and coefficients). Coefficients can be installed in a different path specified at configure time in COEFFS flag. Both `ear.conf` and coefficients must be readable in all the nodes (compute and “service” nodes).
|
|
- **EAR_ETC:** *etc_ear_path* must be readable by normal users on all compute nodes. It can be a shared folder in "GPFS" (simple to manage) or replicated data, because it holds very little data and is modified very rarely (**ear.conf** and coefficients). Coefficients can be installed in a different path, specified at configure time with the **COEFFS** flag. Both ear.conf and the coefficients must be readable on all nodes (compute and “service” nodes).
|
|
- Configure `ear.conf`: `ear.conf` is an ascii file setting default values and cluster descriptions. An `ear.conf` is automatically generated based on a `ear.conf.in` template. However, sysadmin must include installation details such as hostname details for EAR services, ports, default values, and list of nodes. For more details, check [EAR configuration file](#ear-configuration-file) below.
|
|
- Configure **ear.conf**: ear.conf is an ASCII file that sets default values and cluster descriptions. An ear.conf is automatically generated from an **ear.conf.in** template. However, the sysadmin must add installation details such as hostnames for EAR services, ports, default values, and the list of nodes. For more details, check [EAR configuration file](#ear-configuration-file) below.
|
|
- MySQL DB or PostgreSQL DB: EAR saves data in a MySQL/PostgreSQL DB server. EAR DB can be created using `edb_create` command provided (MySQL/PostgreSQL server must be running and root access to the DB is needed)
|
|
- MySQL DB or PostgreSQL DB: EAR saves data in a MySQL/PostgreSQL DB server. EAR DB can be created using `edb_create` command provided (MySQL/PostgreSQL server must be running and root access to the DB is needed).
|
|
- Set EAR SLURM plugin
|
|
- Set EAR SLURM plugin
|
|
- EAR SLURM plugin must be set in /etc/slurm/plugstack.conf. EAR generates an example at ear_etc_path/slurm/ear.plugstack.conf. For more information see our [Plugin section](Configuration#slurm-spank-plugin-configuration-file) down below.
|
|
- EAR SLURM plugin must be set in /etc/slurm/plugstack.conf. EAR generates an example at *ear_etc_path/slurm/ear.plugstack.conf*. For more information see our [Plugin section](Configuration#slurm-spank-plugin-configuration-file) down below.
|
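The EAR_TMP permission requirement listed above can be verified with a short script. This is a minimal sketch and not part of EAR: the function name `is_world_read_write` is hypothetical, and `/var/ear` is only the example path used in the list.

```python
import os
import stat


def is_world_read_write(path):
    """Return True if `path` is a directory that user, group and others can read and write."""
    st = os.stat(path)
    if not stat.S_ISDIR(st.st_mode):
        return False
    # Read and write bits for owner, group and others.
    needed = (stat.S_IRUSR | stat.S_IWUSR |
              stat.S_IRGRP | stat.S_IWGRP |
              stat.S_IROTH | stat.S_IWOTH)
    return (st.st_mode & needed) == needed
```

Calling `is_world_read_write("/var/ear")` on each compute node would confirm that the folder exists with the expected mode before the EAR services are started.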
|
|
|
|
|
EAR configuration file
|
|
EAR configuration file
|
|
----------------------
|
|
----------------------
|
|
`ear.conf` is a text file describing the EAR package behaviour in the cluster. It must be readable by all compute nodes and by nodes where commands are executed.
|
|
**ear.conf** is a text file describing the EAR package behaviour in the cluster. It must be readable by all compute nodes and by nodes where commands are executed. Two ear.conf templates are generated with default values and will be installed as reference when executing `make etc.install`.
|
|
|
|
|
|
Usually the first word in the configuration file expresses the component related with the option. Lines starting with `#` are comments.
|
|
Usually the first word in the configuration file expresses the component related with the option. Lines starting with `#` are comments.
|
|
|
|
|
... | @@ -45,7 +45,7 @@ DBReportNodeDetail=1 |
... | @@ -45,7 +45,7 @@ DBReportNodeDetail=1 |
|
# Extended signature hardware counters reported to database.
|
|
# Extended signature hardware counters reported to database.
|
|
DBReportSigDetail=1
|
|
DBReportSigDetail=1
|
|
# Set to 1 if you want Loop signatures to be reported to database.
|
|
# Set to 1 if you want Loop signatures to be reported to database.
|
|
DBReportLoops=0
|
|
DBReportLoops=1
|
|
```
|
|
```
|
|
|
|
|
|
### EARD configuration. EARD are executed in compute nodes
|
|
### EARD configuration. EARD are executed in compute nodes
|
... | @@ -88,8 +88,6 @@ DBDaemonAggregationTime=60 |
... | @@ -88,8 +88,6 @@ DBDaemonAggregationTime=60 |
|
DBDaemonInsertionTime=30
|
|
DBDaemonInsertionTime=30
|
|
# Memory allocated per process. This allocation is used for buffering the data sent to the database by EARD or other components. If there is a server and a mirror in a node, double that value will be allocated. It is expressed in megabytes.
|
|
# Memory allocated per process. This allocation is used for buffering the data sent to the database by EARD or other components. If there is a server and a mirror in a node, double that value will be allocated. It is expressed in megabytes.
|
|
DBDaemonMemorySize=120
|
|
DBDaemonMemorySize=120
|
|
# The percentage of the memory buffer used by the previous field, by each type. These types are: mpi, non-mpi and learning applications, loops, energy metrics and aggregations and events, in that order. If a type gets 0% of space, this metric is discarded and not saved into the database.
|
|
|
|
DBDaemonMemorySizePerType=40,20,5,24,5,1,5
|
|
|
|
# When set to 1, eardbd uses a '$EAR_TMP'/eardbd.log file as a log file
|
|
# When set to 1, eardbd uses a '$EAR_TMP'/eardbd.log file as a log file
|
|
DBDaemonUseLog=1
|
|
DBDaemonUseLog=1
|
|
```
|
|
```
|
... | @@ -114,7 +112,7 @@ CheckEARModeEvery=1000 |
... | @@ -114,7 +112,7 @@ CheckEARModeEvery=1000 |
|
### EARGM configuration
|
|
### EARGM configuration
|
|
|
|
|
|
```INI
|
|
```INI
|
|
# The IP or hostname of the node where the EARGMD demon is running.
|
|
# The IP or hostname of the node where the EARGMD daemon is running.
|
|
EARGMHost=hostname
|
|
EARGMHost=hostname
|
|
# Port where EARGMD will be listening.
|
|
# Port where EARGMD will be listening.
|
|
EARGMPort=50000
|
|
EARGMPort=50000
|
... | @@ -135,7 +133,7 @@ EARGMMail=nomail |
... | @@ -135,7 +133,7 @@ EARGMMail=nomail |
|
# Percentage of accumulated energy to start the warning DEFCON level L4, L3 and L2.
|
|
# Percentage of accumulated energy to start the warning DEFCON level L4, L3 and L2.
|
|
EARGMWarningsPerc=85,90,95
|
|
EARGMWarningsPerc=85,90,95
|
|
# Number of "grace" T1 periods before doing a new re-evaluation. After a warning, EARGM will wait T1xGlobalManagerGracePeriods seconds until it raises a new warning.
|
|
# Number of "grace" T1 periods before doing a new re-evaluation. After a warning, EARGM will wait T1xGlobalManagerGracePeriods seconds until it raises a new warning.
|
|
EARGMGracePeriods=6
|
|
EARGMGracePeriods=3
|
|
# Verbose level
|
|
# Verbose level
|
|
EARGMVerbose=1
|
|
EARGMVerbose=1
|
|
# When set to 1, the output is saved in '$EAR_TMP'/eargmd.log (common configuration) as a log file.
|
|
# When set to 1, the output is saved in '$EAR_TMP'/eargmd.log (common configuration) as a log file.
|
... | @@ -143,13 +141,15 @@ EARGMUseLog=1 |
... | @@ -143,13 +141,15 @@ EARGMUseLog=1 |
|
# Format for action is: command_name energy_T1 energy_T2 energy_limit T2 T1 units
|
|
# Format for action is: command_name energy_T1 energy_T2 energy_limit T2 T1 units
|
|
# This action is automatically executed at each warning level (only once per grace period)
|
|
# This action is automatically executed at each warning level (only once per grace period)
|
|
EARGMEnergyAction=no_action
|
|
EARGMEnergyAction=no_action
|
|
|
|
#### POWERCAP definition for EARGM: Powercap is still under development. Do not activate
|
|
|
|
# 0 means no powercap
|
|
|
|
EARGMPowerLimit=0
|
|
```
|
|
```
|
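The relationship between `EARGMWarningsPerc` and the warning levels can be sketched as follows. This is an illustration under stated assumptions, not EARGM's actual code: the function name `warning_level` is hypothetical, the thresholds are assumed to be in ascending order, and a level is assumed to start when the percentage reaches its threshold.

```python
def warning_level(energy_used, energy_limit, warnings_perc=(85, 90, 95)):
    """Map consumed energy to a warning level label, or None below the first threshold.

    warnings_perc mirrors EARGMWarningsPerc: thresholds for L4, L3 and L2, in that order.
    """
    percent = 100.0 * energy_used / energy_limit
    level = None
    # Walk the thresholds from least to most severe and keep the last one reached.
    for label, threshold in zip(("L4", "L3", "L2"), warnings_perc):
        if percent >= threshold:
            level = label
    return level
```

With the default values shown in the configuration, 92% of the energy limit would fall in level L3.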
|
|
|
|
|
### Common configuration
|
|
### Common configuration
|
|
|
|
|
|
```INI
|
|
```INI
|
|
# Network extension (using another network instead of the local one). If compute nodes must be accessed from login nodes through a network other than the default one, and can be reached using an extension, uncomment the next line and define 'netext' accordingly.
|
|
|
|
# NetworkExtension=netext
|
|
|
|
# Default verbose level
|
|
# Default verbose level
|
|
Verbose=0
|
|
Verbose=0
|
|
# Path used for communication files, shared memory, etc. It must be PRIVATE per compute node and with read/write permissions. $EAR_TMP
|
|
# Path used for communication files, shared memory, etc. It must be PRIVATE per compute node and with read/write permissions. $EAR_TMP
|
... | @@ -189,7 +189,7 @@ EnergyTag=memory-intensive pstate=4 users=user1,user2 groups=group1,group2 accou |
... | @@ -189,7 +189,7 @@ EnergyTag=memory-intensive pstate=4 users=user1,user2 groups=group1,group2 accou |
|
|
|
|
|
|
|
|
|
### Tags
|
|
### Tags
|
|
Tags are used for architectural descriptions. Max. AVX frequencies are used in predictor models and are SKU-specific. At least a default tag is mandatory to be included for a cluster to work properly. At least a default tag is mandatory.
|
|
Tags are used for architectural descriptions. Max. AVX frequencies are used in predictor models and are SKU-specific. At least a default tag is mandatory to be included for a cluster to work properly.
|
|
|
|
|
|
`min_power`, `max_power` and `error_power` are threshold values used to detect possibly invalid metric readings: a warning message is reported to syslog if a value falls outside these thresholds. `error_power` is a more extreme threshold; a metric that surpasses it is not reported to the database.
|
|
`min_power`, `max_power` and `error_power` are threshold values used to detect possibly invalid metric readings: a warning message is reported to syslog if a value falls outside these thresholds. `error_power` is a more extreme threshold; a metric that surpasses it is not reported to the database.
|
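The threshold logic described above can be sketched as a small function. This is a hedged illustration rather than EAR's implementation: `classify_power` is a hypothetical name, and the `min_power` and `error_power` values used in the usage note are made up for the example (only `max_power=600` appears in the sample tag).

```python
def classify_power(power, min_power, max_power, error_power):
    """Return (warn, report) for one power reading.

    warn   -> the reading is outside [min_power, max_power], so a syslog warning is due.
    report -> the reading does not surpass error_power, so it may go to the database.
    """
    warn = power < min_power or power > max_power
    report = power <= error_power
    return warn, report
```

For example, with `min_power=50`, `max_power=600` and `error_power=700`, a reading of 650 triggers a warning but is still reported, while a reading of 900 triggers a warning and is discarded.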
|
|
|
|
... | @@ -208,17 +208,17 @@ Tag=6126 max_avx512=2.3 max_avx2=2.9 ceffs=coeffs.6126.default max_power=600 err |
... | @@ -208,17 +208,17 @@ Tag=6126 max_avx512=2.3 max_avx2=2.9 ceffs=coeffs.6126.default max_power=600 err |
|
## ---------------------------------------------------------------------------------------------------
|
|
## ---------------------------------------------------------------------------------------------------
|
|
#
|
|
#
|
|
## policy names must exactly match the file names of the policies installed in the system
|
|
## policy names must exactly match the file names of the policies installed in the system
|
|
DefaultPowerPolicy=min_time
|
|
DefaultPowerPolicy=monitoring
|
|
Policy=monitoring Settings=0 DefaultFreq=2.4 Privileged=0
|
|
Policy=monitoring Settings=0 DefaultFreq=2.4 Privileged=0
|
|
Policy=min_time Settings=0.7 DefaultFreq=2.0 Privileged=0
|
|
Policy=min_time Settings=0.7 DefaultFreq=2.0 Privileged=0
|
|
Policy=min_energy Settings=0.1 DefaultFreq=2.4 Privileged=1
|
|
Policy=min_energy Settings=0.05 DefaultFreq=2.4 Privileged=1
|
|
|
|
|
|
# For homogeneous systems, default frequencies can be easily specified using freqs; for heterogeneous systems it is preferred to use pstates
|
|
# For homogeneous systems, default frequencies can be easily specified using freqs; for heterogeneous systems it is preferred to use pstates
|
|
|
|
|
|
# Example with pstates (lower pstates correspond to higher frequencies). Pstate=1 is nominal and 0 is turbo
|
|
# Example with pstates (lower pstates correspond to higher frequencies). Pstate=1 is nominal and 0 is turbo
|
|
#Policy=monitoring Settings=0 DefaultPstate=1 Privileged=0
|
|
#Policy=monitoring Settings=0 DefaultPstate=1 Privileged=0
|
|
#Policy=min_time Settings=0.7 DefaultPstate=4 Privileged=0
|
|
#Policy=min_time Settings=0.7 DefaultPstate=4 Privileged=0
|
|
#Policy=min_energy Settings=0.1 DefaultPstate=1 Privileged=1
|
|
#Policy=min_energy Settings=0.05 DefaultPstate=1 Privileged=1
|
|
|
|
|
|
|
|
|
|
```
|
|
```
|
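Since the default policy must name one of the configured policies, a quick consistency check can be useful before deploying an `ear.conf`. The following is a minimal sketch that assumes the whitespace-separated `key=value` layout shown in the policy lines above; `parse_policies` is a hypothetical helper and not an EAR tool.

```python
def parse_policies(lines):
    """Parse `Policy=...` and `DefaultPowerPolicy=...` lines into (default, policies)."""
    default = None
    policies = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("DefaultPowerPolicy="):
            default = line.split("=", 1)[1]
        elif line.startswith("Policy="):
            # Each token is key=value; the Policy token gives the policy name.
            fields = dict(token.split("=", 1) for token in line.split())
            policies[fields.pop("Policy")] = fields
    return default, policies
```

After parsing, `default in policies` should hold; if it does not, the default policy refers to a policy that has no `Policy=` line.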
... | @@ -226,25 +226,34 @@ Policy=min_energy Settings=0.1 DefaultFreq=2.4 Privileged=1 |
... | @@ -226,25 +226,34 @@ Policy=min_energy Settings=0.1 DefaultFreq=2.4 Privileged=1 |
|
|
|
|
|
### Island description
|
|
### Island description
|
|
|
|
|
|
This section is mandatory since it is used for cluster description. Normally nodes are grouped in islands that share the same hardware characteristics as well as its database managers (EARDBDS). Each line describes an island, and every node must be in an island.
|
|
This section is mandatory since it is used for cluster description. Normally nodes are grouped in islands that share the same hardware characteristics as well as its database managers (EARDBDS). Each entry describes part of an island, and every node must be in an island.
|
|
|
|
|
|
Remember that there are two kinds of database daemons. One called 'server' and other one called 'mirror'. Both performs the metrics buffering process, but just one performs the insert. The mirror will do that insert in case the 'server' process crashes or the node fails.
|
|
There are two kinds of database daemons: one called 'server' and another called 'mirror'. Both perform the metrics buffering process, but only one performs the insert. The mirror will do that insert if the 'server' process crashes or its node fails.
|
|
|
|
|
|
It is recommended for all islands to have symmetry. For example, if the island I0 and I1 have the server N0 and the mirror N1, the next island would have to point the same N0 and N1 or point to new ones N2 and N3.
|
|
It is recommended that all islands maintain server-mirror symmetry. For example, if islands I0 and I1 have server N0 and mirror N1, the next island should point either to the same N0 and N1 or to new ones, N2 and N3, rather than to N1 as server and N0 as mirror.
|
|
|
|
|
|
Multiple EARDBDs are supported in the same island; in that case more than one line per island is required, and the symmetry condition still has to be met.
|
|
Multiple EARDBDs are supported in the same island; in that case more than one line per island is required, and the symmetry condition still has to be met.
|
|
|
|
|
|
It is recommended that for a island to the server and the mirror running in different nodes. However, the EARDBD program could be both server and mirror at the same time. This means that the islands I0 and I1 could have the N0 server and the N2 mirror, and the islands I2 and I3 the N2 server and N0 mirror, fulfilling the symmetry requirements.
|
|
It is recommended that the server and the mirror of an island run in different nodes. However, the EARDBD program can be both server and mirror at the same time. This means that islands I0 and I1 could have the N0 server and the N2 mirror, and islands I2 and I3 the N2 server and the N0 mirror, fulfilling the symmetry requirements.
|
|
|
|
|
|
|
|
A tag can be specified that will apply to all the nodes in that line. If no tag is defined, the default one will be used as hardware definition.
|
|
|
|
|
|
A tag can be specified that will apply to all the nodes in that line. If no tag is defined, the default one will be used as hardware definition.
|
|
```INI
|
|
|
|
#
|
|
|
|
# In the following example the nodes are clustered in two different islands, but Island 1 has
|
|
|
|
# two types of EARDBD configurations.
|
|
|
|
#
|
|
|
|
|
|
|
|
Island=0 DBIP=node1081 DBSECIP=node1082 Nodes=node10[01-80]
|
|
|
|
|
|
|
|
# These nodes are in island0 using different DB connections and with a different architecture
|
|
|
|
Island=0 DBIP=node1084 DBSECIP=node1085 Nodes=node11[01-80] tag=6126
|
|
|
|
# These nodes are in island0 and will use default values for DB connection (line 0 for island0) and default tag
|
|
|
|
Island=0 Nodes=node12[01-80]
|
|
|
|
|
|
```INI
|
|
|
|
Island=0 Nodes=nodename_list DBIP=EARDB_server_hostname DBSECIP=EARDB_mirror_hostname
|
|
|
|
|
|
|
|
#This second island uses a tag that is not the default one
|
|
# Will use default tag
|
|
Island=1 Nodes=nodename_list DBIP=EARDB_server_hostname DBSECIP=EARDB_mirror_hostname Tag=6126
|
|
Island=1 DBIP=node1181 DBSECIP=node1182 Nodes=node11[01-80]
|
|
```
|
|
```
|
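To see which hosts an island line covers, the bracketed node list can be expanded with a few lines of code. This is a sketch under assumptions: it treats `node10[01-80]` as an inclusive, zero-padded numeric range, which is how the examples above read, and it handles only a single bracketed range per entry. EAR's own parser may accept more forms. The names `expand_nodes` and `parse_island` are hypothetical.

```python
import re


def expand_nodes(spec):
    """Expand a bracketed range such as 'node10[01-80]' into a list of host names."""
    match = re.fullmatch(r"(.*)\[(\d+)-(\d+)\](.*)", spec)
    if match is None:
        return [spec]  # plain host name, nothing to expand
    prefix, first, last, suffix = match.groups()
    width = len(first)  # preserve zero padding, e.g. 01 -> two digits
    return [f"{prefix}{n:0{width}d}{suffix}" for n in range(int(first), int(last) + 1)]


def parse_island(line):
    """Split an 'Island=...' line into a dict of its key=value fields."""
    fields = dict(token.split("=", 1) for token in line.split())
    fields["Nodes"] = expand_nodes(fields["Nodes"])
    return fields
```

Parsing every island line this way and collecting the `Nodes` lists makes it easy to check that each compute node appears in some island, as the section requires.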
|
|
|
|
|
Detailed island accepted values:
|
|
Detailed island accepted values:
|
... | @@ -270,7 +279,14 @@ __Example__: |
... | @@ -270,7 +279,14 @@ __Example__: |
|
required ear_install_path/lib/earplug.so prefix=ear_install_path sysconfdir=etc_ear_path localstatedir=tmp_ear_path earlib_default=off
|
|
required ear_install_path/lib/earplug.so prefix=ear_install_path sysconfdir=etc_ear_path localstatedir=tmp_ear_path earlib_default=off
|
|
```
|
|
```
|
|
|
|
|
|
The argument `prefix` points to the EAR installation path and it is used to load the library using `LD_PRELOAD` mechanism. Also the `localstatedir` is used to contact with the EARD, which by default points the path you set during the `./configure` using `--localstatedir` or `EAR_TMP` arguments. Next to these fields, there is the field `earlib_default=off`, which means that by default EARL is not loaded, and `eargmd_host` and `eargmd_port`, if you plan to connect with the EARGMD component (you can leave this empty).
|
|
The argument `prefix` points to the EAR installation path and is used to load the library using the `LD_PRELOAD` mechanism. The `localstatedir` argument is used to contact the EARD; by default it points to the path you set during `./configure` using the `--localstatedir` or `EAR_TMP` arguments. Next to these fields is `earlib_default=off`, which means that EARL is not loaded by default. Finally, there are `eargmd_host` and `eargmd_port`, for when you plan to connect with the EARGMD component (you can leave them empty).
|
|
|
|
|
|
|
|
Also, there are two additional arguments. The first one, `nodes_allowed=`, followed by a comma-separated list of nodes, enables the plugin only on those nodes. The second, `nodes_excluded=`, also followed by a comma-separated list of nodes, disables the plugin only on the nodes in the list. These arguments are meant for very specific configurations and must be used with caution; if they are not needed, it is better to leave them out.
|
|
|
|
|
|
|
|
__Example__:
|
|
|
|
```
|
|
|
|
required ear_install_path/lib/earplug.so prefix=ear_install_path sysconfdir=etc_ear_path localstatedir=tmp_ear_path earlib_default=off nodes_excluded=node01,node02
|
|
|
|
```
|
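The effect of `nodes_allowed` and `nodes_excluded` can be sketched as a simple filter. This is an illustration only, not the plugin's code: `plugin_enabled` is a hypothetical name, and applying both filters when both arguments are present is an assumption, since the text above does not describe that case.

```python
def plugin_enabled(node, plugstack_line):
    """Decide whether the plugin is active on `node` for one plugstack.conf line."""
    # Collect the key=value arguments; tokens without '=' (e.g. 'required') are ignored.
    args = dict(tok.split("=", 1) for tok in plugstack_line.split() if "=" in tok)
    allowed = args.get("nodes_allowed")
    excluded = args.get("nodes_excluded")
    if allowed is not None and node not in allowed.split(","):
        return False
    if excluded is not None and node in excluded.split(","):
        return False
    return True
```

With the example line above, `node01` and `node02` are excluded and every other node keeps the plugin enabled.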
|
|
|
|
|
MySQL/PostgreSQL
|
|
MySQL/PostgreSQL
|
|
-----
|
|
-----
|
... | | ... | |