... | ... | @@ -7,9 +7,9 @@ The following requirements must be met for EAR to work properly: |
|
|
### EAR paths
|
|
|
|
|
|
**EAR folders** EAR uses two paths for EAR configuration:
|
|
|
|
|
|
|
|
|
- **EAR_TMP:** _tmp_ear_path_ must be a private folder on each compute node, created by the admin, with read/write permissions for normal users. Communication files are created here. For instance: `mkdir /var/ear; chmod ugo+rwx /var/ear`
|
|
|
- **EAR_ETC:** _etc_ear_path_ must be readable by normal users on all compute nodes. It can be a shared folder in GPFS (simple to manage) or replicated on each node, since it holds very little data (**ear.conf** and coefficients) and is modified infrequently. Coefficients can be installed in a different path, specified at configure time with the **COEFFS** flag. Both `ear.conf` and the coefficients must be readable on all nodes (compute and _"service"_ nodes).
|
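The permission requirement on the temporary folder can be checked per node with a short script. The following is only a sketch: the helper name `normal_users_can_read_write` is hypothetical, and it assumes that "read/write permissions for normal users" means the read, write, and search bits for "others" are set, as the `chmod` example suggests.

```python
import os
import stat

def normal_users_can_read_write(path):
    """Return True if `path` is a directory that 'others' can read, write and enter."""
    mode = os.stat(path).st_mode
    # Permission bits for users that are neither the owner nor in the group.
    needed = stat.S_IROTH | stat.S_IWOTH | stat.S_IXOTH
    return stat.S_ISDIR(mode) and (mode & needed) == needed
```

Calling `normal_users_can_read_write("/var/ear")` on a compute node would then report whether the example folder is usable; replace the path with your own _tmp_ear_path_.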
|
|
|
|
|
**ear.conf** `ear.conf` is an ASCII file that sets default values and describes the cluster. An `ear.conf` is automatically generated from an **ear.conf.in** template. However, the administrator must add installation details such as the hostnames for EAR services, ports, default values, and the list of nodes. For more details, check [EAR configuration file](#ear-configuration-file) below.
|
|
|
|
... | ... | @@ -19,21 +19,15 @@ MySQL or PostgreSQL database: EAR saves data in a MySQL/PostgreSQL DB server. EA |
|
|
|
|
|
### EAR SLURM plug-in
|
|
|
|
|
|
|
|
|
The EAR SLURM plug-in can be enabled by adding a line to the `/etc/slurm/plugstack.conf` file. You can copy it from the `ear_etc_path/slurm/ear.plugstack.conf` file.
|
|
|
|
|
|
|
|
|
Another way to enable it is to create the directory `/etc/slurm/plugstack.conf.d` and copy the `ear_etc_path/slurm/ear.plugstack.conf` file there. In that case, the content of `/etc/slurm/plugstack.conf` must be `include /etc/slurm/plugstack.conf.d/*`.
|
|
|
|
|
|
# EAR configuration file
|
|
|
|
|
|
|
|
|
The **ear.conf** is a text file describing the EAR package behaviour in the cluster. It must be readable by all compute nodes and by nodes where commands are executed. Two `ear.conf` templates are generated with default values and will be installed as reference when executing `make etc.install`.
|
|
|
|
|
|
|
|
|
Usually the first word in a configuration entry identifies the component the option relates to. Lines starting with `#` are comments. A test for the `ear.conf` file can be found in the path `src/test/functionals/ear_conf`. Running it is recommended, since the `ear.conf` parser is very sensitive to errors in the `ear.conf` syntax, such as stray spaces or newlines.
|
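Before running the functional test, a rough pre-check for stray spaces can catch the most common mistakes. The sketch below is not the EAR parser and makes no claim about what the real parser accepts. It assumes that every option is a whitespace-free `key=value` token, which matches the examples on this page, and the function name `check_conf_lines` is hypothetical.

```python
def check_conf_lines(text):
    """Return (line_number, message) pairs for tokens that do not look like key=value."""
    problems = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Blank lines and comment lines (starting with '#') carry no options.
        if not line or line.startswith("#"):
            continue
        for token in line.split():
            key, sep, value = token.partition("=")
            # Assumption: a valid token has a non-empty key and value around '='.
            if not sep or not key or not value:
                problems.append((number, f"suspicious token {token!r}"))
    return problems
```

For instance, a line written as `EARGMPowerCapMode = 1` (with spaces around `=`) is reported three times, once per fragment, while `EARGMPowerCapMode=1` passes.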
|
|
|
|
|
## Database configuration
|
|
|
|
... | ... | @@ -183,7 +177,7 @@ EARGMEnergyAction=no_action |
|
|
|
|
|
# Period at which the powercap thread is activated.
|
|
|
EARGMPowerPeriod=120
|
|
|
|
|
|
# Powercap mode: 0 is monitoring, 1 is hard powercap, 2 is soft powercap.
|
|
|
EARGMPowerCapMode=1
|
|
|
# Admins can specify to automatically execute a command in
|
|
|
# EARGMPowerCapSuspendAction when total_power >= EARGMPowerLimit*EARGMPowerCapResumeLimit/100
|
... | ... | @@ -201,9 +195,14 @@ EARGMPowerCapResumeAction=no_action |
|
|
# remote connections. An EARGM can also act as meta-eargm if the meta field is filled,
|
|
|
# and it will control the EARGMs whose ids are in said field. If two EARGMs are in the
|
|
|
# same node, setting the EARGMID environment variable overrides the node field and
|
|
|
|
|
|
# chooses the characteristics of the EARGM with the corresponding id.
|
|
|
|
|
|
# Only one EARGM can currently control the energy caps, so setting the rest to 0 is recommended.
|
|
|
# energy = 0 -> energy_cap disabled
|
|
|
# power = 0 -> powercap disabled
|
|
|
# power = N -> powercap budget for that EARGM (and the nodes it controls) is N
|
|
|
# power = -1 -> powercap budget is calculated by adding up the powercap set to each of the nodes under its control.
|
|
|
# This is incompatible with nodes that have their powercap unlimited (powercap = 1)
|
|
|
EARGMId=1 energy=1800 power=600 node=node1 port=50100 meta=1,2,3
|
|
|
EARGMId=2 energy=0 power=500 node=node1 port=50101
|
|
|
EARGMId=3 energy=0 power=500 node=node2 port=50100
|
... | ... | @@ -228,18 +227,17 @@ InstDir=/path/to/inst |
|
|
|
|
|
## EAR Authorized users/groups/accounts
|
|
|
|
|
|
|
|
|
Authorized users are allowed to change policies, thresholds and frequencies, and are expected to be administrators. A list of users, Linux groups, and/or SLURM accounts can be provided to allow normal users to perform those actions. Only normal Authorized users can execute the learning phase.
|
|
|
|
|
|
```
|
|
|
AuthorizedUsers=user1,user2
|
|
|
AuthorizedAccounts=acc1,acc2,acc3
|
|
|
AuthorizedGroups=xx,yy
|
|
|
```
|
|
|
|
|
|
## Energy tags
|
|
|
|
|
|
|
|
|
Energy tags are pre-defined configurations for some applications (the EAR Library is not loaded). Each energy tag accepts the user IDs, groups, and SLURM accounts of the users allowed to use that tag.
|
|
|
|
|
|
```
|
|
|
# General energy tag
|
... | ... | @@ -249,16 +247,15 @@ EnergyTag=memory-intensive pstate=4 users=user1,user2 groups=group1,group2 accou |
|
|
```
|
|
|
|
|
|
## Tags
|
|
|
|
|
|
|
|
|
Tags are used for architectural descriptions. Max. AVX frequencies are used in predictor models and are SKU-specific. At least one default tag must be included for the cluster to work properly.
|
|
|
|
|
|
The **min_power**, **max_power** and **error_power** fields are thresholds used to detect metric readings that might be invalid. A warning message is reported to syslog if a value falls outside these thresholds. The **error_power** field is a more extreme value: a metric that surpasses it is not reported to the database.
|
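The role of the three thresholds can be illustrated with a small classifier. This is a sketch of the behaviour described above, not EAR's implementation: the function name and the return labels are hypothetical, and whether the comparisons are strict or inclusive is an assumption.

```python
def classify_power(watts, min_power, max_power, error_power):
    """Classify a power reading (in watts) against a tag's thresholds."""
    if watts > error_power:
        # Beyond error_power: the metric would not be reported to the database.
        return "discard"
    if watts < min_power or watts > max_power:
        # Outside [min_power, max_power]: a syslog warning would be raised.
        return "warn"
    return "ok"
```

With the values `min_power=50`, `max_power=500` and `error_power=600`, a reading of 300 W is accepted, 550 W triggers a warning, and 650 W is discarded.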
|
|
|
|
|
A special energy plug-in or energy model can be specified in a tag that will override the global values previously defined in all nodes that have this tag associated with them.
|
|
|
|
|
|
|
|
|
Setting powercap to 0 disables the powercap, and it cannot be enabled at runtime. Setting powercap to 1 means no limit on power consumption, but a powercap can later be set without stopping eard.

List of accepted options:
|
|
|
|
|
|
- max_avx512 (GHz)
|
|
|
- max_avx2 (GHz)
|
|
|
- max_power (W)
|
... | ... | @@ -268,16 +265,20 @@ List of accepted options: |
|
|
- powercap (W)
|
|
|
- powercap_plugin (filename)
|
|
|
- energy_plugin (filename)
|
|
|
|
|
|
- gpu_powercap_plugin (filename)
|
|
|
- max_powercap (W)
|
|
|
|
|
|
- gpu_def_freq (GHz)
|
|
|
- cpu_max_pstate (0..max_pstate)
|
|
|
- imc_max_pstate (0..max_imc_pstate)
|
|
|
- energy_model (filename)
|
|
|
- imc_max_freq (GHz)
|
|
|
- imc_min_freq (GHz)
|
|
|
- idle_governor (governor name)
|
|
|
- idle_pstate (0..max_pstate)
|
|
|
|
|
|
```
|
|
|
Tag=6148 default=yes max_avx512=2.2 max_avx2=2.6 max_power=500 powercap=1 max_powercap=600 gpu_def_freq=1.4 energy_model=avx512_model.so energy_plugin=energy_nm.so powercap_plugin=dvfs.so gpu_powercap_plugin=gpu.so min_power=50 error_power=600 coeffs=coeffs.default
|
|
|
Tag=6126 max_avx512=2.3 max_avx2=2.9 coeffs=coeffs.6126.default max_power=600 error_power=700 idle_governor=ondemand
|
|
|
```
|
|
|
|
|
|
## Power policies plug-ins
|
... | ... | @@ -304,13 +305,9 @@ Policy=min_energy Settings=0.05 DefaultFreq=2.4 Privileged=1 |
|
|
|
|
|
## Island description
|
|
|
|
|
|
|
|
|
This section is mandatory since it is used for cluster description. Normally nodes are grouped in islands that share the same hardware characteristics as well as their database managers (EARDBDs). Each entry describes part of an island, and every node must be in an island.
|
|
|
|
|
|
|
|
|
There are two kinds of database daemons: one called **server** and another called **mirror**. Both buffer metrics, but only one performs the insert. The mirror performs the insert if the server process crashes or its node fails.
|
|
|
|
|
|
It is recommended that all islands maintain server-mirror symmetry. For example, if islands I0 and I1 use server N0 and mirror N1, the next island should either point to the same N0 and N1 or to new nodes N2 and N3. It should not point to N1 as server and N0 as mirror.
|
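The symmetry recommendation can be checked mechanically. The sketch below assumes each island entry has been reduced to a `(server, mirror)` pair of node names (presumably the `DBIP` and `DBSECIP` fields) and treats two pairs as compatible only when they are identical or share no node. The function name is hypothetical.

```python
def symmetry_violations(pairs):
    """Return the pairs of (server, mirror) tuples that share a node without being identical."""
    unique = sorted(set(pairs))
    violations = []
    for index, first in enumerate(unique):
        for second in unique[index + 1:]:
            # Distinct pairs must not reuse any node in either role.
            if set(first) & set(second):
                violations.append((first, second))
    return violations
```

For the example above, `[("N0", "N1"), ("N0", "N1"), ("N2", "N3")]` yields no violations, whereas adding `("N1", "N0")` is flagged against `("N0", "N1")`.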
|
|
|
... | ... | @@ -322,7 +319,6 @@ A tag can be specified that will apply to all the nodes in that line. If no tag |
|
|
|
|
|
Finally, if an EARGM is being used to cap power, the EARGMID field is necessary in at least one line, and specifies which EARGM controls the nodes declared in that line. If no EARGMID is found in a line, the first one found will be used (i.e., the EARGMID of the previous line).
|
|
|
|
|
|
|
|
|
```
|
|
|
# In the following example the nodes are clustered in two different islands,
|
|
|
# but Island 1 has two types of EARDBD configurations.
|
... | ... | @@ -342,24 +338,25 @@ Island=1 DBIP=node1181 DBSECIP=node1182 Nodes=node11[01-80] |
|
|
```
|
|
|
|
|
|
Detailed island accepted values:
|
|
|
|
|
|
- nodename_list accepts the following formats:
|
|
|
- Nodes=`node1,node2,node3`
|
|
|
- Nodes=`node[1-3]`
|
|
|
- Nodes=`node[1,2,3]`
|
|
|
|
|
|
- Any combination of the two bracketed formats will work. Nodes specified individually (the first format) currently have to be placed on their own line. For example:
|
|
|
- Valid formats:
|
|
|
- Island=1 Nodes=`node1,node2,node3`
|
|
|
- Island=1 Nodes=`node[1-3],node[4,5]`
|
|
|
|
|
|
- Invalid formats:
|
|
|
- Island=1 Nodes=`node[1,2],node3`
|
|
|
- Island=1 Nodes=`node[1-3],node4`
|
|
|
|
|
|
|
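The accepted node list formats can be illustrated with a small expander. This is a sketch written for this page, not the code EAR uses: the function names are hypothetical, it rejects lines that mix individual names with bracketed forms as described above, and it assumes zero padding inside ranges (for example `[01-80]`) is preserved.

```python
import re

def split_outside_brackets(spec):
    """Split `spec` on the commas that are not inside square brackets."""
    parts, current, depth = [], "", 0
    for char in spec:
        if char == "[":
            depth += 1
        elif char == "]":
            depth -= 1
        if char == "," and depth == 0:
            parts.append(current)
            current = ""
        else:
            current += char
    parts.append(current)
    return parts

def expand_nodes(spec):
    """Expand the value of a Nodes= field into a list of node names."""
    parts = split_outside_brackets(spec)
    has_brackets = ["[" in part for part in parts]
    # Individual names and bracketed forms must not share a line.
    if any(has_brackets) and not all(has_brackets):
        raise ValueError(f"mixed formats in one line: {spec!r}")
    names = []
    for part in parts:
        if "[" not in part:
            names.append(part)  # plain node name
            continue
        match = re.fullmatch(r"([^\[\]]*)\[([^\[\]]+)\]", part)
        if match is None:
            raise ValueError(f"cannot parse {part!r}")
        prefix, body = match.groups()
        for item in body.split(","):
            if "-" in item:
                low, high = item.split("-")
                width = len(low)  # assumption: keep the zero padding of the range start
                names.extend(f"{prefix}{number:0{width}d}"
                             for number in range(int(low), int(high) + 1))
            else:
                names.append(prefix + item)
    return names
```

With this sketch, `expand_nodes("node[1-3],node[4,5]")` returns the five names `node1` to `node5`, and `expand_nodes("node[1-3],node4")` raises `ValueError`.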
|
|
|
|
|
# SLURM SPANK plug-in configuration file
|
|
|
|
|
|
SLURM loads the plug-in through a file called `plugstack.conf`, which consists of a list of plug-ins. The file `etc/slurm/ear.plugstack.conf` contains an example entry with the plug-in, temporary, and configuration paths already set.
|
|
|
|
|
|
|
|
|
**Example**:
|
|
|
|
|
|
```
|
|
|
required ear_install_path/lib/earplug.so prefix=ear_install_path sysconfdir=etc_ear_path localstatedir=tmp_ear_path earlib_default=off
|
|
|
```
|
... | ... | @@ -368,31 +365,33 @@ The argument `prefix` points to the EAR installation path and it is used to load |
|
|
|
|
|
Also, there are two additional arguments. The first one, `nodes_allowed=`, followed by a comma-separated list of nodes, enables the plug-in only on those nodes. The second, `nodes_excluded=`, also followed by a comma-separated list of nodes, disables the plug-in only on the nodes in the list. These arguments are meant for very specific configurations and must be used with caution. If they are not needed, it is better to omit them.
|
|
|
|
|
|
|
|
|
**Example**:
|
|
|
|
|
|
```
|
|
|
required ear_install_path/lib/earplug.so prefix=ear_install_path sysconfdir=etc_ear_path localstatedir=tmp_ear_path earlib_default=off nodes_excluded=node01,node02
|
|
|
```
|
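The effect of `nodes_allowed` and `nodes_excluded` on a given node can be reasoned about with the sketch below. It only mirrors the description above and is not the plug-in's code: the function name is hypothetical, and giving `nodes_excluded` precedence when both arguments are present is an assumption.

```python
def plugin_enabled(node, plugstack_line):
    """Return True if the plug-in would be active on `node` for this plugstack entry."""
    # Collect key=value arguments; "required" and the library path contain no '='.
    args = dict(token.split("=", 1) for token in plugstack_line.split() if "=" in token)
    allowed = args.get("nodes_allowed")
    excluded = args.get("nodes_excluded")
    if excluded and node in excluded.split(","):
        return False  # assumption: exclusion wins if both arguments are given
    if allowed:
        return node in allowed.split(",")
    return True  # neither argument present: enabled everywhere
```

Applied to the example entry above, `node01` and `node02` come out disabled and every other node enabled.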
|
|
|
|
|
# MySQL/PostgreSQL
|
|
|
|
|
|
|
|
|
**WARNING**: If any EAR component runs on the same machine as the MySQL server, connection problems might occur. This does not happen with PostgreSQL. To solve those issues, enter in MySQL's CLI client the `CREATE USER` and `GRANT PRIVILEGES` queries from `edb_create -o`, changing the portion `'user_name'@'%'` to `'user_name'@'localhost'` so that EAR's users have access to the server from the local machine.

There are two ways to configure a database server for EAR's usage:
|
|
|
|
|
|
- Run `edb_create -r` located in `$EAR_INSTALLATION_PATH/sbin` from a node with root access to the MySQL server. This requires MySQL/PostgreSQL's section of ear.conf to be correctly written. For more info run `edb_create -h`.
|
|
|
- Manually create the database and users specified in ear.conf, as well as the required tables. If ear.conf has been configured, running `edb_create -o` outputs the queries the program would run, which contain everything EAR needs to function properly.
|
|
|
|
|
|
|
|
|
For more information about how each `ear.conf` flag changes the database creation, see our [Database section](EAR-Database). For further information about EAR's database management tools, see the [Commands section](EAR-commands#database-commands).
|
|
|
|
|
|
# MSR Safe
|
|
|
|
|
|
MSR Safe is a kernel module that allows reading and writing MSRs without root permission. EAR opens the MSR Safe files if opening the ordinary MSR files fails. MSR Safe requires a configuration file that lists the registers allowed for reading and writing. You can find configuration files in `etc/msr_safe` for Intel Skylake and later, and for AMD Zen and later.
|
|
|
|
|
|
You can pass these configuration files to the MSR Safe kernel module like this:
|
|
|
|
|
|
```
|
|
|
cat intel63 > /dev/cpu/msr_allowlist
|
|
|
```
|
|
|
|
|
|
|
|
|
You can find more information in the [official repository](https://github.com/LLNL/msr-safe)
|
|
|
|
|
|
# Next step
|
|
|
|
|
|
Visit the [execution page](Starting%20services) to run EAR's different components.