[[_TOC_]]

## EAR Configuration requirements

The following requirements must be met for EAR to work properly:

@@ -8,7 +8,7 @@ The following requirements must be met for EAR to work properly:

**EAR folders** EAR uses two paths for its configuration:

- **EAR_TMP:** _tmp_ear_path_ must be a private folder on each compute node, with read/write permissions for normal users. Communication files are created here. It must be created by the admin, for instance: `mkdir /var/ear; chmod ugo+rwx /var/ear`.
- **EAR_ETC:** _etc_ear_path_ must be readable by normal users on all compute nodes. It can be a shared folder in "GPFS" (simple to manage) or replicated data, since it holds very little data and is modified very rarely (**ear.conf** and coefficients). Coefficients can be installed in a different path, specified at configure time with the **COEFFS** flag. Both `ear.conf` and the coefficients must be readable on all nodes (compute and _"service"_ nodes).

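The EAR_TMP requirement can be checked before the EAR services are started. The following is a minimal sketch and not part of EAR: the helper name `is_usable_by_all_users` is hypothetical, and it assumes that "read/write permissions for normal users" means the read, write and search bits are set for user, group and others, as in the `chmod` example.

```python
import os
import stat

def is_usable_by_all_users(path):
    """Return True if `path` is a directory with rwx set for user, group and others.

    Hypothetical helper, not part of EAR.
    """
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return False  # the admin has not created the folder yet
    wanted = stat.S_IRWXU | stat.S_IRWXG | stat.S_IRWXO
    return stat.S_ISDIR(mode) and (mode & wanted) == wanted
```

Calling `is_usable_by_all_users("/var/ear")` on a compute node would report whether the example folder was created with the expected mode.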

**ear.conf** `ear.conf` is an ASCII file that sets default values and describes the cluster. An `ear.conf` is automatically generated from the **ear.conf.in** template. However, the administrator must add installation details such as the hostnames of the nodes running EAR services, ports, default values, and the list of nodes. For more details, check [EAR configuration file](#ear-configuration-file) below.

@@ -29,11 +29,24 @@ The **ear.conf** is a text file describing the EAR package behaviour in the clus

Usually the first word of an option in the configuration file names the component the option relates to. Lines starting with `#` are comments. A test for the `ear.conf` file can be found in `src/test/functionals/ear_conf`. Running it is recommended, since the `ear.conf` parser is very sensitive to errors in the file's syntax, spaces, newlines, etc.

To improve the readability of `ear.conf`, EAR version 5.0 adds an `include` clause that can be used to include additional files holding parts of the configuration, such as tags or the list of nodes. The syntax is:

```
include=absolute_path
```
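
Because the clause takes an absolute path, a quick pre-check can flag entries that are not absolute. The following is a minimal sketch and not EAR's parser: `relative_includes` is a hypothetical helper, and it assumes each `include=` entry sits on its own line with no spaces around `=`.

```python
import posixpath

def relative_includes(conf_text):
    """Return the include targets in `conf_text` that are not absolute paths."""
    bad = []
    for line in conf_text.splitlines():
        line = line.strip()
        if line.startswith("#") or not line.startswith("include="):
            continue  # skip comments and options other than include
        target = line[len("include="):]
        if not posixpath.isabs(target):
            bad.append(target)
    return bad
```

For instance, given the hypothetical entries `include=/etc/ear/nodes.conf` and `include=tags.conf`, only `tags.conf` is reported.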

### Database configuration

```
# The IP of the node where the MariaDB (MySQL) or PostgreSQL server process is running. The current version uses the same names for both DB servers.
DBIp=172.30.2.101
# Uncomment and add a secondary IP for high availability.
# If specified, the mysql plugin will automatically submit data to a second DB.
# Not supported with other report plugins.
# DBSECIP=add_secondary_ip_for_ha
# Port on which the server accepts connections.
DBPort=3306
@@ -70,7 +83,7 @@ NodeDaemonPort=50001
# Frequency used by power monitoring service, in seconds.
NodeDaemonPowermonFreq=60
# Maximum supported frequency (1 means nominal, no turbo).
NodeDaemonMinPstate=1
# Enable (1) or disable (0) the turbo frequency.
NodeDaemonTurbo=0
@@ -187,6 +200,9 @@ EARGMPowerCapSuspendAction=no_action
EARGMPowerCapResumeLimit=40
# Format for action is: command_name current_power current_limit total_idle_nodes total_idle_power
EARGMPowerCapResumeAction=no_action
# Sets the report plugins to use for EARGM warning and events accounting
EARGMReportPlugins=mysql.so

# EARGMs must be specified with a unique id, their node and the port that receives
# remote connections. An EARGM can also act as meta-eargm if the meta field is filled,
```

@@ -264,7 +280,7 @@ Powercap set to 0 means powercap is disabled and cannot be enabled at runtime. P

- energy_plugin (filename)
- gpu_powercap_plugin (filename)
- max_powercap (W)
- gpu_def_freq (KHz)
- cpu_max_pstate (0..max_pstate)
- imc_max_pstate (0..max_imc_pstate)
- energy_model (filename)

@@ -348,6 +364,25 @@ Detailed island accepted values:

- Island=1 Nodes=`node[1,2],node3`
- Island=1 Nodes=`node[1-3],node4`

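The two node-list forms in the list can be read as a bracketed comma-separated list and a bracketed numeric range, each optionally followed by further names. The sketch below illustrates that reading. It is not EAR's parser, `expand_nodes` is a hypothetical name, and it only handles the two forms shown.

```python
import re

def expand_nodes(expr):
    """Expand expressions such as 'node[1-3],node4' into a list of node names."""
    nodes = []
    # Split on commas that are outside square brackets.
    for item in re.split(r",(?![^\[]*\])", expr):
        match = re.fullmatch(r"([^\[\]]+)\[([^\[\]]+)\]", item)
        if not match:
            nodes.append(item)  # plain name, e.g. 'node3'
            continue
        prefix, body = match.groups()
        for part in body.split(","):
            if "-" in part:
                first, last = part.split("-")
                nodes.extend(f"{prefix}{i}" for i in range(int(first), int(last) + 1))
            else:
                nodes.append(f"{prefix}{part}")
    return nodes
```

Under this reading, `node[1,2],node3` names three nodes and `node[1-3],node4` names four.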

### EDCMON

This section specifies the list of sensors, types, PDU IPs, etc. for the edcmon. Even though the edcmon includes other plugins for testing purposes, its main goal is data center monitoring, so this section only addresses that use case.

```
# EDCMON section
# The sensor_list field must be placed at the end. It is a comma-separated list of sensor names.
# Use quotes "" to group sensor lists or sensor names including spaces.
# host can be any or a hostname
# pdu_type can be: storage, management, network or others
#
edcmontag=stg pdu_type=storage pdu_ips=pduip1,pduip2,pduip3 sensor_list="Internal Humidity,Internal Temperature,Total Real Power"
edcmontag=ntw pdu_type=network pdu_ips=pduip4 sensor_list="Internal Humidity,Internal Temperature,Total Real Power"
edcmontag=mgt pdu_type=management pdu_ips=pdu5 sensor_list="Power"
edcmontag=spe host=host1 pdu_type=others pdu_ips=pdu6 sensor_list=Power
```
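
To show how the fields of an `edcmontag` line fit together, the following sketch splits one line into its `key=value` pairs. It is an illustration only, not the edcmon's own parser: `parse_edcmon_line` is a hypothetical name, and the sketch assumes that quotes wrap the whole `sensor_list` value, as in the entries shown.

```python
import shlex

def parse_edcmon_line(line):
    """Split one edcmontag line into a dict; list-valued fields become lists."""
    fields = {}
    # shlex keeps a quoted sensor list together and drops the quotes.
    for token in shlex.split(line):
        key, _, value = token.partition("=")
        fields[key] = value
    # pdu_ips and sensor_list are comma-separated lists.
    for key in ("pdu_ips", "sensor_list"):
        if key in fields:
            fields[key] = fields[key].split(",")
    return fields
```

Applied to the `stg` entry, this yields three PDU IPs and three sensor names, with the spaces inside each sensor name preserved.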

## SLURM SPANK plug-in configuration file

SLURM loads the plug-in through a file called `plugstack.conf`, which consists of a list of plug-ins. The file `etc/slurm/ear.plugstack.conf` contains an example entry with the plug-in, temporary and configuration paths already set.