Configuration · Changes

ear-5.0 authored Sep 02, 2024 by Oriol Vidal Teruel
Configuration.md
View page @ 33467cc1
[[_TOC_]] [[_TOC_]]
## Configuration requirements ## EAR Configuration requirements
The following requirements must be met for EAR to work properly: The following requirements must be met for EAR to work properly:
...@@ -8,7 +8,7 @@ The following requirements must be met for EAR to work properly: ...@@ -8,7 +8,7 @@ The following requirements must be met for EAR to work properly:
**EAR folders** EAR uses two paths for EAR configuration: **EAR folders** EAR uses two paths for EAR configuration:
- **EAR_TMP:** _tmp_ear_path_ must be a private folder per compute node. It must have read/write permissions for normal users. Communication files are created here. It must be created by the admin. For instance: `mkdir /var/ear; chmod ugo+rwx /var/ear` - **EAR_TMP:** _tmp_ear_path_ must be a private folder per compute node. It must have read/write permissions for normal users. Communication files are created here. It must be created by the admin. For instance: `mkdir /var/ear; chmod ugo+rwx /var/ear`.
- **EAR_ETC:** _etc_ear_path_ must be readable for normal users in all compute nodes. It can be a shared folder in "GPFS" (simple to manage) or replicated data, because it holds very little data and is rarely modified (**ear.conf** and coefficients). Coefficients can be installed in a different path specified at configure time with the **COEFFS** flag. Both `ear.conf` and coefficients must be readable in all the nodes (compute and _"service"_ nodes). - **EAR_ETC:** _etc_ear_path_ must be readable for normal users in all compute nodes. It can be a shared folder in "GPFS" (simple to manage) or replicated data, because it holds very little data and is rarely modified (**ear.conf** and coefficients). Coefficients can be installed in a different path specified at configure time with the **COEFFS** flag. Both `ear.conf` and coefficients must be readable in all the nodes (compute and _"service"_ nodes).
**ear.conf** `ear.conf` is an ASCII file setting default values and cluster descriptions. An `ear.conf` is automatically generated based on an **ear.conf.in** template. However, the administrator must include installation details such as hostname details for EAR services, ports, default values, and the list of nodes. For more details, check [EAR configuration file](#ear-configuration-file) below. **ear.conf** `ear.conf` is an ASCII file setting default values and cluster descriptions. An `ear.conf` is automatically generated based on an **ear.conf.in** template. However, the administrator must include installation details such as hostname details for EAR services, ports, default values, and the list of nodes. For more details, check [EAR configuration file](#ear-configuration-file) below.
...@@ -29,11 +29,24 @@ The **ear.conf** is a text file describing the EAR package behaviour in the clus ...@@ -29,11 +29,24 @@ The **ear.conf** is a text file describing the EAR package behaviour in the clus
Usually the first word in the configuration file expresses the component related to the option. Lines starting with `#` are comments. A test for the `ear.conf` file can be found in the path `src/test/functionals/ear_conf`. It is recommended to run it, since the `ear.conf` parser is very sensitive to errors in the `ear.conf` syntax, spaces, newlines, etc. Usually the first word in the configuration file expresses the component related to the option. Lines starting with `#` are comments. A test for the `ear.conf` file can be found in the path `src/test/functionals/ear_conf`. It is recommended to run it, since the `ear.conf` parser is very sensitive to errors in the `ear.conf` syntax, spaces, newlines, etc.
To improve the readability of `ear.conf`, EAR version 5.0 adds a new `include` clause that can be used to include additional files holding parts of the configuration, such as tags or the list of nodes. The syntax is:
```
include=absolute_path
```
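To illustrate what an `include` clause means for a reader of the file, here is a minimal Python sketch that flattens a configuration file by splicing in the files it includes. It is not EAR's parser: the function name `expand_includes` is hypothetical, and the support for nested includes and the circular-include check are assumptions made for the example only.

```python
from pathlib import Path


def expand_includes(path, _active=None):
    """Return the lines of `path`, with each `include=` line replaced
    by the lines of the file it names.

    Illustrative only: EAR's own parser may treat nesting, relative
    paths, or whitespace differently.
    """
    active = set() if _active is None else _active
    path = Path(path).resolve()
    if path in active:
        raise ValueError(f"circular include: {path}")
    active.add(path)

    lines = []
    for raw in path.read_text().splitlines():
        stripped = raw.strip()
        # Comment lines start with '#', so they never match here.
        if stripped.startswith("include="):
            target = Path(stripped[len("include="):].strip())
            if not target.is_absolute():
                raise ValueError(f"include path must be absolute: {target}")
            lines.extend(expand_includes(target, active))
        else:
            lines.append(raw)

    active.discard(path)  # the same file may be included again elsewhere
    return lines
```

Printing `"\n".join(expand_includes("/etc/ear/ear.conf"))` (the path is only an example) shows the configuration as a single flat file, which can help when checking which options come from which included file.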
### Database configuration ### Database configuration
``` ```
# The IP of the node where the MariaDB (MySQL) or PostgreSQL server process is running. Current version uses same names for both DB servers. # The IP of the node where the MariaDB (MySQL) or PostgreSQL server process is running. Current version uses same names for both DB servers.
DBIp=172.30.2.101 DBIp=172.30.2.101
# Uncomment and add a secondary IP for high availability.
# If specified, the mysql plugin will submit data to a second DB automatically.
# Not supported with other report plugins.
# DBSECIP=add_secondary_ip_for_ha
# Port in which the server accepts the connections. # Port in which the server accepts the connections.
DBPort=3306 DBPort=3306
...@@ -70,7 +83,7 @@ NodeDaemonPort=50001 ...@@ -70,7 +83,7 @@ NodeDaemonPort=50001
# Frequency used by power monitoring service, in seconds. # Frequency used by power monitoring service, in seconds.
NodeDaemonPowermonFreq=60 NodeDaemonPowermonFreq=60
# Maximum supported frequency (1 means nominal, no turbo). # Maximum supported frequency (1 means nominal, no turbo).
NodeDaemonMaxPstate=1 NodeDaemonMinPstate=1
# Enable (1) or disable (0) the turbo frequency. # Enable (1) or disable (0) the turbo frequency.
NodeDaemonTurbo=0 NodeDaemonTurbo=0
...@@ -187,6 +200,9 @@ EARGMPowerCapSuspendAction=no_action ...@@ -187,6 +200,9 @@ EARGMPowerCapSuspendAction=no_action
EARGMPowerCapResumeLimit=40 EARGMPowerCapResumeLimit=40
# Format for action is: command_name current_power current_limit total_idle_nodes total_idle_power # Format for action is: command_name current_power current_limit total_idle_nodes total_idle_power
EARGMPowerCapResumeAction=no_action EARGMPowerCapResumeAction=no_action
# Sets the report plugins to use for EARGM warning and events accounting
EARGMReportPlugins=mysql.so
# EARGMs must be specified with a unique id, their node and the port that receives # EARGMs must be specified with a unique id, their node and the port that receives
# remote connections. An EARGM can also act as meta-eargm if the meta field is filled, # remote connections. An EARGM can also act as meta-eargm if the meta field is filled,
...@@ -264,7 +280,7 @@ Powercap set to 0 means powercap is disabled and cannot be enabled at runtime. P ...@@ -264,7 +280,7 @@ Powercap set to 0 means powercap is disabled and cannot be enabled at runtime. P
- energy_plugin (filename) - energy_plugin (filename)
- gpu_powercap_plugin (filename) - gpu_powercap_plugin (filename)
- max_powercap (W) - max_powercap (W)
- gpu_def_freq (GHz) - gpu_def_freq (KHz)
- cpu_max_pstate (0..max_pstate) - cpu_max_pstate (0..max_pstate)
- imc_max_pstate (0..max_imc_pstate) - imc_max_pstate (0..max_imc_pstate)
- energy_model (filename) - energy_model (filename)
...@@ -348,6 +364,25 @@ Detailed island accepted values: ...@@ -348,6 +364,25 @@ Detailed island accepted values:
- Island=1 Nodes=`node[1,2],node3` - Island=1 Nodes=`node[1,2],node3`
- Island=1 Nodes=`node[1-3],node4` - Island=1 Nodes=`node[1-3],node4`
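The bracket notation in the `Nodes` field can be read as a compact host list. The Python sketch below expands it under the usual host-list convention (a comma inside brackets separates individual suffixes, a dash denotes an inclusive range). That interpretation and the function name `expand_nodes` are assumptions for illustration, not a description of EAR's parser.

```python
import re


def expand_nodes(spec):
    """Expand a node list such as 'node[1-3],node4' into host names."""
    nodes = []
    # Split on commas that are not inside a [...] group.
    for item in re.split(r",(?![^\[]*\])", spec):
        match = re.fullmatch(r"([^\[\]]+)\[([^\[\]]+)\]", item)
        if match is None:
            nodes.append(item)  # plain host name, no brackets
            continue
        prefix, body = match.groups()
        for part in body.split(","):
            if "-" in part:
                first, last = part.split("-")
                width = len(first)  # keep zero padding, e.g. 01-10
                nodes.extend(
                    f"{prefix}{n:0{width}d}"
                    for n in range(int(first), int(last) + 1)
                )
            else:
                nodes.append(prefix + part)
    return nodes
```

For the two accepted values listed above, this yields `node1, node2, node3` and `node1, node2, node3, node4` respectively.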
### EDCMON
This section specifies the list of sensors, types, PDU IPs, etc. for the edcmon. Even though the edcmon includes other plugins for testing purposes, its main goal is data center monitoring, so this section only addresses that use case.
```
# EDCMON section
# sensor_list field must be placed at the end. It is a comma-separated list of sensor names.
# Use quotes "" to group sensor lists or sensor names including spaces
# host can be any or a hostname
# pdu_type can be : storage, management, network or others
#
edcmontag=stg pdu_type=storage pdu_ips=pduip1,pduip2,pduip3 sensor_list="Internal Humidity,Internal Temperature,Total Real Power"
edcmontag=ntw pdu_type=network pdu_ips=pduip4 sensor_list="Internal Humidity,Internal Temperature,Total Real Power"
edcmontag=mgt pdu_type=management pdu_ips=pdu5 sensor_list="Power"
edcmontag=spe host=host1 pdu_type=others pdu_ips=pdu6 sensor_list=Power
```
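The quoting rule for `sensor_list` can be checked with a small Python sketch that splits one `edcmontag` line into its fields. It relies on shell-style quoting via the standard `shlex` module, which is an assumption about how the quotes behave. The function name `parse_edcmon_line` is hypothetical and the code is not part of EAR.

```python
import shlex


def parse_edcmon_line(line):
    """Split one edcmontag line into a dict of its key=value fields.

    Quoted values keep their embedded spaces. pdu_ips and sensor_list
    are further split on commas into Python lists.
    """
    fields = {}
    # comments=True makes shlex skip everything after a '#'.
    for token in shlex.split(line, comments=True):
        key, _, value = token.partition("=")
        fields[key] = value
    for key in ("pdu_ips", "sensor_list"):
        if key in fields:
            fields[key] = fields[key].split(",")
    return fields
```

Applied to the first example line above, `parse_edcmon_line(...)["sensor_list"]` contains three names, each including its internal space, while a comment line produces an empty dictionary.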
## SLURM SPANK plug-in configuration file ## SLURM SPANK plug-in configuration file
SLURM loads the plug-in through a file called `plugstack.conf`, which is composed of a list of plug-ins. In the file `etc/slurm/ear.plugstack.conf`, there is an example entry with the paths already set to the plug-in, temporary and configuration paths. SLURM loads the plug-in through a file called `plugstack.conf`, which is composed of a list of plug-ins. In the file `etc/slurm/ear.plugstack.conf`, there is an example entry with the paths already set to the plug-in, temporary and configuration paths.
......