## Tables

EAR's database consists of the following tables:

- **Jobs**: job information (app_id, user_id, job_id, step_id, etc.). One record per JOBID.STEPID is created in the DB.
- **Applications**: links *Jobs* and *Signatures*, providing an application signature (from EARL) for each node of a job. One record per JOBID.STEPID.NODENAME is created in the DB.
- **Signatures**: signature and metrics computed by EARL. One record per JOBID.STEPID.NODENAME is created in the DB when the application is executed with EARL.
- **Periodic_metrics**: node metrics reported every N seconds (N is defined in `ear.conf`).
- **Periodic_aggregations**: sum of all *Periodic_metrics* in a time period. It eases accounting in the `ereport` command and EARGM, and helps reduce database size: *Periodic_metrics* of older periods, where node-level precision is no longer needed, can be deleted and the aggregations used instead.
- **Loops**: similar to *Applications*, but stores a Signature for each application loop detected by EARL instead of one per application. This table provides internal details of running applications and could significantly increase the DB size.
- **Events**: EARL events report. Events include frequency changes and internal EARL decisions such as turning off the DynAIS algorithm.
- **Global_energy**: reports of cluster-wide energy accounting, set by EARGM using the parameters in `ear.conf`. One record is reported every T1 period (defined in `ear.conf`).
- **Power_signatures**: basic time and power metrics that can be obtained without EARL. Reported for all applications. One record per JOBID.STEPID.NODENAME is created in the DB.
- **Learning_applications**: same as *Applications*, restricted to learning phase applications.
- **Learning_jobs**: same as *Jobs*, restricted to learning phase jobs.
- **Learning_signatures**: same as *Signatures*, restricted to learning phase job metrics.

If GPUs are enabled at database creation (or are added afterwards, see [Updating from previous versions](#updating-from-previous-versions)), *Periodic_metrics* will also contain GPU data and a new table **GPU_signatures** will be created, containing all GPU metrics for every application that runs with EARL.
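
Whether an existing database already has GPU support can be checked from the database's CLI client. The following is a minimal sketch, assuming a MySQL/MariaDB server with the EAR database currently selected. The `GPU_energy` column name is taken from the update commands later in this page; other GPU-related columns, if any, are not covered.

```sql
-- Returns one row if the GPU_signatures table exists, none otherwise.
SHOW TABLES LIKE 'GPU_signatures';

-- Returns one row if Periodic_metrics has the GPU energy column, none otherwise.
SHOW COLUMNS FROM Periodic_metrics LIKE 'GPU_energy';
```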

## Database creation and `ear.conf`

When running `edb_create`, some tables might not be created, or might be created with a different set of columns, depending on some `ear.conf` settings. The settings and alterations are as follows:

- `DBReportNodeDetail`: if set to 1, `edb_create` will create two additional columns in the *Periodic_metrics* table for Temperature (in Celsius) and Frequency (in Hz) accounting.
- `DBReportSigDetail`: if set to 1, *Signatures* will have additional fields for cycles, instructions, and FLOPS1-8 counters (number of instructions by type).
- `DBMaxConnections`: limits the maximum number of simultaneous connections from commands.

If either of the two `DBReport*Detail` settings is set to 0, the corresponding table will have fewer details, but its records will take less storage space.

Any table with missing columns can be later altered by the admin to include them.

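
To see which optional columns a database was created with, the table definitions can be inspected before changing `ear.conf`. This is a minimal sketch, assuming a MySQL/MariaDB server with the EAR database currently selected. The exact names of the optional columns are not listed in this page, so the output has to be compared against the schema of the installed EAR version.

```sql
-- List every column of the two tables affected by
-- DBReportNodeDetail and DBReportSigDetail.
SHOW COLUMNS FROM Periodic_metrics;
SHOW COLUMNS FROM Signatures;
```

If a column turns out to be missing, it can be added with an `ALTER TABLE ... ADD COLUMN` statement of the same form as those shown in [Updating from previous versions](#updating-from-previous-versions).
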
## Information reported and `ear.conf`

There are various settings in `ear.conf` that restrict the data reported to the database, and some errors might occur if the database configuration is different from EARDB's.

- `DBReportNodeDetail`: if set to 1, node managers will report temperature, average frequency, DRAM and PCK energy to the database manager, which will try to insert them into *Periodic_metrics*. If *Periodic_metrics* does not have the columns for these metrics, an error will occur and nothing will be inserted. To solve the error, set `DBReportNodeDetail` to 0 or manually update *Periodic_metrics* so that it has the necessary columns.
- `DBReportSigDetail`: similarly to `DBReportNodeDetail`, an error will occur if the configuration differs from the one used when creating the database.
- `DBReportLoops`: if set to 1, application loops detected by EARL will be reported to the database, each with its corresponding Signature. Set to 0 to disable this feature. Regardless of the setting, no error should occur.

If *Signatures* and/or *Periodic_metrics* have the additional columns but their respective settings are set to 0, a NULL will be stored in those columns, which makes those rows smaller (but still bigger than if the columns did not exist).

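
Since these settings trade detail for storage, it can help to check how much space each table actually takes. The query below is a sketch that assumes a MySQL/MariaDB server with the EAR database currently selected, where `information_schema.TABLES` exposes per-table sizes. Note that `TABLE_ROWS` is only an estimate for InnoDB tables.

```sql
-- Approximate row count and on-disk size (data + indexes) per table,
-- largest first.
SELECT TABLE_NAME,
       TABLE_ROWS,
       ROUND((DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024, 1) AS size_mb
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
ORDER BY (DATA_LENGTH + INDEX_LENGTH) DESC;
```

Tables such as *Loops* and *Periodic_metrics* are the ones to watch when deciding whether to disable `DBReportLoops` or to delete old periodic records.
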
<img src="./images/EARDB_schema.png" align="center" width="680">

## Updating from previous versions

### From EAR 3.4 to 4.0

Several fields have to be added in this update. To do so, run the following commands in the database's CLI client:

```
ALTER TABLE Signatures ADD COLUMN avg_imc_f BIGINT unsigned AFTER avg_f;
ALTER TABLE Signatures ADD COLUMN perc_MPI DOUBLE AFTER time;
ALTER TABLE Signatures ADD COLUMN IO_MBS DOUBLE AFTER GBS;

ALTER TABLE Learning_signatures ADD COLUMN avg_imc_f BIGINT unsigned AFTER avg_f;
ALTER TABLE Learning_signatures ADD COLUMN perc_MPI DOUBLE AFTER time;
ALTER TABLE Learning_signatures ADD COLUMN IO_MBS DOUBLE AFTER GBS;
```
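
After running the commands, the result can be verified with a query against the server's metadata. This is a sketch assuming a MySQL/MariaDB server (as the `AFTER` clauses suggest) with the EAR database currently selected.

```sql
-- Lists the columns added by the 3.4 to 4.0 update, per table.
SELECT TABLE_NAME, COLUMN_NAME, COLUMN_TYPE
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME IN ('Signatures', 'Learning_signatures')
  AND COLUMN_NAME IN ('avg_imc_f', 'perc_MPI', 'IO_MBS')
ORDER BY TABLE_NAME, ORDINAL_POSITION;
```

Three rows per table should be returned once all six `ALTER TABLE` commands have been applied. A table with fewer rows is missing one of the columns.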

### From EAR 3.3 to 3.4

If GPUs were not used and will not be used, no changes are necessary.

If GPUs were being used, run the following commands in the database's CLI client:

```
ALTER TABLE Signatures ADD COLUMN min_GPU_sig_id INT unsigned, ADD COLUMN max_GPU_sig_id INT unsigned;
ALTER TABLE Learning_signatures ADD COLUMN min_GPU_sig_id INT unsigned, ADD COLUMN max_GPU_sig_id INT unsigned;
CREATE TABLE IF NOT EXISTS GPU_signatures ( id INT unsigned NOT NULL AUTO_INCREMENT, GPU_power FLOAT NOT NULL, GPU_freq INT unsigned NOT NULL, GPU_mem_freq INT unsigned NOT NULL, GPU_util INT unsigned NOT NULL, GPU_mem_util INT unsigned NOT NULL, PRIMARY KEY (id));
```

If no GPUs were being used but they are now present, run the previous commands plus the following one:

```
ALTER TABLE Periodic_metrics ADD COLUMN GPU_energy INT;
```
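
The GPU-related changes can be checked in the same CLI client. This is a minimal sketch, assuming a MySQL/MariaDB server with the EAR database currently selected. In a `LIKE` pattern `%` matches any prefix, so the first two statements list both the `min_` and `max_` columns.

```sql
-- Both min_GPU_sig_id and max_GPU_sig_id should be listed for each table.
SHOW COLUMNS FROM Signatures LIKE '%GPU_sig_id';
SHOW COLUMNS FROM Learning_signatures LIKE '%GPU_sig_id';

-- Shows the definition of the new table; fails if it was not created.
DESCRIBE GPU_signatures;
```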