# EAR 5.1

- CPU temperature is read by EARL and reported to CSV files.
- Prevent workflows where all applications see all GPUs and all of them change GPU frequency.
- Support for Python applications that use the multiprocess module.
- EARL is compiled with LITE mode and DCGM metric collection enabled by default.
- Fix DCGM application-level metrics computation.
- Prevent closing fd 0 in the NTASK_WORKSHARING use case.
- Avoid collapsing the application's channel 2 when EARL has high verbosity.
- Fixed the check for authorized groups.
- Added a tool for creating an application-level signatures CSV file from a loop signatures CSV file.
- EARL Loader can detect Python MPI flavour without an environment variable.
- Fixed an error with EARD remote connections not being properly closed.
- Add --domain option to econtrol.

# EAR 5.0.3

- EARD local API creates an application directory if a third-party program connects with it.
- Fixed a typo in ereport queries.
- Prevent closing fd 0 on NTASK\_WORKSHARING use cases.
- Prevent closing fd 0 when initiating earl\_node\_mgr\_info.

# EAR 5.0

- Workflows support. Automatic detection of applications executed with the same jobid/stepid.
- Fixed Intel PSTATE driver to avoid loading if there is a driver already loaded.
- Robustness improved.

...

- Fixes in EAR Loader to support MPI applications when MPI symbols cannot be detected.
- GPU GFLOPS are now estimated and reported when using NVIDIA GPUs.

# EAR 4.3.1

- Documentation typos fixed.
- EAR configuration file templates updated.
- Bugs fixed for intel\_pstate CPUFreq driver support.
- Powercap bug fixes.
- ear.conf parsing errors found and fixed.

# EAR 4.3

- MPI stats collection is now guided by sampling to minimize overhead.
- EARL-EARD communication optimized.
- EARL: Periodic actions optimization.

...

- Improved metrics computation in AMD Zen2/Zen3.
- Improved robustness in metrics computation to support hardware failures.

# EAR 4.2

- Improved support for node sharing: save/restore configurations
- AMD (Zen3) CPUs
- Intel(R) SST support on demand

...

- Improved metrics and management API
- Environment variables changed for homogeneity

# EAR 4.1.1

- select replaced by poll to support bigger nodes
- Minor changes in edb_create and FP exceptions fixes

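The select-to-poll change listed for EAR 4.1.1 can be illustrated with a minimal Python sketch. It is not EAR code: `wait_readable` is a hypothetical helper, and the motivation given in the comments (that `select()` cannot handle descriptors numbered at or above `FD_SETSIZE`, commonly 1024, while `poll()` has no such cap) is general POSIX behaviour assumed to be the reason behind the change.

```python
import os
import select


def wait_readable(fds, timeout_ms=1000):
    """Return the descriptors in `fds` that are ready for reading.

    Uses poll() rather than select(): select() is limited to descriptor
    numbers below FD_SETSIZE, whereas poll() takes an explicit list of
    descriptors and has no such numeric limit.
    """
    poller = select.poll()
    for fd in fds:
        poller.register(fd, select.POLLIN)
    # poll() returns (fd, event_mask) pairs for descriptors with activity.
    return [fd for fd, events in poller.poll(timeout_ms) if events & select.POLLIN]


if __name__ == "__main__":
    read_end, write_end = os.pipe()
    os.write(write_end, b"x")  # make the read end readable
    print(wait_readable([read_end], timeout_ms=100))
    os.close(read_end)
    os.close(write_end)
```

On a node with many processes, a daemon that keeps one connection per client can exceed 1024 open descriptors, which is the situation where the difference between the two calls matters.
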
# EAR 4.1

- Meta EARGM.
- Support for N jobs in a node.
- CPU power models for N jobs.

...

- msr_safe
- HEROES plug-in.

# EAR 4.0

- AMD virtual p-states support and DF frequency management included
- AMD optimization based on min_energy and min_time
- GPU optimization in low GPU utilization phases

...

- IO, Percentage of MPI and Uncore frequency reported to DB and included in eacct
- econtrol extensions for EAR health-check

# EAR 3.4

- Automatic loading of EAR library for MPI applications (already in 3.3), OpenMP, MKL and CUDA applications. Programming model detection is based on dynamic symbols, so it may not work if symbols are statically included.
- AMD monitoring support.
- TAGS support included in policies.

...

- Node powercap and cluster power cap under development.
- papi dependency removed.

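The dynamic-symbol limitation noted for EAR 3.4 can be illustrated with a small Python sketch. It is not the EAR Loader's implementation: `symbol_is_visible` is a hypothetical helper, and probing `MPI_Init` is only an assumed example of a symbol a loader might look for. The sketch asks the dynamic linker whether a symbol can be resolved in the current process; a symbol that was linked statically and not exported would not be found this way.

```python
import ctypes


def symbol_is_visible(name):
    """Return True if `name` resolves through the dynamic linker.

    POSIX only: ctypes.CDLL(None) corresponds to dlopen(NULL), which gives
    access to the symbols exported by the running program and its loaded
    shared libraries. Attribute lookup performs the dlsym() call and raises
    AttributeError when the symbol is missing, which hasattr() turns into False.
    """
    process = ctypes.CDLL(None)
    return hasattr(process, name)


if __name__ == "__main__":
    # printf comes from the dynamically loaded C library.
    print("printf visible:", symbol_is_visible("printf"))
    # MPI_Init is only visible if an MPI library is loaded in this process.
    print("MPI_Init visible:", symbol_is_visible("MPI_Init"))
```
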
# EAR 3.3

- eacct loop signature reported.
- EAR loader included.
- GPU support migrated to NVML API.

...

- Internal messaging protocol improved.
- Average CPU frequency and Average IMC frequency computation improved.

# EAR 3.2

- GPU monitoring based on nvidia-smi command.
- GPU power reported to the DB using NVIDIA commands.
- PostgreSQL support.

...