
Architecture · Changes

Improved 5.0 documentation, authored Oct 16, 2024 by Oriol Vidal Teruel
Architecture.md
View page @ d7dd976c
[[_TOC_]]
# Overview
EAR is a set of components that together form a complete system software stack.
It accounts for the power and energy consumption of jobs and applications in a cluster; provides a runtime library for application performance monitoring and optimization, which can be loaded dynamically during application execution; includes a global power-capping system; and offers a flexible reporting system that can be adapted to any storage requirements for the collected data.
All of it is designed to be as transparent as possible from the user's point of view.
This section introduces all of these components and how they are stacked to provide different services and EAR features.
## System power consumption and job accounting
This is the most basic feature.
EAR is able to collect node power consumption and report it periodically thanks to the [EAR Node Manager](#ear-node-manager), a Linux service which runs on each compute node.
It is up to the sysadmin to decide how and where its periodic metrics are [reported](Configuration#eard-configuration).
The following figure shows this scheme.
![EAR_basic_accounting.svg](images/EAR_basic_accounting.svg)
The EAR Node Manager provides an API that a batch scheduler plug-in/hook can use to indicate the start/end of jobs/steps, so that the Node Manager can account for the power consumption of those entities.
Currently, the EAR distribution comes with a [SLURM SPANK plug-in](#ear-slurm-plugin) that supports the accounting of jobs and steps in SLURM systems.
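
The accounting idea described above can be illustrated with a minimal Python sketch. This is not the EAR Node Manager API: the class `JobEnergyAccountant` and its methods `job_start`, `power_sample` and `job_end` are hypothetical names chosen for illustration, and the sketch assumes that a node runs one job at a time, so the whole node power is attributed to that job.

```python
from dataclasses import dataclass, field


@dataclass
class JobEnergyAccountant:
    """Hypothetical per-node accountant (not EAR's real API)."""

    # job_id -> energy in joules accumulated so far for running jobs
    running: dict = field(default_factory=dict)
    # job_id -> final energy in joules for finished jobs
    finished: dict = field(default_factory=dict)

    def job_start(self, job_id: str) -> None:
        # Would be called by a scheduler plug-in/hook when a job/step begins.
        self.running[job_id] = 0.0

    def power_sample(self, watts: float, period_s: float) -> None:
        # A periodic node power reading adds watts * seconds joules
        # to every job currently registered as running on this node.
        for job_id in self.running:
            self.running[job_id] += watts * period_s

    def job_end(self, job_id: str) -> float:
        # Would be called by the scheduler plug-in/hook when the job/step ends.
        joules = self.running.pop(job_id)
        self.finished[job_id] = joules
        return joules


acct = JobEnergyAccountant()
acct.job_start("job-1")
acct.power_sample(watts=200.0, period_s=10.0)
acct.power_sample(watts=250.0, period_s=10.0)
energy = acct.job_end("job-1")
print(f"job-1 consumed {energy} J")
```

The point of the sketch is only that start/end notifications delimit which periodic power samples belong to a job; how EAR actually stores and reports these values is defined by its configuration.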
## Application performance monitoring and energy efficiency optimization
A runtime library can be loaded dynamically alongside the applications running in compute nodes (thanks again to the batch scheduler support).
The [EAR Job Manager](#the-ear-library-job-manager) runs within application/workflow processes, so it can collect performance metrics. These metrics can be reported in the same way as those of the Node Manager, and how they are reported is also configurable.
Moreover, the Job Manager comes with optimization policies, which can select the optimal CPU/IMC/GPU frequencies based on those performance metrics by contacting the Node Manager.
The figure below shows the interaction between these two components.
![EAR_job_mgr.svg](images/EAR_job_mgr.svg)
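
As a rough illustration of what a frequency-selection policy can look like, the following Python sketch picks the frequency with the lowest predicted energy among those that stay within a maximum slowdown. It is an assumption-laden example, not one of EAR's actual policies: the function `select_frequency`, the `max_slowdown` parameter and the prediction values are all hypothetical.

```python
def select_frequency(predictions, max_slowdown=0.10):
    """Pick the lowest-energy frequency within an allowed slowdown.

    predictions: dict mapping frequency (kHz) -> (predicted time in s,
                 predicted power in W). Hypothetical input format.
    The highest frequency is taken as the performance reference.
    """
    ref_freq = max(predictions)
    ref_time, ref_power = predictions[ref_freq]
    best_freq = ref_freq
    best_energy = ref_time * ref_power
    for freq, (time_s, power_w) in predictions.items():
        # Skip frequencies that slow the application down too much.
        if time_s > ref_time * (1.0 + max_slowdown):
            continue
        energy = time_s * power_w
        if energy < best_energy:
            best_freq, best_energy = freq, energy
    return best_freq


# Made-up predictions for illustration only.
predictions = {
    2_400_000: (100.0, 300.0),
    2_200_000: (104.0, 270.0),
    2_000_000: (109.0, 250.0),
    1_800_000: (118.0, 235.0),  # exceeds the 10% slowdown limit
}
chosen = select_frequency(predictions)
print(f"selected frequency: {chosen} kHz")
```

In a real deployment the selected frequency would then be applied through the Node Manager, since changing frequencies is a privileged operation.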
# EAR Node Manager
The EAR Daemon (EARD) is a per-node Linux service that provides privileged metrics of each node as well as a periodic power monitoring service.
......@@ -61,7 +87,7 @@ Visit the [EAR configuration file page](Configuration#EARD-configuration) for mo
The EAR Database Daemon (EARDBD) acts as an intermediate layer between any EAR component that inserts data and EAR's database, preventing the database server from being overwhelmed by connections and insert queries.
The Database Manager caches records generated by the [EAR Library](#the-ear-library-job-manager) and the [EARD](#ear-node-manager) in the system and reports them to the centralized database.
If the cluster is big enough, it is recommended to run several EARDBDs in order to reduce the number of inserts and connections to the database.
Also, the EARDBD accumulates data over a period of time to decrease the total number of insertions in the database, which helps the performance of big queries.
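
The caching behaviour described above can be sketched in Python as a buffer that groups records into batches before handing them to the database. This is a simplified illustration rather than the EARDBD implementation: `BatchingBuffer`, `flush_fn` and `max_records` are hypothetical names, and the sketch flushes on a record count, whereas a time-based trigger would follow the same pattern.

```python
class BatchingBuffer:
    """Hypothetical cache that turns many small inserts into few batched ones."""

    def __init__(self, flush_fn, max_records=3):
        self.flush_fn = flush_fn        # callable that receives a list of records
        self.max_records = max_records  # batch size that triggers a flush
        self.pending = []
        self.insert_calls = 0           # number of batched inserts performed

    def add(self, record):
        self.pending.append(record)
        if len(self.pending) >= self.max_records:
            self.flush()

    def flush(self):
        # Nothing to send: avoid an empty insert.
        if not self.pending:
            return
        self.flush_fn(list(self.pending))
        self.insert_calls += 1
        self.pending.clear()


stored = []  # stands in for the centralized database
buf = BatchingBuffer(flush_fn=stored.extend, max_records=3)
for i in range(7):
    buf.add({"node": "n1", "sample": i})
buf.flush()  # send whatever is left
print(f"{len(stored)} records stored with {buf.insert_calls} inserts")
```

Here seven records reach the store in three batched inserts instead of seven individual ones, which is the effect the EARDBD aims for at cluster scale.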
......