System monitoring overview
The System Monitoring section of the Reports page lists the monitored environments in the program. It reports on their high-level health across the following four separate categories:
- Host
- Storage
- Network
- Application
The status of each category summarizes its individual metrics. If any metric in a category is in the critical state, the entire category is shown as critical on the overview page. The same summarization can be viewed at the environment level and at the instance level.
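The roll-up rule above can be sketched as a "worst status wins" reduction. This is only an illustration: the `Status` names, the ordering of a warning state below critical, and the helper functions are assumptions made for the example, not the product's actual implementation.

```python
from enum import IntEnum
from typing import Iterable, Mapping


class Status(IntEnum):
    # Hypothetical status levels, ordered so that a higher value is worse.
    OK = 0
    WARNING = 1
    CRITICAL = 2


def summarize(statuses: Iterable[Status]) -> Status:
    """Return the worst status found; OK when there are no metrics."""
    return max(statuses, default=Status.OK)


def summarize_instances(instances: Mapping[str, Iterable[Status]]) -> Status:
    """Roll one category up from instance level to environment level."""
    return summarize(summarize(metrics) for metrics in instances.values())
```

With this sketch, a category that has one critical metric on any instance is reported as critical for the whole environment, which matches the behavior described for the overview page.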
System monitoring detail
To view the details of specific metrics, click one of the category columns of a specific instance or the category title in the left navigation. Each detail page shows a series of graphs for the metrics within that category. You can either view the metrics for all instances in an environment or for a specific instance. You can switch between the environment and instances using the dropdown boxes in the top-right corner.
The navigation on the left lists the metrics in the currently selected category that have data for the currently selected environment and instances.
Each graph shows the metric's status and plots the data over time along with the thresholds. If multiple instances are displayed, each instance's data is plotted as a separate series.
Individual series can be hidden on a graph by clicking on the series in the legend.
For example, if you click the warning threshold series in the legend, the warning threshold is hidden and, of the two thresholds, only the critical threshold remains visible.
Metric definitions
Host
- Load Per Core: The number of processes that the CPU is executing or that are queued in a waiting state, averaged over one-minute (load1), five-minute (load5), and fifteen-minute (load15) periods.
- Process Count: The number of processes currently open.
- User Count: The number of users with an active shell session.
- Memory Usage: The percentage of system memory currently allocated.
- JVM Memory: The size (in megabytes) of the allocated Java heap.
- Old Generation Space: The percentage of JVM old generation memory currently allocated.
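As a rough illustration of the Load Per Core metric, the sketch below divides the operating system's load averages by the number of logical cores. It assumes that "per core" means exactly this division, which the page does not state; the function names are hypothetical, and `os.getloadavg()` is available only on Unix-like systems.

```python
import os
from typing import Sequence, Tuple


def normalize_load(load_averages: Sequence[float], cores: int) -> Tuple[float, ...]:
    """Divide each load average by the core count."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return tuple(avg / cores for avg in load_averages)


def current_load_per_core() -> Tuple[float, ...]:
    """Return (load1, load5, load15) per logical core for this host."""
    # os.cpu_count() can return None, so fall back to a single core.
    return normalize_load(os.getloadavg(), os.cpu_count() or 1)
```

Under this assumption, a value near 1.0 would mean roughly one runnable or waiting process per core over the averaging period.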
Network
- CQ Port Check: The response time in seconds to access the AEM or Dispatcher port. There are different metrics for author, publish, and Dispatcher.
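A port check of this kind can be approximated by timing a TCP connection. The sketch below is an assumption about what such a check might measure: the actual monitor may issue an HTTP request or use another probe, and the function name and default timeout are hypothetical.

```python
import socket
import time


def port_response_time(host: str, port: int, timeout: float = 5.0) -> float:
    """Return the seconds taken to open a TCP connection to host:port.

    Raises OSError (including timeouts) if the port cannot be reached.
    """
    start = time.monotonic()
    # The connection is closed immediately; only the connect time is measured.
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return time.monotonic() - start
```

Running the same function against the author, publish, and Dispatcher ports would give one measurement per metric, mirroring the separate metrics listed above.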
Storage
- Disk Space: The used disk space (in megabytes) for each mount point on the host. There are different metrics for each mount point. At a minimum, there are metrics for / and /mnt, but additional mount point metrics may be available depending on the specific instance configuration.
- Folder Size
- AEM Segment Store: The used disk space (in gigabytes) for the AEM Segment Store.
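The two storage measurements can be reproduced on a host with the Python standard library. This is a minimal sketch under stated assumptions: it treats a megabyte as 1024 * 1024 bytes and a gigabyte as 1024 ** 3 bytes (the page does not say which convention is used), and the function names are hypothetical.

```python
import os
import shutil

MIB = 1024 ** 2  # assumed size of one megabyte
GIB = 1024 ** 3  # assumed size of one gigabyte


def used_disk_megabytes(mount_point: str) -> float:
    """Return the used space on the file system that holds mount_point."""
    return shutil.disk_usage(mount_point).used / MIB


def folder_size_bytes(path: str) -> int:
    """Sum the sizes of all regular files under path, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def folder_size_gigabytes(path: str) -> float:
    """Return the folder size in gigabytes."""
    return folder_size_bytes(path) / GIB
```

Calling `used_disk_megabytes("/")` corresponds to the Disk Space metric for the root mount point, and `folder_size_gigabytes` applied to the segment store directory of an instance corresponds to the AEM Segment Store metric; the location of that directory depends on the installation.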
Application
- Replication Agent: The time (in seconds) for a test replication event. There are separate metrics for each replication agent.
- Dispatcher Flush: The number of items currently in the Dispatcher flush queue.