Products documentation

Alarms and Metrics

Purpose

Defines the global Alarms and Metrics configuration.
All alarms and metrics generated by Redpeaks can be propagated by email or to third-party applications, but only through plugins.
To use different plugins depending on the origin of an alarm (SAP/internal), use Alarm rules.
Internal alarms are typically sent by email to the Redpeaks admin.
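
The idea of routing alarms to different plugins by origin can be illustrated with a short sketch. This is not Redpeaks code: the `Alarm` class, the origin values and the plugin names are hypothetical, and in the product this routing is configured through Alarm rules rather than written in Python.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Alarm:
    origin: str   # hypothetical values: "SAP" or "internal"
    message: str


# Hypothetical rule table: alarm origin -> names of the plugins that receive it.
RULES = {
    "SAP": ["ticketing-plugin"],
    "internal": ["email-plugin"],
}


def plugins_for(alarm: Alarm) -> List[str]:
    """Return the plugins that should receive this alarm (empty if no rule matches)."""
    return RULES.get(alarm.origin, [])
```

With this table, an internal alarm would only reach the email plugin, which matches the typical setup described above.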

How to access Alarms and Metrics feature

From the top right of the screen, click the settings icon.
Select the admin configuration sub-menu.
Click the Alarms/Metrics tab.

System availability alerts

Max connection resp. time (sec) : An alert is generated if a system has not responded within the number of seconds set in this field. The severity of the alert can be set using the corresponding dropdown list.
Max system down time (sec) : An alert is generated when a system has remained unreachable for the number of seconds set in this field. The severity of the alert can be set using the corresponding dropdown list.
Time zone alarm : An alert is generated if the time zone of a system is not properly set or cannot be resolved. This option defines the severity used for this alert.
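
As a rough illustration of how the two time thresholds could be evaluated, here is a minimal sketch. The function name, its parameters and the returned labels are assumptions made for this example and do not describe the actual Redpeaks implementation.

```python
from typing import Optional


def availability_alert(response_time_sec: Optional[float],
                       unreachable_for_sec: float,
                       max_resp_time_sec: float,
                       max_down_time_sec: float) -> Optional[str]:
    """Return the name of the alert to raise, or None if the system looks healthy.

    response_time_sec is None when the system did not answer at all.
    """
    # The system has been unreachable for at least the configured down time.
    if unreachable_for_sec >= max_down_time_sec:
        return "system down"
    # The system answered too slowly, or not at all, on this attempt.
    if response_time_sec is None or response_time_sec > max_resp_time_sec:
        return "connection response time"
    return None
```

For example, with a 10 second response limit and a 300 second down-time limit, a system answering in 15 seconds would trigger the response time alert only.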

Internal alerts

Monitor job execution error : An alert is generated if a Monitor job encounters an error during its execution. The severity of the alert can be set using the corresponding dropdown list.
CCMS errors : An alert is generated if a CCMS job encounters an error during its execution. The severity of the alert can be set in the dropdown list.
Monitor Tree loading errors : An alert is generated if a Monitor Tree job encounters an error while loading data from SAP. The severity of the alert can be set in the dropdown list.

Agents

These alarm settings help you detect, and be notified of, problems on an agent:

Max agent down time (sec) :
- Notifies you when an agent is not responding
- Defines the maximum time in seconds an agent may be unresponsive before a notification is sent
Min schedule ratio (%) :
- This alarm detects when an agent does not have enough time to execute all its monitors
- The server computes the ratio between executed monitors and rescheduled ones and compares it to the threshold
- A ratio of 100% is expected on well-configured agents
Min successful exec. ratio (%) :
- This alarm detects when an agent returns many execution errors for its monitors
- The server computes the ratio between successful executions and failed ones
- It is normal for some monitors to fail from time to time, but many failures may indicate a problem on the agent (resources/network)
Max result send time (sec) :
- This alarm detects when sending results from the agent to the primary server takes too long
- This can be caused by network problems, or by a resource problem on the agent or the primary server
- A notification is sent if the send time is over the threshold
Max time without results (sec) :
- This alarm detects when an agent is not sending any results to the server
- This can indicate a resource problem on the agent
- A notification is sent if the time since the last received result is over the threshold
Max VM Heap usage (%) :
- This alarm detects when an agent is using all its allocated memory
- Agent memory usage reaching 100% may indicate memory starvation and can lead to instability
- A notification is sent if VM memory usage reaches the threshold
Max OS RAM usage (%) :
- This alarm detects when the overall OS memory usage is too high
- High OS memory usage may prevent the server from using its allocated memory and may cause paging, which degrades performance
- A notification is sent if OS memory usage is over the threshold
Max OS disk usage (%) :
- This alarm detects when the application disk space is running low
- A full disk must be avoided at all costs, because it may bring the service down
- A notification is sent if the used disk space is over the threshold
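
The two ratio-based agent alarms can be sketched as follows. The exact formulas used by the server are not given here, so this sketch assumes each ratio is the share of "good" outcomes over all outcomes, which is consistent with 100% being the expected value on a well-configured agent. All function names are hypothetical.

```python
def schedule_ratio(executed: int, rescheduled: int) -> float:
    """Percentage of monitor runs that were executed rather than rescheduled (assumed formula)."""
    total = executed + rescheduled
    # With no runs at all there is nothing to complain about, so report 100%.
    return 100.0 if total == 0 else 100.0 * executed / total


def success_ratio(successful: int, failed: int) -> float:
    """Percentage of monitor executions that succeeded (assumed formula)."""
    total = successful + failed
    return 100.0 if total == 0 else 100.0 * successful / total


def below_minimum(ratio_percent: float, min_percent: float) -> bool:
    """True when a ratio falls under its configured minimum and should raise an alarm."""
    return ratio_percent < min_percent
```

Under these assumptions, an agent that executed 90 monitors and rescheduled 10 has a schedule ratio of 90%, which would raise an alarm with a minimum of 95%.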

Plugins

Max plugin down time (sec) :
- Detects when a plugin is failing to send events
- This is usually a critical case, because it means that monitoring might not be visible in the corresponding third-party platform
- A notification is sent if the plugin error lasts longer than the threshold

Licenses

Max expiration delay (days) :
- Notifies you when a license is about to expire
Invalid license severity :
- Notifies you when a license is not valid
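
A minimal sketch of the expiration check, assuming that "Max expiration delay (days)" means "notify when the license expires within this many days". The function names are hypothetical and the real check may differ.

```python
from datetime import date


def days_until_expiry(expiry: date, today: date) -> int:
    """Number of whole days between today and the expiry date (negative if already expired)."""
    return (expiry - today).days


def license_expiring(expiry: date, today: date, max_expiration_delay_days: int) -> bool:
    """True when the license expires within the configured number of days."""
    return days_until_expiry(expiry, today) <= max_expiration_delay_days
```

With a delay of 30 days, a license expiring on 31 January would start raising notifications on 1 January.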

Internal alarms settings

Clear alarms :
- If set, all clearable alarms are cleared once the problem is no longer detected. Clearing is done by sending an alarm with the toClear parameter set to true.
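
The clearing behaviour can be pictured with the sketch below. It is only an illustration: `AlarmEvent`, `evaluate` and the `active` set are hypothetical names, and `to_clear` simply mirrors the toClear parameter mentioned above.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class AlarmEvent:
    key: str        # hypothetical identifier of the monitored problem
    to_clear: bool  # mirrors the toClear parameter


def evaluate(key: str, problem_detected: bool, active: Set[str],
             clear_alarms: bool) -> Optional[AlarmEvent]:
    """Return the event to emit for this check, or None if nothing changes."""
    if problem_detected:
        if key in active:
            return None                      # already raised, nothing new to send
        active.add(key)
        return AlarmEvent(key, to_clear=False)
    if key in active:
        active.discard(key)
        if clear_alarms:
            return AlarmEvent(key, to_clear=True)  # clearing event
    return None
```

When the option is not set, the problem simply stops being reported and no clearing event is sent.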

Metrics sources

Alarm source : SID, HOST, FQDN, TITLE, INSTANCE, IP
Metric source : SID, HOST, FQDN, TITLE, INSTANCE, IP

Redpeaks V6.8