Alarms and metrics generated by Redpeaks can be propagated by email or to third-party applications only through plugins.
If you want to use different plugins depending on the origin of an alarm (SAP/internal), you can use Alarm rules.
Internal alarms are typically sent by email to the Redpeaks admin.
How to access Alarms and Metrics feature
At the top right of the screen, click the settings icon
Select the admin configuration sub-menu
Click the Alarms/Metrics tab
System availability alerts
Max connection resp. time (sec) : An alert is generated if a system has not responded within the number of seconds set in this field. The severity of the alert can be set using the corresponding dropdown list.
Max system down time (sec) : An alert is generated once a system has remained unreachable for the number of seconds set in this field. The severity of the alert can be set using the corresponding dropdown list.
Time zone alarm : An alert is generated if the time zone of a system is not properly set or cannot be resolved. This option defines the severity used for this alert.
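The down-time threshold above can be pictured with a small sketch. This is only an illustration of the idea, not Redpeaks code: the `DownTimeTracker` class, its `record` method, and the choice to count down time from the first failed attempt are all assumptions made for the example.

```python
from typing import Optional


class DownTimeTracker:
    """Hypothetical helper: tracks how long a system has been unreachable."""

    def __init__(self, max_down_time_sec: float) -> None:
        self.max_down_time_sec = max_down_time_sec
        self.down_since: Optional[float] = None  # time of first failed attempt

    def record(self, reachable: bool, now: float) -> bool:
        """Record one connection attempt; return True if an alert is due."""
        if reachable:
            # The system answered, so the down-time counter restarts.
            self.down_since = None
            return False
        if self.down_since is None:
            self.down_since = now
        # Assumption: the alert fires once the outage lasts at least the threshold.
        return now - self.down_since >= self.max_down_time_sec


tracker = DownTimeTracker(max_down_time_sec=300)
attempts = [(False, 0), (False, 200), (False, 300), (True, 310), (False, 320)]
alerts = [tracker.record(reachable, now) for reachable, now in attempts]
```

In this run only the third attempt (300 seconds after the first failure) yields an alert; a successful connection resets the counter. The same pattern would apply to the agent and plugin down-time settings.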
Internal alerts
Monitor job execution error : An alert is generated if a Monitor job encounters an error during its execution. The severity of the alert can be set using the corresponding dropdown list.
CCMS errors : An alert is generated if a CCMS job encounters an error during its execution. The severity of the alert can be set using the corresponding dropdown list.
Monitor Tree loading errors : An alert is generated if a Monitor Tree job encounters an error while loading data from SAP. The severity of the alert can be set using the corresponding dropdown list.
Agents
These alarm settings help you detect and get notified of problems on an agent:
Max agent down time (sec) :
Notifies you when an agent is not responding.
Define the maximum time in seconds the agent may remain unavailable before a notification is sent.
Min schedule ratio (%) :
This alarm detects when an agent does not have enough time to execute all its monitors.
The server computes the ratio between executed monitors and rescheduled ones and compares it to the threshold.
A ratio of 100% is expected on well-configured agents.
Min successful exec. ratio (%) :
This alarm detects when an agent returns many execution errors for its monitors.
The server computes the ratio between successful and failed executions and compares it to the threshold.
Occasional monitor failures are normal, but a large number of failures might indicate a problem on the agent (resources/network).
Max result send time (sec) :
This alarm detects when sending results from the agent to the primary server takes too long.
This can be caused by network problems, or by a resource problem on the agent or the primary server.
A notification is sent if the send time exceeds the threshold.
Max time without results (sec) :
This alarm detects when an agent is not sending any results to the server.
This can indicate a resource problem on the agent.
A notification is sent if the time since the last received result exceeds the threshold.
Max VM Heap usage (%) :
This alarm detects when an agent is using all its allocated memory.
Agent memory usage reaching 100% may indicate memory starvation and instability.
A notification is sent if VM heap usage reaches the threshold.
Max OS RAM usage (%) :
This alarm detects when the overall OS memory usage is too high.
High OS memory usage may prevent the server from using its allocated memory and may cause paging, which degrades performance.
A notification is sent if OS memory usage exceeds the threshold.
Max OS disk usage (%) :
This alarm detects when the application disk space is running low.
A full disk must be avoided at all costs, as it may bring the service down.
A notification is sent if the used disk space exceeds the threshold.
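The two ratio alarms above can be illustrated with a short sketch. The exact formula used by the server is not given here, so this example assumes the ratio is the share of good outcomes (executed or successful) among all outcomes, which is consistent with 100% being the expected value on a healthy agent. The function names and sample numbers are hypothetical.

```python
def ratio_percent(good: int, bad: int) -> float:
    """Share of good outcomes among all outcomes, as a percentage."""
    total = good + bad
    # Assumption: with no data at all, report 100% rather than raise an alarm.
    return 100.0 if total == 0 else 100.0 * good / total


def below_minimum(ratio: float, min_ratio: float) -> bool:
    """True when a 'Min ... ratio (%)' alarm would be raised."""
    return ratio < min_ratio


# Hypothetical figures: 95 monitors executed, 5 rescheduled.
schedule_ratio = ratio_percent(good=95, bad=5)
# Hypothetical figures: 180 successful executions, 20 failed.
success_ratio = ratio_percent(good=180, bad=20)

schedule_alarm = below_minimum(schedule_ratio, min_ratio=98.0)
success_alarm = below_minimum(success_ratio, min_ratio=80.0)
```

With these sample values the schedule ratio is 95%, which falls under a 98% minimum and would trigger a notification, while the 90% success ratio stays above an 80% minimum. The "Max ..." settings in this section work the other way round: the measured value is compared against an upper bound.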
Plugins
Max plugin down time (sec) :
Detects when a plugin is failing to send events.
This is usually a critical case, because it means that monitoring data might not be visible in the corresponding third-party platform.
A notification is sent if the plugin error lasts longer than the threshold.
Licenses
Max expiration delay (days) :
Notifies you when a license is about to expire.
Invalid license severity :
Notifies you when a license is not valid.
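As a rough illustration of the expiration setting, the sketch below assumes that "Max expiration delay (days)" is the number of remaining days at or below which a notification is raised. That interpretation, the function names, and the dates are assumptions made for the example.

```python
from datetime import date


def days_until_expiration(expires_on: date, today: date) -> int:
    """Whole days left before the license expires (negative if already expired)."""
    return (expires_on - today).days


def expiration_alarm_due(expires_on: date, today: date, max_delay_days: int) -> bool:
    # Assumption: notify as soon as the remaining time is within the delay.
    return days_until_expiration(expires_on, today) <= max_delay_days


today = date(2024, 5, 1)
far_license = date(2024, 6, 1)    # 31 days away
near_license = date(2024, 5, 23)  # 22 days away

far_due = expiration_alarm_due(far_license, today, max_delay_days=30)
near_due = expiration_alarm_due(near_license, today, max_delay_days=30)
```

With a 30-day delay, only the license expiring in 22 days would trigger a notification in this sketch.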
Internal alarms settings
Clear alarms :
If set, all clearable alarms are cleared (by sending an alarm with the toClear parameter set to true) once the problem is no longer detected.
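The clearing behaviour can be sketched as follows. Only the toClear parameter comes from the text above; the `Alarm` class, its other fields, and the `next_event` function are hypothetical names used to illustrate the logic, not the actual Redpeaks data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alarm:
    key: str                # hypothetical identifier of the monitored problem
    message: str
    to_clear: bool = False  # stands in for the toClear parameter


def next_event(key: str, problem_detected: bool, was_active: bool,
               clear_alarms: bool) -> Optional[Alarm]:
    """Decide which alarm, if any, to emit after one check."""
    if problem_detected:
        return Alarm(key, "problem detected")
    if was_active and clear_alarms:
        # The problem is gone: emit a clearing alarm.
        return Alarm(key, "problem no longer detected", to_clear=True)
    return None


raised = next_event("agent-down", problem_detected=True, was_active=False, clear_alarms=True)
cleared = next_event("agent-down", problem_detected=False, was_active=True, clear_alarms=True)
kept = next_event("agent-down", problem_detected=False, was_active=True, clear_alarms=False)
```

Here `cleared` carries `to_clear=True`, whereas `kept` is `None` because clearing is switched off, so no clearing alarm is sent.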
Metrics sources
Alarm source : SID, HOST, FQDN, TITLE, INSTANCE, IP
Metric source : SID, HOST, FQDN, TITLE, INSTANCE, IP
Last modified: 2024/05/01 18:35 (external edit)