====== Custom SapControl monitoring ======

This monitor extends the monitoring of components that use the SAP start service framework, such as the Web Dispatcher, ICM, or application servers. It discovers metrics from the SAPControl services and lets you customize how they are monitored.

===== Configuration hints =====

  * Use the ''**load Metrics**'' button to discover the metrics available on a system.
  * You can then select the metrics you want and create a surveillance rule for each one.
  * Metric paths are filled in automatically in the table.
  * You can replace branches of the path with a variable such as //%INSTANCE%// or //%DISK%//, then use that variable in the alarm message or in the metric target to make the rule generic.
  * For each surveillance line, you can send alarms and metrics.

==== Thresholds ====

The threshold syntax is similar to the one used in CCMS:

  * G2W:50 -> 50 is the threshold to move from GREEN to WARNING
  * W2M:80 -> 80 is the threshold to move from WARNING to MAJOR
  * M2W:70 -> 70 is the threshold to move from MAJOR back to WARNING
  * You can combine multiple rules: G2W:50 W2M:80 M2W:70

**Severity letters:**

  * G: Green
  * W: Warning
  * m: minor
  * M: Major
  * C: Critical

==== Alarm message ====

You can build the alarm message from the variables declared in the metric path, as well as the following ones:

  * %THRESH%: the threshold that has been breached
  * %VALUE%: the value of the metric
  * %VALUNIT%: the unit of the metric

==== Custom performance metrics ====

  * You can define a custom metric on any discovered performance metric.
  * Use the **Monitor name** field to define the name of your metric (mind the naming convention).
  * Use the **Target** field to define the metric tags. You can use path variables as tag values:
    * MTE: %INSTANCE%\FileSystem\%FS%\UsedSpace
    * Target: instance:%INSTANCE%,filesystem:%FS%

==== Alternative metric ====

  * Many metrics are named differently from one system to another.
  * The **Alternative** check box marks a rule as an alternative metric path for the previous rule.
  * You can define as many alternative paths as you need, but they all have to map to the same metric.

===== Surveillance table =====

^Parameter^Description^
^Active|Enables or disables the rule|
^Mandatory|If set to true, an error alarm is generated when no metric can be collected at the specified path|
^Alternative|If set to true, the metric defined in this rule is an alternative to the previous one. This is convenient for a metric that is named differently from one system to another, because the configuration can stay generic.|
^Monitor tree element|The path to the metric|
^Monitor name|The name to use for the generated metric|
^Alarm message|The alarm message definition|
^Threshold|The threshold definition, using CCMS-style syntax, for example: G2W:80 W2C:90|
^Target|A comma-separated list of tags: tag_name:tag_value. Variables defined in the metric path can be used here|
^Delta|Enable this field to monitor the change of a metric over time instead of its absolute value, and set the period in the delta time field. Useful for metrics that are only reset at system startup.|
^Delta time (min)|The length of the period, in minutes, over which the delta of the metric is computed|
^Alarm|Enable to activate alarm sending|
^Metric|Enable to activate metric sending|

===== Examples =====

^Active^Mandatory^Alternative^Monitor tree element^Monitor name^Alarm message^Threshold^Target^Delta^Delta time (min)^Alarm^Metric^
|true|true|false|%INSTANCE%\CPU\Utilization|CPU_UTILIZATION|CPU utilization is %VALUE% (>%THRESH%)|G2W:60 W2C:90|instance:%INSTANCE%|false|0|true|true|

**Effect**: Generates a WARNING alarm if CPU utilization is between 60 and 90, and a CRITICAL alarm if it is over 90. Also generates a metric with the instance name as the target value.

^Active^Mandatory^Alternative^Monitor tree element^Monitor name^Alarm message^Threshold^Target^Delta^Delta time (min)^Alarm^Metric^
|true|true|false|%INSTANCE%\Connections\Failed_connections|FAILED_CONNECTION_60MIN|%VALUE% failed connections since last 60 min (>=%THRESH%)|G2W:100|instance:%INSTANCE%|true|60|true|true|

**Effect**: Sends a WARNING alarm if more than 100 failed connections occurred within the last 60 min. Also generates a metric with the instance name as the target value.
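
To make the role of path variables, targets and threshold rules more concrete, the following Python sketch shows one way a surveillance row could be interpreted. It is only an illustration, not the monitor's actual implementation: the function names (''match_path'', ''expand'', ''parse_thresholds''), the instance name ''PRD_D00'' and the file system ''/usr/sap'' are hypothetical, and it assumes that a variable stands for exactly one branch of the path and appears only once per path.

```python
import re

# Matches a variable such as %INSTANCE% or %FS% (uppercase letters and underscores).
_VAR = re.compile(r"%([A-Z_]+)%")

# Severity letters as listed in the Thresholds section.
SEVERITIES = {"G": "Green", "W": "Warning", "m": "minor", "M": "Major", "C": "Critical"}
_RULE = re.compile(r"([GWmMC])2([GWmMC]):(\d+(?:\.\d+)?)")


def match_path(pattern, path):
    """Return the variables captured from `path`, or None if it does not match.

    Assumption: each variable matches exactly one branch (no backslash inside).
    """
    parts = []
    pos = 0
    for m in _VAR.finditer(pattern):
        parts.append(re.escape(pattern[pos:m.start()]))   # literal text before the variable
        parts.append(r"(?P<%s>[^\\]+)" % m.group(1))       # one path branch
        pos = m.end()
    parts.append(re.escape(pattern[pos:]))
    found = re.fullmatch("".join(parts), path)
    return found.groupdict() if found else None


def expand(template, values):
    """Replace every known %NAME% in `template`; unknown variables are left untouched."""
    return _VAR.sub(lambda m: str(values.get(m.group(1), m.group(0))), template)


def parse_thresholds(spec):
    """Split a spec such as 'G2W:50 W2M:80' into (from, to, value) tuples."""
    rules = []
    for token in spec.split():
        m = _RULE.fullmatch(token)
        if m is None:
            raise ValueError("bad threshold rule: %r" % token)
        rules.append((m.group(1), m.group(2), float(m.group(3))))
    return rules


if __name__ == "__main__":
    # Hypothetical discovered metric path.
    found = match_path(r"%INSTANCE%\FileSystem\%FS%\UsedSpace",
                       r"PRD_D00\FileSystem\/usr/sap\UsedSpace")
    print(found)                                                  # {'INSTANCE': 'PRD_D00', 'FS': '/usr/sap'}
    print(expand("instance:%INSTANCE%,filesystem:%FS%", found))   # instance:PRD_D00,filesystem:/usr/sap
    print(parse_thresholds("G2W:50 W2M:80 M2W:70"))
```

The same ''expand'' helper can fill an alarm message once %VALUE% and %THRESH% are added to the captured variables. How the monitor evaluates the parsed transitions against the current severity is not covered here.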