Redpeaks V6.8
Trouble shooting
Monitors Guide
Work processes are the actual workers in SAP. It is important to check their state, availability and runtime. This monitor watches work processes of all kinds, and its monitoring can be tuned for specific tasks or cases. It detects long-running processes and stopped or private processes, computes the usage ratio, and notifies as soon as a threshold on those metrics is reached.
This monitor can send an alarm when it finds too many work processes in a given state, or when some have been running for too long.
To start the configuration, create a rule in the surveillance table. You can use the User filter to watch processes used by a specific user or group of users (regular expression).
The Task filter selects the specific task for which you want to check the processes. Select ANY to build a general rule that matches any kind of work process.
The State filter selects only the work processes in the given state. This lets you monitor the number of processes in a given state, for example PRIVATE or STOPPED, which are states you want to avoid.
The Runtime threshold sets the maximum runtime for a given type of work process. Every work process that matches the task and state filters and runs for longer than the threshold generates an alarm.
The Count threshold sets the maximum number of work processes allowed in a given state, as an absolute value or as a ratio of the total. For example, use it to get an alarm when 80% of DIALOG work processes are RUNNING.
Thresholds are independent and can be set to 0 if not used.
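The rule logic described above can be sketched in Python. This is only an illustration of how the filters and the two thresholds might combine, not the monitor's actual implementation: the `WorkProcess` and `Rule` classes and the `evaluate` function are hypothetical names. The sketch assumes that a bare `*` in the User filter means "any user", that a threshold is reached when the value is equal to or above it, and that the runtime check applies to processes matching the task and state filters (the parameter table in this guide states that the runtime threshold does not depend on the status, so treat this as one possible reading).

```python
import re
from dataclasses import dataclass


@dataclass
class WorkProcess:
    task: str         # e.g. "DIALOG", "SPOOL"
    state: str        # e.g. "RUNNING", "WAITING", "STOPPED"
    user: str
    runtime_sec: int  # elapsed runtime in seconds


@dataclass
class Rule:
    task: str = "ANY"
    state: str = "RUNNING"
    user: str = "*"
    runtime_threshold: int = 0  # seconds, 0 = not used
    count_threshold: str = "0"  # "3" (absolute) or "80%" (ratio), "0" = not used


def user_matches(pattern: str, user: str) -> bool:
    # Assumption: a bare "*" (as in the example rules) means "any user";
    # anything else is treated as a regular expression.
    return pattern == "*" or re.fullmatch(pattern, user) is not None


def evaluate(rule: Rule, processes: list[WorkProcess]) -> list[str]:
    # All processes of the watched task: the base for the percentage.
    in_task = [p for p in processes if rule.task in ("ANY", p.task)]
    # Processes that also match the state and user filters.
    matched = [p for p in in_task
               if p.state == rule.state and user_matches(rule.user, p.user)]
    alarms = []

    # Runtime threshold: one alarm per process running longer than the limit.
    if rule.runtime_threshold > 0:
        for p in matched:
            if p.runtime_sec > rule.runtime_threshold:
                alarms.append(
                    f"{p.task} process of {p.user} running for {p.runtime_sec}s")

    # Count threshold: absolute number or percentage of the task's processes.
    threshold = rule.count_threshold.strip()
    if threshold.endswith("%"):
        limit = float(threshold[:-1])
        ratio = 100.0 * len(matched) / len(in_task) if in_task else 0.0
        if limit > 0 and ratio >= limit:
            alarms.append(f"{ratio:.0f}% of {rule.task} processes are {rule.state}")
    elif int(threshold) > 0 and len(matched) >= int(threshold):
        alarms.append(f"{len(matched)} {rule.task} processes are {rule.state}")
    return alarms
```

Because the two thresholds are checked independently, a single rule can raise a runtime alarm, a count alarm, both, or neither.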
Parameter | Description |
---|---|
Active | Use this field to activate or deactivate a line of configuration. |
Task | The type of work process to watch, or ANY to watch all. |
Status | Watch for processes in the given status. |
User | A filter to match only the processes used by the given user. Regular expressions may be used. |
Runtime threshold (elapsed in sec) | The threshold for the maximum runtime (seconds) of a given process. Does not depend on the configured status. |
Count threshold (nb or %) | The threshold for the number of processes being in the defined status. Use an absolute value or a percentage of the total number of processes available for the given task (specify % unit). |
Severity | The level of severity of the alarm generated by this line of surveillance. |
Auto clear | If checked, the alarm will be cleared as soon as the alarm condition is not met anymore. |
Alarm tag | This field adds custom text to the alarm message. The %MSG% variable contains the actual generated message and can be used like this: “my_prefix %MSG% my_suffix”. By default, the tag is used as a prefix. |
Alarm | If checked, this line of surveillance will be used for alarm generation. |
Metric | If checked, this line of surveillance will be used for QOS generation. |
Report | If checked, this line of surveillance will be used to show the threshold and severity in the daily report. |
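The Alarm tag behavior described in the table can be illustrated with a short Python sketch. The function name `apply_alarm_tag` is hypothetical, and the single space used to join a prefix tag to the message is an assumption; the guide only states that the tag is used as a prefix by default and that `%MSG%` is replaced by the generated message.

```python
def apply_alarm_tag(tag: str, message: str) -> str:
    """Combine a configured alarm tag with the generated alarm message."""
    if not tag:
        # No tag configured: the message is sent unchanged.
        return message
    if "%MSG%" in tag:
        # The tag is a template: substitute the generated message.
        return tag.replace("%MSG%", message)
    # Default: the tag is used as a prefix (separator is an assumption).
    return f"{tag} {message}"


print(apply_alarm_tag("my_prefix %MSG% my_suffix", "2 DIALOG processes are STOPPED"))
```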
Active | Task | Status | User | Runtime threshold (elapsed in sec) | Count threshold (nb or %) | Severity | Auto clear | Alarm tag | Alarm | Metric | Report |
---|---|---|---|---|---|---|---|---|---|---|---|
true | DIALOG | RUNNING | * | 90 | 80% | MAJOR | true | | true | false | false |
Effect: Send a MAJOR alarm for any DIALOG process that has been running a task for more than 90 seconds, or if 80% of the DIALOG processes are in RUNNING state.
Active | Task | Status | User | Runtime threshold (elapsed in sec) | Count threshold (nb or %) | Severity | Auto clear | Alarm tag | Alarm | Metric | Report |
---|---|---|---|---|---|---|---|---|---|---|---|
true | DIALOG | PRIV | * | 0 | 1 | CRITICAL | true | | true | false | false |
Effect: Send a CRITICAL alarm as soon as one DIALOG process is in PRIVATE status.
Active | Task | Status | User | Runtime threshold (elapsed in sec) | Count threshold (nb or %) | Severity | Auto clear | Alarm tag | Alarm | Metric | Report |
---|---|---|---|---|---|---|---|---|---|---|---|
true | ANY | STOPPED | * | 0 | 1 | CRITICAL | true | | true | false | false |
Effect: Send a CRITICAL alarm if any process is in STOPPED state.
Active | Task | Status | User | Runtime threshold (elapsed in sec) | Count threshold (nb or %) | Severity | Auto clear | Alarm tag | Alarm | Metric | Report |
---|---|---|---|---|---|---|---|---|---|---|---|
true | SPOOL | RUNNING | JDOE | 120 | 0 | WARNING | false | | true | false | false |
Effect: Send a WARNING alarm each time user JDOE runs a SPOOL task for more than 120 seconds. The alarm is not cleared automatically.
metricId | metricUnit | metricTarget | metricDescription |
---|---|---|---|
WORKPROCESSES_PERCENTAGE_USED | Percent or Work processes | [INSTANCE][TASK][STATUS] | Sends the usage ratio of the work processes of a given type |
WORKPROCESSES_USED_COUNT | Percent or Work processes | [INSTANCE][TASK][STATUS] | Sends the number of work processes of a given type |
WORKPROCESSES_FREE_COUNT | Work processes | [INSTANCE][TASK] | Sends the number of work processes in WAITING state for a given task |
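As a rough illustration of how the three metrics relate to each other, the following Python sketch derives them from a list of `(task, status)` pairs. It is not the monitor's implementation: the function name `workprocess_metrics`, the instance name `PRD_00`, and the way the target string is built from bracketed parts are assumptions made for the example. The percentage is taken relative to all work processes of the same task.

```python
from collections import Counter


def workprocess_metrics(instance, processes):
    """Build (metricId, target, value) tuples from (task, status) pairs."""
    total = Counter(task for task, _ in processes)  # processes per task
    by_state = Counter(processes)                   # processes per (task, status)
    metrics = []
    for (task, status), count in sorted(by_state.items()):
        target = f"[{instance}][{task}][{status}]"
        metrics.append(("WORKPROCESSES_USED_COUNT", target, count))
        metrics.append(("WORKPROCESSES_PERCENTAGE_USED", target,
                        100.0 * count / total[task]))
    for task in sorted(total):
        # Free processes are the ones in WAITING state for the task.
        metrics.append(("WORKPROCESSES_FREE_COUNT", f"[{instance}][{task}]",
                        by_state[(task, "WAITING")]))
    return metrics


sample = [("DIALOG", "RUNNING")] * 3 + [("DIALOG", "WAITING")]
for metric in workprocess_metrics("PRD_00", sample):
    print(metric)
```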