Redpeaks V6.8
Trouble shooting
Monitors Guide
This monitor is dedicated to the surveillance of SAP background jobs. It watches for job error status, duration, delay, and occurrence. You can set general rules to watch any job and be notified as soon as one fails. You can also configure specific rules for a given job to monitor its execution plan. A dedicated table is available to help manage notifications for specific job failures.
Job status monitoring (ABORTED state):
- Use the Max aborts threshold to activate job status monitoring (must be > 0).
- Use the Abort. sev field to set the alarm severity.
- The aggregate option can be used to define the granularity of the monitoring:
  - "Job XYZ has been ABORTED on 2023/04/23 05:23" (aggregate mode is OFF)
  - "15 jobs matching Z_ABC* name have been aborted (>=10) in the last 60 minutes" (aggregate mode is ON)

Duration and delay monitoring:
- Use the Max duration or Max start delay threshold to activate duration and delay monitoring (must be > 0).
- Use the Duration/Delay severity fields to set the alarm severity.
- The aggregate option can be used to define the granularity of the monitoring:
  - "Job XYZ is running since 46 minutes (>=30 min)" (aggregate mode is OFF)
  - "15 jobs matching Z_ABC* are delayed since 40 min (>=10)" (aggregate mode is OFF)

Occurrence monitoring:
- Use the Occurrence severity field to set the alarm severity.

Job name filter:
- Use the Load jobs button to display the list of discovered jobs.
- Z_ABC* → matches jobs starting with Z_ABC
- *_PROCESSING → will generate an error
→ will generate an errorExclusion List | The exclusion list can be used when you activated to SAP jobs collection in the daily report monitor job. You can specify a list of jobs that you want to ignore, so they do not appear in the report. Note: This has no effect on the real time surveillance! |
---|---|
Metadata | If active all collected jobs will be sent to plugin compatible with metadata processing |
Parameter | Description |
---|---|
Active | Use this field to activate or deactivate a line of configuration. |
Job name | A filter to define the job that you want to monitor. Use * for all. |
Client | A filter to match only a subset of clients. |
Schedule info | The schedule defined for the job. This field can only be modified via the wizard. |
Max aborts | The threshold for the maximum number of aborted jobs within a period. |
Abort sev. | Defines the severity of the alarm to send if a job gets aborted. |
Aggregate | If checked, an alarm will be sent if the total number of aborted jobs is over the threshold. If not checked, one alarm will be sent per job whose number of aborts is equal to or greater than the threshold. |
Max duration | The threshold for the maximum job duration in seconds |
Duration severity | The severity for the duration alarm. |
Max start delay | The threshold for the maximum execution delay. |
Delay severity | The severity for the delay alarm. |
Occurrence severity | The severity used for schedule alarm. |
Calendar | The execution calendar of the job. The job is not checked on the calendar's closed days. |
Alarm tag | This field allows you to add custom text within the alarm message. The %MSG% variable contains the actual generated message and can be used as follows: “my_prefix %MSG% my_suffix”. By default, the tag is used as a prefix. |
Advanced match | If checked, fetches all executed jobs from SAP and matches them against the specified job name with an advanced method. Collects more data, but allows more advanced matching methods. |
Exclusive | If checked, all jobs matching the rule filter will not be processed by subsequent rules. |
Alarm | If checked, this line of surveillance will be used for alarm generation. |
Metric | If checked, this line of surveillance will be used for metric generation. |
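The effect of the Aggregate parameter can be sketched as follows. This is only an illustration of the two granularities: the function name `abort_alarms` and the message wording are hypothetical, and the `>=` comparison in aggregate mode is an assumption based on the "(>=10)" sample message.

```python
from collections import Counter
from typing import List, Tuple


def abort_alarms(aborted_jobs: List[str], max_aborts: int,
                 aggregate: bool, severity: str = "CRITICAL") -> List[Tuple[str, str]]:
    """Return (severity, message) pairs for the aborted jobs of one rule."""
    if max_aborts <= 0:
        return []  # a threshold of 0 leaves status monitoring disabled
    if aggregate:
        total = len(aborted_jobs)
        # Assumption: ">=" comparison, as in the "(>=10)" sample message.
        if total >= max_aborts:
            return [(severity, f"{total} jobs have been aborted (>={max_aborts})")]
        return []
    # Aggregate OFF: one alarm per job name reaching the threshold.
    counts = Counter(aborted_jobs)
    return [(severity, f"Job {name} has been ABORTED ({count} times)")
            for name, count in sorted(counts.items()) if count >= max_aborts]
```

With the same input, aggregate mode yields at most one alarm for the whole rule, while non-aggregate mode can yield one alarm per job name.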
Parameter | Description |
---|---|
Active | Use this field to activate or deactivate this rule |
Job name | The job name to match |
Client | The client to match |
Recipient | A comma-separated list of email recipients |
Alarm tag | The optional alarm tag |
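The Alarm tag behaviour (the %MSG% placeholder, with the tag used as a prefix by default) can be sketched as below. The function name `apply_alarm_tag` is hypothetical, and the single space between prefix and message is an assumption.

```python
def apply_alarm_tag(tag: str, message: str) -> str:
    """Combine an optional alarm tag with the generated alarm message."""
    if not tag:
        return message  # no tag configured: message is left unchanged
    if "%MSG%" in tag:
        # The placeholder marks where the generated message is inserted.
        return tag.replace("%MSG%", message)
    # Default: tag used as a prefix (separator is an assumption).
    return f"{tag} {message}"
```

For example, the tag `my_prefix %MSG% my_suffix` wraps the generated message on both sides, whereas a tag without the placeholder is simply prepended.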
Active | Job name | Client | Schedule info | Max aborts | Abort sev. | Aggregate | Max duration | Duration severity | Max start delay | Delay severity | Occurrence severity | Calendar | Alarm tag | Alarm | Metric |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
true | * | * | From last 15 min | 1 | CRITICAL | false | 0 | DISABLED | 0 | DISABLED | DISABLED | None | false | true | false |
Effect: A CRITICAL alarm is sent for each job aborted in the last 15 minutes.
Active | Job name | Client | Schedule info | Max aborts | Abort sev. | Aggregate | Max duration | Duration severity | Max start delay | Delay severity | Occurrence severity | Calendar | Alarm tag | Alarm | Metric |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
true | MY_JOB | * | Every 1 HOURS Starting 2015/01/01 10:00 | 1 | CRITICAL | false | 15 | WARNING | 0 | DISABLED | MAJOR | None | false | true | false |
Effect: Sends a CRITICAL alarm if job MY_JOB is aborted. Sends a WARNING alarm if the job runs for more than 15 minutes. Sends a MAJOR alarm if the job does not run every hour as scheduled (one alarm per missed slot).
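The "one alarm per missed slot" behaviour of the occurrence check can be sketched as follows. This is a rough illustration, not the product's algorithm: the function name `missed_slots`, the checking window, and the `tolerance` parameter are all assumptions introduced for the example.

```python
from datetime import datetime, timedelta
from typing import List


def missed_slots(start: datetime, period: timedelta, executions: List[datetime],
                 window_start: datetime, window_end: datetime,
                 tolerance: timedelta) -> List[datetime]:
    """Return expected slots in [window_start, window_end) without a nearby run."""
    if window_start <= start:
        slot = start
    else:
        # Ceiling division: first scheduled slot at or after window_start.
        steps = -(-(window_start - start) // period)
        slot = start + steps * period
    missed = []
    while slot < window_end:
        # A slot is considered honoured if any run started within the tolerance.
        if not any(abs(run - slot) <= tolerance for run in executions):
            missed.append(slot)
        slot += period
    return missed


# Schedule from the example rule: every 1 hour starting 2015/01/01 10:00.
runs = [datetime(2023, 4, 23, 5, 0, 10), datetime(2023, 4, 23, 7, 0, 5)]
gaps = missed_slots(datetime(2015, 1, 1, 10, 0), timedelta(hours=1), runs,
                    datetime(2023, 4, 23, 5, 0), datetime(2023, 4, 23, 8, 0),
                    timedelta(minutes=5))
```

Here `gaps` contains only the 06:00 slot, which would correspond to a single MAJOR alarm under the rule above.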
metricId | metricUnit | metricTarget | metricDescription |
---|---|---|---|
JOBS_DURATION | Seconds | [JOB] (CLIENT) | The duration of the job in seconds |
JOBS_STATUS | Status | [JOB] (CLIENT) | The status of the job execution. TRUE if successful, FALSE otherwise |
ABORTED_JOBS | Aborted jobs | [JOB] (CLIENT) | The number of aborted jobs matching the filter. |
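To make the three metrics concrete, the sketch below derives them from a list of job runs. Everything apart from the metric identifiers is an assumption: the `JobRun` record, the "FINISHED"/"ABORTED" status labels, the `build_metrics` helper, and the rendering of the `[JOB] (CLIENT)` target as `name (client)`.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class JobRun:  # hypothetical record of one job execution
    name: str
    client: str
    duration_seconds: int
    status: str  # assumed labels: "FINISHED" or "ABORTED"


def build_metrics(runs: List[JobRun]) -> List[Tuple[str, str, object]]:
    """Return (metricId, metricTarget, value) triples for the given runs."""
    metrics: List[Tuple[str, str, object]] = []
    aborted: Counter = Counter()
    for run in runs:
        target = f"{run.name} ({run.client})"  # assumed target rendering
        metrics.append(("JOBS_DURATION", target, run.duration_seconds))
        metrics.append(("JOBS_STATUS", target, run.status == "FINISHED"))
        if run.status == "ABORTED":
            aborted[target] += 1
    # One ABORTED_JOBS value per target that had at least one abort.
    for target, count in sorted(aborted.items()):
        metrics.append(("ABORTED_JOBS", target, count))
    return metrics


sample = build_metrics([JobRun("MY_JOB", "100", 42, "FINISHED"),
                        JobRun("MY_JOB", "100", 5, "ABORTED")])
```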