====== BTP Application Stats Monitor ======
===== Overview =====
The **BTP Application Stats Monitor** monitors SAP Business Technology Platform (BTP) Cloud Foundry applications. It tracks application health, resource usage, and instance status on a configurable schedule (every 5 minutes by default), so you can detect and respond to performance issues or failures early.
===== Prerequisites =====
* BTP Cloud Foundry API access
* Valid BTP credentials with permissions to:
* List applications (''/v3/apps'')
* Read process statistics (''/v3/processes/{guid}/stats'')
* Network connectivity from the monitoring system to BTP API endpoints
===== API Endpoints Used =====
^ Endpoint ^ Purpose ^
| POST /oauth/token | Authentication (OAuth2 token generation) |
| GET /v3/apps | List all applications (wildcard mode) |
| GET /v3/processes/{guid}/stats | Retrieve instance statistics for an application |
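The requests in the table above can be sketched as follows. This is a minimal illustration that only builds the requests and never sends them. The separate login URL, the ''cf'' client with an empty secret, and the password grant are assumptions based on common Cloud Foundry setups, not confirmed details of this monitor. All helper names are hypothetical.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(login_url: str, user: str, password: str) -> urllib.request.Request:
    """Build (but do not send) the POST /oauth/token request."""
    body = urllib.parse.urlencode(
        {"grant_type": "password", "username": user, "password": password}
    ).encode("ascii")
    # Assumption: the public "cf" client with an empty secret is used.
    basic = base64.b64encode(b"cf:").decode("ascii")
    return urllib.request.Request(
        login_url.rstrip("/") + "/oauth/token",
        data=body,
        method="POST",
        headers={
            "Authorization": "Basic " + basic,
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
        },
    )


def build_api_request(api_url: str, path: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated GET request against the CF API."""
    return urllib.request.Request(
        api_url.rstrip("/") + path,
        method="GET",
        headers={"Authorization": "Bearer " + access_token, "Accept": "application/json"},
    )


def stats_path(process_guid: str) -> str:
    """Path of the statistics endpoint for one process GUID."""
    return "/v3/processes/" + urllib.parse.quote(process_guid, safe="") + "/stats"
```

Passing either request object to ''urllib.request.urlopen'' would perform the actual call.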
===== Key Features =====
==== Metrics Collection ====
The monitor collects the following metrics for each application:
* **Running Instances** - Number of healthy, operational instances
* **Crashed Instances** - Number of failed instances
* **Memory Used (MB)** - Total memory consumption across all running instances
* **Disk Used (MB)** - Total disk space consumption across all running instances
* **CPU Usage (%)** - Average CPU utilization across all running instances
==== Alarm Capabilities ====
* **Application State Monitoring** - Detect when apps are not in the expected state (e.g., STOPPED, CRASHED)
* **Crashed Instance Detection** - Alert when the number of crashed instances exceeds a threshold
* **Memory Usage Alarms** - Warning and Critical thresholds for memory consumption
* **Disk Usage Alarms** - Warning and Critical thresholds for disk consumption
* **CPU Usage Alarms** - Warning and Critical thresholds for CPU utilization
==== Wildcard Support ====
Monitor all applications in your BTP space automatically by using ''*'' as the app name. The monitor will:
* Automatically discover all applications in your BTP Cloud Foundry space
* Monitor each application individually
* Dynamically adapt to new or removed applications
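Application discovery in wildcard mode could look roughly like the sketch below. Cloud Foundry V3 list endpoints return paginated results, so the sketch follows the ''pagination.next.href'' link until it is empty. The function name ''discover_apps'' and the injected ''fetch_json'' callable are hypothetical; how the monitor itself performs discovery is not documented here.

```python
from typing import Callable, Dict, Optional


def discover_apps(fetch_json: Callable[[str], dict], first_url: str) -> Dict[str, str]:
    """Return {app name: app GUID}, following pagination links page by page."""
    apps: Dict[str, str] = {}
    url: Optional[str] = first_url
    while url:
        page = fetch_json(url)
        for app in page.get("resources", []):
            apps[app["name"]] = app["guid"]
        # "next" is null on the last page, otherwise an object with an "href".
        next_link = (page.get("pagination") or {}).get("next")
        url = next_link["href"] if next_link else None
    return apps
```

Because discovery runs on every collection cycle in this sketch, newly deployed or deleted applications are picked up without any configuration change.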
===== How It Works =====
==== Data Collection Process ====
- **Authentication**: The monitor authenticates to the BTP Cloud Foundry API using OAuth2 with cached token management
- **Application Discovery (wildcard mode)**: Retrieves all applications from ''/v3/apps'' endpoint
- **Stats Collection**: For each application, calls ''/v3/processes/{guid}/stats'' to gather instance-level statistics
- **Data Aggregation**: Combines data from all instances to calculate totals and averages
- **Alarm Evaluation**: Compares collected metrics against configured thresholds
- **Metrics Storage**: Stores collected metrics for historical tracking and reporting
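The cached token management mentioned in the authentication step can be illustrated with a small sketch. It assumes that a token is reused until shortly before it expires and is then fetched again. The class ''TokenCache'', its parameters, and the 60-second safety margin are hypothetical and chosen for illustration only.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Reuse an OAuth2 token until shortly before it expires."""

    def __init__(
        self,
        fetch_token: Callable[[], Tuple[str, float]],
        clock: Callable[[], float] = time.monotonic,
        safety_margin: float = 60.0,
    ) -> None:
        # fetch_token returns (access_token, lifetime_in_seconds).
        self._fetch_token = fetch_token
        self._clock = clock
        self._safety_margin = safety_margin
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        # Refresh when there is no token yet or it is about to expire.
        if self._token is None or now >= self._expires_at - self._safety_margin:
            token, lifetime = self._fetch_token()
            self._token = token
            self._expires_at = now + lifetime
        return self._token
```

Injecting the clock keeps the class easy to test and avoids a token request on every collection cycle.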
==== Application State Determination ====
The overall application state is determined as follows:
* **RUNNING** - At least one instance is running (even if some are crashed)
* **CRASHED** - All instances are crashed
* **STOPPED** - No instances are running (and none are crashed)
//Note//: If you have 2 instances where 1 is RUNNING and 1 is CRASHED, the application is considered RUNNING because it's still serving traffic.
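The rules above can be sketched as a small function. The name ''overall_state'' is hypothetical. The rules do not say what happens when some instances are crashed, none are running, and others are in a different state; the sketch treats that case as CRASHED, which is an assumption.

```python
from typing import Iterable


def overall_state(instance_states: Iterable[str]) -> str:
    """Derive the application state from the states of its instances."""
    states = list(instance_states)
    if "RUNNING" in states:
        # One running instance is enough, even if others are crashed.
        return "RUNNING"
    if "CRASHED" in states:
        # Assumption: a mix of crashed and other non-running instances
        # also counts as CRASHED.
        return "CRASHED"
    return "STOPPED"
```

For example, ''overall_state(["RUNNING", "CRASHED"])'' returns ''RUNNING'', matching the note above.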
==== Resource Aggregation ====
* **Memory & Disk**: Summed across all running instances only (crashed instances are excluded)
* **CPU**: Averaged across all running instances only
* **Quotas**: Summed across all instances (running + crashed) to calculate usage percentages
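The aggregation rules above can be illustrated with the following sketch. It assumes that each instance entry follows the shape of the Cloud Foundry stats response (''state'', ''usage.mem'', ''usage.disk'', ''usage.cpu'', ''mem_quota'', ''disk_quota''), that memory and disk are reported in bytes, that 1 MB means 1024 * 1024 bytes, and that ''usage.cpu'' is a fraction that is multiplied by 100. The function name and the two percentage keys in the result are hypothetical.

```python
from typing import Dict, List

BYTES_PER_MB = 1024 * 1024  # assumption: MB means 1024 * 1024 bytes


def _percent(used: float, quota: float) -> float:
    return used / quota * 100.0 if quota else 0.0


def aggregate_stats(instances: List[dict]) -> Dict[str, float]:
    """Combine per-instance statistics into per-application metrics."""
    running = [i for i in instances if i.get("state") == "RUNNING"]
    crashed = [i for i in instances if i.get("state") == "CRASHED"]

    # Usage is taken from running instances only.
    mem_used = sum((i.get("usage") or {}).get("mem", 0) for i in running)
    disk_used = sum((i.get("usage") or {}).get("disk", 0) for i in running)
    cpu_values = [(i.get("usage") or {}).get("cpu", 0.0) for i in running]

    # Quotas are summed across running and crashed instances.
    counted = running + crashed
    mem_quota = sum(i.get("mem_quota") or 0 for i in counted)
    disk_quota = sum(i.get("disk_quota") or 0 for i in counted)

    return {
        "running_instances": len(running),
        "crashed_instances": len(crashed),
        "memory_used_mb": mem_used / BYTES_PER_MB,
        "disk_used_mb": disk_used / BYTES_PER_MB,
        # Assumption: usage.cpu is a fraction, so it is scaled to percent.
        "cpu_usage_percent": sum(cpu_values) / len(cpu_values) * 100.0 if cpu_values else 0.0,
        "memory_used_percent": _percent(mem_used, mem_quota),
        "disk_used_percent": _percent(disk_used, disk_quota),
    }
```

Because quotas include crashed instances while usage does not, a partially crashed application reports a lower usage percentage than its running instances show individually.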
===== Configuration =====
==== Connection Settings ====
* Create a Web Service Connector pointing to your BTP Cloud Foundry API endpoint
* Example: ''https://api.cf.eu10-123.hana.ondemand.com''
* Configure authentication credentials (user profile with BTP credentials)
==== Monitor Configuration ====
==== Adding Applications to Monitor ====
There are two ways to add applications to the monitor:
=== Method 1: Load BTP Apps (Recommended) ===
* Open the BTP Application Stats monitor configuration
* Click the "Load BTP Apps" button in the toolbar
* The system will:
* Connect to your BTP Cloud Foundry API
* Retrieve all available applications
* Populate the table with application names
* Select which applications to monitor by checking the Active checkbox
* Configure alarm thresholds for each application individually
Benefits:
* No need to manually find application GUIDs
* Shows all available applications in your BTP space
* Pre-fills application names automatically
* Allows selective monitoring of specific applications
=== Method 2: Wildcard Mode (Monitor All) ===
* Set App Name to ''*''
* The monitor will automatically:
* Discover all applications at runtime
* Monitor every application using the same configuration
* Adapt to new or removed applications without reconfiguration
Benefits:
* Zero configuration for comprehensive coverage
* Automatically monitors new applications
* Single configuration for all apps
=== Basic Settings ===
^ Field ^ Description ^ Default ^
| Active | Enable/disable monitoring for this application | ''true'' |
| Schedule | Collection frequency (minutes) | ''5'' |
| Timeout | Maximum execution time (seconds) | ''120'' |
| App Name | Application name or ''*'' for all apps | ''*'' |
=== State Monitoring ===
^ Field ^ Description ^ Default ^
| State Alarm | Enable state monitoring | ''true'' |
| Expected State | Expected application state | ''RUNNING'' |
| State Check Mode | ''EQUALS'' or ''NOT_EQUALS'' | ''NOT_EQUALS'' |
| State Severity | Alarm severity (1-5) | ''4'' (Critical) |
* Check Mode Examples:
* ''NOT_EQUALS'' + ''RUNNING'' → Alarm if app is NOT running
* ''EQUALS'' + ''CRASHED'' → Alarm if app IS crashed
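The two check modes can be expressed as a short function. This is a sketch of the logic described above; the function name ''state_alarm_triggered'' is hypothetical.

```python
def state_alarm_triggered(actual_state: str, expected_state: str, check_mode: str) -> bool:
    """Return True when the state alarm should be raised."""
    if check_mode == "EQUALS":
        # Alarm when the application IS in the configured state.
        return actual_state == expected_state
    if check_mode == "NOT_EQUALS":
        # Alarm when the application is NOT in the configured state.
        return actual_state != expected_state
    raise ValueError("unknown check mode: " + check_mode)
```

With the defaults (''NOT_EQUALS'' and ''RUNNING''), a stopped application raises an alarm and a running one does not.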
=== Crashed Instances Monitoring ===
^ Field ^ Description ^ Default ^
| Crashed Instances Alarm | Enable crashed instance detection | ''true'' |
| Crashed Instances Threshold | Maximum acceptable crashed instances | ''0'' |
| Crashed Instances Severity | Alarm severity | ''4'' (Critical) |
=== Memory Monitoring ===
^ Field ^ Description ^ Default ^
| Memory Alarm | Enable memory monitoring | ''true'' |
| Memory Warning % | Warning threshold | ''80%'' |
| Memory Critical % | Critical threshold | ''90%'' |
=== Disk Monitoring ===
^ Field ^ Description ^ Default ^
| Disk Alarm | Enable disk monitoring | ''true'' |
| Disk Warning % | Warning threshold | ''80%'' |
| Disk Critical % | Critical threshold | ''90%'' |
=== CPU Monitoring ===
^ Field ^ Description ^ Default ^
| CPU Alarm | Enable CPU monitoring | ''true'' |
| CPU Warning % | Warning threshold | ''80%'' |
| CPU Critical % | Critical threshold | ''90%'' |
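The warning and critical thresholds for memory, disk, and CPU can be evaluated with the same logic. The sketch below assumes that a value equal to a threshold already triggers the alarm (''>=''); whether the monitor uses ''>='' or ''>'' is not stated here. The function name is hypothetical.

```python
from typing import Optional


def classify_usage(percent: float, warning: float = 80.0, critical: float = 90.0) -> Optional[str]:
    """Map a usage percentage to "CRITICAL", "WARNING", or None (no alarm)."""
    # Check the critical threshold first so it takes precedence.
    if percent >= critical:
        return "CRITICAL"
    if percent >= warning:
        return "WARNING"
    return None
```

With the default thresholds, 92.5% is classified as critical and 85.23% as a warning.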
=== General Settings ===
^ Field ^ Description ^ Default ^
| Auto Clear | Automatically clear alarms when conditions normalize | ''true'' |
| Metric | Enable metrics collection | ''true'' |
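The effect of Auto Clear can be sketched as a state transition evaluated on each collection cycle. The sketch assumes that, with Auto Clear disabled, an alarm stays open until it is cleared manually; that behavior and the function name are assumptions.

```python
def next_alarm_active(currently_active: bool, condition_breached: bool, auto_clear: bool) -> bool:
    """Return whether the alarm is active after this evaluation cycle."""
    if condition_breached:
        return True
    if currently_active and not auto_clear:
        # Assumption: without Auto Clear the alarm waits for a manual clear.
        return True
    return False
```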
===== Usage Examples =====
==== Example 1: Load and Monitor Specific Applications ====
* Click "Load BTP Apps"
* Select only production apps (e.g., prod-api, prod-web, prod-worker)
* Configure stricter thresholds for production:
* Memory Warning: 70%
* Memory Critical: 85%
* CPU Warning: 60%
* CPU Critical: 80%
* Leave development apps inactive
==== Example 2: Monitor All Applications with Wildcard ====
* App Name: ''*''
* State Alarm: Enabled
* Expected State: RUNNING
* State Check Mode: NOT_EQUALS
→ Monitors all apps in your BTP space and raises alarms if any app is not running.
==== Example 3: Mixed Approach ====
* Use wildcard (*) with default thresholds for general coverage
* Add specific critical apps via Load BTP Apps with custom thresholds
* Both configurations can coexist in the same monitor
==== Example 4: Monitor Only Application State ====
* App Name: background-worker
* State Alarm: Enabled
* Memory Alarm: Disabled
* Disk Alarm: Disabled
* CPU Alarm: Disabled
* Crashed Instances Alarm: Disabled
→ Only monitors if the application is running, ignoring resource usage.
==== Collected Metrics ====
The BTP Application Stats monitor collects five metrics for each monitored application. Each metric is stored with dimensional tags (application name and host) for filtering and analysis.
=== Metric Types ===
^ Metric Name ^ Unit ^ Description ^ Example ^
| running_instances | count | Number of healthy, operational instances | 2 |
| crashed_instances | count | Number of failed or crashed instances | 0 |
| memory_used_mb | MB | Total memory consumption across all running instances | 256.5 |
| disk_used_mb | MB | Total disk space consumption across all running instances | 512.3 |
| cpu_usage_percent | % | Average CPU utilization across all running instances | 15.67 |
=== Metric Format ===
All metrics follow this naming convention:
promonitor.btp_app_stats.<metric_name> app_name:<app_name>;host:<host>;
==== Alarm Examples ====
=== State Alarm ===
BTP App my-app is in state STOPPED (expected state: RUNNING)
=== Crashed Instances Alarm ===
BTP App my-app has 2 crashed instance(s) (threshold: 0)
=== Memory Alarm ===
BTP App my-app memory usage is 92.5% [CRITICAL] (thresholds: warning=80%, critical=90%)
=== CPU Alarm ===
BTP App my-app CPU usage is 85.23% [WARNING] (thresholds: warning=80%, critical=90%)
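The message layouts shown in these examples can be reproduced with simple string formatting. The sketch below only mirrors the example texts; the function names are hypothetical and the monitor's actual formatting code is not documented here.

```python
def state_alarm_message(app: str, actual: str, expected: str) -> str:
    return f"BTP App {app} is in state {actual} (expected state: {expected})"


def crashed_alarm_message(app: str, crashed: int, threshold: int) -> str:
    return f"BTP App {app} has {crashed} crashed instance(s) (threshold: {threshold})"


def usage_alarm_message(app: str, resource: str, percent: float, level: str,
                        warning: int, critical: int) -> str:
    # resource is the label used in the text, for example "memory" or "CPU".
    return (
        f"BTP App {app} {resource} usage is {percent}% [{level}] "
        f"(thresholds: warning={warning}%, critical={critical}%)"
    )
```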
===== Best Practices =====
* Use "**Load BTP Apps**" to easily discover and select specific applications to monitor
* Use **Wildcard Mode** for comprehensive coverage when you want to monitor everything
* **Combine both approaches** - Use wildcard for general monitoring and add specific apps with custom thresholds
* **Adjust thresholds** based on your application's normal behavior patterns
* **Enable Auto Clear** to automatically resolve alarms when conditions improve
* **Set appropriate schedules** - 5 minutes is recommended for production monitoring
* **Monitor crashed instances** separately from state to distinguish between partial and total failures
* **Review metrics regularly** to identify trends and optimize resource allocation
===== Troubleshooting =====
==== "Load BTP Apps" Button Doesn't Work ====
* Verify BTP credentials are correct in the Web Service Connector
* Check network connectivity to BTP API
* Ensure the connector is properly configured and saved
* Review logs for authentication errors
==== No Data Collected ====
* Verify BTP credentials are correct
* Check network connectivity to BTP API
* Ensure the configured App Name matches an existing application, so that its GUID can be resolved (for specific app monitoring)
* Review logs for authentication errors
==== Metrics Show Zero ====
* Application may have no running instances
* All instances might be crashed or stopped
* Check application status in BTP Cockpit
==== Alarms Not Clearing ====
* Verify Auto Clear is enabled
* Check if the condition has actually normalized
* Review alarm threshold configurations
===== Related Documentation =====
[[https://v3-apidocs.cloudfoundry.org/version/3.205.0/#get-stats-for-a-process|Cloud Foundry V3 API Documentation - Get stats for a process]]