Create Alarms

Setting Alarms

You can use the Monitoring service's alarm system to alert interested parties when a metric condition is met. You can create alarms on individual resources or on an entire compartment.

Ops Insights provides convenient access to the Monitoring service's alarm creation functionality directly from any fleet resource page.

To create an alarm:

  1. Open the navigation menu, click Observability & Management, and then click Ops Insights.
  2. In the left pane, click Administration, and then click the fleet resource, such as Database fleet, Host fleet, or Exadata fleet.
  3. Locate the resource for which you want to create an alarm. Click the Actions menu for the row, and then select Add alarms.
  4. In the Add alarms to metrics panel, review the available metrics. To view the suggested trigger parameters and key dimensions for a metric, expand View alarm details.
  5. Click Add alarm for the metric. The Monitoring service Create alarm page is displayed, with the required metric details.
    Note

    By default, an alarm applies to an individual resource. If you want the alarm to apply to an entire compartment, remove the resourceID.
  6. Enter an alarm name, and set the threshold and trigger delay.
  7. Define alarm notifications by selecting a Destination service and the topic or channel to notify. You can also create a new topic.
  8. Click Create.
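
The console steps above boil down to a query, a trigger delay, and a notification destination. The following Python sketch only assembles those values into a dictionary and does not call any service. The helper names `build_query` and `build_alarm_definition`, the OCIDs, and the choice of metric are hypothetical, and the field names are assumed to resemble the Monitoring service's alarm properties; check the Monitoring API reference before relying on them.

```python
def build_query(metric, interval, statistic, operator, threshold, resource_id=None):
    """Assemble an MQL alarm expression.

    Leaving resource_id as None omits the resourceId dimension, which
    corresponds to the note in step 5: the alarm then covers every
    resource in the compartment instead of a single resource.
    """
    dimensions = f'{{resourceId = "{resource_id}"}}' if resource_id else ""
    return f"{metric}[{interval}]{dimensions}.{statistic}() {operator} {threshold}"


def build_alarm_definition(name, query, topic_id, trigger_delay="PT5M"):
    """Collect the values entered in steps 6 and 7 (field names are assumptions)."""
    return {
        "displayName": name,               # step 6: alarm name
        "query": query,                    # metric, threshold, and dimensions
        "pendingDuration": trigger_delay,  # step 6: trigger delay, ISO 8601 duration
        "destinations": [topic_id],        # step 7: notification topic
        "isEnabled": True,
    }


# Placeholder identifiers for illustration only.
single_resource = build_query(
    "DataFlowDelayInHrs", "1h", "max", ">", 24, resource_id="ocid1.example.resource"
)
whole_compartment = build_query("DataFlowDelayInHrs", "1h", "max", ">", 24)

alarm = build_alarm_definition(
    "DataFlowDelayAlarm", whole_compartment, "ocid1.example.topic", trigger_delay="PT1H"
)
print(single_resource)
print(alarm)
```

Keeping the query builder separate from the alarm definition makes it easy to switch between a per-resource alarm and a compartment-wide alarm by changing a single argument.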

Specific Alarm Conditions

Data Flow Delays

You can create alarms for conditions defined on the DataFlowDelayInHrs metric. The following recommended alarms each include a corresponding Monitoring Query Language (MQL) example that you can use as a template to define your own alarms. For more information about setting up alarms, see Managing Alarms.

Alarm Name: DataFlowSourceAlarmFor1HrData
MQL Alarm Definition: DataFlowDelayInHrs[1h]{dataProcessingFrequencyInHrs="1.00"}.grouping(telemetrySourceType, sourceIdentifier).mean() > 48
Pending duration: 1h
Description: For a sourceType and sourceIdentifier with a 1-hour data processing frequency, the mean value (across targets) of DataFlowDelayInHrs is greater than 48 hours for 6 continuous hours. This indicates that the problem is at the whole source level.

Alarm Name: DataFlowResourceAlarmFor1HrData
MQL Alarm Definition: DataFlowDelayInHrs[1h]{dataProcessingFrequencyInHrs="1.00"}.grouping(telemetrySourceType, resourceId, resourceDisplayName, sourceIdentifier).max() > 24
Pending duration: 1h
Description: For a sourceType, resource, and sourceIdentifier, DataFlowDelayInHrs is more than 24 hours for 1 continuous day, for data that is processed every 1 hour.

Alarm Name: DataFlowResourceAlarmFor3HrData
MQL Alarm Definition: DataFlowDelayInHrs[3h]{dataProcessingFrequencyInHrs="3.00"}.grouping(telemetrySourceType, resourceId, sourceIdentifier).max() > 48
Pending duration: 1h
Description: For a sourceType, resource, and sourceIdentifier, DataFlowDelayInHrs is more than 48 hours for 1 continuous day, for data that is processed every 3 hours.

Alarm Name: DataFlowResourceAlarmForDailyData
MQL Alarm Definition: DataFlowDelayInHrs[3h]{dataProcessingFrequencyInHrs="24.00"}.grouping(telemetrySourceType, resourceId, sourceIdentifier).mean() > 72
Pending duration: 1h
Description: For a sourceType, resource, and sourceIdentifier, DataFlowDelayInHrs is more than 72 hours for 1 continuous day, for data that is processed every 24 hours.
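
To make the grouping behavior in these queries concrete, the following Python sketch mimics what `.grouping(...).mean() > 48` does for a single evaluation interval: it buckets data points by the listed dimensions, aggregates each bucket, and keeps the buckets that exceed the threshold. It is an illustration of the idea rather than the Monitoring service's implementation, it does not model the pending duration, and the function name `breaching_groups` and the sample dimension values are made up.

```python
from collections import defaultdict
from statistics import mean


def breaching_groups(datapoints, group_keys, threshold, statistic=mean):
    """Return {group: aggregated value} for groups whose value exceeds threshold."""
    grouped = defaultdict(list)
    for point in datapoints:
        # One bucket per unique combination of the grouping dimensions.
        key = tuple(point["dimensions"][k] for k in group_keys)
        grouped[key].append(point["value"])

    result = {}
    for key, values in grouped.items():
        aggregated = statistic(values)
        if aggregated > threshold:
            result[key] = aggregated
    return result


# Hypothetical DataFlowDelayInHrs readings from two sources, two targets each.
sample = [
    {"dimensions": {"telemetrySourceType": "agent", "sourceIdentifier": "src-a"}, "value": 60.0},
    {"dimensions": {"telemetrySourceType": "agent", "sourceIdentifier": "src-a"}, "value": 50.0},
    {"dimensions": {"telemetrySourceType": "agent", "sourceIdentifier": "src-b"}, "value": 10.0},
    {"dimensions": {"telemetrySourceType": "agent", "sourceIdentifier": "src-b"}, "value": 80.0},
]

keys = ["telemetrySourceType", "sourceIdentifier"]
by_mean = breaching_groups(sample, keys, 48)               # source-level check
by_max = breaching_groups(sample, keys, 24, statistic=max)  # worst-case check
print(by_mean)
print(by_max)
```

In this sample, the mean-based check flags only the source whose targets are all delayed, while the max-based check flags any group with at least one badly delayed reading. That difference is why the source-level alarm uses mean() and the resource-level alarms use max().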

About Forecast Issues

Ops Insights provides metrics to help you configure alarms for high (default value > 75%) or low (default value < 25%) utilization for a given resource and resource metric. You can also customize these forecast metric thresholds. Setting threshold values that are relevant to a specific target type gives you more granular and more accurate capacity forecasts, so that you can manage resources proactively. For more information on setting threshold values, see Changing Utilization Thresholds.

The forecast metrics are generated using at most 100 days of historical data and a forecast window of 90 days. You can verify the forecast from the Ops Insights console by selecting 1 year as the time range and High or Low utilization for 90 days.

The following sample shows recommended alarms that you can set up, each with a corresponding Monitoring Query Language (MQL) example that you can use as a template to define your own alarms. For more information about setting up alarms, see Managing Alarms.

Alarm Name: DaysToReachHighUtilizationStorageLessThan30D
MQL: DaysToReachHighUtilization[1D]{resourceMetric="STORAGE", resourceType="Database", exceededForecastWindow="false"}.grouping(telemetrySource, resourceId).mean() < 30
Description: For sourceType, resourceType, resourceMetric, and sourceIdentifier, DaysToReachHighUtilization is less than 30 days.

Alarm Name: DaysToReachHighUtilizationExaStorage
MQL: DaysToReachHighUtilization[1D]{resourceMetric="STORAGE", resourceType="Database", exceededForecastWindow="false"}.grouping(telemetrySource, resourceId).mean() < 30
Description: For sourceType, resourceType, resourceMetric, and sourceIdentifier, DaysToReachHighUtilization is less than 30 days.

Note

For linear and seasonality-aware forecasts, the forecast window is 90 days. If a specific resource has a forecast of more than 90 days, the metric value shows 91 days by default. For AutoML, the forecast window depends on the number of data points available.
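
As a rough illustration of how a days-to-reach value could be derived, the following Python sketch fits a straight line to at most 100 days of daily utilization history, projects it forward, and reports 91 when the high-utilization threshold is not reached within the 90-day window. This is a simple least-squares example under stated assumptions, not the forecasting model that Ops Insights uses; the function name `days_to_reach_high_utilization` and the sample data are hypothetical.

```python
import math


def days_to_reach_high_utilization(history, threshold=75.0, forecast_window=90,
                                   max_history=100):
    """Estimate days until utilization (percent) reaches threshold.

    history: daily utilization percentages, oldest first.
    Returns forecast_window + 1 when the threshold is not reached in the window.
    """
    recent = history[-max_history:]  # use at most max_history days
    n = len(recent)
    if n < 2:
        raise ValueError("need at least two data points to fit a trend")

    # Ordinary least-squares fit of y = intercept + slope * x, with x = 0..n-1.
    x_mean = (n - 1) / 2
    y_mean = sum(recent) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(recent))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    current = intercept + slope * (n - 1)  # fitted value for the latest day
    if current >= threshold:
        return 0  # already at or above the threshold
    if slope <= 0:
        return forecast_window + 1  # flat or falling trend never reaches it
    days = math.ceil((threshold - current) / slope)
    return min(days, forecast_window + 1)


# Hypothetical storage utilization growing 0.4 percentage points per day.
growing = [40 + 0.4 * day for day in range(50)]
flat = [30.0] * 50

growing_days = days_to_reach_high_utilization(growing)
flat_days = days_to_reach_high_utilization(flat)
print(growing_days, flat_days)
```

An alarm such as DaysToReachHighUtilizationStorageLessThan30D would fire only when the estimated value drops below 30. A value of 91 indicates that the threshold lies beyond the forecast window.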