Dimension Numeric Capture (Regexes) Processors

George Alpizar

Overview

This processor: 

  • Monitors a specific numerical field, such as latency, per unique dimension value, such as api_path
  • Automatically generates statistics, such as counts and averages
  • Detects anomalies, based on the aggregate values grouped by dimensions

Review Parameters

Review the following parameters that you can configure in the Edge Delta App:

Each parameter below is listed by its Visual Editor label, followed by its YAML key and a description.
Name name

Enter a descriptive label for this processor. 

When you create a workflow, you will use this label to enter your processor into the workflow in the visual editor.

This parameter is required. 

Review the following example:

name: "http-request-latencies"
Pattern pattern

Enter a regular expression pattern with a named group.

The matching part of the log will be extracted and converted to a floating-point number.

Named capture groups must follow Go regular expression syntax, such as:

  • "(?P<latency>\\d+)ms"

This parameter is required. 

pattern: "] \"(?P<method>\\w+) took (?P<latency>\\d+) ms"
Dimensions dimensions

This parameter lists fields of named capture groups to use as dynamic dimensions (to group by).

For each dimension that you specify, you must have a corresponding named capture group in the pattern field for the processor.

This parameter is optional. 

Review the following example:

dimensions: ["method"] 
Dimensions As Attributes dimensions_as_attributes

Enter true or false to send dimension key/value pairs as attributes.

If you select false, then the dimension key/value pairs will be appended to the metric name. 

Note

If you enable this parameter, then you can specify the Dimensions Groups parameter.

This parameter is optional.

Review the following example:

dimensions_as_attributes: true 
Interval interval

This parameter is a Go duration string (for example, 30s, 2m, or 1h) that represents the reporting (or rollup) interval for the generated statistics.

The default value is 1m.

This parameter is optional. 

Review the following example:

interval: 2m 
Retention retention

This parameter is a Go duration string (for example, 90m or 4h) that represents how far back the agent should look when generating anomaly scores.

The default value is 3h.

This parameter is optional. 

Review the following example:

retention: 4h 
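
Both interval and retention take duration strings. As a rough illustration of how such strings map to lengths of time, the following Python sketch parses a simplified subset of the format: an unsigned number followed by a unit, optionally chained, such as 1h30m. The parse_duration helper is hypothetical and is not part of the agent. It only shows that, with the default values of 1m and 3h, the retention window is as long as 180 reporting intervals.

```python
import re
from datetime import timedelta

# Seconds per unit for the units accepted by Go duration strings.
_UNIT_SECONDS = {
    "ns": 1e-9, "us": 1e-6, "µs": 1e-6, "ms": 1e-3,
    "s": 1.0, "m": 60.0, "h": 3600.0,
}
# Two-letter units are listed first so "ms" is not read as "m".
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ns|us|µs|ms|s|m|h)")


def parse_duration(text: str) -> timedelta:
    """Parse a simplified duration string such as '2m', '4h', or '1h30m'.

    Hypothetical helper: signs and some edge cases of the full Go
    format are deliberately not handled.
    """
    pos = 0
    total_seconds = 0.0
    while pos < len(text):
        token = _TOKEN.match(text, pos)
        if token is None:
            raise ValueError(f"invalid duration: {text!r}")
        total_seconds += float(token.group(1)) * _UNIT_SECONDS[token.group(2)]
        pos = token.end()
    if pos == 0:
        raise ValueError("empty duration")
    return timedelta(seconds=total_seconds)


if __name__ == "__main__":
    # Default retention (3h) divided by default interval (1m).
    print(parse_duration("3h") / parse_duration("1m"))  # 180.0
```

A longer interval produces fewer, coarser data points inside the same retention window, which is worth keeping in mind when changing either value.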
Not applicable trigger_thresholds

This parameter defines threshold limits, based on calculated metrics.

When a threshold is reached, the agent notifies the corresponding trigger destinations in the same workflow.

You can configure the following trigger threshold types:

  • anomaly_probability_percentage
  • upper_limit_per_interval
  • lower_limit_per_interval
  • consecutive

This parameter is optional. 

Review the following example:

trigger_thresholds:
  anomaly_probability_percentage: 90
  upper_limit_per_interval: 250
  consecutive: 5
Anomaly Probability Percentage  anomaly_probability_percentage (trigger_thresholds)

This parameter sets the anomaly probability (confidence level) that must be reached to trigger an alert.

For example, if you enter 90, then an alert will trigger when there is a 90% probability that the detected pattern is an anomaly. 

Enter a number between 0 and 100.

There is no default value. 

This parameter is optional.

Review the following example:  

trigger_thresholds:
  anomaly_probability_percentage: 90
Upper Limit Per Interval  upper_limit_per_interval (trigger_thresholds)

This parameter sets a static threshold to trigger an alert.  

If the number of events that match the given pattern for the most recent reporting interval is greater than the limit, then an alert will be triggered.

There is no default value. 

This parameter is optional.

Review the following example:  

trigger_thresholds:
  upper_limit_per_interval: 250
Lower Limit Per Interval  lower_limit_per_interval (trigger_thresholds)

This parameter sets a static threshold to trigger an alert.

If the number of events that match the given pattern for the most recent reporting interval is less than the limit, then an alert will trigger.

There is no default value. 

This parameter is optional.

Review the following example: 

trigger_thresholds:
  lower_limit_per_interval: 10
Consecutive consecutive (trigger_thresholds)

This parameter sets how many consecutive times a threshold must be exceeded to trigger an alert.  

The default value is 0, which means that any condition that is met will trigger an alert. 

This parameter is optional.

Review the following example:

trigger_thresholds:
  consecutive: 5
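
To make the interaction between a static limit and consecutive more concrete, here is a minimal Python sketch. It is not the agent's implementation. The alert_intervals function is hypothetical, and it assumes that an alert fires on the Nth breach in a row when consecutive is N, and on every breach when consecutive is 0. The exact counting rule used by the agent may differ.

```python
from typing import List, Sequence


def alert_intervals(counts: Sequence[float],
                    upper_limit: float,
                    consecutive: int = 0) -> List[int]:
    """Return the indexes of reporting intervals that would raise an alert.

    Assumption: a breach means count > upper_limit, and the alert fires
    once the current run of breaches is at least `consecutive` long
    (a value of 0 is treated like 1, so any single breach fires).
    """
    required = max(consecutive, 1)
    streak = 0
    fired = []
    for index, count in enumerate(counts):
        # Extend the run on a breach, reset it otherwise.
        streak = streak + 1 if count > upper_limit else 0
        if streak >= required:
            fired.append(index)
    return fired


if __name__ == "__main__":
    counts = [300, 260, 100, 251, 252, 253]
    print(alert_intervals(counts, upper_limit=250))                 # [0, 1, 3, 4, 5]
    print(alert_intervals(counts, upper_limit=250, consecutive=3))  # [5]
```

Under these assumptions, a higher consecutive value trades alerting speed for fewer alerts on short spikes.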
Filters filters

Select an existing filter to add to this processor. 

To learn how to create a filter, see Filters.

This parameter is optional. 

Review the following example:

filters:
- extract_severity

Review Sample Configuration

Review the following example configuration: 

regexes:
  - name: "http"
    pattern: "(?P<method>\\w+) took (?P<latency>\\d+) ms"
    dimensions: ["method"]
    trigger_thresholds:
      anomaly_probability_percentage: 90
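
As an informal check of what this pattern captures, the following Python sketch applies the same expression to a hypothetical log line. Python's re module accepts the same (?P<name>...) named-group syntax, although its engine is not the one the agent uses, so treat this as an approximation. The doubled backslashes in the YAML are string escapes; the regular expression itself contains \w and \d. The extract function and the sample line are illustrative assumptions.

```python
import re
from typing import Dict, Optional, Sequence, Tuple

# Same expression as the sample configuration, written as a raw string.
PATTERN = re.compile(r"(?P<method>\w+) took (?P<latency>\d+) ms")


def extract(line: str,
            numeric_group: str = "latency",
            dimensions: Sequence[str] = ("method",)
            ) -> Optional[Tuple[Dict[str, str], float]]:
    """Return (dimension values, numeric value) for a matching line.

    Hypothetical helper: returns None when the line does not match.
    """
    match = PATTERN.search(line)
    if match is None:
        return None
    dimension_values = {name: match.group(name) for name in dimensions}
    # The captured text is converted to a floating-point number.
    return dimension_values, float(match.group(numeric_group))


if __name__ == "__main__":
    print(extract("GetAlbums took 12 ms"))  # ({'method': 'GetAlbums'}, 12.0)
    print(extract("unrelated log line"))    # None
```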

Review Sample Scenario 

To better understand how this processor works, review the following scenario: 

The following logs are fed into the processor: 

  • "GetAlbums took 12 ms"
  • "GetRecords took 16 ms"

When the agent sees these logs, the agent will generate the following metrics: 

  • http_method_getalbums_latency.count
  • http_method_getalbums_latency.avg
  • http_method_getalbums_latency.min
  • http_method_getalbums_latency.max
  • http_method_getalbums_latency.anomaly1
  • http_method_getrecords_latency.count
  • http_method_getrecords_latency.avg
  • http_method_getrecords_latency.min
  • http_method_getrecords_latency.max
  • http_method_getrecords_latency.anomaly1

Metric names follow this format:

  • {processor name}_{dimension name}_{dimension value}_{numeric capture group name}.{stat type}

For each distinct dimension value (getalbums and getrecords), numeric statistics are calculated and reported with a metric name that contains the dimension name and value.
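
The naming format and the per-dimension grouping can be sketched in Python as follows. This is only an illustration under stated assumptions: the summarize function is hypothetical, the lowercasing of the dimension value is inferred from the metric names listed above, and the anomaly1 statistic is left out because its calculation is not described here.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def summarize(processor: str,
              dimension: str,
              capture: str,
              samples: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Group (dimension value, number) samples and name the statistics.

    Names follow {processor}_{dimension}_{value}_{capture}.{stat}.
    """
    groups = defaultdict(list)
    for value, number in samples:
        # Assumption: dimension values are lowercased in metric names.
        groups[value.lower()].append(number)

    metrics = {}
    for value, numbers in groups.items():
        prefix = f"{processor}_{dimension}_{value}_{capture}"
        metrics[f"{prefix}.count"] = len(numbers)
        metrics[f"{prefix}.avg"] = sum(numbers) / len(numbers)
        metrics[f"{prefix}.min"] = min(numbers)
        metrics[f"{prefix}.max"] = max(numbers)
    return metrics


if __name__ == "__main__":
    samples = [("GetAlbums", 12.0), ("GetAlbums", 18.0), ("GetRecords", 16.0)]
    for name, value in summarize("http", "method", "latency", samples).items():
        print(name, value)
```

The number of metric names grows with the number of distinct dimension values, which is the usual reason to prefer a dimension with a small, stable set of values.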

If the above example had set dimensions_as_attributes: true, then the metric name would not change for each dimension value. Instead, the dimension value would be added as an attribute, and the following metrics would be generated:

  • name: http_latency.count, attributes: {method="GetAlbums"}
  • name: http_latency.avg, attributes: {method="GetAlbums"}
  • name: http_latency.min, attributes: {method="GetAlbums"}
  • name: http_latency.max, attributes: {method="GetAlbums"}
  • name: http_latency.anomaly1, attributes: {method="GetAlbums"}
  • name: http_latency.count, attributes: {method="GetRecords"}
  • name: http_latency.avg, attributes: {method="GetRecords"}
  • name: http_latency.min, attributes: {method="GetRecords"}
  • name: http_latency.max, attributes: {method="GetRecords"}
  • name: http_latency.anomaly1, attributes: {method="GetRecords"}
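
For comparison, the attribute-based layout can be sketched in Python as follows. The summarize_as_attributes function and the record shape (name, attributes, value) are assumptions made for illustration, and anomaly1 is omitted because its calculation is not described here.

```python
from typing import Dict, Iterable, List, Tuple


def summarize_as_attributes(processor: str,
                            dimension: str,
                            capture: str,
                            samples: Iterable[Tuple[str, float]]) -> List[Dict]:
    """Keep one metric name per statistic and carry the dimension as an attribute."""
    groups: Dict[str, List[float]] = {}
    for value, number in samples:
        groups.setdefault(value, []).append(number)

    records = []
    for value, numbers in groups.items():
        stats = {
            "count": len(numbers),
            "avg": sum(numbers) / len(numbers),
            "min": min(numbers),
            "max": max(numbers),
        }
        for stat, result in stats.items():
            records.append({
                # The name no longer contains the dimension name or value.
                "name": f"{processor}_{capture}.{stat}",
                "attributes": {dimension: value},
                "value": result,
            })
    return records


if __name__ == "__main__":
    samples = [("GetAlbums", 12.0), ("GetAlbums", 18.0), ("GetRecords", 16.0)]
    for record in summarize_as_attributes("http", "method", "latency", samples):
        print(record)
```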

 
