Cluster Processors

George Alpizar

Overview

This processor type finds patterns in logs, and then groups (or clusters) these patterns based on similarities.

This processor populates the Patterns page. 

  • Most accounts, especially new ones, come with default processors already configured. If your account does not have any processors configured, then the Patterns page will be empty.

Note

Analyzing patterns in high-volume log environments can strain your computing resources. As a result, by default, this processor type processes only 200 logs per second from each source, such as a container or file.

You can change this setting with the cpu_friendly and throttle_limit_per_sec parameters.


Review Sample Configuration

Review the following sample configuration:

cluster:
    name: clustering
    num_of_clusters: 100
    samples_per_cluster: 20
    reporting_frequency: 30s
    retention: 10m
    cpu_friendly: true
    throttle_limit_per_sec: 200
    filters:
    - info

Review Parameters

Review the following parameters that you can configure in the Edge Delta App.


name

Required

Enter a descriptive label for this processor. 

When you create a workflow, you use this label to reference the processor in that workflow.

Review the following example: 

name: clustering
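
The following sketch shows how the label might be referenced from a workflow. It assumes a workflows section with input_labels, processors, and destinations keys; the workflow name, input label, and destination label are hypothetical placeholders, so check them against your own configuration.

```yaml
workflows:
    example_workflow:              # hypothetical workflow name
        input_labels:
        - application_logs         # hypothetical input label
        processors:
        - clustering               # must match the processor's name value
        destinations:
        - streaming_destination    # hypothetical destination label
```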

reporting_frequency

Required

This parameter sets how often clustering results, which include patterns and samples, are sent to a streaming destination.

Review the following example: 

reporting_frequency: 1m

num_of_clusters

Required

This parameter sets the maximum number of clusters to generate for an input.

Review the following example: 

num_of_clusters: 100

samples_per_cluster

Optional

This parameter sets the number of sample events to report when providing cluster details. 

Review the following example: 

samples_per_cluster: 20

retention

Optional

This parameter is a Go duration string that sets a cluster's retention period.

Clusters that do not receive any new logs within the retention period will be dropped and will no longer be reported until matching logs appear again.

For example, if you set this parameter to 10 minutes (10m), then clusters without new logs in the last 10 minutes will be dropped.

The default retention period is 1 hour (1h).

Review the following example: 

retention: 30m
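
A Go duration string is a sequence of decimal numbers, each followed by a unit suffix (ns, us, ms, s, m, or h), and several number-unit pairs can be chained. There is no day unit, so one day is written as 24h. The sketch below reuses the sample configuration from this page; the specific retention values are illustrative only.

```yaml
cluster:
    name: clustering
    num_of_clusters: 100
    reporting_frequency: 30s
    # Drop clusters that received no new logs in the last 90 minutes.
    retention: 1h30m
    # Other valid duration strings (use only one retention line):
    # retention: 45s
    # retention: 10m
    # retention: 24h
```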

cpu_friendly

Optional

This parameter enables CPU-based rate limiting. Specifically, the agent reads the soft_cpu_limit parameter from Agent Settings and drops a percentage of events to keep the agent's CPU usage below that limit.

By default, this parameter is disabled. 

Note

This parameter applies only to users who have more than 1,000 logs per second.

Review the following example: 

cpu_friendly: true

throttle_limit_per_sec

Optional

This parameter sets a limit on the number of logs that can be clustered per second from a single source. 

If the cpu_friendly parameter is enabled, then this parameter will be ignored.

Review the following example: 

throttle_limit_per_sec: 200
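
Because throttle_limit_per_sec is ignored when cpu_friendly is enabled, the two parameters are effectively alternatives. The sketch below contrasts the two approaches using the sample configuration from this page; the value 500 is illustrative and not a recommendation.

```yaml
# Option A: fixed cap per source; cpu_friendly stays at its default (disabled).
cluster:
    name: clustering
    num_of_clusters: 100
    reporting_frequency: 30s
    throttle_limit_per_sec: 500

# Option B (use instead of Option A): CPU-based rate limiting.
# The agent relies on soft_cpu_limit from Agent Settings, and any
# throttle_limit_per_sec value would be ignored.
# cluster:
#     name: clustering
#     num_of_clusters: 100
#     reporting_frequency: 30s
#     cpu_friendly: true
```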

filters

Optional

Enter the name of an existing filter to apply to this processor.

To learn how to create a filter, see Filters.

Review the following example:

filters:
- extract_severity

include_pattern_info_in_samples

Optional

Enter true to include pattern information (pattern, pattern count, and sentiment score) as tags in each cluster sample, or false to omit it.

The default value is false. 

Review the following example: 

include_pattern_info_in_samples: true 

Additional Information

To learn how Edge Delta calculates and clusters invariant components, see Processors Overview, specifically the Learn about Clustered Invariants section. 

