
[➕ Feature]: Add "correlate" rule type to deduplication rules #6567

Description

@ausias-armesto

Is your feature request related to a problem? Please describe.
Keep's deduplication rules currently have a single purpose: controlling which fields are used to compute an alert's fingerprint (its identity key). This lets operators override how Keep decides "is this the same alert as one I've already seen?"

This works well for preventing stale alerts from being reused. But it doesn't support a common monitoring use case:

  • The same alert fires on multiple instances simultaneously (e.g., the same check failing across several hosts, regions, or services). These are genuinely distinct alert events and must be stored individually. However, before they enter the workflows pipeline, they should be correlated — so the operator can act on that relationship: suppress redundant notifications, route them together or apply any other custom logic.

The key idea is that correlation is pre-processing, not a prescribed action. By flagging alerts as correlated before they reach workflows, the operator retains full control over what happens next.

Describe the solution you'd like
Add a rule_type field to AlertDeduplicationRule. Each provider can have one rule of each type:

| rule_type | Behavior |
| --- | --- |
| split (default) | Existing behavior. Computes the alert's primary fingerprint. |
| correlate (new) | Computes a secondary correlation_fingerprint without changing the primary fingerprint. |
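As a rough illustration of the proposed field, the sketch below models a rule with a `rule_type` and enforces the "one rule of each type per provider" constraint. `DedupRuleSketch`, `RuleType`, and `validate_one_rule_per_type` are hypothetical names made up for this example; they are not Keep's actual `AlertDeduplicationRule` model or API.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class RuleType(str, Enum):
    SPLIT = "split"          # existing behavior: fields feed the primary fingerprint
    CORRELATE = "correlate"  # proposed: fields feed correlation_fingerprint


@dataclass
class DedupRuleSketch:
    """Hypothetical stand-in for AlertDeduplicationRule (assumption, not Keep's model)."""
    provider_id: str
    fingerprint_fields: list[str]
    rule_type: RuleType = RuleType.SPLIT  # default keeps existing rules unchanged


def validate_one_rule_per_type(rules: list[DedupRuleSketch]) -> None:
    """Raise ValueError if a provider has two rules of the same type."""
    seen: set[tuple[str, RuleType]] = set()
    for rule in rules:
        key = (rule.provider_id, rule.rule_type)
        if key in seen:
            raise ValueError(
                f"provider {rule.provider_id!r} already has a {rule.rule_type.value!r} rule"
            )
        seen.add(key)
```

Defaulting `rule_type` to `split` means existing rules would keep their current meaning without migration, which is one reason to prefer a default over a required field.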

Three new fields are set on AlertDto before it enters the workflow pipeline:

| Field | Description |
| --- | --- |
| correlation_fingerprint | Hash of the correlate rule's fields. |
| is_correlated | true if another active (non-resolved, non-suppressed) alert with the same correlation_fingerprint already exists. |
| correlated_to | Fingerprint of the first (representative) alert in the group. |
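A minimal sketch of how these three fields could be derived, assuming alerts are plain dicts and that `active_alerts` is already filtered to non-resolved, non-suppressed alerts in arrival order. The SHA-256 hash and the helper names `correlation_fingerprint` and `enrich` are assumptions for illustration; the issue does not specify the hash function or where this logic would live in Keep.

```python
import hashlib
import json


def correlation_fingerprint(alert: dict, fields: list) -> str:
    """Hash the values of the correlate rule's fields (SHA-256 is an assumption)."""
    values = [str(alert.get(name)) for name in fields]
    # JSON-encode the list so field boundaries cannot be confused.
    return hashlib.sha256(json.dumps(values).encode("utf-8")).hexdigest()


def enrich(alert: dict, fields: list, active_alerts: list) -> dict:
    """Set the three proposed fields on an alert before it reaches workflows.

    active_alerts is assumed to hold only active alerts, oldest first,
    each already carrying its own correlation_fingerprint.
    """
    cfp = correlation_fingerprint(alert, fields)
    # The first active alert with the same hash is the group's representative.
    representative = next(
        (a for a in active_alerts if a.get("correlation_fingerprint") == cfp),
        None,
    )
    alert["correlation_fingerprint"] = cfp
    alert["is_correlated"] = representative is not None
    alert["correlated_to"] = representative["fingerprint"] if representative else None
    return alert
```

Note that the primary `fingerprint` is only read here, never modified, matching the requirement that the correlate rule leaves the primary fingerprint unchanged.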

Example

  • Split rule: [fingerprint, startsAt] — each firing event is distinct
  • Correlate rule: [alertname, service] — groups all hosts firing the same alert for the same service

| Host | Alert name | Service | is_correlated | correlated_to |
| --- | --- | --- | --- | --- |
| host-a | CheckoutFailureRateHigh | payments | false | null |
| host-b | CheckoutFailureRateHigh | payments | true | fingerprint of host-a |
| host-c | CheckoutFailureRateHigh | payments | true | fingerprint of host-a |

Three separate alerts are stored, and each is enriched with correlation metadata before it reaches workflows.
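To show what an operator could do with this metadata downstream, here is a small sketch that takes the three enriched alerts from the example and either notifies only for the representative or groups hosts by representative. The fingerprint values (`fp-host-a` and so on) and the function names are placeholders invented for this example; how a real workflow would read these fields is left open by the proposal.

```python
from collections import defaultdict

# The three stored alerts from the example, with placeholder fingerprints.
alerts = [
    {"fingerprint": "fp-host-a", "host": "host-a", "is_correlated": False, "correlated_to": None},
    {"fingerprint": "fp-host-b", "host": "host-b", "is_correlated": True, "correlated_to": "fp-host-a"},
    {"fingerprint": "fp-host-c", "host": "host-c", "is_correlated": True, "correlated_to": "fp-host-a"},
]


def alerts_to_notify(enriched: list) -> list:
    """Suppress redundant notifications: keep only non-correlated alerts."""
    return [a for a in enriched if not a["is_correlated"]]


def hosts_by_representative(enriched: list) -> dict:
    """Route together: map each representative fingerprint to its member hosts."""
    groups = defaultdict(list)
    for a in enriched:
        # A representative has no correlated_to, so it anchors its own group.
        representative = a["correlated_to"] or a["fingerprint"]
        groups[representative].append(a["host"])
    return dict(groups)
```

Both functions are just two of many possible policies; the point of the proposal is that the choice stays with the operator.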

Describe alternatives you've considered

Option A — custom deduplication rule grouping by shared fields
This makes all instance alerts share the same fingerprint. They get merged into a single alert — the per-instance context is lost, and only one alert appears in Keep. This is semantically wrong: these are distinct alert events.
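The loss of per-instance context in Option A can be seen in a toy model, where a dict keyed by fingerprint stands in for alert storage. This is a deliberate simplification and not how Keep persists alerts; the SHA-256 hash and the `fingerprint` helper are likewise assumptions.

```python
import hashlib


def fingerprint(alert: dict, fields: list) -> str:
    """Toy fingerprint: hash of the selected field values (assumed scheme)."""
    raw = "|".join(str(alert.get(name)) for name in fields)
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


alerts = [
    {"host": host, "alertname": "CheckoutFailureRateHigh", "service": "payments"}
    for host in ("host-a", "host-b", "host-c")
]

# Toy store keyed by primary fingerprint: same key means "same alert".
store = {}
for alert in alerts:
    # Using only the shared fields as the primary fingerprint collapses all hosts.
    store[fingerprint(alert, ["alertname", "service"])] = alert

# Including the host in the fingerprint keeps the alerts distinct.
distinct = {fingerprint(a, ["alertname", "service", "host"]): a for a in alerts}
```

In this model `store` ends up with a single entry while `distinct` keeps three, which is the trade-off the correlate rule is meant to avoid: distinct storage plus a shared secondary key.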

Option B — provider-side grouping (e.g. Alertmanager group_by)
Provider-side grouping only affects how the provider batches notifications before sending them to Keep. It has no effect on Keep's alert model. Keep still receives and stores each alert individually, and fires a workflow/notification for each one.

Option C — Keep's alert grouping / incident correlation
Keep's grouping feature is designed to create incidents automatically. The operator's goal here is different: they want to pre-process alerts and decide for themselves whether to escalate, with no automatic incident creation.

Option D — Keep's Correlation feature
Keep's Correlation feature can group related alerts, but it does so by creating an incident automatically. This couples the act of recognizing related alerts with the decision to escalate them — which may not be desirable. The operator may want to correlate alerts as a first step and only create an incident manually after reviewing the situation.
