When an upstream table breaks, the problem rarely stays put. A stale source feeds a cleaned table, which feeds a metric, which feeds a dashboard, and every monitored asset along the way fires its own alert. One root cause becomes a wall of notifications.
Incident correlation collapses that wall. Instead of one incident per affected table, AnomalyArmor uses your lineage graph to group the cascade into a single incident with a named root cause and a visible blast radius.
Figure: A stale upstream source cascades through lineage, firing an alert on each downstream asset; correlation groups them into one incident with a root cause and blast radius.

How correlation works

When an alert fires for an asset that participates in your lineage graph, AnomalyArmor walks upstream to find the most-upstream asset that is also currently failing. That asset is the root cause, and its incident becomes the single home for the whole cascade.
  • Root cause: the most-upstream failing asset. When several upstream assets are failing at once, the one furthest upstream wins. Ties break first toward the failure that started earliest, then toward the one with the largest downstream impact.
  • Blast radius: every downstream asset whose alert is grouped under the root incident, each labeled with its hop distance (how many lineage steps it sits from the root cause).
  • One notification: the root incident sends a single Slack message. Later alerts from the same cascade post as threaded replies under it rather than new top-level messages, so a channel shows one thread per incident.
Assets without lineage are unaffected: an alert on an asset that has no lineage node creates a standalone incident, exactly as before.
Correlation builds on the lineage you already have in AnomalyArmor. The more complete your lineage graph, the more accurately cascades are grouped and attributed. See Lineage to enrich it.
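The root-cause selection described above can be sketched roughly as follows. This is a minimal illustration, not AnomalyArmor's implementation: the function name, the dictionary shapes, and the use of shortest hop distance to measure "furthest upstream" are all assumptions made for the example.

```python
from collections import deque


def find_root_cause(alerting, upstream, failures):
    """Pick the root cause for an alert on `alerting`.

    upstream: dict of asset -> list of direct upstream assets (hypothetical shape)
    failures: dict of currently failing asset -> (started_at, downstream_impact)
    Returns None when the asset has no lineage node (standalone incident).
    """
    if alerting not in upstream:
        return None

    # Breadth-first walk upstream, recording each ancestor's hop distance.
    hops = {alerting: 0}
    queue = deque([alerting])
    while queue:
        node = queue.popleft()
        for parent in upstream.get(node, []):
            if parent not in hops:
                hops[parent] = hops[node] + 1
                queue.append(parent)

    # Only assets that are currently failing can be the root cause.
    candidates = [a for a in hops if a in failures]
    if not candidates:
        return alerting

    # Furthest upstream first, then earliest start, then largest impact.
    return min(
        candidates,
        key=lambda a: (-hops[a], failures[a][0], -failures[a][1]),
    )


# The cascade from the introduction: source -> cleaned -> metric -> dashboard.
upstream = {
    "dashboard": ["metric"],
    "metric": ["cleaned"],
    "cleaned": ["source"],
    "source": [],
}
failures = {
    "source": (100, 3),
    "cleaned": (105, 2),
    "metric": (110, 1),
    "dashboard": (115, 0),
}
root = find_root_cause("dashboard", upstream, failures)
```

Here an alert on `dashboard` is attributed to `source`, the most-upstream failing asset, while an asset missing from the lineage mapping yields `None` and would be handled as a standalone incident.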

Viewing a correlated incident

Open any incident from the Incidents view. A correlated incident shows:
  • the root cause asset at the top,
  • the blast radius: the downstream assets pulled into the incident, ordered nearest-to-farthest from the root cause,
  • the underlying alerts, each still individually visible and resolvable.
The blast radius contains only the assets that actually alerted, not every downstream asset in the lineage graph, so it reflects exactly what this incident is affecting right now.
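As a rough sketch of how such a list could be derived, the example below computes hop distances from the root cause and keeps only assets that alerted. The function name and data shapes are hypothetical, and treating hop distance as the shortest lineage path is an assumption.

```python
from collections import deque


def blast_radius(root, downstream, alerted):
    """Return (asset, hop_distance) pairs for alerted assets below `root`.

    downstream: dict of asset -> list of direct downstream assets (hypothetical)
    alerted: set of assets whose alerts are grouped under the incident
    """
    # Breadth-first walk downstream gives the shortest hop distance per asset.
    hops = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in downstream.get(node, []):
            if child not in hops:
                hops[child] = hops[node] + 1
                queue.append(child)

    # Keep only assets that actually alerted; skip the root itself.
    members = [(a, hops[a]) for a in alerted if a in hops and a != root]
    # Nearest-to-farthest, with the asset name as a stable tiebreaker.
    return sorted(members, key=lambda m: (m[1], m[0]))


downstream = {
    "source": ["cleaned"],
    "cleaned": ["metric", "report"],
    "metric": ["dashboard"],
}
radius = blast_radius("source", downstream, {"cleaned", "metric", "dashboard"})
```

In this example `report` sits downstream of the root cause but never alerted, so it is left out of the result.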

Auto-resolution

A correlated incident resolves itself once every alert grouped under it has been cleared (resolved or dismissed). A system entry is added to the incident timeline when this happens. If the cascade re-fires later, a new incident is created through the normal flow. Snoozed incidents are never auto-resolved while the snooze is in effect, so an intentional pause is always respected.
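The rule above amounts to a simple check, sketched here with hypothetical names and state strings. Treating an incident with no grouped alerts as not resolvable is an assumption of this example.

```python
CLEARED_STATES = {"resolved", "dismissed"}  # hypothetical state labels


def should_auto_resolve(alert_states, snoozed):
    """Decide whether a correlated incident can resolve itself.

    alert_states: list of state strings, one per alert grouped under the incident
    snoozed: True while a snooze is in effect on the incident
    """
    if snoozed:
        return False  # an intentional pause is always respected
    # Every grouped alert must be cleared; an empty incident is left alone.
    return bool(alert_states) and all(s in CLEARED_STATES for s in alert_states)


ready = should_auto_resolve(["resolved", "dismissed"], snoozed=False)
still_open = should_auto_resolve(["resolved", "firing"], snoozed=False)
paused = should_auto_resolve(["resolved", "resolved"], snoozed=True)
```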

Availability

Incident correlation is on by default for accounts with lineage. It is fully backwards compatible: incidents you already have keep behaving exactly as before, and nothing about alert detection or thresholds changes. To adjust correlation behavior for your account, contact support.
