Data quality monitoring ensures your data is fresh, complete, and reliable. AnomalyArmor helps you detect data issues before they impact business decisions.

Why Data Quality Matters

Bad data leads to bad decisions.

[Figure: Impact of bad data on business decisions]

Data Quality Dimensions

AnomalyArmor monitors key data quality dimensions:
| Dimension | What It Means | How We Monitor |
| --- | --- | --- |
| Freshness | Is data up to date? | Timestamp monitoring, SLAs |
| Completeness | Did the right amount arrive? | Row count monitoring, ML anomaly detection |
| Metrics | Are column values correct? | Statistical monitoring, anomaly detection |
| Schema | Is structure correct? | Schema drift detection |
| Availability | Is data accessible? | Discovery success/failure |

Monitoring Capabilities

Freshness Monitoring

Track when data was last updated and detect stale data before it impacts downstream consumers.

Row Count Monitoring

Monitor row counts with ML-based anomaly detection or explicit thresholds.

Data Quality Metrics

Monitor null percentages, distinct counts, and other column-level statistics. Detect anomalies automatically.

Schema Monitoring

Detect structural changes to your database that could break pipelines and reports.

Report Badges

Embed data quality status indicators in dashboards, wikis, and operational tools.

How Data Quality Monitoring Works

[Figure: Data quality monitoring pipeline from discovery to alerts]

  1. Discovery runs on your configured schedule, and metrics are captured at defined intervals.
  2. Metadata is collected, including schema, timestamps, and metric values (row counts, null %, etc.).
  3. Collected values are compared against expectations (SLAs, statistical baselines, previous state).
  4. Alerts fire when expectations aren’t met or anomalies are detected.

Getting Started

Set Up Freshness Monitoring

  1. Navigate to an asset
  2. Click the Freshness tab
  3. Select a timestamp column
  4. Set your expected update frequency
  5. Configure alert threshold
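
To make the relationship between the expected update frequency and the alert threshold concrete, here is a minimal sketch of a freshness check. It is not AnomalyArmor's implementation: the function name `freshness_status`, its parameters, and the three-level result are hypothetical, chosen only to illustrate the idea.

```python
from datetime import datetime, timedelta, timezone


def freshness_status(last_updated: datetime,
                     expected_every: timedelta,
                     alert_after: timedelta,
                     now: datetime) -> str:
    """Classify a table by the age of its newest timestamp.

    Hypothetical helper: `expected_every` stands for the expected update
    frequency and `alert_after` for the alert threshold.
    """
    age = now - last_updated
    if age <= expected_every:
        return "fresh"
    if age <= alert_after:
        return "late"   # past the expected frequency, below the alert threshold
    return "stale"      # alert threshold exceeded


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
last = datetime(2024, 1, 1, 9, 30, tzinfo=timezone.utc)  # 2.5 hours old
print(freshness_status(last, timedelta(hours=1), timedelta(hours=2), now))  # stale
```

Passing `now` explicitly keeps the check deterministic, which makes it easier to test than one that reads the clock internally.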

Set Up Row Count Monitoring

  1. Navigate to an asset
  2. Click the Data Quality tab
  3. Scroll to the Row Count Monitoring section
  4. Click Create Schedule
  5. Configure time window and check interval
  6. Choose auto-learn or explicit thresholds
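
The difference between auto-learned and explicit thresholds can be sketched as follows. The source does not describe the ML model AnomalyArmor uses, so a simple z-score against recent history stands in for it here; the function name, parameters, and the default limit of 3 standard deviations are all assumptions.

```python
from statistics import mean, stdev


def row_count_anomaly(history, current, min_rows=None, max_rows=None, z_limit=3.0):
    """Return True if `current` looks anomalous.

    Explicit mode: used when min_rows or max_rows is given.
    Learned mode: otherwise, compare against a z-score baseline
    (a stand-in for a real anomaly detection model).
    """
    if min_rows is not None or max_rows is not None:
        too_low = min_rows is not None and current < min_rows
        too_high = max_rows is not None and current > max_rows
        return too_low or too_high

    if len(history) < 2:
        return False  # not enough data to learn a baseline yet
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return current != mu  # perfectly flat history: any change is unusual
    return abs(current - mu) / sigma > z_limit


history = [1000, 1020, 980, 1010, 990]
print(row_count_anomaly(history, 400))                # True: far below the learned baseline
print(row_count_anomaly(history, 1005))               # False: within normal variation
print(row_count_anomaly(history, 400, min_rows=500))  # True: below the explicit minimum
```

Explicit thresholds suit tables with a known floor or ceiling, while a learned baseline adapts to tables whose normal volume is not known in advance.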

Set Up Data Quality Metrics

  1. Navigate to an asset
  2. Click the Metrics tab
  3. Click Create Metric
  4. Select metric type (null %, distinct count, etc.)
  5. Configure capture interval
  6. Enable anomaly detection (optional)
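
For readers unfamiliar with the metric types, the sketch below shows what null percentage and distinct count mean for a single column. It operates on an in-memory list purely for illustration; the function name `column_metrics` and the output keys are hypothetical, and a real monitor would compute these with queries against the data source.

```python
def column_metrics(values):
    """Compute null percentage and distinct count for one column's values."""
    total = len(values)
    nulls = sum(1 for v in values if v is None)
    non_null = [v for v in values if v is not None]
    return {
        # Share of rows where the column is NULL, as a percentage
        "null_percent": (100.0 * nulls / total) if total else 0.0,
        # Number of different non-null values
        "distinct_count": len(set(non_null)),
    }


emails = ["a@x.com", None, "b@x.com", "a@x.com", None]
print(column_metrics(emails))  # {'null_percent': 40.0, 'distinct_count': 2}
```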

Set Up Schema Monitoring

Schema monitoring is automatic once you:
  1. Connect a data source
  2. Run discovery
  3. Configure alert rules for schema changes
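
Conceptually, schema drift detection compares the schema captured by one discovery run with the schema from the previous run. The following is a minimal sketch under that assumption; the snapshot format (a dict of column name to type) and the function name `schema_diff` are hypothetical.

```python
def schema_diff(previous, current):
    """Compare two {column_name: type} snapshots and report the differences."""
    prev_cols, curr_cols = set(previous), set(current)
    return {
        "added": sorted(curr_cols - prev_cols),
        "removed": sorted(prev_cols - curr_cols),
        # Columns present in both snapshots whose declared type differs
        "type_changed": sorted(
            c for c in prev_cols & curr_cols if previous[c] != current[c]
        ),
    }


before = {"id": "bigint", "total": "numeric", "email": "text"}
after = {"id": "bigint", "total": "text", "created_at": "timestamp"}
print(schema_diff(before, after))
# {'added': ['created_at'], 'removed': ['email'], 'type_changed': ['total']}
```

An alert rule for schema changes could then act on any non-empty list, or only on `removed` for tables where additions are harmless.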

Best Practices

Start with Critical Assets

Don’t monitor everything at once. Focus on:
  • Revenue-impacting tables: Orders, payments, transactions
  • Customer-facing data: Data that powers dashboards and reports
  • Compliance-required data: Audit logs, regulatory reports

Set Realistic Expectations

Match SLAs to actual data patterns:
| Data Type | Typical Freshness |
| --- | --- |
| Real-time events | Minutes |
| Hourly ETL | 1-2 hours |
| Daily batches | Same-day |
| Weekly reports | 1 week |

Layer Your Monitoring

Combine multiple checks for full coverage.

Critical table (orders):
  • Freshness: Alert if >2 hours stale
  • Completeness: Alert if row count drops >50%
  • Metrics: Alert if null_percent exceeds 5%
  • Schema: Alert on any column removed
  • Availability: Alert if discovery fails
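
The layered checks above can be sketched as a single evaluation over one table snapshot. The thresholds come from the list; everything else (the `ORDERS_CHECKS` dict, the snapshot fields, and the function name) is hypothetical and does not reflect AnomalyArmor's configuration format.

```python
# Thresholds taken from the example list for the orders table
ORDERS_CHECKS = {
    "max_staleness_hours": 2,
    "max_row_count_drop_pct": 50,
    "max_null_percent": 5,
}


def evaluate_orders(snapshot, previous, checks=ORDERS_CHECKS):
    """Return the names of the quality dimensions that should alert."""
    # If discovery failed there is nothing else to evaluate
    if not snapshot["discovery_ok"]:
        return ["availability"]

    alerts = []
    if snapshot["staleness_hours"] > checks["max_staleness_hours"]:
        alerts.append("freshness")

    prev_rows = previous["row_count"]
    if prev_rows > 0:
        drop_pct = 100 * (prev_rows - snapshot["row_count"]) / prev_rows
        if drop_pct > checks["max_row_count_drop_pct"]:
            alerts.append("completeness")

    if snapshot["null_percent"] > checks["max_null_percent"]:
        alerts.append("metrics")

    # Any column present before but missing now counts as a removal
    if set(previous["columns"]) - set(snapshot["columns"]):
        alerts.append("schema")
    return alerts


previous = {"row_count": 1000, "columns": ["id", "total", "email"]}
snapshot = {
    "discovery_ok": True,
    "staleness_hours": 3,
    "row_count": 400,
    "null_percent": 1.0,
    "columns": ["id", "total"],
}
print(evaluate_orders(snapshot, previous))  # ['freshness', 'completeness', 'schema']
```

Each check covers a failure the others miss: here the table is stale, lost 60% of its rows, and dropped a column, while the null percentage alone would have looked healthy.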

Data Quality Dashboard

View overall data health in the Assets section:
| Indicator | Meaning |
| --- | --- |
| Green | All checks passing |
| Yellow | Warning threshold reached |
| Red | SLA violated or issue detected |
| Gray | Not monitored |
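
One way to read the indicator table is as a roll-up of individual check results into a single color. The sketch below assumes the worst result wins; that precedence, the result labels `pass`, `warn`, and `fail`, and the function name are assumptions, not documented behavior.

```python
def asset_indicator(check_results):
    """Roll up a list of 'pass' / 'warn' / 'fail' results into one color."""
    if not check_results:
        return "gray"    # no checks configured: not monitored
    if "fail" in check_results:
        return "red"     # SLA violated or issue detected
    if "warn" in check_results:
        return "yellow"  # warning threshold reached
    return "green"       # all checks passing


print(asset_indicator(["pass", "warn", "pass"]))  # yellow
print(asset_indicator([]))                        # gray
```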

Common Questions

Which monitor should I set up first?

Start with freshness monitoring on revenue-impacting tables. Freshness catches the most common failure mode (ETL didn’t run) with the least configuration, then layer in row counts and column metrics as you learn your patterns.

What’s the difference between freshness, row count, and metrics?

Freshness answers “did data arrive on time?” Row count answers “did the right amount arrive?” Metrics answer “is the column-level data correct?” Critical tables benefit from all three. See the Freshness Monitoring, Row Count Monitoring, and Data Quality Metrics pages.

Do I have to monitor every table?

No, and you shouldn’t try. Focus on revenue-impacting tables, customer-facing dashboard sources, and compliance-required data. Most teams get 80% of the value from monitoring 20% of their assets.

Can AnomalyArmor detect issues it wasn’t explicitly configured for?

Row count monitoring and metrics use anomaly detection against learned baselines, so they can flag unusual values without explicit thresholds. Freshness still needs an SLA. Schema drift detection flags all structural changes automatically once discovery runs.
