> ## Documentation Index
> Fetch the complete documentation index at: https://docs.anomalyarmor.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Data Quality Metrics

> Monitor null percentages, distinct counts, and other column-level statistics to detect data quality issues

Data quality metrics let you track statistical properties of your columns over time. AnomalyArmor captures metric values on a schedule, builds historical baselines, and automatically detects when values fall outside expected ranges.

**Looking for row count monitoring?** Use [Row Count Monitoring](/data-quality/row-count-monitoring) for tracking row counts with ML-based anomaly detection or explicit thresholds.

**Prerequisites**: Before creating metrics, you need:

* A [connected data source](/data-sources/overview) with discovery completed
* At least one asset (table/view) to monitor

**Example scenario:** The `customer_email` column normally has \~3% null values. On Jan 30, null percentage jumped to 12.3%, well outside the expected range band. AnomalyArmor flags this as an anomaly, indicating a potential data quality issue in the source system.

## Why Use Metrics

Freshness tells you *when* data was updated. Completeness tells you *how much* arrived.
Metrics tell you *what changed* at the column level:

| Issue                      | Freshness     | Completeness   | Metrics        |
| -------------------------- | ------------- | -------------- | -------------- |
| ETL job failed completely  | Detects it    | Detects it     | Detects it     |
| ETL ran but loaded 0 rows  | Might miss it | **Catches it** | N/A            |
| Data loaded but 50% nulls  | Misses it     | Misses it      | **Catches it** |
| Unexpected duplicates      | Misses it     | Misses it      | **Catches it** |
| Values outside valid range | Misses it     | Misses it      | **Catches it** |

**Use freshness for "did data arrive on time?"**

**Use row count monitoring for "did the right amount of data arrive?"**

**Use metrics for "is the column-level data quality correct?"**

## Metric Types

All metrics require a specific column to monitor:

| Type              | Description               | Best For               |
| ----------------- | ------------------------- | ---------------------- |
| `null_percent`    | Percentage of null values | Detecting missing data |
| `distinct_count`  | Count of unique values    | Cardinality monitoring |
| `duplicate_count` | Count of repeated values  | Data quality checks    |
| `min_value`       | Minimum numeric value     | Range validation       |
| `max_value`       | Maximum numeric value     | Outlier detection      |
| `mean`            | Average numeric value     | Central tendency       |
| `percentile`      | Nth percentile value      | Distribution analysis  |

## Creating a Metric

Go to **Assets** and select the table you want to monitor.

Click the **Metrics** tab on the asset detail page.

Click **Create Metric** to open the metric configuration form.

Choose the type of metric you want to track:

* **null\_percent**: Percentage of null values in a column
* **distinct\_count**: Number of unique values
* **duplicate\_count**: Number of duplicate values
* **min/max/avg**: Numeric range and central tendency
* **percentile**: Distribution analysis

Need to monitor row counts? Use [Row Count Monitoring](/data-quality/row-count-monitoring) instead.
Choose how often to capture the metric:

| Interval | Best For                              |
| -------- | ------------------------------------- |
| Hourly   | High-frequency data, real-time tables |
| Daily    | Most batch ETL pipelines              |
| Weekly   | Slowly changing data                  |

Toggle **Anomaly Detection** on and set sensitivity:

| Sensitivity | Meaning                        | Use When               |
| ----------- | ------------------------------ | ---------------------- |
| 1.0         | Alert at 1 standard deviation  | Very sensitive         |
| 2.0         | Alert at 2 standard deviations | Balanced (recommended) |
| 3.0         | Alert at 3 standard deviations | Less sensitive         |

Start with sensitivity 2.0. Adjust based on the false positive rate you observe.

Click **Create** to save the metric. The first capture will run immediately.

## Viewing Metric History

Each metric tracks historical values and displays them as a trend chart:

* **Value line**: Actual metric values over time
* **Anomaly band**: Expected range (mean +/- sensitivity \* stddev)
* **Anomaly points**: Values outside the band are flagged

### Reading the Chart

| Indicator              | Meaning            |
| ---------------------- | ------------------ |
| Green line within band | Normal values      |
| Red dot outside band   | Anomaly detected   |
| Gray dashed lines      | Upper/lower bounds |

## Which Metric Type Should I Use?

To monitor row counts, use [Row Count Monitoring](/data-quality/row-count-monitoring). It provides ML-based pattern learning, time-windowed counting, and explicit threshold support.

Use **null\_percent** on columns that shouldn't have nulls. Example: Monitor `customer_email` for null percentage. Alert if nulls exceed the historical baseline (e.g., a jump from 2% to 15%).

Use **min\_value** and **max\_value** on numeric columns. Example: Monitor the `price` column. Alert if the minimum drops below 0 (invalid) or the maximum exceeds historical norms.

Use **duplicate\_count** on columns that should be unique. Example: Monitor `order_id` for duplicates. Any duplicates indicate a data quality issue.

Use **distinct\_count** on categorical columns.
Example: Monitor `country_code` distinct count. A sudden increase might indicate invalid data.

## Best Practices

### Start with High-Impact Metrics

Focus on metrics that catch real problems:

**Critical table (orders):**

* **Completeness**: Catch data loss or duplication (see [Row Count Monitoring](/data-quality/row-count-monitoring))
* **null\_percent** on `order_id`: Should never be null
* **null\_percent** on `customer_id`: Should never be null
* **min\_value** on `total_amount`: Should never be negative

### Match Capture Interval to Data Freshness

| Data Update Pattern | Recommended Interval |
| ------------------- | -------------------- |
| Real-time streaming | Hourly               |
| Hourly batch jobs   | Hourly               |
| Daily batch jobs    | Daily                |
| Weekly aggregates   | Weekly               |

### Use Meaningful Sensitivity Values

| Scenario                           | Sensitivity | Rationale                   |
| ---------------------------------- | ----------- | --------------------------- |
| New table, learning patterns       | 3.0         | Reduce noise while learning |
| Established table, stable patterns | 2.0         | Balanced detection          |
| Critical data, low tolerance       | 1.5         | More sensitive alerting     |

## Operating-Period Awareness

Many tables are only active during business hours. Nights and weekends are structurally quiet, so pooling those near-zero periods into one baseline either widens the expected band until real weekday regressions slip through, or flags every weekend as anomalously low. Operating-period awareness fixes this by comparing a value only against history from the same kind of period.

Each metric has an **operating period mode**:

| Mode            | Behavior                                                                                                                          |
| --------------- | --------------------------------------------------------------------------------------------------------------------------------- |
| `off` (default) | Pools all history into one baseline. Unchanged from before.                                                                       |
| `schedule`      | Segments the baseline using a linked [operating schedule](/alerts/operating-schedules) (declared days, hours, and timezone).      |
| `auto`          | Learns an active/dormant calendar from the metric's own history (day-of-week, plus hour-of-week for sub-daily capture intervals). |

When a value falls in an **active** period, it is baselined only against prior active values, so the band stays tight. When a value falls in a **dormant** (closed) period, a low or zero value is expected and never alerts; only unexpected activity above a near-zero floor alerts (for example, writes at 3am to a table that should be idle overnight).

Both `schedule` and `auto` need enough history before they take effect. Until then, and whenever the mode is `off`, the metric uses the standard pooled baseline.

`auto` learns from observed volume, so it adapts when real activity differs from nominal hours. Use `schedule` when you want to declare exact hours, or to override what `auto` would learn.

## Troubleshooting

**Metric shows no values**

**Causes:**

* Metric was just created and hasn't captured yet
* Capture job failed
* Table is empty

**Solutions:**

1. Wait for the next scheduled capture (check the interval)
2. Trigger a manual capture: **Actions > Capture Now**
3. Check that the table has data

**Too many false positives**

**Causes:**

* Sensitivity is too low (too sensitive)
* Normal data patterns are highly variable
* Seasonality not accounted for

**Solutions:**

1. Increase sensitivity (e.g., 2.0 to 3.0)
2. Allow more baseline data to accumulate (30+ days)
3. Consider whether the variation is actually expected

**Real anomalies are missed**

**Causes:**

* Sensitivity is too high (not sensitive enough)
* Baseline includes anomalous data
* Capture interval too infrequent

**Solutions:**

1. Decrease sensitivity (e.g., 3.0 to 2.0)
2. Reset the baseline after fixing data issues
3. Increase capture frequency

**Captures are failing**

**Causes:**

* Database connection issues
* Column was renamed or removed
* Permission changes

**Solutions:**

1. Check data source connection status
2. Verify the column still exists
3. Check database user permissions

## Common Questions

### When should I use metrics versus row count monitoring?
Use **metrics** for column-level checks like null rates, distinct counts, and numeric ranges. Use [Row Count Monitoring](/data-quality/row-count-monitoring) for table-level volume tracking; it has ML-based pattern learning and time-windowed counting that metrics don't.

### What sensitivity should I start with for anomaly detection?

Start at **2.0** (balanced, alerts at 2 standard deviations). Drop to 1.5 for critical data where you want tight detection, or raise to 3.0 if you're seeing too many false positives from noisy patterns.

### How long before anomaly detection is useful?

Anomaly detection needs a baseline. Expect rougher results for the first week or two while history accumulates. For stable patterns, 30+ days of baseline data gives the tightest, most trustworthy bands.

### Does AnomalyArmor read my column values?

It runs aggregate queries (like `COUNT`, `MIN`, `MAX`, `AVG`) against your database to compute the metric. Only the numeric result is stored; individual row values aren't transmitted or retained.

### Can I monitor a metric on a custom SQL expression?

The built-in metric types run against a specific column. For arbitrary SQL, use [Custom SQL Monitoring](/data-quality/custom-sql-monitoring) instead, which lets you write any `SELECT` that returns a numeric value.

## What's Next

* Get notified when metrics detect anomalies
* Automate metric management with the API
* Embed metric status in dashboards
* Configure where alerts are sent
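
As a closing illustration, the anomaly band described under Viewing Metric History (mean +/- sensitivity \* stddev) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not AnomalyArmor's implementation: the function names `anomaly_band` and `is_anomaly` are hypothetical, the history values are made up to resemble the `customer_email` example scenario, and the use of a sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def anomaly_band(history, sensitivity=2.0):
    """Return (lower, upper) as mean +/- sensitivity * stddev of past values.

    Assumption: sample standard deviation over a plain list of captures.
    """
    center = mean(history)
    spread = stdev(history)
    return center - sensitivity * spread, center + sensitivity * spread


def is_anomaly(value, history, sensitivity=2.0):
    """Flag a value that falls outside the band built from prior captures."""
    lower, upper = anomaly_band(history, sensitivity)
    return value < lower or value > upper


# Hypothetical daily null_percent captures hovering around 3%.
history = [2.8, 3.1, 3.0, 2.9, 3.2, 3.0, 3.1]

lower, upper = anomaly_band(history, sensitivity=2.0)
print(f"expected band: {lower:.2f} to {upper:.2f}")

# The 12.3% reading from the example scenario lies far outside the band.
print(is_anomaly(12.3, history))

# Raising sensitivity from 1.0 to 3.0 widens the band and reduces alerts.
narrow = anomaly_band(history, sensitivity=1.0)
wide = anomaly_band(history, sensitivity=3.0)
print(narrow, wide)
```

In this sketch, a higher sensitivity value widens the band, which matches the guidance above: raise it to reduce false positives, lower it for tighter detection.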