# AnomalyArmor Documentation (Full Content)

> AnomalyArmor is a data observability platform that detects schema changes, data freshness issues, and quality problems before they impact your business. This file concatenates every documentation page in navigation order, including all Q&A / Common Questions sections, for LLM ingestion.

> Index: https://docs.anomalyarmor.ai/llms.txt

---

<!-- Source: https://docs.anomalyarmor.ai/introduction -->

# Introduction

> AnomalyArmor - Data observability for modern data teams

AnomalyArmor is a data observability platform that helps data teams detect issues before they become incidents. Know when schemas change, data goes stale, or pipelines break, before your stakeholders do.

*AnomalyArmor: Data sources to alerts in one platform*

## The Problem We Solve

Data teams face a common challenge: **you don't know what you don't know**.

- A column gets dropped upstream → your dbt models fail at 3am
- An ETL job silently stops → dashboards show stale data for hours
- A data type changes → aggregations return wrong results for days
- A new table appears → you find out when someone asks about it

These aren't edge cases. They're Tuesday.

**AnomalyArmor gives you visibility** so you can catch issues when they happen, not when your CFO asks why the numbers are wrong.

## What is Data Observability?

Data observability brings the principles of application monitoring to your data infrastructure. Just as you wouldn't run a production application without alerts for errors and latency, you shouldn't run data pipelines without visibility into data health.

## Key Features

- **[Schema Monitoring](/schema-monitoring/overview)**: Automatically discover tables and detect schema changes before they break downstream systems. Know immediately when columns are added, removed, or modified.
- **[Freshness Monitoring](/data-quality/freshness-monitoring)**: Set SLAs on data freshness and get alerted when data is stale or delayed. Never show yesterday's data on today's dashboard.
- **[Data Quality Metrics](/data-quality/metrics)**: Track row counts, null percentages, distinct values, and more. Detect anomalies automatically using statistical analysis.
- **[Intelligent Alerts](/alerts/overview)**: Route alerts to Slack, email, PagerDuty, or webhooks. Configure rules to filter noise and escalate what matters.
- **[AI-Powered Insights](/intelligence/overview)**: Get AI-generated documentation for your tables and columns. Ask natural language questions about your data catalog.
- **[Report Badges](/data-quality/report-badges)**: Embed data quality status in Databricks, Looker, Notion, and other tools. See pass/fail at a glance.

## Who Uses AnomalyArmor?

### Data Engineers

**Your challenge**: Pipelines break when upstream schemas change. You're the first call when data is wrong.

**How AnomalyArmor helps**:
- Get alerted to schema changes before your pipelines run
- Understand what changed and when
- Reduce time-to-resolution for data incidents

### Analytics Engineers

**Your challenge**: dbt models fail mysteriously. You spend hours debugging "column not found" errors.

**How AnomalyArmor helps**:
- Know when source tables change
- Track which models depend on changed assets
- Document your data models with AI assistance

### Platform Teams

**Your challenge**: Multiple teams, multiple databases, no central visibility.

**How AnomalyArmor helps**:
- Single pane of glass across all data sources
- Centralized alerting with team-based routing
- Audit trail for compliance requirements

### Data Leaders

**Your challenge**: Stakeholders don't trust the data. You need to reduce data incidents.

**How AnomalyArmor helps**:
- Proactive monitoring reduces incidents by catching issues early
- SLA tracking demonstrates data reliability
- Centralized catalog improves data discovery

## Supported Data Sources

AnomalyArmor connects to your existing data infrastructure with no agents to install:

| Data Source | Support Level | Key Features |
|-------------|---------------|--------------|
| **PostgreSQL** | Full | Tables, views, schemas, constraints |
| **Databricks** | Full | Unity Catalog, Delta tables, views |
| **ClickHouse** | Full | Tables, materialized views, dictionaries |

> 💡 More data sources coming soon. [Contact us](mailto:support@anomalyarmor.ai) if you need a specific connector.

## How It Works

AnomalyArmor follows a simple, secure model:

1. **Connect**: Add your database with read-only credentials
2. **Discover**: We scan your metadata (never your actual data)
3. **Monitor**: Scheduled scans detect changes automatically
4. **Alert**: Get notified through your preferred channels

## Security First

AnomalyArmor is built with security as a foundation:

- **Metadata only**: We never access your actual data
- **Encrypted everywhere**: TLS 1.3 in transit, AES-256 at rest
- **Minimal access**: Read-only, dedicated database users recommended

## Getting Started

Ready to try AnomalyArmor? You can be monitoring your first database in under 15 minutes.

- **[Quickstart Guide](/quickstart/overview)**: Step-by-step guide to connect your first database and set up monitoring
- **[Core Concepts](/concepts/overview)**: Understand assets, discovery, schema changes, and freshness

## Common Questions

### Does AnomalyArmor access my actual data?

No. AnomalyArmor only queries metadata (system catalogs like `information_schema` and `pg_catalog`) and bounded aggregates (`COUNT`, `MIN`, `MAX`, `AVG`, distinct counts). It never runs `SELECT *` or reads raw row values. The open-source [Query Gateway](/security/query-gateway) enforces this on every query.

### What databases does AnomalyArmor support?

PostgreSQL (12+), Databricks (with Unity Catalog), and ClickHouse are fully supported today. BigQuery, Snowflake, Redshift, MySQL, and SQL Server are in active development. See [Data Sources](/data-sources/overview) for the current list and [contact us](mailto:support@anomalyarmor.ai) if you need a specific connector.

### How long does it take to set up AnomalyArmor?

About 15-20 minutes to connect your first database and get your first alert. The [quickstart](/quickstart/overview) walks through the four steps: connect, discover, alert, metric.

### Do I need to install anything in my infrastructure?

No. AnomalyArmor is fully managed SaaS. Connections go outbound from our cloud to your database using the credentials you provide. Nothing runs on your side.

### Will AnomalyArmor slow down my production database?

Impact is minimal. Discovery queries hit only system tables, and metric checks run lightweight aggregates on a schedule you control. Typical discovery completes in under 60 seconds for up to 1,000 tables. You can point AnomalyArmor at a read replica if you want zero impact on the primary.

### Can I try AnomalyArmor without talking to sales?

Trials are available by invitation. If you received a cold email or a trial code, use the link provided. Otherwise, start from the [homepage try-it form](https://www.anomalyarmor.ai/) or email [hello@anomalyarmor.ai](mailto:hello@anomalyarmor.ai). Trials are 14 days with no credit card; there is no self-serve trial without a code and no free tier.

## Need Help?

- **[Support](/support/contact)**: Contact our team for help with your account.
- **[FAQ](/troubleshooting/faq)**: Answers to common questions.

---

<!-- Source: https://docs.anomalyarmor.ai/how-it-works -->

# How AnomalyArmor Works

> Understand the architecture and data flow behind AnomalyArmor's data observability platform

AnomalyArmor provides visibility into your data infrastructure without accessing your actual data. We connect to your databases, discover your assets, track data quality metrics, use AI to document your catalog, and alert you when something needs attention.

*AnomalyArmor system architecture*

## Core Architecture

AnomalyArmor operates on a simple principle: **metadata and aggregates only, never raw row values**. We query your database's information schema for structure, and run bounded aggregate queries (`COUNT`, `MIN`, `MAX`, `SUM`, `AVG`, `PERCENTILE`, distinct-value counts) for monitoring - never `SELECT *` or row-level reads of sensitive columns. The Query Gateway enforces this at every query. See [Query Gateway](/security/query-gateway) for the full policy.

### What We Access

| We Access | We Never Access |
|-----------|-----------------|
| Table and view names | Individual row data |
| Column names and types | `SELECT *` or raw column values |
| Schema metadata | PII or sensitive content |
| Timestamp columns (for freshness) | Raw business records |
| Index and constraint definitions | Query results beyond aggregates |
| Aggregate statistics (counts, min/max, percentiles) | |

This gives you value-level monitoring - row counts, freshness, distribution shifts - with minimal data exposure.

## The Data Flow

AnomalyArmor follows a six-stage pipeline from connection to notification:

*Data flow from discovery to alerting*

### Stage 1: Discovery

When you connect a database, AnomalyArmor runs a **discovery job** that:

1. Connects securely using your provided credentials
2. Queries the information schema (`pg_catalog`, `information_schema`, etc.)
3. Catalogs all tables, views, columns, and their data types
4. Records the current state as a baseline

Discovery can run on a schedule (hourly, daily) or on-demand. After the initial full discovery, subsequent runs are **incremental**, only detecting what changed since the last run.

### Stage 2: Asset Catalog

Discovered assets are stored in your **Asset Catalog**, a searchable inventory of everything in your data infrastructure:

- **Tables and Views**: With column details, data types, and relationships
- **Metadata**: Last updated timestamps, row counts (if available), descriptions
- **History**: Full audit trail of every change detected

The catalog becomes your source of truth for understanding what exists across all connected databases.

### Stage 3: Change Detection

On each discovery run, AnomalyArmor compares the current state against the previous state to detect:

- **Schema drift**: Columns added, removed, renamed, or type-changed
- **New assets**: Tables or views that didn't exist before
- **Removed assets**: Tables or views that no longer exist
- **Freshness violations**: Data not updated within expected SLA

Each detected change is recorded with timestamp, type, before/after values, and affected asset details.
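
The comparison described above can be pictured as a diff between two metadata snapshots. The following is a minimal sketch, not AnomalyArmor's implementation: the `Snapshot` shape, the `Change` record, and the `diff_snapshots` function are hypothetical names chosen for illustration, and a snapshot is assumed to map each asset name to its columns and data types.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed shape: "schema.table" -> {column name: data type}.
Snapshot = dict[str, dict[str, str]]


@dataclass(frozen=True)
class Change:
    asset: str
    kind: str  # "asset_added", "asset_removed", "column_added", ...
    column: Optional[str] = None
    before: Optional[str] = None
    after: Optional[str] = None


def diff_snapshots(previous: Snapshot, current: Snapshot) -> list[Change]:
    """Compare two metadata snapshots and list the differences."""
    changes: list[Change] = []
    # Assets present in only one of the two snapshots.
    for asset in sorted(current.keys() - previous.keys()):
        changes.append(Change(asset, "asset_added"))
    for asset in sorted(previous.keys() - current.keys()):
        changes.append(Change(asset, "asset_removed"))
    # Assets present in both: compare their columns.
    for asset in sorted(previous.keys() & current.keys()):
        old_cols, new_cols = previous[asset], current[asset]
        for col in sorted(new_cols.keys() - old_cols.keys()):
            changes.append(Change(asset, "column_added", col, after=new_cols[col]))
        for col in sorted(old_cols.keys() - new_cols.keys()):
            changes.append(Change(asset, "column_removed", col, before=old_cols[col]))
        for col in sorted(old_cols.keys() & new_cols.keys()):
            if old_cols[col] != new_cols[col]:
                changes.append(
                    Change(asset, "column_type_changed", col, old_cols[col], new_cols[col])
                )
    return changes
```

In this simplified model a renamed column shows up as one removal plus one addition; recognizing it as a rename would need extra heuristics that the sketch leaves out.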

### Stage 4: Data Quality Metrics

Beyond schema monitoring, AnomalyArmor tracks statistical properties of your data:

| Metric | What It Measures |
|--------|------------------|
| Row count | Total rows in a table |
| Null percentage | Missing values in a column |
| Distinct count | Unique values |
| Min/max/avg | Numeric column statistics |

Metrics run on a schedule you configure. AnomalyArmor builds baselines from historical values and detects anomalies when new values fall outside expected ranges.
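
To make the baseline idea concrete, here is a minimal sketch of one common approach, a z-score check against historical values. The statistical method AnomalyArmor actually uses is not specified here, so treat the technique, the `is_anomalous` name, and the default threshold of 3 standard deviations as assumptions.

```python
from statistics import mean, stdev


def is_anomalous(history: list[float], new_value: float, threshold: float = 3.0) -> bool:
    """Flag new_value if it is more than `threshold` standard deviations from the mean."""
    if len(history) < 2:
        return False  # Not enough history to form a baseline.
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat history: any different value is unexpected.
        return new_value != mu
    return abs(new_value - mu) / sigma > threshold


# Daily row counts for a table (illustrative numbers only).
row_counts = [1000, 1010, 990, 1005, 995]
print(is_anomalous(row_counts, 1004))  # False
print(is_anomalous(row_counts, 400))   # True
```

A fixed threshold is the simplest choice; metrics with strong weekly or seasonal patterns usually need a baseline that accounts for them.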

### Stage 5: Intelligence

AnomalyArmor uses AI to help you understand your data catalog:

- **Natural language search**: Ask "where is customer data?" instead of clicking through schemas
- **Auto-generated descriptions**: Get human-readable documentation for tables and columns
- **Pattern detection**: Identify PII, relationships, and naming conventions

Intelligence analyzes your schema metadata only. It never reads actual data values.

### Stage 6: Alerting

When changes match your configured **alert rules**, notifications are sent to your chosen destinations:

- **Slack**: Channel messages with rich formatting
- **Email**: Detailed reports to individuals or distribution lists
- **PagerDuty**: Incidents for on-call escalation
- **Microsoft Teams**: Channel notifications
- **Webhooks**: Custom integrations with any system

You control which changes trigger alerts and where they go. A dropped column in production might page on-call, while a new table in dev just goes to email.

## Security Model

AnomalyArmor uses enterprise-grade security controls:

### Query Security

Every SQL query is validated by our open-source [Query Gateway](/security/query-gateway) before execution:

- **Fail-closed validation**: Queries that can't be parsed are blocked
- **Access level enforcement**: Only metadata and aggregate queries are allowed
- **Open source**: Audit the code yourself at [github.com/anomalyarmor/anomalyarmor-query-gateway](https://github.com/anomalyarmor/anomalyarmor-query-gateway)
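
To illustrate what "fail-closed" means in practice, the toy validator below allows a query only when it fits a deliberately narrow shape and rejects everything it does not recognize. It is not the Query Gateway's code or API: the `is_allowed` function, the regular expressions, and the allowlists are hypothetical, and a real validator would use a proper SQL parser rather than regexes.

```python
import re

ALLOWED_AGGREGATES = {"count", "min", "max", "sum", "avg"}
SYSTEM_SCHEMAS = ("information_schema.", "pg_catalog.")

# Narrow on purpose: SELECT <items> FROM <table>, with nothing after the table.
_QUERY = re.compile(
    r"^\s*select\s+(?P<items>.+?)\s+from\s+(?P<table>[\w.]+)\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)
# One select-list item: fn(*) or fn([DISTINCT] column), optionally aliased.
_AGGREGATE = re.compile(
    r"^(?P<fn>\w+)\(\s*(?:\*|(?:distinct\s+)?\w+)\s*\)(?:\s+as\s+\w+)?$",
    re.IGNORECASE,
)


def is_allowed(sql: str) -> bool:
    """Return True only for queries the validator positively recognizes."""
    match = _QUERY.match(sql)
    if match is None:
        return False  # Fail closed: unrecognized input is blocked.
    if match.group("table").lower().startswith(SYSTEM_SCHEMAS):
        return True  # Metadata query against a system catalog.
    for item in match.group("items").split(","):
        agg = _AGGREGATE.match(item.strip())
        if agg is None or agg.group("fn").lower() not in ALLOWED_AGGREGATES:
            return False  # Raw column or unknown function.
    return True


print(is_allowed("SELECT COUNT(*) FROM public.orders"))  # True
print(is_allowed("SELECT * FROM public.orders"))         # False
print(is_allowed("DELETE FROM public.orders"))           # False
```

The design choice to note is the default: the function never tries to enumerate dangerous statements. It returns `True` only on a positive match, so anything unexpected, including valid SQL the toy grammar does not cover, is blocked.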

### Credential Storage

- All credentials encrypted at rest with **AES-256**
- Encryption keys managed via AWS KMS
- Credentials never logged or exposed in UI

### Network Security

- All connections use **TLS 1.3** encryption
- Static IP addresses available for firewall allowlisting
- Support for VPC peering and Private Link (Enterprise)

### Access Control

- Role-based access control (RBAC) for team members
- Audit logs for all administrative actions
- SSO integration (SAML 2.0) available

> ℹ️ For detailed security information, see our [Security Overview](/security/overview).

## Deployment Architecture

AnomalyArmor is a fully managed SaaS platform. There's nothing to install in your infrastructure.

*Deployment model: your databases stay in your infrastructure*

### What Runs Where

| Component | Location | Details |
|-----------|----------|---------|
| Discovery jobs | AnomalyArmor Cloud | Scheduled or on-demand |
| Asset catalog | AnomalyArmor Cloud | Your metadata stored securely |
| Metrics engine | AnomalyArmor Cloud | Scheduled metric capture and anomaly detection |
| Intelligence | AnomalyArmor Cloud | AI-powered search and documentation |
| Alert rules | AnomalyArmor Cloud | Configured via web UI |
| Database credentials | AnomalyArmor Cloud | Encrypted at rest |
| Your database | Your infrastructure | We connect outbound to you |

## Performance Impact

AnomalyArmor is designed for minimal performance impact on your databases:

- **Read-only queries**: We only run `SELECT` statements, never writes or DDL
- **Lightweight queries**: Discovery reads the information schema rather than scanning tables; metrics use bounded aggregates
- **Configurable scheduling**: Run discovery during off-peak hours
- **Connection pooling**: Efficient connection management

Typical discovery runs complete in **under 60 seconds** for databases with up to 1,000 tables.

## Common Questions

### What exactly does AnomalyArmor query against my database?

Only metadata from system catalogs (`information_schema`, `pg_catalog`, and equivalents) and bounded aggregates on your tables: `COUNT`, `MIN`, `MAX`, `AVG`, distinct-value counts, and percentile approximations. Never `SELECT *`, never raw row reads. The [Query Gateway](/security/query-gateway) blocks anything else at execution time.

### Does AnomalyArmor need write access to my database?

No. Read-only credentials are sufficient and recommended. AnomalyArmor never issues `INSERT`, `UPDATE`, `DELETE`, or DDL statements. Create a dedicated read-only user and grant it `SELECT` on the schemas you want monitored.

### Can I point AnomalyArmor at a read replica instead of my primary database?

Yes, and it's a common setup for production workloads. Freshness is determined by the replica's timestamps, so if your replica lags the primary by minutes, that lag is reflected in freshness checks - factor that into your SLAs.
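
The effect of replica lag on a freshness check can be worked through with a small example. This is a sketch under simplifying assumptions: row timestamps equal commit times, the replica shows the primary's state as of `now - replica_lag`, and the helper names `latest_visible` and `is_stale` are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def latest_visible(load_times: list[datetime], as_of: datetime) -> Optional[datetime]:
    """Newest load timestamp committed at or before `as_of`, or None if there is none."""
    visible = [t for t in load_times if t <= as_of]
    return max(visible) if visible else None


def is_stale(latest: Optional[datetime], now: datetime, sla: timedelta) -> bool:
    """Stale when nothing is visible or the newest row is older than the SLA."""
    return latest is None or now - latest > sla


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
loads = [now - timedelta(minutes=70), now - timedelta(minutes=5)]
sla = timedelta(minutes=60)
replica_lag = timedelta(minutes=10)

stale_on_primary = is_stale(latest_visible(loads, now), now, sla)
stale_on_replica = is_stale(latest_visible(loads, now - replica_lag), now, sla)
print(stale_on_primary, stale_on_replica)  # False True
```

Here the most recent load landed on the primary five minutes ago but has not reached the replica yet, so the same 60-minute SLA passes on the primary and fails on the replica. Adding the expected replica lag to the SLA is one way to avoid that kind of false alarm.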

### Where is the metadata AnomalyArmor collects stored?

In AnomalyArmor's managed cloud, encrypted at rest with AES-256 (keys managed via AWS KMS) and encrypted in transit with TLS 1.3. Credentials are stored with separate encryption and are never exposed in the UI or logs.

### Can I self-host AnomalyArmor?

AnomalyArmor is primarily offered as managed SaaS. Self-hosted deployments of the open-source Query Gateway component are available for Enterprise customers who want to run validation inside their own network. [Contact sales](mailto:sales@anomalyarmor.ai) for options.

### Does AnomalyArmor support IP allowlisting?

Yes. AnomalyArmor provides static outbound IPs you can add to your database firewall allowlist. See the current IP list in your dashboard under **Settings → Security**. VPC peering and AWS PrivateLink are available on Enterprise plans for networks that can't use IP allowlisting.

## Next Steps

Now that you understand how AnomalyArmor works, you're ready to get started:

- **[Quickstart Guide](/quickstart/overview)**: Connect your first database in under 15 minutes
- **[Core Concepts](/concepts/overview)**: Deep dive into assets, discovery, and monitoring

---

<!-- Source: https://docs.anomalyarmor.ai/quickstart/connect-first-database -->

# Connect Your First Database

> Step 1 - Add a database connection to AnomalyArmor

In this step, you'll add your first database connection to AnomalyArmor. This establishes the secure link that allows us to discover and monitor your data assets.

## Before You Begin

Make sure you have:
- Database hostname, port, and credentials
- Network access from AnomalyArmor to your database (see [Quickstart Overview](/quickstart/overview))
- A read-only database user (recommended)

## Supported Databases

| Database | Version | Notes |
|----------|---------|-------|
| PostgreSQL | 12+ | Including RDS, Aurora, Supabase |
| Databricks | Unity Catalog | Requires access token |
| ClickHouse | 21+ | Including ClickHouse Cloud |

## Step-by-Step Guide

## What Happens Next

After saving your connection:

1. **Credentials are encrypted**: Stored securely with AES-256 encryption
2. **Connection is ready**: You can now run discovery to scan your database
3. **No data accessed yet**: Discovery must be triggered manually or scheduled

## Security Best Practices

> ⚠️ Always use a **read-only, dedicated user** for AnomalyArmor. Never use your application's production credentials.

### Creating a Read-Only User

### PostgreSQL

```sql
-- Create a dedicated user
CREATE USER anomalyarmor WITH PASSWORD 'your-secure-password';

-- Grant minimal required permissions
GRANT CONNECT ON DATABASE your_database TO anomalyarmor;
GRANT USAGE ON SCHEMA public TO anomalyarmor;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO anomalyarmor;

-- For future tables (optional but recommended)
ALTER DEFAULT PRIVILEGES IN SCHEMA public
GRANT SELECT ON TABLES TO anomalyarmor;
```

### ClickHouse

```sql
-- Create a dedicated user with read-only access
CREATE USER anomalyarmor
IDENTIFIED BY 'your-secure-password';

-- Grant read access to all tables
GRANT SELECT ON *.* TO anomalyarmor;
```

### Databricks

In Unity Catalog:
1. Create a service principal or use a personal access token
2. Grant `USE CATALOG` on your catalog
3. Grant `USE SCHEMA` on schemas to monitor
4. Grant `SELECT` on the tables to monitor. Avoid `ALL PRIVILEGES`: it also grants write privileges such as `MODIFY`, which AnomalyArmor does not need

## Troubleshooting

### Connection refused

**Cause**: Network connectivity issues

**Solutions**:
1. Verify the hostname and port are correct
2. Check that your database is running
3. Ensure AnomalyArmor IPs are allowlisted in your firewall/security group
4. For RDS: Verify the security group allows inbound traffic on your DB port

```bash
# Test connectivity from your machine
nc -zv your-host.rds.amazonaws.com 5432
```

### Authentication failed

**Cause**: Invalid credentials

**Solutions**:
1. Double-check username and password (copy-paste to avoid typos)
2. Verify the user exists in the database
3. Ensure the user has CONNECT permission on the database
4. Check for special characters in password that may need escaping

### SSL connection required

**Cause**: Database requires SSL but connection isn't using it

**Solutions**:
1. Set SSL Mode to `require` in the connection form
2. For RDS: SSL is required by default, so make sure SSL Mode is set
3. For self-hosted: Verify SSL certificates are properly configured

### Connection timeout

**Cause**: Network latency or firewall issues

**Solutions**:
1. Verify the database is accessible from outside your VPC
2. Check for firewall rules blocking the connection
3. For private databases: Contact us about VPC peering options

### Permission denied

**Cause**: User lacks required permissions

**Solutions**:
1. Grant `SELECT` on `information_schema` tables
2. Grant `USAGE` on the schema you want to monitor
3. Verify user can access the specific database

## Common Questions

### What database permissions does AnomalyArmor need?

Read-only: `CONNECT` on the database, `USAGE` on the schemas you want monitored, and `SELECT` on tables. `ALTER DEFAULT PRIVILEGES` is optional but recommended so new tables are picked up automatically. No write, DDL, or admin permissions are ever needed.

### Can I connect AnomalyArmor to a database inside a private VPC?

Yes, via three options: IP allowlisting from AnomalyArmor's static outbound IPs (visible in **Settings → Security**), VPC peering, or AWS PrivateLink. The last two require an Enterprise plan and a short setup call. See [Connection Issues](/troubleshooting/connection-issues) for diagnosing network reachability.

### Does AnomalyArmor support Amazon RDS, Aurora, and Supabase?

Yes. These are PostgreSQL-compatible and work with the PostgreSQL connector. RDS and Aurora require SSL; set `SSL Mode` to `require` on the connection form. Supabase requires the direct database connection string, not the pooler URL, for freshness queries to return accurate timestamps.

### Why should I create a dedicated read-only user instead of reusing an existing one?

Blast radius. A dedicated user means you can audit exactly what AnomalyArmor runs, rotate credentials without breaking other systems, and apply minimal permissions. The [Query Gateway](/security/query-gateway) already blocks writes and full-row reads, but a read-only user at the database layer is defense in depth.

### My connection test fails with 'Connection refused' - what's wrong?

Almost always a firewall or security-group issue. Verify your database accepts inbound traffic on its port from AnomalyArmor's static IPs (listed in **Settings → Security**). For RDS, check the security group inbound rules. For self-hosted databases, confirm `listen_addresses` in PostgreSQL config allows external connections.

## Next Step

Your database is connected! Now let's discover your tables and schemas.

---

<!-- Source: https://docs.anomalyarmor.ai/quickstart/run-first-discovery -->

# Run Your First Discovery

> Step 2 - Discover tables and schemas in your database

Discovery is how AnomalyArmor learns about your database. It scans your metadata to find tables, views, columns, and their data types, creating a complete inventory of your data assets.

## What Discovery Does

Discovery queries your database's **system catalogs** (not your data) to find:

| Asset Type | What We Capture |
|------------|-----------------|
| **Tables** | Name, schema, column definitions |
| **Views** | Name, schema, underlying columns |
| **Columns** | Name, data type, nullability, constraints |
| **Metadata** | Primary keys, indexes, row counts (if available) |

> ℹ️ **Privacy First**: Discovery reads ONLY metadata (table names, columns, types). It never reads your actual data values. We only query metadata from `information_schema`, `pg_catalog`, or equivalent system tables.

## Step-by-Step Guide

## Understanding Discovery Results

### The Asset Catalog

After discovery, your assets appear in the catalog. You can view them by clicking **Assets** in the sidebar.

### Asset Details

Click any asset to see detailed information:

- **Columns**: Full list with data types and constraints
- **Schema History**: Changes detected over time (empty for first discovery)
- **Freshness**: Data recency (requires configuration)
- **Description**: AI-generated or manual documentation

## Setting Up Scheduled Discovery

Discovery should run regularly to detect changes. Configure a schedule:

### Recommended Schedules

| Environment | Frequency | Rationale |
|-------------|-----------|-----------|
| Production | Hourly | Catch changes quickly |
| Staging | Every 6 hours | Balance coverage and load |
| Development | Daily | Lower priority |

### Configure Schedule

1. Go to **Settings** → **Assets** → Select your connection
2. Click the **Settings** tab on the asset detail page
3. Under **Discovery Schedule**, choose frequency
4. Click **Save**

**Discovery Schedule Options:**
- Every 15 minutes (near-real-time monitoring)
- Hourly (recommended for production)
- Every 6 hours
- Daily
- Weekly
- Manual only

> 💡 Start with **hourly** discovery for production databases. You can adjust based on how frequently your schemas change and your alerting needs.

## What Happens on Subsequent Discoveries

After the initial discovery, subsequent runs detect **changes**:

- **New tables/views** added since last discovery
- **Removed tables/views** that no longer exist
- **Schema changes** (columns added, removed, type changed)
- **Freshness updates** (latest timestamp values)

Changes trigger alerts if you have alert rules configured (next step!).

## Incremental vs. Full Discovery

| Type | What It Does | When to Use |
|------|--------------|-------------|
| **Incremental** | Detects changes since last run | Default for scheduled runs |
| **Full** | Complete rescan of everything | After major changes, troubleshooting |

To run a full discovery: Click **Run Discovery** → Select **Full Discovery**.

## Troubleshooting

### Discovery takes too long

**Cause**: Large database or network latency

**Solutions**:
1. Exclude schemas you don't need (e.g., `pg_temp`, test schemas)
2. Run during off-peak hours
3. Use a read replica instead of primary database

To exclude schemas:
- **Settings** → **Assets** → Select your connection → **Settings** tab → **Schema Filter**
- Add schemas to exclude: `pg_temp_*, test_*`

### Discovery finds no tables

**Cause**: Permission or schema filtering issues

**Solutions**:
1. Verify the user has `SELECT` on `information_schema`
2. Check schema filter isn't excluding everything
3. Confirm you're connected to the correct database
4. Verify tables exist in the schemas you're monitoring

### Discovery fails with error

**Cause**: Connection or permission issues

**Solutions**:
1. Test the connection (Settings → Assets → Select connection → Test Connection)
2. Check database user permissions
3. Verify network connectivity
4. Check database logs for errors

### Row counts not showing

**Cause**: Some databases don't expose row counts in system tables

**Solutions**:
- PostgreSQL: Run `ANALYZE` on tables to populate `pg_stat_user_tables`
- This is informational only. Monitoring works without row counts

## What You've Accomplished

After completing discovery, you have:

- A complete inventory of your database assets
- Metadata for all tables, views, and columns
- A baseline for detecting future changes
- Assets visible in your catalog

## Common Questions

### How long does discovery take for a large database?

Discovery queries only system catalogs, not your data, so it's fast. Typical durations: under 1 minute for up to 100 tables, 1-3 minutes for 100-500 tables, 3-5 minutes for 500-1000 tables, and 5-10 minutes for 1000+ tables. You can navigate away during discovery.

### Does discovery impact my database's performance?

Minimal. Discovery runs information-schema queries against system tables, not table scans on your data. Expect a brief spike in catalog read activity for a few seconds, no noticeable load on production workloads.

### Can I limit which schemas or tables AnomalyArmor discovers?

Yes. Configure include/exclude patterns per schema on the connection's detail page. Patterns support glob-style matching (e.g., `analytics_*` includes every schema starting with `analytics_`, `public.audit_*` excludes specific tables).
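
Glob-style matching of this kind can be tried locally with Python's standard `fnmatch` module. The sketch below is only an approximation: the `is_monitored` helper is hypothetical, and the precedence it assumes (an exclude pattern always wins, and an empty include list means "include everything") may differ from how AnomalyArmor resolves overlapping patterns.

```python
from fnmatch import fnmatchcase


def is_monitored(name: str, include: list[str], exclude: list[str]) -> bool:
    """Keep a name if it matches an include pattern and no exclude pattern."""
    # Assumption: no include patterns means everything is included.
    included = not include or any(fnmatchcase(name, p) for p in include)
    # Assumption: excludes take precedence over includes.
    excluded = any(fnmatchcase(name, p) for p in exclude)
    return included and not excluded


print(is_monitored("analytics_core", ["analytics_*"], []))    # True
print(is_monitored("public.audit_log", [], ["public.audit_*"]))  # False
```

Testing patterns against a list of your real schema names this way is a quick check that a filter does not accidentally exclude everything.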

### How often does discovery re-run after the first scan?

By default, daily. Set it to hourly on the connection's detail page if your environment changes often. Each subsequent run is incremental - AnomalyArmor only records what changed since the last scan, which is also what drives schema-drift alerts.

### Why are my row counts showing as zero or missing after discovery?

Some databases don't expose row counts in system tables by default. For PostgreSQL, run `ANALYZE` on the affected tables to populate `pg_stat_user_tables`. Row counts are informational only - monitoring, freshness, and schema-drift detection all work without them.

## Next Step

Now that your assets are discovered, let's set up alerts to notify you when something changes.

---

<!-- Source: https://docs.anomalyarmor.ai/quickstart/set-up-first-alert -->

# Set Up Your First Alert

> Step 3 - Get notified when data issues occur

Alerts are how AnomalyArmor tells you when something needs attention. In this step, you'll create your first alert rule to get notified about schema changes.

## How Alerts Work

Alerts follow a simple flow: an event occurs, a rule decides whether it matters, and a destination receives the notification.

**Three components**:
1. **Event**: What triggers the alert (schema change, freshness violation, etc.)
2. **Rule**: Conditions that filter events (optional: specific tables, change types)
3. **Destination**: Where the alert goes (Slack, email, PagerDuty, webhook)
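
The relationship between the three components can be modeled in a few lines. This is an illustrative sketch only: the `Event`, `Rule`, and `route` names, the event-type strings, and the destination labels are hypothetical and do not reflect AnomalyArmor's API or configuration format.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    type: str   # e.g. "schema_change" or "freshness_violation"
    table: str


@dataclass
class Rule:
    name: str
    event_type: str
    destinations: list[str]
    tables: set[str] = field(default_factory=set)  # Empty set = no table condition.
    enabled: bool = True

    def matches(self, event: Event) -> bool:
        """A rule matches when it is enabled, the type agrees, and conditions pass."""
        if not self.enabled or event.type != self.event_type:
            return False
        return not self.tables or event.table in self.tables


def route(event: Event, rules: list[Rule]) -> list[str]:
    """Collect the destinations of every matching rule, without duplicates."""
    destinations: list[str] = []
    for rule in rules:
        if rule.matches(event):
            for destination in rule.destinations:
                if destination not in destinations:
                    destinations.append(destination)
    return destinations


rules = [
    Rule("All Schema Changes", "schema_change", ["slack:#data-alerts"]),
    Rule(
        "Critical Table Freshness",
        "freshness_violation",
        ["slack:#data-alerts", "pagerduty"],
        tables={"orders", "users", "payments"},
    ),
]

print(route(Event("freshness_violation", "orders"), rules))
# ['slack:#data-alerts', 'pagerduty']
```

An event that matches no rule produces an empty list, which is the model's equivalent of "no alert fired": a useful thing to keep in mind when a rule's conditions are narrower than intended.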

## Step-by-Step Guide

Let's create an alert rule for schema changes, one of the most common and important alerts.

## Recommended First Rules

Here are three rules we recommend setting up:

### Rule 1: All Schema Changes

| Field | Value |
|-------|-------|
| **Name** | All Schema Changes |
| **Event** | Schema Change Detected |
| **Conditions** | None (catch everything) |
| **Destination** | Slack #data-alerts |

Catches: Any structural change to your database

### Rule 2: Critical Table Freshness

| Field | Value |
|-------|-------|
| **Name** | Critical Table Freshness |
| **Event** | Freshness Violation |
| **Tables** | orders, users, payments |
| **Destination** | Slack #data-alerts + PagerDuty |

Catches: Stale data in your most important tables

### Rule 3: Discovery Failures

| Field | Value |
|-------|-------|
| **Name** | Discovery Failures |
| **Event** | Discovery Failed |
| **Conditions** | None |
| **Destination** | Slack #data-alerts |

Catches: Connection or permission issues

## Alert Destinations Deep Dive

- **[Slack Setup](/alerts/destinations/slack)**: Full guide to Slack integration
- **[All Destinations](/alerts/overview)**: Email, PagerDuty, webhooks, and more

## Troubleshooting

### Alerts not firing

**Check these in order**:

1. **Rule is enabled**: Toggle should be ON in the Rules list
2. **Destination is configured**: Test the destination
3. **Conditions match**: Are your filters too restrictive?
4. **Events are occurring**: Check Discovery is running and finding changes

### Too many alerts

**Reduce noise with filters**:

1. Add conditions to target specific tables
2. Filter out development/test schemas
3. Filter by change type (e.g., only column removals)
4. Use different destinations for different event types

See [Alert Best Practices](/alerts/best-practices) for more tips.

### Slack alerts not appearing

**Common causes**:

1. Bot not invited to private channels: Type `/invite @AnomalyArmor`
2. OAuth expired: Reconnect the Slack destination
3. Channel archived: Select a different channel

### Test alert works but real alerts don't

**The rule conditions might not match**:

1. Verify discovery is running and finding assets
2. Check rule conditions aren't filtering out all events
3. Wait for the next discovery run to detect changes
4. Manually trigger a schema change in a test table to verify

## What You've Accomplished

Congratulations! You've completed the quickstart. You now have:

- A database connected to AnomalyArmor
- Your data assets discovered and cataloged
- Alert rules configured to notify you of issues

## What's Next?

- **[Freshness Monitoring](/data-quality/freshness-monitoring)**: Set up SLAs to detect stale data before it impacts your business
- **[Alert Best Practices](/alerts/best-practices)**: Learn how to reduce alert fatigue and route alerts effectively
- **[Core Concepts](/concepts/overview)**: Deep dive into assets, discovery, schema changes, and more
- **[Add More Assets](/quickstart/connect-first-database)**: Connect additional databases to expand your monitoring

---

## Quick Reference: Alert Rule Cheat Sheet

| Event Type | Best Destination | Typical Conditions |
|------------|------------------|-------------------|
| Schema Change | Slack | Production databases only |
| Column Removed | PagerDuty | Critical tables |
| Freshness Violation | Slack + Email | SLA-bound tables |
| Discovery Failed | Email | All data sources |
| Asset Removed | Slack | Unexpected only |

## Common Questions

### What destinations can AnomalyArmor send alerts to?

Slack, Email, PagerDuty, Microsoft Teams, Linear, incident.io, and custom webhooks. Each destination is configured once under **Settings → Integrations** and then reused across alert rules. See [Alerts Overview](/alerts/overview) for the full list.

### How do I stop getting alerts for a specific table without disabling the whole rule?

Add an exclude condition on the alert rule: filter by asset name, schema, or tag. This keeps the rule active for everything else while muting the noisy table. Blackout windows work too if the noise is time-bounded (overnight batch loads, planned migrations).

### Can I route different alert types to different channels or people?

Yes - that's the recommended pattern. Create separate rules per event type (schema change, freshness violation, discovery failure) with their own destinations. Schema drops typically page on-call via PagerDuty; freshness lapses go to team Slack; discovery errors email the data platform team.

### Why didn't my alert fire when the data issue happened?

Three common causes: (1) the rule's conditions didn't match the specific event, (2) the rule is scoped to a different asset or schema, (3) the destination is misconfigured (stale Slack webhook, revoked token). Check **Alerts → Rule History** to see which events matched the rule and whether delivery succeeded.

### Can I test an alert rule before turning it on?

Yes. Every rule has a **Test Send** button that fires a sample alert to the destination so you can verify formatting and channel routing. Use it before enabling rules in production.

---

<!-- Source: https://docs.anomalyarmor.ai/quickstart/create-first-metric -->

# Set Up Completeness Monitoring

> Track row counts with ML-powered anomaly detection

Now that you have discovery running and alerts configured, let's set up completeness monitoring. This tracks row counts in your tables and alerts you when data volumes are abnormal.

## What You'll Accomplish

By the end of this step, you'll have:
- Completeness monitoring on a critical table
- ML-based anomaly detection learning your data patterns
- Alerts when row counts deviate from expected values

## Why Completeness Monitoring?

Row count anomalies catch real problems:

| Anomaly | What It Means |
|---------|---------------|
| **Sudden drop** | Failed ETL job, data loss, broken pipeline |
| **Unexpected spike** | Duplicate loads, runaway inserts |
| **Missing data** | Source system outage, extraction failure |
| **Gradual decline** | Upstream issues, filtering bugs |

## Set Up Monitoring

## Understanding the Learning Phase

After enabling, you'll see a **Learning** status badge on the monitoring card. The chart will show data points being collected, with a message like "Building baseline - 3 of 7 data points collected."

**What happens during learning:**
- Row counts are captured at your check interval
- ML model analyzes patterns (daily, weekly, hourly trends)
- After 7+ data points, predictions activate
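
As a rough illustration of the learning gate described above, the sketch below reports a status string until enough captures exist. The function name and the `MIN_BASELINE_POINTS` constant are hypothetical, chosen to mirror the "7+ data points" guidance; this is not AnomalyArmor's actual implementation.

```python
# Assumption: threshold taken from the "7+ data points" guidance above.
MIN_BASELINE_POINTS = 7


def baseline_status(row_counts: list[int], required: int = MIN_BASELINE_POINTS) -> str:
    """Return a human-readable status for a metric's learning phase (hypothetical helper)."""
    collected = len(row_counts)
    if collected < required:
        # Still learning: values are recorded, predictions are not active yet.
        return f"Learning: {collected} of {required} data points collected"
    return "Active: baseline established"


status = baseline_status([1200, 1180, 1215])
```

With three captures the helper stays in the learning state; once the list holds seven or more values it reports an established baseline.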

## After Learning Completes

Once the baseline is established:

- **Expected range** shown on the chart
- **Anomalies** highlighted when row count falls outside predictions
- **Alerts** fire to your configured destinations

## Explicit Mode (Alternative)

If you know exactly what to expect, use explicit mode:

| Setting | Example |
|---------|---------|
| Mode | Explicit |
| Minimum Rows | 10,000 |
| Maximum Rows | 50,000 |

Alerts fire when row count falls outside your defined range.
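
The explicit-mode check amounts to a simple range test. The sketch below assumes the bounds are inclusive (a count equal to the minimum or maximum is considered healthy); the function name is illustrative, not part of the product.

```python
def outside_explicit_range(row_count: int, minimum_rows: int, maximum_rows: int) -> bool:
    """Return True when the row count falls outside the configured range.

    Assumption: both bounds are inclusive, so a count equal to either bound passes.
    """
    if minimum_rows > maximum_rows:
        raise ValueError("minimum_rows must not exceed maximum_rows")
    return row_count < minimum_rows or row_count > maximum_rows


# Using the example settings from the table above.
too_low = outside_explicit_range(8_500, minimum_rows=10_000, maximum_rows=50_000)
healthy = outside_explicit_range(25_000, minimum_rows=10_000, maximum_rows=50_000)
```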

## Which Tables to Monitor

Start with tables that:
- Receive data regularly (daily, hourly)
- Are critical to downstream reporting
- Have predictable volume patterns

**Good candidates:**
- Fact tables (orders, events, transactions)
- Staging tables from ETL pipelines
- Aggregation tables

**Skip for now:**
- Dimension tables (change infrequently)
- Archive tables
- Temporary/scratch tables

## Troubleshooting

### Still in 'Learning' after a week

**Check these:**
1. Is the table receiving new data?
2. Is the check interval appropriate for your data frequency?
3. View the history tab to see if captures are running

### Too many alerts

**Reduce noise:**
1. Raise the sensitivity value (3 or 4 = fewer alerts); a higher value widens the expected range
2. Adjust time window to match your data patterns
3. Consider if this table has irregular patterns

### Not getting alerts when expected

**Verify:**
1. Learning phase is complete
2. Alert rules are configured for completeness events
3. Check the history tab to see detected anomalies

## Common Questions

### How long does AnomalyArmor's anomaly detection take to start catching issues?

Usually 7-14 days of learning before anomaly detection is meaningful. During learning, AnomalyArmor records values without firing alerts so it can establish a baseline. You'll see a **Learning** status on the metric until the baseline is stable.

### Why is my metric stuck in 'learning' status?

Learning needs enough historical data points to build a baseline - typically 14 daily samples or 30+ hourly samples. If your table is new or only sampled sporadically, learning takes longer. Reduce the sampling interval temporarily to accelerate, then relax it once anomaly detection is active.

### Can I use AnomalyArmor for row-count monitoring on massive tables without full scans?

Yes - AnomalyArmor uses `COUNT(*)` only. Many cloud warehouses (for example Snowflake and BigQuery) answer an unfiltered `COUNT(*)` from table metadata without scanning rows, while row-store databases such as PostgreSQL typically scan the table or an index. For very large tables where `COUNT(*)` is expensive, you can monitor a partitioned slice (e.g., `created_at >= NOW() - INTERVAL '1 day'`) as a lightweight proxy.
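
To make the partitioned-slice idea concrete, here is a minimal sketch using Python's built-in `sqlite3` with an in-memory table. The `events` table, its columns, and the `count_recent_rows` helper are hypothetical stand-ins; a real warehouse would use its own date syntax (such as the `NOW() - INTERVAL '1 day'` form above) rather than a literal cutoff.

```python
import sqlite3

# Hypothetical table standing in for a large fact table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, created_at TEXT)")
conn.executemany(
    "INSERT INTO events (created_at) VALUES (?)",
    [("2024-05-01 08:00:00",), ("2024-05-02 09:30:00",), ("2024-05-02 23:15:00",)],
)


def count_recent_rows(connection: sqlite3.Connection, cutoff: str) -> int:
    """Count only rows at or after the cutoff instead of the whole table."""
    cursor = connection.execute(
        "SELECT COUNT(*) FROM events WHERE created_at >= ?", (cutoff,)
    )
    return cursor.fetchone()[0]


# Only the slice from 2024-05-02 onward is counted.
recent = count_recent_rows(conn, "2024-05-02 00:00:00")
```

If the filter column is a partition or index key, the database can usually skip older data, which is what keeps the proxy cheap.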

### What's the difference between row-count monitoring and freshness monitoring?

Row counts detect volume anomalies (sudden drops, spikes) regardless of whether new rows are landing. Freshness detects staleness - was the most recent row added within your SLA? Use both: freshness alerts when the pipeline stops, row counts alert when it runs but produces wrong amounts.

## What's Next

- **[Column Metrics](/data-quality/metrics)**: Track null percentages, distinct counts, and more
- **[Freshness Monitoring](/data-quality/freshness-monitoring)**: Ensure tables are updated on schedule

---

<!-- Source: https://docs.anomalyarmor.ai/concepts/overview -->

# Core Concepts

> The building blocks of data observability

Data observability answers a simple question: **Can I trust this data?**

When a dashboard shows unexpected numbers, you need to know if it's a real business trend or a broken pipeline. When an executive asks about yesterday's revenue, you need confidence that the data is fresh and complete. Data observability gives you that confidence.

## The Problem

Data pipelines fail silently. Unlike application errors that crash loudly, data problems often go unnoticed until someone makes a bad decision:

| Scenario | What Happened | The Cost |
|----------|---------------|----------|
| Marketing spends $50K on wrong audience | Pipeline dropped demographic column | Wasted ad spend, wrong targeting |
| CEO quotes wrong revenue in earnings call | ETL job failed, dashboard showed stale data | Stock price impact, credibility loss |
| Product team ships feature to wrong segment | Upstream table had schema change | Development time wasted, wrong launch |

Data observability prevents these scenarios by monitoring your data like you monitor your applications.

*Reactive vs proactive: catching issues before they cause damage*

## The Building Blocks

AnomalyArmor monitors data through these interconnected concepts:

- **[Assets](/concepts/assets)**: Tables, views, and other data objects that AnomalyArmor monitors
- **[Discovery](/concepts/discovery)**: How AnomalyArmor finds and catalogs your data assets
- **[Schema Changes](/concepts/schema-changes)**: Detecting and tracking structural changes to your data
- **[Freshness](/concepts/freshness)**: Monitoring when your data was last updated
- **[Metrics](/concepts/metrics)**: Tracking statistical properties and detecting anomalies
- **[Alerts](/concepts/alerts)**: How you get notified when issues occur
- **[Report Badges](/concepts/badges)**: Embedded status indicators for dashboards and docs
- **[Intelligence](/concepts/intelligence)**: AI-powered search and documentation
- **[Tagging](/concepts/tagging)**: Classifying and organizing your assets

## How They Work Together

**Discovery** scans your databases and catalogs **Assets** (tables, views, columns). Once cataloged, AnomalyArmor monitors each asset for:

- **Schema changes**: Columns added, removed, or type-changed
- **Freshness violations**: Data not updated within your SLA
- **Metric anomalies**: Unexpected changes in row counts, null rates, or distributions

When issues occur, **Alerts** notify your team through Slack, PagerDuty, email, or webhooks.

**Report Badges** embed this status directly in your dashboards and documentation, so consumers always know if data is trustworthy.

**Intelligence** helps you explore your catalog with natural language, and **Tagging** organizes assets for compliance and governance.

## Quick Reference

| Concept | Question It Answers | Example Alert |
|---------|---------------------|---------------|
| Asset | What data do I have? | (Cataloging, no alert) |
| Discovery | What changed since last scan? | "New table detected: staging.orders_v2" |
| Schema Change | Did the structure change? | "Column removed: orders.shipping_status" |
| Freshness | Is data up to date? | "orders table is 4 hours stale" |
| Metric | Is data quality normal? | "Row count dropped 60% from yesterday" |
| Alert | Who needs to know? | (Routes to Slack, PagerDuty, etc.) |
| Report Badge | Can consumers trust this? | (Visual indicator on dashboards) |
| Intelligence | Where is X data? | (AI-powered search result) |
| Tagging | What category is this? | (Classification: PII, production, etc.) |

## Common Questions

### What is data observability and how is it different from data quality?

Data observability is continuous visibility into data health: schema, freshness, row counts, distributions. Data quality is often a one-time or batch check (is this value in the expected set?). Observability catches problems *as they happen* - including ones you didn't predict - while quality checks validate specific known rules. AnomalyArmor does both: observability by default, rule-based quality checks layered on top.

### Do I need to configure every monitoring concept manually?

No. Schema drift and freshness are automatic once discovery runs - no per-table setup. Metrics and custom rules are opt-in where you want them. Most customers get value from the automatic monitoring on day one and add explicit metrics to their most critical tables over the first week.

### How does AnomalyArmor decide what's anomalous without me defining thresholds?

For metrics, AnomalyArmor builds a statistical baseline from 7-14 days of historical values and flags new readings outside that baseline's expected range. For schema changes, "anomalous" is literal - the structure differs from the previous discovery. For freshness, it learns your typical update cadence and alerts when a table misses it.

### Can I start with just one concept and add others later?

Yes. The most common starting point is schema drift + freshness on production databases - zero-config, high signal. Add metrics and report badges once the alerting cadence is calibrated. Tagging and compliance usually follow.

## Next Steps

- **[Connect Your Database](/quickstart/connect-first-database)**: Start monitoring in under 15 minutes
- **[Explore Assets](/concepts/assets)**: Understand the foundation of data observability

---

<!-- Source: https://docs.anomalyarmor.ai/concepts/assets -->

# Assets

> Tables, views, and other data objects that AnomalyArmor monitors

An **asset** is any data object that AnomalyArmor discovers and monitors. When you connect a database, we catalog all tables, views, and their columns.

## Asset Hierarchy

Assets are organized in a hierarchy that mirrors your database structure: data source, then schema, then table or view, then column.

For example: `snowflake.analytics.orders` represents the `orders` table in the `analytics` schema of your Snowflake connection.
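
The dotted name in that example can be split into its parts with a few lines of Python. This is only a sketch: the `AssetName` class is hypothetical and assumes exactly three dot-separated components with no dots inside the individual names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssetName:
    """Hypothetical container for a 'source.schema.table' qualified name."""

    data_source: str
    schema: str
    table: str

    @classmethod
    def parse(cls, qualified_name: str) -> "AssetName":
        parts = qualified_name.split(".")
        # Assumption: exactly three non-empty components.
        if len(parts) != 3 or not all(parts):
            raise ValueError(f"expected 'source.schema.table', got {qualified_name!r}")
        return cls(*parts)


asset = AssetName.parse("snowflake.analytics.orders")
```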

## What We Track

Each asset stores:

| Property | Description | Use For |
|----------|-------------|---------|
| **Schema** | Column names, types, constraints | Drift detection |
| **Freshness** | Last update timestamp | SLA monitoring |
| **Metrics** | Row counts, null percentages | Data quality |
| **Tags** | Auto and custom classifications | Organization |
| **History** | All changes over time | Debugging |

## Asset Lifecycle

Assets move through states as you use AnomalyArmor:

1. **Discovered**: Found during discovery, no monitoring configured
2. **Monitored**: Has freshness SLAs, metrics, or alert rules attached
3. **Archived**: Table no longer exists in database (kept for history)

## Finding Assets

### In the UI

The **Assets** page shows your full catalog. Use filters to narrow it:
- **Data Source**: Filter by database connection
- **Schema**: Filter by namespace
- **Tags**: Filter by classification (e.g., `pii:email`)
- **Status**: Filter by freshness state

### Via API

```python
from anomalyarmor import Client

# Create an API client
client = Client()

# List up to 100 assets in the `analytics` schema
assets = client.assets.list(schema="analytics", limit=100)
```

### Via Intelligence

Ask in plain English: "Where is customer data?" or "Show me tables updated today."

## Asset Details

Click any asset to see:

- **Overview**: Basic info, tags, description
- **Schema**: Current columns with types
- **History**: All schema changes over time
- **Freshness**: Update timeline and SLA status
- **Metrics**: Configured data quality checks

## Common Questions

### What counts as an asset in AnomalyArmor?

Any table or view that AnomalyArmor discovers in your connected databases. System catalogs and internal tables are excluded. Materialized views count as assets; temporary tables don't.

### What happens to an asset when the underlying table is dropped?

The asset moves to **Archived** state - removed from active monitoring but retained for history (schema snapshots, freshness timeline, past alerts). You can restore an asset if the table comes back, or delete it permanently to free a monitored-table slot.

### Can I monitor views the same way I monitor tables?

Yes. Views are first-class assets. Schema drift, freshness (if the view has a timestamp column), and metrics all work. Note that freshness on views reflects the underlying base table's update, not the view's definition.

### How do I find assets across multiple databases?

The **Assets** page aggregates every asset across all connected data sources. Filter by data source, schema, tag, or freshness status. You can also use the API (`client.assets.list()`) or ask in natural language via Intelligence ("show me tables with PII tags updated today").

## Next Steps

- **[Set Up Discovery](/schema-monitoring/asset-discovery)**: Configure how AnomalyArmor finds your assets
- **[Connect a Data Source](/data-sources/overview)**: Add your first database to start discovering assets

---

<!-- Source: https://docs.anomalyarmor.ai/concepts/discovery -->

# Discovery

> How AnomalyArmor finds and catalogs your data assets

**Discovery** is the process of scanning your databases to find and catalog data assets. AnomalyArmor queries system catalogs (not your data) to build an inventory of tables, views, and columns. Discovery can run on a schedule or on-demand.

On each run, discovery compares current state against previous state to detect new assets, removed assets, and schema changes.
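
The comparison step can be pictured as a diff between two catalog snapshots. The sketch below is an assumption about the general technique, not AnomalyArmor's code: each snapshot is a plain dictionary mapping a table name to its `{column: type}` mapping, and `diff_snapshots` is a hypothetical helper.

```python
Snapshot = dict[str, dict[str, str]]  # table name -> {column name: column type}


def diff_snapshots(previous: Snapshot, current: Snapshot) -> dict[str, list[str]]:
    """Compare two catalog snapshots and report what changed between them."""
    new_assets = sorted(current.keys() - previous.keys())
    removed_assets = sorted(previous.keys() - current.keys())
    # Tables present in both runs whose column definitions differ.
    schema_changes = sorted(
        name
        for name in current.keys() & previous.keys()
        if current[name] != previous[name]
    )
    return {
        "new_assets": new_assets,
        "removed_assets": removed_assets,
        "schema_changes": schema_changes,
    }


previous_run = {
    "analytics.orders": {"id": "integer", "shipping_status": "text"},
    "analytics.legacy": {"id": "integer"},
}
current_run = {
    "analytics.orders": {"id": "integer"},
    "staging.orders_v2": {"id": "integer"},
}
changes = diff_snapshots(previous_run, current_run)
```

Because only metadata is compared, the cost of the diff depends on the number of tables and columns, not on the volume of data in them.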

## How Discovery Works

## Discovery Types

**Initial Discovery** runs when you first connect a data source, performing a full catalog scan. **Incremental Discovery** runs on your configured schedule, checking only for changes since the last run.

## Common Questions

### What's the difference between initial and incremental discovery?

Initial discovery is a full catalog scan that runs once when you connect a database. Incremental discovery runs on your configured schedule (hourly, daily, or custom) and only records what changed since the last run - new assets, removed assets, schema diffs. Incremental is fast even on large databases because it queries the same metadata but diffs against the previous snapshot.

### Can discovery be triggered manually?

Yes. Click **Run Discovery** on the connection's detail page to kick off an on-demand scan. This is useful after a deliberate schema change (migration, table addition) when you want the catalog to reflect the new state immediately instead of waiting for the next scheduled run.

### Does discovery pull my actual data?

No. Discovery queries only system catalogs (`information_schema`, `pg_catalog`, or equivalent) to read names, types, and metadata. It never runs `SELECT *` or reads row values. The [Query Gateway](/security/query-gateway) enforces this.

### Why didn't a newly created table appear after I ran discovery?

Three common causes: (1) the table is in a schema excluded by your discovery include/exclude patterns; (2) your read-only database user doesn't have `USAGE` on that schema or doesn't see it in the system catalogs; (3) some databases delay catalog visibility for a few seconds after DDL. Check the connection's discovery include list first.
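
For the first cause, it can help to reason about the patterns offline. The sketch below assumes glob-style patterns and "include, then exclude" semantics; the actual pattern syntax and precedence in AnomalyArmor may differ, and `schema_is_scanned` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def schema_is_scanned(schema: str, include: list[str], exclude: list[str]) -> bool:
    """Assumed semantics: a schema must match an include pattern and no exclude pattern."""
    included = any(fnmatchcase(schema, pattern) for pattern in include)
    excluded = any(fnmatchcase(schema, pattern) for pattern in exclude)
    return included and not excluded


# A scratch schema is filtered out by the exclude pattern.
scanned = schema_is_scanned("analytics", include=["*"], exclude=["tmp_*"])
skipped = schema_is_scanned("tmp_scratch", include=["*"], exclude=["tmp_*"])
```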

## Next Steps

- **[Configure Discovery](/schema-monitoring/asset-discovery)**: Set up schedules and discovery options
- **[Connect a Data Source](/data-sources/overview)**: Add databases for discovery to scan

---

<!-- Source: https://docs.anomalyarmor.ai/concepts/schema-changes -->

# Schema Changes

> Detecting and tracking structural changes to your data assets

A **schema change** is a single structural modification to your data - a column added, removed, or renamed; a data type changed; a table created or dropped. Each change is an individual record detected by [schema drift monitoring](/schema-monitoring/schema-drift), the feature that compares successive [asset discoveries](/schema-monitoring/asset-discovery) to find structural differences over time. These modifications can break downstream systems like dbt models, dashboards, and reports if not caught early.

AnomalyArmor detects changes by comparing discovery runs. When a column disappears or a type changes, you get alerted before your pipelines fail.

## Change Types

| Change | Risk Level | Impact |
|--------|------------|--------|
| Column removed | High | Breaks queries referencing it |
| Table removed | High | Breaks all dependent queries |
| Column renamed | High | Old name no longer works |
| Type changed | Medium | May affect aggregations |
| Column added | Low | Usually safe |
| Table added | Low | Informational |
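
As an illustration of how column-level changes might be derived and ranked, the sketch below diffs two `{column: type}` mappings and attaches the risk levels from the table above. The change-type labels and the `diff_columns` helper are hypothetical. Note that a rename shows up here as a removal plus an addition, since a plain diff has no rename history to work from.

```python
# Risk levels copied from the table above; the keys are illustrative labels.
RISK_BY_CHANGE = {
    "column_removed": "high",
    "type_changed": "medium",
    "column_added": "low",
}


def diff_columns(previous: dict[str, str], current: dict[str, str]) -> list[tuple[str, str, str]]:
    """Return (change_type, column, risk) tuples for one table."""
    changes = []
    for column in sorted(previous.keys() - current.keys()):
        changes.append(("column_removed", column, RISK_BY_CHANGE["column_removed"]))
    for column in sorted(current.keys() - previous.keys()):
        changes.append(("column_added", column, RISK_BY_CHANGE["column_added"]))
    for column in sorted(previous.keys() & current.keys()):
        if previous[column] != current[column]:
            changes.append(("type_changed", column, RISK_BY_CHANGE["type_changed"]))
    return changes


before = {"id": "integer", "shipping_status": "text", "amount": "integer"}
after = {"id": "integer", "amount": "numeric", "carrier": "text"}
column_changes = diff_columns(before, after)
```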

## Common Questions

### What kinds of schema changes does AnomalyArmor detect?

Column added, removed, renamed, or type-changed; nullability changes; default-value changes; constraint changes (primary key, unique, foreign key); new and dropped tables and views. The full list is tracked at [schema drift](/schema-monitoring/schema-drift). AnomalyArmor does not diff stored procedures, triggers, or database-engine-level objects.

### How does AnomalyArmor detect a rename versus a drop-and-add?

By default it's conservative: a column disappearing followed by one appearing with the same type is logged as drop+add, not rename. Most database catalogs don't preserve rename history, so we can't always distinguish. If you run a deliberate rename, annotate the alert so it doesn't page your team twice.

### Can I acknowledge schema changes so they stop alerting?

Yes. Each change has an **Acknowledge** action that marks it as reviewed and suppresses downstream notifications on that specific change. The change is retained in history for audit - acknowledging doesn't delete it. See [alert best practices](/alerts/best-practices) for patterns.

### How fast does AnomalyArmor detect a schema change after it happens?

Detection latency equals your discovery cadence. Default daily discovery means up to 24-hour detection; hourly discovery detects within the hour. For databases where schema changes happen frequently (active migration projects), increase the cadence to hourly or run manual discovery immediately after known deploys.

## Next Steps

- **[View Change History](/schema-monitoring/schema-drift)**: See what changed and when
- **[Set Up Alerts](/alerts/alert-rules)**: Get notified of breaking changes

---

<!-- Source: https://docs.anomalyarmor.ai/concepts/freshness -->

# Freshness

> Monitor when your data was last updated

**Freshness** measures how recently data in a table was updated. Stale data can be as damaging as wrong data. If your dashboard shows yesterday's numbers, decisions made today could be wrong.

AnomalyArmor tracks freshness by monitoring timestamp columns. Set SLAs defining how old data can be, and get alerted when data is stale.

## Why Freshness Matters

Data freshness is a critical indicator of data pipeline health. Here's why monitoring freshness prevents costly problems:

### Catch Pipeline Failures Early

When ETL jobs fail silently, you might not know until someone reports a problem. Freshness monitoring detects the issue immediately when expected data doesn't arrive.

**Real scenario**: Your nightly sales import fails at 2 AM. Without freshness monitoring, your morning reports show yesterday's data and your team makes decisions on outdated information. With freshness monitoring, you get alerted at 2:05 AM and can investigate before business hours.

### Prevent Downstream Cascade Failures

Modern data stacks have dependencies. When upstream data goes stale, it can cause a cascade of failures downstream. Freshness monitoring acts as an early warning system.

**Example**: Your `raw_events` table feeds into `sessions`, which feeds into `user_analytics`. If `raw_events` stops updating, freshness alerts catch it before derived tables produce incorrect aggregations.

### Meet Business SLAs

Different tables have different freshness requirements. Customer-facing dashboards might need real-time data, while monthly reports can tolerate delays. Freshness monitoring lets you codify these expectations.

| Use Case | Typical SLA | Impact if Stale |
|----------|-------------|-----------------|
| Real-time dashboards | < 5 minutes | Customer complaints, lost revenue |
| Daily reporting | < 2 hours | Delayed decisions, missed opportunities |
| Weekly analytics | < 24 hours | Inaccurate trend analysis |
| Monthly aggregates | < 7 days | Incorrect billing, compliance issues |

### Detect Data Quality Issues

Freshness problems often signal deeper issues. If a table that usually updates every hour hasn't updated in 12 hours, something is broken in your pipeline.

**What stale data reveals**:
- Source system failures
- Network connectivity issues
- Permission problems
- Schema changes breaking queries
- Resource exhaustion (disk, memory, connections)

### Reduce Mean Time to Detection (MTTD)

Without freshness monitoring, you discover data problems when users report them. With automated freshness checks, you detect issues minutes after they occur instead of hours or days later.

**Impact on MTTD**:
- **Without monitoring**: 4-48 hours (user reports issue)
- **With monitoring**: 5-15 minutes (automated alert)
- **Result**: detection time drops by roughly 94% to over 99%, depending on where in each range you land

## How It Works

1. You specify a timestamp column (e.g., `created_at`, `updated_at`)
2. AnomalyArmor queries `MAX(timestamp_column)` on your schedule
3. If the age of the latest timestamp (current time minus `MAX(timestamp_column)`) exceeds your SLA threshold, an alert fires
4. Alerts can route to Slack, email, PagerDuty, or webhooks
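
The core comparison in these steps can be sketched in a few lines. The `is_stale` helper is hypothetical, and the `latest` value stands in for whatever `MAX(timestamp_column)` returns; both timestamps are assumed to be timezone-aware.

```python
from datetime import datetime, timedelta, timezone


def is_stale(latest_timestamp: datetime, sla: timedelta, now: datetime) -> bool:
    """Return True when the newest row is older than the SLA allows."""
    age = now - latest_timestamp
    return age > sla


now = datetime(2024, 5, 2, 12, 0, tzinfo=timezone.utc)
latest = datetime(2024, 5, 2, 7, 30, tzinfo=timezone.utc)  # stand-in for MAX(updated_at)

# The newest row is 4.5 hours old, which violates a 2-hour SLA.
stale = is_stale(latest, sla=timedelta(hours=2), now=now)
```

Passing `now` explicitly keeps the check deterministic and easy to test.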

### Auto-Learning Freshness Baselines

AnomalyArmor can learn your table's update patterns automatically. Instead of manually setting SLAs, enable auto-learning and the system will:

1. Observe your table's update frequency over time
2. Calculate typical update intervals and variance
3. Set dynamic thresholds based on historical patterns
4. Alert only when updates deviate from the learned baseline

This is especially useful for tables with irregular update patterns or tables you're just starting to monitor.
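
One simple way to turn observed update times into a dynamic threshold is "mean interval plus a multiple of the standard deviation". The sketch below uses that approach as an assumption for illustration; the actual model AnomalyArmor uses is not described here, and `learned_threshold` and its `tolerance` parameter are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import mean, stdev


def learned_threshold(update_times: list[datetime], tolerance: float = 3.0) -> timedelta:
    """Derive a staleness threshold as mean gap + tolerance * standard deviation of gaps."""
    if len(update_times) < 3:
        raise ValueError("need at least three updates to estimate variance")
    ordered = sorted(update_times)
    # Seconds between consecutive updates.
    gaps = [(later - earlier).total_seconds() for earlier, later in zip(ordered, ordered[1:])]
    return timedelta(seconds=mean(gaps) + tolerance * stdev(gaps))


observed = [
    datetime(2024, 5, 2, 0, 0),
    datetime(2024, 5, 2, 1, 0),
    datetime(2024, 5, 2, 2, 10),
    datetime(2024, 5, 2, 3, 0),
]
threshold = learned_threshold(observed)
```

A table that updates on a steady cadence gets a threshold close to its usual interval, while a table with erratic gaps gets a wider one.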

### Handling Complex Update Patterns

Some tables have nuanced freshness requirements:

**Business hours only**: Your CRM sync runs 9 AM to 6 PM. Configure freshness checks to only alert during business hours, avoiding false alerts at night.

**Weekly batches**: A table updates every Monday at 3 AM. Set a weekly schedule that expects updates once per week, not daily.

**Time zone considerations**: Your `created_at` timestamps are in UTC but your business operates in PST. AnomalyArmor handles time zone conversions automatically.

**Multiple sources**: If one table receives data from multiple sources with different frequencies, you can monitor multiple freshness columns or use separate freshness schedules.
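
The business-hours case combines a time zone conversion with a window check. The sketch below is illustrative only: it uses a fixed UTC-8 offset as a stand-in for Pacific time (so it ignores daylight saving), and `in_alert_window` is a hypothetical helper rather than a product setting.

```python
from datetime import datetime, time, timedelta, timezone

# Assumption: fixed offset for Pacific Standard Time; ignores daylight saving.
PACIFIC_STANDARD = timezone(timedelta(hours=-8))


def in_alert_window(
    moment_utc: datetime,
    start: time = time(9, 0),
    end: time = time(18, 0),
    business_tz: timezone = PACIFIC_STANDARD,
) -> bool:
    """Return True when a UTC moment falls inside the local business-hours window."""
    local = moment_utc.astimezone(business_tz)
    return start <= local.time() < end


# 20:00 UTC is noon in UTC-8, so alerts are allowed; 10:00 UTC is 2 AM local.
midday = in_alert_window(datetime(2024, 5, 2, 20, 0, tzinfo=timezone.utc))
overnight = in_alert_window(datetime(2024, 5, 2, 10, 0, tzinfo=timezone.utc))
```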

## Common Questions

### Which column should I use for freshness monitoring?

Use the column that's written every time a row is inserted or meaningfully updated - commonly `updated_at`, `created_at`, `event_time`, or `last_modified`. Avoid columns that only reflect specific lifecycle events (e.g., `deleted_at`). If your table has no reliable timestamp, freshness isn't a good fit; track it via row-count deltas instead.

### How do I configure freshness for tables that only update on business hours or weekly?

Set a schedule on the freshness check itself: business-hours window, weekly cadence, or a cron expression. Alerts fire only when the table violates the configured schedule, so overnight or weekend gaps don't page on-call by mistake.

### What's auto-learning freshness and when should I use it?

Auto-learning watches your table's update pattern for 1-2 weeks and sets a dynamic threshold based on observed variance. Use it for tables with irregular cadence where a fixed SLA would be wrong (analytics jobs, batch loads with variable completion times). Use a fixed SLA for tables where the business-contract freshness is an absolute (hourly sync, real-time dashboard).

### Does freshness monitoring work if my table's timestamp is in a different time zone than my business?

Yes. AnomalyArmor normalizes timestamps to UTC internally; alerting respects the time zone configured on your workspace. You don't need to manually convert - just configure the business time zone once in workspace settings.

### Can I monitor freshness on a view or a derived table?

Yes - as long as the view exposes a reliable timestamp column. Note that the freshness reflects the view's output (which depends on its underlying tables), so a stale view can indicate either an upstream issue or a materialization lag. Combine with freshness on the base table to distinguish.

## Next Steps

- **[Set Freshness SLAs](/data-quality/freshness-monitoring)**: Define how fresh your data should be
- **[Set Up Alerts](/alerts/alert-rules)**: Get notified when data goes stale

---

<!-- Source: https://docs.anomalyarmor.ai/concepts/metrics -->

# Metrics

> Track statistical properties of your data and detect anomalies

Metrics are statistical measurements that track properties of your data over time. AnomalyArmor captures metrics on a schedule, builds baselines, and alerts you when values fall outside expected ranges.

## What Metrics Track

| Metric Type | What It Measures | Example Use Case |
|-------------|------------------|------------------|
| `row_count` | Total rows in a table | Detect data loss or unexpected growth |
| `null_percent` | Percentage of null values | Catch ETL issues leaving nulls |
| `distinct_count` | Unique values in a column | Detect cardinality changes |
| `duplicate_count` | Duplicate values | Find unexpected duplicates |
| `min_value` | Minimum numeric value | Catch invalid data (negative prices) |
| `max_value` | Maximum numeric value | Detect outliers |
| `mean` | Average numeric value | Monitor central tendency |

## How Anomaly Detection Works

1. **Baseline building**: Historical values establish what's "normal"
2. **Z-score calculation**: Each new value is expressed as its distance from the baseline mean, measured in standard deviations
3. **Sensitivity threshold**: Values exceeding the threshold trigger alerts

A sensitivity of 2.0 means values more than 2 standard deviations from the mean are flagged as anomalies.
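
The three steps can be sketched with Python's `statistics` module. This is a minimal illustration under stated assumptions: it uses the sample standard deviation and a plain list as the baseline, and the `z_score` and `is_anomaly` helpers are hypothetical names, not the product's implementation.

```python
from statistics import mean, stdev


def z_score(history: list[float], value: float) -> float:
    """Distance of `value` from the baseline mean, in standard deviations."""
    baseline_mean = mean(history)
    baseline_std = stdev(history)  # assumption: sample standard deviation
    if baseline_std == 0:
        # A perfectly flat baseline: any deviation at all is infinitely unusual.
        return 0.0 if value == baseline_mean else float("inf")
    return (value - baseline_mean) / baseline_std


def is_anomaly(history: list[float], value: float, sensitivity: float = 2.0) -> bool:
    """Flag values whose absolute z-score exceeds the sensitivity threshold."""
    return abs(z_score(history, value)) > sensitivity


row_counts = [1000, 1020, 980, 1010, 990, 1005, 995]
sudden_drop = is_anomaly(row_counts, 400)    # far below the baseline
normal_day = is_anomaly(row_counts, 1010)    # within normal variation
```

Raising `sensitivity` widens the band of values treated as normal, which is why higher values produce fewer alerts.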

## Common Questions

### How is a metric different from a validity rule?

A metric captures a numeric property over time (row count, null percentage, distinct count) and flags statistical anomalies against a learned baseline. A validity rule checks a deterministic condition on current data (value in allowed set, format match, range bounds). Use metrics to catch "this number changed suspiciously"; use validity rules to catch "this data broke a known contract".

### What's the default sensitivity for anomaly detection, and when should I change it?

Default is 2.0 standard deviations - for normally distributed values, roughly 95% of readings fall inside that band and stay silent, while outliers alert. Raise to 3.0 for noisy metrics where 2σ produces false alerts; lower to 1.5 for business-critical metrics where you'd rather investigate a false positive than miss a real anomaly. Tune per-metric, not globally.

### How much history does AnomalyArmor need before metric anomaly detection works?

Typically 7-14 days of captures at your chosen sampling interval. During this "learning" phase, values are recorded but no alerts fire. You can shorten the wait by temporarily reducing the sample interval, then relaxing it once the baseline is established.

### Can I define a custom metric that isn't in the built-in list?

Yes, via [custom SQL monitoring](/data-quality/custom-sql-monitoring). Write a `SELECT` that returns a single numeric column and AnomalyArmor treats the result as a metric - baseline learning, anomaly detection, and alerting all work the same way as built-in metrics.

## Next Steps

- **[Create a Metric](/data-quality/metrics)**: Set up your first data quality metric
- **[Set Up Alerts](/alerts/alert-rules)**: Get notified when anomalies are detected

---

<!-- Source: https://docs.anomalyarmor.ai/concepts/alerts -->

# Alerts

> How you get notified when issues occur

**Alerts** notify you when something needs attention. Whether it's a schema change, stale data, or a failed discovery job, alerts ensure the right people know at the right time.

Alerts follow a simple pipeline: events are detected during discovery, evaluated against your rules, and routed to destinations like Slack, email, or PagerDuty.

## What Triggers Alerts

| Event Type | Description |
|------------|-------------|
| Schema Change | Column added, removed, or type changed |
| Freshness Violation | Data not updated within SLA |
| Metric Anomaly | Value outside expected range |
| Discovery Failed | Connection or permission error |
| Asset Removed | Table/view no longer exists |

## Common Questions

### What's the difference between an event, a rule, and a destination?

An event is something detected (schema change, freshness violation, metric anomaly). A rule is your filter on events - which assets, which event types, severity thresholds, schedule conditions. A destination is where the notification lands (Slack, email, PagerDuty, webhook). Events flow through rules to destinations.

### Can one event fire multiple alerts to different destinations?

Yes - that's the normal pattern. A production schema drop might page on-call via PagerDuty *and* post to the team Slack channel *and* email the data-platform list. Each is a separate rule with the same event filter but different destinations.
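
The event, rule, and destination relationship can be modeled in a few lines. The sketch below is a simplified assumption of how fan-out works: the `Event` and `Rule` classes, the prefix-based scoping, and the destination names are all hypothetical and only illustrate that one event can match several rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    event_type: str
    asset: str


@dataclass(frozen=True)
class Rule:
    name: str
    event_types: frozenset
    asset_prefix: str  # assumption: empty string means "all assets"
    destination: str

    def matches(self, event: Event) -> bool:
        return event.event_type in self.event_types and event.asset.startswith(self.asset_prefix)


def route(event: Event, rules: list[Rule]) -> list[str]:
    """Return the destination of every rule the event matches, in rule order."""
    return [rule.destination for rule in rules if rule.matches(event)]


rules = [
    Rule("page-on-call", frozenset({"schema_change"}), "prod.", "pagerduty"),
    Rule("team-channel", frozenset({"schema_change"}), "", "slack"),
    Rule("platform-email", frozenset({"discovery_failed"}), "", "email"),
]
destinations = route(Event("schema_change", "prod.orders"), rules)
```

Here the production schema change reaches both the paging rule and the team channel, while the discovery-failure rule stays silent.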

### Do I configure alerts per table, or globally?

Either. Rules can apply to all assets (useful for catching discovery failures across everything) or be scoped to specific schemas, tags, or individual assets. Most teams start with a global "schema drop" rule and add tighter per-critical-table rules over time.

### Can I suppress alerts during a planned migration or deploy?

Yes. Use [blackout windows](/alerts/blackouts) to pause alerting on specific assets or rules for a defined time window. Events are still recorded; they just don't notify until the window ends.
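
The "recorded but not notified" behavior can be sketched as follows. This is a simplified model with hypothetical function names; it treats the window end as exclusive and does not model whether held events are delivered once the window closes.

```python
from datetime import datetime, timezone


def in_blackout(event_time, windows):
    """True if event_time falls inside any (start, end) window; end is exclusive."""
    return any(start <= event_time < end for start, end in windows)


def handle(event_time, windows, log, outbox):
    log.append(event_time)             # the event is always recorded
    if not in_blackout(event_time, windows):
        outbox.append(event_time)      # notify only outside a blackout


# A four-hour maintenance window, expressed in UTC.
window = (
    datetime(2024, 6, 1, 22, 0, tzinfo=timezone.utc),
    datetime(2024, 6, 2, 2, 0, tzinfo=timezone.utc),
)
log, outbox = [], []
handle(datetime(2024, 6, 1, 23, 30, tzinfo=timezone.utc), [window], log, outbox)
handle(datetime(2024, 6, 2, 3, 0, tzinfo=timezone.utc), [window], log, outbox)
```

Both events end up in `log`, but only the one after the window reaches `outbox`.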

## Next Steps

- **[Create Alert Rules](/alerts/alert-rules)**: Define when and where alerts fire
- **[Connect Destinations](/alerts/destinations/slack)**: Set up Slack, email, or other channels

---

<!-- Source: https://docs.anomalyarmor.ai/concepts/badges -->

# Report Badges

> Embedded status indicators for data quality visibility

Report badges display data quality status directly in your existing tools. Instead of switching to AnomalyArmor to check if data is fresh, see a green or red badge right in your dashboard.

## The Problem Badges Solve

Data teams context-switch constantly. Checking data quality in a separate tool breaks focus.

Badges provide **always-on visibility** wherever you work.

## What Badges Monitor

A single badge can track multiple quality signals:

| Check | What It Catches |
|-------|-----------------|
| Schema Drift | Column additions, removals, type changes |
| Freshness | Data not updated within SLA |
| Metrics | Anomalies in row counts, null percentages |

Configure badges to monitor any combination of these checks.

## Where Badges Work

Any tool that supports image embeds:

- **Databricks** - Markdown cells in notebooks
- **Notion / Confluence** - Wiki pages and documentation
- **Looker / Tableau** - Dashboard annotations
- **GitHub** - README files and PR descriptions
- **Slack** - Pinned messages in data channels

## Badge States

| Badge | Meaning |
|-------|---------|
| **Passing** (green) | All monitored tables pass quality checks |
| **Issues** (red) | One or more problems detected |
| **Unknown** (grey) | Badge disabled or configuration error |

## Common Questions

### Where can I embed an AnomalyArmor report badge?

Anywhere that accepts an image URL or Markdown image syntax: Notion, Confluence, Databricks notebooks, Looker / Tableau dashboard annotations, GitHub README files, Slack pinned messages, and most static-site generators. Badges are rendered from a URL, so if the tool shows external images, badges work.

### How quickly does a badge update after an issue is detected?

Within a minute. The badge endpoint reads the current status of the tracked checks on each request, so a failing freshness check shows red as soon as the check fires. Any staleness comes from the embedding tool caching the image (Slack and some wikis do); if that happens, add a cache-busting query parameter to the badge URL.
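
A minimal sketch of cache busting, assuming the badge endpoint ignores query parameters it does not recognize. The badge URL and the parameter name `cb` are placeholders, not documented values.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_cache_buster(badge_url, now=None):
    """Append a changing query parameter so image caches refetch the badge."""
    parts = urlsplit(badge_url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    stamp = int(time.time() if now is None else now)
    query.append(("cb", str(stamp)))  # "cb" is an arbitrary, hypothetical name
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URL; `now` is fixed here only to make the result reproducible.
url = with_cache_buster(
    "https://example.com/badges/abc123.svg?label=orders", now=1700000000
)
```

This only helps where the URL is regenerated on each render (for example, a templated wiki page); a URL pasted once with a fixed value is cached like any other.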

### Can one badge cover multiple tables or checks?

Yes. A badge can aggregate schema drift, freshness, and metric status across any set of tables. If any tracked check fails, the badge turns red. For per-table detail, use multiple badges side-by-side.
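
The aggregation rule amounts to "red if anything fails". A small sketch, with a hypothetical `badge_state` function and result format; mapping an empty set of checks to the unknown state is an assumption made for the example.

```python
def badge_state(check_results):
    """Collapse per-check results into one badge state.

    check_results maps (table, check) -> True (pass) or False (fail).
    """
    if not check_results:
        return "unknown"  # assumption: nothing tracked is shown as unknown
    return "passing" if all(check_results.values()) else "issues"


state = badge_state({
    ("gold.fact_orders", "schema_drift"): True,
    ("gold.fact_orders", "freshness"): False,   # one failing check
    ("gold.dim_customers", "metrics"): True,
})
```

Because a single failure turns the whole badge red, the aggregate cannot tell you which table failed; that is the case for placing per-table badges side by side.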

### Does embedding a badge expose any sensitive information?

No. The badge image shows only a pass/fail color and an optional short label. Clicking the badge can link back to AnomalyArmor for authenticated detail, but the image itself has no table names, row counts, or customer data. Safe to embed in public READMEs.

## Next Steps

- **[Create Your First Badge](/data-quality/report-badges)**: Step-by-step setup guide
- **[Freshness Monitoring](/concepts/freshness)**: Understand what badges monitor

---

<!-- Source: https://docs.anomalyarmor.ai/concepts/intelligence -->

# Intelligence

> AI-powered analysis that connects your monitoring context into actionable answers

**Intelligence** aggregates context from schema drift, freshness, data quality, tags, lineage, and alerts, then lets you ask questions and get AI-generated analysis in plain English.

## How It Works

Intelligence reads your operational metadata, not your data values:

| What We Analyze | What We Never Access |
|----------------|---------------------|
| Table and column names | Actual row data |
| Data types and constraints | Query results |
| Schema change history | Business values |
| Freshness patterns and SLAs | Database credentials |
| Data quality metric trends | |
| Tags and classifications | |

Your data values stay in your database. Intelligence only sees structure and operational metadata.

## What You Can Ask

### Finding Tables

```
"Where is customer data?"
"Show me tables with email columns"
"Which tables were updated today?"
```

### Cross-Domain Diagnosis

```
"Why is my orders dashboard broken?"
"What changed upstream that could cause NULLs in silver_orders?"
"What should I prioritize this morning?"
```

### Tracking Changes

```
"What columns were removed this week?"
"Show me schema drift on gold tables"
"Which tables had changes since the last deploy?"
```

## Example Conversation

**You**: "Why is the revenue dashboard showing errors?"

**Intelligence**: "The column `discount_pct` was removed from `gold.fact_orders` on Tuesday at 4:12 PM. This was caused by an upstream schema change in `staging.orders_raw` at 3:47 PM.

Two downstream views reference this column:
- `analytics.order_summary`
- `reporting.daily_revenue`

Freshness is normal (table is still updating). No data quality anomalies on other columns. The issue is isolated to the removed column."

## AI-Generated Profiles

When you select a table on the Intelligence page, you see an **Object Profile**:

- **Summary and business context**: What this table is for and who uses it
- **Findings**: Issues grouped by severity (Critical, High, Medium, Low), sourced from schema drift, freshness, data quality, and other domains
- **Relationships**: Foreign keys, lineage signals, hub tables
- **Change history**: Recent column additions, removals, and type changes

## Privacy Model

| Question | Answer |
|----------|--------|
| Does Intelligence see my data? | No. Only metadata (names, types, monitoring history). |
| Which AI provider is used? | Configurable per account. Supports multiple providers. |
| Can I disable Intelligence? | Yes. Contact support. |
| Is usage logged? | Yes, for billing and debugging. |

## Common Questions

### Does Intelligence send my data to an LLM provider?

No. Intelligence only sends metadata - table names, column names, types, schema history, freshness stats, tags, alerts - never row values. Your data never leaves your database. This is enforced by the same [Query Gateway](/security/query-gateway) that governs monitoring queries.

### Which AI provider does Intelligence use?

Configurable per account. Intelligence supports multiple providers, so enterprise customers with BAA or regional-data requirements can pick a provider that matches their compliance posture. Contact your account team to change providers.

### Why did Intelligence give me an answer that was out of date?

Intelligence analyzes the metadata snapshot as of the last discovery run plus live monitoring state. If discovery hasn't run since the schema change you're asking about, the answer reflects the pre-change state. Trigger a manual discovery on the relevant connection and re-ask.

### Can I disable Intelligence for compliance or privacy reasons?

Yes. Intelligence can be disabled at the workspace level - contact support to turn it off. Even when enabled, Intelligence only sees metadata, so the privacy delta is small, but some compliance programs require the feature be off by default.

### Can Intelligence write to my database or run queries I didn't approve?

No. Intelligence only reads the metadata AnomalyArmor already collects. It has no write path, no ad-hoc query capability on your data, and no way to touch your database beyond what the Query Gateway already allows for scheduled monitoring.

## Next Steps

- **[Intelligence Overview](/intelligence/overview)**: Learn how monitoring context powers Intelligence
- **[Ask Questions](/intelligence/ask-questions)**: Chat with the AI agent

---

<!-- Source: https://docs.anomalyarmor.ai/concepts/tagging -->

# Tagging

> Classifying and organizing your assets

**Tagging** helps you classify and organize data assets for governance and compliance. Know which tables contain PII before your next audit asks.

AnomalyArmor supports two types of tags: auto-classification that detects PII patterns automatically, and custom tags you create to match your organization's needs.

## Tag Types

| Type | What It Does | Use For |
|------|--------------|---------|
| Auto-classification | Detects PII patterns in column names | Compliance, PII tracking |
| Custom tags | Labels you define and apply | Organization, ownership, status |

## Common Questions

### Does auto-classification read my actual data to detect PII?

No. Auto-classification runs pattern matching against column names, types, and AI-generated profiles - never row values. So a column named `user_email` is classified as PII-email from its name, not from a data scan. See [auto-classification](/data-classification/auto-classification) for the full detection model.
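
Name-based detection can be sketched in a few lines. The patterns below are illustrative only and are not AnomalyArmor's actual rule set; the `classify` function is hypothetical, and the tag names follow the `pii:email` style used elsewhere in these docs.

```python
import re

# Illustrative patterns only; real detection rules are more extensive.
PII_PATTERNS = {
    "pii:email": re.compile(r"(^|_)e?mail(_|$)", re.IGNORECASE),
    "pii:phone": re.compile(r"(^|_)(phone|mobile)(_|$)", re.IGNORECASE),
    "pii:ssn": re.compile(r"(^|_)ssn(_|$)", re.IGNORECASE),
}


def classify(column_name):
    """Return the PII tags whose pattern matches the column name."""
    return sorted(
        tag for tag, pattern in PII_PATTERNS.items()
        if pattern.search(column_name)
    )


tags = classify("user_email")
```

Note that the function receives only the column name. No row values are read, which is also why a generically named column such as `value` yields no tags even if it happens to hold email addresses.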

### How do I create a custom tag for something like 'team ownership'?

Create a tag set in **Settings → Tags** (e.g., `team:growth`, `team:platform`) then apply it to assets manually, via API, or as a rule (tag any asset in a specific schema with a specific team). See [custom tags](/data-classification/custom-tags).

### Can I use tags in alert rules to scope notifications?

Yes. Alert rules support tag-based filtering - route all `pii:email` asset alerts to the security channel, or page only on `tier:critical` tables. This is typically how teams avoid noisy alerting across heterogeneous data.

### Are auto-classification tags guaranteed accurate?

They're high-precision heuristics, not a substitute for your own review. Columns with ambiguous names (`id`, `value`, `data`) are the most likely to be missed or mis-tagged. Treat auto-tags as a starting point for compliance review, not the final word.

## Next Steps

- **[Create Custom Tags](/data-classification/custom-tags)**: Define labels for your organization
- **[Auto-Classification](/data-classification/auto-classification)**: Set up automatic PII detection

---

<!-- Source: https://docs.anomalyarmor.ai/concepts/jobs -->

# Jobs

> Monitor and track background operations on your data assets

Jobs are background operations that AnomalyArmor runs on your data assets. The Jobs page provides observability into all operations, helping you understand what actions have been taken, their status, and troubleshoot any failures.

## Why Jobs Matter

Every time AnomalyArmor interacts with your data assets, it creates a job. This gives you:

- **Audit trail**: Know exactly what operations ran and when
- **Debugging**: Understand why something failed and how to fix it
- **Performance visibility**: See how long operations take
- **Operational awareness**: Track what's happening across your data estate

## Job Types

| Job Type | Description | Triggered By |
|----------|-------------|--------------|
| **Asset Discovery** | Analyzes database structure, tables, columns, and relationships | Manual trigger, scheduled refresh |
| **Intelligence Analysis** | Generates AI documentation for tables and columns | Manual trigger on asset |
| **Tag Generation** | Auto-classifies columns with PII, sensitivity, and domain tags | After asset discovery |
| **Freshness Check** | Monitors when tables were last updated | Scheduled intervals |
| **Metric Capture** | Collects row counts and custom metrics | Scheduled intervals |
| **Lineage Extraction** | Maps data flow between tables | After asset discovery |

## Job Statuses

| Status | Meaning |
|--------|---------|
| **Pending** | Job is queued and waiting to start |
| **Running** | Job is actively executing |
| **Completed** | Job finished successfully |
| **Failed** | Job encountered an error |
| **Cancelled** | Job was manually stopped |

## Using the Jobs Page

### Viewing Jobs

Navigate to **Jobs** in the left sidebar to see all jobs across your organization.

The jobs table shows:
- **Job ID**: Unique identifier (click row to see details)
- **Job Type**: What operation was performed
- **Asset**: Which database or table was affected
- **Status**: Current job state
- **Progress**: Completion percentage for running jobs
- **Duration**: How long the job took
- **Message**: Status message or error description
- **Created**: When the job was started

### Filtering Jobs

Use the tabs to filter by status:
- **All**: Every job regardless of status
- **Active**: Currently running or pending jobs
- **Completed**: Successfully finished jobs
- **Failed**: Jobs that encountered errors

### Job Details

Click any job row to open the details modal, which shows:

- Full job configuration
- Detailed progress information
- Error messages and stack traces (for failed jobs)
- Timing breakdown
- Related asset information

### Job Actions

From the actions menu (three dots), you can:
- **View Details**: Open the full job information modal
- **Go to Asset**: Navigate to the affected asset
- **Go to Job Page**: View results in context (e.g., schema changes page)
- **Cancel Job**: Stop a running job
- **Retry Job**: Re-run a failed job

## Common Scenarios

### Debugging a Failed Job

1. Navigate to **Jobs**
2. Click the **Failed** tab
3. Click the failed job row to open details
4. Review the error message and stack trace
5. Fix the underlying issue (e.g., connection problem, permission error)
6. Click **Retry Job** to re-run

### Monitoring a Long-Running Job

1. Navigate to **Jobs**
2. Click the **Active** tab
3. Watch the progress percentage update in real-time
4. Click the job row for detailed progress information

### Understanding Job History

Use [Intelligence Q&A](/intelligence/overview) to ask questions about job history:

```
"Why did my last job fail?"
"What jobs ran this week?"
"How long does asset discovery usually take?"
"Show me failed jobs for the orders database"
```

## Best Practices

1. **Check jobs after schema changes**: When you modify your database, verify discovery jobs complete successfully
2. **Monitor failed jobs**: Set up alerts for job failures to catch issues early
3. **Review job duration trends**: Sudden increases may indicate database performance issues
4. **Use retry wisely**: If a job fails due to transient issues (network, timeouts), retry is often sufficient

## Common Questions

### Why did my discovery job fail?

Open the job from the **Jobs** page and check the error and stack trace. Most failures are one of: expired database credentials, network unreachability (the AnomalyArmor IPs were removed from your firewall), insufficient permissions on new schemas, or a timeout on an unusually large catalog. Fix the underlying cause and click **Retry Job**.

### How long does a discovery job normally take?

Under 60 seconds for up to 1,000 tables is typical. A sudden increase in duration often indicates a database-side issue (stats out of date, table explosion, replica lag). Track the duration trend on the Jobs page to catch this early.
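
If you export job durations, a simple check against the recent median is enough to flag a sudden increase. This is a sketch under stated assumptions: the function name, the 3x factor, and the five-run minimum are arbitrary choices for illustration, not product behavior.

```python
from statistics import median


def is_duration_spike(history_seconds, latest_seconds, factor=3.0):
    """Flag the latest run if it took more than `factor` times the recent median."""
    if len(history_seconds) < 5:
        return False  # too little history to judge
    return latest_seconds > factor * median(history_seconds)


recent = [40, 42, 38, 45, 41]          # durations of recent discovery runs
spike = is_duration_spike(recent, 150)  # median is 41, threshold is 123
```

The median is used rather than the mean so that one earlier slow run does not inflate the baseline.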

### Can I cancel a long-running job?

Yes. From the job's actions menu, click **Cancel Job**. In-flight queries are stopped, and the job moves to **Cancelled** state. Restarting is safe - the next run picks up from current database state.

### Do failed jobs fire alerts?

Only if you configure an alert rule that matches the `Discovery Failed` event and routes it to a destination. See [alert rules](/alerts/alert-rules). Many teams route job failures to email rather than Slack, since they're often recoverable without paging on-call.

## Related

- [Asset Discovery](/schema-monitoring/asset-discovery) - Learn about discovery jobs
- [Intelligence Q&A](/intelligence/overview) - Ask questions about job history
- [Alerts](/alerts/overview) - Set up notifications for job failures

---

<!-- Source: https://docs.anomalyarmor.ai/data-sources/overview -->

# Data Sources Overview

> Connect AnomalyArmor to your databases and data platforms

Connect once, monitor everything. Add your database credentials and AnomalyArmor continuously monitors for schema changes, stale data, and quality issues, without installing anything in your infrastructure.

Your data stays in your database. We only read metadata (table names, column types, timestamps) through a secure, read-only connection.

## Supported Databases

AnomalyArmor supports the most popular data platforms used by modern data teams:

| Database | Version | Key Features |
|----------|---------|--------------|
| **PostgreSQL** | 12+ | Tables, views, schemas, materialized views |
| **MySQL** | 5.7+ | Tables, views, schemas |
| **SQL Server** | 2012+ | Tables, views, schemas, Azure SQL Database |
| **Amazon Redshift** | Any | Tables, views, external tables (Spectrum) |
| **Databricks** | Unity Catalog | Catalogs, schemas, Delta tables, views |
| **ClickHouse** | 21.8+ | Tables, views, materialized views, dictionaries |

- **[PostgreSQL](/data-sources/postgresql)**: Including RDS, Aurora, Supabase, and self-hosted
- **[MySQL](/data-sources/mysql)**: Including RDS, Aurora MySQL, PlanetScale, and self-hosted
- **[SQL Server](/data-sources/sql-server)**: Including Azure SQL Database and on-premise
- **[Amazon Redshift](/data-sources/redshift)**: Provisioned clusters and Serverless workgroups
- **[Databricks](/data-sources/databricks)**: Unity Catalog with Delta Lake support
- **[ClickHouse](/data-sources/clickhouse)**: Including ClickHouse Cloud

## How Data Sources Work

When you add a data source, AnomalyArmor:

1. **Stores credentials securely**: Encrypted with AES-256
2. **Tests connectivity**: Verifies we can reach your database
3. **Awaits discovery**: No scanning until you trigger it

## What We Access

AnomalyArmor only queries **metadata** from your databases:

| We Access | We Never Access |
|-----------|-----------------|
| `information_schema` | Your actual data |
| System catalogs | Row contents |
| Table/column names | PII or sensitive values |
| Data types | Business data |
| Timestamps (for freshness) | Query results |

> ℹ️ See [Security Overview](/security/overview) for detailed information about our security practices.

## Adding a Data Source

### Quick Steps

1. Navigate to **Data Sources** in the sidebar
2. Click **Add Connection**
3. Select your database type
4. Enter connection credentials
5. Click **Test Connection**
6. Click **Save**

For detailed instructions, see the guide for your specific database:

- [PostgreSQL Setup](/data-sources/postgresql)
- [MySQL Setup](/data-sources/mysql)
- [SQL Server Setup](/data-sources/sql-server)
- [Amazon Redshift Setup](/data-sources/redshift)
- [Databricks Setup](/data-sources/databricks)
- [ClickHouse Setup](/data-sources/clickhouse)

## Managing Data Sources

### Editing Connections

To update a data source:

1. Go to **Data Sources**
2. Click on the connection name
3. Click **Settings**
4. Update credentials or configuration
5. Click **Save**

> ⚠️ Changing credentials will require re-testing the connection. Scheduled discoveries may fail if credentials are incorrect.

### Deleting Connections

To remove a data source:

1. Go to **Data Sources**
2. Click on the connection name
3. Click **Settings** → **Delete Connection**
4. Confirm deletion

**What gets deleted**:
- The connection and credentials
- Discovery schedule
- Associated alert rules (optional)

**What's preserved**:
- Historical schema change data
- Audit logs

### Connection Status

Each data source shows its status:

| Status | Meaning |
|--------|---------|
| **Connected** | Last discovery succeeded |
| **Error** | Connection or permission issue |
| **Never Run** | Discovery hasn't been triggered yet |
| **Running** | Discovery in progress |

## Network Requirements

AnomalyArmor connects outbound to your databases. You'll need to:

### 1. Allow AnomalyArmor IPs

Allowlist our static IP addresses in your firewall or security group:

```
AnomalyArmor IP Addresses:
34.xxx.xxx.xxx/32
34.xxx.xxx.xxx/32
```

> 💡 View current IPs in **Settings → Security** in your dashboard.

### 2. Open Database Port

Ensure the database port is accessible:

| Database | Default Port |
|----------|--------------|
| PostgreSQL | 5432 |
| MySQL | 3306 |
| SQL Server | 1433 |
| Amazon Redshift | 5439 |
| Databricks | 443 (HTTPS) |
| ClickHouse | 8443 (HTTPS) |
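
To check that a port is listening at all, a plain TCP connection attempt is enough. The sketch below uses only the Python standard library; the dictionary keys and the function name are hypothetical.

```python
import socket

# Default ports from the table above.
DEFAULT_PORTS = {
    "postgresql": 5432,
    "mysql": 3306,
    "sqlserver": 1433,
    "redshift": 5439,
    "databricks": 443,
    "clickhouse": 8443,
}


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `port_is_open("db.example.com", DEFAULT_PORTS["postgresql"])` tests reachability from the machine where it runs. A success there does not prove that AnomalyArmor's IPs are allowlisted, since firewall rules are evaluated per source address.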

### 3. SSL/TLS Configuration

We recommend (and often require) encrypted connections:

- **PostgreSQL**: SSL Mode = `require`
- **MySQL**: SSL Mode = `require`
- **SQL Server**: Encryption enabled (required for Azure SQL)
- **Amazon Redshift**: SSL required by default
- **Databricks**: Always HTTPS
- **ClickHouse**: Port 8443 for HTTPS

## Enterprise Options

For enhanced security, Enterprise customers can use:

### VPC Peering

Direct network peering between your AWS VPC and AnomalyArmor:
- No public internet exposure
- Lower latency
- Private IP connectivity

### AWS PrivateLink

Connect via AWS PrivateLink:
- Fully private connectivity
- No firewall changes needed
- Traffic stays on AWS backbone

Contact [sales@anomalyarmor.ai](mailto:sales@anomalyarmor.ai) for Enterprise options.

## Best Practices

### Use Read-Only Credentials

Always create a dedicated, read-only user for AnomalyArmor:

```sql
-- Example for PostgreSQL
CREATE USER anomalyarmor WITH PASSWORD 'secure-password';
GRANT CONNECT ON DATABASE your_db TO anomalyarmor;
GRANT USAGE ON SCHEMA public TO anomalyarmor;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO anomalyarmor;
```

### Use Descriptive Names

Name your data sources clearly:

**Good names**:
- `Production PostgreSQL`
- `Analytics Databricks`
- `Staging ClickHouse`

**Avoid**:
- `db1`
- `test`
- `connection`

### Start with One Environment

Begin with your production database, then expand to staging and development environments once you're comfortable with the setup.

## Troubleshooting

### Connection test fails

1. Verify hostname and port are correct
2. Check credentials are valid
3. Ensure AnomalyArmor IPs are allowlisted
4. Verify SSL/TLS settings match your database

### Discovery finds no tables

1. Verify user has `SELECT` on `information_schema`
2. Check schema filters aren't excluding everything
3. Confirm tables exist in the monitored schemas

### Intermittent connection errors

1. Check database availability and load
2. Verify network stability
3. Consider using a read replica for monitoring

## Common Questions

### What databases does AnomalyArmor support today?

PostgreSQL (12+), MySQL (5.7+), SQL Server (2012+, including Azure SQL), Amazon Redshift, Databricks (with Unity Catalog), and ClickHouse (21.8+). RDS, Aurora, Supabase, PlanetScale, Cloud SQL, and Azure Database are all supported via the compatible engine's connector. Snowflake and BigQuery are in active development.

### Can I connect a database that's behind a firewall or in a private VPC?

Yes. Three options: (1) allowlist AnomalyArmor's static outbound IPs (visible in **Settings → Security**) in your firewall; (2) VPC peering (Enterprise); (3) AWS PrivateLink (Enterprise). SSH tunnel / bastion host is also supported for PostgreSQL and MySQL when direct connectivity is not possible.

### Can I monitor multiple databases from one AnomalyArmor workspace?

Yes - most customers do. Connect as many as you need; assets across all connections appear in a unified catalog. Plan limits are on *monitored tables*, not number of connections.

### Should I point AnomalyArmor at production or a replica?

Either works. A replica has zero impact on primary workload and is the recommended pattern for production-grade deployments. Freshness reflects replica timestamps, so factor in replication lag when setting SLAs.
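
One way to account for lag is to add it to the freshness threshold. A small worked example; the function name and the five-minute buffer are arbitrary illustrations.

```python
from datetime import timedelta


def replica_aware_sla(update_interval, replication_lag,
                      buffer=timedelta(minutes=5)):
    """Freshness threshold for a table monitored through a replica."""
    return update_interval + replication_lag + buffer


# Hourly pipeline, replica typically two minutes behind the primary.
sla = replica_aware_sla(timedelta(hours=1), timedelta(minutes=2))
```

With these inputs the threshold is 1 hour 7 minutes rather than a flat 1 hour, so normal replication delay does not trigger a freshness violation.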

### What happens if my database credentials expire or rotate?

Discovery and monitoring jobs fail and the connection moves to **Error** status. Update credentials under the connection's settings and re-test; in-flight jobs will resume on the next scheduled run. Consider using credential managers (AWS Secrets Manager, Vault) and rotating via Enterprise SSO to reduce manual updates.

## Next Steps

- **[Connect PostgreSQL](/data-sources/postgresql)**: Full guide with RDS, Aurora, and Supabase instructions
- **[Run Discovery](/quickstart/run-first-discovery)**: Scan your database after connecting

---

<!-- Source: https://docs.anomalyarmor.ai/data-sources/postgresql -->

# PostgreSQL

> Connect AnomalyArmor to PostgreSQL databases including RDS, Aurora, and Supabase

Connect AnomalyArmor to any PostgreSQL-compatible database. This guide covers self-hosted PostgreSQL as well as managed services like Amazon RDS, Aurora, and Supabase.

## Supported Versions & Platforms

| Platform | Minimum Version | Notes |
|----------|-----------------|-------|
| PostgreSQL | 12+ | Self-hosted or any cloud |
| Amazon RDS | 12+ | All instance classes |
| Amazon Aurora | PostgreSQL 12+ | Cluster and serverless |
| Supabase | Any | Direct connection or pooler |
| Google Cloud SQL | 12+ | Public or private IP |
| Azure Database | 12+ | Single server or flexible |
| Heroku Postgres | Any | Requires SSL |

## Connection Settings

| Field | Description | Example |
|-------|-------------|---------|
| **Connection Name** | Friendly identifier | `Production PostgreSQL` |
| **Host** | Hostname or IP address | `db.example.com` |
| **Port** | Database port | `5432` |
| **Database** | Database name | `myapp_production` |
| **Username** | Database user | `anomalyarmor` |
| **Password** | User password | `••••••••` |
| **SSL Mode** | SSL configuration | `require` |

### SSL Mode Options

SSL (Secure Sockets Layer) encrypts the connection between AnomalyArmor and your database, preventing eavesdropping on sensitive data like credentials and query results.

**Why use SSL?**
- **Security**: Encrypts all data in transit, protecting against network sniffing
- **Compliance**: Required for SOC2, HIPAA, PCI-DSS, and other security standards
- **Cloud providers**: Most managed databases (RDS, Aurora, Cloud SQL) require or strongly recommend SSL

**When SSL may not be needed:**
- Local development databases on `localhost`
- Databases on a private network with no external access
- Testing environments with non-sensitive data

#### Choosing an SSL Mode

| Mode | Security Level | Description |
|------|----------------|-------------|
| `disable` | None | No encryption. Data sent in plain text. |
| `allow` | Low | Uses SSL only if server requires it. |
| `prefer` | Medium | Tries SSL first, falls back to unencrypted if unavailable. |
| `require` | High | Always uses SSL, but doesn't verify the server certificate. |
| `verify-ca` | Higher | Uses SSL and verifies the server certificate is signed by a trusted CA. |
| `verify-full` | Highest | Uses SSL, verifies CA, and confirms the server hostname matches the certificate. |
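
To try a mode from your own machine before entering it in the connection form, you can build a libpq-style connection URI and pass it to `psql` or another PostgreSQL client. This is a sketch: `postgres_uri` is a hypothetical helper, while `sslmode` and `sslrootcert` are standard libpq connection parameters.

```python
from urllib.parse import quote, urlencode

SSL_MODES = ("disable", "allow", "prefer", "require", "verify-ca", "verify-full")


def postgres_uri(host, database, user, password, port=5432,
                 sslmode="require", sslrootcert=None):
    """Build a libpq-style connection URI with an explicit sslmode."""
    if sslmode not in SSL_MODES:
        raise ValueError(f"unknown sslmode: {sslmode!r}")
    params = {"sslmode": sslmode}
    if sslrootcert is not None:
        # Path to the CA bundle checked by verify-ca / verify-full.
        params["sslrootcert"] = sslrootcert
    # Percent-encode credentials so characters like "@" do not break the URI.
    auth = f"{quote(user, safe='')}:{quote(password, safe='')}"
    return (f"postgresql://{auth}@{host}:{port}/"
            f"{quote(database, safe='')}?{urlencode(params)}")


uri = postgres_uri("db.example.com", "myapp_production", "anomalyarmor", "p@ss")
```

Validating the mode up front catches typos such as `required`, which a client would otherwise reject only at connection time.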

#### Recommendations by Environment

| Environment | Recommended Mode | Reason |
|-------------|------------------|--------|
| **Local development** | `prefer` or `disable` | Convenience for local testing |
| **Cloud databases** (RDS, Aurora, Cloud SQL) | `require` | SSL is available; certificate verification often not needed |
| **Production with compliance** | `verify-ca` or `verify-full` | Maximum security for sensitive data |
| **Heroku, Supabase** | `require` | These platforms require SSL |

> 💡 When in doubt, start with `prefer`. It provides encryption when available without blocking connections if SSL isn't configured on your database.

> ⚠️ Never use `disable` for production databases or any database containing sensitive data.

## SSH Tunnel (Bastion Host)

For databases behind firewalls, AnomalyArmor supports SSH tunnel connections through a bastion host. This is common in enterprise environments where databases are not directly accessible from the internet.

### When to Use SSH Tunnel

- Database is in a private subnet with no public IP
- Firewall rules prevent direct connections
- Security policy requires bastion host access

### SSH Tunnel Settings

Enable **SSH Tunnel** in the connection form to reveal these fields:

| Field | Description | Example |
|-------|-------------|---------|
| **SSH Host** | Bastion server hostname | `bastion.example.com` |
| **SSH Port** | SSH port (usually 22) | `22` |
| **SSH Username** | SSH user on bastion | `ec2-user` |
| **Authentication Method** | Key or Password | `Key` |
| **SSH Private Key** | PEM-formatted private key | `-----BEGIN RSA PRIVATE KEY-----...` |
| **Key Passphrase** (Optional) | For encrypted keys | `••••••••` |
| **SSH Password** | If using password auth | `••••••••` |

### Key-Based Authentication (Recommended)

1. Generate an SSH key pair (or use existing):
   ```bash
   ssh-keygen -t rsa -b 4096 -f anomalyarmor_key
   ```

2. Add the public key to the bastion host's `~/.ssh/authorized_keys`

3. In AnomalyArmor, paste the contents of the private key file or click **Upload Key File**

> 💡 Key-based authentication is more secure and doesn't require password rotation.

### Password Authentication

If your bastion host uses password authentication:

1. Set **Authentication Method** to `Password`
2. Enter the SSH password

> ⚠️ Key-based authentication is more secure than passwords. Use password auth only if key auth is not available.

### Connection Flow with SSH Tunnel

1. AnomalyArmor connects to your bastion host via SSH
2. An encrypted tunnel is established to your database
3. Database traffic flows securely through the tunnel
4. The tunnel closes automatically after each operation
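
You can reproduce the same path by hand to confirm that the bastion can reach the database. The sketch below builds a standard OpenSSH local port-forward command; it is the manual equivalent for testing, not a description of how AnomalyArmor implements its tunnel. The helper name, the local port `15432`, and the example hosts are hypothetical.

```python
def ssh_tunnel_command(bastion_host, ssh_user, db_host, db_port=5432,
                       local_port=15432, ssh_port=22, key_path=None):
    """Build an OpenSSH command that forwards local_port to db_host:db_port."""
    cmd = [
        "ssh",
        "-N",                                        # forwarding only, no remote command
        "-L", f"{local_port}:{db_host}:{db_port}",   # local port forward
        "-p", str(ssh_port),
    ]
    if key_path:
        cmd += ["-i", key_path]                      # private key for key-based auth
    cmd.append(f"{ssh_user}@{bastion_host}")
    return cmd


cmd = ssh_tunnel_command(
    "bastion.example.com", "ec2-user", "db.internal.example.com",
    key_path="anomalyarmor_key",
)
```

Running the resulting command (for example with `subprocess.run(cmd)`) keeps the tunnel open until interrupted; while it is up, a client pointed at `localhost:15432` reaches the database through the bastion.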

## Creating a Read-Only User

Create a dedicated user with minimal permissions.

> 💡 **Quick Setup**: [View the PostgreSQL permissions script](/downloads/postgresql-permissions) for a ready-to-use SQL template with all necessary grants.

```sql
-- Create the user
CREATE USER anomalyarmor WITH PASSWORD 'your-secure-password';

-- Grant connection access
GRANT CONNECT ON DATABASE your_database TO anomalyarmor;

-- Grant schema access (repeat for each schema)
GRANT USAGE ON SCHEMA public TO anomalyarmor;
GRANT USAGE ON SCHEMA analytics TO anomalyarmor;

-- Grant read access to existing tables
GRANT SELECT ON ALL TABLES IN SCHEMA public TO anomalyarmor;
GRANT SELECT ON ALL TABLES IN SCHEMA analytics TO anomalyarmor;

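-- Grant access to future tables (recommended).
-- Default privileges apply only to tables later created by the role that runs
-- these statements; add FOR ROLE <owner> if another role creates the tables.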
ALTER DEFAULT PRIVILEGES IN SCHEMA public
GRANT SELECT ON TABLES TO anomalyarmor;

ALTER DEFAULT PRIVILEGES IN SCHEMA analytics
GRANT SELECT ON TABLES TO anomalyarmor;
```
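
With many schemas, repeating the grants by hand is error-prone. The sketch below generates the same statements for a list of schemas; `grant_statements` is a hypothetical helper, and the output is meant to be reviewed and then run by a privileged user.

```python
def grant_statements(database, schemas, role="anomalyarmor"):
    """Generate the read-only grants shown above for each schema."""

    def ident(name):
        # Quote identifiers, doubling embedded quotes, so odd names stay safe.
        return '"' + name.replace('"', '""') + '"'

    r = ident(role)
    stmts = [f"GRANT CONNECT ON DATABASE {ident(database)} TO {r};"]
    for schema in schemas:
        s = ident(schema)
        stmts += [
            f"GRANT USAGE ON SCHEMA {s} TO {r};",
            f"GRANT SELECT ON ALL TABLES IN SCHEMA {s} TO {r};",
            f"ALTER DEFAULT PRIVILEGES IN SCHEMA {s} GRANT SELECT ON TABLES TO {r};",
        ]
    return stmts


stmts = grant_statements("your_database", ["public", "analytics"])
```

Quoted identifiers are case-sensitive in PostgreSQL, so pass schema and role names exactly as they are stored (lowercase unless they were created with quotes).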

### Verifying Permissions

Test that the user can access metadata:

```sql
-- Should return tables
SELECT table_name FROM information_schema.tables
WHERE table_schema = 'public' LIMIT 5;

-- Should return columns
SELECT column_name, data_type FROM information_schema.columns
WHERE table_schema = 'public' LIMIT 5;
```

## Provider-Specific Instructions

### Amazon RDS PostgreSQL

**Connection Details**:
- **Host**: Your RDS endpoint (e.g., `mydb.abc123.us-east-1.rds.amazonaws.com`)
- **Port**: `5432` (default)
- **SSL Mode**: `require`

**Security Group Configuration**:

1. Go to **AWS Console → RDS → Your Instance → Security Groups**
2. Edit inbound rules
3. Add rule:
   - Type: `PostgreSQL`
   - Port: `5432`
   - Source: AnomalyArmor IPs (see Settings → Security)

**Parameter Group** (if using verify-ca or verify-full):
- Ensure `rds.force_ssl = 1`
- Download RDS CA certificate bundle

> ℹ️ RDS instances in private subnets require NAT Gateway or VPC peering for AnomalyArmor access. Contact us for Enterprise VPC peering options.

### Amazon Aurora PostgreSQL

**Connection Details**:
- **Host**: Cluster endpoint (reader or writer)
- **Port**: `5432` (default)
- **SSL Mode**: `require`

**Choosing the Right Endpoint**:

| Endpoint | Use Case |
|----------|----------|
| **Cluster (writer)** | If you need real-time schema changes |
| **Reader** | Recommended - no impact on production writes |

**Aurora Serverless v2**:
- Use the cluster endpoint
- Ensure minimum ACU allows connections during discovery
- Consider scheduling discovery during active hours

### Supabase PostgreSQL

**Connection Options**:

| Method | Host | Port | When to Use |
|--------|------|------|-------------|
| **Direct** | `db.xxx.supabase.co` | `5432` | Standard setup |
| **Pooler (Transaction)** | `xxx.pooler.supabase.com` | `6543` | High connection limits |
| **Pooler (Session)** | `xxx.pooler.supabase.com` | `5432` | If direct fails |

**Finding Your Credentials**:

1. Go to **Supabase Dashboard → Settings → Database**
2. Copy the connection string or individual fields
3. Use the database password (not the API key)

**SSL Configuration**:
- SSL Mode: `require`
- Supabase enforces SSL by default

> ⚠️ Don't use the Supabase API key as the password. Use the actual database password from Settings → Database.

### Self-Hosted PostgreSQL

**Connection Details**:
- **Host**: Your server's hostname or IP
- **Port**: `5432` (or custom port)
- **SSL Mode**: Depends on your setup

**Firewall Configuration**:

Allow inbound connections from AnomalyArmor IPs:

```bash
# iptables example: add one rule per AnomalyArmor IP (placeholders shown)
iptables -A INPUT -p tcp --dport 5432 -s 34.xxx.xxx.xxx -j ACCEPT
iptables -A INPUT -p tcp --dport 5432 -s 34.xxx.xxx.xxx -j ACCEPT
```

**pg_hba.conf Configuration**:

Add one `hostssl` entry per AnomalyArmor IP:

```
# TYPE  DATABASE  USER           ADDRESS         METHOD
hostssl all       anomalyarmor   34.xxx.xxx.xxx/32  scram-sha-256
hostssl all       anomalyarmor   34.xxx.xxx.xxx/32  scram-sha-256
```
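As a sketch of how the entries above could be produced from the IP list in Settings → Security, the Python snippet below renders one `hostssl` line per address. The helper name `pg_hba_entries` is hypothetical, and the addresses come from the reserved documentation range `203.0.113.0/24`, not from AnomalyArmor.

```python
import ipaddress


def pg_hba_entries(ips, user="anomalyarmor", database="all", method="scram-sha-256"):
    """Build one hostssl line per IP, in the column order shown above."""
    entries = []
    for ip in ips:
        # ip_network() validates the address and appends /32 (or /128 for IPv6).
        network = ipaddress.ip_network(ip)
        entries.append(f"hostssl {database} {user} {network} {method}")
    return entries


# Placeholder addresses; replace with the IPs listed in Settings → Security.
for line in pg_hba_entries(["203.0.113.10", "203.0.113.11"]):
    print(line)
```

After editing `pg_hba.conf`, reload the configuration (for example with `SELECT pg_reload_conf();`) so the new entries take effect.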

**SSL Setup** (if not already configured):

```bash
# Generate self-signed certificate (for testing)
openssl req -new -x509 -days 365 -nodes \
  -out server.crt -keyout server.key

# Set permissions
chmod 600 server.key
chown postgres:postgres server.key server.crt

# Then add these settings to postgresql.conf
# (file contents, not shell commands):
#   ssl = on
#   ssl_cert_file = '/path/to/server.crt'
#   ssl_key_file = '/path/to/server.key'
```

### Google Cloud SQL for PostgreSQL

**Connection Methods**:

| Method | Description |
|--------|-------------|
| **Public IP** | Add AnomalyArmor IPs to authorized networks |
| **Cloud SQL Proxy** | For private IP instances (self-managed) |

**Public IP Setup**:

1. Go to **Cloud Console → SQL → Your Instance → Connections**
2. Under **Authorized Networks**, click **Add Network**
3. Add each AnomalyArmor IP

**Connection Details**:
- **Host**: Public IP from instance overview
- **Port**: `5432`
- **SSL Mode**: `require`

> ℹ️ Cloud SQL requires SSL by default. If you need `verify-ca`, download the server certificate from the instance details.

## Connection Pooling Considerations

If you use a connection pooler (PgBouncer, Pgpool):

### PgBouncer

- **Transaction mode**: Works with AnomalyArmor
- **Session mode**: Recommended for best compatibility
- **Statement mode**: May have issues with complex queries

> 💡 Connect directly to PostgreSQL, not through PgBouncer, unless you have connection limit constraints.

### Connection Limits

AnomalyArmor uses **1-2 connections** during discovery. If you're near your connection limit:

1. Use a read replica for monitoring
2. Schedule discovery during off-peak hours
3. Increase `max_connections` if possible

## What We Query

AnomalyArmor runs these types of queries:

```sql
-- Tables and views
SELECT * FROM information_schema.tables
WHERE table_schema NOT IN ('pg_catalog', 'information_schema');

-- Columns
SELECT * FROM information_schema.columns
WHERE table_schema NOT IN ('pg_catalog', 'information_schema');

-- Constraints
SELECT * FROM information_schema.table_constraints;

-- Freshness (for timestamp columns)
SELECT MAX(your_timestamp_column) FROM your_table;
```

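**Impact**: Minimal. The schema queries read only catalog metadata. The freshness query reads the table itself, so an index on the timestamp column keeps it cheap on large tables.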
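To illustrate how a freshness check can be derived from the `MAX(your_timestamp_column)` result, here is a minimal Python sketch. It is not AnomalyArmor's actual evaluation logic, which is not described here; the function name `is_stale` and the one-hour threshold are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def is_stale(latest_row_at, max_age, now=None):
    """Return True when the newest timestamp is older than the allowed age.

    `latest_row_at` stands for the result of MAX(your_timestamp_column).
    None means the table is empty, which this sketch treats as stale.
    """
    if latest_row_at is None:
        return True
    now = now or datetime.now(timezone.utc)
    return now - latest_row_at > max_age


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(is_stale(datetime(2024, 1, 1, 11, 30, tzinfo=timezone.utc), timedelta(hours=1), now))  # False
print(is_stale(datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc), timedelta(hours=1), now))    # True
```

The comparison only works when both timestamps carry a timezone (or neither does), which is one reason to store the monitored column as `timestamptz`.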

## Troubleshooting

### Connection refused

**Causes**:
- Firewall blocking the connection
- Wrong hostname or port
- Database not running

**Solutions**:
1. Verify AnomalyArmor IPs are allowlisted
2. Check security group rules (for RDS/Aurora)
3. Test connectivity: `nc -zv hostname 5432`
4. Verify database is accepting connections
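If `nc` is not available, the same reachability test can be sketched in Python with the standard library. The helper name `can_connect` is hypothetical. Note that it runs from your own machine, so a success shows the database is listening but does not prove that AnomalyArmor's IPs are allowlisted.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Example call (placeholder hostname):
#   can_connect("db.example.com", 5432)
```

A `False` result maps to the causes listed above: a firewall dropping packets usually shows up as a timeout, while a stopped database or wrong port is typically refused immediately.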

### Password authentication failed

**Causes**:
- Wrong password
- User doesn't exist
- pg_hba.conf not configured

**Solutions**:
1. Verify password (copy-paste to avoid typos)
2. Confirm user exists: `SELECT usename FROM pg_user;`
3. Check pg_hba.conf allows the connection method
4. Try resetting the password

### SSL connection required

**Causes**:
- Database requires SSL but connection uses `disable`
- Wrong SSL mode for the server

**Solutions**:
1. Set SSL Mode to `require`
2. For RDS/Aurora/Supabase: SSL is required
3. For self-hosted: Enable SSL or allow non-SSL (not recommended)

### Permission denied for relation

**Causes**:
- User lacks SELECT permission
- Schema permission missing

**Solutions**:
```sql
-- Grant schema access
GRANT USAGE ON SCHEMA public TO anomalyarmor;

-- Grant table access
GRANT SELECT ON ALL TABLES IN SCHEMA public TO anomalyarmor;
```

### No tables found in discovery

**Causes**:
- User can't see tables in information_schema
- Schema filter excluding all schemas

**Solutions**:
1. Test as the user: `SELECT * FROM information_schema.tables LIMIT 5;`
2. Check schema filter settings in AnomalyArmor
3. Verify tables exist in the expected schemas

### SSH tunnel connection failed

**Causes**:
- Invalid SSH credentials
- Bastion host not reachable
- SSH port blocked

**Solutions**:
1. Test SSH connection manually: `ssh -i key.pem user@bastion.example.com`
2. Verify SSH host and port are correct
3. Check that AnomalyArmor IPs can reach the bastion host
4. Ensure the SSH user has permission to forward connections

### SSH authentication failed

**Causes**:
- Invalid private key format
- Wrong passphrase for encrypted key
- Public key not added to bastion

**Solutions**:
1. Verify key is in PEM format (starts with `-----BEGIN`)
2. For encrypted keys, ensure passphrase is correct
3. Check `~/.ssh/authorized_keys` on bastion includes your public key
4. Verify SSH user exists on the bastion host

## Common Questions

### Which PostgreSQL-compatible services does AnomalyArmor support?

PostgreSQL 12+ self-hosted, Amazon RDS, Amazon Aurora (cluster and serverless), Supabase, Google Cloud SQL, Azure Database for PostgreSQL (single and flexible server), and Heroku Postgres. Any managed PostgreSQL that exposes the wire protocol on a network-reachable port works.

### What SSL Mode should I use for RDS or Aurora?

`require` is the right default - it encrypts traffic without pinning certificates. Use `verify-ca` or `verify-full` if your compliance program requires CA validation; you'll need to upload the RDS/Aurora CA bundle in connection settings. Never use `disable` for managed cloud databases.
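To try the same SSL settings locally (for example with `psql`) before entering them in AnomalyArmor, a libpq-style connection URI can be assembled as sketched below. `sslmode` and `sslrootcert` are standard libpq parameters; the helper name `build_pg_uri`, the hostname, and the CA bundle path are placeholders chosen for illustration.

```python
from urllib.parse import quote, urlencode


def build_pg_uri(host, database, user, password, port=5432,
                 sslmode="require", sslrootcert=None):
    """Build a libpq-style connection URI for testing with psql or a driver."""
    params = {"sslmode": sslmode}
    if sslrootcert:
        # verify-ca and verify-full need a CA bundle to validate against.
        params["sslrootcert"] = sslrootcert
    # Percent-encode credentials so characters like '@' or ':' stay unambiguous.
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    return (
        f"postgresql://{credentials}@{host}:{port}/"
        f"{quote(database, safe='')}?{urlencode(params)}"
    )


print(build_pg_uri("mydb.abc123.us-east-1.rds.amazonaws.com", "your_database",
                   "anomalyarmor", "your-secure-password"))
```

Passing `sslmode="verify-full"` together with `sslrootcert` mirrors the stricter modes described above; with `require` alone, no CA file is needed.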

### My PostgreSQL is only reachable through a bastion host. Can AnomalyArmor still connect?

Yes. Enable SSH tunnel mode on the connection, provide the bastion's host, port, user, and an SSH key, and AnomalyArmor tunnels to your PostgreSQL through it. The bastion needs outbound access to your database on port 5432.

### Does AnomalyArmor support PostgreSQL logical replication or CDC?

Not for primary monitoring. AnomalyArmor monitors via `information_schema` and bounded aggregates on a schedule - it does not read the WAL or consume replication slots. For freshness, it uses `MAX(timestamp_column)`, which works without replication.

### How do I give AnomalyArmor access to future tables without re-granting permissions?

Use `ALTER DEFAULT PRIVILEGES` so new tables created in the monitored schema automatically grant `SELECT` to the AnomalyArmor user:
```sql
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO anomalyarmor;
```
Run it once per schema you want monitored. Existing tables still need an explicit `GRANT SELECT ON ALL TABLES` the first time.

## Next Steps

- **[Run Discovery](/quickstart/run-first-discovery)**: Scan your PostgreSQL database
- **[Set Up Alerts](/quickstart/set-up-first-alert)**: Get notified of schema changes

---

<!-- Source: https://docs.anomalyarmor.ai/data-sources/mysql -->

# MySQL

> Connect AnomalyArmor to MySQL databases including RDS, Aurora MySQL, and PlanetScale

Connect AnomalyArmor to any MySQL-compatible database. This guide covers self-hosted MySQL as well as managed services like Amazon RDS, Aurora MySQL, PlanetScale, and DigitalOcean.

## Supported Versions & Platforms

| Platform | Minimum Version | Notes |
|----------|-----------------|-------|
| MySQL | 5.7+ | Self-hosted or any cloud |
| Amazon RDS | 5.7+ | All instance classes |
| Amazon Aurora MySQL | 5.7+ | Cluster and serverless |
| PlanetScale | Any | Serverless MySQL |
| DigitalOcean | 8.0+ | Managed databases |
| Google Cloud SQL | 5.7+ | Public or private IP |
| Azure Database | 5.7+ | Single server or flexible |
| MariaDB | 10.3+ | MySQL-compatible fork |

## Connection Settings

| Field | Description | Example |
|-------|-------------|---------|
| **Connection Name** | Friendly identifier | `Production MySQL` |
| **Host** | Hostname or IP address | `db.example.com` |
| **Port** | Database port | `3306` |
| **Database** | Database name | `myapp_production` |
| **Username** | Database user | `anomalyarmor` |
| **Password** | User password | `••••••••` |
| **SSL Mode** | SSL configuration | `require` |

### SSL Mode Options

| Mode | Description | When to Use |
|------|-------------|-------------|
| `disable` | No SSL | Local development only |
| `require` | SSL required, no verification | **Recommended** for most cloud providers |
| `verify-ca` | Verify server certificate | High security requirements |
| `verify-full` | Verify certificate and hostname | Maximum security |

> ⚠️ Never use `disable` for production databases. Most cloud providers (RDS, Aurora, PlanetScale) require SSL.

## Creating a Read-Only User

Create a dedicated user with minimal permissions:

```sql
-- Create the user
CREATE USER 'anomalyarmor'@'%' IDENTIFIED BY 'your-secure-password';

-- Grant read access to your database
GRANT SELECT ON your_database.* TO 'anomalyarmor'@'%';

-- Access to information_schema is implicit with SELECT
-- Optional: GRANT takes effect immediately; FLUSH PRIVILEGES is only
-- required after editing the grant tables directly
FLUSH PRIVILEGES;
```

### For Multiple Databases

If you want to monitor multiple databases:

```sql
-- Grant access to specific databases
GRANT SELECT ON database1.* TO 'anomalyarmor'@'%';
GRANT SELECT ON database2.* TO 'anomalyarmor'@'%';
GRANT SELECT ON analytics.* TO 'anomalyarmor'@'%';

FLUSH PRIVILEGES;
```

### Verifying Permissions

Test that the user can access metadata:

```sql
-- Should return tables
SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = 'your_database' LIMIT 5;

-- Should return columns
SELECT COLUMN_NAME, DATA_TYPE FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'your_database' LIMIT 5;
```

## Provider-Specific Instructions

### Amazon RDS MySQL

**Connection Details**:
- **Host**: Your RDS endpoint (e.g., `mydb.abc123.us-east-1.rds.amazonaws.com`)
- **Port**: `3306` (default)
- **SSL Mode**: `require`

**Security Group Configuration**:

1. Go to **AWS Console → RDS → Your Instance → Security Groups**
2. Edit inbound rules
3. Add rule:
   - Type: `MySQL/Aurora`
   - Port: `3306`
   - Source: AnomalyArmor IPs (see Settings → Security)

**Parameter Group** (if using verify-ca or verify-full):
- Ensure `require_secure_transport = ON`
- Download RDS CA certificate bundle

> ℹ️ RDS instances in private subnets require NAT Gateway or VPC peering for AnomalyArmor access. Contact us for Enterprise VPC peering options.

### Amazon Aurora MySQL

**Connection Details**:
- **Host**: Cluster endpoint (reader or writer)
- **Port**: `3306` (default)
- **SSL Mode**: `require`

**Choosing the Right Endpoint**:

| Endpoint | Use Case |
|----------|----------|
| **Cluster (writer)** | If you need real-time schema changes |
| **Reader** | Recommended - no impact on production writes |

**Aurora Serverless v2**:
- Use the cluster endpoint
- Ensure minimum ACU allows connections during discovery
- Consider scheduling discovery during active hours

### PlanetScale

**Connection Details**:
- **Host**: Your branch endpoint (e.g., `aws.connect.psdb.cloud`)
- **Port**: `3306`
- **SSL Mode**: `require` (mandatory)

**Getting Credentials**:

1. Go to **PlanetScale Dashboard → Your Database → Settings → Passwords**
2. Create a new password with **Read-only** access
3. Copy the connection details

**Important Notes**:
- PlanetScale requires SSL - `disable` mode will fail
- Use the `main` branch for production monitoring
- Create separate credentials for AnomalyArmor

> ⚠️ PlanetScale uses Vitess which may show slightly different table metadata than standard MySQL. All core functionality works correctly.

### Self-Hosted MySQL

**Connection Details**:
- **Host**: Your server's hostname or IP
- **Port**: `3306` (or custom port)
- **SSL Mode**: Depends on your setup

**Firewall Configuration**:

Allow inbound connections from AnomalyArmor IPs:

```bash
# iptables example: add one rule per AnomalyArmor IP (placeholders shown)
iptables -A INPUT -p tcp --dport 3306 -s 34.xxx.xxx.xxx -j ACCEPT
iptables -A INPUT -p tcp --dport 3306 -s 34.xxx.xxx.xxx -j ACCEPT
```

**MySQL Configuration** (my.cnf):

Allow remote connections:

```ini
[mysqld]
bind-address = 0.0.0.0
```

**SSL Setup** (if not already configured):

```bash
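# Generate certificates (mysql_ssl_rsa_setup is deprecated as of MySQL 8.0.34)
mysql_ssl_rsa_setup --uid=mysql

# Then add these settings to my.cnf (file contents, not shell commands):
#   [mysqld]
#   ssl-ca=/var/lib/mysql/ca.pem
#   ssl-cert=/var/lib/mysql/server-cert.pem
#   ssl-key=/var/lib/mysql/server-key.pem
#   require_secure_transport=ON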
```

**User Host Configuration**:

Ensure user is created for remote access:

```sql
-- Allow from any host
CREATE USER 'anomalyarmor'@'%' IDENTIFIED BY 'password';

-- Or allow from specific IPs only
CREATE USER 'anomalyarmor'@'34.xxx.xxx.xxx' IDENTIFIED BY 'password';
```

### Google Cloud SQL for MySQL

**Connection Methods**:

| Method | Description |
|--------|-------------|
| **Public IP** | Add AnomalyArmor IPs to authorized networks |
| **Cloud SQL Proxy** | For private IP instances (self-managed) |

**Public IP Setup**:

1. Go to **Cloud Console → SQL → Your Instance → Connections**
2. Under **Authorized Networks**, click **Add Network**
3. Add each AnomalyArmor IP

**Connection Details**:
- **Host**: Public IP from instance overview
- **Port**: `3306`
- **SSL Mode**: `require`

> ℹ️ Cloud SQL requires SSL by default. If you need `verify-ca`, download the server certificate from the instance details.

### DigitalOcean Managed MySQL

**Connection Details**:
- **Host**: Your database cluster hostname
- **Port**: `25060` (DigitalOcean uses non-standard port)
- **SSL Mode**: `require` (mandatory)

**Getting Credentials**:

1. Go to **DigitalOcean → Databases → Your Cluster**
2. Click **Connection Details**
3. Select **Direct Connection** or **Connection Pool**

**Trusted Sources**:

1. Go to **Settings → Trusted Sources**
2. Add AnomalyArmor IP addresses

> ℹ️ DigitalOcean requires SSL and uses port 25060 by default.

## What We Query

AnomalyArmor runs these types of queries:

```sql
-- Tables and views
SELECT TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys');

-- Columns
SELECT COLUMN_NAME, DATA_TYPE, IS_NULLABLE, COLUMN_DEFAULT
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'your_database';

-- Freshness (for timestamp columns)
SELECT MAX(your_timestamp_column) FROM your_table;
```

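**Impact**: Minimal. The schema queries read only catalog metadata. The freshness query reads the table itself, so an index on the timestamp column keeps it cheap on large tables.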

## Excluded Schemas

AnomalyArmor automatically excludes MySQL system schemas:

- `mysql`
- `information_schema`
- `performance_schema`
- `sys`

Only user-created databases and tables are monitored.

## Troubleshooting

### Can't connect to MySQL server

**Causes**:
- Firewall blocking the connection
- Wrong hostname or port
- Database not running

**Solutions**:
1. Verify AnomalyArmor IPs are allowlisted
2. Check security group rules (for RDS/Aurora)
3. Test connectivity: `nc -zv hostname 3306`
4. Verify MySQL is running: `systemctl status mysql`

### Access denied for user

**Causes**:
- Wrong password
- User doesn't exist for connecting host
- User lacks privileges

**Solutions**:
1. Verify password (copy-paste to avoid typos)
2. Confirm user exists: `SELECT User, Host FROM mysql.user;`
3. Check user is created for `%` or specific IP
4. Verify grants: `SHOW GRANTS FOR 'anomalyarmor'@'%';`
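The "user doesn't exist for connecting host" cause is easier to reason about with a small model of how the host part of a MySQL account is matched. The Python sketch below is a simplification for illustration only: it handles the `%` and `_` wildcards but ignores netmask notation and hostname resolution, and the function name `host_matches` is hypothetical.

```python
import re


def host_matches(pattern, client_host):
    """Approximate MySQL account host matching: % = any run, _ = one character."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, client_host, flags=re.IGNORECASE) is not None


# Addresses are from the reserved documentation range, not AnomalyArmor IPs.
print(host_matches("%", "203.0.113.10"))             # True
print(host_matches("203.0.113.%", "203.0.113.10"))   # True
print(host_matches("203.0.113.10", "203.0.113.11"))  # False
```

If the account was created for one specific IP and AnomalyArmor connects from another, the last case applies and MySQL reports access denied even though the password is correct.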

### SSL connection error

**Causes**:
- Database requires SSL but connection uses `disable`
- SSL certificate issues

**Solutions**:
1. Set SSL Mode to `require`
2. For RDS/Aurora/PlanetScale: SSL is required
3. For self-hosted: Enable SSL or allow non-SSL (not recommended)

### Unknown database

**Causes**:
- Database name is incorrect
- Database names are case-sensitive on Linux

**Solutions**:
1. List the databases visible to the user: `SHOW DATABASES;` (only databases the user can access are shown)
2. Use the exact case of the database name

### No tables found in discovery

**Causes**:
- User can't see tables in information_schema
- Schema filter excluding all schemas

**Solutions**:
1. Test as the user: `SELECT * FROM INFORMATION_SCHEMA.TABLES LIMIT 5;`
2. Check schema filter settings in AnomalyArmor
3. Verify tables exist in the database

## Common Questions

### Which MySQL versions and managed services does AnomalyArmor support?

MySQL 5.7 and 8.0+, Amazon RDS for MySQL, Aurora MySQL, Google Cloud SQL for MySQL, Azure Database for MySQL, and PlanetScale. MariaDB is MySQL-compatible for most monitoring use cases but not officially supported - test before relying on it in production.

### Does MySQL 5.7 have any monitoring limitations compared to 8.0?

Minor. MySQL 5.7 doesn't expose some column-statistics views that 8.0 has, so a few metric baselines are coarser. Schema drift and freshness work identically on both versions - no migration urgency for observability alone.

### Can AnomalyArmor monitor PlanetScale branches?

Yes. Each connection points at one PlanetScale branch. Monitor `main` for production; add separate data sources per branch if you want pre-merge schema checks. PlanetScale's schema migration flow surfaces in AnomalyArmor as schema drift on the target branch.

### How should I configure SSL for RDS MySQL?

Set SSL Mode to `require` (RDS MySQL's default). Upload Amazon's RDS CA bundle in connection settings if your compliance program requires `verify-ca` or `verify-full`.

### Does AnomalyArmor read MySQL binlogs?

No. AnomalyArmor queries `information_schema` and runs bounded aggregates on a schedule. Binlog access is not required - if you've granted `REPLICATION CLIENT` to the AnomalyArmor user, you can safely revoke it.

## Next Steps

- **[Run Discovery](/quickstart/run-first-discovery)**: Scan your MySQL database
- **[Set Up Alerts](/quickstart/set-up-first-alert)**: Get notified of schema changes

---

<!-- Source: https://docs.anomalyarmor.ai/data-sources/snowflake -->

# Snowflake

> Connect AnomalyArmor to your Snowflake data warehouse

Connect AnomalyArmor to your Snowflake data warehouse to monitor schemas, track freshness, and detect schema drift across your tables and views.

## Requirements

Before connecting, ensure you have:

- **Snowflake account** with database access
- **Virtual warehouse** for query execution
- **User with SELECT permissions** on information_schema and target schemas
- **Network access** from AnomalyArmor to your Snowflake account

## Connection Settings

| Field | Description | Example |
|-------|-------------|---------|
| **Connection Name** | Friendly identifier | `Snowflake Production` |
| **Account** | Snowflake account identifier | `xy12345.us-east-1.aws` |
| **Database** | Database (catalog) to monitor | `PRODUCTION_DB` |
| **Warehouse** | Virtual warehouse name | `COMPUTE_WH` |
| **Username** | Snowflake user account | `anomalyarmor_user` |
| **Password** | User password | `••••••••` |
| **Schema** (Optional) | Default schema | `PUBLIC` |
| **Role** (Optional) | Role to use | `ANOMALYARMOR_ROLE` |

## Finding Your Account Identifier

Your account identifier format depends on your cloud provider and region:

| Cloud Provider | Format |
|---------------|--------|
| **Standard (AWS)** | `xy12345.us-east-1.aws` |
| **Azure** | `xy12345.east-us-2.azure` |
| **GCP** | `xy12345.us-central1.gcp` |

Find your account identifier in the Snowflake web interface URL, which has the form `https://app.snowflake.com/<region>/<account>/...`.

> ⚠️ Include the full account identifier with region and cloud provider. Using just the account locator (e.g., `xy12345`) may not work for all regions.
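A quick format check can catch a bare account locator before the first connection attempt. The Python sketch below assumes only the three-part `locator.region.cloud` form shown in the table above; Snowflake also accepts other identifier forms (such as organization-based names), which this illustrative helper would reject. The function name `parse_account_identifier` is hypothetical.

```python
def parse_account_identifier(identifier):
    """Split 'locator.region.cloud' into its three parts or raise ValueError."""
    parts = identifier.strip().split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            f"expected <locator>.<region>.<cloud>, got {identifier!r}"
        )
    locator, region, cloud = parts
    if cloud not in {"aws", "azure", "gcp"}:
        raise ValueError(f"unknown cloud suffix {cloud!r}")
    return {"locator": locator, "region": region, "cloud": cloud}


print(parse_account_identifier("xy12345.us-east-1.aws"))
```

Passing just `xy12345` raises a `ValueError`, which corresponds to the incomplete identifier the warning above describes.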

## Creating a Read-Only User

Create a dedicated user and role with minimal permissions.

> 💡 **Quick Setup**: [Download the Snowflake permissions script](/downloads/snowflake-permissions) for a ready-to-use SQL template with role and user setup.

```sql
-- Create a read-only role for AnomalyArmor
CREATE ROLE IF NOT EXISTS ANOMALYARMOR_ROLE;

-- Grant database access
GRANT USAGE ON DATABASE your_database TO ROLE ANOMALYARMOR_ROLE;

-- Grant schema access (for all schemas)
GRANT USAGE ON ALL SCHEMAS IN DATABASE your_database TO ROLE ANOMALYARMOR_ROLE;
GRANT USAGE ON FUTURE SCHEMAS IN DATABASE your_database TO ROLE ANOMALYARMOR_ROLE;

-- Grant SELECT on all tables
GRANT SELECT ON ALL TABLES IN DATABASE your_database TO ROLE ANOMALYARMOR_ROLE;
GRANT SELECT ON FUTURE TABLES IN DATABASE your_database TO ROLE ANOMALYARMOR_ROLE;

-- Grant SELECT on all views
GRANT SELECT ON ALL VIEWS IN DATABASE your_database TO ROLE ANOMALYARMOR_ROLE;
GRANT SELECT ON FUTURE VIEWS IN DATABASE your_database TO ROLE ANOMALYARMOR_ROLE;

-- Grant warehouse usage
GRANT USAGE ON WAREHOUSE your_warehouse TO ROLE ANOMALYARMOR_ROLE;

-- Create user and assign role
CREATE USER IF NOT EXISTS anomalyarmor_user
  PASSWORD = 'your_secure_password'
  DEFAULT_ROLE = ANOMALYARMOR_ROLE
  DEFAULT_WAREHOUSE = your_warehouse;

GRANT ROLE ANOMALYARMOR_ROLE TO USER anomalyarmor_user;
```

### Per-Schema Permissions

For more granular control:

```sql
-- Grant access to specific schemas only
GRANT USAGE ON SCHEMA your_database.raw TO ROLE ANOMALYARMOR_ROLE;
GRANT USAGE ON SCHEMA your_database.staging TO ROLE ANOMALYARMOR_ROLE;
GRANT USAGE ON SCHEMA your_database.marts TO ROLE ANOMALYARMOR_ROLE;

-- Grant SELECT per schema
GRANT SELECT ON ALL TABLES IN SCHEMA your_database.raw TO ROLE ANOMALYARMOR_ROLE;
GRANT SELECT ON ALL TABLES IN SCHEMA your_database.staging TO ROLE ANOMALYARMOR_ROLE;
GRANT SELECT ON ALL TABLES IN SCHEMA your_database.marts TO ROLE ANOMALYARMOR_ROLE;
```

## Authentication Methods

### Password Authentication

Standard username/password authentication. Simplest setup for getting started.

**In AnomalyArmor**:
- Enter your username and password
- No additional configuration required

### Key-Pair Authentication (Recommended)

More secure authentication using RSA key pairs. Recommended for production.

**Step 1: Generate Key Pair**

```bash
# Generate private key (unencrypted for automation)
openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out snowflake_key.p8 -nocrypt

# Generate public key
openssl rsa -in snowflake_key.p8 -pubout -out snowflake_key.pub

# View public key (for Snowflake)
cat snowflake_key.pub
```

**Step 2: Assign Public Key to User**

```sql
-- Remove the header/footer lines from the public key
ALTER USER anomalyarmor_user SET RSA_PUBLIC_KEY='MIIBIjANBgkqhki...';
```
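Removing the header and footer lines by hand is easy to get wrong, so here is a minimal Python sketch that extracts the base64 body from the contents of `snowflake_key.pub`. The helper name `public_key_body` is hypothetical, and the key material in the example is a dummy placeholder, not a real key.

```python
def public_key_body(pem_text):
    """Return the base64 body of a PEM public key as a single line."""
    lines = [line.strip() for line in pem_text.strip().splitlines()]
    # Drop the -----BEGIN ...----- / -----END ...----- marker lines and blanks.
    body = [line for line in lines if line and not line.startswith("-----")]
    return "".join(body)


# Dummy content standing in for the output of `cat snowflake_key.pub`.
example = """-----BEGIN PUBLIC KEY-----
AAAA
BBBB
-----END PUBLIC KEY-----
"""
print(public_key_body(example))  # AAAABBBB
```

The returned string is what goes between the quotes in `RSA_PUBLIC_KEY='...'`.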

**Step 3: Configure in AnomalyArmor**

- Set **Authenticator** to `snowflake_jwt`
- Provide the contents of `snowflake_key.p8` as the **Private Key**
- If your key is encrypted, provide the **Private Key Passphrase**
- Leave password field empty

> 💡 For encrypted keys, generate with: `openssl genrsa 2048 | openssl pkcs8 -topk8 -v2 aes256 -inform PEM -out snowflake_key.p8`

> ℹ️ Key-pair authentication is more secure because:
> - No password to rotate
> - Keys can't be phished
> - Supports hardware security modules

## Virtual Warehouse Considerations

### Warehouse Sizing

AnomalyArmor runs lightweight metadata queries. Recommended warehouse configuration:

| Environment | Size | Notes |
|-------------|------|-------|
| **Development** | X-Small | Sufficient for testing |
| **Production** | Small | Faster query execution |

### Auto-Suspend Configuration

Enable auto-suspend to minimize costs:

```sql
-- Configure warehouse for auto-suspend (1-5 minutes recommended)
ALTER WAREHOUSE your_warehouse SET
  AUTO_SUSPEND = 60       -- Suspend after 60 seconds idle
  AUTO_RESUME = TRUE;     -- Resume automatically on query
-- INITIALLY_SUSPENDED applies only to CREATE WAREHOUSE, so it is not set here
```

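> 💡 AnomalyArmor queries typically complete in under a second. Snowflake bills per second with a 60-second minimum each time a warehouse resumes, so with a 1-minute auto-suspend each run is billed for about a minute rather than for query time alone.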

### Cost Estimation

| Metric | Value |
|--------|-------|
| **Query duration** | < 1 second per discovery |
| **With 1-min auto-suspend** | ~$0.01-0.05 per discovery |
| **Hourly monitoring** | ~$15-30/month |
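The arithmetic behind estimates like these can be sketched as follows. This is a rough model under stated assumptions, not a billing calculator: it assumes every run resumes a suspended warehouse, that the warehouse then stays up for the query plus the auto-suspend idle time, and that Snowflake's 60-second minimum per resume applies. The function name `monthly_credits` and the price of $2.00 per credit are illustrative assumptions; check your own contract rate.

```python
def monthly_credits(runs_per_day, query_seconds=1.0, auto_suspend_seconds=60,
                    credits_per_hour=1.0, days=30):
    """Estimate warehouse credits when every run resumes a suspended warehouse."""
    # Billed time per run: query plus idle time before suspend, never below
    # the 60-second minimum charged on each resume.
    billed_seconds = max(60.0, query_seconds + auto_suspend_seconds)
    return runs_per_day * days * billed_seconds / 3600 * credits_per_hour


# Hourly monitoring on an X-Small warehouse (1 credit per hour).
credits = monthly_credits(runs_per_day=24)
print(round(credits, 1))        # 12.2

# Assumed price per credit, for illustration only.
print(round(credits * 2.00, 2))
```

Sharing the warehouse with other workloads changes the picture, because the warehouse may already be running when a discovery query arrives.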

## Network Policies

If your Snowflake account uses network policies, add AnomalyArmor's IP addresses:

```sql
-- View existing network policies
SHOW NETWORK POLICIES;

-- Add AnomalyArmor IPs to your allowlist
ALTER NETWORK POLICY your_policy SET
  ALLOWED_IP_LIST = ('34.xxx.xxx.xxx/32', '35.xxx.xxx.xxx/32', ...);
```

> ℹ️ Find AnomalyArmor's current IP addresses in **Settings > Security** in the AnomalyArmor dashboard.

## What We Monitor

AnomalyArmor discovers and monitors these Snowflake objects:

| Object Type | Monitored | Notes |
|-------------|-----------|-------|
| **Tables** | Yes | Including managed and external |
| **Views** | Yes | Standard and materialized |
| **Schemas** | Yes | Schema-level metadata |
| **Stages** | No | External/internal stages not monitored |
| **Streams** | No | Change data capture not monitored |
| **Tasks** | No | Scheduled tasks not monitored |

### Metadata Captured

For each table and view:

- Table name and schema
- Column names and data types
- Nullability and default values
- Last modified timestamp (for freshness)
- Partition information (where applicable)

## Multiple Databases

To monitor multiple databases, create separate data sources for each:

| Data Source | Database |
|------------|----------|
| Snowflake Production | `PRODUCTION_DB` |
| Snowflake Staging | `STAGING_DB` |
| Snowflake Analytics | `ANALYTICS_DB` |

> ℹ️ Each data source needs access to its respective database. Use the same credentials if they have permissions across databases.

## What We Query

AnomalyArmor runs these types of queries:

```sql
-- List tables and views
SELECT table_catalog, table_schema, table_name, table_type
FROM YOUR_DATABASE.INFORMATION_SCHEMA.TABLES
WHERE table_schema NOT IN ('INFORMATION_SCHEMA');

-- Get column details
SELECT column_name, data_type, is_nullable, column_default
FROM YOUR_DATABASE.INFORMATION_SCHEMA.COLUMNS
WHERE table_schema = 'YOUR_SCHEMA' AND table_name = 'YOUR_TABLE';

-- Check freshness (for tables with timestamp columns)
SELECT MAX(your_timestamp_column) FROM your_table;
```

**Impact**: Minimal. The schema queries read only `INFORMATION_SCHEMA`; the freshness check is a single `MAX()` aggregate per monitored table.

## Troubleshooting

### Connection test fails

**Common causes**:
1. Invalid account identifier
2. Wrong username or password
3. Warehouse doesn't exist or is suspended

**Solutions**:
1. Verify account identifier includes region and cloud (e.g., `xy12345.us-east-1.aws`)
2. Test credentials in Snowflake web interface first
3. Ensure warehouse exists: `SHOW WAREHOUSES;`
4. Resume warehouse if suspended: `ALTER WAREHOUSE your_wh RESUME;`

### Incorrect username or password

**Causes**:
- Typo in credentials
- User doesn't exist
- Password expired

**Solutions**:
1. Verify user exists: `SHOW USERS LIKE 'anomalyarmor%';`
2. Reset password if needed
3. For key-pair auth, verify public key is assigned to user

### Account not found

**Causes**:
- Missing region or cloud in account identifier
- Account locator typo

**Solutions**:
1. Get full account identifier from Snowflake URL
2. Include region and cloud: `xy12345.us-east-1.aws`
3. Try alternative formats if needed

### Warehouse does not exist

**Causes**:
- Typo in warehouse name
- User lacks USAGE on warehouse
- Warehouse was deleted

**Solutions**:
1. List warehouses: `SHOW WAREHOUSES;`
2. Grant usage: `GRANT USAGE ON WAREHOUSE wh TO ROLE role;`
3. Check that the warehouse name matches exactly (unquoted names resolve to uppercase; quoted names are case-sensitive)

### IP address blocked

**Causes**:
- Network policy restricting access
- AnomalyArmor IPs not allowlisted

**Solutions**:
1. Check network policies: `SHOW NETWORK POLICIES;`
2. Add AnomalyArmor IPs to allowlist
3. Contact your Snowflake admin for policy changes

### Role does not exist

**Causes**:
- Role typo
- Role was deleted
- User not granted the role

**Solutions**:
1. List roles: `SHOW ROLES;`
2. Check grants: `SHOW GRANTS TO USER your_user;`
3. Grant role: `GRANT ROLE role TO USER user;`

### No tables found in discovery

**Causes**:
- User lacks SELECT permissions
- Schema filter excluding all schemas
- Empty database

**Solutions**:
1. Test query as user: `SELECT * FROM your_db.INFORMATION_SCHEMA.TABLES LIMIT 5;`
2. Check grants: `SHOW GRANTS TO ROLE your_role;`
3. Grant SELECT on schemas: `GRANT SELECT ON ALL TABLES IN SCHEMA schema TO ROLE role;`

## Best Practices

### Use Dedicated Service Account

Create a dedicated user for AnomalyArmor rather than using personal accounts:

- Dedicated users persist regardless of employee changes
- Easier to audit and manage permissions
- Can be easily rotated or disabled

### Use Key-Pair Authentication for Production

Password authentication works but key-pair is more secure:

- No password to rotate
- Keys can't be phished
- Better audit trail
- Supports hardware security modules

### Monitor Your Production Database

Start with your production database where schema changes have the most impact:

| Priority | Database | Importance |
|----------|----------|------------|
| 1 | Production database | Critical |
| 2 | Staging database | Important |
| 3 | Development databases | Optional |

### Choose the Right Warehouse Size

Metadata queries are lightweight. X-Small is sufficient but Small provides faster startup:

| Warehouse Size | Credit/Hour | Recommendation |
|----------------|-------------|----------------|
| X-Small | 1 | Development |
| Small | 2 | Production |

## Common Questions

### How much Snowflake credit does AnomalyArmor consume?

Minimal. Discovery runs short metadata queries against `information_schema.tables/columns`, so the warehouse can auto-suspend promptly afterwards. Typical daily consumption is under 1 credit on an X-Small warehouse. Use a dedicated X-Small warehouse for AnomalyArmor to keep cost visibility clean.

### What's the correct format for the Snowflake account identifier?

Include the full `account.region.cloud` format (e.g., `xy12345.us-east-1.aws`), not just the account locator. The URL in your Snowflake web interface shows it: `https://app.snowflake.com/<region>/<account>/...`. Getting this wrong is the most common first-connect error.

### Can AnomalyArmor monitor multiple Snowflake databases (catalogs) with one connection?

Not today. Connect each Snowflake database as a separate data source. A shared `ANOMALYARMOR_ROLE` with `USAGE` on multiple databases simplifies permission setup across them.

### How do I give AnomalyArmor access to future tables in Snowflake?

Use `FUTURE` grants: `GRANT SELECT ON FUTURE TABLES IN DATABASE <db> TO ROLE ANOMALYARMOR_ROLE` (and likewise for `FUTURE SCHEMAS` and `FUTURE VIEWS`). New objects inherit `SELECT` automatically - no re-grant needed after each deploy.

### Do I need a separate warehouse for AnomalyArmor, or can it share my existing one?

Sharing works but obscures cost attribution. A dedicated X-Small warehouse with `AUTO_SUSPEND = 60` isolates AnomalyArmor's credit usage and prevents unrelated queries from keeping the warehouse warm longer than needed.
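
To see why a dedicated X-Small warehouse stays cheap, here is a back-of-the-envelope estimate. It assumes Snowflake's per-second billing with a 60-second minimum each time the warehouse resumes and the X-Small rate of 1 credit per hour. The run count and query time are illustrative values, not measured AnomalyArmor figures.

```python
def estimate_daily_credits(runs_per_day: int,
                           active_seconds: float,
                           auto_suspend_seconds: float = 60,
                           credits_per_hour: float = 1.0) -> float:
    """Estimate daily credits for a warehouse that resumes once per run.

    Assumptions (illustrative): each run resumes the warehouse, queries for
    active_seconds, then idles for auto_suspend_seconds before suspending.
    Billing is per second with a 60-second minimum per resume.
    """
    billed_seconds = max(60.0, active_seconds + auto_suspend_seconds)
    return runs_per_day * billed_seconds / 3600.0 * credits_per_hour


# 24 hourly runs, 30 s of queries each, AUTO_SUSPEND = 60 on X-Small
print(round(estimate_daily_credits(24, 30), 2))  # 0.6
```

Under these assumptions the estimate lands below 1 credit per day. On a shared warehouse the idle time is harder to attribute, which is the cost-visibility argument for a dedicated one.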

## Next Steps

- **[Run Discovery](/quickstart/run-first-discovery)**: Scan your Snowflake database
- **[Set Up Alerts](/quickstart/set-up-first-alert)**: Get notified of schema changes

---

<!-- Source: https://docs.anomalyarmor.ai/data-sources/bigquery -->

# BigQuery

> Connect AnomalyArmor to Google BigQuery for schema monitoring and change detection

Connect AnomalyArmor to Google BigQuery to monitor schemas, track freshness, and detect schema drift across your datasets and tables.

## Requirements

Before connecting, ensure you have:

- **Google Cloud Platform (GCP) account** with an active project
- **BigQuery API enabled** in your project
- **Service account** with the **BigQuery Data Viewer** and **BigQuery Job User** roles, or a custom role with the permissions listed under Minimal Permissions
- **Service account JSON key** for authentication

## Connection Settings

| Field | Description | Example |
|-------|-------------|---------|
| **Asset Name** | Friendly identifier for this connection | `BigQuery Production` |
| **Project ID** | GCP project ID | `my-analytics-project` |
| **Service Account JSON** | Full JSON key contents | `{"type": "service_account", ...}` |
| **Default Dataset** (Optional) | Scope introspection to a single dataset | `analytics` |
| **Location** (Optional) | BigQuery data location/region | `US`, `EU`, `us-east1` |

### Service Account JSON Input

You can provide the service account JSON in two ways:

- **File Upload**: Click **Upload JSON File** to select your downloaded key file
- **Paste**: Copy and paste the entire JSON contents into the text area

> 💡 File upload is recommended to avoid copy-paste errors with the ~2KB JSON key.

### Default Dataset

If your project has many datasets, use the **Default Dataset** field to scope introspection to a single dataset. This is useful for:

- Large projects with 100+ datasets
- Focusing on production data only
- Reducing discovery time

Leave empty to discover all datasets in the project.

### Location

The **Location** field specifies where your BigQuery data resides. Common options:

| Location | Description |
|----------|-------------|
| `US` | United States (multi-region) |
| `EU` | European Union (multi-region) |
| `us-east1` | South Carolina |
| `us-west1` | Oregon |
| `europe-west1` | Belgium |
| `europe-west2` | London |
| `asia-northeast1` | Tokyo |

Leave as "Auto" to let BigQuery determine the location automatically.

> ℹ️ BigQuery uses project-level connections. Without a default dataset, all datasets within the project will be discovered automatically.

## Creating a Service Account

### Step 1: Create Service Account

1. Go to **GCP Console > IAM & Admin > Service Accounts**
2. Click **Create Service Account**
3. Enter details:
   - Name: `anomalyarmor`
   - Description: `AnomalyArmor schema monitoring`
4. Click **Create and Continue**

### Step 2: Assign Roles

Grant the minimum required roles:

| Role | ID | Purpose |
|------|----|---------|
| **BigQuery Data Viewer** | `roles/bigquery.dataViewer` | Read metadata and table contents |
| **BigQuery Job User** | `roles/bigquery.jobUser` | Run queries in the project |

> 💡 For production, consider creating a custom role with only the specific permissions needed (see Minimal Permissions section below).

### Step 3: Create JSON Key

1. Click on the created service account
2. Go to **Keys** tab
3. Click **Add Key > Create new key**
4. Select **JSON** format
5. Download and securely store the key file

> ⚠️ Keep your service account key secure. Never commit it to version control or share it publicly. You can rotate keys in GCP Console if compromised.

## Minimal Permissions

For maximum security, create a custom role with only these permissions:

```yaml
# Custom role for AnomalyArmor
title: "AnomalyArmor Schema Monitor"
description: "Minimal permissions for schema monitoring"
includedPermissions:
  - bigquery.datasets.get
  - bigquery.tables.get
  - bigquery.tables.list
  - bigquery.jobs.create
  - bigquery.jobs.get
```

To create via gcloud CLI:

```bash
# Create custom role
gcloud iam roles create anomalyarmor_monitor \
  --project=your-project-id \
  --title="AnomalyArmor Schema Monitor" \
  --description="Minimal permissions for schema monitoring" \
  --permissions=bigquery.datasets.get,bigquery.tables.get,bigquery.tables.list,bigquery.jobs.create,bigquery.jobs.get

# Assign to service account
gcloud projects add-iam-policy-binding your-project-id \
  --member="serviceAccount:anomalyarmor@your-project-id.iam.gserviceaccount.com" \
  --role="projects/your-project-id/roles/anomalyarmor_monitor"
```

## Finding Your Project ID

Your project ID is shown in multiple places in GCP Console:

- **Dashboard**: Top-left dropdown shows project name and ID
- **URL**: `console.cloud.google.com/bigquery?project=YOUR-PROJECT-ID`
- **Project Settings**: IAM & Admin > Settings

> ⚠️ Use the **Project ID** (e.g., `my-analytics-123`), not the Project Name (e.g., "My Analytics"). Project IDs are lowercase with hyphens.
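
A quick way to catch a Project Name pasted into the Project ID field is to check the value against GCP's project ID rules (6 to 30 characters, lowercase letters, digits, and hyphens, starting with a letter and not ending with a hyphen). The helper below is a hypothetical local check and does not verify that the project exists.

```python
import re

# GCP project IDs: 6-30 characters, lowercase letters, digits, and hyphens,
# starting with a letter and not ending with a hyphen.
PROJECT_ID_PATTERN = re.compile(r"^[a-z][a-z0-9-]{4,28}[a-z0-9]$")


def is_valid_project_id(value: str) -> bool:
    """Hypothetical local check; it does not verify the project exists."""
    return PROJECT_ID_PATTERN.fullmatch(value) is not None


print(is_valid_project_id("my-analytics-123"))  # True
print(is_valid_project_id("My Analytics"))      # False
```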

## Enabling BigQuery API

If the BigQuery API is not enabled:

1. Go to **GCP Console > APIs & Services > Library**
2. Search for "BigQuery API"
3. Click **Enable**

Or via gcloud CLI:

```bash
gcloud services enable bigquery.googleapis.com --project=your-project-id
```

## What We Monitor

AnomalyArmor discovers and monitors these BigQuery objects:

| Object Type | Monitored | Notes |
|-------------|-----------|-------|
| **Tables** | Yes | Standard and partitioned |
| **Views** | Yes | Standard views |
| **Materialized Views** | Yes | Treated as views |
| **External Tables** | Yes | Schema only, not source |
| **Datasets** | Yes | As schemas/namespaces |
| **Routines** | No | Functions/procedures not monitored |
| **Models** | No | ML models not monitored |

### Metadata Captured

For each table and view:

- Table name and dataset
- Column names and data types
- Nullability
- Last modified timestamp (for freshness)
- Partitioning information

## What We Query

AnomalyArmor runs these types of queries:

```sql
-- List tables in each dataset
SELECT table_catalog, table_schema, table_name, table_type
FROM `project_id.dataset_id.INFORMATION_SCHEMA.TABLES`;

-- Get column details
SELECT column_name, data_type, is_nullable, column_default
FROM `project_id.dataset_id.INFORMATION_SCHEMA.COLUMNS`
WHERE table_name = 'your_table';

-- Check freshness (for tables with timestamp columns)
SELECT MAX(your_timestamp_column) FROM `project_id.dataset_id.your_table`;
```

**Impact**: Minimal. The `INFORMATION_SCHEMA` queries read metadata only, and the freshness check reads a single timestamp column rather than whole rows.

### Cost Estimation

| Metric | Value |
|--------|-------|
| **Query type** | Metadata queries (minimal cost) |
| **Data scanned** | ~1KB per INFORMATION_SCHEMA query |
| **Estimated cost** | < $0.01 per discovery |

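> ℹ️ `INFORMATION_SCHEMA` queries process very little data. Under on-demand pricing, BigQuery bills each query at a 10 MB minimum, which is a small fraction of a cent. Most of the cost comes from freshness checks that query actual tables, and these scan only the timestamp column.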
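
For a sense of scale, the sketch below estimates the on-demand cost of a single freshness check. It assumes an unpartitioned table, a `TIMESTAMP` column at 8 bytes per value, a 10 MB minimum billed per query, and an illustrative price of $6.25 per TiB. Check current BigQuery pricing for your region before relying on the numbers.

```python
TIB = 2 ** 40
MIN_BILLED_BYTES = 10 * 2 ** 20  # assumed 10 MB on-demand minimum per query


def freshness_check_cost_usd(row_count: int,
                             usd_per_tib: float = 6.25,
                             bytes_per_value: int = 8) -> float:
    """Estimate the on-demand cost of one MAX(timestamp_column) query.

    Assumptions: an unpartitioned table, a TIMESTAMP column (8 bytes per
    value), and an illustrative price per TiB.
    """
    scanned = row_count * bytes_per_value
    billed = max(scanned, MIN_BILLED_BYTES)
    return billed / TIB * usd_per_tib


# A 100-million-row table scans about 800 MB for one timestamp column.
print(f"${freshness_check_cost_usd(100_000_000):.4f}")  # $0.0045
```

Partition filters reduce `row_count` to the rows in the scanned partitions, which is why they matter for very large tables.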

## Multiple Projects

To monitor multiple GCP projects, create separate data sources for each:

| Data Source | Project |
|------------|---------|
| BigQuery Production | `prod-analytics-123` |
| BigQuery Staging | `staging-analytics-456` |
| BigQuery Development | `dev-analytics-789` |

Each project needs its own service account or a service account with cross-project permissions.

### Cross-Project Access

To use one service account for multiple projects:

```bash
# Grant access in each project
gcloud projects add-iam-policy-binding other-project-id \
  --member="serviceAccount:anomalyarmor@main-project.iam.gserviceaccount.com" \
  --role="roles/bigquery.dataViewer"
```

## VPC Service Controls

If your BigQuery is protected by VPC Service Controls:

1. Add AnomalyArmor's IP addresses to the access level
2. Or configure an access policy that allows the service account

> ℹ️ Contact AnomalyArmor support for Enterprise VPC Service Controls integration options.

## Troubleshooting

### Permission denied / 403 error

**Causes**:
- Service account lacks required roles
- API not enabled
- VPC Service Controls blocking access

**Solutions**:
1. Verify roles: **BigQuery Data Viewer** and **BigQuery Job User**
2. Enable BigQuery API in GCP Console
3. Check VPC Service Controls perimeter settings
4. Test service account access via gcloud CLI

### Project not found

**Causes**:
- Wrong project ID
- Project ID vs Project Name confusion
- Project deleted or suspended

**Solutions**:
1. Use Project ID (lowercase with hyphens), not Project Name
2. Verify project exists in GCP Console
3. Check project hasn't been deleted or suspended

### Invalid service account JSON

**Causes**:
- Malformed JSON
- Missing required fields
- Copied text instead of file contents

**Solutions**:
1. Download fresh key from GCP Console
2. Paste entire file contents (including braces)
3. Verify JSON is valid (no trailing commas, proper quotes)
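
Before pasting a key, you can run a local sanity check like the one sketched below. The field list reflects a standard GCP service account key file. The function name is hypothetical, and nothing here contacts Google or AnomalyArmor.

```python
import json

# Fields found in a standard GCP service account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key_id",
                   "private_key", "client_email", "token_uri")


def check_service_account_json(text: str) -> list[str]:
    """Return a list of problems found; an empty list means none detected.

    Hypothetical local check; it does not validate the key with Google.
    """
    try:
        key = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"Malformed JSON: {exc.msg} (line {exc.lineno})"]
    if not isinstance(key, dict):
        return ["Top-level JSON value must be an object"]
    problems = [f"Missing field: {name}" for name in REQUIRED_FIELDS
                if not key.get(name)]
    if key.get("type") not in (None, "service_account"):
        problems.append('Field "type" must be "service_account"')
    return problems


# A trailing comma is reported as malformed JSON.
print(check_service_account_json('{"type": "service_account",}'))
```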

### BigQuery API not enabled

**Causes**:
- API disabled in project
- New project without API enabled

**Solutions**:
1. Go to GCP Console > APIs & Services > Library
2. Search for "BigQuery API"
3. Click Enable
4. Wait a few minutes for propagation

### Dataset not found

**Causes**:
- Dataset was deleted
- Dataset in different project
- Permission to list datasets missing

**Solutions**:
1. Verify dataset exists in BigQuery Console
2. Check service account can list datasets
3. Grant `bigquery.datasets.get` permission

### No tables found in discovery

**Causes**:
- Empty datasets
- Service account can't read INFORMATION_SCHEMA
- Dataset-level permissions missing

**Solutions**:
1. Verify tables exist in BigQuery Console
2. Test query as service account
3. Grant dataset-level access if using fine-grained permissions

### Quota exceeded / Rate limit

**Causes**:
- Too many concurrent queries
- Project quota limits reached

**Solutions**:
1. Wait and retry
2. Check quota usage in GCP Console
3. Request quota increase if needed
4. Schedule discovery during off-peak hours

## Best Practices

### Use Dedicated Service Account

Create a dedicated service account for AnomalyArmor:

- Easier to audit and manage permissions
- Can be rotated independently
- Clear separation of concerns

### Start with Production Data

Monitor your production project first where schema changes have the most impact:

```
Priority:
1. Production project (critical)
2. Staging project (important)
3. Development projects (optional)
```

### Rotate Keys Periodically

Rotate service account keys on a regular schedule:

1. Create new key in GCP Console
2. Update credentials in AnomalyArmor
3. Delete old key after verification

### Monitor Specific Datasets

If you have many datasets, consider creating dataset-level permissions to monitor only critical ones:

```sql
-- Grant access to specific datasets only
GRANT `roles/bigquery.dataViewer`
ON SCHEMA `project.critical_dataset`
TO "serviceAccount:anomalyarmor@project.iam.gserviceaccount.com";
```

## Common Questions

### How does AnomalyArmor authenticate to BigQuery?

Via a Google Cloud service account JSON key. Create a service account with `BigQuery Data Viewer` (on the project or on specific datasets) and `BigQuery Job User` (on the project), generate a key, and upload or paste the JSON into the connection form. Workload Identity federation is available on Enterprise plans for customers who want to avoid long-lived keys.

### Does AnomalyArmor monitor across multiple BigQuery projects?

One connection monitors one project. Add a separate data source per project. Assets from all connected projects appear unified in the asset catalog; plan limits count monitored tables, not projects.

### What BigQuery costs does AnomalyArmor incur?

Minimal. Metadata reads against `INFORMATION_SCHEMA` cost next to nothing. Bounded aggregate queries (`COUNT(*)`, `MIN`, `MAX`) are billed at normal query pricing but scan very little data per table, since BigQuery's columnar storage reads only the referenced columns. Use a dedicated reservation or project if you want clean cost attribution.

### Can AnomalyArmor monitor partitioned tables and clustered tables in BigQuery?

Yes. Schema drift, freshness, and metrics all work on partitioned and clustered tables. For very large partitioned tables, configure freshness to query a specific partition (e.g., `WHERE _PARTITIONTIME >= CURRENT_TIMESTAMP() - INTERVAL 1 DAY`) to minimize bytes scanned.

## Next Steps

- **[Run Discovery](/quickstart/run-first-discovery)**: Scan your BigQuery project
- **[Set Up Alerts](/quickstart/set-up-first-alert)**: Get notified of schema changes

---

<!-- Source: https://docs.anomalyarmor.ai/data-sources/redshift -->

# Amazon Redshift

> Connect AnomalyArmor to Amazon Redshift data warehouses

Connect AnomalyArmor to your Amazon Redshift cluster to monitor schema changes and data freshness. This guide covers both provisioned clusters and Redshift Serverless.

## Supported Platforms

| Platform | Notes |
|----------|-------|
| **Redshift Provisioned** | All node types (RA3, DC2, DS2) |
| **Redshift Serverless** | Workgroups with any RPU configuration |

## Connection Settings

| Field | Description | Example |
|-------|-------------|---------|
| **Connection Name** | Friendly identifier | `Production Redshift` |
| **Host** | Cluster or workgroup endpoint | `my-cluster.abc123.us-east-1.redshift.amazonaws.com` |
| **Port** | Database port | `5439` |
| **Database** | Database name | `dev` |
| **Username** | Database user | `anomalyarmor` |
| **Password** | User password | See credentials section |

## Finding Your Connection Details

### Provisioned Cluster

1. Go to **AWS Console > Amazon Redshift > Clusters**
2. Click on your cluster name
3. Find the **Endpoint** in the cluster details
4. Copy the hostname portion (everything before the port)
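
If you copy the whole endpoint string, a small helper can split it into the Host, Port, and Database fields. This sketch assumes the endpoint has the form `host:port/database`. The function name is hypothetical, and the port falls back to `5439` when omitted.

```python
def split_redshift_endpoint(endpoint: str) -> tuple[str, int, str]:
    """Split 'host:port/database' into (host, port, database).

    Hypothetical helper. Port defaults to 5439 when missing; database is
    an empty string when the endpoint does not include one.
    """
    host_port, _, database = endpoint.strip().partition("/")
    host, _, port = host_port.partition(":")
    return host, (int(port) if port else 5439), database


endpoint = "my-cluster.abc123.us-east-1.redshift.amazonaws.com:5439/dev"
print(split_redshift_endpoint(endpoint))
# ('my-cluster.abc123.us-east-1.redshift.amazonaws.com', 5439, 'dev')
```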

### Redshift Serverless

1. Go to **AWS Console > Amazon Redshift > Serverless dashboard**
2. Click on your workgroup
3. Find the **Endpoint** in the workgroup details. The default database for Serverless is typically `dev`.

## Creating a Read-Only User

Create a dedicated user with minimal permissions:

```sql
-- Create the user
CREATE USER anomalyarmor WITH PASSWORD 'your-secure-password';

-- Grant USAGE on schemas you want to monitor
GRANT USAGE ON SCHEMA public TO anomalyarmor;
GRANT USAGE ON SCHEMA analytics TO anomalyarmor;

-- Grant SELECT on all current tables
GRANT SELECT ON ALL TABLES IN SCHEMA public TO anomalyarmor;
GRANT SELECT ON ALL TABLES IN SCHEMA analytics TO anomalyarmor;

-- Grant access to future tables (recommended)
ALTER DEFAULT PRIVILEGES IN SCHEMA public
GRANT SELECT ON TABLES TO anomalyarmor;

ALTER DEFAULT PRIVILEGES IN SCHEMA analytics
GRANT SELECT ON TABLES TO anomalyarmor;
```

### Verifying Permissions

Test that the user can access metadata:

```sql
-- Should return tables
SELECT table_schema, table_name FROM svv_tables
WHERE table_schema NOT IN ('pg_catalog', 'information_schema')
LIMIT 5;

-- Should return columns
SELECT column_name, data_type FROM information_schema.columns
WHERE table_schema = 'public' LIMIT 5;
```

## Network Configuration

### Security Groups

Add an inbound rule to your Redshift cluster's security group:

1. Go to **AWS Console > VPC > Security Groups**
2. Find the security group attached to your Redshift cluster
3. Edit inbound rules
4. Add rule:
   - **Type**: Custom TCP
   - **Port**: `5439`
   - **Source**: AnomalyArmor IP addresses (see Settings > Security)

### Public Accessibility

For AnomalyArmor to connect:

**Option 1: Publicly Accessible Cluster**

1. Enable "Publicly accessible" in cluster settings
2. Configure security group as shown above

**Option 2: VPC Peering (Enterprise)**

For private clusters, contact us for VPC peering or PrivateLink options.

> ℹ️ Redshift clusters in private subnets with no public access require NAT Gateway or VPC peering. Contact us for Enterprise networking options.

### SSL/TLS

Redshift requires SSL by default. AnomalyArmor automatically uses SSL for all Redshift connections.

## Redshift Serverless Considerations

### Automatic Scaling

Redshift Serverless scales compute automatically. Our discovery queries are lightweight and consume minimal RPUs.

### Idle Workgroups

If your workgroup has been idle, the first query may take a few seconds while compute scales up. This is normal behavior.

### Database Name

Redshift Serverless typically uses `dev` as the default database. Check your workgroup settings if unsure.

## What We Monitor

AnomalyArmor discovers and monitors:

| Object Type | Monitored | Notes |
|-------------|-----------|-------|
| **Tables** | Yes | All user-created tables |
| **Views** | Yes | Regular views |
| **Materialized Views** | Yes | Treated as tables |
| **External Tables** | Yes | Redshift Spectrum tables (if permissions granted) |
| **System Tables** | No | Excluded automatically |

### Metadata Captured

For each table and view:

- Table name and schema
- Column names and data types
- Nullable constraints
- Default values
- Last modified timestamp (for freshness)

## What We Query

AnomalyArmor runs these types of queries:

```sql
-- Tables and views (using Redshift system view)
SELECT table_schema, table_name, table_type
FROM svv_tables
WHERE table_schema NOT IN ('pg_catalog', 'information_schema');

-- Columns
SELECT column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_schema = 'your_schema';

-- Freshness (for timestamp columns)
SELECT MAX(your_timestamp_column) FROM your_table;
```

**Impact**: Minimal. The catalog queries read metadata only, and the freshness check reads a single timestamp column rather than whole rows.

## Multiple Schemas

To monitor multiple schemas, grant permissions on each:

```sql
-- For each schema you want to monitor
GRANT USAGE ON SCHEMA raw TO anomalyarmor;
GRANT USAGE ON SCHEMA staging TO anomalyarmor;
GRANT USAGE ON SCHEMA production TO anomalyarmor;

GRANT SELECT ON ALL TABLES IN SCHEMA raw TO anomalyarmor;
GRANT SELECT ON ALL TABLES IN SCHEMA staging TO anomalyarmor;
GRANT SELECT ON ALL TABLES IN SCHEMA production TO anomalyarmor;
```

> 💡 AnomalyArmor automatically excludes system schemas (pg_catalog, information_schema, pg_internal).
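
When the schema list grows, generating the statements avoids copy-paste slips. The sketch below only formats SQL text that matches the grants shown in this guide, including the default-privileges statement for future tables. The function name is hypothetical, and schema names are assumed to be plain identifiers that need no quoting.

```python
def redshift_read_only_grants(schemas, user="anomalyarmor"):
    """Build read-only GRANT statements for each schema.

    Hypothetical helper that only formats SQL text; review the output and
    run it as a superuser or schema owner.
    """
    statements = []
    for schema in schemas:
        statements += [
            f"GRANT USAGE ON SCHEMA {schema} TO {user};",
            f"GRANT SELECT ON ALL TABLES IN SCHEMA {schema} TO {user};",
            f"ALTER DEFAULT PRIVILEGES IN SCHEMA {schema} "
            f"GRANT SELECT ON TABLES TO {user};",
        ]
    return statements


print("\n".join(redshift_read_only_grants(["raw", "staging", "production"])))
```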

## Redshift Spectrum (External Tables)

If you use Redshift Spectrum for querying S3:

```sql
-- Grant access to external schemas
GRANT USAGE ON SCHEMA spectrum_schema TO anomalyarmor;
GRANT SELECT ON ALL TABLES IN SCHEMA spectrum_schema TO anomalyarmor;
```

External tables appear alongside regular tables in discovery.

## Troubleshooting

### Connection refused

**Causes**:
- Security group not configured
- Cluster not publicly accessible
- Wrong endpoint

**Solutions**:
1. Verify AnomalyArmor IPs are in security group inbound rules
2. Check "Publicly accessible" setting in cluster configuration
3. Verify endpoint is correct (check AWS Console)
4. Test connectivity: `nc -zv your-endpoint 5439`
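
Where `nc` is not installed, the same TCP reachability test can be sketched in Python. Note that this checks the path from the machine you run it on, not from AnomalyArmor's IP addresses, so a success here does not prove the security group rule is correct. The function name is hypothetical.

```python
import socket


def can_reach(host: str, port: int = 5439, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

Call it with your own endpoint, for example `can_reach("your-endpoint")`. A `False` result within a fraction of a second usually points to a refused connection, while one that takes the full timeout suggests traffic is being dropped.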

### Connection timeout

**Causes**:
- Firewall blocking connection
- Cluster in private subnet without proper routing

**Solutions**:
1. Verify security group allows port 5439 from AnomalyArmor IPs
2. If private subnet, ensure NAT Gateway is configured
3. Consider enabling "Publicly accessible" for the cluster
4. Contact us for VPC peering options (Enterprise)

### Authentication failed

**Causes**:
- Wrong password
- User doesn't exist
- Case sensitivity issues

**Solutions**:
1. Verify password (copy-paste to avoid typos)
2. Confirm user exists: `SELECT usename FROM pg_user;`
3. Remember: Redshift usernames are case-sensitive when quoted
4. Try resetting the password:
   ```sql
   ALTER USER anomalyarmor PASSWORD 'new-password';
   ```

### Database does not exist

**Causes**:
- Wrong database name
- Case sensitivity

**Solutions**:
1. List databases: `SELECT datname FROM pg_database;`
2. Redshift Serverless default is usually `dev`
3. Database names are case-sensitive

### No tables found in discovery

**Causes**:
- User lacks SELECT permission
- Schema permissions missing
- No tables in monitored schemas

**Solutions**:
1. Test as the user:
   ```sql
   SELECT * FROM svv_tables
   WHERE table_schema = 'public' LIMIT 5;
   ```
2. Grant schema access:
   ```sql
   GRANT USAGE ON SCHEMA public TO anomalyarmor;
   GRANT SELECT ON ALL TABLES IN SCHEMA public TO anomalyarmor;
   ```
3. Verify tables exist in expected schemas

### SSL connection error

**Causes**:
- SSL configuration mismatch

**Solutions**:
1. AnomalyArmor uses SSL by default (required for Redshift)
2. Ensure your cluster has SSL enabled (default setting)
3. Contact support if you have custom SSL requirements

## Best Practices

### Use a Dedicated User

Create a read-only user specifically for AnomalyArmor:
- Don't use admin accounts
- Don't share credentials with other tools
- Use a strong, unique password

### Use Read Replicas (if available)

Redshift has no conventional read replicas, but if you use RA3 data sharing or run multiple clusters, consider pointing AnomalyArmor at a consumer or secondary cluster to isolate monitoring queries from production workloads.

### Grant Minimal Permissions

Only grant access to schemas you want to monitor. Avoid:
```sql
-- Don't do this
GRANT ALL ON DATABASE dev TO anomalyarmor;
```

### Schedule Discovery After ETL

If you have predictable ETL schedules, run discovery after ETL completes:

| Schedule | Time |
|----------|------|
| **ETL** | 2:00 AM daily |
| **Discovery** | 3:00 AM daily |

## Limitations

- **IAM authentication**: Not yet supported (username/password only)
- **Cross-database queries**: One connection per database
- **Concurrency scaling**: Queries run on main cluster, not concurrency scaling
- **Stored procedures**: Not monitored (tables and views only)

## Coming Soon

- IAM role-based authentication
- Redshift-specific metrics (WLM queue status, disk usage)
- Cross-cluster data sharing support

## Common Questions

### Does AnomalyArmor support both provisioned Redshift clusters and Redshift Serverless?

Yes. Both use the same PostgreSQL-wire-compatible connector. For Serverless, set the endpoint to the workgroup's connection URL. For provisioned, use the cluster endpoint. SSL is required in both cases.

### Can AnomalyArmor monitor Redshift Spectrum external tables?

Yes. External tables (backed by S3) are discovered and cataloged alongside local Redshift tables. Freshness works if the external table has a timestamp column queryable via a standard `MAX(ts)` - Spectrum resolves this against S3 metadata when partitioned appropriately.

### Do I need to VACUUM or ANALYZE for AnomalyArmor to work correctly?

Not for discovery or schema drift - those read the catalog directly. For row-count and statistical metrics, AnomalyArmor uses Redshift's `SVV_TABLE_INFO` and `SVV_ALL_COLUMNS` views, which Redshift maintains automatically. You don't need to add any VACUUM or ANALYZE runs to your maintenance schedule for AnomalyArmor.

### Can I connect AnomalyArmor to Redshift in a private VPC without IP allowlisting?

Yes, via VPC peering or AWS PrivateLink (Enterprise plans). Both keep traffic off the public internet. For non-Enterprise, allowlist AnomalyArmor's static IPs in your Redshift cluster's security group.

## Next Steps

- **[Run Discovery](/quickstart/run-first-discovery)**: Scan your Redshift cluster
- **[Set Up Alerts](/quickstart/set-up-first-alert)**: Get notified of schema changes

---

<!-- Source: https://docs.anomalyarmor.ai/data-sources/databricks -->

# Databricks

> Connect AnomalyArmor to Databricks Unity Catalog

Connect AnomalyArmor to your Databricks workspace to monitor Unity Catalog assets. We support Delta tables, views, and all Unity Catalog-managed objects.

## Requirements

Before connecting, ensure you have:

- **Databricks workspace** with Unity Catalog enabled
- **SQL Warehouse** (serverless or classic)
- **Personal Access Token** or Service Principal credentials
- **Catalog access** for the catalogs you want to monitor

## Connection Settings

| Field | Description | Example |
|-------|-------------|---------|
| **Connection Name** | Friendly identifier | `Databricks Production` |
| **Workspace URL** | Your Databricks workspace | `https://xxx.cloud.databricks.com` |
| **HTTP Path** | SQL warehouse path | `/sql/1.0/warehouses/abc123` |
| **Catalog** | Unity Catalog to monitor | `main` |
| **Access Token** | Authentication token | `dapi...` |

## Finding Your Connection Details

### Workspace URL

Your workspace URL is in your browser when logged into Databricks:

| Cloud | URL Format |
|-------|-----------|
| **Azure** | `https://adb-1234567890.12.azuredatabricks.net` |
| **AWS** | `https://dbc-abc123.cloud.databricks.com` |
| **GCP** | `https://xxx.gcp.databricks.com` |
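
A simple check against the host suffixes in the table above can catch a mistyped or truncated Workspace URL before you run a connection test. The sketch uses only the three URL formats listed here. The function name is hypothetical, and workspaces with custom domains would not match.

```python
from urllib.parse import urlparse

# Host suffixes taken from the URL formats in the table above.
CLOUD_BY_SUFFIX = {
    ".azuredatabricks.net": "Azure",
    ".gcp.databricks.com": "GCP",
    ".cloud.databricks.com": "AWS",
}


def detect_cloud(workspace_url: str):
    """Return 'Azure', 'AWS', or 'GCP' for a workspace URL, else None."""
    parsed = urlparse(workspace_url.strip())
    if parsed.scheme != "https" or not parsed.hostname:
        return None  # expects a full https:// URL, as in the examples
    for suffix, cloud in CLOUD_BY_SUFFIX.items():
        if parsed.hostname.endswith(suffix):
            return cloud
    return None


print(detect_cloud("https://dbc-abc123.cloud.databricks.com"))        # AWS
print(detect_cloud("https://adb-1234567890.12.azuredatabricks.net"))  # Azure
```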

### SQL Warehouse HTTP Path

1. Go to **SQL Warehouses** in Databricks
2. Click on your warehouse
3. Go to **Connection Details** tab
4. Copy the **HTTP Path**

```
HTTP Path format:
/sql/1.0/warehouses/abc123def456
                    ↑ Your warehouse ID
```
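
To confirm that a copied value matches this format, a pattern check such as the following can help. The pattern covers only the `/sql/1.0/warehouses/<id>` shape shown above. The alphanumeric ID character set and the function name are assumptions.

```python
import re

# Pattern for the HTTP Path format shown above; the ID charset is assumed.
HTTP_PATH_PATTERN = re.compile(r"^/sql/1\.0/warehouses/([A-Za-z0-9]+)$")


def warehouse_id_from_http_path(http_path: str):
    """Return the warehouse ID from an HTTP Path, or None if it doesn't match."""
    match = HTTP_PATH_PATTERN.match(http_path.strip())
    return match.group(1) if match else None


print(warehouse_id_from_http_path("/sql/1.0/warehouses/abc123def456"))
# abc123def456
print(warehouse_id_from_http_path("sql/1.0/warehouses/abc123def456"))
# None (missing leading slash)
```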

> 💡 Use a **Serverless SQL Warehouse** for best compatibility. Classic warehouses work too but may have startup delays.

## Creating an Access Token

### Personal Access Token

Best for quick setup and testing:

1. Click your username in Databricks → **User Settings**
2. Go to **Access Tokens** tab
3. Click **Generate New Token**
4. Set a description: `AnomalyArmor`
5. Set lifetime (or leave blank for no expiry)
6. Click **Generate**
7. **Copy the token immediately** (you won't see it again)

```
Token format: dapi1234567890abcdef1234567890abcdef
```

> ⚠️ Personal access tokens are tied to your user account. If you leave the organization, the token stops working. Consider using a service principal for production.

### Service Principal (Recommended)

Best for production use:

**Step 1: Create Service Principal**

1. Go to **Admin Console → Service Principals**
2. Click **Add Service Principal**
3. Name it: `anomalyarmor-monitoring`
4. Note the **Application ID**

**Step 2: Generate OAuth Token**

1. Select the service principal
2. Go to **Secrets** tab
3. Click **Generate Secret**
4. Copy the **Client ID** and **Client Secret**

**Step 3: Grant Permissions**

The service principal needs:
- `USE CATALOG` on target catalogs
- `USE SCHEMA` on target schemas
- `SELECT` on tables (avoid `ALL PRIVILEGES`, which also grants write access)

```sql
-- Grant catalog access
GRANT USE CATALOG ON CATALOG main TO `anomalyarmor-monitoring`;

-- Grant schema access (a grant on the catalog covers every schema in it)
GRANT USE SCHEMA ON CATALOG main TO `anomalyarmor-monitoring`;

-- Grant table read access (inherited by all tables and views in the catalog)
GRANT SELECT ON CATALOG main TO `anomalyarmor-monitoring`;
```

**Step 4: Use in AnomalyArmor**

Enter the OAuth token in the Access Token field.

## Granting Catalog Permissions

The user or service principal needs read access to Unity Catalog.

> 💡 **Quick Setup**: [Download the Databricks permissions script](/downloads/databricks-permissions) for a ready-to-use SQL template with Unity Catalog grants.

```sql
-- Minimal permissions for AnomalyArmor

-- Access to catalog
GRANT USE CATALOG ON CATALOG production TO `anomalyarmor`;

-- Access to all schemas in catalog
GRANT USE SCHEMA ON CATALOG production TO `anomalyarmor`;

-- Read access to tables
GRANT SELECT ON CATALOG production TO `anomalyarmor`;
```

### Per-Schema Permissions

For more granular control:

```sql
-- Access specific schemas only
GRANT USE SCHEMA ON SCHEMA production.raw TO `anomalyarmor`;
GRANT USE SCHEMA ON SCHEMA production.staging TO `anomalyarmor`;
GRANT USE SCHEMA ON SCHEMA production.marts TO `anomalyarmor`;

-- Read access per schema
GRANT SELECT ON SCHEMA production.raw TO `anomalyarmor`;
GRANT SELECT ON SCHEMA production.staging TO `anomalyarmor`;
GRANT SELECT ON SCHEMA production.marts TO `anomalyarmor`;
```

## What We Monitor

AnomalyArmor discovers and monitors these Unity Catalog objects:

| Object Type | Monitored | Notes |
|-------------|-----------|-------|
| **Delta Tables** | Yes | Including managed and external |
| **Views** | Yes | Standard and materialized |
| **Schemas** | Yes | Schema-level metadata |
| **Volumes** | No | Coming soon |
| **Functions** | No | Not supported |

### Metadata Captured

For each table and view:

- Table name and schema
- Column names and data types
- Table properties
- Last modified timestamp (for freshness)
- Partitioning information

## Multiple Catalogs

### 3-Level Namespace Support

Databricks Unity Catalog uses a 3-level namespace: `catalog.schema.table`. AnomalyArmor fully supports this structure, enabling you to:

- **Track tables across catalogs**: Distinguish between `prod.analytics.users` and `dev.analytics.users`
- **Filter by catalog**: View only tables from specific catalogs in the UI
- **Catalog-aware alerting**: Get notified of changes in production catalogs only
- **Lineage across catalogs**: Track data flow between development, staging, and production
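
The three-level name is easy to work with in your own scripts. The sketch below splits a fully qualified name and filters a list by catalog. It is a simplified illustration, not AnomalyArmor's implementation, and it assumes no part of the name contains a dot or backtick quoting.

```python
from typing import NamedTuple


class TableRef(NamedTuple):
    catalog: str
    schema: str
    table: str


def parse_table_ref(full_name: str) -> TableRef:
    """Split 'catalog.schema.table' into its three parts.

    Simplification: assumes no part contains a dot or backtick quoting.
    """
    parts = full_name.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"Expected catalog.schema.table, got {full_name!r}")
    return TableRef(*parts)


tables = ["prod.analytics.users", "dev.analytics.users", "prod.sales.orders"]
prod_only = [t for t in tables if parse_table_ref(t).catalog == "prod"]
print(prod_only)  # ['prod.analytics.users', 'prod.sales.orders']
```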

### Connecting Multiple Catalogs

To monitor multiple catalogs, create separate data sources for each:

**Data Sources:**
- Databricks Production (catalog: production)
- Databricks Staging (catalog: staging)
- Databricks Development (catalog: development)

> ℹ️ Each data source needs access to its respective catalog. Use the same token if it has permissions across catalogs.

### Catalog-Aware Features

| Feature | Catalog Support |
|---------|-----------------|
| **Asset Discovery** | Tables shown with full `catalog.schema.table` path |
| **Schema Drift Alerts** | Filter alerts by catalog |
| **Tag Inheritance** | Tags propagate within catalog boundaries |
| **Table Filtering** | API supports `catalog_name` filter parameter |
| **Lineage Visualization** | Shows cross-catalog data dependencies |

## SQL Warehouse Considerations

### Warehouse State

AnomalyArmor queries run on your SQL warehouse. Consider:

| Warehouse Type | Behavior |
|----------------|----------|
| **Serverless** | Auto-starts, minimal delay |
| **Classic (Auto-stop)** | May have startup delay (30s-2min) |
| **Classic (Always-on)** | Immediate, but costs more |

### Warehouse Sizing

Discovery queries are lightweight. A **Small** or **X-Small** warehouse is sufficient:

- **Recommended**: Serverless SQL Warehouse
- **Alternative**: X-Small Classic Warehouse with auto-stop

### Scheduling Discovery

If using a classic warehouse with auto-stop:

1. Schedule discovery during business hours, when the warehouse is likely to be running already
2. Or extend auto-stop timeout to cover discovery windows
3. Or use serverless (recommended)

## What We Query

AnomalyArmor runs these types of queries:

```sql
-- List schemas
SHOW SCHEMAS IN CATALOG production;

-- List tables
SHOW TABLES IN SCHEMA production.raw;

-- Get table details
DESCRIBE TABLE EXTENDED production.raw.events;

-- Check freshness (for tables with timestamp columns)
SELECT MAX(event_timestamp) FROM production.raw.events;
```

**Impact**: Minimal. The `SHOW` and `DESCRIBE` statements read metadata only, and the freshness check reads a single timestamp column rather than whole rows.

## Troubleshooting

### Connection test fails

**Common causes**:
1. Invalid or expired access token
2. Wrong workspace URL
3. Incorrect HTTP path

**Solutions**:
1. Generate a new access token
2. Verify workspace URL matches your browser
3. Copy HTTP path directly from SQL Warehouse settings

### Permission denied errors

**Causes**:
- Token lacks catalog/schema permissions
- Service principal not granted access

**Solutions**:
```sql
-- Check current permissions
SHOW GRANTS ON CATALOG production;

-- Grant necessary permissions
GRANT USE CATALOG ON CATALOG production TO `your-user`;
GRANT SELECT ON CATALOG production TO `your-user`;
```

### Warehouse not found

**Causes**:
- Wrong HTTP path
- Warehouse deleted or renamed

**Solutions**:
1. Go to SQL Warehouses in Databricks
2. Copy the HTTP path from Connection Details
3. Ensure the warehouse exists and is accessible

### Discovery times out

**Causes**:
- Warehouse is starting up
- Large number of tables

**Solutions**:
1. Use a serverless warehouse (faster startup)
2. Extend warehouse auto-stop timeout
3. Filter to specific schemas if catalog is very large

### Token expired

**Causes**:
- Personal access token has expiry date

**Solutions**:
1. Generate a new token with longer expiry
2. Use a service principal with OAuth (no expiry)
3. Update the token in AnomalyArmor Data Sources settings

## Best Practices

### Use Service Principals for Production

Personal access tokens are tied to individual users. If that user leaves:
- Token stops working
- Monitoring breaks

Service principals are organization-owned and persist regardless of user changes.

### Monitor Production Catalog

Start with your production catalog where schema changes have the most impact.

### Schedule Discovery After ETL

If you have predictable ETL schedules, run discovery after ETL completes to catch changes immediately:

```
ETL Schedule:     2:00 AM daily
Discovery Schedule: 3:00 AM daily (1 hour after ETL)
```
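
The offset in the schedule above can be derived from how long the ETL job usually runs. A small sketch, assuming you know the typical runtime; `discovery_time` and the 15-minute default buffer are illustrative choices, not AnomalyArmor settings.

```python
from datetime import datetime, timedelta


def discovery_time(etl_start: str, expected_runtime_minutes: int, buffer_minutes: int = 15) -> str:
    """Return an HH:MM discovery time that falls after ETL is expected to finish."""
    start = datetime.strptime(etl_start, "%H:%M")
    finish = start + timedelta(minutes=expected_runtime_minutes + buffer_minutes)
    # Only the time of day is kept, so schedules that cross midnight wrap around.
    return finish.strftime("%H:%M")


print(discovery_time("02:00", expected_runtime_minutes=45))  # 03:00
```

A larger buffer trades detection speed for fewer runs that start while ETL is still writing.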

## Common Questions

### Does AnomalyArmor require Unity Catalog, or does the legacy Hive Metastore work?

Unity Catalog is required. The legacy Hive Metastore does not expose the information-schema views AnomalyArmor needs for consistent cross-catalog monitoring. If you're still on Hive, migrating to Unity Catalog also unlocks most other modern observability tools.

### Should I use a SQL Warehouse or a general-purpose cluster for AnomalyArmor?

SQL Warehouse - specifically Serverless or Pro. All-Purpose compute clusters work but start cold, which slows discovery. Use the smallest warehouse size; AnomalyArmor's queries are lightweight metadata and bounded aggregates.

### How do I authenticate AnomalyArmor to Databricks?

Personal Access Token (PAT) on a dedicated service-principal user is the standard. Grant the service principal `USE CATALOG`, `USE SCHEMA`, and `SELECT` on the objects you want monitored. OAuth machine-to-machine is available as an Enterprise option.

### Will AnomalyArmor keep my Databricks warehouse warm and run up Databricks cost?

No. Discovery runs in short bursts; your warehouse's `auto_stop_mins` setting takes over afterward. Set `auto_stop_mins` to 5-10 minutes on the AnomalyArmor warehouse to minimize idle cost. Serverless warehouses bill per second of activity and stop shortly after going idle.

### Can AnomalyArmor monitor Delta Live Tables or streaming tables?

Yes. DLT materializations and streaming tables appear as ordinary Delta tables in Unity Catalog and are fully supported for schema, freshness, and metric monitoring. Freshness reads Delta's history for accurate last-update timestamps.

## Next Steps

- **[Run Discovery](/quickstart/run-first-discovery)**: Scan your Databricks catalog
- **[Set Up Alerts](/quickstart/set-up-first-alert)**: Get notified of schema changes

---

<!-- Source: https://docs.anomalyarmor.ai/data-sources/clickhouse -->

# ClickHouse

> Connect AnomalyArmor to ClickHouse databases

Connect AnomalyArmor to your ClickHouse database for schema monitoring and freshness tracking. We support both self-hosted ClickHouse and ClickHouse Cloud.

## Requirements

- **ClickHouse version**: 21.8 or higher
- **HTTP interface**: Enabled (default on most installations)
- **User credentials**: With read access to system tables
- **Network access**: From AnomalyArmor to your ClickHouse server

## Connection Settings

| Field | Description | Example |
|-------|-------------|---------|
| **Connection Name** | Friendly identifier | `ClickHouse Analytics` |
| **Host** | ClickHouse hostname | `xxx.clickhouse.cloud` |
| **Port** | HTTP(S) port | `8443` (HTTPS) or `8123` (HTTP) |
| **Database** | Database name | `default` |
| **Username** | ClickHouse user | `anomalyarmor` |
| **Password** | User password | `••••••••` |

### Port Configuration

| Port | Protocol | When to Use |
|------|----------|-------------|
| `8443` | HTTPS | **ClickHouse Cloud** and production |
| `8123` | HTTP | Development or internal networks |
| `9440` | Native TLS | Not supported (use HTTP interface) |

> ⚠️ Always use HTTPS (port 8443) for cloud-hosted or production ClickHouse. HTTP (8123) should only be used for local development.

## Provider-Specific Instructions

### ClickHouse Cloud

**Finding Connection Details**:

1. Go to your ClickHouse Cloud console
2. Select your service
3. Click **Connect** → **HTTPS**
4. Copy the connection details

```
Host: abc123.us-east-1.aws.clickhouse.cloud
Port: 8443
Database: default (or your database name)
```

**IP Allowlisting**:

1. Go to **Settings** → **Security**
2. Under **IP Access List**, add AnomalyArmor IPs
3. Save changes

Add the AnomalyArmor IP addresses: `34.xxx.xxx.xxx/32` and `34.xxx.xxx.xxx/32`

**Creating a Read-Only User**:

```sql
-- Create user
CREATE USER anomalyarmor
IDENTIFIED BY 'your-secure-password';

-- Grant read access
GRANT SELECT ON *.* TO anomalyarmor;
GRANT SHOW ON *.* TO anomalyarmor;

-- Access to system tables (required for discovery)
GRANT SELECT ON system.* TO anomalyarmor;
```

### Self-Hosted ClickHouse

**Verify HTTP Interface**:

Check your `config.xml` has HTTP enabled:

```xml
<http_port>8123</http_port>
<!-- or for HTTPS -->
<https_port>8443</https_port>
```

**Firewall Configuration**:

Allow inbound connections from AnomalyArmor:

```bash
# iptables example
iptables -A INPUT -p tcp --dport 8443 -s 34.xxx.xxx.xxx -j ACCEPT
iptables -A INPUT -p tcp --dport 8443 -s 34.xxx.xxx.xxx -j ACCEPT
```

**Creating a Read-Only User**:

```sql
-- Create user
CREATE USER anomalyarmor
IDENTIFIED BY 'your-secure-password'
HOST IP '34.xxx.xxx.xxx', '34.xxx.xxx.xxx';

-- Grant read access
GRANT SELECT ON *.* TO anomalyarmor;
GRANT SHOW ON *.* TO anomalyarmor;
```

**SSL/TLS Setup** (recommended):

```xml
<!-- In config.xml -->
<https_port>8443</https_port>
<openSSL>
    <server>
        <certificateFile>/path/to/server.crt</certificateFile>
        <privateKeyFile>/path/to/server.key</privateKeyFile>
    </server>
</openSSL>
```

### Altinity Cloud

Altinity Cloud is a managed ClickHouse service:

**Connection Details**:

1. Go to your Altinity Cloud cluster
2. Find connection details in the cluster overview
3. Use the HTTPS endpoint

**IP Allowlisting**:

Add AnomalyArmor IPs to your cluster's allowed networks in the Altinity console.

**User Setup**:

Same as self-hosted ClickHouse. Create a read-only user through the Altinity console or SQL.

### Docker / Local Development

**Default Connection**:

```
Host: localhost (or container IP)
Port: 8123 (HTTP) or 8443 (HTTPS)
Database: default
Username: default
Password: (empty or as configured)
```

**Docker Compose Example**:

```yaml
services:
  clickhouse:
    image: clickhouse/clickhouse-server:latest
    ports:
      - "8123:8123"
      - "9000:9000"
```

> ℹ️ For local development, you may need to expose the ClickHouse port publicly or use a tunneling solution for AnomalyArmor to connect.

## Creating a Read-Only User

Full SQL script for setting up AnomalyArmor access:

```sql
-- Create dedicated user
CREATE USER IF NOT EXISTS anomalyarmor
IDENTIFIED BY 'your-secure-password';

-- Grant read access to all databases
GRANT SELECT ON *.* TO anomalyarmor;

-- Grant ability to see databases and tables
GRANT SHOW ON *.* TO anomalyarmor;

-- Access to system tables (required for discovery)
GRANT SELECT ON system.tables TO anomalyarmor;
GRANT SELECT ON system.columns TO anomalyarmor;
GRANT SELECT ON system.databases TO anomalyarmor;
GRANT SELECT ON system.parts TO anomalyarmor;

-- Optional: Restrict to specific databases
-- GRANT SELECT ON analytics.* TO anomalyarmor;
-- GRANT SELECT ON production.* TO anomalyarmor;
```

### Verify Permissions

Test the user can access metadata:

```sql
-- Should work
SELECT database, name, engine FROM system.tables LIMIT 5;

-- Should work
SELECT database, table, name, type FROM system.columns LIMIT 5;
```

## What We Monitor

AnomalyArmor discovers and monitors these ClickHouse objects:

| Object Type | Monitored | Notes |
|-------------|-----------|-------|
| **Tables** | Yes | All table engines |
| **Views** | Yes | Standard views |
| **Materialized Views** | Yes | Including underlying tables |
| **Dictionaries** | No | Coming soon |
| **Functions** | No | Not supported |

### Metadata Captured

For each table:

- Database and table name
- Column names and data types
- Table engine type
- Partition information
- Last modification time (for freshness)

## What We Query

AnomalyArmor runs these types of queries:

```sql
-- List databases
SELECT name FROM system.databases
WHERE name NOT IN ('system', 'INFORMATION_SCHEMA', 'information_schema');

-- List tables
SELECT database, name, engine, metadata_modification_time
FROM system.tables
WHERE database NOT IN ('system', 'INFORMATION_SCHEMA', 'information_schema');

-- List columns
SELECT database, table, name, type, default_kind, default_expression
FROM system.columns
WHERE database NOT IN ('system', 'INFORMATION_SCHEMA', 'information_schema');

-- Check freshness (for tables with timestamp columns)
SELECT MAX(event_time) FROM analytics.events;
```

**Impact**: Minimal. The `system.*` queries read metadata only; the freshness check is a single `MAX` aggregate over one timestamp column.

## ClickHouse-Specific Considerations

### Table Engines

AnomalyArmor works with all ClickHouse table engines:

| Engine | Schema Monitoring | Freshness |
|--------|-------------------|-----------|
| MergeTree family | Full | Yes |
| Log family | Full | Limited |
| Distributed | Full | Via underlying tables |
| View | Full | N/A |
| MaterializedView | Full | Yes |

### ReplicatedMergeTree

For replicated tables, connect to any replica. Schema changes propagate across all replicas, so monitoring one is sufficient.

### Distributed Tables

Distributed tables show the schema of the distributed table definition. Underlying shard tables are monitored separately if in the same cluster.

## Troubleshooting

### Connection refused

**Causes**:
- Wrong port (using native port instead of HTTP)
- Firewall blocking connection
- HTTP interface disabled

**Solutions**:
1. Verify port is 8443 (HTTPS) or 8123 (HTTP)
2. Check firewall/security group allows AnomalyArmor IPs
3. Verify HTTP interface is enabled in config.xml
4. Test: `curl https://your-host:8443/ping`
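
The same `/ping` check can be scripted, for example in a deployment health check. This is a sketch using only the Python standard library; `ping_url` and `clickhouse_reachable` are hypothetical helper names. It relies on the ClickHouse HTTP interface answering `/ping` with `Ok.`.

```python
import http.client
import urllib.error
import urllib.request


def ping_url(host: str, port: int = 8443, secure: bool = True) -> str:
    """Build the /ping URL for the ClickHouse HTTP interface."""
    scheme = "https" if secure else "http"
    return f"{scheme}://{host}:{port}/ping"


def clickhouse_reachable(host: str, port: int = 8443, secure: bool = True,
                         timeout: float = 5.0) -> bool:
    """Return True if the HTTP interface answers /ping with 'Ok.'."""
    try:
        with urllib.request.urlopen(ping_url(host, port, secure), timeout=timeout) as resp:
            return resp.status == 200 and resp.read().strip() == b"Ok."
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Refused connections, timeouts, TLS failures, and malformed replies all land here.
        return False
```

Run it from a machine outside your network where possible. A check that succeeds from inside the firewall says nothing about whether AnomalyArmor's IPs are allowed through.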

### Authentication failed

**Causes**:
- Wrong username or password
- User doesn't exist
- IP not in user's allowed hosts

**Solutions**:
1. Verify credentials
2. Check user exists: `SELECT * FROM system.users WHERE name = 'anomalyarmor'`
3. Verify IP is allowed: Check user's HOST restrictions
```sql
-- View user's allowed hosts
SELECT name, host_ip, host_names FROM system.users;
```

### SSL certificate error

**Causes**:
- Self-signed certificate not trusted
- Certificate hostname mismatch

**Solutions**:
1. For ClickHouse Cloud: Should work automatically
2. For self-hosted: Ensure certificate is valid
3. Contact support if issues persist with valid certificates

### Permission denied

**Causes**:
- User lacks SELECT on system tables
- User lacks access to target databases

**Solutions**:
```sql
-- Grant required permissions
GRANT SELECT ON system.* TO anomalyarmor;
GRANT SELECT ON your_database.* TO anomalyarmor;
GRANT SHOW ON *.* TO anomalyarmor;
```

### No tables found

**Causes**:
- User can only see specific databases
- All tables in excluded system databases

**Solutions**:
1. Grant `SHOW` privilege: `GRANT SHOW ON *.* TO anomalyarmor`
2. Verify tables exist outside system databases
3. Check AnomalyArmor schema filters

## Best Practices

### Use HTTPS in Production

Always use port 8443 with HTTPS for production:
- Encrypted in transit
- Required by ClickHouse Cloud
- Protects credentials

### Connect to One Node

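For clustered setups, connect to one node. Each node's system tables describe the tables defined on that node, so one node covers the cluster when schema changes are applied to every node (for example, with `ON CLUSTER` DDL or replicated tables).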

### Schedule Discovery After Mutations

If you have regular schema changes (ALTER TABLE), schedule discovery after those operations complete.

## Common Questions

### Which ClickHouse versions and deployments does AnomalyArmor support?

ClickHouse 21.8+ self-hosted and ClickHouse Cloud are fully supported. AnomalyArmor connects over the HTTP interface: port 8443 (HTTPS) for ClickHouse Cloud and production, or port 8123 (HTTP) for development and internal networks. The native TCP protocol is not supported. Older ClickHouse versions may work for basic monitoring but aren't officially supported, so upgrade if possible.

### Does AnomalyArmor support ClickHouse materialized views and dictionaries?

Materialized views are first-class assets (schema drift, freshness, metrics all work). Dictionaries are discovered and cataloged but not actively monitored for freshness; treat them as reference metadata.

### Can AnomalyArmor monitor freshness on a ClickHouse `MergeTree` table that uses event-time partitioning?

Yes. Use your event-time column (e.g., `event_date`, `created_at`) as the freshness timestamp. AnomalyArmor queries `MAX(event_date)` which ClickHouse resolves efficiently via partition metadata, even on billion-row tables.

### How much load does AnomalyArmor put on my ClickHouse cluster?

Very little. Discovery reads `system.tables` and `system.columns`, which are metadata tables. Freshness and metric checks use aggregates such as `MAX`, `COUNT(*)`, and `uniqExact`. On partitioned MergeTree tables, `MAX` over a partition-key column and `COUNT(*)` typically return in milliseconds or less; `uniqExact` has to read the column, so its cost grows with table size.

## Next Steps

- **[Run Discovery](/quickstart/run-first-discovery)**: Scan your ClickHouse database
- **[Set Up Alerts](/quickstart/set-up-first-alert)**: Get notified of schema changes

---

<!-- Source: https://docs.anomalyarmor.ai/data-sources/sql-server -->

# Microsoft SQL Server

> Connect AnomalyArmor to SQL Server databases including Azure SQL Database and on-premise instances

Connect AnomalyArmor to any Microsoft SQL Server database. This guide covers on-premise SQL Server, Azure SQL Database, and Amazon RDS for SQL Server.

## Supported Versions & Platforms

| Platform | Minimum Version | Notes |
|----------|-----------------|-------|
| SQL Server | 2012+ | On-premise or any cloud |
| SQL Server 2019 | Recommended | Best compatibility |
| SQL Server 2022 | Latest | Fully supported |
| Azure SQL Database | Any | All service tiers |
| Azure SQL Managed Instance | Any | All service tiers |
| Amazon RDS SQL Server | 2012+ | All instance classes |

> ⚠️ SQL Server 2008 and earlier are not supported. Please upgrade to SQL Server 2012+ for compatibility.

## Connection Settings

| Field | Description | Example |
|-------|-------------|---------|
| **Connection Name** | Friendly identifier | `Production SQL Server` |
| **Host** | Hostname or IP address | `db.example.com` |
| **Port** | Database port | `1433` |
| **Database** | Database name | `myapp_production` |
| **Username** | SQL Server user | `anomalyarmor` |
| **Password** | User password | `********` |
| **SSL Mode** | SSL configuration | `require` |

### Authentication Methods

| Method | Supported | Notes |
|--------|-----------|-------|
| **SQL Server Authentication** | Yes | Username and password |
| **Windows Authentication** | No | Not currently supported |
| **Azure Active Directory** | No | Planned for future release |

> ℹ️ SQL Server Authentication (username/password) is required. Windows Authentication and Azure AD are planned for future releases.

## Creating a Read-Only User

Create a dedicated user with minimal permissions:

```sql
-- Create a login at the server level
CREATE LOGIN anomalyarmor WITH PASSWORD = 'YourSecurePassword123!';

-- Switch to your database
USE your_database;

-- Create a user for the login
CREATE USER anomalyarmor FOR LOGIN anomalyarmor;

-- Grant SELECT on schemas (repeat for each schema you want to monitor)
GRANT SELECT ON SCHEMA::dbo TO anomalyarmor;
GRANT SELECT ON SCHEMA::production TO anomalyarmor;

-- Grant VIEW DEFINITION for schema introspection
GRANT VIEW DEFINITION TO anomalyarmor;
```
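
When many schemas need monitoring, the repeated `GRANT SELECT ON SCHEMA` lines can be generated instead of typed. A minimal sketch, assuming a plain list of schema names; `quote_identifier` and `grant_statements` are hypothetical helpers that only build the SQL text for you to review and run yourself.

```python
from typing import List


def quote_identifier(name: str) -> str:
    """Bracket-quote a T-SQL identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def grant_statements(user: str, schemas: List[str]) -> List[str]:
    """Build one SELECT grant per schema plus the database-level VIEW DEFINITION grant."""
    statements = [
        f"GRANT SELECT ON SCHEMA::{quote_identifier(schema)} TO {quote_identifier(user)};"
        for schema in schemas
    ]
    statements.append(f"GRANT VIEW DEFINITION TO {quote_identifier(user)};")
    return statements


for statement in grant_statements("anomalyarmor", ["dbo", "production"]):
    print(statement)
```

Bracket quoting keeps schema names with spaces or reserved words valid; for simple names the output is equivalent to the statements above.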

### Verifying Permissions

Test that the user can access metadata:

```sql
-- Should return tables
SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = 'dbo';

-- Should return columns
SELECT COLUMN_NAME, DATA_TYPE FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'dbo';
```

## Provider-Specific Instructions

### Azure SQL Database

**Connection Details**:
- **Host**: `yourserver.database.windows.net`
- **Port**: `1433`
- **SSL Mode**: Encryption is always enabled (TLS 1.2+)

**Firewall Configuration**:

1. Go to **Azure Portal > SQL databases > Your database > Set server firewall**
2. Add a rule for each AnomalyArmor IP address (see Settings > Security)
3. Or enable "Allow Azure services" if AnomalyArmor runs in Azure

```
Firewall Rules
──────────────
Rule name       │ Start IP       │ End IP
AnomalyArmor-1  │ 34.xxx.xxx.xxx │ 34.xxx.xxx.xxx
AnomalyArmor-2  │ 34.xxx.xxx.xxx │ 34.xxx.xxx.xxx
```

**Service Tiers**:
All tiers are supported:
- Basic, Standard, Premium (DTU-based)
- General Purpose, Business Critical, Hyperscale (vCore-based)
- Serverless

> ⚠️ Azure SQL Database enforces encrypted connections. The SSL mode setting is informational - Azure will always use TLS 1.2+.

### Azure SQL Managed Instance

**Connection Details**:
- **Host**: `yourinstance.abc123.database.windows.net`
- **Port**: `1433` (default) or `3342` (public endpoint)
- **SSL Mode**: `require`

**Connectivity Options**:

| Method | Description |
|--------|-------------|
| **VNet Integration** | Connect through Azure VNet peering |
| **Public Endpoint** | Enable public endpoint on port 3342 |

**Public Endpoint Setup**:

1. Go to **Azure Portal > SQL managed instances > Your instance > Security > Networking**
2. Enable **Public endpoint**
3. Allow inbound traffic from the AnomalyArmor IPs on port 3342 in the instance's network security group (NSG)

> ℹ️ For VNet-based connectivity, contact us about Enterprise VPC peering options.

### Amazon RDS for SQL Server

**Connection Details**:
- **Host**: Your RDS endpoint (e.g., `mydb.abc123.us-east-1.rds.amazonaws.com`)
- **Port**: `1433` (default)
- **SSL Mode**: `require`

**Security Group Configuration**:

1. Go to **AWS Console > RDS > Your Instance > Security Groups**
2. Edit inbound rules
3. Add rule:
   - Type: `MS SQL`
   - Port: `1433`
   - Source: AnomalyArmor IPs (see Settings > Security)

```
Security Group: sg-abc123
────────────────────────────────────────────────────────
Inbound Rules
MS SQL │ TCP │ 1433 │ 34.xxx.xxx.xxx/32 │ AnomalyArmor
MS SQL │ TCP │ 1433 │ 34.xxx.xxx.xxx/32 │ AnomalyArmor
```

**RDS SQL Server Editions**:
All editions are supported:
- Express Edition
- Web Edition
- Standard Edition
- Enterprise Edition

> ℹ️ RDS instances in private subnets require NAT Gateway or VPC peering for AnomalyArmor access.

### On-Premise SQL Server

**Connection Details**:
- **Host**: Your server's hostname or IP
- **Port**: `1433` (or custom port)
- **SSL Mode**: Depends on your setup

**Firewall Configuration**:

Allow inbound connections from AnomalyArmor IPs:

```powershell
# Windows Firewall example
New-NetFirewallRule -DisplayName "AnomalyArmor SQL" `
-Direction Inbound -LocalPort 1433 -Protocol TCP `
-RemoteAddress 34.xxx.xxx.xxx -Action Allow
```

**SQL Server Configuration**:

Ensure TCP/IP is enabled:

1. Open **SQL Server Configuration Manager**
2. Go to **SQL Server Network Configuration > Protocols**
3. Enable **TCP/IP**
4. Set TCP Port to `1433` (or your preferred port)
5. Restart SQL Server service

**Mixed Mode Authentication**:

SQL Server Authentication must be enabled:

1. Connect in SSMS as administrator
2. Right-click server > **Properties > Security**
3. Select **SQL Server and Windows Authentication mode**
4. Restart SQL Server service

**SSL/TLS Configuration** (recommended):

1. Install a valid SSL certificate on the server
2. In SQL Server Configuration Manager, go to **SQL Server Network Configuration > Protocols > Properties**
3. Set **Force Encryption** to Yes

## What We Query

AnomalyArmor runs these types of queries:

```sql
-- Tables and views
SELECT TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA NOT IN ('sys', 'INFORMATION_SCHEMA');

-- Columns
SELECT COLUMN_NAME, DATA_TYPE, IS_NULLABLE, COLUMN_DEFAULT
FROM INFORMATION_SCHEMA.COLUMNS;

-- Freshness (for timestamp columns)
SELECT MAX(your_timestamp_column) FROM your_table;
```

**Impact**: Minimal. Discovery uses the standard INFORMATION_SCHEMA views; the freshness check is a single `MAX` aggregate over one timestamp column.

## Excluded Schemas

AnomalyArmor automatically excludes system schemas:

- `sys` - SQL Server system objects
- `INFORMATION_SCHEMA` - ANSI standard metadata views

All user-created schemas are included by default.

## Troubleshooting

### Login failed for user

**Error**: `Login failed for user 'anomalyarmor'`

**Causes**:
- Wrong username or password
- SQL Server Authentication not enabled
- User doesn't have access to the specified database

**Solutions**:
1. Verify username and password are correct
2. Check SQL Server is in Mixed Mode authentication
3. Ensure the login exists: `SELECT name FROM sys.server_principals WHERE name = 'anomalyarmor'`
4. Ensure the user has database access

### Cannot open database

**Error**: `Cannot open database 'mydb' requested by the login`

**Causes**:
- Database name is incorrect
- User doesn't have access to the database
- Database doesn't exist

**Solutions**:
1. Verify database name (case-sensitive on some configurations)
2. Check user permissions: `SELECT name FROM sys.database_principals WHERE name = 'anomalyarmor'`
3. Grant access: `USE mydb; CREATE USER anomalyarmor FOR LOGIN anomalyarmor;`

### Connection refused or timeout

**Error**: `Cannot connect to SQL Server` or connection timeout

**Causes**:
- Firewall blocking the connection
- Wrong hostname or port
- SQL Server not listening on TCP/IP
- SQL Server Browser service not running (named instances)

**Solutions**:
1. Verify AnomalyArmor IPs are allowlisted
2. Check firewall rules
3. Ensure TCP/IP protocol is enabled in SQL Server Configuration Manager
4. For named instances, ensure SQL Server Browser is running or specify the port
5. Test connectivity: `Test-NetConnection hostname -Port 1433`
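
On machines without PowerShell, a plain TCP connect gives the same signal as `Test-NetConnection`. This is a sketch with the Python standard library; `port_open` is a hypothetical helper name.

```python
import socket


def port_open(host: str, port: int = 1433, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

A successful connect only shows that something is listening on the port from where the script runs. It does not test the login or the firewall rules that apply to AnomalyArmor's IPs.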

### Azure SQL firewall error

**Error**: `Cannot connect - firewall rule` or error 40615

**Causes**:
- AnomalyArmor IP not in Azure SQL firewall rules
- Public access is disabled

**Solutions**:
1. Go to Azure Portal > SQL databases > Set server firewall
2. Add AnomalyArmor IP addresses to firewall rules
3. Ensure "Deny public network access" is Off

### Windows Authentication required

**Error**: Error 18470 or "Windows authentication is required"

**Causes**:
- Server is configured for Windows Authentication only
- SQL Server Authentication is disabled

**Solutions**:
1. Enable Mixed Mode authentication in SQL Server properties
2. Restart SQL Server service
3. AnomalyArmor currently requires SQL Server Authentication

### No tables found in discovery

**Causes**:
- User lacks SELECT permission on schemas
- User lacks VIEW DEFINITION permission
- All tables are in excluded schemas

**Solutions**:
1. Grant schema access: `GRANT SELECT ON SCHEMA::dbo TO anomalyarmor;`
2. Grant view definition: `GRANT VIEW DEFINITION TO anomalyarmor;`
3. Test query: `SELECT * FROM INFORMATION_SCHEMA.TABLES;`

## Common Questions

### Which SQL Server versions and services does AnomalyArmor support?

SQL Server 2012+, Azure SQL Database (Single and Elastic Pool), Azure SQL Managed Instance, and Amazon RDS for SQL Server. SQL Server 2008 and earlier are not supported due to missing information-schema features.

### Do I need a SQL login or Windows authentication for Azure SQL?

SQL login (username + password) is the recommended path for Azure SQL Database. Azure AD authentication is supported on Enterprise plans. Windows authentication is not supported for managed Azure SQL; for on-premise SQL Server, SQL auth is still the simplest setup.

### What permissions does AnomalyArmor need on SQL Server?

Minimal read: grant the connection's login `CONNECT` to the database, `VIEW DEFINITION` on the schemas to monitor (so information_schema returns your tables), and `SELECT` on those tables. No `ALTER`, `INSERT`, or `sysadmin` needed.

### Does AnomalyArmor support Always Encrypted columns in SQL Server?

AnomalyArmor doesn't decrypt Always Encrypted columns - nor does it need to. Schema drift and freshness work against encrypted columns because we only read metadata and aggregate over non-encrypted timestamp columns. If a column is Always Encrypted, we see its presence and type; its values stay encrypted end-to-end.

## Next Steps

- **[Run Discovery](/quickstart/run-first-discovery)**: Scan your SQL Server database
- **[Set Up Alerts](/quickstart/set-up-first-alert)**: Get notified of schema changes

---

<!-- Source: https://docs.anomalyarmor.ai/schema-monitoring/overview -->

# Schema Monitoring Overview

> Detect and track schema changes across your data assets

Schema monitoring is AnomalyArmor's core capability. It automatically detects changes to your database structure (columns added, removed, renamed, or modified) before they break your pipelines and dashboards.

## Why Schema Monitoring Matters

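Schema changes are one of the most common causes of data pipeline failures. A dropped column can fail downstream models overnight, and a changed data type can skew aggregations for days before anyone notices.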

## What We Monitor

AnomalyArmor detects all structural changes to your database:

| Change Type | Description | Impact Level |
|------------|-------------|--------------|
| **Column Removed** | Existing column no longer exists | High - breaks queries |
| **Column Added** | New column appears | Low - usually safe |
| **Column Type Changed** | Data type modified | Medium - may affect logic |
| **Column Renamed** | Name changed (detected as remove + add) | High - breaks queries |
| **Table Added** | New table discovered | Low - informational |
| **Table Removed** | Table no longer exists | High - breaks queries |
| **Constraint Changed** | Primary key, foreign key, unique modified | Medium - may affect joins |

## How It Works

Schema monitoring happens during **discovery**:

1. **Discovery scans** your database's system catalogs
2. **Comparison** against the previous discovery state
3. **Changes recorded** with full before/after details
4. **Alerts triggered** based on your rules
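
The comparison step can be pictured as a diff between two snapshots. The sketch below is illustrative only and is not AnomalyArmor's implementation: it assumes a snapshot is a mapping of table name to `{column: type}`, and `Change` and `diff_snapshots` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

Snapshot = Dict[str, Dict[str, str]]  # table name -> {column name: data type}


@dataclass(frozen=True)
class Change:
    kind: str
    table: str
    column: Optional[str] = None
    before: Optional[str] = None
    after: Optional[str] = None


def diff_snapshots(previous: Snapshot, current: Snapshot) -> List[Change]:
    """Compare two discovery snapshots and list the structural changes between them."""
    changes: List[Change] = []
    for table in sorted(previous.keys() - current.keys()):
        changes.append(Change("table_removed", table))
    for table in sorted(current.keys() - previous.keys()):
        changes.append(Change("table_added", table))
    for table in sorted(previous.keys() & current.keys()):
        old, new = previous[table], current[table]
        for column in sorted(old.keys() - new.keys()):
            changes.append(Change("column_removed", table, column, before=old[column]))
        for column in sorted(new.keys() - old.keys()):
            changes.append(Change("column_added", table, column, after=new[column]))
        for column in sorted(old.keys() & new.keys()):
            if old[column] != new[column]:
                changes.append(
                    Change("column_type_changed", table, column, old[column], new[column])
                )
    return changes
```

Because the diff only sees two states, a change made and reverted between discovery runs leaves no trace. This is why the discovery schedule determines what can be detected.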

## Key Features

### Automatic Detection

Changes are detected automatically during scheduled discovery; no manual work is required.

- Set discovery to run hourly, daily, or on a custom schedule
- Changes detected on next run after they occur
- Zero configuration for basic monitoring

### Complete Change History

Every change is recorded with full context:

- **What changed**: Column name, type, constraints
- **When detected**: Timestamp of discovery
- **Before/after**: Previous and new values
- **Asset context**: Which table, schema, database

### Flexible Alerting

Route different changes to different places:

```
Production + Column Removed  →  PagerDuty + Slack #incidents
Production + Column Added    →  Slack #data-changes
Development + Any Change     →  Email digest
```
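
Routing like this amounts to a first-match lookup over an ordered rule list. The sketch below mirrors the three example routes; the `Rule` type, the destination strings, and the `"*"` wildcard are assumptions for illustration, not AnomalyArmor's rule format.

```python
from typing import List, NamedTuple


class Rule(NamedTuple):
    environment: str
    change_type: str  # "*" matches any change type
    destinations: List[str]


RULES = [
    Rule("production", "column_removed", ["pagerduty", "slack:#incidents"]),
    Rule("production", "column_added", ["slack:#data-changes"]),
    Rule("development", "*", ["email-digest"]),
]


def route(environment: str, change_type: str, rules: List[Rule] = RULES) -> List[str]:
    """Return the destinations of the first rule matching the environment and change type."""
    for rule in rules:
        if rule.environment == environment and rule.change_type in ("*", change_type):
            return rule.destinations
    return []  # no rule matched, so nothing is sent
```

With first-match semantics, order matters: specific rules should come before wildcard rules for the same environment.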

### Change Visualization

See schema changes at a glance:

*Schema diff showing before and after states*

## Getting Started

### 1. Connect Your Database

Schema monitoring starts when you [add a data source](/data-sources/overview).

### 2. Run Discovery

[Trigger discovery](/quickstart/run-first-discovery) to establish a baseline.

### 3. Set Up Alerts

[Create alert rules](/alerts/alert-rules) for schema changes:

```
Event: Schema Change Detected
Scope: Your production database
Destinations: Slack, Email, or PagerDuty
```

### 4. Schedule Discovery

Configure discovery to run regularly:

| Environment | Recommended Schedule |
|-------------|---------------------|
| Production | Hourly |
| Staging | Every 6 hours |
| Development | Daily |

## Viewing Schema Changes

### In the Asset Catalog

1. Go to **Assets**
2. Click on a table or view
3. Select the **Schema History** tab

You'll see:
- Timeline of all changes
- Before/after comparison for each change
- Which discovery run detected it

### In Alert History

1. Go to **Alerts → History**
2. Filter by "Schema Change"
3. Click an alert to see change details

## Schema Change Types

### Column Removed

**What**: A column that existed before is no longer present.

**Risk**: High. Any query referencing this column will fail.

**Example**:
```
Table: orders
Column removed: shipping_status (varchar)
Detected: 2024-01-15 08:00 UTC
```

**Action**: Update downstream queries, models, and reports.

### Column Added

**What**: A new column appears that didn't exist before.

**Risk**: Low. Existing queries usually continue working.

**Example**:
```
Table: orders
Column added: delivery_estimate (timestamp)
Detected: 2024-01-15 08:00 UTC
```

**Action**: Consider adding to models if relevant.

### Column Type Changed

**What**: A column's data type was modified.

**Risk**: Medium. May affect type-specific logic.

**Example**:
```
Table: products
Column: price
Type change: integer → decimal(10,2)
Detected: 2024-01-15 08:00 UTC
```

**Action**: Review aggregations, comparisons, and type casting.

### Table Removed

**What**: An entire table no longer exists.

**Risk**: High. All queries to this table will fail.

**Example**:
```
Table removed: legacy_orders
Last seen: 2024-01-14 08:00 UTC
Detected: 2024-01-15 08:00 UTC
```

**Action**: Urgent. Investigate if intentional.

## Best Practices

### Monitor Production Most Closely

Production schema changes have the highest impact. Run discovery more frequently and set up immediate alerts.

### Route by Change Type

Not all changes are equally urgent:

| Change | Alert Destination |
|--------|------------------|
| Column/Table removed | PagerDuty + Slack |
| Type changed | Slack |
| Column/Table added | Email or Slack (low priority) |

### Coordinate with Change Management

When you know schema changes are coming:

1. Inform your team
2. Update downstream code first (if possible)
3. Make the schema change
4. Verify AnomalyArmor detects it
5. Confirm no unexpected alerts

### Use Schema History for Debugging

When a pipeline fails, check schema history:

1. Find the asset in AnomalyArmor
2. Look at recent schema changes
3. Correlate timing with pipeline failure
4. Identify the breaking change
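
If you export schema history, the timing correlation in step 3 can be scripted. A minimal sketch, assuming each change is a dict with a `detected_at` datetime; `candidate_changes` and the record shape are hypothetical, not an AnomalyArmor export format.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def candidate_changes(
    changes: List[Dict],
    failed_at: datetime,
    lookback: timedelta = timedelta(hours=24),
    discovery_interval: timedelta = timedelta(hours=1),
) -> List[Dict]:
    """Return changes detected near the failure, most recent first."""
    window_start = failed_at - lookback
    # A change is detected on the discovery run after it happens, so a change that
    # caused the failure may be recorded up to one interval after the failure.
    window_end = failed_at + discovery_interval
    hits = [c for c in changes if window_start <= c["detected_at"] <= window_end]
    return sorted(hits, key=lambda c: c["detected_at"], reverse=True)
```

Set `discovery_interval` to match the source's schedule. With daily discovery, the window after the failure needs to be a full day wide.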

## Common Questions

### How quickly will I know when a column is removed?

Detection latency equals your discovery schedule. Hourly discovery catches removals within an hour, daily discovery within 24 hours. Alerts fire as soon as the comparison completes. See [Asset Discovery](/schema-monitoring/asset-discovery) to adjust the schedule.

### Does schema monitoring require writing any code or queries?

No. Schema monitoring is automatic once you connect a data source and run discovery. You only configure alert rules to decide where notifications go.

### Can I get different alerts for different types of schema changes?

Yes. Route breaking changes like column removals to PagerDuty, type changes to Slack, and additions to an email digest. Configure this in [alert rules](/alerts/alert-rules) by filtering on change type.

### How does AnomalyArmor tell a rename from a remove-plus-add?

It doesn't, automatically. A rename is detected as a column removed plus a column added. The schema diff highlights the pair so you can confirm it's a rename and update downstream references.
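
When reviewing a diff by hand, one simple heuristic is to pair removed and added columns in the same table that share a data type. The sketch below illustrates that heuristic only; it is not how AnomalyArmor highlights pairs, and `rename_candidates` is a hypothetical helper.

```python
from typing import Dict, List, Tuple


def rename_candidates(removed: Dict[str, str], added: Dict[str, str]) -> List[Tuple[str, str]]:
    """Pair removed and added columns of one table that have the same data type."""
    pairs = []
    for old_name, old_type in sorted(removed.items()):
        for new_name, new_type in sorted(added.items()):
            if old_type == new_type:
                pairs.append((old_name, new_name))
    return pairs


print(rename_candidates(
    removed={"shipping_status": "varchar"},
    added={"ship_status": "varchar", "delivery_estimate": "timestamp"},
))
```

A matching type is only a hint. Every pair still needs a human to confirm it is a rename and not an unrelated drop and add.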

### Is there a history of past schema changes I can review?

Yes. Open any asset and select the **Schema History** tab to see every detected change with before and after values, timestamps, and the discovery run that found it. This is useful for correlating pipeline failures with structural changes.

## Related Topics

- **[Schema Drift](/schema-monitoring/schema-drift)**: Deep dive into detecting and managing drift
- **[Asset Discovery](/schema-monitoring/asset-discovery)**: How discovery finds and catalogs assets
- **[Alert Rules](/alerts/alert-rules)**: Configure schema change alerts
- **[Core Concepts](/concepts/schema-changes)**: Understand schema changes in depth

---

<!-- Source: https://docs.anomalyarmor.ai/schema-monitoring/asset-discovery -->

# Asset Discovery

> Configure and run discovery to catalog your database assets

Discovery scans your databases to build a catalog of tables, views, and columns. Run it manually or on a schedule to keep your asset inventory current.

## Running Discovery Manually

1. Go to **Data Sources** in the sidebar
2. Find your data source
3. Click **Run Discovery** (or the refresh icon)
4. Wait for completion (typically 30 seconds to 5 minutes)

You'll see "Discovery completed" with a summary of assets found.

## Scheduling Discovery

Set discovery to run automatically:

1. Go to **Data Sources** → select your data source
2. Click **Settings** (gear icon)
3. Under **Discovery Schedule**, select frequency:

| Schedule | Best For | Typical Use Case |
|----------|----------|------------------|
| **Hourly** | Production databases | Catch changes quickly |
| **Every 6 hours** | Active development | Balance coverage and load |
| **Daily** | Stable databases | Low-change environments |
| **Weekly** | Archives, warehouses | Rarely modified systems |

4. Click **Save**

Discovery runs at the start of each interval (e.g., hourly = on the hour).
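
As an illustration of interval-aligned scheduling, the sketch below computes the next run time for a fixed interval. It assumes intervals are counted from midnight UTC, which this page does not specify, and `next_run` is a hypothetical helper rather than part of AnomalyArmor.

```python
from datetime import datetime, timedelta, timezone


def next_run(now: datetime, interval: timedelta) -> datetime:
    """Next interval boundary strictly after `now`, counted from midnight UTC.

    Assumes `interval` divides 24 hours evenly (hourly, every 6 hours, daily).
    """
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    completed = (now - midnight) // interval  # whole intervals elapsed today
    return midnight + (completed + 1) * interval


now = datetime(2024, 1, 15, 8, 30, tzinfo=timezone.utc)
hourly = next_run(now, timedelta(hours=1))      # 09:00 UTC
six_hourly = next_run(now, timedelta(hours=6))  # 12:00 UTC
daily = next_run(now, timedelta(days=1))        # midnight UTC on January 16
```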

## What Discovery Finds

During each run, discovery catalogs:

| Asset Type | What's Captured |
|------------|-----------------|
| Tables | Name, schema, columns, types, constraints |
| Views | Name, schema, underlying columns |
| Columns | Name, data type, nullability, defaults |
| Relationships | Foreign keys, primary keys |
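
To make the column-level capture concrete, here is a small sketch of reading column metadata from a system catalog. It uses Python's built-in `sqlite3` module and SQLite's `PRAGMA table_info` as a stand-in, since SQLite has no `information_schema`. It is not AnomalyArmor's discovery query, and `snapshot_columns` is a hypothetical helper.

```python
import sqlite3


def snapshot_columns(conn: sqlite3.Connection, table: str) -> dict:
    """Map column name -> metadata, read from the catalog (no data rows scanned)."""
    # PRAGMA table_info yields: cid, name, type, notnull, dflt_value, pk
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return {
        name: {
            "type": col_type,
            "nullable": not notnull,
            "default": default,
            "primary_key": bool(pk),
        }
        for _cid, name, col_type, notnull, default, pk in rows
    }


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " status VARCHAR NOT NULL DEFAULT 'new',"
    " total DECIMAL)"
)
catalog = snapshot_columns(conn, "orders")
```

Storing a snapshot like `catalog` after every run is what makes later comparison between runs possible.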

## Discovery Duration

Discovery time depends on database size:

| Database Size | Typical Duration |
|---------------|------------------|
| < 100 tables | 30 seconds |
| 100-500 tables | 1-2 minutes |
| 500-2000 tables | 2-5 minutes |
| 2000+ tables | 5-15 minutes |

> ℹ️ Discovery runs read-only queries against system catalogs. It doesn't impact your database performance significantly.

## Incremental vs Full Discovery

**Incremental** (default): Checks for changes since last run. Faster.

**Full**: Re-scans everything. Use when:
- First run after connecting
- Suspect missed changes
- After major database migrations

To run full discovery:
1. Go to **Data Sources** → select your data source
2. Click the dropdown arrow next to **Run Discovery**
3. Select **Full Discovery**

## Viewing Discovery Results

After discovery completes:

1. Go to **Assets** in the sidebar
2. Filter by data source to see discovered assets
3. Click any asset to view details

New assets show a "New" badge for 24 hours.

## Discovery History

View past discovery runs:

1. Go to **Data Sources** → select your data source
2. Click **History** tab
3. See each run with:
   - Timestamp
   - Duration
   - Assets found/changed
   - Any errors

## Troubleshooting

### Discovery stuck or taking too long

**Causes:**
- Large database with many tables
- Slow network connection
- Database under heavy load

**Fix:**
1. Check discovery history for progress
2. Wait up to 30 minutes for large databases
3. If stuck, cancel and retry during off-peak hours

### Discovery completed but missing tables

**Causes:**
- User doesn't have permission on those schemas
- Tables in schemas excluded from discovery
- Views the user can't access

**Fix:**
1. Check your database user's permissions
2. Grant `SELECT` on `information_schema` (or equivalent)
3. Verify schema inclusion settings

### Discovery failed

**Causes:**
- Connection credentials changed
- Database unreachable
- Permission revoked

**Fix:**
1. Check the error message in discovery history
2. Test the connection in Data Sources
3. Verify credentials and network access

## Common Questions

### Does discovery impact database performance?

No. Discovery runs read-only queries against system catalogs (`information_schema` and equivalents), not against your data tables. The load is negligible even on large databases.

### How often should I run discovery?

For production databases, run discovery **hourly** to catch schema changes quickly. Stable databases can run daily, and archives can run weekly. More frequent runs shorten the window between a change and detection.

### What's the difference between incremental and full discovery?

Incremental discovery checks only for changes since the last run and is the default. Full discovery re-scans everything and should be used after the first connection, after major migrations, or when you suspect a missed change.

### Why are some tables missing after discovery?

The most common cause is permissions. The database user AnomalyArmor connects with needs `SELECT` on `information_schema` (or the vendor equivalent) and access to the target schemas. Check also that the schema isn't excluded in your discovery settings.

### How long does discovery take on a large database?

Databases under 100 tables finish in about 30 seconds, 100-500 tables take 1-2 minutes, 500-2000 tables take 2-5 minutes, and 2000+ tables can take 5-15 minutes. If a scan runs longer than 30 minutes, it is likely stuck; cancel it and retry during off-peak hours.

## See Also

- **[Discovery Concepts](/concepts/discovery)**: How discovery works under the hood
- **[Schema Changes](/concepts/schema-changes)**: Understanding detected changes

---

<!-- Source: https://docs.anomalyarmor.ai/schema-monitoring/schema-drift -->

# Schema Drift Detection

> Detect and track schema changes over time to prevent data pipeline failures

Schema drift occurs when your database structure changes, intentionally or accidentally. AnomalyArmor detects drift automatically and alerts you before it causes downstream failures.

*Schema diff visualization showing column changes*

## What is Schema Drift?

Schema drift is any change to database structure:

| Column | Before (Discovery T1) | After (Discovery T2) | Change |
|--------|----------------------|---------------------|--------|
| id | `integer` | `integer` | No change |
| user_id | `integer` | `integer` | No change |
| status | `varchar` | -- | **Removed** (or renamed) |
| order_status | -- | `varchar` | **Added** (possibly renamed from `status`) |
| total | `decimal` | `decimal` | No change |
| shipping_date | `timestamp` | -- | **Removed** (or renamed) |
| shipping_estimate | -- | `timestamp` | **Added** (possibly renamed from `shipping_date`) |
| delivery_fee | -- | `decimal` | **Added** |
| created_at | `timestamp` | `timestamp` | No change |

## Types of Schema Drift

### Breaking Changes (High Risk)

Changes that will cause immediate failures:

| Change | Example | Impact |
|--------|---------|--------|
| **Column Removed** | `shipping_status` deleted | Queries referencing it fail |
| **Table Removed** | `legacy_orders` dropped | All queries to table fail |
| **Column Renamed** | `status` → `order_status` | Queries using old name fail |
| **Incompatible Type Change** | `varchar` → `integer` | Type mismatches, cast failures |

### Non-Breaking Changes (Low-Medium Risk)

Changes that usually don't cause immediate failures:

| Change | Example | Impact |
|--------|---------|--------|
| **Column Added** | New `delivery_fee` column | Existing queries work; new data available |
| **Table Added** | New `order_items` table | No impact on existing queries |
| **Compatible Type Change** | `varchar(50)` → `varchar(100)` | Usually transparent |
| **Constraint Added** | New unique constraint | May affect inserts, not selects |

## How Drift is Detected

AnomalyArmor compares consecutive discovery runs:

| Column | Run #1 (Baseline) | Run #2 (Comparison) | Result |
|--------|-------------------|---------------------|--------|
| id | Present | Present | No change |
| user_id | Present | Present | No change |
| status | Present | Missing | **Removed** |
| total | Present | Present | No change |
| order_status | Missing | Present | **Added** |
| created_at | Present | Present | No change |

**Detected drift**: Column `status` removed, column `order_status` added (possibly a rename, investigate).
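
The comparison in the table above amounts to a set difference between two snapshots. The following is a minimal sketch, assuming each run is stored as a `{column: type}` dictionary; `diff_columns` is an illustrative helper, not AnomalyArmor's internal implementation.

```python
def diff_columns(baseline: dict, current: dict) -> list:
    """Compare two {column: type} snapshots and list the detected changes."""
    changes = []
    for name in baseline.keys() - current.keys():
        changes.append(("removed", name, baseline[name], None))
    for name in current.keys() - baseline.keys():
        changes.append(("added", name, None, current[name]))
    for name in baseline.keys() & current.keys():
        if baseline[name] != current[name]:
            changes.append(("type_changed", name, baseline[name], current[name]))
    # Sort by change kind, then column name, for stable output.
    return sorted(changes, key=lambda c: (c[0], c[1]))


run_1 = {"id": "integer", "user_id": "integer", "status": "varchar",
         "total": "decimal", "created_at": "timestamp"}
run_2 = {"id": "integer", "user_id": "integer", "order_status": "varchar",
         "total": "decimal", "created_at": "timestamp"}
drift = diff_columns(run_1, run_2)
```

For these two runs, `drift` contains one `added` entry for `order_status` and one `removed` entry for `status`, matching the detected drift described above.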

### Detection Timing

| Discovery Schedule | Max Detection Delay |
|-------------------|---------------------|
| Every 15 minutes | 15 minutes |
| Hourly | 1 hour |
| Daily | 24 hours |

> 💡 For production databases, hourly discovery provides a good balance of timeliness and resource usage.

## Viewing Schema Drift

### Asset Schema History

1. Navigate to **Assets**
2. Click on a table
3. Select **Schema History** tab

You'll see:

![Schema change timeline displaying multiple changes with timestamps and details](/images/screenshots/schema-history-timeline.png)

*Schema change history showing column additions, removals, and modifications over time*

### Change Details

Click any change to see details:

- **Before state**: Previous column definition
- **After state**: New column definition (or "Removed")
- **Detection time**: When the change was discovered
- **Discovery run**: Which scan found it

## Schema Drift Alerts

### Setting Up Alerts

Create rules to notify you of drift:

| Field | Value |
|-------|-------|
| **Name** | Production Breaking Changes |
| **Event** | Schema Change Detected |
| **Data source** | `production-postgres` |
| **Schema** | `public`, `analytics` |
| **Change type** | Column Removed, Table Removed, Type Changed |
| **Destinations** | Slack, PagerDuty |
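
To show how the filters in this rule combine, here is a hedged sketch of rule matching in Python. The `RULE` dictionary and `SchemaChangeEvent` type are hypothetical mirrors of the table above, not an AnomalyArmor configuration format or API. The sketch assumes every filter must match for the rule to fire.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchemaChangeEvent:
    """Hypothetical event shape, for illustration only."""
    data_source: str
    schema: str
    table: str
    change_type: str


# In-code mirror of the rule table above (not a real config format).
RULE = {
    "name": "Production Breaking Changes",
    "data_source": "production-postgres",
    "schemas": {"public", "analytics"},
    "change_types": {"Column Removed", "Table Removed", "Type Changed"},
    "destinations": ["Slack", "PagerDuty"],
}


def destinations_for(event: SchemaChangeEvent, rule: dict = RULE) -> list:
    """Return the rule's destinations if every filter matches, else an empty list."""
    matches = (
        event.data_source == rule["data_source"]
        and event.schema in rule["schemas"]
        and event.change_type in rule["change_types"]
    )
    return list(rule["destinations"]) if matches else []


removed = SchemaChangeEvent("production-postgres", "public", "orders", "Column Removed")
added = SchemaChangeEvent("production-postgres", "public", "orders", "Column Added")
```

With this rule, `removed` is routed to Slack and PagerDuty, while `added` matches nothing because additions are not in the change type filter.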

### Alert Content

Schema drift alerts include:

```
⚠️ Schema Change Detected

Asset: production.public.orders
Changes detected:
  • Column removed: status (varchar)
  • Column added: order_status (varchar)

Detected: January 15, 2024 at 8:00 AM UTC
Discovery Run: #1234

[View Details] [View Asset]
```

## Common Drift Scenarios

### Scenario 1: Column Rename

**Symptom**: Column removed + column added with similar name

```
Detected changes:
- Column removed: user_email
- Column added: email_address

This is likely a rename.
```

**Action**:
1. Confirm it's a rename (ask the upstream team)
2. Update all downstream references
3. Verify queries work with new name
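
If you want to shortlist rename candidates yourself before asking the upstream team, one possible heuristic is to pair removed and added columns that share a data type and have similar names. The sketch below uses `difflib.SequenceMatcher` from the Python standard library. The 0.4 threshold is an arbitrary assumption, and this is not how AnomalyArmor flags pairs.

```python
from difflib import SequenceMatcher


def likely_renames(removed: dict, added: dict, min_similarity: float = 0.4) -> list:
    """Pair removed and added columns of the same type whose names look alike.

    `removed` and `added` map column name -> data type. The result is a list
    of candidates to confirm with a human, not a verdict.
    """
    candidates = []
    for old_name, old_type in removed.items():
        for new_name, new_type in added.items():
            if old_type != new_type:
                continue  # a rename normally keeps the data type
            score = SequenceMatcher(None, old_name, new_name).ratio()
            if score >= min_similarity:
                candidates.append((old_name, new_name, round(score, 2)))
    return sorted(candidates, key=lambda c: c[2], reverse=True)


pairs = likely_renames(
    removed={"status": "varchar", "shipping_date": "timestamp"},
    added={"order_status": "varchar", "shipping_estimate": "timestamp",
           "delivery_fee": "decimal"},
)
```

Name similarity produces false matches as well as misses, so treat each pair as a prompt for step 1, not a replacement for it.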

### Scenario 2: Unannounced Migration

**Symptom**: Multiple columns changed in related tables

```
Detected changes in users:
- Column type changed: phone (varchar → jsonb)
- Column added: phone_verified (boolean)

Detected changes in user_profiles:
- Column removed: phone_number
```

**Action**:
1. Identify the migration (check with upstream team)
2. Review impact on your queries
3. Update affected models

### Scenario 3: Table Dropped

**Symptom**: Table no longer exists

```
Asset removed: public.legacy_orders
Last seen: January 14, 2024

This could be:
- Intentional removal
- Renamed table
- Permission change
- Wrong database connected
```

**Action**:
1. Verify if intentional
2. Update any queries referencing it
3. Check if data was migrated elsewhere

### Scenario 4: Type Change

**Symptom**: Column data type modified

```
Column: products.price
Type change: integer → decimal(10,2)
```

**Action**:
1. Review aggregations (SUM, AVG) for precision changes
2. Check comparisons and arithmetic
3. Verify no type casting issues

## Managing Expected Drift

### Planned Schema Changes

When you know changes are coming:

1. **Communicate**: Tell your team about upcoming changes
2. **Prepare**: Update downstream code in advance (staged rollout)
3. **Execute**: Make the schema change
4. **Verify**: Confirm AnomalyArmor detects it
5. **Acknowledge**: Mark alerts as expected

### Disabling Rules During Migrations

For large migrations:

1. Go to **Alerts → Rules**
2. Toggle OFF relevant rules
3. Perform migration
4. Review detected changes
5. Toggle rules back ON

> ⚠️ Don't forget to re-enable rules after migration! Set a calendar reminder.

## Drift Prevention

### Schema Versioning

Coordinate schema changes with version control:

- Use migration tools (Flyway, Alembic, Atlas)
- Review migrations before applying
- Test in staging first

### Change Communication

Establish processes:

- Schema change announcements
- Dependency mapping
- Impact assessment before changes

### Backward Compatibility

When changing schemas:

- Add new columns instead of modifying existing
- Keep old columns during transition periods
- Use views to maintain compatibility

## Troubleshooting

### Drift detected but change was intentional

This is expected behavior. AnomalyArmor doesn't know intent. It reports all changes. Consider:
1. Acknowledging the alert
2. Documenting the change in your records
3. Disabling rules during planned migrations

### Drift not detected

1. **Check discovery schedule**: Has discovery run since the change?
2. **Check scope**: Is the table in a monitored schema?
3. **Check permissions**: Can the AnomalyArmor database user see the table?
4. **Run manual discovery**: Trigger discovery to check now

### Too many drift alerts

1. **Filter by change type**: Alert only on breaking changes (removed columns/tables)
2. **Exclude dev/test**: Don't monitor volatile environments
3. **Use patterns**: Exclude `*_temp`, `*_backup` tables
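
To check what a pattern such as `*_temp` would exclude, you can test it against your table names. This sketch assumes the patterns behave like shell-style globs, which this page does not confirm, and uses `fnmatch` from the Python standard library.

```python
from fnmatch import fnmatchcase

EXCLUDE_PATTERNS = ["*_temp", "*_backup"]  # patterns from the tip above


def is_excluded(table_name: str, patterns=EXCLUDE_PATTERNS) -> bool:
    """True if the table name matches any shell-style exclusion pattern."""
    return any(fnmatchcase(table_name, pattern) for pattern in patterns)


tables = ["orders", "orders_temp", "users_backup", "temp_users"]
monitored = [t for t in tables if not is_excluded(t)]
```

Note that `temp_users` stays monitored: `*_temp` matches only names that end in `_temp`.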

### False positive - rename detected as remove+add

This is expected. AnomalyArmor can't automatically determine if a remove+add pair is a rename. You should:
1. Investigate the pair manually
2. Document the rename in your records
3. Update downstream references

## Best Practices

1. **Monitor production closely**: Run frequent discovery, alert immediately
2. **Differentiate environments**: Different rules for prod vs. dev
3. **Route by change type**: Breaking changes → PagerDuty; additions → Email
4. **Review history regularly**: Check drift patterns to identify risky areas
5. **Coordinate changes**: Communicate schema changes before making them

## Common Questions

### What counts as a breaking schema change?

Column removals, table removals, column renames, and incompatible type changes (such as `varchar` to `integer`) will break queries immediately. Additions and compatible type widenings (`varchar(50)` to `varchar(100)`) usually don't.

### How do I avoid alert storms during a planned migration?

Disable the relevant alert rules in **Alerts → Rules** before the migration, let the changes flow through, then toggle the rules back on. Set a calendar reminder: leaving rules off is a common mistake.

### Can AnomalyArmor detect renames automatically?

No. A rename shows up as a column removed plus a column added with a similar name. The UI flags the pair as a possible rename, but confirming intent still requires a human to check with the upstream team.

### Why did drift get detected for a change I made intentionally?

AnomalyArmor reports all structural changes because it has no way to know which ones you planned. Acknowledge the alert, document the change, and optionally disable the rule during known migration windows.

### What should I do first when a breaking change is detected in production?

Open the asset's schema history to see exactly what changed and when. Then check downstream dependencies, such as views, models, and dashboards, that reference the affected column or table, and coordinate fixes before more queries fail.

### Does detection work on views and materialized views, not just tables?

Yes. Discovery catalogs tables, views, and their underlying columns, so drift on views is detected the same way. See [Asset Discovery](/schema-monitoring/asset-discovery) for what's captured.

## Related Topics

- **[Schema Monitoring Overview](/schema-monitoring/overview)**: Broader schema monitoring capabilities
- **[Alert Rules](/alerts/alert-rules)**: Configure drift alerts
- **[Schema Changes Concept](/concepts/schema-changes)**: Deep dive into change types
- **[Asset Discovery](/schema-monitoring/asset-discovery)**: How assets are discovered

---

<!-- Source: https://docs.anomalyarmor.ai/data-quality/overview -->

# Data Quality Overview

> Monitor the health and reliability of your data assets

Data quality monitoring ensures your data is fresh, complete, and reliable. AnomalyArmor helps you detect data issues before they impact business decisions.

## Why Data Quality Matters

Bad data leads to bad decisions.

## Data Quality Dimensions

AnomalyArmor monitors key data quality dimensions:

| Dimension | What It Means | How We Monitor |
|-----------|---------------|----------------|
| **Freshness** | Is data up to date? | Timestamp monitoring, SLAs |
| **Completeness** | Did the right amount arrive? | Row count monitoring, ML anomaly detection |
| **Metrics** | Are column values correct? | Statistical monitoring, anomaly detection |
| **Schema** | Is structure correct? | Schema drift detection |
| **Availability** | Is data accessible? | Discovery success/failure |

## Monitoring Capabilities

- **[Freshness Monitoring](/data-quality/freshness-monitoring)**: Track when data was last updated and detect stale data before it impacts downstream consumers.
- **[Row Count Monitoring](/data-quality/row-count-monitoring)**: Monitor row counts with ML-based anomaly detection or explicit thresholds.
- **[Data Quality Metrics](/data-quality/metrics)**: Monitor null percentages, distinct counts, and other column-level statistics. Detect anomalies automatically.
- **[Schema Monitoring](/schema-monitoring/overview)**: Detect structural changes to your database that could break pipelines and reports.
- **[Report Badges](/data-quality/report-badges)**: Embed data quality status indicators in dashboards, wikis, and operational tools.

## How Data Quality Monitoring Works

1. **Discovery runs** on your configured schedule, and **metrics are captured** at defined intervals
2. **Metadata is collected**, including schema, timestamps, and metric values (row counts, null %, etc.)
3. **Results are compared against expectations** (SLAs, statistical baselines, previous state)
4. **Alerts fire** when expectations aren't met or anomalies are detected

## Getting Started

### Set Up Freshness Monitoring

1. Navigate to an asset
2. Click **Freshness** tab
3. Select a timestamp column
4. Set your expected update frequency
5. Configure alert threshold

### Set Up Row Count Monitoring

1. Navigate to an asset
2. Click **Data Quality** tab
3. Scroll to **Row Count Monitoring** section
4. Click **Create Schedule**
5. Configure time window and check interval
6. Choose auto-learn or explicit thresholds

### Set Up Data Quality Metrics

1. Navigate to an asset
2. Click **Metrics** tab
3. Click **Create Metric**
4. Select metric type (null %, distinct count, etc.)
5. Configure capture interval
6. Enable anomaly detection (optional)

### Set Up Schema Monitoring

Schema monitoring is automatic once you:

1. Connect a data source
2. Run discovery
3. Configure alert rules for schema changes

## Best Practices

### Start with Critical Assets

Don't monitor everything at once. Focus on:

- **Revenue-impacting tables**: Orders, payments, transactions
- **Customer-facing data**: Data that powers dashboards and reports
- **Compliance-required data**: Audit logs, regulatory reports

### Set Realistic Expectations

Match SLAs to actual data patterns:

| Data Type | Typical Freshness |
|-----------|-------------------|
| Real-time events | Minutes |
| Hourly ETL | 1-2 hours |
| Daily batches | Same-day |
| Weekly reports | 1 week |

### Layer Your Monitoring

Combine multiple checks for full coverage:

**Critical table (orders):**
- **Freshness**: Alert if >2 hours stale
- **Completeness**: Alert if row count drops >50%
- **Metrics**: Alert if null_percent exceeds 5%
- **Schema**: Alert on any column removed
- **Availability**: Alert if discovery fails
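
The layered checks for `orders` can be read as one evaluation over the latest observations. The sketch below encodes the thresholds listed above. The function and its parameters are hypothetical, and the row count comparison against the previous capture is an assumption about what "drops" is measured against.

```python
from datetime import timedelta


def evaluate_orders(data_age, row_count, previous_row_count, null_percent,
                    removed_columns, discovery_succeeded):
    """Return the names of the layered checks that would alert for `orders`."""
    alerts = []
    if data_age > timedelta(hours=2):
        alerts.append("freshness")
    # A drop of more than 50% means fewer than half the previous rows remain.
    if previous_row_count > 0 and row_count < previous_row_count * 0.5:
        alerts.append("completeness")
    if null_percent > 5.0:
        alerts.append("metrics")
    if removed_columns:
        alerts.append("schema")
    if not discovery_succeeded:
        alerts.append("availability")
    return alerts


alerts = evaluate_orders(
    data_age=timedelta(minutes=45),
    row_count=400,
    previous_row_count=1000,
    null_percent=1.2,
    removed_columns=[],
    discovery_succeeded=True,
)
```

In this example the data is fresh and the columns look healthy, but the 60% row count drop still raises a completeness alert, which is the point of layering.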

## Data Quality Dashboard

View overall data health in the **Assets** section:

| Indicator | Meaning |
|-----------|---------|
| Green | All checks passing |
| Yellow | Warning threshold reached |
| Red | SLA violated or issue detected |
| Gray | Not monitored |

## Common Questions

### Which monitor should I set up first?

Start with [freshness monitoring](/data-quality/freshness-monitoring) on revenue-impacting tables. Freshness catches the most common failure mode (ETL didn't run) with the least configuration, then layer in row counts and column metrics as you learn your patterns.

### What's the difference between freshness, row count, and metrics?

**Freshness** answers "did data arrive on time?" **Row count** answers "did the right amount arrive?" **Metrics** answer "is the column-level data correct?" Critical tables benefit from all three. See [freshness](/data-quality/freshness-monitoring), [row count](/data-quality/row-count-monitoring), and [metrics](/data-quality/metrics).

### Do I have to monitor every table?

No, and you shouldn't try. Focus on revenue-impacting tables, customer-facing dashboard sources, and compliance-required data. Most teams get 80% of the value from monitoring 20% of their assets.

### Can AnomalyArmor detect issues it wasn't explicitly configured for?

Row count monitoring and metrics use anomaly detection against learned baselines, so they can flag unusual values without explicit thresholds. Freshness still needs an SLA, and schema drift detects all changes automatically once discovery runs.

## Related Topics

- **[Freshness Monitoring](/data-quality/freshness-monitoring)**: Set up freshness SLAs
- **[Row Count Monitoring](/data-quality/row-count-monitoring)**: Monitor row counts with ML anomaly detection
- **[Data Quality Metrics](/data-quality/metrics)**: Track column-level statistics and detect anomalies
- **[Alert Rules](/alerts/alert-rules)**: Configure data quality alerts
- **[Report Badges](/data-quality/report-badges)**: Embed quality status in external tools

---

<!-- Source: https://docs.anomalyarmor.ai/data-quality/freshness-monitoring -->

# Freshness Monitoring

> Set SLAs on data freshness and detect stale data before it impacts your business

Freshness monitoring tracks when your data was last updated and alerts you when it becomes stale. Stale data can be just as damaging as wrong data. If your dashboard shows yesterday's numbers, decisions made today could be wrong.

> ℹ️ **Prerequisites**: Before setting up freshness monitoring, you need:
> - A [connected data source](/data-sources/overview) with discovery completed
> - Assets with timestamp columns (`created_at`, `updated_at`, or similar)

*Freshness timeline showing expected vs actual data updates*

## Why Freshness Matters

Data has an expected update cadence. When that cadence breaks, something is wrong.

## How Freshness Monitoring Works

AnomalyArmor tracks freshness by monitoring timestamp columns:

1. **You specify** which column indicates "when data was updated"
2. **Discovery queries** the maximum value of that column
3. **We compare** against your configured SLA
4. **Alert fires** if data is older than allowed

```sql
-- What AnomalyArmor checks
SELECT MAX(created_at) FROM orders;
-- Result: 2024-01-15 07:58:32

-- Your SLA: Updated within 1 hour
-- Current time: 2024-01-15 08:30:00
-- Data age: about 31 minutes
-- Status: OK Within SLA
```
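
The comparison in steps 3 and 4 is simple date arithmetic. As a minimal sketch, using the timestamps from the SQL example and a hypothetical `freshness_status` helper, it could look like this in Python:

```python
from datetime import datetime, timedelta


def freshness_status(last_update: datetime, now: datetime, sla: timedelta):
    """Return (data_age, status), where status is "ok" or "violation"."""
    data_age = now - last_update
    # Data exactly at the SLA is treated as OK here; that boundary is an assumption.
    status = "ok" if data_age <= sla else "violation"
    return data_age, status


age, status = freshness_status(
    last_update=datetime(2024, 1, 15, 7, 58, 32),  # MAX(created_at) from the query
    now=datetime(2024, 1, 15, 8, 30, 0),
    sla=timedelta(hours=1),
)
```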

## Setting Up Freshness Monitoring

## SLA Strategies

### Start Conservative

Set SLAs with buffer room to avoid false positives:

| Actual Cadence | Recommended SLA | Why |
|---------------|-----------------|-----|
| Every hour | 2 hours | Allows for occasional delays |
| Every day | 36 hours | Accounts for timing variations |
| Every 15 min | 30 minutes | Double the expected window |

> 💡 Start lenient and tighten over time. It's easier to make SLAs stricter than to deal with alert fatigue from SLAs that are too tight.

### By Data Criticality

| Criticality | SLA Approach | Example |
|-------------|--------------|---------|
| Critical | Expected + 25% | Revenue: 1hr expected → 75min SLA |
| High | Expected + 50% | Orders: 1hr expected → 90min SLA |
| Medium | Expected × 2 | Analytics: 1hr expected → 2hr SLA |
| Low | Expected × 3 | Reports: daily expected → 3 day SLA |
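
The table reduces to one multiplier per criticality level. A short sketch, assuming you keep the expected cadence as a `timedelta` (the `recommended_sla` helper and the lowercase level names are illustrative):

```python
from datetime import timedelta

# Multipliers from the table above: +25%, +50%, x2, x3.
SLA_MULTIPLIER = {"critical": 1.25, "high": 1.5, "medium": 2.0, "low": 3.0}


def recommended_sla(expected: timedelta, criticality: str) -> timedelta:
    """Scale the expected update cadence by the criticality multiplier."""
    return expected * SLA_MULTIPLIER[criticality]


revenue_sla = recommended_sla(timedelta(hours=1), "critical")  # 75 minutes
reports_sla = recommended_sla(timedelta(days=1), "low")        # 3 days
```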

### By Update Pattern

#### Streaming/Real-time

For data expected within minutes:

| Setting | Value |
|---------|-------|
| **Expected** | Every 5 minutes |
| **Warning** | 10 minutes |
| **Violation** | 15 minutes |
| **Alert** | Slack + PagerDuty |

#### Batch/Hourly

For hourly ETL jobs:

| Setting | Value |
|---------|-------|
| **Expected** | Every hour |
| **Warning** | 80 minutes |
| **Violation** | 2 hours |
| **Alert** | Slack |

#### Daily

For daily batch loads:

| Setting | Value |
|---------|-------|
| **Expected** | Daily by 6am |
| **Warning** | 8am (2hr buffer) |
| **Violation** | 12pm (6hr buffer) |
| **Alert** | Email |

## Freshness Alerts

### Setting Up Alert Rules

Create rules to notify you of freshness violations:

| Field | Value |
|-------|-------|
| **Rule** | Critical Table Freshness |
| **Event** | Freshness Violation |
| **Assets** | orders, payments, users |
| **Destinations** | Slack, PagerDuty |

### Alert Content

Freshness alerts include:

```
[!] Freshness Violation

Asset: production.public.orders
Column: created_at

Expected: Updated every 1 hour
Last update: 3 hours 15 minutes ago
Data age: 3h 15m (SLA: 1h)

Detected: January 15, 2024 at 8:00 AM UTC

[View Asset] [View Dashboard]
```

## Handling Special Cases

### Weekends and Holidays

Some data doesn't update on weekends:

**Options:**
1. **Longer weekend SLA**: Set different thresholds for Saturday/Sunday
2. **Pause monitoring**: Temporarily disable freshness checks
3. **Adjust expectations**: Set SLA to "72 hours" to cover full weekends

### Maintenance Windows

During planned maintenance:

1. **Disable rules**: Toggle OFF freshness alert rules for affected assets
2. **Document**: Note expected staleness
3. **Re-enable**: Toggle rules back ON after maintenance

### Infrequently Updated Tables

Some tables legitimately update rarely:

| Table Type | Update Pattern | SLA Approach |
|-----------|----------------|--------------|
| Reference/lookup | Monthly | 45-day SLA or no monitoring |
| Historical archive | Never | Don't monitor freshness |
| Dimension tables | Daily/weekly | Match actual pattern |

## Viewing Freshness Status

### Asset List View

In the **Assets** list, freshness status appears as indicators:

| Indicator | Meaning |
|-----------|---------|
| Green | Within SLA |
| Yellow | Warning threshold reached |
| Red | SLA violated |
| Gray | Freshness not configured |

### Asset Detail View

Click an asset to see:

- Current freshness status
- Last update timestamp
- Freshness history over time
- SLA configuration

The detail view shows a status card with the current freshness state, last update timestamp, data age, and a progress bar indicating how much of your SLA window has been consumed. Below that, a 7-day history shows freshness status over time.

**Example freshness status:**

| Field | Value |
|-------|-------|
| Status | ✓ Within SLA |
| Timestamp Column | `created_at` |
| Last Update | January 15, 2024 at 2:15 PM UTC |
| Data Age | 45 minutes |
| SLA Threshold | 2 hours |
| Progress | 37% of SLA (45 min / 2 hr) |

**Freshness History:** All checks in the last 7 days were within SLA.

## Troubleshooting

### False positive alerts

**Problem**: Freshness alerts fire when data is actually fine.

**Solutions**:
1. Verify timestamp column choice. Is it the right one?
2. Loosen SLA threshold
3. Check timezone handling (UTC vs. local)
4. Review discovery schedule vs. SLA timing

### Missed violations

**Problem**: Data was stale but no alert fired.

**Solutions**:
1. Verify freshness monitoring is enabled for the asset
2. Check alert rule configuration
3. Confirm discovery is running frequently enough
4. Verify timestamp column has recent values

### Wrong timestamp column

**Problem**: Freshness showing incorrect values.

**Solutions**:
1. Review column choice
2. For tables with updates, use `updated_at` not `created_at`
3. For ETL tables, use the load timestamp column
4. Ensure column is always populated (no NULLs)

### Timezone issues

**Problem**: Freshness calculations seem off by hours.

**Solutions**:
1. Check timestamp column timezone
2. AnomalyArmor normalizes to UTC
3. Ensure consistent timezone handling in your ETL
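
The following sketch shows how a local-time timestamp produces a skew of several hours when it is read as UTC. The UTC-5 offset and the sample times are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

LOCAL = timezone(timedelta(hours=-5))  # assumed ETL server offset (UTC-5)

# The ETL wrote a local wall-clock time with no offset attached.
stored = datetime(2024, 1, 15, 3, 0, 0)
now_utc = datetime(2024, 1, 15, 8, 30, 0, tzinfo=timezone.utc)

# Wrong: treating the local value as if it were already UTC.
age_if_assumed_utc = now_utc - stored.replace(tzinfo=timezone.utc)

# Right: attach the real offset, then compare in UTC.
age_actual = now_utc - stored.replace(tzinfo=LOCAL).astimezone(timezone.utc)

skew = age_if_assumed_utc - age_actual
```

The data is really 30 minutes old, but reading the stored value as UTC makes it look 5 hours 30 minutes old, enough to trip most SLAs.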

## Best Practices

### 1. Start with Critical Tables

Focus monitoring on:
- Revenue-impacting tables
- Customer-facing dashboard sources
- Compliance-required data

### 2. Align with Business Needs

Ask: "When would stale data cause a problem?"

| Scenario | Acceptable Staleness | SLA |
|----------|---------------------|-----|
| Real-time dashboard | Minutes | 15 min |
| Daily executive report | Hours | Same-day |
| Monthly compliance | Days | 1 week |

### 3. Coordinate with ETL Schedules

| Phase | Time |
|-------|------|
| **ETL Schedule** | 2:00 AM daily |
| **Processing time** | ~30 minutes |
| **Data available** | ~2:30 AM |
| **SLA** | 4:00 AM (90 min buffer) |

### 4. Use Warning Thresholds

Configure two levels:

1. **Warning**: "Heads up, getting stale" → Slack
2. **Violation**: "SLA breached" → PagerDuty
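
The two levels can be expressed as a small classifier. This is a sketch with hypothetical names; whether a data age exactly at a threshold counts as crossing it is an assumption.

```python
from datetime import timedelta

ROUTES = {"warning": "Slack", "violation": "PagerDuty"}  # routing from the list above


def classify(data_age: timedelta, warning: timedelta, violation: timedelta) -> str:
    """Return "ok", "warning", or "violation" for the current data age."""
    if data_age > violation:
        return "violation"
    if data_age > warning:
        return "warning"
    return "ok"


# Hourly batch thresholds from earlier on this page: warn at 80 minutes, violate at 2 hours.
warning, violation = timedelta(minutes=80), timedelta(hours=2)
level = classify(timedelta(minutes=95), warning, violation)
destination = ROUTES.get(level)  # None when the level is "ok"
```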

## Common Questions

### Which timestamp column should I pick?

Pick the column that updates when new data arrives. Use `created_at` for append-only tables like events and logs, `updated_at` for tables that get updated in place, and the load timestamp for ETL destination tables. Avoid columns that can be NULL.

### What's a sensible starting SLA?

Start lenient, around double the expected cadence. If data is expected hourly, set a 2-hour SLA. It's easier to tighten thresholds than to fight alert fatigue from SLAs set too tight on day one.

### How do I handle tables that don't update on weekends?

Either extend the SLA to cover the weekend (for example, 72 hours), set different thresholds for weekend days, or pause the freshness rule during known quiet periods. Choose based on how critical the data is.

### Why does the freshness calculation look off by several hours?

Almost always a timezone issue. AnomalyArmor normalizes to UTC, so a timestamp column stored in local time will appear skewed. Confirm your ETL writes timestamps consistently and check the column's declared timezone.

### What's the difference between a warning and a violation?

A **warning** fires before the SLA is breached (often at 80% of the threshold) as an early heads-up. A **violation** fires once the data is officially past SLA. Route them separately: warnings to Slack, violations to PagerDuty.

### Do I need freshness monitoring on every table?

No. Focus on tables that power dashboards, revenue reporting, customer-facing products, or compliance processes. Reference tables and historical archives usually don't need freshness checks.

## What's Next

- **[Set Up Metrics](/data-quality/metrics)**: Track row counts and null percentages, and detect anomalies
- **[Configure Alerts](/alerts/alert-rules)**: Get notified when freshness SLAs are violated
- **[Report Badges](/data-quality/report-badges)**: Embed freshness status in dashboards
- **[Best Practices](/alerts/best-practices)**: Reduce alert fatigue

---

<!-- Source: https://docs.anomalyarmor.ai/data-quality/metrics -->

# Data Quality Metrics

> Monitor null percentages, distinct counts, and other column-level statistics to detect data quality issues

Data quality metrics let you track statistical properties of your columns over time. AnomalyArmor captures metric values on a schedule, builds historical baselines, and automatically detects when values fall outside expected ranges.

> ℹ️ **Looking for row count monitoring?** Use [Row Count Monitoring](/data-quality/row-count-monitoring) for tracking row counts with ML-based anomaly detection or explicit thresholds.

> ℹ️ **Prerequisites**: Before creating metrics, you need:
> - A [connected data source](/data-sources/overview) with discovery completed
> - At least one asset (table/view) to monitor

**Example scenario:** The `customer_email` column normally has ~3% null values. On Jan 30, null percentage jumped to 12.3%, well outside the expected range band. AnomalyArmor flags this as an anomaly, indicating a potential data quality issue in the source system.

## Why Use Metrics

Freshness tells you *when* data was updated. Completeness tells you *how much* arrived. Metrics tell you *what changed* at the column level:

| Issue | Freshness | Completeness | Metrics |
|-------|-----------|--------------|---------|
| ETL job failed completely | Detects it | Detects it | Detects it |
| ETL ran but loaded 0 rows | Might miss it | **Catches it** | N/A |
| Data loaded but 50% nulls | Misses it | Misses it | **Catches it** |
| Unexpected duplicates | Misses it | Misses it | **Catches it** |
| Values outside valid range | Misses it | Misses it | **Catches it** |

**Use freshness for "did data arrive on time?"**
**Use row count monitoring for "did the right amount of data arrive?"**
**Use metrics for "is the column-level data quality correct?"**

## Metric Types

All metrics require a specific column to monitor:

| Type | Description | Best For |
|------|-------------|----------|
| `null_percent` | Percentage of null values | Detecting missing data |
| `distinct_count` | Count of unique values | Cardinality monitoring |
| `duplicate_count` | Count of repeated values | Data quality checks |
| `min_value` | Minimum numeric value | Range validation |
| `max_value` | Maximum numeric value | Outlier detection |
| `mean` | Average numeric value | Central tendency |
| `percentile` | Nth percentile value | Distribution analysis |
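
To make the metric types concrete, here is a plain-Python sketch of what each one measures over a column's values. In practice these numbers come from aggregate queries run in your database. The `column_metrics` helper is hypothetical, and two definitions in it are assumptions rather than documented behavior: `duplicate_count` is taken as the number of rows beyond the first occurrence of each value, and `percentile` uses the nearest-rank method.

```python
import math
from collections import Counter
from statistics import mean


def column_metrics(values, pct=95):
    """Compute illustrative column statistics; None stands in for SQL NULL."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    ordered = sorted(present)
    # Nearest-rank percentile (assumed method): smallest value with at
    # least pct% of the non-null values at or below it.
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return {
        "null_percent": 100.0 * (len(values) - len(present)) / len(values),
        "distinct_count": len(counts),
        # Assumed definition: rows beyond the first occurrence of a value.
        "duplicate_count": sum(n - 1 for n in counts.values()),
        "min_value": ordered[0],
        "max_value": ordered[-1],
        "mean": mean(present),
        "percentile": ordered[rank - 1],
    }


amounts = [10, 20, 20, None, 40, 50, 50, 50, None, 100]
for name, value in column_metrics(amounts).items():
    print(f"{name}: {value}")
```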

## Creating a Metric

## Viewing Metric History

Each metric tracks historical values and displays them as a trend chart:

- **Value line**: Actual metric values over time
- **Anomaly band**: Expected range (mean +/- sensitivity * stddev)
- **Anomaly points**: Values outside the band are flagged
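
The band formula above can be sketched in a few lines of Python. This is a minimal illustration, not the product's code: the function names are hypothetical, the history values are made up to match the ~3% `customer_email` scenario, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def anomaly_band(history, sensitivity=2.0):
    """Return (lower, upper) as mean +/- sensitivity * stddev of past values."""
    center = mean(history)
    spread = stdev(history)  # sample standard deviation (an assumption)
    return center - sensitivity * spread, center + sensitivity * spread


def is_anomaly(value, history, sensitivity=2.0):
    """A value is anomalous when it falls outside the band."""
    lower, upper = anomaly_band(history, sensitivity)
    return value < lower or value > upper


# Hypothetical daily null_percent readings for customer_email (~3% baseline).
history = [2.8, 3.1, 3.0, 2.9, 3.2, 3.0, 2.7, 3.3]
lower, upper = anomaly_band(history)
print(f"expected band: {lower:.2f} to {upper:.2f}")
print(is_anomaly(12.3, history))  # True: far above the band
print(is_anomaly(3.1, history))   # False: inside the band
```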

### Reading the Chart

| Indicator | Meaning |
|-----------|---------|
| Green line within band | Normal values |
| Red dot outside band | Anomaly detected |
| Gray dashed lines | Upper/lower bounds |

## Which Metric Type Should I Use?

### Is my table growing or shrinking unexpectedly?

Use [Row Count Monitoring](/data-quality/row-count-monitoring). It provides ML-based pattern learning, time-windowed counting, and explicit threshold support.

### Are there unexpected null values?

Use **null_percent** on the column that shouldn't have nulls.

Example: Monitor `customer_email` for null percentage. Alert if nulls exceed historical baseline (e.g., jumps from 2% to 15%).

### Are values within expected range?

Use **min_value** and **max_value** on numeric columns.

Example: Monitor `price` column. Alert if minimum drops below 0 (invalid) or maximum exceeds historical norms.

### Is data being duplicated?

Use **duplicate_count** on columns that should be unique.

Example: Monitor `order_id` for duplicates. Any duplicates indicate a data quality issue.

### How many unique values exist?

Use **distinct_count** on categorical columns.

Example: Monitor `country_code` distinct count. A sudden increase might indicate invalid data.

## Best Practices

### Start with High-Impact Metrics

Focus on metrics that catch real problems:

**Critical table (orders):**

- **Completeness**: Catch data loss or duplication (see [Row Count Monitoring](/data-quality/row-count-monitoring))
- **null_percent** on `order_id`: Should never be null
- **null_percent** on `customer_id`: Should never be null
- **min_value** on `total_amount`: Should never be negative

### Match Capture Interval to Data Freshness

| Data Update Pattern | Recommended Interval |
|---------------------|---------------------|
| Real-time streaming | Hourly |
| Hourly batch jobs | Hourly |
| Daily batch jobs | Daily |
| Weekly aggregates | Weekly |

### Use Meaningful Sensitivity Values

| Scenario | Sensitivity | Rationale |
|----------|-------------|-----------|
| New table, learning patterns | 3.0 | Reduce noise while learning |
| Established table, stable patterns | 2.0 | Balanced detection |
| Critical data, low tolerance | 1.5 | More sensitive alerting |

## Troubleshooting

### Metric shows 'No data'

**Causes:**
- Metric was just created and hasn't captured yet
- Capture job failed
- Table is empty

**Solutions:**
1. Wait for the next scheduled capture (check interval)
2. Trigger a manual capture: **Actions > Capture Now**
3. Check the table has data

### Too many false positive anomalies

**Causes:**
- Sensitivity is too low (too sensitive)
- Normal data patterns are highly variable
- Seasonality not accounted for

**Solutions:**
1. Increase sensitivity (e.g., 2.0 to 3.0)
2. Allow more baseline data to accumulate (30+ days)
3. Consider if the variation is actually expected

### Missing real anomalies

**Causes:**
- Sensitivity is too high (not sensitive enough)
- Baseline includes anomalous data
- Capture interval too infrequent

**Solutions:**
1. Decrease sensitivity (e.g., 3.0 to 2.0)
2. Reset baseline after fixing data issues
3. Increase capture frequency

### Metric capture failing

**Causes:**
- Database connection issues
- Column was renamed or removed
- Permission changes

**Solutions:**
1. Check data source connection status
2. Verify column still exists
3. Check database user permissions

## Common Questions

### When should I use metrics versus row count monitoring?

Use **metrics** for column-level checks like null rates, distinct counts, and numeric ranges. Use [Row Count Monitoring](/data-quality/row-count-monitoring) for table-level volume tracking; it has ML-based pattern learning and time-windowed counting that metrics don't.

### What sensitivity should I start with for anomaly detection?

Start at **2.0** (balanced, alerts on 2 standard deviations). Drop to 1.5 for critical data where you want tight detection, or raise to 3.0 if you're seeing too many false positives from noisy patterns.

### How long before anomaly detection is useful?

Anomaly detection needs a baseline. Expect rougher results for the first week or two while history accumulates. For stable patterns, 30+ days of baseline data gives the tightest, most trustworthy bands.

### Does AnomalyArmor read my column values?

It runs aggregate queries (like `COUNT`, `MIN`, `MAX`, `AVG`) against your database to compute the metric. Only the numeric result is stored; individual row values aren't transmitted or retained.

### Can I monitor a metric on a custom SQL expression?

The built-in metric types run against a specific column. For arbitrary SQL, use [Custom SQL Monitoring](/data-quality/custom-sql-monitoring) instead, which lets you write any `SELECT` that returns a numeric value.

## What's Next

- **[Set Up Metric Alerts](/alerts/alert-rules)**: Get notified when metrics detect anomalies
- **[Metrics API](/api/metrics)**: Automate metric management with the API
- **[Report Badges](/data-quality/report-badges)**: Embed metric status in dashboards
- **[Alert Rules](/alerts/alert-rules)**: Configure where alerts are sent

---

<!-- Source: https://docs.anomalyarmor.ai/data-quality/row-count-monitoring -->

# Row Count Monitoring

> Monitor row counts with ML-based anomaly detection or explicit thresholds

Row Count Monitoring tracks row counts in your tables over time. It detects when data volumes drop unexpectedly (data loss) or spike unusually (duplicate loads), helping you catch ETL issues before they impact downstream consumers.

> ℹ️ **Why Row Count?** Row count monitoring used to be part of Data Quality Metrics. We moved it to its own feature with enhanced capabilities: ML-based pattern learning, time-windowed counting, and explicit threshold support.

**Example scenario:** The orders table typically receives 45,000-55,000 rows daily. On Jan 30, only 15,234 rows were loaded, a 70% drop flagged as an anomaly, indicating a potential ETL failure.

## Configuration Reference

### Monitoring Mode {#monitoring-mode}

Row Count Monitoring offers two approaches to fit different needs:

#### Auto-Learn Mode (Recommended)

Let AnomalyArmor learn your table's normal row count patterns:

| Aspect | How It Works |
|--------|--------------|
| **Learning period** | Collects data for 7+ days to establish baseline |
| **Pattern detection** | Identifies daily, weekly, and seasonal trends |
| **Anomaly detection** | Uses statistical analysis (mean +/- stddev * sensitivity) |
| **Best for** | Tables with consistent, predictable patterns |

```
Auto-learn example (orders table):

Day 1-7:    Learning... collecting baseline data
Day 8+:     Baseline established (avg: 48,000, stddev: 3,200)
            Alerts if row count deviates significantly
```

#### Explicit Mode

Set specific row count thresholds when you know exactly what to expect:

| Setting | Description |
|---------|-------------|
| **Min rows** | Alert if row count falls below this value |
| **Max rows** | Alert if row count exceeds this value |
| **Best for** | Tables with known, fixed expectations |

```
Explicit example (daily_summary table):

Expected: Exactly 1 row per day
Min: 1, Max: 1
Alert if row count != 1
```

### Sensitivity {#sensitivity}

For auto-learn mode, sensitivity controls how strict the anomaly detection is. It's the multiplier applied to the standard deviation when calculating expected ranges.

| Sensitivity | Behavior | Use When |
|-------------|----------|----------|
| **1 (Tight)** | Very strict, catches small deviations | Critical data, low tolerance for anomalies |
| **2 (Balanced)** | Default, catches moderate deviations | Most tables, standard monitoring |
| **3 (Relaxed)** | Less strict, allows more variation | High natural variability, noisy data |
| **4 (Loose)** | Very permissive, only catches large deviations | Highly variable patterns, initial setup |

**Default**: 2 (balanced detection)

**Formula**: Expected range = mean ± (stddev × sensitivity)

```
Example with sensitivity = 2:
Mean: 48,000 rows
StdDev: 3,000 rows
Expected range: 42,000 - 54,000 rows
(48k - 6k to 48k + 6k)

If actual count = 35,000 → ANOMALY (outside range)
If actual count = 51,000 → HEALTHY (within range)
```
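
To see how the multiplier changes the outcome, the sketch below applies the formula at each sensitivity level using the same mean and standard deviation. The helper name `expected_range` and the extra test counts are illustrative, and treating a count exactly on a boundary as healthy is an assumption.

```python
def expected_range(mean, stddev, sensitivity):
    """Expected range = mean +/- (stddev * sensitivity)."""
    margin = stddev * sensitivity
    return mean - margin, mean + margin


MEAN, STDDEV = 48_000, 3_000  # figures from the example above

for sensitivity in (1, 2, 3, 4):
    low, high = expected_range(MEAN, STDDEV, sensitivity)
    # Assumption: counts exactly on a boundary are treated as healthy.
    flagged = [n for n in (35_000, 40_000, 44_000, 51_000) if not low <= n <= high]
    print(f"sensitivity {sensitivity}: {low:,} - {high:,}, anomalies: {flagged}")
```

With these figures, a count of 35,000 sits outside the range at every sensitivity, while 44,000 is only flagged at sensitivity 1.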

> 💡 Start with sensitivity 2-3 for new monitors, then tighten to 1-2 once patterns are stable.

### Timestamp Column {#timestamp-column}

Optional column used to filter rows within the time window. Without a timestamp column, all rows in the table are counted regardless of when they were created.

**When to use:**
- Append-only tables that grow over time
- Event streams or log tables
- Tables where you care about recent data arrival

**When to skip:**
- Tables that are fully replaced on each load
- Snapshot tables with fixed row counts
- Dimension tables with slow-changing data

```
With timestamp column (orders.created_at):
  Query: SELECT COUNT(*) FROM orders
         WHERE created_at >= NOW() - INTERVAL '24 hours'
  Result: 48,000 (recent rows only)

Without timestamp column:
  Query: SELECT COUNT(*) FROM orders
  Result: 5,000,000 (all rows ever)
```

**Column requirements:**
- Must be a timestamp or datetime type
- Should represent when the row was created/ingested
- Should be indexed for performance
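
The difference between the two counting modes can be reproduced locally. The sketch below uses an in-memory SQLite table as a stand-in for a warehouse. The `count_rows` helper and its parameters are hypothetical, and the query AnomalyArmor actually issues may differ.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def count_rows(conn, table, timestamp_column=None, window_hours=24, now=None):
    """Count all rows, or only rows inside the time window if a timestamp column is given."""
    # Identifiers are interpolated directly, so they must come from trusted
    # configuration, never from user input.
    if timestamp_column is None:
        return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(hours=window_hours)).isoformat()
    query = f"SELECT COUNT(*) FROM {table} WHERE {timestamp_column} >= ?"
    return conn.execute(query, (cutoff,)).fetchone()[0]


# Demo data: one row per hour (on the half hour) for the last 72 hours.
now = datetime(2025, 1, 30, 8, 0, tzinfo=timezone.utc)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, created_at TEXT)")
conn.executemany(
    "INSERT INTO orders (created_at) VALUES (?)",
    [((now - timedelta(hours=h, minutes=30)).isoformat(),) for h in range(72)],
)

print(count_rows(conn, "orders"))                         # 72 (all rows)
print(count_rows(conn, "orders", "created_at", now=now))  # 24 (last 24 hours)
```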

### Time Window {#time-window}

How far back to count rows when a timestamp column is specified. Choose based on your data load frequency:

| Window | Duration | Best For |
|--------|----------|----------|
| **1 hour** | Last 60 minutes | Real-time streaming, high-frequency events |
| **6 hours** | Last 6 hours | Hourly batch jobs, frequent updates |
| **12 hours** | Last 12 hours | Twice-daily pipelines |
| **24 hours** | Last day | Daily batch ETL (most common) |
| **168 hours** | Last 7 days | Weekly aggregates, slow-changing data |

```
Example: Daily batch job loads orders every night at 2 AM

Time window: 24 hours
Check interval: 6 hours (runs at 2 AM, 8 AM, 2 PM, 8 PM)

Check at 8 AM:
  Counts rows WHERE created_at >= 8 AM yesterday
  Includes last night's batch + today's streaming data
```

> ⚠️ Without a timestamp column, the time window setting is ignored and all rows are counted.

### Check Interval {#check-interval}

How often to run the row count check and evaluate for anomalies:

| Interval | Frequency | Best For |
|----------|-----------|----------|
| **1 hour** | Every hour | Real-time monitoring, critical tables |
| **6 hours** | 4x per day | Standard monitoring, daily tables |
| **12 hours** | 2x per day | Less critical tables, longer time windows |
| **24 hours** | Once per day | Weekly tables, slow-changing data |

**Cost considerations:** More frequent checks consume more compute resources. Choose the interval that matches your SLA requirements.

```
Example check intervals for different scenarios:

Scenario: Real-time event stream
  Time window: 1 hour
  Check interval: 1 hour
  Result: Hourly checks on last hour of data

Scenario: Daily batch job
  Time window: 24 hours
  Check interval: 6 hours
  Result: 4 checks per day on last 24h of data

Scenario: Weekly report table
  Time window: 168 hours
  Check interval: 24 hours
  Result: Daily checks on last week of data
```

> 💡 Check interval should be ≤ time window for meaningful monitoring. A 24-hour check interval with a 1-hour time window would miss most anomalies.
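
A quick way to reason about this rule is to estimate how much of the timeline the checks ever look at. The sketch below assumes evenly spaced checks; `window_coverage` is a hypothetical helper, not a product setting.

```python
def window_coverage(time_window_hours, check_interval_hours):
    """Fraction of elapsed time that falls inside at least one check's window."""
    return min(1.0, time_window_hours / check_interval_hours)


for window, interval in [(24, 6), (1, 1), (1, 24)]:
    share = window_coverage(window, interval)
    verdict = "ok" if interval <= window else "gaps between checks"
    print(f"window={window}h interval={interval}h coverage={share:.0%} ({verdict})")
```

With a 1-hour window and a 24-hour interval, each daily check only sees one hour of data, so an anomaly in any of the other 23 hours goes unnoticed.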

## Time-Windowed Counting

For tables that accumulate data over time, use a timestamp column to count rows within a specific window:

| Window | Use Case |
|--------|----------|
| **1 hour** | Real-time event streams |
| **6 hours** | Frequent batch loads |
| **12 hours** | Twice-daily pipelines |
| **24 hours** | Daily batch ETL (most common) |
| **168 hours** | Weekly aggregates |

```
Time-windowed counting (orders table with created_at):

Without time window:  COUNT(*) = 5,000,000 (all time)
With 24h window:      COUNT(*) WHERE created_at >= now() - 24h = 48,000
```

> 💡 Use time-windowed counting for append-only tables. Without it, row counts only grow, making anomaly detection less useful.

## Setting Up Row Count Monitoring

## Understanding Results

### Status Indicators

| Status | Meaning | Action |
|--------|---------|--------|
| **Healthy** | Row count within expected range | None needed |
| **Anomaly** | Row count outside expected range | Investigate the cause |
| **Learning** | Collecting baseline data | Wait for learning to complete |
| **No Data** | No checks have run yet | Check will run on next interval |

### Anomaly Types

| Anomaly | Possible Causes |
|---------|-----------------|
| **Row count too low** | ETL failure, data loss, filter bug, source issue |
| **Row count too high** | Duplicate load, removed filter, upstream spike |
| **Row count zero** | Complete ETL failure, wrong table, permissions |

## Best Practices

### Choose the Right Mode

| Scenario | Recommended Mode |
|----------|------------------|
| Data patterns vary naturally | Auto-learn with sensitivity 2-3 |
| Exact expectations known | Explicit with min/max thresholds |
| New table, unknown patterns | Auto-learn with sensitivity 3-4 |
| Critical data, low tolerance | Auto-learn with sensitivity 1-2 |

### Set Appropriate Windows

| Data Pattern | Recommended Window |
|--------------|-------------------|
| Real-time streaming | 1 hour |
| Hourly batch jobs | 6 hours |
| Daily batch jobs | 24 hours |
| Weekly aggregates | 168 hours |

### Start Conservative, Then Tighten

1. **Week 1**: Use auto-learn with sensitivity 3 (less sensitive)
2. **Week 2-4**: Review any anomalies, adjust if too noisy
3. **Month 2+**: Tighten to sensitivity 2 once patterns are stable

## Row Count vs. Metrics

| Feature | Row Count | Data Quality Metrics |
|---------|--------------|---------------------|
| **Purpose** | Monitor row counts | Monitor column statistics |
| **Scope** | Table-level | Column-level |
| **ML-based** | Yes (auto-learn) | Yes (anomaly detection) |
| **Time windows** | Yes | No |
| **Explicit thresholds** | Yes | Via checks |

**Use Row Count Monitoring for**: "Did the right amount of data arrive?"
**Use Metrics for**: "Is the data quality correct?" (nulls, duplicates, ranges)

## Troubleshooting

### Status shows 'Learning' for too long

**Causes:**
- Not enough data points collected yet
- Check interval is very long (weekly)

**Solutions:**
1. Wait for at least 7 data points (7 days for daily checks)
2. Consider switching to explicit mode if you know expected values

### Too many false positive anomalies

**Causes:**
- Sensitivity is too low (too sensitive)
- Natural data variation is high
- Seasonality not yet learned

**Solutions:**
1. Increase sensitivity (e.g., 2 to 3)
2. Allow more baseline data (30+ days)
3. Switch to explicit mode with wider thresholds

### Missing real anomalies

**Causes:**
- Sensitivity is too high (not sensitive enough)
- Baseline includes anomalous data

**Solutions:**
1. Decrease sensitivity (e.g., 3 to 2)
2. Switch to explicit mode with tighter thresholds

### Row count always zero with time window

**Causes:**
- Timestamp column has no recent data
- Wrong timestamp column selected
- Time window too narrow

**Solutions:**
1. Verify timestamp column has data in the window
2. Check column data type (should be timestamp/datetime)
3. Widen the time window

## Common Questions

### Auto-learn or explicit mode, which should I pick?

Use **auto-learn** when row counts fluctuate naturally: AnomalyArmor builds a statistical baseline and flags deviations. Use **explicit** when you know the exact min and max (for example, a daily summary that should always have exactly one row).

### Why configure a timestamp column?

Without one, `COUNT(*)` returns all rows ever, so counts only grow and anomaly detection loses meaning. A timestamp column lets AnomalyArmor count only rows inside a time window like the last 24 hours, which is what you actually want to monitor for append-only tables.

### How long does auto-learn mode need before it starts alerting?

At least 7 data points. For a daily check interval that's 7 days; for hourly it's 7 hours. Until enough baseline accumulates, the status shows **Learning** and no anomalies fire.
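
The arithmetic behind that answer is one data point per check, so the shortest learning period is seven check intervals. A small sketch, with `learning_duration` as a hypothetical helper:

```python
from datetime import timedelta

MIN_DATA_POINTS = 7  # minimum baseline size stated above


def learning_duration(check_interval_hours, min_points=MIN_DATA_POINTS):
    """Shortest time before auto-learn has enough points, one point per check."""
    return timedelta(hours=check_interval_hours * min_points)


for interval in (1, 6, 12, 24):
    print(f"{interval}h interval -> {learning_duration(interval)}")
```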

### What does a sensitivity of 2 actually mean?

It's the multiplier on the standard deviation used to define the expected range. Expected range = `mean ± (stddev × sensitivity)`. Lower values (1) are stricter and catch smaller drifts; higher values (3-4) tolerate more variation.

### What's the difference between row count monitoring and data quality metrics?

Row count is table-level ("did the right volume arrive?") with ML pattern learning and time windows. [Metrics](/data-quality/metrics) are column-level ("are the values correct?"), tracking things like null percentages, distinct counts, and numeric ranges.

### What typically causes a row count spike?

The most common causes are a duplicate load, where the same batch ran twice, or a filter that was removed upstream. Sudden drops are usually caused by an ETL failure, a wrong source, or a filter change that excluded valid data. The anomaly page shows the timing so you can correlate with deploys.

## What's Next

- **[Set Up Alerts](/alerts/overview)**: Get notified when row count anomalies are detected
- **[Data Quality Metrics](/data-quality/metrics)**: Monitor column-level statistics like null percentages
- **[Freshness Monitoring](/data-quality/freshness-monitoring)**: Track when data was last updated
- **[Report Badges](/data-quality/report-badges)**: Embed row count status in dashboards

---

<!-- Source: https://docs.anomalyarmor.ai/data-quality/custom-sql-monitoring -->

# Custom SQL Monitoring

> Monitor any business rule by writing SQL queries that return a numeric value, with configurable thresholds and alerts.

Custom SQL monitors let you define arbitrary data quality checks by writing SELECT queries against your connected databases. Each monitor returns a single numeric value that is compared against thresholds you configure.

## When to use Custom SQL monitors

Custom SQL monitors are ideal when built-in monitors (freshness, row count, schema drift, metrics) do not cover your specific business rules:

| Use Case | Example SQL |
|----------|-------------|
| Zero-dollar transactions | `SELECT COUNT(*) FROM orders WHERE total_amount = 0` |
| Source-target reconciliation | `SELECT ABS(a.cnt - b.cnt) FROM (SELECT COUNT(*) cnt FROM source) a, (SELECT COUNT(*) cnt FROM target) b` |
| Null percentage | `SELECT 100.0 * COUNT(*) FILTER (WHERE email IS NULL) / COUNT(*) FROM users` |
| Duplicate detection | `SELECT COUNT(*) FROM (SELECT id, COUNT(*) FROM events GROUP BY id HAVING COUNT(*) > 1) t` |
| SLA compliance | `SELECT AVG(EXTRACT(EPOCH FROM completed_at - created_at)) / 3600 FROM jobs WHERE created_at > NOW() - INTERVAL '1 day'` |

> ℹ️ If your check is about **freshness** (is data arriving on time?), **row counts** (did volume change?), or **column statistics** (mean, null%, distinct count), use the dedicated built-in monitors instead. They provide richer baselines, anomaly detection, and require no SQL.

## Creating a Custom SQL monitor

## SQL requirements

Your query must follow these rules:

1. **Must be a SELECT or WITH (CTE) statement**. INSERT, UPDATE, DELETE, DROP, and other DML/DDL keywords are rejected.
2. **Must return at least one row** with a numeric value in the first column.
3. **Runs with a configurable timeout** (default 60 seconds, maximum 300 seconds). Optimize your query if it times out.
4. **Runs against your database**, not AnomalyArmor's internal database. The query uses the same connection and credentials as asset discovery.
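
If you want to lint queries before pasting them into a monitor, the first two rules can be approximated locally. The sketch below is a simplified pre-check under stated assumptions: the keyword list and both function names are hypothetical, and AnomalyArmor's own validation may be stricter or differ in detail.

```python
import re

# Keywords this sketch treats as write/DDL operations (an assumed list).
FORBIDDEN = {"INSERT", "UPDATE", "DELETE", "DROP", "ALTER", "TRUNCATE", "CREATE", "MERGE", "GRANT"}


def validate_monitor_sql(sql):
    """Return a list of problems; an empty list means the query passes these checks."""
    problems = []
    # Blank out quoted strings so a value like 'delete' is not mistaken for a keyword.
    stripped = re.sub(r"'(?:[^']|'')*'", "''", sql).strip()
    words = re.findall(r"[A-Za-z_]+", stripped.upper())
    if not words or words[0] not in {"SELECT", "WITH"}:
        problems.append("query must start with SELECT or WITH")
    found = sorted(FORBIDDEN.intersection(words))
    if found:
        problems.append("forbidden keywords: " + ", ".join(found))
    return problems


def first_numeric_value(rows):
    """Take the first column of the first row as the monitor value."""
    if not rows:
        raise ValueError("query returned no rows")
    value = rows[0][0]
    # Only int and float are accepted here; driver-specific numeric types are out of scope.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError(f"first column is not numeric: {value!r}")
    return float(value)


print(validate_monitor_sql("SELECT COUNT(*) FROM orders WHERE total_amount = 0"))  # []
print(validate_monitor_sql("DROP TABLE orders"))
print(first_numeric_value([(17, "ignored")]))  # 17.0
```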

## Threshold configuration

Custom SQL monitors use static bounds:

| Configuration | Behavior |
|---------------|----------|
| Upper bound only | Alert when value > upper bound |
| Lower bound only | Alert when value < lower bound |
| Both bounds | Alert when value > upper bound OR value < lower bound |
| Value at boundary | No alert (boundaries are inclusive) |
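
The table above translates into a short comparison. This is an illustrative sketch with a hypothetical function name, showing that a value exactly on a bound does not alert:

```python
from typing import Optional


def breaches_threshold(value: float, lower: Optional[float] = None, upper: Optional[float] = None) -> bool:
    """Return True when the value should alert under the static-bound rules above."""
    if upper is not None and value > upper:
        return True
    if lower is not None and value < lower:
        return True
    return False  # inside the bounds, or exactly on a boundary


print(breaches_threshold(12, upper=10))          # True: above the upper bound
print(breaches_threshold(10, upper=10))          # False: boundary value passes
print(breaches_threshold(3, lower=5, upper=10))  # True: below the lower bound
print(breaches_threshold(7, lower=5, upper=10))  # False
```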

## Viewing results

Each monitor execution stores:
- The numeric **value** returned by the query
- **Pass/Fail** status based on threshold comparison
- **Execution time** in milliseconds
- **Timestamp** of when the check ran

Historical results are available on the monitor detail page, showing value over time with threshold lines overlaid.

## Alerts

When a threshold is breached, an alert flows through the standard AnomalyArmor alert pipeline:

1. Alert rules matching `custom_sql` event type are evaluated
2. Matching rules route to configured destinations (Slack, email, webhook)
3. The alert appears in **Incidents** alongside freshness, schema, and metric alerts

To set up alert routing, create an alert rule that includes the `custom_sql` event type.

## Database-specific tips

### PostgreSQL

```sql
-- Use FILTER for conditional aggregates
SELECT COUNT(*) FILTER (WHERE status = 'failed')
FROM jobs
WHERE created_at > NOW() - INTERVAL '1 hour'
```

### Snowflake

```sql
-- Use IFF for conditional counts
SELECT COUNT(IFF(status = 'failed', 1, NULL))
FROM jobs
WHERE created_at > DATEADD('hour', -1, CURRENT_TIMESTAMP())
```

### BigQuery

```sql
-- Use COUNTIF for conditional counts
SELECT COUNTIF(status = 'failed')
FROM `project.dataset.jobs`
WHERE created_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
```

### Redshift

```sql
-- Use CASE inside COUNT
SELECT COUNT(CASE WHEN status = 'failed' THEN 1 END)
FROM jobs
WHERE created_at > GETDATE() - INTERVAL '1 hour'
```

## Best practices

- **Keep queries fast.** Target under 10 seconds. Use appropriate indexes and time filters.
- **Return a single value.** Multi-row results use only the first row's first column.
- **Use meaningful names.** "Zero-Dollar Orders (Last Hour)" is clearer than "Custom Check 1".
- **Start with wide thresholds** and tighten them as you learn normal ranges.
- **Use the Run Now button** to validate your query before relying on scheduled execution.

## Common Questions

### When should I use a Custom SQL monitor instead of a built-in monitor?

Use Custom SQL when your check is a business rule the built-ins don't cover, like zero-dollar transactions, cross-table reconciliation, or SLA compliance. Stick with [freshness](/data-quality/freshness-monitoring), [row count](/data-quality/row-count-monitoring), or [metrics](/data-quality/metrics) for those standard cases; they provide anomaly detection without any SQL.

### What's allowed in the SQL query?

Only `SELECT` and `WITH` (CTE) statements. INSERT, UPDATE, DELETE, and DDL are rejected. The query must return at least one row with a numeric value in the first column.

### Where does the query run?

It runs against **your connected database**, using the same credentials as asset discovery, not against AnomalyArmor's internal database. Write queries the way you would in your warehouse's SQL editor.

### What happens if my query is slow?

Each run has a default 60-second timeout (300 seconds max). If your query times out, add time filters, use indexes, or reduce the scope. Target under 10 seconds for frequent checks.

### Can I get alerts from Custom SQL monitors?

Yes. Threshold breaches fire alerts through the standard pipeline. Create an alert rule that matches the `custom_sql` event type and route it to Slack, email, webhook, or any configured destination.

### Does the monitor support both upper and lower bounds?

Yes. Configure either or both. Setting both alerts if the value exceeds the upper bound **or** drops below the lower bound. Values at the boundary don't alert, because thresholds are inclusive.

## What's Next

- **[Freshness Monitoring](/data-quality/freshness-monitoring)**: Monitor data arrival timing automatically
- **[Metrics](/data-quality/metrics)**: Track column-level statistics with anomaly detection

---

<!-- Source: https://docs.anomalyarmor.ai/data-quality/report-badges -->

# Report Badges

> Embed data quality status indicators in Databricks, Looker, Notion, and other tools

Report badges let you embed AnomalyArmor data quality status directly into your dashboards, wikis, and operational tools. See pass/fail status at a glance without leaving your workflow.

## Why Use Report Badges

Data teams context-switch constantly. Opening a separate monitoring tool to check data quality breaks flow.

Badges provide "always-on" visibility into data quality wherever you work.

## Where Badges Work

Any tool that supports image embeds can display badges:

| Platform | How to Embed |
|----------|--------------|
| Databricks | Markdown cell: `![](badge-url)` |
| Looker | Custom visualization with HTML |
| Tableau | Web content or image object |
| Notion | Paste URL as embed |
| Confluence | Insert image by URL |
| Slack | Post badge URL directly |
| GitHub | Add to README |

## Badge Status

Badges show one of three states:

| Badge | Meaning |
|-------|---------|
| ![Passing](https://shields.io/badge/AnomalyArmor-Passing-brightgreen) | All monitored tables pass quality checks |
| ![2 Issues](https://shields.io/badge/AnomalyArmor-2_Issues-red) | Tables have schema drift, freshness violations, or metric anomalies |
| ![Unknown](https://shields.io/badge/AnomalyArmor-Unknown-lightgrey) | Badge disabled or configuration error |

## Creating a Badge

## Copying the Badge URL

After creating a badge:

1. Find the badge in the **Report Badges** list
2. Click **Copy URL** next to the badge
3. The URL is copied to your clipboard

Badge URLs follow this format:

```
https://app.anomalyarmor.ai/badges/{public-id}.svg
```

> ℹ️ Badge URLs are public and require no authentication. Anyone with the URL can view the badge status. URLs use randomly generated IDs that are not guessable.

## Embedding Examples

### Databricks Notebook

Add a markdown cell:

```markdown
# Data Quality Status

![Data Quality](https://app.anomalyarmor.ai/badges/abc123.svg)

The badge above shows real-time quality status for our analytics tables.
```

### Notion Page

1. Type `/embed`
2. Paste the badge URL
3. The badge renders inline

### GitHub README

```markdown
## Data Quality

[![Analytics Quality](https://app.anomalyarmor.ai/badges/abc123.svg)](https://app.anomalyarmor.ai)

Our analytics tables are monitored by AnomalyArmor.
```

### Slack

Simply paste the badge URL. Slack will render the SVG inline.

## Managing Badges

### Viewing All Badges

Navigate to **Report Badges** to see all badges:

- Badge status (live preview)
- Scope (asset and tag filters)
- Enabled checks (schema drift, freshness)
- Copy URL action

### Editing a Badge

1. Click the badge row in the list
2. Modify configuration
3. Click **Save**

The badge URL remains unchanged. Embedded badges automatically reflect the updated configuration.

### Deleting a Badge

1. Click the delete icon on the badge row
2. Confirm deletion

> ⚠️ Deleting a badge immediately invalidates the URL. Embedded badges will show "Unknown" status.

### Disabling vs Deleting

If you want to temporarily hide a badge:

1. Edit the badge
2. Toggle **Active** off
3. Save

Disabled badges show "Unknown" status but retain their URL for later reactivation.

## Caching Behavior

Badges cache their status for performance:

| Scenario | Cache Duration |
|----------|----------------|
| Default | 5 minutes |
| Embedded images | Respects HTTP cache headers |

This means:
- Badge status updates within 5 minutes of quality changes
- High traffic won't overload AnomalyArmor
- Multiple embeds of the same badge share cached results

## Best Practices

### 1. One Badge Per Logical Domain

Create badges for logical groupings:

**Good:**
- Analytics Schema Badge
- Core Tables Badge
- Customer-Facing Badge

**Avoid:**
- All Tables Badge (too broad)
- Single Table Badge (too narrow)

### 2. Use Tags for Precision

Combine asset selection with tags:

- **Asset**: production.analytics
- **Tags**: `["critical", "customer-facing"]`
- **Result**: Only critical, customer-facing tables in analytics

### 3. Place Badges Strategically

Embed badges where decisions are made:

| Location | Why |
|----------|-----|
| Dashboard header | Quality context before analysis |
| ETL documentation | Status of pipeline outputs |
| Data catalog | Trust indicator for consumers |
| Team wiki | Always-visible quality status |

### 4. Link to Details

When embedding, make badges clickable:

```markdown
[![Status](badge-url)](https://app.anomalyarmor.ai/assets/your-asset)
```

This lets viewers click through to investigate issues.

## Troubleshooting

### Badge shows Unknown

**Causes**:
- Badge was deleted
- Badge was disabled
- Invalid badge URL

**Solutions**:
1. Verify the badge exists in Report Badges
2. Check the badge is marked Active
3. Ensure URL is correct and complete

### Badge not updating

**Causes**:
- Browser caching old image
- Within 5-minute cache window

**Solutions**:
1. Hard refresh the page (Cmd+Shift+R / Ctrl+Shift+R)
2. Wait 5 minutes for cache to expire
3. Add cache-busting parameter: `badge-url?t=timestamp`
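
For the cache-busting option, a small helper can append the `t` parameter without disturbing the rest of the URL. This is a sketch using Python's standard library. `cache_busted` is a hypothetical name and `abc123` is the placeholder badge ID from the embedding examples.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cache_busted(badge_url, timestamp=None):
    """Return badge_url with a `t` query parameter set to a Unix timestamp."""
    parts = urlsplit(badge_url)
    # Drop any earlier `t` value so repeated calls do not stack parameters.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "t"]
    query.append(("t", str(int(timestamp if timestamp is not None else time.time()))))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(cache_busted("https://app.anomalyarmor.ai/badges/abc123.svg", timestamp=1700000000))
# https://app.anomalyarmor.ai/badges/abc123.svg?t=1700000000
```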

### Badge shows Issues unexpectedly

**Causes:**
- Schema drift detected on monitored tables
- Freshness SLA violated
- Metric anomaly detected (row count, null percentage, etc.)
- Upstream tables have issues (if enabled)

**Solutions:**
1. Click badge or navigate to AnomalyArmor
2. Review asset details for specific issues
3. Resolve underlying quality problems

### Badge not rendering in tool

**Causes**:
- Tool doesn't support SVG images
- Tool blocks external image URLs
- Firewall blocking AnomalyArmor domain

**Solutions**:
1. Check tool's image/embed documentation
2. Try alternative embed method (HTML vs Markdown)
3. Verify network access to app.anomalyarmor.ai

## Security Considerations

- **Public URLs**: Badge URLs are accessible without authentication
- **Non-guessable IDs**: URLs use random UUIDs that cannot be enumerated
- **Read-only**: Badges only display status; they cannot modify data
- **No sensitive data**: Badges show pass/fail only, not table contents

> ℹ️ If you need to revoke access to a badge, delete it. A new badge for the same scope will have a different URL.

## Common Questions

### Where can I embed an AnomalyArmor badge?

Any tool that accepts an image URL. That includes Databricks notebooks, Looker, Tableau, Notion, Confluence, Slack, and GitHub READMEs. Paste the badge URL like you would any SVG or Markdown image.

### Are badge URLs public?

Yes. Badge URLs require no authentication so they render anywhere. They use randomly generated, non-guessable IDs and only show pass/fail status, never table contents or row data. Delete and recreate a badge to rotate the URL.

### How fast does badge status update when something breaks?

Within **5 minutes**. Badge images are cached for up to 5 minutes so that high-traffic embeds don't overload AnomalyArmor. If you need an immediate refresh, add a cache-busting query parameter; otherwise the badge updates once the cache window expires.

### Why is my badge showing 'Unknown'?

The badge was deleted or disabled, or the URL is malformed. Check **Report Badges**, confirm the badge is marked Active, and verify the embedded URL matches exactly.

### Can one badge cover multiple tables?

Yes. Scope a badge to a schema and it will reflect the status of every table inside. Use tag filters (for example, `critical` or `customer-facing`) to narrow it further without creating one badge per table.

## Related Topics

- **[Freshness Monitoring](/data-quality/freshness-monitoring)**: Set SLAs on data freshness
- **[Data Quality Metrics](/data-quality/metrics)**: Track row counts, null percentages, and detect anomalies
- **[Schema Monitoring](/schema-monitoring/overview)**: Detect structural changes
- **[Alert Rules](/alerts/alert-rules)**: Get notified of quality issues

---

<!-- Source: https://docs.anomalyarmor.ai/coverage-tiers/overview -->

# Coverage Tiers

> Understand your monitoring coverage level and what each tier catches

Every data asset in AnomalyArmor earns a coverage score based on the monitoring you have configured. The score maps to a tier that tells you, at a glance, how well-protected your data is.

## The Four Tiers

| Tier | Score | What You Catch | What Can Still Surprise You |
|------|-------|---------------|---------------------------|
| **Monitored** | 10-29 | Schema changes that break pipelines | Data going stale, values going wrong, volume drops |
| **Protected** | 30-49 | Pipeline failures, data disappearing, schema breaks | Subtle quality issues, slow degradation |
| **Verified** | 50-69 | All of the above, plus stale data and value corruption | Edge cases in complex transformations |
| **Intelligent** | 70+ | All of the above, plus AI-powered anomaly detection and proactive alerts | Very little. You have full coverage across all six monitoring features. |

Assets scoring below 10 show as **Not monitored** with a prompt to set up basic monitoring.
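
The score-to-tier mapping in the table can be expressed as a small lookup. This is an illustrative sketch only; the function name `tier_for_score` is hypothetical, and the thresholds are copied from the table above.

```python
def tier_for_score(score: int) -> str:
    """Map a coverage score (0-100) to the tier name from the table above."""
    if score >= 70:
        return "Intelligent"
    if score >= 50:
        return "Verified"
    if score >= 30:
        return "Protected"
    if score >= 10:
        return "Monitored"
    # Anything below 10 has no tier yet.
    return "Not monitored"
```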

## What Goes Into Each Tier

### Monitored (entry level, any plan)

- Schema drift monitoring enabled on the asset
- At least one alert destination configured (Slack, email, webhook)
- Discovery has been run (you know what tables you have)

**Customer value**: "I'll know when my upstream sources change before it breaks my pipelines."

### Protected (standard monitoring)

- Everything in Monitored
- Row count metrics on important tables (detects data loss, duplicates, empty loads)
- Alert rules routing anomalies to your team

**Customer value**: "I'll know when data stops flowing or volumes look wrong. My team gets notified."

### Verified (strong monitoring)

- Everything in Protected
- Freshness monitoring on actively-updated tables (auto-learn mode)
- Validity rules on critical columns (null checks, format validation, range bounds)
- AI knowledge base generated for the asset

**Customer value**: "I validate data quality, not just availability. I catch bad data before it reaches dashboards."

### Intelligent (top tier)

- Everything in Verified
- All coverage features above 70%
- Alert routing verified (rules have destinations)
- Daily briefing active

**Customer value**: "Full observability with AI-powered analysis. Anomalies are automatically investigated. I get proactive guidance."

## Where You See Your Tier

- **Asset cards** in the asset list (small badge with tier name)
- **Asset detail page** header (full badge with score and breakdown)
- **Daily briefing** header per asset
- **Homepage dashboard** (lowest-tier asset highlighted for attention)
- **Agent conversations** ("BalloonBazaar is at Verified tier, score 58")

## Best Practices

- Aim for **Verified** on production assets that feed dashboards and reports
- **Protected** is a solid baseline for all actively-used databases
- **Monitored** is appropriate for staging environments and development databases
- Review coverage quarterly as your data landscape changes

- **[How to Reach Each Tier](/coverage-tiers/guide)**: Step-by-step guide for upgrading your coverage tier.
- **[Coverage Score Explained](/coverage-tiers/score)**: How the score is calculated and how to improve it.

## Common Questions

### What's the difference between Monitored, Protected, Verified, and Intelligent?

Each tier adds a layer of coverage. Monitored catches schema changes, Protected adds row-count anomaly detection, Verified adds freshness and validity checks, and Intelligent adds AI-powered anomaly detection and daily briefings. See [The Four Tiers](#the-four-tiers).

### What tier should a production table be at?

Aim for Verified (50+) on production assets that feed dashboards and reports. Protected is a reasonable baseline for all actively-used databases. Monitored is fine for staging and development. See [Best Practices](#best-practices).

### Why is my asset showing 'Not monitored'?

Assets scoring below 10 show as Not monitored. This usually means discovery hasn't run, schema drift isn't enabled, or no alert destination is configured. Enable schema monitoring and connect a destination to clear the threshold.

### Where can I see an asset's tier?

Tiers show up as badges on asset cards in the asset list, on the asset detail page header, on the homepage dashboard (lowest-tier asset is highlighted), in daily briefings, and in agent replies.

---

<!-- Source: https://docs.anomalyarmor.ai/coverage-tiers/guide -->

# How to Reach Each Tier

> Step-by-step guide for upgrading your monitoring coverage tier

Each tier builds on the previous one. You can reach Protected in about 2 minutes with auto-setup, and Verified in under 10 minutes.

## Not Monitored to Monitored (score 10+)

**Time**: Under 1 minute

**What you gain**: You'll know when upstream sources change their schema before it breaks your pipelines.

## Monitored to Protected (score 30+)

**Time**: About 2 minutes

**Or use the agent**: Tell the AnomalyArmor agent "set up Protected-level monitoring for my database" and it will handle the setup for you.

**What you gain**: You'll catch data loss, duplicate loads, and empty pipelines. Your team gets notified automatically.

## Protected to Verified (score 50+)

**Time**: About 5 minutes

**What you gain**: You validate data quality, not just availability. You catch bad data before it reaches dashboards.

## Verified to Intelligent (score 70+)

**Time**: About 5 minutes

**What you gain**: Full observability with AI-powered anomaly detection. Proactive alerts and daily briefings keep you ahead of data issues.

## Using the Agent for Setup

The AnomalyArmor agent can handle tier upgrades conversationally:

```
You: "What tier is my production database at?"

Agent: "Your Production DB is at Protected tier (score 42/100).
  Schema Drift: 100% (25/25 pts)
  Freshness: 0% (0/25 pts)
  Metrics: 40% (8/20 pts)
  To reach Verified (50+), add freshness monitoring to 8 tables.
  Want me to set that up?"

You: "Yes, set up freshness monitoring."

Agent: "Done. Created 8 freshness schedules using auto-learn mode.
  Your score is now 54 - you've reached Verified tier."
```

## Tips

- **Start with the highest-impact feature**. The coverage breakdown shows which feature contributes the most points. Focus there first.
- **Use recommendations**. The asset page shows specific recommendations for improving your score, with one-click setup buttons.
- **Batch setup is fastest**. Use the "Apply All Recommendations" button or ask the agent to handle everything at once.

## Common Questions

### How long does it take to reach Verified on a production asset?

Under 10 minutes if you use the one-click setup: about 2 minutes for Protected (schema drift + row counts + alert routing), then 5 more for freshness and validity rules. Asking the agent to handle it is usually faster than clicking through yourself.

### Can I upgrade tiers without configuring each monitor manually?

Yes. Tell the AnomalyArmor agent "set up Protected-level monitoring for my database" and it will create the row-count metrics and alert rules for you. You can also use the **Apply All Recommendations** button on the asset page.

### Which feature should I add first to raise my score fastest?

Check the coverage breakdown on the asset page. The feature with the largest gap between current coverage and 100% is usually also the one with the most unclaimed points. Start there.
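
To make "unclaimed points" concrete, here is a rough sketch that ranks features by how many points are still available. It assumes the feature weights documented in Coverage Score Explained (25, 25, 20, 15, 10, 5); the dictionary keys and the function name `unclaimed_points` are hypothetical labels, not product identifiers.

```python
# Maximum points per feature, per the Coverage Score Explained page.
WEIGHTS = {
    "schema_drift": 25,
    "freshness": 25,
    "row_count_metrics": 20,
    "alert_routing": 15,
    "validity_rules": 10,
    "intelligence_kb": 5,
}


def unclaimed_points(coverage: dict) -> list:
    """Return (feature, unclaimed points) pairs, largest gap first.

    `coverage` maps a feature name to its coverage fraction (0.0 to 1.0);
    features missing from the mapping are treated as 0% covered.
    """
    gaps = [
        (feature, weight * (1.0 - coverage.get(feature, 0.0)))
        for feature, weight in WEIGHTS.items()
    ]
    return sorted(gaps, key=lambda pair: pair[1], reverse=True)


ranking = unclaimed_points(
    {"schema_drift": 1.0, "freshness": 0.0, "row_count_metrics": 0.4}
)
```

In this example freshness comes first, because a 0% covered feature worth 25 points leaves the most points on the table.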

### Do I need Enterprise to reach Intelligent tier?

No. Intelligent tier is available on every plan. It requires all coverage features above 70% and alert routing verified. The limiter is plan capacity (monitor count), not tier eligibility.

---

<!-- Source: https://docs.anomalyarmor.ai/coverage-tiers/score -->

# Coverage Score Explained

> How your coverage score is calculated and how to improve it

Your coverage score is a weighted percentage (0-100) that reflects how thoroughly you're monitoring a data asset. Each monitoring feature contributes points based on its weight and your coverage percentage.

## Score Calculation

The score is the sum of weighted percentages across six monitoring features:

| Feature | Weight | Max Points | What It Measures |
|---------|--------|-----------|------------------|
| Schema Drift | 25% | 25 | Tables with active schema monitoring vs total |
| Freshness | 25% | 25 | Tables with freshness schedules vs total |
| Row Count Metrics | 20% | 20 | Tables with metrics configured vs total |
| Alert Routing | 15% | 15 | Alert rules with destinations vs total rules |
| Validity Rules | 10% | 10 | Tables with validity rules vs total |
| Intelligence KB | 5% | 5 | Whether AI knowledge base is fresh (less than 7 days old) |

**Total possible: 100 points**

### Example

An asset with 20 tables:

| Feature | Coverage | Calculation | Points |
|---------|----------|-------------|--------|
| Schema Drift | Enabled (1/1) | 100% x 25 | 25 |
| Freshness | 5 of 20 tables | 25% x 25 | 6 |
| Metrics | 8 of 20 tables | 40% x 20 | 8 |
| Alert Routing | 2 of 3 rules | 67% x 15 | 10 |
| Validity | 0 of 20 tables | 0% x 10 | 0 |
| Intelligence | KB generated | 100% x 5 | 5 |
| **Total** | | | **54** (Verified) |
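
The worked example can be reproduced with a few lines of code. This is a sketch under stated assumptions: points are rounded per feature to match the table, and the key names and `coverage_score` function are hypothetical. AnomalyArmor's internal rounding may differ.

```python
# Maximum points per feature, from the Score Calculation table.
WEIGHTS = {
    "schema_drift": 25,
    "freshness": 25,
    "row_count_metrics": 20,
    "alert_routing": 15,
    "validity_rules": 10,
    "intelligence_kb": 5,
}


def coverage_score(coverage: dict) -> int:
    """Sum weighted coverage, where coverage maps feature -> (covered, total)."""
    score = 0
    for feature, weight in WEIGHTS.items():
        covered, total = coverage.get(feature, (0, 0))
        if total > 0:
            # Assumption: each feature's points are rounded before summing.
            score += round(weight * covered / total)
    return score


example = {
    "schema_drift": (1, 1),        # enabled at the asset level
    "freshness": (5, 20),          # 5 of 20 tables
    "row_count_metrics": (8, 20),  # 8 of 20 tables
    "alert_routing": (2, 3),       # 2 of 3 rules have destinations
    "validity_rules": (0, 20),
    "intelligence_kb": (1, 1),     # knowledge base generated
}
print(coverage_score(example))  # 54
```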

## Score Color Coding

| Score Range | Color | Meaning |
|-------------|-------|---------|
| 0-25 | Red | Minimal monitoring, significant blind spots |
| 26-50 | Orange | Basic monitoring in place, gaps remain |
| 51-75 | Yellow | Good monitoring, some features not fully covered |
| 76-100 | Green | Full monitoring across all features |

## Common Questions

### Why is my coverage score low even though I have freshness monitoring enabled?

Freshness monitoring contributes a maximum of 25 points, so a freshness-only setup tops out at 25 (Monitored tier). Add coverage across multiple features (schema drift, metrics, validity rules, and alert routing) to move up tiers.

### Does schema drift monitoring cover all my tables automatically?

Yes. Schema drift monitoring is asset-level, not per-table. Enabling it once covers every table in that asset, which is why it contributes up to 25 points with a single toggle.

### What counts as a fresh intelligence knowledge base?

The AI knowledge base is considered fresh if it was generated within the last 7 days. Regenerate it after significant schema changes or when you add new tables.

### Does my coverage score affect my plan or pricing?

No. Coverage tiers are earned by configuring monitoring, not purchased. Any customer on any plan can work toward any tier. Higher plans do include AI investigations, which boost the Intelligence KB score component and make higher tiers easier to reach.

### How often does my coverage score update?

In real time. The score is calculated when you view the asset page or when the agent checks your tier, so it always reflects the current state of your monitoring configuration.

## Improving Your Score

The fastest ways to improve your score, by impact:

1. **Enable schema drift monitoring** (+25 points max, single toggle)
2. **Add freshness monitoring** (+25 points max, batch setup available)
3. **Add row count metrics** (+20 points max, batch setup available)
4. **Link destinations to alert rules** (+15 points max)
5. **Add validity rules** (+10 points max)
6. **Generate AI intelligence** (+5 points, single button)

Use the **Monitoring Recommendations** section on your asset page for one-click setup, or ask the agent to "improve my coverage score."

---

<!-- Source: https://docs.anomalyarmor.ai/alerts/overview -->

# Alerts Overview

> Get notified when data issues occur through flexible alert routing

Alerts are how AnomalyArmor notifies you when something needs attention. Whether it's a schema change, stale data, or a failed discovery job, alerts ensure the right people know at the right time.

*Alert pipeline from event to notification*

## How Alerts Work

Alerts follow a four-stage pipeline:

### 1. Event Detection

AnomalyArmor detects events during discovery runs:

| Event Type | Description |
|------------|-------------|
| **Schema Change** | Column added, removed, or type changed |
| **Freshness Violation** | Data not updated within SLA |
| **Discovery Failed** | Connection or permission error |
| **Asset Removed** | Table/view no longer exists |
| **New Asset** | Table/view discovered for first time |

### 2. Rule Evaluation

Each event is checked against your alert rules:

- **Scope**: Does the event match the rule's filters? (data source, schema, asset)
- **Conditions**: Does it meet additional criteria? (change type, etc.)
- **Active**: Is the rule enabled?

### 3. Suppression Check

*Alert suppression pipeline: schedule, blackout, cooldown, and daily limit checks*

Before delivery, alerts pass through suppression checks:

- **[Operating Schedules](/alerts/schedules)**: Is the event within the rule's active hours?
- **[Blackout Windows](/alerts/blackouts)**: Is a company-wide blackout currently active?
- **Cooldown**: Has this rule already fired recently?
- **Daily Limit**: Has the rule exceeded its daily notification cap?

Suppressed alerts are still recorded in the alert log for auditing.
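
The four suppression checks can be pictured as an ordered series of guards. The sketch below is illustrative only: `RuleState`, its fields, and `suppression_reason` are hypothetical names, and the check order simply follows the list above. The reason strings mirror the ones shown in the alert history.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta
from typing import List, Optional, Tuple


@dataclass
class RuleState:
    """Hypothetical per-rule settings and counters used by the checks."""
    active_start: time           # operating schedule start
    active_end: time             # operating schedule end (exclusive)
    cooldown: timedelta          # minimum gap between alerts for this rule
    daily_limit: int             # maximum notifications per day
    last_fired: Optional[datetime] = None
    fired_today: int = 0


def suppression_reason(
    now: datetime,
    rule: RuleState,
    blackouts: List[Tuple[datetime, datetime]],
) -> Optional[str]:
    """Return why an alert is suppressed, or None if it should be delivered."""
    if not (rule.active_start <= now.time() < rule.active_end):
        return "outside operating hours"
    if any(start <= now < end for start, end in blackouts):
        return "blackout period"
    if rule.last_fired is not None and now - rule.last_fired < rule.cooldown:
        return "cooldown"
    if rule.fired_today >= rule.daily_limit:
        return "daily limit"
    return None
```

Returning a reason rather than a boolean makes it straightforward to record suppressed alerts in a log together with the cause.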

### 4. Routing & Delivery

Matching events are sent to configured destinations:

- Rules can have multiple destinations
- Each destination can receive from multiple rules
- Deduplication prevents repeat alerts for the same event

## Supported Destinations

- **[Slack](/alerts/destinations/slack)**: Real-time channel notifications
- **[Email](/alerts/destinations/email)**: Individual or team distribution
- **[Webhooks](/alerts/destinations/webhooks)**: Custom integrations
- **[PagerDuty](/alerts/destinations/pagerduty)**: On-call escalation
- **[MS Teams](/alerts/destinations/ms-teams)**: Teams channel notifications

## Alert Components

### Rules

Rules define **when** alerts fire and **where** they go:

**Example: "Production Schema Changes"**

- **Event Type**: Schema Change Detected
- **Scope**: Data source = production-postgres
- **Conditions**: Change type = Column Removed
- **Destinations**:
  - Slack (#data-alerts)
  - PagerDuty (on-call)

See [Alert Rules](/alerts/alert-rules) for detailed configuration.

### Destinations

Destinations are the channels where alerts are delivered:

**Example: "Slack - Data Alerts"**

- **Type**: Slack
- **Channel**: #data-alerts
- **Workspace**: your-company.slack.com
- **Status**: Connected

Configure destinations before creating rules that use them.

### Alert History

All alerts are logged for review:

- View past alerts in **Alerts → History**
- Filter by date, type, destination, or asset
- See which rules triggered each alert
- Track response times and patterns

## Setting Up Alerts

### Quick Start

1. **Add a destination**: Connect Slack, email, or another channel
2. **Create a rule**: Define what triggers alerts and where they go
3. **Test**: Use "Send Test Alert" to verify delivery
4. **Monitor**: Review alert history and adjust thresholds

### Recommended First Rules

Start with these three rules:

| Rule | Event | Destination | Why |
|------|-------|-------------|-----|
| Schema Changes | Schema Change | Slack | Catch breaking changes |
| Stale Data | Freshness Violation | Slack | Detect pipeline failures |
| Connection Issues | Discovery Failed | Email | Know when monitoring breaks |

## Alert Deduplication

AnomalyArmor prevents alert storms:

- **Same event**: Won't re-alert for the same change until resolved
- **Cooldown period**: Configurable delay between repeated alerts
- **Aggregation**: Multiple changes can be grouped (coming soon)

## Managing Alerts

### Viewing Active Alerts

Go to **Alerts → Active** to see unresolved alerts:

- Filter by asset or date
- Click to view details and related changes
- Mark as acknowledged or resolved

### Disabling Rules

To temporarily stop alerts during maintenance:

1. Go to **Alerts → Rules**
2. Find the rule and toggle it **OFF**
3. After maintenance, toggle it back **ON**

### Reviewing History

**Alerts → History** shows all past alerts:

- When each alert fired
- Which rule triggered it
- Where it was delivered
- Alert details and context

Use history to:
- Identify alert fatigue (too many alerts)
- Find patterns (same asset always alerting)
- Tune thresholds and conditions

## Best Practices

### Start with critical assets

Don't alert on everything. Begin with your most important tables (revenue, users, orders) and expand from there.

### Match channels to urgency

- **PagerDuty**: Only for truly urgent issues requiring immediate response
- **Slack**: Team visibility, moderate urgency
- **Email**: Low urgency, informational, digests

### Set realistic thresholds

If your data updates hourly, don't set a 30-minute freshness SLA. Start lenient and tighten over time.

### Review and tune regularly

Check alert history weekly. If you're getting too many alerts, adjust rules. If you're missing issues, add coverage.

See [Alert Best Practices](/alerts/best-practices) for more detailed guidance.

## Troubleshooting

### Alerts not firing

1. Check rule is enabled (toggle ON)
2. Verify destination is connected (test it)
3. Confirm scope matches the asset
4. Ensure events are occurring (check discovery is running)

### Too many alerts

1. Add conditions to filter events
2. Exclude development/test schemas
3. Increase thresholds (e.g., longer freshness SLA)
4. Route different event types to different destinations

### Wrong destination receiving alerts

1. Check rule configuration
2. Verify destination is selected for the correct rule
3. Check for duplicate rules with different destinations

## Common Questions

### What kinds of events can AnomalyArmor alert on?

AnomalyArmor alerts on schema changes (columns or tables added, removed, or retyped), freshness violations when data falls behind its SLA, discovery failures from connection or permission errors, assets that have been removed, and newly discovered tables or views. See [Alert Rules](/alerts/alert-rules) for the full event vocabulary.

### How does AnomalyArmor prevent alert storms?

Every event passes through cooldown, daily-limit, and deduplication checks before delivery. The same unresolved event will not re-alert, and multiple related changes to one table are grouped into a single notification. Review [Best Practices](/alerts/best-practices) for additional tuning.

### Can one alert rule send to multiple destinations?

Yes. A single rule can fan out to Slack, PagerDuty, email, and any other configured destination at the same time, and the same destination can receive from many rules. This is how teams route breaking changes to PagerDuty while also posting them to a Slack channel for visibility.

### Where do I see alerts that were suppressed by a schedule or blackout?

Suppressed alerts still appear under **Alerts → History** with the suppression reason recorded (`outside operating hours`, `blackout period`, `cooldown`, or `daily limit`). Nothing is ever silently dropped.

### What are the first alert rules I should set up?

Start with three: schema changes on your production database routed to Slack, freshness violations on revenue tables routed to Slack, and discovery failures routed to email. Expand from there once you know the volume. See [Best Practices](/alerts/best-practices) for a recommended starting configuration.

## Next Steps

- **[Create Alert Rules](/alerts/alert-rules)**: Configure when and where alerts fire
- **[Set Up Slack](/alerts/destinations/slack)**: Connect your Slack workspace
- **[Best Practices](/alerts/best-practices)**: Reduce alert fatigue
- **[Freshness SLAs](/data-quality/freshness-monitoring)**: Set up data freshness alerts

---

<!-- Source: https://docs.anomalyarmor.ai/alerts/alert-rules -->

# Create and Manage Alert Rules

> Configure when and how to receive alerts

Rules are the core of AnomalyArmor's alerting system. Each rule defines what events trigger alerts, what conditions must be met, and where notifications are sent.

> ℹ️ **Prerequisites**: Before creating alert rules, you need:
> - A [connected data source](/data-sources/overview) with discovery completed
> - At least one [alert destination](/alerts/destinations/slack) configured (Slack, email, etc.)

## Rule Components

Every alert rule has these parts:

## Creating a Rule

## Event Types in Detail

### Schema Change Detected

Triggers when table or column structure changes:

| Change | Description |
|--------|-------------|
| Column Removed | Existing column no longer exists |
| Table Removed | Entire table no longer exists |
| Column Type Changed | Data type modified |
| Column Added | New column appeared |
| Table Added | New table discovered |
| Column Renamed | Name change detected (remove + add) |

**Example rule:**
- **Event**: Schema Change
- **Scope**: Data source = production
- **Conditions**: Change type = Column Removed OR Table Removed
- **Destinations**: Slack, PagerDuty

### Freshness Violation

Triggers when data isn't updated within the SLA:

| Condition | Description |
|-----------|-------------|
| SLA exceeded | Latest timestamp older than threshold |
| Warning threshold | Approaching SLA (e.g., 80% of threshold) |
| Resolved | Data became fresh again |

**Example rule:**
- **Event**: Freshness Violation
- **Scope**: Asset = orders, payments, users
- **Conditions**: (none - all violations)
- **Destinations**: Slack
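
The three freshness conditions can be sketched as a comparison between the age of the latest timestamp and the SLA threshold. This is a minimal illustration, assuming a warning ratio of 80% as in the table's example; the function name, parameter names, and returned labels are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def freshness_status(
    last_update: datetime,
    now: datetime,
    sla: timedelta,
    warn_ratio: float = 0.8,
) -> str:
    """Classify a table as 'fresh', 'warning', or 'sla_exceeded'."""
    age = now - last_update
    if age > sla:
        return "sla_exceeded"
    # Warn once the age reaches the configured fraction of the SLA.
    if age >= sla * warn_ratio:
        return "warning"
    return "fresh"


now = datetime(2025, 12, 28, 12, 10, tzinfo=timezone.utc)
last = datetime(2025, 12, 28, 10, 30, tzinfo=timezone.utc)
status = freshness_status(last, now, sla=timedelta(hours=2))
```

Here the data is 100 minutes old against a 2-hour SLA, which is past the 96-minute warning point but not yet a violation. A "Resolved" event would correspond to the status returning to `fresh` after a violation.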

### Discovery Failed

Triggers when a discovery job can't complete:

| Failure Type | Description |
|--------------|-------------|
| Connection failed | Can't reach database |
| Authentication failed | Invalid credentials |
| Permission denied | User lacks access |
| Timeout | Discovery took too long |

**Example rule:**
- **Event**: Discovery Failed
- **Scope**: All data sources
- **Conditions**: (none)
- **Destinations**: Email, Slack

### Asset Removed

Triggers when a previously-known asset no longer exists:

**Common causes:**
- Table was intentionally dropped
- Table was renamed
- Permission revoked
- Discovery ran against wrong database

**Example rule:**
- **Event**: Asset Removed
- **Scope**: Data source = production
- **Destinations**: Slack

## Rule Examples

### Critical Tables Only

Monitor your most important tables closely:

| Field | Value |
|-------|-------|
| **Name** | Critical Tables - All Changes |
| **Event** | Schema Change Detected |
| **Data Source** | `production-postgres` |
| **Assets** | `orders`, `users`, `payments`, `products` |
| **Conditions** | All changes |
| **Destinations** | Slack `#critical-data`, PagerDuty |

### Production Breaking Changes

Alert only on potentially breaking changes. Pick **Breaking only** as the
trigger scope and the rule will fire on any schema alert that contains at
least one breaking change, no matter how many additive changes come along
with it:

| Field | Value |
|-------|-------|
| **Name** | Production - Breaking Changes Only |
| **Event** | Schema Change Detected |
| **Data Source** | `production-postgres` |
| **Schema** | `public`, `analytics` |
| **Trigger scope** | Breaking only |
| **Destinations** | Slack `#data-alerts`, PagerDuty |
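
The "at least one breaking change" behavior can be sketched as a set intersection. This is an illustration, not the product's implementation: the set of breaking types below is an assumption limited to change types named in this documentation, `COLUMN_ADDED` and the scope labels `"all"`, `"breaking_only"`, and `"specific_types"` are hypothetical identifiers, and `rule_fires` is a made-up helper.

```python
# Assumption: a partial list of breaking change types, for illustration.
BREAKING_TYPES = frozenset(
    {"COLUMN_REMOVED", "COLUMN_TYPE_CHANGED", "PRIMARY_KEY_REMOVED"}
)


def rule_fires(change_types, trigger_scope, selected_types=frozenset()):
    """Decide whether a schema alert with these change types triggers a rule."""
    seen = set(change_types)
    if trigger_scope == "all":
        return bool(seen)
    if trigger_scope == "breaking_only":
        # One breaking change is enough, regardless of additive changes.
        return bool(seen & BREAKING_TYPES)
    if trigger_scope == "specific_types":
        return bool(seen & set(selected_types))
    raise ValueError(f"unknown trigger scope: {trigger_scope!r}")


mixed = ["COLUMN_ADDED", "COLUMN_ADDED", "COLUMN_REMOVED"]
additive_only = ["COLUMN_ADDED"]
```

With these inputs, `rule_fires(mixed, "breaking_only")` is true and `rule_fires(additive_only, "breaking_only")` is false, which matches the behavior described above.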

### Gold Tables - Change Freeze

Enforce a strict change-freeze policy on a set of critical tables by scoping
to specific change types:

| Field | Value |
|-------|-------|
| **Name** | Gold Tables - Change Freeze |
| **Event** | Schema Change Detected |
| **Assets** | `gold.*` |
| **Trigger scope** | Specific types |
| **Change types** | `COLUMN_REMOVED`, `COLUMN_TYPE_CHANGED`, `PRIMARY_KEY_REMOVED`, `PRIMARY_KEY_MODIFIED` |
| **Destinations** | PagerDuty, Slack `#gold-table-freeze` |

### Non-Production Informational

Low-priority notifications for dev/staging:

| Field | Value |
|-------|-------|
| **Name** | Staging Schema Changes (Info Only) |
| **Event** | Schema Change Detected |
| **Data Source** | `staging-postgres` |
| **Conditions** | All changes |
| **Destinations** | Email (daily digest) |

### Freshness by Criticality

Different thresholds for different tables:

**Revenue Tables Freshness**

| Field | Value |
|-------|-------|
| **Event** | Freshness Violation |
| **Assets** | `orders`, `payments`, `revenue_*` |
| **Destinations** | Slack, PagerDuty |

**Analytics Tables Freshness**

| Field | Value |
|-------|-------|
| **Event** | Freshness Violation |
| **Assets** | `daily_*`, `weekly_*`, `analytics_*` |
| **Destinations** | Slack |

## Managing Rules

### Enabling/Disabling

Toggle rules on or off without deleting:

1. Go to **Alerts → Rules**
2. Find the rule
3. Click the toggle switch

Disabled rules don't evaluate events, but their configuration is preserved.

### Editing Rules

Modify existing rules:

1. Go to **Alerts → Rules**
2. Click the rule name
3. Make changes
4. Click **Save**

> ℹ️ Changes take effect immediately. Alerts that have already fired are not re-sent.

### Deleting Rules

Remove rules you no longer need:

1. Go to **Alerts → Rules**
2. Click the rule menu (⋮)
3. Select **Delete**
4. Confirm deletion

> ⚠️ Deleted rules can't be recovered. Consider disabling instead if you might need the rule later.

## Advanced Configuration

### Multiple Destinations

A single rule can send to multiple places:

**Rule: "Critical Schema Changes"**
- Slack `#data-alerts`
- Slack `#data-engineering`
- PagerDuty
- Email team@company.com (daily digest)

### Pattern Matching

Use patterns in asset filters:

| Pattern | Matches |
|---------|---------|
| `orders*` | orders, orders_2024, orders_archive |
| `*_staging` | orders_staging, users_staging |
| `analytics.*` | All tables in analytics schema |
| `*.users` | users table in any schema |
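
To preview which assets a pattern would select, the table's examples can be approximated with shell-style globbing from Python's standard library. This is a sketch that assumes AnomalyArmor's patterns behave like `fnmatch` globs matched against the asset name as written in the filter; the exact matching rules in the product may differ.

```python
from fnmatch import fnmatchcase


def matching_assets(pattern: str, assets: list) -> list:
    """Return the assets whose names match a glob pattern (case-sensitive)."""
    return [name for name in assets if fnmatchcase(name, pattern)]


assets = [
    "orders",
    "orders_2024",
    "orders_archive",
    "orders_staging",
    "users_staging",
    "analytics.daily_sales",
    "public.users",
]

prefix_matches = matching_assets("orders*", assets)
schema_matches = matching_assets("analytics.*", assets)
```

Note that with plain globbing `*` also matches dots, so `orders*` would not match a schema-qualified name such as `public.orders`, while `*_staging` would match names in any schema.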

## Troubleshooting

### Rule not triggering

1. **Check enabled**: Is the toggle ON?
2. **Check scope**: Does the asset match the filter?
3. **Check conditions**: Are conditions too restrictive?
4. **Check events**: Is discovery running and finding changes?
5. **Check destination**: Is it configured correctly?

### Rule triggering unexpectedly

1. **Review scope**: Is it broader than intended?
2. **Check conditions**: Add filters to narrow matches
3. **Review event type**: Correct event selected?
4. **Check duplicates**: Multiple rules for same scope?

### Can't save rule

1. **Required fields**: Ensure name, event type, and destination are set
2. **Valid scope**: Check pattern syntax
3. **Destination exists**: Verify destination is configured

## Common Questions

### How do I alert only on breaking schema changes?

Create a Schema Change rule and set **Trigger scope** to **Breaking only**. The rule fires on any alert that contains at least one breaking change (column removed, type changed, primary key removed, etc.), even when mixed with additive changes. Additive-only alerts will be skipped.

### Can I scope an alert rule to specific tables or schemas?

Yes. Every rule has optional Data Source, Schema, and Asset filters, and all three accept glob patterns like `orders*`, `*_staging`, or `analytics.*`. Combine them to target, for example, only `gold.*` tables in your production warehouse.

### What happens when I disable an alert rule?

Disabled rules stop evaluating incoming events but keep their configuration intact, so you can toggle them back on later. This is safer than deleting a rule during a known-noisy window like a migration. Use a [blackout window](/alerts/blackouts) instead if you want to silence everything temporarily without touching individual rules.

### Can one rule send alerts to more than one destination?

Yes. Pick any number of configured destinations in the rule builder, and each receives a copy of the alert. Teams commonly route production breaking changes to PagerDuty, Slack, and a team email list from a single rule.

### How do I enforce a change freeze on critical tables?

Create a rule with **Trigger scope: Specific types** and pick the change types you want to be alerted on, such as `COLUMN_REMOVED`, `COLUMN_TYPE_CHANGED`, and `PRIMARY_KEY_REMOVED`. Scope the asset filter to your protected tables (for example `gold.*`) and route the rule to PagerDuty.

## Next Steps

- **[Set Up Slack](/alerts/destinations/slack)**: Configure Slack as a destination
- **[Best Practices](/alerts/best-practices)**: Reduce alert fatigue

---

<!-- Source: https://docs.anomalyarmor.ai/alerts/destinations/slack -->

# Slack

> Send AnomalyArmor alerts to Slack channels

Send AnomalyArmor alerts directly to your Slack workspace. Get notified about schema changes, freshness violations, and other data events where your team already works.

Alerts arrive as richly formatted Slack Block Kit messages with a color-coded severity bar, structured fields, and a **View Details** link back to AnomalyArmor.

**Example Slack Alert:**

**Breaking schema changes detected**

3 changes in 2 table(s) (2 breaking)

**Rule:** Production Schema Monitor
**Tables:** orders, customers

**Changes:**
- orders: Column Removed `shipping_status` (breaking)
- orders: Column Added `tracking_number`
- customers: Column Removed `legacy_id` (breaking)

[View Details]

## Why Slack?

Slack is ideal for alerts that need quick team visibility:

- **Real-time**: Alerts appear instantly in channels
- **Contextual**: Team can discuss and coordinate in threads
- **Actionable**: Click through to AnomalyArmor for details
- **Flexible**: Route different alerts to different channels

## Prerequisites

Before you begin:

- Admin access to your Slack workspace (or permission to install apps)
- AnomalyArmor account with alert configuration permissions
- At least one Slack channel where you want to receive alerts

## Setup Guide

### Step 1: Navigate to Destinations

1. Log in to AnomalyArmor
2. Click **Alerts** in the left sidebar
3. Select **Destinations** tab
4. Click **Add Destination**

You'll see the Destinations page with a **+ Add Destination** button in the top-right corner.

### Step 2: Select Slack

From the destination type list, click **Slack**.

### Step 3: Authorize with Slack

Click **Connect to Slack** to begin the OAuth flow. Slack will show a standard authorization dialog listing the permissions AnomalyArmor needs:

| Permission | Why We Need It |
|------------|---------------|
| Post to channels | Send alert messages |
| View channel list | Let you select destination channels |
| View workspace info | Display workspace name in AnomalyArmor |

> ℹ️ AnomalyArmor requests only the minimum permissions needed to send alerts. We cannot read your messages, files, or private channels.

### Step 4: Select Channels

After authorization, choose which channels can receive alerts:

1. **Public channels**: Any public channel in your workspace
2. **Private channels**: Must invite the AnomalyArmor bot first

To use a private channel:
1. Open the private channel in Slack
2. Type `/invite @AnomalyArmor`
3. The channel will appear in AnomalyArmor's channel list

### Step 5: Name Your Destination

Give the destination a descriptive name:

```
Examples:
- "Slack - #data-alerts"
- "Production Alerts - Slack"
- "Platform Team Slack"
```

This name appears when creating alert rules, so make it recognizable.

### Step 6: Test the Connection

Click **Send Test Alert** to verify everything works:

```
[OK] Test alert sent successfully
  Channel: #data-alerts
  Message: "Test alert from AnomalyArmor"
```

Check Slack to confirm the message arrived.

### Step 7: Save

Click **Save** to complete the setup. Your Slack destination is now ready to use in alert rules.

## Alert Message Format

AnomalyArmor Slack alerts are formatted using Slack's Block Kit for rich, structured messages. Each alert type displays contextually relevant information.

### Schema Change Alerts

Schema change alerts show affected tables, breaking change counts, and detailed change lists:

**Breaking schema changes detected in warehouse**

5 changes in 2 table(s) (2 breaking)

| Field | Value |
|-------|-------|
| Rule | Production Schema Monitor |
| Severity | CRITICAL |
| Tables | orders, customers |
| Breaking | 2 changes |

**Changes:**
- orders: Column Removed `shipping_status` (breaking)
- orders: Column Added `tracking_number`
- customers: Column Removed `legacy_id` (breaking)

[View Details]

### Freshness Violation Alerts

Freshness SLA violation alerts show overdue time, SLA thresholds, and last update times:

**Freshness SLA Violation**

Data is 4.5 hours overdue

| Field | Value |
|-------|-------|
| Rule | Sales Freshness |
| Severity | HIGH |
| Asset | warehouse.sales |
| Overdue | 4.5 hours (2.3 intervals) |
| SLA Threshold | 2.0 hours |
| Last Update | 2025-12-28 10:30 UTC |

[View Details]

### Message Components

| Component | Description |
|-----------|-------------|
| Color bar | Visual severity indicator (red for critical, orange for high, yellow for medium, blue for low) |
| Alert title | What happened with summary stats |
| Context fields | Rule name, severity, and event-specific details |
| Changes list | Detailed breakdown of changes (schema alerts) |
| View Details button | Link to view full alert details in AnomalyArmor |
| Footer | Alert ID and Rule ID for tracing |

## Multiple Slack Destinations

You can create multiple Slack destinations for different use cases:

| Destination | Channel | Use Case |
|-------------|---------|----------|
| Critical Alerts | #data-incidents | Breaking changes, SLA violations |
| Schema Changes | #data-changes | All schema drift notifications |
| Dev/Test Alerts | #data-dev | Non-production events |
| Team-Specific | #analytics-team | Alerts for specific assets |

### Routing Example

**Production Breaking Changes**
- Scope: production database
- Event: Schema change (column removed, table removed)
- Destinations: Slack #data-alerts, PagerDuty (for on-call)

**Dev Schema Changes**
- Scope: development database
- Event: Any schema change
- Destinations: Slack #data-dev
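
The two rules above can be modeled as plain data plus a matching function. This is only an illustrative sketch: the `Rule` fields and the `destinations_for` helper are hypothetical names, not AnomalyArmor's rule schema or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    scope: str               # database the rule watches (hypothetical field)
    change_types: frozenset  # empty set means "any schema change"
    destinations: tuple


RULES = [
    Rule(
        "Production Breaking Changes",
        "production",
        frozenset({"column_removed", "table_removed"}),
        ("Slack #data-alerts", "PagerDuty"),
    ),
    Rule("Dev Schema Changes", "development", frozenset(), ("Slack #data-dev",)),
]


def destinations_for(scope: str, change_type: str) -> list:
    """Return the destinations of every rule that matches the event."""
    matched = []
    for rule in RULES:
        if rule.scope != scope:
            continue
        # A non-empty change_types set restricts the rule to those changes.
        if rule.change_types and change_type not in rule.change_types:
            continue
        matched.extend(rule.destinations)
    return matched


print(destinations_for("production", "column_removed"))
print(destinations_for("development", "column_added"))
```

A non-breaking change in production (for example `column_added`) matches neither rule here, so nothing is sent.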

## Best Practices

### Channel Selection

#### Use dedicated alert channels

Don't send alerts to busy team channels. Create dedicated channels like `#data-alerts` so alerts don't get lost in conversation.

#### Separate by urgency

Route breaking changes to a channel with notifications enabled. Route informational alerts (like new tables) to a quieter channel.

#### Consider channel membership

Make sure the right people are in the channel. Data engineers should be in `#data-alerts`; the alert channel does not need everyone from `#general`.

### Alert Volume

> ⚠️ Too many alerts lead to alert fatigue. If a channel gets 50+ alerts/day, people stop paying attention. Be selective about what triggers Slack notifications.

**Tips to reduce noise**:
- Start with critical alerts only
- Use email for low-priority events
- Set freshness SLAs with a buffer (e.g., 2x the expected update interval) rather than at the exact expected time
- Filter out expected changes (dev databases)
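
The buffer tip comes down to simple arithmetic. The sketch below assumes a table that normally updates hourly and a 2x buffer; `sla_threshold` and `is_stale` are illustrative helpers, not AnomalyArmor functions.

```python
from datetime import timedelta


def sla_threshold(expected_interval: timedelta, buffer_factor: float = 2.0) -> timedelta:
    """Freshness SLA = expected update interval times a safety buffer."""
    return expected_interval * buffer_factor


def is_stale(age: timedelta, expected_interval: timedelta, buffer_factor: float = 2.0) -> bool:
    """True once the data is older than the buffered SLA threshold."""
    return age > sla_threshold(expected_interval, buffer_factor)


hourly = timedelta(hours=1)
print(sla_threshold(hourly))                    # 2:00:00
print(is_stale(timedelta(minutes=90), hourly))  # False: late, but inside the buffer
print(is_stale(timedelta(hours=3), hourly))     # True: past the 2-hour threshold
```

With no buffer (`buffer_factor=1.0`), the 90-minute case would already alert, which is the kind of noise the tip is meant to avoid.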

### Thread Usage

AnomalyArmor posts each alert as a new message. To keep channels clean:

1. Create a thread on the alert to discuss
2. Use reactions (:eyes: = investigating, :white_check_mark: = resolved)
3. @mention teammates who need to see it

## Troubleshooting

### "Channel not found"

**Cause**: The channel was deleted, renamed, or AnomalyArmor lost access.

**Fix**:
1. Verify the channel exists in Slack
2. For private channels, re-invite the AnomalyArmor bot
3. Edit the destination and reselect the channel

### "Unable to post message"

**Cause**: Permission issues or Slack API errors.

**Fix**:
1. Check that the AnomalyArmor app is still installed in Slack
2. Try removing and re-adding the Slack destination
3. Verify your Slack workspace allows third-party apps

### Alerts not appearing

**Cause**: Rule configuration or routing issues.

**Fix**:
1. Check the alert rule is enabled
2. Verify the rule's scope matches the event
3. Confirm the Slack destination is selected for the rule
4. Use **Send Test Alert** to verify the connection

### OAuth errors during setup

**Cause**: Permission or network issues.

**Fix**:
1. Ensure you have admin permissions in Slack (or can request app installation)
2. Disable browser popup blockers temporarily
3. Try a different browser
4. Contact your Slack workspace admin if apps are restricted

## Security

### Data Sent to Slack

Alert messages contain:
- Asset names (table/column names)
- Change types (added, removed, modified)
- Timestamps

Alert messages **do not** contain:
- Actual data values
- Database credentials
- Connection strings
- Query results

### Revoking Access

To disconnect AnomalyArmor from Slack:

1. In Slack: **Settings** → **Manage apps** → **AnomalyArmor** → **Remove**
2. In AnomalyArmor: Delete the Slack destination

## Common Questions

### How do I route AnomalyArmor alerts to a private Slack channel?

Private channels don't appear in the selector until you invite the bot. Open the private channel in Slack, run `/invite @AnomalyArmor`, then refresh the channel list in AnomalyArmor. Public channels work without any invite.

### Can I send different alert types to different Slack channels?

Yes. Create a separate Slack destination for each channel (e.g., `#data-incidents`, `#data-changes`, `#data-dev`), then pick the appropriate destination on each alert rule. One rule can also fan out to multiple channels at once.

### Does AnomalyArmor's Slack app read my messages?

No. The app requests only the minimum scopes needed to post alerts: post to channels, view the channel list, and view workspace info. It cannot read messages, files, or private channel contents.

### What information is included in a Slack alert?

Schema alerts include the rule name, affected tables, a color-coded severity bar, and a detailed change list with breaking changes marked. Freshness alerts include overdue time, SLA threshold, and last update time. Every alert has a **View Details** link back to AnomalyArmor. No row-level data values are ever sent.

### Why are my alerts not appearing in Slack?

Check, in order: the rule is enabled, the rule's scope matches the event, the Slack destination is selected on the rule, and the channel still exists with the AnomalyArmor bot invited (for private channels). Use **Send Test Alert** on the destination to confirm the connection itself is healthy.

## Next Steps

- **[Alert Rules](/alerts/alert-rules)**: Create rules that route to Slack
- **[Best Practices](/alerts/best-practices)**: Reduce alert fatigue

---

<!-- Source: https://docs.anomalyarmor.ai/alerts/destinations/email -->

# Email

> Send AnomalyArmor alerts to email addresses for team notifications

Send AnomalyArmor alerts to individual email addresses or distribution lists. Email is ideal for alerts that don't require immediate action or for team members who prefer email over chat.

## Why Email?

Email works well for certain alert scenarios:

- **Digest-friendly**: Batch multiple alerts into summaries
- **Universal**: Everyone has email, no app installation needed
- **Async**: Good for non-urgent notifications
- **Auditable**: Built-in record of all alerts received

## Prerequisites

Before you begin:

- AnomalyArmor account with alert configuration permissions
- Valid email addresses for recipients

## Setup Guide

### Step 1: Navigate to Destinations

1. Log in to AnomalyArmor
2. Click **Alerts** in the left sidebar
3. Select the **Destinations** tab
4. Click **Add Destination**

### Step 2: Select Email

From the destination type list, click **Email**.

### Step 3: Configure Recipients

Enter the email addresses that should receive alerts:

| Field | Description |
|-------|-------------|
| **Destination Name** | A descriptive name (e.g., "Data Team Email") |
| **Recipients** | Email addresses, one per line or comma-separated |

> ℹ️ You can use distribution lists or group email addresses to reach multiple team members with a single destination.

### Step 4: Test the Connection

Click **Send Test Alert** to verify delivery:

```
Test alert sent to: team@example.com
Subject: [AnomalyArmor] Test Alert
```

Check the recipient inbox (and spam folder) to confirm the message arrived.

### Step 5: Save

Click **Save** to complete the setup. Your email destination is now ready to use in alert rules.

## Alert Email Format

AnomalyArmor emails include:

- **Subject**: Alert type and affected asset
- **Body**: Full alert details with links to AnomalyArmor
- **Unsubscribe link**: Manage notification preferences

## Best Practices

### Use Distribution Lists

Instead of individual emails, use a team distribution list:
- Easier to manage team changes
- Single destination covers entire team
- Reduces configuration maintenance

### Pair with Other Destinations

Email works well alongside real-time channels:

| Alert Type | Destinations |
|------------|--------------|
| Breaking changes | Slack + PagerDuty + Email |
| Schema changes | Slack + Email |
| Informational | Email only |

### Avoid Alert Fatigue

Email alerts can pile up quickly. Consider:
- Routing only important alerts to email
- Using Slack for high-volume informational events
- Setting freshness SLAs with a buffer (e.g., 2x the expected update interval)

## Troubleshooting

### Emails going to spam

**Cause**: Email filters flagging AnomalyArmor messages.

**Fix**:
1. Add `alerts@anomalyarmor.ai` to your allowlist
2. Mark test emails as "Not Spam"
3. Contact your IT team to allowlist the domain

### Emails not arriving

**Cause**: Invalid email address or delivery issues.

**Fix**:
1. Verify the email address is correct
2. Check spam/junk folders
3. Try a different email address to test
4. Use **Send Test Alert** to verify the destination

## Common Questions

### Can I send AnomalyArmor alerts to a distribution list?

Yes. Enter any group or distribution list address in the recipients field, the same as an individual mailbox. This is the recommended pattern, since managing membership in your mail system is easier than editing AnomalyArmor destinations when the team changes.

### Why are AnomalyArmor emails going to spam?

Alerts come from `alerts@anomalyarmor.ai`. Add that address to your allowlist, mark any test email as "Not Spam", and ask your IT team to allowlist the domain if the issue persists. Corporate spam filters are the most common cause of missing alerts.

### Can I add multiple recipients to one email destination?

Yes. Enter addresses one per line or comma-separated in the recipients field. Every alert sent to that destination is delivered to all listed addresses.

## Next Steps

- **[Alert Rules](/alerts/alert-rules)**: Create rules that route to email
- **[Best Practices](/alerts/best-practices)**: Reduce alert fatigue

---

<!-- Source: https://docs.anomalyarmor.ai/alerts/destinations/webhooks -->

# Webhooks

> Send AnomalyArmor alerts to any HTTP endpoint for custom integrations

Send AnomalyArmor alerts to any HTTP endpoint via webhooks. This enables custom integrations with internal tools, monitoring systems, or any service that accepts HTTP requests.

## Why Webhooks?

Webhooks provide maximum flexibility:

- **Universal**: Integrate with any system that accepts HTTP
- **Customizable**: Build your own alert handling logic
- **Automation**: Trigger workflows, scripts, or pipelines
- **Integration**: Connect to tools without native integrations

## Use Cases

Common webhook integrations:

| System | Use Case |
|--------|----------|
| Custom dashboards | Display alerts on internal monitoring |
| Automation platforms | Trigger Zapier, n8n, or Make workflows |
| Ticketing systems | Create tickets in Jira, Linear, etc. |
| Data pipelines | Pause pipelines on schema changes |
| Logging systems | Send alerts to Datadog, Splunk, etc. |

## Prerequisites

Before you begin:

- An HTTP endpoint that can receive POST requests
- AnomalyArmor account with alert configuration permissions
- (Optional) Authentication credentials for your endpoint

## Setup Guide

### Step 1: Navigate to Destinations

1. Log in to AnomalyArmor
2. Click **Alerts** in the left sidebar
3. Select the **Destinations** tab
4. Click **Add Destination**

### Step 2: Select Webhook

From the destination type list, click **Webhook**.

### Step 3: Configure the Webhook

| Field | Description |
|-------|-------------|
| **Destination Name** | A descriptive name (e.g., "Internal Alerting System") |
| **URL** | Your endpoint URL (must be HTTPS) |
| **Authentication** | Optional: Bearer token or API key |
| **Headers** | Optional: Custom HTTP headers |

### Step 4: Test the Connection

Click **Send Test Alert** to verify your endpoint receives the payload:

```
POST https://your-endpoint.com/alerts
Status: 200 OK
Response time: 142ms
```

Your endpoint should respond with a 2xx status code.

### Step 5: Save

Click **Save** to complete the setup. Your webhook destination is now ready to use in alert rules.

## Payload Format

AnomalyArmor sends a JSON payload for each alert:

```json
{
  "event_type": "alert.triggered",
  "alert": {
    "id": "alert_abc123",
    "rule_name": "Production Schema Changes",
    "alert_type": "schema",
    "status": "triggered",
    "created_at": "2024-01-15T08:00:00Z"
  },
  "event": {
    "type": "schema_change",
    "change_type": "column_removed",
    "asset": {
      "qualified_name": "production.public.orders",
      "asset_type": "table"
    },
    "details": {
      "column_name": "shipping_status",
      "column_type": "varchar"
    }
  },
  "links": {
    "alert_url": "https://app.anomalyarmor.ai/alerts/alert_abc123",
    "asset_url": "https://app.anomalyarmor.ai/assets/..."
  }
}
```
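
A receiver typically needs only a few of these fields. As a minimal sketch, the snippet below parses a trimmed copy of the payload above and builds a one-line summary; `summarize` is an illustrative helper, and it assumes every field shown in the example is present.

```python
import json

# Trimmed copy of the example payload shown above.
SAMPLE_BODY = """
{
  "event_type": "alert.triggered",
  "alert": {"id": "alert_abc123", "rule_name": "Production Schema Changes"},
  "event": {
    "type": "schema_change",
    "change_type": "column_removed",
    "asset": {"qualified_name": "production.public.orders", "asset_type": "table"},
    "details": {"column_name": "shipping_status", "column_type": "varchar"}
  }
}
"""


def summarize(body: str) -> str:
    """Build a one-line, human-readable summary from a webhook body."""
    data = json.loads(body)
    alert = data["alert"]
    event = data["event"]
    asset = event["asset"]["qualified_name"]
    return f'{alert["rule_name"]}: {event["change_type"]} on {asset} ({alert["id"]})'


print(summarize(SAMPLE_BODY))
```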

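### Event Types

In the example payload, these values appear in the `event.type` field (the top-level `event_type` is `alert.triggered`):
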
| Event Type | Description |
|------------|-------------|
| `schema_change` | Column, table, or constraint changes |
| `freshness_violation` | Table exceeded freshness SLA |
| `discovery_complete` | Discovery job finished |
| `connection_failed` | Database connection issue |

## Authentication

### Bearer Token

Add an `Authorization` header with your token:

```
Authorization: Bearer your-secret-token
```

### API Key Header

Add a custom header with your API key:

```
X-API-Key: your-api-key
```
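
Whichever header you configure, the receiving endpoint has to check it. Here is a minimal sketch of that check for the bearer-token case, assuming the request headers arrive as a plain dict. `EXPECTED_TOKEN` and `is_authorized` are placeholder names, and a real web framework may expose headers case-insensitively.

```python
import hmac

EXPECTED_TOKEN = "your-secret-token"  # placeholder; load from a secret store in practice


def is_authorized(headers: dict) -> bool:
    """Check an 'Authorization: Bearer <token>' header in constant time."""
    supplied = headers.get("Authorization", "")
    scheme, _, token = supplied.partition(" ")
    if scheme != "Bearer":
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(token.encode(), EXPECTED_TOKEN.encode())


print(is_authorized({"Authorization": "Bearer your-secret-token"}))
print(is_authorized({"Authorization": "Bearer wrong-token"}))
```

The same pattern applies to an `X-API-Key` header: read the header value and compare it to the expected key with `hmac.compare_digest`.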

### HMAC Signature

AnomalyArmor includes an `X-AnomalyArmor-Signature` header with each request. Verify this signature to ensure the request came from AnomalyArmor:

```python
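import hashlib
import hmac

def verify_signature(payload, signature, secret):
    """Return True if `signature` matches the HMAC-SHA256 of the raw body."""
    # Sign the exact bytes received; re-serialized JSON may not match.
    body = payload.encode() if isinstance(payload, str) else payload
    expected = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    # The header has the form "sha256=<hex digest>"; compare in constant time.
    return hmac.compare_digest(f"sha256={expected}", signature)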
```
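
To test verification locally before any real delivery arrives, you can sign a sample body yourself and confirm that tampering is detected. This sketch assumes the signature is the HMAC-SHA256 hex digest of the raw body prefixed with `sha256=`; the secret and body are placeholders.

```python
import hashlib
import hmac


def sign(body: bytes, secret: str) -> str:
    """Produce a header value of the form 'sha256=<hex digest>'."""
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def verify(body: bytes, signature: str, secret: str) -> bool:
    """Recompute the signature over the raw bytes and compare in constant time."""
    return hmac.compare_digest(sign(body, secret).encode(), signature.encode())


secret = "example-webhook-secret"  # placeholder
body = b'{"event_type": "alert.triggered"}'
header = sign(body, secret)

print(verify(body, header, secret))            # True
print(verify(body + b" ", header, secret))     # False: body was altered
print(verify(body, header, "another-secret"))  # False: wrong secret
```

The second check shows why the raw bytes matter: even a single extra space in the body changes the digest.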

## Retry Logic

AnomalyArmor retries failed webhook deliveries:

| Attempt | Delay |
|---------|-------|
| 1 | Immediate |
| 2 | 1 minute |
| 3 | 5 minutes |
| 4 | 15 minutes |
| 5 | 1 hour |

After 5 failed attempts, the alert is marked as failed delivery.
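
To estimate how long an endpoint outage can last before a delivery is marked failed, the schedule can be turned into timestamps. The sketch below assumes each delay is measured from the previous attempt, which the table does not state; if the delays are measured from the first attempt instead, the totals differ.

```python
from datetime import datetime, timedelta, timezone

# Delays from the table above, in attempt order.
RETRY_DELAYS = [
    timedelta(0),
    timedelta(minutes=1),
    timedelta(minutes=5),
    timedelta(minutes=15),
    timedelta(hours=1),
]


def attempt_times(first_attempt: datetime) -> list:
    """Return the time of each attempt, assuming delays accumulate."""
    times = []
    current = first_attempt
    for delay in RETRY_DELAYS:
        current += delay
        times.append(current)
    return times


start = datetime(2025, 1, 1, 8, 0, tzinfo=timezone.utc)
schedule = attempt_times(start)
for number, when in enumerate(schedule, start=1):
    print(number, when.isoformat())
print("window:", schedule[-1] - start)  # 1:21:00 under the stated assumption
```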

## Best Practices

### Endpoint Design

- Respond quickly (< 5 seconds) to avoid timeouts
- Return 200 for successful receipt, even if processing is async
- Use a queue for heavy processing
- Log incoming payloads for debugging
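
One way to follow these guidelines is to acknowledge the request immediately and hand the payload to a background worker. This is a framework-agnostic sketch using only the standard library: `handle_request` stands in for your HTTP handler and returns the status code, and the "slow work" is simulated by recording the alert ID.

```python
import json
import queue
import threading

work_queue = queue.Queue()
processed = []  # stand-in for the result of slow processing


def handle_request(raw_body: bytes) -> int:
    """Validate, enqueue, and return an HTTP status without doing heavy work."""
    try:
        payload = json.loads(raw_body)
    except ValueError:  # covers malformed JSON and invalid UTF-8
        return 400
    if not isinstance(payload, dict):
        return 400
    work_queue.put(payload)
    return 200


def worker() -> None:
    """Drain the queue in the background so the handler stays fast."""
    while True:
        payload = work_queue.get()
        try:
            processed.append(payload.get("alert", {}).get("id"))
        finally:
            work_queue.task_done()


threading.Thread(target=worker, daemon=True).start()

print(handle_request(b'{"alert": {"id": "alert_abc123"}}'))  # 200
work_queue.join()  # wait for the worker (only needed in this demo)
print(processed)
```

An in-process queue loses pending work on restart, so a durable queue is the safer choice when alerts must not be dropped.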

### Security

- Always use HTTPS endpoints
- Verify the `X-AnomalyArmor-Signature` header
- Rotate authentication tokens periodically
- Allowlist AnomalyArmor IP addresses if needed

### Error Handling

Your endpoint should handle:
- Duplicate alerts (use `alert.id` for deduplication)
- Unknown event types (ignore gracefully)
- Missing fields (use defaults)
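
The three cases above can be handled in a small dispatcher. This sketch keeps seen alert IDs in memory for brevity (a real service would persist them), and it assumes freshness events carry the asset in the same place as the schema example; the handler names are illustrative.

```python
seen_alert_ids = set()  # in-memory only; persist this in a real service


def on_schema_change(event: dict) -> str:
    asset = event.get("asset", {}).get("qualified_name", "unknown asset")
    return f"schema change on {asset}"


def on_freshness_violation(event: dict) -> str:
    asset = event.get("asset", {}).get("qualified_name", "unknown asset")
    return f"freshness violation on {asset}"


HANDLERS = {
    "schema_change": on_schema_change,
    "freshness_violation": on_freshness_violation,
}


def process(payload: dict) -> str:
    """Deduplicate by alert.id, ignore unknown types, default missing fields."""
    alert = payload.get("alert", {})
    event = payload.get("event", {})
    alert_id = alert.get("id")
    if alert_id is not None:
        if alert_id in seen_alert_ids:
            return "duplicate"
        seen_alert_ids.add(alert_id)
    handler = HANDLERS.get(event.get("type"))
    if handler is None:
        return "ignored"  # unknown or missing event type
    return handler(event)


sample = {
    "alert": {"id": "alert_abc123"},
    "event": {"type": "schema_change",
              "asset": {"qualified_name": "production.public.orders"}},
}
print(process(sample))  # schema change on production.public.orders
print(process(sample))  # duplicate
```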

## Troubleshooting

### "Connection refused" or timeout

**Cause**: Endpoint unreachable or slow.

**Fix**:
1. Verify the URL is correct and accessible
2. Check firewall rules allow AnomalyArmor IPs
3. Ensure the endpoint responds within 30 seconds

### "401 Unauthorized"

**Cause**: Authentication failed.

**Fix**:
1. Verify the token/API key is correct
2. Check the header name matches your endpoint's expectation
3. Ensure credentials haven't expired

### "400 Bad Request"

**Cause**: Endpoint rejected the payload.

**Fix**:
1. Check the endpoint logs for the specific error
2. Verify the endpoint accepts `application/json`
3. Test with a simple endpoint first

## Common Questions

### How do I verify a webhook actually came from AnomalyArmor?

Every request includes an `X-AnomalyArmor-Signature` header in the form `sha256=<hex digest>`. Compute HMAC-SHA256 of the raw request body using your webhook secret and compare with `hmac.compare_digest` to prevent spoofed deliveries. A Python example is in the [Authentication](#authentication) section.

### What happens if my webhook endpoint is down?

AnomalyArmor makes up to five delivery attempts with backoff: immediate, 1 minute, 5 minutes, 15 minutes, and 1 hour. After the fifth failed attempt, the alert is marked as failed delivery. Build your endpoint to respond in under 5 seconds and queue heavy work asynchronously.

### What does the webhook JSON payload look like?

Each request is a POST with a JSON body containing `event_type`, an `alert` object (id, rule_name, status, timestamps), an `event` object describing the change and affected asset, and `links` back to AnomalyArmor. Use `alert.id` to deduplicate on your side. See the [Payload Format](#payload-format) example above for the full shape.

### Can I use a self-signed or HTTP endpoint?

No. Webhook URLs must be HTTPS with a valid certificate. This protects the authentication header and the asset metadata in transit, and it's a requirement for every webhook destination.

## Next Steps

- **[Alert Rules](/alerts/alert-rules)**: Create rules that route to webhooks
- **[API Reference](/api/overview)**: Build integrations with the API

---

<!-- Source: https://docs.anomalyarmor.ai/alerts/destinations/pagerduty -->

# PagerDuty

> Send AnomalyArmor alerts to PagerDuty for incident management

Send AnomalyArmor alerts to PagerDuty to trigger incidents and page on-call engineers. Critical schema changes or freshness violations can automatically create PagerDuty incidents for immediate response.

## Why PagerDuty?

PagerDuty is ideal for alerts that require immediate human intervention:

- **On-call routing**: Automatically page the right person based on schedules
- **Escalation**: Ensure critical alerts don't go unnoticed
- **Incident management**: Track response and resolution
- **Deduplication**: Related alerts are grouped together

## Prerequisites

Before you begin:

- PagerDuty account with admin access (or permission to create integrations)
- AnomalyArmor account with alert configuration permissions
- A PagerDuty service configured for your data infrastructure

## Setup Guide

### Step 1: Create a PagerDuty Integration

In PagerDuty:

1. Go to **Services** and select (or create) a service for data alerts
2. Click the **Integrations** tab
3. Click **Add Integration**
4. Search for **Events API v2** and select it
5. Click **Add**
6. Copy the **Integration Key** (also called Routing Key)

> ℹ️ The Integration Key is a 32-character string that looks like: `a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6`

### Step 2: Add Destination in AnomalyArmor

1. Log in to AnomalyArmor
2. Click **Alerts** in the left sidebar
3. Select the **Destinations** tab
4. Click **Add Destination**
5. Select **PagerDuty**

### Step 3: Configure the Destination

Enter the following:

| Field | Description |
|-------|-------------|
| **Name** | A descriptive name (e.g., "PagerDuty - Data Platform") |
| **Integration Key** | The key copied from PagerDuty |

### Step 4: Test the Connection

Click **Test** to send a test event to PagerDuty.

```
Success! Test event sent to PagerDuty
Dedup Key: anomalyarmor-test-12345
```

Check PagerDuty to confirm the incident was created.

> ⚠️ Remember to resolve the test incident in PagerDuty to avoid confusion.

### Step 5: Save

Click **Create Destination** to complete the setup.

## Alert Format

AnomalyArmor sends events using the PagerDuty Events API v2 format:

| Field | Value |
|-------|-------|
| **Summary** | Alert title with context |
| **Source** | `anomalyarmor` |
| **Severity** | `critical` (all PagerDuty alerts are high priority) |
| **Dedup Key** | Unique identifier to group related alerts |

> ℹ️ Since PagerDuty is reserved for critical alerts, all events sent to PagerDuty use the `critical` severity level. Use Slack or email for lower-priority notifications.

### Custom Details

Each PagerDuty incident includes:

- Rule name that triggered the alert
- Event type (schema_change, freshness_violation, etc.)
- Alert description
- Asset ID
- Link to view in AnomalyArmor
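
For orientation, the fields above map onto a PagerDuty Events API v2 trigger event roughly as follows. The envelope keys (`routing_key`, `event_action`, `dedup_key`, `payload`, `links`) are part of the Events API v2 format. The dedup key derivation, the `custom_details` key names, and the `build_event` helper are assumptions for illustration, not the exact event AnomalyArmor emits.

```python
import hashlib
import json


def build_event(routing_key: str, rule_name: str, event_type: str,
                asset_id: str, description: str, alert_url: str) -> dict:
    """Assemble an Events API v2 trigger event (illustrative field choices)."""
    # Hypothetical dedup scheme: same rule + same asset => same key.
    dedup_source = f"{rule_name}:{asset_id}".encode()
    dedup_key = "anomalyarmor-" + hashlib.sha256(dedup_source).hexdigest()[:16]
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        "dedup_key": dedup_key,
        "payload": {
            "summary": f"{rule_name}: {description}"[:1024],
            "source": "anomalyarmor",
            "severity": "critical",
            "custom_details": {
                "rule_name": rule_name,
                "event_type": event_type,
                "description": description,
                "asset_id": asset_id,
            },
        },
        "links": [{"href": alert_url, "text": "View in AnomalyArmor"}],
    }


event = build_event(
    routing_key="a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6",  # example key from this page
    rule_name="Production Schema Changes",
    event_type="schema_change",
    asset_id="production.public.orders",
    description="Column removed: shipping_status",
    alert_url="https://app.anomalyarmor.ai/alerts/alert_abc123",
)
print(json.dumps(event, indent=2))
```

Because the key in this sketch depends only on the rule and asset, a second firing for the same pair would update the open incident instead of creating a new one.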

## Best Practices

### Use PagerDuty for Critical Alerts Only

> ⚠️ Don't route all alerts to PagerDuty. Reserve it for events that require immediate action, typically production schema changes that could break pipelines.

**Good use cases**:
- Production column removed
- Critical table freshness SLA violated
- Breaking schema changes in production

**Better handled elsewhere**:
- Development database changes (use Slack)
- Informational schema additions (use email)
- Routine freshness warnings (use Slack digest)

### Set Up Proper Escalation

Configure your PagerDuty service with:

1. **Acknowledgement timeout**: Auto-escalate if not acknowledged (e.g., 5 minutes)
2. **Escalation policy**: Include backup responders
3. **On-call schedule**: Ensure 24/7 coverage for critical services

### Combine with Other Destinations

Create alert rules that send to multiple destinations:

**Production Breaking Changes**
- Event: Schema Change
- Scope: production databases
- Conditions: Column removed OR type changed
- Destinations: PagerDuty (immediate paging), Slack #data-incidents (team visibility), Email data-eng-list@company.com (record)

## Troubleshooting

### "Invalid routing key"

**Cause**: The integration key is incorrect or the integration was deleted in PagerDuty.

**Fix**:
1. Go to your PagerDuty service
2. Check the Events API v2 integration still exists
3. Copy a fresh integration key
4. Update the destination in AnomalyArmor

### Incidents not appearing

**Cause**: Service is disabled or in maintenance mode.

**Fix**:
1. Check the PagerDuty service is enabled
2. Verify no maintenance window is active
3. Check the Events API v2 integration is active
4. Use the Test button to verify connectivity

### Duplicate incidents

**Cause**: Multiple alert rules triggering for the same event.

**Fix**:
AnomalyArmor includes a deduplication key with each event. PagerDuty will group alerts with the same dedup key into a single incident. If you're seeing duplicates:

1. Review your alert rules for overlap
2. Consider combining rules or adjusting scopes

### Rate limiting

**Cause**: Too many events sent in a short period.

**Fix**:
1. Review alert rule thresholds
2. Consider using alert grouping/digest for high-volume events
3. Route lower-priority alerts to Slack or email instead

## Security

### Data Sent to PagerDuty

Alert events contain:
- Asset names (database, schema, table names)
- Change types and descriptions
- Timestamps
- Rule information

Alert events **do not** contain:
- Actual data values
- Database credentials
- Connection strings
- Query results

### Revoking Access

To disconnect AnomalyArmor from PagerDuty:

1. In AnomalyArmor: Delete the PagerDuty destination
2. In PagerDuty: Remove the Events API v2 integration from the service

## Common Questions

### Which PagerDuty integration type should I use?

Use **Events API v2** on the target service. In PagerDuty, open the service, go to **Integrations**, click **Add Integration**, select Events API v2, and copy the 32-character Integration Key into AnomalyArmor. Older Events API v1 integrations are not supported.

### Can I set non-critical severity for PagerDuty alerts?

No. Every event AnomalyArmor sends to PagerDuty uses `critical` severity by design, since PagerDuty is reserved for events that require immediate human response. Route lower-priority alerts to [Slack](/alerts/destinations/slack) or [email](/alerts/destinations/email) instead.

### Will PagerDuty deduplicate repeated alerts from AnomalyArmor?

Yes. Each event ships with a deduplication key derived from the rule and affected asset, so PagerDuty groups repeated firings into a single incident rather than paging on-call again. If you still see duplicates, check whether multiple overlapping rules are firing for the same event.

### What happens if PagerDuty isn't acknowledged?

That's controlled by the PagerDuty service's escalation policy, not by AnomalyArmor. Set an acknowledgement timeout (for example 5 minutes) and configure backup responders in PagerDuty so unacknowledged pages escalate automatically.

## Next Steps

- **[Alert Rules](/alerts/alert-rules)**: Create rules that route to PagerDuty
- **[Best Practices](/alerts/best-practices)**: Reduce alert fatigue and page only when necessary

---

<!-- Source: https://docs.anomalyarmor.ai/alerts/destinations/ms-teams -->

# Microsoft Teams

> Send AnomalyArmor alerts to Microsoft Teams channels

Send AnomalyArmor alerts directly to your Microsoft Teams channels. Get notified about schema changes, freshness violations, and other data events where your team collaborates.

## Why Microsoft Teams?

Teams is ideal for organizations using Microsoft 365:

- **Real-time**: Alerts appear instantly in channels
- **Contextual**: Team can discuss and coordinate in threads
- **Actionable**: Click through to AnomalyArmor for details
- **Integrated**: Works with your existing Microsoft ecosystem

## Prerequisites

Before you begin:

- Microsoft Teams account with permission to add connectors (or admin who can approve)
- AnomalyArmor account with alert configuration permissions
- A Teams channel where you want to receive alerts

## Setup Guide

### Step 1: Create an Incoming Webhook in Teams

1. Open Microsoft Teams
2. Navigate to the channel where you want alerts
3. Click the **...** menu next to the channel name
4. Select **Connectors** (or **Manage channel** → **Connectors**)
5. Find **Incoming Webhook** and click **Configure**
6. Give it a name: "AnomalyArmor Alerts"
7. Optionally upload the AnomalyArmor logo
8. Click **Create**
9. **Copy the webhook URL** - you'll need this in AnomalyArmor

> ⚠️ Keep the webhook URL secret. Anyone with this URL can post messages to your channel.

### Step 2: Add Destination in AnomalyArmor

1. Log in to AnomalyArmor
2. Click **Alerts** in the left sidebar
3. Select the **Destinations** tab
4. Click **Add Destination**
5. Select **Microsoft Teams**

### Step 3: Configure the Destination

| Field | Description |
|-------|-------------|
| **Destination Name** | A descriptive name (e.g., "Teams - #data-alerts") |
| **Webhook URL** | Paste the URL copied from Teams |

### Step 4: Test the Connection

Click **Send Test Alert** to verify everything works:

```
Test alert sent successfully
Channel: Data Alerts
Message: "Test alert from AnomalyArmor"
```

Check Teams to confirm the message arrived in your channel.

### Step 5: Save

Click **Save** to complete the setup. Your Teams destination is now ready to use in alert rules.

## Alert Message Format

AnomalyArmor sends Adaptive Cards to Teams with:

- Alert type indicator (schema, freshness, discovery)
- Affected asset details
- Change description
- Timestamp
- Action button to view in AnomalyArmor
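
As a rough picture of what such a message looks like on the wire, the sketch below builds an Adaptive Card wrapped in the message envelope that Teams incoming webhooks accept. The envelope and element types (`TextBlock`, `FactSet`, `Action.OpenUrl`) are standard Adaptive Card constructs. The specific layout and the `build_card` helper are assumptions, not the exact card AnomalyArmor sends.

```python
import json


def build_card(title: str, asset: str, change: str, timestamp: str, url: str) -> dict:
    """Build a Teams webhook message containing one Adaptive Card."""
    return {
        "type": "message",
        "attachments": [
            {
                "contentType": "application/vnd.microsoft.card.adaptive",
                "content": {
                    "$schema": "http://adaptivecards.io/schemas/adaptive-card.json",
                    "type": "AdaptiveCard",
                    "version": "1.2",
                    "body": [
                        {"type": "TextBlock", "text": title,
                         "weight": "Bolder", "wrap": True},
                        {"type": "FactSet", "facts": [
                            {"title": "Asset", "value": asset},
                            {"title": "Change", "value": change},
                            {"title": "Time", "value": timestamp},
                        ]},
                    ],
                    "actions": [
                        {"type": "Action.OpenUrl",
                         "title": "View in AnomalyArmor", "url": url},
                    ],
                },
            }
        ],
    }


card = build_card(
    title="Schema change detected",
    asset="production.public.orders",
    change="Column removed: shipping_status",
    timestamp="2025-12-28 10:30 UTC",
    url="https://app.anomalyarmor.ai/alerts/alert_abc123",
)
print(json.dumps(card, indent=2))
```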

## Best Practices

### Channel Selection

- Create dedicated alert channels (e.g., `Data Alerts`)
- Don't send to busy general channels
- Separate by urgency: breaking changes vs informational

### Webhook Management

- Rotate webhook URLs periodically for security
- Document which webhooks are used where
- Delete unused webhooks from Teams

## Troubleshooting

### "Webhook URL invalid"

**Cause**: The webhook URL is malformed or expired.

**Fix**:
1. Regenerate the webhook in Teams
2. Copy the new URL carefully (it's long)
3. Update the destination in AnomalyArmor

### Messages not appearing

**Cause**: Webhook deleted or channel permissions changed.

**Fix**:
1. Verify the webhook still exists in Teams channel settings
2. Recreate the webhook if needed
3. Update AnomalyArmor with the new URL

### Rate limiting

**Cause**: Too many alerts in a short period.

**Fix**:
1. Review alert rules to reduce volume
2. Consider email for high-volume, low-priority alerts
3. Keep sustained volume below the Teams webhook rate limit (~4 messages/second)

## Security

### Data Sent to Teams

Alert messages contain:
- Asset names (table/column names)
- Change types (added, removed, modified)
- Timestamps

Alert messages **do not** contain:
- Actual data values
- Database credentials
- Connection strings

### Revoking Access

To disconnect:
1. In Teams: Remove the Incoming Webhook connector from the channel
2. In AnomalyArmor: Delete the Teams destination

## Common Questions

### How do I connect AnomalyArmor to a Microsoft Teams channel?

In the target Teams channel, open **Connectors**, configure an **Incoming Webhook** named "AnomalyArmor Alerts", and copy the webhook URL. Paste that URL into a new Microsoft Teams destination in AnomalyArmor and send a test alert to confirm delivery.

### Is the Teams webhook URL sensitive?

Yes. Anyone with the URL can post to the channel, so treat it like a credential. Rotate it periodically, avoid checking it into source control, and remove unused webhooks from the Teams channel settings.

### What's the rate limit for Teams alerts?

Microsoft Teams incoming webhooks accept roughly 4 messages per second. High-volume alert rules can trip that limit, so keep the Teams destination for schema changes and critical events. Route verbose informational events to email or a digest instead.

## Next Steps

- **[Alert Rules](/alerts/alert-rules)**: Create rules that route to Teams
- **[Best Practices](/alerts/best-practices)**: Reduce alert fatigue

---

<!-- Source: https://docs.anomalyarmor.ai/alerts/destinations/linear -->

# Linear

> Create Linear issues automatically from AnomalyArmor alerts

Create Linear issues automatically when AnomalyArmor detects schema changes or data quality issues. This enables your team to track and resolve data infrastructure problems using your existing project management workflow.

## Why Linear?

Linear is ideal for teams that want to track data issues alongside their engineering work:

- **Automatic issue creation**: Schema changes become trackable issues
- **Team assignment**: Route issues to the right team automatically
- **Priority management**: Set priority based on alert severity
- **Integration with workflows**: Issues flow through your existing Linear workflow

## Prerequisites

Before you begin:

- Linear account with API access
- Permission to create API keys in Linear
- A Linear team where issues will be created

## Setup Guide

### Step 1: Create a Linear API Key

In Linear:

1. Click your avatar in the bottom-left corner
2. Select **Settings**
3. Go to **API** in the left sidebar
4. Click **Create new API key**
5. Give it a descriptive name (e.g., "AnomalyArmor Integration")
6. Copy the API key

> ⚠️ Store this API key securely. It won't be shown again after you close the dialog.

### Step 2: Find Your Team ID

In Linear:

1. Go to the team where you want issues created
2. Look at the URL: `https://linear.app/your-workspace/team/ABC/...`
3. The Team ID is the short code after `/team/` (e.g., `ABC` or `DATA`)

Alternatively, use the Linear API or GraphQL explorer to list your teams.
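
As a rough sketch of the API route, the snippet below builds a GraphQL request that lists teams and pulls each team's short key out of a response. It assumes Linear's public GraphQL endpoint at `https://api.linear.app/graphql` and that a personal API key is sent as-is in the `Authorization` header; check both against Linear's API documentation. `build_teams_request` and `team_keys` are illustrative helper names, and the sample response is made up.

```python
import json
import urllib.request

# Assumed endpoint; confirm against Linear's API documentation.
LINEAR_GRAPHQL_URL = "https://api.linear.app/graphql"
TEAMS_QUERY = "query { teams { nodes { id key name } } }"


def build_teams_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that lists teams."""
    body = json.dumps({"query": TEAMS_QUERY}).encode("utf-8")
    return urllib.request.Request(
        LINEAR_GRAPHQL_URL,
        data=body,
        headers={"Content-Type": "application/json", "Authorization": api_key},
        method="POST",
    )


def team_keys(response_json: dict) -> dict:
    """Map each team's short key (e.g. "DATA") to its display name."""
    nodes = response_json["data"]["teams"]["nodes"]
    return {team["key"]: team["name"] for team in nodes}


# Made-up response body, shaped like the query above.
sample = {
    "data": {
        "teams": {
            "nodes": [
                {"id": "hypothetical-id", "key": "DATA", "name": "Data Platform"}
            ]
        }
    }
}
print(team_keys(sample))
```

To run it against a real workspace, pass the request to `urllib.request.urlopen` and feed the decoded JSON body to `team_keys`.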

### Step 3: Add Destination in AnomalyArmor

1. Log in to AnomalyArmor
2. Click **Alerts** in the left sidebar
3. Select the **Destinations** tab
4. Click **Add Destination**
5. Select **Linear**

### Step 4: Configure the Destination

Enter the following:

| Field | Description | Required |
|-------|-------------|----------|
| **Name** | A descriptive name (e.g., "Linear - Data Team") | Yes |
| **API Key** | The API key from Step 1 | Yes |
| **Team ID** | The team identifier from Step 2 | Yes |
| **Project ID** | Target project for issues | No |
| **Label IDs** | Comma-separated label IDs to auto-apply | No |
| **Priority** | Default issue priority (0-4) | No |

### Step 5: Test the Connection

Click **Test** to create a test issue in Linear.

```
Success! Test issue created in Linear
Issue: DATA-123
```

Check Linear to confirm the issue was created in the correct team.

> ℹ️ Remember to close or delete the test issue to keep your backlog clean.

### Step 6: Save

Click **Create Destination** to complete the setup.

## Issue Format

AnomalyArmor creates Linear issues with:

| Field | Value |
|-------|-------|
| **Title** | Alert title with context |
| **Description** | Full alert details including asset and change info |
| **Team** | Your configured team |
| **Priority** | Based on alert type or your default setting |
| **Labels** | Any labels you configured |

### Priority Mapping

Linear uses a 0-4 priority scale:

| Priority | Meaning |
|----------|---------|
| 0 | No priority |
| 1 | Urgent |
| 2 | High |
| 3 | Medium |
| 4 | Low |
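
If you choose the default priority from your own severity levels, a small lookup keeps the mapping explicit. This is a minimal sketch: the severity names (`critical`, `high`, `medium`, `low`) are hypothetical and not AnomalyArmor field values. Only the numeric scale comes from the table above.

```python
# Hypothetical severity names; the numbers follow Linear's 0-4 scale above.
SEVERITY_TO_LINEAR_PRIORITY = {
    "critical": 1,  # Urgent
    "high": 2,      # High
    "medium": 3,    # Medium
    "low": 4,       # Low
}


def linear_priority(severity: str, default: int = 0) -> int:
    """Return the Linear priority for a severity, or `default` if unknown."""
    if not 0 <= default <= 4:
        raise ValueError("Linear priorities range from 0 to 4")
    return SEVERITY_TO_LINEAR_PRIORITY.get(severity.lower(), default)


print(linear_priority("critical"))
print(linear_priority("informational"))
```

Unknown severities fall back to 0 ("No priority"), so nothing is escalated by accident.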

## Best Practices

### Create Focused Alert Rules

Create specific alert rules for Linear to avoid issue overload:

**Production Schema Issues**
- Event: Schema Change
- Scope: production databases only
- Conditions: Column removed OR type changed
- Destination: Linear - Data Team

### Use Labels for Categorization

Configure label IDs to automatically categorize issues:

- `data-quality` - for freshness violations
- `schema-change` - for schema drift alerts
- `urgent` - for critical production issues

### Combine with Real-Time Notifications

Use Linear for tracking alongside real-time notifications:

**Critical Data Issues**
- Destinations: Slack (immediate notification), Linear (issue tracking)

## Troubleshooting

### "Invalid API key"

**Cause**: The API key is incorrect or revoked.

**Fix**:
1. Generate a new API key in Linear settings
2. Update the destination in AnomalyArmor

### "Team not found"

**Cause**: Invalid team ID or API key lacks access to the team.

**Fix**:
1. Verify the team ID in the Linear URL
2. Ensure the API key has access to the workspace containing the team

### Issues not appearing

**Cause**: API key permissions or team configuration.

**Fix**:
1. Verify the API key has write access
2. Check the team exists and is active
3. Use the Test button to verify connectivity

## Security

### Data Sent to Linear

Issue content includes:
- Asset names (database, schema, table names)
- Change types and descriptions
- Timestamps
- Alert rule information

Issue content **does not** include:
- Actual data values
- Database credentials
- Connection strings

### Revoking Access

To disconnect AnomalyArmor from Linear:

1. In AnomalyArmor: Delete the Linear destination
2. In Linear: Revoke the API key in Settings > API

## Common Questions

### Where do I find my Linear Team ID?

Open your team in Linear and look at the URL: `https://linear.app/your-workspace/team/ABC/...`. The short code after `/team/` (for example `ABC` or `DATA`) is the Team ID. You can also list teams via Linear's GraphQL API.

### Can I auto-apply labels and priority to Linear issues from AnomalyArmor?

Yes. When configuring the destination, add comma-separated **Label IDs** to auto-tag every created issue and set a default **Priority** (0-4, where 1 is Urgent and 4 is Low). Labels like `data-quality` or `schema-change` make it easy to filter Linear by alert type.

### Should I use Linear as my only destination for production alerts?

Not on its own. Linear is good for tracking but not real-time notification. Pair it with Slack or PagerDuty so the team sees the alert immediately, then uses Linear to track resolution.

## Next Steps

- **[Alert Rules](/alerts/alert-rules)**: Create rules that route to Linear
- **[Slack](/alerts/destinations/slack)**: Add real-time notifications alongside Linear tracking

---

<!-- Source: https://docs.anomalyarmor.ai/alerts/destinations/incident-io -->

# incident.io

> Send AnomalyArmor alerts to incident.io for incident management

Send AnomalyArmor alerts to incident.io to automatically create incidents when schema drift or data quality issues are detected. Critical changes can trigger incidents for immediate response from your on-call team.

## Why incident.io?

incident.io is ideal for teams that want structured incident management:

- **Incident lifecycle**: Track incidents from detection to resolution
- **Automatic creation**: Alerts create incidents automatically
- **Slack integration**: Automatically create incident channels
- **Post-mortems**: Built-in tooling for learning from incidents

## Prerequisites

Before you begin:

- incident.io account with API access
- AnomalyArmor account with alert configuration permissions
- Permission to create API keys in incident.io

## Setup Guide

### Step 1: Create an incident.io API Key

In incident.io:

1. Go to **Settings** → **API Keys**
2. Click **Create API key**
3. Give it a descriptive name (e.g., "AnomalyArmor Integration")
4. Ensure it has the `incident.write` permission
5. Click **Create**
6. Copy the API key (it won't be shown again)

> ℹ️ API keys start with `inc_` followed by a long alphanumeric string.

### Step 2: Get Your Closed Status ID (Optional)

To enable automatic incident closing when alerts are resolved in AnomalyArmor:

1. Go to **Settings** → **Incident Lifecycle** in incident.io
2. Click on a status with category "Closed" (e.g., "Closed", "Resolved")
3. Copy the UUID from the URL (e.g., `01FCNDV6P870EA6S7TK1DSYDG0`)

> ℹ️ When configured, resolving or dismissing an alert in AnomalyArmor will automatically close the corresponding incident in incident.io.

### Step 3: Add Destination in AnomalyArmor

1. Log in to AnomalyArmor
2. Click **Alerts** in the left sidebar
3. Select the **Destinations** tab
4. Click **Add Destination**
5. Select **incident.io**

### Step 4: Configure the Destination

Enter the following:

| Field | Description |
|-------|-------------|
| **Name** | A descriptive name (e.g., "incident.io - Data Platform") |
| **API Key** | The API key copied from incident.io |
| **Closed Status ID** | (Optional) UUID of the status to set when closing incidents automatically |

### Step 5: Test the Connection

Click **Test** to create a test incident in incident.io.

```
Success! Incident created: INC-123
```

Check incident.io to confirm the incident was created.

> ⚠️ Remember to resolve the test incident in incident.io to keep your incident list clean.

### Step 6: Save

Click **Create Destination** to complete the setup.

## Alert Format

AnomalyArmor creates incidents using the incident.io API v2:

| Field | Value |
|-------|-------|
| **Name** | `[AnomalyArmor] Alert title` |
| **Summary** | Alert description with AnomalyArmor context |
| **Mode** | `standard` (not retrospective) |
| **Visibility** | `public` |

### Incident Summary

Each incident includes:

- Alert description
- Source: AnomalyArmor
- Rule name that triggered the alert
- Event type (schema_change, freshness_violation, etc.)
- Asset ID

### Idempotency

AnomalyArmor includes an idempotency key with each incident request. This prevents duplicate incidents if the same alert is processed multiple times.
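
One common way to build such a key is to hash the fields that identify an alert, so reprocessing the same alert always yields the same key. The sketch below illustrates the idea only. How AnomalyArmor actually derives its key is not documented here, and the input fields (`rule_id`, `asset_id`, `event_type`, `detected_at`) are assumptions. The payload keys mirror the Alert Format table; check them against the incident.io API reference before relying on them.

```python
import hashlib


def idempotency_key(rule_id: str, asset_id: str, event_type: str, detected_at: str) -> str:
    """Derive a deterministic key from the fields that identify one alert."""
    raw = "|".join([rule_id, asset_id, event_type, detected_at])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def incident_payload(title: str, summary: str, key: str) -> dict:
    """Assemble a request body shaped like the Alert Format table."""
    return {
        "idempotency_key": key,
        "name": f"[AnomalyArmor] {title}",
        "summary": summary,
        "mode": "standard",
        "visibility": "public",
    }


# Hypothetical identifiers, for illustration only.
key = idempotency_key("rule-1", "asset-42", "schema_change", "2024-01-01T00:00:00Z")
retry_key = idempotency_key("rule-1", "asset-42", "schema_change", "2024-01-01T00:00:00Z")
print(key == retry_key)  # True: a retry of the same alert reuses the key
print(incident_payload("Column removed", "orders.status was dropped", key)["name"])
```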

### Incident Lifecycle

AnomalyArmor supports full incident lifecycle management:

| AnomalyArmor Action | incident.io Effect |
|---------------------|-------------------|
| Alert triggered | New incident created |
| Alert resolved | Incident status updated to configured closed status |
| Alert dismissed | Incident status updated to configured closed status |

> ℹ️ Automatic incident closing requires the **Closed Status ID** to be configured. If not set, incidents must be closed manually in incident.io.

## Best Practices

### Reserve for Critical Alerts

> ⚠️ Don't route all alerts to incident.io. Reserve it for events that require a coordinated response, typically production schema changes that could break pipelines or critical freshness SLA violations.

**Good use cases**:
- Production column removed or renamed
- Critical table freshness SLA violated
- Breaking schema changes in production databases

**Better handled elsewhere**:
- Development database changes (use Slack)
- Informational schema additions (use email)
- Routine freshness warnings (use Slack digest)

### Combine with Other Destinations

Create alert rules that send to multiple destinations:

**Production Breaking Changes**
- Event: Schema Change
- Scope: production databases
- Conditions: Column removed OR type changed
- Destinations: incident.io (incident creation), Slack #data-incidents (team visibility), Email data-eng-list@company.com (record)

## Troubleshooting

### "No API key configured"

**Cause**: The API key field is empty.

**Fix**:
1. Edit the destination in AnomalyArmor
2. Enter your incident.io API key
3. Save the destination

### "HTTP 401: Unauthorized"

**Cause**: The API key is invalid or expired.

**Fix**:
1. Go to incident.io Settings → API Keys
2. Verify the key exists and hasn't been revoked
3. Create a new API key if needed
4. Update the destination in AnomalyArmor

### "HTTP 403: Forbidden"

**Cause**: The API key lacks required permissions.

**Fix**:
1. Go to incident.io Settings → API Keys
2. Verify the key has `incident.write` permission
3. Create a new key with correct permissions if needed

### Incidents not appearing

**Cause**: Rate limiting or API issues.

**Fix**:
1. Check incident.io status page for outages
2. Use the Test button to verify connectivity
3. Check the AnomalyArmor alert history for delivery errors

> 💡 Set incident severity manually in incident.io based on your organization's criteria, or use incident.io's built-in rules to auto-assign severity.

## Security

### Data Sent to incident.io

Incident data contains:
- Asset names (database, schema, table names)
- Change types and descriptions
- Timestamps
- Rule information

Incident data **does not** contain:
- Actual data values
- Database credentials
- Connection strings
- Query results

### Revoking Access

To disconnect AnomalyArmor from incident.io:

1. In AnomalyArmor: Delete the incident.io destination
2. In incident.io: Revoke the API key in Settings → API Keys

## Common Questions

### What permissions does the incident.io API key need?

The key must have the `incident.write` permission to create incidents. If you want automatic incident closing when alerts resolve, the same key handles status updates. API keys start with `inc_` and are created under **Settings > API Keys** in incident.io.

### Will resolving an AnomalyArmor alert close the incident.io incident?

Only if you configure the **Closed Status ID** on the destination. Without it, incidents stay open in incident.io even after the alert is resolved or dismissed in AnomalyArmor. Find the status UUID under **Settings > Incident Lifecycle** in incident.io.

### Does AnomalyArmor create duplicate incidents if the same alert fires twice?

No. Every request includes an idempotency key, so retries and repeated processing of the same alert collapse into a single incident in incident.io.

### Should I route every alert to incident.io?

No. Reserve it for events that justify a coordinated response such as production breaking schema changes or critical freshness SLA violations. Route dev changes, informational additions, and routine freshness warnings to Slack or email instead. See [Best Practices](/alerts/best-practices).

## Next Steps

- **[Alert Rules](/alerts/alert-rules)**: Create rules that route to incident.io
- **[Best Practices](/alerts/best-practices)**: Reduce alert fatigue and create incidents only when necessary

---

<!-- Source: https://docs.anomalyarmor.ai/alerts/schedules -->

# Operating Schedules

> Control when alert rules are active with time-based schedules

Operating schedules let you define business hours or monitoring windows so alerts only fire when your team is available to respond. Instead of waking up on-call engineers at 3 AM for non-critical issues, schedules suppress alerts outside defined time windows.

*A typical business-hours operating schedule*

## How Schedules Work

Schedules sit between rule evaluation and alert delivery:

1. An event matches an alert rule
2. AnomalyArmor checks whether the current time falls within the rule's operating schedule
3. If inside the schedule window, the alert is delivered normally
4. If outside the schedule window, the alert is suppressed with reason "outside operating hours"

> ℹ️ Suppressed alerts still appear in **Alerts > History** for auditing. No data is lost. You can always review what happened outside your schedule windows.

## Creating a Schedule

## Schedule Rules

Each schedule contains one or more rules that define active time windows. Rules are combined with OR logic, so an alert fires if the current time falls within any of the schedule's rules.

### Rule Structure

Each rule defines three things:

| Field | Format | Example |
|-------|--------|---------|
| **Days** | 1 = Monday through 7 = Sunday | 1-5 (weekdays) |
| **Start time** | HH:MM (24-hour) | 09:00 |
| **End time** | HH:MM (24-hour) | 18:00 |

### Combining Rules

Use multiple rules to cover different day/time combinations:

| Rule | Days | Start | End | Coverage |
|------|------|-------|-----|----------|
| 1 | Mon-Fri (1-5) | 09:00 | 18:00 | Weekday business hours |
| 2 | Sat (6) | 10:00 | 14:00 | Saturday morning shift |

This schedule activates alerts Monday through Friday from 9 AM to 6 PM, plus Saturday from 10 AM to 2 PM.

### Overnight Schedules

If the start time is later than the end time, the window wraps to the next day:

| Rule | Days | Start | End | Coverage |
|------|------|-------|-----|----------|
| 1 | Mon-Thu (1-4) | 22:00 | 06:00 | Overnight batch window |

This covers 10 PM to 6 AM the following morning, Monday night through Friday morning.
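
The rule semantics described above (OR across rules, with overnight windows wrapping into the next day) can be sketched as follows. This illustrates the logic and is not AnomalyArmor's implementation. `ScheduleRule`, `rule_is_active`, and `schedule_is_active` are hypothetical names, the end time is treated as exclusive by assumption, and `local_now` is assumed to be already expressed in the schedule's timezone.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class ScheduleRule:
    days: frozenset  # ISO weekdays: 1 = Monday through 7 = Sunday
    start: time
    end: time


def rule_is_active(rule: ScheduleRule, local_now: datetime) -> bool:
    """Check one rule against a time already in the schedule's timezone."""
    day = local_now.isoweekday()
    now = local_now.time()
    if rule.start <= rule.end:
        # Same-day window.
        return day in rule.days and rule.start <= now < rule.end
    # Overnight window: it starts on a listed day and ends the next morning.
    previous_day = 7 if day == 1 else day - 1
    started_today = day in rule.days and now >= rule.start
    started_yesterday = previous_day in rule.days and now < rule.end
    return started_today or started_yesterday


def schedule_is_active(rules, local_now: datetime) -> bool:
    """Rules combine with OR: any matching rule makes the schedule active."""
    return any(rule_is_active(rule, local_now) for rule in rules)


overnight = [ScheduleRule(frozenset({1, 2, 3, 4}), time(22, 0), time(6, 0))]
# 2024-01-05 is a Friday; 03:00 belongs to the window that opened Thursday night.
print(schedule_is_active(overnight, datetime(2024, 1, 5, 3, 0)))   # True
print(schedule_is_active(overnight, datetime(2024, 1, 5, 23, 0)))  # False
```

The overnight branch checks the previous day as well, which is why Friday 03:00 is covered by a Mon-Thu rule while Friday 23:00 is not.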

## Assigning to Rules

Schedules take effect when assigned to an alert rule.

### During Rule Creation

When creating a new alert rule, select a schedule from the **Operating Schedule** dropdown in the rule builder. Leave it blank to keep the rule always active.

### On an Existing Rule

1. Go to **Alerts > Rules**
2. Click the rule name to edit
3. Select or change the **Operating Schedule**
4. Click **Save**

> ℹ️ Removing a schedule from a rule makes it always active. Alerts are never lost. They are suppressed and logged when outside the schedule window.

### Constraints

- A rule can only have one schedule at a time
- Rules without a schedule are always active
- Multiple rules can share the same schedule
- Changing a schedule affects all rules that use it

## Timezone Handling

Schedules evaluate against the current time in the configured timezone:

- All times use IANA timezone identifiers (`America/New_York`, `Europe/London`, `Asia/Tokyo`)
- Daylight Saving Time transitions are handled automatically
- When clocks spring forward, a window that falls entirely inside the skipped hour (for example, 2 AM to 3 AM) does not occur on the transition day
- When clocks fall back, the schedule covers the repeated hour

> ⚠️ Avoid using UTC offsets like `UTC-5` for schedules in regions that observe DST. Use the region-based identifier (e.g., `America/New_York`) so DST is handled correctly.
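
To see why a fixed offset drifts, compare how one UTC instant maps to local time under an IANA zone and under a fixed `UTC-5` offset. This sketch uses Python's standard `zoneinfo` module, which needs a timezone database on the host. The dates and the `local_time` helper are chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

new_york = ZoneInfo("America/New_York")
fixed_minus_5 = timezone(timedelta(hours=-5))  # stand-in for "UTC-5"


def local_time(instant: datetime, tz):
    """Wall-clock time of a UTC instant in the given timezone."""
    return instant.astimezone(tz).time()


winter = datetime(2024, 1, 15, 14, 30, tzinfo=timezone.utc)
summer = datetime(2024, 7, 15, 13, 30, tzinfo=timezone.utc)

# In winter both agree: 09:30 local, inside a 09:00-18:00 window.
print(local_time(winter, new_york), local_time(winter, fixed_minus_5))
# In summer New York is on daylight time: 09:30 there, but 08:30 at fixed UTC-5.
print(local_time(summer, new_york), local_time(summer, fixed_minus_5))
```

With the fixed offset, a 09:00 start would open an hour late by the local clock for the whole summer.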

## Common Patterns

### Business Hours Only

Standard weekday coverage for a single team:

| Rule | Days | Start | End |
|------|------|-------|-----|
| 1 | Mon-Fri (1-5) | 09:00 | 18:00 |

**Best for**: Non-critical alerts that can wait until the next business day.

### 24/7 Weekdays with Reduced Weekends

Full weekday coverage with limited weekend hours:

| Rule | Days | Start | End |
|------|------|-------|-----|
| 1 | Mon-Fri (1-5) | 00:00 | 23:59 |
| 2 | Sat-Sun (6-7) | 09:00 | 17:00 |

**Best for**: Teams with weekend on-call during daytime hours only.

### Follow the Sun

Multiple rules covering regional business hours for global teams:

| Rule | Days | Start | End | Region |
|------|------|-------|-----|--------|
| 1 | Mon-Fri (1-5) | 09:00 | 18:00 | US East (America/New_York) |
| 2 | Mon-Fri (1-5) | 09:00 | 18:00 | Europe (Europe/London) |
| 3 | Mon-Fri (1-5) | 09:00 | 18:00 | APAC (Asia/Tokyo) |

> ℹ️ Follow-the-sun requires creating separate schedules (one per timezone) and separate alert rules for each region's destinations.

## Common Questions

### How do I stop getting paged outside business hours?

Create a schedule with Mon-Fri, 09:00 to 18:00 in your team's timezone and assign it to the rule in the **Operating Schedule** dropdown. Events that fire outside the window are suppressed and logged with reason "outside operating hours". For company-wide quiet periods, use [blackout windows](/alerts/blackouts) instead.

### Can one schedule cover overnight hours?

Yes. Set the start time later than the end time and the window wraps to the next day. For example, Mon-Thu with start 22:00 and end 06:00 covers each overnight period from Monday night through Friday morning.

### Does an operating schedule handle daylight saving time?

Yes, as long as you use an IANA identifier like `America/New_York` rather than a fixed UTC offset. DST transitions are handled automatically: spring-forward skips the missing hour, and fall-back covers the repeated hour. Fixed offsets like `UTC-5` will drift when DST changes.

### Can multiple rules share the same schedule?

Yes. One schedule can be assigned to any number of rules, and editing the schedule affects all of them. A rule can only have one schedule at a time, and rules with no schedule are always active.

### How do I set up follow-the-sun alerting across regions?

Create one schedule per region using the region's IANA timezone (e.g., `America/New_York`, `Europe/London`, `Asia/Tokyo`) and one alert rule per region routing to that region's destinations. There is no single "any region" schedule. The per-rule assignment is what makes the routing work.

## Next Steps

- **[Create Alert Rules](/alerts/alert-rules)**: Assign schedules to your alert rules
- **[Best Practices](/alerts/best-practices)**: Reduce alert fatigue with effective scheduling

---

<!-- Source: https://docs.anomalyarmor.ai/alerts/blackouts -->

# Blackout Windows

> Suppress all alerts during planned maintenance or holidays

Blackout windows let you temporarily suppress all alert notifications company-wide. Use them during planned maintenance, deployments, holidays, or any period when alerts would create unnecessary noise.

## How Blackouts Work

- Blackouts suppress **all** alerts for the entire company during the window
- Unlike [operating schedules](/alerts/schedules) (which are per-rule), blackouts are global
- Suppressed alerts are still recorded in the alert log with reason "blackout period"
- Blackouts can be one-time or yearly recurring

## Creating a Blackout

## Recurrence Options

| Type | Behavior | Use Case |
|------|----------|----------|
| None | Applies once, then expires | Deployment windows, one-off maintenance |
| Yearly | Repeats same dates each year | Company holidays, annual freeze periods |

> ℹ️ Yearly recurring blackouts match the month and day, adjusting for timezone. A blackout from Dec 24-26 will repeat every year on those dates.
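
Month-and-day matching can be sketched as below. This is a simplified illustration, not the product's logic: it works on whole dates, ignores time of day and timezone adjustment, and treats both endpoints as inclusive. `in_yearly_blackout` is a hypothetical helper.

```python
from datetime import date


def in_yearly_blackout(today: date, start: tuple, end: tuple) -> bool:
    """Return True if today's (month, day) falls in the recurring range.

    `start` and `end` are (month, day) tuples, both inclusive.
    """
    current = (today.month, today.day)
    if start <= end:
        return start <= current <= end
    # The range wraps across the new year, e.g. Dec 30 through Jan 2.
    return current >= start or current <= end


print(in_yearly_blackout(date(2025, 12, 25), (12, 24), (12, 26)))  # True
print(in_yearly_blackout(date(2026, 12, 27), (12, 24), (12, 26)))  # False
```

Because only the month and day are compared, the same definition matches in every year.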

## Managing Blackouts

- Active blackouts suppress alerts immediately when the window starts
- Toggle **is_active** to temporarily disable a blackout without deleting it
- Past one-time blackouts remain in the list for audit purposes but have no effect

> 💡 Create recurring yearly blackouts for predictable events like company holidays. Use one-time blackouts for ad-hoc maintenance windows.

## Blackouts vs Operating Schedules

| Feature | Operating Schedules | Blackout Windows |
|---------|-------------------|------------------|
| Scope | Per-rule | Company-wide |
| Timing | Recurring weekly pattern | Specific date ranges |
| Use case | Business hours | Maintenance, holidays |
| Configuration | Assigned to individual rules | Applies globally |

> ⚠️ Blackouts override operating schedules. If an event falls within both an active schedule and an active blackout, the blackout takes precedence and the alert is suppressed.
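
The precedence between the two mechanisms can be written as a small decision function. This is a sketch under stated assumptions: `delivery_decision` is a hypothetical name, and `in_schedule` is `None` for rules with no schedule, which are always active. The suppression reasons are the ones quoted in these docs.

```python
from typing import Optional


def delivery_decision(in_blackout: bool, in_schedule: Optional[bool]) -> str:
    """Decide what happens to an alert that matched a rule."""
    if in_blackout:
        # Blackouts are company-wide and take precedence over schedules.
        return "suppressed: blackout period"
    if in_schedule is False:
        return "suppressed: outside operating hours"
    # Inside the schedule window, or the rule has no schedule at all.
    return "delivered"


print(delivery_decision(in_blackout=True, in_schedule=True))
print(delivery_decision(in_blackout=False, in_schedule=False))
print(delivery_decision(in_blackout=False, in_schedule=None))
```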

## Common Questions

### How do I silence alerts during a planned deployment?

Create a one-time blackout in **Alerts > Blackouts** with the start and end times of your deployment window. The blackout suppresses all alerts company-wide until it ends. Suppressed events are still logged in **Alerts > History** with reason "blackout period".

### What's the difference between a blackout and an operating schedule?

Blackouts are company-wide date ranges, typically used for maintenance windows and holiday freezes. [Operating schedules](/alerts/schedules) are per-rule recurring weekly patterns, typically used for business-hours-only alerting. If both apply, the blackout wins and the alert is suppressed.

### Can blackouts repeat every year for holidays?

Yes. Set **Recurrence** to **Yearly** when creating the blackout and it will repeat on the same month and day each year, adjusting for timezone. This is the recommended pattern for company holiday code freezes.

### Do suppressed alerts still show up anywhere?

Yes. All blackout-suppressed events are written to **Alerts > History** with the reason "blackout period", so you can audit what happened during the window without receiving the notifications in real time.

## Next Steps

- **[Alert Rules](/alerts/alert-rules)**: Configure per-rule schedules and conditions
- **[Best Practices](/alerts/best-practices)**: Reduce alert fatigue across your team

---

<!-- Source: https://docs.anomalyarmor.ai/alerts/best-practices -->

# Alert Best Practices

> Reduce alert fatigue and improve response times with effective alerting strategies

Effective alerting is about balance: too few alerts and you miss issues; too many and you ignore them all. This guide helps you build an alerting strategy that keeps you informed without overwhelming your team.

## The Alert Fatigue Problem

Alert fatigue happens when teams receive so many notifications that they stop paying attention to them.

**The goal**: Every alert should be actionable and worth investigating.

## Core Principles

### 1. Start Narrow, Expand Carefully

Don't monitor everything at once:

1. **Week 1**: Monitor 5 critical production tables
2. **Week 2**: Add freshness monitoring to those tables
3. **Week 3**: Expand to 10 more important tables
4. **Week 4**: Review alert history, tune thresholds
5. Continue expanding gradually

### 2. Every Alert Should Be Actionable

Before creating an alert, ask:
- What action should someone take when this fires?
- Is immediate action required, or can it wait?
- Who is the right person to respond?

If you can't answer these questions, the alert may not be useful.

### 3. Match Urgency to Destination

| Urgency | Destination | When to Use |
|---------|-------------|-------------|
| **Immediate** | PagerDuty | On-call response needed now |
| **Soon** | Slack | Team should see within hours |
| **Eventually** | Email | Can be reviewed daily/weekly |

## Event-Based Routing

Route different event types based on impact severity:

### Recommended Setup

| Alert Type | Event | Trigger scope | Destination |
|------------|-------|---------------|-------------|
| Production breaking changes | Schema Change | Breaking only | PagerDuty + Slack |
| Production additive changes | Schema Change | Non-breaking only | Slack (low urgency) |
| Gold-table change freeze | Schema Change | Specific types (`COLUMN_REMOVED`, `PRIMARY_KEY_REMOVED`, ...) | PagerDuty |
| Freshness violations | Freshness Violation | SLA breached | Slack |
| Discovery failures | Discovery Failed | Any failure | Slack + Email |
| Dev/staging changes | Schema Change | Breaking only | Email |
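
Read as code, the routing table is a lookup from environment, event type, and breaking status to a list of destinations. The function below is an illustrative sketch, not an AnomalyArmor API. The `discovery_failed` identifier and the destination names are assumptions, and the gold-table freeze row is left out for brevity.

```python
def route(environment: str, event: str, breaking: bool = False) -> list:
    """Return destination names for an event, following the table above."""
    if event == "schema_change":
        if environment == "production":
            return ["pagerduty", "slack"] if breaking else ["slack"]
        # Dev/staging: only breaking changes are worth an email.
        return ["email"] if breaking else []
    if event == "freshness_violation":
        return ["slack"]
    if event == "discovery_failed":  # hypothetical event identifier
        return ["slack", "email"]
    return []


print(route("production", "schema_change", breaking=True))
print(route("staging", "schema_change", breaking=False))
```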

## Environment Separation

Monitor different environments differently:

### Production

**Rules:**
- All schema changes → Slack + PagerDuty (for breaking)
- All freshness violations → Slack
- Discovery failures → Slack + Email

**Schedule:** Hourly discovery | **Threshold:** Strict SLAs

### Staging

**Rules:**
- Breaking changes only → Slack
- Freshness (critical tables only) → Slack

**Schedule:** Every 6 hours | **Threshold:** Lenient SLAs

### Development

**Rules:**
- None or weekly digest only

**Schedule:** Daily | **Threshold:** Very lenient or disabled

## Threshold Tuning

### Start Lenient

If your ETL runs hourly, don't set a 30-minute SLA:

| Pattern | Starting SLA | After Tuning |
|---------|--------------|--------------|
| 15 min updates | 45 min | 30 min |
| Hourly updates | 3 hours | 2 hours |
| Daily updates | 36 hours | 24 hours |

### Use Warning Thresholds

Two-stage alerts reduce surprise violations:

**orders table freshness:**
- **Expected**: Updated hourly
- **Warning**: After 90 minutes (alert to Slack)
- **Violation**: After 2 hours (alert to PagerDuty)

Warnings give you time to investigate before escalation.
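
The two-stage check can be sketched as a function of how long ago the table was last updated. `freshness_status` is a hypothetical helper; the thresholds in the usage lines come from the `orders` example above.

```python
from datetime import timedelta


def freshness_status(age: timedelta, warning_after: timedelta, violation_after: timedelta) -> str:
    """Classify a table by the time elapsed since its last update."""
    if warning_after >= violation_after:
        raise ValueError("the warning threshold must come before the violation threshold")
    if age >= violation_after:
        return "violation"
    if age >= warning_after:
        return "warning"
    return "ok"


warning = timedelta(minutes=90)
violation = timedelta(hours=2)
print(freshness_status(timedelta(minutes=60), warning, violation))   # ok
print(freshness_status(timedelta(minutes=100), warning, violation))  # warning
print(freshness_status(timedelta(minutes=130), warning, violation))  # violation
```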

### Review and Tighten

After 2-4 weeks:

1. Check alert history
2. Identify alerts that fired but weren't actionable
3. Tighten thresholds that never trigger
4. Loosen thresholds that trigger too often

## Scope Filtering

### Include Only What Matters

Filter rules to relevant assets:

**Rule: Production Revenue Freshness**
- **Data source**: production-postgres
- **Schema**: public
- **Assets**: `orders`, `payments`, `revenue_*`, `transaction_*`

### Exclude Noise

Remove assets that don't need monitoring:

**Exclusions:**
- `*_temp` (temporary tables)
- `*_backup` (backup copies)
- `*_old` (deprecated tables)
- `pg_temp_*` (PostgreSQL temp)
- `test_*` (test tables)
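
To preview which tables a set of include and exclude patterns would cover, you can approximate the matching with shell-style globs. This sketch assumes the patterns behave like Python's `fnmatch` globs and that exclusions win over inclusions; AnomalyArmor's exact matching rules may differ. `is_monitored` is a hypothetical helper.

```python
from fnmatch import fnmatchcase

INCLUDE = ["orders", "payments", "revenue_*", "transaction_*"]
EXCLUDE = ["*_temp", "*_backup", "*_old", "pg_temp_*", "test_*"]


def is_monitored(table: str, include=INCLUDE, exclude=EXCLUDE) -> bool:
    """A table is in scope if it matches an include and no exclude pattern."""
    if any(fnmatchcase(table, pattern) for pattern in exclude):
        return False
    return any(fnmatchcase(table, pattern) for pattern in include)


for name in ["orders", "revenue_daily", "revenue_daily_backup", "users"]:
    print(name, is_monitored(name))
```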

## Alert Aggregation

Avoid alert storms by grouping related alerts:

### Same Asset, Multiple Changes

**Instead of:**
- Column added: new_field_1
- Column added: new_field_2
- Column added: new_field_3
- Column type changed: status

**AnomalyArmor groups:**
- **Schema Change: 4 changes detected**
  - 3 columns added
  - 1 column type changed
  - View details →
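
The grouping shown above amounts to counting changes by type for one asset. A minimal sketch, assuming each change arrives as a `(change_type, column_name)` pair; the change-type identifiers are hypothetical:

```python
from collections import Counter

# Hypothetical change-type identifiers mapped to (singular, plural) labels.
LABELS = {
    "column_added": ("column added", "columns added"),
    "column_type_changed": ("column type changed", "column types changed"),
}


def summarize(changes):
    """Collapse (change_type, column_name) pairs into one grouped summary."""
    counts = Counter(change_type for change_type, _column in changes)
    lines = [f"Schema Change: {len(changes)} changes detected"]
    for change_type, count in counts.items():
        singular, plural = LABELS[change_type]
        lines.append(f"  - {count} {singular if count == 1 else plural}")
    return lines


changes = [
    ("column_added", "new_field_1"),
    ("column_added", "new_field_2"),
    ("column_added", "new_field_3"),
    ("column_type_changed", "status"),
]
print("\n".join(summarize(changes)))
```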

### Deduplication

The same change won't re-alert until resolved or a cooldown period passes.
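
A cooldown-based deduplicator can be sketched in a few lines. The class below is illustrative only: the key format and the cooldown length are assumptions, and the product's actual cooldown is not stated here.

```python
from datetime import datetime, timedelta


class Deduplicator:
    """Suppress repeats of the same change until resolved or cooled down."""

    def __init__(self, cooldown: timedelta):
        self.cooldown = cooldown
        self._last_alerted = {}

    def should_alert(self, key: str, now: datetime) -> bool:
        last = self._last_alerted.get(key)
        if last is not None and now - last < self.cooldown:
            return False  # same change, still inside the cooldown
        self._last_alerted[key] = now
        return True

    def resolve(self, key: str) -> None:
        """Forget a change so that a recurrence alerts again immediately."""
        self._last_alerted.pop(key, None)


dedup = Deduplicator(cooldown=timedelta(hours=1))
start = datetime(2024, 1, 1, 9, 0)
print(dedup.should_alert("orders:column_removed:status", start))
print(dedup.should_alert("orders:column_removed:status", start + timedelta(minutes=10)))
```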

## Common Mistakes

### Alerting on everything

**Problem**: Every table, every change, every environment → hundreds of alerts

**Solution**: Start with 5-10 critical tables. Expand only after you've proven the value.

### Same destination for everything

**Problem**: All alerts go to Slack → important ones get buried

**Solution**: Use event-based routing: PagerDuty for breaking changes, Slack for other schema changes, and email for informational events.

### Too-tight SLAs

**Problem**: Freshness SLA is 1 hour, but ETL sometimes takes 70 minutes → constant false positives

**Solution**: Set SLA at 2x expected, tune down over time.

### Monitoring dev environments

**Problem**: Dev databases change constantly → alert storm

**Solution**: Don't monitor dev at all, or use weekly email digests only.

### No one owns the alerts

**Problem**: Alerts fire but no one responds

**Solution**: Define ownership for each alert type. Use PagerDuty with on-call rotations for critical alerts.

## Weekly Review Process

Schedule 15-30 minutes weekly to review alerts:

### Questions to Ask

1. **How many alerts fired this week?**
   - If more than 50: Too many. Add filters or raise thresholds.
   - If fewer than 5: Are you monitoring enough?

2. **What percentage were actionable?**
   - Target: >80%
   - If lower: Identify patterns and add filters

3. **Were any issues missed?**
   - If yes: Add coverage for those scenarios

4. **Which alerts took longest to resolve?**
   - These may need better routing or documentation
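
The first two questions are easy to check mechanically from a week's alert counts. The sketch below applies the thresholds listed above (more than 50, fewer than 5, target above 80 percent actionable). `review` is a hypothetical helper, and treating exactly 80 percent as below target is an assumption.

```python
def review(total_alerts: int, actionable_alerts: int) -> list:
    """Return findings for one week's alert counts."""
    findings = []
    if total_alerts > 50:
        findings.append("too many alerts: add filters or raise thresholds")
    elif total_alerts < 5:
        findings.append("few alerts: check whether you are monitoring enough")
    if total_alerts > 0:
        actionable_pct = 100 * actionable_alerts / total_alerts
        if actionable_pct <= 80:  # the target is strictly above 80 percent
            findings.append("actionable rate below target: identify patterns and add filters")
    return findings


print(review(total_alerts=60, actionable_alerts=30))
print(review(total_alerts=20, actionable_alerts=18))
```

An empty list means the week's numbers fall inside the suggested ranges.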

### Tuning Actions

| Finding | Action |
|---------|--------|
| Alert fires often but isn't actioned | Disable or change to email digest |
| Same asset alerts repeatedly | Investigate root cause, not just the alert |
| Critical issue wasn't alerted | Add coverage |
| Team ignores channel | Reduce volume or change channel |

## Sample Alert Configuration

Here's a recommended starting configuration:

| Rule | Event | Scope | Trigger scope | Destinations |
|------|-------|-------|---------------|--------------|
| **Production Breaking Changes** | Schema Change | Production database, all schemas | Breaking only | PagerDuty, Slack #incidents |
| **Production Additive Changes** | Schema Change | Production database, all schemas | Non-breaking only | Slack #data-alerts |
| **Critical Table Freshness** | Freshness Violation | orders, payments, users, products | SLA from asset config | Slack #data-alerts, PagerDuty (if >4h stale) |
| **Analytics Freshness** | Freshness Violation | daily_*, weekly_*, analytics_* | SLA from asset config | Slack #analytics-team |
| **Discovery Failures** | Discovery Failed | All | All failures | Slack #data-alerts, Email ops@company.com |
| **Staging Changes (Breaking)** | Schema Change | Staging database | Breaking only | Email (daily digest) |

## Checklist

Before going live with alerts:

- [ ] Defined critical tables (start with 5-10)
- [ ] Set up event-based routing (breaking → PagerDuty, others → Slack)
- [ ] Excluded dev/test environments
- [ ] SLAs set with buffer (2x expected)
- [ ] Warning thresholds configured
- [ ] Assigned ownership for each alert type
- [ ] Scheduled weekly review meeting
- [ ] Documented escalation process

## Use Schedules and Blackouts

Reduce noise by controlling when alerts fire:

### Operating Schedules

Assign [operating schedules](/alerts/schedules) to rules that only matter during business hours:

- **Freshness rules**: If your pipelines run overnight, set schedules to only alert during business hours when the team can respond
- **Non-critical schema changes**: Alert during work hours, suppress overnight
- **Development environments**: Restrict to CI/CD windows

### Blackout Windows

Use [blackout windows](/alerts/blackouts) for planned quiet periods:

- **Deployment windows**: Suppress alerts during known release times
- **Holiday freezes**: Create yearly recurring blackouts for company holidays
- **Maintenance periods**: Silence alerts during planned infrastructure work

> 💡 Combine schedules and blackouts: schedules handle recurring weekly patterns, blackouts handle specific date ranges. Both keep your team focused on alerts they can act on.

## Common Questions

### How do I stop getting too many data alerts?

Alert fatigue usually comes from monitoring too broadly. Start with 5-10 critical production tables, route only breaking changes to PagerDuty, send additive changes to Slack, and exclude dev and staging from noisy rules. Set freshness SLAs at roughly 2x expected update time, then tighten once you see real patterns.

### Should I alert on dev and staging databases?

Usually no. Dev databases change constantly and produce noise without actionable signal. If you must monitor non-prod, restrict it to breaking changes only and route to a weekly email digest, not to a real-time channel.

### What's a good starting freshness SLA?

Start at roughly twice your expected update interval, then tighten over time. For hourly pipelines try a 3-hour SLA and tune down to 2 hours. For 15-minute pipelines try 45 minutes. Tight SLAs at launch produce constant false positives during normal pipeline variance.

### Which alerts belong on PagerDuty versus Slack versus email?

Match the destination to urgency. PagerDuty is for breaking production changes and critical SLA violations that need on-call response now. Slack is for schema changes and freshness issues the team should see within hours. Email fits informational events, digests, and low-urgency records.

### How often should I review my alert rules?

Block 15-30 minutes weekly. Count alerts fired, estimate what percentage were actionable (target above 80 percent), and check whether any real issue was missed. Disable rules that never produce action and tighten ones that never fire. This is the fastest path out of chronic alert fatigue.
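
If you keep a simple tally during the review, the numbers are easy to compute. The sketch below assumes you record, per rule, how many alerts fired and how many led to action; `RuleReview` and the helper functions are hypothetical names, not part of any AnomalyArmor API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleReview:
    rule_name: str
    alerts_fired: int
    alerts_actionable: int


def actionable_rate(reviews: list[RuleReview]) -> float:
    """Percentage of fired alerts that led to action (0.0 when nothing fired)."""
    total_fired = sum(r.alerts_fired for r in reviews)
    if total_fired == 0:
        return 0.0
    return 100.0 * sum(r.alerts_actionable for r in reviews) / total_fired


def rules_to_disable(reviews: list[RuleReview]) -> list[str]:
    """Rules that fired but never produced action."""
    return [r.rule_name for r in reviews if r.alerts_fired > 0 and r.alerts_actionable == 0]


def rules_to_tighten(reviews: list[RuleReview]) -> list[str]:
    """Rules that never fired at all."""
    return [r.rule_name for r in reviews if r.alerts_fired == 0]


week = [
    RuleReview("prod-breaking-schema", alerts_fired=10, alerts_actionable=9),
    RuleReview("staging-additive-schema", alerts_fired=6, alerts_actionable=0),
    RuleReview("fact-orders-freshness", alerts_fired=0, alerts_actionable=0),
]

print(f"actionable: {actionable_rate(week):.1f}% (target above 80%)")
print("disable:", rules_to_disable(week))
print("tighten:", rules_to_tighten(week))
```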

## Related Topics

- **[Alert Rules](/alerts/alert-rules)**: Configure alert rules
- **[Freshness Monitoring](/data-quality/freshness-monitoring)**: Set up freshness SLAs
- **[Slack Integration](/alerts/destinations/slack)**: Configure Slack alerts
- **[Alerts Overview](/alerts/overview)**: Alert system architecture
- **[Operating Schedules](/alerts/schedules)**: Control when rules are active
- **[Blackout Windows](/alerts/blackouts)**: Suppress alerts during maintenance

---

<!-- Source: https://docs.anomalyarmor.ai/intelligence/overview -->

# Intelligence Overview

> AI that connects schema drift, freshness, data quality, and lineage into actionable answers

Intelligence is the layer that ties your monitoring together. Schema drift detects changes. Freshness catches stale tables. Data quality tracks metrics. Intelligence reads all of that context and lets you ask questions, surface findings, and diagnose problems in plain English.

## How Intelligence Uses Your Monitoring Context

Every AnomalyArmor feature generates context that Intelligence can reason over:

| Feature | Context Created | What Intelligence Does With It |
|---------|----------------|-------------------------------|
| [Schema Drift](/concepts/schema-changes) | Column additions, removals, type changes | "A column was removed Tuesday. That's likely why your dashboard broke." |
| [Freshness](/concepts/freshness) | Update patterns, SLA violations, stale tables | "The orders table is 6 hours overdue. Its upstream source is also stale." |
| [Data Quality](/concepts/metrics) | Null rates, row counts, validity rule failures | "Null rate on email spiked from 2% to 28% after Wednesday's deploy." |
| [Tags](/data-classification/overview) | PII labels, domain classifications | "3 tables with PII tags had schema changes this week." |
| [Lineage](/intelligence/overview) | Upstream sources, downstream consumers | "The stale staging table affects 7 downstream gold tables." |
| [Alerts](/concepts/alerts) | Alert history, rule configurations | "You had 4 freshness alerts and 1 schema drift alert since Monday." |

The more monitoring you set up, the smarter Intelligence gets.

## Features

- **[Object Profiles](/intelligence/object-profiles)**: AI-generated profiles with findings, relationships, and change history for every table
- **[Ask Questions](/intelligence/ask-questions)**: Chat with an AI agent that has full context across all your monitoring domains
- **[Walkthrough](/intelligence/use-cases)**: See how monitoring context flows into Intelligence to solve real problems

## What You Can Ask

Intelligence answers are grounded in real data from your monitoring setup, not just schema names.

### Cross-Domain Diagnosis

```
"Why is my orders dashboard showing errors?"
"What changed upstream that could cause NULLs in silver_orders?"
"What should I prioritize this morning?"
```

### Schema and Structure

```
"What columns were removed this week?"
"Which tables have PII?"
"How is users related to orders?"
```

### Freshness and Health

```
"Which tables are stale right now?"
"What's the update pattern for my fact tables?"
"Show me SLA violations this week"
```

### Data Quality

```
"Which columns have spiking null rates?"
"What validity rules are failing?"
"Are there anomalies in my row counts?"
```

### Impact Analysis

```
"What's the blast radius if orders goes down?"
"What depends on the customers table?"
"Show me everything downstream of raw_events"
```

## The Intelligence Page

The Intelligence page is a two-panel layout:

- **Left**: Schema Explorer tree. Navigate databases, schemas, and tables.
- **Right**: Dynamic content that changes based on your selection.

Select a table to see its **Object Profile**: an AI-generated summary with findings grouped by severity, relationship mapping, change history, and which monitoring context is available. Select a database or schema to see a high-level overview with aggregated findings.

Click **Ask Agent** to open the full-screen chat, where you can ask questions scoped to the selected asset with full conversation history.

## How It Works

Intelligence builds a knowledge base from your operational metadata:

| What We Analyze | What We Never Access |
|----------------|---------------------|
| Table and column names | Actual row data |
| Data types and constraints | Query results |
| Schema change history | Business values |
| Freshness patterns and SLAs | Database credentials |
| Data quality metrics | |
| Alert history and rules | |
| Tags and classifications | |
| Lineage and dependencies | |

Your data values stay in your database. Intelligence only sees structure and operational metadata.

## Usage and Billing

Intelligence queries count against your monthly API limit:

| Plan | Monthly Queries |
|------|-----------------|
| Starter | 500 |
| Growth | 2,000 |
| Professional | 10,000 |
| Enterprise | Unlimited |

Check usage: **Settings** > **Billing** > **Usage**

## Privacy and Security

- Intelligence processes queries using AI language models (configurable per account)
- Only schema metadata and operational data are sent (names, types, alert history, freshness status)
- No actual data values are ever transmitted
- You can request Intelligence be disabled for your account

## Common Questions

### What does Intelligence actually do that monitoring alone doesn't?

Monitoring detects individual events: schema drift, stale tables, null rate spikes. Intelligence correlates them. It reads context from every monitoring domain and answers questions like "why is this broken?" in one place, so you don't have to click through separate pages.

### Does Intelligence send my data to an LLM?

Only metadata: table names, column types, schema changes, freshness status, metric values, alert history, and tags. Row data, query results, and credentials are never transmitted. See the [Ask Questions](/intelligence/ask-questions) page for more on privacy.

### Do I need monitoring set up before Intelligence is useful?

Yes, effectively. Intelligence reasons over the context your monitoring creates. Without freshness schedules, drift detection, or metrics, it falls back to schema-only analysis and loses most of its diagnostic power.

### Is Intelligence usage capped?

Yes, by plan. Starter gets 500 queries per month, Growth 2,000, Professional 10,000, and Enterprise is unlimited. Check consumption under **Settings** > **Billing** > **Usage**.

### Can I turn Intelligence off for my account?

Yes. If your organization doesn't want AI processing, request that Intelligence be disabled. The rest of AnomalyArmor (monitoring, alerts, classification) continues working without it.

## Next Steps

- **[Object Profiles](/intelligence/object-profiles)**: Understand what you see when you select a table
- **[Ask Questions](/intelligence/ask-questions)**: Chat with the AI agent

---

<!-- Source: https://docs.anomalyarmor.ai/intelligence/object-profiles -->

# Object Profiles

> AI-generated intelligence profiles for every table, powered by your monitoring context

When you select a table on the Intelligence page, you see its **Object Profile**: an AI-generated analysis that combines schema metadata with context from freshness, data quality, schema drift, tags, and lineage into a single view.

The profile is only as useful as the monitoring you've set up. A table with freshness schedules, data quality metrics, and schema drift monitoring will have a much richer profile than one with no monitoring at all.

## What's in a Profile

### Object Summary

The top card shows the table's identity at a glance:

- **Name and path**: `warehouse.gold.fact_orders`
- **Warehouse role**: Classified as Fact Table, Dimension, Staging, Raw, or other patterns
- **Confidence score**: How confident the AI is in its analysis (based on available context)
- **Summary**: One-paragraph explanation of what this table is for
- **Business context**: What team uses it, what process populates it, what it means to the business
- **Domain and tags**: Business and technical classifications applied to this table

### Findings

Findings are the most actionable part of the profile. They surface issues and observations pulled from across your monitoring domains, grouped by severity:

| Severity | Example |
|----------|---------|
| **Critical** | "Column `order_total` was removed on Tuesday. 3 downstream tables reference this column." |
| **High** | "Table is 6 hours overdue for update. SLA threshold is 2 hours." |
| **Medium** | "Null rate on `email` increased from 2.1% to 12.4% over the past week." |
| **Low** | "Table has no freshness monitoring configured." |

Each finding includes the affected object, a description, and the monitoring domain it came from. Critical and High findings are expanded by default. Low findings are collapsed.
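
To make the grouping concrete, here is a minimal sketch of how findings could be bucketed by severity and given a default expanded state. The `Finding` fields follow the description above, but the class and function names are hypothetical. The text does not say how Medium findings are displayed, so treating them as collapsed is an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass

SEVERITY_ORDER = ["critical", "high", "medium", "low"]
# Critical and High are expanded per the text; Medium being collapsed is an assumption.
EXPANDED_BY_DEFAULT = {"critical", "high"}


@dataclass(frozen=True)
class Finding:
    severity: str
    affected_object: str
    description: str
    domain: str  # monitoring domain the finding came from


def group_findings(findings: list[Finding]) -> dict[str, list[Finding]]:
    """Group findings by severity, most severe group first."""
    grouped: dict[str, list[Finding]] = defaultdict(list)
    for finding in findings:
        grouped[finding.severity].append(finding)
    return {sev: grouped[sev] for sev in SEVERITY_ORDER if sev in grouped}


def is_expanded_by_default(severity: str) -> bool:
    return severity in EXPANDED_BY_DEFAULT


findings = [
    Finding("low", "gold.fact_orders", "No freshness monitoring configured.", "freshness"),
    Finding("critical", "gold.fact_orders.order_total", "Column removed on Tuesday.", "schema"),
    Finding("high", "gold.fact_orders", "Table is 6 hours overdue.", "freshness"),
]

for severity, items in group_findings(findings).items():
    state = "expanded" if is_expanded_by_default(severity) else "collapsed"
    print(f"{severity} ({state}): {len(items)} finding(s)")
```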

### Relationships and Lineage

Shows how this table connects to the rest of your data:

- **Foreign key relationships**: Detected column references to other tables
- **Lineage signals**: Whether this table is a source, derived, or staging table
- **Schema-level lineage**: Which schemas act as sources vs. consumers
- **Hub tables**: Central entities that many tables reference

### Change History

A timeline of recent schema changes detected on this table:

- Column additions (green)
- Column removals (red)
- Type modifications (amber)

Shows the 10 most recent changes with timestamps. This context is what lets Intelligence answer questions like "what changed on the orders table this week?"

### Analysis Context

Shows which monitoring domains have data available for this table. The agent uses this context when answering questions:

| Domain | What It Provides |
|--------|-----------------|
| **Schema** | Table structure, column types, constraints |
| **Tags** | Business classifications, PII labels |
| **Freshness** | Update patterns, SLA status, learned schedules |
| **Rules** | Alert rule configurations and firing history |
| **Data Quality** | Metric trends, validity rule results |

A checkmark means context is available. Missing domains mean you haven't set up that monitoring yet.

## Generating Profiles

### Single Table

1. Navigate to a table in the Schema Explorer
2. Click **Analyze Table** in the top right
3. Wait for analysis to complete (typically 10-30 seconds)
4. The profile appears with all available context

### Entire Schema

1. Select a schema in the Schema Explorer
2. Click **Analyze Schema**
3. All tables in the schema are analyzed

### Full Database

1. Select the database root in the Schema Explorer
2. Click **Analyze All**
3. Every table across all schemas is analyzed

Analysis runs in the background. You'll see a notification when it completes, and the profiles update automatically.

## Making Profiles More Useful

The quality of a profile depends directly on what monitoring you've set up:

| What You Set Up | What the Profile Gains |
|----------------|----------------------|
| Freshness schedule | Update patterns, SLA violation findings, staleness history |
| Schema drift monitoring | Change timeline, removed column warnings, drift findings |
| Data quality metrics | Trend data, anomaly findings, null rate tracking |
| Tags | Business context, PII classification, domain grouping |
| Alert rules | Alert coverage analysis, firing history context |

**Start with your most critical tables.** Set up freshness monitoring and a few data quality metrics, then generate the profile. It will immediately surface findings from that context.

## Editing Descriptions

AI-generated descriptions are a starting point. You can edit any description to add context the AI can't infer:

- Business context: "Used by the finance team for quarterly reporting"
- Data sources: "Populated by the Stripe webhook integration"
- Update frequency: "Updated in real-time as orders are placed"
- Caveats: "Does not include cancelled orders before 2023"

Edits are preserved across re-analysis. Your additions won't be overwritten.

## Common Questions

### What's in an object profile?

A summary of the table's role, AI-generated business context, findings grouped by severity, foreign key and lineage relationships, and a timeline of recent schema changes. Every section is grounded in your actual monitoring data, not inferred from the table name alone.

### Why are my profiles sparse on findings?

Profiles are only as good as the monitoring feeding them. A table with freshness schedules, data quality metrics, and schema drift monitoring produces rich findings. A table with no monitoring produces mostly schema-level context and "consider enabling X" suggestions.

### How do I generate a profile for many tables at once?

Select a schema and click **Analyze Schema**, or pick the database root and click **Analyze All**. Analysis runs in the background and profiles update automatically as each table completes.

### Do my edits to a profile's description survive re-analysis?

Yes. Manual edits to descriptions are preserved across future runs. Use this to add business context the AI can't infer, like which team owns a table or how it's populated.

### What does the confidence score represent?

It reflects how much monitoring context was available when the profile was generated. More freshness history, drift detection, metrics, and tags produce higher confidence. Tables with only schema metadata score lower.

## Next Steps

- **[Ask Questions](/intelligence/ask-questions)**: Use profiles as context for AI-powered Q&A
- **[Walkthrough](/intelligence/use-cases)**: See profiles in action during a real debugging scenario

---

<!-- Source: https://docs.anomalyarmor.ai/intelligence/ask-questions -->

# Ask Questions

> Chat with an AI agent that has full context from your monitoring setup

The Intelligence agent is an AI assistant that can answer questions about your data using context from every monitoring domain: schema drift, freshness, data quality, tags, lineage, and alerts. It chains multiple tools together automatically, so a single question can pull from several sources to give you a complete answer.

## How to Ask

1. Navigate to the **Intelligence** page
2. Click **Ask Agent** in the top right
3. Type your question in plain English
4. The agent investigates, calling tools and checking context
5. You see the full response with tool calls visible

The agent opens as a full-screen overlay. You can scope it to a specific table by selecting one in the Schema Explorer before opening the chat.

## What Makes This Different

The agent doesn't just search table names. It reads your monitoring context:

| Traditional Search | Intelligence Agent |
|---|---|
| "orders" returns tables with "orders" in the name | "Why is the orders dashboard broken?" checks schema drift for removed columns, freshness for stale data, and data quality for metric anomalies |
| "stale tables" returns a static list | "What should I prioritize?" ranks issues by severity across all domains |
| "PII tables" returns tagged assets | "Which PII tables had changes this week?" cross-references tags with schema drift history |

## Question Categories

### Diagnosis (Cross-Domain)

These are the most powerful questions because they pull from multiple monitoring domains at once.

```
You: Why is my orders dashboard showing "column not found"?

Agent: I checked schema drift for the orders table and found that the column
       `discount_pct` was removed on Tuesday at 3:47 PM.

       This column is referenced by 2 downstream views:
       - analytics.order_summary
       - reporting.daily_revenue

       The removal coincides with a schema change detected on the upstream
       staging.orders_raw table. I recommend checking the ETL pipeline that
       populates staging.orders_raw for intentional column removal.
```

```
You: What changed upstream that could cause NULLs in silver_orders?

Agent: I found 3 potential causes:

       1. Schema change: raw_events.user_id column type changed from INT to
          VARCHAR on Monday. This could cause join failures with the users table.

       2. Freshness: raw_events hasn't updated in 14 hours (SLA: 2 hours).
          Missing upstream data would produce NULLs in derived columns.

       3. Data quality: null_percentage on raw_events.order_id spiked from
          0.1% to 8.3% starting Wednesday.
```

### Health and Prioritization

```
You: What should I prioritize this morning?

Agent: Here's your priority list:

       CRITICAL:
       - gold.fact_orders is 6 hours stale (SLA: 2h). Upstream
         staging.orders_raw is also stale.

       HIGH:
       - 3 schema drift alerts: columns removed from gold.dim_customer,
         gold.dim_product, silver.silver_users
       - Null rate on gold.fact_payments.amount spiked to 15%

       MEDIUM:
       - 2 validity rules failing on gold.dim_customer (UNIQUE on email,
         NOT_NULL on created_at)

       No new low-severity issues.
```

### Data Discovery

```
You: Where is revenue data?

Agent: Revenue data lives in 3 tables:

       1. gold.fact_orders (3.2M rows, updated 5 min ago)
          - Columns: order_total, tax_amount, discount_amount, net_revenue
          - Tagged: revenue, finance

       2. gold.fact_subscriptions (450K rows, updated 1 hour ago)
          - Columns: mrr, arr, subscription_value
          - Tagged: revenue, SaaS-metrics

       3. reporting.revenue_daily (aggregated, updated daily at 6 AM)
          - Columns: total_revenue, revenue_by_product, revenue_by_region

       gold.fact_orders has the most granular and freshest data.
```

### Data Quality

```
You: Are there anomalies in my row counts this week?

Agent: I checked row_count metrics across all monitored tables:

       ANOMALIES DETECTED:
       - gold.fact_orders: Row count dropped 23% on Wednesday
         (3.2M -> 2.4M). This is outside the normal daily variance of +/- 5%.
       - staging.email_events: Row count doubled on Thursday
         (890K -> 1.8M). Possible duplicate ingestion.

       NORMAL:
       - 14 other monitored tables show expected row count patterns.
```
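
The comparison behind an answer like this can be sketched as a percent change checked against a tolerance band. This is a simplified illustration: the function names are hypothetical, the fixed 5% tolerance is taken from the example response, and a real detector would more likely learn the band from metric history.

```python
def percent_change(previous: int, current: int) -> float:
    """Signed percent change from the previous row count to the current one."""
    if previous <= 0:
        raise ValueError("previous row count must be positive")
    return 100.0 * (current - previous) / previous


def is_row_count_anomaly(previous: int, current: int, tolerance_pct: float = 5.0) -> bool:
    """Flag a change that falls outside +/- tolerance_pct of the previous count."""
    return abs(percent_change(previous, current)) > tolerance_pct


# staging.email_events from the example (890K -> 1.8M), plus a made-up normal day.
print(is_row_count_anomaly(890_000, 1_800_000))
print(is_row_count_anomaly(1_000_000, 1_030_000))
```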

### Impact Analysis

```
You: What's the blast radius if I drop the customers table?

Agent: Dropping gold.dim_customer would affect:

       DIRECT DEPENDENTS (3 tables):
       - gold.fact_orders (JOIN on customer_id)
       - gold.fact_subscriptions (JOIN on customer_id)
       - reporting.customer_ltv (derived from dim_customer)

       INDIRECT DEPENDENTS (2 tables):
       - reporting.revenue_daily (depends on fact_orders)
       - reporting.churn_analysis (depends on fact_subscriptions)

       ALERT IMPACT:
       - 4 freshness schedules would stop reporting
       - 2 data quality metrics would fail
       - 1 validity rule (UNIQUE on email) would be orphaned
```
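
A blast-radius answer of this shape amounts to a breadth-first walk over downstream lineage. The sketch below shows the idea on the tables from the example; the `blast_radius` function and the adjacency-map representation are assumptions for illustration, not the agent's actual tooling.

```python
from collections import deque


def blast_radius(downstream: dict[str, list[str]], root: str) -> dict[str, list[str]]:
    """Breadth-first walk that separates direct (1 hop) from indirect (2+ hops) dependents."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in downstream.get(node, []):
            if child not in depth:  # first visit is the shortest path in a BFS
                depth[child] = depth[node] + 1
                queue.append(child)
    return {
        "direct": sorted(t for t, d in depth.items() if d == 1),
        "indirect": sorted(t for t, d in depth.items() if d >= 2),
    }


# Edges point from a table to the tables that read from it.
lineage = {
    "gold.dim_customer": ["gold.fact_orders", "gold.fact_subscriptions", "reporting.customer_ltv"],
    "gold.fact_orders": ["reporting.revenue_daily"],
    "gold.fact_subscriptions": ["reporting.churn_analysis"],
}

impact = blast_radius(lineage, "gold.dim_customer")
print("direct:", impact["direct"])
print("indirect:", impact["indirect"])
```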

## Sessions

Conversations are organized into sessions. Each session maintains full context, so follow-up questions work naturally.

- **New session**: Click **+** in the session sidebar
- **Resume session**: Click a previous session to continue
- **Session history**: Sidebar shows all past conversations with timestamps

## Tool Visibility

As the agent works, you can see each tool call it makes. This transparency lets you verify what the agent is checking:

- Which tool was called (e.g., `list_schema_changes`, `check_freshness`)
- What parameters were passed
- The result returned

For actions that modify data (creating metrics, applying tags), the agent asks for confirmation before proceeding.

## Tips

- **Be specific about scope**: "Check freshness for the 3 gold fact tables" works better than "check everything"
- **Chain requests**: "Find tables with spiking null rates, then show me their schema changes this week" works in one message
- **Ask diagnostic questions**: "Why is X broken?" triggers cross-domain investigation, which is where Intelligence is strongest
- **Let it discover first**: Say "find tables with customer data" instead of typing exact table paths

## Common Questions

### How is this different from searching a data catalog?

A catalog returns tables that match keywords. The Intelligence agent chains tools across schema drift, freshness, data quality, tags, and lineage to diagnose **why** something is happening, not just what exists. "Why is my orders dashboard broken?" pulls from multiple monitoring domains in a single answer.

### Does the agent see my actual data values?

No. It reads only operational metadata: schema structure, freshness patterns, metric values, alert history, tags, and lineage. Row-level data stays in your database. See [Intelligence Overview](/intelligence/overview) for the full list of what's analyzed versus what's never accessed.

### Can the agent make changes, or is it read-only?

It can suggest and, with your confirmation, perform actions like creating metrics or applying tags. Any action that modifies data requires you to approve it first. Investigative questions are always read-only.

### What kinds of questions work best?

Diagnostic ones. "Why is X broken?", "What changed upstream?", "What should I prioritize?" trigger cross-domain investigation where the agent is strongest. Scope matters: asking about specific tables beats "check everything".

### Do sessions remember earlier questions?

Yes. Each session keeps full conversation context, so follow-ups like "and what about the phone column?" work naturally. Start a new session when you switch topics to keep context clean.

### Can I see what tools the agent called to answer a question?

Yes. Each response shows the tool calls inline: which tool was used, what arguments were passed, and what it returned. This transparency lets you verify the answer and spot places where extra monitoring would give the agent more context.

## Next Steps

- **[Walkthrough](/intelligence/use-cases)**: See the agent diagnose a real problem end to end
- **[Object Profiles](/intelligence/object-profiles)**: Understand the context that powers agent answers

---

<!-- Source: https://docs.anomalyarmor.ai/intelligence/use-cases -->

# Intelligence in Action

> How monitoring context flows into Intelligence to solve real problems

Intelligence is most powerful when you've set up monitoring across your data stack. This page walks through two real scenarios where schema drift, freshness, and data quality create the context that Intelligence uses to diagnose problems.

## Scenario 1: The Broken Dashboard

A revenue dashboard shows "column not found" at 8 AM. Here's how you'd use AnomalyArmor to find the root cause.

### Step 1: Set Up Schema Drift Monitoring

Before anything breaks, you enabled schema drift detection on your gold tables:

```
Navigate to gold.fact_orders > Schema Monitoring > Enable
```

AnomalyArmor captures a baseline of every column, type, and constraint. Any future change triggers a schema drift alert.
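
Conceptually, drift detection is a diff between the baseline snapshot and the current one. The following sketch compares column names and types only (constraints are left out for brevity); `diff_schema` and the dictionary representation are hypothetical, and the column values are made up.

```python
def diff_schema(baseline: dict[str, str], current: dict[str, str]) -> dict[str, list]:
    """Compare two {column_name: type} snapshots and report what changed."""
    added = sorted(current.keys() - baseline.keys())
    removed = sorted(baseline.keys() - current.keys())
    type_changed = sorted(
        (name, baseline[name], current[name])
        for name in baseline.keys() & current.keys()
        if baseline[name] != current[name]
    )
    return {"added": added, "removed": removed, "type_changed": type_changed}


baseline = {"order_id": "INT", "order_total": "FLOAT", "discount_pct": "FLOAT"}
current = {"order_id": "VARCHAR", "order_total": "FLOAT", "coupon_code": "VARCHAR"}

drift = diff_schema(baseline, current)
print(drift)
```

Any non-empty list in the result would correspond to a schema drift event of that kind.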

### Step 2: Set Up Freshness Monitoring

You created a freshness schedule for the orders pipeline:

```
gold.fact_orders: Check hourly, SLA 2 hours
staging.orders_raw: Check hourly, SLA 1 hour
```

Now AnomalyArmor knows when these tables should update and will alert if they fall behind.
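
The check itself reduces to comparing a table's age against its SLA. This sketch uses the two SLAs from the schedule above. The `overdue_by` helper and the timestamps are invented for illustration, and how the last update time is actually determined is outside its scope.

```python
from datetime import datetime, timedelta, timezone


def overdue_by(last_updated: datetime, sla: timedelta, now: datetime) -> timedelta:
    """How far past its SLA a table is; zero when it is still within the SLA."""
    age = now - last_updated
    return max(age - sla, timedelta(0))


now = datetime(2025, 1, 8, 8, 0, tzinfo=timezone.utc)
tables = {
    # table name: (last update seen, SLA from the schedule)
    "gold.fact_orders": (datetime(2025, 1, 8, 7, 15, tzinfo=timezone.utc), timedelta(hours=2)),
    "staging.orders_raw": (datetime(2025, 1, 8, 5, 30, tzinfo=timezone.utc), timedelta(hours=1)),
}

for name, (last_updated, sla) in tables.items():
    late = overdue_by(last_updated, sla, now)
    status = f"overdue by {late}" if late else "within SLA"
    print(f"{name}: {status}")
```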

### Step 3: Set Up Data Quality Metrics

You added quality metrics on key columns:

```
gold.fact_orders.order_total: row_count, null_percentage
gold.fact_orders.discount_pct: null_percentage, distinct_count
gold.dim_customer.email: null_percentage
```

AnomalyArmor tracks these metrics on every discovery run, building a history of normal ranges.
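
For readers unfamiliar with these metrics, the sketch below shows one plausible definition of `null_percentage` and `distinct_count` on an in-memory list. The metric names come from the configuration above; the exact definitions (for example, whether NULLs count toward `distinct_count`) are assumptions, and in practice such metrics would be computed by queries inside the database.

```python
from typing import Optional, Sequence


def null_percentage(values: Sequence[Optional[object]]) -> float:
    """Share of values that are NULL (None), as a percentage."""
    if not values:
        return 0.0
    return 100.0 * sum(v is None for v in values) / len(values)


def distinct_count(values: Sequence[Optional[object]]) -> int:
    """Number of distinct non-NULL values (excluding NULLs is an assumption)."""
    return len({v for v in values if v is not None})


discount_pct = [0.1, None, 0.2, 0.1, None]  # made-up sample column
print(null_percentage(discount_pct))
print(distinct_count(discount_pct))
```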

### Step 4: Something Breaks

Tuesday at 3:47 PM, a deploy removes the `discount_pct` column from `staging.orders_raw`. The ETL pipeline runs, propagating the removal to `gold.fact_orders`. Wednesday morning, the revenue dashboard fails with "column not found: discount_pct."

At this point, AnomalyArmor has already captured:
- **Schema drift**: Column `discount_pct` removed from `gold.fact_orders` at 4:12 PM Tuesday
- **Data quality**: `discount_pct` null_percentage metric stopped reporting (column gone)
- **Alerts**: Schema drift alert fired at 4:12 PM; no freshness alert fired (table still updating on schedule)

### Step 5: Intelligence Connects the Dots

You open the Intelligence page, select `gold.fact_orders`, and see the Object Profile:

**Findings (Critical)**:
> Column `discount_pct` (FLOAT) was removed on Tuesday at 4:12 PM. This column had an active null_percentage metric that is no longer reporting. 2 downstream views reference this column: `analytics.order_summary` and `reporting.daily_revenue`.

**Findings (Medium)**:
> The removal correlates with a schema change on `staging.orders_raw` at 3:47 PM Tuesday, suggesting the column was dropped upstream.

You click **Ask Agent** and type:

```
You: Why is the revenue dashboard broken?

Agent: The revenue dashboard is failing because the column `discount_pct`
       was removed from gold.fact_orders.

       Timeline:
       - Tue 3:47 PM: discount_pct removed from staging.orders_raw
       - Tue 4:12 PM: Change propagated to gold.fact_orders via ETL
       - Tue 4:12 PM: Schema drift alert fired
       - Wed 8:00 AM: Dashboard query fails on missing column

       The column is referenced by:
       - analytics.order_summary (SELECT discount_pct)
       - reporting.daily_revenue (SUM(discount_pct))

       Root cause: Upstream schema change in staging.orders_raw.
       Check the deploy that went out Tuesday afternoon.

       Recommended actions:
       1. Restore the column if the removal was unintentional
       2. Update the 2 downstream views if it was intentional
       3. Resolve the schema drift alert after addressing
```

Without Intelligence, this investigation would involve checking the schema history page, the freshness page, the alerts page, and the data quality page separately. With Intelligence, you get the full picture in one question.

## The Pattern

Every Intelligence investigation follows the same loop:

1. **Features generate context**: Schema drift, freshness, and data quality monitoring continuously capture operational data about your tables
2. **Intelligence aggregates context**: When you select a table or ask a question, Intelligence pulls from all available monitoring domains
3. **AI reasons across domains**: Instead of checking each page separately, Intelligence correlates findings across schema changes, freshness status, quality metrics, tags, and lineage
4. **You get a diagnosis**: A single answer that explains what happened, when, why, and what to do about it

The more monitoring you set up, the more complete the picture. Start with your most critical tables: add freshness schedules, enable schema drift detection, and track a few data quality metrics. Then let Intelligence do the correlation work for you.

## Common Questions

### What's the minimum monitoring setup for Intelligence to be useful?

Freshness on your most critical tables plus schema drift detection. That combination alone unlocks most diagnostic questions. Add data quality metrics next for the null-rate and row-count regressions shown in Scenario 2.

### Can Intelligence catch a regression nothing alerted on?

Yes. Scenario 2 is exactly that case: no alert fires because the table updates on schedule and the schema is unchanged, but the null-rate trend in data quality metrics surfaces the regression in the Object Profile. Intelligence reads trend history, not just threshold breaches.

### How does Intelligence trace a broken dashboard back to its root cause?

It correlates schema drift (column removed), freshness (was the table also late?), lineage (which views reference the column), and alert history (when did the alert fire?) into a single timeline. That's what turns "dashboard is broken" into "deploy on Tuesday 3:47 PM removed `discount_pct`".

### Does Intelligence work across upstream and downstream tables?

Yes. Lineage context lets it trace a problem from a gold table back to staging sources, and forward to downstream views and reports. The "blast radius" question in [Ask Questions](/intelligence/ask-questions) uses the same signal.

## Next Steps

- **[Set Up Freshness](/concepts/freshness)**: Monitor table update patterns
- **[Enable Schema Drift](/concepts/schema-changes)**: Detect column changes automatically
- **[Add Data Quality](/concepts/metrics)**: Track null rates, row counts, and validity rules
- **[Ask Questions](/intelligence/ask-questions)**: Start chatting with the Intelligence agent

---

<!-- Source: https://docs.anomalyarmor.ai/data-classification/overview -->

# Data Classification

> Tag and categorize your data assets for governance and compliance

Data classification helps you track sensitive data, organize assets, and meet compliance requirements. Know which tables contain PII before your next audit asks.

## Two Types of Tags

| Type | How It Works | Use For |
|------|--------------|---------|
| **Auto-classification** | Automatically detects PII patterns in column names | Compliance, PII tracking |
| **Custom tags** | You apply labels manually or in bulk | Organization, ownership, status |

Both types work together. Auto-classification catches PII; custom tags add business context.

## The Value

### Before Classification

Auditor: "Where is customer PII?"

You: "Let me check... probably in users... and maybe orders... I think there's an email somewhere in marketing..."

### After Classification

Auditor: "Where is customer PII?"

You: *filters by `pii:*`* "Here's the complete list of 47 columns containing PII across 12 tables."

## Quick Start

**Already have data sources connected?** Classification starts working automatically:

1. Auto-classification runs during discovery
2. Check results: **Assets** > **Filter** > **Classification**
3. Add custom tags to organize assets by environment, team, or review status

**Need to customize patterns?** Go to **Settings** > **Data Classification**.

## What Gets Auto-Classified

AnomalyArmor detects common PII patterns:

| Data Type | Example Columns | Tag |
|-----------|-----------------|-----|
| Email | `email`, `user_email` | `pii:email` |
| Phone | `phone`, `mobile` | `pii:phone` |
| SSN | `ssn`, `tax_id` | `pii:ssn` |
| Credit Card | `card_number` | `pii:credit-card` |
| Address | `address`, `zip_code` | `pii:address` |
| Name | `first_name`, `last_name` | `pii:name` |
| IP Address | `ip`, `client_ip` | `pii:ip-address` |

> ℹ️ Auto-classification reads **column names only**, never actual data values.
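
Name-based detection of this kind can be approximated with a handful of regular expressions. The sketch below covers only the example columns in the table above. The patterns, the `PII_NAME_PATTERNS` list, and the `classify_column` function are illustrative assumptions and not AnomalyArmor's actual rule set.

```python
import re

# Each pattern matches a whole underscore-delimited token (or token pair) in a column name.
PII_NAME_PATTERNS = [
    ("pii:email", re.compile(r"(^|_)email($|_)")),
    ("pii:phone", re.compile(r"(^|_)(phone|mobile)($|_)")),
    ("pii:ssn", re.compile(r"(^|_)(ssn|tax_id)($|_)")),
    ("pii:credit-card", re.compile(r"(^|_)card_number($|_)")),
    ("pii:address", re.compile(r"(^|_)(address|zip_code)($|_)")),
    ("pii:name", re.compile(r"(^|_)(first|last)_name($|_)")),
    ("pii:ip-address", re.compile(r"(^|_)ip($|_)")),
]


def classify_column(column_name: str) -> list[str]:
    """Return the PII tags whose pattern matches the column name (names only, no values)."""
    lowered = column_name.lower()
    return [tag for tag, pattern in PII_NAME_PATTERNS if pattern.search(lowered)]


for column in ["user_email", "client_ip", "zip_code", "order_total", "ship_date"]:
    print(column, classify_column(column))
```

Anchoring each pattern on underscore boundaries keeps `ship_date` from matching the `ip` pattern, which is the usual source of false positives in name-based matching.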

## Common Use Cases

### Compliance Audit Preparation

Before an audit:
1. Filter assets by `pii:*` tags to see all PII
2. Verify critical tables are classified
3. Export the list for documentation

See [Compliance & Governance](/data-classification/compliance) for detailed workflows.

### Organizing by Environment

Distinguish production from development:
1. Create tags: `production`, `staging`, `development`
2. Apply to assets via bulk tagging
3. Filter alerts by tag to reduce noise

### Tracking Review Status

Mark assets as reviewed:
1. Create tags: `pii-reviewed`, `needs-review`
2. Work through unreviewed assets
3. Track progress with filters

### Alert Routing

Route alerts based on tags:
- Critical tables → PagerDuty
- Non-critical → Email digest
- Development → Slack only

## Alerting on Tags

Create tag-based alert rules:

1. Go to **Alerts** > **Rules**
2. Click **Create Rule**
3. Under **Scope**, select **Filter by tag**
4. Choose your tag (e.g., `production`)
5. Configure destinations

Now alerts only fire for assets with that tag.

## Common Questions

### Do I need to tag everything manually?

No. [Auto-classification](/data-classification/auto-classification) runs during discovery and tags PII patterns like email, phone, and SSN automatically. You only add [custom tags](/data-classification/custom-tags) for things AnomalyArmor can't infer, like environment, ownership, or review status.

### Does auto-classification look at my data?

No. It reads only column names, types, and metadata. Column values stay in your database and are never transmitted.

### How does this help with compliance?

Classification gives you an instant inventory of PII locations, which auditors regularly ask for. See [Compliance & Governance](/data-classification/compliance) for SOC 2, GDPR, and access review workflows.

### Can alerts be scoped to specific tags?

Yes. In alert rules, choose **Filter by tag** and select any tag, like `production` or `pii:email`. Only assets with that tag will trigger the rule, which is the standard way to separate prod alerts from dev noise.

## In This Section

- **[Auto-Classification](/data-classification/auto-classification)**: Automatic PII detection and patterns
- **[Custom Tags](/data-classification/custom-tags)**: Create and apply your own labels
- **[Compliance](/data-classification/compliance)**: SOC 2, GDPR, and audit workflows

---

<!-- Source: https://docs.anomalyarmor.ai/data-classification/auto-classification -->

# Auto-Classification

> Automatically detect PII and sensitive data in your columns

Auto-classification scans your column names and metadata to identify potentially sensitive data like emails, phone numbers, and SSNs. This helps you track PII for compliance without manual tagging.

## How It Works

Auto-classification runs automatically during discovery:

1. **Discovery scans** your database metadata
2. **Pattern matching** checks column names against known PII patterns
3. **Tags applied** automatically to matching columns
4. **Results visible** in the asset detail page

> ℹ️ Auto-classification analyzes column **names and types only**, never actual data values. A column named `email` gets tagged, but we never read the emails inside it.

## Detected Patterns

| Pattern | Example Column Names | Tag Applied |
|---------|---------------------|-------------|
| Email | `email`, `user_email`, `email_address` | `pii:email` |
| Phone | `phone`, `phone_number`, `mobile` | `pii:phone` |
| SSN | `ssn`, `social_security`, `tax_id` | `pii:ssn` |
| Credit Card | `cc_number`, `card_number`, `credit_card` | `pii:credit-card` |
| IP Address | `ip`, `ip_address`, `client_ip` | `pii:ip-address` |
| Address | `address`, `street`, `zip_code` | `pii:address` |
| Name | `first_name`, `last_name`, `full_name` | `pii:name` |
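
To make name-based matching concrete, here is a minimal sketch of how a classifier of this kind could work. The regular expressions are assumptions loosely modeled on the table above; they are not AnomalyArmor's actual detection rules, which are not documented here.

```python
import re

# Hypothetical patterns modeled on the table above, not the product's real rules.
PII_PATTERNS = {
    "pii:email": re.compile(r"email"),
    "pii:phone": re.compile(r"phone|mobile"),
    "pii:ssn": re.compile(r"ssn|social_security|tax_id"),
    "pii:credit-card": re.compile(r"cc_number|card_number|credit_card"),
    # "ip" must be a whole underscore-delimited word, so "zip_code" does not match.
    "pii:ip-address": re.compile(r"(^|_)ip(_|$)"),
    # Skip "address" when it is part of "ip_address" or "email_address".
    "pii:address": re.compile(r"street|zip_code|(?<!ip_)(?<!email_)address"),
    "pii:name": re.compile(r"(first|last|full)_name"),
}


def classify(column_name):
    """Return the sorted list of PII tags whose pattern matches the column name."""
    name = column_name.lower()
    return sorted(tag for tag, pattern in PII_PATTERNS.items() if pattern.search(name))


print(classify("user_email"))        # ['pii:email']
print(classify("ip_address"))        # ['pii:ip-address']
print(classify("zip_code"))          # ['pii:address']
print(classify("email_sent_count"))  # ['pii:email']
print(classify("em"))                # []
```

The last two calls show why name-based matching needs review: `email_sent_count` is tagged even though it holds no PII, and an abbreviated name like `em` is missed entirely.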

## Viewing Classified Data

To see which columns have been auto-classified:

1. Go to **Assets** in the sidebar
2. Click on any asset to view details
3. Look for **Classification** tags on columns
4. Or filter: **Assets** → **Filter** → **Classification** → select a PII type

## Enabling Auto-Classification

Auto-classification is **enabled by default** for all data sources. To verify or change:

1. Go to **Settings** → **Data Classification**
2. Check that **Auto-classification** is toggled ON
3. Select which patterns to detect (all enabled by default)

## Accuracy and Limitations

Auto-classification uses pattern matching on column names, which means:

**It will catch:**
- Columns with obvious names (`email`, `phone_number`, `ssn`)
- Common naming conventions (`user_email`, `billing_address`)

**It may miss:**
- Abbreviated names (`em`, `ph`)
- Non-English names
- Ambiguous names (`data`, `value`, `field1`)

**False positives:**
- `email_sent_count` might be tagged as email (it's not PII)
- Review classifications and remove incorrect tags manually

## Customizing Detection

To adjust which patterns are detected:

1. Go to **Settings** → **Data Classification**
2. Under **Auto-classification Patterns**, toggle patterns on/off
3. Changes apply to the next discovery run

> 💡 Run a manual discovery after changing patterns to see updated classifications immediately.

## Manual Override

If auto-classification gets it wrong:

**To remove an incorrect tag:**
1. Go to the asset detail page
2. Find the column with the incorrect tag
3. Click the tag → **Remove**

**To add a missing tag:**
1. Go to the asset detail page
2. Click **Add Tag** on the column
3. Select the classification that matches the data type (e.g., `pii:email` for email addresses, `pii:phone` for phone numbers)

## Verification

After discovery runs, verify auto-classification is working:

1. Go to **Assets** → **Filter** → **Classification** → **Has any**
2. You should see assets with classified columns
3. Click into an asset to see which columns were tagged

If no assets appear:
- Verify auto-classification is enabled in Settings
- Check that discovery has run recently
- Your column names may not match detection patterns

## Common Questions

### Does auto-classification read my actual data?

No. Auto-classification looks only at column names, types, and metadata. Column values are never read or transmitted. A column called `email` is tagged based on its name, not on inspecting what's inside.

### What PII types get detected out of the box?

Email, phone, SSN, credit card, IP address, physical address, and name. Detection runs automatically during discovery and tags matching columns with values like `pii:email` or `pii:phone`.

### Why did a non-PII column like `email_sent_count` get tagged?

Pattern matching is name-based, so anything with `email` in the name matches. Open the column, click the tag, and remove it manually. Manual overrides persist across future discovery runs.
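
One way to picture overrides that persist across discovery runs: the auto-classifier recomputes its tags on every run, while manual removals and additions are stored separately and reapplied afterwards. This is a sketch of that idea only; the function and its arguments are hypothetical and say nothing about how AnomalyArmor stores overrides internally.

```python
def effective_tags(auto_tags, removed_by_user, added_by_user):
    """Combine fresh auto-classification output with stored manual overrides."""
    return (set(auto_tags) - set(removed_by_user)) | set(added_by_user)


# Discovery re-tags email_sent_count on every run, but the stored removal still wins.
print(effective_tags({"pii:email"}, removed_by_user={"pii:email"}, added_by_user=set()))
# set()

# A column the classifier missed keeps its manually added tag.
print(effective_tags(set(), removed_by_user=set(), added_by_user={"pii:phone"}))
# {'pii:phone'}
```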

### Will it miss PII in abbreviated or non-English column names?

Yes, it's likely to miss short names like `em` or `ph`, non-English names, and ambiguous ones like `data` or `value`. Add the appropriate `pii:*` tag manually from the asset detail page for those columns.

### Is auto-classification on by default?

Yes, for all data sources. You can toggle individual patterns on or off in **Settings → Data Classification**. Changes take effect on the next discovery run.

## Next Steps

- **[Custom Tags](/data-classification/custom-tags)**: Add your own classification tags
- **[Data Handling](/security/overview)**: How we protect your data

---

<!-- Source: https://docs.anomalyarmor.ai/data-classification/custom-tags -->

# Custom Tags

> Create and apply your own labels to organize data assets

Custom tags let you organize assets beyond auto-classification, such as marking tables as `production`, `deprecated`, `pii-reviewed`, or any label that fits your workflow.

## Creating a Tag

1. Go to **Settings** → **Tags**
2. Click **Create Tag**
3. Enter a tag name (e.g., `production`, `needs-review`)
4. Choose a color (optional, helps with visual scanning)
5. Click **Save**

You'll see "Tag created successfully" and the tag appears in your list.

## Applying Tags to Assets

### Single Asset

1. Navigate to the asset in **Assets**
2. Click the asset name to open details
3. Click **Add Tag** (next to the asset title)
4. Select from your available tags
5. Tag appears immediately on the asset

### Multiple Assets (Bulk)

1. Go to **Assets** → **All Assets**
2. Check the boxes next to assets you want to tag
3. Click **Actions** → **Add Tag**
4. Select the tag to apply
5. Click **Apply**

All selected assets now have the tag.

## Removing Tags

### From a Single Asset

1. Go to the asset detail page
2. Find the tag you want to remove
3. Click the **X** on the tag
4. Tag is removed immediately

### Bulk Remove

1. Go to **Assets** → filter by the tag you want to remove
2. Select all assets (checkbox in header)
3. Click **Actions** → **Remove Tag**
4. Select the tag to remove
5. Click **Remove**

## Editing a Tag

To rename or change a tag's color:

1. Go to **Settings** → **Tags**
2. Find the tag and click **Edit** (pencil icon)
3. Change name or color
4. Click **Save**

Changes apply everywhere the tag is used.

## Deleting a Tag

1. Go to **Settings** → **Tags**
2. Find the tag and click **Delete** (trash icon)
3. Confirm deletion

> ⚠️ Deleting a tag removes it from all assets. This cannot be undone.

## Filtering by Tags

Find all assets with a specific tag:

1. Go to **Assets**
2. Click **Filter**
3. Select **Tags** → choose your tag
4. View shows only assets with that tag

Combine with other filters (data source, schema) to narrow results.
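
Combining filters narrows the result set. The sketch below assumes the filters combine with AND, which is the usual behavior for this kind of UI but is not stated explicitly here; the `Asset` class and its fields are hypothetical stand-ins for what the Assets page shows.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    # Hypothetical shape, for illustration only.
    name: str
    data_source: str
    schema: str
    tags: set = field(default_factory=set)


def filter_assets(assets, tag=None, data_source=None, schema=None):
    """Return the assets that match every filter that is set."""
    result = []
    for asset in assets:
        if tag is not None and tag not in asset.tags:
            continue
        if data_source is not None and asset.data_source != data_source:
            continue
        if schema is not None and asset.schema != schema:
            continue
        result.append(asset)
    return result


assets = [
    Asset("orders", "snowflake-prod", "warehouse", {"production"}),
    Asset("orders_tmp", "snowflake-prod", "scratch", {"deprecated"}),
    Asset("users", "postgres-app", "public", {"production", "pii-reviewed"}),
]
print([a.name for a in filter_assets(assets, tag="production")])
# ['orders', 'users']
print([a.name for a in filter_assets(assets, tag="production", data_source="postgres-app")])
# ['users']
```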

## Tag Naming Conventions

Consistent naming helps your team:

| Category | Examples | Use For |
|----------|----------|---------|
| Environment | `production`, `staging`, `dev` | Distinguish environments |
| Status | `reviewed`, `needs-review`, `deprecated` | Track review status |
| Ownership | `team-data`, `team-analytics` | Assign ownership |
| Compliance | `pii-reviewed`, `gdpr-ready` | Track compliance work |

## Verification

After creating and applying tags:

1. Go to **Assets** → **Filter** → **Tags** → select your tag
2. Verify the correct assets appear
3. Click into an asset to confirm the tag shows

## Common Questions

### How do custom tags differ from auto-classification?

[Auto-classification](/data-classification/auto-classification) is automatic and focused on PII patterns. Custom tags are anything you define: environment (`production`, `staging`), ownership (`team-analytics`), status (`needs-review`), or any label your team needs. Both types coexist on the same asset.

### Can I tag many assets at once?

Yes. Go to **Assets → All Assets**, check the assets you want, then use **Actions → Add Tag**. Bulk removal works the same way. This is the fastest path to tagging a large set of tables by environment or team.

### Can I route alerts based on tags?

Yes. In [alert rules](/alerts/alert-rules), scope a rule by tag so only tagged assets trigger the rule. This is how teams send production alerts to PagerDuty and development alerts to Slack from the same underlying events.

### What happens when I delete a tag?

Deletion is permanent and removes the tag from every asset it was applied to. There's no undo. To temporarily hide a tag without losing data, rename it instead of deleting.

## Next Steps

- **[Auto-Classification](/data-classification/auto-classification)**: Automatic PII detection
- **[Alert Rules](/alerts/alert-rules)**: Create alerts filtered by tag

---

<!-- Source: https://docs.anomalyarmor.ai/data-classification/compliance -->

# Compliance & Governance

> Use tagging for audits, GDPR, and data governance

## Why Classification Matters for Compliance

Auditors ask: "Where is your PII?" You need an answer that isn't "let me check."

AnomalyArmor's auto-classification and custom tags give you:
- **Instant PII inventory** across all databases
- **Audit-ready exports** of sensitive data locations
- **Continuous monitoring** as new tables appear

## Recommended Tag Structure

| Tag | Use For |
|-----|---------|
| `pii:email`, `pii:phone`, etc. | Auto-classified PII (automatic) |
| `sensitivity:high` | Manually flagged critical data |
| `compliance:reviewed` | Audit trail of reviewed assets |
| `compliance:gdpr-scope` | GDPR-relevant data |
| `retention:30-days` | Data retention policy |

---

## Audit Preparation Checklist

```
[ ] Run discovery to ensure catalog is current
[ ] Review auto-classification results for accuracy
[ ] Remove false positives (email_count ≠ PII)
[ ] Add manual tags for data auto-classification missed
[ ] Export filtered asset list for auditor
[ ] Document any exceptions with descriptions
```

---

## Staying Compliant Over Time

New tables appear. Schemas change. Stay ahead:

1. **Alert on new PII**: Create rule for "New asset detected" + filter by auto-classification
2. **Review cadence**: Monthly review of `compliance:needs-review` tagged assets
3. **Discovery schedule**: Run frequently enough to catch new tables before auditors do

## Common Questions

### How do I answer 'where is our PII?' during an audit?

Go to **Assets**, filter by classification `pii:*`, and export the list. You get a complete inventory of PII-tagged columns across every connected database. Pair with a manual spot-check for PII hidden in non-obvious column names.
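
If you post-process the exported list yourself, the inventory an auditor wants is a flat table of PII-tagged columns. The sketch below builds one with the standard `csv` module; the `(table, column, tags)` input shape and the output column names are assumptions, not the format of AnomalyArmor's export.

```python
import csv
import io


def pii_inventory_csv(columns):
    """Build a CSV of PII-tagged columns.

    `columns` is an iterable of (table, column, tags) tuples (hypothetical shape).
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["table", "column", "pii_tags"])
    for table, column, tags in sorted(columns, key=lambda c: (c[0], c[1])):
        pii_tags = sorted(t for t in tags if t.startswith("pii:"))
        if pii_tags:  # skip columns that carry no pii:* tag
            writer.writerow([table, column, ";".join(pii_tags)])
    return buffer.getvalue()


columns = [
    ("users", "email", {"pii:email"}),
    ("users", "id", set()),
    ("orders", "ship_zip_code", {"pii:address", "production"}),
]
print(pii_inventory_csv(columns))
```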

### Does AnomalyArmor help with GDPR data mapping?

Yes. [Auto-classification](/data-classification/auto-classification) tags emails, names, and addresses automatically. Filter by the relevant `pii:*` tags to map where personal data lives, then add retention or scope tags like `compliance:gdpr-scope` for documentation.

### Can I flag tables as reviewed for audit purposes?

Use [custom tags](/data-classification/custom-tags) like `compliance:reviewed` and `compliance:needs-review`. Apply them manually or in bulk, then filter by tag to see what still needs attention.

### What happens if a new table appears between audits?

Create an alert rule for "New asset detected" filtered by auto-classification. Every time discovery finds a new PII-tagged table, the alert fires so you can review it before the next audit cycle.
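
Conceptually, this rule compares two discovery snapshots and reports tables that are new and carry at least one `pii:*` tag. The sketch below illustrates that comparison with plain dictionaries; the snapshot shape is hypothetical and the real rule is configured in the UI.

```python
def new_pii_tables(previous, current):
    """Return (table, pii_columns) for tables that are new in `current`.

    Each snapshot maps table name -> {column name: set of tags} (hypothetical shape).
    """
    alerts = []
    for table in sorted(set(current) - set(previous)):
        pii_columns = sorted(
            column
            for column, tags in current[table].items()
            if any(tag.startswith("pii:") for tag in tags)
        )
        if pii_columns:
            alerts.append((table, pii_columns))
    return alerts


previous = {"orders": {"id": set()}}
current = {
    "orders": {"id": set()},
    "newsletter_signups": {"email": {"pii:email"}, "created_at": set()},
    "job_runs": {"id": set()},
}
print(new_pii_tables(previous, current))
# [('newsletter_signups', ['email'])]
```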

---

<!-- Source: https://docs.anomalyarmor.ai/guides/overview -->

# Guides Overview

> Practical guides for common AnomalyArmor use cases

These guides walk you through solving specific problems with AnomalyArmor. Each guide targets a particular persona or use case.

## Use Case Guides

- **[Prevent Pipeline Failures](/guides/prevent-pipeline-failures)**: **For Data Engineers**

    Catch schema changes before they break your dbt models and ETL jobs.
- **[On-Call Data Alerting](/guides/oncall-data-alerting)**: **For Platform Teams**

    Set up production monitoring with proper escalation and on-call routing.
- **[Compliance Monitoring](/guides/compliance-monitoring)**: **For Data Governance**

    Track data lineage and changes for audit and compliance requirements.

## Who Are These For?

| Guide | Primary Audience | Key Outcome |
|-------|------------------|-------------|
| Prevent Pipeline Failures | Data Engineers | Proactive schema change detection |
| On-Call Data Alerting | Platform/SRE Teams | 24/7 data incident response |
| Compliance Monitoring | Data Governance | Audit trails and change documentation |

## Getting Started

If you're new to AnomalyArmor, start with the [Quickstart Guide](/quickstart/overview) to connect your first database. Then return to these guides for specific workflows.

---

<!-- Source: https://docs.anomalyarmor.ai/guides/video-tutorials -->

# Video Tutorials

> Step-by-step video walkthroughs of AnomalyArmor features

Video tutorials are coming soon. We're recording Loom walkthroughs covering:

- **Getting Started** - Connect your first database and run discovery
- **Setting Up Alerts** - Configure Slack, email, and webhook notifications
- **Creating Metrics** - Track row counts, null percentages, and custom queries
- **Schema Monitoring** - Detect and respond to schema changes

Check back soon, or [contact us](/support/contact) if you'd like early access to video content.

---

<!-- Source: https://docs.anomalyarmor.ai/getting-started/for-data-engineers -->

# For Data Engineers

> Schema drift detection, CI/CD integration, and automation for data engineers

As a data engineer, you build and maintain the pipelines that keep data flowing. AnomalyArmor helps you catch breaking changes before they impact downstream systems.

## Your Key Workflows

### Detect Breaking Schema Changes

Schema drift is your biggest enemy. A column rename or type change can silently break pipelines that ran fine yesterday.

### Integrate with Your CI/CD

Gate deployments on data quality using the CLI:

```bash
# Install
pip install anomalyarmor-cli

# Check freshness before running dbt
armor freshness check snowflake.prod.warehouse.orders

# Exit code 1 if stale, blocking the pipeline
```

[Full CLI reference](/cli/overview)

### Automate with Webhooks

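Trigger actions when schema changes are detected. Webhook alerts push changes to your systems as they happen; you can also pull recent changes with the Python SDK: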

```python
from anomalyarmor import Client

client = Client()

# Get schema changes from last 24 hours
changes = client.schema.changes(
    since="24h",
    change_types=["column_removed", "type_changed"]
)

for change in changes:
    print(f"Breaking change: {change.asset_name} - {change.description}")
```

[Python SDK guide](/sdk/overview)

## Recommended Setup

| Feature | Why You Need It |
|---------|-----------------|
| [Schema Drift Detection](/schema-monitoring/schema-drift) | Catch column changes before they break pipelines |
| [Freshness Monitoring](/data-quality/freshness-monitoring) | Know when upstream data is stale |
| [Webhook Alerts](/alerts/destinations/webhooks) | Integrate with your existing monitoring |
| [CLI](/cli/overview) | Automate checks in CI/CD |

## Common Tasks

- **[Set Up dbt Integration](/integrations/dbt)**: Run AnomalyArmor checks as part of dbt runs
- **[Airflow Pre-flight Checks](/integrations/airflow)**: Gate DAG tasks on data freshness
- **[GitHub Actions Integration](/integrations/github-actions)**: Add data quality checks to your CI pipeline
- **[CLI Reference](/cli/reference)**: Full command documentation

## Common Questions

### How do I gate a dbt run or Airflow DAG on data freshness?

Use the `armor freshness check <asset>` CLI command in your pipeline. It exits non-zero when data is stale, which blocks the next step in most orchestrators. See [CLI overview](/cli/overview) and [Airflow integration](/integrations/airflow).
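
If your orchestrator runs Python rather than shell steps, the same gate can be expressed with `subprocess`. This is a minimal sketch that relies only on the behavior described above (a non-zero exit code when data is stale); the helper name `run_if_fresh` is hypothetical.

```python
import subprocess
import sys


def run_if_fresh(check_cmd, pipeline_cmd):
    """Run pipeline_cmd only if check_cmd exits 0; return the exit code to propagate."""
    check = subprocess.run(check_cmd)
    if check.returncode != 0:
        print("Freshness check failed; skipping pipeline step", file=sys.stderr)
        return check.returncode
    return subprocess.run(pipeline_cmd).returncode
```

In a pipeline script you would call it as `sys.exit(run_if_fresh(["armor", "freshness", "check", "snowflake.prod.warehouse.orders"], ["dbt", "run"]))`, so that a stale table stops the job before dbt starts.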

### Can AnomalyArmor catch a breaking schema change before my pipeline runs?

Yes, if you schedule discovery to run before your pipelines. Point discovery at production every hour and schedule it 30-60 minutes before your main ETL windows. Breaking changes (column removed, type changed) fire alerts on the next discovery.

### How do I send schema change alerts to a webhook for automation?

Create a webhook destination, then build an alert rule scoped to the change types you care about (e.g. `column_removed`, `type_changed`). Incoming payloads include asset name, change type, and diff for your automation to act on. See [Webhook destinations](/alerts/destinations/webhooks).

### Can I pull schema change history programmatically?

Yes, via the Python SDK: `client.schema.changes(since="24h", change_types=[...])`. See [Python SDK guide](/sdk/overview).

### How does AnomalyArmor handle column renames?

Today a rename surfaces as a column removed plus a column added. You can correlate the pair in your webhook handler or alert rule. Automatic rename detection is on the roadmap.
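
A webhook handler could correlate the pair with a simple heuristic: a removed column and an added column on the same asset with the same data type are probably a rename. The sketch below assumes a hypothetical event shape (`asset`, `change_type`, `column`, `data_type`) and a `column_added` change type named by analogy with `column_removed`; check the real payload before relying on it.

```python
def pair_renames(changes):
    """Pair removed and added columns that share an asset and a data type.

    `changes` is a list of dicts with hypothetical keys:
    asset, change_type, column, data_type.
    """
    added = [c for c in changes if c["change_type"] == "column_added"]
    removed = [c for c in changes if c["change_type"] == "column_removed"]
    pairs = []
    for old in removed:
        for new in added:
            same_asset = new["asset"] == old["asset"]
            same_type = new["data_type"] == old["data_type"]
            if same_asset and same_type:
                pairs.append((old["asset"], old["column"], new["column"]))
                added.remove(new)  # each added column can match only once
                break
    return pairs


changes = [
    {"asset": "orders", "change_type": "column_removed", "column": "cust_id", "data_type": "integer"},
    {"asset": "orders", "change_type": "column_added", "column": "customer_id", "data_type": "integer"},
    {"asset": "orders", "change_type": "column_added", "column": "notes", "data_type": "text"},
    {"asset": "users", "change_type": "column_removed", "column": "email", "data_type": "varchar"},
]
print(pair_renames(changes))
# [('orders', 'cust_id', 'customer_id')]
```

The heuristic is deliberately conservative: `users.email` has no added counterpart, so it stays a genuine removal and should still be treated as breaking.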

---

<!-- Source: https://docs.anomalyarmor.ai/getting-started/for-analytics-engineers -->

# For Analytics Engineers

> Data quality metrics, freshness SLAs, and stakeholder reporting for analytics engineers

As an analytics engineer, you own the quality of data that powers business decisions. AnomalyArmor helps you catch data quality issues before they reach dashboards.

## Your Key Workflows

### Set Freshness SLAs

Stakeholders expect data to be current. Set SLAs to get alerted when data is stale. See [Freshness Monitoring](/data-quality/freshness-monitoring) for setup.

### Track Data Quality Metrics

Go beyond freshness. Monitor row counts, null percentages, and more:

```
Critical Table Monitoring:
+-- row_count: Catch unexpected data loss or growth
+-- null_percent on key columns: Detect incomplete data loads
+-- distinct_count on IDs: Identify duplicate records
```

[Create your first metric](/quickstart/create-first-metric)

### Report Quality to Stakeholders

Embed data quality status directly in dashboards:

```markdown
# Sales Dashboard
![Data Quality](https://app.anomalyarmor.ai/badges/abc123.svg)
```

When the badge is green, stakeholders know the data is reliable. [Set up report badges](/data-quality/report-badges)

### Document Tables with AI

Use Intelligence to auto-generate descriptions for your dbt models:

```python
from anomalyarmor import Client

client = Client()

# Generate descriptions for all columns
descriptions = client.intelligence.describe(
    "snowflake.prod.warehouse.orders"
)
# Paste into your dbt YAML files
```

[Use Intelligence](/intelligence/object-profiles)

## Recommended Setup

| Feature | Why You Need It |
|---------|-----------------|
| [Freshness Monitoring](/data-quality/freshness-monitoring) | Set SLAs on data recency |
| [Data Quality Metrics](/data-quality/metrics) | Track statistical properties over time |
| [Report Badges](/data-quality/report-badges) | Show quality status in dashboards |
| [Slack Alerts](/alerts/destinations/slack) | Get notified in your team channel |

## Common Tasks

- **[Set Up Freshness SLAs](/data-quality/freshness-monitoring)**: Define expected update frequency
- **[Create Metrics](/data-quality/metrics)**: Track row counts, null percentages, and more
- **[Embed Report Badges](/data-quality/report-badges)**: Show quality status in Looker, Tableau, Notion
- **[Object Profiles](/intelligence/object-profiles)**: AI-generated intelligence profiles for every table

## Common Questions

### How do I set freshness SLAs on my dbt models?

Identify the tables that feed executive or customer-facing dashboards, then set freshness thresholds per table (hourly, daily, weekly). Route alerts to your team's Slack channel. See [Freshness Monitoring](/data-quality/freshness-monitoring).
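
A freshness threshold boils down to comparing the time since the last update with an allowed window. The sketch below shows that comparison for the hourly, daily, and weekly tiers mentioned above; the exact window lengths and the function name are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumed window lengths for the three tiers.
SLA_WINDOWS = {
    "hourly": timedelta(hours=1),
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
}


def is_stale(last_updated, sla, now=None):
    """Return True if the last update is older than the SLA window."""
    now = now or datetime.now(timezone.utc)
    return now - last_updated > SLA_WINDOWS[sla]


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
last_load = datetime(2024, 1, 2, 10, 30, tzinfo=timezone.utc)
print(is_stale(last_load, "hourly", now))  # True
print(is_stale(last_load, "daily", now))   # False
```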

### Can I show data quality status inside my Looker or Tableau dashboards?

Yes. Embed a report badge that turns green or red based on freshness and quality monitors. Stakeholders can see at a glance whether the underlying data is trustworthy. See [Report Badges](/data-quality/report-badges).

### How do I generate column descriptions for my dbt YAML files?

Use the Intelligence API (`client.intelligence.describe(...)`) to auto-generate descriptions and paste them into your `schema.yml`. See [Object Profiles](/intelligence/object-profiles) for the full workflow.

### What metrics should I track on my most important tables?

Start with row count, null percentage on required columns, and distinct count on ID columns. Those three catch the majority of real-world quality issues (data loss, incomplete loads, duplicate inserts). See [Data Quality Metrics](/data-quality/metrics).
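
To see what these three metrics measure, here is a small in-memory sketch. In practice the metrics are computed against your warehouse; the function below only illustrates the definitions, and its name and arguments are hypothetical.

```python
def table_metrics(rows, required_column, id_column):
    """Compute row count, null percentage, and distinct ID count for a list of dict rows."""
    row_count = len(rows)
    nulls = sum(1 for row in rows if row.get(required_column) is None)
    null_percent = 100.0 * nulls / row_count if row_count else 0.0
    distinct_count = len({row.get(id_column) for row in rows})
    return {
        "row_count": row_count,
        "null_percent": null_percent,
        "distinct_count": distinct_count,
    }


rows = [
    {"order_id": 1, "customer_id": 10},
    {"order_id": 2, "customer_id": None},  # incomplete load
    {"order_id": 3, "customer_id": 12},
    {"order_id": 3, "customer_id": 12},    # duplicate insert
]
print(table_metrics(rows, required_column="customer_id", id_column="order_id"))
# {'row_count': 4, 'null_percent': 25.0, 'distinct_count': 3}
```

A `distinct_count` lower than `row_count` on an ID column points to duplicates, and a jump in `null_percent` on a required column points to an incomplete load.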

---

<!-- Source: https://docs.anomalyarmor.ai/getting-started/for-platform-teams -->

# For Platform Teams

> Multi-database monitoring, incident management, team administration, and API integration

As a platform team, you manage data infrastructure at scale. AnomalyArmor provides unified observability across all your databases with enterprise-grade controls.

## Your Key Workflows

### Monitor Multiple Databases

Connect all your data sources and monitor them from a single pane of glass.

### Integrate with Incident Management

Route critical alerts to your existing incident management:

| Platform | Integration |
|----------|-------------|
| PagerDuty | [Native integration](/alerts/destinations/pagerduty) |
| Opsgenie | Via webhook |
| ServiceNow | Via webhook |

Example PagerDuty routing:

```
Alert Rule: Schema Change (column removed)
Severity: Critical
Route to: PagerDuty → Data Platform On-Call
```

### Manage Team Access

Set up role-based access control:

| Role | Permissions |
|------|-------------|
| Admin | Full access, team management, API keys |
| Editor | Configure alerts, create metrics |
| Viewer | Read-only access to all data |

[Team management guide](/account/team-management)

### Build Custom Integrations

Use the API to integrate AnomalyArmor with your internal tools:

```python
from anomalyarmor import Client

client = Client(api_key="aa_live_xxx")

# Get all assets across all sources
assets = client.assets.list()

# Export to your internal data catalog
for asset in assets:
    sync_to_internal_catalog(asset)

# Monitor API key usage
keys = client.api_keys.list()
for key in keys:
    print(f"{key.name}: {key.last_used}")
```

[API documentation](/api/overview)

### Security and Compliance

Review our security posture:

- **Read-only access**: We only query metadata, never your data
- **Query validation**: Open-source Query Gateway validates all SQL
- **Encryption**: AES-256 at rest, TLS 1.3 in transit
- **Audit logs**: All administrative actions logged

[Security overview](/security/overview)

## Recommended Setup

| Feature | Why You Need It |
|---------|-----------------|
| [Multi-source monitoring](/data-sources/overview) | Unified view across all databases |
| [PagerDuty integration](/alerts/destinations/pagerduty) | Incident escalation |
| [Team management](/account/team-management) | Role-based access control |
| [API access](/api/overview) | Custom integrations |

## Common Tasks

- **[Add Data Source](/data-sources/overview)**: Connect PostgreSQL, Databricks, Snowflake, and more
- **[Configure PagerDuty](/alerts/destinations/pagerduty)**: Route critical alerts to on-call
- **[Manage Team](/account/team-management)**: Invite users and set permissions
- **[API Keys](/api/authentication)**: Create and manage API access

## Common Questions

### How many databases can I connect to AnomalyArmor?

Every plan supports multiple connections; the limit is set by plan capacity (total tables monitored), not by connection count. Enterprise supports unlimited tables and VPC peering / PrivateLink per source. See [Data Sources](/data-sources/overview).

### Can I route alerts to PagerDuty, Opsgenie, or ServiceNow?

PagerDuty has a native integration. Opsgenie and ServiceNow are supported via webhook destinations. You can route by severity so breaking changes page on-call while low-priority events go to Slack or email. See [PagerDuty integration](/alerts/destinations/pagerduty).

### What RBAC roles are available?

Viewer (read-only), Editor (configure alerts and metrics), Admin (full access plus team and API keys). SSO with SAML 2.0 and SCIM provisioning is available on Enterprise. See [Team management](/account/team-management).
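
The three roles can be pictured as nested permission sets. The sketch below uses hypothetical permission names and assumes that each role also includes read access; it illustrates the model and is not the product's permission list.

```python
# Hypothetical permission names; assumes every role can read.
ROLE_PERMISSIONS = {
    "viewer": {"read"},
    "editor": {"read", "configure_alerts", "create_metrics"},
    "admin": {
        "read",
        "configure_alerts",
        "create_metrics",
        "manage_team",
        "manage_api_keys",
    },
}


def can(role, permission):
    """Return True if the role grants the permission; unknown roles grant nothing."""
    return permission in ROLE_PERMISSIONS.get(role, set())


print(can("editor", "configure_alerts"))  # True
print(can("viewer", "create_metrics"))    # False
print(can("admin", "manage_api_keys"))    # True
```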

### Can I sync AnomalyArmor's asset inventory to our internal data catalog?

Yes. Use the REST API or Python SDK to pull the asset list (`client.assets.list()`) and push it into your catalog. Changes are available via schema-change webhooks so syncs stay current. See [API overview](/api/overview).

### How does AnomalyArmor fit our compliance requirements?

Read-only metadata access, AES-256 at rest, TLS 1.3 in transit, audit logs for every admin action, GDPR DPA and HIPAA BAA available. VPC peering and PrivateLink keep traffic off the public internet on Enterprise. See [Security Overview](/security/overview).

---

<!-- Source: https://docs.anomalyarmor.ai/guides/prevent-pipeline-failures -->

# Prevent Pipeline Failures

> Catch schema changes before they break your dbt models and ETL jobs

**Audience**: Data Engineers, Analytics Engineers

Schema changes are one of the most common causes of pipeline failures. A dropped column upstream can cascade into failed dbt runs, broken dashboards, and late-night debugging sessions.

This guide shows you how to use AnomalyArmor to catch schema changes before they impact your pipelines.

## The Problem

Without schema monitoring, you usually learn about an upstream schema change only after a dbt run or ETL job has already failed.

## The Solution

With AnomalyArmor, you'll know about schema changes before your pipelines run.

## Setup Guide

### Step 1: Connect Your Source Databases

Connect the databases that your pipelines read from, not just your warehouse.

**Common sources to monitor:**
- Production application databases (the ones your dbt reads from)
- Third-party data sources
- Shared data lakes

For each source, follow the [connection guide](/data-sources/overview).

### Step 2: Schedule Frequent Discovery

For pipeline-critical databases, run discovery frequently:

| Database Type | Recommended Schedule | Why |
|---------------|---------------------|-----|
| Application databases | Hourly | Changes can happen anytime |
| Shared warehouses | Every 6 hours | Less frequent changes |
| Third-party sources | Daily | Usually stable |

Configure in: **Data Sources → [Your Connection] → Settings → Discovery Schedule**

### Step 3: Create Breaking Change Alerts

Set up alerts specifically for changes that break pipelines:

**Rule: Breaking Schema Changes (Production)**

| Field | Value |
|-------|-------|
| **Event** | Schema Change Detected |
| **Data Source** | `production-app-db` |
| **Schema** | `public` |
| **Assets** | All (or list specific tables) |
| **Change Type** | Column Removed, Table Removed, Type Changed |
| **Destinations** | Slack `#data-engineering`, Email `data-team@company.com` |

### Step 4: Time Alerts Before Pipeline Runs

If your dbt job runs at 3 AM, schedule discovery at 2 AM so alerts arrive before the run starts.
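
The offset is simple clock arithmetic. The helper below is a hypothetical sketch that derives a discovery start time from a pipeline start time and a lead time; pick a lead time that comfortably covers how long discovery takes on your database.

```python
from datetime import datetime, timedelta


def discovery_start(pipeline_start, lead_minutes=60):
    """Return the HH:MM at which discovery should start, lead_minutes before the pipeline."""
    hours, minutes = map(int, pipeline_start.split(":"))
    # The date is arbitrary; only the time of day matters.
    start = datetime(2000, 1, 1, hours, minutes) - timedelta(minutes=lead_minutes)
    return start.strftime("%H:%M")


print(discovery_start("03:00"))      # 02:00
print(discovery_start("00:30", 45))  # 23:45
```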

## Advanced: Pre-dbt Validation

### Option 1: Webhook Integration

Use webhooks to fail your pipeline early if breaking changes are detected:

1. Set up a webhook destination in AnomalyArmor
2. Point it at a validation endpoint in your orchestrator
3. If the webhook fires, block the dbt run

The resulting flow:

1. **AnomalyArmor Alert** fires on schema change
2. **Webhook** sent to Airflow/Dagster
3. **Set flag**: `schema_changes_detected = true`
4. **dbt task** checks flag before running
5. **If flag = true**: Fail fast with meaningful error
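
The flag flow can be sketched with a file that the webhook handler writes and the dbt task checks. Everything here is illustrative: the payload keys (`asset_name`, `change_type`), the `table_removed` identifier, and the file-based flag are assumptions, and a real setup would more likely use an orchestrator variable or sensor.

```python
import json
import tempfile
from pathlib import Path

# Change types treated as breaking; "table_removed" is named by analogy.
BREAKING = {"column_removed", "table_removed", "type_changed"}


def handle_webhook(payload, flag_path):
    """Record a breaking change so the next pipeline run can see it."""
    if payload.get("change_type") in BREAKING:
        flag_path.write_text(json.dumps(payload))


def assert_no_breaking_changes(flag_path):
    """Call at the start of the dbt task; raises if a flag is present."""
    if flag_path.exists():
        change = json.loads(flag_path.read_text())
        raise RuntimeError(
            f"Breaking schema change on {change['asset_name']}: {change['change_type']}"
        )


with tempfile.TemporaryDirectory() as tmp:
    flag = Path(tmp) / "schema_changes_detected.json"
    handle_webhook({"asset_name": "public.orders", "change_type": "column_added"}, flag)
    assert_no_breaking_changes(flag)  # passes: an additive change sets no flag
    handle_webhook({"asset_name": "public.orders", "change_type": "column_removed"}, flag)
    try:
        assert_no_breaking_changes(flag)
    except RuntimeError as err:
        print(err)  # Breaking schema change on public.orders: column_removed
```

Clearing the flag (deleting the file) after the fix is deployed is left out of the sketch; without that step the pipeline stays blocked.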

### Option 2: Discovery Schedule Alignment

Align discovery with your orchestration schedule:

```python
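# In your Airflow DAG file
from airflow.operators.bash import BashOperator
# Newer HTTP provider releases rename this class to HttpOperator.
from airflow.providers.http.operators.http import SimpleHttpOperator

# Query the latest discovery result before dbt starts.
# As written, this task fails only on an HTTP error; pass a `response_check`
# callable to also fail when the response reports breaking changes.
discovery_check = SimpleHttpOperator(
    task_id='check_for_schema_changes',
    http_conn_id='anomalyarmor',
    endpoint='/api/v1/discoveries/latest',
    method='GET',
)

# dbt runs only if the discovery check task succeeds.
run_dbt = BashOperator(
    task_id='run_dbt',
    bash_command='dbt run',
)

discovery_check >> run_dbt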
```

## What to Do When Alerts Fire

### Immediate Actions

1. **Acknowledge the alert**: Let your team know you're investigating
2. **Check the change details**: View in AnomalyArmor: what changed, when, and on which asset
3. **Assess impact**: Which models/dashboards use this table?

### If the Change is Breaking

1. **Pause affected pipelines** (if possible before they run)
2. **Update your dbt models** to handle the change
3. **Test locally** with the new schema
4. **Deploy the fix** before the next scheduled run

### If the Change is Expected

1. **Document it**: Note in AnomalyArmor or your team wiki
2. **Update downstream**: Ensure all dependents are updated
3. **Consider communication**: Should you announce to stakeholders?

## Model Dependency Mapping

Know which models depend on which tables:

**Source Table: `production.orders`**

- `stg_orders` (staging model)
  - `int_orders_enriched` (intermediate)
    - `fct_orders` (fact table)
      - monthly_revenue (dashboard)
      - customer_lifetime_value (analytics)
  - `rpt_daily_orders` (report)
- `dim_order_status` (dimension)

When `production.orders` changes, all of these are potentially impacted.
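
The impact set is a graph traversal: start at the changed table and follow every dependency edge downstream. The sketch below hard-codes the example tree above as an adjacency map and walks it breadth-first; in a real project you would derive the edges from your dbt manifest or lineage tool.

```python
from collections import deque

# Edges taken from the example tree above: parent -> direct children.
CHILDREN = {
    "production.orders": ["stg_orders", "dim_order_status"],
    "stg_orders": ["int_orders_enriched", "rpt_daily_orders"],
    "int_orders_enriched": ["fct_orders"],
    "fct_orders": ["monthly_revenue", "customer_lifetime_value"],
}


def downstream(node):
    """Return every node reachable from `node`, in breadth-first order."""
    seen = []
    queue = deque(CHILDREN.get(node, []))
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.append(current)
            queue.extend(CHILDREN.get(current, []))
    return seen


print(len(downstream("production.orders")))  # 7
print(downstream("fct_orders"))
# ['monthly_revenue', 'customer_lifetime_value']
```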

> 💡 Run `dbt ls --select stg_orders+` to list `stg_orders` and everything downstream of it. A `+` after the model name selects descendants; a `+` before it selects upstream parents.

## Alert Configuration Examples

| Priority | Rule Name | Event | Scope | Conditions | Destinations |
|----------|-----------|-------|-------|------------|--------------|
| **High** | Revenue Table Changes | Schema Change | orders, payments, transactions | Any change | Slack #data-critical, PagerDuty |
| **Medium** | Dimension Table Changes | Schema Change | dim_*, *_lookup | Column removed or type changed | Slack #data-engineering |
| **Low** | External Source Changes | Schema Change | external.*, partner_* | Any change | Email (daily digest) |
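
The scope patterns in the table behave like shell-style globs. As an illustration of how such rules resolve, the sketch below matches a table name and change type against the three rules using Python's `fnmatch`; the destination labels and change type identifiers are hypothetical, and the product's own matching semantics may differ.

```python
from fnmatch import fnmatchcase

# (priority, scope patterns, change types or None for "any change", destinations)
RULES = [
    ("high", ["orders", "payments", "transactions"], None,
     ["slack:#data-critical", "pagerduty"]),
    ("medium", ["dim_*", "*_lookup"], {"column_removed", "type_changed"},
     ["slack:#data-engineering"]),
    ("low", ["external.*", "partner_*"], None,
     ["email:daily-digest"]),
]


def destinations_for(table, change_type):
    """Return the destinations of every rule whose scope and conditions match."""
    matched = []
    for _priority, patterns, change_types, destinations in RULES:
        in_scope = any(fnmatchcase(table, pattern) for pattern in patterns)
        if in_scope and (change_types is None or change_type in change_types):
            matched.extend(destinations)
    return matched


print(destinations_for("orders", "column_added"))
# ['slack:#data-critical', 'pagerduty']
print(destinations_for("dim_customer", "column_added"))
# []
print(destinations_for("dim_customer", "type_changed"))
# ['slack:#data-engineering']
```

The second call shows the noise reduction at work: an additive change to a dimension table matches the scope but not the conditions, so nothing is sent.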

## Troubleshooting

### Pipeline failed but I didn't get an alert

1. **Check discovery timing**: Did discovery run before the pipeline?
2. **Check scope**: Is the table included in the alert rule?
3. **Check conditions**: Does the change type match your conditions?
4. **Verify destination**: Is the destination configured correctly?

### Too many alerts for non-breaking changes

1. **Filter change types**: Alert only on `Column Removed`, `Table Removed`, `Type Changed`
2. **Exclude test schemas**: Filter out `test_*`, `dev_*`
3. **Separate environments**: Different rules for prod vs. staging

### Can't connect to production database

1. **Use a read replica**: Monitor the replica instead of primary
2. **Create a dedicated user**: With read-only permissions
3. **Check network access**: Firewall rules, security groups

## Checklist

Before going live:

- [ ] Connected all source databases that feed pipelines
- [ ] Discovery scheduled to run before pipeline runs
- [ ] Alert rules for breaking changes (column/table removed)
- [ ] Alerts routed to the right channel (data engineering team)
- [ ] Team knows what to do when alerts fire
- [ ] Documented critical table dependencies

## Common Questions

### How often should I run schema discovery for pipeline-critical tables?

Run discovery **hourly** for production application databases that feed dbt or ETL, every 6 hours for shared warehouses, and daily for stable third-party sources. The goal is to detect a change before the next pipeline run, so align the discovery schedule with your orchestrator. See [Schedule Frequent Discovery](#step-2-schedule-frequent-discovery).

### Which schema changes actually break dbt pipelines?

The high-severity ones are **column removed**, **table removed**, and **column type changed**. Additive changes (new columns, new tables) rarely break existing models. Scope your breaking-change alert rule to those three change types to cut alert noise without missing pipeline-breakers.

### Can AnomalyArmor block my dbt run if a breaking change is detected?

Yes, via a webhook destination. Point the webhook at an Airflow/Dagster sensor that sets a flag, then make your dbt task depend on the flag being clear. See [Option 1: Webhook Integration](#option-1-webhook-integration) for the pattern. This is the "fail fast with a meaningful error" flow that beats a 3 AM dbt compilation error.

### Should I monitor my source database or my warehouse?

Monitor both, but source databases are where most breaking changes originate: upstream teams drop columns without telling you. Connect the production application DBs your dbt project reads from, not just the warehouse you write into. See [Step 1](#step-1-connect-your-source-databases).

### Why am I getting too many alerts for non-breaking changes?

Tighten the change-type filter on the rule to **Column Removed**, **Table Removed**, and **Type Changed** only. Exclude `test_*` and `dev_*` schemas, and split prod and staging into separate rules with different destinations. See the [Troubleshooting section](#troubleshooting).

### How do I know which dbt models a source table feeds?

Upload your dbt manifest via the [lineage upload](/guides/lineage-upload) flow and AnomalyArmor's asset page will show downstream dependencies. You can also run `dbt ls --select stg_orders+` locally to list everything downstream of the staging model that reads the source table (a trailing `+` selects descendants, a leading `+` selects ancestors).

## Related Resources

- **[Schema Monitoring](/schema-monitoring/overview)**: Deep dive into schema change detection
- **[Alert Rules](/alerts/alert-rules)**: Configure alert conditions

---

<!-- Source: https://docs.anomalyarmor.ai/guides/oncall-data-alerting -->

# On-Call Data Alerting

> Set up production monitoring with proper escalation and on-call routing

**Audience**: Platform Teams, Data Platform, SRE

Data incidents need the same rigor as application incidents. This guide helps you set up 24/7 monitoring with proper escalation, on-call routing, and incident response.

## The Goal

## Architecture Overview

## Setting Up PagerDuty Integration

### Step 1: Create PagerDuty Service

In PagerDuty:

1. Go to **Services → New Service**
2. Name: `Data Observability - AnomalyArmor`
3. Integration: Select **Events API V2**
4. Copy the **Integration Key**

### Step 2: Add PagerDuty Destination in AnomalyArmor

1. Go to **Alerts → Destinations**
2. Click **Add Destination**
3. Select **PagerDuty**
4. Enter the Integration Key
5. Name: `PagerDuty - Data On-Call`
6. **Test** and **Save**

### Step 3: Configure Escalation Policy

In PagerDuty, set up escalation:

## Alert Urgency Framework

Define how urgently different data incidents need response:

### Critical (Page Immediately)

**Criteria:**
- Production data pipeline completely down
- Core revenue tables missing or stale >4 hours
- Discovery failures for >24 hours

**Examples:**
- Column removed from `orders` table
- `payments` table data >4 hours stale
- Can't connect to production database

**Destination:** PagerDuty → On-Call

### High (Respond Within 4 Hours)

**Criteria:**
- Important tables stale (1-4 hours)
- Schema changes in production
- Non-critical discovery failures

**Examples:**
- Column type changed in production
- Analytics tables 2 hours stale
- Staging discovery failed

**Destination:** Slack #data-incidents

### Medium (Respond Within 24 Hours)

**Criteria:**
- Non-production schema changes
- Warning thresholds reached
- New assets discovered

**Examples:**
- Staging schema changed
- Freshness approaching SLA (warning)
- New table discovered in production

**Destination:** Slack #data-alerts

### Low (Informational)

**Criteria:**
- Development changes
- Expected changes
- Routine discoveries

**Destination:** Email digest (daily)
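
The framework above can be expressed as a small decision function, which is handy for reviewing whether the tiers overlap or leave gaps. This is a simplified sketch: the `Incident` fields, the `urgency` helper, and the treatment of anything stale for under an hour as a warning are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    environment: str          # "production", "staging", or "development"
    kind: str                 # "schema_change", "freshness", or "discovery_failure"
    hours: float = 0.0        # hours stale, or hours discovery has been failing
    breaking: bool = False    # column or table removed
    core_table: bool = False  # e.g. orders, payments


def urgency(incident: Incident) -> str:
    """Map an incident to Critical / High / Medium / Low per the framework above."""
    if incident.environment == "development":
        return "Low"
    if incident.environment != "production":
        # Non-production: discovery failures are High, everything else Medium.
        return "High" if incident.kind == "discovery_failure" else "Medium"
    if incident.kind == "schema_change":
        return "Critical" if incident.breaking else "High"
    if incident.kind == "freshness":
        if incident.core_table and incident.hours > 4:
            return "Critical"
        return "High" if incident.hours >= 1 else "Medium"
    if incident.kind == "discovery_failure":
        return "Critical" if incident.hours > 24 else "High"
    return "Medium"


print(urgency(Incident("production", "freshness", hours=5, core_table=True)))
```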

## Alert Rule Configuration

### Rule 1: Critical - Production Breaking Changes

| Field | Value |
|-------|-------|
| **Name** | CRITICAL - Production Breaking Changes |
| **Event** | Schema Change Detected |
| **Data source** | `production-*` |
| **Schema** | `public`, `analytics` |
| **Change type** | Column Removed, Table Removed |
| **Destinations** | PagerDuty (Data On-Call), Slack `#data-incidents` |

### Rule 2: Critical - Revenue Table Freshness

| Field | Value |
|-------|-------|
| **Name** | CRITICAL - Revenue Data Stale |
| **Event** | Freshness Violation |
| **Assets** | `orders`, `payments`, `revenue_*` |
| **SLA exceeded by** | >4 hours |
| **Destinations** | PagerDuty (Data On-Call), Slack `#data-incidents` |

### Rule 3: High - Production Schema Changes

| Field | Value |
|-------|-------|
| **Name** | HIGH - Production Schema Changes |
| **Event** | Schema Change Detected |
| **Data source** | `production-*` |
| **Change type** | All |
| **Destinations** | Slack `#data-incidents` |

### Rule 4: High - Data Freshness Violations

| Field | Value |
|-------|-------|
| **Name** | HIGH - Data Freshness Violations |
| **Event** | Freshness Violation |
| **Data source** | `production-*` |
| **Condition** | SLA exceeded |
| **Destinations** | Slack `#data-incidents` |

### Rule 5: High - Discovery Failures

| Field | Value |
|-------|-------|
| **Name** | HIGH - Discovery Failures |
| **Event** | Discovery Failed |
| **Data source** | `production-*` |
| **Destinations** | Slack `#data-incidents`, Email `data-platform@company.com` |

## On-Call Runbook

### When Paged for Schema Change

### When Paged for Freshness Violation

1. **ACKNOWLEDGE** the alert

2. **CHECK ETL STATUS**
   - Is the ETL job running? Failed? Stuck?
   - Check Airflow/Dagster/orchestrator

3. **CHECK SOURCE SYSTEM**
   - Is the source database accessible?
   - Is source data actually updating?

4. **IDENTIFY ROOT CAUSE**
   - ETL failure → Fix and restart
   - Source delay → Communicate delay
   - Connection issue → Troubleshoot connection

5. **MITIGATE**
   - Restart failed jobs
   - Notify stakeholders of delay

6. **RESOLVE** and document

## Slack Integration Best Practices

### Channel Setup

**Slack Channels:**
- `#data-incidents` - Breaking changes (notifications on)
- `#data-alerts` - All schema changes (lower priority)
- `#data-digest` - Daily/weekly summaries

### Alert Message Format

AnomalyArmor alerts include:

```
🔴 CRITICAL: Schema Change Detected

Asset: production.public.orders
Change: Column removed - shipping_status (varchar)

Detected: Today at 3:15 PM UTC
Discovery Run: #12345

Impact: High - This table is used by 5 downstream models

Actions:
• [View in AnomalyArmor]
• [View Asset Details]
• [View Downstream Dependencies]

On-Call: @data-oncall
```

## Maintenance Windows

### Scheduled Maintenance

Before planned changes:

1. Go to **Alerts → Rules**
2. Toggle OFF relevant rules
3. Set a reminder to re-enable (e.g., calendar event)
4. Proceed with maintenance
5. Verify changes detected correctly
6. Toggle rules back ON

### Quick Disable

For unexpected but known issues, quickly disable a rule:

1. Go to **Alerts → Rules**
2. Find the rule
3. Toggle it **OFF**
4. Remember to re-enable when the issue is resolved

## Metrics to Track

| Metric | Target | How to Measure |
|--------|--------|----------------|
| MTTD (Mean Time to Detect) | < 1 hour | Discovery frequency |
| MTTN (Mean Time to Notify) | < 5 min | Alert → PagerDuty time |
| MTTR (Mean Time to Resolve) | < 4 hours | Alert → Resolution time |
| False Positive Rate | < 20% | Alerts ignored / Total alerts |
| Pager Load | < 5/week | Critical alerts per week |

Review these weekly in your on-call handoff.
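
As a rough illustration of the arithmetic behind three of these metrics, the sketch below computes MTTR, false positive rate, and pager load from a week of alert records. The record fields (`fired`, `resolved`, `severity`, `ignored`) and the sample data are hypothetical, not an AnomalyArmor export format.

```python
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical alert records for one week; field names are illustrative only.
alerts = [
    {"fired": datetime(2024, 1, 8, 3, 0), "resolved": datetime(2024, 1, 8, 5, 0),
     "severity": "critical", "ignored": False},
    {"fired": datetime(2024, 1, 9, 14, 0), "resolved": datetime(2024, 1, 9, 15, 0),
     "severity": "high", "ignored": True},
    {"fired": datetime(2024, 1, 11, 9, 0), "resolved": datetime(2024, 1, 11, 12, 0),
     "severity": "critical", "ignored": False},
    {"fired": datetime(2024, 1, 12, 16, 0), "resolved": datetime(2024, 1, 12, 18, 0),
     "severity": "high", "ignored": False},
]


def weekly_metrics(records):
    """Compute MTTR (hours), false positive rate, and pager load for one week."""
    hours = [(r["resolved"] - r["fired"]) / timedelta(hours=1) for r in records]
    return {
        "mttr_hours": mean(hours),
        # Alerts ignored / total alerts, as defined in the table above.
        "false_positive_rate": sum(r["ignored"] for r in records) / len(records),
        # Critical alerts are the ones routed to the pager.
        "pager_load": sum(r["severity"] == "critical" for r in records),
    }


metrics = weekly_metrics(alerts)
print(metrics)
```

With this sample data the false positive rate comes out above the 20% target, which is the kind of signal the weekly review is meant to surface.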

## Checklist

Before going live with on-call alerting:

- [ ] PagerDuty integration configured
- [ ] Escalation policy set up
- [ ] Critical/High/Medium/Low rules defined
- [ ] Slack channels created and configured
- [ ] On-call runbook documented
- [ ] Team trained on response procedures
- [ ] Test alert sent and verified

## Common Questions

### How do I page my on-call engineer when data breaks?

Create a PagerDuty service with an **Events API V2** integration, copy the integration key, and add a PagerDuty destination in **Alerts → Destinations**. Then route only your Critical rules (breaking schema changes, revenue-table freshness >4h) to that destination. See [Setting Up PagerDuty Integration](#setting-up-pagerduty-integration).

### Which data incidents should actually page someone?

Page on production pipelines being completely down, core revenue tables stale for more than 4 hours, or discovery failures lasting over 24 hours. Everything else should go to Slack, not PagerDuty, to protect on-call from alert fatigue. See the [Alert Urgency Framework](#alert-urgency-framework).

### How do I suppress alerts during planned maintenance?

Go to **Alerts → Rules** and toggle off the relevant rules before the maintenance window, then re-enable after. Set a calendar reminder so rules don't stay off indefinitely. For recurring windows, use operating schedules and blackouts in the contract config instead.

### What metrics should I track for data on-call health?

MTTD (under 1 hour, driven by discovery frequency), MTTN (under 5 minutes from alert to page), MTTR (under 4 hours), false-positive rate (under 20%), and pager load (under 5 critical alerts per week). Review these weekly in your on-call handoff. See [Metrics to Track](#metrics-to-track).

### Can I send different alerts to different Slack channels?

Yes. Create separate destinations for `#data-incidents` (breaking changes), `#data-alerts` (all schema changes), and `#data-digest` (daily summaries), then route each alert rule by severity. That keeps high-signal alerts out of the noisy firehose and stops people from muting the wrong channel.

## Related Resources

- **[PagerDuty Setup](/alerts/destinations/pagerduty)**: Detailed PagerDuty integration guide
- **[Alert Best Practices](/alerts/best-practices)**: Reduce alert fatigue

---

<!-- Source: https://docs.anomalyarmor.ai/guides/compliance-monitoring -->

# Compliance Monitoring

> Track data lineage and changes for audit and compliance requirements

**Audience**: Data Governance, Compliance Teams, DPOs

Regulatory frameworks like GDPR, SOX, HIPAA, and industry standards require organizations to demonstrate control over their data. AnomalyArmor provides the audit trails and change documentation needed for compliance.

## Compliance Use Cases

| Requirement | How AnomalyArmor Helps |
|-------------|------------------------|
| **Data Lineage** | Track what data exists and where |
| **Change Management** | Document all schema modifications |
| **Audit Trails** | Immutable logs of detected changes |
| **Access Documentation** | Record what AnomalyArmor can see |
| **Data Classification** | Tag PII and sensitive data |

## Schema Change Documentation

### Automatic Change History

Every schema change is automatically recorded:

```
Schema History: customers table
────────────────────────────────

2024-01-15 08:00:00 UTC
Change: Column removed - ssn (varchar)
Detected by: Discovery Run #1234
Details: Column no longer exists in source

2024-01-10 14:30:00 UTC
Change: Column added - gdpr_consent (boolean)
Detected by: Discovery Run #1200
Details: New column discovered

2024-01-05 09:15:00 UTC
Change: Column type changed - email
From: varchar(100)
To: varchar(255)
Detected by: Discovery Run #1150
```

### Accessing Change History

1. Navigate to **Assets**
2. Click on any asset
3. Select **Schema History** tab
4. Filter by date range
5. Export for auditors

### Exporting for Audits

Export schema change history:

1. Go to **Assets → [Asset] → Schema History**
2. Click **Export**
3. Select format: CSV, JSON, or PDF
4. Choose date range
5. Download

The export includes:
- Asset name and location
- Change type and details
- Detection timestamp
- Discovery run ID
- Before/after values

## Data Classification

### Tagging Sensitive Data

Identify and tag PII and sensitive columns:

1. Navigate to an asset
2. Click **Classification** tab
3. Select columns to classify
4. Apply tags:
   - `PII` - Personally Identifiable Information
   - `Financial` - Financial data
   - `Health` - Health/medical data
   - `Confidential` - Internal confidential
   - Create custom tags for your organization's specific requirements

### Classification Alerts

Alert when classified columns change:

| Field | Value |
|-------|-------|
| **Rule** | PII Column Changes |
| **Event** | Schema Change Detected |
| **Classification** | PII |
| **Conditions** | Any change |
| **Destinations** | Email compliance@company.com, Slack #data-governance |

### Classification Report

Generate a report of all classified data:

1. Go to **Assets**
2. Filter by Classification
3. Export filtered results

Report includes:
- All assets with classified columns
- Classification tags applied
- Column details
- Last change date

## Audit Trail Access

### What's Logged

AnomalyArmor maintains audit logs for:

| Activity | Logged Details |
|----------|---------------|
| Discovery runs | Start time, end time, assets found, changes detected |
| Schema changes | Before/after state, detection time |
| Configuration changes | Who changed what, when |
| User actions | Login, logout, settings changes |
| Alert activity | Alerts fired, acknowledged, resolved |

### Accessing Audit Logs

**In the UI:**
1. Go to **Settings → Audit Log**
2. Filter by date, user, or activity type
3. Export for compliance review

**Via API:**
```bash
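# List schema-change audit events for January 2024.
# Host and bearer auth follow the other API examples in these docs; -G sends the data as a query string.
curl -G "https://api.anomalyarmor.ai/api/v1/audit-logs" \
  -H "Authorization: Bearer $ARMOR_API_KEY" \
  -d "start_date=2024-01-01" \
  -d "end_date=2024-01-31" \
  -d "activity_type=schema_change"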
```
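
For recurring exports, the date range can be derived rather than typed by hand. The sketch below builds the query URL for any calendar month using the three parameters shown above. The `monthly_audit_log_url` helper is hypothetical, and the host is assumed from the other API examples in these docs.

```python
import calendar
from urllib.parse import urlencode

# Assumed base URL, taken from the other API examples in these docs.
BASE_URL = "https://api.anomalyarmor.ai"


def monthly_audit_log_url(year: int, month: int, activity_type: str) -> str:
    """Build the audit-log query URL covering one full calendar month."""
    last_day = calendar.monthrange(year, month)[1]  # handles leap years
    query = urlencode({
        "start_date": f"{year:04d}-{month:02d}-01",
        "end_date": f"{year:04d}-{month:02d}-{last_day:02d}",
        "activity_type": activity_type,
    })
    return f"{BASE_URL}/api/v1/audit-logs?{query}"


print(monthly_audit_log_url(2024, 2, "schema_change"))
```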

### Log Retention

| Plan | Retention Period |
|------|------------------|
| Standard | 90 days |
| Enterprise | Configurable (up to 7 years) |

Contact support for extended retention requirements.

## Compliance Frameworks

### GDPR

| GDPR Requirement | AnomalyArmor Support |
|-----------------|----------------------|
| **Art. 30 - Records of Processing** | Asset catalog documents data locations |
| **Art. 32 - Security of Processing** | Change monitoring detects unauthorized modifications |
| **Art. 33 - Breach Notification** | Alerts can notify of potential data issues |
| **Art. 17 - Right to Erasure** | Track when PII columns are removed |

**Recommended Setup:**
- Tag all PII columns
- Alert on any PII column changes
- Export monthly PII inventory reports

### SOX

| SOX Requirement | AnomalyArmor Support |
|-----------------|----------------------|
| **Section 302 - Financial Controls** | Monitor financial data tables |
| **Section 404 - Internal Controls** | Document all schema changes |
| **Change Management** | Audit trail of modifications |

**Recommended Setup:**
- Monitor all financial data tables
- Alert on any schema changes to financial data
- Weekly export of change reports

### HIPAA

| HIPAA Requirement | AnomalyArmor Support |
|------------------|----------------------|
| **Access Controls** | Document what systems access PHI |
| **Audit Controls** | Log all data access and changes |
| **Integrity Controls** | Detect unauthorized modifications |

**Recommended Setup:**
- Tag all PHI columns
- Alert immediately on PHI changes
- Enable extended audit log retention

## Reporting for Auditors

### Monthly Compliance Report

Generate monthly reports showing:

1. **Asset Inventory**: All monitored tables and columns
2. **Change Summary**: Schema changes in the period
3. **Classification Status**: PII and sensitive data inventory
4. **Alert Summary**: Alerts fired and responses

### Quarterly Review

Prepare for quarterly reviews:

1. Export schema change history for quarter
2. Export classification inventory
3. Export audit logs
4. Document any incidents and responses
5. Review and update classifications

### Annual Audit Package

For annual audits, compile:

- [ ] Complete asset inventory
- [ ] 12 months of schema change history
- [ ] Classification inventory with evidence
- [ ] Audit logs for the year
- [ ] Incident response documentation
- [ ] Evidence of regular monitoring (alert history)

## Alert Rules for Compliance

### PII Monitoring

| Field | Value |
|-------|-------|
| **Rule** | PII Data Changes |
| **Event** | Schema Change Detected |
| **Classification** | PII, Personal Data |
| **Conditions** | Any change |
| **Destinations** | Email dpo@company.com, Email compliance@company.com, Slack #data-governance |

### Financial Data Monitoring

| Field | Value |
|-------|-------|
| **Rule** | Financial Data Changes |
| **Event** | Schema Change Detected |
| **Scope** | Classification: Financial, OR Assets: `*revenue*`, `*payment*`, `*transaction*` |
| **Conditions** | Any change |
| **Destinations** | Email finance-compliance@company.com, Slack #finance-data |

### Unauthorized Access Detection

| Field | Value |
|-------|-------|
| **Rule** | Discovery Failures |
| **Event** | Discovery Failed |
| **Scope** | All data sources |
| **Destinations** | Email security@company.com, Slack #security-alerts |

> ℹ️ Repeated failures may indicate permission changes or unauthorized access attempts.

## Best Practices

### Documentation

1. **Tag everything sensitive**: Don't miss PII or financial columns
2. **Regular reviews**: Quarterly review of classifications
3. **Export regularly**: Don't wait for audits to export data

### Monitoring

1. **Alert on all sensitive data changes**: Better to know than miss
2. **Include compliance team on alerts**: They need visibility
3. **Document incident responses**: Keep records of how you responded

### Retention

1. **Extended retention for regulated data**: Match your regulatory requirements
2. **Backup exports**: Keep copies outside AnomalyArmor
3. **Immutable storage**: Use write-once storage for audit exports

## Checklist

For compliance readiness:

- [ ] All databases connected and discovered
- [ ] PII columns identified and tagged
- [ ] Financial data identified and tagged
- [ ] Health data identified and tagged (if applicable)
- [ ] Alert rules for classified data changes
- [ ] Compliance team added to alerts
- [ ] Export schedule established
- [ ] Retention policy configured
- [ ] Audit log access documented

## Common Questions

### How does AnomalyArmor help with GDPR compliance?

AnomalyArmor supports GDPR Articles 30, 32, and 33 by cataloging where personal data lives, tracking every schema change, and alerting on modifications to **PII**-tagged columns. Tag PII columns in the **Classification** tab and create an alert rule that routes any change to your DPO. See the [GDPR table](#gdpr) for the full mapping.

### How long are audit logs retained?

Standard plans retain audit logs for **90 days**. Enterprise plans support configurable retention up to **7 years** to match regulatory requirements like SOX and HIPAA. Contact support if you need extended retention beyond the default.

### Can I export schema change history for an auditor?

Yes. Open the asset's **Schema History** tab, click **Export**, and choose CSV, JSON, or PDF over any date range. The export includes change type, detection timestamp, discovery run ID, and before/after values, which is usually enough to satisfy a SOX or HIPAA change-management review.

### How do I tag PII columns so I get alerted when they change?

In the asset view, open the **Classification** tab, select the columns, and apply the built-in `PII` tag (or a custom tag). Then create an alert rule scoped to that classification with event `Schema Change Detected` and route it to your compliance channel. Repeat for `Financial`, `Health`, or custom classifications.

### Does AnomalyArmor store the actual PII values?

No. AnomalyArmor reads metadata and schema, not row data. We record that a column named `ssn` existed and was removed, never the values inside it. That keeps the audit trail useful without creating a new compliance surface to defend.

## Related Resources

- **[Data Classification](/data-classification/overview)**: Tagging and classification features
- **[Security Overview](/security/overview)**: Security and compliance documentation

---

<!-- Source: https://docs.anomalyarmor.ai/guides/lineage-upload -->

# Upload Lineage Data

> Import data lineage from dbt manifests, dbt Cloud, or manual definitions

AnomalyArmor can visualize how your data flows from source to destination, but it needs lineage data to work with. This guide covers three ways to get lineage into AnomalyArmor:

1. **Upload a dbt manifest.json** file (most common)
2. **Sync from dbt Cloud** automatically
3. **Define lineage manually** via the API

## Option 1: Upload a dbt manifest.json

If you use dbt, the fastest way to populate lineage is uploading your `manifest.json` file. This file contains your full DAG, including all models, sources, seeds, and their dependencies.

### Generate the manifest

Run one of these dbt commands to produce `target/manifest.json`:

```bash
# Any of these generates a manifest.json in target/
dbt parse      # Fastest, parses without compiling
dbt compile    # Compiles SQL, also generates manifest
dbt run        # Full run, also generates manifest
```

### Upload via the API
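
The upload is a multipart `POST` to the asset's `lineage/upload` endpoint with the manifest in a form field named `file`, as in the CI/CD example later in this guide. The Python sketch below builds that request with the standard library only. The `build_upload_request` helper is a hypothetical name, and the request is constructed but not sent.

```python
import uuid
from urllib.request import Request

API_BASE = "https://api.anomalyarmor.ai/api/v1"


def build_upload_request(asset_id: str, api_key: str, manifest_bytes: bytes) -> Request:
    """Build (but do not send) a multipart POST carrying manifest.json."""
    boundary = uuid.uuid4().hex
    body = b"\r\n".join([
        f"--{boundary}".encode(),
        b'Content-Disposition: form-data; name="file"; filename="manifest.json"',
        b"Content-Type: application/json",
        b"",
        manifest_bytes,
        f"--{boundary}--".encode(),
        b"",
    ])
    return Request(
        f"{API_BASE}/assets/{asset_id}/lineage/upload",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )


upload = build_upload_request("550e8400-e29b-41d4-a716-446655440000", "aa_live_xxx", b"{}")
print(upload.get_method(), upload.full_url)
```

To send it, pass the request to `urllib.request.urlopen` after reading `target/manifest.json` in binary mode.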

### What gets imported

AnomalyArmor parses the `nodes` and `parent_map` from your manifest to extract:

- **Models** (transformations in your dbt project)
- **Sources** (raw tables dbt reads from)
- **Seeds** (CSV files loaded by dbt)
- **Parent-child relationships** between all of the above

The `sync_to_catalog` parameter (default: `true`) also triggers an asset discovery job so your dbt models appear as assets in the catalog.
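
To preview the relationships a manifest contains before uploading it, you can flatten `parent_map` yourself. The sketch below uses a tiny hand-written stand-in for `target/manifest.json`. The `lineage_edges` helper is hypothetical and only illustrates the shape of the data, not how AnomalyArmor parses it internally.

```python
import json

# A tiny stand-in for target/manifest.json; real manifests contain many more keys.
manifest = json.loads("""
{
  "nodes": {
    "model.my_analytics.stg_orders": {"resource_type": "model"},
    "model.my_analytics.fct_orders": {"resource_type": "model"},
    "seed.my_analytics.country_codes": {"resource_type": "seed"}
  },
  "parent_map": {
    "model.my_analytics.stg_orders": ["source.my_analytics.production.orders"],
    "model.my_analytics.fct_orders": [
      "model.my_analytics.stg_orders",
      "seed.my_analytics.country_codes"
    ],
    "seed.my_analytics.country_codes": []
  }
}
""")


def lineage_edges(manifest: dict) -> list[tuple[str, str]]:
    """Flatten parent_map into (parent, child) pairs."""
    return [
        (parent, child)
        for child, parents in manifest["parent_map"].items()
        for parent in parents
    ]


edges = lineage_edges(manifest)
for parent, child in edges:
    print(f"{parent} -> {child}")
```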

### Response

```json
{
  "data": {
    "asset_id": "550e8400-e29b-41d4-a716-446655440000",
    "sync_stats": {
      "nodes_created": 42,
      "nodes_updated": 8,
      "edges_created": 67,
      "edges_updated": 3
    },
    "manifest_metadata": {
      "generated_at": "2025-03-15T10:30:00Z",
      "dbt_version": "1.7.4",
      "project_name": "my_analytics"
    },
    "catalog_sync_job_id": "job-uuid-here"
  }
}
```

### Automate uploads in CI/CD

Add a manifest upload step after your dbt run completes:

```yaml
# .github/workflows/dbt.yml
jobs:
  dbt-run:
    steps:
      - name: Run dbt
        run: dbt run

      - name: Upload lineage to AnomalyArmor
        run: |
          curl -X POST \
            "https://api.anomalyarmor.ai/api/v1/assets/$ASSET_ID/lineage/upload" \
            -H "Authorization: Bearer $ARMOR_API_KEY" \
            -F "file=@target/manifest.json"
        env:
          ARMOR_API_KEY: ${{ secrets.ARMOR_API_KEY }}
          ASSET_ID: ${{ vars.ARMOR_ASSET_ID }}
```

## Option 2: Sync from dbt Cloud

If you use dbt Cloud, AnomalyArmor can fetch the manifest directly from your dbt Cloud account. No file upload needed.

### Finding your dbt Cloud credentials

| Parameter | Where to find it |
|-----------|-----------------|
| `account_id` | dbt Cloud URL: the number after `/deploy/`, e.g. `cloud.getdbt.com/deploy/12345/...` |
| `api_token` | dbt Cloud > Account Settings > API Access > Service tokens |
| `job_id` | dbt Cloud > Jobs > select your job > ID in the URL |

> ℹ️ Use a **service token** with at least the "Read artifacts" permission. Personal tokens work but are tied to individual users.
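
The sync is triggered with `POST /api/v1/assets/{asset_id}/lineage/dbt-cloud/sync`, passing the three credentials from the table above. The sketch below builds that request in Python. Sending the credentials as a JSON body is an assumption, the `build_dbt_cloud_sync_request` helper is a hypothetical name, and the request is constructed but not sent.

```python
import json
from urllib.request import Request

API_BASE = "https://api.anomalyarmor.ai/api/v1"


def build_dbt_cloud_sync_request(
    asset_id: str, api_key: str, account_id: str, api_token: str, job_id: str
) -> Request:
    """Build (but do not send) the dbt Cloud sync request."""
    # Assumption: the three parameters are sent as a JSON object.
    payload = {"account_id": account_id, "api_token": api_token, "job_id": job_id}
    return Request(
        f"{API_BASE}/assets/{asset_id}/lineage/dbt-cloud/sync",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


sync = build_dbt_cloud_sync_request("asset-123", "aa_live_xxx", "12345", "dbtc_xxx", "678")
print(sync.get_method(), sync.full_url)
```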

## Option 3: Define lineage manually

For data sources that are not managed by dbt, you can define lineage nodes and edges directly via the API.

### Create a lineage node

```bash
curl -X POST \
  "https://api.anomalyarmor.ai/api/v1/assets/{asset_id}/lineage/nodes" \
  -H "Authorization: Bearer aa_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "unique_id": "source.crm.customers",
    "name": "customers",
    "resource_type": "source",
    "schema": "crm",
    "database": "production"
  }'
```

### Create a lineage edge

```bash
curl -X POST \
  "https://api.anomalyarmor.ai/api/v1/assets/{asset_id}/lineage/edges" \
  -H "Authorization: Bearer aa_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "parent_unique_id": "source.crm.customers",
    "child_unique_id": "model.analytics.dim_customers",
    "relationship_type": "derives_from"
  }'
```
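
When a non-dbt pipeline has many relationships, generating the edge bodies from a simple mapping avoids writing each one by hand. The sketch below produces one payload per parent-child pair, using the same three fields as the request above. The `edge_payloads` helper and the `fct_signups` model are hypothetical.

```python
import json


def edge_payloads(parent_to_children: dict, relationship_type: str = "derives_from") -> list:
    """Return one edge request body per (parent, child) pair in the mapping."""
    return [
        {
            "parent_unique_id": parent,
            "child_unique_id": child,
            "relationship_type": relationship_type,
        }
        for parent, children in parent_to_children.items()
        for child in children
    ]


payloads = edge_payloads({
    "source.crm.customers": [
        "model.analytics.dim_customers",
        "model.analytics.fct_signups",  # hypothetical second child
    ],
})
for payload in payloads:
    print(json.dumps(payload))  # each line is a body for the edges endpoint
```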

## Updating lineage

When your dbt project changes, re-upload the manifest. AnomalyArmor reconciles the new manifest with the existing graph:

- New nodes and edges are created
- Existing nodes are updated with new metadata
- Relationships that no longer exist are removed

For a clean reset, delete all lineage from a source first:

```bash
curl -X DELETE \
  "https://api.anomalyarmor.ai/api/v1/assets/{asset_id}/lineage/source/dbt" \
  -H "Authorization: Bearer aa_live_xxx"
```

## Limits

| Constraint | Value |
|-----------|-------|
| Max manifest file size | 50 MB |
| File format | JSON (UTF-8 encoded) |
| Required manifest keys | `nodes`, `parent_map` |
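
These limits can be checked locally before an upload, for example as a CI step. The sketch below is one way to do that. The `manifest_problems` helper is hypothetical, and reading 50 MB as 50 × 1024 × 1024 bytes is an assumption.

```python
import json
import os
import tempfile

MAX_BYTES = 50 * 1024 * 1024  # 50 MB limit; binary megabytes are an assumption
REQUIRED_KEYS = {"nodes", "parent_map"}


def manifest_problems(path: str) -> list[str]:
    """Return reasons the file would be rejected (empty list if none found)."""
    if os.path.getsize(path) > MAX_BYTES:
        return ["file is larger than 50 MB"]
    try:
        with open(path, encoding="utf-8") as handle:
            manifest = json.load(handle)
    except UnicodeDecodeError:
        return ["file is not UTF-8 encoded"]
    except json.JSONDecodeError:
        return ["file is not valid JSON"]
    if not isinstance(manifest, dict):
        return ["top-level JSON value is not an object"]
    missing = sorted(REQUIRED_KEYS - set(manifest))
    return [f"missing required key: {key}" for key in missing]


# Demo with a throwaway file that lacks parent_map.
with tempfile.TemporaryDirectory() as folder:
    demo_path = os.path.join(folder, "manifest.json")
    with open(demo_path, "w", encoding="utf-8") as handle:
        json.dump({"nodes": {}}, handle)
    problems = manifest_problems(demo_path)
print(problems)
```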

## Common Questions

### Where do I find my dbt `manifest.json` file?

It lives under `target/manifest.json` after you run `dbt parse`, `dbt compile`, or `dbt run`. `dbt parse` is the fastest option because it does not compile SQL or hit your warehouse. See [Generate the manifest](#generate-the-manifest).

### Do I need dbt to upload lineage?

No. Option 3 lets you [define lineage manually](#option-3-define-lineage-manually) via the API by creating nodes and edges directly. This is how you model lineage for sources outside dbt, like Fivetran pipelines, Airflow tasks, or bespoke ETL jobs.

### How do I sync lineage from dbt Cloud instead of uploading a file?

Call `POST /api/v1/assets/{asset_id}/lineage/dbt-cloud/sync` with your dbt Cloud `account_id`, `api_token`, and `job_id`. AnomalyArmor pulls the latest manifest from the run artifacts directly. Use a **service token** with "Read artifacts" permission, not a personal token. See [Option 2](#option-2-sync-from-dbt-cloud).

### How often should I re-upload the manifest?

Re-upload any time your dbt project changes, typically as a CI step after `dbt run` in your deploy pipeline. AnomalyArmor diffs the new manifest against the existing graph: new nodes and edges are created, existing nodes are refreshed with new metadata, and relationships that no longer exist are removed.

### What is the max manifest size AnomalyArmor accepts?

**50 MB**, UTF-8 encoded JSON, with `nodes` and `parent_map` keys required. Most dbt projects fit well under this; if you're hitting it, check whether you're accidentally uploading a `run_results.json` instead of `manifest.json`.

## Next steps

- **[Query Lineage](/api/lineage)**: Explore upstream and downstream dependencies
- **[Impact Analysis](/api/lineage#use-case-impact-analysis)**: Check downstream impact before schema changes
- **[dbt Integration](/integrations/dbt)**: Add quality gates to your dbt workflows
- **[AI Agent: Lineage](/ai-agents/skills/lineage)**: Ask natural language questions about data flow

---

<!-- Source: https://docs.anomalyarmor.ai/account/profile -->

# Profile Settings

> Update your name, email, and account preferences

Manage your personal account settings including display name, email, and security.

## Updating Your Profile

1. Click your avatar in the top right corner
2. Select **Settings**
3. Update any of the following:

| Field | Description |
|-------|-------------|
| **Display Name** | How your name appears to teammates |
| **Email** | Your login email (requires verification if changed) |
| **Avatar** | Upload an image or use your Gravatar |
| **Timezone** | Used for displaying timestamps |

4. Click **Save Changes**

You'll see "Profile updated" confirmation at the top of the page.

## Changing Your Email

Changing your email requires verification:

1. Enter your new email address
2. Click **Save Changes**
3. Check your new email for a verification link
4. Click the link to confirm

Until verified, you'll continue using your old email to log in.

## Notification Preferences

Control which emails you receive:

1. Go to **Settings** > **Notifications**
2. Toggle notifications on/off:

| Notification | Default | Description |
|--------------|---------|-------------|
| Alert emails | On | Email when alerts fire (if email destination configured) |
| Weekly digest | On | Summary of changes and alerts |
| Product updates | On | New features and announcements |

3. Click **Save**

## Security

### Changing Your Password

1. Go to **Settings** > **Security**
2. Click **Change Password**
3. Enter your current password
4. Enter your new password (minimum 12 characters)
5. Confirm your new password
6. Click **Update Password**

You'll be logged out of other sessions and need to log in again.

## Account Recovery

### Forgot Password

1. Go to the login page
2. Click **Forgot password?**
3. Enter your email address
4. Check email for reset link (expires in 1 hour)
5. Click link and set new password

### Locked Out

If you can't access your account:

| Situation | Solution |
|-----------|----------|
| Forgot password | Use password reset flow above |
| Lost 2FA device | Use backup codes or contact support |
| Email no longer accessible | Contact support with identity verification |
| Account suspended | Contact support for details |

## Deleting Your Account

To delete your personal account:

1. Contact **support@anomalyarmor.ai**
2. Request account deletion
3. We'll confirm and process within 7 days

> ℹ️ If you're the workspace owner, you must transfer ownership or delete the workspace first.

## Related

- **[Team Management](/account/team-management)**: Invite members and manage roles
- **[Billing](/billing/overview)**: Manage subscription and invoices

---

<!-- Source: https://docs.anomalyarmor.ai/account/team-management -->

# Team Management

> Invite teammates and manage roles and permissions

Control who has access to your AnomalyArmor workspace and what they can do.

## Roles and Permissions

| Role | Description |
|------|-------------|
| **Viewer** | Read-only access. Can see assets, alerts, and dashboards. |
| **Member** | Can configure alerts and destinations. Cannot manage data sources or team. |
| **Admin** | Full access to features and team management. Cannot access billing. |
| **Owner** | Full access including billing. One owner per workspace. |

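> ℹ️ **Admin vs Owner**: Admins have everything except billing access and the ability to change member roles, both of which are reserved for the Owner. Use Admin for team members who should invite or remove users and manage data sources but not payment details.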

### Permission Matrix

| Permission | Viewer | Member | Admin | Owner |
|------------|--------|--------|-------|-------|
| View assets and dashboards | Yes | Yes | Yes | Yes |
| Configure alerts | No | Yes | Yes | Yes |
| Manage data sources | No | No | Yes | Yes |
| Invite/remove team members | No | No | Yes | Yes |
| Access billing and payments | No | No | No | Yes |
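
Because each permission in the matrix is granted to a role and every role above it, the matrix reduces to a minimum role per permission. The sketch below encodes that reading. The permission keys and the `can` helper are hypothetical names chosen for illustration.

```python
# Roles ordered from least to most privileged, as in the matrix above.
ROLES = ["Viewer", "Member", "Admin", "Owner"]

# Minimum role per permission, read off the matrix above.
MINIMUM_ROLE = {
    "view_assets": "Viewer",
    "configure_alerts": "Member",
    "manage_data_sources": "Admin",
    "manage_team_members": "Admin",
    "access_billing": "Owner",
}


def can(role: str, permission: str) -> bool:
    """Return True if `role` is at or above the minimum role for `permission`."""
    return ROLES.index(role) >= ROLES.index(MINIMUM_ROLE[permission])


for role in ROLES:
    granted = [name for name in MINIMUM_ROLE if can(role, name)]
    print(f"{role}: {', '.join(granted)}")
```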

## Seat Limits

Your subscription plan determines how many team members you can have:

| Plan | Seats |
|------|-------|
| Starter | 1 |
| Growth | 5 |
| Professional | 10 |
| Enterprise | Unlimited |

Pending invitations count toward your seat limit. Upgrade your plan in **Settings > Billing** to add more team members.

## Inviting Team Members

1. Go to **Settings > Users**
2. Click **Invite member**
3. Enter their email address
4. Select a role from the dropdown
5. Click **Send invitation**

The invitation email includes a link to join your workspace. Invitations expire after 7 days.

> ℹ️ Only Admins and Owners can invite team members.

### What Happens When They Accept

When someone accepts your invitation:
1. If they don't have an AnomalyArmor account, they create one
2. They automatically join your workspace with the assigned role
3. They appear in your team list immediately

## Changing a Member's Role

Only the Owner can change roles:

1. Go to **Settings > Users**
2. Click the menu icon (three dots) next to the team member
3. Select a new role from the submenu
4. Change applies immediately

> ℹ️ You cannot change your own role. The Owner role cannot be assigned through this menu.

## Removing Team Members

1. Go to **Settings > Users**
2. Click the menu icon (three dots) next to the team member
3. Click **Remove member**
4. Confirm removal

Removed members immediately lose access. Their past actions remain in the system.

> ⚠️ You cannot remove the workspace Owner. You cannot remove yourself.

## Revoking Pending Invitations

1. Go to **Settings > Users**
2. Click the **Pending Invitations** tab
3. Click the X button next to the invitation
4. The invitation link stops working immediately

## Troubleshooting

### Invitation not received

1. Check spam/junk folder
2. Verify the email address is correct
3. Ask them to add `alerts@anomalyarmor.com` to their contacts
4. Revoke and resend the invitation

### Cannot invite more members

You may have reached your seat limit. Check your current usage in **Settings > Billing** and upgrade if needed.

### Cannot change someone's role

Only the Owner can change roles. If you need a role change, contact your workspace Owner.

### Cannot remove someone

You cannot remove the Owner. You also cannot remove yourself. Ask the Owner to make changes.

---

<!-- Source: https://docs.anomalyarmor.ai/billing/overview -->

# Billing Overview

> Understand pricing, manage subscriptions, and track usage

AnomalyArmor pricing is based on the number of monitored tables. All features are included in every plan.

## Choosing a Plan

| If you have... | Choose | Monthly |
|----------------|--------|---------|
| 1-20 tables | Starter | $99 |
| 21-100 tables | Growth | $475 |
| 101-300 tables | Professional | $1,188 |
| 301+ tables | Enterprise | Custom |

Not sure how many tables you have? Connect your database and run discovery. We'll show your count before you commit.
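Once discovery reports your table count, mapping it to a tier is a matter of comparing against the plan ceilings. The sketch below assumes the ranges in the table above; `recommend_plan` is a hypothetical helper, not a product feature.

```python
def recommend_plan(table_count: int) -> str:
    """Map a monitored-table count to the smallest plan that fits it."""
    if table_count < 1:
        raise ValueError("table_count must be at least 1")
    if table_count <= 20:
        return "Starter"
    if table_count <= 100:
        return "Growth"
    if table_count <= 300:
        return "Professional"
    return "Enterprise"
```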

## What's Included

Every plan includes:

- **All monitoring features**: Schema drift, freshness, metrics, alerts
- **All integrations**: Slack, PagerDuty, webhooks, email
- **AI Intelligence**: Ask questions, auto-generate descriptions
- **Auto-classification**: PII detection, custom tags
- **API access**: Full REST API for automation

The only differences between plans are capacity (tables and team members) and API call limits.

## Managing Your Subscription

### View Current Plan

**Settings** > **Billing** > **Subscription**

See your current plan, usage, and renewal date.

### Upgrade

1. Go to **Settings** > **Billing**
2. Click **Upgrade Plan**
3. Select new plan
4. Confirm payment

Upgrades take effect immediately. You pay the prorated difference.

### Downgrade

1. Go to **Settings** > **Billing**
2. Click **Change Plan**
3. Select lower tier

Downgrades take effect at next billing cycle. Your table count must fit within the new plan's limit.

### Cancel

1. Go to **Settings** > **Billing**
2. Click **Cancel Subscription**

Access continues until the end of your paid period.

## In This Section

- **[Plans & Pricing](/billing/plans)**: Compare plan tiers and features
- **[Usage Limits](/billing/usage-limits)**: Understand how usage is measured
- **[Billing FAQ](/billing/faq)**: Common billing questions answered

## Common Questions

### How is AnomalyArmor priced?

Pricing is based on how many tables you monitor. Every plan ships with every feature: schema drift, freshness, metrics, integrations, API access, AI Intelligence. See [Plans & Pricing](/billing/plans) for tier-by-tier pricing.

### Am I locked into features based on my plan?

No. All plans include every feature. The only differences between tiers are capacity (tables, team members) and API rate limits. You never lose a feature by downgrading.

### Where do I see my current plan and usage?

Go to **Settings → Billing → Subscription**. You'll see your current plan, table usage versus limit, next renewal date, and buttons to upgrade, downgrade, or cancel.

### How fast does an upgrade take effect?

Upgrades are instant. You pay the prorated difference for the remainder of the current billing period, and the new capacity is available immediately.

---

<!-- Source: https://docs.anomalyarmor.ai/billing/plans -->

# Plans & Pricing

> AnomalyArmor subscription tiers

Choose the plan that fits your data monitoring needs.

## Plan Comparison

| | Starter | Growth | Professional | Enterprise |
|---|---------|--------|--------------|------------|
| **Monthly** | $99 | $475 | $1,188 | Custom |
| **Annual** | $84/mo | $404/mo | $1,010/mo | Custom |
| **Tables** | 20 | 100 | 300 | 1000+ |
| **Team Members** | 1 | 5 | 10 | Unlimited |
| **API Calls** | 10k/mo | 50k/mo | 200k/mo | Unlimited |

Save 15% with annual billing.
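The annual figures in the table follow from applying the 15% discount to the monthly price. The sketch below assumes the result is rounded to the nearest dollar, which matches the published numbers; the function name is illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP

ANNUAL_DISCOUNT = Decimal("0.15")


def annual_monthly_rate(monthly_price: int) -> int:
    """Per-month price on annual billing, rounded to the nearest dollar (assumed)."""
    discounted = Decimal(monthly_price) * (Decimal(1) - ANNUAL_DISCOUNT)
    return int(discounted.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


rates = {plan: annual_monthly_rate(price)
         for plan, price in {"Starter": 99, "Growth": 475, "Professional": 1188}.items()}
```

With these inputs `rates` comes out as 84, 404, and 1010, the same values shown in the **Annual** row.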

## Feature Availability

All plans include every feature:

| Feature | All Plans |
|---------|-----------|
| Schema drift detection | Included |
| Freshness monitoring | Included |
| Data quality metrics | Included |
| AI Intelligence | Included |
| Auto-classification | Included |
| All alert destinations | Included |
| Full API access | Included |
| Report badges | Included |
| Custom tags | Included |

## Which Plan Is Right for You?

### Starter ($99/mo)
**Best for**: Solo practitioners, single database, getting started

- Monitoring one database
- Personal project or small startup
- Evaluating before team rollout

### Growth ($475/mo)
**Best for**: Small data teams, multiple databases

- 2-5 team members need access
- Multiple databases or environments
- Production + staging monitoring

### Professional ($1,188/mo)
**Best for**: Larger teams, enterprise data stacks

- Complex data infrastructure
- Multiple teams need access
- High-volume API usage

### Enterprise (Custom)
**Best for**: Large organizations, custom requirements

- 1000+ tables
- Custom integrations
- Dedicated support
- SLA guarantees
- SSO/SCIM

[Contact sales](mailto:sales@anomalyarmor.ai) for Enterprise pricing.

## Changing Plans

### Upgrades

- Take effect immediately
- Pay prorated difference for current period
- New limits available instantly

### Downgrades

- Take effect at next billing cycle
- No refund for current period
- Must reduce usage to fit new limits first

### How to Change

1. Go to **Settings** > **Billing**
2. Click **Manage Subscription**
3. Select new plan
4. Confirm

## Trials

Trials are available by invitation: a magic link in a cold email, a trial code we share directly, or the [homepage try-it form](https://www.anomalyarmor.ai/). There is no self-serve trial from the signup page without a code and no free tier.

- 14-day duration
- All features unlocked
- No credit card required
- Workspace pauses after trial until a plan is chosen (no data loss)

## Common Questions

### What counts as a table for plan limits?

Any table or view that AnomalyArmor discovers and monitors counts toward your plan limit. System tables are excluded automatically. You control which schemas to include during connection setup.

### Can I add more tables without upgrading my plan?

No. Once you hit your plan limit, you must upgrade to monitor additional tables. The practical approach is to start with your most business-critical tables and expand from there.

### What happens if I exceed AnomalyArmor's API rate limits?

Requests over the limit return HTTP 429. Upgrade for higher limits, or batch and cache API calls in your integration to reduce request volume.
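One way to act on the caching advice is a small time-to-live cache in front of your API client. This is a generic sketch, not AnomalyArmor client code: `TTLCache` and the `fetch` callable are hypothetical, and `fetch` stands in for whatever function performs the real API request.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Reuse API responses for `ttl_seconds` to reduce request volume."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the cache can be tested without waiting
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str, fetch: Callable[[], Any]) -> Any:
        """Return the cached value for `key`; call `fetch` only when missing or stale."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = fetch()
        self._entries[key] = (now, value)
        return value
```

A dashboard that polls the same endpoint every few seconds would then issue one real request per TTL window instead of one per poll.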

### Does AnomalyArmor offer discounts?

Yes. Annual billing saves 15% vs monthly. Non-profits, early-stage startups, and education customers may qualify for additional discounts; [contact sales](mailto:sales@anomalyarmor.ai) to discuss.

### Is there a free trial?

Trials are available by invitation: a magic link in a cold email, a trial code we share directly, or the [homepage try-it form](https://www.anomalyarmor.ai/). There is no self-serve trial from the signup page without a code and no free tier. Trials are 14 days with all features unlocked and no credit card; at the end your workspace pauses until you pick a paid plan.

---

<!-- Source: https://docs.anomalyarmor.ai/billing/usage-limits -->

# Usage Limits

> Understanding how monitor usage is calculated

Your plan includes a set number of active monitors. This page explains how we count monitors toward your limit.

## What Counts as an Active Monitor

Each individual monitor counts toward your limit:

- Each freshness schedule = 1 monitor
- Each data quality metric (row count, null %, distinct count, etc.) = 1 monitor
- Each validity rule (regex, allowed values, format checks) = 1 monitor
- Each referential integrity check (orphan detection) = 1 monitor

### Key Rules

1. **Per-item counting** - Each monitor counts separately toward your limit
2. **Multiple monitors on one table** - A table with 1 freshness + 5 metrics + 3 validity rules = 9 monitors
3. **Removing monitors reduces your count** - Delete a metric or schedule and your count decreases by 1
4. **Cross-database** - Monitors from all your connected databases count toward one shared limit

## What's Free (Unlimited)

The following features are free and do not count toward your monitor limit:

| Feature | Description |
|---------|-------------|
| Schema discovery | Discovering tables and columns in your databases |
| Schema drift monitoring | Alerts when schema changes are detected |
| AI descriptions & tags | Auto-generated documentation for your data |
| Alert destinations | Slack, email, PagerDuty, webhooks configuration |

## Checking Your Usage

Go to **Settings** → **Billing** to see:

- Current monitors in use vs. your plan limit
- Breakdown by feature type (freshness, data quality, validity)
- Which tables have monitors
- Which databases they belong to

## Approaching Your Limit

When you reach 80% of your monitor limit, you'll see a warning in the billing settings. To add more monitors, you can:

1. **Remove unused monitors** - Delete metrics or schedules you no longer need
2. **Upgrade your plan** - Higher tiers include more monitors at a lower per-monitor cost
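The 80% threshold can be checked with integer arithmetic. The status labels below are invented for this sketch and do not correspond to values shown in the product.

```python
def usage_status(monitors_used: int, monitor_limit: int) -> str:
    """Classify usage as 'ok', 'warning' (80% or more), or 'at_limit'."""
    if monitor_limit <= 0:
        raise ValueError("monitor_limit must be positive")
    if monitors_used >= monitor_limit:
        return "at_limit"
    # used / limit >= 0.8, rewritten without floats: used * 5 >= limit * 4
    if monitors_used * 5 >= monitor_limit * 4:
        return "warning"
    return "ok"
```

On a 20-monitor plan, the warning first appears at 16 monitors.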

## Example

If you're on the Starter plan (20 monitors) and you:

- Add freshness schedule to `orders` table → 1/20 monitors used
- Add 3 data quality metrics to `orders` → 4/20 monitors used (1 freshness + 3 metrics)
- Add freshness to `customers` table → 5/20 monitors used
- Add 2 validity rules to `customers` → 7/20 monitors used
- Delete freshness from `orders` → 6/20 monitors used
- Delete 1 metric from `orders` → 5/20 monitors used
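The running totals in this example can be reproduced with a small counter. This is an illustrative model of per-item counting under a shared limit; the `MonitorUsage` class and the kind labels are hypothetical.

```python
from collections import Counter


class MonitorUsage:
    """Tracks monitors per (table, kind) against one plan-wide limit."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self._counts: Counter = Counter()

    @property
    def used(self) -> int:
        return sum(self._counts.values())

    def add(self, table: str, kind: str, count: int = 1) -> int:
        """Add monitors and return the new total; refuse to exceed the limit."""
        if self.used + count > self.limit:
            raise RuntimeError("monitor limit reached")
        self._counts[(table, kind)] += count
        return self.used

    def remove(self, table: str, kind: str, count: int = 1) -> int:
        """Remove monitors and return the new total."""
        if self._counts[(table, kind)] < count:
            raise ValueError("no such monitors to remove")
        self._counts[(table, kind)] -= count
        return self.used


usage = MonitorUsage(limit=20)
totals = [
    usage.add("orders", "freshness"),
    usage.add("orders", "metric", 3),
    usage.add("customers", "freshness"),
    usage.add("customers", "validity", 2),
    usage.remove("orders", "freshness"),
    usage.remove("orders", "metric"),
]
print(totals)  # [1, 4, 5, 7, 6, 5]
```

Each entry in `totals` corresponds to one bullet above, in order.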

## Common Questions

### Does schema drift monitoring count toward my monitor limit?

No. Schema discovery, schema drift monitoring, AI descriptions, and alert destinations are all free and unlimited. Only active freshness schedules, data quality metrics, validity rules, and referential integrity checks count. See [What's Free](#whats-free-unlimited).

### If I add 5 metrics to one table, does that count as 1 or 5?

5. Monitors are counted per-item, not per-table. A table with 1 freshness schedule, 3 metrics, and 2 validity rules uses 6 monitors out of your plan's allowance.

### Do monitors across different databases share one pool?

Yes. All monitors from all your connected data sources share a single plan-wide limit. You can spend your allocation across one database or spread it across many.

### What happens when I hit my monitor limit?

You can't create new monitors until you remove existing ones or upgrade. Existing monitors keep running. You'll see an 80%-used warning in **Settings → Billing** so you can plan ahead.

### How do I lower my monitor usage without upgrading?

Delete unused freshness schedules, metrics, or validity rules. Each deletion frees up 1 slot instantly. Schema drift monitoring stays free, so you can keep schema coverage on tables where per-row quality isn't critical.

---

<!-- Source: https://docs.anomalyarmor.ai/billing/faq -->

# Billing FAQ

> Common billing questions answered

## Common Questions

### What happens if I exceed my table limit?

You can't add new data sources or monitors that would put you over the limit. Existing monitoring keeps running uninterrupted. To add more, upgrade your plan from **Settings → Billing**.

### Can I downgrade my plan?

Yes. Downgrades take effect at your next billing cycle, not immediately. Before downgrading, reduce your monitored tables to fit the lower plan's limit.

### How does proration work?

Upgrades charge the prorated difference immediately for the remainder of the current period. Downgrades do not issue a refund for the current period; the new (lower) rate starts on the next cycle.
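As a worked example of the upgrade charge, the sketch below assumes simple day-based proration: the price difference multiplied by the fraction of the billing period that remains. The exact formula used at checkout is not documented here, so treat the function and its rounding as assumptions.

```python
from decimal import Decimal, ROUND_HALF_UP


def prorated_upgrade_charge(old_monthly: int, new_monthly: int,
                            days_remaining: int, days_in_period: int) -> Decimal:
    """Price difference scaled by the unused share of the period (assumed linear)."""
    if days_in_period <= 0 or not 0 <= days_remaining <= days_in_period:
        raise ValueError("days_remaining must be between 0 and days_in_period")
    fraction = Decimal(days_remaining) / Decimal(days_in_period)
    charge = (Decimal(new_monthly) - Decimal(old_monthly)) * fraction
    return charge.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

Under this assumption, moving from Starter ($99) to Growth ($475) halfway through a 30-day period would cost (475 - 99) × 15/30 = $188.00 for the remainder of that period.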

### Can I cancel anytime?

Yes. Cancel from **Settings → Billing**. Access continues through the end of your already-paid period, then the workspace is paused until you reactivate. There is no free tier.

### How do I get invoices or receipts?

Go to **Settings → Billing → Invoices** to download PDF receipts for any past payment. Enterprise customers on invoice billing receive invoices by email from accounting@anomalyarmor.ai.

### Do you offer refunds?

We handle refunds case-by-case. Email support@anomalyarmor.ai within 14 days of the charge with your account email and the reason, and we'll respond within 2 business days.

### What payment methods do you accept?

Credit cards (Visa, Mastercard, American Express) via Stripe. Enterprise customers can pay by invoice in USD with Net-30 terms. Contact sales@anomalyarmor.ai to set up invoicing.

### Is there a free trial?

Trials are available by invitation: a magic link in a cold email, a trial code we share directly, or the [homepage try-it form](https://www.anomalyarmor.ai/). There is no self-serve trial from the signup page without a code and no free tier. Trials are 14 days with all features unlocked and no credit card; at the end your workspace pauses until you pick a paid plan.

---

<!-- Source: https://docs.anomalyarmor.ai/security/overview -->

# Security Overview

> How AnomalyArmor protects your data with enterprise-grade security controls

AnomalyArmor is built with security as a foundational principle, not an afterthought. We understand that data observability tools have access to sensitive infrastructure, and we've designed our platform to limit data exposure to metadata and bounded aggregates, and to enforce that boundary at every layer.

## Core Security Principles

### Metadata and Aggregates Only, Never Raw Row Values

The most important security control is what we **don't** access:

| We Access | We Never Access |
|-----------|-----------------|
| Table names | Individual row data |
| Column names and types | `SELECT *` or raw column values |
| Schema metadata | PII content |
| Timestamp values (for freshness) | Business-sensitive records |
| Constraint definitions | Query results beyond aggregates |
| Aggregate statistics (`COUNT`, `MIN`, `MAX`, `SUM`, `AVG`, `PERCENTILE`) | |

When you connect a database, AnomalyArmor queries system catalogs (`information_schema`, `pg_catalog`, etc.) for structure, and runs bounded aggregate queries against your tables for monitoring. It never runs `SELECT *` or row-level reads of sensitive columns.

```sql
-- What we query (metadata + aggregates)
SELECT table_name, column_name, data_type
FROM information_schema.columns;
SELECT COUNT(*), MAX(created_at) FROM orders;

-- What we NEVER query (raw rows)
SELECT * FROM users;                   -- ❌ Never executed
SELECT email, ssn FROM customers;      -- ❌ Never executed
```

This isn't just policy. It's enforced by our open-source [Query Gateway](/security/query-gateway), which parses and validates every SQL query before execution.

### Defense in Depth

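Security is implemented at every layer, from encryption and network controls to authentication, credential storage, audit logging, and infrastructure. Each layer is described in the sections below.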

## Compliance & Certifications

### GDPR Compliance

For customers in the EU or handling EU data:

- Data Processing Agreement (DPA) available
- Right to erasure supported
- Data portability supported
- EU data residency options (contact sales)

### HIPAA Readiness

For healthcare organizations:

- Business Associate Agreement (BAA) available
- Enhanced audit logging
- Contact us for healthcare-specific deployment options

## Encryption

### Data in Transit

All network traffic is encrypted with **TLS** (1.3 preferred, 1.2 minimum):

- Database connections from AnomalyArmor to your infrastructure
- Web traffic to the AnomalyArmor dashboard
- API calls to AnomalyArmor services
- Webhook deliveries to your endpoints

- **Minimum TLS version**: 1.2 (1.3 preferred)
- **Cipher suites**: Modern, AEAD-only (AES-GCM, ChaCha20-Poly1305)
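If you write your own client for the AnomalyArmor API, you can enforce the same TLS floor on your side. The sketch below uses only Python's standard `ssl` module; the function name is illustrative, and nothing here is specific to AnomalyArmor.

```python
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Client-side TLS context that refuses anything older than TLS 1.2."""
    context = ssl.create_default_context()  # verifies certificates and hostnames
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

The returned context can be passed to standard-library clients such as `urllib.request.urlopen(url, context=...)`; TLS 1.3 is still negotiated when both sides support it.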

### Data at Rest

All stored data is encrypted with **AES-256**:

| Data Type | Encryption | Key Management |
|-----------|------------|----------------|
| Database credentials | AES-256-GCM | AWS KMS, automatic rotation |
| Asset metadata | AES-256 | Managed by infrastructure |
| Audit logs | AES-256 | Immutable storage |
| Backups | AES-256 | Separate backup keys |

### Key Management

Encryption keys are managed through AWS KMS:

- Automatic key rotation every 365 days
- Keys never leave the KMS boundary
- Separate keys per customer (Enterprise)
- Hardware Security Module (HSM) backed

## Network Security

### Connection Architecture

AnomalyArmor connects **outbound** to your databases, so you only need to allow inbound traffic from AnomalyArmor's addresses.

### IP Allowlisting

AnomalyArmor uses static IP addresses for database connections:

```
Production IPs (US region):
34.xxx.xxx.xxx/32
34.xxx.xxx.xxx/32

# Add these to your security group/firewall
```

> ℹ️ Contact support for the current list of IP addresses, or view them in **Settings → Security** in your dashboard.

### VPC Peering (Enterprise)

For enhanced network isolation:

- Direct VPC peering between your AWS account and AnomalyArmor
- No public internet exposure for database connections
- Private DNS resolution

### AWS PrivateLink (Enterprise)

Connect via AWS PrivateLink for:

- Fully private connectivity
- No data traversing public internet
- Simplified firewall rules

## Authentication & Access Control

### User Authentication

AnomalyArmor supports multiple authentication methods:

| Method | Availability | Details |
|--------|--------------|---------|
| Email + Password | All plans | Strong password requirements |
| Google SSO | All plans | OAuth 2.0 integration |
| SAML 2.0 SSO | Enterprise | Okta, Azure AD, OneLogin, etc. |
| Multi-Factor Auth | All plans | TOTP-based (Google Authenticator, etc.) |

### Role-Based Access Control (RBAC)

Control who can do what:

| Role | Permissions |
|------|-------------|
| Viewer | View assets, alerts, dashboards |
| Member | Viewer + configure alerts, manage destinations |
| Admin | Member + manage data sources, team members |
| Owner | Admin + billing, security settings |

### Session Security

- Session tokens expire after 24 hours of inactivity
- Sessions invalidated on password change
- Concurrent session limits (Enterprise)
- IP-based session restrictions (Enterprise)

## Credential Security

### How We Store Your Database Credentials

Database credentials are the most sensitive data we handle:

1. **Encryption**: AES-256-GCM with customer-specific keys
2. **Key storage**: AWS KMS with HSM backing
3. **Access logging**: Every credential access is logged
4. **Least privilege**: Only the discovery engine accesses credentials
5. **No visibility**: Support staff cannot view your credentials

### Credential Recommendations

> ⚠️ Always create a **read-only, dedicated user** for AnomalyArmor. Never share credentials with production applications.

Recommended setup for PostgreSQL:

```sql
-- Create dedicated read-only user
CREATE USER anomalyarmor WITH PASSWORD 'secure-password';

-- Grant minimal permissions
GRANT CONNECT ON DATABASE your_db TO anomalyarmor;
GRANT USAGE ON SCHEMA public TO anomalyarmor;
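GRANT SELECT ON ALL TABLES IN SCHEMA public TO anomalyarmor;

-- Optional: also cover tables created later by the role that runs this statement
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO anomalyarmor;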

-- For information_schema access (required)
-- This is typically granted by default
```

## Audit Logging

### What We Log

Every significant action is logged:

| Event Type | Details Logged |
|------------|---------------|
| Authentication | Login, logout, failed attempts, MFA events |
| Data access | Discovery runs, asset views, exports |
| Configuration | Rule changes, destination updates, team changes |
| Admin actions | User management, security settings |

### Log Retention

- **Standard**: 90 days retention
- **Enterprise**: Configurable retention, SIEM integration

### Accessing Audit Logs

Navigate to **Settings → Audit Log** to view activity:

- Filter by user, action type, date range
- Export to CSV
- API access for SIEM integration (Enterprise)

## Infrastructure Security

### Cloud Provider

AnomalyArmor runs on AWS with:

- Multi-AZ deployment for high availability
- Automated backups with point-in-time recovery
- DDoS protection via AWS Shield
- Web Application Firewall (WAF)

### Host Security

- Hardened AMIs with minimal attack surface
- Automatic security patching
- No SSH access to production hosts
- Immutable infrastructure (replaced, not updated)

### Container Security

- Non-root container execution
- Read-only file systems where possible
- Resource limits enforced
- Regular vulnerability scanning

## Incident Response

### Our Commitment

In the event of a security incident:

1. **Detection**: 24/7 monitoring with automated alerting
2. **Response**: Incident response team engaged within 1 hour
3. **Communication**: Affected customers notified within 24 hours
4. **Resolution**: Root cause analysis and remediation
5. **Post-mortem**: Detailed report shared with affected customers

### Reporting Security Issues

If you discover a security vulnerability:

- Email: [security@anomalyarmor.ai](mailto:security@anomalyarmor.ai)
- Response time: Within 24 hours
- Bug bounty: Contact us for details

> ⚠️ Please do not disclose security issues publicly until we've had a chance to address them.

## Penetration Testing

### Our Testing

- Annual third-party penetration tests
- Continuous automated vulnerability scanning
- Bug bounty program for responsible disclosure

### Your Testing

Enterprise customers may conduct penetration testing:

1. Contact us at least 2 weeks in advance
2. Scope limited to your tenant
3. Provide test dates and methodology
4. Share findings for mutual benefit

## Security FAQ

### Can AnomalyArmor employees see my data?

No. Employees cannot access customer credentials or metadata. All access is logged and audited. Support debugging happens with synthetic data only.

### What happens if AnomalyArmor is breached?

Your database credentials are encrypted with customer-specific keys. Even in a breach, credentials cannot be decrypted without the KMS keys. We would notify you within 24 hours of any confirmed breach.

### Can I run AnomalyArmor in my own cloud?

Self-hosted deployments are available for Enterprise customers with specific compliance requirements. Contact sales for details.

### How do I revoke AnomalyArmor's access?

1. Delete the data source connection in AnomalyArmor
2. Revoke the database user's permissions or delete the user
3. Remove AnomalyArmor IPs from your security group

## Common Questions

### Is AnomalyArmor SOC 2 / GDPR / HIPAA compliant?

AnomalyArmor is GDPR-compliant with a DPA available, and HIPAA-ready with a BAA available for Enterprise. SOC 2 Type II is in progress; contact security@anomalyarmor.ai for the current report status.

### Does AnomalyArmor support SSO and MFA?

Yes. Email + password, Google SSO, and TOTP-based MFA are available on all plans. SAML 2.0 SSO (Okta, Azure AD, OneLogin) is available on Enterprise. See [User Authentication](#user-authentication).

### Can I connect AnomalyArmor to a database that isn't publicly reachable?

Yes, on Enterprise. VPC peering and AWS PrivateLink let AnomalyArmor reach databases without a public endpoint, so traffic never traverses the public internet. See [Network Security](#network-security).

### How are database credentials protected if AnomalyArmor is breached?

Credentials are encrypted with AES-256-GCM using customer-specific keys managed by AWS KMS with HSM backing. Even in a breach, credentials cannot be decrypted without the KMS keys, which never leave the KMS boundary.

### How do I revoke AnomalyArmor's access to my database?

Delete the data source connection in AnomalyArmor, then revoke or delete the database user on your side, and remove our IPs from your security group. That gives you a belt-and-suspenders revocation.

### Can I run penetration tests against AnomalyArmor?

Enterprise customers can run penetration tests against their own tenant with 2 weeks' notice. Email security@anomalyarmor.ai with scope, dates, and methodology. We also run annual third-party pen tests and a bug bounty for responsible disclosure.

## Related Pages

- **[Query Gateway](/security/query-gateway)**: Open-source SQL security layer
- **[Data Handling](/security/data-handling)**: What data we access and how we handle it
- **[Data Retention](/security/data-retention)**: How long we keep your data

---

<!-- Source: https://docs.anomalyarmor.ai/security/query-gateway -->

# Query Gateway

> How AnomalyArmor enforces SQL access restrictions with an open-source security layer

AnomalyArmor uses an open-source SQL security layer called the **Query Gateway** to enforce strict access controls on every query we run against your database. This page explains how it works and how you can verify our security claims.

## What is the Query Gateway?

The Query Gateway is a SQL validation layer that parses and validates every query before execution. It's the enforcement mechanism behind our "metadata and aggregates only" promise.

### Key Security Properties

| Property | Description |
|----------|-------------|
| **Fail-closed** | If a query cannot be parsed, it is blocked. No exceptions. |
| **Pre-execution validation** | Queries are validated before reaching your database |
| **Recursive checking** | Subqueries and CTEs are validated against the same rules |
| **Comment stripping** | SQL comments are removed to prevent obfuscation attacks |
| **Open source** | Full source code available for your security team to audit |

## Access Levels

The Query Gateway enforces three distinct permission tiers. AnomalyArmor uses different access levels depending on the operation:

| Level | Purpose | Permitted | Blocked |
|-------|---------|-----------|---------|
| **Schema Only** | Metadata queries | System tables (`information_schema`, `pg_catalog`, etc.) | Any row-level data |
| **Aggregates** | Statistics only | `COUNT(*)`, `AVG`, `SUM`, `MIN`, `MAX` | Raw column values, `SELECT *` |
| **Full** | Complete read access | Any valid `SELECT` | (unrestricted) |

### What AnomalyArmor Uses

| Feature | Access Level | Example Query |
|---------|--------------|---------------|
| Schema discovery | Schema Only | `SELECT table_name FROM information_schema.tables` |
| Freshness monitoring | Aggregates | `SELECT MAX(updated_at) FROM orders` |
| Row counts | Aggregates | `SELECT COUNT(*) FROM users` |

> ℹ️ AnomalyArmor does not use the `full` access level. We only query metadata (`schema_only`) and aggregate statistics (`aggregates`), for example for freshness monitoring and row counts.

### Schema Only Mode

In schema_only mode, queries can only access system catalogs:

```sql
-- Allowed
SELECT table_name, column_name, data_type
FROM information_schema.columns;

-- Blocked (not a system table)
SELECT * FROM users;

-- Blocked (even aggregates on user tables)
SELECT COUNT(*) FROM orders;
```

### Aggregates Mode

In aggregates mode, queries can retrieve statistical information but never raw values:

```sql
-- Allowed
SELECT MAX(created_at) FROM orders;
SELECT COUNT(*) FROM users;
SELECT AVG(amount) FROM transactions;

-- Blocked (returns raw values)
SELECT email FROM users;
SELECT * FROM orders LIMIT 1;

-- Blocked (window functions can leak row data)
SELECT ROW_NUMBER() OVER (ORDER BY id) FROM users;
```

## Open Source Verification

The Query Gateway is fully open source under the Apache 2.0 license. Your security team can:

1. **Audit the code** - Review exactly how queries are validated
2. **Run the test suite** - 97+ tests covering security edge cases
3. **Verify our claims** - See the exact rules enforced at each access level

### Running the Tests Yourself

```bash
# Clone the repository
git clone https://github.com/anomalyarmor/anomalyarmor-query-gateway.git
cd anomalyarmor-query-gateway

# Install dependencies
pip install -e ".[dev]"

# Run the test suite
pytest -v
```

## Technical Implementation

### SQL Parsing

The gateway uses [sqlglot](https://github.com/tobymao/sqlglot) to parse queries into an Abstract Syntax Tree (AST). This enables:

- Dialect-aware parsing (PostgreSQL, MySQL, Databricks, ClickHouse, etc.)
- Accurate identification of accessed tables and columns
- Detection of nested queries and CTEs

### Validation Process

Every query is checked before it is sent to your database:

- SQL comments are stripped to prevent obfuscation
- The query is parsed into an AST; a query that cannot be parsed is blocked
- The statement, including every subquery and CTE, is validated against the rules of the access level in use
- The allow/block decision is recorded in the audit log

### Supported Databases

The gateway supports all databases that AnomalyArmor connects to:

- PostgreSQL
- MySQL
- Databricks
- ClickHouse
- Snowflake
- SQL Server
- BigQuery
- Redshift

## Audit Logging

Every query validation is logged with:

- Query text
- Access level applied
- Allow/block decision
- Rejection reason (if blocked)
- Metadata (asset ID, user ID, timestamp)

These logs are available in your audit trail (Enterprise plans).

## Common Questions

### Can I run the Query Gateway myself?

The gateway is open source, but AnomalyArmor runs it as part of our managed service. Self-hosted deployments are available for Enterprise customers.

### What happens if a query is blocked?

Blocked queries are logged and never reach your database. The discovery job reports an error, which our engineering team investigates.

### Can the Query Gateway be bypassed?

No. All database queries from AnomalyArmor pass through the gateway. There is no code path that executes queries directly.

### How do I verify AnomalyArmor is using the Query Gateway?

Check your database query logs: you will only see metadata queries (against system catalogs) and aggregate queries (`COUNT`, `SUM`, `AVG`, `MIN`, `MAX`), never `SELECT *` or raw row reads. You can also review the open-source gateway code directly.

## Related Pages

- **[Security Overview](/security/overview)**: Full security documentation
- **[Data Handling](/security/data-handling)**: What data we access and store

---

<!-- Source: https://docs.anomalyarmor.ai/security/data-handling -->

# Data Handling

> What data AnomalyArmor accesses and how we protect it

The most common security question: "What data does AnomalyArmor actually see?" Here's the complete answer.

## What We Access

AnomalyArmor queries **metadata and aggregate statistics only**, never your actual row data. Every query is validated by our open-source [Query Gateway](/security/query-gateway) before execution.

| We Access | We Never Access |
|-----------|-----------------|
| Table names | Row data |
| Column names | Actual values in columns |
| Data types | PII content |
| Timestamps (for freshness) | Query results beyond aggregates |
| Row counts | Business-sensitive data |
| Constraint definitions | File contents |

### The Queries We Run

During discovery, we query system catalogs:

```sql
-- PostgreSQL example
SELECT table_name, column_name, data_type
FROM information_schema.columns
WHERE table_schema NOT IN ('pg_catalog', 'information_schema');

-- For freshness, we query MAX of timestamp columns
SELECT MAX(updated_at) FROM your_table;
```

We **never** run:
```sql
SELECT * FROM your_table;  -- ❌ Never
SELECT email FROM users;   -- ❌ Never
```

### Freshness Monitoring

For freshness checks, we query the maximum value of your designated timestamp column:

```sql
SELECT MAX(created_at) FROM orders;
-- Returns: 2024-01-15 08:30:00
-- We see: "Last update was 30 minutes ago"
-- We don't see: Any actual order data
```
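The comparison behind a freshness check can be sketched in a few lines of Python. This is a minimal illustration rather than AnomalyArmor's implementation: the function name `freshness_status` and the 2-hour SLA are assumptions, and the only input taken from the database is the single `MAX(created_at)` value from the query above.

```python
from datetime import datetime, timedelta, timezone


def freshness_status(latest: datetime, now: datetime, sla: timedelta) -> dict:
    """Compare the newest timestamp in a table against a freshness SLA."""
    age = now - latest
    return {"age_minutes": int(age.total_seconds() // 60), "stale": age > sla}


# The only value read from the database is the MAX(created_at) result.
latest = datetime(2024, 1, 15, 8, 30, tzinfo=timezone.utc)
now = datetime(2024, 1, 15, 9, 0, tzinfo=timezone.utc)

status = freshness_status(latest, now, sla=timedelta(hours=2))
print(status)  # {'age_minutes': 30, 'stale': False}
```

Nothing in this calculation needs a row value: one timestamp goes in, and an age and a boolean come out.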

## Data Storage

### What We Store

| Data Type | Location | Encryption | Retention |
|-----------|----------|------------|-----------|
| Database credentials | AWS (encrypted) | AES-256-GCM | Until you delete |
| Schema metadata | AWS RDS | AES-256 | Account lifetime |
| Change history | AWS RDS | AES-256 | 90 days |
| Audit logs | AWS S3 | AES-256 | 1 year |

### What We Don't Store

- Your actual data values
- Query results beyond metadata
- Copies of your database

## Data Flow

1. Discovery engine connects via TLS 1.3
2. Runs metadata queries against system catalogs
3. Receives only schema information
4. Stores encrypted in our infrastructure
5. Your actual data never leaves your database

## Compliance Implications

### For GDPR

- We don't process personal data (only metadata)
- DPA available if required
- Data deletion on account closure

### For HIPAA

- No PHI accessed
- BAA available for healthcare customers
- Enhanced audit logging available

## Verification

Want to verify what we access? You have multiple options:

### Check Your Database Logs

```sql
-- PostgreSQL: Enable query logging
ALTER SYSTEM SET log_statement = 'all';
SELECT pg_reload_conf();

-- Then, from a shell on the database host, filter the log for the
-- AnomalyArmor user (the log path varies by installation):
--   grep 'anomalyarmor' /var/log/postgresql/postgresql.log
```

You'll see only `information_schema` queries and timestamp aggregations.
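If you export the logged statements, a small script can help you skim them. The sketch below is a naive pattern-matching aid for reading your own logs, not a reimplementation of the Query Gateway and not a security control. The function name `classify_statement` and both regular expressions are assumptions made for illustration.

```python
import re

# Assumption: statements that read from these schemas count as metadata queries.
CATALOG_FROM = re.compile(r"\bfrom\s+(?:information_schema|pg_catalog)\.", re.IGNORECASE)

# A SELECT list made up only of COUNT/MAX/MIN/AVG/SUM calls, optionally aliased.
AGGREGATE_ONLY = re.compile(
    r"^\s*select\s+"
    r"(?:(?:count|max|min|avg|sum)\s*\([^()]*\)\s*(?:as\s+\w+\s*)?,?\s*)+"
    r"from\b",
    re.IGNORECASE,
)


def classify_statement(sql: str) -> str:
    """Label a logged statement as 'metadata', 'aggregate', or 'unexpected'."""
    if CATALOG_FROM.search(sql):
        return "metadata"
    if AGGREGATE_ONLY.match(sql):
        return "aggregate"
    return "unexpected"


logged = [
    "SELECT table_name, column_name, data_type FROM information_schema.columns",
    "SELECT MAX(updated_at) FROM your_table;",
    "SELECT * FROM your_table;",
]
for statement in logged:
    print(classify_statement(statement), "|", statement)
```

Anything labeled `unexpected` deserves a closer manual look. Because the matching is deliberately simple, treat the labels as a reading aid only.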

### Audit the Query Gateway

Our [Query Gateway](https://github.com/anomalyarmor/anomalyarmor-query-gateway) is open source. Your security team can:

1. Review the source code
2. Run the 97+ security tests
3. Verify exactly which queries are allowed at each access level

## Common Questions

### What if I have a column named 'password'?

We see the column name "password" exists and its data type (e.g., varchar). We never see the actual password values stored in that column.

### How do I know you're not lying?

1. Audit our open-source [Query Gateway](https://github.com/anomalyarmor/anomalyarmor-query-gateway)
2. Check your database query logs
3. Use a read-only database user with only `SELECT` on system catalogs
4. Run a network packet capture during discovery

### Does AnomalyArmor ever read the actual values stored in my tables?

No. Every query is restricted to system catalogs and bounded aggregates (`COUNT`, `MAX`, `MIN`, `AVG`, `SUM`). Raw reads like `SELECT *` or `SELECT email FROM users` are blocked by the open-source [Query Gateway](/security/query-gateway) before they ever reach your database.

### If I have a PII column like 'ssn' or 'email', what does AnomalyArmor see?

We see the column name and its data type. We do not see a single row of the column's contents. Column names help auto-classify sensitivity, but the values never leave your database.

### Exactly which SQL statements does AnomalyArmor run against my database?

Metadata queries against `information_schema` and `pg_catalog` for structure, plus aggregate queries like `SELECT MAX(updated_at) FROM your_table` for freshness. See the [queries we run](#the-queries-we-run) section above for concrete examples.

### How can I independently verify what AnomalyArmor queries?

Turn on query logging for the `anomalyarmor` user and audit the logs, or run a packet capture during discovery. The [Query Gateway](https://github.com/anomalyarmor/anomalyarmor-query-gateway) is open source with 97+ security tests you can run yourself. See [Verification](#verification) above.

### Can AnomalyArmor employees see my data?

No. We can't see your data because we never access it. Support staff can see metadata (table and column names) to help troubleshoot, but never actual values. Every credential access is logged.

### Does AnomalyArmor store copies of my tables?

No. We store schema metadata (table and column names, types, constraints), change history, and timestamp aggregates, not the underlying rows. See [What We Store](#what-we-store) for the full list with encryption and retention details.

## See Also

- **[Query Gateway](/security/query-gateway)**: Open-source SQL security layer
- **[Security Overview](/security/overview)**: Full security documentation
- **[Data Retention](/security/data-retention)**: How long we keep data

---

<!-- Source: https://docs.anomalyarmor.ai/security/data-retention -->

# Data Retention

> How long AnomalyArmor keeps your metadata and how to request deletion

This page explains what metadata we keep, for how long, and how to request deletion.

## Retention Periods

| Data Type | Retention | Notes |
|-----------|-----------|-------|
| Schema metadata | Account lifetime | Current state of your assets |
| Schema change history | 90 days | Historical diffs |
| Alert history | 90 days | Past alert triggers |
| Audit logs | 1 year | Who did what, when |
| Database credentials | Until deleted | Encrypted, you control |
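To see when a given record would age out under the standard periods in the table above, the arithmetic can be sketched as follows. The dictionary keys and the function name `expiry_date` are illustrative, and treating "1 year" as 365 days is an assumption.

```python
from datetime import date, timedelta

# Standard retention periods from the table above. None means the data is kept
# until the account or the credential is deleted. Keys are illustrative names.
RETENTION_DAYS = {
    "schema_change_history": 90,
    "alert_history": 90,
    "audit_logs": 365,  # assumption: "1 year" treated as 365 days
    "schema_metadata": None,
    "database_credentials": None,
}


def expiry_date(data_type: str, created: date):
    """Return the date a record passes its retention period, or None."""
    days = RETENTION_DAYS[data_type]
    return None if days is None else created + timedelta(days=days)


print(expiry_date("alert_history", date(2024, 1, 15)))  # 2024-04-14
```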

## Extended Retention (Enterprise)

Enterprise customers can customize retention:

| Data Type | Standard | Enterprise Options |
|-----------|----------|-------------------|
| Change history | 90 days | Up to 2 years |
| Alert history | 90 days | Up to 2 years |
| Audit logs | 1 year | Up to 7 years |

Contact your account manager to adjust retention periods.

## What Happens When Data Expires

When data exceeds its retention period:

1. **Automatic deletion** - No action required from you
2. **Soft delete first** - Data marked for deletion
3. **Permanent removal** - Purged from all systems within 30 days
4. **Backup cleanup** - Removed from backups on next rotation

## Account Deletion

When you close your account:

1. All data marked for deletion immediately
2. Schema metadata, history, and credentials deleted
3. Audit logs retained for 30 days (legal/compliance)
4. Complete removal within 30 days of account closure

To delete your account:
1. Go to **Settings** → **Account**
2. Click **Delete Account**
3. Confirm deletion
4. You'll receive confirmation email when complete

## Data Export

Before deletion, export your data:

1. Go to **Settings** → **Data Export**
2. Select what to export:
   - Asset inventory (CSV)
   - Change history (CSV)
   - Alert history (CSV)
3. Click **Export**
4. Download within 7 days

## GDPR Right to Erasure

EU customers can request data deletion:

1. Email privacy@anomalyarmor.ai
2. Include your account email
3. Specify what data to delete
4. We'll respond within 30 days

We'll delete:
- All metadata associated with your account
- Audit logs (after legal retention period)
- Any PII in our systems

We'll retain:
- Anonymized usage statistics
- Billing records (legal requirement)

## Compliance Certifications

Our retention practices support:

- **GDPR** - DPA available
- **HIPAA** - BAA available

## Common Questions

### How long does AnomalyArmor keep my schema change history?

90 days on Starter, Growth, and Professional plans. Enterprise customers can extend to up to 2 years. See [Retention Periods](#retention-periods) for the full breakdown.

### How long are audit logs retained?

1 year by default. Enterprise customers can configure retention up to 7 years for compliance programs that require longer windows. Audit logs are stored in immutable encrypted storage.

### What happens to my data when I close my account?

Metadata, change history, and database credentials are marked for deletion immediately, and fully purged within 30 days. Audit logs are retained for 30 days after closure for legal and compliance purposes, then deleted.

### How do I submit a GDPR right-to-erasure request?

Email privacy@anomalyarmor.ai with your account email and what you want deleted. We respond within 30 days. Anonymized usage statistics and billing records are kept as required by law. See [GDPR Right to Erasure](#gdpr-right-to-erasure).

### Can I export my data before deletion?

Yes. Go to **Settings → Data Export** and you can download your asset inventory, change history, and alert history as CSV. The export link is available for 7 days after it's generated.

### When does deleted data actually leave your backups?

Retention deletions are soft-deleted first, permanently removed from primary storage within 30 days, and purged from backups on the next rotation cycle. Account closures follow the same 30-day hard-delete window.

## See Also

- **[Data Handling](/security/data-handling)**: What data we access
- **[Security Overview](/security/overview)**: Full security documentation

---

<!-- Source: https://docs.anomalyarmor.ai/troubleshooting/common-setup-mistakes -->

# Common Setup Mistakes

> Avoid these common configuration errors when setting up AnomalyArmor

Most users hit at least one of these pitfalls when setting up AnomalyArmor. Here's how to prevent and fix each one.

## Database Connection Mistakes

### Wrong Port Number

**The mistake**: Using the wrong default port for your database.

| Database | Correct Port | Common Mistake |
|----------|--------------|----------------|
| PostgreSQL | 5432 | 3306 (MySQL) |
| ClickHouse HTTPS | 8443 | 9000 (native) |
| ClickHouse HTTP | 8123 | 9000 (native) |
| Databricks | 443 | N/A |

**How to fix**: Double-check the port in your database settings or cloud console.

### SSL Not Enabled

**The mistake**: Not enabling SSL when your database requires it.

Most cloud databases require SSL:
- AWS RDS/Aurora: Required by default
- Supabase: Required
- ClickHouse Cloud: Always HTTPS
- Google Cloud SQL: Required by default

**How to fix**: Set **SSL Mode** to `require` in your connection settings.

### Forgot to Allowlist IPs

**The mistake**: Firewall or security group blocks AnomalyArmor.

**Symptoms**:
- "Connection refused" error
- "Connection timed out" error

**How to fix**:
1. Go to **Settings → Security** to find AnomalyArmor IPs
2. Add them to your security group/firewall rules
3. Test the connection again

### Using Wrong Endpoint (AWS)

**The mistake**: Using the wrong RDS/Aurora endpoint.

```
Common mistakes:
✗ mydb.rds.amazonaws.com (missing region)
✓ mydb.us-east-1.rds.amazonaws.com

✗ Writer endpoint for read-only operations
✓ Reader endpoint (recommended for monitoring)
```

**How to fix**: Copy the endpoint directly from AWS Console.

## Permission Mistakes

### Insufficient Database Permissions

**The mistake**: User can connect but can't read metadata.

**Symptoms**:
- "Permission denied" error
- Discovery finds no tables

**Required permissions**:

```sql
-- PostgreSQL
GRANT CONNECT ON DATABASE your_db TO anomalyarmor;
GRANT USAGE ON SCHEMA public TO anomalyarmor;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO anomalyarmor;

-- ClickHouse
GRANT SELECT ON system.* TO anomalyarmor;
GRANT SELECT ON your_database.* TO anomalyarmor;
GRANT SHOW ON *.* TO anomalyarmor;
```

### Forgetting Schema Permissions

**The mistake**: Granting table access but not schema access.

```sql
-- Wrong: Has table access but not schema access
GRANT SELECT ON ALL TABLES IN SCHEMA analytics TO anomalyarmor;

-- Right: Add schema usage
GRANT USAGE ON SCHEMA analytics TO anomalyarmor;
GRANT SELECT ON ALL TABLES IN SCHEMA analytics TO anomalyarmor;
```

### Not Granting Access to Future Tables

**The mistake**: New tables aren't automatically monitored.

```sql
-- PostgreSQL: Grant access to future tables
ALTER DEFAULT PRIVILEGES IN SCHEMA public
GRANT SELECT ON TABLES TO anomalyarmor;
```

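Without this, you'll need to manually grant access each time a table is created. In PostgreSQL, default privileges apply only to tables created by the role that runs the statement, so run it as the role that creates your tables or add `FOR ROLE <owner_role>`.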
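When several schemas need the same grants, generating the statements avoids forgetting one of the three pieces (schema usage, table access, future tables). The helper below is a hypothetical sketch, with `grants_for` and the identifier check as assumptions. It only builds the PostgreSQL statements shown in this section as strings and does not execute anything.

```python
import re

# Accept only simple lowercase identifiers so nothing unexpected is
# interpolated into the generated SQL.
IDENTIFIER = re.compile(r"^[a-z_][a-z0-9_]*$")


def grants_for(role: str, database: str, schemas: list[str]) -> list[str]:
    """Build the PostgreSQL grant statements for a read-only monitoring role."""
    for name in (role, database, *schemas):
        if not IDENTIFIER.match(name):
            raise ValueError(f"unsupported identifier: {name!r}")
    statements = [f"GRANT CONNECT ON DATABASE {database} TO {role};"]
    for schema in schemas:
        statements += [
            f"GRANT USAGE ON SCHEMA {schema} TO {role};",
            f"GRANT SELECT ON ALL TABLES IN SCHEMA {schema} TO {role};",
            f"ALTER DEFAULT PRIVILEGES IN SCHEMA {schema} GRANT SELECT ON TABLES TO {role};",
        ]
    return statements


for statement in grants_for("anomalyarmor", "your_db", ["public", "analytics"]):
    print(statement)
```

Review the output before running it, and execute the `ALTER DEFAULT PRIVILEGES` lines as the role that creates new tables.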

## Alert Configuration Mistakes

### Alerting on Everything

**The mistake**: Creating broad rules that alert on every change in every environment.

**Result**: Alert fatigue. Your team ignores all alerts.

**Better approach**:
1. Start with 5-10 critical production tables
2. Alert only on breaking changes (column removed, table removed)
3. Exclude dev/test environments
4. Add coverage gradually

### Same Destination for All Severities

**The mistake**: Sending all alerts to Slack or all alerts to PagerDuty.

**Result**: Important alerts get buried or you get paged for trivial issues.

**Better approach**:
```
Critical (column removed)  → PagerDuty + Slack #incidents
High (type changed)        → Slack #data-alerts
Low (column added)         → Email digest
```
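The routing above amounts to a lookup from event type to destinations. Here is a minimal sketch of that idea. The event-type keys, destination labels, and `destinations_for` are illustrative names, not AnomalyArmor configuration syntax.

```python
# Illustrative mapping that mirrors the routing shown above.
ROUTES = {
    "column_removed": ["pagerduty", "slack:#incidents"],  # critical
    "type_changed": ["slack:#data-alerts"],               # high
    "column_added": ["email-digest"],                     # low
}


def destinations_for(event_type: str, default=("slack:#data-alerts",)) -> list:
    """Return the destinations for an event type, falling back to a default."""
    return list(ROUTES.get(event_type, default))


print(destinations_for("column_removed"))  # ['pagerduty', 'slack:#incidents']
print(destinations_for("column_added"))    # ['email-digest']
```

Writing the mapping out this way makes gaps visible: any event type without an explicit entry falls through to the default, which is a decision worth making deliberately.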

### Too-Tight Freshness SLAs

**The mistake**: Setting freshness SLA to match exact expected timing.

```
ETL runs hourly
SLA set to 1 hour
ETL occasionally takes 65 minutes
Result: Constant false positives
```

**Better approach**:
```
ETL runs hourly
SLA set to 2 hours (2x expected)
Buffer for normal variation
Tighten later if needed
```
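To see why the buffer matters, count how often a set of observed gaps between loads would breach each SLA. The gap values below are hypothetical, and `count_breaches` is an illustrative helper.

```python
from datetime import timedelta


def count_breaches(gaps_minutes, sla: timedelta) -> int:
    """Count how many gaps between loads exceed the freshness SLA."""
    return sum(1 for gap in gaps_minutes if timedelta(minutes=gap) > sla)


# Hypothetical minutes between consecutive loads of an "hourly" ETL job.
gaps = [58, 61, 65, 60, 59, 63]

print(count_breaches(gaps, timedelta(hours=1)))  # 3
print(count_breaches(gaps, timedelta(hours=2)))  # 0
```

Running the same check against your own load history is a quick way to pick a threshold before enabling the alert.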

### Not Disabling Rules During Maintenance

**The mistake**: Getting paged during planned maintenance.

**How to fix**: Before maintenance:
1. Go to **Alerts → Rules**
2. Toggle OFF relevant rules
3. Set a reminder to re-enable
4. Toggle rules back ON after maintenance

## Discovery Configuration Mistakes

### Discovery Too Infrequent

**The mistake**: Daily discovery for production databases.

**Result**: Schema changes aren't caught until the next day, after pipelines have already failed.

**Better approach**:

| Environment | Discovery Frequency |
|-------------|---------------------|
| Production | Hourly |
| Staging | Every 6 hours |
| Development | Daily |

### Not Scheduling Before Pipeline Runs

**The mistake**: Discovery runs at random times, not before pipelines.

**Better approach**:
```
Pipeline runs at: 3:00 AM
Discovery should run at: 2:00 AM

This way, you get alerts before the pipeline runs.
```
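The scheduling rule is simple arithmetic, with one catch: a pipeline shortly after midnight pushes discovery into the previous evening. A small sketch, assuming a one-hour lead and an illustrative helper named `discovery_time`:

```python
from datetime import date, datetime, time, timedelta


def discovery_time(pipeline_start: time, lead: timedelta = timedelta(hours=1)) -> time:
    """Return the wall-clock time discovery should start, `lead` before the pipeline."""
    # Attach an arbitrary date so the subtraction can wrap past midnight.
    anchor = datetime.combine(date(2000, 1, 1), pipeline_start)
    return (anchor - lead).time()


print(discovery_time(time(3, 0)))   # 02:00:00
print(discovery_time(time(0, 30)))  # 23:30:00
```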

### Including Unnecessary Schemas

**The mistake**: Monitoring schemas that change constantly.

```
Schemas generating noise:
- pg_temp_* (PostgreSQL temp tables)
- test_* (test tables)
- *_backup (backup copies)
- dev_* (development)
```

**How to fix**: Add schema filters to exclude noisy schemas.
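Before saving a filter, you can preview which schemas the patterns above would exclude. The sketch below uses Python's shell-style wildcard matching. Whether AnomalyArmor's schema filters use exactly this glob syntax is an assumption, and `monitored_schemas` is an illustrative helper.

```python
from fnmatch import fnmatchcase

# The noisy patterns listed above, written as shell-style globs.
EXCLUDE_PATTERNS = ["pg_temp_*", "test_*", "*_backup", "dev_*"]


def monitored_schemas(schemas):
    """Keep only the schemas that match none of the exclude patterns."""
    return [
        schema
        for schema in schemas
        if not any(fnmatchcase(schema, pattern) for pattern in EXCLUDE_PATTERNS)
    ]


found = ["public", "analytics", "pg_temp_3", "test_orders", "sales_backup", "dev_alice"]
print(monitored_schemas(found))  # ['public', 'analytics']
```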

## Freshness Monitoring Mistakes

### Wrong Timestamp Column

**The mistake**: Using `created_at` for tables that get updated.

| Table Type | Right Column |
|------------|--------------|
| Append-only (events, logs) | `created_at` |
| Updated tables (users, profiles) | `updated_at` |
| ETL destination | `_loaded_at` or `_etl_timestamp` |

**Symptom**: Freshness shows stale even though data is current.

### Not Handling Weekends

**The mistake**: Freshness SLA assumes 24/7 updates.

**Result**: Weekend alerts for data that legitimately doesn't update.

**Options**:
1. Longer SLA (72 hours for daily data)
2. Disable rules on Friday, re-enable Monday
3. Different weekend thresholds (coming soon)
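The 72-hour figure in option 1 comes from the longest gap a weekday-only daily load produces. A quick check with hypothetical load times (`largest_gap_hours` is an illustrative helper):

```python
from datetime import datetime


def largest_gap_hours(load_times):
    """Return the longest gap, in hours, between consecutive load timestamps."""
    ordered = sorted(load_times)
    gaps = [
        (later - earlier).total_seconds() / 3600
        for earlier, later in zip(ordered, ordered[1:])
    ]
    return max(gaps)


# Hypothetical weekday-only loads at 06:00: Thursday, Friday, then Monday.
loads = [datetime(2024, 1, 11, 6), datetime(2024, 1, 12, 6), datetime(2024, 1, 15, 6)]
print(largest_gap_hours(loads))  # 72.0
```

Because the Friday-to-Monday gap is exactly 72 hours, any delay in the Monday load would breach a 72-hour SLA, so consider adding a small buffer on top.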

## Security Mistakes

### Using Production Application Credentials

**The mistake**: Using the same credentials your application uses.

**Risks**:
- If compromised, attacker has full application access
- Can't easily rotate without affecting application
- No audit trail separation

**Better approach**: Create a dedicated read-only user for AnomalyArmor.

### Not Using SSL

**The mistake**: Disabling SSL "because it's easier."

**Risk**: Credentials transmitted in plaintext.

**How to fix**: Always use `SSL Mode: require` for production.

## Checklist: Before Going Live

Use this checklist to avoid common mistakes:

**Connection**:
- [ ] Correct hostname (no typos)
- [ ] Correct port for database type
- [ ] SSL mode set to `require`
- [ ] AnomalyArmor IPs allowlisted

**Permissions**:
- [ ] Dedicated read-only user created
- [ ] CONNECT granted on database
- [ ] USAGE granted on schemas
- [ ] SELECT granted on tables
- [ ] Future table access configured

**Discovery**:
- [ ] Discovery frequency matches environment (hourly for production, 6h for staging, daily for dev)
- [ ] Scheduled before pipeline runs
- [ ] Noisy schemas excluded

**Alerts**:
- [ ] Started with critical tables only
- [ ] Routing by event type configured
- [ ] Dev/test excluded
- [ ] Freshness SLAs have buffer
- [ ] Maintenance rule-toggle process documented

## Still Having Issues?

- **[Connection Issues](/troubleshooting/connection-issues)**: Detailed connection troubleshooting
- **[Contact Support](/support/contact)**: Get help from our team

## Common Questions

### Why does AnomalyArmor find zero tables after I connect my database?

The user can connect but lacks `USAGE` on the schemas or `SELECT` on `information_schema`. Grant both, then re-run discovery. See [Forgetting Schema Permissions](#forgetting-schema-permissions) above for the exact grants.

### Why do new tables never show up in AnomalyArmor?

New tables are not automatically covered by `GRANT SELECT ON ALL TABLES`. Add `ALTER DEFAULT PRIVILEGES` so future tables inherit access. See [Not Granting Access to Future Tables](#not-granting-access-to-future-tables) for the PostgreSQL syntax.

### Why are we getting false positive freshness alerts on a table that updates fine?

Almost always the wrong timestamp column. Append-only tables use `created_at`, but tables that get updated in place need `updated_at` or `_loaded_at`. Switch the column and the noise usually stops.

### Should I use the same database user my application uses?

No. Create a dedicated read-only user for AnomalyArmor so you can rotate credentials independently, keep audit trails clean, and limit blast radius if a key leaks. See [Security Mistakes](#security-mistakes).

### How often should discovery run?

Hourly for production, every 6 hours for staging, daily for development. Schedule it an hour before your main pipelines run so schema changes are caught before they break anything. See [Discovery Configuration Mistakes](#discovery-configuration-mistakes).

---

<!-- Source: https://docs.anomalyarmor.ai/troubleshooting/connection-issues -->

# Connection Issues

> Troubleshoot database connectivity problems

This guide helps you diagnose and fix common database connection problems. If you're having trouble connecting AnomalyArmor to your database, start here.

## Quick Diagnosis

Use this table to match your error message to its likely cause:

| Error Message | Likely Cause | Jump To |
|---------------|--------------|---------|
| "Connection refused" | Network/firewall issue | [Connection refused](#connection-refused) |
| "Authentication failed" | Credentials issue | [Authentication failed](#authentication-failed) |
| "SSL required" | SSL configuration issue | [SSL required](#ssl-required--ssl-connection-error) |
| "Permission denied" | Database permissions issue | [Permission denied](#permission-denied--access-denied) |
| "Timeout" | Network latency or a firewall silently dropping packets | [Timeout](#timeout--connection-timed-out) |

## Connection Test Failures

### "Connection refused"

**What it means**: AnomalyArmor can't reach your database at all.

**What you'll see**: When you click "Test Connection" in the Add Connection form, a red error toast appears with:

- **Title**: "Error"
- **Message**: "Connection refused - Could not connect to database at mydb.us-east-1.rds.amazonaws.com:5432"
- **Details**: A list of possible causes:
  - Firewall/security group blocking connection
  - Incorrect hostname or port
  - Database not running

The toast appears in the top-right corner of the screen and includes action buttons to retry or edit the connection settings.

**Common causes**:
1. Firewall/security group blocking the connection
2. Wrong hostname or port
3. Database not running or not accepting connections

**Troubleshooting steps**:

#### 1. Verify hostname and port

Double-check the hostname. Typos are common:

```
Common mistakes:
[X] mydb.rds.amazonaws.com (missing region)
[OK] mydb.us-east-1.rds.amazonaws.com

[X] Port 3306 (MySQL default)
[OK] Port 5432 (PostgreSQL default)
```

#### 2. Check firewall/security group

AnomalyArmor needs inbound access on your database port.

**AWS RDS/Aurora:**
1. Go to RDS Console → Your instance → Security Groups
2. Add inbound rule:
   - Type: PostgreSQL (or your DB type)
   - Port: 5432 (or your port)
   - Source: AnomalyArmor IPs (see Settings → Security)

**Google Cloud SQL:**
1. Go to SQL Console → Your instance → Connections
2. Under "Authorized Networks", add AnomalyArmor IPs

**Self-hosted:**
```bash
# Allow AnomalyArmor IPs
iptables -A INPUT -p tcp --dport 5432 -s 34.xxx.xxx.xxx -j ACCEPT
```

#### 3. Verify database is running

Can you connect from another location?

```bash
# Test from your local machine
nc -zv your-host.rds.amazonaws.com 5432

# Expected output (success):
# Connection to your-host.rds.amazonaws.com port 5432 [tcp/postgresql] succeeded!

# If it fails, the issue is likely firewall-related
```
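If `nc` isn't available, the same check can be done with Python's standard library. It also separates an actively refused connection from one that silently times out, which point to different causes. `probe` is an illustrative helper, and the host you pass is whatever endpoint you are testing.

```python
import socket


def probe(host: str, port: int, timeout: float = 5.0) -> str:
    """Return 'open', 'refused', 'timeout', or 'error: ...' for a TCP port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered but nothing accepts connections on this port.
        return "refused"
    except socket.timeout:
        # No answer at all, often a firewall dropping packets.
        return "timeout"
    except OSError as exc:
        # DNS failures, unreachable networks, and similar problems.
        return f"error: {exc}"
```

For example, call `probe("your-host.rds.amazonaws.com", 5432)` from your own machine. Keep in mind that it tests the path from where you run it, not from AnomalyArmor's IPs.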

#### 4. Check for VPC/private network issues

If your database is in a private subnet:
- It may not be accessible from the internet
- You may need VPC peering or PrivateLink (Enterprise)
- Contact us about private connectivity options

### "Authentication failed"

**What it means**: AnomalyArmor reached your database but the credentials were rejected.

**What you'll see**: A red error notification appears with a message like:

> ⚠️ **Connection failed**
>
> Authentication failed - FATAL: password authentication failed for user 'anomalyarmor'
>
> Possible causes:
> - Incorrect username or password
> - User doesn't exist in the database
> - User lacks CONNECT permission

**Common causes**:
1. Wrong username or password
2. User doesn't exist
3. User doesn't have permission to connect

**Troubleshooting steps**:

#### 1. Verify credentials

Copy-paste to avoid typos:
- Copy the username exactly as it appears in your database
- Copy the password (watch for leading/trailing spaces)
- Check for special characters that may need escaping
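Both problems are easy to check mechanically before you paste credentials anywhere. The sketch below flags stray whitespace and shows percent-encoding for cases where you test the same credentials through a URL-style connection string in your own tools. The helper names are illustrative, and whether a given form field needs any escaping is an assumption to verify.

```python
from urllib.parse import quote


def credential_warnings(username: str, password: str) -> list[str]:
    """Flag whitespace problems that commonly sneak in when copy-pasting."""
    warnings = []
    if username != username.strip():
        warnings.append("username has leading or trailing whitespace")
    if password != password.strip():
        warnings.append("password has leading or trailing whitespace")
    return warnings


def encode_for_url(value: str) -> str:
    """Percent-encode a credential for use inside a connection URL."""
    return quote(value, safe="")


print(credential_warnings("anomalyarmor", "s3cret "))
print(encode_for_url("p@ss:word/1"))  # p%40ss%3Aword%2F1
```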

#### 2. Confirm user exists

Connect to your database and check:

```sql
-- PostgreSQL
SELECT usename FROM pg_user WHERE usename = 'anomalyarmor';

-- ClickHouse
SELECT name FROM system.users WHERE name = 'anomalyarmor';
```

#### 3. Reset the password

If unsure, reset the password:

```sql
-- PostgreSQL
ALTER USER anomalyarmor WITH PASSWORD 'new-secure-password';

-- ClickHouse
ALTER USER anomalyarmor IDENTIFIED BY 'new-secure-password';
```

Then update the password in AnomalyArmor.

#### 4. Check pg_hba.conf (PostgreSQL)

For self-hosted PostgreSQL, check `pg_hba.conf` allows the connection:

```
# Add entry for AnomalyArmor
hostssl all anomalyarmor 34.xxx.xxx.xxx/32 scram-sha-256
```

Reload PostgreSQL after changes:
```bash
sudo systemctl reload postgresql
```

### "SSL required" / "SSL connection error"

**What it means**: The database requires SSL but the connection isn't using it (or vice versa).

**What you'll see**: A red error notification appears with a message like:

> ⚠️ **Connection failed**
>
> SSL connection required - server requires SSL connection
>
> Solution: Set SSL Mode to 'require' in your connection settings and try again.

**Troubleshooting steps**:

#### 1. Enable SSL in AnomalyArmor

Edit your connection:
1. Go to Data Sources → Your connection → Settings
2. Set **SSL Mode** to `require`
3. Save and test

#### 2. Try different SSL modes

| Mode | Description |
|------|-------------|
| `disable` | No SSL (not recommended) |
| `require` | SSL required, no certificate verification |
| `verify-ca` | Verify server certificate |
| `verify-full` | Verify certificate and hostname |

Start with `require` for most cloud databases.
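The difference between the modes is what the client verifies about the server. The sketch below expresses each mode from the table as a Python `ssl.SSLContext`, following the semantics of PostgreSQL's `sslmode` values of the same names. It illustrates the concept only: that AnomalyArmor's setting behaves identically is an assumption, and `context_for` is a hypothetical helper.

```python
import ssl


def context_for(sslmode: str):
    """Build an ssl.SSLContext mirroring a libpq-style sslmode, or None for 'disable'."""
    if sslmode == "disable":
        return None
    context = ssl.create_default_context()
    if sslmode == "require":
        # Encrypt, but do not verify the server certificate or hostname.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    elif sslmode == "verify-ca":
        # Verify the certificate chain, but not that the hostname matches.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_REQUIRED
    elif sslmode == "verify-full":
        # Verify both the certificate chain and the hostname.
        context.check_hostname = True
        context.verify_mode = ssl.CERT_REQUIRED
    else:
        raise ValueError(f"unknown sslmode: {sslmode!r}")
    return context


for mode in ("require", "verify-ca", "verify-full"):
    ctx = context_for(mode)
    print(mode, ctx.verify_mode.name, ctx.check_hostname)
```

`require` protects against passive eavesdropping, while only the `verify-*` modes also authenticate the server.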

#### 3. Check cloud provider requirements

- **AWS RDS**: Requires SSL by default, use `require`
- **Aurora**: Requires SSL, use `require`
- **Supabase**: Requires SSL, use `require`
- **Google Cloud SQL**: Requires SSL, use `require`
- **ClickHouse Cloud**: Always HTTPS on port 8443

### "Timeout" / "Connection timed out"

**What it means**: The connection attempt took too long.

**Common causes**:
1. Network latency
2. Firewall silently dropping packets
3. Database under heavy load

**Troubleshooting steps**:

#### 1. Check network path

Test connectivity from your own machine:

```bash
# Test if port is reachable
nc -zv -w 10 your-host.com 5432

# Test latency
ping your-host.com
```

#### 2. Check security groups

If the security group doesn't have an explicit allow rule, packets may be silently dropped (causing timeout instead of refused).

#### 3. Try during off-peak hours

If the database is under heavy load, connections may time out.

### "Permission denied" / "Access denied"

**What it means**: Connection succeeded but the user can't access the required data.

**Troubleshooting steps**:

#### 1. Grant CONNECT permission

```sql
-- PostgreSQL
GRANT CONNECT ON DATABASE your_database TO anomalyarmor;
```

#### 2. Grant schema access

```sql
-- PostgreSQL
GRANT USAGE ON SCHEMA public TO anomalyarmor;
GRANT USAGE ON SCHEMA analytics TO anomalyarmor;
```

#### 3. Grant table access

```sql
-- PostgreSQL
GRANT SELECT ON ALL TABLES IN SCHEMA public TO anomalyarmor;

-- For future tables
ALTER DEFAULT PRIVILEGES IN SCHEMA public
GRANT SELECT ON TABLES TO anomalyarmor;
```

#### 4. Verify with test query

Connect as the anomalyarmor user and run:

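```sql
SELECT table_schema, table_name FROM information_schema.tables
WHERE table_schema NOT IN ('pg_catalog', 'information_schema') LIMIT 5;
```

If this fails or returns no rows, the user needs more permissions. In PostgreSQL, `information_schema` views list only the objects the current user has some privilege on.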

## Discovery Failures

If connection tests pass but discovery fails:

### "No tables found"

**Causes**:
- User can connect but can't see tables
- Schema filter excluding all schemas
- Connected to wrong database

**Solutions**:
1. Check schema filter settings
2. Verify user has SELECT on `information_schema`
3. Confirm database name is correct

### "Discovery timeout"

**Causes**:
- Very large database (10,000+ tables)
- Slow network
- Database under load

**Solutions**:
1. Exclude unnecessary schemas
2. Run during off-peak hours
3. Consider using a read replica

## Provider-Specific Issues

### AWS RDS

**Common issues:**
- Security group not allowing AnomalyArmor IPs
- Using private subnet without public access
- Wrong endpoint (cluster vs instance)

**Solutions:**
1. Add AnomalyArmor IPs to security group
2. Enable "Publicly accessible" or use VPC peering
3. Use the correct endpoint for your use case

### Databricks

**Common issues:**
- Invalid or expired access token
- Wrong HTTP path
- SQL warehouse not running

**Solutions:**
1. Generate a new access token
2. Copy HTTP path from SQL Warehouse settings
3. Ensure warehouse is running or use serverless

### ClickHouse

**Common issues:**
- Using native port instead of HTTP port
- HTTP interface disabled
- IP not in user's allowed hosts

**Solutions:**
1. Use port 8443 (HTTPS) or 8123 (HTTP)
2. Enable HTTP interface in config.xml
3. Add IPs to user's HOST restriction

## Still Stuck?

If you've tried everything above and still can't connect:

1. **Collect diagnostic info**:
   - Error message (exact text)
   - Database type and version
   - Cloud provider (if applicable)
   - Network setup (public/private subnet)

2. **Contact support**:
   - Email: support@anomalyarmor.ai
   - Include the diagnostic info above
   - We can help troubleshoot complex network setups

## Common Questions

### Why does my connection test say 'Connection refused'?

AnomalyArmor could not open a socket to your host. It's almost always a firewall or security group blocking our IPs, a wrong hostname or port, or a database that isn't actually listening. Walk through the [Connection refused](#connection-refused) checklist.

### What IP addresses do I need to allowlist?

AnomalyArmor connects from a small set of static IPs listed in **Settings → Security** in your dashboard. Add those to your security group or firewall inbound rules for the database port.

### Why am I getting 'Authentication failed' when the credentials look correct?

The most common causes are trailing whitespace in the password, a user that was created but not granted `CONNECT`, or (self-hosted PostgreSQL) a missing `pg_hba.conf` entry for our IPs. See [Authentication failed](#authentication-failed).

### Why does the connection time out instead of returning an error?

Timeouts almost always mean a firewall is silently dropping packets rather than rejecting them. Check your security group has an explicit allow rule for AnomalyArmor's IPs on the database port. See [Timeout](#timeout--connection-timed-out).

### Can AnomalyArmor connect to a database in a private VPC?

Not over the public internet. Enterprise customers can connect via VPC peering or AWS PrivateLink so the database never needs a public endpoint. [Contact sales](mailto:sales@anomalyarmor.ai) to set that up.

---

<!-- Source: https://docs.anomalyarmor.ai/troubleshooting/faq -->

# FAQ

> Frequently asked questions about AnomalyArmor

## General

### What databases do you support?

AnomalyArmor currently supports PostgreSQL (12+) including RDS, Aurora, Supabase, and self-hosted; Databricks with Unity Catalog; and ClickHouse (21.8+) including ClickHouse Cloud. More databases are on the way. [Contact us](mailto:support@anomalyarmor.ai) if you need a specific database.

### Do you access my actual data?

No. AnomalyArmor queries metadata from system catalogs like `information_schema` plus bounded aggregates (`COUNT`, `MAX`) for freshness. We never run `SELECT *` or read row values. See [Data Handling](/security/data-handling) for the full list.

### How does discovery work?

Discovery queries your database's system catalogs to find tables, views, and columns. For PostgreSQL we query `information_schema` and `pg_catalog`; for Databricks we use Unity Catalog APIs. Each run is compared against the previous state to detect changes.
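The comparison step can be pictured as a set difference between two snapshots. This is a minimal sketch of the idea, not AnomalyArmor's implementation. The snapshot shape (a dict keyed by `(table, column)` with the data type as value) and the name `diff_columns` are assumptions.

```python
def diff_columns(previous: dict, current: dict) -> dict:
    """Compare two {(table, column): data_type} snapshots."""
    added = sorted(current.keys() - previous.keys())
    removed = sorted(previous.keys() - current.keys())
    type_changed = sorted(
        key for key in previous.keys() & current.keys() if previous[key] != current[key]
    )
    return {"added": added, "removed": removed, "type_changed": type_changed}


previous = {("orders", "id"): "integer", ("orders", "total"): "numeric", ("orders", "note"): "text"}
current = {("orders", "id"): "bigint", ("orders", "total"): "numeric", ("orders", "status"): "text"}

changes = diff_columns(previous, current)
print(changes["added"])         # [('orders', 'status')]
print(changes["removed"])       # [('orders', 'note')]
print(changes["type_changed"])  # [('orders', 'id')]
```

Only names and types are compared, which is why discovery can detect drift without reading any rows.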

### How often should I run discovery?

Hourly for production, every 6 hours for staging, and daily for development. You can also trigger discovery on demand from the asset page at any time.

### What's the performance impact on my database?

Minimal. Discovery runs a handful of lightweight system-catalog queries (typically 3-10 queries, under 1 second each) and never scans your tables. For very large databases you can point AnomalyArmor at a read replica.

## Schema Monitoring

### What schema changes do you detect?

Columns added, columns removed, column type changes, tables added, tables removed, and constraint modifications. Changes are surfaced on the next discovery run after they occur.

### Can you detect column renames?

Not directly yet. A rename surfaces as a column removed plus a column added, which you can correlate manually. Automatic rename detection is on the roadmap.
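Manual correlation can be partly scripted. The heuristic below pairs a removed column with an added column when both are in the same table and share a data type. It is a hypothetical aid with an assumed input shape. It does not describe the roadmap feature and will produce false matches when several columns of the same type change at once.

```python
def rename_candidates(removed: dict, added: dict) -> list:
    """Pair removed and added columns in the same table that share a data type.

    Both arguments map (table, column) -> data_type.
    """
    pairs = []
    for (table, old_name), old_type in sorted(removed.items()):
        for (new_table, new_name), new_type in sorted(added.items()):
            if table == new_table and old_type == new_type:
                pairs.append((table, old_name, new_name))
    return pairs


removed = {("users", "email"): "varchar"}
added = {("users", "email_address"): "varchar", ("users", "age"): "integer"}

print(rename_candidates(removed, added))  # [('users', 'email', 'email_address')]
```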

### Why didn't I get alerted about a schema change?

Check, in order: did discovery run after the change; is the table in a monitored schema; does the alert rule scope include this asset; is the rule enabled; is the destination configured and tested.

## Freshness Monitoring

### How does freshness monitoring work?

You specify a timestamp column (like `created_at` or `updated_at`). AnomalyArmor queries `MAX(column)` and alerts if the latest timestamp is older than your SLA threshold. See [Freshness Monitoring](/data-quality/freshness-monitoring).

### What if my table doesn't have a timestamp column?

Without a timestamp column we can't monitor freshness. Options: add a timestamp column in your ETL, use an `_etl_loaded_at` audit column, or skip freshness for that table and rely on row-count metrics instead.

### How do I fix a false-positive freshness alert?

Usually the SLA is too tight or you're using the wrong column (`created_at` on a table that gets updated). Loosen the threshold to 2x the expected interval, switch to `updated_at` for mutable tables, and check timezone handling.

## Alerts

### How do I reduce alert fatigue?

Start with 5-10 critical tables, route by event type (breaking changes to PagerDuty, additions to email), exclude dev/test environments, and start with lenient thresholds that you tighten over time. See [Alert Best Practices](/alerts/best-practices).

### Can I disable alerts temporarily?

Yes. Go to **Alerts → Rules**, find the rule, and toggle it off. The rule won't fire until you toggle it back on, which is the right pattern for planned maintenance.

### How do I test if my alerts work?

Go to **Alerts → Destinations**, pick the destination, and click **Send Test Alert**. You'll see a test notification land in Slack, PagerDuty, email, or the webhook target within a few seconds.

### Can I send alerts to multiple destinations?

Yes. When you create or edit an alert rule you can attach any number of destinations, so (for example) the same schema-drift alert can hit both Slack and email.

## Security

### How are my credentials stored?

Database credentials are encrypted at rest with AES-256-GCM using keys managed by AWS KMS. Keys never leave the KMS boundary and not even our support team can read your credentials. See [Security Overview](/security/overview).

### What compliance certifications do you have?

AnomalyArmor is GDPR compliant (DPA available) and HIPAA-ready (BAA available on Enterprise). Email security@anomalyarmor.ai for compliance documentation.

### What IP addresses does AnomalyArmor connect from?

AnomalyArmor uses a small set of static IPs. View the current list in **Settings → Security** and add them to your firewall or security-group inbound rules.

### Can I use a read-only database user?

Yes, and we recommend it. We only need `SELECT` on system catalogs and the tables you want monitored. See the [PostgreSQL guide](/data-sources/postgresql) for the exact grants.

## Billing & Plans

### Is there a free trial?

Trials are available by invitation: a magic link in a cold email, a trial code we share directly, or the [homepage try-it form](https://www.anomalyarmor.ai/). There is no free tier, and the signup page does not offer a self-serve trial without a code. Trials last 14 days and require no credit card; at the end you pick a paid plan or your workspace pauses until you choose one.

### What counts toward my usage?

Usage is based on active monitors: each freshness schedule, data quality metric, and validity rule. Schema drift monitoring and AI descriptions are free. See [Usage Limits](/billing/usage-limits).

### Can I change plans?

Yes. Upgrade or downgrade any time from **Settings → Billing**. Upgrades take effect immediately with prorated charges; downgrades take effect next billing cycle.

### What happens if I exceed my plan limits?

You'll get a notification and the option to upgrade or remove unused monitors. We won't cut off service without warning.

## Technical

### Do you have an API?

Yes. A full REST API is available on all plans. See the [API overview](/api/overview) for authentication and endpoints.

### Can I integrate with my existing tools?

Yes, via native Slack and PagerDuty integrations, outbound webhooks to any HTTPS endpoint, email destinations, and the REST API for custom integrations.

### What browsers do you support?

Modern Chrome, Firefox, Safari, and Edge. We don't support Internet Explorer.

### Is there a self-hosted option?

Self-hosted deployments are available for Enterprise customers with specific compliance requirements. [Contact sales](mailto:sales@anomalyarmor.ai) for details.

## Troubleshooting

### My connection test fails. What do I do?

Start with the [Connection Issues](/troubleshooting/connection-issues) guide. It walks through refused, timed-out, authentication, SSL, and permission errors with specific fixes for each.

### Discovery finds no tables. Why?

Typically the user is missing `USAGE` on the schemas or `SELECT` on `information_schema`, the schema filter is excluding everything, or you connected to the wrong database name. See [Common Setup Mistakes](/troubleshooting/common-setup-mistakes).

### I'm not receiving alerts. What's wrong?

Check that the alert rule is enabled, the destination is configured and passes a test send, the rule scope actually matches the asset, and discovery is running and detecting changes.

## Still Have Questions?

- **[Contact Support](/support/contact)**: Get help from our team
- **[Documentation](/)**: Browse the full documentation

---

<!-- Source: https://docs.anomalyarmor.ai/support/contact -->

# Get Help

> Contact support and find answers

## Self-Service First

Most issues resolve in minutes:

| Issue | Solution |
|-------|----------|
| Data seems stale | Run discovery to refresh |
| Connection failing | Check [Connection Issues](/troubleshooting/connection-issues) |
| Alert not firing | Verify rule scope and destination |
| Missing tables | Check schema filters and permissions |

See [Common Setup Mistakes](/troubleshooting/common-setup-mistakes) for quick fixes.

## Contact Support

| Channel | Best For | Response Time |
|---------|----------|---------------|
| **In-app chat** | Quick questions | Minutes (business hours) |
| **support@anomalyarmor.ai** | Detailed issues | Within 24 hours |
| **security@anomalyarmor.ai** | Security concerns | Within 4 hours |

Business hours: Monday-Friday, 9am-6pm PT

## System Status

Check for ongoing incidents: **[status.anomalyarmor.ai](https://status.anomalyarmor.ai)**

Subscribe to status updates for proactive notifications.

## When Contacting Support

Include these details for faster resolution:

**Required:**
- Data source type (Postgres, Snowflake, etc.)
- Error message (screenshot or copy/paste)
- When it started happening

**Helpful:**
- Steps to reproduce
- What you've already tried
- Asset or rule ID (from URL)

## Enterprise Support

Support levels by plan. Enterprise adds faster email response, extended chat hours, a dedicated CSM, and phone support:

| Feature | Growth | Professional | Enterprise |
|---------|--------|--------------|------------|
| Email support | 24h response | 24h response | 4h response |
| In-app chat | Business hours | Business hours | Extended hours |
| Dedicated CSM | - | - | Included |
| Phone support | - | - | Available |

## Feature Requests

Have an idea? We track all requests:

1. **In-app chat**: Tell us what you need
2. **Email**: support@anomalyarmor.ai with subject "Feature Request"

We prioritize based on customer impact and strategic fit.

## Related

- **[Troubleshooting](/troubleshooting/common-setup-mistakes)**: Common fixes for setup issues
- **[FAQ](/troubleshooting/faq)**: Frequently asked questions

---

<!-- Source: https://docs.anomalyarmor.ai/glossary -->

# Glossary

> Key terms and definitions for AnomalyArmor

Terms and definitions used throughout AnomalyArmor and this documentation.

## A

**Alert**
A notification sent when AnomalyArmor detects an event matching one of your configured rules. Alerts can be sent to Slack, email, PagerDuty, webhooks, or Microsoft Teams.

**Alert Rule**
A configuration that defines what events trigger alerts, what conditions must be met, and where notifications are sent. Each rule has an event type, optional scope/conditions, and one or more destinations.

**Asset**
A data object that AnomalyArmor discovers and monitors. Assets include tables, views, and materialized views. Each asset has metadata like column names, data types, and change history.

**Asset Catalog**
The complete inventory of all discovered assets across your connected data sources. Searchable and filterable, the catalog is your source of truth for what exists in your data infrastructure.

## B

**Baseline**
The initial state recorded during the first discovery of an asset. Subsequent discoveries compare against this baseline to detect changes.

**Breaking Change**
A schema modification likely to cause query failures, such as a removed column or table. Breaking changes typically warrant immediate attention and urgent alerts.

## C

**Change Detection**
The process of comparing the current state of an asset against its previous state to identify modifications. Part of the discovery process.

**Connection**
See [Data Source](#d).

**Cooldown Period**
The minimum time between repeated alerts for the same event. Prevents alert storms when an issue persists across multiple discoveries.

## D

**Data Classification**
Tags and labels applied to assets or columns for organization, governance, and compliance. Examples include `PII`, `Financial`, `Confidential`, or custom tags.

**Data Quality**
The degree to which data meets expectations for freshness, completeness, accuracy, and schema correctness. AnomalyArmor monitors freshness and schema as key quality dimensions.

**Data Source**
A connection to a database or data platform. Each data source stores credentials, connection settings, and discovery configuration. Also called a "connection."

**Destination**
A configured endpoint where alerts are sent. Supported destinations include Slack, email, PagerDuty, Microsoft Teams, and webhooks.

**Discovery** (asset discovery)
The process by which AnomalyArmor scans a database's system catalogs to find and catalog assets (tables, views, columns, data types). Discovery is the *action* that produces a metadata snapshot; it does not, by itself, detect changes. The individual differences between successive discoveries are [schema changes](#s), and the system that detects them is [schema drift](#s). User-facing verb: **Run Discovery**.

**Discovery Job**
A single execution of the discovery process. Each job queries the database and produces a metadata snapshot. Change detection happens separately by comparing the new snapshot against the previous one.

**Asset Discovery**
See [Discovery](#d). Canonical name for the discovery workflow (`asset_discovery` in the backend). Synonymous with Discovery.

## F

**Freshness**
A measure of how recently data in a table was updated. Calculated by checking the maximum value of a configured timestamp column. Tables with data older than their SLA are considered "stale."

**Freshness SLA**
See [SLA](#s).

**Freshness Violation**
An event triggered when a table's data exceeds its configured freshness threshold. Indicates the data is stale and may impact downstream systems.

## I

**Incremental Discovery**
A discovery mode that only detects changes since the last run, rather than rescanning everything. Faster and lighter than full discovery. The default for scheduled discoveries after the initial baseline.

**Information Schema**
The standard SQL schema containing metadata about database objects. AnomalyArmor queries `information_schema` (and similar system catalogs) to discover assets. Discovery never scans your actual data tables.

## M

**Metadata**
Data about data. For AnomalyArmor, this includes table names, column names, data types, constraints, timestamps, and schema history. Metadata is all we collect; we never see row data.

**Metric**
A statistical measurement tracked over time for data quality monitoring. Types include row count, null percentage, distinct count, duplicate count, and numeric statistics (min, max, average). Metrics are captured on a schedule and analyzed for anomalies.

**Metric Anomaly**
A metric value that falls outside the expected range based on historical data. Detected using statistical methods (z-score) with configurable sensitivity. Anomalies may indicate data quality issues.

## P

**PII (Personally Identifiable Information)**
Data that can identify an individual, such as names, email addresses, phone numbers, or social security numbers. Often tagged for compliance purposes using data classification.

## R

**Rule**
See [Alert Rule](#a).

## S

**Schema**
The structure of a database, including tables, columns, data types, and constraints. In database terminology, also refers to a namespace within a database (e.g., `public` schema in PostgreSQL).

**Schema Change**
Any modification to database structure. Types include: column added, column removed, column type changed, table added, table removed, and constraint modified.

**Schema Drift**
The phenomenon of database structure changing over time, intentionally or unintentionally. AnomalyArmor detects drift by comparing consecutive discoveries.

**Schema History**
The complete record of all schema changes detected for an asset. Includes what changed, when it was detected, and before/after values.

**Sensitivity**
A configuration parameter for anomaly detection that determines how many standard deviations from the mean trigger an alert. Lower values (1.0) are more sensitive and catch smaller deviations. Higher values (3.0) reduce false positives. Default is 2.0.

**Scope**
The filter criteria on an alert rule that determine which assets the rule applies to. Scope can include data source, schema, specific assets, or patterns.

**SLA (Service Level Agreement)**
A configured threshold defining how fresh data should be. For example, an SLA of "1 hour" means the table should have data no older than 1 hour. Violations trigger freshness alerts.

**Stale Data**
Data that hasn't been updated within its expected timeframe (SLA). Stale data may indicate ETL failures, source system issues, or pipeline problems.

## T

**Table**
A fundamental database object that stores data in rows and columns. Tables are the most common asset type monitored by AnomalyArmor.

**Timestamp Column**
A column containing date/time values used to track when data was created or updated. AnomalyArmor queries the maximum value of this column for freshness monitoring.

## U

**Unity Catalog**
Databricks' unified governance solution for data and AI. AnomalyArmor integrates with Unity Catalog to discover and monitor Databricks assets.

## V

**View**
A virtual table defined by a SQL query. Views don't store data themselves but present data from underlying tables. AnomalyArmor monitors views as assets.

**VPC Peering**
A network connection between two Virtual Private Clouds (VPCs) that enables private connectivity. Enterprise feature for connecting AnomalyArmor to databases in private subnets.

## W

**Warning Threshold**
An optional secondary threshold that triggers an early warning alert before the main SLA violation. Gives teams advance notice that data is approaching staleness.

**Webhook**
An HTTP callback that receives alert data when events occur. Webhooks enable custom integrations with any system that can receive HTTP requests.

## Z

**Z-Score**
A statistical measure indicating how many standard deviations a value is from the mean. Used in anomaly detection for metrics. A z-score above the configured sensitivity threshold triggers an anomaly alert.
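
As a rough illustration of how a z-score and the sensitivity setting interact, the sketch below flags a value when its absolute z-score exceeds the sensitivity. The helper names, the use of the sample standard deviation, and the use of the absolute value are assumptions for this example; the product's exact calculation is not specified here.

```python
from statistics import mean, stdev
from typing import Sequence


def z_score(value: float, history: Sequence[float]) -> float:
    """Number of standard deviations between value and the historical mean."""
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation; needs at least 2 points
    if sigma == 0:
        raise ValueError("history has no variance, so a z-score is undefined")
    return (value - mu) / sigma


def is_anomaly(value: float, history: Sequence[float],
               sensitivity: float = 2.0) -> bool:
    """Flag values whose absolute z-score exceeds the sensitivity."""
    return abs(z_score(value, history)) > sensitivity


row_counts = [100, 102, 98, 101, 99]  # made-up daily row counts
print(is_anomaly(110, row_counts))    # far from the mean of 100
print(is_anomaly(101, row_counts))    # well within normal variation
```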

---

**Allowlist**
A list of allowed IP addresses. Many databases require adding AnomalyArmor's IP addresses to allowlists in security groups or firewalls before connections can be established.

---

<!-- Source: https://docs.anomalyarmor.ai/downloads/overview -->

# Downloads

> Downloadable resources for setting up AnomalyArmor

Download these ready-to-use resources to accelerate your AnomalyArmor setup.

## SQL Permission Templates

Minimal-privilege SQL scripts for granting AnomalyArmor read-only access to your databases. Each script includes:

- Step-by-step setup instructions
- Customizable placeholders
- Verification queries
- Cleanup commands

- **[PostgreSQL](/downloads/postgresql-permissions)**: For PostgreSQL, RDS, Aurora, Supabase
- **[Snowflake](/downloads/snowflake-permissions)**: Role and user setup for Snowflake
- **[Databricks](/downloads/databricks-permissions)**: Unity Catalog grants

## How to Use

1. **Open** the script page for your database
2. **Copy** the SQL code
3. **Replace** the placeholder values (marked with `your_` prefix)
4. **Run** as a database admin or owner
5. **Verify** using the test queries provided

> 💡 These scripts grant the minimum permissions required for AnomalyArmor. You can further restrict access by using the per-schema options in each script.

## Security Best Practices

### Use dedicated service accounts

Create a dedicated user for AnomalyArmor rather than using personal accounts. This ensures:
- Monitoring continues when employees leave
- Clear audit trail of AnomalyArmor activity
- Easy permission revocation if needed

### Grant only what's needed

AnomalyArmor only needs `SELECT` access on:
- `information_schema` or equivalent for discovery
- Tables you want to monitor for freshness

All scripts default to read-only access.

### Use key-pair authentication when available

For Snowflake and other databases that support it:
- More secure than passwords
- No password rotation required
- Supports hardware security modules

## Need Help?

- **[PostgreSQL Guide](/data-sources/postgresql)**: Detailed PostgreSQL setup instructions
- **[Snowflake Guide](/data-sources/snowflake)**: Detailed Snowflake setup instructions
- **[Databricks Guide](/data-sources/databricks)**: Detailed Databricks setup instructions
- **[Contact Support](/support/contact)**: Get help from our team

---

<!-- Source: https://docs.anomalyarmor.ai/api/overview -->

# AnomalyArmor API Overview

> Programmatic access to AnomalyArmor data observability features via the REST API, OpenAPI 3.1 spec, and SDKs

AnomalyArmor provides a REST API for programmatic access to data observability features. This enables automation, CI/CD integration, and custom tooling.

## Base URL

```
https://api.anomalyarmor.ai/api/v1
```

## Authentication

All API requests require authentication via API key. Include your key in the `Authorization` header:

```bash
curl -H "Authorization: Bearer aa_live_your_key_here" \
  https://api.anomalyarmor.ai/api/v1/assets
```

> ℹ️ Create API keys in **Settings > API Keys** or via CLI: `armor api-keys create`

## Rate Limiting

Rate limits are enforced per API key based on your subscription tier:

| Tier | Rate Limit | Burst |
|------|------------|-------|
| Free Trial | 20 req/min | 2 req/sec |
| Starter | 30 req/min | 3 req/sec |
| Growth | 500 req/min | 25 req/sec |
| Professional | 1,000 req/min | 50 req/sec |
| Enterprise | 5,000 req/min | 100 req/sec |

### Rate Limit Headers

Every response includes rate limit information:

```
X-RateLimit-Limit: 500
X-RateLimit-Remaining: 499
X-RateLimit-Reset: 1701705600
```

When rate limited, you'll receive a `429` response with a `Retry-After` header.
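
A client can turn these headers into a wait time before its next request. The sketch below is one way to do that, under two assumptions that are not confirmed by this page: `Retry-After` is expressed in seconds (HTTP also allows a date), and `X-RateLimit-Reset` is a Unix timestamp in seconds. `seconds_to_wait` is a hypothetical helper name.

```python
import time
from typing import Mapping, Optional


def seconds_to_wait(status: int, headers: Mapping[str, str],
                    now: Optional[float] = None) -> float:
    """Return how long to pause before the next request (0.0 if no pause is needed)."""
    now = time.time() if now is None else now
    if status == 429 and "Retry-After" in headers:
        # Assumption: Retry-After carries a number of seconds.
        return max(0.0, float(headers["Retry-After"]))
    if headers.get("X-RateLimit-Remaining") == "0" and "X-RateLimit-Reset" in headers:
        # Assumption: the reset value is a Unix epoch timestamp in seconds.
        return max(0.0, float(headers["X-RateLimit-Reset"]) - now)
    return 0.0


print(seconds_to_wait(429, {"Retry-After": "12"}))
```

The lookup here is case-sensitive because it uses a plain mapping; most HTTP libraries expose case-insensitive header objects, which work with the same function.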

## Pagination

All list endpoints use offset/limit pagination:

```bash
GET /api/v1/assets?limit=50&offset=100
```

Response includes pagination metadata:

```json
{
  "data": [...],
  "pagination": {
    "total": 245,
    "limit": 50,
    "offset": 100,
    "has_more": true
  }
}
```
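
The pagination metadata above supports a simple loop that keeps requesting pages until `has_more` is `false`. In this sketch the HTTP call is injected as `fetch_page` so the loop can be shown without a network dependency; `iter_items` and `fake_fetch` are hypothetical names, and `fake_fetch` only imitates the documented response shape.

```python
from typing import Any, Callable, Dict, Iterator

Page = Dict[str, Any]


def iter_items(fetch_page: Callable[[int, int], Page], limit: int = 100) -> Iterator[Any]:
    """Yield every item from a list endpoint, one page at a time."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page["data"]
        if not page["pagination"]["has_more"]:
            break
        offset += limit


def fake_fetch(limit: int, offset: int) -> Page:
    """Stand-in for a real request: serves 245 integers in the documented shape."""
    items = list(range(245))
    return {
        "data": items[offset:offset + limit],
        "pagination": {
            "total": len(items),
            "limit": limit,
            "offset": offset,
            "has_more": offset + limit < len(items),
        },
    }


print(len(list(iter_items(fake_fetch, limit=100))))
```

In real use, `fetch_page` would issue `GET /api/v1/assets?limit=...&offset=...` and return the decoded JSON body.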

## Error Responses

Errors return a consistent JSON structure:

```json
{
  "error": {
    "code": "ASSET_NOT_FOUND",
    "message": "Asset not found",
    "details": {
      "asset_id": "snowflake.prod.warehouse.orders",
      "suggestion": "Check the qualified name format: source.database.schema.table"
    }
  }
}
```

### Error Codes

| Code | HTTP Status | Description |
|------|-------------|-------------|
| `ASSET_NOT_FOUND` | 404 | Asset doesn't exist or isn't accessible |
| `UNAUTHORIZED` | 401 | Invalid or missing API key |
| `FORBIDDEN` | 403 | Valid key but insufficient scope |
| `RATE_LIMITED` | 429 | Too many requests |
| `VALIDATION_ERROR` | 400 | Invalid request parameters |
| `INTERNAL_ERROR` | 500 | Server error |
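
Because every error shares this structure, a client can convert the body into a single exception type. The following is a minimal sketch, assuming the body is the JSON shown above; `ApiError` and `parse_error` are hypothetical names and not part of the SDK.

```python
import json
from typing import Any, Dict, Optional


class ApiError(Exception):
    """Carries the code, message, and details from an error response."""

    def __init__(self, code: str, message: str,
                 details: Optional[Dict[str, Any]] = None) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.details = details or {}


def parse_error(body: str) -> ApiError:
    """Build an ApiError from a JSON error body."""
    err = json.loads(body)["error"]
    return ApiError(err["code"], err["message"], err.get("details"))


body = """{"error": {"code": "ASSET_NOT_FOUND", "message": "Asset not found",
  "details": {"asset_id": "snowflake.prod.warehouse.orders"}}}"""
error = parse_error(body)
print(error.code, error.details["asset_id"])
```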

## Quick Start
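
A minimal sketch of a first request, using only the base URL and Bearer header documented above. It relies on Python's standard library rather than the SDK; `build_request` and `list_assets` are illustrative helper names, not part of any AnomalyArmor package.

```python
import json
import urllib.request

BASE_URL = "https://api.anomalyarmor.ai/api/v1"


def build_request(path: str, api_key: str) -> urllib.request.Request:
    """Create an authenticated GET request for an API path such as "/assets"."""
    return urllib.request.Request(
        f"{BASE_URL}{path}",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
    )


def list_assets(api_key: str) -> dict:
    """Fetch the first page of assets and return the decoded JSON body."""
    with urllib.request.urlopen(build_request("/assets", api_key), timeout=30) as resp:
        return json.load(resp)


# Example (performs a real network call):
#   import os
#   assets = list_assets(os.environ["ARMOR_API_KEY"])
```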

## Available Endpoints

- **[Assets](/api/assets)**: List and retrieve data assets
- **[Freshness](/api/freshness)**: Monitor and check data freshness
- **[Schema](/api/schema)**: Track schema changes and drift
- **[Lineage](/api/lineage)**: Explore data dependencies
- **[Alerts](/api/alerts)**: Query alert history and status
- **[Authentication](/api/authentication)**: API key management

### Data Quality APIs

- **[Metrics](/api/metrics)**: Track row counts, null rates, and data trends
- **[Validity](/api/validity)**: Enforce NOT NULL, UNIQUE, REGEX, and custom rules
- **[Referential Checks](/api/referential-checks)**: Verify foreign key relationships

## Next Steps

- **[SDK Quickstart](/sdk/overview)**: Get started with the Python SDK
- **[CLI Reference](/cli/overview)**: Command-line interface guide
- **[Airflow Integration](/integrations/airflow)**: Use with Apache Airflow

## Common Questions

### What is the base URL for the AnomalyArmor API?

All endpoints are served from `https://api.anomalyarmor.ai/api/v1`. Every request must include a Bearer token in the `Authorization` header. Point the Python or TypeScript SDK at a different `api_url` / `baseUrl` if you are targeting a staging environment.

### How do I handle rate limits and 429 responses?

Limits are per-API-key and vary by subscription tier, from 20 req/min on Free Trial up to 5,000 req/min on Enterprise. Every response includes `X-RateLimit-Limit`, `X-RateLimit-Remaining`, and `X-RateLimit-Reset` headers, and 429s include a `Retry-After` header you should honor before retrying. The TypeScript SDK retries idempotent requests automatically; the Python SDK raises `RateLimitError` with a `retry_after` attribute.

### How does pagination work on list endpoints?

List endpoints use offset/limit pagination with `limit` (default 50, max 100) and `offset` query parameters. Responses include a `pagination` object with `total`, `limit`, `offset`, and `has_more` so you can loop until `has_more` is `false`. Prefer paging in batches of 100 to minimize request count against your rate limit.

### What error codes does the API return?

Errors come back as `{"error": {"code", "message", "details"}}`. Common codes include `ASSET_NOT_FOUND` (404), `UNAUTHORIZED` (401), `FORBIDDEN` (403), `RATE_LIMITED` (429), `VALIDATION_ERROR` (400), and `INTERNAL_ERROR` (500). The `details` object usually includes field names or suggestions you can surface in logs.

### Should I use the REST API directly or the SDK?

Use the SDK in application code: it handles auth, retries on 429, and typed responses for you. Hit the REST API directly from shell scripts, non-supported languages, or when you need a verb not yet wrapped by the ergonomic SDK surface. Both paths share the same `aa_live_*` Bearer token.

---

<!-- Source: https://docs.anomalyarmor.ai/api/authentication -->

# AnomalyArmor API Authentication

> Authenticate to the AnomalyArmor REST API with API keys, or to the AnomalyArmor MCP server with OAuth 2.1

AnomalyArmor supports two authentication methods:

| Method | Use Case |
|--------|----------|
| **API Keys** | SDK, CLI, CI/CD pipelines, programmatic access |
| **OAuth 2.1** | MCP server connections from AI tools (Claude Code, Cursor) |

For MCP integrations, see the [MCP Server](/integrations/mcp-server) page. OAuth authentication is handled automatically when you connect via the remote MCP server.

The rest of this page covers API key authentication for programmatic access.

## API Key Format

API keys use the format `aa_live_<random>`:

```
aa_live_k8jd92hf8j2hd98fh2d9h2f98h2d9fh2
```

> ⚠️ API keys are shown only once at creation. Store them securely - we cannot retrieve them later.

## Creating API Keys

### Via Dashboard

1. Go to **Settings > API Keys**
2. Click **Create API Key**
3. Enter a descriptive name (e.g., "Airflow Production")
4. Select scope based on needs: `read-only` for monitoring, `read-write` for triggering actions, `admin` for key management
5. Click **Create Key**
6. Copy the key immediately

### Via CLI

```bash
# Create a read-only key
armor api-keys create --name "airflow-prod" --scope read-only

# Create a read-write key for triggering refreshes
armor api-keys create --name "ci-pipeline" --scope read-write
```

### Via API

```bash
curl -X POST "https://api.anomalyarmor.ai/api/v1/api-keys" \
  -H "Authorization: Bearer aa_live_admin_key" \
  -H "Content-Type: application/json" \
  -d '{"name": "automation-key", "scope": "read-only"}'
```

## Scopes

| Scope | Capabilities |
|-------|--------------|
| `read-only` | GET endpoints only. Read assets, freshness, lineage, alerts. |
| `read-write` | GET + POST. Trigger freshness/schema refreshes. |
| `admin` | Full access including API key management. |

> 💡 Follow the principle of least privilege - use `read-only` for monitoring and `read-write` only when you need to trigger actions.

### Scope Examples

### read-only: Airflow pre-flight check

Check if data is fresh before running pipelines. No ability to modify anything.
```python
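from anomalyarmor import Client
client = Client()  # reads ARMOR_API_KEY; a read-only key is enough here
# Just reads freshness status
client.freshness.require_fresh("warehouse.orders")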
```

### read-write: Trigger refresh after dbt run

Check freshness and trigger a refresh when needed.
```python
from anomalyarmor import Client
client = Client()  # ARMOR_API_KEY must hold a read-write key
# Can trigger refresh operations
client.freshness.refresh("warehouse.orders", wait=True)
```

### admin: Key rotation automation

Create and revoke keys programmatically for security compliance.
```python
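from anomalyarmor import Client
client = Client()  # ARMOR_API_KEY must hold an admin key
old_key_id = "<key-id>"  # placeholder: ID of the key being retired
# Can manage API keys
new_key = client.api_keys.create(name="rotated-key", scope="read-only")
client.api_keys.revoke(old_key_id)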
```

## Using API Keys

### Environment Variable (Recommended)

```bash
export ARMOR_API_KEY="aa_live_your_key_here"
```

Then in your code:

```python
from anomalyarmor import Client

# Automatically uses ARMOR_API_KEY
client = Client()
```

### Direct Parameter

```python
from anomalyarmor import Client

client = Client(api_key="aa_live_your_key_here")
```

### HTTP Header

For direct API calls:

```bash
curl -H "Authorization: Bearer aa_live_your_key_here" \
  https://api.anomalyarmor.ai/api/v1/assets
```

## Rate Limits by Tier

Rate limits and the maximum number of keys depend on your subscription tier:

| Tier | Max Keys | Rate Limit | Burst |
|------|----------|------------|-------|
| Free Trial | 1 | 20/min | 2/sec |
| Starter | 1 | 30/min | 3/sec |
| Growth | 10 | 500/min | 25/sec |
| Professional | 25 | 1,000/min | 50/sec |
| Enterprise | Unlimited | 5,000/min | 100/sec |

> ℹ️ When you upgrade your plan, existing API keys automatically get the new rate limits.

## Revoking Keys

Revoke compromised or unused keys immediately:

### Via Dashboard

1. Go to **Settings > API Keys**
2. Find the key and click the trash icon
3. Confirm revocation

### Via CLI

```bash
armor api-keys revoke <key-id>
```

### Via API

```bash
curl -X DELETE "https://api.anomalyarmor.ai/api/v1/api-keys/<key-id>" \
  -H "Authorization: Bearer aa_live_admin_key"
```

## Security Best Practices
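
One common pattern is to load the key from the environment, fail fast when it is missing, and never write the full secret to logs. The sketch below assumes the `ARMOR_API_KEY` variable and `aa_live_` prefix described on this page; `load_api_key` and `mask_key` are hypothetical helpers, and showing the last four characters is a convention chosen for this example.

```python
import os
from typing import Mapping


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Read the API key from the environment and fail fast if it looks wrong."""
    key = env.get("ARMOR_API_KEY", "").strip()
    if not key:
        raise RuntimeError("ARMOR_API_KEY is not set")
    if not key.startswith("aa_live_"):
        raise RuntimeError("ARMOR_API_KEY does not look like an AnomalyArmor key")
    return key


def mask_key(key: str) -> str:
    """Keep the 8-character prefix and the last four characters for log output."""
    return f"{key[:8]}...{key[-4:]}"


print(mask_key("aa_live_k8jd92hf8j2hd98fh2d9h2f98h2d9fh2"))
```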

## Troubleshooting

### 401 Unauthorized

- Check the key is not revoked
- Verify the `Authorization: Bearer` header format
- Ensure no extra whitespace in the key

### 403 Forbidden

- The key is valid but lacks permission for this operation
- Check the scope - you may need `read-write` or `admin`

### 429 Rate Limited

- You've exceeded your rate limit
- Check `Retry-After` header for when to retry
- Consider upgrading your plan for higher limits

## Common Questions

### How do I create an API key?

Generate keys under **Settings > API Keys** in the dashboard, via `armor api-keys create --name ... --scope ...`, or by POSTing to `/api/v1/api-keys` with an admin key. Name keys after the integration that uses them ("airflow-prod", "ci-pipeline") so rotation and revocation stay auditable. The plaintext key is shown only once at creation.

### Which scope should my API key have?

Use `read-only` for monitoring, dashboards, and CI checks that only query state. Use `read-write` when you need to trigger freshness or schema refreshes. Reserve `admin` for key management automation itself. Follow least privilege: most integrations only need `read-only`.

### What does the aa_live_ prefix on a key mean?

All AnomalyArmor API keys start with `aa_live_` followed by a random secret, making them easy to detect with secret scanners and grep over source trees. Treat the full string as sensitive. If you see one committed to git or pasted in logs, revoke it immediately from **Settings > API Keys**.
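
A simple scan for leaked keys might look like the sketch below. The pattern assumes the random part is at least 16 alphanumeric characters, which matches the example key on this page but is not a documented guarantee; adjust it if your keys differ. `find_keys` is a hypothetical helper.

```python
import re
from typing import List

# Assumption: the secret part is 16 or more ASCII letters and digits.
KEY_PATTERN = re.compile(r"\baa_live_[A-Za-z0-9]{16,}")


def find_keys(text: str) -> List[str]:
    """Return every substring of text that looks like an AnomalyArmor API key."""
    return KEY_PATTERN.findall(text)


sample = 'ARMOR_API_KEY="aa_live_k8jd92hf8j2hd98fh2d9h2f98h2d9fh2"'
print(find_keys(sample))
print(find_keys("Authorization: Bearer aa_live_your_key_here"))  # placeholder text
```

Requiring a minimum length keeps documentation placeholders such as `aa_live_your_key_here` from being reported.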

### How do I rotate or revoke an API key?

Create the replacement key first, roll it out to the consumer, then revoke the old key via the dashboard, `armor api-keys revoke <key-id>`, or `DELETE /api/v1/api-keys/{key-id}` with an admin token. Revocation takes effect immediately and cannot be undone, so keep the new key live before cutting the old one.

### Where should I store API keys in code?

Read the key from the `ARMOR_API_KEY` environment variable (Python SDK / CLI) or `ANOMALYARMOR_API_KEY` (TypeScript SDK CLI shim), and inject it from a secrets manager in production. Never hardcode keys in source, Docker images, or Jupyter notebooks. Use separate keys per environment so a dev leak can't touch prod.

### Why am I getting 403 Forbidden when my key works elsewhere?

403 means the key is valid but doesn't have the scope the endpoint requires. Write endpoints (freshness refresh, schema refresh, create-metric) need `read-write`; key management needs `admin`. The error body shows `current_scope` and `required_scope` so you can pick the right key.

---

<!-- Source: https://docs.anomalyarmor.ai/api-reference/list-assets -->

# List Assets

> Retrieve all data assets (tables, views) you have access to

Get a list of all discovered assets across your data sources. Use this to build integrations, sync with data catalogs, or audit your inventory.

## When to Use

- **Data catalog sync**: Export assets to external tools
- **Automation**: Build workflows that operate on your asset inventory
- **Auditing**: Generate reports of all monitored tables

## SDK & CLI Examples

## Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `source` | string | Filter by data source name |
| `asset_type` | string | Filter by type: `table`, `view` |
| `schema` | string | Filter by schema name |
| `limit` | integer | Max results to return (default: 100, max: 1000) |
| `offset` | integer | Skip first N results for pagination |
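
To show how these parameters combine, the sketch below builds a request URL from optional filters. It assumes the endpoint lives at `/assets` under the documented base URL and that filters are passed as query-string parameters; `assets_url` is a hypothetical helper.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.anomalyarmor.ai/api/v1"


def assets_url(source: Optional[str] = None, asset_type: Optional[str] = None,
               schema: Optional[str] = None, limit: int = 100, offset: int = 0) -> str:
    """Build the list-assets URL, leaving out filters that were not supplied."""
    params = {
        "source": source,
        "asset_type": asset_type,
        "schema": schema,
        "limit": limit,
        "offset": offset,
    }
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}/assets?{query}"


print(assets_url(source="snowflake.prod", asset_type="table", limit=50))
```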

## Response

```json
{
  "assets": [
    {
      "id": "ast_abc123",
      "qualified_name": "snowflake.prod.public.orders",
      "asset_type": "table",
      "source": "snowflake.prod",
      "schema": "public",
      "name": "orders",
      "columns": 12,
      "freshness_status": "fresh",
      "last_updated": "2024-01-15T08:30:00Z",
      "tags": ["production", "pii:email"]
    }
  ],
  "total": 150,
  "limit": 100,
  "offset": 0
}
```

## Response Fields

| Field | Description |
|-------|-------------|
| `id` | Unique asset identifier |
| `qualified_name` | Full path: `source.schema.table` |
| `asset_type` | `table` or `view` |
| `source` | Data source this asset belongs to |
| `schema` | Database schema name |
| `name` | Table/view name |
| `columns` | Number of columns |
| `freshness_status` | `fresh`, `stale`, or `unknown` |
| `last_updated` | Last data update timestamp |
| `tags` | Applied classification and custom tags |

## Error Responses

| Status | Meaning |
|--------|---------|
| `401` | Invalid or missing API key |
| `403` | API key doesn't have read permissions |
| `429` | Rate limit exceeded |

---

<!-- Source: https://docs.anomalyarmor.ai/api-reference/get-asset -->

# Get Asset

> Retrieve details for a specific data asset

Get complete details for a single asset including columns, relationships, freshness status, and tags.

## When to Use

- **Asset details**: Get full metadata for a specific table
- **Column inspection**: List all columns with types and descriptions
- **Freshness check**: Verify data recency for a single asset
- **Integration sync**: Pull asset metadata into external tools

## SDK & CLI Examples

## Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `asset_id` | string | Asset ID (UUID) or qualified name |
| `include_columns` | boolean | Include column details (default: true) |

## Response

```json
{
  "id": "ast_abc123",
  "qualified_name": "snowflake.prod.public.orders",
  "asset_type": "table",
  "source": "snowflake.prod",
  "schema": "public",
  "name": "orders",
  "description": "Customer order records including order details and fulfillment status",
  "freshness": {
    "status": "fresh",
    "last_updated": "2024-01-15T08:30:00Z",
    "sla_threshold": "2h",
    "checked_at": "2024-01-15T09:00:00Z"
  },
  "columns": [
    {
      "name": "order_id",
      "data_type": "integer",
      "nullable": false,
      "description": "Unique identifier for each order",
      "tags": []
    },
    {
      "name": "user_email",
      "data_type": "varchar(255)",
      "nullable": true,
      "description": "Customer email address",
      "tags": ["pii:email"]
    }
  ],
  "tags": ["production", "critical"],
  "created_at": "2024-01-01T00:00:00Z",
  "updated_at": "2024-01-15T08:30:00Z"
}
```

## Response Fields

| Field | Description |
|-------|-------------|
| `id` | Unique asset identifier |
| `qualified_name` | Full path: `source.schema.table` |
| `description` | Human-readable description (generated or manual) |
| `freshness.status` | `fresh`, `stale`, or `unknown` |
| `freshness.last_updated` | Timestamp of most recent data |
| `freshness.sla_threshold` | Configured freshness SLA |
| `columns` | Array of column definitions |
| `columns[].tags` | Classification tags on this column |
| `tags` | Asset-level tags |
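
As a rough illustration of column inspection, the sketch below scans a Get Asset response for columns whose tags start with a given prefix, for example `pii:`. It works on an already-decoded JSON payload; the helper name `columns_with_tag_prefix` is hypothetical and not part of the SDK.

```python
from typing import Any


def columns_with_tag_prefix(asset: dict[str, Any], prefix: str) -> list[str]:
    """Return names of columns that carry at least one tag starting with `prefix`."""
    matches = []
    for column in asset.get("columns", []):
        if any(tag.startswith(prefix) for tag in column.get("tags", [])):
            matches.append(column["name"])
    return matches


# Trimmed copy of the example response shown earlier on this page.
asset = {
    "qualified_name": "snowflake.prod.public.orders",
    "columns": [
        {"name": "order_id", "data_type": "integer", "tags": []},
        {"name": "user_email", "data_type": "varchar(255)", "tags": ["pii:email"]},
    ],
}

print(columns_with_tag_prefix(asset, "pii:"))
```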

## Error Responses

| Status | Meaning |
|--------|---------|
| `401` | Invalid or missing API key |
| `404` | Asset not found |
| `429` | Rate limit exceeded |

---

<!-- Source: https://docs.anomalyarmor.ai/api-reference/get-freshness -->

# Get Freshness Status

> Retrieve freshness status for an asset or data source

Check whether data is fresh or stale. Use this to gate pipelines, build monitoring dashboards, or verify SLA compliance.

## When to Use

- **Pipeline gating**: Fail jobs if upstream data is stale
- **Dashboards**: Show freshness status in monitoring tools
- **SLA tracking**: Verify data meets freshness requirements
- **Alerting integration**: Build custom alert workflows

## Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `asset_id` | string | Asset ID or qualified name |

## Response

```json
{
  "asset": "snowflake.prod.public.orders",
  "status": "fresh",
  "last_updated": "2024-01-15T08:30:00Z",
  "sla_threshold": "2h",
  "sla_status": "within_sla",
  "checked_at": "2024-01-15T09:00:00Z",
  "timestamp_column": "created_at",
  "data_age_minutes": 30
}
```

## Response Fields

| Field | Description |
|-------|-------------|
| `status` | `fresh`, `stale`, or `unknown` |
| `last_updated` | Timestamp of most recent data in the table |
| `sla_threshold` | Configured freshness SLA (e.g., "2h", "24h") |
| `sla_status` | `within_sla`, `warning`, or `violated` |
| `checked_at` | When this freshness check ran |
| `timestamp_column` | Column used to determine freshness |
| `data_age_minutes` | How old the data is in minutes |

## Status Values

| Status | Meaning |
|--------|---------|
| `fresh` | Data updated within SLA threshold |
| `stale` | Data older than SLA threshold |
| `unknown` | No timestamp column configured or no data |
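
For pipeline gating, one possible approach is to combine `status` with a local comparison of `data_age_minutes` against `sla_threshold`. The sketch below assumes the threshold uses the hour-based form shown in the examples (`"2h"`, `"24h"`) and rejects anything else; both helper names are hypothetical.

```python
def parse_sla_minutes(threshold: str) -> int:
    """Convert an hour-based SLA string such as "2h" into minutes."""
    if not (threshold.endswith("h") and threshold[:-1].isdigit()):
        # Only the hour form appears in the docs; other units are not assumed.
        raise ValueError(f"unsupported SLA format: {threshold!r}")
    return int(threshold[:-1]) * 60


def is_safe_to_run(freshness: dict) -> bool:
    """Proceed only when the asset is fresh and its data age is within the SLA."""
    if freshness["status"] != "fresh":
        return False
    return freshness["data_age_minutes"] <= parse_sla_minutes(freshness["sla_threshold"])


# Values taken from the example response above.
sample = {"status": "fresh", "sla_threshold": "2h", "data_age_minutes": 30}
print(is_safe_to_run(sample))
```

Treating `unknown` as not safe is a conservative choice; a pipeline that tolerates unmonitored tables could relax it.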

## Error Responses

| Status | Meaning |
|--------|---------|
| `401` | Invalid or missing API key |
| `404` | Asset not found |
| `422` | No freshness configuration for this asset |
| `429` | Rate limit exceeded |

---

<!-- Source: https://docs.anomalyarmor.ai/api-reference/get-schema-changes -->

# Get Schema Changes

> Retrieve recent schema changes detected for an asset

Get the history of schema changes for an asset. Use this to track drift, understand what changed, and build change management workflows.

## When to Use

- **Change tracking**: Monitor schema drift over time
- **Impact analysis**: Understand what changed before troubleshooting
- **Compliance**: Audit schema modifications
- **Integration**: Sync changes to external systems

## Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `asset_id` | string | Asset ID or qualified name |
| `days_back` | integer | Look back this many days (default: 30) |
| `change_type` | string | Filter by type (see Change Types below) |
| `limit` | integer | Max results (default: 100) |

## Response

```json
{
  "asset": "snowflake.prod.public.orders",
  "changes": [
    {
      "id": "chg_abc123",
      "change_type": "column_removed",
      "object_name": "status",
      "object_type": "column",
      "old_value": {
        "name": "status",
        "data_type": "varchar(20)",
        "nullable": true
      },
      "new_value": null,
      "detected_at": "2024-01-15T08:30:00Z",
      "discovery_id": "disc_xyz789"
    },
    {
      "id": "chg_def456",
      "change_type": "column_added",
      "object_name": "fulfillment_status",
      "object_type": "column",
      "old_value": null,
      "new_value": {
        "name": "fulfillment_status",
        "data_type": "varchar(50)",
        "nullable": true
      },
      "detected_at": "2024-01-15T08:30:00Z",
      "discovery_id": "disc_xyz789"
    }
  ],
  "total": 2
}
```

## Response Fields

| Field | Description |
|-------|-------------|
| `id` | Unique change identifier |
| `change_type` | Type of change (see Change Types) |
| `object_name` | Name of changed column or table |
| `object_type` | `column` or `table` |
| `old_value` | Previous state (null for additions) |
| `new_value` | New state (null for removals) |
| `detected_at` | When the change was discovered |
| `discovery_id` | Discovery run that found this change |

## Change Types

| Type | Description | Impact |
|------|-------------|--------|
| `column_added` | New column appeared | Low - usually safe |
| `column_removed` | Column no longer exists | High - breaks queries |
| `column_type_changed` | Data type modified | Medium - may affect logic |
| `column_renamed` | Detected as remove + add | High - breaks queries |
| `table_added` | New table discovered | Low - informational |
| `table_removed` | Table no longer exists | High - breaks queries |
| `constraint_changed` | PK, FK, or unique modified | Medium - may affect joins |
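
A minimal sketch of how a client might apply the impact levels from the table above to a decoded response. The mapping is copied from the table; the function name `high_impact_changes` is hypothetical.

```python
# Impact levels as listed in the Change Types table.
IMPACT_BY_CHANGE_TYPE = {
    "column_added": "low",
    "column_removed": "high",
    "column_type_changed": "medium",
    "column_renamed": "high",
    "table_added": "low",
    "table_removed": "high",
    "constraint_changed": "medium",
}


def high_impact_changes(response: dict) -> list[dict]:
    """Return only the changes whose type is rated high impact."""
    return [
        change
        for change in response.get("changes", [])
        if IMPACT_BY_CHANGE_TYPE.get(change["change_type"]) == "high"
    ]


# Trimmed copy of the example response above.
response = {
    "asset": "snowflake.prod.public.orders",
    "changes": [
        {"id": "chg_abc123", "change_type": "column_removed", "object_name": "status"},
        {"id": "chg_def456", "change_type": "column_added", "object_name": "fulfillment_status"},
    ],
}

for change in high_impact_changes(response):
    print(change["change_type"], change["object_name"])
```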

## Error Responses

| Status | Meaning |
|--------|---------|
| `401` | Invalid or missing API key |
| `404` | Asset not found |
| `429` | Rate limit exceeded |

---

<!-- Source: https://docs.anomalyarmor.ai/api-reference/list-alerts -->

# List Alerts

> List triggered alerts with filtering by status, type, and date range

Retrieve alerts that have been triggered. Use this to build dashboards, integrate with ticketing systems, or create custom notification workflows.

## When to Use

- **Dashboard integration**: Show recent alerts in monitoring tools
- **Ticketing sync**: Create tickets for triggered alerts
- **Reporting**: Generate alert volume and response metrics
- **Custom workflows**: Build automated response systems

## Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `status` | string | Filter by status: `triggered`, `acknowledged`, `resolved` |
| `event_type` | string | Filter by type: `schema_change`, `freshness_violation`, `discovery_failed`, `asset_removed` |
| `asset` | string | Filter by asset qualified name |
| `since` | datetime | Only alerts after this timestamp |
| `until` | datetime | Only alerts before this timestamp |
| `limit` | integer | Max results (default: 100, max: 1000) |
| `offset` | integer | Skip first N results |

## Response

```json
{
  "alerts": [
    {
      "id": "alt_abc123",
      "rule_name": "Production Schema Changes",
      "event_type": "schema_change",
      "status": "triggered",
      "message": "Column 'status' removed from orders table",
      "asset": "snowflake.prod.public.orders",
      "details": {
        "change_type": "column_removed",
        "column_name": "status",
        "column_type": "varchar(20)"
      },
      "triggered_at": "2024-01-15T08:30:00Z",
      "acknowledged_at": null,
      "resolved_at": null,
      "destinations": ["slack", "pagerduty"]
    }
  ],
  "total": 45,
  "limit": 100,
  "offset": 0
}
```

## Response Fields

| Field | Description |
|-------|-------------|
| `id` | Unique alert identifier |
| `rule_name` | Name of the rule that triggered |
| `event_type` | Type of event that triggered the alert |
| `status` | `triggered`, `acknowledged`, or `resolved` |
| `message` | Human-readable alert description |
| `asset` | Affected asset qualified name |
| `details` | Event-specific details (varies by type) |
| `triggered_at` | When the alert fired |
| `acknowledged_at` | When someone acknowledged it (if applicable) |
| `resolved_at` | When the alert was resolved (if applicable) |
| `destinations` | Where the alert was sent |

## Status Values

| Status | Meaning |
|--------|---------|
| `triggered` | Alert fired, not yet acknowledged |
| `acknowledged` | Someone is looking at it |
| `resolved` | Issue has been addressed |
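
For reporting on response metrics, the timestamps in each alert can be turned into a mean time to acknowledge. This is a sketch over already-fetched alert dictionaries; the helper names are hypothetical, and alerts without an `acknowledged_at` value are skipped.

```python
from datetime import datetime, timedelta
from typing import Optional


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing Z into an aware datetime."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def mean_time_to_acknowledge(alerts: list[dict]) -> Optional[timedelta]:
    """Average gap between triggered_at and acknowledged_at, or None if nothing was acknowledged."""
    deltas = [
        parse_ts(alert["acknowledged_at"]) - parse_ts(alert["triggered_at"])
        for alert in alerts
        if alert.get("acknowledged_at")
    ]
    if not deltas:
        return None
    return sum(deltas, timedelta()) / len(deltas)


# Illustrative data, not real alert history.
alerts = [
    {"triggered_at": "2024-01-15T08:30:00Z", "acknowledged_at": "2024-01-15T08:40:00Z"},
    {"triggered_at": "2024-01-15T09:00:00Z", "acknowledged_at": "2024-01-15T09:30:00Z"},
    {"triggered_at": "2024-01-15T10:00:00Z", "acknowledged_at": None},
]

print(mean_time_to_acknowledge(alerts))
```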

## Event Types

| Type | Description |
|------|-------------|
| `schema_change` | Table or column structure changed |
| `freshness_violation` | Data older than SLA threshold |
| `discovery_failed` | Discovery job couldn't complete |
| `asset_removed` | Table or view no longer exists |

## Error Responses

| Status | Meaning |
|--------|---------|
| `400` | Invalid filter parameters |
| `401` | Invalid or missing API key |
| `429` | Rate limit exceeded |

---

<!-- Source: https://docs.anomalyarmor.ai/api/assets -->

# Assets API

> List and retrieve data assets

Query your data assets (tables, views, models) discovered by AnomalyArmor.

## List Assets

```
GET /api/v1/assets
```

### Query Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `source` | string | Filter by data source name |
| `type` | string | Filter by asset type (`table`, `view`, `model`) |
| `limit` | integer | Max items to return (default: 50, max: 100) |
| `offset` | integer | Number of items to skip (default: 0) |

### Example Request
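
A sketch of how the request URL could be assembled from the query parameters above. The host `https://api.example.com` is a placeholder, not the real API host, and `list_assets_url` is a hypothetical helper; authentication is omitted because the header format is not covered on this page.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.example.com"  # placeholder host (assumption)


def list_assets_url(
    source: Optional[str] = None,
    asset_type: Optional[str] = None,
    limit: int = 50,
    offset: int = 0,
) -> str:
    """Build a GET /api/v1/assets URL using the documented query parameters."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    params = {"limit": limit, "offset": offset}
    if source is not None:
        params["source"] = source
    if asset_type is not None:
        params["type"] = asset_type
    return f"{BASE_URL}/api/v1/assets?{urlencode(params)}"


print(list_assets_url(source="snowflake", asset_type="table", limit=10))
```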

### Example Response

```json
{
  "data": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "qualified_name": "snowflake.prod.warehouse.orders",
      "name": "orders",
      "asset_type": "table",
      "source": "snowflake",
      "database": "prod",
      "schema": "warehouse",
      "description": "Customer order transactions",
      "row_count": 1500000,
      "column_count": 24,
      "created_at": "2024-01-15T10:30:00Z",
      "updated_at": "2024-12-04T08:00:00Z"
    }
  ],
  "pagination": {
    "total": 245,
    "limit": 10,
    "offset": 0,
    "has_more": true
  }
}
```

## Get Asset

```
GET /api/v1/assets/{id}
```

Retrieve a single asset by qualified name or UUID.

### Path Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `id` | string | Qualified name (e.g., `snowflake.prod.warehouse.orders`) or UUID |

### Example Request
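
A minimal sketch of building the path for a single asset. The host is a placeholder and `asset_url` is a hypothetical helper; percent-encoding the identifier is a defensive assumption for names with unusual characters, and dots in a qualified name pass through unchanged.

```python
from urllib.parse import quote

BASE_URL = "https://api.example.com"  # placeholder host (assumption)


def asset_url(identifier: str) -> str:
    """Build a GET /api/v1/assets/{id} URL from a qualified name or UUID."""
    return f"{BASE_URL}/api/v1/assets/{quote(identifier, safe='')}"


print(asset_url("snowflake.prod.warehouse.orders"))
print(asset_url("550e8400-e29b-41d4-a716-446655440000"))
```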

### Example Response

```json
{
  "data": {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "qualified_name": "snowflake.prod.warehouse.orders",
    "name": "orders",
    "asset_type": "table",
    "source": "snowflake",
    "database": "prod",
    "schema": "warehouse",
    "description": "Customer order transactions",
    "row_count": 1500000,
    "column_count": 24,
    "columns": [
      {
        "name": "order_id",
        "data_type": "VARCHAR",
        "is_nullable": false,
        "is_primary_key": true
      },
      {
        "name": "customer_id",
        "data_type": "VARCHAR",
        "is_nullable": false
      },
      {
        "name": "order_date",
        "data_type": "TIMESTAMP",
        "is_nullable": false
      }
    ],
    "tags": ["pii", "financial"],
    "created_at": "2024-01-15T10:30:00Z",
    "updated_at": "2024-12-04T08:00:00Z"
  }
}
```

## Asset Identification

AnomalyArmor supports two ways to identify assets:

### Qualified Name (Primary)

Human-readable, hierarchical identifier:

```
{source}.{database}.{schema}.{table}
```

Examples:
- `snowflake.prod.warehouse.orders`
- `databricks.main.analytics.daily_sales`
- `postgresql.app_db.public.users`

> 💡 Use qualified names in code for readability. They're stable as long as you don't rename the underlying table.

### UUID (Secondary)

System-generated unique identifier. Use for automation where names may change:

```
550e8400-e29b-41d4-a716-446655440000
```
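
Client code that accepts either identifier may want to tell them apart before logging or caching. The sketch below uses the standard `uuid` module and the four-part `{source}.{database}.{schema}.{table}` pattern described above; `identifier_kind` is a hypothetical helper, and the four-part check is an assumption that applies to tables and views only.

```python
import uuid


def identifier_kind(value: str) -> str:
    """Classify an asset identifier as 'uuid' or 'qualified_name'."""
    try:
        uuid.UUID(value)
        return "uuid"
    except ValueError:
        pass
    parts = value.split(".")
    # Expect source.database.schema.table with no empty segments.
    if len(parts) == 4 and all(parts):
        return "qualified_name"
    raise ValueError(f"not a UUID or four-part qualified name: {value!r}")


print(identifier_kind("550e8400-e29b-41d4-a716-446655440000"))
print(identifier_kind("snowflake.prod.warehouse.orders"))
```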

## Error Responses

| Status | Code | Description |
|--------|------|-------------|
| 404 | `ASSET_NOT_FOUND` | Asset doesn't exist or you don't have access |
| 400 | `VALIDATION_ERROR` | Invalid qualified name format |

```json
{
  "error": {
    "code": "ASSET_NOT_FOUND",
    "message": "Asset not found",
    "details": {
      "asset_id": "snowflake.prod.warehouse.orders",
      "suggestion": "Check the qualified name format: source.database.schema.table"
    }
  }
}
```
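
One way a client could surface this envelope is to convert it into an exception that keeps the `code` and `details`. The sketch below operates on a decoded JSON body; the `ApiError` class and `raise_for_error` function are hypothetical and are not part of the SDK.

```python
class ApiError(Exception):
    """Hypothetical exception carrying the fields of the error envelope."""

    def __init__(self, code: str, message: str, details: dict):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.details = details


def raise_for_error(payload: dict) -> None:
    """Raise ApiError if the decoded body contains an `error` object."""
    error = payload.get("error")
    if error is None:
        return
    raise ApiError(error["code"], error["message"], error.get("details", {}))


body = {
    "error": {
        "code": "ASSET_NOT_FOUND",
        "message": "Asset not found",
        "details": {
            "asset_id": "snowflake.prod.warehouse.orders",
            "suggestion": "Check the qualified name format: source.database.schema.table",
        },
    }
}

try:
    raise_for_error(body)
except ApiError as exc:
    print(exc.code, exc.details.get("suggestion"))
```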

## Common Questions

### Should I identify an asset by qualified name or UUID?

Use the qualified name (`source.database.schema.table`) in code and config that people read and edit by hand: it survives re-discovery and is easy to debug. Use the UUID for automation where a table might be renamed upstream, since UUIDs are stable across renames. Both identifiers work on every `/api/v1/assets/{id}` endpoint.

### How does AnomalyArmor discover assets?

Assets are discovered automatically when you connect a source (Snowflake, BigQuery, Postgres, etc.) via the dashboard. The Assets API is read-only; it returns whatever the platform has discovered so far. If an expected table is missing, trigger a re-scan from the source's page in the dashboard.

### Does listing assets return column-level details?

No. `GET /api/v1/assets` returns metadata and row/column counts. To get the column schema (names, types, nullability, primary keys), call `GET /api/v1/assets/{id}` for a single asset. That keeps list responses small and fast when you have hundreds of tables.

---

<!-- Source: https://docs.anomalyarmor.ai/api/freshness -->

# Freshness API

> Monitor and validate data freshness

The Freshness API lets you monitor and validate data freshness. Use it to check whether your data is up to date and to trigger on-demand freshness checks.

## Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/api/v1/sdk/freshness` | List freshness status for all assets |
| GET | `/api/v1/sdk/freshness/summary` | Get freshness summary statistics |
| GET | `/api/v1/sdk/freshness/{id}` | Get freshness status for an asset |
| POST | `/api/v1/sdk/freshness/{id}/refresh` | Trigger freshness check |

## Get Freshness Summary

```
GET /api/v1/sdk/freshness/summary
```

Returns aggregate freshness statistics across all monitored assets.

### Response

```json
{
  "data": {
    "total_assets": 245,
    "fresh": 230,
    "stale": 10,
    "unknown": 5,
    "fresh_percentage": 93.9,
    "by_source": {
      "snowflake": {"total": 150, "fresh": 145, "stale": 5},
      "bigquery": {"total": 95, "fresh": 85, "stale": 5}
    }
  }
}
```
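
The `by_source` entries in this example list only `total`, `fresh`, and `stale`, so the per-source `unknown` count has to be derived. The sketch below shows that arithmetic and recomputes `fresh_percentage` from the raw counts; both helper names are hypothetical.

```python
def unknown_by_source(summary: dict) -> dict[str, int]:
    """Derive unknown counts per source as total - fresh - stale."""
    return {
        name: stats["total"] - stats["fresh"] - stats["stale"]
        for name, stats in summary["by_source"].items()
    }


def fresh_percentage(summary: dict) -> float:
    """Share of fresh assets, rounded to one decimal place."""
    return round(100 * summary["fresh"] / summary["total_assets"], 1)


# The `data` object from the example response above.
summary = {
    "total_assets": 245,
    "fresh": 230,
    "stale": 10,
    "unknown": 5,
    "by_source": {
        "snowflake": {"total": 150, "fresh": 145, "stale": 5},
        "bigquery": {"total": 95, "fresh": 85, "stale": 5},
    },
}

print(unknown_by_source(summary))
print(fresh_percentage(summary))
```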

## List Freshness Status

```
GET /api/v1/sdk/freshness
```

### Query Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `status` | string | - | Filter: `fresh`, `stale`, or `unknown` |
| `source` | string | - | Filter by data source |
| `limit` | integer | 50 | Max results (max: 100) |
| `offset` | integer | 0 | Results to skip |

### Response

```json
{
  "data": [
    {
      "asset_id": "550e8400-e29b-41d4-a716-446655440000",
      "qualified_name": "snowflake.prod.warehouse.orders",
      "status": "fresh",
      "is_fresh": true,
      "last_updated": "2024-12-04T08:15:00Z",
      "threshold_hours": 24,
      "hours_since_update": 2.5,
      "checked_at": "2024-12-04T10:30:00Z"
    }
  ],
  "pagination": {
    "total": 100,
    "limit": 50,
    "offset": 0,
    "has_more": true
  }
}
```
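
To read every page, a client can keep advancing `offset` until `has_more` is false. The sketch below takes a `fetch_page(limit, offset)` callable so it stays independent of any HTTP library; that callable, `iter_items`, and the in-memory `fake_fetch` stand-in are all hypothetical.

```python
from typing import Callable, Iterator


def iter_items(fetch_page: Callable[[int, int], dict], limit: int = 50) -> Iterator[dict]:
    """Yield every item across pages, following pagination.has_more."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page["data"]
        if not page["pagination"]["has_more"]:
            break
        offset += limit


# In-memory stand-in for the real endpoint, mimicking the documented envelope.
ROWS = [{"asset_id": str(i), "status": "fresh"} for i in range(120)]


def fake_fetch(limit: int, offset: int) -> dict:
    chunk = ROWS[offset:offset + limit]
    return {
        "data": chunk,
        "pagination": {
            "total": len(ROWS),
            "limit": limit,
            "offset": offset,
            "has_more": offset + limit < len(ROWS),
        },
    }


print(sum(1 for _ in iter_items(fake_fetch, limit=50)))
```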

## Get Asset Freshness

```
GET /api/v1/sdk/freshness/{id}
```

Get detailed freshness status for a specific asset.

### Response

```json
{
  "data": {
    "asset_id": "550e8400-e29b-41d4-a716-446655440000",
    "qualified_name": "snowflake.prod.warehouse.orders",
    "status": "fresh",
    "is_fresh": true,
    "last_updated": "2024-12-04T08:15:00Z",
    "threshold_hours": 24,
    "hours_since_update": 2.5,
    "checked_at": "2024-12-04T10:30:00Z",
    "history": [
      {"checked_at": "2024-12-04T10:30:00Z", "status": "fresh"},
      {"checked_at": "2024-12-04T06:30:00Z", "status": "fresh"},
      {"checked_at": "2024-12-03T22:30:00Z", "status": "stale"}
    ]
  }
}
```

## Trigger Freshness Check

```
POST /api/v1/sdk/freshness/{id}/refresh
```

> ℹ️ Requires `read-write` or `admin` scope.

### Query Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `wait` | boolean | false | Wait for check to complete |

### Response (async)

```json
{
  "data": {
    "job_id": "job_abc123",
    "status": "pending",
    "asset_id": "snowflake.prod.warehouse.orders"
  }
}
```

### Response (sync with wait=true)

```json
{
  "data": {
    "job_id": "job_abc123",
    "status": "completed",
    "asset_id": "snowflake.prod.warehouse.orders",
    "result": {
      "is_fresh": true,
      "last_updated": "2024-12-04T10:35:00Z"
    }
  }
}
```
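
Because the two response shapes differ only in `status` and the presence of `result`, a caller can handle both with one function. A minimal sketch, assuming the decoded bodies shown above; `refresh_outcome` is a hypothetical helper that returns `None` while the job has not completed.

```python
from typing import Optional


def refresh_outcome(payload: dict) -> Optional[bool]:
    """Return is_fresh for a completed job, or None if the job is still pending."""
    job = payload["data"]
    if job["status"] != "completed":
        return None
    return job["result"]["is_fresh"]


pending = {"data": {"job_id": "job_abc123", "status": "pending",
                    "asset_id": "snowflake.prod.warehouse.orders"}}
completed = {"data": {"job_id": "job_abc123", "status": "completed",
                      "asset_id": "snowflake.prod.warehouse.orders",
                      "result": {"is_fresh": True,
                                 "last_updated": "2024-12-04T10:35:00Z"}}}

print(refresh_outcome(pending))
print(refresh_outcome(completed))
```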

## Gate Pattern: require_fresh()

The SDK provides a convenient gate pattern for pipelines:

```python
from anomalyarmor import Client
from anomalyarmor.exceptions import StalenessError

client = Client()

try:
    # Raises StalenessError if data is stale
    client.freshness.require_fresh("snowflake.prod.warehouse.orders")
    print("Data is fresh, proceeding...")
except StalenessError as e:
    print(f"Data is stale: last updated {e.last_updated}")
    raise  # Fail the pipeline
```

> 💡 Use `require_fresh()` in Airflow tasks to automatically fail pipelines when upstream data is stale. See the [Airflow Integration](/integrations/airflow) guide.

## CLI Check Command

The CLI provides a `check` command that exits with code 1 if data is stale:

```bash
# Exit 0 if fresh, exit 1 if stale
armor freshness check snowflake.prod.warehouse.orders
echo $?  # 0 = fresh, 1 = stale

# Use in shell scripts
if armor freshness check snowflake.prod.warehouse.orders; then
    echo "Data is fresh"
    dbt run
else
    echo "Data is stale, aborting"
    exit 1
fi
```

## Error Responses

### Asset Not Found (404)

```json
{
  "error": {
    "code": "ASSET_NOT_FOUND",
    "message": "Asset not found",
    "details": {"asset_id": "invalid.qualified.name"}
  }
}
```

### Forbidden (403)

When attempting to trigger refresh without proper scope:

```json
{
  "error": {
    "code": "FORBIDDEN",
    "message": "Insufficient permissions. Required scope: read-write",
    "details": {"current_scope": "read-only", "required_scope": "read-write"}
  }
}
```

## Common Questions

### What's the difference between the async and sync refresh calls?

Without `wait=true`, `POST /freshness/{id}/refresh` returns immediately with a `job_id` and `status=pending`. With `wait=true`, the request blocks until the check completes and returns the actual `is_fresh` result. Use async for fire-and-forget cron triggers and sync when a pipeline needs the answer before proceeding.

### How do I gate an Airflow or dbt pipeline on freshness?

In Python, call `client.freshness.require_fresh("...")`; it raises `StalenessError` when the data is stale, which fails the task cleanly. In shell, run `armor freshness check <asset>` and rely on exit code 1 for stale data. The [Airflow integration guide](/integrations/airflow) shows the full DAG pattern.

### Where does the freshness threshold come from?

Each asset has a `threshold_hours` configured from the dashboard (Freshness tab on the asset page) or inferred from historical update cadence. The API response shows the active `threshold_hours` next to `hours_since_update` so you can display both in your own UI.

### Why does freshness return status=unknown?

`unknown` means AnomalyArmor hasn't yet observed enough update history to decide, typically for newly connected assets or tables that haven't been written to since the source was connected. Trigger a refresh to get a current read, and give the asset a few update cycles for the baseline to fill in.

---

<!-- Source: https://docs.anomalyarmor.ai/api/schema -->

# Schema API

> Monitor schema changes and detect drift

The Schema API monitors schema changes and detects drift in your data assets. Use it to track column additions, removals, type changes, and trigger on-demand schema checks.

## Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/api/v1/sdk/schema/summary` | Get schema drift summary |
| GET | `/api/v1/sdk/schema/changes` | List recent schema changes |
| POST | `/api/v1/sdk/schema/{id}/refresh` | Trigger schema check |

## Get Schema Summary

```
GET /api/v1/sdk/schema/summary
```

Returns aggregate schema drift statistics.

### Response

```json
{
  "data": {
    "total_assets": 245,
    "assets_with_changes": 12,
    "changes_last_24h": 5,
    "changes_last_7d": 23,
    "by_change_type": {
      "column_added": 15,
      "column_removed": 3,
      "column_type_changed": 8,
      "column_renamed": 2
    }
  }
}
```

## List Schema Changes

```
GET /api/v1/sdk/schema/changes
```

### Query Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `asset_id` | string | - | Filter by asset qualified name |
| `change_type` | string | - | Filter by change type |
| `since` | datetime | - | Changes since timestamp |
| `limit` | integer | 50 | Max results |
| `offset` | integer | 0 | Results to skip |

### Response

```json
{
  "data": [
    {
      "id": "change_123",
      "asset_id": "550e8400-e29b-41d4-a716-446655440000",
      "qualified_name": "snowflake.prod.warehouse.orders",
      "change_type": "column_added",
      "column_name": "discount_code",
      "details": {
        "new_type": "VARCHAR(50)",
        "nullable": true
      },
      "detected_at": "2024-12-04T09:15:00Z",
      "previous_schema_hash": "abc123",
      "current_schema_hash": "def456"
    }
  ],
  "pagination": {
    "total": 23,
    "limit": 50,
    "offset": 0,
    "has_more": false
  }
}
```

## Trigger Schema Check

```
POST /api/v1/sdk/schema/{id}/refresh
```

> ℹ️ Requires `read-write` or `admin` scope.

### Query Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `wait` | boolean | false | Wait for check to complete |

### Response

```json
{
  "data": {
    "job_id": "job_xyz789",
    "status": "completed",
    "asset_id": "snowflake.prod.warehouse.orders",
    "result": {
      "changes_detected": true,
      "changes": [
        {
          "change_type": "column_added",
          "column_name": "discount_code",
          "details": {"new_type": "VARCHAR(50)"}
        }
      ]
    }
  }
}
```

## Change Types

| Type | Description |
|------|-------------|
| `column_added` | New column added to table |
| `column_removed` | Column removed from table |
| `column_type_changed` | Column data type changed |
| `column_renamed` | Column name changed |
| `table_created` | New table created |
| `table_dropped` | Table dropped |

## Use Case: Post-Deploy Schema Check

Trigger a schema check after deploying dbt models:

```python
from anomalyarmor import Client

client = Client()

def verify_schema_after_deploy(models: list[str]):
    """Run schema check after dbt deployment."""
    for model in models:
        print(f"Checking schema: {model}")
        result = client.schema.refresh(model, wait=True)

        if result.changes_detected:
            print("  Schema changes detected:")
            for change in result.changes:
                print(f"    - {change.change_type}: {change.column_name}")
        else:
            print("  No schema changes")

# After dbt run
verify_schema_after_deploy([
    "snowflake.prod.mart.orders_mart",
    "snowflake.prod.mart.customers_mart",
])
```

## Common Questions

### How does AnomalyArmor detect schema changes?

A baseline schema snapshot is taken the first time an asset is profiled. On every scan (scheduled or triggered via `POST /schema/{id}/refresh`), the current column list, types, and nullability are diffed against the baseline and any delta is written to the changes feed. The `previous_schema_hash` and `current_schema_hash` fields let you correlate changes across runs.

### Can I trigger a schema check on demand after a dbt deploy?

Yes. Call `POST /api/v1/sdk/schema/{id}/refresh?wait=true` (or `client.schema.refresh(asset, wait=True)`) in your post-deploy step. The synchronous response includes `changes_detected` and the change list, so you can fail the deploy pipeline if a breaking change lands unexpectedly.
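
One way to decide whether a change is unexpected is to compare the synchronous response against the changes the deploy intended to make. The sketch below works on a decoded response body; the `expected` allow-list of `(change_type, column_name)` pairs and the `unexpected_changes` helper are hypothetical.

```python
def unexpected_changes(payload: dict, expected: set[tuple[str, str]]) -> list[dict]:
    """Return detected changes that are not in the allow-list of expected ones."""
    result = payload["data"].get("result") or {}
    return [
        change
        for change in result.get("changes", [])
        if (change["change_type"], change["column_name"]) not in expected
    ]


# Trimmed copy of the Trigger Schema Check example response.
payload = {
    "data": {
        "job_id": "job_xyz789",
        "status": "completed",
        "result": {
            "changes_detected": True,
            "changes": [
                {"change_type": "column_added", "column_name": "discount_code"},
            ],
        },
    }
}

# The deploy intended to add discount_code, so nothing is flagged.
print(unexpected_changes(payload, {("column_added", "discount_code")}))
# With an empty allow-list, the same change is flagged.
print(unexpected_changes(payload, set()))
```

A deploy step could exit non-zero whenever the returned list is not empty.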

### Which change types are reported?

The feed reports `column_added`, `column_removed`, `column_type_changed`, `column_renamed`, `table_created`, and `table_dropped`. Renames are heuristic and are sometimes reported as paired add/remove when the column's data distribution has also shifted.

---

<!-- Source: https://docs.anomalyarmor.ai/api/lineage -->

# Lineage API

> Explore data asset relationships and dependencies

The Lineage API provides access to data asset relationships, showing upstream sources and downstream dependencies. Use it for impact analysis and understanding data flow.

## Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/api/v1/sdk/lineage` | List assets with lineage info |
| GET | `/api/v1/sdk/lineage/{id}` | Get lineage for specific asset |

## List Assets with Lineage

```
GET /api/v1/sdk/lineage
```

### Query Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `source` | string | - | Filter by data source |
| `has_upstream` | boolean | - | Only assets with upstream dependencies |
| `has_downstream` | boolean | - | Only assets with downstream dependencies |
| `limit` | integer | 50 | Max results |
| `offset` | integer | 0 | Results to skip |

### Response

```json
{
  "data": [
    {
      "asset_id": "550e8400-e29b-41d4-a716-446655440000",
      "qualified_name": "snowflake.prod.warehouse.orders",
      "upstream_count": 3,
      "downstream_count": 5
    }
  ],
  "pagination": {
    "total": 100,
    "limit": 50,
    "offset": 0,
    "has_more": true
  }
}
```

## Get Asset Lineage

```
GET /api/v1/sdk/lineage/{id}
```

### Query Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `direction` | string | `both` | `upstream`, `downstream`, or `both` |
| `depth` | integer | 1 | Levels to traverse (1-5) |

### Response

```json
{
  "data": {
    "asset_id": "550e8400-e29b-41d4-a716-446655440000",
    "qualified_name": "snowflake.prod.warehouse.orders",
    "upstream": [
      {
        "asset_id": "raw_orders_id",
        "qualified_name": "snowflake.raw.stripe.orders",
        "asset_type": "table",
        "relationship": "derives_from",
        "depth": 1,
        "upstream": [
          {
            "qualified_name": "stripe.api.orders",
            "asset_type": "api",
            "depth": 2
          }
        ]
      },
      {
        "asset_id": "raw_customers_id",
        "qualified_name": "snowflake.raw.crm.customers",
        "asset_type": "table",
        "relationship": "joins_with",
        "depth": 1
      }
    ],
    "downstream": [
      {
        "asset_id": "orders_mart_id",
        "qualified_name": "snowflake.prod.mart.orders_mart",
        "asset_type": "table",
        "relationship": "feeds_into",
        "depth": 1
      },
      {
        "asset_id": "dashboard_id",
        "qualified_name": "looker.sales_dashboard",
        "asset_type": "dashboard",
        "relationship": "feeds_into",
        "depth": 1
      }
    ]
  }
}
```
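
With `depth` greater than 1, deeper levels are nested inside each entry, as the `upstream` list within the first upstream entry shows. A short sketch of flattening that tree into a single sequence follows; `flatten_lineage` is a hypothetical helper that works on the decoded JSON.

```python
from typing import Iterator


def flatten_lineage(nodes: list[dict], key: str) -> Iterator[dict]:
    """Walk nested lineage entries depth-first; `key` is 'upstream' or 'downstream'."""
    for node in nodes:
        yield node
        yield from flatten_lineage(node.get(key, []), key)


# Trimmed copy of the `upstream` list from the example response.
upstream = [
    {
        "qualified_name": "snowflake.raw.stripe.orders",
        "depth": 1,
        "upstream": [{"qualified_name": "stripe.api.orders", "depth": 2}],
    },
    {"qualified_name": "snowflake.raw.crm.customers", "depth": 1},
]

for node in flatten_lineage(upstream, "upstream"):
    print(node["depth"], node["qualified_name"])
```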

## Relationship Types

| Type | Description |
|------|-------------|
| `derives_from` | Target is derived from source (transformation) |
| `joins_with` | Assets are joined together |
| `feeds_into` | Source feeds into target |
| `copies_to` | Direct copy relationship |
| `references` | Lookup/reference relationship |

## Use Case: Check Upstream Before Pipeline

Verify all upstream sources are fresh before running:

```python
from anomalyarmor import Client

client = Client()

def check_upstream_freshness(asset: str):
    """Check all upstream sources are fresh."""
    lineage = client.lineage.get(asset, direction="upstream")

    print(f"Checking {len(lineage.upstream)} upstream sources...")

    for upstream in lineage.upstream:
        client.freshness.require_fresh(upstream.qualified_name)
        print(f"  {upstream.qualified_name}")

    print("All upstream sources are fresh!")

check_upstream_freshness("snowflake.prod.mart.orders_summary")
```

## Use Case: Impact Analysis

Before making schema changes, check downstream impact:

```python
from anomalyarmor import Client

client = Client()

def check_impact(asset: str):
    """Check downstream impact before schema change."""
    lineage = client.lineage.get(asset, direction="downstream", depth=3)

    print(f"Downstream impact for {asset}:")
    print(f"Total downstream assets: {len(lineage.downstream)}")

    # Group by type
    by_type = {}
    for dep in lineage.downstream:
        by_type.setdefault(dep.asset_type, []).append(dep)

    for asset_type, assets in by_type.items():
        print(f"  {asset_type}: {len(assets)}")
        # Use a distinct loop name so the `asset` argument is not shadowed
        for dep in assets[:3]:
            print(f"    - {dep.qualified_name}")

check_impact("snowflake.prod.warehouse.orders")
```

Output:
```
Downstream impact for snowflake.prod.warehouse.orders:
Total downstream assets: 8
  table: 5
    - snowflake.prod.mart.orders_mart
    - snowflake.prod.mart.revenue_daily
    - snowflake.prod.mart.customer_orders
  dashboard: 3
    - looker.sales_dashboard
    - looker.executive_summary
    - tableau.orders_analysis
```

## Common Questions

### How deep can I traverse lineage in one request?

Pass `depth` from 1 to 5 on `GET /lineage/{id}`. Depth 1 returns direct neighbors, which is cheap and enough for most impact-analysis UIs. Higher depths fan out quickly, so avoid depth 5 in tight loops and paginate the results you consume.

### Where does lineage come from?

Lineage is derived from warehouse query history, dbt manifests, and view definitions for supported sources, then enriched with dashboard consumers (Looker, Tableau, Mode). You won't see relationships for sources that don't expose query history or for ad-hoc scripts run outside the warehouse.

### What do the relationship types mean?

`derives_from` marks transformation-based derivation (CREATE TABLE AS, dbt models). `joins_with` captures co-used inputs. `feeds_into` describes a non-derivation downstream consumer (e.g., a dashboard). `copies_to` is a direct copy, and `references` is a lookup/FK-style join.

---

<!-- Source: https://docs.anomalyarmor.ai/api/alerts -->

# Alerts API

> Query alert history and status

The Alerts API provides access to data quality alerts triggered by AnomalyArmor monitoring. Use it to query alert history, check active incidents, and integrate with your incident response workflows.

## Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/api/v1/sdk/alerts/summary` | Get alerts summary |
| GET | `/api/v1/sdk/alerts` | List alerts |
| GET | `/api/v1/sdk/alerts/rules` | List alert rules |

## Get Alerts Summary

```
GET /api/v1/sdk/alerts/summary
```

Returns aggregate alert statistics.

### Response

```json
{
  "data": {
    "total": 45,
    "by_status": {
      "triggered": 5,
      "acknowledged": 3,
      "resolved": 37
    },
    "by_type": {
      "freshness": 20,
      "schema_change": 15,
      "row_count": 10
    },
    "triggered_last_24h": 8,
    "triggered_last_7d": 25
  }
}
```

## List Alerts

```
GET /api/v1/sdk/alerts
```

### Query Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `status` | string | - | Filter: `triggered`, `acknowledged`, `resolved` |
| `alert_type` | string | - | Filter: `freshness`, `schema_change`, `row_count`, `null_percentage`, `duplicate_keys` |
| `asset_id` | string | - | Filter by asset qualified name |
| `since` | datetime | - | Alerts since timestamp |
| `limit` | integer | 50 | Max results |
| `offset` | integer | 0 | Results to skip |

### Response

```json
{
  "data": [
    {
      "id": "alert_123",
      "alert_type": "freshness",
      "status": "triggered",
      "asset_id": "550e8400-e29b-41d4-a716-446655440000",
      "qualified_name": "snowflake.prod.warehouse.orders",
      "message": "Data is 26 hours stale (threshold: 24 hours)",
      "triggered_at": "2024-12-04T08:00:00Z",
      "acknowledged_at": null,
      "resolved_at": null,
      "details": {
        "last_updated": "2024-12-03T06:00:00Z",
        "threshold_hours": 24,
        "hours_stale": 26
      },
      "rule_id": "rule_freshness_orders"
    }
  ],
  "pagination": {
    "total": 5,
    "limit": 50,
    "offset": 0,
    "has_more": false
  }
}
```

## List Alert Rules

```
GET /api/v1/sdk/alerts/rules
```

Returns configured alert rules for your organization.

### Response

```json
{
  "data": [
    {
      "id": "rule_freshness_orders",
      "name": "Orders Freshness",
      "alert_type": "freshness",
      "asset_id": "snowflake.prod.warehouse.orders",
      "enabled": true,
      "config": {
        "threshold_hours": 24
      },
      "destinations": ["slack", "email"],
      "created_at": "2024-01-15T10:00:00Z"
    }
  ],
  "pagination": {
    "total": 15,
    "limit": 50,
    "offset": 0,
    "has_more": false
  }
}
```

## Alert Types

| Type | Description |
|------|-------------|
| `freshness` | Data is stale beyond threshold |
| `schema_change` | Schema change detected |
| `row_count` | Row count anomaly detected |
| `null_percentage` | Null percentage exceeded threshold |
| `duplicate_keys` | Duplicate primary keys detected |

## Alert Statuses

| Status | Description |
|--------|-------------|
| `triggered` | Alert is active and needs attention |
| `acknowledged` | Alert has been seen but not resolved |
| `resolved` | Alert has been resolved |

## Use Case: Incident Response

Build an incident response script:

```python
from collections import defaultdict

from anomalyarmor import Client

client = Client()

def get_active_incidents():
    """Get all active alerts for incident response."""
    alerts = client.alerts.list(status="triggered")

    # Group by alert_type. defaultdict accepts any type the API returns
    # (freshness, schema_change, row_count, ...) without a KeyError.
    by_type = defaultdict(list)
    for alert in alerts:
        by_type[alert.alert_type].append(alert)

    print("=== Active Incidents ===")
    for alert_type in sorted(by_type):
        print(f"\n{alert_type.upper()} ({len(by_type[alert_type])}):")
        for alert in by_type[alert_type]:
            print(f"  - {alert.qualified_name}: {alert.message}")

    return alerts

incidents = get_active_incidents()
```

Output:
```
=== Active Incidents ===

FRESHNESS (2):
  - snowflake.prod.warehouse.orders: Data is 26 hours stale
  - bigquery.analytics.events: Data is 4 hours stale

SCHEMA_CHANGE (1):
  - snowflake.prod.warehouse.payments: Column removed: deprecated_field
```

## Use Case: Slack Bot Integration

Post alerts to Slack:

```python
from anomalyarmor import Client
import requests

client = Client()
SLACK_WEBHOOK = "https://hooks.slack.com/services/..."

def post_alerts_to_slack():
    """Post triggered alerts to Slack."""
    alerts = client.alerts.list(status="triggered")

    if not alerts:
        return

    blocks = [{"type": "header", "text": {"type": "plain_text", "text": f"{len(alerts)} Active Alerts"}}]

    for alert in alerts[:5]:  # Limit to 5
        blocks.append({
            "type": "section",
            "text": {
                "type": "mrkdwn",
                "text": f"*{alert.qualified_name}*\n{alert.message}"
            }
        })

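    # Time out instead of hanging, and fail loudly if Slack rejects the payload
    response = requests.post(SLACK_WEBHOOK, json={"blocks": blocks}, timeout=10)
    response.raise_for_status()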

# Run periodically
post_alerts_to_slack()
```

## Common Questions

### How do I filter alerts by asset or type?

Combine `asset_id`, `alert_type`, and `status` query parameters on `GET /api/v1/sdk/alerts`. For example, `?status=triggered&alert_type=freshness` returns only active freshness alerts. The `since` parameter accepts an ISO-8601 timestamp to bound history queries.
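
As a rough illustration, the query string can be assembled with the standard library. `alerts_url` is a hypothetical helper, not part of the SDK; only the path and parameter names come from this page.

```python
from urllib.parse import urlencode

BASE_PATH = "/api/v1/sdk/alerts"


def alerts_url(**filters):
    """Build the list-alerts path, dropping filters left as None."""
    params = {k: v for k, v in filters.items() if v is not None}
    return f"{BASE_PATH}?{urlencode(params)}" if params else BASE_PATH


url = alerts_url(
    status="triggered",
    alert_type="freshness",
    asset_id=None,  # omitted from the query string
    since="2024-12-01T00:00:00Z",
)
print(url)
```

`urlencode` percent-encodes the colons in the ISO-8601 timestamp, so the value survives proxies and logging layers unchanged.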

### Can I acknowledge or resolve alerts through the API?

The read endpoints on this page cover history and summaries. For state transitions (acknowledge / resolve), use the per-alert endpoints documented in the [TypeScript SDK reference](/sdk/javascript-reference) (`client.alerts.acknowledge`, `client.alerts.resolve`), which post to `/api/v1/alerts/{alert_id}/acknowledge` and `.../resolve`.

### What alert types does AnomalyArmor emit?

The API returns `freshness`, `schema_change`, `row_count`, `null_percentage`, and `duplicate_keys`. Each carries a `details` object with the relevant context (e.g. `hours_stale`, `threshold_hours` for freshness). Use `alert_type` to route to different Slack channels or PagerDuty services.
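
One way to route on `alert_type` is a plain lookup table with a fallback. This is a sketch only: the channel names are placeholders, and `route_for` is a hypothetical helper operating on an alert parsed from the JSON response.

```python
# Hypothetical routing table: channel names are placeholders.
ROUTES = {
    "freshness": "#data-freshness",
    "schema_change": "#data-schema",
    "row_count": "#data-volume",
    "null_percentage": "#data-quality",
    "duplicate_keys": "#data-quality",
}
DEFAULT_ROUTE = "#data-alerts"


def route_for(alert: dict) -> str:
    """Pick a destination channel; unknown types fall back to the default."""
    return ROUTES.get(alert.get("alert_type"), DEFAULT_ROUTE)


alert = {
    "id": "alert_123",
    "alert_type": "freshness",
    "details": {"hours_stale": 26, "threshold_hours": 24},
}
print(route_for(alert))
```

The fallback matters: if a new alert type is introduced later, it still lands somewhere visible instead of raising a `KeyError`.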

---

<!-- Source: https://docs.anomalyarmor.ai/api/metrics -->

# Metrics API

> Monitor and track data quality metrics like row counts, null percentages, and more

The Metrics API enables programmatic management of data quality metrics. Use it to track trends in your data over time, detect anomalies, and integrate quality monitoring into your pipelines.

## Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/api/v1/sdk/metrics/{asset_id}/summary` | Get metrics summary for an asset |
| GET | `/api/v1/sdk/metrics/{asset_id}` | List metrics for an asset |
| GET | `/api/v1/sdk/metrics/{asset_id}/{metric_id}` | Get metric details |
| POST | `/api/v1/sdk/metrics/{asset_id}` | Create a new metric |
| PATCH | `/api/v1/sdk/metrics/{asset_id}/{metric_id}` | Update a metric |
| DELETE | `/api/v1/sdk/metrics/{asset_id}/{metric_id}` | Delete a metric |
| POST | `/api/v1/sdk/metrics/{asset_id}/{metric_id}/capture` | Trigger metric capture |
| GET | `/api/v1/sdk/metrics/{asset_id}/{metric_id}/snapshots` | List metric snapshots |

## Metric Types

| Type | Description | Requires Column |
|------|-------------|-----------------|
| `row_count` | Total row count of the table | No |
| `null_percent` | Percentage of null values | Yes |
| `distinct_count` | Count of distinct values | Yes |
| `duplicate_count` | Count of duplicate values | Yes |
| `min_value` | Minimum numeric value | Yes |
| `max_value` | Maximum numeric value | Yes |
| `mean` | Average numeric value | Yes |
| `percentile` | Percentile value (requires `percentile_value`) | Yes |

## Get Metrics Summary

```
GET /api/v1/sdk/metrics/{asset_id}/summary
```

Returns aggregate metrics statistics for an asset.

### Response

```json
{
  "data": {
    "total_metrics": 15,
    "active_metrics": 12,
    "total_checks": 8,
    "passing": 6,
    "failing": 1,
    "warning": 1,
    "error": 0,
    "health_percentage": 87.5
  }
}
```

## List Metrics

```
GET /api/v1/sdk/metrics/{asset_id}
```

### Query Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `metric_type` | string | - | Filter by type (e.g., `row_count`, `null_percent`) |
| `is_active` | boolean | - | Filter by active status |
| `limit` | integer | 50 | Max results (max: 100) |
| `offset` | integer | 0 | Results to skip |

### Response

```json
{
  "data": {
    "items": [
      {
        "id": "m_550e8400-e29b-41d4-a716-446655440001",
        "internal_id": 123,
        "asset_id": 456,
        "table_path": "snowflake.prod.warehouse.orders",
        "column_name": "customer_email",
        "metric_type": "null_percent",
        "capture_interval": "daily",
        "sensitivity": 3,
        "is_active": true,
        "created_at": "2026-01-01T10:00:00Z"
      }
    ]
  },
  "pagination": {
    "total": 15,
    "limit": 50,
    "offset": 0,
    "has_more": false
  }
}
```

## Get Metric Details

```
GET /api/v1/sdk/metrics/{asset_id}/{metric_id}
```

### Query Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `include_snapshots` | boolean | true | Include recent snapshots |
| `snapshot_limit` | integer | 30 | Max snapshots to include |

### Response

```json
{
  "data": {
    "id": "m_550e8400-e29b-41d4-a716-446655440001",
    "internal_id": 123,
    "asset_id": 456,
    "table_path": "snowflake.prod.warehouse.orders",
    "column_name": "customer_email",
    "metric_type": "null_percent",
    "capture_interval": "daily",
    "sensitivity": 3,
    "is_active": true,
    "created_at": "2026-01-01T10:00:00Z"
  }
}
```

## Create Metric

```
POST /api/v1/sdk/metrics/{asset_id}
```

> ℹ️ Requires `read-write` or `admin` scope.

### Request Body

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `metric_type` | string | Yes | Metric type (see table above) |
| `table_path` | string | Yes | Full table path (catalog.schema.table) |
| `column_name` | string | For column metrics | Column name |
| `capture_interval` | string | No | `hourly`, `daily`, `weekly` (default: `daily`) |
| `sensitivity` | float | No | Anomaly detection sensitivity (default: 1.0) |
| `group_by_columns` | array | No | Columns to group by |
| `percentile_value` | float | No | Percentile value (for `percentile` type) |
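
The field rules in this table can be checked client-side before sending the request. The sketch below assumes nothing about the SDK; `build_metric_payload` is a hypothetical helper that mirrors the Metric Types table and the `VALIDATION_ERROR` behaviour described on this page.

```python
COLUMN_METRICS = {
    "null_percent", "distinct_count", "duplicate_count",
    "min_value", "max_value", "mean", "percentile",
}
TABLE_METRICS = {"row_count"}


def build_metric_payload(metric_type, table_path, column_name=None,
                         percentile_value=None, capture_interval="daily",
                         sensitivity=1.0):
    """Return a request body dict, raising ValueError on missing fields."""
    if metric_type not in COLUMN_METRICS | TABLE_METRICS:
        raise ValueError(f"unknown metric_type: {metric_type}")
    if metric_type in COLUMN_METRICS and not column_name:
        raise ValueError(f"column_name is required for {metric_type} metrics")
    if metric_type == "percentile" and percentile_value is None:
        raise ValueError("percentile_value is required for percentile metrics")

    payload = {
        "metric_type": metric_type,
        "table_path": table_path,
        "capture_interval": capture_interval,
        "sensitivity": sensitivity,
    }
    # Table-level metrics do not use column_name, so it is left out for them.
    if metric_type in COLUMN_METRICS:
        payload["column_name"] = column_name
    if metric_type == "percentile":
        payload["percentile_value"] = percentile_value
    return payload


payload = build_metric_payload(
    "null_percent",
    "snowflake.prod.warehouse.orders",
    column_name="customer_email",
)
print(payload)
```

Validating locally gives a faster failure than waiting for a 400 response, but the server remains the source of truth.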

### Response

```json
{
  "data": {
    "id": "m_550e8400-e29b-41d4-a716-446655440001",
    "internal_id": 123,
    "asset_id": 456,
    "table_path": "snowflake.prod.warehouse.orders",
    "column_name": "customer_email",
    "metric_type": "null_percent",
    "capture_interval": "daily",
    "sensitivity": 1.0,
    "is_active": true,
    "created_at": "2026-01-04T10:30:00Z"
  }
}
```

## Update Metric

```
PATCH /api/v1/sdk/metrics/{asset_id}/{metric_id}
```

> ℹ️ Requires `read-write` or `admin` scope.

### Request Body

| Field | Type | Description |
|-------|------|-------------|
| `is_active` | boolean | Whether metric is active |
| `capture_interval` | string | Capture interval |
| `sensitivity` | float | Anomaly detection sensitivity |

## Delete Metric

```
DELETE /api/v1/sdk/metrics/{asset_id}/{metric_id}
```

> ℹ️ Requires `read-write` or `admin` scope.

### Response

```json
{
  "data": {
    "success": true,
    "message": "Metric deleted"
  }
}
```

## Trigger Metric Capture

```
POST /api/v1/sdk/metrics/{asset_id}/{metric_id}/capture
```

> ℹ️ Requires `read-write` or `admin` scope.

Triggers an immediate capture of the metric value.

### Response

```json
{
  "data": {
    "snapshot_count": 1,
    "snapshots": [
      {
        "id": 789,
        "value": 2.5,
        "captured_at": "2026-01-04T10:35:00Z",
        "is_anomaly": false,
        "status": "PASS"
      }
    ]
  }
}
```

## List Metric Snapshots

```
GET /api/v1/sdk/metrics/{asset_id}/{metric_id}/snapshots
```

### Query Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `limit` | integer | 100 | Max results |
| `offset` | integer | 0 | Results to skip |

### Response

```json
{
  "data": {
    "items": [
      {
        "id": 789,
        "metric_definition_id": 123,
        "value": 2.5,
        "captured_at": "2026-01-04T10:35:00Z",
        "is_anomaly": false,
        "z_score": 0.3,
        "status": "PASS"
      },
      {
        "id": 788,
        "metric_definition_id": 123,
        "value": 15.2,
        "captured_at": "2026-01-03T10:35:00Z",
        "is_anomaly": true,
        "z_score": 4.2,
        "status": "FAIL"
      }
    ]
  },
  "pagination": {
    "total": 90,
    "limit": 30,
    "offset": 0,
    "has_more": true
  }
}
```
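
Working from the raw JSON rather than the SDK, anomalous snapshots can be picked out by their `is_anomaly` flag. The snippet below parses a trimmed copy of the sample response above and assumes only the field names shown there.

```python
import json

# Trimmed copy of the sample response above.
body = json.loads("""
{
  "data": {
    "items": [
      {"id": 789, "value": 2.5, "is_anomaly": false, "z_score": 0.3, "status": "PASS"},
      {"id": 788, "value": 15.2, "is_anomaly": true, "z_score": 4.2, "status": "FAIL"}
    ]
  },
  "pagination": {"total": 90, "limit": 30, "offset": 0, "has_more": true}
}
""")

# Note: snapshot lists are nested under data.items.
anomalies = [s for s in body["data"]["items"] if s["is_anomaly"]]
for snap in anomalies:
    print(f"snapshot {snap['id']}: value={snap['value']} z={snap['z_score']}")
```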

## Use Case: Monitor Row Count Trends

Track daily row counts to detect unexpected data volume changes:

```python
from anomalyarmor import Client

client = Client()
asset_id = "550e8400-e29b-41d4-a716-446655440000"

# Create row count metric
metric = client.metrics.create(
    asset_id,
    metric_type="row_count",
    table_path="snowflake.prod.warehouse.orders",
    capture_interval="daily",
    sensitivity=2.0,  # Alert on 2+ standard deviations
)

# Trigger initial capture
result = client.metrics.capture(asset_id, metric.id)
print(f"Initial row count: {result['snapshots'][0]['value']}")

# Later: check for anomalies
snapshots = client.metrics.snapshots(asset_id, metric.id, limit=7)
anomalies = [s for s in snapshots if s.is_anomaly]
if anomalies:
    print(f"Found {len(anomalies)} anomalies in the last 7 captures")
```

## Error Responses

### Metric Not Found (404)

```json
{
  "error": {
    "code": "METRIC_NOT_FOUND",
    "message": "Metric not found",
    "details": {"metric_id": "m_invalid-uuid"}
  }
}
```

### Validation Error (400)

```json
{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "column_name is required for null_percent metrics",
    "details": {"field": "column_name", "metric_type": "null_percent"}
  }
}
```

### Forbidden (403)

```json
{
  "error": {
    "code": "FORBIDDEN",
    "message": "Insufficient permissions. Required scope: read-write",
    "details": {"current_scope": "read-only", "required_scope": "read-write"}
  }
}
```
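
All three errors share the same `error` envelope (`code`, `message`, `details`), so one handler can cover them. A minimal sketch, where `describe_error` is a hypothetical helper that takes the parsed JSON body of a non-2xx response:

```python
def describe_error(body: dict) -> str:
    """Turn an error envelope into a one-line, human-readable message."""
    err = body.get("error") or {}
    code = err.get("code", "UNKNOWN")
    message = err.get("message", "")
    details = err.get("details") or {}

    if code == "FORBIDDEN":
        hint = (
            f"key has scope {details.get('current_scope')!r}, "
            f"needs {details.get('required_scope')!r}"
        )
    elif code == "VALIDATION_ERROR":
        hint = f"check field {details.get('field')!r}"
    elif code == "METRIC_NOT_FOUND":
        hint = f"unknown metric {details.get('metric_id')!r}"
    else:
        hint = "no extra detail"
    return f"{code}: {message} ({hint})"


forbidden = {
    "error": {
        "code": "FORBIDDEN",
        "message": "Insufficient permissions. Required scope: read-write",
        "details": {"current_scope": "read-only", "required_scope": "read-write"},
    }
}
print(describe_error(forbidden))
```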

## Common Questions

### What's the difference between a metric and a validity rule?

Metrics track numeric values over time (row counts, null percentages, mean, percentile) and alert on statistical anomalies via the `sensitivity` parameter. Validity rules ([Validity API](/api/validity)) enforce deterministic pass/fail constraints like NOT NULL, REGEX, or RANGE. Use metrics to catch drift and validity rules to catch explicit contract violations.

### How does the sensitivity parameter affect anomaly detection?

Sensitivity is the z-score threshold for flagging a snapshot as anomalous, defaulting to 1.0. Raise it (e.g. `2.0` or `3.0`) to reduce false positives on noisy data; lower it to catch subtler shifts. Each captured snapshot returns `z_score` and `is_anomaly` so you can tune in production.
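
To make the threshold concrete, here is a worked z-score calculation. It is an illustration under stated assumptions: this page does not document the baseline window or the standard-deviation estimator AnomalyArmor uses, so the sketch takes a fixed history and the population standard deviation.

```python
from statistics import mean, pstdev


def z_score(history, value):
    """Standard deviations between `value` and the mean of `history`."""
    mu = mean(history)
    sigma = pstdev(history)  # assumption: population standard deviation
    if sigma == 0:
        return 0.0  # flat history: no spread to measure against
    return (value - mu) / sigma


def is_anomaly(history, value, sensitivity=1.0):
    """Flag values whose |z| exceeds the sensitivity threshold."""
    return abs(z_score(history, value)) > sensitivity


# History with mean 5.0 and population standard deviation 2.0.
history = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

for sensitivity in (1.0, 2.0):
    flagged = is_anomaly(history, 8.0, sensitivity)
    print(f"value=8.0 sensitivity={sensitivity} anomaly={flagged}")
```

A value of 8.0 sits 1.5 standard deviations above the mean, so it is flagged at the default sensitivity of 1.0 but not at 2.0.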

### Can I capture a metric on demand outside its scheduled interval?

Yes. `POST /api/v1/sdk/metrics/{asset_id}/{metric_id}/capture` (or `client.metrics.capture(...)`) triggers an immediate capture and returns the new snapshot with anomaly status. This is useful for backfilling after creating a metric and for post-deploy validation without waiting for the next scheduled run.

### Which metric types require a column_name?

Column-level types (`null_percent`, `distinct_count`, `duplicate_count`, `min_value`, `max_value`, `mean`, `percentile`) require `column_name`. `row_count` operates on the whole table and ignores `column_name`. `percentile` additionally needs `percentile_value` in the request body.

---

<!-- Source: https://docs.anomalyarmor.ai/api/validity -->

# Validity API

> Define and enforce data validity rules like NOT NULL, UNIQUE, and custom patterns

The Validity API enables programmatic management of data validity rules. Use it to enforce data quality constraints, detect invalid records, and integrate validation into your pipelines.

## Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/api/v1/sdk/validity/{asset_id}/summary` | Get validity summary for an asset |
| GET | `/api/v1/sdk/validity/{asset_id}` | List validity rules for an asset |
| GET | `/api/v1/sdk/validity/{asset_id}/{rule_id}` | Get validity rule details |
| POST | `/api/v1/sdk/validity/{asset_id}` | Create a new validity rule |
| PATCH | `/api/v1/sdk/validity/{asset_id}/{rule_id}` | Update a validity rule |
| DELETE | `/api/v1/sdk/validity/{asset_id}/{rule_id}` | Delete a validity rule |
| POST | `/api/v1/sdk/validity/{asset_id}/{rule_id}/check` | Trigger validity check |
| GET | `/api/v1/sdk/validity/{asset_id}/{rule_id}/results` | List check results |

## Rule Types

| Type | Description | Configuration |
|------|-------------|---------------|
| `NOT_NULL` | Column must not contain null values | None |
| `UNIQUE` | Column values must be unique | None |
| `REGEX` | Values must match a regex pattern | `rule_config.pattern` |
| `RANGE` | Numeric values must be within range | `rule_config.min`, `rule_config.max` |
| `ENUM` | Values must be in allowed set | `rule_config.allowed_values` |
| `DATE_FORMAT` | Values must match date format | `rule_config.format` |
| `CUSTOM_SQL` | Custom SQL expression | `rule_config.sql_expression` |

## Get Validity Summary

```
GET /api/v1/sdk/validity/{asset_id}/summary
```

Returns aggregate validity statistics for an asset.

### Response

```json
{
  "data": {
    "total_rules": 12,
    "passing": 10,
    "failing": 1,
    "error": 1
  }
}
```

## List Validity Rules

```
GET /api/v1/sdk/validity/{asset_id}
```

### Query Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `rule_type` | string | - | Filter by type (e.g., `NOT_NULL`, `REGEX`) |
| `is_active` | boolean | - | Filter by active status |
| `limit` | integer | 50 | Max results (max: 100) |
| `offset` | integer | 0 | Results to skip |

### Response

```json
{
  "data": {
    "items": [
      {
        "id": 123,
        "uuid": "v_550e8400-e29b-41d4-a716-446655440001",
        "table_path": "snowflake.prod.warehouse.orders",
        "column_name": "customer_email",
        "rule_type": "NOT_NULL",
        "name": "Customer Email Required",
        "severity": "critical",
        "is_active": true,
        "check_interval": "daily",
        "created_at": "2024-12-01T10:00:00Z"
      }
    ]
  },
  "pagination": {
    "total": 12,
    "limit": 50,
    "offset": 0,
    "has_more": false
  }
}
```

## Get Validity Rule Details

```
GET /api/v1/sdk/validity/{asset_id}/{rule_id}
```

### Response

```json
{
  "data": {
    "id": 123,
    "uuid": "v_550e8400-e29b-41d4-a716-446655440001",
    "table_path": "snowflake.prod.warehouse.orders",
    "column_name": "customer_email",
    "rule_type": "NOT_NULL",
    "rule_config": null,
    "name": "Customer Email Required",
    "description": "Email address must not be null for valid orders",
    "severity": "critical",
    "is_active": true,
    "alert_threshold_percent": 1.0,
    "treat_null_as_valid": false,
    "check_interval": "daily"
  }
}
```

## Create Validity Rule

```
POST /api/v1/sdk/validity/{asset_id}
```

> ℹ️ Requires `read-write` or `admin` scope.

### Request Body

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `rule_type` | string | Yes | Rule type (see table above) |
| `table_path` | string | Yes | Full table path (catalog.schema.table) |
| `column_name` | string | For column rules | Column name |
| `rule_config` | object | For some types | Rule-specific configuration |
| `name` | string | No | Human-readable rule name |
| `description` | string | No | Rule description |
| `severity` | string | No | `info`, `warning`, `critical` (default: `warning`) |
| `error_message` | string | No | Custom error message |
| `alert_threshold_percent` | float | No | Alert when invalid % exceeds this |
| `treat_null_as_valid` | boolean | No | Whether nulls pass (default: `false`) |
| `check_interval` | string | No | `hourly`, `daily`, `weekly` (default: `daily`) |

### Examples
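
The request bodies below are illustrative sketches for three rule types. The `rule_config` keys come from the Rule Types table; the column names and values (`total_amount`, `status`, the bounds and allowed values) are hypothetical. `DATE_FORMAT` and `CUSTOM_SQL` follow the same shape with `rule_config.format` and `rule_config.sql_expression`.

```python
import json

TABLE = "snowflake.prod.warehouse.orders"

example_rules = [
    {
        "rule_type": "REGEX",
        "table_path": TABLE,
        "column_name": "customer_email",
        "rule_config": {"pattern": r"^[\w.-]+@[\w.-]+\.\w{2,}$"},
    },
    {
        "rule_type": "RANGE",
        "table_path": TABLE,
        "column_name": "total_amount",  # hypothetical column
        "rule_config": {"min": 0, "max": 100000},
    },
    {
        "rule_type": "ENUM",
        "table_path": TABLE,
        "column_name": "status",  # hypothetical column
        "rule_config": {"allowed_values": ["pending", "shipped", "delivered"]},
    },
]

# Each dict serializes to a JSON body for POST /api/v1/sdk/validity/{asset_id}.
for rule in example_rules:
    print(json.dumps(rule))
```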

### Response

```json
{
  "data": {
    "id": 124,
    "uuid": "v_550e8400-e29b-41d4-a716-446655440002",
    "table_path": "snowflake.prod.warehouse.orders",
    "column_name": "customer_email",
    "rule_type": "NOT_NULL",
    "name": "Customer Email Required",
    "severity": "critical",
    "is_active": true,
    "check_interval": "daily",
    "created_at": "2024-12-04T10:30:00Z"
  }
}
```

## Update Validity Rule

```
PATCH /api/v1/sdk/validity/{asset_id}/{rule_id}
```

> ℹ️ Requires `read-write` or `admin` scope.

### Request Body

| Field | Type | Description |
|-------|------|-------------|
| `is_active` | boolean | Whether rule is active |
| `name` | string | Rule name |
| `description` | string | Rule description |
| `severity` | string | Severity level |
| `alert_threshold_percent` | float | Alert threshold |
| `treat_null_as_valid` | boolean | Null handling |
| `check_interval` | string | Check interval |

## Delete Validity Rule

```
DELETE /api/v1/sdk/validity/{asset_id}/{rule_id}
```

> ℹ️ Requires `read-write` or `admin` scope.

## Trigger Validity Check

```
POST /api/v1/sdk/validity/{asset_id}/{rule_id}/check
```

> ℹ️ Requires `read-write` or `admin` scope.

### Request Body

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `sample_limit` | integer | 10 | Max invalid samples to collect |

### Response

```json
{
  "data": {
    "id": 456,
    "validity_rule_id": 123,
    "status": "fail",
    "total_rows": 10000,
    "invalid_count": 25,
    "invalid_percent": 0.25,
    "invalid_samples": {
      "samples": [
        {"row_id": 1001, "value": null},
        {"row_id": 1042, "value": null}
      ]
    },
    "execution_duration_ms": 1250,
    "checked_at": "2024-12-04T10:35:00Z"
  }
}
```

## List Check Results

```
GET /api/v1/sdk/validity/{asset_id}/{rule_id}/results
```

### Query Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `limit` | integer | 100 | Max results |
| `offset` | integer | 0 | Results to skip |

### Response

```json
{
  "data": {
    "items": [
      {
        "id": 456,
        "validity_rule_id": 123,
        "status": "fail",
        "total_rows": 10000,
        "invalid_count": 25,
        "invalid_percent": 0.25,
        "execution_duration_ms": 1250,
        "checked_at": "2024-12-04T10:35:00Z"
      },
      {
        "id": 455,
        "validity_rule_id": 123,
        "status": "pass",
        "total_rows": 9975,
        "invalid_count": 0,
        "invalid_percent": 0.0,
        "execution_duration_ms": 1100,
        "checked_at": "2024-12-03T10:35:00Z"
      }
    ]
  },
  "pagination": {
    "total": 60,
    "limit": 30,
    "offset": 0,
    "has_more": true
  }
}
```

## Use Case: Validate Email Format

Ensure all customer emails match a valid format:

```python
from anomalyarmor import Client

client = Client()
asset_id = "550e8400-e29b-41d4-a716-446655440000"

# Create email validation rule
rule = client.validity.create(
    asset_id,
    rule_type="REGEX",
    table_path="snowflake.prod.warehouse.customers",
    column_name="email",
    rule_config={"pattern": r"^[\w.-]+@[\w.-]+\.\w{2,}$"},
    name="Valid Email Format",
    description="Validates email addresses match standard format",
    severity="warning",
    alert_threshold_percent=1.0,  # Alert if > 1% invalid
)

# Run initial check
result = client.validity.check(asset_id, rule.uuid, sample_limit=20)

if result.status == "fail":
    print(f"Warning: {result.invalid_count} invalid emails found")
    print(f"Invalid rate: {result.invalid_percent:.2f}%")
    for sample in (result.invalid_samples or {}).get("samples", [])[:5]:
        print(f"  - {sample}")
else:
    print("All emails are valid!")
```

## Error Responses

### Rule Not Found (404)

```json
{
  "error": {
    "code": "RULE_NOT_FOUND",
    "message": "Validity rule not found",
    "details": {"rule_id": "v_invalid-uuid"}
  }
}
```

### Invalid Rule Configuration (400)

```json
{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Invalid regex pattern in rule_config",
    "details": {"field": "rule_config.pattern", "error": "Invalid regex syntax"}
  }
}
```

### Forbidden (403)

```json
{
  "error": {
    "code": "FORBIDDEN",
    "message": "Insufficient permissions. Required scope: read-write",
    "details": {"current_scope": "read-only", "required_scope": "read-write"}
  }
}
```

## Common Questions

### Which validity rule type should I pick?

Use `NOT_NULL` for required columns, `UNIQUE` for primary-key-like invariants, `REGEX` for string formats (emails, IDs), `RANGE` for numeric bounds, `ENUM` for finite allowed sets, `DATE_FORMAT` for date strings, and `CUSTOM_SQL` when no built-in type fits. Each type's required `rule_config` is listed in the Rule Types table at the top of this page.

### How do I inspect rows that failed a validity rule?

Every `check` response includes `invalid_samples.samples`, up to `sample_limit` rows (default 10, configurable per check). Bump `sample_limit` up to 100 when debugging a broken ingest. The response also returns `invalid_count` and `invalid_percent` so you can report a failure rate even when individual samples aren't needed.

### What does the alert_threshold_percent field control?

`alert_threshold_percent` is the invalid-row percentage that flips a check result from pass to fail. Set it to `0` if any single invalid row should page you. Use higher values (e.g. `1.0`) on rules where a small amount of invalid data is tolerated and you only want to catch systemic regressions.
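
The arithmetic can be sketched as follows. The strict "greater than" comparison is an assumption based on the wording "exceeds" in the request body table; the helper names are hypothetical.

```python
def invalid_percent(invalid_count: int, total_rows: int) -> float:
    """Invalid rows as a percentage of all rows checked."""
    if total_rows == 0:
        return 0.0
    return 100.0 * invalid_count / total_rows


def exceeds_threshold(invalid_count, total_rows, alert_threshold_percent):
    """True when the invalid percentage is strictly above the threshold."""
    return invalid_percent(invalid_count, total_rows) > alert_threshold_percent


# Numbers from the sample check result: 25 invalid rows out of 10,000.
print(invalid_percent(25, 10000))
print(exceeds_threshold(25, 10000, 1.0))  # tolerant rule
print(exceeds_threshold(25, 10000, 0))    # zero-tolerance rule
```

With 25 invalid rows in 10,000, the rate is 0.25%: below a 1.0 threshold, but above a threshold of 0.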

### How is treat_null_as_valid different from using NOT_NULL?

`treat_null_as_valid` governs how a non-NULL-type rule (REGEX, RANGE, ENUM, etc.) handles NULLs. When `true`, NULL rows are skipped; when `false`, NULLs count as invalid. Combine with a separate `NOT_NULL` rule when you need to enforce both non-null and format at once - they surface as two distinct checks you can alert on independently.
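
The effect of the flag can be simulated locally. This is a sketch of the semantics described above, not the service's implementation; `count_invalid` is a hypothetical helper that applies a regex to a list of Python values, with `None` standing in for NULL.

```python
import re


def count_invalid(values, pattern, treat_null_as_valid=False):
    """Count values failing the regex, with configurable NULL handling."""
    regex = re.compile(pattern)
    invalid = 0
    for value in values:
        if value is None:
            # NULLs are skipped when treated as valid, counted otherwise.
            if not treat_null_as_valid:
                invalid += 1
            continue
        if not regex.fullmatch(value):
            invalid += 1
    return invalid


emails = ["a@example.com", None, "not-an-email", "b@example.org"]
pattern = r"[\w.-]+@[\w.-]+\.\w{2,}"

print(count_invalid(emails, pattern, treat_null_as_valid=False))
print(count_invalid(emails, pattern, treat_null_as_valid=True))
```

With the flag off, both the NULL and the malformed address count as invalid; with it on, only the malformed address does.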

---

<!-- Source: https://docs.anomalyarmor.ai/api/referential-checks -->

# Referential Checks API

> Verify foreign key relationships and detect orphaned records

The Referential Checks API enables monitoring of foreign key relationships between tables. Use it to detect orphaned records, validate referential integrity, and integrate relationship checks into your data quality workflows.

## Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/api/v1/sdk/referential/{asset_id}/summary` | Get referential summary for an asset |
| GET | `/api/v1/sdk/referential/{asset_id}` | List referential checks for an asset |
| GET | `/api/v1/sdk/referential/{asset_id}/{check_id}` | Get referential check details |
| POST | `/api/v1/sdk/referential/{asset_id}` | Create a new referential check |
| PATCH | `/api/v1/sdk/referential/{asset_id}/{check_id}` | Update a referential check |
| DELETE | `/api/v1/sdk/referential/{asset_id}/{check_id}` | Delete a referential check |
| POST | `/api/v1/sdk/referential/{asset_id}/{check_id}/execute` | Execute referential check |
| GET | `/api/v1/sdk/referential/{asset_id}/{check_id}/results` | List check results |

## Concepts

A referential integrity check validates that values in a **child column** (foreign key) exist in a **parent column** (primary key).

- **Child table**: The table containing the foreign key (e.g., `orders.customer_id`)
- **Parent table**: The table being referenced (e.g., `customers.id`)
- **Orphan**: A record in the child table with no matching parent record
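
In plain Python, orphan detection amounts to a membership test against the set of parent keys. The sketch below is conceptual (the service runs the check in your warehouse), and it assumes NULL foreign keys are not counted as orphans, which this page does not specify.

```python
def find_orphans(child_values, parent_values):
    """Return child foreign-key values with no matching parent key."""
    parents = set(parent_values)
    # Assumption: NULL (None) foreign keys are skipped, not treated as orphans.
    return [v for v in child_values if v is not None and v not in parents]


customer_ids = [1, 2, 3]                          # customers.id
order_customer_ids = [1, 2, 99999, 2, None, 99998]  # orders.customer_id

orphans = find_orphans(order_customer_ids, customer_ids)
print(orphans)
```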

## Get Referential Summary

```
GET /api/v1/sdk/referential/{asset_id}/summary
```

Returns aggregate referential integrity statistics for an asset.

### Response

```json
{
  "data": {
    "total_checks": 8,
    "active_checks": 7,
    "passing_checks": 6,
    "failing_checks": 1,
    "last_check_at": "2024-12-04T10:30:00Z"
  }
}
```

## List Referential Checks

```
GET /api/v1/sdk/referential/{asset_id}
```

### Query Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `is_active` | boolean | - | Filter by active status |
| `limit` | integer | 50 | Max results (max: 100) |
| `offset` | integer | 0 | Results to skip |

### Response

```json
{
  "data": {
    "items": [
      {
        "id": "r_550e8400-e29b-41d4-a716-446655440001",
        "internal_id": 123,
        "asset_id": 456,
        "child_table_path": "snowflake.prod.warehouse.orders",
        "child_column_name": "customer_id",
        "parent_table_path": "snowflake.prod.warehouse.customers",
        "parent_column_name": "id",
        "name": "Orders -> Customers FK",
        "capture_interval": "daily",
        "is_active": true,
        "created_at": "2024-12-01T10:00:00Z"
      }
    ]
  },
  "pagination": {
    "total": 8,
    "limit": 50,
    "offset": 0,
    "has_more": false
  }
}
```

## Get Referential Check Details

```
GET /api/v1/sdk/referential/{asset_id}/{check_id}
```

### Response

```json
{
  "data": {
    "id": "r_550e8400-e29b-41d4-a716-446655440001",
    "internal_id": 123,
    "asset_id": 456,
    "child_table_path": "snowflake.prod.warehouse.orders",
    "child_column_name": "customer_id",
    "parent_table_path": "snowflake.prod.warehouse.customers",
    "parent_column_name": "id",
    "name": "Orders -> Customers FK",
    "description": "Validates orders reference existing customers",
    "capture_interval": "daily",
    "max_orphan_count": 0,
    "max_orphan_percent": 0.1,
    "min_child_count": null,
    "max_child_count": null,
    "is_active": true
  }
}
```

## Create Referential Check

```
POST /api/v1/sdk/referential/{asset_id}
```

> ℹ️ Requires `read-write` or `admin` scope.

### Request Body

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `child_table_path` | string | Yes | Full path to child table (with FK column) |
| `child_column_name` | string | Yes | Column name in child table (FK) |
| `parent_table_path` | string | Yes | Full path to parent table (with PK column) |
| `parent_column_name` | string | Yes | Column name in parent table (PK) |
| `name` | string | No | Human-readable check name |
| `description` | string | No | Check description |
| `capture_interval` | string | No | `hourly`, `daily`, `weekly` (default: `daily`) |
| `max_orphan_count` | integer | No | Alert if orphan count exceeds this |
| `max_orphan_percent` | float | No | Alert if orphan % exceeds this |
| `min_child_count` | integer | No | Min expected children per parent |
| `max_child_count` | integer | No | Max expected children per parent |
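
To illustrate what `min_child_count` and `max_child_count` constrain, the sketch below counts children per parent in plain Python. `cardinality_violations` is a hypothetical helper; how the service reports violations (see `parents_below_min` and `parents_above_max` in check results) may differ.

```python
from collections import Counter


def cardinality_violations(parent_ids, child_fk_values,
                           min_child_count=None, max_child_count=None):
    """Return (parents below min, parents above max) by child count."""
    counts = Counter(v for v in child_fk_values if v is not None)
    below = [
        p for p in parent_ids
        if min_child_count is not None and counts[p] < min_child_count
    ]
    above = [
        p for p in parent_ids
        if max_child_count is not None and counts[p] > max_child_count
    ]
    return below, above


order_ids = [10, 11, 12]                 # parent: orders.id
item_order_ids = [10, 10, 11, 11, 11, 11]  # child: order_items.order_id

below, above = cardinality_violations(
    order_ids, item_order_ids, min_child_count=1, max_child_count=3
)
print(below, above)
```

Order 12 has no line items (below the minimum of 1) and order 11 has four (above the maximum of 3).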

### Response

```json
{
  "data": {
    "id": "r_550e8400-e29b-41d4-a716-446655440002",
    "internal_id": 124,
    "asset_id": 456,
    "child_table_path": "snowflake.prod.warehouse.orders",
    "child_column_name": "customer_id",
    "parent_table_path": "snowflake.prod.warehouse.customers",
    "parent_column_name": "id",
    "name": "Orders -> Customers FK",
    "capture_interval": "daily",
    "max_orphan_count": 0,
    "is_active": true,
    "created_at": "2024-12-04T10:30:00Z"
  }
}
```

## Update Referential Check

```
PATCH /api/v1/sdk/referential/{asset_id}/{check_id}
```

> ℹ️ Requires `read-write` or `admin` scope.

### Request Body

| Field | Type | Description |
|-------|------|-------------|
| `is_active` | boolean | Whether check is active |
| `name` | string | Check name |
| `description` | string | Check description |
| `capture_interval` | string | Capture interval |
| `max_orphan_count` | integer | Max orphan count threshold |
| `max_orphan_percent` | float | Max orphan percent threshold |
| `min_child_count` | integer | Min child count cardinality |
| `max_child_count` | integer | Max child count cardinality |

## Delete Referential Check

```
DELETE /api/v1/sdk/referential/{asset_id}/{check_id}
```

> ℹ️ Requires `read-write` or `admin` scope.

## Execute Referential Check

```
POST /api/v1/sdk/referential/{asset_id}/{check_id}/execute
```

> ℹ️ Requires `read-write` or `admin` scope.

Triggers an immediate execution of the referential integrity check.

### Response

```json
{
  "data": {
    "id": 789,
    "referential_check_id": 123,
    "status": "fail",
    "orphan_count": 5,
    "orphan_percent": 0.05,
    "total_child_rows": 10000,
    "orphan_sample": [
      {"customer_id": 99999},
      {"customer_id": 99998}
    ],
    "parents_below_min": null,
    "parents_above_max": null,
    "query_duration_ms": 2500,
    "created_at": "2024-12-04T10:35:00Z"
  }
}
```
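
The relationship between the counts and the thresholds can be sketched as below. Two things are assumptions rather than documented behaviour: that a check fails when either configured threshold is exceeded, and that the comparison is strictly "greater than". `evaluate` is a hypothetical helper.

```python
def evaluate(orphan_count, total_child_rows,
             max_orphan_count=None, max_orphan_percent=None):
    """Compute orphan_percent and a pass/fail status from thresholds."""
    if total_child_rows:
        orphan_percent = 100.0 * orphan_count / total_child_rows
    else:
        orphan_percent = 0.0

    # Assumption: exceeding either configured threshold fails the check.
    failed = (
        (max_orphan_count is not None and orphan_count > max_orphan_count)
        or (max_orphan_percent is not None and orphan_percent > max_orphan_percent)
    )
    return {
        "status": "fail" if failed else "pass",
        "orphan_percent": orphan_percent,
    }


# Numbers from the sample response: 5 orphans in 10,000 child rows.
print(evaluate(5, 10000, max_orphan_count=0))
print(evaluate(5, 10000, max_orphan_percent=0.1))
```

Five orphans in 10,000 rows is 0.05%, which fails a zero-tolerance count threshold but passes a 0.1% percentage threshold.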

## List Check Results

```
GET /api/v1/sdk/referential/{asset_id}/{check_id}/results
```

### Query Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `limit` | integer | 100 | Max results |
| `offset` | integer | 0 | Results to skip |

### Response

```json
{
  "data": {
    "items": [
      {
        "id": 789,
        "referential_check_id": 123,
        "status": "fail",
        "orphan_count": 5,
        "orphan_percent": 0.05,
        "total_child_rows": 10000,
        "query_duration_ms": 2500,
        "created_at": "2024-12-04T10:35:00Z"
      },
      {
        "id": 788,
        "referential_check_id": 123,
        "status": "pass",
        "orphan_count": 0,
        "orphan_percent": 0.0,
        "total_child_rows": 9995,
        "query_duration_ms": 2300,
        "created_at": "2024-12-03T10:35:00Z"
      }
    ]
  },
  "pagination": {
    "total": 60,
    "limit": 30,
    "offset": 0,
    "has_more": true
  }
}
```
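
One way to walk every page is to follow `pagination.has_more`, as sketched below. `fetch_page(limit, offset)` is a hypothetical callable standing in for the actual `GET` request; here it is replaced by an in-memory fake so the loop can be run as-is.

```python
def iter_check_results(fetch_page, limit=100):
    """Yield every result item, advancing `offset` until `has_more` is false.

    `fetch_page(limit, offset)` is a hypothetical callable that performs the
    GET request and returns the decoded JSON body.
    """
    offset = 0
    while True:
        body = fetch_page(limit, offset)
        items = body["data"]["items"]
        yield from items
        # Stop on the last page, or defensively if a page comes back empty
        if not body["pagination"]["has_more"] or not items:
            break
        offset += len(items)


# In-memory stand-in for the HTTP call: serves five fake results in pages.
FAKE_RESULTS = [{"id": i, "status": "pass"} for i in range(5)]


def fake_fetch(limit, offset):
    page = FAKE_RESULTS[offset:offset + limit]
    return {
        "data": {"items": page},
        "pagination": {
            "total": len(FAKE_RESULTS),
            "limit": limit,
            "offset": offset,
            "has_more": offset + len(page) < len(FAKE_RESULTS),
        },
    }


all_ids = [result["id"] for result in iter_check_results(fake_fetch, limit=2)]
print(all_ids)
```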

## Use Case: Detect Orphaned Orders

Monitor orders that reference non-existent customers:

```python
from anomalyarmor import Client

client = Client()
asset_id = "550e8400-e29b-41d4-a716-446655440000"

# Create a referential check
check = client.referential.create(
    asset_id,
    child_table_path="snowflake.prod.warehouse.orders",
    child_column_name="customer_id",
    parent_table_path="snowflake.prod.warehouse.customers",
    parent_column_name="id",
    name="Orders -> Customers FK",
    description="Ensures all orders reference valid customers",
    max_orphan_count=0,  # Zero tolerance for orphans
)

# Execute check immediately
result = client.referential.execute(asset_id, check.id)

if result.status == "fail":
    print(f"Data quality issue: {result.orphan_count} orphaned orders found")
    print(f"Orphan rate: {result.orphan_percent:.3f}%")

    # Log sample orphans for investigation
    for orphan in (result.orphan_sample or []):
        print(f"  Orphaned customer_id: {orphan}")

    # This could trigger an alert or fail a pipeline
    raise RuntimeError("Referential integrity violation detected")
else:
    print("All orders reference valid customers!")
```

## Use Case: Cardinality Validation

Ensure each order has a reasonable number of line items:

```python
from anomalyarmor import Client

client = Client()
asset_id = "550e8400-e29b-41d4-a716-446655440000"

# Create check with cardinality constraints
check = client.referential.create(
    asset_id,
    child_table_path="snowflake.prod.warehouse.order_items",
    child_column_name="order_id",
    parent_table_path="snowflake.prod.warehouse.orders",
    parent_column_name="id",
    name="Order Items -> Orders FK",
    min_child_count=1,   # Each order must have at least 1 item
    max_child_count=100, # No order should have more than 100 items
)

result = client.referential.execute(asset_id, check.id)

if result.parents_below_min is not None and result.parents_below_min > 0:
    print(f"Found {result.parents_below_min} orders with no items!")

if result.parents_above_max is not None and result.parents_above_max > 0:
    print(f"Found {result.parents_above_max} orders with too many items!")
```

## Error Responses

### Check Not Found (404)

```json
{
  "error": {
    "code": "CHECK_NOT_FOUND",
    "message": "Referential check not found",
    "details": {"check_id": "r_invalid-uuid"}
  }
}
```

### Invalid Table Path (400)

```json
{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Table not found: snowflake.prod.warehouse.invalid_table",
    "details": {"field": "child_table_path"}
  }
}
```

### Check Inactive (400)

```json
{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Cannot execute inactive check",
    "details": {"check_id": "r_550e8400-e29b-41d4-a716-446655440001"}
  }
}
```

### Forbidden (403)

```json
{
  "error": {
    "code": "FORBIDDEN",
    "message": "Insufficient permissions. Required scope: read-write",
    "details": {"current_scope": "read-only", "required_scope": "read-write"}
  }
}
```
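
All four errors share the same envelope (`error.code`, `error.message`, `error.details`), so a raw HTTP caller can handle them with one code path. The sketch below is one possible shape; `ApiError` and `raise_for_error` are hypothetical names, not part of the SDK.

```python
class ApiError(Exception):
    """Hypothetical exception carrying the fields of the error envelope."""

    def __init__(self, http_status, code, message, details):
        super().__init__(f"[{http_status}] {code}: {message}")
        self.http_status = http_status
        self.code = code
        self.details = details


def raise_for_error(http_status, body):
    """Raise ApiError for 4xx/5xx responses; do nothing for successful ones."""
    if http_status < 400:
        return
    err = body.get("error", {})
    raise ApiError(
        http_status,
        err.get("code", "UNKNOWN"),
        err.get("message", ""),
        err.get("details", {}),
    )


forbidden = {
    "error": {
        "code": "FORBIDDEN",
        "message": "Insufficient permissions. Required scope: read-write",
        "details": {"current_scope": "read-only", "required_scope": "read-write"},
    }
}

try:
    raise_for_error(403, forbidden)
except ApiError as exc:
    print(exc)
    print("Needs scope:", exc.details.get("required_scope"))
```

Branching on `code` together with the HTTP status is useful because two different 400 cases above share `VALIDATION_ERROR`.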

## Common Questions

### Do my warehouse tables need declared foreign keys for this to work?

No. Referential checks work against any two columns you name, regardless of whether the underlying warehouse enforces or even supports FK constraints. Snowflake and BigQuery, for instance, don't enforce FKs, which is exactly why this API exists. Just provide the child and parent `table_path` and column names.

### When should I use max_orphan_count vs max_orphan_percent?

Use `max_orphan_count=0` for hard-invariant relationships (every order must reference a real customer). Use `max_orphan_percent` when some drift is acceptable (e.g. late-arriving dimension rows) and you care about the rate, not the absolute count. You can set both; either threshold breach marks the check as failed.
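
The "either threshold fails the check" rule can be pictured with a small local function. This is only an illustration of the logic described above, not the service's implementation; it assumes a breach means strictly exceeding the threshold and that the percentage is on a 0-100 scale.

```python
def orphan_status(orphan_count, total_child_rows,
                  max_orphan_count=None, max_orphan_percent=None):
    """Return "fail" if either configured threshold is exceeded, else "pass"."""
    percent = 100 * orphan_count / total_child_rows if total_child_rows else 0.0
    count_breached = max_orphan_count is not None and orphan_count > max_orphan_count
    percent_breached = max_orphan_percent is not None and percent > max_orphan_percent
    return "fail" if (count_breached or percent_breached) else "pass"


# Hard invariant: a single orphan fails the check
print(orphan_status(5, 10000, max_orphan_count=0))
# Tolerant rate: 0.05% is under a 0.1% ceiling
print(orphan_status(5, 10000, max_orphan_percent=0.1))
# Both set: the count is within bounds but the rate is not
print(orphan_status(5, 10000, max_orphan_count=10, max_orphan_percent=0.01))
```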

### How do min_child_count and max_child_count work?

These are cardinality constraints on the parent->children relationship. `min_child_count=1` flags parents with zero children (e.g. orders with no line items), and `max_child_count=100` flags parents with too many. The result returns `parents_below_min` and `parents_above_max` counts separate from the orphan count.
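
To make the two counters concrete, here is a small in-memory sketch that counts children per parent and tallies violations. The function name is hypothetical and the real check runs in the warehouse; returning `None` for an unset constraint mirrors the `null` values in the execute response.

```python
from collections import Counter


def cardinality_violations(parent_keys, child_fks,
                           min_child_count=None, max_child_count=None):
    """Count parents with too few or too many children.

    A counter is None when its constraint is not configured.
    """
    children_per_parent = Counter(child_fks)
    below = 0 if min_child_count is not None else None
    above = 0 if max_child_count is not None else None
    for key in parent_keys:
        n = children_per_parent.get(key, 0)
        if min_child_count is not None and n < min_child_count:
            below += 1
        if max_child_count is not None and n > max_child_count:
            above += 1
    return {"parents_below_min": below, "parents_above_max": above}


orders = [1, 2, 3]
order_item_fks = [1, 1, 1, 2]  # order 1 has three items, order 3 has none
print(cardinality_violations(orders, order_item_fks,
                             min_child_count=1, max_child_count=2))
```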

### How expensive are referential checks on large tables?

The query joins child and parent tables, so cost scales with the warehouse's join performance on the key columns. `query_duration_ms` is returned on every result for observability. For very large tables, prefer `daily` or `weekly` `capture_interval` and rely on `execute` for on-demand checks after loads rather than running them hourly.
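
For intuition about the join, the sketch below runs an anti-join against an in-memory SQLite database. The SQL AnomalyArmor actually generates is not documented here, so treat this purely as an illustration of the kind of work involved; skipping `NULL` foreign keys is likewise an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER);
    CREATE TABLE orders (id INTEGER, customer_id INTEGER);
    INSERT INTO customers VALUES (1), (2);
    INSERT INTO orders VALUES (10, 1), (11, 2), (12, 99999);
""")

# Anti-join: child rows whose key has no matching parent row.
# NULL foreign keys are skipped here, which is an assumption.
ORPHAN_QUERY = """
    SELECT o.customer_id
    FROM orders AS o
    LEFT JOIN customers AS c ON c.id = o.customer_id
    WHERE c.id IS NULL AND o.customer_id IS NOT NULL
"""

orphans = [row[0] for row in conn.execute(ORPHAN_QUERY)]
total_child_rows = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
conn.close()

print(f"{len(orphans)} orphan(s) out of {total_child_rows} child rows: {orphans}")
```

Every child row has to be matched against the parent key, so the cost grows with both table sizes.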

---

<!-- Source: https://docs.anomalyarmor.ai/sdk/overview -->

# SDK overview

> Get started with the AnomalyArmor SDKs (Python and TypeScript)

AnomalyArmor ships official SDKs for **Python** and **TypeScript**. Both talk to the same REST API (`app.anomalyarmor.ai`) with the same `aa_live_*` Bearer tokens, so an existing Python script and a new Vercel function see identical data.

| Language | Package | Install | Docs |
|----------|---------|-----------|------|
| Python | `anomalyarmor-cli` on PyPI | `pip install anomalyarmor-cli` | Below |
| TypeScript / JavaScript | `@anomalyarmor/sdk` on npm | `npm install @anomalyarmor/sdk` | [JavaScript SDK](/sdk/javascript) |

The Python SDK also ships the `anomalyarmor` CLI. The TypeScript SDK ships an `npx anomalyarmor` CLI. Both CLIs read `ANOMALYARMOR_API_KEY` from the environment as a convenience; library code in both SDKs requires the key to be passed explicitly.

## Python SDK

`anomalyarmor-cli` provides a Python SDK and CLI for programmatic access.

### Installation

```bash
pip install anomalyarmor-cli
```

> ℹ️ Requires Python 3.9 or higher.

### Quick Start

```python
from anomalyarmor import Client

# Initialize - uses ARMOR_API_KEY env var
client = Client()

# Or pass key directly
client = Client(api_key="aa_live_xxx")

# List assets
assets = client.assets.list(source="snowflake", limit=10)
for asset in assets:
    print(asset.qualified_name)

# Check freshness (raises StalenessError if stale)
client.freshness.require_fresh("snowflake.prod.warehouse.orders")
```

## Configuration

### Environment Variables

| Variable | Description |
|----------|-------------|
| `ARMOR_API_KEY` | Your API key (recommended) |
| `ARMOR_API_URL` | Custom API URL (optional) |

```bash
export ARMOR_API_KEY="aa_live_your_key_here"
```

### Client Options

```python
from anomalyarmor import Client

client = Client(
    api_key="aa_live_xxx",      # Or use ARMOR_API_KEY env var
    api_url="https://...",       # Custom API URL (optional)
    timeout=30,                  # Request timeout in seconds
)
```

## Resources

The client provides access to all AnomalyArmor resources:

```python
client.assets      # Data assets
client.freshness   # Freshness monitoring
client.schema      # Schema drift detection
client.lineage     # Data lineage
client.alerts      # Alert history
client.metrics     # Data quality metrics
client.api_keys    # API key management (admin scope)
```

## Common Patterns

### Airflow Pre-flight Check

Gate your pipeline on data freshness:

```python
from anomalyarmor import Client
from anomalyarmor.exceptions import StalenessError

def check_upstream_freshness():
    client = Client()

    try:
        # Raises StalenessError if stale
        client.freshness.require_fresh("snowflake.prod.warehouse.orders")
        print("Data is fresh, proceeding...")
    except StalenessError as e:
        print(f"Data is stale: {e}")
        raise  # Fail the task
```

### List and Filter Assets

```python
from anomalyarmor import Client

client = Client()

# All Snowflake tables
tables = client.assets.list(source="snowflake", type="table")

# Paginate through all assets
offset = 0
while True:
    assets = client.assets.list(limit=100, offset=offset)
    if not assets:
        break
    for asset in assets:
        process(asset)
    offset += 100
```

### Trigger and Wait for Refresh

```python
from anomalyarmor import Client

client = Client()

# Trigger freshness check and wait for completion
result = client.freshness.refresh(
    "snowflake.prod.warehouse.orders",
    wait=True  # Block until complete
)
print(f"Job {result.job_id}: {result.status}")
```

### Check Lineage Before Running

```python
from anomalyarmor import Client

client = Client()

# Get upstream dependencies
lineage = client.lineage.get("snowflake.prod.warehouse.orders")

# Check all upstream sources are fresh
for upstream in lineage.upstream:
    client.freshness.require_fresh(upstream.qualified_name)

print("All upstream sources are fresh!")
```
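
The loop above stops at the first stale upstream. If you would rather report all of them at once, one option is to collect them with `freshness.get()` and `is_fresh`, as in this sketch. The helper name is hypothetical, and a `SimpleNamespace` fake stands in for a real `Client` so the example runs without network access.

```python
from types import SimpleNamespace


def find_stale_upstreams(client, asset_name):
    """Return the qualified names of all upstream assets that are not fresh."""
    lineage = client.lineage.get(asset_name)
    stale = []
    for upstream in lineage.upstream:
        status = client.freshness.get(upstream.qualified_name)
        if not status.is_fresh:
            stale.append(upstream.qualified_name)
    return stale


# Fake client for demonstration only; replace with Client() in real use.
freshness_by_name = {"raw.orders": True, "raw.customers": False}
fake_client = SimpleNamespace(
    lineage=SimpleNamespace(
        get=lambda name: SimpleNamespace(
            upstream=[SimpleNamespace(qualified_name=n) for n in freshness_by_name]
        )
    ),
    freshness=SimpleNamespace(
        get=lambda name: SimpleNamespace(is_fresh=freshness_by_name[name])
    ),
)

stale = find_stale_upstreams(fake_client, "snowflake.prod.warehouse.orders")
print(stale)
```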

### Monitor Data Quality Metrics

```python
from anomalyarmor import Client

client = Client()
asset_id = "550e8400-e29b-41d4-a716-446655440000"

# Create a row count metric with anomaly detection
metric = client.metrics.create(
    asset_id,
    metric_type="row_count",
    table_path="snowflake.prod.warehouse.orders",
    capture_interval="daily",
    sensitivity=2.0,  # Alert on 2+ standard deviations
)

# Get metrics summary
summary = client.metrics.summary(asset_id)
print(f"Health: {summary.health_percentage}%")

# Check for recent anomalies
snapshots = client.metrics.snapshots(asset_id, metric.id, limit=7)
anomalies = [s for s in snapshots if s.is_anomaly]
if anomalies:
    print(f"Found {len(anomalies)} anomalies in the last 7 days")
```

## Exception Handling

```python
from anomalyarmor import Client
from anomalyarmor.exceptions import (
    StalenessError,      # Data is stale
    AuthenticationError, # Invalid/missing API key
    NotFoundError,       # Asset not found
    RateLimitError,      # Rate limit exceeded
    ValidationError,     # Invalid parameters
    ServerError,         # Server error
    ArmorError,          # Base exception
)

client = Client()

try:
    client.freshness.require_fresh("snowflake.prod.warehouse.orders")
except StalenessError as e:
    print(f"Data is stale: last updated {e.last_updated}")
except AuthenticationError:
    print("Invalid API key")
except RateLimitError as e:
    print(f"Rate limited, retry after {e.retry_after} seconds")
except ArmorError as e:
    print(f"API error: {e}")
```
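
Since `RateLimitError` exposes `retry_after`, a small wrapper can wait and retry. The sketch below is duck-typed so it runs without the SDK installed: it retries any exception that has a `retry_after` attribute. With the real SDK you would likely catch `RateLimitError` directly. The helper name and the `sleep` parameter are hypothetical.

```python
import time


def call_with_rate_limit_retry(fn, max_attempts=3, sleep=time.sleep):
    """Call fn(), sleeping for exc.retry_after seconds between rate-limited attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            retry_after = getattr(exc, "retry_after", None)
            # Re-raise anything that is not a rate limit, or the final failure
            if retry_after is None or attempt == max_attempts:
                raise
            sleep(retry_after)


# Stand-in for RateLimitError so the example is self-contained.
class FakeRateLimitError(Exception):
    retry_after = 2


calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise FakeRateLimitError()
    return "ok"


waits = []
outcome = call_with_rate_limit_retry(flaky, max_attempts=3, sleep=waits.append)
print(outcome, waits)
```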

## Context Manager

The client can be used as a context manager for automatic cleanup:

```python
from anomalyarmor import Client

with Client() as client:
    assets = client.assets.list()
# Connection is closed automatically when the block exits
```

## Type Hints

The SDK is fully typed for IDE support:

```python
from anomalyarmor import Client
from anomalyarmor.models import Asset, FreshnessStatus

client = Client()

asset: Asset = client.assets.get("snowflake.prod.warehouse.orders")
status: FreshnessStatus = client.freshness.get(asset.qualified_name)

print(status.is_fresh)  # IDE knows this is bool
print(status.last_updated)  # IDE knows this is datetime
```

## Next Steps

- **[SDK Reference](/sdk/reference)**: Complete method reference
- **[CLI Guide](/cli/overview)**: Command-line interface
- **[Airflow Integration](/integrations/airflow)**: Use in Airflow DAGs
- **[API Reference](/api/overview)**: REST API documentation

## Common Questions

### Should I use the Python SDK or the TypeScript SDK?

Pick whichever matches the runtime you're already using: Python for Airflow, dbt hooks, and notebooks; TypeScript for Next.js, Vercel functions, and Node services. Both SDKs wrap the same REST API and accept the same `aa_live_*` keys, so mixing them across services is fine. See the [TypeScript SDK page](/sdk/javascript) for Node-specific setup.

### Which Python version does the SDK require?

Python 3.9 or higher. The SDK ships fully typed models (`Asset`, `FreshnessStatus`, etc.) so you get IDE completion on every method and field. Install with `pip install anomalyarmor-cli`, which also installs the `armor` CLI.

### How do I paginate through thousands of assets in Python?

Call `client.assets.list(limit=100, offset=n)` in a loop, incrementing `offset` by the page size until an empty page comes back. The SDK mirrors the REST pagination directly rather than hiding it, which keeps memory flat for very large accounts. The pattern is shown in the "List and Filter Assets" example above.
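
If the loop is needed in several places, it can be wrapped in a generator. This is a sketch, not an SDK feature: `iter_all` is a hypothetical helper that accepts any `list`-style method taking `limit` and `offset`, and it is exercised here with a fake in place of `client.assets.list`.

```python
def iter_all(list_fn, page_size=100, **filters):
    """Yield items from a limit/offset list method until an empty page comes back."""
    offset = 0
    while True:
        page = list_fn(limit=page_size, offset=offset, **filters)
        if not page:
            return
        yield from page
        offset += page_size


# Stand-in for client.assets.list, backed by 250 fake asset names.
FAKE_ASSETS = [f"snowflake.prod.warehouse.t{i}" for i in range(250)]


def fake_list(limit, offset, source=None):
    return FAKE_ASSETS[offset:offset + limit]


count = sum(1 for _ in iter_all(fake_list, page_size=100, source="snowflake"))
print(count)
```

With a real client the call would look like `iter_all(client.assets.list, source="snowflake")`. Only one page is held in memory at a time.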

### How do I use the SDK with Airflow?

Call `client.freshness.require_fresh(asset)` at the top of a task; it raises `StalenessError` when the asset is stale, which Airflow surfaces as a task failure. Set `ARMOR_API_KEY` as an Airflow connection secret and instantiate `Client()` with no arguments. The [Airflow integration guide](/integrations/airflow) shows a full DAG.

---

<!-- Source: https://docs.anomalyarmor.ai/sdk/reference -->

# SDK Reference

> Complete Python SDK method reference

Complete reference for all `anomalyarmor-cli` SDK classes and methods.

## Client

The main entry point for the SDK.

```python
from anomalyarmor import Client

client = Client(
    api_key="aa_live_xxx",  # Or use ARMOR_API_KEY env var
    api_url="https://app.anomalyarmor.ai/api/v1",  # Optional
    timeout=30,  # Request timeout in seconds
)
```

### Constructor

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `api_key` | `str \| None` | `None` | API key. Falls back to `ARMOR_API_KEY` env var |
| `api_url` | `str \| None` | Production URL | Base URL for API requests |
| `timeout` | `int \| None` | `30` | Request timeout in seconds |

### Context Manager

```python
with Client() as client:
    assets = client.assets.list()
# Connection automatically closed
```

## client.freshness

Monitor data freshness.

### freshness.summary()

Get aggregate freshness statistics.

```python
summary = client.freshness.summary()
print(f"Fresh: {summary.fresh}/{summary.total_assets}")
print(f"Fresh rate: {summary.fresh_percentage}%")
```

**Returns:** `FreshnessSummary`

### freshness.list()

List freshness status for all assets.

```python
statuses = client.freshness.list(
    status="stale",     # Filter: "fresh", "stale", "unknown"
    limit=50,
    offset=0,
)
```

**Returns:** `list[FreshnessStatus]`

### freshness.get()

Get freshness status for a specific asset.

```python
status = client.freshness.get("snowflake.prod.warehouse.orders")
print(f"Fresh: {status.is_fresh}")
print(f"Last updated: {status.last_updated}")
print(f"Hours since update: {status.hours_since_update}")
```

**Returns:** `FreshnessStatus`

### freshness.require_fresh()

Require an asset to be fresh, raising an error if stale. **This is the recommended gate pattern for pipelines.**

```python
from anomalyarmor.exceptions import StalenessError

try:
    client.freshness.require_fresh(
        "snowflake.prod.warehouse.orders",
        max_age_hours=24,  # Optional custom threshold
    )
    print("Data is fresh!")
except StalenessError as e:
    print(f"Stale: {e.hours_since_update}h old")
    raise
```

**Parameters:**
- `asset_id` (str): Asset qualified name or UUID
- `max_age_hours` (float | None): Custom threshold. Uses asset's configured threshold if not provided.

**Returns:** `FreshnessStatus` if fresh

**Raises:** `StalenessError` if stale, `NotFoundError` if not found
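
Conceptually, the gate compares the time since the last update with a threshold in hours. The sketch below reproduces that arithmetic locally with timezone-aware datetimes. It is an illustration only, and whether the service treats an age exactly equal to the threshold as stale is an assumption.

```python
from datetime import datetime, timedelta, timezone


def hours_since(last_updated, now=None):
    """Hours elapsed between last_updated and now (both timezone-aware)."""
    now = now or datetime.now(timezone.utc)
    return (now - last_updated).total_seconds() / 3600


def is_stale(last_updated, max_age_hours, now=None):
    """True when the data is older than max_age_hours (strictly older, by assumption)."""
    return hours_since(last_updated, now) > max_age_hours


now = datetime(2024, 12, 4, 12, 0, tzinfo=timezone.utc)
last_updated = now - timedelta(hours=30)

print(hours_since(last_updated, now))
print(is_stale(last_updated, max_age_hours=24, now=now))
print(is_stale(last_updated, max_age_hours=48, now=now))
```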

### freshness.refresh()

Trigger a freshness check.

```python
result = client.freshness.refresh("snowflake.prod.warehouse.orders")
print(f"Job ID: {result['job_id']}")
```

**Returns:** `dict` with `job_id`, `status`, `message`

**Raises:** `NotFoundError`, `AuthorizationError` (requires `read-write` scope)

---

## client.schema

Monitor schema drift.

### schema.summary()

Get schema drift summary statistics.

```python
summary = client.schema.summary()
print(f"Changes last 24h: {summary.changes_last_24h}")
```

**Returns:** `SchemaSummary`

### schema.changes()

List recent schema changes.

```python
changes = client.schema.changes(
    asset_id="snowflake.prod.warehouse.orders",  # Optional filter
    change_type="column_added",  # Optional filter
    limit=50,
    offset=0,
)
for change in changes:
    print(f"{change.qualified_name}: {change.change_type}")
```

**Returns:** `list[SchemaChange]`

### schema.refresh()

Trigger a schema check.

```python
result = client.schema.refresh("snowflake.prod.warehouse.orders")
```

**Returns:** `dict` with `job_id`, `status`

---

## client.lineage

Explore data dependencies.

### lineage.list()

List assets with lineage information.

```python
assets = client.lineage.list(
    source="snowflake",
    has_upstream=True,
    has_downstream=True,
    limit=50,
)
```

**Returns:** `list[LineageAsset]`

### lineage.get()

Get lineage for a specific asset.

```python
lineage = client.lineage.get(
    "snowflake.prod.warehouse.orders",
    direction="both",  # "upstream", "downstream", or "both"
    depth=2,           # Levels to traverse (1-5)
)

for upstream in lineage.upstream:
    print(f"<- {upstream.qualified_name}")

for downstream in lineage.downstream:
    print(f"-> {downstream.qualified_name}")
```

**Returns:** `Lineage`

---

## client.tags

Manage data classification tags.

### tags.list()

List tags for an asset.

```python
tags = client.tags.list(
    asset="postgresql.analytics",  # Asset ID or qualified name
    category="business",           # Optional filter
)
for tag in tags:
    print(f"{tag.name} on {tag.object_path}")
```

**Returns:** `list[Tag]`

### tags.create()

Create a tag on a database object.

```python
tag = client.tags.create(
    asset="postgresql.analytics",
    name="pii_data",
    object_path="gold.customers",          # Required: schema.table or schema.table.column
    object_type="table",                   # "table" or "column" (default: "table")
    category="governance",                 # "business", "technical", "governance"
    description="Contains customer PII",   # Optional
)
```

**Returns:** `Tag`

### tags.apply()

Apply multiple tags to multiple objects.

```python
result = client.tags.apply(
    asset="postgresql.analytics",
    tag_names=["pii", "gdpr"],                           # Required
    object_paths=["gold.customers", "gold.orders"],      # Required
    category="governance",
)
print(f"Applied: {result.applied}, Failed: {result.failed}")
```

**Returns:** `BulkApplyResult`

---

## client.intelligence

Query the AI knowledge base about your data.

### intelligence.ask()

Ask a question about an asset's data.

```python
answer = client.intelligence.ask(
    asset="postgresql.analytics",
    question="What tables contain customer data?",
)
print(answer.answer)
print(f"Confidence: {answer.confidence}")
print(f"Sources: {answer.sources}")
```

**Returns:** `IntelligenceAnswer`

**Raises:** `NotFoundError` if asset not found, `ValidationError` if intelligence not generated

### intelligence.generate()

Generate AI intelligence for an asset (async job).

```python
result = client.intelligence.generate(
    asset="postgresql.analytics",
)
print(f"Job ID: {result['job_id']}")
print(f"Status: {result['status']}")
```

**Returns:** `dict` with `job_id`, `status`

> ℹ️ Requires asset discovery to be run first. Use the UI or API to discover schema before generating intelligence.

---

## client.jobs

Monitor async job status.

### jobs.status()

Get status of an async job.

```python
status = client.jobs.status("job_abc123")
print(f"Status: {status.status}")
print(f"Progress: {status.progress}%")
if status.error:
    print(f"Error: {status.error}")
```

**Returns:** `JobStatus`
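
A common use of `jobs.status()` is polling until a job reaches a terminal state. The sketch below assumes `"completed"` and `"failed"` are the only terminal values, as listed in the `JobStatus` model. `wait_for_job` and its parameters are hypothetical, and a fake client is used so the loop can be run offline.

```python
import time
from types import SimpleNamespace


def wait_for_job(client, job_id, poll_seconds=5.0, max_polls=60, sleep=time.sleep):
    """Poll client.jobs.status(job_id) until the job completes or fails."""
    for _ in range(max_polls):
        status = client.jobs.status(job_id)
        if status.status in ("completed", "failed"):
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"Job {job_id} did not finish after {max_polls} polls")


# Fake client that reports pending, then running, then completed.
states = iter(["pending", "running", "completed"])
fake_client = SimpleNamespace(
    jobs=SimpleNamespace(
        status=lambda job_id: SimpleNamespace(status=next(states), progress=0, error=None)
    )
)

final = wait_for_job(fake_client, "job_abc123", poll_seconds=0, sleep=lambda s: None)
print(final.status)
```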

---

## client.metrics

Monitor data quality metrics like row counts, null percentages, and more.

### metrics.summary()

Get metrics summary for an asset.

```python
summary = client.metrics.summary("asset-uuid")
print(f"Active: {summary.active_metrics}/{summary.total_metrics}")
print(f"Health: {summary.health_percentage}%")
```

**Returns:** `MetricsSummary`

### metrics.list()

List metrics for an asset.

```python
metrics = client.metrics.list(
    "asset-uuid",
    metric_type="null_percent",  # Optional filter
    is_active=True,              # Optional filter
    limit=50,
    offset=0,
)
for m in metrics:
    print(f"{m.table_path}.{m.column_name}: {m.metric_type}")
```

**Returns:** `list[MetricDefinition]`

### metrics.get()

Get metric details with optional snapshots.

```python
metric = client.metrics.get(
    "asset-uuid",
    "metric-uuid",
    include_snapshots=True,
    snapshot_limit=30,
)
print(f"Type: {metric.metric_type}")
print(f"Table: {metric.table_path}")
```

**Returns:** `MetricDefinition`

### metrics.create()

Create a new metric. Requires `read-write` scope.

```python
metric = client.metrics.create(
    "asset-uuid",
    metric_type="null_percent",
    table_path="catalog.schema.table",
    column_name="email",
    capture_interval="daily",
)
print(f"Created: {metric.id}")
```

**Parameters:**
- `asset_id` (str): Asset UUID
- `metric_type` (str): `row_count`, `null_percent`, `distinct_count`, etc.
- `table_path` (str): Full table path
- `column_name` (str | None): Column name for column metrics
- `capture_interval` (str): `hourly`, `daily`, `weekly` (default: `daily`)
- `sensitivity` (float): Anomaly detection sensitivity (default: 1.0)

**Returns:** `MetricDefinition`
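
To give a feel for what a standard-deviation threshold means, the sketch below computes a z-score for a new value against recent history and flags it when the score reaches `sensitivity`. This is an assumption-laden illustration: the detection method the service actually uses is not documented on this page, and both function names are hypothetical.

```python
import statistics


def z_score(history, value):
    """Number of sample standard deviations between value and the history mean."""
    mean = statistics.fmean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return 0.0  # No spread to measure against; treated as not anomalous here
    return (value - mean) / stdev


def is_anomaly(history, value, sensitivity=1.0):
    """Flag values at least `sensitivity` standard deviations from the mean."""
    return abs(z_score(history, value)) >= sensitivity


row_counts = [1000, 1010, 990, 1005, 995]
print(is_anomaly(row_counts, 1040, sensitivity=2.0))
print(is_anomaly(row_counts, 1005, sensitivity=2.0))
```

Under this reading, a higher `sensitivity` value tolerates larger swings before flagging.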

### metrics.update()

Update a metric. Requires `read-write` scope.

```python
metric = client.metrics.update(
    "asset-uuid",
    "metric-uuid",
    is_active=False,
    sensitivity=2.0,
)
```

**Returns:** `MetricDefinition`

### metrics.delete()

Delete a metric. Requires `read-write` scope.

```python
client.metrics.delete("asset-uuid", "metric-uuid")
```

### metrics.capture()

Trigger an immediate metric capture. Requires `read-write` scope.

```python
result = client.metrics.capture("asset-uuid", "metric-uuid")
print(f"Captured {result.get('snapshot_count', 0)} snapshots")
```

**Returns:** `dict` with `snapshot_count` and `snapshots`

### metrics.snapshots()

List historical snapshots for a metric.

```python
snapshots = client.metrics.snapshots(
    "asset-uuid",
    "metric-uuid",
    limit=100,
)
for s in snapshots:
    print(f"{s.captured_at}: {s.value}")
```

**Returns:** `list[MetricSnapshot]`

---

## client.validity

Define and enforce data validity rules.

### validity.summary()

Get validity summary for an asset.

```python
summary = client.validity.summary("asset-uuid")
print(f"Total rules: {summary.total_rules}")
print(f"Passing: {summary.passing}, Failing: {summary.failing}")
```

**Returns:** `ValiditySummary`

### validity.list()

List validity rules for an asset.

```python
rules = client.validity.list(
    "asset-uuid",
    rule_type="NOT_NULL",  # Optional filter
    is_active=True,        # Optional filter
    limit=50,
)
for r in rules:
    print(f"{r.column_name}: {r.rule_type}")
```

**Returns:** `list[ValidityRule]`

### validity.get()

Get validity rule details.

```python
rule = client.validity.get("asset-uuid", "rule-uuid")
print(f"Type: {rule.rule_type}")
print(f"Severity: {rule.severity}")
```

**Returns:** `ValidityRule`

### validity.create()

Create a new validity rule. Requires `read-write` scope.

```python
# NOT_NULL rule
rule = client.validity.create(
    "asset-uuid",
    rule_type="NOT_NULL",
    table_path="catalog.schema.table",
    column_name="email",
    severity="critical",
)

# REGEX rule
regex_rule = client.validity.create(
    "asset-uuid",
    rule_type="REGEX",
    table_path="catalog.schema.table",
    column_name="email",
    rule_config={"pattern": r"^[\w.-]+@[\w.-]+\.\w+$"},
)
```

**Parameters:**
- `asset_id` (str): Asset UUID
- `rule_type` (str): `NOT_NULL`, `UNIQUE`, `REGEX`, `RANGE`, `ENUM`, etc.
- `table_path` (str): Full table path
- `column_name` (str | None): Column name
- `rule_config` (dict | None): Rule-specific configuration
- `severity` (str): `info`, `warning`, `critical` (default: `warning`)
- `check_interval` (str): `hourly`, `daily`, `weekly` (default: `daily`)

**Returns:** `ValidityRule`
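
Before registering a `REGEX` rule, it can help to try the pattern on a few sample values locally. The sketch below uses Python's `re` module with the email pattern from the example above. Your warehouse's regex dialect may differ from Python's, so treat this as a sanity check rather than a guarantee. `invalid_values` is a hypothetical helper.

```python
import re

EMAIL_PATTERN = r"^[\w.-]+@[\w.-]+\.\w+$"


def invalid_values(pattern, values):
    """Return the values that do not match the pattern."""
    compiled = re.compile(pattern)
    return [value for value in values if compiled.search(value) is None]


samples = [
    "ana@example.com",
    "bob.smith@mail.example.org",
    "not-an-email",
    "missing@tld",
]
rejected = invalid_values(EMAIL_PATTERN, samples)
print(rejected)
```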

### validity.update()

Update a validity rule. Requires `read-write` scope.

```python
rule = client.validity.update(
    "asset-uuid",
    "rule-uuid",
    severity="critical",
    is_active=True,
)
```

**Returns:** `ValidityRule`

### validity.delete()

Delete a validity rule. Requires `read-write` scope.

```python
client.validity.delete("asset-uuid", "rule-uuid")
```

### validity.check()

Trigger an immediate validity check. Requires `read-write` scope.

```python
result = client.validity.check(
    "asset-uuid",
    "rule-uuid",
    sample_limit=20,
)
if result.status == "fail":
    print(f"Invalid: {result.invalid_count} ({result.invalid_percent:.2f}%)")
```

**Returns:** `ValidityCheckResult`

### validity.results()

List historical check results.

```python
results = client.validity.results(
    "asset-uuid",
    "rule-uuid",
    limit=30,
)
for r in results:
    print(f"{r.checked_at}: {r.status}")
```

**Returns:** `list[ValidityCheckResult]`

---

## client.referential

Monitor referential integrity between tables.

### referential.summary()

Get referential summary for an asset.

```python
summary = client.referential.summary("asset-uuid")
print(f"Total: {summary.total_checks}")
print(f"Passing: {summary.passing_checks}, Failing: {summary.failing_checks}")
```

**Returns:** `ReferentialSummary`

### referential.list()

List referential checks for an asset.

```python
checks = client.referential.list(
    "asset-uuid",
    is_active=True,
    limit=50,
)
for c in checks:
    print(f"{c.child_column_name} -> {c.parent_column_name}")
```

**Returns:** `list[ReferentialCheck]`

### referential.get()

Get referential check details.

```python
check = client.referential.get("asset-uuid", "check-uuid")
print(f"FK: {check.child_table_path}.{check.child_column_name}")
print(f"PK: {check.parent_table_path}.{check.parent_column_name}")
```

**Returns:** `ReferentialCheck`

### referential.create()

Create a new referential check. Requires `read-write` scope.

```python
check = client.referential.create(
    "asset-uuid",
    child_table_path="catalog.schema.orders",
    child_column_name="customer_id",
    parent_table_path="catalog.schema.customers",
    parent_column_name="id",
    name="Orders -> Customers FK",
    max_orphan_count=0,
)
```

**Parameters:**
- `asset_id` (str): Asset UUID
- `child_table_path` (str): Child table path (contains FK)
- `child_column_name` (str): FK column name
- `parent_table_path` (str): Parent table path (contains PK)
- `parent_column_name` (str): PK column name
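- `name` (str | None): Check name
- `description` (str | None): Check description
- `max_orphan_count` (int | None): Alert threshold for orphan count
- `max_orphan_percent` (float | None): Alert threshold for orphan %
- `min_child_count` (int | None): Minimum number of child rows per parent
- `max_child_count` (int | None): Maximum number of child rows per parent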

**Returns:** `ReferentialCheck`

### referential.update()

Update a referential check. Requires `read-write` scope.

```python
check = client.referential.update(
    "asset-uuid",
    "check-uuid",
    max_orphan_percent=0.5,
    capture_interval="hourly",
)
```

**Returns:** `ReferentialCheck`

### referential.delete()

Delete a referential check. Requires `read-write` scope.

```python
client.referential.delete("asset-uuid", "check-uuid")
```

### referential.execute()

Execute a referential check immediately. Requires `read-write` scope.

```python
result = client.referential.execute("asset-uuid", "check-uuid")
if result.status == "fail":
    print(f"Orphans: {result.orphan_count} ({result.orphan_percent:.2f}%)")
```

**Returns:** `ReferentialCheckResult`

### referential.results()

List historical check results.

```python
results = client.referential.results(
    "asset-uuid",
    "check-uuid",
    limit=30,
)
for r in results:
    print(f"{r.created_at}: {r.status} ({r.orphan_count} orphans)")
```

**Returns:** `list[ReferentialCheckResult]`

---

## client.alerts

Query alert history.

### alerts.summary()

Get alert summary statistics.

```python
summary = client.alerts.summary()
print(f"Triggered: {summary.triggered}")
print(f"Last 24h: {summary.triggered_last_24h}")
```

**Returns:** `AlertsSummary`

### alerts.list()

List alerts with filters.

```python
alerts = client.alerts.list(
    status="triggered",  # "triggered", "acknowledged", "resolved"
    alert_type="freshness",  # "freshness", "schema_change", "row_count"
    asset_id="snowflake.prod.warehouse.orders",
    limit=50,
)
```

**Returns:** `list[Alert]`

### alerts.rules()

List configured alert rules.

```python
rules = client.alerts.rules()
for rule in rules:
    print(f"{rule.name}: {rule.alert_type}")
```

**Returns:** `list[AlertRule]`

---

## client.api_keys

Manage API keys (requires `admin` scope).

### api_keys.list()

List your organization's API keys.

```python
keys = client.api_keys.list(
    include_revoked=False,
    limit=50,
)
```

**Returns:** `list[APIKey]`

### api_keys.create()

Create a new API key.

> ⚠️ The full key is only returned once. Store it securely!

```python
new_key = client.api_keys.create(
    name="Airflow Production",
    scope="read-only",  # "read-only", "read-write", "admin"
)
print(f"Key: {new_key.key}")  # Save this!
```

**Returns:** `CreatedAPIKey` (includes full key)

### api_keys.get()

Get details of a specific key.

```python
key = client.api_keys.get(key_id)
print(f"{key.name}: {key.scope}")
```

**Returns:** `APIKey` (without full key)

### api_keys.revoke()

Revoke an API key. This cannot be undone.

```python
client.api_keys.revoke(key_id)
```

**Returns:** `APIKey`

### api_keys.usage()

Get API key usage and limits.

```python
usage = client.api_keys.usage()
print(f"Keys: {usage['current_count']}/{usage['max_keys']}")
print(f"Rate limit: {usage['rate_limit_per_min']}/min")
```

**Returns:** `dict`

---

## Models

### Asset

```python
class Asset:
    id: str
    qualified_name: str
    name: str
    asset_type: str  # "table", "view", "model"
    source: str
    database: str | None
    schema_name: str | None
    description: str | None
    row_count: int | None
    column_count: int | None
    tags: list[str]
    created_at: datetime
    updated_at: datetime
```

### FreshnessStatus

```python
class FreshnessStatus:
    asset_id: str
    qualified_name: str
    status: str  # "fresh", "stale", "unknown"
    is_fresh: bool
    is_stale: bool
    last_updated: datetime | None
    hours_since_update: float | None
    staleness_threshold_hours: float | None
    checked_at: datetime
```

### APIKey

```python
class APIKey:
    id: str
    name: str
    key_prefix: str  # e.g., "aa_live_abc1"
    key_suffix: str  # e.g., "xy9z"
    scope: str
    created_at: datetime
    last_used_at: datetime | None
    revoked_at: datetime | None
```

### CreatedAPIKey

```python
class CreatedAPIKey:
    id: str
    name: str
    key: str  # Full key - only shown once!
    scope: str
    created_at: datetime
```

### Tag

```python
class Tag:
    id: str
    name: str
    category: str  # "business", "technical", "governance"
    object_path: str | None  # e.g., "gold.customers"
    object_type: str | None  # "table" or "column"
    description: str | None
    created_at: datetime | None
```

### BulkApplyResult

```python
class BulkApplyResult:
    applied: int   # Number of tags successfully applied
    failed: int    # Number that failed
    total: int     # Total attempted
```

### IntelligenceAnswer

```python
class IntelligenceAnswer:
    answer: str           # AI-generated answer
    confidence: str       # "high", "medium", "low"
    sources: str          # Data sources used
```

### JobStatus

```python
class JobStatus:
    job_id: str
    status: str      # "pending", "running", "completed", "failed"
    progress: int    # 0-100
    error: str | None
    created_at: datetime
    completed_at: datetime | None
```

### MetricsSummary

```python
class MetricsSummary:
    total_metrics: int
    active_metrics: int
    total_checks: int
    passing: int
    failing: int
    warning: int
    error: int
    health_percentage: float
```

### MetricDefinition

```python
class MetricDefinition:
    id: str              # Public UUID
    internal_id: int
    asset_id: int
    table_path: str
    column_name: str | None
    metric_type: str     # row_count, null_percent, etc.
    capture_interval: str
    sensitivity: int
    is_active: bool
    created_at: datetime | None
```

### MetricSnapshot

```python
class MetricSnapshot:
    id: int
    metric_definition_id: int
    value: float
    captured_at: datetime
    is_anomaly: bool
    z_score: float | None
    status: str | None   # PASS, FAIL, WARNING
```
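
This reference does not define `z_score` further. Under the standard definition (distance from the historical mean, in units of sample standard deviation), it could be computed as in the sketch below. The helper name `z_score` and the use of the sample standard deviation are assumptions; how `is_anomaly` is derived from it, or from a metric's `sensitivity`, is not specified here.

```python
import statistics
from typing import Optional, Sequence


def z_score(history: Sequence[float], value: float) -> Optional[float]:
    """Standard z-score of `value` against earlier snapshot values.

    Returns None when the history is too short or has no variance,
    mirroring the `float | None` type of MetricSnapshot.z_score.
    """
    if len(history) < 2:
        return None
    mean = statistics.fmean(history)
    stdev = statistics.stdev(history)  # sample standard deviation
    if stdev == 0:
        return None
    return (value - mean) / stdev
```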

### ValiditySummary

```python
class ValiditySummary:
    total_rules: int
    passing: int
    failing: int
    error: int
```

### ValidityRule

```python
class ValidityRule:
    id: int
    uuid: str
    table_path: str
    column_name: str | None
    rule_type: str       # NOT_NULL, UNIQUE, REGEX, etc.
    rule_config: dict | None
    name: str | None
    severity: str        # info, warning, critical
    is_active: bool
    check_interval: str
```

### ValidityCheckResult

```python
class ValidityCheckResult:
    id: int
    validity_rule_id: int
    status: str          # pass, fail, error
    total_rows: int
    invalid_count: int
    invalid_percent: float
    invalid_samples: dict | None
    checked_at: datetime
```

### ReferentialSummary

```python
class ReferentialSummary:
    total_checks: int
    active_checks: int
    passing_checks: int
    failing_checks: int
    last_check_at: datetime | None
```

### ReferentialCheck

```python
class ReferentialCheck:
    id: str              # Public UUID
    internal_id: int
    asset_id: int
    child_table_path: str
    child_column_name: str
    parent_table_path: str
    parent_column_name: str
    name: str | None
    capture_interval: str
    max_orphan_count: int | None
    max_orphan_percent: float | None
    is_active: bool
```

### ReferentialCheckResult

```python
class ReferentialCheckResult:
    id: int
    referential_check_id: int
    status: str          # pass, fail, error
    orphan_count: int
    orphan_percent: float
    total_child_rows: int
    orphan_sample: list | None
    created_at: datetime
```
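
One plausible reading of how `max_orphan_count` and `max_orphan_percent` on a `ReferentialCheck` relate to the `status` of a result is sketched below. The function `evaluate_orphans` is hypothetical, and the rule "fail when either configured threshold is exceeded" is an assumption, not documented behavior.

```python
from typing import Optional, Tuple


def evaluate_orphans(
    orphan_count: int,
    total_child_rows: int,
    max_orphan_count: Optional[int] = None,
    max_orphan_percent: Optional[float] = None,
) -> Tuple[str, float]:
    """Return (status, orphan_percent) under the assumed threshold rule."""
    # Guard against empty child tables to avoid dividing by zero.
    orphan_percent = (
        100.0 * orphan_count / total_child_rows if total_child_rows else 0.0
    )
    over_count = max_orphan_count is not None and orphan_count > max_orphan_count
    over_percent = (
        max_orphan_percent is not None and orphan_percent > max_orphan_percent
    )
    return ("fail" if over_count or over_percent else "pass", orphan_percent)
```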

---

<!-- Source: https://docs.anomalyarmor.ai/sdk/error-handling -->

# Error Handling

> Handle exceptions and debug SDK issues

The SDK uses typed exceptions to communicate errors. All exceptions inherit from `ArmorError`.

## Exception Hierarchy

## Import Exceptions

```python
from anomalyarmor.exceptions import (
    ArmorError,           # Base exception
    AuthenticationError,  # Invalid/missing API key
    AuthorizationError,   # Valid key, insufficient scope
    NotFoundError,        # Resource not found
    ValidationError,      # Invalid parameters
    RateLimitError,       # Rate limit exceeded
    ServerError,          # Server error
    StalenessError,       # Data is stale
)
```
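
The HTTP statuses documented for each exception below can be summarized as a lookup. This is an illustrative sketch, not SDK code: `exception_name_for_status` is a hypothetical helper, and the fallback to `ArmorError` for unlisted statuses is an assumption. `StalenessError` is absent because it is not tied to an HTTP status.

```python
# Status codes as documented for each exception on this page.
STATUS_TO_EXCEPTION = {
    401: "AuthenticationError",
    403: "AuthorizationError",
    404: "NotFoundError",
    422: "ValidationError",
    429: "RateLimitError",
}


def exception_name_for_status(status: int) -> str:
    """Return the exception name this page associates with an HTTP status."""
    if status in STATUS_TO_EXCEPTION:
        return STATUS_TO_EXCEPTION[status]
    if 500 <= status <= 599:
        return "ServerError"
    # Assumption: anything else surfaces as the base class.
    return "ArmorError"
```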

### AuthenticationError

Raised when authentication fails (401).

```python
from anomalyarmor import Client
from anomalyarmor.exceptions import AuthenticationError

try:
    client = Client(api_key="invalid_key")
    client.assets.list()
except AuthenticationError as e:
    print(f"Invalid API key: {e.message}")

**Common causes:**
- Invalid API key
- Expired or revoked key
- Missing `Authorization` header

---

### AuthorizationError

Raised when authorization fails (403). The API key is valid but lacks permissions.

```python
class AuthorizationError(ArmorError):
    required_scope: str | None   # Scope needed for this action
    current_scope: str | None    # Scope of your API key
```

**Example:**

```python
try:
    # Creating keys needs admin scope; this fails if the client's key is read-only
    client.api_keys.create(name="test", scope="admin")
except AuthorizationError as e:
    print(f"Need {e.required_scope}, have {e.current_scope}")
```

**Common causes:**
- Using a `read-only` key for write operations
- Using a `read-write` key for admin operations

---

### NotFoundError

Raised when a resource doesn't exist (404).

```python
class NotFoundError(ArmorError):
    resource_type: str | None  # e.g., "asset"
    resource_id: str | None    # The ID that wasn't found
```

**Example:**

```python
try:
    asset = client.assets.get("nonexistent.qualified.name")
except NotFoundError as e:
    print(f"Asset not found: {e.resource_id}")
```

---

### ValidationError

Raised when request parameters are invalid (422).

```python
class ValidationError(ArmorError):
    field_errors: dict[str, str]  # Field-specific errors
```

**Example:**

```python
try:
    client.api_keys.create(name="", scope="invalid")
except ValidationError as e:
    print(f"Validation failed: {e.field_errors}")
```

---

### RateLimitError

Raised when the rate limit is exceeded (429).

```python
class RateLimitError(ArmorError):
    retry_after: int | None  # Seconds to wait before retrying
```

**Example:**

```python
import time

try:
    assets = client.assets.list()
except RateLimitError as e:
    # Fall back to 60 seconds when the server sends no retry hint
    wait = e.retry_after or 60
    print(f"Rate limited. Waiting {wait}s...")
    time.sleep(wait)
    assets = client.assets.list()  # Retry once
```

---

### ServerError

Raised for server-side errors (5xx).

```python
class ServerError(ArmorError):
    status_code: int  # HTTP status code
```

**Example:**

```python
try:
    assets = client.assets.list()
except ServerError as e:
    print(f"Server error ({e.status_code}): {e.message}")
```

---

### StalenessError

Raised by `require_fresh()` when data is stale. This is a **data quality** exception, not an API error.

```python
class StalenessError(ArmorError):
    asset: str                    # Asset qualified name
    hours_since_update: float     # Hours since last update
    threshold_hours: float        # The threshold that was exceeded
```

**Example:**

```python
import sys

from anomalyarmor.exceptions import StalenessError

try:
    client.freshness.require_fresh("snowflake.prod.warehouse.orders")
except StalenessError as e:
    print(f"Asset {e.asset} is stale")
    print(f"Last update: {e.hours_since_update:.1f}h ago")
    print(f"Threshold: {e.threshold_hours:.1f}h")
    sys.exit(1)
```

---

## Best Practices

### Catch Specific Exceptions

```python
import logging
import time

from anomalyarmor.exceptions import (
    StalenessError,
    AuthenticationError,
    RateLimitError,
    ArmorError,
)

logger = logging.getLogger(__name__)

try:
    client.freshness.require_fresh(asset)

except StalenessError as e:
    # Data quality issue - fail the pipeline
    logger.error(f"Stale data: {e.asset}")
    raise

except AuthenticationError:
    # Configuration issue - alert on-call
    logger.critical("Invalid API key!")
    notify_oncall()  # Placeholder for your own paging hook
    raise

except RateLimitError as e:
    # Transient - wait, then retry
    time.sleep(e.retry_after or 60)
    retry()  # Placeholder for your own retry logic

except ArmorError as e:
    # Any other SDK error - log and continue
    logger.warning(f"API error: {e}")
```

### Retry with Backoff

```python
import time
from anomalyarmor.exceptions import RateLimitError, ServerError

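def with_retry(fn, max_retries=3):
    """Call fn, retrying rate-limit and server errors with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return fn()

        except RateLimitError as e:
            if attempt == max_retries - 1:
                raise  # Out of attempts: surface the original error
            # Prefer the server's hint; otherwise back off 10s, 20s, 40s, ...
            wait = e.retry_after or (2 ** attempt * 10)
            print(f"Rate limited, waiting {wait}s...")
            time.sleep(wait)

        except ServerError:
            if attempt == max_retries - 1:
                raise
            wait = 2 ** attempt * 5  # 5s, 10s, 20s, ...
            print(f"Server error, retrying in {wait}s...")
            time.sleep(wait)

    # Only reachable when max_retries < 1
    raise ValueError("max_retries must be at least 1")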

# Usage
assets = with_retry(lambda: client.assets.list())
```

### Pipeline Gate Pattern

```python
from anomalyarmor import Client
from anomalyarmor.exceptions import StalenessError, ArmorError
import sys

def check_freshness_gate(assets: list[str]) -> bool:
    """Gate pipeline on data freshness."""
    client = Client()

    stale = []
    for asset in assets:
        try:
            client.freshness.require_fresh(asset)
        except StalenessError:
            stale.append(asset)
        except ArmorError as e:
            print(f"Warning: Could not check {asset}: {e}")

    if stale:
        print(f"BLOCKED: {len(stale)} stale assets: {stale}")
        return False

    print("All assets fresh, proceeding...")
    return True

# In your pipeline
if not check_freshness_gate(["orders", "customers"]):
    sys.exit(1)
```
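
The gate above proceeds when a freshness check cannot be completed: an `ArmorError` other than `StalenessError` only prints a warning. If an unverifiable asset should block the pipeline instead, a fail-closed variant is one option. In this sketch the exception classes are local stand-ins for the SDK's, and `check` is a hypothetical injected callable playing the role of `client.freshness.require_fresh`.

```python
from typing import Callable, Iterable


class ArmorError(Exception):
    """Local stand-in for anomalyarmor.exceptions.ArmorError."""


class StalenessError(ArmorError):
    """Local stand-in for anomalyarmor.exceptions.StalenessError."""


def freshness_gate(
    assets: Iterable[str],
    check: Callable[[str], None],
    fail_closed: bool = True,
) -> bool:
    """Return True only when the pipeline may proceed."""
    stale, unverified = [], []
    for asset in assets:
        try:
            check(asset)
        except StalenessError:
            stale.append(asset)
        except ArmorError:
            # The check itself failed, so freshness is unknown.
            unverified.append(asset)

    if stale:
        print(f"BLOCKED: stale assets: {stale}")
        return False
    if unverified and fail_closed:
        print(f"BLOCKED: could not verify: {unverified}")
        return False
    return True
```

Passing `fail_closed=False` reproduces the warn-and-continue behavior of the original gate.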

---

## Debugging

### Enable Request Logging

```python
import logging

from anomalyarmor import Client

# Enable debug logging for httpx
logging.basicConfig(level=logging.DEBUG)
logging.getLogger("httpx").setLevel(logging.DEBUG)

client = Client()
```

### Inspect Error Details

```python
from anomalyarmor.exceptions import ArmorError

try:
    client.assets.get("invalid")
except ArmorError as e:
    print(f"Message: {e.message}")
    print(f"Code: {e.code}")
    print(f"Details: {e.details}")

    # For ValidationError
    if hasattr(e, 'field_errors'):
        print(f"Field errors: {e.field_errors}")
```

### Check API Key Validity

```python
try:
    # Simple health check
    client.freshness.summary()
    print("API key is valid")
except AuthenticationError:
    print("API key is invalid or revoked")
```

## Common Questions

### Which exception should I catch to fail a pipeline on stale data?

Catch `StalenessError`, which is raised by `client.freshness.require_fresh(asset)` when the asset is past its freshness threshold. It carries `asset`, `hours_since_update`, and `threshold_hours` so you can log actionable context. Let it propagate in Airflow tasks to mark the task failed cleanly.

### How do I distinguish a transient server error from a permanent one?

Catch `ServerError` (5xx) separately from `ArmorError` and retry with backoff; these are usually transient. `ValidationError` and `NotFoundError` are permanent for the given input, so retrying won't help. The "Retry with Backoff" example above shows the pattern for 429 + 5xx specifically.
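
The transient-versus-permanent split described here can be captured in a small predicate. The classes below are local stand-ins so the sketch runs without the SDK installed, and `is_transient` is a hypothetical helper name; in real code you would import the exception types from `anomalyarmor.exceptions` instead.

```python
class ArmorError(Exception):
    """Local stand-in for the SDK base exception."""


class RateLimitError(ArmorError):
    """Stand-in for the 429 exception."""


class ServerError(ArmorError):
    """Stand-in for the 5xx exception."""


class ValidationError(ArmorError):
    """Stand-in for the 422 exception."""


class NotFoundError(ArmorError):
    """Stand-in for the 404 exception."""


# Errors that may succeed on a later attempt with the same input.
TRANSIENT_ERRORS = (RateLimitError, ServerError)


def is_transient(exc: BaseException) -> bool:
    """True when retrying the same call is worth attempting."""
    return isinstance(exc, TRANSIENT_ERRORS)
```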

### What's the difference between AuthenticationError and AuthorizationError?

`AuthenticationError` (401) means the API key itself is missing, invalid, or revoked. `AuthorizationError` (403) means the key is valid but lacks the scope the endpoint requires. The latter exposes `required_scope` and `current_scope` attributes so you can point users at the right key to use.

### How do I debug an unexpected error from the SDK?

Enable httpx debug logging (`logging.getLogger("httpx").setLevel(logging.DEBUG)`) to see the raw request and response, then inspect `e.message`, `e.code`, and `e.details` on the caught `ArmorError`. For `ValidationError`, `e.field_errors` points at the exact fields the API rejected.

---

<!-- Source: https://docs.anomalyarmor.ai/sdk/javascript -->

# TypeScript SDK

> Install and use @anomalyarmor/sdk for TypeScript and JavaScript

`@anomalyarmor/sdk` is the official TypeScript SDK for AnomalyArmor. Types are generated straight from our OpenAPI spec via `openapi-typescript`, so every endpoint is typed end-to-end. Runtime is a thin wrapper around `openapi-fetch` with Bearer auth and retry-on-429 middleware baked in.

## Install

```bash
npm install @anomalyarmor/sdk
# or: pnpm add @anomalyarmor/sdk  /  yarn add @anomalyarmor/sdk
```

Node 18+ is required. The package ships dual ESM + CJS builds and TypeScript types.

## Quickstart

```ts
import { createAnomalyArmorClient } from '@anomalyarmor/sdk';

const client = createAnomalyArmorClient({
  apiKey: process.env.ANOMALYARMOR_API_KEY!, // aa_live_...
});

// Liveness probe — 200 means key is valid
const health = await client.health.check();

// Aggregate counts of open alerts
const overview = await client.alerts.overview();
console.log(`${overview.unresolved_alerts} unresolved alerts`);

// Per-table freshness for an asset
const freshness = await client.freshness.check('asset-uuid-here');
```

Get an API key at [app.anomalyarmor.ai/settings/api-keys](https://app.anomalyarmor.ai/settings/api-keys). Keys start with `aa_live_`.

## Authentication

The SDK captures your API key at construction time and attaches `Authorization: Bearer <key>` to every outgoing request via a small middleware. **The SDK library itself never reads from `process.env`** - that would hide auth at a distance. Only the CLI shim (next section) reads `ANOMALYARMOR_API_KEY`, because CLIs traditionally do.

## CLI

The package ships an `anomalyarmor` executable (declared in its `bin` field) so you can smoke-check connectivity without writing code:

```bash
# With the env var set:
export ANOMALYARMOR_API_KEY=aa_live_...
npx anomalyarmor health

# Or pass the key directly:
npx anomalyarmor health --api-key aa_live_...
```

The CLI has exactly one command (`health`) by design - anything richer belongs in a real script calling the library API.

## Rate limiting & retries

The SDK automatically retries HTTP 429 responses, honoring `Retry-After`. Retries are bounded (default 3 attempts, each sleep capped at 60 s) and only apply to idempotent verbs (`GET`, `HEAD`, `PUT`, `DELETE`, `OPTIONS`) so a `POST` failure never duplicates a side effect.

Tune or disable:

```ts
const client = createAnomalyArmorClient({
  apiKey: '...',
  maxRetries: 5,              // default 3
  maxRetrySleepSeconds: 120,  // default 60
});

// Opt out of retries entirely (mutation-heavy callers who want explicit control):
const noRetryClient = createAnomalyArmorClient({ apiKey: '...', maxRetries: 0 });
```
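
The retry policy described in this section (429 only, idempotent verbs only, bounded attempts, capped sleep) can be sketched independently of the SDK. The Python below is illustrative, not the SDK's middleware: both function names are hypothetical, `retries_so_far` counts retries already made, and only numeric `Retry-After` values are handled.

```python
from typing import Optional

IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "PUT", "DELETE", "OPTIONS"})


def should_retry(
    method: str, status: int, retries_so_far: int, max_retries: int = 3
) -> bool:
    """Retry only HTTP 429 on idempotent verbs, within the retry budget."""
    return (
        status == 429
        and method.upper() in IDEMPOTENT_METHODS
        and retries_so_far < max_retries
    )


def retry_sleep_seconds(retry_after: Optional[str], max_sleep: float = 60.0) -> float:
    """Honor a numeric Retry-After header, capped at max_sleep seconds."""
    try:
        requested = float(retry_after) if retry_after is not None else 0.0
    except ValueError:
        # Assumption: non-numeric values (e.g. HTTP dates) are ignored here.
        requested = 0.0
    return min(max(requested, 0.0), max_sleep)
```

With `max_retries=0`, `should_retry` is always false, which corresponds to opting out of retries.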

## How requests work

Every typed SDK call flows through six layers: the ergonomic helper → `openapi-fetch` → the middleware stack → platform `fetch` → parse / unwrap → a statically-typed response.

## Error handling

Ergonomic methods throw `AnomalyArmorApiError` on 4xx / 5xx responses:

```ts
import { AnomalyArmorApiError } from '@anomalyarmor/sdk';

try {
  await client.alerts.get('does-not-exist');
} catch (err) {
  if (err instanceof AnomalyArmorApiError) {
    console.error(`HTTP ${err.status}: ${err.message}`);
    console.error('Full body:', err.body);
  } else {
    throw err; // network / parse / programmer errors
  }
}
```

## Drop down to the raw client

The ergonomic surface covers the endpoints customers use most often. For anything else, `client.raw` exposes the full typed `openapi-fetch` client:

```ts
const { data, error } = await client.raw.GET('/api/v1/assets/{asset_id}', {
  params: { path: { asset_id: 'my-asset' } },
});
if (error) throw new Error(`asset lookup failed: ${error.message}`);
// data is typed from the OpenAPI spec
```

## Configuration reference

| Option | Default | Purpose |
|---|---|---|
| `apiKey` | *(required)* | Your `aa_live_*` Bearer token. |
| `baseUrl` | `https://app.anomalyarmor.ai` | Override for staging or a local backend. |
| `maxRetries` | `3` | 429-retry budget. Set to `0` to disable. |
| `maxRetrySleepSeconds` | `60` | Cap on any single `Retry-After` sleep. |
| `fetch` | `globalThis.fetch` | Injectable `fetch` for tests / alternate runtimes. |

## Next steps

- **[API reference](/sdk/javascript-reference)**: Full typed reference for every SDK method
- **[Authentication](/api/authentication)**: How API keys work across the platform
- **[Python SDK](/sdk/overview)**: Same API surface in Python
- **[CLI reference](/cli/overview)**: Installable CLI for interactive use

## Common Questions

### What Node version does the SDK support?

Node 18 or higher, so `globalThis.fetch` is available natively. The package ships dual ESM + CJS builds and its own TypeScript types, so it drops into Next.js, Vite, and plain Node projects without extra polyfills.

### Does the SDK retry failed requests automatically?

Yes, but only for idempotent verbs (`GET`, `HEAD`, `PUT`, `DELETE`, `OPTIONS`) and only on HTTP 429. It honors the `Retry-After` header with a 3-attempt default, each sleep capped at 60 seconds. Tune with `maxRetries` / `maxRetrySleepSeconds`, or pass `maxRetries: 0` to opt out when you want explicit control of mutation retries.

### How do I call an endpoint that isn't in client.alerts / client.freshness / client.schema?

Drop down to `client.raw`, which is a fully typed `openapi-fetch` client covering the whole OpenAPI surface. You get the same Bearer auth and retry middleware, plus path/query/body types generated from the spec. Example: `client.raw.GET('/api/v1/assets/{asset_id}', { params: { path: { asset_id } } })`.

### How do I point the SDK at a local backend or staging environment?

Pass `baseUrl` at construction time: `createAnomalyArmorClient({ apiKey, baseUrl: 'http://localhost:8000' })`. The default is `https://app.anomalyarmor.ai`. For testing, you can also inject a custom `fetch` via the `fetch` option to stub responses without going over the network.

### Why does the SDK library ignore ANOMALYARMOR_API_KEY?

Intentional: library code must receive `apiKey` explicitly so authentication isn't hidden at a distance, which makes multi-tenant and per-request key rotation safe. Only the `npx anomalyarmor` CLI shim reads `ANOMALYARMOR_API_KEY`, matching standard CLI conventions.

---

<!-- Source: https://docs.anomalyarmor.ai/sdk/javascript-reference -->

# TypeScript SDK reference

> Full API reference for @anomalyarmor/sdk

The SDK's full typed surface lives on npm as `@anomalyarmor/sdk` and in-source as typedoc-generated reference pages.

## Live reference

The canonical reference is generated from TypeScript source on every release and hosted on GitHub Pages.

## Ergonomic surface

If you just want a quick lookup table, here are the hand-written wrappers. For anything not listed, drop down to `client.raw` and use the generated OpenAPI `paths` type directly - see [How requests work](/sdk/javascript#how-requests-work).

### `client.health`

| Method | Backing endpoint |
|---|---|
| `check()` | `GET /api/v1/health` |

### `client.alerts`

| Method | Backing endpoint |
|---|---|
| `overview()` | `GET /api/v1/alerts/overview` |
| `history(params?)` | `GET /api/v1/alerts/history` |
| `inbox()` | `GET /api/v1/alerts/inbox` |
| `get(alertId)` | `GET /api/v1/alerts/{alert_id}` |
| `acknowledge(alertId, notes?)` | `POST /api/v1/alerts/{alert_id}/acknowledge` |
| `resolve(alertId, notes?)` | `POST /api/v1/alerts/{alert_id}/resolve` |

### `client.freshness`

| Method | Backing endpoint |
|---|---|
| `check(assetId)` | `GET /api/v1/assets/{asset_id}/freshness` |
| `getTable(assetId, tablePath)` | `GET /api/v1/assets/{asset_id}/freshness/{table_path}` |
| `checkTable(assetId, tablePath)` | `POST /api/v1/assets/{asset_id}/freshness/{table_path}/check` |

### `client.schema`

| Method | Backing endpoint |
|---|---|
| `listChanges(assetId)` | `GET /api/v1/schema-drift/assets/{asset_id}/changes` |
| `getChange(assetId, changeId)` | `GET /api/v1/schema-drift/assets/{asset_id}/changes/{change_id}` |
| `baselineStatus(assetId)` | `GET /api/v1/schema-drift/assets/{asset_id}/baseline-status` |
| `detectChanges(assetId)` | `POST /api/v1/schema-drift/assets/{asset_id}/detect-changes` |

## Exported types

```ts
import type {
  AnomalyArmorClient,
  AnomalyArmorClientOptions,
  AlertsResource,
  FreshnessResource,
  SchemaResource,
  HealthResource,
  // Raw OpenAPI types for advanced callers:
  paths,
  components,
  operations,
} from '@anomalyarmor/sdk';

import { createAnomalyArmorClient, AnomalyArmorApiError, DEFAULT_BASE_URL } from '@anomalyarmor/sdk';
```

## Source

The SDK source lives in the [AnomalyArmor core monorepo](https://github.com/anomalyarmor/core/tree/main/sdk/javascript). Bug reports, feature requests, and PRs welcome at [github.com/anomalyarmor/core/issues](https://github.com/anomalyarmor/core/issues).

---

<!-- Source: https://docs.anomalyarmor.ai/cli/overview -->

# CLI Overview

> Install and use the armor CLI for scripting, automation, and CI/CD integration

Stop bad data before it reaches production. The `armor` CLI lets you gate deployments on data freshness, check quality in CI/CD pipelines, and automate monitoring setup.

```bash
# Gate your pipeline on data freshness
armor freshness check snowflake.prod.warehouse.orders || exit 1
```

## Installation

```bash
pip install anomalyarmor-cli
```

This installs both the Python SDK and the `armor` CLI command.

## Authentication

### Login with API Key

```bash
# Interactive prompt
armor auth login

# Non-interactive (for CI/CD)
armor auth login --key aa_live_your_key_here
```

Credentials are stored in `~/.armor/config.yaml`.

### Check Auth Status

```bash
armor auth status
```

Output:
```
Authenticated as: your-email@company.com
API Key: aa_live_k8jd...9z4f
Organization: Acme Corp
```

### Logout

```bash
armor auth logout
```

## Commands

### Assets

```bash
# List all assets
armor assets list

# Filter by source
armor assets list --source snowflake

# Filter by type
armor assets list --type table

# Get single asset
armor assets get snowflake.prod.warehouse.orders
```

### Freshness

```bash
# Get freshness summary
armor freshness summary

# List all freshness statuses
armor freshness list

# Get freshness for specific asset
armor freshness get snowflake.prod.warehouse.orders

# Check freshness (exit 0 if fresh, exit 1 if stale)
armor freshness check snowflake.prod.warehouse.orders

# Trigger refresh
armor freshness refresh snowflake.prod.warehouse.orders

# Trigger and wait for completion
armor freshness refresh snowflake.prod.warehouse.orders --wait
```

### Schema

```bash
# Get schema drift summary
armor schema summary

# List recent schema changes
armor schema changes

# Trigger schema check
armor schema refresh snowflake.prod.warehouse.orders --wait
```

### Lineage

```bash
# List assets with lineage
armor lineage list

# Get lineage for asset
armor lineage get snowflake.prod.warehouse.orders

# Get only upstream
armor lineage get snowflake.prod.warehouse.orders --direction upstream

# Get only downstream
armor lineage get snowflake.prod.warehouse.orders --direction downstream
```

### Alerts

```bash
# Get alerts summary
armor alerts summary

# List all alerts
armor alerts list

# Filter by status
armor alerts list --status triggered

# Filter by asset
armor alerts list --asset snowflake.prod.warehouse.orders
```

### Metrics

```bash
# Get metrics summary for an asset
armor metrics summary <asset-id>

# List all metrics for an asset
armor metrics list <asset-id>

# Filter by type
armor metrics list <asset-id> --type null_percent

# Get specific metric details
armor metrics get <asset-id> <metric-id>

# Create a row count metric
armor metrics create <asset-id> --type row_count --table snowflake.prod.warehouse.orders

# Create a null percentage metric
armor metrics create <asset-id> --type null_percent --table snowflake.prod.warehouse.orders --column email

# Trigger manual capture
armor metrics capture <asset-id> <metric-id>

# Delete a metric
armor metrics delete <asset-id> <metric-id> --yes
```

### API Keys (Admin)

```bash
# List your API keys
armor api-keys list

# Create new key
armor api-keys create --name "airflow-prod" --scope read-only

# Revoke key
armor api-keys revoke <key-id>
```

## Exit Codes

| Code | Meaning |
|------|---------|
| 0 | Success |
| 1 | Check failed (e.g., data is stale) |
| 2 | Authentication error |
| 3 | Resource not found |
| 4 | Rate limited |
| 5 | General error |

### Using in Scripts

```bash
#!/bin/bash

# Gate pipeline on freshness
if armor freshness check snowflake.prod.warehouse.orders; then
    echo "Data is fresh, running pipeline..."
    dbt run
else
    echo "Data is stale, aborting"
    exit 1
fi
```

## Output Formats

### Table (Default)

```bash
armor assets list
```

```
QUALIFIED NAME                        TYPE    SOURCE     ROWS
snowflake.prod.warehouse.orders       table   snowflake  1.5M
snowflake.prod.warehouse.customers    table   snowflake  250K
snowflake.prod.warehouse.products     table   snowflake  10K
```

### JSON

```bash
armor assets list --format json
```

```json
[
  {
    "qualified_name": "snowflake.prod.warehouse.orders",
    "asset_type": "table",
    "source": "snowflake",
    "row_count": 1500000
  }
]
```

## Configuration File

The CLI stores configuration in `~/.armor/config.yaml`:

```yaml
api_key: aa_live_your_key_here
api_url: https://api.anomalyarmor.ai
```

### Environment Variables

Override config file with environment variables:

```bash
export ARMOR_API_KEY="aa_live_xxx"
export ARMOR_API_URL="https://api.anomalyarmor.ai"
```

## CI/CD Examples

### GitHub Actions

```yaml
- name: Check data freshness
  env:
    ARMOR_API_KEY: ${{ secrets.ARMOR_API_KEY }}
  run: |
    pip install anomalyarmor-cli
    armor freshness check snowflake.prod.warehouse.orders
```

### GitLab CI

```yaml
check_freshness:
  script:
    - pip install anomalyarmor-cli
    - armor freshness check snowflake.prod.warehouse.orders
  variables:
    ARMOR_API_KEY: $ARMOR_API_KEY
```

## Next Steps

- **[CLI Reference](/cli/reference)**: Complete command reference
- **[Python SDK](/sdk/overview)**: Use programmatically in Python
- **[Airflow Integration](/integrations/airflow)**: Use in Airflow DAGs
- **[API Reference](/api/overview)**: REST API documentation

## Common Questions

### How do I use the CLI in GitHub Actions or GitLab CI?

Install it with `pip install anomalyarmor-cli`, set `ARMOR_API_KEY` from your secrets store, and run `armor freshness check <asset>`. The non-zero exit code on stale data fails the step cleanly. The "CI/CD Examples" section above has copy-paste snippets for both runners.

### What does each exit code mean in scripts?

`0` is success, `1` is a check failure (stale data, validity fail), `2` is auth, `3` is not-found, `4` is rate-limited, and `5` (listed in the CLI reference) is a general error. Write shell guards against the specific codes you care about (`if [ $? -eq 1 ]; then ...`) rather than treating any non-zero as the same error.
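
When the CLI is driven from Python rather than a shell, the same branching can be done on `returncode`. The sketch below is one way to do it, under stated assumptions: the helper names are hypothetical, code `5` is taken from the CLI reference table, and treating only rate limiting as retryable is a choice made for the example.

```python
import subprocess

# Exit codes as listed in the CLI documentation.
EXIT_MEANINGS = {
    0: "success",
    1: "check failed",
    2: "authentication error",
    3: "resource not found",
    4: "rate limited",
    5: "general error",
}


def describe_exit(code: int) -> str:
    """Translate an armor exit code into a short description."""
    return EXIT_MEANINGS.get(code, f"unrecognized exit code {code}")


def should_retry_later(code: int) -> bool:
    """Only rate limiting (4) is treated as worth retrying in this sketch."""
    return code == 4


def run_freshness_check(asset: str) -> int:
    """Run `armor freshness check <asset>` and return its exit code."""
    completed = subprocess.run(["armor", "freshness", "check", asset])
    return completed.returncode
```

A caller might fail the job on `1`, page someone on `2`, and sleep before retrying on `4`.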

### How do I get machine-readable output from the CLI?

Pass `--format json` on any list or get command. The output is a raw JSON array (or object) suitable for piping into `jq`, other CLIs, or uploading as a CI artifact. The default table format is optimized for terminal reading only.
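
As an alternative to `jq`, the JSON can be consumed from Python. This sketch parses the sample array shown under "Output Formats" and filters it; `tables_with_at_least` is a hypothetical helper, and it assumes only the fields that appear in that sample.

```python
import json

# Sample copied from the `--format json` example on this page.
SAMPLE_OUTPUT = """
[
  {
    "qualified_name": "snowflake.prod.warehouse.orders",
    "asset_type": "table",
    "source": "snowflake",
    "row_count": 1500000
  }
]
"""


def tables_with_at_least(cli_json: str, min_rows: int) -> list:
    """Return qualified names of tables with at least min_rows rows."""
    assets = json.loads(cli_json)
    return [
        asset["qualified_name"]
        for asset in assets
        if asset.get("asset_type") == "table"
        # A missing or null row_count is treated as zero rows.
        and (asset.get("row_count") or 0) >= min_rows
    ]


print(tables_with_at_least(SAMPLE_OUTPUT, 1_000_000))
```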

### Where does the CLI store credentials and how do I override them?

`armor auth login` writes `api_key` and `api_url` to `~/.armor/config.yaml`. The `ARMOR_API_KEY` and `ARMOR_API_URL` environment variables override the config file, which is what you want in CI where you shouldn't write secrets to disk.
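
The precedence described here (environment variable first, config file second) can be written out as a small resolver. This is an illustrative sketch, not the CLI's implementation: `resolve_setting` is a hypothetical name, and treating an empty environment variable as unset is an assumption.

```python
from typing import Mapping, Optional

# Environment variable name -> key in ~/.armor/config.yaml, as documented.
ENV_TO_CONFIG_KEY = {"ARMOR_API_KEY": "api_key", "ARMOR_API_URL": "api_url"}


def resolve_setting(
    env_name: str, env: Mapping[str, str], config: Mapping[str, str]
) -> Optional[str]:
    """Return the environment value if set, else the config file value."""
    value = env.get(env_name)
    if value:  # Assumption: an empty string counts as unset.
        return value
    return config.get(ENV_TO_CONFIG_KEY[env_name])
```

In practice `env` would be `os.environ` and `config` the parsed contents of the YAML file.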

### Can I use the CLI non-interactively for CI?

Yes. `armor auth login --key aa_live_...` skips the interactive prompt, and every command reads `ARMOR_API_KEY` directly if you prefer not to persist a config file at all. Most CI pipelines just set the env var and run commands without ever calling `auth login`.

---

<!-- Source: https://docs.anomalyarmor.ai/cli/reference -->

# CLI Reference

> Complete command reference for the armor CLI

Complete reference for all `armor` CLI commands.

## Exit Codes

| Code | Meaning |
|------|---------|
| `0` | Success |
| `1` | Staleness/check failed |
| `2` | Authentication error |
| `3` | Resource not found |
| `4` | Rate limited |
| `5` | General error |

## armor assets

Asset management commands.

### armor assets list

List assets with optional filters.

```bash
armor assets list [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--source, -s` | Filter by source type (e.g., `snowflake`, `postgresql`) |
| `--type, -t` | Filter by asset type (e.g., `table`, `view`) |
| `--search` | Search in asset names |
| `--limit, -l` | Max results (default: 50) |

**Example:**

```bash
$ armor assets list --source snowflake --type table

                           Assets
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━┓
┃ Qualified Name                    ┃ Type  ┃ Source    ┃ Active ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━┩
│ snowflake.prod.warehouse.orders   │ table │ snowflake │ Yes    │
│ snowflake.prod.warehouse.users    │ table │ snowflake │ Yes    │
└───────────────────────────────────┴───────┴───────────┴────────┘

Showing 2 assets
```

### armor assets get

Get asset details.

```bash
armor assets get <asset_id>
```

| Argument | Description |
|----------|-------------|
| `asset_id` | Asset ID or qualified name |

**Example:**

```bash
$ armor assets get snowflake.prod.warehouse.orders

Asset: snowflake.prod.warehouse.orders
ID: 550e8400-e29b-41d4-a716-446655440000
Type: table
Source: snowflake
Active: Yes
Description: Customer order transactions
```

---

## armor freshness

Freshness monitoring commands.

### armor freshness summary

Get freshness summary.

```bash
armor freshness summary
```

**Example:**

```bash
$ armor freshness summary

Freshness Summary
Total Assets: 245
Fresh: 230
Stale: 10
Unknown: 5
Freshness Rate: 93.9%
```

### armor freshness list

List freshness status for all assets.

```bash
armor freshness list [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--status, -s` | Filter by status (`fresh`, `stale`, `unknown`) |
| `--limit, -l` | Max results (default: 50) |

**Example:**

```bash
$ armor freshness list --status stale

                      Freshness Status
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Asset                             ┃ Status ┃ Hours Since Update┃ Threshold ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ snowflake.prod.warehouse.orders   │ Stale  │ 26.5              │ 24h       │
└───────────────────────────────────┴────────┴───────────────────┴───────────┘
```

### armor freshness get

Get the freshness status of a specific asset.

```bash
armor freshness get <asset_id>
```

**Example:**

```bash
$ armor freshness get snowflake.prod.warehouse.orders

Asset: snowflake.prod.warehouse.orders
Status: Fresh
Last Update: 2024-12-04T10:30:00Z
Hours Since Update: 2.5
Threshold: 24h
```

### armor freshness check

Check if an asset is fresh, fail if stale. **Ideal for CI/CD pipelines.**

```bash
armor freshness check <asset_id> [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--max-age, -m` | Max acceptable age in hours (optional) |

**Exit codes:**
- `0` - Data is fresh
- `1` - Data is stale
- `3` - Asset not found

**Example (fresh):**

```bash
$ armor freshness check snowflake.prod.warehouse.orders
FRESH: snowflake.prod.warehouse.orders
Hours since update: 2.5h
$ echo $?
0
```

**Example (stale):**

```bash
$ armor freshness check snowflake.prod.warehouse.orders
STALE: Asset 'snowflake.prod.warehouse.orders' is stale: 26.5h since last update (threshold: 24.0h)
Hours since update: 26.5h
Threshold: 24.0h
$ echo $?
1
```

**Shell script usage:**

```bash
#!/bin/bash
if armor freshness check snowflake.prod.warehouse.orders; then
    echo "Data is fresh, running pipeline..."
    dbt run
else
    echo "Data is stale, aborting"
    exit 1
fi
```

### armor freshness refresh

Trigger a freshness check for an asset.

```bash
armor freshness refresh <asset_id>
```

> ℹ️ Requires an API key with `read-write` or `admin` scope.

**Example:**

```bash
$ armor freshness refresh snowflake.prod.warehouse.orders

Refresh initiated
Job ID: job_abc123
Status: queued
```

---

## armor schema

Schema drift monitoring commands.

### armor schema summary

Get schema changes summary.

```bash
armor schema summary
```

**Example:**

```bash
$ armor schema summary

Schema Changes Summary
Total Changes: 23
Unacknowledged: 5
Critical: 2
Warning: 8
Info: 13
```

### armor schema changes

List schema changes.

```bash
armor schema changes [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--asset, -a` | Filter by asset |
| `--type, -t` | Filter by change type (`column_added`, `column_removed`, `type_changed`, etc.) |
| `--breaking` | Only show breaking changes (removed columns/tables, type changes) |
| `--unacknowledged, -u` | Only show unacknowledged changes |
| `--limit, -l` | Max results (default: 50) |

**Example:**

```bash
$ armor schema changes --breaking

                           Schema Changes
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━┓
┃ Asset                           ┃ Change         ┃ Column    ┃ Ack ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━┩
│ snowflake.prod.warehouse.orders │ column_removed │ user_id   │ No  │
└─────────────────────────────────┴────────────────┴───────────┴─────┘
```
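
The filters can gate a CI step. The sketch below counts data rows in the table output; it assumes that `--breaking` and `--unacknowledged` can be combined, that data rows start with `│` as in the example above, and that an empty result prints no rows. The helper names are hypothetical.

```bash
# Sketch: fail when unacknowledged breaking changes exist.
# Counts table rows that start with "│"; header rows use "┃".
count_breaking_changes() {
    armor schema changes --breaking --unacknowledged "$@" | grep -c '^│' || true
}

breaking_gate() {
    local n
    n="$(count_breaking_changes "$@")"
    if [ "$n" -gt 0 ]; then
        echo "$n unacknowledged breaking change(s) found" >&2
        return 1
    fi
    echo "no unacknowledged breaking changes"
}
```

Extra arguments pass through, for example `breaking_gate --asset snowflake.prod.warehouse.orders`. The count is capped by `--limit` (default 50), and parsing a rendered table is fragile, so treat this as a starting point.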

---

## armor alerts

Alert management commands.

### armor alerts summary

Get alerts summary.

```bash
armor alerts summary
```

**Example:**

```bash
$ armor alerts summary

Alerts Summary
Total Rules: 15
Active Rules: 12
Recent Alerts: 45
Unresolved: 3
```

### armor alerts list

List alerts with filters.

```bash
armor alerts list [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--status, -s` | Filter by status (`triggered`, `acknowledged`, `resolved`) |
| `--type, -t` | Filter by alert type (`schema`, `freshness`, `discovery`) |
| `--limit, -l` | Max results (default: 50) |

**Example:**

```bash
$ armor alerts list --status triggered

                              Alerts
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Asset                           ┃ Message                ┃ Type      ┃ Status    ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━┩
│ snowflake.prod.warehouse.orders │ Data is 26 hours stale │ freshness │ Triggered │
└─────────────────────────────────┴────────────────────────┴───────────┴───────────┘
```

---

## armor api-keys

API key management commands. Requires `admin` scope.

### armor api-keys list

List API keys.

```bash
armor api-keys list [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--include-revoked` | Include revoked keys |

**Example:**

```bash
$ armor api-keys list

                           API Keys
┏━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━┓
┃ Name               ┃ Key              ┃ Scope      ┃ Active ┃ Last Used  ┃
┡━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━┩
│ Airflow Production │ aa_live_abc...z4 │ read-only  │ Yes    │ 2024-12-04 │
│ CI Pipeline        │ aa_live_def...y8 │ read-write │ Yes    │ 2024-12-03 │
└────────────────────┴──────────────────┴────────────┴────────┴────────────┘
```

### armor api-keys create

Create a new API key.

```bash
armor api-keys create --name <name> [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--name, -n` | Human-readable name for the key |
| `--scope, -s` | Permission scope (default: `read-only`) |

**Scopes:**
- `read-only` - Read assets, freshness, lineage, alerts
- `read-write` - Read + trigger refreshes
- `admin` - Full access including key management

**Example:**

```bash
$ armor api-keys create --name "Airflow Production" --scope read-only

API key created successfully!

Key: aa_live_k8jd92hf8j2hd98fh2d9h2f98h2d9fh2

IMPORTANT: This key will only be shown once!
Store it securely.
```
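
Because the key is shown only once, it helps to capture it at creation time. A sketch that writes the key to a file readable only by the current user; it assumes the `Key: <value>` line format from the example above, and `create_and_store_key` and the destination path are hypothetical:

```bash
# Sketch: create a key and store it with owner-only permissions.
create_and_store_key() {
    local name="$1" scope="$2" dest="$3"
    local key
    key="$(armor api-keys create --name "$name" --scope "$scope" \
        | sed -n 's/^Key: //p')"
    if [ -z "$key" ]; then
        echo "could not find a 'Key:' line in the CLI output" >&2
        return 1
    fi
    # umask covers a newly created file; chmod covers a pre-existing one.
    ( umask 077 && printf '%s\n' "$key" > "$dest" ) && chmod 600 "$dest"
    echo "key stored in $dest"
}
```

For example, `create_and_store_key "Airflow Production" read-only ~/.armor/airflow.key`. A secrets manager is usually a better destination than a local file.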

### armor api-keys revoke

Revoke an API key.

```bash
armor api-keys revoke <key_id> [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--yes, -y` | Skip confirmation prompt |

**Example:**

```bash
$ armor api-keys revoke 550e8400-e29b-41d4-a716-446655440000
Are you sure you want to revoke this key? [y/N]: y
API key revoked successfully.
```

---

## armor tags

Tag management commands.

### armor tags list

List tags for an asset.

```bash
armor tags list --asset <asset_id> [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--asset, -a` | Asset ID or qualified name (required) |
| `--category, -c` | Filter by category |

**Example:**

```bash
$ armor tags list --asset postgresql.analytics

                              Tags
┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Name           ┃ Category   ┃ Object Path         ┃ Type       ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ pii_data       │ governance │ gold.customers      │ table      │
│ financial      │ business   │ gold.orders         │ table      │
└────────────────┴────────────┴─────────────────────┴────────────┘
```

### armor tags create

Create a tag on a database object.

```bash
armor tags create <name> --asset <asset_id> --path <object_path> [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--asset, -a` | Asset ID or qualified name (required) |
| `--path, -p` | Object path, e.g., `schema.table` (required) |
| `--category, -c` | Category: `business`, `technical`, `governance` (default: `business`) |
| `--description, -d` | Tag description |

**Example:**

```bash
$ armor tags create pii_data --asset postgresql.analytics --path gold.customers --category governance

Tag created: pii_data
  Category: governance
  Path: gold.customers
```

### armor tags apply

Apply multiple tags to multiple objects.

```bash
armor tags apply <tag_names> --asset <asset_id> --paths <object_paths> [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--asset, -a` | Asset ID or qualified name (required) |
| `--paths, -p` | Comma-separated object paths (required) |
| `--category, -c` | Category for new tags (default: `business`) |

**Example:**

```bash
$ armor tags apply "pii,gdpr" --asset postgresql.analytics --paths "gold.customers,gold.orders"

Applied 4 tags
```

---

## armor intelligence

AI intelligence commands.

### armor intelligence ask

Ask a question about your data.

```bash
armor intelligence ask <asset_id> <question>
```

**Example:**

```bash
$ armor intelligence ask postgresql.analytics "What tables contain customer data?"

Question: What tables contain customer data?

Answer:
The following tables contain customer data:
- gold.customers: Primary customer dimension table
- gold.orders: Contains customer_id foreign key

Confidence: high
Sources: Asset Discovery, Intelligence KB
```

### armor intelligence generate

Generate AI intelligence for an asset.

```bash
armor intelligence generate <asset_id>
```

> ℹ️ Requires asset discovery to be run first.

**Example:**

```bash
$ armor intelligence generate postgresql.analytics

Intelligence generation started
Job ID: job_abc123
Status: queued
```

---

## armor jobs

Job monitoring commands.

### armor jobs status

Get status of an async job.

```bash
armor jobs status <job_id>
```

**Example:**

```bash
$ armor jobs status job_abc123

Job: job_abc123
Status: completed
Progress: 100%
Started: 2024-12-21T10:30:00Z
Completed: 2024-12-21T10:32:15Z
```
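
Commands such as `armor freshness refresh` and `armor intelligence generate` return a job ID, so scripts usually need to wait for it. A polling sketch that assumes the `Status: <value>` line shown above; only `queued` and `completed` appear in this reference, so `failed` is an assumed status name, and `wait_for_job` is a hypothetical helper:

```bash
# Sketch: poll `armor jobs status` until the job finishes or a poll
# budget runs out. Returns 0 on completed, 1 on failed, 2 on timeout.
wait_for_job() {
    local job_id="$1" interval="${2:-5}" max_polls="${3:-60}"
    local status polls=0
    while [ "$polls" -lt "$max_polls" ]; do
        status="$(armor jobs status "$job_id" | sed -n 's/^Status: //p')"
        case "$status" in
            completed) echo "job $job_id completed"; return 0 ;;
            failed)    echo "job $job_id failed" >&2; return 1 ;;  # assumed status
        esac
        polls=$((polls + 1))
        sleep "$interval"
    done
    echo "timed out waiting for job $job_id" >&2
    return 2
}
```

`wait_for_job job_abc123 10 30` polls every 10 seconds for up to 30 attempts.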

---

## armor metrics

Data quality metrics commands.

### armor metrics summary

Get metrics summary for an asset.

```bash
armor metrics summary <asset_id>
```

**Example:**

```bash
$ armor metrics summary 550e8400-e29b-41d4-a716-446655440000

Metrics Summary
Total Metrics: 15
Active Metrics: 12
Health: 87.5%
Passing: 10
Failing: 2
```

### armor metrics list

List metrics for an asset.

```bash
armor metrics list <asset_id> [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--type, -t` | Filter by metric type (e.g., `row_count`, `null_percent`) |
| `--active` | Only show active metrics |
| `--limit, -l` | Max results (default: 50) |

**Example:**

```bash
$ armor metrics list 550e8400-e29b-41d4-a716-446655440000 --type null_percent

                           Metrics
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Table                             ┃ Column       ┃ Type          ┃ Active ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ snowflake.prod.warehouse.orders   │ customer_id  │ null_percent  │ Yes    │
│ snowflake.prod.warehouse.orders   │ email        │ null_percent  │ Yes    │
└───────────────────────────────────┴──────────────┴───────────────┴────────┘
```

### armor metrics get

Get metric details.

```bash
armor metrics get <asset_id> <metric_id>
```

**Example:**

```bash
$ armor metrics get 550e8400-e29b-41d4-a716-446655440000 m_abc123

Metric: m_abc123
Type: null_percent
Table: snowflake.prod.warehouse.orders
Column: customer_email
Interval: daily
Active: Yes
```

### armor metrics create

Create a new metric. Requires `read-write` scope.

```bash
armor metrics create <asset_id> --type <type> --table <path> [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--type, -t` | Metric type (required): `row_count`, `null_percent`, etc. |
| `--table` | Full table path (required) |
| `--column` | Column name (for column metrics) |
| `--interval` | Capture interval: `hourly`, `daily`, `weekly` (default: `daily`) |

**Example:**

```bash
$ armor metrics create 550e8400-e29b-41d4-a716-446655440000 \
  --type null_percent \
  --table snowflake.prod.warehouse.orders \
  --column customer_email

Metric created: m_abc123
Type: null_percent
Table: snowflake.prod.warehouse.orders
Column: customer_email
```

### armor metrics capture

Trigger an immediate metric capture. Requires `read-write` scope.

```bash
armor metrics capture <asset_id> <metric_id>
```

**Example:**

```bash
$ armor metrics capture 550e8400-e29b-41d4-a716-446655440000 m_abc123

Captured 1 snapshot
Value: 2.5
Anomaly: No
```

### armor metrics delete

Delete a metric. Requires `read-write` scope.

```bash
armor metrics delete <asset_id> <metric_id> [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--yes, -y` | Skip confirmation prompt |

---

## armor validity

Data validity rules commands.

### armor validity summary

Get validity summary for an asset.

```bash
armor validity summary <asset_id>
```

**Example:**

```bash
$ armor validity summary 550e8400-e29b-41d4-a716-446655440000

Validity Summary
Total Rules: 12
Passing: 10
Failing: 1
Error: 1
```

### armor validity list

List validity rules for an asset.

```bash
armor validity list <asset_id> [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--type, -t` | Filter by rule type (e.g., `NOT_NULL`, `REGEX`) |
| `--active` | Only show active rules |
| `--limit, -l` | Max results (default: 50) |

**Example:**

```bash
$ armor validity list 550e8400-e29b-41d4-a716-446655440000 --type NOT_NULL

                           Validity Rules
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Table                             ┃ Column        ┃ Type     ┃ Severity  ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━┩
│ snowflake.prod.warehouse.orders   │ customer_id   │ NOT_NULL │ critical  │
│ snowflake.prod.warehouse.orders   │ email         │ NOT_NULL │ warning   │
└───────────────────────────────────┴───────────────┴──────────┴───────────┘
```

### armor validity get

Get validity rule details.

```bash
armor validity get <asset_id> <rule_id>
```

**Example:**

```bash
$ armor validity get 550e8400-e29b-41d4-a716-446655440000 v_abc123

Rule: v_abc123
Name: Customer Email Required
Type: NOT_NULL
Table: snowflake.prod.warehouse.orders
Column: customer_email
Severity: critical
Active: Yes
```

### armor validity create

Create a new validity rule. Requires `read-write` scope.

```bash
armor validity create <asset_id> --type <type> --table <path> [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--type, -t` | Rule type (required): `NOT_NULL`, `UNIQUE`, `REGEX`, etc. |
| `--table` | Full table path (required) |
| `--column` | Column name |
| `--config` | Rule configuration (JSON string) |
| `--severity` | Severity: `info`, `warning`, `critical` (default: `warning`) |

**Examples:**

```bash
# NOT NULL rule
$ armor validity create 550e8400-e29b-41d4-a716-446655440000 \
  --type NOT_NULL \
  --table snowflake.prod.warehouse.orders \
  --column customer_email \
  --severity critical

Rule created: v_abc123
Type: NOT_NULL

# REGEX rule
$ armor validity create 550e8400-e29b-41d4-a716-446655440000 \
  --type REGEX \
  --table snowflake.prod.warehouse.orders \
  --column customer_email \
  --config '{"pattern": "^[\\w.-]+@[\\w.-]+\\.\\w+$"}'

Rule created: v_def456
Type: REGEX
```

### armor validity check

Trigger an immediate validity check. Requires `read-write` scope.

```bash
armor validity check <asset_id> <rule_id>
```

**Exit codes:**
- `0` - Validation passed
- `1` - Validation failed (invalid records found)
- `3` - Rule not found

**Example (pass):**

```bash
$ armor validity check 550e8400-e29b-41d4-a716-446655440000 v_abc123

PASS: v_abc123 (NOT_NULL on customer_email)
Total Rows: 10000
Invalid: 0 (0.00%)
$ echo $?
0
```

**Example (fail):**

```bash
$ armor validity check 550e8400-e29b-41d4-a716-446655440000 v_abc123

FAIL: v_abc123 (NOT_NULL on customer_email)
Total Rows: 10000
Invalid: 25 (0.25%)
Sample invalid values:
  - row 1001: null
  - row 1042: null
$ echo $?
1
```
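
The exit codes make it straightforward to run several rules in one CI step. A sketch, with `run_validity_checks` as a hypothetical helper and the rule IDs as placeholders:

```bash
# Sketch: run each rule, keep going after a failure, and report a total.
run_validity_checks() {
    local asset="$1"; shift
    local rule rc failed=0
    for rule in "$@"; do
        rc=0
        armor validity check "$asset" "$rule" || rc=$?
        case "$rc" in
            0) ;;
            1) echo "FAILED: $rule" >&2; failed=$((failed + 1)) ;;
            3) echo "MISSING: $rule" >&2; failed=$((failed + 1)) ;;
            *) echo "ERROR ($rc): $rule" >&2; failed=$((failed + 1)) ;;
        esac
    done
    echo "$failed of $# rule(s) did not pass"
    [ "$failed" -eq 0 ]
}
```

Running every rule before exiting gives one complete report per pipeline run instead of stopping at the first failure.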

### armor validity delete

Delete a validity rule. Requires `read-write` scope.

```bash
armor validity delete <asset_id> <rule_id> [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--yes, -y` | Skip confirmation prompt |

---

## armor referential

Referential integrity check commands.

### armor referential summary

Get referential summary for an asset.

```bash
armor referential summary <asset_id>
```

**Example:**

```bash
$ armor referential summary 550e8400-e29b-41d4-a716-446655440000

Referential Summary
Total Checks: 8
Active: 7
Passing: 6
Failing: 1
```

### armor referential list

List referential checks for an asset.

```bash
armor referential list <asset_id> [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--active` | Only show active checks |
| `--limit, -l` | Max results (default: 50) |

**Example:**

```bash
$ armor referential list 550e8400-e29b-41d4-a716-446655440000

                           Referential Checks
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Child                           ┃ Parent                           ┃ Status ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ orders.customer_id              │ customers.id                     │ Pass   │
│ payments.order_id               │ orders.id                        │ Fail   │
└─────────────────────────────────┴──────────────────────────────────┴────────┘
```

### armor referential get

Get referential check details.

```bash
armor referential get <asset_id> <check_id>
```

**Example:**

```bash
$ armor referential get 550e8400-e29b-41d4-a716-446655440000 r_abc123

Check: r_abc123
Name: Orders -> Customers FK
FK: orders.customer_id
PK: customers.id
Interval: daily
Active: Yes
```

### armor referential create

Create a new referential check. Requires `read-write` scope.

```bash
armor referential create <asset_id> [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--child-table` | Child table path (required) |
| `--child-column` | Child column name (required) |
| `--parent-table` | Parent table path (required) |
| `--parent-column` | Parent column name (required) |
| `--name` | Human-readable name |
| `--max-orphans` | Max orphan count threshold |
| `--max-orphan-percent` | Max orphan percentage threshold |

**Example:**

```bash
$ armor referential create 550e8400-e29b-41d4-a716-446655440000 \
  --child-table snowflake.prod.warehouse.orders \
  --child-column customer_id \
  --parent-table snowflake.prod.warehouse.customers \
  --parent-column id \
  --name "Orders -> Customers FK"

Check created: r_abc123
FK: orders.customer_id -> customers.id
```

### armor referential execute

Execute a referential check. Requires `read-write` scope.

```bash
armor referential execute <asset_id> <check_id>
```

**Exit codes:**
- `0` - Check passed (no orphans or within threshold)
- `1` - Check failed (orphans exceed threshold)
- `3` - Check not found

**Example (pass):**

```bash
$ armor referential execute 550e8400-e29b-41d4-a716-446655440000 r_abc123

PASS: r_abc123 (orders.customer_id -> customers.id)
Total Child Rows: 10000
Orphans: 0 (0.00%)
$ echo $?
0
```

**Example (fail):**

```bash
$ armor referential execute 550e8400-e29b-41d4-a716-446655440000 r_abc123

FAIL: r_abc123 (orders.customer_id -> customers.id)
Total Child Rows: 10000
Orphans: 5 (0.05%)
Sample orphan values:
  - customer_id: 99999
  - customer_id: 99998
$ echo $?
1
```
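
The percentage in the output is orphans divided by total child rows. The sketch below reproduces that arithmetic and compares it to a `--max-orphan-percent` style threshold. Whether the real check treats a value equal to the threshold as a pass is not documented; this sketch assumes it does. Both helper names are hypothetical.

```bash
# Sketch: orphan percentage and threshold comparison.
orphan_percent() {
    # $1 = orphan count, $2 = total child rows
    awk -v o="$1" -v t="$2" 'BEGIN { printf "%.2f\n", (t > 0 ? 100 * o / t : 0) }'
}

within_orphan_threshold() {
    # $1 = orphan count, $2 = total child rows, $3 = max orphan percent
    awk -v o="$1" -v t="$2" -v limit="$3" \
        'BEGIN { pct = (t > 0 ? 100 * o / t : 0); exit (pct <= limit ? 0 : 1) }'
}
```

With the failing example above, `orphan_percent 5 10000` prints `0.05`, which would pass a 0.1 percent threshold and fail a 0.01 percent one.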

### armor referential delete

Delete a referential check. Requires `read-write` scope.

```bash
armor referential delete <asset_id> <check_id> [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--yes, -y` | Skip confirmation prompt |

---

## armor lineage

Data lineage commands.

### armor lineage get

Get lineage for an asset.

```bash
armor lineage get <asset_id> [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--depth, -d` | Depth of lineage (1-5, default: 1) |
| `--direction` | Direction: `upstream`, `downstream`, `both` (default: `both`) |

**Example:**

```bash
$ armor lineage get snowflake.prod.warehouse.orders --depth 2

Lineage for: snowflake.prod.warehouse.orders

Upstream (dependencies):
  - snowflake.raw.stripe.orders
  - snowflake.raw.crm.customers

Downstream (dependents):
  - snowflake.prod.mart.orders_mart
  - looker.sales_dashboard
```

---

## armor contract

ODCS contract export/import (see the [Data Contracts guide](/guides/contracts) for the ODCS background).

### armor contract pull

Export one or many contracts as ODCS YAML.

```bash
armor contract pull [--asset UUID | --warehouse NAME | --all] [-o FILE] [OPTIONS]
```

**Scope (exactly one required):**
- `--asset UUID`: one asset, sync. Writes YAML to stdout unless `-o` given.
- `--warehouse NAME`: every contract in the warehouse, async job, writes zip to `-o`.
- `--all`: every contract in the account, async job, writes zip to `-o`.

**Options:**
- `-o, --output PATH`: output file. Required for bulk scopes.
- `--mode extended|odcs-pure`: default `extended`. `odcs-pure` strips AnomalyArmor-specific fields for maximum ODCS interop.
- `--include DOMAINS`: comma-separated domain names to include (e.g. `freshness,validity`).
- `--exclude DOMAINS`: comma-separated domain names to exclude.
- `--timeout SECONDS`: async job poll timeout, default 600.

**Examples:**

```bash
# One asset to stdout
armor contract pull --asset <uuid>

# One asset to file, freshness + validity only
armor contract pull --asset <uuid> -o orders.yaml --include freshness,validity

# Whole warehouse
armor contract pull --warehouse analytics -o analytics.zip

# Everything, pure ODCS (max portability)
armor contract pull --all --mode odcs-pure -o contracts.zip
```

**Domain names:** `schema`, `freshness`, `validity`, `metrics`, `row_count`, `drift_monitors`, `schema_drift`, `alert_rules`, `destinations`, `blackouts`.

---

## Configuration

### Config File

The CLI stores configuration in `~/.armor/config.yaml`:

```yaml
api_key: aa_live_your_key_here
api_url: https://api.anomalyarmor.ai
timeout: 30
```

### Environment Variables

Override config file with environment variables:

```bash
export ARMOR_API_KEY="aa_live_xxx"
export ARMOR_API_URL="https://api.anomalyarmor.ai"
```

Environment variables take precedence over config file values.
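
The precedence rule can be pictured with a small shell function. This only illustrates the documented behavior and is not how the CLI is implemented; `resolve_api_key` is a hypothetical name, and the `sed` line is a naive single-line parse rather than a YAML parser.

```bash
# Sketch: environment variable first, config file second.
resolve_api_key() {
    local config="${1:-$HOME/.armor/config.yaml}"
    if [ -n "${ARMOR_API_KEY:-}" ]; then
        printf '%s\n' "$ARMOR_API_KEY"
    elif [ -f "$config" ]; then
        # Matches a top-level line of the form "api_key: <value>".
        sed -n 's/^api_key:[[:space:]]*//p' "$config"
    fi
}
```

This is handy in CI: leave the config file untouched and inject `ARMOR_API_KEY` from the secret store for each job.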

---

<!-- Source: https://docs.anomalyarmor.ai/guides/contracts -->

# Data Contracts (ODCS export/import)

> Export and import AnomalyArmor monitoring config as portable ODCS YAML

AnomalyArmor speaks the [Open Data Contract Standard (ODCS) v3.1.0](https://bitol-io.github.io/open-data-contract-standard/latest/). Every asset's monitoring config (freshness SLAs, validity rules, schema, metrics, drift monitors, alert rules, destinations, blackouts) exports as ODCS YAML and imports back on demand.

Three things this gives you:

1. **Interoperability.** ODCS is a Linux Foundation project. Soda exports it natively, and adapters translate dbt and Great Expectations config into it. Your contract is not a proprietary blob.
2. **No lock-in.** Click Export, get a directory of YAML. Import it into whichever tool you pick next. No rewrite required.
3. **Version control.** Contracts diff cleanly in Git. Reviewers see which validity rule changed in a PR, not a screenshot of a settings page.

## ODCS-native vs ODCS-extended

ODCS covers the core concepts every tool agrees on: schema, column quality rules, freshness SLAs, ownership. We use its sanctioned `customProperties` extension mechanism for the AnomalyArmor-specific concepts ODCS does not have a native slot for.

| Native ODCS (any tool reads these) | Extended (under `customProperties.anomalyarmor`) |
|---|---|
| `apiVersion`, `id`, `name`, `version`, `status`, `domain`, `tags` | Schema drift monitoring (ODCS has no schema-change concept) |
| `schema[]` with `properties[]` + `physicalType` | ML distribution drift (PSI, KS, chi-squared) |
| `quality[]` metric library (`nullValues`, `invalidValues`, `duplicateValues`, `rowCount`) | Row-count anomaly detection with rolling baselines |
| `quality[]` custom SQL | Alert rules and routing |
| `slaProperties[]` (latency, frequency) | Destinations (Slack, email, PagerDuty, Linear, webhooks) |
| `team[]` | Operating schedules and blackout windows |

Our exported YAML is **fully ODCS-compliant**: it passes the [official v3.1.0 JSON Schema validator](https://github.com/bitol-io/open-data-contract-standard/tree/main/schema). Using `customProperties` is exactly how Soda and Great Expectations handle the same tension: industry-standard, not a workaround.

ODCS models `customProperties` as a **list of `{property, value}` entries**, not a dict. Our AnomalyArmor extensions all ride inside a single entry whose `property` is `anomalyarmor`:

```yaml
customProperties:
  - property: anomalyarmor
    value:
      freshness: [...]
      validity: [...]
      drift_monitors: [...]
```

## Deviations from ODCS v3.1.0

Small, documented places where our exported YAML does more or less than the spec:

- **`schema[].properties[].id` (extension).** ODCS does not define an `id` at the column level. We write a stable UUID5 per column so re-import can match columns across renames. `odcs-pure` mode strips it; other ODCS tools read the column by `name` and ignore `id`.
- **Unmodeled optional sections.** We recognize these ODCS sections but neither produce nor consume them: `servers`, `roles`, `price`, `support`, `authoritativeDefinitions`, `contractCreatedTs`. If present on import they are accepted and ignored; export never emits them.

## Portability modes

Two export modes control what gets written to the YAML.

### `extended` (default)

Full round-trip fidelity. Includes `customProperties.anomalyarmor.<domain>` blocks with every AA-specific configuration knob (drift thresholds, monitoring modes, alert routing, etc.). Use this for version control, backup, and re-import into AnomalyArmor.

### `odcs-pure`

Strips every `customProperties.anomalyarmor` block. The resulting YAML contains only the ODCS-native subset that any ODCS tool can read and act on: schema, quality rules, freshness SLAs, team. You lose the AA-extended features (drift, alert routing, blackouts) but gain maximum portability for the interop case.

Pick `odcs-pure` when you want to hand the contract to a Soda or Great Expectations customer. Pick `extended` everywhere else.
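
To check which mode produced a file already on disk, one rough heuristic is to look for the `anomalyarmor` entry under `customProperties`. This is a plain text match rather than a YAML parse, and an `extended` export of an asset with no AA-specific config might not contain the entry; `contract_mode` is a hypothetical helper.

```bash
# Sketch: guess the export mode from the presence of the
# "- property: anomalyarmor" list entry.
contract_mode() {
    if grep -Eq '^[[:space:]]*-[[:space:]]*property:[[:space:]]*anomalyarmor[[:space:]]*$' "$1"; then
        echo "extended"
    else
        echo "odcs-pure"
    fi
}
```

A pre-share check such as `[ "$(contract_mode orders.yaml)" = "odcs-pure" ]` can catch an extended file before it is handed to another tool.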

## CLI

```bash
pip install anomalyarmor-cli
armor auth login --key aa_live_your_key_here
```

The package installs two console scripts, `armor` and `anomalyarmor`. They are aliases; use whichever you prefer. Examples below use `armor` for brevity.

### Export: one asset to stdout

```bash
armor contract pull --asset <asset-uuid>
```

Writes ODCS YAML to stdout. Pipe to `yq` or redirect to a file.

### Export: one asset to a file

```bash
armor contract pull --asset <asset-uuid> -o orders.yaml
```

### Export: whole warehouse as a zip

```bash
armor contract pull --warehouse analytics -o analytics.zip
```

Submits an async job, polls to completion, and writes the zip. Layout inside the zip:

```
analytics/
  contract/
    orders.yaml
    customers.yaml
    invoices.yaml
```

### Export: every contract in your account

```bash
armor contract pull --all -o everything.zip
```

### Filtering (export or import)

Include or exclude specific config domains:

```bash
# Only freshness + validity
armor contract pull --asset <uuid> --include freshness,validity -o partial.yaml

# Everything except alert routing and destinations
armor contract pull --asset <uuid> --exclude alert_rules,destinations -o no-alerts.yaml
```

Registered domain names: `schema`, `freshness`, `validity`, `metrics`, `row_count`, `drift_monitors`, `schema_drift`, `alert_rules`, `destinations`, `blackouts`.

### Export: ODCS-pure mode

```bash
armor contract pull --asset <uuid> --mode odcs-pure -o orders.odcs.yaml
```

### Validate a YAML file (no DB, no auth required beyond the API key)

```bash
armor contract validate -f orders.yaml
```

Exits 0 on a valid document, non-zero on parse or schema errors. Errors are printed one-per-line with file path, YAML path, and line number, so editors and CI can parse the output. Cheap: a single HTTP round-trip regardless of contract size.
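
In CI it is common to validate every contract in a directory. A sketch that assumes a flat `contracts/` directory (the layout is an assumption; only `armor contract validate -f` comes from the CLI), with `validate_contracts` as a hypothetical helper:

```bash
# Sketch: validate each YAML file and fail if any file is rejected.
validate_contracts() {
    local dir="${1:-contracts}" file bad=0
    for file in "$dir"/*.yaml "$dir"/*.yml; do
        [ -f "$file" ] || continue   # skip glob patterns that matched nothing
        if ! armor contract validate -f "$file"; then
            bad=$((bad + 1))
        fi
    done
    echo "$bad invalid contract file(s)"
    [ "$bad" -eq 0 ]
}
```

Each file costs one HTTP round-trip, so this stays fast for typical repository sizes.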

### Plan an import (dry-run diff)

```bash
armor contract plan --asset <asset-uuid> -f orders.yaml
```

Shows a per-domain diff without applying. Output:

```
Plan for asset 53581432-...:
  freshness         +2 ~1 -0
  validity          +0 ~3 -1
  drift_monitors    +0 ~0 -0
```

`+` is additions, `~` is modifications, `-` is deletions.
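
A CI job can scan the plan for deletions before anyone adds `--prune`. The sketch reads plan output on stdin and assumes the `<domain> +A ~M -D` layout shown above; `plan_has_deletions` is a hypothetical helper.

```bash
# Sketch: print each domain with a non-zero deletion count.
# Exit status 0 means at least one deletion was found, 1 means none.
plan_has_deletions() {
    awk '
        $NF ~ /^-[0-9]+$/ {
            n = substr($NF, 2) + 0
            if (n > 0) { print $1 ": " n " deletion(s)"; found = 1 }
        }
        END { exit (found ? 0 : 1) }
    '
}
```

For the sample plan above, `armor contract plan --asset <asset-uuid> -f orders.yaml | plan_has_deletions` would print `validity: 1 deletion(s)`.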

### Apply an import

```bash
armor contract apply --asset <asset-uuid> -f orders.yaml
```

Transactional per asset. Any domain failure rolls back the whole contract. By default, configs present in the live asset but missing from the YAML are **not deleted** (warn-only). To prune, pass `--prune`:

```bash
armor contract apply --asset <asset-uuid> -f orders.yaml --prune
```

`--prune` deletes configs that are absent from the YAML. Treat it the way you would treat `rm -rf`: review the `plan` output first.

### Diff two local YAML files

```bash
armor contract diff baseline.yaml pr.yaml
```

Pure client-side, no server call. Uses the same identity-matching logic as the server-side plan, so the two summaries match. Useful in CI to compare `contracts/main.yaml` to `contracts/pr.yaml` before calling `plan` or `apply`.

## In-product UI

The same primitives ship in the app.

- **Export.** Every asset detail page has an **Export as YAML** action. Opens a modal with three controls: asset scope (this asset / schema / warehouse / all), feature-domain checkboxes, and the `extended` / `odcs-pure` toggle. Multi-asset scopes download as a zip.
- **Import.** Same page, **Import YAML** action next to Export. Drag-drop or click-to-browse a `.yaml` file. The modal runs server-side validation, then shows a plan preview (`+2 freshness, ~1 validity, -0 drift`). Apply button is disabled until the plan is clean. Prune has an explicit confirmation dialog that lists what will be removed.
- **Bulk import.** Asset list supports checkbox selection; the bulk action bar exposes **Import YAML** which applies one contract to many selected assets (async job, polled in the modal).

## REST API

All endpoints are authenticated with your API key. Single-asset endpoints are synchronous; bulk endpoints run as async jobs.

### Export, single asset (synchronous)

```
GET /api/v1/contracts/{asset_public_id}/export
    ?mode=extended|odcs-pure     (default: extended)
    &include=freshness,validity  (comma-separated, default: all)
    &exclude=alert_rules         (applied after include)
```

Returns `application/x-yaml` with a `Content-Disposition: attachment` header.
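
A sketch of calling this endpoint with `curl`. The `Authorization: Bearer` header is an assumption, since this page only says endpoints are authenticated with your API key; check the API reference for the exact scheme. `export_url` and `export_contract` are hypothetical helpers, and the base URL falls back to the default from the CLI config.

```bash
# Sketch: build the export URL, then download the YAML.
export_url() {
    # $1 = asset public id, $2 = mode, $3 = include list, $4 = exclude list
    local base="${ARMOR_API_URL:-https://api.anomalyarmor.ai}"
    local url="$base/api/v1/contracts/$1/export?mode=${2:-extended}"
    if [ -n "${3:-}" ]; then url="$url&include=$3"; fi
    if [ -n "${4:-}" ]; then url="$url&exclude=$4"; fi
    printf '%s\n' "$url"
}

export_contract() {
    # $1 = asset public id, $2 = output file
    curl --fail --silent --show-error \
        -H "Authorization: Bearer $ARMOR_API_KEY" \
        -o "$2" "$(export_url "$1")"
}
```

Keeping URL construction separate from the request makes the query parameters easy to inspect without a network call.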

### Export, bulk (asynchronous)

```
POST /api/v1/contracts/export-jobs
  { "scope": "warehouse"|"all",
    "scope_name": "analytics",        // required for scope=warehouse
    "mode": "extended",
    "include": ["freshness"],
    "exclude": ["alert_rules"] }
```

Returns `{"job_id": "<uuid>"}`. Poll with:

```
GET /api/v1/contracts/export-jobs/{job_id}
```

Response includes `status`, `progress_percent`, and `asset_count`. When status is `completed`, download:

```
GET /api/v1/contracts/export-jobs/{job_id}/download
```

Returns an `application/zip` archive laid out as `{warehouse}/contract/{table}.yaml`.
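
The submit, poll, and download steps can be chained in a script. This sketch makes several assumptions: Bearer-token auth, a `failed` status name, and JSON responses simple enough for the `sed` extraction (a real JSON parser is more robust). All helper names are hypothetical.

```bash
# Sketch: submit a warehouse export job, poll it, download the zip.
api() {
    # $1 = HTTP method, $2 = endpoint path, remaining args go to curl
    local method="$1" endpoint="$2"; shift 2
    curl --fail --silent --show-error -X "$method" \
        -H "Authorization: Bearer $ARMOR_API_KEY" \
        "$@" "${ARMOR_API_URL:-https://api.anomalyarmor.ai}$endpoint"
}

json_field() {
    # Extract a string field from a one-line JSON object on stdin.
    sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

export_warehouse() {
    # $1 = warehouse name, $2 = output zip
    local job status tries=0
    job="$(api POST /api/v1/contracts/export-jobs \
        -H 'Content-Type: application/json' \
        -d "{\"scope\": \"warehouse\", \"scope_name\": \"$1\", \"mode\": \"extended\"}" \
        | json_field job_id)"
    [ -n "$job" ] || return 1
    while [ "$tries" -lt 120 ]; do   # 120 polls x 5 s, roughly the CLI's 600 s default
        status="$(api GET "/api/v1/contracts/export-jobs/$job" | json_field status)"
        case "$status" in
            completed)
                api GET "/api/v1/contracts/export-jobs/$job/download" -o "$2"
                return ;;
            failed) return 1 ;;      # assumed status name
        esac
        tries=$((tries + 1))
        sleep 5
    done
    return 2
}
```

`armor contract pull --warehouse` wraps the same flow, so the script form is mainly useful where the CLI cannot be installed.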

### Validate (no DB, no asset context)

```
POST /api/v1/contracts/validate
Body: raw ODCS YAML
```

Returns `{ valid: bool, errors: [{line, yaml_path, code, message}], contract_summary: {...} }`.

### Plan, single asset (synchronous)

```
POST /api/v1/contracts/{asset_public_id}/plan
Body: raw ODCS YAML
```

Returns `{ valid, errors, diffs: { <domain>: { added, modified, removed } } }`. Read-only, no mutations.

### Apply, single asset (synchronous, transactional)

```
POST /api/v1/contracts/{asset_public_id}/apply?prune=false
Body: raw ODCS YAML
```

Returns `{ valid, applied: { <domain>: {added, modified, removed} }, unsupported_domains: [], pruned: bool }`. Partial failures roll back the entire contract.

### Apply, bulk (asynchronous)

```
POST /api/v1/contracts/apply-jobs
  { "asset_public_ids": ["...", "..."],
    "yaml": "apiVersion: v3.1.0\n...",
    "prune": false }
```

Returns `{"job_id": "<uuid>"}`. Poll `GET /api/v1/contracts/apply-jobs/{job_id}` for per-asset status.

## Round-trip example

```bash
# Export
armor contract pull --asset <uuid> -o orders.yaml

# Edit the freshness threshold in orders.yaml
vim orders.yaml

# Commit to Git
git add orders.yaml && git commit -m "tighten freshness SLA on orders"

# Preview the change without applying
armor contract plan --asset <uuid> -f orders.yaml

# Apply once the diff looks right
armor contract apply --asset <uuid> -f orders.yaml
```

## Adapter coverage at a glance

Three adapters ship today. The grid shows which kinds of rules each adapter converts automatically, which convert with caveats, and which you rewrite as custom rules in AnomalyArmor.

Every skipped check surfaces a warning, never a silent drop. See the per-adapter guides below for the exact mapping tables.

## Coming from Soda?

[Soda ships ODCS export](https://docs.soda.io/data-testing/git-managed-data-contracts) as of 2024. Run `soda export --odcs` on your existing Soda project and the resulting YAML imports directly into AnomalyArmor:

```bash
# Redirect into a file; a directory is not a valid redirect target
mkdir -p contracts
soda export --odcs > contracts/orders.yaml
armor contract apply --asset <uuid> -f contracts/orders.yaml
```

If you want to preview the migration before signing up, paste your Soda YAML into [anomalyarmor.ai/migrate](https://www.anomalyarmor.ai/migrate) for a structural summary without an account.

## Coming from dbt or Great Expectations?

Those tools do not emit ODCS directly. We ship adapters that translate their config into ODCS YAML, which then feeds the same `contract apply` pipeline:

```bash
# Redirect into files; a directory is not a valid redirect target
mkdir -p contracts

# dbt: translate schema.yml tests and contracts into ODCS
armor migrate-from dbt ./my-dbt-project/ > contracts/dbt.yaml

# Great Expectations: translate expectation suites into ODCS
armor migrate-from great-expectations ./great_expectations/ > contracts/great-expectations.yaml
```

See the per-adapter guides for mapping rules, unsupported expectations, and CLI options:

- [Migrate from dbt](/guides/migrate-from-dbt)
- [Migrate from Great Expectations](/guides/migrate-from-great-expectations)

## Limits and current gaps

- **No S3 artifact storage for huge bulk exports.** Jobs today inline the zip bytes in the status response, which is fine for hundreds of tables. Company-wide exports with thousands of tables should use multiple `--warehouse` jobs.
- **Single-asset apply only in `contract apply`.** Multi-asset apply is available through the bulk-apply REST endpoint and the UI, not yet through the CLI.
- **Some domains are `unsupported` on import.** The `apply` response lists, in `unsupported_domains`, any domains whose import is not yet implemented. Run `plan` first to see whether those domains would have changed before committing.
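
For the large-account case in the first bullet, a loop over warehouses keeps each export job small. A sketch; the warehouse names are placeholders and `pull_warehouses` is a hypothetical helper:

```bash
# Sketch: one `--warehouse` export per name instead of a single `--all` job.
pull_warehouses() {
    local outdir="$1"; shift
    local wh failed=0
    mkdir -p "$outdir"
    for wh in "$@"; do
        if armor contract pull --warehouse "$wh" -o "$outdir/$wh.zip"; then
            echo "exported $wh"
        else
            echo "export failed for $wh" >&2
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
}
```

For example, `pull_warehouses backups analytics finance marketing` writes one zip per warehouse under `backups/` and returns non-zero if any export fails.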

## Common Questions

### What is ODCS and why does AnomalyArmor use it?

ODCS is the [Open Data Contract Standard](https://bitol-io.github.io/open-data-contract-standard/latest/), a Linux Foundation project that defines a vendor-neutral YAML format for data contracts. Using it means your AnomalyArmor config is portable: Soda exports ODCS natively, and adapters translate dbt and Great Expectations config into it, so you keep optionality if you ever switch tools.

### What is the difference between `extended` and `odcs-pure` export modes?

`extended` (default) includes the full `customProperties.anomalyarmor` block so the contract round-trips with every AA-specific feature intact (drift, alert routing, blackouts). `odcs-pure` strips those blocks and emits only the ODCS-native subset (schema, quality, freshness, team). Use `odcs-pure` when handing the contract to a Soda or Great Expectations user, `extended` everywhere else.
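
To make the difference concrete, here is a minimal sketch of the stripping step. It assumes the contract is already parsed into a Python dict and that `customProperties` is a list of `property` / `value` entries, as in the worked example in the Great Expectations migration guide. The function name `strip_anomalyarmor` is hypothetical, and the real `odcs-pure` export may remove more than this sketch does.

```python
import copy


def strip_anomalyarmor(contract: dict) -> dict:
    """Return a copy of the contract without the anomalyarmor custom property."""
    pure = copy.deepcopy(contract)  # never mutate the caller's contract
    kept = [
        prop for prop in pure.get("customProperties", [])
        if prop.get("property") != "anomalyarmor"
    ]
    if kept:
        pure["customProperties"] = kept
    else:
        # Drop the key entirely rather than leave an empty list behind.
        pure.pop("customProperties", None)
    return pure
```

Because the function works on a deep copy, the `extended` document stays intact and can still be applied to AnomalyArmor afterwards.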

### How do I preview contract changes before applying them?

Run `armor contract plan --asset <uuid> -f contract.yaml`. It returns a per-domain diff showing additions, modifications, and deletions without mutating anything. Apply with `armor contract apply` once the diff looks right. See [Plan an import](#plan-an-import-dry-run-diff).

### Does `contract apply` delete configs that aren't in my YAML?

No, not by default. Configs present in the live asset but missing from the YAML are warn-only, so you won't accidentally wipe rules by applying a partial contract. Pass `--prune` to explicitly delete them, and treat that flag the way you would `rm -rf`: always review the `plan` output first.

### Can I put AnomalyArmor contracts in Git?

Yes, that's one of the main reasons to use them. The YAML diffs cleanly in PRs so reviewers see which validity rule or freshness SLA changed, not a screenshot of a settings page. Export with `armor contract pull`, commit the file, and use `armor contract diff` in CI to compare branches.

### Can I validate a contract YAML without applying it?

Yes. `armor contract validate -f contract.yaml` runs the document against the official ODCS v3.1.0 JSON Schema and exits non-zero on parse or schema errors. Errors print with file path, YAML path, and line number so editors and CI can surface them inline. No DB connection required.

---

<!-- Source: https://docs.anomalyarmor.ai/guides/migrate-from-dbt -->

# Migrate from dbt

> Translate a dbt project's schema.yml into ODCS YAML and apply it to AnomalyArmor

If you already describe your warehouse in dbt, you already have half of an AnomalyArmor contract. This guide walks through converting a dbt project's `schema.yml` files into ODCS YAML and applying it with `armor contract apply`.

Install the CLI first (see [Data Contracts](/guides/contracts#cli) for the auth step):

```bash
pip install anomalyarmor-cli
armor auth login --key aa_live_your_key_here
```

## One-line pipeline

```bash
armor migrate-from dbt ./my-dbt-project/ | armor contract apply --asset <asset-uuid> -f -
```

The adapter writes ODCS YAML to stdout and `contract apply` reads from stdin when passed `-f -`. If the adapter maps zero models, it refuses to emit (non-zero exit) so the pipeline fails cleanly instead of clobbering your live config with an empty contract.

Prefer a file-based, three-step flow for anything non-trivial:

```bash
# 1. Translate
armor migrate-from dbt ./my-dbt-project/ -o contracts/from-dbt.yaml

# 2. Preview
armor contract plan --asset <asset-uuid> -f contracts/from-dbt.yaml

# 3. Apply once the diff looks right
armor contract apply --asset <asset-uuid> -f contracts/from-dbt.yaml
```

## What the adapter reads

Running `armor migrate-from dbt <project>` walks the project root recursively and parses every `schema.yml` / `schema.yaml` file. It skips:

- `dbt_packages/` and `dbt_modules/` (vendored third-party packages, not your models).
- `node_modules/` (front-end tooling that sometimes coexists with dbt repos).
- Any hidden directory (a path segment starting with `.`).

Your project root is the directory holding `dbt_project.yml`. The adapter does **not** call `dbt compile` or `dbt parse`, so you do not need a live target or a working `profiles.yml`. Static file scan only.
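
The skip rules above can be approximated with a short directory walk. This is an illustrative sketch of the behavior described here, not the adapter's actual implementation; the function name `find_schema_files` and the constants are hypothetical.

```python
import os
from pathlib import Path

# Directories the guide says the adapter skips.
SKIPPED_DIRS = {"dbt_packages", "dbt_modules", "node_modules"}
SCHEMA_FILENAMES = {"schema.yml", "schema.yaml"}


def find_schema_files(project_root: str) -> list[Path]:
    """Return every schema.yml / schema.yaml under project_root, sorted."""
    found = []
    for dirpath, dirnames, filenames in os.walk(project_root):
        # Prune in place so os.walk never descends into skipped or hidden dirs.
        dirnames[:] = [
            d for d in dirnames
            if d not in SKIPPED_DIRS and not d.startswith(".")
        ]
        for filename in filenames:
            if filename in SCHEMA_FILENAMES:
                found.append(Path(dirpath) / filename)
    return sorted(found)
```

Running something like this against your repository before migrating is a quick way to check which files a static scan would pick up.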

## Mapping table

| dbt input | ODCS output | Notes |
|---|---|---|
| `models[].name` | `schema[].name` | Table identity. |
| `models[].columns[].name` | `schema[].properties[].name` | Column identity. |
| `models[].columns[].data_type` | `schema[].properties[].physicalType` | Only emitted if present in the dbt file. |
| `models[].columns[].description` | `schema[].properties[].description` | Native ODCS. |
| `models[].description` | `schema[].description` | Native ODCS. |
| `columns[].tests[].not_null` | validity rule, `rule_type=not_null` | AA extension. |
| `columns[].tests[].unique` | validity rule, `rule_type=unique` | AA extension. |
| `columns[].tests[].accepted_values.values` | validity rule, `rule_type=allowed_values` | Needs a non-empty `values` list. |
| `models[].config.freshness` | freshness schedule (hours) | Best-effort: `error_after` preferred, falls back to `warn_after`. `minute` / `hour` / `day` periods convert to hours. |

Everything the adapter understands lands in the emitted YAML. Everything else lands in the **warnings** section of the summary line so you know what to review.
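
The freshness row is the least obvious mapping, so here is a minimal sketch of the conversion it describes. It assumes the freshness block uses dbt's `count` / `period` shape under `error_after` and `warn_after`; the function name `freshness_to_hours` is hypothetical, and the adapter's real edge-case handling may differ.

```python
from typing import Optional

# Minutes per supported dbt freshness period.
MINUTES_PER_PERIOD = {"minute": 1, "hour": 60, "day": 1440}


def freshness_to_hours(freshness: dict) -> Optional[float]:
    """Convert a dbt freshness block to hours, or None if it cannot be normalized."""
    # error_after is preferred; warn_after is only a fallback.
    threshold = freshness.get("error_after") or freshness.get("warn_after")
    if not threshold:
        return None
    minutes = MINUTES_PER_PERIOD.get(threshold.get("period"))
    count = threshold.get("count")
    if minutes is None or count is None:
        return None  # unsupported period or missing count: surface as a warning
    return count * minutes / 60
```

For example, a block with `warn_after` of 12 hours and `error_after` of 2 days resolves to 48 hours, because `error_after` takes precedence.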

## What gets skipped (and why)

| Input shape | Reason | Resolution |
|---|---|---|
| `dbt_utils.*` tests (e.g. `dbt_utils.accepted_range`) | No 1:1 map to an AA validity rule | Re-author as a custom SQL check in AnomalyArmor, or add to the request queue for adapter coverage. |
| `dbt_expectations.*` tests | Too many variants to map safely | Same as above. |
| `relationships` tests | Reference integrity is not modeled by AA validity rules | Track as a custom SQL check. |
| Any test with a namespaced name (contains `.`) | Conservative skip | Adapter errs on the side of warning over silent mistranslation. |
| `accepted_values` with empty or missing `values` | Would emit an empty rule | Fix the test in dbt, re-run. |
| `freshness` with a non-`minute`/`hour`/`day` period | Cannot normalize to hours | Normalize in dbt or edit the emitted YAML directly. |
| dbt `meta`, `tags`, `owner` | No round-trip design yet | Not a data-quality signal; track with AA tags or leave in dbt. |

The CLI summary prints a one-liner that groups these warnings so you see totals at a glance:

```
dbt -> ODCS (./my-dbt-project/): mapped 42, warnings 7 (dbt_utils.accepted_range: 3, ...).
```

If `mapped_count` is zero, the adapter exits non-zero and writes nothing to stdout.
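
The mapped-versus-warning split can be sketched as follows. This is an approximation of the rules in the two tables above, not the adapter's code; `classify_tests` and `BUILTIN_RULES` are hypothetical names, and the input is assumed to be a column's `tests` list as dbt writes it (bare strings or one-key dicts).

```python
from collections import Counter

# Built-in dbt tests covered by the mapping table, and the rule_type each becomes.
BUILTIN_RULES = {
    "not_null": "not_null",
    "unique": "unique",
    "accepted_values": "allowed_values",
}


def classify_tests(tests: list) -> tuple[list, Counter]:
    """Split a column's dbt tests into mapped rules and counted warnings."""
    mapped, warnings = [], Counter()
    for test in tests:
        # dbt allows a bare string ("unique") or a one-key dict with arguments.
        if isinstance(test, str):
            name, args = test, {}
        else:
            name, args = next(iter(test.items()))
            args = args or {}
        if "." in name or name not in BUILTIN_RULES:
            warnings[name] += 1  # namespaced or unknown: warn, never guess
        elif name == "accepted_values" and not args.get("values"):
            warnings[name] += 1  # an empty values list would emit an empty rule
        else:
            mapped.append({"rule_type": BUILTIN_RULES[name], **args})
    return mapped, warnings
```

Counting warnings by test name is what makes a grouped summary such as `dbt_utils.accepted_range: 3` possible.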

## Flags

| Flag | Purpose |
|---|---|
| `--output`, `-o` | Write YAML to a file instead of stdout. Pass `-` or omit for stdout. |
| `--name` | Override the contract `name` field. Defaults to the project directory name. |

## Validate before applying

```bash
armor migrate-from dbt ./my-dbt-project/ -o from-dbt.yaml
armor contract validate -f from-dbt.yaml
```

`contract validate` runs the document against the ODCS v3.1.0 JSON Schema. Parse or schema errors print with file path, YAML path, and line number so editors and CI can surface them inline. No DB connection required beyond auth.

## One dbt project, many assets

Today `contract apply` is asset-scoped (one asset per call). If your dbt project covers multiple warehouse tables:

1. Translate once: `armor migrate-from dbt ./proj -o full.yaml`.
2. Split by table or re-export per-asset from AnomalyArmor to get the target asset scoping, then apply each file against its asset.

Multi-asset bulk apply runs through the REST jobs endpoint today (see [the bulk-apply section of the contracts guide](/guides/contracts#apply-bulk-asynchronous)). A CLI shortcut is on the roadmap.

## Common Questions

### How do I migrate from dbt tests to AnomalyArmor?

Run `armor migrate-from dbt ./my-dbt-project/` and pipe the ODCS YAML into `armor contract apply --asset <uuid> -f -`. The adapter reads every `schema.yml` statically, so you don't need a live warehouse or working `profiles.yml`. See the [one-line pipeline](#one-line-pipeline).

### Which dbt tests does AnomalyArmor support?

Built-in `not_null`, `unique`, and `accepted_values` tests map to AnomalyArmor validity rules. Column `description`, `data_type`, and model `description` land in native ODCS fields. Freshness configs convert into hour-based SLAs. See the full [mapping table](#mapping-table).

### Why does the adapter skip `dbt_utils` and `dbt_expectations` tests?

Those tests have too many variants to map safely into AnomalyArmor's validity rules without risking silent mistranslation, so the adapter warns rather than guesses. Re-author them as custom SQL checks in AnomalyArmor. The CLI summary groups the warnings so you see totals at a glance.

### Do I need dbt installed to run the migration?

No. The adapter is a static file scan: it walks the project directory and parses `schema.yml` files directly. You don't need `dbt compile`, `dbt parse`, a live target, or a working `profiles.yml`.

### Can I preview the migration before applying it?

Yes. Write the translated YAML to a file first with `armor migrate-from dbt ./proj -o from-dbt.yaml`, then run `armor contract validate` and `armor contract plan` against it. Plan shows a per-domain diff (additions, modifications, deletions) without mutating anything. See [Validate before applying](#validate-before-applying).

### How do I migrate a dbt project that covers many warehouse tables?

`contract apply` is asset-scoped today (one call per asset), so translate once with `armor migrate-from dbt ./proj -o full.yaml`, split the output by table, and apply each file against its target asset. For a true bulk flow, use the [REST bulk-apply endpoint](/guides/contracts#apply-bulk-asynchronous). A CLI shortcut is on the roadmap.

## See also

- [Data Contracts (ODCS export/import)](/guides/contracts)
- [Migrate from Great Expectations](/guides/migrate-from-great-expectations)
- [anomalyarmor.ai/migrate](https://www.anomalyarmor.ai/migrate) for Soda users (ODCS export is native in Soda)

---

<!-- Source: https://docs.anomalyarmor.ai/guides/migrate-from-great-expectations -->

# Migrate from Great Expectations

> Convert a Great Expectations project into an AnomalyArmor data contract

The `armor migrate-from great-expectations` command converts a Great Expectations project on disk into an [ODCS](/guides/contracts) YAML document. Pipe the output straight into `armor contract apply` to port your expectation suites into AnomalyArmor validity rules.

```bash
armor migrate-from great-expectations ./great_expectations/ | armor contract apply --asset <asset-uuid> -f -
```

The adapter is file-based: it reads `great_expectations.yml` and the expectation-suite JSON files directly. You do not need the `great_expectations` pip package installed to run the migration.

## What gets mapped

### Native ODCS fields

These expectations land on the standard ODCS property surface so they round-trip through any ODCS tool, not just AnomalyArmor.

| Great Expectations                       | ODCS field                              |
| ---------------------------------------- | --------------------------------------- |
| `expect_column_values_to_not_be_null`    | `schema[].properties[].required = true` |
| `expect_column_values_to_be_unique`      | `schema[].properties[].unique = true`   |

### AnomalyArmor validity rules

These expectations map to validity rules under `customProperties.anomalyarmor.validity`.

| Great Expectations                               | AnomalyArmor `rule_type` | Notes                                                     |
| ------------------------------------------------ | ------------------------ | --------------------------------------------------------- |
| `expect_column_values_to_be_in_set`              | `allowed_values`         | `value_set` → `values`                                    |
| `expect_column_values_to_match_regex`            | `regex_match`            | `regex` → `pattern`                                       |
| `expect_column_values_to_be_between`             | `range_bounds`           | `min_value` / `max_value` → `min` / `max`                 |
| `expect_column_value_lengths_to_be_between`      | `length_bounds`          | `min_value` / `max_value` → `min_length` / `max_length`   |
| `expect_column_values_to_match_strftime_format`  | `format`                 | `%Y-%m-%d` → `iso_date`; `%Y-%m-%dT%H:%M:%S` → `iso_datetime` |

### Unmapped expectations

Anything not in the table above surfaces as a warning in the command summary rather than being silently dropped. The common ones that land in warnings today:

- `expect_column_pair_*` and `expect_multicolumn_*` (cross-column logic)
- `expect_column_kl_divergence_*` and other statistical distribution checks
- Custom expectations (`dbt_expectations.*`, user-defined subclasses)
- `expect_column_values_to_be_of_type` (type-system translation is its own problem)
- Table-scoped expectations like `expect_table_row_count_to_be_between` (row-count monitoring needs its own follow-up wiring)

For expectations without a native AnomalyArmor equivalent, author a custom SQL check after the migration rather than trying to force them through the validity-rule shape.
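
As a rough illustration of the validity-rule table, the sketch below translates a single expectation dict by renaming its kwargs. It is not the adapter's implementation: `to_validity_rule` and `VALIDITY_MAP` are hypothetical names, and the `format` row is left out because the guide does not spell out its `rule_config` keys.

```python
from typing import Optional

# expectation_type -> (rule_type, {GE kwarg: rule_config key}), per the table above.
VALIDITY_MAP = {
    "expect_column_values_to_be_in_set": ("allowed_values", {"value_set": "values"}),
    "expect_column_values_to_match_regex": ("regex_match", {"regex": "pattern"}),
    "expect_column_values_to_be_between": (
        "range_bounds", {"min_value": "min", "max_value": "max"}),
    "expect_column_value_lengths_to_be_between": (
        "length_bounds", {"min_value": "min_length", "max_value": "max_length"}),
}


def to_validity_rule(expectation: dict, table_path: str) -> Optional[dict]:
    """Translate one GE expectation dict, or return None if it is unmapped."""
    entry = VALIDITY_MAP.get(expectation["expectation_type"])
    if entry is None:
        return None  # caller should record a warning rather than drop it silently
    rule_type, renames = entry
    kwargs = expectation.get("kwargs", {})
    # Copy only the kwargs the table names, under their AnomalyArmor keys.
    rule_config = {new: kwargs[old] for old, new in renames.items() if old in kwargs}
    return {
        "table_path": table_path,
        "column_name": kwargs["column"],
        "rule_type": rule_type,
        "rule_config": rule_config,
    }
```

Returning `None` for unmapped types keeps the "warn, never silently drop" behavior explicit at the call site.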

## Project shapes supported

The adapter auto-detects whether you have a legacy (pre-v1) project or a v1 project using Fluent Datasources.

- **Legacy projects.** Expectation suites live as JSON under `{project_root}/expectations/*.json`. The adapter walks that tree directly.
- **v1 Fluent Datasources projects.** The adapter reads the `fluent_datasources` block from `great_expectations.yml` to pull per-column type hints into `schema[].properties[].physicalType`. Suites still need to live under `expectations/`. If your project uses a non-default suite store (a database, S3, etc.), export the suites to disk before migrating:

  ```python
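  import json
  import os
  import great_expectations as gx

  context = gx.get_context()
  os.makedirs("expectations", exist_ok=True)
  for name in context.list_expectation_suite_names():
      suite = context.get_expectation_suite(name)
      with open(f"expectations/{name}.json", "w") as f:
          # to_json_dict() returns a dict, so serialize it before writing
          json.dump(suite.to_json_dict(), f, indent=2)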
  ```

The adapter emits a warning with a pointer to this snippet when it detects a project shape it cannot walk via files alone.

## Suite-to-table resolution

Great Expectations organizes suites by suite name, not by table. AnomalyArmor scopes every validity rule to a `table_path`. The adapter resolves the mapping in this order:

1. `meta.anomalyarmor.table_path` inside the suite JSON (the explicit escape hatch).
2. The suite's `expectation_suite_name` field (GE's canonical source of truth for the suite's logical name).
3. The suite filename without the `.json` extension (final fallback).

If your suite names already match your target table names, you don't need to touch anything. When they don't, set the escape hatch inside the suite:

```json
{
  "expectation_suite_name": "orders_v2_suite",
  "meta": {
    "anomalyarmor": {
      "table_path": "warehouse.public.orders"
    }
  },
  "expectations": [...]
}
```
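
The resolution order can be expressed as a small function. This is a sketch of the three documented steps, not the adapter's source; the name `resolve_table_path` is hypothetical, and the suite is assumed to be already loaded from JSON into a dict.

```python
from pathlib import Path


def resolve_table_path(suite: dict, suite_file: str) -> str:
    """Pick the table_path for a suite using the three-step order above."""
    # 1. Explicit escape hatch inside the suite's meta block.
    meta = suite.get("meta") or {}
    explicit = (meta.get("anomalyarmor") or {}).get("table_path")
    if explicit:
        return explicit
    # 2. The suite's own logical name.
    if suite.get("expectation_suite_name"):
        return suite["expectation_suite_name"]
    # 3. Final fallback: the filename without its .json extension.
    return Path(suite_file).stem
```

With the suite shown above, the function returns `warehouse.public.orders` even though the suite is named `orders_v2_suite`, because the escape hatch is checked first.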

## Worked example

Given this legacy project layout:

```
great_expectations/
├── great_expectations.yml
└── expectations/
    └── orders.json
```

...where `orders.json` contains:

```json
{
  "expectation_suite_name": "public.orders",
  "expectations": [
    {
      "expectation_type": "expect_column_values_to_not_be_null",
      "kwargs": {"column": "order_id"}
    },
    {
      "expectation_type": "expect_column_values_to_be_unique",
      "kwargs": {"column": "order_id"}
    },
    {
      "expectation_type": "expect_column_values_to_be_in_set",
      "kwargs": {"column": "status", "value_set": ["pending", "shipped", "delivered"]}
    },
    {
      "expectation_type": "expect_column_values_to_match_regex",
      "kwargs": {"column": "email", "regex": "^[^@]+@[^@]+\\.[^@]+$"}
    }
  ]
}
```

Running:

```bash
armor migrate-from great-expectations ./great_expectations/ --output contract.yaml
```

...produces:

```yaml
apiVersion: v3.1.0
kind: DataContract
id: <stable UUID5 derived from project path>
name: great_expectations
version: 1.0.0
status: active
schema:
  - name: public.orders
    properties:
      - name: order_id
        required: true
        unique: true
      - name: status
      - name: email
customProperties:
  - property: anomalyarmor
    value:
      validity:
        - table_path: public.orders
          column_name: status
          rule_type: allowed_values
          rule_config:
            values: [pending, shipped, delivered]
          severity: error
          treat_null_as_valid: true
          check_interval: daily
        - table_path: public.orders
          column_name: email
          rule_type: regex_match
          rule_config:
            pattern: "^[^@]+@[^@]+\\.[^@]+$"
          severity: error
          treat_null_as_valid: true
          check_interval: daily
```

Then validate and apply:

```bash
armor contract validate -f contract.yaml
armor contract plan --asset <asset-uuid> -f contract.yaml
armor contract apply --asset <asset-uuid> -f contract.yaml
```

Re-running the migration is safe: the command produces a deterministic `id` seeded from the project path, so re-applying the same contract will dedupe rather than create duplicates.
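
To see why a path-seeded UUID5 is stable across runs, consider this sketch. The namespace (`uuid.NAMESPACE_URL`) and the choice to resolve the path first are assumptions made for illustration; the guide does not document exactly how the adapter seeds its `id`, and `contract_id` is a hypothetical name.

```python
import uuid
from pathlib import Path


def contract_id(project_path: str) -> str:
    """Derive a stable UUID5 from the resolved project path."""
    # Resolving first means "./great_expectations" and "great_expectations"
    # hash to the same value when run from the same directory.
    canonical = Path(project_path).resolve().as_posix()
    # NAMESPACE_URL is an arbitrary fixed namespace chosen for this sketch.
    return str(uuid.uuid5(uuid.NAMESPACE_URL, canonical))
```

UUID5 is a hash of the namespace and name, so the same path yields the same `id` on every run, while a different project path yields a different one.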

## Common Questions

### How do I migrate from Great Expectations to AnomalyArmor?

Run `armor migrate-from great-expectations ./great_expectations/` to translate your expectation suites into ODCS YAML, then pipe it into `armor contract apply --asset <uuid> -f -`. The adapter is file-based and reads suite JSON directly, so you don't need the `great_expectations` pip package installed on the migration host.

### Which Great Expectations expectations are supported?

`not_null` and `unique` map to native ODCS `required` / `unique`. Five more map to AnomalyArmor validity rules: `be_in_set`, `match_regex`, `be_between`, `value_lengths_between`, and `match_strftime_format`. See the full [mapping tables](#what-gets-mapped) for exact field translations.

### What happens to expectations the adapter can't map?

They surface as warnings in the CLI summary, never silent drops. Common unsupported cases are `expect_column_pair_*`, `expect_multicolumn_*`, KL-divergence / distribution checks, custom subclasses, and type checks. Re-author those as custom SQL checks after the migration rather than forcing them through the validity-rule shape.

### Does the adapter work with Great Expectations v1 Fluent Datasources?

Yes. The adapter auto-detects v1 projects and reads the `fluent_datasources` block from `great_expectations.yml` to pull per-column type hints into `physicalType`. Suites still need to live on disk under `expectations/`, so if you use a non-default store (database, S3), export them to disk first. See [Project shapes supported](#project-shapes-supported).

### How does AnomalyArmor know which table an expectation suite belongs to?

The adapter resolves the suite-to-table mapping in this order: `meta.anomalyarmor.table_path` inside the suite JSON, then the suite's `expectation_suite_name` field, then the filename. Use the `meta.anomalyarmor.table_path` escape hatch when suite names don't match your target table paths. See [Suite-to-table resolution](#suite-to-table-resolution).

### Is it safe to re-run the migration?

Yes. The command produces a deterministic contract `id` seeded from the project path, so re-applying the same contract deduplicates instead of creating duplicates. Combine with `armor contract plan` first to preview the diff before `apply`.

## See also

- [Data Contracts (ODCS export/import)](/guides/contracts)
- [Migrate from dbt](/guides/migrate-from-dbt)

---

<!-- Source: https://docs.anomalyarmor.ai/integrations/first-integration -->

# First API Integration

> Step-by-step guide to your first programmatic integration with AnomalyArmor

This guide walks you through building your first programmatic integration with AnomalyArmor. By the end, you'll have a working data quality check that can run in your pipeline.

## Prerequisites

- An AnomalyArmor account with at least one connected data source
- Python 3.9+ installed
- An API key (create in **Settings > API Keys**)

## Step 1: Install the SDK

```bash
pip install anomalyarmor-cli
```

## Step 2: Configure Authentication

Store your API key securely. You have two options:

**Option A: Environment variable (recommended for CI/CD)**

```bash
export ARMOR_API_KEY="aa_live_your_key_here"
```

**Option B: Config file (for local development)**

```bash
armor auth login
# Follow the prompts to enter your API key
```

## Step 3: Verify Connection

Test that everything is working:

```python
from anomalyarmor import Client

client = Client()

# List your data assets
assets = client.assets.list(limit=5)
for asset in assets:
    print(f"{asset.qualified_name} ({asset.asset_type})")
```

You should see a list of your connected tables and views.

## Step 4: Check Data Freshness

The most common integration pattern is checking data freshness before running a pipeline. Here's a complete example:

```python
from anomalyarmor import Client
from anomalyarmor.exceptions import StalenessError

def run_pipeline():
    client = Client()

    # Check that source data is fresh before processing
    try:
        client.freshness.require_fresh("snowflake.prod.warehouse.orders")
        print("Data is fresh, proceeding with pipeline...")
        # Your pipeline logic here
    except StalenessError as e:
        print(f"Pipeline aborted: data is {e.hours_since_update:.1f}h stale")
        raise

if __name__ == "__main__":
    run_pipeline()
```

## Step 5: Add Data Quality Checks

Expand your integration with validity and referential integrity checks:

```python
from anomalyarmor import Client

client = Client()
asset_id = "your-asset-uuid"  # Get from assets.list()

# Check freshness
summary = client.freshness.summary()
print(f"Overall freshness: {summary.fresh_percentage}%")

# Check validity rules
validity_summary = client.validity.summary(asset_id)
if validity_summary.failing > 0:
    print(f"Warning: {validity_summary.failing} validity rules failing")
    # List failing rules
    rules = client.validity.list(asset_id)
    for rule in rules:
        result = client.validity.check(asset_id, rule.uuid)
        if result.status == "fail":
            print(f"  - {rule.name}: {result.invalid_count} invalid records")

# Check referential integrity
ref_summary = client.referential.summary(asset_id)
if ref_summary.failing_checks > 0:
    print(f"Warning: {ref_summary.failing_checks} referential checks failing")
```

## Step 6: Create a Quality Gate

Combine all checks into a single quality gate function:

```python
from anomalyarmor import Client
from anomalyarmor.exceptions import StalenessError

def quality_gate(asset_id: str, critical_tables: list[str]) -> bool:
    """
    Run all data quality checks before pipeline execution.
    Returns True if all checks pass, False otherwise.

    Note: list[str] type hint requires Python 3.9+. For earlier versions,
    use typing.List[str] instead.
    """
    client = Client()
    passed = True

    # 1. Check freshness of critical tables
    for table in critical_tables:
        try:
            client.freshness.require_fresh(table)
            print(f"[PASS] Freshness: {table}")
        except StalenessError as e:
            print(f"[FAIL] Freshness: {table} ({e.hours_since_update:.1f}h stale)")
            passed = False

    # 2. Check validity rules
    validity = client.validity.summary(asset_id)
    if validity.failing == 0:
        print(f"[PASS] Validity: {validity.total_rules} rules")
    else:
        print(f"[FAIL] Validity: {validity.failing}/{validity.total_rules} rules failing")
        passed = False

    # 3. Check referential integrity
    ref = client.referential.summary(asset_id)
    if ref.failing_checks == 0:
        print(f"[PASS] Referential: {ref.total_checks} checks")
    else:
        print(f"[FAIL] Referential: {ref.failing_checks}/{ref.total_checks} failing")
        passed = False

    return passed

# Usage
if __name__ == "__main__":
    asset = "your-asset-uuid"
    tables = [
        "snowflake.prod.warehouse.orders",
        "snowflake.prod.warehouse.customers",
    ]

    if quality_gate(asset, tables):
        print("\nAll quality checks passed! Running pipeline...")
        # run_pipeline()
    else:
        print("\nQuality checks failed. Pipeline aborted.")
        raise SystemExit(1)
```

## Step 7: CLI Integration

For shell scripts and CI/CD, use the CLI directly:

```bash
#!/bin/bash
set -e

# Check freshness (exits 1 if stale)
armor freshness check snowflake.prod.warehouse.orders

# If we get here, data is fresh
echo "Data quality checks passed!"
dbt run
```
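
If your orchestrator runs Python rather than bash, the same exit-code contract can be checked with `subprocess`. This is a generic sketch: the helper name `gate` is hypothetical, and it relies only on the behavior stated above, that the check command exits non-zero when data is stale.

```python
import subprocess


def gate(command: list[str]) -> bool:
    """Run a check command and report whether it exited with status 0."""
    # check=False so a non-zero exit is returned to us instead of raising.
    completed = subprocess.run(command, check=False)
    return completed.returncode == 0


# Example call, assuming the armor CLI is installed and authenticated:
#   if not gate(["armor", "freshness", "check", "snowflake.prod.warehouse.orders"]):
#       raise SystemExit(1)
```

Passing the command as a list avoids shell quoting issues with table names.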

## Common Patterns

### Pattern 1: Pre-ETL Validation

Run checks before ETL starts:

```python
# At the start of your ETL
client.freshness.require_fresh("source_table")
client.validity.check(asset_id, "not_null_rule")
# Then run ETL...
```

### Pattern 2: Post-ETL Validation

Verify output quality after ETL:

```python
# After ETL completes
result = client.validity.check(asset_id, "uniqueness_rule")
if result.status == "fail":
    rollback_etl()
    alert_team(f"ETL produced {result.invalid_count} duplicate records")
```

### Pattern 3: Continuous Monitoring

Schedule regular quality checks:

```python
# In a scheduled job (e.g., Airflow, cron)
summary = client.metrics.summary(asset_id)
if summary.failing > 0:
    alert_team(f"{summary.failing} metrics failing")
```

## Common Questions

### Where do I get an API key to start?

Sign in to [app.anomalyarmor.ai](https://app.anomalyarmor.ai), open **Settings → API Keys**, and click **Create Key**. New keys are shown only once, so copy the key to a password manager or secrets manager immediately. Set the scope to `read-only` for pipeline gating or `read-write` if your integration needs to create metrics or acknowledge alerts.

### Should I use the Python SDK or raw REST calls for my first integration?

Use the Python SDK if you're in Python or Airflow: it handles pagination, retries on 429, and error typing for you. Use raw REST (`curl` or `requests`) if you're in a different language, want zero dependencies, or are prototyping a webhook handler. Both paths are first-class.

### How do I test an integration without affecting production monitoring?

Create a separate AnomalyArmor workspace (or use a sandbox project) with a non-production database connection. Integrations there can create/delete monitors freely without touching prod. When ready, swap the API key to production and re-run against real endpoints.

### What's the lightest possible first integration?

A freshness check at the start of your pipeline: one API call, one conditional exit. `GET /api/v1/freshness/check?asset=my_table` returns whether the table meets its SLA. If it does not, abort the pipeline. This pattern takes about five lines of code and catches the most common data issue (stale upstream data).

## Next Steps

- **[dbt Integration](/integrations/dbt)**: Add quality gates to dbt workflows
- **[GitHub Actions](/integrations/github-actions)**: Run checks in CI/CD pipelines
- **[Airflow Integration](/integrations/airflow)**: Integrate with Apache Airflow
- **[API Reference](/api/overview)**: Full API documentation

---

<!-- Source: https://docs.anomalyarmor.ai/integrations/airflow -->

# Airflow Integration

> Use AnomalyArmor in Apache Airflow DAGs

Integrate AnomalyArmor with Apache Airflow to gate pipelines on data quality and freshness.

## Installation

Install the SDK in your Airflow environment:

```bash
pip install anomalyarmor-cli
```

## Configuration

Set `ARMOR_API_KEY` as an Airflow variable or environment variable:

```bash
# Airflow Variable
airflow variables set ARMOR_API_KEY "aa_live_xxx"

# Or environment variable
export ARMOR_API_KEY="aa_live_xxx"
```

## Pre-flight Freshness Check

The most common pattern: fail the task if upstream data is stale.

```python
from airflow import DAG
from airflow.operators.python import PythonOperator
from datetime import datetime

from anomalyarmor import Client
from anomalyarmor.exceptions import StalenessError

def check_upstream_freshness():
    """Gate: Fail task if upstream data is stale."""
    client = Client()

    # This raises StalenessError if data is stale
    client.freshness.require_fresh("snowflake.prod.warehouse.orders")
    print("Upstream data is fresh, proceeding...")

def run_transformation():
    """Main transformation logic."""
    print("Running dbt models...")
    # subprocess.run(["dbt", "run", "--select", "orders_mart"])

with DAG(
    "orders_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@hourly",
    catchup=False,
) as dag:

    freshness_gate = PythonOperator(
        task_id="check_freshness",
        python_callable=check_upstream_freshness,
    )

    transform = PythonOperator(
        task_id="run_transformation",
        python_callable=run_transformation,
    )

    freshness_gate >> transform
```

> 💡 Use `read-only` scope for pre-flight checks. You only need `read-write` if triggering refreshes.

## Check Multiple Sources

Verify all upstream dependencies before running:

```python
from anomalyarmor import Client
from anomalyarmor.exceptions import StalenessError

def check_all_upstream():
    """Check all upstream sources are fresh."""
    client = Client()

    upstream_tables = [
        "snowflake.prod.warehouse.orders",
        "snowflake.prod.warehouse.customers",
        "snowflake.prod.warehouse.products",
    ]

    stale_tables = []
    for table in upstream_tables:
        try:
            client.freshness.require_fresh(table)
        except StalenessError:
            stale_tables.append(table)

    if stale_tables:
        raise Exception(f"Stale upstream data: {stale_tables}")

    print("All upstream sources are fresh!")
```

## Trigger Freshness Check

Trigger a freshness check and wait for completion:

```python
from anomalyarmor import Client

def refresh_and_check():
    """Trigger freshness check, then verify."""
    client = Client()

    # Trigger refresh and wait
    result = client.freshness.refresh(
        "snowflake.prod.warehouse.orders",
        wait=True,
    )

    print(f"Refresh job {result.job_id}: {result.status}")

    # Now check freshness
    client.freshness.require_fresh("snowflake.prod.warehouse.orders")
```

## Sensor Pattern

Wait for data to become fresh:

```python
from airflow.sensors.python import PythonSensor
from anomalyarmor import Client
from anomalyarmor.exceptions import StalenessError

def is_data_fresh():
    """Return True when data is fresh."""
    client = Client()
    try:
        client.freshness.require_fresh("snowflake.prod.warehouse.orders")
        return True
    except StalenessError:
        return False

freshness_sensor = PythonSensor(
    task_id="wait_for_fresh_data",
    python_callable=is_data_fresh,
    poke_interval=300,  # Check every 5 minutes
    timeout=3600,       # Timeout after 1 hour
    mode="poke",
)
```

## Check Lineage

Verify all upstream dependencies using lineage:

```python
from anomalyarmor import Client

def check_upstream_via_lineage():
    """Check all upstream sources via lineage API."""
    client = Client()

    # Get upstream dependencies
    lineage = client.lineage.get("snowflake.prod.mart.orders_summary")

    print(f"Checking {len(lineage.upstream)} upstream sources...")

    for upstream in lineage.upstream:
        client.freshness.require_fresh(upstream.qualified_name)
        print(f"  {upstream.qualified_name}")

    print("All upstream sources are fresh!")
```

## Error Handling

Handle different error types appropriately:

```python
from anomalyarmor import Client
from anomalyarmor.exceptions import (
    StalenessError,
    AuthenticationError,
    RateLimitError,
    ArmorError,
)
import time

def check_with_retry():
    """Check freshness with error handling."""
    client = Client()

    max_retries = 3
    for attempt in range(max_retries):
        try:
            client.freshness.require_fresh("snowflake.prod.warehouse.orders")
            return  # Success

        except StalenessError as e:
            # Data is stale - this is expected, fail the task
            raise Exception(f"Data is stale: last updated {e.last_updated}")

        except RateLimitError as e:
            # Rate limited - retry after waiting
            if attempt < max_retries - 1:
                print(f"Rate limited, waiting {e.retry_after}s...")
                time.sleep(e.retry_after)
            else:
                raise

        except AuthenticationError:
            # Auth error - likely config issue
            raise Exception("Invalid ARMOR_API_KEY")

        except ArmorError as e:
            # Other API error - retry
            if attempt < max_retries - 1:
                print(f"API error, retrying: {e}")
                time.sleep(10)
            else:
                raise
```
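
The fixed 10-second sleep above can be swapped for exponential backoff with jitter, so that many tasks retrying at once do not hit the API in lockstep. This is a sketch using only the standard library; `retry_with_backoff` and its parameters are hypothetical names, not part of the SDK:

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    retryable: tuple[type[BaseException], ...],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on the given exception types with growing delays."""
    for attempt in range(max_retries):
        try:
            return func()
        except retryable:
            if attempt == max_retries - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles on each attempt (1s, 2s, 4s, ...) plus up to 10% jitter.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("max_retries must be at least 1")
```

Only transient errors belong in `retryable`. A stale table is a real result rather than a transient failure, so it is probably best handled outside the retry helper, as in the example above.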

## Complete DAG Example

Full example with freshness gate, transformation, and post-run schema check:

```python
from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.operators.bash import BashOperator
from datetime import datetime, timedelta

from anomalyarmor import Client
from anomalyarmor.exceptions import StalenessError

default_args = {
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
}

def check_upstream():
    client = Client()
    client.freshness.require_fresh("snowflake.prod.warehouse.orders")
    client.freshness.require_fresh("snowflake.prod.warehouse.customers")

def trigger_schema_check():
    client = Client()
    result = client.schema.refresh(
        "snowflake.prod.mart.orders_summary",
        wait=True,
    )
    print(f"Schema check: {result.status}")

with DAG(
    "orders_mart_pipeline",
    default_args=default_args,
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    tags=["data-quality", "orders"],
) as dag:

    check_freshness = PythonOperator(
        task_id="check_upstream_freshness",
        python_callable=check_upstream,
    )

    run_dbt = BashOperator(
        task_id="run_dbt_models",
        bash_command="cd /dbt && dbt run --select orders_mart",
    )

    verify_schema = PythonOperator(
        task_id="verify_schema",
        python_callable=trigger_schema_check,
    )

    check_freshness >> run_dbt >> verify_schema
```

## Common Questions

### How do I gate an Airflow DAG on data freshness using AnomalyArmor?

Use a `PythonSensor` or `PythonOperator` at the start of your DAG that calls `client.freshness.check(table="...")` from the AnomalyArmor Python SDK. If freshness is violated, raise an exception to short-circuit the DAG. See the "Gating DAGs" section above for a full example.

### Does the AnomalyArmor Python SDK work with Airflow 2 and MWAA?

Yes. The SDK is a pure-Python package (`pip install anomalyarmor`) and works in any Airflow environment that supports pip-installed dependencies: Airflow 2.x, Astronomer, MWAA, and Airflow on Kubernetes. MWAA requires adding the package to `requirements.txt`.

### Where should I store my AnomalyArmor API key in Airflow?

In an Airflow Connection (Extra JSON field) or a Variable - never in DAG code. For MWAA, use AWS Secrets Manager with Airflow's Secrets Backend so keys rotate without redeploying DAGs.

### Can AnomalyArmor trigger Airflow DAG runs when data quality fails?

Yes, via webhooks. Configure an AnomalyArmor alert with a webhook destination pointing at Airflow's REST API (e.g., `POST /api/v1/dags/{dag_id}/dagRuns`). Most teams use this pattern to trigger remediation DAGs when a critical freshness or schema-drift alert fires.

## Next Steps

- **[Python SDK](/sdk/overview)**: SDK reference and patterns
- **[Freshness API](/api/freshness)**: Freshness endpoint details
- **[Lineage API](/api/lineage)**: Explore data dependencies
- **[Alerts](/alerts/overview)**: Set up freshness alerts

---

<!-- Source: https://docs.anomalyarmor.ai/integrations/dbt -->

# dbt Integration

> Add data quality gates to your dbt workflows

Integrate AnomalyArmor with dbt to add data quality gates before and after your transformations. This guide covers common patterns for ensuring data quality throughout your dbt workflow.

## Prerequisites

- AnomalyArmor account with connected data source
- dbt project configured
- Python 3.9+ (for SDK) or shell access (for CLI)
- API key with `read-only` scope (or `read-write` for triggering checks)

## Installation

```bash
pip install anomalyarmor-cli dbt-core
```

Configure your API key:

```bash
export ARMOR_API_KEY="aa_live_your_key_here"
```

## Pattern 1: Pre-run Quality Gate

Check data quality before running dbt:

### Using CLI (Shell Script)

```bash
#!/bin/bash
# pre_dbt_check.sh
set -e

echo "Running pre-dbt quality checks..."

# Check freshness of source tables
armor freshness check snowflake.raw.stripe.payments || {
    echo "Source data is stale. Aborting dbt run."
    exit 1
}

armor freshness check snowflake.raw.crm.customers || {
    echo "Source data is stale. Aborting dbt run."
    exit 1
}

echo "All source tables are fresh. Starting dbt..."
dbt run
```

### Using Python

```python
# pre_dbt_check.py
from anomalyarmor import Client
from anomalyarmor.exceptions import StalenessError
import subprocess
import sys

def check_sources_and_run_dbt():
    client = Client()

    sources = [
        "snowflake.raw.stripe.payments",
        "snowflake.raw.crm.customers",
        "snowflake.raw.shopify.orders",
    ]

    print("Checking source freshness...")
    for source in sources:
        try:
            status = client.freshness.require_fresh(source)
            print(f"  [OK] {source} ({status.hours_since_update:.1f}h old)")
        except StalenessError as e:
            print(f"  [STALE] {source} ({e.hours_since_update:.1f}h old)")
            print("\nAborting: Source data is stale.")
            sys.exit(1)

    print("\nAll sources fresh. Running dbt...")
    result = subprocess.run(["dbt", "run"], check=True)
    return result.returncode

if __name__ == "__main__":
    check_sources_and_run_dbt()
```

## Pattern 2: Post-run Validation

Validate output quality after dbt completes:

```python
# post_dbt_validate.py
from anomalyarmor import Client

def validate_dbt_outputs():
    client = Client()
    asset_id = "your-asset-uuid"

    print("Validating dbt outputs...")

    # Check validity rules on transformed tables
    validity = client.validity.summary(asset_id)
    if validity.failing > 0:
        print(f"[WARN] {validity.failing} validity rules failing")

        # Get details of failing rules
        rules = client.validity.list(asset_id)
        for rule in rules:
            result = client.validity.check(asset_id, rule.uuid)
            if result.status == "fail":
                print(f"  - {rule.name} on {rule.column_name}")
                print(f"    Invalid: {result.invalid_count} ({result.invalid_percent:.2f}%)")

        return False
    else:
        print(f"[OK] All {validity.total_rules} validity rules passing")

    # Check referential integrity
    ref = client.referential.summary(asset_id)
    if ref.failing_checks > 0:
        print(f"[WARN] {ref.failing_checks} referential checks failing")
        return False
    else:
        print(f"[OK] All {ref.total_checks} referential checks passing")

    return True

if __name__ == "__main__":
    if not validate_dbt_outputs():
        print("\nPost-dbt validation failed!")
        raise SystemExit(1)
    print("\nAll validations passed!")
```

## Pattern 3: dbt run-operation Hook

Create a dbt macro that calls AnomalyArmor:

> ℹ️ The `shell()` function is not a built-in dbt Jinja function. This pattern requires a custom macro or package that provides shell execution capabilities. Consider using the wrapper script approach (Pattern 4) for a simpler, more portable solution.

```sql
-- macros/armor_quality_gate.sql
{% macro armor_freshness_check(table_name) %}
    {% set result = run_query("SELECT 1") %}
    {{ log("Checking freshness for " ~ table_name, info=True) }}

    {# Call CLI from dbt - requires custom shell() macro #}
    {% set check_result = shell("armor freshness check " ~ table_name ~ " 2>&1 || echo STALE") %}

    {% if "STALE" in check_result %}
        {{ exceptions.raise_compiler_error("Data quality check failed: " ~ table_name ~ " is stale") }}
    {% endif %}
{% endmacro %}
```

Then in your model:

```sql
-- models/marts/orders_mart.sql
{{ config(pre_hook=armor_freshness_check('snowflake.raw.stripe.payments')) }}

SELECT
    o.order_id,
    o.customer_id,
    c.customer_name,
    o.total_amount
FROM {{ ref('stg_orders') }} o
LEFT JOIN {{ ref('stg_customers') }} c ON o.customer_id = c.customer_id
```

## Pattern 4: Full dbt Wrapper Script

A wrapper that handles pre-checks, dbt run, and post-validation:

```python
#!/usr/bin/env python
# dbt_with_quality_gates.py
"""
Run dbt with AnomalyArmor quality gates.

Usage:
    python dbt_with_quality_gates.py run
    python dbt_with_quality_gates.py run --select marts.*
"""

import argparse
import subprocess
import sys
from anomalyarmor import Client
from anomalyarmor.exceptions import StalenessError

# Configuration
ASSET_ID = "your-asset-uuid"
SOURCE_TABLES = [
    "snowflake.raw.stripe.payments",
    "snowflake.raw.crm.customers",
]

def pre_checks(client: Client) -> bool:
    """Run pre-dbt quality checks."""
    print("\n=== Pre-dbt Quality Checks ===\n")

    # Check source freshness
    all_fresh = True
    for table in SOURCE_TABLES:
        try:
            status = client.freshness.require_fresh(table)
            print(f"[FRESH] {table} ({status.hours_since_update:.1f}h)")
        except StalenessError as e:
            print(f"[STALE] {table} ({e.hours_since_update:.1f}h)")
            all_fresh = False

    return all_fresh

def run_dbt(args: list[str]) -> int:
    """Execute dbt with provided arguments."""
    print("\n=== Running dbt ===\n")
    cmd = ["dbt"] + args
    result = subprocess.run(cmd)
    return result.returncode

def post_checks(client: Client) -> bool:
    """Run post-dbt quality checks."""
    print("\n=== Post-dbt Quality Checks ===\n")

    passed = True

    # Validity checks
    validity = client.validity.summary(ASSET_ID)
    if validity.failing == 0:
        print(f"[PASS] Validity: {validity.total_rules} rules")
    else:
        print(f"[FAIL] Validity: {validity.failing}/{validity.total_rules} failing")
        passed = False

    # Referential integrity
    ref = client.referential.summary(ASSET_ID)
    if ref.failing_checks == 0:
        print(f"[PASS] Referential: {ref.total_checks} checks")
    else:
        print(f"[FAIL] Referential: {ref.failing_checks}/{ref.total_checks} failing")
        passed = False

    # Metrics anomalies
    metrics = client.metrics.summary(ASSET_ID)
    if metrics.failing == 0:
        print(f"[PASS] Metrics: {metrics.active_metrics} monitored")
    else:
        print(f"[WARN] Metrics: {metrics.failing} anomalies detected")
        # Don't fail on metric anomalies, just warn

    return passed

def main():
    parser = argparse.ArgumentParser(description="Run dbt with quality gates")
    parser.add_argument("dbt_command", help="dbt command (run, build, test)")
    parser.add_argument("--skip-pre", action="store_true", help="Skip pre-checks")
    parser.add_argument("--skip-post", action="store_true", help="Skip post-checks")
    # parse_known_args() leaves unrecognized flags such as --select in dbt_extra
    # so they can be forwarded to dbt instead of causing an argparse error.
    args, dbt_extra = parser.parse_known_args()

    client = Client()
    dbt_args = [args.dbt_command] + dbt_extra

    # Pre-checks
    if not args.skip_pre:
        if not pre_checks(client):
            print("\nPre-checks failed. Use --skip-pre to bypass.")
            sys.exit(1)

    # Run dbt
    dbt_exit_code = run_dbt(dbt_args)
    if dbt_exit_code != 0:
        print(f"\ndbt exited with code {dbt_exit_code}")
        sys.exit(dbt_exit_code)

    # Post-checks
    if not args.skip_post:
        if not post_checks(client):
            print("\nPost-checks failed. Review quality issues.")
            sys.exit(1)

    print("\n=== All checks passed! ===")

if __name__ == "__main__":
    main()
```

Run it:

```bash
# Full run with all checks
python dbt_with_quality_gates.py run

# Run specific models
python dbt_with_quality_gates.py run --select marts.orders_mart

# Skip pre-checks (for development)
python dbt_with_quality_gates.py run --skip-pre
```

## Pattern 5: dbt Cloud Webhook Integration

For dbt Cloud, use webhooks to trigger AnomalyArmor checks:

```python
# webhook_handler.py (Flask example)
from flask import Flask, request, jsonify
from anomalyarmor import Client

app = Flask(__name__)

@app.route("/dbt-webhook", methods=["POST"])
def handle_dbt_webhook():
    payload = request.json
    event_type = payload.get("eventType")
    run_status = payload.get("data", {}).get("runStatus")

    if event_type == "run.completed" and run_status == "Success":
        # dbt run completed successfully, run post-checks
        client = Client()
        asset_id = "your-asset-uuid"

        validity = client.validity.summary(asset_id)
        ref = client.referential.summary(asset_id)

        if validity.failing > 0 or ref.failing_checks > 0:
            # send_alert is a placeholder: implement it to notify Slack, PagerDuty, etc.
            send_alert("dbt run completed but quality issues found")

    return jsonify({"status": "ok"})
```
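
A public webhook endpoint should confirm that a request really came from dbt Cloud before acting on it. As far as I understand dbt Cloud's webhook documentation, each request carries an HMAC-SHA256 hex digest of the raw body, keyed with the webhook's secret, in the `Authorization` header; treat that as an assumption and confirm it against the current dbt Cloud docs. The helper name `is_valid_signature` is hypothetical:

```python
import hashlib
import hmac
from typing import Optional


def is_valid_signature(secret: str, raw_body: bytes, auth_header: Optional[str]) -> bool:
    """Compare the received header with an HMAC-SHA256 hex digest of the body."""
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    received = (auth_header or "").encode("utf-8")
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature were correct.
    return hmac.compare_digest(expected.encode("utf-8"), received)
```

In the Flask handler above, this could be called with `request.get_data()` and `request.headers.get("Authorization")`, returning a 401 response when it yields `False`. Keep the secret in an environment variable or secrets manager rather than in code.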

## Pattern 6: Upload Lineage from dbt

Upload your dbt `manifest.json` to populate data lineage in AnomalyArmor. This lets you visualize the full DAG, run impact analysis, and check upstream freshness before transformations.

```bash
# Generate the manifest
dbt parse

# Upload to AnomalyArmor
curl -X POST \
  "https://api.anomalyarmor.ai/api/v1/assets/$ASSET_ID/lineage/upload" \
  -H "Authorization: Bearer $ARMOR_API_KEY" \
  -F "file=@target/manifest.json"
```

Or add it as a post-run step in your dbt wrapper:

```python
from anomalyarmor import Client

client = Client()

def upload_lineage(asset_id: str, manifest_path: str = "target/manifest.json"):
    """Upload dbt manifest to sync lineage after a dbt run."""
    with open(manifest_path, "rb") as f:
        result = client.lineage.upload(asset_id=asset_id, file=f)
    print(f"Lineage synced: {result.sync_stats['nodes_created']} nodes, "
          f"{result.sync_stats['edges_created']} edges")
```

For dbt Cloud users, sync lineage directly without file uploads:

```python
result = client.lineage.sync_dbt_cloud(
    asset_id=ASSET_ID,
    account_id="12345",
    api_token="dbtc_your_token_here",
    job_id="67890",
)
```

> 💡 See the full [Lineage Upload Guide](/guides/lineage-upload) for detailed setup, CI/CD integration, and manual lineage options.
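
To preview roughly what a manifest contributes to lineage before uploading it, you can list its dependency edges locally. This sketch relies on the `nodes` and `depends_on.nodes` keys of dbt's `manifest.json`; the function names and the sample fragment are illustrative, and AnomalyArmor's own parsing may differ:

```python
import json
from pathlib import Path


def manifest_edges(manifest: dict) -> list[tuple[str, str]]:
    """Return sorted (upstream, downstream) pairs from a parsed dbt manifest."""
    edges = []
    for unique_id, node in manifest.get("nodes", {}).items():
        for parent in node.get("depends_on", {}).get("nodes", []):
            edges.append((parent, unique_id))
    return sorted(edges)


def load_manifest(path: str = "target/manifest.json") -> dict:
    """Read and parse the manifest produced by `dbt parse`."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Tiny hand-written fragment, for illustration only.
sample = {
    "nodes": {
        "model.shop.stg_orders": {"depends_on": {"nodes": ["source.shop.raw.orders"]}},
        "model.shop.orders_mart": {
            "depends_on": {"nodes": ["model.shop.stg_orders", "model.shop.stg_customers"]}
        },
    }
}

for upstream, downstream in manifest_edges(sample):
    print(f"{upstream} -> {downstream}")
```

Against a real project, `manifest_edges(load_manifest())` gives a quick sanity check that the models you expect are present before the upload.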

## Best Practices

### 1. Scope Your Checks

Don't check everything. Focus on critical paths:

```python
# Check only critical source tables
CRITICAL_SOURCES = [
    "production.stripe.payments",  # Revenue-critical
    "production.core.users",       # Identity-critical
]
```

### 2. Set Appropriate Thresholds

Configure checks with realistic thresholds:

```python
# Allow older data in non-critical tables
client.freshness.require_fresh(
    "analytics.events",
    max_age_hours=48,  # More lenient for analytics
)
```
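
One way to keep thresholds consistent is a single per-table mapping that every check reads from. The sketch below shows what an age threshold means in plain Python; the table names reuse the examples above, while the hour values, `DEFAULT_MAX_AGE_HOURS`, and `is_within_threshold` are assumptions for illustration:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical thresholds: tighter for revenue-critical data, looser for analytics.
MAX_AGE_HOURS = {
    "production.stripe.payments": 6,
    "analytics.events": 48,
}
DEFAULT_MAX_AGE_HOURS = 24


def is_within_threshold(
    table: str,
    last_updated: datetime,
    now: Optional[datetime] = None,
) -> bool:
    """Return True if the table was updated within its allowed age."""
    now = now or datetime.now(timezone.utc)
    limit = timedelta(hours=MAX_AGE_HOURS.get(table, DEFAULT_MAX_AGE_HOURS))
    return now - last_updated <= limit
```

With the SDK, the same mapping could feed the `max_age_hours` argument shown above, for example `max_age_hours=MAX_AGE_HOURS.get(table, DEFAULT_MAX_AGE_HOURS)`.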

### 3. Fail Fast, But Not Always

```python
# Critical checks should fail the pipeline
if not critical_checks_pass():
    sys.exit(1)

# Non-critical checks can warn without failing
if not advisory_checks_pass():
    log_warning("Advisory checks failed, continuing anyway")
```
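
The split between blocking and advisory checks can be made explicit with a small runner. This is a sketch, not an SDK feature: `run_gate` is a hypothetical helper, and each check is any zero-argument callable that returns `True` on success:

```python
from typing import Callable

Check = Callable[[], bool]


def run_gate(
    critical: dict[str, Check],
    advisory: dict[str, Check],
) -> tuple[bool, list[str]]:
    """Run all checks; return (gate_passed, names of advisory checks that failed)."""
    warnings = [name for name, check in advisory.items() if not check()]
    failed = [name for name, check in critical.items() if not check()]
    for name in warnings:
        print(f"[WARN] {name}")  # reported, but does not affect the result
    for name in failed:
        print(f"[FAIL] {name}")
    return (not failed, warnings)


passed, warnings = run_gate(
    critical={"payments fresh": lambda: True},
    advisory={"events fresh": lambda: False},
)
print(passed, warnings)
```

A pipeline entry point would then exit non-zero only when the first element of the result is `False`.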

### 4. Cache API Calls

For large dbt projects, minimize API calls:

```python
# Get summary once, not per-model
validity_summary = client.validity.summary(asset_id)
# Then use summary.failing, summary.passing, etc.
```

## Troubleshooting

### "Asset not found" errors

Ensure your table names match AnomalyArmor's qualified names exactly:

```bash
# List your assets to see exact names
armor assets list --source snowflake
```
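
Once you have the exact names, a fuzzy match can point out likely typos in a configured table name. This sketch uses the standard library's `difflib`; `suggest_names` is a hypothetical helper, and the `known_assets` list stands in for names copied from the command above:

```python
import difflib


def suggest_names(requested: str, known: list[str], limit: int = 3) -> list[str]:
    """Return the known names most similar to the requested one, best first."""
    return difflib.get_close_matches(requested, known, n=limit, cutoff=0.6)


known_assets = [
    "snowflake.raw.stripe.payments",
    "snowflake.raw.crm.customers",
]

# A missing trailing "s" is enough to cause an "Asset not found" error.
print(suggest_names("snowflake.raw.stripe.payment", known_assets))
```

Matching here is case-sensitive, so normalizing both sides with `.lower()` first may be useful if your warehouse reports identifiers in a different case.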

### Slow checks

For faster feedback, use summary endpoints instead of individual checks:

```python
# Fast: single API call
summary = client.validity.summary(asset_id)
print(f"Failing: {summary.failing}")

# Slow: N API calls
for rule in client.validity.list(asset_id):
    result = client.validity.check(asset_id, rule.uuid)  # Avoid in loops
```

## Common Questions

### Do I replace my dbt tests with AnomalyArmor, or run both?

Run both - they do different things. dbt tests validate model-level contracts at build time (is this column unique? are these values in this set?). AnomalyArmor monitors continuously after dbt runs, catching freshness violations, schema drift, and statistical anomalies that point-in-time tests miss. See [migrating from dbt tests](/guides/migrate-from-dbt) for the patterns.

### Does AnomalyArmor hook into `dbt run` or does it monitor tables after the fact?

Both patterns work. For post-run monitoring, there is nothing to configure - AnomalyArmor watches your warehouse independently. For pre-run gating (fail fast if source data isn't fresh enough), add a Python/shell step before `dbt run` that calls the AnomalyArmor freshness API. See "Pattern 1: Pre-run Quality Gate" above.

### Can AnomalyArmor import my existing dbt tests instead of rewriting them?

Yes. Use `armor migrate-from dbt path/to/dbt/project` (available via the AnomalyArmor CLI) to generate ODCS contract files from your dbt schema.yml. Tests like `not_null`, `unique`, `accepted_values`, and `relationships` map cleanly; others surface as warnings. See [migrating from dbt](/guides/migrate-from-dbt).

### Does the integration work with dbt Cloud as well as dbt Core?

Yes. dbt Cloud emits `manifest.json` on every run; configure AnomalyArmor to consume it via the lineage-upload flow, and model-to-table relationships populate automatically. See [lineage upload](/guides/lineage-upload).

## Next Steps

- **[GitHub Actions](/integrations/github-actions)**: Run checks in CI/CD
- **[Airflow Integration](/integrations/airflow)**: Orchestrate with Airflow
- **[Validity API](/api/validity)**: Full validity API reference
- **[Metrics API](/api/metrics)**: Track data metrics

---

<!-- Source: https://docs.anomalyarmor.ai/integrations/github -->

# GitHub Integration

> Connect GitHub repositories for code ownership, change tracking, and incident context

Connect your GitHub repositories to AnomalyArmor to monitor code ownership, track changes to data pipelines, and get incident context when alerts fire. AnomalyArmor analyzes your repos for CODEOWNERS files, git blame history, and table references so you can route alerts to the right people.

## What AnomalyArmor Does with Your Repos

Once connected and monitored, AnomalyArmor scans your repositories for:

- **CODEOWNERS files** to map directories and files to team owners
- **Git blame data** to identify who last modified SQL models, dbt files, and pipeline code
- **Table references** in SQL files to link code to monitored data assets
- **Change correlation** to show recent code changes when data quality issues arise

This lets AnomalyArmor automatically suggest alert owners, show who to contact when a table breaks, and provide code-level context for schema changes and freshness incidents.
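
To make the ownership mapping concrete, here is a simplified sketch of how a CODEOWNERS lookup works in general: rules are read top to bottom and the last matching pattern wins. It is not AnomalyArmor's implementation, and the matching is deliberately reduced (root-anchored directory patterns and `fnmatch` globs rather than GitHub's full gitignore-style rules); all function names are illustrative:

```python
import fnmatch


def parse_codeowners(text: str) -> list[tuple[str, list[str]]]:
    """Parse CODEOWNERS text into (pattern, owners) rules, skipping comments."""
    rules = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        pattern, *owners = line.split()
        rules.append((pattern, owners))
    return rules


def matches(pattern: str, path: str) -> bool:
    """Simplified matching: directory prefixes and fnmatch-style globs only."""
    if pattern.endswith("/"):
        # Directory pattern: owns everything beneath it (anchored at repo root here).
        return path.startswith(pattern.lstrip("/"))
    if "/" not in pattern:
        # Bare glob such as *.sql: compare against the file name only.
        return fnmatch.fnmatch(path.rsplit("/", 1)[-1], pattern)
    return fnmatch.fnmatch(path, pattern.lstrip("/"))


def owners_for(path: str, rules: list[tuple[str, list[str]]]) -> list[str]:
    """Return the owners from the last rule that matches the path."""
    owners: list[str] = []
    for pattern, rule_owners in rules:
        if matches(pattern, path):
            owners = rule_owners  # later rules override earlier ones
    return owners


rules = parse_codeowners(
    "# Default owners for SQL\n"
    "*.sql @data-eng\n"
    "/dbt/models/marts/ @analytics @data-eng\n"
)
print(owners_for("dbt/models/marts/orders.sql", rules))
```

With rules like these, a file under `dbt/models/marts/` resolves to the more specific later rule, which is the kind of mapping that lets an alert on a table be routed to the team owning the model that builds it.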

## GitHub App vs OAuth: Which to Choose

AnomalyArmor supports two connection methods. For most teams, the GitHub App is the better choice.

| | GitHub App | OAuth App |
|---|---|---|
| **Setup** | One-click install from GitHub Marketplace | Personal OAuth flow |
| **Permissions** | Scoped to selected repos only | Access to all repos you can see |
| **Token lifecycle** | Auto-refreshing, no expiry | Permanent until revoked |
| **Webhooks** | Automatic push event webhooks | Manual per-repo webhook setup |
| **Best for** | Teams, production use | Quick personal testing |
| **Identity** | Org-level installation | Individual user account |

> 💡 **Use the GitHub App** if you're setting this up for a team. It provides granular repo-level permissions and automatic webhook delivery for real-time sync.

## Setup: GitHub App (Recommended)

## Setup: OAuth App

> ℹ️ OAuth tokens are tied to your personal GitHub account. If you leave the organization, the connection breaks. For team setups, use the GitHub App instead.

## Monitoring Repos

The Monitor toggle controls whether AnomalyArmor actively analyzes a repository:

- **Toggle ON**: Creates a repository record and triggers initial analysis (CODEOWNERS parsing, git blame, table reference scanning). The Status column shows progress (Pending, Running, Synced, or Failed).
- **Toggle OFF**: Pauses monitoring. All existing analysis data is preserved. You can re-enable at any time.

### Linking Assets to Repos

For the richest analysis, link your monitored repos to data assets (databases). Click the link icon next to a monitored repo's status to open the asset linking dialog. This lets AnomalyArmor correlate code changes in the repo with specific tables in your database.

## Webhooks

### GitHub App

Webhooks are configured automatically when you install the app. AnomalyArmor receives push events in real time and re-analyzes monitored repos when relevant files change (CODEOWNERS, SQL files, dbt models).

### OAuth App

OAuth connections do not include automatic webhooks. AnomalyArmor relies on periodic re-sync to pick up changes. You can manually trigger a re-sync from **Settings > Git Integration** by clicking **Re-sync**.

## Common Questions

### Should I use the GitHub App or OAuth to connect?

Use the GitHub App for team or organization installs: it delivers webhooks automatically so AnomalyArmor re-syncs on push, gives you fine-grained per-repo access control, and uses a refreshing token that never expires. Use OAuth only for personal-account trials or temporary single-user connections, or when your organization blocks third-party GitHub Apps. You can switch later without losing monitored repo data.

### Does AnomalyArmor read my source code?

AnomalyArmor analyzes CODEOWNERS, commit history, and references to your monitored tables (typically in SQL files and dbt models). It does not read or store full source code. The repo connection is for ownership and change context, not for code execution.

### Does the GitHub integration support private repositories?

Yes. Both GitHub App and OAuth connection methods support private repos. For GitHub App, grant access to the specific repos during installation. For OAuth, your authorized scopes must include private repo read access.

### Why does my integration show 'No repositories found' after connecting?

For GitHub App: confirm you granted access to at least one repo during install (check the installed GitHub Apps list in your org settings). For OAuth: private repos in orgs with SAML SSO require explicit SSO authorization on the token.

### How do I disconnect or change which repositories AnomalyArmor can see?

For GitHub App, open your GitHub organization settings, find AnomalyArmor under Installed GitHub Apps, and edit the repo access list. Then click **Re-sync** in AnomalyArmor to refresh. For OAuth, click **Disconnect** in the Git Integration settings and reconnect. Previously monitored repositories and their analysis data are preserved across disconnect/reconnect.

## Troubleshooting

### "No repositories found" after connecting

- **GitHub App**: Check that you granted access to at least one repository during installation. Go to your GitHub organization settings, find the AnomalyArmor app under **Installed GitHub Apps**, and verify the repository access list.
- **OAuth**: Verify that your GitHub account has access to the repositories you expect. Private repos in organizations may require SSO authorization.

### Connection shows "Pending Setup"

The OAuth flow was started but not completed. Click **Complete Setup** to finish the authorization.

### Permissions seem wrong after changing repo access

If you change which repos the GitHub App can access (in GitHub's settings), click **Re-sync** in AnomalyArmor to refresh the repository list.

### Need to start over

Click **Disconnect** in the Git Integration settings, then reconnect using either method. Your previously monitored repositories and their analysis data are preserved.

## Next Steps

- **[GitHub Actions](/integrations/github-actions)**: Run quality checks in CI/CD
- **[dbt Integration](/integrations/dbt)**: Connect dbt project metadata
- **[Alert Destinations](/alerts/overview)**: Route alerts to the right team
- **[Data Sources](/data-sources/overview)**: Connect your data warehouse

---

<!-- Source: https://docs.anomalyarmor.ai/integrations/github-actions -->

# GitHub Actions Integration

> Run data quality checks in your CI/CD pipeline with GitHub Actions

Integrate AnomalyArmor into your GitHub Actions workflows to automatically run data quality checks on pull requests, scheduled jobs, or deployments.

## Prerequisites

- AnomalyArmor account with connected data source
- GitHub repository
- API key stored as a GitHub secret

## Setup

### 1. Create API Key

Generate an API key in **Settings > API Keys** with `read-only` scope (or `read-write` if you need to trigger checks).

### 2. Add Secret to GitHub

Go to your repository's **Settings > Secrets and variables > Actions** and add:

- Name: `ARMOR_API_KEY`
- Value: `aa_live_your_key_here`

## Basic Workflow

Add this workflow file to `.github/workflows/data-quality.yml`:

```yaml
name: Data Quality Checks

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
  schedule:
    # Run daily at 6 AM UTC
    - cron: '0 6 * * *'
  workflow_dispatch:  # Manual trigger

jobs:
  quality-checks:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install AnomalyArmor CLI
        run: pip install anomalyarmor-cli

      - name: Check data freshness
        env:
          ARMOR_API_KEY: ${{ secrets.ARMOR_API_KEY }}
        run: |
          armor freshness check snowflake.prod.warehouse.orders
          armor freshness check snowflake.prod.warehouse.customers

      - name: Run validity checks
        env:
          ARMOR_API_KEY: ${{ secrets.ARMOR_API_KEY }}
        run: |
          armor validity summary ${{ vars.ASSET_ID }}
```

## Workflow Patterns

### Pattern 1: Pre-deployment Gate

Block deployments if data quality checks fail:

```yaml
name: Deploy with Quality Gate

on:
  push:
    branches: [main]

jobs:
  quality-gate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install CLI
        run: pip install anomalyarmor-cli

      - name: Quality Gate
        env:
          ARMOR_API_KEY: ${{ secrets.ARMOR_API_KEY }}
        run: |
          # Check all critical tables
          armor freshness check snowflake.prod.warehouse.orders
          armor freshness check snowflake.prod.warehouse.customers
          armor freshness check snowflake.prod.warehouse.products

          echo "All quality checks passed!"

  deploy:
    needs: quality-gate
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Deploy application
        run: ./deploy.sh
```

### Pattern 2: Scheduled Quality Report

Generate a daily quality report:

```yaml
name: Daily Quality Report

on:
  schedule:
    - cron: '0 8 * * *'  # 8 AM UTC daily
  workflow_dispatch:

jobs:
  quality-report:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install dependencies
        run: pip install anomalyarmor-cli

      - name: Generate Quality Report
        env:
          ARMOR_API_KEY: ${{ secrets.ARMOR_API_KEY }}
        run: |
          echo "# Daily Data Quality Report" >> $GITHUB_STEP_SUMMARY
          echo "" >> $GITHUB_STEP_SUMMARY
          echo "## Freshness Summary" >> $GITHUB_STEP_SUMMARY
          armor freshness summary >> $GITHUB_STEP_SUMMARY
          echo "" >> $GITHUB_STEP_SUMMARY
          echo "## Validity Summary" >> $GITHUB_STEP_SUMMARY
          armor validity summary ${{ vars.ASSET_ID }} >> $GITHUB_STEP_SUMMARY
          echo "" >> $GITHUB_STEP_SUMMARY
          echo "## Referential Summary" >> $GITHUB_STEP_SUMMARY
          armor referential summary ${{ vars.ASSET_ID }} >> $GITHUB_STEP_SUMMARY
```
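
If the report needs formatting beyond raw CLI output, a Python step can write Markdown to the same job summary file. GitHub Actions exposes the file path in the `GITHUB_STEP_SUMMARY` environment variable; the helper names and the table layout below are illustrative:

```python
import os
from pathlib import Path


def append_step_summary(markdown: str) -> bool:
    """Append Markdown to the job summary; return False outside GitHub Actions."""
    summary_path = os.environ.get("GITHUB_STEP_SUMMARY")
    if not summary_path:
        print(markdown)  # local run: fall back to stdout
        return False
    with Path(summary_path).open("a", encoding="utf-8") as f:
        f.write(markdown + "\n")
    return True


def freshness_table(rows: list[tuple[str, float]]) -> str:
    """Render (table name, hours since update) pairs as a Markdown table."""
    lines = ["| Table | Hours since update |", "|---|---|"]
    lines += [f"| `{name}` | {hours:.1f} |" for name, hours in rows]
    return "\n".join(lines)
```

A script could call `append_step_summary(freshness_table(rows))` after collecting freshness results, and the same code still prints readable output when run locally.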

### Pattern 3: PR Comment with Quality Status

Post quality status as a PR comment:

```yaml
name: PR Quality Check

on:
  pull_request:
    branches: [main]

jobs:
  quality-check:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write

    steps:
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install CLI
        run: pip install anomalyarmor-cli

      - name: Run Quality Checks
        id: quality
        env:
          ARMOR_API_KEY: ${{ secrets.ARMOR_API_KEY }}
        run: |
          # Capture results
          FRESHNESS=$(armor freshness summary 2>&1) || true
          VALIDITY=$(armor validity summary ${{ vars.ASSET_ID }} 2>&1) || true

          # Build comment
          echo "COMMENT<<EOF" >> $GITHUB_OUTPUT
          echo "## Data Quality Status" >> $GITHUB_OUTPUT
          echo "" >> $GITHUB_OUTPUT
          echo "### Freshness" >> $GITHUB_OUTPUT
          echo '```' >> $GITHUB_OUTPUT
          echo "$FRESHNESS" >> $GITHUB_OUTPUT
          echo '```' >> $GITHUB_OUTPUT
          echo "" >> $GITHUB_OUTPUT
          echo "### Validity" >> $GITHUB_OUTPUT
          echo '```' >> $GITHUB_OUTPUT
          echo "$VALIDITY" >> $GITHUB_OUTPUT
          echo '```' >> $GITHUB_OUTPUT
          echo "EOF" >> $GITHUB_OUTPUT

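      - name: Post PR Comment
        uses: actions/github-script@v7
        env:
          # Pass the comment through the environment so backticks in the
          # report cannot break out of a JavaScript template literal.
          COMMENT: ${{ steps.quality.outputs.COMMENT }}
        with:
          script: |
            await github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: process.env.COMMENT
            })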
```

### Pattern 4: dbt + Quality Checks

Combine dbt runs with quality validation:

```yaml
name: dbt with Quality Gates

on:
  push:
    branches: [main]
    paths:
      - 'dbt/**'
  workflow_dispatch:

jobs:
  dbt-run:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install dependencies
        run: |
          pip install dbt-snowflake anomalyarmor-cli

      - name: Pre-flight Quality Check
        env:
          ARMOR_API_KEY: ${{ secrets.ARMOR_API_KEY }}
        run: |
          echo "Checking source data freshness..."
          armor freshness check snowflake.raw.stripe.payments
          armor freshness check snowflake.raw.crm.customers

      - name: Run dbt
        working-directory: ./dbt
        env:
          DBT_PROFILES_DIR: ${{ github.workspace }}/dbt
        run: |
          dbt deps
          dbt run

      - name: Post-run Validation
        env:
          ARMOR_API_KEY: ${{ secrets.ARMOR_API_KEY }}
        run: |
          echo "Validating dbt outputs..."
          armor validity summary ${{ vars.ASSET_ID }}
          armor referential summary ${{ vars.ASSET_ID }}
```

### Pattern 5: Python Script for Complex Logic

For complex quality gates, use a Python script:

```yaml
name: Advanced Quality Gate

on:
  push:
    branches: [main]

jobs:
  quality-gate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install dependencies
        run: pip install anomalyarmor-cli

      - name: Run Quality Gate
        env:
          ARMOR_API_KEY: ${{ secrets.ARMOR_API_KEY }}
          ASSET_ID: ${{ vars.ASSET_ID }}
        run: python scripts/quality_gate.py
```

Create `scripts/quality_gate.py`:

```python
#!/usr/bin/env python
"""
Advanced quality gate for GitHub Actions.
Exits with code 0 if all checks pass, 1 otherwise.
"""
import os
import sys
from anomalyarmor import Client
from anomalyarmor.exceptions import StalenessError

def main():
    client = Client()
    asset_id = os.environ["ASSET_ID"]
    passed = True

    print("=" * 50)
    print("DATA QUALITY GATE")
    print("=" * 50)

    # 1. Freshness checks
    print("\n## Freshness Checks")
    critical_tables = [
        "snowflake.prod.warehouse.orders",
        "snowflake.prod.warehouse.customers",
    ]
    for table in critical_tables:
        try:
            status = client.freshness.require_fresh(table)
            print(f"[PASS] {table} ({status.hours_since_update:.1f}h old)")
        except StalenessError as e:
            print(f"[FAIL] {table} ({e.hours_since_update:.1f}h stale)")
            passed = False

    # 2. Validity checks
    print("\n## Validity Checks")
    validity = client.validity.summary(asset_id)
    if validity.failing == 0:
        print(f"[PASS] {validity.total_rules} rules, all passing")
    else:
        print(f"[FAIL] {validity.failing}/{validity.total_rules} rules failing")
        # List failing rules
        for rule in client.validity.list(asset_id):
            result = client.validity.check(asset_id, rule.uuid)
            if result.status == "fail":
                print(f"       - {rule.name}: {result.invalid_count} invalid")
        passed = False

    # 3. Referential integrity
    print("\n## Referential Integrity")
    ref = client.referential.summary(asset_id)
    if ref.failing_checks == 0:
        print(f"[PASS] {ref.total_checks} checks, all passing")
    else:
        print(f"[FAIL] {ref.failing_checks}/{ref.total_checks} failing")
        passed = False

    # 4. Summary
    print("\n" + "=" * 50)
    if passed:
        print("RESULT: ALL CHECKS PASSED")
        print("=" * 50)
        sys.exit(0)
    else:
        print("RESULT: QUALITY GATE FAILED")
        print("=" * 50)
        sys.exit(1)

if __name__ == "__main__":
    main()
```

## Environment Variables

Reference these in your workflows:

| Variable | Required | Description |
|----------|----------|-------------|
| `ARMOR_API_KEY` | Yes | Your AnomalyArmor API key (store as secret) |
| `ASSET_ID` | For some commands | Asset UUID (store as variable) |

### Using GitHub Variables

Store non-sensitive config as repository variables:

1. Go to **Settings > Secrets and variables > Actions**
2. Click **Variables** tab
3. Add variables like `ASSET_ID`, `CRITICAL_TABLES`, etc.

Reference in workflows:

```yaml
env:
  ASSET_ID: ${{ vars.ASSET_ID }}
```
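
A repository variable such as `CRITICAL_TABLES` can replace the hardcoded table list in `scripts/quality_gate.py`. The sketch below assumes the variable holds a comma-separated list of fully qualified table names and is passed to the step as an environment variable (for example `CRITICAL_TABLES: ${{ vars.CRITICAL_TABLES }}`). That format and the `load_critical_tables` helper are illustrative assumptions, not something the CLI or SDK requires.

```python
import os


def load_critical_tables(env=None, default=()):
    """Parse a comma-separated CRITICAL_TABLES value into a list of table names.

    Assumed format (hypothetical):
    "snowflake.prod.warehouse.orders, snowflake.prod.warehouse.customers"
    """
    env = os.environ if env is None else env
    raw = env.get("CRITICAL_TABLES", "")
    # Drop surrounding whitespace and ignore empty entries such as a trailing comma.
    tables = [name.strip() for name in raw.split(",") if name.strip()]
    return tables or list(default)


if __name__ == "__main__":
    for table in load_critical_tables():
        print(table)
```

Passing a `default` keeps the gate usable in local runs where the variable is not set.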

## Best Practices

### 1. Use Secrets for API Keys

Never hardcode API keys:

```yaml
# Good
env:
  ARMOR_API_KEY: ${{ secrets.ARMOR_API_KEY }}

# Bad - Never do this!
env:
  ARMOR_API_KEY: aa_live_xxxxx
```

### 2. Fail Fast

Put quality checks early in workflows:

```yaml
jobs:
  quality-gate:
    # Runs first
    runs-on: ubuntu-latest
    ...

  build:
    needs: quality-gate  # Only runs if quality passes
    ...

  deploy:
    needs: build
    ...
```

### 3. Cache Dependencies

Speed up workflows by caching:

```yaml
- uses: actions/setup-python@v5
  with:
    python-version: '3.11'
    cache: 'pip'

- name: Install dependencies
  run: pip install anomalyarmor-cli
```

### 4. Use Job Summaries

Write results to `$GITHUB_STEP_SUMMARY` for visibility:

```yaml
- name: Quality Summary
  run: |
    echo "## Quality Gate Results" >> "$GITHUB_STEP_SUMMARY"
    armor freshness summary >> "$GITHUB_STEP_SUMMARY"
```
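
When the gate is a Python script, the same summary can be written from the script itself. `GITHUB_STEP_SUMMARY` holds the path of a file that GitHub renders as Markdown, and the sketch below appends a results table to it. The `write_step_summary` helper and the `(name, passed, detail)` row shape are hypothetical, chosen only for illustration.

```python
import os


def write_step_summary(results, path=None):
    """Append a Markdown table of check results to the job summary file.

    results: iterable of (check_name, passed, detail) tuples (illustrative shape).
    Returns the Markdown that was written, or "" when no summary file is
    available (for example, when running outside GitHub Actions).
    """
    path = path or os.environ.get("GITHUB_STEP_SUMMARY")
    if not path:
        return ""
    lines = [
        "## Quality Gate Results",
        "",
        "| Check | Status | Detail |",
        "|-------|--------|--------|",
    ]
    for name, passed, detail in results:
        status = "PASS" if passed else "FAIL"
        lines.append(f"| {name} | {status} | {detail} |")
    markdown = "\n".join(lines) + "\n"
    # Append rather than overwrite so earlier steps' summaries are kept.
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(markdown)
    return markdown
```

Appending (mode `"a"`) mirrors the `>>` redirect in the shell version.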

### 5. Set Timeouts

Prevent hanging jobs:

```yaml
jobs:
  quality-checks:
    runs-on: ubuntu-latest
    timeout-minutes: 10
```

## Troubleshooting

### "Command not found: armor"

Ensure you've installed the CLI before using it:

```yaml
- name: Install CLI
  run: pip install anomalyarmor-cli

- name: Use CLI
  run: armor --version  # Should work now
```

### "Authentication failed"

Check that:
1. The `ARMOR_API_KEY` secret is set correctly
2. The secret name matches your workflow reference
3. The API key hasn't been revoked

```yaml
- name: Debug auth
  env:
    ARMOR_API_KEY: ${{ secrets.ARMOR_API_KEY }}
  run: |
    armor auth status
```

### Workflow not triggering

Check that your `on:` triggers, branch patterns, and any `paths:` filters match the branch and files you are pushing.

## Next Steps

- **[Airflow Integration](/integrations/airflow)**: Orchestrate with Airflow
- **[dbt Integration](/integrations/dbt)**: Add gates to dbt workflows
- **[CLI Reference](/cli/reference)**: Full CLI documentation
- **[API Reference](/api/overview)**: Full API documentation

## Common Questions

### How do I store my AnomalyArmor API key in GitHub Actions?

Add it as a repository or organization secret: **Settings → Secrets and variables → Actions → New repository secret**, name it `ARMOR_API_KEY`, and paste the key value. Reference it in your workflow as `${{ secrets.ARMOR_API_KEY }}`. For org-wide pipelines, use an org-level secret scoped to specific repos.

### Can I fail a pull request on a data quality check failure?

Yes. Add an AnomalyArmor freshness or validity check as a required status check on the branch protection rule for `main`. The workflow exits non-zero when a check fails, GitHub marks the PR as failing, and merge is blocked until the check passes or is overridden.

### Can I run AnomalyArmor checks on a schedule from GitHub Actions?

Yes. Use a `workflow_dispatch` trigger plus a `schedule:` cron entry. But most customers skip this - AnomalyArmor's own scheduled monitoring and native alert destinations handle recurring checks without GitHub Actions runner time. Use GitHub Actions specifically for PR-time gates and deploy-time validation.

### Is there an official AnomalyArmor GitHub Action?

Not yet as a Marketplace action - a `run:` step that installs the CLI (`pip install anomalyarmor-cli`) or calls REST directly via `curl` is the current pattern. A dedicated action is on the roadmap. See the examples on this page for the canonical shape.

---

<!-- Source: https://docs.anomalyarmor.ai/integrations/mcp-server -->

# MCP Server

> Connect AI assistants to AnomalyArmor with Model Context Protocol

The AnomalyArmor MCP server lets AI assistants like Claude Code and Cursor interact with your data observability platform through natural language.

## Connection Methods

| Method | Auth | Best For |
|--------|------|----------|
| **Remote (Recommended)** | Sign in with your AnomalyArmor account | Zero setup, always up to date |
| **Local** | API key | Air-gapped environments, custom setups |

## Remote Server (Recommended)

Connect directly to AnomalyArmor's hosted MCP server. No installation required. You sign in with your existing AnomalyArmor account through OAuth.

### Claude Code

```bash
claude mcp add anomalyarmor --transport http https://mcp.anomalyarmor.ai/mcp
```

Claude Code will open a browser window to sign in the first time you use a tool.

### Cursor

Add to your Cursor MCP configuration:

- **macOS**: `~/Library/Application Support/Cursor/mcp.json`
- **Linux**: `~/.config/Cursor/mcp.json`
- **Windows**: `%APPDATA%\Cursor\mcp.json`

```json
{
  "mcpServers": {
    "anomalyarmor": {
      "url": "https://mcp.anomalyarmor.ai/mcp"
    }
  }
}
```

Restart Cursor after saving. You will be prompted to sign in when you first use a tool.

### Other MCP Clients

Any MCP client that supports the **streamable-http** transport can connect:

```
URL: https://mcp.anomalyarmor.ai/mcp
Transport: streamable-http
Auth: OAuth 2.1 (automatic)
```

The server publishes standard OAuth 2.1 discovery metadata at `https://mcp.anomalyarmor.ai/.well-known/oauth-protected-resource`.
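
To see where a client is sent for sign-in, you can inspect that metadata document. The sketch below derives the well-known URL from the server URL and reads the authorization servers out of a metadata document. The field names `resource` and `authorization_servers` follow the OAuth 2.0 Protected Resource Metadata format (RFC 9728). The sample document is made up for illustration and is not actual output from AnomalyArmor's server.

```python
import json
from urllib.parse import urlsplit, urlunsplit

WELL_KNOWN_PATH = "/.well-known/oauth-protected-resource"


def discovery_url(server_url):
    """Build the protected-resource metadata URL for an MCP server URL."""
    parts = urlsplit(server_url)
    # Keep scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, WELL_KNOWN_PATH, "", ""))


def authorization_servers(metadata_json):
    """Return the authorization server issuers listed in a metadata document."""
    metadata = json.loads(metadata_json)
    return list(metadata.get("authorization_servers", []))


# Illustrative document only; the real response may contain other fields.
SAMPLE = json.dumps({
    "resource": "https://mcp.anomalyarmor.ai/mcp",
    "authorization_servers": ["https://auth.example.com"],
})

if __name__ == "__main__":
    print(discovery_url("https://mcp.anomalyarmor.ai/mcp"))
    print(authorization_servers(SAMPLE))
```

MCP clients normally perform this lookup themselves; the helper is only useful when debugging a sign-in problem by hand.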

## Local Server (API Key)

For environments where you need a local MCP server (air-gapped networks, custom configurations), install and run the server locally with an API key.

### Install

```bash
pip install armor-mcp
```

> ℹ️ If the package is not yet on PyPI, install from source:
```bash
pip install git+https://github.com/anomalyarmor/agents.git#subdirectory=armor-mcp
```

### Configure

Add to your AI tool's MCP settings:

```json
{
  "mcpServers": {
    "anomalyarmor": {
      "command": "armor-mcp",
      "env": {
        "ANOMALYARMOR_API_KEY": "your-api-key"
      }
    }
  }
}
```
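
If your MCP settings file already lists other servers, merging the entry is safer than overwriting the file. The sketch below adds or updates the `anomalyarmor` entry shown above and leaves every other server untouched. The `add_armor_server` helper is hypothetical, and the settings path is whatever your AI tool uses.

```python
import json
from pathlib import Path


def add_armor_server(config_path, api_key):
    """Insert or update the anomalyarmor entry in an MCP settings file."""
    path = Path(config_path)
    # Start from the existing settings so other servers are preserved.
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers["anomalyarmor"] = {
        "command": "armor-mcp",
        "env": {"ANOMALYARMOR_API_KEY": api_key},
    }
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

The key is written to the file in plain text, so keep that file out of version control.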

### Getting Your API Key

1. Go to **Settings > API Keys** in the AnomalyArmor dashboard
2. Click **Create API Key**
3. Name it (e.g., "Claude Code MCP")
4. Copy the key (it won't be shown again)

## Available Tools

Once connected, your AI assistant has access to 40+ tools across these categories:

### Health and Briefings

| Tool | Description |
|------|-------------|
| `health_summary` | Overall health status across all assets |
| `get_todays_briefing` | Daily digest with key insights and alerts |
| `get_coverage_summary` | Monitoring coverage analysis |

### Alerts

| Tool | Description |
|------|-------------|
| `list_alerts` | Query alerts with date, severity, and status filters |
| `get_alert_summary` | Aggregate alert counts and trends |
| `list_incidents` | List incidents with status, severity, and date filters |
| `get_incident` | Get full incident details including activity timeline |
| `acknowledge_alert` | Mark an incident as seen |
| `resolve_alert` | Close an incident with resolution metadata |
| `dismiss_alert` | Dismiss a non-actionable incident |
| `snooze_alert` | Snooze an incident for a set duration |
| `preview_alerts` | Dry-run alert rules against historical data |

### Assets

| Tool | Description |
|------|-------------|
| `list_assets` | List data sources with filters |
| `get_asset` | Get details for a specific asset |
| `explore` | Navigate schemas, tables, and columns within an asset. Supports hierarchical browsing (`explore(asset_id)` for schemas, `explore(asset_id, parent_path="schema")` for tables) and full-text search (`explore(asset_id, search="keyword")`) |
| `create_asset` | Connect a new data source |
| `test_asset_connection` | Test connectivity to a source |
| `trigger_asset_discovery` | Start asset discovery (runs as background job) |
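
Most hosts issue these calls for you, but it can help to see their shape. Under the Model Context Protocol, a tool invocation is a JSON-RPC `tools/call` request carrying the tool name and its arguments. The sketch below builds the three `explore` variants described in the table. The asset ID, the schema name, the search keyword, and the `tool_call` helper are placeholders for illustration.

```python
import itertools
import json

_request_ids = itertools.count(1)


def tool_call(name, **arguments):
    """Build a JSON-RPC 2.0 `tools/call` request for an MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


ASSET_ID = "00000000-0000-0000-0000-000000000000"  # placeholder asset UUID

# Hierarchical browsing: schemas first, then tables inside one schema.
list_schemas = tool_call("explore", asset_id=ASSET_ID)
list_tables = tool_call("explore", asset_id=ASSET_ID, parent_path="analytics")
# Full-text search across the asset.
search = tool_call("explore", asset_id=ASSET_ID, search="discount")

if __name__ == "__main__":
    print(json.dumps(search, indent=2))
```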

### Freshness

| Tool | Description |
|------|-------------|
| `get_freshness_summary` | Freshness overview. Supports optional `asset_id` to filter to a single source |
| `check_freshness` | Check a specific table's freshness |
| `list_stale_assets` | Find currently stale tables |
| `list_freshness_schedules` | View monitoring schedules |
| `create_freshness_schedule` | Set up freshness monitoring. Use `setup_freshness` for bulk creation with `table_paths` and `schema_name` params |
| `delete_freshness_schedule` | Remove a schedule |
| `dry_run_freshness` | Preview what alerts a threshold would produce |

### Schema Monitoring

| Tool | Description |
|------|-------------|
| `get_schema_summary` | Schema drift overview. Supports optional `asset_id` to filter to a single source |
| `list_schema_changes` | Recent schema changes with filters |
| `create_schema_baseline` | Capture current schema as baseline |
| `enable_schema_monitoring` | Start drift detection for one or more assets. Accepts `asset_id` (single) or `asset_ids` (bulk), interval strings (`'5m'`, `'1h'`, `'6h'`, `'1d'`) or `cron_expression` for custom schedules |
| `disable_schema_monitoring` | Stop drift detection |
| `dry_run_schema` | Preview drift detection results |

### Data Quality

| Tool | Description |
|------|-------------|
| `list_metrics` | Quality metrics for an asset |
| `get_metrics_summary` | Metrics health overview |
| `create_metric` | Add a quality metric (row count, null rate, etc.) |
| `delete_metric` | Remove a metric |
| `capture_metric` | Trigger immediate metric capture |
| `dry_run_metric` | Preview metric alerting behavior |

### Validity Rules

| Tool | Description |
|------|-------------|
| `list_validity_rules` | Validity rules for an asset |
| `get_validity_summary` | Passing/failing rule overview |
| `create_validity_rule` | Add a rule (NOT_NULL, UNIQUE, etc.) |
| `delete_validity_rule` | Remove a rule |
| `check_validity_rule` | Run a check immediately |

### Intelligence, Lineage, and Jobs

| Tool | Description |
|------|-------------|
| `ask_question` | Natural language Q&A about your data |
| `generate_intelligence` | Trigger AI analysis for an asset |
| `get_lineage` | Explore upstream/downstream dependencies |
| `job_status` | Track progress of async operations |

### Tags and Classification

| Tool | Description |
|------|-------------|
| `list_tags` | Tags for an asset |
| `create_tag` | Tag a table or column |
| `apply_tags` | Apply multiple tags to multiple objects |
| `bulk_apply_tag` | Apply a tag across multiple assets |

## Usage Examples

These examples show full terminal sessions using Claude Code with the AnomalyArmor MCP server. Each one demonstrates a real workflow from start to finish.

### Morning Triage: Check Health and Act on Issues

```bash
~/warehouse $ claude

╭────────────────────────────────────────────────────────────────────╮
│ Claude Code                                                        │
╰────────────────────────────────────────────────────────────────────╯

You: What's the health of my data warehouse? Show me anything that
     needs attention.

Claude: I'll check your overall health and look for issues.

  ┌ health_summary
  │ Querying AnomalyArmor health status...
  └ Result: 47 assets monitored

  ┌ list_stale_assets
  │ Checking for freshness violations...
  └ Result: 2 stale tables found

  ┌ list_alerts
  │ Fetching unresolved alerts (last 24h)...
  └ Result: 5 alerts

Here's your morning summary:

OVERALL: NEEDS ATTENTION (2 issues)

Freshness:
  ✅ 45 tables fresh
  ❌ gold.fact_orders - 6h overdue (SLA: 2h)
  ❌ staging.orders_raw - 8h overdue (SLA: 1h)

Alerts (last 24h):
  🔴 1 critical - freshness SLA breach on gold.fact_orders
  🟡 2 warnings - schema drift on gold.fact_orders,
                    null rate spike on gold.dim_customer.email
  🔵 2 info - new schema baselines captured

Schema drift: 3 changes detected this week
Data quality: 1 anomaly (null rate)

The critical issue is the orders pipeline. staging.orders_raw
stopped updating 8 hours ago and gold.fact_orders is stale as a
result. Want me to dig into the root cause?

You: Yes, investigate the orders staleness and check if anything
     downstream is affected.

Claude: Let me trace the lineage and check downstream impact.

  ┌ check_freshness
  │ gold.fact_orders → STALE (6h overdue)
  └ Last update: yesterday 10:12 PM

  ┌ get_lineage
  │ Fetching upstream/downstream for gold.fact_orders...
  └ 1 upstream, 7 downstream

  ┌ check_freshness
  │ staging.orders_raw → STALE (8h overdue)
  └ Last update: yesterday 8:30 PM

  ┌ list_schema_changes
  │ staging.orders_raw - changes in last 7 days
  └ 1 change: discount_pct removed Tuesday 3:47 PM

Here's the full picture:

Pipeline trace:
  raw.events (✅ FRESH, updated 5m ago)
    → staging.orders_raw (❌ STALE, 8h overdue)
      → gold.fact_orders (❌ STALE, 6h overdue)

The pipeline breaks at staging.orders_raw. The upstream raw.events
table is fresh, so the issue is in the ETL between raw → staging.

Downstream impact (7 tables):
  ❌ gold.fact_payments        - depends on fact_orders
  ❌ gold.fact_refunds          - depends on fact_orders
  ❌ analytics.order_summary    - derived view
  ❌ analytics.order_funnel     - derived view
  ❌ reporting.daily_revenue    - aggregated report
  ❌ reporting.weekly_kpis      - aggregated report
  ❌ reporting.customer_ltv     - joins with dim_customer

Additional context: staging.orders_raw also had a schema change
this week (discount_pct column removed Tuesday). This may be
related to the staleness if the ETL is failing on the missing
column.

Recommended next steps:
1. Check the Airflow/dbt DAG for the staging.orders_raw job
2. Look for errors related to the discount_pct column removal
3. Once fixed, the 7 downstream tables should recover automatically

You: Acknowledge the freshness alert and resolve the schema drift
     alert with a note that the column removal was intentional.

Claude: I'll handle both alerts.

  ┌ acknowledge_alert
  │ Alert #4521 (freshness SLA breach) → acknowledged
  └ Done

  ┌ resolve_alert
  │ Alert #4518 (schema drift) → resolved
  │ Resolution: "Column removal was intentional per product change"
  └ Done

Done. The freshness alert is acknowledged (you'll still see it
until the pipeline recovers), and the schema drift alert is
resolved with your note.
```

### Setting Up Monitoring for a New Schema

```bash
~/warehouse $ claude

You: I just added a new "analytics" schema with 8 tables. Set up
     full monitoring: freshness, schema drift, and null checks on
     any ID or email columns.

Claude: Let me discover what's in the analytics schema first.

  ┌ explore
  │ Searching for tables in analytics schema...
  └ Found 8 tables

  ┌ explore
  │ Reading columns for all 8 tables...
  └ 67 columns total

Found 8 tables in analytics with 67 columns. Here's what I'll set up:

Freshness monitoring (8 tables):
  analytics.order_summary      - hourly, 2h SLA
  analytics.order_funnel       - hourly, 2h SLA
  analytics.customer_cohorts   - daily, 26h SLA
  analytics.product_performance - daily, 26h SLA
  analytics.revenue_by_region  - daily, 26h SLA
  analytics.churn_analysis     - daily, 26h SLA
  analytics.funnel_conversion  - hourly, 2h SLA
  analytics.session_metrics    - hourly, 2h SLA

Schema drift monitoring: Enabled for all 8 tables

Null percentage metrics (12 columns with ID or email patterns):
  analytics.order_summary.order_id
  analytics.order_summary.customer_id
  analytics.order_funnel.user_id
  analytics.order_funnel.session_id
  analytics.customer_cohorts.customer_id
  analytics.customer_cohorts.email
  analytics.product_performance.product_id
  analytics.churn_analysis.customer_id
  analytics.churn_analysis.email
  analytics.funnel_conversion.user_id
  analytics.session_metrics.session_id
  analytics.session_metrics.user_id

I chose hourly checks for tables that look transactional and daily
for tables that look like aggregations. Want me to proceed?

You: Yes, do it all.

Claude: Setting everything up now.

  ┌ create_freshness_schedule (×8)
  │ Creating schedules for all analytics tables...
  └ 8 schedules created

  ┌ enable_schema_monitoring (×8)
  │ Enabling drift detection...
  └ 8 baselines captured

  ┌ create_metric (×12)
  │ Creating null_percentage metrics...
  └ 12 metrics created

All done:
  ✅ 8 freshness schedules (4 hourly, 4 daily)
  ✅ 8 schema drift monitors with baselines
  ✅ 12 null_percentage metrics on ID/email columns

You'll start receiving alerts if any table misses its SLA, any
schema changes are detected, or any null rates spike outside
normal ranges. The first data points will appear after the next
discovery run.
```

### Investigating a Data Quality Issue

```bash
~/warehouse $ claude

You: The finance team says revenue numbers look wrong today.
     Something about missing discount amounts. Can you investigate?

Claude: Let me check the revenue-related tables for anomalies.

  ┌ explore
  │ Searching for tables matching "revenue", "order", "discount"...
  └ Found 5 relevant tables

  ┌ get_metrics_summary
  │ Checking data quality metrics on gold.fact_orders...
  └ 3 metrics tracked

  ┌ list_schema_changes
  │ gold.fact_orders - recent changes
  └ 1 change this week

  ┌ check_freshness
  │ gold.fact_orders → FRESH (updated 12m ago)
  └ No freshness issues

Found the issue. Here's what happened:

Root cause: The discount_pct column was removed from
gold.fact_orders on Tuesday at 4:12 PM.

Evidence:
  - Schema drift detected the removal Tuesday 4:12 PM
  - The column had a null_percentage metric that stopped reporting
    (no column to measure)
  - Freshness is fine - the table is still updating
  - The column was also removed from upstream staging.orders_raw
    at 3:47 PM

Impact on revenue:
  - analytics.order_summary uses SUM(discount_pct) to calculate
    discount totals - this query is now failing
  - reporting.daily_revenue references discount_pct in its
    net_revenue calculation

The table has data, but the discount column is gone. That's why
revenue looks wrong - discount amounts aren't being subtracted.

You: Was this intentional? Can you check if there's a related
     alert that someone already handled?

Claude: Let me check the alert history.

  ┌ list_alerts
  │ Filtering for schema drift alerts on gold.fact_orders...
  └ 1 alert found

  ┌ get_incident
  │ Alert #4518...
  └ Status: RESOLVED

There was a schema drift alert (#4518) that fired Tuesday at
4:12 PM for the discount_pct column removal. It was resolved
today at 9:15 AM with the note:

  "Column removal was intentional per product change"

So the column removal was intentional, but the downstream views
haven't been updated yet. The finance team's revenue calculations
are referencing a column that no longer exists.

Recommended:
1. Update analytics.order_summary to remove the discount_pct
   reference
2. Update reporting.daily_revenue net_revenue calculation
3. Consider adding a new discount column if discounts are still
   tracked differently

You: Tag the affected downstream tables so the data team knows.

Claude: I'll tag them.

  ┌ apply_tags
  │ Applying "needs-migration" tag...
  └ Applied to 2 tables

Tagged 2 tables with "needs-migration":
  - analytics.order_summary
  - reporting.daily_revenue

The data team can find these by asking "what tables are tagged
needs-migration?" in any AnomalyArmor agent.
```

## Troubleshooting

### Authentication Failed (Remote)

```
Authentication failed
```

**Solution**:
1. Clear your MCP auth cache and re-authenticate
2. In Claude Code: run `/mcp` and select the reconnect option
3. Verify you can sign in to [app.anomalyarmor.ai](https://app.anomalyarmor.ai)

### Authentication Failed (Local)

```
Error: Invalid API key
```

**Solution**: Verify your API key is correct in your MCP config. Generate a new key from **Settings > API Keys** if needed.

### Server Not Found

```
Error: MCP server 'anomalyarmor' not found
```

**Solution**:
1. Restart your AI tool after config changes
2. For local mode, verify armor-mcp is installed: `armor-mcp --version`
3. Check MCP config syntax (valid JSON)

### Empty Responses

If tools return empty results after successful authentication:

**Solution**:
1. Verify you have data sources connected in AnomalyArmor
2. Try `health_summary` as a basic connectivity test
3. For local mode, ensure your API key has the correct scope

### Rate Limiting

```
Error: Rate limit exceeded
```

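**Solution**: Wait a few seconds and retry. MCP traffic counts against the same per-key limit as direct REST calls, so if an agent keeps hitting the limit, enable retry-with-backoff in your host configuration.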
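
For scripts that call the API or MCP server directly, retrying with exponential backoff is a common way to absorb rate limits. The sketch below is generic: `RateLimitError` is a stand-in exception defined here, not a class from an AnomalyArmor package, and the delays are arbitrary starting points.

```python
import random
import time


class RateLimitError(Exception):
    """Stand-in for whatever error your client raises on HTTP 429."""


def with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Run `call`, retrying on RateLimitError with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts:
                raise
            # Double the delay each attempt, cap it, then add up to 25% jitter.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay * 0.25))
```

The jitter spreads out retries so that several agents sharing one key do not all retry at the same moment.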

## Inline UI rendering (MCP Apps)

Hosts that support the [draft MCP Apps spec](/integrations/mcp-apps) (Claude Desktop nightly, Cursor experimental) render interactive charts inline when you call `check_freshness`, `get_freshness_summary`, `list_schema_changes`, `get_lineage`, or `health_summary`. Instead of a JSON blob, you see a timeline, diff table, lineage graph, or stat dashboard. Hosts without MCP Apps support fall back to JSON automatically, so the tools work everywhere.

See [MCP Apps (inline UI)](/integrations/mcp-apps) for host-support details and examples.

## Next Steps

- **[MCP Apps (inline UI)](/integrations/mcp-apps)**: Interactive charts rendered inline in supported hosts
- **[Claude Code Setup](/integrations/claude-code)**: Detailed Claude Code configuration
- **[Cursor Setup](/integrations/cursor)**: Cursor IDE integration
- **[Web Agent](/ai-agents/web-agent)**: Use the same tools in your browser
- **[AI Skills](/ai-agents/overview)**: Browse available skills

## Common Questions

### What's the MCP server URL and auth model?

Hosted at `https://mcp.anomalyarmor.ai/mcp`. Interactive clients sign in through OAuth, as described under Remote Server; clients that authenticate with an API key send it as a Bearer token (`Authorization: Bearer aa_live_...`). Keys are scoped `read-only` (demo/trial) or `read-write` (paid). Any write tool call made with a read-only key returns 403, along with a handoff URL the agent can surface.

### Which MCP hosts does AnomalyArmor support?

Tested with Claude Desktop, Claude Code CLI, Cursor, and Windsurf. Any host that supports the streamable-http transport should work. See [Claude Code](/integrations/claude-code) and [Cursor](/integrations/cursor) for host-specific setup.

### Do MCP tool calls count against my API rate limit?

Yes. MCP traffic uses the same API rate limit as direct REST calls (per-key). Agents tend to be bursty, so enable retry-with-backoff in your host config if you hit 429s. Enterprise plans raise the limit significantly.

### Does the MCP server let agents modify my data or infrastructure?

No. All AnomalyArmor tools operate on metadata and monitoring state (assets, metrics, alerts, tags). Agents cannot write to your connected databases, rotate credentials, or change billing. Mutating tools on AnomalyArmor itself (creating a metric, dismissing an alert) require a `read-write` scoped key and always ask for confirmation in the host UI before executing.

---

<!-- Source: https://docs.anomalyarmor.ai/integrations/mcp-apps -->

# MCP Apps (inline UI)

> Interactive chart rendering for AnomalyArmor MCP tools inside Claude Desktop and Cursor

AnomalyArmor's hosted MCP server (`mcp.anomalyarmor.ai/mcp`) attaches an inline HTML view to five flagship tools - `check_freshness`, `get_freshness_summary`, `list_schema_changes`, `get_lineage`, and `health_summary`. Hosts that implement the draft [MCP Apps](https://modelcontextprotocol.io/) spec render the view directly in the chat; hosts without MCP Apps support keep seeing the existing JSON response. Nothing breaks either way.

The images below are **illustrative static snapshots** of what the live renders look like. The actual tool response is interactive and generated by the MCP server at render time.

## Supported hosts

| Host | Status | Notes |
|------|--------|-------|
| **Claude Desktop nightly** | Full | Requires Developer Mode → MCP Apps feature flag |
| **Cursor experimental** | Best-effort | Behind an experimental feature flag; rendering fidelity varies |
| **Everything else** (Claude Code CLI, Cursor stable, web agents, custom clients) | Falls back to JSON | The tools still work; you see the raw JSON payload |

No configuration is required on the AnomalyArmor side - if your host supports MCP Apps, the inline UI appears automatically the next time you call one of the five flagship tools.

## Freshness timeline

Tool: `check_freshness` / `get_freshness_summary`. Returns per-table or aggregate freshness with status-colored bars and stat cards.

## Schema diff table

Tool: `list_schema_changes`. Color-codes each row by change type (green = added, red = dropped, yellow = type-changed) with a severity pill and detection timestamp.

## Lineage graph

Tool: `get_lineage`. Three-column layout of upstream sources, the queried asset, and downstream consumers. Edge count is surfaced in the caption.

## Health dashboard

Tool: `health_summary`. Stat-card grid with freshness %, active alert count, and unacknowledged schema drift, followed by a "Needs attention" list.

## Security and privacy

- The embedded HTML runs in the host's resource sandbox with a strict Content Security Policy: `default-src 'none'`, no `connect-src` (so the template cannot make network calls), and no inline secrets.
- All data is pre-filtered by the MCP server before it's injected into the template - Bearer tokens, `aa_live_*` API keys, and email addresses cannot reach the render layer. A CI conformance sweep guards against regressions.
- Vega-Lite is pinned to `5.17.0` and loaded from `cdn.jsdelivr.net`. If the CDN is unreachable at render time, hosts show the JSON payload alongside a `<noscript>` fallback.
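
As an illustration of the pre-filtering idea, the sketch below scrubs secret-looking values from a payload before it would be handed to a template. It is not the server's actual filter: the regular expressions, the `[redacted]` marker, and the `scrub` helper are assumptions made for this example.

```python
import re

# Assumed patterns; the real filter's rules are not documented here.
_PATTERNS = [
    re.compile(r"Bearer\s+[A-Za-z0-9._~+/=-]+"),  # Authorization header values
    re.compile(r"aa_live_[A-Za-z0-9_]+"),          # AnomalyArmor-style API keys
    re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),   # email addresses
]


def scrub(value):
    """Recursively replace secret-looking substrings in strings, lists, and dicts."""
    if isinstance(value, str):
        for pattern in _PATTERNS:
            value = pattern.sub("[redacted]", value)
        return value
    if isinstance(value, list):
        return [scrub(item) for item in value]
    if isinstance(value, dict):
        return {key: scrub(item) for key, item in value.items()}
    # Numbers, booleans, and None pass through unchanged.
    return value
```

Filtering on the server side, before rendering, means the template never has to be trusted with sensitive values in the first place.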

## How it works

Each flagship tool returns a two-element MCP content-block list:

1. A `TextContent` with the exact JSON payload every host saw before MCP Apps landed - this is the byte-identical fallback for hosts that ignore `EmbeddedResource`.
2. An `EmbeddedResource` pointing to a `ui://armor/<template>/<sha256(payload)[:12]>` URI whose text is a self-contained HTML document. Hosts that support MCP Apps render this resource inline; hosts that don't, skip it.

The URI is stable per `(template, payload)` pair, so repeated identical calls from the same conversation are deduped by the host's resource cache.
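
The URI scheme can be sketched in a few lines. The page specifies `sha256(payload)[:12]` but not how the payload is serialized before hashing, so the canonical-JSON step below (sorted keys, compact separators) is an assumption, as are the `resource_uri` helper and the `freshness` template name.

```python
import hashlib
import json


def resource_uri(template, payload):
    """Build a ui://armor/<template>/<digest> URI for a tool payload."""
    # Assumption: serialize deterministically so equal payloads hash equally.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]
    return f"ui://armor/{template}/{digest}"


if __name__ == "__main__":
    print(resource_uri("freshness", {"table": "gold.fact_orders", "status": "stale"}))
```

With a deterministic serialization, identical payloads produce identical URIs regardless of key order, which is what allows a host's resource cache to dedupe repeated calls.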

## Next steps

- **[MCP server overview](/integrations/mcp-server)**: Connection methods, auth, and the full tool catalog
- **[Claude Code](/integrations/claude-code)**: Claude Code CLI setup (uses JSON fallback today)
- **[Cursor](/integrations/cursor)**: Cursor IDE setup - experimental MCP Apps flag
- **[Web agent](/ai-agents/web-agent)**: The same tools in the AnomalyArmor web UI

## Common Questions

### Which AnomalyArmor MCP tools render inline chart views?

Five today: `check_freshness`, `get_freshness_summary`, `list_schema_changes`, `get_lineage`, and `health_summary`. Hosts that speak the MCP Apps spec render these inline; other hosts receive the standard JSON response automatically, so nothing breaks.

### Which MCP hosts support the inline UI today?

Claude Desktop nightly has full support once the MCP Apps feature flag is enabled in Developer Mode. Cursor has it behind an experimental flag. Claude Code CLI renders the JSON fallback today - inline UI arrives when Claude Code ships its MCP Apps integration. Adding the tools now is safe either way.

### What happens if my host doesn't support MCP Apps?

The tool returns the same structured JSON it always has. Agents read it normally; you just don't get the inline chart preview. No error, no fallback configuration needed.

### Can I embed these interactive views outside of an MCP host?

Not directly - the views are part of the MCP tool response protocol. For standalone embedding (dashboards, wikis), use [report badges](/data-quality/report-badges) for static status or the REST API to fetch raw data for your own rendering.

---

<!-- Source: https://docs.anomalyarmor.ai/integrations/claude-code -->

# Claude Code Integration

> Use AnomalyArmor with Claude Code for AI-powered data observability

Claude Code is Anthropic's CLI tool for AI-assisted development. With AnomalyArmor integration, you can monitor data quality, investigate issues, and manage alerts through natural conversation.

## Setup

### Option 1: Remote Server (Recommended)

Connect directly to AnomalyArmor's hosted MCP server. No installation or API key required. You sign in with your AnomalyArmor account.

```bash
claude mcp add anomalyarmor --transport http https://mcp.anomalyarmor.ai/mcp
```

The first time you use a tool, Claude Code will open a browser window to sign in. After that, your session stays authenticated.

### Option 2: Local Server (API Key)

For air-gapped environments or custom setups, run the MCP server locally:

```bash
pip install armor-mcp
```

Then add to your Claude Code MCP configuration (`~/.claude/mcp.json`):

```json
{
  "mcpServers": {
    "anomalyarmor": {
      "command": "armor-mcp",
      "env": {
        "ANOMALYARMOR_API_KEY": "your-api-key"
      }
    }
  }
}
```

Get your API key from **Settings > API Keys** in the AnomalyArmor dashboard.

### Option 3: Skills (Recommended for Slash Commands)

Install the `/armor:*` skills using the [skills.sh](https://skills.sh) CLI. Works with Claude Code, Cursor, Windsurf, Codex, and 40+ other agents:

```bash
npx skills add anomalyarmor/agents
```

Or install the native Claude Code plugin:

```bash
claude plugin install armor@anomalyarmor
```

Both options give you access to skills like `/armor:status`, `/armor:alerts`, and more. Source: [github.com/anomalyarmor/agents](https://github.com/anomalyarmor/agents).

### Demo mode (no signup required)

If you install the Claude Code plugin without an `ARMOR_API_KEY` or `~/.armor/config.yaml`, the skills' pre-tool-use hook mints a short-lived read-only API key and runs your prompt against a synthetic BalloonBazaar demo dataset. You get a real answer before you commit to a signup.

Safety rails apply automatically: the demo key has `scope=read-only`, expires after 1 hour, is rate-limited per IP, and points at synthetic data only. Writes return 403, at which point the skill surfaces a signup URL that preserves your exact question so it replays into the in-app agent after you finish signup.

## Using Skills

With the plugin installed, use slash commands:

```
/armor:status    # Check data health
/armor:alerts    # View recent alerts
/armor:ask       # Ask questions about your data
/armor:analyze   # Trigger AI analysis
```

## Example Conversations

### Check Data Health

```
You: /armor:status

Claude: Here's your data health summary:

Overall Status: WARNING

Issues:
- 2 tables are stale (orders, customers)
- 1 unacknowledged schema change
- 3 alerts triggered in the last 24 hours

Recommended actions:
1. Check the orders ETL pipeline
2. Review schema change in users table
```

### Investigate an Issue

```
You: Why is the orders table stale?

Claude: Let me investigate...

The orders table hasn't updated since yesterday 6:00 AM.

Root cause: The upstream staging.orders_raw table is also stale.
This appears to be related to an ETL job failure.

Lineage shows 7 downstream tables are affected:
- gold.daily_revenue
- gold.customer_metrics
- reporting.exec_dashboard
...

Recommended: Check your Airflow logs for the orders_etl DAG.
```

### Query Alerts

```
You: What critical alerts fired this week?

Claude: Found 2 critical alerts this week:

1. [CRITICAL] Freshness SLA breach - orders
   Triggered: Monday 8:15 AM
   Status: Unresolved

2. [CRITICAL] Schema breaking change - users.email removed
   Triggered: Tuesday 3:30 PM
   Status: Acknowledged
```

## Available Skills

| Skill | Description |
|-------|-------------|
| `/armor:status` | Health summary across all assets |
| `/armor:alerts` | Query and manage alerts |
| `/armor:connect` | Connect a new data source |
| `/armor:monitor` | Set up freshness monitoring |
| `/armor:ask` | Natural language Q&A |
| `/armor:analyze` | Trigger AI analysis |
| `/armor:quality` | Data quality checks |
| `/armor:lineage` | Explore data dependencies |
| `/armor:investigate` | Root cause analysis |

> 💡 You can also ask questions naturally without using slash commands. Claude will use the appropriate tools automatically.

## Configuration

### Remote Server (OAuth)

No configuration file needed. Authentication is handled automatically through your browser. To re-authenticate or check connection status:

```
/mcp
```

Select the `anomalyarmor` server to reconnect or view status.

### Local Server / Plugin (API Key)

Set your API key in `~/.armor/config.yaml`:

```yaml
api_key: aa_live_your_key_here
```

Or use an environment variable:

```bash
export ARMOR_API_KEY="aa_live_your_key_here"
```
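
A sketch of how a script might resolve the key from these two locations. The lookup order (environment variable first, then the config file) and the helper name `resolve_api_key` are assumptions for illustration, not documented behavior. The line-based parsing only handles the flat `api_key:` form shown above and is not a full YAML parser:

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_api_key(
    env: Mapping[str, str] = os.environ,
    config_path: Path = Path.home() / ".armor" / "config.yaml",
) -> Optional[str]:
    """Return the API key from ARMOR_API_KEY or the config file, else None."""
    # Assumed precedence: the environment variable wins over the file.
    key = env.get("ARMOR_API_KEY", "").strip()
    if key:
        return key

    if not config_path.exists():
        return None

    # Minimal parse of a flat "api_key: value" line; not a general YAML parser.
    for line in config_path.read_text().splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() == "api_key":
            return value.strip().strip("\"'") or None
    return None
```

Passing `env` and `config_path` explicitly keeps the function easy to test without touching your real shell or home directory.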

### Verify Connection

Check that MCP tools are available:

```bash
claude mcp list
```

You should see `anomalyarmor` listed with a connected status.

## Troubleshooting

### Remote Auth: Browser Not Opening

**Solution**:
1. Run `/mcp` in Claude Code and select reconnect
2. If behind a proxy, you may need to use the local server option instead

### Remote Auth: Empty Responses

If tools connect but return no data:

**Solution**:
1. Clear auth and re-authenticate: remove the server with `claude mcp remove anomalyarmor`, then re-add it
2. Verify you have data sources connected in the AnomalyArmor dashboard

### Skills Not Loading

**Solution**:
1. Verify plugin installed: `claude plugin list`
2. Restart Claude Code
3. Check for updates: `claude plugin update armor@anomalyarmor`

### API Key Auth Error

**Solution**:
1. Verify API key is set correctly in your MCP config or env var
2. Check key permissions in AnomalyArmor dashboard
3. Regenerate key if needed

## Common Questions

### How do I add the AnomalyArmor MCP server to Claude Code?

For browser sign-in, run `claude mcp add anomalyarmor --transport http https://mcp.anomalyarmor.ai/mcp`. To authenticate with an API key instead, run `claude mcp add anomalyarmor https://mcp.anomalyarmor.ai/mcp --transport http --header "Authorization: Bearer $ARMOR_API_KEY"`. You can also add the server manually under `mcpServers` in `~/.claude.json`; the result is equivalent to the CLI command. Restart Claude Code to pick up the change.

### Do I need a Claude API key to use AnomalyArmor in Claude Code?

No. You need a Claude Code install, which authenticates to Anthropic separately through its own login flow. On the AnomalyArmor side, the remote server signs you in through your browser, while the local server and API-key setups need an AnomalyArmor API key from **Settings > API Keys**.

### Can I try AnomalyArmor in Claude Code without creating an account?

Yes, use the demo flow. Claude Code will auto-mint a short-lived demo key (1-hour expiry, read-only) the first time you invoke an AnomalyArmor skill. To get your own data, sign up at [app.anomalyarmor.ai/sign-up](https://app.anomalyarmor.ai/sign-up) and swap in a real API key.

### Why do some AnomalyArmor actions in Claude Code ask for confirmation while others don't?

Read-only tools (check freshness, list alerts, describe asset) execute immediately. Mutating tools (create metric, dismiss alert, update tag) require confirmation via Claude Code's tool-use approval flow so you never accidentally change state. You can pre-approve specific tools in Claude Code's settings.

## Next Steps

- **[Skills Reference](/ai-agents/overview)**: Complete skills documentation
- **[MCP Server](/integrations/mcp-server)**: Advanced MCP configuration
- **[Python SDK](/sdk/overview)**: Programmatic access
- **[API Reference](/api/overview)**: REST API documentation

---

<!-- Source: https://docs.anomalyarmor.ai/integrations/cursor -->

# Cursor Integration

> Use AnomalyArmor with Cursor IDE for AI-powered data observability

Cursor is an AI-powered IDE. With AnomalyArmor's MCP integration, you can query data health, investigate issues, and manage monitoring directly from your editor.

## Setup

### Option 1: Remote Server (Recommended)

Connect to AnomalyArmor's hosted MCP server. No installation or API key required. You sign in with your AnomalyArmor account.

Add to your Cursor MCP configuration:

- **macOS**: `~/Library/Application Support/Cursor/mcp.json`
- **Linux**: `~/.config/Cursor/mcp.json`
- **Windows**: `%APPDATA%\Cursor\mcp.json`

```json
{
  "mcpServers": {
    "anomalyarmor": {
      "url": "https://mcp.anomalyarmor.ai/mcp"
    }
  }
}
```

Restart Cursor after saving. You will be prompted to sign in when you first use a tool.

### Option 2: Local Server (API Key)

For air-gapped environments or custom setups, install and run the server locally.

**1. Install the MCP server:**

```bash
pip install armor-mcp
```

**2. Configure Cursor:**

```json
{
  "mcpServers": {
    "anomalyarmor": {
      "command": "armor-mcp",
      "env": {
        "ANOMALYARMOR_API_KEY": "your-api-key"
      }
    }
  }
}
```

**3. Get your API key:**

1. Go to **Settings > API Keys** in AnomalyArmor
2. Click **Create API Key**
3. Name it "Cursor"
4. Copy the key (it won't be shown again)

## Usage

Open Cursor's AI chat and ask questions about your data:

```
What's the health status of my data warehouse?
```

```
Show me any stale tables
```

```
What alerts fired yesterday?
```

## Example Queries

### Data Health Check

```
User: Is my data healthy?

Cursor: Based on AnomalyArmor:

Overall Status: WARNING

Summary:
- 45 assets monitored
- 2 freshness issues
- 1 schema change pending review
- 3 alerts in last 24 hours

The orders and customers tables haven't updated in 6 hours.
```

### Investigate Pipeline Issues

```
User: Why is the orders_mart table stale?

Cursor: I checked the lineage and freshness status:

Root Cause: staging.orders_raw is stale (last update: 18 hours ago)

Dependency Chain:
  raw.events -> staging.orders_raw -> gold.orders_mart

The issue originates in the staging layer. Check your ETL job for staging.orders_raw.
```

### Query Schema Changes

```
User: What schema changes happened this week?

Cursor: Found 3 schema changes:

1. users.phone_number - Column added (Tuesday)
   Status: Acknowledged

2. orders.discount_code - Type changed VARCHAR(20) -> VARCHAR(50) (Wednesday)
   Status: Pending review

3. products.category_id - Column removed (Thursday)
   Status: Critical - may break downstream
```

## Available Tools

The MCP server provides 40+ tools to Cursor, including health checks, alert management, freshness monitoring, schema drift detection, data quality metrics, validity rules, AI intelligence, lineage, and tagging.

See the [MCP Server page](/integrations/mcp-server#available-tools) for the complete tool reference.

## Troubleshooting

### MCP Server Not Found

**Solution**:
1. Restart Cursor after editing mcp.json
2. For local mode, verify armor-mcp is installed: `armor-mcp --version`
3. Check JSON syntax in config file
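
The JSON syntax check in step 3 can be scripted. A minimal sketch, assuming the `mcpServers` layout from the setup section; the function name `check_mcp_config` and the specific messages are hypothetical and not part of Cursor or `armor-mcp`:

```python
import json


def check_mcp_config(text: str) -> list[str]:
    """Return a list of problems found in an mcp.json document (empty if none)."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"]

    if not isinstance(config, dict):
        return ["top-level value must be a JSON object"]
    servers = config.get("mcpServers")
    if not isinstance(servers, dict):
        return ["missing 'mcpServers' object"]
    entry = servers.get("anomalyarmor")
    if not isinstance(entry, dict):
        return ["no 'anomalyarmor' entry under 'mcpServers'"]
    # Remote mode uses "url"; local mode uses "command".
    if "url" not in entry and "command" not in entry:
        return ["'anomalyarmor' entry needs either 'url' (remote) or 'command' (local)"]
    return []
```

Read your `mcp.json` into a string and pass it in; an empty list means the file parses and has the expected shape.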

### Authentication Failed (Remote)

**Solution**:
1. Re-open Cursor to trigger a fresh sign-in prompt
2. Verify you can sign in to [app.anomalyarmor.ai](https://app.anomalyarmor.ai)
3. Check that your Cursor version supports streamable-http MCP transport

### Authentication Failed (Local)

**Solution**:
1. Verify API key in mcp.json
2. Check key is not expired or revoked
3. Generate a new key from **Settings > API Keys**

### No Data Returned

**Solution**:
1. Verify you have data sources connected in AnomalyArmor
2. Try a simpler query like "health summary"
3. For remote mode, try removing and re-adding the server config to clear cached auth

## Common Questions

### How do I add AnomalyArmor to Cursor?

Open Cursor Settings → Features → MCP. Add a server named `anomalyarmor` with URL `https://mcp.anomalyarmor.ai/mcp` and header `Authorization: Bearer $ARMOR_API_KEY`. Restart Cursor to load the server.

### Can I use inline chart previews from AnomalyArmor in Cursor?

Yes, with Cursor's experimental MCP Apps flag enabled. Five tools (covering freshness, schema changes, lineage, and health summary) return interactive chart views that render inline in the chat. See [MCP Apps](/integrations/mcp-apps) for details. Without the flag, Cursor falls back to plain JSON responses.

### Does AnomalyArmor in Cursor see my code?

No. AnomalyArmor MCP tools read only the monitoring state in your AnomalyArmor workspace: assets, metrics, alerts, and tags. Cursor sends your prompt and relevant context to its LLM provider; AnomalyArmor receives only the tool calls the agent makes, which carry table names and filter parameters, not source code.

### Why isn't Cursor showing AnomalyArmor tools after I configured the MCP server?

Three common causes: (1) Cursor wasn't fully restarted after the config change; (2) the token is invalid, or the header value is missing the `Bearer ` prefix; (3) the header name `Authorization` is case-sensitive in some Cursor versions. Check Cursor's MCP logs (View → Output → MCP) for the exact error.

## Next Steps

- **[MCP Server](/integrations/mcp-server)**: Advanced configuration
- **[Claude Code](/integrations/claude-code)**: Claude Code integration
- **[Python SDK](/sdk/overview)**: Programmatic access
- **[API Reference](/api/overview)**: REST API documentation

---

<!-- Source: https://docs.anomalyarmor.ai/ai-agents/overview -->

# AI Agents Overview

> Manage data observability through natural language, in the browser or your IDE

AnomalyArmor gives you three ways to interact with your data platform using natural language:

| Access Point | Where | Best For |
|-------------|-------|----------|
| **[Web Agent](/ai-agents/web-agent)** | Built into the Intelligence page | Diagnosing issues while looking at your data. No setup required. |
| **[MCP Server](/integrations/mcp-server)** | Claude Code, Cursor, any MCP client | Managing observability from your IDE without switching apps. |
| **[Python SDK](/sdk/overview)** | Scripts, notebooks, CI/CD | Automating checks, building workflows, programmatic access. |

All three share the same underlying tools and API. The difference is where you interact.

## Why AI Agents?

Instead of clicking through dashboards, just ask:

```
"What tables are unhealthy right now?"
"Add freshness monitoring to all gold tables"
"What changed upstream that could cause NULLs in silver_orders?"
```

Your AI assistant queries APIs, chains tools together, and interprets results for you.

## Available Skills

### Core Skills

| Skill | Description | Example |
|-------|-------------|---------|
| `/armor:status` | Health summary | "Is my data healthy?" |
| `/armor:alerts` | Query and manage alerts | "What alerts fired yesterday?" |
| `/armor:ask` | Natural language Q&A | "What tables contain PII?" |
| `/armor:analyze` | Trigger AI analysis | "Analyze the finance schema" |
| `/armor:connect` | Connect a data source | "Connect my Snowflake warehouse" |
| `/armor:monitor` | Set up monitoring | "Monitor freshness for orders hourly" |

### Extended Skills

| Skill | Description | Example |
|-------|-------------|---------|
| `/armor:quality` | Data quality checks | "Add null check for email column" |
| `/armor:tags` | Data classification | "Tag this table as PII" |
| `/armor:investigate` | Root cause analysis | "Why is this table stale?" |
| `/armor:lineage` | Data dependencies | "What depends on this table?" |
| `/armor:profile` | Table statistics | "Profile the orders table" |
| `/armor:coverage` | Monitoring coverage | "What tables have no alerts?" |

## Quick Start

### Web Agent (No Setup)

1. Go to any asset's **Intelligence** tab
2. Click **Ask Agent**
3. Start chatting

### MCP Server (IDE)

```bash
# Claude Code (one command)
claude mcp add anomalyarmor --transport http https://mcp.anomalyarmor.ai/mcp

# Cursor (add to mcp.json)
{
  "mcpServers": {
    "anomalyarmor": {
      "url": "https://mcp.anomalyarmor.ai/mcp"
    }
  }
}
```

Sign in through your browser when prompted. See [MCP Server setup](/integrations/mcp-server) for details.

### Skills (Any Agent)

Install the 14 `/armor:*` skills into Claude Code, Cursor, Windsurf, Codex, OpenCode, or any of 40+ supported agents using the [skills.sh](https://skills.sh) CLI:

```bash
npx skills add anomalyarmor/agents
```

Source: [github.com/anomalyarmor/agents](https://github.com/anomalyarmor/agents)

## Example: Morning Health Check

### Web Agent

```
You: What should I prioritize this morning?

Agent: Here's your priority list:

CRITICAL:
- gold.fact_orders is 6 hours stale (SLA: 2h)
  Upstream staging.orders_raw is also stale.

HIGH:
- Schema drift: discount_pct column removed from
  gold.fact_orders on Tuesday
- Null rate on gold.dim_customer.email spiked to 12.4%

MEDIUM:
- 2 validity rules failing on gold.dim_customer

All other monitored tables are healthy.
```

### MCP (Claude Code)

```
$ claude

You: /armor:status

Claude: Let me check your data health...

[Tool: health_summary]

Overall: NEEDS ATTENTION

3 issues found:
- 1 critical freshness violation (gold.fact_orders, 6h overdue)
- 1 schema drift alert (column removed from gold.fact_orders)
- 1 data quality anomaly (null rate spike on dim_customer.email)

45 other assets are healthy. Want me to investigate any of these?
```

## Example: Investigating an Alert

### Web Agent

```
You: I got a freshness alert on the orders table. What's happening?

Agent: gold.fact_orders is 6 hours overdue (SLA: 2 hours).

I checked the upstream lineage:

raw.events (FRESH)
> staging.orders_raw (STALE, 8h overdue)
> gold.fact_orders (STALE, 6h overdue)

The issue originated in staging.orders_raw. 7 downstream
tables are also affected:
- gold.fact_payments
- gold.fact_refunds
- analytics.order_summary
- reporting.daily_revenue
... and 3 more

Recommended: Check the ETL pipeline that populates
staging.orders_raw.
```

### MCP (Claude Code)

```
You: The orders table is stale. Why?

Claude: Let me investigate...

[Tool: check_freshness > gold.fact_orders: STALE, 6h overdue]
[Tool: get_lineage > upstream: staging.orders_raw]
[Tool: check_freshness > staging.orders_raw: STALE, 8h overdue]

The orders table is stale because its upstream dependency
staging.orders_raw stopped updating 8 hours ago. The issue
started upstream.

7 downstream tables are affected. Want me to list them?
```
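
The investigation above is a graph walk: follow lineage upstream while the parents are also stale, then count what sits downstream of the origin. A minimal Python sketch of that idea, assuming lineage is available as a plain `{table: [upstream tables]}` mapping. The function names and data shapes are hypothetical and do not reflect the MCP tools' actual return format:

```python
from collections import deque


def find_root_causes(table, upstream, stale):
    """Walk upstream from a stale table and return the stale tables
    that have no stale parents (the likely origins of the problem)."""
    roots, seen, queue = [], {table}, deque([table])
    while queue:
        current = queue.popleft()
        stale_parents = [p for p in upstream.get(current, []) if p in stale]
        if current in stale and not stale_parents:
            roots.append(current)
        for parent in stale_parents:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return roots


def downstream_of(table, upstream):
    """Return every table that depends on `table`, directly or transitively."""
    # Invert the upstream map so we can walk in the downstream direction.
    children = {}
    for child, parents in upstream.items():
        for parent in parents:
            children.setdefault(parent, []).append(child)
    affected, queue = set(), deque([table])
    while queue:
        for child in children.get(queue.popleft(), []):
            if child not in affected:
                affected.add(child)
                queue.append(child)
    return affected


# Lineage from the example: raw.events -> staging.orders_raw -> gold.fact_orders
upstream = {
    "staging.orders_raw": ["raw.events"],
    "gold.fact_orders": ["staging.orders_raw"],
}
stale = {"staging.orders_raw", "gold.fact_orders"}

print(find_root_causes("gold.fact_orders", upstream, stale))  # ['staging.orders_raw']
```

Because `raw.events` is fresh, the walk stops at `staging.orders_raw`, which matches the agent's conclusion that the issue started in the staging layer.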

## Example: Setting Up Monitoring

### Web Agent

```
You: Set up hourly monitoring for all gold fact tables

Agent: Found 4 fact tables in the gold schema. Created freshness
schedules:

- gold.fact_orders: hourly, 2h SLA
- gold.fact_payments: hourly, 2h SLA
- gold.fact_refunds: hourly, 2h SLA
- gold.fact_subscriptions: hourly, 2h SLA

Want me to also add null_percentage metrics on key columns?
```

### MCP (Claude Code)

```
You: Monitor all gold fact tables for freshness, hourly

Claude: I'll set those up now.

[Tool: search_tables > 4 fact tables in gold schema]
[Tool: create_freshness_schedule > gold.fact_orders: hourly, 2h SLA]
[Tool: create_freshness_schedule > gold.fact_payments: hourly, 2h SLA]
[Tool: create_freshness_schedule > gold.fact_refunds: hourly, 2h SLA]
[Tool: create_freshness_schedule > gold.fact_subscriptions: hourly, 2h SLA]

Done. 4 freshness schedules created. You'll get alerts if any
table doesn't update within 2 hours.
```
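
The SLA logic in these examples comes down to comparing a table's age with its allowed window. A minimal sketch, assuming "overdue" means the time elapsed beyond the SLA; that definition and the function name `freshness_status` are assumptions for illustration, not AnomalyArmor's documented implementation:

```python
from datetime import datetime, timedelta, timezone


def freshness_status(last_updated, sla, now=None):
    """Return ("FRESH" | "STALE", overdue) for a table under the assumed rule."""
    now = now or datetime.now(timezone.utc)
    age = now - last_updated
    # Assumed rule: a table is stale once its age exceeds the SLA window.
    overdue = max(age - sla, timedelta(0))
    return ("STALE" if overdue > timedelta(0) else "FRESH", overdue)


# Arbitrary fixed clock so the example is reproducible.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
status, overdue = freshness_status(now - timedelta(hours=8), timedelta(hours=2), now)
print(status, overdue)  # STALE 6:00:00
```

A table last updated 8 hours ago with a 2-hour SLA is 6 hours overdue under this rule.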

## Common Questions

### Do I need Claude to use AnomalyArmor's AI agents?

No. The Web Agent runs in your browser with no install, and the MCP server works with any [MCP-compatible client](/integrations/mcp-server), including Cursor, Windsurf, Codex, OpenCode, and Claude Code. The `/armor:*` skills install into 40+ supported agents via [skills.sh](https://skills.sh).

### What's the difference between the Web Agent, MCP server, and Python SDK?

All three call the same underlying tools. The Web Agent is built into the Intelligence page with zero setup. The [MCP server](/integrations/mcp-server) lets your IDE agent manage observability without switching apps. The [Python SDK](/sdk/overview) is for scripts, notebooks, and CI/CD automation.

### How do I install the /armor skills in my IDE?

Run `npx skills add anomalyarmor/agents` and the 14 skills install into any supported agent (Claude Code, Cursor, Windsurf, Codex, OpenCode, and more). Source is at [github.com/anomalyarmor/agents](https://github.com/anomalyarmor/agents).

### Can the agent make changes to my monitoring setup, or is it read-only?

The agent can create freshness schedules, metrics, validity rules, tags, and alert rules. For any action that modifies data, it shows the plan and asks for confirmation before executing. Tool calls are always visible so you can verify what ran.

### Does the agent have access to my actual data rows?

No. The agent queries AnomalyArmor's metadata (schemas, freshness timestamps, monitoring results, lineage), not your row-level data. See [Query Gateway](/security/query-gateway) for how AnomalyArmor reads customer warehouses without exfiltrating rows.

## Next Steps

- **[Web Agent](/ai-agents/web-agent)**: Chat with the agent in your browser
- **[MCP Server](/integrations/mcp-server)**: Connect your IDE
- **[All Skills](/ai-agents/skills/status)**: Browse available skills
- **[Python SDK](/sdk/overview)**: Programmatic access

---

<!-- Source: https://docs.anomalyarmor.ai/ai-agents/web-agent -->

# Web Agent

> Chat with the AI agent directly inside AnomalyArmor

The Web Agent is an AI assistant built into AnomalyArmor's Intelligence page. It has access to 30+ tools across 9 domains and can chain them together to investigate issues, set up monitoring, manage alerts, and answer questions about your data.

## Getting Started

1. Navigate to any asset's **Intelligence** tab
2. Click **Ask Agent** in the top right
3. Type a question or request in plain English

The agent opens as a full-screen overlay with your Intelligence page still accessible behind it.

## What Makes It Powerful

The Web Agent isn't just a chatbot. When you ask a question, it:

1. **Reads your monitoring context**: Schema drift history, freshness status, data quality metrics, tags, lineage, and alert rules for the selected asset
2. **Chains tools automatically**: A single question can trigger multiple tool calls across domains
3. **Shows its work**: Every tool call is visible with parameters and results, so you can verify what the agent is doing

### Example: Diagnosing a Problem

```
You: Why is the orders dashboard broken?

Agent: Let me investigate...

[Tool: list_schema_changes > gold.fact_orders]
  Found: discount_pct column removed, Tuesday 4:12 PM

[Tool: check_freshness > gold.fact_orders]
  Status: FRESH (last update 12 minutes ago)

[Tool: get_lineage > gold.fact_orders]
  Upstream: staging.orders_raw
  Downstream: analytics.order_summary, reporting.daily_revenue

[Tool: list_schema_changes > staging.orders_raw]
  Found: discount_pct column removed, Tuesday 3:47 PM

The dashboard is broken because the discount_pct column was removed
from gold.fact_orders on Tuesday at 4:12 PM.

Root cause: The column was first removed from staging.orders_raw at
3:47 PM, then the ETL propagated the removal downstream.

2 views reference this column:
- analytics.order_summary
- reporting.daily_revenue

Freshness is normal, so the table is still updating. The issue is
the missing column, not stale data.
```

The agent made 4 tool calls across 3 domains (schema drift, freshness, lineage) to build a complete diagnosis from a single question.

## Capabilities by Domain

### Discovery

Browse your data assets without leaving the chat.

```
You: What tables do I have in the gold schema?

Agent: Found 12 tables in warehouse.gold:

       Fact tables:
       - gold.fact_orders (3.2M rows, fresh)
       - gold.fact_payments (1.8M rows, fresh)
       - gold.fact_refunds (245K rows, fresh)
       - gold.fact_subscriptions (450K rows, fresh)

       Dimension tables:
       - gold.dim_customer (245K rows, fresh)
       - gold.dim_product (18K rows, fresh)
       - gold.dim_geography (2.1K rows, fresh)
       ... and 5 more
```

### Monitoring Setup

Set up freshness schedules, schema monitoring, and data quality metrics in one conversation.

```
You: Set up monitoring for the gold fact tables. I want hourly
     freshness checks and null_percentage metrics on any amount columns.

Agent: Done. Here's what I set up:

       Freshness schedules (4 tables):
       - gold.fact_orders: hourly, 2h SLA
       - gold.fact_payments: hourly, 2h SLA
       - gold.fact_refunds: hourly, 2h SLA
       - gold.fact_subscriptions: hourly, 2h SLA

       Data quality metrics (6 columns):
       - gold.fact_orders.order_total: null_percentage
       - gold.fact_orders.discount_amount: null_percentage
       - gold.fact_payments.amount: null_percentage
       - gold.fact_payments.fee_amount: null_percentage
       - gold.fact_refunds.refund_amount: null_percentage
       - gold.fact_subscriptions.subscription_value: null_percentage

       Want me to add alert rules for these metrics?
```

### Alert Management

View, acknowledge, resolve, or dismiss alerts. The agent shows context from monitoring data alongside each alert.

```
You: What alerts fired today? Dismiss any info-level ones.

Agent: Found 5 alerts today.

       Dismissed 2 info-level alerts:
       - Schema baseline captured for gold.dim_geography
       - Freshness schedule activated for staging.email_events

       Remaining 3 alerts:
       1. [CRITICAL] Freshness SLA breach
          gold.fact_orders - 6 hours overdue (SLA: 2h)
          Upstream staging.orders_raw is also stale.

       2. [WARNING] Data quality anomaly
          gold.dim_customer.email - null rate rose from 2.1% to 12.4%
          Gradual increase over 5 days, not a sudden spike.

       3. [WARNING] Schema drift detected
          gold.fact_orders - discount_pct column removed
          2 downstream views affected.
```

### Data Quality

Create metrics and validity rules. The agent asks for confirmation before creating anything.

```
You: Add null checks on all email columns across my tables

Agent: I found email columns in 4 tables:

       - gold.dim_customer.email (current null rate: 12.4%)
       - silver.silver_customers.email_address (current: 0.4%)
       - raw.user_signups.email (current: 0.0%)
       - staging.email_events.recipient_email (current: 1.2%)

       I'll create null_percentage metrics for each. Proceed? [Yes/No]

You: Yes

Agent: Created 4 null_percentage metrics.

       Note: gold.dim_customer.email already has an elevated null rate
       (12.4%). You may want to investigate before setting alert
       thresholds, otherwise you'll get an immediate alert.
```
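
To make the `null_percentage` metric concrete, here is a small sketch of the arithmetic behind it and of why an already-elevated column would alert immediately. It computes the rate over an in-memory sample; the function names and the threshold value are illustrative assumptions, not how AnomalyArmor evaluates metrics in your warehouse:

```python
def null_percentage(values):
    """Percentage of None entries in a column sample (0.0 for an empty sample)."""
    if not values:
        return 0.0
    nulls = sum(1 for v in values if v is None)
    return 100.0 * nulls / len(values)


def breaches(values, threshold_pct):
    """True when the null rate is above the alert threshold."""
    return null_percentage(values) > threshold_pct


# Hypothetical sample: 2 of 8 email values are missing.
emails = ["a@example.com", None, "b@example.com", "c@example.com",
          None, "d@example.com", "e@example.com", "f@example.com"]

print(null_percentage(emails))   # 25.0
print(breaches(emails, 5.0))     # True
```

If the current rate is already above the threshold you pick, the first evaluation breaches it, which is the situation the agent's note warns about.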

### Schema Drift

Review changes, trace their origin, and understand downstream impact.

```
You: What schema changes happened this week? Show me the impact.

Agent: 3 schema changes detected this week:

       1. gold.fact_orders.discount_pct REMOVED (Tuesday 4:12 PM)
          Impact: 2 downstream views reference this column
          Origin: Removed from staging.orders_raw at 3:47 PM

       2. gold.dim_customer.loyalty_tier ADDED (Wednesday 9:30 AM)
          Type: VARCHAR(50)
          Impact: No downstream references yet (new column)

       3. silver.silver_products.weight REMOVED (Monday 2:15 PM)
          Impact: 1 downstream table (gold.dim_product) joins on this
          Origin: Source table raw.products no longer has this column
```
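
Conceptually, drift detection compares two snapshots of a table's columns. A minimal sketch of that comparison, assuming each snapshot is a `{column: type}` mapping. The function name, the change labels, and the column types in the sample are hypothetical:

```python
def diff_schema(before, after):
    """Compare two {column: type} snapshots and list the changes."""
    changes = []
    for col in sorted(before.keys() - after.keys()):
        changes.append(("REMOVED", col, before[col], None))
    for col in sorted(after.keys() - before.keys()):
        changes.append(("ADDED", col, None, after[col]))
    for col in sorted(before.keys() & after.keys()):
        if before[col] != after[col]:
            changes.append(("TYPE_CHANGED", col, before[col], after[col]))
    return changes


# Hypothetical snapshots; types are made up for the example.
before = {"order_id": "BIGINT", "discount_pct": "DECIMAL(5,2)", "discount_code": "VARCHAR(20)"}
after = {"order_id": "BIGINT", "discount_code": "VARCHAR(50)", "loyalty_tier": "VARCHAR(50)"}

for change in diff_schema(before, after):
    print(change)
```

Removed columns are listed first because they are the changes most likely to break downstream queries.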

### Tags and Classification

Apply and manage tags for organization and compliance.

```
You: Find all tables with email columns and tag them as PII

Agent: Found email columns in 4 tables. Applied PII tag:

       - gold.dim_customer (column: email)
       - silver.silver_customers (column: email_address)
       - raw.user_signups (column: email)
       - staging.email_events (column: recipient_email)

       These tables now appear when you ask "which tables have PII?"
       and will be flagged in compliance reports.
```
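
A rough sketch of the kind of column-name matching a request like this implies. Matching on the substring `email` is an assumption made for illustration; the agent's actual discovery logic is not described here, and the non-email column names in the sample are invented:

```python
def find_email_columns(schemas):
    """Return {table: [columns]} for columns whose name contains 'email'."""
    matches = {}
    for table, columns in schemas.items():
        hits = [c for c in columns if "email" in c.lower()]
        if hits:
            matches[table] = hits
    return matches


# Table names follow the example above; other columns are hypothetical.
schemas = {
    "gold.dim_customer": ["customer_id", "email"],
    "silver.silver_customers": ["customer_id", "email_address"],
    "raw.user_signups": ["signup_id", "email"],
    "staging.email_events": ["event_id", "recipient_email"],
    "gold.dim_product": ["product_id", "name"],
}

print(sorted(find_email_columns(schemas)))
```

Name matching alone can miss PII stored under unrelated column names, so treat a result like this as a starting list to review.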

## Sessions

Conversations are organized into sessions. Each session maintains full context, so follow-up questions work naturally.

- **New session**: Click **+** in the session sidebar
- **Resume session**: Click a previous session to continue where you left off
- **Pin session**: Pin important sessions to the top of the list
- **Session history**: Sidebar shows all past conversations with timestamps and message counts

## Tool Visibility

As the agent works, you see each tool call in real time:

- Which tool was called (e.g., `list_schema_changes`, `check_freshness`)
- What parameters were passed
- The result returned

For actions that modify data (creating metrics, applying tags, acknowledging alerts), the agent asks for confirmation before proceeding.
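
The read-versus-write split can be pictured as a small gate in front of each tool call. A minimal sketch, assuming mutating tools can be recognized by a verb prefix; that naming convention, the prefix list, and both function names are hypothetical and not the agent's actual mechanism:

```python
# Assumed naming convention: tools that change state start with one of these verbs.
MUTATING_PREFIXES = (
    "create_", "apply_", "acknowledge_", "resolve_", "dismiss_", "snooze_",
)


def needs_confirmation(tool_name: str) -> bool:
    """True for tools that modify configuration or alert state."""
    return tool_name.startswith(MUTATING_PREFIXES)


def run_tool(tool_name, call, confirm):
    """Run `call()` directly for read-only tools; ask `confirm` first otherwise.

    Returns the tool result, or None when the user declines.
    """
    if needs_confirmation(tool_name) and not confirm(tool_name):
        return None
    return call()


print(needs_confirmation("check_freshness"))            # False
print(needs_confirmation("create_freshness_schedule"))  # True
```

Here `confirm` stands in for the Yes/No prompt shown in the chat, and `call` for the underlying tool invocation.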

## Tips

- **Be specific about scope**: "Check freshness for the 3 gold fact tables" works better than "check everything"
- **Chain requests**: "Find tables with spiking null rates, then show me their schema changes" works in one message
- **Ask diagnostic questions**: "Why is X broken?" triggers cross-domain investigation, which is where the agent is strongest
- **Let it discover first**: Say "find tables with customer data" instead of typing exact table paths
- **Review before acting**: For bulk operations, ask "what tables match?" before "set up monitoring for all of them"

## Available Tools

| Domain | Count | Capabilities |
|--------|-------|-------------|
| Discovery | 7 | Browse assets, catalogs, schemas, tables, columns |
| Freshness | 4 | Check freshness, find stale tables, create schedules, get recommendations |
| Schema | 3 | List changes, view details, enable monitoring |
| Data Quality | 3 | Create metrics, list existing, view health summary |
| Validity | 3 | Create rules, list existing, view results |
| Alerts | 4 | List alerts, view details, list rules, create rules |
| Alert Actions | 5 | Acknowledge, resolve, dismiss, snooze alerts |
| Tags | 2 | List and apply tags |
| Health | 1 | Overall health summary |

## Common Questions

### Do I need to install anything to use the Web Agent?

No. The Web Agent is built into AnomalyArmor. Open any asset's Intelligence tab and click **Ask Agent**. There's no extension, no API key, and no local setup.

### Does the agent remember previous conversations?

Yes. Conversations are saved as sessions. Each session keeps full context so follow-ups work naturally, and you can pin important sessions or resume them later from the sidebar.

### Will the agent change my monitoring without asking?

No. For any action that creates, modifies, or deletes configuration (metrics, rules, tags, schedules), the agent previews the plan and waits for confirmation. Read-only queries (freshness checks, lineage, alerts) run without prompting.

### Why does the agent sometimes call multiple tools for one question?

The agent chains tools to answer cross-domain questions. "Why is this dashboard broken?" can trigger schema, freshness, and lineage checks in sequence. You can see each call and its result inline, which is how the agent shows its reasoning.

## Next Steps

- **[Intelligence Overview](/intelligence/overview)**: How monitoring context powers the agent's answers
- **[MCP Server](/integrations/mcp-server)**: Use the same tools from your IDE
- **[Status Skill](/ai-agents/skills/status)**: Health monitoring details
- **[Alerts Skill](/ai-agents/skills/alerts)**: Alert management details

---

<!-- Source: https://docs.anomalyarmor.ai/ai-agents/skills/status -->

# Status Skill

> Check overall data health with /armor:status

The `/armor:status` skill provides a quick overview of your data observability status across all monitored assets.

## Usage

```
/armor:status
```

Or ask naturally:
- "Is my data healthy?"
- "What's the status of my warehouse?"
- "Any issues I should know about?"

## What It Shows

- **Overall health status**: Healthy, Warning, or Critical
- **Asset summary**: Total monitored assets
- **Freshness issues**: Tables that are stale
- **Schema changes**: Unacknowledged drift
- **Recent alerts**: Alerts triggered recently

## Example Output

```
=== Data Health Summary ===

Overall Status: WARNING

Assets: 45 monitored
Freshness: 2 stale tables
Schema: 1 unacknowledged change
Alerts: 3 in last 24 hours

Issues Requiring Attention:
1. orders table is 6 hours stale
2. customers table is 2 hours stale
3. users.phone_number column added (pending review)

Recommended Actions:
- Check ETL pipeline for orders and customers
- Review schema change in users table
```

## Related Skills

- **[Alerts](/ai-agents/skills/alerts)**: Detailed alert information
- **[Investigate](/ai-agents/skills/investigate)**: Root cause analysis
- **[Coverage](/ai-agents/skills/coverage)**: Monitoring coverage gaps

---

<!-- Source: https://docs.anomalyarmor.ai/ai-agents/skills/alerts -->

# Alerts Skill

> Query and manage alerts with /armor:alerts

The `/armor:alerts` skill lets you query, filter, and manage data observability alerts.

## Usage

```
/armor:alerts
```

Or ask naturally:
- "What alerts fired yesterday?"
- "Show me critical alerts"
- "What's alerting on the orders table?"

## Filters

You can filter alerts by:
- **Status**: triggered, acknowledged, resolved
- **Severity**: critical, warning, info
- **Asset**: specific table or data source
- **Time range**: today, yesterday, this week
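
These filters combine with AND semantics in the examples below: every filter you supply must match. A small sketch of that behavior over in-memory records; the `Alert` fields and the `filter_alerts` helper are hypothetical and do not mirror the SDK or API schema:

```python
from dataclasses import dataclass


@dataclass
class Alert:
    severity: str   # "critical", "warning", or "info"
    status: str     # "triggered", "acknowledged", or "resolved"
    asset: str
    title: str


def filter_alerts(alerts, severity=None, status=None, asset=None):
    """Keep alerts matching every filter that was supplied (None = no filter)."""
    return [
        a for a in alerts
        if (severity is None or a.severity == severity)
        and (status is None or a.status == status)
        and (asset is None or a.asset == asset)
    ]


alerts = [
    Alert("critical", "triggered", "orders", "Freshness SLA breach"),
    Alert("warning", "acknowledged", "users", "Schema change detected"),
    Alert("info", "acknowledged", "staging.temp_orders", "New table discovered"),
]

for alert in filter_alerts(alerts, severity="critical"):
    print(alert.asset, alert.title)  # orders Freshness SLA breach
```

Time-range filtering is left out to keep the sketch short; it would be one more condition on a timestamp field.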

## Example Queries

### Recent Alerts

```
You: What alerts fired in the last 24 hours?

Claude: Found 5 alerts in the last 24 hours:

1. [CRITICAL] Freshness SLA breach - orders
   Triggered: Today 8:15 AM
   Status: Unresolved

2. [WARNING] Schema change detected - users
   Triggered: Today 6:30 AM
   Status: Acknowledged

3. [WARNING] Null rate increase - products.category_id
   Triggered: Yesterday 11:00 PM
   Status: Resolved

4. [INFO] New table discovered - staging.temp_orders
   Triggered: Yesterday 9:00 PM
   Status: Acknowledged

5. [INFO] Freshness recovered - customers
   Triggered: Yesterday 4:00 PM
   Status: Auto-resolved
```

### Filter by Severity

```
You: Show me only critical alerts

Claude: Found 1 critical alert:

[CRITICAL] Freshness SLA breach - orders
- Triggered: Today 8:15 AM
- Table: warehouse.public.orders
- SLA: 2 hours
- Current delay: 6 hours
- Status: Unresolved

This alert has been open for 4 hours.
```

### Filter by Asset

```
You: What alerts are there for the customers table?

Claude: Found 2 alerts for customers:

1. [WARNING] Row count dropped 15% - customers
   Triggered: Yesterday
   Status: Investigating

2. [INFO] Schema change - customers.loyalty_tier added
   Triggered: Last week
   Status: Acknowledged
```

## Managing Alerts

You can also acknowledge, resolve, dismiss, or snooze alerts from the chat:

```
You: Acknowledge the schema change alert

Claude: Acknowledged the schema change alert for the users table.

Alert ID: alert-123
Status: acknowledged
Note: Schema change reviewed and approved
```

## Common Questions

### Can I acknowledge or resolve alerts from the agent?

Yes. Ask the agent to acknowledge, resolve, dismiss, or snooze any alert by description. It will confirm the match before updating the alert status.

### How do I silence noisy info-level alerts in bulk?

Ask something like "dismiss all info alerts from today" or "snooze freshness alerts for the staging schema for 24 hours." The agent batches the action after confirming the scope.

## Related Skills

- **[Status](/ai-agents/skills/status)**: Overall health check
- **[Investigate](/ai-agents/skills/investigate)**: Dig into alert causes

---

<!-- Source: https://docs.anomalyarmor.ai/ai-agents/skills/ask -->

# Ask Skill

> Natural language Q&A with /armor:ask

The `/armor:ask` skill lets you ask questions about your data in natural language.

## Usage

```
/armor:ask
```

Or just ask naturally:
- "What tables contain customer data?"
- "Which columns have PII?"
- "Explain the orders schema"

## Example Usage

```
You: What tables contain customer PII?

Claude: Based on AnomalyArmor Intelligence:

Tables with PII:
1. customers
   - email (PII: email address)
   - phone (PII: phone number)
   - address (PII: physical address)

2. orders
   - shipping_address (PII: address)
   - billing_email (PII: email)

3. user_profiles
   - ssn_last4 (PII: partial SSN)
   - date_of_birth (PII: DOB)

Recommendation: Ensure these tables have proper access controls.
```

## Related Skills

- **[Analyze](/ai-agents/skills/analyze)**: Deep analysis
- **[Tags](/ai-agents/skills/tags)**: Data classification

---

<!-- Source: https://docs.anomalyarmor.ai/ai-agents/skills/analyze -->

# Analyze Skill

> Trigger AI analysis with /armor:analyze

The `/armor:analyze` skill triggers deep AI analysis of your data assets.

## Usage

```
/armor:analyze
```

Or ask naturally:
- "Analyze the finance schema"
- "Generate insights for orders"
- "What can you tell me about this table?"

## Example Usage

```
You: Analyze the orders table

Claude: Triggering AI analysis for orders...

Analysis complete:

Schema Understanding:
- Purpose: E-commerce order tracking
- Primary key: order_id
- Foreign keys: customer_id, product_id
- Temporal: created_at, updated_at

Data Patterns:
- Average 2,500 new orders/day
- Peak hours: 10am-2pm, 7pm-10pm
- Weekend volume 40% lower

Quality Observations:
- 0.1% null rate on customer_id (investigate)
- Status distribution normal
- No duplicate order_ids detected

Recommendations:
1. Add null check on customer_id
2. Consider partitioning by created_at
3. Index on status for common queries
```

## Related Skills

- **[Ask](/ai-agents/skills/ask)**: Quick questions
- **[Profile](/ai-agents/skills/profile)**: Statistics

---

<!-- Source: https://docs.anomalyarmor.ai/ai-agents/skills/connect -->

# Connect Skill

> Connect data sources with /armor:connect

The `/armor:connect` skill helps you connect new data sources to AnomalyArmor.

## Usage

```
/armor:connect
```

Or ask naturally:
- "Connect my Snowflake warehouse"
- "Add a new PostgreSQL database"
- "Set up BigQuery connection"

## Example Usage

```
You: Connect my Snowflake warehouse

Claude: I'll help you connect Snowflake. I need a few details:

1. Account identifier (e.g., xy12345.us-east-1)
2. Warehouse name
3. Database to monitor
4. Authentication method (password or key pair)

What's your Snowflake account identifier?

You: xy12345.us-east-1

Claude: Great! Creating connection...

Connection created:
- Name: snowflake-prod
- Account: xy12345.us-east-1
- Status: Testing connection...

Connection successful! Starting asset discovery...
Found 156 tables across 12 schemas.

Next: Use /armor:monitor to set up freshness monitoring.
```

## Common Questions

### Which data warehouses and databases can I connect with this skill?

Snowflake, BigQuery, Postgres, Redshift, and Databricks. See [data sources](/data-sources/overview) for the full list and the credentials each connector needs.

### Where are my warehouse credentials stored after I connect?

Credentials are encrypted at rest and used only by the [Query Gateway](/security/query-gateway) to run metadata queries. They are never exposed to the agent or included in logs.

## Related Skills

- **[Monitor](/ai-agents/skills/monitor)**: Set up monitoring
- **[Status](/ai-agents/skills/status)**: Check data health

---

<!-- Source: https://docs.anomalyarmor.ai/ai-agents/skills/monitor -->

# Monitor Skill

> Set up freshness monitoring with /armor:monitor

The `/armor:monitor` skill helps you set up freshness monitoring for your tables. It can handle single tables or bulk operations across entire schemas.

## Usage

```
/armor:monitor
```

Or ask naturally:
- "Monitor freshness for orders table"
- "Set up hourly checks for customers"
- "Add a 2-hour SLA to the payments table"
- "Set up monitoring for all gold tables"

## Single Table

```
You: Monitor the orders table hourly

Claude: Created freshness schedule:

Table: warehouse.public.orders
Check Interval: hourly
SLA Threshold: 2 hours
Alert Severity: warning

You'll be notified if the table doesn't update within 2 hours.
```
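
Conceptually, a freshness check compares the time since the last update with the SLA threshold. A minimal sketch, assuming you already have the table's last update timestamp; `is_stale` is a hypothetical helper, not an AnomalyArmor function.

```python
from datetime import datetime, timedelta


def is_stale(last_update: datetime, now: datetime, sla: timedelta) -> bool:
    """A table is stale when the time since its last update exceeds the SLA."""
    return now - last_update > sla


sla = timedelta(hours=2)
last_update = datetime(2026, 1, 31, 8, 0)

# 1.5 hours since the last update: within the 2-hour SLA.
on_time = is_stale(last_update, datetime(2026, 1, 31, 9, 30), sla)
# 2.5 hours since the last update: the SLA is breached.
breached = is_stale(last_update, datetime(2026, 1, 31, 10, 30), sla)
```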

## Custom SLA

```
You: Monitor customers with a 30-minute SLA

Claude: Created freshness schedule:

Table: warehouse.public.customers
Check Interval: every 15 minutes
SLA Threshold: 30 minutes
Alert Severity: critical

Tight SLA configured. Consider whether this table is production-critical.
```

## Guided Setup

When used through the [Web Agent](/ai-agents/web-agent), the monitor skill can chain multiple steps together automatically:

```
You: Set up full monitoring for my gold schema

Agent: Done. Here's what I set up:

Freshness (8 tables):
- gold.fact_orders: hourly, 2h SLA
- gold.fact_payments: hourly, 2h SLA
- gold.dim_customer: daily, 24h SLA
... and 5 more

Schema monitoring: enabled on all 8 tables

Want me to set up a Slack alert for freshness violations?

You: Yes, send to #data-alerts

Agent: Done. Created:
- Slack destination: #data-alerts
- Alert rule: freshness violations (warning + critical) -> #data-alerts
- Alert rule: schema changes (all severities) -> #data-alerts

Your gold schema is fully monitored.
```

The agent discovers tables, creates schedules, enables schema monitoring, and wires up alerts, all in one conversation.

## Smart Recommendations

The agent can suggest monitoring intervals based on your table's update patterns:

```
You: What monitoring should I set up for the orders table?

Claude: Based on the orders table update history:
- Updates every ~45 minutes during business hours
- Recommended: 30-minute check interval, 90-minute SLA
- This balances alert noise with detection speed

Want me to create this schedule?
```

## Common Questions

### Can I set up monitoring for a whole schema at once?

Yes. Ask "monitor all tables in the gold schema" or "add freshness to every fact table." The skill expands the pattern, previews the tables, and creates schedules after you confirm.

### What freshness interval should I pick if I don't know my update cadence?

Don't guess. Use [`/armor:recommend`](/ai-agents/skills/recommend) to get an interval based on the table's historical update frequency, or [`/armor:test`](/ai-agents/skills/test) to dry-run a threshold against the last 30 days before enabling.

## Related Skills

- **[Status](/ai-agents/skills/status)**: Health check
- **[Coverage](/ai-agents/skills/coverage)**: Find gaps
- **[Web Agent](/ai-agents/web-agent)**: In-app AI assistant
- **[Quality](/ai-agents/skills/quality)**: Data quality metrics

---

<!-- Source: https://docs.anomalyarmor.ai/ai-agents/skills/quality -->

# Quality Skill

> Set up data quality checks with /armor:quality

The `/armor:quality` skill helps you create and manage data quality checks including metrics and validity rules.

## Usage

```
/armor:quality
```

Or ask naturally:
- "Add a null check for the email column"
- "Create a row count metric for orders"
- "What quality checks exist for this table?"

## Quality Check Types

### Metrics
Track quantitative measurements over time:
- **row_count**: Number of rows
- **null_rate**: Percentage of null values
- **distinct_count**: Unique value count
- **freshness**: Time since last update

### Validity Rules
Validate data integrity:
- **NOT_NULL**: Column must not contain nulls
- **UNIQUE**: Values must be unique
- **ACCEPTED_VALUES**: Values must be in allowed list
- **REGEX**: Values must match pattern
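
To make the rule semantics concrete, here is a minimal sketch of the four validity rules as plain functions over a list of column values. The function names, and the choice to skip nulls in the `accepted_values` and `matches_regex` checks, are assumptions for illustration rather than AnomalyArmor's implementation.

```python
import re


def not_null(values):
    """NOT_NULL: passes only if no value is None."""
    return all(v is not None for v in values)


def unique(values):
    """UNIQUE: passes only if no non-null value repeats."""
    present = [v for v in values if v is not None]
    return len(present) == len(set(present))


def accepted_values(values, allowed):
    """ACCEPTED_VALUES: every non-null value must be in the allowed set."""
    return all(v in allowed for v in values if v is not None)


def matches_regex(values, pattern):
    """REGEX: every non-null value must fully match the pattern."""
    compiled = re.compile(pattern)
    return all(compiled.fullmatch(v) is not None for v in values if v is not None)


emails = ["a@example.com", None, "b@example.com"]
statuses = ["pending", "shipped", "delivered"]

email_not_null = not_null(emails)  # fails: one value is None
status_ok = accepted_values(statuses, {"pending", "shipped", "delivered", "cancelled"})
email_format_ok = matches_regex(emails, r"[^@\s]+@[^@\s]+\.[^@\s]+")
```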

## Example Usage

### Add Null Check

```
You: Add a null check for the email column in customers

Claude: Created NOT_NULL validity rule:

Rule: email_not_null
Table: warehouse.public.customers
Column: email
Severity: warning

The rule will alert if any null values are found in the email column.
```

### Create Row Count Metric

```
You: Create a daily row count metric for orders

Claude: Created row count metric:

Metric: orders_row_count
Table: warehouse.public.orders
Type: row_count
Interval: daily

Current value: 1,234,567 rows
You'll be able to track trends over time.
```

### Check Quality Status

```
You: What quality checks exist for the orders table?

Claude: Found 5 quality checks for orders:

Metrics:
- row_count: 1,234,567 (passing)
- null_rate on customer_id: 0.1% (passing)
- distinct_count on status: 5 values (passing)

Validity Rules:
- order_id unique (passing)
- customer_id not null (failing - about 1,235 nulls found)

1 failing check needs attention.
```

## Common Questions

### When should I use a metric versus a validity rule?

Use a metric when you want to track a number over time and alert on anomalies (for example, a null rate climbing from 0.1% to 5%). Use a validity rule when you want a hard pass/fail check on every run (for example, email must match a regex, or status must be in an allowed list).
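
The difference can be sketched in a few lines: a metric compares the latest value with its own history, while a rule checks a fixed condition. The 3-standard-deviation cutoff and the function names below are illustrative assumptions, not how AnomalyArmor detects anomalies.

```python
from statistics import mean, stdev


def metric_is_anomalous(history, latest, z_cutoff=3.0):
    """Metric: flag the latest value if it is far from the historical mean."""
    spread = stdev(history)
    if spread == 0:
        return latest != history[0]
    return abs(latest - mean(history)) / spread > z_cutoff


def rule_passes(values, allowed):
    """Validity rule: hard pass/fail on the current values, no history involved."""
    return all(v in allowed for v in values)


null_rate_history = [0.10, 0.12, 0.09, 0.11, 0.10]  # percent, one value per run
anomalous = metric_is_anomalous(null_rate_history, latest=5.0)
normal = metric_is_anomalous(null_rate_history, latest=0.11)
rule_ok = rule_passes(["pending", "shipped"], {"pending", "shipped", "delivered"})
```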

## Related Skills

- **[Profile](/ai-agents/skills/profile)**: Table statistics
- **[Coverage](/ai-agents/skills/coverage)**: Find gaps in monitoring

---

<!-- Source: https://docs.anomalyarmor.ai/ai-agents/skills/tags -->

# Tags Skill

> Classify data with /armor:tags

The `/armor:tags` skill helps you classify and organize data with tags for governance and compliance.

## Usage

```
/armor:tags
```

Or ask naturally:
- "Tag this table as PII"
- "Mark the email column as sensitive"
- "Apply financial reporting tag"

## Tag Categories

- **business**: Business domain (finance, marketing, sales)
- **technical**: Technical classification (fact, dimension, staging)
- **governance**: Compliance (pii, confidential, gdpr)

## Example Usage

### Tag a Table

```
You: Tag the customers table as PII

Claude: Created governance tag:

Tag: pii_data
Table: warehouse.public.customers
Category: governance
Description: Contains personally identifiable information

The table is now flagged for PII compliance tracking.
```

### Tag a Column

```
You: Mark the email column as sensitive

Claude: Created column tag:

Tag: sensitive
Column: warehouse.public.customers.email
Category: governance

This column will be included in sensitivity reports.
```

### Bulk Tagging

```
You: Apply financial_reporting tag to all gold tables

Claude: Applied tag to 12 tables:

- gold.fact_orders
- gold.fact_revenue
- gold.dim_customers
- gold.dim_products
...

All gold layer tables are now tagged for financial reporting.
```

## Common Questions

### Can I create custom tags, or am I limited to the built-in categories?

You can create any tag name under the three categories (business, technical, governance). The categories are fixed so tags stay organized for reporting, but the tag values themselves are open, for example `finance`, `pii_data`, `deprecated`, `team_checkout`.

### Do tags propagate from a table to its columns, or do I tag each separately?

Tags are applied at whatever level you specify, table or column. They don't auto-propagate. Applying a `pii` tag to a table marks the table itself, not its columns. To tag sensitive columns, ask the agent to find them and apply the tag across matches in one step.
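
The sketch below models this behavior with a plain dictionary keyed by asset path. It is a hypothetical data model (the `tags` store, `apply_tag`, and `tag_matching_columns` are invented for illustration) showing why a table-level tag does not appear on the table's columns.

```python
from collections import defaultdict

# Hypothetical tag store: asset path -> set of tag names.
tags = defaultdict(set)


def apply_tag(asset_path, tag):
    """Attach a tag to exactly one asset (a table or a column), nothing else."""
    tags[asset_path].add(tag)


def tag_matching_columns(columns, keyword, tag):
    """Tag every column whose name contains the keyword; return what was tagged."""
    matched = [c for c in columns if keyword in c.rsplit(".", 1)[-1]]
    for column in matched:
        apply_tag(column, tag)
    return matched


columns = [
    "warehouse.public.customers.email",
    "warehouse.public.customers.signup_date",
    "warehouse.public.orders.billing_email",
]

apply_tag("warehouse.public.customers", "pii")          # table-level only
tagged = tag_matching_columns(columns, "email", "pii")  # explicit column tagging
```

After this runs, the `customers` table and the two email columns carry the tag, while `signup_date` does not.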

## Related Skills

- **[Ask](/ai-agents/skills/ask)**: Find tagged data
- **[Coverage](/ai-agents/skills/coverage)**: Classification coverage

---

<!-- Source: https://docs.anomalyarmor.ai/ai-agents/skills/lineage -->

# Lineage Skill

> Explore data dependencies with /armor:lineage

The `/armor:lineage` skill helps you understand how data flows through your systems by showing upstream sources and downstream consumers.

## Usage

```
/armor:lineage
```

Or ask naturally:
- "What depends on this table?"
- "Where does this data come from?"
- "Show me the impact of changing the customers table"

## Concepts

### Upstream
Tables that feed data into the target table (data sources).

### Downstream
Tables that consume data from the target table (data consumers).

### Depth
How many levels of dependencies to traverse:
- **depth=1**: Direct dependencies only
- **depth=2**: Two levels
- **depth=3+**: Extended chain (max 5)
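
Depth-limited traversal is a standard breadth-first search. The following is a minimal sketch over a hypothetical in-memory graph (the `graph` mapping and `downstream` function are assumptions, not AnomalyArmor's lineage API); it caps depth at the documented maximum of 5.

```python
from collections import deque

MAX_DEPTH = 5  # traversal limit stated above


def downstream(graph, start, depth=1):
    """Breadth-first search over consumers, returning {table: depth_found}."""
    depth = min(depth, MAX_DEPTH)
    found = {}
    queue = deque([(start, 0)])
    while queue:
        table, level = queue.popleft()
        if level == depth:
            continue  # do not expand beyond the requested depth
        for consumer in graph.get(table, []):
            if consumer not in found and consumer != start:
                found[consumer] = level + 1
                queue.append((consumer, level + 1))
    return found


# Hypothetical lineage graph: table -> tables that read from it.
graph = {
    "customers": ["gold.customer_metrics", "gold.order_summary"],
    "gold.customer_metrics": ["ml.churn_features"],
    "ml.churn_features": ["reporting.retention"],
}

direct = downstream(graph, "customers", depth=1)
two_levels = downstream(graph, "customers", depth=2)
```

Upstream traversal works the same way with the edges reversed.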

## Example Usage

### Find Downstream Consumers

```
You: What depends on the customers table?

Claude: Found 7 downstream tables that depend on customers:

Direct consumers (depth 1):
- gold.customer_metrics
- gold.order_summary
- reporting.customer_dashboard

Indirect consumers (depth 2):
- ml.churn_features
- ml.ltv_predictions
- analytics.weekly_cohorts
- external.bi_tool_view

Any changes to customers will affect these tables.
```

### Find Upstream Sources

```
You: Where does the orders_mart table get its data?

Claude: orders_mart has 4 upstream sources:

Direct sources:
- staging.orders_raw
- staging.customers
- staging.products

Root sources:
- raw.events

Data flow:
raw.events -> staging.orders_raw -> gold.orders_mart
```

### Impact Analysis

```
You: If I change the users table schema, what's affected?

Claude: Impact analysis for users:

Directly affected (12 tables):
- gold.user_metrics
- gold.user_activity
- reporting.user_dashboard
- ml.user_features
...

Indirectly affected (5 tables):
- analytics.monthly_actives
- reporting.exec_summary
...

Total: 17 tables may need updates.

Recommendation: Test changes in staging before production.
```

## Common Questions

### Where does lineage come from, query logs or dbt manifests?

AnomalyArmor parses query history from your warehouse to build table-level lineage automatically. If you use dbt, manifest-based lineage can be ingested on top for more complete column-level detail. See [lineage](/lineage/overview).

### How deep can I trace dependencies in one query?

Up to 5 levels. Ask for `depth=3` or "show me 5 levels deep" to extend traversal. Deeper queries take longer to return, so start with depth 2 unless you're doing full impact analysis.

## Related Skills

- **[Investigate](/ai-agents/skills/investigate)**: Root cause analysis
- **[Status](/ai-agents/skills/status)**: Health overview

---

<!-- Source: https://docs.anomalyarmor.ai/ai-agents/skills/investigate -->

# Investigate Skill

> Root cause analysis with /armor:investigate

The `/armor:investigate` skill performs root cause analysis by combining lineage, intelligence, and historical data.

## Usage

```
/armor:investigate
```

Or ask naturally:
- "Why is this table stale?"
- "What caused this alert?"
- "Debug the orders pipeline issue"

## Investigation Workflow

1. **Gather Context**: Check current status, recent alerts, freshness
2. **Trace Dependencies**: Use lineage to find upstream issues
3. **Analyze Patterns**: Check historical data for recurring problems
4. **Recommend Actions**: Suggest fixes based on findings
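
Step 2 is essentially a walk up the lineage graph. A minimal sketch, assuming a hypothetical `upstream` mapping and per-table `status` dictionary (neither is an AnomalyArmor data structure): follow stale parents until you reach a stale table whose own sources are all fresh.

```python
def find_root_causes(table, upstream, status):
    """Return the stale tables furthest upstream of `table`.

    A root cause here is a stale table that has no stale parents.
    """
    roots = []
    seen = set()
    stack = [table]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stale_parents = [p for p in upstream.get(current, []) if status.get(p) == "STALE"]
        if stale_parents:
            stack.extend(stale_parents)  # keep walking upstream
        elif status.get(current) == "STALE":
            roots.append(current)        # stale, but all parents are fresh
    return roots


upstream = {
    "orders": ["staging.orders_raw"],
    "staging.orders_raw": ["raw.events"],
}
status = {"orders": "STALE", "staging.orders_raw": "STALE", "raw.events": "FRESH"}

roots = find_root_causes("orders", upstream, status)
```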

## Example Usage

### Investigate Stale Table

```
You: Why is the orders table stale?

Claude: Investigating orders table...

Current Status:
- Status: STALE
- Last update: Yesterday 6:00 AM
- Expected: Today 6:00 AM
- Delay: 24 hours

Root Cause Analysis:
Tracing upstream dependencies...

Found root cause: staging.orders_raw

Dependency chain:
  raw.events (FRESH)
    -> staging.orders_raw (STALE - 24h)
      -> orders (STALE - 24h)

The staging.orders_raw table stopped updating at 5:45 AM yesterday.
This matches the time of an ETL job failure alert.

Recommendation:
1. Check Airflow logs for the orders_etl DAG
2. Look for connection issues to source database
3. Once fixed, data should propagate within 1 hour
```

### Investigate Alert

```
You: What caused the schema change alert?

Claude: Investigating schema change alert...

Alert Details:
- Table: warehouse.public.users
- Change: Column 'phone_number' added
- Detected: Tuesday 3:30 PM
- Status: Pending review

Change Analysis:
- Column type: VARCHAR(20)
- Nullable: YES
- No default value

Impact Assessment:
- 3 downstream tables may need updates
- No breaking changes detected (additive only)
- Compatible with existing queries

Historical Context:
- Similar changes happened 2 months ago (loyalty_tier added)
- Pattern suggests feature releases trigger schema updates

Recommendation:
1. Acknowledge the change if expected
2. Update downstream ETL if needed
3. No immediate action required
```

## Common Questions

### Can the investigate skill find root causes outside AnomalyArmor, like Airflow or dbt failures?

No. The agent stops at the edge of AnomalyArmor's visibility, so it can identify which upstream table is stale and when it stopped updating, but it can't read your orchestrator's logs. It does surface the ETL time window and affected tables so you know where to look.

## Related Skills

- **[Lineage](/ai-agents/skills/lineage)**: Trace dependencies
- **[Alerts](/ai-agents/skills/alerts)**: Alert details
- **[Status](/ai-agents/skills/status)**: Health overview

---

<!-- Source: https://docs.anomalyarmor.ai/ai-agents/skills/profile -->

# Profile Skill

> Analyze table statistics with /armor:profile

The `/armor:profile` skill analyzes table and column statistics, distributions, and data characteristics.

## Usage

```
/armor:profile
```

Or ask naturally:
- "Profile the orders table"
- "Show column statistics"
- "What's the cardinality of status column?"
- "Show null rates"

## Profiling Metrics

### Table Level
- Row count
- Last update time
- Growth trends

### Column Level
- Null rate
- Distinct count (cardinality)
- Min/max values
- Value distributions
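
The column-level metrics are simple aggregates. As a sketch of what each one means, the hypothetical `profile_column` helper below computes them for a list of values in memory; an actual profiler would run equivalent queries in the warehouse.

```python
from collections import Counter


def profile_column(values):
    """Compute null rate, distinct count, min/max, and value distribution."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    return {
        "null_rate": (len(values) - len(present)) / len(values) if values else 0.0,
        "distinct": len(counts),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        # Share of each value among the non-null values.
        "distribution": {k: n / len(present) for k, n in counts.items()},
    }


status_profile = profile_column(["shipped", "delivered", "shipped", None, "pending"])
```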

## Example Usage

### Profile a Table

```
You: Profile the orders table

Claude: Table Profile: warehouse.public.orders

Summary:
- Row Count: 1,234,567
- Last Updated: 2 hours ago
- Daily Growth: +2,500 rows

Column Statistics:

order_id:
  Null Rate: 0%
  Distinct: 1,234,567 (100% unique)

customer_id:
  Null Rate: 0.1%
  Distinct: 45,678 customers

status:
  Null Rate: 0%
  Distinct: 5 values
  Distribution: pending (12%), processing (8%),
                shipped (35%), delivered (40%), cancelled (5%)

created_at:
  Null Rate: 0%
  Range: 2023-01-01 to 2026-01-31

amount:
  Null Rate: 0%
  Min: $0.99, Max: $12,500, Avg: $156.32
```

### View Trends

```
You: Show row count trend for orders

Claude: Row Count Trend (last 7 days):

Jan 25: 1,200,000
Jan 26: 1,210,000 (+10,000)
Jan 27: 1,220,000 (+10,000)
Jan 28: 1,225,000 (+5,000)
Jan 29: 1,230,000 (+5,000)
Jan 30: 1,232,000 (+2,000)
Jan 31: 1,234,567 (+2,567)

Average daily growth: ~5,760 rows
Trend: Growth slowing over the week
```
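
Average daily growth over a window is the mean of the day-over-day differences, which equals the total change divided by the number of intervals. A small sketch using the daily counts from the example above (the `average_daily_growth` helper is illustrative, not part of the skill):

```python
def average_daily_growth(daily_counts):
    """Mean day-over-day change across consecutive daily row counts."""
    if len(daily_counts) < 2:
        raise ValueError("need at least two daily counts")
    deltas = [today - prev for prev, today in zip(daily_counts, daily_counts[1:])]
    return sum(deltas) / len(deltas)


# Seven daily counts give six day-over-day intervals.
counts = [1_200_000, 1_210_000, 1_220_000, 1_225_000, 1_230_000, 1_232_000, 1_234_567]
growth = average_daily_growth(counts)
```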

## Related Skills

- **[Quality](/ai-agents/skills/quality)**: Add quality checks
- **[Coverage](/ai-agents/skills/coverage)**: Monitoring gaps

---

<!-- Source: https://docs.anomalyarmor.ai/ai-agents/skills/coverage -->

# Coverage Skill

> Find monitoring gaps with /armor:coverage

The `/armor:coverage` skill analyzes what's being monitored and identifies gaps in your data observability coverage.

## Usage

```
/armor:coverage
```

Or ask naturally:
- "What am I monitoring?"
- "What tables have no alerts?"
- "Show me monitoring gaps"
- "Which tables need freshness checks?"

## Coverage Categories

- **Freshness**: Tables with/without freshness schedules
- **Schema**: Tables with/without schema baselines
- **Metrics**: Tables with/without data quality metrics
- **Alerts**: Tables with/without alert rules
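
Coverage for each category is the share of tables that have that monitoring type configured. A minimal sketch, assuming a hypothetical inventory that maps each table to the set of monitoring types enabled on it:

```python
def coverage_by_category(tables):
    """Percent of tables (rounded) that have each monitoring type enabled."""
    categories = ("freshness", "schema", "metrics", "alerts")
    total = len(tables)
    return {
        c: round(100 * sum(1 for enabled in tables.values() if c in enabled) / total)
        if total else 0
        for c in categories
    }


# Hypothetical inventory: table -> monitoring types configured.
tables = {
    "gold.orders": {"freshness", "schema", "metrics", "alerts"},
    "gold.customers": {"freshness", "schema"},
    "staging.temp_orders": {"schema"},
    "raw.events_backup": set(),
}

coverage = coverage_by_category(tables)
```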

## Example Usage

### Coverage Overview

```
You: What's my monitoring coverage?

Claude: Monitoring Coverage Summary:

Total Assets: 45

Coverage by Type:
- Freshness: 38 (84%)
- Schema: 42 (93%)
- Metrics: 35 (78%)
- Alerts: 28 (62%)

By Environment:
- Production: 100% covered
- Staging: 75% covered
- Raw/Landing: 50% covered

Overall Score: B+ (Good, but room for improvement)
```

### Find Unmonitored Tables

```
You: What tables have no monitoring?

Claude: Found 7 unmonitored tables:

Raw Layer:
- raw.events_backup
- raw.legacy_data

Staging:
- staging.temp_orders
- staging.test_data

Archive:
- archive.orders_2023
- archive.customers_2023

Sandbox:
- sandbox.dev_analysis

Recommendation:
1. Add freshness monitoring to raw.events_backup (production data)
2. Review if archive tables need monitoring
3. Consider excluding sandbox from coverage metrics
```

### Prioritize Gaps

```
You: What critical tables need monitoring?

Claude: Critical Tables Missing Monitoring:

HIGH PRIORITY (Production):
1. raw.events_backup
   - No freshness monitoring
   - No quality metrics
   - Recommendation: Add immediately

2. gold.financial_summary
   - Has freshness, missing quality checks
   - Recommendation: Add row count + null checks

MEDIUM PRIORITY:
3. staging.customer_enriched
   - Missing schema baseline
   - Recommendation: Create baseline

Coverage improvement: +12% by addressing these 3 tables
```

## Common Questions

### What counts as a 'good' coverage score?

Production tables should be at 100%. Staging in the 70-90% range is typical. Raw/landing and archive tables often don't need full coverage, so anything above 50% there is usually fine. The overall letter grade weights production tables most heavily.

### How do I exclude sandbox or archive tables from my coverage score?

Tag them with a governance tag (like `sandbox` or `archive`) using [`/armor:tags`](/ai-agents/skills/tags). The coverage skill respects exclusion tags so your score reflects only the tables you actually care about.
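
The effect of exclusion tags on a score can be sketched as a filter applied before counting. The tag names, record shape, and `coverage_score` function below are assumptions for illustration; how AnomalyArmor actually weights its score is not shown here.

```python
EXCLUSION_TAGS = {"sandbox", "archive"}  # assumed tag names, taken from the example above


def coverage_score(tables):
    """Percent of non-excluded tables that have any monitoring at all."""
    in_scope = [t for t in tables if not (t["tags"] & EXCLUSION_TAGS)]
    if not in_scope:
        return 0.0
    monitored = sum(1 for t in in_scope if t["monitored"])
    return 100 * monitored / len(in_scope)


tables = [
    {"name": "gold.orders", "tags": set(), "monitored": True},
    {"name": "gold.customers", "tags": set(), "monitored": True},
    {"name": "sandbox.dev_analysis", "tags": {"sandbox"}, "monitored": False},
    {"name": "archive.orders_2023", "tags": {"archive"}, "monitored": False},
]

score = coverage_score(tables)
```

Without the exclusion filter the same inventory would score 50%, because the two unmonitored tables would count against it.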

## Related Skills

- **[Monitor](/ai-agents/skills/monitor)**: Set up monitoring
- **[Quality](/ai-agents/skills/quality)**: Add quality checks
- **[Status](/ai-agents/skills/status)**: Health overview

---

<!-- Source: https://docs.anomalyarmor.ai/ai-agents/skills/recommend -->

# Recommend Skill

> Get AI-driven monitoring recommendations with /armor:recommend

The `/armor:recommend` skill analyzes your data assets and suggests what to monitor, which tables are most critical, and what thresholds to use.

## Usage

```
/armor:recommend
```

Or ask naturally:
- "What should I monitor?"
- "Suggest tables to add freshness monitoring"
- "What are good thresholds for my orders table?"
- "Which tables are most critical to monitor?"
- "What's missing from my monitoring setup?"

## What It Does

The recommend skill uses three recommendation engines:

### Freshness Recommendations
Analyzes table update patterns and suggests monitoring intervals:
- Identifies tables that update regularly but aren't monitored
- Suggests check intervals based on historical update frequency
- Prioritizes tables by importance (row count, naming patterns, downstream dependents)

### Metric Recommendations
Scans column types and patterns to suggest data quality metrics:
- Identifies columns that should have null checks
- Suggests row count monitoring for high-volume tables
- Recommends format validation for email, phone, and ID columns

### Coverage Recommendations
Provides a full view of monitoring gaps:
- Shows your coverage score and current tier
- Lists unmonitored tables ranked by importance
- Suggests which monitoring types to add for each table

## Example

```
You: What should I monitor on my production database?

Agent: I'll analyze your database for monitoring recommendations.

Based on your 45 tables, here are the top recommendations:

Freshness (8 tables need monitoring):
  1. gold.orders - updates hourly, suggest 2h threshold
  2. gold.customers - updates daily, suggest 26h threshold
  3. gold.payments - updates every 30m, suggest 1h threshold

Metrics (5 tables need row count tracking):
  1. gold.orders - 2.3M rows, high-traffic table
  2. gold.events - 15M rows, event stream
  3. gold.transactions - 890K rows, financial data

Coverage score: 35/100 (Protected tier)
To reach Verified (50+), add freshness to 8 tables.

Want me to set these up? I can use /armor:monitor to enable them.
```

## Common Questions

### How does the recommend skill decide which tables are 'most critical'?

It ranks tables by row count, naming patterns (fact/gold tables rank higher than staging), downstream dependents from lineage, and existing tags. Tables with many consumers and no current monitoring surface first.
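
One way to picture this ranking is as a weighted score over the signals listed above. The weights, field names, and `criticality_score` function below are invented for illustration and do not reflect the skill's actual scoring.

```python
import math


def criticality_score(table):
    """Combine the ranking signals into one score; all weights are illustrative."""
    score = math.log10(table["row_count"] + 1)      # larger tables rank higher
    if any(p in table["name"] for p in ("gold.", "fact_")):
        score += 2                                  # fact/gold naming pattern
    score += 0.5 * table["downstream_count"]        # dependents from lineage
    if table["tags"]:
        score += 1                                  # existing tags
    if not table["monitored"]:
        score += 3                                  # unmonitored tables surface first
    return score


tables = [
    {"name": "staging.temp_orders", "row_count": 5_000, "downstream_count": 0,
     "tags": set(), "monitored": False},
    {"name": "gold.customers", "row_count": 45_000, "downstream_count": 3,
     "tags": set(), "monitored": True},
    {"name": "gold.orders", "row_count": 2_300_000, "downstream_count": 6,
     "tags": {"finance"}, "monitored": False},
]

ranked = sorted(tables, key=criticality_score, reverse=True)
```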

### Do the suggested thresholds actually fit my data, or are they generic?

They're computed from each table's own update history, not defaults. The skill looks at recent gaps between updates and picks a threshold that would not have fired during normal variation. You can dry-run any suggestion with [`/armor:test`](/ai-agents/skills/test) before enabling.
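
One plausible way to derive a threshold from update history is to take the longest observed gap and add headroom. The sketch below is an assumption about the general approach, not AnomalyArmor's actual algorithm; `suggest_threshold` and the 25% `headroom` default are hypothetical.

```python
from datetime import datetime, timedelta


def suggest_threshold(update_times, headroom=0.25):
    """Suggest a freshness threshold: longest gap between updates plus headroom."""
    ordered = sorted(update_times)
    if len(ordered) < 2:
        raise ValueError("need at least two updates to measure a gap")
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return max(gaps) * (1 + headroom)


start = datetime(2026, 1, 30, 0, 0)
# Updates roughly hourly, with one 96-minute gap.
updates = [start + timedelta(minutes=m) for m in (0, 60, 115, 180, 276, 330)]
threshold = suggest_threshold(updates)
```

Because the threshold sits above every gap in the history, it would not have fired on any of the observed updates.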

## Related Skills

- **[Monitor](/ai-agents/skills/monitor)**: Enable the recommended monitoring
- **[Coverage](/ai-agents/skills/coverage)**: View your overall coverage score
- **[Test](/ai-agents/skills/test)**: Preview configurations before enabling

---

<!-- Source: https://docs.anomalyarmor.ai/ai-agents/skills/test -->

# Test Skill

> Preview monitoring configurations before enabling with /armor:test

The `/armor:test` skill lets you dry-run monitoring configurations before enabling them. Preview which alerts would fire so you can avoid alert fatigue.

## Usage

```
/armor:test
```

Or ask naturally:
- "Test this freshness threshold before I enable it"
- "What alerts would fire with a 4-hour threshold?"
- "Dry-run the schema drift check"
- "Preview the impact of this alert rule"
- "Will this configuration cause too many alerts?"

## What It Does

### Freshness Dry-Run
Preview what would happen with a proposed freshness threshold:
- Compares the threshold against historical update patterns
- Shows how many times the table would have been flagged as stale
- Helps you pick a threshold that catches real issues without false positives
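
A freshness dry-run boils down to counting historical gaps that exceed the proposed threshold. A minimal sketch, assuming you have a list of past update timestamps; `count_stale_events` is a hypothetical helper, not the skill's implementation.

```python
from datetime import datetime, timedelta


def count_stale_events(update_times, threshold):
    """Count gaps between consecutive updates that exceed the threshold."""
    ordered = sorted(update_times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return sum(1 for gap in gaps if gap > threshold)


start = datetime(2026, 3, 1, 0, 0)
# Roughly hourly updates, except one gap of 3h 42m (222 minutes).
offsets_minutes = [0, 60, 120, 342, 400, 460]
history = [start + timedelta(minutes=m) for m in offsets_minutes]

fires_at_4h = count_stale_events(history, timedelta(hours=4))
fires_at_2h = count_stale_events(history, timedelta(hours=2))
```

Comparing the counts for several candidate thresholds shows where false positives start to appear.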

### Schema Drift Preview
Test schema monitoring before enabling:
- Compares current schema against a baseline
- Shows what changes would be detected
- Estimates alert frequency based on historical schema change rate
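
Comparing a current schema against a baseline is a set difference over column names plus a type comparison. The sketch below assumes schemas are represented as `{column: type}` dictionaries, which is an illustrative simplification, and `diff_schema` is a hypothetical helper.

```python
def diff_schema(baseline, current):
    """Compare two {column: type} mappings; report added, removed, and retyped columns."""
    added = sorted(set(current) - set(baseline))
    removed = sorted(set(baseline) - set(current))
    retyped = sorted(
        col for col in set(baseline) & set(current) if baseline[col] != current[col]
    )
    return {"added": added, "removed": removed, "type_changed": retyped}


baseline = {"id": "INTEGER", "email": "VARCHAR(255)", "created_at": "TIMESTAMP"}
current = {"id": "BIGINT", "email": "VARCHAR(255)", "phone_number": "VARCHAR(20)"}

changes = diff_schema(baseline, current)
```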

### Alert Rule Preview
Preview which alerts would fire with proposed rule settings:
- Test event type filters and severity levels
- See which recent events would match the rule
- Estimate daily alert volume
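
An alert rule preview can be thought of as replaying recent events through the rule's filters. In this sketch the `Event` record, its fields, and `preview_rule` are hypothetical, and daily volume is estimated as matches divided by the number of distinct days observed.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical recent event; fields are assumptions for illustration."""
    event_type: str   # e.g. "freshness_violation", "schema_change"
    severity: str     # "critical", "warning", or "info"
    day: str          # ISO date on which the event occurred


def preview_rule(events, event_types, severities):
    """Return the matching events and the average number of matches per observed day."""
    matches = [e for e in events if e.event_type in event_types and e.severity in severities]
    days = {e.day for e in events}
    per_day = len(matches) / len(days) if days else 0.0
    return matches, per_day


events = [
    Event("freshness_violation", "critical", "2026-03-14"),
    Event("freshness_violation", "warning", "2026-03-14"),
    Event("schema_change", "info", "2026-03-15"),
    Event("freshness_violation", "info", "2026-03-15"),
]

matches, per_day = preview_rule(events, {"freshness_violation"}, {"warning", "critical"})
```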

## Example

```
You: Test a 4-hour freshness threshold on gold.orders

Agent: I'll dry-run that threshold against recent data.

Dry-run results for gold.orders with 4h threshold:
  - Last 30 days: 0 stale events would have fired
  - Average gap between updates: 55 minutes
  - Longest gap: 3h 42m (March 15, during maintenance)
  - Recommendation: 4h is a good threshold. It catches real
    staleness without firing during normal variation.

Want me to enable this? I'll use /armor:monitor to set it up.
```

## Common Questions

### Does the test skill actually create or modify any monitoring?

No. A dry-run reads historical data to simulate outcomes and returns a preview. Nothing is created until you confirm and run [`/armor:monitor`](/ai-agents/skills/monitor) or the relevant enabling skill.

## Related Skills

- **[Recommend](/ai-agents/skills/recommend)**: Get suggestions for what to monitor
- **[Monitor](/ai-agents/skills/monitor)**: Enable monitoring after testing
- **[Quality](/ai-agents/skills/quality)**: Set up data quality checks

---

<!-- Source: https://docs.anomalyarmor.ai/releases/changelog -->

# Changelog

> Product updates, new features, and improvements to AnomalyArmor

Stay up to date with the latest improvements to AnomalyArmor.

## Recent Highlights

| Date | Feature | What It Means for You |
|------|---------|----------------------|
| Mar 15 | Agent Tool Consolidation | 47 tools consolidated to 18 for faster, more reliable agent conversations |
| Mar 15 | Object Intelligence | Full object-level profiles with change history, KB context, and quality alerting |
| Mar 15 | Alert Schedules | Operating schedules and blackout windows to silence alerts during maintenance |
| Mar 15 | Dead Code Cleanup | ~31,700 lines removed for a leaner, faster codebase |
| Mar 15 | Lifecycle Emails | Dedicated HTML templates for all lifecycle emails with consistent branding |

> ℹ️ Changes older than 6 months have been archived.