# AnomalyArmor Documentation (Full Content)

> AnomalyArmor is a data observability platform that detects schema changes, data freshness issues, and quality problems before they impact your business. This file concatenates every documentation page in navigation order, including all Q&A / Common Questions sections, for LLM ingestion.

> Index: https://docs.anomalyarmor.ai/llms.txt

---

# Introduction

> AnomalyArmor - Data observability for modern data teams

AnomalyArmor is a data observability platform that helps data teams detect issues before they become incidents. Know when schemas change, data goes stale, or pipelines break, before your stakeholders do.

*AnomalyArmor: Data sources to alerts in one platform*

## The Problem We Solve

Data teams face a common challenge: **you don't know what you don't know**.

- A column gets dropped upstream → your dbt models fail at 3am
- An ETL job silently stops → dashboards show stale data for hours
- A data type changes → aggregations return wrong results for days
- A new table appears → you find out when someone asks about it

These aren't edge cases. They're Tuesday.

**AnomalyArmor gives you visibility** so you can catch issues when they happen, not when your CFO asks why the numbers are wrong.

## What is Data Observability?

Data observability brings the principles of application monitoring to your data infrastructure. Just as you wouldn't run a production application without alerts for errors and latency, you shouldn't run data pipelines without visibility into data health.

## Key Features

- **[Schema Monitoring](/schema-monitoring/overview)**: Automatically discover tables and detect schema changes before they break downstream systems. Know immediately when columns are added, removed, or modified.
- **[Freshness Monitoring](/data-quality/freshness-monitoring)**: Set SLAs on data freshness and get alerted when data is stale or delayed. Never show yesterday's data on today's dashboard.
- **[Data Quality Metrics](/data-quality/metrics)**: Track row counts, null percentages, distinct values, and more. Detect anomalies automatically using statistical analysis.
- **[Intelligent Alerts](/alerts/overview)**: Route alerts to Slack, email, PagerDuty, or webhooks. Configure rules to filter noise and escalate what matters.
- **[AI-Powered Insights](/intelligence/overview)**: Get AI-generated documentation for your tables and columns. Ask natural language questions about your data catalog.
- **[Report Badges](/data-quality/report-badges)**: Embed data quality status in Databricks, Looker, Notion, and other tools. See pass/fail at a glance.

## Who Uses AnomalyArmor?

### Data Engineers

**Your challenge**: Pipelines break when upstream schemas change. You're the first call when data is wrong.

**How AnomalyArmor helps**:

- Get alerted to schema changes before your pipelines run
- Understand what changed and when
- Reduce time-to-resolution for data incidents

### Analytics Engineers

**Your challenge**: dbt models fail mysteriously. You spend hours debugging "column not found" errors.

**How AnomalyArmor helps**:

- Know when source tables change
- Track which models depend on changed assets
- Document your data models with AI assistance

### Platform Teams

**Your challenge**: Multiple teams, multiple databases, no central visibility.

**How AnomalyArmor helps**:

- Single pane of glass across all data sources
- Centralized alerting with team-based routing
- Audit trail for compliance requirements

### Data Leaders

**Your challenge**: Stakeholders don't trust the data. You need to reduce data incidents.
**How AnomalyArmor helps**:

- Proactive monitoring reduces incidents by catching issues early
- SLA tracking demonstrates data reliability
- Centralized catalog improves data discovery

## Supported Data Sources

AnomalyArmor connects to your existing data infrastructure with no agents to install:

| Data Source | Support Level | Key Features |
|-------------|---------------|--------------|
| **PostgreSQL** | Full | Tables, views, schemas, constraints |
| **Databricks** | Full | Unity Catalog, Delta tables, views |
| **ClickHouse** | Full | Tables, materialized views, dictionaries |

> 💡 More data sources coming soon. [Contact us](mailto:support@anomalyarmor.ai) if you need a specific connector.

## How It Works

AnomalyArmor follows a simple, secure model:

1. **Connect**: Add your database with read-only credentials
2. **Discover**: We scan your metadata (never your actual data)
3. **Monitor**: Scheduled scans detect changes automatically
4. **Alert**: Get notified through your preferred channels

## Security First

AnomalyArmor is built with security as a foundation:

- **Metadata only**: We never access your actual data
- **Encrypted everywhere**: TLS 1.3 in transit, AES-256 at rest
- **Minimal access**: Read-only, dedicated database users recommended

## Getting Started

Ready to try AnomalyArmor? You can be monitoring your first database in under 15 minutes.

- **[Quickstart Guide](/quickstart/overview)**: Step-by-step guide to connect your first database and set up monitoring
- **[Core Concepts](/concepts/overview)**: Understand assets, discovery, schema changes, and freshness

## Common Questions

### Does AnomalyArmor access my actual data?

No. AnomalyArmor only queries metadata (system catalogs like `information_schema` and `pg_catalog`) and bounded aggregates (`COUNT`, `MIN`, `MAX`, `AVG`, distinct counts). It never runs `SELECT *` or reads raw row values. The open-source [Query Gateway](/security/query-gateway) enforces this on every query.
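To make the "aggregates only, fail closed" idea in the answer above concrete, here is a minimal sketch of an allowlist check. It is not the Query Gateway's actual implementation: the function name `is_allowed`, the regular expressions, and the deliberately narrow query shape (a single `SELECT ... FROM table` with no `WHERE` clause) are all assumptions made for illustration, and metadata queries against system catalogs are out of scope here.

```python
import re

# Aggregates named in the answer above; anything else is rejected.
ALLOWED_AGGREGATES = {"COUNT", "MIN", "MAX", "AVG"}

# Hypothetical, intentionally narrow grammar: SELECT <exprs> FROM <table>.
_SELECT_RE = re.compile(
    r"^\s*SELECT\s+(?P<cols>.+?)\s+FROM\s+[\w.]+\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)
# One aggregate call such as COUNT(*), MAX(col), COUNT(DISTINCT col) AS n.
_AGG_RE = re.compile(
    r"^(?P<fn>[A-Za-z_]+)\(\s*(DISTINCT\s+)?(\*|[\w.]+)\s*\)(\s+AS\s+\w+)?$",
    re.IGNORECASE,
)


def is_allowed(sql: str) -> bool:
    """Return True only if every projected expression is an allowed aggregate."""
    match = _SELECT_RE.match(sql)
    if match is None:
        return False  # fail closed: anything we cannot parse is blocked
    for expr in match.group("cols").split(","):
        agg = _AGG_RE.match(expr.strip())
        if agg is None or agg.group("fn").upper() not in ALLOWED_AGGREGATES:
            return False  # raw columns, SELECT *, or unknown functions
    return True


if __name__ == "__main__":
    for query in (
        "SELECT COUNT(*), MAX(updated_at) FROM public.orders",
        "SELECT * FROM public.orders",
        "DELETE FROM public.orders",
    ):
        print(query, "->", "allowed" if is_allowed(query) else "blocked")
```

The design choice worth noting is the default: the function returns `False` unless a query positively matches the allowlist, so unfamiliar syntax is blocked rather than passed through.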
### What databases does AnomalyArmor support?

PostgreSQL (12+), Databricks (with Unity Catalog), and ClickHouse are fully supported today. BigQuery, Snowflake, Redshift, MySQL, and SQL Server are in active development. See [Data Sources](/data-sources/overview) for the current list and [contact us](mailto:support@anomalyarmor.ai) if you need a specific connector.

### How long does it take to set up AnomalyArmor?

About 15-20 minutes to connect your first database and get your first alert. The [quickstart](/quickstart/overview) walks through the four steps: connect, discover, alert, metric.

### Do I need to install anything in my infrastructure?

No. AnomalyArmor is fully managed SaaS. Connections go outbound from our cloud to your database using the credentials you provide. Nothing runs on your side.

### Will AnomalyArmor slow down my production database?

Impact is minimal. Queries hit only system tables and run lightweight aggregates on a schedule you control. Typical discovery completes in under 60 seconds for up to 1,000 tables. You can point AnomalyArmor at a read replica if you want zero impact on the primary.

### Can I try AnomalyArmor without talking to sales?

Trials are available by invitation. If you received a cold email or a trial code, use the link provided. Otherwise, start from the [homepage try-it form](https://www.anomalyarmor.ai/) or email [hello@anomalyarmor.ai](mailto:hello@anomalyarmor.ai). Trials are 14 days with no credit card; there is no self-serve trial without a code and no free tier.

## Need Help?

- **[Support](/support/contact)**: Contact our team for help with your account.
- **[FAQ](/troubleshooting/faq)**: Answers to common questions.

---

# How AnomalyArmor Works

> Understand the architecture and data flow behind AnomalyArmor's data observability platform

AnomalyArmor provides visibility into your data infrastructure without accessing your actual data.
We connect to your databases, discover your assets, track data quality metrics, use AI to document your catalog, and alert you when something needs attention.

*AnomalyArmor system architecture*

## Core Architecture

AnomalyArmor operates on a simple principle: **metadata and aggregates only, never raw row values**. We query your database's information schema for structure, and run bounded aggregate queries (`COUNT`, `MIN`, `MAX`, `SUM`, `AVG`, `PERCENTILE`, distinct-value counts) for monitoring - never `SELECT *` or row-level reads of sensitive columns. The Query Gateway enforces this at every query. See [Query Gateway](/security/query-gateway) for the full policy.

### What We Access

| We Access | We Never Access |
|-----------|-----------------|
| Table and view names | Individual row data |
| Column names and types | `SELECT *` or raw column values |
| Schema metadata | PII or sensitive content |
| Timestamp columns (for freshness) | Raw business records |
| Index and constraint definitions | Query results beyond aggregates |
| Aggregate statistics (counts, min/max, percentiles) | |

This gives you value-level monitoring - row counts, freshness, distribution shifts - with minimal data exposure.

## The Data Flow

AnomalyArmor follows a six-stage pipeline from connection to notification:

*Data flow from discovery to alerting*

### Stage 1: Discovery

When you connect a database, AnomalyArmor runs a **discovery job** that:

1. Connects securely using your provided credentials
2. Queries the information schema (`pg_catalog`, `information_schema`, etc.)
3. Catalogs all tables, views, columns, and their data types
4. Records the current state as a baseline

Discovery can run on a schedule (hourly, daily) or on-demand. After the initial full discovery, subsequent runs are **incremental**, only detecting what changed since the last run.
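The discovery steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not AnomalyArmor's internal code: `DISCOVERY_SQL` uses the standard `information_schema.columns` view, while the function name `build_snapshot`, the schema filter, and the snapshot shape (`"schema.table"` mapped to `{column: data_type}`) are hypothetical choices.

```python
from collections import defaultdict

# Standard information_schema query (metadata only, no table data).
# The excluded schema names are an illustrative assumption.
DISCOVERY_SQL = """
SELECT table_schema, table_name, column_name, data_type
FROM information_schema.columns
WHERE table_schema NOT IN ('pg_catalog', 'information_schema')
ORDER BY table_schema, table_name, ordinal_position
"""


def build_snapshot(rows):
    """Group (schema, table, column, data_type) rows into a baseline snapshot."""
    snapshot = defaultdict(dict)
    for schema, table, column, data_type in rows:
        snapshot[f"{schema}.{table}"][column] = data_type
    return dict(snapshot)


if __name__ == "__main__":
    # Rows shaped like the result of DISCOVERY_SQL; values are made up.
    sample_rows = [
        ("public", "orders", "id", "integer"),
        ("public", "orders", "created_at", "timestamp with time zone"),
        ("public", "users", "id", "integer"),
    ]
    for table, columns in build_snapshot(sample_rows).items():
        print(table, columns)
```

In practice the rows would come from executing `DISCOVERY_SQL` through a database driver with the read-only credentials; the sample rows stand in for that result so the sketch runs without a database.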
### Stage 2: Asset Catalog

Discovered assets are stored in your **Asset Catalog**, a searchable inventory of everything in your data infrastructure:

- **Tables and Views**: With column details, data types, and relationships
- **Metadata**: Last updated timestamps, row counts (if available), descriptions
- **History**: Full audit trail of every change detected

The catalog becomes your source of truth for understanding what exists across all connected databases.

### Stage 3: Change Detection

On each discovery run, AnomalyArmor compares the current state against the previous state to detect:

- **Schema drift**: Columns added, removed, renamed, or type-changed
- **New assets**: Tables or views that didn't exist before
- **Removed assets**: Tables or views that no longer exist
- **Freshness violations**: Data not updated within expected SLA

Each detected change is recorded with timestamp, type, before/after values, and affected asset details.

### Stage 4: Data Quality Metrics

Beyond schema monitoring, AnomalyArmor tracks statistical properties of your data:

| Metric | What It Measures |
|--------|------------------|
| Row count | Total rows in a table |
| Null percentage | Missing values in a column |
| Distinct count | Unique values |
| Min/max/avg | Numeric column statistics |

Metrics run on a schedule you configure. AnomalyArmor builds baselines from historical values and detects anomalies when new values fall outside expected ranges.

### Stage 5: Intelligence

AnomalyArmor uses AI to help you understand your data catalog:

- **Natural language search**: Ask "where is customer data?" instead of clicking through schemas
- **Auto-generated descriptions**: Get human-readable documentation for tables and columns
- **Pattern detection**: Identify PII, relationships, and naming conventions

Intelligence analyzes your schema metadata only. It never reads actual data values.
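Looking back at Stage 3, the comparison of a previous snapshot against the current one can be sketched as a plain dictionary diff. This is a simplified illustration, not the product's detection logic: the function name `diff_snapshots`, the snapshot shape (`"schema.table"` mapped to `{column: data_type}`), and the change labels are assumptions. A name-based diff like this cannot recognize a rename; it reports one as a removal plus an addition.

```python
def diff_snapshots(previous, current):
    """Return (change_type, table, column) tuples between two snapshots."""
    changes = []
    # Tables present now but not before, and vice versa.
    for table in sorted(current.keys() - previous.keys()):
        changes.append(("asset_added", table, None))
    for table in sorted(previous.keys() - current.keys()):
        changes.append(("asset_removed", table, None))
    # Column-level drift for tables that exist in both snapshots.
    for table in sorted(previous.keys() & current.keys()):
        old_cols, new_cols = previous[table], current[table]
        for col in sorted(new_cols.keys() - old_cols.keys()):
            changes.append(("column_added", table, col))
        for col in sorted(old_cols.keys() - new_cols.keys()):
            changes.append(("column_removed", table, col))
        for col in sorted(old_cols.keys() & new_cols.keys()):
            if old_cols[col] != new_cols[col]:
                changes.append(("column_type_changed", table, col))
    return changes


if __name__ == "__main__":
    before = {"public.orders": {"id": "integer", "total": "numeric"}}
    after = {
        "public.orders": {"id": "bigint", "currency": "text"},
        "public.refunds": {"id": "integer"},
    }
    for change in diff_snapshots(before, after):
        print(change)
```

Each tuple could then be enriched with a timestamp and before/after values before being matched against alert rules.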
### Stage 6: Alerting

When changes match your configured **alert rules**, notifications are sent to your chosen destinations:

- **Slack**: Channel messages with rich formatting
- **Email**: Detailed reports to individuals or distribution lists
- **PagerDuty**: Incidents for on-call escalation
- **Microsoft Teams**: Channel notifications
- **Webhooks**: Custom integrations with any system

You control which changes trigger alerts and where they go. A dropped column in production might page on-call, while a new table in dev just goes to email.

## Security Model

AnomalyArmor uses enterprise-grade security controls:

### Query Security

Every SQL query is validated by our open-source [Query Gateway](/security/query-gateway) before execution:

- **Fail-closed validation**: Queries that can't be parsed are blocked
- **Access level enforcement**: Only metadata and aggregate queries are allowed
- **Open source**: Audit the code yourself at [github.com/anomalyarmor/anomalyarmor-query-gateway](https://github.com/anomalyarmor/anomalyarmor-query-gateway)

### Credential Storage

- All credentials encrypted at rest with **AES-256**
- Encryption keys managed via AWS KMS
- Credentials never logged or exposed in UI

### Network Security

- All connections use **TLS 1.3** encryption
- Static IP addresses available for firewall allowlisting
- Support for VPC peering and Private Link (Enterprise)

### Access Control

- Role-based access control (RBAC) for team members
- Audit logs for all administrative actions
- SSO integration (SAML 2.0) available

> ℹ️ For detailed security information, see our [Security Overview](/security/overview).

## Deployment Architecture

AnomalyArmor is a fully managed SaaS platform. There's nothing to install in your infrastructure.
*Deployment model: your databases stay in your infrastructure*

### What Runs Where

| Component | Location | Details |
|-----------|----------|---------|
| Discovery jobs | AnomalyArmor Cloud | Scheduled or on-demand |
| Asset catalog | AnomalyArmor Cloud | Your metadata stored securely |
| Metrics engine | AnomalyArmor Cloud | Scheduled metric capture and anomaly detection |
| Intelligence | AnomalyArmor Cloud | AI-powered search and documentation |
| Alert rules | AnomalyArmor Cloud | Configured via web UI |
| Database credentials | AnomalyArmor Cloud | Encrypted at rest |
| Your database | Your infrastructure | We connect outbound to you |

## Performance Impact

AnomalyArmor is designed for minimal performance impact on your databases:

- **Read-only queries**: We only run SELECT on system tables
- **Lightweight queries**: Information schema queries, not table scans
- **Configurable scheduling**: Run discovery during off-peak hours
- **Connection pooling**: Efficient connection management

Typical discovery runs complete in **under 60 seconds** for databases with up to 1,000 tables.

## Common Questions

### What exactly does AnomalyArmor query against my database?

Only metadata from system catalogs (`information_schema`, `pg_catalog`, and equivalents) and bounded aggregates on your tables: `COUNT`, `MIN`, `MAX`, `AVG`, distinct-value counts, and percentile approximations. Never `SELECT *`, never raw row reads. The [Query Gateway](/security/query-gateway) blocks anything else at execution time.

### Does AnomalyArmor need write access to my database?

No. Read-only credentials are sufficient and recommended. AnomalyArmor never issues `INSERT`, `UPDATE`, `DELETE`, or DDL statements. Create a dedicated read-only user and grant it `SELECT` on the schemas you want monitored.

### Can I point AnomalyArmor at a read replica instead of my primary database?

Yes, and it's a common setup for production workloads.
Freshness is determined by the replica's timestamps, so if your replica lags the primary by minutes, that lag is reflected in freshness checks - factor that into your SLAs.

### Where is the metadata AnomalyArmor collects stored?

In AnomalyArmor's managed cloud, encrypted at rest with AES-256 (keys managed via AWS KMS) and encrypted in transit with TLS 1.3. Credentials are stored with separate encryption and are never exposed in the UI or logs.

### Can I self-host AnomalyArmor?

AnomalyArmor is primarily offered as managed SaaS. Self-hosted deployments of the open-source Query Gateway component are available for Enterprise customers who want to run validation inside their own network. [Contact sales](mailto:sales@anomalyarmor.ai) for options.

### Does AnomalyArmor support IP allowlisting?

Yes. AnomalyArmor provides static outbound IPs you can add to your database firewall allowlist. See the current IP list in your dashboard under **Settings → Security**. VPC peering and AWS PrivateLink are available on Enterprise plans for networks that can't use IP allowlisting.

## Next Steps

Now that you understand how AnomalyArmor works, you're ready to get started:

- **[Quickstart Guide](/quickstart/overview)**: Connect your first database in under 15 minutes
- **[Core Concepts](/concepts/overview)**: Deep dive into assets, discovery, and monitoring

---

# Connect Your First Database

> Step 1 - Add a database connection to AnomalyArmor

In this step, you'll add your first database connection to AnomalyArmor. This establishes the secure link that allows us to discover and monitor your data assets.
## Before You Begin

Make sure you have:

- Database hostname, port, and credentials
- Network access from AnomalyArmor to your database (see [Quickstart Overview](/quickstart/overview))
- A read-only database user (recommended)

## Supported Databases

| Database | Version | Notes |
|----------|---------|-------|
| PostgreSQL | 12+ | Including RDS, Aurora, Supabase |
| Databricks | Unity Catalog | Requires access token |
| ClickHouse | 21+ | Including ClickHouse Cloud |

## Step-by-Step Guide

## What Happens Next

After saving your connection:

1. **Credentials are encrypted**: Stored securely with AES-256 encryption
2. **Connection is ready**: You can now run discovery to scan your database
3. **No data accessed yet**: Discovery must be triggered manually or scheduled

## Security Best Practices

> ⚠️ Always use a **read-only, dedicated user** for AnomalyArmor. Never use your application's production credentials.

### Creating a Read-Only User

### PostgreSQL

```sql
-- Create a dedicated user
CREATE USER anomalyarmor WITH PASSWORD 'your-secure-password';

-- Grant minimal required permissions
GRANT CONNECT ON DATABASE your_database TO anomalyarmor;
GRANT USAGE ON SCHEMA public TO anomalyarmor;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO anomalyarmor;

-- For future tables (optional but recommended)
ALTER DEFAULT PRIVILEGES IN SCHEMA public
  GRANT SELECT ON TABLES TO anomalyarmor;
```

### ClickHouse

```sql
-- Create a dedicated user with read-only access
CREATE USER anomalyarmor IDENTIFIED BY 'your-secure-password';

-- Grant read access to all tables
GRANT SELECT ON *.* TO anomalyarmor;
```

### Databricks

In Unity Catalog:

1. Create a service principal or use a personal access token
2. Grant `USE CATALOG` on your catalog
3. Grant `USE SCHEMA` on schemas to monitor
4. Grant `SELECT` on tables (avoid `ALL PRIVILEGES`, which grants more than read access)

## Troubleshooting

### Connection refused

**Cause**: Network connectivity issues

**Solutions**:

1. Verify the hostname and port are correct
2. Check that your database is running
3. Ensure AnomalyArmor IPs are allowlisted in your firewall/security group
4. For RDS: Verify the security group allows inbound traffic on your DB port

```bash
# Test connectivity from your machine
nc -zv your-host.rds.amazonaws.com 5432
```

### Authentication failed

**Cause**: Invalid credentials

**Solutions**:

1. Double-check username and password (copy-paste to avoid typos)
2. Verify the user exists in the database
3. Ensure the user has CONNECT permission on the database
4. Check for special characters in password that may need escaping

### SSL connection required

**Cause**: Database requires SSL but connection isn't using it

**Solutions**:

1. Set SSL Mode to `require` in the connection form
2. For RDS: SSL is required by default; ensure SSL Mode is set
3. For self-hosted: Verify SSL certificates are properly configured

### Connection timeout

**Cause**: Network latency or firewall issues

**Solutions**:

1. Verify the database is accessible from outside your VPC
2. Check for firewall rules blocking the connection
3. For private databases: Contact us about VPC peering options

### Permission denied

**Cause**: User lacks required permissions

**Solutions**:

1. Grant `SELECT` on `information_schema` tables
2. Grant `USAGE` on the schema you want to monitor
3. Verify user can access the specific database

## Common Questions

### What database permissions does AnomalyArmor need?

Read-only: `CONNECT` on the database, `USAGE` on the schemas you want monitored, and `SELECT` on tables. `ALTER DEFAULT PRIVILEGES` is optional but recommended so new tables are picked up automatically. No write, DDL, or admin permissions are ever needed.

### Can I connect AnomalyArmor to a database inside a private VPC?

Yes, via three options: IP allowlisting from AnomalyArmor's static outbound IPs (visible in **Settings → Security**), VPC peering, or AWS PrivateLink.
The last two require an Enterprise plan and a short setup call. See [Connection Issues](/troubleshooting/connection-issues) for diagnosing network reachability.

### Does AnomalyArmor support Amazon RDS, Aurora, and Supabase?

Yes. These are PostgreSQL-compatible and work with the PostgreSQL connector. RDS and Aurora require SSL; set `SSL Mode` to `require` on the connection form. Supabase requires the direct database connection string, not the pooler URL, for freshness queries to return accurate timestamps.

### Why should I create a dedicated read-only user instead of reusing an existing one?

Blast radius. A dedicated user means you can audit exactly what AnomalyArmor runs, rotate credentials without breaking other systems, and apply minimal permissions. The [Query Gateway](/security/query-gateway) already blocks writes and full-row reads, but a read-only user at the database layer is defense in depth.

### My connection test fails with 'Connection refused' - what's wrong?

Almost always a firewall or security-group issue. Verify your database accepts inbound traffic on its port from AnomalyArmor's static IPs (listed in **Settings → Security**). For RDS, check the security group inbound rules. For self-hosted databases, confirm `listen_addresses` in PostgreSQL config allows external connections.

## Next Step

Your database is connected! Now let's discover your tables and schemas.

---

# Run Your First Discovery

> Step 2 - Discover tables and schemas in your database

Discovery is how AnomalyArmor learns about your database. It scans your metadata to find tables, views, columns, and their data types, creating a complete inventory of your data assets.
## What Discovery Does

Discovery queries your database's **system catalogs** (not your data) to find:

| Asset Type | What We Capture |
|------------|-----------------|
| **Tables** | Name, schema, column definitions |
| **Views** | Name, schema, underlying columns |
| **Columns** | Name, data type, nullability, constraints |
| **Metadata** | Primary keys, indexes, row counts (if available) |

> ℹ️ **Privacy First**: Discovery reads ONLY metadata (table names, columns, types). It never reads your actual data values. We only query metadata from `information_schema`, `pg_catalog`, or equivalent system tables.

## Step-by-Step Guide

## Understanding Discovery Results

### The Asset Catalog

After discovery, your assets appear in the catalog. You can view them by clicking **Assets** in the sidebar.

### Asset Details

Click any asset to see detailed information:

- **Columns**: Full list with data types and constraints
- **Schema History**: Changes detected over time (empty for first discovery)
- **Freshness**: Data recency (requires configuration)
- **Description**: AI-generated or manual documentation

## Setting Up Scheduled Discovery

Discovery should run regularly to detect changes. Configure a schedule:

### Recommended Schedules

| Environment | Frequency | Rationale |
|-------------|-----------|-----------|
| Production | Hourly | Catch changes quickly |
| Staging | Every 6 hours | Balance coverage and load |
| Development | Daily | Lower priority |

### Configure Schedule

1. Go to **Settings** → **Assets** → Select your connection
2. Click the **Settings** tab on the asset detail page
3. Under **Discovery Schedule**, choose frequency
4. Click **Save**

**Discovery Schedule Options:**

- Every 15 minutes (real-time monitoring)
- Hourly (recommended for production)
- Every 6 hours
- Daily
- Weekly
- Manual only

> 💡 Start with **hourly** discovery for production databases. You can adjust based on how frequently your schemas change and your alerting needs.
## What Happens on Subsequent Discoveries

After the initial discovery, subsequent runs detect **changes**:

- **New tables/views** added since last discovery
- **Removed tables/views** that no longer exist
- **Schema changes** (columns added, removed, type changed)
- **Freshness updates** (latest timestamp values)

Changes trigger alerts if you have alert rules configured (next step!).

## Incremental vs. Full Discovery

| Type | What It Does | When to Use |
|------|--------------|-------------|
| **Incremental** | Detects changes since last run | Default for scheduled runs |
| **Full** | Complete rescan of everything | After major changes, troubleshooting |

To run a full discovery: Click **Run Discovery** → Select **Full Discovery**.

## Troubleshooting

### Discovery takes too long

**Cause**: Large database or network latency

**Solutions**:

1. Exclude schemas you don't need (e.g., `pg_temp`, test schemas)
2. Run during off-peak hours
3. Use a read replica instead of primary database

To exclude schemas:

- **Settings** → **Assets** → Select your connection → **Settings** tab → **Schema Filter**
- Add schemas to exclude: `pg_temp_*, test_*`

### Discovery finds no tables

**Cause**: Permission or schema filtering issues

**Solutions**:

1. Verify the user has `SELECT` on `information_schema`
2. Check schema filter isn't excluding everything
3. Confirm you're connected to the correct database
4. Verify tables exist in the schemas you're monitoring

### Discovery fails with error

**Cause**: Connection or permission issues

**Solutions**:

1. Test the connection (Settings → Assets → Select connection → Test Connection)
2. Check database user permissions
3. Verify network connectivity
4. Check database logs for errors

### Row counts not showing

**Cause**: Some databases don't expose row counts in system tables

**Solutions**:

- PostgreSQL: Run `ANALYZE` on tables to populate `pg_stat_user_tables`
- This is informational only. Monitoring works without row counts

## What You've Accomplished

After completing discovery, you have:

- A complete inventory of your database assets
- Metadata for all tables, views, and columns
- A baseline for detecting future changes
- Assets visible in your catalog

## Common Questions

### How long does discovery take for a large database?

Discovery queries only system catalogs, not your data, so it's fast. Typical durations: under 1 minute for up to 100 tables, 1-3 minutes for 100-500 tables, 3-5 minutes for 500-1000 tables, and 5-10 minutes for 1000+ tables. You can navigate away during discovery.

### Does discovery impact my database's performance?

Minimal. Discovery runs information-schema queries against system tables, not table scans on your data. Expect a brief spike in catalog read activity for a few seconds, no noticeable load on production workloads.

### Can I limit which schemas or tables AnomalyArmor discovers?

Yes. Configure include/exclude patterns per schema on the connection's detail page. Patterns support glob-style matching (e.g., `analytics_*` includes every schema starting with `analytics_`, `public.audit_*` excludes specific tables).

### How often does discovery re-run after the first scan?

By default, daily. Set it to hourly on the connection's detail page if your environment changes often. Each subsequent run is incremental - AnomalyArmor only records what changed since the last scan, which is also what drives schema-drift alerts.

### Why are my row counts showing as zero or missing after discovery?

Some databases don't expose row counts in system tables by default. For PostgreSQL, run `ANALYZE` on the affected tables to populate `pg_stat_user_tables`. Row counts are informational only - monitoring, freshness, and schema-drift detection all work without them.

## Next Step

Now that your assets are discovered, let's set up alerts to notify you when something changes.
---

# Set Up Your First Alert

> Step 3 - Get notified when data issues occur

Alerts are how AnomalyArmor tells you when something needs attention. In this step, you'll create your first alert rule to get notified about schema changes.

## How Alerts Work

Alerts follow a simple flow:

**Three components**:

1. **Event**: What triggers the alert (schema change, freshness violation, etc.)
2. **Rule**: Conditions that filter events (optional: specific tables, change types)
3. **Destination**: Where the alert goes (Slack, email, PagerDuty, webhook)

## Step-by-Step Guide

Let's create an alert rule for schema changes, one of the most common and important alerts.

## Recommended First Rules

Here are three rules we recommend setting up:

### Rule 1: All Schema Changes

| Field | Value |
|-------|-------|
| **Name** | All Schema Changes |
| **Event** | Schema Change Detected |
| **Conditions** | None (catch everything) |
| **Destination** | Slack #data-alerts |

Catches: Any structural change to your database

### Rule 2: Critical Table Freshness

| Field | Value |
|-------|-------|
| **Name** | Critical Table Freshness |
| **Event** | Freshness Violation |
| **Tables** | orders, users, payments |
| **Destination** | Slack #data-alerts + PagerDuty |

Catches: Stale data in your most important tables

### Rule 3: Discovery Failures

| Field | Value |
|-------|-------|
| **Name** | Discovery Failures |
| **Event** | Discovery Failed |
| **Conditions** | None |
| **Destination** | Slack #data-alerts |

Catches: Connection or permission issues

## Alert Destinations Deep Dive

- **[Slack Setup](/alerts/destinations/slack)**: Full guide to Slack integration
- **[All Destinations](/alerts/overview)**: Email, PagerDuty, webhooks, and more

## Troubleshooting

### Alerts not firing

**Check these in order**:

1. **Rule is enabled**: Toggle should be ON in the Rules list
2. **Destination is configured**: Test the destination
3. **Conditions match**: Are your filters too restrictive?
4. **Events are occurring**: Check Discovery is running and finding changes

### Too many alerts

**Reduce noise with filters**:

1. Add conditions to target specific tables
2. Filter out development/test schemas
3. Filter by change type (e.g., only column removals)
4. Use different destinations for different event types

See [Alert Best Practices](/alerts/best-practices) for more tips.

### Slack alerts not appearing

**Common causes**:

1. Bot not invited to private channels: Type `/invite @AnomalyArmor`
2. OAuth expired: Reconnect the Slack destination
3. Channel archived: Select a different channel

### Test alert works but real alerts don't

**The rule conditions might not match**:

1. Verify discovery is running and finding assets
2. Check rule conditions aren't filtering out all events
3. Wait for the next discovery run to detect changes
4. Manually trigger a schema change in a test table to verify

## What You've Accomplished

Congratulations! You've completed the quickstart. You now have:

- A database connected to AnomalyArmor
- Your data assets discovered and cataloged
- Alert rules configured to notify you of issues

## What's Next?
- **[Freshness Monitoring](/data-quality/freshness-monitoring)**: Set up SLAs to detect stale data before it impacts your business
- **[Alert Best Practices](/alerts/best-practices)**: Learn how to reduce alert fatigue and route alerts effectively
- **[Core Concepts](/concepts/overview)**: Deep dive into assets, discovery, schema changes, and more
- **[Add More Assets](/quickstart/connect-first-database)**: Connect additional databases to expand your monitoring

---

## Quick Reference: Alert Rule Cheat Sheet

| Event Type | Best Destination | Typical Conditions |
|------------|------------------|-------------------|
| Schema Change | Slack | Production databases only |
| Column Removed | PagerDuty | Critical tables |
| Freshness Violation | Slack + Email | SLA-bound tables |
| Discovery Failed | Email | All data sources |
| Asset Removed | Slack | Unexpected only |

## Common Questions

### What destinations can AnomalyArmor send alerts to?

Slack, Email, PagerDuty, Microsoft Teams, Linear, incident.io, and custom webhooks. Each destination is configured once under **Settings → Integrations** and then reused across alert rules. See [Alerts Overview](/alerts/overview) for the full list.

### How do I stop getting alerts for a specific table without disabling the whole rule?

Add an exclude condition on the alert rule: filter by asset name, schema, or tag. This keeps the rule active for everything else while muting the noisy table. Blackout windows work too if the noise is time-bounded (overnight batch loads, planned migrations).

### Can I route different alert types to different channels or people?

Yes - that's the recommended pattern. Create separate rules per event type (schema change, freshness violation, discovery failure) with their own destinations. Schema drops typically page on-call via PagerDuty; freshness lapses go to team Slack; discovery errors email the data platform team.

### Why didn't my alert fire when the data issue happened?
Three common causes: (1) the rule's conditions didn't match the specific event, (2) the rule is scoped to a different asset or schema, (3) the destination is misconfigured (stale Slack webhook, revoked token). Check **Alerts → Rule History** to see which events matched the rule and whether delivery succeeded.

### Can I test an alert rule before turning it on?

Yes. Every rule has a **Test Send** button that fires a sample alert to the destination so you can verify formatting and channel routing. Use it before enabling rules in production.

---

# Set Up Completeness Monitoring

> Track row counts with ML-powered anomaly detection

Now that you have discovery running and alerts configured, let's set up completeness monitoring. This tracks row counts in your tables and alerts you when data volumes are abnormal.

## What You'll Accomplish

By the end of this step, you'll have:

- Completeness monitoring on a critical table
- ML-based anomaly detection learning your data patterns
- Alerts when row counts deviate from expected values

## Why Completeness Monitoring?

Row count anomalies catch real problems:

| Anomaly | What It Means |
|---------|---------------|
| **Sudden drop** | Failed ETL job, data loss, broken pipeline |
| **Unexpected spike** | Duplicate loads, runaway inserts |
| **Missing data** | Source system outage, extraction failure |
| **Gradual decline** | Upstream issues, filtering bugs |

## Set Up Monitoring

## Understanding the Learning Phase

After enabling, you'll see a **Learning** status badge on the monitoring card. The chart will show data points being collected, with a message like "Building baseline - 3 of 7 data points collected."
**What happens during learning:** - Row counts are captured at your check interval - ML model analyzes patterns (daily, weekly, hourly trends) - After 7+ data points, predictions activate ## After Learning Completes Once the baseline is established: - **Expected range** shown on the chart - **Anomalies** highlighted when row count falls outside predictions - **Alerts** fire to your configured destinations ## Explicit Mode (Alternative) If you know exactly what to expect, use explicit mode: | Setting | Example | |---------|---------| | Mode | Explicit | | Minimum Rows | 10,000 | | Maximum Rows | 50,000 | Alerts fire when row count falls outside your defined range. ## Which Tables to Monitor Start with tables that: - Receive data regularly (daily, hourly) - Are critical to downstream reporting - Have predictable volume patterns **Good candidates:** - Fact tables (orders, events, transactions) - Staging tables from ETL pipelines - Aggregation tables **Skip for now:** - Dimension tables (change infrequently) - Archive tables - Temporary/scratch tables ## Troubleshooting ### Still in 'Learning' after a week **Check these:** 1. Is the table receiving new data? 2. Is the check interval appropriate for your data frequency? 3. View the history tab to see if captures are running ### Too many alerts **Reduce noise:** 1. Increase sensitivity (3 or 4 = fewer alerts) 2. Adjust time window to match your data patterns 3. Consider if this table has irregular patterns ### Not getting alerts when expected **Verify:** 1. Learning phase is complete 2. Alert rules are configured for completeness events 3. Check the history tab to see detected anomalies ## Common Questions ### How long does AnomalyArmor's anomaly detection take to start catching issues? Usually 7-14 days of learning before anomaly detection is meaningful. During learning, AnomalyArmor records values without firing alerts so it can establish a baseline. 
You'll see a **Learning** status on the metric until the baseline is stable.

### Why is my metric stuck in 'learning' status?

Learning needs enough historical data points to build a baseline - typically 14 daily samples or 30+ hourly samples. If your table is new or only sampled sporadically, learning takes longer. Reduce the sampling interval temporarily to accelerate learning, then relax it once anomaly detection is active.

### Can I use AnomalyArmor for row-count monitoring on massive tables without full scans?

Yes - AnomalyArmor uses `COUNT(*)` only. Many columnar warehouses answer it from table metadata without scanning rows; row-store engines such as PostgreSQL may still need an index or table scan. For very large tables where even `COUNT(*)` is expensive, you can monitor a partitioned slice (e.g., `created_at >= NOW() - INTERVAL '1 day'`) as a lightweight proxy.

### What's the difference between row-count monitoring and freshness monitoring?

Row counts detect volume anomalies (sudden drops, spikes) regardless of whether new rows are landing. Freshness detects staleness - was the most recent row added within your SLA? Use both: freshness alerts when the pipeline stops, row counts alert when it runs but produces the wrong volume.

## What's Next

- **[Column Metrics](/data-quality/metrics)**: Track null percentages, distinct counts, and more
- **[Freshness Monitoring](/data-quality/freshness-monitoring)**: Ensure tables are updated on schedule

---

# Core Concepts

> The building blocks of data observability

Data observability answers a simple question: **Can I trust this data?**

When a dashboard shows unexpected numbers, you need to know if it's a real business trend or a broken pipeline. When an executive asks about yesterday's revenue, you need confidence that the data is fresh and complete. Data observability gives you that confidence.

## The Problem

Data pipelines fail silently.
Unlike application errors that crash loudly, data problems often go unnoticed until someone makes a bad decision: | Scenario | What Happened | The Cost | |----------|---------------|----------| | Marketing spends $50K on wrong audience | Pipeline dropped demographic column | Wasted ad spend, wrong targeting | | CEO quotes wrong revenue in earnings call | ETL job failed, dashboard showed stale data | Stock price impact, credibility loss | | Product team ships feature to wrong segment | Upstream table had schema change | Development time wasted, wrong launch | Data observability prevents these scenarios by monitoring your data like you monitor your applications. *Reactive vs proactive: catching issues before they cause damage* ## The Building Blocks AnomalyArmor monitors data through these interconnected concepts: - **[Assets](/concepts/assets)**: Tables, views, and other data objects that AnomalyArmor monitors - **[Discovery](/concepts/discovery)**: How AnomalyArmor finds and catalogs your data assets - **[Schema Changes](/concepts/schema-changes)**: Detecting and tracking structural changes to your data - **[Freshness](/concepts/freshness)**: Monitoring when your data was last updated - **[Metrics](/concepts/metrics)**: Tracking statistical properties and detecting anomalies - **[Alerts](/concepts/alerts)**: How you get notified when issues occur - **[Report Badges](/concepts/badges)**: Embedded status indicators for dashboards and docs - **[Intelligence](/concepts/intelligence)**: AI-powered search and documentation - **[Tagging](/concepts/tagging)**: Classifying and organizing your assets ## How They Work Together **Discovery** scans your databases and catalogs **Assets** (tables, views, columns). 
Once cataloged, AnomalyArmor monitors each asset for: - **Schema changes**: Columns added, removed, or type-changed - **Freshness violations**: Data not updated within your SLA - **Metric anomalies**: Unexpected changes in row counts, null rates, or distributions When issues occur, **Alerts** notify your team through Slack, PagerDuty, email, or webhooks. **Report Badges** embed this status directly in your dashboards and documentation, so consumers always know if data is trustworthy. **Intelligence** helps you explore your catalog with natural language, and **Tagging** organizes assets for compliance and governance. ## Quick Reference | Concept | Question It Answers | Example Alert | |---------|---------------------|---------------| | Asset | What data do I have? | (Cataloging, no alert) | | Discovery | What changed since last scan? | "New table detected: staging.orders_v2" | | Schema Change | Did the structure change? | "Column removed: orders.shipping_status" | | Freshness | Is data up to date? | "orders table is 4 hours stale" | | Metric | Is data quality normal? | "Row count dropped 60% from yesterday" | | Alert | Who needs to know? | (Routes to Slack, PagerDuty, etc.) | | Report Badge | Can consumers trust this? | (Visual indicator on dashboards) | | Intelligence | Where is X data? | (AI-powered search result) | | Tagging | What category is this? | (Classification: PII, production, etc.) | ## Common Questions ### What is data observability and how is it different from data quality? Data observability is continuous visibility into data health: schema, freshness, row counts, distributions. Data quality is often a one-time or batch check (is this value in the expected set?). Observability catches problems *as they happen* - including ones you didn't predict - while quality checks validate specific known rules. AnomalyArmor does both: observability by default, rule-based quality checks layered on top. ### Do I need to configure every monitoring concept manually? 
No. Schema drift and freshness are automatic once discovery runs - no per-table setup. Metrics and custom rules are opt-in where you want them. Most customers get value from the automatic monitoring on day one and add explicit metrics to their most critical tables over the first week. ### How does AnomalyArmor decide what's anomalous without me defining thresholds? For metrics, AnomalyArmor builds a statistical baseline from 7-14 days of historical values and flags new readings outside that baseline's expected range. For schema changes, "anomalous" is literal - the structure differs from the previous discovery. For freshness, it learns your typical update cadence and alerts when a table misses it. ### Can I start with just one concept and add others later? Yes. The most common starting point is schema drift + freshness on production databases - zero-config, high signal. Add metrics and report badges once the alerting cadence is calibrated. Tagging and compliance usually follow. ## Next Steps - **[Connect Your Database](/quickstart/connect-first-database)**: Start monitoring in under 15 minutes - **[Explore Assets](/concepts/assets)**: Understand the foundation of data observability --- # Assets > Tables, views, and other data objects that AnomalyArmor monitors An **asset** is any data object that AnomalyArmor discovers and monitors. When you connect a database, we catalog all tables, views, and their columns. ## Asset Hierarchy Assets are organized in a hierarchy that mirrors your database structure: For example: `snowflake.analytics.orders` represents the `orders` table in the `analytics` schema of your Snowflake connection. 
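As a rough illustration of the naming scheme in the example above, a qualified asset name can be split into its three levels. This is a sketch only: the `AssetName` class and the `parse_asset_name` helper are hypothetical names introduced here, not part of the AnomalyArmor SDK, and the code assumes exactly three dot-separated parts with no quoted identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssetName:
    """Hypothetical container for the three levels of the hierarchy."""
    connection: str
    schema: str
    table: str


def parse_asset_name(qualified: str) -> AssetName:
    """Split 'connection.schema.table' into parts (assumes no quoted dots)."""
    parts = qualified.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected connection.schema.table, got {qualified!r}")
    return AssetName(*parts)


asset = parse_asset_name("snowflake.analytics.orders")
print(asset.connection, asset.schema, asset.table)
```

Databases that allow dots inside quoted identifiers would need a real identifier parser rather than a plain `split`.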
## What We Track

Each asset stores:

| Property | Description | Use For |
|----------|-------------|---------|
| **Schema** | Column names, types, constraints | Drift detection |
| **Freshness** | Last update timestamp | SLA monitoring |
| **Metrics** | Row counts, null percentages | Data quality |
| **Tags** | Auto and custom classifications | Organization |
| **History** | All changes over time | Debugging |

## Asset Lifecycle

Assets move through states as you use AnomalyArmor:

1. **Discovered**: Found during discovery, no monitoring configured
2. **Monitored**: Has freshness SLAs, metrics, or alert rules attached
3. **Archived**: Table no longer exists in database (kept for history)

## Finding Assets

### In the UI

The **Assets** page shows your full catalog. Use filters:

- **Data Source**: Filter by database connection
- **Schema**: Filter by namespace
- **Tags**: Filter by classification (e.g., `pii:email`)
- **Status**: Filter by freshness state

### Via API

```python
from anomalyarmor import Client

client = Client()

# List up to 100 assets in the `analytics` schema
assets = client.assets.list(schema="analytics", limit=100)
```

### Via Intelligence

Ask in plain English: "Where is customer data?" or "Show me tables updated today."

## Asset Details

Click any asset to see:

- **Overview**: Basic info, tags, description
- **Schema**: Current columns with types
- **History**: All schema changes over time
- **Freshness**: Update timeline and SLA status
- **Metrics**: Configured data quality checks

## Common Questions

### What counts as an asset in AnomalyArmor?

Any table or view that AnomalyArmor discovers in your connected databases. System catalogs and internal tables are excluded. Materialized views count as assets; temporary tables don't.

### What happens to an asset when the underlying table is dropped?

The asset moves to **Archived** state - removed from active monitoring but retained for history (schema snapshots, freshness timeline, past alerts).
You can restore an asset if the table comes back, or delete it permanently to free a monitored-table slot. ### Can I monitor views the same way I monitor tables? Yes. Views are first-class assets. Schema drift, freshness (if the view has a timestamp column), and metrics all work. Note that freshness on views reflects the underlying base table's update, not the view's definition. ### How do I find assets across multiple databases? The **Assets** page aggregates every asset across all connected data sources. Filter by data source, schema, tag, or freshness status. You can also use the API (`client.assets.list()`) or ask in natural language via Intelligence ("show me tables with PII tags updated today"). ## Next Steps - **[Set Up Discovery](/schema-monitoring/asset-discovery)**: Configure how AnomalyArmor finds your assets - **[Connect a Data Source](/data-sources/overview)**: Add your first database to start discovering assets --- # Discovery > How AnomalyArmor finds and catalogs your data assets **Discovery** is the process of scanning your databases to find and catalog data assets. AnomalyArmor queries system catalogs (not your data) to build an inventory of tables, views, and columns. Discovery can run on a schedule or on-demand. On each run, discovery compares current state against previous state to detect new assets, removed assets, and schema changes. ## How Discovery Works ## Discovery Types **Initial Discovery** runs when you first connect a data source, performing a full catalog scan. **Incremental Discovery** runs on your configured schedule, checking only for changes since the last run. ## Common Questions ### What's the difference between initial and incremental discovery? Initial discovery is a full catalog scan that runs once when you connect a database. Incremental discovery runs on your configured schedule (hourly, daily, or custom) and only records what changed since the last run - new assets, removed assets, schema diffs. 
Incremental is fast even on large databases because it queries the same metadata but diffs against the previous snapshot. ### Can discovery be triggered manually? Yes. Click **Run Discovery** on the connection's detail page to kick off an on-demand scan. This is useful after a deliberate schema change (migration, table addition) when you want the catalog to reflect the new state immediately instead of waiting for the next scheduled run. ### Does discovery pull my actual data? No. Discovery queries only system catalogs (`information_schema`, `pg_catalog`, or equivalent) to read names, types, and metadata. It never runs `SELECT *` or reads row values. The [Query Gateway](/security/query-gateway) enforces this. ### Why didn't a newly created table appear after I ran discovery? Three common causes: (1) the table is in a schema excluded by your discovery include/exclude patterns; (2) your read-only database user doesn't have `USAGE` on that schema or doesn't see it in the system catalogs; (3) some databases delay catalog visibility for a few seconds after DDL. Check the connection's discovery include list first. ## Next Steps - **[Configure Discovery](/schema-monitoring/asset-discovery)**: Set up schedules and discovery options - **[Connect a Data Source](/data-sources/overview)**: Add databases for discovery to scan --- # Schema Changes > Detecting and tracking structural changes to your data assets A **schema change** is a single structural modification to your data - a column added, removed, or renamed; a data type changed; a table created or dropped. Each change is an individual record detected by [schema drift monitoring](/schema-monitoring/schema-drift), the feature that compares successive [asset discoveries](/schema-monitoring/asset-discovery) to find structural differences over time. These modifications can break downstream systems like dbt models, dashboards, and reports if not caught early. AnomalyArmor detects changes by comparing discovery runs. 
When a column disappears or a type changes, you get alerted before your pipelines fail. ## Change Types | Change | Risk Level | Impact | |--------|------------|--------| | Column removed | High | Breaks queries referencing it | | Table removed | High | Breaks all dependent queries | | Column renamed | High | Old name no longer works | | Type changed | Medium | May affect aggregations | | Column added | Low | Usually safe | | Table added | Low | Informational | ## Common Questions ### What kinds of schema changes does AnomalyArmor detect? Column added, removed, renamed, or type-changed; nullability changes; default-value changes; constraint changes (primary key, unique, foreign key); new and dropped tables and views. The full list is tracked at [schema drift](/schema-monitoring/schema-drift). AnomalyArmor does not diff stored procedures, triggers, or database-engine-level objects. ### How does AnomalyArmor detect a rename versus a drop-and-add? By default it's conservative: a column disappearing followed by one appearing with the same type is logged as drop+add, not rename. Most database catalogs don't preserve rename history, so we can't always distinguish. If you run a deliberate rename, annotate the alert so it doesn't page your team twice. ### Can I acknowledge schema changes so they stop alerting? Yes. Each change has an **Acknowledge** action that marks it as reviewed and suppresses downstream notifications on that specific change. The change is retained in history for audit - acknowledging doesn't delete it. See [alert best practices](/alerts/best-practices) for patterns. ### How fast does AnomalyArmor detect a schema change after it happens? Detection latency equals your discovery cadence. Default daily discovery means up to 24-hour detection; hourly discovery detects within the hour. For databases where schema changes happen frequently (active migration projects), increase the cadence to hourly or run manual discovery immediately after known deploys. 
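The comparison of successive discovery runs described in this section can be sketched as a diff between two column snapshots. This is a minimal illustration, not AnomalyArmor's implementation: the `diff_schemas` function and the `{column_name: data_type}` snapshot shape are assumptions made for the example. Risk levels follow the Change Types table, and a rename is reported as a removal plus an addition, matching the conservative behavior described above.

```python
def diff_schemas(previous: dict[str, str], current: dict[str, str]) -> list[tuple[str, str, str]]:
    """Compare two {column_name: data_type} snapshots of one table.

    Returns (change_type, column, risk) tuples. Hypothetical helper for
    illustration only.
    """
    changes = []
    # Columns present before but missing now
    for col in sorted(previous.keys() - current.keys()):
        changes.append(("column_removed", col, "high"))
    # Columns that appeared since the previous snapshot
    for col in sorted(current.keys() - previous.keys()):
        changes.append(("column_added", col, "low"))
    # Columns present in both snapshots whose type differs
    for col in sorted(previous.keys() & current.keys()):
        if previous[col] != current[col]:
            changes.append(("type_changed", col, "medium"))
    return changes


before = {"id": "integer", "shipping_status": "text", "total": "numeric"}
after = {"id": "bigint", "total": "numeric", "ship_state": "text"}
for change in diff_schemas(before, after):
    print(change)
```

Here `shipping_status` becoming `ship_state` surfaces as one high-risk removal and one low-risk addition, which is why annotating deliberate renames helps avoid duplicate pages.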
## Next Steps - **[View Change History](/schema-monitoring/schema-drift)**: See what changed and when - **[Set Up Alerts](/alerts/alert-rules)**: Get notified of breaking changes --- # Freshness > Monitor when your data was last updated **Freshness** measures how recently data in a table was updated. Stale data can be as damaging as wrong data. If your dashboard shows yesterday's numbers, decisions made today could be wrong. AnomalyArmor tracks freshness by monitoring timestamp columns. Set SLAs defining how old data can be, and get alerted when data is stale. ## Why Freshness Matters Data freshness is a critical indicator of data pipeline health. Here's why monitoring freshness prevents costly problems: ### Catch Pipeline Failures Early When ETL jobs fail silently, you might not know until someone reports a problem. Freshness monitoring detects the issue immediately when expected data doesn't arrive. **Real scenario**: Your nightly sales import fails at 2 AM. Without freshness monitoring, your morning reports show yesterday's data and your team makes decisions on outdated information. With freshness monitoring, you get alerted at 2:05 AM and can investigate before business hours. ### Prevent Downstream Cascade Failures Modern data stacks have dependencies. When upstream data goes stale, it can cause a cascade of failures downstream. Freshness monitoring acts as an early warning system. **Example**: Your `raw_events` table feeds into `sessions`, which feeds into `user_analytics`. If `raw_events` stops updating, freshness alerts catch it before derived tables produce incorrect aggregations. ### Meet Business SLAs Different tables have different freshness requirements. Customer-facing dashboards might need real-time data, while monthly reports can tolerate delays. Freshness monitoring lets you codify these expectations. 
| Use Case | Typical SLA | Impact if Stale |
|----------|-------------|-----------------|
| Real-time dashboards | < 5 minutes | Customer complaints, lost revenue |
| Daily reporting | < 2 hours | Delayed decisions, missed opportunities |
| Weekly analytics | < 24 hours | Inaccurate trend analysis |
| Monthly aggregates | < 7 days | Incorrect billing, compliance issues |

### Detect Data Quality Issues

Freshness problems often signal deeper issues. If a table that usually updates every hour hasn't updated in 12 hours, something is broken in your pipeline.

**What stale data reveals**:

- Source system failures
- Network connectivity issues
- Permission problems
- Schema changes breaking queries
- Resource exhaustion (disk, memory, connections)

### Reduce Mean Time to Detection (MTTD)

Without freshness monitoring, you discover data problems when users report them. With automated freshness checks, you detect issues minutes after they occur instead of hours or days later.

**Impact on MTTD**:

- **Without monitoring**: 4-48 hours (user reports issue)
- **With monitoring**: 5-15 minutes (automated alert)
- **Result**: 95%+ reduction in detection time

## How It Works

1. You specify a timestamp column (e.g., `created_at`, `updated_at`)
2. AnomalyArmor queries `MAX(timestamp_column)` on your schedule
3. If the age of the latest timestamp exceeds your SLA threshold, an alert fires
4. Alerts can route to Slack, email, PagerDuty, or webhooks

### Auto-Learning Freshness Baselines

AnomalyArmor can learn your table's update patterns automatically. Instead of manually setting SLAs, enable auto-learning and the system will:

1. Observe your table's update frequency over time
2. Calculate typical update intervals and variance
3. Set dynamic thresholds based on historical patterns
4. Alert only when updates deviate from the learned baseline

This is especially useful for tables with irregular update patterns or tables you're just starting to monitor.
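The freshness check and the auto-learned threshold described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the product's algorithm: `is_stale` and `learned_threshold` are hypothetical helpers, and the "mean gap plus `k` standard deviations" rule is one plausible way to turn observed update intervals into a dynamic threshold.

```python
from datetime import datetime, timedelta, timezone
from statistics import mean, stdev


def is_stale(latest: datetime, sla: timedelta, now: datetime) -> bool:
    """True when the newest row (MAX of the timestamp column) is older than the SLA."""
    return now - latest > sla


def learned_threshold(update_times: list[datetime], k: float = 3.0) -> timedelta:
    """Assumed rule: mean gap between updates plus k standard deviations."""
    gaps = [(b - a).total_seconds() for a, b in zip(update_times, update_times[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three update times")
    return timedelta(seconds=mean(gaps) + k * stdev(gaps))


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
latest = now - timedelta(hours=3)
print(is_stale(latest, sla=timedelta(hours=2), now=now))

# Four perfectly regular hourly updates give a learned threshold of one hour
history = [now - timedelta(hours=h) for h in (4, 3, 2, 1)]
print(learned_threshold(history))
```

Using timezone-aware UTC datetimes throughout avoids mixing local and UTC timestamps when computing the age of the latest row.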
### Handling Complex Update Patterns Some tables have nuanced freshness requirements: **Business hours only**: Your CRM sync runs 9 AM to 6 PM. Configure freshness checks to only alert during business hours, avoiding false alerts at night. **Weekly batches**: A table updates every Monday at 3 AM. Set a weekly schedule that expects updates once per week, not daily. **Time zone considerations**: Your `created_at` timestamps are in UTC but your business operates in PST. AnomalyArmor handles time zone conversions automatically. **Multiple sources**: If one table receives data from multiple sources with different frequencies, you can monitor multiple freshness columns or use separate freshness schedules. ## Common Questions ### Which column should I use for freshness monitoring? Use the column that's written every time a row is inserted or meaningfully updated - commonly `updated_at`, `created_at`, `event_time`, or `last_modified`. Avoid columns that only reflect specific lifecycle events (e.g., `deleted_at`). If your table has no reliable timestamp, freshness isn't a good fit; track it via row-count deltas instead. ### How do I configure freshness for tables that only update on business hours or weekly? Set a schedule on the freshness check itself: business-hours window, weekly cadence, or a cron expression. Alerts fire only when the table violates the configured schedule, so overnight or weekend gaps don't page on-call by mistake. ### What's auto-learning freshness and when should I use it? Auto-learning watches your table's update pattern for 1-2 weeks and sets a dynamic threshold based on observed variance. Use it for tables with irregular cadence where a fixed SLA would be wrong (analytics jobs, batch loads with variable completion times). Use a fixed SLA for tables where the business-contract freshness is an absolute (hourly sync, real-time dashboard). ### Does freshness monitoring work if my table's timestamp is in a different time zone than my business? Yes. 
AnomalyArmor normalizes timestamps to UTC internally; alerting respects the time zone configured on your workspace. You don't need to manually convert - just configure the business time zone once in workspace settings. ### Can I monitor freshness on a view or a derived table? Yes - as long as the view exposes a reliable timestamp column. Note that the freshness reflects the view's output (which depends on its underlying tables), so a stale view can indicate either an upstream issue or a materialization lag. Combine with freshness on the base table to distinguish. ## Next Steps - **[Set Freshness SLAs](/data-quality/freshness-monitoring)**: Define how fresh your data should be - **[Set Up Alerts](/alerts/alert-rules)**: Get notified when data goes stale --- # Metrics > Track statistical properties of your data and detect anomalies Metrics are statistical measurements that track properties of your data over time. AnomalyArmor captures metrics on a schedule, builds baselines, and alerts you when values fall outside expected ranges. ## What Metrics Track | Metric Type | What It Measures | Example Use Case | |-------------|------------------|------------------| | `row_count` | Total rows in a table | Detect data loss or unexpected growth | | `null_percent` | Percentage of null values | Catch ETL issues leaving nulls | | `distinct_count` | Unique values in a column | Detect cardinality changes | | `duplicate_count` | Duplicate values | Find unexpected duplicates | | `min_value` | Minimum numeric value | Catch invalid data (negative prices) | | `max_value` | Maximum numeric value | Detect outliers | | `mean` | Average numeric value | Monitor central tendency | ## How Anomaly Detection Works 1. **Baseline building**: Historical values establish what's "normal" 2. **Z-score calculation**: Each new value is compared to the baseline 3. 
**Sensitivity threshold**: Values exceeding the threshold trigger alerts

A sensitivity of 2.0 means values more than 2 standard deviations from the mean are flagged as anomalies.

## Common Questions

### How is a metric different from a validity rule?

A metric captures a numeric property over time (row count, null percentage, distinct count) and flags statistical anomalies against a learned baseline. A validity rule checks a deterministic condition on current data (value in allowed set, format match, range bounds). Use metrics to catch "this number changed suspiciously"; use validity rules to catch "this data broke a known contract".

### What's the default sensitivity for anomaly detection, and when should I change it?

Default is 2.0 standard deviations - for roughly normally distributed values, about 95% of readings fall inside that band and stay silent; outliers alert. Raise to 3.0 for noisy metrics where 2σ produces false alerts; lower to 1.5 for business-critical metrics where you'd rather investigate a false positive than miss a real anomaly. Tune per-metric, not globally.

### How much history does AnomalyArmor need before metric anomaly detection works?

Typically 7-14 days of captures at your chosen sampling interval. During this "learning" phase, values are recorded but no alerts fire. You can shorten the wait by temporarily reducing the sample interval, then relaxing it once the baseline is established.

### Can I define a custom metric that isn't in the built-in list?

Yes, via [custom SQL monitoring](/data-quality/custom-sql-monitoring). Write a `SELECT` that returns a single numeric column and AnomalyArmor treats the result as a metric - baseline learning, anomaly detection, and alerting all work the same way as built-in metrics.
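The z-score approach described in this section can be sketched as follows. This is a simplified illustration, not the production detector: the `z_score` and `is_anomaly` helpers are hypothetical, and the sketch assumes a sample standard deviation over a fixed list of baseline values.

```python
from statistics import mean, stdev


def z_score(value: float, baseline: list[float]) -> float:
    """Number of standard deviations `value` lies from the baseline mean."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any different value is infinitely far away
        return 0.0 if value == mu else float("inf")
    return (value - mu) / sigma


def is_anomaly(value: float, baseline: list[float], sensitivity: float = 2.0) -> bool:
    """Flag values more than `sensitivity` standard deviations from the mean."""
    return abs(z_score(value, baseline)) > sensitivity


row_counts = [1000, 1020, 980, 1010, 990, 1005, 995]  # seven baseline captures
print(is_anomaly(400, row_counts))    # a large drop
print(is_anomaly(1010, row_counts))   # within normal variation
```

Raising `sensitivity` to 3.0 widens the band and produces fewer alerts; lowering it to 1.5 narrows the band.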
## Next Steps - **[Create a Metric](/data-quality/metrics)**: Set up your first data quality metric - **[Set Up Alerts](/alerts/alert-rules)**: Get notified when anomalies are detected --- # Alerts > How you get notified when issues occur **Alerts** notify you when something needs attention. Whether it's a schema change, stale data, or a failed discovery job, alerts ensure the right people know at the right time. Alerts follow a simple pipeline: events are detected during discovery, evaluated against your rules, and routed to destinations like Slack, email, or PagerDuty. ## What Triggers Alerts | Event Type | Description | |------------|-------------| | Schema Change | Column added, removed, or type changed | | Freshness Violation | Data not updated within SLA | | Metric Anomaly | Value outside expected range | | Discovery Failed | Connection or permission error | | Asset Removed | Table/view no longer exists | ## Common Questions ### What's the difference between an event, a rule, and a destination? An event is something detected (schema change, freshness violation, metric anomaly). A rule is your filter on events - which assets, which event types, severity thresholds, schedule conditions. A destination is where the notification lands (Slack, email, PagerDuty, webhook). Events flow through rules to destinations. ### Can one event fire multiple alerts to different destinations? Yes - that's the normal pattern. A production schema drop might page on-call via PagerDuty *and* post to the team Slack channel *and* email the data-platform list. Each is a separate rule with the same event filter but different destinations. ### Do I configure alerts per table, or globally? Either. Rules can apply to all assets (useful for catching discovery failures across everything) or be scoped to specific schemas, tags, or individual assets. Most teams start with a global "schema drop" rule and add tighter per-critical-table rules over time. 
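The event, rule, and destination pipeline described in this section can be sketched as a small routing function. The `Event` and `Rule` classes and the `route` function below are hypothetical and exist only to illustrate the flow; real rules support richer conditions (tags, severity, schedules) than the prefix match assumed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    event_type: str  # e.g. "schema_change", "freshness_violation"
    asset: str       # qualified name, e.g. "snowflake.analytics.orders"


@dataclass(frozen=True)
class Rule:
    event_type: str
    destination: str
    asset_prefix: str = ""  # empty prefix means the rule applies to all assets

    def matches(self, event: Event) -> bool:
        return event.event_type == self.event_type and event.asset.startswith(self.asset_prefix)


def route(event: Event, rules: list[Rule]) -> list[str]:
    """Return every destination whose rule matches the event."""
    return [rule.destination for rule in rules if rule.matches(event)]


rules = [
    Rule("schema_change", "pagerduty", asset_prefix="snowflake.analytics."),  # scoped rule
    Rule("schema_change", "slack"),                                           # global rule
    Rule("discovery_failed", "email"),
]
print(route(Event("schema_change", "snowflake.analytics.orders"), rules))
```

One event matching several rules is what lets a single schema change both page on-call and post to a team channel.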
### Can I suppress alerts during a planned migration or deploy? Yes. Use [blackout windows](/alerts/blackouts) to pause alerting on specific assets or rules for a defined time window. Events are still recorded; they just don't notify until the window ends. ## Next Steps - **[Create Alert Rules](/alerts/alert-rules)**: Define when and where alerts fire - **[Connect Destinations](/alerts/destinations/slack)**: Set up Slack, email, or other channels --- # Report Badges > Embedded status indicators for data quality visibility Report badges display data quality status directly in your existing tools. Instead of switching to AnomalyArmor to check if data is fresh, see a green or red badge right in your dashboard. ## The Problem Badges Solve Data teams context-switch constantly. Checking data quality in a separate tool breaks focus: Badges provide **always-on visibility** wherever you work. ## What Badges Monitor A single badge can track multiple quality signals: | Check | What It Catches | |-------|-----------------| | Schema Drift | Column additions, removals, type changes | | Freshness | Data not updated within SLA | | Metrics | Anomalies in row counts, null percentages | Configure badges to monitor any combination of these checks. ## Where Badges Work Any tool that supports image embeds: - **Databricks** - Markdown cells in notebooks - **Notion / Confluence** - Wiki pages and documentation - **Looker / Tableau** - Dashboard annotations - **GitHub** - README files and PR descriptions - **Slack** - Pinned messages in data channels ## Badge States | Badge | Meaning | |-------|---------| | **Passing** (green) | All monitored tables pass quality checks | | **Issues** (red) | One or more problems detected | | **Unknown** (grey) | Badge disabled or configuration error | ## Common Questions ### Where can I embed an AnomalyArmor report badge? 
Anywhere that accepts an image URL or Markdown image syntax: Notion, Confluence, Databricks notebooks, Looker / Tableau dashboard annotations, GitHub README files, Slack pinned messages, and most static-site generators. Badges are rendered from a URL, so if the tool shows external images, badges work. ### How quickly does a badge update after an issue is detected? Within a minute. The badge endpoint reads the current status of the tracked checks on each request, so a failing freshness check shows red as soon as the check fires. There's no caching layer on your side unless your embedding tool caches images (Slack and some wikis do; pass a cache-busting query param if needed). ### Can one badge cover multiple tables or checks? Yes. A badge can aggregate schema drift, freshness, and metric status across any set of tables. If any tracked check fails, the badge turns red. For per-table detail, use multiple badges side-by-side. ### Does embedding a badge expose any sensitive information? No. The badge image shows only a pass/fail color and an optional short label. Clicking the badge can link back to AnomalyArmor for authenticated detail, but the image itself has no table names, row counts, or customer data. Safe to embed in public READMEs. ## Next Steps - **[Create Your First Badge](/data-quality/report-badges)**: Step-by-step setup guide - **[Freshness Monitoring](/concepts/freshness)**: Understand what badges monitor --- # Intelligence > AI-powered analysis that connects your monitoring context into actionable answers **Intelligence** aggregates context from schema drift, freshness, data quality, tags, lineage, and alerts, then lets you ask questions and get AI-generated analysis in plain English. 
## How It Works Intelligence reads your operational metadata, not your data values: | What We Analyze | What We Never Access | |----------------|---------------------| | Table and column names | Actual row data | | Data types and constraints | Query results | | Schema change history | Business values | | Freshness patterns and SLAs | Database credentials | | Data quality metric trends | | | Tags and classifications | | Your data values stay in your database. Intelligence only sees structure and operational metadata. ## What You Can Ask ### Finding Tables ``` "Where is customer data?" "Show me tables with email columns" "Which tables were updated today?" ``` ### Cross-Domain Diagnosis ``` "Why is my orders dashboard broken?" "What changed upstream that could cause NULLs in silver_orders?" "What should I prioritize this morning?" ``` ### Tracking Changes ``` "What columns were removed this week?" "Show me schema drift on gold tables" "Which tables had changes since the last deploy?" ``` ## Example Conversation **You**: "Why is the revenue dashboard showing errors?" **Intelligence**: "The column `discount_pct` was removed from `gold.fact_orders` on Tuesday at 4:12 PM. This was caused by an upstream schema change in `staging.orders_raw` at 3:47 PM. Two downstream views reference this column: - `analytics.order_summary` - `reporting.daily_revenue` Freshness is normal (table is still updating). No data quality anomalies on other columns. The issue is isolated to the removed column." 
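The diagnosis in the example conversation boils down to two metadata-only steps: diff two schema snapshots to find removed columns, then look up which downstream objects reference them. The sketch below illustrates that idea. The snapshot and reference structures, the function names, and the extra `reporting.order_counts` view are all hypothetical and are not AnomalyArmor's internal model or API.

```python
"""Sketch: find removed columns and the downstream objects that reference them."""


def removed_columns(before: dict, after: dict) -> list:
    """Return column names present in `before` but missing from `after`."""
    return sorted(set(before) - set(after))


def affected_objects(removed: list, references: dict) -> dict:
    """Map each removed column to the downstream objects that reference it."""
    return {
        col: sorted(obj for obj, cols in references.items() if col in cols)
        for col in removed
    }


# Hypothetical snapshots of gold.fact_orders (column name -> data type).
before = {"order_id": "bigint", "amount": "numeric", "discount_pct": "numeric"}
after = {"order_id": "bigint", "amount": "numeric"}

# Hypothetical column references held by downstream views.
references = {
    "analytics.order_summary": {"order_id", "discount_pct"},
    "reporting.daily_revenue": {"amount", "discount_pct"},
    "reporting.order_counts": {"order_id"},
}

removed = removed_columns(before, after)
impact = affected_objects(removed, references)
print(impact)
```

Only names and types are involved; no row values are needed to reach the conclusion, which is the point of the metadata-only model described above.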
## AI-Generated Profiles When you select a table on the Intelligence page, you see an **Object Profile**: - **Summary and business context**: What this table is for and who uses it - **Findings**: Issues grouped by severity (Critical, High, Medium, Low), sourced from schema drift, freshness, data quality, and other domains - **Relationships**: Foreign keys, lineage signals, hub tables - **Change history**: Recent column additions, removals, and type changes ## Privacy Model | Question | Answer | |----------|--------| | Does Intelligence see my data? | No. Only metadata (names, types, monitoring history). | | Which AI provider is used? | Configurable per account. Supports multiple providers. | | Can I disable Intelligence? | Yes. Contact support. | | Is usage logged? | Yes, for billing and debugging. | ## Common Questions ### Does Intelligence send my data to an LLM provider? No. Intelligence only sends metadata - table names, column names, types, schema history, freshness stats, tags, alerts - never row values. Your data never leaves your database. This is enforced by the same [Query Gateway](/security/query-gateway) that governs monitoring queries. ### Which AI provider does Intelligence use? Configurable per account. Intelligence supports multiple providers, so enterprise customers with BAA or regional-data requirements can pick a provider that matches their compliance posture. Contact your account team to change providers. ### Why did Intelligence give me an answer that was out of date? Intelligence analyses the metadata snapshot as of the last discovery run plus live monitoring state. If discovery hasn't run since the schema change you're asking about, the answer reflects the pre-change state. Trigger a manual discovery on the relevant connection and re-ask. ### Can I disable Intelligence for compliance or privacy reasons? Yes. Intelligence can be disabled at the workspace level - contact support to turn it off. 
Even when enabled, Intelligence only sees metadata, so the privacy delta is small, but some compliance programs require the feature be off by default. ### Can Intelligence write to my database or run queries I didn't approve? No. Intelligence only reads the metadata AnomalyArmor already collects. It has no write path, no ad-hoc query capability on your data, and no way to touch your database beyond what the Query Gateway already allows for scheduled monitoring. ## Next Steps - **[Intelligence Overview](/intelligence/overview)**: Learn how monitoring context powers Intelligence - **[Ask Questions](/intelligence/ask-questions)**: Chat with the AI agent --- # Tagging > Classifying and organizing your assets **Tagging** helps you classify and organize data assets for governance and compliance. Know which tables contain PII before your next audit asks. AnomalyArmor supports two types of tags: auto-classification that detects PII patterns automatically, and custom tags you create to match your organization's needs. ## Tag Types | Type | What It Does | Use For | |------|--------------|---------| | Auto-classification | Detects PII patterns in column names | Compliance, PII tracking | | Custom tags | Labels you define and apply | Organization, ownership, status | ## Common Questions ### Does auto-classification read my actual data to detect PII? No. Auto-classification runs pattern matching against column names, types, and AI-generated profiles - never row values. So a column named `user_email` is classified as PII-email from its name, not from a data scan. See [auto-classification](/data-classification/auto-classification) for the full detection model. ### How do I create a custom tag for something like 'team ownership'? Create a tag set in **Settings → Tags** (e.g., `team:growth`, `team:platform`) then apply it to assets manually, via API, or as a rule (tag any asset in a specific schema with a specific team). See [custom tags](/data-classification/custom-tags). 
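As a rough illustration of the two tagging styles described above, the sketch below classifies columns by name alone and assigns a team tag from an asset's schema. The regular expressions, the `SCHEMA_TEAM_RULES` mapping, and both function names are assumptions made for this example; the actual detection model is documented under auto-classification and is likely broader.

```python
import re
from typing import List, Optional

# Hypothetical name patterns; a real rule set would be broader and tuned per organization.
PII_PATTERNS = {
    "pii:email": re.compile(r"(^|_)e?mail(_|$)", re.IGNORECASE),
    "pii:phone": re.compile(r"(^|_)(phone|mobile)(_|$)", re.IGNORECASE),
    "pii:ssn": re.compile(r"(^|_)ssn(_|$)", re.IGNORECASE),
}

# Hypothetical rule: every asset in a given schema belongs to one team.
SCHEMA_TEAM_RULES = {"growth": "team:growth", "platform": "team:platform"}


def classify_column(name: str) -> List[str]:
    """Return the PII tags whose pattern matches the column name (never row values)."""
    return sorted(tag for tag, pattern in PII_PATTERNS.items() if pattern.search(name))


def team_tag(asset: str) -> Optional[str]:
    """Return the team tag for a 'schema.table' asset, or None if no rule applies."""
    schema = asset.split(".", 1)[0]
    return SCHEMA_TEAM_RULES.get(schema)


for column in ("user_email", "mobile_phone", "id"):
    print(column, classify_column(column))
print(team_tag("growth.signups"))
```

Ambiguous names such as `id` match nothing here, which mirrors why name-based tags are best treated as a starting point for review.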
### Can I use tags in alert rules to scope notifications? Yes. Alert rules support tag-based filtering - route all `pii:email` asset alerts to the security channel, or page only on `tier:critical` tables. This is typically how teams avoid noisy alerting across heterogeneous data. ### Are auto-classification tags guaranteed accurate? They're high-precision heuristics, not a substitute for your own review. Cases like columns with ambiguous names (`id`, `value`, `data`) are more likely to be missed or mis-tagged. Treat auto-tags as a starting point for compliance review, not the final word. ## Next Steps - **[Create Custom Tags](/data-classification/custom-tags)**: Define labels for your organization - **[Auto-Classification](/data-classification/auto-classification)**: Set up automatic PII detection --- # Jobs > Monitor and track background operations on your data assets Jobs are background operations that AnomalyArmor runs on your data assets. The Jobs page provides observability into all operations, helping you understand what actions have been taken, their status, and troubleshoot any failures. ## Why Jobs Matter Every time AnomalyArmor interacts with your data assets, it creates a job. 
This gives you: - **Audit trail**: Know exactly what operations ran and when - **Debugging**: Understand why something failed and how to fix it - **Performance visibility**: See how long operations take - **Operational awareness**: Track what's happening across your data estate ## Job Types | Job Type | Description | Triggered By | |----------|-------------|--------------| | **Asset Discovery** | Analyzes database structure, tables, columns, and relationships | Manual trigger, scheduled refresh | | **Intelligence Analysis** | Generates AI documentation for tables and columns | Manual trigger on asset | | **Tag Generation** | Auto-classifies columns with PII, sensitivity, and domain tags | After asset discovery | | **Freshness Check** | Monitors when tables were last updated | Scheduled intervals | | **Metric Capture** | Collects row counts and custom metrics | Scheduled intervals | | **Lineage Extraction** | Maps data flow between tables | After asset discovery | ## Job Statuses | Status | Meaning | |--------|---------| | **Pending** | Job is queued and waiting to start | | **Running** | Job is actively executing | | **Completed** | Job finished successfully | | **Failed** | Job encountered an error | | **Cancelled** | Job was manually stopped | ## Using the Jobs Page ### Viewing Jobs Navigate to **Jobs** in the left sidebar to see all jobs across your organization. 
The jobs table shows: - **Job ID**: Unique identifier (click row to see details) - **Job Type**: What operation was performed - **Asset**: Which database or table was affected - **Status**: Current job state - **Progress**: Completion percentage for running jobs - **Duration**: How long the job took - **Message**: Status message or error description - **Created**: When the job was started ### Filtering Jobs Use the tabs to filter by status: - **All**: Every job regardless of status - **Active**: Currently running or pending jobs - **Completed**: Successfully finished jobs - **Failed**: Jobs that encountered errors ### Job Details Click any job row to open the details modal, which shows: - Full job configuration - Detailed progress information - Error messages and stack traces (for failed jobs) - Timing breakdown - Related asset information ### Job Actions From the actions menu (three dots), you can: - **View Details**: Open the full job information modal - **Go to Asset**: Navigate to the affected asset - **Go to Job Page**: View results in context (e.g., schema changes page) - **Cancel Job**: Stop a running job - **Retry Job**: Re-run a failed job ## Common Scenarios ### Debugging a Failed Job 1. Navigate to **Jobs** 2. Click the **Failed** tab 3. Click the failed job row to open details 4. Review the error message and stack trace 5. Fix the underlying issue (e.g., connection problem, permission error) 6. Click **Retry Job** to re-run ### Monitoring a Long-Running Job 1. Navigate to **Jobs** 2. Click the **Active** tab 3. Watch the progress percentage update in real-time 4. Click the job row for detailed progress information ### Understanding Job History Use [Intelligence Q&A](/intelligence/overview) to ask questions about job history: ``` "Why did my last job fail?" "What jobs ran this week?" "How long does asset discovery usually take?" "Show me failed jobs for the orders database" ``` ## Best Practices 1. 
**Check jobs after schema changes**: When you modify your database, verify discovery jobs complete successfully 2. **Monitor failed jobs**: Set up alerts for job failures to catch issues early 3. **Review job duration trends**: Sudden increases may indicate database performance issues 4. **Use retry wisely**: If a job fails due to transient issues (network, timeouts), retry is often sufficient ## Common Questions ### Why did my discovery job fail? Open the job from the **Jobs** page and check the error and stack trace. Most failures are one of: expired database credentials, network unreachability (the AnomalyArmor IPs were removed from your firewall), insufficient permissions on new schemas, or a timeout on an unusually large catalog. Fix the underlying cause and click **Retry Job**. ### How long does a discovery job normally take? Under 60 seconds for up to 1,000 tables is typical. A sudden increase in duration often indicates a database-side issue (stats out of date, table explosion, replica lag). Track the duration trend on the Jobs page to catch this early. ### Can I cancel a long-running job? Yes. From the job's actions menu, click **Cancel Job**. In-flight queries are stopped, and the job moves to **Cancelled** state. Restarting is safe - the next run picks up from current database state. ### Do failed jobs fire alerts? Only if you configure a rule for the `Discovery Failed` event on your alerting destinations. See [alert rules](/alerts/alert-rules). Many teams route job failures to email rather than Slack, since they're often recoverable without paging on-call. ## Related - [Asset Discovery](/schema-monitoring/asset-discovery) - Learn about discovery jobs - [Intelligence Q&A](/intelligence/overview) - Ask questions about job history - [Alerts](/alerts/overview) - Set up notifications for job failures --- # Data Sources Overview > Connect AnomalyArmor to your databases and data platforms Connect once, monitor everything. 
Add your database credentials and AnomalyArmor continuously monitors for schema changes, stale data, and quality issues, without installing anything in your infrastructure. Your data stays in your database. We only read metadata (table names, column types, timestamps) through a secure, read-only connection. ## Supported Databases AnomalyArmor supports the most popular data platforms used by modern data teams: | Database | Version | Key Features | |----------|---------|--------------| | **PostgreSQL** | 12+ | Tables, views, schemas, materialized views | | **MySQL** | 5.7+ | Tables, views, schemas | | **SQL Server** | 2012+ | Tables, views, schemas, Azure SQL Database | | **Amazon Redshift** | Any | Tables, views, external tables (Spectrum) | | **Databricks** | Unity Catalog | Catalogs, schemas, Delta tables, views | | **ClickHouse** | 21.8+ | Tables, views, materialized views, dictionaries | - **[PostgreSQL](/data-sources/postgresql)**: Including RDS, Aurora, Supabase, and self-hosted - **[MySQL](/data-sources/mysql)**: Including RDS, Aurora MySQL, PlanetScale, and self-hosted - **[SQL Server](/data-sources/sql-server)**: Including Azure SQL Database and on-premise - **[Amazon Redshift](/data-sources/redshift)**: Provisioned clusters and Serverless workgroups - **[Databricks](/data-sources/databricks)**: Unity Catalog with Delta Lake support - **[ClickHouse](/data-sources/clickhouse)**: Including ClickHouse Cloud ## How Data Sources Work When you add a data source, AnomalyArmor: 1. **Stores credentials securely**: Encrypted with AES-256 2. **Tests connectivity**: Verifies we can reach your database 3. 
**Awaits discovery**: No scanning until you trigger it ## What We Access AnomalyArmor only queries **metadata** from your databases: | We Access | We Never Access | |-----------|-----------------| | `information_schema` | Your actual data | | System catalogs | Row contents | | Table/column names | PII or sensitive values | | Data types | Business data | | Timestamps (for freshness) | Query results | > â„šī¸ See [Security Overview](/security/overview) for detailed information about our security practices. ## Adding a Data Source ### Quick Steps 1. Navigate to **Data Sources** in the sidebar 2. Click **Add Connection** 3. Select your database type 4. Enter connection credentials 5. Click **Test Connection** 6. Click **Save** For detailed instructions, see the guide for your specific database: - [PostgreSQL Setup](/data-sources/postgresql) - [MySQL Setup](/data-sources/mysql) - [SQL Server Setup](/data-sources/sql-server) - [Amazon Redshift Setup](/data-sources/redshift) - [Databricks Setup](/data-sources/databricks) - [ClickHouse Setup](/data-sources/clickhouse) ## Managing Data Sources ### Editing Connections To update a data source: 1. Go to **Data Sources** 2. Click on the connection name 3. Click **Settings** 4. Update credentials or configuration 5. Click **Save** > âš ī¸ Changing credentials will require re-testing the connection. Scheduled discoveries may fail if credentials are incorrect. ### Deleting Connections To remove a data source: 1. Go to **Data Sources** 2. Click on the connection name 3. Click **Settings** → **Delete Connection** 4.
Confirm deletion **What gets deleted**: - The connection and credentials - Discovery schedule - Associated alert rules (optional) **What's preserved**: - Historical schema change data - Audit logs ### Connection Status Each data source shows its status: | Status | Meaning | |--------|---------| | **Connected** | Last discovery succeeded | | **Error** | Connection or permission issue | | **Never Run** | Discovery hasn't been triggered yet | | **Running** | Discovery in progress | ## Network Requirements AnomalyArmor connects outbound to your databases. You'll need to: ### 1. Allow AnomalyArmor IPs Allowlist our static IP addresses in your firewall or security group: ``` AnomalyArmor IP Addresses: 34.xxx.xxx.xxx/32 34.xxx.xxx.xxx/32 ``` > 💡 View current IPs in **Settings → Security** in your dashboard. ### 2. Open Database Port Ensure the database port is accessible: | Database | Default Port | |----------|--------------| | PostgreSQL | 5432 | | MySQL | 3306 | | SQL Server | 1433 | | Amazon Redshift | 5439 | | Databricks | 443 (HTTPS) | | ClickHouse | 8443 (HTTPS) | ### 3. SSL/TLS Configuration We recommend (and often require) encrypted connections: - **PostgreSQL**: SSL Mode = `require` - **MySQL**: SSL Mode = `require` - **SQL Server**: Encryption enabled (required for Azure SQL) - **Amazon Redshift**: SSL required by default - **Databricks**: Always HTTPS - **ClickHouse**: Port 8443 for HTTPS ## Enterprise Options For enhanced security, Enterprise customers can use: ### VPC Peering Direct network peering between your AWS VPC and AnomalyArmor: - No public internet exposure - Lower latency - Private IP connectivity ### AWS PrivateLink Connect via AWS PrivateLink: - Fully private connectivity - No firewall changes needed - Traffic stays on AWS backbone Contact [sales@anomalyarmor.ai](mailto:sales@anomalyarmor.ai) for Enterprise options.
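Before adding a connection, it can help to confirm that the database port from the Network Requirements table is listening at all. The sketch below uses only Python's standard `socket` module; the `DEFAULT_PORTS` mapping and the `port_is_open` helper are names invented for this example. It runs from your own machine, so a success shows the port is reachable from there, not that the AnomalyArmor IPs are allowlisted.

```python
import socket

# Default ports from the Network Requirements table above.
DEFAULT_PORTS = {
    "PostgreSQL": 5432,
    "MySQL": 3306,
    "SQL Server": 1433,
    "Amazon Redshift": 5439,
    "Databricks": 443,
    "ClickHouse": 8443,
}


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Example call (db.example.com is a placeholder hostname):
#   port_is_open("db.example.com", DEFAULT_PORTS["PostgreSQL"])
```

A `False` result does not distinguish a firewall drop from a stopped database; the per-database troubleshooting sections cover how to narrow that down.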
## Best Practices ### Use Read-Only Credentials Always create a dedicated, read-only user for AnomalyArmor: ```sql -- Example for PostgreSQL CREATE USER anomalyarmor WITH PASSWORD 'secure-password'; GRANT CONNECT ON DATABASE your_db TO anomalyarmor; GRANT USAGE ON SCHEMA public TO anomalyarmor; GRANT SELECT ON ALL TABLES IN SCHEMA public TO anomalyarmor; ``` ### Use Descriptive Names Name your data sources clearly: **Good names**: - `Production PostgreSQL` - `Analytics Databricks` - `Staging ClickHouse` **Avoid**: - `db1` - `test` - `connection` ### Start with One Environment Begin with your production database, then expand to staging and development environments once you're comfortable with the setup. ## Troubleshooting ### Connection test fails 1. Verify hostname and port are correct 2. Check credentials are valid 3. Ensure AnomalyArmor IPs are allowlisted 4. Verify SSL/TLS settings match your database ### Discovery finds no tables 1. Verify user has `SELECT` on `information_schema` 2. Check schema filters aren't excluding everything 3. Confirm tables exist in the monitored schemas ### Intermittent connection errors 1. Check database availability and load 2. Verify network stability 3. Consider using a read replica for monitoring ## Common Questions ### What databases does AnomalyArmor support today? PostgreSQL (12+), MySQL (5.7+), SQL Server (2012+, including Azure SQL), Amazon Redshift, Databricks (with Unity Catalog), and ClickHouse (21.8+). RDS, Aurora, Supabase, PlanetScale, Cloud SQL, and Azure Database are all supported via the compatible engine's connector. Snowflake and BigQuery are in active development. ### Can I connect a database that's behind a firewall or in a private VPC? Yes. Three options: (1) allowlist AnomalyArmor's static outbound IPs (visible in **Settings → Security**) in your firewall; (2) VPC peering (Enterprise); (3) AWS PrivateLink (Enterprise). 
SSH tunnel / bastion host is also supported for PostgreSQL and MySQL when direct connectivity is not possible. ### Can I monitor multiple databases from one AnomalyArmor workspace? Yes - most customers do. Connect as many as you need; assets across all connections appear in a unified catalog. Plan limits are on *monitored tables*, not number of connections. ### Should I point AnomalyArmor at production or a replica? Either works. A replica has zero impact on primary workload and is the recommended pattern for production-grade deployments. Freshness reflects replica timestamps, so factor in replication lag when setting SLAs. ### What happens if my database credentials expire or rotate? Discovery and monitoring jobs fail and the connection moves to **Error** status. Update credentials under the connection's settings and re-test; in-flight jobs will resume on the next scheduled run. Consider using credential managers (AWS Secrets Manager, Vault) and rotating via Enterprise SSO to reduce manual updates. ## Next Steps - **[Connect PostgreSQL](/data-sources/postgresql)**: Full guide with RDS, Aurora, and Supabase instructions - **[Run Discovery](/quickstart/run-first-discovery)**: Scan your database after connecting --- # PostgreSQL > Connect AnomalyArmor to PostgreSQL databases including RDS, Aurora, and Supabase Connect AnomalyArmor to any PostgreSQL-compatible database. This guide covers self-hosted PostgreSQL as well as managed services like Amazon RDS, Aurora, and Supabase. 
## Supported Versions & Platforms | Platform | Minimum Version | Notes | |----------|-----------------|-------| | PostgreSQL | 12+ | Self-hosted or any cloud | | Amazon RDS | 12+ | All instance classes | | Amazon Aurora | PostgreSQL 12+ | Cluster and serverless | | Supabase | Any | Direct connection or pooler | | Google Cloud SQL | 12+ | Public or private IP | | Azure Database | 12+ | Single server or flexible | | Heroku Postgres | Any | Requires SSL | ## Connection Settings | Field | Description | Example | |-------|-------------|---------| | **Connection Name** | Friendly identifier | `Production PostgreSQL` | | **Host** | Hostname or IP address | `db.example.com` | | **Port** | Database port | `5432` | | **Database** | Database name | `myapp_production` | | **Username** | Database user | `anomalyarmor` | | **Password** | User password | `••••••••` | | **SSL Mode** | SSL configuration | `require` | ### SSL Mode Options SSL (Secure Sockets Layer) encrypts the connection between AnomalyArmor and your database, preventing eavesdropping on sensitive data like credentials and query results. **Why use SSL?** - **Security**: Encrypts all data in transit, protecting against network sniffing - **Compliance**: Required for SOC2, HIPAA, PCI-DSS, and other security standards - **Cloud providers**: Most managed databases (RDS, Aurora, Cloud SQL) require or strongly recommend SSL **When SSL may not be needed:** - Local development databases on `localhost` - Databases on a private network with no external access - Testing environments with non-sensitive data #### Choosing an SSL Mode | Mode | Security Level | Description | |------|----------------|-------------| | `disable` | None | No encryption. Data sent in plain text. | | `allow` | Low | Uses SSL only if server requires it. | | `prefer` | Medium | Tries SSL first, falls back to unencrypted if unavailable. | | `require` | High | Always uses SSL, but doesn't verify the server certificate.
| | `verify-ca` | Higher | Uses SSL and verifies the server certificate is signed by a trusted CA. | | `verify-full` | Highest | Uses SSL, verifies CA, and confirms the server hostname matches the certificate. | #### Recommendations by Environment | Environment | Recommended Mode | Reason | |-------------|------------------|--------| | **Local development** | `prefer` or `disable` | Convenience for local testing | | **Cloud databases** (RDS, Aurora, Cloud SQL) | `require` | SSL is available; certificate verification often not needed | | **Production with compliance** | `verify-ca` or `verify-full` | Maximum security for sensitive data | | **Heroku, Supabase** | `require` | These platforms require SSL | > 💡 When in doubt, start with `prefer`. It provides encryption when available without blocking connections if SSL isn't configured on your database. > âš ī¸ Never use `disable` for production databases or any database containing sensitive data. ## SSH Tunnel (Bastion Host) For databases behind firewalls, AnomalyArmor supports SSH tunnel connections through a bastion host. This is common in enterprise environments where databases are not directly accessible from the internet.
### When to Use SSH Tunnel - Database is in a private subnet with no public IP - Firewall rules prevent direct connections - Security policy requires bastion host access ### SSH Tunnel Settings Enable **SSH Tunnel** in the connection form to reveal these fields: | Field | Description | Example | |-------|-------------|---------| | **SSH Host** | Bastion server hostname | `bastion.example.com` | | **SSH Port** | SSH port (usually 22) | `22` | | **SSH Username** | SSH user on bastion | `ec2-user` | | **Authentication Method** | Key or Password | `Key` | | **SSH Private Key** | PEM-formatted private key | `-----BEGIN RSA PRIVATE KEY-----...` | | **Key Passphrase** (Optional) | For encrypted keys | `••••••••` | | **SSH Password** | If using password auth | `••••••••` | ### Key-Based Authentication (Recommended) 1. Generate an SSH key pair in PEM format (or use an existing one): ```bash ssh-keygen -t rsa -b 4096 -m PEM -f anomalyarmor_key ``` 2. Add the public key to the bastion host's `~/.ssh/authorized_keys` 3. In AnomalyArmor, paste the contents of the private key file or click **Upload Key File** > 💡 Key-based authentication is more secure and doesn't require password rotation. ### Password Authentication If your bastion host uses password authentication: 1. Set **Authentication Method** to `Password` 2. Enter the SSH password > âš ī¸ Key-based authentication is more secure than passwords. Use password auth only if key auth is not available. ### Connection Flow with SSH Tunnel 1. AnomalyArmor connects to your bastion host via SSH 2. An encrypted tunnel is established to your database 3. Database traffic flows securely through the tunnel 4. The tunnel closes automatically after each operation ## Creating a Read-Only User Create a dedicated user with minimal permissions. > 💡 **Quick Setup**: [View the PostgreSQL permissions script](/downloads/postgresql-permissions) for a ready-to-use SQL template with all necessary grants.
```sql -- Create the user CREATE USER anomalyarmor WITH PASSWORD 'your-secure-password'; -- Grant connection access GRANT CONNECT ON DATABASE your_database TO anomalyarmor; -- Grant schema access (repeat for each schema) GRANT USAGE ON SCHEMA public TO anomalyarmor; GRANT USAGE ON SCHEMA analytics TO anomalyarmor; -- Grant read access to existing tables GRANT SELECT ON ALL TABLES IN SCHEMA public TO anomalyarmor; GRANT SELECT ON ALL TABLES IN SCHEMA analytics TO anomalyarmor; -- Grant access to future tables (recommended) ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO anomalyarmor; ALTER DEFAULT PRIVILEGES IN SCHEMA analytics GRANT SELECT ON TABLES TO anomalyarmor; ``` ### Verifying Permissions Test that the user can access metadata: ```sql -- Should return tables SELECT table_name FROM information_schema.tables WHERE table_schema = 'public' LIMIT 5; -- Should return columns SELECT column_name, data_type FROM information_schema.columns WHERE table_schema = 'public' LIMIT 5; ``` ## Provider-Specific Instructions ### Amazon RDS PostgreSQL **Connection Details**: - **Host**: Your RDS endpoint (e.g., `mydb.abc123.us-east-1.rds.amazonaws.com`) - **Port**: `5432` (default) - **SSL Mode**: `require` **Security Group Configuration**: 1. Go to **AWS Console → RDS → Your Instance → Security Groups** 2. Edit inbound rules 3. Add rule: - Type: `PostgreSQL` - Port: `5432` - Source: AnomalyArmor IPs (see Settings → Security) **Parameter Group** (if using verify-ca or verify-full): - Ensure `rds.force_ssl = 1` - Download RDS CA certificate bundle > â„šī¸ RDS instances in private subnets require NAT Gateway or VPC peering for AnomalyArmor access. Contact us for Enterprise VPC peering options.
### Amazon Aurora PostgreSQL **Connection Details**: - **Host**: Cluster endpoint (reader or writer) - **Port**: `5432` (default) - **SSL Mode**: `require` **Choosing the Right Endpoint**: | Endpoint | Use Case | |----------|----------| | **Cluster (writer)** | If you need real-time schema changes | | **Reader** | Recommended - no impact on production writes | **Aurora Serverless v2**: - Use the cluster endpoint - Ensure minimum ACU allows connections during discovery - Consider scheduling discovery during active hours ### Supabase PostgreSQL **Connection Options**: | Method | Host | Port | When to Use | |--------|------|------|-------------| | **Direct** | `db.xxx.supabase.co` | `5432` | Standard setup | | **Pooler (Transaction)** | `xxx.pooler.supabase.com` | `6543` | High connection limits | | **Pooler (Session)** | `xxx.pooler.supabase.com` | `5432` | If direct fails | **Finding Your Credentials**: 1. Go to **Supabase Dashboard → Settings → Database** 2. Copy the connection string or individual fields 3. Use the database password (not the API key) **SSL Configuration**: - SSL Mode: `require` - Supabase enforces SSL by default > âš ī¸ Don't use the Supabase API key as the password. Use the actual database password from Settings → Database.
### Self-Hosted PostgreSQL **Connection Details**: - **Host**: Your server's hostname or IP - **Port**: `5432` (or custom port) - **SSL Mode**: Depends on your setup **Firewall Configuration**: Allow inbound connections from AnomalyArmor IPs: ```bash # iptables example iptables -A INPUT -p tcp --dport 5432 -s 34.xxx.xxx.xxx -j ACCEPT iptables -A INPUT -p tcp --dport 5432 -s 34.xxx.xxx.xxx -j ACCEPT ``` **pg_hba.conf Configuration**: Add entries for AnomalyArmor: ``` # TYPE DATABASE USER ADDRESS METHOD hostssl all anomalyarmor 34.xxx.xxx.xxx/32 scram-sha-256 hostssl all anomalyarmor 34.xxx.xxx.xxx/32 scram-sha-256 ``` **SSL Setup** (if not already configured): ```bash # Generate self-signed certificate (for testing) openssl req -new -x509 -days 365 -nodes \ -out server.crt -keyout server.key # Set permissions chmod 600 server.key chown postgres:postgres server.key server.crt # Enable in postgresql.conf ssl = on ssl_cert_file = '/path/to/server.crt' ssl_key_file = '/path/to/server.key' ``` ### Google Cloud SQL for PostgreSQL **Connection Methods**: | Method | Description | |--------|-------------| | **Public IP** | Add AnomalyArmor IPs to authorized networks | | **Cloud SQL Proxy** | For private IP instances (self-managed) | **Public IP Setup**: 1. Go to **Cloud Console → SQL → Your Instance → Connections** 2. Under **Authorized Networks**, click **Add Network** 3. Add each AnomalyArmor IP **Connection Details**: - **Host**: Public IP from instance overview - **Port**: `5432` - **SSL Mode**: `require` > â„šī¸ Cloud SQL requires SSL by default. If you need `verify-ca`, download the server certificate from the instance details.
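Across the providers above, the connection form takes the same handful of fields plus an SSL mode. One way to sanity-check those values before entering them is to assemble the equivalent libpq-style connection URI and try it with `psql` or another PostgreSQL client. The sketch below builds such a URI with Python's standard library; the `postgres_uri` helper is a name made up for this example, and nothing here implies AnomalyArmor constructs its connections this way.

```python
from urllib.parse import quote, urlencode

# The six SSL modes listed under "Choosing an SSL Mode".
SSL_MODES = ("disable", "allow", "prefer", "require", "verify-ca", "verify-full")


def postgres_uri(host, database, user, password, port=5432, sslmode="require"):
    """Build a postgresql:// URI, rejecting SSL modes outside the documented set."""
    if sslmode not in SSL_MODES:
        raise ValueError(f"unknown sslmode: {sslmode!r}")
    # Percent-encode credentials so characters like '@' or '/' don't break the URI.
    userinfo = f"{quote(user, safe='')}:{quote(password, safe='')}"
    query = urlencode({"sslmode": sslmode})
    return f"postgresql://{userinfo}@{host}:{port}/{quote(database, safe='')}?{query}"


# Placeholder values taken from the Connection Settings examples.
uri = postgres_uri("db.example.com", "myapp_production", "anomalyarmor", "p@ss/word")
print(uri)
```

Defaulting to `require` matches the recommendation for managed cloud databases; pass `sslmode="verify-full"` when certificate and hostname verification are needed.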
## Connection Pooling Considerations If you use a connection pooler (PgBouncer, Pgpool): ### PgBouncer - **Transaction mode**: Works with AnomalyArmor - **Session mode**: Recommended for best compatibility - **Statement mode**: May have issues with complex queries > 💡 Connect directly to PostgreSQL, not through PgBouncer, unless you have connection limit constraints. ### Connection Limits AnomalyArmor uses **1-2 connections** during discovery. If you're near your connection limit: 1. Use a read replica for monitoring 2. Schedule discovery during off-peak hours 3. Increase `max_connections` if possible ## What We Query AnomalyArmor runs these types of queries: ```sql -- Tables and views SELECT * FROM information_schema.tables WHERE table_schema NOT IN ('pg_catalog', 'information_schema'); -- Columns SELECT * FROM information_schema.columns WHERE table_schema NOT IN ('pg_catalog', 'information_schema'); -- Constraints SELECT * FROM information_schema.table_constraints; -- Freshness (for timestamp columns) SELECT MAX(your_timestamp_column) FROM your_table; ``` **Impact**: Minimal. These are lightweight metadata queries. ## Troubleshooting ### Connection refused **Causes**: - Firewall blocking the connection - Wrong hostname or port - Database not running **Solutions**: 1. Verify AnomalyArmor IPs are allowlisted 2. Check security group rules (for RDS/Aurora) 3. Test connectivity: `nc -zv hostname 5432` 4. Verify database is accepting connections ### Password authentication failed **Causes**: - Wrong password - User doesn't exist - pg_hba.conf not configured **Solutions**: 1. Verify password (copy-paste to avoid typos) 2. Confirm user exists: `SELECT usename FROM pg_user;` 3. Check pg_hba.conf allows the connection method 4. Try resetting the password ### SSL connection required **Causes**: - Database requires SSL but connection uses `disable` - Wrong SSL mode for the server **Solutions**: 1. Set SSL Mode to `require` 2. For RDS/Aurora/Supabase: SSL is required 3.
For self-hosted: Enable SSL or allow non-SSL (not recommended)

### Permission denied for relation

**Causes**:
- User lacks SELECT permission
- Schema permission missing

**Solutions**:

```sql
-- Grant schema access
GRANT USAGE ON SCHEMA public TO anomalyarmor;

-- Grant table access
GRANT SELECT ON ALL TABLES IN SCHEMA public TO anomalyarmor;
```

### No tables found in discovery

**Causes**:
- User can't see tables in information_schema
- Schema filter excluding all schemas

**Solutions**:
1. Test as the user: `SELECT * FROM information_schema.tables LIMIT 5;`
2. Check schema filter settings in AnomalyArmor
3. Verify tables exist in the expected schemas

### SSH tunnel connection failed

**Causes**:
- Invalid SSH credentials
- Bastion host not reachable
- SSH port blocked

**Solutions**:
1. Test SSH connection manually: `ssh -i key.pem user@bastion.example.com`
2. Verify SSH host and port are correct
3. Check that AnomalyArmor IPs can reach the bastion host
4. Ensure the SSH user has permission to forward connections

### SSH authentication failed

**Causes**:
- Invalid private key format
- Wrong passphrase for encrypted key
- Public key not added to bastion

**Solutions**:
1. Verify key is in PEM format (starts with `-----BEGIN`)
2. For encrypted keys, ensure passphrase is correct
3. Check `~/.ssh/authorized_keys` on bastion includes your public key
4. Verify SSH user exists on the bastion host

## Common Questions

### Which PostgreSQL-compatible services does AnomalyArmor support?

PostgreSQL 12+ self-hosted, Amazon RDS, Amazon Aurora (cluster and serverless), Supabase, Google Cloud SQL, Azure Database for PostgreSQL (single and flexible server), and Heroku Postgres. Any managed PostgreSQL that exposes the wire protocol on a network-reachable port works.

### What SSL Mode should I use for RDS or Aurora?

`require` is the right default - it encrypts traffic without pinning certificates.
Use `verify-ca` or `verify-full` if your compliance program requires CA validation; you'll need to upload the RDS/Aurora CA bundle in connection settings. Never use `disable` for managed cloud databases.

### My PostgreSQL is only reachable through a bastion host. Can AnomalyArmor still connect?

Yes. Enable SSH tunnel mode on the connection, provide the bastion's host, port, user, and an SSH key, and AnomalyArmor tunnels to your PostgreSQL through it. The bastion needs outbound access to your database on port 5432.

### Does AnomalyArmor support PostgreSQL logical replication or CDC?

Not for primary monitoring. AnomalyArmor monitors via `information_schema` and bounded aggregates on a schedule - it does not read the WAL or consume replication slots. For freshness, it uses `MAX(timestamp_column)`, which works without replication.

### How do I give AnomalyArmor access to future tables without re-granting permissions?

Use `ALTER DEFAULT PRIVILEGES` so new tables created in the monitored schema automatically grant `SELECT` to the AnomalyArmor user:

```sql
ALTER DEFAULT PRIVILEGES IN SCHEMA public
GRANT SELECT ON TABLES TO anomalyarmor;
```

Run it once per schema you want monitored. Existing tables still need an explicit `GRANT SELECT ON ALL TABLES` the first time.

## Next Steps

- **[Run Discovery](/quickstart/run-first-discovery)**: Scan your PostgreSQL database
- **[Set Up Alerts](/quickstart/set-up-first-alert)**: Get notified of schema changes

---

# MySQL

> Connect AnomalyArmor to MySQL databases including RDS, Aurora MySQL, and PlanetScale

Connect AnomalyArmor to any MySQL-compatible database. This guide covers self-hosted MySQL as well as managed services like Amazon RDS, Aurora MySQL, PlanetScale, and DigitalOcean.
## Supported Versions & Platforms

| Platform | Minimum Version | Notes |
|----------|-----------------|-------|
| MySQL | 5.7+ | Self-hosted or any cloud |
| Amazon RDS | 5.7+ | All instance classes |
| Amazon Aurora MySQL | 5.7+ | Cluster and serverless |
| PlanetScale | Any | Serverless MySQL |
| DigitalOcean | 8.0+ | Managed databases |
| Google Cloud SQL | 5.7+ | Public or private IP |
| Azure Database | 5.7+ | Single server or flexible |
| MariaDB | 10.3+ | MySQL-compatible fork |

## Connection Settings

| Field | Description | Example |
|-------|-------------|---------|
| **Connection Name** | Friendly identifier | `Production MySQL` |
| **Host** | Hostname or IP address | `db.example.com` |
| **Port** | Database port | `3306` |
| **Database** | Database name | `myapp_production` |
| **Username** | Database user | `anomalyarmor` |
| **Password** | User password | `â€ĸâ€ĸâ€ĸâ€ĸâ€ĸâ€ĸâ€ĸâ€ĸ` |
| **SSL Mode** | SSL configuration | `require` |

### SSL Mode Options

| Mode | Description | When to Use |
|------|-------------|-------------|
| `disable` | No SSL | Local development only |
| `require` | SSL required, no verification | **Recommended** for most cloud providers |
| `verify-ca` | Verify server certificate | High security requirements |
| `verify-full` | Verify certificate and hostname | Maximum security |

> âš ī¸ Never use `disable` for production databases. Most cloud providers (RDS, Aurora, PlanetScale) require SSL.
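To reproduce an SSL Mode setting from the command line, the mode names in the table can be translated to the `--ssl-mode` values accepted by the `mysql` client (MySQL 5.7.11 and later). This is a sketch that assumes the table's modes correspond one-to-one to those client values; the helper name `mysql_ssl_mode` is hypothetical, and MariaDB's client uses different SSL flags, so it does not apply there.

```bash
# Map an SSL Mode value from the table above to the mysql client's --ssl-mode value.
# Assumed correspondence; prints an error and returns 1 for unknown modes.
mysql_ssl_mode() {
  case "$1" in
    disable)     echo DISABLED ;;
    require)     echo REQUIRED ;;
    verify-ca)   echo VERIFY_CA ;;
    verify-full) echo VERIFY_IDENTITY ;;
    *) echo "unknown SSL mode: $1" >&2; return 1 ;;
  esac
}

mysql_ssl_mode require   # prints REQUIRED
```

A manual test could then look like `mysql --host=db.example.com --user=anomalyarmor -p --ssl-mode="$(mysql_ssl_mode require)" -e "SHOW SESSION STATUS LIKE 'Ssl_cipher';"`, where a non-empty `Ssl_cipher` value indicates the session is encrypted.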
## Creating a Read-Only User

Create a dedicated user with minimal permissions:

```sql
-- Create the user
CREATE USER 'anomalyarmor'@'%' IDENTIFIED BY 'your-secure-password';

-- Grant read access to your database
GRANT SELECT ON your_database.* TO 'anomalyarmor'@'%';

-- Access to information_schema is implicit with SELECT

-- Flush privileges to apply changes
FLUSH PRIVILEGES;
```

### For Multiple Databases

If you want to monitor multiple databases:

```sql
-- Grant access to specific databases
GRANT SELECT ON database1.* TO 'anomalyarmor'@'%';
GRANT SELECT ON database2.* TO 'anomalyarmor'@'%';
GRANT SELECT ON analytics.* TO 'anomalyarmor'@'%';

FLUSH PRIVILEGES;
```

### Verifying Permissions

Test that the user can access metadata:

```sql
-- Should return tables
SELECT TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = 'your_database'
LIMIT 5;

-- Should return columns
SELECT COLUMN_NAME, DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'your_database'
LIMIT 5;
```

## Provider-Specific Instructions

### Amazon RDS MySQL

**Connection Details**:
- **Host**: Your RDS endpoint (e.g., `mydb.abc123.us-east-1.rds.amazonaws.com`)
- **Port**: `3306` (default)
- **SSL Mode**: `require`

**Security Group Configuration**:
1. Go to **AWS Console → RDS → Your Instance → Security Groups**
2. Edit inbound rules
3. Add rule:
   - Type: `MySQL/Aurora`
   - Port: `3306`
   - Source: AnomalyArmor IPs (see Settings → Security)

**Parameter Group** (if using verify-ca or verify-full):
- Ensure `require_secure_transport = ON`
- Download RDS CA certificate bundle

> â„šī¸ RDS instances in private subnets require NAT Gateway or VPC peering for AnomalyArmor access. Contact us for Enterprise VPC peering options.
### Amazon Aurora MySQL

**Connection Details**:
- **Host**: Cluster endpoint (reader or writer)
- **Port**: `3306` (default)
- **SSL Mode**: `require`

**Choosing the Right Endpoint**:

| Endpoint | Use Case |
|----------|----------|
| **Cluster (writer)** | If you need real-time schema changes |
| **Reader** | Recommended - no impact on production writes |

**Aurora Serverless v2**:
- Use the cluster endpoint
- Ensure minimum ACU allows connections during discovery
- Consider scheduling discovery during active hours

### PlanetScale

**Connection Details**:
- **Host**: Your branch endpoint (e.g., `aws.connect.psdb.cloud`)
- **Port**: `3306`
- **SSL Mode**: `require` (mandatory)

**Getting Credentials**:
1. Go to **PlanetScale Dashboard → Your Database → Settings → Passwords**
2. Create a new password with **Read-only** access
3. Copy the connection details

**Important Notes**:
- PlanetScale requires SSL - `disable` mode will fail
- Use the `main` branch for production monitoring
- Create separate credentials for AnomalyArmor

> âš ī¸ PlanetScale uses Vitess, which may show slightly different table metadata than standard MySQL. All core functionality works correctly.
### Self-Hosted MySQL

**Connection Details**:
- **Host**: Your server's hostname or IP
- **Port**: `3306` (or custom port)
- **SSL Mode**: Depends on your setup

**Firewall Configuration**:

Allow inbound connections from AnomalyArmor IPs:

```bash
# iptables example (one rule per AnomalyArmor IP)
iptables -A INPUT -p tcp --dport 3306 -s 34.xxx.xxx.xxx -j ACCEPT
iptables -A INPUT -p tcp --dport 3306 -s 34.xxx.xxx.xxx -j ACCEPT
```

**MySQL Configuration** (my.cnf):

Allow remote connections:

```ini
[mysqld]
bind-address = 0.0.0.0
```

**SSL Setup** (if not already configured):

```bash
# Generate certificates
mysql_ssl_rsa_setup --uid=mysql

# Enable in my.cnf (these are config settings, not shell commands):
#   [mysqld]
#   ssl-ca=/var/lib/mysql/ca.pem
#   ssl-cert=/var/lib/mysql/server-cert.pem
#   ssl-key=/var/lib/mysql/server-key.pem
#   require_secure_transport=ON
```

**User Host Configuration**:

Ensure user is created for remote access:

```sql
-- Allow from any host
CREATE USER 'anomalyarmor'@'%' IDENTIFIED BY 'password';

-- Or allow from specific IPs only
CREATE USER 'anomalyarmor'@'34.xxx.xxx.xxx' IDENTIFIED BY 'password';
```

### Google Cloud SQL for MySQL

**Connection Methods**:

| Method | Description |
|--------|-------------|
| **Public IP** | Add AnomalyArmor IPs to authorized networks |
| **Cloud SQL Proxy** | For private IP instances (self-managed) |

**Public IP Setup**:
1. Go to **Cloud Console → SQL → Your Instance → Connections**
2. Under **Authorized Networks**, click **Add Network**
3. Add each AnomalyArmor IP

**Connection Details**:
- **Host**: Public IP from instance overview
- **Port**: `3306`
- **SSL Mode**: `require`

> â„šī¸ Cloud SQL requires SSL by default. If you need `verify-ca`, download the server certificate from the instance details.

### DigitalOcean Managed MySQL

**Connection Details**:
- **Host**: Your database cluster hostname
- **Port**: `25060` (DigitalOcean uses a non-standard port)
- **SSL Mode**: `require` (mandatory)

**Getting Credentials**:
1.
Go to **DigitalOcean → Databases → Your Cluster**
2. Click **Connection Details**
3. Select **Direct Connection** or **Connection Pool**

**Trusted Sources**:
1. Go to **Settings → Trusted Sources**
2. Add AnomalyArmor IP addresses

> â„šī¸ DigitalOcean requires SSL and uses port 25060 by default.

## What We Query

AnomalyArmor runs these types of queries:

```sql
-- Tables and views
SELECT TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys');

-- Columns
SELECT COLUMN_NAME, DATA_TYPE, IS_NULLABLE, COLUMN_DEFAULT
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'your_database';

-- Freshness (for timestamp columns)
SELECT MAX(your_timestamp_column) FROM your_table;
```

**Impact**: Minimal. These are lightweight metadata queries.

## Excluded Schemas

AnomalyArmor automatically excludes MySQL system schemas:

- `mysql`
- `information_schema`
- `performance_schema`
- `sys`

Only user-created databases and tables are monitored.

## Troubleshooting

### Can't connect to MySQL server

**Causes**:
- Firewall blocking the connection
- Wrong hostname or port
- Database not running

**Solutions**:
1. Verify AnomalyArmor IPs are allowlisted
2. Check security group rules (for RDS/Aurora)
3. Test connectivity: `nc -zv hostname 3306`
4. Verify MySQL is running: `systemctl status mysql`

### Access denied for user

**Causes**:
- Wrong password
- User doesn't exist for connecting host
- User lacks privileges

**Solutions**:
1. Verify password (copy-paste to avoid typos)
2. Confirm user exists: `SELECT User, Host FROM mysql.user;`
3. Check user is created for `%` or specific IP
4. Verify grants: `SHOW GRANTS FOR 'anomalyarmor'@'%';`

### SSL connection error

**Causes**:
- Database requires SSL but connection uses `disable`
- SSL certificate issues

**Solutions**:
1. Set SSL Mode to `require`
2. For RDS/Aurora/PlanetScale: SSL is required
3.
For self-hosted: Enable SSL or allow non-SSL (not recommended)

### Unknown database

**Causes**:
- Database name is incorrect
- Database names are case-sensitive on Linux

**Solutions**:
1. Verify database name: `SHOW DATABASES;`
2. Use exact case for database name
3. Check you have access: `SHOW DATABASES;` (shows only accessible DBs)

### No tables found in discovery

**Causes**:
- User can't see tables in information_schema
- Schema filter excluding all schemas

**Solutions**:
1. Test as the user: `SELECT * FROM INFORMATION_SCHEMA.TABLES LIMIT 5;`
2. Check schema filter settings in AnomalyArmor
3. Verify tables exist in the database

## Common Questions

### Which MySQL versions and managed services does AnomalyArmor support?

MySQL 5.7 and 8.0+, Amazon RDS for MySQL, Aurora MySQL, Google Cloud SQL for MySQL, Azure Database for MySQL, and PlanetScale. MariaDB is MySQL-compatible for most monitoring use cases but not officially supported - test before relying on it in production.

### Does MySQL 5.7 have any monitoring limitations compared to 8.0?

Minor. MySQL 5.7 doesn't expose some column-statistics views that 8.0 has, so a few metric baselines are coarser. Schema drift and freshness work identically on both versions - no migration urgency for observability alone.

### Can AnomalyArmor monitor PlanetScale branches?

Yes. Each connection points at one PlanetScale branch. Monitor `main` for production; add separate data sources per branch if you want pre-merge schema checks. PlanetScale's schema migration flow surfaces in AnomalyArmor as schema drift on the target branch.

### How should I configure SSL for RDS MySQL?

Set SSL Mode to `require` (RDS MySQL's default). Upload Amazon's RDS CA bundle in connection settings if your compliance program requires `verify-ca` or `verify-full`.

### Does AnomalyArmor read MySQL binlogs?

No. AnomalyArmor queries `information_schema` and runs bounded aggregates on a schedule.
Binlog access is not required - if you've granted `REPLICATION CLIENT` to the AnomalyArmor user, you can safely revoke it.

## Next Steps

- **[Run Discovery](/quickstart/run-first-discovery)**: Scan your MySQL database
- **[Set Up Alerts](/quickstart/set-up-first-alert)**: Get notified of schema changes

---

# Snowflake

> Connect AnomalyArmor to your Snowflake data warehouse

Connect AnomalyArmor to your Snowflake data warehouse to monitor schemas, track freshness, and detect schema drift across your tables and views.

## Requirements

Before connecting, ensure you have:

- **Snowflake account** with database access
- **Virtual warehouse** for query execution
- **User with SELECT permissions** on information_schema and target schemas
- **Network access** from AnomalyArmor to your Snowflake account

## Connection Settings

| Field | Description | Example |
|-------|-------------|---------|
| **Connection Name** | Friendly identifier | `Snowflake Production` |
| **Account** | Snowflake account identifier | `xy12345.us-east-1.aws` |
| **Database** | Database (catalog) to monitor | `PRODUCTION_DB` |
| **Warehouse** | Virtual warehouse name | `COMPUTE_WH` |
| **Username** | Snowflake user account | `anomalyarmor_user` |
| **Password** | User password | `â€ĸâ€ĸâ€ĸâ€ĸâ€ĸâ€ĸâ€ĸâ€ĸ` |
| **Schema** (Optional) | Default schema | `PUBLIC` |
| **Role** (Optional) | Role to use | `ANOMALYARMOR_ROLE` |

## Finding Your Account Identifier

Your account identifier format depends on your cloud provider and region:

| Cloud Provider | Format |
|---------------|--------|
| **Standard (AWS)** | `xy12345.us-east-1.aws` |
| **Azure** | `xy12345.east-us-2.azure` |
| **GCP** | `xy12345.us-central1.gcp` |

You can find your account identifier in the URL of the Snowflake web interface.

> âš ī¸ Include the full account identifier with region and cloud provider. Using just the account locator (e.g., `xy12345`) may not work for all regions.
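A quick shape check can catch a bare account locator before it reaches the connection form. The sketch below only tests for the three-part `locator.region.cloud` pattern shown in the table; the function name `is_full_account_identifier` is hypothetical, other identifier formats your account may use are not covered, and it cannot tell whether the account actually exists.

```bash
# Return 0 if the identifier has the locator.region.cloud shape from the table above.
# Shape check only: it does not contact Snowflake.
is_full_account_identifier() {
  printf '%s\n' "$1" | grep -Eq '^[A-Za-z0-9]+\.[a-z0-9-]+\.(aws|azure|gcp)$'
}

# Example identifiers from this page, plus a bare locator.
for id in xy12345.us-east-1.aws xy12345.east-us-2.azure xy12345; do
  if is_full_account_identifier "$id"; then
    echo "$id: looks complete"
  else
    echo "$id: missing region or cloud"
  fi
done
```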
## Creating a Read-Only User

Create a dedicated user and role with minimal permissions.

> 💡 **Quick Setup**: [Download the Snowflake permissions script](/downloads/snowflake-permissions) for a ready-to-use SQL template with role and user setup.

```sql
-- Create a read-only role for AnomalyArmor
CREATE ROLE IF NOT EXISTS ANOMALYARMOR_ROLE;

-- Grant database access
GRANT USAGE ON DATABASE your_database TO ROLE ANOMALYARMOR_ROLE;

-- Grant schema access (for all schemas)
GRANT USAGE ON ALL SCHEMAS IN DATABASE your_database TO ROLE ANOMALYARMOR_ROLE;
GRANT USAGE ON FUTURE SCHEMAS IN DATABASE your_database TO ROLE ANOMALYARMOR_ROLE;

-- Grant SELECT on all tables
GRANT SELECT ON ALL TABLES IN DATABASE your_database TO ROLE ANOMALYARMOR_ROLE;
GRANT SELECT ON FUTURE TABLES IN DATABASE your_database TO ROLE ANOMALYARMOR_ROLE;

-- Grant SELECT on all views
GRANT SELECT ON ALL VIEWS IN DATABASE your_database TO ROLE ANOMALYARMOR_ROLE;
GRANT SELECT ON FUTURE VIEWS IN DATABASE your_database TO ROLE ANOMALYARMOR_ROLE;

-- Grant warehouse usage
GRANT USAGE ON WAREHOUSE your_warehouse TO ROLE ANOMALYARMOR_ROLE;

-- Create user and assign role
CREATE USER IF NOT EXISTS anomalyarmor_user
  PASSWORD = 'your_secure_password'
  DEFAULT_ROLE = ANOMALYARMOR_ROLE
  DEFAULT_WAREHOUSE = your_warehouse;

GRANT ROLE ANOMALYARMOR_ROLE TO USER anomalyarmor_user;
```

### Per-Schema Permissions

For more granular control:

```sql
-- Grant access to specific schemas only
GRANT USAGE ON SCHEMA your_database.raw TO ROLE ANOMALYARMOR_ROLE;
GRANT USAGE ON SCHEMA your_database.staging TO ROLE ANOMALYARMOR_ROLE;
GRANT USAGE ON SCHEMA your_database.marts TO ROLE ANOMALYARMOR_ROLE;

-- Grant SELECT per schema
GRANT SELECT ON ALL TABLES IN SCHEMA your_database.raw TO ROLE ANOMALYARMOR_ROLE;
GRANT SELECT ON ALL TABLES IN SCHEMA your_database.staging TO ROLE ANOMALYARMOR_ROLE;
GRANT SELECT ON ALL TABLES IN SCHEMA your_database.marts TO ROLE ANOMALYARMOR_ROLE;
```

## Authentication Methods

### Password Authentication
Standard username/password authentication. Simplest setup for getting started.

**In AnomalyArmor**:
- Enter your username and password
- No additional configuration required

### Key-Pair Authentication (Recommended)

More secure authentication using RSA key pairs. Recommended for production.

**Step 1: Generate Key Pair**

```bash
# Generate private key (unencrypted for automation)
openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out snowflake_key.p8 -nocrypt

# Generate public key
openssl rsa -in snowflake_key.p8 -pubout -out snowflake_key.pub

# View public key (for Snowflake)
cat snowflake_key.pub
```

**Step 2: Assign Public Key to User**

```sql
-- Remove the header/footer lines from the public key
ALTER USER anomalyarmor_user SET RSA_PUBLIC_KEY='MIIBIjANBgkqhki...';
```

**Step 3: Configure in AnomalyArmor**

- Set **Authenticator** to `snowflake_jwt`
- Provide the contents of `snowflake_key.p8` as the **Private Key**
- If your key is encrypted, provide the **Private Key Passphrase**
- Leave password field empty

> 💡 For encrypted keys, generate with: `openssl genrsa 2048 | openssl pkcs8 -topk8 -v2 aes256 -inform PEM -out snowflake_key.p8`

> â„šī¸ Key-pair authentication is more secure because:
> - No password to rotate
> - Keys can't be phished
> - Supports hardware security modules

## Virtual Warehouse Considerations

### Warehouse Sizing

AnomalyArmor runs lightweight metadata queries. Recommended warehouse configuration:

| Environment | Size | Notes |
|-------------|------|-------|
| **Development** | X-Small | Sufficient for testing |
| **Production** | Small | Faster query execution |

### Auto-Suspend Configuration

Enable auto-suspend to minimize costs:

```sql
-- Configure warehouse for auto-suspend (1-5 minutes recommended).
-- INITIALLY_SUSPENDED is a CREATE WAREHOUSE property, so it is not set here.
ALTER WAREHOUSE your_warehouse SET
  AUTO_SUSPEND = 60    -- Suspend after 60 seconds idle
  AUTO_RESUME = TRUE;  -- Resume automatically on query
```

> 💡 AnomalyArmor queries typically complete in under a second.
With 1-minute auto-suspend, you'll pay for roughly a minute of warehouse time per run rather than a full hour (Snowflake bills per second, with a 60-second minimum each time the warehouse resumes).

### Cost Estimation

| Metric | Value |
|--------|-------|
| **Query duration** | < 1 second per discovery |
| **With 1-min auto-suspend** | ~$0.01-0.05 per discovery |
| **Hourly monitoring** | ~$15-30/month |

## Network Policies

If your Snowflake account uses network policies, add AnomalyArmor's IP addresses:

```sql
-- View existing network policies
SHOW NETWORK POLICIES;

-- Add AnomalyArmor IPs to your allowlist
ALTER NETWORK POLICY your_policy SET
  ALLOWED_IP_LIST = ('34.xxx.xxx.xxx/32', '35.xxx.xxx.xxx/32', ...);
```

> â„šī¸ Find AnomalyArmor's current IP addresses in **Settings > Security** in the AnomalyArmor dashboard.

## What We Monitor

AnomalyArmor discovers and monitors these Snowflake objects:

| Object Type | Monitored | Notes |
|-------------|-----------|-------|
| **Tables** | Yes | Including managed and external |
| **Views** | Yes | Standard and materialized |
| **Schemas** | Yes | Schema-level metadata |
| **Stages** | No | External/internal stages not monitored |
| **Streams** | No | Change data capture not monitored |
| **Tasks** | No | Scheduled tasks not monitored |

### Metadata Captured

For each table and view:

- Table name and schema
- Column names and data types
- Nullability and default values
- Last modified timestamp (for freshness)
- Partition information (where applicable)

## Multiple Databases

To monitor multiple databases, create separate data sources for each:

| Data Source | Database |
|------------|----------|
| Snowflake Production | `PRODUCTION_DB` |
| Snowflake Staging | `STAGING_DB` |
| Snowflake Analytics | `ANALYTICS_DB` |

> â„šī¸ Each data source needs access to its respective database. Use the same credentials if they have permissions across databases.
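One way to give a single role the same read-only access in every database is to generate the grants from "Creating a Read-Only User" once per database. The sketch below only prints SQL for review; the function name `snowflake_grants` is hypothetical, and the database names are the examples from the table above.

```bash
# Print the database-level read-only grants for one database.
# Usage: snowflake_grants DATABASE [ROLE]
snowflake_grants() {
  db="$1"
  role="${2:-ANOMALYARMOR_ROLE}"
  echo "GRANT USAGE ON DATABASE $db TO ROLE $role;"
  # ALL covers existing objects, FUTURE covers objects created later.
  for scope in ALL FUTURE; do
    echo "GRANT USAGE ON $scope SCHEMAS IN DATABASE $db TO ROLE $role;"
    echo "GRANT SELECT ON $scope TABLES IN DATABASE $db TO ROLE $role;"
    echo "GRANT SELECT ON $scope VIEWS IN DATABASE $db TO ROLE $role;"
  done
}

# Example database names; replace with your own.
for db in PRODUCTION_DB STAGING_DB ANALYTICS_DB; do
  snowflake_grants "$db"
done
```

Review the printed statements before running them in a Snowflake worksheet with a role that is allowed to grant these privileges. The warehouse grant and user creation are one-time steps and are left out on purpose.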
## What We Query

AnomalyArmor runs these types of queries:

```sql
-- List tables and views
SELECT table_catalog, table_schema, table_name, table_type
FROM YOUR_DATABASE.INFORMATION_SCHEMA.TABLES
WHERE table_schema NOT IN ('INFORMATION_SCHEMA');

-- Get column details
SELECT column_name, data_type, is_nullable, column_default
FROM YOUR_DATABASE.INFORMATION_SCHEMA.COLUMNS
WHERE table_schema = 'YOUR_SCHEMA' AND table_name = 'YOUR_TABLE';

-- Check freshness (for tables with timestamp columns)
SELECT MAX(your_timestamp_column) FROM your_table;
```

**Impact**: Minimal. These are metadata queries that don't scan table data.

## Troubleshooting

### Connection test fails

**Common causes**:
1. Invalid account identifier
2. Wrong username or password
3. Warehouse doesn't exist or is suspended

**Solutions**:
1. Verify account identifier includes region and cloud (e.g., `xy12345.us-east-1.aws`)
2. Test credentials in Snowflake web interface first
3. Ensure warehouse exists: `SHOW WAREHOUSES;`
4. Resume warehouse if suspended: `ALTER WAREHOUSE your_wh RESUME;`

### Incorrect username or password

**Causes**:
- Typo in credentials
- User doesn't exist
- Password expired

**Solutions**:
1. Verify user exists: `SHOW USERS LIKE 'anomalyarmor%';`
2. Reset password if needed
3. For key-pair auth, verify public key is assigned to user

### Account not found

**Causes**:
- Missing region or cloud in account identifier
- Account locator typo

**Solutions**:
1. Get full account identifier from Snowflake URL
2. Include region and cloud: `xy12345.us-east-1.aws`
3. Try alternative formats if needed

### Warehouse does not exist

**Causes**:
- Typo in warehouse name
- User lacks USAGE on warehouse
- Warehouse was deleted

**Solutions**:
1. List warehouses: `SHOW WAREHOUSES;`
2. Grant usage: `GRANT USAGE ON WAREHOUSE wh TO ROLE role;`
3.
Check warehouse name is exact match (case-sensitive)

### IP address blocked

**Causes**:
- Network policy restricting access
- AnomalyArmor IPs not allowlisted

**Solutions**:
1. Check network policies: `SHOW NETWORK POLICIES;`
2. Add AnomalyArmor IPs to allowlist
3. Contact your Snowflake admin for policy changes

### Role does not exist

**Causes**:
- Role typo
- Role was deleted
- User not granted the role

**Solutions**:
1. List roles: `SHOW ROLES;`
2. Check grants: `SHOW GRANTS TO USER your_user;`
3. Grant role: `GRANT ROLE role TO USER user;`

### No tables found in discovery

**Causes**:
- User lacks SELECT permissions
- Schema filter excluding all schemas
- Empty database

**Solutions**:
1. Test query as user: `SELECT * FROM your_db.INFORMATION_SCHEMA.TABLES LIMIT 5;`
2. Check grants: `SHOW GRANTS TO ROLE your_role;`
3. Grant SELECT on schemas: `GRANT SELECT ON ALL TABLES IN SCHEMA schema TO ROLE role;`

## Best Practices

### Use Dedicated Service Account

Create a dedicated user for AnomalyArmor rather than using personal accounts:

- Dedicated users persist regardless of employee changes
- Easier to audit and manage permissions
- Can be easily rotated or disabled

### Use Key-Pair Authentication for Production

Password authentication works but key-pair is more secure:

- No password to rotate
- Keys can't be phished
- Better audit trail
- Supports hardware security modules

### Monitor Your Production Database

Start with your production database where schema changes have the most impact:

| Priority | Database | Importance |
|----------|----------|------------|
| 1 | Production database | Critical |
| 2 | Staging database | Important |
| 3 | Development databases | Optional |

### Choose the Right Warehouse Size

Metadata queries are lightweight.
X-Small is sufficient but Small provides faster startup:

| Warehouse Size | Credits/Hour | Recommendation |
|----------------|--------------|----------------|
| X-Small | 1 | Development |
| Small | 2 | Production |

## Common Questions

### How much Snowflake credit does AnomalyArmor consume?

Minimal. Discovery runs short metadata queries against `information_schema.tables/columns`, so the warehouse can auto-suspend promptly afterward. Typical daily consumption is under 1 credit on an X-Small warehouse. Use a dedicated X-Small warehouse for AnomalyArmor to keep cost visibility clean.

### What's the correct format for the Snowflake account identifier?

Include the full `account.region.cloud` format (e.g., `xy12345.us-east-1.aws`), not just the account locator. The URL in your Snowflake web interface shows it: `https://app.snowflake.com///...`. Getting this wrong is the most common first-connect error.

### Can AnomalyArmor monitor multiple Snowflake databases (catalogs) with one connection?

Not today. Connect each Snowflake database as a separate data source. A shared `ANOMALYARMOR_ROLE` with `USAGE` on multiple databases simplifies permission setup across them.

### How do I give AnomalyArmor access to future tables in Snowflake?

Use `FUTURE` grants: `GRANT SELECT ON FUTURE TABLES IN DATABASE TO ROLE ANOMALYARMOR_ROLE` (and likewise for `FUTURE SCHEMAS` and `FUTURE VIEWS`). New objects inherit `SELECT` automatically - no re-grant needed after each deploy.

### Do I need a separate warehouse for AnomalyArmor, or can it share my existing one?

Sharing works but obscures cost attribution. A dedicated X-Small warehouse with `AUTO_SUSPEND = 60` isolates AnomalyArmor's credit usage and prevents unrelated queries from keeping the warehouse warm longer than needed.
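The "under 1 credit per day" figure can be sanity-checked with a little arithmetic. This is a rough sketch under stated assumptions: every run finds the warehouse suspended, the warehouse stays up for the query time plus the `AUTO_SUSPEND` idle period, and Snowflake bills per second with a 60-second minimum per resume. The function name `daily_credits` is hypothetical, and the 1 credit/hour rate for X-Small comes from the table above.

```bash
# Estimate daily credit usage for a dedicated warehouse.
# Usage: daily_credits RUNS_PER_DAY QUERY_SECONDS AUTO_SUSPEND_SECONDS CREDITS_PER_HOUR
daily_credits() {
  awk -v runs="$1" -v q="$2" -v idle="$3" -v rate="$4" 'BEGIN {
    billed = q + idle              # seconds the warehouse stays up per run
    if (billed < 60) billed = 60   # 60-second minimum per resume
    printf "%.2f\n", runs * billed / 3600 * rate
  }'
}

# Hourly discovery, ~1 s of queries, AUTO_SUSPEND = 60, X-Small (1 credit/hour).
daily_credits 24 1 60 1   # prints 0.41
```

Multiplying the result by your contracted price per credit gives a daily cost. Actual billing depends on your Snowflake edition and on any other queries that keep the warehouse running.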
## Next Steps

- **[Run Discovery](/quickstart/run-first-discovery)**: Scan your Snowflake database
- **[Set Up Alerts](/quickstart/set-up-first-alert)**: Get notified of schema changes

---

# BigQuery

> Connect AnomalyArmor to Google BigQuery for schema monitoring and change detection

Connect AnomalyArmor to Google BigQuery to monitor schemas, track freshness, and detect schema drift across your datasets and tables.

## Requirements

Before connecting, ensure you have:

- **Google Cloud Platform (GCP) account** with an active project
- **BigQuery API enabled** in your project
- **Service account** with `bigquery.datasets.get` and `bigquery.tables.list` permissions
- **Service account JSON key** for authentication

## Connection Settings

| Field | Description | Example |
|-------|-------------|---------|
| **Asset Name** | Friendly identifier for this connection | `BigQuery Production` |
| **Project ID** | GCP project ID | `my-analytics-project` |
| **Service Account JSON** | Full JSON key contents | `{"type": "service_account", ...}` |
| **Default Dataset** (Optional) | Scope introspection to a single dataset | `analytics` |
| **Location** (Optional) | BigQuery data location/region | `US`, `EU`, `us-east1` |

### Service Account JSON Input

You can provide the service account JSON in two ways:

- **File Upload**: Click **Upload JSON File** to select your downloaded key file
- **Paste**: Copy and paste the entire JSON contents into the text area

> 💡 File upload is recommended to avoid copy-paste errors with the ~2KB JSON key.

### Default Dataset

If your project has many datasets, use the **Default Dataset** field to scope introspection to a single dataset. This is useful for:

- Large projects with 100+ datasets
- Focusing on production data only
- Reducing discovery time

Leave empty to discover all datasets in the project.

### Location

The **Location** field specifies where your BigQuery data resides.
Common options:

| Location | Description |
|----------|-------------|
| `US` | United States (multi-region) |
| `EU` | European Union (multi-region) |
| `us-east1` | South Carolina |
| `us-west1` | Oregon |
| `europe-west1` | Belgium |
| `europe-west2` | London |
| `asia-northeast1` | Tokyo |

Leave as "Auto" to let BigQuery determine the location automatically.

> â„šī¸ BigQuery uses project-level connections. Without a default dataset, all datasets within the project will be discovered automatically.

## Creating a Service Account

### Step 1: Create Service Account

1. Go to **GCP Console > IAM & Admin > Service Accounts**
2. Click **Create Service Account**
3. Enter details:
   - Name: `anomalyarmor`
   - Description: `AnomalyArmor schema monitoring`
4. Click **Create and Continue**

### Step 2: Assign Roles

Grant the minimum required roles:

| Role | ID | Purpose |
|------|----|---------|
| **BigQuery Data Viewer** | `roles/bigquery.dataViewer` | Read metadata and table contents |
| **BigQuery Job User** | `roles/bigquery.jobUser` | Run queries in the project |

> 💡 For production, consider creating a custom role with only the specific permissions needed (see Minimal Permissions section below).

### Step 3: Create JSON Key

1. Click on the created service account
2. Go to **Keys** tab
3. Click **Add Key > Create new key**
4. Select **JSON** format
5. Download and securely store the key file

> âš ī¸ Keep your service account key secure. Never commit it to version control or share it publicly. You can rotate keys in GCP Console if compromised.
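After downloading the key, a quick local check can catch a truncated or wrong file before you upload it. The sketch below is a crude text check, not a JSON parser: it looks for the `"type": "service_account"` marker and a few field names that Google service account keys contain, then restricts the file's permissions. The function name `check_sa_key` and the file path in the usage comment are hypothetical.

```bash
# Crude sanity check of a downloaded service account key file.
# Usage: check_sa_key PATH
check_sa_key() {
  key_file="$1"
  [ -r "$key_file" ] || { echo "cannot read $key_file" >&2; return 1; }
  grep -Eq '"type"[[:space:]]*:[[:space:]]*"service_account"' "$key_file" || {
    echo "type is not service_account" >&2; return 1; }
  for field in project_id private_key client_email; do
    grep -q "\"$field\"" "$key_file" || { echo "missing field: $field" >&2; return 1; }
  done
  chmod 600 "$key_file"   # restrict the key to the owning user
  echo "ok: $key_file"
}

# Example (hypothetical path):
#   check_sa_key ~/Downloads/anomalyarmor-key.json
```

A passing check does not prove the key is valid or has the right roles; it only confirms the file looks like a complete service account key.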
## Minimal Permissions

For maximum security, create a custom role with only these permissions:

```yaml
# Custom role for AnomalyArmor
title: "AnomalyArmor Schema Monitor"
description: "Minimal permissions for schema monitoring"
includedPermissions:
  - bigquery.datasets.get
  - bigquery.tables.get
  - bigquery.tables.list
  - bigquery.jobs.create
  - bigquery.jobs.get
```

To create via gcloud CLI:

```bash
# Create custom role
gcloud iam roles create anomalyarmor_monitor \
  --project=your-project-id \
  --title="AnomalyArmor Schema Monitor" \
  --description="Minimal permissions for schema monitoring" \
  --permissions=bigquery.datasets.get,bigquery.tables.get,bigquery.tables.list,bigquery.jobs.create,bigquery.jobs.get

# Assign to service account
gcloud projects add-iam-policy-binding your-project-id \
  --member="serviceAccount:anomalyarmor@your-project-id.iam.gserviceaccount.com" \
  --role="projects/your-project-id/roles/anomalyarmor_monitor"
```

## Finding Your Project ID

Your project ID is shown in multiple places in GCP Console:

- **Dashboard**: Top-left dropdown shows project name and ID
- **URL**: `console.cloud.google.com/bigquery?project=YOUR-PROJECT-ID`
- **Project Settings**: IAM & Admin > Settings

> âš ī¸ Use the **Project ID** (e.g., `my-analytics-123`), not the Project Name (e.g., "My Analytics"). Project IDs are lowercase with hyphens.

## Enabling BigQuery API

If the BigQuery API is not enabled:

1. Go to **GCP Console > APIs & Services > Library**
2. Search for "BigQuery API"
3.
   Click **Enable**

Or via gcloud CLI:

```bash
gcloud services enable bigquery.googleapis.com --project=your-project-id
```

## What We Monitor

AnomalyArmor discovers and monitors these BigQuery objects:

| Object Type | Monitored | Notes |
|-------------|-----------|-------|
| **Tables** | Yes | Standard and partitioned |
| **Views** | Yes | Standard views |
| **Materialized Views** | Yes | Treated as views |
| **External Tables** | Yes | Schema only, not source |
| **Datasets** | Yes | As schemas/namespaces |
| **Routines** | No | Functions/procedures not monitored |
| **Models** | No | ML models not monitored |

### Metadata Captured

For each table and view:

- Table name and dataset
- Column names and data types
- Nullability
- Last modified timestamp (for freshness)
- Partitioning information

## What We Query

AnomalyArmor runs these types of queries:

```sql
-- List tables in each dataset
SELECT table_catalog, table_schema, table_name, table_type
FROM `project_id.dataset_id.INFORMATION_SCHEMA.TABLES`;

-- Get column details
SELECT column_name, data_type, is_nullable, column_default
FROM `project_id.dataset_id.INFORMATION_SCHEMA.COLUMNS`
WHERE table_name = 'your_table';

-- Check freshness (for tables with timestamp columns)
SELECT MAX(your_timestamp_column)
FROM `project_id.dataset_id.your_table`;
```

**Impact**: Minimal. The `INFORMATION_SCHEMA` queries read metadata only; the freshness check reads a single timestamp column rather than full rows.

### Cost Estimation

| Metric | Value |
|--------|-------|
| **Query type** | Metadata queries (minimal cost) |
| **Data scanned** | ~1 KB per INFORMATION_SCHEMA query |
| **Estimated cost** | < $0.01 per discovery |

> ℹ️ `INFORMATION_SCHEMA` queries process very little data; under on-demand pricing BigQuery bills each one at a 10 MB minimum, so their cost is negligible. Freshness checks query actual tables but scan minimal data.
## Multiple Projects

To monitor multiple GCP projects, create separate data sources for each:

| Data Source | Project |
|------------|---------|
| BigQuery Production | `prod-analytics-123` |
| BigQuery Staging | `staging-analytics-456` |
| BigQuery Development | `dev-analytics-789` |

Each project needs its own service account or a service account with cross-project permissions.

### Cross-Project Access

To use one service account for multiple projects:

```bash
# Grant access in each project
gcloud projects add-iam-policy-binding other-project-id \
  --member="serviceAccount:anomalyarmor@main-project.iam.gserviceaccount.com" \
  --role="roles/bigquery.dataViewer"
```

## VPC Service Controls

If your BigQuery is protected by VPC Service Controls:

1. Add AnomalyArmor's IP addresses to the access level
2. Or configure an access policy that allows the service account

> ℹ️ Contact AnomalyArmor support for Enterprise VPC Service Controls integration options.

## Troubleshooting

### Permission denied / 403 error

**Causes**:

- Service account lacks required roles
- API not enabled
- VPC Service Controls blocking access

**Solutions**:

1. Verify roles: **BigQuery Data Viewer** and **BigQuery Job User**
2. Enable BigQuery API in GCP Console
3. Check VPC Service Controls perimeter settings
4. Test service account access via gcloud CLI

### Project not found

**Causes**:

- Wrong project ID
- Project ID vs Project Name confusion
- Project deleted or suspended

**Solutions**:

1. Use Project ID (lowercase with hyphens), not Project Name
2. Verify project exists in GCP Console
3. Check project hasn't been deleted or suspended

### Invalid service account JSON

**Causes**:

- Malformed JSON
- Missing required fields
- Copied text instead of file contents

**Solutions**:

1. Download fresh key from GCP Console
2. Paste entire file contents (including braces)
3. 
Verify JSON is valid (no trailing commas, proper quotes) ### BigQuery API not enabled **Causes**: - API disabled in project - New project without API enabled **Solutions**: 1. Go to GCP Console > APIs & Services > Library 2. Search for "BigQuery API" 3. Click Enable 4. Wait a few minutes for propagation ### Dataset not found **Causes**: - Dataset was deleted - Dataset in different project - Permission to list datasets missing **Solutions**: 1. Verify dataset exists in BigQuery Console 2. Check service account can list datasets 3. Grant `bigquery.datasets.get` permission ### No tables found in discovery **Causes**: - Empty datasets - Service account can't read INFORMATION_SCHEMA - Dataset-level permissions missing **Solutions**: 1. Verify tables exist in BigQuery Console 2. Test query as service account 3. Grant dataset-level access if using fine-grained permissions ### Quota exceeded / Rate limit **Causes**: - Too many concurrent queries - Project quota limits reached **Solutions**: 1. Wait and retry 2. Check quota usage in GCP Console 3. Request quota increase if needed 4. Schedule discovery during off-peak hours ## Best Practices ### Use Dedicated Service Account Create a dedicated service account for AnomalyArmor: - Easier to audit and manage permissions - Can be rotated independently - Clear separation of concerns ### Start with Production Data Monitor your production project first where schema changes have the most impact: ``` Priority: 1. Production project (critical) 2. Staging project (important) 3. Development projects (optional) ``` ### Rotate Keys Periodically Rotate service account keys on a regular schedule: 1. Create new key in GCP Console 2. Update credentials in AnomalyArmor 3. 
   Delete old key after verification

### Monitor Specific Datasets

If you have many datasets, consider granting dataset-level permissions so that only the critical ones are monitored:

```sql
-- Grant access to specific datasets only
GRANT `roles/bigquery.dataViewer`
ON SCHEMA `project.critical_dataset`
TO "serviceAccount:anomalyarmor@project.iam.gserviceaccount.com";
```

## Common Questions

### How does AnomalyArmor authenticate to BigQuery?

Via a Google Cloud service account JSON key. Create a service account with `BigQuery Data Viewer` and `BigQuery Job User` (the roles listed under Creating a Service Account) on the project or specific datasets, generate a key, and paste the JSON into the connection form. Workload Identity federation is available on Enterprise plans for customers who want to avoid long-lived keys.

### Does AnomalyArmor monitor across multiple BigQuery projects?

One connection monitors one project. Add a separate data source per project. Assets from all connected projects appear unified in the asset catalog; plan limits count monitored tables, not projects.

### What BigQuery costs does AnomalyArmor incur?

Minimal. Metadata reads against `INFORMATION_SCHEMA` process very little data (under on-demand pricing BigQuery bills each at a 10 MB minimum). Bounded aggregate queries (`COUNT(*)`, `MIN`, `MAX`) are billed at your project's usual on-demand or capacity rate, but scan very little data per table because BigQuery's columnar storage reads only the referenced columns. Use a dedicated reservation or project if you want clean cost attribution.

### Can AnomalyArmor monitor partitioned tables and clustered tables in BigQuery?

Yes. Schema drift, freshness, and metrics all work on partitioned and clustered tables. For very large partitioned tables, configure freshness to query a specific partition (e.g., `WHERE _PARTITIONTIME >= CURRENT_TIMESTAMP() - INTERVAL 1 DAY` on ingestion-time partitioned tables) to minimize bytes scanned.
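
The partition-bounded freshness check described above can be sketched as a small query builder. This is only an illustration: `bounded_freshness_sql` is a hypothetical helper, not an AnomalyArmor API, and how AnomalyArmor accepts a partition filter is not specified here. It assumes an ingestion-time partitioned table, where the `_PARTITIONTIME` pseudo-column is available.

```python
import re

# Assumption: fully qualified table names look like project.dataset.table.
_TABLE_RE = re.compile(r"^[A-Za-z0-9_\-]+\.[A-Za-z0-9_]+\.[A-Za-z0-9_]+$")
_COLUMN_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def bounded_freshness_sql(table: str, ts_column: str, lookback_days: int = 1) -> str:
    """Build a MAX(timestamp) query that only reads recent partitions."""
    # Reject anything that is not a plain identifier, since the values are
    # interpolated into SQL text.
    if not _TABLE_RE.match(table):
        raise ValueError(f"unexpected table name: {table!r}")
    if not _COLUMN_RE.match(ts_column):
        raise ValueError(f"unexpected column name: {ts_column!r}")
    if not isinstance(lookback_days, int) or lookback_days < 1:
        raise ValueError("lookback_days must be a positive integer")
    return (
        f"SELECT MAX({ts_column}) AS last_seen "
        f"FROM `{table}` "
        f"WHERE _PARTITIONTIME >= "
        f"TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {lookback_days} DAY)"
    )
```

Restricting the scan to recent partitions keeps bytes processed roughly constant as the table grows. The trade-off is that a table with no rows inside the lookback window returns `NULL` rather than its true last timestamp.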
## Next Steps - **[Run Discovery](/quickstart/run-first-discovery)**: Scan your BigQuery project - **[Set Up Alerts](/quickstart/set-up-first-alert)**: Get notified of schema changes --- # Amazon Redshift > Connect AnomalyArmor to Amazon Redshift data warehouses Connect AnomalyArmor to your Amazon Redshift cluster to monitor schema changes and data freshness. This guide covers both provisioned clusters and Redshift Serverless. ## Supported Platforms | Platform | Notes | |----------|-------| | **Redshift Provisioned** | All node types (RA3, DC2, DS2) | | **Redshift Serverless** | Workgroups with any RPU configuration | ## Connection Settings | Field | Description | Example | |-------|-------------|---------| | **Connection Name** | Friendly identifier | `Production Redshift` | | **Host** | Cluster or workgroup endpoint | `my-cluster.abc123.us-east-1.redshift.amazonaws.com` | | **Port** | Database port | `5439` | | **Database** | Database name | `dev` | | **Username** | Database user | `anomalyarmor` | | **Password** | User password | See credentials section | ## Finding Your Connection Details ### Provisioned Cluster 1. Go to **AWS Console > Amazon Redshift > Clusters** 2. Click on your cluster name 3. Find the **Endpoint** in the cluster details 4. Copy the hostname portion (everything before the port) ### Redshift Serverless 1. Go to **AWS Console > Amazon Redshift > Serverless dashboard** 2. Click on your workgroup 3. Find the **Endpoint** in the workgroup details. The default database for Serverless is typically `dev`. 
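
The endpoint copied from the console usually carries more than the hostname. As a minimal sketch, assuming the endpoint is shown in the form `host:port/database`, the hypothetical helper `split_redshift_endpoint` below separates it into the Host, Port, and Database fields used in Connection Settings.

```python
from typing import Optional, Tuple

DEFAULT_REDSHIFT_PORT = 5439


def split_redshift_endpoint(endpoint: str) -> Tuple[str, int, Optional[str]]:
    """Split 'host:port/database' into (host, port, database).

    The port falls back to 5439 and the database to None when absent.
    """
    host_port, _, database = endpoint.strip().partition("/")
    host, _, port_text = host_port.partition(":")
    if not host:
        raise ValueError("endpoint has no hostname")
    port = int(port_text) if port_text else DEFAULT_REDSHIFT_PORT
    return host, port, (database or None)
```

For example, `split_redshift_endpoint("my-cluster.abc123.us-east-1.redshift.amazonaws.com:5439/dev")` yields the hostname for the **Host** field, `5439` for **Port**, and `dev` for **Database**.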
## Creating a Read-Only User

Create a dedicated user with minimal permissions:

```sql
-- Create the user
CREATE USER anomalyarmor WITH PASSWORD 'your-secure-password';

-- Grant USAGE on schemas you want to monitor
GRANT USAGE ON SCHEMA public TO anomalyarmor;
GRANT USAGE ON SCHEMA analytics TO anomalyarmor;

-- Grant SELECT on all current tables
GRANT SELECT ON ALL TABLES IN SCHEMA public TO anomalyarmor;
GRANT SELECT ON ALL TABLES IN SCHEMA analytics TO anomalyarmor;

-- Grant access to future tables (recommended)
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO anomalyarmor;
ALTER DEFAULT PRIVILEGES IN SCHEMA analytics GRANT SELECT ON TABLES TO anomalyarmor;
```

### Verifying Permissions

Test that the user can access metadata:

```sql
-- Should return tables
SELECT table_schema, table_name
FROM svv_tables
WHERE table_schema NOT IN ('pg_catalog', 'information_schema')
LIMIT 5;

-- Should return columns
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_schema = 'public'
LIMIT 5;
```

## Network Configuration

### Security Groups

Add an inbound rule to your Redshift cluster's security group:

1. Go to **AWS Console > VPC > Security Groups**
2. Find the security group attached to your Redshift cluster
3. Edit inbound rules
4. Add rule:
   - **Type**: Custom TCP
   - **Port**: `5439`
   - **Source**: AnomalyArmor IP addresses (see Settings > Security)

### Public Accessibility

For AnomalyArmor to connect:

**Option 1: Publicly Accessible Cluster**

1. Enable "Publicly accessible" in cluster settings
2. Configure security group as shown above

**Option 2: VPC Peering (Enterprise)**

For private clusters, contact us for VPC peering or PrivateLink options.

> ℹ️ Redshift clusters in private subnets with no public access require NAT Gateway or VPC peering. Contact us for Enterprise networking options.

### SSL/TLS

Redshift requires SSL by default. AnomalyArmor automatically uses SSL for all Redshift connections.
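
After changing security groups or public accessibility, a quick TCP check shows whether the endpoint accepts connections at all. The sketch below uses only the Python standard library; `port_is_reachable` is a hypothetical helper name. It runs from your own machine, so it confirms that the port is open to that machine and says nothing about whether AnomalyArmor's IP addresses are allowlisted.

```python
import socket


def port_is_reachable(host: str, port: int = 5439, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        # create_connection resolves the hostname and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

A `False` result for a cluster that should be public usually points at the security group rule or the "Publicly accessible" setting described above.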
## Redshift Serverless Considerations ### Automatic Scaling Redshift Serverless scales compute automatically. Our discovery queries are lightweight and consume minimal RPUs. ### Idle Workgroups If your workgroup has been idle, the first query may take a few seconds while compute scales up. This is normal behavior. ### Database Name Redshift Serverless typically uses `dev` as the default database. Check your workgroup settings if unsure. ## What We Monitor AnomalyArmor discovers and monitors: | Object Type | Monitored | Notes | |-------------|-----------|-------| | **Tables** | Yes | All user-created tables | | **Views** | Yes | Regular views | | **Materialized Views** | Yes | Treated as tables | | **External Tables** | Yes | Redshift Spectrum tables (if permissions granted) | | **System Tables** | No | Excluded automatically | ### Metadata Captured For each table and view: - Table name and schema - Column names and data types - Nullable constraints - Default values - Last modified timestamp (for freshness) ## What We Query AnomalyArmor runs these types of queries: ```sql -- Tables and views (using Redshift system view) SELECT table_schema, table_name, table_type FROM svv_tables WHERE table_schema NOT IN ('pg_catalog', 'information_schema'); -- Columns SELECT column_name, data_type, is_nullable FROM information_schema.columns WHERE table_schema = 'your_schema'; -- Freshness (for timestamp columns) SELECT MAX(your_timestamp_column) FROM your_table; ``` **Impact**: Minimal. These are lightweight metadata queries that don't scan table data. 
## Multiple Schemas To monitor multiple schemas, grant permissions on each: ```sql -- For each schema you want to monitor GRANT USAGE ON SCHEMA raw TO anomalyarmor; GRANT USAGE ON SCHEMA staging TO anomalyarmor; GRANT USAGE ON SCHEMA production TO anomalyarmor; GRANT SELECT ON ALL TABLES IN SCHEMA raw TO anomalyarmor; GRANT SELECT ON ALL TABLES IN SCHEMA staging TO anomalyarmor; GRANT SELECT ON ALL TABLES IN SCHEMA production TO anomalyarmor; ``` > 💡 AnomalyArmor automatically excludes system schemas (pg_catalog, information_schema, pg_internal). ## Redshift Spectrum (External Tables) If you use Redshift Spectrum for querying S3: ```sql -- Grant access to external schemas GRANT USAGE ON SCHEMA spectrum_schema TO anomalyarmor; GRANT SELECT ON ALL TABLES IN SCHEMA spectrum_schema TO anomalyarmor; ``` External tables appear alongside regular tables in discovery. ## Troubleshooting ### Connection refused **Causes**: - Security group not configured - Cluster not publicly accessible - Wrong endpoint **Solutions**: 1. Verify AnomalyArmor IPs are in security group inbound rules 2. Check "Publicly accessible" setting in cluster configuration 3. Verify endpoint is correct (check AWS Console) 4. Test connectivity: `nc -zv your-endpoint 5439` ### Connection timeout **Causes**: - Firewall blocking connection - Cluster in private subnet without proper routing **Solutions**: 1. Verify security group allows port 5439 from AnomalyArmor IPs 2. If private subnet, ensure NAT Gateway is configured 3. Consider enabling "Publicly accessible" for the cluster 4. Contact us for VPC peering options (Enterprise) ### Authentication failed **Causes**: - Wrong password - User doesn't exist - Case sensitivity issues **Solutions**: 1. Verify password (copy-paste to avoid typos) 2. Confirm user exists: `SELECT usename FROM pg_user;` 3. Remember: Redshift usernames are case-sensitive when quoted 4. 
Try resetting the password: ```sql ALTER USER anomalyarmor PASSWORD 'new-password'; ``` ### Database does not exist **Causes**: - Wrong database name - Case sensitivity **Solutions**: 1. List databases: `SELECT datname FROM pg_database;` 2. Redshift Serverless default is usually `dev` 3. Database names are case-sensitive ### No tables found in discovery **Causes**: - User lacks SELECT permission - Schema permissions missing - No tables in monitored schemas **Solutions**: 1. Test as the user: ```sql SELECT * FROM svv_tables WHERE table_schema = 'public' LIMIT 5; ``` 2. Grant schema access: ```sql GRANT USAGE ON SCHEMA public TO anomalyarmor; GRANT SELECT ON ALL TABLES IN SCHEMA public TO anomalyarmor; ``` 3. Verify tables exist in expected schemas ### SSL connection error **Causes**: - SSL configuration mismatch **Solutions**: 1. AnomalyArmor uses SSL by default (required for Redshift) 2. Ensure your cluster has SSL enabled (default setting) 3. Contact support if you have custom SSL requirements ## Best Practices ### Use a Dedicated User Create a read-only user specifically for AnomalyArmor: - Don't use admin accounts - Don't share credentials with other tools - Use a strong, unique password ### Use Read Replicas (if available) If you have Redshift RA3 with data sharing or multiple clusters, consider pointing AnomalyArmor at a read replica to isolate monitoring queries from production workloads. ### Grant Minimal Permissions Only grant access to schemas you want to monitor. 
Avoid: ```sql -- Don't do this GRANT ALL ON DATABASE dev TO anomalyarmor; ``` ### Schedule Discovery After ETL If you have predictable ETL schedules, run discovery after ETL completes: | Schedule | Time | |----------|------| | **ETL** | 2:00 AM daily | | **Discovery** | 3:00 AM daily | ## Limitations - **IAM authentication**: Not yet supported (username/password only) - **Cross-database queries**: One connection per database - **Concurrency scaling**: Queries run on main cluster, not concurrency scaling - **Stored procedures**: Not monitored (tables and views only) ## Coming Soon - IAM role-based authentication - Redshift-specific metrics (WLM queue status, disk usage) - Cross-cluster data sharing support ## Common Questions ### Does AnomalyArmor support both provisioned Redshift clusters and Redshift Serverless? Yes. Both use the same PostgreSQL-wire-compatible connector. For Serverless, set the endpoint to the workgroup's connection URL. For provisioned, use the cluster endpoint. SSL is required in both cases. ### Can AnomalyArmor monitor Redshift Spectrum external tables? Yes. External tables (backed by S3) are discovered and cataloged alongside local Redshift tables. Freshness works if the external table has a timestamp column queryable via a standard `MAX(ts)` - Spectrum resolves this against S3 metadata when partitioned appropriately. ### Do I need to VACUUM or ANALYZE for AnomalyArmor to work correctly? No for discovery or schema drift - those read the catalog directly. For row-count and statistical metrics, Redshift's `SVV_TABLE_INFO` and `SVV_ALL_COLUMNS` views are used; these are maintained by Redshift automatically. You don't need to add any VACUUM / ANALYZE to your maintenance. ### Can I connect AnomalyArmor to Redshift in a private VPC without IP allowlisting? Yes, via VPC peering or AWS PrivateLink (Enterprise plans). Both keep traffic off the public internet. 
For non-Enterprise, allowlist AnomalyArmor's static IPs in your Redshift cluster's security group. ## Next Steps - **[Run Discovery](/quickstart/run-first-discovery)**: Scan your Redshift cluster - **[Set Up Alerts](/quickstart/set-up-first-alert)**: Get notified of schema changes --- # Databricks > Connect AnomalyArmor to Databricks Unity Catalog Connect AnomalyArmor to your Databricks workspace to monitor Unity Catalog assets. We support Delta tables, views, and all Unity Catalog-managed objects. ## Requirements Before connecting, ensure you have: - **Databricks workspace** with Unity Catalog enabled - **SQL Warehouse** (serverless or classic) - **Personal Access Token** or Service Principal credentials - **Catalog access** for the catalogs you want to monitor ## Connection Settings | Field | Description | Example | |-------|-------------|---------| | **Connection Name** | Friendly identifier | `Databricks Production` | | **Workspace URL** | Your Databricks workspace | `https://xxx.cloud.databricks.com` | | **HTTP Path** | SQL warehouse path | `/sql/1.0/warehouses/abc123` | | **Catalog** | Unity Catalog to monitor | `main` | | **Access Token** | Authentication token | `dapi...` | ## Finding Your Connection Details ### Workspace URL Your workspace URL is in your browser when logged into Databricks: | Cloud | URL Format | |-------|-----------| | **Azure** | `https://adb-1234567890.12.azuredatabricks.net` | | **AWS** | `https://dbc-abc123.cloud.databricks.com` | | **GCP** | `https://xxx.gcp.databricks.com` | ### SQL Warehouse HTTP Path 1. Go to **SQL Warehouses** in Databricks 2. Click on your warehouse 3. Go to **Connection Details** tab 4. Copy the **HTTP Path** ``` HTTP Path format: /sql/1.0/warehouses/abc123def456 ↑ Your warehouse ID ``` > 💡 Use a **Serverless SQL Warehouse** for best compatibility. Classic warehouses work too but may have startup delays. ### Creating an Access Token ### Personal Access Token Best for quick setup and testing: 1. 
   Click your username in Databricks → **User Settings**
2. Go to **Access Tokens** tab
3. Click **Generate New Token**
4. Set a description: `AnomalyArmor`
5. Set lifetime (or leave blank for no expiry)
6. Click **Generate**
7. **Copy the token immediately** (you won't see it again)

```
Token format: dapi1234567890abcdef1234567890abcdef
```

> ⚠️ Personal access tokens are tied to your user account. If you leave the organization, the token stops working. Consider using a service principal for production.

### Service Principal (Recommended)

Best for production use:

**Step 1: Create Service Principal**

1. Go to **Admin Console → Service Principals**
2. Click **Add Service Principal**
3. Name it: `anomalyarmor-monitoring`
4. Note the **Application ID**

**Step 2: Generate OAuth Token**

1. Select the service principal
2. Go to **Secrets** tab
3. Click **Generate Secret**
4. Copy the **Client ID** and **Client Secret**

**Step 3: Grant Permissions**

The service principal needs:

- `USE CATALOG` on target catalogs
- `USE SCHEMA` on target schemas
- `SELECT` on the tables to monitor (avoid `ALL PRIVILEGES`, which grants more than read access)

```sql
-- Grant catalog access
GRANT USE CATALOG ON CATALOG main TO `anomalyarmor-monitoring`;

-- Grant access to all schemas in the catalog (inherited by child schemas)
GRANT USE SCHEMA ON CATALOG main TO `anomalyarmor-monitoring`;

-- Grant read access to all tables in the catalog
GRANT SELECT ON CATALOG main TO `anomalyarmor-monitoring`;
```

**Step 4: Use in AnomalyArmor**

Enter the OAuth token in the Access Token field.

## Granting Catalog Permissions

The user or service principal needs read access to Unity Catalog.

> 💡 **Quick Setup**: [Download the Databricks permissions script](/downloads/databricks-permissions) for a ready-to-use SQL template with Unity Catalog grants.
```sql -- Minimal permissions for AnomalyArmor -- Access to catalog GRANT USE CATALOG ON CATALOG production TO `anomalyarmor`; -- Access to all schemas in catalog GRANT USE SCHEMA ON CATALOG production TO `anomalyarmor`; -- Read access to tables GRANT SELECT ON CATALOG production TO `anomalyarmor`; ``` ### Per-Schema Permissions For more granular control: ```sql -- Access specific schemas only GRANT USE SCHEMA ON SCHEMA production.raw TO `anomalyarmor`; GRANT USE SCHEMA ON SCHEMA production.staging TO `anomalyarmor`; GRANT USE SCHEMA ON SCHEMA production.marts TO `anomalyarmor`; -- Read access per schema GRANT SELECT ON SCHEMA production.raw TO `anomalyarmor`; GRANT SELECT ON SCHEMA production.staging TO `anomalyarmor`; GRANT SELECT ON SCHEMA production.marts TO `anomalyarmor`; ``` ## What We Monitor AnomalyArmor discovers and monitors these Unity Catalog objects: | Object Type | Monitored | Notes | |-------------|-----------|-------| | **Delta Tables** | Yes | Including managed and external | | **Views** | Yes | Standard and materialized | | **Schemas** | Yes | Schema-level metadata | | **Volumes** | No | Coming soon | | **Functions** | No | Not supported | ### Metadata Captured For each table and view: - Table name and schema - Column names and data types - Table properties - Last modified timestamp (for freshness) - Partitioning information ## Multiple Catalogs ### 3-Level Namespace Support Databricks Unity Catalog uses a 3-level namespace: `catalog.schema.table`. 
AnomalyArmor fully supports this structure, enabling you to:

- **Track tables across catalogs**: Distinguish between `prod.analytics.users` and `dev.analytics.users`
- **Filter by catalog**: View only tables from specific catalogs in the UI
- **Catalog-aware alerting**: Get notified of changes in production catalogs only
- **Lineage across catalogs**: Track data flow between development, staging, and production

### Connecting Multiple Catalogs

To monitor multiple catalogs, create separate data sources for each:

**Data Sources:**

- Databricks Production (catalog: production)
- Databricks Staging (catalog: staging)
- Databricks Development (catalog: development)

> ℹ️ Each data source needs access to its respective catalog. Use the same token if it has permissions across catalogs.

### Catalog-Aware Features

| Feature | Catalog Support |
|---------|-----------------|
| **Asset Discovery** | Tables shown with full `catalog.schema.table` path |
| **Schema Drift Alerts** | Filter alerts by catalog |
| **Tag Inheritance** | Tags propagate within catalog boundaries |
| **Table Filtering** | API supports `catalog_name` filter parameter |
| **Lineage Visualization** | Shows cross-catalog data dependencies |

## SQL Warehouse Considerations

### Warehouse State

AnomalyArmor queries run on your SQL warehouse. Consider:

| Warehouse Type | Behavior |
|----------------|----------|
| **Serverless** | Auto-starts, minimal delay |
| **Classic (Auto-stop)** | May have startup delay (30s-2min) |
| **Classic (Always-on)** | Immediate, but costs more |

### Warehouse Sizing

Discovery queries are lightweight. A **Small** or **X-Small** warehouse is sufficient:

- **Recommended**: Serverless SQL Warehouse
- **Alternative**: X-Small Classic Warehouse with auto-stop

### Scheduling Discovery

If using a classic warehouse with auto-stop:

1. Schedule discovery during business hours
2. Or extend auto-stop timeout to cover discovery windows
3. 
Or use serverless (recommended) ## Connection Architecture ## What We Query AnomalyArmor runs these types of queries: ```sql -- List schemas SHOW SCHEMAS IN CATALOG production; -- List tables SHOW TABLES IN SCHEMA production.raw; -- Get table details DESCRIBE TABLE EXTENDED production.raw.events; -- Check freshness (for tables with timestamp columns) SELECT MAX(event_timestamp) FROM production.raw.events; ``` **Impact**: Minimal. These are metadata queries that don't scan table data. ## Troubleshooting ### Connection test fails **Common causes**: 1. Invalid or expired access token 2. Wrong workspace URL 3. Incorrect HTTP path **Solutions**: 1. Generate a new access token 2. Verify workspace URL matches your browser 3. Copy HTTP path directly from SQL Warehouse settings ### Permission denied errors **Causes**: - Token lacks catalog/schema permissions - Service principal not granted access **Solutions**: ```sql -- Check current permissions SHOW GRANTS ON CATALOG production; -- Grant necessary permissions GRANT USE CATALOG ON CATALOG production TO `your-user`; GRANT SELECT ON CATALOG production TO `your-user`; ``` ### Warehouse not found **Causes**: - Wrong HTTP path - Warehouse deleted or renamed **Solutions**: 1. Go to SQL Warehouses in Databricks 2. Copy the HTTP path from Connection Details 3. Ensure the warehouse exists and is accessible ### Discovery times out **Causes**: - Warehouse is starting up - Large number of tables **Solutions**: 1. Use a serverless warehouse (faster startup) 2. Extend warehouse auto-stop timeout 3. Filter to specific schemas if catalog is very large ### Token expired **Causes**: - Personal access token has expiry date **Solutions**: 1. Generate a new token with longer expiry 2. Use a service principal with OAuth (no expiry) 3. Update the token in AnomalyArmor Data Sources settings ## Best Practices ### Use Service Principals for Production Personal access tokens are tied to individual users. 
If that user leaves: - Token stops working - Monitoring breaks Service principals are organization-owned and persist regardless of user changes. ### Monitor Production Catalog Start with your production catalog where schema changes have the most impact. ### Schedule Discovery After ETL If you have predictable ETL schedules, run discovery after ETL completes to catch changes immediately: ``` ETL Schedule: 2:00 AM daily Discovery Schedule: 3:00 AM daily (1 hour after ETL) ``` ## Common Questions ### Does AnomalyArmor require Unity Catalog, or does the legacy Hive Metastore work? Unity Catalog is required. The legacy Hive Metastore does not expose the information-schema views AnomalyArmor needs for consistent cross-catalog monitoring. If you're still on Hive, migrating to Unity Catalog also unlocks most other modern observability tools. ### Should I use a SQL Warehouse or a general-purpose cluster for AnomalyArmor? SQL Warehouse - specifically Serverless or Pro. All-Purpose compute clusters work but start cold, which slows discovery. Use the smallest warehouse size; AnomalyArmor's queries are lightweight metadata and bounded aggregates. ### How do I authenticate AnomalyArmor to Databricks? Personal Access Token (PAT) on a dedicated service-principal user is the standard. Grant the service principal `USE CATALOG`, `USE SCHEMA`, and `SELECT` on the objects you want monitored. OAuth machine-to-machine is available as an Enterprise option. ### Will AnomalyArmor keep my Databricks warehouse warm and run up Databricks cost? No. Discovery runs in short bursts; your warehouse's `auto_stop_mins` setting takes over afterward. Set `auto_stop_mins` to 5-10 minutes on the AnomalyArmor warehouse to minimize idle cost. Serverless warehouses bill per second of activity and spin down instantly when idle. ### Can AnomalyArmor monitor Delta Live Tables or streaming tables? Yes. 
DLT materializations and streaming tables appear as ordinary Delta tables in Unity Catalog and are fully supported for schema, freshness, and metric monitoring. Freshness reads Delta's history for accurate last-update timestamps.

## Next Steps

- **[Run Discovery](/quickstart/run-first-discovery)**: Scan your Databricks catalog
- **[Set Up Alerts](/quickstart/set-up-first-alert)**: Get notified of schema changes

---

# ClickHouse

> Connect AnomalyArmor to ClickHouse databases

Connect AnomalyArmor to your ClickHouse database for schema monitoring and freshness tracking. We support both self-hosted ClickHouse and ClickHouse Cloud.

## Requirements

- **ClickHouse version**: 21.8 or higher
- **HTTP interface**: Enabled (default on most installations)
- **User credentials**: With read access to system tables
- **Network access**: From AnomalyArmor to your ClickHouse server

## Connection Settings

| Field | Description | Example |
|-------|-------------|---------|
| **Connection Name** | Friendly identifier | `ClickHouse Analytics` |
| **Host** | ClickHouse hostname | `xxx.clickhouse.cloud` |
| **Port** | HTTP(S) port | `8443` (HTTPS) or `8123` (HTTP) |
| **Database** | Database name | `default` |
| **Username** | ClickHouse user | `anomalyarmor` |
| **Password** | User password | `••••••••` |

### Port Configuration

| Port | Protocol | When to Use |
|------|----------|-------------|
| `8443` | HTTPS | **ClickHouse Cloud** and production |
| `8123` | HTTP | Development or internal networks |
| `9440` | Native TLS | Not supported (use HTTP interface) |

> ⚠️ Always use HTTPS (port 8443) for cloud-hosted or production ClickHouse. HTTP (8123) should only be used for local development.

## Provider-Specific Instructions

### ClickHouse Cloud

**Finding Connection Details**:

1. Go to your ClickHouse Cloud console
2. Select your service
3. Click **Connect** → **HTTPS**
4. 
   Copy the connection details

```
Host: abc123.us-east-1.aws.clickhouse.cloud
Port: 8443
Database: default (or your database name)
```

**IP Allowlisting**:

1. Go to **Settings** → **Security**
2. Under **IP Access List**, add AnomalyArmor IPs
3. Save changes

Add the AnomalyArmor IP addresses: `34.xxx.xxx.xxx/32` and `34.xxx.xxx.xxx/32`

**Creating a Read-Only User**:

```sql
-- Create user
CREATE USER anomalyarmor IDENTIFIED BY 'your-secure-password';

-- Grant read access
GRANT SELECT ON *.* TO anomalyarmor;
GRANT SHOW ON *.* TO anomalyarmor;

-- Access to system tables (required for discovery)
GRANT SELECT ON system.* TO anomalyarmor;
```

### Self-Hosted ClickHouse

**Verify HTTP Interface**:

Check your `config.xml` has HTTP enabled:

```xml
<http_port>8123</http_port>
<https_port>8443</https_port>
```

**Firewall Configuration**:

Allow inbound connections from AnomalyArmor:

```bash
# iptables example
iptables -A INPUT -p tcp --dport 8443 -s 34.xxx.xxx.xxx -j ACCEPT
iptables -A INPUT -p tcp --dport 8443 -s 34.xxx.xxx.xxx -j ACCEPT
```

**Creating a Read-Only User**:

```sql
-- Create user
CREATE USER anomalyarmor IDENTIFIED BY 'your-secure-password'
HOST IP '34.xxx.xxx.xxx', '34.xxx.xxx.xxx';

-- Grant read access
GRANT SELECT ON *.* TO anomalyarmor;
GRANT SHOW ON *.* TO anomalyarmor;
```

**SSL/TLS Setup** (recommended):

```xml
<https_port>8443</https_port>
<openSSL>
    <server>
        <certificateFile>/path/to/server.crt</certificateFile>
        <privateKeyFile>/path/to/server.key</privateKeyFile>
    </server>
</openSSL>
```

### Altinity Cloud

Altinity Cloud is a managed ClickHouse service:

**Connection Details**:

1. Go to your Altinity Cloud cluster
2. Find connection details in the cluster overview
3. Use the HTTPS endpoint

**IP Allowlisting**:

Add AnomalyArmor IPs to your cluster's allowed networks in the Altinity console.

**User Setup**:

Same as self-hosted ClickHouse. Create a read-only user through the Altinity console or SQL.
### Docker / Local Development

**Default Connection**:

```
Host: localhost (or container IP)
Port: 8123 (HTTP) or 8443 (HTTPS)
Database: default
Username: default
Password: (empty or as configured)
```

**Docker Compose Example**:

```yaml
services:
  clickhouse:
    image: clickhouse/clickhouse-server:latest
    ports:
      - "8123:8123"
      - "9000:9000"
```

> ℹ️ For local development, you may need to expose the ClickHouse port publicly or use a tunneling solution for AnomalyArmor to connect.

## Creating a Read-Only User

Full SQL script for setting up AnomalyArmor access:

```sql
-- Create dedicated user
CREATE USER IF NOT EXISTS anomalyarmor IDENTIFIED BY 'your-secure-password';

-- Grant read access to all databases
GRANT SELECT ON *.* TO anomalyarmor;

-- Grant ability to see databases and tables
GRANT SHOW ON *.* TO anomalyarmor;

-- Access to system tables (required for discovery)
GRANT SELECT ON system.tables TO anomalyarmor;
GRANT SELECT ON system.columns TO anomalyarmor;
GRANT SELECT ON system.databases TO anomalyarmor;
GRANT SELECT ON system.parts TO anomalyarmor;

-- Optional: Restrict to specific databases
-- GRANT SELECT ON analytics.* TO anomalyarmor;
-- GRANT SELECT ON production.* TO anomalyarmor;
```

### Verify Permissions

Test the user can access metadata:

```sql
-- Should work
SELECT database, name, engine FROM system.tables LIMIT 5;

-- Should work
SELECT database, table, name, type FROM system.columns LIMIT 5;
```

## What We Monitor

AnomalyArmor discovers and monitors these ClickHouse objects:

| Object Type | Monitored | Notes |
|-------------|-----------|-------|
| **Tables** | Yes | All table engines |
| **Views** | Yes | Standard views |
| **Materialized Views** | Yes | Including underlying tables |
| **Dictionaries** | No | Coming soon |
| **Functions** | No | Not supported |

### Metadata Captured

For each table:

- Database and table name
- Column names and data types
- Table engine type
- Partition information
- Last modification time (for
freshness)

## What We Query

AnomalyArmor runs these types of queries:

```sql
-- List databases
SELECT name FROM system.databases
WHERE name NOT IN ('system', 'INFORMATION_SCHEMA', 'information_schema');

-- List tables
SELECT database, name, engine, metadata_modification_time
FROM system.tables
WHERE database NOT IN ('system', 'INFORMATION_SCHEMA');

-- List columns
SELECT database, table, name, type, default_kind, default_expression
FROM system.columns
WHERE database NOT IN ('system', 'INFORMATION_SCHEMA');

-- Check freshness (for tables with timestamp columns)
SELECT MAX(event_time) FROM analytics.events;
```

**Impact**: Discovery queries read only system tables and do not scan your data. The freshness check is a single `MAX` aggregate over a timestamp column.

## ClickHouse-Specific Considerations

### Table Engines

AnomalyArmor works with all ClickHouse table engines:

| Engine | Schema Monitoring | Freshness |
|--------|-------------------|-----------|
| MergeTree family | Full | Yes |
| Log family | Full | Limited |
| Distributed | Full | Via underlying tables |
| View | Full | N/A |
| MaterializedView | Full | Yes |

### ReplicatedMergeTree

For replicated tables, connect to any replica. Schema changes propagate across all replicas, so monitoring one is sufficient.

### Distributed Tables

Distributed tables show the schema of the distributed table definition. Underlying shard tables are monitored separately if in the same cluster.

## Troubleshooting

### Connection refused

**Causes**:

- Wrong port (using native port instead of HTTP)
- Firewall blocking connection
- HTTP interface disabled

**Solutions**:

1. Verify port is 8443 (HTTPS) or 8123 (HTTP)
2. Check firewall/security group allows AnomalyArmor IPs
3. Verify HTTP interface is enabled in config.xml
4. Test: `curl https://your-host:8443/ping`

### Authentication failed

**Causes**:

- Wrong username or password
- User doesn't exist
- IP not in user's allowed hosts

**Solutions**:

1. Verify credentials
2. Check user exists: `SELECT * FROM system.users WHERE name = 'anomalyarmor'`
3.
Verify IP is allowed: Check user's HOST restrictions

```sql
-- View user's allowed hosts
SELECT name, host_ip, host_names FROM system.users;
```

### SSL certificate error

**Causes**:

- Self-signed certificate not trusted
- Certificate hostname mismatch

**Solutions**:

1. For ClickHouse Cloud: Should work automatically
2. For self-hosted: Ensure certificate is valid
3. Contact support if issues persist with valid certificates

### Permission denied

**Causes**:

- User lacks SELECT on system tables
- User lacks access to target databases

**Solutions**:

```sql
-- Grant required permissions
GRANT SELECT ON system.* TO anomalyarmor;
GRANT SELECT ON your_database.* TO anomalyarmor;
GRANT SHOW ON *.* TO anomalyarmor;
```

### No tables found

**Causes**:

- User can only see specific databases
- All tables in excluded system databases

**Solutions**:

1. Grant `SHOW` privilege: `GRANT SHOW ON *.* TO anomalyarmor`
2. Verify tables exist outside system databases
3. Check AnomalyArmor schema filters

## Best Practices

### Use HTTPS in Production

Always use port 8443 with HTTPS for production:

- Encrypted in transit
- Required by ClickHouse Cloud
- Protects credentials

### Connect to One Node

For clustered setups, connect to one node. System tables show cluster-wide metadata.

### Schedule Discovery After Mutations

If you have regular schema changes (ALTER TABLE), schedule discovery after those operations complete.

## Common Questions

### Which ClickHouse versions and deployments does AnomalyArmor support?

ClickHouse 21.8+ self-hosted and ClickHouse Cloud are fully supported. Both connect over ClickHouse's HTTP interface: port 8443 (HTTPS) for ClickHouse Cloud and production, or port 8123 (HTTP) for development and internal networks. The native TCP protocol is not supported. Older ClickHouse versions may work for basic monitoring but aren't officially supported, so upgrade if possible.

### Does AnomalyArmor support ClickHouse materialized views and dictionaries?

Materialized views are first-class assets (schema drift, freshness, metrics all work).
Dictionaries are discovered and cataloged but not actively monitored for freshness; treat them as reference metadata.

### Can AnomalyArmor monitor freshness on a ClickHouse `MergeTree` table that uses event-time partitioning?

Yes. Use your event-time column (e.g., `event_date`, `created_at`) as the freshness timestamp. AnomalyArmor queries `MAX(event_date)`, which ClickHouse resolves efficiently via partition metadata, even on billion-row tables.

### How much load does AnomalyArmor put on my ClickHouse cluster?

Very little. Discovery hits `system.tables` / `system.columns`, which are metadata tables. Freshness and metrics use `MAX` / `COUNT(*)` / `uniqExact`, which ClickHouse executes efficiently on partitioned tables; expect microseconds to low milliseconds per query.

## Next Steps

- **[Run Discovery](/quickstart/run-first-discovery)**: Scan your ClickHouse database
- **[Set Up Alerts](/quickstart/set-up-first-alert)**: Get notified of schema changes

---

# Microsoft SQL Server

> Connect AnomalyArmor to SQL Server databases including Azure SQL Database and on-premise instances

Connect AnomalyArmor to any Microsoft SQL Server database. This guide covers on-premise SQL Server, Azure SQL Database, and Amazon RDS for SQL Server.

## Supported Versions & Platforms

| Platform | Minimum Version | Notes |
|----------|-----------------|-------|
| SQL Server | 2012+ | On-premise or any cloud |
| SQL Server 2019 | Recommended | Best compatibility |
| SQL Server 2022 | Latest | Fully supported |
| Azure SQL Database | Any | All service tiers |
| Azure SQL Managed Instance | Any | All service tiers |
| Amazon RDS SQL Server | 2012+ | All instance classes |

> ⚠️ SQL Server 2008 and earlier are not supported. Please upgrade to SQL Server 2012+ for compatibility.
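To check an existing SQL Server instance against the minimum above, you can read its version with `SELECT SERVERPROPERTY('ProductVersion')`, which returns a string such as `15.0.2000.5`. SQL Server 2012 reports major version 11, and SQL Server 2008 and 2008 R2 report major version 10. The sketch below is illustrative only: `is_supported_sql_server` is a hypothetical helper, not an AnomalyArmor API, and it is meant for SQL Server itself (on-premise or RDS) rather than the Azure services, which the table lists as supported in any version.

```python
MIN_SUPPORTED_MAJOR = 11  # SQL Server 2012 reports major version 11


def is_supported_sql_server(product_version: str) -> bool:
    """Return True if a ProductVersion string is SQL Server 2012 or newer.

    `product_version` is the 'major.minor.build.revision' string returned by
    SERVERPROPERTY('ProductVersion'), for example '15.0.2000.5'.
    """
    major = int(product_version.split(".", 1)[0])
    return major >= MIN_SUPPORTED_MAJOR


if __name__ == "__main__":
    for version in ("10.50.6000.34", "11.0.7001.0", "15.0.2000.5"):
        print(version, is_supported_sql_server(version))
```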
## Connection Settings

| Field | Description | Example |
|-------|-------------|---------|
| **Connection Name** | Friendly identifier | `Production SQL Server` |
| **Host** | Hostname or IP address | `db.example.com` |
| **Port** | Database port | `1433` |
| **Database** | Database name | `myapp_production` |
| **Username** | SQL Server user | `anomalyarmor` |
| **Password** | User password | `********` |
| **SSL Mode** | SSL configuration | `require` |

### Authentication Methods

| Method | Supported | Notes |
|--------|-----------|-------|
| **SQL Server Authentication** | Yes | Username and password |
| **Windows Authentication** | No | Not currently supported |
| **Azure Active Directory** | No | Planned for future release |

> ℹ️ SQL Server Authentication (username/password) is required. Windows Authentication and Azure AD are planned for future releases.

## Creating a Read-Only User

Create a dedicated user with minimal permissions:

```sql
-- Create a login at the server level
CREATE LOGIN anomalyarmor WITH PASSWORD = 'YourSecurePassword123!';

-- Switch to your database
USE your_database;

-- Create a user for the login
CREATE USER anomalyarmor FOR LOGIN anomalyarmor;

-- Grant SELECT on schemas (repeat for each schema you want to monitor)
GRANT SELECT ON SCHEMA::dbo TO anomalyarmor;
GRANT SELECT ON SCHEMA::production TO anomalyarmor;

-- Grant VIEW DEFINITION for schema introspection
GRANT VIEW DEFINITION TO anomalyarmor;
```

### Verifying Permissions

Test that the user can access metadata:

```sql
-- Should return tables
SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'dbo';

-- Should return columns
SELECT COLUMN_NAME, DATA_TYPE FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA = 'dbo';
```

## Provider-Specific Instructions

### Azure SQL Database

**Connection Details**:

- **Host**: `yourserver.database.windows.net`
- **Port**: `1433`
- **SSL Mode**: Encryption is always enabled (TLS 1.2+)

**Firewall
Configuration**:

1. Go to **Azure Portal > SQL databases > Your database > Set server firewall**
2. Add a rule for each AnomalyArmor IP address (see Settings > Security)
3. Or enable "Allow Azure services" if AnomalyArmor runs in Azure

```
Firewall Rules
──────────────
Rule name       │ Start IP        │ End IP
AnomalyArmor-1  │ 34.xxx.xxx.xxx  │ 34.xxx.xxx.xxx
AnomalyArmor-2  │ 34.xxx.xxx.xxx  │ 34.xxx.xxx.xxx
```

**Service Tiers**:

All tiers are supported:

- Basic, Standard, Premium (DTU-based)
- General Purpose, Business Critical, Hyperscale (vCore-based)
- Serverless

> ⚠️ Azure SQL Database enforces encrypted connections. The SSL mode setting is informational - Azure will always use TLS 1.2+.

### Azure SQL Managed Instance

**Connection Details**:

- **Host**: `yourinstance.abc123.database.windows.net`
- **Port**: `1433` (default) or `3342` (public endpoint)
- **SSL Mode**: `require`

**Connectivity Options**:

| Method | Description |
|--------|-------------|
| **VNet Integration** | Connect through Azure VNet peering |
| **Public Endpoint** | Enable public endpoint on port 3342 |

**Public Endpoint Setup**:

1. Go to **Azure Portal > SQL managed instances > Your instance > Security > Networking**
2. Enable **Public endpoint**
3. Add AnomalyArmor IPs to **Deny public access** exceptions

> ℹ️ For VNet-based connectivity, contact us about Enterprise VPC peering options.

### Amazon RDS for SQL Server

**Connection Details**:

- **Host**: Your RDS endpoint (e.g., `mydb.abc123.us-east-1.rds.amazonaws.com`)
- **Port**: `1433` (default)
- **SSL Mode**: `require`

**Security Group Configuration**:

1. Go to **AWS Console > RDS > Your Instance > Security Groups**
2. Edit inbound rules
3.
Add rule:
   - Type: `MS SQL`
   - Port: `1433`
   - Source: AnomalyArmor IPs (see Settings > Security)

```
Security Group: sg-abc123
────────────────────────────────────────────────────────
Inbound Rules
MS SQL │ TCP │ 1433 │ 34.xxx.xxx.xxx/32 │ AnomalyArmor
MS SQL │ TCP │ 1433 │ 34.xxx.xxx.xxx/32 │ AnomalyArmor
```

**RDS SQL Server Editions**:

All editions are supported:

- Express Edition
- Web Edition
- Standard Edition
- Enterprise Edition

> ℹ️ RDS instances in private subnets require NAT Gateway or VPC peering for AnomalyArmor access.

### On-Premise SQL Server

**Connection Details**:

- **Host**: Your server's hostname or IP
- **Port**: `1433` (or custom port)
- **SSL Mode**: Depends on your setup

**Firewall Configuration**:

Allow inbound connections from AnomalyArmor IPs:

```powershell
# Windows Firewall example
New-NetFirewallRule -DisplayName "AnomalyArmor SQL" `
  -Direction Inbound -LocalPort 1433 -Protocol TCP `
  -RemoteAddress 34.xxx.xxx.xxx -Action Allow
```

**SQL Server Configuration**:

Ensure TCP/IP is enabled:

1. Open **SQL Server Configuration Manager**
2. Go to **SQL Server Network Configuration > Protocols**
3. Enable **TCP/IP**
4. Set TCP Port to `1433` (or your preferred port)
5. Restart SQL Server service

**Mixed Mode Authentication**:

SQL Server Authentication must be enabled:

1. Connect in SSMS as administrator
2. Right-click server > **Properties > Security**
3. Select **SQL Server and Windows Authentication mode**
4. Restart SQL Server service

**SSL/TLS Configuration** (recommended):

1. Install a valid SSL certificate on the server
2. In SQL Server Configuration Manager, go to **SQL Server Network Configuration > Protocols > Properties**
3.
Set **Force Encryption** to Yes ## What We Query AnomalyArmor runs these types of queries: ```sql -- Tables and views SELECT TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA NOT IN ('sys', 'INFORMATION_SCHEMA'); -- Columns SELECT COLUMN_NAME, DATA_TYPE, IS_NULLABLE, COLUMN_DEFAULT FROM INFORMATION_SCHEMA.COLUMNS; -- Freshness (for timestamp columns) SELECT MAX(your_timestamp_column) FROM your_table; ``` **Impact**: Minimal. These are lightweight metadata queries using standard INFORMATION_SCHEMA views. ## Excluded Schemas AnomalyArmor automatically excludes system schemas: - `sys` - SQL Server system objects - `INFORMATION_SCHEMA` - ANSI standard metadata views All user-created schemas are included by default. ## Troubleshooting ### Login failed for user **Error**: `Login failed for user 'anomalyarmor'` **Causes**: - Wrong username or password - SQL Server Authentication not enabled - User doesn't have access to the specified database **Solutions**: 1. Verify username and password are correct 2. Check SQL Server is in Mixed Mode authentication 3. Ensure the login exists: `SELECT name FROM sys.server_principals WHERE name = 'anomalyarmor'` 4. Ensure the user has database access ### Cannot open database **Error**: `Cannot open database 'mydb' requested by the login` **Causes**: - Database name is incorrect - User doesn't have access to the database - Database doesn't exist **Solutions**: 1. Verify database name (case-sensitive on some configurations) 2. Check user permissions: `SELECT name FROM sys.database_principals WHERE name = 'anomalyarmor'` 3. Grant access: `USE mydb; CREATE USER anomalyarmor FOR LOGIN anomalyarmor;` ### Connection refused or timeout **Error**: `Cannot connect to SQL Server` or connection timeout **Causes**: - Firewall blocking the connection - Wrong hostname or port - SQL Server not listening on TCP/IP - SQL Server Browser service not running (named instances) **Solutions**: 1. 
Verify AnomalyArmor IPs are allowlisted 2. Check firewall rules 3. Ensure TCP/IP protocol is enabled in SQL Server Configuration Manager 4. For named instances, ensure SQL Server Browser is running or specify the port 5. Test connectivity: `Test-NetConnection hostname -Port 1433` ### Azure SQL firewall error **Error**: `Cannot connect - firewall rule` or error 40615 **Causes**: - AnomalyArmor IP not in Azure SQL firewall rules - Public access is disabled **Solutions**: 1. Go to Azure Portal > SQL databases > Set server firewall 2. Add AnomalyArmor IP addresses to firewall rules 3. Ensure "Deny public network access" is Off ### Windows Authentication required **Error**: Error 18470 or "Windows authentication is required" **Causes**: - Server is configured for Windows Authentication only - SQL Server Authentication is disabled **Solutions**: 1. Enable Mixed Mode authentication in SQL Server properties 2. Restart SQL Server service 3. AnomalyArmor currently requires SQL Server Authentication ### No tables found in discovery **Causes**: - User lacks SELECT permission on schemas - User lacks VIEW DEFINITION permission - All tables are in excluded schemas **Solutions**: 1. Grant schema access: `GRANT SELECT ON SCHEMA::dbo TO anomalyarmor;` 2. Grant view definition: `GRANT VIEW DEFINITION TO anomalyarmor;` 3. Test query: `SELECT * FROM INFORMATION_SCHEMA.TABLES;` ## Common Questions ### Which SQL Server versions and services does AnomalyArmor support? SQL Server 2012+, Azure SQL Database (Single and Elastic Pool), Azure SQL Managed Instance, and Amazon RDS for SQL Server. SQL Server 2008 and earlier are not supported due to missing information-schema features. ### Do I need a SQL login or Windows authentication for Azure SQL? SQL login (username + password) is the recommended path for Azure SQL Database. Azure AD authentication is supported on Enterprise plans. 
Windows authentication is not supported for managed Azure SQL; for on-premise SQL Server, SQL auth is still the simplest setup. ### What permissions does AnomalyArmor need on SQL Server? Minimal read: grant the connection's login `CONNECT` to the database, `VIEW DEFINITION` on the schemas to monitor (so information_schema returns your tables), and `SELECT` on those tables. No `ALTER`, `INSERT`, or `sysadmin` needed. ### Does AnomalyArmor support Always Encrypted columns in SQL Server? AnomalyArmor doesn't decrypt Always Encrypted columns - nor does it need to. Schema drift and freshness work against encrypted columns because we only read metadata and aggregate over non-encrypted timestamp columns. If a column is Always Encrypted, we see its presence and type; its values stay encrypted end-to-end. ## Next Steps - **[Run Discovery](/quickstart/run-first-discovery)**: Scan your SQL Server database - **[Set Up Alerts](/quickstart/set-up-first-alert)**: Get notified of schema changes --- # Schema Monitoring Overview > Detect and track schema changes across your data assets Schema monitoring is AnomalyArmor's core capability. It automatically detects changes to your database structure (columns added, removed, renamed, or modified) before they break your pipelines and dashboards. 
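As an illustration of the kind of comparison schema monitoring performs, the sketch below diffs two column snapshots of one table. It is not AnomalyArmor's implementation: `diff_columns`, the change labels, and the sample columns are hypothetical, and a snapshot is assumed to be a simple mapping of column name to data type.

```python
from typing import Dict, List, Tuple


def diff_columns(before: Dict[str, str], after: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Compare two {column_name: data_type} snapshots of one table.

    Returns (change_type, column_name, detail) tuples sorted by column name.
    """
    changes = []
    for name in sorted(before.keys() | after.keys()):
        if name not in after:
            changes.append(("column_removed", name, before[name]))
        elif name not in before:
            changes.append(("column_added", name, after[name]))
        elif before[name] != after[name]:
            changes.append(("type_changed", name, f"{before[name]} -> {after[name]}"))
    return changes


if __name__ == "__main__":
    previous = {"id": "integer", "shipping_status": "varchar", "price": "integer"}
    current = {"id": "integer", "price": "decimal(10,2)", "delivery_estimate": "timestamp"}
    for change in diff_columns(previous, current):
        print(change)
```

A comparison based only on names and types cannot distinguish a rename from an unrelated removal and addition, so a renamed column surfaces as one `column_removed` entry plus one `column_added` entry.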
## Why Schema Monitoring Matters

Schema changes are one of the most common causes of data pipeline failures.

## What We Monitor

AnomalyArmor detects all structural changes to your database:

| Change Type | Description | Impact Level |
|------------|-------------|--------------|
| **Column Removed** | Existing column no longer exists | High - breaks queries |
| **Column Added** | New column appears | Low - usually safe |
| **Column Type Changed** | Data type modified | Medium - may affect logic |
| **Column Renamed** | Name changed (detected as remove + add) | High - breaks queries |
| **Table Added** | New table discovered | Low - informational |
| **Table Removed** | Table no longer exists | High - breaks queries |
| **Constraint Changed** | Primary key, foreign key, unique modified | Medium - may affect joins |

## How It Works

Schema monitoring happens during **discovery**:

1. **Discovery scans** your database's system catalogs
2. **Comparison** against the previous discovery state
3. **Changes recorded** with full before/after details
4. **Alerts triggered** based on your rules

## Key Features

### Automatic Detection

Changes are detected automatically during scheduled discovery, with no manual work required.

- Set discovery to run hourly, daily, or custom
- Changes detected on next run after they occur
- Zero configuration for basic monitoring

### Complete Change History

Every change is recorded with full context:

- **What changed**: Column name, type, constraints
- **When detected**: Timestamp of discovery
- **Before/after**: Previous and new values
- **Asset context**: Which table, schema, database

### Flexible Alerting

Route different changes to different places:

```
Production + Column Removed → PagerDuty + Slack #incidents
Production + Column Added → Slack #data-changes
Development + Any Change → Email digest
```

### Change Visualization

See schema changes at a glance:

*Schema diff showing before and after states*

## Getting Started

### 1.
Connect Your Database Schema monitoring starts when you [add a data source](/data-sources/overview). ### 2. Run Discovery [Trigger discovery](/quickstart/run-first-discovery) to establish a baseline. ### 3. Set Up Alerts [Create alert rules](/alerts/alert-rules) for schema changes: ``` Event: Schema Change Detected Scope: Your production database Destinations: Slack, Email, or PagerDuty ``` ### 4. Schedule Discovery Configure discovery to run regularly: | Environment | Recommended Schedule | |-------------|---------------------| | Production | Hourly | | Staging | Every 6 hours | | Development | Daily | ## Viewing Schema Changes ### In the Asset Catalog 1. Go to **Assets** 2. Click on a table or view 3. Select the **Schema History** tab You'll see: - Timeline of all changes - Before/after comparison for each change - Which discovery run detected it ### In Alert History 1. Go to **Alerts → History** 2. Filter by "Schema Change" 3. Click an alert to see change details ## Schema Change Types ### Column Removed **What**: A column that existed before is no longer present. **Risk**: High, any query referencing this column will fail. **Example**: ``` Table: orders Column removed: shipping_status (varchar) Detected: 2024-01-15 08:00 UTC ``` **Action**: Update downstream queries, models, and reports. ### Column Added **What**: A new column appears that didn't exist before. **Risk**: Low. Existing queries usually continue working. **Example**: ``` Table: orders Column added: delivery_estimate (timestamp) Detected: 2024-01-15 08:00 UTC ``` **Action**: Consider adding to models if relevant. ### Column Type Changed **What**: A column's data type was modified. **Risk**: Medium. May affect type-specific logic. **Example**: ``` Table: products Column: price Type change: integer → decimal(10,2) Detected: 2024-01-15 08:00 UTC ``` **Action**: Review aggregations, comparisons, and type casting. ### Table Removed **What**: An entire table no longer exists. **Risk**: High. 
All queries to this table will fail. **Example**: ``` Table removed: legacy_orders Last seen: 2024-01-14 08:00 UTC Detected: 2024-01-15 08:00 UTC ``` **Action**: Urgent. Investigate if intentional. ## Best Practices ### Monitor Production Most Closely Production schema changes have the highest impact. Run discovery more frequently and set up immediate alerts. ### Route by Change Type Not all changes are equally urgent: | Change | Alert Destination | |--------|------------------| | Column/Table removed | PagerDuty + Slack | | Type changed | Slack | | Column/Table added | Email or Slack (low priority) | ### Coordinate with Change Management When you know schema changes are coming: 1. Inform your team 2. Update downstream code first (if possible) 3. Make the schema change 4. Verify AnomalyArmor detects it 5. Confirm no unexpected alerts ### Use Schema History for Debugging When a pipeline fails, check schema history: 1. Find the asset in AnomalyArmor 2. Look at recent schema changes 3. Correlate timing with pipeline failure 4. Identify the breaking change ## Common Questions ### How quickly will I know when a column is removed? Detection latency equals your discovery schedule. Hourly discovery catches removals within an hour, daily discovery within 24 hours. Alerts fire as soon as the comparison completes. See [Asset Discovery](/schema-monitoring/asset-discovery) to adjust the schedule. ### Does schema monitoring require writing any code or queries? No. Schema monitoring is automatic once you connect a data source and run discovery. You only configure alert rules to decide where notifications go. ### Can I get different alerts for different types of schema changes? Yes. Route breaking changes like column removals to PagerDuty, type changes to Slack, and additions to an email digest. Configure this in [alert rules](/alerts/alert-rules) by filtering on change type. ### How does AnomalyArmor tell a rename from a remove-plus-add? It doesn't, automatically. 
A rename is detected as a column removed plus a column added. The schema diff highlights the pair so you can confirm it's a rename and update downstream references. ### Is there a history of past schema changes I can review? Yes. Open any asset and select the **Schema History** tab to see every detected change with before and after values, timestamps, and the discovery run that found it. This is useful for correlating pipeline failures with structural changes. ## Related Topics - **[Schema Drift](/schema-monitoring/schema-drift)**: Deep dive into detecting and managing drift - **[Asset Discovery](/schema-monitoring/asset-discovery)**: How discovery finds and catalogs assets - **[Alert Rules](/alerts/alert-rules)**: Configure schema change alerts - **[Core Concepts](/concepts/schema-changes)**: Understand schema changes in depth --- # Asset Discovery > Configure and run discovery to catalog your database assets Discovery scans your databases to build a catalog of tables, views, and columns. Run it manually or on a schedule to keep your asset inventory current. ## Running Discovery Manually 1. Go to **Data Sources** in the sidebar 2. Find your data source 3. Click **Run Discovery** (or the refresh icon) 4. Wait for completion (typically 30 seconds to 5 minutes) You'll see "Discovery completed" with a summary of assets found. ## Scheduling Discovery Set discovery to run automatically: 1. Go to **Data Sources** → select your data source 2. Click **Settings** (gear icon) 3. Under **Discovery Schedule**, select frequency: | Schedule | Best For | Typical Use Case | |----------|----------|------------------| | **Hourly** | Production databases | Catch changes quickly | | **Every 6 hours** | Active development | Balance coverage and load | | **Daily** | Stable databases | Low-change environments | | **Weekly** | Archives, warehouses | Rarely modified systems | 4. Click **Save** Discovery runs at the start of each interval (e.g., hourly = on the hour). 
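To reason about when the next scheduled run will fire, the "start of each interval" rule can be sketched as follows. This is an assumption-laden illustration rather than AnomalyArmor's scheduler: `next_run` and the schedule keys are hypothetical, interval boundaries are assumed to align to midnight UTC, and the weekly schedule is left out because the anchor day is not stated above.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical schedule keys mapped to interval lengths.
INTERVALS = {
    "hourly": timedelta(hours=1),
    "every_6_hours": timedelta(hours=6),
    "daily": timedelta(days=1),
}


def next_run(now: datetime, schedule: str) -> datetime:
    """Return the first interval boundary strictly after `now`.

    `now` must be timezone-aware. Boundaries are counted from the Unix epoch,
    which places them on the hour, at 00/06/12/18 UTC, or at midnight UTC.
    """
    step = INTERVALS[schedule]
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    completed = (now.astimezone(timezone.utc) - epoch) // step  # whole intervals elapsed
    return epoch + (completed + 1) * step


if __name__ == "__main__":
    now = datetime(2024, 1, 15, 8, 20, tzinfo=timezone.utc)
    for name in INTERVALS:
        print(name, next_run(now, name).isoformat())
```

A schema change made just after a boundary waits almost a full interval before it is picked up, which is why shorter schedules reduce detection delay.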
## What Discovery Finds

During each run, discovery catalogs:

| Asset Type | What's Captured |
|------------|-----------------|
| Tables | Name, schema, columns, types, constraints |
| Views | Name, schema, underlying columns |
| Columns | Name, data type, nullability, defaults |
| Relationships | Foreign keys, primary keys |

## Discovery Duration

Discovery time depends on database size:

| Database Size | Typical Duration |
|---------------|------------------|
| < 100 tables | 30 seconds |
| 100-500 tables | 1-2 minutes |
| 500-2000 tables | 2-5 minutes |
| 2000+ tables | 5-15 minutes |

> ℹ️ Discovery runs read-only queries against system catalogs. It doesn't impact your database performance significantly.

## Incremental vs Full Discovery

**Incremental** (default): Checks for changes since last run. Faster.

**Full**: Re-scans everything. Use when:

- First run after connecting
- Suspect missed changes
- After major database migrations

To run full discovery:

1. Go to **Data Sources** → select your data source
2. Click the dropdown arrow next to **Run Discovery**
3. Select **Full Discovery**

## Viewing Discovery Results

After discovery completes:

1. Go to **Assets** in the sidebar
2. Filter by data source to see discovered assets
3. Click any asset to view details

New assets show a "New" badge for 24 hours.

## Discovery History

View past discovery runs:

1. Go to **Data Sources** → select your data source
2. Click **History** tab
3. See each run with:
   - Timestamp
   - Duration
   - Assets found/changed
   - Any errors

## Troubleshooting

### Discovery stuck or taking too long

**Causes:**

- Large database with many tables
- Slow network connection
- Database under heavy load

**Fix:**

1. Check discovery history for progress
2. Wait up to 30 minutes for large databases
3.
If stuck, cancel and retry during off-peak hours ### Discovery completed but missing tables **Causes:** - User doesn't have permission on those schemas - Tables in schemas excluded from discovery - Views the user can't access **Fix:** 1. Check your database user's permissions 2. Grant `SELECT` on `information_schema` (or equivalent) 3. Verify schema inclusion settings ### Discovery failed **Causes:** - Connection credentials changed - Database unreachable - Permission revoked **Fix:** 1. Check the error message in discovery history 2. Test the connection in Data Sources 3. Verify credentials and network access ## Common Questions ### Does discovery impact database performance? No. Discovery runs read-only queries against system catalogs (`information_schema` and equivalents), not against your data tables. The load is negligible even on large databases. ### How often should I run discovery? For production databases, run discovery **hourly** to catch schema changes quickly. Stable databases can run daily, and archives can run weekly. More frequent runs shorten the window between a change and detection. ### What's the difference between incremental and full discovery? Incremental discovery checks only for changes since the last run and is the default. Full discovery re-scans everything and should be used after the first connection, after major migrations, or when you suspect a missed change. ### Why are some tables missing after discovery? The most common cause is permissions. The database user AnomalyArmor connects with needs `SELECT` on `information_schema` (or the vendor equivalent) and access to the target schemas. Check also that the schema isn't excluded in your discovery settings. ### How long does discovery take on a large database? Databases under 100 tables finish in about 30 seconds. 500-2000 tables take 2-5 minutes, and 2000+ tables can take 5-15 minutes. If a scan runs longer than 30 minutes it's likely stuck, retry during off-peak hours. 
## See Also

- **[Discovery Concepts](/concepts/discovery)**: How discovery works under the hood
- **[Schema Changes](/concepts/schema-changes)**: Understanding detected changes

---

# Schema Drift Detection

> Detect and track schema changes over time to prevent data pipeline failures

Schema drift occurs when your database structure changes, intentionally or accidentally. AnomalyArmor detects drift automatically and alerts you before it causes downstream failures.

*Schema diff visualization showing column changes*

## What is Schema Drift?

Schema drift is any change to database structure:

| Column | Before (Discovery T1) | After (Discovery T2) | Change |
|--------|----------------------|---------------------|--------|
| id | `integer` | `integer` | No change |
| user_id | `integer` | `integer` | No change |
| status | `varchar` | -- | **Removed** (or renamed) |
| order_status | -- | `varchar` | **Added** (possibly renamed from `status`) |
| total | `decimal` | `decimal` | No change |
| shipping_date | `timestamp` | -- | **Removed** (or renamed) |
| shipping_estimate | -- | `timestamp` | **Added** (possibly renamed from `shipping_date`) |
| delivery_fee | -- | `decimal` | **Added** |
| created_at | `timestamp` | `timestamp` | No change |

## Types of Schema Drift

### Breaking Changes (High Risk)

Changes that will cause immediate failures:

| Change | Example | Impact |
|--------|---------|--------|
| **Column Removed** | `shipping_status` deleted | Queries referencing it fail |
| **Table Removed** | `legacy_orders` dropped | All queries to table fail |
| **Column Renamed** | `status` → `order_status` | Queries using old name fail |
| **Incompatible Type Change** | `varchar` → `integer` | Type mismatches, cast failures |

### Non-Breaking Changes (Low-Medium Risk)

Changes that usually don't cause immediate failures:

| Change | Example | Impact |
|--------|---------|--------|
| **Column Added** | New `delivery_fee` column | Existing queries work; new data available |
| **Table
Added** | New `order_items` table | No impact on existing queries | | **Compatible Type Change** | `varchar(50)` → `varchar(100)` | Usually transparent | | **Constraint Added** | New unique constraint | May affect inserts, not selects | ## How Drift is Detected AnomalyArmor compares consecutive discovery runs: | Column | Run #1 (Baseline) | Run #2 (Comparison) | Result | |--------|-------------------|---------------------|--------| | id | Present | Present | No change | | user_id | Present | Present | No change | | status | Present | Missing | **Removed** | | total | Present | Present | No change | | order_status | Missing | Present | **Added** | | created_at | Present | Present | No change | **Detected drift**: Column `status` removed, column `order_status` added (possibly a rename, investigate). ### Detection Timing | Discovery Schedule | Max Detection Delay | |-------------------|---------------------| | Every 15 minutes | 15 minutes | | Hourly | 1 hour | | Daily | 24 hours | > 💡 For production databases, hourly discovery provides a good balance of timeliness and resource usage. ## Viewing Schema Drift ### Asset Schema History 1. Navigate to **Assets** 2. Click on a table 3. 
Select **Schema History** tab You'll see: ![Schema change timeline displaying multiple changes with timestamps and details](/images/screenshots/schema-history-timeline.png) *Schema change history showing column additions, removals, and modifications over time* ### Change Details Click any change to see details: - **Before state**: Previous column definition - **After state**: New column definition (or "Removed") - **Detection time**: When the change was discovered - **Discovery run**: Which scan found it ## Schema Drift Alerts ### Setting Up Alerts Create rules to notify you of drift: | Field | Value | |-------|-------| | **Name** | Production Breaking Changes | | **Event** | Schema Change Detected | | **Data source** | `production-postgres` | | **Schema** | `public`, `analytics` | | **Change type** | Column Removed, Table Removed, Type Changed | | **Destinations** | Slack, PagerDuty | ### Alert Content Schema drift alerts include: ``` ⚠️ Schema Change Detected Asset: production.public.orders Changes detected: • Column removed: status (varchar) • Column added: order_status (varchar) Detected: January 15, 2024 at 8:00 AM UTC Discovery Run: #1234 [View Details] [View Asset] ``` ## Common Drift Scenarios ### Scenario 1: Column Rename **Symptom**: Column removed + column added with similar name ``` Detected changes: - Column removed: user_email - Column added: email_address This is likely a rename. ``` **Action**: 1. Confirm it's a rename (ask the upstream team) 2. Update all downstream references 3. Verify queries work with new name ### Scenario 2: Unannounced Migration **Symptom**: Multiple columns changed in related tables ``` Detected changes in users: - Column type changed: phone (varchar → jsonb) - Column added: phone_verified (boolean) Detected changes in user_profiles: - Column removed: phone_number ``` **Action**: 1. Identify the migration (check with upstream team) 2. Review impact on your queries 3. 
Update affected models ### Scenario 3: Table Dropped **Symptom**: Table no longer exists ``` Asset removed: public.legacy_orders Last seen: January 14, 2024 This could be: - Intentional removal - Renamed table - Permission change - Wrong database connected ``` **Action**: 1. Verify if intentional 2. Update any queries referencing it 3. Check if data was migrated elsewhere ### Scenario 4: Type Change **Symptom**: Column data type modified ``` Column: products.price Type change: integer → decimal(10,2) ``` **Action**: 1. Review aggregations (SUM, AVG) for precision changes 2. Check comparisons and arithmetic 3. Verify no type casting issues ## Managing Expected Drift ### Planned Schema Changes When you know changes are coming: 1. **Communicate**: Tell your team about upcoming changes 2. **Prepare**: Update downstream code in advance (staged rollout) 3. **Execute**: Make the schema change 4. **Verify**: Confirm AnomalyArmor detects it 5. **Acknowledge**: Mark alerts as expected ### Disabling Rules During Migrations For large migrations: 1. Go to **Alerts → Rules** 2. Toggle OFF relevant rules 3. Perform migration 4. Review detected changes 5. Toggle rules back ON > ⚠️ Don't forget to re-enable rules after migration! Set a calendar reminder. ## Drift Prevention ### Schema Versioning Coordinate schema changes with version control: - Use migration tools (Flyway, Alembic, Atlas) - Review migrations before applying - Test in staging first ### Change Communication Establish processes: - Schema change announcements - Dependency mapping - Impact assessment before changes ### Backward Compatibility When changing schemas: - Add new columns instead of modifying existing - Keep old columns during transition periods - Use views to maintain compatibility ## Troubleshooting ### Drift detected but change was intentional This is expected behavior. AnomalyArmor doesn't know intent. It reports all changes. Consider: 1. Acknowledging the alert 2. 
Documenting the change in your records 3. Disabling rules during planned migrations ### Drift not detected 1. **Check discovery schedule**: Has discovery run since the change? 2. **Check scope**: Is the table in a monitored schema? 3. **Check permissions**: Can AnomalyArmor user see the table? 4. **Run manual discovery**: Trigger discovery to check now ### Too many drift alerts 1. **Filter by change type**: Alert only on breaking changes (removed columns/tables) 2. **Exclude dev/test**: Don't monitor volatile environments 3. **Use patterns**: Exclude `*_temp`, `*_backup` tables ### False positive - rename detected as remove+add This is expected. AnomalyArmor can't automatically determine if a remove+add pair is a rename. You should: 1. Investigate the pair manually 2. Document the rename in your records 3. Update downstream references ## Best Practices 1. **Monitor production closely**: Run frequent discovery, alert immediately 2. **Differentiate environments**: Different rules for prod vs. dev 3. **Route by change type**: Breaking changes → PagerDuty; additions → Email 4. **Review history regularly**: Check drift patterns to identify risky areas 5. **Coordinate changes**: Communicate schema changes before making them ## Common Questions ### What counts as a breaking schema change? Column removals, table removals, column renames, and incompatible type changes (such as `varchar` to `integer`) will break queries immediately. Additions and compatible type widenings (`varchar(50)` to `varchar(100)`) usually don't. ### How do I avoid alert storms during a planned migration? Disable the relevant alert rules in **Alerts → Rules** before the migration, let the changes flow through, then toggle the rules back on. Set a calendar reminder, leaving rules off is a common mistake. ### Can AnomalyArmor detect renames automatically? No. A rename shows up as a column removed plus a column added with a similar name. 
The UI flags the pair as a possible rename, but confirming intent still requires a human to check with the upstream team. ### Why did drift get detected for a change I made intentionally? AnomalyArmor reports all structural changes, it has no way to know which ones you planned. Acknowledge the alert, document the change, and optionally disable the rule during known migration windows. ### What should I do first when a breaking change is detected in production? Open the asset's schema history to see exactly what changed and when. Then check downstream dependencies, such as views, models, and dashboards, that reference the affected column or table, and coordinate fixes before more queries fail. ### Does detection work on views and materialized views, not just tables? Yes. Discovery catalogs tables, views, and their underlying columns, so drift on views is detected the same way. See [Asset Discovery](/schema-monitoring/asset-discovery) for what's captured. ## Related Topics - **[Schema Monitoring Overview](/schema-monitoring/overview)**: Broader schema monitoring capabilities - **[Alert Rules](/alerts/alert-rules)**: Configure drift alerts - **[Schema Changes Concept](/concepts/schema-changes)**: Deep dive into change types - **[Asset Discovery](/schema-monitoring/asset-discovery)**: How assets are discovered --- # Data Quality Overview > Monitor the health and reliability of your data assets Data quality monitoring ensures your data is fresh, complete, and reliable. AnomalyArmor helps you detect data issues before they impact business decisions. ## Why Data Quality Matters Bad data leads to bad decisions: ## Data Quality Dimensions AnomalyArmor monitors key data quality dimensions: | Dimension | What It Means | How We Monitor | |-----------|---------------|----------------| | **Freshness** | Is data up to date? | Timestamp monitoring, SLAs | | **Completeness** | Did the right amount arrive? 
| Row count monitoring, ML anomaly detection | | **Metrics** | Are column values correct? | Statistical monitoring, anomaly detection | | **Schema** | Is structure correct? | Schema drift detection | | **Availability** | Is data accessible? | Discovery success/failure | ## Monitoring Capabilities - **[Freshness Monitoring](/data-quality/freshness-monitoring)**: Track when data was last updated and detect stale data before it impacts downstream consumers. - **[Row Count Monitoring](/data-quality/row-count-monitoring)**: Monitor row counts with ML-based anomaly detection or explicit thresholds. - **[Data Quality Metrics](/data-quality/metrics)**: Monitor null percentages, distinct counts, and other column-level statistics. Detect anomalies automatically. - **[Schema Monitoring](/schema-monitoring/overview)**: Detect structural changes to your database that could break pipelines and reports. - **[Report Badges](/data-quality/report-badges)**: Embed data quality status indicators in dashboards, wikis, and operational tools. ## How Data Quality Monitoring Works 1. **Discovery runs** on your configured schedule, **metrics captured** at defined intervals 2. **Metadata collected** including schema, timestamps, and metric values (row counts, null %, etc.) 3. **Compared against expectations** (SLAs, statistical baselines, previous state) 4. **Alerts fired** when expectations aren't met or anomalies detected ## Getting Started ### Set Up Freshness Monitoring 1. Navigate to an asset 2. Click **Freshness** tab 3. Select a timestamp column 4. Set your expected update frequency 5. Configure alert threshold ### Set Up Row Count Monitoring 1. Navigate to an asset 2. Click **Data Quality** tab 3. Scroll to **Row Count Monitoring** section 4. Click **Create Schedule** 5. Configure time window and check interval 6. Choose auto-learn or explicit thresholds ### Set Up Data Quality Metrics 1. Navigate to an asset 2. Click **Metrics** tab 3. Click **Create Metric** 4. 
Select metric type (null %, distinct count, etc.) 5. Configure capture interval 6. Enable anomaly detection (optional) ### Set Up Schema Monitoring Schema monitoring is automatic once you: 1. Connect a data source 2. Run discovery 3. Configure alert rules for schema changes ## Best Practices ### Start with Critical Assets Don't monitor everything at once. Focus on: - **Revenue-impacting tables**: Orders, payments, transactions - **Customer-facing data**: Data that powers dashboards and reports - **Compliance-required data**: Audit logs, regulatory reports ### Set Realistic Expectations Match SLAs to actual data patterns: | Data Type | Typical Freshness | |-----------|-------------------| | Real-time events | Minutes | | Hourly ETL | 1-2 hours | | Daily batches | Same-day | | Weekly reports | 1 week | ### Layer Your Monitoring Combine multiple checks for full coverage: **Critical table (orders):** - **Freshness**: Alert if >2 hours stale - **Completeness**: Alert if row count drops >50% - **Metrics**: Alert if null_percent exceeds 5% - **Schema**: Alert on any column removed - **Availability**: Alert if discovery fails ## Data Quality Dashboard View overall data health in the **Assets** section: | Indicator | Meaning | |-----------|---------| | Green | All checks passing | | Yellow | Warning threshold reached | | Red | SLA violated or issue detected | | Gray | Not monitored | ## Common Questions ### Which monitor should I set up first? Start with [freshness monitoring](/data-quality/freshness-monitoring) on revenue-impacting tables. Freshness catches the most common failure mode (ETL didn't run) with the least configuration, then layer in row counts and column metrics as you learn your patterns. ### What's the difference between freshness, row count, and metrics? **Freshness** answers "did data arrive on time?" **Row count** answers "did the right amount arrive?" **Metrics** answer "is the column-level data correct?" Critical tables benefit from all three. 
See [freshness](/data-quality/freshness-monitoring), [row count](/data-quality/row-count-monitoring), and [metrics](/data-quality/metrics). ### Do I have to monitor every table? No, and you shouldn't try. Focus on revenue-impacting tables, customer-facing dashboard sources, and compliance-required data. Most teams get 80% of the value from monitoring 20% of their assets. ### Can AnomalyArmor detect issues it wasn't explicitly configured for? Row count monitoring and metrics use anomaly detection against learned baselines, so they can flag unusual values without explicit thresholds. Freshness still needs an SLA, and schema drift detects all changes automatically once discovery runs. ## Related Topics - **[Freshness Monitoring](/data-quality/freshness-monitoring)**: Set up freshness SLAs - **[Row Count Monitoring](/data-quality/row-count-monitoring)**: Monitor row counts with ML anomaly detection - **[Data Quality Metrics](/data-quality/metrics)**: Track column-level statistics and detect anomalies - **[Alert Rules](/alerts/alert-rules)**: Configure data quality alerts - **[Report Badges](/data-quality/report-badges)**: Embed quality status in external tools --- # Freshness Monitoring > Set SLAs on data freshness and detect stale data before it impacts your business Freshness monitoring tracks when your data was last updated and alerts you when it becomes stale. Stale data can be just as damaging as wrong data. If your dashboard shows yesterday's numbers, decisions made today could be wrong. > ℹ️ **Prerequisites**: Before setting up freshness monitoring, you need: - A [connected data source](/data-sources/overview) with discovery completed - Assets with timestamp columns (`created_at`, `updated_at`, or similar) *Freshness timeline showing expected vs actual data updates* ## Why Freshness Matters Data has an expected update cadence. 
When that cadence breaks, something is wrong: ## How Freshness Monitoring Works AnomalyArmor tracks freshness by monitoring timestamp columns: 1. **You specify** which column indicates "when data was updated" 2. **Discovery queries** the maximum value of that column 3. **We compare** against your configured SLA 4. **Alert fires** if data is older than allowed ```sql -- What AnomalyArmor checks SELECT MAX(created_at) FROM orders; -- Result: 2024-01-15 07:58:32 -- Your SLA: Updated within 1 hour -- Current time: 2024-01-15 08:30:00 -- Data age: 32 minutes -- Status: OK Within SLA ``` ## Setting Up Freshness Monitoring ## SLA Strategies ### Start Conservative Set SLAs with buffer room to avoid false positives: | Actual Cadence | Recommended SLA | Why | |---------------|-----------------|-----| | Every hour | 2 hours | Allows for occasional delays | | Every day | 36 hours | Accounts for timing variations | | Every 15 min | 30 minutes | Double the expected window | > 💡 Start lenient and tighten over time. It's easier to make SLAs stricter than to deal with alert fatigue from SLAs that are too tight. 
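The comparison described under "How Freshness Monitoring Works" can be sketched in a few lines of Python. This is a minimal illustration, not AnomalyArmor's implementation: the function name `freshness_status`, its parameters, and the status strings are all hypothetical, and the optional `warning` threshold is an assumption added for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def freshness_status(last_update: datetime, now: datetime, sla: timedelta,
                     warning: Optional[timedelta] = None) -> str:
    """Classify data age against an SLA and an optional, earlier warning threshold.

    Hypothetical helper: the names and return values are illustrative only.
    """
    age = now - last_update
    if age > sla:
        return "violation"
    if warning is not None and age > warning:
        return "warning"
    return "ok"


# Values from the SQL example: MAX(created_at) and the current time, both in UTC.
last_update = datetime(2024, 1, 15, 7, 58, 32, tzinfo=timezone.utc)
now = datetime(2024, 1, 15, 8, 30, 0, tzinfo=timezone.utc)

print(freshness_status(last_update, now, sla=timedelta(hours=1)))
```

Both timestamps are timezone-aware UTC values here, so a column stored in local time would need converting before the subtraction; otherwise the computed age is skewed by the UTC offset.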
### By Data Criticality | Criticality | SLA Approach | Example | |-------------|--------------|---------| | Critical | Expected + 25% | Revenue: 1hr expected → 75min SLA | | High | Expected + 50% | Orders: 1hr expected → 90min SLA | | Medium | Expected × 2 | Analytics: 1hr expected → 2hr SLA | | Low | Expected × 3 | Reports: daily expected → 3 day SLA | ### By Update Pattern ### Streaming/Real-time For data expected within minutes: | Setting | Value | |---------|-------| | **Expected** | Every 5 minutes | | **Warning** | 10 minutes | | **Violation** | 15 minutes | | **Alert** | Slack + PagerDuty | ### Batch/Hourly For hourly ETL jobs: | Setting | Value | |---------|-------| | **Expected** | Every hour | | **Warning** | 80 minutes | | **Violation** | 2 hours | | **Alert** | Slack | ### Daily For daily batch loads: | Setting | Value | |---------|-------| | **Expected** | Daily by 6am | | **Warning** | 8am (2hr buffer) | | **Violation** | 12pm (6hr buffer) | | **Alert** | Email | ## Freshness Alerts ### Setting Up Alert Rules Create rules to notify you of freshness violations: | Field | Value | |-------|-------| | **Rule** | Critical Table Freshness | | **Event** | Freshness Violation | | **Assets** | orders, payments, users | | **Destinations** | Slack, PagerDuty | ### Alert Content Freshness alerts include: ``` [!] Freshness Violation Asset: production.public.orders Column: created_at Expected: Updated every 1 hour Last update: 3 hours 15 minutes ago Data age: 3h 15m (SLA: 1h) Detected: January 15, 2024 at 8:00 AM UTC [View Asset] [View Dashboard] ``` ## Handling Special Cases ### Weekends and Holidays Some data doesn't update on weekends: **Options:** 1. **Longer weekend SLA**: Set different thresholds for Saturday/Sunday 2. **Pause monitoring**: Temporarily disable freshness checks 3. **Adjust expectations**: Set SLA to "72 hours" to cover full weekends ### Maintenance Windows During planned maintenance: 1. 
**Disable rules**: Toggle OFF freshness alert rules for affected assets 2. **Document**: Note expected staleness 3. **Re-enable**: Toggle rules back ON after maintenance ### Infrequently Updated Tables Some tables legitimately update rarely: | Table Type | Update Pattern | SLA Approach | |-----------|----------------|--------------| | Reference/lookup | Monthly | 45-day SLA or no monitoring | | Historical archive | Never | Don't monitor freshness | | Dimension tables | Daily/weekly | Match actual pattern | ## Viewing Freshness Status ### Asset List View In the **Assets** list, freshness status appears as indicators: | Indicator | Meaning | |-----------|---------| | Green | Within SLA | | Yellow | Warning threshold reached | | Red | SLA violated | | Gray | Freshness not configured | ### Asset Detail View Click an asset to see: - Current freshness status - Last update timestamp - Freshness history over time - SLA configuration The detail view shows a status card with the current freshness state, last update timestamp, data age, and a progress bar indicating how much of your SLA window has been consumed. Below that, a 7-day history shows freshness status over time. **Example freshness status:** | Field | Value | |-------|-------| | Status | ✓ Within SLA | | Timestamp Column | `created_at` | | Last Update | January 15, 2024 at 2:15 PM UTC | | Data Age | 45 minutes | | SLA Threshold | 2 hours | | Progress | 37% of SLA (45 min / 2 hr) | **Freshness History:** All checks in the last 7 days were within SLA. ## Troubleshooting ### False positive alerts **Problem**: Freshness alerts fire when data is actually fine. **Solutions**: 1. Verify timestamp column choice. Is it the right one? 2. Loosen SLA threshold 3. Check timezone handling (UTC vs. local) 4. Review discovery schedule vs. SLA timing ### Missed violations **Problem**: Data was stale but no alert fired. **Solutions**: 1. Verify freshness monitoring is enabled for the asset 2. Check alert rule configuration 3. 
Confirm discovery is running frequently enough 4. Verify timestamp column has recent values ### Wrong timestamp column **Problem**: Freshness showing incorrect values. **Solutions**: 1. Review column choice 2. For tables with updates, use `updated_at` not `created_at` 3. For ETL tables, use the load timestamp column 4. Ensure column is always populated (no NULLs) ### Timezone issues **Problem**: Freshness calculations seem off by hours. **Solutions**: 1. Check timestamp column timezone 2. AnomalyArmor normalizes to UTC 3. Ensure consistent timezone handling in your ETL ## Best Practices ### 1. Start with Critical Tables Focus monitoring on: - Revenue-impacting tables - Customer-facing dashboard sources - Compliance-required data ### 2. Align with Business Needs Ask: "When would stale data cause a problem?" | Scenario | Acceptable Staleness | SLA | |----------|---------------------|-----| | Real-time dashboard | Minutes | 15 min | | Daily executive report | Hours | Same-day | | Monthly compliance | Days | 1 week | ### 3. Coordinate with ETL Schedules | Phase | Time | |-------|------| | **ETL Schedule** | 2:00 AM daily | | **Processing time** | ~30 minutes | | **Data available** | ~2:30 AM | | **SLA** | 4:00 AM (90 min buffer) | ### 4. Use Warning Thresholds Configure two levels: 1. **Warning**: "Heads up, getting stale" → Slack 2. **Violation**: "SLA breached" → PagerDuty ## Common Questions ### Which timestamp column should I pick? Pick the column that updates when new data arrives. Use `created_at` for append-only tables like events and logs, `updated_at` for tables that get updated in place, and the load timestamp for ETL destination tables. Avoid columns that can be NULL. ### What's a sensible starting SLA? Start lenient, around double the expected cadence. If data is expected hourly, set a 2-hour SLA. It's easier to tighten thresholds than to fight alert fatigue from SLAs set too tight on day one. ### How do I handle tables that don't update on weekends? 
You can extend the SLA to cover the weekend (for example, 72 hours), set different thresholds for weekend days, or pause the freshness rule during known quiet periods. Choose based on how critical the data is. ### Why does the freshness calculation look off by several hours? Almost always a timezone issue. AnomalyArmor normalizes to UTC, so a timestamp column stored in local time will appear skewed. Confirm your ETL writes timestamps consistently and check the column's declared timezone. ### What's the difference between a warning and a violation? A **warning** fires before the SLA is breached (often at 80% of the threshold) as an early heads-up. A **violation** fires once the data is officially past SLA. Route them separately: warnings to Slack, violations to PagerDuty. ### Do I need freshness monitoring on every table? No. Focus on tables that power dashboards, revenue reporting, customer-facing products, or compliance processes. Reference tables and historical archives usually don't need freshness checks. ## What's Next - **[Set Up Metrics](/data-quality/metrics)**: Track row counts, null percentages, and detect anomalies - **[Configure Alerts](/alerts/alert-rules)**: Get notified when freshness SLAs are violated - **[Report Badges](/data-quality/report-badges)**: Embed freshness status in dashboards - **[Best Practices](/alerts/best-practices)**: Reduce alert fatigue --- # Data Quality Metrics > Monitor null percentages, distinct counts, and other column-level statistics to detect data quality issues Data quality metrics let you track statistical properties of your columns over time. AnomalyArmor captures metric values on a schedule, builds historical baselines, and automatically detects when values fall outside expected ranges. > ℹ️ **Looking for row count monitoring?** Use [Row Count Monitoring](/data-quality/row-count-monitoring) for tracking row counts with ML-based anomaly detection or explicit thresholds. 
> â„šī¸ **Prerequisites**: Before creating metrics, you need: - A [connected data source](/data-sources/overview) with discovery completed - At least one asset (table/view) to monitor **Example scenario:** The `customer_email` column normally has ~3% null values. On Jan 30, null percentage jumped to 12.3%, well outside the expected range band. AnomalyArmor flags this as an anomaly, indicating a potential data quality issue in the source system. ## Why Use Metrics Freshness tells you *when* data was updated. Completeness tells you *how much* arrived. Metrics tell you *what changed* at the column level: | Issue | Freshness | Completeness | Metrics | |-------|-----------|--------------|---------| | ETL job failed completely | Detects it | Detects it | Detects it | | ETL ran but loaded 0 rows | Might miss it | **Catches it** | N/A | | Data loaded but 50% nulls | Misses it | Misses it | **Catches it** | | Unexpected duplicates | Misses it | Misses it | **Catches it** | | Values outside valid range | Misses it | Misses it | **Catches it** | **Use freshness for "did data arrive on time?"** **Use row count monitoring for "did the right amount of data arrive?"** **Use metrics for "is the column-level data quality correct?"** ## Metric Types All metrics require a specific column to monitor: | Type | Description | Best For | |------|-------------|----------| | `null_percent` | Percentage of null values | Detecting missing data | | `distinct_count` | Count of unique values | Cardinality monitoring | | `duplicate_count` | Count of repeated values | Data quality checks | | `min_value` | Minimum numeric value | Range validation | | `max_value` | Maximum numeric value | Outlier detection | | `mean` | Average numeric value | Central tendency | | `percentile` | Nth percentile value | Distribution analysis | ## Creating a Metric ## Viewing Metric History Each metric tracks historical values and displays them as a trend chart: - **Value line**: Actual metric values over time - 
**Anomaly band**: Expected range (mean +/- sensitivity * stddev) - **Anomaly points**: Values outside the band are flagged ### Reading the Chart | Indicator | Meaning | |-----------|---------| | Green line within band | Normal values | | Red dot outside band | Anomaly detected | | Gray dashed lines | Upper/lower bounds | ## Which Metric Type Should I Use? ### Is my table growing or shrinking unexpectedly? Use [Row Count Monitoring](/data-quality/row-count-monitoring). It provides ML-based pattern learning, time-windowed counting, and explicit threshold support for row count monitoring. ### Are there unexpected null values? Use **null_percent** on the column that shouldn't have nulls. Example: Monitor `customer_email` for null percentage. Alert if nulls exceed historical baseline (e.g., jumps from 2% to 15%). ### Are values within expected range? Use **min_value** and **max_value** on numeric columns. Example: Monitor `price` column. Alert if minimum drops below 0 (invalid) or maximum exceeds historical norms. ### Is data being duplicated? Use **duplicate_count** on columns that should be unique. Example: Monitor `order_id` for duplicates. Any duplicates indicate a data quality issue. ### How many unique values exist? Use **distinct_count** on categorical columns. Example: Monitor `country_code` distinct count. A sudden increase might indicate invalid data. 
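The anomaly band from "Viewing Metric History" (mean +/- sensitivity * stddev) can be sketched as follows. This is an illustration under stated assumptions rather than the product's actual algorithm: the helper names `anomaly_band` and `is_anomaly` are hypothetical, the history values are made up, and the use of the sample standard deviation is an assumption, since the documentation does not say which estimator is used.

```python
from statistics import mean, stdev


def anomaly_band(history, sensitivity=2.0):
    """Return (lower, upper) as mean -/+ sensitivity * sample standard deviation."""
    center = mean(history)
    spread = stdev(history)  # assumption: sample stddev; the docs don't specify
    return center - sensitivity * spread, center + sensitivity * spread


def is_anomaly(value, history, sensitivity=2.0):
    """Flag a value that falls outside the band computed from history."""
    lower, upper = anomaly_band(history, sensitivity)
    return value < lower or value > upper


# Hypothetical null_percent history for customer_email, hovering around 3%.
history = [2.8, 3.1, 3.0, 2.9, 3.2, 3.0, 2.7, 3.3]

print(anomaly_band(history))
print(is_anomaly(12.3, history))  # the jump from the example scenario
print(is_anomaly(3.1, history))   # a typical value
```

Raising `sensitivity` widens the band and produces fewer flags; lowering it narrows the band and produces more.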
## Best Practices ### Start with High-Impact Metrics Focus on metrics that catch real problems: **Critical table (orders):** - **Completeness**: Catch data loss or duplication (see [Row Count Monitoring](/data-quality/row-count-monitoring)) - **null_percent** on `order_id`: Should never be null - **null_percent** on `customer_id`: Should never be null - **min_value** on `total_amount`: Should never be negative ### Match Capture Interval to Data Freshness | Data Update Pattern | Recommended Interval | |---------------------|---------------------| | Real-time streaming | Hourly | | Hourly batch jobs | Hourly | | Daily batch jobs | Daily | | Weekly aggregates | Weekly | ### Use Meaningful Sensitivity Values | Scenario | Sensitivity | Rationale | |----------|-------------|-----------| | New table, learning patterns | 3.0 | Reduce noise while learning | | Established table, stable patterns | 2.0 | Balanced detection | | Critical data, low tolerance | 1.5 | More sensitive alerting | ## Troubleshooting ### Metric shows 'No data' **Causes:** - Metric was just created and hasn't captured yet - Capture job failed - Table is empty **Solutions:** 1. Wait for the next scheduled capture (check interval) 2. Trigger a manual capture: **Actions > Capture Now** 3. Check the table has data ### Too many false positive anomalies **Causes:** - Sensitivity is too low (too sensitive) - Normal data patterns are highly variable - Seasonality not accounted for **Solutions:** 1. Increase sensitivity (e.g., 2.0 to 3.0) 2. Allow more baseline data to accumulate (30+ days) 3. Consider if the variation is actually expected ### Missing real anomalies **Causes:** - Sensitivity is too high (not sensitive enough) - Baseline includes anomalous data - Capture interval too infrequent **Solutions:** 1. Decrease sensitivity (e.g., 3.0 to 2.0) 2. Reset baseline after fixing data issues 3. 
Increase capture frequency ### Metric capture failing **Causes:** - Database connection issues - Column was renamed or removed - Permission changes **Solutions:** 1. Check data source connection status 2. Verify column still exists 3. Check database user permissions ## Common Questions ### When should I use metrics versus row count monitoring? Use **metrics** for column-level checks like null rates, distinct counts, and numeric ranges. Use [Row Count Monitoring](/data-quality/row-count-monitoring) for table-level volume tracking; it has ML-based pattern learning and time-windowed counting that metrics don't. ### What sensitivity should I start with for anomaly detection? Start at **2.0** (balanced: alerts when a value is more than 2 standard deviations from the baseline mean). Drop to 1.5 for critical data where you want tight detection, or raise to 3.0 if you're seeing too many false positives from noisy patterns. ### How long before anomaly detection is useful? Anomaly detection needs a baseline. Expect rougher results for the first week or two while history accumulates. For stable patterns, 30+ days of baseline data gives the tightest, most trustworthy bands. ### Does AnomalyArmor read my column values? It runs aggregate queries (like `COUNT`, `MIN`, `MAX`, `AVG`) against your database to compute the metric. Only the numeric result is stored; individual row values aren't transmitted or retained. ### Can I monitor a metric on a custom SQL expression? The built-in metric types run against a specific column. For arbitrary SQL, use [Custom SQL Monitoring](/data-quality/custom-sql-monitoring) instead, which lets you write any `SELECT` that returns a numeric value. 
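To make the idea of aggregate-only queries concrete, here is a minimal sketch using an in-memory SQLite database. The table, the sample rows, and the exact SQL are illustrative assumptions, not the queries AnomalyArmor runs. In particular, `duplicate_count` is computed here as non-null values minus distinct values, which is one plausible reading of "count of repeated values".

```python
import sqlite3

# Hypothetical table and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, customer_email TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "a@example.com"), (2, None), (3, "b@example.com"),
     (4, "a@example.com"), (5, "c@example.com")],
)

# A single aggregate query: only three numbers come back, never row values.
null_percent, distinct_count, duplicate_count = conn.execute(
    """
    SELECT
        100.0 * SUM(CASE WHEN customer_email IS NULL THEN 1 ELSE 0 END) / COUNT(*),
        COUNT(DISTINCT customer_email),
        COUNT(customer_email) - COUNT(DISTINCT customer_email)
    FROM customers
    """
).fetchone()

print(null_percent, distinct_count, duplicate_count)
```

`COUNT(column)` and `COUNT(DISTINCT column)` both ignore NULLs in standard SQL, so the duplicate calculation is unaffected by missing values.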
## What's Next - **[Set Up Metric Alerts](/alerts/alert-rules)**: Get notified when metrics detect anomalies - **[Metrics API](/api/metrics)**: Automate metric management with the API - **[Report Badges](/data-quality/report-badges)**: Embed metric status in dashboards - **[Alert Rules](/alerts/alert-rules)**: Configure where alerts are sent --- # Row Count Monitoring > Monitor row counts with ML-based anomaly detection or explicit thresholds Row Count Monitoring tracks row counts in your tables over time. It detects when data volumes drop unexpectedly (data loss) or spike unusually (duplicate loads), helping you catch ETL issues before they impact downstream consumers. > ℹ️ **Why Row Count?** Row count monitoring used to be part of Data Quality Metrics. We moved it to its own feature with enhanced capabilities: ML-based pattern learning, time-windowed counting, and explicit threshold support. **Example scenario:** The orders table typically receives 45,000-55,000 rows daily. On Jan 30, only 15,234 rows were loaded, a 70% drop flagged as an anomaly, indicating a potential ETL failure. ## Configuration Reference ### Monitoring Mode {#monitoring-mode} Row Count Monitoring offers two approaches to fit different needs: #### Auto-Learn Mode (Recommended) Let AnomalyArmor learn your table's normal row count patterns: | Aspect | How It Works | |--------|--------------| | **Learning period** | Collects data for 7+ days to establish baseline | | **Pattern detection** | Identifies daily, weekly, and seasonal trends | | **Anomaly detection** | Uses statistical analysis (mean +/- stddev * sensitivity) | | **Best for** | Tables with consistent, predictable patterns | ``` Auto-learn example (orders table): Day 1-7: Learning... 
collecting baseline data
Day 8+: Baseline established (avg: 48,000, stddev: 3,200)
        Alerts if row count deviates significantly
```

#### Explicit Mode

Set specific row count thresholds when you know exactly what to expect:

| Setting | Description |
|---------|-------------|
| **Min rows** | Alert if row count falls below this value |
| **Max rows** | Alert if row count exceeds this value |
| **Best for** | Tables with known, fixed expectations |

```
Explicit example (daily_summary table):
Expected: Exactly 1 row per day
Min: 1, Max: 1
Alert if row count != 1
```

### Sensitivity {#sensitivity}

For auto-learn mode, sensitivity controls how strict the anomaly detection is. It's the multiplier applied to the standard deviation when calculating expected ranges.

| Sensitivity | Behavior | Use When |
|-------------|----------|----------|
| **1 (Tight)** | Very strict, catches small deviations | Critical data, low tolerance for anomalies |
| **2 (Balanced)** | Default, catches moderate deviations | Most tables, standard monitoring |
| **3 (Relaxed)** | Less strict, allows more variation | High natural variability, noisy data |
| **4 (Loose)** | Very permissive, only catches large deviations | Highly variable patterns, initial setup |

**Default**: 2 (balanced detection)

**Formula**: Expected range = mean ± (stddev × sensitivity)

```
Example with sensitivity = 2:
Mean: 48,000 rows
StdDev: 3,000 rows
Expected range: 42,000 - 54,000 rows (48k - 6k to 48k + 6k)
If actual count = 35,000 → ANOMALY (outside range)
If actual count = 51,000 → HEALTHY (within range)
```

> 💡 Start with sensitivity 2-3 for new monitors, then tighten to 1-2 once patterns are stable.

### Timestamp Column {#timestamp-column}

Optional column used to filter rows within the time window. Without a timestamp column, all rows in the table are counted regardless of when they were created.
**When to use:**
- Append-only tables that grow over time
- Event streams or log tables
- Tables where you care about recent data arrival

**When to skip:**
- Tables that are fully replaced on each load
- Snapshot tables with fixed row counts
- Dimension tables with slow-changing data

```
With timestamp column (orders.created_at):
Query: SELECT COUNT(*) FROM orders WHERE created_at >= NOW() - INTERVAL '24 hours'
Result: 48,000 (recent rows only)

Without timestamp column:
Query: SELECT COUNT(*) FROM orders
Result: 5,000,000 (all rows ever)
```

**Column requirements:**
- Must be a timestamp or datetime type
- Should represent when the row was created/ingested
- Should be indexed for performance

### Time Window {#time-window}

How far back to count rows when a timestamp column is specified. Choose based on your data load frequency:

| Window | Duration | Best For |
|--------|----------|----------|
| **1 hour** | Last 60 minutes | Real-time streaming, high-frequency events |
| **6 hours** | Last 6 hours | Hourly batch jobs, frequent updates |
| **12 hours** | Last 12 hours | Twice-daily pipelines |
| **24 hours** | Last day | Daily batch ETL (most common) |
| **168 hours** | Last 7 days | Weekly aggregates, slow-changing data |

```
Example: Daily batch job loads orders every night at 2 AM
Time window: 24 hours
Check interval: 6 hours (runs at 2 AM, 8 AM, 2 PM, 8 PM)

Check at 8 AM: Counts rows WHERE created_at >= 8 AM yesterday
               Includes last night's batch + today's streaming data
```

> ⚠️ Without a timestamp column, the time window setting is ignored and all rows are counted.
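
The windowed count described above can be sketched in plain Python. This is an illustration of the filtering logic only, under the assumption that the window is measured back from the moment the check runs. The function name `count_rows` and the sample timestamps are hypothetical and not AnomalyArmor internals.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional


def count_rows(
    created_at: Iterable[datetime],
    now: datetime,
    window_hours: Optional[int] = None,
) -> int:
    """Count rows, optionally restricted to the last `window_hours` hours.

    With no window (no timestamp column configured), every row is counted.
    """
    timestamps = list(created_at)
    if window_hours is None:
        return len(timestamps)
    cutoff = now - timedelta(hours=window_hours)
    # Mirrors: WHERE created_at >= NOW() - INTERVAL '<window_hours> hours'
    return sum(1 for ts in timestamps if ts >= cutoff)


if __name__ == "__main__":
    check_time = datetime(2024, 1, 30, 8, 0, tzinfo=timezone.utc)
    rows = [check_time - timedelta(hours=h) for h in (1, 23, 25, 100)]
    print(count_rows(rows, check_time, window_hours=24))
    print(count_rows(rows, check_time))
```

With a 24-hour window only the two most recent sample rows are counted, while the unwindowed call counts all four. This is the same contrast as the 48,000 versus 5,000,000 example.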
### Check Interval {#check-interval} How often to run the row count check and evaluate for anomalies: | Interval | Frequency | Best For | |----------|-----------|----------| | **1 hour** | Every hour | Real-time monitoring, critical tables | | **6 hours** | 4x per day | Standard monitoring, daily tables | | **12 hours** | 2x per day | Less critical tables, longer time windows | | **24 hours** | Once per day | Weekly tables, slow-changing data | **Cost considerations:** More frequent checks = more compute resources. Choose the interval that matches your SLA requirements. ``` Example check intervals for different scenarios: Scenario: Real-time event stream Time window: 1 hour Check interval: 1 hour Result: Hourly checks on last hour of data Scenario: Daily batch job Time window: 24 hours Check interval: 6 hours Result: 4 checks per day on last 24h of data Scenario: Weekly report table Time window: 168 hours Check interval: 24 hours Result: Daily checks on last week of data ``` > 💡 Check interval should be ≤ time window for meaningful monitoring. A 24-hour check interval with a 1-hour time window would miss most anomalies. ## Time-Windowed Counting For tables that accumulate data over time, use a timestamp column to count rows within a specific window: | Window | Use Case | |--------|----------| | **1 hour** | Real-time event streams | | **6 hours** | Frequent batch loads | | **12 hours** | Twice-daily pipelines | | **24 hours** | Daily batch ETL (most common) | | **168 hours** | Weekly aggregates | ``` Time-windowed counting (orders table with created_at): Without time window: COUNT(*) = 5,000,000 (all time) With 24h window: COUNT(*) WHERE created_at >= now() - 24h = 48,000 ``` > 💡 Use time-windowed counting for append-only tables. Without it, row counts only grow, making anomaly detection less useful. 
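
The rule that the check interval should be no longer than the time window can be expressed as simple arithmetic. The sketch below is only an illustration with hypothetical helper names. It computes how many checks run per day and how many hours between consecutive checks fall outside every counted window.

```python
def checks_per_day(check_interval_hours: float) -> float:
    """Number of checks that run in a 24-hour period."""
    return 24 / check_interval_hours


def unobserved_hours(time_window_hours: float, check_interval_hours: float) -> float:
    """Hours between two consecutive checks that no window covers.

    Zero means each check's window reaches back at least to the previous check.
    """
    return max(0, check_interval_hours - time_window_hours)


if __name__ == "__main__":
    # Daily batch scenario: 24-hour window, 6-hour interval.
    print(checks_per_day(6), unobserved_hours(24, 6))
    # Mismatched scenario: 1-hour window, 24-hour interval.
    print(checks_per_day(24), unobserved_hours(1, 24))
```

In the mismatched scenario, 23 of every 24 hours are never counted, which is why that pairing would miss most anomalies.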
## Setting Up Row Count Monitoring ## Understanding Results ### Status Indicators | Status | Meaning | Action | |--------|---------|--------| | **Healthy** | Row count within expected range | None needed | | **Anomaly** | Row count outside expected range | Investigate the cause | | **Learning** | Collecting baseline data | Wait for learning to complete | | **No Data** | No checks have run yet | Check will run on next interval | ### Anomaly Types | Anomaly | Possible Causes | |---------|-----------------| | **Row count too low** | ETL failure, data loss, filter bug, source issue | | **Row count too high** | Duplicate load, removed filter, upstream spike | | **Row count zero** | Complete ETL failure, wrong table, permissions | ## Best Practices ### Choose the Right Mode | Scenario | Recommended Mode | |----------|------------------| | Data patterns vary naturally | Auto-learn with sensitivity 2-3 | | Exact expectations known | Explicit with min/max thresholds | | New table, unknown patterns | Auto-learn with sensitivity 3-4 | | Critical data, low tolerance | Auto-learn with sensitivity 1-2 | ### Set Appropriate Windows | Data Pattern | Recommended Window | |--------------|-------------------| | Real-time streaming | 1 hour | | Hourly batch jobs | 6 hours | | Daily batch jobs | 24 hours | | Weekly aggregates | 168 hours | ### Start Conservative, Then Tighten 1. **Week 1**: Use auto-learn with sensitivity 3 (less sensitive) 2. **Week 2-4**: Review any anomalies, adjust if too noisy 3. **Month 2+**: Tighten to sensitivity 2 once patterns are stable ## Row Count vs. 
Metrics | Feature | Row Count | Data Quality Metrics | |---------|--------------|---------------------| | **Purpose** | Monitor row counts | Monitor column statistics | | **Scope** | Table-level | Column-level | | **ML-based** | Yes (auto-learn) | Yes (anomaly detection) | | **Time windows** | Yes | No | | **Explicit thresholds** | Yes | Via checks | **Use Row Count Monitoring for**: "Did the right amount of data arrive?" **Use Metrics for**: "Is the data quality correct?" (nulls, duplicates, ranges) ## Troubleshooting ### Status shows 'Learning' for too long **Causes:** - Not enough data points collected yet - Check interval is very long (weekly) **Solutions:** 1. Wait for at least 7 data points (7 days for daily checks) 2. Consider switching to explicit mode if you know expected values ### Too many false positive anomalies **Causes:** - Sensitivity is too low (too sensitive) - Natural data variation is high - Seasonality not yet learned **Solutions:** 1. Increase sensitivity (e.g., 2 to 3) 2. Allow more baseline data (30+ days) 3. Switch to explicit mode with wider thresholds ### Missing real anomalies **Causes:** - Sensitivity is too high (not sensitive enough) - Baseline includes anomalous data **Solutions:** 1. Decrease sensitivity (e.g., 3 to 2) 2. Switch to explicit mode with tighter thresholds ### Row count always zero with time window **Causes:** - Timestamp column has no recent data - Wrong timestamp column selected - Time window too narrow **Solutions:** 1. Verify timestamp column has data in the window 2. Check column data type (should be timestamp/datetime) 3. Widen the time window ## Common Questions ### Auto-learn or explicit mode, which should I pick? Use **auto-learn** when row counts fluctuate naturally, AnomalyArmor builds a statistical baseline and flags deviations. Use **explicit** when you know the exact min and max (for example, a daily summary that should always have exactly one row). ### Why configure a timestamp column? 
Without one, `COUNT(*)` returns all rows ever, so counts only grow and anomaly detection loses meaning. A timestamp column lets AnomalyArmor count only rows inside a time window like the last 24 hours, which is what you actually want to monitor for append-only tables.

### How long does auto-learn mode need before it starts alerting?

At least 7 data points. For a daily check interval that's 7 days; for hourly it's 7 hours. Until enough baseline accumulates, the status shows **Learning** and no anomalies fire.

### What does a sensitivity of 2 actually mean?

It's the multiplier on the standard deviation used to define the expected range. Expected range = `mean ± (stddev × sensitivity)`. Lower values (1) are stricter and catch smaller drifts; higher values (3-4) tolerate more variation.

### What's the difference between row count monitoring and data quality metrics?

Row count is table-level ("did the right volume arrive?") with ML pattern learning and time windows. [Metrics](/data-quality/metrics) are column-level ("are the values correct?"), tracking things like null percentages, distinct counts, and numeric ranges.

### What typically causes a row count spike?

The most common causes are a duplicate load, where the same batch ran twice, or a filter that was removed upstream. Sudden drops are usually ETL failures, a wrong source, or a filter change that excluded valid data. The anomaly page shows the timing so you can correlate with deploys.
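
The expected-range formula and the 7-data-point learning rule from the answers above can be sketched as follows. This is an approximation for illustration, not AnomalyArmor's implementation. The use of the sample standard deviation, the inclusive range boundaries, and all function names are assumptions.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple

MIN_BASELINE_POINTS = 7  # data points needed before anomalies can fire


def expected_range(avg: float, sd: float, sensitivity: float = 2.0) -> Tuple[float, float]:
    """Expected range = mean ± (stddev × sensitivity)."""
    margin = sd * sensitivity
    return (avg - margin, avg + margin)


def classify(history: Sequence[float], actual: float, sensitivity: float = 2.0) -> str:
    """Return 'Learning', 'Healthy', or 'Anomaly' for the latest row count."""
    if len(history) < MIN_BASELINE_POINTS:
        return "Learning"
    low, high = expected_range(mean(history), stdev(history), sensitivity)
    # Assumption: counts exactly on a boundary are treated as healthy.
    return "Healthy" if low <= actual <= high else "Anomaly"


if __name__ == "__main__":
    # The worked example from the Sensitivity section.
    print(expected_range(48_000, 3_000, 2))
    baseline = [45_000, 47_000, 48_000, 49_000, 51_000, 46_000, 50_000]
    print(classify(baseline, 15_234), classify(baseline, 51_000))
```

Raising `sensitivity` widens the range and reduces alerts. Lowering it narrows the range and catches smaller drifts, which matches the tuning advice in Troubleshooting.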
## What's Next

- **[Set Up Alerts](/alerts/overview)**: Get notified when row count anomalies are detected
- **[Data Quality Metrics](/data-quality/metrics)**: Monitor column-level statistics like null percentages
- **[Freshness Monitoring](/data-quality/freshness-monitoring)**: Track when data was last updated
- **[Report Badges](/data-quality/report-badges)**: Embed row count status in dashboards

---

# Custom SQL Monitoring

> Monitor any business rule by writing SQL queries that return a numeric value, with configurable thresholds and alerts.

Custom SQL monitors let you define arbitrary data quality checks by writing SELECT queries against your connected databases. Each monitor returns a single numeric value that is compared against thresholds you configure.

## When to use Custom SQL monitors

Custom SQL monitors are ideal when built-in monitors (freshness, row count, schema drift, metrics) do not cover your specific business rules:

| Use Case | Example SQL |
|----------|-------------|
| Zero-dollar transactions | `SELECT COUNT(*) FROM orders WHERE total_amount = 0` |
| Source-target reconciliation | `SELECT ABS(a.cnt - b.cnt) FROM (SELECT COUNT(*) cnt FROM source) a, (SELECT COUNT(*) cnt FROM target) b` |
| Null percentage | `SELECT 100.0 * COUNT(*) FILTER (WHERE email IS NULL) / COUNT(*) FROM users` |
| Duplicate detection | `SELECT COUNT(*) FROM (SELECT id, COUNT(*) FROM events GROUP BY id HAVING COUNT(*) > 1) t` |
| SLA compliance | `SELECT AVG(EXTRACT(EPOCH FROM completed_at - created_at)) / 3600 FROM jobs WHERE created_at > NOW() - INTERVAL '1 day'` |

> ℹ️ If your check is about **freshness** (is data arriving on time?), **row counts** (did volume change?), or **column statistics** (mean, null%, distinct count), use the dedicated built-in monitors instead. They provide richer baselines, anomaly detection, and require no SQL.

## Creating a Custom SQL monitor

## SQL requirements

Your query must follow these rules:

1.
**Must be a SELECT or WITH (CTE) statement**. INSERT, UPDATE, DELETE, DROP, and other DML/DDL keywords are rejected. 2. **Must return at least one row** with a numeric value in the first column. 3. **Runs with a configurable timeout** (default 60 seconds, maximum 300 seconds). Optimize your query if it times out. 4. **Runs against your database**, not AnomalyArmor's internal database. The query uses the same connection and credentials as asset discovery. ## Threshold configuration Custom SQL monitors use static bounds: | Configuration | Behavior | |---------------|----------| | Upper bound only | Alert when value > upper bound | | Lower bound only | Alert when value < lower bound | | Both bounds | Alert when value > upper bound OR value < lower bound | | Value at boundary | No alert (boundaries are inclusive) | ## Viewing results Each monitor execution stores: - The numeric **value** returned by the query - **Pass/Fail** status based on threshold comparison - **Execution time** in milliseconds - **Timestamp** of when the check ran Historical results are available on the monitor detail page, showing value over time with threshold lines overlaid. ## Alerts When a threshold is breached, an alert flows through the standard AnomalyArmor alert pipeline: 1. Alert rules matching `custom_sql` event type are evaluated 2. Matching rules route to configured destinations (Slack, email, webhook) 3. The alert appears in **Incidents** alongside freshness, schema, and metric alerts To set up alert routing, create an alert rule that includes the `custom_sql` event type. 
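
The static-bound rules in the Threshold configuration table can be sketched as a small function. The name `breaches_threshold` and its signature are hypothetical. The comparison logic follows the table: strict inequalities, so a value exactly on a bound does not alert.

```python
from typing import Optional


def breaches_threshold(
    value: float,
    lower: Optional[float] = None,
    upper: Optional[float] = None,
) -> bool:
    """Return True when the monitor value should raise an alert.

    Either bound may be omitted. A value equal to a bound is not a breach.
    """
    if upper is not None and value > upper:
        return True
    if lower is not None and value < lower:
        return True
    return False


if __name__ == "__main__":
    # Upper bound only: alert when more than 5 zero-dollar orders are found.
    print(breaches_threshold(6, upper=5), breaches_threshold(5, upper=5))
    # Both bounds: alert outside the 0..10 range.
    print(breaches_threshold(11, lower=0, upper=10), breaches_threshold(0, lower=0, upper=10))
```

A check like this returns a plain pass/fail, which corresponds to the **Pass/Fail** status stored with each execution.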
## Database-specific tips ### PostgreSQL ```sql -- Use FILTER for conditional aggregates SELECT COUNT(*) FILTER (WHERE status = 'failed') FROM jobs WHERE created_at > NOW() - INTERVAL '1 hour' ``` ### Snowflake ```sql -- Use IFF for conditional counts SELECT COUNT(IFF(status = 'failed', 1, NULL)) FROM jobs WHERE created_at > DATEADD('hour', -1, CURRENT_TIMESTAMP()) ``` ### BigQuery ```sql -- Use COUNTIF for conditional counts SELECT COUNTIF(status = 'failed') FROM `project.dataset.jobs` WHERE created_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR) ``` ### Redshift ```sql -- Use CASE inside COUNT SELECT COUNT(CASE WHEN status = 'failed' THEN 1 END) FROM jobs WHERE created_at > GETDATE() - INTERVAL '1 hour' ``` ## Best practices - **Keep queries fast.** Target under 10 seconds. Use appropriate indexes and time filters. - **Return a single value.** Multi-row results use only the first row's first column. - **Use meaningful names.** "Zero-Dollar Orders (Last Hour)" is clearer than "Custom Check 1". - **Start with wide thresholds** and tighten them as you learn normal ranges. - **Use the Run Now button** to validate your query before relying on scheduled execution. ## Common Questions ### When should I use a Custom SQL monitor instead of a built-in monitor? Use Custom SQL when your check is a business rule the built-ins don't cover, like zero-dollar transactions, cross-table reconciliation, or SLA compliance. Stick with [freshness](/data-quality/freshness-monitoring), [row count](/data-quality/row-count-monitoring), or [metrics](/data-quality/metrics) for those standard cases, they provide anomaly detection without any SQL. ### What's allowed in the SQL query? Only `SELECT` and `WITH` (CTE) statements. INSERT, UPDATE, DELETE, and DDL are rejected. The query must return at least one row with a numeric value in the first column. ### Where does the query run? 
It runs against **your connected database**, using the same credentials as asset discovery, not against AnomalyArmor's internal database. Write queries the way you would in your warehouse's SQL editor.

### What happens if my query is slow?

Each run has a default 60-second timeout (300 seconds max). If your query times out, add time filters, use indexes, or reduce the scope. Target under 10 seconds for frequent checks.

### Can I get alerts from Custom SQL monitors?

Yes. Threshold breaches fire alerts through the standard pipeline. Create an alert rule that matches the `custom_sql` event type and route it to Slack, email, webhook, or any configured destination.

### Does the monitor support both upper and lower bounds?

Yes. Configure either or both. With both set, the monitor alerts if the value exceeds the upper bound **or** drops below the lower bound. Values exactly at a boundary don't alert, because the bounds are inclusive.

## What's Next

- **[Freshness Monitoring](/data-quality/freshness-monitoring)**: Monitor data arrival timing automatically
- **[Metrics](/data-quality/metrics)**: Track column-level statistics with anomaly detection

---

# Report Badges

> Embed data quality status indicators in Databricks, Looker, Notion, and other tools

Report badges let you embed AnomalyArmor data quality status directly into your dashboards, wikis, and operational tools. See pass/fail status at a glance without leaving your workflow.

## Why Use Report Badges

Data teams context-switch constantly. Opening a separate monitoring tool to check data quality breaks flow. Badges provide "always-on" visibility into data quality wherever you work.
## Where Badges Work

Any tool that supports image embeds can display badges:

| Platform | How to Embed |
|----------|--------------|
| Databricks | Markdown cell: `![](badge-url)` |
| Looker | Custom visualization with HTML |
| Tableau | Web content or image object |
| Notion | Paste URL as embed |
| Confluence | Insert image by URL |
| Slack | Post badge URL directly |
| GitHub | Add to README |

## Badge Status

Badges show one of three states:

| Badge | Meaning |
|-------|---------|
| ![Passing](https://shields.io/badge/AnomalyArmor-Passing-brightgreen) | All monitored tables pass quality checks |
| ![2 Issues](https://shields.io/badge/AnomalyArmor-2_Issues-red) | Tables have schema drift, freshness violations, or metric anomalies |
| ![Unknown](https://shields.io/badge/AnomalyArmor-Unknown-lightgrey) | Badge disabled or configuration error |

## Creating a Badge

## Copying the Badge URL

After creating a badge:

1. Find the badge in the **Report Badges** list
2. Click **Copy URL** next to the badge
3. The URL is copied to your clipboard

Badge URLs follow this format:

```
https://app.anomalyarmor.ai/badges/{public-id}.svg
```

> ℹ️ Badge URLs are public and require no authentication. Anyone with the URL can view the badge status. URLs use randomly generated IDs that are not guessable.

## Embedding Examples

### Databricks Notebook

Add a markdown cell:

```markdown
# Data Quality Status

![Data Quality](https://app.anomalyarmor.ai/badges/abc123.svg)

The badge above shows real-time quality status for our analytics tables.
```

### Notion Page

1. Type `/embed`
2. Paste the badge URL
3. The badge renders inline

### GitHub README

```markdown
## Data Quality

[![Analytics Quality](https://app.anomalyarmor.ai/badges/abc123.svg)](https://app.anomalyarmor.ai)

Our analytics tables are monitored by AnomalyArmor.
```

### Slack

Simply paste the badge URL. Slack will render the SVG inline.
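
If you generate wiki pages or READMEs from a script, the Markdown snippets above can be produced programmatically. The helper below is a hypothetical convenience, not an AnomalyArmor API. It only formats strings following the embed patterns already shown, and `abc123` is the same placeholder ID used in the examples.

```python
from typing import Optional


def badge_markdown(
    badge_url: str,
    alt_text: str = "Data Quality",
    link_url: Optional[str] = None,
) -> str:
    """Build a Markdown image for a badge, optionally wrapped in a link."""
    image = f"![{alt_text}]({badge_url})"
    if link_url is None:
        return image
    # Wrapping the image in a link makes the badge clickable.
    return f"[{image}]({link_url})"


if __name__ == "__main__":
    url = "https://app.anomalyarmor.ai/badges/abc123.svg"
    print(badge_markdown(url))
    print(badge_markdown(url, "Analytics Quality", "https://app.anomalyarmor.ai"))
```

The second call reproduces the clickable form used in the GitHub README example.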
## Managing Badges

### Viewing All Badges

Navigate to **Report Badges** to see all badges:

- Badge status (live preview)
- Scope (asset and tag filters)
- Enabled checks (schema drift, freshness)
- Copy URL action

### Editing a Badge

1. Click the badge row in the list
2. Modify configuration
3. Click **Save**

The badge URL remains unchanged. Embedded badges automatically reflect the updated configuration.

### Deleting a Badge

1. Click the delete icon on the badge row
2. Confirm deletion

> ⚠️ Deleting a badge immediately invalidates the URL. Embedded badges will show "Unknown" status.

### Disabling vs Deleting

If you want to temporarily hide a badge:

1. Edit the badge
2. Toggle **Active** off
3. Save

Disabled badges show "Unknown" status but retain their URL for later reactivation.

## Caching Behavior

Badges cache their status for performance:

| Scenario | Cache Duration |
|----------|----------------|
| Default | 5 minutes |
| Embedded images | Respects HTTP cache headers |

This means:

- Badge status updates within 5 minutes of quality changes
- High traffic won't overload AnomalyArmor
- Multiple embeds of the same badge share cached results

## Best Practices

### 1. One Badge Per Logical Domain

Create badges for logical groupings:

**Good:**
- Analytics Schema Badge
- Core Tables Badge
- Customer-Facing Badge

**Avoid:**
- All Tables Badge (too broad)
- Single Table Badge (too narrow)

### 2. Use Tags for Precision

Combine asset selection with tags:

- **Asset**: production.analytics
- **Tags**: `["critical", "customer-facing"]`
- **Result**: Only critical, customer-facing tables in analytics

### 3. Place Badges Strategically

Embed badges where decisions are made:

| Location | Why |
|----------|-----|
| Dashboard header | Quality context before analysis |
| ETL documentation | Status of pipeline outputs |
| Data catalog | Trust indicator for consumers |
| Team wiki | Always-visible quality status |

### 4.
Link to Details

When embedding, make badges clickable:

```markdown
[![Status](badge-url)](https://app.anomalyarmor.ai/assets/your-asset)
```

This lets viewers click through to investigate issues.

## Troubleshooting

### Badge shows Unknown

**Causes**:
- Badge was deleted
- Badge was disabled
- Invalid badge URL

**Solutions**:
1. Verify the badge exists in Report Badges
2. Check the badge is marked Active
3. Ensure URL is correct and complete

### Badge not updating

**Causes**:
- Browser caching old image
- Within 5-minute cache window

**Solutions**:
1. Hard refresh the page (Cmd+Shift+R / Ctrl+Shift+R)
2. Wait 5 minutes for cache to expire
3. Add cache-busting parameter: `badge-url?t=timestamp`

### Badge shows Issues unexpectedly

**Causes:**
- Schema drift detected on monitored tables
- Freshness SLA violated
- Metric anomaly detected (row count, null percentage, etc.)
- Upstream tables have issues (if enabled)

**Solutions:**
1. Click badge or navigate to AnomalyArmor
2. Review asset details for specific issues
3. Resolve underlying quality problems

### Badge not rendering in tool

**Causes**:
- Tool doesn't support SVG images
- Tool blocks external image URLs
- Firewall blocking AnomalyArmor domain

**Solutions**:
1. Check tool's image/embed documentation
2. Try alternative embed method (HTML vs Markdown)
3. Verify network access to app.anomalyarmor.ai

## Security Considerations

- **Public URLs**: Badge URLs are accessible without authentication
- **Non-guessable IDs**: URLs use random UUIDs that cannot be enumerated
- **Read-only**: Badges only display status; they cannot modify data
- **No sensitive data**: Badges show pass/fail only, not table contents

> ℹ️ If you need to revoke access to a badge, delete it. A new badge for the same scope will have a different URL.

## Common Questions

### Where can I embed an AnomalyArmor badge?

Any tool that accepts an image URL.
That includes Databricks notebooks, Looker, Tableau, Notion, Confluence, Slack, and GitHub READMEs. Paste the badge URL like you would any SVG or Markdown image. ### Are badge URLs public? Yes. Badge URLs require no authentication so they render anywhere. They use randomly generated, non-guessable IDs and only show pass/fail status, never table contents or row data. Delete and recreate a badge to rotate the URL. ### How fast does badge status update when something breaks? Within **5 minutes**. Badges cache for performance so high-traffic embeds don't overload AnomalyArmor. If you need an immediate refresh, add a cache-busting query parameter or wait out the cache window. ### Why is my badge showing 'Unknown'? Either the badge was deleted, disabled, or the URL is malformed. Check **Report Badges**, confirm the badge is marked Active, and verify the embedded URL matches exactly. ### Can one badge cover multiple tables? Yes. Scope a badge to a schema and it will reflect the status of every table inside. Use tag filters (for example, `critical` or `customer-facing`) to narrow it further without creating one badge per table. ## Related Topics - **[Freshness Monitoring](/data-quality/freshness-monitoring)**: Set SLAs on data freshness - **[Data Quality Metrics](/data-quality/metrics)**: Track row counts, null percentages, and detect anomalies - **[Schema Monitoring](/schema-monitoring/overview)**: Detect structural changes - **[Alert Rules](/alerts/alert-rules)**: Get notified of quality issues --- # Coverage Tiers > Understand your monitoring coverage level and what each tier catches Every data asset in AnomalyArmor earns a coverage score based on the monitoring you have configured. The score maps to a tier that tells you, at a glance, how well-protected your data is. 
## The Four Tiers | Tier | Score | What You Catch | What Can Still Surprise You | |------|-------|---------------|---------------------------| | **Monitored** | 10-29 | Schema changes that break pipelines | Data going stale, values going wrong, volume drops | | **Protected** | 30-49 | Pipeline failures, data disappearing, schema breaks | Subtle quality issues, slow degradation | | **Verified** | 50-69 | All of the above, plus stale data and value corruption | Edge cases in complex transformations | | **Intelligent** | 70+ | All of the above, plus AI-powered anomaly detection and proactive alerts | Very little. You have full coverage across all six monitoring features. | Assets scoring below 10 show as **Not monitored** with a prompt to set up basic monitoring. ## What Goes Into Each Tier ### Monitored (entry level, any plan) - Schema drift monitoring enabled on the asset - At least one alert destination configured (Slack, email, webhook) - Discovery has been run (you know what tables you have) **Customer value**: "I'll know when my upstream sources change before it breaks my pipelines." ### Protected (standard monitoring) - Everything in Monitored - Row count metrics on important tables (detects data loss, duplicates, empty loads) - Alert rules routing anomalies to your team **Customer value**: "I'll know when data stops flowing or volumes look wrong. My team gets notified." ### Verified (strong monitoring) - Everything in Protected - Freshness monitoring on actively-updated tables (auto-learn mode) - Validity rules on critical columns (null checks, format validation, range bounds) - AI knowledge base generated for the asset **Customer value**: "I validate data quality, not just availability. I catch bad data before it reaches dashboards." ### Intelligent (top tier) - Everything in Verified - All coverage features above 70% - Alert routing verified (rules have destinations) - Daily briefing active **Customer value**: "Full observability with AI-powered analysis. 
Anomalies are automatically investigated. I get proactive guidance." ## Where You See Your Tier - **Asset cards** in the asset list (small badge with tier name) - **Asset detail page** header (full badge with score and breakdown) - **Daily briefing** header per asset - **Homepage dashboard** (lowest-tier asset highlighted for attention) - **Agent conversations** ("BalloonBazaar is at Verified tier, score 58") ## Best Practices - Aim for **Verified** on production assets that feed dashboards and reports - **Protected** is a solid baseline for all actively-used databases - **Monitored** is appropriate for staging environments and development databases - Review coverage quarterly as your data landscape changes - **[How to Reach Each Tier](/coverage-tiers/guide)**: Step-by-step guide for upgrading your coverage tier. - **[Coverage Score Explained](/coverage-tiers/score)**: How the score is calculated and how to improve it. ## Common Questions ### What's the difference between Monitored, Protected, Verified, and Intelligent? Each tier adds a layer of coverage. Monitored catches schema changes, Protected adds row-count anomaly detection, Verified adds freshness and validity checks, and Intelligent adds AI-powered anomaly detection and daily briefings. See [The Four Tiers](#the-four-tiers). ### What tier should a production table be at? Aim for Verified (50+) on production assets that feed dashboards and reports. Protected is a reasonable baseline for all actively-used databases. Monitored is fine for staging and development. See [Best Practices](#best-practices). ### Why is my asset showing 'Not monitored'? Assets below score 10 show as Not monitored: discovery hasn't run, schema drift isn't enabled, or there's no alert destination configured. Enable schema monitoring and connect a destination and you'll clear the threshold. ### Where can I see an asset's tier? 
Tiers show up as badges on asset cards in the asset list, on the asset detail page header, on the homepage dashboard (lowest-tier asset is highlighted), in daily briefings, and in agent replies. --- # How to Reach Each Tier > Step-by-step guide for upgrading your monitoring coverage tier Each tier builds on the previous one. You can reach Protected in about 2 minutes with auto-setup, and Verified in under 10 minutes. ## Not Monitored to Monitored (score 10+) **Time**: Under 1 minute **What you gain**: You'll know when upstream sources change their schema before it breaks your pipelines. ## Monitored to Protected (score 30+) **Time**: About 2 minutes **Or use the agent**: Tell the AnomalyArmor agent "set up Protected-level monitoring for my database" and it will handle the setup for you. **What you gain**: You'll catch data loss, duplicate loads, and empty pipelines. Your team gets notified automatically. ## Protected to Verified (score 50+) **Time**: About 5 minutes **What you gain**: You validate data quality, not just availability. You catch bad data before it reaches dashboards. ## Verified to Intelligent (score 70+) **Time**: About 5 minutes **What you gain**: Full observability with AI-powered anomaly detection. Proactive alerts and daily briefings keep you ahead of data issues. ## Using the Agent for Setup The AnomalyArmor agent can handle tier upgrades conversationally: ``` You: "What tier is my production database at?" Agent: "Your Production DB is at Protected tier (score 42/100). Schema Drift: 100% (25/25 pts) Freshness: 0% (0/25 pts) Metrics: 40% (8/20 pts) To reach Verified (50+), add freshness monitoring to 8 tables. Want me to set that up?" You: "Yes, set up freshness monitoring." Agent: "Done. Created 8 freshness schedules using auto-learn mode. Your score is now 54 - you've reached Verified tier." ``` ## Tips - **Start with the highest-impact feature**. The coverage breakdown shows which feature contributes the most points. Focus there first. 
- **Use recommendations**. The asset page shows specific recommendations for improving your score, with one-click setup buttons. - **Batch setup is fastest**. Use the "Apply All Recommendations" button or ask the agent to handle everything at once. ## Common Questions ### How long does it take to reach Verified on a production asset? Under 10 minutes if you use the one-click setup: about 2 minutes for Protected (schema drift + row counts + alert routing), then 5 more for freshness and validity rules. Asking the agent to handle it is usually faster than clicking through yourself. ### Can I upgrade tiers without configuring each monitor manually? Yes. Tell the AnomalyArmor agent "set up Protected-level monitoring for my database" and it will create the row-count metrics, freshness schedules, and alert rules for you. You can also use the **Apply All Recommendations** button on the asset page. ### Which feature should I add first to raise my score fastest? Check the coverage breakdown on the asset page. The feature with the largest gap between current coverage and 100% is usually also the one with the most unclaimed points. Start there. ### Do I need Enterprise to reach Intelligent tier? No. Intelligent tier is available on every plan. It requires all coverage features above 70% and alert routing verified. The limiter is plan capacity (monitor count), not tier eligibility. --- # Coverage Score Explained > How your coverage score is calculated and how to improve it Your coverage score is a weighted percentage (0-100) that reflects how thoroughly you're monitoring a data asset. Each monitoring feature contributes points based on its weight and your coverage percentage. 
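
As a rough sketch of the weighted sum, the snippet below uses the feature weights and the 20-table example from the Score Calculation section of this page, plus the tier cutoffs from the Coverage Tiers page. The function names and the decision to round the final total to the nearest integer are assumptions; the actual rounding behavior is not documented here.

```python
from typing import Dict

# Maximum points per feature, as listed in the Score Calculation table.
WEIGHTS: Dict[str, int] = {
    "schema_drift": 25,
    "freshness": 25,
    "row_count_metrics": 20,
    "alert_routing": 15,
    "validity_rules": 10,
    "intelligence_kb": 5,
}


def coverage_score(coverage: Dict[str, float]) -> int:
    """Sum weight × coverage fraction (0.0-1.0) over all six features."""
    total = sum(points * coverage.get(name, 0.0) for name, points in WEIGHTS.items())
    return round(total)  # assumption: the total is rounded to a whole number


def tier_for(score: int) -> str:
    """Map a score to its tier using the documented cutoffs."""
    if score >= 70:
        return "Intelligent"
    if score >= 50:
        return "Verified"
    if score >= 30:
        return "Protected"
    if score >= 10:
        return "Monitored"
    return "Not monitored"


if __name__ == "__main__":
    example = {
        "schema_drift": 1 / 1,        # enabled
        "freshness": 5 / 20,          # 5 of 20 tables
        "row_count_metrics": 8 / 20,  # 8 of 20 tables
        "alert_routing": 2 / 3,       # 2 of 3 rules have destinations
        "validity_rules": 0 / 20,
        "intelligence_kb": 1.0,       # knowledge base generated
    }
    score = coverage_score(example)
    print(score, tier_for(score))
```

A freshness-only configuration tops out at 25 points under these weights, which is why it lands in the Monitored tier.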
## Score Calculation

The score is the sum of weighted percentages across six monitoring features:

| Feature | Weight | Max Points | What It Measures |
|---------|--------|-----------|------------------|
| Schema Drift | 25% | 25 | Tables with active schema monitoring vs total |
| Freshness | 25% | 25 | Tables with freshness schedules vs total |
| Row Count Metrics | 20% | 20 | Tables with metrics configured vs total |
| Alert Routing | 15% | 15 | Alert rules with destinations vs total rules |
| Validity Rules | 10% | 10 | Tables with validity rules vs total |
| Intelligence KB | 5% | 5 | Whether AI knowledge base is fresh (less than 7 days old) |

**Total possible: 100 points**

### Example

An asset with 20 tables:

| Feature | Coverage | Calculation | Points |
|---------|----------|-------------|--------|
| Schema Drift | Enabled (1/1) | 100% x 25 | 25 |
| Freshness | 5 of 20 tables | 25% x 25 | 6 |
| Metrics | 8 of 20 tables | 40% x 20 | 8 |
| Alert Routing | 2 of 3 rules | 67% x 15 | 10 |
| Validity | 0 of 20 tables | 0% x 10 | 0 |
| Intelligence | KB generated | 100% x 5 | 5 |
| **Total** | | | **54** (Verified) |

## Score Color Coding

| Score Range | Color | Meaning |
|-------------|-------|---------|
| 0-25 | Red | Minimal monitoring, significant blind spots |
| 26-50 | Orange | Basic monitoring in place, gaps remain |
| 51-75 | Yellow | Good monitoring, some features not fully covered |
| 76-100 | Green | Full monitoring across all features |

## Common Questions

### Why is my coverage score low even though I have freshness monitoring enabled?

Freshness monitoring contributes a maximum of 25 points, so a freshness-only setup caps around 25 (Monitored tier). Add coverage across multiple features - schema drift, metrics, validity rules, and alert routing - to move up tiers.

### Does schema drift monitoring cover all my tables automatically?

Yes. Schema drift monitoring is asset-level, not per-table.
Enabling it once covers every table in that asset, which is why it contributes up to 25 points with a single toggle.

### What counts as a fresh intelligence knowledge base?

The AI knowledge base is considered fresh if it was generated within the last 7 days. Regenerate it after significant schema changes or when you add new tables.

### Does my coverage score affect my plan or pricing?

No. Coverage tiers are earned by configuring monitoring, not purchased. Any customer on any plan can work toward any tier. Higher plans do include AI investigations, which boost the Intelligence KB score component and make higher tiers easier to reach.

### How often does my coverage score update?

In real time. The score is calculated when you view the asset page or when the agent checks your tier, so it always reflects the current state of your monitoring configuration.

## Improving Your Score

The fastest ways to improve your score, by impact:

1. **Enable schema drift monitoring** (+25 points max, single toggle)
2. **Add freshness monitoring** (+25 points max, batch setup available)
3. **Add row count metrics** (+20 points max, batch setup available)
4. **Link destinations to alert rules** (+15 points max)
5. **Add validity rules** (+10 points max)
6. **Generate AI intelligence** (+5 points, single button)

Use the **Monitoring Recommendations** section on your asset page for one-click setup, or ask the agent to "improve my coverage score."

---

# Alerts Overview

> Get notified when data issues occur through flexible alert routing

Alerts are how AnomalyArmor notifies you when something needs attention. Whether it's a schema change, stale data, or a failed discovery job, alerts ensure the right people know at the right time.

*Alert pipeline from event to notification*

## How Alerts Work

Alerts follow a four-stage pipeline:

### 1. Event Detection

AnomalyArmor detects events during discovery runs:

| Event Type | Description |
|------------|-------------|
| **Schema Change** | Column added, removed, or type changed |
| **Freshness Violation** | Data not updated within SLA |
| **Discovery Failed** | Connection or permission error |
| **Asset Removed** | Table/view no longer exists |
| **New Asset** | Table/view discovered for first time |

### 2. Rule Evaluation

Each event is checked against your alert rules:

- **Scope**: Does the event match the rule's filters? (data source, schema, asset)
- **Conditions**: Does it meet additional criteria? (change type, etc.)
- **Active**: Is the rule enabled?

### 3. Suppression Check

*Alert suppression pipeline: schedule, blackout, cooldown, and daily limit checks*

Before delivery, alerts pass through suppression checks:

- **[Operating Schedules](/alerts/schedules)**: Is the event within the rule's active hours?
- **[Blackout Windows](/alerts/blackouts)**: Is a company-wide blackout currently active?
- **Cooldown**: Has this rule already fired recently?
- **Daily Limit**: Has the rule exceeded its daily notification cap?

Suppressed alerts are still recorded in the alert log for auditing.

### 4. Routing & Delivery

Matching events are sent to configured destinations:

- Rules can have multiple destinations
- Each destination can receive from multiple rules
- Deduplication prevents repeat alerts for the same event

## Supported Destinations

- **[Slack](/alerts/destinations/slack)**: Real-time channel notifications
- **[Email](/alerts/destinations/email)**: Individual or team distribution
- **[Webhooks](/alerts/destinations/webhooks)**: Custom integrations
- **[PagerDuty](/alerts/destinations/pagerduty)**: On-call escalation
- **[MS Teams](/alerts/destinations/ms-teams)**: Teams channel notifications

## Alert Components

### Rules

Rules define **when** alerts fire and **where** they go:

**Example: "Production Schema Changes"**

- **Event Type**: Schema Change Detected
- **Scope**: Data source = production-postgres
- **Conditions**: Change type = Column Removed
- **Destinations**:
  - Slack (#data-alerts)
  - PagerDuty (on-call)

See [Alert Rules](/alerts/alert-rules) for detailed configuration.

### Destinations

Destinations are the channels where alerts are delivered:

**Example: "Slack - Data Alerts"**

- **Type**: Slack
- **Channel**: #data-alerts
- **Workspace**: your-company.slack.com
- **Status**: Connected

Configure destinations before creating rules that use them.

### Alert History

All alerts are logged for review:

- View past alerts in **Alerts → History**
- Filter by date, type, destination, or asset
- See which rules triggered each alert
- Track response times and patterns

## Setting Up Alerts

### Quick Start

1. **Add a destination**: Connect Slack, email, or another channel
2. **Create a rule**: Define what triggers alerts and where they go
3. **Test**: Use "Send Test Alert" to verify delivery
4. **Monitor**: Review alert history and adjust thresholds

### Recommended First Rules

Start with these three rules:

| Rule | Event | Destination | Why |
|------|-------|-------------|-----|
| Schema Changes | Schema Change | Slack | Catch breaking changes |
| Stale Data | Freshness Violation | Slack | Detect pipeline failures |
| Connection Issues | Discovery Failed | Email | Know when monitoring breaks |

## Alert Deduplication

AnomalyArmor prevents alert storms:

- **Same event**: Won't re-alert for the same change until resolved
- **Cooldown period**: Configurable delay between repeated alerts
- **Aggregation**: Multiple changes can be grouped (coming soon)

## Managing Alerts

### Viewing Active Alerts

Go to **Alerts → Active** to see unresolved alerts:

- Filter by asset or date
- Click to view details and related changes
- Mark as acknowledged or resolved

### Disabling Rules

To temporarily stop alerts during maintenance:

1. Go to **Alerts → Rules**
2. Find the rule and toggle it **OFF**
3. After maintenance, toggle it back **ON**

### Reviewing History

**Alerts → History** shows all past alerts:

- When each alert fired
- Which rule triggered it
- Where it was delivered
- Alert details and context

Use history to:

- Identify alert fatigue (too many alerts)
- Find patterns (same asset always alerting)
- Tune thresholds and conditions

## Best Practices

### Start with critical assets

Don't alert on everything. Begin with your most important tables (revenue, users, orders) and expand from there.

### Match channels to urgency

- **PagerDuty**: Only for truly urgent issues requiring immediate response
- **Slack**: Team visibility, moderate urgency
- **Email**: Low urgency, informational, digests

### Set realistic thresholds

If your data updates hourly, don't set a 30-minute freshness SLA. Start lenient and tighten over time.

### Review and tune regularly

Check alert history weekly. If you're getting too many alerts, adjust rules. If you're missing issues, add coverage.
See [Alert Best Practices](/alerts/best-practices) for more detailed guidance.

## Troubleshooting

### Alerts not firing

1. Check rule is enabled (toggle ON)
2. Verify destination is connected (test it)
3. Confirm scope matches the asset
4. Ensure events are occurring (check discovery is running)

### Too many alerts

1. Add conditions to filter events
2. Exclude development/test schemas
3. Increase thresholds (e.g., longer freshness SLA)
4. Route different event types to different destinations

### Wrong destination receiving alerts

1. Check rule configuration
2. Verify destination is selected for the correct rule
3. Check for duplicate rules with different destinations

## Common Questions

### What kinds of events can AnomalyArmor alert on?

AnomalyArmor alerts on schema changes (columns or tables added, removed, or retyped), freshness violations when data falls behind its SLA, discovery failures from connection or permission errors, assets that have been removed, and newly discovered tables or views. See [Alert Rules](/alerts/alert-rules) for the full event vocabulary.

### How does AnomalyArmor prevent alert storms?

Every event passes through cooldown, daily-limit, and deduplication checks before delivery. The same unresolved event will not re-alert, and multiple related changes to one table are grouped into a single notification. Review [Best Practices](/alerts/best-practices) for additional tuning.

### Can one alert rule send to multiple destinations?

Yes. A single rule can fan out to Slack, PagerDuty, email, and any other configured destination at the same time, and the same destination can receive from many rules. This is how teams route breaking changes to PagerDuty while also posting them to a Slack channel for visibility.

### Where do I see alerts that were suppressed by a schedule or blackout?
Suppressed alerts still appear under **Alerts → History** with the suppression reason recorded (`outside operating hours`, `blackout period`, `cooldown`, or `daily limit`). Nothing is ever silently dropped.

### What are the first alert rules I should set up?

Start with three: schema changes on your production database routed to Slack, freshness violations on revenue tables routed to Slack, and discovery failures routed to email. Expand from there once you know the volume. See [Best Practices](/alerts/best-practices) for a recommended starting configuration.

## Next Steps

- **[Create Alert Rules](/alerts/alert-rules)**: Configure when and where alerts fire
- **[Set Up Slack](/alerts/destinations/slack)**: Connect your Slack workspace
- **[Best Practices](/alerts/best-practices)**: Reduce alert fatigue
- **[Freshness SLAs](/data-quality/freshness-monitoring)**: Set up data freshness alerts

---

# Create and Manage Alert Rules

> Configure when and how to receive alerts

Rules are the core of AnomalyArmor's alerting system. Each rule defines what events trigger alerts, what conditions must be met, and where notifications are sent.

> ℹ️ **Prerequisites**: Before creating alert rules, you need:

- A [connected data source](/data-sources/overview) with discovery completed
- At least one [alert destination](/alerts/destinations/slack) configured (Slack, email, etc.)
## Rule Components

Every alert rule has these parts:

## Creating a Rule

## Event Types in Detail

### Schema Change Detected

Triggers when table or column structure changes:

| Change | Description |
|--------|-------------|
| Column Removed | Existing column no longer exists |
| Table Removed | Entire table no longer exists |
| Column Type Changed | Data type modified |
| Column Added | New column appeared |
| Table Added | New table discovered |
| Column Renamed | Name change detected (remove + add) |

**Example rule:**

- **Event**: Schema Change
- **Scope**: Data source = production
- **Conditions**: Change type = Column Removed OR Table Removed
- **Destinations**: Slack, PagerDuty

### Freshness Violation

Triggers when data isn't updated within the SLA:

| Condition | Description |
|-----------|-------------|
| SLA exceeded | Latest timestamp older than threshold |
| Warning threshold | Approaching SLA (e.g., 80% of threshold) |
| Resolved | Data became fresh again |

**Example rule:**

- **Event**: Freshness Violation
- **Scope**: Asset = orders, payments, users
- **Conditions**: (none - all violations)
- **Destinations**: Slack

### Discovery Failed

Triggers when a discovery job can't complete:

| Failure Type | Description |
|--------------|-------------|
| Connection failed | Can't reach database |
| Authentication failed | Invalid credentials |
| Permission denied | User lacks access |
| Timeout | Discovery took too long |

**Example rule:**

- **Event**: Discovery Failed
- **Scope**: All data sources
- **Conditions**: (none)
- **Destinations**: Email, Slack

### Asset Removed

Triggers when a previously-known asset no longer exists:

**Common causes:**

- Table was intentionally dropped
- Table was renamed
- Permission revoked
- Discovery ran against wrong database

**Example rule:**

- **Event**: Asset Removed
- **Scope**: Data source = production
- **Destinations**: Slack

## Rule Examples

### Critical Tables Only

Monitor your most important tables closely:

| Field | Value |
|-------|-------|
| **Name** | Critical Tables - All Changes |
| **Event** | Schema Change Detected |
| **Data Source** | `production-postgres` |
| **Assets** | `orders`, `users`, `payments`, `products` |
| **Conditions** | All changes |
| **Destinations** | Slack `#critical-data`, PagerDuty |

### Production Breaking Changes

Alert only on potentially breaking changes. Pick **Breaking only** as the trigger scope and the rule will fire on any schema alert that contains at least one breaking change, no matter how many additive changes come along with it:

| Field | Value |
|-------|-------|
| **Name** | Production - Breaking Changes Only |
| **Event** | Schema Change Detected |
| **Data Source** | `production-postgres` |
| **Schema** | `public`, `analytics` |
| **Trigger scope** | Breaking only |
| **Destinations** | Slack `#data-alerts`, PagerDuty |

### Gold Tables - Change Freeze

Enforce a strict change-freeze policy on a set of critical tables by scoping to specific change types:

| Field | Value |
|-------|-------|
| **Name** | Gold Tables - Change Freeze |
| **Event** | Schema Change Detected |
| **Assets** | `gold.*` |
| **Trigger scope** | Specific types |
| **Change types** | `COLUMN_REMOVED`, `COLUMN_TYPE_CHANGED`, `PRIMARY_KEY_REMOVED`, `PRIMARY_KEY_MODIFIED` |
| **Destinations** | PagerDuty, Slack `#gold-table-freeze` |

### Non-Production Informational

Low-priority notifications for dev/staging:

| Field | Value |
|-------|-------|
| **Name** | Staging Schema Changes (Info Only) |
| **Event** | Schema Change Detected |
| **Data Source** | `staging-postgres` |
| **Conditions** | All changes |
| **Destinations** | Email (daily digest) |

### Freshness by Criticality

Different thresholds for different tables:

**Revenue Tables Freshness**

| Field | Value |
|-------|-------|
| **Event** | Freshness Violation |
| **Assets** | `orders`, `payments`, `revenue_*` |
| **Destinations** | Slack, PagerDuty |

**Analytics Tables Freshness**

| Field | Value |
|-------|-------|
| **Event** | Freshness Violation |
| **Assets** | `daily_*`, `weekly_*`, `analytics_*` |
| **Destinations** | Slack |

## Managing Rules

### Enabling/Disabling

Toggle rules on or off without deleting:

1. Go to **Alerts → Rules**
2. Find the rule
3. Click the toggle switch

Disabled rules won't evaluate events but preserve configuration.

### Editing Rules

Modify existing rules:

1. Go to **Alerts → Rules**
2. Click the rule name
3. Make changes
4. Click **Save**

> ℹ️ Changes take effect immediately. Events that already fired won't re-fire.

### Deleting Rules

Remove rules you no longer need:

1. Go to **Alerts → Rules**
2. Click the rule menu (⋮)
3. Select **Delete**
4. Confirm deletion

> ⚠️ Deleted rules can't be recovered. Consider disabling instead if you might need the rule later.

## Advanced Configuration

### Multiple Destinations

A single rule can send to multiple places:

**Rule: "Critical Schema Changes"**

- Slack `#data-alerts`
- Slack `#data-engineering`
- PagerDuty
- Email team@company.com (daily digest)

### Pattern Matching

Use patterns in asset filters:

| Pattern | Matches |
|---------|---------|
| `orders*` | orders, orders_2024, orders_archive |
| `*_staging` | orders_staging, users_staging |
| `analytics.*` | All tables in analytics schema |
| `*.users` | users table in any schema |

## Troubleshooting

### Rule not triggering

1. **Check enabled**: Is the toggle ON?
2. **Check scope**: Does the asset match the filter?
3. **Check conditions**: Are conditions too restrictive?
4. **Check events**: Is discovery running and finding changes?
5. **Check destination**: Is it configured correctly?

### Rule triggering unexpectedly

1. **Review scope**: Is it broader than intended?
2. **Check conditions**: Add filters to narrow matches
3. **Review event type**: Correct event selected?
4. **Check duplicates**: Multiple rules for same scope?

### Can't save rule

1. **Required fields**: Ensure name, event type, and destination are set
2. **Valid scope**: Check pattern syntax
3. **Destination exists**: Verify destination is configured

## Common Questions

### How do I alert only on breaking schema changes?

Create a Schema Change rule and set **Trigger scope** to **Breaking only**. The rule fires on any alert that contains at least one breaking change (column removed, type changed, primary key removed, etc.), even when mixed with additive changes. Additive-only alerts will be skipped.

### Can I scope an alert rule to specific tables or schemas?

Yes. Every rule has optional Data Source, Schema, and Asset filters, and all three accept glob patterns like `orders*`, `*_staging`, or `analytics.*`. Combine them to target, for example, only `gold.*` tables in your production warehouse.

### What happens when I disable an alert rule?

Disabled rules stop evaluating incoming events but keep their configuration intact, so you can toggle them back on later. This is safer than deleting a rule during a known-noisy window like a migration. Use a [blackout window](/alerts/blackouts) instead if you want to silence everything temporarily without touching individual rules.

### Can one rule send alerts to more than one destination?

Yes. Pick any number of configured destinations in the rule builder, and each receives a copy of the alert. Teams commonly route production breaking changes to PagerDuty, Slack, and a team email list from a single rule.

### How do I enforce a change freeze on critical tables?

Create a rule with **Trigger scope: Specific types** and pick the change types you want to block, such as `COLUMN_REMOVED`, `COLUMN_TYPE_CHANGED`, and `PRIMARY_KEY_REMOVED`. Scope the asset filter to your protected tables (for example `gold.*`) and route the rule to PagerDuty.
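
To make the scope and trigger-scope behavior described on this page more concrete, here is a minimal sketch of how such a rule could be evaluated. It is not AnomalyArmor's implementation: the `Rule` class, the function names, the `COLUMN_ADDED` identifier, the exact contents of the breaking set, and the choice to compare dotted patterns against `schema.table` are all assumptions made for illustration. Glob matching uses Python's standard `fnmatch` module.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase

# Assumed "breaking" set, limited to the identifiers this page mentions.
# The real list may be longer.
BREAKING = {
    "COLUMN_REMOVED",
    "COLUMN_TYPE_CHANGED",
    "PRIMARY_KEY_REMOVED",
    "PRIMARY_KEY_MODIFIED",
}


@dataclass
class Rule:
    """Hypothetical, simplified alert rule."""
    asset_patterns: list
    trigger_scope: str = "all"  # "all", "breaking_only", or "specific_types"
    change_types: set = field(default_factory=set)


def asset_matches(pattern, schema, table):
    """Glob-match a pattern against an asset.

    Assumption: patterns containing a dot (e.g. `gold.*`) are compared
    to "schema.table"; other patterns are compared to the table name.
    """
    target = f"{schema}.{table}" if "." in pattern else table
    return fnmatchcase(target, pattern)


def rule_fires(rule, schema, table, changes):
    """Return True if a schema alert with these change types triggers the rule."""
    if not any(asset_matches(p, schema, table) for p in rule.asset_patterns):
        return False
    if rule.trigger_scope == "breaking_only":
        # One breaking change is enough, even among additive changes.
        return any(c in BREAKING for c in changes)
    if rule.trigger_scope == "specific_types":
        return any(c in rule.change_types for c in changes)
    return bool(changes)


freeze = Rule(
    asset_patterns=["gold.*"],
    trigger_scope="specific_types",
    change_types={"COLUMN_REMOVED", "COLUMN_TYPE_CHANGED", "PRIMARY_KEY_REMOVED"},
)
breaking_only = Rule(asset_patterns=["orders*"], trigger_scope="breaking_only")

print(rule_fires(freeze, "gold", "orders", ["COLUMN_ADDED"]))                    # False
print(rule_fires(freeze, "gold", "orders", ["COLUMN_ADDED", "COLUMN_REMOVED"]))  # True
print(rule_fires(breaking_only, "public", "orders_2024", ["COLUMN_ADDED"]))      # False
```

Note that `fnmatch`'s `*` also matches dots, so the product's own matcher may treat edge cases differently. Check how a pattern behaves in the rule builder before relying on it.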
## Next Steps

- **[Set Up Slack](/alerts/destinations/slack)**: Configure Slack as a destination
- **[Best Practices](/alerts/best-practices)**: Reduce alert fatigue

---

# Slack

> Send AnomalyArmor alerts to Slack channels

Send AnomalyArmor alerts directly to your Slack workspace. Get notified about schema changes, freshness violations, and other data events where your team already works.

Alerts arrive as richly formatted Slack Block Kit messages with a color-coded severity bar, structured fields, and a **View Details** link back to AnomalyArmor.

**Example Slack Alert:**

**Breaking schema changes detected**

3 changes in 2 table(s) (2 breaking)

**Rule:** Production Schema Monitor

**Tables:** orders, customers

**Changes:**

- orders: Column Removed `shipping_status` (breaking)
- orders: Column Added `tracking_number`
- customers: Column Removed `legacy_id` (breaking)

[View Details]

## Why Slack?

Slack is ideal for alerts that need quick team visibility:

- **Real-time**: Alerts appear instantly in channels
- **Contextual**: Team can discuss and coordinate in threads
- **Actionable**: Click through to AnomalyArmor for details
- **Flexible**: Route different alerts to different channels

## Prerequisites

Before you begin:

- Admin access to your Slack workspace (or permission to install apps)
- AnomalyArmor account with alert configuration permissions
- At least one Slack channel where you want to receive alerts

## Setup Guide

### Step 1: Navigate to Destinations

1. Log in to AnomalyArmor
2. Click **Alerts** in the left sidebar
3. Select **Destinations** tab
4. Click **Add Destination**

You'll see the Destinations page with a **+ Add Destination** button in the top-right corner.

### Step 2: Select Slack

From the destination type list, click **Slack**.

### Step 3: Authorize with Slack

Click **Connect to Slack** to begin the OAuth flow. Slack will show a standard authorization dialog listing the permissions AnomalyArmor needs:

| Permission | Why We Need It |
|------------|---------------|
| Post to channels | Send alert messages |
| View channel list | Let you select destination channels |
| View workspace info | Display workspace name in AnomalyArmor |

> ℹ️ AnomalyArmor requests only the minimum permissions needed to send alerts. We cannot read your messages, files, or private channels.

### Step 4: Select Channels

After authorization, choose which channels can receive alerts:

1. **Public channels**: Any public channel in your workspace
2. **Private channels**: Must invite the AnomalyArmor bot first

To use a private channel:

1. Open the private channel in Slack
2. Type `/invite @AnomalyArmor`
3. The channel will appear in AnomalyArmor's channel list

### Step 5: Name Your Destination

Give the destination a descriptive name:

```
Examples:
- "Slack - #data-alerts"
- "Production Alerts - Slack"
- "Platform Team Slack"
```

This name appears when creating alert rules, so make it recognizable.

### Step 6: Test the Connection

Click **Send Test Alert** to verify everything works:

```
[OK] Test alert sent successfully
Channel: #data-alerts
Message: "Test alert from AnomalyArmor"
```

Check Slack to confirm the message arrived.

### Step 7: Save

Click **Save** to complete the setup. Your Slack destination is now ready to use in alert rules.

## Alert Message Format

AnomalyArmor Slack alerts are formatted using Slack's Block Kit for rich, structured messages. Each alert type displays contextually relevant information.
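
For readers unfamiliar with Block Kit, the sketch below builds a message with the same general shape as the alerts shown on this page: a title, a summary line, context fields, a change list, and a **View Details** button, wrapped in an attachment so that a colored bar appears. It only illustrates Slack's message format. The function name, the hex colors, the input dictionaries, and the exact block layout are assumptions, and the payload AnomalyArmor actually sends may differ.

```python
import json

# Illustrative hex values; the docs only name red, orange, yellow, and blue.
SEVERITY_COLORS = {
    "critical": "#d32f2f",
    "high": "#f57c00",
    "medium": "#fbc02d",
    "low": "#1976d2",
}


def build_schema_alert(rule_name, severity, changes, details_url):
    """Assemble a Slack message dict for a schema change alert (hypothetical layout)."""
    tables = list(dict.fromkeys(c["table"] for c in changes))  # unique, in order
    breaking = sum(1 for c in changes if c["breaking"])
    title = "Breaking schema changes detected" if breaking else "Schema changes detected"
    lines = [
        f"- {c['table']}: {c['change']} `{c['column']}`"
        + (" (breaking)" if c["breaking"] else "")
        for c in changes
    ]
    summary = f"{len(changes)} changes in {len(tables)} table(s) ({breaking} breaking)"
    blocks = [
        {"type": "header", "text": {"type": "plain_text", "text": title}},
        {"type": "section", "text": {"type": "mrkdwn", "text": summary}},
        {
            "type": "section",
            "fields": [
                {"type": "mrkdwn", "text": f"*Rule:*\n{rule_name}"},
                {"type": "mrkdwn", "text": f"*Severity:*\n{severity.upper()}"},
                {"type": "mrkdwn", "text": f"*Tables:*\n{', '.join(tables)}"},
            ],
        },
        {"type": "section", "text": {"type": "mrkdwn", "text": "*Changes:*\n" + "\n".join(lines)}},
        {
            "type": "actions",
            "elements": [
                {
                    "type": "button",
                    "text": {"type": "plain_text", "text": "View Details"},
                    "url": details_url,
                }
            ],
        },
    ]
    # Wrapping blocks in an attachment is what produces the colored side bar.
    return {"attachments": [{"color": SEVERITY_COLORS[severity], "blocks": blocks}]}


changes = [
    {"table": "orders", "change": "Column Removed", "column": "shipping_status", "breaking": True},
    {"table": "orders", "change": "Column Added", "column": "tracking_number", "breaking": False},
    {"table": "customers", "change": "Column Removed", "column": "legacy_id", "breaking": True},
]
message = build_schema_alert(
    "Production Schema Monitor",
    "critical",
    changes,
    "https://app.anomalyarmor.ai/alerts/alert_abc123",
)
print(json.dumps(message, indent=2))
```

Pasting the `blocks` array into Slack's Block Kit Builder is a quick way to preview how a layout like this renders.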
### Schema Change Alerts

Schema change alerts show affected tables, breaking change counts, and detailed change lists:

**Breaking schema changes detected in warehouse**

5 changes in 2 table(s) (2 breaking)

| Field | Value |
|-------|-------|
| Rule | Production Schema Monitor |
| Severity | CRITICAL |
| Tables | orders, customers |
| Breaking | 2 changes |

**Changes:**

- orders: Column Removed `shipping_status` (breaking)
- orders: Column Added `tracking_number`
- customers: Column Removed `legacy_id` (breaking)

[View Details]

### Freshness Violation Alerts

Freshness SLA violation alerts show overdue time, SLA thresholds, and last update times:

**Freshness SLA Violation**

Data is 4.5 hours overdue

| Field | Value |
|-------|-------|
| Rule | Sales Freshness |
| Severity | HIGH |
| Asset | warehouse.sales |
| Overdue | 4.5 hours (2.3 intervals) |
| SLA Threshold | 2.0 hours |
| Last Update | 2025-12-28 10:30 UTC |

[View Details]

### Message Components

| Component | Description |
|-----------|-------------|
| Color bar | Visual severity indicator (red for critical, orange for high, yellow for medium, blue for low) |
| Alert title | What happened with summary stats |
| Context fields | Rule name, severity, and event-specific details |
| Changes list | Detailed breakdown of changes (schema alerts) |
| View Details button | Link to view full alert details in AnomalyArmor |
| Footer | Alert ID and Rule ID for tracing |

## Multiple Slack Destinations

You can create multiple Slack destinations for different use cases:

| Destination | Channel | Use Case |
|-------------|---------|----------|
| Critical Alerts | #data-incidents | Breaking changes, SLA violations |
| Schema Changes | #data-changes | All schema drift notifications |
| Dev/Test Alerts | #data-dev | Non-production events |
| Team-Specific | #analytics-team | Alerts for specific assets |

### Routing Example

**Production Breaking Changes**

- Scope: production database
- Event: Schema change (column removed, table removed)
- Destinations: Slack #data-alerts, PagerDuty (for on-call)

**Dev Schema Changes**

- Scope: development database
- Event: Any schema change
- Destinations: Slack #data-dev

## Best Practices

### Channel Selection

### Use dedicated alert channels

Don't send alerts to busy team channels. Create dedicated channels like `#data-alerts` so alerts don't get lost in conversation.

### Separate by urgency

Route breaking changes to a channel with notifications enabled. Route informational alerts (like new tables) to a quieter channel.

### Consider channel membership

Make sure the right people are in the channel. Data engineers should be in `#data-alerts`, but not necessarily all of `#general`.

### Alert Volume

> ⚠️ Too many alerts leads to alert fatigue. If a channel gets 50+ alerts/day, people stop paying attention. Be selective about what triggers Slack notifications.

**Tips to reduce noise**:

- Start with critical alerts only
- Use email for low-priority events
- Set freshness SLAs with buffer (e.g., 2x expected update time, not too tight)
- Filter out expected changes (dev databases)

### Thread Usage

AnomalyArmor posts each alert as a new message. To keep channels clean:

1. Create a thread on the alert to discuss
2. Use reactions (:eyes: = investigating, :white_check_mark: = resolved)
3. @mention teammates who need to see it

## Troubleshooting

### "Channel not found"

**Cause**: The channel was deleted, renamed, or AnomalyArmor lost access.

**Fix**:

1. Verify the channel exists in Slack
2. For private channels, re-invite the AnomalyArmor bot
3. Edit the destination and reselect the channel

### "Unable to post message"

**Cause**: Permission issues or Slack API errors.

**Fix**:

1. Check that AnomalyArmor app is still installed in Slack
2. Try removing and re-adding the Slack destination
3. Verify your Slack workspace allows third-party apps

### Alerts not appearing

**Cause**: Rule configuration or routing issues.

**Fix**:

1. Check the alert rule is enabled
2. Verify the rule's scope matches the event
3. Confirm the Slack destination is selected for the rule
4. Use **Send Test Alert** to verify the connection

### OAuth errors during setup

**Cause**: Permission or network issues.

**Fix**:

1. Ensure you have admin permissions in Slack (or can request app installation)
2. Disable browser popup blockers temporarily
3. Try a different browser
4. Contact your Slack workspace admin if apps are restricted

## Security

### Data Sent to Slack

Alert messages contain:

- Asset names (table/column names)
- Change types (added, removed, modified)
- Timestamps

Alert messages **do not** contain:

- Actual data values
- Database credentials
- Connection strings
- Query results

### Revoking Access

To disconnect AnomalyArmor from Slack:

1. In Slack: **Settings** → **Manage apps** → **AnomalyArmor** → **Remove**
2. In AnomalyArmor: Delete the Slack destination

## Common Questions

### How do I route AnomalyArmor alerts to a private Slack channel?

Private channels don't appear in the selector until you invite the bot. Open the private channel in Slack, run `/invite @AnomalyArmor`, then refresh the channel list in AnomalyArmor. Public channels work without any invite.

### Can I send different alert types to different Slack channels?

Yes. Create a separate Slack destination for each channel (e.g., `#data-incidents`, `#data-changes`, `#data-dev`), then pick the appropriate destination on each alert rule. One rule can also fan out to multiple channels at once.

### Does AnomalyArmor's Slack app read my messages?

No. The app requests only the minimum scopes needed to post alerts: post to channels, view the channel list, and view workspace info. It cannot read messages, files, or private channel contents.

### What information is included in a Slack alert?

Schema alerts include the rule name, affected tables, a color-coded severity bar, and a detailed change list with breaking changes marked. Freshness alerts include overdue time, SLA threshold, and last update time. Every alert has a **View Details** link back to AnomalyArmor. No row-level data values are ever sent.

### Why are my alerts not appearing in Slack?

Check, in order: the rule is enabled, the rule's scope matches the event, the Slack destination is selected on the rule, and the channel still exists with the AnomalyArmor bot invited (for private channels). Use **Send Test Alert** on the destination to confirm the connection itself is healthy.

## Next Steps

- **[Alert Rules](/alerts/alert-rules)**: Create rules that route to Slack
- **[Best Practices](/alerts/best-practices)**: Reduce alert fatigue

---

# Email

> Send AnomalyArmor alerts to email addresses for team notifications

Send AnomalyArmor alerts to individual email addresses or distribution lists. Email is ideal for alerts that don't require immediate action or for team members who prefer email over chat.

## Why Email?

Email works well for certain alert scenarios:

- **Digest-friendly**: Batch multiple alerts into summaries
- **Universal**: Everyone has email, no app installation needed
- **Async**: Good for non-urgent notifications
- **Auditable**: Built-in record of all alerts received

## Prerequisites

Before you begin:

- AnomalyArmor account with alert configuration permissions
- Valid email addresses for recipients

## Setup Guide

### Step 1: Navigate to Destinations

1. Log in to AnomalyArmor
2. Click **Alerts** in the left sidebar
3. Select **Destinations** tab
4. Click **Add Destination**

### Step 2: Select Email

From the destination type list, click **Email**.
### Step 3: Configure Recipients

Enter the email addresses that should receive alerts:

| Field | Description |
|-------|-------------|
| **Destination Name** | A descriptive name (e.g., "Data Team Email") |
| **Recipients** | Email addresses, one per line or comma-separated |

> ℹ️ You can use distribution lists or group email addresses to reach multiple team members with a single destination.

### Step 4: Test the Connection

Click **Send Test Alert** to verify delivery:

```
Test alert sent to: team@example.com
Subject: [AnomalyArmor] Test Alert
```

Check the recipient inbox (and spam folder) to confirm the message arrived.

### Step 5: Save

Click **Save** to complete the setup. Your email destination is now ready to use in alert rules.

## Alert Email Format

AnomalyArmor emails include:

- **Subject**: Alert type and affected asset
- **Body**: Full alert details with links to AnomalyArmor
- **Unsubscribe link**: Manage notification preferences

## Best Practices

### Use Distribution Lists

Instead of individual emails, use a team distribution list:

- Easier to manage team changes
- Single destination covers entire team
- Reduces configuration maintenance

### Pair with Other Destinations

Email works well alongside real-time channels:

| Alert Type | Destinations |
|------------|--------------|
| Breaking changes | Slack + PagerDuty + Email |
| Schema changes | Slack + Email |
| Informational | Email only |

### Avoid Alert Fatigue

Email alerts can pile up quickly. Consider:

- Routing only important alerts to email
- Using Slack for high-volume informational events
- Setting freshness SLAs with buffer (e.g., 2x expected update time)

## Troubleshooting

### Emails going to spam

**Cause**: Email filters flagging AnomalyArmor messages.

**Fix**:

1. Add `alerts@anomalyarmor.ai` to your allowlist
2. Mark test emails as "Not Spam"
3. Contact your IT team to allowlist the domain

### Emails not arriving

**Cause**: Invalid email address or delivery issues.

**Fix**:

1. Verify the email address is correct
2. Check spam/junk folders
3. Try a different email address to test
4. Use **Send Test Alert** to verify the destination

## Common Questions

### Can I send AnomalyArmor alerts to a distribution list?

Yes. Enter any group or distribution list address in the recipients field, the same as an individual mailbox. This is the recommended pattern, since managing membership in your mail system is easier than editing AnomalyArmor destinations when the team changes.

### Why are AnomalyArmor emails going to spam?

Alerts come from `alerts@anomalyarmor.ai`. Add that address to your allowlist, mark any test email as "Not Spam", and ask your IT team to allowlist the domain if the issue persists. Corporate spam filters are the most common cause of missing alerts.

### Can I add multiple recipients to one email destination?

Yes. Enter addresses one per line or comma-separated in the recipients field. Every alert sent to that destination is delivered to all listed addresses.

## Next Steps

- **[Alert Rules](/alerts/alert-rules)**: Create rules that route to email
- **[Best Practices](/alerts/best-practices)**: Reduce alert fatigue

---

# Webhooks

> Send AnomalyArmor alerts to any HTTP endpoint for custom integrations

Send AnomalyArmor alerts to any HTTP endpoint via webhooks. This enables custom integrations with internal tools, monitoring systems, or any service that accepts HTTP requests.

## Why Webhooks?

Webhooks provide maximum flexibility:

- **Universal**: Integrate with any system that accepts HTTP
- **Customizable**: Build your own alert handling logic
- **Automation**: Trigger workflows, scripts, or pipelines
- **Integration**: Connect to tools without native integrations

## Use Cases

Common webhook integrations:

| System | Use Case |
|--------|----------|
| Custom dashboards | Display alerts on internal monitoring |
| Automation platforms | Trigger Zapier, n8n, or Make workflows |
| Ticketing systems | Create tickets in Jira, Linear, etc. |
| Data pipelines | Pause pipelines on schema changes |
| Logging systems | Send alerts to Datadog, Splunk, etc. |

## Prerequisites

Before you begin:

- An HTTP endpoint that can receive POST requests
- AnomalyArmor account with alert configuration permissions
- (Optional) Authentication credentials for your endpoint

## Setup Guide

### Step 1: Navigate to Destinations

1. Log in to AnomalyArmor
2. Click **Alerts** in the left sidebar
3. Select **Destinations** tab
4. Click **Add Destination**

### Step 2: Select Webhook

From the destination type list, click **Webhook**.

### Step 3: Configure the Webhook

| Field | Description |
|-------|-------------|
| **Destination Name** | A descriptive name (e.g., "Internal Alerting System") |
| **URL** | Your endpoint URL (must be HTTPS) |
| **Authentication** | Optional: Bearer token or API key |
| **Headers** | Optional: Custom HTTP headers |

### Step 4: Test the Connection

Click **Send Test Alert** to verify your endpoint receives the payload:

```
POST https://your-endpoint.com/alerts
Status: 200 OK
Response time: 142ms
```

Your endpoint should respond with a 2xx status code.

### Step 5: Save

Click **Save** to complete the setup. Your webhook destination is now ready to use in alert rules.

## Payload Format

AnomalyArmor sends a JSON payload for each alert:

```json
{
  "event_type": "alert.triggered",
  "alert": {
    "id": "alert_abc123",
    "rule_name": "Production Schema Changes",
    "alert_type": "schema",
    "status": "triggered",
    "created_at": "2024-01-15T08:00:00Z"
  },
  "event": {
    "type": "schema_change",
    "change_type": "column_removed",
    "asset": {
      "qualified_name": "production.public.orders",
      "asset_type": "table"
    },
    "details": {
      "column_name": "shipping_status",
      "column_type": "varchar"
    }
  },
  "links": {
    "alert_url": "https://app.anomalyarmor.ai/alerts/alert_abc123",
    "asset_url": "https://app.anomalyarmor.ai/assets/..."
} } ``` ### Event Types | Event Type | Description | |------------|-------------| | `schema_change` | Column, table, or constraint changes | | `freshness_violation` | Table exceeded freshness SLA | | `discovery_complete` | Discovery job finished | | `connection_failed` | Database connection issue | ## Authentication ### Bearer Token Add an `Authorization` header with your token: ``` Authorization: Bearer your-secret-token ``` ### API Key Header Add a custom header with your API key: ``` X-API-Key: your-api-key ``` ### HMAC Signature AnomalyArmor includes an `X-AnomalyArmor-Signature` header with each request. Verify this signature to ensure the request came from AnomalyArmor: ```python import hmac import hashlib def verify_signature(payload, signature, secret): expected = hmac.new( secret.encode(), payload.encode(), hashlib.sha256 ).hexdigest() return hmac.compare_digest(f"sha256={expected}", signature) ``` ## Retry Logic AnomalyArmor retries failed webhook deliveries: | Attempt | Delay | |---------|-------| | 1 | Immediate | | 2 | 1 minute | | 3 | 5 minutes | | 4 | 15 minutes | | 5 | 1 hour | After 5 failed attempts, the alert is marked as failed delivery. ## Best Practices ### Endpoint Design - Respond quickly (< 5 seconds) to avoid timeouts - Return 200 for successful receipt, even if processing is async - Use a queue for heavy processing - Log incoming payloads for debugging ### Security - Always use HTTPS endpoints - Verify the `X-AnomalyArmor-Signature` header - Rotate authentication tokens periodically - Allowlist AnomalyArmor IP addresses if needed ### Error Handling Your endpoint should handle: - Duplicate alerts (use `alert.id` for deduplication) - Unknown event types (ignore gracefully) - Missing fields (use defaults) ## Troubleshooting ### "Connection refused" or timeout **Cause**: Endpoint unreachable or slow. **Fix**: 1. Verify the URL is correct and accessible 2. Check firewall rules allow AnomalyArmor IPs 3. 
Ensure endpoint responds within 30 seconds ### "401 Unauthorized" **Cause**: Authentication failed. **Fix**: 1. Verify the token/API key is correct 2. Check the header name matches your endpoint's expectation 3. Ensure credentials haven't expired ### "400 Bad Request" **Cause**: Endpoint rejected the payload. **Fix**: 1. Check endpoint logs for specific error 2. Verify endpoint accepts `application/json` 3. Test with a simple endpoint first ## Common Questions ### How do I verify a webhook actually came from AnomalyArmor? Every request includes an `X-AnomalyArmor-Signature` header in the form `sha256=<hexdigest>`. Compute HMAC-SHA256 of the raw request body using your webhook secret, prefix the hex digest with `sha256=`, and compare the result to the header with `hmac.compare_digest` to reject spoofed deliveries. A Python example is in the [Authentication](#authentication) section. ### What happens if my webhook endpoint is down? AnomalyArmor retries failed deliveries five times with backoff: immediate, 1 minute, 5 minutes, 15 minutes, and 1 hour. After five failures the alert is marked as failed delivery. Build your endpoint to respond in under 5 seconds and queue heavy work asynchronously. ### What does the webhook JSON payload look like? Each request is a POST with a JSON body containing `event_type`, an `alert` object (id, rule_name, status, timestamps), an `event` object describing the change and affected asset, and `links` back to AnomalyArmor. Use `alert.id` to deduplicate on your side. See the [Payload Format](#payload-format) example above for the full shape. ### Can I use a self-signed or HTTP endpoint? No. Webhook URLs must be HTTPS with a valid certificate. This protects the authentication header and the asset metadata in transit, and it's a requirement for every webhook destination. 
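The signature check, deduplication, and unknown-event handling described in the webhook sections above can be combined into one receiving function. The following is a minimal sketch, not AnomalyArmor's reference implementation: `handle_delivery`, the in-memory `_seen_alert_ids` set, and the returned status notes are hypothetical names chosen for illustration, and it assumes the signature is computed over the raw request bytes as the documentation states.

```python
import hashlib
import hmac
import json
from typing import Tuple

# Event types listed in the Event Types table; anything else is ignored gracefully.
KNOWN_EVENT_TYPES = {
    "schema_change",
    "freshness_violation",
    "discovery_complete",
    "connection_failed",
}

# Hypothetical in-memory dedup store; a real service would use a shared store.
_seen_alert_ids = set()


def verify_signature(raw_body: bytes, signature: str, secret: str) -> bool:
    """Recompute HMAC-SHA256 over the raw body and compare in constant time."""
    digest = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(f"sha256={digest}".encode(), signature.encode())


def handle_delivery(raw_body: bytes, signature: str, secret: str) -> Tuple[int, str]:
    """Return an (HTTP status, note) pair for one webhook delivery."""
    if not verify_signature(raw_body, signature, secret):
        return 401, "bad signature"

    payload = json.loads(raw_body)
    alert_id = payload.get("alert", {}).get("id")
    event_type = payload.get("event", {}).get("type", "unknown")

    # Retries can deliver the same alert more than once; acknowledge and skip.
    if alert_id in _seen_alert_ids:
        return 200, "duplicate ignored"
    if alert_id is not None:
        _seen_alert_ids.add(alert_id)

    if event_type not in KNOWN_EVENT_TYPES:
        return 200, "unknown event type ignored"

    # Heavy processing would be queued here so the response stays fast.
    return 200, f"accepted {event_type}"
```

Returning 200 for duplicates and unknown event types keeps AnomalyArmor from retrying deliveries that were received correctly but need no action.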
## Next Steps - **[Alert Rules](/alerts/alert-rules)**: Create rules that route to webhooks - **[API Reference](/api/overview)**: Build integrations with the API --- # PagerDuty > Send AnomalyArmor alerts to PagerDuty for incident management Send AnomalyArmor alerts to PagerDuty to trigger incidents and page on-call engineers. Critical schema changes or freshness violations can automatically create PagerDuty incidents for immediate response. ## Why PagerDuty? PagerDuty is ideal for alerts that require immediate human intervention: - **On-call routing**: Automatically page the right person based on schedules - **Escalation**: Ensure critical alerts don't go unnoticed - **Incident management**: Track response and resolution - **Deduplication**: Related alerts are grouped together ## Prerequisites Before you begin: - PagerDuty account with admin access (or permission to create integrations) - AnomalyArmor account with alert configuration permissions - A PagerDuty service configured for your data infrastructure ## Setup Guide ### Step 1: Create a PagerDuty Integration In PagerDuty: 1. Go to **Services** and select (or create) a service for data alerts 2. Click **Integrations** tab 3. Click **Add Integration** 4. Search for **Events API v2** and select it 5. Click **Add** 6. Copy the **Integration Key** (also called Routing Key) > ℹ️ The Integration Key is a 32-character string that looks like: `a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6` ### Step 2: Add Destination in AnomalyArmor 1. Log in to AnomalyArmor 2. Click **Alerts** in the left sidebar 3. Select **Destinations** tab 4. Click **Add Destination** 5. Select **PagerDuty** ### Step 3: Configure the Destination Enter the following: | Field | Description | |-------|-------------| | **Name** | A descriptive name (e.g., "PagerDuty - Data Platform") | | **Integration Key** | The key copied from PagerDuty | ### Step 4: Test the Connection Click **Test** to send a test event to PagerDuty. ``` Success! 
Test event sent to PagerDuty Dedup Key: anomalyarmor-test-12345 ``` Check PagerDuty to confirm the incident was created. > ⚠️ Remember to resolve the test incident in PagerDuty to avoid confusion. ### Step 5: Save Click **Create Destination** to complete the setup. ## Alert Format AnomalyArmor sends events using the PagerDuty Events API v2 format: | Field | Value | |-------|-------| | **Summary** | Alert title with context | | **Source** | `anomalyarmor` | | **Severity** | `critical` (all PagerDuty alerts are high priority) | | **Dedup Key** | Unique identifier to group related alerts | > ℹ️ Since PagerDuty is reserved for critical alerts, all events sent to PagerDuty use the `critical` severity level. Use Slack or email for lower-priority notifications. ### Custom Details Each PagerDuty incident includes: - Rule name that triggered the alert - Event type (schema_change, freshness_violation, etc.) - Alert description - Asset ID - Link to view in AnomalyArmor ## Best Practices ### Use PagerDuty for Critical Alerts Only > ⚠️ Don't route all alerts to PagerDuty. Reserve it for events that require immediate action - typically production schema changes that could break pipelines. **Good use cases**: - Production column removed - Critical table freshness SLA violated - Breaking schema changes in production **Better handled elsewhere**: - Development database changes (use Slack) - Informational schema additions (use email) - Routine freshness warnings (use Slack digest) ### Set Up Proper Escalation Configure your PagerDuty service with: 1. **Acknowledgement timeout**: Auto-escalate if not acknowledged (e.g., 5 minutes) 2. **Escalation policy**: Include backup responders 3. 
**On-call schedule**: Ensure 24/7 coverage for critical services ### Combine with Other Destinations Create alert rules that send to multiple destinations: **Production Breaking Changes** - Event: Schema Change - Scope: production databases - Conditions: Column removed OR type changed - Destinations: PagerDuty (immediate paging), Slack #data-incidents (team visibility), Email data-eng-list@company.com (record) ## Troubleshooting ### "Invalid routing key" **Cause**: The integration key is incorrect or the integration was deleted in PagerDuty. **Fix**: 1. Go to your PagerDuty service 2. Check the Events API v2 integration still exists 3. Copy a fresh integration key 4. Update the destination in AnomalyArmor ### Incidents not appearing **Cause**: Service is disabled or in maintenance mode. **Fix**: 1. Check the PagerDuty service is enabled 2. Verify no maintenance window is active 3. Check the Events API v2 integration is active 4. Use the Test button to verify connectivity ### Duplicate incidents **Cause**: Multiple alert rules triggering for the same event. **Fix**: AnomalyArmor includes a deduplication key with each event. PagerDuty will group alerts with the same dedup key into a single incident. If you're seeing duplicates: 1. Review your alert rules for overlap 2. Consider combining rules or adjusting scopes ### Rate limiting **Cause**: Too many events sent in a short period. **Fix**: 1. Review alert rule thresholds 2. Consider using alert grouping/digest for high-volume events 3. Route lower-priority alerts to Slack or email instead ## Security ### Data Sent to PagerDuty Alert events contain: - Asset names (database, schema, table names) - Change types and descriptions - Timestamps - Rule information Alert events **do not** contain: - Actual data values - Database credentials - Connection strings - Query results ### Revoking Access To disconnect AnomalyArmor from PagerDuty: 1. In AnomalyArmor: Delete the PagerDuty destination 2. 
In PagerDuty: Remove the Events API v2 integration from the service ## Common Questions ### Which PagerDuty integration type should I use? Use **Events API v2** on the target service. In PagerDuty, open the service, go to **Integrations**, click **Add Integration**, select Events API v2, and copy the 32-character Integration Key into AnomalyArmor. Older Events API v1 integrations are not supported. ### Can I set non-critical severity for PagerDuty alerts? No. Every event AnomalyArmor sends to PagerDuty uses `critical` severity by design, since PagerDuty is reserved for events that require immediate human response. Route lower-priority alerts to [Slack](/alerts/destinations/slack) or [email](/alerts/destinations/email) instead. ### Will PagerDuty deduplicate repeated alerts from AnomalyArmor? Yes. Each event ships with a deduplication key derived from the rule and affected asset, so PagerDuty groups repeated firings into a single incident rather than paging on-call again. If you still see duplicates, check whether multiple overlapping rules are firing for the same event. ### What happens if PagerDuty isn't acknowledged? That's controlled by the PagerDuty service's escalation policy, not by AnomalyArmor. Set an acknowledgement timeout (for example 5 minutes) and configure backup responders in PagerDuty so unacknowledged pages escalate automatically. ## Next Steps - **[Alert Rules](/alerts/alert-rules)**: Create rules that route to PagerDuty - **[Best Practices](/alerts/best-practices)**: Reduce alert fatigue and page only when necessary --- # Microsoft Teams > Send AnomalyArmor alerts to Microsoft Teams channels Send AnomalyArmor alerts directly to your Microsoft Teams channels. Get notified about schema changes, freshness violations, and other data events where your team collaborates. ## Why Microsoft Teams? 
Teams is ideal for organizations using Microsoft 365: - **Real-time**: Alerts appear instantly in channels - **Contextual**: Team can discuss and coordinate in threads - **Actionable**: Click through to AnomalyArmor for details - **Integrated**: Works with your existing Microsoft ecosystem ## Prerequisites Before you begin: - Microsoft Teams account with permission to add connectors (or admin who can approve) - AnomalyArmor account with alert configuration permissions - A Teams channel where you want to receive alerts ## Setup Guide ### Step 1: Create an Incoming Webhook in Teams 1. Open Microsoft Teams 2. Navigate to the channel where you want alerts 3. Click the **...** menu next to the channel name 4. Select **Connectors** (or **Manage channel** → **Connectors**) 5. Find **Incoming Webhook** and click **Configure** 6. Give it a name: "AnomalyArmor Alerts" 7. Optionally upload the AnomalyArmor logo 8. Click **Create** 9. **Copy the webhook URL** - you'll need this in AnomalyArmor > ⚠️ Keep the webhook URL secret. Anyone with this URL can post messages to your channel. ### Step 2: Add Destination in AnomalyArmor 1. Log in to AnomalyArmor 2. Click **Alerts** in the left sidebar 3. Select **Destinations** tab 4. Click **Add Destination** 5. Select **Microsoft Teams** ### Step 3: Configure the Destination | Field | Description | |-------|-------------| | **Destination Name** | A descriptive name (e.g., "Teams - #data-alerts") | | **Webhook URL** | Paste the URL copied from Teams | ### Step 4: Test the Connection Click **Send Test Alert** to verify everything works: ``` Test alert sent successfully Channel: Data Alerts Message: "Test alert from AnomalyArmor" ``` Check Teams to confirm the message arrived in your channel. ### Step 5: Save Click **Save** to complete the setup. Your Teams destination is now ready to use in alert rules. 
## Alert Message Format AnomalyArmor sends Adaptive Cards to Teams with: - Alert type indicator (schema, freshness, discovery) - Affected asset details - Change description - Timestamp - Action button to view in AnomalyArmor ## Best Practices ### Channel Selection - Create dedicated alert channels (e.g., `Data Alerts`) - Don't send to busy general channels - Separate by urgency: breaking changes vs informational ### Webhook Management - Rotate webhook URLs periodically for security - Document which webhooks are used where - Delete unused webhooks from Teams ## Troubleshooting ### "Webhook URL invalid" **Cause**: The webhook URL is malformed or expired. **Fix**: 1. Regenerate the webhook in Teams 2. Copy the new URL carefully (it's long) 3. Update the destination in AnomalyArmor ### Messages not appearing **Cause**: Webhook deleted or channel permissions changed. **Fix**: 1. Verify the webhook still exists in Teams channel settings 2. Recreate the webhook if needed 3. Update AnomalyArmor with the new URL ### Rate limiting **Cause**: Too many alerts in a short period. **Fix**: 1. Review alert rules to reduce volume 2. Consider email for high-volume, low-priority alerts 3. Teams webhooks have rate limits (~4 messages/second) ## Security ### Data Sent to Teams Alert messages contain: - Asset names (table/column names) - Change types (added, removed, modified) - Timestamps Alert messages **do not** contain: - Actual data values - Database credentials - Connection strings ### Revoking Access To disconnect: 1. In Teams: Remove the Incoming Webhook connector from the channel 2. In AnomalyArmor: Delete the Teams destination ## Common Questions ### How do I connect AnomalyArmor to a Microsoft Teams channel? In the target Teams channel, open **Connectors**, configure an **Incoming Webhook** named "AnomalyArmor Alerts", and copy the webhook URL. Paste that URL into a new Microsoft Teams destination in AnomalyArmor and send a test alert to confirm delivery. 
### Is the Teams webhook URL sensitive? Yes. Anyone with the URL can post to the channel, so treat it like a credential. Rotate it periodically, avoid checking it into source control, and remove unused webhooks from the Teams channel settings. ### What's the rate limit for Teams alerts? Microsoft Teams incoming webhooks accept roughly 4 messages per second. High-volume alert rules can trip that limit, so keep the Teams destination for schema changes and critical events. Route verbose informational events to email or a digest instead. ## Next Steps - **[Alert Rules](/alerts/alert-rules)**: Create rules that route to Teams - **[Best Practices](/alerts/best-practices)**: Reduce alert fatigue --- # Linear > Create Linear issues automatically from AnomalyArmor alerts Create Linear issues automatically when AnomalyArmor detects schema changes or data quality issues. This enables your team to track and resolve data infrastructure problems using your existing project management workflow. ## Why Linear? Linear is ideal for teams that want to track data issues alongside their engineering work: - **Automatic issue creation**: Schema changes become trackable issues - **Team assignment**: Route issues to the right team automatically - **Priority management**: Set priority based on alert severity - **Integration with workflows**: Issues flow through your existing Linear workflow ## Prerequisites Before you begin: - Linear account with API access - Permission to create API keys in Linear - A Linear team where issues will be created ## Setup Guide ### Step 1: Create a Linear API Key In Linear: 1. Click your avatar in the bottom-left corner 2. Select **Settings** 3. Go to **API** in the left sidebar 4. Click **Create new API key** 5. Give it a descriptive name (e.g., "AnomalyArmor Integration") 6. Copy the API key > ⚠️ Store this API key securely. It won't be shown again after you close the dialog. ### Step 2: Find Your Team ID In Linear: 1. 
Go to the team where you want issues created 2. Look at the URL: `https://linear.app/your-workspace/team/ABC/...` 3. The Team ID is the short code after `/team/` (e.g., `ABC` or `DATA`) Alternatively, use the Linear API or GraphQL explorer to list your teams. ### Step 3: Add Destination in AnomalyArmor 1. Log in to AnomalyArmor 2. Click **Alerts** in the left sidebar 3. Select **Destinations** tab 4. Click **Add Destination** 5. Select **Linear** ### Step 4: Configure the Destination Enter the following: | Field | Description | Required | |-------|-------------|----------| | **Name** | A descriptive name (e.g., "Linear - Data Team") | Yes | | **API Key** | The API key from Step 1 | Yes | | **Team ID** | The team identifier from Step 2 | Yes | | **Project ID** | Target project for issues | No | | **Label IDs** | Comma-separated label IDs to auto-apply | No | | **Priority** | Default issue priority (0-4) | No | ### Step 5: Test the Connection Click **Test** to create a test issue in Linear. ``` Success! Test issue created in Linear Issue: DATA-123 ``` Check Linear to confirm the issue was created in the correct team. > ℹ️ Remember to close or delete the test issue to keep your backlog clean. ### Step 6: Save Click **Create Destination** to complete the setup. 
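Step 2 mentions listing teams through the Linear API as an alternative to reading the URL. The sketch below builds such a request without sending it. It rests on assumptions to verify against Linear's own API documentation: that the GraphQL endpoint is `https://api.linear.app/graphql`, that a personal API key is passed directly in the `Authorization` header, and that a `teams` query exposes `id`, `key`, and `name`. The function name `build_list_teams_request` is hypothetical.

```python
import json
import urllib.request

# Assumed endpoint for Linear's GraphQL API; confirm in Linear's documentation.
LINEAR_GRAPHQL_URL = "https://api.linear.app/graphql"

# Assumed query shape: `key` is expected to be the short code seen in team URLs.
LIST_TEAMS_QUERY = "query { teams { nodes { id key name } } }"


def build_list_teams_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that asks Linear for its teams."""
    body = json.dumps({"query": LIST_TEAMS_QUERY}).encode("utf-8")
    return urllib.request.Request(
        LINEAR_GRAPHQL_URL,
        data=body,
        headers={
            "Authorization": api_key,  # assumption: personal keys are sent as-is
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To run it, pass the result to `urllib.request.urlopen` and parse the JSON response; the short code to enter as **Team ID** should match one of the returned `key` values if the assumptions above hold.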
## Issue Format AnomalyArmor creates Linear issues with: | Field | Value | |-------|-------| | **Title** | Alert title with context | | **Description** | Full alert details including asset and change info | | **Team** | Your configured team | | **Priority** | Based on alert type or your default setting | | **Labels** | Any labels you configured | ### Priority Mapping Linear uses a 0-4 priority scale: | Priority | Meaning | |----------|---------| | 0 | No priority | | 1 | Urgent | | 2 | High | | 3 | Medium | | 4 | Low | ## Best Practices ### Create Focused Alert Rules Create specific alert rules for Linear to avoid issue overload: **Production Schema Issues** - Event: Schema Change - Scope: production databases only - Conditions: Column removed OR type changed - Destination: Linear - Data Team ### Use Labels for Categorization Configure label IDs to automatically categorize issues: - `data-quality` - for freshness violations - `schema-change` - for schema drift alerts - `urgent` - for critical production issues ### Combine with Real-Time Notifications Use Linear for tracking alongside real-time notifications: **Critical Data Issues** - Destinations: Slack (immediate notification), Linear (issue tracking) ## Troubleshooting ### "Invalid API key" **Cause**: The API key is incorrect or revoked. **Fix**: 1. Generate a new API key in Linear settings 2. Update the destination in AnomalyArmor ### "Team not found" **Cause**: Invalid team ID or API key lacks access to the team. **Fix**: 1. Verify the team ID in the Linear URL 2. Ensure the API key has access to the workspace containing the team ### Issues not appearing **Cause**: API key permissions or team configuration. **Fix**: 1. Verify the API key has write access 2. Check the team exists and is active 3. 
Use the Test button to verify connectivity ## Security ### Data Sent to Linear Issue content includes: - Asset names (database, schema, table names) - Change types and descriptions - Timestamps - Alert rule information Issue content **does not** include: - Actual data values - Database credentials - Connection strings ### Revoking Access To disconnect AnomalyArmor from Linear: 1. In AnomalyArmor: Delete the Linear destination 2. In Linear: Revoke the API key in Settings > API ## Common Questions ### Where do I find my Linear Team ID? Open your team in Linear and look at the URL: `https://linear.app/your-workspace/team/ABC/...`. The short code after `/team/` (for example `ABC` or `DATA`) is the Team ID. You can also list teams via Linear's GraphQL API. ### Can I auto-apply labels and priority to Linear issues from AnomalyArmor? Yes. When configuring the destination, add comma-separated **Label IDs** to auto-tag every created issue and set a default **Priority** (0-4, where 1 is Urgent and 4 is Low). Labels like `data-quality` or `schema-change` make it easy to filter Linear by alert type. ### Should I use Linear as my only destination for production alerts? Not on its own. Linear is good for tracking but not real-time notification. Pair it with Slack or PagerDuty so the team sees the alert immediately, then uses Linear to track resolution. ## Next Steps - **[Alert Rules](/alerts/alert-rules)**: Create rules that route to Linear - **[Slack](/alerts/destinations/slack)**: Add real-time notifications alongside Linear tracking --- # incident.io > Send AnomalyArmor alerts to incident.io for incident management Send AnomalyArmor alerts to incident.io to automatically create incidents when schema drift or data quality issues are detected. Critical changes can trigger incidents for immediate response from your on-call team. ## Why incident.io? 
incident.io is ideal for teams that want structured incident management: - **Incident lifecycle**: Track incidents from detection to resolution - **Automatic creation**: Alerts create incidents automatically - **Slack integration**: Automatically create incident channels - **Post-mortems**: Built-in tooling for learning from incidents ## Prerequisites Before you begin: - incident.io account with API access - AnomalyArmor account with alert configuration permissions - Permission to create API keys in incident.io ## Setup Guide ### Step 1: Create an incident.io API Key In incident.io: 1. Go to **Settings** → **API Keys** 2. Click **Create API key** 3. Give it a descriptive name (e.g., "AnomalyArmor Integration") 4. Ensure it has the `incident.write` permission 5. Click **Create** 6. Copy the API key (it won't be shown again) > ℹ️ API keys start with `inc_` followed by a long alphanumeric string. ### Step 2: Get Your Closed Status ID (Optional) To enable automatic incident closing when alerts are resolved in AnomalyArmor: 1. Go to **Settings** → **Incident Lifecycle** in incident.io 2. Click on a status with category "Closed" (e.g., "Closed", "Resolved") 3. Copy the UUID from the URL (e.g., `01FCNDV6P870EA6S7TK1DSYDG0`) > ℹ️ When configured, resolving or dismissing an alert in AnomalyArmor will automatically close the corresponding incident in incident.io. ### Step 3: Add Destination in AnomalyArmor 1. Log in to AnomalyArmor 2. Click **Alerts** in the left sidebar 3. Select **Destinations** tab 4. Click **Add Destination** 5. 
Select **incident.io** ### Step 4: Configure the Destination Enter the following: | Field | Description | |-------|-------------| | **Name** | A descriptive name (e.g., "incident.io - Data Platform") | | **API Key** | The API key copied from incident.io | | **Closed Status ID** | (Optional) UUID of the status to set when closing incidents automatically | ### Step 5: Test the Connection Click **Test** to create a test incident in incident.io. ``` Success! Incident created: INC-123 ``` Check incident.io to confirm the incident was created. > ⚠️ Remember to resolve the test incident in incident.io to keep your incident list clean. ### Step 6: Save Click **Create Destination** to complete the setup. ## Alert Format AnomalyArmor creates incidents using the incident.io API v2: | Field | Value | |-------|-------| | **Name** | `[AnomalyArmor] Alert title` | | **Summary** | Alert description with AnomalyArmor context | | **Mode** | `standard` (not retrospective) | | **Visibility** | `public` | ### Incident Summary Each incident includes: - Alert description - Source: AnomalyArmor - Rule name that triggered the alert - Event type (schema_change, freshness_violation, etc.) - Asset ID ### Idempotency AnomalyArmor includes an idempotency key with each incident request. This prevents duplicate incidents if the same alert is processed multiple times. ### Incident Lifecycle AnomalyArmor supports full incident lifecycle management: | AnomalyArmor Action | incident.io Effect | |---------------------|-------------------| | Alert triggered | New incident created | | Alert resolved | Incident status updated to configured closed status | | Alert dismissed | Incident status updated to configured closed status | > ℹ️ Automatic incident closing requires the **Closed Status ID** to be configured. If not set, incidents must be closed manually in incident.io. ## Best Practices ### Reserve for Critical Alerts > ⚠️ Don't route all alerts to incident.io. 
Reserve it for events that require coordinated response - typically production schema changes that could break pipelines or critical freshness SLA violations. **Good use cases**: - Production column removed or renamed - Critical table freshness SLA violated - Breaking schema changes in production databases **Better handled elsewhere**: - Development database changes (use Slack) - Informational schema additions (use email) - Routine freshness warnings (use Slack digest) ### Combine with Other Destinations Create alert rules that send to multiple destinations: **Production Breaking Changes** - Event: Schema Change - Scope: production databases - Conditions: Column removed OR type changed - Destinations: incident.io (incident creation), Slack #data-incidents (team visibility), Email data-eng-list@company.com (record) ## Troubleshooting ### "No API key configured" **Cause**: The API key field is empty. **Fix**: 1. Edit the destination in AnomalyArmor 2. Enter your incident.io API key 3. Save the destination ### "HTTP 401: Unauthorized" **Cause**: The API key is invalid or expired. **Fix**: 1. Go to incident.io Settings → API Keys 2. Verify the key exists and hasn't been revoked 3. Create a new API key if needed 4. Update the destination in AnomalyArmor ### "HTTP 403: Forbidden" **Cause**: The API key lacks required permissions. **Fix**: 1. Go to incident.io Settings → API Keys 2. Verify the key has `incident.write` permission 3. Create a new key with correct permissions if needed ### Incidents not appearing **Cause**: Rate limiting or API issues. **Fix**: 1. Check incident.io status page for outages 2. Use the Test button to verify connectivity 3. Check the AnomalyArmor alert history for delivery errors > 💡 Set incident severity manually in incident.io based on your organization's criteria, or use incident.io's built-in rules to auto-assign severity. 
## Security ### Data Sent to incident.io Incident data contains: - Asset names (database, schema, table names) - Change types and descriptions - Timestamps - Rule information Incident data **does not** contain: - Actual data values - Database credentials - Connection strings - Query results ### Revoking Access To disconnect AnomalyArmor from incident.io: 1. In AnomalyArmor: Delete the incident.io destination 2. In incident.io: Revoke the API key in Settings → API Keys ## Common Questions ### What permissions does the incident.io API key need? The key must have the `incident.write` permission to create incidents. If you want automatic incident closing when alerts resolve, the same key handles status updates. API keys start with `inc_` and are created under **Settings > API Keys** in incident.io. ### Will resolving an AnomalyArmor alert close the incident.io incident? Only if you configure the **Closed Status ID** on the destination. Without it, incidents stay open in incident.io even after the alert is resolved or dismissed in AnomalyArmor. Find the status UUID under **Settings > Incident Lifecycle** in incident.io. ### Does AnomalyArmor create duplicate incidents if the same alert fires twice? No. Every request includes an idempotency key, so retries and repeated processing of the same alert collapse into a single incident in incident.io. ### Should I route every alert to incident.io? No. Reserve it for events that justify a coordinated response such as production breaking schema changes or critical freshness SLA violations. Route dev changes, informational additions, and routine freshness warnings to Slack or email instead. See [Best Practices](/alerts/best-practices). 
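The lifecycle behaviour described above (create on trigger, close on resolve or dismiss only when a **Closed Status ID** is configured) can be summarised as a small decision function. This is an illustrative sketch of the documented rules, not AnomalyArmor's internal code; `incident_effect` and the returned strings are hypothetical.

```python
from typing import Optional


def incident_effect(action: str, closed_status_id: Optional[str]) -> str:
    """Describe what happens in incident.io for a given AnomalyArmor alert action."""
    if action == "triggered":
        return "create incident"
    if action in ("resolved", "dismissed"):
        if closed_status_id:
            # With a Closed Status ID configured, the incident is closed automatically.
            return f"set incident status to {closed_status_id}"
        # Without it, the incident stays open until someone closes it by hand.
        return "no change (close manually in incident.io)"
    raise ValueError(f"unknown alert action: {action}")
```

For example, `incident_effect("resolved", None)` reports that the incident must be closed manually, which matches the behaviour when the optional field is left blank.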
## Next Steps

- **[Alert Rules](/alerts/alert-rules)**: Create rules that route to incident.io
- **[Best Practices](/alerts/best-practices)**: Reduce alert fatigue and create incidents only when necessary

---

# Operating Schedules

> Control when alert rules are active with time-based schedules

Operating schedules let you define business hours or monitoring windows so alerts only fire when your team is available to respond. Instead of waking up on-call engineers at 3 AM for non-critical issues, schedules suppress alerts outside defined time windows.

*A typical business-hours operating schedule*

## How Schedules Work

Schedules sit between rule evaluation and alert delivery:

1. An event matches an alert rule
2. AnomalyArmor checks whether the current time falls within the rule's operating schedule
3. If inside the schedule window, the alert is delivered normally
4. If outside the schedule window, the alert is suppressed with reason "outside operating hours"

> ℹ️ Suppressed alerts still appear in **Alerts > History** for auditing. No data is lost. You can always review what happened outside your schedule windows.

## Creating a Schedule

## Schedule Rules

Each schedule contains one or more rules that define active time windows. Rules are combined with OR logic, so an alert fires if the current time falls within any of the schedule's rules.
### Rule Structure

Each rule defines three things:

| Field | Format | Example |
|-------|--------|---------|
| **Days** | 1 = Monday through 7 = Sunday | 1-5 (weekdays) |
| **Start time** | HH:MM (24-hour) | 09:00 |
| **End time** | HH:MM (24-hour) | 18:00 |

### Combining Rules

Use multiple rules to cover different day/time combinations:

| Rule | Days | Start | End | Coverage |
|------|------|-------|-----|----------|
| 1 | Mon-Fri (1-5) | 09:00 | 18:00 | Weekday business hours |
| 2 | Sat (6) | 10:00 | 14:00 | Saturday morning shift |

This schedule activates alerts Monday through Friday from 9 AM to 6 PM, plus Saturday from 10 AM to 2 PM.

### Overnight Schedules

If the start time is later than the end time, the window wraps to the next day:

| Rule | Days | Start | End | Coverage |
|------|------|-------|-----|----------|
| 1 | Mon-Thu (1-4) | 22:00 | 06:00 | Overnight batch window |

This covers 10 PM to 6 AM the following morning, Monday night through Friday morning.

## Assigning to Rules

Schedules take effect when assigned to an alert rule.

### During Rule Creation

When creating a new alert rule, select a schedule from the **Operating Schedule** dropdown in the rule builder. Leave it blank to keep the rule always active.

### On an Existing Rule

1. Go to **Alerts > Rules**
2. Click the rule name to edit
3. Select or change the **Operating Schedule**
4. Click **Save**

> ℹ️ Removing a schedule from a rule makes it always active. Alerts are never lost. They are suppressed and logged when outside the schedule window.
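The rule structure, OR combination, and overnight wrapping described under Schedule Rules can be sketched as follows. This is an illustrative model, not AnomalyArmor's code: `ScheduleRule`, `rule_is_active`, and `schedule_is_active` are hypothetical names, the timestamp is assumed to be already converted to the schedule's timezone, and the end time is assumed to be exclusive.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class ScheduleRule:
    days: frozenset  # ISO weekdays: 1 = Monday through 7 = Sunday
    start: time
    end: time


def rule_is_active(rule: ScheduleRule, local_now: datetime) -> bool:
    """Return True if local_now falls inside the rule's window."""
    day = local_now.isoweekday()
    now = local_now.time()
    if rule.start <= rule.end:
        # Same-day window, e.g. 09:00 to 18:00 (end treated as exclusive).
        return day in rule.days and rule.start <= now < rule.end
    # Overnight window, e.g. 22:00 to 06:00: the listed day is the start day.
    if now >= rule.start:
        return day in rule.days
    if now < rule.end:
        previous_day = 7 if day == 1 else day - 1
        return previous_day in rule.days
    return False


def schedule_is_active(rules, local_now: datetime) -> bool:
    """Rules combine with OR logic: any matching rule activates the schedule."""
    return any(rule_is_active(rule, local_now) for rule in rules)


business_hours = [
    ScheduleRule(frozenset({1, 2, 3, 4, 5}), time(9, 0), time(18, 0)),
    ScheduleRule(frozenset({6}), time(10, 0), time(14, 0)),
]
overnight = [ScheduleRule(frozenset({1, 2, 3, 4}), time(22, 0), time(6, 0))]
```

In this model, Friday at 03:00 is still inside the Mon-Thu overnight rule because the window started on Thursday night, while Friday at 23:00 is outside it.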
### Constraints

- A rule can only have one schedule at a time
- Rules without a schedule are always active
- Multiple rules can share the same schedule
- Changing a schedule affects all rules that use it

## Timezone Handling

Schedules evaluate against the current time in the configured timezone:

- All times use IANA timezone identifiers (`America/New_York`, `Europe/London`, `Asia/Tokyo`)
- Daylight Saving Time transitions are handled automatically
- When clocks spring forward, the schedule adjusts accordingly. A 2 AM to 3 AM window would be skipped on the transition day
- When clocks fall back, the schedule covers the repeated hour

> ⚠️ Avoid using UTC offsets like `UTC-5` for schedules in regions that observe DST. Use the region-based identifier (e.g., `America/New_York`) so DST is handled correctly.

## Common Patterns

### Business Hours Only

Standard weekday coverage for a single team:

| Rule | Days | Start | End |
|------|------|-------|-----|
| 1 | Mon-Fri (1-5) | 09:00 | 18:00 |

**Best for**: Non-critical alerts that can wait until the next business day.

### 24/7 Weekdays with Reduced Weekends

Full weekday coverage with limited weekend hours:

| Rule | Days | Start | End |
|------|------|-------|-----|
| 1 | Mon-Fri (1-5) | 00:00 | 23:59 |
| 2 | Sat-Sun (6-7) | 09:00 | 17:00 |

**Best for**: Teams with weekend on-call during daytime hours only.

### Follow the Sun

Multiple rules covering regional business hours for global teams:

| Rule | Days | Start | End | Region |
|------|------|-------|-----|--------|
| 1 | Mon-Fri (1-5) | 09:00 | 18:00 | US East (America/New_York) |
| 2 | Mon-Fri (1-5) | 09:00 | 18:00 | Europe (Europe/London) |
| 3 | Mon-Fri (1-5) | 09:00 | 18:00 | APAC (Asia/Tokyo) |

> ℹ️ Follow-the-sun requires creating separate schedules (one per timezone) and separate alert rules for each region's destinations.

## Common Questions

### How do I stop getting paged outside business hours?
Create a schedule with Mon-Fri, 09:00 to 18:00 in your team's timezone and assign it to the rule in the **Operating Schedule** dropdown. Events that fire outside the window are suppressed and logged with reason "outside operating hours". For company-wide quiet periods, use [blackout windows](/alerts/blackouts) instead.

### Can one schedule cover overnight hours?

Yes. Set the start time later than the end time and the window wraps to the next day. For example, Mon-Thu with start 22:00 and end 06:00 covers each overnight period from Monday night through Friday morning.

### Does an operating schedule handle daylight saving time?

Yes, as long as you use an IANA identifier like `America/New_York` rather than a fixed UTC offset. DST transitions are handled automatically: on spring-forward the hour that does not exist that day is skipped, and on fall-back the repeated hour is covered. Fixed offsets like `UTC-5` will drift when DST changes.

### Can multiple rules share the same schedule?

Yes. One schedule can be assigned to any number of rules, and editing the schedule affects all of them. A rule can only have one schedule at a time, and rules with no schedule are always active.

### How do I set up follow-the-sun alerting across regions?

Create one schedule per region using the region's IANA timezone (e.g., `America/New_York`, `Europe/London`, `Asia/Tokyo`) and one alert rule per region routing to that region's destinations. There is no single "any region" schedule. The per-rule assignment is what makes the routing work.

## Next Steps

- **[Create Alert Rules](/alerts/alert-rules)**: Assign schedules to your alert rules
- **[Best Practices](/alerts/best-practices)**: Reduce alert fatigue with effective scheduling

---

# Blackout Windows

> Suppress all alerts during planned maintenance or holidays

Blackout windows let you temporarily suppress all alert notifications company-wide. Use them during planned maintenance, deployments, holidays, or any period when alerts would create unnecessary noise.
## How Blackouts Work

- Blackouts suppress **all** alerts for the entire company during the window
- Unlike [operating schedules](/alerts/schedules) (which are per-rule), blackouts are global
- Suppressed alerts are still recorded in the alert log with reason "blackout period"
- Blackouts can be one-time or yearly recurring

## Creating a Blackout

## Recurrence Options

| Type | Behavior | Use Case |
|------|----------|----------|
| None | Applies once, then expires | Deployment windows, one-off maintenance |
| Yearly | Repeats same dates each year | Company holidays, annual freeze periods |

> ℹ️ Yearly recurring blackouts match the month and day, adjusting for timezone. A blackout from Dec 24-26 will repeat every year on those dates.

## Managing Blackouts

- Active blackouts suppress alerts immediately when the window starts
- Toggle **is_active** to temporarily disable a blackout without deleting it
- Past one-time blackouts remain in the list for audit purposes but have no effect

> 💡 Create recurring yearly blackouts for predictable events like company holidays. Use one-time blackouts for ad-hoc maintenance windows.

## Blackouts vs Operating Schedules

| Feature | Operating Schedules | Blackout Windows |
|---------|---------------------|------------------|
| Scope | Per-rule | Company-wide |
| Timing | Recurring weekly pattern | Specific date ranges |
| Use case | Business hours | Maintenance, holidays |
| Configuration | Assigned to individual rules | Applies globally |

> ⚠️ Blackouts override operating schedules. If an event falls within both an active schedule and an active blackout, the blackout takes precedence and the alert is suppressed.

## Common Questions

### How do I silence alerts during a planned deployment?

Create a one-time blackout in **Alerts > Blackouts** with the start and end times of your deployment window. The blackout suppresses all alerts company-wide until it ends.
Suppressed events are still logged in **Alerts > History** with reason "blackout period".

### What's the difference between a blackout and an operating schedule?

Blackouts are company-wide date ranges, typically used for maintenance windows and holiday freezes. [Operating schedules](/alerts/schedules) are per-rule recurring weekly patterns, typically used for business-hours-only alerting. If both apply, the blackout wins and the alert is suppressed.

### Can blackouts repeat every year for holidays?

Yes. Set **Recurrence** to **Yearly** when creating the blackout and it will repeat on the same month and day each year, adjusting for timezone. This is the recommended pattern for company holiday code freezes.

### Do suppressed alerts still show up anywhere?

Yes. All blackout-suppressed events are written to **Alerts > History** with the reason "blackout period", so you can audit what happened during the window without receiving the notifications in real time.

## Next Steps

- **[Alert Rules](/alerts/alert-rules)**: Configure per-rule schedules and conditions
- **[Best Practices](/alerts/best-practices)**: Reduce alert fatigue across your team

---

# Alert Best Practices

> Reduce alert fatigue and improve response times with effective alerting strategies

Effective alerting is about balance: too few alerts and you miss issues; too many and you ignore them all. This guide helps you build an alerting strategy that keeps you informed without overwhelming your team.

## The Alert Fatigue Problem

Alert fatigue happens when teams receive so many notifications that they start ignoring them.

**The goal**: Every alert should be actionable and worth investigating.

## Core Principles

### 1. Start Narrow, Expand Carefully

Don't monitor everything at once:

1. **Week 1**: Monitor 5 critical production tables
2. **Week 2**: Add freshness monitoring to those tables
3. **Week 3**: Expand to 10 more important tables
4. **Week 4**: Review alert history, tune thresholds
5. Continue expanding gradually

### 2. Every Alert Should Be Actionable

Before creating an alert, ask:

- What action should someone take when this fires?
- Is immediate action required, or can it wait?
- Who is the right person to respond?

If you can't answer these questions, the alert may not be useful.

### 3. Match Urgency to Destination

| Urgency | Destination | When to Use |
|---------|-------------|-------------|
| **Immediate** | PagerDuty | On-call response needed now |
| **Soon** | Slack | Team should see within hours |
| **Eventually** | Email | Can be reviewed daily/weekly |

## Event-Based Routing

Route different event types based on impact severity:

### Recommended Setup

| Alert Type | Event | Trigger scope | Destination |
|------------|-------|---------------|-------------|
| Production breaking changes | Schema Change | Breaking only | PagerDuty + Slack |
| Production additive changes | Schema Change | Non-breaking only | Slack (low urgency) |
| Gold-table change freeze | Schema Change | Specific types (`COLUMN_REMOVED`, `PRIMARY_KEY_REMOVED`, ...) | PagerDuty |
| Freshness violations | Freshness Violation | SLA breached | Slack |
| Discovery failures | Discovery Failed | Any failure | Slack + Email |
| Dev/staging changes | Schema Change | Breaking only | Email |

## Environment Separation

Monitor different environments differently:

### Production

**Rules:**

- All schema changes → Slack + PagerDuty (for breaking)
- All freshness violations → Slack
- Discovery failures → Slack + Email

**Schedule:** Hourly discovery | **Threshold:** Strict SLAs

### Staging

**Rules:**

- Breaking changes only → Slack
- Freshness (critical tables only) → Slack

**Schedule:** Every 6 hours | **Threshold:** Lenient SLAs

### Development

**Rules:**

- None or weekly digest only

**Schedule:** Daily | **Threshold:** Very lenient or disabled

## Threshold Tuning

### Start Lenient

If your ETL runs hourly, don't set a 30-minute SLA:

| Pattern | Starting SLA | After Tuning |
|---------|--------------|--------------|
| 15 min updates | 45 min | 30 min |
| Hourly updates | 3 hours | 2 hours |
| Daily updates | 36 hours | 24 hours |

### Use Warning Thresholds

Two-stage alerts reduce surprise violations:

**orders table freshness:**

- **Expected**: Updated hourly
- **Warning**: After 90 minutes (alert to Slack)
- **Violation**: After 2 hours (alert to PagerDuty)

Warnings give you time to investigate before escalation.

### Review and Tighten

After 2-4 weeks:

1. Check alert history
2. Identify alerts that fired but weren't actionable
3. Tighten thresholds that never trigger
4. Loosen thresholds that trigger too often

## Scope Filtering

### Include Only What Matters

Filter rules to relevant assets:

**Rule: Production Revenue Freshness**

- **Data source**: production-postgres
- **Schema**: public
- **Assets**: `orders`, `payments`, `revenue_*`, `transaction_*`

### Exclude Noise

Remove assets that don't need monitoring:

**Exclusions:**

- `*_temp` (temporary tables)
- `*_backup` (backup copies)
- `*_old` (deprecated tables)
- `pg_temp_*` (PostgreSQL temp)
- `test_*` (test tables)

## Alert Aggregation

Avoid alert storms by grouping related alerts:

### Same Asset, Multiple Changes

**Instead of:**

- Column added: new_field_1
- Column added: new_field_2
- Column added: new_field_3
- Column type changed: status

**AnomalyArmor groups:**

- **Schema Change: 4 changes detected**
  - 3 columns added
  - 1 column type changed
  - View details →

### Deduplication

The same change won't re-alert until resolved or a cooldown period passes.

## Common Mistakes

### Alerting on everything

**Problem**: Every table, every change, every environment → hundreds of alerts

**Solution**: Start with 5-10 critical tables. Expand only after you've proven the value.

### Same destination for everything

**Problem**: All alerts go to Slack → important ones get buried

**Solution**: Use event-based routing. PagerDuty for breaking changes, Slack for non-breaking schema changes, Email for informational events.

### Too-tight SLAs

**Problem**: Freshness SLA is 1 hour, but ETL sometimes takes 70 minutes → constant false positives

**Solution**: Set SLA at 2x expected, tune down over time.

### Monitoring dev environments

**Problem**: Dev databases change constantly → alert storm

**Solution**: Don't monitor dev at all, or use weekly email digests only.

### No one owns the alerts

**Problem**: Alerts fire but no one responds

**Solution**: Define ownership for each alert type. Use PagerDuty with on-call rotations for critical alerts.
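The include and exclude patterns from the Scope Filtering section above behave like shell-style globs. Here is a minimal sketch of how a table name could be checked against both lists. It assumes that exclusions take priority over inclusions and that matching is case-sensitive, neither of which is confirmed by this guide, and `in_scope` is a hypothetical helper rather than part of AnomalyArmor.

```python
from fnmatch import fnmatchcase

INCLUDE = ["orders", "payments", "revenue_*", "transaction_*"]
EXCLUDE = ["*_temp", "*_backup", "*_old", "pg_temp_*", "test_*"]


def in_scope(table: str, include=INCLUDE, exclude=EXCLUDE) -> bool:
    """Return True if the table matches an include pattern and no exclude pattern."""
    # Assumption: an exclusion always wins over an inclusion.
    if any(fnmatchcase(table, pattern) for pattern in exclude):
        return False
    return any(fnmatchcase(table, pattern) for pattern in include)


monitored = [
    name
    for name in ["orders", "revenue_daily", "revenue_temp", "users", "test_orders"]
    if in_scope(name)
]
```

Running names through a check like this before saving a rule is a quick way to confirm that a pattern such as `revenue_*` does not also pull in temporary or backup copies.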
## Weekly Review Process Schedule 15-30 minutes weekly to review alerts: ### Questions to Ask 1. **How many alerts fired this week?** - If more than 50: Too many. Add filters or raise thresholds. - If fewer than 5: Are you monitoring enough? 2. **What percentage were actionable?** - Target: >80% - If lower: Identify patterns and add filters 3. **Were any issues missed?** - If yes: Add coverage for those scenarios 4. **Which alerts took longest to resolve?** - These may need better routing or documentation ### Tuning Actions | Finding | Action | |---------|--------| | Alert fires often but isn't actioned | Disable or change to email digest | | Same asset alerts repeatedly | Investigate root cause, not just the alert | | Critical issue wasn't alerted | Add coverage | | Team ignores channel | Reduce volume or change channel | ## Sample Alert Configuration Here's a recommended starting configuration: | Rule | Event | Scope | Trigger scope | Destinations | |------|-------|-------|---------------|--------------| | **Production Breaking Changes** | Schema Change | Production database, all schemas | Breaking only | PagerDuty, Slack #incidents | | **Production Additive Changes** | Schema Change | Production database, all schemas | Non-breaking only | Slack #data-alerts | | **Critical Table Freshness** | Freshness Violation | orders, payments, users, products | SLA from asset config | Slack #data-alerts, PagerDuty (if >4h stale) | | **Analytics Freshness** | Freshness Violation | daily_*, weekly_*, analytics_* | SLA from asset config | Slack #analytics-team | | **Discovery Failures** | Discovery Failed | All | All failures | Slack #data-alerts, Email ops@company.com | | **Staging Changes (Breaking)** | Schema Change | Staging database | Breaking only | Email (daily digest) | ## Checklist Before going live with alerts: - [ ] Defined critical tables (start with 5-10) - [ ] Set up event-based routing (breaking → PagerDuty, others → Slack) - [ ] Excluded dev/test environments - 
[ ] SLAs set with buffer (2x expected) - [ ] Warning thresholds configured - [ ] Assigned ownership for each alert type - [ ] Scheduled weekly review meeting - [ ] Documented escalation process ## Use Schedules and Blackouts Reduce noise by controlling when alerts fire: ### Operating Schedules Assign [operating schedules](/alerts/schedules) to rules that only matter during business hours: - **Freshness rules**: If your pipelines run overnight, set schedules to only alert during business hours when the team can respond - **Non-critical schema changes**: Alert during work hours, suppress overnight - **Development environments**: Restrict to CI/CD windows ### Blackout Windows Use [blackout windows](/alerts/blackouts) for planned quiet periods: - **Deployment windows**: Suppress alerts during known release times - **Holiday freezes**: Create yearly recurring blackouts for company holidays - **Maintenance periods**: Silence alerts during planned infrastructure work > 💡 Combine schedules and blackouts: schedules handle recurring weekly patterns, blackouts handle specific date ranges. Both keep your team focused on alerts they can act on. ## Common Questions ### How do I stop getting too many data alerts? Alert fatigue usually comes from monitoring too broadly. Start with 5-10 critical production tables, route only breaking changes to PagerDuty, send additive changes to Slack, and exclude dev and staging from noisy rules. Set freshness SLAs at roughly 2x expected update time, then tighten once you see real patterns. ### Should I alert on dev and staging databases? Usually no. Dev databases change constantly and produce noise without actionable signal. If you must monitor non-prod, restrict it to breaking changes only and route to a weekly email digest, not to a real-time channel. ### What's a good starting freshness SLA? Start at roughly twice your expected update interval, then tighten over time. For hourly pipelines try a 3-hour SLA and tune down to 2 hours. 
For 15-minute pipelines try 45 minutes. Tight SLAs at launch produce constant false positives during normal pipeline variance. ### Which alerts belong on PagerDuty versus Slack versus email? Match the destination to urgency. PagerDuty is for breaking production changes and critical SLA violations that need on-call response now. Slack is for schema changes and freshness issues the team should see within hours. Email fits informational events, digests, and low-urgency records. ### How often should I review my alert rules? Block 15-30 minutes weekly. Count alerts fired, estimate what percentage were actionable (target above 80 percent), and check whether any real issue was missed. Disable rules that never produce action and tighten ones that never fire. This is the fastest path out of chronic alert fatigue. ## Related Topics - **[Alert Rules](/alerts/alert-rules)**: Configure alert rules - **[Freshness Monitoring](/data-quality/freshness-monitoring)**: Set up freshness SLAs - **[Slack Integration](/alerts/destinations/slack)**: Configure Slack alerts - **[Alerts Overview](/alerts/overview)**: Alert system architecture - **[Operating Schedules](/alerts/schedules)**: Control when rules are active - **[Blackout Windows](/alerts/blackouts)**: Suppress alerts during maintenance --- # Intelligence Overview > AI that connects schema drift, freshness, data quality, and lineage into actionable answers Intelligence is the layer that ties your monitoring together. Schema drift detects changes. Freshness catches stale tables. Data quality tracks metrics. Intelligence reads all of that context and lets you ask questions, surface findings, and diagnose problems in plain English. 
## How Intelligence Uses Your Monitoring Context Every AnomalyArmor feature generates context that Intelligence can reason over: | Feature | Context Created | What Intelligence Does With It | |---------|----------------|-------------------------------| | [Schema Drift](/concepts/schema-changes) | Column additions, removals, type changes | "A column was removed Tuesday. That's likely why your dashboard broke." | | [Freshness](/concepts/freshness) | Update patterns, SLA violations, stale tables | "The orders table is 6 hours overdue. Its upstream source is also stale." | | [Data Quality](/concepts/metrics) | Null rates, row counts, validity rule failures | "Null rate on email spiked from 2% to 28% after Wednesday's deploy." | | [Tags](/data-classification/overview) | PII labels, domain classifications | "3 tables with PII tags had schema changes this week." | | [Lineage](/intelligence/overview) | Upstream sources, downstream consumers | "The stale staging table affects 7 downstream gold tables." | | [Alerts](/concepts/alerts) | Alert history, rule configurations | "You had 4 freshness alerts and 1 schema drift alert since Monday." | The more monitoring you set up, the smarter Intelligence gets. ## Features - **[Object Profiles](/intelligence/object-profiles)**: AI-generated profiles with findings, relationships, and change history for every table - **[Ask Questions](/intelligence/ask-questions)**: Chat with an AI agent that has full context across all your monitoring domains - **[Walkthrough](/intelligence/use-cases)**: See how monitoring context flows into Intelligence to solve real problems ## What You Can Ask Intelligence answers are grounded in real data from your monitoring setup, not just schema names. ### Cross-Domain Diagnosis ``` "Why is my orders dashboard showing errors?" "What changed upstream that could cause NULLs in silver_orders?" "What should I prioritize this morning?" ``` ### Schema and Structure ``` "What columns were removed this week?" 
"Which tables have PII?" "How is users related to orders?" ``` ### Freshness and Health ``` "Which tables are stale right now?" "What's the update pattern for my fact tables?" "Show me SLA violations this week" ``` ### Data Quality ``` "Which columns have spiking null rates?" "What validity rules are failing?" "Are there anomalies in my row counts?" ``` ### Impact Analysis ``` "What's the blast radius if orders goes down?" "What depends on the customers table?" "Show me everything downstream of raw_events" ``` ## The Intelligence Page The Intelligence page is a two-panel layout: - **Left**: Schema Explorer tree. Navigate databases, schemas, and tables. - **Right**: Dynamic content that changes based on your selection. Select a table to see its **Object Profile**: an AI-generated summary with findings grouped by severity, relationship mapping, change history, and which monitoring context is available. Select a database or schema to see a high-level overview with aggregated findings. Click **Ask Agent** to open the full-screen chat, where you can ask questions scoped to the selected asset with full conversation history. ## Getting Started ## How It Works Intelligence builds a knowledge base from your operational metadata: | What We Analyze | What We Never Access | |----------------|---------------------| | Table and column names | Actual row data | | Data types and constraints | Query results | | Schema change history | Business values | | Freshness patterns and SLAs | Database credentials | | Data quality metrics | | | Alert history and rules | | | Tags and classifications | | | Lineage and dependencies | | Your data values stay in your database. Intelligence only sees structure and operational metadata. 
## Usage and Billing

Intelligence queries count against your monthly API limit:

| Plan | Monthly Queries |
|------|-----------------|
| Starter | 500 |
| Growth | 2,000 |
| Professional | 10,000 |
| Enterprise | Unlimited |

Check usage: **Settings > Billing > Usage**

## Privacy and Security

- Intelligence processes queries using AI language models (configurable per account)
- Only schema metadata and operational data are sent (names, types, alert history, freshness status)
- No actual data values are ever transmitted
- You can request Intelligence be disabled for your account

## Common Questions

### What does Intelligence actually do that monitoring alone doesn't?

Monitoring detects individual events: schema drift, stale tables, null rate spikes. Intelligence correlates them. It reads context from every monitoring domain and answers questions like "why is this broken?" in one place, so you don't have to click through separate pages.

### Does Intelligence send my data to an LLM?

Only metadata: table names, column types, schema changes, freshness status, metric values, alert history, and tags. Row data, query results, and credentials are never transmitted. See the [Ask Questions](/intelligence/ask-questions) page for more on privacy.

### Do I need monitoring set up before Intelligence is useful?

Yes, effectively. Intelligence reasons over the context your monitoring creates. Without freshness schedules, drift detection, or metrics, it falls back to schema-only analysis and loses most of its diagnostic power.

### Is Intelligence usage capped?

Yes, by plan. Starter gets 500 queries per month, Growth 2,000, Professional 10,000, and Enterprise is unlimited. Check consumption under **Settings > Billing > Usage**.

### Can I turn Intelligence off for my account?

Yes. If your organization doesn't want AI processing, request that Intelligence be disabled. The rest of AnomalyArmor (monitoring, alerts, classification) continues working without it.
## Next Steps - **[Object Profiles](/intelligence/object-profiles)**: Understand what you see when you select a table - **[Ask Questions](/intelligence/ask-questions)**: Chat with the AI agent --- # Object Profiles > AI-generated intelligence profiles for every table, powered by your monitoring context When you select a table on the Intelligence page, you see its **Object Profile**: an AI-generated analysis that combines schema metadata with context from freshness, data quality, schema drift, tags, and lineage into a single view. The profile is only as useful as the monitoring you've set up. A table with freshness schedules, data quality metrics, and schema drift monitoring will have a much richer profile than one with no monitoring at all. ## What's in a Profile ### Object Summary The top card shows the table's identity at a glance: - **Name and path**: `warehouse.gold.fact_orders` - **Warehouse role**: Classified as Fact Table, Dimension, Staging, Raw, or other patterns - **Confidence score**: How confident the AI is in its analysis (based on available context) - **Summary**: One-paragraph explanation of what this table is for - **Business context**: What team uses it, what process populates it, what it means to the business - **Domain and tags**: Business and technical classifications applied to this table ### Findings Findings are the most actionable part of the profile. They surface issues and observations pulled from across your monitoring domains, grouped by severity: | Severity | Example | |----------|---------| | **Critical** | "Column `order_total` was removed on Tuesday. 3 downstream tables reference this column." | | **High** | "Table is 6 hours overdue for update. SLA threshold is 2 hours." | | **Medium** | "Null rate on `email` increased from 2.1% to 12.4% over the past week." | | **Low** | "Table has no freshness monitoring configured." | Each finding includes the affected object, a description, and the monitoring domain it came from. 
Critical and High findings are expanded by default. Low findings are collapsed. ### Relationships and Lineage Shows how this table connects to the rest of your data: - **Foreign key relationships**: Detected column references to other tables - **Lineage signals**: Whether this table is a source, derived, or staging table - **Schema-level lineage**: Which schemas act as sources vs. consumers - **Hub tables**: Central entities that many tables reference ### Change History A timeline of recent schema changes detected on this table: - Column additions (green) - Column removals (red) - Type modifications (amber) Shows the 10 most recent changes with timestamps. This context is what lets Intelligence answer questions like "what changed on the orders table this week?" ### Analysis Context Shows which monitoring domains have data available for this table. The agent uses this context when answering questions: | Domain | What It Provides | |--------|-----------------| | **Schema** | Table structure, column types, constraints | | **Tags** | Business classifications, PII labels | | **Freshness** | Update patterns, SLA status, learned schedules | | **Rules** | Alert rule configurations and firing history | | **Data Quality** | Metric trends, validity rule results | A checkmark means context is available. Missing domains mean you haven't set up that monitoring yet. ## Generating Profiles ### Single Table 1. Navigate to a table in the Schema Explorer 2. Click **Analyze Table** in the top right 3. Wait for analysis to complete (typically 10-30 seconds) 4. The profile appears with all available context ### Entire Schema 1. Select a schema in the Schema Explorer 2. Click **Analyze Schema** 3. All tables in the schema are analyzed ### Full Database 1. Select the database root in the Schema Explorer 2. Click **Analyze All** 3. Every table across all schemas is analyzed Analysis runs in the background. You'll see a notification when it completes, and the profiles update automatically. 
## Making Profiles More Useful The quality of a profile depends directly on what monitoring you've set up: | What You Set Up | What the Profile Gains | |----------------|----------------------| | Freshness schedule | Update patterns, SLA violation findings, staleness history | | Schema drift monitoring | Change timeline, removed column warnings, drift findings | | Data quality metrics | Trend data, anomaly findings, null rate tracking | | Tags | Business context, PII classification, domain grouping | | Alert rules | Alert coverage analysis, firing history context | **Start with your most critical tables.** Set up freshness monitoring and a few data quality metrics, then generate intelligence. The profile will immediately surface findings from that context. ## Editing Descriptions AI-generated descriptions are a starting point. You can edit any description to add context the AI can't infer: - Business context: "Used by the finance team for quarterly reporting" - Data sources: "Populated by the Stripe webhook integration" - Update frequency: "Updated in real-time as orders are placed" - Caveats: "Does not include cancelled orders before 2023" Edits are preserved across re-analysis. Your additions won't be overwritten. ## Common Questions ### What's in an object profile? A summary of the table's role, AI-generated business context, findings grouped by severity, foreign key and lineage relationships, and a timeline of recent schema changes. Every section is grounded in your actual monitoring data, not inferred from the table name alone. ### Why are my profiles sparse on findings? Profiles are only as good as the monitoring feeding them. A table with freshness schedules, data quality metrics, and schema drift monitoring produces rich findings. A table with no monitoring produces mostly schema-level context and "consider enabling X" suggestions. ### How do I generate a profile for many tables at once? 
Select a schema and click **Analyze Schema**, or pick the database root and click **Analyze All**. Analysis runs in the background and profiles update automatically as each table completes. ### Do my edits to a profile's description survive re-analysis? Yes. Manual edits to descriptions are preserved across future runs. Use this to add business context the AI can't infer, like which team owns a table or how it's populated. ### What does the confidence score represent? It reflects how much monitoring context was available when the profile was generated. More freshness history, drift detection, metrics, and tags produce higher confidence. Tables with only schema metadata score lower. ## Next Steps - **[Ask Questions](/intelligence/ask-questions)**: Use profiles as context for AI-powered Q&A - **[Walkthrough](/intelligence/use-cases)**: See profiles in action during a real debugging scenario --- # Ask Questions > Chat with an AI agent that has full context from your monitoring setup The Intelligence agent is an AI assistant that can answer questions about your data using context from every monitoring domain: schema drift, freshness, data quality, tags, lineage, and alerts. It chains multiple tools together automatically, so a single question can pull from several sources to give you a complete answer. ## How to Ask 1. Navigate to the **Intelligence** page 2. Click **Ask Agent** in the top right 3. Type your question in plain English 4. The agent investigates, calling tools and checking context 5. You see the full response with tool calls visible The agent opens as a full-screen overlay. You can scope it to a specific table by selecting one in the Schema Explorer before opening the chat. ## What Makes This Different The agent doesn't just search table names. It reads your monitoring context: | Traditional Search | Intelligence Agent | |---|---| | "orders" returns tables with "orders" in the name | "Why is the orders dashboard broken?" 
checks schema drift for removed columns, freshness for stale data, and data quality for metric anomalies | | "stale tables" returns a static list | "What should I prioritize?" ranks issues by severity across all domains | | "PII tables" returns tagged assets | "Which PII tables had changes this week?" cross-references tags with schema drift history | ## Question Categories ### Diagnosis (Cross-Domain) These are the most powerful questions because they pull from multiple monitoring domains at once. ``` You: Why is my orders dashboard showing "column not found"? Agent: I checked schema drift for the orders table and found that the column `discount_pct` was removed on Tuesday at 3:47 PM. This column is referenced by 2 downstream views: - analytics.order_summary - reporting.daily_revenue The removal coincides with a schema change detected on the upstream staging.orders_raw table. I recommend checking the ETL pipeline that populates staging.orders_raw for intentional column removal. ``` ``` You: What changed upstream that could cause NULLs in silver_orders? Agent: I found 3 potential causes: 1. Schema change: raw_events.user_id column type changed from INT to VARCHAR on Monday. This could cause join failures with the users table. 2. Freshness: raw_events hasn't updated in 14 hours (SLA: 2 hours). Missing upstream data would produce NULLs in derived columns. 3. Data quality: null_percentage on raw_events.order_id spiked from 0.1% to 8.3% starting Wednesday. ``` ### Health and Prioritization ``` You: What should I prioritize this morning? Agent: Here's your priority list: CRITICAL: - gold.fact_orders is 6 hours stale (SLA: 2h). Upstream staging.orders_raw is also stale. HIGH: - 3 schema drift alerts: columns removed from gold.dim_customer, gold.dim_product, silver.silver_users - Null rate on gold.fact_payments.amount spiked to 15% MEDIUM: - 2 validity rules failing on gold.dim_customer (UNIQUE on email, NOT_NULL on created_at) No new info-level issues. 
``` ### Data Discovery ``` You: Where is revenue data? Agent: Revenue data lives in 3 tables: 1. gold.fact_orders (3.2M rows, updated 5 min ago) - Columns: order_total, tax_amount, discount_amount, net_revenue - Tagged: revenue, finance 2. gold.fact_subscriptions (450K rows, updated 1 hour ago) - Columns: mrr, arr, subscription_value - Tagged: revenue, SaaS-metrics 3. reporting.revenue_daily (aggregated, updated daily at 6 AM) - Columns: total_revenue, revenue_by_product, revenue_by_region gold.fact_orders has the most granular and freshest data. ``` ### Data Quality ``` You: Are there anomalies in my row counts this week? Agent: I checked row_count metrics across all monitored tables: ANOMALIES DETECTED: - gold.fact_orders: Row count dropped 23% on Wednesday (3.2M -> 2.4M). This is outside the normal daily variance of +/- 5%. - staging.email_events: Row count doubled on Thursday (890K -> 1.8M). Possible duplicate ingestion. NORMAL: - 14 other monitored tables show expected row count patterns. ``` ### Impact Analysis ``` You: What's the blast radius if I drop the customers table? Agent: Dropping gold.dim_customer would affect: DIRECT DEPENDENTS (3 tables): - gold.fact_orders (JOIN on customer_id) - gold.fact_subscriptions (JOIN on customer_id) - reporting.customer_ltv (derived from dim_customer) INDIRECT DEPENDENTS (2 tables): - reporting.revenue_daily (depends on fact_orders) - reporting.churn_analysis (depends on fact_subscriptions) ALERT IMPACT: - 4 freshness schedules would stop reporting - 2 data quality metrics would fail - 1 validity rule (UNIQUE on email) would be orphaned ``` ## Sessions Conversations are organized into sessions. Each session maintains full context, so follow-up questions work naturally. 
- **New session**: Click **+** in the session sidebar - **Resume session**: Click a previous session to continue - **Session history**: Sidebar shows all past conversations with timestamps ## Tool Visibility As the agent works, you can see each tool call it makes. This transparency lets you verify what the agent is checking: - Which tool was called (e.g., `list_schema_changes`, `check_freshness`) - What parameters were passed - The result returned For actions that modify data (creating metrics, applying tags), the agent asks for confirmation before proceeding. ## Tips - **Be specific about scope**: "Check freshness for the 3 gold fact tables" works better than "check everything" - **Chain requests**: "Find tables with spiking null rates, then show me their schema changes this week" works in one message - **Ask diagnostic questions**: "Why is X broken?" triggers cross-domain investigation, which is where Intelligence is strongest - **Let it discover first**: Say "find tables with customer data" instead of typing exact table paths ## Common Questions ### How is this different from searching a data catalog? A catalog returns tables that match keywords. The Intelligence agent chains tools across schema drift, freshness, data quality, tags, and lineage to diagnose **why** something is happening, not just what exists. "Why is my orders dashboard broken?" pulls from multiple monitoring domains in a single answer. ### Does the agent see my actual data values? No. It only reads operational metadata, schema structure, freshness patterns, metric values, alert history, tags, and lineage. Row-level data stays in your database. See [Intelligence Overview](/intelligence/overview) for the full list of what's analyzed versus what's never accessed. ### Can the agent make changes, or is it read-only? It can suggest and, with your confirmation, perform actions like creating metrics or applying tags. Any action that modifies data requires you to approve it first. 
Investigative questions are always read-only. ### What kinds of questions work best? Diagnostic ones. "Why is X broken?", "What changed upstream?", "What should I prioritize?" trigger cross-domain investigation where the agent is strongest. Scope matters: asking about specific tables beats "check everything". ### Do sessions remember earlier questions? Yes. Each session keeps full conversation context, so follow-ups like "and what about the phone column?" work naturally. Start a new session when you switch topics to keep context clean. ### Can I see what tools the agent called to answer a question? Yes. Each response shows the tool calls inline, which tool was used, what arguments were passed, and what it returned. This transparency lets you verify the answer and spot places where extra monitoring would give the agent more context. ## Next Steps - **[Walkthrough](/intelligence/use-cases)**: See the agent diagnose a real problem end to end - **[Object Profiles](/intelligence/object-profiles)**: Understand the context that powers agent answers --- # Intelligence in Action > How monitoring context flows into Intelligence to solve real problems Intelligence is most powerful when you've set up monitoring across your data stack. This page walks through two real scenarios where schema drift, freshness, and data quality create the context that Intelligence uses to diagnose problems. ## Scenario 1: The Broken Dashboard A revenue dashboard shows "column not found" at 8 AM. Here's how you'd use AnomalyArmor to find the root cause. ### Step 1: Set Up Schema Drift Monitoring Before anything breaks, you enabled schema drift detection on your gold tables: ``` Navigate to gold.fact_orders > Schema Monitoring > Enable ``` AnomalyArmor captures a baseline of every column, type, and constraint. Any future change triggers a schema drift alert. 
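The baseline comparison in Step 1 can be pictured as a diff between two column snapshots. The sketch below is illustrative only: `diff_schema` and the snapshot dictionaries are hypothetical and are not part of AnomalyArmor's API. The change-type names `column_removed` and `type_changed` follow the ones used elsewhere in these docs, while `column_added` is an assumed counterpart.

```python
from typing import Dict, List, Tuple

# Hypothetical snapshot shape: column name -> declared type.
Snapshot = Dict[str, str]


def diff_schema(baseline: Snapshot, current: Snapshot) -> List[Tuple[str, str, str]]:
    """Compare two column snapshots and return (change_type, column, detail) tuples."""
    changes = []
    # Columns present in the baseline but missing now were removed.
    for column in sorted(baseline.keys() - current.keys()):
        changes.append(("column_removed", column, baseline[column]))
    # Columns that only exist in the current snapshot were added.
    for column in sorted(current.keys() - baseline.keys()):
        changes.append(("column_added", column, current[column]))
    # Columns in both snapshots whose declared type differs were modified.
    for column in sorted(baseline.keys() & current.keys()):
        if baseline[column] != current[column]:
            detail = f"{baseline[column]} -> {current[column]}"
            changes.append(("type_changed", column, detail))
    return changes


# Made-up example data, loosely modeled on the scenario's orders table.
baseline = {"order_id": "INT", "order_total": "FLOAT", "discount_pct": "FLOAT"}
current = {"order_id": "VARCHAR", "order_total": "FLOAT"}

for change in diff_schema(baseline, current):
    print(change)
```

Comparing against a stored baseline rather than the previous run means a change stays visible until someone acknowledges it, which is one plausible reason to capture the baseline up front.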
### Step 2: Set Up Freshness Monitoring You created a freshness schedule for the orders pipeline: ``` gold.fact_orders: Check hourly, SLA 2 hours staging.orders_raw: Check hourly, SLA 1 hour ``` Now AnomalyArmor knows when these tables should update and will alert if they fall behind. ### Step 3: Set Up Data Quality Metrics You added quality metrics on key columns: ``` gold.fact_orders.order_total: row_count, null_percentage gold.fact_orders.discount_pct: null_percentage, distinct_count gold.dim_customer.email: null_percentage ``` AnomalyArmor tracks these metrics on every discovery run, building a history of normal ranges. ### Step 4: Something Breaks Tuesday at 3:47 PM, a deploy removes the `discount_pct` column from `staging.orders_raw`. The ETL pipeline runs, propagating the removal to `gold.fact_orders`. Wednesday morning, the revenue dashboard fails with "column not found: discount_pct." At this point, AnomalyArmor has already captured: - **Schema drift**: Column `discount_pct` removed from `gold.fact_orders` at 4:12 PM Tuesday - **Data quality**: `discount_pct` null_percentage metric stopped reporting (column gone) - **Alerts**: Schema drift alert fired at 4:12 PM, freshness alert cleared (table still updating on schedule) ### Step 5: Intelligence Connects the Dots You open the Intelligence page, select `gold.fact_orders`, and see the Object Profile: **Findings (Critical)**: > Column `discount_pct` (FLOAT) was removed on Tuesday at 4:12 PM. This column had an active null_percentage metric that is no longer reporting. 2 downstream views reference this column: `analytics.order_summary` and `reporting.daily_revenue`. **Findings (Medium)**: > The removal correlates with a schema change on `staging.orders_raw` at 3:47 PM Tuesday, suggesting the column was dropped upstream. You click **Ask Agent** and type: ``` You: Why is the revenue dashboard broken? Agent: The revenue dashboard is failing because the column `discount_pct` was removed from gold.fact_orders. 
Timeline: - Tue 3:47 PM: discount_pct removed from staging.orders_raw - Tue 4:12 PM: Change propagated to gold.fact_orders via ETL - Tue 4:12 PM: Schema drift alert fired - Wed 8:00 AM: Dashboard query fails on missing column The column is referenced by: - analytics.order_summary (SELECT discount_pct) - reporting.daily_revenue (SUM(discount_pct)) Root cause: Upstream schema change in staging.orders_raw. Check the deploy that went out Tuesday afternoon. Recommended actions: 1. Restore the column if the removal was unintentional 2. Update the 2 downstream views if it was intentional 3. Resolve the schema drift alert after addressing ``` Without Intelligence, this investigation would involve checking the schema history page, the freshness page, the alerts page, and the data quality page separately. With Intelligence, you get the full picture in one question. ## The Pattern Every Intelligence investigation follows the same loop: 1. **Features generate context**: Schema drift, freshness, and data quality monitoring continuously capture operational data about your tables 2. **Intelligence aggregates context**: When you select a table or ask a question, Intelligence pulls from all available monitoring domains 3. **AI reasons across domains**: Instead of checking each page separately, Intelligence correlates findings across schema changes, freshness status, quality metrics, tags, and lineage 4. **You get a diagnosis**: A single answer that explains what happened, when, why, and what to do about it The more monitoring you set up, the more complete the picture. Start with your most critical tables: add freshness schedules, enable schema drift detection, and track a few data quality metrics. Then let Intelligence do the correlation work for you. ## Common Questions ### What's the minimum monitoring setup for Intelligence to be useful? Freshness on your most critical tables plus schema drift detection. That combination alone unlocks most diagnostic questions. 
Add data quality metrics next for the null-rate and row-count regressions shown in Scenario 2. ### Can Intelligence catch a regression nothing alerted on? Yes. Scenario 2 is exactly that case, no alert fires because the table updates on schedule and schema is unchanged, but the null-rate trend in data quality metrics surfaces the regression in the Object Profile. Intelligence reads trend history, not just threshold breaches. ### How does Intelligence trace a broken dashboard back to its root cause? It correlates schema drift (column removed), freshness (was the table also late?), lineage (which views reference the column), and alert history (when did the alert fire?) into a single timeline. That's what turns "dashboard is broken" into "deploy on Tuesday 3:47 PM removed `discount_pct`". ### Does Intelligence work across upstream and downstream tables? Yes. Lineage context lets it trace a problem from a gold table back to staging sources, and forward to downstream views and reports. The "blast radius" question in [Ask Questions](/intelligence/ask-questions) uses the same signal. ## Next Steps - **[Set Up Freshness](/concepts/freshness)**: Monitor table update patterns - **[Enable Schema Drift](/concepts/schema-changes)**: Detect column changes automatically - **[Add Data Quality](/concepts/metrics)**: Track null rates, row counts, and validity rules - **[Ask Questions](/intelligence/ask-questions)**: Start chatting with the Intelligence agent --- # Data Classification > Tag and categorize your data assets for governance and compliance Data classification helps you track sensitive data, organize assets, and meet compliance requirements. Know which tables contain PII before your next audit asks. 
## Two Types of Tags | Type | How It Works | Use For | |------|--------------|---------| | **Auto-classification** | Automatically detects PII patterns in column names | Compliance, PII tracking | | **Custom tags** | You apply labels manually or in bulk | Organization, ownership, status | Both types work together. Auto-classification catches PII; custom tags add business context. ## The Value ### Before Classification Auditor: "Where is customer PII?" You: "Let me check... probably in users... and maybe orders... I think there's an email somewhere in marketing..." ### After Classification Auditor: "Where is customer PII?" You: *filters by `pii:*`* "Here's the complete list of 47 columns containing PII across 12 tables." ## Quick Start **Already have data sources connected?** Classification starts working automatically: 1. Auto-classification runs during discovery 2. Check results: **Assets** > **Filter** > **Classification** 3. Add custom tags to organize assets by environment, team, or review status **Need to customize patterns?** Go to **Settings** > **Data Classification**. ## What Gets Auto-Classified AnomalyArmor detects common PII patterns: | Data Type | Example Columns | Tag | |-----------|-----------------|-----| | Email | `email`, `user_email` | `pii:email` | | Phone | `phone`, `mobile` | `pii:phone` | | SSN | `ssn`, `tax_id` | `pii:ssn` | | Credit Card | `card_number` | `pii:credit-card` | | Address | `address`, `zip_code` | `pii:address` | | Name | `first_name`, `last_name` | `pii:name` | | IP Address | `ip`, `client_ip` | `pii:ip-address` | > ℹ️ Auto-classification reads **column names only**, never actual data values. ## Common Use Cases ### Compliance Audit Preparation Before an audit: 1. Filter assets by `pii:*` tags to see all PII 2. Verify critical tables are classified 3. Export the list for documentation See [Compliance & Governance](/data-classification/compliance) for detailed workflows.
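The first and third audit steps amount to a wildcard match over column tags followed by an export. As a rough sketch, assuming you already hold a list of (table, column, tags) records: the `columns` data and the `pii_inventory` and `to_csv` helpers are made up for illustration, and in the product this filtering happens in the Assets view rather than in code.

```python
import csv
import io
from fnmatch import fnmatch

# Hypothetical records: (table, column, tags). Not an AnomalyArmor export format.
columns = [
    ("gold.dim_customer", "email", ["pii:email", "production"]),
    ("gold.dim_customer", "created_at", ["production"]),
    ("gold.fact_orders", "client_ip", ["pii:ip-address"]),
]


def pii_inventory(records, pattern="pii:*"):
    """Return one (table, column, tag) row for every tag matching the wildcard."""
    rows = []
    for table, column, tags in records:
        for tag in tags:
            if fnmatch(tag, pattern):
                rows.append((table, column, tag))
    return rows


def to_csv(rows):
    """Serialize the inventory rows to CSV text that can be handed to an auditor."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["table", "column", "tag"])
    writer.writerows(rows)
    return buffer.getvalue()


print(to_csv(pii_inventory(columns)))
```

Narrowing the pattern, for example to `pii:email`, gives the per-type view that the Classification filter offers.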
### Organizing by Environment Distinguish production from development: 1. Create tags: `production`, `staging`, `development` 2. Apply to assets via bulk tagging 3. Filter alerts by tag to reduce noise ### Tracking Review Status Mark assets as reviewed: 1. Create tags: `pii-reviewed`, `needs-review` 2. Work through unreviewed assets 3. Track progress with filters ### Alert Routing Route alerts based on tags: - Critical tables → PagerDuty - Non-critical → Email digest - Development → Slack only ## Alerting on Tags Create tag-based alert rules: 1. Go to **Alerts** > **Rules** 2. Click **Create Rule** 3. Under **Scope**, select **Filter by tag** 4. Choose your tag (e.g., `production`) 5. Configure destinations Now alerts only fire for assets with that tag. ## Common Questions ### Do I need to tag everything manually? No. [Auto-classification](/data-classification/auto-classification) runs during discovery and tags PII patterns like email, phone, and SSN automatically. You only add [custom tags](/data-classification/custom-tags) for things AnomalyArmor can't infer, like environment, ownership, or review status. ### Does auto-classification look at my data? No. It reads only column names, types, and metadata. Column values stay in your database and are never transmitted. ### How does this help with compliance? Classification gives you an instant inventory of PII locations, which auditors regularly ask for. See [Compliance & Governance](/data-classification/compliance) for SOC 2, GDPR, and access review workflows. ### Can alerts be scoped to specific tags? Yes. In alert rules, choose **Filter by tag** and select any tag, like `production` or `pii:email`. Only assets with that tag will trigger the rule, which is the standard way to separate prod alerts from dev noise. 
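Conceptually, a tag-scoped rule is a membership test on the asset's tags. The sketch below models that with a hypothetical `Rule` class and made-up destination names. It is a mental model only, not the product's rule engine or configuration format; rules themselves are created in the UI as described above.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Rule:
    """Hypothetical tag-scoped rule: fires only for assets carrying `tag`."""
    tag: str
    destination: str


def destinations_for(asset_tags: Set[str], rules: List[Rule]) -> List[str]:
    """Return the destination of every rule whose tag is present on the asset."""
    return [rule.destination for rule in rules if rule.tag in asset_tags]


# Illustrative rules mirroring the routing idea: production pages, development chats.
rules = [
    Rule(tag="production", destination="pagerduty"),
    Rule(tag="development", destination="slack"),
]

print(destinations_for({"production", "pii:email"}, rules))
print(destinations_for({"staging"}, rules))
```

An asset with no matching tag yields no destinations, which is how tag scoping keeps development noise out of production channels.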
## In This Section - **[Auto-Classification](/data-classification/auto-classification)**: Automatic PII detection and patterns - **[Custom Tags](/data-classification/custom-tags)**: Create and apply your own labels - **[Compliance](/data-classification/compliance)**: SOC 2, GDPR, and audit workflows --- # Auto-Classification > Automatically detect PII and sensitive data in your columns Auto-classification scans your column names and metadata to identify potentially sensitive data like emails, phone numbers, and SSNs. This helps you track PII for compliance without manual tagging. ## How It Works Auto-classification runs automatically during discovery: 1. **Discovery scans** your database metadata 2. **Pattern matching** checks column names against known PII patterns 3. **Tags applied** automatically to matching columns 4. **Results visible** in the asset detail page > ℹ️ Auto-classification analyzes column **names and types only**, never actual data values. A column named `email` gets tagged, but we never read the emails inside it. ## Detected Patterns | Pattern | Example Column Names | Tag Applied | |---------|---------------------|-------------| | Email | `email`, `user_email`, `email_address` | `pii:email` | | Phone | `phone`, `phone_number`, `mobile` | `pii:phone` | | SSN | `ssn`, `social_security`, `tax_id` | `pii:ssn` | | Credit Card | `cc_number`, `card_number`, `credit_card` | `pii:credit-card` | | IP Address | `ip`, `ip_address`, `client_ip` | `pii:ip-address` | | Address | `address`, `street`, `zip_code` | `pii:address` | | Name | `first_name`, `last_name`, `full_name` | `pii:name` | ## Viewing Classified Data To see which columns have been auto-classified: 1. Go to **Assets** in the sidebar 2. Click on any asset to view details 3. Look for **Classification** tags on columns 4. Or filter: **Assets** → **Filter** → **Classification** → select a PII type ## Enabling Auto-Classification Auto-classification is **enabled by default** for all data sources.
To verify or change: 1. Go to **Settings** → **Data Classification** 2. Check that **Auto-classification** is toggled ON 3. Select which patterns to detect (all enabled by default) ## Accuracy and Limitations Auto-classification uses pattern matching on column names, which means: **It will catch:** - Columns with obvious names (`email`, `phone_number`, `ssn`) - Common naming conventions (`user_email`, `billing_address`) **It may miss:** - Abbreviated names (`em`, `ph`) - Non-English names - Ambiguous names (`data`, `value`, `field1`) **False positives:** - `email_sent_count` might be tagged as email (it's not PII) - Review classifications and remove incorrect tags manually ## Customizing Detection To adjust which patterns are detected: 1. Go to **Settings** → **Data Classification** 2. Under **Auto-classification Patterns**, toggle patterns on/off 3. Changes apply to the next discovery run > 💡 Run a manual discovery after changing patterns to see updated classifications immediately. ## Manual Override If auto-classification gets it wrong: **To remove an incorrect tag:** 1. Go to the asset detail page 2. Find the column with the incorrect tag 3. Click the tag → **Remove** **To add a missing tag:** 1. Go to the asset detail page 2. Click **Add Tag** on the column 3. Select the classification that matches the data type (e.g., `pii:email` for email addresses, `pii:phone` for phone numbers) ## Verification After discovery runs, verify auto-classification is working: 1. Go to **Assets** → **Filter** → **Classification** → **Has any** 2. You should see assets with classified columns 3. Click into an asset to see which columns were tagged If no assets appear: - Verify auto-classification is enabled in Settings - Check that discovery has run recently - Your column names may not match detection patterns ## Common Questions ### Does auto-classification read my actual data? No. Auto-classification looks only at column names, types, and metadata. 
Column values are never read or transmitted. A column called `email` is tagged based on its name, not on inspecting what's inside. ### What PII types get detected out of the box? Email, phone, SSN, credit card, IP address, physical address, and name. Detection runs automatically during discovery and tags matching columns with values like `pii:email` or `pii:phone`. ### Why did a non-PII column like `email_sent_count` get tagged? Pattern matching is name-based, so anything with `email` in the name matches. Open the column, click the tag, and remove it manually. Manual overrides persist across future discovery runs. ### Will it miss PII in abbreviated or non-English column names? Yes, it's likely to miss short names like `em` or `ph`, non-English names, and ambiguous ones like `data` or `value`. Add the appropriate `pii:*` tag manually from the asset detail page for those columns. ### Is auto-classification on by default? Yes, for all data sources. You can toggle individual patterns on or off in **Settings → Data Classification**. Changes take effect on the next discovery run. ## Next Steps - **[Custom Tags](/data-classification/custom-tags)**: Add your own classification tags - **[Data Handling](/security/overview)**: How we protect your data --- # Custom Tags > Create and apply your own labels to organize data assets Custom tags let you organize assets beyond auto-classification, such as marking tables as `production`, `deprecated`, `pii-reviewed`, or any label that fits your workflow. ## Creating a Tag 1. Go to **Settings** → **Tags** 2. Click **Create Tag** 3. Enter a tag name (e.g., `production`, `needs-review`) 4. Choose a color (optional, helps with visual scanning) 5. Click **Save** You'll see "Tag created successfully" and the tag appears in your list. ## Applying Tags to Assets ### Single Asset 1. Navigate to the asset in **Assets** 2. Click the asset name to open details 3. Click **Add Tag** (next to the asset title) 4. Select from your available tags 5.
Tag appears immediately on the asset ### Multiple Assets (Bulk) 1. Go to **Assets** → **All Assets** 2. Check the boxes next to assets you want to tag 3. Click **Actions** → **Add Tag** 4. Select the tag to apply 5. Click **Apply** All selected assets now have the tag. ## Removing Tags ### From a Single Asset 1. Go to the asset detail page 2. Find the tag you want to remove 3. Click the **X** on the tag 4. Tag is removed immediately ### Bulk Remove 1. Go to **Assets** → filter by the tag you want to remove 2. Select all assets (checkbox in header) 3. Click **Actions** → **Remove Tag** 4. Select the tag to remove 5. Click **Remove** ## Editing a Tag To rename or change a tag's color: 1. Go to **Settings** → **Tags** 2. Find the tag and click **Edit** (pencil icon) 3. Change name or color 4. Click **Save** Changes apply everywhere the tag is used. ## Deleting a Tag 1. Go to **Settings** → **Tags** 2. Find the tag and click **Delete** (trash icon) 3. Confirm deletion > ⚠️ Deleting a tag removes it from all assets. This cannot be undone. ## Filtering by Tags Find all assets with a specific tag: 1. Go to **Assets** 2. Click **Filter** 3. Select **Tags** → choose your tag 4. View shows only assets with that tag Combine with other filters (data source, schema) to narrow results. ## Tag Naming Conventions Consistent naming helps your team: | Category | Examples | Use For | |----------|----------|---------| | Environment | `production`, `staging`, `dev` | Distinguish environments | | Status | `reviewed`, `needs-review`, `deprecated` | Track review status | | Ownership | `team-data`, `team-analytics` | Assign ownership | | Compliance | `pii-reviewed`, `gdpr-ready` | Track compliance work | ## Verification After creating and applying tags: 1. Go to **Assets** → **Filter** → **Tags** → select your tag 2. Verify the correct assets appear 3. Click into an asset to confirm the tag shows ## Common Questions ### How do custom tags differ from auto-classification?
[Auto-classification](/data-classification/auto-classification) is automatic and focused on PII patterns. Custom tags are anything you define: environment (`production`, `staging`), ownership (`team-analytics`), status (`needs-review`), or any label your team needs. Both types coexist on the same asset. ### Can I tag many assets at once? Yes. Go to **Assets → All Assets**, check the assets you want, then use **Actions → Add Tag**. Bulk removal works the same way. This is the fastest path to tagging a large set of tables by environment or team. ### Can I route alerts based on tags? Yes. In [alert rules](/alerts/alert-rules), scope a rule by tag so only tagged assets trigger the rule. This is how teams send production alerts to PagerDuty and development alerts to Slack from the same underlying events. ### What happens when I delete a tag? Deletion is permanent and removes the tag from every asset it was applied to. There's no undo. To temporarily hide a tag without losing data, rename it instead of deleting. ## Next Steps - **[Auto-Classification](/data-classification/auto-classification)**: Automatic PII detection - **[Alert Rules](/alerts/alert-rules)**: Create alerts filtered by tag --- # Compliance & Governance > Use tagging for audits, GDPR, and data governance ## Why Classification Matters for Compliance Auditors ask: "Where is your PII?" You need an answer that isn't "let me check." AnomalyArmor's auto-classification and custom tags give you: - **Instant PII inventory** across all databases - **Audit-ready exports** of sensitive data locations - **Continuous monitoring** as new tables appear ## Recommended Tag Structure | Tag | Use For | |-----|---------| | `pii:email`, `pii:phone`, etc. 
| Auto-classified PII (automatic) | | `sensitivity:high` | Manually flagged critical data | | `compliance:reviewed` | Audit trail of reviewed assets | | `compliance:gdpr-scope` | GDPR-relevant data | | `retention:30-days` | Data retention policy | --- ## Audit Preparation Checklist ``` [ ] Run discovery to ensure catalog is current [ ] Review auto-classification results for accuracy [ ] Remove false positives (email_count ≠ PII) [ ] Add manual tags for data auto-classification missed [ ] Export filtered asset list for auditor [ ] Document any exceptions with descriptions ``` --- ## Staying Compliant Over Time New tables appear. Schemas change. Stay ahead: 1. **Alert on new PII**: Create rule for "New asset detected" + filter by auto-classification 2. **Review cadence**: Monthly review of `compliance:needs-review` tagged assets 3. **Discovery schedule**: Run frequently enough to catch new tables before auditors do ## Common Questions ### How do I answer 'where is our PII?' during an audit? Go to **Assets**, filter by classification `pii:*`, and export the list. You get a complete inventory of PII-tagged columns across every connected database. Pair with a manual spot-check for PII hidden in non-obvious column names. ### Does AnomalyArmor help with GDPR data mapping? Yes. [Auto-classification](/data-classification/auto-classification) tags emails, names, and addresses automatically. Filter by the relevant `pii:*` tags to map where personal data lives, then add retention or scope tags like `compliance:gdpr-scope` for documentation. ### Can I flag tables as reviewed for audit purposes? Use [custom tags](/data-classification/custom-tags) like `compliance:reviewed` and `compliance:needs-review`. Apply them manually or in bulk, then filter by tag to see what still needs attention. ### What happens if a new table appears between audits? Create an alert rule for "New asset detected" filtered by auto-classification. 
Every time discovery finds a new PII-tagged table, the alert fires so you can review it before the next audit cycle. --- # Guides Overview > Practical guides for common AnomalyArmor use cases These guides walk you through solving specific problems with AnomalyArmor. Each guide targets a particular persona or use case. ## Use Case Guides - **[Prevent Pipeline Failures](/guides/prevent-pipeline-failures)**: **For Data Engineers** Catch schema changes before they break your dbt models and ETL jobs. - **[On-Call Data Alerting](/guides/oncall-data-alerting)**: **For Platform Teams** Set up production monitoring with proper escalation and on-call routing. - **[Compliance Monitoring](/guides/compliance-monitoring)**: **For Data Governance** Track data lineage and changes for audit and compliance requirements. ## Who Are These For? | Guide | Primary Audience | Key Outcome | |-------|------------------|-------------| | Prevent Pipeline Failures | Data Engineers | Proactive schema change detection | | On-Call Data Alerting | Platform/SRE Teams | 24/7 data incident response | | Compliance Monitoring | Data Governance | Audit trails and change documentation | ## Getting Started If you're new to AnomalyArmor, start with the [Quickstart Guide](/quickstart/overview) to connect your first database. Then return to these guides for specific workflows. --- # Video Tutorials > Step-by-step video walkthroughs of AnomalyArmor features Video tutorials are coming soon. We're recording Loom walkthroughs covering: - **Getting Started** - Connect your first database and run discovery - **Setting Up Alerts** - Configure Slack, email, and webhook notifications - **Creating Metrics** - Track row counts, null percentages, and custom queries - **Schema Monitoring** - Detect and respond to schema changes Check back soon, or [contact us](/support/contact) if you'd like early access to video content.

---

# For Data Engineers

> Schema drift detection, CI/CD integration, and automation for data engineers

As a data engineer, you build and maintain the pipelines that keep data flowing. AnomalyArmor helps you catch breaking changes before they impact downstream systems.

## Your Key Workflows

### Detect Breaking Schema Changes

Schema drift is your biggest enemy. A column rename or type change can silently break pipelines that ran fine yesterday.

### Integrate with Your CI/CD

Gate deployments on data quality using the CLI:

```bash
# Install
pip install anomalyarmor-cli

# Check freshness before running dbt
armor freshness check snowflake.prod.warehouse.orders

# Exit code 1 if stale, blocking the pipeline
```

[Full CLI reference](/cli/overview)

### Automate with Webhooks

Trigger actions when schema changes are detected:

```python
from anomalyarmor import Client

client = Client()

# Get schema changes from last 24 hours
changes = client.schema.changes(
    since="24h",
    change_types=["column_removed", "type_changed"]
)

for change in changes:
    print(f"Breaking change: {change.asset_name} - {change.description}")
```

[Python SDK guide](/sdk/overview)

## Recommended Setup

| Feature | Why You Need It |
|---------|-----------------|
| [Schema Drift Detection](/schema-monitoring/schema-drift) | Catch column changes before they break pipelines |
| [Freshness Monitoring](/data-quality/freshness-monitoring) | Know when upstream data is stale |
| [Webhook Alerts](/alerts/destinations/webhooks) | Integrate with your existing monitoring |
| [CLI](/cli/overview) | Automate checks in CI/CD |

## Common Tasks

- **[Set Up dbt Integration](/integrations/dbt)**: Run AnomalyArmor checks as part of dbt runs
- **[Airflow Pre-flight Checks](/integrations/airflow)**: Gate DAG tasks on data freshness
- **[GitHub Actions Integration](/integrations/github-actions)**: Add data quality checks to your CI pipeline
- **[CLI Reference](/cli/reference)**: Full command documentation

## Common Questions

### How do I gate a dbt run or Airflow DAG on data freshness?

Use the `armor freshness check` CLI command in your pipeline, passing the qualified name of the asset (for example `snowflake.prod.warehouse.orders`). It exits non-zero when data is stale, which blocks the next step in most orchestrators. See [CLI overview](/cli/overview) and [Airflow integration](/integrations/airflow).

### Can AnomalyArmor catch a breaking schema change before my pipeline runs?

Yes, if you schedule discovery to run before your pipelines. Point discovery at production every hour and schedule it 30-60 minutes before your main ETL windows. Breaking changes (column removed, type changed) fire alerts on the next discovery.

### How do I send schema change alerts to a webhook for automation?

Create a webhook destination, then build an alert rule scoped to the change types you care about (e.g. `column_removed`, `type_changed`). Incoming payloads include asset name, change type, and diff for your automation to act on. See [Webhook destinations](/alerts/destinations/webhooks).

### Can I pull schema change history programmatically?

Yes, via the Python SDK: `client.schema.changes(since="24h", change_types=[...])`. See [Python SDK guide](/sdk/overview).

### How does AnomalyArmor handle column renames?

Today a rename surfaces as a column removed plus a column added. You can correlate the pair in your webhook handler or alert rule. Automatic rename detection is on the roadmap.

---

# For Analytics Engineers

> Data quality metrics, freshness SLAs, and stakeholder reporting for analytics engineers

As an analytics engineer, you own the quality of data that powers business decisions. AnomalyArmor helps you catch data quality issues before they reach dashboards.

## Your Key Workflows

### Set Freshness SLAs

Stakeholders expect data to be current. Set SLAs to get alerted when data is stale.

### Track Data Quality Metrics

Go beyond freshness.
Monitor row counts, null percentages, and more: ``` Critical Table Monitoring: +-- row_count: Catch unexpected data loss or growth +-- null_percent on key columns: Detect incomplete data loads +-- distinct_count on IDs: Identify duplicate records ``` [Create your first metric](/quickstart/create-first-metric) ### Report Quality to Stakeholders Embed data quality status directly in dashboards: ```markdown # Sales Dashboard ![Data Quality](https://app.anomalyarmor.ai/badges/abc123.svg) ``` When the badge is green, stakeholders know the data is reliable. [Set up report badges](/data-quality/report-badges) ### Document Tables with AI Use Intelligence to auto-generate descriptions for your dbt models: ```python from anomalyarmor import Client client = Client() # Generate descriptions for all columns descriptions = client.intelligence.describe( "snowflake.prod.warehouse.orders" ) # Paste into your dbt YAML files ``` [Use Intelligence](/intelligence/object-profiles) ## Recommended Setup | Feature | Why You Need It | |---------|-----------------| | [Freshness Monitoring](/data-quality/freshness-monitoring) | Set SLAs on data recency | | [Data Quality Metrics](/data-quality/metrics) | Track statistical properties over time | | [Report Badges](/data-quality/report-badges) | Show quality status in dashboards | | [Slack Alerts](/alerts/destinations/slack) | Get notified in your team channel | ## Common Tasks - **[Set Up Freshness SLAs](/data-quality/freshness-monitoring)**: Define expected update frequency - **[Create Metrics](/data-quality/metrics)**: Track row counts, null percentages, and more - **[Embed Report Badges](/data-quality/report-badges)**: Show quality status in Looker, Tableau, Notion - **[Object Profiles](/intelligence/object-profiles)**: AI-generated intelligence profiles for every table ## Common Questions ### How do I set freshness SLAs on my dbt models? 
Identify the tables that feed executive or customer-facing dashboards, then set freshness thresholds per table (hourly, daily, weekly). Route alerts to your team's Slack channel. See [Freshness Monitoring](/data-quality/freshness-monitoring). ### Can I show data quality status inside my Looker or Tableau dashboards? Yes. Embed a report badge that turns green or red based on freshness and quality monitors. Stakeholders can see at a glance whether the underlying data is trustworthy. See [Report Badges](/data-quality/report-badges). ### How do I generate column descriptions for my dbt YAML files? Use the Intelligence API (`client.intelligence.describe(...)`) to auto-generate descriptions and paste them into your `schema.yml`. See [Object Profiles](/intelligence/object-profiles) for the full workflow. ### What metrics should I track on my most important tables? Start with row count, null percentage on required columns, and distinct count on ID columns. Those three catch the majority of real-world quality issues (data loss, incomplete loads, duplicate inserts). See [Data Quality Metrics](/data-quality/metrics). --- # For Platform Teams > Multi-database monitoring, incident management, team administration, and API integration As a platform team, you manage data infrastructure at scale. AnomalyArmor provides unified observability across all your databases with enterprise-grade controls. 
## Your Key Workflows ### Monitor Multiple Databases Connect all your data sources from a single pane of glass: ### Integrate with Incident Management Route critical alerts to your existing incident management: | Platform | Integration | |----------|-------------| | PagerDuty | [Native integration](/alerts/destinations/pagerduty) | | Opsgenie | Via webhook | | ServiceNow | Via webhook | Example PagerDuty routing: ``` Alert Rule: Schema Change (column removed) Severity: Critical Route to: PagerDuty → Data Platform On-Call ``` ### Manage Team Access Set up role-based access control: | Role | Permissions | |------|-------------| | Admin | Full access, team management, API keys | | Editor | Configure alerts, create metrics | | Viewer | Read-only access to all data | [Team management guide](/account/team-management) ### Build Custom Integrations Use the API to integrate AnomalyArmor with your internal tools: ```python from anomalyarmor import Client client = Client(api_key="aa_live_xxx") # Get all assets across all sources assets = client.assets.list() # Export to your internal data catalog for asset in assets: sync_to_internal_catalog(asset) # Monitor API key usage keys = client.api_keys.list() for key in keys: print(f"{key.name}: {key.last_used}") ``` [API documentation](/api/overview) ### Security and Compliance Review our security posture: - **Read-only access**: We only query metadata, never your data - **Query validation**: Open-source Query Gateway validates all SQL - **Encryption**: AES-256 at rest, TLS 1.3 in transit - **Audit logs**: All administrative actions logged [Security overview](/security/overview) ## Recommended Setup | Feature | Why You Need It | |---------|-----------------| | [Multi-source monitoring](/data-sources/overview) | Unified view across all databases | | [PagerDuty integration](/alerts/destinations/pagerduty) | Incident escalation | | [Team management](/account/team-management) | Role-based access control | | [API access](/api/overview) | 
Custom integrations | ## Common Tasks - **[Add Data Source](/data-sources/overview)**: Connect PostgreSQL, Databricks, Snowflake, and more - **[Configure PagerDuty](/alerts/destinations/pagerduty)**: Route critical alerts to on-call - **[Manage Team](/account/team-management)**: Invite users and set permissions - **[API Keys](/api/authentication)**: Create and manage API access ## Common Questions ### How many databases can I connect to AnomalyArmor? Every plan supports multiple connections; the limit is set by plan capacity (total tables monitored), not by connection count. Enterprise supports unlimited tables and VPC peering / PrivateLink per source. See [Data Sources](/data-sources/overview). ### Can I route alerts to PagerDuty, Opsgenie, or ServiceNow? PagerDuty has a native integration. Opsgenie and ServiceNow are supported via webhook destinations. You can route by severity so breaking changes page on-call while low-priority events go to Slack or email. See [PagerDuty integration](/alerts/destinations/pagerduty). ### What RBAC roles are available? Viewer (read-only), Editor (configure alerts and metrics), Admin (full access plus team and API keys). SSO with SAML 2.0 and SCIM provisioning is available on Enterprise. See [Team management](/account/team-management). ### Can I sync AnomalyArmor's asset inventory to our internal data catalog? Yes. Use the REST API or Python SDK to pull the asset list (`client.assets.list()`) and push it into your catalog. Changes are available via schema-change webhooks so syncs stay current. See [API overview](/api/overview). ### How does AnomalyArmor fit our compliance requirements? Read-only metadata access, AES-256 at rest, TLS 1.3 in transit, audit logs for every admin action, GDPR DPA and HIPAA BAA available. VPC peering and PrivateLink keep traffic off the public internet on Enterprise. See [Security Overview](/security/overview). 
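
For the catalog sync described in the questions above, one possible shape is a diff step that runs before anything is written to the internal catalog. The sketch below is illustrative only: `plan_catalog_sync` and `SyncPlan` are hypothetical names, and it assumes you have already reduced the result of `client.assets.list()` to a list of qualified asset names (the asset attribute that holds that name is not shown in these docs).

```python
from typing import Iterable, List, NamedTuple


class SyncPlan(NamedTuple):
    """Assets to create in, and retire from, the internal catalog."""
    to_add: List[str]
    to_remove: List[str]


def plan_catalog_sync(armor_assets: Iterable[str],
                      catalog_assets: Iterable[str]) -> SyncPlan:
    """Compare qualified asset names from both systems.

    armor_assets: names pulled from AnomalyArmor (assumed pre-extracted).
    catalog_assets: names currently present in the internal catalog.
    """
    armor, catalog = set(armor_assets), set(catalog_assets)
    return SyncPlan(
        to_add=sorted(armor - catalog),      # discovered but not yet catalogued
        to_remove=sorted(catalog - armor),   # catalogued but no longer discovered
    )


if __name__ == "__main__":
    plan = plan_catalog_sync(
        ["snowflake.prod.warehouse.orders", "snowflake.prod.warehouse.payments"],
        ["snowflake.prod.warehouse.orders", "snowflake.prod.warehouse.legacy_orders"],
    )
    print(plan)
```

Computing the plan separately from applying it makes the sync easy to dry-run and to review before entries are retired.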

---

# Prevent Pipeline Failures

> Catch schema changes before they break your dbt models and ETL jobs

**Audience**: Data Engineers, Analytics Engineers

Schema changes are one of the most common causes of pipeline failures. A dropped column upstream can cascade into failed dbt runs, broken dashboards, and late-night debugging sessions.

This guide shows you how to use AnomalyArmor to catch schema changes before they impact your pipelines.

## The Problem

## The Solution

With AnomalyArmor, you'll know about schema changes before your pipelines run.

## Setup Guide

### Step 1: Connect Your Source Databases

Connect the databases that your pipelines read from, not just your warehouse.

**Common sources to monitor:**

- Production application databases (the ones your dbt project reads from)
- Third-party data sources
- Shared data lakes

For each source, follow the [connection guide](/data-sources/overview).

### Step 2: Schedule Frequent Discovery

For pipeline-critical databases, run discovery frequently:

| Database Type | Recommended Schedule | Why |
|---------------|---------------------|-----|
| Application databases | Hourly | Changes can happen anytime |
| Shared warehouses | Every 6 hours | Less frequent changes |
| Third-party sources | Daily | Usually stable |

Configure in: **Data Sources → [Your Connection] → Settings → Discovery Schedule**

### Step 3: Create Breaking Change Alerts

Set up alerts specifically for changes that break pipelines:

**Rule: Breaking Schema Changes (Production)**

| Field | Value |
|-------|-------|
| **Event** | Schema Change Detected |
| **Data Source** | `production-app-db` |
| **Schema** | `public` |
| **Assets** | All (or list specific tables) |
| **Change Type** | Column Removed, Table Removed, Type Changed |
| **Destinations** | Slack `#data-engineering`, Email `data-team@company.com` |

### Step 4: Time Alerts Before Pipeline Runs

If your dbt runs at 3 AM, schedule discovery at 2 AM.

## Advanced: Pre-dbt Validation

### Option 1: Webhook Integration
Use webhooks to fail your pipeline early if breaking changes are detected:

1. Set up a webhook destination in AnomalyArmor
2. Point it at a validation endpoint in your orchestrator
3. If the webhook fires, block the dbt run

The resulting flow:

1. **AnomalyArmor Alert** fires on schema change
2. **Webhook** sent to Airflow/Dagster
3. **Set flag**: `schema_changes_detected = true`
4. **dbt task** checks flag before running
5. **If flag = true**: Fail fast with meaningful error

### Option 2: Discovery Schedule Alignment

Align discovery with your orchestration schedule:

```python
# In your Airflow DAG
from airflow.operators.bash import BashOperator
# SimpleHttpOperator ships in the apache-airflow-providers-http package;
# newer provider releases expose it as HttpOperator.
from airflow.providers.http.operators.http import SimpleHttpOperator

# Fetch the latest discovery result; the task fails on a non-2xx response
discovery_check = SimpleHttpOperator(
    task_id='check_for_schema_changes',
    http_conn_id='anomalyarmor',
    endpoint='/api/v1/discoveries/latest',
    method='GET',
)

run_dbt = BashOperator(
    task_id='run_dbt',
    bash_command='dbt run',
)

# dbt only starts after the discovery check succeeds
discovery_check >> run_dbt
```

## What to Do When Alerts Fire

### Immediate Actions

1. **Acknowledge the alert**: Let your team know you're investigating
2. **Check the change details**: View in AnomalyArmor: what changed, when, and on which asset
3. **Assess impact**: Which models/dashboards use this table?

### If the Change is Breaking

1. **Pause affected pipelines** (if possible before they run)
2. **Update your dbt models** to handle the change
3. **Test locally** with the new schema
4. **Deploy the fix** before the next scheduled run

### If the Change is Expected

1. **Document it**: Note in AnomalyArmor or your team wiki
2. **Update downstream**: Ensure all dependents are updated
3. **Consider communication**: Should you announce to stakeholders?

## Model Dependency Mapping

Know which models depend on which tables:

**Source Table: `production.orders`**

- `stg_orders` (staging model)
- `int_orders_enriched` (intermediate)
- `fct_orders` (fact table)
- monthly_revenue (dashboard)
- customer_lifetime_value (analytics)
- `rpt_daily_orders` (report)
- `dim_order_status` (dimension)

When `production.orders` changes, all of these are potentially impacted.
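
The dependency list above can be kept as a small adjacency map so that an alert handler can answer "what is impacted?" without opening the dbt project. This is a minimal sketch, not an AnomalyArmor feature: `DEPENDENTS` and `downstream` are hypothetical names, and the parent/child edges are an assumption made for illustration, since the guide names the assets but not their exact nesting.

```python
from collections import deque

# Hypothetical edges: each key maps to its direct dependents.
# The nesting below is assumed for illustration only.
DEPENDENTS = {
    "production.orders": ["stg_orders"],
    "stg_orders": ["int_orders_enriched", "dim_order_status"],
    "int_orders_enriched": ["fct_orders", "rpt_daily_orders"],
    "fct_orders": ["monthly_revenue", "customer_lifetime_value"],
}


def downstream(asset, graph=DEPENDENTS):
    """Return every asset reachable from `asset`, in breadth-first order."""
    seen = {asset}
    order = []
    queue = deque([asset])
    while queue:
        for child in graph.get(queue.popleft(), []):
            if child not in seen:  # guard against shared dependents and cycles
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


if __name__ == "__main__":
    for name in downstream("production.orders"):
        print(name)
```

Breadth-first order lists the closest dependents first, which is usually the order in which you want to check them after a breaking change.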
> 💡 Use `dbt ls --select stg_orders+` to see all downstream dependencies of a model. A leading `+` (as in `+stg_orders`) selects its upstream parents instead.

## Alert Configuration Examples

| Priority | Rule Name | Event | Scope | Conditions | Destinations |
|----------|-----------|-------|-------|------------|--------------|
| **High** | Revenue Table Changes | Schema Change | `orders`, `payments`, `transactions` | Any change | Slack #data-critical, PagerDuty |
| **Medium** | Dimension Table Changes | Schema Change | `dim_*`, `*_lookup` | Column removed or type changed | Slack #data-engineering |
| **Low** | External Source Changes | Schema Change | `external.*`, `partner_*` | Any change | Email (daily digest) |

## Troubleshooting

### Pipeline failed but I didn't get an alert

1. **Check discovery timing**: Did discovery run before the pipeline?
2. **Check scope**: Is the table included in the alert rule?
3. **Check conditions**: Does the change type match your conditions?
4. **Verify destination**: Is the destination configured correctly?

### Too many alerts for non-breaking changes

1. **Filter change types**: Alert only on `Column Removed`, `Table Removed`, `Type Changed`
2. **Exclude test schemas**: Filter out `test_*`, `dev_*`
3. **Separate environments**: Different rules for prod vs. staging

### Can't connect to production database

1. **Use a read replica**: Monitor the replica instead of primary
2. **Create a dedicated user**: With read-only permissions
3. **Check network access**: Firewall rules, security groups

## Checklist

Before going live:

- [ ] Connected all source databases that feed pipelines
- [ ] Discovery scheduled to run before pipeline runs
- [ ] Alert rules for breaking changes (column/table removed)
- [ ] Alerts routed to the right channel (data engineering team)
- [ ] Team knows what to do when alerts fire
- [ ] Documented critical table dependencies

## Common Questions

### How often should I run schema discovery for pipeline-critical tables?
For production application databases that feed dbt or ETL, run discovery **hourly**; for shared warehouses, every 6 hours; and for stable third-party sources, daily. The goal is to detect a change before the next pipeline run, so align the discovery schedule with your orchestrator. See [Schedule Frequent Discovery](#step-2-schedule-frequent-discovery).

### Which schema changes actually break dbt pipelines?

The high-severity ones are **column removed**, **table removed**, and **column type changed**. Additive changes (new columns, new tables) rarely break existing models. Scope your breaking-change alert rule to those three change types to cut alert noise without missing pipeline-breakers.

### Can AnomalyArmor block my dbt run if a breaking change is detected?

Yes, via a webhook destination. Point the webhook at an Airflow/Dagster sensor that sets a flag, then make your dbt task depend on the flag being clear. See [Option 1: Webhook Integration](#option-1-webhook-integration) for the pattern. This is the "fail fast with a meaningful error" flow that beats a 3 AM dbt compilation error.

### Should I monitor my source database or my warehouse?

Monitor both, but source databases are where most breaking changes originate: upstream teams drop columns without telling you. Connect the production application DBs your dbt project reads from, not just the warehouse you write into. See [Step 1](#step-1-connect-your-source-databases).

### Why am I getting too many alerts for non-breaking changes?

Tighten the change-type filter on the rule to **Column Removed**, **Table Removed**, and **Type Changed** only. Exclude `test_*` and `dev_*` schemas, and split prod and staging into separate rules with different destinations. See the [Troubleshooting section](#troubleshooting).

### How do I know which dbt models a source table feeds?

Upload your dbt manifest via the [lineage upload](/guides/lineage-upload) flow and AnomalyArmor's asset page will show downstream dependencies.
You can also run `dbt ls --select stg_orders+` locally to list everything downstream of a model (dashboards appear only if they are declared as dbt exposures).

## Related Resources

- **[Schema Monitoring](/schema-monitoring/overview)**: Deep dive into schema change detection
- **[Alert Rules](/alerts/alert-rules)**: Configure alert conditions

---

# On-Call Data Alerting

> Set up production monitoring with proper escalation and on-call routing

**Audience**: Platform Teams, Data Platform, SRE

Data incidents need the same rigor as application incidents. This guide helps you set up 24/7 monitoring with proper escalation, on-call routing, and incident response.

## The Goal

## Architecture Overview

## Setting Up PagerDuty Integration

### Step 1: Create PagerDuty Service

In PagerDuty:

1. Go to **Services → New Service**
2. Name: `Data Observability - AnomalyArmor`
3. Integration: Select **Events API V2**
4. Copy the **Integration Key**

### Step 2: Add PagerDuty Destination in AnomalyArmor

1. Go to **Alerts → Destinations**
2. Click **Add Destination**
3. Select **PagerDuty**
4. Enter the Integration Key
5. Name: `PagerDuty - Data On-Call`
6.
**Test** and **Save** ### Step 3: Configure Escalation Policy In PagerDuty, set up escalation: ## Alert Urgency Framework Define how urgently different data incidents need response: ### Critical (Page Immediately) **Criteria:** - Production data pipeline completely down - Core revenue tables missing or stale >4 hours - Discovery failures for >24 hours **Examples:** - Column removed from `orders` table - `payments` table data >4 hours stale - Can't connect to production database **Destination:** PagerDuty → On-Call ### High (Respond Within 4 Hours) **Criteria:** - Important tables stale (1-4 hours) - Schema changes in production - Non-critical discovery failures **Examples:** - Column type changed in production - Analytics tables 2 hours stale - Staging discovery failed **Destination:** Slack #data-incidents ### Medium (Respond Within 24 Hours) **Criteria:** - Non-production schema changes - Warning thresholds reached - New assets discovered **Examples:** - Staging schema changed - Freshness approaching SLA (warning) - New table discovered in production **Destination:** Slack #data-alerts ### Low (Informational) **Criteria:** - Development changes - Expected changes - Routine discoveries **Destination:** Email digest (daily) ## Alert Rule Configuration ### Rule 1: Critical - Production Breaking Changes | Field | Value | |-------|-------| | **Name** | CRITICAL - Production Breaking Changes | | **Event** | Schema Change Detected | | **Data source** | `production-*` | | **Schema** | `public`, `analytics` | | **Change type** | Column Removed, Table Removed | | **Destinations** | PagerDuty (Data On-Call), Slack `#data-incidents` | ### Rule 2: Critical - Revenue Table Freshness | Field | Value | |-------|-------| | **Name** | CRITICAL - Revenue Data Stale | | **Event** | Freshness Violation | | **Assets** | `orders`, `payments`, `revenue_*` | | **SLA exceeded by** | >4 hours | | **Destinations** | PagerDuty (Data On-Call), Slack `#data-incidents` | ### Rule 3: High - 
Production Schema Changes | Field | Value | |-------|-------| | **Name** | Production Schema Changes | | **Event** | Schema Change Detected | | **Data source** | `production-*` | | **Change type** | All | | **Destinations** | Slack `#data-incidents` | ### Rule 4: High - Data Freshness Violations | Field | Value | |-------|-------| | **Name** | HIGH - Data Freshness Violations | | **Event** | Freshness Violation | | **Data source** | `production-*` | | **Condition** | SLA exceeded | | **Destinations** | Slack `#data-incidents` | ### Rule 5: High - Discovery Failures | Field | Value | |-------|-------| | **Name** | HIGH - Discovery Failures | | **Event** | Discovery Failed | | **Data source** | `production-*` | | **Destinations** | Slack `#data-incidents`, Email `data-platform@company.com` | ## On-Call Runbook ### When Paged for Schema Change ### When Paged for Freshness Violation 1. **ACKNOWLEDGE** the alert 2. **CHECK ETL STATUS** - Is the ETL job running? Failed? Stuck? - Check Airflow/Dagster/orchestrator 3. **CHECK SOURCE SYSTEM** - Is the source database accessible? - Is source data actually updating? 4. **IDENTIFY ROOT CAUSE** - ETL failure → Fix and restart - Source delay → Communicate delay - Connection issue → Troubleshoot connection 5. **MITIGATE** - Restart failed jobs - Notify stakeholders of delay 6. 
**RESOLVE** and document

## Slack Integration Best Practices

### Channel Setup

**Slack Channels:**

- `#data-incidents` - Breaking changes (notifications on)
- `#data-alerts` - All schema changes (lower priority)
- `#data-digest` - Daily/weekly summaries

### Alert Message Format

AnomalyArmor alerts include:

```
🔴 CRITICAL: Schema Change Detected

Asset: production.public.orders
Change: Column removed - shipping_status (varchar)
Detected: Today at 3:15 PM UTC
Discovery Run: #12345

Impact: High - This table is used by 5 downstream models

Actions:
• [View in AnomalyArmor]
• [View Asset Details]
• [View Downstream Dependencies]

On-Call: @data-oncall
```

## Maintenance Windows

### Scheduled Maintenance

Before planned changes:

1. Go to **Alerts → Rules**
2. Toggle OFF relevant rules
3. Set a reminder to re-enable (e.g., calendar event)
4. Proceed with maintenance
5. Verify the changes were detected correctly
6. Toggle rules back ON

### Quick Disable

For unexpected but known issues, quickly disable a rule:

1. Go to **Alerts → Rules**
2. Find the rule
3. Toggle it **OFF**
4. Remember to re-enable when the issue is resolved

## Metrics to Track

| Metric | Target | How to Measure |
|--------|--------|----------------|
| MTTD (Time to Detect) | < 1 hour | Discovery frequency |
| MTTN (Time to Notify) | < 5 min | Alert → PagerDuty time |
| MTTR (Time to Resolve) | < 4 hours | Alert → Resolution time |
| False Positive Rate | < 20% | Alerts ignored / Total alerts |
| Pager Load | < 5/week | Critical alerts per week |

Review these weekly in your on-call handoff.

## Checklist

Before going live with on-call alerting:

- [ ] PagerDuty integration configured
- [ ] Escalation policy set up
- [ ] Critical/High/Medium/Low rules defined
- [ ] Slack channels created and configured
- [ ] On-call runbook documented
- [ ] Team trained on response procedures
- [ ] Test alert sent and verified

## Common Questions

### How do I page my on-call engineer when data breaks?
Create a PagerDuty service with an **Events API V2** integration, copy the integration key, and add a PagerDuty destination in **Alerts → Destinations**. Then route only your Critical rules (breaking schema changes, revenue-table freshness >4h) to that destination. See [Setting Up PagerDuty Integration](#setting-up-pagerduty-integration). ### Which data incidents should actually page someone? Page on production pipelines being completely down, core revenue tables stale for more than 4 hours, or discovery failures lasting over 24 hours. Everything else should go to Slack, not PagerDuty, to protect on-call from alert fatigue. See the [Alert Urgency Framework](#alert-urgency-framework). ### How do I suppress alerts during planned maintenance? Go to **Alerts → Rules** and toggle off the relevant rules before the maintenance window, then re-enable after. Set a calendar reminder so rules don't stay off indefinitely. For recurring windows, use operating schedules and blackouts in the contract config instead. ### What metrics should I track for data on-call health? MTTD (under 1 hour, driven by discovery frequency), MTTN (under 5 minutes from alert to page), MTTR (under 4 hours), false-positive rate (under 20%), and pager load (under 5 critical alerts per week). Review these weekly in your on-call handoff. See [Metrics to Track](#metrics-to-track). ### Can I send different alerts to different Slack channels? Yes. Create separate destinations for `#data-incidents` (breaking changes), `#data-alerts` (all schema changes), and `#data-digest` (daily summaries), then route each alert rule by severity. That keeps high-signal alerts out of the noisy firehose and stops people from muting the wrong channel. 
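
The severity-based routing described in this guide can be sketched as a small lookup, which is handy when you mirror the rules in a webhook handler or want to review them as code. This is an illustration under stated assumptions, not AnomalyArmor's API: the destination labels, the `table_removed` identifier, the environment names, and both function names are hypothetical, and the tiers follow the Alert Urgency Framework above for schema changes only.

```python
# Destination labels are illustrative; they mirror the tiers in this guide.
ROUTES = {
    "critical": ["PagerDuty: Data On-Call", "Slack: #data-incidents"],
    "high": ["Slack: #data-incidents"],
    "medium": ["Slack: #data-alerts"],
    "low": ["Email: daily digest"],
}

# Change types treated as breaking (identifier spelling is assumed).
BREAKING = {"column_removed", "table_removed"}


def schema_change_urgency(change_type: str, environment: str) -> str:
    """Tier a schema change: production breaking changes page on-call,
    other production changes go to Slack, staging is medium, the rest low."""
    if environment == "production":
        return "critical" if change_type in BREAKING else "high"
    if environment == "staging":
        return "medium"
    return "low"


def route(change_type: str, environment: str) -> list:
    """Return the destinations for a schema change event."""
    return ROUTES[schema_change_urgency(change_type, environment)]


if __name__ == "__main__":
    print(route("column_removed", "production"))
    print(route("type_changed", "production"))
    print(route("column_added", "development"))
```

Keeping the tier decision separate from the destination map means you can retune who gets paged without touching the classification logic.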
## Related Resources - **[PagerDuty Setup](/alerts/destinations/pagerduty)**: Detailed PagerDuty integration guide - **[Alert Best Practices](/alerts/best-practices)**: Reduce alert fatigue --- # Compliance Monitoring > Track data lineage and changes for audit and compliance requirements **Audience**: Data Governance, Compliance Teams, DPOs Regulatory frameworks like GDPR, SOX, HIPAA, and industry standards require organizations to demonstrate control over their data. AnomalyArmor provides the audit trails and change documentation needed for compliance. ## Compliance Use Cases | Requirement | How AnomalyArmor Helps | |-------------|------------------------| | **Data Lineage** | Track what data exists and where | | **Change Management** | Document all schema modifications | | **Audit Trails** | Immutable logs of detected changes | | **Access Documentation** | Record what AnomalyArmor can see | | **Data Classification** | Tag PII and sensitive data | ## Schema Change Documentation ### Automatic Change History Every schema change is automatically recorded: ``` Schema History: customers table ──────────────────────────────── 2024-01-15 08:00:00 UTC Change: Column removed - ssn (varchar) Detected by: Discovery Run #1234 Details: Column no longer exists in source 2024-01-10 14:30:00 UTC Change: Column added - gdpr_consent (boolean) Detected by: Discovery Run #1200 Details: New column discovered 2024-01-05 09:15:00 UTC Change: Column type changed - email From: varchar(100) To: varchar(255) Detected by: Discovery Run #1150 ``` ### Accessing Change History 1. Navigate to **Assets** 2. Click on any asset 3. Select **Schema History** tab 4. Filter by date range 5. Export for auditors ### Exporting for Audits Export schema change history: 1. Go to **Assets → [Asset] → Schema History** 2. Click **Export** 3. Select format: CSV, JSON, or PDF 4. Choose date range 5. 
Download The export includes: - Asset name and location - Change type and details - Detection timestamp - Discovery run ID - Before/after values ## Data Classification ### Tagging Sensitive Data Identify and tag PII and sensitive columns: 1. Navigate to an asset 2. Click **Classification** tab 3. Select columns to classify 4. Apply tags: - `PII` - Personally Identifiable Information - `Financial` - Financial data - `Health` - Health/medical data - `Confidential` - Internal confidential - Create custom tags for your organization's specific requirements ### Classification Alerts Alert when classified columns change: | Field | Value | |-------|-------| | **Rule** | PII Column Changes | | **Event** | Schema Change Detected | | **Classification** | PII | | **Conditions** | Any change | | **Destinations** | Email compliance@company.com, Slack #data-governance | ### Classification Report Generate a report of all classified data: 1. Go to **Assets** 2. Filter by Classification 3. Export filtered results Report includes: - All assets with classified columns - Classification tags applied - Column details - Last change date ## Audit Trail Access ### What's Logged AnomalyArmor maintains audit logs for: | Activity | Logged Details | |----------|---------------| | Discovery runs | Start time, end time, assets found, changes detected | | Schema changes | Before/after state, detection time | | Configuration changes | Who changed what, when | | User actions | Login, logout, settings changes | | Alert activity | Alerts fired, acknowledged, resolved | ### Accessing Audit Logs **In the UI:** 1. Go to **Settings → Audit Log** 2. Filter by date, user, or activity type 3. 
Export for compliance review **Via API:** ```bash GET /api/v1/audit-logs ?start_date=2024-01-01 &end_date=2024-01-31 &activity_type=schema_change ``` ### Log Retention | Plan | Retention Period | |------|------------------| | Standard | 90 days | | Enterprise | Configurable (up to 7 years) | Contact support for extended retention requirements. ## Compliance Frameworks ### GDPR | GDPR Requirement | AnomalyArmor Support | |-----------------|----------------------| | **Art. 30 - Records of Processing** | Asset catalog documents data locations | | **Art. 32 - Security of Processing** | Change monitoring detects unauthorized modifications | | **Art. 33 - Breach Notification** | Alerts can notify of potential data issues | | **Art. 17 - Right to Erasure** | Track when PII columns are removed | **Recommended Setup:** - Tag all PII columns - Alert on any PII column changes - Export monthly PII inventory reports ### SOX | SOX Requirement | AnomalyArmor Support | |-----------------|----------------------| | **Section 302 - Financial Controls** | Monitor financial data tables | | **Section 404 - Internal Controls** | Document all schema changes | | **Change Management** | Audit trail of modifications | **Recommended Setup:** - Monitor all financial data tables - Alert on any schema changes to financial data - Weekly export of change reports ### HIPAA | HIPAA Requirement | AnomalyArmor Support | |------------------|----------------------| | **Access Controls** | Document what systems access PHI | | **Audit Controls** | Log all data access and changes | | **Integrity Controls** | Detect unauthorized modifications | **Recommended Setup:** - Tag all PHI columns - Alert immediately on PHI changes - Enable extended audit log retention ## Reporting for Auditors ### Monthly Compliance Report Generate monthly reports showing: 1. **Asset Inventory**: All monitored tables and columns 2. **Change Summary**: Schema changes in the period 3. 
**Classification Status**: PII and sensitive data inventory 4. **Alert Summary**: Alerts fired and responses ### Quarterly Review Prepare for quarterly reviews: 1. Export schema change history for quarter 2. Export classification inventory 3. Export audit logs 4. Document any incidents and responses 5. Review and update classifications ### Annual Audit Package For annual audits, compile: - [ ] Complete asset inventory - [ ] 12 months of schema change history - [ ] Classification inventory with evidence - [ ] Audit logs for the year - [ ] Incident response documentation - [ ] Evidence of regular monitoring (alert history) ## Alert Rules for Compliance ### PII Monitoring | Field | Value | |-------|-------| | **Rule** | PII Data Changes | | **Event** | Schema Change Detected | | **Classification** | PII, Personal Data | | **Conditions** | Any change | | **Destinations** | Email dpo@company.com, Email compliance@company.com, Slack #data-governance | ### Financial Data Monitoring | Field | Value | |-------|-------| | **Rule** | Financial Data Changes | | **Event** | Schema Change Detected | | **Scope** | Classification: Financial, OR Assets: *revenue*, *payment*, *transaction* | | **Conditions** | Any change | | **Destinations** | Email finance-compliance@company.com, Slack #finance-data | ### Unauthorized Access Detection | Field | Value | |-------|-------| | **Rule** | Discovery Failures | | **Event** | Discovery Failed | | **Scope** | All data sources | | **Destinations** | Email security@company.com, Slack #security-alerts | > â„šī¸ Repeated failures may indicate permission changes or unauthorized access attempts. ## Best Practices ### Documentation 1. **Tag everything sensitive**: Don't miss PII or financial columns 2. **Regular reviews**: Quarterly review of classifications 3. **Export regularly**: Don't wait for audits to export data ### Monitoring 1. **Alert on all sensitive data changes**: Better to know than miss 2. 
**Include compliance team on alerts**: They need visibility 3. **Document incident responses**: Keep records of how you responded ### Retention 1. **Extended retention for regulated data**: Match your regulatory requirements 2. **Backup exports**: Keep copies outside AnomalyArmor 3. **Immutable storage**: Use write-once storage for audit exports ## Checklist For compliance readiness: - [ ] All databases connected and discovered - [ ] PII columns identified and tagged - [ ] Financial data identified and tagged - [ ] Health data identified and tagged (if applicable) - [ ] Alert rules for classified data changes - [ ] Compliance team added to alerts - [ ] Export schedule established - [ ] Retention policy configured - [ ] Audit log access documented ## Common Questions ### How does AnomalyArmor help with GDPR compliance? AnomalyArmor supports GDPR Articles 30, 32, and 33 by cataloging where personal data lives, tracking every schema change, and alerting on modifications to **PII**-tagged columns. Tag PII columns in the **Classification** tab and create an alert rule that routes any change to your DPO. See the [GDPR table](#gdpr) for the full mapping. ### How long are audit logs retained? Standard plans retain audit logs for **90 days**. Enterprise plans support configurable retention up to **7 years** to match regulatory requirements like SOX and HIPAA. Contact support if you need extended retention beyond the default. ### Can I export schema change history for an auditor? Yes. Open the asset's **Schema History** tab, click **Export**, and choose CSV, JSON, or PDF over any date range. The export includes change type, detection timestamp, discovery run ID, and before/after values, which is usually enough to satisfy a SOX or HIPAA change-management review. ### How do I tag PII columns so I get alerted when they change? In the asset view, open the **Classification** tab, select the columns, and apply the built-in `PII` tag (or a custom tag). 
Then create an alert rule scoped to that classification with event `Schema Change Detected` and route it to your compliance channel. Repeat for `Financial`, `Health`, or custom classifications. ### Does AnomalyArmor store the actual PII values? No. AnomalyArmor reads metadata and schema, not row data. We record that a column named `ssn` existed and was removed, never the values inside it. That keeps the audit trail useful without creating a new compliance surface to defend. ## Related Resources - **[Data Classification](/data-classification/overview)**: Tagging and classification features - **[Security Overview](/security/overview)**: Security and compliance documentation --- # Upload Lineage Data > Import data lineage from dbt manifests, dbt Cloud, or manual definitions AnomalyArmor can visualize how your data flows from source to destination, but it needs lineage data to work with. This guide covers three ways to get lineage into AnomalyArmor: 1. **Upload a dbt manifest.json** file (most common) 2. **Sync from dbt Cloud** automatically 3. **Define lineage manually** via the API ## Option 1: Upload a dbt manifest.json If you use dbt, the fastest way to populate lineage is uploading your `manifest.json` file. This file contains your full DAG, including all models, sources, seeds, and their dependencies. 
### Generate the manifest

Run one of these dbt commands to produce `target/manifest.json`:

```bash
# Any of these generates a manifest.json in target/
dbt parse     # Fastest, parses without compiling
dbt compile   # Compiles SQL, also generates manifest
dbt run       # Full run, also generates manifest
```

### Upload via the API

Send the manifest as multipart form data (form field `file`) in a `POST` to `/api/v1/assets/{asset_id}/lineage/upload`, authenticated with a bearer API key. The CI/CD workflow later in this section shows a complete `curl` call.

### What gets imported

AnomalyArmor parses the `nodes` and `parent_map` from your manifest to extract:

- **Models** (transformations in your dbt project)
- **Sources** (raw tables dbt reads from)
- **Seeds** (CSV files loaded by dbt)
- **Parent-child relationships** between all of the above

The `sync_to_catalog` parameter (default: `true`) also triggers an asset discovery job so your dbt models appear as assets in the catalog.

### Response

```json
{
  "data": {
    "asset_id": "550e8400-e29b-41d4-a716-446655440000",
    "sync_stats": {
      "nodes_created": 42,
      "nodes_updated": 8,
      "edges_created": 67,
      "edges_updated": 3
    },
    "manifest_metadata": {
      "generated_at": "2025-03-15T10:30:00Z",
      "dbt_version": "1.7.4",
      "project_name": "my_analytics"
    },
    "catalog_sync_job_id": "job-uuid-here"
  }
}
```

### Automate uploads in CI/CD

Add a manifest upload step after your dbt run completes:

```yaml
# .github/workflows/dbt.yml
jobs:
  dbt-run:
    steps:
      - name: Run dbt
        run: dbt run
      - name: Upload lineage to AnomalyArmor
        run: |
          curl -X POST \
            "https://api.anomalyarmor.ai/api/v1/assets/$ASSET_ID/lineage/upload" \
            -H "Authorization: Bearer $ARMOR_API_KEY" \
            -F "file=@target/manifest.json"
        env:
          ARMOR_API_KEY: ${{ secrets.ARMOR_API_KEY }}
          ASSET_ID: ${{ vars.ARMOR_ASSET_ID }}
```

## Option 2: Sync from dbt Cloud

If you use dbt Cloud, AnomalyArmor can fetch the manifest directly from your dbt Cloud account. No file upload needed.
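A sync call could be assembled as in the sketch below. The endpoint path (`/api/v1/assets/{asset_id}/lineage/dbt-cloud/sync`) and the three parameters (`account_id`, `api_token`, `job_id`) are the ones this guide names; sending them as a JSON body is an assumption, and `build_dbt_cloud_sync_request` is a hypothetical helper name. The function only builds the request so you can inspect it before sending.

```python
import json
import urllib.request

API_BASE = "https://api.anomalyarmor.ai"


def build_dbt_cloud_sync_request(asset_id, api_key, account_id, api_token, job_id):
    """Build (but do not send) a dbt Cloud lineage sync request.

    Assumption: the dbt Cloud parameters are sent as a JSON body.
    """
    url = f"{API_BASE}/api/v1/assets/{asset_id}/lineage/dbt-cloud/sync"
    body = json.dumps(
        {"account_id": account_id, "api_token": api_token, "job_id": job_id}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder values; substitute your own IDs and tokens.
    req = build_dbt_cloud_sync_request(
        asset_id="your-asset-id",
        api_key="aa_live_xxx",
        account_id="12345",
        api_token="your-dbt-cloud-service-token",
        job_id="67890",
    )
    print(req.get_method(), req.full_url)
    # To send it for real: urllib.request.urlopen(req)
```

Keeping construction separate from sending makes it easy to log or unit-test the request in CI without touching the network.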
### Finding your dbt Cloud credentials

| Parameter | Where to find it |
|-----------|-----------------|
| `account_id` | dbt Cloud URL: `cloud.getdbt.com/deploy/12345/...`, where `12345` is the account ID |
| `api_token` | dbt Cloud > Account Settings > API Access > Service tokens |
| `job_id` | dbt Cloud > Jobs > select your job > ID in the URL |

> â„šī¸ Use a **service token** with at least the "Read artifacts" permission. Personal tokens work but are tied to individual users.

## Option 3: Define lineage manually

For data sources that are not managed by dbt, you can define lineage nodes and edges directly via the API.

### Create a lineage node

```bash
curl -X POST \
  "https://api.anomalyarmor.ai/api/v1/assets/{asset_id}/lineage/nodes" \
  -H "Authorization: Bearer aa_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "unique_id": "source.crm.customers",
    "name": "customers",
    "resource_type": "source",
    "schema": "crm",
    "database": "production"
  }'
```

### Create a lineage edge

```bash
curl -X POST \
  "https://api.anomalyarmor.ai/api/v1/assets/{asset_id}/lineage/edges" \
  -H "Authorization: Bearer aa_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "parent_unique_id": "source.crm.customers",
    "child_unique_id": "model.analytics.dim_customers",
    "relationship_type": "derives_from"
  }'
```

## Updating lineage

When your dbt project changes, re-upload the manifest. AnomalyArmor reconciles the new manifest with the existing graph:

- New nodes and edges are created
- Existing nodes are updated with new metadata
- Relationships that no longer exist are removed

For a clean reset, delete all lineage from a source first:

```bash
curl -X DELETE \
  "https://api.anomalyarmor.ai/api/v1/assets/{asset_id}/lineage/source/dbt" \
  -H "Authorization: Bearer aa_live_xxx"
```

## Limits

| Constraint | Value |
|-----------|-------|
| Max manifest file size | 50 MB |
| File format | JSON (UTF-8 encoded) |
| Required manifest keys | `nodes`, `parent_map` |

## Common Questions

### Where do I find my dbt `manifest.json` file?
It lives under `target/manifest.json` after you run `dbt parse`, `dbt compile`, or `dbt run`. `dbt parse` is the fastest option because it does not compile SQL or hit your warehouse. See [Generate the manifest](#generate-the-manifest). ### Do I need dbt to upload lineage? No. Option 3 lets you [define lineage manually](#option-3-define-lineage-manually) via the API by creating nodes and edges directly. This is how you model lineage for sources outside dbt, like Fivetran pipelines, Airflow tasks, or bespoke ETL jobs. ### How do I sync lineage from dbt Cloud instead of uploading a file? Call `POST /api/v1/assets/{asset_id}/lineage/dbt-cloud/sync` with your dbt Cloud `account_id`, `api_token`, and `job_id`. AnomalyArmor pulls the latest manifest from the run artifacts directly. Use a **service token** with "Read artifacts" permission, not a personal token. See [Option 2](#option-2-sync-from-dbt-cloud). ### How often should I re-upload the manifest? Re-upload any time your dbt project changes, typically as a CI step after `dbt run` in your deploy pipeline. AnomalyArmor diffs the new manifest against the existing graph: new nodes and edges are created, updated ones are updated, and relationships that no longer exist are removed. ### What is the max manifest size AnomalyArmor accepts? **50 MB**, UTF-8 encoded JSON, with `nodes` and `parent_map` keys required. Most dbt projects fit well under this; if you're hitting it, check whether you're accidentally uploading a `run_results.json` instead of `manifest.json`. 
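The limits above can be checked locally before an upload. The sketch below is a hypothetical pre-flight helper (`check_manifest` is not part of any AnomalyArmor tooling). It tests the three documented constraints: file size, UTF-8 JSON, and the `nodes` and `parent_map` keys. Treating "50 MB" as 50 × 1024 × 1024 bytes is an assumption.

```python
import json
from pathlib import Path

# Assumption: "50 MB" is interpreted as binary megabytes.
MAX_MANIFEST_BYTES = 50 * 1024 * 1024
REQUIRED_KEYS = ("nodes", "parent_map")


def check_manifest(path):
    """Return a list of problems; an empty list means the file looks uploadable."""
    raw = Path(path).read_bytes()
    problems = []
    if len(raw) > MAX_MANIFEST_BYTES:
        problems.append(f"file is {len(raw)} bytes, over the 50 MB limit")
    try:
        manifest = json.loads(raw.decode("utf-8"))
    except UnicodeDecodeError:
        return problems + ["file is not UTF-8 encoded"]
    except json.JSONDecodeError as exc:
        return problems + [f"file is not valid JSON: {exc}"]
    if not isinstance(manifest, dict):
        return problems + ["top-level JSON value is not an object"]
    for key in REQUIRED_KEYS:
        if key not in manifest:
            problems.append(f"missing required key: {key}")
    return problems


if __name__ == "__main__":
    import sys

    for problem in check_manifest(sys.argv[1] if len(sys.argv) > 1 else "target/manifest.json"):
        print("manifest check:", problem)
```

A `run_results.json` uploaded by mistake would fail the key check, since it has neither `nodes` nor `parent_map` at the top level.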
## Next steps - **[Query Lineage](/api/lineage)**: Explore upstream and downstream dependencies - **[Impact Analysis](/api/lineage#use-case-impact-analysis)**: Check downstream impact before schema changes - **[dbt Integration](/integrations/dbt)**: Add quality gates to your dbt workflows - **[AI Agent: Lineage](/ai-agents/skills/lineage)**: Ask natural language questions about data flow --- # Profile Settings > Update your name, email, and account preferences Manage your personal account settings including display name, email, and security. ## Updating Your Profile 1. Click your avatar in the top right corner 2. Select **Settings** 3. Update any of the following: | Field | Description | |-------|-------------| | **Display Name** | How your name appears to teammates | | **Email** | Your login email (requires verification if changed) | | **Avatar** | Upload an image or use your Gravatar | | **Timezone** | Used for displaying timestamps | 4. Click **Save Changes** You'll see "Profile updated" confirmation at the top of the page. ## Changing Your Email Changing your email requires verification: 1. Enter your new email address 2. Click **Save Changes** 3. Check your new email for a verification link 4. Click the link to confirm Until verified, you'll continue using your old email to log in. ## Notification Preferences Control which emails you receive: 1. Go to **Settings** > **Notifications** 2. Toggle notifications on/off: | Notification | Default | Description | |--------------|---------|-------------| | Alert emails | On | Email when alerts fire (if email destination configured) | | Weekly digest | On | Summary of changes and alerts | | Product updates | On | New features and announcements | 3. Click **Save** ## Security ### Changing Your Password 1. Go to **Settings** > **Security** 2. Click **Change Password** 3. Enter your current password 4. Enter your new password (minimum 12 characters) 5. Confirm your new password 6. 
Click **Update Password** You'll be logged out of other sessions and need to log in again. ## Account Recovery ### Forgot Password 1. Go to the login page 2. Click **Forgot password?** 3. Enter your email address 4. Check email for reset link (expires in 1 hour) 5. Click link and set new password ### Locked Out If you can't access your account: | Situation | Solution | |-----------|----------| | Forgot password | Use password reset flow above | | Lost 2FA device | Use backup codes or contact support | | Email no longer accessible | Contact support with identity verification | | Account suspended | Contact support for details | ## Deleting Your Account To delete your personal account: 1. Contact **support@anomalyarmor.ai** 2. Request account deletion 3. We'll confirm and process within 7 days > â„šī¸ If you're the workspace owner, you must transfer ownership or delete the workspace first. ## Related - **[Team Management](/account/team-management)**: Invite members and manage roles - **[Billing](/billing/overview)**: Manage subscription and invoices --- # Team Management > Invite teammates and manage roles and permissions Control who has access to your AnomalyArmor workspace and what they can do. ## Roles and Permissions | Role | Description | |------|-------------| | **Viewer** | Read-only access. Can see assets, alerts, and dashboards. | | **Member** | Can configure alerts and destinations. Cannot manage data sources or team. | | **Admin** | Full access to features and team management. Cannot access billing. | | **Owner** | Full access including billing. One owner per workspace. | > â„šī¸ **Admin vs Owner**: The only difference is billing access. Use Admin for team members who should manage users and data sources but not payment details. 
### Permission Matrix | Permission | Viewer | Member | Admin | Owner | |------------|--------|--------|-------|-------| | View assets and dashboards | Yes | Yes | Yes | Yes | | Configure alerts | No | Yes | Yes | Yes | | Manage data sources | No | No | Yes | Yes | | Invite/remove team members | No | No | Yes | Yes | | Access billing and payments | No | No | No | Yes | ## Seat Limits Your subscription plan determines how many team members you can have: | Plan | Seats | |------|-------| | Starter | 1 | | Growth | 5 | | Professional | 10 | | Enterprise | Unlimited | Pending invitations count toward your seat limit. Upgrade your plan in **Settings > Billing** to add more team members. ## Inviting Team Members 1. Go to **Settings > Users** 2. Click **Invite member** 3. Enter their email address 4. Select a role from the dropdown 5. Click **Send invitation** The invitation email includes a link to join your workspace. Invitations expire after 7 days. > â„šī¸ Only Admins and Owners can invite team members. ### What Happens When They Accept When someone accepts your invitation: 1. If they don't have an AnomalyArmor account, they create one 2. They automatically join your workspace with the assigned role 3. They appear in your team list immediately ## Changing a Member's Role Only the Owner can change roles: 1. Go to **Settings > Users** 2. Click the menu icon (three dots) next to the team member 3. Select a new role from the submenu 4. Change applies immediately > â„šī¸ You cannot change your own role. The Owner role cannot be assigned through this menu. ## Removing Team Members 1. Go to **Settings > Users** 2. Click the menu icon (three dots) next to the team member 3. Click **Remove member** 4. Confirm removal Removed members immediately lose access. Their past actions remain in the system. > âš ī¸ You cannot remove the workspace Owner. You cannot remove yourself. ## Revoking Pending Invitations 1. Go to **Settings > Users** 2. Click the **Pending Invitations** tab 3. 
Click the X button next to the invitation 4. The invitation link stops working immediately ## Troubleshooting ### Invitation not received 1. Check spam/junk folder 2. Verify the email address is correct 3. Ask them to add `alerts@anomalyarmor.com` to their contacts 4. Revoke and resend the invitation ### Cannot invite more members You may have reached your seat limit. Check your current usage in **Settings > Billing** and upgrade if needed. ### Cannot change someone's role Only the Owner can change roles. If you need a role change, contact your workspace Owner. ### Cannot remove someone You cannot remove the Owner. You also cannot remove yourself. Ask the Owner to make changes. --- # Billing Overview > Understand pricing, manage subscriptions, and track usage AnomalyArmor pricing is based on the number of monitored tables. All features are included in every plan. ## Choosing a Plan | If you have... | Choose | Monthly | |----------------|--------|---------| | 1-20 tables | Starter | $99 | | 21-100 tables | Growth | $475 | | 101-300 tables | Professional | $1,188 | | 300+ tables | Enterprise | Custom | Not sure how many tables you have? Connect your database and run discovery. We'll show your count before you commit. ## What's Included Every plan includes: - **All monitoring features**: Schema drift, freshness, metrics, alerts - **All integrations**: Slack, PagerDuty, webhooks, email - **AI Intelligence**: Ask questions, auto-generate descriptions - **Auto-classification**: PII detection, custom tags - **API access**: Full REST API for automation The only difference between plans is capacity (tables and team members). ## Managing Your Subscription ### View Current Plan **Settings** > **Billing** > **Subscription** See your current plan, usage, and renewal date. ### Upgrade 1. Go to **Settings** > **Billing** 2. Click **Upgrade Plan** 3. Select new plan 4. Confirm payment Upgrades take effect immediately. You pay the prorated difference. ### Downgrade 1. 
Go to **Settings** > **Billing** 2. Click **Change Plan** 3. Select lower tier Downgrades take effect at next billing cycle. Your table count must fit within the new plan's limit. ### Cancel 1. Go to **Settings** > **Billing** 2. Click **Cancel Subscription** Access continues until the end of your paid period. ## In This Section - **[Plans & Pricing](/billing/plans)**: Compare plan tiers and features - **[Usage Limits](/billing/usage-limits)**: Understand how usage is measured - **[Billing FAQ](/billing/faq)**: Common billing questions answered ## Common Questions ### How is AnomalyArmor priced? Pricing is based on how many tables you monitor. Every plan ships with every feature: schema drift, freshness, metrics, integrations, API access, AI Intelligence. See [Plans & Pricing](/billing/plans) for tier-by-tier pricing. ### Am I locked into features based on my plan? No. All plans include every feature. The only differences between tiers are capacity (tables, team members) and API rate limits. You never lose a feature by downgrading. ### Where do I see my current plan and usage? Go to **Settings → Billing → Subscription**. You'll see your current plan, table usage versus limit, next renewal date, and buttons to upgrade, downgrade, or cancel. ### How fast does an upgrade take effect? Upgrades are instant. You pay the prorated difference for the remainder of the current billing period, and the new capacity is available immediately. --- # Plans & Pricing > AnomalyArmor subscription tiers Choose the plan that fits your data monitoring needs. ## Plan Comparison | | Starter | Growth | Professional | Enterprise | |---|---------|--------|--------------|------------| | **Monthly** | $99 | $475 | $1,188 | Custom | | **Annual** | $84/mo | $404/mo | $1,010/mo | Custom | | **Tables** | 20 | 100 | 300 | 1000+ | | **Team Members** | 1 | 5 | 10 | Unlimited | | **API Calls** | 10k/mo | 50k/mo | 200k/mo | Unlimited | Save 15% with annual billing. 
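The annual column is consistent with taking 15% off the monthly price and rounding to the nearest dollar. The short check below reproduces that arithmetic; the rounding rule is inferred from the table rather than documented, and `annual_monthly_rate` is a hypothetical helper name.

```python
# Monthly list prices from the plan comparison table.
MONTHLY_PRICES = {"Starter": 99, "Growth": 475, "Professional": 1188}
DISCOUNT_PERCENT = 15


def annual_monthly_rate(monthly_price):
    """Per-month price on annual billing, rounded half-up to whole dollars.

    Integer arithmetic avoids floating-point rounding surprises.
    """
    return (monthly_price * (100 - DISCOUNT_PERCENT) + 50) // 100


if __name__ == "__main__":
    for plan, price in MONTHLY_PRICES.items():
        print(f"{plan}: ${price}/mo monthly, ${annual_monthly_rate(price)}/mo annual")
```

For example, Starter works out as 99 × 0.85 = 84.15, which rounds to the $84/mo shown in the table.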
## Feature Availability All plans include every feature: | Feature | All Plans | |---------|-----------| | Schema drift detection | Included | | Freshness monitoring | Included | | Data quality metrics | Included | | AI Intelligence | Included | | Auto-classification | Included | | All alert destinations | Included | | Full API access | Included | | Report badges | Included | | Custom tags | Included | ## Which Plan Is Right for You? ### Starter ($99/mo) **Best for**: Solo practitioners, single database, getting started - Monitoring one database - Personal project or small startup - Evaluating before team rollout ### Growth ($475/mo) **Best for**: Small data teams, multiple databases - 2-5 team members need access - Multiple databases or environments - Production + staging monitoring ### Professional ($1,188/mo) **Best for**: Larger teams, enterprise data stacks - Complex data infrastructure - Multiple teams need access - High-volume API usage ### Enterprise (Custom) **Best for**: Large organizations, custom requirements - 1000+ tables - Custom integrations - Dedicated support - SLA guarantees - SSO/SCIM [Contact sales](mailto:sales@anomalyarmor.ai) for Enterprise pricing. ## Changing Plans ### Upgrades - Take effect immediately - Pay prorated difference for current period - New limits available instantly ### Downgrades - Take effect at next billing cycle - No refund for current period - Must reduce usage to fit new limits first ### How to Change 1. Go to **Settings** > **Billing** 2. Click **Manage Subscription** 3. Select new plan 4. Confirm ## Trials Trials are available by invitation: a magic link in a cold email, a trial code we share directly, or the [homepage try-it form](https://www.anomalyarmor.ai/). There is no self-serve trial from the signup page without a code and no free tier. 
- 14-day duration - All features unlocked - No credit card required - Workspace pauses after trial until a plan is chosen (no data loss) ## Common Questions ### What counts as a table for plan limits? Any table or view that AnomalyArmor discovers and monitors counts toward your plan limit. System tables are excluded automatically. You control which schemas to include during connection setup. ### Can I add more tables without upgrading my plan? No. Once you hit your plan limit, you must upgrade to monitor additional tables. The practical approach is to start with your most business-critical tables and expand from there. ### What happens if I exceed AnomalyArmor's API rate limits? Requests over the limit return HTTP 429. Upgrade for higher limits, or batch and cache API calls in your integration to reduce request volume. ### Does AnomalyArmor offer discounts? Yes. Annual billing saves 15% vs monthly. Non-profits, early-stage startups, and education customers may qualify for additional discounts; [contact sales](mailto:sales@anomalyarmor.ai) to discuss. ### Is there a free trial? Trials are available by invitation: a magic link in a cold email, a trial code we share directly, or the [homepage try-it form](https://www.anomalyarmor.ai/). There is no self-serve trial from the signup page without a code and no free tier. Trials are 14 days with all features unlocked and no credit card; at the end your workspace pauses until you pick a paid plan. --- # Usage Limits > Understanding how monitor usage is calculated Your plan includes a set number of active monitors. This page explains how we count monitors toward your limit. ## What Counts as an Active Monitor Each individual monitor counts toward your limit: - Each freshness schedule = 1 monitor - Each data quality metric (row count, null %, distinct count, etc.) 
= 1 monitor - Each validity rule (regex, allowed values, format checks) = 1 monitor - Each referential integrity check (orphan detection) = 1 monitor ### Key Rules 1. **Per-item counting** - Each monitor counts separately toward your limit 2. **Multiple monitors on one table** - A table with 1 freshness + 5 metrics + 3 validity rules = 9 monitors 3. **Removing monitors reduces your count** - Delete a metric or schedule and your count decreases by 1 4. **Cross-database** - Monitors from all your connected databases count toward one shared limit ## What's Free (Unlimited) The following features are free and do not count toward your monitor limit: | Feature | Description | |---------|-------------| | Schema discovery | Discovering tables and columns in your databases | | Schema drift monitoring | Alerts when schema changes are detected | | AI descriptions & tags | Auto-generated documentation for your data | | Alert destinations | Slack, email, PagerDuty, webhooks configuration | ## Checking Your Usage Go to **Settings** → **Billing** to see: - Current monitors in use vs. your plan limit - Breakdown by feature type (freshness, data quality, validity) - Which tables have monitors - Which databases they belong to ## Approaching Your Limit When you reach 80% of your monitor limit, you'll see a warning in the billing settings. To add more monitors, you can: 1. **Remove unused monitors** - Delete metrics or schedules you no longer need 2. 
**Upgrade your plan** - Higher tiers include more monitors at a lower per-monitor cost ## Example If you're on the Starter plan (20 monitors) and you: - Add freshness schedule to `orders` table → 1/20 monitors used - Add 3 data quality metrics to `orders` → 4/20 monitors used (1 freshness + 3 metrics) - Add freshness to `customers` table → 5/20 monitors used - Add 2 validity rules to `customers` → 7/20 monitors used - Delete freshness from `orders` → 6/20 monitors used - Delete 1 metric from `orders` → 5/20 monitors used ## Common Questions ### Does schema drift monitoring count toward my monitor limit? No. Schema discovery, schema drift monitoring, AI descriptions, and alert destinations are all free and unlimited. Only active freshness schedules, data quality metrics, validity rules, and referential integrity checks count. See [What's Free](#whats-free-unlimited). ### If I add 5 metrics to one table, does that count as 1 or 5? It counts as 5. Monitors are counted per-item, not per-table. A table with 1 freshness schedule, 3 metrics, and 2 validity rules uses 6 monitors out of your plan's allowance. ### Do monitors across different databases share one pool? Yes. All monitors from all your connected data sources share a single plan-wide limit. You can spend your allocation across one database or spread it across many. ### What happens when I hit my monitor limit? You can't create new monitors until you remove existing ones or upgrade. Existing monitors keep running. You'll see an 80%-used warning in **Settings → Billing** so you can plan ahead. ### How do I lower my monitor usage without upgrading? Delete unused freshness schedules, metrics, validity rules, or referential integrity checks. Each deletion frees up 1 slot instantly. Schema drift monitoring stays free, so you can keep schema coverage on tables where per-row quality isn't critical. --- # Billing FAQ > Common billing questions answered ## Common Questions ### What happens if I exceed my table limit? You can't add new data sources or monitors that would put you over the limit. 
Existing monitoring keeps running uninterrupted. To add more, upgrade your plan from **Settings → Billing**. ### Can I downgrade my plan? Yes. Downgrades take effect at your next billing cycle, not immediately. Before downgrading, reduce your monitored tables to fit the lower plan's limit. ### How does proration work? Upgrades charge the prorated difference immediately for the remainder of the current period. Downgrades do not issue a refund for the current period; the new (lower) rate starts on the next cycle. ### Can I cancel anytime? Yes. Cancel from **Settings → Billing**. Access continues through the end of your already-paid period, then the workspace is paused until you reactivate. There is no free tier. ### How do I get invoices or receipts? Go to **Settings → Billing → Invoices** to download PDF receipts for any past payment. Enterprise customers on invoice billing receive invoices by email from accounting@anomalyarmor.ai. ### Do you offer refunds? We handle refunds case-by-case. Email support@anomalyarmor.ai within 14 days of the charge with your account email and the reason, and we'll respond within 2 business days. ### What payment methods do you accept? Credit cards (Visa, Mastercard, American Express) via Stripe. Enterprise customers can pay by invoice in USD with Net-30 terms. Contact sales@anomalyarmor.ai to set up invoicing. ### Is there a free trial? Trials are available by invitation: a magic link in a cold email, a trial code we share directly, or the [homepage try-it form](https://www.anomalyarmor.ai/). There is no self-serve trial from the signup page without a code and no free tier. Trials are 14 days with all features unlocked and no credit card; at the end your workspace pauses until you pick a paid plan. --- # Security Overview > How AnomalyArmor protects your data with enterprise-grade security controls AnomalyArmor is built with security as a foundational principle, not an afterthought. 
We understand that data observability tools have access to sensitive infrastructure, and we've designed our platform to limit data exposure to metadata and bounded aggregates, and to enforce that boundary at every layer.

## Core Security Principles

### Metadata and Aggregates Only, Never Raw Row Values

The most important security control is what we **don't** access:

| We Access | We Never Access |
|-----------|-----------------|
| Table names | Individual row data |
| Column names and types | `SELECT *` or raw column values |
| Schema metadata | PII content |
| Timestamp values (for freshness) | Business-sensitive records |
| Constraint definitions | Query results beyond aggregates |
| Aggregate statistics (`COUNT`, `MIN`, `MAX`, `SUM`, `AVG`, `PERCENTILE`) | |

When you connect a database, AnomalyArmor queries system catalogs (`information_schema`, `pg_catalog`, etc.) for structure, and runs bounded aggregate queries against your tables for monitoring. It never runs `SELECT *` or row-level reads of sensitive columns.

```sql
-- What we query (metadata + aggregates)
SELECT table_name, column_name, data_type FROM information_schema.columns;
SELECT COUNT(*), MAX(created_at) FROM orders;

-- What we NEVER query (raw rows)
SELECT * FROM users;               -- ❌ Never executed
SELECT email, ssn FROM customers;  -- ❌ Never executed
```

This isn't just policy. It's enforced by our open-source [Query Gateway](/security/query-gateway), which parses and validates every SQL query before execution.
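To make the idea of validating queries before execution concrete, here is a deliberately small sketch of an allowlist check. It is not the Query Gateway's implementation and does not use its API; it is a toy that uses regular expressions where a real gateway would use a SQL parser, and every name in it (`is_allowed`, `split_select_list`) is hypothetical. It accepts only catalog reads and select lists made up entirely of the aggregate functions named in the table above, and rejects anything it does not recognize.

```python
import re

# Aggregate names as listed in the access table above.
AGGREGATES = ("COUNT", "MIN", "MAX", "SUM", "AVG", "PERCENTILE")
CATALOG_PREFIXES = ("information_schema.", "pg_catalog.")

# Toy grammar: a single "SELECT <cols> FROM <table>" statement, nothing else.
_SELECT_RE = re.compile(
    r"^\s*SELECT\s+(?P<cols>.+?)\s+FROM\s+(?P<table>[\w.]+)\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)
_AGG_RE = re.compile(r"^(?:%s)\s*\(.*\)$" % "|".join(AGGREGATES), re.IGNORECASE)


def split_select_list(cols):
    """Split a select list on commas that are not inside parentheses."""
    items, depth, current = [], 0, []
    for ch in cols:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            items.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    items.append("".join(current).strip())
    return items


def is_allowed(sql):
    """Allow catalog reads and aggregate-only selects; reject everything else."""
    match = _SELECT_RE.match(sql)
    if match is None:
        return False  # unrecognized shape: fail closed
    if match.group("table").lower().startswith(CATALOG_PREFIXES):
        return True
    items = split_select_list(match.group("cols"))
    return all(_AGG_RE.match(item) is not None for item in items)


if __name__ == "__main__":
    for query in (
        "SELECT table_name, column_name, data_type FROM information_schema.columns;",
        "SELECT COUNT(*), MAX(created_at) FROM orders;",
        "SELECT * FROM users;",
        "SELECT email, ssn FROM customers;",
    ):
        print("ALLOW" if is_allowed(query) else "DENY ", query)
```

The design choice worth noting is failing closed: a query that does not match the narrow expected shape is denied rather than passed through.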
### Defense in Depth Security is implemented at every layer: ## Compliance & Certifications ### GDPR Compliance For customers in the EU or handling EU data: - Data Processing Agreement (DPA) available - Right to erasure supported - Data portability supported - EU data residency options (contact sales) ### HIPAA Readiness For healthcare organizations: - Business Associate Agreement (BAA) available - Enhanced audit logging - Contact us for healthcare-specific deployment options ## Encryption ### Data in Transit All network traffic is encrypted with **TLS** (1.3 preferred, 1.2 minimum): - Database connections from AnomalyArmor to your infrastructure - Web traffic to the AnomalyArmor dashboard - API calls to AnomalyArmor services - Webhook deliveries to your endpoints **Minimum TLS version**: 1.2 (1.3 preferred) **Cipher suites**: Modern, AEAD-only (AES-GCM, ChaCha20-Poly1305) ### Data at Rest All stored data is encrypted with **AES-256**: | Data Type | Encryption | Key Management | |-----------|------------|----------------| | Database credentials | AES-256-GCM | AWS KMS, automatic rotation | | Asset metadata | AES-256 | Managed by infrastructure | | Audit logs | AES-256 | Immutable storage | | Backups | AES-256 | Separate backup keys | ### Key Management Encryption keys are managed through AWS KMS: - Automatic key rotation every 365 days - Keys never leave the KMS boundary - Separate keys per customer (Enterprise) - Hardware Security Module (HSM) backed ## Network Security ### Connection Architecture AnomalyArmor connects **outbound** to your databases: ### IP Allowlisting AnomalyArmor uses static IP addresses for database connections: ``` Production IPs (US region): 34.xxx.xxx.xxx/32 34.xxx.xxx.xxx/32 # Add these to your security group/firewall ``` > â„šī¸ Contact support for the current list of IP addresses, or view them in **Settings → Security** in your dashboard. 
### VPC Peering (Enterprise) For enhanced network isolation: - Direct VPC peering between your AWS account and AnomalyArmor - No public internet exposure for database connections - Private DNS resolution ### AWS PrivateLink (Enterprise) Connect via AWS PrivateLink for: - Fully private connectivity - No data traversing public internet - Simplified firewall rules ## Authentication & Access Control ### User Authentication AnomalyArmor supports multiple authentication methods: | Method | Availability | Details | |--------|--------------|---------| | Email + Password | All plans | Strong password requirements | | Google SSO | All plans | OAuth 2.0 integration | | SAML 2.0 SSO | Enterprise | Okta, Azure AD, OneLogin, etc. | | Multi-Factor Auth | All plans | TOTP-based (Google Authenticator, etc.) | ### Role-Based Access Control (RBAC) Control who can do what: | Role | Permissions | |------|-------------| | Viewer | View assets, alerts, dashboards | | Editor | Viewer + configure alerts, manage destinations | | Admin | Editor + manage data sources, team members | | Owner | Admin + billing, security settings | ### Session Security - Session tokens expire after 24 hours of inactivity - Sessions invalidated on password change - Concurrent session limits (Enterprise) - IP-based session restrictions (Enterprise) ## Credential Security ### How We Store Your Database Credentials Database credentials are the most sensitive data we handle: 1. **Encryption**: AES-256-GCM with customer-specific keys 2. **Key storage**: AWS KMS with HSM backing 3. **Access logging**: Every credential access is logged 4. **Least privilege**: Only the discovery engine accesses credentials 5. **No visibility**: Support staff cannot view your credentials ### Credential Recommendations > ⚠️ Always create a **read-only, dedicated user** for AnomalyArmor. Never share credentials with production applications.
Recommended setup for PostgreSQL: ```sql -- Create dedicated read-only user CREATE USER anomalyarmor WITH PASSWORD 'secure-password'; -- Grant minimal permissions GRANT CONNECT ON DATABASE your_db TO anomalyarmor; GRANT USAGE ON SCHEMA public TO anomalyarmor; GRANT SELECT ON ALL TABLES IN SCHEMA public TO anomalyarmor; -- For information_schema access (required) -- This is typically granted by default ``` ## Audit Logging ### What We Log Every significant action is logged: | Event Type | Details Logged | |------------|---------------| | Authentication | Login, logout, failed attempts, MFA events | | Data access | Discovery runs, asset views, exports | | Configuration | Rule changes, destination updates, team changes | | Admin actions | User management, security settings | ### Log Retention - **Standard**: 90 days retention - **Enterprise**: Configurable retention, SIEM integration ### Accessing Audit Logs Navigate to **Settings → Audit Log** to view activity: - Filter by user, action type, date range - Export to CSV - API access for SIEM integration (Enterprise) ## Infrastructure Security ### Cloud Provider AnomalyArmor runs on AWS with: - Multi-AZ deployment for high availability - Automated backups with point-in-time recovery - DDoS protection via AWS Shield - Web Application Firewall (WAF) ### Host Security - Hardened AMIs with minimal attack surface - Automatic security patching - No SSH access to production hosts - Immutable infrastructure (replaced, not updated) ### Container Security - Non-root container execution - Read-only file systems where possible - Resource limits enforced - Regular vulnerability scanning ## Incident Response ### Our Commitment In the event of a security incident: 1. **Detection**: 24/7 monitoring with automated alerting 2. **Response**: Incident response team engaged within 1 hour 3. **Communication**: Affected customers notified within 24 hours 4. **Resolution**: Root cause analysis and remediation 5. 
**Post-mortem**: Detailed report shared with affected customers ### Reporting Security Issues If you discover a security vulnerability: - Email: [security@anomalyarmor.ai](mailto:security@anomalyarmor.ai) - Response time: Within 24 hours - Bug bounty: Contact us for details > ⚠️ Please do not disclose security issues publicly until we've had a chance to address them. ## Penetration Testing ### Our Testing - Annual third-party penetration tests - Continuous automated vulnerability scanning - Bug bounty program for responsible disclosure ### Your Testing Enterprise customers may conduct penetration testing: 1. Contact us at least 2 weeks in advance 2. Scope limited to your tenant 3. Provide test dates and methodology 4. Share findings for mutual benefit ## Security FAQ ### Can AnomalyArmor employees see my data? No. Employees cannot access customer credentials or metadata. All access is logged and audited. Support debugging happens with synthetic data only. ### What happens if AnomalyArmor is breached? Your database credentials are encrypted with customer-specific keys. Even in a breach, credentials cannot be decrypted without the KMS keys. We would notify you within 24 hours of any confirmed breach. ### Can I run AnomalyArmor in my own cloud? Self-hosted deployments are available for Enterprise customers with specific compliance requirements. Contact sales for details. ### How do I revoke AnomalyArmor's access? 1. Delete the data source connection in AnomalyArmor 2. Revoke the database user's permissions or delete the user 3. Remove AnomalyArmor IPs from your security group ## Common Questions ### Is AnomalyArmor SOC 2 / GDPR / HIPAA compliant? AnomalyArmor is GDPR-compliant with a DPA available, and HIPAA-ready with a BAA available for Enterprise. SOC 2 Type II is in progress; contact security@anomalyarmor.ai for the current report status. ### Does AnomalyArmor support SSO and MFA? Yes.
Email + password, Google SSO, and TOTP-based MFA are available on all plans. SAML 2.0 SSO (Okta, Azure AD, OneLogin) is available on Enterprise. See [User Authentication](#user-authentication). ### Can I connect AnomalyArmor to a database that isn't publicly reachable? Yes, on Enterprise. VPC peering and AWS PrivateLink let AnomalyArmor reach databases without a public endpoint, so traffic never traverses the public internet. See [Network Security](#network-security). ### How are database credentials protected if AnomalyArmor is breached? Credentials are encrypted with AES-256-GCM using customer-specific keys managed by AWS KMS with HSM backing. Even in a breach, credentials cannot be decrypted without the KMS keys, which never leave the KMS boundary. ### How do I revoke AnomalyArmor's access to my database? Delete the data source connection in AnomalyArmor, then revoke or delete the database user on your side, and remove our IPs from your security group. That gives you a belt-and-suspenders revocation. ### Can I run penetration tests against AnomalyArmor? Enterprise customers can run penetration tests against their own tenant with 2 weeks' notice. Email security@anomalyarmor.ai with scope, dates, and methodology. We also run annual third-party pen tests and a bug bounty for responsible disclosure. ## Related Pages - **[Query Gateway](/security/query-gateway)**: Open-source SQL security layer - **[Data Handling](/security/data-handling)**: What data we access and how we handle it - **[Data Retention](/security/data-retention)**: How long we keep your data --- # Query Gateway > How AnomalyArmor enforces SQL access restrictions with an open-source security layer AnomalyArmor uses an open-source SQL security layer called the **Query Gateway** to enforce strict access controls on every query we run against your database. This page explains how it works and how you can verify our security claims. ## What is the Query Gateway? 
The Query Gateway is a SQL validation layer that parses and validates every query before execution. It's the enforcement mechanism behind our "metadata and aggregates only" promise. ### Key Security Properties | Property | Description | |----------|-------------| | **Fail-closed** | If a query cannot be parsed, it is blocked. No exceptions. | | **Pre-execution validation** | Queries are validated before reaching your database | | **Recursive checking** | Subqueries and CTEs are validated against the same rules | | **Comment stripping** | SQL comments are removed to prevent obfuscation attacks | | **Open source** | Full source code available for your security team to audit | ## Access Levels The Query Gateway enforces three distinct permission tiers. AnomalyArmor uses different access levels depending on the operation: | Level | Purpose | Permitted | Blocked | |-------|---------|-----------|---------| | **Schema Only** | Metadata queries | System tables (`information_schema`, `pg_catalog`, etc.) | Any row-level data | | **Aggregates** | Statistics only | `COUNT(*)`, `AVG`, `SUM`, `MIN`, `MAX` | Raw column values, `SELECT *` | | **Full** | Complete read access | Any valid `SELECT` | (unrestricted) | ### What AnomalyArmor Uses | Feature | Access Level | Example Query | |---------|--------------|---------------| | Schema discovery | Schema Only | `SELECT table_name FROM information_schema.tables` | | Freshness monitoring | Aggregates | `SELECT MAX(updated_at) FROM orders` | | Row counts | Aggregates | `SELECT COUNT(*) FROM users` | > ℹ️ AnomalyArmor does not use the `full` access level. We only query metadata (`schema_only`) and aggregate statistics (`aggregates`) for freshness and row-count monitoring.
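To make the fail-closed idea concrete, here is a deliberately simplified sketch of an allowlist validator for the two access levels AnomalyArmor uses. It is not the Query Gateway's implementation: the real gateway parses SQL into an AST with sqlglot, while this toy uses regular expressions and accepts only two narrow query shapes. The names `strip_comments` and `is_allowed` are hypothetical, and whether the real levels nest is not covered here.

```python
import re

_IDENT = r"[A-Za-z_][A-Za-z0-9_]*"
_AGG_NAMES = "|".join(("COUNT", "AVG", "SUM", "MIN", "MAX"))
_SCHEMA_NAMES = "|".join(("information_schema", "pg_catalog"))
_AGG_CALL = rf"(?:{_AGG_NAMES})\(\s*(?:\*|{_IDENT})\s*\)"

# SELECT agg(col)[, agg(col)...] FROM [schema.]table
_AGG_QUERY = re.compile(
    rf"SELECT\s+{_AGG_CALL}(?:\s*,\s*{_AGG_CALL})*"
    rf"\s+FROM\s+{_IDENT}(?:\.{_IDENT})?\s*;?",
    re.IGNORECASE,
)
# SELECT col[, col...] FROM <system schema>.table
_CATALOG_QUERY = re.compile(
    rf"SELECT\s+{_IDENT}(?:\s*,\s*{_IDENT})*"
    rf"\s+FROM\s+(?:{_SCHEMA_NAMES})\.{_IDENT}\s*;?",
    re.IGNORECASE,
)


def strip_comments(sql):
    """Remove -- and /* */ comments (naive: ignores string literals)."""
    return re.sub(r"--[^\n]*|/\*.*?\*/", " ", sql, flags=re.DOTALL)


def is_allowed(sql, level):
    """Allow only queries that fully match a known-safe shape."""
    cleaned = strip_comments(sql).strip()
    if level == "schema_only":
        return _CATALOG_QUERY.fullmatch(cleaned) is not None
    if level == "aggregates":
        return _AGG_QUERY.fullmatch(cleaned) is not None
    return False  # any other level is blocked in this sketch (fail closed)


for query, level in [
    ("SELECT table_name FROM information_schema.tables", "schema_only"),
    ("SELECT COUNT(*) FROM orders", "schema_only"),
    ("SELECT MAX(updated_at) FROM orders", "aggregates"),
    ("SELECT email FROM users", "aggregates"),
]:
    print(level, "|", query, "|", "allow" if is_allowed(query, level) else "block")
```

The design point is the default: anything that does not match an approved shape, including input the checker does not understand, is rejected rather than passed through.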
### Schema Only Mode In schema_only mode, queries can only access system catalogs: ```sql -- Allowed SELECT table_name, column_name, data_type FROM information_schema.columns; -- Blocked (not a system table) SELECT * FROM users; -- Blocked (even aggregates on user tables) SELECT COUNT(*) FROM orders; ``` ### Aggregates Mode In aggregates mode, queries can retrieve statistical information but never raw values: ```sql -- Allowed SELECT MAX(created_at) FROM orders; SELECT COUNT(*) FROM users; SELECT AVG(amount) FROM transactions; -- Blocked (returns raw values) SELECT email FROM users; SELECT * FROM orders LIMIT 1; -- Blocked (window functions can leak row data) SELECT ROW_NUMBER() OVER (ORDER BY id) FROM users; ``` ## Open Source Verification The Query Gateway is fully open source under the Apache 2.0 license. Your security team can: 1. **Audit the code** - Review exactly how queries are validated 2. **Run the test suite** - 97+ tests covering security edge cases 3. **Verify our claims** - See the exact rules enforced at each access level ### Running the Tests Yourself ```bash # Clone the repository git clone https://github.com/anomalyarmor/anomalyarmor-query-gateway.git cd anomalyarmor-query-gateway # Install dependencies pip install -e ".[dev]" # Run the test suite pytest -v ``` ## Technical Implementation ### SQL Parsing The gateway uses [sqlglot](https://github.com/tobymao/sqlglot) to parse queries into an Abstract Syntax Tree (AST). This enables: - Dialect-aware parsing (PostgreSQL, MySQL, Databricks, ClickHouse, etc.) 
- Accurate identification of accessed tables and columns - Detection of nested queries and CTEs ### Validation Process ### Supported Databases The gateway supports all databases that AnomalyArmor connects to: - PostgreSQL - MySQL - Databricks - ClickHouse - Snowflake - SQL Server - BigQuery - Redshift ## Audit Logging Every query validation is logged with: - Query text - Access level applied - Allow/block decision - Rejection reason (if blocked) - Metadata (asset ID, user ID, timestamp) These logs are available in your audit trail (Enterprise plans). ## Common Questions ### Can I run the Query Gateway myself? The gateway is open source, but AnomalyArmor runs it as part of our managed service. Self-hosted deployments are available for Enterprise customers. ### What happens if a query is blocked? Blocked queries are logged and never reach your database. The discovery job reports an error, which our engineering team investigates. ### Can the Query Gateway be bypassed? No. All database queries from AnomalyArmor pass through the gateway. There is no code path that executes queries directly. ### How do I verify AnomalyArmor is using the Query Gateway? Check your database query logs: you will only see metadata queries (against system catalogs) and aggregate queries (`COUNT`, `AVG`, `MIN`, `MAX`). Never `SELECT *` or raw row reads. You can also review the open-source gateway code directly. ## Related Pages - **[Security Overview](/security/overview)**: Full security documentation - **[Data Handling](/security/data-handling)**: What data we access and store --- # Data Handling > What data AnomalyArmor accesses and how we protect it The most common security question: "What data does AnomalyArmor actually see?" Here's the complete answer. ## What We Access AnomalyArmor queries **metadata and aggregate statistics only**, never your actual row data. Every query is validated by our open-source [Query Gateway](/security/query-gateway) before execution. 
| We Access | We Never Access | |-----------|-----------------| | Table names | Row data | | Column names | Actual values in columns | | Data types | PII content | | Timestamps (for freshness) | Query results | | Row counts | Business-sensitive data | | Constraint definitions | File contents | ### The Queries We Run During discovery, we query system catalogs: ```sql -- PostgreSQL example SELECT table_name, column_name, data_type FROM information_schema.columns WHERE table_schema NOT IN ('pg_catalog', 'information_schema'); -- For freshness, we query MAX of timestamp columns SELECT MAX(updated_at) FROM your_table; ``` We **never** run: ```sql SELECT * FROM your_table; -- ❌ Never SELECT email FROM users; -- ❌ Never ``` ### Freshness Monitoring For freshness checks, we query the maximum value of your designated timestamp column: ```sql SELECT MAX(created_at) FROM orders; -- Returns: 2024-01-15 08:30:00 -- We see: "Last update was 30 minutes ago" -- We don't see: Any actual order data ``` ## Data Storage ### What We Store | Data Type | Location | Encryption | Retention | |-----------|----------|------------|-----------| | Database credentials | AWS (encrypted) | AES-256-GCM | Until you delete | | Schema metadata | AWS RDS | AES-256 | Account lifetime | | Change history | AWS RDS | AES-256 | 90 days | | Audit logs | AWS S3 | AES-256 | 1 year | ### What We Don't Store - Your actual data values - Query results beyond metadata - Copies of your database ## Data Flow 1. Discovery engine connects via TLS 1.3 2. Runs metadata queries against system catalogs 3. Receives only schema information 4. Stores encrypted in our infrastructure 5. 
Your actual data never leaves your database ## Compliance Implications ### For GDPR - We don't process personal data (only metadata) - DPA available if required - Data deletion on account closure ### For HIPAA - No PHI accessed - BAA available for healthcare customers - Enhanced audit logging available ## Verification Want to verify what we access? You have multiple options: ### Check Your Database Logs ```sql -- PostgreSQL: Enable query logging ALTER SYSTEM SET log_statement = 'all'; SELECT pg_reload_conf(); -- Then filter the server log for the AnomalyArmor user (run this from a shell, not psql; the log path depends on your installation): -- grep 'anomalyarmor' /var/log/postgresql/postgresql.log ``` You'll see only system catalog queries (such as `information_schema`) and aggregate queries such as timestamp `MAX` and `COUNT`. ### Audit the Query Gateway Our [Query Gateway](https://github.com/anomalyarmor/anomalyarmor-query-gateway) is open source. Your security team can: 1. Review the source code 2. Run the 97+ security tests 3. Verify exactly which queries are allowed at each access level ### Can AnomalyArmor employees see my data? No. We can't see your data because we never access it. Support staff can see metadata (table names, column names) to help troubleshoot, but never actual data values. ### What if I have a column named 'password'? We see the column name "password" exists and its data type (e.g., varchar). We never see the actual password values stored in that column. ### How do I know you're not lying? 1. Audit our open-source [Query Gateway](https://github.com/anomalyarmor/anomalyarmor-query-gateway) 2. Check your database query logs 3. Use a read-only database user with only `SELECT` on system catalogs 4. Run a network packet capture during discovery ### Does AnomalyArmor ever read the actual values stored in my tables? No. Every query is restricted to system catalogs and bounded aggregates (`COUNT`, `MAX`, `MIN`, `AVG`, `SUM`). Raw reads like `SELECT *` or `SELECT email FROM users` are blocked by the open-source [Query Gateway](/security/query-gateway) before they ever reach your database.
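One way to check this against your own PostgreSQL logs is a small first-pass filter like the sketch below. It assumes `log_statement = 'all'` is enabled, that each logged query appears on a line containing `statement: <sql>`, and that your `log_line_prefix` includes the user name so the `anomalyarmor` user can be matched. The function name `unexpected_statements` and both patterns are illustrative assumptions, and the catalog check is a heuristic, so treat a clean result as a starting point rather than proof.

```python
import re
import sys

# Text after "statement:" in a PostgreSQL log line (with log_statement = 'all').
STATEMENT = re.compile(r"statement:\s*(?P<sql>.+)$")
# Heuristic: the query reads from a system catalog schema.
CATALOG = re.compile(r"\bFROM\s+(?:information_schema|pg_catalog)\.", re.IGNORECASE)
# Heuristic: the select list contains only simple aggregate calls.
AGGREGATE_ONLY = re.compile(
    r"^SELECT\s+(?:(?:COUNT|MIN|MAX|SUM|AVG)\([^()]*\)\s*,?\s*)+FROM\b",
    re.IGNORECASE,
)


def unexpected_statements(log_lines, user="anomalyarmor"):
    """Return logged statements for `user` that are neither catalog nor aggregate queries."""
    flagged = []
    for line in log_lines:
        if user not in line:
            continue  # another user, or the prefix does not include the user name
        match = STATEMENT.search(line)
        if match is None:
            continue  # not a statement line
        sql = match.group("sql").strip()
        if CATALOG.search(sql) or AGGREGATE_ONLY.match(sql):
            continue
        flagged.append(sql)
    return flagged


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python audit_log.py /path/to/postgresql.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for statement in unexpected_statements(handle):
            print(statement)
```

Anything the script prints deserves a manual look; an empty result means no logged statement for that user fell outside the two expected shapes.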
### If I have a PII column like 'ssn' or 'email', what does AnomalyArmor see? We see the column name and its data type. We do not see a single row of the column's contents. Column names help auto-classify sensitivity, but the values never leave your database. ### Exactly which SQL statements does AnomalyArmor run against my database? Metadata queries against `information_schema` and `pg_catalog` for structure, plus aggregate queries like `SELECT MAX(updated_at) FROM your_table` for freshness. See the [queries we run](#the-queries-we-run) section above for concrete examples. ### How can I independently verify what AnomalyArmor queries? Turn on query logging for the `anomalyarmor` user and audit the logs, or run a packet capture during discovery. The [Query Gateway](https://github.com/anomalyarmor/anomalyarmor-query-gateway) is open source with 97+ security tests you can run yourself. See [Verification](#verification) above. ### Can AnomalyArmor employees see my data? No. We can't see your data because we never access it. Support staff can see metadata (table and column names) to help troubleshoot, but never actual values. Every credential access is logged. ### Does AnomalyArmor store copies of my tables? No. We store schema metadata (table and column names, types, constraints), change history, and timestamp aggregates, not the underlying rows. See [What We Store](#what-we-store) for the full list with encryption and retention details. ## See Also - **[Query Gateway](/security/query-gateway)**: Open-source SQL security layer - **[Security Overview](/security/overview)**: Full security documentation - **[Data Retention](/security/data-retention)**: How long we keep data --- # Data Retention > How long AnomalyArmor keeps your metadata and how to request deletion Understanding what data we keep and for how long. 
## Retention Periods | Data Type | Retention | Notes | |-----------|-----------|-------| | Schema metadata | Account lifetime | Current state of your assets | | Schema change history | 90 days | Historical diffs | | Alert history | 90 days | Past alert triggers | | Audit logs | 1 year | Who did what, when | | Database credentials | Until deleted | Encrypted, you control | ## Extended Retention (Enterprise) Enterprise customers can customize retention: | Data Type | Standard | Enterprise Options | |-----------|----------|-------------------| | Change history | 90 days | Up to 2 years | | Alert history | 90 days | Up to 2 years | | Audit logs | 1 year | Up to 7 years | Contact your account manager to adjust retention periods. ## What Happens When Data Expires When data exceeds its retention period: 1. **Automatic deletion** - No action required from you 2. **Soft delete first** - Data marked for deletion 3. **Permanent removal** - Purged from all systems within 30 days 4. **Backup cleanup** - Removed from backups on next rotation ## Account Deletion When you close your account: 1. All data marked for deletion immediately 2. Schema metadata, history, and credentials deleted 3. Audit logs retained for 30 days (legal/compliance) 4. Complete removal within 30 days of account closure To delete your account: 1. Go to **Settings** → **Account** 2. Click **Delete Account** 3. Confirm deletion 4. You'll receive confirmation email when complete ## Data Export Before deletion, export your data: 1. Go to **Settings** → **Data Export** 2. Select what to export: - Asset inventory (CSV) - Change history (CSV) - Alert history (CSV) 3. Click **Export** 4. Download within 7 days ## GDPR Right to Erasure EU customers can request data deletion: 1. Email privacy@anomalyarmor.ai 2. Include your account email 3. Specify what data to delete 4. 
We'll respond within 30 days We'll delete: - All metadata associated with your account - Audit logs (after legal retention period) - Any PII in our systems We'll retain: - Anonymized usage statistics - Billing records (legal requirement) ## Compliance Certifications Our retention practices support: - **GDPR** - DPA available - **HIPAA** - BAA available ## Common Questions ### How long does AnomalyArmor keep my schema change history? 90 days on Starter, Growth, and Professional plans. Enterprise customers can extend to up to 2 years. See [Retention Periods](#retention-periods) for the full breakdown. ### How long are audit logs retained? 1 year by default. Enterprise customers can configure retention up to 7 years for compliance programs that require longer windows. Audit logs are stored in immutable encrypted storage. ### What happens to my data when I close my account? Metadata, change history, and database credentials are marked for deletion immediately, and fully purged within 30 days. Audit logs are retained for 30 days after closure for legal and compliance purposes, then deleted. ### How do I submit a GDPR right-to-erasure request? Email privacy@anomalyarmor.ai with your account email and what you want deleted. We respond within 30 days. Anonymized usage statistics and billing records are kept as required by law. See [GDPR Right to Erasure](#gdpr-right-to-erasure). ### Can I export my data before deletion? Yes. Go to **Settings → Data Export** and you can download your asset inventory, change history, and alert history as CSV. The export link is available for 7 days after it's generated. ### When does deleted data actually leave your backups? Retention deletions are soft-deleted first, permanently removed from primary storage within 30 days, and purged from backups on the next rotation cycle. Account closures follow the same 30-day hard-delete window. 
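The retention arithmetic above can be sketched as a small date calculation. This is an illustrative sketch only: the values come from the standard-plan rows of the Retention Periods table, "1 year" is approximated as 365 days, and backup rotation is not modeled because its interval is not stated. The helper name `retention_dates` is hypothetical.

```python
from datetime import date, timedelta

# Standard-plan retention in days ("1 year" approximated as 365: assumption).
RETENTION_DAYS = {
    "change_history": 90,
    "alert_history": 90,
    "audit_log": 365,
}
# Expired data is purged from primary systems within this many days.
PURGE_WINDOW_DAYS = 30


def retention_dates(kind, created):
    """Return (expiry date, latest purge date) for a record created on `created`."""
    expires = created + timedelta(days=RETENTION_DAYS[kind])
    purge_by = expires + timedelta(days=PURGE_WINDOW_DAYS)
    return expires, purge_by


expires, purge_by = retention_dates("change_history", date(2024, 1, 1))
print("expires:", expires.isoformat(), "purged by:", purge_by.isoformat())
```

Enterprise plans with extended retention would only change the values in `RETENTION_DAYS`; the purge window logic stays the same.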
## See Also - **[Data Handling](/security/data-handling)**: What data we access - **[Security Overview](/security/overview)**: Full security documentation --- # Common Setup Mistakes > Avoid these common configuration errors when setting up AnomalyArmor Avoid these pitfalls when setting up AnomalyArmor. Most users encounter at least one of these. Here's how to prevent and fix them. ## Database Connection Mistakes ### Wrong Port Number **The mistake**: Using the wrong default port for your database. | Database | Correct Port | Common Mistake | |----------|--------------|----------------| | PostgreSQL | 5432 | 3306 (MySQL) | | ClickHouse HTTPS | 8443 | 9000 (native) | | ClickHouse HTTP | 8123 | 9000 (native) | | Databricks | 443 | N/A | **How to fix**: Double-check the port in your database settings or cloud console. ### SSL Not Enabled **The mistake**: Not enabling SSL when your database requires it. Most cloud databases require SSL: - AWS RDS/Aurora: Required by default - Supabase: Required - ClickHouse Cloud: Always HTTPS - Google Cloud SQL: Required by default **How to fix**: Set **SSL Mode** to `require` in your connection settings. ### Forgot to Allowlist IPs **The mistake**: Firewall or security group blocks AnomalyArmor. **Symptoms**: - "Connection refused" error - "Connection timed out" error **How to fix**: 1. Go to **Settings → Security** to find AnomalyArmor IPs 2. Add them to your security group/firewall rules 3. Test the connection again ### Using Wrong Endpoint (AWS) **The mistake**: Using the wrong RDS/Aurora endpoint. ``` Common mistakes: ✗ mydb.rds.amazonaws.com (missing region) ✓ mydb.us-east-1.rds.amazonaws.com ✗ Writer endpoint for read-only operations ✓ Reader endpoint (recommended for monitoring) ``` **How to fix**: Copy the endpoint directly from AWS Console. ## Permission Mistakes ### Insufficient Database Permissions **The mistake**: User can connect but can't read metadata. 
**Symptoms**: - "Permission denied" error - Discovery finds no tables **Required permissions**: ```sql -- PostgreSQL GRANT CONNECT ON DATABASE your_db TO anomalyarmor; GRANT USAGE ON SCHEMA public TO anomalyarmor; GRANT SELECT ON ALL TABLES IN SCHEMA public TO anomalyarmor; -- ClickHouse GRANT SELECT ON system.* TO anomalyarmor; GRANT SELECT ON your_database.* TO anomalyarmor; GRANT SHOW ON *.* TO anomalyarmor; ``` ### Forgetting Schema Permissions **The mistake**: Granting table access but not schema access. ```sql -- Wrong: Has table access but not schema access GRANT SELECT ON ALL TABLES IN SCHEMA analytics TO anomalyarmor; -- Right: Add schema usage GRANT USAGE ON SCHEMA analytics TO anomalyarmor; GRANT SELECT ON ALL TABLES IN SCHEMA analytics TO anomalyarmor; ``` ### Not Granting Access to Future Tables **The mistake**: New tables aren't automatically monitored. ```sql -- PostgreSQL: Grant access to future tables ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO anomalyarmor; ``` Without this, you'll need to manually grant access each time a table is created. ## Alert Configuration Mistakes ### Alerting on Everything **The mistake**: Creating broad rules that alert on every change in every environment. **Result**: Alert fatigue. Your team ignores all alerts. **Better approach**: 1. Start with 5-10 critical production tables 2. Alert only on breaking changes (column removed, table removed) 3. Exclude dev/test environments 4. Add coverage gradually ### Same Destination for All Severities **The mistake**: Sending all alerts to Slack or all alerts to PagerDuty. **Result**: Important alerts get buried or you get paged for trivial issues. **Better approach**: ``` Critical (column removed) → PagerDuty + Slack #incidents High (type changed) → Slack #data-alerts Low (column added) → Email digest ``` ### Too-Tight Freshness SLAs **The mistake**: Setting freshness SLA to match exact expected timing. 
``` ETL runs hourly SLA set to 1 hour ETL occasionally takes 65 minutes Result: Constant false positives ``` **Better approach**: ``` ETL runs hourly SLA set to 2 hours (2x expected) Buffer for normal variation Tighten later if needed ``` ### Not Disabling Rules During Maintenance **The mistake**: Getting paged during planned maintenance. **How to fix**: Before maintenance: 1. Go to **Alerts → Rules** 2. Toggle OFF relevant rules 3. Set a reminder to re-enable 4. Toggle rules back ON after maintenance ## Discovery Configuration Mistakes ### Discovery Too Infrequent **The mistake**: Daily discovery for production databases. **Result**: Schema changes aren't caught until the next day. After pipelines fail. **Better approach**: | Environment | Discovery Frequency | |-------------|---------------------| | Production | Hourly | | Staging | Every 6 hours | | Development | Daily | ### Not Scheduling Before Pipeline Runs **The mistake**: Discovery runs at random times, not before pipelines. **Better approach**: ``` Pipeline runs at: 3:00 AM Discovery should run at: 2:00 AM This way, you get alerts before the pipeline runs. ``` ### Including Unnecessary Schemas **The mistake**: Monitoring schemas that change constantly. ``` Schemas generating noise: - pg_temp_* (PostgreSQL temp tables) - test_* (test tables) - *_backup (backup copies) - dev_* (development) ``` **How to fix**: Add schema filters to exclude noisy schemas. ## Freshness Monitoring Mistakes ### Wrong Timestamp Column **The mistake**: Using `created_at` for tables that get updated. | Table Type | Right Column | |------------|--------------| | Append-only (events, logs) | `created_at` | | Updated tables (users, profiles) | `updated_at` | | ETL destination | `_loaded_at` or `_etl_timestamp` | **Symptom**: Freshness shows stale even though data is current. ### Not Handling Weekends **The mistake**: Freshness SLA assumes 24/7 updates. **Result**: Weekend alerts for data that legitimately doesn't update. 
**Options**: 1. Longer SLA (72 hours for daily data) 2. Disable rules on Friday, re-enable Monday 3. Different weekend thresholds (coming soon) ## Security Mistakes ### Using Production Application Credentials **The mistake**: Using the same credentials your application uses. **Risks**: - If compromised, attacker has full application access - Can't easily rotate without affecting application - No audit trail separation **Better approach**: Create a dedicated read-only user for AnomalyArmor. ### Not Using SSL **The mistake**: Disabling SSL "because it's easier." **Risk**: Credentials transmitted in plaintext. **How to fix**: Always use `SSL Mode: require` for production. ## Checklist: Before Going Live Use this checklist to avoid common mistakes: **Connection**: - [ ] Correct hostname (no typos) - [ ] Correct port for database type - [ ] SSL mode set to `require` - [ ] AnomalyArmor IPs allowlisted **Permissions**: - [ ] Dedicated read-only user created - [ ] CONNECT granted on database - [ ] USAGE granted on schemas - [ ] SELECT granted on tables - [ ] Future table access configured **Discovery**: - [ ] Discovery frequency matches environment (hourly for production, 6h for staging, daily for dev) - [ ] Scheduled before pipeline runs - [ ] Noisy schemas excluded **Alerts**: - [ ] Started with critical tables only - [ ] Routing by event type configured - [ ] Dev/test excluded - [ ] Freshness SLAs have buffer - [ ] Maintenance rule-toggle process documented ## Still Having Issues? - **[Connection Issues](/troubleshooting/connection-issues)**: Detailed connection troubleshooting - **[Contact Support](/support/contact)**: Get help from our team ## Common Questions ### Why does AnomalyArmor find zero tables after I connect my database? The user can connect but lacks `USAGE` on the schemas or `SELECT` on the tables in them, so the catalog shows nothing it is allowed to see. Grant both, then re-run discovery. See [Forgetting Schema Permissions](#forgetting-schema-permissions) above for the exact grants.
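When several schemas need the same grants, generating the statements avoids forgetting one of them. The sketch below builds the PostgreSQL statements shown in the permission sections above for a list of schemas. The helper name `grant_statements` is hypothetical, and the identifier check is intentionally strict (plain unquoted names only), which is an assumption of this sketch rather than a PostgreSQL rule.

```python
import re

# Accept only simple unquoted identifiers so the output is safe to paste.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def grant_statements(database, schemas, role="anomalyarmor"):
    """Build CONNECT, USAGE, SELECT, and default-privilege grants for a read-only role."""
    for name in (database, role, *schemas):
        if not _IDENT.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    statements = [f"GRANT CONNECT ON DATABASE {database} TO {role};"]
    for schema in schemas:
        statements += [
            f"GRANT USAGE ON SCHEMA {schema} TO {role};",
            f"GRANT SELECT ON ALL TABLES IN SCHEMA {schema} TO {role};",
            # Covers tables created later in this schema.
            f"ALTER DEFAULT PRIVILEGES IN SCHEMA {schema} "
            f"GRANT SELECT ON TABLES TO {role};",
        ]
    return statements


for statement in grant_statements("your_db", ["public", "analytics"]):
    print(statement)
```

One PostgreSQL detail worth keeping in mind: `ALTER DEFAULT PRIVILEGES` without `FOR ROLE` applies only to objects later created by the role that runs it, so run it as the role that creates your tables (or add `FOR ROLE`).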
### Why do new tables never show up in AnomalyArmor? New tables are not automatically covered by `GRANT SELECT ON ALL TABLES`. Add `ALTER DEFAULT PRIVILEGES` so future tables inherit access. See [Not Granting Access to Future Tables](#not-granting-access-to-future-tables) for the PostgreSQL syntax. ### Why are we getting false positive freshness alerts on a table that updates fine? Almost always the wrong timestamp column. Append-only tables use `created_at`, but tables that get updated in place need `updated_at` or `_loaded_at`. Switch the column and the noise usually stops. ### Should I use the same database user my application uses? No. Create a dedicated read-only user for AnomalyArmor so you can rotate credentials independently, keep audit trails clean, and limit blast radius if a key leaks. See [Security Mistakes](#security-mistakes). ### How often should discovery run? Hourly for production, every 6 hours for staging, daily for development. Schedule it an hour before your main pipelines run so schema changes are caught before they break anything. See [Discovery Configuration Mistakes](#discovery-configuration-mistakes). --- # Connection Issues > Troubleshoot database connectivity problems This guide helps you diagnose and fix common database connection problems. If you're having trouble connecting AnomalyArmor to your database, start here. 
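Before working through the error-specific sections, it can help to separate plain network reachability from database-level problems. The sketch below is a minimal TCP probe using Python's standard `socket` module. The function names `probe` and `classify` and the category labels are hypothetical. It only checks whether a TCP connection can be opened from the machine where you run it, so it says nothing about credentials, SSL, or whether AnomalyArmor's own IPs are allowlisted.

```python
import socket


def classify(exc):
    """Map a socket exception to a coarse category."""
    if isinstance(exc, socket.gaierror):
        return "dns"      # hostname could not be resolved
    if isinstance(exc, ConnectionRefusedError):
        return "refused"  # host reachable, nothing accepting on that port
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timeout"  # packets likely dropped by a firewall
    return "other"


def probe(host, port, timeout=5.0):
    """Try to open a TCP connection and report 'ok' or a failure category."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except OSError as exc:
        return classify(exc)


# Example (hostname is a placeholder):
# print(probe("mydb.us-east-1.rds.amazonaws.com", 5432))
```

A `refused` result usually points at the port or a stopped database, while `timeout` more often means a firewall or security group is silently dropping traffic.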
## Quick Diagnosis Use this decision tree to identify your issue: | Error Message | Likely Cause | Jump To | |---------------|--------------|---------| | "Connection refused" | Network/firewall issue | [Connection refused](#connection-refused) | | "Authentication failed" | Credentials issue | [Authentication failed](#authentication-failed) | | "SSL required" | SSL configuration issue | [SSL required](#ssl-required--ssl-connection-error) | | "Permission denied" | Database permissions issue | [Permission denied](#permission-denied--access-denied) | | "Timeout" | Network latency issue | [Timeout](#timeout--connection-timed-out) | ## Connection Test Failures ### "Connection refused" **What it means**: AnomalyArmor can't reach your database at all. **What you'll see**: When you click "Test Connection" in the Add Connection form, you'll see a red destructive toast notification with: - **Title**: "Error" - **Message**: "Connection refused - Could not connect to database at mydb.us-east-1.rds.amazonaws.com:5432" - **Details**: A list of possible causes: - Firewall/security group blocking connection - Incorrect hostname or port - Database not running The toast appears in the top-right corner of the screen and includes action buttons to retry or edit the connection settings. **Common causes**: 1. Firewall/security group blocking the connection 2. Wrong hostname or port 3. Database not running or not accepting connections **Troubleshooting steps**: ### 1. Verify hostname and port Double-check the hostname. Typos are common: ``` Common mistakes: [X] mydb.rds.amazonaws.com (missing region) [OK] mydb.us-east-1.rds.amazonaws.com [X] Port 3306 (MySQL default) [OK] Port 5432 (PostgreSQL default) ``` ### 2. Check firewall/security group AnomalyArmor needs inbound access on your database port. **AWS RDS/Aurora:** 1. Go to RDS Console → Your instance → Security Groups 2. 
Add inbound rule: - Type: PostgreSQL (or your DB type) - Port: 5432 (or your port) - Source: AnomalyArmor IPs (see Settings → Security) **Google Cloud SQL:** 1. Go to SQL Console → Your instance → Connections 2. Under "Authorized Networks", add AnomalyArmor IPs **Self-hosted:** ```bash # Allow AnomalyArmor IPs iptables -A INPUT -p tcp --dport 5432 -s 34.xxx.xxx.xxx -j ACCEPT ``` ### 3. Verify database is running Can you connect from another location? ```bash # Test from your local machine nc -zv your-host.rds.amazonaws.com 5432 # Expected output (success): Connection to your-host.rds.amazonaws.com port 5432 [tcp/postgresql] succeeded! # If it fails, the issue is likely firewall-related ``` ### 4. Check for VPC/private network issues If your database is in a private subnet: - It may not be accessible from the internet - You may need VPC peering or PrivateLink (Enterprise) - Contact us about private connectivity options ### "Authentication failed" **What it means**: AnomalyArmor reached your database but the credentials were rejected. **What you'll see**: A red error notification appears with a message like: > ⚠️ **Connection failed** Authentication failed - FATAL: password authentication failed for user 'anomalyarmor' Possible causes: - Incorrect username or password - User doesn't exist in the database - User lacks CONNECT permission **Common causes**: 1. Wrong username or password 2. User doesn't exist 3. User doesn't have permission to connect **Troubleshooting steps**: ### 1. Verify credentials Copy-paste to avoid typos: - Copy the username exactly as it appears in your database - Copy the password (watch for leading/trailing spaces) - Check for special characters that may need escaping ### 2. Confirm user exists Connect to your database and check: ```sql -- PostgreSQL SELECT usename FROM pg_user WHERE usename = 'anomalyarmor'; -- ClickHouse SELECT name FROM system.users WHERE name = 'anomalyarmor'; ``` ### 3. 
Reset the password If unsure, reset the password: ```sql -- PostgreSQL ALTER USER anomalyarmor WITH PASSWORD 'new-secure-password'; -- ClickHouse ALTER USER anomalyarmor IDENTIFIED BY 'new-secure-password'; ``` Then update the password in AnomalyArmor. ### 4. Check pg_hba.conf (PostgreSQL) For self-hosted PostgreSQL, check `pg_hba.conf` allows the connection: ``` # Add entry for AnomalyArmor hostssl all anomalyarmor 34.xxx.xxx.xxx/32 scram-sha-256 ``` Reload PostgreSQL after changes: ```bash sudo systemctl reload postgresql ``` ### "SSL required" / "SSL connection error" **What it means**: The database requires SSL but the connection isn't using it (or vice versa). **What you'll see**: A red error notification appears with a message like: > ⚠️ **Connection failed** SSL connection required - server requires SSL connection Solution: Set SSL Mode to 'require' in your connection settings and try again. **Troubleshooting steps**: ### 1. Enable SSL in AnomalyArmor Edit your connection: 1. Go to Data Sources → Your connection → Settings 2. Set **SSL Mode** to `require` 3. Save and test ### 2. Try different SSL modes | Mode | Description | |------|-------------| | `disable` | No SSL (not recommended) | | `require` | SSL required, no certificate verification | | `verify-ca` | Verify server certificate | | `verify-full` | Verify certificate and hostname | Start with `require` for most cloud databases. ### 3. Check cloud provider requirements - **AWS RDS**: Requires SSL by default, use `require` - **Aurora**: Requires SSL, use `require` - **Supabase**: Requires SSL, use `require` - **Google Cloud SQL**: Requires SSL, use `require` - **ClickHouse Cloud**: Always HTTPS on port 8443 ### "Timeout" / "Connection timed out" **What it means**: The connection attempt took too long. **Common causes**: 1. Network latency 2. Firewall silently dropping packets 3. Database under heavy load **Troubleshooting steps**: ### 1. 
Check network path Test connectivity from your own machine: ```bash # Test if port is reachable nc -zv -w 10 your-host.com 5432 # Test latency ping your-host.com ``` ### 2. Check security groups If the security group doesn't have an explicit allow rule, packets may be silently dropped (causing timeout instead of refused). ### 3. Try during off-peak hours If the database is under heavy load, connections may time out. ### "Permission denied" / "Access denied" **What it means**: Connection succeeded but the user can't access the required data. **Troubleshooting steps**: ### 1. Grant CONNECT permission ```sql -- PostgreSQL GRANT CONNECT ON DATABASE your_database TO anomalyarmor; ``` ### 2. Grant schema access ```sql -- PostgreSQL GRANT USAGE ON SCHEMA public TO anomalyarmor; GRANT USAGE ON SCHEMA analytics TO anomalyarmor; ``` ### 3. Grant table access ```sql -- PostgreSQL GRANT SELECT ON ALL TABLES IN SCHEMA public TO anomalyarmor; -- For future tables ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO anomalyarmor; ``` ### 4. Verify with test query Connect as the anomalyarmor user and run: ```sql SELECT table_name FROM information_schema.tables LIMIT 5; ``` If this fails, you need more permissions. ## Discovery Failures If connection tests pass but discovery fails: ### "No tables found" **Causes**: - User can connect but can't see tables - Schema filter excluding all schemas - Connected to wrong database **Solutions**: 1. Check schema filter settings 2. Verify user has SELECT on `information_schema` 3. Confirm database name is correct ### "Discovery timeout" **Causes**: - Very large database (10,000+ tables) - Slow network - Database under load **Solutions**: 1. Exclude unnecessary schemas 2. Run during off-peak hours 3. 
Consider using a read replica ## Provider-Specific Issues ### AWS RDS **Common issues:** - Security group not allowing AnomalyArmor IPs - Using private subnet without public access - Wrong endpoint (cluster vs instance) **Solutions:** 1. Add AnomalyArmor IPs to security group 2. Enable "Publicly accessible" or use VPC peering 3. Use the correct endpoint for your use case ### Databricks **Common issues:** - Invalid or expired access token - Wrong HTTP path - SQL warehouse not running **Solutions:** 1. Generate a new access token 2. Copy HTTP path from SQL Warehouse settings 3. Ensure warehouse is running or use serverless ### ClickHouse **Common issues:** - Using native port instead of HTTP port - HTTP interface disabled - IP not in user's allowed hosts **Solutions:** 1. Use port 8443 (HTTPS) or 8123 (HTTP) 2. Enable HTTP interface in config.xml 3. Add IPs to user's HOST restriction ## Still Stuck? If you've tried everything above and still can't connect: 1. **Collect diagnostic info**: - Error message (exact text) - Database type and version - Cloud provider (if applicable) - Network setup (public/private subnet) 2. **Contact support**: - Email: support@anomalyarmor.ai - Include the diagnostic info above - We can help troubleshoot complex network setups ## Common Questions ### Why does my connection test say 'Connection refused'? AnomalyArmor could not open a socket to your host. It's almost always a firewall or security group blocking our IPs, a wrong hostname or port, or a database that isn't actually listening. Walk through the [Connection refused](#connection-refused) checklist. ### What IP addresses do I need to allowlist? AnomalyArmor connects from a small set of static IPs listed in **Settings → Security** in your dashboard. Add those to your security group or firewall inbound rules for the database port. ### Why am I getting 'Authentication failed' when the credentials look correct? 
The most common causes are trailing whitespace in the password, a user that was created but not granted `CONNECT`, or (self-hosted PostgreSQL) a missing `pg_hba.conf` entry for our IPs. See [Authentication failed](#authentication-failed). ### Why does the connection time out instead of returning an error? Timeouts almost always mean a firewall is silently dropping packets rather than rejecting them. Check your security group has an explicit allow rule for AnomalyArmor's IPs on the database port. See [Timeout](#timeout--connection-timed-out). ### Can AnomalyArmor connect to a database in a private VPC? Not over the public internet. Enterprise customers can connect via VPC peering or AWS PrivateLink so the database never needs a public endpoint. [Contact sales](mailto:sales@anomalyarmor.ai) to set that up. --- # FAQ > Frequently asked questions about AnomalyArmor ## General ### What databases do you support? AnomalyArmor currently supports PostgreSQL (12+) including RDS, Aurora, Supabase, and self-hosted; Databricks with Unity Catalog; and ClickHouse (21.8+) including ClickHouse Cloud. More databases are on the way. [Contact us](mailto:support@anomalyarmor.ai) if you need a specific database. ### Do you access my actual data? No. AnomalyArmor queries metadata from system catalogs like `information_schema` plus bounded aggregates (`COUNT`, `MAX`) for freshness. We never run `SELECT *` or read row values. See [Data Handling](/security/data-handling) for the full list. ### How does discovery work? Discovery queries your database's system catalogs to find tables, views, and columns. For PostgreSQL we query `information_schema` and `pg_catalog`; for Databricks we use Unity Catalog APIs. Each run is compared against the previous state to detect changes. ### How often should I run discovery? Hourly for production, every 6 hours for staging, and daily for development. You can also trigger discovery on demand from the asset page at any time. 
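The comparison step described under "How does discovery work?" can be pictured as a diff between two metadata snapshots. The following is a minimal sketch, assuming a snapshot is a mapping of table name to `{column: type}`. The `diff_snapshots` function and the change labels are illustrative names chosen here, not AnomalyArmor's internal implementation.

```python
from typing import Dict, List, Tuple

# Assumed snapshot shape for this sketch: table name -> {column name: data type}.
Snapshot = Dict[str, Dict[str, str]]
Change = Tuple[str, str, str]  # (change kind, table, column or "")


def diff_snapshots(previous: Snapshot, current: Snapshot) -> List[Change]:
    """Compare two metadata snapshots and list the schema changes between them."""
    changes: List[Change] = []

    # Tables that disappeared or appeared between the two runs.
    for table in sorted(previous.keys() - current.keys()):
        changes.append(("table_removed", table, ""))
    for table in sorted(current.keys() - previous.keys()):
        changes.append(("table_added", table, ""))

    # For tables present in both snapshots, compare their columns.
    for table in sorted(previous.keys() & current.keys()):
        old, new = previous[table], current[table]
        for column in sorted(old.keys() - new.keys()):
            changes.append(("column_removed", table, column))
        for column in sorted(new.keys() - old.keys()):
            changes.append(("column_added", table, column))
        for column in sorted(old.keys() & new.keys()):
            if old[column] != new[column]:
                changes.append(("column_type_changed", table, column))

    return changes
```

Because only names and types are compared, a renamed column shows up in this sketch as one `column_removed` plus one `column_added`, which matches the rename behavior described in the Schema Monitoring questions.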
### What's the performance impact on my database? Minimal. Discovery runs a handful of lightweight system-catalog queries (typically 3-10 queries, under 1 second each) and never scans your tables. For very large databases you can point AnomalyArmor at a read replica. ## Schema Monitoring ### What schema changes do you detect? Columns added, columns removed, column type changes, tables added, tables removed, and constraint modifications. Changes are surfaced on the next discovery run after they occur. ### Can you detect column renames? Not directly yet. A rename surfaces as a column removed plus a column added, which you can correlate manually. Automatic rename detection is on the roadmap. ### Why didn't I get alerted about a schema change? Check, in order: did discovery run after the change; is the table in a monitored schema; does the alert rule scope include this asset; is the rule enabled; is the destination configured and tested. ## Freshness Monitoring ### How does freshness monitoring work? You specify a timestamp column (like `created_at` or `updated_at`). AnomalyArmor queries `MAX(column)` and alerts if the latest timestamp is older than your SLA threshold. See [Freshness Monitoring](/data-quality/freshness-monitoring). ### What if my table doesn't have a timestamp column? Without a timestamp column we can't monitor freshness. Options: add a timestamp column in your ETL, use an `_etl_loaded_at` audit column, or skip freshness for that table and rely on row-count metrics instead. ### How do I fix a false-positive freshness alert? Usually the SLA is too tight or you're using the wrong column (`created_at` on a table that gets updated). Loosen the threshold to 2x the expected interval, switch to `updated_at` for mutable tables, and check timezone handling. ## Alerts ### How do I reduce alert fatigue? 
Start with 5-10 critical tables, route by event type (breaking changes to PagerDuty, additions to email), exclude dev/test environments, and start with lenient thresholds that you tighten over time. See [Alert Best Practices](/alerts/best-practices). ### Can I disable alerts temporarily? Yes. Go to **Alerts → Rules**, find the rule, and toggle it off. The rule won't fire until you toggle it back on, which is the right pattern for planned maintenance. ### How do I test if my alerts work? Go to **Alerts → Destinations**, pick the destination, and click **Send Test Alert**. You'll see a test notification land in Slack, PagerDuty, email, or the webhook target within a few seconds. ### Can I send alerts to multiple destinations? Yes. When you create or edit an alert rule you can attach any number of destinations, so (for example) the same schema-drift alert can hit both Slack and email. ## Security ### How are my credentials stored? Database credentials are encrypted at rest with AES-256-GCM using keys managed by AWS KMS. Keys never leave the KMS boundary and not even our support team can read your credentials. See [Security Overview](/security/overview). ### What compliance certifications do you have? AnomalyArmor is GDPR compliant (DPA available) and HIPAA-ready (BAA available on Enterprise). Email security@anomalyarmor.ai for compliance documentation. ### What IP addresses does AnomalyArmor connect from? AnomalyArmor uses a small set of static IPs. View the current list in **Settings → Security** and add them to your firewall or security-group inbound rules. ### Can I use a read-only database user? Yes, and we recommend it. We only need `SELECT` on system catalogs and the tables you want monitored. See the [PostgreSQL guide](/data-sources/postgresql) for the exact grants. ## Billing & Plans ### Is there a free trial? 
Trials are available by invitation: a magic link in a cold email, a trial code we share directly, or the [homepage try-it form](https://www.anomalyarmor.ai/). There is no self-serve trial from the signup page without a code and no free tier. Trials are 14 days with no credit card; at the end you pick a paid plan or your workspace pauses until you choose one. ### What counts toward my usage? Usage is based on active monitors: each freshness schedule, data quality metric, and validity rule. Schema drift monitoring and AI descriptions are free. See [Usage Limits](/billing/usage-limits). ### Can I change plans? Yes. Upgrade or downgrade any time from **Settings → Billing**. Upgrades take effect immediately with prorated charges; downgrades take effect next billing cycle. ### What happens if I exceed my plan limits? You'll get a notification and the option to upgrade or remove unused monitors. We won't cut off service without warning. ## Technical ### Do you have an API? Yes. A full REST API is available on all plans. See the [API overview](/api/overview) for authentication and endpoints. ### Can I integrate with my existing tools? Yes, via native Slack and PagerDuty integrations, outbound webhooks to any HTTPS endpoint, email destinations, and the REST API for custom integrations. ### What browsers do you support? Modern Chrome, Firefox, Safari, and Edge. We don't support Internet Explorer. ### Is there a self-hosted option? Self-hosted deployments are available for Enterprise customers with specific compliance requirements. [Contact sales](mailto:sales@anomalyarmor.ai) for details. ## Troubleshooting ### My connection test fails. What do I do? Start with the [Connection Issues](/troubleshooting/connection-issues) guide. It walks through refused, timed-out, authentication, SSL, and permission errors with specific fixes for each. ### Discovery finds no tables. Why? 
Typically the user is missing `USAGE` on the schemas or `SELECT` on `information_schema`, the schema filter is excluding everything, or you connected to the wrong database name. See [Common Setup Mistakes](/troubleshooting/common-setup-mistakes). ### I'm not receiving alerts. What's wrong? Check that the alert rule is enabled, the destination is configured and passes a test send, the rule scope actually matches the asset, and discovery is running and detecting changes. ## Still Have Questions? - **[Contact Support](/support/contact)**: Get help from our team - **[Documentation](/)**: Browse the full documentation --- # Get Help > Contact support and find answers ## Self-Service First Most issues resolve in minutes: | Issue | Solution | |-------|----------| | Data seems stale | Run discovery to refresh | | Connection failing | Check [Connection Issues](/troubleshooting/connection-issues) | | Alert not firing | Verify rule scope and destination | | Missing tables | Check schema filters and permissions | See [Common Setup Mistakes](/troubleshooting/common-setup-mistakes) for quick fixes. ## Contact Support | Channel | Best For | Response Time | |---------|----------|---------------| | **In-app chat** | Quick questions | Minutes (business hours) | | **support@anomalyarmor.ai** | Detailed issues | Within 24 hours | | **security@anomalyarmor.ai** | Security concerns | Within 4 hours | Business hours: Monday-Friday, 9am-6pm PT ## System Status Check for ongoing incidents: **[status.anomalyarmor.ai](https://status.anomalyarmor.ai)** Subscribe to status updates for proactive notifications. ## When Contacting Support Include these details for faster resolution: **Required:** - Data source type (Postgres, Snowflake, etc.) 
- Error message (screenshot or copy/paste) - When it started happening **Helpful:** - Steps to reproduce - What you've already tried - Asset or rule ID (from URL) ## Enterprise Support Enterprise plans include: | Feature | Growth | Professional | Enterprise | |---------|--------|--------------|------------| | Email support | 24h response | 24h response | 4h response | | In-app chat | Business hours | Business hours | Extended hours | | Dedicated CSM | - | - | Included | | Phone support | - | - | Available | ## Feature Requests Have an idea? We track all requests: 1. **In-app chat**: Tell us what you need 2. **Email**: support@anomalyarmor.ai with subject "Feature Request" We prioritize based on customer impact and strategic fit. ## Related - **[Troubleshooting](/troubleshooting/common-setup-mistakes)**: Common fixes for setup issues - **[FAQ](/troubleshooting/faq)**: Frequently asked questions --- # Glossary > Key terms and definitions for AnomalyArmor Terms and definitions used throughout AnomalyArmor and this documentation. ## A **Alert** A notification sent when AnomalyArmor detects an event matching one of your configured rules. Alerts can be sent to Slack, email, PagerDuty, webhooks, or Microsoft Teams. **Alert Rule** A configuration that defines what events trigger alerts, what conditions must be met, and where notifications are sent. Each rule has an event type, optional scope/conditions, and one or more destinations. **Asset** A data object that AnomalyArmor discovers and monitors. Assets include tables, views, and materialized views. Each asset has metadata like column names, data types, and change history. **Asset Catalog** The complete inventory of all discovered assets across your connected data sources. Searchable and filterable, the catalog is your source of truth for what exists in your data infrastructure. ## B **Baseline** The initial state recorded during the first discovery of an asset. 
Subsequent discoveries compare against this baseline to detect changes. **Breaking Change** A schema modification likely to cause query failures, such as a removed column or table. Breaking changes typically warrant immediate attention and urgent alerts. ## C **Change Detection** The process of comparing the current state of an asset against its previous state to identify modifications. Part of the discovery process. **Connection** See [Data Source](#d). **Cooldown Period** The minimum time between repeated alerts for the same event. Prevents alert storms when an issue persists across multiple discoveries. ## D **Data Classification** Tags and labels applied to assets or columns for organization, governance, and compliance. Examples include `PII`, `Financial`, `Confidential`, or custom tags. **Data Quality** The degree to which data meets expectations for freshness, completeness, accuracy, and schema correctness. AnomalyArmor monitors freshness and schema as key quality dimensions. **Data Source** A connection to a database or data platform. Each data source stores credentials, connection settings, and discovery configuration. Also called a "connection." **Destination** A configured endpoint where alerts are sent. Supported destinations include Slack, email, PagerDuty, Microsoft Teams, and webhooks. **Discovery** (asset discovery) The process by which AnomalyArmor scans a database's system catalogs to find and catalog assets (tables, views, columns, data types). Discovery is the *action* that produces a metadata snapshot; it does not, by itself, detect changes. The individual differences between successive discoveries are [schema changes](#s), and the system that detects them is [schema drift](#s). User-facing verb: **Run Discovery**. **Discovery Job** A single execution of the discovery process. Each job queries the database and produces a metadata snapshot. Change detection happens separately by comparing the new snapshot against the previous one. 
**Asset Discovery** See [Discovery](#d). Canonical name for the discovery workflow (`asset_discovery` in the backend). Synonymous with Discovery. ## F **Freshness** A measure of how recently data in a table was updated. Calculated by checking the maximum value of a configured timestamp column. Tables with data older than their SLA are considered "stale." **Freshness SLA** See [SLA](#s). **Freshness Violation** An event triggered when a table's data exceeds its configured freshness threshold. Indicates the data is stale and may impact downstream systems. ## I **Incremental Discovery** A discovery mode that only detects changes since the last run, rather than rescanning everything. Faster and lighter than full discovery. The default for scheduled discoveries after the initial baseline. **Information Schema** The standard SQL schema containing metadata about database objects. AnomalyArmor queries `information_schema` (and similar system catalogs) to discover assets. We never query your actual data tables. ## M **Metadata** Data about data. For AnomalyArmor, this includes table names, column names, data types, constraints, timestamps, and schema history. Metadata is all we access; we never see row data. **Metric** A statistical measurement tracked over time for data quality monitoring. Types include row count, null percentage, distinct count, duplicate count, and numeric statistics (min, max, average). Metrics are captured on a schedule and analyzed for anomalies. **Metric Anomaly** A metric value that falls outside the expected range based on historical data. Detected using statistical methods (z-score) with configurable sensitivity. Anomalies may indicate data quality issues. ## P **PII (Personally Identifiable Information)** Data that can identify an individual, such as names, email addresses, phone numbers, or social security numbers. Often tagged for compliance purposes using data classification. ## R **Rule** See [Alert Rule](#a). 
## S **Schema** The structure of a database, including tables, columns, data types, and constraints. In database terminology, also refers to a namespace within a database (e.g., `public` schema in PostgreSQL). **Schema Change** Any modification to database structure. Types include: column added, column removed, column type changed, table added, table removed, and constraint modified. **Schema Drift** The phenomenon of database structure changing over time, intentionally or unintentionally. AnomalyArmor detects drift by comparing consecutive discoveries. **Schema History** The complete record of all schema changes detected for an asset. Includes what changed, when it was detected, and before/after values. **Sensitivity** A configuration parameter for anomaly detection that determines how many standard deviations from the mean trigger an alert. Lower values (1.0) are more sensitive and catch smaller deviations. Higher values (3.0) reduce false positives. Default is 2.0. **Scope** The filter criteria on an alert rule that determine which assets the rule applies to. Scope can include data source, schema, specific assets, or patterns. **SLA (Service Level Agreement)** A configured threshold defining how fresh data should be. For example, an SLA of "1 hour" means the table should have data no older than 1 hour. Violations trigger freshness alerts. **Stale Data** Data that hasn't been updated within its expected timeframe (SLA). Stale data may indicate ETL failures, source system issues, or pipeline problems. ## T **Table** A fundamental database object that stores data in rows and columns. Tables are the most common asset type monitored by AnomalyArmor. **Timestamp Column** A column containing date/time values used to track when data was created or updated. AnomalyArmor queries the maximum value of this column for freshness monitoring. ## U **Unity Catalog** Databricks' unified governance solution for data and AI. 
AnomalyArmor integrates with Unity Catalog to discover and monitor Databricks assets. ## V **View** A virtual table defined by a SQL query. Views don't store data themselves but present data from underlying tables. AnomalyArmor monitors views as assets. **VPC Peering** A network connection between two Virtual Private Clouds (VPCs) that enables private connectivity. Enterprise feature for connecting AnomalyArmor to databases in private subnets. ## W **Warning Threshold** An optional secondary threshold that triggers an early warning alert before the main SLA violation. Gives teams advance notice that data is approaching staleness. **Webhook** An HTTP callback that receives alert data when events occur. Webhooks enable custom integrations with any system that can receive HTTP requests. ## Z **Z-Score** A statistical measure indicating how many standard deviations a value is from the mean. Used in anomaly detection for metrics. A z-score above the configured sensitivity threshold triggers an anomaly alert. --- **Allowlist** A list of allowed IP addresses. Many databases require adding AnomalyArmor's IP addresses to allowlists in security groups or firewalls before connections can be established. --- # Downloads > Downloadable resources for setting up AnomalyArmor Download these ready-to-use resources to accelerate your AnomalyArmor setup. ## SQL Permission Templates Minimal-privilege SQL scripts for granting AnomalyArmor read-only access to your databases. Each script includes: - Step-by-step setup instructions - Customizable placeholders - Verification queries - Cleanup commands - **[PostgreSQL](/downloads/postgresql-permissions)**: For PostgreSQL, RDS, Aurora, Supabase - **[Snowflake](/downloads/snowflake-permissions)**: Role and user setup for Snowflake - **[Databricks](/downloads/databricks-permissions)**: Unity Catalog grants ## How to Use 1. **Open** the script page for your database 2. **Copy** the SQL code 3. 
**Replace** the placeholder values (marked with `your_` prefix) 4. **Run** as a database admin or owner 5. **Verify** using the test queries provided > 💡 These scripts grant the minimum permissions required for AnomalyArmor. You can further restrict access by using the per-schema options in each script. ## Security Best Practices ### Use dedicated service accounts Create a dedicated user for AnomalyArmor rather than using personal accounts. This ensures: - Monitoring continues when employees leave - Clear audit trail of AnomalyArmor activity - Easy permission revocation if needed ### Grant only what's needed AnomalyArmor only needs SELECT access for: - `information_schema` or equivalent for discovery - Tables you want to monitor for freshness - All scripts default to read-only access ### Use key-pair authentication when available For Snowflake and other databases that support it: - More secure than passwords - No password rotation required - Supports hardware security modules ## Need Help? - **[PostgreSQL Guide](/data-sources/postgresql)**: Detailed PostgreSQL setup instructions - **[Snowflake Guide](/data-sources/snowflake)**: Detailed Snowflake setup instructions - **[Databricks Guide](/data-sources/databricks)**: Detailed Databricks setup instructions - **[Contact Support](/support/contact)**: Get help from our team --- # AnomalyArmor API Overview > Programmatic access to AnomalyArmor data observability features via the REST API, OpenAPI 3.1 spec, and SDKs AnomalyArmor provides a REST API for programmatic access to data observability features. This enables automation, CI/CD integration, and custom tooling. ## Base URL ``` https://api.anomalyarmor.ai/api/v1 ``` ## Authentication All API requests require authentication via API key. 
Include your key in the `Authorization` header: ```bash curl -H "Authorization: Bearer aa_live_your_key_here" \ https://api.anomalyarmor.ai/api/v1/assets ``` > ℹ️ Create API keys in **Settings > API Keys** or via CLI: `armor api-keys create` ## Rate Limiting Rate limits are enforced per API key based on your subscription tier: | Tier | Rate Limit | Burst | |------|------------|-------| | Free Trial | 20 req/min | 2 req/sec | | Starter | 30 req/min | 3 req/sec | | Growth | 500 req/min | 25 req/sec | | Professional | 1,000 req/min | 50 req/sec | | Enterprise | 5,000 req/min | 100 req/sec | ### Rate Limit Headers Every response includes rate limit information: ``` X-RateLimit-Limit: 500 X-RateLimit-Remaining: 499 X-RateLimit-Reset: 1701705600 ``` When rate limited, you'll receive a `429` response with `Retry-After` header. ## Pagination All list endpoints use offset/limit pagination: ```bash GET /api/v1/assets?limit=50&offset=100 ``` Response includes pagination metadata: ```json { "data": [...], "pagination": { "total": 245, "limit": 50, "offset": 100, "has_more": true } } ``` ## Error Responses Errors return a consistent JSON structure: ```json { "error": { "code": "ASSET_NOT_FOUND", "message": "Asset not found", "details": { "asset_id": "snowflake.prod.warehouse.orders", "suggestion": "Check the qualified name format: source.database.schema.table" } } } ``` ### Error Codes | Code | HTTP Status | Description | |------|-------------|-------------| | `ASSET_NOT_FOUND` | 404 | Asset doesn't exist or not accessible | | `UNAUTHORIZED` | 401 | Invalid or missing API key | | `FORBIDDEN` | 403 | Valid key but insufficient scope | | `RATE_LIMITED` | 429 | Too many requests | | `VALIDATION_ERROR` | 400 | Invalid request parameters | | `INTERNAL_ERROR` | 500 | Server error | ## Quick Start ## Available Endpoints - **[Assets](/api/assets)**: List and retrieve data assets - **[Freshness](/api/freshness)**: Monitor and check data freshness - **[Schema](/api/schema)**: Track 
schema changes and drift - **[Lineage](/api/lineage)**: Explore data dependencies - **[Alerts](/api/alerts)**: Query alert history and status - **[Authentication](/api/authentication)**: API key management ### Data Quality APIs - **[Metrics](/api/metrics)**: Track row counts, null rates, and data trends - **[Validity](/api/validity)**: Enforce NOT NULL, UNIQUE, REGEX, and custom rules - **[Referential Checks](/api/referential-checks)**: Verify foreign key relationships ## Next Steps - **[SDK Quickstart](/sdk/overview)**: Get started with the Python SDK - **[CLI Reference](/cli/overview)**: Command-line interface guide - **[Airflow Integration](/integrations/airflow)**: Use with Apache Airflow ## Common Questions ### What is the base URL for the AnomalyArmor API? All endpoints are served from `https://api.anomalyarmor.ai/api/v1`. Every request must include a Bearer token in the `Authorization` header. Point the Python or TypeScript SDK at a different `api_url` / `baseUrl` if you are targeting a staging environment. ### How do I handle rate limits and 429 responses? Limits are per-API-key and vary by subscription tier, from 20 req/min on Free Trial up to 5,000 req/min on Enterprise. Every response includes `X-RateLimit-Limit`, `X-RateLimit-Remaining`, and `X-RateLimit-Reset` headers, and 429s include a `Retry-After` header you should honor before retrying. The TypeScript SDK retries idempotent requests automatically; the Python SDK raises `RateLimitError` with a `retry_after` attribute. ### How does pagination work on list endpoints? List endpoints use offset/limit pagination with `limit` (default 50, max 100) and `offset` query parameters. Responses include a `pagination` object with `total`, `limit`, `offset`, and `has_more` so you can loop until `has_more` is `false`. Prefer paging in batches of 100 to minimize request count against your rate limit. ### What error codes does the API return? Errors come back as `{"error": {"code", "message", "details"}}`. 
Common codes include `ASSET_NOT_FOUND` (404), `UNAUTHORIZED` (401), `FORBIDDEN` (403), `RATE_LIMITED` (429), `VALIDATION_ERROR` (400), and `INTERNAL_ERROR` (500). The `details` object usually includes field names or suggestions you can surface in logs.

### Should I use the REST API directly or the SDK?

Use the SDK in application code: it handles auth, retries on 429, and typed responses for you. Hit the REST API directly from shell scripts, non-supported languages, or when you need a verb not yet wrapped by the ergonomic SDK surface. Both paths share the same `aa_live_*` Bearer token.

---

# AnomalyArmor API Authentication

> Authenticate to the AnomalyArmor REST API with API keys, or to the AnomalyArmor MCP server with OAuth 2.1

AnomalyArmor supports two authentication methods:

| Method | Use Case |
|--------|----------|
| **API Keys** | SDK, CLI, CI/CD pipelines, programmatic access |
| **OAuth 2.1** | MCP server connections from AI tools (Claude Code, Cursor) |

For MCP integrations, see the [MCP Server](/integrations/mcp-server) page. OAuth authentication is handled automatically when you connect via the remote MCP server.

The rest of this page covers API key authentication for programmatic access.

## API Key Format

API keys use the format `aa_live_`:

```
aa_live_k8jd92hf8j2hd98fh2d9h2f98h2d9fh2
```

> âš ī¸ API keys are shown only once at creation. Store them securely - we cannot retrieve them later.

## Creating API Keys

### Via Dashboard

1. Go to **Settings > API Keys**
2. Click **Create API Key**
3. Enter a descriptive name (e.g., "Airflow Production")
4. Select scope based on needs: `read-only` for monitoring, `read-write` for triggering actions, `admin` for key management
5. Click **Create Key**
6. Copy the key immediately

### Via CLI

```bash
# Create a read-only key
armor api-keys create --name "airflow-prod" --scope read-only

# Create a read-write key for triggering refreshes
armor api-keys create --name "ci-pipeline" --scope read-write
```

### Via API

```bash
curl -X POST "https://api.anomalyarmor.ai/api/v1/api-keys" \
  -H "Authorization: Bearer aa_live_admin_key" \
  -H "Content-Type: application/json" \
  -d '{"name": "automation-key", "scope": "read-only"}'
```

## Scopes

| Scope | Capabilities |
|-------|--------------|
| `read-only` | GET endpoints only. Read assets, freshness, lineage, alerts. |
| `read-write` | GET + POST. Trigger freshness/schema refreshes. |
| `admin` | Full access including API key management. |

> 💡 Follow the principle of least privilege - use `read-only` for monitoring and `read-write` only when you need to trigger actions.

### Scope Examples

### read-only: Airflow pre-flight check

Check if data is fresh before running pipelines. No ability to modify anything.

```python
# Just reads freshness status
client.freshness.require_fresh("warehouse.orders")
```

### read-write: Trigger refresh after dbt run

Check freshness and trigger a refresh when needed.

```python
# Can trigger refresh operations
client.freshness.refresh("warehouse.orders", wait=True)
```

### admin: Key rotation automation

Create and revoke keys programmatically for security compliance.
```python
# Can manage API keys
new_key = client.api_keys.create(name="rotated-key", scope="read-only")
client.api_keys.revoke(old_key_id)
```

## Using API Keys

### Environment Variable (Recommended)

```bash
export ARMOR_API_KEY="aa_live_your_key_here"
```

Then in your code:

```python
from anomalyarmor import Client

# Automatically uses ARMOR_API_KEY
client = Client()
```

### Direct Parameter

```python
from anomalyarmor import Client

client = Client(api_key="aa_live_your_key_here")
```

### HTTP Header

For direct API calls:

```bash
curl -H "Authorization: Bearer aa_live_your_key_here" \
  https://api.anomalyarmor.ai/api/v1/assets
```

## Rate Limits by Tier

Rate limits are set when you create the API key based on your subscription:

| Tier | Max Keys | Rate Limit | Burst |
|------|----------|------------|-------|
| Free Trial | 1 | 20/min | 2/sec |
| Starter | 1 | 30/min | 3/sec |
| Growth | 10 | 500/min | 25/sec |
| Professional | 25 | 1,000/min | 50/sec |
| Enterprise | Unlimited | 5,000/min | 100/sec |

> â„šī¸ When you upgrade your plan, existing API keys automatically get the new rate limits.

## Revoking Keys

Revoke compromised or unused keys immediately:

### Via Dashboard

1. Go to **Settings > API Keys**
2. Find the key and click the trash icon
3. Confirm revocation

### Via CLI

```bash
armor api-keys revoke
```

### Via API

```bash
curl -X DELETE "https://api.anomalyarmor.ai/api/v1/api-keys/" \
  -H "Authorization: Bearer aa_live_admin_key"
```

## Security Best Practices

## Troubleshooting

### 401 Unauthorized

- Check the key is not revoked
- Verify the `Authorization: Bearer` header format
- Ensure no extra whitespace in the key

### 403 Forbidden

- The key is valid but lacks permission for this operation
- Check the scope - you may need `read-write` or `admin`

### 429 Rate Limited

- You've exceeded your rate limit
- Check `Retry-After` header for when to retry
- Consider upgrading your plan for higher limits

## Common Questions

### How do I create an API key?
Generate keys under **Settings > API Keys** in the dashboard, via `armor api-keys create --name ... --scope ...`, or by POSTing to `/api/v1/api-keys` with an admin key. Name keys after the integration that uses them ("airflow-prod", "ci-pipeline") so rotation and revocation stay auditable. The plaintext key is shown only once at creation. ### Which scope should my API key have? Use `read-only` for monitoring, dashboards, and CI checks that only query state. Use `read-write` when you need to trigger freshness or schema refreshes. Reserve `admin` for key management automation itself. Follow least privilege: most integrations only need `read-only`. ### What does the aa_live_ prefix on a key mean? All AnomalyArmor API keys start with `aa_live_` followed by a random secret, making them easy to detect with secret scanners and grep over source trees. Treat the full string as sensitive. If you see one committed to git or pasted in logs, revoke it immediately from **Settings > API Keys**. ### How do I rotate or revoke an API key? Create the replacement key first, roll it out to the consumer, then revoke the old key via the dashboard, `armor api-keys revoke `, or `DELETE /api/v1/api-keys/{key-id}` with an admin token. Revocation takes effect immediately and cannot be undone, so keep the new key live before cutting the old one. ### Where should I store API keys in code? Read the key from the `ARMOR_API_KEY` environment variable (Python SDK / CLI) or `ANOMALYARMOR_API_KEY` (TypeScript SDK CLI shim), and inject it from a secrets manager in production. Never hardcode keys in source, Docker images, or Jupyter notebooks. Use separate keys per environment so a dev leak can't touch prod. ### Why am I getting 403 Forbidden when my key works elsewhere? 403 means the key is valid but doesn't have the scope the endpoint requires. Write endpoints (freshness refresh, schema refresh, create-metric) need `read-write`; key management needs `admin`. 
The error body shows `current_scope` and `required_scope` so you can pick the right key. --- # "List Assets" > "Retrieve all data assets (tables, views) you have access to" Get a list of all discovered assets across your data sources. Use this to build integrations, sync with data catalogs, or audit your inventory. ## When to Use - **Data catalog sync**: Export assets to external tools - **Automation**: Build workflows that operate on your asset inventory - **Auditing**: Generate reports of all monitored tables ## SDK & CLI Examples ## Parameters | Parameter | Type | Description | |-----------|------|-------------| | `source` | string | Filter by data source name | | `asset_type` | string | Filter by type: `table`, `view` | | `schema` | string | Filter by schema name | | `limit` | integer | Max results to return (default: 100, max: 1000) | | `offset` | integer | Skip first N results for pagination | ## Response ```json { "assets": [ { "id": "ast_abc123", "qualified_name": "snowflake.prod.public.orders", "asset_type": "table", "source": "snowflake.prod", "schema": "public", "name": "orders", "columns": 12, "freshness_status": "fresh", "last_updated": "2024-01-15T08:30:00Z", "tags": ["production", "pii:email"] } ], "total": 150, "limit": 100, "offset": 0 } ``` ## Response Fields | Field | Description | |-------|-------------| | `id` | Unique asset identifier | | `qualified_name` | Full path: `source.schema.table` | | `asset_type` | `table` or `view` | | `source` | Data source this asset belongs to | | `schema` | Database schema name | | `name` | Table/view name | | `columns` | Number of columns | | `freshness_status` | `fresh`, `stale`, or `unknown` | | `last_updated` | Last data update timestamp | | `tags` | Applied classification and custom tags | ## Error Responses | Status | Meaning | |--------|---------| | `401` | Invalid or missing API key | | `403` | API key doesn't have read permissions | | `429` | Rate limit exceeded | --- # "Get Asset" > "Retrieve details 
for a specific data asset" Get complete details for a single asset including columns, relationships, freshness status, and tags. ## When to Use - **Asset details**: Get full metadata for a specific table - **Column inspection**: List all columns with types and descriptions - **Freshness check**: Verify data recency for a single asset - **Integration sync**: Pull asset metadata into external tools ## SDK & CLI Examples ## Parameters | Parameter | Type | Description | |-----------|------|-------------| | `asset_id` | string | Asset ID (UUID) or qualified name | | `include_columns` | boolean | Include column details (default: true) | ## Response ```json { "id": "ast_abc123", "qualified_name": "snowflake.prod.public.orders", "asset_type": "table", "source": "snowflake.prod", "schema": "public", "name": "orders", "description": "Customer order records including order details and fulfillment status", "freshness": { "status": "fresh", "last_updated": "2024-01-15T08:30:00Z", "sla_threshold": "2h", "checked_at": "2024-01-15T09:00:00Z" }, "columns": [ { "name": "order_id", "data_type": "integer", "nullable": false, "description": "Unique identifier for each order", "tags": [] }, { "name": "user_email", "data_type": "varchar(255)", "nullable": true, "description": "Customer email address", "tags": ["pii:email"] } ], "tags": ["production", "critical"], "created_at": "2024-01-01T00:00:00Z", "updated_at": "2024-01-15T08:30:00Z" } ``` ## Response Fields | Field | Description | |-------|-------------| | `id` | Unique asset identifier | | `qualified_name` | Full path: `source.schema.table` | | `description` | Human-readable description (generated or manual) | | `freshness.status` | `fresh`, `stale`, or `unknown` | | `freshness.last_updated` | Timestamp of most recent data | | `freshness.sla_threshold` | Configured freshness SLA | | `columns` | Array of column definitions | | `columns[].tags` | Classification tags on this column | | `tags` | Asset-level tags | ## Error Responses | 
Status | Meaning | |--------|---------| | `401` | Invalid or missing API key | | `404` | Asset not found | | `429` | Rate limit exceeded | --- # "Get Freshness Status" > "Retrieve freshness status for an asset or data source" Check whether data is fresh or stale. Use this to gate pipelines, build monitoring dashboards, or verify SLA compliance. ## When to Use - **Pipeline gating**: Fail jobs if upstream data is stale - **Dashboards**: Show freshness status in monitoring tools - **SLA tracking**: Verify data meets freshness requirements - **Alerting integration**: Build custom alert workflows ## SDK & CLI Examples ## Parameters | Parameter | Type | Description | |-----------|------|-------------| | `asset_id` | string | Asset ID or qualified name | ## Response ```json { "asset": "snowflake.prod.public.orders", "status": "fresh", "last_updated": "2024-01-15T08:30:00Z", "sla_threshold": "2h", "sla_status": "within_sla", "checked_at": "2024-01-15T09:00:00Z", "timestamp_column": "created_at", "data_age_minutes": 30 } ``` ## Response Fields | Field | Description | |-------|-------------| | `status` | `fresh`, `stale`, or `unknown` | | `last_updated` | Timestamp of most recent data in the table | | `sla_threshold` | Configured freshness SLA (e.g., "2h", "24h") | | `sla_status` | `within_sla`, `warning`, or `violated` | | `checked_at` | When this freshness check ran | | `timestamp_column` | Column used to determine freshness | | `data_age_minutes` | How old the data is in minutes | ## Status Values | Status | Meaning | |--------|---------| | `fresh` | Data updated within SLA threshold | | `stale` | Data older than SLA threshold | | `unknown` | No timestamp column configured or no data | ## Error Responses | Status | Meaning | |--------|---------| | `401` | Invalid or missing API key | | `404` | Asset not found | | `422` | No freshness configuration for this asset | | `429` | Rate limit exceeded | --- # "Get Schema Changes" > "Retrieve recent schema changes detected for an 
asset" Get the history of schema changes for an asset. Use this to track drift, understand what changed, and build change management workflows. ## When to Use - **Change tracking**: Monitor schema drift over time - **Impact analysis**: Understand what changed before troubleshooting - **Compliance**: Audit schema modifications - **Integration**: Sync changes to external systems ## SDK & CLI Examples ## Parameters | Parameter | Type | Description | |-----------|------|-------------| | `asset_id` | string | Asset ID or qualified name | | `days_back` | integer | Look back this many days (default: 30) | | `change_type` | string | Filter by type (see Change Types below) | | `limit` | integer | Max results (default: 100) | ## Response ```json { "asset": "snowflake.prod.public.orders", "changes": [ { "id": "chg_abc123", "change_type": "column_removed", "object_name": "status", "object_type": "column", "old_value": { "name": "status", "data_type": "varchar(20)", "nullable": true }, "new_value": null, "detected_at": "2024-01-15T08:30:00Z", "discovery_id": "disc_xyz789" }, { "id": "chg_def456", "change_type": "column_added", "object_name": "fulfillment_status", "object_type": "column", "old_value": null, "new_value": { "name": "fulfillment_status", "data_type": "varchar(50)", "nullable": true }, "detected_at": "2024-01-15T08:30:00Z", "discovery_id": "disc_xyz789" } ], "total": 2 } ``` ## Response Fields | Field | Description | |-------|-------------| | `id` | Unique change identifier | | `change_type` | Type of change (see Change Types) | | `object_name` | Name of changed column or table | | `object_type` | `column` or `table` | | `old_value` | Previous state (null for additions) | | `new_value` | New state (null for removals) | | `detected_at` | When the change was discovered | | `discovery_id` | Discovery run that found this change | ## Change Types | Type | Description | Impact | |------|-------------|--------| | `column_added` | New column appeared | Low - usually safe | 
| `column_removed` | Column no longer exists | High - breaks queries | | `column_type_changed` | Data type modified | Medium - may affect logic | | `column_renamed` | Detected as remove + add | High - breaks queries | | `table_added` | New table discovered | Low - informational | | `table_removed` | Table no longer exists | High - breaks queries | | `constraint_changed` | PK, FK, or unique modified | Medium - may affect joins | ## Error Responses | Status | Meaning | |--------|---------| | `401` | Invalid or missing API key | | `404` | Asset not found | | `429` | Rate limit exceeded | --- # "List Alerts" > "List triggered alerts with filtering by status, type, and date range" Retrieve alerts that have been triggered. Use this to build dashboards, integrate with ticketing systems, or create custom notification workflows. ## When to Use - **Dashboard integration**: Show recent alerts in monitoring tools - **Ticketing sync**: Create tickets for triggered alerts - **Reporting**: Generate alert volume and response metrics - **Custom workflows**: Build automated response systems ## SDK & CLI Examples ## Parameters | Parameter | Type | Description | |-----------|------|-------------| | `status` | string | Filter by status: `triggered`, `acknowledged`, `resolved` | | `event_type` | string | Filter by type: `schema_change`, `freshness_violation`, `discovery_failed` | | `asset` | string | Filter by asset qualified name | | `since` | datetime | Only alerts after this timestamp | | `until` | datetime | Only alerts before this timestamp | | `limit` | integer | Max results (default: 100, max: 1000) | | `offset` | integer | Skip first N results | ## Response ```json { "alerts": [ { "id": "alt_abc123", "rule_name": "Production Schema Changes", "event_type": "schema_change", "status": "triggered", "message": "Column 'status' removed from orders table", "asset": "snowflake.prod.public.orders", "details": { "change_type": "column_removed", "column_name": "status", "column_type": 
"varchar(20)" }, "triggered_at": "2024-01-15T08:30:00Z", "acknowledged_at": null, "resolved_at": null, "destinations": ["slack", "pagerduty"] } ], "total": 45, "limit": 100, "offset": 0 } ``` ## Response Fields | Field | Description | |-------|-------------| | `id` | Unique alert identifier | | `rule_name` | Name of the rule that triggered | | `event_type` | Type of event that triggered the alert | | `status` | `triggered`, `acknowledged`, or `resolved` | | `message` | Human-readable alert description | | `asset` | Affected asset qualified name | | `details` | Event-specific details (varies by type) | | `triggered_at` | When the alert fired | | `acknowledged_at` | When someone acknowledged it (if applicable) | | `resolved_at` | When the alert was resolved (if applicable) | | `destinations` | Where the alert was sent | ## Status Values | Status | Meaning | |--------|---------| | `triggered` | Alert fired, not yet acknowledged | | `acknowledged` | Someone is looking at it | | `resolved` | Issue has been addressed | ## Event Types | Type | Description | |------|-------------| | `schema_change` | Table or column structure changed | | `freshness_violation` | Data older than SLA threshold | | `discovery_failed` | Discovery job couldn't complete | | `asset_removed` | Table or view no longer exists | ## Error Responses | Status | Meaning | |--------|---------| | `401` | Invalid or missing API key | | `400` | Invalid filter parameters | | `429` | Rate limit exceeded | --- # Assets API > List and retrieve data assets Query your data assets (tables, views, models) discovered by AnomalyArmor. 
## List Assets

```
GET /api/v1/assets
```

### Query Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `source` | string | Filter by data source name |
| `type` | string | Filter by asset type (`table`, `view`, `model`) |
| `limit` | integer | Max items to return (default: 50, max: 100) |
| `offset` | integer | Number of items to skip (default: 0) |

### Example Request

### Example Response

```json
{
  "data": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "qualified_name": "snowflake.prod.warehouse.orders",
      "name": "orders",
      "asset_type": "table",
      "source": "snowflake",
      "database": "prod",
      "schema": "warehouse",
      "description": "Customer order transactions",
      "row_count": 1500000,
      "column_count": 24,
      "created_at": "2024-01-15T10:30:00Z",
      "updated_at": "2024-12-04T08:00:00Z"
    }
  ],
  "pagination": {
    "total": 245,
    "limit": 10,
    "offset": 0,
    "has_more": true
  }
}
```

## Get Asset

```
GET /api/v1/assets/{id}
```

Retrieve a single asset by qualified name or UUID.

### Path Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `id` | string | Qualified name (e.g., `snowflake.prod.warehouse.orders`) or UUID |

### Example Request

### Example Response

```json
{
  "data": {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "qualified_name": "snowflake.prod.warehouse.orders",
    "name": "orders",
    "asset_type": "table",
    "source": "snowflake",
    "database": "prod",
    "schema": "warehouse",
    "description": "Customer order transactions",
    "row_count": 1500000,
    "column_count": 24,
    "columns": [
      {
        "name": "order_id",
        "data_type": "VARCHAR",
        "is_nullable": false,
        "is_primary_key": true
      },
      {
        "name": "customer_id",
        "data_type": "VARCHAR",
        "is_nullable": false
      },
      {
        "name": "order_date",
        "data_type": "TIMESTAMP",
        "is_nullable": false
      }
    ],
    "tags": ["pii", "financial"],
    "created_at": "2024-01-15T10:30:00Z",
    "updated_at": "2024-12-04T08:00:00Z"
  }
}
```

## Asset Identification

AnomalyArmor supports two ways to identify assets:

### Qualified Name (Primary)

Human-readable, hierarchical identifier:

```
{source}.{database}.{schema}.{table}
```

Examples:

- `snowflake.prod.warehouse.orders`
- `databricks.main.analytics.daily_sales`
- `postgresql.app_db.public.users`

> 💡 Use qualified names in code for readability. They're stable as long as you don't rename the underlying table.

### UUID (Secondary)

System-generated unique identifier. Use for automation where names may change:

```
550e8400-e29b-41d4-a716-446655440000
```

## Error Responses

| Status | Code | Description |
|--------|------|-------------|
| 404 | `ASSET_NOT_FOUND` | Asset doesn't exist or you don't have access |
| 400 | `VALIDATION_ERROR` | Invalid qualified name format |

```json
{
  "error": {
    "code": "ASSET_NOT_FOUND",
    "message": "Asset not found",
    "details": {
      "asset_id": "snowflake.prod.warehouse.orders",
      "suggestion": "Check the qualified name format: source.database.schema.table"
    }
  }
}
```

## Common Questions

### Should I identify an asset by qualified name or UUID?

Use the qualified name (`source.database.schema.table`) for code and config you read by hand: it survives re-discovery and is human-debuggable. Use the UUID for automation where names might be renamed upstream, since UUIDs are stable across renames. Both identifiers work on every `/api/v1/assets/{id}` endpoint.

### How does AnomalyArmor discover assets?

Assets are discovered automatically when you connect a source (Snowflake, BigQuery, Postgres, etc.) via the dashboard. The Assets API is read-only; it returns whatever the platform has discovered so far. If an expected table is missing, trigger a re-scan from the source's page in the dashboard.

### Does listing assets return column-level details?

No. `GET /api/v1/assets` returns metadata and row/column counts. To get the column schema (names, types, nullability, primary keys), call `GET /api/v1/assets/{id}` for a single asset. That keeps list responses small and fast when you have hundreds of tables.
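
Walking the full asset inventory means following `offset`/`limit` pages until `has_more` is `false`. The sketch below shows one way that loop might look. It is an illustration, not SDK code: `iter_assets`, `fetch_page`, and `fake_fetch` are hypothetical names, and the in-memory `fake_fetch` stands in for a real `GET /api/v1/assets?limit=...&offset=...` call so the example runs offline. The response shape (`data` plus `pagination.has_more`) is taken from the List Assets example above.

```python
from typing import Any, Callable, Iterator

Page = dict[str, Any]


def iter_assets(
    fetch_page: Callable[[int, int], Page], limit: int = 100
) -> Iterator[dict[str, Any]]:
    """Yield every asset by requesting pages until has_more is false.

    fetch_page(limit, offset) is supplied by the caller (hypothetical here);
    it should perform the HTTP GET and return the parsed JSON body.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page["data"]
        # Stop on the last page, or on an empty page to avoid looping forever.
        if not page["data"] or not page["pagination"]["has_more"]:
            break
        offset += limit


def fake_fetch(limit: int, offset: int) -> Page:
    """In-memory stand-in for the HTTP call, using made-up table names."""
    names = [f"snowflake.prod.warehouse.t{i}" for i in range(5)]
    chunk = names[offset:offset + limit]
    return {
        "data": [{"qualified_name": name} for name in chunk],
        "pagination": {
            "total": len(names),
            "limit": limit,
            "offset": offset,
            "has_more": offset + limit < len(names),
        },
    }


all_names = [asset["qualified_name"] for asset in iter_assets(fake_fetch, limit=2)]
print(all_names)
```

In real use, `fetch_page` would send the Bearer token and could then call `GET /api/v1/assets/{id}` only for the assets whose columns are actually needed, which keeps the request count down against the rate limit.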
--- # Freshness API > Monitor and validate data freshness The Freshness API enables monitoring and validation of data freshness. Use it to check if your data is up-to-date and trigger on-demand freshness checks. ## Endpoints | Method | Endpoint | Description | |--------|----------|-------------| | GET | `/api/v1/sdk/freshness` | List freshness status for all assets | | GET | `/api/v1/sdk/freshness/summary` | Get freshness summary statistics | | GET | `/api/v1/sdk/freshness/{id}` | Get freshness status for an asset | | POST | `/api/v1/sdk/freshness/{id}/refresh` | Trigger freshness check | ## Get Freshness Summary ``` GET /api/v1/sdk/freshness/summary ``` Returns aggregate freshness statistics across all monitored assets. ### Response ```json { "data": { "total_assets": 245, "fresh": 230, "stale": 10, "unknown": 5, "fresh_percentage": 93.9, "by_source": { "snowflake": {"total": 150, "fresh": 145, "stale": 5}, "bigquery": {"total": 95, "fresh": 85, "stale": 5} } } } ``` ## List Freshness Status ``` GET /api/v1/sdk/freshness ``` ### Query Parameters | Parameter | Type | Default | Description | |-----------|------|---------|-------------| | `status` | string | - | Filter: `fresh`, `stale`, or `unknown` | | `source` | string | - | Filter by data source | | `limit` | integer | 50 | Max results (max: 100) | | `offset` | integer | 0 | Results to skip | ### Response ```json { "data": [ { "asset_id": "550e8400-e29b-41d4-a716-446655440000", "qualified_name": "snowflake.prod.warehouse.orders", "status": "fresh", "is_fresh": true, "last_updated": "2024-12-04T08:15:00Z", "threshold_hours": 24, "hours_since_update": 2.5, "checked_at": "2024-12-04T10:30:00Z" } ], "pagination": { "total": 100, "limit": 50, "offset": 0, "has_more": true } } ``` ## Get Asset Freshness ``` GET /api/v1/sdk/freshness/{id} ``` Get detailed freshness status for a specific asset. 
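
A client that calls this endpoint without the SDK still has to turn the JSON into a go/no-go decision. The sketch below is one hypothetical way to do that. `freshness_gate` is not part of any AnomalyArmor package, the payloads are hand-written samples using the `status`, `threshold_hours`, and `hours_since_update` fields documented for the list endpoint above, and treating `unknown` as a failure is an assumption rather than documented behavior.

```python
from typing import Any


def freshness_gate(payload: dict[str, Any]) -> tuple[bool, str]:
    """Decide whether a pipeline may proceed, with a human-readable reason."""
    data = payload["data"]
    name = data.get("qualified_name", "asset")
    status = data.get("status", "unknown")
    age = data.get("hours_since_update")
    threshold = data.get("threshold_hours")

    if status == "fresh":
        return True, f"{name} is fresh"
    if status == "stale" and age is not None and threshold is not None:
        overdue = age - threshold
        return False, f"{name} is stale, {overdue:.1f}h past its {threshold}h threshold"
    # Covers 'unknown' and any stale payload that lacks age or threshold.
    return False, f"{name} has status {status}; not treated as fresh"


# Hand-written sample payloads (assumed shape, not captured API output).
fresh_payload = {
    "data": {
        "qualified_name": "snowflake.prod.warehouse.orders",
        "status": "fresh",
        "threshold_hours": 24,
        "hours_since_update": 2.5,
    }
}
stale_payload = {
    "data": {
        "qualified_name": "snowflake.prod.warehouse.orders",
        "status": "stale",
        "threshold_hours": 24,
        "hours_since_update": 30.0,
    }
}

for sample in (fresh_payload, stale_payload):
    ok, reason = freshness_gate(sample)
    print(ok, reason)
```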
### Response

```json
{
  "data": {
    "asset_id": "550e8400-e29b-41d4-a716-446655440000",
    "qualified_name": "snowflake.prod.warehouse.orders",
    "status": "fresh",
    "is_fresh": true,
    "last_updated": "2024-12-04T08:15:00Z",
    "threshold_hours": 24,
    "hours_since_update": 2.5,
    "checked_at": "2024-12-04T10:30:00Z",
    "history": [
      {"checked_at": "2024-12-04T10:30:00Z", "status": "fresh"},
      {"checked_at": "2024-12-04T06:30:00Z", "status": "fresh"},
      {"checked_at": "2024-12-03T22:30:00Z", "status": "stale"}
    ]
  }
}
```

## Trigger Freshness Check

```
POST /api/v1/sdk/freshness/{id}/refresh
```

> â„šī¸ Requires `read-write` or `admin` scope.

### Query Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `wait` | boolean | false | Wait for check to complete |

### Response (async)

```json
{
  "data": {
    "job_id": "job_abc123",
    "status": "pending",
    "asset_id": "snowflake.prod.warehouse.orders"
  }
}
```

### Response (sync with wait=true)

```json
{
  "data": {
    "job_id": "job_abc123",
    "status": "completed",
    "asset_id": "snowflake.prod.warehouse.orders",
    "result": {
      "is_fresh": true,
      "last_updated": "2024-12-04T10:35:00Z"
    }
  }
}
```

## Gate Pattern: require_fresh()

The SDK provides a convenient gate pattern for pipelines:

```python
from anomalyarmor import Client
from anomalyarmor.exceptions import StalenessError

client = Client()

try:
    # Raises StalenessError if data is stale
    client.freshness.require_fresh("snowflake.prod.warehouse.orders")
    print("Data is fresh, proceeding...")
except StalenessError as e:
    print(f"Data is stale: last updated {e.last_updated}")
    raise  # Fail the pipeline
```

> 💡 Use `require_fresh()` in Airflow tasks to automatically fail pipelines when upstream data is stale. See the [Airflow Integration](/integrations/airflow) guide.

## CLI Check Command

The CLI provides a `check` command that exits with code 1 if data is stale:

```bash
# Exit 0 if fresh, exit 1 if stale
armor freshness check snowflake.prod.warehouse.orders
echo $?  # 0 = fresh, 1 = stale

# Use in shell scripts
if armor freshness check snowflake.prod.warehouse.orders; then
  echo "Data is fresh"
  dbt run
else
  echo "Data is stale, aborting"
  exit 1
fi
```

## Error Responses

### Asset Not Found (404)

```json
{
  "error": {
    "code": "ASSET_NOT_FOUND",
    "message": "Asset not found",
    "details": {"asset_id": "invalid.qualified.name"}
  }
}
```

### Forbidden (403)

When attempting to trigger refresh without proper scope:

```json
{
  "error": {
    "code": "FORBIDDEN",
    "message": "Insufficient permissions. Required scope: read-write",
    "details": {"current_scope": "read-only", "required_scope": "read-write"}
  }
}
```

## Common Questions

### What's the difference between the async and sync refresh calls?

Without `wait=true`, `POST /freshness/{id}/refresh` returns immediately with a `job_id` and `status=pending`. With `wait=true`, the request blocks until the check completes and returns the actual `is_fresh` result. Use async for fire-and-forget cron triggers and sync when a pipeline needs the answer before proceeding.

### How do I gate an Airflow or dbt pipeline on freshness?

In Python, call `client.freshness.require_fresh("...")`; it raises `StalenessError` when the data is stale, which fails the task cleanly. In shell, run `armor freshness check ` and rely on exit code 1 for stale. The [Airflow integration guide](/integrations/airflow) shows the full DAG pattern.

### Where does the freshness threshold come from?

Each asset has a `threshold_hours` configured from the dashboard (Freshness tab on the asset page) or inferred from historical update cadence. The API response shows the active `threshold_hours` next to `hours_since_update` so you can display both in your own UI.

### Why does freshness return status=unknown?

`unknown` means AnomalyArmor hasn't yet observed enough update history to decide, typically for newly connected assets or tables that haven't been written to since the source was connected.
Trigger a refresh to get a current read, and give the asset a few update cycles for the baseline to fill in.

---

# Schema API

> Monitor schema changes and detect drift

The Schema API monitors schema changes and detects drift in your data assets. Use it to track column additions, removals, type changes, and trigger on-demand schema checks.

## Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/api/v1/sdk/schema/summary` | Get schema drift summary |
| GET | `/api/v1/sdk/schema/changes` | List recent schema changes |
| POST | `/api/v1/sdk/schema/{id}/refresh` | Trigger schema check |

## Get Schema Summary

```
GET /api/v1/sdk/schema/summary
```

Returns aggregate schema drift statistics.

### Response

```json
{
  "data": {
    "total_assets": 245,
    "assets_with_changes": 12,
    "changes_last_24h": 5,
    "changes_last_7d": 23,
    "by_change_type": {
      "column_added": 15,
      "column_removed": 3,
      "column_type_changed": 8,
      "column_renamed": 2
    }
  }
}
```

## List Schema Changes

```
GET /api/v1/sdk/schema/changes
```

### Query Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `asset_id` | string | - | Filter by asset qualified name |
| `change_type` | string | - | Filter by change type |
| `since` | datetime | - | Changes since timestamp |
| `limit` | integer | 50 | Max results |
| `offset` | integer | 0 | Results to skip |

### Response

```json
{
  "data": [
    {
      "id": "change_123",
      "asset_id": "550e8400-e29b-41d4-a716-446655440000",
      "qualified_name": "snowflake.prod.warehouse.orders",
      "change_type": "column_added",
      "column_name": "discount_code",
      "details": {
        "new_type": "VARCHAR(50)",
        "nullable": true
      },
      "detected_at": "2024-12-04T09:15:00Z",
      "previous_schema_hash": "abc123",
      "current_schema_hash": "def456"
    }
  ],
  "pagination": {
    "total": 23,
    "limit": 50,
    "offset": 0,
    "has_more": false
  }
}
```

## Trigger Schema Check

```
POST /api/v1/sdk/schema/{id}/refresh
```

> â„šī¸ Requires `read-write` or `admin` scope.
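
For callers that are not using the SDK, the request can be assembled with the Python standard library. This is a minimal sketch under stated assumptions: `build_schema_refresh_request` is a hypothetical helper, the key value is a placeholder, and the request is only built, not sent. The host, path, Bearer header, and `wait` query parameter come from this documentation.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "https://api.anomalyarmor.ai"


def build_schema_refresh_request(asset: str, api_key: str, wait: bool = False) -> Request:
    """Build (but do not send) POST /api/v1/sdk/schema/{id}/refresh."""
    # Percent-encode the identifier so it is safe to place in the URL path.
    path = f"/api/v1/sdk/schema/{quote(asset, safe='')}/refresh"
    query = urlencode({"wait": "true"}) if wait else ""
    url = f"{BASE_URL}{path}" + (f"?{query}" if query else "")
    return Request(
        url,
        method="POST",
        headers={"Authorization": f"Bearer {api_key}"},
    )


req = build_schema_refresh_request(
    "snowflake.prod.mart.orders_mart",
    "aa_live_your_key_here",  # placeholder; needs read-write or admin scope
    wait=True,
)
print(req.get_method(), req.full_url)
```

Passing the result to `urllib.request.urlopen(req)` would send it; keeping construction separate from sending makes the URL and headers easy to inspect or unit test first.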
### Query Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `wait` | boolean | false | Wait for check to complete |

### Response

```json
{
  "data": {
    "job_id": "job_xyz789",
    "status": "completed",
    "asset_id": "snowflake.prod.warehouse.orders",
    "result": {
      "changes_detected": true,
      "changes": [
        {
          "change_type": "column_added",
          "column_name": "discount_code",
          "details": {"new_type": "VARCHAR(50)"}
        }
      ]
    }
  }
}
```

## Change Types

| Type | Description |
|------|-------------|
| `column_added` | New column added to table |
| `column_removed` | Column removed from table |
| `column_type_changed` | Column data type changed |
| `column_renamed` | Column name changed |
| `table_created` | New table created |
| `table_dropped` | Table dropped |

## Use Case: Post-Deploy Schema Check

Trigger a schema check after deploying dbt models:

```python
from anomalyarmor import Client

client = Client()


def verify_schema_after_deploy(models: list[str]):
    """Run schema check after dbt deployment."""
    for model in models:
        print(f"Checking schema: {model}")
        result = client.schema.refresh(model, wait=True)
        if result.changes_detected:
            print("  Schema changes detected:")
            for change in result.changes:
                print(f"    - {change.change_type}: {change.column_name}")
        else:
            print("  No schema changes")


# After dbt run
verify_schema_after_deploy([
    "snowflake.prod.mart.orders_mart",
    "snowflake.prod.mart.customers_mart",
])
```

## Common Questions

### How does AnomalyArmor detect schema changes?

A baseline schema snapshot is taken the first time an asset is profiled. On every scan (scheduled or triggered via `POST /schema/{id}/refresh`), the current column list, types, and nullability are diffed against the baseline and any delta is written to the changes feed. The `previous_schema_hash` and `current_schema_hash` fields let you correlate changes across runs.

### Can I trigger a schema check on demand after a dbt deploy?

Yes.
Call `POST /api/v1/sdk/schema/{id}/refresh?wait=true` (or `client.schema.refresh(asset, wait=True)`) in your post-deploy step. The synchronous response includes `changes_detected` and the change list, so you can fail the deploy pipeline if a breaking change lands unexpectedly. ### Which change types are reported? The feed reports `column_added`, `column_removed`, `column_type_changed`, `column_renamed`, `table_created`, and `table_dropped`. Renames are heuristic and are sometimes reported as paired add/remove when the column's data distribution has also shifted. --- # Lineage API > Explore data asset relationships and dependencies The Lineage API provides access to data asset relationships, showing upstream sources and downstream dependencies. Use it for impact analysis and understanding data flow. ## Endpoints | Method | Endpoint | Description | |--------|----------|-------------| | GET | `/api/v1/sdk/lineage` | List assets with lineage info | | GET | `/api/v1/sdk/lineage/{id}` | Get lineage for specific asset | ## List Assets with Lineage ``` GET /api/v1/sdk/lineage ``` ### Query Parameters | Parameter | Type | Default | Description | |-----------|------|---------|-------------| | `source` | string | - | Filter by data source | | `has_upstream` | boolean | - | Only assets with upstream dependencies | | `has_downstream` | boolean | - | Only assets with downstream dependencies | | `limit` | integer | 50 | Max results | | `offset` | integer | 0 | Results to skip | ### Response ```json { "data": [ { "asset_id": "550e8400-e29b-41d4-a716-446655440000", "qualified_name": "snowflake.prod.warehouse.orders", "upstream_count": 3, "downstream_count": 5 } ], "pagination": { "total": 100, "limit": 50, "offset": 0, "has_more": true } } ``` ## Get Asset Lineage ``` GET /api/v1/sdk/lineage/{id} ``` ### Query Parameters | Parameter | Type | Default | Description | |-----------|------|---------|-------------| | `direction` | string | `both` | `upstream`, `downstream`, or `both` | | 
`depth` | integer | 1 | Levels to traverse (1-5) | ### Response ```json { "data": { "asset_id": "550e8400-e29b-41d4-a716-446655440000", "qualified_name": "snowflake.prod.warehouse.orders", "upstream": [ { "asset_id": "raw_orders_id", "qualified_name": "snowflake.raw.stripe.orders", "asset_type": "table", "relationship": "derives_from", "depth": 1, "upstream": [ { "qualified_name": "stripe.api.orders", "asset_type": "api", "depth": 2 } ] }, { "asset_id": "raw_customers_id", "qualified_name": "snowflake.raw.crm.customers", "asset_type": "table", "relationship": "joins_with", "depth": 1 } ], "downstream": [ { "asset_id": "orders_mart_id", "qualified_name": "snowflake.prod.mart.orders_mart", "asset_type": "table", "relationship": "feeds_into", "depth": 1 }, { "asset_id": "dashboard_id", "qualified_name": "looker.sales_dashboard", "asset_type": "dashboard", "relationship": "feeds_into", "depth": 1 } ] } } ``` ## Relationship Types | Type | Description | |------|-------------| | `derives_from` | Target is derived from source (transformation) | | `joins_with` | Assets are joined together | | `feeds_into` | Source feeds into target | | `copies_to` | Direct copy relationship | | `references` | Lookup/reference relationship | ## Use Case: Check Upstream Before Pipeline Verify all upstream sources are fresh before running: ```python from anomalyarmor import Client client = Client() def check_upstream_freshness(asset: str): """Check all upstream sources are fresh.""" lineage = client.lineage.get(asset, direction="upstream") print(f"Checking {len(lineage.upstream)} upstream sources...") for upstream in lineage.upstream: client.freshness.require_fresh(upstream.qualified_name) print(f" {upstream.qualified_name}") print("All upstream sources are fresh!") check_upstream_freshness("snowflake.prod.mart.orders_summary") ``` ## Use Case: Impact Analysis Before making schema changes, check downstream impact: ```python from anomalyarmor import Client client = Client() def 
check_impact(asset: str): """Check downstream impact before schema change.""" lineage = client.lineage.get(asset, direction="downstream", depth=3) print(f"Downstream impact for {asset}:") print(f"Total downstream assets: {len(lineage.downstream)}") # Group by type by_type = {} for dep in lineage.downstream: by_type.setdefault(dep.asset_type, []).append(dep) for asset_type, assets in by_type.items(): print(f" {asset_type}: {len(assets)}") for asset in assets[:3]: print(f" - {asset.qualified_name}") check_impact("snowflake.prod.warehouse.orders") ``` Output: ``` Downstream impact for snowflake.prod.warehouse.orders: Total downstream assets: 8 table: 5 - snowflake.prod.mart.orders_mart - snowflake.prod.mart.revenue_daily - snowflake.prod.mart.customer_orders dashboard: 3 - looker.sales_dashboard - looker.executive_summary - tableau.orders_analysis ``` ## Common Questions ### How deep can I traverse lineage in one request? Pass `depth` from 1 to 5 on `GET /lineage/{id}`. Depth 1 returns direct neighbors, which is cheap and enough for most impact-analysis UIs. Higher depths fan out quickly, so avoid depth 5 in tight loops and paginate the results you consume. ### Where does lineage come from? Lineage is derived from warehouse query history, dbt manifests, and view definitions for supported sources, then enriched with dashboard consumers (Looker, Tableau, Mode). You won't see relationships for sources that don't expose query history or for ad-hoc scripts run outside the warehouse. ### What do the relationship types mean? `derives_from` marks transformation-based derivation (CREATE TABLE AS, dbt models). `joins_with` captures co-used inputs. `feeds_into` describes a non-derivation downstream consumer (e.g., a dashboard). `copies_to` is a direct copy, and `references` is a lookup/FK-style join. --- # Alerts API > Query alert history and status The Alerts API provides access to data quality alerts triggered by AnomalyArmor monitoring. 
Use it to query alert history, check active incidents, and integrate with your incident response workflows. ## Endpoints | Method | Endpoint | Description | |--------|----------|-------------| | GET | `/api/v1/sdk/alerts/summary` | Get alerts summary | | GET | `/api/v1/sdk/alerts` | List alerts | | GET | `/api/v1/sdk/alerts/rules` | List alert rules | ## Get Alerts Summary ``` GET /api/v1/sdk/alerts/summary ``` Returns aggregate alert statistics. ### Response ```json { "data": { "total": 45, "by_status": { "triggered": 5, "acknowledged": 3, "resolved": 37 }, "by_type": { "freshness": 20, "schema_change": 15, "row_count": 10 }, "triggered_last_24h": 8, "triggered_last_7d": 25 } } ``` ## List Alerts ``` GET /api/v1/sdk/alerts ``` ### Query Parameters | Parameter | Type | Default | Description | |-----------|------|---------|-------------| | `status` | string | - | Filter: `triggered`, `acknowledged`, `resolved` | | `alert_type` | string | - | Filter: `freshness`, `schema_change`, `row_count` | | `asset_id` | string | - | Filter by asset qualified name | | `since` | datetime | - | Alerts since timestamp | | `limit` | integer | 50 | Max results | | `offset` | integer | 0 | Results to skip | ### Response ```json { "data": [ { "id": "alert_123", "alert_type": "freshness", "status": "triggered", "asset_id": "550e8400-e29b-41d4-a716-446655440000", "qualified_name": "snowflake.prod.warehouse.orders", "message": "Data is 26 hours stale (threshold: 24 hours)", "triggered_at": "2024-12-04T08:00:00Z", "acknowledged_at": null, "resolved_at": null, "details": { "last_updated": "2024-12-03T06:00:00Z", "threshold_hours": 24, "hours_stale": 26 }, "rule_id": "rule_freshness_orders" } ], "pagination": { "total": 5, "limit": 50, "offset": 0, "has_more": false } } ``` ## List Alert Rules ``` GET /api/v1/sdk/alerts/rules ``` Returns configured alert rules for your organization. 
### Response ```json { "data": [ { "id": "rule_freshness_orders", "name": "Orders Freshness", "alert_type": "freshness", "asset_id": "snowflake.prod.warehouse.orders", "enabled": true, "config": { "threshold_hours": 24 }, "destinations": ["slack", "email"], "created_at": "2024-01-15T10:00:00Z" } ], "pagination": { "total": 15, "limit": 50, "offset": 0, "has_more": false } } ``` ## Alert Types | Type | Description | |------|-------------| | `freshness` | Data is stale beyond threshold | | `schema_change` | Schema change detected | | `row_count` | Row count anomaly detected | | `null_percentage` | Null percentage exceeded threshold | | `duplicate_keys` | Duplicate primary keys detected | ## Alert Statuses | Status | Description | |--------|-------------| | `triggered` | Alert is active and needs attention | | `acknowledged` | Alert has been seen but not resolved | | `resolved` | Alert has been resolved | ## Use Case: Incident Response Build an incident response script: ```python from anomalyarmor import Client client = Client() def get_active_incidents(): """Get all active alerts for incident response.""" alerts = client.alerts.list(status="triggered") by_type = {} for alert in alerts: by_type.setdefault(alert.alert_type, []).append(alert) print("=== Active Incidents ===") for alert_type, group in by_type.items(): print(f"\n{alert_type.upper()} ({len(group)}):") for alert in group: print(f" - {alert.qualified_name}: {alert.message}") return alerts incidents = get_active_incidents() ``` Output: ``` === Active Incidents === SCHEMA_CHANGE (1): - snowflake.prod.warehouse.payments: Column removed: deprecated_field FRESHNESS (2): - snowflake.prod.warehouse.orders: Data is 26 hours stale - bigquery.analytics.events: Data is 4 hours stale ``` ## Use Case: Slack Bot Integration Post alerts to Slack: ```python from anomalyarmor import Client import requests client = Client() SLACK_WEBHOOK
= "https://hooks.slack.com/services/..." def post_alerts_to_slack(): """Post triggered alerts to Slack.""" alerts = client.alerts.list(status="triggered") if not alerts: return blocks = [{"type": "header", "text": {"type": "plain_text", "text": f"{len(alerts)} Active Alerts"}}] for alert in alerts[:5]: # Limit to 5 blocks.append({ "type": "section", "text": { "type": "mrkdwn", "text": f"*{alert.qualified_name}*\n{alert.message}" } }) requests.post(SLACK_WEBHOOK, json={"blocks": blocks}) # Run periodically post_alerts_to_slack() ``` ## Common Questions ### How do I filter alerts by asset or type? Combine `asset_id`, `alert_type`, and `status` query parameters on `GET /api/v1/sdk/alerts`. For example, `?status=triggered&alert_type=freshness` returns only active freshness alerts. The `since` parameter accepts an ISO-8601 timestamp to bound history queries. ### Can I acknowledge or resolve alerts through the API? The read endpoints on this page cover history and summaries. For state transitions (acknowledge / resolve), use the per-alert endpoints documented in the [TypeScript SDK reference](/sdk/javascript-reference) (`client.alerts.acknowledge`, `client.alerts.resolve`) which post to `/api/v1/alerts/{alert_id}/acknowledge` and `.../resolve`. ### What alert types does AnomalyArmor emit? The API returns `freshness`, `schema_change`, `row_count`, `null_percentage`, and `duplicate_keys`. Each carries a `details` object with the relevant context (e.g. `hours_stale`, `threshold_hours` for freshness). Use `alert_type` to route to different Slack channels or PagerDuty services. --- # Metrics API > Monitor and track data quality metrics like row counts, null percentages, and more The Metrics API enables programmatic management of data quality metrics. Use it to track trends in your data over time, detect anomalies, and integrate quality monitoring into your pipelines. 
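
As a rough illustration of the statistical anomaly detection mentioned above: the `sensitivity` setting on this page is described as a z-score threshold. The sketch below shows one way such a check could work on a list of past metric values. The helper names `z_score` and `is_anomaly` are hypothetical, and the use of a population standard deviation over the full history is an assumption, not a description of AnomalyArmor's internals.

```python
import math
from statistics import mean, pstdev


def z_score(history: list[float], value: float) -> float:
    """Return how many standard deviations `value` lies from the mean of `history`.

    `history` must contain at least one value. This is an illustrative helper,
    not part of the AnomalyArmor SDK.
    """
    mu = mean(history)
    sigma = pstdev(history)  # population standard deviation (assumed)
    if sigma == 0:
        # A perfectly flat history: any deviation is infinitely surprising.
        return 0.0 if value == mu else math.copysign(math.inf, value - mu)
    return (value - mu) / sigma


def is_anomaly(history: list[float], value: float, sensitivity: float = 1.0) -> bool:
    """Flag `value` when its absolute z-score exceeds the sensitivity threshold."""
    return abs(z_score(history, value)) > sensitivity


# Example: mean is 5.0 and standard deviation is 2.0, so 9.0 has a z-score of 2.0.
history = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(is_anomaly(history, 9.0, sensitivity=1.0))  # True
print(is_anomaly(history, 9.0, sensitivity=3.0))  # False
```

Under these assumptions, raising the threshold makes the same value stop counting as anomalous, which is the trade-off to keep in mind when choosing a `sensitivity` for noisy tables.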
## Endpoints | Method | Endpoint | Description | |--------|----------|-------------| | GET | `/api/v1/sdk/metrics/{asset_id}/summary` | Get metrics summary for an asset | | GET | `/api/v1/sdk/metrics/{asset_id}` | List metrics for an asset | | GET | `/api/v1/sdk/metrics/{asset_id}/{metric_id}` | Get metric details | | POST | `/api/v1/sdk/metrics/{asset_id}` | Create a new metric | | PATCH | `/api/v1/sdk/metrics/{asset_id}/{metric_id}` | Update a metric | | DELETE | `/api/v1/sdk/metrics/{asset_id}/{metric_id}` | Delete a metric | | POST | `/api/v1/sdk/metrics/{asset_id}/{metric_id}/capture` | Trigger metric capture | | GET | `/api/v1/sdk/metrics/{asset_id}/{metric_id}/snapshots` | List metric snapshots | ## Metric Types | Type | Description | Requires Column | |------|-------------|-----------------| | `row_count` | Total row count of the table | No | | `null_percent` | Percentage of null values | Yes | | `distinct_count` | Count of distinct values | Yes | | `duplicate_count` | Count of duplicate values | Yes | | `min_value` | Minimum numeric value | Yes | | `max_value` | Maximum numeric value | Yes | | `mean` | Average numeric value | Yes | | `percentile` | Percentile value (requires `percentile_value`) | Yes | ## Get Metrics Summary ``` GET /api/v1/sdk/metrics/{asset_id}/summary ``` Returns aggregate metrics statistics for an asset. 
### Response ```json { "data": { "total_metrics": 15, "active_metrics": 12, "total_checks": 8, "passing": 6, "failing": 1, "warning": 1, "error": 0, "health_percentage": 87.5 } } ``` ## List Metrics ``` GET /api/v1/sdk/metrics/{asset_id} ``` ### Query Parameters | Parameter | Type | Default | Description | |-----------|------|---------|-------------| | `metric_type` | string | - | Filter by type (e.g., `row_count`, `null_percent`) | | `is_active` | boolean | - | Filter by active status | | `limit` | integer | 50 | Max results (max: 100) | | `offset` | integer | 0 | Results to skip | ### Response ```json { "data": { "items": [ { "id": "m_550e8400-e29b-41d4-a716-446655440001", "internal_id": 123, "asset_id": 456, "table_path": "snowflake.prod.warehouse.orders", "column_name": "customer_email", "metric_type": "null_percent", "capture_interval": "daily", "sensitivity": 3, "is_active": true, "created_at": "2026-01-01T10:00:00Z" } ] }, "pagination": { "total": 15, "limit": 50, "offset": 0, "has_more": false } } ``` ## Get Metric Details ``` GET /api/v1/sdk/metrics/{asset_id}/{metric_id} ``` ### Query Parameters | Parameter | Type | Default | Description | |-----------|------|---------|-------------| | `include_snapshots` | boolean | true | Include recent snapshots | | `snapshot_limit` | integer | 30 | Max snapshots to include | ### Response ```json { "data": { "id": "m_550e8400-e29b-41d4-a716-446655440001", "internal_id": 123, "asset_id": 456, "table_path": "snowflake.prod.warehouse.orders", "column_name": "customer_email", "metric_type": "null_percent", "capture_interval": "daily", "sensitivity": 3, "is_active": true, "created_at": "2026-01-01T10:00:00Z" } } ``` ## Create Metric ``` POST /api/v1/sdk/metrics/{asset_id} ``` > ℹ️ Requires `read-write` or `admin` scope.
### Request Body | Field | Type | Required | Description | |-------|------|----------|-------------| | `metric_type` | string | Yes | Metric type (see table above) | | `table_path` | string | Yes | Full table path (catalog.schema.table) | | `column_name` | string | For column metrics | Column name | | `capture_interval` | string | No | `hourly`, `daily`, `weekly` (default: `daily`) | | `sensitivity` | float | No | Anomaly detection sensitivity (default: 1.0) | | `group_by_columns` | array | No | Columns to group by | | `percentile_value` | float | No | Percentile value (for `percentile` type) | ### Response ```json { "data": { "id": "m_550e8400-e29b-41d4-a716-446655440001", "internal_id": 123, "asset_id": 456, "table_path": "snowflake.prod.warehouse.orders", "column_name": "customer_email", "metric_type": "null_percent", "capture_interval": "daily", "sensitivity": 1.0, "is_active": true, "created_at": "2026-01-04T10:30:00Z" } } ``` ## Update Metric ``` PATCH /api/v1/sdk/metrics/{asset_id}/{metric_id} ``` > ℹ️ Requires `read-write` or `admin` scope. ### Request Body | Field | Type | Description | |-------|------|-------------| | `is_active` | boolean | Whether metric is active | | `capture_interval` | string | Capture interval | | `sensitivity` | float | Anomaly detection sensitivity | ## Delete Metric ``` DELETE /api/v1/sdk/metrics/{asset_id}/{metric_id} ``` > ℹ️ Requires `read-write` or `admin` scope. ### Response ```json { "data": { "success": true, "message": "Metric deleted" } } ``` ## Trigger Metric Capture ``` POST /api/v1/sdk/metrics/{asset_id}/{metric_id}/capture ``` > ℹ️ Requires `read-write` or `admin` scope. Triggers an immediate capture of the metric value.
### Response ```json { "data": { "snapshot_count": 1, "snapshots": [ { "id": 789, "value": 2.5, "captured_at": "2026-01-04T10:35:00Z", "is_anomaly": false, "status": "PASS" } ] } } ``` ## List Metric Snapshots ``` GET /api/v1/sdk/metrics/{asset_id}/{metric_id}/snapshots ``` ### Query Parameters | Parameter | Type | Default | Description | |-----------|------|---------|-------------| | `limit` | integer | 100 | Max results | | `offset` | integer | 0 | Results to skip | ### Response ```json { "data": { "items": [ { "id": 789, "metric_definition_id": 123, "value": 2.5, "captured_at": "2026-01-04T10:35:00Z", "is_anomaly": false, "z_score": 0.3, "status": "PASS" }, { "id": 788, "metric_definition_id": 123, "value": 15.2, "captured_at": "2026-01-03T10:35:00Z", "is_anomaly": true, "z_score": 4.2, "status": "FAIL" } ] }, "pagination": { "total": 90, "limit": 30, "offset": 0, "has_more": true } } ``` ## Use Case: Monitor Row Count Trends Track daily row counts to detect unexpected data volume changes: ```python from anomalyarmor import Client client = Client() asset_id = "550e8400-e29b-41d4-a716-446655440000" # Create row count metric metric = client.metrics.create( asset_id, metric_type="row_count", table_path="snowflake.prod.warehouse.orders", capture_interval="daily", sensitivity=2.0, # Alert on 2+ standard deviations ) # Trigger initial capture result = client.metrics.capture(asset_id, metric.id) print(f"Initial row count: {result['snapshots'][0]['value']}") # Later: check for anomalies snapshots = client.metrics.snapshots(asset_id, metric.id, limit=7) anomalies = [s for s in snapshots if s.is_anomaly] if anomalies: print(f"Found {len(anomalies)} anomalies in the last 7 captures") ``` ## Error Responses ### Metric Not Found (404) ```json { "error": { "code": "METRIC_NOT_FOUND", "message": "Metric not found", "details": {"metric_id": "m_invalid-uuid"} } } ``` ### Validation Error (400) ```json { "error": { "code": "VALIDATION_ERROR", "message": "column_name is required 
for null_percent metrics", "details": {"field": "column_name", "metric_type": "null_percent"} } } ``` ### Forbidden (403) ```json { "error": { "code": "FORBIDDEN", "message": "Insufficient permissions. Required scope: read-write", "details": {"current_scope": "read-only", "required_scope": "read-write"} } } ``` ## Common Questions ### What's the difference between a metric and a validity rule? Metrics track numeric values over time (row counts, null percentages, mean, percentile) and alert on statistical anomalies via the `sensitivity` parameter. Validity rules ([Validity API](/api/validity)) enforce deterministic pass/fail constraints like NOT NULL, REGEX, or RANGE. Use metrics to catch drift, validity to catch explicit contract violations. ### How does the sensitivity parameter affect anomaly detection? Sensitivity is the z-score threshold for flagging a snapshot as anomalous, defaulting to 1.0. Raise it (e.g. `2.0` or `3.0`) to reduce false positives on noisy data, lower it to catch subtler shifts. Each captured snapshot returns `z_score` and `is_anomaly` so you can tune in production. ### Can I capture a metric on demand outside its scheduled interval? Yes. `POST /api/v1/sdk/metrics/{asset_id}/{metric_id}/capture` (or `client.metrics.capture(...)`) triggers an immediate capture and returns the new snapshot with anomaly status. This is useful for backfilling after creating a metric and for post-deploy validation without waiting for the next scheduled run. ### Which metric types require a column_name? Column-level types (`null_percent`, `distinct_count`, `duplicate_count`, `min_value`, `max_value`, `mean`, `percentile`) require `column_name`. `row_count` operates on the whole table and ignores `column_name`. `percentile` additionally needs `percentile_value` in the request body. --- # Validity API > Define and enforce data validity rules like NOT NULL, UNIQUE, and custom patterns The Validity API enables programmatic management of data validity rules. 
Use it to enforce data quality constraints, detect invalid records, and integrate validation into your pipelines. ## Endpoints | Method | Endpoint | Description | |--------|----------|-------------| | GET | `/api/v1/sdk/validity/{asset_id}/summary` | Get validity summary for an asset | | GET | `/api/v1/sdk/validity/{asset_id}` | List validity rules for an asset | | GET | `/api/v1/sdk/validity/{asset_id}/{rule_id}` | Get validity rule details | | POST | `/api/v1/sdk/validity/{asset_id}` | Create a new validity rule | | PATCH | `/api/v1/sdk/validity/{asset_id}/{rule_id}` | Update a validity rule | | DELETE | `/api/v1/sdk/validity/{asset_id}/{rule_id}` | Delete a validity rule | | POST | `/api/v1/sdk/validity/{asset_id}/{rule_id}/check` | Trigger validity check | | GET | `/api/v1/sdk/validity/{asset_id}/{rule_id}/results` | List check results | ## Rule Types | Type | Description | Configuration | |------|-------------|---------------| | `NOT_NULL` | Column must not contain null values | None | | `UNIQUE` | Column values must be unique | None | | `REGEX` | Values must match a regex pattern | `rule_config.pattern` | | `RANGE` | Numeric values must be within range | `rule_config.min`, `rule_config.max` | | `ENUM` | Values must be in allowed set | `rule_config.allowed_values` | | `DATE_FORMAT` | Values must match date format | `rule_config.format` | | `CUSTOM_SQL` | Custom SQL expression | `rule_config.sql_expression` | ## Get Validity Summary ``` GET /api/v1/sdk/validity/{asset_id}/summary ``` Returns aggregate validity statistics for an asset. 
### Response ```json { "data": { "total_rules": 12, "passing": 10, "failing": 1, "error": 1 } } ``` ## List Validity Rules ``` GET /api/v1/sdk/validity/{asset_id} ``` ### Query Parameters | Parameter | Type | Default | Description | |-----------|------|---------|-------------| | `rule_type` | string | - | Filter by type (e.g., `NOT_NULL`, `REGEX`) | | `is_active` | boolean | - | Filter by active status | | `limit` | integer | 50 | Max results (max: 100) | | `offset` | integer | 0 | Results to skip | ### Response ```json { "data": { "items": [ { "id": 123, "uuid": "v_550e8400-e29b-41d4-a716-446655440001", "table_path": "snowflake.prod.warehouse.orders", "column_name": "customer_email", "rule_type": "NOT_NULL", "name": "Customer Email Required", "severity": "critical", "is_active": true, "check_interval": "daily", "created_at": "2024-12-01T10:00:00Z" } ] }, "pagination": { "total": 12, "limit": 50, "offset": 0, "has_more": false } } ``` ## Get Validity Rule Details ``` GET /api/v1/sdk/validity/{asset_id}/{rule_id} ``` ### Response ```json { "data": { "id": 123, "uuid": "v_550e8400-e29b-41d4-a716-446655440001", "table_path": "snowflake.prod.warehouse.orders", "column_name": "customer_email", "rule_type": "NOT_NULL", "rule_config": null, "name": "Customer Email Required", "description": "Email address must not be null for valid orders", "severity": "critical", "is_active": true, "alert_threshold_percent": 1.0, "treat_null_as_valid": false, "check_interval": "daily" } } ``` ## Create Validity Rule ``` POST /api/v1/sdk/validity/{asset_id} ``` > ℹ️ Requires `read-write` or `admin` scope.
### Request Body | Field | Type | Required | Description | |-------|------|----------|-------------| | `rule_type` | string | Yes | Rule type (see table above) | | `table_path` | string | Yes | Full table path (catalog.schema.table) | | `column_name` | string | For column rules | Column name | | `rule_config` | object | For some types | Rule-specific configuration | | `name` | string | No | Human-readable rule name | | `description` | string | No | Rule description | | `severity` | string | No | `info`, `warning`, `critical` (default: `warning`) | | `error_message` | string | No | Custom error message | | `alert_threshold_percent` | float | No | Alert when invalid % exceeds this | | `treat_null_as_valid` | boolean | No | Whether nulls pass (default: `false`) | | `check_interval` | string | No | `hourly`, `daily`, `weekly` (default: `daily`) | ### Examples ### Response ```json { "data": { "id": 124, "uuid": "v_550e8400-e29b-41d4-a716-446655440002", "table_path": "snowflake.prod.warehouse.orders", "column_name": "customer_email", "rule_type": "NOT_NULL", "name": "Customer Email Required", "severity": "critical", "is_active": true, "check_interval": "daily", "created_at": "2024-12-04T10:30:00Z" } } ``` ## Update Validity Rule ``` PATCH /api/v1/sdk/validity/{asset_id}/{rule_id} ``` > ℹ️ Requires `read-write` or `admin` scope. ### Request Body | Field | Type | Description | |-------|------|-------------| | `is_active` | boolean | Whether rule is active | | `name` | string | Rule name | | `description` | string | Rule description | | `severity` | string | Severity level | | `alert_threshold_percent` | float | Alert threshold | | `treat_null_as_valid` | boolean | Null handling | | `check_interval` | string | Check interval | ## Delete Validity Rule ``` DELETE /api/v1/sdk/validity/{asset_id}/{rule_id} ``` > ℹ️ Requires `read-write` or `admin` scope.
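
The Create Validity Rule request body and the Rule Types table earlier on this page together determine what a valid payload looks like. As a minimal sketch, the hypothetical helper `build_rule_payload` below assembles a request body as a plain dictionary and checks that `rule_config` carries the keys listed for the chosen rule type. It assumes every key listed in the Rule Types table is required (for example, both `min` and `max` for `RANGE`), which the page does not state explicitly, and it does not call the API.

```python
# rule_config keys per rule type, copied from the Rule Types table.
# Treating all of them as required is an assumption made for this sketch.
REQUIRED_CONFIG_KEYS = {
    "NOT_NULL": (),
    "UNIQUE": (),
    "REGEX": ("pattern",),
    "RANGE": ("min", "max"),
    "ENUM": ("allowed_values",),
    "DATE_FORMAT": ("format",),
    "CUSTOM_SQL": ("sql_expression",),
}


def build_rule_payload(rule_type, table_path, column_name=None,
                       rule_config=None, **optional_fields):
    """Build a Create Validity Rule request body (hypothetical helper)."""
    if rule_type not in REQUIRED_CONFIG_KEYS:
        raise ValueError(f"Unknown rule_type: {rule_type}")

    config = dict(rule_config or {})
    missing = [key for key in REQUIRED_CONFIG_KEYS[rule_type] if key not in config]
    if missing:
        raise ValueError(f"{rule_type} rule_config is missing: {', '.join(missing)}")

    payload = {"rule_type": rule_type, "table_path": table_path}
    if column_name is not None:
        payload["column_name"] = column_name
    if config:
        payload["rule_config"] = config
    # Optional fields such as name, severity, or check_interval pass through as-is.
    payload.update(optional_fields)
    return payload


payload = build_rule_payload(
    "RANGE",
    "snowflake.prod.warehouse.orders",
    column_name="amount",  # illustrative column name
    rule_config={"min": 0, "max": 10000},
    severity="critical",
)
print(payload)
```

Validating the configuration client-side like this can surface a missing `pattern` or `allowed_values` before the request is sent, rather than through a `VALIDATION_ERROR` response.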
## Trigger Validity Check ``` POST /api/v1/sdk/validity/{asset_id}/{rule_id}/check ``` > ℹ️ Requires `read-write` or `admin` scope. ### Request Body | Field | Type | Default | Description | |-------|------|---------|-------------| | `sample_limit` | integer | 10 | Max invalid samples to collect | ### Response ```json { "data": { "id": 456, "validity_rule_id": 123, "status": "fail", "total_rows": 10000, "invalid_count": 25, "invalid_percent": 0.25, "invalid_samples": { "samples": [ {"row_id": 1001, "value": null}, {"row_id": 1042, "value": null} ] }, "execution_duration_ms": 1250, "checked_at": "2024-12-04T10:35:00Z" } } ``` ## List Check Results ``` GET /api/v1/sdk/validity/{asset_id}/{rule_id}/results ``` ### Query Parameters | Parameter | Type | Default | Description | |-----------|------|---------|-------------| | `limit` | integer | 100 | Max results | | `offset` | integer | 0 | Results to skip | ### Response ```json { "data": { "items": [ { "id": 456, "validity_rule_id": 123, "status": "fail", "total_rows": 10000, "invalid_count": 25, "invalid_percent": 0.25, "execution_duration_ms": 1250, "checked_at": "2024-12-04T10:35:00Z" }, { "id": 455, "validity_rule_id": 123, "status": "pass", "total_rows": 9975, "invalid_count": 0, "invalid_percent": 0.0, "execution_duration_ms": 1100, "checked_at": "2024-12-03T10:35:00Z" } ] }, "pagination": { "total": 60, "limit": 30, "offset": 0, "has_more": true } } ``` ## Use Case: Validate Email Format Ensure all customer emails match a valid format: ```python from anomalyarmor import Client client = Client() asset_id = "550e8400-e29b-41d4-a716-446655440000" # Create email validation rule rule = client.validity.create( asset_id, rule_type="REGEX", table_path="snowflake.prod.warehouse.customers", column_name="email", rule_config={"pattern": r"^[\w.-]+@[\w.-]+\.\w{2,}$"}, name="Valid Email Format", description="Validates email addresses match standard format", severity="warning", alert_threshold_percent=1.0, # Alert if > 1%
invalid ) # Run initial check result = client.validity.check(asset_id, rule.uuid, sample_limit=20) if result.status == "fail": print(f"Warning: {result.invalid_count} invalid emails found") print(f"Invalid rate: {result.invalid_percent:.2f}%") for sample in (result.invalid_samples or {}).get("samples", [])[:5]: print(f" - {sample}") else: print("All emails are valid!") ``` ## Error Responses ### Rule Not Found (404) ```json { "error": { "code": "RULE_NOT_FOUND", "message": "Validity rule not found", "details": {"rule_id": "v_invalid-uuid"} } } ``` ### Invalid Rule Configuration (400) ```json { "error": { "code": "VALIDATION_ERROR", "message": "Invalid regex pattern in rule_config", "details": {"field": "rule_config.pattern", "error": "Invalid regex syntax"} } } ``` ### Forbidden (403) ```json { "error": { "code": "FORBIDDEN", "message": "Insufficient permissions. Required scope: read-write", "details": {"current_scope": "read-only", "required_scope": "read-write"} } } ``` ## Common Questions ### Which validity rule type should I pick? Use `NOT_NULL` for required columns, `UNIQUE` for primary-key-like invariants, `REGEX` for string formats (emails, IDs), `RANGE` for numeric bounds, `ENUM` for finite allowed sets, `DATE_FORMAT` for date strings, and `CUSTOM_SQL` when no built-in type fits. Each type's required `rule_config` is listed in the Rule Types table at the top of this page. ### How do I inspect rows that failed a validity rule? Every `check` response includes `invalid_samples.samples`, up to `sample_limit` rows (default 10, configurable per check). Bump `sample_limit` up to 100 when debugging a broken ingest. The response also returns `invalid_count` and `invalid_percent` so you can report a failure rate even when individual samples aren't needed. ### What does the alert_threshold_percent field control? `alert_threshold_percent` is the invalid-row percentage that flips a check result from pass to fail. Set it to `0` if any single invalid row should page you. 
Use higher values (e.g. `1.0`) on rules where a small amount of invalid data is tolerated and you only want to catch systemic regressions. ### How is treat_null_as_valid different from using NOT_NULL? `treat_null_as_valid` governs how a non-NULL-type rule (REGEX, RANGE, ENUM, etc.) handles NULLs. When `true`, NULL rows are skipped; when `false`, NULLs count as invalid. Combine with a separate `NOT_NULL` rule when you need to enforce both non-null and format at once - they surface as two distinct checks you can alert on independently. --- # Referential Checks API > Verify foreign key relationships and detect orphaned records The Referential Checks API enables monitoring of foreign key relationships between tables. Use it to detect orphaned records, validate referential integrity, and integrate relationship checks into your data quality workflows. ## Endpoints | Method | Endpoint | Description | |--------|----------|-------------| | GET | `/api/v1/sdk/referential/{asset_id}/summary` | Get referential summary for an asset | | GET | `/api/v1/sdk/referential/{asset_id}` | List referential checks for an asset | | GET | `/api/v1/sdk/referential/{asset_id}/{check_id}` | Get referential check details | | POST | `/api/v1/sdk/referential/{asset_id}` | Create a new referential check | | PATCH | `/api/v1/sdk/referential/{asset_id}/{check_id}` | Update a referential check | | DELETE | `/api/v1/sdk/referential/{asset_id}/{check_id}` | Delete a referential check | | POST | `/api/v1/sdk/referential/{asset_id}/{check_id}/execute` | Execute referential check | | GET | `/api/v1/sdk/referential/{asset_id}/{check_id}/results` | List check results | ## Concepts A referential integrity check validates that values in a **child column** (foreign key) exist in a **parent column** (primary key). 
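
The check described above amounts to an anti-join: collect the child values that have no match among the parent values. The sketch below shows that idea on in-memory lists. The function name `find_orphans` is hypothetical, skipping NULL child values is an assumption borrowed from how SQL foreign keys usually behave, and the percentage is taken over all child rows, in line with the `orphan_count`, `total_child_rows`, and `orphan_percent` fields in the example responses on this page.

```python
def find_orphans(child_values, parent_values):
    """Return (orphans, orphan_percent) for a child/parent column pair.

    Illustrative only: a real check would run as a query in the warehouse.
    """
    parents = set(parent_values)
    # NULL foreign keys are not counted as orphans (assumption).
    orphans = [v for v in child_values if v is not None and v not in parents]
    total_child_rows = len(child_values)
    orphan_percent = 100.0 * len(orphans) / total_child_rows if total_child_rows else 0.0
    return orphans, orphan_percent


# orders.customer_id values checked against customers.id values
orphans, percent = find_orphans([1, 2, 3, 99, None], [1, 2, 3])
print(orphans, percent)  # [99] 20.0
```

Thresholds such as `max_orphan_count` or `max_orphan_percent` would then be compared against these two numbers to decide whether a result counts as a failure.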
- **Child table**: The table containing the foreign key (e.g., `orders.customer_id`) - **Parent table**: The table being referenced (e.g., `customers.id`) - **Orphan**: A record in the child table with no matching parent record ## Get Referential Summary ``` GET /api/v1/sdk/referential/{asset_id}/summary ``` Returns aggregate referential integrity statistics for an asset. ### Response ```json { "data": { "total_checks": 8, "active_checks": 7, "passing_checks": 6, "failing_checks": 1, "last_check_at": "2024-12-04T10:30:00Z" } } ``` ## List Referential Checks ``` GET /api/v1/sdk/referential/{asset_id} ``` ### Query Parameters | Parameter | Type | Default | Description | |-----------|------|---------|-------------| | `is_active` | boolean | - | Filter by active status | | `limit` | integer | 50 | Max results (max: 100) | | `offset` | integer | 0 | Results to skip | ### Response ```json { "data": { "items": [ { "id": "r_550e8400-e29b-41d4-a716-446655440001", "internal_id": 123, "asset_id": 456, "child_table_path": "snowflake.prod.warehouse.orders", "child_column_name": "customer_id", "parent_table_path": "snowflake.prod.warehouse.customers", "parent_column_name": "id", "name": "Orders -> Customers FK", "capture_interval": "daily", "is_active": true, "created_at": "2024-12-01T10:00:00Z" } ] }, "pagination": { "total": 8, "limit": 50, "offset": 0, "has_more": false } } ``` ## Get Referential Check Details ``` GET /api/v1/sdk/referential/{asset_id}/{check_id} ``` ### Response ```json { "data": { "id": "r_550e8400-e29b-41d4-a716-446655440001", "internal_id": 123, "asset_id": 456, "child_table_path": "snowflake.prod.warehouse.orders", "child_column_name": "customer_id", "parent_table_path": "snowflake.prod.warehouse.customers", "parent_column_name": "id", "name": "Orders -> Customers FK", "description": "Validates orders reference existing customers", "capture_interval": "daily", "max_orphan_count": 0, "max_orphan_percent": 0.1, "min_child_count": null, "max_child_count": 
null, "is_active": true } } ``` ## Create Referential Check ``` POST /api/v1/sdk/referential/{asset_id} ``` > ℹ️ Requires `read-write` or `admin` scope. ### Request Body | Field | Type | Required | Description | |-------|------|----------|-------------| | `child_table_path` | string | Yes | Full path to child table (with FK column) | | `child_column_name` | string | Yes | Column name in child table (FK) | | `parent_table_path` | string | Yes | Full path to parent table (with PK column) | | `parent_column_name` | string | Yes | Column name in parent table (PK) | | `name` | string | No | Human-readable check name | | `description` | string | No | Check description | | `capture_interval` | string | No | `hourly`, `daily`, `weekly` (default: `daily`) | | `max_orphan_count` | integer | No | Alert if orphan count exceeds this | | `max_orphan_percent` | float | No | Alert if orphan % exceeds this | | `min_child_count` | integer | No | Min expected children per parent | | `max_child_count` | integer | No | Max expected children per parent | ### Response ```json { "data": { "id": "r_550e8400-e29b-41d4-a716-446655440002", "internal_id": 124, "asset_id": 456, "child_table_path": "snowflake.prod.warehouse.orders", "child_column_name": "customer_id", "parent_table_path": "snowflake.prod.warehouse.customers", "parent_column_name": "id", "name": "Orders -> Customers FK", "capture_interval": "daily", "max_orphan_count": 0, "is_active": true, "created_at": "2024-12-04T10:30:00Z" } } ``` ## Update Referential Check ``` PATCH /api/v1/sdk/referential/{asset_id}/{check_id} ``` > ℹ️ Requires `read-write` or `admin` scope.
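Because `PATCH` is a partial update, the body presumably needs to carry only the fields being changed. The sketch below assembles such a request with Python's standard library without sending it. The helper name `build_update_request` and the choice to drop `None` values are assumptions made for illustration, not documented behavior; the host and Bearer header follow the `app.anomalyarmor.ai` and `aa_live_*` conventions used throughout these docs.

```python
import json
import urllib.request

# Assumption: the REST API is served from this host, as stated in the SDK overview.
BASE_URL = "https://app.anomalyarmor.ai"


def build_update_request(api_key, asset_id, check_id, **changes):
    """Build (but do not send) a PATCH request for a referential check.

    Hypothetical helper: only keyword arguments that are not None
    are included in the JSON body.
    """
    body = {field: value for field, value in changes.items() if value is not None}
    url = f"{BASE_URL}/api/v1/sdk/referential/{asset_id}/{check_id}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="PATCH",
    )


request = build_update_request(
    "aa_live_xxx",
    "550e8400-e29b-41d4-a716-446655440000",
    "r_550e8400-e29b-41d4-a716-446655440001",
    is_active=False,
    max_orphan_percent=0.5,
    name=None,  # dropped from the body because it is None
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the result to `urllib.request.urlopen` would send it. Dropping `None` values keeps the body minimal, but it also means this helper cannot be used to clear a threshold by setting it to null.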
### Request Body | Field | Type | Description | |-------|------|-------------| | `is_active` | boolean | Whether check is active | | `name` | string | Check name | | `description` | string | Check description | | `capture_interval` | string | Capture interval | | `max_orphan_count` | integer | Max orphan count threshold | | `max_orphan_percent` | float | Max orphan percent threshold | | `min_child_count` | integer | Min child count cardinality | | `max_child_count` | integer | Max child count cardinality | ## Delete Referential Check ``` DELETE /api/v1/sdk/referential/{asset_id}/{check_id} ``` > ℹ️ Requires `read-write` or `admin` scope. ## Execute Referential Check ``` POST /api/v1/sdk/referential/{asset_id}/{check_id}/execute ``` > ℹ️ Requires `read-write` or `admin` scope. Triggers an immediate execution of the referential integrity check. ### Response ```json { "data": { "id": 789, "referential_check_id": 123, "status": "fail", "orphan_count": 5, "orphan_percent": 0.05, "total_child_rows": 10000, "orphan_sample": [ {"customer_id": 99999}, {"customer_id": 99998} ], "parents_below_min": null, "parents_above_max": null, "query_duration_ms": 2500, "created_at": "2024-12-04T10:35:00Z" } } ``` ## List Check Results ``` GET /api/v1/sdk/referential/{asset_id}/{check_id}/results ``` ### Query Parameters | Parameter | Type | Default | Description | |-----------|------|---------|-------------| | `limit` | integer | 100 | Max results | | `offset` | integer | 0 | Results to skip | ### Response ```json { "data": { "items": [ { "id": 789, "referential_check_id": 123, "status": "fail", "orphan_count": 5, "orphan_percent": 0.05, "total_child_rows": 10000, "query_duration_ms": 2500, "created_at": "2024-12-04T10:35:00Z" }, { "id": 788, "referential_check_id": 123, "status": "pass", "orphan_count": 0, "orphan_percent": 0.0, "total_child_rows": 9995, "query_duration_ms": 2300, "created_at": "2024-12-03T10:35:00Z" } ] }, "pagination": { "total": 60, "limit": 30, "offset": 0,
"has_more": true } } ``` ## Use Case: Detect Orphaned Orders Monitor orders that reference non-existent customers: ```python from anomalyarmor import Client client = Client() asset_id = "550e8400-e29b-41d4-a716-446655440000" # Create a referential check check = client.referential.create( asset_id, child_table_path="snowflake.prod.warehouse.orders", child_column_name="customer_id", parent_table_path="snowflake.prod.warehouse.customers", parent_column_name="id", name="Orders -> Customers FK", description="Ensures all orders reference valid customers", max_orphan_count=0, # Zero tolerance for orphans ) # Execute check immediately result = client.referential.execute(asset_id, check.id) if result.status == "fail": print(f"Data quality issue: {result.orphan_count} orphaned orders found") print(f"Orphan rate: {result.orphan_percent:.3f}%") # Log sample orphans for investigation for orphan in (result.orphan_sample or []): print(f" Orphaned customer_id: {orphan}") # This could trigger an alert or fail a pipeline raise Exception("Referential integrity violation detected") else: print("All orders reference valid customers!") ``` ## Use Case: Cardinality Validation Ensure each order has a reasonable number of line items: ```python from anomalyarmor import Client client = Client() asset_id = "550e8400-e29b-41d4-a716-446655440000" # Create check with cardinality constraints check = client.referential.create( asset_id, child_table_path="snowflake.prod.warehouse.order_items", child_column_name="order_id", parent_table_path="snowflake.prod.warehouse.orders", parent_column_name="id", name="Order Items -> Orders FK", min_child_count=1, # Each order must have at least 1 item max_child_count=100, # No order should have more than 100 items ) result = client.referential.execute(asset_id, check.id) if result.parents_below_min is not None and result.parents_below_min > 0: print(f"Found {result.parents_below_min} orders with no items!") if result.parents_above_max is not None and 
result.parents_above_max > 0: print(f"Found {result.parents_above_max} orders with too many items!") ``` ## Error Responses ### Check Not Found (404) ```json { "error": { "code": "CHECK_NOT_FOUND", "message": "Referential check not found", "details": {"check_id": "r_invalid-uuid"} } } ``` ### Invalid Table Path (400) ```json { "error": { "code": "VALIDATION_ERROR", "message": "Table not found: snowflake.prod.warehouse.invalid_table", "details": {"field": "child_table_path"} } } ``` ### Check Inactive (400) ```json { "error": { "code": "VALIDATION_ERROR", "message": "Cannot execute inactive check", "details": {"check_id": "r_550e8400-e29b-41d4-a716-446655440001"} } } ``` ### Forbidden (403) ```json { "error": { "code": "FORBIDDEN", "message": "Insufficient permissions. Required scope: read-write", "details": {"current_scope": "read-only", "required_scope": "read-write"} } } ``` ## Common Questions ### Do my warehouse tables need declared foreign keys for this to work? No. Referential checks work against any two columns you name, regardless of whether the underlying warehouse enforces or even supports FK constraints. Snowflake and BigQuery, for instance, don't enforce FKs, which is exactly why this API exists. Just provide the child and parent `table_path` and column names. ### When should I use max_orphan_count vs max_orphan_percent? Use `max_orphan_count=0` for hard-invariant relationships (every order must reference a real customer). Use `max_orphan_percent` when some drift is acceptable (e.g. late-arriving dimension rows) and you care about the rate, not the absolute count. You can set both; either threshold breach marks the check as failed. ### How do min_child_count and max_child_count work? These are cardinality constraints on the parent->children relationship. `min_child_count=1` flags parents with zero children (e.g. orders with no line items), and `max_child_count=100` flags parents with too many. 
The result reports `parents_below_min` and `parents_above_max` counts separately from the orphan count. ### How expensive are referential checks on large tables? The query joins child and parent tables, so cost scales with the warehouse's join performance on the key columns. `query_duration_ms` is returned on every result for observability. For very large tables, prefer `daily` or `weekly` `capture_interval` and rely on `execute` for on-demand checks after loads rather than running them hourly. --- # SDK overview > Get started with the AnomalyArmor SDKs (Python and TypeScript) AnomalyArmor ships official SDKs for **Python** and **TypeScript**. Both talk to the same REST API (`app.anomalyarmor.ai`) with the same `aa_live_*` Bearer tokens, so an existing Python script and a new Vercel function see identical data. | Language | Package | Install | Docs | |----------|---------|---------|------| | Python | `anomalyarmor-cli` on PyPI | `pip install anomalyarmor-cli` | Below | | TypeScript / JavaScript | `@anomalyarmor/sdk` on npm | `npm install @anomalyarmor/sdk` | [JavaScript SDK](/sdk/javascript) | The Python SDK also ships the `anomalyarmor` CLI. The TS SDK ships an `npx anomalyarmor` CLI. Both CLIs read `ANOMALYARMOR_API_KEY` from env as a convenience - library code in both SDKs requires the key to be passed explicitly. ## Python SDK `anomalyarmor-cli` provides a Python SDK and CLI for programmatic access. ### Installation ```bash pip install anomalyarmor-cli ``` > ℹ️ Requires Python 3.9 or higher.
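After installing, it can be useful to confirm that the interpreter meets the 3.9 requirement and that the `anomalyarmor-cli` distribution is visible to it. The following is a minimal sketch using only the standard library; `python_is_supported` and `installed_version` are hypothetical helper names, not part of the SDK.

```python
import sys
from importlib import metadata

# Minimum interpreter version stated in the installation note.
MIN_PYTHON = (3, 9)


def python_is_supported(version_info=None):
    """Return True if the given (or current) interpreter is at least MIN_PYTHON."""
    info = sys.version_info if version_info is None else version_info
    return tuple(info[:2]) >= MIN_PYTHON


def installed_version(dist_name="anomalyarmor-cli"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


print("Python OK:", python_is_supported())
print("anomalyarmor-cli version:", installed_version() or "not installed")
```

Running this in the same virtual environment that your pipeline uses helps catch the common case where the package was installed into a different interpreter.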
### Quick Start ```python from anomalyarmor import Client # Initialize - uses ARMOR_API_KEY env var client = Client() # Or pass key directly client = Client(api_key="aa_live_xxx") # List assets assets = client.assets.list(source="snowflake", limit=10) for asset in assets: print(asset.qualified_name) # Check freshness (raises StalenessError if stale) client.freshness.require_fresh("snowflake.prod.warehouse.orders") ``` ## Configuration ### Environment Variables | Variable | Description | |----------|-------------| | `ARMOR_API_KEY` | Your API key (recommended) | | `ARMOR_API_URL` | Custom API URL (optional) | ```bash export ARMOR_API_KEY="aa_live_your_key_here" ``` ### Client Options ```python from anomalyarmor import Client client = Client( api_key="aa_live_xxx", # Or use ARMOR_API_KEY env var api_url="https://...", # Custom API URL (optional) timeout=30, # Request timeout in seconds ) ``` ## Resources The client provides access to all AnomalyArmor resources: ```python client.assets # Data assets client.freshness # Freshness monitoring client.schema # Schema drift detection client.lineage # Data lineage client.alerts # Alert history client.metrics # Data quality metrics client.api_keys # API key management (admin scope) ``` ## Common Patterns ### Airflow Pre-flight Check Gate your pipeline on data freshness: ```python from anomalyarmor import Client from anomalyarmor.exceptions import StalenessError def check_upstream_freshness(): client = Client() try: # Raises StalenessError if stale client.freshness.require_fresh("snowflake.prod.warehouse.orders") print("Data is fresh, proceeding...") except StalenessError as e: print(f"Data is stale: {e}") raise # Fail the task ``` ### List and Filter Assets ```python from anomalyarmor import Client client = Client() # All Snowflake tables tables = client.assets.list(source="snowflake", type="table") # Paginate through all assets offset = 0 while True: assets = client.assets.list(limit=100, offset=offset) if not assets: break 
for asset in assets: process(asset) offset += 100 ``` ### Trigger and Wait for Refresh ```python from anomalyarmor import Client client = Client() # Trigger freshness check and wait for completion result = client.freshness.refresh( "snowflake.prod.warehouse.orders", wait=True # Block until complete ) print(f"Job {result.job_id}: {result.status}") ``` ### Check Lineage Before Running ```python from anomalyarmor import Client client = Client() # Get upstream dependencies lineage = client.lineage.get("snowflake.prod.warehouse.orders") # Check all upstream sources are fresh for upstream in lineage.upstream: client.freshness.require_fresh(upstream.qualified_name) print("All upstream sources are fresh!") ``` ### Monitor Data Quality Metrics ```python from anomalyarmor import Client client = Client() asset_id = "550e8400-e29b-41d4-a716-446655440000" # Create a row count metric with anomaly detection metric = client.metrics.create( asset_id, metric_type="row_count", table_path="snowflake.prod.warehouse.orders", capture_interval="daily", sensitivity=2.0, # Alert on 2+ standard deviations ) # Get metrics summary summary = client.metrics.summary(asset_id) print(f"Health: {summary.health_percentage}%") # Check for recent anomalies snapshots = client.metrics.snapshots(asset_id, metric.id, limit=7) anomalies = [s for s in snapshots if s.is_anomaly] if anomalies: print(f"Found {len(anomalies)} anomalies in the last 7 days") ``` ## Exception Handling ```python from anomalyarmor import Client from anomalyarmor.exceptions import ( StalenessError, # Data is stale AuthenticationError, # Invalid/missing API key NotFoundError, # Asset not found RateLimitError, # Rate limit exceeded ValidationError, # Invalid parameters ServerError, # Server error ArmorError, # Base exception ) client = Client() try: client.freshness.require_fresh("snowflake.prod.warehouse.orders") except StalenessError as e: print(f"Data is stale: last updated {e.last_updated}") except AuthenticationError: 
print("Invalid API key") except RateLimitError as e: print(f"Rate limited, retry after {e.retry_after} seconds") except ArmorError as e: print(f"API error: {e}") ``` ## Context Manager The client supports context manager for automatic cleanup: ```python from anomalyarmor import Client with Client() as client: assets = client.assets.list() # Connection automatically closed ``` ## Type Hints The SDK is fully typed for IDE support: ```python from anomalyarmor import Client from anomalyarmor.models import Asset, FreshnessStatus client = Client() asset: Asset = client.assets.get("snowflake.prod.warehouse.orders") status: FreshnessStatus = client.freshness.get(asset.qualified_name) print(status.is_fresh) # IDE knows this is bool print(status.last_updated) # IDE knows this is datetime ``` ## Next Steps - **[SDK Reference](/sdk/reference)**: Complete method reference - **[CLI Guide](/cli/overview)**: Command-line interface - **[Airflow Integration](/integrations/airflow)**: Use in Airflow DAGs - **[API Reference](/api/overview)**: REST API documentation ## Common Questions ### Should I use the Python SDK or the TypeScript SDK? Pick whichever matches the runtime you're already using: Python for Airflow, dbt hooks, and notebooks; TypeScript for Next.js, Vercel functions, and Node services. Both SDKs wrap the same REST API and accept the same `aa_live_*` keys, so mixing them across services is fine. See the [TypeScript SDK page](/sdk/javascript) for Node-specific setup. ### Which Python version does the SDK require? Python 3.9 or higher. The SDK ships fully typed models (`Asset`, `FreshnessStatus`, etc.) so you get IDE completion on every method and field. Install with `pip install anomalyarmor-cli`, which also installs the `armor` CLI. ### How do I paginate through thousands of assets in Python? Call `client.assets.list(limit=100, offset=n)` in a loop, incrementing `offset` by the page size until an empty page comes back. 
The SDK mirrors the REST pagination directly rather than hiding it, which keeps memory flat for very large accounts. The pattern is shown in the "List and Filter Assets" example above. ### How do I use the SDK with Airflow? Call `client.freshness.require_fresh(asset)` at the top of a task; it raises `StalenessError` when the asset is stale, which Airflow surfaces as a task failure. Set `ARMOR_API_KEY` as an Airflow connection secret and instantiate `Client()` with no arguments. The [Airflow integration guide](/integrations/airflow) shows a full DAG. --- # SDK Reference > Complete Python SDK method reference Complete reference for all `anomalyarmor-cli` SDK classes and methods. ## Client The main entry point for the SDK. ```python from anomalyarmor import Client client = Client( api_key="aa_live_xxx", # Or use ARMOR_API_KEY env var api_url="https://app.anomalyarmor.ai/api/v1", # Optional timeout=30, # Request timeout in seconds ) ``` ### Constructor | Parameter | Type | Default | Description | |-----------|------|---------|-------------| | `api_key` | `str \| None` | `None` | API key. Falls back to `ARMOR_API_KEY` env var | | `api_url` | `str \| None` | Production URL | Base URL for API requests | | `timeout` | `int \| None` | `30` | Request timeout in seconds | ### Context Manager ```python with Client() as client: assets = client.assets.list() # Connection automatically closed ``` ## client.freshness Monitor data freshness. ### freshness.summary() Get aggregate freshness statistics. ```python summary = client.freshness.summary() print(f"Fresh: {summary.fresh}/{summary.total_assets}") print(f"Fresh rate: {summary.fresh_percentage}%") ``` **Returns:** `FreshnessSummary` ### freshness.list() List freshness status for all assets. ```python statuses = client.freshness.list( status="stale", # Filter: "fresh", "stale", "unknown" limit=50, offset=0, ) ``` **Returns:** `list[FreshnessStatus]` ### freshness.get() Get freshness status for a specific asset. 
```python status = client.freshness.get("snowflake.prod.warehouse.orders") print(f"Fresh: {status.is_fresh}") print(f"Last updated: {status.last_updated}") print(f"Hours since update: {status.hours_since_update}") ``` **Returns:** `FreshnessStatus` ### freshness.require_fresh() Require an asset to be fresh, raising an error if stale. **This is the recommended gate pattern for pipelines.** ```python from anomalyarmor.exceptions import StalenessError try: client.freshness.require_fresh( "snowflake.prod.warehouse.orders", max_age_hours=24, # Optional custom threshold ) print("Data is fresh!") except StalenessError as e: print(f"Stale: {e.hours_since_update}h old") raise ``` **Parameters:** - `asset_id` (str): Asset qualified name or UUID - `max_age_hours` (float | None): Custom threshold. Uses asset's configured threshold if not provided. **Returns:** `FreshnessStatus` if fresh **Raises:** `StalenessError` if stale, `NotFoundError` if not found ### freshness.refresh() Trigger a freshness check. ```python result = client.freshness.refresh("snowflake.prod.warehouse.orders") print(f"Job ID: {result['job_id']}") ``` **Returns:** `dict` with `job_id`, `status`, `message` **Raises:** `NotFoundError`, `AuthorizationError` (requires `read-write` scope) --- ## client.schema Monitor schema drift. ### schema.summary() Get schema drift summary statistics. ```python summary = client.schema.summary() print(f"Changes last 24h: {summary.changes_last_24h}") ``` **Returns:** `SchemaSummary` ### schema.changes() List recent schema changes. ```python changes = client.schema.changes( asset_id="snowflake.prod.warehouse.orders", # Optional filter change_type="column_added", # Optional filter limit=50, offset=0, ) for change in changes: print(f"{change.qualified_name}: {change.change_type}") ``` **Returns:** `list[SchemaChange]` ### schema.refresh() Trigger a schema check. 
```python result = client.schema.refresh("snowflake.prod.warehouse.orders") ``` **Returns:** `dict` with `job_id`, `status` --- ## client.lineage Explore data dependencies. ### lineage.list() List assets with lineage information. ```python assets = client.lineage.list( source="snowflake", has_upstream=True, has_downstream=True, limit=50, ) ``` **Returns:** `list[LineageAsset]` ### lineage.get() Get lineage for a specific asset. ```python lineage = client.lineage.get( "snowflake.prod.warehouse.orders", direction="both", # "upstream", "downstream", or "both" depth=2, # Levels to traverse (1-5) ) for upstream in lineage.upstream: print(f"<- {upstream.qualified_name}") for downstream in lineage.downstream: print(f"-> {downstream.qualified_name}") ``` **Returns:** `Lineage` --- ## client.tags Manage data classification tags. ### tags.list() List tags for an asset. ```python tags = client.tags.list( asset="postgresql.analytics", # Asset ID or qualified name category="business", # Optional filter ) for tag in tags: print(f"{tag.name} on {tag.object_path}") ``` **Returns:** `list[Tag]` ### tags.create() Create a tag on a database object. ```python tag = client.tags.create( asset="postgresql.analytics", name="pii_data", object_path="gold.customers", # Required: schema.table or schema.table.column object_type="table", # "table" or "column" (default: "table") category="governance", # "business", "technical", "governance" description="Contains customer PII", # Optional ) ``` **Returns:** `Tag` ### tags.apply() Apply multiple tags to multiple objects. ```python result = client.tags.apply( asset="postgresql.analytics", tag_names=["pii", "gdpr"], # Required object_paths=["gold.customers", "gold.orders"], # Required category="governance", ) print(f"Applied: {result.applied}, Failed: {result.failed}") ``` **Returns:** `BulkApplyResult` --- ## client.intelligence Query the AI knowledge base about your data. ### intelligence.ask() Ask a question about an asset's data. 
```python answer = client.intelligence.ask( asset="postgresql.analytics", question="What tables contain customer data?", ) print(answer.answer) print(f"Confidence: {answer.confidence}") print(f"Sources: {answer.sources}") ``` **Returns:** `IntelligenceAnswer` **Raises:** `NotFoundError` if asset not found, `ValidationError` if intelligence not generated ### intelligence.generate() Generate AI intelligence for an asset (async job). ```python result = client.intelligence.generate( asset="postgresql.analytics", ) print(f"Job ID: {result['job_id']}") print(f"Status: {result['status']}") ``` **Returns:** `dict` with `job_id`, `status` > ℹ️ Requires asset discovery to be run first. Use the UI or API to discover schema before generating intelligence. --- ## client.jobs Monitor async job status. ### jobs.status() Get status of an async job. ```python status = client.jobs.status("job_abc123") print(f"Status: {status.status}") print(f"Progress: {status.progress}%") if status.error: print(f"Error: {status.error}") ``` **Returns:** `JobStatus` --- ## client.metrics Monitor data quality metrics like row counts, null percentages, and more. ### metrics.summary() Get metrics summary for an asset. ```python summary = client.metrics.summary("asset-uuid") print(f"Active: {summary.active_metrics}/{summary.total_metrics}") print(f"Health: {summary.health_percentage}%") ``` **Returns:** `MetricsSummary` ### metrics.list() List metrics for an asset. ```python metrics = client.metrics.list( "asset-uuid", metric_type="null_percent", # Optional filter is_active=True, # Optional filter limit=50, offset=0, ) for m in metrics: print(f"{m.table_path}.{m.column_name}: {m.metric_type}") ``` **Returns:** `list[MetricDefinition]` ### metrics.get() Get metric details with optional snapshots.
```python metric = client.metrics.get( "asset-uuid", "metric-uuid", include_snapshots=True, snapshot_limit=30, ) print(f"Type: {metric.metric_type}") print(f"Table: {metric.table_path}") ``` **Returns:** `MetricDefinition` ### metrics.create() Create a new metric. Requires `read-write` scope. ```python metric = client.metrics.create( "asset-uuid", metric_type="null_percent", table_path="catalog.schema.table", column_name="email", capture_interval="daily", ) print(f"Created: {metric.id}") ``` **Parameters:** - `asset_id` (str): Asset UUID - `metric_type` (str): `row_count`, `null_percent`, `distinct_count`, etc. - `table_path` (str): Full table path - `column_name` (str | None): Column name for column metrics - `capture_interval` (str): `hourly`, `daily`, `weekly` (default: `daily`) - `sensitivity` (float): Anomaly detection sensitivity (default: 1.0) **Returns:** `MetricDefinition` ### metrics.update() Update a metric. Requires `read-write` scope. ```python metric = client.metrics.update( "asset-uuid", "metric-uuid", is_active=False, sensitivity=2.0, ) ``` **Returns:** `MetricDefinition` ### metrics.delete() Delete a metric. Requires `read-write` scope. ```python client.metrics.delete("asset-uuid", "metric-uuid") ``` ### metrics.capture() Trigger an immediate metric capture. Requires `read-write` scope. ```python result = client.metrics.capture("asset-uuid", "metric-uuid") print(f"Captured {result.get('snapshot_count', 0)} snapshots") ``` **Returns:** `dict` with `snapshot_count` and `snapshots` ### metrics.snapshots() List historical snapshots for a metric. ```python snapshots = client.metrics.snapshots( "asset-uuid", "metric-uuid", limit=100, ) for s in snapshots: print(f"{s.captured_at}: {s.value}") ``` **Returns:** `list[MetricSnapshot]` --- ## client.validity Define and enforce data validity rules. ### validity.summary() Get validity summary for an asset. 
```python summary = client.validity.summary("asset-uuid") print(f"Total rules: {summary.total_rules}") print(f"Passing: {summary.passing}, Failing: {summary.failing}") ``` **Returns:** `ValiditySummary` ### validity.list() List validity rules for an asset. ```python rules = client.validity.list( "asset-uuid", rule_type="NOT_NULL", # Optional filter is_active=True, # Optional filter limit=50, ) for r in rules: print(f"{r.column_name}: {r.rule_type}") ``` **Returns:** `list[ValidityRule]` ### validity.get() Get validity rule details. ```python rule = client.validity.get("asset-uuid", "rule-uuid") print(f"Type: {rule.rule_type}") print(f"Severity: {rule.severity}") ``` **Returns:** `ValidityRule` ### validity.create() Create a new validity rule. Requires `read-write` scope. ```python # NOT_NULL rule rule = client.validity.create( "asset-uuid", rule_type="NOT_NULL", table_path="catalog.schema.table", column_name="email", severity="critical", ) # REGEX rule regex_rule = client.validity.create( "asset-uuid", rule_type="REGEX", table_path="catalog.schema.table", column_name="email", rule_config={"pattern": r"^[\w.-]+@[\w.-]+\.\w+$"}, ) ``` **Parameters:** - `asset_id` (str): Asset UUID - `rule_type` (str): `NOT_NULL`, `UNIQUE`, `REGEX`, `RANGE`, `ENUM`, etc. - `table_path` (str): Full table path - `column_name` (str | None): Column name - `rule_config` (dict | None): Rule-specific configuration - `severity` (str): `info`, `warning`, `critical` (default: `warning`) - `check_interval` (str): `hourly`, `daily`, `weekly` (default: `daily`) **Returns:** `ValidityRule` ### validity.update() Update a validity rule. Requires `read-write` scope. ```python rule = client.validity.update( "asset-uuid", "rule-uuid", severity="critical", is_active=True, ) ``` **Returns:** `ValidityRule` ### validity.delete() Delete a validity rule. Requires `read-write` scope. ```python client.validity.delete("asset-uuid", "rule-uuid") ``` ### validity.check() Trigger an immediate validity check. 
Requires `read-write` scope. ```python result = client.validity.check( "asset-uuid", "rule-uuid", sample_limit=20, ) if result.status == "fail": print(f"Invalid: {result.invalid_count} ({result.invalid_percent:.2f}%)") ``` **Returns:** `ValidityCheckResult` ### validity.results() List historical check results. ```python results = client.validity.results( "asset-uuid", "rule-uuid", limit=30, ) for r in results: print(f"{r.checked_at}: {r.status}") ``` **Returns:** `list[ValidityCheckResult]` --- ## client.referential Monitor referential integrity between tables. ### referential.summary() Get referential summary for an asset. ```python summary = client.referential.summary("asset-uuid") print(f"Total: {summary.total_checks}") print(f"Passing: {summary.passing_checks}, Failing: {summary.failing_checks}") ``` **Returns:** `ReferentialSummary` ### referential.list() List referential checks for an asset. ```python checks = client.referential.list( "asset-uuid", is_active=True, limit=50, ) for c in checks: print(f"{c.child_column_name} -> {c.parent_column_name}") ``` **Returns:** `list[ReferentialCheck]` ### referential.get() Get referential check details. ```python check = client.referential.get("asset-uuid", "check-uuid") print(f"FK: {check.child_table_path}.{check.child_column_name}") print(f"PK: {check.parent_table_path}.{check.parent_column_name}") ``` **Returns:** `ReferentialCheck` ### referential.create() Create a new referential check. Requires `read-write` scope. 
```python check = client.referential.create( "asset-uuid", child_table_path="catalog.schema.orders", child_column_name="customer_id", parent_table_path="catalog.schema.customers", parent_column_name="id", name="Orders -> Customers FK", max_orphan_count=0, ) ``` **Parameters:** - `asset_id` (str): Asset UUID - `child_table_path` (str): Child table path (contains FK) - `child_column_name` (str): FK column name - `parent_table_path` (str): Parent table path (contains PK) - `parent_column_name` (str): PK column name - `name` (str | None): Check name - `max_orphan_count` (int | None): Alert threshold for orphan count - `max_orphan_percent` (float | None): Alert threshold for orphan % **Returns:** `ReferentialCheck` ### referential.update() Update a referential check. Requires `read-write` scope. ```python check = client.referential.update( "asset-uuid", "check-uuid", max_orphan_percent=0.5, capture_interval="hourly", ) ``` **Returns:** `ReferentialCheck` ### referential.delete() Delete a referential check. Requires `read-write` scope. ```python client.referential.delete("asset-uuid", "check-uuid") ``` ### referential.execute() Execute a referential check immediately. Requires `read-write` scope. ```python result = client.referential.execute("asset-uuid", "check-uuid") if result.status == "fail": print(f"Orphans: {result.orphan_count} ({result.orphan_percent:.2f}%)") ``` **Returns:** `ReferentialCheckResult` ### referential.results() List historical check results. ```python results = client.referential.results( "asset-uuid", "check-uuid", limit=30, ) for r in results: print(f"{r.created_at}: {r.status} ({r.orphan_count} orphans)") ``` **Returns:** `list[ReferentialCheckResult]` --- ## client.alerts Query alert history. ### alerts.summary() Get alert summary statistics. 
```python summary = client.alerts.summary() print(f"Triggered: {summary.triggered}") print(f"Last 24h: {summary.triggered_last_24h}") ``` **Returns:** `AlertsSummary` ### alerts.list() List alerts with filters. ```python alerts = client.alerts.list( status="triggered", # "triggered", "acknowledged", "resolved" alert_type="freshness", # "freshness", "schema_change", "row_count" asset_id="snowflake.prod.warehouse.orders", limit=50, ) ``` **Returns:** `list[Alert]` ### alerts.rules() List configured alert rules. ```python rules = client.alerts.rules() for rule in rules: print(f"{rule.name}: {rule.alert_type}") ``` **Returns:** `list[AlertRule]` --- ## client.api_keys Manage API keys (requires `admin` scope). ### api_keys.list() List your organization's API keys. ```python keys = client.api_keys.list( include_revoked=False, limit=50, ) ``` **Returns:** `list[APIKey]` ### api_keys.create() Create a new API key. > ⚠️ The full key is only returned once. Store it securely! ```python new_key = client.api_keys.create( name="Airflow Production", scope="read-only", # "read-only", "read-write", "admin" ) print(f"Key: {new_key.key}") # Save this! ``` **Returns:** `CreatedAPIKey` (includes full key) ### api_keys.get() Get details of a specific key. ```python key = client.api_keys.get(key_id) print(f"{key.name}: {key.scope}") ``` **Returns:** `APIKey` (without full key) ### api_keys.revoke() Revoke an API key. This cannot be undone. ```python client.api_keys.revoke(key_id) ``` **Returns:** `APIKey` ### api_keys.usage() Get API key usage and limits.
```python usage = client.api_keys.usage() print(f"Keys: {usage['current_count']}/{usage['max_keys']}") print(f"Rate limit: {usage['rate_limit_per_min']}/min") ``` **Returns:** `dict` --- ## Models ### Asset ```python class Asset: id: str qualified_name: str name: str asset_type: str # "table", "view", "model" source: str database: str | None schema_name: str | None description: str | None row_count: int | None column_count: int | None tags: list[str] created_at: datetime updated_at: datetime ``` ### FreshnessStatus ```python class FreshnessStatus: asset_id: str qualified_name: str status: str # "fresh", "stale", "unknown" is_fresh: bool is_stale: bool last_updated: datetime | None hours_since_update: float | None staleness_threshold_hours: float | None checked_at: datetime ``` ### APIKey ```python class APIKey: id: str name: str key_prefix: str # e.g., "aa_live_abc1" key_suffix: str # e.g., "xy9z" scope: str created_at: datetime last_used_at: datetime | None revoked_at: datetime | None ``` ### CreatedAPIKey ```python class CreatedAPIKey: id: str name: str key: str # Full key - only shown once! 
scope: str created_at: datetime ``` ### Tag ```python class Tag: id: str name: str category: str # "business", "technical", "governance" object_path: str | None # e.g., "gold.customers" object_type: str | None # "table" or "column" description: str | None created_at: datetime | None ``` ### BulkApplyResult ```python class BulkApplyResult: applied: int # Number of tags successfully applied failed: int # Number that failed total: int # Total attempted ``` ### IntelligenceAnswer ```python class IntelligenceAnswer: answer: str # AI-generated answer confidence: str # "high", "medium", "low" sources: str # Data sources used ``` ### JobStatus ```python class JobStatus: job_id: str status: str # "pending", "running", "completed", "failed" progress: int # 0-100 error: str | None created_at: datetime completed_at: datetime | None ``` ### MetricsSummary ```python class MetricsSummary: total_metrics: int active_metrics: int total_checks: int passing: int failing: int warning: int error: int health_percentage: float ``` ### MetricDefinition ```python class MetricDefinition: id: str # Public UUID internal_id: int asset_id: int table_path: str column_name: str | None metric_type: str # row_count, null_percent, etc. capture_interval: str sensitivity: int is_active: bool created_at: datetime | None ``` ### MetricSnapshot ```python class MetricSnapshot: id: int metric_definition_id: int value: float captured_at: datetime is_anomaly: bool z_score: float | None status: str | None # PASS, FAIL, WARNING ``` ### ValiditySummary ```python class ValiditySummary: total_rules: int passing: int failing: int error: int ``` ### ValidityRule ```python class ValidityRule: id: int uuid: str table_path: str column_name: str | None rule_type: str # NOT_NULL, UNIQUE, REGEX, etc. 
rule_config: dict | None name: str | None severity: str # info, warning, critical is_active: bool check_interval: str ``` ### ValidityCheckResult ```python class ValidityCheckResult: id: int validity_rule_id: int status: str # pass, fail, error total_rows: int invalid_count: int invalid_percent: float invalid_samples: dict | None checked_at: datetime ``` ### ReferentialSummary ```python class ReferentialSummary: total_checks: int active_checks: int passing_checks: int failing_checks: int last_check_at: datetime | None ``` ### ReferentialCheck ```python class ReferentialCheck: id: str # Public UUID internal_id: int asset_id: int child_table_path: str child_column_name: str parent_table_path: str parent_column_name: str name: str | None capture_interval: str max_orphan_count: int | None max_orphan_percent: float | None is_active: bool ``` ### ReferentialCheckResult ```python class ReferentialCheckResult: id: int referential_check_id: int status: str # pass, fail, error orphan_count: int orphan_percent: float total_child_rows: int orphan_sample: list | None created_at: datetime ``` --- # Error Handling > Handle exceptions and debug SDK issues The SDK uses typed exceptions to communicate errors. All exceptions inherit from `ArmorError`. ## Exception Hierarchy ## Import Exceptions ```python from anomalyarmor.exceptions import ( ArmorError, # Base exception AuthenticationError, # Invalid/missing API key AuthorizationError, # Valid key, insufficient scope NotFoundError, # Resource not found ValidationError, # Invalid parameters RateLimitError, # Rate limit exceeded ServerError, # Server error StalenessError, # Data is stale ) ``` ### AuthenticationError Raised when authentication fails (401). 
```python try: client = Client(api_key="invalid_key") client.assets.list() except AuthenticationError as e: print("Invalid API key") ``` **Common causes:** - Invalid API key - Expired or revoked key - Missing `Authorization` header --- ### AuthorizationError Raised when authorization fails (403). The API key is valid but lacks permissions. ```python class AuthorizationError(ArmorError): required_scope: str | None # Scope needed for this action current_scope: str | None # Scope of your API key ``` **Example:** ```python try: # Trying to create a key with read-only scope client.api_keys.create(name="test", scope="admin") except AuthorizationError as e: print(f"Need {e.required_scope}, have {e.current_scope}") ``` **Common causes:** - Using `read-only` key for write operations - Using `read-write` key for admin operations --- ### NotFoundError Raised when a resource doesn't exist (404). ```python class NotFoundError(ArmorError): resource_type: str | None # e.g., "asset" resource_id: str | None # The ID that wasn't found ``` **Example:** ```python try: asset = client.assets.get("nonexistent.qualified.name") except NotFoundError as e: print(f"Asset not found: {e.resource_id}") ``` --- ### ValidationError Raised when request parameters are invalid (422). ```python class ValidationError(ArmorError): field_errors: dict[str, str] # Field-specific errors ``` **Example:** ```python try: client.api_keys.create(name="", scope="invalid") except ValidationError as e: print(f"Validation failed: {e.field_errors}") ``` --- ### RateLimitError Raised when rate limit is exceeded (429). ```python class RateLimitError(ArmorError): retry_after: int | None # Seconds to wait before retrying ``` **Example:** ```python import time try: assets = client.assets.list() except RateLimitError as e: if e.retry_after: print(f"Rate limited. Waiting {e.retry_after}s...") time.sleep(e.retry_after) # Retry ``` --- ### ServerError Raised for server-side errors (5xx). 
```python
class ServerError(ArmorError):
    status_code: int  # HTTP status code
```

**Example:**

```python
try:
    assets = client.assets.list()
except ServerError as e:
    print(f"Server error ({e.status_code}): {e.message}")
```

---

### StalenessError

Raised by `require_fresh()` when data is stale. This is a **data quality** exception, not an API error.

```python
class StalenessError(ArmorError):
    asset: str                 # Asset qualified name
    hours_since_update: float  # Hours since last update
    threshold_hours: float     # The threshold that was exceeded
```

**Example:**

```python
import sys

from anomalyarmor.exceptions import StalenessError

try:
    client.freshness.require_fresh("snowflake.prod.warehouse.orders")
except StalenessError as e:
    print(f"Asset {e.asset} is stale")
    print(f"Last update: {e.hours_since_update:.1f}h ago")
    print(f"Threshold: {e.threshold_hours:.1f}h")
    sys.exit(1)
```

---

## Best Practices

### Catch Specific Exceptions

```python
import logging
import time

from anomalyarmor.exceptions import (
    StalenessError,
    AuthenticationError,
    RateLimitError,
    ArmorError,
)

logger = logging.getLogger(__name__)

try:
    client.freshness.require_fresh(asset)
except StalenessError as e:
    # Data quality issue - fail the pipeline
    logger.error(f"Stale data: {e.asset}")
    raise
except AuthenticationError:
    # Configuration issue - alert on-call
    logger.critical("Invalid API key!")
    notify_oncall()  # placeholder for your own paging hook
    raise
except RateLimitError as e:
    # Transient - retry after waiting
    time.sleep(e.retry_after or 60)
    retry()  # placeholder for your own retry logic
except ArmorError as e:
    # Unexpected error - log and continue
    logger.warning(f"API error: {e}")
```

### Retry with Backoff

```python
import time

from anomalyarmor.exceptions import RateLimitError, ServerError

def with_retry(fn, max_retries=3):
    """Execute function with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError as e:
            if attempt == max_retries - 1:
                raise  # Out of attempts: surface the original error
            wait = e.retry_after or (2 ** attempt * 10)
            print(f"Rate limited, waiting {wait}s...")
            time.sleep(wait)
        except ServerError:
            if attempt == max_retries - 1:
                raise
            wait = 2 ** attempt * 5
            print(f"Server error, retrying in {wait}s...")
            time.sleep(wait)
    # Only reached when max_retries is 0 or negative
    raise RuntimeError("Max retries exceeded")

# Usage
assets = with_retry(lambda: client.assets.list())
```

### Pipeline Gate Pattern

```python
from anomalyarmor import Client
from anomalyarmor.exceptions import StalenessError, ArmorError
import sys

def check_freshness_gate(assets: list[str]) -> bool:
    """Gate pipeline on data freshness."""
    client = Client()
    stale = []

    for asset in assets:
        try:
            client.freshness.require_fresh(asset)
        except StalenessError:
            stale.append(asset)
        except ArmorError as e:
            print(f"Warning: Could not check {asset}: {e}")

    if stale:
        print(f"BLOCKED: {len(stale)} stale assets: {stale}")
        return False

    print("All assets fresh, proceeding...")
    return True

# In your pipeline
if not check_freshness_gate(["orders", "customers"]):
    sys.exit(1)
```

---

## Debugging

### Enable Request Logging

```python
import logging

from anomalyarmor import Client

# Enable debug logging for httpx
logging.basicConfig(level=logging.DEBUG)
logging.getLogger("httpx").setLevel(logging.DEBUG)

client = Client()
```

### Inspect Error Details

```python
from anomalyarmor.exceptions import ArmorError

try:
    client.assets.get("invalid")
except ArmorError as e:
    print(f"Message: {e.message}")
    print(f"Code: {e.code}")
    print(f"Details: {e.details}")

    # For ValidationError
    if hasattr(e, 'field_errors'):
        print(f"Field errors: {e.field_errors}")
```

### Check API Key Validity

```python
from anomalyarmor.exceptions import AuthenticationError

try:
    # Simple health check
    client.freshness.summary()
    print("API key is valid")
except AuthenticationError:
    print("API key is invalid or revoked")
```

## Common Questions

### Which exception should I catch to fail a pipeline on stale data?

Catch `StalenessError`, which is raised by `client.freshness.require_fresh(asset)` when the asset is past its freshness threshold. It carries `asset`, `hours_since_update`, and `threshold_hours` so you can log actionable context. In Airflow tasks, let it propagate so the task is marked failed cleanly.

### How do I distinguish a transient server error from a permanent one?
Catch `ServerError` (5xx) separately from `ArmorError` and retry with backoff; these are usually transient. `ValidationError` and `NotFoundError` are permanent for the given input, so retrying won't help. The "Retry with Backoff" example above shows the pattern for 429 + 5xx specifically. ### What's the difference between AuthenticationError and AuthorizationError? `AuthenticationError` (401) means the API key itself is missing, invalid, or revoked. `AuthorizationError` (403) means the key is valid but lacks the scope the endpoint requires. The latter exposes `required_scope` and `current_scope` attributes so you can point users at the right key to use. ### How do I debug an unexpected error from the SDK? Enable httpx debug logging (`logging.getLogger("httpx").setLevel(logging.DEBUG)`) to see the raw request and response, then inspect `e.message`, `e.code`, and `e.details` on the caught `ArmorError`. For `ValidationError`, `e.field_errors` points at the exact fields the API rejected. --- # TypeScript SDK > Install and use @anomalyarmor/sdk for TypeScript and JavaScript `@anomalyarmor/sdk` is the official TypeScript SDK for AnomalyArmor. Types are generated straight from our OpenAPI spec via `openapi-typescript`, so every endpoint is typed end-to-end. Runtime is a thin wrapper around `openapi-fetch` with Bearer auth and retry-on-429 middleware baked in. ## Install ```bash npm install @anomalyarmor/sdk # or: pnpm add @anomalyarmor/sdk / yarn add @anomalyarmor/sdk ``` Node 18+ is required. The package ships dual ESM + CJS builds and TypeScript types. ## Quickstart ```ts import { createAnomalyArmorClient } from '@anomalyarmor/sdk'; const client = createAnomalyArmorClient({ apiKey: process.env.ANOMALYARMOR_API_KEY!, // aa_live_... 
});

// Liveness probe: 200 means the key is valid
const health = await client.health.check();

// Aggregate counts of open alerts
const overview = await client.alerts.overview();
console.log(`${overview.unresolved_alerts} unresolved alerts`);

// Per-table freshness for an asset
const freshness = await client.freshness.check('asset-uuid-here');
```

Get an API key at [app.anomalyarmor.ai/settings/api-keys](https://app.anomalyarmor.ai/settings/api-keys). Keys start with `aa_live_`.

## Authentication

The SDK captures your API key at construction time and, via a small middleware, attaches it as a Bearer token in the `Authorization` header of every outgoing request.

**The SDK library itself never reads from `process.env`** - that would hide auth at a distance. Only the CLI shim (next section) reads `ANOMALYARMOR_API_KEY`, because CLIs traditionally do.

## CLI

The package ships a `bin` binary so you can smoke-check connectivity without writing code:

```bash
# With the env var set:
export ANOMALYARMOR_API_KEY=aa_live_...
npx anomalyarmor health

# Or pass the key directly:
npx anomalyarmor health --api-key aa_live_...
```

The CLI has exactly one command (`health`) by design - anything richer belongs in a real script calling the library API.

## Rate limiting & retries

The SDK automatically retries HTTP 429 responses, honoring `Retry-After`. Retries are bounded (default 3 attempts, each sleep capped at 60 s) and apply only to idempotent verbs (`GET`, `HEAD`, `PUT`, `DELETE`, `OPTIONS`), so a failed `POST` is never replayed and cannot duplicate a side effect.
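To make the retry rule concrete, here is a minimal sketch of the decision it describes, written in Python like the other SDK examples in this documentation. It is an illustration only, not the SDK's source: `retry_delay` and its parameters are hypothetical names, the budget is treated as a count of retries (an assumption), and the `Retry-After` value is assumed to be already parsed into seconds.

```python
# Idempotent verbs that the documented policy allows to be retried.
IDEMPOTENT_METHODS = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS"}


def retry_delay(method, status, retries_so_far, retry_after_seconds,
                max_retries=3, max_sleep_seconds=60):
    """Return how long to sleep before retrying, or None to give up.

    Hypothetical helper: mirrors the documented rule (429 only,
    idempotent verbs only, bounded attempts, capped sleep).
    """
    if status != 429:
        return None  # only rate-limit responses are retried
    if method.upper() not in IDEMPOTENT_METHODS:
        return None  # e.g. POST: never replay a possible side effect
    if retries_so_far >= max_retries:
        return None  # retry budget exhausted
    # Honor Retry-After, but never sleep longer than the cap.
    return min(retry_after_seconds, max_sleep_seconds)
```

With the defaults, a `GET` that receives a 429 with `Retry-After: 300` would wait 60 seconds, while a `POST` receiving the same response would not be retried at all.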
Tune or disable: ```ts const client = createAnomalyArmorClient({ apiKey: '...', maxRetries: 5, // default 3 maxRetrySleepSeconds: 120, // default 60 }); // Opt out of retries entirely (mutation-heavy callers who want explicit control): const client = createAnomalyArmorClient({ apiKey: '...', maxRetries: 0 }); ``` ## How requests work Every typed SDK call flows through six layers: the ergonomic helper → `openapi-fetch` → the middleware stack → platform `fetch` → parse / unwrap → a statically-typed response. ## Error handling Ergonomic methods throw `AnomalyArmorApiError` on 4xx / 5xx responses: ```ts import { AnomalyArmorApiError } from '@anomalyarmor/sdk'; try { await client.alerts.get('does-not-exist'); } catch (err) { if (err instanceof AnomalyArmorApiError) { console.error(`HTTP ${err.status}: ${err.message}`); console.error('Full body:', err.body); } else { throw err; // network / parse / programmer errors } } ``` ## Drop down to the raw client The ergonomic surface covers the endpoints customers use most often. For anything else, `client.raw` exposes the full typed `openapi-fetch` client: ```ts const { data, error } = await client.raw.GET('/api/v1/assets/{asset_id}', { params: { path: { asset_id: 'my-asset' } }, }); if (error) throw new Error(`asset lookup failed: ${error.message}`); // data is typed from the OpenAPI spec ``` ## Configuration reference | Option | Default | Purpose | |---|---|---| | `apiKey` | *(required)* | Your `aa_live_*` Bearer token. | | `baseUrl` | `https://app.anomalyarmor.ai` | Override for staging or a local backend. | | `maxRetries` | `3` | 429-retry budget. Set to `0` to disable. | | `maxRetrySleepSeconds` | `60` | Cap on any single `Retry-After` sleep. | | `fetch` | `globalThis.fetch` | Injectable `fetch` for tests / alternate runtimes. 
| ## Next steps - **[API reference](/sdk/javascript-reference)**: Full typed reference for every SDK method - **[Authentication](/api/authentication)**: How API keys work across the platform - **[Python SDK](/sdk/overview)**: Same API surface in Python - **[CLI reference](/cli/overview)**: Installable CLI for interactive use ## Common Questions ### What Node version does the SDK support? Node 18 or higher, so `globalThis.fetch` is available natively. The package ships dual ESM + CJS builds and its own TypeScript types, so it drops into Next.js, Vite, and plain Node projects without extra polyfills. ### Does the SDK retry failed requests automatically? Yes, but only for idempotent verbs (`GET`, `HEAD`, `PUT`, `DELETE`, `OPTIONS`) and only on HTTP 429. It honors the `Retry-After` header with a 3-attempt default, each sleep capped at 60 seconds. Tune with `maxRetries` / `maxRetrySleepSeconds`, or pass `maxRetries: 0` to opt out when you want explicit control of mutation retries. ### How do I call an endpoint that isn't in client.alerts / client.freshness / client.schema? Drop down to `client.raw`, which is a fully typed `openapi-fetch` client covering the whole OpenAPI surface. You get the same Bearer auth and retry middleware, plus path/query/body types generated from the spec. Example: `client.raw.GET('/api/v1/assets/{asset_id}', { params: { path: { asset_id } } })`. ### How do I point the SDK at a local backend or staging environment? Pass `baseUrl` at construction time: `createAnomalyArmorClient({ apiKey, baseUrl: 'http://localhost:8000' })`. The default is `https://app.anomalyarmor.ai`. For testing, you can also inject a custom `fetch` via the `fetch` option to stub responses without going over the network. ### Why does the SDK library ignore ANOMALYARMOR_API_KEY? Intentional: library code must receive `apiKey` explicitly so authentication isn't hidden at a distance, which makes multi-tenant and per-request key rotation safe. 
Only the `npx anomalyarmor` CLI shim reads `ANOMALYARMOR_API_KEY`, matching standard CLI conventions. --- # TypeScript SDK reference > Full API reference for @anomalyarmor/sdk The SDK's full typed surface lives on npm as `@anomalyarmor/sdk` and in-source as typedoc-generated reference pages. ## Live reference The canonical reference is generated from TypeScript source on every release and hosted on GitHub Pages: ## Ergonomic surface If you just want a quick lookup table, here are the hand-written wrappers. For anything not listed, drop down to `client.raw` and use the generated OpenAPI `paths` type directly - see [How requests work](/sdk/javascript#how-requests-work). ### `client.health` | Method | Backing endpoint | |---|---| | `check()` | `GET /api/v1/health` | ### `client.alerts` | Method | Backing endpoint | |---|---| | `overview()` | `GET /api/v1/alerts/overview` | | `history(params?)` | `GET /api/v1/alerts/history` | | `inbox()` | `GET /api/v1/alerts/inbox` | | `get(alertId)` | `GET /api/v1/alerts/{alert_id}` | | `acknowledge(alertId, notes?)` | `POST /api/v1/alerts/{alert_id}/acknowledge` | | `resolve(alertId, notes?)` | `POST /api/v1/alerts/{alert_id}/resolve` | ### `client.freshness` | Method | Backing endpoint | |---|---| | `check(assetId)` | `GET /api/v1/assets/{asset_id}/freshness` | | `getTable(assetId, tablePath)` | `GET /api/v1/assets/{asset_id}/freshness/{table_path}` | | `checkTable(assetId, tablePath)` | `POST /api/v1/assets/{asset_id}/freshness/{table_path}/check` | ### `client.schema` | Method | Backing endpoint | |---|---| | `listChanges(assetId)` | `GET /api/v1/schema-drift/assets/{asset_id}/changes` | | `getChange(assetId, changeId)` | `GET /api/v1/schema-drift/assets/{asset_id}/changes/{change_id}` | | `baselineStatus(assetId)` | `GET /api/v1/schema-drift/assets/{asset_id}/baseline-status` | | `detectChanges(assetId)` | `POST /api/v1/schema-drift/assets/{asset_id}/detect-changes` | ## Exported types ```ts import type { AnomalyArmorClient, 
AnomalyArmorClientOptions, AlertsResource, FreshnessResource, SchemaResource, HealthResource, // Raw OpenAPI types for advanced callers: paths, components, operations, } from '@anomalyarmor/sdk'; import { createAnomalyArmorClient, AnomalyArmorApiError, DEFAULT_BASE_URL } from '@anomalyarmor/sdk'; ``` ## Source The SDK source lives in the [AnomalyArmor core monorepo](https://github.com/anomalyarmor/core/tree/main/sdk/javascript). Bug reports, feature requests, and PRs welcome at [github.com/anomalyarmor/core/issues](https://github.com/anomalyarmor/core/issues). --- # CLI Overview > Install and use the armor CLI for scripting, automation, and CI/CD integration Stop bad data before it reaches production. The `armor` CLI lets you gate deployments on data freshness, check quality in CI/CD pipelines, and automate monitoring setup. ```bash # Gate your pipeline on data freshness armor freshness check snowflake.prod.warehouse.orders || exit 1 ``` ## Installation ```bash pip install anomalyarmor-cli ``` This installs both the Python SDK and the `armor` CLI command. ## Authentication ### Login with API Key ```bash # Interactive prompt armor auth login # Non-interactive (for CI/CD) armor auth login --key aa_live_your_key_here ``` Credentials are stored in `~/.armor/config.yaml`. 
### Check Auth Status ```bash armor auth status ``` Output: ``` Authenticated as: your-email@company.com API Key: aa_live_k8jd...9z4f Organization: Acme Corp ``` ### Logout ```bash armor auth logout ``` ## Commands ### Assets ```bash # List all assets armor assets list # Filter by source armor assets list --source snowflake # Filter by type armor assets list --type table # Get single asset armor assets get snowflake.prod.warehouse.orders ``` ### Freshness ```bash # Get freshness summary armor freshness summary # List all freshness statuses armor freshness list # Get freshness for specific asset armor freshness get snowflake.prod.warehouse.orders # Check freshness (exit 0 if fresh, exit 1 if stale) armor freshness check snowflake.prod.warehouse.orders # Trigger refresh armor freshness refresh snowflake.prod.warehouse.orders # Trigger and wait for completion armor freshness refresh snowflake.prod.warehouse.orders --wait ``` ### Schema ```bash # Get schema drift summary armor schema summary # List recent schema changes armor schema changes # Trigger schema check armor schema refresh snowflake.prod.warehouse.orders --wait ``` ### Lineage ```bash # List assets with lineage armor lineage list # Get lineage for asset armor lineage get snowflake.prod.warehouse.orders # Get only upstream armor lineage get snowflake.prod.warehouse.orders --direction upstream # Get only downstream armor lineage get snowflake.prod.warehouse.orders --direction downstream ``` ### Alerts ```bash # Get alerts summary armor alerts summary # List all alerts armor alerts list # Filter by status armor alerts list --status triggered # Filter by asset armor alerts list --asset snowflake.prod.warehouse.orders ``` ### Metrics ```bash # Get metrics summary for an asset armor metrics summary # List all metrics for an asset armor metrics list # Filter by type armor metrics list --type null_percent # Get specific metric details armor metrics get # Create a row count metric armor metrics create --type 
row_count --table snowflake.prod.warehouse.orders # Create a null percentage metric armor metrics create --type null_percent --table snowflake.prod.warehouse.orders --column email # Trigger manual capture armor metrics capture # Delete a metric armor metrics delete --yes ``` ### API Keys (Admin) ```bash # List your API keys armor api-keys list # Create new key armor api-keys create --name "airflow-prod" --scope read-only # Revoke key armor api-keys revoke ``` ## Exit Codes | Code | Meaning | |------|---------| | 0 | Success | | 1 | Check failed (e.g., data is stale) | | 2 | Authentication error | | 3 | Resource not found | | 4 | Rate limited | ### Using in Scripts ```bash #!/bin/bash # Gate pipeline on freshness if armor freshness check snowflake.prod.warehouse.orders; then echo "Data is fresh, running pipeline..." dbt run else echo "Data is stale, aborting" exit 1 fi ``` ## Output Formats ### Table (Default) ```bash armor assets list ``` ``` QUALIFIED NAME TYPE SOURCE ROWS snowflake.prod.warehouse.orders table snowflake 1.5M snowflake.prod.warehouse.customers table snowflake 250K snowflake.prod.warehouse.products table snowflake 10K ``` ### JSON ```bash armor assets list --format json ``` ```json [ { "qualified_name": "snowflake.prod.warehouse.orders", "asset_type": "table", "source": "snowflake", "row_count": 1500000 } ] ``` ## Configuration File The CLI stores configuration in `~/.armor/config.yaml`: ```yaml api_key: aa_live_your_key_here api_url: https://api.anomalyarmor.ai ``` ### Environment Variables Override config file with environment variables: ```bash export ARMOR_API_KEY="aa_live_xxx" export ARMOR_API_URL="https://api.anomalyarmor.ai" ``` ## CI/CD Examples ### GitHub Actions ```yaml - name: Check data freshness env: ARMOR_API_KEY: ${{ secrets.ARMOR_API_KEY }} run: | pip install anomalyarmor-cli armor freshness check snowflake.prod.warehouse.orders ``` ### GitLab CI ```yaml check_freshness: script: - pip install anomalyarmor-cli - armor freshness check 
snowflake.prod.warehouse.orders variables: ARMOR_API_KEY: $ARMOR_API_KEY ``` ## Next Steps - **[CLI Reference](/cli/reference)**: Complete command reference - **[Python SDK](/sdk/overview)**: Use programmatically in Python - **[Airflow Integration](/integrations/airflow)**: Use in Airflow DAGs - **[API Reference](/api/overview)**: REST API documentation ## Common Questions ### How do I use the CLI in GitHub Actions or GitLab CI? Install it with `pip install anomalyarmor-cli`, set `ARMOR_API_KEY` from your secrets store, and run `armor freshness check `. The non-zero exit code on stale data fails the step cleanly. The "CI/CD Examples" section above has copy-paste snippets for both runners. ### What does each exit code mean in scripts? `0` is success, `1` is a check failure (stale data, validity fail), `2` is auth, `3` is not-found, and `4` is rate-limited. Write shell guards against the specific codes you care about (`if [ $? -eq 1 ]; then ...`) rather than treating any non-zero as the same error. ### How do I get machine-readable output from the CLI? Pass `--format json` on any list or get command. The output is a raw JSON array (or object) suitable for piping into `jq`, other CLIs, or uploading as a CI artifact. The default table format is optimized for terminal reading only. ### Where does the CLI store credentials and how do I override them? `armor auth login` writes `api_key` and `api_url` to `~/.armor/config.yaml`. The `ARMOR_API_KEY` and `ARMOR_API_URL` environment variables override the config file, which is what you want in CI where you shouldn't write secrets to disk. ### Can I use the CLI non-interactively for CI? Yes. `armor auth login --key aa_live_...` skips the interactive prompt, and every command reads `ARMOR_API_KEY` directly if you prefer not to persist a config file at all. Most CI pipelines just set the env var and run commands without ever calling `auth login`. 
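When a pipeline is orchestrated from Python rather than shell, the exit codes documented on this page can be handled in a small wrapper. The sketch below is an illustration, not part of the CLI or SDK: `EXIT_CODE_MEANINGS`, `describe_exit_code`, and `freshness_gate` are hypothetical names, and the `runner` parameter exists only so the logic can be exercised without the CLI installed.

```python
import subprocess

# Exit codes as listed in the "Exit Codes" table on this page.
EXIT_CODE_MEANINGS = {
    0: "success",
    1: "check failed",
    2: "authentication error",
    3: "resource not found",
    4: "rate limited",
}


def describe_exit_code(code: int) -> str:
    """Translate an armor exit code into a short description."""
    return EXIT_CODE_MEANINGS.get(code, f"unrecognized exit code {code}")


def freshness_gate(asset: str, runner=subprocess.run) -> bool:
    """Return True if the asset is fresh, False if it is stale.

    Any other exit code (auth, not found, rate limit) raises, so an
    operational problem is not mistaken for stale data.
    """
    result = runner(["armor", "freshness", "check", asset])
    if result.returncode == 0:
        return True
    if result.returncode == 1:
        return False
    raise RuntimeError(
        f"armor freshness check {asset}: {describe_exit_code(result.returncode)}"
    )
```

In CI you would set `ARMOR_API_KEY` in the environment, call `freshness_gate("snowflake.prod.warehouse.orders")`, and exit non-zero when it returns `False`.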
---

# CLI Reference

> Complete command reference for the armor CLI

Complete reference for all `armor` CLI commands.

## Exit Codes

| Code | Meaning |
|------|---------|
| `0` | Success |
| `1` | Staleness/check failed |
| `2` | Authentication error |
| `3` | Resource not found |
| `4` | Rate limited |
| `5` | General error |

## armor assets

Asset management commands.

### armor assets list

List assets with optional filters.

```bash
armor assets list [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--source, -s` | Filter by source type (e.g., `snowflake`, `postgresql`) |
| `--type, -t` | Filter by asset type (e.g., `table`, `view`) |
| `--search` | Search in asset names |
| `--limit, -l` | Max results (default: 50) |

**Example:**

```bash
$ armor assets list --source snowflake --type table

Assets
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━┓
┃ Qualified Name                  ┃ Type  ┃ Source    ┃ Active ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━┩
│ snowflake.prod.warehouse.orders │ table │ snowflake │ Yes    │
│ snowflake.prod.warehouse.users  │ table │ snowflake │ Yes    │
└─────────────────────────────────┴───────┴───────────┴────────┘

Showing 2 assets
```

### armor assets get

Get asset details.

```bash
armor assets get
```

| Argument | Description |
|----------|-------------|
| `asset_id` | Asset ID or qualified name |

**Example:**

```bash
$ armor assets get snowflake.prod.warehouse.orders

Asset: snowflake.prod.warehouse.orders
ID: 550e8400-e29b-41d4-a716-446655440000
Type: table
Source: snowflake
Active: Yes
Description: Customer order transactions
```

---

## armor freshness

Freshness monitoring commands.

### armor freshness summary

Get freshness summary.

```bash
armor freshness summary
```

**Example:**

```bash
$ armor freshness summary

Freshness Summary
Total Assets: 245
Fresh: 230
Stale: 10
Unknown: 5
Freshness Rate: 93.9%
```

### armor freshness list

List freshness status for all assets.

```bash
armor freshness list [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--status, -s` | Filter by status (`fresh`, `stale`, `unknown`) |
| `--limit, -l` | Max results (default: 50) |

**Example:**

```bash
$ armor freshness list --status stale

Freshness Status
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Asset                           ┃ Status ┃ Hours Since Update ┃ Threshold ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ snowflake.prod.warehouse.orders │ Stale  │ 26.5               │ 24h       │
└─────────────────────────────────┴────────┴────────────────────┴───────────┘
```

### armor freshness get

Check freshness for a specific asset.

```bash
armor freshness get
```

**Example:**

```bash
$ armor freshness get snowflake.prod.warehouse.orders

Asset: snowflake.prod.warehouse.orders
Status: Fresh
Last Update: 2024-12-04T10:30:00Z
Hours Since Update: 2.5
Threshold: 24h
```

### armor freshness check

Check if an asset is fresh, fail if stale. **Ideal for CI/CD pipelines.**

```bash
armor freshness check [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--max-age, -m` | Max acceptable age in hours (optional) |

**Exit codes:**

- `0` - Data is fresh
- `1` - Data is stale
- `3` - Asset not found

**Example (fresh):**

```bash
$ armor freshness check snowflake.prod.warehouse.orders
FRESH: snowflake.prod.warehouse.orders
Hours since update: 2.5h
$ echo $?
0
```

**Example (stale):**

```bash
$ armor freshness check snowflake.prod.warehouse.orders
STALE: Asset 'snowflake.prod.warehouse.orders' is stale: 26.5h since last update (threshold: 24.0h)
Hours since update: 26.5h
Threshold: 24.0h
$ echo $?
1
```

**Shell script usage:**

```bash
#!/bin/bash
if armor freshness check snowflake.prod.warehouse.orders; then
  echo "Data is fresh, running pipeline..."
  dbt run
else
  echo "Data is stale, aborting"
  exit 1
fi
```

### armor freshness refresh

Trigger a freshness check for an asset.

```bash
armor freshness refresh
```

> ℹ️ Requires an API key with `read-write` or `admin` scope.

**Example:**

```bash
$ armor freshness refresh snowflake.prod.warehouse.orders
Refresh initiated
Job ID: job_abc123
Status: queued
```

---

## armor schema

Schema drift monitoring commands.

### armor schema summary

Get schema changes summary.

```bash
armor schema summary
```

**Example:**

```bash
$ armor schema summary

Schema Changes Summary
Total Changes: 23
Unacknowledged: 5
Critical: 2
Warning: 8
Info: 13
```

### armor schema changes

List schema changes.

```bash
armor schema changes [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--asset, -a` | Filter by asset |
| `--type, -t` | Filter by change type (`column_added`, `column_removed`, `type_changed`, etc.) |
| `--breaking` | Only show breaking changes (removed columns/tables, type changes) |
| `--unacknowledged, -u` | Only show unacknowledged changes |
| `--limit, -l` | Max results (default: 50) |

**Example:**

```bash
$ armor schema changes --breaking

Schema Changes
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━┓
┃ Asset                           ┃ Change         ┃ Column  ┃ Ack ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━┩
│ snowflake.prod.warehouse.orders │ column_removed │ user_id │ No  │
└─────────────────────────────────┴────────────────┴─────────┴─────┘
```

---

## armor alerts

Alert management commands.

### armor alerts summary

Get alerts summary.

```bash
armor alerts summary
```

**Example:**

```bash
$ armor alerts summary

Alerts Summary
Total Rules: 15
Active Rules: 12
Recent Alerts: 45
Unresolved: 3
```

### armor alerts list

List alerts with filters.

```bash
armor alerts list [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--status, -s` | Filter by status (`triggered`, `acknowledged`, `resolved`) |
| `--type, -t` | Filter by alert type (`schema`, `freshness`, `discovery`) |
| `--limit, -l` | Max results (default: 50) |

**Example:**

```bash
$ armor alerts list --status triggered

Alerts
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Asset                           ┃ Message                ┃ Type      ┃ Status    ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━┩
│ snowflake.prod.warehouse.orders │ Data is 26 hours stale │ freshness │ Triggered │
└─────────────────────────────────┴────────────────────────┴───────────┴───────────┘
```

---

## armor api-keys

API key management commands.
Requires `admin` scope.

### armor api-keys list

List API keys.

```bash
armor api-keys list [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--include-revoked` | Include revoked keys |

**Example:**

```bash
$ armor api-keys list
API Keys
┏━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━┓
┃ Name               ┃ Key              ┃ Scope      ┃ Active ┃ Last Used  ┃
┡━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━┩
│ Airflow Production │ aa_live_abc...z4 │ read-only  │ Yes    │ 2024-12-04 │
│ CI Pipeline        │ aa_live_def...y8 │ read-write │ Yes    │ 2024-12-03 │
└────────────────────┴──────────────────┴────────────┴────────┴────────────┘
```

### armor api-keys create

Create a new API key.

```bash
armor api-keys create --name [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--name, -n` | Human-readable name for the key |
| `--scope, -s` | Permission scope (default: `read-only`) |

**Scopes:**

- `read-only` - Read assets, freshness, lineage, alerts
- `read-write` - Read + trigger refreshes
- `admin` - Full access including key management

**Example:**

```bash
$ armor api-keys create --name "Airflow Production" --scope read-only
API key created successfully!
Key: aa_live_k8jd92hf8j2hd98fh2d9h2f98h2d9fh2
IMPORTANT: This key will only be shown once! Store it securely.
```

### armor api-keys revoke

Revoke an API key.

```bash
armor api-keys revoke [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--yes, -y` | Skip confirmation prompt |

**Example:**

```bash
$ armor api-keys revoke 550e8400-e29b-41d4-a716-446655440000
Are you sure you want to revoke this key? [y/N]: y
API key revoked successfully.
```

---

## armor tags

Tag management commands.

### armor tags list

List tags for an asset.
```bash
armor tags list --asset [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--asset, -a` | Asset ID or qualified name (required) |
| `--category, -c` | Filter by category |

**Example:**

```bash
$ armor tags list --asset postgresql.analytics
Tags
┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Name           ┃ Category   ┃ Object Path         ┃ Type       ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ pii_data       │ governance │ gold.customers      │ table      │
│ financial      │ business   │ gold.orders         │ table      │
└────────────────┴────────────┴─────────────────────┴────────────┘
```

### armor tags create

Create a tag on a database object.

```bash
armor tags create --asset --path [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--asset, -a` | Asset ID or qualified name (required) |
| `--path, -p` | Object path, e.g., `schema.table` (required) |
| `--category, -c` | Category: `business`, `technical`, `governance` (default: `business`) |
| `--description, -d` | Tag description |

**Example:**

```bash
$ armor tags create pii_data --asset postgresql.analytics --path gold.customers --category governance
Tag created: pii_data
Category: governance
Path: gold.customers
```

### armor tags apply

Apply multiple tags to multiple objects.

```bash
armor tags apply --asset --paths [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--asset, -a` | Asset ID or qualified name (required) |
| `--paths, -p` | Comma-separated object paths (required) |
| `--category, -c` | Category for new tags (default: `business`) |

**Example:**

```bash
$ armor tags apply "pii,gdpr" --asset postgresql.analytics --paths "gold.customers,gold.orders"
Applied 4 tags
```

---

## armor intelligence

AI intelligence commands.

### armor intelligence ask

Ask a question about your data.
```bash
armor intelligence ask
```

**Example:**

```bash
$ armor intelligence ask postgresql.analytics "What tables contain customer data?"
Question: What tables contain customer data?
Answer: The following tables contain customer data:
- gold.customers: Primary customer dimension table
- gold.orders: Contains customer_id foreign key
Confidence: high
Sources: Asset Discovery, Intelligence KB
```

### armor intelligence generate

Generate AI intelligence for an asset.

```bash
armor intelligence generate
```

> ℹ️ Requires asset discovery to be run first.

**Example:**

```bash
$ armor intelligence generate postgresql.analytics
Intelligence generation started
Job ID: job_abc123
Status: queued
```

---

## armor jobs

Job monitoring commands.

### armor jobs status

Get status of an async job.

```bash
armor jobs status
```

**Example:**

```bash
$ armor jobs status job_abc123
Job: job_abc123
Status: completed
Progress: 100%
Started: 2024-12-21T10:30:00Z
Completed: 2024-12-21T10:32:15Z
```

---

## armor metrics

Data quality metrics commands.

### armor metrics summary

Get metrics summary for an asset.

```bash
armor metrics summary
```

**Example:**

```bash
$ armor metrics summary 550e8400-e29b-41d4-a716-446655440000
Metrics Summary
Total Metrics: 15
Active Metrics: 12
Health: 87.5%
Passing: 10
Failing: 2
```

### armor metrics list

List metrics for an asset.
```bash
armor metrics list [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--type, -t` | Filter by metric type (e.g., `row_count`, `null_percent`) |
| `--active` | Only show active metrics |
| `--limit, -l` | Max results (default: 50) |

**Example:**

```bash
$ armor metrics list 550e8400-e29b-41d4-a716-446655440000 --type null_percent
Metrics
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Table                             ┃ Column       ┃ Type          ┃ Active ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ snowflake.prod.warehouse.orders   │ customer_id  │ null_percent  │ Yes    │
│ snowflake.prod.warehouse.orders   │ email        │ null_percent  │ Yes    │
└───────────────────────────────────┴──────────────┴───────────────┴────────┘
```

### armor metrics get

Get metric details.

```bash
armor metrics get
```

**Example:**

```bash
$ armor metrics get 550e8400-e29b-41d4-a716-446655440000 m_abc123
Metric: m_abc123
Type: null_percent
Table: snowflake.prod.warehouse.orders
Column: customer_email
Interval: daily
Active: Yes
```

### armor metrics create

Create a new metric. Requires `read-write` scope.

```bash
armor metrics create --type --table [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--type, -t` | Metric type (required): `row_count`, `null_percent`, etc. |
| `--table` | Full table path (required) |
| `--column` | Column name (for column metrics) |
| `--interval` | Capture interval: `hourly`, `daily`, `weekly` (default: `daily`) |

**Example:**

```bash
$ armor metrics create 550e8400-e29b-41d4-a716-446655440000 \
    --type null_percent \
    --table snowflake.prod.warehouse.orders \
    --column customer_email
Metric created: m_abc123
Type: null_percent
Table: snowflake.prod.warehouse.orders
Column: customer_email
```

### armor metrics capture

Trigger an immediate metric capture.
Requires `read-write` scope.

```bash
armor metrics capture
```

**Example:**

```bash
$ armor metrics capture 550e8400-e29b-41d4-a716-446655440000 m_abc123
Captured 1 snapshot
Value: 2.5
Anomaly: No
```

### armor metrics delete

Delete a metric. Requires `read-write` scope.

```bash
armor metrics delete [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--yes, -y` | Skip confirmation prompt |

---

## armor validity

Data validity rules commands.

### armor validity summary

Get validity summary for an asset.

```bash
armor validity summary
```

**Example:**

```bash
$ armor validity summary 550e8400-e29b-41d4-a716-446655440000
Validity Summary
Total Rules: 12
Passing: 10
Failing: 1
Error: 1
```

### armor validity list

List validity rules for an asset.

```bash
armor validity list [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--type, -t` | Filter by rule type (e.g., `NOT_NULL`, `REGEX`) |
| `--active` | Only show active rules |
| `--limit, -l` | Max results (default: 50) |

**Example:**

```bash
$ armor validity list 550e8400-e29b-41d4-a716-446655440000 --type NOT_NULL
Validity Rules
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Table                             ┃ Column        ┃ Type     ┃ Severity  ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━┩
│ snowflake.prod.warehouse.orders   │ customer_id   │ NOT_NULL │ critical  │
│ snowflake.prod.warehouse.orders   │ email         │ NOT_NULL │ warning   │
└───────────────────────────────────┴───────────────┴──────────┴───────────┘
```

### armor validity get

Get validity rule details.
```bash
armor validity get
```

**Example:**

```bash
$ armor validity get 550e8400-e29b-41d4-a716-446655440000 v_abc123
Rule: v_abc123
Name: Customer Email Required
Type: NOT_NULL
Table: snowflake.prod.warehouse.orders
Column: customer_email
Severity: critical
Active: Yes
```

### armor validity create

Create a new validity rule. Requires `read-write` scope.

```bash
armor validity create --type --table [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--type, -t` | Rule type (required): `NOT_NULL`, `UNIQUE`, `REGEX`, etc. |
| `--table` | Full table path (required) |
| `--column` | Column name |
| `--config` | Rule configuration (JSON string) |
| `--severity` | Severity: `info`, `warning`, `critical` (default: `warning`) |

**Examples:**

```bash
# NOT NULL rule
$ armor validity create 550e8400-e29b-41d4-a716-446655440000 \
    --type NOT_NULL \
    --table snowflake.prod.warehouse.orders \
    --column customer_email \
    --severity critical
Rule created: v_abc123
Type: NOT_NULL

# REGEX rule
$ armor validity create 550e8400-e29b-41d4-a716-446655440000 \
    --type REGEX \
    --table snowflake.prod.warehouse.orders \
    --column customer_email \
    --config '{"pattern": "^[\\w.-]+@[\\w.-]+\\.\\w+$"}'
Rule created: v_def456
Type: REGEX
```

### armor validity check

Trigger an immediate validity check. Requires `read-write` scope.

```bash
armor validity check
```

**Exit codes:**

- `0` - Validation passed
- `1` - Validation failed (invalid records found)
- `3` - Rule not found

**Example (pass):**

```bash
$ armor validity check 550e8400-e29b-41d4-a716-446655440000 v_abc123
PASS: v_abc123 (NOT_NULL on customer_email)
Total Rows: 10000
Invalid: 0 (0.00%)
$ echo $?
0
```

**Example (fail):**

```bash
$ armor validity check 550e8400-e29b-41d4-a716-446655440000 v_abc123
FAIL: v_abc123 (NOT_NULL on customer_email)
Total Rows: 10000
Invalid: 25 (0.25%)
Sample invalid values:
- row 1001: null
- row 1042: null
$ echo $?
1
```

### armor validity delete

Delete a validity rule.
Requires `read-write` scope.

```bash
armor validity delete [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--yes, -y` | Skip confirmation prompt |

---

## armor referential

Referential integrity check commands.

### armor referential summary

Get referential summary for an asset.

```bash
armor referential summary
```

**Example:**

```bash
$ armor referential summary 550e8400-e29b-41d4-a716-446655440000
Referential Summary
Total Checks: 8
Active: 7
Passing: 6
Failing: 1
```

### armor referential list

List referential checks for an asset.

```bash
armor referential list [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--active` | Only show active checks |
| `--limit, -l` | Max results (default: 50) |

**Example:**

```bash
$ armor referential list 550e8400-e29b-41d4-a716-446655440000
Referential Checks
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Child                           ┃ Parent                           ┃ Status ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ orders.customer_id              │ customers.id                     │ Pass   │
│ payments.order_id               │ orders.id                        │ Fail   │
└─────────────────────────────────┴──────────────────────────────────┴────────┘
```

### armor referential get

Get referential check details.

```bash
armor referential get
```

**Example:**

```bash
$ armor referential get 550e8400-e29b-41d4-a716-446655440000 r_abc123
Check: r_abc123
Name: Orders -> Customers FK
FK: orders.customer_id
PK: customers.id
Interval: daily
Active: Yes
```

### armor referential create

Create a new referential check. Requires `read-write` scope.
```bash armor referential create [OPTIONS] ``` | Option | Description | |--------|-------------| | `--child-table` | Child table path (required) | | `--child-column` | Child column name (required) | | `--parent-table` | Parent table path (required) | | `--parent-column` | Parent column name (required) | | `--name` | Human-readable name | | `--max-orphans` | Max orphan count threshold | | `--max-orphan-percent` | Max orphan percentage threshold | **Example:** ```bash $ armor referential create 550e8400-e29b-41d4-a716-446655440000 \ --child-table snowflake.prod.warehouse.orders \ --child-column customer_id \ --parent-table snowflake.prod.warehouse.customers \ --parent-column id \ --name "Orders -> Customers FK" Check created: r_abc123 FK: orders.customer_id -> customers.id ``` ### armor referential execute Execute a referential check. Requires `read-write` scope. ```bash armor referential execute ``` **Exit codes:** - `0` - Check passed (no orphans or within threshold) - `1` - Check failed (orphans exceed threshold) - `3` - Check not found **Example (pass):** ```bash $ armor referential execute 550e8400-e29b-41d4-a716-446655440000 r_abc123 PASS: r_abc123 (orders.customer_id -> customers.id) Total Child Rows: 10000 Orphans: 0 (0.00%) $ echo $? 0 ``` **Example (fail):** ```bash $ armor referential execute 550e8400-e29b-41d4-a716-446655440000 r_abc123 FAIL: r_abc123 (orders.customer_id -> customers.id) Total Child Rows: 10000 Orphans: 5 (0.05%) Sample orphan values: - customer_id: 99999 - customer_id: 99998 $ echo $? 1 ``` ### armor referential delete Delete a referential check. Requires `read-write` scope. ```bash armor referential delete [OPTIONS] ``` | Option | Description | |--------|-------------| | `--yes, -y` | Skip confirmation prompt | --- ## armor lineage Data lineage commands. ### armor lineage get Get lineage for an asset. 
```bash armor lineage get [OPTIONS] ``` | Option | Description | |--------|-------------| | `--depth, -d` | Depth of lineage (1-5, default: 1) | | `--direction` | Direction: `upstream`, `downstream`, `both` (default: `both`) | **Example:** ```bash $ armor lineage get snowflake.prod.warehouse.orders --depth 2 Lineage for: snowflake.prod.warehouse.orders Upstream (dependencies): - snowflake.raw.stripe.orders - snowflake.raw.crm.customers Downstream (dependents): - snowflake.prod.mart.orders_mart - looker.sales_dashboard ``` --- ## armor contract ODCS contract export/import (see the [Data Contracts guide](/guides/contracts) for the ODCS background). ### armor contract pull Export one or many contracts as ODCS YAML. ```bash armor contract pull [--asset UUID | --warehouse NAME | --all] [-o FILE] [OPTIONS] ``` **Scope (exactly one required):** - `--asset UUID`: one asset, sync. Writes YAML to stdout unless `-o` given. - `--warehouse NAME`: every contract in the warehouse, async job, writes zip to `-o`. - `--all`: every contract in the account, async job, writes zip to `-o`. **Options:** - `-o, --output PATH`: output file. Required for bulk scopes. - `--mode extended|odcs-pure`: default `extended`. `odcs-pure` strips AnomalyArmor-specific fields for maximum ODCS interop. - `--include DOMAINS`: comma-separated domain names to include (e.g. `freshness,validity`). - `--exclude DOMAINS`: comma-separated domain names to exclude. - `--timeout SECONDS`: async job poll timeout, default 600. 
**Examples:** ```bash # One asset to stdout armor contract pull --asset # One asset to file, freshness + validity only armor contract pull --asset -o orders.yaml --include freshness,validity # Whole warehouse armor contract pull --warehouse analytics -o analytics.zip # Everything, pure ODCS (max portability) armor contract pull --all --mode odcs-pure -o contracts.zip ``` **Domain names:** `schema`, `freshness`, `validity`, `metrics`, `row_count`, `drift_monitors`, `schema_drift`, `alert_rules`, `destinations`, `blackouts`. --- ## Configuration ### Config File The CLI stores configuration in `~/.armor/config.yaml`: ```yaml api_key: aa_live_your_key_here api_url: https://api.anomalyarmor.ai timeout: 30 ``` ### Environment Variables Override config file with environment variables: ```bash export ARMOR_API_KEY="aa_live_xxx" export ARMOR_API_URL="https://api.anomalyarmor.ai" ``` Environment variables take precedence over config file values. --- # Data Contracts (ODCS export/import) > Export and import AnomalyArmor monitoring config as portable ODCS YAML AnomalyArmor speaks the [Open Data Contract Standard (ODCS) v3.1.0](https://bitol-io.github.io/open-data-contract-standard/latest/). Every asset's monitoring config (freshness SLAs, validity rules, schema, metrics, drift monitors, alert rules, destinations, blackouts) exports as ODCS YAML and imports back on demand. Three things this gives you: 1. **Interoperability.** ODCS is a Linux Foundation project. Soda, Great Expectations, and dbt speak it. Your contract is not a proprietary blob. 2. **No lock-in.** Click Export, get a directory of YAML. Import it into whichever tool you pick next. No rewrite required. 3. **Version control.** Contracts diff cleanly in Git. Reviewers see which validity rule changed in a PR, not a screenshot of a settings page. ## ODCS-native vs ODCS-extended ODCS covers the core concepts every tool agrees on: schema, column quality rules, freshness SLAs, ownership. 
We use its sanctioned `customProperties` extension mechanism for the AnomalyArmor-specific concepts ODCS does not have a native slot for. | Native ODCS (any tool reads these) | Extended (under `customProperties.anomalyarmor`) | |---|---| | `apiVersion`, `id`, `name`, `version`, `status`, `domain`, `tags` | Schema drift monitoring (ODCS has no schema-change concept) | | `schema[]` with `properties[]` + `physicalType` | ML distribution drift (PSI, KS, chi-squared) | | `quality[]` metric library (`nullValues`, `invalidValues`, `duplicateValues`, `rowCount`) | Row-count anomaly detection with rolling baselines | | `quality[]` custom SQL | Alert rules and routing | | `slaProperties[]` (latency, frequency) | Destinations (Slack, email, PagerDuty, Linear, webhooks) | | `team[]` | Operating schedules and blackout windows | Our exported YAML is **fully ODCS-compliant**: it passes the [official v3.1.0 JSON Schema validator](https://github.com/bitol-io/open-data-contract-standard/tree/main/schema). Using `customProperties` is exactly how Soda and Great Expectations handle the same tension: industry-standard, not a workaround. ODCS models `customProperties` as a **list of `{property, value}` entries**, not a dict. Our AnomalyArmor extensions all ride inside a single entry whose `property` is `anomalyarmor`: ```yaml customProperties: - property: anomalyarmor value: freshness: [...] validity: [...] drift_monitors: [...] ``` ## Deviations from ODCS v3.1.0 Small, documented places where our exported YAML does more or less than the spec: - **`schema[].properties[].id` (extension).** ODCS does not define an `id` at the column level. We write a stable UUID5 per column so re-import can match columns across renames. `odcs-pure` mode strips it; other ODCS tools read the column by `name` and ignore `id`. 
- **Unmodeled optional sections.** These ODCS sections are recognized but we do not produce or consume them: `servers`, `roles`, `price`, `support`, `authoritativeDefinitions`, `contractCreatedTs`. They round-trip opaquely if present on import and are never emitted on export. ## Portability modes Two export modes control what gets written to the YAML. ### `extended` (default) Full round-trip fidelity. Includes `customProperties.anomalyarmor.` blocks with every AA-specific configuration knob (drift thresholds, monitoring modes, alert routing, etc.). Use this for version control, backup, and re-import into AnomalyArmor. ### `odcs-pure` Strips every `customProperties.anomalyarmor` block. The resulting YAML contains only the ODCS-native subset that any ODCS tool can read and act on: schema, quality rules, freshness SLAs, team. You lose the AA-extended features (drift, alert routing, blackouts) but gain maximum portability for the interop case. Pick `odcs-pure` when you want to hand the contract to a Soda or Great Expectations customer. Pick `extended` everywhere else. ## CLI ```bash pip install anomalyarmor-cli armor auth login --key aa_live_your_key_here ``` The package installs two console scripts, `armor` and `anomalyarmor`. They are aliases, use whichever you prefer. Examples below use `armor` for brevity. ### Export: one asset to stdout ```bash armor contract pull --asset ``` Writes ODCS YAML to stdout. Pipe to `yq` or redirect to a file. ### Export: one asset to a file ```bash armor contract pull --asset -o orders.yaml ``` ### Export: whole warehouse as a zip ```bash armor contract pull --warehouse analytics -o analytics.zip ``` Submits an async job, polls to completion, and writes the zip. 
Layout inside the zip: ``` analytics/ contract/ orders.yaml customers.yaml invoices.yaml ``` ### Export: every contract in your account ```bash armor contract pull --all -o everything.zip ``` ### Filtering (export or import) Include or exclude specific config domains: ```bash # Only freshness + validity armor contract pull --asset --include freshness,validity -o partial.yaml # Everything except alert routing and destinations armor contract pull --asset --exclude alert_rules,destinations -o no-alerts.yaml ``` Registered domain names: `schema`, `freshness`, `validity`, `metrics`, `row_count`, `drift_monitors`, `schema_drift`, `alert_rules`, `destinations`, `blackouts`. ### Export: ODCS-pure mode ```bash armor contract pull --asset --mode odcs-pure -o orders.odcs.yaml ``` ### Validate a YAML file (no DB, no auth required beyond the API key) ```bash armor contract validate -f orders.yaml ``` Exits 0 on a valid document, non-zero on parse or schema errors. Errors are printed one-per-line with file path, YAML path, and line number, so editors and CI can parse the output. Cheap: a single HTTP round-trip regardless of contract size. ### Plan an import (dry-run diff) ```bash armor contract plan --asset -f orders.yaml ``` Shows a per-domain diff without applying. Output: ``` Plan for asset 53581432-...: freshness +2 ~1 -0 validity +0 ~3 -1 drift_monitors +0 ~0 -0 ``` `+` is additions, `~` is modifications, `-` is deletions. ### Apply an import ```bash armor contract apply --asset -f orders.yaml ``` Transactional per asset. Any domain failure rolls back the whole contract. By default, configs present in the live asset but missing from the YAML are **not deleted** (warn-only). To prune, pass `--prune`: ```bash armor contract apply --asset -f orders.yaml --prune ``` `--prune` deletes configs that are absent from the YAML. Treat this the way you would treat `rm -rf`, review the `plan` output first. 
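
As a hedged sketch of that review step, the helper below totals the deletions reported in a `plan` summary so a wrapper script can refuse to pass `--prune` until someone has looked at them. It assumes every domain line has exactly the `name +N ~N -N` shape of the sample plan output in this guide, and that the `-N` column reflects what `--prune` would remove; `plan_deletions` is a hypothetical helper, not part of the CLI.

```bash
#!/usr/bin/env bash
# Sketch only: plan_deletions is a hypothetical helper, not an armor command.
# Assumes plan lines look like "  freshness +2 ~1 -0" (the sample format).

# Read a plan summary on stdin and print the total number of deletions.
plan_deletions() {
    awk '
        # Only count lines shaped like: <domain> +N ~N -N
        $2 ~ /^\+[0-9]+$/ && $3 ~ /^~[0-9]+$/ && $4 ~ /^-[0-9]+$/ {
            total += substr($4, 2)
        }
        END { print total + 0 }
    '
}
```

For example, `armor contract plan --asset "$ASSET_ID" -f orders.yaml | plan_deletions` prints the deletion total for that plan, where `ASSET_ID` is a placeholder for your own asset ID. A CI job could fail whenever the total is non-zero and a human has not approved the prune.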
### Diff two local YAML files ```bash armor contract diff baseline.yaml pr.yaml ``` Pure client-side, no server call. Uses the same identity-matching logic as the server-side plan, so the summary line-up is identical. Useful in CI to compare `contracts/main.yaml` to `contracts/pr.yaml` before calling `plan` or `apply`. ## In-product UI The same primitives ship in the app. - **Export.** Every asset detail page has an **Export as YAML** action. Opens a modal with three controls: asset scope (this asset / schema / warehouse / all), feature-domain checkboxes, and the `extended` / `odcs-pure` toggle. Multi-asset scopes download as a zip. - **Import.** Same page, **Import YAML** action next to Export. Drag-drop or click-to-browse a `.yaml` file. The modal runs server-side validation, then shows a plan preview (`+2 freshness, ~1 validity, -0 drift`). Apply button is disabled until the plan is clean. Prune has an explicit confirmation dialog that lists what will be removed. - **Bulk import.** Asset list supports checkbox selection; the bulk action bar exposes **Import YAML** which applies one contract to many selected assets (async job, polled in the modal). ## REST API All endpoints authenticated via your API key. Single-asset endpoints are synchronous; bulk endpoints run as async jobs. ### Export, single asset (synchronous) ``` GET /api/v1/contracts/{asset_public_id}/export ?mode=extended|odcs-pure (default: extended) &include=freshness,validity (comma-separated, default: all) &exclude=alert_rules (applied after include) ``` Returns `application/x-yaml` with a `Content-Disposition: attachment` header. ### Export, bulk (asynchronous) ``` POST /api/v1/contracts/export-jobs { "scope": "warehouse"|"all", "scope_name": "analytics", // required for scope=warehouse "mode": "extended", "include": ["freshness"], "exclude": ["alert_rules"] } ``` Returns `{"job_id": ""}`. 
Poll with: ``` GET /api/v1/contracts/export-jobs/{job_id} ``` Response includes `status`, `progress_percent`, and `asset_count`. When status is `completed`, download: ``` GET /api/v1/contracts/export-jobs/{job_id}/download ``` Returns `application/zip` with the laid-out `{warehouse}/contract/{table}.yaml` archive. ### Validate (no DB, no asset context) ``` POST /api/v1/contracts/validate Body: raw ODCS YAML ``` Returns `{ valid: bool, errors: [{line, yaml_path, code, message}], contract_summary: {...} }`. ### Plan, single asset (synchronous) ``` POST /api/v1/contracts/{asset_public_id}/plan Body: raw ODCS YAML ``` Returns `{ valid, errors, diffs: { : { added, modified, removed } } }`. Read-only, no mutations. ### Apply, single asset (synchronous, transactional) ``` POST /api/v1/contracts/{asset_public_id}/apply?prune=false Body: raw ODCS YAML ``` Returns `{ valid, applied: { : {added, modified, removed} }, unsupported_domains: [], pruned: bool }`. Partial failures roll back the entire contract. ### Apply, bulk (asynchronous) ``` POST /api/v1/contracts/apply-jobs { "asset_public_ids": ["...", "..."], "yaml": "apiVersion: v3.1.0\n...", "prune": false } ``` Returns `{"job_id": ""}`. Poll `GET /api/v1/contracts/apply-jobs/{job_id}` for per-asset status. ## Round-trip example ```bash # Export armor contract pull --asset -o orders.yaml # Edit the freshness threshold in orders.yaml vim orders.yaml # Commit to Git git add orders.yaml && git commit -m "tighten freshness SLA on orders" # Preview the change without applying armor contract plan --asset -f orders.yaml # Apply once the diff looks right armor contract apply --asset -f orders.yaml ``` ## Adapter coverage at a glance Three adapters ship today. The grid shows which kinds of rules each adapter converts automatically, which convert with caveats, and which you rewrite as custom rules in AnomalyArmor. Every skipped check surfaces a warning, never a silent drop. 
See the per-adapter guides below for the exact mapping tables.

## Coming from Soda?

[Soda ships ODCS export](https://docs.soda.io/data-testing/git-managed-data-contracts) as of 2024. Run `soda export --odcs` on your existing Soda project and the resulting YAML imports directly into AnomalyArmor:

```bash
# Redirect to a file, not a directory, so the shell can create it
soda export --odcs > contracts/orders.yaml
armor contract apply --asset -f contracts/orders.yaml
```

If you want to preview the migration before signing up, paste your Soda YAML into [anomalyarmor.ai/migrate](https://www.anomalyarmor.ai/migrate) for a structural summary without an account.

## Coming from dbt or Great Expectations?

Those tools do not emit ODCS directly. We ship adapters that translate their config into ODCS YAML, which then feeds the same `contract apply` pipeline:

```bash
# dbt: translate schema.yml tests and contracts into ODCS
armor migrate-from dbt ./my-dbt-project/ > contracts/from-dbt.yaml

# Great Expectations: translate expectation suites into ODCS
armor migrate-from great-expectations ./great_expectations/ > contracts/from-great-expectations.yaml
```

See the per-adapter guides for mapping rules, unsupported expectations, and CLI options:

- [Migrate from dbt](/guides/migrate-from-dbt)
- [Migrate from Great Expectations](/guides/migrate-from-great-expectations)

## Limits and current gaps

- **No S3 artifact storage for huge bulk exports.** Jobs today inline the zip bytes in the status response, which is fine for hundreds of tables. Company-wide exports with thousands of tables should use multiple `--warehouse` jobs.
- **Single-asset apply only in `contract apply`.** Multi-asset apply ships through the bulk-apply REST/UI path, not through the CLI flag surface yet.
- **Some domains are `unsupported` on import.** The `apply` response lists any domains whose `import_` is not yet implemented in `unsupported_domains`. Pair with `plan` first to see whether they would have changed before committing.

## Common Questions

### What is ODCS and why does AnomalyArmor use it?
ODCS is the [Open Data Contract Standard](https://bitol-io.github.io/open-data-contract-standard/latest/), a Linux Foundation project that defines a vendor-neutral YAML format for data contracts. Using it means your AnomalyArmor config is portable: Soda, Great Expectations, and dbt also speak it, so you keep optionality if you ever switch tools. ### What is the difference between `extended` and `odcs-pure` export modes? `extended` (default) includes the full `customProperties.anomalyarmor` block so the contract round-trips with every AA-specific feature intact (drift, alert routing, blackouts). `odcs-pure` strips those blocks and emits only the ODCS-native subset (schema, quality, freshness, team). Use `odcs-pure` when handing the contract to a Soda or Great Expectations user, `extended` everywhere else. ### How do I preview contract changes before applying them? Run `armor contract plan --asset -f contract.yaml`. It returns a per-domain diff showing additions, modifications, and deletions without mutating anything. Apply with `armor contract apply` once the diff looks right. See [Plan an import](#plan-an-import-dry-run-diff). ### Does `contract apply` delete configs that aren't in my YAML? No, not by default. Configs present in the live asset but missing from the YAML are warn-only, so you won't accidentally wipe rules by applying a partial contract. Pass `--prune` to explicitly delete them, and treat that flag the way you would `rm -rf`, always review the `plan` output first. ### Can I put AnomalyArmor contracts in Git? Yes, that's one of the main reasons to use them. The YAML diffs cleanly in PRs so reviewers see which validity rule or freshness SLA changed, not a screenshot of a settings page. Export with `armor contract pull`, commit the file, and use `armor contract diff` in CI to compare branches. ### Can I validate a contract YAML without applying it? Yes. 
`armor contract validate -f contract.yaml` runs the document against the official ODCS v3.1.0 JSON Schema and exits non-zero on parse or schema errors. Errors print with file path, YAML path, and line number so editors and CI can surface them inline. No DB connection required. --- # Migrate from dbt > Translate a dbt project's schema.yml into ODCS YAML and apply it to AnomalyArmor If you already describe your warehouse in dbt, you already have half of an AnomalyArmor contract. This guide walks through converting a dbt project's `schema.yml` files into ODCS YAML and applying it with `armor contract apply`. Install the CLI first (see [Data Contracts](/guides/contracts#cli) for the auth step): ```bash pip install anomalyarmor-cli armor auth login --key aa_live_your_key_here ``` ## One-line pipeline ```bash armor migrate-from dbt ./my-dbt-project/ | armor contract apply --asset -f - ``` The adapter writes ODCS YAML to stdout and `contract apply` reads from stdin when passed `-f -`. If the adapter maps zero models, it refuses to emit (non-zero exit) so the pipeline fails clean instead of clobbering your live config with an empty contract. Prefer a two-step flow for anything non-trivial: ```bash # 1. Translate armor migrate-from dbt ./my-dbt-project/ -o contracts/from-dbt.yaml # 2. Preview armor contract plan --asset -f contracts/from-dbt.yaml # 3. Apply once the diff looks right armor contract apply --asset -f contracts/from-dbt.yaml ``` ## What the adapter reads Running `armor migrate-from dbt ` walks the project root recursively and parses every `schema.yml` / `schema.yaml` file. It skips: - `dbt_packages/` and `dbt_modules/` (vendored third-party packages, not your models). - `node_modules/` (front-end tooling that sometimes coexists with dbt repos). - Any hidden directory (a path segment starting with `.`). Your project root is the directory holding `dbt_project.yml`. 
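
The skip rules above can be approximated with `find`, which is handy for checking which files a scan is likely to pick up before you run the adapter. This is only a sketch of the documented rules, not the adapter's actual implementation; `list_schema_files` is a hypothetical helper, and `-mindepth` assumes GNU or BSD `find`.

```bash
#!/usr/bin/env bash
# Sketch only: approximates the documented scan rules with find.
# list_schema_files is a hypothetical helper, not part of the armor CLI.

# Print every schema.yml / schema.yaml under the given project root,
# skipping dbt_packages/, dbt_modules/, node_modules/, and hidden directories.
list_schema_files() {
    local root="$1"
    # -mindepth 1 keeps the root itself (e.g. ".") from matching the '.*' prune.
    find "$root" -mindepth 1 \
        \( -type d \( -name 'dbt_packages' -o -name 'dbt_modules' \
                      -o -name 'node_modules' -o -name '.*' \) -prune \) \
        -o \( -type f \( -name 'schema.yml' -o -name 'schema.yaml' \) -print \) \
        | LC_ALL=C sort
}
```

Running `list_schema_files ./my-dbt-project` lists the candidate files, one per line, in a stable order.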
The adapter does **not** call `dbt compile` or `dbt parse`, so you do not need a live target or a working `profiles.yml`. Static file scan only. ## Mapping table | dbt input | ODCS output | Notes | |---|---|---| | `models[].name` | `schema[].name` | Table identity. | | `models[].columns[].name` | `schema[].properties[].name` | Column identity. | | `models[].columns[].data_type` | `schema[].properties[].physicalType` | Only emitted if present in the dbt file. | | `models[].columns[].description` | `schema[].properties[].description` | Native ODCS. | | `models[].description` | `schema[].description` | Native ODCS. | | `columns[].tests[].not_null` | validity rule, `rule_type=not_null` | AA extension. | | `columns[].tests[].unique` | validity rule, `rule_type=unique` | AA extension. | | `columns[].tests[].accepted_values.values` | validity rule, `rule_type=allowed_values` | Needs a non-empty `values` list. | | `models[].config.freshness` | freshness schedule (hours) | Best-effort: `error_after` preferred, falls back to `warn_after`. `minute` / `hour` / `day` periods convert to hours. | Everything the adapter understands lands in the emitted YAML. Everything else lands in the **warnings** section of the summary line so you know what to review. ## What gets skipped (and why) | Input shape | Reason | Resolution | |---|---|---| | `dbt_utils.*` tests (e.g. `dbt_utils.accepted_range`) | No 1:1 map to an AA validity rule | Re-author as a custom SQL check in AnomalyArmor, or add to the request queue for adapter coverage. | | `dbt_expectations.*` tests | Too many variants to map safely | Same as above. | | `relationships` tests | Reference integrity is not modeled by AA validity rules | Track as a custom SQL check. | | Any test with a namespaced name (contains `.`) | Conservative skip | Adapter errs on the side of warning over silent mistranslation. | | `accepted_values` with empty or missing `values` | Would emit an empty rule | Fix the test in dbt, re-run. 
| | `freshness` with a non-`minute`/`hour`/`day` period | Cannot normalize to hours | Normalize in dbt or edit the emitted YAML directly. | | dbt `meta`, `tags`, `owner` | No round-trip design yet | Not a data-quality signal; track with AA tags or leave in dbt. | The CLI summary prints a one-liner that groups these warnings so you see totals at a glance: ``` dbt -> ODCS (./my-dbt-project/): mapped 42, warnings 7 (dbt_utils.accepted_range: 3, ...). ``` If `mapped_count` is zero, the adapter exits non-zero and writes nothing to stdout. ## Flags | Flag | Purpose | |---|---| | `--output`, `-o` | Write YAML to a file instead of stdout. Pass `-` or omit for stdout. | | `--name` | Override the contract `name` field. Defaults to the project directory name. | ## Validate before applying ```bash armor migrate-from dbt ./my-dbt-project/ -o from-dbt.yaml armor contract validate -f from-dbt.yaml ``` `contract validate` runs the document against the ODCS v3.1.0 JSON Schema. Parse or schema errors print with file path, YAML path, and line number so editors and CI can surface them inline. No DB connection required beyond auth. ## One dbt project, many assets Today `contract apply` is asset-scoped (one asset per call). If your dbt project covers multiple warehouse tables: 1. Translate once: `armor migrate-from dbt ./proj -o full.yaml`. 2. Split by table or re-export per-asset from AnomalyArmor to get the target asset scoping, then apply each file against its asset. Multi-asset bulk apply runs through the REST jobs endpoint today (see [the bulk-apply section of the contracts guide](/guides/contracts#apply-bulk-asynchronous)). A CLI shortcut is on the roadmap. ## Common Questions ### How do I migrate from dbt tests to AnomalyArmor? Run `armor migrate-from dbt ./my-dbt-project/` and pipe the ODCS YAML into `armor contract apply --asset -f -`. The adapter reads every `schema.yml` statically, so you don't need a live warehouse or working `profiles.yml`. 
See the [one-line pipeline](#one-line-pipeline). ### Which dbt tests does AnomalyArmor support? Built-in `not_null`, `unique`, and `accepted_values` tests map to AnomalyArmor validity rules. Column `description`, `data_type`, and model `description` land in native ODCS fields. Freshness configs convert into hour-based SLAs. See the full [mapping table](#mapping-table). ### Why does the adapter skip `dbt_utils` and `dbt_expectations` tests? Those tests have too many variants to map safely into AnomalyArmor's validity rules without risking silent mistranslation, so the adapter warns rather than guesses. Re-author them as custom SQL checks in AnomalyArmor. The CLI summary groups the warnings so you see totals at a glance. ### Do I need dbt installed to run the migration? No. The adapter is a static file scan: it walks the project directory and parses `schema.yml` files directly. You don't need `dbt compile`, `dbt parse`, a live target, or a working `profiles.yml`. ### Can I preview the migration before applying it? Yes. Write the translated YAML to a file first with `armor migrate-from dbt ./proj -o from-dbt.yaml`, then run `armor contract validate` and `armor contract plan` against it. Plan shows a per-domain diff (additions, modifications, deletions) without mutating anything. See [Validate before applying](#validate-before-applying). ### How do I migrate a dbt project that covers many warehouse tables? `contract apply` is asset-scoped today (one call per asset), so translate once with `armor migrate-from dbt ./proj -o full.yaml`, then split the file by table (or re-export per asset from AnomalyArmor) and apply each file against its target asset. For a true bulk flow, use the [REST bulk-apply endpoint](/guides/contracts#apply-bulk-asynchronous). A CLI shortcut is on the roadmap.
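
The "split by table" step can be scripted. The sketch below is one possible approach, not part of the CLI: `split_contract_by_table` is a hypothetical helper that works on a contract already parsed into a Python dict (load and dump the YAML with whichever YAML library you use). It assumes the layout shown in the migration guides: tables under `schema`, and AnomalyArmor rules under the `anomalyarmor` entry of `customProperties`, each rule carrying a `table_path` equal to the table name. Other extension keys are copied unchanged, so review them before applying.

```python
import copy


def split_contract_by_table(contract):
    """Split one parsed ODCS contract into one contract per table.

    Hypothetical helper. Assumes validity rules live under the
    ``anomalyarmor`` custom property and are scoped by ``table_path``.
    """
    per_table = {}
    for table in contract.get("schema", []):
        name = table["name"]
        # Copy the whole document so top-level fields (name, version, ...) carry over.
        single = copy.deepcopy(contract)
        single["schema"] = [copy.deepcopy(table)]
        for prop in single.get("customProperties", []):
            if prop.get("property") != "anomalyarmor":
                continue
            value = prop.get("value") or {}
            if "validity" in value:
                # Keep only the rules scoped to this table.
                value["validity"] = [
                    rule for rule in value["validity"]
                    if rule.get("table_path") == name
                ]
        per_table[name] = single
    return per_table
```

Each value in the returned dict can be written to its own file and passed to `armor contract plan` and `armor contract apply` against the matching asset.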
## See also - [Data Contracts (ODCS export/import)](/guides/contracts) - [Migrate from Great Expectations](/guides/migrate-from-great-expectations) - [anomalyarmor.ai/migrate](https://www.anomalyarmor.ai/migrate) for Soda users (ODCS export is native in Soda) --- # Migrate from Great Expectations > Convert a Great Expectations project into an AnomalyArmor data contract The `armor migrate-from great-expectations` command converts a Great Expectations project on disk into an [ODCS](/guides/contracts) YAML document. Pipe the output straight into `armor contract apply` to port your expectation suites into AnomalyArmor validity rules. ```bash armor migrate-from great-expectations ./great_expectations/ | armor contract apply --asset -f - ``` The adapter is file-based: it reads `great_expectations.yml` and the expectation-suite JSON files directly. You do not need the `great_expectations` pip package installed to run the migration. ## What gets mapped ### Native ODCS fields These expectations land on the standard ODCS property surface so they round-trip through any ODCS tool, not just AnomalyArmor. | Great Expectations | ODCS field | | ---------------------------------------- | --------------------------------------- | | `expect_column_values_to_not_be_null` | `schema[].properties[].required = true` | | `expect_column_values_to_be_unique` | `schema[].properties[].unique = true` | ### AnomalyArmor validity rules These expectations map to validity rules under `customProperties.anomalyarmor.validity`. 
| Great Expectations | AnomalyArmor `rule_type` | Notes | | ------------------------------------------------ | ------------------------ | --------------------------------------------------------- | | `expect_column_values_to_be_in_set` | `allowed_values` | `value_set` → `values` | | `expect_column_values_to_match_regex` | `regex_match` | `regex` → `pattern` | | `expect_column_values_to_be_between` | `range_bounds` | `min_value` / `max_value` → `min` / `max` | | `expect_column_value_lengths_to_be_between` | `length_bounds` | `min_value` / `max_value` → `min_length` / `max_length` | | `expect_column_values_to_match_strftime_format` | `format` | `%Y-%m-%d` → `iso_date`; `%Y-%m-%dT%H:%M:%S` → `iso_datetime` | ### Unmapped expectations Anything not in the table above surfaces as a warning in the command summary rather than being silently dropped. The common ones that land in warnings today: - `expect_column_pair_*` and `expect_multicolumn_*` (cross-column logic) - `expect_column_kl_divergence_*` and other statistical distribution checks - Custom expectations (`dbt_expectations.*`, user-defined subclasses) - `expect_column_values_to_be_of_type` (type-system translation is its own problem) - Table-scoped expectations like `expect_table_row_count_to_be_between` (row-count monitoring needs its own follow-up wiring) For expectations without a native AnomalyArmor equivalent, author a custom SQL check after the migration rather than trying to force them through the validity-rule shape. ## Project shapes supported The adapter auto-detects whether you have a legacy (pre-v1) project or a v1 project using Fluent Datasources. - **Legacy projects.** Expectation suites live as JSON under `{project_root}/expectations/*.json`. The adapter walks that tree directly. - **v1 Fluent Datasources projects.** The adapter reads the `fluent_datasources` block from `great_expectations.yml` to pull per-column type hints into `schema[].properties[].physicalType`. 
Suites still need to live under `expectations/`. If your project uses a non-default suite store (a database, S3, etc.), export the suites to disk before migrating: ```python import json import great_expectations as gx context = gx.get_context() for name in context.list_expectation_suite_names(): suite = context.get_expectation_suite(name) with open(f"expectations/{name}.json", "w") as f: json.dump(suite.to_json_dict(), f, indent=2) ``` The adapter emits a warning with a pointer to this snippet when it detects a project shape it cannot walk via files alone. ## Suite-to-table resolution Great Expectations organizes suites by suite name, not by table. AnomalyArmor scopes every validity rule to a `table_path`. The adapter resolves the mapping in this order: 1. `meta.anomalyarmor.table_path` inside the suite JSON (the explicit escape hatch). 2. The suite's `expectation_suite_name` field (GE's canonical source of truth for the suite's logical name). 3. The suite filename without the `.json` extension (final fallback). If your suite names already match your target table names, you don't need to touch anything. When they don't, set the escape hatch inside the suite: ```json { "expectation_suite_name": "orders_v2_suite", "meta": { "anomalyarmor": { "table_path": "warehouse.public.orders" } }, "expectations": [...] 
} ``` ## Worked example Given this legacy project layout: ``` great_expectations/ ├── great_expectations.yml └── expectations/ └── orders.json ``` ...where `orders.json` contains: ```json { "expectation_suite_name": "public.orders", "expectations": [ { "expectation_type": "expect_column_values_to_not_be_null", "kwargs": {"column": "order_id"} }, { "expectation_type": "expect_column_values_to_be_unique", "kwargs": {"column": "order_id"} }, { "expectation_type": "expect_column_values_to_be_in_set", "kwargs": {"column": "status", "value_set": ["pending", "shipped", "delivered"]} }, { "expectation_type": "expect_column_values_to_match_regex", "kwargs": {"column": "email", "regex": "^[^@]+@[^@]+\\.[^@]+$"} } ] } ``` Running: ```bash armor migrate-from great-expectations ./great_expectations/ --output contract.yaml ``` ...produces: ```yaml apiVersion: v3.1.0 kind: DataContract id: name: great_expectations version: 1.0.0 status: active schema: - name: public.orders properties: - name: order_id required: true unique: true - name: status - name: email customProperties: - property: anomalyarmor value: validity: - table_path: public.orders column_name: status rule_type: allowed_values rule_config: values: [pending, shipped, delivered] severity: error treat_null_as_valid: true check_interval: daily - table_path: public.orders column_name: email rule_type: regex_match rule_config: pattern: "^[^@]+@[^@]+\\.[^@]+$" severity: error treat_null_as_valid: true check_interval: daily ``` Then validate and apply: ```bash armor contract validate -f contract.yaml armor contract plan --asset -f contract.yaml armor contract apply --asset -f contract.yaml ``` Re-running the migration is safe: the command produces a deterministic `id` seeded from the project path, so re-applying the same contract will dedupe rather than create duplicates. ## Common Questions ### How do I migrate from Great Expectations to AnomalyArmor? 
Run `armor migrate-from great-expectations ./great_expectations/` to translate your expectation suites into ODCS YAML, then pipe it into `armor contract apply --asset -f -`. The adapter is file-based and reads suite JSON directly, so you don't need the `great_expectations` pip package installed on the migration host. ### Which Great Expectations expectations are supported? `not_null` and `unique` map to native ODCS `required` / `unique`. Five more map to AnomalyArmor validity rules: `be_in_set`, `match_regex`, `be_between`, `value_lengths_between`, and `match_strftime_format`. See the full [mapping tables](#what-gets-mapped) for exact field translations. ### What happens to expectations the adapter can't map? They surface as warnings in the CLI summary, never silent drops. Common unsupported cases are `expect_column_pair_*`, `expect_multicolumn_*`, KL-divergence / distribution checks, custom subclasses, and type checks. Re-author those as custom SQL checks after the migration rather than forcing them through the validity-rule shape. ### Does the adapter work with Great Expectations v1 Fluent Datasources? Yes. The adapter auto-detects v1 projects and reads the `fluent_datasources` block from `great_expectations.yml` to pull per-column type hints into `physicalType`. Suites still need to live on disk under `expectations/`, so if you use a non-default store (database, S3), export them to disk first. See [Project shapes supported](#project-shapes-supported). ### How does AnomalyArmor know which table an expectation suite belongs to? The adapter resolves the suite-to-table mapping in this order: `meta.anomalyarmor.table_path` inside the suite JSON, then the suite's `expectation_suite_name` field, then the filename. Use the `meta.anomalyarmor.table_path` escape hatch when suite names don't match your target table paths. See [Suite-to-table resolution](#suite-to-table-resolution). ### Is it safe to re-run the migration? Yes.
The command produces a deterministic contract `id` seeded from the project path, so re-applying the same contract deduplicates instead of creating duplicates. Combine with `armor contract plan` first to preview the diff before `apply`. ## See also - [Data Contracts (ODCS export/import)](/guides/contracts) - [Migrate from dbt](/guides/migrate-from-dbt) --- # First API Integration > Step-by-step guide to your first programmatic integration with AnomalyArmor This guide walks you through building your first programmatic integration with AnomalyArmor. By the end, you'll have a working data quality check that can run in your pipeline. ## Prerequisites - An AnomalyArmor account with at least one connected data source - Python 3.9+ installed - An API key (create in **Settings > API Keys**) ## Step 1: Install the SDK ```bash pip install anomalyarmor-cli ``` ## Step 2: Configure Authentication Store your API key securely. You have two options: **Option A: Environment variable (recommended for CI/CD)** ```bash export ARMOR_API_KEY="aa_live_your_key_here" ``` **Option B: Config file (for local development)** ```bash armor auth login # Follow the prompts to enter your API key ``` ## Step 3: Verify Connection Test that everything is working: ```python from anomalyarmor import Client client = Client() # List your data assets assets = client.assets.list(limit=5) for asset in assets: print(f"{asset.qualified_name} ({asset.asset_type})") ``` You should see a list of your connected tables and views. ## Step 4: Check Data Freshness The most common integration pattern is checking data freshness before running a pipeline. 
Here's a complete example: ```python from anomalyarmor import Client from anomalyarmor.exceptions import StalenessError def run_pipeline(): client = Client() # Check that source data is fresh before processing try: client.freshness.require_fresh("snowflake.prod.warehouse.orders") print("Data is fresh, proceeding with pipeline...") # Your pipeline logic here except StalenessError as e: print(f"Pipeline aborted: data is {e.hours_since_update:.1f}h stale") raise if __name__ == "__main__": run_pipeline() ``` ## Step 5: Add Data Quality Checks Expand your integration with validity and referential integrity checks: ```python from anomalyarmor import Client client = Client() asset_id = "your-asset-uuid" # Get from assets.list() # Check freshness summary = client.freshness.summary() print(f"Overall freshness: {summary.fresh_percentage}%") # Check validity rules validity_summary = client.validity.summary(asset_id) if validity_summary.failing > 0: print(f"Warning: {validity_summary.failing} validity rules failing") # List failing rules rules = client.validity.list(asset_id) for rule in rules: result = client.validity.check(asset_id, rule.uuid) if result.status == "fail": print(f" - {rule.name}: {result.invalid_count} invalid records") # Check referential integrity ref_summary = client.referential.summary(asset_id) if ref_summary.failing_checks > 0: print(f"Warning: {ref_summary.failing_checks} referential checks failing") ``` ## Step 6: Create a Quality Gate Combine all checks into a single quality gate function: ```python from anomalyarmor import Client from anomalyarmor.exceptions import StalenessError def quality_gate(asset_id: str, critical_tables: list[str]) -> bool: """ Run all data quality checks before pipeline execution. Returns True if all checks pass, False otherwise. Note: list[str] type hint requires Python 3.9+. For earlier versions, use typing.List[str] instead. """ client = Client() passed = True # 1. 
Check freshness of critical tables for table in critical_tables: try: client.freshness.require_fresh(table) print(f"[PASS] Freshness: {table}") except StalenessError as e: print(f"[FAIL] Freshness: {table} ({e.hours_since_update:.1f}h stale)") passed = False # 2. Check validity rules validity = client.validity.summary(asset_id) if validity.failing == 0: print(f"[PASS] Validity: {validity.total_rules} rules") else: print(f"[FAIL] Validity: {validity.failing}/{validity.total_rules} rules failing") passed = False # 3. Check referential integrity ref = client.referential.summary(asset_id) if ref.failing_checks == 0: print(f"[PASS] Referential: {ref.total_checks} checks") else: print(f"[FAIL] Referential: {ref.failing_checks}/{ref.total_checks} failing") passed = False return passed # Usage if __name__ == "__main__": asset = "your-asset-uuid" tables = [ "snowflake.prod.warehouse.orders", "snowflake.prod.warehouse.customers", ] if quality_gate(asset, tables): print("\nAll quality checks passed! Running pipeline...") # run_pipeline() else: print("\nQuality checks failed. Pipeline aborted.") exit(1) ``` ## Step 7: CLI Integration For shell scripts and CI/CD, use the CLI directly: ```bash #!/bin/bash set -e # Check freshness (exits 1 if stale) armor freshness check snowflake.prod.warehouse.orders # If we get here, data is fresh echo "Data quality checks passed!" dbt run ``` ## Common Patterns ### Pattern 1: Pre-ETL Validation Run checks before ETL starts: ```python # At the start of your ETL client.freshness.require_fresh("source_table") client.validity.check(asset_id, "not_null_rule") # Then run ETL... 
``` ### Pattern 2: Post-ETL Validation Verify output quality after ETL: ```python # After ETL completes result = client.validity.check(asset_id, "uniqueness_rule") if result.status == "fail": rollback_etl() alert_team(f"ETL produced {result.invalid_count} duplicate records") ``` ### Pattern 3: Continuous Monitoring Schedule regular quality checks: ```python # In a scheduled job (e.g., Airflow, cron) summary = client.metrics.summary(asset_id) if summary.failing > 0: alert_team(f"{summary.failing} metrics failing") ``` ## Next Steps ## Common Questions ### Where do I get an API key to start? Sign in to [app.anomalyarmor.ai](https://app.anomalyarmor.ai), open **Settings → API Keys**, and click **Create Key**. New keys are shown once - copy to a password manager or secrets manager immediately. Set the scope to `read-only` for pipeline gating or `read-write` if your integration needs to create metrics or acknowledge alerts. ### Should I use the Python SDK or raw REST calls for my first integration? Python SDK if you're in Python or Airflow - it handles pagination, retries on 429, and error typing for you. Raw REST (`curl` or `requests`) if you're in a different language, want zero dependencies, or are prototyping a webhook handler. Both paths are first-class. ### How do I test an integration without affecting production monitoring? Create a separate AnomalyArmor workspace (or use a sandbox project) with a non-production database connection. Integrations there can create/delete monitors freely without touching prod. When ready, swap the API key to production and re-run against real endpoints. ### What's the lightest possible first integration? A freshness check at the start of your pipeline: one API call, one conditional exit. `GET /api/v1/freshness/check?asset=my_table` returns whether the table meets its SLA. If no, abort the pipeline. This pattern is 5 lines of code and catches the most common data issue (stale upstream). 
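
A minimal sketch of that "one call, one conditional exit" pattern over raw REST, using only the standard library. Only the `/api/v1/freshness/check?asset=...` path comes from this page. The base URL (read here from a hypothetical `ARMOR_BASE_URL` variable), the bearer-token header, and the boolean `fresh` response field are assumptions for illustration; confirm them against the [API Reference](/api/overview) before relying on this.

```python
import json
import os
import sys
import urllib.parse
import urllib.request


def build_freshness_request(base_url, asset, api_key):
    """Build the GET request for the freshness check endpoint."""
    query = urllib.parse.urlencode({"asset": asset})
    url = f"{base_url.rstrip('/')}/api/v1/freshness/check?{query}"
    # Assumption: bearer-token auth. The real header may differ.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


def is_fresh(base_url, asset, api_key, opener=urllib.request.urlopen):
    """Return True when the (assumed) 'fresh' field in the response is truthy."""
    request = build_freshness_request(base_url, asset, api_key)
    with opener(request, timeout=10) as response:
        payload = json.load(response)
    # Assumption: the response body is JSON with a boolean "fresh" field.
    return bool(payload.get("fresh"))


def main():
    """Exit 0 when the table is fresh, 1 otherwise, so a pipeline can gate on it."""
    ok = is_fresh(os.environ["ARMOR_BASE_URL"], "my_table", os.environ["ARMOR_API_KEY"])
    sys.exit(0 if ok else 1)
```

Call `main()` from the first step of the pipeline. The `opener` parameter exists only so the function can be exercised without network access.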
- **[dbt Integration](/integrations/dbt)**: Add quality gates to dbt workflows - **[GitHub Actions](/integrations/github-actions)**: Run checks in CI/CD pipelines - **[Airflow Integration](/integrations/airflow)**: Integrate with Apache Airflow - **[API Reference](/api/overview)**: Full API documentation --- # Airflow Integration > Use AnomalyArmor in Apache Airflow DAGs Integrate AnomalyArmor with Apache Airflow to gate pipelines on data quality and freshness. ## Installation Install the SDK in your Airflow environment: ```bash pip install anomalyarmor-cli ``` ## Configuration Set `ARMOR_API_KEY` as an Airflow variable or environment variable: ```bash # Airflow Variable airflow variables set ARMOR_API_KEY "aa_live_xxx" # Or environment variable export ARMOR_API_KEY="aa_live_xxx" ``` ## Pre-flight Freshness Check The most common pattern: fail the task if upstream data is stale. ```python from airflow import DAG from airflow.operators.python import PythonOperator from datetime import datetime from anomalyarmor import Client from anomalyarmor.exceptions import StalenessError def check_upstream_freshness(): """Gate: Fail task if upstream data is stale.""" client = Client() # This raises StalenessError if data is stale client.freshness.require_fresh("snowflake.prod.warehouse.orders") print("Upstream data is fresh, proceeding...") def run_transformation(): """Main transformation logic.""" print("Running dbt models...") # subprocess.run(["dbt", "run", "--select", "orders_mart"]) with DAG( "orders_pipeline", start_date=datetime(2024, 1, 1), schedule_interval="@hourly", catchup=False, ) as dag: freshness_gate = PythonOperator( task_id="check_freshness", python_callable=check_upstream_freshness, ) transform = PythonOperator( task_id="run_transformation", python_callable=run_transformation, ) freshness_gate >> transform ``` > 💡 Use `read-only` scope for pre-flight checks. You only need `read-write` if triggering refreshes. 
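
The examples in this guide call `Client()` with no arguments, which suggests the SDK picks up `ARMOR_API_KEY` from the environment. If the key is stored as an Airflow Variable instead, something has to bridge the two. The sketch below is one way to do that under that assumption; `ensure_armor_api_key` is a hypothetical helper, not part of the SDK, and `Variable.get` is Airflow's standard Variable accessor.

```python
import os


def ensure_armor_api_key(get_variable=None):
    """Make ARMOR_API_KEY available in the environment.

    Returns a short label describing where the key came from.
    """
    if os.environ.get("ARMOR_API_KEY"):
        return "environment"
    if get_variable is None:
        # Imported lazily so the helper also loads outside an Airflow worker.
        from airflow.models import Variable

        get_variable = Variable.get
    # Copy the Airflow Variable into the process environment for the SDK to read.
    os.environ["ARMOR_API_KEY"] = get_variable("ARMOR_API_KEY")
    return "airflow-variable"
```

Call it at the top of a task callable, before constructing `Client()`, so the lookup happens at task run time and not while the scheduler parses the DAG file.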
## Check Multiple Sources Verify all upstream dependencies before running: ```python from anomalyarmor import Client from anomalyarmor.exceptions import StalenessError def check_all_upstream(): """Check all upstream sources are fresh.""" client = Client() upstream_tables = [ "snowflake.prod.warehouse.orders", "snowflake.prod.warehouse.customers", "snowflake.prod.warehouse.products", ] stale_tables = [] for table in upstream_tables: try: client.freshness.require_fresh(table) except StalenessError: stale_tables.append(table) if stale_tables: raise Exception(f"Stale upstream data: {stale_tables}") print("All upstream sources are fresh!") ``` ## Trigger Freshness Check Trigger a freshness check and wait for completion: ```python from anomalyarmor import Client def refresh_and_check(): """Trigger freshness check, then verify.""" client = Client() # Trigger refresh and wait result = client.freshness.refresh( "snowflake.prod.warehouse.orders", wait=True, ) print(f"Refresh job {result.job_id}: {result.status}") # Now check freshness client.freshness.require_fresh("snowflake.prod.warehouse.orders") ``` ## Sensor Pattern Wait for data to become fresh: ```python from airflow.sensors.python import PythonSensor from anomalyarmor import Client from anomalyarmor.exceptions import StalenessError def is_data_fresh(): """Return True when data is fresh.""" client = Client() try: client.freshness.require_fresh("snowflake.prod.warehouse.orders") return True except StalenessError: return False freshness_sensor = PythonSensor( task_id="wait_for_fresh_data", python_callable=is_data_fresh, poke_interval=300, # Check every 5 minutes timeout=3600, # Timeout after 1 hour mode="poke", ) ``` ## Check Lineage Verify all upstream dependencies using lineage: ```python from anomalyarmor import Client def check_upstream_via_lineage(): """Check all upstream sources via lineage API.""" client = Client() # Get upstream dependencies lineage = client.lineage.get("snowflake.prod.mart.orders_summary") 
print(f"Checking {len(lineage.upstream)} upstream sources...") for upstream in lineage.upstream: client.freshness.require_fresh(upstream.qualified_name) print(f" {upstream.qualified_name}") print("All upstream sources are fresh!") ``` ## Error Handling Handle different error types appropriately: ```python from anomalyarmor import Client from anomalyarmor.exceptions import ( StalenessError, AuthenticationError, RateLimitError, ArmorError, ) import time def check_with_retry(): """Check freshness with error handling.""" client = Client() max_retries = 3 for attempt in range(max_retries): try: client.freshness.require_fresh("snowflake.prod.warehouse.orders") return # Success except StalenessError as e: # Data is stale - this is expected, fail the task raise Exception(f"Data is stale: last updated {e.last_updated}") except RateLimitError as e: # Rate limited - retry after waiting if attempt < max_retries - 1: print(f"Rate limited, waiting {e.retry_after}s...") time.sleep(e.retry_after) else: raise except AuthenticationError: # Auth error - likely config issue raise Exception("Invalid ARMOR_API_KEY") except ArmorError as e: # Other API error - retry if attempt < max_retries - 1: print(f"API error, retrying: {e}") time.sleep(10) else: raise ``` ## Complete DAG Example Full example with freshness gate, transformation, and post-run schema check: ```python from airflow import DAG from airflow.operators.python import PythonOperator from airflow.operators.bash import BashOperator from datetime import datetime, timedelta from anomalyarmor import Client from anomalyarmor.exceptions import StalenessError default_args = { "retries": 1, "retry_delay": timedelta(minutes=5), } def check_upstream(): client = Client() client.freshness.require_fresh("snowflake.prod.warehouse.orders") client.freshness.require_fresh("snowflake.prod.warehouse.customers") def trigger_schema_check(): client = Client() result = client.schema.refresh( "snowflake.prod.mart.orders_summary", wait=True, ) print(f"Schema
check: {result.status}") with DAG( "orders_mart_pipeline", default_args=default_args, start_date=datetime(2024, 1, 1), schedule_interval="@daily", catchup=False, tags=["data-quality", "orders"], ) as dag: check_freshness = PythonOperator( task_id="check_upstream_freshness", python_callable=check_upstream, ) run_dbt = BashOperator( task_id="run_dbt_models", bash_command="cd /dbt && dbt run --select orders_mart", ) verify_schema = PythonOperator( task_id="verify_schema", python_callable=trigger_schema_check, ) check_freshness >> run_dbt >> verify_schema ``` ## Next Steps ## Common Questions ### How do I gate an Airflow DAG on data freshness using AnomalyArmor? Use a `PythonOperator` (or a `PythonSensor`, if the DAG should wait for data) at the start of your DAG that calls `client.freshness.require_fresh("...")` from the AnomalyArmor Python SDK. When the data is stale, the call raises `StalenessError`: in an operator, let it propagate to fail the task and short-circuit the DAG; in a sensor, catch it and return `False` to keep waiting. See the "Pre-flight Freshness Check" and "Sensor Pattern" sections above for full examples. ### Does the AnomalyArmor Python SDK work with Airflow 2 and MWAA? Yes. The SDK is a pure-Python package (`pip install anomalyarmor-cli`) and works in any Airflow environment that supports pip-installed dependencies: Airflow 2.x, Astronomer, MWAA, and Airflow on Kubernetes. MWAA requires adding the package to `requirements.txt`. ### Where should I store my AnomalyArmor API key in Airflow? In an Airflow Connection (Extra JSON field) or a Variable - never in DAG code. For MWAA, use AWS Secrets Manager with Airflow's Secrets Backend so keys rotate without redeploying DAGs. ### Can AnomalyArmor trigger Airflow DAG runs when data quality fails? Yes, via webhooks. Configure an AnomalyArmor alert with a webhook destination pointing at Airflow's REST API (e.g., `POST /api/v1/dags/{dag_id}/dagRuns`). Most teams use this pattern to trigger remediation DAGs when a critical freshness or schema-drift alert fires.
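
Airflow's `dagRuns` endpoint expects a JSON body of its own, with run parameters under `conf`, so an alert payload may need reshaping before it reaches Airflow. The sketch below shows what a small relay could build. `build_dag_run_request` is a hypothetical helper, the alert dict stands in for whatever the webhook actually delivers (its fields are not documented here), and the auth header depends on how your Airflow API authentication is configured.

```python
import json
import urllib.parse
import urllib.request


def build_dag_run_request(airflow_base_url, dag_id, alert, auth_header=None):
    """Build a POST request that triggers a DAG run with the alert as run config."""
    dag = urllib.parse.quote(dag_id, safe="")
    url = f"{airflow_base_url.rstrip('/')}/api/v1/dags/{dag}/dagRuns"
    # Pass the alert through under a single conf key so the remediation DAG
    # can read it from dag_run.conf. The key name is an arbitrary choice.
    body = json.dumps({"conf": {"anomalyarmor_alert": alert}}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if auth_header:
        headers["Authorization"] = auth_header
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

Send the result with `urllib.request.urlopen(request)` from the relay. Keeping request construction separate from sending makes the URL and body easy to check before pointing the relay at a live scheduler.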
- **[Python SDK](/sdk/overview)**: SDK reference and patterns - **[Freshness API](/api/freshness)**: Freshness endpoint details - **[Lineage API](/api/lineage)**: Explore data dependencies - **[Alerts](/alerts/overview)**: Set up freshness alerts --- # dbt Integration > Add data quality gates to your dbt workflows Integrate AnomalyArmor with dbt to add data quality gates before and after your transformations. This guide covers common patterns for ensuring data quality throughout your dbt workflow. ## Prerequisites - AnomalyArmor account with connected data source - dbt project configured - Python 3.9+ (for SDK) or shell access (for CLI) - API key with `read-only` scope (or `read-write` for triggering checks) ## Installation ```bash pip install anomalyarmor-cli dbt-core ``` Configure your API key: ```bash export ARMOR_API_KEY="aa_live_your_key_here" ``` ## Pattern 1: Pre-run Quality Gate Check data quality before running dbt: ### Using CLI (Shell Script) ```bash #!/bin/bash # pre_dbt_check.sh set -e echo "Running pre-dbt quality checks..." # Check freshness of source tables armor freshness check snowflake.raw.stripe.payments || { echo "Source data is stale. Aborting dbt run." exit 1 } armor freshness check snowflake.raw.crm.customers || { echo "Source data is stale. Aborting dbt run." exit 1 } echo "All source tables are fresh. Starting dbt..." 
dbt run ``` ### Using Python ```python # pre_dbt_check.py from anomalyarmor import Client from anomalyarmor.exceptions import StalenessError import subprocess import sys def check_sources_and_run_dbt(): client = Client() sources = [ "snowflake.raw.stripe.payments", "snowflake.raw.crm.customers", "snowflake.raw.shopify.orders", ] print("Checking source freshness...") for source in sources: try: status = client.freshness.require_fresh(source) print(f" [OK] {source} ({status.hours_since_update:.1f}h old)") except StalenessError as e: print(f" [STALE] {source} ({e.hours_since_update:.1f}h old)") print("\nAborting: Source data is stale.") sys.exit(1) print("\nAll sources fresh. Running dbt...") result = subprocess.run(["dbt", "run"], check=True) return result.returncode if __name__ == "__main__": check_sources_and_run_dbt() ``` ## Pattern 2: Post-run Validation Validate output quality after dbt completes: ```python # post_dbt_validate.py from anomalyarmor import Client def validate_dbt_outputs(): client = Client() asset_id = "your-asset-uuid" print("Validating dbt outputs...") # Check validity rules on transformed tables validity = client.validity.summary(asset_id) if validity.failing > 0: print(f"[WARN] {validity.failing} validity rules failing") # Get details of failing rules rules = client.validity.list(asset_id) for rule in rules: result = client.validity.check(asset_id, rule.uuid) if result.status == "fail": print(f" - {rule.name} on {rule.column_name}") print(f" Invalid: {result.invalid_count} ({result.invalid_percent:.2f}%)") return False else: print(f"[OK] All {validity.total_rules} validity rules passing") # Check referential integrity ref = client.referential.summary(asset_id) if ref.failing_checks > 0: print(f"[WARN] {ref.failing_checks} referential checks failing") return False else: print(f"[OK] All {ref.total_checks} referential checks passing") return True if __name__ == "__main__": if not validate_dbt_outputs(): print("\nPost-dbt validation failed!") 
exit(1) print("\nAll validations passed!") ``` ## Pattern 3: dbt run-operation Hook Create a dbt macro that calls AnomalyArmor: > â„šī¸ The `shell()` function is not a built-in dbt Jinja function. This pattern requires a custom macro or package that provides shell execution capabilities. Consider using the wrapper script approach (Pattern 4) for a simpler, more portable solution. ```sql -- macros/armor_quality_gate.sql {% macro armor_freshness_check(table_name) %} {% set result = run_query("SELECT 1") %} {{ log("Checking freshness for " ~ table_name, info=True) }} {# Call CLI from dbt - requires custom shell() macro #} {% set check_result = shell("armor freshness check " ~ table_name ~ " 2>&1 || echo STALE") %} {% if "STALE" in check_result %} {{ exceptions.raise_compiler_error("Data quality check failed: " ~ table_name ~ " is stale") }} {% endif %} {% endmacro %} ``` Then in your model: ```sql -- models/marts/orders_mart.sql {{ config(pre_hook=armor_freshness_check('snowflake.raw.stripe.payments')) }} SELECT o.order_id, o.customer_id, c.customer_name, o.total_amount FROM {{ ref('stg_orders') }} o LEFT JOIN {{ ref('stg_customers') }} c ON o.customer_id = c.customer_id ``` ## Pattern 4: Full dbt Wrapper Script A wrapper that handles pre-checks, dbt run, and post-validation: ```python #!/usr/bin/env python # dbt_with_quality_gates.py """ Run dbt with AnomalyArmor quality gates. 
Usage: python dbt_with_quality_gates.py run python dbt_with_quality_gates.py run -- --select marts.* Put dbt's own flags after "--" so argparse forwards them to dbt. """ import argparse import subprocess import sys from anomalyarmor import Client from anomalyarmor.exceptions import StalenessError # Configuration ASSET_ID = "your-asset-uuid" SOURCE_TABLES = [ "snowflake.raw.stripe.payments", "snowflake.raw.crm.customers", ] def pre_checks(client: Client) -> bool: """Run pre-dbt quality checks.""" print("\n=== Pre-dbt Quality Checks ===\n") # Check source freshness all_fresh = True for table in SOURCE_TABLES: try: status = client.freshness.require_fresh(table) print(f"[FRESH] {table} ({status.hours_since_update:.1f}h)") except StalenessError as e: print(f"[STALE] {table} ({e.hours_since_update:.1f}h)") all_fresh = False return all_fresh def run_dbt(args: list[str]) -> int: """Execute dbt with provided arguments.""" print("\n=== Running dbt ===\n") cmd = ["dbt"] + args result = subprocess.run(cmd) return result.returncode def post_checks(client: Client) -> bool: """Run post-dbt quality checks.""" print("\n=== Post-dbt Quality Checks ===\n") passed = True # Validity checks validity = client.validity.summary(ASSET_ID) if validity.failing == 0: print(f"[PASS] Validity: {validity.total_rules} rules") else: print(f"[FAIL] Validity: {validity.failing}/{validity.total_rules} failing") passed = False # Referential integrity ref = client.referential.summary(ASSET_ID) if ref.failing_checks == 0: print(f"[PASS] Referential: {ref.total_checks} checks") else: print(f"[FAIL] Referential: {ref.failing_checks}/{ref.total_checks} failing") passed = False # Metrics anomalies metrics = client.metrics.summary(ASSET_ID) if metrics.failing == 0: print(f"[PASS] Metrics: {metrics.active_metrics} monitored") else: print(f"[WARN] Metrics: {metrics.failing} anomalies detected") # Don't fail on metric anomalies, just warn return passed def main(): parser = argparse.ArgumentParser(description="Run dbt with quality gates") parser.add_argument("dbt_command", 
help="dbt command (run, build, test)") parser.add_argument("dbt_args", nargs="*", help="Additional dbt arguments") parser.add_argument("--skip-pre", action="store_true", help="Skip pre-checks") parser.add_argument("--skip-post", action="store_true", help="Skip post-checks") args = parser.parse_args() client = Client() dbt_args = [args.dbt_command] + args.dbt_args # Pre-checks if not args.skip_pre: if not pre_checks(client): print("\nPre-checks failed. Use --skip-pre to bypass.") sys.exit(1) # Run dbt dbt_exit_code = run_dbt(dbt_args) if dbt_exit_code != 0: print(f"\ndbt exited with code {dbt_exit_code}") sys.exit(dbt_exit_code) # Post-checks if not args.skip_post: if not post_checks(client): print("\nPost-checks failed. Review quality issues.") sys.exit(1) print("\n=== All checks passed! ===") if __name__ == "__main__": main() ``` Run it: ```bash # Full run with all checks python dbt_with_quality_gates.py run # Run specific models (dbt flags go after "--" so argparse forwards them to dbt) python dbt_with_quality_gates.py run -- --select marts.orders_mart # Skip pre-checks (for development) python dbt_with_quality_gates.py run --skip-pre ``` ## Pattern 5: dbt Cloud Webhook Integration For dbt Cloud, use webhooks to trigger AnomalyArmor checks: ```python # webhook_handler.py (Flask example) from flask import Flask, request, jsonify from anomalyarmor import Client app = Flask(__name__) @app.route("/dbt-webhook", methods=["POST"]) def handle_dbt_webhook(): payload = request.json event_type = payload.get("eventType") run_status = payload.get("data", {}).get("runStatus") if event_type == "run.completed" and run_status == "Success": # dbt run completed successfully, run post-checks client = Client() asset_id = "your-asset-uuid" validity = client.validity.summary(asset_id) ref = client.referential.summary(asset_id) if validity.failing > 0 or ref.failing_checks > 0: # Alert your team via Slack, PagerDuty, etc. 
send_alert("dbt run completed but quality issues found") return jsonify({"status": "ok"}) ``` ## Pattern 6: Upload Lineage from dbt Upload your dbt `manifest.json` to populate data lineage in AnomalyArmor. This lets you visualize the full DAG, run impact analysis, and check upstream freshness before transformations. ```bash # Generate the manifest dbt parse # Upload to AnomalyArmor curl -X POST \ "https://api.anomalyarmor.ai/api/v1/assets/$ASSET_ID/lineage/upload" \ -H "Authorization: Bearer $ARMOR_API_KEY" \ -F "file=@target/manifest.json" ``` Or add it as a post-run step in your dbt wrapper: ```python from anomalyarmor import Client client = Client() def upload_lineage(asset_id: str, manifest_path: str = "target/manifest.json"): """Upload dbt manifest to sync lineage after a dbt run.""" with open(manifest_path, "rb") as f: result = client.lineage.upload(asset_id=asset_id, file=f) print(f"Lineage synced: {result.sync_stats['nodes_created']} nodes, " f"{result.sync_stats['edges_created']} edges") ``` For dbt Cloud users, sync lineage directly without file uploads: ```python result = client.lineage.sync_dbt_cloud( asset_id=ASSET_ID, account_id="12345", api_token="dbtc_your_token_here", job_id="67890", ) ``` > 💡 See the full [Lineage Upload Guide](/guides/lineage-upload) for detailed setup, CI/CD integration, and manual lineage options. ## Best Practices ### 1. Scope Your Checks Don't check everything. Focus on critical paths: ```python # Check only critical source tables CRITICAL_SOURCES = [ "production.stripe.payments", # Revenue-critical "production.core.users", # Identity-critical ] ``` ### 2. Set Appropriate Thresholds Configure checks with realistic thresholds: ```python # Tolerate older data in non-critical tables client.freshness.require_fresh( "analytics.events", max_age_hours=48, # More lenient for analytics ) ``` ### 3. 
Fail Fast, But Not Always ```python # Critical checks should fail the pipeline if not critical_checks_pass(): sys.exit(1) # Non-critical checks can warn without failing if not advisory_checks_pass(): log_warning("Advisory checks failed, continuing anyway") ``` ### 4. Cache API Calls For large dbt projects, minimize API calls: ```python # Get summary once, not per-model validity_summary = client.validity.summary(asset_id) # Then use validity_summary.failing, validity_summary.passing, etc. ``` ## Troubleshooting ### "Asset not found" errors Ensure your table names match AnomalyArmor's qualified names exactly: ```bash # List your assets to see exact names armor assets list --source snowflake ``` ### Slow checks For faster feedback, use summary endpoints instead of individual checks: ```python # Fast: single API call summary = client.validity.summary(asset_id) print(f"Failing: {summary.failing}") # Slow: N API calls for rule in client.validity.list(asset_id): result = client.validity.check(asset_id, rule.uuid) # Avoid in loops ``` ## Next Steps ## Common Questions ### Do I replace my dbt tests with AnomalyArmor, or run both? Run both; they do different things. dbt tests validate model-level contracts at build time (is this column unique? are these values in this set?). AnomalyArmor monitors continuously after dbt runs, catching freshness violations, schema drift, and statistical anomalies that point-in-time tests miss. See [migrating from dbt tests](/guides/migrate-from-dbt) for the patterns. ### Does AnomalyArmor hook into `dbt run` or does it monitor tables after the fact? Both patterns work. For post-run monitoring, there is nothing to configure: AnomalyArmor watches your warehouse independently. For pre-run gating (fail fast if source data isn't fresh enough), add a Python or shell step before `dbt run` that calls the AnomalyArmor freshness API. See "Pattern 1: Pre-run Quality Gate" above. ### Can AnomalyArmor import my existing dbt tests instead of rewriting them? Yes. 
Use `armor migrate-from dbt path/to/dbt/project` (available via the AnomalyArmor CLI) to generate ODCS contract files from your dbt schema.yml. Tests like `not_null`, `unique`, `accepted_values`, and `relationships` map cleanly; others surface as warnings. See [migrating from dbt](/guides/migrate-from-dbt). ### Does the integration work with dbt Cloud as well as dbt Core? Yes. dbt Cloud emits `manifest.json` on every run; configure AnomalyArmor to consume it via the lineage-upload flow, and model-to-table relationships populate automatically. See [lineage upload](/guides/lineage-upload). - **[GitHub Actions](/integrations/github-actions)**: Run checks in CI/CD - **[Airflow Integration](/integrations/airflow)**: Orchestrate with Airflow - **[Validity API](/api/validity)**: Full validity API reference - **[Metrics API](/api/metrics)**: Track data metrics --- # GitHub Integration > Connect GitHub repositories for code ownership, change tracking, and incident context Connect your GitHub repositories to AnomalyArmor to monitor code ownership, track changes to data pipelines, and get incident context when alerts fire. AnomalyArmor analyzes your repos for CODEOWNERS files, git blame history, and table references so you can route alerts to the right people. ## What AnomalyArmor Does with Your Repos Once connected and monitored, AnomalyArmor scans your repositories for: - **CODEOWNERS files** to map directories and files to team owners - **Git blame data** to identify who last modified SQL models, dbt files, and pipeline code - **Table references** in SQL files to link code to monitored data assets - **Change correlation** to show recent code changes when data quality issues arise This lets AnomalyArmor automatically suggest alert owners, show who to contact when a table breaks, and provide code-level context for schema changes and freshness incidents. ## GitHub App vs OAuth: Which to Choose AnomalyArmor supports two connection methods. 
For most teams, the GitHub App is the better choice. | | GitHub App | OAuth App | |---|---|---| | **Setup** | One-click install from GitHub Marketplace | Personal OAuth flow | | **Permissions** | Scoped to selected repos only | Access to all repos you can see | | **Token lifecycle** | Auto-refreshing, no expiry | Permanent until revoked | | **Webhooks** | Automatic push event webhooks | Manual per-repo webhook setup | | **Best for** | Teams, production use | Quick personal testing | | **Identity** | Org-level installation | Individual user account | > 💡 **Use the GitHub App** if you're setting this up for a team. It provides granular repo-level permissions and automatic webhook delivery for real-time sync. ## Setup: GitHub App (Recommended) ## Setup: OAuth App > â„šī¸ OAuth tokens are tied to your personal GitHub account. If you leave the organization, the connection breaks. For team setups, use the GitHub App instead. ## Monitoring Repos The Monitor toggle controls whether AnomalyArmor actively analyzes a repository: - **Toggle ON**: Creates a repository record and triggers initial analysis (CODEOWNERS parsing, git blame, table reference scanning). The Status column shows progress (Pending, Running, Synced, or Failed). - **Toggle OFF**: Pauses monitoring. All existing analysis data is preserved. You can re-enable at any time. ### Linking Assets to Repos For the richest analysis, link your monitored repos to data assets (databases). Click the link icon next to a monitored repo's status to open the asset linking dialog. This lets AnomalyArmor correlate code changes in the repo with specific tables in your database. ## Webhooks ### GitHub App Webhooks are configured automatically when you install the app. AnomalyArmor receives push events in real time and re-analyzes monitored repos when relevant files change (CODEOWNERS, SQL files, dbt models). ### OAuth App OAuth connections do not include automatic webhooks. 
AnomalyArmor relies on periodic re-sync to pick up changes. You can manually trigger a re-sync from **Settings > Git Integration** by clicking **Re-sync**. ## Common Questions ### Should I use the GitHub App or OAuth to connect? Use the GitHub App for team or organization installs: it delivers webhooks automatically so AnomalyArmor re-syncs on push, gives you fine-grained per-repo access control, and uses a refreshing token that never expires. Use OAuth only for personal-account trials or temporary single-user connections, or when your organization blocks third-party GitHub Apps. You can switch later without losing monitored repo data. ### Does AnomalyArmor read my source code? AnomalyArmor analyzes CODEOWNERS, commit history, and references to your monitored tables (typically in SQL files and dbt models). It does not read or store full source code. The repo connection is for ownership and change context, not for code execution. ### Does the GitHub integration support private repositories? Yes. Both GitHub App and OAuth connection methods support private repos. For GitHub App, grant access to the specific repos during installation. For OAuth, your authorized scopes must include private repo read access. ### Why does my integration show 'No repositories found' after connecting? For GitHub App: confirm you granted access to at least one repo during install (check the installed GitHub Apps list in your org settings). For OAuth: private repos in orgs with SAML SSO require explicit SSO authorization on the token. ### How do I disconnect or change which repositories AnomalyArmor can see? For GitHub App, open your GitHub organization settings, find AnomalyArmor under Installed GitHub Apps, and edit the repo access list. Then click **Re-sync** in AnomalyArmor to refresh. For OAuth, click **Disconnect** in the Git Integration settings and reconnect. Previously monitored repositories and their analysis data are preserved across disconnect/reconnect. 
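
To make the CODEOWNERS mapping described under "What AnomalyArmor Does with Your Repos" more concrete, here is a minimal Python sketch of how a CODEOWNERS file resolves a path to its owners. It is not AnomalyArmor's implementation: the matcher is a deliberate simplification of GitHub's gitignore-style rules, and the team handles and file paths are hypothetical. The one rule it preserves is that the last matching pattern wins.

```python
from fnmatch import fnmatch

# Hypothetical CODEOWNERS content; team handles are illustrative only.
SAMPLE_CODEOWNERS = """\
# Default owner for everything
*             @acme/data-platform
*.sql         @acme/warehouse
/dbt/models/  @acme/analytics-eng
"""


def parse_codeowners(text):
    """Return (pattern, owners) pairs in file order, skipping comments and blanks."""
    rules = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        pattern, *owners = line.split()
        rules.append((pattern, owners))
    return rules


def matches(path, pattern):
    """Simplified matching: directory prefixes and fnmatch-style globs only."""
    if pattern.endswith("/"):
        # Directory rule: owns everything beneath that directory.
        return path.startswith(pattern.lstrip("/"))
    return fnmatch(path, pattern.lstrip("/"))


def owners_for(path, rules):
    """Walk every rule in order and keep the last one that matches."""
    owners = []
    for pattern, rule_owners in rules:
        if matches(path, pattern):
            owners = rule_owners
    return owners


rules = parse_codeowners(SAMPLE_CODEOWNERS)
print(owners_for("dbt/models/marts/orders_mart.sql", rules))
print(owners_for("pipelines/load_orders.py", rules))
```

Because later rules override earlier ones, a broad default such as `*` belongs at the top of the file and the most specific directories at the bottom. That ordering is what determines which team an alert on a given dbt model would be routed to.
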
## Troubleshooting ### "No repositories found" after connecting - **GitHub App**: Check that you granted access to at least one repository during installation. Go to your GitHub organization settings, find the AnomalyArmor app under **Installed GitHub Apps**, and verify the repository access list. - **OAuth**: Verify that your GitHub account has access to the repositories you expect. Private repos in organizations may require SSO authorization. ### Connection shows "Pending Setup" The OAuth flow was started but not completed. Click **Complete Setup** to finish the authorization. ### Permissions seem wrong after changing repo access If you change which repos the GitHub App can access (in GitHub's settings), click **Re-sync** in AnomalyArmor to refresh the repository list. ### Need to start over Click **Disconnect** in the Git Integration settings, then reconnect using either method. Your previously monitored repositories and their analysis data are preserved. ## Next Steps - **[GitHub Actions](/integrations/github-actions)**: Run quality checks in CI/CD - **[dbt Integration](/integrations/dbt)**: Connect dbt project metadata - **[Alert Destinations](/alerts/overview)**: Route alerts to the right team - **[Data Sources](/data-sources/overview)**: Connect your data warehouse --- # GitHub Actions Integration > Run data quality checks in your CI/CD pipeline with GitHub Actions Integrate AnomalyArmor into your GitHub Actions workflows to automatically run data quality checks on pull requests, scheduled jobs, or deployments. ## Prerequisites - AnomalyArmor account with connected data source - GitHub repository - API key stored as a GitHub secret ## Setup ### 1. Create API Key Generate an API key in **Settings > API Keys** with `read-only` scope (or `read-write` if you need to trigger checks). ### 2. 
Add Secret to GitHub Go to your repository's **Settings > Secrets and variables > Actions** and add: - Name: `ARMOR_API_KEY` - Value: `aa_live_your_key_here` ## Basic Workflow Add this workflow file to `.github/workflows/data-quality.yml`: ```yaml name: Data Quality Checks on: push: branches: [main] pull_request: branches: [main] schedule: # Run daily at 6 AM UTC - cron: '0 6 * * *' workflow_dispatch: # Manual trigger jobs: quality-checks: runs-on: ubuntu-latest steps: - name: Checkout code uses: actions/checkout@v4 - name: Set up Python uses: actions/setup-python@v5 with: python-version: '3.11' - name: Install AnomalyArmor CLI run: pip install anomalyarmor-cli - name: Check data freshness env: ARMOR_API_KEY: ${{ secrets.ARMOR_API_KEY }} run: | armor freshness check snowflake.prod.warehouse.orders armor freshness check snowflake.prod.warehouse.customers - name: Run validity checks env: ARMOR_API_KEY: ${{ secrets.ARMOR_API_KEY }} run: | armor validity summary ${{ vars.ASSET_ID }} ``` ## Workflow Patterns ### Pattern 1: Pre-deployment Gate Block deployments if data quality checks fail: ```yaml name: Deploy with Quality Gate on: push: branches: [main] jobs: quality-gate: runs-on: ubuntu-latest steps: - uses: actions/setup-python@v5 with: python-version: '3.11' - name: Install CLI run: pip install anomalyarmor-cli - name: Quality Gate env: ARMOR_API_KEY: ${{ secrets.ARMOR_API_KEY }} run: | # Check all critical tables armor freshness check snowflake.prod.warehouse.orders armor freshness check snowflake.prod.warehouse.customers armor freshness check snowflake.prod.warehouse.products echo "All quality checks passed!" 
deploy: needs: quality-gate runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 - name: Deploy application run: ./deploy.sh ``` ### Pattern 2: Scheduled Quality Report Generate a daily quality report: ```yaml name: Daily Quality Report on: schedule: - cron: '0 8 * * *' # 8 AM UTC daily workflow_dispatch: jobs: quality-report: runs-on: ubuntu-latest steps: - uses: actions/setup-python@v5 with: python-version: '3.11' - name: Install dependencies run: pip install anomalyarmor-cli - name: Generate Quality Report env: ARMOR_API_KEY: ${{ secrets.ARMOR_API_KEY }} run: | echo "# Daily Data Quality Report" >> $GITHUB_STEP_SUMMARY echo "" >> $GITHUB_STEP_SUMMARY echo "## Freshness Summary" >> $GITHUB_STEP_SUMMARY armor freshness summary >> $GITHUB_STEP_SUMMARY echo "" >> $GITHUB_STEP_SUMMARY echo "## Validity Summary" >> $GITHUB_STEP_SUMMARY armor validity summary ${{ vars.ASSET_ID }} >> $GITHUB_STEP_SUMMARY echo "" >> $GITHUB_STEP_SUMMARY echo "## Referential Summary" >> $GITHUB_STEP_SUMMARY armor referential summary ${{ vars.ASSET_ID }} >> $GITHUB_STEP_SUMMARY ``` ### Pattern 3: PR Comment with Quality Status Post quality status as a PR comment: ```yaml name: PR Quality Check on: pull_request: branches: [main] jobs: quality-check: runs-on: ubuntu-latest permissions: pull-requests: write steps: - uses: actions/setup-python@v5 with: python-version: '3.11' - name: Install CLI run: pip install anomalyarmor-cli - name: Run Quality Checks id: quality env: ARMOR_API_KEY: ${{ secrets.ARMOR_API_KEY }} run: | # Capture results FRESHNESS=$(armor freshness summary 2>&1) || true VALIDITY=$(armor validity summary ${{ vars.ASSET_ID }} 2>&1) || true # Build comment (multiline output uses the NAME<<DELIMITER syntax) echo "COMMENT<<EOF" >> $GITHUB_OUTPUT echo "## Data Quality Status" >> $GITHUB_OUTPUT echo "" >> $GITHUB_OUTPUT echo "### Freshness" >> $GITHUB_OUTPUT echo '```' >> $GITHUB_OUTPUT echo "$FRESHNESS" >> $GITHUB_OUTPUT echo '```' >> $GITHUB_OUTPUT echo "" >> $GITHUB_OUTPUT echo "### Validity" >> $GITHUB_OUTPUT echo '```' >> 
$GITHUB_OUTPUT echo "$VALIDITY" >> $GITHUB_OUTPUT echo '```' >> $GITHUB_OUTPUT echo "EOF" >> $GITHUB_OUTPUT - name: Post PR Comment uses: actions/github-script@v7 with: script: | github.rest.issues.createComment({ issue_number: context.issue.number, owner: context.repo.owner, repo: context.repo.repo, body: `${{ steps.quality.outputs.COMMENT }}` }) ``` ### Pattern 4: dbt + Quality Checks Combine dbt runs with quality validation: ```yaml name: dbt with Quality Gates on: push: branches: [main] paths: - 'dbt/**' workflow_dispatch: jobs: dbt-run: runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 - uses: actions/setup-python@v5 with: python-version: '3.11' - name: Install dependencies run: | pip install dbt-snowflake anomalyarmor-cli - name: Pre-flight Quality Check env: ARMOR_API_KEY: ${{ secrets.ARMOR_API_KEY }} run: | echo "Checking source data freshness..." armor freshness check snowflake.raw.stripe.payments armor freshness check snowflake.raw.crm.customers - name: Run dbt working-directory: ./dbt env: DBT_PROFILES_DIR: ${{ github.workspace }}/dbt run: | dbt deps dbt run - name: Post-run Validation env: ARMOR_API_KEY: ${{ secrets.ARMOR_API_KEY }} run: | echo "Validating dbt outputs..." armor validity summary ${{ vars.ASSET_ID }} armor referential summary ${{ vars.ASSET_ID }} ``` ### Pattern 5: Python Script for Complex Logic For complex quality gates, use a Python script: ```yaml name: Advanced Quality Gate on: push: branches: [main] jobs: quality-gate: runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 - uses: actions/setup-python@v5 with: python-version: '3.11' - name: Install dependencies run: pip install anomalyarmor-cli - name: Run Quality Gate env: ARMOR_API_KEY: ${{ secrets.ARMOR_API_KEY }} ASSET_ID: ${{ vars.ASSET_ID }} run: python scripts/quality_gate.py ``` Create `scripts/quality_gate.py`: ```python #!/usr/bin/env python """ Advanced quality gate for GitHub Actions. Exits with code 0 if all checks pass, 1 otherwise. 
""" import os import sys from anomalyarmor import Client from anomalyarmor.exceptions import StalenessError def main(): client = Client() asset_id = os.environ["ASSET_ID"] passed = True print("=" * 50) print("DATA QUALITY GATE") print("=" * 50) # 1. Freshness checks print("\n## Freshness Checks") critical_tables = [ "snowflake.prod.warehouse.orders", "snowflake.prod.warehouse.customers", ] for table in critical_tables: try: status = client.freshness.require_fresh(table) print(f"[PASS] {table} ({status.hours_since_update:.1f}h old)") except StalenessError as e: print(f"[FAIL] {table} ({e.hours_since_update:.1f}h stale)") passed = False # 2. Validity checks print("\n## Validity Checks") validity = client.validity.summary(asset_id) if validity.failing == 0: print(f"[PASS] {validity.total_rules} rules, all passing") else: print(f"[FAIL] {validity.failing}/{validity.total_rules} rules failing") # List failing rules for rule in client.validity.list(asset_id): result = client.validity.check(asset_id, rule.uuid) if result.status == "fail": print(f" - {rule.name}: {result.invalid_count} invalid") passed = False # 3. Referential integrity print("\n## Referential Integrity") ref = client.referential.summary(asset_id) if ref.failing_checks == 0: print(f"[PASS] {ref.total_checks} checks, all passing") else: print(f"[FAIL] {ref.failing_checks}/{ref.total_checks} failing") passed = False # 4. Summary print("\n" + "=" * 50) if passed: print("RESULT: ALL CHECKS PASSED") print("=" * 50) sys.exit(0) else: print("RESULT: QUALITY GATE FAILED") print("=" * 50) sys.exit(1) if __name__ == "__main__": main() ``` ## Environment Variables Reference these in your workflows: | Variable | Required | Description | |----------|----------|-------------| | `ARMOR_API_KEY` | Yes | Your AnomalyArmor API key (store as secret) | | `ASSET_ID` | For some commands | Asset UUID (store as variable) | ### Using GitHub Variables Store non-sensitive config as repository variables: 1. 
Go to **Settings > Secrets and variables > Actions** 2. Click **Variables** tab 3. Add variables like `ASSET_ID`, `CRITICAL_TABLES`, etc. Reference in workflows: ```yaml env: ASSET_ID: ${{ vars.ASSET_ID }} ``` ## Best Practices ### 1. Use Secrets for API Keys Never hardcode API keys: ```yaml # Good env: ARMOR_API_KEY: ${{ secrets.ARMOR_API_KEY }} # Bad - Never do this! env: ARMOR_API_KEY: aa_live_xxxxx ``` ### 2. Fail Fast Put quality checks early in workflows: ```yaml jobs: quality-gate: # Runs first runs-on: ubuntu-latest ... build: needs: quality-gate # Only runs if quality passes ... deploy: needs: build ... ``` ### 3. Cache Dependencies Speed up workflows by caching: ```yaml - uses: actions/setup-python@v5 with: python-version: '3.11' cache: 'pip' - name: Install dependencies run: pip install anomalyarmor-cli ``` ### 4. Use Job Summaries Write results to `$GITHUB_STEP_SUMMARY` for visibility: ```yaml - name: Quality Summary run: | echo "## Quality Gate Results" >> $GITHUB_STEP_SUMMARY armor freshness summary >> $GITHUB_STEP_SUMMARY ``` ### 5. Set Timeouts Prevent hanging jobs: ```yaml jobs: quality-checks: runs-on: ubuntu-latest timeout-minutes: 10 ``` ## Troubleshooting ### "Command not found: armor" Ensure you've installed the CLI before using it: ```yaml - name: Install CLI run: pip install anomalyarmor-cli - name: Use CLI run: armor --version # Should work now ``` ### "Authentication failed" Check that: 1. `ARMOR_API_KEY` secret is set correctly 2. The secret name matches your workflow reference 3. API key hasn't been revoked ```yaml - name: Debug auth env: ARMOR_API_KEY: ${{ secrets.ARMOR_API_KEY }} run: | armor auth status ``` ### Workflow not triggering Check your `on:` triggers and branch patterns match your setup. ## Next Steps ## Common Questions ### How do I store my AnomalyArmor API key in GitHub Actions? 
Add it as a repository or organization secret: **Settings → Secrets and variables → Actions → New repository secret**, name it `ARMOR_API_KEY`, and paste the key value. Reference it in your workflow as `${{ secrets.ARMOR_API_KEY }}`. For org-wide pipelines, use an org-level secret scoped to specific repos. ### Can I fail a pull request on a data quality check failure? Yes. Add an AnomalyArmor freshness or validity check as a required status check on the branch protection rule for `main`. The workflow exits non-zero when a check fails, GitHub marks the PR as failing, and merge is blocked until the check passes or is overridden. ### Can I run AnomalyArmor checks on a schedule from GitHub Actions? Yes. Use a `workflow_dispatch` trigger plus a `schedule:` cron entry. But most customers skip this - AnomalyArmor's own scheduled monitoring and native alert destinations handle recurring checks without GitHub Actions runner time. Use GitHub Actions specifically for PR-time gates and deploy-time validation. ### Is there an official AnomalyArmor GitHub Action? Not yet as a Marketplace action - a `run:` step that installs the CLI (`pip install anomalyarmor-cli`) or calls REST directly via `curl` is the current pattern. A dedicated action is on the roadmap. See the examples in this page for the canonical shape. - **[Airflow Integration](/integrations/airflow)**: Orchestrate with Airflow - **[dbt Integration](/integrations/dbt)**: Add gates to dbt workflows - **[CLI Reference](/cli/reference)**: Full CLI documentation - **[API Reference](/api/overview)**: Full API documentation --- # MCP Server > Connect AI assistants to AnomalyArmor with Model Context Protocol The AnomalyArmor MCP server lets AI assistants like Claude Code and Cursor interact with your data observability platform through natural language. 
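
At the protocol level, an MCP client reaches a server tool by sending a JSON-RPC 2.0 `tools/call` request. The Python sketch below only builds that payload so you can see its shape; it does not contact the server or handle sign-in. The tool name `health_summary` is taken from the tool list later on this page, while the empty `arguments` object and the request `id` are assumptions made for illustration.

```python
import json


def build_tool_call(tool_name, arguments=None, request_id=1):
    """Build the JSON-RPC 2.0 envelope an MCP client sends to invoke a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": tool_name,
            # Assumption: health_summary has no required arguments.
            "arguments": arguments or {},
        },
    }


request = build_tool_call("health_summary")
print(json.dumps(request, indent=2))
```

In normal use you never write this by hand: Claude Code, Cursor, or another MCP client constructs and sends the request when the assistant decides a tool is needed. Seeing the envelope is mainly useful when debugging a custom client.
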
## Connection Methods | Method | Auth | Best For | |--------|------|----------| | **Remote (Recommended)** | Sign in with your AnomalyArmor account | Zero setup, always up to date | | **Local** | API key | Air-gapped environments, custom setups | ## Remote Server (Recommended) Connect directly to AnomalyArmor's hosted MCP server. No installation required. You sign in with your existing AnomalyArmor account through OAuth. ### Claude Code ```bash claude mcp add anomalyarmor --transport http https://mcp.anomalyarmor.ai/mcp ``` Claude Code will open a browser window to sign in the first time you use a tool. ### Cursor Add to your Cursor MCP configuration: **macOS**: `~/Library/Application Support/Cursor/mcp.json` **Linux**: `~/.config/Cursor/mcp.json` **Windows**: `%APPDATA%\Cursor\mcp.json` ```json { "mcpServers": { "anomalyarmor": { "url": "https://mcp.anomalyarmor.ai/mcp" } } } ``` Restart Cursor after saving. You will be prompted to sign in when you first use a tool. ### Other MCP Clients Any MCP client that supports the **streamable-http** transport can connect: ``` URL: https://mcp.anomalyarmor.ai/mcp Transport: streamable-http Auth: OAuth 2.1 (automatic) ``` The server publishes standard OAuth 2.1 discovery metadata at `https://mcp.anomalyarmor.ai/.well-known/oauth-protected-resource`. ## Local Server (API Key) For environments where you need a local MCP server (air-gapped networks, custom configurations), install and run the server locally with an API key. ### Install ```bash pip install armor-mcp ``` > â„šī¸ If the package is not yet on PyPI, install from source: ```bash pip install git+https://github.com/anomalyarmor/agents.git#subdirectory=armor-mcp ``` ### Configure Add to your AI tool's MCP settings: ```json { "mcpServers": { "anomalyarmor": { "command": "armor-mcp", "env": { "ANOMALYARMOR_API_KEY": "your-api-key" } } } } ``` ### Getting Your API Key 1. Go to **Settings > API Keys** in the AnomalyArmor dashboard 2. Click **Create API Key** 3. 
Name it (e.g., "Claude Code MCP")
4. Copy the key (it won't be shown again)

## Available Tools

Once connected, your AI assistant has access to 40+ tools across these categories:

### Health and Briefings

| Tool | Description |
|------|-------------|
| `health_summary` | Overall health status across all assets |
| `get_todays_briefing` | Daily digest with key insights and alerts |
| `get_coverage_summary` | Monitoring coverage analysis |

### Alerts

| Tool | Description |
|------|-------------|
| `list_alerts` | Query alerts with date, severity, and status filters |
| `get_alert_summary` | Aggregate alert counts and trends |
| `list_incidents` | List incidents with status, severity, and date filters |
| `get_incident` | Get full incident details including activity timeline |
| `acknowledge_alert` | Mark an incident as seen |
| `resolve_alert` | Close an incident with resolution metadata |
| `dismiss_alert` | Dismiss a non-actionable incident |
| `snooze_alert` | Snooze an incident for a set duration |
| `preview_alerts` | Dry-run alert rules against historical data |

### Assets

| Tool | Description |
|------|-------------|
| `list_assets` | List data sources with filters |
| `get_asset` | Get details for a specific asset |
| `explore` | Navigate schemas, tables, and columns within an asset. Supports hierarchical browsing (`explore(asset_id)` for schemas, `explore(asset_id, parent_path="schema")` for tables) and full-text search (`explore(asset_id, search="keyword")`) |
| `create_asset` | Connect a new data source |
| `test_asset_connection` | Test connectivity to a source |
| `trigger_asset_discovery` | Start asset discovery (runs as background job) |

### Freshness

| Tool | Description |
|------|-------------|
| `get_freshness_summary` | Freshness overview. Supports optional `asset_id` to filter to a single source |
| `check_freshness` | Check a specific table's freshness |
| `list_stale_assets` | Find currently stale tables |
| `list_freshness_schedules` | View monitoring schedules |
| `create_freshness_schedule` | Set up freshness monitoring. Use `setup_freshness` for bulk creation with `table_paths` and `schema_name` params |
| `delete_freshness_schedule` | Remove a schedule |
| `dry_run_freshness` | Preview what alerts a threshold would produce |

### Schema Monitoring

| Tool | Description |
|------|-------------|
| `get_schema_summary` | Schema drift overview. Supports optional `asset_id` to filter to a single source |
| `list_schema_changes` | Recent schema changes with filters |
| `create_schema_baseline` | Capture current schema as baseline |
| `enable_schema_monitoring` | Start drift detection for one or more assets. Accepts `asset_id` (single) or `asset_ids` (bulk), interval strings (`'5m'`, `'1h'`, `'6h'`, `'1d'`) or `cron_expression` for custom schedules |
| `disable_schema_monitoring` | Stop drift detection |
| `dry_run_schema` | Preview drift detection results |

### Data Quality

| Tool | Description |
|------|-------------|
| `list_metrics` | Quality metrics for an asset |
| `get_metrics_summary` | Metrics health overview |
| `create_metric` | Add a quality metric (row count, null rate, etc.) |
| `delete_metric` | Remove a metric |
| `capture_metric` | Trigger immediate metric capture |
| `dry_run_metric` | Preview metric alerting behavior |

### Validity Rules

| Tool | Description |
|------|-------------|
| `list_validity_rules` | Validity rules for an asset |
| `get_validity_summary` | Passing/failing rule overview |
| `create_validity_rule` | Add a rule (NOT_NULL, UNIQUE, etc.) |
| `delete_validity_rule` | Remove a rule |
| `check_validity_rule` | Run a check immediately |

### Intelligence, Lineage, and Jobs

| Tool | Description |
|------|-------------|
| `ask_question` | Natural language Q&A about your data |
| `generate_intelligence` | Trigger AI analysis for an asset |
| `get_lineage` | Explore upstream/downstream dependencies |
| `job_status` | Track progress of async operations |

### Tags and Classification

| Tool | Description |
|------|-------------|
| `list_tags` | Tags for an asset |
| `create_tag` | Tag a table or column |
| `apply_tags` | Apply multiple tags to multiple objects |
| `bulk_apply_tag` | Apply a tag across multiple assets |

## Usage Examples

These examples show full terminal sessions using Claude Code with the AnomalyArmor MCP server. Each one demonstrates a real workflow from start to finish.

### Morning Triage: Check Health and Act on Issues

```bash
~/warehouse $ claude

╭────────────────────────────────────────────────────────────────────╮
│ Claude Code                                                        │
╰────────────────────────────────────────────────────────────────────╯

You: What's the health of my data warehouse? Show me anything that needs attention.

Claude: I'll check your overall health and look for issues.

  ┌ health_summary
  │ Querying AnomalyArmor health status...
  └ Result: 47 assets monitored

  ┌ list_stale_assets
  │ Checking for freshness violations...
  └ Result: 2 stale tables found

  ┌ list_alerts
  │ Fetching unresolved alerts (last 24h)...
  └ Result: 5 alerts

  Here's your morning summary:

  OVERALL: NEEDS ATTENTION (2 issues)

  Freshness:
    ✅ 45 tables fresh
    ❌ gold.fact_orders - 6h overdue (SLA: 2h)
    ❌ staging.orders_raw - 8h overdue (SLA: 1h)

  Alerts (last 24h):
    🔴 1 critical - freshness SLA breach on gold.fact_orders
    🟡 2 warnings - schema drift on gold.fact_orders, null rate spike on gold.dim_customer.email
    🔵 2 info - new schema baselines captured

  Schema drift: 3 changes detected this week
  Data quality: 1 anomaly (null rate)

  The critical issue is the orders pipeline. staging.orders_raw stopped updating 8 hours ago and gold.fact_orders is stale as a result. Want me to dig into the root cause?

You: Yes, investigate the orders staleness and check if anything downstream is affected.

Claude: Let me trace the lineage and check downstream impact.

  ┌ check_freshness
  │ gold.fact_orders → STALE (6h overdue)
  └ Last update: yesterday 10:12 PM

  ┌ get_lineage
  │ Fetching upstream/downstream for gold.fact_orders...
  └ 1 upstream, 7 downstream

  ┌ check_freshness
  │ staging.orders_raw → STALE (8h overdue)
  └ Last update: yesterday 8:30 PM

  ┌ list_schema_changes
  │ staging.orders_raw - changes in last 7 days
  └ 1 change: discount_pct removed Tuesday 3:47 PM

  Here's the full picture:

  Pipeline trace:
    raw.events (✅ FRESH, updated 5m ago)
      → staging.orders_raw (❌ STALE, 8h overdue)
        → gold.fact_orders (❌ STALE, 6h overdue)

  The pipeline breaks at staging.orders_raw. The upstream raw.events table is fresh, so the issue is in the ETL between raw → staging.

  Downstream impact (7 tables):
    ❌ gold.fact_payments - depends on fact_orders
    ❌ gold.fact_refunds - depends on fact_orders
    ❌ analytics.order_summary - derived view
    ❌ analytics.order_funnel - derived view
    ❌ reporting.daily_revenue - aggregated report
    ❌ reporting.weekly_kpis - aggregated report
    ❌ reporting.customer_ltv - joins with dim_customer

  Additional context: staging.orders_raw also had a schema change this week (discount_pct column removed Tuesday). This may be related to the staleness if the ETL is failing on the missing column.

  Recommended next steps:
    1. Check the Airflow/dbt DAG for the staging.orders_raw job
    2. Look for errors related to the discount_pct column removal
    3. Once fixed, the 7 downstream tables should recover automatically

You: Acknowledge the freshness alert and resolve the schema drift alert with a note that the column removal was intentional.

Claude: I'll handle both alerts.

  ┌ acknowledge_alert
  │ Alert #4521 (freshness SLA breach) → acknowledged
  └ Done

  ┌ resolve_alert
  │ Alert #4518 (schema drift) → resolved
  │ Resolution: "Column removal was intentional per product change"
  └ Done

  Done. The freshness alert is acknowledged (you'll still see it until the pipeline recovers), and the schema drift alert is resolved with your note.
```

### Setting Up Monitoring for a New Schema

```bash
~/warehouse $ claude

You: I just added a new "analytics" schema with 8 tables. Set up full monitoring: freshness, schema drift, and null checks on any ID or email columns.

Claude: Let me discover what's in the analytics schema first.

  ┌ search_tables
  │ Searching for tables in analytics schema...
  └ Found 8 tables

  ┌ get_table_columns
  │ Reading columns for all 8 tables...
  └ 67 columns total

  Found 8 tables in analytics with 67 columns. Here's what I'll set up:

  Freshness monitoring (8 tables):
    analytics.order_summary - hourly, 2h SLA
    analytics.order_funnel - hourly, 2h SLA
    analytics.customer_cohorts - daily, 26h SLA
    analytics.product_performance - daily, 26h SLA
    analytics.revenue_by_region - daily, 26h SLA
    analytics.churn_analysis - daily, 26h SLA
    analytics.funnel_conversion - hourly, 2h SLA
    analytics.session_metrics - hourly, 2h SLA

  Schema drift monitoring:
    Enabled for all 8 tables

  Null percentage metrics (12 columns with ID or email patterns):
    analytics.order_summary.order_id
    analytics.order_summary.customer_id
    analytics.order_funnel.user_id
    analytics.order_funnel.session_id
    analytics.customer_cohorts.customer_id
    analytics.customer_cohorts.email
    analytics.product_performance.product_id
    analytics.churn_analysis.customer_id
    analytics.churn_analysis.email
    analytics.funnel_conversion.user_id
    analytics.session_metrics.session_id
    analytics.session_metrics.user_id

  I chose hourly checks for tables that look transactional and daily for tables that look like aggregations. Want me to proceed?

You: Yes, do it all.

Claude: Setting everything up now.

  ┌ create_freshness_schedule (×8)
  │ Creating schedules for all analytics tables...
  └ 8 schedules created

  ┌ enable_schema_monitoring (×8)
  │ Enabling drift detection...
  └ 8 baselines captured

  ┌ create_metric (×12)
  │ Creating null_percentage metrics...
  └ 12 metrics created

  All done:
    ✅ 8 freshness schedules (4 hourly, 4 daily)
    ✅ 8 schema drift monitors with baselines
    ✅ 12 null_percentage metrics on ID/email columns

  You'll start receiving alerts if any table misses its SLA, any schema changes are detected, or any null rates spike outside normal ranges. The first data points will appear after the next discovery run.
```

### Investigating a Data Quality Issue

```bash
~/warehouse $ claude

You: The finance team says revenue numbers look wrong today. Something about missing discount amounts. Can you investigate?

Claude: Let me check the revenue-related tables for anomalies.

  ┌ search_tables
  │ Searching for tables matching "revenue", "order", "discount"...
  └ Found 5 relevant tables

  ┌ get_metrics_summary
  │ Checking data quality metrics on gold.fact_orders...
  └ 3 metrics tracked

  ┌ list_schema_changes
  │ gold.fact_orders - recent changes
  └ 1 change this week

  ┌ check_freshness
  │ gold.fact_orders → FRESH (updated 12m ago)
  └ No freshness issues

  Found the issue. Here's what happened:

  Root cause: The discount_pct column was removed from gold.fact_orders on Tuesday at 4:12 PM.

  Evidence:
    - Schema drift detected the removal Tuesday 4:12 PM
    - The column had a null_percentage metric that stopped reporting (no column to measure)
    - Freshness is fine - the table is still updating
    - The column was also removed from upstream staging.orders_raw at 3:47 PM

  Impact on revenue:
    - analytics.order_summary uses SUM(discount_pct) to calculate discount totals - this query is now failing
    - reporting.daily_revenue references discount_pct in its net_revenue calculation

  The table has data, but the discount column is gone. That's why revenue looks wrong - discount amounts aren't being subtracted.

You: Was this intentional? Can you check if there's a related alert that someone already handled?

Claude: Let me check the alert history.

  ┌ list_alerts
  │ Filtering for schema drift alerts on gold.fact_orders...
  └ 1 alert found

  ┌ get_alert_details
  │ Alert #4518...
  └ Status: RESOLVED

  There was a schema drift alert (#4518) that fired Tuesday at 4:12 PM for the discount_pct column removal. It was resolved today at 9:15 AM with the note: "Column removal was intentional per product change"

  So the column removal was intentional, but the downstream views haven't been updated yet. The finance team's revenue calculations are referencing a column that no longer exists.

  Recommended:
    1. Update analytics.order_summary to remove the discount_pct reference
    2. Update reporting.daily_revenue net_revenue calculation
    3. Consider adding a new discount column if discounts are still tracked differently

You: Tag the affected downstream tables so the data team knows.

Claude: I'll tag them.

  ┌ apply_tags
  │ Applying "needs-migration" tag...
  └ Applied to 2 tables

  Tagged 2 tables with "needs-migration":
    - analytics.order_summary
    - reporting.daily_revenue

  The data team can find these by asking "what tables are tagged needs-migration?" in any AnomalyArmor agent.
```

## Troubleshooting

### Authentication Failed (Remote)

```
Authentication failed
```

**Solution**:

1. Clear your MCP auth cache and re-authenticate
2. In Claude Code: run `/mcp` and select the reconnect option
3. Verify you can sign in to [app.anomalyarmor.ai](https://app.anomalyarmor.ai)

### Authentication Failed (Local)

```
Error: Invalid API key
```

**Solution**: Verify your API key is correct in your MCP config. Generate a new key from **Settings > API Keys** if needed.

### Server Not Found

```
Error: MCP server 'anomalyarmor' not found
```

**Solution**:

1. Restart your AI tool after config changes
2. For local mode, verify armor-mcp is installed: `armor-mcp --version`
3. Check MCP config syntax (valid JSON)

### Empty Responses

If tools return empty results after successful authentication:

**Solution**:

1. Verify you have data sources connected in AnomalyArmor
2. Try `health_summary` as a basic connectivity test
3. For local mode, ensure your API key has the correct scope

### Rate Limiting

```
Error: Rate limit exceeded
```

**Solution**: Wait a few seconds and retry. The server handles rate limits automatically.

## Inline UI rendering (MCP Apps)

Hosts that support the [draft MCP Apps spec](/integrations/mcp-apps) (Claude Desktop nightly, Cursor experimental) render interactive charts inline when you call `check_freshness`, `get_freshness_summary`, `list_schema_changes`, `get_lineage`, or `health_summary` - instead of a JSON blob, you see a timeline, diff table, lineage graph, or stat dashboard. Hosts without MCP Apps support fall back to JSON automatically; the tools work everywhere. See [MCP Apps (inline UI)](/integrations/mcp-apps) for host-support details and examples.

## Next Steps

- **[MCP Apps (inline UI)](/integrations/mcp-apps)**: Interactive charts rendered inline in supported hosts
- **[Claude Code Setup](/integrations/claude-code)**: Detailed Claude Code configuration
- **[Cursor Setup](/integrations/cursor)**: Cursor IDE integration
- **[Web Agent](/ai-agents/web-agent)**: Use the same tools in your browser
- **[AI Skills](/ai-agents/overview)**: Browse available skills

## Common Questions

### What's the MCP server URL and auth model?

Hosted at `https://mcp.anomalyarmor.ai/mcp`. Authenticate with a Bearer API key (`Authorization: Bearer aa_live_...`). Keys are scoped `read-only` (demo/trial) or `read-write` (paid) - any write tool call returns 403 on a read-only key with a handoff URL the agent can surface.

### Which MCP hosts does AnomalyArmor support?

Tested with Claude Desktop, Claude Code CLI, Cursor, and Windsurf. Any host that speaks the Model Context Protocol over HTTPS/SSE should work. See [Claude Code](/integrations/claude-code) and [Cursor](/integrations/cursor) for host-specific setup.

### Do MCP tool calls count against my API rate limit?

Yes. MCP traffic uses the same API rate limit as direct REST calls (per-key). Agents tend to be bursty, so enable retry-with-backoff in your host config if you hit 429s. Enterprise plans raise the limit significantly.

### Does the MCP server let agents modify my data or infrastructure?

No. All AnomalyArmor tools operate on metadata and monitoring state (assets, metrics, alerts, tags). Agents cannot write to your connected databases, rotate credentials, or change billing. Mutating tools on AnomalyArmor itself (creating a metric, dismissing an alert) require a `read-write` scoped key and always ask for confirmation in the host UI before executing.
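
The rate-limit answer above recommends retry-with-backoff when you hit 429s. If you are writing a custom client rather than relying on a host's built-in retry setting, the idea can be sketched as below. This is a minimal illustration, not AnomalyArmor code: `RateLimited`, `backoff_delays`, and `call_with_backoff` are hypothetical names, and the delay values are assumptions rather than documented limits.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical stand-in for an HTTP 429 response from the API."""


def backoff_delays(max_attempts, base=1.0, cap=30.0):
    """Return the capped exponential delays (seconds) between attempts."""
    return [min(cap, base * 2 ** i) for i in range(max_attempts - 1)]


def call_with_backoff(call, max_attempts=5, base=1.0, cap=30.0, sleep=time.sleep):
    """Run call(); on RateLimited, wait with jittered backoff and retry."""
    delays = backoff_delays(max_attempts, base, cap)
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Full jitter: sleep a random amount up to the capped delay,
            # so bursty agents do not all retry at the same instant.
            sleep(random.uniform(0, delays[attempt]))
```

In a real client, `call` would wrap the request and raise `RateLimited` when the response status is 429. The `sleep` parameter is injectable so the retry logic can be exercised without actually waiting.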
---

# MCP Apps (inline UI)

> Interactive chart rendering for AnomalyArmor MCP tools inside Claude Desktop and Cursor

AnomalyArmor's hosted MCP server (`mcp.anomalyarmor.ai/mcp`) attaches an inline HTML view to five flagship tools - `check_freshness`, `get_freshness_summary`, `list_schema_changes`, `get_lineage`, and `health_summary`. Hosts that implement the draft [MCP Apps](https://modelcontextprotocol.io/) spec render the view directly in the chat; hosts without MCP Apps support keep seeing the existing JSON response. Nothing breaks either way.

The images below are **illustrative static snapshots** of what the live renders look like. The actual tool response is interactive and generated by the MCP server at render time.

## Supported hosts

| Host | Status | Notes |
|------|--------|-------|
| **Claude Desktop nightly** | Full | Requires Developer Mode → MCP Apps feature flag |
| **Cursor experimental** | Best-effort | Behind an experimental feature flag; rendering fidelity varies |
| **Everything else** (Claude Code CLI, Cursor stable, web agents, custom clients) | Falls back to JSON | The tools still work; you see the raw JSON payload |

No configuration is required on the AnomalyArmor side - if your host supports MCP Apps, the inline UI appears automatically the next time you call one of the five flagship tools.

## Freshness timeline

Tool: `check_freshness` / `get_freshness_summary`. Returns per-table or aggregate freshness with status-colored bars and stat cards.

## Schema diff table

Tool: `list_schema_changes`. Color-codes each row by change type (green = added, red = dropped, yellow = type-changed) with a severity pill and detection timestamp.

## Lineage graph

Tool: `get_lineage`. Three-column layout of upstream sources, the queried asset, and downstream consumers. Edge count is surfaced in the caption.

## Health dashboard

Tool: `health_summary`. Stat-card grid with freshness %, active alert count, and unacknowledged schema drift, followed by a "Needs attention" list.

## Security and privacy

- The embedded HTML runs in the host's resource sandbox with a strict Content Security Policy: `default-src 'none'`, no `connect-src` (so the template cannot make network calls), and no inline secrets.
- All data is pre-filtered by the MCP server before it's injected into the template - Bearer tokens, `aa_live_*` API keys, and email addresses cannot reach the render layer. A CI conformance sweep guards against regressions.
- Vega-Lite is pinned to `5.17.0` and loaded from `cdn.jsdelivr.net`. If the CDN is unreachable at render time, hosts show the JSON payload alongside a `