
# Upload Lineage Data

> Import data lineage from dbt manifests, dbt Cloud, or manual definitions

AnomalyArmor can visualize how your data flows from source to destination, but it needs lineage data to work with. This guide covers three ways to get lineage into AnomalyArmor:

1. **Upload a dbt manifest.json** file (most common)
2. **Sync from dbt Cloud** automatically
3. **Define lineage manually** via the API

## Option 1: Upload a dbt manifest.json

If you use dbt, the fastest way to populate lineage is uploading your `manifest.json` file. This file contains your full DAG, including all models, sources, seeds, and their dependencies.

### Generate the manifest

Run one of these dbt commands to produce `target/manifest.json`:

```bash theme={null}
# Any of these generates a manifest.json in target/
dbt parse      # Fastest, parses without compiling
dbt compile    # Compiles SQL, also generates manifest
dbt run        # Full run, also generates manifest
```

### Upload via the API

<CodeGroup>
  ```bash cURL theme={null}
  curl -X POST \
    "https://api.anomalyarmor.ai/api/v1/assets/{asset_id}/lineage/upload" \
    -H "Authorization: Bearer aa_live_xxx" \
    -F "file=@target/manifest.json" \
    -F "sync_to_catalog=true"
  ```

  ```python Python SDK theme={null}
  from anomalyarmor import Client

  client = Client()

  with open("target/manifest.json", "rb") as f:
      result = client.lineage.upload(asset_id="your-asset-uuid", file=f)

  print(f"Nodes created: {result.sync_stats['nodes_created']}")
  print(f"Edges created: {result.sync_stats['edges_created']}")
  ```

  ```bash CLI theme={null}
  armor lineage upload --asset your-asset-uuid target/manifest.json
  ```
</CodeGroup>

### What gets imported

AnomalyArmor parses the `nodes` and `parent_map` from your manifest to extract:

* **Models** (transformations in your dbt project)
* **Sources** (raw tables dbt reads from)
* **Seeds** (CSV files loaded by dbt)
* **Parent-child relationships** between all of the above

The `sync_to_catalog` parameter (default: `true`) also triggers an asset discovery job so your dbt models appear as assets in the catalog.
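To make the extraction concrete, here is a minimal sketch of how parent-child edges can be read out of a manifest. It assumes the standard dbt manifest layout, in which `parent_map` maps each node's unique ID to a list of its parents' unique IDs. The function name `extract_edges` and the sample data are hypothetical, and this is an illustration rather than AnomalyArmor's actual parser.

```python
def extract_edges(manifest: dict) -> list:
    """Return sorted (parent_unique_id, child_unique_id) pairs from a manifest dict."""
    edges = []
    # parent_map: child unique_id -> list of parent unique_ids
    for child, parents in manifest.get("parent_map", {}).items():
        for parent in parents:
            edges.append((parent, child))
    return sorted(edges)


# A tiny hand-written manifest fragment, for illustration only.
sample = {
    "nodes": {"model.analytics.dim_customers": {"resource_type": "model"}},
    "parent_map": {
        "model.analytics.dim_customers": ["source.crm.customers"],
        "source.crm.customers": [],
    },
}

print(extract_edges(sample))
```

Running the same function over your own `target/manifest.json` (loaded with `json.load`) gives a rough preview of the edges an upload should produce.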

### Response

```json theme={null}
{
  "data": {
    "asset_id": "550e8400-e29b-41d4-a716-446655440000",
    "sync_stats": {
      "nodes_created": 42,
      "nodes_updated": 8,
      "edges_created": 67,
      "edges_updated": 3
    },
    "manifest_metadata": {
      "generated_at": "2025-03-15T10:30:00Z",
      "dbt_version": "1.7.4",
      "project_name": "my_analytics"
    },
    "catalog_sync_job_id": "job-uuid-here"
  }
}
```
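If you want to log the outcome of an upload, a small helper can condense the response into one line. The sketch below assumes the response body has exactly the shape shown above; `summarize_upload` is a hypothetical helper, not part of the SDK.

```python
import json


def summarize_upload(response_text: str) -> str:
    """Build a one-line summary from an upload response body."""
    data = json.loads(response_text)["data"]
    stats = data["sync_stats"]
    meta = data["manifest_metadata"]
    # Count both created and updated items as "synced".
    nodes = stats["nodes_created"] + stats["nodes_updated"]
    edges = stats["edges_created"] + stats["edges_updated"]
    return (
        f"{meta['project_name']} (dbt {meta['dbt_version']}): "
        f"{nodes} nodes and {edges} edges synced"
    )


# Same values as the example response above.
example = """
{
  "data": {
    "asset_id": "550e8400-e29b-41d4-a716-446655440000",
    "sync_stats": {
      "nodes_created": 42, "nodes_updated": 8,
      "edges_created": 67, "edges_updated": 3
    },
    "manifest_metadata": {
      "generated_at": "2025-03-15T10:30:00Z",
      "dbt_version": "1.7.4",
      "project_name": "my_analytics"
    },
    "catalog_sync_job_id": "job-uuid-here"
  }
}
"""

print(summarize_upload(example))
```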

### Automate uploads in CI/CD

Add a manifest upload step after your dbt run completes:

```yaml theme={null}
# .github/workflows/dbt.yml
jobs:
  dbt-run:
    steps:
      - name: Run dbt
        run: dbt run

      - name: Upload lineage to AnomalyArmor
        run: |
          # --fail makes the step exit non-zero on HTTP errors (4xx/5xx)
          curl --fail -X POST \
            "https://api.anomalyarmor.ai/api/v1/assets/$ASSET_ID/lineage/upload" \
            -H "Authorization: Bearer $ARMOR_API_KEY" \
            -F "file=@target/manifest.json"
        env:
          ARMOR_API_KEY: ${{ secrets.ARMOR_API_KEY }}
          ASSET_ID: ${{ vars.ARMOR_ASSET_ID }}
```

## Option 2: Sync from dbt Cloud

If you use dbt Cloud, AnomalyArmor can fetch the manifest directly from your dbt Cloud account. No file upload needed.

<CodeGroup>
  ```bash cURL theme={null}
  curl -X POST \
    "https://api.anomalyarmor.ai/api/v1/assets/{asset_id}/lineage/dbt-cloud/sync" \
    -H "Authorization: Bearer aa_live_xxx" \
    -H "Content-Type: application/json" \
    -d '{
      "account_id": "12345",
      "api_token": "dbtc_your_token_here",
      "job_id": "67890"
    }'
  ```

  ```python Python SDK theme={null}
  from anomalyarmor import Client

  client = Client()

  result = client.lineage.sync_dbt_cloud(
      asset_id="your-asset-uuid",
      account_id="12345",
      api_token="dbtc_your_token_here",
      job_id="67890",
  )
  ```
</CodeGroup>

### Finding your dbt Cloud credentials

| Parameter    | Where to find it                                           |
| ------------ | ---------------------------------------------------------- |
| `account_id` | dbt Cloud URL: `cloud.getdbt.com/deploy/12345/...` (the number after `/deploy/`) |
| `api_token`  | dbt Cloud > Account Settings > API Access > Service tokens |
| `job_id`     | dbt Cloud > Jobs > select your job > ID in the URL         |

<Note>
  Use a **service token** with at least the "Read artifacts" permission. Personal tokens work but are tied to individual users.
</Note>
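If you copy the account ID out of a browser URL often, the lookup can be scripted. This is a minimal sketch that assumes the URL pattern shown in the table (a numeric segment directly after `/deploy/`); `account_id_from_url` is a hypothetical helper.

```python
import re
from typing import Optional


def account_id_from_url(url: str) -> Optional[str]:
    """Return the numeric segment that follows /deploy/ in a dbt Cloud URL."""
    match = re.search(r"/deploy/(\d+)(?:/|$)", url)
    return match.group(1) if match else None


print(account_id_from_url("https://cloud.getdbt.com/deploy/12345/"))
```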

## Option 3: Define lineage manually

For data sources that are not managed by dbt, you can define lineage nodes and edges directly via the API.

### Create a lineage node

```bash theme={null}
curl -X POST \
  "https://api.anomalyarmor.ai/api/v1/assets/{asset_id}/lineage/nodes" \
  -H "Authorization: Bearer aa_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "unique_id": "source.crm.customers",
    "name": "customers",
    "resource_type": "source",
    "schema": "crm",
    "database": "production"
  }'
```

### Create a lineage edge

```bash theme={null}
curl -X POST \
  "https://api.anomalyarmor.ai/api/v1/assets/{asset_id}/lineage/edges" \
  -H "Authorization: Bearer aa_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "parent_unique_id": "source.crm.customers",
    "child_unique_id": "model.analytics.dim_customers",
    "relationship_type": "derives_from"
  }'
```
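The same two calls can be made from Python without the SDK. The sketch below uses only the standard library and mirrors the cURL examples above; `build_lineage_request` is a hypothetical helper, and the request is built but not sent so you can inspect it first.

```python
import json
import urllib.request

API_BASE = "https://api.anomalyarmor.ai/api/v1"


def build_lineage_request(asset_id, kind, payload, api_key):
    """Build (but do not send) a POST request for a lineage node or edge."""
    if kind not in ("nodes", "edges"):
        raise ValueError("kind must be 'nodes' or 'edges'")
    return urllib.request.Request(
        url=f"{API_BASE}/assets/{asset_id}/lineage/{kind}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Same edge as the cURL example above.
edge = {
    "parent_unique_id": "source.crm.customers",
    "child_unique_id": "model.analytics.dim_customers",
    "relationship_type": "derives_from",
}
request = build_lineage_request("your-asset-uuid", "edges", edge, "aa_live_xxx")
print(request.get_method(), request.full_url)
```

To send the request, pass it to `urllib.request.urlopen(request)`. Both nodes referenced by an edge presumably need to exist before the edge is created, so create nodes first.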

## Updating lineage

When your dbt project changes, re-upload the manifest. AnomalyArmor reconciles the new manifest with the existing graph:

* New nodes and edges are created
* Existing nodes are updated with new metadata
* Relationships that no longer exist are removed
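The reconciliation described in this list can be pictured as a set difference over edges. The following is a conceptual sketch only, not AnomalyArmor's implementation; `diff_edges` and the sample edges are hypothetical.

```python
def diff_edges(existing: set, incoming: set) -> dict:
    """Compare two sets of (parent, child) edges."""
    return {
        "created": sorted(incoming - existing),    # only in the new manifest
        "removed": sorted(existing - incoming),    # no longer in the manifest
        "unchanged": sorted(existing & incoming),  # present in both
    }


# Before: the source feeds dim_customers directly.
existing = {("source.crm.customers", "model.analytics.dim_customers")}
# After: a staging model was inserted between them.
incoming = {
    ("source.crm.customers", "model.analytics.stg_customers"),
    ("model.analytics.stg_customers", "model.analytics.dim_customers"),
}

result = diff_edges(existing, incoming)
print(result["removed"])
```

Here the direct edge from the source to `dim_customers` lands in `removed`, and the two edges through the staging model land in `created`.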

For a clean reset, first delete all lineage that came from a given source (here, `dbt`):

```bash theme={null}
curl -X DELETE \
  "https://api.anomalyarmor.ai/api/v1/assets/{asset_id}/lineage/source/dbt" \
  -H "Authorization: Bearer aa_live_xxx"
```

## Limits

| Constraint             | Value                 |
| ---------------------- | --------------------- |
| Max manifest file size | 50 MB                 |
| File format            | JSON (UTF-8 encoded)  |
| Required manifest keys | `nodes`, `parent_map` |
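
A local preflight check can catch these limits before an upload fails. This is a minimal sketch under stated assumptions: it treats "50 MB" as 50 × 1024 × 1024 bytes (the exact threshold is not specified here), and `check_manifest` is a hypothetical helper rather than part of the SDK or CLI.

```python
import json
from pathlib import Path

MAX_BYTES = 50 * 1024 * 1024  # assumption: "50 MB" interpreted as 50 MiB
REQUIRED_KEYS = ("nodes", "parent_map")


def check_manifest(path: str) -> list:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    raw = Path(path).read_bytes()
    if len(raw) > MAX_BYTES:
        problems.append(f"file is {len(raw)} bytes, over the 50 MB limit")
    try:
        manifest = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        return problems + [f"not valid UTF-8 JSON: {exc}"]
    if not isinstance(manifest, dict):
        return problems + ["top-level JSON value is not an object"]
    for key in REQUIRED_KEYS:
        if key not in manifest:
            problems.append(f"missing required key: {key}")
    return problems
```

For example, `check_manifest("target/manifest.json")` returns `[]` for a valid manifest, and a `run_results.json` uploaded by mistake would be reported as missing both required keys.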

## Common Questions

### Where do I find my dbt `manifest.json` file?

It lives at `target/manifest.json` after you run `dbt parse`, `dbt compile`, or `dbt run`. `dbt parse` is the fastest option because it does not compile SQL or hit your warehouse. See [Generate the manifest](#generate-the-manifest).

### Do I need dbt to upload lineage?

No. Option 3 lets you [define lineage manually](#option-3-define-lineage-manually) via the API by creating nodes and edges directly. This is how you model lineage for sources outside dbt, like Fivetran pipelines, Airflow tasks, or bespoke ETL jobs.

### How do I sync lineage from dbt Cloud instead of uploading a file?

Call `POST /api/v1/assets/{asset_id}/lineage/dbt-cloud/sync` with your dbt Cloud `account_id`, `api_token`, and `job_id`. AnomalyArmor pulls the latest manifest from the run artifacts directly. Prefer a **service token** with the "Read artifacts" permission; a personal token also works but is tied to an individual user. See [Option 2](#option-2-sync-from-dbt-cloud).

### How often should I re-upload the manifest?

Re-upload any time your dbt project changes, typically as a CI step after `dbt run` in your deploy pipeline. AnomalyArmor diffs the new manifest against the existing graph: new nodes and edges are created, existing nodes are updated with new metadata, and relationships that no longer exist are removed.

### What is the max manifest size AnomalyArmor accepts?

**50 MB**, UTF-8 encoded JSON, with `nodes` and `parent_map` keys required. Most dbt projects fit well under this; if you're hitting it, check whether you're accidentally uploading a `run_results.json` instead of `manifest.json`.

## Next steps

<CardGroup cols={2}>
  <Card title="Query Lineage" icon="diagram-project" href="/api/lineage">
    Explore upstream and downstream dependencies
  </Card>

  <Card title="Impact Analysis" icon="burst" href="/api/lineage#use-case-impact-analysis">
    Check downstream impact before schema changes
  </Card>

  <Card title="dbt Integration" icon="gears" href="/integrations/dbt">
    Add quality gates to your dbt workflows
  </Card>

  <Card title="AI Agent: Lineage" icon="brain" href="/ai-agents/skills/lineage">
    Ask natural language questions about data flow
  </Card>
</CardGroup>
