Build a dam for your data

Stop AI agents from leaking your enterprise data.

DataDam is the governed control plane between your AI agents and your enterprise data. Every request is authenticated, policy-evaluated, masked where required, and written to an immutable audit log.

// no credit card · self-hosted in your environment · up in 5 minutes

LIVE AUDIT FEED

18:35:40  finance-agent-15  postgres/transactions  card ****-****-7129   mask      pci · sensitive
18:35:46  finance-agent-15  postgres/transactions  amount, currency, ts  allow     trust 884
18:35:52  finance-agent-02  s3://earnings-q4       denied                block     finra · mnpi
18:35:58  finance-agent-08  workday/Employee       employee_id tok_***   tokenize  soc2 · sensitive
18:36:04  finance-agent-15  mysql/billing          tax_id ***-**-9034    mask      pii.detect
18:36:11  claude-prod-7     mcp:github/repo        readme, src/**        allow     trust 905
18:36:18  cursor-readonly   mongo/customers        email c***@acme.com   mask      gdpr · pii
18:36:25  marketing-bot     postgres/users         ssn                   block     role · deny
18:36:31  gpt-4o-coder      mcp:slack/channel      members, last_msg     allow     trust 870
18:36:37  etl-bot-prod      s3://customers         email tok_a8f3...     tokenize  reversible · owner-only
18:36:44  gemini-1.5        snowflake/revenue      q4_total $***K        mask      pii · financial
18:36:50  intern-agent      epic/Patient           mrn                   block     hipaa · phi
18:36:57  claude-prod-7     mcp:linear/issue       title, status, owner  allow     trust 887
18:37:03  support-bot-04    mysql/tickets          body [redacted]       mask      pii.detect

The agent era broke governance

Your DLP wasn't built for this.

Agents don't read like analysts. They read fast, broad, recursive. Every API key is a fan-out vector. Every chat turn is a new lineage edge. By the time legal asks "what did the model see?" the answer is in a thousand places, and not one of them is the audit log.

01

API keys leak silently.

An IDE assistant with read-all access becomes a credential that touches every column on every prompt. The breach is silent. The audit trail is whatever the model decided to log.

"We had no idea our IDE assistant was reading customer SSNs until a developer paste-bombed a finding into Slack."

02

Contracts live in Notion.

The rule that says "marketing can't read PHI" is a paragraph in a Confluence page. The role that grants it is a Snowflake grant tweaked six months ago. Drift is the default. Detection is the surprise.

"Our data contracts are real. They are also nowhere a query plan can see them."

03

Audit logs are LLM-shaped.

Most "AI governance" is a wrapper that tells the model to behave. That's not governance. That's an honor system in production. Auditors don't accept "we asked nicely."

"Compliance asked for an immutable log. The vendor sent a Datadog dashboard."

04

The proxy is the only choke point that holds.

If the rule isn't enforced before the bytes leave the database, the rule isn't enforced. DataDam makes that line a real boundary, not a policy document.

"We finally have a place where 'who can read what' is a function, not a feeling."

How it works

A proxy in your environment. Three things change. Nothing else does.

Drop the container next to your databases. Point your agent's connection string at it. The agent thinks it's talking to Postgres. Your governance team thinks it's the best Tuesday of the quarter.

Step 01

Connect

The agent points its connection string at the proxy. No SDK. No wrapper. Postgres on the wire, MySQL on the wire, Mongo, S3, MCP. Same shape, same drivers.

$ export DATABASE_URL="postgres://datadam:5432"

Step 02

Govern

Every query is parsed, classified, masked, and policy-checked before the connector talks to the database. Contracts, trust scores, anomaly thresholds, kill switches: enforced, not advisory.

enforced · deterministic
no LLM in the governance loop
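
A minimal sketch of what a deterministic, per-field policy check could look like. The `POLICY` table, the role and classification names, and the `decide` and `evaluate` helpers are hypothetical and are not DataDam's actual API. The point is only that the decision is a plain lookup with a default-deny fallback, so the same request always gets the same answer.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    MASK = "mask"
    BLOCK = "block"


# Hypothetical policy table: (role, classification) -> action.
# A real deployment would load this from versioned contracts, not hard-code it.
POLICY = {
    ("engineer", "pii.email"): Action.MASK,
    ("engineer", "pii.ssn"): Action.MASK,
    ("engineer", "public"): Action.ALLOW,
    ("marketing", "pii.email"): Action.MASK,
    ("marketing", "public"): Action.ALLOW,
}


def decide(role: str, classification: str) -> Action:
    """Return the action for one field; anything not listed is blocked."""
    return POLICY.get((role, classification), Action.BLOCK)


def evaluate(role: str, fields: dict[str, str]) -> dict[str, Action]:
    """Evaluate every requested field (name -> classification) for a role."""
    return {name: decide(role, cls) for name, cls in fields.items()}


decisions = evaluate("marketing", {"email": "pii.email", "ssn": "pii.ssn"})
print({name: action.value for name, action in decisions.items()})
# {'email': 'mask', 'ssn': 'block'}
```

Because no pair is listed for marketing and SSNs, that field falls through to `BLOCK`, which is the default-deny behavior the sketch assumes.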

Step 03

Audit

OpenLineage events for every read, denial, and mask. Append-only, tamper-evident, exportable to Splunk, Datadog, Elasticsearch, or your SIEM. The audit answer ships with the audit question.

append-only · openlineage
5 sink destinations supported
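
As a rough illustration of an OpenLineage-shaped audit record, the sketch below builds one run event as a plain dictionary. The top-level keys follow the general OpenLineage RunEvent layout; the spec version in `schemaURL`, the placeholder producer URI, the `audit_event` helper, and the `datadam_decision` facet are all assumptions made for this example, not DataDam's documented event schema.

```python
import json
import uuid
from datetime import datetime, timezone

# Placeholder producer URI (assumption); a real proxy would identify itself here.
PRODUCER = "https://example.invalid/datadam-proxy"


def audit_event(agent: str, source: str, table: str, action: str, fields: list[str]) -> dict:
    """Build one OpenLineage-style run event for a governed read."""
    return {
        "eventType": "COMPLETE",
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "producer": PRODUCER,
        # Spec version shown here is illustrative only.
        "schemaURL": "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent",
        "run": {"runId": str(uuid.uuid4())},
        "job": {"namespace": "agents", "name": agent},
        "inputs": [
            {
                "namespace": source,
                "name": table,
                "facets": {
                    # Hypothetical custom facet carrying the governance decision.
                    "datadam_decision": {
                        "_producer": PRODUCER,
                        "_schemaURL": "https://example.invalid/facets/decision.json",
                        "action": action,
                        "fields": fields,
                    }
                },
            }
        ],
        "outputs": [],
    }


event = audit_event("finance-agent-15", "postgres", "transactions", "mask", ["card"])
print(json.dumps(event, indent=2))
```

The record carries the decision and the field names, never the row values, which matches the claim that audit sinks receive attribution rather than data.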

// see it in motion · live demo

Any AI agent · via governed MCP

Request
tools/call postgres__query
SELECT email, ssn, phone
FROM users LIMIT 3

Result
email  a***@acme.com
ssn    ***-**-****
phone  +1-***-***-1234
3 rows · 3 fields masked

DataDam Proxy v0.2.0 · in your env

1  Contract users-pg v3
   3 fields → policy match
2  Role engineer
   read access ✓
3  Mask 3 fields
   ssn → REDACT · email → generalize
4  Audit emitted
   openlineage · append-only

postgres / users · trust 920

Raw rows · before governance
email  alice@acme.com
ssn    489-12-1234
phone  +1-415-555-1234
never leaves your environment

Step 1. Agent issues an MCP request to DataDam.
Step 2. Request travels to the proxy. Encrypted, scoped, no PII yet.
Step 3. Proxy resolves the contract and validates the agent's role.
Step 4. Proxy issues the underlying SQL to your database.
Step 5. Database returns raw rows, with real PII.
Step 6. Raw rows flow back to the proxy. They never leave your environment.
Step 7. Proxy applies field masks. SSN redacted. Email generalized. Audit emitted.
Step 8. Sanitized result arrives at the agent. Exactly the data it's allowed to see.

One row, two views

What the operator sees. What the agent sees.

Same SQL. Same row in your database. The proxy strips, masks, or generalizes by role before the response reaches the agent. Operators get activity attribution; agents get governed values.

Operator view

/audit?agent=bi-runner

Activity + classification + mask attribution

agent_id        bi-runner
role            viewer
source          customers
table           customers
column          email
classification  pii.email
mask_mode       generalize_email
count_24h       47
detections_24h  47 (all generalized)
denied_24h      0
trust_score     820 (allowed)

The console shows who accessed what, how often, with which mask applied. The operator never sees a row value either; they see the classification and the count.

Agent view

POST /customers/query

Governed values, never raw PII

[
  {
    "id": "7f3a-...",
    "name": "Alice Liddell",
    "email": "*@wonderland.test",
    "ssn": "***"
  },
  {
    "id": "9c1b-...",
    "name": "Bob Cratchit",
    "email": "*@buildit.test",
    "ssn": "***"
  }
]

The agent's tool call returns governed values. Names pass (not classified PII in this contract); emails generalize to the domain; SSN redacts to placeholder. The mask runs inside the proxy on the row in memory, before bytes leave the proxy host.

Same data. Two views. Operator sees activity, classification, and mask attribution. Agent sees the governed row. Neither sees the raw values directly. The mask map is deterministic and authored by the operator in your data contract.
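
The two mask modes described above can be sketched as a deterministic mask map. The function names, the `MASK_MAP` structure, and the sample row are hypothetical; this is only an illustration of a per-column mapping applied to a row in memory, not DataDam's contract format.

```python
def generalize_email(value: str) -> str:
    """Keep only the domain, e.g. 'alice@wonderland.test' -> '*@wonderland.test'."""
    _local, sep, domain = value.rpartition("@")
    if not sep or not domain:
        return "***"  # not a recognizable address, so redact it fully
    return f"*@{domain}"


def redact(value: str) -> str:
    """Replace the value with a fixed placeholder."""
    return "***"


# Hypothetical mask map: column name -> mask function.
MASK_MAP = {"email": generalize_email, "ssn": redact}


def apply_masks(row: dict, mask_map: dict = MASK_MAP) -> dict:
    """Return a governed copy of the row; unmapped columns pass through."""
    return {
        column: mask_map[column](value) if column in mask_map else value
        for column, value in row.items()
    }


# Sample values are made up for this example.
raw = {
    "id": "row-1",
    "name": "Alice Liddell",
    "email": "alice@wonderland.test",
    "ssn": "000-00-0000",
}
print(apply_masks(raw))
# {'id': 'row-1', 'name': 'Alice Liddell', 'email': '*@wonderland.test', 'ssn': '***'}
```

The functions are pure, so the same input always produces the same governed output, and the raw row is left untouched.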

Tag-driven governance. No LLM in the loop.

What it does

Six layers between the agent and the database.

Each layer is deterministic. Each layer is auditable. None of them ask the model to behave.

PII masking, both ways.

Runtime PII detection flags entities on un-contracted columns. ODCS contracts mask declared columns. Tokenize mode lets you reverse the mask via an audited admin call.

  • 200+ entity types
  • Per-org confidence threshold
  • Reversible tokens, owner-only

Data contracts (ODCS v3).

The Open Data Contract Standard, enforced at the proxy. Drift detection, freshness SLAs, schema stability. Versioned, immutable once active, deprecation paths.

  • Draft → active → deprecated → retired
  • Active versions are immutable
  • Per-version diff in the console
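
The lifecycle above reads like a small state machine. The sketch below is one way to model it, assuming edits are allowed only while a version is still a draft; the `ContractVersion` class and its methods are hypothetical names for illustration.

```python
from dataclasses import dataclass

# Forward-only lifecycle: draft -> active -> deprecated -> retired.
ALLOWED = {
    "draft": {"active"},
    "active": {"deprecated"},
    "deprecated": {"retired"},
    "retired": set(),
}


@dataclass
class ContractVersion:
    name: str
    version: int
    schema: dict
    state: str = "draft"

    def transition(self, new_state: str) -> None:
        """Move to the next lifecycle state; reject skips and reversals."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"cannot go from {self.state} to {new_state}")
        self.state = new_state

    def update_schema(self, schema: dict) -> None:
        """Allow edits only before activation; later changes need a new version."""
        if self.state != "draft":
            raise PermissionError(f"version {self.version} is {self.state} and immutable")
        self.schema = schema


contract = ContractVersion("users-pg", 3, {"email": "pii.email"})
contract.update_schema({"email": "pii.email", "ssn": "pii.ssn"})  # fine while draft
contract.transition("active")
print(contract.state)  # active
```

Once the version is active, a schema change means publishing a new version number instead of mutating the existing one.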

Append-only audit log.

Every read, every mask, every denial. OpenLineage-shaped, exportable to Splunk, Datadog, Elasticsearch, or your SIEM. DB-level UPDATE/DELETE prevention. Immutable means immutable.

  • OpenLineage facets
  • 5 export sink destinations
  • 30 to 3,650 day retention
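
One common way to make an append-only log tamper-evident is a hash chain, where each entry commits to the one before it. The page does not say DataDam uses this technique (it mentions DB-level UPDATE/DELETE prevention), so treat the sketch below as a generic illustration; `append`, `verify`, and the entry layout are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # stand-in "previous hash" for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with a canonical event encoding."""
    payload = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(log: list[dict], event: dict) -> dict:
    """Append an event, chaining it to the hash of the last entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)}
    log.append(entry)
    return entry


def verify(log: list[dict]) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


log: list[dict] = []
append(log, {"agent": "marketing-bot", "column": "ssn", "action": "block"})
append(log, {"agent": "claude-prod-7", "column": "readme", "action": "allow"})
print(verify(log))  # True

log[0]["event"]["action"] = "allow"  # simulate tampering with a stored entry
print(verify(log))  # False
```

A chain like this detects edits after the fact; preventing them in the first place still relies on storage-level controls such as the UPDATE/DELETE prevention mentioned above.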

Trust scoring.

Five components, 200 points each. Contract validity, freshness, schema stability, availability, ownership. Block below threshold, warn above. The agent learns the source is stale before the model does.

  • Per-source composite, 0 to 1,000
  • Org default + per-source override
  • Recomputed every 5 minutes
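
The arithmetic is simple enough to sketch: five components of up to 200 points each sum to a composite between 0 and 1,000, and a per-source override takes precedence over the org default when choosing the block threshold. The component keys, the sample points, and the threshold values below are made up, and the warn behavior is left out.

```python
from typing import Optional

COMPONENTS = (
    "contract_validity",
    "freshness",
    "schema_stability",
    "availability",
    "ownership",
)
MAX_PER_COMPONENT = 200


def trust_score(points: dict[str, int]) -> int:
    """Sum the five components into a composite score from 0 to 1,000."""
    total = 0
    for name in COMPONENTS:
        value = points[name]
        if not 0 <= value <= MAX_PER_COMPONENT:
            raise ValueError(f"{name} must be between 0 and {MAX_PER_COMPONENT}")
        total += value
    return total


def decision(score: int, org_default: int, source_override: Optional[int] = None) -> str:
    """Block below the effective threshold; the override wins when present."""
    threshold = source_override if source_override is not None else org_default
    return "block" if score < threshold else "allow"


score = trust_score({
    "contract_validity": 200,
    "freshness": 180,
    "schema_stability": 190,
    "availability": 170,
    "ownership": 180,
})
print(score)                                      # 920
print(decision(score, org_default=700))           # allow
print(decision(score, 700, source_override=950))  # block
```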

Anomaly + threat correlation.

Three statistical detectors over 14-day rolling baselines: volume z-score, novelty, time-of-day. Multi-step threats correlate across windows. No LLM in the detection path.

  • k-anonymity floor of 5
  • 14-day rolling baselines
  • Triage suppression: 24h cooldown
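
Of the three detectors, the volume z-score is the easiest to show. The sketch below compares today's request count with a 14-day baseline using the sample standard deviation; the cutoff of 3.0, the function names, and the sample counts are assumptions, not DataDam's documented defaults.

```python
from statistics import mean, stdev


def volume_zscore(baseline: list[float], today: float) -> float:
    """Return how many standard deviations today's count sits from the baseline mean."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any deviation at all is maximally unusual.
        return 0.0 if today == mu else float("inf")
    return (today - mu) / sigma


def is_anomalous(baseline: list[float], today: float, cutoff: float = 3.0) -> bool:
    """Flag counts whose absolute z-score exceeds the cutoff."""
    return abs(volume_zscore(baseline, today)) > cutoff


# Fourteen daily request counts for one agent (made-up numbers).
last_14_days = [40, 45, 50, 47, 43, 46, 44, 48, 42, 45, 49, 41, 46, 44]

print(is_anomalous(last_14_days, 47))   # False
print(is_anomalous(last_14_days, 400))  # True
```

Nothing here needs a model: the baseline is a list of counts and the test is a single division, which is what keeps this kind of detection deterministic.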

Kill switch + governance intelligence.

One click stops an agent, source, or org. Recommendations surface idle agents, uncontracted sources, high-denial roles. Cohort intelligence (k=5) flags drift against industry peers.

  • Org / source / agent scope
  • Auto-suggestions, never auto-applied
  • Peer benchmarks, never your data
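
The three kill-switch scopes can be pictured as three sets checked on every request. The `KillSwitch` class, its method names, and the org and source identifiers below are hypothetical; the sketch only shows that stopping at any one scope is enough to halt a request.

```python
from dataclasses import dataclass, field

SCOPES = ("org", "source", "agent")


@dataclass
class KillSwitch:
    # One set of stopped identifiers per scope.
    stopped: dict[str, set[str]] = field(
        default_factory=lambda: {scope: set() for scope in SCOPES}
    )

    def stop(self, scope: str, target: str) -> None:
        """Stop everything matching the target at the given scope."""
        if scope not in SCOPES:
            raise ValueError(f"unknown scope: {scope}")
        self.stopped[scope].add(target)

    def resume(self, scope: str, target: str) -> None:
        """Lift a previous stop; a no-op if the target was not stopped."""
        if scope not in SCOPES:
            raise ValueError(f"unknown scope: {scope}")
        self.stopped[scope].discard(target)

    def is_stopped(self, org: str, source: str, agent: str) -> bool:
        """A request is halted if any of its three identifiers is stopped."""
        return (
            org in self.stopped["org"]
            or source in self.stopped["source"]
            or agent in self.stopped["agent"]
        )


switch = KillSwitch()
switch.stop("agent", "intern-agent")
print(switch.is_stopped("acme", "epic/Patient", "intern-agent"))   # True
print(switch.is_stopped("acme", "epic/Patient", "claude-prod-7"))  # False
```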

Connectors

One governed pipe to every data source and MCP server you run.

Postgres, MySQL, MongoDB, and S3 ship as first-party connectors today. The MCP gateway fans out to any MCP server you register. The proxy authenticates every agent, evaluates every request against your contracts and policies, masks fields per role, and writes the decision to an immutable audit log. Same governance pipeline for every source.

Data sources

  • PostgreSQL
  • MySQL
  • MongoDB
  • Amazon S3

MCP servers

  • GitHub
  • Salesforce
  • Microsoft 365
  • Atlassian
  • Snowflake
  • HubSpot
  • Stripe
  • Notion
  • Datadog
  • Asana

Plus any MCP server you register. Common operator additions: Workday, ServiceNow, Google Workspace, Slack, Zoom, Zendesk, and any internal MCP tool your team builds. Register the upstream once and DataDam authenticates, governs, and audits every call.

First-party connector roadmap

Native, contract-aware connectors planned for premium tiers: Salesforce, Slack, Jira, ServiceNow, HubSpot, SharePoint. Custom first-party connectors are available on Enterprise plans.

Brand marks shown above identify the data systems DataDam connects to, used under nominative fair use. We do not imply partnership, endorsement, or commercial relationship with the named companies.

Industries

Built for teams that can't ship governance late.

Six regulated verticals. Each gets pre-configured policy templates, the right audit retention floor, and a vocabulary the auditor recognizes.

Healthcare

PHI auto-masked at the proxy. Patient identifiers tokenized so analytics agents work without ever seeing real MRNs.

HIPAA blueprint

Financial services

MNPI fenced at the contract layer. Trading agents and research agents get the data they need; the data they don't is a 403.

FINRA blueprint

Legal & compliance

Privileged content stays privileged. Append-only audit trail per query, exportable on demand to your SIEM or eDiscovery flow.

SOC 2 blueprint

Government

Self-hosted, air-gapped, on-prem-ready. The control plane never sees query content; the proxy never calls home for policy.

FedRAMP roadmap

Insurance

PII masked at the column level. Claim-bot reads policy metadata without ever touching the underwriting file's medical fields.

SOC 2 blueprint

EdTech

Per-role scoping for student records and grades. Tutoring agents see what their cohort allows; admin queries are audited row-by-row.

FERPA roadmap

Trust boundary

Your data never leaves your perimeter.

The proxy runs in your environment, on your Postgres, behind your firewall. The control plane (this site, your console at app.mydatadam.com) sees rollup metadata and contract definitions only.

Zero query content. Zero row values. Zero PII. Ever.

→ We architected for this on day one.

0       Bytes of your data on our infra
4       First-party connectors shipped
1,100+  Proxy tests, all passing
100%    SBOM + provenance on every release

How DataDam compares

Adjacent tools cover pieces. None enforce at the agent boundary.

Five rows that summarize the difference. See the full breakdown on the Compare page.

Capability                              Data catalog  DLP                 IAM / SSO  API gateway   DataDam
Documents what data exists              Yes           No                  No         No            Imports from catalog
Enforces who reads what at runtime      No            Partial (post-hoc)  Coarse     Per-endpoint  Per-field, per-role
Masks PII before agent sees it          No            Detect-only         No         No            Yes (200+ entity types)
Immutable audit log per request         No            Partial             Coarse     Yes           Yes
Built for non-deterministic AI callers  No            No                  No         No            Yes

Pricing

Four tiers. Pricing at general availability.

DataDam is in early access. Every tier ships the full governance engine; final pricing lands at general availability. Join the waitlist to lock in early access.

Free

Solo & teams of 1

Free

Forever. The wedge.

  • 5 sources / MCP upstreams
  • 1 proxy instance
  • 30-day audit retention
  • Runtime PII detection
  • LLM egress: text scanning
  • Trust scoring
  • Anomaly detection
  • No LLM image scanning
  • No SSO
  • Community support

Request early access

Team

Small & growing

Early access

For teams in early production.

  • 15 sources / MCP upstreams
  • 3 proxy instances
  • 365-day audit retention
  • Data contracts (ODCS)
  • Compliance blueprints
  • LLM egress: text + image scanning (CPU)
  • SSO (SAML / OIDC)
  • SCIM provisioning
  • Email support, 24h response

Request early access

Business (best for regulated)

Mid-market

Early access

For teams under audit pressure.

  • 50 sources / MCP upstreams
  • 10 proxy instances
  • 2,190-day retention (HIPAA-ready)
  • SIEM export (Splunk / Datadog)
  • LLM egress: text + image scanning (CPU or GPU)
  • PagerDuty integration
  • Multi-step threat correlation
  • Cohort intelligence
  • Slack support, 4h response

Request early access

Enterprise

Regulated & large

Custom

For 200+ agents, multi-region, SLA.

  • Unlimited sources / upstreams
  • Unlimited proxy instances
  • 3,650-day audit retention
  • LLM egress: text + image scanning (CPU or GPU)
  • Active-active multi-region
  • Dedicated VPC peering option
  • Custom contract review
  • Quarterly compliance reviews
  • Named CSM, 1h response

Request early access

Frequently asked

What buyers ask first.

Nine questions every CISO and platform team has worked through. If yours is missing, contact us.

What is an agent data proxy?

A governance layer that sits between AI agents and the data sources they query. It authenticates each request, evaluates it against policy, masks fields per role, and writes the decision to an immutable audit log. DataDam is the first product in this category.

How is DataDam different from a data catalog or data governance tool?

Catalogs document what data exists. DataDam enforces who can read what at runtime. The catalog answers "what does our company have"; DataDam answers "what is this agent allowed to read right now." They complement each other.

Where does my data live?

In your environment. Cloud, on-prem, or air-gapped. The proxy runs with your credentials, against your data sources. The control plane only sees rollup telemetry and policy configuration. No query content, no row values, no PII ever crosses to DataDam.

Does DataDam slow down agent requests?

The proxy adds a few milliseconds for policy evaluation and field masking. PII detection on un-contracted sources is the heaviest path and runs inside your environment against the proxy CPU. For contracted sources, the contract already declares which columns to mask, so overhead stays at that few-millisecond policy-and-mask step.

Which data sources does DataDam connect to?

Postgres, MySQL, MongoDB, S3 at launch. Salesforce, Slack, Jira, ServiceNow, HubSpot, SharePoint follow as premium connectors. Custom connectors are available on Enterprise.

How does DataDam handle compliance frameworks like HIPAA or SOC 2?

Each framework ships as a policy template inside the product. Apply the template and the proxy sets trust thresholds, mask defaults, and audit retention to values aligned with that framework. Evidence endpoints export rollups and change logs for auditor delivery in CSV or JSON. The templates help you meet your compliance obligations.

Is DataDam itself SOC 2 / HIPAA certified?

Not yet. The company has not been audited or attested to SOC 2, HIPAA, or any other framework. Certification of DataDam itself is a separate workstream we will pursue when there is customer demand and operational maturity to support it. We will update this page when it is real, not before. The product is designed so a customer can meet their own compliance obligations through configurable policy templates, audit retention, and architecture (customer data stays in your environment).

Can DataDam be deployed on-premise or air-gapped?

Enterprise tier deploys the proxy on-premise with the SaaS control plane. Air-gapped tier deploys both proxy and control plane on customer infrastructure with a signed license file bound to specific hardware.

Is DataDam open source?

The runtime governance engine, Microsoft Agent Governance Toolkit, is MIT licensed. The DataDam product (control plane, console, connectors, compliance blueprints) is commercial. Customer SDKs are MIT licensed and live on GitHub.

Stop the next agent breach. Today, not next quarter.

The proxy starts in 5 minutes. The first audit row lands within 30 seconds of your first agent query. Procurement doesn't get to slow this one down.