Build a dam for your data

Stop AI agents from leaking your enterprise data.

DataDam is the governed control plane between your AI agents and your enterprise data. Every request is authenticated, policy-evaluated, masked where required, and written to an immutable audit log.

Request early access →
See how it works

// no credit card · self-hosted in your environment · up in 5 minutes

LIVE AUDIT FEED

Time	Agent	Request	Action	Tags
18:35:40	finance-agent-15	postgres/transactions → card → ****-****-7129	mask	pci · sensitive
18:35:46	finance-agent-15	postgres/transactions → amount, currency, ts	allow	trust 884
18:35:52	finance-agent-02	s3://earnings-q4 → denied	block	finra · mnpi
18:35:58	finance-agent-08	workday/Employee → employee_id → tok_***	tokenize	soc2 · sensitive
18:36:04	finance-agent-15	mysql/billing → tax_id → ***-**-9034	mask	pii.detect
18:36:11	claude-prod-7	mcp:github/repo → readme, src/**	allow	trust 905
18:36:18	cursor-readonly	mongo/customers → email → c***@acme.com	mask	gdpr · pii
18:36:25	marketing-bot	postgres/users → ssn	block	role · deny
18:36:31	gpt-4o-coder	mcp:slack/channel → members, last_msg	allow	trust 870
18:36:37	etl-bot-prod	s3://customers → email → tok_a8f3...	tokenize	reversible · owner-only
18:36:44	gemini-1.5	snowflake/revenue → q4_total → $***K	mask	pii · financial
18:36:50	intern-agent	epic/Patient → mrn	block	hipaa · phi
18:36:57	claude-prod-7	mcp:linear/issue → title, status, owner	allow	trust 887
18:37:03	support-bot-04	mysql/tickets → body → [redacted]	mask	pii.detect

The agent era broke governance

Your DLP wasn't built for this.

Agents don't read like analysts. They read fast, broad, recursive. Every API key is a fan-out vector. Every chat turn is a new lineage edge. By the time legal asks "what did the model see?" the answer is in a thousand places, and not one of them is the audit log.

API keys leak silently.

An IDE assistant with read-all access becomes a credential that touches every column on every prompt. The breach is silent. The audit trail is whatever the model decided to log.

"We had no idea our IDE assistant was reading customer SSNs until a developer paste-bombed a finding into Slack."

Contracts live in Notion.

The rule that says "marketing can't read PHI" is a paragraph in a Confluence page. The role that grants it is a Snowflake grant tweaked six months ago. Drift is the default. Detection is the surprise.

"Our data contracts are real. They are also nowhere a query plan can see them."

Audit logs are LLM-shaped.

Most "AI governance" is a wrapper that tells the model to behave. That's not governance. That's an honor system in production. Auditors don't accept "we asked nicely."

"Compliance asked for an immutable log. The vendor sent a Datadog dashboard."

The proxy is the only choke point that holds.

If the rule isn't enforced before the bytes leave the database, the rule isn't enforced. DataDam makes that line a real boundary, not a policy document.

"We finally have a place where 'who can read what' is a function, not a feeling."

How it works

A proxy in your environment. Three things change. Nothing else does.

Drop the container next to your databases. Point your agent's connection string at it. The agent thinks it's talking to Postgres. Your governance team thinks it's the best Tuesday of the quarter.

Step 01

Connect

The agent points its connection string at the proxy. No SDK. No wrapper. Postgres on the wire, MySQL on the wire, Mongo, S3, MCP. Same shape, same drivers.

$ export DATABASE_URL="postgres://datadam:5432"

Step 02

Govern

Every query is parsed, classified, masked, and policy-checked before the connector talks to the database. Contracts, trust scores, anomaly thresholds, kill switches: enforced, not advisory.

enforced · deterministic
no LLM in the governance loop
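
A minimal sketch of what a deterministic, default-deny policy check could look like. It is an illustration only, not DataDam's engine: the `POLICY` table, the `Decision` type, and the `evaluate` function are hypothetical names, and the roles and classification tags are made up for the example.

```python
from dataclasses import dataclass

# Hypothetical policy table: (role, classification tag) -> action.
# Any pair that is not listed falls through to "block" (default deny).
POLICY = {
    ("engineer", "public"): "allow",
    ("engineer", "pii.email"): "mask",
    ("engineer", "pii.ssn"): "mask",
    ("viewer", "public"): "allow",
}


@dataclass(frozen=True)
class Decision:
    column: str
    action: str  # "allow", "mask", or "block"


def evaluate(role, columns):
    """Return one decision per requested column, in request order.

    `columns` maps column name -> classification tag. The lookup is a
    plain dictionary read, so the same input always yields the same output.
    """
    return [
        Decision(column, POLICY.get((role, tag), "block"))
        for column, tag in columns.items()
    ]


requested = {"id": "public", "email": "pii.email", "ssn": "pii.ssn"}
for decision in evaluate("engineer", requested):
    print(decision.column, decision.action)
```

Because the decision is a table lookup rather than a model call, the same request evaluates the same way every time, which is the property the "enforced · deterministic" label points at.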

Step 03

Audit

OpenLineage events for every read, denial, and mask. Append-only, tamper-evident, exportable to Splunk, Datadog, Elasticsearch, or your SIEM. The audit answer ships with the audit question.

append-only · openlineage
5 sink destinations supported

// see it in motion · live demo

Any AI agent · via governed MCP

Request

tools/call → postgres__query

SELECT email, ssn, phone FROM users LIMIT 3

Result

email	a***@acme.com
ssn	***-**-****
phone	+1-***-***-1234

✓ 3 rows · 3 fields masked

DataDam Proxy v0.2.0 · in your env

Contract users-pg v3

3 fields → policy match

Role engineer

read access ✓

Mask 3 fields

ssn → REDACT · email → generalize

Audit emitted

openlineage · append-only

postgres / users · trust 920

Raw rows · before governance

email	alice@acme.com
ssn	489-12-1234
phone	+1-415-555-1234

● never leaves your environment

Step 1. Agent issues an MCP request to DataDam.
Step 2. Request travels to the proxy. Encrypted, scoped, no PII yet.
Step 3. Proxy resolves the contract and validates the agent's role.
Step 4. Proxy issues the underlying SQL to your database.
Step 5. Database returns raw rows, with real PII.
Step 6. Raw rows flow back to the proxy. They never leave your environment.
Step 7. Proxy applies field masks. SSN redacted. Email generalized. Audit emitted.
Step 8. Sanitized result arrives at the agent. Exactly the data it's allowed to see.

One row, two views

What the operator sees. What the agent sees.

Same SQL. Same row in your database. The proxy strips, masks, or generalizes by role before the response reaches the agent. Operators get activity attribution; agents get governed values.

Operator view

/audit?agent=bi-runner

Activity + classification + mask attribution

agent_id        bi-runner
role            viewer
source          customers
table           customers
column          email
classification  pii.email
mask_mode       generalize_email
count_24h       47
detections_24h  47 (all generalized)
denied_24h      0
trust_score     820 (allowed)

The console shows who accessed what, how often, with which mask applied. The operator never sees a row value either; they see the classification and the count.

Agent view

POST /customers/query

Governed values, never raw PII

[
  {
    "id": "7f3a-...",
    "name": "Alice Liddell",
    "email": "*@wonderland.test",
    "ssn": "***"
  },
  {
    "id": "9c1b-...",
    "name": "Bob Cratchit",
    "email": "*@buildit.test",
    "ssn": "***"
  }
]

The agent's tool call returns governed values. Names pass through (they are not classified as PII in this contract); emails generalize to the domain; the SSN redacts to a placeholder. The mask runs inside the proxy on the row in memory, before bytes leave the proxy host.
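
As a rough sketch of how a per-column mask map like this might be applied to a row in memory: the function names, the `MASK_MAP` structure, and the raw input values below are assumptions made for illustration, not DataDam's implementation.

```python
def generalize_email(value):
    """Keep only the domain, e.g. 'alice@wonderland.test' -> '*@wonderland.test'."""
    _local, separator, domain = value.rpartition("@")
    if not separator:
        # Not shaped like an email: redact fully instead of echoing the value.
        return "***"
    return "*@" + domain


def redact(_value):
    """Replace the value with a fixed placeholder."""
    return "***"


# Hypothetical mask map: column name -> mask function.
# Columns that are not listed pass through unchanged.
MASK_MAP = {"email": generalize_email, "ssn": redact}


def apply_masks(row):
    """Return a new row with every mapped column masked."""
    return {
        column: MASK_MAP[column](value) if column in MASK_MAP else value
        for column, value in row.items()
    }


# Made-up raw row; the SSN is an obviously fake placeholder.
raw_row = {
    "id": "7f3a",
    "name": "Alice Liddell",
    "email": "alice@wonderland.test",
    "ssn": "000-00-0000",
}
print(apply_masks(raw_row))
```

The map is a plain function table, so the same raw row always produces the same governed row, and nothing in the path depends on a model's judgment.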

Same data. Two views. Operator sees activity, classification, and mask attribution. Agent sees the governed row. Neither sees the raw values directly. The mask map is deterministic and authored by the operator in your data contract.

Tag-driven governance. No LLM in the loop.

What it does

Six layers between the agent and the database.

Each layer is deterministic. Each layer is auditable. None of them ask the model to behave.

PII masking, both ways.

Runtime PII detection flags entities on un-contracted columns. ODCS contracts mask declared columns. Tokenize mode lets you reverse the mask via an audited admin call.

200+ entity types
Per-org confidence threshold
Reversible tokens, owner-only
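
One way reversible, owner-only tokenization could be sketched, assuming an in-memory vault and a single owner identity. The `TokenVault` class, its method names, and the audit record shape are hypothetical; only the `tok_` prefix is borrowed from the audit feed above.

```python
import secrets


class TokenVault:
    """Toy in-memory vault: value <-> random token, reversible by the owner only."""

    def __init__(self, owner):
        self._owner = owner
        self._token_by_value = {}
        self._value_by_token = {}
        self.audit = []  # every detokenize attempt is recorded, allowed or not

    def tokenize(self, value):
        """Return a stable random token for the value."""
        if value not in self._token_by_value:
            token = "tok_" + secrets.token_hex(8)
            self._token_by_value[value] = token
            self._value_by_token[token] = value
        return self._token_by_value[value]

    def detokenize(self, token, caller):
        """Reverse a token. Only the owner may do this; all attempts are audited."""
        allowed = caller == self._owner
        self.audit.append({"caller": caller, "token": token, "allowed": allowed})
        if not allowed:
            raise PermissionError(f"{caller} may not reverse tokens")
        return self._value_by_token[token]


vault = TokenVault(owner="data-owner")
token = vault.tokenize("E-1001")
print(token.startswith("tok_"))
print(vault.detokenize(token, caller="data-owner"))
```

The token is random rather than derived from the value, so it cannot be reversed without the vault; a production design would also need durable, access-controlled storage, which this sketch leaves out.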

Data contracts (ODCS v3).

The Open Data Contract Standard, enforced at the proxy. Drift detection, freshness SLAs, schema stability. Versioned, immutable once active, deprecation paths.

Draft → active → deprecated → retired
Active versions are immutable
Per-version diff in the console
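
The lifecycle above can be modeled as a small state machine. This is a sketch under assumptions: the class and method names are hypothetical, and it only illustrates the two rules stated here, that states advance in one direction and that a version stops being editable once it leaves draft.

```python
from dataclasses import dataclass, field

# Allowed forward transitions: draft -> active -> deprecated -> retired.
TRANSITIONS = {
    "draft": {"active"},
    "active": {"deprecated"},
    "deprecated": {"retired"},
    "retired": set(),
}


class ContractError(Exception):
    """Raised on an illegal transition or an edit to a non-draft version."""


@dataclass
class ContractVersion:
    name: str
    version: int
    schema: dict = field(default_factory=dict)  # column -> classification
    state: str = "draft"

    def advance(self, new_state):
        if new_state not in TRANSITIONS[self.state]:
            raise ContractError(f"cannot go from {self.state} to {new_state}")
        self.state = new_state

    def edit_schema(self, column, classification):
        # Only drafts are editable; changing an active contract means
        # publishing a new version instead.
        if self.state != "draft":
            raise ContractError(f"version {self.version} is {self.state} and immutable")
        self.schema[column] = classification


contract = ContractVersion("users-pg", 3)
contract.edit_schema("email", "pii.email")
contract.advance("active")
print(contract.state, contract.schema)
```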

Append-only audit log.

Every read, every mask, every denial. OpenLineage-shaped, exportable to Splunk, Datadog, Elasticsearch, or your SIEM. DB-level UPDATE/DELETE prevention. Immutable means immutable.

OpenLineage facets
5 export sink destinations
30 to 3,650 day retention
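
The page does not say how tamper evidence is achieved. One common technique is a hash chain, where each entry's hash covers the previous entry's hash. The sketch below assumes that approach; the `AuditLog` class and the event fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


class AuditLog:
    """Toy append-only log where each entry's hash chains to the one before it."""

    def __init__(self):
        self._entries = []

    def append(self, event):
        previous = self._entries[-1]["hash"] if self._entries else GENESIS
        payload = json.dumps(event, sort_keys=True)
        digest = hashlib.sha256((previous + payload).encode("utf-8")).hexdigest()
        self._entries.append({"event": payload, "prev": previous, "hash": digest})
        return digest

    def verify(self):
        """Recompute every hash; any edited or removed entry breaks the chain."""
        previous = GENESIS
        for entry in self._entries:
            expected = hashlib.sha256(
                (previous + entry["event"]).encode("utf-8")
            ).hexdigest()
            if entry["prev"] != previous or entry["hash"] != expected:
                return False
            previous = entry["hash"]
        return True


log = AuditLog()
log.append({"agent": "marketing-bot", "resource": "postgres/users", "action": "block"})
log.append({"agent": "claude-prod-7", "resource": "mcp:github/repo", "action": "allow"})
print(log.verify())
```

A chain like this only detects tampering. Preventing it still relies on controls such as the DB-level UPDATE/DELETE prevention mentioned above.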

Trust scoring.

Five components, 200 points each. Contract validity, freshness, schema stability, availability, ownership. Block below threshold, warn above. The agent learns the source is stale before the model does.

Per-source composite, 0 to 1,000
Org default + per-source override
Recomputed every 5 minutes
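
The arithmetic can be sketched as follows. The component names come from the text above; the individual point values, the threshold of 700, and the function names are assumptions chosen for illustration. How each component is measured is not specified here and is not modeled, and neither is the warn band.

```python
COMPONENTS = (
    "contract_validity",
    "freshness",
    "schema_stability",
    "availability",
    "ownership",
)
MAX_PER_COMPONENT = 200  # five components x 200 points = 0 to 1,000


def trust_score(points):
    """Sum the five component scores into a composite between 0 and 1,000."""
    total = 0
    for name in COMPONENTS:
        value = points[name]
        if not 0 <= value <= MAX_PER_COMPONENT:
            raise ValueError(f"{name} must be between 0 and {MAX_PER_COMPONENT}")
        total += value
    return total


def decide(score, org_default, source_override=None):
    """Block below the threshold; a per-source override beats the org default."""
    threshold = org_default if source_override is None else source_override
    return "block" if score < threshold else "allow"


# Hypothetical component values for one source.
source_points = {
    "contract_validity": 200,
    "freshness": 180,
    "schema_stability": 200,
    "availability": 140,
    "ownership": 200,
}
score = trust_score(source_points)
print(score, decide(score, org_default=700))
```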

Anomaly + threat correlation.

Three statistical detectors over 14-day rolling baselines: volume z-score, novelty, time-of-day. Multi-step threats correlate across windows. No LLM in the detection path.

k-anonymity floor of 5
14-day rolling baselines
Triage suppression: 24h cooldown
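
For the volume detector, a z-score against a 14-day baseline might look like the sketch below. The daily counts, the threshold of 3.0, and the function names are assumptions; the novelty and time-of-day detectors and the cross-window correlation are not shown.

```python
from statistics import mean, stdev


def volume_zscore(baseline, today):
    """How many standard deviations today's count sits from the baseline mean."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline days")
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation
    if sigma == 0:
        # A flat baseline: any change at all is infinitely unusual.
        if today == mu:
            return 0.0
        return float("inf") if today > mu else float("-inf")
    return (today - mu) / sigma


def is_anomalous(baseline, today, threshold=3.0):
    """Flag the day when the absolute z-score reaches the threshold."""
    return abs(volume_zscore(baseline, today)) >= threshold


# Hypothetical daily read counts for one agent over the last 14 days.
last_14_days = [40, 45, 50, 47, 52, 48, 44, 46, 51, 49, 43, 47, 50, 46]
print(is_anomalous(last_14_days, today=47))
print(is_anomalous(last_14_days, today=400))
```

In a rolling setup the baseline list would drop its oldest day and gain the newest one each day, so the detector adapts to gradual change while still flagging sudden spikes.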

Kill switch + governance intelligence.

One click stops an agent, source, or org. Recommendations surface idle agents, uncontracted sources, high-denial roles. Cohort intelligence (k=5) flags drift against industry peers.

Org / source / agent scope
Auto-suggestions, never auto-applied
Peer benchmarks, never your data
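
A scoped kill switch reduces to a membership check that runs before any other evaluation. The sketch below is illustrative only; the `KillSwitches` class and the org and source identifiers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class KillSwitches:
    """Sets of stopped orgs, sources, and agents; any match blocks the request."""

    orgs: set = field(default_factory=set)
    sources: set = field(default_factory=set)
    agents: set = field(default_factory=set)

    def stop(self, scope, identifier):
        """Flip the switch for one org, source, or agent."""
        if scope not in ("org", "source", "agent"):
            raise ValueError(f"unknown scope: {scope}")
        getattr(self, scope + "s").add(identifier)

    def is_blocked(self, org, source, agent):
        return org in self.orgs or source in self.sources or agent in self.agents


switches = KillSwitches()
switches.stop("agent", "intern-agent")
print(switches.is_blocked("acme", "postgres/users", "intern-agent"))
print(switches.is_blocked("acme", "postgres/users", "claude-prod-7"))
```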

Connectors

One governed pipe to every data source and MCP server you run.

Postgres, MySQL, MongoDB, and S3 ship as first-party connectors today. The MCP gateway fans out to any MCP server you register. The proxy authenticates every agent, evaluates every request against your contracts and policies, masks fields per role, and writes the decision to an immutable audit log. Same governance pipeline for every source.

Data sources

PostgreSQL
MySQL
MongoDB
Amazon S3

MCP servers

GitHub
Salesforce
Microsoft 365
Atlassian
Snowflake
HubSpot
Stripe
Notion
Datadog
Asana

Plus any MCP server you register. Common operator additions: Workday, ServiceNow, Google Workspace, Slack, Zoom, Zendesk, and any internal MCP tool your team builds. Register the upstream once and DataDam authenticates, governs, and audits every call.

First-party connector roadmap

Native, contract-aware connectors planned for premium tiers: Salesforce, Slack, Jira, ServiceNow, HubSpot, SharePoint. Custom first-party connectors are available on Enterprise plans.

Brand marks shown above identify the data systems DataDam connects to, used under nominative fair use. We do not imply partnership, endorsement, or commercial relationship with the named companies.

Industries

Built for teams that can't ship governance late.

Six regulated verticals. Each gets pre-configured policy templates, the right audit retention floor, and a vocabulary the auditor recognizes.

Healthcare

PHI auto-masked at the proxy. Patient identifiers tokenized so analytics agents work without ever seeing real MRNs.

HIPAA blueprint

Financial services

MNPI fenced at the contract layer. Trading agents and research agents get the data they need; the data they don't is a 403.

FINRA blueprint

Legal & compliance

Privileged content stays privileged. Append-only audit trail per query, exportable on demand to your SIEM or eDiscovery flow.

SOC 2 blueprint

Government

Self-hosted, air-gapped, on-prem-ready. The control plane never sees query content; the proxy never calls home for policy.

FedRAMP roadmap

Insurance

PII masked at the column level. Claim-bot reads policy metadata without ever touching the underwriting file's medical fields.

SOC 2 blueprint

EdTech

Per-role scoping for student records and grades. Tutoring agents see what their cohort allows; admin queries are audited row-by-row.

FERPA roadmap

Trust boundary

Your data never leaves your perimeter.

The proxy runs in your environment, on your Postgres, behind your firewall. The control plane (this site, your console at app.mydatadam.com) sees rollup metadata and contract definitions only.

Zero query content. Zero row values. Zero PII. Ever.

→ We architected for this on day one.

Bytes of your data on our infra

First-party connectors shipped

1,100+

Proxy tests, all passing

100%

SBOM + provenance on every release

How DataDam compares

Adjacent tools cover pieces. None enforce at the agent boundary.

Five rows that summarize the difference. See the full breakdown on the Compare page.

Capability	Data catalog	DLP	IAM / SSO	API gateway	DataDam
Documents what data exists	Yes	No	No	No	Imports from catalog
Enforces who reads what at runtime	No	Partial (post-hoc)	Coarse	Per-endpoint	Per-field, per-role
Masks PII before agent sees it	No	Detect-only	No	No	Yes (200+ entity types)
Immutable audit log per request	No	Partial	Coarse	Yes	Yes
Built for non-deterministic AI callers	No	No	No	No	Yes

Pricing

Four tiers. Pricing at general availability.

DataDam is in early access. Every tier ships the full governance engine; final pricing lands at general availability. Join the waitlist to lock in early access.

Free

Solo & teams of 1

Free

Forever. The wedge.

5 sources / MCP upstreams
1 proxy instance
30-day audit retention
Runtime PII detection
LLM egress: text scanning
Trust scoring
Anomaly detection
No LLM image scanning
No SSO
Community support

Request early access

Team

Small & growing

Early access

For teams in early production.

15 sources / MCP upstreams
3 proxy instances
365-day audit retention
Data contracts (ODCS)
Compliance blueprints
LLM egress: text + image scanning (CPU)
SSO (SAML / OIDC)
SCIM provisioning
Email support, 24h response

Request early access

Best for regulated

Business

Mid-market

Early access

For teams under audit pressure.

50 sources / MCP upstreams
10 proxy instances
2,190-day retention (HIPAA-ready)
SIEM export (Splunk / Datadog)
LLM egress: text + image scanning (CPU or GPU)
PagerDuty integration
Multi-step threat correlation
Cohort intelligence
Slack support, 4h response

Request early access

Enterprise

Regulated & large

Custom

For 200+ agents, multi-region, SLA.

Unlimited sources / upstreams
Unlimited proxy instances
3,650-day audit retention
LLM egress: text + image scanning (CPU or GPU)
Active-active multi-region
Dedicated VPC peering option
Custom contract review
Quarterly compliance reviews
Named CSM, 1h response

Request early access

Frequently asked

What buyers ask first.

Nine questions every CISO and platform team has worked through. If yours is missing, contact us.

What is an agent data proxy?

A governance layer that sits between AI agents and the data sources they query. It authenticates each request, evaluates it against policy, masks fields per role, and writes the decision to an immutable audit log. DataDam is the first product in this category.

How is DataDam different from a data catalog or data governance tool?

Catalogs document what data exists. DataDam enforces who can read what at runtime. The catalog answers "what does our company have"; DataDam answers "what is this agent allowed to read right now." They complement each other.

Where does my data live?

In your environment. Cloud, on-prem, or air-gapped. The proxy runs with your credentials, against your data sources. The control plane only sees rollup telemetry and policy configuration. No query content, no row values, no PII ever crosses to DataDam.

Does DataDam slow down agent requests?

The proxy adds a few milliseconds for policy evaluation and field masking. PII detection on un-contracted sources is the heaviest path and runs inside your environment on the proxy's CPU. For contracted sources the added overhead is close to zero.

Which data sources does DataDam connect to?

Postgres, MySQL, MongoDB, and S3 at launch. Salesforce, Slack, Jira, ServiceNow, HubSpot, and SharePoint follow as premium connectors. Custom connectors are available on Enterprise.

How does DataDam handle compliance frameworks like HIPAA or SOC 2?

Each framework ships as a policy template inside the product. Apply the template and the proxy sets trust thresholds, mask defaults, and audit retention to values aligned with that framework. Evidence endpoints export rollups and change logs for auditor delivery in CSV or JSON. The templates help you meet your compliance obligations.

Is DataDam itself SOC 2 / HIPAA certified?

Not yet. The company has not been audited or attested to SOC 2, HIPAA, or any other framework. Certification of DataDam itself is a separate workstream we will pursue when there is customer demand and operational maturity to support it. We will update this page when it is real, not before. The product is designed so a customer can meet their own compliance obligations through configurable policy templates, audit retention, and architecture (customer data stays in your environment).

Can DataDam be deployed on-premise or air-gapped?

The Enterprise tier deploys the proxy on-premise alongside the SaaS control plane. The air-gapped tier deploys both the proxy and the control plane on customer infrastructure, with a signed license file bound to specific hardware.

Is DataDam open source?

The runtime governance engine, Microsoft Agent Governance Toolkit, is MIT licensed. The DataDam product (control plane, console, connectors, compliance blueprints) is commercial. Customer SDKs are MIT licensed and live on GitHub.

Stop the next agent breach. Today, not next quarter.

The proxy starts in 5 minutes. The first audit row lands within 30 seconds of your first agent query. Procurement doesn't get to slow this one down.

Request early access →
Read the spec →