How it works
Three components. Each does one job. Together they govern every request your AI agents make to your data.
Category definition
An agent data proxy is a governance layer between AI agents and the data sources they query that authenticates each request, evaluates it against policy, masks fields per role, and writes the decision to an immutable audit log.
Architecture
The proxy authenticates, evaluates policy, masks fields, scores trust, and writes the decision to an immutable audit log. App code, IDE clients, your data, and operator-registered upstream MCP servers all flow through the same governance pipeline.
Components
The proxy runs in your environment. The control plane stays small. The console is static.
Component one
The proxy
A Python service that runs in your environment, behind your network controls, with your credentials. It speaks your data sources natively (Postgres, MySQL, Mongo, S3, Salesforce) and your agents talk to it instead of talking to the source. Every request flows through the proxy. No request content, row values, or PII ever leaves your environment.
Component two
The control plane
A small AWS-hosted SaaS that holds your policies, contracts, trust scores, compliance settings, and audit rollups. The proxy syncs from it every minute and pushes telemetry rollups every five minutes. Telemetry is metadata only: counts, top names, latency percentiles. No query content.
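As a rough illustration of what a metadata-only rollup could look like, the sketch below reduces a batch of request records to counts, top names, and latency percentiles. The record fields and the build_rollup helper are hypothetical, not the proxy's actual schema.

```python
from collections import Counter
from statistics import quantiles


def build_rollup(records, top_n=3):
    """Summarize request records into metadata only: no query text, no row values."""
    latencies = [r["latency_ms"] for r in records]
    # quantiles(n=100) returns 99 cut points: index 49 is p50, index 94 is p95.
    cuts = quantiles(latencies, n=100, method="inclusive")
    tables = Counter(r["table"] for r in records)
    return {
        "request_count": len(records),
        "top_tables": [name for name, _ in tables.most_common(top_n)],
        "latency_p50_ms": cuts[49],
        "latency_p95_ms": cuts[94],
    }


# Hypothetical request records; the "query" field never reaches the rollup.
records = [
    {"table": "customers", "latency_ms": 12.0, "query": "SELECT ..."},
    {"table": "orders", "latency_ms": 30.0, "query": "SELECT ..."},
    {"table": "customers", "latency_ms": 18.0, "query": "SELECT ..."},
]
rollup = build_rollup(records)
```

Only the aggregate dictionary would leave the environment in this sketch; the per-request records stay local.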
Component three
The console
A static Next.js admin app where operators author contracts, set trust thresholds, view the lineage graph, triage anomalies, and export audit evidence. Talks to the control plane over HTTPS. No long-running backend.
Step one
Point the proxy at your sources. It runs in your environment, authenticates with credentials that never leave your environment, and starts emitting metadata-only telemetry to the control plane.
Postgres via asyncpg. MySQL via aiomysql. MongoDB via motor. Each connector reads schema, executes queries, and applies field-level transforms before the response leaves the proxy.
S3 connector with bucket-level scoping by default. Optional in-line content scanning runs built-in PII detection over text bodies and masks detected PII before serving the object. Scanning is opt-in per source.
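To make the in-line scanning idea concrete, here is a minimal sketch that masks two PII patterns in a text body before it is served. The two regular expressions are a simplistic stand-in for the built-in detector, and serve_object and its scan_enabled flag are hypothetical names.

```python
import re

# Deliberately simple patterns; a production detector would cover far more cases.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def mask_text(body: str) -> str:
    """Replace detected emails and US SSNs with fixed placeholders."""
    body = EMAIL.sub("[EMAIL]", body)
    return SSN.sub("[SSN]", body)


def serve_object(body: str, scan_enabled: bool) -> str:
    """Scan only when the source has opted in; otherwise return the body unchanged."""
    return mask_text(body) if scan_enabled else body


sample = "Contact ana@example.com, SSN 123-45-6789."
masked = serve_object(sample, scan_enabled=True)
untouched = serve_object(sample, scan_enabled=False)
```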
Salesforce, Slack, Jira, ServiceNow, HubSpot, SharePoint. The same proxy that governs your databases governs your SaaS data. Field-level masking applies to REST responses the same way it applies to SQL rows.
On Enterprise, the connector framework is documented and pluggable. Write the five-method Connector interface and the proxy treats your custom source like every other source.
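The five methods are not named on this page, so the sketch below is only a guess at the shape. read_schema, execute, and apply_transforms mirror the three jobs described above; connect and close are assumed. All names and signatures are hypothetical, and a real connector built on an async driver would likely use async methods.

```python
from abc import ABC, abstractmethod


class Connector(ABC):
    """Hypothetical five-method interface; the documented one may differ."""

    @abstractmethod
    def connect(self) -> None: ...

    @abstractmethod
    def read_schema(self) -> dict: ...

    @abstractmethod
    def execute(self, query: dict) -> list: ...

    @abstractmethod
    def apply_transforms(self, rows: list, masked: set) -> list: ...

    @abstractmethod
    def close(self) -> None: ...


class InMemoryConnector(Connector):
    """Toy custom source backed by a list of dictionaries."""

    def __init__(self, rows):
        self._rows = rows
        self.connected = False

    def connect(self):
        self.connected = True

    def read_schema(self):
        # Infer column types from the first row.
        return {name: type(value).__name__ for name, value in self._rows[0].items()}

    def execute(self, query):
        wanted = query["columns"]
        return [{col: row[col] for col in wanted} for row in self._rows]

    def apply_transforms(self, rows, masked):
        return [{k: ("***" if k in masked else v) for k, v in row.items()} for row in rows]

    def close(self):
        self.connected = False


conn = InMemoryConnector([{"id": 1, "email": "ana@example.com"}])
conn.connect()
schema = conn.read_schema()
rows = conn.apply_transforms(conn.execute({"columns": ["id", "email"]}), {"email"})
conn.close()
```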
Step two
Author what each role can see, in version-controlled files or in the console. The proxy enforces them on every request, no agent code change required.
Open Data Contract Standard v3 is the schema. PII tags on contract columns drive auto-masking. The proxy detects schema drift on every snapshot push and emits a violation when a column type changes upstream. Contract authors set freshness SLAs per source.
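One way to picture the drift check: compare the column types declared in the contract with the types in the latest schema snapshot, and emit a violation for each mismatch. The dictionary shapes and the detect_drift helper below are illustrative assumptions, not the ODCS document format or the proxy's internals.

```python
def detect_drift(contract_columns, snapshot_columns):
    """Return one violation per column whose upstream type differs from the contract."""
    violations = []
    for name, expected in contract_columns.items():
        actual = snapshot_columns.get(name)
        # Columns absent from the snapshot are skipped here; only type changes are flagged.
        if actual is not None and actual != expected:
            violations.append(
                {"column": name, "kind": "type_changed", "expected": expected, "actual": actual}
            )
    return violations


contract = {"id": "integer", "email": "string"}
snapshot = {"id": "bigint", "email": "string"}
violations = detect_drift(contract, snapshot)
```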
Operator-authored YAML declares per-role allow lists, mask modes, and tokenization opt-ins. The Sales role sees customer email but not SSN. The Support role sees neither. The Audit role sees both, masked.
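The Sales, Support, and Audit example can be sketched as a small policy table plus one function that applies it to a row. The policy is shown as a Python dictionary rather than YAML, and the field names, mask mode names ("redact", "last4"), and apply_policy helper are assumptions for illustration.

```python
# Per-role allow lists and mask modes, mirroring the example roles above.
POLICY = {
    "sales": {"allow": {"name", "email"}, "mask": {}},
    "support": {"allow": {"name"}, "mask": {}},
    "audit": {
        "allow": {"name", "email", "ssn"},
        "mask": {"email": "redact", "ssn": "last4"},
    },
}


def mask_value(value: str, mode: str) -> str:
    """Apply one mask mode to a string value."""
    if mode == "last4":
        return "*" * (len(value) - 4) + value[-4:]
    return "***"  # "redact"


def apply_policy(role: str, row: dict) -> dict:
    """Drop fields outside the role's allow list, then mask what remains if required."""
    rule = POLICY[role]
    out = {}
    for field, value in row.items():
        if field not in rule["allow"]:
            continue
        mode = rule["mask"].get(field)
        out[field] = mask_value(value, mode) if mode else value
    return out


row = {"name": "Ana", "email": "ana@example.com", "ssn": "123-45-6789"}
sales_view = apply_policy("sales", row)
support_view = apply_policy("support", row)
audit_view = apply_policy("audit", row)
```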

Step three
Same request shape, same response shape, with policy enforcement and audit logging in between. No agent code change. The proxy is a drop-in URL swap.
Decision order
Kill switch, then trust check, then anomaly correlation, then PII enforcement, then field policy, then optional tokenization. Each gate either passes, warns, or blocks with a structured reason.
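The gate order can be sketched as a list of functions evaluated in sequence, stopping at the first block. The gate logic, request fields, and Decision type below are hypothetical; only the order and the pass, warn, or block outcomes come from the description above.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    PASS = "pass"
    WARN = "warn"
    BLOCK = "block"


@dataclass
class Decision:
    gate: str
    outcome: Outcome
    reason: str = ""


def kill_switch(req):
    if req.get("kill_switch_on"):
        return Decision("kill_switch", Outcome.BLOCK, "kill switch engaged")
    return Decision("kill_switch", Outcome.PASS)


def trust_check(req):
    if req["trust_score"] < req["trust_threshold"]:
        return Decision("trust_check", Outcome.BLOCK, "trust score below threshold")
    return Decision("trust_check", Outcome.PASS)


def anomaly_correlation(req):
    if req.get("anomalies", 0) > 0:
        return Decision("anomaly_correlation", Outcome.WARN, "recent anomalies for this agent")
    return Decision("anomaly_correlation", Outcome.PASS)


def always_pass(name):
    """Placeholder for gates whose logic is out of scope for this sketch."""
    return lambda req: Decision(name, Outcome.PASS)


GATES = [
    kill_switch,
    trust_check,
    anomaly_correlation,
    always_pass("pii_enforcement"),
    always_pass("field_policy"),
    always_pass("tokenization"),
]


def evaluate(req):
    """Run gates in order; a block ends evaluation, a warning does not."""
    decisions = []
    for gate in GATES:
        decision = gate(req)
        decisions.append(decision)
        if decision.outcome is Outcome.BLOCK:
            break
    return decisions


blocked = evaluate({"trust_score": 0.2, "trust_threshold": 0.5})
warned = evaluate({"trust_score": 0.9, "trust_threshold": 0.5, "anomalies": 2})
```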
Identity
Each agent carries an Ed25519 cryptographic identity issued at registration. Identity rotates on a schedule. Lost or stolen identities are revoked from the console.
Audit
Every request lands in an append-only rollup table. A SHA-256 hash chain makes tampering detectable. Audit retention is configurable per org, and the log is exportable to CSV, JSON, or your SIEM.
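A hash chain makes tampering detectable because each entry's hash covers the previous entry's hash, so editing any record invalidates every hash after it. The sketch below shows the general technique with Python's hashlib; the entry layout and function names are assumptions, not the proxy's storage format.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash the previous hash together with a canonical JSON form of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(log: list, record: dict) -> None:
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev_hash": prev, "hash": entry_hash(prev, record)})


def verify(log: list) -> bool:
    """Recompute every hash from the start; any edited record breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev or entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True


log = []
append(log, {"agent": "agent-1", "decision": "pass"})
append(log, {"agent": "agent-2", "decision": "block"})
intact = verify(log)

# Simulate tampering with the first record after the fact.
log[0]["record"]["decision"] = "block"
tampered_ok = verify(log)
```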
How agents reach the proxy
Your application code, your IDE, and the LLM tools you already run all talk to the same proxy. No second policy language; no second audit log.
HTTP API
Plain HTTP for application code that already calls REST APIs. Bearer key auth, JSON request and response. Same shape across Postgres, MySQL, Mongo, and S3 connectors. Wrap it as a tool in the Anthropic Messages API, OpenAI Chat Completions, OpenAI Responses, LangChain, or the Vercel AI SDK.
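A call from application code might look roughly like the sketch below, which builds a bearer-authenticated JSON request with the standard library. The host, the /v1/query path, and the body fields are placeholders invented for illustration; the OpenAPI reference at /docs/api is the authority on the real shape.

```python
import json
import urllib.request


def build_query_request(base_url, api_key, source, query):
    """Build (but do not send) a JSON POST with bearer key auth."""
    body = json.dumps({"source": source, "query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/query",  # hypothetical path
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


req = build_query_request(
    "https://proxy.internal.example",  # hypothetical proxy address
    "test-key",
    "postgres-main",
    "SELECT id, email FROM customers LIMIT 5",
)
# To send it: urllib.request.urlopen(req), then json.loads() the response body.
```

The same function could be registered as a tool in whichever agent framework you use, since it only needs a source name and a query string.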
MCP
First-party Model Context Protocol endpoint. Your data sources surface as MCP tools alongside operator-registered upstreams (GitHub, Postgres, Notion, Slack, Linear, custom internal tools). Tool arguments are scanned outbound; tool results are scanned inbound. Cursor, Continue, and Cline connect directly.
Stdio shim
For IDE clients that only speak stdio MCP, the published shim bridges to the proxy over Streamable HTTP. Pure transport translator: never inspects, rewrites, or persists message content. One line in the Claude Desktop config and you are governed.
Copy-paste integration snippets at /docs/integrations. OpenAPI 3.1 reference at /docs/api.
Works with what you have
Catalogs have lineage of pipelines. Nobody has lineage of agents. DataDam emits it natively because the proxy sits on the data path. We do not replace your existing catalog; we feed it the slice it does not have.
The /catalog/columns and /catalog/agents/{id}/access endpoints expose the agent-access slice your catalog cannot see.
What we explicitly do not claim
DataDam is not a data catalog and does not pretend to be one. We do not ship a business glossary, BI tool metadata, dbt model docs, legacy data contracts, or a "single pane of glass" for your data estate. We know what flows through the proxy and we are honest about not seeing the rest. The integration story is feed-your-catalog, not replace-your-catalog.
Coming: native push to DataHub, Atlan, Collibra, OpenMetadata. Ask if a different catalog is in your stack; the OpenLineage shape we already emit covers most receivers today.
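For receivers that accept OpenLineage, one agent access could be represented along the lines of the sketch below. The top-level keys follow the OpenLineage run event structure; mapping an agent to the job and a table to an input dataset is our guess at a reasonable encoding, and the producer URI, namespace, and pinned spec version are placeholders.

```python
import uuid
from datetime import datetime, timezone


def agent_access_event(agent_id: str, source: str, table: str) -> dict:
    """Build an OpenLineage-style run event describing one agent reading one table."""
    return {
        "eventType": "COMPLETE",
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "producer": "https://example.com/datadam-proxy",  # hypothetical producer URI
        # Pin to whichever spec version your receiver expects.
        "schemaURL": "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent",
        "run": {"runId": str(uuid.uuid4())},
        "job": {"namespace": "agents", "name": agent_id},  # assumed agent-to-job mapping
        "inputs": [{"namespace": source, "name": table}],
        "outputs": [],
    }


event = agent_access_event("support-bot", "postgres://prod-db", "public.customers")
```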
Production-ready in an afternoon. Free tier covers evaluation. Pro and Business tiers unlock contract enforcement, compliance, and lineage.
Frequently asked
What platform teams ask before deploying. Answers grounded in the existing /how-it-works flow above.
No. Agents point their existing connection string at the proxy. Postgres on the wire, MySQL on the wire, Mongo, S3, MCP. Same protocol, same drivers. The proxy speaks each source natively. No code change in the agent, no library to import.
Three steps. (1) Run the proxy container in your environment (cloud, on-prem, or air-gapped) pointed at one data source: minutes. (2) Author one Open Data Contract Standard contract for that source with PII tags: under an hour for a typical schema. (3) Point your agent at the proxy: instant. Most customers have one source under governance the same day.
Native: Postgres (asyncpg), MySQL (aiomysql), MongoDB (motor), S3 (aioboto3 with optional in-line content masking). SaaS: Salesforce, Slack, Jira, ServiceNow, HubSpot, SharePoint via REST connectors. MCP: any MCP-speaking upstream registered through the console gets the same governance gates as a database.
A first-party MCP server hosted by the proxy. Agents that speak the Model Context Protocol (Claude Desktop, Cursor, Cline, ChatGPT, and others through the @datadam/mcp stdio shim) treat the proxy as their MCP server. The same field-level access control, PII masking, and audit log apply to MCP traffic as to SQL traffic.
Two surfaces. Operators write per-role YAML (allow lists, mask modes, tokenization opt-ins) in version-controlled files or in the console. The Sales role might see customer email but not SSN; Support sees neither; Audit sees both, masked. The proxy enforces the policies on every request without an agent code change.
The proxy returns the response with denied columns stripped, a warning header naming the policy that fired, and an audit row recording the decision. For SQL, the SELECT and WHERE clauses are rewritten to drop the denied fields; the agent sees a partial result rather than a 403 (configurable per source). Every decision is hash-chained into the audit log.
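The strip-or-block behavior can be sketched as one function that returns a status, headers, rows, and an audit row. The X-Policy-Warning header name, the on_deny setting, and the audit fields are hypothetical; only the overall behavior follows the description above.

```python
def enforce_response(rows, denied, policy_name, on_deny="strip"):
    """Strip denied columns (default) or block outright, and record the decision."""
    hit = sorted({col for row in rows for col in row} & set(denied))
    audit = {"policy": policy_name, "denied_columns": hit, "action": "allow"}
    if not hit:
        return 200, {}, rows, audit
    if on_deny == "block":
        audit["action"] = "block"
        return 403, {}, [], audit
    audit["action"] = "strip"
    stripped = [{k: v for k, v in row.items() if k not in denied} for row in rows]
    headers = {"X-Policy-Warning": f"{policy_name}: stripped {', '.join(hit)}"}
    return 200, headers, stripped, audit


rows = [{"id": 1, "email": "ana@example.com", "ssn": "123-45-6789"}]
status, headers, body, audit = enforce_response(rows, {"ssn"}, "support-default")
blocked_status, _, blocked_body, blocked_audit = enforce_response(
    rows, {"ssn"}, "support-default", on_deny="block"
)
```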