Security

Built for the boardroom.

DataDam is the governance layer that regulated buyers can put in front of their AI agents without weakening their existing security posture. Here is how.

Proof, not promises

How customer data never leaves your environment.

The trust statement of the whole product is verifiable. You can run the proof yourself in under a minute.

Three layers of evidence, in order of "what would your security team accept as proof":

Layer 1: Architecture

The proxy runs in your VPC. It holds your database credentials, opens connections to your data sources, and returns governed responses to your agents. The control plane (the only DataDam-operated component) lives in AWS US-East-1 and never receives row values. The audit log it ingests carries counts, top-N names, latency percentiles, and masked entity counts. Nothing else.

Layer 2: Code

Masking runs inside the proxy code, on rows in process memory, BEFORE any bytes flow out. The proxy is open-source. Read apply_masks in the source repository, then read every call site. If the code does not match the claim, file an issue or open a PR.

Layer 3: Run the diff yourself

Same SQL, two paths. Direct to Postgres returns the raw row. Through the proxy returns the governed row. The difference IS the trust claim.

Direct Postgres

$ psql -c "
SELECT id, name, email, ssn
FROM customers LIMIT 1
"

id   | 7f3a-...
name | Alice Liddell
email| alice.liddell@wonderland.test
ssn  | 111-22-3333

Through DataDam

$ curl -H "Authorization: Bearer $KEY" \
  -X POST http://proxy/customers/query \
  -d '{"sql":"SELECT id,name,email,ssn..."}'

id   | 7f3a-...
name | Alice Liddell
email| *@wonderland.test
ssn  | ***

Now tcpdump the proxy host while you run the second curl. The only outbound destinations you will see are the Postgres at your data source (you control this) and the DataDam control plane (you can inspect the bytes; they carry metadata, not row values).

Practices

What we guarantee, by construction.

Customer data never crosses to DataDam

The proxy runs in your environment, with your credentials, against your data. The control plane only sees metadata: rollup counts, top-N names, latency percentiles. No query content, no row values, no PII ever leaves your environment. This is contractual, not best-effort.

Per-org row-level security, enforced at the database

Every org-scoped table in the control plane carries FORCE ROW LEVEL SECURITY. Every API query runs inside a per-org transaction context. Even if application code misfires, the database refuses cross-org reads. Belt and suspenders.

Immutable audit log

The audit rollup is append-only at the application layer and SHA-256 hash-chained for tamper evidence. Each rollup row carries the hash of the previous row, so a deletion or modification breaks the chain. Configurable retention per org, exportable to your SIEM.

Salted, per-org hashing

Mask mode HASH uses sha256(per_org_salt + value). A leaked rainbow table for one org cannot be reused against another. Without a synced salt, the proxy substitutes a process-local random fallback so HASH never ships raw sha256(value).

SSO with CSRF protection

SSO callbacks verify a state cookie via crypto.timingSafeEqual. The state cookie is HttpOnly, SameSite=Lax, scoped to /auth, single-use. Login-CSRF and session-fixation patterns fail before the callback even sees the code.

Encrypted session cookies

dd_access and dd_refresh cookies are HttpOnly, AES-256-GCM encrypted with a key from AWS Secrets Manager, never reach JavaScript. A custom Lambda authorizer reads and decrypts on every request.

Cross-org intelligence is opt-in and k-anonymized

Cross-org peer benchmarks are off by default. When an Owner enables sharing, aggregates are written without org_id and only when at least five contributing orgs share the same cohort. No cohort smaller than k = 5 is ever published.

No LLM in the governance loop

Policy decisions, trust scoring, anomaly detection, threat correlation, and recommendations are deterministic. Every decision traces back to a rule, threshold, or policy line. We use LLMs nowhere in the path that says yes or no to a request.

LLM egress scanning, including image attachments

Agent traffic to Anthropic, OpenAI, and Gemini routes through the same proxy. Every outbound prompt is scanned for PII, secrets, and operator-authored patterns before it reaches the vendor. Image attachments run through an in-proxy detection pipeline: text in screenshots gets the same scan as text in prompts, and detected regions are painted over with black rectangles before forwarding. CPU and GPU versions ship the same coverage; operators that don't accept images can disable the path entirely via the console.

No vendor lock-in

Contracts use an open data-contract standard. The runtime policy engine and customer-facing SDKs ship under permissive open-source licenses. If DataDam ever goes away, the policy engine and the contracts you authored survive: keep the proxy running, point it at your own audit pipeline. You own your governance posture.

MCP threat defense

Aligned with NSA guidance on securing the Model Context Protocol.

MCP reverses the usual trust direction: the server drives the client, so a tool result becomes an instruction the agent may follow. Authentication and input validation alone do not address this. NSA's Artificial Intelligence Security Center said as much in its Cybersecurity Information Sheet, Model Context Protocol (MCP): Security Design Considerations for AI-Driven Automation (U/OO/6030316-26). DataDam's gateway sits on the data path and implements defenses for the failure modes that guidance describes.

Tool-output injection neutralization

Upstream tool results are sanitized before the agent reads them. HTML, zero-width and bidirectional unicode, Unicode tag characters, and known prompt-injection phrasings are stripped or flagged. A poisoned GitHub issue, Jira ticket, or Slack message cannot smuggle instructions into the agent context.

Tool-drift pinning, the rug-pull defense

Every upstream tool is fingerprinted and pinned to an operator-approved baseline. If a tool definition changes after you approve it, the gateway withholds it until an operator re-approves. The MCP rug-pull, where a server quietly swaps a trusted tool for a malicious one, stops at the gateway.

Per-agent rate limits

Every agent carries a request ceiling enforced at the gateway. A compromised or runaway agent is throttled before it can exfiltrate at volume, not flagged after the damage is done. The limit is an org default with per-agent overrides.

Stdio server sandbox

Stdio MCP servers run inside a filesystem and network sandbox. A malicious or compromised server sees only an explicit path allowlist and, when you choose, no network at all. It cannot read host secrets or reach the internet on its own. Helm ships fail-closed by default.

Read the source guidance: NSA CSI, Model Context Protocol: Security Design Considerations (U/OO/6030316-26). DataDam is not endorsed by or affiliated with NSA. The citation describes the threat model these controls answer; it is not a claim of certification.

Compliance posture

Where we are. Where we are not.

We don't have SOC 2 yet. We don't have a HIPAA BAA yet. We're saying this on the homepage because a vendor that hides it ships a worse posture than one that names it. Here's the real list.

What ships today

Eight compliance blueprints. HIPAA, SOC 2, FINRA, GDPR with UK GDPR, CCPA with CPRA, DPDP, LGPD, PIPEDA. Each one pre-configures trust, PII, and audit retention to the framework's expectations.
Append-only audit log with OpenLineage facets, exportable to your SIEM.
Per-org PII confidence + audit retention (30 to 3,650 days).
Tamper-evident hash chain on the audit table.
RLS-isolated multi-tenancy. Every org-scoped table has FORCE ROW LEVEL SECURITY enabled.
SSO via SAML / OIDC + SCIM provisioning. Per-org IdP federation with role mapping.
Hosted in your environment. Helm chart, Docker compose, and CloudFormation manifests for AWS, GCP, Azure, or on-prem.

What we are not yet

SOC 2 Type II audit: in scope; date not committed. We have the controls; we don't have the auditor's letter.
HIPAA BAA available: not yet. Healthcare buyers should ask at sales@mydatadam.com for current status.
FedRAMP: not committed. Government workloads should self-host and air-gap.
ISO 27001: not committed.
PCI DSS: the proxy doesn't store cardholder data. We'd be in scope as passthrough; that path isn't certified.
External pen test report: internal testing only. External report planned after Series A.

If your procurement gates on a checkbox we don't have, we'll say so on the call. We won't waste your quarter.

Policy templates

Compliance-aligned defaults that ship in the console.

Operator-applied product features. They set trust thresholds, mask defaults, and audit retention to values that align with the named framework. They are not, and do not assert, an audit or attestation of DataDam.

Current state

DataDam itself is not currently certified or attested to SOC 2, HIPAA, or any other compliance framework. The product gives you the policy templates, audit retention, and architecture (customer data stays in your environment) to meet your own compliance obligations. Certification of DataDam itself is a separate workstream we will pursue when there is customer demand and operational maturity to support it. We will say so on this page when it is real, not before.

For healthcare customers

HIPAA

Policy template

A policy template aligned with HIPAA workflows. PHI fields tagged in your contract mask by default. Trust threshold defaults to 600 with block enforcement. Audit retention defaults to six years per 45 CFR §164.530(j).

For SaaS and platform customers

SOC 2

Policy template

A policy template aligned with SOC 2 controls. Trust threshold defaults to 400 with warn enforcement. PII confidence threshold tightened. Audit retention defaults to one year for SOC 2 Type II.

For broker-dealers

FINRA

Policy template

A policy template aligned with FINRA record-keeping. Trust threshold defaults to 500 with warn enforcement. Conservative PII detection. Six-year audit retention sized for FINRA Rule 17a-4.

For EU and UK customers

GDPR + UK GDPR

Policy template

A policy template aligned with the GDPR territorial scope under Article 3. Trust threshold defaults to 550 with block enforcement. Conservative PII detection at 0.40. Five-year audit retention sized for Article 30 records-of-processing. Customer-deployed proxy means data residency is whatever you make it: run the proxy in EU regions and no customer data leaves the EU.

For California consumers

CCPA + CPRA

Policy template

A policy template aligned with CCPA disclosure obligations. Warn-mode trust enforcement keeps DSAR evidence comprehensive. Two-year audit retention sized for §1798.130(a)(2)(B). Tokenization gives you reversible identifiers for right-to-know flows.

For India

DPDP Act 2023

Policy template

A policy template aligned with the Digital Personal Data Protection Act 2023. Warn-mode trust enforcement, three-year audit retention. Erasure on consent withdrawal is event-driven per §8(7), supported by tokenization for reversible identifiers.

For Brazil

LGPD

Policy template

A policy template aligned with Lei Geral de Proteção de Dados Pessoais. GDPR-shaped framework: block-mode trust enforcement, conservative PII detection, five-year audit retention sized for Article 37 records-of-processing.

For Canada

PIPEDA

Policy template

A policy template aligned with the Personal Information Protection and Electronic Documents Act. Warn-mode trust enforcement reflecting PIPEDA accountability and evidence emphasis. Three-year audit retention. Federal floor; provincial overlays (BC, Alberta, Quebec) layer on top.

Responsible disclosure

Found something? Tell us.

We treat security reports seriously and we respond fast. Email security@mydatadam.com with a description and reproduction steps. We aim to acknowledge within one business day and assign a remediation owner within three. We do not pursue legal action against good-faith researchers.

We run a security review before each release. Findings, severity, and remediation are tracked internally and shared with evaluating customers under NDA on request.

Frequently asked

Security questions.

What CISOs and platform teams ask about how DataDam handles customer data, credentials, and the audit log.

Where does my customer data live?

In your environment. The proxy runs wherever you own infrastructure (cloud, on-prem, or air-gapped), behind your network controls, with credentials you supply. Query content, row values, and PII never leave the perimeter. The control plane only sees metadata: counts, top names, latency percentiles, policy configuration.

What does the control plane actually see?

Telemetry rollups (counts of requests by agent + source + outcome, top denials, top edges, latency percentiles) plus policy configuration the operator authored. No queries, no row content, no PII, no row counts that could correlate back to specific customers. The wire contract is documented in the security page; the rollup format is checked in tests.

Is the audit log tamper-evident?

Yes. Every audit row is hash-chained: each record carries a SHA-256 of the previous record, so any modification or deletion downstream of a row breaks the chain and the next integrity check fails closed. The retention cron is the only code path with permission to delete, and it runs with an explicit per-transaction GUC that the audit-table triggers verify before allowing the delete.

Do you use LLMs anywhere in the governance decision?

No. Policy decisions, trust scoring, anomaly detection, threat correlation, and recommendations are deterministic. Every decision traces back to a rule, threshold, or policy line. We use LLMs nowhere in the path that says yes or no to a request.

How does DataDam handle PII?

Two paths. For sources with an active data contract, columns tagged pii.email or pii.phone get masked through the contract's declared mode (generalize / redact / hash / tokenize) before the response leaves the proxy. For sources without a contract, runtime PII detection runs inline as a fallback and covers 200+ entity types across seven recognizer packs (secrets, country IDs, healthcare, financial, network, crypto wallets, vehicle and asset identifiers), masking detected spans through the same mask machinery. Both paths run inside your environment.

What compliance frameworks does DataDam support?

Eight frameworks ship as policy templates that set trust thresholds, mask defaults, and audit retention to framework-aligned values: HIPAA, SOC 2, FINRA, GDPR (and UK GDPR), CCPA + CPRA, DPDP Act 2023, LGPD, and PIPEDA. Evidence endpoints export the audit rollups and policy change log in CSV or JSON for auditor delivery. The templates help you meet your obligations; DataDam itself is not yet audited or attested to any framework (we will update when that is real, not before).

How do you secure third-party MCP servers like GitHub, Slack, or internal tools?

The gateway hardens MCP traffic, it does not just route it, and the controls map to the risks in NSA's Cybersecurity Information Sheet on the Model Context Protocol (U/OO/6030316-26). Tool results are sanitized before the agent reads them, so a poisoned issue or message cannot inject instructions. Every upstream tool is fingerprinted and pinned to an operator-approved baseline, so a server that swaps a trusted tool for a malicious one (the rug-pull) is withheld until you re-approve. Per-agent rate limits throttle a runaway or compromised agent at the gateway. Stdio servers run in a filesystem and network sandbox with an explicit allowlist. DataDam is not endorsed by NSA; we implement defenses for the failure modes the guidance describes.

Forward this to your security review.

Architecture and data flow. Identity. Compliance. Encryption. Uptime. Privacy. LLM specifics. Audit visibility. Vendor risk. Every question your security team will ask about DataDam, answered in writing.

Read the CISO Q&A →