Security questions before you buy

CISO Q&A.

The questions every security review asks. Crisp answers, dated where applicable. If your team has a question this doesn't answer, email security@mydatadam.com.

Architecture and data flow

Where does customer data physically live?

Customer data lives in your environment. The proxy runs inside your VPC, with your credentials, against your databases. It never copies data to DataDam.

The control plane is the only DataDam-operated component. It runs in AWS US-East-1 and receives metadata only: rollup counts, top-N names, latency percentiles, masked entity counts. No query content. No row values. No PII.

What does the proxy send back to the control plane?

Aggregated telemetry on a 5-minute schedule: request counts per agent, denial counts, latency percentiles, top-N source aliases by volume, top-N denied columns, per-engine PII entity counts.

It never sends row values, query payloads, raw PII spans, or LLM prompt content. The masking happens inside the proxy code before bytes leave the proxy host.
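To make "metadata only" concrete, here is a minimal sketch of how a rollup like this could be aggregated. The event keys (agent_id, source_alias, denied, latency_ms), the output field names, and the build_rollup helper are assumptions for illustration, not the actual wire format. The point is that the aggregation reads counts and names only, never row values.

```python
import math
from collections import Counter


def build_rollup(events, top_n=3):
    """Aggregate per-request events into a metadata-only rollup.

    `events` is a list of dicts with hypothetical keys:
    agent_id, source_alias, denied (bool), latency_ms.
    """
    latencies = sorted(e["latency_ms"] for e in events)

    def percentile(p):
        # Nearest-rank percentile; None when there is no traffic.
        if not latencies:
            return None
        rank = max(1, math.ceil(p / 100 * len(latencies)))
        return latencies[rank - 1]

    sources = Counter(e["source_alias"] for e in events)
    return {
        "requests_per_agent": dict(Counter(e["agent_id"] for e in events)),
        "denials": sum(1 for e in events if e["denied"]),
        "latency_p50_ms": percentile(50),
        "latency_p95_ms": percentile(95),
        "top_sources": [name for name, _ in sources.most_common(top_n)],
    }
```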

Does customer data ever leave our VPC?

No. The proxy is the only network egress component, and what it emits to the control plane is metadata. You can verify this end-to-end yourself: run tcpdump on the proxy host and inspect the outbound connections. The only outbound destinations are the control plane (metadata) and whichever upstream you direct the agent to (an LLM vendor such as Anthropic, OpenAI, or Gemini, or your data source).
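Once you have a list of destination hosts from a capture, the check reduces to a set difference against the destinations you expect. A small sketch, assuming you have already resolved the captured addresses to hostnames; every hostname below is a placeholder, including the control plane one.

```python
def normalize(host):
    # Lowercase and drop a trailing root dot so "API.example.com." matches.
    return host.strip().lower().rstrip(".")


def unexpected_destinations(observed_hosts, allowed_hosts):
    """Return observed hosts that are not on the allowlist, sorted."""
    allowed = {normalize(h) for h in allowed_hosts}
    return sorted({normalize(h) for h in observed_hosts} - allowed)


# Placeholder values: substitute your real control plane, vendor, and database hosts.
ALLOWED = {"control-plane.example.com", "api.anthropic.com", "db.internal.example.com"}
observed = ["API.anthropic.com.", "control-plane.example.com", "collector.example.net"]
print(unexpected_destinations(observed, ALLOWED))  # ['collector.example.net']
```

An empty list means the capture showed nothing outside the destinations you expect.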

What is the blast radius if the DataDam control plane is compromised?

An attacker with control-plane access gets metadata (counts, names, masks, audit rollups). They do not get any customer data because the control plane never holds it.

They could mint a new API key for an org or revoke an existing one, which would be observable in the audit log within seconds. They cannot change a deployed proxy's behavior beyond what the wire shape allows: trust thresholds, mode flips, kill switches, contract content. They cannot exfiltrate row data.

Customer credentials for data sources and LLM vendors live in your secrets layer, not the control plane.

Does DataDam hold our database credentials?

No. Credentials live in your environment (env vars, 1Password, AWS Secrets Manager, HashiCorp Vault) and the proxy reads them at startup. The control plane sees only the credential metadata you register: host, port, database name, alias.

Identity and access

Who at DataDam can access customer data?

Nobody. Customer data never reaches us. The control plane has no API and no internal tooling that returns row values, because no row values are stored.

The founder/staff console (gated by an explicit email allowlist) returns aggregates only: counts of orgs, members, sources, agents. No customer data, no audit row content, no policy bodies.

How does agent identity flow through audit attribution?

Every API key is per-agent. The proxy authenticates the bearer, looks up the (org, agent_id, role, deployed_by) tuple, and attaches it to every audit row. You can answer "which agent did what" from any audit row.
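A minimal sketch of that attribution step, assuming keys are looked up by their SHA-256 hash. The AgentIdentity class, the key_table mapping, and the attribute function are illustrative names, not the proxy's actual code.

```python
import hashlib
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class AgentIdentity:
    org: str
    agent_id: str
    role: str
    deployed_by: str


def key_hash(api_key):
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def attribute(bearer, key_table, event):
    """Attach the identity tuple for `bearer` to an audit event.

    `key_table` maps key hashes to AgentIdentity. Returns None for an
    unknown key so the caller can reject the request.
    """
    identity = key_table.get(key_hash(bearer))
    if identity is None:
        return None
    return {**event, **asdict(identity), "key_hash": key_hash(bearer)}
```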

What happens if an agent key is leaked?

Per-key revocation: revoke the key in the console; the next /sync (within 5 minutes) propagates revocation to every proxy. Subsequent requests with that key return 403.

Kill switch: scoped per-agent, per-source, or org-wide. Soft-stop on the data path within 5 minutes of activation.

Audit trail: every request from the leaked key is in the audit rollup, attributed by agent_id and key_hash. You can build a list of affected data within the retention window.
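The revocation flow above can be sketched as a key store that drops revoked hashes when a sync arrives. The KeyStore class and its method names are assumptions for illustration; only the 403 response and the hashed-key lookup come from the answers on this page.

```python
import hashlib


def key_hash(api_key):
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


class KeyStore:
    """Holds hashes of active keys; raw keys are never stored."""

    def __init__(self, active_hashes):
        self.active = set(active_hashes)

    def apply_sync(self, revoked_hashes):
        # Called when a /sync response lists revoked keys.
        self.active -= set(revoked_hashes)

    def status_for(self, api_key):
        # 200 for an active key, 403 once it is revoked or unknown.
        return 200 if key_hash(api_key) in self.active else 403
```

Until the next sync lands, the proxy still treats the key as active, which is why the revocation window is bounded by the sync interval.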

SSO and SCIM support?

SSO via Cognito federation. Per-org SAML or OIDC providers. Customer pastes the metadata in /sso; the API registers the IdP in the user pool. State-cookie CSRF protection on the SSO callback.

SCIM 2.0 user and group provisioning. Per-provider SCIM tokens with bearer auth. Group-to-role mapping authored in the console; user role recomputes on every group change.

Compliance and certifications

What is your SOC 2 / HIPAA / ISO 27001 status?

DataDam itself is not currently audited or attested to SOC 2, HIPAA, ISO 27001, or any other framework. We will say so on /security when that changes; today the answer is "not yet."

What we ship: policy templates aligned with 8 frameworks (HIPAA, SOC 2, FINRA, GDPR + UK GDPR, CCPA + CPRA, DPDP Act 2023, LGPD, PIPEDA). The templates set trust thresholds, mask defaults, audit retention, and evidence-collection scope to values that align with the named framework. The templates are an operator-applied product feature, not an audit of DataDam.

BAA: available on request once the BAA-eligible tier is contracted. Email security@mydatadam.com.

AWS region and data residency?

Control plane today: AWS US-East-1. The proxy runs anywhere you put it. Per-region control planes are a roadmap item; ask if you need a specific region for the metadata-only telemetry path.

Pen test cadence and vulnerability disclosure?

Internal security gate per engineering phase (zero critical/high required to ship the next phase). External penetration test is on the roadmap pre-GA. Email security@mydatadam.com to coordinate disclosure of any finding; we run a standard 90-day responsible-disclosure window.

Encryption

Encryption at rest and in transit?

In transit: TLS 1.2+ on every external connection. Upstream database connections default to TLS REQUIRE; operators can opt out per source if their database does not support TLS, and doing so writes a structured warning to the audit log.

At rest: RDS-level encryption on the control plane database. API keys are SHA-256 hashed; license keys are SHA-256 hashed; session cookies are AES-256-GCM encrypted with a key from AWS Secrets Manager.

Reversible tokens (tokenize mode): plaintext encrypted at rest with per-source AES-256-GCM in YOUR Postgres token store, not ours.

Key management approach and BYOK roadmap?

Today the control plane uses AWS Secrets Manager with default AWS-managed keys. Customer-managed KMS (BYOK) for the control plane is on the roadmap; the proxy never holds customer data so BYOK there is a different conversation. Ask if BYOK is a blocker.

Uptime and incident response

Target SLA?

Control plane: 99.9% target on Business tier; 99.95% target on Enterprise tier. The proxy depends on the control plane only for sync. If a sync fails, the proxy keeps enforcing the last-good policy rather than dropping enforcement, so a control plane outage does not break governance on existing traffic.

Proxy: customer-managed. Your orchestrator (Kubernetes, ECS) provides the SLA; DataDam ships the image and the Helm chart.
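The last-good-policy behavior during a control plane outage can be sketched as a cache that replaces its policy only when a sync succeeds. PolicyCache and the fetch callable are hypothetical names; the real sync client is not shown on this page.

```python
class PolicyCache:
    """Keeps the most recent successfully synced policy."""

    def __init__(self, initial_policy):
        self.policy = initial_policy
        self.consecutive_failures = 0

    def sync(self, fetch):
        """Try to refresh the policy; on failure keep the last-good one.

        `fetch` is any callable that returns a policy or raises.
        """
        try:
            self.policy = fetch()
            self.consecutive_failures = 0
        except Exception:
            # Control plane unreachable: keep enforcing what we have.
            self.consecutive_failures += 1
        return self.policy
```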

DR posture (RPO / RTO)?

Control plane: nightly automated RDS snapshots plus point-in-time recovery within the retention window. RPO < 5 minutes for any point in the past 35 days; RTO < 4 hours for a region failover.

Proxy: stateless. The next /sync from the control plane re-hydrates everything. RPO and RTO bounded by your container orchestrator.

Status page and incident response?

Status page: /status. Components: API gateway, console, control plane database, sync endpoint, telemetry ingest. Subscribed customers get email + webhook notifications on incident open and close.

Sev-1 response: <1 hour acknowledgement on Enterprise tier, 24/7. Business-hours response on Team and Business tiers.

Privacy

GDPR and CCPA posture?

DataDam never sees PII from your data sources (customer data does not cross to us; metadata is the only flow). The control plane stores org-level config and audit rollup counts. We do store operator end-user emails, which are personal data, to support SSO and the console; deletion via support ticket is honored within 30 days.

DPA available on request. Subprocessor list at /legal/dpa.

Data deletion process?

Audit retention is operator-configured per org (default 365 days; range 30 to 3650 days). The retention cron deletes expired rows daily.

Org deletion: ticket-driven. We cascade-delete every org-scoped row from the control plane (row-level security, RLS, ensures isolation; deletion is exhaustive).
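A minimal sketch of the daily retention pass, using the default and range stated above. The created_at field name and both helper functions are assumptions for illustration; the real cron runs against the control plane database.

```python
from datetime import datetime, timedelta, timezone

MIN_DAYS, MAX_DAYS, DEFAULT_DAYS = 30, 3650, 365


def retention_cutoff(now, retention_days=DEFAULT_DAYS):
    """Return the timestamp before which audit rows are expired."""
    if not MIN_DAYS <= retention_days <= MAX_DAYS:
        raise ValueError(f"retention_days must be {MIN_DAYS}..{MAX_DAYS}")
    return now - timedelta(days=retention_days)


def expired_rows(rows, now, retention_days=DEFAULT_DAYS):
    """Select rows older than the cutoff; the cron would delete these."""
    cutoff = retention_cutoff(now, retention_days)
    return [r for r in rows if r["created_at"] < cutoff]
```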

Subprocessor list?

AWS (US-East-1) for the control plane. Stripe for billing (Stripe never sees customer data; only customer email and plan tier). Cognito for SSO. SES for transactional email. The current list is at /legal/dpa.

LLM-specific

Do you train models on customer data?

No. We never see customer data. The control plane only receives metadata (counts, top-N names, latency percentiles, masked entity counts). There is nothing to train on.

Does anything you operate send data to model vendors?

No. The agent sends prompts through the proxy; the proxy scans, redacts, and forwards with the customer's vendor API key. The control plane is not in this path. Vendor responses return through the proxy to the agent; the control plane sees only the audit row with entity counts.

How do you handle prompt injection at the proxy layer?

The proxy does not defend against prompt injection. That is the agent framework's layer. What the proxy does: scans the prompt for PII and secrets so that PII-bearing adversarial prompts get redacted before reaching the vendor. A pure-text adversarial prompt with no PII passes through; the agent framework must handle it.
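To illustrate the boundary, here is a toy redaction pass. The two regular expressions stand in for the proxy's detection engines, which are not described here; they are assumptions and far from production-grade detection.

```python
import re

# Toy patterns only; real PII detection covers many more entity types.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(prompt):
    """Replace detected spans with labels; return (text, entity counts)."""
    counts = {}
    for label, pattern in PATTERNS.items():
        prompt, n = pattern.subn(f"[{label}]", prompt)
        if n:
            counts[label] = n
    return prompt, counts
```

A prompt such as "Ignore previous instructions" contains nothing these patterns match, so it is forwarded unchanged. That is the case the agent framework has to handle.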

What about image attachments in LLM requests?

The proxy runs an in-proxy OCR pipeline on every image attachment in Anthropic, OpenAI Chat, OpenAI Responses, and Gemini requests. Extracted text routes through the same PII scan as prompt text. Stage-2 visual detection flags credit cards, ID cards, faces, and signatures. Detected regions are painted over with black rectangles and the redacted image bytes substitute into the forwarded request. Full detail at /llm-egress.

Audit visibility

What is exposed in the audit log?

Per-request: timestamp, agent_id, source alias, table, fields requested, fields allowed, fields denied, fields masked, policies matched, row count, latency, error, key environment, PII entity counts, per-engine attribution, tokens minted, tokens resolved.

No row values. No query payloads. No raw PII spans. The audit row answers "what did the agent ask for and what did the gate do" without copying any data.

SIEM export options?

Splunk HEC, Datadog Logs, Elasticsearch, CloudWatch Logs, generic webhook. Configure at /settings/audit-export. Multiple sinks per org are allowed; consecutive-failure auto-disable prevents a broken sink from blocking governance.
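The consecutive-failure auto-disable can be sketched as follows. The Sink class, the threshold of 3, and the export function are illustrative assumptions; the actual threshold is not stated here.

```python
class Sink:
    """One export destination with a consecutive-failure breaker."""

    def __init__(self, name, send, max_failures=3):
        self.name = name
        self.send = send  # callable that raises on delivery failure
        self.max_failures = max_failures
        self.failures = 0
        self.enabled = True

    def deliver(self, row):
        if not self.enabled:
            return False
        try:
            self.send(row)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.enabled = False  # stop retrying a broken sink
            return False
        self.failures = 0  # any success resets the streak
        return True


def export(row, sinks):
    """Deliver to every sink independently; one failure never blocks another."""
    return {sink.name: sink.deliver(row) for sink in sinks}
```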

Tamper-evident chain?

The audit rollup is append-only at the database level. BEFORE-trigger enforcement rejects UPDATE and TRUNCATE on the audit table; DELETE is allowed only inside the retention cron via a per-transaction GUC. Any other code path attempting mutation fails closed with SQLSTATE 42501.

Each rollup row carries a SHA-256 hash of the previous row, so a deletion or modification (if the BEFORE trigger were somehow bypassed) breaks the chain.
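A minimal sketch of how such a chain can be built and checked. The row fields, the all-zero genesis value, and the canonical-JSON hashing are assumptions for illustration; the actual row serialization is not described here.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed prev_hash for the first row


def row_hash(row):
    # Canonical JSON so the same row always hashes to the same value.
    payload = json.dumps(row, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(chain, fields):
    """Append a row that carries the hash of the previous row."""
    prev = row_hash(chain[-1]) if chain else GENESIS
    chain.append({**fields, "prev_hash": prev})


def first_break(chain):
    """Return the index of the first row whose link fails, or None."""
    expected = GENESIS
    for i, row in enumerate(chain):
        if row["prev_hash"] != expected:
            return i
        expected = row_hash(row)
    return None
```

Editing a row changes its hash, so the mismatch surfaces at the row that follows it. Deleting a row surfaces at the row that took its place.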

Vendor risk

Insurance coverage?

Cyber liability and E&O coverage in place. Specifics on request to security@mydatadam.com.

Acquisition or bankruptcy continuity?

The proxy and SDKs ship under permissive open-source licenses. The proxy is your container in your VPC. If DataDam ever goes away, you keep the proxy running and point it at your own audit pipeline. You own your governance posture by design; the lock-in surface is intentionally minimal.

Source code escrow availability?

On request for Enterprise contracts. The proxy is already open source; escrow is most relevant for the control plane, which is closed-source today.

OSS policy?

We vendor permissively licensed dependencies (MIT, Apache 2.0, BSD). The vendored Microsoft Agent Governance Toolkit ships under MIT. Detection-pipeline components all ship under MIT or Apache 2.0. We do not vendor copyleft. Full dependency list available under NDA; the subprocessor list is at /legal/dpa.

Does DataDam replace our data catalog?

No. We feed your catalog. Catalogs (Atlan, Collibra, Alation, DataHub, OpenMetadata, Select Star) have lineage of pipelines. DataDam has lineage of agents because the proxy sits on the data path. We push our slice (per-column access counts, allowed-vs-denied, agent-to-source edges, contract-derived PII classifications, runtime PII detections, trust scores) into your catalog via OpenLineage push or via our Catalog API. Your existing catalog stays the system of record for business glossary, BI metadata, and dbt model docs.

Have a question this doesn't answer?

Email security@mydatadam.com and we will work through your specifics. We answer security questions in writing; vague PR-speak loses trust faster than the wrong answer.