Why we built the agent data proxy.

The category did not exist a year ago. It exists now because every regulated buyer hit the same wall at the same time: agents that read enterprise data with no governance layer in front of them.

Josh Tepen, Founder, DataDam · April 25, 2026 · 6 min read

Positioning · Category

A year ago, "agent" was a research-team word. Today it is a line item on every CISO's risk register. The change happened faster than the tooling around it could keep up, and the gap now shows in every regulated buyer's procurement process. They want the productivity of agents reading their data. They cannot get past legal review with the architecture every vendor is shipping.

DataDam exists because of that gap. Not because we wanted to invent a new category. Because the work everyone is asking for keeps reducing to the same shape, and there is no product that fills the shape.

What the buyers said.

Across the design-partner conversations that produced the v1 product, three observations repeated until they could no longer be coincidence.

First. Buyers are not afraid of LLMs. They are afraid of the data the LLM gets to read. A general-purpose chat model whose only inputs are public web content is a procurement problem you can describe in two sentences. A general-purpose chat model whose tools call Salesforce, Snowflake, and SharePoint with the user's credentials is a security architecture review.

Second. The catalogs do not enforce. Every regulated buyer has a data catalog already. It tells them what data exists. It does not tell an agent runtime what an agent is allowed to read. That is a runtime concern, and the catalog is a documentation concern.

Third. The audit log is the deliverable. Regulated buyers ship to production only when they can prove what happened in production. Today, an agent that reads a customer record leaves no trace of that read that an auditor would accept. That trace is the deliverable that makes everything else possible.

The catalog answers "what does our company have?" The agent data proxy answers "what is this agent allowed to read right now?" They are different questions. They need different products.

Why a new layer.

The first instinct of any engineering team facing a runtime governance problem is to write a wrapper. Wrap the LLM call. Inspect the prompt. Mask the response. We tried it. It does not work for the same reason that running a firewall on the user's laptop does not work: it enforces at the wrong layer.

The agent is not the trustworthy actor. The agent is the request emitter. The data source has the most accurate picture of what is in it. The right place to enforce policy is between the request and the data source, where you can see the request, the intent, the response, and the role of the user behind it all.

That is a proxy. It sits between agents and data sources. It authenticates each request. It evaluates the request against a policy that the data owner authored. It masks fields per role. It writes the decision to an immutable audit log. It does not guess. It does not call out to a model. The reasoning is deterministic on purpose.
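The evaluate-and-mask steps above can be sketched in a few lines. This is a minimal illustration, not DataDam's implementation: the policy layout, the rule ID, the source name, and the function names are all hypothetical, and authentication and audit logging are left out. It shows the shape of the idea: a lookup against an owner-authored policy, a default deny, and a decision that names the rule it came from.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical owner-authored policy. The layout and names are assumptions
# made for this sketch, not DataDam's policy format.
POLICY = {
    "rule-001": {
        "source": "crm.customers",
        "roles": {"support", "analyst"},
        "mask": {"support": {"ssn"}, "analyst": {"ssn", "email"}},
    },
}

@dataclass(frozen=True)
class Decision:
    allowed: bool
    rule_id: Optional[str]      # the rule that produced the decision, if any
    masked_fields: frozenset    # fields to redact for this role

def evaluate(source: str, role: str) -> Decision:
    """Return the first matching rule in sorted order; deny when none match."""
    for rule_id in sorted(POLICY):
        rule = POLICY[rule_id]
        if rule["source"] == source and role in rule["roles"]:
            return Decision(True, rule_id, frozenset(rule["mask"].get(role, ())))
    return Decision(False, None, frozenset())  # default deny

def apply_mask(record: dict, decision: Decision) -> dict:
    """Replace masked field values with a fixed placeholder."""
    return {k: ("***" if k in decision.masked_fields else v)
            for k, v in record.items()}

decision = evaluate("crm.customers", "support")
masked = apply_mask({"name": "Ada", "ssn": "123-45-6789"}, decision)
```

The same request and the same policy always produce the same decision, and the decision carries the ID of the rule that produced it.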

Why deterministic.

Every governance product that ships an LLM in the decision path is one prompt-injection attack away from a control failure. We know how to build LLMs into things. We also know what they do under adversarial input. They lose deterministic behavior, and a governance product that loses deterministic behavior is not a governance product. It is a suggestion engine that sometimes complies with policy.

DataDam's policy decisions, trust scoring, anomaly detection, and contract enforcement are deterministic. Every decision traces back to a rule, threshold, or policy line. An auditor can verify a decision by reading the rule, not by asking a vendor what their model felt about it. We use LLMs nowhere in the path that says yes or no to a request.
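To make "traces back to a threshold" concrete, here is a hedged sketch of a statistical anomaly check. The post does not say which statistic DataDam uses. The z-score test, the threshold value, and every name below are assumptions chosen for illustration.

```python
import statistics

# Hypothetical threshold. A real deployment would take it from policy.
Z_THRESHOLD = 3.0

def check_read_volume(baseline, observed, threshold=Z_THRESHOLD):
    """Flag an observed read count that sits too far from the baseline mean.

    Returns the inputs to the decision alongside the verdict, so a reviewer
    can recompute it by hand.
    """
    mean = statistics.mean(baseline)
    stdev = statistics.pstdev(baseline)  # population standard deviation
    if stdev == 0:
        # Flat baseline: any deviation at all is infinitely unusual.
        z = 0.0 if observed == mean else float("inf")
    else:
        z = (observed - mean) / stdev
    return {"anomalous": abs(z) > threshold, "z_score": z, "threshold": threshold}

hourly_reads = [10, 12, 11, 9, 10, 11, 12, 10]
result = check_read_volume(hourly_reads, observed=50)
```

The verdict is arithmetic over recorded inputs. Anyone holding the baseline and the threshold can reproduce it without asking a model.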

Where we are.

The product runs in production. The proxy is a Docker image you drop into your environment. The control plane is a small AWS deployment that costs us about thirty dollars a month to run for early customers. Eight compliance blueprints (HIPAA, SOC 2, FINRA, GDPR with UK GDPR, CCPA with CPRA, DPDP, LGPD, PIPEDA) are one-click. The audit log is hash-chained for tamper evidence. Cross-org peer benchmarks are k-anonymized. Anomaly detection is pure statistics.
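For readers unfamiliar with hash chaining, the following sketch shows the general technique. It is not DataDam's log format: the field names, the genesis value, and the choice of SHA-256 over canonical JSON are assumptions for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical starting value for the chain

def _digest(prev_hash: str, entry: dict) -> str:
    """Hash the previous record's hash together with a canonical form of the entry."""
    payload = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

def append(log: list, entry: dict) -> None:
    """Append an entry whose hash commits to everything before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"entry": entry, "prev_hash": prev_hash,
                "hash": _digest(prev_hash, entry)})

def verify(log: list) -> bool:
    """Recompute the chain from the start; any edited record breaks it."""
    prev_hash = GENESIS
    for record in log:
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != _digest(prev_hash, record["entry"]):
            return False
        prev_hash = record["hash"]
    return True

log = []
append(log, {"agent": "a1", "source": "crm.customers", "decision": "allow"})
append(log, {"agent": "a1", "source": "hr.payroll", "decision": "deny"})
```

Because each record's hash covers the previous hash, changing an earlier entry invalidates that record and every record after it. That is what makes tampering evident rather than impossible.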

We are talking with the first teams now. If you are running agents over regulated data and you have hit the same wall everyone hits at the legal review, we would like to talk. The product is built for the boardroom; it is also built for the operator. Both rooms get to see the same audit log.

Read the security architecture, then come argue with us. Email hello@mydatadam.com.
