2026-05-03 · Strategy · 6 min read

Why AI Compliance Needs Independent Evidence

AI governance fails when the same system that drives an automated workflow is also asked to prove that the workflow was governed. Regulated enterprises need a cleaner separation of duties — one that an auditor can trust because the evidence producer has no stake in the business outcome.

This is not a theoretical concern. The frameworks most often applied to AI systems in regulated settings, including EU AI Act Article 12 (record-keeping), SOC 2 (monitoring activities), and HIPAA (audit controls), all expect records that a reviewer can examine and rely on. When the monitoring layer and the execution layer share code, infrastructure, or organizational ownership, the auditor's first question becomes: "Can the evidence be modified by the same system that produced it?"

Why monitoring alone is not enough

Most teams building AI agents today deploy application-level logging, model gateways with usage tracking, and observability stacks. These tools tell you what happened at the infrastructure level — request counts, latency percentiles, error rates. They do not tell you what an agent decided, what data it accessed, or whether that access was appropriate.

Consider a customer-support agent that retrieves an order record and sends it to a billing system. From an infrastructure perspective, the system saw two API calls: one to the CRM, one to the billing API. From a compliance perspective, the critical question is: "Did the agent have authorization to access that specific customer's PII?" An observability tool cannot answer this. A compliance evidence layer can — because it captures the decision context (which policy was evaluated, what data was matched, what the outcome was) alongside the raw event.

Decision context vs infrastructure log

Infrastructure log

GET /api/orders/7f42 → 200 OK · 143ms

Evidence record

Session 7f42 · policy=order-access · user=customer · tools=crm_lookup,billing_api · pii_detected=2 · outcome=approved
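To make the contrast concrete, the evidence record above can be modeled as a small typed structure. The Python sketch below is illustrative only: EvidenceRecord and to_json are hypothetical names, not RADAR's schema, and the fields are simply the ones shown in the example record.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class EvidenceRecord:
    """Hypothetical shape for one decision-context record (not RADAR's schema)."""
    session: str
    policy: str
    user: str
    tools: tuple[str, ...]
    pii_detected: int
    outcome: str

    def to_json(self) -> str:
        # Sorted keys and fixed separators give a stable serialization,
        # so the same record always produces the same bytes for hashing.
        return json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))


# Values copied from the example evidence record in the article.
record = EvidenceRecord(
    session="7f42",
    policy="order-access",
    user="customer",
    tools=("crm_lookup", "billing_api"),
    pii_detected=2,
    outcome="approved",
)
print(record.to_json())
```

Unlike the infrastructure log line, every field here answers a compliance question directly: which policy applied, which tools were used, how many PII matches were found, and what was decided.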

The three-layer model

Independent compliance evidence sits between two other layers that are already well understood in enterprise infrastructure:

1. Execution layer

What the agent CAN do. Runtime enforcement, sandboxing, kernel-level policy. Belongs to AKIOS OSS / EnforceCore.

2. Evidence layer

What the agent DID do. Observation, recording, governance scoring, evidence export. This is RADAR.

3. Infrastructure layer

What the agent COST. GPU utilization, carbon footprint, operational monitoring. Future scope.

The evidence layer is intentionally independent from execution. This is not a technical limitation — it is a compliance requirement. An auditor evaluating a RADAR evidence pack needs zero access to the execution environment. The Merkle-chain proof, the Ed25519 signature, and the structured finding records are independently verifiable. If the evidence layer and the execution layer shared a database, shared credentials, or were deployed as a single binary, that independence would be lost.
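As a rough illustration of why this kind of evidence can be checked without touching the execution environment, here is a minimal Python sketch of a hash-chained log. It assumes each record's hash covers the previous record's hash plus a canonical JSON payload. RADAR's actual Merkle-chain format is not specified in this article, and GENESIS, record_hash, build_chain, and verify_chain are hypothetical names. Ed25519 signature verification is left out because the Python standard library does not provide it.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first record


def record_hash(prev_hash: str, payload: dict) -> str:
    # Hash the previous digest together with a canonical JSON payload.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def build_chain(payloads):
    # Link each record to its predecessor through the stored "prev" digest.
    chain, prev = [], GENESIS
    for payload in payloads:
        digest = record_hash(prev, payload)
        chain.append({"payload": payload, "prev": prev, "hash": digest})
        prev = digest
    return chain


def verify_chain(chain) -> bool:
    # Recompute every digest from the exported data alone.
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev:
            return False
        if record_hash(prev, entry["payload"]) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True


events = [
    {"session": "7f42", "tool": "crm_lookup", "outcome": "approved"},
    {"session": "7f42", "tool": "billing_api", "outcome": "approved"},
]
chain = build_chain(events)
intact = verify_chain(chain)

# Simulate tampering on a copy: change an earlier payload, keep old hashes.
tampered = [dict(entry, payload=dict(entry["payload"])) for entry in chain]
tampered[0]["payload"]["outcome"] = "denied"
tampered_ok = verify_chain(tampered)

print(intact, tampered_ok)
```

The verifier needs only the exported records. In a signed scheme, the final hash would additionally be checked against a signature made with a key the execution layer does not hold.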

Self-hosted is the buying pattern, not the feature

For many regulated enterprises, evidence cannot leave the infrastructure boundary. This is rarely negotiable, because it is written into procurement policy. Financial institutions typically cannot send traces of customer interactions to a third-party cloud. Healthcare providers cannot export PHI to an external analysis platform without contractual and regulatory safeguards. Government agencies often require air-gapped deployment by contract.

RADAR's self-hosted model means the evidence pipeline — collection, storage, retention, export — runs entirely inside the customer's VPC, on-prem environment, or air-gapped network. The compliance team does not need to trust a cloud provider's access controls or data-retention policies. They own the hardware, the encryption keys, and the audit trail.

This is why self-hosted is not an enterprise add-on — it is the fundamental architecture. Every feature, from PII detection to SIEM forwarding, is designed to operate with zero outbound dependencies. In air-gapped mode, even license validation happens offline using embedded cryptographic keys.

Deploying RADAR in your own infrastructure takes a single Docker command. The evidence pipeline — collector, local store, review interface, and export engine — runs entirely inside your boundary from the first container start:

$ docker compose up radar
  Creating network "radar_default" with the default driver
  Creating radar_collector ... done
  Creating radar_store     ... done
  Creating radar_review    ... done
  Radar ready. Connect agents at http://localhost:8080

No cloud dependency. No license key required for the 30-day trial. No telemetry leaving your network. The evidence infrastructure belongs to you from day one.

Once deployed, connecting RADAR to an existing agent stack is a single CLI command. The collector begins recording traces immediately — every LLM call, every tool invocation, every policy evaluation captured as a structured, hashed event:

$ radar sources add --type gateway --name customer-support
  Source connected: customer-support
  Collecting traces from 3 endpoints (LLM gateway, CRM API, billing API)

$ radar status
  Sources:   1 connected
  Traces:    247 recorded (14.2/min)
  Findings:  12 (2 PII, 3 policy, 7 review)
  Status:    Healthy — all evidence hashes verified

The output is immediate and concrete: a compliance team can log into the RADAR review interface and see structured evidence records — not infrastructure logs, not engineer-written summaries, not screenshots. Each trace carries a Merkle-chain hash and an Ed25519 signature that an auditor can independently verify without access to the production environment.
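One way an auditor-side check could work is a Merkle inclusion proof: given only a published root hash, one trace, and a short list of sibling hashes, the auditor confirms the trace belongs to the evidence set. The Python sketch below is an assumption-laden illustration, not RADAR's proof format. The function names are hypothetical, the leaf and node prefixes follow the style of RFC 6962, and duplicating the last node on odd levels is a simplification.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves, index):
    """Return (root, proof); proof is a list of (sibling_hash, sibling_is_left)."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    # Prefix bytes keep leaf hashes and interior-node hashes distinct.
    level = [h(b"\x00" + leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # simplification: duplicate the last node
        sibling = index ^ 1  # neighbor within the same pair
        proof.append((level[sibling], sibling < index))
        level = [
            h(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
        index //= 2
    return level[0], proof


def verify_inclusion(leaf: bytes, proof, root: bytes) -> bool:
    # Rebuild the path from the leaf up to the root using only the proof.
    node = h(b"\x00" + leaf)
    for sibling, sibling_is_left in proof:
        if sibling_is_left:
            node = h(b"\x01" + sibling + node)
        else:
            node = h(b"\x01" + node + sibling)
    return node == root


# Hypothetical serialized traces; real records would be canonical bytes.
traces = [b"trace-1", b"trace-2", b"trace-3"]
root, proof = merkle_root_and_proof(traces, 2)
ok = verify_inclusion(b"trace-3", proof, root)
forged = verify_inclusion(b"trace-X", proof, root)
print(ok, forged)
```

The check uses no network access and no production credentials, which is the property the paragraph above describes.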

What independence unlocks for the buyer

An independent evidence layer changes the procurement conversation. Instead of asking "Can we trust the vendor's security?" the buyer asks "Can we verify the evidence independently?" The difference is subtle but critical for regulated purchasing:

Auditor trust: Evidence produced by an independent observer carries more weight than evidence produced by the system under audit. Auditing standards generally treat evidence from sources independent of the audited entity as more reliable.
Faster evaluation: Security teams can validate the evidence pipeline in a sandboxed environment without connecting to production agent infrastructure. The evidence layer is tested independently of the workflow.
Cleaner separation: When the compliance team needs to expand monitoring to new agent types, they update the evidence configuration — not the agent code. The evidence layer is a control plane, not a deployment dependency.

The bottom line

Regulated enterprises evaluating AI agent compliance should look for three things: independent evidence production, self-hosted deployment as the default architecture, and cryptographically verifiable audit trails. Anything less leaves a compliance gap that is likely to surface during the first auditor review.

RADAR was designed from day one as an independent evidence layer — not as a monitoring add-on to an existing execution product. That independence is the product.
