Building the Most Machine-Readable Security Architecture on the Internet

Cloudflare announced Markdown for Agents this week — a feature that converts HTML pages to markdown on the fly when an AI agent requests them. It is a smart solution to a real problem: AI agents waste tokens parsing HTML when they just need the content. Claude Code, OpenCode, and other coding agents already send Accept: text/markdown in their request headers, and Cloudflare intercepts that to serve clean markdown instead of raw HTML.

It got us thinking about what we have already built, and where we want to go.

The Problem Cloudflare Is Solving

Most websites serve HTML. HTML is designed for browsers, not machines. When an AI agent fetches a web page to answer a question, it receives navigation bars, footers, cookie banners, analytics scripts, and deeply nested div soup alongside the actual content. All of that consumes tokens, adds latency, and reduces the quality of the agent's understanding.

Cloudflare's solution is elegant: intercept the request at the edge, convert the HTML to markdown, and serve that instead. The result, by Cloudflare's figures, is roughly an 80% reduction in tokens for a typical page. No changes are required from the site operator beyond enabling the feature.

This is genuinely useful for the majority of websites where the only machine-readable format is HTML. But it is a general-purpose conversion — it does not understand the domain, the data model, or the relationships within the content. A markdown version of a security control page is better than HTML, but it is still just text.
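From the agent's side, the content negotiation behind Markdown for Agents comes down to one request header. The sketch below shows roughly what that looks like using only the Python standard library. The URL is a placeholder, the HTML fallback with a q-value is our own assumption about how a client might negotiate, and the helper names are illustrative rather than part of any Cloudflare or agent SDK.

```python
import urllib.request


def build_markdown_request(url: str) -> urllib.request.Request:
    """Build a GET request that prefers markdown and falls back to HTML."""
    return urllib.request.Request(
        url,
        # Assumption: the client accepts HTML as a lower-priority fallback.
        headers={"Accept": "text/markdown, text/html;q=0.8"},
        method="GET",
    )


def is_markdown(content_type: str) -> bool:
    """Return True if a Content-Type header value denotes markdown."""
    return content_type.split(";")[0].strip().lower() == "text/markdown"


# Placeholder URL; nothing is sent until urllib.request.urlopen(req) is called.
req = build_markdown_request("https://example.com/docs/page")
```

An agent would send the request, then use something like is_markdown on the response's Content-Type header to decide whether it received markdown or still needs to strip HTML itself.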

What OSA Already Provides

OSA was designed from the ground up as structured data. Every pattern, every control, every framework mapping, every threat model exists as validated JSON in a public Git repository. The website is a rendering layer on top of that data, not the source of truth.

This means AI agents do not need to parse our HTML at all. They can go straight to the structured data via our API.

Here is what is available today:

GET /api/v1/patterns — all 47 security patterns with metadata, control counts, and threat counts
GET /api/v1/patterns/{id} — full pattern detail including controls, threats, key control areas, examples, and references
GET /api/v1/controls — all 315 NIST 800-53 Rev 5 controls with framework cross-references
GET /api/v1/controls/{id} — individual control with all 21 framework mappings
GET /api/v1/frameworks — all 21 compliance frameworks with mapping counts and coverage data
GET /api/v1/frameworks/{id} — framework detail with paginated mapping data

The full specification is published as OpenAPI 3.1 and explorable via our API Explorer.
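As a rough illustration of how an agent might address these endpoints, here is a minimal client sketch. The host name is a placeholder (the real base URL is in the API documentation), the family query parameter is taken from the example later in this post, and using an identifier such as SP-027 as the {id} path segment is our assumption. The OpenAPI specification remains the authority on parameters and response shapes.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Optional

BASE_URL = "https://osa.example"  # placeholder host, not the real one
RESOURCES = {"patterns", "controls", "frameworks"}


def endpoint(resource: str, entity_id: Optional[str] = None, **params: str) -> str:
    """Build the URL for one of the read endpoints listed above."""
    if resource not in RESOURCES:
        raise ValueError(f"unknown resource: {resource}")
    path = f"/api/v1/{resource}"
    if entity_id is not None:
        # Percent-encode the id so it is safe as a single path segment.
        path += "/" + urllib.parse.quote(entity_id, safe="")
    query = "?" + urllib.parse.urlencode(params) if params else ""
    return BASE_URL + path + query


def fetch_json(url: str) -> Any:
    """GET a URL and decode its JSON body (this performs a network call)."""
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)
```

Keeping URL construction separate from the network call makes the addressing logic easy to test without touching the API.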

This is not a markdown approximation of HTML. It is the actual data model: the same JSON that the website reads at build time. An AI agent querying our API gets typed, structured, validated data with explicit relationships between entities. A control knows which patterns reference it. A pattern knows which threats it mitigates. A framework mapping knows exactly which NIST control it maps to and why.

Why Structured Data Beats Converted Markdown

Consider a practical example. A security architect asks an AI agent: "Which NIST 800-53 controls address supply chain risk, and which compliance frameworks require them?"

With markdown-converted HTML, the agent would need to fetch multiple pages, parse the text, and infer relationships from heading structure and proximity. It would work, roughly, with some hallucination risk.

With OSA's API, the agent can:

1. Query /api/v1/controls?family=SA to get all System and Services Acquisition controls
2. For each relevant control, read the frameworks field to see every compliance mapping across all 21 frameworks
3. Cross-reference with /api/v1/patterns to find which patterns implement those controls
4. Return a precise, structured answer with direct links to source data

No HTML parsing. No inferring relationships from layout. Far less room for hallucination, because the relationships are explicit in the data.
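The four steps can be sketched as a short function. To keep the example runnable offline, it works on invented sample data rather than live API responses: the framework and pattern identifiers are placeholders, and the field names (id, frameworks, controls) are assumptions about the response shape, so check them against the OpenAPI specification before relying on them.

```python
from typing import Any, Callable, Dict, List

# Invented sample data standing in for API responses; not real OSA mappings.
SAMPLE_CONTROLS = [
    {"id": "SA-4", "frameworks": {"framework-a": ["clause-1"], "framework-b": ["clause-7"]}},
    {"id": "SA-9", "frameworks": {"framework-a": ["clause-2"]}},
    {"id": "AC-2", "frameworks": {"framework-b": ["clause-3"]}},
]
SAMPLE_PATTERNS = [
    {"id": "pattern-x", "controls": ["SA-4", "AC-2"]},
    {"id": "pattern-y", "controls": ["SA-9", "SA-4"]},
]


def controls_by_family(family: str) -> List[Dict[str, Any]]:
    """Stand-in for GET /api/v1/controls?family=..."""
    return [c for c in SAMPLE_CONTROLS if c["id"].startswith(family + "-")]


def all_patterns() -> List[Dict[str, Any]]:
    """Stand-in for GET /api/v1/patterns"""
    return SAMPLE_PATTERNS


def answer_for_family(
    family: str,
    get_controls: Callable[[str], List[Dict[str, Any]]],
    get_patterns: Callable[[], List[Dict[str, Any]]],
) -> List[Dict[str, Any]]:
    controls = get_controls(family)          # step 1: controls in the family
    patterns = get_patterns()
    answer = []
    for control in controls:
        cid = control["id"]
        answer.append({
            "control": cid,
            "frameworks": sorted(control["frameworks"]),                       # step 2
            "patterns": sorted(p["id"] for p in patterns if cid in p["controls"]),  # step 3
            "source": f"/api/v1/controls/{cid}",                               # step 4
        })
    return answer


result = answer_for_family("SA", controls_by_family, all_patterns)
```

Every relationship in the result comes from a field lookup or a membership test. Swapping the two stand-in functions for real HTTP calls would leave the cross-referencing logic unchanged.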

Our Ambition

We want OSA to be the most machine-readable security architecture and controls resource on the internet.

That means every piece of security architecture knowledge we publish should be consumable by both humans and machines with equal fidelity. Not as a secondary format bolted onto a website, but as the primary representation from which everything else derives.

Our data model already supports this:

47 security patterns with structured threat models, control mappings, and key control areas
315 NIST 800-53 controls with cross-references to 21 compliance frameworks
9,600+ compliance mappings between NIST and international regulatory frameworks
Every entity linked — controls reference patterns, patterns reference threats, threats reference mitigating controls, frameworks reference control clauses

This is the kind of structured knowledge graph that AI agents can reason over directly. When an agent needs to understand the security implications of deploying an AI system, it can query SP-027 (Secure AI Integration) and SP-045 (AI Governance) and get back not just descriptions but the specific NIST controls, the specific threats, and the specific compliance requirements — all as machine-readable JSON.

What We Are Building Next

The API we have today covers read access to patterns, controls, and frameworks. We are working toward:

Threat analysis endpoints — submit an architecture description, get back a structured threat model mapped to NIST controls
Gap analysis — compare your maturity scores against control requirements and get prioritised remediation recommendations
Compliance mapping queries — ask which controls satisfy a specific regulatory requirement across any combination of the 21 mapped frameworks
Machine-readable assessment data — export maturity assessments as structured JSON for integration with GRC platforms

Each of these builds on the structured data layer. They are not search features that parse text — they are graph queries over a typed data model.

For Agent Developers

If you are building AI agents that need security architecture knowledge, our API is designed for you. The API documentation covers authentication, rate limits, and field-level descriptions. The OpenAPI specification can be imported directly into agent tool definitions.
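To make the last point concrete, here is one way an OpenAPI document could be turned into tool definitions. It is a sketch under assumptions: the spec fragment is hand-written for illustration (the operationId and summary are invented, not copied from our published spec), and the output is a generic name/description/JSON Schema shape rather than the format of any particular agent framework.

```python
from typing import Any, Dict, List

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}

# Hand-written illustrative fragment; the real spec may differ in every detail.
SPEC_FRAGMENT: Dict[str, Any] = {
    "paths": {
        "/api/v1/patterns/{id}": {
            "get": {
                "operationId": "getPattern",
                "summary": "Full pattern detail",
                "parameters": [
                    {"name": "id", "in": "path", "required": True,
                     "schema": {"type": "string"}},
                ],
            }
        }
    }
}


def tools_from_spec(spec: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Convert each OpenAPI operation into a generic tool definition."""
    tools = []
    for path, operations in spec.get("paths", {}).items():
        for method, op in operations.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as path-level "parameters"
            params = op.get("parameters", [])
            tools.append({
                "name": op.get("operationId") or f"{method}_{path}",
                "description": op.get("summary", ""),
                "method": method.upper(),
                "path": path,
                # Parameters expressed as a JSON Schema object.
                "parameters": {
                    "type": "object",
                    "properties": {p["name"]: p.get("schema", {}) for p in params},
                    "required": [p["name"] for p in params if p.get("required")],
                },
            })
    return tools


tools = tools_from_spec(SPEC_FRAGMENT)
```

This sketch does not resolve $ref pointers or request bodies, which a production importer would need to handle.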

The data is also available as raw JSON files in the osa-data repository for offline use, fine-tuning datasets, or RAG pipelines. Everything is open source under CC BY-SA 4.0.
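For the offline route, a RAG pipeline typically starts by walking the files and turning each entity into a retrievable chunk. The sketch below assumes only that a local checkout contains .json files with one JSON object per entity. The directory layout and the chunk format are our assumptions for illustration, not a description of how the repository is organised.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterator, Tuple


def iter_entities(root: Path) -> Iterator[Tuple[Path, Dict[str, Any]]]:
    """Yield (path, parsed object) for every .json file under root."""
    for path in sorted(root.rglob("*.json")):
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
        if isinstance(data, dict):  # this sketch skips files that hold arrays
            yield path, data


def to_chunk(path: Path, entity: Dict[str, Any]) -> Dict[str, str]:
    """Turn one entity into a retrieval chunk, keeping its source path."""
    return {
        "source": path.as_posix(),
        # Stable key order keeps chunk text deterministic across runs.
        "text": json.dumps(entity, sort_keys=True, ensure_ascii=False),
    }
```

Keeping the source path on each chunk lets an agent cite where an answer came from, which also helps with attribution under the CC BY-SA licence.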

Cloudflare's Markdown for Agents is a step in the right direction. But the destination is not converted markdown — it is structured, typed, validated data designed for machine consumption from the start. That is what we are building.

Explore the API | Read the OpenAPI spec | Browse the data on GitHub

Russell Wing — Co-founder, Open Security Architecture