The Red Team Is Your Expensive Confidence Interval

Most red-team engagements measure organisational readiness against a calendar-stamped threat model that will be obsolete before the report is bound—and worse, they train your defenders to succeed against last year's adversary, not the one arriving next month.

The structural problem is not skill or effort. The fault lies in architecture: red-team engagements, as practised across the Fortune 500 and critical infrastructure operators, are fundamentally episodic exercises bolted onto static infrastructure. They produce a point-in-time risk score, a stack of findings graded by CVSS and remediation priority, and a compliance checkbox that satisfies the audit function. What they do not produce is adaptive resistance.

This is not new criticism. But what has shifted is the cost of being wrong—and the degree to which major incidents now expose that traditional red-team engagements actively prepare organisations to fail against the exact adversaries they claim to test against.

The Industry Narrative: Red-Team Engagement as Ritual Validation

The published discourse around red-team maturity is, on its surface, mature. Frameworks exist: the NIST Cybersecurity Framework (CSF) Govern function now explicitly covers red-teaming; CREST's CPSA certification codifies rules of engagement; SANS and GIAC publish standards for red-team methodology; and the UK National Cyber Security Centre (NCSC) has published a taxonomy of engagement models (adversary emulation, assumed breach, tabletop).

Regulators have begun mandating formal red-team activity. The US SEC's incident disclosure rules (adopted in July 2023, with Form 8-K disclosure due within four business days of determining that an incident is material, and compliance required from December 2023) have created indirect pressure on public companies to demonstrate continuous threat posture assessment. The EU's NIS2 Directive requires essential and important entities to assess the effectiveness of their security measures, and red-teaming is one way of doing so. The FCA's Senior Managers & Certification Regime (SM&CR) holds named executives accountable for operational resilience, pushing UK financial institutions toward more aggressive testing regimens. Singapore's Monetary Authority (MAS) Technology Risk Management (TRM) Guidelines and Australia's APRA CPS 234 both call for security testing. New York's NYDFS cybersecurity regulation (23 NYCRR Part 500) likewise requires regular penetration testing.

Yet the incidents that follow—even at organisations with annual red-team engagements—suggest the model is broken.

Consider Change Healthcare's 2024 compromise, in which a threat actor used stolen credentials to log in to a Citrix remote-access portal that lacked multi-factor authentication, then moved across a sprawling, poorly segmented infrastructure. The organisation likely had red-team testing within the prior 12 months; public accounts of the incident point less to insufficient security testing than to architectural sprawl so severe that a single foothold could cascade across trust boundaries. Or consider the Snowflake customer compromises of 2024, in which credentials harvested by infostealer malware were used to access many customers' data warehouses where multi-factor authentication was not enforced. The platform itself was not reported as breached; the fault sat in the shared-responsibility boundary between vendor and customer. A red-team engagement scoped to an individual customer's corporate network would probably not have exposed it, because third-party SaaS tenants are a surface that red-teamers often lack the visibility or authority to test.

More pointedly, the Synnovis ransomware incident (June 2024) that disrupted NHS blood-testing services across London did not succeed because defenders missed some undetected MITRE ATT&CK technique. The organisation operated standard EDR, SIEM, and network monitoring. The attack succeeded because the network architecture permitted a single compromised user account to traverse critical systems with insufficient lateral segmentation. Red-team testing, which typically validates whether defenders can detect and respond to a simulated breach, trained the organisation to respond faster, not to prevent the breach from cascading. The attacker was eventually detected, but only after the damage had spread.

The M&S incident attributed to Scattered Spider (early 2025, public reporting ongoing), initially framed as a social engineering and API exploitation attack, exposed that legacy authentication mechanisms (multi-factor authentication applied at the perimeter, but not at critical API endpoints) permitted step-down from an initially compromised credential. This is precisely the type of finding red-teamers should surface, but only if their engagement model permits them to test continuously rather than episodically. A single engagement in Q3 2024 would have captured the issue as a finding; an incident in early 2025 suggests that the finding was either remediated and then regressed (likely via change management drift), or that remediation was never executed.

These are not exotic failures. They are architectural.

Why Standard Red-Team Engagements Deepen the Problem

The conventional red-team engagement follows a predictable sequence: a firm (internal or external) is engaged to conduct a four-week assessment against a defined scope (a production infrastructure, a subset of applications, a critical user population). Rules of engagement are negotiated (typically with legal and GRC teams, not operators). A threat model is aligned with stakeholder expectations—usually a composite of MITRE ATT&CK techniques mapped to the organisation's risk register. Red-teamers execute, often with a degree of adversary realism negotiated downward by risk-averse security leaders. Findings are triaged, a report is delivered, and remediation is scheduled into a quarterly backlog.

The anti-patterns embedded in this model are several, and they compound:

Episodic assessment against continuous adversary drift. A four-week or eight-week engagement captures infrastructure state at a single point in time. Changes to network architecture, credential rotation, software patching, or cloud configuration drift are not visible to the engagement team unless they occur during the testing window itself. The organisation's threat model evolves continuously (new integrations, new SaaS dependencies, new supply-chain relationships), but the red-team baseline does not. The time lag between engagement completion and incident occurrence creates a window of unknown and unmeasured exposure.

Scope negotiation that excludes precisely the surfaces where breaches occur. Red-team engagements are constrained by legal liability, operational risk (testing in production is expensive), and executive comfort. Cloud infrastructure, third-party APIs, supply-chain dependencies, and shared infrastructure (like Snowflake, or shared cloud control planes) are typically excluded or lightly tested, because they fall outside the organisation's direct control. Yet this is where major incidents originate—not in carefully maintained corporate networks, but in the fuzzy boundaries between your infrastructure and your vendors' infrastructure.

Training defenders for detection, not prevention. The standard engagement creates a feedback loop in which defenders optimise for speed of detection and response rather than prevention of lateral movement. This is not a bug in the red-team model; it is the intended outcome. But it is precisely backwards. An organisation that prevents 80% of breaches from cascading is more resilient than one that detects 99% of breaches in 15 minutes. The former improves architecture; the latter optimises tooling. Standard red-team engagements reward the latter.

Compliance checkbox over operational reality. Once an engagement is complete and findings are filed, the report becomes regulatory evidence. It satisfies the CISO's quarterly board narrative, the auditor's NIS2 compliance tick, the regulator's DORA testing requirement. But the findings degrade in relevance the moment the report is archived. The organisation has fulfilled its obligation to test; the obligation to remain tested is left unaddressed. Remediation backlogs grow, findings slip, and the next incident exposes issues that were identified eighteen months earlier but never remediated.

What Structural Failure Modes Do These Anti-Patterns Expose?

The unifying failure is this: red-team engagements presume static infrastructure and measure readiness against a point-in-time threat model.

This is a fundamental mismatch with how actual breaches occur. Incident response analyses (SEC disclosures, regulatory findings, post-mortems by vendors like Mandiant, CrowdStrike, and Rapid7) consistently show that major breaches exploit:

Lateral movement enabled by permissive trust boundaries. The adversary moves horizontally within an organisation not by discovering novel exploits, but by leveraging normal network connectivity and credential policies that were designed for operational efficiency, not security isolation.

Configuration drift and regressed remediations. Findings from prior engagements are remediated, then regressed during system updates or migrations. No continuous measurement exists to detect the regression.

Supply-chain and third-party blind spots. The organisation's perimeter is now diffuse—APIs, cloud instances, vendor-managed infrastructure, shared tenancies. Red-team engagements that do not model adversaries already inside third-party infrastructure miss the most realistic threat vectors.

Social and authentication boundaries that are porous in practice. Multi-factor authentication at the VPN boundary does not matter if the adversary has obtained a valid API token that does not require re-authentication. Red-team engagements that do not continuously probe authentication architecture will miss these step-down opportunities.
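
The regression failure mode above lends itself to a small illustration. The following is a minimal sketch, not a description of any particular product: it assumes that each remediated finding can be expressed as a re-runnable check against a configuration snapshot. The `Finding` schema, the finding IDs, and the configuration keys (`api_token_ttl_seconds`, `admin_api_mfa`) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class Finding:
    """A remediated finding paired with a re-runnable check (hypothetical schema)."""
    finding_id: str
    description: str
    still_fixed: Callable[[Dict[str, Any]], bool]


def regressed_findings(findings: List[Finding], config: Dict[str, Any]) -> List[str]:
    """Return the IDs of findings whose remediation no longer holds for this config."""
    return [f.finding_id for f in findings if not f.still_fixed(config)]


# Hypothetical findings that were closed after a past engagement.
# A missing key is treated as a regression rather than silently passing.
FINDINGS = [
    Finding(
        "RT-001",
        "API tokens must expire within one hour",
        lambda c: 0 < int(c.get("api_token_ttl_seconds", 0)) <= 3600,
    ),
    Finding(
        "RT-002",
        "MFA must be enforced on the internal admin API",
        lambda c: c.get("admin_api_mfa") is True,
    ),
]

# Assumed snapshot taken after a migration that quietly reset the token lifetime.
snapshot = {"api_token_ttl_seconds": 86400, "admin_api_mfa": True}
print(regressed_findings(FINDINGS, snapshot))  # ['RT-001']
```

Run on every configuration change rather than once a year, a check of this kind turns a closed finding into a standing measurement, which is the property the episodic model lacks.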

The standard response from the industry is more—more frequent engagements, more sophisticated threat modelling, more MITRE ATT&CK technique coverage. But this merely scales the problem. You cannot test your way out of an architectural fault.

Reframing Red-Team Engagement Through Zero-Knowledge Substrate Design

The PULSE doctrine begins from a different premise: organisations that hold or transfer the world's data and currency should design their infrastructure such that breach does not imply compromise of data or loss of function.

This is not achievable through detection-and-response alone. It requires re-architecting the substrate itself.

If red-team engagement is to be valuable, it must be continuous, adaptive, and embedded in the control plane—not scheduled as a separate activity. This requires:

Post-breach resistance via data-plane isolation. The infrastructure should be designed such that compromise of a user account, an API token, or even a network segment does not automatically permit lateral movement to critical systems. This is not a network segmentation or zero-trust network access (ZTNA) problem, though those help. It is a data-plane problem: systems handling sensitive data should be architected to assume every interaction originates from a potentially compromised caller. The adversary model should be insider threat, not external attacker. This changes the surface that red-teamers test.

Continuous adversarial posture adjustment. Rather than a four-week engagement followed by quiet, infrastructure should incorporate mechanisms that continuously adjust to adversarial pressure. This includes automated deployment of defensive primitives (certificate pinning, API rate-limiting, anomaly-based request blocking) in response to observed attack patterns. The red team's role shifts from discovering vulnerabilities to generating continuous perturbation against which the system adapts. This is closer to chaos engineering than to traditional penetration testing.

Domain-specific automation engineered into the substrate. Legacy red-team engagements test generic frameworks (identity and access management, network segmentation, endpoint detection). Domain-specific controls—those tailored to the organisation's actual data flows—should be automated into the substrate itself. A financial institution's red-team engagement should focus on continuous testing of transaction isolation, not generic lateral movement. A healthcare operator's engagement should focus on patient record isolation, not MITRE ATT&CK TTPs. The test surface becomes domain-specific, not generic.

Zero-knowledge substrate design—you cannot steal what is not there. The substrate itself should be architected such that critical data is never present in forms that a compromised endpoint can exfiltrate. This may mean encryption at rest with keys held separately, data fragmentation across trust boundaries, or synthetic data injection to mask real patterns. Red-team engagement becomes less about detecting exfiltration and more about validating that exfiltration yields no useful intelligence. This inverts the traditional test.
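
One of the options named above, data fragmentation across trust boundaries, can be illustrated with simple XOR secret splitting. This is a sketch of a generic technique, chosen here for illustration; the source does not say which mechanism PULSE uses. It is an n-of-n scheme: every share is required to rebuild the record, and any smaller set of shares is statistically independent of it. The function names and the sample record are hypothetical.

```python
import secrets
from typing import List


def split_secret(secret: bytes, n_shares: int) -> List[bytes]:
    """Split `secret` into n_shares fragments; all of them are needed to rebuild it."""
    if n_shares < 2:
        raise ValueError("need at least two shares")
    # n-1 shares are pure randomness of the same length as the secret.
    shares = [secrets.token_bytes(len(secret)) for _ in range(n_shares - 1)]
    # The last share is the secret XORed with every random share.
    last = secret
    for share in shares:
        last = bytes(a ^ b for a, b in zip(last, share))
    shares.append(last)
    return shares


def combine_shares(shares: List[bytes]) -> bytes:
    """XOR all shares together to recover the original secret."""
    result = bytes(len(shares[0]))  # all-zero buffer of the right length
    for share in shares:
        result = bytes(a ^ b for a, b in zip(result, share))
    return result


# Hypothetical record, with each share stored in a separate trust boundary.
record = b"patient-id:12345"
fragments = split_secret(record, 3)
assert combine_shares(fragments) == record
```

An endpoint holding only one or two of the three fragments holds bytes that look like random noise, which is the property a red team would then try to falsify. The trade-off is availability: losing any one share loses the record, so a production design would more likely use a threshold scheme.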

Operationalising Continuous Red-Team Pressure

In practice, this means the red-team engagement model should shift:

From episodic to continuous. Rather than a four-week engagement followed by remediation and silence, organisations should maintain a small, permanent red-team capability (internal or contracted) that operates continuously within a defined threat model that evolves quarterly. The threat model should be threat-intelligence-driven, not compliance-driven.

From scope-negotiation to principle-based testing. Instead of negotiating which systems are in-scope, define testing principles: "All customer-facing APIs are continuously tested for authentication step-down"; "All data at rest is tested for exfiltration resistance"; "All user-to-system interactions are tested for lateral movement propagation." The scope becomes principle-driven, not risk-driven.

From detection-centric to architecture-centric. Reframe the engagement success metric from "mean time to detection" to "prevention rate—the percentage of attack paths that fail to reach their objective due to architectural constraint rather than detection."

From internal knowledge to external validation. Engage external red-team partners periodically (semi-annually, annually) not as a compliance exercise, but as a fresh-eyes validation that the continuous internal programme is not missing entire threat classes. The external engagement should focus on assumptions—the things the internal team may have stopped testing because they are "already built that way."
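
The prevention-rate metric described above can be computed directly from engagement records. The sketch below assumes that each attempted attack path is logged with one of three outcomes; the `Outcome` categories, the `AttackPath` type, and the sample paths are hypothetical, not a standard taxonomy.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class Outcome(Enum):
    BLOCKED_BY_ARCHITECTURE = "blocked_by_architecture"  # path was structurally impossible
    STOPPED_BY_DETECTION = "stopped_by_detection"        # defenders caught and contained it
    REACHED_OBJECTIVE = "reached_objective"              # attacker got what they came for


@dataclass(frozen=True)
class AttackPath:
    name: str
    outcome: Outcome


def prevention_rate(paths: Iterable[AttackPath]) -> float:
    """Fraction of attack paths that failed due to architectural constraint alone."""
    recorded = list(paths)
    if not recorded:
        raise ValueError("no attack paths recorded")
    blocked = sum(1 for p in recorded if p.outcome is Outcome.BLOCKED_BY_ARCHITECTURE)
    return blocked / len(recorded)


# Hypothetical results from one quarter of continuous testing.
results = [
    AttackPath("token replay against payments API", Outcome.BLOCKED_BY_ARCHITECTURE),
    AttackPath("helpdesk account to records store", Outcome.BLOCKED_BY_ARCHITECTURE),
    AttackPath("phished laptop to file server", Outcome.STOPPED_BY_DETECTION),
    AttackPath("vendor VPN to build system", Outcome.REACHED_OBJECTIVE),
]
print(prevention_rate(results))  # 0.5
```

Paths stopped by detection are deliberately excluded from the numerator, so a faster SOC cannot raise the score; only a change to the architecture can.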

This is operationally demanding. It requires budget, technical sophistication, and executive patience with a security model that does not optimise for quarterly reports. But it is the only model consistent with the claim that an organisation has "tested itself" against actual adversaries.

Closing: The Intellectual Demand

If your organisation conducts annual red-team engagements and considers itself tested, you are not behind your adversaries by weeks or months—you are behind by architectural design.

Qualified operators seeking to align their security programme with continuous, architecture-centric red-team engagement and post-breach resistance design should request a briefing under executed Mutual NDA to discuss the PULSE doctrine in detail.

PULSE engages only with verified counterparties. Strategic briefing material — reference architecture, regulatory mapping, deployment topology — is released after counter-execution of the NDA scoped to the recipient's evaluation purpose.
