Service Accounts Are the Forgotten Identity — And They Have the Keys

The Industry Narrative: Service Accounts as a Compliance Checkbox

Service accounts (non-human identities that enable applications, batch jobs, scheduled tasks, and inter-system communication) have become the forgotten middle layer of identity governance. The industry narrative is consistent: service accounts are pervasive, poorly managed, and increasingly exploited as lateral-movement pivots once attackers establish a foothold. Yet investment in their governance remains fragmentary. A 2023 Gartner survey found that 74% of organisations could not fully inventory their service accounts across all systems. The SANS Cyber Defence Essentials course (Section 5.3, Identity & Access Management) treats service account discovery as foundational security hygiene: place the accounts in dedicated organisational units, enforce unique passphrases, rotate credentials quarterly, audit their identities and permissions against the NIST CSF 1.1 access-control subcategories PR.AC-1 and PR.AC-4, and log all authentication events to a SIEM. The formula is straightforward and, crucially, auditable. Compliance frameworks all mandate that "all users" must be uniquely identified and authenticated: ISO/IEC 27001:2022 Annex A controls 5.16 (Identity Management) and 5.17 (Authentication Information), NIST SP 800-53 IA-2 (Identification and Authentication, Organizational Users), DORA Article 9 (Protection and Prevention), and NIS2 Article 21(2)(i) (access control policies). When auditors ask "show me your service accounts," organisations produce spreadsheets, demonstrate logon event volumes in Splunk or Elastic, and receive the tick in the box.

The problem emerges when real breaches occur and investigators replay the logs. In the MOVEit zero-day exploitation chain (CVE-2023-34362, June 2023), Progress Software's MOVEit Transfer managed file-transfer product was compromised by SQL injection; attackers then pivoted laterally using compromised service account credentials stored in environment variables and plaintext configuration files. Forensic examination by Mandiant revealed that the service account password had not been rotated in two years, and that its permission scope, tied to a shared database read-write role, granted access far beyond the minimum required by any individual job. The account was in use across 47 disparate data integrations. Similar patterns emerged in the Change Healthcare extortion event (February 2024): attackers leveraged compromised service accounts within Optum's supply-chain gateway to traverse the healthcare clearinghouse network, escalating from the compromised VPN session into backend billing and claims systems. Service account logon events were present in the Windows Event Log; the account had appropriate baseline permissions in Active Directory; it passed every static compliance check. Yet it held the keys to the kingdom.

The NHS Synnovis ransomware campaign (June 2024), attributed to the Qilin gang, followed a similar trajectory: threat actors exploited a series of vulnerabilities in the cloud-based pathology and test-ordering platform, obtaining service account credentials from source-code repositories and configuration management systems, then used those accounts to propagate throughout the shared hosting environment and exfiltrate patient data across multiple NHS trusts before encryption. Post-incident forensics confirmed that these service accounts existed, were logged, and had been "managed" under the organisation's IAM policy. Yet nobody had asked whether a single service account should simultaneously hold database replication rights, backup restoration permissions, and direct access to encrypted patient records.

The Structural Failure Mode: Service Accounts as Unmanageable Blast Radius

Here lies the crux: the industry approach to service account governance treats them as a static permission boundary problem. That is, if you assign a service account the "correct" set of permissions—and audit that assignment quarterly—then you have solved the problem. The NIST CSF, ISO 27001, and DORA compliance regimes all rest on this assumption. It is false.

A service account, once compromised, does not respect the boundaries of its assigned role. It respects only the technical capabilities of the underlying system. If a Windows service account holds membership in the "Database Administrators" group, it can enumerate that group, query the directory, manipulate SQL Agent jobs, read backup files, and—critically—maintain persistence by modifying its own group membership or creating scheduled tasks. If a Kubernetes service account token is leaked, it can be used from any network location at any time; RBAC controls are binary (allow/deny) and do not degrade in confidence over time or adapt to context. If a cloud service principal (Azure, AWS, GCP) is compromised, attackers inherit its policy attachment scope instantaneously. The attacker does not need to elevate the privilege; it is already there.

The problem compounds because service accounts are stateless from a trust perspective. Human users leave traces that stand out (failed logons, unusual geolocation, atypical login hours, anomalous API call patterns); service accounts are "supposed" to behave consistently. They do not travel. They do not take holidays. They run the same batch job at 02:00 UTC every Tuesday. Anomaly detection, the linchpin of modern SIEM strategy (Splunk, Elastic, Microsoft Sentinel, Datadog), becomes near-useless: any deviation from baseline (a slightly larger dataset, a retry loop, a failover scenario) is either absorbed as "normal" (a false negative) or raised as a noisy alert that drowns among thousands of others. The Scattered Spider campaign against MGM Resorts and Caesars Entertainment (2023) succeeded in part because service account credentials in cloud environments (AWS IAM users, Azure service principals) were exfiltrated early and then used for lateral movement that appeared, to the environment's native logging and monitoring layer, consistent with historical baseline behaviour.

The underlying architectural issue is that the industry has no mechanism to reduce the blast radius of a compromised service account once it exists in the environment. The remediations on offer (better password vaults such as HashiCorp Vault, CyberArk, or Delinea Secret Server; more frequent rotation; tighter RBAC; behaviour analytics) all assume the service account is still running somewhere, with its permissions still intact. They are incremental hardening, not structural redesign. When a service account password is compromised (as it inevitably is, given the attack surface: repositories, configuration files, logs, memory dumps, supply-chain artifacts, cloud metadata services), the blast radius is the union of all systems that accept that credential.

The PULSE Reading: Zero-Knowledge Service Account Architecture

The PULSE doctrine inverts the problem: instead of assuming service accounts must exist in their traditional form and must be "managed" more tightly, it asks whether the account itself—the persistent identity that holds rights and credentials—needs to exist at all.

The first principle is substrate-level ephemeral identity. Rather than issuing a service account a static credential (a password, an SSH key, a token) that persists for days or months, the infrastructure issues time-bound, function-scoped capabilities that exist only for the duration of the operation. Consider a batch data-integration job that needs to read from a source database and write to a destination data warehouse. In the traditional model, a service account "batch-integration-prod" is provisioned with read access to the source and write access to the destination; this account exists in perpetuity. In a zero-knowledge architecture, the job is issued—at invocation time—a cryptographically signed, attribute-based capability token that encodes: (i) the source and destination endpoints (and only those endpoints); (ii) the operation type (SELECT from source, INSERT to destination); (iii) a hard expiry (e.g., 5 minutes); (iv) the invoking job ID and container image hash; and (v) a proof of invocation from the orchestration layer (e.g., Kubernetes operator, AWS Batch, HashiCorp Nomad). The underlying databases do not have an "account" to compromise; they have a cryptographic proof that a specific job, authenticated by the orchestrator, has permission to perform a specific operation for a bounded time window.
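The five token fields described above can be made concrete with a minimal sketch. It is an illustration under stated assumptions, not a reference implementation: the function names, the key, and the claim layout are hypothetical, and HMAC-SHA256 from the Python standard library stands in for the asymmetric signature a real orchestrator would use.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical signing key held by the orchestrator. HMAC is a stdlib
# stand-in here; a real deployment would sign with a private key.
ORCHESTRATOR_KEY = b"demo-only-signing-key"


def issue_capability(job_id, image_hash, source, destination, ttl_seconds=300, now=None):
    """Mint a signed capability carrying fields (i) to (iv); the signature is (v)."""
    issued_at = int(time.time() if now is None else now)
    claims = {
        "operations": {source: "SELECT", destination: "INSERT"},  # (i) and (ii)
        "expires_at": issued_at + ttl_seconds,                    # (iii)
        "job_id": job_id,                                         # (iv)
        "image_hash": image_hash,                                 # (iv)
    }
    payload = json.dumps(claims, sort_keys=True).encode()
    signature = hmac.new(ORCHESTRATOR_KEY, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode()
            + "." + base64.urlsafe_b64encode(signature).decode())


def authorize(token, endpoint, operation, now=None):
    """Allow a call only if signature, expiry, endpoint and operation all match."""
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:  # malformed token (binascii.Error is a ValueError)
        return False
    expected = hmac.new(ORCHESTRATOR_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return False
    claims = json.loads(payload)
    current = time.time() if now is None else now
    if current >= claims["expires_at"]:
        return False
    return claims["operations"].get(endpoint) == operation
```

In this sketch the database-side check never looks up an account: `authorize(token, "source-db", "SELECT")` succeeds only inside the five-minute window, and the same token presented to an endpoint it does not name is refused.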

The second principle is data-plane and control-plane separation. Service account credentials today are "universal keys": they authenticate the account and authorise every operation the account is permitted to perform. This collapses two distinct concerns, so when a service account credential is exfiltrated, attackers gain entry to both planes simultaneously. A zero-knowledge substrate splits them: the control plane (the identity and role-binding system) is logically isolated from the data plane (the systems that actually process the data). In this disaggregated architecture, the control plane is a hardened, stateless attestation layer (conceptually similar to a root certificate authority, but distributed and continuous). When a service job invokes an operation on the data plane, authenticating is not enough: it must prove cryptographically that the control plane has delegated that specific capability at this specific time. Data-plane systems need not maintain a "live" connection to the control plane; they validate the proof of delegation offline, using a cached public key that is refreshed asynchronously, so a delegation the control plane has revoked or updated stops validating once the cache is refreshed.
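Offline validation against a cached key can be sketched as follows. This is a simplified model, assuming hypothetical `ControlPlane` and `DataPlaneVerifier` classes and using HMAC as a standard-library stand-in for the public-key signature the text describes; in a real system the data plane would cache only the public half of the key.

```python
import hashlib
import hmac


class ControlPlane:
    """Hypothetical attestation layer that signs delegations under a rotating key."""

    def __init__(self):
        self._keys = {"k1": b"demo-key-1"}
        self.current_kid = "k1"

    def delegate(self, capability):
        key = self._keys[self.current_kid]
        sig = hmac.new(key, capability.encode(), hashlib.sha256).hexdigest()
        return {"kid": self.current_kid, "capability": capability, "sig": sig}

    def rotate(self, new_kid, new_key):
        """Retire all older keys, which revokes every delegation signed under them."""
        self._keys = {new_kid: new_key}
        self.current_kid = new_kid

    def published_keys(self):
        return dict(self._keys)


class DataPlaneVerifier:
    """Validates proofs of delegation against a cached copy of the key set."""

    def __init__(self, control_plane):
        self._control_plane = control_plane
        self._cached_keys = control_plane.published_keys()

    def refresh(self):
        # Meant to run asynchronously (for example on a timer), never on the request path.
        self._cached_keys = self._control_plane.published_keys()

    def verify(self, proof):
        key = self._cached_keys.get(proof["kid"])
        if key is None:
            return False  # signed under a key the control plane has retired
        expected = hmac.new(key, proof["capability"].encode(), hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, proof["sig"])
```

One design consequence is visible in the sketch: between a rotation and the next `refresh()`, the verifier still accepts proofs signed under the old key, so the refresh interval bounds how long a revoked delegation can remain usable.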

The third principle is adaptive adversarial posture within the data plane. Once a service capability is issued, its effective scope must degrade over time and adjust based on context. If a batch integration job is expected to read 10 GB of data in under 5 minutes and it suddenly attempts to read 100 GB, the data-plane gatekeeper (not a SIEM, but the system itself) restricts the read and lowers the confidence level attached to the capability. If a job running in region A suddenly attempts to authenticate from region B, the capability is automatically narrowed (or suspended, pending control-plane re-validation). This is not anomaly detection layered on top of logging; it is architectural enforcement. The capability is not silently revoked; the job receives a constrained capability with a higher re-validation cost. This forces attackers either to work within the narrowed scope (making their activity more visible and bounded) or to re-authenticate, triggering a new control-plane decision.
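The volume and region checks described above might look like the following gatekeeper. The class names, the 0.5 threshold, and the halving rule are illustrative assumptions chosen for this sketch; the text does not specify how confidence is scored.

```python
from dataclasses import dataclass


@dataclass
class Capability:
    expected_bytes: int      # volume the job declared when the capability was issued
    home_region: str         # region the orchestrator launched the job in
    confidence: float = 1.0  # 1.0 = fully trusted; low values force re-validation
    bytes_read: int = 0


class Gatekeeper:
    """Hypothetical enforcement point that sits in the data plane, not in a SIEM."""

    REVALIDATE_BELOW = 0.5  # assumed threshold, for illustration only

    def request_read(self, cap, num_bytes, region):
        """Return the bytes granted; 0 means the job must re-validate first."""
        if region != cap.home_region:
            cap.confidence = 0.0  # wrong region: suspend pending re-validation
        if cap.confidence < self.REVALIDATE_BELOW:
            return 0
        remaining = cap.expected_bytes - cap.bytes_read
        granted = max(0, min(num_bytes, remaining))
        if granted < num_bytes:
            # The job asked for more than it declared: serve only the declared
            # budget and halve confidence so further overreach is cut off.
            cap.confidence *= 0.5
        cap.bytes_read += granted
        return granted
```

With a 10 GB budget, a 100 GB request is served only up to the declared volume and confidence drops; the next over-budget request pushes confidence below the threshold, after which every read returns 0 until the control plane issues a fresh capability.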

Designing Service Capabilities in a Post-Account World

In concrete terms, this translates to architectural patterns that organisations should begin implementing now, particularly those managing regulated infrastructure or handling sensitive data.

First: Repository-as-Source-of-Truth for Service Capability Definitions. Rather than storing service account credentials in vaults or configuration files, encode the job's needs (source systems, operations, data classification) in the source repository as declarative specifications (e.g., a YAML manifest). At invocation time, the orchestrator (Kubernetes, AWS ECS, Nomad) generates a short-lived credential that is injected into the job's runtime environment and is valid only for the duration of the job run. The credential itself is not reusable; it is cryptographically bound to the job invocation. Tools like Workload Identity Federation for GKE, IAM Roles for Service Accounts (IRSA) on Amazon EKS, and Microsoft Entra Workload ID on Azure already implement this pattern, but adoption is inconsistent and often misconfigured (a separate problem). The principle is: no credential should survive the job that created it.

Second: Cryptographic Proof-of-Ownership for Cross-System Operations. When a batch job needs to interact with multiple systems (a source database, a message queue, a data warehouse, an object store), it should not hold separate credentials for each. Instead, it holds a single, time-bound attestation token signed by the orchestrator. Each downstream system validates the token's cryptographic signature (using a public key it has cached or can refresh from a trusted source) rather than validating a username and password. If the token is intercepted in transit, it cannot be replayed against systems it does not name or reused after its expiry. This is the design principle behind mTLS (mutual TLS) in service meshes like Istio, but extended to the application layer: every inter-system call must carry proof that the orchestrator has authorised it at invocation time, not just that a pre-provisioned account exists.
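A sketch of how each downstream system might check a shared attestation token follows. The claim names `aud` and `exp` are borrowed from JWT conventions for familiarity; the token format, key, and class are hypothetical, and HMAC again stands in for a public-key signature.

```python
import hashlib
import hmac
import json

# Hypothetical orchestrator key; a real deployment would use a key pair so
# that downstream systems hold only the public half.
ORCHESTRATOR_KEY = b"demo-only-signing-key"


def sign_attestation(job_id, audiences, expires_at):
    """Sign one token naming every system the job may call."""
    claims = {"job_id": job_id, "aud": sorted(audiences), "exp": expires_at}
    body = json.dumps(claims, sort_keys=True)
    signature = hmac.new(ORCHESTRATOR_KEY, body.encode(), hashlib.sha256).hexdigest()
    return body, signature


class DownstreamSystem:
    """A database, queue, or object store that validates tokens, not passwords."""

    def __init__(self, name):
        self.name = name

    def accept(self, body, signature, now):
        expected = hmac.new(ORCHESTRATOR_KEY, body.encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, signature):
            return False  # forged or altered token
        claims = json.loads(body)
        if now >= claims["exp"]:
            return False  # expired: an intercepted copy is useless afterwards
        return self.name in claims["aud"]  # refuse tokens not addressed to this system
```

The audience check is what prevents cross-system replay in this model: a token intercepted on its way to the message queue is rejected by any system that the orchestrator did not list at signing time.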

Third: Encrypted State Locality for Batch Operations. Many service accounts need to maintain state across multiple invocations (a cursor in a database log, a checkpoint in a file, a sequence number in a queue). Rather than storing this state in a shared system (which requires the account to persist to read and update it), store it encrypted and co-located with the job artifact (a container image, a Kubernetes ConfigMap, a code repository). The job reads and updates this state locally, and the updated state is encrypted and stored back to its source. If the state is exfiltrated, it is unintelligible without the key, which is rotated with each job invocation.

Real-World Implications: Reducing Service Account Blast Radius Post-Incident

The Synnovis incident is instructive. Once attackers obtained service account credentials from a repository, they had access to multiple systems simultaneously. In a zero-knowledge architecture, they would have obtained a credential token bound to a specific job (e.g., "nightly backup verification") with explicit scope (read patient list for site X, not all sites; access pathology order database, not the claims system; execute for 30 minutes, not indefinitely). Lateral movement to an unrelated system (the backup restoration service) would require either a separate capability token (which may not exist in the attacker's possession) or re-authentication to the control plane (which logs and blocks the anomalous request). The blast radius is reduced from "all systems the account can access" to "the systems explicitly enumerated in this capability token."
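The difference in blast radius can be expressed as a comparison of two sets. The inventory below is entirely hypothetical (system and account names are invented for illustration and do not describe the actual Synnovis environment); it only shows the arithmetic of the argument.

```python
# Hypothetical inventory: every system that accepts a given persistent credential.
ACCOUNT_ACCEPTED_BY = {
    "svc-pathology": {
        "patient-records", "pathology-orders", "backup-restore", "db-replication",
    },
}

# Hypothetical capability token for a single job run.
BACKUP_VERIFY_CAPABILITY = {
    "job": "nightly-backup-verification",
    "grants": [
        {"system": "pathology-orders", "operation": "read", "site": "site-x"},
    ],
    "expires_at": 1800,  # seconds after issue, i.e. 30 minutes
}


def account_blast_radius(account):
    """Union of all systems that accept the stolen account credential."""
    return set(ACCOUNT_ACCEPTED_BY.get(account, ()))


def capability_blast_radius(capability, seconds_since_issue):
    """Systems enumerated in the stolen token, and nothing once it has expired."""
    if seconds_since_issue >= capability["expires_at"]:
        return set()
    return {grant["system"] for grant in capability["grants"]}
```

Under these assumptions the stolen account reaches four systems for as long as its password stands, while the stolen token reaches one system for at most thirty minutes.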

For the Change Healthcare breach, a service account used for VPN gateway access was later used to access billing and claims systems. In a disaggregated architecture, the VPN gateway would issue a time-bound capability valid only for VPN operations; using that same token to access billing systems would fail at the data plane, regardless of the RBAC policies assigned to the account.

Structural Redesign: Beyond Compliance to Operational Resilience

The shift from service account governance to service capability architectures is not cosmetic. It requires rethinking how organisations deploy workloads, how they authorise inter-system communication, and how they log and respond to breaches. It is more complex than "implement a vault and rotate passwords quarterly." But it provides something the traditional approach cannot: post-breach resistance. When a service capability is compromised, its scope is inherently limited. When it expires, it is gone. When the control plane is updated, the data plane's validation of the capability can reflect that update without requiring the service account to be present or aware.

For organisations bound by DORA, NIS2, or sector-specific regimes (APRA CPS 234, MAS TRM, NYDFS Part 500), this shift also addresses a latent compliance gap. Today's frameworks assume service accounts can be audited and logged—which they can be. But they do not adequately address the risk that a single compromised account can become a universal pivot point. A zero-knowledge substrate, by design, reduces that risk at the architectural layer, not just the policy layer. Regulators increasingly understand this distinction.

The migration path is pragmatic: start with the highest-risk service accounts (cloud infrastructure, financial systems, data warehouse access); issue time-bound capabilities for their most common operations; monitor the blast radius of each account over six months; then decommission the persistent accounts in favour of a pure capability model. Organisations that begin this transition now will be three years ahead when regulators formalise these requirements.

---

Qualified operators responsible for identity architecture or incident response in regulated infrastructure should request a technical briefing under NDA to explore how zero-knowledge service capability substrates can be integrated into existing infrastructure without operational disruption.
