Kubernetes RBAC is Mostly Optional — And Attackers Know It
The industry's confidence in Kubernetes role-based access control as a security boundary is not matched by evidence from breach telemetry, and the gap between what operators deploy and what attackers exploit is precisely where post-compromise lateral movement succeeds.
The Industry Narrative: RBAC as the Kubernetes Security Model
Kubernetes RBAC, formalized in the Kubernetes API as a declarative system of Roles, ClusterRoles, RoleBindings, and ClusterRoleBindings, has been the industry standard for enforcing least privilege in container orchestration since it was introduced in version 1.6 and reached general availability in 1.8 (2017). The NIST Cybersecurity Framework (CSF) and ISO 27001 demand the principle of least privilege; Kubernetes RBAC appears to satisfy that demand at the API level. Cloud-native security guidance from CISA, the CIS Kubernetes Benchmark, and every major managed Kubernetes service (EKS, AKS, GKE) positions RBAC as a foundational control. High-profile incidents involving credential compromise and lateral movement — from the Snowflake customer-tenant breaches of 2024, where attackers moved horizontally across environments after credential theft, to the Change Healthcare ransomware attack of the same year, in which compromised credentials opened a path to enumerate and exfiltrate protected health information — have been retrospectively attributed, in part, to failures of exactly the least-privilege discipline RBAC is meant to enforce.
Yet adoption tells a different story. In real-world clusters, surveys consistently suggest that 60–75% of workloads run with default service accounts, that ClusterRoles with wildcards (* on verbs and resources) persist in production, and that many teams rely on network policies or admission controllers to compensate for weak RBAC. The reason is not negligence — it is friction. RBAC requires precise enumeration of resources (Pods, Services, Secrets, ConfigMaps, CRDs), API verbs (get, list, watch, create, update, patch, delete, deletecollection), and namespace or cluster scope, written declaratively in YAML. A microservice team often defaults to binding cluster-admin or the built-in edit ClusterRole to avoid debugging permission denials during development. The configuration then moves to staging, then to production, often without refinement.
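To make the friction concrete, here is a minimal sketch — names and namespace are hypothetical — of the precision a least-privilege Role demands, alongside the wildcard ClusterRole that frequently stands in for it:

```yaml
# Least-privilege Role: read-only access to Pods and Services in one namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: orders-readonly        # hypothetical name for illustration
  namespace: orders
rules:
  - apiGroups: [""]            # core API group
    resources: ["pods", "services"]
    verbs: ["get", "list", "watch"]
---
# The anti-pattern described above: every verb on every resource,
# cluster-wide once bound via a ClusterRoleBinding.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: do-everything          # hypothetical name for illustration
rules:
  - apiGroups: ["*"]
    resources: ["*"]
    verbs: ["*"]
```

Writing and maintaining the first form for every workload is exactly the work teams defer; the second form is what accumulates instead.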
This gap — between the control as designed and the control as deployed — is where attackers operate. In the M&S Scattered Spider attack of early 2025, lateral movement from a compromised container reportedly relied on discovering overly permissive service account tokens mounted at /var/run/secrets/kubernetes.io/serviceaccount/token and using them to enumerate the cluster's topology and workloads. The attackers did not need to break RBAC; they exploited a broad read grant on Pods and Services that had been bound to the default service account in the target namespace and never revisited. RBAC is purely additive — there is no deny rule to remove — so an over-generous allow persists silently until someone notices and deletes it.
Why RBAC Alone Cannot Be a Boundary
RBAC, as implemented in Kubernetes, is a policy declaration system for the Kubernetes API server. It gates API requests. What it does not do — what it structurally cannot do, without ancillary systems — is enforce isolation across all attack surfaces within a cluster. Consider five structural failure modes.
First, RBAC stops at the API server; it is effectively optional at the pod and container layer. A compromised pod with a valid service account token can make API calls to the Kubernetes API server, but it can also execute arbitrary commands inside its own container, spawn new processes, read environment variables, and access mounted volumes without any further RBAC constraint. Once inside the container, the attacker operates in the operating-system namespaces and the container filesystem, where RBAC has no jurisdiction. NIST SP 800-190 (Application Container Security) acknowledges this but does not prescribe a solution within Kubernetes itself; the answer requires runtime enforcement (seccomp, AppArmor, SELinux) outside RBAC, and these controls are frequently not enabled by default.
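A sketch of the runtime hardening this refers to — controls that live in the Pod spec rather than in any RBAC object; the workload name and image are placeholders:

```yaml
# Runtime enforcement RBAC cannot express: seccomp, dropped capabilities,
# non-root execution, and a read-only root filesystem, declared per container.
apiVersion: v1
kind: Pod
metadata:
  name: hardened-app                          # hypothetical workload name
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0     # placeholder image
      securityContext:
        runAsNonRoot: true
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]
        seccompProfile:
          type: RuntimeDefault
```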
Second, service account token disclosure is immediate and hard to contain. By default, Kubernetes mounts a service account token into every pod's filesystem; an attacker who gains shell access to a container reads it in plaintext. Legacy Secret-based tokens never expire and remain valid until the Secret is deleted; the projected tokens mounted by default in current releases are time-bound and refreshed by the kubelet, but there is no per-token revocation — a stolen token stays valid until it expires or the object it is bound to is deleted. The MGM and Caesars casino breaches of 2023, whilst not specifically Kubernetes-focused, demonstrated the consequence of credential compromise at scale: once an attacker holds a valid authenticator, the control plane considers them legitimate, regardless of context or anomaly. Bearer-token authentication alone is insufficient; it requires additional binding (IP allowlisting, certificate pinning, or cryptographic proof of pod identity) that RBAC cannot provide.
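Two mitigations follow directly, sketched below under assumed names: stop mounting the token where it is not needed, and where it is needed, mount a short-lived, audience-bound token instead of the default one.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: no-default-token                  # hypothetical workload name
spec:
  serviceAccountName: app-sa              # hypothetical service account
  automountServiceAccountToken: false     # no token under /var/run/secrets/... unless requested
  containers:
    - name: app
      image: registry.example.com/app:1.0 # placeholder image
      volumeMounts:
        - name: bound-token
          mountPath: /var/run/secrets/tokens
          readOnly: true
  volumes:
    - name: bound-token
      projected:
        sources:
          - serviceAccountToken:
              path: api-token
              audience: vault             # only valid for this (assumed) audience
              expirationSeconds: 600      # ten-minute lifetime, refreshed by the kubelet
```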
Third, RBAC does not distinguish between human operators and automated workloads — or between the containers within a single pod. Every container in a pod runs under the pod's one service account: a compromised sidecar or init container can read the same mounted token and holds the same API permissions as the application container, and because containers share the pod's network namespace it can reach anything the application can. There is no per-container RBAC within a pod; the token is pod-wide. For compliance regimes such as the FCA's SM&CR (Senior Managers & Certification Regime) in financial services or DORA in the EU, this conflation of identity and workload undermines segregation-of-duties principles.
Fourth, RBAC plus NetworkPolicy creates a false sense of layered security. NetworkPolicy is often deployed as a compensating control when RBAC is weak, but it operates at Layer 3/4 and selects traffic by pod labels, namespaces, and CIDR blocks; it never validates the identity of the requester. A compromised pod whose labels match an allow rule traverses the policy freely, and a new pod scheduled into a permitted namespace inherits that network reach regardless of which service account it runs as. The 2022 Optus breach, whilst primarily a cloud configuration issue, showed how lateral movement proceeds when perimeter controls (firewalls, network policies) are in place but identity controls are weak or unchecked.
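A sketch of a typical allow rule (all names hypothetical) makes the gap visible — the selector matches labels, not identity:

```yaml
# Any pod labelled app=frontend may reach the orders pods on 8080. A compromised
# pod that carries, or is created with, that label passes this check unchallenged.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-orders          # hypothetical policy name
  namespace: orders
spec:
  podSelector:
    matchLabels:
      app: orders
  policyTypes: ["Ingress"]
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
```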
Fifth, RBAC policies themselves are rarely audited or validated. In large deployments, hundreds of Roles and ClusterRoles accumulate over months or years; teams seldom review them, and many clusters run no automated tooling to detect over-permissive bindings. The Synnovis/NHS attack of 2024, attributed to the Qilin ransomware group, reportedly exploited backup-system credentials that had been granted broad administrative access in a hybrid, Kubernetes-adjacent environment. The same pattern plagues Kubernetes deployments where RBAC policies are set once and forgotten.
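The artefact that accumulates looks something like this — a hypothetical binding granting cluster-admin to a backup agent "temporarily" and never revoked:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: backup-admin                      # hypothetical name for illustration
subjects:
  - kind: ServiceAccount
    name: backup-agent
    namespace: backup
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin                     # full control of every resource in the cluster
```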
The PULSE Reading: Why Detection-and-Response Cannot Close This Gap
The industry's response to RBAC weakness has been predictable: add observability. Tools like Falco, Datadog, New Relic, and various SIEM/SOAR integrations now offer Kubernetes audit log collection and alerting. Managed-service audit streams — EKS control-plane logs in CloudWatch, AKS diagnostic logs, GKE Cloud Audit Logs — can be shipped to a central SIEM. Sigma rules and custom detections can flag suspicious API calls (excessive pod-listing, attempts to create a privileged pod, secret reads across namespaces). These are not remedies — they are sensors on a broken system.
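A representative sensor, sketched as a Kubernetes API-server audit policy — it records the interesting requests in full while keeping routine traffic terse, but it changes nothing about what is allowed:

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Record full request and response bodies for secret access and exec/attach.
  - level: RequestResponse
    resources:
      - group: ""
        resources: ["secrets", "pods/exec", "pods/attach"]
  # Everything else at metadata level: who, which verb, which resource, when.
  - level: Metadata
```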
The fundamental problem is architectural. RBAC is a stateless policy system: given a request (user, action, resource), it returns allow or deny. It is enforced at the API server, after authentication, and it has no memory, no context, no adaptive posture. An attacker who has compromised a service account token can discover the applicable policies (kubectl auth can-i answers the question directly) and craft requests that will be allowed; detection systems will see them as legitimate API activity. The Snowflake 2024 incident illustrated this precisely: attackers used compromised credentials to issue queries that were syntactically correct and policy-compliant, but semantically malicious — enumerating accessible data and exfiltrating it. Audit logs recorded the activity. No alert was raised, because the activity was not anomalous by policy standards — only by intent.
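The probing itself is an ordinary, allowed API call. This is what kubectl auth can-i submits on the caller's behalf — sent with a stolen token, the answer maps the blast radius precisely (namespace assumed for illustration):

```yaml
# "May the credential I am presenting list Secrets in the orders namespace?"
apiVersion: authorization.k8s.io/v1
kind: SelfSubjectAccessReview
spec:
  resourceAttributes:
    namespace: orders        # hypothetical namespace
    verb: list
    resource: secrets
# The API server answers in .status.allowed — an evaluation of policy,
# not a judgement of intent.
```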
Adding more detective controls (EDR, runtime security agents, egress DLP, ML-based anomaly detection) does not solve the structural problem. It increases operational burden, generates noise, and ultimately defers rather than prevents breach impact. The attacker's objective is not to evade detection; it is to complete an objective within the time window before detection leads to response. That window, in Kubernetes environments without architecture-level isolation, is typically 72+ hours.
Architectural Principles: Zero-Knowledge Substrate and Adaptive Posture
A post-breach-resistant Kubernetes architecture must operate on principles that RBAC cannot support alone. PULSE's doctrine for this domain rests on three pillars.
First: Zero-knowledge substrate. The principle is that the cluster should not contain data or capabilities that, if compromised, would expose the organisation. In practice:
— Every secret (API key, certificate, credential) should be stored outside the cluster in a platform-native secret manager (AWS Secrets Manager, Azure Key Vault, HashiCorp Vault with strong external authentication) and injected at runtime via a minimal, immutable, and auditable sidecar pattern. The secret should never be written to etcd unencrypted, never mounted as a ConfigMap, never appear in logs or environment variables.
— Workloads should operate with ephemeral credentials generated on-demand from a secure service (AWS STS AssumeRole, Workload Identity Federation in GCP, Azure Managed Identity) rather than long-lived service account tokens. These credentials should have a maximum lifetime (often minutes, not hours or days) and should be cryptographically bound to the pod's identity via attestation.
— Application-layer and at-rest encryption should be assumed throughout. Data at rest in the cluster (including etcd) should be encrypted with a key-encryption-key (KEK) held outside the cluster, so that even an attacker with access to the cluster's persistent storage or etcd backups sees only ciphertext. (A configuration sketch follows this list.)
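As an illustration of the last point, a minimal sketch of API-server encryption at rest backed by an external KMS — the plugin name and socket path are assumptions, not a prescribed product:

```yaml
# The API server encrypts Secrets before they reach etcd; the KEK never enters
# the cluster because envelope encryption is delegated to an external KMS plugin.
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources: ["secrets"]
    providers:
      - kms:
          apiVersion: v2
          name: external-kms                         # hypothetical plugin name
          endpoint: unix:///var/run/kms/socket.sock  # hypothetical socket path
          timeout: 3s
      - identity: {}   # plaintext fallback, used only to read legacy data during migration
```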
Second: Data-plane vs. control-plane separation. RBAC gates control-plane (API) access. Data-plane isolation — preventing a compromised workload from reading the data of other workloads — requires additional mechanisms:
— Pod Security Standards (the successor to Pod Security Policy) should enforce a mandatory profile (Restricted or equivalent) that prohibits privilege escalation, root execution, and access to the host filesystem. PSS, however, is still policy: it is enforced at admission time via namespace labels, and anyone permitted to relabel a namespace or create exemptions can weaken it, so those labels must themselves be protected.
— Network policies must be made mandatory and granular. Rather than allowing broad ingress/egress and relying on RBAC, the architecture should assume a default-deny posture and explicitly allow only required communication paths, validated pre-deployment (in CI/CD pipelines) and enforced at runtime. (A combined sketch of the namespace labels and a default-deny policy follows this list.)
— Workload identity should be cryptographically distinct. Each pod should possess a unique, short-lived certificate or proof of identity (e.g. via SPIFFE/SPIRE) that can be verified by other pods or external services without relying on bearer tokens or shared secrets.
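A sketch of the first two items above — a namespace that enforces the Restricted profile and a deny-all NetworkPolicy that forces every permitted path to be declared explicitly (namespace name is hypothetical):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: orders                                        # hypothetical namespace
  labels:
    pod-security.kubernetes.io/enforce: restricted    # reject pods that violate Restricted
    pod-security.kubernetes.io/enforce-version: latest
---
# With no ingress or egress rules, selecting every pod means denying all traffic;
# each required path must then be allowed by an explicit, reviewable policy.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: orders
spec:
  podSelector: {}
  policyTypes: ["Ingress", "Egress"]
```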
Third: Adaptive adversarial posture. Whilst RBAC is static, the cluster's security posture should be dynamic:
— Permission entitlements should be reassessed continuously. If a pod has not used a particular API verb or resource for 30 days, that permission should be automatically revoked or flagged for manual review.
— Service account tokens should be rotated frequently (daily or hourly) through automated means that do not require redeployment of workloads.
— RBAC policies should be generated and validated from specification — through a policy-as-code framework such as Kyverno or OPA/Gatekeeper — rather than manually curated, and violations should trigger an immediate cluster-wide audit and, where appropriate, pod eviction. (A Kyverno sketch follows this list.)
— The cluster should be kept under continuous adversarial pressure: the authentication and authorization substrate should be regularly tested (through scheduled penetration testing or chaos engineering against the control plane) to confirm that no new bypass has appeared.
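As one policy-as-code illustration, a sketch of a Kyverno ClusterPolicy that rejects any Role or ClusterRole using wildcard verbs or resources — modelled on Kyverno's published wildcard-restriction policies, and intended as a starting point rather than a drop-in control:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: deny-wildcard-rbac
spec:
  validationFailureAction: Enforce     # reject the object rather than merely audit it
  background: true
  rules:
    - name: no-wildcard-verbs-or-resources
      match:
        any:
          - resources:
              kinds: ["Role", "ClusterRole"]
      validate:
        message: "RBAC rules must enumerate verbs and resources; wildcards are not permitted."
        deny:
          conditions:
            any:
              - key: "{{ contains(request.object.rules[].verbs[], '*') }}"
                operator: Equals
                value: true
              - key: "{{ contains(request.object.rules[].resources[], '*') }}"
                operator: Equals
                value: true
```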
Implementation Pathways
In practice, this requires a shift from the Kubernetes-as-given model to an engineered, domain-specific platform built atop Kubernetes. Teams should:
- Inventory the current state. Audit all RBAC policies, service account usage, and secret storage. Use tools such as kube-bench (against the CIS Kubernetes Benchmark) and Polaris to surface benchmark deviations and workloads that would fail the Restricted PSS profile.
- Establish a secret and credential substrate external to the cluster. Migrate from Kubernetes Secrets to a managed secret service; implement workload identity federation for all cross-service authentication.
- Implement mandatory pod security and network policies. Deploy a policy enforcement framework (OPA/Gatekeeper or Kyverno) that denies any pod or network policy that violates a baseline security profile.
- Deploy cryptographic workload identity. Introduce SPIFFE/SPIRE or a cloud-native equivalent (IAM Roles for Service Accounts on EKS, GCP Workload Identity, Azure Workload Identity) to bind pod identity cryptographically. (A minimal example follows this list.)
- Establish a continuous compliance and drift loop. Implement automated RBAC minimization, frequent token rotation, and regular adversarial testing.
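As one flavour of the workload-identity step, a sketch of an EKS service account annotated for IAM Roles for Service Accounts — the account ID, role, and names are placeholders; GCP and Azure have equivalent annotations:

```yaml
# Pods using this service account receive short-lived AWS credentials via a
# projected OIDC token exchanged with STS, instead of long-lived access keys.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: orders-api                       # hypothetical service account
  namespace: orders
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/orders-api   # placeholder ARN
```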
These steps are not simple. They require investment in tooling, process redesign, and often a redesign of application architectures to support ephemeral credentials. But they reflect a fundamental acknowledgment: RBAC, as deployed in most Kubernetes environments, is not a security boundary. It is a soft control that attackers routinely bypass. Remediation requires not better detection, but better architecture.
Organisations operating Kubernetes clusters that hold or transmit regulated data (financial records, health information, personally identifiable information) should engage a structured architectural review under NDA.
Request a briefing under executed Mutual NDA.
PULSE engages only with verified counterparties. Strategic briefing material — reference architecture, regulatory mapping, deployment topology — is released after counter-execution of the NDA scoped to the recipient's evaluation purpose.
Request Briefing →