Securing Kubernetes Pods For Production Workloads

Michael Levan - Jun 11 - Dev Community

Securing all aspects of Kubernetes is important, yet one of the largest entry points for attackers is Pods that aren’t configured properly. That’s why Kubernetes Pods have so many different options from a security perspective.

Whether you’re thinking about policy enforcement, what users are allowed to access Pods, what Service Accounts can run Pods, or what traffic can flow through to Pods, no Kubernetes environment is ready until certain steps are taken.

In this blog post, you’ll learn about those exact steps so you can run your environment with as many risks mitigated as possible.

💡 We won’t talk about RBAC, code, or cluster security in this blog post. It’ll be all about the Pods themselves.

Why Securing Pods Is Important

There are two entry points into a Kubernetes environment for a bad actor:

  1. Pods
  2. Cluster

💡 This will focus on the Pod piece, but if you look at my “The 4C’s Of Kubernetes Security” blog post, you’ll learn about the cluster piece.

Pods are one of the easiest ways to break into a system because there are three levels to consider: the container itself within the Pod, how the Pod was deployed, and the Pod itself.

The container is where one piece of the application is running. It’s also a prime entry point via the base image that was used. Base images are pulled in when building a container image (via a Dockerfile or Cloud Native Buildpack, for example), and unfortunately, a lot of them ship with unresolved security issues. To test this out, feel free to run a scan against a pre-built container image and see for yourself. If a container image has a vulnerability, an attacker could use it to interact with your environment.
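For example, with an open-source scanner like Trivy (assuming it’s installed on your machine), scanning a public image is one command:



trivy image nginx:latest

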

How a Pod is deployed is absolutely crucial. As an example, you have two methods of authenticating and authorizing a Pod deployment - the default Service Account or a dedicated Service Account. If you don’t specify a Service Account within your Kubernetes Manifest, the Pod(s) is deployed with the default one, which the cluster creates automatically in every Namespace (hence the name). Therefore, if the default Service Account gets compromised, so does every single Deployment that relies on it.
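Here’s a minimal sketch of pinning a Pod to a dedicated Service Account (the names are illustrative), with token automount disabled so a compromised container can’t trivially use the credentials:



apiVersion: v1
kind: ServiceAccount
metadata:
  name: webapp-sa
automountServiceAccountToken: false   # don't inject the token unless a workload opts in
---
apiVersion: v1
kind: Pod
metadata:
  name: webapp
spec:
  serviceAccountName: webapp-sa       # pin the Pod to the dedicated account
  containers:
  - name: app
    image: nginx:1.23.1

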

Third is the Pod itself. A Pod is an entry point: it can be used to run scripts via its containers or to authenticate to other parts of the cluster. This is why authentication, authorization, and proper Service Accounts are so important. An attacker who can inject a sidecar container into a Pod could take down your environment.

In the next few sections, you’ll see a few methods that you can use to properly deploy a Pod in a secure fashion.

SecurityContext

The SecurityContext is a set of security settings that can be applied at both the Pod level and the Container level (you’ll typically see both).

The breakdown of each setting you can add is as follows:

  • runAsNonRoot: Runs the container as a user that’s not root/admin. From a security perspective, this is one of the best defaults you can set.
  • runAsUser (or runAsGroup): Specifies the user (or group) ID to run the Pod/Container as.
  • fsGroup: Changes the group ownership of all files in a volume when it’s mounted into a Pod.
  • allowPrivilegeEscalation: Controls whether a process can gain more privileges than its parent. Leaving this enabled is a huge attack point, so set it to false.
  • privileged: Runs a container with privileged permissions, which are the same permissions as the host (think admin).
  • readOnlyRootFilesystem: Mounts the container’s filesystem as read-only (no write capabilities).

You can also set SELinux, AppArmor, and seccomp options. You can learn more about those here.
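Putting these fields together, here’s a minimal sketch of a Pod that sets them at both levels (the image, user IDs, and values are illustrative):



apiVersion: v1
kind: Pod
metadata:
  name: secured-pod
spec:
  securityContext:            # Pod level: applies to every container in the Pod
    runAsNonRoot: true        # refuse to start if the image would run as root
    runAsUser: 1000           # run the container process as UID 1000
    fsGroup: 2000             # mounted volumes are group-owned by GID 2000
    seccompProfile:
      type: RuntimeDefault    # use the container runtime's default seccomp profile
  containers:
  - name: app
    image: busybox:1.36
    command: ["sleep", "3600"]
    securityContext:          # Container level: merged with / overrides the Pod level
      allowPrivilegeEscalation: false
      privileged: false
      readOnlyRootFilesystem: true

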

Essentially, the SecurityContext, aside from Network Policies (which you’ll learn about later), is the absolute best way to secure Pods.

Demo

Let’s jump into some hands-on.

Below is a Deployment object/kind that sets the security features we want via the container-level SecurityContext.



apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginxdeployment
  replicas: 2
  template:
    metadata:
      labels:
        app: nginxdeployment
    spec:
      containers:
      - name: nginxdeployment
        image: nginx:latest
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          privileged: false
        resources:
          requests:
            memory: "64Mi"
            cpu: "250m"
          limits:
            memory: "128Mi"
        ports:
        - containerPort: 80



However, depending on what base image you’re using, things may not work out the way you want from a security perspective.

For example, chances are you’ll see an error like the one below.



kubectl get pods --watch                               
NAME                                READY   STATUS   RESTARTS      AGE
nginx-deployment-7fdff64ddd-ntpx9   0/1     Error    2 (27s ago)   29s
nginx-deployment-7fdff64ddd-rpwwl   0/1     Error    2 (26s ago)   29s



Digging in a bit deeper, you’ll notice that the error comes from the readOnlyRootFilesystem setting.



kubectl logs nginx-deployment-7fdff64ddd-ntpx9
/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
/docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
10-listen-on-ipv6-by-default.sh: info: can not modify /etc/nginx/conf.d/default.conf (read-only file system?)
/docker-entrypoint.sh: Sourcing /docker-entrypoint.d/15-local-resolvers.envsh
/docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
/docker-entrypoint.sh: Launching /docker-entrypoint.d/30-tune-worker-processes.sh
/docker-entrypoint.sh: Configuration complete; ready for start up
2024/06/02 15:00:04 [emerg] 1#1: mkdir() "/var/cache/nginx/client_temp" failed (30: Read-only file system)
nginx: [emerg] mkdir() "/var/cache/nginx/client_temp" failed (30: Read-only file system)



Although a read-only root filesystem is what we want from a security perspective, some workloads need write access. Engineers must understand this trade-off and mitigate as much as possible.

If you remove the readOnlyRootFilesystem setting, you should see that the Pods are now running.



kubectl get pods --watch
NAME                               READY   STATUS    RESTARTS   AGE
nginx-deployment-b68647d85-ddwkn   1/1     Running   0          4s
nginx-deployment-b68647d85-q2lzx   1/1     Running   0          4s


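Alternatively, you can keep readOnlyRootFilesystem: true and mount writable emptyDir volumes only at the paths the image actually needs. For the nginx image, the logs above point at /var/cache/nginx, and nginx also writes its PID file under /var/run. A sketch of the Pod template’s spec under those assumptions (slot it into the Deployment above):



    spec:
      containers:
      - name: nginxdeployment
        image: nginx:latest
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          privileged: false
        volumeMounts:
        - name: nginx-cache
          mountPath: /var/cache/nginx   # writable scratch space for nginx temp dirs
        - name: nginx-run
          mountPath: /var/run           # nginx writes nginx.pid here
        ports:
        - containerPort: 80
      volumes:
      - name: nginx-cache
        emptyDir: {}
      - name: nginx-run
        emptyDir: {}

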

Pod Security Standards

PSS, or Pod Security Standards, are a set of standards that you should follow when deploying Pods. The hope with PSS is that they cover the typical security spectrum.

There are three standards:

  • Privileged
  • Baseline
  • Restricted

Privileged is an unrestricted policy. This is considered the “free-for-all” level.

Baseline is a middle ground between Privileged and Restricted. It prevents known privilege escalations for Pods.

Restricted is heavy-duty enforcement. It follows all of the security best practices for Pods.

[Image: comparison of the Privileged, Baseline, and Restricted profiles]
Source: https://kubernetes.io/docs/concepts/security/pod-security-standards/

You can dive deeper into this via the source link above.
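These standards can be enforced natively by the built-in Pod Security Admission controller (stable since Kubernetes v1.25) simply by labeling a Namespace. A minimal sketch (the Namespace name is illustrative):



apiVersion: v1
kind: Namespace
metadata:
  name: webapp
  labels:
    # Block Pods that violate the Restricted standard
    pod-security.kubernetes.io/enforce: restricted
    # Also surface violations as warnings and audit log entries
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted

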

Policy Enforcement

By default, Pods have free rein to do whatever they want, and more importantly, so do engineers. For example, an engineer can deploy a Kubernetes Manifest to production with a container image that uses the latest tag, which means they could accidentally deploy an alpha or beta build. There’s nothing stopping anyone from doing this, which could be detrimental to your environment.

Because of that, there must be a way to block and enforce against not only outright security issues but also the bad practices that lead to misconfigurations and, ultimately, security issues.

Policy Enforcement allows you to configure policies that the Admission Controller uses to block non-compliant requests.

The two popular Policy Enforcers right now are:

  • Open Policy Agent (OPA) with Gatekeeper enabled. Gatekeeper is the middle ground between OPA and Kubernetes because OPA doesn’t know “how to speak” Kubernetes and vice-versa. Think of Gatekeeper as a shim.
  • Kyverno, which is Kubernetes-native, so it doesn’t require a shim. Kyverno now works outside of Kubernetes as well, but when it was originally created, it was only for Kubernetes (a sketch of a Kyverno policy appears after the OPA demo below).
💡 There used to be a Policy Enforcer within Kubernetes called Pod Security Policy, or PSP for short. It was deprecated in Kubernetes v1.21 and removed in v1.25. PSP was used for things like setting policies, but now you’ll use either the built-in Pod Security Admission controller (which enforces the Pod Security Standards above) or a third-party solution like OPA or Kyverno. In terms of why PSP went away, there was some talk about “usability problems”, but my assumption is that tools like OPA and Kyverno ended up becoming the standard.

Demo

Let’s take a look at how Policy Enforcement works with OPA.

First, add the Gatekeeper repo.



helm repo add gatekeeper https://open-policy-agent.github.io/gatekeeper/charts



Install Gatekeeper.



helm install gatekeeper/gatekeeper --name-template=gatekeeper --namespace gatekeeper-system --create-namespace




Once Gatekeeper is installed, you can start configuring policies.

The first step is creating a Config. The Config tells Gatekeeper which resources it should sync and manage policies for. In this case, you’re telling Gatekeeper it can create and manage policies for Pods.



apiVersion: config.gatekeeper.sh/v1alpha1
kind: Config
metadata:
  name: config
  namespace: "gatekeeper-system"
spec:
  sync:
    syncOnly:
      - group: ""
        version: "v1"
        kind: "Pod"



Next is the policy itself. The ConstraintTemplate below creates a policy to block privileged containers via the SecurityContext.

💡 The policy is written in Rego, the policy language used by OPA.



apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: blockprivcontainers
  annotations:
    description: Block Pods from using privileged containers.
spec:
  crd:
    spec:
      names:
        kind: blockprivcontainers # this must be the same name as the name on metadata.name (line 4)
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8spspprivileged

        violation[{"msg": msg, "details": {}}] {
            c := input_containers[_]
            c.securityContext.privileged
            msg := sprintf("Privileged container is not allowed: %v, securityContext: %v", [c.name, c.securityContext])
        }

        input_containers[c] {
            c := input.review.object.spec.containers[_]
        }

        input_containers[c] {
            c := input.review.object.spec.initContainers[_]
        }



Once the ConstraintTemplate (policy) is applied, you can enforce it via the Constraint below, whose kind is the custom kind that the ConstraintTemplate created in the previous step.



apiVersion: constraints.gatekeeper.sh/v1beta1
kind: blockprivcontainers
metadata:
  name: blockprivcontainers
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]



To test whether this works, apply the following Deployment, which marks its container as privileged via the container-level securityContext.

Since the Constraint matches Pods, the Deployment itself will apply, but the Pods it tries to create should be blocked.



apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginxdeployment
  replicas: 2
  template:
    metadata:
      labels:
        app: nginxdeployment
    spec:
      containers:
      - name: nginxdeployment
        image: nginx:1.23.1
        ports:
        - containerPort: 80
        securityContext:
          privileged: true



Delete the previous Deployment and apply the following one, which should pass because privileged is set to false.



apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginxdeployment
  replicas: 2
  template:
    metadata:
      labels:
        app: nginxdeployment
    spec:
      containers:
      - name: nginxdeployment
        image: nginx:1.23.1
        ports:
        - containerPort: 80
        securityContext:
          privileged: false


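For comparison, the equivalent guardrail in Kyverno needs no Rego. Here’s a sketch modeled on Kyverno’s validate-pattern style (field names per the kyverno.io/v1 API; treat it as a starting point rather than a drop-in):



apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-privileged-containers
spec:
  validationFailureAction: Enforce   # block violating requests instead of just auditing
  rules:
    - name: privileged-containers
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Privileged containers are not allowed."
        pattern:
          spec:
            # =() anchors mean "if this field is present, it must match"
            =(initContainers):
              - =(securityContext):
                  =(privileged): "false"
            containers:
              - =(securityContext):
                  =(privileged): "false"

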

Network Policies

In the SecurityContext section, you learned how to secure Pods at the Pod level itself: who can run the Pods, how the Pods run, and what access the Pods have.

In the Policy Enforcement section, you learned how to set rules for Pods.

NetworkPolicies are similar to both the SecurityContext and the Policy Enforcement piece in terms of what Pods can and can’t do, except Network Policies manage this at the network level. The idea with Network Policies is that you manage all traffic at both the ingress and egress layers.

By default, the internal Kubernetes network is flat, which means all Pods can talk to each other regardless of whether they’re in the same Namespace. Therefore, it’s crucial that you configure Network Policies. (Note that Network Policies are only enforced if your CNI plugin supports them - Calico and Cilium do, for example.)
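A common starting point is a Namespace-wide “default deny” that blocks all traffic, after which you explicitly allow the flows you need. A minimal sketch (the Namespace name is illustrative):



apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: webapp    # applies to every Pod in this Namespace
spec:
  podSelector: {}      # empty selector = all Pods in the Namespace
  policyTypes:
  - Ingress
  - Egress

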

In the sub-section below you’ll find a demo of how Network Policies work, but if you want to see other examples, here’s a link that can provide more information and more demos.

Demo

Run the following Pods:



kubectl run busybox1 --image=busybox --labels app=busybox1 -- sleep 3600
kubectl run busybox2 --image=busybox --labels app=busybox2 -- sleep 3600



Obtain the IP address of the Pods.



kubectl get pods -o wide



From busybox2, run a ping against busybox1 (replace ip_of_busybox_one with the IP you just obtained).



kubectl exec -ti busybox2 -- ping -c3 ip_of_busybox_one



You should see that the ping works just fine and there’s 0 packet loss.

Next, let’s configure a Network Policy that denies all ingress traffic to busybox1.



kubectl apply -f - <<EOF
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: web-deny-all
spec:
  podSelector:
    matchLabels:
      app: busybox1
  ingress: []
EOF



Run the ping again.



kubectl exec -ti busybox2 -- ping -c3 ip_of_busybox_one



You should now see that there’s 100% packet loss.
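From here, you can selectively re-open traffic. For example, here’s a sketch that allows ingress to busybox1 only from Pods labeled app=busybox2. Network Policies are additive, so applying this alongside web-deny-all should make the ping from busybox2 succeed again:



kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-from-busybox2
spec:
  podSelector:
    matchLabels:
      app: busybox1        # this policy applies to busybox1
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: busybox2    # only busybox2 may connect

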

All Traffic Is Via An API

Remember, all requests to Kubernetes go through the Kubernetes API, which is served by the API Server. Because of that, every request that comes in for a particular workload must pass through the Admission Controllers.

Admission Controllers are used to either mutate or validate an API request when it comes in from an engineer or another entity. If a request isn’t allowed, it gets blocked. For example, if a policy says that a Pod cannot use the latest container image tag, a Pod that does won’t make it past the Admission Controller. It’s all about validation.
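As an aside, newer clusters can express simple validations like the latest-tag rule natively with a ValidatingAdmissionPolicy (GA in Kubernetes v1.30), without a full policy engine. A sketch under that assumption (note it won’t catch untagged images, which also default to latest):



apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: disallow-latest-tag
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups: [""]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["pods"]
  validations:
  # CEL expression evaluated against the incoming object
  - expression: "object.spec.containers.all(c, !c.image.endsWith(':latest'))"
    message: "Container images must use a pinned tag, not :latest."
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: disallow-latest-tag-binding
spec:
  policyName: disallow-latest-tag   # the policy only takes effect once bound
  validationActions: ["Deny"]

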

Policy enforcers like OPA and Kyverno work by hooking into this same flow: they register policies so that a request that doesn’t meet the policy guidelines isn’t allowed to pass.

Essentially, Admission Controllers either allow a request or deny a request due to a policy that’s in place.
