Impacts Of Not Setting Requests, Limits, and Quotas

Michael Levan - Dec 14 '23 - Dev Community

Despite the innovation in the tech space from mainframes to servers to virtualization to cloud to Kubernetes, one thing holds true - resources are resources. Memory is memory. CPU is CPU. Storage is storage. These are resources that engineers still have to think and care about because, regardless of where you’re running workloads, these resources aren’t unlimited.

Engineers need to think about resource optimization from both a performance perspective and a cost-savings perspective. Otherwise, application stacks will either perform poorly or consume far more resources than they actually need.

In Kubernetes, there are a few key ways to configure resource optimization.

What Are Requests

When configuring proper resource optimization in Kubernetes, resource configuration for a typical workload is broken down into three categories:

  1. Requests
  2. Limits
  3. Quotas

Let’s start out with requests.

Requests set a guaranteed minimum amount of resources. In the Kubernetes Manifest below, note the resources section, which contains requests for both memory and CPU.

What this means is that each replica of this Nginx Deployment is guaranteed at least 64Mi of memory and 250m of CPU. The scheduler will only place the Pod on a node that can reserve those amounts, so this is the minimum the workload needs to run properly.



apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: test
spec:
  selector:
    matchLabels:
      app: nginxdeployment
  replicas: 2
  template:
    metadata:
      labels:
        app: nginxdeployment
    spec:
      containers:
      - name: nginxdeployment
        image: nginx:latest
        resources:
          requests:
            memory: "64Mi"
            cpu: "250m"
        ports:
        - containerPort: 80



What Are Limits

Next, there are limits. Limits specify the absolute maximum amount of CPU and memory a container is allowed to consume. If a container goes over its memory limit, it is OOM-killed and restarted.

In the Kubernetes Manifest example below, the resources section contains a memory limit, but why not a CPU limit?

Because setting CPU limits in Kubernetes is a bad practice.

When you set a limit on memory, the Pod uses memory as needed, and once it's done with it, that memory goes back into the pool of available resources for the cluster. CPU limits behave differently.

If a process has CPU limits assigned, it will be throttled whenever it attempts to consume more CPU cycles per time slice than it has been allotted. Throttling means that even if there are unused CPU cycles on the node (resources sitting in "the pool"), the throttled process cannot be assigned additional cycles during that time slice. Effectively, some of the node's available CPU can sit idle rather than running the ready-to-run throttled process, even when no other process wants those cycles. That is wasteful of the node's resources.

The throttled process isn't directly taking anything away from other workloads. However, the presence of limits prevents available and unused node CPU resources from being used to run the limited workload, and this is what is unnecessary and wasteful.



apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: test
spec:
  selector:
    matchLabels:
      app: nginxdeployment
  replicas: 2
  template:
    metadata:
      labels:
        app: nginxdeployment
    spec:
      containers:
      - name: nginxdeployment
        image: nginx:latest
        resources:
          limits:
            memory: "128Mi"
        ports:
        - containerPort: 80


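Putting the two pieces together, a common pattern (a sketch, not a manifest from the original post) is to set requests for both CPU and memory, and a limit on memory only:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: test
spec:
  selector:
    matchLabels:
      app: nginxdeployment
  replicas: 2
  template:
    metadata:
      labels:
        app: nginxdeployment
    spec:
      containers:
      - name: nginxdeployment
        image: nginx:latest
        resources:
          requests:
            memory: "64Mi"   # guaranteed minimum, used by the scheduler
            cpu: "250m"
          limits:
            memory: "128Mi"  # hard cap; exceeding it gets the container OOM-killed
            # no CPU limit, to avoid the throttling described above
        ports:
        - containerPort: 80
```

This gives the scheduler real information to place the Pod while still letting it burst on CPU when the node has spare cycles.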

What Are Quotas

Resource Quotas constrain the total requests and limits across all Pods in a particular Namespace. The example below states that, across the entire webapp Namespace, memory requests can total at most 512Mi and memory limits at most 1112Mi. If creating a Pod would push the Namespace over either total, the API server rejects it with an "exceeded quota" (Forbidden) error rather than leaving it pending; for a Deployment, the ReplicaSet will keep retrying until quota frees up. Note also that once a quota constrains requests.memory or limits.memory, every new Pod in the Namespace must specify that value or it will be rejected.



apiVersion: v1
kind: Namespace
metadata:
  name: webapp
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: memorylimit
  namespace: webapp
spec:
  hard:
    requests.memory: 512Mi
    limits.memory: 1112Mi



Here’s another example of a Resource Quota where hard limits are set on CPU, memory, and how many Pods are allowed to run inside of the Namespace. (In a ResourceQuota, plain cpu and memory are shorthand for requests.cpu and requests.memory.)



apiVersion: v1
kind: ResourceQuota
metadata:
  name: memorylimit
  namespace: test
spec:
  hard:
    cpu: "5"
    memory: 10Gi
    pods: "10"


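A related note (an addition, not from the original post): since a quota on requests rejects Pods that omit them, a LimitRange is often paired with a ResourceQuota to fill in defaults for containers that don't set their own. A sketch, with a hypothetical name:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-resources   # hypothetical name for illustration
  namespace: test
spec:
  limits:
  - type: Container
    defaultRequest:         # applied when a container omits requests
      cpu: 250m
      memory: 64Mi
    default:                # applied when a container omits limits
      memory: 128Mi
```

With this in place, a bare container spec in the test Namespace still satisfies the quota's request requirements.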

What Happens When They Aren’t Set

As you go through all of the work necessary to ensure that proper limits, requests, and quotas are set, you may ask yourself a question - what happens if they aren’t set?

The two short answers are:

  1. Your application will perform poorly.
  2. You’ll be spending way too much money for no reason.

If you, for example, set a limit or a request too low, the Pod and the overall application stack will perform poorly. If you don’t set proper limits, requests, and quotas at all, the Pod and application stack may consume far more resources than they actually need, resulting in unnecessary spend.
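There's one more concrete consequence: a container with no resources section at all is assigned the BestEffort QoS class, which makes it the first candidate for eviction when a node runs low on memory. A minimal illustration (hypothetical manifest):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: besteffort-example   # hypothetical name for illustration
spec:
  containers:
  - name: app
    image: nginx:latest
    # No requests or limits: Kubernetes classifies this Pod as
    # QoS class "BestEffort". Under node memory pressure, BestEffort
    # Pods are evicted before Burstable or Guaranteed Pods.
    ports:
    - containerPort: 80
```

You can confirm the assigned class with kubectl get pod besteffort-example -o jsonpath='{.status.qosClass}'.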

How To Automatically Configure In StormForge

Here’s the problem with setting limits, requests, and quotas. Although it’s a great thing to do and resource optimization is needed for every single environment, it’s a manual process. Sure, you can run some Kubernetes Manifests and update some Deployments to get the job done, but do you really want to do that? Do you want to go through the mundane task of sitting there and trying to figure out how much memory and CPU a containerized application needs for proper scalability? Of course not. It’s not a good use of time, it’s incredibly manual, and quite frankly, boring.

In the previous blog post, Automating Cognitive Relief For Engineers, there’s a step-by-step guide on how you can register a Kubernetes cluster in StormForge. The remainder of this blog post will focus on more advanced installations and techniques with automatic and semi-automatic remediation.

Optimize Live

First, there’s the option of Optimize Live, which you can think of as semi-automated. You have the ability to go in and configure how you want a workload to be optimized, but StormForge won’t automatically do it for you without you pressing the appropriate buttons.

First, ensure that the recommendation is up to date.


You’ll see that it’s being generated.


Next, you can see the recommendation available.


To set the recommendation, go into the config.


The config can then be updated based on whether you want to prioritize savings, balance, or complete reliability.


Once you click the green Update button, StormForge will get to work on the backend to ensure proper resource optimization.

StormForge Applier

If you’re comfortable going the full automatic remediation route, you can implement the StormForge Applier.

To implement the Applier, you’ll first want to install it via the Helm Chart.



helm install stormforge-applier \
  oci://registry.stormforge.io/library/stormforge-applier \
  -n stormforge-system



You can ensure it’s running by checking that the Pod is available in the stormforge-system Namespace (for example, with kubectl get pods -n stormforge-system).


After that, go to the Config of your workload and click the “On” button for Automatic Deployment, followed by clicking the green Update button.


And that’s it!


As you can see, Automatic Deployments are now on. There’s nothing more you have to do other than sit back and let StormForge optimize your environment.
