New to Kubernetes? Read Kubernetes in a nutshell.

From the Kubernetes website:

On-disk files in a Container are ephemeral, which presents some problems for non-trivial applications when running in Containers.

That means that, since pods are disposable, we can't rely on files saved to the container file system. When a container crashes, the kubelet restarts it, but the new container starts from a clean state, and all the files written before the crash are gone.

👉 Why do we need storage?

Best practices often say that microservices should be stateless, but in reality applications sometimes need to store and retrieve data, whether on a file system or in SQL / NoSQL storage.

In order to have this data available, we will need to configure volumes.

What are volumes?

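In general data storage, a volume is a logical drive: a single accessible storage area with a file system on it. In Kubernetes, a volume is a directory that the containers in a pod can access; what backs that directory depends on the volume type.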

Whatever kind of storage we want to work with, we first need to define it in a YAML file.

There are many types of volumes. We will focus on three of them:

💾 emptyDir

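With emptyDir we get a directory on the node itself that starts out empty.
It survives container crashes, but its content is erased for good when the pod is removed from the node for any reason, so it is not a place for data that must outlive the pod or the node.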

In a YAML file it looks like this:

apiVersion: v1
kind: Pod
metadata:
  name: test-pd
spec:
  containers:
  - image: k8s.gcr.io/test-webserver
    name: test-container
    volumeMounts:
    - mountPath: /cache
      name: cache-volume
  volumes:
  - name: cache-volume
    emptyDir: {}
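
Because an emptyDir belongs to the pod rather than to a single container, it is also a handy way to share files between containers of the same pod. The following is only a minimal sketch of that idea: the pod name, the container names, the busybox image and the shell commands are assumptions made for illustration and are not taken from the Kubernetes docs example above.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-cache-demo          # hypothetical name
spec:
  containers:
  # Appends a timestamp to a file in the shared directory every 5 seconds.
  - name: writer
    image: busybox                 # assumed image, any image with a shell works
    command: ["sh", "-c", "while true; do date >> /cache/out.log; sleep 5; done"]
    volumeMounts:
    - mountPath: /cache
      name: cache-volume
  # Waits for the file to appear, then follows it through a read-only mount.
  - name: reader
    image: busybox
    command: ["sh", "-c", "until [ -f /data/out.log ]; do sleep 1; done; tail -f /data/out.log"]
    volumeMounts:
    - mountPath: /data             # same volume, different path in this container
      name: cache-volume
      readOnly: true
  volumes:
  - name: cache-volume
    emptyDir:
      sizeLimit: 64Mi              # optional cap on how much the volume may hold
```

If the writer container crashes and is restarted, out.log is still there. If the pod is deleted, the file goes with it.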

💽 hostPath

Another one is hostPath.

With hostPath we mount one of the directories or files from the host node filesystem into the pod.
The major difference between hostPath and emptyDir is that hostPath might not be an empty directory.
BTW, hostPath together with a privileged Kubernetes DaemonSet can be used to configure the underlying VMs by running commands on the host.
However, we should be careful doing so, since it exposes the host file system to the pod and can open security holes.

DaemonSet: a Kubernetes object that makes sure a copy of a pod runs in the background on every node, or on a selected set of nodes. A typical example is a log-collection DaemonSet: we want a log collector running on each node so that it can gather the logs of all our pods and store them in a dedicated place.

In a YAML file it looks like this:

apiVersion: v1
kind: Pod
metadata:
  name: test-pd
spec:
  containers:
  - image: k8s.gcr.io/test-webserver
    name: test-container
    volumeMounts:
    - mountPath: /test-pd
      name: test-volume
  volumes:
  - name: test-volume
    hostPath:
      # directory location on host
      path: /data
      # this field is optional
      type: Directory
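
The log-collection DaemonSet mentioned earlier can be sketched along the same lines. This is an illustration under assumptions, not a ready-made setup: the names, the label and the image are placeholders, and a real collector would also need configuration telling it where to ship the logs.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector              # hypothetical name
spec:
  selector:
    matchLabels:
      app: log-collector           # must match the pod template labels below
  template:
    metadata:
      labels:
        app: log-collector
    spec:
      containers:
      - name: collector
        image: example.com/log-collector:latest   # placeholder, not a real image
        volumeMounts:
        - name: varlog
          mountPath: /var/log
          readOnly: true           # the collector only needs to read the node's logs
      volumes:
      - name: varlog
        hostPath:
          # log directory on the host node
          path: /var/log
```

Mounting the path read-only keeps the exposure of the host file system as small as possible, which matters given the security concerns around hostPath.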

Last but not least:

🌤️ Public Clouds

If you are using one of the public clouds, you can take advantage of many existing services like AWS Elastic Block Store and Azure Disk Storage.
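
On managed clusters, a cloud disk is usually requested through a PersistentVolumeClaim rather than by naming the disk in the pod. The sketch below assumes the cluster has a storage class backed by the cloud provider's disk service. The class name managed-disk, the claim name and the size are hypothetical; kubectl get storageclass lists the classes that actually exist in your cluster.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim                 # hypothetical name
spec:
  accessModes:
  - ReadWriteOnce                  # mounted read-write by a single node
  storageClassName: managed-disk   # hypothetical, replace with a class from your cluster
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: test-pd
spec:
  containers:
  - image: k8s.gcr.io/test-webserver
    name: test-container
    volumeMounts:
    - mountPath: /data
      name: data-volume
  volumes:
  - name: data-volume
    persistentVolumeClaim:
      claimName: data-claim        # refers to the claim defined above
```

Unlike emptyDir, data stored this way is not tied to the pod's lifetime: deleting the pod leaves the claim and its disk in place.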

💡 That's a wrap! Resources and Learn More:

SQL Server container in Kubernetes with AKS
Kubernetes volumes
Best practices for storage and backups in AKS
This post was inspired by: How do I configure storage for a bare metal Kubernetes cluster?
All YAML examples are from the Kubernetes.io docs.
