- New to Kubernetes? Read Kubernetes in a nutshell first.
From the Kubernetes website:
> On-disk files in a Container are ephemeral, which presents some problems for non-trivial applications when running in Containers.
In other words, because pods are disposable, we can't rely on the container or pod filesystem to keep our files. If a container crashes, the kubelet restarts it, and everything written to its filesystem is lost.
👉 Why do we need storage?
Best practices say that microservices should be stateless, but in reality applications often need to store and retrieve data, whether on a file system or in NoSQL / SQL storage.
To make this data available, we need to configure volumes.
What are volumes?
When discussing data storage, a volume refers to a logical drive: an actual file system backed by a physical disk.
Whatever kind of storage we want to work with, we first need to define it in a YAML file.
There are many types of volumes; we will focus on three of them:
💾 emptyDir
With emptyDir we define an empty directory on the node itself that the pod's containers can use.
However, if the pod is removed from the node or the node goes down, the contents of the emptyDir are erased.
In a YAML file it looks like this:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-pd
spec:
  containers:
  - image: k8s.gcr.io/test-webserver
    name: test-container
    volumeMounts:
    - mountPath: /cache
      name: cache-volume
  volumes:
  - name: cache-volume
    emptyDir: {}
```
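Because every container in the pod can mount the same emptyDir, it is also a handy way to share scratch files between containers. Here is a minimal sketch of that pattern; the pod name, the sidecar, its busybox loop, and the file name are illustrative assumptions, not from the original example:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-cache-demo          # hypothetical pod name
spec:
  containers:
  - name: web                      # serves the files produced by the sidecar
    image: k8s.gcr.io/test-webserver
    volumeMounts:
    - mountPath: /cache
      name: cache-volume
  - name: cache-writer             # illustrative sidecar writing into the shared directory
    image: busybox
    command: ["sh", "-c", "while true; do date > /cache/now.txt; sleep 5; done"]
    volumeMounts:
    - mountPath: /cache
      name: cache-volume
  volumes:
  - name: cache-volume
    emptyDir: {}
```

Both containers mount the same cache-volume, so whatever the sidecar writes under /cache is immediately visible to the webserver.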
💽 hostPath
Another volume type is hostPath.
With hostPath we mount one of the directories or files from the host node's filesystem into the pod.
The major difference between hostPath and emptyDir is that a hostPath directory might not be empty.
BTW, hostPath, together with a privileged Kubernetes DaemonSet, can be used to configure the underlying VMs by running commands on the host.
However, we should be careful doing so, since it can open the door to security breaches.
- DaemonSet - a set of pods that Kubernetes runs in the background on defined (or all) nodes. For example, a log-collection DaemonSet: we want a log-collection process running in the background on every node so we can gather the logs from all of our pods and store them in a dedicated place (a minimal sketch follows below).
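To make that combination concrete, here is a minimal sketch of a log-collection DaemonSet that mounts the node's /var/log via hostPath. The names, labels, and image are illustrative assumptions, not something from the original post:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector              # hypothetical name
spec:
  selector:
    matchLabels:
      app: log-collector
  template:
    metadata:
      labels:
        app: log-collector
    spec:
      containers:
      - name: collector
        image: fluent/fluentd      # illustrative image; pin a specific tag in practice
        volumeMounts:
        - name: varlog
          mountPath: /var/log
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          # node directory holding the logs we want to collect
          path: /var/log
```

One pod from this DaemonSet runs on every node, and each pod reads that node's own /var/log through the hostPath mount.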
A pod using hostPath looks like this in a YAML file:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-pd
spec:
  containers:
  - image: k8s.gcr.io/test-webserver
    name: test-container
    volumeMounts:
    - mountPath: /test-pd
      name: test-volume
  volumes:
  - name: test-volume
    hostPath:
      # directory location on host
      path: /data
      # this field is optional
      type: Directory
```
Last but not least:
🌤️ Public Clouds
If you are using one of the public clouds, you can take advantage of many existing services like AWS Elastic Block Store and Azure Disk Storage.
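For example, here is a sketch of a pod mounting an existing AWS EBS volume with the legacy in-tree awsElasticBlockStore volume type; the volume ID is a placeholder for a volume you have already created in the same availability zone as the node (on newer clusters the CSI drivers and PersistentVolumeClaims are the way to go, but the idea is the same):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-ebs
spec:
  containers:
  - image: k8s.gcr.io/test-webserver
    name: test-container
    volumeMounts:
    - mountPath: /test-ebs
      name: test-volume
  volumes:
  - name: test-volume
    awsElasticBlockStore:
      # placeholder: replace with the ID of an existing EBS volume
      volumeID: "<volume-id>"
      fsType: ext4
```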
💡 That's a wrap! Resources and Learn More:
- SQL Server container in Kubernetes with AKS
- Kubernetes volumes
- Best practices for storage and backups in AKS
- This post was inspired by: How do I configure storage for a bare metal Kubernetes cluster?
- All YAML examples are from the Kubernetes.io docs.