Deploying a Kubernetes Cluster (AWS EKS) & an API Gateway secured by mTLS, with Terraform, External-DNS & Traefik - Part 2

Aurélie Vache - Apr 25 '20 - Dev Community

In the first part, we saw how to create an AWS EKS Kubernetes cluster with Terraform.
Now it's time to configure our cluster and deploy Traefik and our services.

EKS cluster configuration

Here are a few explanations of the AWS EKS cluster configuration and why we configured it this way.

  • Private Access: Enabled

Private access allows resources in the workers' VPC to reach the API server. It is needed to scale the cluster up, because new nodes need access to the API server.

  • We need to add specific NAT gateway IPs to the public_access_cidrs variable

Why? Because this parameter indicates which CIDR blocks can access the Amazon EKS public API server endpoint when it is enabled. We need our internal tools to access this cluster, so if you have the same constraint, you now know where to configure it.

  • Pods-per-node limit

Warning: on EKS, the number of pods per node is capped by the instance type rather than by the usual Kubernetes default of 110.

The pod limit actually comes from the ENI and IP address limits of the EC2 instance, because the default VPC CNI plugin assigns a secondary IP to each pod.

Our cluster nodes are t3.medium instances, so we are limited to 17 pods per node!

Here is the complete list of the maximum number of pods per instance type: https://github.com/awslabs/amazon-eks-ami/blob/master/files/eni-max-pods.txt
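To see where the 17 comes from, here is a small sketch of the formula behind that list, as we understand it: ENIs × (IPs per ENI − 1) + 2. The function name max_pods is ours, and the ENI and IP counts (3 ENIs, 6 IPv4 addresses each for a t3.medium) are the published EC2 limits for that instance type.

```shell
# max_pods ENIS IPS_PER_ENI: pod capacity with the default VPC CNI.
# One IP per ENI is taken by the ENI itself; the +2 is commonly explained
# by host-network pods (aws-node, kube-proxy) that need no secondary IP.
max_pods() {
  echo $(( $1 * ($2 - 1) + 2 ))
}

max_pods 3 6   # t3.medium: 3 ENIs, 6 IPv4 addresses each, prints 17
```

The same call with another instance type's limits gives the matching line of eni-max-pods.txt.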

Alternative: Install another CNI plugin: https://medium.com/@swazza85/dealing-with-pod-density-limitations-on-eks-worker-nodes-137a12c8b218

Connect to the new cluster locally

At the end of the Terraform step, you generated a kubeconfig file. Copy it and save it on your local machine as .kube/config or, if you handle several kubeconfig files, save it as .kube/eks-cluster.conf and add it to your KUBECONFIG environment variable like this:

KUBECONFIG=/home/toto/.kube/mycluster.conf:/home/toto/.kube/eks-cluster.conf

Then export the variable.
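If you juggle several kubeconfig files, a small helper can avoid adding the same file twice. This is only a sketch: add_kubeconfig is a name we made up, and the paths are the example ones used above.

```shell
# add_kubeconfig FILE: append FILE to KUBECONFIG unless it is already listed.
add_kubeconfig() {
  case ":${KUBECONFIG:-}:" in
    *":$1:"*) ;;                                     # already present, nothing to do
    *) KUBECONFIG="${KUBECONFIG:+$KUBECONFIG:}$1" ;;  # append, with ':' only if needed
  esac
  export KUBECONFIG
}

KUBECONFIG=/home/toto/.kube/mycluster.conf
add_kubeconfig /home/toto/.kube/eks-cluster.conf
add_kubeconfig /home/toto/.kube/eks-cluster.conf    # second call changes nothing
echo "$KUBECONFIG"
```

Once the variable is exported, kubectl merges all the listed files.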

Just one more thing… In the kubeconfig file, we use the aws-iam-authenticator tool to get an AWS token, so you need to install it.

Tip: I recommend installing kubectx and kubens, two useful tools that let you switch easily between multiple clusters and multiple namespaces ;-).

$ kubectx
my_other_cluster
eks-scraly-cluster

$ kubectx eks-scraly-cluster

OK, we set our context to the new cluster, so we can now work with it.

External-DNS deployment

External-DNS is a service, provided by Zalando, which dynamically provisions DNS records and, in our case, manages Route 53 records.

An AWS role has to be configured to work with the EKS configuration:

  • EKS provides an OIDC issuer that we have to link to IAM through an identity provider, in order to enable IAM roles for service accounts
  • create a policy that allows Route 53 management: external dns iam policy
  • create a web identity role using the previous provider and the sts.amazonaws.com audience. This creates a role with a trust relationship to the EKS OIDC provider and allows EKS pods to assume the role, like kube2iam: create service account iam policy and role
  • attach the previous policy to the new role

All of these prerequisites have already been set up with Terraform.

Let’s go back to resource definition and creation. We will set up the layout of our Kubernetes code:

$ cd your_git_repo
$ mkdir kubernetes
$ cd kubernetes
$ mkdir external-dns

In this new folder, we will create three YAML manifest files:

cluster-role.yaml:

# ClusterRole and ClusterRoleBinding are cluster-scoped, so they take no namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: external-dns
rules:
- apiGroups: [""]
  resources: ["services"]
  verbs: ["get","watch","list"]
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get","watch","list"]
- apiGroups: ["extensions","networking.k8s.io"]
  resources: ["ingresses"]
  verbs: ["get","watch","list"]
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["list","watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: external-dns-viewer
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: external-dns
subjects:
- kind: ServiceAccount
  name: external-dns
  namespace: external-dns

deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: external-dns
  namespace: external-dns
spec:
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: external-dns
  template:
    metadata:
      labels:
        app: external-dns
    spec:
      serviceAccountName: external-dns
      containers:
      - name: external-dns
        image: registry.opensource.zalan.do/teapot/external-dns:latest
        args:
        - --source=service
        - --source=ingress
        - --domain-filter=scraly.com # will make ExternalDNS see only the hosted zones matching provided domain, omit to process all available hosted zones
        - --provider=aws
        - --policy=sync # full synchronization: ExternalDNS may also delete records (use upsert-only to prevent deletions)
        - --aws-zone-type=public # only look at public hosted zones (valid values are public, private or no value for both)
        - --registry=txt
      securityContext:
        fsGroup: 65534 # For ExternalDNS to be able to read Kubernetes and AWS token files

As you can see, we specify the AWS provider and ask external-dns to create, update and delete DNS records for Service and Ingress resources in the scraly.com domain.

serviceaccount.yaml:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: external-dns
  namespace: external-dns
  # If you're using Amazon EKS with IAM Roles for Service Accounts, specify the following annotation.
  # Otherwise, you may safely omit it.
  annotations:
    # Substitute your ARN external dns role below.
    eks.amazonaws.com/role-arn: ARN_EXTERNAL_DNS_ROLE

The Terraform output gave us the ARN of the role, so retrieve it and pass it to the manifest file:

$ sed -i -e "s|ARN_EXTERNAL_DNS_ROLE|${ARN_EXTERNAL_DNS_ROLE}|g" serviceaccount.yaml

Great, the external-dns ServiceAccount manifest now contains the ARN.
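If you want to see what the substitution does before touching the real manifest, here is a minimal sketch that runs the same sed expression on a single line. The ARN value and the render_manifest name are made up for the illustration; use the ARN from your own Terraform output.

```shell
# Hypothetical ARN, for illustration only.
ARN_EXTERNAL_DNS_ROLE="arn:aws:iam::123456789012:role/external-dns"

# render_manifest: read a manifest on stdin, write the rendered copy to stdout.
render_manifest() {
  sed -e "s|ARN_EXTERNAL_DNS_ROLE|${ARN_EXTERNAL_DNS_ROLE}|g"
}

rendered=$(printf 'eks.amazonaws.com/role-arn: ARN_EXTERNAL_DNS_ROLE\n' | render_manifest)
echo "$rendered"
```

The | delimiter matters here: an ARN contains / characters, which would clash with the usual s/…/…/ form.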

Of course, the AWS IAM role we use needs the appropriate Route 53 permissions.

Let’s deploy this new service (after creating an external-dns namespace):

$ kubectl create namespace external-dns

$ kubectl apply -f external-dns -n external-dns

To check that external-dns is running, look at its logs:

kubectl get po -n external-dns
kubectl logs external-dns-<pod_id> -n external-dns

External-DNS is ready, let’s deploy our reverse-proxy.

Deploy an API Gateway with Traefik

Traefik is an open source reverse proxy and load balancer for HTTP and TCP-based applications that is easy, dynamic, automatic, fast, full-featured, production proven, provides metrics, and integrates with every major cluster technology.

We will install Traefik (v2.0) as an Ingress controller for a Kubernetes cluster (https://docs.traefik.io/v2.0/user-guide/kubernetes/).

Role Based Access Control configuration

We need to authorize Traefik to use the Kubernetes API.
There are two ways to set up the proper permission: Via namespace-specific RoleBindings or a single, global ClusterRoleBinding.

Deployment

We create a traefik folder:

$ cd your_git_repo
$ cd kubernetes/
$ mkdir traefik
$ cd traefik

Create a 1.crd.yaml file that contains the new CRDs for Traefik:

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: ingressroutes.traefik.containo.us

spec:
  group: traefik.containo.us
  version: v1alpha1
  names:
    kind: IngressRoute
    plural: ingressroutes
    singular: ingressroute
  scope: Namespaced

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: ingressroutetcps.traefik.containo.us

spec:
  group: traefik.containo.us
  version: v1alpha1
  names:
    kind: IngressRouteTCP
    plural: ingressroutetcps
    singular: ingressroutetcp
  scope: Namespaced

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: middlewares.traefik.containo.us

spec:
  group: traefik.containo.us
  version: v1alpha1
  names:
    kind: Middleware
    plural: middlewares
    singular: middleware
  scope: Namespaced

---

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: tlsoptions.traefik.containo.us

spec:
  group: traefik.containo.us
  version: v1alpha1
  names:
    kind: TLSOption
    plural: tlsoptions
    singular: tlsoption
  scope: Namespaced

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: traefikservices.traefik.containo.us

spec:
  group: traefik.containo.us
  version: v1alpha1
  names:
    kind: TraefikService
    plural: traefikservices
    singular: traefikservice
  scope: Namespaced

Now, we create a 2.rbac.yaml file that contains the ClusterRole and ClusterRoleBinding for RBAC.
RBAC allows Traefik to do self-discovery on Kubernetes, mostly with read-only rights on several objects.

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: traefik-ingress-controller

rules:
  - apiGroups:
      - ""
    resources:
      - services
      - endpoints
      - secrets
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - extensions
    resources:
      - ingresses
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - extensions
    resources:
      - ingresses/status
    verbs:
      - update
  - apiGroups:
      - traefik.containo.us
    resources:
      - middlewares
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - traefik.containo.us
    resources:
      - ingressroutes
    verbs:
      - get
      - list
      - watch

  - apiGroups:
      - traefik.containo.us
    resources:
      - ingressroutetcps
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - traefik.containo.us
    resources:
      - tlsoptions
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - traefik.containo.us
    resources:
      - traefikservices
    verbs:
      - get
      - list
      - watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: traefik-ingress-controller

roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: traefik-ingress-controller
subjects:
  - kind: ServiceAccount
    name: traefik-ingress-controller
    namespace: traefik

Now we create a 3.traefik.deploy.yaml file, which contains the Traefik Deployment and Service and enables tracing:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: traefik-ingress-controller
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: traefik
  labels:
    app: traefik
spec:
  replicas: 1
  selector:
    matchLabels:
      app: traefik
  template:
    metadata:
      labels:
        app: traefik
    spec:
      serviceAccountName: traefik-ingress-controller
      containers:
        - name: traefik
          image: traefik:v2.0
          args:
          - --api.insecure
          - --accesslog
          - --entrypoints.web.Address=:8000
          - --entrypoints.websecure.Address=:4443
          - --providers.kubernetescrd
          - --certificatesresolvers.default.acme.tlschallenge
          - --certificatesresolvers.default.acme.email=scraly@myemail.com
          - --certificatesresolvers.default.acme.storage=acme.json
          # Please note that this is the staging Let's Encrypt server.
          # Once you get things working, you should remove that whole line altogether.
          - --certificatesresolvers.default.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory
          - --tracing.jaeger=true
          - --tracing.jaeger.gen128Bit
          - --tracing.jaeger.propagation=b3
          - --tracing.jaeger.localAgentHostPort=jaeger-agent:6831
          - --tracing.jaeger.collector.endpoint=http://jaeger-collector:14268/api/traces?format=jaeger.thrift
          - --metrics.prometheus=true
          ports:
          - name: web
            containerPort: 8000
          - name: websecure
            containerPort: 4443
---
apiVersion: v1
kind: Service
metadata:
  name: traefik
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
    external-dns.alpha.kubernetes.io/hostname: "*.traefik.scraly.com"
spec:
  type: LoadBalancer
  ports:
  - protocol: TCP
    name: web
    port: 80
    targetPort: web
  # - protocol: TCP
  #   name: admin
  #   port: 8080
  - protocol: TCP
    name: websecure
    port: 443
    targetPort: websecure
  selector:
    app: traefik

And then we can create the IngressRoute resources for the HTTP and HTTPS routes in a 5.ingressroutes.yaml file.

We will secure Jaeger with basic auth, using a user and a password. Here is the command line to generate the credentials:

$ htpasswd -nb admin admin | base64
YWRtaW46JGFwcjEkdWllZko3UkskaHhOc1ZUZzFjSVBWZlp6VEJ6Uk9wLgoK
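To double-check what ends up in the secret, the base64 string printed above can be decoded again. A minimal sketch, assuming a base64 command that accepts -d (GNU coreutils and BusyBox do; older macOS versions use -D instead):

```shell
# Base64 string printed by the htpasswd command above.
encoded='YWRtaW46JGFwcjEkdWllZko3UkskaHhOc1ZUZzFjSVBWZlp6VEJ6Uk9wLgoK'

# Decode it; the command substitution drops the trailing newlines.
decoded=$(printf '%s' "$encoded" | base64 -d)
echo "$decoded"   # admin:$apr1$uiefJ7RK$hxNsVTg1cIPVfZzTBzROp.
```

The result is a regular htpasswd line: the user name, a colon, then the apr1 (MD5) password hash.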

5.ingressroutes.yaml:

apiVersion: v1
kind: Secret
metadata:
  name: authsecret
data:
  # admin / admin
  users: YWRtaW46JGFwcjEkdWllZko3UkskaHhOc1ZUZzFjSVBWZlp6VEJ6Uk9wLgoK

---
# Declaring the user list
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: jaeger-ingress-auth
spec:
  basicAuth:
    secret: authsecret
---
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: striptlsprefix
spec:
  stripPrefix:
    prefixes:
      - /tls
      - /notls
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: simpleingressroute
spec:
  entryPoints:
    - web
  routes:
  - match: Host(`apigw.traefik.scraly.com`)
    kind: Rule
    services:
    - name: httpbin
      port: 80
    middlewares:
    - name: striptlsprefix
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: ingressroutetls
spec:
  entryPoints:
    - websecure
  routes:
  - match: Host(`apigw.traefik.scraly.com`)
    kind: Rule
    services:
    - name: httpbin
      port: 80
    middlewares:
    - name: striptlsprefix
  tls:
    certResolver: default
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: jaeger-ingress-tls
spec:
  entryPoints:
    - websecure
  routes:
  - match: Host(`jaeger.traefik.scraly.com`)
    kind: Rule
    services:
    - name: jaeger-query
      port: 16686
    middlewares:
    - name: jaeger-ingress-auth
  tls:
    certResolver: default

As you can see in the Deployment resource, we let Let's Encrypt handle the server certificate for TLS. If you keep the Let's Encrypt staging URL, you will get a certificate warning in your browser. To remove this warning, you need to use the Let's Encrypt production URL:
https://acme-v02.api.letsencrypt.org/directory

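Our manifest files are ready, so let’s deploy them in the cluster. The ClusterRoleBinding references the traefik namespace, so make sure that namespace exists and is your current one (for example with kubens):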

$ cd kubernetes
$ kubectl apply -f traefik

$ kubectl get po,svc,deploy,ingressroute,secret
NAME                                   READY   STATUS    RESTARTS   AGE
pod/httpbin-c134bc87b-4c9qt            1/1     Running   0          17m
pod/jaeger-6765759cb5-9ljwb            1/1     Running   0          17m
pod/jaeger-agent-daemonset-dgdls       1/1     Running   0          17m
pod/jaeger-operator-75f9699896-fl9zv   1/1     Running   0          17m
pod/traefik-71234cd8cfc-k78b           1/1     Running   0          47m

NAME                                TYPE           CLUSTER-IP       EXTERNAL-IP                                                                  PORT(S)                                  AGE
service/httpbin                     ClusterIP      1xx.xx.xxx.x     <none>                                                                       80/TCP                                   17m
service/jaeger-agent                ClusterIP      None             <none>                                                                       5775/UDP,5778/TCP,6831/UDP,6832/UDP      17m
service/jaeger-collector            ClusterIP      1xx.xx.xx.xxx    <none>                                                                       9411/TCP,14250/TCP,14267/TCP,14268/TCP   17m
service/jaeger-collector-headless   ClusterIP      None             <none>                                                                       9411/TCP,14250/TCP,14267/TCP,14268/TCP   17m
service/jaeger-operator-metrics     ClusterIP      1xx.20.122.97    <none>                                                                       8383/TCP,8686/TCP                        17m
service/jaeger-query                ClusterIP      1xx.xx.xxx.xxx   <none>                                                                       16686/TCP                                17m
service/traefik                     LoadBalancer   1xx.xx.xxx.xxx   a012436577463524f84547656459c1f-126543650.eu-central-1.elb.amazonaws.com   80:30123/TCP,443:30980/TCP               17m

NAME                                    READY   UP-TO-DATE   AVAILABLE   AGE
deployment.extensions/httpbin           1/1     1            1           17m
deployment.extensions/jaeger            1/1     1            1           17m
deployment.extensions/jaeger-operator   1/1     1            1           17m
deployment.extensions/traefik           1/1     1            1           17m

NAME                                                  AGE
ingressroute.traefik.containo.us/ingressroutetls      17m
ingressroute.traefik.containo.us/jaeger-ingress-tls   14m
ingressroute.traefik.containo.us/simpleingressroute   14m

Cool, we now have a reverse proxy that serves HTTP and HTTPS. Nice, but we need to go further and enable mTLS in order to be more secure.

Secure our reverse-proxy with mTLS

Mutual TLS (mTLS) authentication ensures that traffic is both secure and trusted in both directions between a client and server. Client certificate authentication is a second layer of security.

With a root certificate authority (CA) in place, the server only allows requests from devices with a corresponding client certificate. When a request reaches the application, the server responds with a request for the client to present a certificate. If the device fails to present the certificate, the request is not allowed to proceed. If the client does have a certificate, the certificate is verified.

For the purpose and scope of this POC, we generated a root CA that we used to sign client certificates (for testing and access purposes). The server exclusively accepts requests presenting such a certificate.
The server certificate is generated by Let's Encrypt.

Root CA:

First of all, we generate a key pair for the root CA:

$ openssl genrsa -out ca.key 2048

Then, we generate the root CA certificate:

$ openssl req -days 3600 -new -x509 -key ca.key -out ca.pem

In this step, you are about to be asked to enter information that will be incorporated into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.

Country Name (2 letter code) [AU]:FR
State or Province Name (full name) [Some-State]:Your_city
Locality Name (eg, city) []:
Organization Name (eg, company) [Internet Widgits Pty Ltd]:Your_Company
Organizational Unit Name (eg, section) []:Your_Orga_Unit_Name
Common Name (e.g. server FQDN or YOUR name) []:Root CA Test
Email Address []:

Server certificate:

Then you can create the server certificate.

Key pair generation for the server:

$ openssl genrsa -out server.key 2048

CSR creation:

$ openssl req -new -key server.key -out server.csr -subj "/C=FR/ST=Your_city/O=Your-Company/OU=Your_Unit_Name/CN=###your_domain_name###"

Certificate creation:

$ openssl x509 -req -days 360 -in server.csr -CA ca.pem -CAkey ca.key -CAcreateserial -out server.pem -sha256

Client certificate:

Key pair generation for the client:

$ openssl genrsa -out client.key 2048

CSR creation:

$ openssl req -new -key client.key -out client.csr -subj "/C=FR/ST=Your_city/O=Test"

Certificate creation:

$ openssl x509 -req -days 360 -in client.csr -CA ca.pem -CAkey ca.key -CAcreateserial -out client.pem -sha256

Tip: you can inspect a certificate directly from the command line:

$ openssl x509 -noout -text -in file.pem

Ok, you have plenty of certificates now ^^.
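Before wiring the CA into Traefik, it can help to check that a client certificate really chains to the root CA. The sketch below replays the CA and client steps non-interactively in a throwaway directory and then runs openssl verify. The subjects are placeholders and the chain_status variable is ours.

```shell
# Work in a throwaway directory so real keys are never touched.
dir=$(mktemp -d)

# Root CA: key pair plus self-signed certificate (placeholder subject).
openssl genrsa -out "$dir/ca.key" 2048 2>/dev/null
openssl req -new -x509 -days 3600 -key "$dir/ca.key" -out "$dir/ca.pem" \
  -subj "/C=FR/O=Your_Company/CN=Root CA Test"

# Client: key pair, CSR, then certificate signed by the root CA.
openssl genrsa -out "$dir/client.key" 2048 2>/dev/null
openssl req -new -key "$dir/client.key" -out "$dir/client.csr" \
  -subj "/C=FR/ST=Your_city/O=Test"
openssl x509 -req -days 360 -in "$dir/client.csr" -CA "$dir/ca.pem" \
  -CAkey "$dir/ca.key" -CAcreateserial -out "$dir/client.pem" -sha256 2>/dev/null

# openssl verify exits with 0 only if client.pem validates against ca.pem.
if openssl verify -CAfile "$dir/ca.pem" "$dir/client.pem" >/dev/null 2>&1; then
  chain_status=ok
else
  chain_status=broken
fi
echo "client certificate chain: $chain_status"
rm -rf "$dir"
```

On your real files, the check boils down to openssl verify -CAfile ca.pem client.pem.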

We will store this CA certificate, base64-encoded, in an mtls-ca Kubernetes secret, use it in ingressroutetls and specify the TLSOption RequireAndVerifyClientCert.

In order to store the root CA in a Kubernetes secret, you need to encode it in base64 on a single line:

$ base64 < ca.pem | tr -d '\n'

In traefik folder, create a 4.mtls.yaml file:

apiVersion: v1
kind: Secret
metadata:
  name: mtls-ca
data:
  tls.ca: <base64_result>
---
apiVersion: traefik.containo.us/v1alpha1
kind: TLSOption
metadata:
  name: scraly-mtls
spec:
  clientAuth:
    secretNames:
      - mtls-ca
    clientAuthType: RequireAndVerifyClientCert
---
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: tlssubjectheader
spec:
  passTLSClientCert:
    info:
      subject:
        commonName: true

$ kubectl apply -f traefik/4.mtls.yaml

And now we can edit our 5.ingressroutes.yaml file in order to use mTLS and comment out the HTTP route:

apiVersion: v1
kind: Secret
metadata:
  name: authsecret
data:
  # admin / admin
  users: YWRtaW46JGFwcjEkdWllZko3UkskaHhOc1ZUZzFjSVBWZlp6VEJ6Uk9wLgoK

---
# Declaring the user list
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: jaeger-ingress-auth
spec:
  basicAuth:
    secret: authsecret
---
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: striptlsprefix
spec:
  stripPrefix:
    prefixes:
      - /tls
      - /notls
# ---
# apiVersion: traefik.containo.us/v1alpha1
# kind: IngressRoute
# metadata:
#   name: simpleingressroute
# spec:
#   entryPoints:
#     - web
#   routes:
#   - match: Host(`apigw.traefik.scraly.com`)
#     kind: Rule
#     services:
#     - name: httpbin
#       port: 80
#     middlewares:
#     - name: striptlsprefix
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: ingressroutetls
spec:
  entryPoints:
    - websecure
  routes:
  - match: Host(`apigw.traefik.scraly.com`)
    kind: Rule
    services:
    - name: httpbin
      port: 80
    middlewares:
    - name: striptlsprefix
    - name: tlssubjectheader
  tls:
    certResolver: default
    options:
      name: scraly-mtls
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: jaeger-ingress-tls
spec:
  entryPoints:
    - websecure
  routes:
  - match: Host(`jaeger.traefik.scraly.com`)
    kind: Rule
    services:
    - name: jaeger-query
      port: 16686
    middlewares:
    - name: jaeger-ingress-auth
  tls:
    certResolver: default

$ kubectl apply -f traefik/5.ingressroutes.yaml

Deploy HTTPBin

In order to validate the infrastructure, we will deploy an httpbin service in the cluster (which will be exposed via the API Gateway, Traefik).

httpbin is a simple REST API for network and implementation testing. We will deploy it for testing purposes.

$ cd kubernetes
$ mkdir httpbin
$ cd httpbin

deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: httpbin
  name: httpbin
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpbin
  template:
    metadata:
      labels:
        app: httpbin
    spec:
      containers:
      - image: kennethreitz/httpbin
        imagePullPolicy: Always
        name: httpbin
        ports:
        - containerPort: 80
          name: http
          protocol: TCP
        resources:
          requests:
            cpu: 5m
            memory: 70Mi

service.yaml:

apiVersion: v1
kind: Service
metadata:
  labels:
    app: httpbin
  name: httpbin
spec:
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: httpbin
  type: ClusterIP

$ cd ..
$ kubectl apply -f httpbin

As usual, we check if the resources we deployed are running:

$ kubectl get po -n traefik
…
pod/httpbin-c123bc45b-6c7qt            1/1     Running   0          1m
…

Test with Postman

All our infrastructure is deployed so now it’s time to test it.
HTTPBin is secured with mTLS so we can test our service through cURL or for example with Postman.

Postman is one of the best-known tools for API testing, and very useful for day-to-day tests during development.

You need to edit the Postman settings to set the client key and PEM certificate in Settings > Certificates > Client Certificates.

Then create a new collection in Postman and add a new request to the API Gateway host.
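The same check can be done from the command line with cURL. This is a sketch only: call_api is a name we made up, the host comes from the IngressRoute above, and /get is one of the httpbin endpoints. If you kept the Let's Encrypt staging server, cURL will reject the server certificate unless you add --insecure for the test.

```shell
# Host name from the IngressRoute; /get is an httpbin endpoint.
API_URL="https://apigw.traefik.scraly.com/get"

# call_api CERT KEY: send one request presenting the client certificate.
# CURL can be overridden (e.g. CURL=echo) to print the arguments instead of sending.
call_api() {
  "${CURL:-curl}" --silent --show-error --cert "$1" --key "$2" "$API_URL"
}

# Usage, with client.pem and client.key in the current directory:
#   call_api client.pem client.key
```

Without the --cert and --key options, the TLS handshake should fail, which is exactly what RequireAndVerifyClientCert is for.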

Issues we faced

Along the way, we faced several issues caused by the choices we made.
Through this ReX (Return of eXperience), I will explain the problems we faced, the analysis and the solutions that fixed them.

LoadBalancer can’t be created / External-DNS can’t create Route53 records

The first time we deployed our stack, we ran into a problem: our service could not get an external IP and stayed in the <pending> state:

service/traefik                     LoadBalancer   1xx.xx.xxx.xx    <pending>     80:32666/TCP,443:32088/TCP               33m

$ kubectl describe svc/traefik
Name:                     traefik
Namespace:                traefik
Labels:                   <none>
...
Session Affinity:         None
External Traffic Policy:  Cluster
Events:
  Type     Reason                  Age                   From                Message
  ----     ------                  ----                  ----                -------
  Normal   EnsuringLoadBalancer    3m23s (x12 over 33m)  service-controller  Ensuring load balancer
  Warning  SyncLoadBalancerFailed  3m22s (x12 over 33m)  service-controller  Error syncing load balancer: failed to ensure load balancer: could not find any suitable subnets for creating the ELB

Solution:

Public subnets in the VPC need to be tagged so that Kubernetes knows to use only those subnets for external load balancers, so we added one tag:

"kubernetes.io/role/elb" = "1"

cf. https://docs.aws.amazon.com/eks/latest/userguide/load-balancing.html
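We set this tag through Terraform. For a subnet that already exists, the same tag can also be added by hand with the AWS CLI. A sketch, where tag_public_subnet is our own helper name and the subnet ID in the usage line is a made-up placeholder:

```shell
# tag_public_subnet SUBNET_ID: mark a subnet as usable for external load balancers.
# AWS can be overridden (e.g. AWS=echo) to print the arguments instead of calling AWS.
tag_public_subnet() {
  "${AWS:-aws}" ec2 create-tags --resources "$1" \
    --tags "Key=kubernetes.io/role/elb,Value=1"
}

# Usage (hypothetical subnet ID):
#   tag_public_subnet subnet-0123456789abcdef0
```

Keep in mind that a tag added by hand will show up as drift on the next terraform plan unless it is also declared in the Terraform code.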

Issues while deploying through Terraform

If you create the OIDC Connect provider through Terraform, the CA thumbprint list will be empty. With an empty list, our service accounts could not get valid credentials through OIDC, so the external-dns service did not work :'(.

Here are the problem and the solution: https://medium.com/@marcincuber/amazon-eks-with-oidc-provider-iam-roles-for-kubernetes-services-accounts-59015d15cb0c

So the solution is to create a script get_thumbprint.sh:

#!/bin/sh
# https://github.com/terraform-providers/terraform-provider-aws/issues/10104

# Take the last certificate of the chain sent by the endpoint and compute its SHA-1 fingerprint.
THUMBPRINT=$(echo | openssl s_client -servername oidc.eks.${1}.amazonaws.com -showcerts -connect oidc.eks.${1}.amazonaws.com:443 2>/dev/null | tac | sed -n '/-----END CERTIFICATE-----/,/-----BEGIN CERTIFICATE-----/p; /-----BEGIN CERTIFICATE-----/q' | tac | openssl x509 -fingerprint -sha1 -noout | sed 's/://g' | awk -F= '{print tolower($2)}')
# The Terraform external data source expects a JSON object on stdout.
THUMBPRINT_JSON="{\"thumbprint\": \"${THUMBPRINT}\"}"
echo "$THUMBPRINT_JSON"
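The tail end of that pipeline can be tried in isolation, without any network access. In this sketch, normalize_fingerprint is a name we introduce, and the fingerprint is a made-up value shaped like the output of openssl x509 -fingerprint -sha1 -noout.

```shell
# normalize_fingerprint: read "SHA1 Fingerprint=AA:BB:..." on stdin,
# print the lowercase hex digest without colons.
normalize_fingerprint() {
  sed 's/://g' | awk -F= '{print tolower($2)}'
}

# Made-up fingerprint, for illustration only.
sample='SHA1 Fingerprint=00:11:22:33:44:55:66:77:88:99:AA:BB:CC:DD:EE:FF:00:11:22:33'
thumbprint=$(printf '%s\n' "$sample" | normalize_fingerprint)
printf '{"thumbprint": "%s"}\n' "$thumbprint"
```

The result is the 40-character lowercase hex form that IAM expects in the thumbprint list.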

Then we added an external data source to retrieve the thumbprint, and passed it as the thumbprint list of our OIDC provider resource definition:

### External cli kubergrunt
data "external" "thumb" {
  program = [ "get_thumbprint.sh", var.aws_region ]
}

# Enabling IAM Roles for Service Accounts
resource "aws_iam_openid_connect_provider" "eks-scraly-cluster-oidc" {
  client_id_list  = ["sts.amazonaws.com"]
  thumbprint_list = [data.external.thumb.result.thumbprint]
  url             = aws_eks_cluster.eks-scraly-cluster.identity.0.oidc.0.issuer
}

Conclusion

I know our solution is not perfect, but it allowed us to deploy a starter infrastructure in a few days, with a Kubernetes cluster and an API Gateway secured by mTLS. For a POC, it’s a good way to start. I hope you’ll like this little ReX ;-).
