IT Cloud

Eugeny Shtoltc

esschtolts@cloudshell:~/bitrix (essch)$ cat deploymnet.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: Nginxlamp
  namespace: development
spec:
  selector:
    matchLabels:
      app: lamp
  replicas: 1
  template:
    metadata:
      labels:
        app: lamp
    spec:
      containers:
      - name: lamp
        image: essch/app:0.12
        ports:
        - containerPort: 80
esschtolts@cloudshell:~/bitrix (essch)$ IMAGE=essch/app:0.12 kubectl create -f deploymnet.yaml
deployment.apps "Nginxlamp" created
esschtolts@cloudshell:~/bitrix (essch)$ kubectl get pods -l app=lamp
NAME                        READY  STATUS   RESTARTS  AGE
Nginxlamp-55f8cd8dbc-mk9nk  1/1    Running  0         5m
esschtolts@cloudshell:~/bitrix (essch)$ kubectl exec Nginxlamp-55f8cd8dbc-mk9nk -- ls /app/
index.php

This happens because the image developer (correctly, and as described in his documentation) expected the image to be mounted onto the host, with the app folder being deleted by a script launched at the end. With this approach we also run into the problem of constantly updating images and configs (we cannot pass the image tag through a variable, since the manifest is executed on the cluster nodes) and of updating containers; nor can we simply update the folder, because when the container is recreated the changes are reverted to the original state.
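One way to work around the variable-substitution limitation mentioned above is to template the manifest on the client side before sending it to the cluster. A minimal sketch using envsubst from GNU gettext (this is not the book's approach; it assumes the manifest references the image as ${IMAGE}):

# in deploymnet.yaml the image line would then read:  image: ${IMAGE}
IMAGE=essch/app:0.12 envsubst < deploymnet.yaml | kubectl create -f -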

The correct solution is to mount a volume and to include in the POD lifecycle a container that starts before the main one and performs the preparatory operations on the environment: typically downloading the application from a repository, building it, running tests, creating users and setting permissions. For each operation it is correct to launch a separate init container in which that operation is the main process; they are executed sequentially, as a chain that is broken if one of the operations fails (returns a non-zero exit code). Such containers have a separate description in the POD, initContainers; listed sequentially, they form a chain of init container launches in that same order. In our case we created an unnamed volume and used an init container to deliver the installation files to it. After all init containers (there may be several) have completed successfully, the main container starts. The main container mounts the same volume, which already holds the installation files, so we only need to open the browser and complete the installation:

esschtolts@cloudshell:~/bitrix (essch)$ cat deploymnet.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: Nginxlamp
  namespace: development
spec:
  selector:
    matchLabels:
      app: lamp
  replicas: 1
  template:
    metadata:
      labels:
        app: lamp
    spec:
      initContainers:
      - name: init
        image: ubuntu
        command:
        - /bin/bash
        - -c
        - |
          cd /app
          apt-get update && apt-get install -y wget
          wget https://www.1c-bitrix.ru/download/small_business_encode.tar.gz
          tar -xf small_business_encode.tar.gz
          sed -i '5i php_value short_open_tag 1' .htaccess
          chmod -R 0777 .
          sed -i 's/# php_value display_errors 1/php_value display_errors 1/' .htaccess
          sed -i '5i php_value opcache.revalidate_freq 0' .htaccess
          sed -i 's/# php_flag default_charset UTF-8/php_flag default_charset UTF-8/' .htaccess
        volumeMounts:
        - name: app
          mountPath: /app
      containers:
      - name: lamp
        image: essch/app:0.12
        ports:
        - containerPort: 80
        volumeMounts:
        - name: app
          mountPath: /app
      volumes:
      - name: app
        emptyDir: {}

You can watch events during POD creation with the watch kubectl get events command, and view the init container logs with kubectl logs {ID_CONTAINER} -c init, or more universally:

kubectl logs $(kubectl get pods -l app=lamp -o json | jq ".items[0].metadata.name" | sed 's/"//g') -c init
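If jq is not installed, roughly the same can be achieved with kubectl's built-in jsonpath output (a hedged alternative, not taken from the book's transcript):

kubectl logs $(kubectl get pods -l app=lamp -o jsonpath='{.items[0].metadata.name}') -c init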

It is advisable to choose small images for single tasks, for example alpine:3.5:

esschtolts@cloudshell:~ (essch)$ docker pull alpine 1> /dev/null
esschtolts@cloudshell:~ (essch)$ docker pull ubuntu 1> /dev/null
esschtolts@cloudshell:~ (essch)$ docker images
REPOSITORY  TAG     IMAGE ID      CREATED       SIZE
ubuntu      latest  93fd78260bd1  4 weeks ago   86.2MB
alpine      latest  196d12cf6ab1  3 months ago  4.41MB

By slightly changing the code, we significantly saved on the size of the image:

image: alpine:3.5
command:
- /bin/sh
- -c
- |
  cd /app
  apk --update add wget && rm -rf /var/cache/apk/*
  wget https://www.1c-bitrix.ru/download/small_business_encode.tar.gz
  tar -xf small_business_encode.tar.gz
  rm -f small_business_encode.tar.gz
  sed -i '5i php_value short_open_tag 1' .htaccess
  sed -i 's/# php_value display_errors 1/php_value display_errors 1/' .htaccess
  sed -i '5i php_value opcache.revalidate_freq 0' .htaccess
  sed -i 's/# php_flag default_charset UTF-8/php_flag default_charset UTF-8/' .htaccess
  chmod -R 0777 .
volumeMounts:

There are also minimalistic images with pre-installed packages, such as Alpine with git (axeclbr/git) and golang:1-alpine.
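As a sketch of how such an image could be used with the init container approach above: cloning the application sources into the shared volume instead of downloading an archive (the repository URL here is a hypothetical placeholder, and it is assumed that git is available in axeclbr/git):

initContainers:
- name: fetch-sources
  image: axeclbr/git          # minimal image with git preinstalled
  command:
  - git
  - clone
  - https://github.com/example/app.git   # hypothetical repository
  - /app
  volumeMounts:
  - name: app
    mountPath: /app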

Ways to ensure fault tolerance

Any process can crash. If the main process of a container crashes, the container holding it terminates as well. Terminating is normal when it happens as a graceful shutdown: for example, if our application in the container makes a backup of the database, then once the container has finished, the work is done. For demonstration purposes, let's take the sleep command:

vagrant@ubuntu:~$ sudo docker pull ubuntu > /dev/null
vagrant@ubuntu:~$ sudo docker run -d ubuntu sleep 60
0bd80651c6f97167b27f4e8df675780a14bd9e0a5c3f8e5e8340a98fc351bc64
vagrant@ubuntu:~$ sudo docker ps
CONTAINER ID  IMAGE   COMMAND     CREATED         STATUS         NAMES
0bd80651c6f9  ubuntu  "sleep 60"  15 seconds ago  Up 12 seconds  distracted_kalam
vagrant@ubuntu:~$ sleep 60
vagrant@ubuntu:~$ sudo docker ps
CONTAINER ID  IMAGE   COMMAND     CREATED         STATUS         NAMES
vagrant@ubuntu:~$ sudo docker ps -a | grep ubuntu
0bd80651c6f9  ubuntu  "sleep 60"  4 minutes ago   Exited (0) 3 minutes ago  distracted_kalam

In the case of backups this is the norm, but in the case of applications that should not terminate it is not. A typical case is a web server. The simplest thing to do in this case is to restart it:

vagrant@ubuntu:~$ sudo docker run -d --restart=always ubuntu sleep 10
c3bc2d2e37a68636080898417f5b7468adc73a022588ae073bdb3a5bba270165
vagrant@ubuntu:~$ sleep 30
vagrant@ubuntu:~$ sudo docker ps
CONTAINER ID  IMAGE   COMMAND     CREATED         STATUS
c3bc2d2e37a6  ubuntu  "sleep 10"  46 seconds ago  Up 1 second

We see that when the container crashes, it is restarted. As a result, we always have the application in one of two states: up or coming up. If a web server crashes because of some rare error, that is the norm; but more likely there is an error in request processing, it will crash on every such request, and monitoring will still show a running container. Such a web server is better dead than half alive. At the same time, a normal web server may fail to start because of rare errors, for example because the database connection is unavailable due to network instability. In such a case the application must be able to handle errors and exit; and in case of a crash due to code errors it should not be restarted, so that the inoperability is visible and the application can be sent to the developers for repair. In the case of a floating (flaky) error, we can try several times:

vagrant@ubuntu:~$ sudo docker run -d --restart=on-failure:3 ubuntu sleep 10
056c4fc6986a13936e5270585e0dc1782a9246f01d6243dd247cb03b7789de1c
vagrant@ubuntu:~$ sleep 10
vagrant@ubuntu:~$ sudo docker ps
CONTAINER ID  IMAGE   COMMAND     CREATED         STATUS        PORTS  NAMES
c3bc2d2e37a6  ubuntu  "sleep 10"  9 minutes ago   Up 2 seconds         keen_sinoussi
vagrant@ubuntu:~$ sleep 10
vagrant@ubuntu:~$ sleep 10
vagrant@ubuntu:~$ sudo docker ps
CONTAINER ID  IMAGE   COMMAND     CREATED         STATUS        PORTS  NAMES
c3bc2d2e37a6  ubuntu  "sleep 10"  10 minutes ago  Up 9 seconds         keen_sinoussi
vagrant@ubuntu:~$ sleep 10
vagrant@ubuntu:~$ sudo docker ps
CONTAINER ID  IMAGE   COMMAND     CREATED         STATUS        PORTS  NAMES
c3bc2d2e37a6  ubuntu  "sleep 10"  10 minutes ago  Up 2 seconds         keen_sinoussi
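In Kubernetes a similar effect is achieved through the pod's restartPolicy field; a minimal hedged sketch is shown below (note that, unlike Docker's on-failure:3, Kubernetes has no retry-count option and instead throttles restarts with an increasing back-off, which appears as the CrashLoopBackOff status):

spec:
  restartPolicy: OnFailure    # other values: Always (the default) and Never
  containers:
  - name: lamp
    image: essch/app:0.12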

Another aspect is when to consider the container dead. By default this is a crash of the process. But an application is far from always crashing itself on error so that the container can be restarted. For example, a server may be designed incorrectly and try to download necessary libraries at startup but be unable to, for example because the requests are blocked by a firewall. In such a scenario the server can wait a long time if an adequate timeout is not set. In this case we need to check functionality. For a web server that means the response to a specific URL, for example:

docker run --rm -d \
  --name=elasticsearch \
  --health-cmd="curl --silent --fail localhost:9200/_cluster/health || exit 1" \
  --health-interval=5s \
  --health-retries=12 \
  --health-timeout=20s \
  {image}
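Roughly the same check can also be baked into the image itself with a Dockerfile HEALTHCHECK instruction (a sketch under the assumption that curl is available inside the image):

HEALTHCHECK --interval=5s --timeout=20s --retries=12 \
  CMD curl --silent --fail localhost:9200/_cluster/health || exit 1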

For demonstration we will use file creation as the check. If the application does not reach the working state (here, creating the file) within the allotted checks (the per-check timeout here is set to 0), it is marked as unhealthy, but before that the specified number of checks is performed:

 

vagrant@ubuntu:~$ sudo docker run \
  -d --name healt \
  --health-timeout=0s \
  --health-interval=5s \
  --health-retries=3 \
  --health-cmd="ls /halth" \
  ubuntu bash -c 'sleep 1000'
c0041a8d973e74fe8c96a81b6f48f96756002485c74e51a1bd4b3bc9be0d9ec5
vagrant@ubuntu:~$ sudo docker ps
CONTAINER ID  IMAGE   COMMAND                 CREATED         STATUS                           PORTS  NAMES
c0041a8d973e  ubuntu  "bash -c 'sleep 1000'"  4 seconds ago   Up 3 seconds (health: starting)         healt
vagrant@ubuntu:~$ sleep 20
vagrant@ubuntu:~$ sudo docker ps
CONTAINER ID  IMAGE   COMMAND                 CREATED         STATUS                     PORTS  NAMES
c0041a8d973e  ubuntu  "bash -c 'sleep 1000'"  38 seconds ago  Up 37 seconds (unhealthy)         healt
vagrant@ubuntu:~$ sudo docker rm -f healt
healt

If at least one of the checks worked, then the container is marked as healthy immediately:

vagrant@ubuntu:~$ sudo docker run \
  -d --name healt \
  --health-timeout=0s \
  --health-interval=5s \
  --health-retries=3 \
  --health-cmd="ls /halth" \
  ubuntu bash -c 'touch /halth && sleep 1000'
vagrant@ubuntu:~$ sudo docker ps
CONTAINER ID  IMAGE   COMMAND                 CREATED        STATUS                           PORTS  NAMES
160820d11933  ubuntu  "bash -c 'touch /hal…"  4 seconds ago  Up 2 seconds (health: starting)         healt
vagrant@ubuntu:~$ sudo docker ps
CONTAINER ID  IMAGE   COMMAND                 CREATED        STATUS                  PORTS  NAMES
160820d11933  ubuntu  "bash -c 'touch /hal…"  6 seconds ago  Up 5 seconds (healthy)         healt
vagrant@ubuntu:~$ sudo docker rm -f healt
healt

In this case, the checks are repeated all the time at a given interval:

vagrant@ubuntu:~$ sudo docker run \
  -d --name healt \
  --health-timeout=0s \
  --health-interval=5s \
  --health-retries=3 \
  --health-cmd="ls /halth" \
  ubuntu bash -c 'touch /halth && sleep 60 && rm -f /halth && sleep 60'
vagrant@ubuntu:~$ sudo docker ps
CONTAINER ID  IMAGE   COMMAND                 CREATED             STATUS                           PORTS  NAMES
8ec3a4abf74b  ubuntu  "bash -c 'touch /hal…"  7 seconds ago       Up 5 seconds (health: starting)         healt
vagrant@ubuntu:~$ sudo docker ps
CONTAINER ID  IMAGE   COMMAND                 CREATED             STATUS                   PORTS  NAMES
8ec3a4abf74b  ubuntu  "bash -c 'touch /hal…"  24 seconds ago      Up 22 seconds (healthy)         healt
vagrant@ubuntu:~$ sleep 60
vagrant@ubuntu:~$ sudo docker ps
CONTAINER ID  IMAGE   COMMAND                 CREATED             STATUS                         PORTS  NAMES
8ec3a4abf74b  ubuntu  "bash -c 'touch /hal…"  About a minute ago  Up About a minute (unhealthy)         healt

Kubernetes provides (kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/) three tools that check the state of a container from the outside. They matter all the more because they serve not only to inform, but also to manage the application life cycle and the roll-forward and rollback of updates. Configuring them incorrectly can, and often does, make the application malfunction: if the liveness probe triggers before the application has started working, Kubernetes will kill the container, not letting it come up. Let's consider them in more detail. The liveness probe is used to determine the health of the application: if the application has crashed and does not respond to the liveness probe, Kubernetes restarts the container. As an example we take a shell probe, because it is simple to demonstrate, but in practice it should be used only in extreme cases, for example when the container is started not as a long-lived server but as a JOB that does its work and ends its existence once the result is achieved. For server checks it is better to use HTTP probes, which are performed by a built-in dedicated client, do not require curl in the container and do not depend on external kube-proxy settings. For databases you have to use a TCP probe, since they usually do not speak HTTP. Let's create a long-lived container at www.katacoda.com/courses/kubernetes/playground:

controlplane $ cat << EOF > liveness.yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness
spec:
  containers:
  - name: healtcheck
    image: alpine:3.5
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 10; rm -rf /tmp/healthy; sleep 60
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 15
      periodSeconds: 5
EOF

controlplane $ kubectl create -f liveness.yaml
pod/liveness created
controlplane $ kubectl get pods
NAME      READY  STATUS   RESTARTS  AGE
liveness  1/1    Running  2         2m11s
controlplane $ kubectl describe pod/liveness | tail -n 10
Type     Reason     Age                  From               Message
----     ------     ----                 ----               -------
Normal   Scheduled  2m37s                default-scheduler  Successfully assigned default/liveness to node01
Normal   Pulling    2m33s                kubelet, node01    Pulling image "alpine:3.5"
Normal   Pulled     2m30s                kubelet, node01    Successfully pulled image "alpine:3.5"
Normal   Created    33s (x3 over 2m30s)  kubelet, node01    Created container healtcheck
Normal   Started    33s (x3 over 2m30s)  kubelet, node01    Started container healtcheck
Normal   Pulled     33s (x2 over 93s)    kubelet, node01    Container image "alpine:3.5" already present on machine
Warning  Unhealthy  3s (x9 over 2m13s)   kubelet, node01    Liveness probe failed: cat: can't open '/tmp/healthy': No such file or directory
Normal   Killing    3s (x3 over 2m3s)    kubelet, node01    Container healtcheck failed liveness probe, will be restarted

We see that the container is constantly being restarted.

controlplane $ cat << EOF > liveness.yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness
spec:
  containers:
  - name: healtcheck
    image: alpine:3.5
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 60
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 15
      periodSeconds: 5
EOF

controlplane $ kubectl create -f liveness.yaml
pod/liveness created
controlplane $ kubectl get pods
NAME      READY  STATUS   RESTARTS  AGE
liveness  1/1    Running  2         2m53s
controlplane $ kubectl describe pod/liveness | tail -n 15
    SecretName: default-token-9v5mb
    Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
             node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type     Reason     Age                  From               Message
----     ------     ----                 ----               -------
Normal   Scheduled  3m44s                default-scheduler  Successfully assigned default/liveness to node01
Normal   Pulled     68s (x3 over 3m35s)  kubelet, node01    Container image "alpine:3.5" already present on machine
Normal   Created    68s (x3 over 3m35s)  kubelet, node01    Created container healtcheck
Normal   Started    68s (x3 over 3m34s)  kubelet, node01    Started container healtcheck
Warning  Unhealthy  23s (x9 over 3m3s)   kubelet, node01    Liveness probe failed: cat: can't open '/tmp/healthy': No such file or directory
Normal   Killing    23s (x3 over 2m53s)  kubelet, node01    Container healtcheck failed liveness probe, will be restarted

We also see from the cluster events that when cat /tmp/healthy fails, the container is re-created:

controlplane $ kubectl get events
controlplane $ kubectl get events | grep pod/liveness
13m    Normal   Scheduled  pod/liveness  Successfully assigned default/liveness to node01
13m    Normal   Pulling    pod/liveness  Pulling image "alpine:3.5"
13m    Normal   Pulled     pod/liveness  Successfully pulled image "alpine:3.5"
10m    Normal   Created    pod/liveness  Created container healtcheck
10m    Normal   Started    pod/liveness  Started container healtcheck
10m    Warning  Unhealthy  pod/liveness  Liveness probe failed: cat: can't open '/tmp/healthy': No such file or directory
10m    Normal   Killing    pod/liveness  Container healtcheck failed liveness probe, will be restarted
10m    Normal   Pulled     pod/liveness  Container image "alpine:3.5" already present on machine
8m32s  Normal   Scheduled  pod/liveness  Successfully assigned default/liveness to node01
4m41s  Normal   Pulled     pod/liveness  Container image "alpine:3.5" already present on machine
4m41s  Normal   Created    pod/liveness  Created container healtcheck
4m41s  Normal   Started    pod/liveness  Started container healtcheck
2m51s  Warning  Unhealthy  pod/liveness  Liveness probe failed: cat: can't open '/tmp/healthy': No such file or directory
5m11s  Normal   Killing    pod/liveness  Container healtcheck failed liveness probe, will be restarted
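As noted above, databases usually require a TCP probe, since they do not speak HTTP. A minimal hedged sketch of such a probe (assuming a PostgreSQL container listening on its standard port 5432; this example is not from the book):

livenessProbe:
  tcpSocket:
    port: 5432            # standard PostgreSQL port (assumption)
  initialDelaySeconds: 15
  periodSeconds: 5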

Let's take a look at the readiness probe. Its success indicates that the application is ready to accept requests and the service can switch traffic to it:

controlplane $ cat << EOF > readiness.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: readiness
spec:
  replicas: 2
  selector:
    matchLabels:
      app: readiness
  template:
    metadata:
      labels:
        app: readiness
    spec:
      containers:
      - name: readiness
        image: python
        args:
        - /bin/sh
        - -c
        - sleep 15 && (hostname > health) && python -m http.server 9000
        readinessProbe:
          exec:
            command:
            - cat
            - /tmp/healthy
          initialDelaySeconds: 1
          periodSeconds: 5
EOF

controlplane $ kubectl create -f readiness.yaml
deployment.apps/readiness created
controlplane $ kubectl get pods
NAME                       READY  STATUS             RESTARTS  AGE
readiness-fd8d996dd-cfsdb  0/1    ContainerCreating  0         7s
readiness-fd8d996dd-sj8pl  0/1    ContainerCreating  0         7s
controlplane $ kubectl get pods
NAME                       READY  STATUS   RESTARTS  AGE
readiness-fd8d996dd-cfsdb  0/1    Running  0         6m29s
readiness-fd8d996dd-sj8pl  0/1    Running  0         6m29s
controlplane $ kubectl exec -it readiness-fd8d996dd-cfsdb -- curl localhost:9000/health
readiness-fd8d996dd-cfsdb

Our containers work great. Let's add traffic to them:

controlplane $ kubectl expose deploy readiness \
  --type=LoadBalancer \
  --name=readiness \
  --port=9000 \
  --target-port=9000
service/readiness exposed
controlplane $ kubectl get svc readiness
NAME       TYPE          CLUSTER-IP   EXTERNAL-IP  PORT(S)         AGE
readiness  LoadBalancer  10.98.36.51  <pending>    9000:32355/TCP  98s
controlplane $ curl localhost:9000
controlplane $ for i in {1..5}; do curl $IP:9000/health; done
1
2
3
4
5

Each container has a delay. Let's check what happens if one of the containers is restarted, and whether traffic is redirected to it:

controlplane $ kubectl get pods
NAME                        READY  STATUS            RESTARTS  AGE
readiness-5dd64c6c79-9vq62  0/1    CrashLoopBackOff  6         15m
readiness-5dd64c6c79-sblvl  0/1    CrashLoopBackOff  6         15m
kubectl exec -it .... -c .... bash -c "rm -f healt"
controlplane $ for i in {1..5}; do echo $i; done
1
2
3
4
5
controlplane $ kubectl delete deploy readiness
deployment.apps "readiness" deleted

Consider a situation when a container becomes temporarily unavailable for work:

(hostname > health) && (python -m http.server 9000 &) && sleep 60 && rm health && sleep 60 && (hostname > health) && sleep 6000

/bin/sh -c 'sleep 60 && (python -m http.server 9000 &) && PID=$! && sleep 60 && kill -9 $PID'

By default, the container enters the Running state once the instructions in the Dockerfile have completed and the script specified in the CMD instruction (or overriding it in the configuration's command section) has been launched. In practice, however, if we have for example a database, it still needs to come up (read its data, load it into RAM, and so on), which can take a lot of time, during which it will not respond to connections, so other applications, although started and ready to accept connections, cannot actually do so. Also, the container transitions to the Failed state when the main process in the container crashes. In the case of a database, it can endlessly try to execute an incorrect request and be unable to respond to incoming requests, yet the container will not be restarted, since formally the database daemon (server) has not crashed. For these cases two probes are provided, readinessProbe and livenessProbe, which check the transition of the container to a working state, or its failure, with a custom script or an HTTP request.

 

esschtolts@cloudshell:~/bitrix (essch)$ cat health_check.yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: healtcheck
  name: healtcheck
spec:
  containers:
  - name: healtcheck
    image: alpine:3.5
    args:
    - /bin/sh
    - -c
    - sleep 12; touch /tmp/healthy; sleep 10; rm -rf /tmp/healthy; sleep 60
    readinessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5
      periodSeconds: 5
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 15
      periodSeconds: 5

The container starts after about 3 seconds, and after 5 seconds the readiness check begins, running every 5 seconds. On the second check (at the 15th second of life) the readiness check cat /tmp/healthy succeeds. At this time the livenessProbe check begins, and on its second check (at the 25th second) it fails, after which the container is recognized as not working and is recreated.

esschtolts@cloudshell:~/bitrix (essch)$ kubectl create -f health_check.yaml && sleep 4 && kubectl get pods && sleep 10 && kubectl get pods && sleep 10 && kubectl get pods
pod "liveness-exec" created
NAME           READY  STATUS   RESTARTS  AGE
liveness-exec  0/1    Running  0         5s
NAME           READY  STATUS   RESTARTS  AGE
liveness-exec  0/1    Running  0         15s
NAME           READY  STATUS   RESTARTS  AGE
liveness-exec  1/1    Running  0         26s
esschtolts@cloudshell:~/bitrix (essch)$ kubectl get pods
NAME           READY  STATUS   RESTARTS  AGE
liveness-exec  0/1    Running  0         53s
esschtolts@cloudshell:~/bitrix (essch)$ kubectl get pods
NAME           READY  STATUS   RESTARTS  AGE
liveness-exec  0/1    Running  0         1m
esschtolts@cloudshell:~/bitrix (essch)$ kubectl get pods
NAME           READY  STATUS   RESTARTS  AGE
liveness-exec  1/1    Running  1         1m

Kubernetes also provides a startup probe, which postpones the moment when the readiness and liveness probes start working. This is useful if, for example, the application takes a long time to load at startup. Let's consider it in more detail, taking www.katacoda.com/courses/kubernetes/playground and Python for the experiment. There are TCP, EXEC and HTTP probes, but HTTP is preferable, since EXEC spawns processes and can leave them behind as "zombie processes". Besides, if the server provides interaction via HTTP, then it is exactly that interaction that should be checked (https://www.katacoda.com/courses/kubernetes/playground):

controlplane $ kubectl version --short
Client Version: v1.18.0
Server Version: v1.18.0
cat << EOF > job.yaml
apiVersion: v1
kind: Pod
metadata:
  name: healt
spec:
  containers:
  - name: python
    image: python
    command: ['sh', '-c', 'sleep 60 && (echo "work" > health) && sleep 60 && python -m http.server 9000']
    readinessProbe:
      httpGet:
        path: /health
        port: 9000
      initialDelaySeconds: 3
      periodSeconds: 3
    livenessProbe:
      httpGet:
        path: /health
        port: 9000
      initialDelaySeconds: 3
      periodSeconds: 3
    startupProbe:
      exec:
        command:
        - cat
        - /health
      initialDelaySeconds: 3
      periodSeconds: 3
  restartPolicy: OnFailure
EOF

controlplane $ kubectl create -f job.yaml
pod/healt created
controlplane $ kubectl get pods # not loaded yet
NAME   READY  STATUS   RESTARTS  AGE
healt  0/1    Running  0         11s
controlplane $ sleep 30 && kubectl get pods # not loaded yet, but the image is already downloaded
NAME   READY  STATUS   RESTARTS  AGE
healt  0/1    Running  0         51s
controlplane $ sleep 60 && kubectl get pods
NAME   READY  STATUS   RESTARTS  AGE
healt  0/1    Running  1         116s
controlplane $ kubectl delete -f job.yaml
pod "healt" deleted

Self-diagnosis of a microservice application

Let's consider how probes work using the example of the bookinfo microservice application, which ships as a sample with Istio: https://github.com/istio/istio/tree/master/samples/bookinfo. The demo will run at www.katacoda.com/courses/istio/deploy-istio-on-kubernetes. After deployment it will be available

Infrastructure management

Although Kubernetes has its own graphical interface, the UI dashboard, it offers little beyond monitoring and simple actions. OpenShift gives more possibilities, combining graphical and textual creation of resources. Google does not ship a full-fledged on-premises product with its ecosystem built around Kubernetes, but offers a cloud solution, Google Cloud Platform. However, there are third-party solutions, such as OpenShift and Rancher, that let you use Kubernetes fully through a graphical interface on your own facilities. If desired, of course, you can sync with the cloud.

These products are often not API-compatible with each other; the only known exception is Mail.ru Cloud, which claims support for OpenShift. There is, however, a third-party solution that implements the infrastructure-as-code approach and supports the APIs of most well-known ecosystems: Terraform. Like Kubernetes, it applies the concept of infrastructure as code, but not to containerization: to virtual machines (servers, networks, disks). The infrastructure-as-code principle implies a declarative configuration, that is, a description of the desired result without explicitly specifying the actions to achieve it. Upon activation of the configuration (kubectl apply -f name_config.yml in Kubernetes, terraform apply in HashiCorp Terraform) the system is brought in line with the configuration files; when the configuration or the infrastructure changes, the conflicting parts of the infrastructure are brought back in line with the declaration, and the system itself decides how to achieve this. The behavior can differ: for example, when meta information in a POD changes, it is updated in place, but when the image changes, the POD is deleted and created anew. If earlier we created the server infrastructure for containers in an imperative form using the gcloud command of the Google Cloud Platform (GCP) public cloud, now we will consider how to create a similar configuration declaratively, in the infrastructure-as-code style, using the universal Terraform tool, which supports the GCP cloud.

Terraform did not appear out of nowhere; it is a continuation of a long history of software products for configuring and managing server infrastructure. I will list them in order of appearance and transition:

** CFN;

** Puppet;

** Chef;

** Ansible;

** Cloud AWS API, Kubernetes API;

* IaC: Terraform does not depend on the type of infrastructure (it supports more than 120 providers, including not only clouds), in contrast to its vendor counterparts that support only their own platform: CloudFormation for Amazon Web Services, Azure Resource Manager for Microsoft Azure, Google Cloud Deployment Manager for Google Cloud Engine.

CloudFormation is built by Amazon and is provided free of charge; it is also fully integrated into the CI/CD of infrastructure hosted on AWS, with templates stored on AWS S3, which makes GIT versioning difficult. We will consider the platform-independent Terraform: the syntax of its basic functionality is the same everywhere, and platform-specific features are connected through Provider entities (https://www.terraform.io/docs/providers/index.html). Terraform is a single binary that supports a huge number of providers, including, of course, AWS and GCE. Like most HashiCorp products it is written in Go, is a single executable file, requires no installation, and just needs to be downloaded into a Linux folder:

(agile-aleph-203917)$ wget https://releases.hashicorp.com/terraform/0.11.13/terraform_0.11.13_linux_amd64.zip
(agile-aleph-203917)$ unzip terraform_0.11.13_linux_amd64.zip -d .
(agile-aleph-203917)$ rm -f terraform_0.11.13_linux_amd64.zip
(agile-aleph-203917)$ ./terraform version
Terraform v0.11.13
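Before going further, a rough sketch of what a declarative description of GCP infrastructure can look like in Terraform 0.11 HCL (the region, zone, cluster name and node count below are assumptions for illustration, not values from the book):

provider "google" {
  project = "agile-aleph-203917"        # the project id seen in the shell prompts above
  region  = "us-central1"               # assumption
}

resource "google_container_cluster" "cluster" {
  name               = "cluster-1"      # assumption
  zone               = "us-central1-a"  # assumption
  initial_node_count = 1
}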

Terraform supports splitting the configuration into modules, which you can write yourself or take ready-made (https://registry.terraform.io/browse?offset=27&provider=google). To orchestrate modules and manage their dependencies you can use Terragrunt (https://davidbegin.github.io/terragrunt/), for example:

terragrunt = {
  terraform {
    source = "terraform-aws-modules/…"
  }
  dependencies {
    paths = ["../network"]
  }
}

name = "…"
ami = "…"
instance_type = "t3.large"

Unified configuration semantics for different providers (AWS, GCE, Yandex.Cloud and many others) make it possible to create infrastructure that spans them: for example, permanently loaded services can be placed on your own capacity to save money, while variably loaded services (for example, during a promotion period) go to public clouds. Because management is declarative and can be described by files (IaC, infrastructure as code), the creation of infrastructure can be added to a CI/CD pipeline (development, testing, delivery, everything automatic and under version control). Without CI/CD, locking of configuration files is supported to prevent concurrent editing when working together. The infrastructure is not created by a script; it is brought into conformity with the configuration, which is declarative and cannot contain logic, although BASH scripts can be injected into it and Conditions (the ternary operator) can be used for different environments.
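A small hedged illustration of such a conditional in Terraform 0.11 HCL (the variable name, the AMI id and the instance counts are assumptions for illustration):

variable "env" {
  default = "staging"                       # assumption
}

resource "aws_instance" "app" {
  ami           = "ami-0c55b159cbfafe1f0"   # hypothetical AMI id
  instance_type = "t3.large"
  count         = "${var.env == "production" ? 3 : 1}"   # ternary: 3 instances in production, otherwise 1
}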

Terraform reads all files in the current directory with the .tf extension, in HashiCorp Configuration Language (HCL) format, or with the .tf.json extension, in JSON format. Often, instead of one file, the configuration is split into several, at least two: the first containing the configuration itself, the second the private data in variables.

To demonstrate Terraform's capabilities, we will work with a GitHub repository, because of its simple authorization and API. First we obtain a token generated in the web interface: Settings -> Developer settings -> Personal access tokens -> Generate new token, and set its permissions. For now we will not create anything, just check the connection:

(agile-aleph-203917)$ ls *.tf
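As a minimal hedged sketch, the split into configuration and private variables for the GitHub provider could look roughly like this (Terraform 0.11 syntax; the file names and the organization value are assumptions, not taken from the book):

# main.tf: the configuration itself
provider "github" {
  token        = "${var.github_token}"
  organization = "example-org"          # hypothetical organization name
}

# variables.tf: private data kept in a separate file
variable "github_token" {
  description = "Personal access token generated in the GitHub web interface"
}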
