Pod controller overview
Overview
Pods are created, run, monitored and destroyed by the kubelet on the node where they live, but once a node goes down, that kubelet can no longer monitor the pods on it. Kubernetes therefore runs the controller-manager component on the master nodes (itself deployable in a highly available way). It bundles many controllers, each managing a different type of pod; common ones include:
- replicationcontroller
- replicaset
- deployment
- daemonset
- job
- cronjob
- statefulset
- node lifecycle controller
- namespace controller
- service controller
- csrsigning controller (cluster certificate signing controller)
- ...
Each controller runs an internal reconcile loop that drives a resource object's actual state (status) toward its desired state (spec) until they agree. It continuously monitors the pods it owns and keeps correcting any deviation; controllers also handle pod upgrades and rollbacks, with different update strategies such as canary and gray releases.
Pods and controllers
Controllers such as Deployment, ReplicaSet and StatefulSet that manage pods running workloads are called workload controllers. A workload controller consists of:
- label selector: identifies and filters the pods it controls
- desired replica count: how many replicas to run
- pod template: the container image and version, arguments, etc. to run
Pod template
When defining a controller, the pod definition is embedded in the controller's spec.template field:
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: rs1
spec:
  replicas: 2
  selector:
    matchLabels:
      app: rs-demo
  template:
    metadata:
      labels:
        app: rs-demo
    spec:
      containers:
      - name: app
        image: ikubernetes/myapp:v1
        ports:
        - name: http
          containerPort: 80
replicaset
rs reconcile loop (diagram)
The rs controller (ReplicaSet) is the upgraded version of rc (ReplicationController). It keeps the pod replica count exact, and combined with HPA it enables automatic pod scaling.
Creating an rs controller
1. Edit the yaml file, then create the object
[root@client rs]# kubectl apply -f rs-myapp
replicaset.apps/rs-myapp created
[root@client rs]# kubectl get pods -w
NAME READY STATUS RESTARTS AGE
rs-myapp-r2mqb 1/1 Running 0 8s
rs-myapp-tktg7 0/1 ContainerCreating 0 8s
[root@client rs]# kubectl get rs -o wide
NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR
rs-myapp 2 2 2 25m myapp ikubernetes/myapp:v1 app=rs-demo
The rs registers a watch with the api-server and continuously monitors changes through the watch mechanism. Based on its label selector and the defined replica count, it keeps the set of pods matching the labels equal to the desired count: extras are deleted, shortfalls replenished. Once a pod's labels are changed, that pod is no longer monitored by this rs, and may be captured and taken over by another controller whose selector matches it.
2. Below, a pod is deleted and the rs automatically replaces it
[root@client rs]# kubectl delete pods/rs-myapp-tktg7
pod "rs-myapp-tktg7" deleted
[root@client rs]# kubectl get pods -o wide -w
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
rs-myapp-r2mqb 1/1 Running 0 38m 10.244.2.7 node1 <none>
rs-myapp-tktg7 1/1 Running 0 38m 10.244.1.4 node2 <none>
rs-myapp-tktg7 1/1 Terminating 0 38m 10.244.1.4 node2 <none>
rs-myapp-wclqm 0/1 Pending 0 0s <none> <none> <none>
rs-myapp-wclqm 0/1 Pending 0 0s <none> node3 <none>
rs-myapp-wclqm 0/1 ContainerCreating 0 0s <none> node3 <none>
rs-myapp-tktg7 0/1 Terminating 0 38m 10.244.1.4 node2 <none>
rs-myapp-wclqm 1/1 Running 0 2s 10.244.4.6 node3 <none>
rs-myapp-tktg7 0/1 Terminating 0 38m 10.244.1.4 node2 <none>
rs-myapp-tktg7 0/1 Terminating 0 38m 10.244.1.4 node2 <none>
Updating
Changing the image version
1. Modify the image version
[root@client rs]# kubectl get rs -o wide
NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR
rs-myapp 2 2 2 47m myapp ikubernetes/myapp:v1 app=rs-demo
[root@client rs]# vim rs-myapp
[root@client rs]# kubectl replace -f rs-myapp
replicaset.apps/rs-myapp replaced
[root@client rs]# kubectl get rs -o wide
NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR
rs-myapp 2 2 2 48m myapp ikubernetes/myapp:v2 app=rs-demo
# After changing the image to v2, the rs reports v2 but the existing pods are not updated; only when you delete them manually does the rs use the new image to replenish the replicas
# Watching with -w below shows that a pod only moves to the new version after the old one is deleted; a deployment controller, by contrast, would update right away
[root@client rs]# kubectl get pods
NAME READY STATUS RESTARTS AGE
rs-myapp-r2mqb 1/1 Running 1 48m
rs-myapp-wclqm 1/1 Running 0 9m56s
[root@client rs]# kubectl get pods -w
NAME READY STATUS RESTARTS AGE
rs-myapp-r2mqb 1/1 Running 1 48m
rs-myapp-wclqm 1/1 Running 0 10m
rs-myapp-r2mqb 1/1 Terminating 1 49m
rs-myapp-ckhh4 0/1 Pending 0 0s
rs-myapp-ckhh4 0/1 Pending 0 0s
rs-myapp-ckhh4 0/1 ContainerCreating 0 0s
rs-myapp-r2mqb 0/1 Terminating 1 49m
rs-myapp-r2mqb 0/1 Terminating 1 49m
rs-myapp-r2mqb 0/1 Terminating 1 49m
rs-myapp-ckhh4 1/1 Running 0 13s
rs-myapp-wclqm 1/1 Terminating 0 10m
rs-myapp-b2bnm 0/1 Pending 0 0s
rs-myapp-b2bnm 0/1 Pending 0 0s
rs-myapp-b2bnm 0/1 ContainerCreating 0 0s
rs-myapp-wclqm 0/1 Terminating 0 10m
rs-myapp-wclqm 0/1 Terminating 0 11m
rs-myapp-wclqm 0/1 Terminating 0 11m
rs-myapp-b2bnm 1/1 Running 0 9s
2. Delete the existing pods
[root@client rs]# kubectl delete pods/rs-myapp-r2mqb
pod "rs-myapp-r2mqb" deleted
[root@client rs]# kubectl delete pods/rs-myapp-wclqm
pod "rs-myapp-wclqm" deleted
3. View the updated pods
^C[root@client rs]# kubectl get pods -w
NAME READY STATUS RESTARTS AGE
rs-myapp-b2bnm 1/1 Running 0 34s
rs-myapp-ckhh4 1/1 Running 0 49s
Scaling
Ways to scale:
- edit the replica count in the yaml file, then apply it again
- use the scale subcommand to set the new replica count directly on the command line
1. Scale up with the scale command
[root@client rs]# kubectl scale --replicas=3 rs/rs-myapp
replicaset.extensions/rs-myapp scaled
[root@client rs]# kubectl get pods
NAME READY STATUS RESTARTS AGE
rs-myapp-b2bnm 1/1 Running 0 4h34m
rs-myapp-ckhh4 1/1 Running 0 4h34m
rs-myapp-xdtqv 0/1 ContainerCreating 0 5s
[root@client rs]# kubectl get rs
NAME DESIRED CURRENT READY AGE
rs-myapp 3 3 2 5h24m
2. Scale down with the scale command
[root@client rs]# kubectl scale --replicas=1 rs/rs-myapp
replicaset.extensions/rs-myapp scaled
[root@client rs]# kubectl get rs
NAME DESIRED CURRENT READY AGE
rs-myapp 1 1 1 5h24m
[root@client rs]# kubectl get pods
NAME READY STATUS RESTARTS AGE
rs-myapp-ckhh4 1/1 Running 0 4h35m
3. Delete the rs without cascading to its pods
[root@client rs]# kubectl delete --cascade=false rs/rs-myapp
replicaset.extensions "rs-myapp" deleted
[root@client rs]# kubectl get pods
NAME READY STATUS RESTARTS AGE
rs-myapp-9swpf 1/1 Running 0 63s
rs-myapp-pqt7f 1/1 Running 0 63s
[root@client rs]# kubectl get rs
No resources found.
Deleting an rs object normally deletes the pods it manages; the --cascade=false option keeps the pod objects, turning them into standalone pods managed directly by the user.
rs is rarely used directly; the usual choice is the deployment controller, one layer above, which wraps rs. A deployment manages multiple versions of rs, making version changes much easier.
Deleting
Deleting an rs directly also deletes the pods it controls:
[root@client rs]# kubectl delete -f rs-myapp
replicaset.apps "rs-myapp" deleted
[root@client rs]# kubectl get pods
NAME READY STATUS RESTARTS AGE
rs-myapp-ckhh4 0/1 Terminating 0 4h35m
[root@client rs]# kubectl get pods
No resources found.
deployment
The deployment controller sits one layer above rs: it controls pods by controlling rs objects, and by managing multiple rs versions it provides pod version changes, different update strategies, and other features.
Creating a deployment
[root@client deployment]# cat dep1.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dep1
spec:
  replicas: 2
  selector:
    matchLabels:
      app: dep1-app
  template:
    metadata:
      labels:
        app: dep1-app
    spec:
      containers:
      - name: myapp
        image: ikubernetes/myapp:v1
        ports:
        - name: http
          containerPort: 80
Creating a deploy creates, in turn, the deploy, the rs and the pods, layer by layer:
[root@client deployment]# kubectl apply -f dep1.yaml
deployment.apps/dep1 created
[root@client deployment]# kubectl get pods
NAME READY STATUS RESTARTS AGE
dep1-7b96746498-ggwmk 1/1 Running 0 5s
dep1-7b96746498-kbb86 1/1 Running 0 5s
[root@client deployment]# kubectl get rs
NAME DESIRED CURRENT READY AGE
dep1-7b96746498 2 2 2 83s
[root@client deployment]# kubectl get deploy
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
dep1 2 2 2 2 18s
Update strategies
The default strategy is RollingUpdate, i.e. pods are replaced gradually; its sub-fields maxSurge and maxUnavailable determine how the rolling update proceeds:
[root@client deployment]# kubectl explain deployment.spec.strategy
KIND: Deployment
VERSION: extensions/v1beta1
RESOURCE: strategy <Object>
DESCRIPTION:
The deployment strategy to use to replace existing pods with new ones.
DeploymentStrategy describes how to replace existing pods with new ones.
FIELDS:
rollingUpdate <Object>
Rolling update config params. Present only if DeploymentStrategyType =
RollingUpdate.
type <string>
Type of deployment. Can be "Recreate" or "RollingUpdate". Default is
RollingUpdate.
[root@client deployment]# kubectl explain deployment.spec.strategy.rollingUpdate
A deployment has 2 update strategies:
- recreate: delete the existing pods, then create new ones; the service is unavailable for a while
- rolling update: upgrade step by step; delete some old pods, start new-version pods to replace them, and repeat until all are updated
- maxSurge: during the change, the pod count may exceed the replica count defined in spec by at most this many
- maxUnavailable: during the change, at most this many pods may be deleted (unavailable) at once
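The two knobs sit under spec.strategy; a minimal sketch (the numbers are illustrative, not taken from the example above):

```yaml
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most replicas+1 = 5 pods may exist during the update
      maxUnavailable: 1    # at most 1 pod may be unavailable at any moment
```

Both fields also accept percentage strings such as "25%", resolved against the replica count.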
Upgrading a deployment
1. Upgrade with set image
Alternatively, editing the yaml file and re-applying it, or changing the image definition with the patch command on the command line, also changes the container version;
[root@client deployment]# kubectl set image deploy/dep1 myapp=ikubernetes/myapp:v2
deployment.extensions/dep1 image updated
2. Verify
After the upgrade there are 2 rs objects. The old rs simply has no pods running under it, but it is kept so a rollback is possible; the pods now run the new container image.
[root@client deployment]# kubectl get rs -o wide
NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR
dep1-7b96746498 2 2 2 169m myapp ikubernetes/myapp:v1 app=dep1-app,pod-template-hash=7b96746498
[root@client deployment]# kubectl get rs -o wide -w
...
# -w watches the rs as it changes
[root@client deployment]# kubectl get rs -o wide
NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR
dep1-5f9fd5b957 2 2 2 2m25s myapp ikubernetes/myapp:v2 app=dep1-app,pod-template-hash=5f9fd5b957
dep1-7b96746498 0 0 0 173m myapp ikubernetes/myapp:v1 app=dep1-app,pod-template-hash=7b96746498
[root@client deployment]# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
dep1-5f9fd5b957-2xx2n 1/1 Running 0 2m28s 10.244.2.12 node1 <none>
dep1-5f9fd5b957-txvdj 1/1 Running 0 2m30s 10.244.1.7 node2 <none>
Canary release
Concept:
When a deployment upgrades the container image version, the default behavior is to upgrade all pods gradually. In production, a "canary release" is safer: upgrade only part of the pods first, then steer some user traffic to the upgraded pods through a service or ingress. After observing for a while, if the new version stays stable, upgrade the rest; if it fails, roll back to the old version and debug the new one.
Experiment:
1. Change the image version and immediately pause the rollout
[root@client deployment]# kubectl set image deploy/dep1 myapp=ikubernetes/myapp:v3 \
> && kubectl rollout pause deploy/dep1
deployment.extensions/dep1 image updated
deployment.extensions/dep1 paused
2. Resume the rollout
[root@client deployment]# kubectl rollout resume deploy/dep1
deployment.extensions/dep1 resumed
3. Monitor the status in the meantime
[root@client deployment]# kubectl rollout status deploy/dep1
Waiting for deployment "dep1" rollout to finish: 1 old replicas are pending termination...
^C[root@client deployment]# kubectl rollout status deploy/dep1
Waiting for deployment "dep1" rollout to finish: 1 old replicas are pending termination...
Waiting for deployment spec update to be observed...
Waiting for deployment spec update to be observed...
Waiting for deployment "dep1" rollout to finish: 1 old replicas are pending termination...
Rolling back a deployment
1. Roll back directly
[root@client deployment]# kubectl rollout undo deploy/dep1
deployment.extensions/dep1
2. Roll back via the revision history
[root@client deployment]# kubectl rollout history deploy/dep1
deployment.extensions/dep1
REVISION CHANGE-CAUSE
2 <none>
4 <none>
5 <none>
[root@client deployment]# kubectl rollout undo deploy/dep1 --to-revision=5
deployment.extensions/dep1
Scaling up and down
1. Edit the replicas value in the yaml file and apply it again
[root@client deployment]# kubectl get pods
NAME READY STATUS RESTARTS AGE
dep1-7b96746498-nr2lm 1/1 Running 0 3m58s
dep1-7b96746498-sdlb6 1/1 Running 0 3h59m
[root@client deployment]# vim dep1.yaml
[root@client deployment]# kubectl apply -f dep1.yaml
deployment.apps/dep1 configured
[root@client deployment]# kubectl get pods
NAME READY STATUS RESTARTS AGE
dep1-7b96746498-lzsn6 1/1 Running 0 4s
dep1-7b96746498-nr2lm 1/1 Running 0 4m15s
dep1-7b96746498-sdlb6 1/1 Running 0 4h
daemonset
The daemonset controller, ds for short, runs exactly one pod on every node (a node selector can also limit it to nodes carrying particular labels). It is commonly used for node-level daemons, for example:
- log collection: fluentd, logstash
- cluster storage: ceph, glusterfsd
- monitoring agents: prometheus node exporter, collectd, datadog agent
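To pin a ds to labeled nodes, a nodeSelector goes in the pod template; a minimal sketch (the label key and value here are hypothetical):

```yaml
spec:
  template:
    spec:
      nodeSelector:
        disk: ssd   # hypothetical label; the ds schedules pods only onto matching nodes
```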
Creating
1. Write the yaml file
[root@client daemonset]# cat ds1.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat-ds
  labels:
    app: filebeat
spec:
  selector:
    matchLabels:
      app: filebeat
  template:
    metadata:
      labels:
        app: filebeat
      name: filebeat
    spec:
      containers:
      - name: filebeat
        image: ikubernetes/filebeat:5.6.5-alpine
        env:
        - name: REDIS_HOST
          value: db.redis.io:6379
        - name: LOG_LEVEL
          value: info
2. Create
[root@client daemonset]# kubectl apply -f ds1.yaml
daemonset.apps/filebeat-ds created
[root@client daemonset]# kubectl get ds
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
filebeat-ds 3 3 0 3 0 <none> 5s
3. Inspect
[root@client daemonset]# kubectl describe ds/filebeat-ds
Name: filebeat-ds
Selector: app=filebeat
Node-Selector: <none>
Labels: app=filebeat
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"apps/v1","kind":"DaemonSet","metadata":{"annotations":{},"labels":{"app":"filebeat"},"name":"filebeat-ds","namespace":"defa...
Desired Number of Nodes Scheduled: 3
Current Number of Nodes Scheduled: 3
Number of Nodes Scheduled with Up-to-date Pods: 3
Number of Nodes Scheduled with Available Pods: 0
Number of Nodes Misscheduled: 0
Pods Status: 0 Running / 3 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels: app=filebeat
Containers:
filebeat:
Image: ikubernetes/filebeat:5.6.5-alpine
Port: <none>
Host Port: <none>
Environment:
REDIS_HOST: db.redis.io:6379
LOG_LEVEL: info
Mounts: <none>
Volumes: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 12m daemonset-controller Created pod: filebeat-ds-tj2b5
Normal SuccessfulCreate 12m daemonset-controller Created pod: filebeat-ds-5q4sc
Normal SuccessfulCreate 12m daemonset-controller Created pod: filebeat-ds-t6l2g
Updating
Like a deployment, ds has its own update strategies, supporting RollingUpdate and OnDelete:
[root@client daemonset]# kubectl explain ds.spec.updateStrategy
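A sketch of the block (the value is illustrative): with RollingUpdate, pods are replaced node by node as soon as the template changes; with OnDelete, a node only gets the new version after its old pod is deleted manually.

```yaml
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # replace the pod on at most 1 node at a time
```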
job
A job runs one-off pods that perform a task and exit once the goal is reached; the finished state is Completed, e.g. backups or computations. Jobs come in 2 kinds, serial and parallel:
- serial jobs: multiple jobs run one after another; the next starts only after the previous one completes and exits, like a backup pipeline;
- parallel jobs: the job is split into multiple queues that perform the same task simultaneously, like a computation;
Creating a job
1. Syntax
[root@client daemonset]# kubectl explain job.spec
2. Create
[root@client job]# vim job1.yaml
[root@client job]# cat job1.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: job1
spec:
  template:
    spec:
      containers:
      - name: alpine
        image: alpine
        command: ["/bin/sh", "-c", "sleep 20"]
      restartPolicy: OnFailure
# The default restart policy Always is unsuitable for job pods; set it explicitly to Never or OnFailure, the latter usually being more appropriate
[root@client job]# kubectl apply -f job1.yaml
job.batch/job1 created
[root@client job]# kubectl get jobs
NAME COMPLETIONS DURATION AGE
job1 0/1 3s 3s
# After it finishes, COMPLETIONS shows 1/1
Creating a parallel job
1. The field defining the degree of parallelism
[root@client job]# kubectl explain job.spec.parallelism
KIND: Job
VERSION: batch/v1
FIELD: parallelism <integer>
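Together with completions, a parallel job can be sketched like this (names and numbers are illustrative):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pjob
spec:
  completions: 6    # total number of successful pod completions required
  parallelism: 2    # run up to 2 pods at the same time
  template:
    spec:
      containers:
      - name: worker
        image: alpine
        command: ["/bin/sh", "-c", "sleep 5"]
      restartPolicy: OnFailure
```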
Scaling a job
1. Scale a running job with:
kubectl scale jobs/job1 --replicas=N
Deleting a job
1. If a job pod cannot complete but its restart policy keeps restarting it, job.spec provides parameters to stop it from restarting forever and tying up resources:
FIELDS:
activeDeadlineSeconds <integer> # maximum time the pod may be active; once exceeded, it is killed
Specifies the duration in seconds relative to the startTime that the job
may be active before the system tries to terminate it; value must be
positive integer
backoffLimit <integer> # number of restart attempts before the job is marked failed
Specifies the number of retries before marking this job failed. Defaults to
6
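Both limits drop directly under job.spec; for instance (values illustrative):

```yaml
spec:
  activeDeadlineSeconds: 120   # terminate the job once it has been active for 2 minutes
  backoffLimit: 3              # mark the job failed after 3 retries instead of the default 6
```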
cronjob
Similar to crontab on Linux, a cronjob runs periodic scheduled tasks, or one-off scheduled tasks;
Syntax:
[root@client job]# kubectl explain cronjob.spec
# 2 required fields
jobTemplate <Object> -required-
Specifies the job that will be created when executing a CronJob.
schedule <string> -required-
The schedule in Cron format, see https://en.wikipedia.org/wiki/Cron.
Example
1. The yaml file
[root@client cronjob]# cat cronjob.yaml
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: cronjob-1
  labels:
    app: cronjob-1
spec:
  schedule: "*/2 * * * *"
  jobTemplate:
    metadata:
      labels:
        app: mycronjob
    spec:
      parallelism: 2
      template:
        spec:
          containers:
          - name: myjob
            image: alpine
            command:
            - /bin/sh
            - -c
            - date; echo hello k8s; sleep 10
          restartPolicy: OnFailure
2. Create
[root@client cronjob]# kubectl apply -f cronjob.yaml
cronjob.batch/cronjob-1 created
[root@client cronjob]# kubectl get cronjob
NAME SCHEDULE SUSPEND ACTIVE LAST SCHEDULE AGE
cronjob-1 */2 * * * * False 0 <none> 5s
3. View the jobs the cronjob creates
Just as deployment manages rs, cronjob works by managing jobs: each run creates a job named after the cronjob with a suffix derived from the scheduled time;
[root@client cronjob]# kubectl get jobs
NAME COMPLETIONS DURATION AGE
cronjob-1-1605520680 2/1 of 2 30s 2m30s
cronjob-1-1605520800 2/1 of 2 29s 30s
replicationcontroller
An early pod controller; replicaset is its upgraded version
Pod disruption budget
Introduction
Voluntary disruption:
expected changes, such as pod version changes or node migrations
Involuntary disruption:
unexpected changes, such as sudden hardware or system failures, disk faults, etc.
Disruption budget:
podDisruptionBudget, PDB for short: when voluntary disruptions occur, the number of pods must still stay within the budget, so that the service remains available
Syntax
Note that a PDB's selector must match the controller it is "budgeting" for, i.e. use the same selector so both filter the same group of pods;
[root@client pdb]# kubectl explain pdb
KIND: PodDisruptionBudget
VERSION: policy/v1beta1
DESCRIPTION:
PodDisruptionBudget is an object to define the max disruption that can be
caused to a collection of pods
FIELDS:
apiVersion <string>
APIVersion defines the versioned schema of this representation of an
object. Servers should convert recognized schemas to the latest internal
value, and may reject unrecognized values. More info:
https://git.k8s.io/community/contributors/devel/api-conventions.md#resources
kind <string>
Kind is a string value representing the REST resource this object
represents. Servers may infer this from the endpoint the client submits
requests to. Cannot be updated. In CamelCase. More info:
https://git.k8s.io/community/contributors/devel/api-conventions.md#types-kinds
metadata <Object>
spec <Object>
Specification of the desired behavior of the PodDisruptionBudget.
status <Object>
Most recently observed status of the PodDisruptionBudget.
Example
[root@client pdb]# cat pdb.yaml
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: dep1-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: dep1-app
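Instead of minAvailable, a PDB may cap disruptions with maxUnavailable (use one or the other, not both); a sketch against the same selector:

```yaml
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: dep1-pdb
spec:
  maxUnavailable: 1   # at most 1 matching pod may be down due to voluntary disruption
  selector:
    matchLabels:
      app: dep1-app
```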