Kubernetes生产环境部署最佳实践
Kubernetes has become the de facto standard for container orchestration. This article walks through best practices for deploying Kubernetes in production environments.
Architecture Design
1. High-Availability Architecture
Control Plane High Availability
Architecture pattern:
- 3 Master nodes (an odd number, to avoid split-brain)
- etcd cluster (3 or 5 members)
- Load-balanced API Servers
- Leader election for the Controller Manager and Scheduler
Deployment topology:
Master-1: API Server + etcd + Controller + Scheduler
Master-2: API Server + etcd + Controller + Scheduler
Master-3: API Server + etcd + Controller + Scheduler
Load balancer: HAProxy/Nginx/Cloud LB (a minimal HAProxy sketch follows)
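As a sketch of the load-balancing layer, a minimal HAProxy configuration that spreads API traffic across the three Masters might look like this (the IPs are assumptions, borrowed from the Kubespray inventory later in this article):
# /etc/haproxy/haproxy.cfg (illustrative excerpt)
frontend k8s-api
    bind *:6443
    mode tcp
    default_backend k8s-masters
backend k8s-masters
    mode tcp
    balance roundrobin
    option tcp-check
    server master-1 192.168.1.11:6443 check
    server master-2 192.168.1.12:6443 check
    server master-3 192.168.1.13:6443 check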
Worker Node Design
Node types:
- General nodes: stateless applications
- Compute nodes: CPU-intensive applications
- Memory nodes: memory-intensive applications
- GPU nodes: AI/ML workloads
- Storage nodes: stateful applications
Label planning:
node-role.kubernetes.io/worker: "true"
node-type: general/compute/memory/gpu/storage
zone: zone-a/zone-b/zone-c
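These labels are applied per node with kubectl; for example (node name is illustrative):
# Label a worker by role, type, and zone
kubectl label node worker-1 node-role.kubernetes.io/worker=true
kubectl label node worker-1 node-type=compute
kubectl label node worker-1 zone=zone-a
Workloads can then target node pools via nodeSelector or node affinity on these labels.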
2. Network Architecture
CNI Selection
# Calico configuration example
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  calicoNetwork:
    ipPools:
    - blockSize: 26
      cidr: 10.244.0.0/16
      encapsulation: VXLANCrossSubnet
      natOutgoing: Enabled
      nodeSelector: all()
Service Mesh
# Istio installation configuration
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  profile: default
  components:
    pilot:
      k8s:
        resources:
          requests:
            cpu: 2000m
            memory: 4Gi
  meshConfig:
    defaultConfig:
      proxyMetadata:
        ISTIO_META_DNS_CAPTURE: "true"
    enableAutoMtls: true
Installation and Deployment
1. Installing with kubeadm
Initialize the Master Nodes
# Prepare the environment
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system
# Install the container runtime (containerd)
sudo apt-get update
sudo apt-get install -y containerd
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml
# Enable the systemd cgroup driver expected by kubeadm-managed kubelets
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sudo systemctl restart containerd
# Install kubeadm, kubelet, and kubectl
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.28/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.28/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
# Initialize the first Master (the endpoint is the load balancer's address)
sudo kubeadm init \
  --control-plane-endpoint "k8s-api.example.com:6443" \
  --pod-network-cidr=10.244.0.0/16 \
  --service-cidr=10.96.0.0/12 \
  --upload-certs
# Configure kubectl
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Join the Remaining Master Nodes
# On the first Master, re-upload certificates to obtain a new certificate key
sudo kubeadm init phase upload-certs --upload-certs
# Print a fresh join token and discovery hash
kubeadm token create --print-join-command
# Join the other Masters
sudo kubeadm join k8s-api.example.com:6443 \
  --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash> \
  --control-plane \
  --certificate-key <certificate-key>
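Worker nodes join with the same token and discovery hash, minus the control-plane flags:
# Join a worker node
sudo kubeadm join k8s-api.example.com:6443 \
  --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash>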
2. Installing with Kubespray
Inventory Configuration
# inventory/mycluster/inventory.ini
[all]
master-1 ansible_host=192.168.1.11 ip=192.168.1.11
master-2 ansible_host=192.168.1.12 ip=192.168.1.12
master-3 ansible_host=192.168.1.13 ip=192.168.1.13
worker-1 ansible_host=192.168.1.21 ip=192.168.1.21
worker-2 ansible_host=192.168.1.22 ip=192.168.1.22
worker-3 ansible_host=192.168.1.23 ip=192.168.1.23
[kube_control_plane]
master-1
master-2
master-3
[etcd]
master-1
master-2
master-3
[kube_node]
worker-1
worker-2
worker-3
[k8s_cluster:children]
kube_control_plane
kube_node
Deployment Commands
# Install dependencies
pip install -r requirements.txt
# Deploy the cluster
ansible-playbook -i inventory/mycluster/inventory.ini \
  --become --become-user=root cluster.yml
# Upgrade the cluster
ansible-playbook -i inventory/mycluster/inventory.ini \
  --become --become-user=root upgrade-cluster.yml
Core Component Configuration
1. Storage Configuration
StorageClass Configuration
# NFS StorageClass
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-client
provisioner: k8s-sigs.io/nfs-subdir-external-provisioner
parameters:
  archiveOnDelete: "false"
reclaimPolicy: Delete
volumeBindingMode: Immediate
---
# Ceph RBD StorageClass
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph-rbd
provisioner: rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph
  pool: replicapool
  imageFormat: "2"
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: Immediate
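Applications consume these classes through a PersistentVolumeClaim; a minimal sketch (claim name and size are illustrative):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myapp-data        # illustrative name
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: ceph-rbd
  resources:
    requests:
      storage: 10Gi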
2. Monitoring Configuration
Prometheus Operator
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: monitoring
spec:
  replicas: 2
  serviceAccountName: prometheus
  serviceMonitorSelector:
    matchLabels:
      team: frontend
  resources:
    requests:
      memory: 400Mi
  enableAdminAPI: false
  retention: 30d
  retentionSize: "50GB"
  storage:
    volumeClaimTemplate:
      spec:
        storageClassName: standard
        resources:
          requests:
            storage: 50Gi
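For this Prometheus to scrape anything, ServiceMonitors must carry the team: frontend label its serviceMonitorSelector matches; a minimal sketch, where the target Service label and port name are assumptions:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: myapp
  namespace: monitoring
  labels:
    team: frontend        # must match serviceMonitorSelector above
spec:
  selector:
    matchLabels:
      app: myapp          # assumed label on the target Service
  endpoints:
  - port: http            # assumed named port on the Service
    interval: 30s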
Grafana Configuration
apiVersion: integreatly.org/v1alpha1
kind: Grafana
metadata:
  name: grafana
  namespace: monitoring
spec:
  config:
    auth:
      disable_login_form: false
    security:
      admin_user: admin
      admin_password: admin123   # demo value only; source real credentials from a Secret
  dashboardLabelSelector:
  - matchExpressions:
    - key: app
      operator: In
      values:
      - grafana
3. Logging Configuration
Loki Deployment
apiVersion: loki.grafana.com/v1
kind: LokiStack
metadata:
  name: logging-loki
  namespace: openshift-logging
spec:
  size: 1x.small
  storage:
    schemas:
    - effectiveDate: "2020-10-11"
      version: v11
    secret:
      name: logging-loki-s3
      type: s3
  storageClassName: standard
  tenants:
    mode: openshift-logging
Application Deployment
1. Deployment Best Practices
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  labels:
    app: myapp
    version: v1
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 0
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
        version: v1
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
    spec:
      serviceAccountName: myapp
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 1000
      containers:
      - name: myapp
        image: myregistry/myapp:v1.0.0
        imagePullPolicy: Always
        ports:
        - name: http
          containerPort: 8080
          protocol: TCP
        env:
        - name: PORT
          value: "8080"
        - name: LOG_LEVEL
          value: "info"
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: myapp-secrets
              key: database-url
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        livenessProbe:
          httpGet:
            path: /health
            port: http
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /ready
            port: http
          initialDelaySeconds: 5
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 3
        startupProbe:
          httpGet:
            path: /health
            port: http
          initialDelaySeconds: 10
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 30
        volumeMounts:
        - name: tmp
          mountPath: /tmp
        - name: config
          mountPath: /app/config
          readOnly: true
      volumes:
      - name: tmp
        emptyDir: {}
      - name: config
        configMap:
          name: myapp-config
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - myapp
              topologyKey: kubernetes.io/hostname
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: myapp
2. HPA Configuration
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "1000"
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
      - type: Pods
        value: 4
        periodSeconds: 15
      selectPolicy: Max
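The Resource metrics above require metrics-server; the http_requests_per_second Pods metric additionally needs a custom-metrics adapter such as prometheus-adapter exposing that metric name. Once applied, current targets and replica counts can be checked with:
# Verify the HPA is reading metrics and scaling
kubectl get hpa myapp-hpa
kubectl describe hpa myapp-hpa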
3. VPA Configuration
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: myapp-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: myapp
      minAllowed:
        cpu: 50m
        memory: 100Mi
      maxAllowed:
        cpu: 1000m
        memory: 1Gi
      controlledResources: ["cpu", "memory"]
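Note that Auto mode applies new requests by evicting and recreating Pods, and a VPA should not control the same CPU/memory resources an HPA scales the same workload on. Recommendations can be inspected without enforcement:
# View the recommendations VPA has computed
kubectl describe vpa myapp-vpa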
Security Configuration
1. RBAC Configuration
apiVersion: v1
kind: ServiceAccount
metadata:
  name: myapp
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: myapp-role
  namespace: default
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get"]
  resourceNames: ["myapp-secrets"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: myapp-rolebinding
  namespace: default
subjects:
- kind: ServiceAccount
  name: myapp
  namespace: default
roleRef:
  kind: Role
  name: myapp-role
  apiGroup: rbac.authorization.k8s.io
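The binding can be verified with impersonation:
# Should return "yes"
kubectl auth can-i get configmaps --as=system:serviceaccount:default:myapp -n default
# Should return "no" (only myapp-secrets is readable, and only via get)
kubectl auth can-i list secrets --as=system:serviceaccount:default:myapp -n default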
2. NetworkPolicy
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: myapp-network-policy
spec:
  podSelector:
    matchLabels:
      app: myapp
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          name: ingress-nginx
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: database
    ports:
    - protocol: TCP
      port: 5432
  - to:
    - namespaceSelector: {}
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
3. PodSecurityPolicy
Note: PodSecurityPolicy was deprecated in Kubernetes 1.21 and removed in 1.25, so it will not work on the v1.28 cluster installed above; use Pod Security Admission instead (see the example after this policy). The legacy policy below is kept for reference on older clusters.
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
spec:
  privileged: false
  allowPrivilegeEscalation: false
  requiredDropCapabilities:
  - ALL
  volumes:
  - 'configMap'
  - 'emptyDir'
  - 'projected'
  - 'secret'
  - 'downwardAPI'
  - 'persistentVolumeClaim'
  runAsUser:
    rule: 'MustRunAsNonRoot'
  seLinux:
    rule: 'RunAsAny'
  fsGroup:
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'RunAsAny'
  readOnlyRootFilesystem: true
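On 1.25+ clusters the equivalent restrictions are enforced with Pod Security Admission namespace labels (the namespace name is illustrative):
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted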
Backup and Recovery
1. etcd Backup
#!/bin/bash
# etcd-backup.sh
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-$(date +%Y%m%d-%H%M%S).db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key
# Keep only the last 7 days of backups
find /backup -name "etcd-*.db" -mtime +7 -delete
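Restore reverses the snapshot; a sketch with an illustrative snapshot filename (stop etcd, restore each member into a fresh data directory, then point etcd at it and restart):
# Restore a snapshot into a new data directory
ETCDCTL_API=3 etcdctl snapshot restore /backup/etcd-20240101-010000.db \
  --data-dir /var/lib/etcd-restored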
2. Velero Backup
# Install Velero
velero install \
  --provider aws \
  --bucket k8s-backups \
  --secret-file ./credentials-velero \
  --use-volume-snapshots=false \
  --backup-location-config region=us-east-1 \
  --plugins velero/velero-plugin-for-aws:v1.7.0
# Create a backup
velero backup create full-backup \
  --include-namespaces default,production \
  --exclude-resources events,events.events.k8s.io
# Scheduled daily backup at 01:00, retained for 30 days
velero schedule create daily-backup \
  --schedule="0 1 * * *" \
  --include-namespaces production \
  --ttl 720h0m0s
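Restores are driven from existing backups:
# List backups and restore from one
velero backup get
velero restore create --from-backup full-backup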
Operations Management
1. Cluster Monitoring
Key Metrics
# Node metrics
- Node CPU utilization
- Node memory utilization
- Node disk utilization
- Node network I/O
- Pods per node
# Pod metrics
- Pod CPU utilization
- Pod memory utilization
- Pod restart count
- Pod readiness
- Container OOM events
# Cluster metrics
- API Server latency
- etcd cluster health
- Scheduler scheduling latency
- Controller Manager status
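Assuming node-exporter and kube-state-metrics are installed alongside the Prometheus setup above, a few of these metrics map to PromQL queries such as:
# Node CPU utilization (%)
100 * (1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])))
# Pod restart count over the last hour
increase(kube_pod_container_status_restarts_total[1h])
# API Server request latency, 99th percentile
histogram_quantile(0.99, sum by (le, verb) (rate(apiserver_request_duration_seconds_bucket[5m])))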
2. Log Management
# Fluent Bit configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
data:
  fluent-bit.conf: |
    [INPUT]
        Name              tail
        Path              /var/log/containers/*.log
        Parser            docker
        Tag               kube.*
        Mem_Buf_Limit     5MB
        Skip_Long_Lines   On
    [FILTER]
        Name              kubernetes
        Match             kube.*
        Kube_URL          https://kubernetes.default.svc:443
        Kube_CA_File      /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File   /var/run/secrets/kubernetes.io/serviceaccount/token
    [OUTPUT]
        Name              es
        Match             *
        Host              elasticsearch
        Port              9200
        Logstash_Format   On
        Logstash_Prefix   k8s
3. Troubleshooting
Common Commands
# Check Pod status
kubectl get pods -o wide
kubectl describe pod <pod-name>
# View logs
kubectl logs <pod-name>
kubectl logs <pod-name> --previous
kubectl logs -l app=myapp --tail=100 -f
# Exec into a container
kubectl exec -it <pod-name> -- /bin/sh
# View events
kubectl get events --sort-by=.metadata.creationTimestamp
# Inspect nodes
kubectl top nodes
kubectl describe node <node-name>
# Network debugging with an ephemeral toolbox pod
kubectl run debug --rm -i --tty --image=nicolaka/netshoot -- /bin/bash
Summary
A production Kubernetes deployment needs attention to the following key areas:
- High-availability architecture: redundant control plane and well-planned worker nodes
- Security hardening: RBAC, NetworkPolicy, Pod Security (PSP on legacy clusters, Pod Security Admission on 1.25+)
- Resource management: ResourceQuota, LimitRange, HPA, VPA
- Monitoring and alerting: Prometheus, Grafana, Alertmanager
- Log collection: Fluent Bit, Loki, Elasticsearch
- Backup and recovery: etcd snapshots and Velero application backups
- Operations: standardized processes, troubleshooting runbooks, and complete documentation
Following these practices yields a stable, secure, and efficient Kubernetes production environment.