CI/CD Pipeline Best Practices and Implementation
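CI/CD (continuous integration and continuous delivery/deployment) is a core practice of modern software development. This article covers best practices for CI/CD pipelines and shows how to implement them with GitLab CI and GitHub Actions.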
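CI/CD Fundamentals
1. Continuous Integration (CI)
Core practices
- Commit code frequently (several times a day)
- Automated builds
- Automated tests
- Fast feedback
Benefits
- Catch problems early
- Reduce integration risk
- Improve code quality
- Speed up delivery
2. Continuous Delivery/Deployment (CD)
Continuous delivery
- Code is built and tested automatically
- Deployment to production is triggered manually
- The release decision stays with people
Continuous deployment
- Code is built, tested, and deployed automatically
- Releases to production are fully automated
- Requires thorough testing and monitoring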
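The difference between the two CD flavors comes down to whether a person sits between a green build and production. The sketch below models that gate; `ReleaseMode`, `BuildResult`, and `should_release` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class ReleaseMode(Enum):
    CONTINUOUS_DELIVERY = "delivery"      # a person approves each release
    CONTINUOUS_DEPLOYMENT = "deployment"  # every green build is released


@dataclass
class BuildResult:
    tests_passed: bool
    manually_approved: bool = False


def should_release(mode: ReleaseMode, build: BuildResult) -> bool:
    """Decide whether a build may go to production under the given mode."""
    if not build.tests_passed:
        return False  # neither mode releases a failing build
    if mode is ReleaseMode.CONTINUOUS_DEPLOYMENT:
        return True   # no human gate
    return build.manually_approved  # continuous delivery waits for approval
```

Under continuous delivery a passing build is only releasable; under continuous deployment it is released, which is why that mode leans so heavily on tests and monitoring.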
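Pipeline Design
1. Pipeline stages
Code commit
↓
Code checks (lint, format)
↓
Unit tests
↓
Build and package
↓
Integration tests
↓
Security scan
↓
Push artifacts
↓
Deploy to the test environment
↓
Acceptance tests
↓
Deploy to the pre-production environment
↓
Deploy to production
↓
Monitoring and verification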
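The flow above is strictly sequential: a stage starts only if every earlier stage passed. A minimal sketch of that control flow follows; the stage identifiers in `STAGES` and the `run_pipeline` helper are illustrative names, not part of any CI system.

```python
from typing import Callable, Dict, List, Optional

# Stage identifiers mirror the flow above; they are illustrative names only.
STAGES: List[str] = [
    "lint", "unit-test", "build", "integration-test", "security-scan",
    "push-artifact", "deploy-test", "acceptance-test",
    "deploy-preprod", "deploy-production", "verify",
]


def run_pipeline(steps: Dict[str, Callable[[], bool]]) -> Optional[str]:
    """Run each stage in order; return the first failing stage, or None."""
    for name in STAGES:
        if not steps[name]():
            return name  # later stages are never started
    return None
```

Each entry in `steps` is a callable returning `True` on success, so a failing `build` means no deployment stage is ever invoked.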
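2. Design principles
Fast feedback
- Run fast stages first
- Fail fast: stop the pipeline at the first failure
- Run independent jobs in parallel
Safe and reliable
- Automated test coverage
- Integrated security scanning
- Manual approval gates
Traceable
- Version control
- Artifact management
- Audit logs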
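The two fast-feedback ideas, cheapest checks first and parallel execution of independent jobs, can be sketched in a few lines. The helper names and the duration estimates passed to them are assumptions for illustration; a real CI system schedules jobs itself.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple


def order_by_cost(jobs: List[Tuple[str, float]]) -> List[str]:
    """Sort job names by estimated duration in seconds, cheapest first."""
    return [name for name, _ in sorted(jobs, key=lambda job: job[1])]


def run_parallel(jobs: Dict[str, Callable[[], bool]]) -> Dict[str, bool]:
    """Run independent jobs concurrently; map each name to pass/fail."""
    with ThreadPoolExecutor(max_workers=max(1, len(jobs))) as pool:
        futures = {name: pool.submit(job) for name, job in jobs.items()}
        return {name: future.result() for name, future in futures.items()}
```

For example, `order_by_cost([("integration", 300.0), ("lint", 20.0)])` puts `lint` first, so a formatting mistake is reported long before the slow integration suite would have finished.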
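GitLab CI Implementation
1. Basic configuration
# .gitlab-ci.yml
stages:
  - validate
  - build
  - test
  - security
  - deploy
variables:
  DOCKER_REGISTRY: registry.gitlab.com
  IMAGE_NAME: $DOCKER_REGISTRY/$CI_PROJECT_PATH
  DOCKER_DRIVER: overlay2
  DOCKER_TLS_CERTDIR: ""
# Global defaults
default:
  image: docker:24
  services:
    - docker:24-dind
  before_script:
    # Log in only when the job image ships the docker CLI; jobs that override
    # `image` (node, trivy, kubectl) inherit this script but have no docker binary.
    - |
      if command -v docker >/dev/null 2>&1; then
        echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
      fi
# Cache configuration
.npm_cache: &npm_cache
  cache:
    key: ${CI_COMMIT_REF_SLUG}
    paths:
      # Note: `npm ci` deletes node_modules before installing, so this cache
      # mainly benefits plain `npm install`
      - node_modules/
    policy: pull-push
# Job template
.build_template: &build_definition
  stage: build
  script:
    - docker build -t $IMAGE_NAME:$CI_COMMIT_SHA .
    - docker push $IMAGE_NAME:$CI_COMMIT_SHA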
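2. Complete pipeline example
# Continues the basic configuration (the *npm_cache anchor is defined there);
# this stage list adds `package` and supersedes the earlier one.
stages:
  - validate
  - build
  - test
  - security
  - package
  - deploy
# Lint and format checks
lint:
  stage: validate
  image: node:18
  <<: *npm_cache
  script:
    - npm ci
    - npm run lint
    - npm run format:check
  only:
    - merge_requests
    - main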
# Unit tests
unit-test:
  stage: test
  # Use the matrix variable so each parallel job runs on its own Node.js version
  image: node:${NODE_VERSION}
  <<: *npm_cache
  script:
    - npm ci
    - npm run test:unit -- --coverage
  coverage: '/All files[^|]*\|[^|]*\s+([\d\.]+)/'
  artifacts:
    reports:
      coverage_report:
        coverage_format: cobertura
        path: coverage/cobertura-coverage.xml
      junit: junit.xml
    paths:
      - coverage/
    expire_in: 1 week
  parallel:
    matrix:
      - NODE_VERSION: ["16", "18", "20"]
# Build the application image
build:
  stage: build
  image: docker:24
  services:
    - docker:24-dind
  script:
    # --cache-from only helps if the image is present locally; ignore a missing tag
    - docker pull $IMAGE_NAME:latest || true
    - docker build
        --build-arg NODE_ENV=production
        --cache-from $IMAGE_NAME:latest
        -t $IMAGE_NAME:$CI_COMMIT_SHA
        -t $IMAGE_NAME:$CI_COMMIT_REF_SLUG
        -f Dockerfile.prod .
    - docker push $IMAGE_NAME:$CI_COMMIT_SHA
    - docker push $IMAGE_NAME:$CI_COMMIT_REF_SLUG
  only:
    - main
    - tags
# Integration tests
integration-test:
  stage: test
  # Note: docker/compose provides the legacy Compose v1 (`docker-compose`) binary
  image: docker/compose:latest
  services:
    - docker:24-dind
  script:
    - docker-compose -f docker-compose.test.yml up --abort-on-container-exit
  artifacts:
    when: always
    reports:
      junit: test-results/integration.xml
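# Security scan
security-scan:
  stage: security
  image:
    name: aquasec/trivy:latest
    entrypoint: [""]  # the image's default entrypoint is the trivy binary
  script:
    # Image scan, written in GitLab's container scanning report format
    # (GitLab's `reports:` keys do not accept SARIF)
    - trivy image --exit-code 1 --severity HIGH,CRITICAL
        --format template --template "@/contrib/gitlab.tpl"
        -o gl-container-scanning-report.json $IMAGE_NAME:$CI_COMMIT_SHA
    # Source tree scan
    - trivy filesystem --scanners vuln,secret,config
        --severity HIGH,CRITICAL .
  artifacts:
    when: always
    reports:
      container_scanning: gl-container-scanning-report.json
  allow_failure: true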
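# SAST scan
# The semgrep-agent image is deprecated; this uses the semgrep CLI image instead
sast:
  stage: security
  image: semgrep/semgrep:latest
  script:
    # --gitlab-sast writes the report in the format the `sast` report expects
    - semgrep scan
        --config=auto
        --config=p/security-audit
        --config=p/owasp-top-ten
        --gitlab-sast --output=gl-sast-report.json
  artifacts:
    reports:
      sast: gl-sast-report.json
  allow_failure: true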
# Push release artifacts
push-image:
  stage: package
  image: docker:24
  services:
    - docker:24-dind
  script:
    - docker pull $IMAGE_NAME:$CI_COMMIT_SHA
    - docker tag $IMAGE_NAME:$CI_COMMIT_SHA $IMAGE_NAME:latest
    - docker tag $IMAGE_NAME:$CI_COMMIT_SHA $IMAGE_NAME:$CI_COMMIT_TAG
    - docker push $IMAGE_NAME:latest
    - docker push $IMAGE_NAME:$CI_COMMIT_TAG
  only:
    - tags
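# Deploy to the development environment
deploy-dev:
  stage: deploy
  # The script calls both kubectl and helm, while bitnami/kubectl ships kubectl
  # only; substitute an image that provides both (this applies to every deploy job)
  image: bitnami/kubectl:latest
  environment:
    name: development
    url: https://dev.example.com
  script:
    - kubectl config use-context dev
    - helm upgrade --install myapp ./helm-chart
        --namespace dev
        --set image.tag=$CI_COMMIT_SHA
        --set environment=dev
        --wait
        --timeout 5m
  only:
    - main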
# Deploy to the test (staging) environment
deploy-staging:
  stage: deploy
  image: bitnami/kubectl:latest
  environment:
    name: staging
    url: https://staging.example.com
  script:
    - kubectl config use-context staging
    - helm upgrade --install myapp ./helm-chart
        --namespace staging
        --set image.tag=$CI_COMMIT_SHA
        --set environment=staging
        --wait
        --timeout 10m
  only:
    - main
  when: manual
# Deploy to production
deploy-production:
  stage: deploy
  image: bitnami/kubectl:latest
  environment:
    name: production
    url: https://example.com
  script:
    - kubectl config use-context production
    - helm upgrade --install myapp ./helm-chart
        --namespace production
        --set image.tag=$CI_COMMIT_TAG
        --set environment=production
        --wait
        --timeout 15m
  only:
    - tags
  when: manual
  needs:
    - job: push-image
    - job: security-scan
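The `coverage:` pattern in the unit-test job is easy to get subtly wrong, so it helps to try it against sample output before committing it. The sketch below applies the same pattern with Python's `re` module, which is only a stand-in for GitLab's own regex engine; the sample log is an assumed Jest-style text summary, not output from the original project.

```python
import re
from typing import Optional

# Pattern from the `coverage:` keyword, without the surrounding slashes.
COVERAGE_PATTERN = re.compile(r"All files[^|]*\|[^|]*\s+([\d\.]+)")

# Assumed shape of a Jest text coverage summary.
SAMPLE_LOG = """\
----------|---------|----------|---------|---------|
File      | % Stmts | % Branch | % Funcs | % Lines |
----------|---------|----------|---------|---------|
All files |   87.50 |    75.00 |   90.00 |   88.20 |
----------|---------|----------|---------|---------|
"""


def extract_coverage(job_log: str) -> Optional[float]:
    """Return the percentage captured by the pattern, or None if absent."""
    match = COVERAGE_PATTERN.search(job_log)
    return float(match.group(1)) if match else None


print(extract_coverage(SAMPLE_LOG))  # prints 87.5
```

The capture group lands on the first numeric column of the `All files` row, which in this layout is statement coverage.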
3. Advanced features
Dynamic pipelines
workflow:
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
    - if: $CI_COMMIT_BRANCH == "main"
    - if: $CI_COMMIT_TAG
# Conditional job
database-migration:
  image: node:18  # the default docker image has no npm
  script:
    - npm run migrate
  rules:
    - changes:
        - migrations/*
      when: always
    - when: never
# Parent-child pipeline
trigger-child:
  trigger:
    include:
      - local: '/microservices/service-a/.gitlab-ci.yml'
    strategy: depend
Matrix builds
# Multi-environment tests
test:
  parallel:
    matrix:
      - PROVIDER: [aws, gcp, azure]
        STACK: [cfn, terraform, pulumi]
  script:
    - echo "Testing $PROVIDER with $STACK"
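A matrix entry with several keys expands to the Cartesian product of their values, so the configuration above yields nine jobs. The following sketch reproduces that expansion; `expand_matrix` is a hypothetical helper, not a GitLab API.

```python
from itertools import product
from typing import Dict, List


def expand_matrix(matrix: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Return one variable mapping per combination of matrix values."""
    keys = list(matrix)
    combos = product(*(matrix[key] for key in keys))
    return [dict(zip(keys, combo)) for combo in combos]


jobs = expand_matrix({
    "PROVIDER": ["aws", "gcp", "azure"],
    "STACK": ["cfn", "terraform", "pulumi"],
})
for variables in jobs:
    print(f"Testing {variables['PROVIDER']} with {variables['STACK']}")
```

Because the job count is the product of the list lengths, adding a third key with three values would already produce 27 jobs.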
GitHub Actions Implementation
1. Basic workflow
# .github/workflows/ci.yml
name: CI/CD Pipeline
on:
  push:
    branches: [ main, develop ]
    tags: [ 'v*' ]
  pull_request:
    branches: [ main ]
env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}
jobs:
  # Lint and format checks
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '18'
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: Run ESLint
        run: npm run lint
      - name: Check formatting
        run: npm run format:check
  # Unit tests
  unit-test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        node-version: [16, 18, 20]
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node.js ${{ matrix.node-version }}
        uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: Run tests
        run: npm run test:unit -- --coverage
      - name: Upload coverage
        uses: codecov/codecov-action@v3
        with:
          files: ./coverage/lcov.info
  # Build image
  build:
    runs-on: ubuntu-latest
    needs: [lint, unit-test]
    permissions:
      contents: read
      packages: write
    outputs:
      image-tag: ${{ steps.meta.outputs.tags }}
    steps:
      - uses: actions/checkout@v4
      # QEMU is needed to build the linux/arm64 variant on an amd64 runner
      - name: Set up QEMU
        uses: docker/setup-qemu-action@v3
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Login to Container Registry
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          # {{branch}} is empty for tag and PR events, so the branch-prefixed SHA
          # tag is enabled for branch pushes only. The raw full-SHA tag is the one
          # later jobs pull via github.sha.
          tags: |
            type=ref,event=branch
            type=ref,event=pr
            type=semver,pattern={{version}}
            type=semver,pattern={{major}}.{{minor}}
            type=sha,prefix={{branch}}-,enable=${{ startsWith(github.ref, 'refs/heads/') }}
            type=raw,value=${{ github.sha }}
      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
          platforms: linux/amd64,linux/arm64
  # Security scan
  security-scan:
    runs-on: ubuntu-latest
    needs: build
    permissions:
      contents: read
      security-events: write
    steps:
      - name: Run Trivy vulnerability scanner
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
          format: 'sarif'
          output: 'trivy-results.sarif'
      - name: Upload Trivy scan results
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: 'trivy-results.sarif'
  # Integration tests
  integration-test:
    runs-on: ubuntu-latest
    needs: build
    services:
      postgres:
        image: postgres:15
        env:
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: test  # create the database that DATABASE_URL points at
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 5432:5432
      redis:
        image: redis:7
        ports:
          - 6379:6379
    steps:
      - uses: actions/checkout@v4
      - name: Run integration tests
        run: |
          docker run --network host \
            -e DATABASE_URL=postgresql://postgres:postgres@localhost:5432/test \
            -e REDIS_URL=redis://localhost:6379 \
            ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }} \
            npm run test:integration
  # Deploy to the development environment
  deploy-dev:
    runs-on: ubuntu-latest
    needs: [build, integration-test]
    if: github.ref == 'refs/heads/main'
    environment:
      name: development
      url: https://dev.example.com
    steps:
      - uses: actions/checkout@v4
      - name: Setup kubectl
        uses: azure/setup-kubectl@v3
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1
      - name: Update kubeconfig
        run: aws eks update-kubeconfig --name dev-cluster
      - name: Deploy to EKS
        run: |
          helm upgrade --install myapp ./helm-chart \
            --namespace dev \
            --set image.tag=${{ github.sha }} \
            --set environment=dev \
            --wait
  # Deploy to production
  deploy-production:
    runs-on: ubuntu-latest
    needs: [security-scan, integration-test]
    if: startsWith(github.ref, 'refs/tags/v')
    environment:
      name: production
      url: https://example.com
    steps:
      - uses: actions/checkout@v4
      - name: Deploy to production
        run: |
          echo "Deploying ${{ github.ref_name }} to production"
          # Production deployment steps go here
2. Reusable workflows
# .github/workflows/reusable-build.yml
name: Reusable Build
on:
  workflow_call:
    inputs:
      node-version:
        required: true
        type: string
      environment:
        required: true
        type: string
    secrets:
      registry-token:
        required: true
    outputs:
      image-tag:
        description: "Built image tag"
        value: ${{ jobs.build.outputs.tag }}
jobs:
  build:
    runs-on: ubuntu-latest
    outputs:
      tag: ${{ steps.meta.outputs.tags }}
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: ${{ inputs.node-version }}
      - name: Build
        run: |
          npm ci
          npm run build
      # The job output reads steps.meta.outputs.tags, so a step with id `meta`
      # must set it; the tag scheme used here is illustrative only
      - name: Compute image tag
        id: meta
        run: echo "tags=${{ inputs.environment }}-${{ github.sha }}" >> "$GITHUB_OUTPUT"
      - name: Push to registry
        run: |
          echo "Pushing to ${{ inputs.environment }}"
Calling the reusable workflow
# .github/workflows/main.yml
name: Main CI
on:
  push:
    branches: [main]
jobs:
  build-dev:
    uses: ./.github/workflows/reusable-build.yml
    with:
      node-version: '18'
      environment: 'development'
    secrets:
      registry-token: ${{ secrets.REGISTRY_TOKEN }}
Artifact Management
1. Docker image management
Image tagging strategy
Branch builds:
- main: latest, main-{sha}
- feature/*: feature-{branch}-{sha}
- PR: pr-{number}-{sha}
Tag builds:
- v1.2.3: 1.2.3, 1.2, 1
- v1.2.3-rc1: 1.2.3-rc1
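The tagging strategy above can be written down as a few small functions, which makes the rules easy to unit test. This is a sketch under two assumptions that the strategy does not spell out: `{sha}` is shortened to seven characters, and slashes in branch names are replaced with hyphens. All function names are hypothetical.

```python
import re
from typing import List

# v<major>.<minor>.<patch> with an optional pre-release suffix such as -rc1
SEMVER = re.compile(r"^v(\d+)\.(\d+)\.(\d+)(-[0-9A-Za-z.-]+)?$")


def tags_for_branch(branch: str, sha: str) -> List[str]:
    """Image tags for a branch build."""
    short = sha[:7]  # assumption: short SHA
    if branch == "main":
        return ["latest", f"main-{short}"]
    if branch.startswith("feature/"):
        name = branch[len("feature/"):].replace("/", "-")
        return [f"feature-{name}-{short}"]
    raise ValueError(f"no tagging rule for branch {branch!r}")


def tags_for_pull_request(number: int, sha: str) -> List[str]:
    """Image tags for a pull request build."""
    return [f"pr-{number}-{sha[:7]}"]


def tags_for_release(git_tag: str) -> List[str]:
    """Image tags for a Git tag build; pre-releases get only the full version."""
    match = SEMVER.match(git_tag)
    if match is None:
        raise ValueError(f"not a version tag: {git_tag!r}")
    major, minor, patch, prerelease = match.groups()
    if prerelease:
        return [f"{major}.{minor}.{patch}{prerelease}"]
    return [f"{major}.{minor}.{patch}", f"{major}.{minor}", major]
```

Keeping pre-releases off the floating `1.2` and `1` tags means consumers pinned to a minor or major version never pick up a release candidate by accident.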
Image cleanup
# .github/workflows/cleanup.yml
name: Cleanup Old Images
on:
  schedule:
    - cron: '0 0 * * 0' # every Sunday at 00:00 UTC
jobs:
  cleanup:
    runs-on: ubuntu-latest
    steps:
      - name: Delete old container images
        uses: snok/container-retention-policy@v2
        with:
          image-names: myapp
          cut-off: 30 days ago UTC
          account-type: org
          org-name: myorg
          keep-at-least: 10
          token: ${{ secrets.GITHUB_TOKEN }}
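The retention rule configured above combines an age cut-off with a floor on how many versions survive. The sketch below shows one plausible reading of that rule; it is not the action's actual algorithm, and `select_for_deletion` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Tuple

ImageVersion = Tuple[str, datetime]  # (tag, creation time in UTC)


def select_for_deletion(
    versions: List[ImageVersion],
    now: datetime,
    max_age_days: int = 30,
    keep_at_least: int = 10,
) -> List[str]:
    """Return tags older than the cut-off, always sparing the newest N."""
    cutoff = now - timedelta(days=max_age_days)
    newest_first = sorted(versions, key=lambda v: v[1], reverse=True)
    candidates = newest_first[keep_at_least:]  # the newest N are never deleted
    return [tag for tag, created in candidates if created < cutoff]


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
versions = [(f"v{i}", now - timedelta(days=10 * i)) for i in range(6)]
print(select_for_deletion(versions, now, max_age_days=30, keep_at_least=2))
```

With six versions aged 0 to 50 days, only those strictly older than 30 days and outside the two newest are selected, so a quiet repository never loses its last few images.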
2. Helm chart management
# Chart.yaml
apiVersion: v2
name: myapp
description: A Helm chart for myapp
type: application
version: 1.0.0
appVersion: "1.0.0"
dependencies:
  - name: postgresql
    version: 12.x.x
    repository: https://charts.bitnami.com/bitnami
    condition: postgresql.enabled
Publishing the chart
# .github/workflows/release-chart.yml
name: Release Chart
on:
  push:
    branches: [main]
    paths:
      - 'helm-chart/**'
jobs:
  release:
    runs-on: ubuntu-latest
    permissions:
      contents: write  # chart-releaser creates releases and pushes to gh-pages
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Configure Git
        run: |
          git config user.name "$GITHUB_ACTOR"
          git config user.email "$GITHUB_ACTOR@users.noreply.github.com"
      - name: Install Helm
        uses: azure/setup-helm@v3
      - name: Run chart-releaser
        uses: helm/chart-releaser-action@v1.6.0
        with:
          charts_dir: helm-chart
        env:
          CR_TOKEN: "${{ secrets.GITHUB_TOKEN }}"
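Security Best Practices
1. Secret management
GitLab CI
# Use CI/CD variables
script:
  - echo "$PRODUCTION_KEY" > key.pem
  - chmod 600 key.pem
# Use HashiCorp Vault
# CI_JOB_JWT was removed in GitLab 17.0; ID tokens replace it
vault-secrets:
  image: hashicorp/vault:latest
  id_tokens:
    VAULT_ID_TOKEN:
      aud: https://vault.example.com  # assumed audience; use your Vault address
  script:
    - export VAULT_TOKEN=$(vault write -field=token auth/jwt/login role=ci jwt=$VAULT_ID_TOKEN)
    - vault kv get -field=password secret/data/db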
GitHub Actions
# Use secrets
- name: Deploy
  env:
    API_KEY: ${{ secrets.API_KEY }}
    DB_PASSWORD: ${{ secrets.DB_PASSWORD }}
  run: |
    echo "Deploying with API key"
# Use OIDC (the job must grant `permissions: id-token: write`)
- name: Configure AWS Credentials
  uses: aws-actions/configure-aws-credentials@v4
  with:
    role-to-assume: arn:aws:iam::123456789:role/my-role
    aws-region: us-east-1
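2. Security scanning
SAST scan
# The semgrep-agent image is deprecated; this uses the semgrep CLI image instead
sast:
  stage: security
  image: semgrep/semgrep:latest
  script:
    # --gitlab-sast writes the report file that the artifact below uploads
    - semgrep scan
        --config=auto
        --config=p/security-audit
        --config=p/owasp-top-ten
        --config=p/cwe-top-25
        --gitlab-sast --output=gl-sast-report.json
  artifacts:
    reports:
      sast: gl-sast-report.json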
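Dependency scanning
dependency-scanning:
  stage: security
  image: node:18
  script:
    - npm audit --audit-level=moderate --json > dependency-scan-report.json
  artifacts:
    # npm audit exits non-zero when it finds issues, so upload on failure too
    when: always
    # Raw npm audit output is not in GitLab's dependency scanning report
    # schema, so it is kept as a plain artifact rather than under `reports:`
    paths:
      - dependency-scan-report.json
  allow_failure: true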
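Once the audit report is stored as an artifact, a later step can decide whether the findings should block a release. The sketch below assumes the npm 7+ report layout, in which `metadata.vulnerabilities` holds per-severity counts; `count_at_or_above` and the sample report are illustrative, not real project output.

```python
import json
from typing import Any, Dict

SEVERITIES = ["info", "low", "moderate", "high", "critical"]


def count_at_or_above(report: Dict[str, Any], threshold: str) -> int:
    """Count reported vulnerabilities at `threshold` severity or higher."""
    counts = report.get("metadata", {}).get("vulnerabilities", {})
    levels = SEVERITIES[SEVERITIES.index(threshold):]
    return sum(int(counts.get(level, 0)) for level in levels)


# Hypothetical excerpt of a report; real output contains many more fields.
SAMPLE_REPORT = json.loads("""
{
  "metadata": {
    "vulnerabilities": {
      "info": 0, "low": 2, "moderate": 1, "high": 1, "critical": 0, "total": 4
    }
  }
}
""")

blocking = count_at_or_above(SAMPLE_REPORT, "moderate")
print(f"{blocking} finding(s) at moderate severity or above")
```

A gate built this way can start permissive, for example blocking only on `critical`, and be tightened as the backlog of findings shrinks.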
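Monitoring and Metrics
1. DORA metrics
# Record deployment events (the basis for deployment frequency)
# Note: CI_JOB_DURATION is not among GitLab's predefined variables and has to be
# supplied by the pipeline itself; CI_ENVIRONMENT_NAME is set only for jobs that
# declare an `environment`.
deploy-metrics:
  stage: .post
  script:
    - |
      curl -X POST https://metrics.example.com/dora \
        -H "Content-Type: application/json" \
        -d '{
          "event": "deployment",
          "timestamp": "'$(date -Iseconds)'",
          "environment": "'"$CI_ENVIRONMENT_NAME"'",
          "commit_sha": "'"$CI_COMMIT_SHA"'",
          "duration": "'"$CI_JOB_DURATION"'"
        }'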
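Deployment events like the one posted above are the raw material for the DORA deployment frequency metric. The sketch below derives it from a list of such events, reusing the `event`, `timestamp`, and `environment` fields of the payload; the aggregation itself and the function name are assumptions, since the metrics backend is not described.

```python
from datetime import datetime
from typing import Dict, List


def deployments_per_day(events: List[Dict[str, str]], environment: str) -> float:
    """Average deployments per calendar day to `environment` over the observed window."""
    times = sorted(
        datetime.fromisoformat(event["timestamp"])
        for event in events
        if event["event"] == "deployment" and event["environment"] == environment
    )
    if not times:
        return 0.0
    # Count both the first and the last day of the window.
    days = (times[-1].date() - times[0].date()).days + 1
    return len(times) / days


events = [
    {"event": "deployment", "timestamp": "2024-01-01T09:00:00+00:00", "environment": "production"},
    {"event": "deployment", "timestamp": "2024-01-01T15:30:00+00:00", "environment": "production"},
    {"event": "deployment", "timestamp": "2024-01-02T11:00:00+00:00", "environment": "production"},
    {"event": "deployment", "timestamp": "2024-01-02T12:00:00+00:00", "environment": "staging"},
]
print(deployments_per_day(events, "production"))  # prints 1.5
```

Filtering by environment matters here: counting staging deployments would inflate the figure without saying anything about how often users receive changes.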
2. Pipeline performance monitoring
# Push metrics to the Prometheus Pushgateway
# Note: CI_PIPELINE_DURATION and CI_JOB_DURATION are not among GitLab's
# predefined variables; they have to be computed and supplied by the pipeline.
pipeline-metrics:
  stage: .post
  script:
    - |
      cat <<EOF | curl --data-binary @- http://pushgateway:9091/metrics/job/gitlab-ci
      # HELP gitlab_ci_pipeline_duration_seconds Pipeline duration
      # TYPE gitlab_ci_pipeline_duration_seconds gauge
      gitlab_ci_pipeline_duration_seconds{pipeline_id="$CI_PIPELINE_ID"} $CI_PIPELINE_DURATION
      # HELP gitlab_ci_job_duration_seconds Job duration
      # TYPE gitlab_ci_job_duration_seconds gauge
      gitlab_ci_job_duration_seconds{job_name="$CI_JOB_NAME"} $CI_JOB_DURATION
      EOF
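The heredoc above hand-writes the Prometheus text exposition format, which breaks easily when a label value contains a quote or backslash. A small rendering helper, sketched below with hypothetical names, produces the same three-line shape with label values escaped.

```python
from typing import Dict, Union


def _escape_label(value: str) -> str:
    """Escape backslashes, double quotes, and newlines in a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_gauge(
    name: str, help_text: str, labels: Dict[str, str], value: Union[int, float]
) -> str:
    """Render one gauge sample with its HELP and TYPE lines."""
    label_text = ",".join(
        f'{key}="{_escape_label(str(val))}"' for key, val in sorted(labels.items())
    )
    return (
        f"# HELP {name} {help_text}\n"
        f"# TYPE {name} gauge\n"
        f"{name}{{{label_text}}} {value}\n"
    )


print(render_gauge(
    "gitlab_ci_job_duration_seconds", "Job duration", {"job_name": "build"}, 42
), end="")
```

The returned string can be sent as the request body in place of the heredoc; the trailing newline is kept because the Pushgateway expects the last line to be terminated.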
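Summary
CI/CD pipeline best practices include:
- Fast feedback: run fast stages first and stop at the first failure
- Comprehensive testing: unit tests, integration tests, and security scans
- Artifact management: version control, image management, and dependency management
- Security hardening: secret management, security scanning, and access control
- Observability: logs, metrics, and traces
- Continuous improvement: measurement, analysis, and process refinement
A well-designed CI/CD pipeline makes software delivery fast, secure, and reliable.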