Secrets Vault Management

v20260326

secrets-vault-manager

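Covers deployment, authentication, rotation, and auditing for Vault and cloud secret stores, helping operations and security teams build reliable secret management and incident response processes.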

HashiCorp Vault, Key Management, Cloud Security, DevOps, Rotation, Audit, Kubernetes, Automation

Secrets Vault Manager

Tier: POWERFUL
Category: Engineering
Domain: Security / Infrastructure / DevOps

Overview

Production secret infrastructure management for teams running HashiCorp Vault, cloud-native secret stores, or hybrid architectures. This skill covers policy authoring, auth method configuration, automated rotation, dynamic secrets, audit logging, and incident response.

This skill is distinct from env-secrets-manager, which handles local .env file hygiene and leak detection. It operates at the infrastructure layer: Vault clusters, cloud KMS, certificate authorities, and CI/CD secret injection.

When to Use

Standing up a new Vault cluster or migrating to a managed secret store
Designing auth methods for services, CI runners, and human operators
Implementing automated credential rotation (database, API keys, certificates)
Auditing secret access patterns for compliance (SOC 2, ISO 27001, HIPAA)
Responding to a secret leak that requires mass revocation
Integrating secrets into Kubernetes workloads or CI/CD pipelines

HashiCorp Vault Patterns

Architecture Decisions

Decision	Recommendation	Rationale
Deployment mode	HA with Raft storage	No external dependency, built-in leader election
Auto-unseal	Cloud KMS (AWS KMS / Azure Key Vault / GCP KMS)	Eliminates manual unseal, enables automated restarts
Namespaces	One per environment (dev/staging/prod)	Blast-radius isolation, independent policies
Audit devices	File + syslog (dual)	Vault stops serving requests when no enabled audit device can record them; a second device avoids that outage

Auth Methods

AppRole — Machine-to-machine authentication for services and batch jobs.

# Enable the AppRole auth method
vault auth enable approle

# Policy (HCL) for operators who manage AppRole roles
path "auth/approle/*" {
  capabilities = ["create", "read", "update", "delete", "list"]
}

# Application-specific role
vault write auth/approle/role/payment-service \
  token_ttl=1h \
  token_max_ttl=4h \
  secret_id_num_uses=1 \
  secret_id_ttl=10m \
  token_policies="payment-service-read"
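To illustrate how a service might consume this role, the sketch below builds (but does not send) the AppRole login request. It relies on Vault's HTTP login route for AppRole, `POST /v1/auth/<mount>/login` with `role_id` and `secret_id`. The function names, the Vault address, and the sample IDs are assumptions made for the example.

```python
import json
import urllib.request


def build_approle_login(vault_addr, role_id, secret_id, mount="approle"):
    """Build the AppRole login request without sending it.

    Assumes the auth method is mounted at auth/<mount>/ (default: approle).
    """
    url = f"{vault_addr.rstrip('/')}/v1/auth/{mount}/login"
    body = json.dumps({"role_id": role_id, "secret_id": secret_id}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_client_token(login_response):
    """Return (client_token, lease_duration) from a parsed login response."""
    auth = login_response["auth"]
    return auth["client_token"], auth["lease_duration"]


# Hypothetical values, for illustration only.
req = build_approle_login("https://vault.example.com", "role-id-123", "secret-id-456")
```

Because the role above sets secret_id_num_uses=1, each SecretID is good for a single login, so the deployment tooling has to hand the service a fresh SecretID every time it starts.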

Kubernetes — Pod-native authentication via service account tokens.

vault write auth/kubernetes/role/api-server \
  bound_service_account_names=api-server \
  bound_service_account_namespaces=production \
  token_policies=api-server-secrets \
  token_ttl=1h

OIDC — Human operator access via SSO provider (Okta, Azure AD, Google Workspace).

vault write auth/oidc/role/engineering \
  bound_audiences="vault" \
  allowed_redirect_uris="https://vault.example.com/ui/vault/auth/oidc/oidc/callback" \
  user_claim="email" \
  oidc_scopes="openid,profile,email" \
  token_policies="engineering-read" \
  token_ttl=8h

Secret Engines

Engine	Use Case	TTL Strategy
KV v2	Static secrets (API keys, config)	Versioned, manual rotation
Database	Dynamic DB credentials	1h default, 24h max
PKI	TLS certificates	90d leaf certs, 5y intermediate CA
Transit	Encryption-as-a-service	Key rotation every 90d
SSH	Signed SSH certificates	30m for interactive, 8h for automation

Policy Design

Follow least-privilege with path-based granularity:

# payment-service-read policy
path "secret/data/production/payment/*" {
  capabilities = ["read"]
}

path "database/creds/payment-readonly" {
  capabilities = ["read"]
}

# Deny access to admin paths explicitly
path "sys/*" {
  capabilities = ["deny"]
}

Policy naming convention: {service}-{access-level} (e.g., payment-service-read, api-gateway-admin).
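The naming convention and the path-based stanzas above can be checked and generated mechanically. The following is a minimal sketch, not the bundled vault_config_generator.py: the function names are hypothetical, and the set of access levels (read, write, admin) is an assumption inferred from the two example names.

```python
import re

# Assumed access levels; the source only shows "read" and "admin" as examples.
NAME_PATTERN = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*-(read|write|admin)$")

# Capabilities accepted in a Vault policy stanza.
VALID_CAPABILITIES = {"create", "read", "update", "patch", "delete", "list", "sudo", "deny"}


def validate_policy_name(name):
    """True when the name follows {service}-{access-level}."""
    return NAME_PATTERN.match(name) is not None


def render_policy(paths):
    """Render HCL stanzas from a {path: [capabilities]} mapping."""
    stanzas = []
    for path, caps in paths.items():
        unknown = set(caps) - VALID_CAPABILITIES
        if unknown:
            raise ValueError(f"unknown capabilities for {path}: {sorted(unknown)}")
        cap_list = ", ".join(f'"{c}"' for c in caps)
        stanzas.append(f'path "{path}" {{\n  capabilities = [{cap_list}]\n}}')
    return "\n\n".join(stanzas)


policy = render_policy({
    "secret/data/production/payment/*": ["read"],
    "database/creds/payment-readonly": ["read"],
})
```

Generating stanzas from an explicit mapping keeps every granted path visible in review, which is the point of path-based least privilege.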

Cloud Secret Store Integration

Comparison Matrix

Feature	AWS Secrets Manager	Azure Key Vault	GCP Secret Manager
Rotation	Built-in Lambda	Custom logic via Functions	Cloud Functions
Versioning	Automatic	Manual or automatic	Automatic
Encryption	AWS KMS (default or CMK)	HSM-backed	Google-managed or CMEK
Access control	IAM policies + resource policy	RBAC + Access Policies	IAM bindings
Cross-region	Replication supported	Geo-redundant by default	Replication supported
Audit	CloudTrail	Azure Monitor + Diagnostic Logs	Cloud Audit Logs
Pricing model	Per-secret + per-API call	Per-operation + per-key	Per-secret version + per-access

When to Use Which

AWS Secrets Manager: RDS/Aurora credential rotation out of the box. Best when fully on AWS.
Azure Key Vault: Certificate management strength. Required for Azure AD integrated workloads.
GCP Secret Manager: Simplest API surface. Best for GKE-native workloads with Workload Identity.
HashiCorp Vault: Multi-cloud, dynamic secrets, PKI, transit encryption. Best for complex or hybrid environments.

SDK Access Patterns

Principle: Always fetch secrets at startup or via sidecar — never bake into images or config files.

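# AWS Secrets Manager pattern
import json

import boto3

def get_secret(secret_name, region="us-east-1"):
    """Fetch a JSON secret and return it as a dict."""
    client = boto3.client("secretsmanager", region_name=region)
    response = client.get_secret_value(SecretId=secret_name)
    # Assumes a JSON string secret; binary secrets arrive in "SecretBinary" instead.
    return json.loads(response["SecretString"])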

# GCP Secret Manager pattern
from google.cloud import secretmanager

def get_secret(project_id, secret_id, version="latest"):
    client = secretmanager.SecretManagerServiceClient()
    name = f"projects/{project_id}/secrets/{secret_id}/versions/{version}"
    response = client.access_secret_version(request={"name": name})
    return response.payload.data.decode("UTF-8")

# Azure Key Vault pattern
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

def get_secret(vault_url, secret_name):
    credential = DefaultAzureCredential()
    client = SecretClient(vault_url=vault_url, credential=credential)
    return client.get_secret(secret_name).value
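Fetching at startup raises the question of how a long-running process picks up rotated values. One possible approach, sketched here under assumptions, is a small time-based cache around any of the fetch functions above. The class name `CachedSecret`, the 300-second default, and the stand-in fetcher are hypothetical and not part of any SDK.

```python
import time


class CachedSecret:
    """Cache a fetched secret and refetch once it is older than ttl_seconds."""

    def __init__(self, fetch, ttl_seconds=300, clock=time.monotonic):
        self._fetch = fetch          # zero-argument callable returning the secret
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so tests can control time
        self._value = None
        self._fetched_at = None

    def get(self):
        now = self._clock()
        if self._fetched_at is None or now - self._fetched_at >= self._ttl:
            self._value = self._fetch()
            self._fetched_at = now
        return self._value


# Stand-in fetcher for illustration; in practice wrap a real get_secret call.
def fake_fetch():
    return {"username": "app", "password": "example-only"}


db_secret = CachedSecret(fake_fetch, ttl_seconds=300)
creds = db_secret.get()
```

Keeping the refresh interval shorter than the rotation overlap window means a process never holds a credential after it has been revoked.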

Secret Rotation Workflows

Rotation Strategy by Secret Type

Secret Type	Rotation Frequency	Method	Downtime Risk
Database passwords	30 days	Dual-account swap	Zero (A/B rotation)
API keys	90 days	Generate new, deprecate old	Zero (overlap window)
TLS certificates	60 days before expiry	ACME or Vault PKI	Zero (graceful reload)
SSH keys	90 days	Vault-signed certificates	Zero (CA-based)
Service tokens	24 hours	Dynamic generation	Zero (short-lived)
Encryption keys	90 days	Key versioning (rewrap)	Zero (version coexistence)

Database Credential Rotation (Dual-Account)

Two database accounts exist: app_user_a and app_user_b
Application currently uses app_user_a
The rotation job changes the app_user_b password and updates the secret store
Application switches to app_user_b on next credential fetch
After grace period, app_user_a password is rotated
Cycle repeats
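The cycle above can be modelled as two state transitions. This is a simplified sketch that only tracks state; a real rotation job would also issue the password change against the database and write to the secret store. The function names and the state layout are assumptions.

```python
import secrets


def _new_password():
    return secrets.token_urlsafe(32)


def begin_rotation(state):
    """Rotate the standby account's password and publish it as the active account."""
    standby = "app_user_b" if state["active"] == "app_user_a" else "app_user_a"
    passwords = dict(state["passwords"])
    passwords[standby] = _new_password()
    return {"active": standby, "previous": state["active"], "passwords": passwords}


def finish_rotation(state):
    """After the grace period, rotate the previously active account's password."""
    if state.get("previous") is None:
        return state
    passwords = dict(state["passwords"])
    passwords[state["previous"]] = _new_password()
    return {"active": state["active"], "previous": None, "passwords": passwords}


state = {
    "active": "app_user_a",
    "previous": None,
    "passwords": {"app_user_a": "old-a", "app_user_b": "old-b"},
}
during_grace = begin_rotation(state)    # app_user_a still works with its old password
after_grace = finish_rotation(during_grace)
```

The old account's password is untouched until `finish_rotation`, which is what lets instances that have not yet refetched keep connecting during the grace period.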

API Key Rotation (Overlap Window)

Generate new API key with provider
Store new key in secret store as current, move old to previous
Deploy applications — they read current
Confirm through monitoring that the old key has zero usage
After all instances have restarted (or the TTL has expired), revoke previous
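A minimal sketch of the overlap window, assuming the secret store holds two slots named `current` and `previous` as described above. The function names and the usage counter are hypothetical; the counter stands in for whatever monitoring reports about the old key.

```python
def promote_key(store, new_key):
    """Store new_key as current and demote the old current key to previous."""
    return {"current": new_key, "previous": store.get("current")}


def is_accepted(store, presented_key):
    """During the overlap window both current and previous keys are valid."""
    if presented_key is None:
        return False
    return presented_key in (store["current"], store["previous"])


def revoke_previous(store, old_key_recent_uses):
    """Drop the previous key only when monitoring shows it is no longer used."""
    if old_key_recent_uses > 0:
        raise RuntimeError("previous key still in use; extend the overlap window")
    return {"current": store["current"], "previous": None}


store = promote_key({"current": "key-v1", "previous": None}, "key-v2")
```

Making revocation fail while usage is non-zero turns the monitoring check into a precondition rather than an afterthought.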

Dynamic Secrets

Dynamic secrets are generated on-demand with automatic expiration. Prefer dynamic secrets over static credentials wherever possible.

Database Dynamic Credentials (Vault)

# Configure database engine
vault write database/config/postgres \
  plugin_name=postgresql-database-plugin \
  connection_url="postgresql://{{username}}:{{password}}@db.example.com:5432/app" \
  allowed_roles="app-readonly,app-readwrite" \
  username="vault_admin" \
  password="<admin-password>"

# Create role with TTL
vault write database/roles/app-readonly \
  db_name=postgres \
  creation_statements="CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}'; GRANT SELECT ON ALL TABLES IN SCHEMA public TO \"{{name}}\";" \
  default_ttl=1h \
  max_ttl=24h
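The interaction between default_ttl and max_ttl is easier to see with numbers. The sketch below is a simplified model, not Vault's implementation: it assumes each renewal extends the lease by default_ttl from the moment of renewal, capped at max_ttl since issue. The function name and the 40-minute renewal interval are assumptions.

```python
def renewal_offsets(default_ttl_s=3600, max_ttl_s=86400, renew_every_s=2400):
    """Model lease renewals for one dynamic credential.

    Returns (offsets, final_expiry): the times, in seconds since issue, at
    which the client renews, and the moment the credential finally expires.
    """
    if not 0 < renew_every_s < default_ttl_s:
        raise ValueError("renew_every_s must be positive and shorter than default_ttl_s")
    offsets = []
    expiry = default_ttl_s  # the first lease lasts default_ttl
    t = renew_every_s
    while t < expiry and expiry < max_ttl_s:
        offsets.append(t)
        # Each renewal extends the lease from "now", capped at max_ttl since issue.
        expiry = min(t + default_ttl_s, max_ttl_s)
        t += renew_every_s
    return offsets, expiry


offsets, final_expiry = renewal_offsets()
```

Under this model, once `final_expiry` is reached no further renewal helps, and the application has to read database/creds/app-readonly again to obtain a new username and password.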

Cloud IAM Dynamic Credentials

Vault can generate short-lived AWS IAM credentials, Azure service principal passwords, or GCP service account keys, so applications no longer need to hold long-lived cloud credentials.

SSH Certificate Authority

Replace SSH key distribution with a Vault-signed certificate model:

Vault acts as SSH CA
Users/machines request signed certificates with short TTL (30 min)
SSH servers trust the CA public key — no authorized_keys management
Certificates expire automatically — no revocation needed for normal operations

Audit Logging

What to Log

Event	Priority	Retention
Secret read access	HIGH	1 year minimum
Secret creation/update	HIGH	1 year minimum
Auth method login	MEDIUM	90 days
Policy changes	CRITICAL	2 years (compliance)
Failed access attempts	CRITICAL	1 year
Token creation/revocation	MEDIUM	90 days
Seal/unseal operations	CRITICAL	Indefinite

Anomaly Detection Signals

Secret accessed from new IP/CIDR range
Access volume spike (>3x baseline for a path)
Off-hours access for human auth methods
Service accessing secrets outside its policy scope (denied requests)
Multiple failed auth attempts from single source
Token created with unusually long TTL
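Three of these signals can be sketched with plain counting. The event shape used here (flat dicts with `path`, `remote_address`, and `error`) is a simplification chosen for the example; real Vault audit entries are JSON objects that nest the path and remote address under `request`. The function names and the failure threshold of 5 are assumptions, while the 3x factor comes from the list above.

```python
from collections import Counter


def volume_spikes(baseline_counts, current_counts, factor=3):
    """Paths whose current access count exceeds factor x their baseline."""
    return sorted(
        path
        for path, count in current_counts.items()
        if baseline_counts.get(path, 0) > 0 and count > factor * baseline_counts[path]
    )


def failed_auth_sources(events, threshold=5):
    """Source addresses with at least `threshold` failed auth attempts."""
    failures = Counter(
        e["remote_address"]
        for e in events
        if e["path"].startswith("auth/") and e.get("error")
    )
    return sorted(addr for addr, n in failures.items() if n >= threshold)


def new_sources(events, known_addresses):
    """Addresses seen in events that are not in the known set."""
    return sorted({e["remote_address"] for e in events} - set(known_addresses))


# Hypothetical sample data.
events = [
    {"path": "auth/approle/login", "remote_address": "10.0.0.9", "error": "permission denied"}
] * 5 + [{"path": "secret/data/a", "remote_address": "10.0.0.1", "error": ""}]
suspects = failed_auth_sources(events)
```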

Compliance Reporting

Generate periodic reports covering:

Access inventory — Which identities accessed which secrets, when
Rotation compliance — Secrets overdue for rotation
Policy drift — Policies modified since last review
Orphaned secrets — Secrets with no recent access (>90 days)

Use audit_log_analyzer.py to parse Vault or cloud audit logs for these signals.
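As a rough illustration of the rotation-compliance and orphaned-secret checks (not the actual audit_log_analyzer.py, whose interface is not shown here), the sketch below works on a hypothetical inventory. The field names `last_rotated`, `max_age_days`, and `last_accessed` are assumptions.

```python
from datetime import date, timedelta


def rotation_overdue(inventory, today):
    """Names of secrets whose last rotation is older than their allowed age."""
    return sorted(
        s["name"]
        for s in inventory
        if today - s["last_rotated"] > timedelta(days=s["max_age_days"])
    )


def orphaned(inventory, today, idle_days=90):
    """Names of secrets never accessed, or not accessed for more than idle_days."""
    return sorted(
        s["name"]
        for s in inventory
        if s["last_accessed"] is None
        or today - s["last_accessed"] > timedelta(days=idle_days)
    )


# Hypothetical inventory, for illustration only.
today = date(2026, 3, 26)
inventory = [
    {"name": "db/payment", "last_rotated": date(2026, 3, 1),
     "max_age_days": 30, "last_accessed": date(2026, 3, 25)},
    {"name": "api/partner", "last_rotated": date(2025, 11, 1),
     "max_age_days": 90, "last_accessed": date(2025, 12, 1)},
]
overdue = rotation_overdue(inventory, today)
idle = orphaned(inventory, today)
```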

Emergency Procedures

Secret Leak Response (Immediate)

Time target: Contain within 15 minutes of detection.

Identify scope — Which secret(s) leaked, where (repo, log, error message, third party)
Revoke immediately — Rotate the compromised credential at the source (provider API, Vault, cloud SM)
Invalidate tokens — Revoke all Vault tokens that accessed the leaked secret
Audit blast radius — Query audit logs for usage of the compromised secret in the exposure window
Notify stakeholders — Security team, affected service owners, compliance (if PII/regulated data)
Post-mortem — Document root cause, update controls to prevent recurrence
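The blast-radius step amounts to filtering audit events by secret path and exposure window. A minimal sketch follows, assuming events have already been parsed into dicts with `path`, `time` (timezone-aware datetime), and `identity` keys; those key names and the function name are hypothetical.

```python
from datetime import datetime, timezone


def blast_radius(events, secret_path, window_start, window_end):
    """Identities that touched secret_path inside the exposure window."""
    return sorted({
        e["identity"]
        for e in events
        if e["path"] == secret_path and window_start <= e["time"] <= window_end
    })


def utc(*args):
    return datetime(*args, tzinfo=timezone.utc)


# Hypothetical events, for illustration only.
events = [
    {"path": "secret/data/ci/deploy", "time": utc(2026, 3, 20, 9, 0), "identity": "github-ci"},
    {"path": "secret/data/ci/deploy", "time": utc(2026, 3, 21, 2, 30), "identity": "unknown-token"},
    {"path": "secret/data/ci/deploy", "time": utc(2026, 3, 25, 9, 0), "identity": "github-ci"},
    {"path": "secret/data/other", "time": utc(2026, 3, 21, 3, 0), "identity": "api-server"},
]
exposed_to = blast_radius(
    events, "secret/data/ci/deploy", utc(2026, 3, 20, 0, 0), utc(2026, 3, 22, 0, 0)
)
```

Any identity in the result that is not an expected consumer of the secret is a candidate for token revocation in the previous step.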

Vault Seal Operations

When to seal: Active security incident affecting Vault infrastructure, suspected key compromise.

Sealing stops all Vault operations. Use it only as a last resort.

Unseal procedure:

Gather quorum of unseal key holders (Shamir threshold)
Or confirm auto-unseal KMS key is accessible
Unseal via vault operator unseal or restart with auto-unseal
Verify audit devices reconnected
Check active leases and token validity

See references/emergency_procedures.md for complete playbooks.

CI/CD Integration

Vault Agent Sidecar (Kubernetes)

Vault Agent runs alongside application pods and handles authentication and secret rendering:

# Pod annotation for Vault Agent Injector
annotations:
  vault.hashicorp.com/agent-inject: "true"
  vault.hashicorp.com/role: "api-server"
  vault.hashicorp.com/agent-inject-secret-db: "database/creds/app-readonly"
  vault.hashicorp.com/agent-inject-template-db: |
    {{- with secret "database/creds/app-readonly" -}}
    postgresql://{{ .Data.username }}:{{ .Data.password }}@db:5432/app
    {{- end }}

External Secrets Operator (Kubernetes)

For teams preferring declarative GitOps over agent sidecars:

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: api-credentials
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend
    kind: ClusterSecretStore
  target:
    name: api-credentials
  data:
    - secretKey: api-key
      remoteRef:
        key: secret/data/production/api
        property: key

GitHub Actions OIDC

Eliminate long-lived secrets in CI by using OIDC federation:

# The job also needs `permissions: id-token: write` so GitHub issues the OIDC token
- name: Authenticate to Vault
  uses: hashicorp/vault-action@v2
  with:
    url: https://vault.example.com
    method: jwt
    role: github-ci
    jwtGithubAudience: https://vault.example.com
    secrets: |
      secret/data/ci/deploy api_key | DEPLOY_API_KEY ;
      secret/data/ci/deploy db_password | DB_PASSWORD

Anti-Patterns

Anti-Pattern	Risk	Correct Approach
Hardcoded secrets in source code	Leak via repo, logs, error output	Fetch from secret store at runtime
Long-lived static tokens (>30 days)	Stale credentials, no accountability	Dynamic secrets or short TTL + rotation
Shared service accounts	No audit trail per consumer	Per-service identity with unique credentials
No rotation policy	Compromised creds persist indefinitely	Automated rotation on schedule
Secrets in environment variables on CI	Visible in build logs, process table	Vault Agent or OIDC-based injection
Single unseal key holder	Bus factor of 1, recovery blocked	Shamir split (3-of-5) or auto-unseal
No audit device configured	Zero visibility into access	Dual audit devices (file + syslog)
Wildcard policies (`path "*"`)	Over-permissioned, violates least privilege	Explicit path-based policies per service
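The wildcard anti-pattern is one that a simple lint can catch before a policy is applied. This is a sketch under assumptions: it uses a regular expression rather than a real HCL parser, the function name is hypothetical, and "fewer than two literal path segments" is an arbitrary threshold for what counts as too broad.

```python
import re

# Matches: path "<path>" { <body> }  (good enough for flat policy stanzas)
STANZA_RE = re.compile(r'path\s+"([^"]+)"\s*\{([^}]*)\}')


def overly_broad_paths(policy_hcl, min_literal_segments=2):
    """Return wildcard paths that grant access with too few literal segments."""
    flagged = []
    for path, body in STANZA_RE.findall(policy_hcl):
        if '"deny"' in body:
            continue  # a broad deny restricts access, so it is not flagged
        segments = [s for s in path.split("/") if s]
        has_wildcard = path.endswith("*") or "+" in segments
        literal = [s for s in segments if s != "+" and not s.endswith("*")]
        if has_wildcard and len(literal) < min_literal_segments:
            flagged.append(path)
    return flagged


sample_policy = '''
path "*" {
  capabilities = ["read"]
}
path "secret/*" {
  capabilities = ["read", "list"]
}
path "secret/data/production/payment/*" {
  capabilities = ["read"]
}
path "sys/*" {
  capabilities = ["deny"]
}
'''
findings = overly_broad_paths(sample_policy)
```

A check like this fits naturally into CI for the repository that holds the policy files, so broad grants are caught in review rather than in an audit.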

Tools

Script	Purpose
`vault_config_generator.py`	Generate Vault policy and auth config from application requirements
`rotation_planner.py`	Create rotation schedule from a secret inventory file
`audit_log_analyzer.py`	Analyze audit logs for anomalies and compliance gaps

Cross-References

env-secrets-manager — Local .env file hygiene, leak detection, drift awareness
senior-secops — Security operations, incident response, threat modeling
ci-cd-pipeline-builder — Pipeline design where secrets are consumed
docker-development — Container secret injection patterns
helm-chart-builder — Kubernetes secret management in Helm charts

Info

Category Programming & Development

Name secrets-vault-manager

Version v20260326

Size 28.66KB

Source alirezarezvani/claude-skills

Updated 2026-03-27