CoreWeave GPU Workload Security Configuration

v20260423

coreweave-security-basics

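This skill guides users through securely deploying GPU workloads on the CoreWeave platform. It covers security best practices ranging from API key management and RBAC access control to network policies (NetworkPolicy), and helps users isolate namespaces and protect model weights and sensitive training data in a cloud-native environment.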

Kubernetes, Security, RBAC, GPU, Key Management, Networking, Cloud Platform


200 downloads


CoreWeave Security Basics

Overview

CoreWeave provides bare-metal GPU cloud on Kubernetes. Security concerns center on compute credential management (kubeconfig, deploy tokens), network isolation between inference workloads, secrets for model registry access (HuggingFace, container registries), and protecting sensitive training data on persistent volumes. A compromised namespace can expose GPU resources, model weights, and customer inference data.

API Key Management

import { KubeConfig, CoreV1Api } from "@kubernetes/client-node";

function createCoreWeaveClient(): CoreV1Api {
  const apiKey = process.env.COREWEAVE_API_KEY;
  if (!apiKey) {
    throw new Error("Missing COREWEAVE_API_KEY — set via secrets manager");
  }
  const kc = new KubeConfig();
  kc.loadFromDefault();
  const api = kc.makeApiClient(CoreV1Api);
  // Never log kubeconfig or API key contents
  console.log("CoreWeave client initialized for namespace:", process.env.CW_NAMESPACE);
  return api;
}
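The client above checks a single variable and then logs `CW_NAMESPACE`, which may be undefined. A minimal sketch of a fail-fast check for several variables follows. The helper names `requireEnv` and `describeSecret` are hypothetical and not part of any CoreWeave or Kubernetes SDK; the environment is passed in as a plain record so the sketch does not depend on a particular runtime.

```typescript
// Hypothetical helpers for illustration only.
type Env = Record<string, string | undefined>;

// Returns the requested variables, or throws listing every missing name.
// Only variable names appear in the error message, never their values.
function requireEnv(env: Env, names: string[]): Record<string, string> {
  const missing = names.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  const resolved: Record<string, string> = {};
  for (const name of names) {
    resolved[name] = env[name] as string;
  }
  return resolved;
}

// Produces a log-safe description of a secret without revealing any of it.
function describeSecret(value: string): string {
  return `[set, ${value.length} chars]`;
}
```

In a Node.js service this could be called once at startup as `requireEnv(process.env, ["COREWEAVE_API_KEY", "CW_NAMESPACE"])`, so that a missing setting stops the process before any client is created.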

Webhook Signature Verification

import crypto from "crypto";
import { Request, Response, NextFunction } from "express";

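function verifyCoreWeaveWebhook(req: Request, res: Response, next: NextFunction): void {
  const signature = req.headers["x-coreweave-signature"];
  const secret = process.env.COREWEAVE_WEBHOOK_SECRET;
  if (typeof signature !== "string" || !secret) {
    res.status(401).send("Invalid signature");
    return;
  }
  // req.body must be the raw request bytes (e.g. via express.raw()), not parsed JSON
  const expected = crypto.createHmac("sha256", secret).update(req.body).digest("hex");
  const received = Buffer.from(signature);
  const computed = Buffer.from(expected);
  // timingSafeEqual throws on a length mismatch, so compare lengths first
  if (received.length !== computed.length || !crypto.timingSafeEqual(received, computed)) {
    res.status(401).send("Invalid signature");
    return;
  }
  next();
}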
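The comparison logic can also be kept separate from Express so it is easy to unit test. The sketch below assumes the same scheme as the middleware (a hex-encoded HMAC-SHA256 of the raw body); whether CoreWeave actually signs webhooks this way is not confirmed here. `signPayload` and `isValidSignature` are hypothetical names.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Computes the hex HMAC-SHA256 of the raw request body.
function signPayload(rawBody: string, secret: string): string {
  return createHmac("sha256", secret).update(rawBody).digest("hex");
}

// Returns true only when the received signature matches the expected one.
// Lengths are compared first because timingSafeEqual throws when the two
// buffers differ in length.
function isValidSignature(rawBody: string, signature: string, secret: string): boolean {
  const expected = Buffer.from(signPayload(rawBody, secret), "utf8");
  const received = Buffer.from(signature, "utf8");
  if (received.length !== expected.length) {
    return false;
  }
  return timingSafeEqual(received, expected);
}
```

The body must be the exact bytes that were signed: verifying against re-serialized JSON will usually fail because key order and whitespace can differ.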

Input Validation

import { z } from "zod";

const WorkloadRequestSchema = z.object({
  namespace: z.string().regex(/^[a-z0-9-]+$/).max(63),
  gpu_type: z.enum(["A100_80GB", "A100_40GB", "H100_80GB", "RTX_A6000"]),
  gpu_count: z.number().int().min(1).max(8),
  image: z.string().regex(/^[a-z0-9.\-/]+:[a-z0-9.\-]+$/),
  model_id: z.string().min(1).max(200),
});

function validateWorkloadRequest(data: unknown) {
  return WorkloadRequestSchema.parse(data);
}
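To see what the schema's patterns accept, the two regular expressions can be probed directly without zod. This is a sketch: the image references are made-up samples, and `STRICT_NAMESPACE_PATTERN` is an assumed tightening that follows the Kubernetes RFC 1123 label rule (names must start and end with an alphanumeric character), which the schema's namespace pattern does not enforce.

```typescript
// Patterns copied verbatim from WorkloadRequestSchema.
const NAMESPACE_PATTERN = /^[a-z0-9-]+$/;
const IMAGE_PATTERN = /^[a-z0-9.\-/]+:[a-z0-9.\-]+$/;

// Assumption: stricter alternative matching the Kubernetes RFC 1123 label rule.
const STRICT_NAMESPACE_PATTERN = /^[a-z0-9]([a-z0-9-]*[a-z0-9])?$/;

function acceptsImage(ref: string): boolean {
  return IMAGE_PATTERN.test(ref);
}

function acceptsNamespace(name: string, strict = false): boolean {
  if (name.length > 63) return false;
  return (strict ? STRICT_NAMESPACE_PATTERN : NAMESPACE_PATTERN).test(name);
}

// Hypothetical image references used only to probe the pattern.
const samples = [
  "ghcr.io/acme/llm-server:1.2.0", // lowercase repository with a tag
  "ghcr.io/acme/llm-server",       // no tag
  "registry.local:5000/llm:1.0",   // registry host with a port
  "ghcr.io/acme/llm:v1.0-CUDA",    // uppercase characters in the tag
];
for (const ref of samples) {
  console.log(ref, acceptsImage(ref));
}
```

Only the first sample passes. The image pattern requires an explicit tag and rejects registry ports, uppercase tags, and digest references, so teams that rely on those forms would need to widen it deliberately.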

Data Protection

const CW_SENSITIVE_FIELDS = ["kubeconfig", "hf_token", "registry_password", "api_key", "model_weights_url"];

function redactCoreWeaveLog(record: Record<string, unknown>): Record<string, unknown> {
  const redacted = { ...record };
  for (const field of CW_SENSITIVE_FIELDS) {
    if (field in redacted) redacted[field] = "[REDACTED]";
  }
  return redacted;
}
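`redactCoreWeaveLog` only inspects top-level keys, so a token nested inside another object would still be logged. A possible recursive variant is sketched below. The name `redactDeep` is hypothetical, the field list is copied from `CW_SENSITIVE_FIELDS`, and the function is intended for plain JSON-like records only (class instances such as `Date` would be flattened into plain objects).

```typescript
// Field names copied from CW_SENSITIVE_FIELDS.
const SENSITIVE_FIELDS = new Set([
  "kubeconfig",
  "hf_token",
  "registry_password",
  "api_key",
  "model_weights_url",
]);

// Returns a redacted deep copy; the input value is not mutated.
function redactDeep(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => redactDeep(item));
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      // Replace the whole value of a sensitive key, whatever its type.
      out[key] = SENSITIVE_FIELDS.has(key) ? "[REDACTED]" : redactDeep(inner);
    }
    return out;
  }
  return value;
}
```

Matching is by exact key name, so a secret stored under an unexpected key (or embedded inside a free-text message) would not be caught by this approach.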

Security Checklist

Kubeconfig stored in secrets manager, never in repos
Kubernetes Secrets used for model tokens (not env vars in YAML)
Network policies restrict inference endpoint access
RBAC limits namespace access per team
Container images scanned for CVEs before deployment
PVCs encrypted at rest for training data
GPU workload namespaces isolated with NetworkPolicy
Deploy tokens scoped per-namespace, not cluster-wide
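
The NetworkPolicy items in the checklist can be sketched as a manifest built in code. The example below constructs a `networking.k8s.io/v1` NetworkPolicy as a plain object that allows ingress to inference pods only from one other namespace. The function name, the policy name, and the `app: inference` pod label are hypothetical, and the sketch assumes the cluster's network plugin enforces NetworkPolicy.

```typescript
// Builds an ingress-only NetworkPolicy for pods labeled app=inference.
// Once a policy selects these pods, traffic from sources not listed under
// "from" is dropped.
function buildInferenceIngressPolicy(namespace: string, allowedNamespace: string, port: number) {
  return {
    apiVersion: "networking.k8s.io/v1",
    kind: "NetworkPolicy",
    metadata: { name: "inference-ingress", namespace },
    spec: {
      // Hypothetical label; adjust to the labels on the real inference pods.
      podSelector: { matchLabels: { app: "inference" } },
      policyTypes: ["Ingress"],
      ingress: [
        {
          from: [
            {
              // kubernetes.io/metadata.name is set automatically on namespaces
              // by recent Kubernetes versions.
              namespaceSelector: {
                matchLabels: { "kubernetes.io/metadata.name": allowedNamespace },
              },
            },
          ],
          ports: [{ protocol: "TCP", port }],
        },
      ],
    },
  };
}

// Print the manifest as JSON, which kubectl accepts as well as YAML.
console.log(JSON.stringify(buildInferenceIngressPolicy("team-a", "gateway", 8000), null, 2));
```

The namespace names `team-a` and `gateway` and port `8000` are placeholders for illustration.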

Error Handling

Vulnerability	Risk	Mitigation
Leaked kubeconfig	Full cluster access, GPU resource theft	Secrets manager + RBAC scoping
Open inference endpoints	Unauthorized model access	NetworkPolicy ingress rules
Unscanned container images	CVE exploitation in GPU pods	CI image scanning before deploy
Overly broad RBAC	Cross-namespace data leakage	Per-team namespace RBAC bindings
Unencrypted PVCs	Training data exposure	Encrypted storage classes
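
For the "Overly broad RBAC" row, one way to express per-team namespace bindings is a namespaced Role plus a RoleBinding rather than a ClusterRoleBinding. The sketch below builds both `rbac.authorization.k8s.io/v1` objects as plain data. The function name, role name, and the chosen resources and verbs are assumptions for illustration; it deliberately grants no access to Secrets.

```typescript
// Builds a Role and RoleBinding that confine a team group to one namespace.
function buildTeamRbac(namespace: string, group: string) {
  const role = {
    apiVersion: "rbac.authorization.k8s.io/v1",
    kind: "Role",
    metadata: { name: "gpu-workload-editor", namespace },
    rules: [
      // Core API group ("") resources needed to run and expose workloads.
      { apiGroups: [""], resources: ["pods", "services"], verbs: ["get", "list", "watch", "create", "delete"] },
      // Read-only access to pod logs.
      { apiGroups: [""], resources: ["pods/log"], verbs: ["get"] },
      { apiGroups: ["apps"], resources: ["deployments"], verbs: ["get", "list", "watch", "create", "update", "delete"] },
    ],
  };
  const binding = {
    apiVersion: "rbac.authorization.k8s.io/v1",
    kind: "RoleBinding",
    metadata: { name: "gpu-workload-editor-binding", namespace },
    subjects: [{ kind: "Group", name: group, apiGroup: "rbac.authorization.k8s.io" }],
    // roleRef points at the namespaced Role, so the grant cannot reach other namespaces.
    roleRef: { kind: "Role", name: role.metadata.name, apiGroup: "rbac.authorization.k8s.io" },
  };
  return { role, binding };
}

const rbac = buildTeamRbac("team-a", "team-a-engineers");
console.log(JSON.stringify([rbac.role, rbac.binding], null, 2));
```

The group name `team-a-engineers` is a placeholder; the real value depends on how the cluster's identity provider maps users to groups.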

Resources

Next Steps

See coreweave-prod-checklist.

Info

Category Software Development

Name coreweave-security-basics

Version v20260423

Size 4.16KB

Source jeremylongshore/claude-code-plugins-plus-skills

Updated 2026-04-28