CoreWeave Hello World

Overview

Deploy your first GPU workload on CoreWeave: a simple inference service using vLLM or a batch CUDA job. CoreWeave runs Kubernetes on bare-metal GPU nodes with A100, H100, and L40 GPUs.

Prerequisites

  • Completed coreweave-install-auth setup
  • kubectl configured with CoreWeave kubeconfig
  • Namespace with GPU quota
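
Before deploying, run a quick sanity check to confirm that kubectl points at your CoreWeave cluster, GPU nodes are visible, and your namespace has quota. A minimal sketch (substitute your actual namespace for <your-namespace>):

kubectl config current-context
kubectl get nodes -L gpu.nvidia.com/class           # list nodes with their GPU class label
kubectl describe resourcequota -n <your-namespace>  # confirm nvidia.com/gpu quota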

Instructions

Step 1: Deploy a vLLM Inference Server
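
The manifest below runs a single vLLM replica serving Llama 3.1 8B on one GPU, exposes it internally through a ClusterIP Service, and uses node affinity to pin the pod to 80GB A100 nodes.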

# vllm-inference.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllm-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vllm-server
  template:
    metadata:
      labels:
        app: vllm-server
    spec:
      containers:
        - name: vllm
          image: vllm/vllm-openai:latest
          args:
            - "--model"
            - "meta-llama/Llama-3.1-8B-Instruct"
            - "--port"
            - "8000"
          ports:
            - containerPort: 8000
          resources:
            limits:
              nvidia.com/gpu: 1
              memory: 48Gi
              cpu: "8"
            requests:
              nvidia.com/gpu: 1
              memory: 32Gi
              cpu: "4"
          env:
            - name: HUGGING_FACE_HUB_TOKEN
              valueFrom:
                secretKeyRef:
                  name: hf-token
                  key: token
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: gpu.nvidia.com/class
                    operator: In
                    values: ["A100_PCIE_80GB"]
---
apiVersion: v1
kind: Service
metadata:
  name: vllm-server
spec:
  selector:
    app: vllm-server
  ports:
    - port: 8000
      targetPort: 8000
  type: ClusterIP

# Create the HuggingFace token secret referenced by the Deployment
kubectl create secret generic hf-token --from-literal=token="${HF_TOKEN}"

# Deploy
kubectl apply -f vllm-inference.yaml
kubectl get pods -w  # wait for Running; the first start downloads model weights and can take several minutes

# Port-forward and test
kubectl port-forward svc/vllm-server 8000:8000 &
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "meta-llama/Llama-3.1-8B-Instruct", "messages": [{"role": "user", "content": "Hello!"}]}'

Step 2: Batch GPU Job
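
The Job below runs a short PyTorch CUDA check on a single GPU and exits. restartPolicy: Never avoids restarting a failed container in place, and the same node affinity pins the pod to 80GB A100 nodes.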

# gpu-batch-job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: gpu-benchmark
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: benchmark
          image: pytorch/pytorch:2.2.0-cuda12.1-cudnn8-runtime
          command: ["python3", "-c"]
          args:
            - |
              import torch
              print(f"CUDA available: {torch.cuda.is_available()}")
              print(f"GPU: {torch.cuda.get_device_name(0)}")
              x = torch.randn(10000, 10000, device="cuda")
              y = torch.matmul(x, x)
              print(f"Matrix multiply result shape: {y.shape}")
              print("CoreWeave GPU test passed!")
          resources:
            limits:
              nvidia.com/gpu: 1
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: gpu.nvidia.com/class
                    operator: In
                    values: ["A100_PCIE_80GB"]

kubectl apply -f gpu-batch-job.yaml
kubectl logs job/gpu-benchmark --follow
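
Once the logs show the test passed, you can block until the Job completes and then clean it up (assumes the Job runs in your current namespace):

kubectl wait --for=condition=complete job/gpu-benchmark --timeout=600s
kubectl delete job gpu-benchmark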

Error Handling

| Error | Cause | Solution |
| --- | --- | --- |
| Pod stuck in Pending | No GPU capacity | Try a different GPU class or check your namespace quota |
| nvidia-smi not found | Wrong base image | Use an NVIDIA CUDA base image |
| OOMKilled | Container exceeded its memory limit | Raise the memory limit, or move to a larger GPU (80GB A100) if the error is CUDA out-of-memory |
| Image pull error | Registry authentication | Create an imagePullSecret and reference it in the pod spec |
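
For any of these, the pod's events usually pinpoint the cause. A generic sketch (adjust the label selector to the workload in question):

kubectl describe pod -l app=vllm-server | tail -n 25   # events are printed at the bottom
kubectl get events --sort-by=.lastTimestamp | tail -n 25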

Next Steps

Proceed to coreweave-local-dev-loop for development workflow setup.
