
Cloud ML Workload Migration Guide

v20260423
coreweave-migration-deep-dive
This skill pack provides detailed guidance on migrating machine learning workloads (including inference services and training pipelines) from major cloud platforms such as AWS, GCP, and Azure to the CoreWeave GPU cloud. It covers cost comparison, containerization steps, Kubernetes configuration adaptation, and phased deployment, helping you keep the migration smooth, efficient, and cost-effective.
Overview

CoreWeave Migration Deep Dive

Cost Comparison

| Instance | AWS | CoreWeave | Savings |
|---|---|---|---|
| 1x A100 80GB | ~$3.60/hr (p4d) | ~$2.21/hr | ~39% |
| 8x A100 80GB | ~$32/hr (p4d.24xl) | ~$17.70/hr | ~45% |
| 1x H100 80GB | ~$6.50/hr (p5) | ~$4.76/hr | ~27% |
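To put the hourly rates in perspective, a quick back-of-the-envelope calculation (a sketch using the approximate rates from the table above and 730 hours per month) shows what an always-on 8x A100 node costs on each side:

```shell
# Rough monthly cost of an always-on 8x A100 80GB node (730 hours/month),
# using the approximate hourly rates from the table above.
HOURS=730
aws_monthly=$(awk -v h="$HOURS" 'BEGIN { printf "%.0f", 32 * h }')
cw_monthly=$(awk -v h="$HOURS" 'BEGIN { printf "%.0f", 17.70 * h }')
echo "AWS p4d.24xl:     ~\$${aws_monthly}/month"
echo "CoreWeave 8xA100: ~\$${cw_monthly}/month"
```

At these rates the difference is roughly $10k/month per always-on node, which is usually what justifies the migration effort.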

Migration Steps

Phase 1: Containerize

```shell
# If running on bare EC2/GCE, containerize first
docker build -t inference-server:v1 .
docker push ghcr.io/myorg/inference-server:v1
```
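If the workload is not containerized yet, the image built above might start from a Dockerfile along these lines (a minimal sketch: the CUDA base image, port, and `serve.py` entrypoint are illustrative assumptions, not part of the original guide):

```dockerfile
# Illustrative Dockerfile for a Python GPU inference server.
# Base image, port, and entrypoint are assumptions.
FROM nvidia/cuda:12.1.1-runtime-ubuntu22.04
RUN apt-get update && apt-get install -y --no-install-recommends python3-pip \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
# serve.py is a placeholder for your inference entrypoint
CMD ["python3", "serve.py"]
```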

Phase 2: Adapt YAML for CoreWeave

Key changes from AWS EKS / GKE:

  1. Node affinity: Use gpu.nvidia.com/class instead of nvidia.com/gpu.product
  2. Storage: Use CoreWeave storage classes (shared-ssd-ord1)
  3. Networking: CoreWeave provides flat networking within VPC
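Putting the changes above together, the CoreWeave-side manifest might look like this (a sketch: the GPU class value, image, and PVC name are illustrative, not prescribed by the original guide):

```yaml
# Sketch of a Deployment adapted for CoreWeave node scheduling.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inference-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: inference-server
  template:
    metadata:
      labels:
        app: inference-server
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: gpu.nvidia.com/class   # CoreWeave label, not nvidia.com/gpu.product
                    operator: In
                    values:
                      - A100_PCIE_80GB          # illustrative class value
      containers:
        - name: server
          image: ghcr.io/myorg/inference-server:v1
          resources:
            limits:
              nvidia.com/gpu: 1
          volumeMounts:
            - name: model-store
              mountPath: /models
      volumes:
        - name: model-store
          persistentVolumeClaim:
            claimName: model-store   # a PVC created with storageClassName: shared-ssd-ord1
```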

Phase 3: Parallel Deploy

Run the old and new infrastructure side by side, and gradually shift traffic to CoreWeave.
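One way to shift traffic gradually is a weighted canary at the ingress layer. This sketch assumes an NGINX ingress controller fronting both environments; the hostname, service name, and weight are illustrative:

```yaml
# Canary ingress sending a fraction of traffic to the CoreWeave deployment.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: inference-canary
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "10"   # 10% of traffic to CoreWeave
spec:
  rules:
    - host: inference.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: inference-server   # service backing the CoreWeave deployment
                port:
                  number: 8000
```

Raise the weight in steps (10% → 50% → 100%) while watching latency and error rates before cutting over.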

Phase 4: Cut Over

Decommission old GPU instances after validation period.

Common Gotchas

| Issue | Solution |
|---|---|
| Different CUDA drivers | Match the container's CUDA version to CoreWeave node drivers |
| Storage migration | Use rclone or rsync to move data to a CoreWeave PVC |
| DNS changes | Update ingress/load balancer DNS records |
| IAM differences | CoreWeave uses kubeconfig-based access, not IAM roles |
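For the storage migration, the rclone route might look like this (a sketch: it assumes an rclone remote named `s3remote` is already configured, and that the target PVC is mounted at `/mnt/pvc` inside a transfer pod; the paths are illustrative):

```shell
# Copy training data from S3 into a CoreWeave PVC mounted in a transfer pod.
# "s3remote" and the paths below are placeholders for your own setup.
rclone copy s3remote:my-training-data /mnt/pvc/training-data \
  --progress --transfers 16
```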

Next Steps

This completes the CoreWeave skill pack. Start with coreweave-install-auth for new deployments.

Info
Category: AI
Name: coreweave-migration-deep-dive
Version: v20260423
Size: 2.11 KB
Updated: 2026-04-28
Language: