Skill UI
Browse and discover 9688+ curated skills
Search: Fast Serving, found 3 results
FastAPI ML Endpoint
fastapi-ml-endpoint
jeremylongshore/claude-code-plugins-plus-skills
207
Automates FastAPI ML endpoint deployments by guiding architecture, producing production-ready code, and validating serving configs when you invoke the ML deployment skill.
SGLang Structured Inference Service
sglang
Orchestra-Research/AI-Research-SKILLs
150
SGLang is a high-performance serving framework for LLMs and VLMs. It uses RadixAttention prefix caching to speed up inference, supports structured outputs (JSON/regex/grammar) and agentic workflows with function calls, and is described as up to 5× faster than vLLM in multi-GPU production deployments.
TensorRT LLM Optimizer
tensorrt-llm
Orchestra-Research/AI-Research-SKILLs
334
Optimizes large language model inference on NVIDIA GPUs with TensorRT, with a stated 10-100× throughput gain, quantized precision (FP8/INT4), multi-GPU scaling, and serving-ready tooling for production deployments.