Skill UI
Browse and discover 7568+ curated skills
Capability, found 5 results
LLM Agent Benchmarking and Evaluation
agent-evaluation
sickn33/antigravity-awesome-skills
271
A comprehensive framework for rigorously testing and benchmarking LLM agents. It moves beyond simple pass/fail checks to assess complex behaviors, reliability metrics, and capability consistency across multiple runs. Ideal for production monitoring and identifying subtle failure modes in advanced AI agents.
Claude Eval Harness
eval-harness
affaan-m/everything-claude-code
90
Formal evaluation framework for Claude Code sessions that applies evaluation-driven development to define capability and regression checks, run deterministic and model graders, and track pass@k/pass^k reliability metrics before shipping.
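The listing does not define pass@k or pass^k. As a minimal sketch, assuming the commonly used definitions (pass@k: at least one of k sampled trials passes; pass^k: all k sampled trials pass), both can be estimated from n recorded trials of which c passed. The function names below are hypothetical and are not taken from the eval-harness skill itself.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the chance that at least one of k trials passes.

    n: total recorded trials, c: trials that passed, k: sample size.
    Uses the combinatorial estimator 1 - C(n - c, k) / C(n, k).
    """
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k trials failed,
    # so the estimate is exactly 1.0 in that case.
    return 1.0 - comb(n - c, k) / comb(n, k)


def pass_hat_k(n: int, c: int, k: int) -> float:
    """Estimate the chance that all k trials pass: C(c, k) / C(n, k)."""
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    return comb(c, k) / comb(n, k)


if __name__ == "__main__":
    # Assumed example data: 10 recorded trials, 7 of them passing.
    print(f"pass@3 = {pass_at_k(10, 7, 3):.4f}")
    print(f"pass^3 = {pass_hat_k(10, 7, 3):.4f}")
```

Under these assumed definitions, pass@k rewards an agent that succeeds at least occasionally, while pass^k penalizes any inconsistency, so the two diverge sharply for agents that pass most but not all runs.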
AI-Shaped Readiness Advisor
ai-shaped-readiness-advisor
deanpeters/Product-Manager-Skills
479
Interactive assessment that gauges whether your product work is AI-first or AI-shaped, measures readiness across five essential PM competencies for 2026, highlights gaps, and recommends which AI capability to build next.
AWS Agentic AI
aws-agentic-ai
sickn33/antigravity-awesome-skills
479
AWS Agentic AI refers to an AI-driven agentic capability within AWS that aims to automate complex workflows by orchestrating services and responding to evolving goals with minimal human intervention.
AI Agent Self-Evolution Engine
capability-evolver
EvoMap/evolver
446
Evolver is a sophisticated self-evolution engine designed for advanced AI agents. It analyzes runtime logs and operational history to autonomously identify performance bottlenecks, failures, and opportunities for improvement. It applies protocol-constrained evolution, communicating exclusively via a local Proxy Mailbox to interact with the EvoMap Hub for asset submission and task claiming.