hugging-face-community-evals
sickn33/antigravity-awesome-skills
This skill runs evaluations and benchmarks locally for models hosted on the Hugging Face Hub. It drives the `inspect-ai` and `lighteval` evaluation frameworks on local hardware. Users can choose an inference backend: `vllm` for high throughput, with Hugging Face Transformers or `accelerate` as fallbacks. It also covers smoke testing and task selection for LLMs. The scope is limited to local execution; remote jobs and publishing workflows are out of scope.
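The backend choice described above can be sketched as a simple availability check. This is a minimal illustration, not the skill's actual logic: the names `BACKEND_PREFERENCE`, `available_backends`, and `pick_backend` are hypothetical, and the preference order (`vllm` first, then the two fallbacks) is an assumption read off the description. It only checks whether each package is importable; it does not launch `inspect-ai` or `lighteval`.

```python
from importlib.util import find_spec

# Assumed preference order, taken from the description: vllm for throughput,
# then the Transformers and accelerate fallbacks. Not confirmed by the skill.
BACKEND_PREFERENCE = ("vllm", "transformers", "accelerate")


def _importable(name):
    """Return True if a top-level package with this name can be found."""
    return find_spec(name) is not None


def available_backends(candidates=BACKEND_PREFERENCE, is_installed=None):
    """Return the candidate backends that are installed, in preference order.

    `is_installed` is a hypothetical hook so the check can be replaced in
    tests; by default it looks the package up in the current environment.
    """
    check = is_installed if is_installed is not None else _importable
    return [name for name in candidates if check(name)]


def pick_backend(candidates=BACKEND_PREFERENCE, is_installed=None):
    """Return the first installed backend, or None if none are installed."""
    found = available_backends(candidates, is_installed)
    return found[0] if found else None


if __name__ == "__main__":
    backend = pick_backend()
    if backend is None:
        print("No supported inference backend is installed.")
    else:
        print(f"Selected backend: {backend}")
```

Calling `pick_backend()` with no arguments inspects the current Python environment, so the result depends on which packages are installed locally. How the selected name is then passed to an evaluation framework is left out here, since that depends on each framework's own interface.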