result-to-claim
wanshuiyin/Auto-claude-code-research-in-sleep
This skill guides the evaluation of experimental results against the scientific claims they are intended to support. After experiments have run, it systematically collects data from the available sources (W&B, logs, etc.), performs a deterministic evidence pre-check to detect hallucinated evidence, and finally uses a Codex judgment to determine whether the results support the claims and which next action is required, such as pivoting or supplementing the findings.
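The deterministic evidence pre-check can be pictured with a minimal sketch. It is an illustration under stated assumptions, not the skill's actual implementation: the `Evidence` schema, the `precheck` function, the shape of `collected`, and the run and metric names are all hypothetical. The idea shown is only that every number cited as evidence is looked up in the collected data and rejected if its source, metric, or value cannot be confirmed, before any model-based judgment is made.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """One cited number backing a claim (hypothetical schema)."""
    source: str   # e.g. a log file name or a W&B run id
    metric: str
    value: float


def precheck(evidence, collected, rel_tol=1e-6):
    """Return (evidence, reason) pairs for items that cannot be verified.

    `collected` maps source -> {metric: value}, as gathered from logs or W&B.
    The check is deterministic: same inputs, same failures, no model involved.
    """
    failures = []
    for ev in evidence:
        metrics = collected.get(ev.source)
        if metrics is None:
            failures.append((ev, "unknown source"))
        elif ev.metric not in metrics:
            failures.append((ev, "metric not found in source"))
        elif not math.isclose(metrics[ev.metric], ev.value, rel_tol=rel_tol):
            failures.append((ev, "value mismatch"))
    return failures


# Hypothetical collected data and cited evidence.
collected = {"run_a": {"accuracy": 0.912, "loss": 0.31}}
evidence = [
    Evidence("run_a", "accuracy", 0.912),  # verifiable
    Evidence("run_a", "f1", 0.88),         # metric never logged
    Evidence("run_b", "accuracy", 0.95),   # run does not exist
    Evidence("run_a", "loss", 0.13),       # digits transposed
]

failures = precheck(evidence, collected)
for ev, reason in failures:
    print(f"{ev.source}/{ev.metric}: {reason}")
```

In this sketch, only evidence that passes the pre-check would be forwarded to the judgment step; any failure is a signal that a cited result may have been hallucinated and should be re-collected or dropped.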