experiment-audit
wanshuiyin/Auto-claude-code-research-in-sleep
This advanced skill audits the integrity of AI/LLM experimental results. It uses a cross-model review process to examine evaluation scripts, result files, and published claims. It is designed to detect common forms of academic fraud: synthetic ground truth, manipulated score normalization, references to results that do not exist, and misrepresentation of an experiment's scope. The goal is to support the scientific rigor and trustworthiness of reported findings.
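To make one of these checks concrete, here is a minimal sketch of how a reference to a non-existent result might be detected. It is not the skill's actual implementation. The function name `find_unsupported_claims`, the flat metric-to-value layout of the result file, and the sample numbers are all assumptions made for illustration.

```python
import json
import math


def find_unsupported_claims(claims, results, tol=1e-3):
    """Return (metric, reason) pairs for claims the result file does not back up.

    Hypothetical helper: `claims` and `results` both map metric names to floats.
    """
    problems = []
    for metric, claimed in claims.items():
        if metric not in results:
            # The claim cites a metric that was never recorded.
            problems.append((metric, "missing from result file"))
        elif not math.isclose(claimed, results[metric], abs_tol=tol):
            # The metric exists, but the reported value does not match.
            problems.append(
                (metric, f"claimed {claimed}, recorded {results[metric]}")
            )
    return problems


# Illustrative data only; a real audit would load the project's result files.
results = json.loads('{"accuracy": 0.812, "f1": 0.774}')
claims = {"accuracy": 0.812, "f1": 0.790, "bleu": 31.2}

report = find_unsupported_claims(claims, results)
for metric, reason in report:
    print(f"{metric}: {reason}")
```

A check like this only compares numbers. Judging whether the ground truth is synthetic or the scope is misrepresented would still require reading the evaluation scripts themselves, which is presumably where the cross-model review comes in.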