Analyze the current codebase and produce a knowledge-graph.json file in .understand-anything/. This file powers the interactive dashboard for exploring the project's architecture.
$ARGUMENTS may contain:
- `--full` — Force a full rebuild, ignoring any existing graph
- `--auto-update` — Enable automatic graph updates on commit (writes `autoUpdate: true` to .understand-anything/config.json)
- `--no-auto-update` — Disable automatic graph updates (writes `autoUpdate: false` to .understand-anything/config.json)
- `--review` — Run the full LLM graph-reviewer instead of inline deterministic validation

Determine whether to run a full analysis or incremental update.
Set PROJECT_ROOT to the current working directory. Record the current commit hash: git rev-parse HEAD
mkdir -p $PROJECT_ROOT/.understand-anything/intermediate
mkdir -p $PROJECT_ROOT/.understand-anything/tmp
3.5. Auto-update configuration:
- If `--auto-update` is in $ARGUMENTS: write `{"autoUpdate": true}` to $PROJECT_ROOT/.understand-anything/config.json
- If `--no-auto-update` is in $ARGUMENTS: write `{"autoUpdate": false}` to $PROJECT_ROOT/.understand-anything/config.json
Check if $PROJECT_ROOT/.understand-anything/knowledge-graph.json exists. If it does, read it.
Check if $PROJECT_ROOT/.understand-anything/meta.json exists. If it does, read it to get gitCommitHash.
Decision logic:
| Condition | Action |
|---|---|
| `--full` flag in $ARGUMENTS | Full analysis (all phases) |
| No existing graph or meta | Full analysis (all phases) |
| `--review` flag + existing graph + unchanged commit hash | Skip to Phase 6 (review-only — reuse existing assembled graph) |
| Existing graph + unchanged commit hash | Ask the user: "The graph is up to date at this commit. Would you like to: (a) run a full rebuild (--full), (b) run the LLM graph reviewer (--review), or (c) do nothing?" Then follow their choice. If they pick (c), STOP. |
| Existing graph + changed files | Incremental update (re-analyze changed files only) |
Review-only path: Copy the existing knowledge-graph.json to $PROJECT_ROOT/.understand-anything/intermediate/assembled-graph.json, then jump directly to Phase 6 step 3.
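The decision table can be sketched as a function (the function name and input shape are illustrative, not part of the spec):

```javascript
// Decide the analysis mode from flags and repository state,
// mirroring the decision table above.
function decideMode({ args, hasGraph, hasMeta, lastHash, headHash, changedFiles }) {
  if (args.includes('--full')) return 'full';            // forced rebuild
  if (!hasGraph || !hasMeta) return 'full';              // nothing to update
  if (args.includes('--review') && lastHash === headHash) return 'review-only';
  if (lastHash === headHash) return 'ask-user';          // graph already current
  if (changedFiles.length > 0) return 'incremental';     // re-analyze changed files
  return 'up-to-date';                                   // report and STOP
}
```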
For incremental updates, get the changed file list:
git diff <lastCommitHash>..HEAD --name-only
If this returns no files, report "Graph is up to date" and STOP.
Collect project context for subagent injection:
- Read README.md (or README.rst, readme.md) from $PROJECT_ROOT if it exists. Store as $README_CONTENT (first 3000 characters).
- Read the package manifest (package.json, pyproject.toml, Cargo.toml, go.mod, pom.xml) if it exists. Store as $MANIFEST_CONTENT.
- Capture a shallow directory tree:

find $PROJECT_ROOT -maxdepth 2 -type f -not -path '*/node_modules/*' -not -path '*/.git/*' -not -path '*/dist/*' | head -100

Store as $DIR_TREE.
- Look for a likely entry point among: src/index.ts, src/main.ts, src/App.tsx, index.js, main.py, manage.py, app.py, wsgi.py, asgi.py, run.py, __main__.py, main.go, cmd/*/main.go, src/main.rs, src/lib.rs, src/main/java/**/Application.java, Program.cs, config.ru, index.php. Store the first match as $ENTRY_POINT.

Dispatch a subagent using the prompt template at ./project-scanner-prompt.md. Read the template file and pass the full content as the subagent's prompt, appending the following additional context:
Additional context from main session:
Project README (first 3000 chars):
$README_CONTENT

Package manifest:

$MANIFEST_CONTENT

Use this context to produce more accurate project name, description, and framework detection. The README and manifest are authoritative — prefer their information over heuristics.
Pass these parameters in the dispatch prompt:
Scan this project directory to discover all project files (including non-code files like configs, docs, infrastructure), detect languages and frameworks. Project root:
$PROJECT_ROOT

Write output to: $PROJECT_ROOT/.understand-anything/intermediate/scan-result.json
After the subagent completes, read $PROJECT_ROOT/.understand-anything/intermediate/scan-result.json to get:
- `fileCategory` per file (code, config, docs, infra, data, script, markup)
- An import map (`importMap`): pre-resolved project-internal imports per file (non-code files have empty arrays)

Store `importMap` in memory as $IMPORT_MAP for use in Phase 2 batch construction.
Store the file list as $FILE_LIST with fileCategory metadata for use in Phase 2 batch construction.
Gate check: If >200 files, inform the user and suggest scoping with a subdirectory argument. Proceed only after the user confirms, noting that analysis may take a while.
Batch the file list from Phase 1 into groups of 20-30 files each (aim for ~25 files per batch for balanced sizes).
Batching strategy for non-code files:
- Keep related non-code files in the same batch where possible (e.g., a compose file that `depends_on` a Dockerfile)
- Each file's `fileCategory` from Phase 1 must be included in the batch file list

For each batch, dispatch a subagent using the prompt template at ./file-analyzer-prompt.md. Run up to 5 subagents concurrently using parallel dispatch. Pass the template as the subagent's prompt, appending the following additional context:
Additional context from main session:
Project:
<projectName> — <projectDescription>

Languages: <languages from Phase 1>
Before dispatching each batch, construct batchImportData from $IMPORT_MAP:
const batchImportData = {};
for (const file of batch) {
  batchImportData[file.path] = $IMPORT_MAP[file.path] ?? [];
}
Fill in batch-specific parameters below and dispatch:
Analyze these files and produce GraphNode and GraphEdge objects. Project root:
$PROJECT_ROOT

Project: <projectName>
Languages: <languages>
Batch index: <batchIndex>

Write output to: $PROJECT_ROOT/.understand-anything/intermediate/batch-<batchIndex>.json

Pre-resolved import data for this batch (use this for all import edge creation — do NOT re-resolve imports from source):

<batchImportData JSON>

Files to analyze in this batch:

- <path> (<sizeLines> lines, fileCategory: <fileCategory>)
- <path> (<sizeLines> lines, fileCategory: <fileCategory>)
- ...
After ALL batches complete, read each batch-<N>.json file and merge:
- Merge the `nodes` arrays. If duplicate node IDs exist, keep the later occurrence.
- Merge the `edges` arrays. Deduplicate by the composite key `source + target + type`.

For incremental updates, use the changed files list from Phase 0. Batch and dispatch file-analyzer subagents using the same process as above (20-30 files per batch, up to 5 concurrent, with batchImportData constructed from $IMPORT_MAP), but only for changed files.
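The merge-and-dedup rules can be sketched as follows (the function name is illustrative):

```javascript
// Merge batch outputs: later duplicate node IDs win, and edges are
// deduplicated by the composite key source + target + type.
function mergeBatches(batches) {
  const nodeById = new Map();
  const edgeByKey = new Map();
  for (const batch of batches) {
    for (const node of batch.nodes ?? []) {
      nodeById.set(node.id, node); // later occurrence overwrites earlier one
    }
    for (const edge of batch.edges ?? []) {
      edgeByKey.set(`${edge.source}\u0000${edge.target}\u0000${edge.type}`, edge);
    }
  }
  return { nodes: [...nodeById.values()], edges: [...edgeByKey.values()] };
}
```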
After batches complete, merge with the existing graph:
- Remove existing nodes whose `filePath` matches any changed file
- Remove existing edges whose `source` or `target` references a removed node

Merge all file-analyzer results into a single set of nodes and edges. Then perform basic integrity cleanup:

- Drop any edge whose `source` or `target` references a node ID that does not exist in the merged node set

Build the combined prompt template:
- Start with the base template at ./architecture-analyzer-prompt.md.
- For each detected language (python, markdown, dockerfile, yaml, sql, terraform, graphql, protobuf, shell, html, css), read the file at ./languages/<language-id>.md (e.g., ./languages/python.md, ./languages/dockerfile.md) and append its content after the base template under a ## Language Context header. If the file does not exist for a detected language, skip it silently and continue. These files are in the languages/ subdirectory next to this SKILL.md file. Include non-code language snippets — they provide edge patterns and summary styles for non-code files.
- For each detected framework (e.g., Django), read the file at ./frameworks/<framework-id-lowercase>.md (e.g., ./frameworks/django.md) and append its full content after the language context. If the file does not exist for a detected framework, skip it silently and continue. These files are in the frameworks/ subdirectory next to this SKILL.md file.

Pass the combined content as the subagent's prompt, appending the following additional context:
Additional context from main session:
Frameworks detected:
<frameworks from Phase 1>

Directory tree (top 2 levels):

$DIR_TREE

Use the directory tree, language context, and framework addendums (appended above) to inform layer assignments. Directory structure is strong evidence for layer boundaries. Non-code files (config, docs, infrastructure, data) should be assigned to appropriate layers — see the prompt template for guidance.
Pass these parameters in the dispatch prompt:
Analyze this codebase's structure to identify architectural layers. Project root:
$PROJECT_ROOT

Write output to: $PROJECT_ROOT/.understand-anything/intermediate/layers.json

Project: <projectName> — <projectDescription>

File nodes (all node types — includes code files, config, document, service, pipeline, table, schema, resource, endpoint):

[list of {id, type, name, filePath, summary, tags} for ALL file-level nodes — omit complexity, languageNotes]

Import edges:

[list of edges with type "imports"]

All edges (for cross-category analysis — includes configures, documents, deploys, triggers, etc.):
[list of ALL edges — include all edge types]
After the subagent completes, read $PROJECT_ROOT/.understand-anything/intermediate/layers.json and normalize it into a final layers array. Apply these steps in order:
- If the output is wrapped in `{ "layers": [...] }` instead of a plain array, extract the inner array. (The prompt requests a plain array, but LLMs may still produce an envelope.)
- If layers use a `nodes` field instead of `nodeIds`, rename `nodes` → `nodeIds`. If `nodes` entries are objects with an `id` field rather than plain strings, extract just the `id` values into `nodeIds`.
- If a layer is missing an `id`, generate one as `layer:<kebab-case-name>`.
- If `nodeIds` entries are raw file paths without a known prefix (`file:`, `config:`, `document:`, `service:`, `pipeline:`, `table:`, `schema:`, `resource:`, `endpoint:`), convert them to `file:<relative-path>`.
- Drop any `nodeIds` entries that do not exist in the merged node set.

Each element of the final layers array MUST have this shape:
[
{
"id": "layer:<kebab-case-name>",
"name": "<layer name>",
"description": "<what belongs in this layer>",
"nodeIds": ["file:src/App.tsx", "config:tsconfig.json", "document:README.md"]
}
]
All four fields (id, name, description, nodeIds) are required.
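The normalization steps can be sketched as follows (the helper name and fallbacks like `'unnamed'` are illustrative):

```javascript
// Known node ID prefixes from the node-type table.
const KNOWN_PREFIXES = ['file:', 'config:', 'document:', 'service:',
  'pipeline:', 'table:', 'schema:', 'resource:', 'endpoint:'];

// Normalize raw layer output (possibly enveloped, mis-keyed, or unprefixed)
// into the required { id, name, description, nodeIds } shape.
function normalizeLayers(raw, mergedNodeIds) {
  const layers = Array.isArray(raw) ? raw : (raw && raw.layers) || [];
  return layers.map(layer => {
    const name = layer.name || 'unnamed';
    const nodeIds = (layer.nodeIds || layer.nodes || [])
      .map(n => (n && typeof n === 'object' ? n.id : n))  // objects -> plain id strings
      .map(id => (KNOWN_PREFIXES.some(p => id.startsWith(p)) ? id : `file:${id}`)) // raw paths -> file:<path>
      .filter(id => mergedNodeIds.has(id));               // drop unknown references
    const id = layer.id ||
      `layer:${name.toLowerCase().replace(/[^a-z0-9]+/g, '-').replace(/^-+|-+$/g, '')}`;
    return { id, name, description: layer.description || '', nodeIds };
  });
}
```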
For incremental updates: Always re-run architecture analysis on the full merged node set, since layer assignments may shift when files change.
Context for incremental updates: When re-running architecture analysis, also inject the previous layer definitions:
Previous layer definitions (for naming consistency):
[previous layers from existing graph]

Maintain the same layer names and IDs where possible. Only add/remove layers if the file structure has materially changed.
Dispatch a subagent using the prompt template at ./tour-builder-prompt.md. Read the template file and pass the full content as the subagent's prompt, appending the following additional context:
Additional context from main session:
Project README (first 3000 chars):
$README_CONTENT

Project entry point:

$ENTRY_POINT

Use the README to align the tour narrative with the project's own documentation. Start the tour from the entry point if one was detected. The tour should tell the same story the README tells, but through the lens of actual code structure.
Pass these parameters in the dispatch prompt:
Create a guided learning tour for this codebase. Project root:
$PROJECT_ROOT

Write output to: $PROJECT_ROOT/.understand-anything/intermediate/tour.json

Project: <projectName> — <projectDescription>
Languages: <languages>

Nodes (all file-level nodes — includes code files, config, document, service, pipeline, table, schema, resource, endpoint):

[list of {id, name, filePath, summary, type} for ALL file-level nodes — do NOT include function or class nodes]

Layers:

[list of {id, name, description} for each layer — omit nodeIds]

Edges (all types — includes imports, calls, configures, documents, deploys, triggers, etc.):
[list of ALL edges — include all edge types for complete graph topology analysis]
After the subagent completes, read $PROJECT_ROOT/.understand-anything/intermediate/tour.json and normalize it into a final tour array. Apply these steps in order:
- If the output is wrapped in `{ "steps": [...] }` instead of a plain array, extract the inner array. (The prompt requests a plain array, but LLMs may still produce an envelope.)
- If steps use `nodesToInspect` instead of `nodeIds`, rename it → `nodeIds`. If any step has `whyItMatters` instead of `description`, rename it → `description`.
- If `nodeIds` entries are raw file paths without a known prefix (`file:`, `config:`, `document:`, `service:`, `pipeline:`, `table:`, `schema:`, `resource:`, `endpoint:`), convert them to `file:<relative-path>`.
- Drop any `nodeIds` entries that do not exist in the merged node set.
- Sort steps by `order` before saving.

Each element of the final tour array MUST have this shape:
[
{
"order": 1,
"title": "Project Overview",
"description": "Start with the README to understand the project's purpose and architecture.",
"nodeIds": ["document:README.md"]
},
{
"order": 2,
"title": "Application Entry Point",
"description": "This step explains how the frontend boots and mounts.",
"nodeIds": ["file:src/main.tsx", "file:src/App.tsx"]
}
]
Required fields: order, title, description, nodeIds. Preserve optional languageLesson when present.
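The tour normalization can be sketched the same way as the layer pass (helper name is illustrative):

```javascript
const NODE_PREFIXES = ['file:', 'config:', 'document:', 'service:',
  'pipeline:', 'table:', 'schema:', 'resource:', 'endpoint:'];

// Normalize raw tour output into the required { order, title, description, nodeIds }
// shape, preserving the optional languageLesson field and sorting by order.
function normalizeTour(raw, mergedNodeIds) {
  const steps = Array.isArray(raw) ? raw : (raw && raw.steps) || [];
  return steps
    .map((step, i) => {
      const out = {
        order: step.order ?? i + 1,
        title: step.title,
        description: step.description ?? step.whyItMatters ?? '',
        nodeIds: (step.nodeIds || step.nodesToInspect || [])
          .map(id => (NODE_PREFIXES.some(p => id.startsWith(p)) ? id : `file:${id}`))
          .filter(id => mergedNodeIds.has(id)),
      };
      if (step.languageLesson) out.languageLesson = step.languageLesson; // keep optional field
      return out;
    })
    .sort((a, b) => a.order - b.order); // sort by order before saving
}
```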
Assemble the full KnowledgeGraph JSON object:
{
"version": "1.0.0",
"project": {
"name": "<projectName>",
"languages": ["<languages>"],
"frameworks": ["<frameworks>"],
"description": "<projectDescription>",
"analyzedAt": "<ISO 8601 timestamp>",
"gitCommitHash": "<commit hash from Phase 0>"
},
"nodes": [<all merged nodes from Phase 3>],
"edges": [<all merged edges from Phase 3>],
"layers": [<layers from Phase 4>],
"tour": [<steps from Phase 5>]
}
Before writing the assembled graph, validate that:
- `layers` is an array of objects with these required fields: `id`, `name`, `description`, `nodeIds`
- `tour` is an array of objects with these required fields: `order`, `title`, `description`, `nodeIds`
- `tour[*].languageLesson` is allowed as an optional string field
- Every `layers[*].nodeIds` entry exists in the merged node set
- Every `tour[*].nodeIds` entry exists in the merged node set

If validation fails, automatically normalize and rewrite the graph into this shape before saving. If the graph still fails final validation after the normalization pass, save it with warnings but mark dashboard auto-launch as skipped.
Write the assembled graph to $PROJECT_ROOT/.understand-anything/intermediate/assembled-graph.json.
Check $ARGUMENTS for --review flag. Then run the appropriate validation path:
Default path (no `--review`): inline deterministic validation

Write the following Node.js script to $PROJECT_ROOT/.understand-anything/tmp/ua-inline-validate.cjs:
#!/usr/bin/env node
const fs = require('fs');
const graphPath = process.argv[2];
const outputPath = process.argv[3];
try {
const graph = JSON.parse(fs.readFileSync(graphPath, 'utf8'));
const issues = [], warnings = [];
if (!Array.isArray(graph.nodes)) { issues.push('graph.nodes is missing or not an array'); graph.nodes = []; }
if (!Array.isArray(graph.edges)) { issues.push('graph.edges is missing or not an array'); graph.edges = []; }
const nodeIds = new Set();
const seen = new Map();
graph.nodes.forEach((n, i) => {
if (!n.id) { issues.push(`Node[${i}] missing id`); return; }
if (!n.type) issues.push(`Node[${i}] '${n.id}' missing type`);
if (!n.name) issues.push(`Node[${i}] '${n.id}' missing name`);
if (!n.summary) issues.push(`Node[${i}] '${n.id}' missing summary`);
if (!n.tags || !n.tags.length) issues.push(`Node[${i}] '${n.id}' missing tags`);
if (seen.has(n.id)) issues.push(`Duplicate node ID '${n.id}' at indices ${seen.get(n.id)} and ${i}`);
else seen.set(n.id, i);
nodeIds.add(n.id);
});
graph.edges.forEach((e, i) => {
if (!nodeIds.has(e.source)) issues.push(`Edge[${i}] source '${e.source}' not found`);
if (!nodeIds.has(e.target)) issues.push(`Edge[${i}] target '${e.target}' not found`);
});
const fileLevelTypes = new Set(['file', 'config', 'document', 'service', 'pipeline', 'table', 'schema', 'resource', 'endpoint']);
const fileNodes = graph.nodes.filter(n => fileLevelTypes.has(n.type)).map(n => n.id);
const assigned = new Map();
if (!Array.isArray(graph.layers)) { if (graph.layers) warnings.push('graph.layers is not an array'); graph.layers = []; }
if (!Array.isArray(graph.tour)) { if (graph.tour) warnings.push('graph.tour is not an array'); graph.tour = []; }
graph.layers.forEach(layer => {
(layer.nodeIds || []).forEach(id => {
if (!nodeIds.has(id)) issues.push(`Layer '${layer.id}' refs missing node '${id}'`);
if (assigned.has(id)) issues.push(`Node '${id}' appears in multiple layers`);
assigned.set(id, layer.id);
});
});
fileNodes.forEach(id => {
if (!assigned.has(id)) issues.push(`File node '${id}' not in any layer`);
});
graph.tour.forEach((step, i) => {
(step.nodeIds || []).forEach(id => {
if (!nodeIds.has(id)) issues.push(`Tour step[${i}] refs missing node '${id}'`);
});
});
const withEdges = new Set([
...graph.edges.map(e => e.source),
...graph.edges.map(e => e.target)
]);
graph.nodes.forEach(n => {
if (!withEdges.has(n.id)) warnings.push(`Node '${n.id}' has no edges (orphan)`);
});
const stats = {
totalNodes: graph.nodes.length,
totalEdges: graph.edges.length,
totalLayers: graph.layers.length,
tourSteps: graph.tour.length,
nodeTypes: graph.nodes.reduce((a, n) => { a[n.type] = (a[n.type]||0)+1; return a; }, {}),
edgeTypes: graph.edges.reduce((a, e) => { a[e.type] = (a[e.type]||0)+1; return a; }, {})
};
fs.writeFileSync(outputPath, JSON.stringify({ issues, warnings, stats }, null, 2));
process.exit(0);
} catch (err) { process.stderr.write(err.message + '\n'); process.exit(1); }
Execute it:
node $PROJECT_ROOT/.understand-anything/tmp/ua-inline-validate.cjs \
"$PROJECT_ROOT/.understand-anything/intermediate/assembled-graph.json" \
"$PROJECT_ROOT/.understand-anything/intermediate/review.json"
If the script exits non-zero, read stderr, fix the script, and retry once.
`--review` path: full LLM reviewer

If `--review` IS in $ARGUMENTS, dispatch the LLM graph-reviewer subagent as follows:
Dispatch a subagent using the prompt template at ./graph-reviewer-prompt.md. Read the template file and pass the full content as the subagent's prompt, appending the following additional context:
Additional context from main session:
Phase 1 scan results (file inventory):
[list of {path, sizeLines} from scan-result.json]

Phase warnings/errors accumulated during analysis:
- [list any batch failures, skipped files, or warnings from Phases 2-5]
Cross-validate: every file in the scan inventory should have a corresponding node in the graph (node types may vary: `file:`, `config:`, `document:`, `service:`, `pipeline:`, `table:`, `schema:`, `resource:`, `endpoint:`). Flag any missing files. Also flag any graph nodes whose `filePath` doesn't appear in the scan inventory.
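The cross-validation can be sketched as a set comparison (the function name and input shapes are illustrative, following the examples above):

```javascript
// Compare the Phase 1 scan inventory against the graph's file-level nodes.
function crossValidate(scanFiles, graphNodes) {
  const nodePaths = new Set(graphNodes.map(n => n.filePath).filter(Boolean));
  const scanPaths = new Set(scanFiles.map(f => f.path));
  return {
    missingFromGraph: [...scanPaths].filter(p => !nodePaths.has(p)), // scanned but never analyzed
    unknownInGraph: [...nodePaths].filter(p => !scanPaths.has(p)),   // node filePath not in inventory
  };
}
```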
Pass these parameters in the dispatch prompt:
Validate the knowledge graph at
$PROJECT_ROOT/.understand-anything/intermediate/assembled-graph.json. Project root: $PROJECT_ROOT

Read the file and validate it for completeness and correctness. Write output to: $PROJECT_ROOT/.understand-anything/intermediate/review.json
Read $PROJECT_ROOT/.understand-anything/intermediate/review.json.
If issues array is non-empty:
- Report the `issues` list to the user
- Apply safe auto-fixes where possible (missing `tags` -> `["untagged"]`, empty `summary` -> `"No summary available"`)

If the `issues` array is empty: Proceed to Phase 7.
Write the final knowledge graph to $PROJECT_ROOT/.understand-anything/knowledge-graph.json.
Write metadata to $PROJECT_ROOT/.understand-anything/meta.json:
{
"lastAnalyzedAt": "<ISO 8601 timestamp>",
"gitCommitHash": "<commit hash>",
"version": "1.0.0",
"analyzedFiles": <number of files analyzed>
}
2.5. Generate structural fingerprints for all analyzed files and save to $PROJECT_ROOT/.understand-anything/fingerprints.json. This creates the baseline for future automatic incremental updates.
Write and execute a Node.js script that uses the core fingerprint module (tree-sitter-based, not regex):
import { buildFingerprintStore, saveFingerprints } from '@understand-anything/core';

const store = await buildFingerprintStore('<PROJECT_ROOT>', sourceFilePaths);
saveFingerprints('<PROJECT_ROOT>', store);
Where sourceFilePaths is the list of all analyzed source file paths from Phase 1. This uses the same tree-sitter analysis pipeline as the main fingerprint engine, ensuring the baseline matches the comparison logic used during auto-updates.
Clean up intermediate files:
rm -rf $PROJECT_ROOT/.understand-anything/intermediate
rm -rf $PROJECT_ROOT/.understand-anything/tmp
Report a summary to the user containing:
$PROJECT_ROOT/.understand-anything/knowledge-graph.json
Only automatically launch the dashboard by invoking the /understand-dashboard skill if final graph validation passed after normalization/review fixes.
If final validation did not pass, report that the graph was saved with warnings and dashboard launch was skipped.
Accumulate warnings from all phases in a $PHASE_WARNINGS list. When using --review, pass this list to the graph-reviewer in Phase 6. On the default path, include accumulated warnings in the Phase 7 final report.

| Type | Description | ID Convention |
|---|---|---|
| `file` | Source code file | `file:<relative-path>` |
| `function` | Function or method | `function:<relative-path>:<name>` |
| `class` | Class, interface, or type | `class:<relative-path>:<name>` |
| `module` | Logical module or package | `module:<name>` |
| `concept` | Abstract concept or pattern | `concept:<name>` |
| `config` | Configuration file (YAML, JSON, TOML, env) | `config:<relative-path>` |
| `document` | Documentation file (Markdown, RST, TXT) | `document:<relative-path>` |
| `service` | Deployable service definition (Dockerfile, K8s) | `service:<relative-path>` |
| `table` | Database table or migration | `table:<relative-path>:<table-name>` |
| `endpoint` | API endpoint or route definition | `endpoint:<relative-path>:<endpoint-name>` |
| `pipeline` | CI/CD pipeline configuration | `pipeline:<relative-path>` |
| `schema` | Schema definition (GraphQL, Protobuf, Prisma) | `schema:<relative-path>` |
| `resource` | Infrastructure resource (Terraform, CloudFormation) | `resource:<relative-path>` |
| Category | Types |
|---|---|
| Structural | imports, exports, contains, inherits, implements |
| Behavioral | calls, subscribes, publishes, middleware |
| Data flow | reads_from, writes_to, transforms, validates |
| Dependencies | depends_on, tested_by, configures |
| Semantic | related, similar_to |
| Infrastructure | deploys, serves, provisions, triggers |
| Schema/Data | migrates, documents, routes, defines_schema |
| Edge Type | Weight |
|---|---|
| `contains` | 1.0 |
| `inherits`, `implements` | 0.9 |
| `calls`, `exports`, `defines_schema` | 0.8 |
| `imports`, `deploys`, `migrates` | 0.7 |
| `depends_on`, `configures`, `triggers` | 0.6 |
| `tested_by`, `documents`, `provisions`, `serves`, `routes` | 0.5 |
| All others | 0.5 (default) |
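The weight table can be encoded as a plain lookup with a default (a sketch; how the dashboard actually stores weights may differ):

```javascript
// Edge weights from the table above; unlisted types fall back to 0.5.
const EDGE_WEIGHTS = {
  contains: 1.0,
  inherits: 0.9, implements: 0.9,
  calls: 0.8, exports: 0.8, defines_schema: 0.8,
  imports: 0.7, deploys: 0.7, migrates: 0.7,
  depends_on: 0.6, configures: 0.6, triggers: 0.6,
  tested_by: 0.5, documents: 0.5, provisions: 0.5, serves: 0.5, routes: 0.5,
};

const edgeWeight = (type) => EDGE_WEIGHTS[type] ?? 0.5; // default for all other types
```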