Generate Academic Reading List From Syllabus

v20260612

syllabus

This skill transforms a course syllabus (PDF, DOCX, etc.) into a highly curated, supplementary academic reading list. It parses topics and learning outcomes, calibrates content difficulty based on the target audience (undergraduate, graduate, etc.), and searches academic databases (Consensus) for recent, peer-reviewed papers. The output is a professional .docx file containing summaries, links, and advanced discussion questions, making it invaluable for educators and students needing structured, up-to-date course resources.

Syllabus Academic Education Curriculum Research Content Generation Writing

Overview

Syllabus — Course Supplementary Reading List

Portability: Requires a Consensus MCP connection, Node.js with the docx package, and the ability to read the syllabus file. Works natively in the Claude Code CLI. Also supported in Claude.ai when Consensus MCP, Code Execution, and file upload are all enabled.

For an instructor or student with a course syllabus, produce a professional supplementary reading list as .docx containing recent peer-reviewed papers per course section.

Architectural Pattern: Bundled Script

This skill uses a bundled JavaScript helper script for DOCX generation rather than inlining the 300+ lines of layout code, for these reasons:

DOCX generation logic is both reusable and complex
Better separation of concerns: skill = orchestration + intelligence; script = mechanical document assembly
Token-efficient: the skill doesn't re-derive layout on each run
Easier to maintain and version

The bundled script is at scripts/generate_reading_list.js. The skill orchestrates the pipeline and invokes the script with JSON input.

Agent Integrity Rules (Research-Pack Convention)

Locked verbatim per PR #657 audit.

Only use what Consensus returns. Every paper title, author, journal, year, URL must come from this session's tool calls. Training-knowledge papers labeled [Not from Consensus — model knowledge] and excluded.
Confirm before moving on. A search isn't complete until response received and inspected.
Track three counts. Queries sent / papers received / papers cited. Surface in audit summary.
Surface gaps, don't fill them. Section with one paper + note about limited results > section padded with fabrications.
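The three-count rule can be pictured with a small tracker. This is a minimal sketch under stated assumptions, not the bundled citation_tracker.py: the `SearchAudit` class, its method names, and the example.org URLs are all hypothetical, and only the three output keys are borrowed from the auditLog schema used later in this document.

```python
from dataclasses import dataclass, field


@dataclass
class SearchAudit:
    """Hypothetical three-count tracker; not the bundled citation_tracker.py."""

    queries_sent: int = 0
    papers_received: int = 0
    seen_urls: set = field(default_factory=set)
    cited_urls: set = field(default_factory=set)

    def record_search(self, returned_urls):
        # Call this only after the response has been received and inspected.
        self.queries_sent += 1
        self.papers_received += len(returned_urls)
        self.seen_urls.update(returned_urls)

    def cite(self, url):
        # Refuse to cite anything that no search returned in this session.
        if url not in self.seen_urls:
            raise ValueError(f"not returned by any search this session: {url}")
        self.cited_urls.add(url)

    def summary(self):
        return {
            "totalQueriesSent": self.queries_sent,
            "totalPapersReceived": self.papers_received,
            "totalPapersCited": len(self.cited_urls),
        }


audit = SearchAudit()
audit.record_search(["https://example.org/p1", "https://example.org/p2"])  # placeholder URLs
audit.cite("https://example.org/p1")
print(audit.summary())
# {'totalQueriesSent': 1, 'totalPapersReceived': 2, 'totalPapersCited': 1}
```

Keeping the set of returned URLs next to the counters makes the first rule mechanically checkable: a citation that never appeared in a search response raises an error instead of slipping into the document.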

Phase 0: Grill-Me Intake (3 forcing questions)

Q1 (root) — Syllabus input

Provide the syllabus — pick one:

File path (PDF, DOCX, text) — I'll read it

Pasted content — paste below

Image of a printed syllabus — attach the image

Why I'm asking: Each format needs a different reader (PDF / DOCX parser / vision). Picking upfront prevents wasted attempts.

Forcing choice. Refuse to start without a syllabus.

Q2 (depends on Q1) — Course audience

Course audience — pick one:

Undergraduate (intro level)

Undergraduate (advanced / upper division)

Graduate (Masters / early PhD)

Graduate (doctoral / advanced)

Professional / continuing education

Mixed

Why I'm asking: Audience dictates summary jargon level and discussion-question complexity. Undergrad summaries define every term; grad summaries assume technical fluency. Discussion questions for undergrads test analysis; for grads test critique and extension.

See references/audience_calibration.md for the canon.

Q3 (depends on Q1) — Year range

Year range for papers — pick one:

Last 1 year (most recent only)

Last 2 years (default — recent + a year of context)

Last 5 years (broader, includes foundational recent work)

Why I'm asking: Reading lists go stale fast. 1-year filters keep things fresh; 5-year filters surface foundational recent work that's already standard. Drives the year_min parameter on every Consensus search.

Forcing choice with default (last 2 years).
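One way to turn the Q3 answer into a year_min value is sketched below. The choice keys ("1y", "2y", "5y") and the function name are hypothetical, and the cutoff arithmetic (current year minus the lookback) is an assumption; the source does not say whether the boundary year is inclusive.

```python
from datetime import date

# Assumed mapping from the Q3 answer to a lookback in years.
YEAR_RANGE_CHOICES = {"1y": 1, "2y": 2, "5y": 5}


def year_min_for(choice="2y", today=None):
    """Return the earliest publication year to request (assumed: year - lookback)."""
    today = today or date.today()
    return today.year - YEAR_RANGE_CHOICES[choice]


print(year_min_for("2y", date(2026, 6, 13)))  # 2024
```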

Stop condition: 3 questions max before Phase 1. The post-Phase-2 group-and-confirm checkpoint is its own grill-me moment.

Phase 1: Parse the Syllabus

Per Q1 input format:

PDF: use PDF reader; extract text
DOCX: use pandoc or DOCX parser; extract text
Text/pasted: read directly
Image: use vision; extract text

From extracted text:

Course title + instructor + term
Topic list (lecture titles, week-by-week breakdown, etc.)
Learning outcomes (if explicit; if missing, infer 3-5 from description)

Mark inferred learning outcomes as [inferred] in the DOCX.

Phase 2: Group Topics + Confirm with User

Group via topic_grouper.py

Use scripts/topic_grouper.py to cluster related topics into 6-12 sections. Heuristic: closely-related topics merge; cross-cutting topics get their own section.

Group-and-Confirm Checkpoint (Forcing Options)

After grouping, present:

Proposed sections: [list with item counts]. Pick one:

"Looks good — proceed with these sections"

"Merge sections [X] and [Y]"

"Split section [X] into two"

"Add a section for [topic]"

"Remove section [X]"

Why I'm asking: Grouping drives search allocation. Wrong grouping wastes the search budget on bad clusters. This is the last cheap moment to correct course before searches consume Consensus calls.

Refuse to start Phase 3 without explicit user choice.

Phase 3: Search Consensus per Section

Run searches sequentially, at most one query per second, with 1-2 queries per section.

Applied-Domain Weaving (Critical)

Don't just search the topic — search the topic + applied domain:

❌ Generic	✅ Applied-domain
"enzyme kinetics"	"enzyme kinetics food processing applications"
"machine learning"	"machine learning clinical decision support"
"thermodynamics"	"thermodynamics renewable energy systems"
"social network analysis"	"social network analysis public health interventions"

Boosts paper relevance dramatically. See references/applied_domain_weaving.md for the canon.

Per-Section Pattern

For each section:
  1. Construct query: "{topic-keywords} {applied-domain-angle}" + year_min from Q3
  2. Submit to Consensus (sequential, 1 q/sec gap enforced by citation_tracker)
  3. Receive results
  4. (If thin) submit one fallback query without applied-domain angle
  5. Select 1-3 papers per section (15-25 total across all sections)
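The per-section steps above can be sketched as a small loop. This is an illustration under assumptions: `search` stands in for whatever callable wraps the Consensus tool, the section dictionary keys are hypothetical, and the `min_results` threshold for "thin" results is a guess rather than a documented value.

```python
import time


def search_section(section, year_min, search, min_results=3,
                   gap_seconds=1.0, sleep=time.sleep):
    """Run 1-2 sequential queries for one section; return (papers, log)."""
    queries = [
        f"{section['keywords']} {section['applied_domain']}",  # applied-domain query
        section["keywords"],                                   # fallback without the angle
    ]
    papers, log = [], []
    for index, query in enumerate(queries):
        if index > 0:
            sleep(gap_seconds)  # keep at least one second between queries
        results = search(query, year_min)
        log.append({"query": query, "papersReturned": len(results)})
        for paper in results:
            if paper not in papers:  # skip duplicates across the two queries
                papers.append(paper)
        if len(papers) >= min_results:
            break  # enough results, so the fallback is not needed
    return papers, log
```

The caller is still responsible for the gap between sections; this sketch spaces only the two queries inside a single section. Passing `sleep` as a parameter keeps the function testable without real delays.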

Selection Priorities

Relevance — paper directly addresses the section topic
Reviews / meta-analyses — synthesize the field
Citation count — established work
Applied-domain connection — tied to the course's domain (e.g., engineering vs theory)
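The priority order above could be applied as a sort key. The source lists the priorities but does not say how they combine, so the strict lexicographic ordering here is an assumption, as are the field names (`relevance` as a coarse 0-2 score, `is_review`, `citations`, `applied_match`).

```python
def select_papers(candidates, limit=3):
    """Pick up to `limit` papers, comparing priorities in the listed order."""

    def priority(paper):
        # Tuples compare left to right, so relevance dominates, then review
        # status, then citation count, then the applied-domain connection.
        return (
            paper["relevance"],
            paper["is_review"],
            paper["citations"],
            paper["applied_match"],
        )

    return sorted(candidates, key=priority, reverse=True)[:limit]


candidates = [
    {"title": "A", "relevance": 2, "is_review": False, "citations": 10, "applied_match": True},
    {"title": "B", "relevance": 2, "is_review": True, "citations": 3, "applied_match": False},
    {"title": "C", "relevance": 1, "is_review": True, "citations": 500, "applied_match": True},
    {"title": "D", "relevance": 2, "is_review": False, "citations": 40, "applied_match": False},
]
print([p["title"] for p in select_papers(candidates)])  # ['B', 'D', 'A']
```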

Phase 4: Write Summaries + Discussion Questions

Summary writing

Per paper:

Plain language (calibrated to audience from Q2)
2-3 sentences
Define jargon if undergraduate audience; assume fluency if graduate

Quality bars

✅ Good summary	❌ Bad summary
"This review maps how different diets — Mediterranean, Nordic, vegetarian — reshape the types of fat molecules circulating in your blood, with implications for heart disease risk."	"This paper reviews lipidomic profiles across dietary interventions and their cardiometabolic implications."

Discussion question writing

Per paper:

Bloom higher-order (apply / analyze / evaluate)
Tied to a specific course learning outcome
Promotes discussion, not just recall

✅ Good question	❌ Bad question
"If dietary fat quality can reshape your lipoprotein lipidome, what does this suggest about the biochemical basis for dietary guidelines recommending unsaturated over saturated fats?"	"What did the authors find?" (Just recall)

Use scripts/discussion_question_validator.py to flag recall-only questions.
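A recall-only check might look roughly like the following. This is a hypothetical heuristic, not the logic of the bundled discussion_question_validator.py: both word lists are assumptions, and a question is flagged only when it opens like a recall prompt and contains no higher-order cue.

```python
import re

# Assumed openers that usually signal plain recall.
RECALL_OPENERS = re.compile(
    r"^\s*(what did|who|when|where|list|name|define|state)\b", re.IGNORECASE
)
# Assumed cues for apply / analyze / evaluate style questions.
HIGHER_ORDER_CUES = re.compile(
    r"\b(how would|why|evaluate|compare|critique|apply|design|justify|"
    r"what does this suggest|implications?|trade-?offs?)\b",
    re.IGNORECASE,
)


def is_recall_only(question):
    """Return True when the question looks like recall with no higher-order cue."""
    if HIGHER_ORDER_CUES.search(question):
        return False
    return bool(RECALL_OPENERS.match(question))


print(is_recall_only("What did the authors find?"))  # True
```

A keyword heuristic like this produces false negatives, so it is better treated as a prompt for review than as a pass/fail gate.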

Phase 5: Generate .docx via Bundled Script

node scripts/generate_reading_list.js \
  --input /tmp/syllabus_data.json \
  --output /path/to/reading_list_<course>_<date>.docx

The script accepts JSON with this schema:

{
  "courseTitle": "string",
  "courseSubtitle": "string",
  "generatedDate": "string",
  "yearRange": "string",
  "introText": "string",
  "learningOutcomes": ["string", ...],
  "sections": [
    {
      "heading": "string",
      "papers": [
        {
          "title": "string",
          "authors": "string",
          "journal": "string",
          "year": number,
          "url": "string",
          "summary": "string",
          "question": "string"
        }
      ]
    }
  ],
  "auditLog": {
    "totalQueriesSent": number,
    "totalPapersReceived": number,
    "totalPapersCited": number,
    "toolConstraints": "string",
    "searchDetails": [
      {
        "section": "string",
        "query": "string",
        "papersReturned": number,
        "papersSelected": number,
        "status": "string"
      }
    ],
    "failures": []
  }
}

The script handles:

docx package require with multi-location fallback
Title page, intro with Consensus link, learning outcomes box, numbered papers per section
ExternalHyperlink with full Consensus URLs (never truncated)
LevelFormat.BULLET for lists (not unicode bullets)
Footer with generation metadata
Input validation (missing fields → graceful error)

See references/bundled_script_pattern.md for why bundled vs inline.
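Before invoking the script, the JSON payload can be pre-checked against the schema shown in this phase. The sketch below assumes every listed field is required, which the source does not state; the function name is hypothetical and this is not the script's own validation.

```python
PAPER_FIELDS = {"title", "authors", "journal", "year", "url", "summary", "question"}
TOP_LEVEL_FIELDS = {
    "courseTitle", "courseSubtitle", "generatedDate", "yearRange",
    "introText", "learningOutcomes", "sections", "auditLog",
}


def find_missing_fields(payload):
    """Return human-readable problems; an empty list means nothing is missing."""
    problems = [
        f"missing top-level field: {name}"
        for name in sorted(TOP_LEVEL_FIELDS - set(payload))
    ]
    for s_idx, section in enumerate(payload.get("sections", [])):
        if "heading" not in section:
            problems.append(f"sections[{s_idx}] missing heading")
        for p_idx, paper in enumerate(section.get("papers", [])):
            for name in sorted(PAPER_FIELDS - set(paper)):
                problems.append(f"sections[{s_idx}].papers[{p_idx}] missing {name}")
    return problems


draft = {
    "courseTitle": "Food Biochemistry",
    "sections": [{"heading": "Enzymes", "papers": [{"title": "T", "year": 2025}]}],
}
for problem in find_missing_fields(draft):
    print(problem)
```

Catching a missing url or question field here is cheaper than discovering it after the DOCX has been generated.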

Phase 6: Deliver

File path
Audit summary in chat: "Saved {file}. {N} sections × {M} papers / {K} cited. Plan tier: {tier}."
Validate: check zip integrity with python3 -c "import zipfile,sys; bad=zipfile.ZipFile(sys.argv[1]).testzip(); sys.exit(f'corrupt member: {bad}' if bad else 0)" <docx> (exit status 0 and no output = intact), then confirm the required sections are present
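The validation step can also be written as a short Python function. `zipfile.ZipFile.testzip()` returns the name of the first corrupt member, or None, so the result has to be inspected explicitly. The function name, the list of required parts, and the "Audit Log" heading are assumptions for illustration. The heading check is a plain substring search on word/document.xml, so it can miss text that Word splits across runs or that contains XML-escaped characters.

```python
import io
import zipfile

# Parts assumed to be present in any valid .docx package.
REQUIRED_PARTS = ("[Content_Types].xml", "word/document.xml")


def check_docx(path_or_file, required_headings=()):
    """Return a list of problems; an empty list means the checks passed."""
    with zipfile.ZipFile(path_or_file) as archive:
        bad = archive.testzip()  # name of the first corrupt member, or None
        if bad is not None:
            return [f"corrupt member: {bad}"]
        names = set(archive.namelist())
        problems = [f"missing part: {p}" for p in REQUIRED_PARTS if p not in names]
        if "word/document.xml" in names:
            body = archive.read("word/document.xml").decode("utf-8")
            problems += [
                f"heading not found: {h}" for h in required_headings if h not in body
            ]
    return problems


# Demo on an in-memory archive standing in for a generated reading list.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("[Content_Types].xml", "<Types/>")
    demo.writestr("word/document.xml", "<w:document>Week 1: Enzyme Kinetics</w:document>")
print(check_docx(buffer, required_headings=["Week 1: Enzyme Kinetics", "Audit Log"]))
# ['heading not found: Audit Log']
```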

Tooling

Script	Role
`scripts/citation_tracker.py`	Tracks the Consensus three-count audit and enforces the 1-second gap between sequential queries; session state at `~/.syllabus_sessions/<session>.json`
`scripts/topic_grouper.py`	Heuristic 6-12 section grouping from extracted topics
`scripts/discussion_question_validator.py`	Bloom higher-order quality check; flags recall-only questions
`scripts/generate_reading_list.js`	Bundled Node.js DOCX generator — JSON input → .docx output

References

references/applied_domain_weaving.md — search-quality canon (7+ sources)
references/audience_calibration.md — undergrad vs grad summary jargon (7+ sources)
references/bundled_script_pattern.md — why bundle vs inline (7+ sources)

Error Handling

Failure	Behavior
Consensus rate-limit hit	Wait 3s, retry once, log
Search returns 0 for a section	Note section as "limited results — consider manual supplementation"
3 consecutive failures	Stop, alert user, share collected so far
`docx` package not installed	Script attempts `npm install`; if still failing, fail with clear message
DOCX validation fails	Unpack XML, log issue, ask user to retry
Syllabus format unsupported	List supported formats, ask user to convert
Learning outcomes can't be extracted	Infer 3-5 from course description; mark as inferred in document
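The rate-limit and consecutive-failure rows of the table can be sketched together. This is a minimal illustration with hypothetical names: `RateLimited` is an invented exception standing in for however the search wrapper signals a rate limit, and `search` is any callable that takes a query. The 1-second gap between queries is left out to keep the example short.

```python
import time


class RateLimited(Exception):
    """Hypothetical signal that the search backend rejected a call for rate reasons."""


def run_searches(queries, search, retry_wait=3.0,
                 max_consecutive_failures=3, sleep=time.sleep):
    """Return (collected, failures, stopped_early)."""
    collected, failures, streak = {}, [], 0
    for query in queries:
        try:
            try:
                collected[query] = search(query)
            except RateLimited:
                sleep(retry_wait)                 # wait 3s, then retry exactly once
                collected[query] = search(query)
            streak = 0                            # any success resets the streak
        except Exception as exc:
            failures.append({"query": query, "error": str(exc)})  # log the failure
            streak += 1
            if streak >= max_consecutive_failures:
                break                             # stop; caller alerts the user
    return collected, failures, streak >= max_consecutive_failures
```

When `stopped_early` is true, `collected` still holds everything gathered so far, which matches the "share collected so far" behavior in the table.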

Anti-Patterns To Reject

Parallelizing Consensus calls (rate limit)
Searching topics without applied-domain angle (poor relevance)
Padding sections with fabricated entries when Consensus returns thin
Generic discussion questions ("What did the authors find?")
Jargon-heavy summaries unsuitable for the course's audience level
Skipping the group-and-confirm step (wastes searches)
Truncating Consensus URLs in hyperlinks
Inlining 300 lines of docx-generation JavaScript in the skill body (use bundled script)

Version: 1.0.0
Source spec: megaprompts/10-syllabus-megaprompt.md
Build pattern: Path B (direct conversion). Bundled-JS-DOCX-generator variant.

Info

Category Artificial Intelligence

Name syllabus

Version v20260612

Size 28.35KB

Source alirezarezvani/claude-skills

Updated At 2026-06-13