
AI Product Canvas Framework

v20260618
ai-product-canvas
This canvas guides teams through defining AI-powered features. It covers problem definition, model selection (LLM, RAG, etc.), data governance, evaluation metrics, user experience design (including fallbacks), and responsible AI compliance. Use it to assess AI readiness and to avoid common AI product failures.

AI Product Canvas Skill

Define AI products with the same rigour as any product decision — but with additional layers for data, model, evaluation, and responsible AI. This canvas prevents the most common AI product failure: building a technically impressive feature that doesn't solve a real problem.

AI Product Anti-Patterns to Check First

Before building, flag if any of these apply:

  • ❌ "We should add AI to [existing feature]" — with no user problem defined
  • ❌ Accuracy target undefined before build begins
  • ❌ No plan for what happens when the model is wrong
  • ❌ User-facing AI output with no human review or fallback
  • ❌ Training data not audited for bias or quality
  • ❌ No evaluation metric — "we'll know it when we see it"

AI Product Canvas Output Format

AI Product Canvas — [Feature Name] — [Date]

PM Owner: [Name]
ML/AI Lead: [Name]
Status: Discovery / Design / Build / Evaluation / Live


1. Problem Definition

User problem being solved:

[What specific situation is the user in? What job are they trying to get done?]

Why AI?

[What makes this problem require AI vs a deterministic solution? If the answer is "because we can," stop here.]

Success for the user looks like:

[What outcome does the user experience when the AI feature is working well?]


2. AI Approach

Task type:

  • Classification
  • Generation (text, image, code)
  • Summarisation / extraction
  • Recommendation
  • Search / retrieval
  • Prediction / forecasting
  • Conversation / agent

Model approach:

  • LLM API (GPT-4, Claude, Gemini, etc.) — specify: [Model name + version]
  • Fine-tuned model on own data
  • Custom model trained from scratch
  • RAG (retrieval-augmented generation)
  • Embedding + vector search

Rationale for chosen approach: [Why this, not alternatives]


3. Data Requirements

Data Type | Source | Volume | Quality Status | Bias Risk
[Training data] | [Where it comes from] | [Volume] | [Audit status] | H/M/L
[Evaluation data] | [Where it comes from] | [Volume] | [Audit status] | H/M/L

Data gaps: [What's missing and plan to get it]
Privacy considerations: [Any PII in training or inference data]
Data ownership: [Do we own this data? Can we use it for training?]


4. Evaluation Framework

Primary metric: [The number that defines success, e.g. accuracy, F1, BLEU, user rating, task completion rate]
Minimum acceptable threshold: [Below X, the feature does not ship]
Human evaluation plan: [How will humans review model outputs? Sampling rate? Review panel?]

Evaluation Type | Method | Cadence | Owner
Offline (pre-launch) | [Test set, benchmark] | Pre-launch | ML Lead
Online (post-launch) | [A/B test, user feedback] | Weekly | PM + ML
Adversarial | [Red-team, edge cases] | Pre-launch | Safety reviewer
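
The "minimum acceptable threshold" can be enforced as an explicit gate in the offline evaluation step. The following is a minimal sketch, assuming a binary classification task scored with F1; the names `f1_score`, `ship_gate`, and `GateResult` are hypothetical and not part of the canvas, and the threshold is whatever number the team agreed before build began.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class GateResult:
    metric: float      # score measured on the offline test set
    threshold: float   # minimum acceptable value agreed before build
    ship: bool         # True only if metric >= threshold


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1 computed from scratch (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def ship_gate(y_true: Sequence[int], y_pred: Sequence[int],
              threshold: float) -> GateResult:
    """Compare the offline metric with the pre-agreed threshold."""
    score = f1_score(y_true, y_pred)
    return GateResult(metric=score, threshold=threshold, ship=score >= threshold)


if __name__ == "__main__":
    result = ship_gate([1, 1, 1, 0, 0], [1, 1, 0, 0, 1], threshold=0.8)
    print(result)
```

Returning the metric and threshold together with the decision keeps the gate auditable: the canvas can record the measured number alongside the pass/fail outcome.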

5. User Experience Design

How is AI output presented?

  • Direct output shown to user (high trust required)
  • AI-assisted with user confirmation
  • Suggestion user can accept/reject
  • Background action with audit log

Confidence and uncertainty handling:

  • What happens when confidence is low? [Show alternative, ask for clarification, fallback to manual]
  • How is uncertainty communicated to the user? [UI pattern]

Fallback plan:

  • If the model fails or returns an error: [Specific fallback behaviour]
  • If accuracy degrades below threshold: [Kill switch or graceful degradation plan]
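
The confidence handling and fallback choices above can be written down as a small routing rule. This is an illustrative sketch only: the `Action` values, the `route` function, and the cut-offs `show_at=0.9` and `confirm_at=0.6` are assumptions for the example, and it presumes the model exposes a confidence score between 0 and 1.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    SHOW = "show_output"            # direct output shown to user
    CONFIRM = "ask_user_to_confirm" # AI-assisted with user confirmation
    MANUAL = "fallback_to_manual"   # model failed or is too uncertain


def route(confidence: Optional[float],
          show_at: float = 0.9,
          confirm_at: float = 0.6) -> Action:
    """Pick a presentation mode from a model confidence score.

    `confidence` is None when the model call failed or timed out.
    The cut-offs are placeholders to be tuned per feature.
    """
    if confidence is None:
        return Action.MANUAL
    if confidence >= show_at:
        return Action.SHOW
    if confidence >= confirm_at:
        return Action.CONFIRM
    return Action.MANUAL


if __name__ == "__main__":
    for score in (0.95, 0.7, 0.2, None):
        print(score, route(score).value)
```

Treating a failed call the same way as a low-confidence result means the user always lands on a defined path, which is the point of specifying the fallback up front.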

6. Responsible AI Checklist

  • Bias audit completed on training data
  • Demographic fairness evaluated (does performance differ by user group?)
  • Hallucination / confabulation risk assessed and mitigated
  • User can see and correct AI output
  • Opt-out mechanism exists (can user disable the AI feature?)
  • Output provenance visible when relevant (does user know AI generated this?)
  • PII not used in ways user didn't consent to
  • Regulatory review completed (GDPR, AI Act, sector-specific)
  • Model cards / documentation completed
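
For the demographic fairness item, one simple starting point is to compare a metric across user groups and look at the largest gap. The sketch below assumes labelled evaluation records of the form `(group, y_true, y_pred)` and uses plain accuracy; the function names and the record shape are hypothetical, and what gap counts as acceptable is a judgment the team has to make.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Tuple

Record = Tuple[Hashable, int, int]  # (group, y_true, y_pred)


def accuracy_by_group(records: Iterable[Record]) -> Dict[Hashable, float]:
    """Accuracy computed separately for each user group."""
    correct: Dict[Hashable, int] = defaultdict(int)
    total: Dict[Hashable, int] = defaultdict(int)
    for group, y_true, y_pred in records:
        total[group] += 1
        correct[group] += int(y_true == y_pred)
    return {g: correct[g] / total[g] for g in total}


def max_gap(per_group: Dict[Hashable, float]) -> float:
    """Difference between the best- and worst-served group."""
    return max(per_group.values()) - min(per_group.values())


if __name__ == "__main__":
    records = [("a", 1, 1), ("a", 0, 0), ("b", 1, 0), ("b", 0, 0)]
    scores = accuracy_by_group(records)
    print(scores, max_gap(scores))
```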

7. Launch & Monitoring Plan

Rollout: [% of users, with staged expansion criteria]

Monitoring metrics:

  • Model performance: [Metric + alert threshold]
  • User engagement with AI output: [Acceptance rate, override rate, feedback score]
  • Error rate: [% of failed inferences]
  • Latency: [P95 target]
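
The monitoring metrics listed above can be derived from per-request logs. The following sketch assumes a hypothetical `InferenceLog` record with a latency, a failure flag, and an optional accept/reject signal from the user; it computes error rate, acceptance rate, and a nearest-rank P95 latency. Field names and the log shape are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class InferenceLog:
    latency_ms: float
    failed: bool               # inference raised an error or timed out
    accepted: Optional[bool]   # None when the user gave no signal


def p95(values: List[float]) -> float:
    """Nearest-rank 95th percentile (integer arithmetic for the rank)."""
    ordered = sorted(values)
    rank = max(1, (95 * len(ordered) + 99) // 100)  # ceil(0.95 * n)
    return ordered[rank - 1]


def summarize(logs: List[InferenceLog]) -> Dict[str, Optional[float]]:
    """Error rate over all requests; other metrics over served requests."""
    served = [log for log in logs if not log.failed]
    rated = [log for log in served if log.accepted is not None]
    return {
        "error_rate": sum(log.failed for log in logs) / len(logs) if logs else None,
        "acceptance_rate": (sum(bool(log.accepted) for log in rated) / len(rated)
                            if rated else None),
        "p95_latency_ms": p95([log.latency_ms for log in served]) if served else None,
    }


if __name__ == "__main__":
    logs = [InferenceLog(100.0 * (i + 1), False, i < 15) for i in range(20)]
    logs += [InferenceLog(0.0, True, None) for _ in range(5)]
    print(summarize(logs))
```

Each value returned here maps to one line of the monitoring plan, so an alert threshold can be attached to each key.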

Model refresh cadence: [How often is the model retrained or updated?]
Drift detection: [How will you know when model performance degrades in production?]
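
One possible way to answer the drift question is to compare the distribution of an input feature or model score in production against a baseline. The sketch below uses the Population Stability Index (PSI) over pre-binned proportions; this is one technique among several, not something the canvas prescribes, and the alert level of 0.2 is a commonly quoted rule of thumb that should be treated as an assumption.

```python
import math
from typing import Sequence


def psi(expected: Sequence[float], actual: Sequence[float],
        eps: float = 1e-6) -> float:
    """Population Stability Index between two binned distributions.

    `expected` and `actual` hold the proportion of observations per bin
    (baseline vs. production). `eps` guards against empty bins.
    """
    if len(expected) != len(actual):
        raise ValueError("both distributions must use the same bins")
    total = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, eps), max(a, eps)
        total += (a - e) * math.log(a / e)
    return total


DRIFT_ALERT = 0.2  # assumed alert level; tune per feature

if __name__ == "__main__":
    baseline = [0.5, 0.5]
    production = [0.8, 0.2]
    score = psi(baseline, production)
    print(round(score, 4), "drift" if score > DRIFT_ALERT else "stable")
```

Distribution checks like this only signal that inputs or scores have shifted; confirming that accuracy has actually degraded still needs labelled production samples.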


Guidelines

  • Never skip the "Why AI?" section — it's the most important question in AI product development
  • The fallback UX is not optional — what happens when AI fails defines your product's trustworthiness
  • Responsible AI checklist must be completed before launch, not after
  • Include latency in success metrics — a 5-second AI response is often worse than no AI at all
  • Recommend starting with a human-in-the-loop design and automating only when accuracy is proven

Required Inputs

Ask the user for these if not provided:

  • Feature or product description (what the AI is intended to do)
  • User problem (what problem the AI is solving for users)
  • Available data (what training/inference data exists)
  • ML/AI lead (who owns the technical implementation)

Anti-Patterns

  • Do not skip the "Why AI?" question — if the answer is "we want to use AI," stop and reframe around the user problem first
  • Do not launch with an undefined accuracy threshold — "good enough" is not a threshold; set a number before build begins
  • Do not design the UX to hide AI-generated output as if it were system truth — users need to know when AI is involved so they can override it
  • Do not defer the Responsible AI checklist to post-launch — bias and privacy issues are far harder to fix in production than in design
  • Do not treat model latency as a post-launch optimisation — a 6-second AI response that replaces a 1-second rule-based response is a regression, not a feature

Quality Checks

  • "Why AI?" is answered clearly (not "because we can")
  • Minimum acceptable accuracy threshold is defined before build begins
  • Fallback UX is specified for model failures or low-confidence outputs
  • Responsible AI checklist is completed (not deferred to post-launch)
  • Monitoring plan includes both model performance and user engagement metrics
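
The quality checks above can also be run mechanically against a filled-in canvas. This is a rough sketch that assumes the canvas is held as a dictionary; the keys (`why_ai`, `min_threshold`, `fallback_ux`, `responsible_ai_done`, `monitoring_metrics`) and the function `check_canvas` are hypothetical names chosen for the example.

```python
from typing import Any, Dict, List

# Answers that do not count as a reason to use AI.
NON_ANSWERS = {"because we can", "we want to use ai"}


def check_canvas(canvas: Dict[str, Any]) -> List[str]:
    """Return a list of failed quality checks (empty list means all pass)."""
    problems: List[str] = []

    why = str(canvas.get("why_ai") or "").strip().lower()
    if not why or why in NON_ANSWERS:
        problems.append('"Why AI?" is not answered clearly')

    threshold = canvas.get("min_threshold")
    is_number = isinstance(threshold, (int, float)) and not isinstance(threshold, bool)
    if not is_number:
        problems.append("Minimum acceptable threshold is not a number")

    if not canvas.get("fallback_ux"):
        problems.append("Fallback UX is not specified")

    if canvas.get("responsible_ai_done") is not True:
        problems.append("Responsible AI checklist is not completed")

    metrics = canvas.get("monitoring_metrics") or {}
    if not (metrics.get("model_performance") and metrics.get("user_engagement")):
        problems.append("Monitoring plan lacks model or engagement metrics")

    return problems


if __name__ == "__main__":
    for issue in check_canvas({"why_ai": "Because we can"}):
        print("FAIL:", issue)
```

A check like this only confirms that each field is filled in; whether the answers are good still needs human review.
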

Info
Name: ai-product-canvas
Version: v20260618
Size: 6.83KB
Updated At: 2026-06-19