Published: Nov 25, 2025
Product Discovery

Part 4 of Context Engineering with Claude Skills for Product Discovery

‍

37 patterns that determine how Claude processes your product discovery Skills

‍

These patterns address what happens when your Skill actually runs: how Claude handles tool failures, manages conditional logic, processes boundary cases, and determines when work is complete. This is what separates reliable Skills from inconsistent ones.

I’ve rigorously tested these patterns across complex product workflows: legal and compliance research, creation of test cards and hypotheses, product roadmaps, OKR monitoring, competitive analysis, user research synthesis, and more. Below, I distill the 37 patterns into 4 strategic pillars:

‍

1. Core Structure

2. Clarity & Organization

3. Execution & Control Flow

4. Completion & Quality

‍

29. Exact signal strings

Specify completion signals as exact strings Claude must output. Don't leave the format open to interpretation.

‍

❌ Ambiguous completion signal:

Signal when complete.
Let the orchestrator know you're done.
Indicate completion.

‍

βœ… Exact signal string:

Signal completion by outputting EXACTLY this string (no variations):

PM-COMPLIANCE ANALYSIS COMPLETE

CORRECT:
PM-COMPLIANCE ANALYSIS COMPLETE

INCORRECT (DO NOT use):
"Analysis complete"
"I'm done"
"Finished compliance analysis"
"PM-Compliance: Analysis Complete"
pm-compliance analysis complete (wrong case)

‍

Apply to all agent completion signals:

‍

βœ… Multiple signal variants:

<completion_signal_format>
Signal format depends on outcome:

Success completion:
PM-DISCOVERY SYNTHESIS COMPLETE - [Confluence URL] - [Jira Ticket Key]

Example:
PM-DISCOVERY SYNTHESIS COMPLETE - https://company.atlassian.net/wiki/spaces/PRODUCT/pages/123456 - ST87-142

Failure completion:
PM-DISCOVERY SYNTHESIS INCOMPLETE - [Reason]

Example:
PM-DISCOVERY SYNTHESIS INCOMPLETE - Confluence creation failed after retry

NEVER use any other signal format.
</completion_signal_format>
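Exact strings matter because the consuming side usually detects completion with a literal string comparison. As a minimal sketch, assuming a hypothetical Python orchestrator that inspects the sub-agent's final message (names and fields are illustrative, not part of any specific framework), the parsing might look like this:

SUCCESS_PREFIX = "PM-DISCOVERY SYNTHESIS COMPLETE - "
FAILURE_PREFIX = "PM-DISCOVERY SYNTHESIS INCOMPLETE - "

def parse_completion_signal(final_message: str) -> dict:
    # The signal is expected verbatim on the last line of the message.
    line = final_message.strip().splitlines()[-1]
    if line.startswith(SUCCESS_PREFIX):
        confluence_url, _, jira_key = line[len(SUCCESS_PREFIX):].partition(" - ")
        return {"status": "success", "confluence_url": confluence_url, "jira_key": jira_key}
    if line.startswith(FAILURE_PREFIX):
        return {"status": "failure", "reason": line[len(FAILURE_PREFIX):]}
    # "Analysis complete", wrong casing, extra words: all land here and stall the workflow.
    return {"status": "unrecognized", "raw": line}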

‍

30. Loop termination conditions

Specify exactly when to stop iterating or repeating. Include success exit, failure exit, and timeout exit conditions.

‍

❌ Unclear termination:

Keep searching until you find relevant documents.
Process recordings until done.
Retry until it works.

‍

βœ… Clear termination conditions:

<search_iteration_rules>
Loop termination conditions (stop when ANY condition met):

Success Exit:
- Found 10 relevant documents
- Action: Proceed to analysis phase

Failure Exit:
- Completed 5 search attempts with zero results
- Action: Proceed to edge case: "Insufficient Documentation"

Timeout Exit:
- Elapsed time exceeds 3 minutes (180 seconds)
- Action: Proceed with partial results (if any found)

NEVER search indefinitely.
ALWAYS exit the loop once one of the three conditions is met.
</search_iteration_rules>

‍

Apply to retry logic:

‍

βœ… Retry termination:

<retry_logic>
Tool call retry conditions:

Attempt 1: Initial call
IF success THEN proceed
IF failure THEN attempt 2

Attempt 2: First retry (after 5 second wait)
IF success THEN proceed
IF failure THEN attempt 3

Attempt 3: Final retry (after 10 second wait)
IF success THEN proceed
IF failure THEN stop retrying - proceed to error handling

Maximum retries: 2 (3 total attempts)
NEVER retry more than 2 times.
</retry_logic>
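For reference, the same retry schedule can be expressed as a small Python sketch (the tool_call callable is a hypothetical stand-in for whatever tool invocation your environment exposes):

import time

def call_with_retries(tool_call, max_retries=2, waits=(5, 10)):
    # Attempts: 1 initial call + up to max_retries retries (3 total).
    for attempt in range(1 + max_retries):
        try:
            return tool_call()              # success: proceed immediately
        except Exception:
            if attempt == max_retries:      # final retry also failed
                raise                       # stop retrying - proceed to error handling
            time.sleep(waits[attempt])      # 5s before attempt 2, 10s before attempt 3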

‍

31. Success criteria

Define what "successful completion" means with measurable criteria. Don't assume Claude knows when something is "good enough".

‍

❌ Vague success criteria:

Create a good quality report.
Make sure the analysis is complete.
Ensure the output is acceptable.

‍

βœ… Defined success criteria:

<analysis_success_criteria>
Analysis is successfully complete when ALL criteria met:
- Minimum 3 obligations identified (NOT 0, 1, or 2)
- Each obligation has citation in format: [Title] | [Section] | [Page] | [Jurisdiction] | [Date]
- Risk level assigned to each obligation (High, Medium, or Low - NOT unassigned)
- At least 1 EU regulation intersection documented (can be "None identified")
- Output includes all 4 required sections (A, B, C, D - NOT missing any)

IF any criterion fails THEN
- Analysis is incomplete
- Return to analysis phase
- DO NOT proceed to output generation

IF all criteria met THEN
- Analysis is successfully complete
- Proceed to output generation
</analysis_success_criteria>

‍

Apply to output quality:

‍

βœ… Output success criteria:

<output_success_criteria>
Confluence page creation is successful when ALL criteria met:
- Page created (have valid page ID or URL)
- All 5 sections present in page (Executive Summary, Compliance, UX, Integration, Next Steps)
- All citations preserved exactly from PM-Compliance (no reformatting)
- All user quotes include timestamps
- No real participant names (all pseudonymized)
- Page linked to Jira ticket

Verify ALL criteria before signaling completion.
</output_success_criteria>

‍

32. Boundary and edge cases

Specify behavior at limits: zero, maximum, empty, null values. Don't just handle the middle range.

‍

❌ Missing boundary handling:

Process the documents and extract insights.

‍

βœ… Specified boundary behaviors:

<document_processing_boundaries>
Handle all boundary conditions:

Zero Documents (empty result set):
IF 0 documents found THEN
- Output: "No documents found in search"
- Skip analysis phase entirely
- Proceed to edge case handling: "Insufficient Documentation"
- DO NOT attempt to analyze empty set

One Document (minimum):
IF exactly 1 document found THEN
- Process normally
- Note: "Analysis based on single document - limited coverage"
- Recommend: "Expand search criteria for comprehensive analysis"

Normal Range (2-10 documents):
IF 2-10 documents found THEN
- Process all documents normally
- No special handling needed

Maximum Exceeded (>10 documents):
IF more than 10 documents found THEN
- Process first 10 most recent documents only
- Note: "Additional [N] documents available - processed 10 most recent"
- Recommend: "Process remaining documents in follow-up analysis"

Empty Document (0 bytes):
IF any document has fileSize = 0 THEN
- Skip that document
- Log: "Empty document skipped: [filename]"
- Continue with remaining documents

Null Values:
IF document content is null THEN
- Skip that document
- Log: "Null content skipped: [filename]"
- Continue with remaining documents
</document_processing_boundaries>
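If you later move this logic into a pre-processing script, the boundary rules translate directly into code. A minimal Python sketch, assuming documents arrive as dicts with fileSize, content, and modifiedTime fields (the field names are illustrative):

def select_documents(documents, max_docs=10):
    # Returns (documents to process, notes to surface in the report).
    notes = []
    usable = [d for d in documents
              if d.get("fileSize", 0) > 0 and d.get("content") is not None]  # skip empty/null
    if not usable:
        return [], ["No documents found in search - proceed to 'Insufficient Documentation'"]
    if len(usable) == 1:
        notes.append("Analysis based on single document - limited coverage")
    if len(usable) > max_docs:
        extra = len(usable) - max_docs
        usable = sorted(usable, key=lambda d: d["modifiedTime"], reverse=True)[:max_docs]
        notes.append(f"Additional {extra} documents available - processed {max_docs} most recent")
    return usable, notes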

‍

33. Data format specifications

Specify the exact structure for inputs and outputs, including data types, required fields, and optional fields.

‍

❌ Vague format:

Return the analysis results.
Provide the findings in structured format.

‍

βœ… Exact format specification:

<output_format_specification>
Return analysis results in this EXACT structure:

{
  "obligations": [
    {
      "statement": string (required, non-empty),
      "citation": string (required, format: [Title] | [Section] | [Page] | [Jurisdiction] | [Date]),
      "risk_level": string (required, one of: "High", "Medium", "Low"),
      "product_implication": string (required, non-empty)
    }
  ],
  "eu_intersections": [
    {
      "gdpr_article": string (required, e.g., "Article 17"),
      "relationship": string (required, e.g., "Conflicts with", "Aligns with"),
      "description": string (required, non-empty)
    }
  ],
  "gaps": [
    {
      "description": string (required, non-empty),
      "recommendation": string (required, non-empty)
    }
  ],
  "conflicts": array (optional, empty array [] if no conflicts)
}

Data type requirements:
- ALL string fields must be non-empty (NOT null, NOT "")
- ALL arrays must be present (can be empty [])
- ALL required fields must be present (NOT omitted)
- obligations array must have at least 1 element
- eu_intersections array can be empty (state "None identified" in report)
- gaps array must be present even if empty

IF any requirement not met THEN format is invalid - regenerate.
</output_format_specification>
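A format this explicit is also easy to check mechanically. A minimal Python sketch of a validator for the structure above (illustrative only; inside a Skill, Claude performs the equivalent check in natural language):

def validate_analysis_output(result: dict) -> list:
    # Returns a list of violations; an empty list means the format is valid.
    errors = []
    obligations = result.get("obligations")
    if not isinstance(obligations, list) or len(obligations) < 1:
        errors.append("obligations must be a non-empty array")
    else:
        for i, ob in enumerate(obligations):
            for field in ("statement", "citation", "risk_level", "product_implication"):
                if not isinstance(ob.get(field), str) or not ob.get(field):
                    errors.append(f"obligations[{i}].{field} missing or empty")
            if ob.get("risk_level") not in ("High", "Medium", "Low"):
                errors.append(f"obligations[{i}].risk_level must be High, Medium, or Low")
    for key in ("eu_intersections", "gaps"):
        if not isinstance(result.get(key), list):
            errors.append(f"{key} must be present as an array (can be empty)")
    return errors  # non-empty list means the format is invalid - regenerate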

‍

34. Output completeness requirements

Define what must be present in the output for it to count as complete. Distinguish required from optional sections.

‍

❌ Unclear completeness:

Generate the report with findings.
Create comprehensive documentation.

‍

βœ… Clear completeness requirements:

<report_completeness_requirements>
Report is COMPLETE when ALL required sections present:

REQUIRED Sections (report is incomplete without ANY of these):
- Section A: Applicable Obligations
  - Must contain: At least 1 obligation with citation
  - Can NOT be empty
  
- Section B: EU Regulation Intersection
  - Must contain: At least one entry OR explicit statement "None identified"
  - Can NOT be omitted
  
- Section C: Technical Requirements
  - Must contain: Technical standards OR explicit statement "None identified"
  - Can NOT be omitted
  
- Section D: Risks & Gaps
  - Must contain: Risk assessment OR explicit statement "No gaps identified"
  - Can NOT be omitted

OPTIONAL Sections (report is still complete without these):
- Appendix: Additional Context (only include if relevant)
- References: Extended Reading (only include if beneficial)

Completeness Verification:
IF any REQUIRED section missing THEN
- Report is INCOMPLETE
- Regenerate with missing section
- DO NOT signal completion

IF optional sections missing THEN
- Report is still COMPLETE
- Proceed to completion
</report_completeness_requirements>

‍

35. Validation timing

Specify WHEN to validate: before the operation, during it, or after it. Don't assume Claude knows the right time to check.

‍

❌ Unclear validation timing:

Validate the inputs.
Check the file format.
Verify the citations.

‍

βœ… Clear validation timing:

<validation_phases>
Execute validations at specific times (NOT all at once):

PHASE 1: Pre-Retrieval Validation (BEFORE searching documents)
Validate:
- [ ] Search query is not empty
- [ ] Folder paths exist and are accessible
- [ ] User has permissions to access folders

IF Phase 1 fails THEN
- Stop - DO NOT attempt document retrieval
- Report error and request fixes

PHASE 2: Post-Retrieval Validation (AFTER search, BEFORE processing)
Validate:
- [ ] At least 1 document retrieved (not zero)
- [ ] File formats are supported (.pdf, .docx, .txt)
- [ ] Documents are not corrupted (can be opened)

IF Phase 2 fails THEN
- Proceed to edge case: "Insufficient Documentation"
- DO NOT attempt processing

PHASE 3: During Processing Validation (WHILE analyzing each document)
Validate for each document:
- [ ] Content is extractable (not encrypted)
- [ ] Language is supported (Dutch or English)
- [ ] Document contains relevant keywords

IF validation fails for a document THEN
- Skip that document
- Continue with remaining documents

PHASE 4: Post-Processing Validation (AFTER analysis, BEFORE output)
Validate:
- [ ] All obligations have citations
- [ ] All citations match required format
- [ ] Output includes all required sections
- [ ] No placeholder text remains (e.g., "[TODO]", "[TBD]")

IF Phase 4 fails THEN
- Return to processing phase
- DO NOT output incomplete results
</validation_phases>

‍

36. Temporal references (absolute vs relative)

Use absolute time references or explicit relative periods. Avoid vague relative terms like "recently", "soon", or "earlier today".

‍

❌ Vague temporal references:

Search for recent documents.
Update the file if it was modified recently.
Process new recordings.
Check for updates from earlier.

‍

βœ… Clear temporal references:

Search for documents modified within last 30 days:
- Filter: modifiedTime > (current_date - 30 days)
- Example: If today is 2025-11-22, search for documents modified after 2025-10-23

Update file IF last modified more than 7 days ago:
- Check: lastModified < (current_date - 7 days)
- IF condition true THEN update
- IF condition false THEN skip update

Process recordings uploaded after 2025-11-01:
- Filter: createdTime > 2025-11-01T00:00:00Z
- Use absolute date (NOT "recent" or "new")

Check for updates from last workflow run:
- Compare: last_run_timestamp (stored in /memory/last-run.txt)
- Filter: modifiedTime > last_run_timestamp
- Use explicit timestamp comparison

‍

Apply to all time-based operations:

‍

βœ… Explicit time windows:

<time_window_definitions>
Define explicit time windows (NOT vague terms):

"Recent" = Last 30 days
"This week" = Last 7 days from today
"This month" = Current calendar month (e.g., November 1-30, 2025)
"This quarter" = Current Q (Q4 2025 = Oct 1 - Dec 31, 2025)
"Recent changes" = Modified within last 14 days

ALWAYS use explicit date ranges in filters.
NEVER use ambiguous terms like "recently", "soon", "lately".
</time_window_definitions>
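As a minimal Python sketch, these windows can be resolved into absolute cutoffs before they reach any search filter (the window names and the modifiedTime field mirror the examples above and are illustrative):

from datetime import datetime, timedelta, timezone

WINDOWS_IN_DAYS = {"Recent": 30, "This week": 7, "Recent changes": 14}

def window_to_filter(window, now=None):
    # Resolve a named window into an explicit cutoff for a modifiedTime filter.
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=WINDOWS_IN_DAYS[window])   # fails loudly on undefined terms
    return f"modifiedTime > {cutoff.date().isoformat()}"

# Example: with today = 2025-11-22, window_to_filter("Recent") yields
# "modifiedTime > 2025-10-23", matching the 30-day example above.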

‍

37. Self-evaluation protocols (not automated evals)

Include quality criteria that Claude checks itself before signaling completion. This is different from automated evals, which belong in agent development frameworks.

‍

What PMs often confuse:

✦ Automated evals (external testing) - belong in LangGraph, LangChain, and other agent frameworks

✦ Self-evaluation protocols (internal quality gates) - belong IN Claude Skills

‍

Why Self-Evaluation matters for Claude

Without self-evaluation criteria:

✦ Claude might signal completion with incomplete work

✦ Quality varies unpredictably

✦ Users receive low-quality outputs

✦ No consistency across skill executions

‍

With self-evaluation criteria:

✦ Claude catches quality issues before completion

✦ Consistent minimum quality bar

✦ Fewer iterations needed from users

✦ Built-in quality control

‍

❌ No self-evaluation:

Generate the competitive analysis and signal completion.

‍

No quality checks before completion.

‍

❌ Confusing automated evals with self-evaluation:

Run pytest suite to validate output quality.
Use LangSmith to measure accuracy.

‍

These are external tools, NOT something Claude can run within a Skill.

‍

βœ… Example: PM OKR Monitoring Skill with self-evaluation (what Claude checks itself):

<okr_self_evaluation>
Before completing OKR creation, Claude verifies:

Objective Quality:
- [ ] Qualitative and inspiring (NOT just "increase revenue")
- [ ] Clear direction (team knows what success looks like)
- [ ] Aligned with company strategy (NOT random goals)

Key Result Quality:
- [ ] Quantitative with number and unit (e.g., "25% increase", "50 new customers")
- [ ] Measurable within timeframe (quarterly or annually)
- [ ] Ambitious but achievable (stretch goals, NOT easy wins)
- [ ] 3-5 key results per objective (NOT 1-2 or 7-8)

Success Criteria:
- [ ] Key results answer: "How do we know we achieved the objective?"
- [ ] All key results have baseline (current state) and target (desired state)
- [ ] Metrics are specific (NOT vague like "improve user satisfaction")

IF objective is vague (e.g., "Be better") THEN rewrite to be specific
IF key result lacks metric THEN add quantitative measure
IF <3 key results per objective THEN add more or justify why fewer

Quality gate: Must pass 10/12 criteria to signal completion.
</okr_self_evaluation>

‍

βœ… Example: PM Research Skill with self-evaluation:

<user_research_analysis_self_eval>
Before completing user research analysis, verify quality:

Research Coverage:
- [ ] Minimum 5 user interviews processed (or note if fewer with limitation)
- [ ] At least 3 user segments represented (or note if limited coverage)
- [ ] Interview dates span at least 2 weeks (avoid single-day bias)

Insight Quality:
- [ ] 5-7 key insights identified (NOT 1-2 or 20+)
- [ ] Each insight has direct user quote with timestamp
- [ ] Insights are specific (NOT generic like "users want better UX")
- [ ] Insights reveal user needs or pain points (NOT just feature requests)

Evidence Standards:
- [ ] All quotes use "Participant N" format (NO real names - GDPR)
- [ ] Timestamps included for every quote [MM:SS]
- [ ] User sentiment noted (frustration, satisfaction, confusion)
- [ ] Contradictory feedback acknowledged (NOT hidden)

Actionability:
- [ ] Product implications stated for each insight
- [ ] Implications specific (NOT vague like "improve product")
- [ ] Priority signals indicated (how many users mentioned, severity)

Hypothesis Validation (if applicable):
- [ ] Hypothesis clearly stated
- [ ] Evidence categorized (confirming, contradicting, neutral)
- [ ] Confidence level assigned with rationale
- [ ] Recommendation provided (Validated/Contradicted/Insufficient Data)

Quality Scoring:
IF 15-17/17 criteria met THEN high quality (proceed)
IF 12-14/17 criteria met THEN acceptable (note gaps, proceed)
IF <12/17 criteria met THEN insufficient quality (regenerate)

Common failures to check:
- Generic insights without user quotes
- Missing timestamps on quotes
- Real participant names not pseudonymized
- Vague product implications
- No hypothesis assessment when hypothesis was provided

Self-evaluation prevents delivering incomplete or low-quality analysis.
</user_research_analysis_self_eval>
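The quality scoring at the end maps cleanly to a small function if you ever move these gates into a test harness outside the Skill. A minimal Python sketch (illustrative; within the Skill itself, Claude applies this scoring in natural language):

def quality_gate(criteria_met, total=17):
    # criteria_met: dict of criterion name -> bool, mirroring the checklist above.
    passed = sum(bool(v) for v in criteria_met.values())
    if passed >= 15:
        return "high quality - proceed"
    if passed >= 12:
        return "acceptable - note gaps, proceed"
    return f"insufficient quality ({passed}/{total}) - regenerate"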

‍

The Key Distinction

Self-Evaluation (IN skill):

✦ "Did I do what I was supposed to do?"

✦ Quality gates before completion

✦ Deterministic checks Claude can perform

✦ Real-time during execution

‍

Automated Evals (OUTSIDE skill):

✦ "How well does this skill perform across many runs?"

✦ Statistical performance measurement

✦ Requires test datasets and metrics

✦ Retrospective after execution

‍

Both are valuable, but they serve different purposes and live in different places.

‍

Together, all 37 patterns form a practical contract between you and Claude: how you structure instructions, how execution behaves under real constraints, and how you verify that the result is actually good enough for product discovery.

‍

If you treat these patterns as a design system for Skills, not as isolated tricks, three things follow:

1. You can turn messy, multi-tool discovery flows into repeatable workflows instead of one-off experiments.

2. You can debug Skills systematically by tracing failures back to gaps in structure, control flow, or completion criteria.

3. You can gradually raise the quality bar by encoding every new failure mode into preconditions, edge-case handling, or self-evaluation.

‍

The next step is not to memorize all 37 patterns.

The next step is to operationalize them in your own environment:

✦ define a small set of standard Skill skeletons for your main discovery workflows,

✦ add explicit error handling and completion signals, and

✦ maintain a simple log of failure cases that you periodically fold back into the patterns.

‍

If Parts 1 and 2 helped Claude understand what you want, and Parts 3 and 4 help Claude execute reliably when things go wrong, then the real leverage comes from how consistently you apply these patterns over time.

‍

I hope this continuation gives you a clearer path to constructing Skills that operate reliably across discovery workflows! πŸ‘‹πŸΎ Gigi Biharie
