Published: Nov 25, 2025
Product Discovery

Part 1 of Context Engineering with Claude Skills for Product Discovery

37 patterns that determine how Claude processes your product discovery Skills

Much of the available guidance highlights code-oriented use cases, while the day-to-day work of product strategy, research, and analysis requires a different level of structural precision.

I’ve rigorously tested these patterns across complex product workflows: legal and compliance research, creation of test cards and hypotheses, product roadmaps, OKR monitoring, competitive analysis, user research synthesis, and more. Below, I distill the 37 patterns into 4 strategic pillars:


1. Core structure

2. Clarity & Organization

3. Execution & Control Flow

4. Completion & Quality


I found these patterns by testing what Claude actually does with Skills: what gets processed reliably, what gets ignored, how the context window affects attention quality, and how Claude parses semantic structure.

The prompt architecture has to match Claude's attention mechanisms and processing model because outputs vary between runs: the same Skill can produce different results each time, since Claude generates responses by predicting likely next tokens, not by running a repeatable script.
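You can observe this directly by sending the same prompt twice. A minimal sketch, assuming the Anthropic Python SDK and an illustrative model name:

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

prompt = "Summarize the top risk in our Q3 discovery notes in one sentence."

for run in range(2):
    response = client.messages.create(
        model="claude-sonnet-4-5",  # illustrative; substitute your model
        max_tokens=200,
        messages=[{"role": "user", "content": prompt}],
    )
    # The two runs will often differ in wording and emphasis, which is
    # why the Skill's structure has to carry the consistency.
    print(f"Run {run + 1}: {response.content[0].text}")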


1. Placement matters

Put critical instructions at the beginning of sections or in dedicated blocks.


❌ Buried instruction (easy to miss):

Synthesize interview findings into actionable product insights. 
Identify patterns across interviews, extract relevant quotes, 
and organize findings by theme. When working with participant data, 
remember to anonymize names and use Participant IDs instead of real names. 
Look for evidence of pain points, current workarounds, and feature requests...


✅ Early placement:

<critical_requirements>
- ALWAYS anonymize participant data using "Participant N" format
- NEVER include real names, email addresses, or company names
- VERIFY anonymization before generating any output
</critical_requirements>

Synthesize interview findings into actionable product insights. Identify patterns across 
interviews, extract relevant quotes, and organize findings by theme...
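Skills can also bundle helper scripts, so a VERIFY step like this can be backed by an actual check. A minimal sketch of what that could look like; the email regex and the roster of known names are illustrative assumptions, not part of the pattern:

import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
KNOWN_NAMES = {"Dana Whitfield", "Priya Raman"}  # hypothetical participant roster

def verify_anonymized(text: str) -> list[str]:
    """Return violations; an empty list means the check passed."""
    violations = [f"email found: {m.group()}" for m in EMAIL.finditer(text)]
    violations += [f"real name found: {n}" for n in KNOWN_NAMES if n in text]
    return violations

draft = 'Participant 3 said: "Filing took weeks." Contact: dana@example.com'
for problem in verify_anonymized(draft):
    print("Anonymization check failed:", problem)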


2. Repetition

Restate key requirements in multiple places throughout the skill.


❌ Single mention:

Key Results should be quantifiable with specific success metrics. 
Don't use vague language like "improve" or "increase" without numbers.


✅ Reinforced multiple times:

<key_result_requirements>
EVERY Key Result MUST include:
- Specific number or percentage
- Baseline value (starting point)
- Target value (success criteria)
- Measurement method

NEVER use: "improve", "increase", "enhance", "better" without numbers
</key_result_requirements>

...later in OKR generation section...

Key Result Format (use this exact structure):

KR1: [Metric name] from [baseline] to [target] by [date]
- Measurement: [how you'll track this]

CORRECT Example:
KR1: Increase weekly active users from 12,000 to 18,000 by Q2 end
- Measurement: Google Analytics, weekly cohort analysis

INCORRECT Example:
KR1: Improve user engagement (no numbers, vague metric)

...later in validation checklist...

Before finalizing OKRs, verify EVERY Key Result has:
- Specific baseline number (where you're starting)
- Specific target number (where you're going)
- Clear measurement method (how you'll track it)
- Time-bound target date
- NO vague language ("improve", "increase", "better" alone)

IF Key Result lacks any of these β†’ rewrite with specific numbers
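Because the Key Result format is exact, it is also machine-checkable. A hedged sketch of a validator for the structure above (the regex is an assumption about how strictly you want to match):

import re

KR_PATTERN = re.compile(r"^KR\d+: .+ from [\d,.%$]+ to [\d,.%$]+ by .+$")
VAGUE = ("improve", "increase", "enhance", "better")

def check_key_result(kr: str) -> list[str]:
    issues = []
    if not KR_PATTERN.match(kr):
        issues.append("missing 'from [baseline] to [target] by [date]' structure")
    body = kr.split(":", 1)[-1]  # ignore the "KR1" label when checking for numbers
    if any(w in body.lower() for w in VAGUE) and not re.search(r"\d", body):
        issues.append("vague verb with no numbers")
    return issues

print(check_key_result("KR1: Increase weekly active users from 12,000 to 18,000 by Q2 end"))  # []
print(check_key_result("KR1: Improve user engagement"))  # two issues flagged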


3. Specificity

Use exact formats, exact steps, and exact criteria (not vague language).


❌ Vague:

Analyze the user interviews and organize the findings into themes. 
Make sure to include supporting evidence and highlight the most important insights.


✅ Specific:

Structure interview synthesis using this exact format:

## Insight [N]: [One-sentence summary]

Evidence:
- "[Direct quote]" [Participant N, MM:SS]
- "[Direct quote]" [Participant N, MM:SS]
- Pattern: [X of Y participants mentioned this]

Product Implication: [Specific feature/change recommendation]

Priority: High | Medium | Low

NEVER use bullet points for synthesis prose; write in paragraphs.
ALWAYS include minimum 2 direct quotes per insight.
REQUIRED: Participant count showing pattern strength.


4. Examples over explanation

Show concrete examples of desired output rather than just describing it.


❌ Abstract instruction:

Generate hypotheses and test cards. Capture what we're trying to learn 
and how we'll know if we're right. Include the customer segment being targeted, 
the problem they have, and what success looks like. Make the hypothesis 
clear enough that the team knows what experiment to run and what data to collect. 
Consider desirability, viability, and feasibility depending on what's being tested.


✅ With examples:

Hypotheses must be specific, testable, and falsifiable. Use Strategyzer's Test Card format.

EXAMPLE (Desirability Test Card):
We believe that: Marketing managers at B2B SaaS companies (50-200 employees) experience frustration with manually creating social media content variations, spending 5+ hours per week on this task, and will adopt an AI tool that reduces this to <1 hour per week
To verify this, we will: Conduct 10 problem interviews with marketing managers at target companies
We will measure: Number of participants who (a) confirm spending 5+ hours/week on content variations AND (b) express intent to use automation weekly
We are right if: 8 of 10 participants confirm the pain point AND 6 of 10 say they would use this weekly

EXAMPLE (Viability Test Card):
We believe that: B2B SaaS companies with 50-200 employees have budget allocated for marketing tools and will pay $199/month per seat for AI content generation
To verify this, we will: Show pricing page to 15 qualified prospects and track conversion to paid trial
We will measure: Conversion rate from pricing page view to paid trial signup
We are right if: 15% of prospects convert to paid trial AND 60% of trials convert to paid customer within 30 days

EXAMPLE (Feasibility Test Card):
We believe that: We can build an AI content generation system that produces brand-compliant social posts in <30 seconds using GPT-4 API and maintains 95%+ brand voice accuracy
To verify this, we will: Build technical spike with 100 sample content requests and measure processing time and accuracy
We will measure: Processing time per request, brand voice accuracy rating from 5 marketing managers, API cost per request
We are right if: Process content in <30 seconds with 95%+ accuracy rating AND <$50/month in API costs per customer

DO NOT write hypotheses like this:
- "Users want better content tools" (not specific - which users? what problem?)
- "People will pay for this" (no price point, no segment, not testable)
- "We can build it" (no technical specifications, no success criteria)
- "Customers like AI features" (not falsifiable - how would you prove this wrong?)
- "We believe that users might be interested in our product" (weak language, no specifics)
- "Let's do some interviews and see what happens" (no hypothesis, no success criteria)


5. Clear section headers with XML tags

Use XML tags to create structured boundaries and section organization.


❌ Scattered throughout:

Search Archives Act requirements for the specific record type. Make sure to 
document the retention period and disposal authority for each finding. 
The risk level should be High, Medium, or Low depending on regulatory exposure. 
If a requirement is unclear, flag it for legal review. 

Always include the statutory reference with section numbers. Cross-reference 
with the General Disposal Authority if applicable. Format each finding with 
the record type, retention period, DA number, and risk level. Don't cite 
outdated disposal authorities. Don't skip the risk assessment. 
Don't provide general advice without specific DA references.


✅ Organized in XML sections:

<compliance_agent_workflow>
MANDATORY: The Compliance Agent must complete all steps below before returning results.

1. Search Archives Act requirements for the specific record type
2. Document findings in this exact format:
   
   Compliance Finding: [Record Type]
   - Retention Period: [X years]
   - Disposal Authority: [DA number]
   - Statutory Reference: [Section X.Y]
   - Risk Level: [High/Medium/Low]

3. If any requirement is unclear, flag it as REQUIRES LEGAL REVIEW
4. Cross-reference with GDA (General Disposal Authority) if applicable

EXAMPLE OUTPUT:
[paste actual example here]

Common Mistakes to Avoid:
- DO NOT cite outdated disposal authorities
- DO NOT skip the risk level assessment
- DO NOT provide general advice without specific DA references
</compliance_agent_workflow>


6. Strong directive language

Use MUST, NEVER, ALWAYS, REQUIRED (not β€œshould” or β€œtry to”).


❌ Weak:

As you help me structure this product discovery process, it would be good to 
document our key assumptions about the problem space. You should identify 
what we think we know versus what we actually need to validate with users. 
Try to flag any high-risk assumptions - things that could invalidate the 
entire opportunity if they turn out to be wrong.

It's helpful to think about validation methods for each assumption. Consider 
what kind of research or testing would give us confidence that an assumption 
is true or false. If there are major unknowns or gaps in our understanding, 
those should probably be highlighted so we can plan discovery work around them. 
We don't want to build something based on unvalidated assumptions, 
especially the risky ones.


✅ Strong:

You MUST document all assumptions before proceeding to design. 
NEVER move to execution without validating high-risk assumptions. 
ALWAYS flag unknowns explicitly.

REQUIRED assumption documentation:
- Assumption statement (clear, testable claim)
- Risk level (High/Medium/Low)
- Validation method (user interview, prototype test, data analysis, market research)
- Validation status (Validated/Invalidated/Not Yet Tested)

High-risk assumptions MUST be validated before development:
- User willingness to pay
- Technical feasibility of core features
- Regulatory compliance requirements
- Market demand estimates

NEVER proceed to build when:
- High-risk assumptions remain unvalidated
- User need is assumed but not confirmed
- Business model is untested
- Competitive response is unknown


7. Negative examples

Show what not to do alongside positive examples.

‍

❌ Only positive:

Extract key insights from the user interviews. For each insight, include what 
users said and what it means for the product. Make sure to cite specific 
participants and include direct quotes to support your findings. 
Organize insights by theme and indicate how many users mentioned each issue.

Example format:
Insight: Users struggle with the checkout flow
- "The checkout process was confusing" - Participant 3
- Several users mentioned this
- Recommendation: Simplify the checkout experience


✅ Positive AND negative:

Research insights must include direct quotes with timestamps and participant IDs.

CORRECT:
Insight: Users abandon checkout due to unexpected shipping costs revealed at final step

- Evidence: "I had everything ready to buy, then saw $45 shipping. Closed the tab immediately." [Participant 7, 08:34]
- Evidence: "Why don't you show me the real price upfront? This feels like a bait and switch." [Participant 12, 15:22]

- Impact: 4 of 8 participants abandoned at shipping reveal
- Product Implication: Display estimated shipping on product page, not just at checkout

INCORRECT (DO NOT use these):
- "Users want better pricing"
- "Checkout is confusing"
- "Shipping costs are too high" (no user evidence)
- "Everyone hates the current flow" (no specifics, vague quantifier)


8. Explicit constraints

State boundaries clearly (what’s out of scope, limits, edge cases).


❌ Vague boundaries:

Analyze the user research data and help me understand if we should build this 
compliance automation feature. Look through the interview transcripts and tell 
me what users said about their current process and pain points. 
See if there's evidence they would use our solution and what they might pay for it. 
Give me a recommendation on whether to proceed with development.


✅ Clear constraints:

<discovery_analysis_constraints>
Data sources: ONLY analyze:
- Interview transcripts in /research/transcripts/
- Hypothesis document from Stage 1
- Participant metadata (role, company size)

Validation thresholds:
- Validated: 6+ of 10 participants confirm
- Invalidated: 4+ of 10 participants contradict
- Minimum: 8 participants required (NOT fewer)

Evidence format:
- Direct quotes with [Participant N, MM:SS]
- Pattern strength: [X of Y participants]
- NEVER use real names or company names

Out of scope:
- DO NOT infer pain points not mentioned by participants
- DO NOT recommend features beyond validated assumptions
- DO NOT cite external research or general best practices
- DO NOT make claims without direct quote evidence

Edge cases:
- IF <8 transcripts THEN flag insufficient data + do not validate
- IF mixed evidence THEN report as inconclusive + recommend more research
</discovery_analysis_constraints>
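Thresholds this explicit can also be encoded outside the prompt. A minimal sketch, treating the "of 10" thresholds as absolute counts (an assumption; adjust if you intend proportions):

def classify_assumption(confirm: int, contradict: int, participants: int) -> str:
    if participants < 8:
        return "INSUFFICIENT DATA - do not validate"
    if confirm >= 6:
        return "Validated"
    if contradict >= 4:
        return "Invalidated"
    return "Inconclusive - recommend more research"

print(classify_assumption(confirm=7, contradict=1, participants=10))  # Validated
print(classify_assumption(confirm=5, contradict=3, participants=10))  # Inconclusive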


❌ Vague boundaries:

Analyze how our competitors monetize ancillaries in the flight and hotel booking flow. 
Look at what Booking.com, Expedia, and other OTAs are doing with baggage fees, 
seat selection, insurance, and hotel add-ons. Use web search to find their current 
offerings and pricing. Check recent reviews to see what travelers think about 
ancillary pricing. Give me insights on what's working, revenue models, 
and where we should focus our ancillary strategy. Include any data you can find on 
conversion rates or attach rates.


✅ Clear constraints:

<ancillary_analysis_constraints>
Competitors: ONLY analyze these 3 OTAs:
- Booking.com (hotel-first, flight ancillaries added 2019)
- Expedia (integrated flight+hotel, loyalty-driven)
- Kayak (metasearch, ancillary referrals only)

Ancillary categories: ONLY evaluate:
- Baggage fees (how displayed, when disclosed, upsell timing)
- Seat selection (basic vs premium, family seating)
- Travel insurance (provider partnerships, opt-in vs opt-out)
- Hotel add-ons (breakfast, parking, late checkout)

Data sources: ONLY use:
- Use web_search for: public product announcements, pricing pages
- Use web_fetch for: specific competitor booking flow pages
- Review sites: TrustPilot reviews mentioning "fees" or "insurance" (last 6 months)
- DO NOT use: internal company data, proprietary research, paywalled reports

Analysis constraints:
- Revenue data: ONLY cite if from public earnings calls or press releases
etc.

Out of scope:
- DO NOT analyze loyalty point redemption mechanics
etc.

Edge cases:
- IF ancillary not offered THEN note "Not available on this platform"
- IF pricing varies by route/region THEN specify "US domestic baseline"
- IF feature requires login THEN note "Paywall/login required - not analyzed"
- IF recent UI change detected THEN flag "Recently updated - verify current state"

Output format:
- Competitor: [Name]
- Ancillary: [Category]
- Implementation: [How it's presented in booking flow]
- Pricing: [Specific amounts or "Variable" with range if known]
- UX Pattern: [Opt-in vs opt-out, timing in flow]
- Evidence: [URL, review quote, or "Not publicly disclosed"]
</ancillary_analysis_constraints>

(Note: I've reduced the length with 'etc.' in this example)


9. Verification steps

Include explicit checklists or validation criteria.


❌ No validation:

This skill conducts social listening research to extract Voice of Customer insights 
from Reddit communities, supporting product discovery for small claims court in 
Ontario, Canada.

Search Reddit communities including r/legaladvicecanada, r/ontario, and 
r/PersonalFinanceCanada using web_search. Look for posts where people discuss their 
small claims court experiences, particularly around filing confusion, court preparation, 
documentation requirements, and self-representation challenges. 
Use web_fetch to pull full threads from relevant results.

For each pain point you identify, include direct quotes from the posts along with the 
subreddit source and post date. Document any workarounds users have created and flag 
any signals that suggest willingness to pay for solutions. Make sure to focus on 
Ontario-specific posts and exclude discussions about other provinces.

Create the synthesis document in Confluence under the DISCOVERY space using 
confluence_create_page.


✅ With verification checklist:

Conduct social listening research to extract Voice of Customer insights from Reddit communities,
supporting product discovery for small claims court in Ontario, Canada.

<data_sources>
Use web_search to find discussions in:
- r/legaladvicecanada
- r/ontario
- r/PersonalFinanceCanada

Search terms: "small claims court Ontario", "filing small claims", "self-represented litigant"

Use web_fetch to retrieve full thread content from relevant results.
</data_sources>

<extraction_requirements>
For each pain point identified:
- Include direct quotes from posts
- Note subreddit source and post date
- Document any workarounds users mention
- Flag signals indicating willingness to pay for solutions
</extraction_requirements>

<verification_checklist>
Before finalizing synthesis, verify ALL items:

Data Quality:
- [ ] Minimum 15 unique posts analyzed
- [ ] Each pain point supported by 3+ direct quotes
- [ ] All quotes include subreddit source and post date
- [ ] All posts from last 18 months

Jurisdiction Accuracy:
- [ ] Every quote confirmed relevant to Ontario Small Claims Court
- [ ] Posts referencing other provinces excluded
- [ ] Ontario-specific terminology present (e.g., "Deputy Judge", "$35,000 limit")

Pattern Strength:
- [ ] Each pain point includes frequency count
- [ ] Workarounds documented with supporting quotes
- [ ] Willingness-to-pay signals noted with evidence (or "none identified")

IF <15 posts found THEN flag "Insufficient data - expand search terms"
IF jurisdiction ambiguous in post THEN exclude from analysis
IF pain point has <3 quotes THEN label "Emerging signal - requires validation"
</verification_checklist>

Create synthesis using confluence_create_page with space_key: "DISCOVERY" 
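Parts of this checklist are mechanical enough to script. A hedged sketch of the countable checks; the shape of the posts and pain_points inputs is assumed:

from datetime import date, timedelta

def verify_counts(posts: list[dict], pain_points: list[dict]) -> list[str]:
    flags = []
    if len(posts) < 15:
        flags.append("Insufficient data - expand search terms")
    cutoff = date.today() - timedelta(days=18 * 30)  # roughly 18 months
    stale = [p for p in posts if p["date"] < cutoff]
    if stale:
        flags.append(f"{len(stale)} post(s) older than 18 months - exclude")
    for point in pain_points:
        if len(point["quotes"]) < 3:
            flags.append(f"{point['title']}: Emerging signal - requires validation")
    return flags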


10. Conditional logic structure

Use clear if/then statements for different scenarios.


❌ Ambiguous flow:

Analyze user research to validate product assumptions.

Start by searching Google Drive for the hypothesis document in /discovery/hypothesis/ using
google_drive_search. If a hypothesis document exists, map findings back to specific assumptions. 
If there's no hypothesis doc, treat this as exploratory research and focus on emergent themes instead.

For hypothesis validation, look for interview recordings first since they provide the strongest evidence.
Transcribe them, extract insights, and create a validation table showing which assumptions held up. 
If you only find interview notes instead of recordings, work with those but flag that the evidence 
quality is lower without direct quotes.

For exploratory mode, extract themes from whatever data exists and identify opportunities for follow-up
research. Don't force findings into a hypothesis structure that doesn't exist.

Document the analysis in Confluence using confluence_create_page in the PRODUCT space.


✅ Clear if/then logic:

<discovery_research_workflow>
Determine research mode using this exact flow:

IF hypothesis document exists in /discovery/hypothesis/ THEN
  - Set mode: HYPOTHESIS_VALIDATION
  - Load hypothesis from document
  - Extract testable assumptions
  - Proceed to Step 2
  
ELSE IF user explicitly requests exploratory research THEN
  - Set mode: EXPLORATORY
  - Skip hypothesis validation section
  - Proceed to Step 2
  
ELSE
  - Return error: "Cannot determine research mode. Provide hypothesis document or specify exploratory research."
  - STOP workflow

Step 2: Data Processing

IF mode = HYPOTHESIS_VALIDATION THEN
  IF recordings found in Google Drive THEN
    - Use web_search to find transcription service documentation
    - Extract insights mapped to hypothesis assumptions
    - Generate validation table: [Assumption | Validated/Invalidated | Evidence]
  ELSE IF interview notes exist THEN
    - Extract insights from notes
    - Flag: "Limited evidence - recordings preferred for validation rigor"
  ELSE
    - Return: "Insufficient data for hypothesis validation"
    - STOP workflow

IF mode = EXPLORATORY THEN
  - Extract themes from available data
  - Generate insights without hypothesis mapping
  - Flag opportunities for follow-up research

NEVER mix hypothesis validation with exploratory synthesis.
ALWAYS document which mode was used in Confluence output.
</discovery_research_workflow>
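If the Skill bundles a script, the same gate can be made fully deterministic. A minimal sketch; the directory path and the exploratory flag are illustrative:

from pathlib import Path

def select_mode(hypothesis_dir: str, exploratory_requested: bool) -> str:
    # Any file in the hypothesis directory switches us into validation mode.
    if any(Path(hypothesis_dir).glob("*")):
        return "HYPOTHESIS_VALIDATION"
    if exploratory_requested:
        return "EXPLORATORY"
    raise ValueError(
        "Cannot determine research mode. "
        "Provide hypothesis document or specify exploratory research."
    )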


11. Markdown headers for structural hierarchy

Markdown headers (# ## ###) help Claude’s processing by providing semantic structure markers that indicate document hierarchy and section boundaries.


❌ Do NOT use flat structure:

Main Agent Role

Activation Trigger

Specific Conditions

Research Process

Step 1: Preparation

Step 2: Execution


✅ Use markdown headers:

# Main Agent Role

## Activation Trigger

### Specific Conditions

## Research Process

### Step 1: Preparation

### Step 2: Execution


Header levels that work best:

# (H1): Main role or agent identity (use once at top)

## (H2): Major sections (Activation Trigger, Research Process, Edge Cases)

### (H3): Subsections within major sections (Step 1, Step 2, Case 1, Case 2)

Do not go deeper than ### (H3). Flatter is better for processing.
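One reason this works: headers give the document an unambiguous, machine-parseable hierarchy. A rough sketch of the outline a simple parser can recover from a Skill file:

import re

def outline(skill_md: str) -> list[tuple[int, str]]:
    """Extract (level, title) pairs from H1-H3 headers."""
    return [
        (len(m.group(1)), m.group(2).strip())
        for m in re.finditer(r"^(#{1,3}) (.+)$", skill_md, re.MULTILINE)
    ]

doc = "# Main Agent Role\n## Research Process\n### Step 1: Preparation"
print(outline(doc))
# [(1, 'Main Agent Role'), (2, 'Research Process'), (3, 'Step 1: Preparation')]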


12. Word choice over visual formatting

The semantic meaning of words matters; visual formatting (CAPS, bold) does not.

These are functionally equivalent for Claude’s processing:

You MUST complete all steps.
You must complete all steps.


Choose directive words by semantic strength:

✦ STRONGEST: MUST, NEVER, REQUIRED, ALWAYS

✦ STRONG: Should, Need to, Have to

✦ WEAK: Try to, Consider, Might want to


What does matter for Claude:

✦ Directive words: MUST, NEVER, ALWAYS, REQUIRED

✦ Prohibition words: DO NOT, NEVER, NOT ALLOWED

✦ Conditional words: IF, THEN, ELSE

✦ Verification words: VERIFY, CHECK, ENSURE


What does not matter for Claude:

✦ Whether directive words are in CAPS or lowercase

✦ Bold formatting

✦ Italic formatting

✦ Font size


❌ Do not use weak language:

You should probably try to validate inputs if possible.
(Weak regardless of formatting)


✅ Use strong directive words:

You MUST validate all inputs before processing.
You must validate all inputs before processing.
(Both work equally well for Claude)


The word choice itself determines instruction strength, not the visual formatting.

