How to write a SkillShelf skill. This document covers SkillShelf-specific conventions and quality standards. For the underlying SKILL.md file format (frontmatter fields, directory structure, validation rules), see skillmd-specs.md.
Every skill must conform to the SKILL.md specification documented in skillmd-specs.md. Key requirements:
- Valid YAML frontmatter with `name` and `description`. `name` matches the parent directory name. `description` is written in third person ("Produces a positioning brief..." not "I help you write..."). `description` is under 155 characters. This is used as the meta description on skillshelf.ai.
- Skill title (the h1 heading in SKILL.md) starts with a verb and describes the outcome. "Document Your Brand Voice" not "Brand Voice Extractor." "Write a Positioning Brief" not "Write a Brand and Product Positioning Overview." Keep it short and something your target user would click on.
- Body under 500 lines (guideline, not hard limit).
- Forward slashes in all file paths.
- File references one level deep from SKILL.md.
SkillShelf skills should also include these metadata fields in frontmatter:
```yaml
metadata:
  category: product-content # one of the 10 SkillShelf categories
  level: beginner # beginner, intermediate, or advanced
  platforms: platform-agnostic # or specific platforms (shopify, klaviyo, etc.)
  primitive: "true" # only if this is a foundational primitive skill
```

Level definitions are in project-scaffold.md under "Skill levels." The short version: beginner means the user just talks and gets output; intermediate means they bring prepared input; advanced means they work outside the chat window.
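Put together, a complete frontmatter block might look like the sketch below. The `name` and `description` values are illustrative, not from a real skill; the required fields and character limit are the ones described above.

```yaml
---
name: positioning-brief
description: Produces a positioning brief from a brand's existing website copy and product pages.
metadata:
  category: product-content
  level: beginner
  platforms: platform-agnostic
---
```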
SkillShelf skills do not exist in isolation. Before writing a skill, understand what it consumes and what consumes it.
Primitives are foundational skills that produce reusable reference documents. The current primitives are documented in launch-skill-ideas/primitives_claude.md. If a skill would benefit from a primitive's output (brand voice guide, positioning brief, customer persona, benefit map), note that in the skill's instructions. Tell the user what to upload alongside their request and explain what improves when they do.
Do not make primitives a hard prerequisite. Every skill must produce useful output without them. But the output quality difference between "with primitive" and "without" should be real and obvious.
If a skill produces a document that other skills reference, structure the output with consistent, descriptive headings. Downstream skills will point users at specific sections by name (e.g., "read the Key Differentiators section of the positioning brief"). Heading names should be stable because changing them breaks references in other skills.
If your skill produces structured output that other skills consume, include a glossary at references/glossary.md. The glossary tells downstream skills how to interpret your output: what each section contains, what values to expect, and how to handle missing or unexpected input. See glossary-specification.md for the template and conventions.
The producing skill may also reference its own glossary during generation to ensure consistent field definitions and vocabulary.
When a skill would benefit from another skill's output, reference it by its natural-language name, not its directory name. "If you have a brand voice profile from the Brand Voice Extractor skill, upload it alongside your request," not "see brand-voice-extractor."
Do not force users through a rigid Q&A when they might already have the information in an existing document. The default input pattern for skills that need business context should be:
- Tell the user what kinds of existing content are useful (About Us pages, homepage copy, product pages, exported CSVs, email campaigns, etc.).
- Offer guided prompts as a fallback for users who don't have existing content.
- Identify gaps in whatever they provide and ask targeted follow-up questions only for what's missing.
This reduces friction for users who have material and provides structure for users who don't.
When a skill accepts product or business data, design it to work with the exports users already have. Do not invent custom input formats when standard ones exist.
Common exports skills should expect and handle:
| Source | Format | Key fields |
|---|---|---|
| Shopify product export | CSV | Title, Body (HTML), Vendor, Type, Tags, Variant Price, Variant SKU, Image Src |
| WooCommerce product export | CSV | Name, Description, Regular price, Categories, Tags, Images |
| Amazon listing report | TSV/CSV | item-name, product-description, bullet-point1-5, generic-keyword |
| Google Merchant Center | CSV/TSV | title, description, price, brand, google_product_category, custom_label_0-4 |
| Klaviyo campaign export | CSV | Campaign Name, Subject, Send Date, Recipients, Open Rate, Click Rate, Revenue |
| Yotpo/Stamped/Judge.me reviews | CSV | Product, Rating, Title, Body, Date, Reviewer |
| Google Analytics (GA4) | CSV | Various; usually includes sessions, conversions, revenue by channel/page |
When a skill accepts CSV input, be explicit about which columns it needs and handle common variations in column naming (e.g., "Body (HTML)" vs. "Description" vs. "product_description" for the same concept).
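One way to handle those naming variations is a small alias map that rekeys incoming columns to canonical names before the skill reasons about the data. This is a sketch, not part of the SkillShelf spec; the alias sets and canonical field names here are illustrative, drawn from the export table above.

```python
# Sketch: normalize CSV column names to canonical fields.
# The alias map is illustrative; extend it per skill.
import csv
import io

ALIASES = {
    "title": {"title", "name", "item-name"},
    "description": {"description", "body (html)", "product_description"},
    "price": {"price", "variant price", "regular price"},
}

def canonical_field(header):
    """Map a raw CSV header to a canonical field name, or None if unknown."""
    cleaned = header.strip().lower()
    for canonical, variants in ALIASES.items():
        if cleaned in variants:
            return canonical
    return None

def normalize_rows(csv_text):
    """Read CSV text and rekey each row to canonical field names,
    dropping columns the skill does not recognize."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        normalized = {}
        for key, value in row.items():
            field = canonical_field(key or "")
            if field:
                normalized[field] = value
        rows.append(normalized)
    return rows
```

With this in place, a Shopify export ("Body (HTML)") and a WooCommerce export ("Description") both arrive at the skill as `description`.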
Users frequently have messy, incomplete, or inconsistent data. Skills should handle imperfect input gracefully: produce the best output possible from what's provided, note what's missing, and suggest what would improve the result. Never refuse to produce output because the input isn't ideal.
Every skill output should use consistent Markdown headings that downstream skills and human readers can reference by name. If a positioning brief has a "Why they choose us" section, that heading should be stable and descriptive enough that another skill can say "reference the Why They Choose Us section."
Output should be ready to paste into a CMS, upload to a platform, or hand to a team member. If the output requires further editing or reformatting before it's useful, the skill isn't done.
For CSV-producing skills, output should be importable into the target platform without manual column renaming or reformatting.
When a skill works from limited input, the output should say so. Use a "Confidence notes" section to flag which parts of the output are based on limited evidence and what additional input would strengthen them. Do not pad thin input into confident-sounding output.
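A "Confidence notes" section in skill output might look like the following. The wording and bullets are a hypothetical example, not a required template:

```markdown
## Confidence notes

- Target customer: based on a single About Us page. Uploading 2-3 recent
  email campaigns would confirm the tone and audience.
- Key differentiators: inferred from product descriptions only. Competitor
  pages or customer reviews would strengthen this section.
```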
Some skills should include a calibration step where the user chooses between 2-3 variations before the final output is produced. This is not a default for every skill. It is appropriate only when the same input legitimately supports multiple good outputs and the user's preference is the tiebreaker.
When to calibrate:
- The output involves interpretation of voice, tone, or personality (brand voice extraction, positioning framing, creative direction).
- The user's input is ambiguous on a dimension that significantly changes the output (e.g., their content could read as playful or minimal, and the choice changes everything downstream).
When not to calibrate:
- The output is primarily determined by the input (product descriptions from specs, CSV formatting, data normalization, audits, checklists).
- The skill already receives a calibrated artifact as input (e.g., a description skill that consumes a brand voice guide, which already encodes the user's preferences).
- The skill produces structured analysis where there is a right answer, not a preference (review analysis, performance summaries, taxonomy mapping).
The calibration pattern (when used):
- Analyze the user's input silently.
- Present 2-3 variations that represent plausible but meaningfully different interpretations.
- Ask the user which resonates, or what they'd change.
- Use their selection to anchor the final output.
Present variations neutrally (A, B, C) without labeling them with descriptors. Let the user react to the output itself, not to a category name.
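A calibration prompt following this pattern might read like the hypothetical example below. The brand copy is invented for illustration; note the variations carry only neutral A/B/C labels:

```markdown
Here are three directions based on your homepage copy. Which reads most
like your brand, or what would you change?

**A.** We make trail-tested gear for people who'd rather be outside.

**B.** Premium outdoor equipment, engineered for serious backcountry use.

**C.** Gear that gets out of your way, so the trip is the story.
```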
Every skill should include a references/ directory with a sample output file named example-[description].md (e.g., example-output.md, example-positioning-brief.md). The references/ directory is part of the Agent Skills open standard. It may also contain glossaries and other supporting documentation the skill reads at runtime.
Example files use the example- prefix so they are distinguishable from other reference documents. The example serves as both a quality benchmark for the LLM and a preview for the user.
Use generic, category-obvious brand names in examples. The name should make the product category immediately clear. Slightly punny names are fine if they fit naturally.
Good: "GreatOutdoors Co." (outdoor gear), "GoodBoy Treats" (pet products), "BeanThere Coffee" (coffee)
Bad: "Ridgeline Supply Co.", "Duskbloom", "Apex Provisions." These sound like real brands and don't signal the category instantly.
The goal is that anyone reading the example immediately understands it's a template, not a case study.
The example should be good enough to use as a reference in production. It demonstrates the ceiling, not the floor. If the example is mediocre, the LLM will calibrate to mediocre output.
Every skill should include an "Edge cases" section that addresses at minimum:
- Thin input: What happens when the user provides minimal information. The skill should produce output and note what would improve it, not refuse.
- Inconsistent input: What happens when the user's input contradicts itself. Document the variation rather than averaging into a bland middle ground.
- Missing context: What happens when a key dimension (competitors, target customer, etc.) isn't provided. Produce the brief without that section and note it in confidence notes.
Skills that accept CSV data should also address:
- Missing columns
- Inconsistent formatting within columns
- Very small datasets (< 10 rows) and very large datasets (1,000+ rows)
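The CSV edge cases above can be surfaced to the user as notes rather than refusals. This is one possible sketch; the thresholds and messages are illustrative, not prescribed by SkillShelf.

```python
# Sketch: pre-flight checks on CSV rows before generating output.
# Thresholds (10 rows, 1,000 rows) mirror the edge cases listed above.
def input_warnings(rows, required):
    """Return notes about missing columns and dataset size,
    so the skill can proceed and flag gaps instead of refusing."""
    warnings = []
    present = set(rows[0]) if rows else set()
    for field in sorted(required - present):
        warnings.append(
            f"Missing column '{field}': that section will be omitted and flagged."
        )
    if len(rows) < 10:
        warnings.append(
            f"Only {len(rows)} rows provided: treat patterns as tentative."
        )
    elif len(rows) >= 1000:
        warnings.append(
            "Large dataset: sample representative rows rather than reading all."
        )
    return warnings
```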
If you've tested the skill and noticed the AI consistently makes certain mistakes, document them in a "Gotchas" section in the SKILL.md. These are different from edge cases. Edge cases describe input variations; gotchas describe AI behavior patterns you want to correct.
Good gotchas are specific and actionable:
- "The model tends to invent product features not present in the input. Every claim must trace back to something the user provided."
- "When given a short About Us page, the model pads the positioning statement with generic industry language. Keep it grounded in what the user actually said."
- "The model defaults to aspirational language ('elevate your experience') even when the brand voice profile says to avoid it. Re-read the voice profile before writing."
Build this section over time. Start with an empty gotchas section after your first few test runs, then add entries as you discover patterns.
The fixtures/ directory at the repo root contains sample ecommerce data for testing skills. The Great Outdoors Co. set includes Shopify CSV/JSON exports, Google Merchant XML, product attributes, taxonomy, reviews, brand content pages, and four PDPs at varying quality levels (clean, technical, minimal, messy).
The fixture data has intentional quality issues (mixed units, inconsistent casing, HTML artifacts, keyword stuffing) so it exercises the edge cases every skill should handle.
The fixtures/greatoutdoorsco/skill-outputs/ directory contains example outputs from primitive skills (brand voice profile, positioning brief) run against this data. Use these as inputs when testing a skill that consumes primitive output.
To test: paste the SKILL.md and a relevant fixture file into your AI tool and run it. Check that the output handles the data's messiness gracefully and matches the quality ceiling set by the example output.
Every claim, differentiator, or recommendation in skill output should be specific to the user's brand, product, or data. Generic statements that could apply to any brand in the category are not useful. "High-quality ingredients" is generic. "Single-origin cacao from Piura, Peru, fermented on-site for 6 days" is specific.
If the user's input is generic, reflect that honestly and suggest how to sharpen it. Do not fabricate specificity.
Skill instructions and output should use clear, direct language. Avoid marketing jargon, buzzwords, and abstraction. The output is a working document, not a manifesto.
Do not use em dashes, en dashes, or double hyphens as punctuation. Rewrite sentences to use periods, commas, parentheses, or conjunctions instead. Hyphens in number ranges are fine (e.g., 2-3 sentences, 100-150 words). This applies to SKILL.md, example output files, glossaries, and skillshelf.yaml content.
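The dash rule is mechanical enough to lint. A minimal sketch (not an official SkillShelf tool) that flags em dashes, en dashes, and double hyphens while letting single hyphens in ranges pass:

```python
# Sketch: flag forbidden dash punctuation in skill files.
# Single hyphens (e.g. "2-3 sentences") are allowed and not matched.
import re

FORBIDDEN = re.compile(r"\u2014|\u2013|--")  # em dash, en dash, double hyphen

def dash_violations(text):
    """Return 1-based line numbers containing forbidden dash punctuation."""
    return [i for i, line in enumerate(text.splitlines(), start=1)
            if FORBIDDEN.search(line)]
```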
A skill does one thing well. If a skill's scope creeps beyond its description, split it into two skills. A positioning brief skill should not also try to write product descriptions. A product description skill should not also audit the page for conversion best practices.