
feat: integrate skill-creator, add improve-ic-skill, overhaul contributor workflow #198

Merged
raymondk merged 22 commits into main from docs/use-skill-creator-for-new-skills on Jun 1, 2026

@marc0olo marc0olo commented May 29, 2026

What changed against main

New: skill-creator installed

skill-creator vendored at .agents/skills/skill-creator/ with a symlink at .claude/skills/skill-creator for Claude Code auto-discovery. Installed via npx skills add from anthropics/skills commit b0cbd3df (2026-03-06).

skills-lock.json is gitignored — our copy is a patched fork and must not be silently overwritten by npx skills add. Re-apply patches from PATCHES.md if updating intentionally.

5 bugs patched (confirmed from testrun, documented in .agents/skills/skill-creator/PATCHES.md):

  1. generate_review.py: escape </script> — viewer breaks on HTML eval output containing script tags
  2. SKILL.md: add missing run-1/ path level — aggregate_benchmark.py silently drops all results without it
  3. SKILL.md: require eval-0-descriptive-name/ format — aggregator globs eval-*, purely descriptive names vanish
  4. SKILL.md: --static wrong for Claude Code — generates file:// URLs browsers restrict; use server mode
  5. quick_validate.py: replace PyYAML (undeclared third-party dep) with stdlib parser — package_skill.py fails at import in a clean environment
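Patches 2 and 3 both come down to how result files are discovered on disk. The following is a minimal sketch of that failure mode, not the actual code in aggregate_benchmark.py: it assumes a simplified layout in which results are found by globbing `eval-*/run-*/grading.json`, and the helper names `find_results` and `make` as well as the directory names are hypothetical.

```python
import tempfile
from pathlib import Path


def find_results(iteration_dir: Path) -> list[Path]:
    """Hypothetical discovery rule: only eval-*/run-*/grading.json is picked up."""
    return sorted(iteration_dir.glob("eval-*/run-*/grading.json"))


def make(path: Path) -> None:
    """Create an empty JSON file, including any missing parent directories."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("{}")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    make(root / "eval-0-deploy-canister" / "run-1" / "grading.json")  # matches the pattern
    make(root / "eval-1-no-run-level" / "grading.json")               # run-1/ level missing
    make(root / "deploy-canister" / "run-1" / "grading.json")         # no eval- prefix
    found = [p.relative_to(root).as_posix() for p in find_results(root)]

print(found)  # ['eval-0-deploy-canister/run-1/grading.json']
```

Under this assumption, the two misnamed directories raise no error; they are just never matched, which is why the results vanish silently.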

1 doc improvement to agents/grader.md: explicit warning that grading.json expectations must use text/passed/evidence field names — viewer silently shows empty grades otherwise.
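A small pre-flight check can surface the field-name problem before the viewer hides it. This is a sketch only: it assumes grading.json holds a top-level `expectations` list of objects, and `check_expectations` is a hypothetical helper that is not part of skill-creator.

```python
import json

# Field names the viewer expects, per the warning added to agents/grader.md.
REQUIRED = {"text", "passed", "evidence"}


def check_expectations(grading: dict) -> list[str]:
    """Return one message per expectation that lacks a required field."""
    problems = []
    for i, exp in enumerate(grading.get("expectations", [])):
        missing = sorted(REQUIRED - set(exp))
        if missing:
            problems.append(f"expectation {i}: missing {', '.join(missing)}")
    return problems


# Illustrative payloads; the expectation text is made up for this example.
good = json.loads('{"expectations": [{"text": "mentions X", "passed": true, "evidence": "line 12"}]}')
bad = json.loads('{"expectations": [{"name": "mentions X", "met": true, "details": "line 12"}]}')

print(check_expectations(good))  # []
print(check_expectations(bad))   # ['expectation 0: missing evidence, passed, text']
```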

New: improve-ic-skill

Internal skill at .agents/skills/improve-ic-skill/ for improving existing skills. Token-efficient alternative to skill-creator's heavy interactive loop.

Key behaviours:

  • Step 0 hard gate: requires an explicit problem statement before doing any work
  • Knows our toolchain (npm run validate, evaluate-skills.js flags), eval location (evaluations/<skill-name>.json), and upstream-tracked skill rules
  • Distinguishes updating the eval file (Step 6) from running existing evals (Step 7) — evals are run selectively, not on every pass
  • Seeds evals from the upstream diff when a skill has none yet
  • Routes upstream sync work through this skill

Deleted: legacy skill template

skills/_template/SKILL.md.template removed — its prescribed body sections contradicted the "no rigid structure" principle and skill-creator now handles drafting.

Updated: .gitignore

Added entries for skill-creator runtime artifacts: skills/*-workspace/, **/__pycache__/, skills-lock.json.

Updated: CLAUDE.md and CONTRIBUTING.md

  • improve-ic-skill for improving existing skills; skill-creator for creating new skills — explicit in both docs
  • Two-phase eval workflow clarified: skill-creator's internal loop for drafting; evaluations/<skill-name>.json + evaluate-skills.js as the committed regression safety net
  • Eval run requirements: required with baseline for new skill PRs; recommended-not-required for improvements
  • Upstream sync: improve-ic-skill named as the skill to load; eval guidance aligned with general policy; branch naming clarified (<repo> = upstream repo short name)
  • Project Structure corrected (src/data/ → src/lib/; non-existent components removed)
  • ~15 additional consistency fixes (step numbering, diff notation, eval path references, category instructions, PR eval scope, branch naming)

Not fixed (intentional)

grader.md step ordering bug (Step 7 writes before Step 8 reads metrics) — pre-existing in upstream Anthropic code, does not affect improve-ic-skill's workflow.

@marc0olo marc0olo requested review from a team and JoshDFN as code owners May 29, 2026 13:12
marc0olo added 7 commits May 29, 2026 15:43
Replace the template-copy workflow with the Anthropic skill-creator
skill as the recommended starting point. Add explicit callouts for the
IC-specific metadata block (title, category) that skill-creator does not
produce, and clarify the two-phase eval workflow: skill-creator's
internal loop for iterative drafting vs. the committed evaluations/
file required for PRs.

Also remove prescriptive body section recommendations from CLAUDE.md —
structure is individual to each skill and better left to skill-creator.

Install skill-creator as a project skill via `npx skills add` so it is
auto-discovered by Claude Code without manual loading. Files land at
.agents/skills/skill-creator/ (multi-agent canonical location) with a
symlink at .claude/skills/skill-creator for Claude Code.

Doc fixes:
- Update skill-creator references to reflect it is pre-installed
- Step 3 renamed "Review and finalize" to avoid implying manual authoring
- Step 6 leads with porting evals from skill-creator's evals.json
- "Keep it flat" bullet now correctly allows references/ subdirectory
- "see step 5 above" cross-reference replaced with anchor link
- CLAUDE.md: add three-file instruction for adding a new category
- CLAUDE.md: mark _template/ as legacy in Project Structure
@marc0olo marc0olo force-pushed the docs/use-skill-creator-for-new-skills branch from d4c95d0 to 941fe4f on May 29, 2026
@marc0olo marc0olo changed the title from "docs: recommend skill-creator for new skill drafting" to "feat: install skill-creator and overhaul contributing workflow" on May 29, 2026

Copilot AI left a comment

Pull request overview

This PR installs the skill-creator project skill and updates contributor/agent workflows to use it instead of the removed legacy skill template. It also clarifies IC-specific metadata/eval requirements and upstream sync guidance.

Changes:

  • Adds vendored skill-creator skill assets, scripts, eval viewer, schemas, and lockfile.
  • Removes the old skills/_template/SKILL.md.template.
  • Updates CONTRIBUTING.md and .claude/CLAUDE.md to document the new skill creation/improvement workflow.

Reviewed changes

Copilot reviewed 22 out of 23 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| `skills/_template/SKILL.md.template` | Removes the old rigid starter template. |
| `skills-lock.json` | Records installed skill-creator provenance/hash. |
| `CONTRIBUTING.md` | Reworks contributor workflow around skill-creator, metadata, evals, and upstream sync. |
| `.claude/CLAUDE.md` | Updates agent instructions for skill creation, evals, categories, and repo layout. |
| `.agents/skills/skill-creator/SKILL.md` | Adds the main skill-creator workflow instructions. |
| `.agents/skills/skill-creator/scripts/utils.py` | Adds shared SKILL.md parsing helper. |
| `.agents/skills/skill-creator/scripts/run_loop.py` | Adds description optimization loop. |
| `.agents/skills/skill-creator/scripts/run_eval.py` | Adds trigger-evaluation runner. |
| `.agents/skills/skill-creator/scripts/quick_validate.py` | Adds basic skill validation script. |
| `.agents/skills/skill-creator/scripts/package_skill.py` | Adds skill packaging utility. |
| `.agents/skills/skill-creator/scripts/improve_description.py` | Adds Claude-based description improvement script. |
| `.agents/skills/skill-creator/scripts/generate_report.py` | Adds HTML report generator. |
| `.agents/skills/skill-creator/scripts/aggregate_benchmark.py` | Adds benchmark aggregation script. |
| `.agents/skills/skill-creator/scripts/__init__.py` | Marks scripts package for module execution. |
| `.agents/skills/skill-creator/references/schemas.md` | Documents skill-creator JSON artifacts. |
| `.agents/skills/skill-creator/LICENSE.txt` | Adds Apache-2.0 license text. |
| `.agents/skills/skill-creator/eval-viewer/viewer.html` | Adds browser review UI. |
| `.agents/skills/skill-creator/eval-viewer/generate_review.py` | Adds review page generator/server. |
| `.agents/skills/skill-creator/assets/eval_review.html` | Adds trigger eval review template. |
| `.agents/skills/skill-creator/agents/grader.md` | Adds grader-agent instructions. |
| `.agents/skills/skill-creator/agents/comparator.md` | Adds blind comparator instructions. |
| `.agents/skills/skill-creator/agents/analyzer.md` | Adds post-hoc analyzer instructions. |

Comment thread CONTRIBUTING.md Outdated
Comment thread .claude/CLAUDE.md Outdated
Comment thread CONTRIBUTING.md Outdated
Comment thread .agents/skills/skill-creator/scripts/quick_validate.py Outdated
Comment thread .agents/skills/skill-creator/SKILL.md Outdated
@marc0olo marc0olo marked this pull request as draft May 29, 2026 16:11
@marc0olo
Member Author

needs to be tested properly before merging

Comment thread .agents/skills/skill-creator/eval-viewer/generate_review.py
Comment thread .agents/skills/skill-creator/SKILL.md Outdated
Comment thread .agents/skills/skill-creator/SKILL.md Outdated
Comment thread .agents/skills/skill-creator/SKILL.md Outdated
@marc0olo
Member Author

Two additional items not tied to specific lines:

1. Missing .gitignore entries for skill-creator artifacts

Running skill-creator in this repo produces two untracked directories that are not currently gitignored:

  • skills/*-workspace/ — the iteration workspace (CLAUDE.md already says it's not committed, but nothing prevents it from being accidentally staged)
  • **/__pycache__/ — Python bytecode generated when running the aggregation and viewer scripts

Suggest adding both to .gitignore as part of this PR (or a follow-up).


2. agents/grader.md doesn't mention the field name requirement for grading.json

SKILL.md line 225 warns that grading.json expectations must use text, passed, evidence (not name/met/details). But a grader subagent reads only agents/grader.md — and that file's output format section doesn't repeat this constraint. If a grader subagent uses different field names, the viewer silently shows empty grades with no error.

The warning should appear in agents/grader.md directly above the output format JSON example, not just in SKILL.md.

marc0olo added 4 commits May 30, 2026 00:19
- Add internal improve-ic-skill skill (.agents/skills/improve-ic-skill/)
  for token-efficient improvement of existing skills. Uses our toolchain
  (npm run validate, evaluate-skills.js with targeted flags), knows eval
  location (evaluations/<skill-name>.json), handles upstream-tracked skills,
  and seeds evals when none exist. Distinct from skill-creator which is for
  new skill creation only.

- Fix 4 confirmed bugs in vendored skill-creator (documented in PATCHES.md):
  1. generate_review.py: escape </script> in JSON output to prevent viewer breakage
  2. SKILL.md: add missing run-1/ level in output paths (aggregate_benchmark.py requires it)
  3. SKILL.md: require eval-0-descriptive-name/ format (aggregator globs eval-*)
  4. SKILL.md: clarify --static is wrong for Claude Code; use server mode instead

- Remove skills-lock.json from git and add to .gitignore. Our skill-creator
  copy is a patched fork — npx skills add would overwrite fixes silently.
  Intentional updates must re-apply patches from PATCHES.md.

- Update CLAUDE.md: distinguish improve-ic-skill vs skill-creator, warn
  against npx skills add updates, clarify that PR eval results must come
  from evaluate-skills.js (with-skill vs baseline), not skill-creator internals.

- Add gitignore entries for skill-creator artifacts (skills/*-workspace/,
  **/__pycache__/, skills-lock.json).
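Patch 1 in the commit message above addresses a general hazard of inlining JSON into an HTML page: the browser's HTML parser ends the script element at the first `</script>` it sees, even inside a JSON string. The sketch below shows one common way to avoid that; it is not the patched code from generate_review.py, and `json_for_script_tag` is a hypothetical helper name.

```python
import json


def json_for_script_tag(data: object) -> str:
    """Serialize data as JSON that can be inlined inside <script>...</script>."""
    # "\/" is a legal JSON escape for "/", so the payload stays valid JSON
    # while the HTML parser never sees a literal closing tag inside it.
    return json.dumps(data).replace("</", "<\\/")


# Eval output that itself contains a script tag, as in the reported bug.
payload = {"output": "<html><script>alert(1)</script></html>"}
embedded = json_for_script_tag(payload)
page = f"<script>const DATA = {embedded};</script>"

print("</script>" in embedded)          # False
print(page.count("</script>"))          # 1
print(json.loads(embedded) == payload)  # True
```

The data round-trips unchanged, and the only closing tag left in the page is the real one.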
@marc0olo marc0olo changed the title from "feat: install skill-creator and overhaul contributing workflow" to "feat: integrate skill-creator, add improve-ic-skill, overhaul contributor workflow" on May 29, 2026
marc0olo added 5 commits May 30, 2026 01:36
… CLAUDE.md, PATCHES.md

- CONTRIBUTING.md: route improvements to improve-ic-skill (not skill-creator);
  fix nonexistent skills/<name>/evals/evals.json path reference
- improve-ic-skill: remove redundant eval run from Step 8 (Step 7 covers it);
  add upstream sync guidance for seeding evals from the diff when none exist
- CLAUDE.md: add improve-ic-skill mention in upstream sync workflow; align
  sync checklist eval policy with general improvement policy
- PATCHES.md: reorder patches 4 and 5 into sequential order
- CONTRIBUTING.md: rename 'That's it' step to 'No site edits needed'
- CLAUDE.md: remove redundant eval command from Workflow section; point to
  Evaluations section for the full command reference
- CLAUDE.md: fix stale Project Structure (src/data/ → src/lib/, remove
  non-existent SiteLayout, components/*)
- CLAUDE.md: clarify branch naming — <skill-name> is the IC skill name;
  document combined-branch pattern for multi-skill syncs
- CLAUDE.md: clarify 'Body content' row in upstream table (not icskills-owned)
- CLAUDE.md: scope PR eval requirement to new skills only (line 67)
- CLAUDE.md: fix Project Structure paths (already committed, included here)
- improve-ic-skill: add #8-submit-a-pr anchor to CONTRIBUTING.md link
- CONTRIBUTING.md: add improve-ic-skill and branch naming note to sync section
- PATCHES.md: clarify upstream commit date vs vendored date labels
@marc0olo
Member Author

here is what happens if I ask the agent to improve the icp-cli skill without giving clear guidance:
image

@marc0olo marc0olo marked this pull request as ready for review May 30, 2026 00:30
@marc0olo
Member Author

after we fixed the generic "just improve it" prompt:

image

@github-actions

github-actions Bot commented May 30, 2026

Skill Validation Report

Validating skill: /home/runner/work/icskills/icskills/skills/icp-cli

Structure

  • Pass: SKILL.md found
  • Pass: all files in references/ are referenced

Frontmatter

  • Pass: name: "icp-cli" (valid)
  • Pass: description: (507 chars)
  • Pass: license: "Apache-2.0"
  • Pass: metadata: (2 entries)

Tokens

  • Warning: SKILL.md body is 5380 tokens (spec recommends < 5000)

Markdown

  • Pass: no unclosed code fences found

Tokens

| File | Tokens |
| --- | --- |
| SKILL.md body | 5,380 |
| references/binding-generation.md | 1,031 |
| references/dev-server.md | 690 |
| references/dfx-migration.md | 2,620 |
| Total | 9,721 |
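The figures in this table can be cross-checked with a few lines of arithmetic. This is only an illustrative sketch: the 5000-token bound is the recommendation quoted in the warning above, and the variable names are made up here.

```python
BODY_BUDGET = 5000  # recommended upper bound for the SKILL.md body, per the warning

# Token counts copied from the report table.
tokens = {
    "SKILL.md body": 5380,
    "references/binding-generation.md": 1031,
    "references/dev-server.md": 690,
    "references/dfx-migration.md": 2620,
}

total = sum(tokens.values())
over = tokens["SKILL.md body"] - BODY_BUDGET

print(total)  # 9721
print(over)   # 380
```

The body exceeds the recommended size by 380 tokens, which is why the run ends with one warning and no failure.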

Content Analysis

| Metric | Value |
| --- | --- |
| Word count | 2,865 |
| Code block ratio | 0.19 |
| Imperative ratio | 0.12 |
| Information density | 0.16 |
| Instruction specificity | 0.84 |
| Sections | 18 |
| List items | 57 |
| Code blocks | 29 |

References Content Analysis

| Metric | Value |
| --- | --- |
| Word count | 2,133 |
| Code block ratio | 0.25 |
| Imperative ratio | 0.13 |
| Information density | 0.19 |
| Instruction specificity | 0.80 |
| Sections | 18 |
| List items | 33 |
| Code blocks | 12 |

Contamination Analysis

| Metric | Value |
| --- | --- |
| Contamination level | low |
| Contamination score | 0.12 |
| Primary language category | shell |
| Scope breadth | 3 |
  • Warning: Language mismatch: config, javascript (2 categories differ from primary)

References Contamination Analysis

| Metric | Value |
| --- | --- |
| Contamination level | low |
| Contamination score | 0.03 |
| Primary language category | javascript |
| Scope breadth | 2 |
  • Warning: Language mismatch: shell (1 category differ from primary)

Result: 1 warning

Project Checks


✓ Project checks passed for 1 skills (0 warnings)

@marc0olo
Member Author

seems like we are good now on this:
image

@raymondk raymondk merged commit 6e49de7 into main Jun 1, 2026
6 checks passed
@raymondk raymondk deleted the docs/use-skill-creator-for-new-skills branch June 1, 2026 16:38
3 participants