feat: improve implement-spec skill score (53% → 96%) by yogesh-tessl · Pull Request #1531 · a16z/jolt

yogesh-tessl · 2026-05-15T07:16:34Z

I ran your skills through tessl skill review at work and found some targeted improvements. Here's the before/after:

Skill	Before	After	Change
implement-spec	53%	96%	+43%

I focused on implement-spec: it had the most improvement headroom (tied at 53% with several others) and is the most central workflow skill (spec-to-code implementation is the core development action). I also fixed a missing name field in update-docs that was causing it to fail validation entirely.

Changes made

implement-spec (53% → 96%, +43%)

Expanded the frontmatter description from a vague one-liner ("Autonomous one-shot implementation from an approved spec") to a structured description listing concrete actions (plans changes, executes code modifications, runs QA cycles, validates correctness, posts PR summaries)
Added explicit USE FOR and TRIGGERS sections with natural language trigger terms ("implement spec", "build from spec", "execute spec", "implement feature", etc.)
Folded the <Purpose> section content into the expanded description - it was duplicating information already covered
Removed the <Examples> section that added minimal value for an autonomous workflow skill

update-docs (16% → 62%, +46%)

Added the missing name: update-docs field to the frontmatter - the skill was failing validation entirely because this required field was absent, which prevented the LLM judge from scoring it
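For illustration, the kind of required-field check that the update-docs fix addresses can be sketched as below. This is a minimal sketch, not Tessl's validator: the function names are hypothetical, the parser only reads flat top-level `key: value` lines between `---` delimiters, and treating `description` as required is an assumption (only `name` is described as required above). The sample description text is also made up.

```python
# Assumption: only `name` is confirmed as required in this PR; `description` is a guess.
REQUIRED_FIELDS = ("name", "description")


def parse_frontmatter(text):
    """Return top-level `key: value` pairs from a leading '---' delimited block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        # Skip blank lines, indented continuation lines, and lines without a key.
        if line[:1].isspace() or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def missing_fields(text, required=REQUIRED_FIELDS):
    """List required fields that are absent or empty in the frontmatter."""
    fields = parse_frontmatter(text)
    return [name for name in required if not fields.get(name)]


# Hypothetical SKILL.md contents before and after adding the name field.
before = "---\ndescription: Update the docs\n---\n# body\n"
after = "---\nname: update-docs\ndescription: Update the docs\n---\n# body\n"

print(missing_fields(before))  # ['name']
print(missing_fields(after))   # []
```

A real validator would use a full YAML parser; the flat-line parsing here is only enough to show why a single absent field can fail the whole file.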

Quick, honest disclosure: I work at https://github.com/tesslio, where we build tooling around skills like these. Not a pitch, just saw room for improvement and wanted to contribute.

If you want to self-improve your skills, or define your own scenarios to pressure-test them, just ask your agent (Claude Code, Codex, etc.) to evaluate and optimize your skill with Tessl. Ping me (@yogesh-tessl) if you hit any snags.

@moodlezoup

Hey 👋 @moodlezoup

I ran your skills through `tessl skill review` at work and found some targeted improvements. Here's the full before/after:

| Skill | Before | After | Change |
|-------|--------|-------|--------|
| implement-spec | 53% | 96% | +43% |
| update-docs | 16% | 62% | +46% |
| ci-code-review | 90% | 90% | no change |
| jolt | 66% | 66% | no change |
| new-spec | 63% | 63% | no change |
| analyze-spec | 53% | 53% | no change |
| new-invariant | 53% | 53% | no change |
| new-objective | 53% | 53% | no change |

## Summary

Focused on `implement-spec`: it had the most improvement headroom (tied at 53% with several others) and is the most central workflow skill (spec-to-code implementation is the core development action). Also fixed a missing `name` field in `update-docs` that was causing it to fail validation entirely.

## Changes

<details>
<summary>Changes made</summary>

**`implement-spec` (53% → 96%, +43%)**

- Expanded the frontmatter description from a vague one-liner ("Autonomous one-shot implementation from an approved spec") to a structured description listing concrete actions (plans changes, executes code modifications, runs QA cycles, validates correctness, posts PR summaries)
- Added explicit USE FOR and TRIGGERS sections with natural language trigger terms ("implement spec", "build from spec", "execute spec", "implement feature", etc.)
- Folded the `<Purpose>` section content into the expanded description, since it was duplicating information already covered
- Removed the `<Examples>` section that added minimal value for an autonomous workflow skill

**`update-docs` (16% → 62%, +46%)**

- Added the missing `name: update-docs` field to the frontmatter. The skill was failing validation entirely because this required field was absent, which prevented the LLM judge from scoring it

</details>

## Testing

- [x] `tessl skill review` confirms `implement-spec` improved from 53% → 96%
- [x] `tessl skill review` confirms `update-docs` improved from 16% → 62%
- [x] All validation checks pass (no errors)
- [ ] No modified crates (changes are skill metadata only)

## Security Considerations

No security implications. Changes are limited to skill metadata (frontmatter descriptions) and removal of redundant documentation sections. No code, API, proof system, or verifier changes.

## Breaking Changes

None

---

I also stress-tested your `implement-spec` skill against a few real-world task evals and it held up really well on autonomous spec implementation requiring dual-mode (host+zk) QA validation with jolt-eval invariant scaffolding. Kudos for that.

Honest disclosure: I work at @tesslio where we build tooling around skills like these. Not a pitch, just saw room for improvement and wanted to contribute.

Want to self-improve your skills? Just point your agent (Claude Code, Codex, etc.) at [this Tessl guide](https://docs.tessl.io/evaluate/optimize-a-skill-using-best-practices) and ask it to optimize your skill. Ping me ([@yogesh-tessl](https://github.com/yogesh-tessl)) if you hit any snags.

Thanks in advance 🙏

moodlezoup · 2026-05-18T14:41:56Z

Hi @yogesh-tessl, thanks for the PR. What are these percentages of? Is there any documentation you can point me to?

yogesh-tessl · 2026-05-19T08:36:56Z

@moodlezoup, good question.

The percentages come from Tessl's skill review, which scores SKILL.md files across a few dimensions: whether the frontmatter has the required fields, how clear the trigger/routing language is for an LLM to pick the right skill, and how actionable the instructions are once the skill fires. Each dimension gets a 0-100 score, and the overall number is a weighted average.
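As a rough illustration of how a weighted average of per-dimension scores works, here is a small sketch. The dimension names, scores, and weights are all hypothetical; the actual rubric and weights used by the review are not stated in this thread.

```python
def overall_score(scores, weights):
    """Weighted average of per-dimension scores, each on a 0-100 scale."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same dimensions")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[d] * weights[d] for d in scores) / total_weight


# Hypothetical dimensions, scores, and weights, for illustration only.
scores = {"frontmatter": 100, "routing": 50, "actionability": 80}
weights = {"frontmatter": 1, "routing": 2, "actionability": 1}

print(overall_score(scores, weights))  # 70.0
```

With these made-up numbers, the doubled weight on the routing dimension pulls the overall score toward its lower value, which is the general effect of weighting one dimension more heavily.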

The big jump on implement-spec (53 to 96) was mostly the frontmatter description being too short for agents to reliably match it.

The update-docs one (16 to 62) was just a missing name field that was failing validation entirely.

I'll be happy to point you to the scoring docs if you want more detail on the rubric.

github-actions bot added the no-spec label ("PR has no spec file") on May 15, 2026

yogesh-tessl wants to merge 1 commit into a16z:main from yogesh-tessl:improve/skill-review-optimization
