obophenotype · dosumis · May 20, 2026 · Apr 27, 2026 · Apr 27, 2026 · Apr 27, 2026
diff --git a/.claude/agents/ntr-term-researcher.md b/.claude/agents/ntr-term-researcher.md
diff --git a/.claude/agents/ontology-term-lookup.md b/.claude/agents/ontology-term-lookup.md
@@ -0,0 +1,102 @@
+---
+name: ontology-term-lookup
+description: Use this agent when you need to find ontology terms by their textual labels or descriptions using the OLS4 MCP. This includes:\n\n<example>\nContext: User is populating a DOSDP template and needs to find the correct ontology term for 'hepatic artery'.\nuser: "I need to find the ontology term for 'hepatic artery' in UBERON"\nassistant: "I'll use the ontology-term-lookup agent to search for this term in UBERON."\n<agent call to ontology-term-lookup with text='hepatic artery' and ontology='UBERON'>\n</example>\n\n<example>\nContext: Agent is filling in missing ontology terms in a template and encounters text describing an anatomical structure.\nassistant: "I need to find the ontology term for 'renal vein' to complete this template entry. Let me use the ontology-term-lookup agent."\n<agent call to ontology-term-lookup with text='renal vein' and ontology='UBERON'>\n</example>\n\n<example>\nContext: User provides alternative phrasings that need to be searched.\nuser: "Check if there's a term for either 'artery of kidney' or 'kidney artery'"\nassistant: "I'll use the ontology-term-lookup agent to search for both phrasings."\n<agent call to ontology-term-lookup with text='artery of kidney' and ontology='UBERON'>\n<agent call to ontology-term-lookup with text='kidney artery' and ontology='UBERON'>\n</example>
+model: sonnet
+---
+
+You are an expert ontology term matcher specializing in using the OLS4 (Ontology Lookup Service 4) MCP to find precise ontology term matches for textual descriptions.
+
+Your core responsibility is to take textual input describing an anatomical or biological concept and find the best matching ontology term(s) from a specified ontology using the ols4-mcp tool.
+
+## Input Processing
+
+You will receive:
+1. **text**: The term or phrase to look up (e.g., 'hepatic artery', 'blood vessel', 'artery of liver')
+2. **ontology**: The target ontology to search within (e.g., 'UBERON', 'CL', 'GO')
+
+## Search Strategy
+
+Execute searches systematically:
+
+1. **Primary Search**: Search for the exact text as provided in the specified ontology using ols4-mcp, looking for matches in labels and synonyms.
+
+2. **Alternative Phrasing**: If no high-confidence match is found, automatically generate and search alternative phrasings:
+   - Convert "X artery" to "artery of X" and vice versa
+   - Try singular/plural variations
+   - Substitute common synonyms (e.g., 'vessel' for 'blood vessel', 'hepatic' for 'liver')
+   - Consider anatomical term variations (e.g., 'renal' for 'kidney', 'cardiac' for 'heart')
+
+3. **Iterative Refinement**: If initial searches yield poor results, progressively broaden or narrow the search terms based on the domain.
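
The alternative-phrasing step can be sketched in a few lines of Python. This is a minimal illustration, not the agent's actual logic: the synonym table and the `phrasing_variants` helper are hypothetical, it only handles two-word phrases and "X of Y" phrases, and it leaves out singular/plural handling.

```python
import re

# Hypothetical synonym table built from the examples in the list above.
ADJECTIVE_TO_NOUN = {"hepatic": "liver", "renal": "kidney", "cardiac": "heart"}
NOUN_TO_ADJECTIVE = {noun: adj for adj, noun in ADJECTIVE_TO_NOUN.items()}


def phrasing_variants(text: str) -> list[str]:
    """Return alternative phrasings of `text`, excluding the input itself."""
    base = " ".join(text.lower().split())
    variants: list[str] = []

    def add(candidate: str) -> None:
        if candidate != base and candidate not in variants:
            variants.append(candidate)

    of_form = re.fullmatch(r"(\w+) of (\w+)", base)
    parts = base.split()
    if of_form:
        head, modifier = of_form.group(1), of_form.group(2)
    elif len(parts) == 2:
        modifier, head = parts
    else:
        return variants  # longer phrases are out of scope for this sketch

    noun = ADJECTIVE_TO_NOUN.get(modifier, modifier)       # "hepatic" -> "liver"
    adjective = NOUN_TO_ADJECTIVE.get(modifier, modifier)  # "kidney" -> "renal"
    add(f"{head} of {noun}")   # "X artery" -> "artery of X"
    add(f"{noun} {head}")      # "artery of X" -> "X artery"
    if adjective != noun:
        add(f"{adjective} {head}")
    return variants
```

Each variant would then be searched in turn, stopping at the first high-confidence match.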
+
+## Match Quality Assessment
+
+Evaluate matches based on:
+- **Exact label match**: Highest confidence
+- **Exact synonym match**: High confidence
+- **Partial label/synonym match**: Medium confidence (note the differences)
+- **Related term**: Low confidence (clearly indicate this is not a direct match)
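
The four confidence tiers above map naturally onto a small classifier. The sketch below assumes a hypothetical `Candidate` shape (it is not the ols4-mcp response format) and uses naive word overlap to detect a partial match, which a real implementation would likely refine.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """Hypothetical shape for one search hit."""
    label: str
    synonyms: list[str] = field(default_factory=list)


def _norm(text: str) -> str:
    return " ".join(text.lower().split())


def classify_match(query: str, candidate: Candidate) -> tuple[str, str]:
    """Return (match type, confidence) following the tiers listed above."""
    q = _norm(query)
    label = _norm(candidate.label)
    synonyms = [_norm(s) for s in candidate.synonyms]
    if q == label:
        return ("exact label", "Highest")
    if q in synonyms:
        return ("exact synonym", "High")
    # Naive partial match: any shared word with the label or a synonym.
    q_words = set(q.split())
    if any(q_words & set(name.split()) for name in [label, *synonyms]):
        return ("partial match", "Medium")
    return ("related term", "Low")
```

Word overlap will also fire on filler words such as "of", so a stop-word list is a sensible next step.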
+
+## Output Format
+
+Return results in this structured format:
+
+**For single high-confidence match:**
+```
+Best Match Found:
+- Input Text: [original input]
+- Matched Term: [term label]
+- Ontology ID: [full IRI or CURIE]
+- Match Type: [exact label | exact synonym | partial match]
+- Definition: [term definition if available]
+- Confidence: High
+```
+
+**For multiple high-confidence matches:**
+```
+Multiple Matches Found (ranked by relevance):
+
+Input Text: [original input]
+
+1. [Match rank]
+   - Matched Term: [term label]
+   - Ontology ID: [full IRI or CURIE]
+   - Match Type: [exact label | exact synonym | partial match]
+   - Definition: [term definition if available]
+   - Confidence: High/Medium
+   - Reason for ranking: [brief explanation]
+
+2. [Match rank]
+   - Matched Term: [term label]
+   - Ontology ID: [full IRI or CURIE]
+   - Match Type: [exact label | exact synonym | partial match]
+   - Definition: [term definition if available]
+   - Confidence: High/Medium
+   - Reason for ranking: [brief explanation]
+
+[Continue for all relevant matches]
+```
+
+**For no matches:**
+```
+No Match Found:
+- Input Text: [original input]
+- Ontology Searched: [ontology name]
+- Alternative phrasings tried: [list attempted variations]
+- Recommendation: [suggest manual review, broader ontology search, or term creation]
+```
+
+## Quality Control
+
+- Always verify that the matched term's definition aligns semantically with the input text
+- Flag cases where the match seems questionable despite technical similarity
+- When ranking multiple matches, prioritize based on: definition alignment > match type > term specificity
+- Never return low-confidence matches without clearly labeling them as such
+- If the ontology parameter seems inappropriate for the term type, note this in your response
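
The ranking priority (definition alignment > match type > term specificity) amounts to a composite sort key. In the sketch below the dictionary fields `definition_alignment`, `match_type`, and `specificity` are hypothetical; in practice the alignment score would come from the agent's own semantic judgment rather than a number handed in.

```python
# Lower rank sorts first; the names follow the Match Type values used above.
MATCH_TYPE_RANK = {
    "exact label": 0,
    "exact synonym": 1,
    "partial match": 2,
    "related term": 3,
}


def rank_matches(matches: list[dict]) -> list[dict]:
    """Order matches by definition alignment, then match type, then specificity."""
    return sorted(
        matches,
        key=lambda m: (
            -m["definition_alignment"],     # higher alignment first
            MATCH_TYPE_RANK[m["match_type"]],
            -m["specificity"],              # more specific term first
        ),
    )
```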
+
+## Error Handling
+
+- If the ols4-mcp tool is unavailable, clearly state this and suggest alternative approaches
+- If the specified ontology doesn't exist or is inaccessible, report this explicitly
+- If the input text is ambiguous, note this and explain what additional context would help
+
+Remember: Precision is paramount. It's better to return no match or multiple candidates than to return a single incorrect high-confidence match.
diff --git a/.claude/skills/fetch-wiki-info-api/SKILL.md b/.claude/skills/fetch-wiki-info-api/SKILL.md
@@ -0,0 +1,75 @@
+---
+name: fetch-wiki-info-api
+description: Fetch structured and descriptive information from Wikidata and Wikipedia via HTTP APIs (no browser, no Playwright)
+argument-hint: "[search term] [--images]"
+allowed-tools: Bash
+---
+
+# Fetch Wiki Info Skill (HTTP-API variant)
+
+An alternative implementation of `fetch-wiki-info`, maintained alongside it, that calls the Wikidata and Wikipedia public APIs directly instead of going through Playwright. It is faster, has no Chromium dependency, and is not subject to the 8-parallel cap.
+
+## Search Term
+
+Topic to search for: **$ARGUMENTS**
+
+## Instructions
+
+Run the bundled Python helper. It is stdlib-only — no `pip install`.
+
+```bash
+python3 .claude/skills/fetch-wiki-info-api/fetch_wiki_info.py "$ARGUMENTS"
+```
+
+If the caller wants Wikipedia images + captions (e.g. for the `ntr-term-researcher` agent's image-xref step), pass `--images`:
+
+```bash
+python3 .claude/skills/fetch-wiki-info-api/fetch_wiki_info.py "$ARGUMENTS" --images
+```
+
+For machine-readable output, add `--json`.
+
+## Workflow inside the script
+
+1. **Wikidata search** (`wbsearchentities`) — top 5 candidates.
+2. **Wikidata entity fetch** (`Special:EntityData/{Q}.json`) for the top hit. Extracts the label, description, aliases, `P31` (instance of), `P279` (subclass of), and `P361` (part of), plus the canonical English Wikipedia title via `sitelinks.enwiki.title` (avoids redirect guessing).
+3. **Wikipedia summary** (`/api/rest_v1/page/summary/{title}`) — liberal relevance gate: rejects only disambiguation pages or empty extracts.
+4. **Wikipedia full extract** (`action=query&prop=extracts&explaintext=1&redirects=1`) — full plain-text article body.
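+5. **Wikipedia media** (with `--images` only): captions are parsed from `<figure>` + `<figcaption>` blocks in `/api/rest_v1/page/html/{title}`, because `/api/rest_v1/page/media-list/{title}` does not return caption text; only items whose caption shares a word with the query term are kept.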
+
+Set a polite `User-Agent` (already done in the script).
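
For orientation, the request URLs behind steps 1 to 4 can be built with the standard library alone. This is a sketch of the endpoints named above, not an excerpt from `fetch_wiki_info.py`: the helper names and the `USER_AGENT` string are hypothetical, and nothing here sends a request.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

WIKIDATA_API = "https://www.wikidata.org/w/api.php"
WIKIPEDIA_API = "https://en.wikipedia.org/w/api.php"
WIKIPEDIA_REST = "https://en.wikipedia.org/api/rest_v1"
# Hypothetical User-Agent string; the real script sets its own.
USER_AGENT = "fetch-wiki-info-sketch/0.1 (contact: maintainer@example.org)"


def wikidata_search_url(term: str, limit: int = 5) -> str:
    """Step 1: wbsearchentities, top `limit` candidates."""
    params = {"action": "wbsearchentities", "search": term,
              "language": "en", "limit": limit, "format": "json"}
    return f"{WIKIDATA_API}?{urlencode(params)}"


def wikidata_entity_url(qid: str) -> str:
    """Step 2: full entity document for one Q-ID."""
    return f"https://www.wikidata.org/wiki/Special:EntityData/{qid}.json"


def wikipedia_summary_url(title: str) -> str:
    """Step 3: REST summary; titles use underscores and are percent-encoded."""
    return f"{WIKIPEDIA_REST}/page/summary/{quote(title.replace(' ', '_'), safe='')}"


def wikipedia_extract_url(title: str) -> str:
    """Step 4: plain-text article body, following redirects."""
    params = {"action": "query", "prop": "extracts", "explaintext": 1,
              "redirects": 1, "titles": title, "format": "json"}
    return f"{WIKIPEDIA_API}?{urlencode(params)}"


def polite_request(url: str) -> Request:
    """Attach the User-Agent header without sending anything."""
    return Request(url, headers={"User-Agent": USER_AGENT})
```

Passing `polite_request(...)` to `urllib.request.urlopen` would perform the actual call.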
+
+## Output Format
+
+Markdown with the same overall shape as the Playwright skill, plus an optional **Wikipedia Full Text** section and an optional **Wikipedia Images** section:
+
+```
+# <term>
+
+## Wikidata (Q#######)
+- Label / Description / Aliases / Instance of / Subclass of / Part of / Wikipedia link
+
+## Wikipedia Summary (<title>)
+<one-paragraph extract>
+
+## Wikipedia Full Text
+<full plain-text article>
+
+## Wikipedia Images           (only with --images)
+- <file title> — <caption>
+  - src: <url>
+
+## Notes
+- <relevance-gate reasons, if any>
+
+## Sources
+- Wikidata: https://www.wikidata.org/wiki/Q#######
+- Wikipedia: https://en.wikipedia.org/wiki/<page>
+```
+
+## Notes
+
+- Endpoints are anonymous; no auth required.
+- This skill exists in parallel with `fetch-wiki-info` for A/B comparison. Once validated on a real Stage 3 NTR run, the Playwright version (and the 8-parallel cap in [bulk_ntr_workflow/CLAUDE.md](../../../bulk_ntr_workflow/CLAUDE.md)) can be retired.
+- If Wikidata has no match, the script reports the empty candidate list and exits cleanly.
+- Disambiguation pages (e.g. "head") are dropped via the relevance gate — try a more specific term.
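
The liberal relevance gate described above (reject only disambiguation pages or empty extracts) could look roughly like this. It assumes the REST summary response carries a `type` field set to `"disambiguation"` for such pages and an `extract` field with the summary text; `relevance_gate` is a hypothetical name, not necessarily the one used in the script.

```python
def relevance_gate(summary: dict) -> list[str]:
    """Return the reasons to reject a summary; an empty list means it passes."""
    reasons = []
    if summary.get("type") == "disambiguation":
        reasons.append("disambiguation page")
    if not (summary.get("extract") or "").strip():
        reasons.append("empty extract")
    return reasons
```

Returning reasons rather than a boolean makes it easy to surface them in the output's Notes section.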
diff --git a/.claude/skills/fetch-wiki-info-api/VALIDATION.md b/.claude/skills/fetch-wiki-info-api/VALIDATION.md
@@ -0,0 +1,89 @@
+# Validation of the HTTP-API `fetch-wiki-info-api` skill
+
+This skill replaced the Playwright-based `fetch-wiki-info` skill. This note documents
+the A/B validation run that justified the switch — keep alongside the skill so the
+provenance lives with the code, not in a PR description that ages out.
+
+**Not auto-loaded into agent context** (only the `SKILL.md` frontmatter is), so this
+file is safe to keep as a reference and won't distract agents.
+
+## Method
+
+Test set: every unique term label across the 45 group-input JSONs on the
+`add-hra-muscular-ntr` branch (`bulk_ntr_workflow/outputs/definitions/input/*.json`).
+75 unique terms after label-deduplication.
+
+For each term:
+- Invoke the new skill (`fetch_wiki_info.py <label> --json`)
+- Record: Wikidata Q-ID found? Wikipedia summary found? Full-text length? Latency?
+- Compare against the `wikipedia_summary` field in the Playwright-skill-produced
+  output JSONs on the same branch (`bulk_ntr_workflow/outputs/definitions/*.json`).
+
+39 of the 75 terms had a Playwright-produced reference summary to compare against.
+
+Test harness: `/tmp/wiki-test/run_test.py` (not checked in — single-shot validation
+script; recreate from this note + the branch fixtures if needed to re-run).
+
+## Headline results (parallel=6)
+
+| Metric | Result |
+|---|---:|
+| Successful runs | 75 / 75 |
+| Got Wikidata Q-ID | 65 / 75 (87%) |
+| Got Wikipedia summary | 72 / 75 (96%) |
+| Got Wikipedia full-text | 72 / 75 (96%) |
+| **Matches Playwright reference** | 38 / 39 |
+| Failures (crashes) | 0 |
+| Latency p50 / p95 | 1.77 s / 13.01 s |
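
The figures in the table are simple aggregates over the per-term records. The sketch below shows one way to compute them; the record keys (`qid`, `summary`, `latency_s`) are hypothetical, and the nearest-rank percentile is an assumption, since the note does not say which percentile method the harness used.

```python
import math


def hit_rate(hits: int, total: int) -> str:
    """Format a count the way the table does, e.g. '65 / 75 (87%)'."""
    return f"{hits} / {total} ({round(100 * hits / total)}%)"


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile of `values`."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(records: list[dict]) -> dict:
    """Aggregate per-term records into the headline metrics."""
    total = len(records)
    latencies = [r["latency_s"] for r in records]
    return {
        "wikidata": hit_rate(sum(bool(r["qid"]) for r in records), total),
        "summary": hit_rate(sum(bool(r["summary"]) for r in records), total),
        "p50_s": percentile(latencies, 50),
        "p95_s": percentile(latencies, 95),
    }
```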
+
+The single remaining miss (`pteryopharyngeal part of superior pharyngeal constrictor
+muscle`) is a misspelled label. In the Playwright run, its reference summary came from
+**agent step 4.2** (parent-article passage extraction), not from the Playwright skill
+itself. That step is independent of this skill and works identically with the new
+skill: call it on the parent label.
+
+## Issues found and fixed during validation
+
+1. **Rate-limit handling.** Wikimedia returned HTTP 429 once parallelism reached ~12.
+   Added exponential backoff, `Retry-After` honouring, and up to 5 retries on 429/5xx
+   in `_request`. After the fix, there were 0 crashes at parallel=6.
+2. **Wikidata `wbsearchentities` is strict.** Initial hit rate was 29% — many real
+   anatomy terms didn't match because Wikidata search insists on tight prefix +
+   word-order matches (e.g. `splenius capitus` typo, `respiratory diaphragm muscle`
+   → Wikipedia title is `Thoracic diaphragm`, `spermatic cord muscle` →
+   `Spermatic cord`).
+   Added two cascading fallbacks:
+   - Wikipedia `opensearch` (prefix match, handles typos)
+   - Wikipedia `list=search` (CirrusSearch full-text, catches redirects + alternate names)
+   When a fallback resolves a Wikipedia title, the skill reverse-looks-up the Q-ID
+   via `action=wbgetentities&sites=enwiki&titles=...` so the Wikidata block is still
+   populated.
+3. **Captions weren't on `media-list`.** The REST `page/media-list/{title}` endpoint
+   does NOT include caption text despite docs suggesting otherwise. Switched to
+   parsing `<figure>+<figcaption>` blocks from `page/html/{title}` instead.
+4. **macOS Homebrew Python SSL.** The default `urllib` SSL context on
+   Homebrew-Python doesn't trust system roots. Added a fallback that tries
+   `certifi`, then `$SSL_CERT_FILE`, then common Homebrew/OS bundle paths.
+
+## Operational guidance
+
+- **Safe parallelism**: tested clean at 6. Likely fine up to ~10 with the retry
+  logic; past that, observed p95 latency climbs because of rate-limit retries.
+- **Reverse lookup is cheap**: Wikipedia title → Q-ID via `wbgetentities` is one
+  extra HTTP call per fallback hit; ~+0.3 s.
+- **3 remaining test misses** are all misspellings (`pteryopharyngeal`,
+  `compartmet`, `puboperineales`) — the curator should flag these as
+  `name_corrections` rather than relying on the wiki lookup.
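
One way to hold parallelism at the tested level is a bounded thread pool around the helper script. The sketch below assumes `--json` prints a single JSON document on stdout; `fetch_via_cli` and `run_parallel` are hypothetical names.

```python
import json
import subprocess
from concurrent.futures import ThreadPoolExecutor

SCRIPT = ".claude/skills/fetch-wiki-info-api/fetch_wiki_info.py"


def fetch_via_cli(label: str) -> dict:
    """Run the helper for one label and parse its --json output."""
    done = subprocess.run(["python3", SCRIPT, label, "--json"],
                          capture_output=True, text=True, check=True)
    return json.loads(done.stdout)


def run_parallel(labels, fetch=fetch_via_cli, max_workers: int = 6) -> list:
    """Apply `fetch` to every label with at most `max_workers` calls in flight.

    Results come back in input order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, labels))
```

Threads are sufficient here because each worker spends its time waiting on a subprocess and the network.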
+
+## How to re-validate
+
+1. Check out the `add-hra-muscular-ntr` branch (or any branch with finished
+   Stage 3 outputs).
+2. Collect unique labels from `bulk_ntr_workflow/outputs/definitions/input/*.json`.
+3. Run the skill helper (`fetch_wiki_info.py <label> --json`) on each, in parallel.
+4. Compare `wikipedia.summary` field against the Playwright run's
+   `confirmed_matches[*].wikipedia_summary` in
+   `bulk_ntr_workflow/outputs/definitions/*.json`.
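
Steps 2 and 4 can be sketched as below. Only the `wikipedia.summary` and `confirmed_matches[*].wikipedia_summary` paths come from this note; the comparison rule (case- and whitespace-insensitive equality) is an assumption, since the note does not record how a "match" was judged, and all helper names are hypothetical.

```python
def normalize(text: str) -> str:
    return " ".join(text.lower().split())


def dedupe_labels(labels: list[str]) -> list[str]:
    """Keep the first spelling of each label, ignoring case and extra spaces."""
    seen: dict[str, str] = {}
    for label in labels:
        seen.setdefault(normalize(label), label)
    return list(seen.values())


def reference_summaries(playwright_doc: dict) -> list[str]:
    """Non-empty confirmed_matches[*].wikipedia_summary values."""
    return [m["wikipedia_summary"]
            for m in playwright_doc.get("confirmed_matches", [])
            if m.get("wikipedia_summary")]


def skill_summary(skill_doc: dict) -> str:
    """The wikipedia.summary field of the skill's --json output, or ''."""
    return (skill_doc.get("wikipedia") or {}).get("summary") or ""


def matches_reference(skill_doc: dict, playwright_doc: dict) -> bool:
    """True when the skill summary equals any reference summary after normalizing."""
    got = normalize(skill_summary(skill_doc))
    return bool(got) and any(normalize(ref) == got
                             for ref in reference_summaries(playwright_doc))
```

A looser rule, such as comparing only the first sentence, may be more robust if Wikipedia text has changed between runs.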
+
+A skill regression should show up as a drop in either the per-term hit rate or the
+Playwright-reference match count.