docs(zeb-139): KL-retrofit findings — attractor holds (cheap-win confound)#258
jenglund wants to merge 2 commits into
…-win confound)

ZEB-139 4-cell matrix run on TinyLlama oracle + sidecar produced by PR #255.

KL+CE training (λ=0.5) at cells 3+4 did NOT escape the maximum-entropy attractor. cross_run_cos between real-oracle and shuffled-oracle KL+router cells = +0.9999, the smoking gun for the cheap-win confound: KL forces both routers to the same content-independent average distribution rather than learning per-position content routing.

Per spec §11 outer matrix: ZEB-139 contribution is "Holds". Combined with ZEB-138's pending verdict, this points to either teacher-arch dominance (if ZEB-138 breaks) or the structural-ceiling steelman (if ZEB-138 also holds).

Sanity checks all passed:
- Cell 1+2 (no-router baseline) reproduces ZEB-136's val_loss to within 0.0002 (4.5546 vs ZEB-136's 4.5546 / 4.5544)
- Oracle PCA explained_variance_ratio_total = 0.9338690864205668, bit-identical to ZEB-136's stored value (proves the GPU-side index_add_ accumulator from PR #255's perf fix produces the same Welford means as the original CPU path)
- Sidecar shape (10000, 32000) bf16, 10000/10000 rows populated, shape-matched to engram_table

Operational note in the doc: first matrix attempt failed at cell 3 because the local main repo dir was on a stale branch (zeblith/zeb-138-same-arch-teacher); the venv's ct87 editable install therefore imported a train.py without the new flags. Resolved by git checkout main && git pull, then re-ran cells 3+4 only (each cell initializes independently, so no chaining was lost). Doc captures the recipe to avoid recurrence.

Spec doc STATUS line updated from "DRAFT — blocked on PR #254" to "COMPLETE — see findings".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
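To make the cheap-win signature concrete, here is a minimal sketch in plain Python. It is not the project's forensic probe: `cross_run_cos` is assumed here to be the mean per-position cosine between two runs' engram logits, and the toy vectors are invented for illustration. If both routers emit the same content-independent vector at every position, the metric sits near 1 regardless of which oracle each run was trained on.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cross_run_cos(logits_a, logits_b):
    """Mean per-position cosine between two runs (assumed definition)."""
    pairs = list(zip(logits_a, logits_b))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)

# Toy data: 3 positions, vocabulary of 4 (hypothetical values).
corpus_average = [2.0, 1.0, 0.5, 0.1]

# Content-blind routers: every position gets (almost) the same vector.
blind_real = [corpus_average, corpus_average, corpus_average]
blind_shuf = [[x + 0.01 for x in corpus_average]] * 3

# Content-dependent routers: each position points somewhere different,
# and the shuffled run sees those targets in a different order.
routed_real = [[3.0, 0.0, 0.0, 0.0], [0.0, 3.0, 0.0, 0.0], [0.0, 0.0, 3.0, 0.0]]
routed_shuf = [routed_real[1], routed_real[2], routed_real[0]]

blind_score = cross_run_cos(blind_real, blind_shuf)
routed_score = cross_run_cos(routed_real, routed_shuf)
print(f"content-blind:     {blind_score:+.4f}")
print(f"content-dependent: {routed_score:+.4f}")
```

In this toy setup a router that actually used per-position content would score far from 1 against its shuffled twin, which is why a value of +0.9999 is read as content-blindness.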
📝 Walkthrough
This PR adds a new findings document reporting the ZEB-139 “KL-Retrofit Objective-Axis” experiment and updates the design spec status from DRAFT to COMPLETE, recording quantitative results (including cross_run_cos ≈ +0.9999), experiment matrix, forensic metrics, and a λ=0.9 closure run.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
Review Summary by Qodo
ZEB-139 KL-retrofit findings: attractor holds, cheap-win confound confirmed

Description
• Comprehensive findings document for ZEB-139 KL-retrofit experiment on TinyLlama
• Confirms attractor HOLDS at λ=0.5 with cross_run_cos=+0.9999 indicating cheap-win confound
• KL forces both real and shuffled routers to identical content-independent average distribution
• Spec status updated from DRAFT to COMPLETE with verdict locked in for outer matrix

Diagram
flowchart LR
A["ZEB-139 Experiment<br/>4-cell matrix<br/>KL+CE training"] --> B["Cell 3+4 Results<br/>val_loss: 4.5636 vs 4.5637<br/>Δ-diff: -0.0001 nats"]
B --> C["Forensic Analysis<br/>cross_run_cos: +0.9999<br/>KL trajectory: 1.27→1.22"]
C --> D["Verdict: Attractor HOLDS<br/>Cheap-win confound confirmed<br/>KL forces average distribution"]
D --> E["Spec §11 Matrix<br/>Awaiting ZEB-138 result<br/>Determines next axis"]
File Changes
1. docs/findings/2026-04-19-zeb-139-kl-retrofit.md
Code Review by Qodo
1. Local-only artifact references
User description

Summary
Findings doc + spec status update for ZEB-139. The 4-cell matrix run completed; verdict is attractor HOLDS under λ=0.5 KL+CE on the cross-arch TinyLlama setup.

TL;DR

What this contributes to the bigger picture
Per spec §11 outer matrix, ZEB-139 fills the "KL-retrofit Holds" row. ZEB-138 (same-arch teacher, CE-only) is the orthogonal axis still pending KRILE's Harmony-474M handoff. Once both rows land:

Bonus diagnostic (worth flagging)
Notably,

Operational note (worth capturing for next operator)
First matrix attempt failed at cell 3 because the local main repo dir was checked out on the stale

Test plan
Doc-only PR; no code changes, no tests to run.

🤖 Generated with Claude Code

Note: Low Risk
Overview: Updates the ZEB-139 design spec status from draft to complete, linking to the findings and summarizing the locked-in verdict for the spec §11 outcome matrix pending ZEB-138.
Reviewed by Cursor Bugbot for commit 9900737.

CodeAnt-AI Description
Mark the ZEB-139 spec as complete and add the final findings
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/findings/2026-04-19-zeb-139-kl-retrofit.md`:
- Line 61: The sentence claiming "to 4 decimal places" is inaccurate for the
comparison 4.5546 vs 4.5544; update the wording in that sentence (the one that
mentions 4.5546 / 4.5544) to either "within 0.0002" or "to 3 decimal places"
(e.g., replace "to 4 decimal places" with "within 0.0002") so the baseline
reproducibility claim is numerically precise.
📒 Files selected for processing (2)
docs/findings/2026-04-19-zeb-139-kl-retrofit.md
docs/superpowers/specs/2026-04-18-zeb-139-kl-retrofit-design.md
🧰 Additional context used
🪛 LanguageTool
docs/findings/2026-04-19-zeb-139-kl-retrofit.md
[style] ~87-~87: Consider using a different adverb to strengthen your wording.
Context: ...ator: when two router models trained on completely different per-position teacher targets ...
(COMPLETELY_ENTIRELY)
🔇 Additional comments (2)
docs/superpowers/specs/2026-04-18-zeb-139-kl-retrofit-design.md (1)
3-3: Status update and quantitative verdict are clear and well-anchored. This succinctly captures completion state, key metric (`cross_run_cos`), and how the result maps into §11.
docs/findings/2026-04-19-zeb-139-kl-retrofit.md (1)
79-90: Fingerprint interpretation and cheap-win confound conclusion are well-supported. The threshold table plus `cross_run_cos = +0.9999` provides a clear, evidence-based discriminator outcome.
**Two observations from the matrix alone**:

1. The router-off baseline reproduces ZEB-136's cells 1+2 to 4 decimal places (4.5546 vs ZEB-136's 4.5546 / 4.5544). Sanity check passes — the data path and frozen-backbone init are unchanged.
Fix precision wording in the baseline reproducibility claim.
Line 61 says “to 4 decimal places,” but 4.5546 vs 4.5544 differs at the 4th decimal. Suggest rewording to “within 0.0002” or “to 3 decimal places” for exactness.
✏️ Proposed doc fix
-1. The router-off baseline reproduces ZEB-136's cells 1+2 to 4 decimal places (4.5546 vs ZEB-136's 4.5546 / 4.5544). Sanity check passes — the data path and frozen-backbone init are unchanged.
+1. The router-off baseline closely reproduces ZEB-136's cells 1+2 (4.5546 vs ZEB-136's 4.5546 / 4.5544; max Δ=0.0002). Sanity check passes — the data path and frozen-backbone init are unchanged.
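The disagreement is easy to check numerically. A small sketch (the helper name `agrees_to_decimals` is made up for this example) compares the reported values at a given precision, using `decimal` to avoid float noise and its default round-half-even mode:

```python
from decimal import Decimal

def agrees_to_decimals(a: str, b: str, places: int) -> bool:
    """True if a and b are identical after rounding to `places` decimals."""
    quantum = Decimal(1).scaleb(-places)  # e.g. places=4 -> 0.0001
    return Decimal(a).quantize(quantum) == Decimal(b).quantize(quantum)

baseline = "4.5546"            # this run, cells 1+2
zeb136 = ["4.5546", "4.5544"]  # ZEB-136 cells 1 and 2

max_delta = max(abs(Decimal(baseline) - Decimal(v)) for v in zeb136)
four_ok = all(agrees_to_decimals(baseline, v, 4) for v in zeb136)
three_ok = all(agrees_to_decimals(baseline, v, 3) for v in zeb136)
print(max_delta, four_ok, three_ok)
```

Under rounding, 4.5546 and 4.5544 also land on different 3-decimal values (4.555 vs 4.554), so "within 0.0002" is the less ambiguous of the two suggested wordings.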
…closed
Per spec §12 q1, ran cells 3+4 again at --kl-lambda 0.9 to nail down
the λ-sensitivity signal before retiring the KL-only experimental
axis at 40M. Same setup as λ=0.5 except for the λ value and output
paths; ~30 min wall time.
Headline numbers:
  λ         Cell 3 (real)  Cell 4 (shuf)  Δ-diff    vs ZEB-136
  0 (none)  4.5545         4.5543         +0.0002   —
  0.5       4.5636         4.5637         −0.0001   +0.009
  0.9       4.5907         4.5912         −0.0005   +0.036
Two clean monotonic patterns:
1. Higher λ → val_loss strictly worse. KL pressure increasingly
hurts the LM objective.
2. Δ-diff stays at noise across all λ values. No content-dependence
emerges no matter how hard we crank KL.
Forensic fingerprint (skip-to-logit probe at λ=0.9):
  cross_run_cos engram_logits = +1.0000  (was +0.9999 at λ=0.5)
  max LM-head row |cos|       = 0.9257   (was 0.78 at λ=0.5)
  ||W_align||_F               = 0.58     (was 1.35 at λ=0.5)
  engram_logit_entropy        = 10.3039  (was 10.3467; still well above the 10.27 break threshold)
  alpha                       = 0.1762   (saturated, λ-independent above 0.5)
cross_run_cos = +1.0000 between real-oracle and shuffled-oracle cells
at λ=0.9 is the dispositive cheap-win signature. Higher KL pressure
intensifies the lever rather than escapes the attractor.
Curious side observation: g5 (L5 engram gate alpha) flipped sign
between λ=0.5 (+0.40) and λ=0.9 (-0.41). Different optimization
regime, same content-blind destination — suggests the "match the
corpus average" attractor is robust across optimizer trajectories.
Doc updates:
- TL;DR mentions both λ values now; net verdict unchanged
- Open question §1 (λ-sweep) marked DONE, points at the new section
- New section "λ=0.9 closure run" with full λ-sweep matrix, fingerprint
comparison, and the optimization-regime observation
- Artifacts section lists the new λ=0.9 checkpoints, CSVs, forensic
output, and run script
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
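The two derived columns in the headline table can be re-derived from the four-decimal val_loss values. This is only a sketch: it assumes Δ-diff means cell 3 minus cell 4, and that "vs ZEB-136" means cell 3 minus the λ=0 cell 3 value. Both readings match the printed numbers but are not spelled out in the commit message.

```python
from decimal import Decimal

# val_loss per λ as (cell 3 real, cell 4 shuffled), copied from the table.
sweep = {
    "0": (Decimal("4.5545"), Decimal("4.5543")),
    "0.5": (Decimal("4.5636"), Decimal("4.5637")),
    "0.9": (Decimal("4.5907"), Decimal("4.5912")),
}

baseline_real = sweep["0"][0]
rows = []
for lam, (real, shuf) in sweep.items():
    delta_diff = real - shuf            # real-vs-shuffled gap
    vs_baseline = real - baseline_real  # cost relative to the CE-only run
    rows.append((lam, delta_diff, vs_baseline))
    print(f"λ={lam:<4} Δ-diff={delta_diff:+.4f}  vs λ=0: {vs_baseline:+.4f}")

# Pattern 1: val_loss rises strictly with λ.
real_losses = [sweep[k][0] for k in ("0", "0.5", "0.9")]
strictly_worse = all(a < b for a, b in zip(real_losses, real_losses[1:]))

# Pattern 2: the real-vs-shuffled gap never exceeds 0.0005.
max_gap = max(abs(d) for _, d, _ in rows)
```

The unrounded differences are +0.0091 and +0.0362, consistent with the +0.009 and +0.036 shown in the table.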
Actionable comments posted: 1
♻️ Duplicate comments (1)
docs/findings/2026-04-19-zeb-139-kl-retrofit.md (1)
63-63: ⚠️ Potential issue | 🟡 Minor
Precision wording is still numerically inaccurate.
“to 4 decimal places” is not correct for `4.5546` vs `4.5544`; use “within 0.0002” (or “to 3 decimal places”).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/findings/2026-04-19-zeb-139-kl-retrofit.md`:
- Around line 218-229: Replace the hard-coded absolute paths under
/home/zebli/work/LOCAL/zeb139/ with repo-relative paths (e.g., artifacts/... ,
logs/... , checkpoints/... , forensics/... , scripts/...) or use an
environment-placeholder like ${ZEB139_ROOT} for the top-level directory
referenced in this document; update the listed entries such as
artifacts/oracle_tinyllama_10k.safetensors,
artifacts/oracle_tinyllama_10k_teacher_logits.safetensors, logs/run{1..4}_*.csv,
checkpoints/zeb139_router_{off,on}_..., forensics/router_on_kl.txt and scripts/*
accordingly and add one short note that a local mount point (e.g.,
${ZEB139_ROOT} -> /home/zebli/work/LOCAL/zeb139) may be required for reproducing
locally.
---
Duplicate comments:
In `@docs/findings/2026-04-19-zeb-139-kl-retrofit.md`:
- Line 63: The wording "to 4 decimal places" is numerically incorrect for the
values 4.5546 vs 4.5544; edit the sentence that currently reads "to 4 decimal
places" (the router-off baseline reproduction line) and replace it with either
"within 0.0002" or "to 3 decimal places" so the precision claim matches the
actual numeric difference.
📒 Files selected for processing (1)
docs/findings/2026-04-19-zeb-139-kl-retrofit.md
All under `/home/zebli/work/LOCAL/zeb139/`:

- **Oracle**: `artifacts/oracle_tinyllama_10k.safetensors` (4.9 MB, [10K, 128] f32) and `_shuffled_seed0.safetensors`
- **Teacher-logits sidecar**: `artifacts/oracle_tinyllama_10k_teacher_logits.safetensors` (611 MB, [10K, 32K] bf16) and `_shuffled_seed0_teacher_logits.safetensors`
- **Stats**: `artifacts/oracle_tinyllama_10k.safetensors.stats.json` (PCA explained variance, populated rows, hash seeds)
- **Per-cell training logs (CSV)**: `logs/run{1..4}_*.csv` (λ=0.5) and `logs/run{3,4}_router_on_{real,shuf}_kl09.csv` (λ=0.9 closure), 200 rows × 36 columns each, including the `kl_loss` column
- **Per-cell checkpoints**: `checkpoints/zeb139_router_{off,on}_{real,shuf}{,_kl,_kl09}/checkpoint.pt`
- **Forensic outputs**: `forensics/router_off_no_kl.txt` (full 10-probe battery, pair A), `forensics/router_on_kl.txt` (skip-to-logit diagnostics, λ=0.5), `forensics/router_on_kl09.txt` (skip-to-logit diagnostics, λ=0.9 closure)
- **Scripts**: `scripts/shuffle_oracle_and_sidecar.py`, `scripts/run_4cell_matrix.sh` (cells 1-4), `scripts/run_cells_3_and_4.sh` (re-run after stale-checkout fix), `scripts/run_cells_3_and_4_lambda09.sh` (λ=0.9 closure), `scripts/run_forensics.sh`

ZEB-136's prior forensics (`/home/zebli/work/LOCAL/zeb136/forensics/router_on.txt`) are the direct comparison point for the ZEB-139 (KL+CE) vs ZEB-136 (CE-only) contrast.
🧹 Nitpick | 🔵 Trivial
Use repo-relative or environment-agnostic artifact paths.
Hard-coded local absolute paths make the findings harder to reproduce for other operators. Prefer repo-relative paths (or a ${ZEB139_ROOT} placeholder) and one short note for the local mount.
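One way to follow this suggestion, sketched in Python: resolve every artifact against a root taken from a `ZEB139_ROOT` environment variable. The variable name comes from the review comment; the helpers `zeb139_root` and `artifact_path` and the fallback to the current directory are assumptions made for this example, not something the repo is known to provide.

```python
import os
from pathlib import Path

def zeb139_root() -> Path:
    """Root directory for ZEB-139 artifacts; falls back to the CWD."""
    return Path(os.environ.get("ZEB139_ROOT", "."))

def artifact_path(relative: str) -> Path:
    """Join a repo-relative artifact path onto the configured root."""
    return zeb139_root() / relative

# Example: point the root at a local mount, then resolve documented paths.
os.environ["ZEB139_ROOT"] = "/home/zebli/work/LOCAL/zeb139"
oracle = artifact_path("artifacts/oracle_tinyllama_10k.safetensors")
forensics = artifact_path("forensics/router_on_kl.txt")
print(oracle)
print(forensics)
```

With this shape the findings doc can list only the relative paths, and each operator sets the root once for their own machine.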
Sequence Diagram
This PR documents the completed KL-retrofit experiment: generating teacher-logits sidecars, running a 4-cell matrix with router on and off under KL+CE, and concluding that the engram attractor holds and the KL-only axis at 40M is closed.
sequenceDiagram
participant Researcher
participant ExperimentRunner
participant TeacherModel
participant Metrics
Researcher->>ExperimentRunner: Launch KL retrofit runs with lambda values
ExperimentRunner->>TeacherModel: Generate oracle and teacher logits sidecar
ExperimentRunner->>ExperimentRunner: Train 4 cell matrix (router off and on, real and shuffled)
ExperimentRunner->>Metrics: Collect validation loss and router fingerprints
Metrics-->>ExperimentRunner: Report matching real and shuffled behavior and high cross run cosine
ExperimentRunner-->>Researcher: Conclude attractor holds and KL only axis is closed
Generated by CodeAnt AI |
Sequence Diagram
This PR documents the completed KL-retrofit experiment: generating teacher logits, running the 4-cell KL+CE training matrix, probing router behavior, and concluding that the maximum-entropy attractor still holds.
sequenceDiagram
participant Researcher
participant OracleJob
participant TrainingMatrix
participant Forensics
participant SpecDoc
Researcher->>OracleJob: Generate oracle and teacher logits sidecar
Researcher->>TrainingMatrix: Run 4-cell KL plus CE training with real and shuffled sidecars
TrainingMatrix-->>Forensics: Emit val loss and router outputs for all cells
Forensics->>Forensics: Compute cross run cosine and fingerprint metrics
Forensics-->>Researcher: Verdict attractor holds under KL
Researcher->>SpecDoc: Update spec status and record ZEB-139 findings
Generated by CodeAnt AI |
| `engram_logit_entropy` | < log(V) − 0.1 = 10.27 | 10.3735 (= log V) | **10.3467** | **HOLDS** (Δ from log V = 0.027, well above the 0.1 threshold for "broken") |
| `α` | outside [0.14, 0.20] | 0.1644 | **0.1762** | **HOLDS** (still inside attractor band) |
🟠 Architect Review — HIGH
The entropy-threshold explanation is directionally wrong: the table marks "broken if engram_logit_entropy < log(V) − 0.1 = 10.27", but the narrative calls Δ from log V = 0.027 "well above" the 0.1 break threshold, inverting the inequality and mis-stating what counts as a break.
Suggestion: Reword the verdict text so it correctly states that Δ=0.027 is well below the 0.1 break threshold (or equivalently that entropy must stay >10.27 to hold) and apply the same convention consistently, including in the λ=0.9 entropy row.
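The inequality can be checked directly. A minimal sketch, assuming V = 32000 (the sidecar's vocabulary dimension) and natural-log entropy, with the break rule taken from the table: the attractor counts as broken only if entropy drops below log(V) − 0.1.

```python
import math

V = 32000                  # vocab size, from the (10000, 32000) sidecar
max_entropy = math.log(V)  # entropy of the uniform distribution, in nats
break_threshold = max_entropy - 0.1

def verdict(entropy: float) -> str:
    """BREAKS if entropy fell below the threshold, otherwise HOLDS."""
    return "BREAKS" if entropy < break_threshold else "HOLDS"

observed = {"λ=0.5": 10.3467, "λ=0.9": 10.3039}
results = {}
for label, entropy in observed.items():
    gap = max_entropy - entropy  # how far below log(V) the router sits
    results[label] = (round(gap, 3), verdict(entropy))
    print(f"{label}: gap={gap:.3f} (needs > 0.1 to break) -> {results[label][1]}")
```

Both gaps come out below 0.1, so the verdict HOLDS is right and only the wording is inverted: Δ is well below the break threshold, or equivalently the entropy is still above 10.27.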
**Path:** docs/findings/2026-04-19-zeb-139-kl-retrofit.md
**Line:** 83:84
Sequence Diagram
This PR documents the completed ZEB-139 KL-retrofit experiment, where KL+CE training with teacher logits is run and analyzed, concluding the attractor still holds and updating the design spec status to complete.
sequenceDiagram
participant Researcher
participant TrainingPipeline
participant MetricsProbe
participant SpecDoc
Researcher->>TrainingPipeline: Run ZEB-139 KL retrofit matrix with teacher logits
TrainingPipeline-->>MetricsProbe: Output losses and router traces for all cells
MetricsProbe->>MetricsProbe: Compare real vs shuffled cells and compute fingerprints
MetricsProbe-->>Researcher: Conclude attractor holds and KL only axis closed
Researcher->>SpecDoc: Record findings and mark spec complete
Generated by CodeAnt AI |
Sequence Diagram

This PR documents the completed ZEB-139 experiment, where KL+CE training is applied to router cells with real and shuffled teacher logits to test whether the engram attractor can be broken. The findings show both cells collapse to the same content-blind distribution, so the attractor holds and the KL-only axis is closed.

sequenceDiagram
participant Researcher
participant Experiment
participant TeacherModel
participant Router
participant Metrics
Researcher->>Experiment: Launch ZEB-139 KL+CE 4-cell matrix
Experiment->>TeacherModel: Load oracle and teacher logits sidecars (real and shuffled)
Experiment->>Router: Train router with CE and KL router teacher objective
Experiment->>Router: Run real oracle cell and shuffled oracle cell
Router-->>Metrics: Report router outputs and validation losses
Metrics-->>Researcher: cross_run_cos near 1 and matched losses (attractor holds, KL axis closed)
Generated by CodeAnt AI
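The `cross_run_cos` discriminator reported in this PR can be illustrated with a minimal sketch. The exact definition used by the experiment is not given here, so this assumes one plausible reading: the mean per-position cosine similarity between two runs' logit rows on the same inputs. The function names and toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def cross_run_cos(logits_a, logits_b):
    """Mean per-position cosine between two runs (assumed definition)."""
    if len(logits_a) != len(logits_b):
        raise ValueError("runs must cover the same positions")
    total = sum(cosine(a, b) for a, b in zip(logits_a, logits_b))
    return total / len(logits_a)


# Toy "content-blind" case: both runs emit the same direction at every
# position (run B is just a rescaled copy), so the score is ~1.0.
blind_a = [[1.0, 2.0, 3.0]] * 4
blind_b = [[2.0, 4.0, 6.0]] * 4

# Toy "content-dependent" case: the runs disagree per position,
# so the score drops (here to 0.0, orthogonal rows).
dep_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
dep_b = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
```

Under this reading, a score near 1 between the real-oracle and shuffled-oracle cells means the router output does not depend on which oracle content it was given.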
Summary
Findings doc + spec status update for ZEB-139. The 4-cell matrix run completed; verdict is attractor HOLDS under λ=0.5 KL+CE on the cross-arch TinyLlama setup.
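For readers unfamiliar with the objective, here is a minimal single-position sketch of a KL+CE loss. The PR does not state how λ combines the terms or which KL direction is used, so both are assumptions here: `loss = CE + lam * KL(teacher ‖ student)`. All function names are hypothetical and unrelated to the repo's `train.py`.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def cross_entropy(student_logits, target):
    """Negative log-likelihood of the target token index."""
    return -log_softmax(student_logits)[target]


def forward_kl(teacher_logits, student_logits):
    """KL(teacher || student); the direction is an assumption."""
    log_t = log_softmax(teacher_logits)
    log_s = log_softmax(student_logits)
    return sum(math.exp(t) * (t - s) for t, s in zip(log_t, log_s))


def kl_ce_loss(student_logits, teacher_logits, target, lam=0.5):
    """Assumed combination: CE plus lam-weighted KL."""
    ce = cross_entropy(student_logits, target)
    kl = forward_kl(teacher_logits, student_logits)
    return ce + lam * kl
```

Under this forward-KL assumption, a position-independent prediction equal to the average teacher distribution already minimizes the mean KL among constant predictions, which is the kind of content-blind solution the cheap-win confound describes.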
TL;DR
`cross_run_cos engram_logits = +0.9999` between cells 3 and 4. Smoking gun for the cheap-win confound: KL forces both routers to the same content-independent average distribution.

Forensic metrics: `engram_logit_entropy`, `α`, `W_align ‖·‖_F`

What this contributes to the bigger picture
Per spec §11 outer matrix, ZEB-139 fills the "KL-retrofit Holds" row. ZEB-138 (same-arch teacher, CE-only) is the orthogonal axis still pending KRILE's Harmony-474M handoff. Once both rows land:
… `W_align` or backbone unfreezing

Bonus diagnostic (worth flagging)
Notably, `max LM-head row cos` jumped from 0.22 (ZEB-136 without KL) to 0.78 (ZEB-139 with KL). The router IS aligning with vocab directions; it is just aligning ALL positions with the SAME average direction (`cross_run_cos` ≈ 1.0). The KL term is doing what it was designed to do (push the router toward the teacher's distribution); the teacher's distribution just turns out to be roughly position-independent at the corpus average, so the result is content-blind. A future variant could try a temperature on the router-side softmax (spec §12 q2) or a per-token KL mask that down-weights frequent-token positions to force per-position attention.

Operational note (worth capturing for next operator)
The first matrix attempt failed at cell 3 because the local main repo dir was checked out on the stale `zeblith/zeb-138-same-arch-teacher` branch (predates PR #257 by several commits). The venv's `ct87` editable install therefore imported a `train.py` without the new `--engram-skip-to-logit` / `--kl-lambda` flags. Cells 1+2 succeeded incidentally (the no-router code path is identical across branches). Resolved by `git checkout main && git pull`, then re-ran cells 3+4 only (each cell initializes independently from `zeta_ctrl_2048`, so no chaining was lost). The doc captures the recipe so the next operator doesn't repeat the mistake.

Test plan
`forensics/router_on_kl.txt`, ZEB-136 baselines from `/home/zebli/work/LOCAL/zeb136/`)

Doc-only PR; no code changes, no tests to run.
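The stale-branch failure described in the operational note could be caught before launching a matrix with a small pre-flight check. This is a sketch under stated assumptions: `help_cmd` is a hypothetical command that prints the training script's `--help` text (the real invocation is not given in this PR), and the flag check is a plain substring match. The only external call assumed to exist is `git rev-parse --abbrev-ref HEAD`.

```python
import subprocess

# Flags named in the operational note; missing ones indicate a stale checkout.
REQUIRED_FLAGS = ("--engram-skip-to-logit", "--kl-lambda")


def missing_flags(help_text, required=REQUIRED_FLAGS):
    """Return the required flags that do not appear in the help text."""
    return [flag for flag in required if flag not in help_text]


def current_branch(repo_dir):
    """Name of the branch checked out in repo_dir."""
    result = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def preflight(help_cmd, repo_dir, expected_branch="main"):
    """Raise before any cell runs if the checkout looks stale.

    help_cmd is a hypothetical argv list that prints the trainer's help.
    """
    branch = current_branch(repo_dir)
    if branch != expected_branch:
        raise RuntimeError(f"repo is on {branch!r}, expected {expected_branch!r}")
    help_text = subprocess.run(
        help_cmd, capture_output=True, text=True, check=True
    ).stdout
    missing = missing_flags(help_text)
    if missing:
        raise RuntimeError(f"trainer is missing flags: {missing}")
```

Checking the help text as well as the branch matters here because the editable install imports whatever `train.py` is on disk, so a correct branch name alone would not prove the new flags are present.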
🤖 Generated with Claude Code
Note
Low Risk
Low risk because this PR only adds/updates documentation and does not change runtime code paths or data handling.
Overview
Adds a new findings writeup, `docs/findings/2026-04-19-zeb-139-kl-retrofit.md`, documenting the completed ZEB-139 4-cell KL+CE experiment (including a λ=0.9 closure run) and its key outcome: the maximum-entropy attractor holds with evidence of a content-independent “cheap-win” collapse.

Updates the ZEB-139 design spec `docs/superpowers/specs/2026-04-18-zeb-139-kl-retrofit-design.md` status from draft/blocked to complete, linking to the findings and summarizing the final verdict + discriminator metric.

Reviewed by Cursor Bugbot for commit b4a371b.
Summary by CodeRabbit