From 99007376b2a9aab2e8f12c18627d970ec1d3ce28 Mon Sep 17 00:00:00 2001
From: Jake Englund <zeblith@gmail.com>
Date: Sun, 19 Apr 2026 21:50:15 -0700
Subject: [PATCH 1/2] =?UTF-8?q?docs(zeb-139):=20KL-retrofit=20findings=20?=
 =?UTF-8?q?=E2=80=94=20attractor=20holds=20at=20=CE=BB=3D0.5=20(cheap-win?=
 =?UTF-8?q?=20confound)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

ZEB-139 4-cell matrix run on TinyLlama oracle + sidecar produced by
PR #255. KL+CE training (λ=0.5) at cells 3+4 did NOT escape the
maximum-entropy attractor. cross_run_cos between real-oracle and
shuffled-oracle KL+router cells = +0.9999 — the smoking gun for the
cheap-win confound: KL forces both routers to the same
content-independent average distribution rather than learning
per-position content routing.

Per spec §11 outer matrix: ZEB-139 contribution is "Holds". Combined
with ZEB-138's pending verdict, points to either teacher-arch
dominance (if ZEB-138 breaks) or the structural-ceiling steelman
(if ZEB-138 also holds).

Sanity checks all passed:

- Cell 1+2 (no-router baseline) reproduces ZEB-136's val_loss to within
  0.0002 nats (4.5546 vs ZEB-136's 4.5546 / 4.5544)
- Oracle PCA explained_variance_ratio_total = 0.9338690864205668,
  bit-identical to ZEB-136's stored value (proves the GPU-side
  index_add_ accumulator from PR #255's perf fix produces the same
  Welford means as the original CPU path)
- Sidecar shape (10000, 32000) bf16, 10000/10000 rows populated,
  shape-matched to engram_table

Operational note in the doc: first matrix attempt failed at cell 3
because the local main repo dir was on a stale branch
(zeblith/zeb-138-same-arch-teacher); the venv's ct87 editable install
therefore imported a train.py without the new flags. Resolved by
git checkout main && git pull, then re-ran cells 3+4 only (each cell
initializes independently, so no chaining was lost). Doc captures the recipe
to avoid recurrence.

Spec doc STATUS line updated from "DRAFT — blocked on PR #254"
to "COMPLETE — see findings".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 .../2026-04-19-zeb-139-kl-retrofit.md         | 166 ++++++++++++++++++
 .../2026-04-18-zeb-139-kl-retrofit-design.md  |   2 +-
 2 files changed, 167 insertions(+), 1 deletion(-)
 create mode 100644 docs/findings/2026-04-19-zeb-139-kl-retrofit.md
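
Note on the cheap-win argument in the findings doc (why real and shuffled sidecars give a content-blind router the same KL target): the following is a toy, stdlib-only sketch, not the repo's training code. All names (`mean_kl`, `geometric_mean_distribution`) and the toy sizes are illustrative assumptions. It models a router that emits one distribution `q` for every row and shows that the `q` minimising the mean `KL(q || p_row)` is the same whether or not the rows are permuted.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl(q, p):
    """KL(q || p) in nats; both inputs are strictly positive distributions."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p))


def mean_kl(q, rows):
    """Mean KL(q || row) when the router emits the same q for every row."""
    return sum(kl(q, row) for row in rows) / len(rows)


def geometric_mean_distribution(rows):
    """Normalised geometric mean of the rows: the q minimising mean_kl(q, rows)."""
    vocab = len(rows[0])
    mean_log = [sum(math.log(row[v]) for row in rows) / len(rows) for v in range(vocab)]
    return softmax(mean_log)


rng = random.Random(0)
VOCAB, ROWS = 8, 50  # toy sizes, not the real [10000, 32000] sidecar
real_rows = [softmax([rng.gauss(0.0, 1.0) for _ in range(VOCAB)]) for _ in range(ROWS)]

# Shuffled cell: the same set of rows, stored at permuted indices.
perm = list(range(ROWS))
rng.shuffle(perm)
shuf_rows = [real_rows[i] for i in perm]

# Best content-blind output for each cell.
q_real = geometric_mean_distribution(real_rows)
q_shuf = geometric_mean_distribution(shuf_rows)

cosine = sum(a * b for a, b in zip(q_real, q_shuf)) / (
    math.sqrt(sum(a * a for a in q_real)) * math.sqrt(sum(b * b for b in q_shuf))
)
print(f"content-blind optimum cosine (real vs shuf): {cosine:+.4f}")
print(f"mean KL real={mean_kl(q_real, real_rows):.4f} shuf={mean_kl(q_shuf, shuf_rows):.4f}")
```

Because the permutation only reorders the rows, both cells share the same content-blind optimum and the same loss at that optimum, up to floating-point error. This is consistent with, but does not by itself prove, the reading of `cross_run_cos = +0.9999` given in the findings.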

diff --git a/docs/findings/2026-04-19-zeb-139-kl-retrofit.md b/docs/findings/2026-04-19-zeb-139-kl-retrofit.md
new file mode 100644
index 00000000..f474608a
--- /dev/null
+++ b/docs/findings/2026-04-19-zeb-139-kl-retrofit.md
@@ -0,0 +1,166 @@
+# ZEB-139 — KL-Retrofit Objective-Axis Experiment Findings
+
+**Date:** 2026-04-19
+**Linear:** [ZEB-139](https://linear.app/zeblith/issue/ZEB-139/kl-retrofit-experiment-objective-axis-diagnostic-for-engram-attractor)
+**Spec:** `docs/superpowers/specs/2026-04-18-zeb-139-kl-retrofit-design.md`
+**Foundation:** [PR #257](https://github.com/zeblithic/harmony/pull/257) (ZEB-134 revival + ZEB-139 KL+CE) and [PR #255](https://github.com/zeblithic/harmony/pull/255) (`--save-teacher-logits` sidecar producer)
+**Prior art:** [ZEB-134](https://linear.app/zeblith/issue/ZEB-134) (Skip-to-Logit, CE-only, attractor observed), [ZEB-136](https://linear.app/zeblith/issue/ZEB-136) (TinyLlama cross-arch, attractor observed)
+
+---
+
+## TL;DR
+
+**Adding a Memory-Decoder-style `KL(P_router || P_teacher)` term at λ=0.5 did NOT escape the maximum-entropy attractor on the cross-arch TinyLlama setup.** Both the real-oracle and shuffled-oracle KL+router cells converged to essentially identical val_loss (4.5636 vs 4.5637, Δ-diff = -0.0001 nats) and produced a router output with `cross_run_cos = +0.9999` between the two cells — the smoking gun for the cheap-win confound. KL forced both routers to the same content-independent average distribution rather than learning per-position content routing.
+
+**Per spec §11 outer matrix this is the "KL-retrofit attractor HOLDS" outcome.** Combined with whatever ZEB-138 produces on the orthogonal teacher-architecture axis, it points toward either teacher-arch dominance (if ZEB-138 breaks) or a structural ceiling at 40M (if ZEB-138 also holds — Gemini §7 steelman).
+
+---
+
+## Setup
+
+### Code prereqs (both merged to main 2026-04-19)
+
+- **PR #255**: `--save-teacher-logits` flag added to `generate_oracle_table.py`. Welford-means the teacher's full LM-head outputs (vocab=32000) keyed by the same xxhash row indices as the existing oracle. Sidecar is `[10K, 32K] bf16` ≈ 640 MB (about 610 MiB). Throughput recovered from a 24× regression via a GPU-side `index_add_` accumulator (CPU `np.add.at` on a `[10K, 32K] f64` master is fundamentally bandwidth-bound at ~10 GB/s).
+- **PR #257**: ZEB-134's `SkipToLogitEngramRouter` revived (W_align d_model→d_model + log_alpha scalar + frozen LM-head reuse), and ZEB-139's KL+CE wired into `train.py`: `--kl-lambda` + `--oracle-teacher-logits` flags, per-token-normalized `F.kl_div(log_p_teacher, log_p_router, log_target=True).sum(-1).mean()` (the `log_target=True` form gets the spec'd FORWARD KL `KL(P_router || P_teacher)` direction; PyTorch's `F.kl_div(input, target)` computes `KL(target || input)` per its `target * (log target − input)` formula).
+
+### Teacher-logits sidecar extraction
+
+Re-ran `generate_oracle_table.py --save-teacher-logits` on the same TinyLlama-1.1B teacher + 99M-token FineWeb-Edu-POC corpus that ZEB-136 used. Wall time **5.8 hours** at 4,771 tok/s sustained on the 5080 (~5% slower than ZEB-136's no-sidecar 5,017 tok/s baseline; the GPU-resident `[10K, 32K] f64` sum table + bf16→fp32 cast per chunk accounts for the gap).
+
+Sanity check: `pca_explained_variance_ratio_total = 0.9338690864205668`, **bit-identical to ZEB-136's stored value**. Confirms the GPU-side `SumAccumulatorTable`/`GpuSumAccumulatorTable` math is equivalent to the original CPU `WelfordTable` for the hidden-state path (proves the perf-optimization didn't change numerics).
+
+Shuffled artifacts were produced by applying a single `torch.randperm` permutation (seed 0) to BOTH the oracle (`engram.weight`) and the sidecar (`teacher_logits.weight`). Using the same permutation across both files means cell 4's per-position teacher target is scrambled consistently in both the engram-emb path AND the KL target path.
+
+### 4-cell matrix configuration
+
+Identical to ZEB-136's `run_4cell_matrix.sh` for cells 1+2 (router-off baselines), with `--engram-skip-to-logit --engram-skip-alpha-init 0.1 --kl-lambda 0.5 --oracle-teacher-logits …` added to cells 3+4.
+
+| Cell | Router | Oracle | Teacher-logits sidecar | KL term |
+| --- | --- | --- | --- | --- |
+| 1 | off | real | — | off |
+| 2 | off | shuf (seed=0) | — | off |
+| 3 | on | real | real | λ=0.5 |
+| 4 | on | shuf (seed=0) | shuf (seed=0, same perm) | λ=0.5 |
+
+Each cell initializes from `zeta_ctrl_2048/checkpoint.pt` (the same backbone-frozen baseline ZEB-136 used) with `--allow-partial-init` so the new `engram_skip_router.W_align` (zero) and `engram_skip_router.log_alpha` (= log(0.1)) start from their constructor's safe-init values. 2000 steps each, batch=4, seq=2048, bf16 mixed precision, `--engram-vcontrast` + `--engram-qdiv` aux losses still active per spec §4.2.
+
+---
+
+## Results
+
+### val_loss matrix (final, step 2000)
+
+|  | Real oracle | Shuffled oracle | Δ-diff (real − shuf) |
+| --- | --- | --- | --- |
+| **Router off, KL off** (cells 1, 2) | 4.5546 | 4.5546 | **+0.0000** |
+| **Router on, KL on** (cells 3, 4) | 4.5636 | 4.5637 | **−0.0001** |
+| Δ vs no-router baseline | +0.0090 | +0.0091 | — |
+
+**Two observations from the matrix alone**:
+
+1. The router-off baseline reproduces ZEB-136's cells 1+2 to within 0.0002 nats (4.5546 vs ZEB-136's 4.5546 / 4.5544). Sanity check passes: the data path and frozen-backbone init are unchanged.
+2. The router-on KL+CE cells got val_loss ~0.009 nats *worse* than the no-router baseline, with cell 3 vs cell 4 essentially identical. This is the inverse of what would constitute a positive ZEB-139 result.
+
+### Cell 3 vs Cell 4 fingerprint (the discriminator)
+
+Per spec §11's intra-experiment discriminator table, cell 3 vs cell 4 separates "KL signal is content-dependent" (real teacher info actually helping, the clean positive result) from "KL signal is content-independent" (KL forcing sharp output regardless of input — the cheap-win confound).
+
+Pulled from `forensics/router_on_kl.txt` (probe: `scripts.probe_skip_to_logit`):
+
+```text
+real: log_alpha=-1.7360  alpha=exp=0.1762  ||W_align||_F=1.3461
+shuf: log_alpha=-1.7342  alpha=exp=0.1765  ||W_align||_F=1.3471
+
+cross_run_cos engram_logits  =  +0.9999
+max LM-head row |cos|        =  0.7779
+engram_logit_entropy (nats)  =  10.3467  (log(vocab) = 10.3735)
+```
+
+| Fingerprint metric | Spec §7 threshold (broken if…) | ZEB-136 (no KL) | ZEB-139 (KL=0.5, real) | Verdict |
+| --- | --- | --- | --- | --- |
+| `engram_logit_entropy` | < log(V) − 0.1 = 10.27 | 10.3735 (= log V) | **10.3467** | **HOLDS** (Δ from log V = 0.027, well short of the 0.1 drop required for "broken") |
+| `α` | outside [0.14, 0.20] | 0.1644 | **0.1762** | **HOLDS** (still inside attractor band) |
+| Cross-run cosine (real vs shuf, router) | < 0.7 | +0.7979 | **+0.9999** | **HOLDS** + WORSE — KL drove the two routers to converge |
+| Δ-diff (real − shuf val_loss) | ≥ +0.001 nats | +0.0002 | **−0.0001** | **HOLDS** + slight reverse |
+| `W_align` Frobenius drift | > 2× init (init = 0) | 1.91 | **1.35** | **HOLDS** + smaller — KL kept W_align contained |
+
+**All five thresholds say the attractor HOLDS.** And `cross_run_cos = +0.9999` is the dispositive result for the cheap-win discriminator: when two router models trained on completely different per-position teacher targets (real vs shuffled-via-permutation) end up producing essentially the same output distribution to 4-decimal cosine, the model is matching SOMETHING content-independent — almost certainly the corpus-wide token-frequency average that the Welford-mean teacher logits encode after enough position averaging.
+
+That same `cross_run_cos` jumping from 0.80 (no KL) to 1.00 (with KL=0.5) is the mechanism: KL pressure pulls both routers to the SAME target distribution. The "real" and "shuf" sidecars contain the same set of per-row teacher distributions just at different row indices — the KL signal therefore averages out to "match the corpus distribution somehow", which is identical regardless of how rows are permuted.
+
+### KL trajectory
+
+From the per-step CSV logs (`run3_router_on_real_kl.csv`):
+
+```text
+step    0  loss=2.9147  kl_loss=1.2697  alpha=0.10  W_align=0  (init)
+step  300  loss=3.1555  kl_loss=1.2705  (essentially flat — W_align still ~0, gradient through alpha is zero by construction)
+step  600  loss=2.9575  kl_loss=1.2705  (alpha=0.1, W_align starting to grow under small alpha gradient)
+step  900  loss=2.9254  kl_loss=1.2678  (KL begins moving)
+step 1200  loss=2.9449  kl_loss=1.2587
+step 1500  loss=2.9853  kl_loss=1.2453
+step 1800  loss=2.9885  kl_loss=1.2212
+final     loss=4.5636 (val)  KL trajectory: 1.27 → 1.22 nats over 2000 steps (Δ = −0.05 nats)
+```
+
+After an initial flat stretch (steps 0 to 600), the KL decreased steadily from step 900 onward: the router IS learning to better match the teacher distribution. But the magnitude is small (~4% relative drop) and the destination is content-independent: cell 4 (shuffled sidecar) shows an essentially identical trajectory (`kl_loss` 1.2709 → 1.2209 in the same number of steps). Both runs are converging toward the same "average TinyLlama distribution" target, which is not what the experiment hoped to find.
+
+### Pair A baseline forensic (router-off cells)
+
+Reproduces ZEB-136's standard η-B capgap battery on cells 1+2 (full output in `forensics/router_off_no_kl.txt`). All ten ZEB-130 probes (D/P/E/M/C/W/A/X/Q-overlap/V-rank) within noise of ZEB-136's prior values; cross-run cos at L2 = +0.87 / L5 = +0.80 (matching ZEB-136's known content-poor baseline). Confirms the no-router data path didn't drift between ZEB-136 and ZEB-139.
+
+---
+
+## Verdict matrix (this experiment × ZEB-138)
+
+Per spec §11:
+
+| This (KL-retrofit) | ZEB-138 (same-arch teacher, CE-only) | Combined interpretation |
+| --- | --- | --- |
+| Holds | Holds | **Structural ceiling confirmed** at 40M (Gemini §7 steelman) — neither objective shift nor teacher-arch shift escapes the attractor; the 40M frozen-backbone linear pipeline is the binding constraint. Multi-layer non-linear `W_align` OR end-to-end retraining without freezing is the recommended next axis. |
+| Holds | Breaks | **Teacher-arch dominates, objective insufficient** — ZEB-138's same-arch decode break is the load-bearing axis; pursue same-arch teacher + capgap as the substrate, deprioritize KL+CE. |
+
+ZEB-138's verdict is pending KRILE's Harmony-474M handoff and the corresponding 4-cell run on AVALON. **ZEB-139's contribution to the matrix is now locked in as "Holds".**
+
+---
+
+## Open questions and next-step recommendations
+
+### 1. λ-sweep (spec §12 question 1)
+
+The spec says "If no break at 0.5, try 0.9 once before concluding." This is worth doing for completeness, but the `cross_run_cos = +0.9999` result strongly suggests a higher λ would just intensify the convergence to the average distribution — it cranks up the same lever that's already saturating. **Recommendation: run a single λ=0.9 cell-3 + cell-4 pair (~30 min on AVALON) to nail down the λ-sensitivity signal, then close the door on the KL-only axis at 40M.**
+
+### 2. Same-arch teacher + KL (spec §10 follow-up)
+
+The 2×2's fourth cell (same-arch teacher AND KL term) is contingent on either ZEB-139 or ZEB-138 yielding signal. Since ZEB-139 didn't, and ZEB-138 is pending, this remains "wait for ZEB-138." If ZEB-138 also holds, the 2×2 is closed (structural ceiling) and same-arch+KL becomes redundant. If ZEB-138 breaks, same-arch+KL becomes the natural follow-up to test whether KL adds to the same-arch signal.
+
+### 3. The Gemini §7 steelman — multi-layer non-linear `W_align`
+
+If ZEB-138 also holds, the Gemini Deep Research findings (§7) recommend abandoning the single-layer-linear `W_align` in favor of either a multi-layer non-linear projection (more capacity in the alignment path) or unfreezing the backbone (resolves the "frozen 40M can't decode high-dim teacher features" steelman). Both are substantially more invasive than ZEB-139 was. Multi-layer `W_align` is probably the cheaper try-first.
+
+### 4. Diagnostic bonus: KL trajectory IS learning, just not usefully
+
+Worth flagging that KL did monotonically decrease (1.27 → 1.22) and `max LM-head row cos` jumped from 0.22 (ZEB-136) to 0.78 (ZEB-139). The router IS aligning with vocab directions — it's just aligning ALL positions with the SAME average direction (cross_run_cos = 1.0). A future variant could try a temperature on the router-side softmax (spec §12 question 2) or a per-token KL mask that down-weights frequent-token positions — both might force the router to pay attention to per-position content rather than averaging it out. These are speculative; the cleaner next move is the λ-sweep + ZEB-138 result.
+
+---
+
+## Artifacts
+
+All under `/home/zebli/work/LOCAL/zeb139/`:
+
+- **Oracle**: `artifacts/oracle_tinyllama_10k.safetensors` (4.9 MB, [10K, 128] f32) and `_shuffled_seed0.safetensors`
+- **Teacher-logits sidecar**: `artifacts/oracle_tinyllama_10k_teacher_logits.safetensors` (611 MB, [10K, 32K] bf16) and `_shuffled_seed0_teacher_logits.safetensors`
+- **Stats**: `artifacts/oracle_tinyllama_10k.safetensors.stats.json` (PCA explained variance, populated rows, hash seeds)
+- **Per-cell training logs (CSV)**: `logs/run{1..4}_*.csv` (200 rows × 36 columns each, including the new `kl_loss` column)
+- **Per-cell checkpoints**: `checkpoints/zeb139_router_{off,on}_{real,shuf}{,_kl}/checkpoint.pt`
+- **Forensic outputs**: `forensics/router_off_no_kl.txt` (full 10-probe battery, pair A) and `forensics/router_on_kl.txt` (skip-to-logit diagnostics, pair B)
+- **Scripts**: `scripts/shuffle_oracle_and_sidecar.py`, `scripts/run_4cell_matrix.sh` (cells 1-4), `scripts/run_cells_3_and_4.sh` (re-run after stale-checkout fix), `scripts/run_forensics.sh`
+
+ZEB-136's prior forensics (`/home/zebli/work/LOCAL/zeb136/forensics/router_on.txt`) are the direct comparison point for the ZEB-139 (KL+CE) vs ZEB-136 (CE-only) contrast.
+
+---
+
+## Operational notes
+
+- The 4-cell matrix's first attempt failed at cell 3 because the local main repo dir was checked out on `zeblith/zeb-138-same-arch-teacher` (stale, predates PR #257). The venv's `ct87` editable install therefore imported a `train.py` without the `--engram-skip-to-logit` / `--kl-lambda` flags. Cells 1+2 succeeded incidentally (no-router code path is identical across branches). Resolved by `git checkout main && git pull` and re-running cells 3+4 only (each cell initializes independently from `zeta_ctrl_2048`, so no chaining was lost).
+- Total wall time for the experiment: ~6h oracle extraction + ~30 min cells 1+2 + ~30 min cells 3+4 + a few min for forensics. The spec §8 estimate of "4-6h end-to-end" was off by ~3× on the oracle extraction (the new logits-Welford accumulator is the dominant cost); the matrix + forensics matched spec.
diff --git a/docs/superpowers/specs/2026-04-18-zeb-139-kl-retrofit-design.md b/docs/superpowers/specs/2026-04-18-zeb-139-kl-retrofit-design.md
index bca83f3f..d0981357 100644
--- a/docs/superpowers/specs/2026-04-18-zeb-139-kl-retrofit-design.md
+++ b/docs/superpowers/specs/2026-04-18-zeb-139-kl-retrofit-design.md
@@ -1,6 +1,6 @@
 # ZEB-139 — KL-Retrofit Objective-Axis Experiment (Design Spec)
 
-> **STATUS: DRAFT — blocked on PR #254 merge + teacher-logits extension.** This spec is written during the ZEB-137/138 wait so the experiment can launch immediately once the prereq PRs land on main. AVALON can execute end-to-end in ~4-6h once unblocked.
+> **STATUS: COMPLETE — see findings at `docs/findings/2026-04-19-zeb-139-kl-retrofit.md`.** Verdict: **attractor HOLDS** under λ=0.5 KL+CE on the cross-arch TinyLlama setup. `cross_run_cos = +0.9999` between the real-oracle and shuffled-oracle KL+router cells confirms the cheap-win confound (KL forces both routers to the same content-independent average distribution). Combined with ZEB-138's pending verdict, this fills the "KL-retrofit Holds" cell of the spec §11 outer matrix.
 
 **Linear:** [ZEB-139](https://linear.app/zeblith/issue/ZEB-139/kl-retrofit-experiment-objective-axis-diagnostic-for-engram-attractor)
 **Parent:** [ZEB-102](https://linear.app/zeblith/issue/ZEB-102)

From b4a371b9d9f11dd9c3c7402ce26148388c77b237 Mon Sep 17 00:00:00 2001
From: Jake Englund <zeblith@gmail.com>
Date: Sun, 19 Apr 2026 23:05:10 -0700
Subject: [PATCH 2/2] =?UTF-8?q?docs(zeb-139):=20=CE=BB=3D0.9=20closure=20r?=
 =?UTF-8?q?un=20results=20=E2=80=94=20KL-only=20axis=20definitively=20clos?=
 =?UTF-8?q?ed?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Per spec §12 q1, ran cells 3+4 again at --kl-lambda 0.9 to nail down
the λ-sensitivity signal before retiring the KL-only experimental
axis at 40M. Same setup as λ=0.5 except for the λ value and output
paths; ~30 min wall time.

Headline numbers:

  λ          Cell 3 (real)    Cell 4 (shuf)    Δ-diff       vs ZEB-136
  0 (none)   4.5545           4.5543           +0.0002      —
  0.5        4.5636           4.5637           −0.0001      +0.009
  0.9        4.5907           4.5912           −0.0005      +0.036

Two clean monotonic patterns:
1. Higher λ → val_loss strictly worse. KL pressure increasingly
   hurts the LM objective.
2. Δ-diff stays at noise across all λ values. No content-dependence
   emerges no matter how hard we crank KL.

Forensic fingerprint (skip-to-logit probe at λ=0.9):
  cross_run_cos engram_logits  =  +1.0000  (was +0.9999 at λ=0.5)
  max LM-head row |cos|        =   0.9257  (was 0.78 at λ=0.5)
  ||W_align||_F                 =   0.58    (was 1.35 at λ=0.5)
  engram_logit_entropy          =  10.3039  (was 10.3467; still
                                            well above 10.27 break
                                            threshold)
  alpha                         =   0.1762  (saturated, λ-independent
                                            above 0.5)

cross_run_cos = +1.0000 between real-oracle and shuffled-oracle cells
at λ=0.9 is the dispositive cheap-win signature. Higher KL pressure
intensifies the lever rather than escapes the attractor.

Curious side observation: g5 (L5 engram gate alpha) flipped sign
between λ=0.5 (+0.40) and λ=0.9 (-0.41). Different optimization
regime, same content-blind destination — suggests the "match the
corpus average" attractor is robust across optimizer trajectories.

Doc updates:
- TL;DR mentions both λ values now; net verdict unchanged
- Open question §1 (λ-sweep) marked DONE, points at the new section
- New section "λ=0.9 closure run" with full λ-sweep matrix, fingerprint
  comparison, and the optimization-regime observation
- Artifacts section lists the new λ=0.9 checkpoints, CSVs, forensic
  output, and run script

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 .../2026-04-19-zeb-139-kl-retrofit.md         | 81 +++++++++++++++++--
 1 file changed, 75 insertions(+), 6 deletions(-)
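
Note on the fingerprint verdicts: the "broken if" rules quoted in the findings table (spec §7) can be applied mechanically to the probe numbers. The sketch below is illustrative only. The function name `attractor_verdicts` and the dictionary layout are assumptions, not repo code, and the `W_align` drift rule is left out because its threshold is stated relative to an init of 0.

```python
import math

VOCAB = 32000
LOG_V = math.log(VOCAB)  # maximum-entropy value in nats for a 32000-way output


def attractor_verdicts(entropy, alpha, cross_run_cos, delta_diff):
    """Return {metric: "HOLDS" | "BROKEN"} using the spec §7 'broken if' rules."""
    broken = {
        "engram_logit_entropy": entropy < LOG_V - 0.1,
        "alpha": not (0.14 <= alpha <= 0.20),
        "cross_run_cos": cross_run_cos < 0.7,
        "delta_diff": delta_diff >= 0.001,
    }
    return {name: ("BROKEN" if flag else "HOLDS") for name, flag in broken.items()}


# Probe values copied from the lambda=0.5 and lambda=0.9 forensic outputs.
runs = {
    "lambda=0.5": dict(entropy=10.3467, alpha=0.1762, cross_run_cos=0.9999, delta_diff=-0.0001),
    "lambda=0.9": dict(entropy=10.3039, alpha=0.1762, cross_run_cos=1.0000, delta_diff=-0.0005),
}

for name, metrics in runs.items():
    gap = LOG_V - metrics["entropy"]  # distance below log V; 0.1 is the break line
    print(name, f"entropy gap={gap:.3f}", attractor_verdicts(**metrics))
```

Every metric reports HOLDS at both λ values. The entropy gap grows with λ but stays under the 0.1 nats needed to count as a break.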

diff --git a/docs/findings/2026-04-19-zeb-139-kl-retrofit.md b/docs/findings/2026-04-19-zeb-139-kl-retrofit.md
index f474608a..e5646139 100644
--- a/docs/findings/2026-04-19-zeb-139-kl-retrofit.md
+++ b/docs/findings/2026-04-19-zeb-139-kl-retrofit.md
@@ -12,6 +12,8 @@
 
 **Adding a Memory-Decoder-style `KL(P_router || P_teacher)` term at λ=0.5 did NOT escape the maximum-entropy attractor on the cross-arch TinyLlama setup.** Both the real-oracle and shuffled-oracle KL+router cells converged to essentially identical val_loss (4.5636 vs 4.5637, Δ-diff = -0.0001 nats) and produced a router output with `cross_run_cos = +0.9999` between the two cells — the smoking gun for the cheap-win confound. KL forced both routers to the same content-independent average distribution rather than learning per-position content routing.
 
+**The λ=0.9 closure run (spec §12 q1) confirmed the verdict** with a stronger signature: `cross_run_cos = +1.0000` (indistinguishable from 1 at four decimals), val_loss got *worse* (+0.027 nats over λ=0.5; +0.036 nats over the no-KL baseline), and the optimization shifted regime (g5 alpha goes negative under λ=0.9 vs positive under λ=0.5) but produced the same content-blind result. Higher KL pressure intensifies the cheap-win lever rather than escaping the attractor. **The KL-only axis at 40M is closed.**
+
 **Per spec §11 outer matrix this is the "KL-retrofit attractor HOLDS" outcome.** Combined with whatever ZEB-138 produces on the orthogonal teacher-architecture axis, it points toward either teacher-arch dominance (if ZEB-138 breaks) or a structural ceiling at 40M (if ZEB-138 also holds — Gemini §7 steelman).
 
 ---
@@ -126,9 +128,12 @@ ZEB-138's verdict is pending KRILE's Harmony-474M handoff and the corresponding
 
 ## Open questions and next-step recommendations
 
-### 1. λ-sweep (spec §12 question 1)
+### 1. λ-sweep (spec §12 question 1) — DONE, see "λ=0.9 closure run" section below
 
-The spec says "If no break at 0.5, try 0.9 once before concluding." This is worth doing for completeness, but the `cross_run_cos = +0.9999` result strongly suggests a higher λ would just intensify the convergence to the average distribution — it cranks up the same lever that's already saturating. **Recommendation: run a single λ=0.9 cell-3 + cell-4 pair (~30 min on AVALON) to nail down the λ-sensitivity signal, then close the door on the KL-only axis at 40M.**
+The spec said "If no break at 0.5, try 0.9 once before concluding." Completed
+2026-04-19; results in the new "λ=0.9 closure run" section. Net: the higher λ
+intensified the cheap-win signature exactly as predicted (`cross_run_cos` rose
+to +1.0000, val_loss worse). KL-only axis at 40M is closed.
 
 ### 2. Same-arch teacher + KL (spec §10 follow-up)
 
@@ -144,6 +149,70 @@ Worth flagging that KL did monotonically decrease (1.27 → 1.22) and `max LM-he
 
 ---
 
+## λ=0.9 closure run
+
+Per spec §12 q1 ("if no break at 0.5, try 0.9 once before concluding"), reran
+cells 3+4 with `--kl-lambda 0.9`. Same setup otherwise (same data, same
+init checkpoint, same seeds, same code). Wall time ~30 min.
+
+### Full λ-sweep matrix (cells 3+4 only — cells 1+2 are router-off, λ-independent)
+
+| λ | Cell 3 (real) | Cell 4 (shuf) | Δ-diff (real − shuf) | Δ vs no-KL baseline (cell 3) |
+| --- | --- | --- | --- | --- |
+| 0 (ZEB-136 router-on) | 4.5545 | 4.5543 | +0.0002 | — |
+| 0.5 | 4.5636 | 4.5637 | −0.0001 | +0.009 |
+| **0.9** | **4.5907** | **4.5912** | **−0.0005** | **+0.036** |
+
+**Two clean monotonic patterns**:
+1. Higher λ → val_loss strictly worse. KL pressure increasingly hurts the LM
+   objective at every step up the λ ladder.
+2. Δ-diff stays at noise across all λ values. **No content-dependence emerges
+   regardless of how hard we crank the KL lever.** This is the dispositive
+   answer to spec §12 q1.
+
+### Forensic fingerprint comparison (skip-to-logit probe)
+
+Pulled from `forensics/router_on_kl09.txt`:
+
+```text
+real: log_alpha=-1.7362  alpha=exp=0.1762  ||W_align||_F=0.5786
+shuf: log_alpha=-1.7377  alpha=exp=0.1759  ||W_align||_F=0.5709
+
+cross_run_cos engram_logits  =  +1.0000
+max LM-head row |cos|        =  0.9257
+engram_logit_entropy (nats)  =  10.3039  (log(vocab) = 10.3735)
+```
+
+| Metric | ZEB-136 (no KL) | λ=0.5 | **λ=0.9** | Trend |
+| --- | --- | --- | --- | --- |
+| `α` (real) | 0.1644 | 0.1762 | **0.1762** | Saturated in attractor band, λ-independent above 0.5 |
+| `‖W_align‖_F` (real) | 1.91 | 1.35 | **0.58** | Monotonically smaller — KL keeps the projection more contained at higher λ |
+| `cross_run_cos` | +0.7979 | +0.9999 | **+1.0000** | Higher λ → more complete collapse to identical content-blind output |
+| `max LM-head row \|cos\|` | 0.22 | 0.78 | **0.93** | Router aligns with one average LM-head direction more strongly |
+| `engram_logit_entropy` (Δ from log V) | 0.0000 | 0.027 | **0.070** | Slowly moving away from log V but still short of the 0.1 gap that would mark a break |
+
+### Optimization-regime shift, same destination
+
+One curiosity: the L5 engram gate's behavior changes sign across λ. At λ=0.5,
+g5 (the post-tanh L5 gate alpha) grew positive (+0.40 at step 1800). At
+λ=0.9, g5 went negative (-0.41 at step 1500). The router under λ=0.9 is
+*subtracting* the L5 engram contribution from the hidden state instead of
+adding it — a different optimization regime entirely. Yet both regimes land
+at the same content-independent average distribution at the router's output
+(`cross_run_cos` = +1.0000). This further suggests the destination ("match
+the corpus average teacher distribution") is robust across optimizer
+trajectories, and that varying λ just changes HOW the model gets to the
+same useless attractor, not WHETHER.
+
+### Verdict
+
+KL-only axis at 40M is **definitively closed**. Higher λ intensifies the
+cheap-win lever but does not unlock content routing. ZEB-139's row of the
+spec §11 outer matrix is locked in as "Holds". Next move depends on
+ZEB-138's verdict (see open question §2 above).
+
+---
+
 ## Artifacts
 
 All under `/home/zebli/work/LOCAL/zeb139/`:
@@ -151,10 +220,10 @@ All under `/home/zebli/work/LOCAL/zeb139/`:
 - **Oracle**: `artifacts/oracle_tinyllama_10k.safetensors` (4.9 MB, [10K, 128] f32) and `_shuffled_seed0.safetensors`
 - **Teacher-logits sidecar**: `artifacts/oracle_tinyllama_10k_teacher_logits.safetensors` (611 MB, [10K, 32K] bf16) and `_shuffled_seed0_teacher_logits.safetensors`
 - **Stats**: `artifacts/oracle_tinyllama_10k.safetensors.stats.json` (PCA explained variance, populated rows, hash seeds)
-- **Per-cell training logs (CSV)**: `logs/run{1..4}_*.csv` (200 rows × 36 columns each, including the new `kl_loss` column)
-- **Per-cell checkpoints**: `checkpoints/zeb139_router_{off,on}_{real,shuf}{,_kl}/checkpoint.pt`
-- **Forensic outputs**: `forensics/router_off_no_kl.txt` (full 10-probe battery, pair A) and `forensics/router_on_kl.txt` (skip-to-logit diagnostics, pair B)
-- **Scripts**: `scripts/shuffle_oracle_and_sidecar.py`, `scripts/run_4cell_matrix.sh` (cells 1-4), `scripts/run_cells_3_and_4.sh` (re-run after stale-checkout fix), `scripts/run_forensics.sh`
+- **Per-cell training logs (CSV)**: `logs/run{1..4}_*.csv` (λ=0.5) and `logs/run{3,4}_router_on_{real,shuf}_kl09.csv` (λ=0.9 closure), 200 rows × 36 columns each, including the `kl_loss` column
+- **Per-cell checkpoints**: `checkpoints/zeb139_router_{off,on}_{real,shuf}{,_kl,_kl09}/checkpoint.pt`
+- **Forensic outputs**: `forensics/router_off_no_kl.txt` (full 10-probe battery, pair A), `forensics/router_on_kl.txt` (skip-to-logit diagnostics, λ=0.5), `forensics/router_on_kl09.txt` (skip-to-logit diagnostics, λ=0.9 closure)
+- **Scripts**: `scripts/shuffle_oracle_and_sidecar.py`, `scripts/run_4cell_matrix.sh` (cells 1-4), `scripts/run_cells_3_and_4.sh` (re-run after stale-checkout fix), `scripts/run_cells_3_and_4_lambda09.sh` (λ=0.9 closure), `scripts/run_forensics.sh`
 
 ZEB-136's prior forensics (`/home/zebli/work/LOCAL/zeb136/forensics/router_on.txt`) are the direct comparison point for the ZEB-139 (KL+CE) vs ZEB-136 (CE-only) contrast.