Skip to content

feat(v1): drop first-class MultiIndex — feasibility + implementation (#744)#803

Draft
FBumann wants to merge 33 commits into
feat/arithmetic-conventionfrom
feat/v1-no-multiindex
Draft

feat(v1): drop first-class MultiIndex — feasibility + implementation (#744)#803
FBumann wants to merge 33 commits into
feat/arithmetic-conventionfrom
feat/v1-no-multiindex

Conversation

@FBumann

@FBumann FBumann commented Jul 1, 2026

Copy link
Copy Markdown
Collaborator

Note

AI-assisted (Claude Code): feasibility analysis, implementation, and this
description; reviewed by me.

Stacked on #717 (the v1 semantics branch — base feat/arithmetic-convention).
Implements the v1 side of dropping first-class pd.MultiIndex, together with the
feasibility analysis that motivated it. Discussion: #744.

Rollout tracker: arithmetics-design/open-items.md — the v1 schedule (opt-in → default → 1.0) and per-stage status.

What changes under v1

  • A pd.MultiIndex dimension is rejected wherever it would enter the model —
    add_variables (coords and DataArray bounds), add_constraints (MI introduced
    via arithmetic), and add_objective — with an error that names what you did and
    points at reset_index. Legacy keeps MI with a DeprecationWarning, so
    downstream (PyPSA, PyPSA-Eur, …) migrates on the legacyv1 switch.
  • Multi-key groupby returns a flat group dim + key aux coords under v1 (the
    stacked group MultiIndex stays under legacy). This is the one place linopy itself
    minted an MI; PyPSA-Eur's CCL constraints consume it.

Under v1 the model is flat dims + auxiliary level coords throughout — MI is a
legacy-only concept.

What's intentionally NOT here — no deletions

Nothing is removed in this PR. All first-class-MI machinery is kept so legacy
keeps working unchanged — both the concentrated ~300 lines (the alignment.py
level-projection subsystem, netcdf MI (de)serialize) and the diffuse surface: the
#303 assign_multiindex_safe corruption workaround (39 call sites), the
isinstance(MultiIndex) guards, and the MI branches scattered through
alignment/coords. All of it stays live for legacy and can only be deleted at legacy
EOL. This PR launches the legacy/v1 split; the strip is the later payoff.

Feasibility analysis

  • 📄 arithmetics-design/multiindex-feasibility.md — renders with the feasibility
    matrix + design-decisions table.
  • test/test_mi_feasibility.py — runnable equality checks (both semantics).

Tests

MI is now legacy-only: MI-building tests are marked @pytest.mark.legacy (or their
fixtures skip under v1); the obsolete @v1-MI tests (which asserted the earlier
MI-allowed-but-strict v1 design this supersedes) are removed and replaced by reject
tests. Full suite green under both semantics.

@FBumann FBumann force-pushed the feat/v1-no-multiindex branch from efca300 to 51ea473 Compare July 1, 2026 08:50
@FBumann FBumann force-pushed the feat/arithmetic-convention branch from b5d5a70 to 71bd655 Compare July 1, 2026 08:54
FBumann and others added 27 commits July 1, 2026 10:58
Prove a first-class pd.MultiIndex snapshot can be replaced by a flat dim +
level aux coords without changing the model, by explicit equality checks:
reset_index normalization, where/groupby selection, per-period roll (#751),
group-by-level (#751), an equivalent solved LP, and the output re-stack
round-trip (flat solution -> MI via the caller's own mapping). Pins that
Variable.sel does not forward MI tuple-select (#752 §2). Both semantics.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Discussion artifact for the v1 MultiIndex decision: feasible (verified) vs
desirable (opinion) matrix per op, with PyPSA call sites pinned at v1.2.4;
the snapshot dim-coordinate sub-decision; the conclusion that linopy drops
MultiIndex from its mental model (flat in, flat out, output re-stack owned by
PyPSA); and the open PyPSA-side rows (stochastic 2nd MI, n.snapshots).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- add test_storage_soc_lp_equivalent: the cyclic-per-period SOC "previous"
  built via groupby("period").roll vs an explicit positional roll produces a
  byte-identical LP file; a period-unaware global roll is the negative control
  (diverges at the period-start rows), proving the check has teeth
- matrix: drop the non-operation "solve LP" row (the end-to-end compose proof is
  noted inline on the tracked-tests line instead); mark storage SOC 🟢 now that
  it is pinned by a test

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Parametrize test_storage_soc_lp_equivalent over boundary in {cyclic,
non-cyclic}. Non-cyclic (period start carries nothing) is the boundary ramps
and non-cyclic storage use -- groupby("period").roll + a period-start where-mask
vs an explicit positional roll, byte-identical LP either way.

Teeth: the two boundaries yield different LPs (mask/wrap is not a no-op); the
period-unaware global-roll control stays cyclic-only since non-cyclic masks away
the rows it diverges on.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The prior "ramps too" conflated two distinct period-start boundaries, now both
covered as a third parametrize value:

- non-cyclic storage: drop the previous *term*, keep the row (initial SOC -> rhs).
  This is exactly PyPSA's `previous_soc...roll(1).where(include_previous_soc)`
  (constraints.py L1691) -- a `where` on the term, verified byte-faithful.
- ramp: drop the whole *row* at the period start via a constraint `mask=`
  (PyPSA's `_period_start_mask`, L838) -- no previous to ramp from.

flat (groupby roll) == oracle (positional roll) byte-for-byte in all three; each
boundary differs from the cyclic baseline (teeth); global-roll control stays
cyclic-only since the other two mask away the rows it diverges on.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ies (#744)

The test's scope is the per-period roll's period-start boundary (cyclic /
non-cyclic / ramp), not storage SOC specifically -- and it proves a byte-IDENTICAL
LP file, distinct from the solve-equivalence test_per_period_lp_equivalent. Rename
accordingly. Surface all three boundary spellings in the storage-SOC matrix row's
flat+aux cell (wrap / .where term-drop / mask= row-drop).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
PyPSA's stochastic path keeps `scenario` and `name` as separate xarray dims
(components/array.py: expand_dims(scenario=...), isel(scenario=0)); the
`(scenario, name)` MultiIndex is manufactured only by `.stack(combined=...)` at
the pandas output boundary, and common.py's MI is the input DataFrame's columns.
So it is not an in-model MI like the snapshot `(period, timestep)` index -- it is
already the flat+aux N-D form, nothing to decompose.

Correct the stochastic row's verdict and the conclusion/decision-record prose:
the one real open item is the public `n.snapshots` choice; stochastic drops from
"genuine open risk / second MI" to a low-risk watch on whether #1484 ever
introduces an in-model stacked index (v1.2.4 does not).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Six rows are the genuine in-linopy MultiIndex decision (snapshot (period,
timestep) sitting inside linopy variables). The other three are not, and now carry
a ⊥ tag + legend: output (PyPSA-side boundary re-stack), n.snapshots (PyPSA public
API, never a linopy model index), stochastic (not an MI at all — N-D dims).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ot cause (#744)

Open with a bold, falsifiable claim: inside the linopy model PyPSA uses a
MultiIndex in exactly one place -- the snapshot dim (period, timestep). The
lookalikes are ruled out inline (stochastic = N-D dims, n.snapshots = public API,
output = pandas .stack), so the whole linopy-side question collapses to one axis;
other axes are assumed N-D, not audited -- corrections welcome.

Add a "What reset_index changes (and doesn't)" section stating the root cause:
snapshot is one dim whose index is a MultiIndex, period/timestep are levels (never
dims, surfacing only via .sel-collapse), and reset_index flips only the index type
-- not the dim count -- turning collapse/loop/.data into direct groupby/where.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Drop the "working assumption / corrections welcome" framing -- just state that
inside the linopy model MultiIndex is used only on the snapshot dim, with the
three lookalikes ruled out inline.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ary (#744)

solution/dual carry the snapshot dim, which today is the MultiIndex inherited from
the MI-indexed variables -- so output holds the same in-model snapshot MI as entry,
on the way out. Drop the ⊥ tag; show the snapshot MI flowing in (entry) -> ops ->
out (output) in the lead and legend. Only stochastic (N-D) and n.snapshots (public
API) remain ⊥.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
PyPSA parks the snapshot index on model.parameters (assign(snapshots=sns),
optimize.py L689) and reads it back via .snapshots.to_index() (L905) -- today that
lands a real MultiIndex *inside* the linopy object. test_snapshots_param_flat_rebuild
shows the flat+aux form parks only a flat snapshot + period/timestep aux vars (no MI
index inside parameters) and rebuilds the identical index on demand. Flips the last
in-linopy 🔵 row to 🟢.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Order the matrix as the seven in-linopy snapshot-MI rows first (entry → ops →
storage SOC → output → snapshots param), then the two ⊥ rows that are outside
linopy (stochastic, n.snapshots) last.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
)

The feasibility matrix is now exactly the seven in-linopy snapshot-MI rows (all
🟢). The two ⊥ rows move to a dedicated section, "Observed PyPSA MultiIndex usages
— not linopy's to solve": stochastic and n.snapshots are observed PyPSA usages that
provably need no in-linopy solution and are cheap for PyPSA to handle at its
boundary. They fail the in-linopy-MI test on opposite axes -- stochastic is inside
linopy but not an MI; n.snapshots is an MI but not inside linopy -- which the new
table states directly. Drop the now-unused ⊥ tag/legend.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…744)

Three structural fixes from review, one coherent pass:

- Lead with a Verdict (BLUF): feasible with zero linopy changes, flat in/flat out,
  one open PyPSA-side item (n.snapshots). Everything after is evidence for a claim
  the reader already holds, instead of arguing up to it.
- Fix the per-column glyph overload: feasibility uses glyphs (✅ tested / 🔲
  achievable / ❌ no), desirable uses plain words (better / parity / worse) -- a
  glyph no longer means two things across columns.
- Hoist the settled-vs-open framing above the matrix (Scope + the two-table split
  already embody it), drop the now-redundant decoder paragraph and restated
  conclusion.

Net 211 → 160 lines; same evidence and call sites, less argument.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…p analysis (#744)

"Zero linopy changes" overclaimed. Feasibility needs no new capability, but adopting
flat+aux is a net simplification, not a no-op: a few small additive things (positional
snapshot alignment; shrink MI-input to accept-then-reset_index) against a large
deletion of first-class-MI machinery.

Add a "What linopy can strip" section quantifying the deletion from a read-only code
sweep: ~300 lines, ~half of it one cluster -- the alignment.py level-projection
subsystem (_project_onto_multiindex_levels / _enforce_implicit_projections /
_LevelProjection + the projections plumbing through broadcast_to_coords), which
exists solely to make a single MI level align like a dimension and is dead under
flat+aux; plus ~50 lines of netcdf MI flatten/reconstruct. Notes what stays
(assign_multiindex_safe, internal _term/groupby stacking, semantics §11 aux-conflict).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
#744)

The case for dropping first-class MI isn't only "~300 fewer lines". Reframe the
Verdict around four axes, each backed by a section below:
- simpler implementation (What linopy can strip)
- simpler mental model (What reset_index changes)
- enables features (level groupby works; removes the #752 internals coupling)
- no UX cost (MI still accepted in / re-stacked out; only level-tuple .sel lost)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The simpler mental model is linopy's: it stops having to know pd.MultiIndex exists
or cover its quirks (level alignment, index-corruption guards, reconstruction) and
becomes purely xarray-native -- plain dims + aux coords, what it's already built on.
MI lives only at the boundary as input/output sugar, never inside linopy.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Lead with the linopy maintainer's actual question -- can the library be simplified
without destabilising PyPSA, its primary consumer -- and answer it: PyPSA's interface
doesn't move (MI in, MI-indexed out as boundary sugar), every in-linopy use has a
tested identical-model flat+aux form, so nothing PyPSA relies on breaks. Stability is
the frame; the wins (less to maintain, xarray-native internals, unblocks work) follow
from it, and the one open item is PyPSA's own decoupled n.snapshots call.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… assumption (#744)

The rows are PyPSA's MI uses *of* linopy, not linopy's own -- reword accordingly.
And make the epistemics explicit: PyPSA is the one consumer audited, so these are a
proof set, not a proof of universality. The argument still generalises (after
reset_index everything is ordinary xarray linopy already handles; the only
MI-specific capability lost for any user is level-tuple .sel), but we now state the
assumption that no other user has an infeasible MI case, and invite counterexamples.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Restructure to one decision screen + appendix (the chosen shape), 216 → 141 lines:

- Verdict: claim + 3 wins + open item + proof-set caveat, tightened.
- The evidence: both tables (in-model matrix, boundary usages) under one heading;
  the three post-matrix findings paragraphs collapse to one caption; Scope and the
  side-finding fold in.
- The payoff: strip table + a line; "stays"/"additive" compressed.
- Appendix: the three overlapping mechanics sections merged/condensed (reset_index
  + data-model table, dim-coordinate sub-decision, transition shape) + decision
  record.

Each claim stated once; explanatory prose turned into captions. Tables and all
call-site links preserved verbatim; only the #756/#757 links in a memory-cost aside
were trimmed (the point survives without them).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
#744)

linopy does not hand MI-indexed results back; that contradicts flat-in/flat-out. It
returns a flat solution; PyPSA re-applies its own n.snapshots index when mapping the
result onto components (assign_solution already does this), so PyPSA's results stay
MI-indexed -- but the re-stack is PyPSA's, not linopy's. MI is boundary sugar PyPSA
owns, never inside linopy's model.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…#744)

Separate feasibility from the entry policy. Feasibility needs no new linopy
capability (the rewrites use existing ops; entry is a plain reset_index). Whether
linopy accepts an MI and flattens it (input sugar, backwards-compatible) or rejects
MI so callers flatten first (purest, breaks existing callers) is a boundary policy,
now surfaced as an explicit Decision-record item with a recommendation
(accept-as-sugar, optional deprecation path to flat-only) -- not smuggled in as both
"no change" and "accept sugar" at once.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replace the prose "decision record" + scattered "small additive work" mentions with
one top-level Design decisions table: internal model, MI-on-input, snapshot
alignment, output, n.snapshots -- each with options and a recommendation/status.

Also fold in the input-policy correctness fix: MI in / flat out is a silent contract
change, so silent auto-flatten is ruled out -- reject, or accept-and-warn (undecided).
Slim the Verdict to point at the table; only n.snapshots is flagged as the
substantive open item, and it's PyPSA's call.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…d sweep (#744)

The ~300 estimate was a floor. Two corrections from a deeper measure:

- assign_multiindex_safe (the #303 "would corrupt the index" workaround, 39 call
  sites) is snapshot-MI-driven, not a general helper: internal stacks use
  create_index=False, so the snapshot MI is the only thing that triggers the
  corruption warning. Under flat+aux all 39 sites revert to plain .assign(). The
  first sweep wrongly marked it "stays" -- corrected, with a strip-table row.
- Add the diffuse "cognitive tax" layer the line-count misses: 6 isinstance(MI)
  guards, 17 level-ops, ~40 quirk-comments, and 169 lines of MI edge-case tests
  across 10 files.

Reframe the mental-model pillar around that defensive surface (guards, the 39-site
corruption workaround, quirk-comments, edge-case tests) rather than just lines.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two more points strengthening the case:
- Latent risk: MI's edge surface almost certainly isn't fully covered (the #303
  corruption is one gap that already bit), so keeping MI carries unknown-bug risk
  that flat+aux retires. Folded into the mental-model pillar and the payoff.
- MI isn't a promoted feature, so few users lean on it -- lowers the cost of the
  lost level-tuple .sel and makes the "no other infeasible case" generalization
  safer. Folded into the proof-set caveat.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The "MI in exactly one place — snapshot" claim was false. Verified here (both
semantics): a multi-key groupby (DataFrame grouper) mints a stacked `group`
MultiIndex, and PyPSA-Eur's add_CCL_constraints consumes it richly
(groupby(country×carrier).sum() then .intersection / .loc[MI] / .sel(group=index),
solve_network.py L1064/L1080-1095). add_EQ/add_BAU are single-key (no group MI) --
correcting an over-broad relayed claim.

Integrate at "2nd open item, flagged harder": scope the Verdict's snapshot claim,
turn one open item into two (n.snapshots + multi-key groupby), add a "Second MI
surface" section and a Design-decisions row, and make the proof-set caveat honest
(the generalization already cracked once). Framing per FBumann: it's the same
flat+aux change, linopy-owned (groupby returns flat group + key aux coords) -- but a
public-API change with external consumers, so it needs a deprecation path, not a
boundary fix. Blast radius (-Earth/-DE/plugins) not yet scoped.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
FBumann and others added 6 commits July 1, 2026 10:58
#744)

Per FBumann: now that the deprecation is just the standard legacy/v1 path (warn
under legacy, raise/flat+aux under v1), multi-key groupby is the same kind of
in-linopy MI as the snapshot rows -- siloing it overstated its difference. Move it
into table 1 as the 8th row, marked 🔲 (the one achievable-not-yet-tested row, vs
the 7 tested ✅ -- which also finally exercises the glyph legend). Replace the
standalone "Second MI surface" section with a tight caption (verified both
semantics; PyPSA-Eur CCL cite; legacy/v1 migration; blast radius the only open
part). Soften the earlier "harder / needs a deprecation path" framing throughout:
it reaches external code but rides the standard switch, so it's guided, not a hard
break.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A frame grouper (`groupby(DataFrame).sum()`, e.g. PyPSA-Eur's CCL country×carrier)
mints a stacked `group` MultiIndex today. Under v1 it now returns a flat `group`
dim with the keys as ordinary aux coords — uniform with every other dim, no MI;
select via groupby/where, not a `.sel(group=tuple)` level tuple. Legacy keeps the
stacked MultiIndex and emits a DeprecationWarning pointing at the change.

Scope: only the cases where a stacked group MI survives today — the DataFrame
grouper and `observed=True`. The default `observed=False` list-of-names path that
already unstacks into separate dims is untouched. The legacy deprecation warning is
limited to the user-facing frame grouper.

This is the first transition slice for dropping first-class MultiIndex in v1 (the
strip of the now-dead-under-v1 MI machinery follows at legacy EOL). Affected
groupby tests split into @pytest.mark.legacy (MI + asserts the warning) /
@pytest.mark.v1 (flat+aux) twins. Full suite green under both semantics.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Toward v1 disallowing MultiIndex entirely: the shared `u` fixture now yields a
(level1, level2) MultiIndex variable under legacy and the equivalent flat `dim_3`
dim + level1/level2 aux coords under v1. Moved it out of the central `m` fixture so
`m` is MI-free. The two v1 tests that asserted MI-specific behavior are rewritten to
the flat+aux idiom: the level-projection "raises" test becomes explicit per-level
weighting via the aux coord; the MI inner-join alignment test is marked legacy
(flat-dim alignment is covered by the other align tests).

Prep only — MI is still allowed under v1 at this point; no v1 test relies on it now.
Suite green under both semantics.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
v1 now rejects a pandas MultiIndex dimension anywhere it would enter the model;
legacy keeps it with a DeprecationWarning. This supersedes the branch's earlier
"MI-allowed-but-strict" v1 design — under v1 a snapshot (or any dim) must be a flat
dim with the levels as auxiliary coords (`reset_index`).

enforce_no_multiindex(obj, *, context) scans an object's dim-indexes for a
MultiIndex and raises (v1) / warns (legacy) with a message naming what the user did.
It's wired at the caller-facing boundaries where MI can enter a persistent object,
so the error is precise:
- add_variables  → context="variable 'x'"  (catches coords and DataArray bounds)
- add_constraints → context="constraint 'c'" (catches MI introduced via arithmetic)

Tests: MultiIndex is now a legacy-only concept, so MI-building tests are marked
legacy (or their fixtures skip under v1), and the obsolete @v1 tests that asserted
the old MI-under-v1 semantics are removed and replaced by a v1 reject test. Full
suite green under both semantics.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
add_objective assigns the expression directly (bypassing Objective.__init__), so
add the enforce_no_multiindex check there too — an MI forced onto an expression and
handed to the objective now rejects under v1 with the "objective ..." context.
Covered by a v1 reject test alongside the add_variables/add_constraints ones.

Also move the `enforce_no_multiindex` import to module level in model.py (no import
cycle — semantics imports none of model/variables/constraints/objective).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Commit the v1-rollout checklist (previously untracked) as the shared status
tracker, refreshed to the current state: the one open design decision (#744,
MultiIndex storage) is resolved — v1 disallows MI — and the remaining work is
organised by goals.md's three stages (opt-in / default / 1.0). The
transition-surface-complete + changelog items are placed at stage 1, matching
goals.md step 1 (warn on legacy, raise on v1).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@FBumann FBumann force-pushed the feat/v1-no-multiindex branch from 51ea473 to e87efc1 Compare July 1, 2026 08:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant