bench: arithmetic operation micro-benchmarks by FBumann · Pull Request #805 · PyPSA/linopy

FBumann · 2026-07-01T13:10:44Z

I really want to make sure the v1 convention doesn't introduce regressions in linopy. And I think that as soon as we decide what to do with the multiindex stuff, there will be room for improvement. This should lay a nice foundation for that.

CodSpeed cost is really small, but coverage (and granularity) really improves.

Note

AI-assisted (Claude Code): implementation and this description; reviewed by me.

Adds an op-level benchmark tier to benchmarks/, alongside the whole-model build benchmarks. One benchmark per operation, with operands built outside the measured region so a run isolates a single arithmetic op.

Why. Whole-build benchmarks catch a regression but can't attribute it — a build says "kvl got heavier", an op benchmark says "expr+expr broadcast got heavier". (Motivated by a real regression hunt where attribution needed exactly this granularity.)

What.

benchmarks/ops.py — op registry (OpSpec) + a single 3-D grid size profile (dims 3×4×1000; the asymmetric shape also catches dim-order/transpose bugs) + 35 ops across scaling, var/expr arithmetic, quadratic, reductions, masking, groupby, merge, and constraint construction. Binary labelled ops carry match/broadcast variants — the alignment-path axis where the interesting regressions live.
benchmarks/drivers/test_ops.py — parametrized driver, one benchmark per op.
conftest.py — test_ops added to CODSPEED_MODULES (tracked; memory advisory).
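
As a rough illustration of the registry-plus-driver layout in the list above, here is a minimal, stdlib-only sketch. The `OpSpec` fields (`name`, `setup`, `run`), the `OPS` dict, `register`, and `time_op` are assumptions made for this example, not the contents of benchmarks/ops.py. Plain lists stand in for linopy variables and expressions, and `time.perf_counter` stands in for the CodSpeed instrument.

```python
from __future__ import annotations

import time
from dataclasses import dataclass
from typing import Any, Callable

# Grid dims quoted in the PR description (3 x 4 x 1000 = 12,000 cells).
GRID = (3, 4, 1000)
N_CELLS = GRID[0] * GRID[1] * GRID[2]


@dataclass(frozen=True)
class OpSpec:
    """One benchmarkable operation. Field names are hypothetical."""

    name: str
    setup: Callable[[], tuple[Any, ...]]  # builds operands; never timed
    run: Callable[..., Any]  # the single op under measurement


OPS: dict[str, OpSpec] = {}


def register(name, setup, run):
    """Add an op to the registry under a unique id."""
    if name in OPS:
        raise ValueError(f"duplicate op id: {name}")
    OPS[name] = OpSpec(name, setup, run)


# Stand-in operands: flat lists instead of linopy objects.
register(
    "mul_scalar",
    setup=lambda: ([1.0] * N_CELLS, 2.0),
    run=lambda xs, c: [x * c for x in xs],
)
register(
    "add_match",
    setup=lambda: ([1.0] * N_CELLS, [2.0] * N_CELLS),
    run=lambda xs, ys: [x + y for x, y in zip(xs, ys)],
)


def time_op(spec: OpSpec, repeats: int = 5) -> float:
    """Best-of-N wall time for spec.run, with setup outside the timed region."""
    operands = spec.setup()  # operand construction is excluded from timing
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        spec.run(*operands)
        best = min(best, time.perf_counter() - start)
    return best


# One measurement per registered op, keyed by op id.
timings = {name: time_op(spec) for name, spec in sorted(OPS.items())}
```

In a pytest driver the same split would likely be a `@pytest.mark.parametrize` over the sorted registry keys, with the operands built before `spec.run` is handed to the `benchmark` fixture, so each op gets its own benchmark id.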

Cost. 35 benchmarks; the memory run stays ~2–2.5 min including the model builds that dominate the job — cheap.

Signal validated. broadcast ≈ 5× match on the alignment axis (the §9 cross-product) — well above the noise floor.
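
The description does not say whether the ≈ 5× figure refers to time or memory. As a sanity check on the memory side only, the sketch below computes broadcast/match ratios from the rounded figures in the CodSpeed report in this thread. It assumes 1 MB = 1024 KB, which the report does not state, and the helper name `bcast_over_match` is made up for this example.

```python
# Memory figures copied from the CodSpeed report in this thread (rounded
# as displayed there). Assumption: "MB" means 1024 KB.
KB_PER_MB = 1024.0

memory_kb = {
    "expr_add_expr_bcast": 4.8 * KB_PER_MB,
    "expr_add_expr_match": 985.9,
    "expr_add_array_bcast": 1.0 * KB_PER_MB,
    "expr_add_array_match": 281.3,
    "expr_mul_array_bcast": 1.6 * KB_PER_MB,
    "expr_mul_array_match": 468.8,
}


def bcast_over_match(stem: str) -> float:
    """Ratio of the broadcast variant's memory to the match variant's."""
    return memory_kb[f"{stem}_bcast"] / memory_kb[f"{stem}_match"]


ratios = {
    stem: bcast_over_match(stem)
    for stem in ("expr_add_expr", "expr_add_array", "expr_mul_array")
}

for stem, ratio in ratios.items():
    print(f"{stem}: broadcast is {ratio:.1f}x match")
```

Under these assumptions the expr+expr pair comes out close to 5×, while the two expr/array pairs land nearer 3.5×, so the headline ratio appears to describe the expr+expr path specifically.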

Memory is report-only to start (op-scale memory can be noisy); op-time is the natural gate candidate once the signal proves stable.

An op-level tier alongside the whole-model builds: one benchmark per (operation, size profile), operands built outside the measured region so a run isolates a single op rather than a whole build. This attributes perf changes to a specific arithmetic path: a build benchmark says "kvl got heavier", an op benchmark says "expr+expr broadcast got heavier".

- benchmarks/ops.py: op registry (OpSpec) + size profiles (small: 1-D, 2000; large: 3-D, 3×4×1000; they differ in element count *and* dim count; the asymmetric shape also catches dim-order bugs) + ~30 ops across scaling, var/expr arithmetic, quadratic, reductions and constraint construction. Binary labelled ops carry match/broadcast variants, the alignment-path axis where the interesting regressions live.
- benchmarks/drivers/test_ops.py: parametrized driver, one benchmark per (op, profile).
- conftest: add test_ops to CODSPEED_MODULES (tracked; memory advisory).

60 benchmarks, ~80s/run with memory. Signal validates: large ≈ 6× small, broadcast ≈ 5× match (the §9 cross-product).

codspeed-hq · 2026-07-01T13:13:45Z

Merging this PR will not alter performance

✅ 138 untouched benchmarks
🆕 35 new benchmarks
⏩ 138 skipped benchmarks¹

Performance Changes

| | Mode | Benchmark | `BASE` | `HEAD` | Efficiency |
|---|---|---|---|---|---|
| 🆕 | Memory | `test_op[con_eq_expr]` | N/A | 1.1 MB | N/A |
| 🆕 | Memory | `test_op[con_le_array]` | N/A | 480.5 KB | N/A |
| 🆕 | Memory | `test_op[con_le_scalar]` | N/A | 480.5 KB | N/A |
| 🆕 | Memory | `test_op[expr_add_array_bcast]` | N/A | 1 MB | N/A |
| 🆕 | Memory | `test_op[expr_add_array_match]` | N/A | 281.3 KB | N/A |
| 🆕 | Memory | `test_op[expr_add_expr_bcast]` | N/A | 4.8 MB | N/A |
| 🆕 | Memory | `test_op[expr_add_expr_match]` | N/A | 985.9 KB | N/A |
| 🆕 | Memory | `test_op[expr_add_masked]` | N/A | 985.9 KB | N/A |
| 🆕 | Memory | `test_op[expr_add_scalar]` | N/A | 187.5 KB | N/A |
| 🆕 | Memory | `test_op[expr_add_var]` | N/A | 1.1 MB | N/A |
| 🆕 | Memory | `test_op[expr_fillna]` | N/A | 468.8 KB | N/A |
| 🆕 | Memory | `test_op[expr_groupby_sum]` | N/A | 606.6 KB | N/A |
| 🆕 | Memory | `test_op[expr_mul_array_bcast]` | N/A | 1.6 MB | N/A |
| 🆕 | Memory | `test_op[expr_mul_array_match]` | N/A | 468.8 KB | N/A |
| 🆕 | Memory | `test_op[expr_mul_scalar]` | N/A | 468.8 KB | N/A |
| 🆕 | Memory | `test_op[expr_mul_var]` | N/A | 891.6 KB | N/A |
| 🆕 | Memory | `test_op[expr_sub_expr_match]` | N/A | 1.1 MB | N/A |
| 🆕 | Memory | `test_op[expr_sum_all]` | N/A | 212.3 KB | N/A |
| 🆕 | Memory | `test_op[expr_sum_dim]` | N/A | 212.3 KB | N/A |
| 🆕 | Memory | `test_op[expr_where]` | N/A | 294.4 KB | N/A |
| ... | ... | ... | ... | ... | ... |

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

Comparing bench/arithmetic-ops (a99bcb4) with master (fe798b1)

¹ 138 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.

Collapse to one 3-D profile (3×4×1000, ~12 K elements) — CodSpeed records time *and* memory per benchmark, so a second size wasn't buying a separate signal; one multi-dim profile keeps broadcast/alignment coverage with MB-scale ops above the noise floor, and halves the matrix. Benchmark ids drop the size suffix. Add three categories: absence/masking (expr.where / fillna / absence propagation — §4–§7, the semantics-heavy surface), groupby.sum, and an N-way merge (constraint-assembly cost). 35 ops, ~45 s/run with memory.
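
To make the shape argument concrete: with all three dim lengths distinct, any accidental axis permutation changes the array shape, so a dim-order bug cannot pass a shape check unnoticed. The sketch below checks this with plain tuples; the helper name `transposes_detectable` is invented for this example and is not part of the PR.

```python
from itertools import permutations
from math import prod


def transposes_detectable(shape: tuple[int, ...]) -> bool:
    """True if every non-identity axis permutation changes the shape tuple."""
    identity = tuple(range(len(shape)))
    return all(
        tuple(shape[i] for i in perm) != shape
        for perm in permutations(identity)
        if perm != identity
    )


GRID = (3, 4, 1000)

# Element count of the single profile: 3 * 4 * 1000.
n_elements = prod(GRID)

# Distinct lengths: every transpose is visible in the shape.
grid_ok = transposes_detectable(GRID)

# A cubic grid hides all transposes; a repeated length hides some.
cube_ok = transposes_detectable((10, 10, 10))
partial_ok = transposes_detectable((3, 3, 1000))
```

Here `grid_ok` is true while `cube_ok` and `partial_ok` are false, and `n_elements` is 12,000, matching the "~12 K elements" quoted above.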

FBumann · 2026-07-01T14:35:52Z

Note

The following content was generated by AI.

CodSpeed cost check. This PR adds 35 arithmetic micro-benchmarks (one GRID
profile each, dims 3×4×1000). Under CodSpeed's per-PR memory instrument the
full benchmarks/ run (build + ops) stays at ~2.5 min — no cost blow-up:

| Run | Instrument | Duration |
|---|---|---|
| fork PR (ops only) | memory | 2m36s |
| fork PR #40 (build + ops) | memory | 2m44s |

The bare-metal walltime job remains gated to master + the trigger:benchmark
label, so PRs don't incur it. Verified on the fork: fluxopt#40 ran the ops
under CodSpeed and the memory comparison worked without inflating cost.

FBumann mentioned this pull request Jul 1, 2026

bench: arithmetic operation micro-benchmarks fluxopt/linopy#39

Merged

FBumann marked this pull request as ready for review July 1, 2026 14:37

FBumann requested a review from FabianHofmann July 1, 2026 14:37

FBumann wants to merge 2 commits into master from bench/arithmetic-ops
