Skip to content

Catalog implications: Hydrate plans in parse_item from the expression cache#36146

Draft
ggevay wants to merge 1 commit intoMaterializeInc:mainfrom
ggevay:implications-optimizer-step-D
Draft

Catalog implications: Hydrate plans in parse_item from the expression cache#36146
ggevay wants to merge 1 commit intoMaterializeInc:mainfrom
ggevay:implications-optimizer-step-D

Conversation

@ggevay
Copy link
Copy Markdown
Contributor

@ggevay ggevay commented Apr 19, 2026

DRAFT

Followup to #35837

(Btw. I'm still planning to simplify how we handle plans on catalog objects by making the plans non-optional a bit later. That would eliminate the annoying differences we have between how the catalog handles locally-optimized plans vs. globally-optimized plans.)

Nightly: https://buildkite.com/materialize/nightly/builds/16132

@ggevay ggevay added the A-ADAPTER Topics related to the ADAPTER layer label Apr 19, 2026
@ggevay ggevay force-pushed the implications-optimizer-step-D branch 10 times, most recently from 295514d to ed67cae Compare April 20, 2026 15:31
@ggevay ggevay force-pushed the implications-optimizer-step-D branch 3 times, most recently from cbb20b3 to 97a24f3 Compare May 5, 2026 13:27
ggevay added a commit that referenced this pull request May 5, 2026
…lications (#35837)

This is further preparation for handling those DDL commands in the
catalog implications that need query plans. The 2 commits move things
from the catalog transaction's side effect closure to before the catalog
transaction. See more details in the commit messages.

The important thing is that this unlocks a later change that reads plans
back from the cache in `parse_item`. (Draft PR:
#36146)

Nightly: https://buildkite.com/materialize/nightly/builds/16131
(slightly stale)

---------

Co-authored-by: Junie <junie@jetbrains.com>
Makes the durable expression cache the source of truth for plans
(optimized MIR, physical LIR, and dataflow metainfo including
rendered optimizer notices) on newly-created Index /
MaterializedView items, rather than threading them
through an in-memory side-channel. This works across envd
processes, which we will need for 0-downtime upgrades and
eventually for running multiple envds at the same time.

Changes:

* `expr_cache.rs`: the cache task now maintains an in-memory
  mirror of `GlobalExpressions` (seeded from the `open` result and
  kept in sync on every `Update` op). Adds
  `ExpressionCacheHandle::get_global_expressions(ids)`, which
  returns the best-effort snapshot for the requested ids via a
  oneshot request/response channel — consistent with the existing
  single-writer design. Missing ids are simply absent from the
  map (no error).

* `catalog/transact.rs`: `Catalog::transact` pre-fetches
  `GlobalExpressions` for every plan-bearing `Op::CreateItem`
  (Index / MaterializedView) and threads the map
  through `transact_inner` into the `InMemoryExpressionCache`.
  Correctness relies on the sequencer `_finish` methods
  `.await`ing `cache_expressions` before reaching `transact`
  (see the "Move cache_expressions before the catalog
  transaction and await it" commit), so the entry for each id is
  guaranteed to be visible in the cache's in-memory mirror by the
  time we read it here. `transact_incremental_dry_run` passes an
  empty map (ALTER SINK dry-runs don't create new plan-bearing
  items).

* `catalog/state.rs`: `InMemoryExpressionCache::Open` gains a
  `cached_global_exprs` map with a matching
  `remove_cached_global_expression` accessor. After
  `parse_item_inner` succeeds, `parse_item` stamps the returned
  `optimized_plan` / `physical_plan` / `dataflow_metainfo` onto
  the freshly-built `CatalogItem` (MV / Index / CT).

* Removes now-redundant `set_optimized_plan` / `set_physical_plan`
  / `set_dataflow_metainfo` calls from all three `_finish` paths
  (`create_index.rs`, `create_materialized_view.rs`),
  since `parse_item` already stamps those fields from the cache
  entry the sequencer has just awaited.

`bootstrap_dataflow_plans` in `coord.rs` is intentionally kept
as-is: `parse_item` hydration covers the cache-hit case, but the
cache-miss path (cold start, version upgrade where the cache is
empty for the new version, `optimizer_features` drift) still
needs to re-optimize from `create_sql` and populate
`uncached_expressions` for the next writeback.

Verification:

* `cargo check -p mz-adapter`, `cargo check -p mz-catalog`: clean.
* `mz-catalog::expr_cache::tests::expression_cache`: PASS.
* All four `test/sqllogictest/transform/notice/*.slt`: PASS
  (17 + 24 + 13 + 41 = 95/95).
* `test/sqllogictest/rename.slt` + `test/sqllogictest/alter.slt`
  (which exercise the retract+add plan-carry-through fix in
  `parse_item_inner`): PASS 230/230.

Co-authored-by: Junie <junie@jetbrains.com>
@ggevay ggevay force-pushed the implications-optimizer-step-D branch from 97a24f3 to 5672efd Compare May 5, 2026 15:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-ADAPTER Topics related to the ADAPTER layer

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant