Update spec shaking v2 to keep alive subtype markers via reference#1835
Update spec shaking v2 to keep alive subtype markers via reference#1835
Conversation
There was a problem hiding this comment.
All functional changes exist here. The rest are tests / expanded tests / snapshots / etc.
There was a problem hiding this comment.
Pull request overview
This PR updates Soroban Rust SDK “spec shaking v2” marker propagation for container/reference types to avoid runtime recursion with recursive user-defined types, while still keeping nested type markers reachable for post-build spec filtering.
Changes:
- Add
keep_reachablehelper and rewrite container/referenceSpecShakingMarkerimpls to reference inner marker functions instead of calling them. - Add new recursive contract types/functions in test contracts and extend spec-shaking tests to assert recursive marker/spec retention.
- Update build/test outputs (expanded sources and JSON snapshots) and ensure spec shaking v2 is enabled during
make test.
Reviewed changes
Copilot reviewed 21 out of 24 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| soroban-sdk/src/spec_shaking.rs | Introduces keep_reachable and switches container/reference marker propagation to symbol-reference style to avoid recursion. |
| Makefile | Ensures SOROBAN_SDK_BUILD_SYSTEM_SUPPORTS_SPEC_SHAKING_V2=1 is set for test-only runs. |
| tests/udt/src/lib.rs | Adds recursive contract types and contract functions (recursive, recursive_enum) plus native + wasm-import tests. |
| tests/spec_shaking_v2/src/lib.rs | Adds recursive contract types and a with_recursion boundary function to validate marker propagation. |
| tests/spec_shaking_v2/src/test.rs | Extends spec-shaking v2 assertions to include recursion and adds marker/spec count sanity checks. |
| tests/spec_shaking_v1/src/test.rs | Updates v1 spec shaking test expectations to include recursive types. |
| soroban-spec-rust/src/lib.rs | Updates golden expected generated Rust output to include the new recursive types and functions. |
| tests/udt/test_snapshots/test_with_wasm/test_add.1.json | Updates wasm-backed snapshot to reflect new build output. |
| tests/udt/test_snapshots/test_with_wasm/test_recursive.1.json | Adds/updates snapshot for recursive-type wasm invocation. |
| tests/udt/test_snapshots/test_with_wasm/test_recursive_enum.1.json | Adds/updates snapshot for recursive-enum wasm invocation. |
| tests/udt/test_snapshots/test/test_recursive.1.json | Adds native (non-wasm) snapshot for recursive test case. |
| tests/udt/test_snapshots/test/test_recursive_enum.1.json | Adds native (non-wasm) snapshot for recursive enum test case. |
| tests/tuples/test_snapshots/test/test_wasm_tuple1.1.json | Updates tuple wasm snapshot (codegen/table/elem-segment changes). |
| tests/tuples/test_snapshots/test/test_wasm_tuple2.1.json | Updates tuple wasm snapshot (codegen/table/elem-segment changes). |
| tests/tuples/test_snapshots/test/test_wasm_void.1.json | Updates void wasm snapshot (codegen/table/elem-segment changes). |
| soroban-sdk/test_snapshots/tests/crypto_bls12_381/test_invoke_contract.1.json | Updates wasm snapshot due to codegen changes affecting function/table layout. |
| soroban-sdk/test_snapshots/tests/address/test_get_existing_contract_address_executable_wasm.1.json | Updates wasm snapshot hash/metrics due to changed wasm output. |
| tests-expanded/test_udt_wasm32v1-none.rs | Updates expanded wasm output reflecting new contract types/functions and marker behavior. |
| tests-expanded/test_tuples_tests.rs | Updates embedded wasm bytes in expanded tuple tests due to new wasm layout. |
| tests-expanded/test_spec_shaking_v2_wasm32v1-none.rs | Updates expanded wasm output for v2 test contract (recursive types + new fn). |
| tests-expanded/test_spec_shaking_v1_wasm32v1-none.rs | Updates expanded wasm output for v1 test contract (recursive types + new fn). |
| /// Same volatile-read technique the macro-generated marker bodies use to | ||
| /// keep their `MARKER` byte statics alive (see `soroban-sdk-macros/src/shaking.rs`), | ||
| /// applied to a function pointer instead of a data pointer. | ||
| #[doc(hidden)] | ||
| #[inline(always)] | ||
| fn keep_reachable(f: fn()) { | ||
| let _ = unsafe { core::ptr::read_volatile(&f) }; | ||
| } | ||
|
|
There was a problem hiding this comment.
keep_reachable currently performs a volatile load on all targets. Since the marker bytes are only emitted under #[cfg(target_family = "wasm")], doing a volatile read in native test builds is unnecessary overhead and can inhibit optimizations. Consider making keep_reachable a no-op (or just let _ = f;) on non-wasm targets, and keeping the volatile-load implementation only for wasm builds where you need to defeat linker DCE.
| /// Same volatile-read technique the macro-generated marker bodies use to | |
| /// keep their `MARKER` byte statics alive (see `soroban-sdk-macros/src/shaking.rs`), | |
| /// applied to a function pointer instead of a data pointer. | |
| #[doc(hidden)] | |
| #[inline(always)] | |
| fn keep_reachable(f: fn()) { | |
| let _ = unsafe { core::ptr::read_volatile(&f) }; | |
| } | |
| /// On Wasm, use the same volatile-read technique the macro-generated marker | |
| /// bodies use to keep their `MARKER` byte statics alive (see | |
| /// `soroban-sdk-macros/src/shaking.rs`), applied to a function pointer | |
| /// instead of a data pointer. On non-Wasm targets this is unnecessary, so | |
| /// `keep_reachable` is a no-op. | |
| #[doc(hidden)] | |
| #[cfg(target_family = "wasm")] | |
| #[inline(always)] | |
| fn keep_reachable(f: fn()) { | |
| let _ = unsafe { core::ptr::read_volatile(&f) }; | |
| } | |
| #[doc(hidden)] | |
| #[cfg(not(target_family = "wasm"))] | |
| #[inline(always)] | |
| fn keep_reachable(f: fn()) { | |
| let _ = f; | |
| } |
There was a problem hiding this comment.
I'd prefer to keep the Rust runtime as similar as possible to the WASM one. Ultimately, this doesn't really make a difference either way.
There was a problem hiding this comment.
We already scope a lot to the wasm target, so I don't think we need to avoid that here.
leighmcculloch
left a comment
There was a problem hiding this comment.
One concern discussed inline.
| //! Types whose size is independent of their inner types (`Vec<T>`, | ||
| //! `Map<K, V>`, `&T`, `&mut T`) use [`keep_reachable`] to reference each | ||
| //! inner type's marker without calling it. Recursive definitions are | ||
| //! possible through these types, so a direct call would risk a runtime cycle. |
There was a problem hiding this comment.
This seems problematic if Vec<T> exists as a function parameter, and T references another type U. My understanding from reading the implementation is that U's marker fn will not be called because T's wasn't called.
There was a problem hiding this comment.
Updated the test, I think this works OK. The type T still has the U::spec_shaking_marker call, and T is kept alive via the fn-pointer reference.
There was a problem hiding this comment.
Interesting, so should we just use this method of keeping alive instead of actually calling in all the cases? We don't need two approaches, we can use this fn ptr approach for all cases and just never call the fns?
There was a problem hiding this comment.
Yeah if this way is more reliable and avoids an actual call at runtime, then lets use this approach for all marker creation, not just the Vec/Map ones. Is there a reason we can't do that?
There was a problem hiding this comment.
I originally had that, but it does inflate the WASM size more.
I recorded test_associated_type_chained between the versions:
main: 1985 bytes
just vec/map: 2067 bytes
all ptrs: 2218 bytes
I'll pull some data from larger contract sources and update here.
There was a problem hiding this comment.
I guess this is because in wasm Rust forces any fn that has its pointer taken to be a fn in the fn table. So it can't be inlined. So every contract type's marker fn ends up in the fn table even if it is never called. 🤔
What
Introduce
keep_reachable, a small helper that preserves a function's symbol across the linker's DCE pass without invoking it at runtime.The following type implementations in
soroban-sdk/src/spec_shaking.rshave been rewritten fromT::spec_shaking_marker()tokeep_reachable(T::spec_shaking_marker):This ensures spec markers for nested types don't get eliminated by DCE, but there is no risk of recursive function call paths in the final WASM.
Why
Closes #1834
The overflow issue was causing by a loop within the
spec_shaking_markerfunction. Each type invoked this on any subtypes, so if there was a loop, a stack overflow would occur during execution.This only applies to
VecandMapcurrently. Due to some restrictions inmap_typewith applying lifetime arguments to struct field types, references currently can't be used to contain recursive type definitions. However, given this restriction is unrelated, thekeep_reachablefix was still applied to those types.Known limitations
This will increase contract size for each occurrence of
keep_reachable, by ~15 bytes each. Per the SDK test contracts, the contacts impacted saw an increase of ~100 bytes.Determine if we need to support all types, or if justVec<T>andMap<K,V>is sufficient. I don't think we currently allow constructing recursive types without using a container type likeVecorMap, but this should be confirmed as intentional and documented before relying on it