Update spec shaking v2 to keep alive subtype markers via reference #1835

Open
mootz12 wants to merge 11 commits into main from spec-shaking-marker-recursive-guard

Conversation

Contributor

@mootz12 mootz12 commented Apr 16, 2026

What

Introduce keep_reachable, a small helper that preserves a function's symbol across the linker's DCE pass without invoking it at runtime.

The following type implementations in soroban-sdk/src/spec_shaking.rs have been rewritten from T::spec_shaking_marker() to keep_reachable(T::spec_shaking_marker):

  • &T, &mut T
  • Vec<T>, Map<K, V>

This ensures spec markers for nested types are not eliminated by DCE, while leaving no recursive function call paths in the final WASM.
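The rewrite described above can be sketched as follows. This is a minimal illustration, not the SDK's code: the `SpecShakingMarker` trait here is a simplified stand-in, and `Leaf`, `Container`, `MARKER_CALLS`, and `container_marker_calls` are hypothetical names introduced only for this example. Only the body of `keep_reachable` is taken from the diff under review.

```rust
use core::marker::PhantomData;
use core::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counter used only to observe whether a marker body ran.
static MARKER_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Simplified stand-in for the SDK's marker trait (assumption: the real
/// trait may have a different shape).
trait SpecShakingMarker {
    fn spec_shaking_marker();
}

/// Reads the function pointer through a volatile load so the pointer counts
/// as used, without ever calling `f`.
#[inline(always)]
fn keep_reachable(f: fn()) {
    let _ = unsafe { core::ptr::read_volatile(&f) };
}

/// Hypothetical leaf type whose marker records that it was executed.
#[allow(dead_code)]
struct Leaf;

impl SpecShakingMarker for Leaf {
    fn spec_shaking_marker() {
        MARKER_CALLS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Hypothetical container standing in for `Vec<T>`.
#[allow(dead_code)]
struct Container<T>(PhantomData<T>);

impl<T: SpecShakingMarker> SpecShakingMarker for Container<T> {
    fn spec_shaking_marker() {
        // Previously: T::spec_shaking_marker();
        // Now the inner marker is referenced, not invoked.
        keep_reachable(T::spec_shaking_marker);
    }
}

/// Runs the container marker and reports how many leaf markers executed.
fn container_marker_calls() -> usize {
    MARKER_CALLS.store(0, Ordering::SeqCst);
    <Container<Leaf> as SpecShakingMarker>::spec_shaking_marker();
    MARKER_CALLS.load(Ordering::SeqCst)
}
```

In this sketch `container_marker_calls()` returns 0: the inner marker's address is taken, but its body never runs.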

Why

Closes #1834

The overflow issue was caused by a cycle in the spec_shaking_marker call graph. Each type's marker invoked the markers of its subtypes, so a recursive type definition produced unbounded recursion and a stack overflow during execution.

This only applies to Vec and Map currently. Due to some restrictions in map_type on applying lifetime arguments to struct field types, references currently can't be used to contain recursive type definitions. Because that restriction is unrelated to spec shaking, the keep_reachable fix was still applied to reference types.
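The cycle can be reproduced in a small model. The sketch below is illustrative only: `Marker`, `DirectVec`, `RefVec`, `DirectTree`, `RefTree`, `DEPTH`, `LIMIT`, and `depth_reached` are hypothetical names, the sketch uses `std` for brevity, and the depth guard exists solely so the direct-call variant terminates instead of overflowing the stack.

```rust
use std::cell::Cell;
use std::marker::PhantomData;

thread_local! {
    // Hypothetical counter of how many tree markers have been entered.
    static DEPTH: Cell<u32> = Cell::new(0);
}

// Artificial guard so the direct-call variant stops instead of overflowing.
const LIMIT: u32 = 100;

trait Marker {
    fn marker();
}

fn keep_reachable(f: fn()) {
    let _ = unsafe { core::ptr::read_volatile(&f) };
}

/// Container that calls its inner marker (the old behavior).
#[allow(dead_code)]
struct DirectVec<T>(PhantomData<T>);
impl<T: Marker> Marker for DirectVec<T> {
    fn marker() {
        T::marker();
    }
}

/// Container that only references its inner marker (the new behavior).
#[allow(dead_code)]
struct RefVec<T>(PhantomData<T>);
impl<T: Marker> Marker for RefVec<T> {
    fn marker() {
        keep_reachable(T::marker);
    }
}

/// Models `struct Tree { children: Vec<Tree> }` with direct calls.
#[allow(dead_code)]
struct DirectTree;
impl Marker for DirectTree {
    fn marker() {
        let depth = DEPTH.with(|d| {
            d.set(d.get() + 1);
            d.get()
        });
        if depth >= LIMIT {
            return; // without this guard the recursion would never end
        }
        <DirectVec<DirectTree> as Marker>::marker();
    }
}

/// Models the same recursive type with reference-only propagation.
#[allow(dead_code)]
struct RefTree;
impl Marker for RefTree {
    fn marker() {
        DEPTH.with(|d| d.set(d.get() + 1));
        <RefVec<RefTree> as Marker>::marker();
    }
}

/// Runs `f` and reports how many tree markers were entered.
fn depth_reached(f: fn()) -> u32 {
    DEPTH.with(|d| d.set(0));
    f();
    DEPTH.with(|d| d.get())
}
```

With direct calls, `depth_reached(<DirectTree as Marker>::marker)` runs until it hits the artificial `LIMIT`. With the reference-only container, `depth_reached(<RefTree as Marker>::marker)` enters the tree marker exactly once.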

Known limitations

This will increase contract size by ~15 bytes for each occurrence of keep_reachable. Per the SDK test contracts, the contracts impacted saw an increase of ~100 bytes.

  • Determine if we need to support all types, or if just Vec<T> and Map<K, V> are sufficient. I don't think we currently allow constructing recursive types without using a container type like Vec or Map, but this should be confirmed as intentional and documented before relying on it.

Contributor Author

All functional changes exist here. The rest are tests / expanded tests / snapshots / etc.

Contributor

Copilot AI left a comment

Pull request overview

This PR updates Soroban Rust SDK “spec shaking v2” marker propagation for container/reference types to avoid runtime recursion with recursive user-defined types, while still keeping nested type markers reachable for post-build spec filtering.

Changes:

  • Add keep_reachable helper and rewrite container/reference SpecShakingMarker impls to reference inner marker functions instead of calling them.
  • Add new recursive contract types/functions in test contracts and extend spec-shaking tests to assert recursive marker/spec retention.
  • Update build/test outputs (expanded sources and JSON snapshots) and ensure spec shaking v2 is enabled during make test.

Reviewed changes

Copilot reviewed 21 out of 24 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| soroban-sdk/src/spec_shaking.rs | Introduces keep_reachable and switches container/reference marker propagation to symbol-reference style to avoid recursion. |
| Makefile | Ensures SOROBAN_SDK_BUILD_SYSTEM_SUPPORTS_SPEC_SHAKING_V2=1 is set for test-only runs. |
| tests/udt/src/lib.rs | Adds recursive contract types and contract functions (recursive, recursive_enum) plus native + wasm-import tests. |
| tests/spec_shaking_v2/src/lib.rs | Adds recursive contract types and a with_recursion boundary function to validate marker propagation. |
| tests/spec_shaking_v2/src/test.rs | Extends spec-shaking v2 assertions to include recursion and adds marker/spec count sanity checks. |
| tests/spec_shaking_v1/src/test.rs | Updates v1 spec shaking test expectations to include recursive types. |
| soroban-spec-rust/src/lib.rs | Updates golden expected generated Rust output to include the new recursive types and functions. |
| tests/udt/test_snapshots/test_with_wasm/test_add.1.json | Updates wasm-backed snapshot to reflect new build output. |
| tests/udt/test_snapshots/test_with_wasm/test_recursive.1.json | Adds/updates snapshot for recursive-type wasm invocation. |
| tests/udt/test_snapshots/test_with_wasm/test_recursive_enum.1.json | Adds/updates snapshot for recursive-enum wasm invocation. |
| tests/udt/test_snapshots/test/test_recursive.1.json | Adds native (non-wasm) snapshot for recursive test case. |
| tests/udt/test_snapshots/test/test_recursive_enum.1.json | Adds native (non-wasm) snapshot for recursive enum test case. |
| tests/tuples/test_snapshots/test/test_wasm_tuple1.1.json | Updates tuple wasm snapshot (codegen/table/elem-segment changes). |
| tests/tuples/test_snapshots/test/test_wasm_tuple2.1.json | Updates tuple wasm snapshot (codegen/table/elem-segment changes). |
| tests/tuples/test_snapshots/test/test_wasm_void.1.json | Updates void wasm snapshot (codegen/table/elem-segment changes). |
| soroban-sdk/test_snapshots/tests/crypto_bls12_381/test_invoke_contract.1.json | Updates wasm snapshot due to codegen changes affecting function/table layout. |
| soroban-sdk/test_snapshots/tests/address/test_get_existing_contract_address_executable_wasm.1.json | Updates wasm snapshot hash/metrics due to changed wasm output. |
| tests-expanded/test_udt_wasm32v1-none.rs | Updates expanded wasm output reflecting new contract types/functions and marker behavior. |
| tests-expanded/test_tuples_tests.rs | Updates embedded wasm bytes in expanded tuple tests due to new wasm layout. |
| tests-expanded/test_spec_shaking_v2_wasm32v1-none.rs | Updates expanded wasm output for v2 test contract (recursive types + new fn). |
| tests-expanded/test_spec_shaking_v1_wasm32v1-none.rs | Updates expanded wasm output for v1 test contract (recursive types + new fn). |

Comment thread soroban-sdk/src/spec_shaking.rs Outdated
Comment on lines +48 to +56
/// Same volatile-read technique the macro-generated marker bodies use to
/// keep their `MARKER` byte statics alive (see `soroban-sdk-macros/src/shaking.rs`),
/// applied to a function pointer instead of a data pointer.
#[doc(hidden)]
#[inline(always)]
fn keep_reachable(f: fn()) {
    let _ = unsafe { core::ptr::read_volatile(&f) };
}

Copilot AI Apr 16, 2026

keep_reachable currently performs a volatile load on all targets. Since the marker bytes are only emitted under #[cfg(target_family = "wasm")], doing a volatile read in native test builds is unnecessary overhead and can inhibit optimizations. Consider making keep_reachable a no-op (or just let _ = f;) on non-wasm targets, and keeping the volatile-load implementation only for wasm builds where you need to defeat linker DCE.

Suggested change
- /// Same volatile-read technique the macro-generated marker bodies use to
- /// keep their `MARKER` byte statics alive (see `soroban-sdk-macros/src/shaking.rs`),
- /// applied to a function pointer instead of a data pointer.
- #[doc(hidden)]
- #[inline(always)]
- fn keep_reachable(f: fn()) {
-     let _ = unsafe { core::ptr::read_volatile(&f) };
- }
+ /// On Wasm, use the same volatile-read technique the macro-generated marker
+ /// bodies use to keep their `MARKER` byte statics alive (see
+ /// `soroban-sdk-macros/src/shaking.rs`), applied to a function pointer
+ /// instead of a data pointer. On non-Wasm targets this is unnecessary, so
+ /// `keep_reachable` is a no-op.
+ #[doc(hidden)]
+ #[cfg(target_family = "wasm")]
+ #[inline(always)]
+ fn keep_reachable(f: fn()) {
+     let _ = unsafe { core::ptr::read_volatile(&f) };
+ }
+ #[doc(hidden)]
+ #[cfg(not(target_family = "wasm"))]
+ #[inline(always)]
+ fn keep_reachable(f: fn()) {
+     let _ = f;
+ }

Contributor Author

I'd prefer to keep the Rust runtime as similar as possible to the WASM one. Ultimately, this doesn't really make a difference either way.

Member

We already scope a lot to the wasm target, so I don't think we need to avoid that here.

@mootz12 mootz12 requested a review from Copilot April 16, 2026 19:55
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 17 out of 20 changed files in this pull request and generated no new comments.

@mootz12 mootz12 marked this pull request as ready for review April 17, 2026 12:14
@mootz12 mootz12 requested review from a team and leighmcculloch April 17, 2026 12:15
Member

leighmcculloch left a comment

One concern discussed inline.

Comment on lines +21 to +24
//! Types whose size is independent of their inner types (`Vec<T>`,
//! `Map<K, V>`, `&T`, `&mut T`) use [`keep_reachable`] to reference each
//! inner type's marker without calling it. Recursive definitions are
//! possible through these types, so a direct call would risk a runtime cycle.

Member

leighmcculloch Apr 21, 2026

This seems problematic if Vec<T> exists as a function parameter, and T references another type U. My understanding from reading the implementation is that U's marker fn will not be called because T's wasn't called.

Contributor Author

Updated the test, I think this works OK. The type T still has the U::spec_shaking_marker call, and T is kept alive via the fn-pointer reference.
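One way to reason about this reply is to treat dead-code elimination as graph reachability, where both a direct call and a taken function address count as an edge. The sketch below is a conceptual model only, not how the linker is implemented; the symbol names (`contract_fn`, `Vec<T>::marker`, `T::marker`, `U::marker`, `Unused::marker`) and the helpers `reachable` and `example_graph` are hypothetical.

```rust
use std::collections::{BTreeSet, HashMap};

/// Returns every symbol reachable from `root`. An edge stands for either a
/// direct call or a taken function address; this model treats both as a use.
fn reachable(
    edges: &HashMap<&'static str, Vec<&'static str>>,
    root: &'static str,
) -> BTreeSet<&'static str> {
    let mut seen = BTreeSet::new();
    let mut stack = vec![root];
    while let Some(symbol) = stack.pop() {
        // `insert` returns false for symbols already visited, which also
        // makes the walk terminate on cyclic (recursive) graphs.
        if seen.insert(symbol) {
            if let Some(next) = edges.get(symbol) {
                stack.extend(next.iter().copied());
            }
        }
    }
    seen
}

/// Hypothetical graph for a contract function taking `Vec<T>`, where `T`
/// has a field of type `U`.
fn example_graph() -> HashMap<&'static str, Vec<&'static str>> {
    let mut edges = HashMap::new();
    edges.insert("contract_fn", vec!["Vec<T>::marker"]); // direct call
    edges.insert("Vec<T>::marker", vec!["T::marker"]); // fn-pointer reference
    edges.insert("T::marker", vec!["U::marker"]); // direct call
    edges.insert("Unused::marker", vec!["U::marker"]); // nothing points here
    edges
}
```

In this model `U::marker` stays reachable from `contract_fn` even though `T::marker` is never executed, while `Unused::marker` is dropped because nothing refers to it.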

Member

Interesting, so should we just use this method of keeping alive instead of actually calling in all the cases? We don't need two approaches, we can use this fn ptr approach for all cases and just never call the fns?

Member

Yeah, if this way is more reliable and avoids an actual call at runtime, then let's use this approach for all marker creation, not just the Vec/Map ones. Is there a reason we can't do that?

Contributor Author

I originally had that, but it does inflate the WASM size more.

I recorded test_associated_type_chained between the versions:

  • main: 1985 bytes
  • just vec/map: 2067 bytes
  • all ptrs: 2218 bytes

I'll pull some data from larger contract sources and update here.
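For reference, the deltas implied by those three measurements can be worked out directly. The snippet below only restates the reported numbers; the constant and function names are hypothetical.

```rust
/// Sizes in bytes as reported above for test_associated_type_chained.
const MAIN: u32 = 1985;
const VEC_MAP_ONLY: u32 = 2067;
const ALL_PTRS: u32 = 2218;

/// Absolute growth over a baseline, in bytes.
fn growth(baseline: u32, size: u32) -> u32 {
    size - baseline
}

/// Relative growth over a baseline, in percent.
fn growth_percent(baseline: u32, size: u32) -> f64 {
    f64::from(growth(baseline, size)) * 100.0 / f64::from(baseline)
}
```

Against main, the Vec/Map-only version grows by 82 bytes (about 4.1%) and the all-pointers version by 233 bytes (about 11.7%), so applying the approach everywhere costs a further 151 bytes in this test contract.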

Member

leighmcculloch Apr 23, 2026

I guess this is because in wasm Rust forces any fn that has its pointer taken to be a fn in the fn table. So it can't be inlined. So every contract type's marker fn ends up in the fn table even if it is never called. 🤔

Development

Successfully merging this pull request may close these issues.

spec shaking v2: Support recursive types

3 participants