From 354d445586b6f48bf2f47fd0643a5cd8b9e7ead7 Mon Sep 17 00:00:00 2001 From: Red Davies Date: Fri, 1 May 2026 21:29:05 -0400 Subject: [PATCH 1/3] =?UTF-8?q?Initial=20draft=20RFC=20to=20kick=20off=20d?= =?UTF-8?q?iscussions=E2=80=A6?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- text/0000-type-layout-intrinsics.md | 437 ++++++++++++++++++++++++++++ 1 file changed, 437 insertions(+) create mode 100644 text/0000-type-layout-intrinsics.md diff --git a/text/0000-type-layout-intrinsics.md b/text/0000-type-layout-intrinsics.md new file mode 100644 index 00000000..6298327b --- /dev/null +++ b/text/0000-type-layout-intrinsics.md @@ -0,0 +1,437 @@ +- Feature Name: type-layout-intrinsics +- Start Date: 2026-05-01 +- RFC PR: (leave this empty) +- Pony Issue: (leave this empty) + +# Summary + +Add a small, cohesive set of compiler intrinsics that expose the in-memory +layout of types: store size, array stride, alignment, and field offsets. +These are exposed as methods on a new `TypeInfo` primitive and are direct +projections of information LLVM already computes during code generation. + +# Motivation + +Pony users today have no first-class way to ask basic layout questions +about a type, oft required for C-FFI binding: + +- "How many bytes do I need to allocate to hold one of these?" +- "How many bytes between consecutive elements in an array of these?" +- "What alignment does this type require?" +- "Where does field `_x` start within this struct?" + +These questions come up every time a Pony program touches raw memory: +allocating buffers via `@pony_alloc`, packing or unpacking C-ABI +structs across the FFI boundary, building custom serializers, implementing +allocators, computing addresses for pointer arithmetic, or interfacing +with hardware/protocol layouts that pin field positions. + +Today, users hand-compute these numbers, hard-code them, or copy them out +of C headers. 
All three approaches drift silently when fields are +reordered, types are widened, or the target ABI changes. The information +is already known to the compiler — there is no good reason for users to +recompute it by hand. + +There is also a related family of questions about deep memory accounting +("how much memory does this `Array[String]` actually consume, transitively?"). +Those are explicitly **out of scope** for this RFC; they will be proposed +separately as a separate interface. + +Intrinsics in this RFC are the primitives that such a follow-up may choose +build upon. + +# Detailed design + +A new primitive `TypeInfo` is added to the standard library. It exposes +four methods, all `compile_intrinsic`. Each takes the type of interest as +a type parameter. + +```pony +primitive TypeInfo + fun tag size_of[T](): USize => compile_intrinsic + fun tag stride_of[T](): USize => compile_intrinsic + fun tag align_of[T](): USize => compile_intrinsic + fun tag offset_of[T](field: String): USize => compile_intrinsic +``` + +Each intrinsic maps to a specific LLVM query. Where two LLVM functions +both plausibly fit, the split below picks the one whose semantics match +the method's documented contract. + +## `size_of[T]()` + +The number of bytes occupied by the bits of a single value of type `T`, +not counting any padding required to place the next value in an array. +Equivalent to LLVM's `LLVMStoreSizeOfType`. + +`T` must be a concrete nominal type (a numeric primitive, a struct, a +class, or an actor). It may not be a union, intersection, interface, +trait, or type parameter that does not resolve to a concrete type at +the call site — those are compile-time errors. For class and actor +types, `size_of` returns the full size of the heap-allocated object: +the runtime type-descriptor pointer (one `USize`-sized word on the +current target — eight bytes on 64-bit) plus the type's fields. 
This +is the size the runtime allocates for one instance, and the value +needed for packing classes or actors into a C-compatible structure. + +To compute the size of a value whose static type is a union, match on +the value first to obtain a binding of the concrete member type, then +call `size_of` on that: + +```pony +let bytes = match cell + | let s: String => TypeInfo.size_of[String]() + | let n: U64 => TypeInfo.size_of[U64]() + end +``` + +The same pattern applies to `stride_of` and `align_of`. + +**Example uses:** + +Allocating a buffer to hold `n` packed values: + +```pony +let buf = @pony_alloc(@pony_ctx(), (TypeInfo.size_of[Header]() * n)) +``` + +Sanity-checking that an FFI-returned buffer is large enough before +casting: + +```pony +if returned_bytes < TypeInfo.size_of[CSocketAddr]() then + error +end +``` + +Computing the position of a trailing variable-length field in a serial +format: + +```pony +let payload_offset = TypeInfo.size_of[FrameHeader]() +``` + +## `stride_of[T]()` + +The number of bytes between the start of one element and the start of +the next when `T` is laid out in an array. Equivalent to LLVM's +`LLVMABISizeOfType`. Always greater than or equal to `size_of[T]()`; +they differ only when `T`'s alignment requires trailing padding. + +This is the value that should be used whenever pointer arithmetic +walks element-by-element. Calling `size_of` instead would silently miss +the inter-element padding and produce a wrong address for any type +whose store size is not a multiple of its alignment. 
+ +**Example uses:** + +Computing the address of element `i` in a manually managed contiguous +buffer: + +```pony +let p_i = base.offset(TypeInfo.stride_of[Entry]() * i) +``` + +Sizing the backing allocation for an array-style data structure: + +```pony +let bytes_needed = TypeInfo.stride_of[T]() * capacity +``` + +Round-tripping through a C API that takes `(void*, element_size, count)`: + +```pony +@write_array(buf, TypeInfo.stride_of[T](), n) +``` + +## `align_of[T]()` + +The required alignment of `T` in bytes, as a power of two. Equivalent to +LLVM's `LLVMABIAlignmentOfType`. + +**Example uses:** + +Verifying that a pointer obtained from FFI is suitably aligned before +casting it to a typed pointer: + +```pony +if (raw_ptr.usize() % TypeInfo.align_of[Header]()) != 0 then + error // misaligned; reading would be UB on strict-alignment targets +end +``` + +Implementing an arena/bump allocator that must round its cursor up to +the alignment of the type being placed: + +```pony +fun ref _align_to[T](): USize => + let a = TypeInfo.align_of[T]() + (_cursor + (a - 1)) and not (a - 1) +``` + +Confirming a buffer is suitable for SIMD or DMA, both of which often +require stricter alignment than the natural type alignment. + +## `offset_of[T](field: String)` + +The byte offset at which the named field begins within `T`'s layout. +The `field` argument must be a string literal known at compile time; +the compiler validates that `T` has a field by that name and rejects +the call otherwise. Equivalent in spirit to C's `offsetof()` and Rust's +`core::mem::offset_of!`. + +`T` must be a struct, class, actor, or tuple. Unions, interfaces, and +traits are compile-time errors. For tuples, the `field` argument is +the positional accessor name (`"_1"`, `"_2"`, …) — the same name used +to access the field in source code. 
For classes and actors, the +offset is measured from the start of the heap-allocated object — the +same anchor as `size_of` — so the type-descriptor pointer occupies +the first `USize`-sized word and user-declared fields begin after it. + +**Example uses:** + +Writing a generic serializer that reads each field of a C-compatible +struct without naming the fields one by one: + +```pony +let off_x = TypeInfo.offset_of[Point]("x") +let off_y = TypeInfo.offset_of[Point]("y") +buf.write_at(off_x, p.x) +buf.write_at(off_y, p.y) +``` + +Implementing an intrusive data structure (the Linux-kernel +`container_of` pattern) where a node embedded in a larger struct +recovers the address of its container: + +```pony +fun container_of[Outer, Inner](inner_ptr: Pointer[Inner]): Pointer[Outer] => + inner_ptr.offset(-TypeInfo.offset_of[Outer]("node").isize()) +``` + +Crossing an FFI boundary where the C side hands back a pointer to a +field, not the enclosing struct. + +Why a string literal? Pony has no first-class field references and +no macro system, so a compile-time-validated string is the smallest +addition. The compiler is already touching field names during type +checking, so validating the literal against `T`'s field set is +cheap and gives a clear error message ("type `T` has no field +`foo`") at the call site. Nested paths (`"inner.field"`) are +deliberately not supported by this RFC; they can be added later +without breaking single-name calls. + +## What these intrinsics do not do + +This RFC deliberately stops at the layout questions LLVM can answer +about a single type or value. It does not propose: + +- **Deep memory accounting.** "How many bytes does this `Array[String]` + transitively own?" needs traversal, sharing/cycle policy, and per-type + semantics. A separate RFC will propose a `MemoryFootprint` interface + that types opt into, implemented in terms of these intrinsics. +- **Actor heap introspection.** "How much has *this actor* allocated?" 
+- **Type names, descriptors, or other reflection.** Out of scope. + +## Implementation + +PR ponyc#5267 is a WIP for an example implementation for the purposes +of exploring this design-space. Changes should only exist in the +compiler and stdlib. No changes required to the runtime (libponyrt). + +# How We Teach This + +These intrinsics live alongside `Pointer.alloc` and the FFI machinery +in the user's mental model. They should be introduced in the section +of the language tutorial that already covers raw memory and FFI, with +worked examples for each of the use cases above. The +`size_of` / `stride_of` distinction will be the most novel concept +for users coming from C (where `sizeof` conflates the two), and is +explained below. + +## The size / stride / alignment relationship + +The mental model: + +- **`align_of[T]`** is the rule. Every value of `T` must live at an + address divisible by `align_of[T]`. +- **`size_of[T]`** is how much one value of `T` occupies under that + rule. +- **`stride_of[T]`** is how far apart the rule forces consecutive + elements to be in an array of `T`. + +Stride must always be a multiple of alignment, so that every element's +start is aligned. The minimum legal stride is `size_of` rounded up to +the next multiple of `align_of`. For most types, size and stride are +equal — they diverge only when a type's content size isn't already a +multiple of its alignment. When they diverge, the gap is **trailing +padding**. + +### Example: `(U64, U8)` — size 9, align 8, stride 16 + +Layout of a single value (9 bytes): + +| `U64` | `U8` | +|-----------------|------| +| 0 1 2 3 4 5 6 7 | 8 | + +In an array, element 1 must start at a multiple of 8. Byte 9 isn't, +nor are 10..15. Byte 16 is. 
So elements are 16 bytes apart, and each +element's 16-byte slot has 7 bytes of trailing padding: + +| `U64` | `U8` | Padding | +|-----------------|------|----------------------| +| 0 1 2 3 4 5 6 7 | 8 | 9 10 11 12 13 14 15 | + +Subsequent elements occupy slots starting at byte 16, byte 32, byte +48, and so on, each laid out the same way relative to its slot start. + +- `size_of` = 9 (a single value occupies 9 bytes) +- `stride_of` = 16 (the next value starts 16 bytes later) +- `align_of` = 8 (max alignment of any field — the `U64` needs 8-aligned) + +The 7-byte trailing pad is what keeps each subsequent element's `U64` +8-aligned. + +### Example: `(U32, U8)` vs `(U8, U32)` — same fields, different layouts + +`(U32, U8)`: size 5, stride 8 — padding is trailing. + +| `U32` | `U8` | Padding | +|---------|------|---------| +| 0 1 2 3 | 4 | 5 6 7 | + +`(U8, U32)`: size 8, stride 8 — padding is internal. + +| `U8` | Padding | `U32` | +|------|---------|---------| +| 0 | 1 2 3 | 4 5 6 7 | + +Same fields, different order, same stride. `size_of` differs because +internal padding (between fields) counts toward size but trailing +padding does not. The visible consequence: a serializer that writes +`size_of[T]()` bytes per record writes 5 bytes for `(U32, U8)` but +8 bytes for `(U8, U32)`. + +### Working out the layout by hand + +Given a struct or tuple, you can derive size, stride, and alignment by +reading the fields left-to-right: + +1. Start a cursor at byte 0. +2. For each field: advance the cursor to the next address divisible + by that field's alignment (the gap is internal padding), then add + the field's size. +3. After the last field, the cursor's position is `size_of`. +4. The whole type's `align_of` is the maximum alignment among its + fields. +5. Round `size_of` up to a multiple of `align_of` — that's + `stride_of`. + +Worked example for `(U16, U64, U8)`: + +1. Cursor at 0. `U16` (align 2) is already aligned. It occupies + bytes 0..1. Cursor advances to 2. +2. 
`U64` (align 8): pad to byte 8 (6 bytes of internal padding). + It occupies 8..15. Cursor advances to 16. +3. `U8` (align 1) is already aligned. It occupies byte 16. Cursor + advances to 17. +4. `align_of` = max(2, 8, 1) = 8. +5. Round 17 up to a multiple of 8: 24. + +So `size_of[(U16, U64, U8)]() == 17`, +`align_of[(U16, U64, U8)]() == 8`, and +`stride_of[(U16, U64, U8)]() == 24`. + +### The footgun + +For every primitive type in Pony (`U8`, `U16`, …, `U64`, `F32`, +`F64`), `size_of == stride_of`. Generic code that uses `size_of` +where it should use `stride_of` passes tests written against +primitives and silently misbehaves when instantiated with a tuple or +struct whose alignment forces trailing padding. + +The rule of thumb: reach for `stride_of` whenever you are computing +positions in an array; reach for `size_of` only when you are packing +bytes into a format that has no inter-element padding (a wire format, +a tightly-packed file header, a record stream). + +## Naming precedent + +The naming follows established precedent: Swift's `MemoryLayout.size` +/ `.stride` / `.alignment`, and Rust's `mem::size_of` / +`mem::align_of` / `mem::offset_of!`. Pony users coming from either +language should find the names self-explanatory. + +# How We Test This + +Each intrinsic gets unit tests in the standard-library test suite +covering: + +- Numeric primitives (`U8`, `U32`, `F64`, `I128`) — exact known sizes. +- Structs with mixed-alignment fields where `size_of` and `stride_of` + must differ. +- Classes including ones with `embed`ded fields. +- `offset_of` for every field of a representative struct, asserting + that consecutive offsets are non-overlapping and respect alignment. +- `offset_of` with an invalid field name — must produce a compile + error, not a runtime failure. +- Negative tests: `size_of[SomeUnion]()`, `align_of[SomeInterface]()`, + `offset_of[SomeStruct]("nonexistent")` — all compile errors. 
+ +The LLVM functions these intrinsics wrap are exercised constantly by +LLVM's own backend; we do not need to test that LLVM is correct, only +that we are calling the right one. Standard CI coverage is sufficient. + +# Drawbacks + +- Adds four new compiler intrinsics, growing the intrinsic surface + area the compiler must keep working as it evolves. +- The `size_of` / `stride_of` split is novel for C-trained users; + picking the wrong one produces correct-looking code that breaks for + types with trailing alignment padding. +- `offset_of` introduces compile-time string validation as a new + category of intrinsic argument handling; future intrinsics may want + the same machinery, which is a small but real generalization + pressure on the compiler. +- Exposing layout questions invites users to write code that depends + on platform- or version-specific layouts. The intrinsics are + inherently target-dependent (a struct's size on a 32-bit ABI may + differ from a 64-bit ABI); users must understand they are getting + the answer for the current target. + +# Alternatives + +**Expose only `size_of` and `align_of`; let users compute stride +themselves.** Stride is "round size up to alignment", which users could +compute. Rejected because the rounding rule is exactly the kind of +small, easy-to-get-subtly-wrong arithmetic that should live behind a +named primitive — and because "stride" is itself a load-bearing concept +in any code that walks an array. + +**Use traits/interfaces instead of intrinsics.** A `Sized` trait that +each type implements would avoid touching the compiler. Rejected +because the values are properties of the type's layout, not behavior +the type chooses; making them virtual function calls would (a) impose a +runtime cost where there is none today and (b) let types lie about +their own size. The compiler is the source of truth. + +**Put the methods on individual types instead of a `TypeInfo` +primitive** (e.g., `U32.size_of()`). 
Rejected because the method needs +to work with type parameters, not just concrete types named at the +call site, and because grouping all layout questions under one +namespace makes them discoverable as a family. + +Not implementing this RFC leaves Pony users hand-computing layout +numbers and copying them from C headers. The code keeps working until +something gets reordered or the target ABI changes, at which point it +silently breaks. + +# Unresolved questions + +- **Should `offset_of` accept nested paths** like + `"outer_field.inner_field"` in this RFC, or punt to a follow-up? + The argument for now: the parsing is trivial and avoids a near-term + follow-up RFC. The argument for later: every additional feature in + the initial drop is more compiler surface to get right; landing the + flat case first lets nested be added without breaking anything. From ec3c0fbeb7f6d309987131e9261d1855c669bf4e Mon Sep 17 00:00:00 2001 From: Red Davies Date: Tue, 12 May 2026 00:29:11 -0400 Subject: [PATCH 2/3] Revise RFC per sync-call feedback - Rename TypeInfo primitive to ABIMemoryLayout to avoid builtin name collisions (jemc/adrianboyko consensus). - Replace offset_of[T](field: String) with offsets_of[T]() returning Array[(String, USize)] val (adrianboyko's proposal). Removes the literal-string special case from the compiler and unifies the four intrinsics under one validation shape; the single-field-lookup ergonomic is recovered with a small user-side helper. - Drop nested-path unresolved question and embed-specific testing language; no special handling of embedded fields, per sync call. - Reword the misleading "round-tripping" stride_of example. - Replace the generic-serializer example, which called a nonexistent field_value(name) method, with an honest layout-inspection example. - Document the rejected offset_of(field: String) design as an alternative. 
--- text/0000-type-layout-intrinsics.md | 166 +++++++++++++++++----------- 1 file changed, 102 insertions(+), 64 deletions(-) diff --git a/text/0000-type-layout-intrinsics.md b/text/0000-type-layout-intrinsics.md index 6298327b..f1e04e7b 100644 --- a/text/0000-type-layout-intrinsics.md +++ b/text/0000-type-layout-intrinsics.md @@ -7,7 +7,7 @@ Add a small, cohesive set of compiler intrinsics that expose the in-memory layout of types: store size, array stride, alignment, and field offsets. -These are exposed as methods on a new `TypeInfo` primitive and are direct +These are exposed as methods on a new `ABIMemoryLayout` primitive and are direct projections of information LLVM already computes during code generation. # Motivation @@ -42,16 +42,16 @@ build upon. # Detailed design -A new primitive `TypeInfo` is added to the standard library. It exposes +A new primitive `ABIMemoryLayout` is added to the standard library. It exposes four methods, all `compile_intrinsic`. Each takes the type of interest as a type parameter. ```pony -primitive TypeInfo - fun tag size_of[T](): USize => compile_intrinsic - fun tag stride_of[T](): USize => compile_intrinsic - fun tag align_of[T](): USize => compile_intrinsic - fun tag offset_of[T](field: String): USize => compile_intrinsic +primitive ABIMemoryLayout + fun tag size_of[T](): USize => compile_intrinsic + fun tag stride_of[T](): USize => compile_intrinsic + fun tag align_of[T](): USize => compile_intrinsic + fun tag offsets_of[T](): Array[(String, USize)] val => compile_intrinsic ``` Each intrinsic maps to a specific LLVM query. Where two LLVM functions @@ -80,8 +80,8 @@ call `size_of` on that: ```pony let bytes = match cell - | let s: String => TypeInfo.size_of[String]() - | let n: U64 => TypeInfo.size_of[U64]() + | let s: String => ABIMemoryLayout.size_of[String]() + | let n: U64 => ABIMemoryLayout.size_of[U64]() end ``` @@ -92,14 +92,14 @@ The same pattern applies to `stride_of` and `align_of`. 
Allocating a buffer to hold `n` packed values: ```pony -let buf = @pony_alloc(@pony_ctx(), (TypeInfo.size_of[Header]() * n)) +let buf = @pony_alloc(@pony_ctx(), (ABIMemoryLayout.size_of[Header]() * n)) ``` Sanity-checking that an FFI-returned buffer is large enough before casting: ```pony -if returned_bytes < TypeInfo.size_of[CSocketAddr]() then +if returned_bytes < ABIMemoryLayout.size_of[CSocketAddr]() then error end ``` @@ -108,7 +108,7 @@ Computing the position of a trailing variable-length field in a serial format: ```pony -let payload_offset = TypeInfo.size_of[FrameHeader]() +let payload_offset = ABIMemoryLayout.size_of[FrameHeader]() ``` ## `stride_of[T]()` @@ -129,19 +129,19 @@ Computing the address of element `i` in a manually managed contiguous buffer: ```pony -let p_i = base.offset(TypeInfo.stride_of[Entry]() * i) +let p_i = base.offset(ABIMemoryLayout.stride_of[Entry]() * i) ``` Sizing the backing allocation for an array-style data structure: ```pony -let bytes_needed = TypeInfo.stride_of[T]() * capacity +let bytes_needed = ABIMemoryLayout.stride_of[T]() * capacity ``` -Round-tripping through a C API that takes `(void*, element_size, count)`: +Passing element size to a C API that takes `(void*, element_size, count)`: ```pony -@write_array(buf, TypeInfo.stride_of[T](), n) +@write_array(buf, ABIMemoryLayout.stride_of[T](), n) ``` ## `align_of[T]()` @@ -155,7 +155,7 @@ Verifying that a pointer obtained from FFI is suitably aligned before casting it to a typed pointer: ```pony -if (raw_ptr.usize() % TypeInfo.align_of[Header]()) != 0 then +if (raw_ptr.usize() % ABIMemoryLayout.align_of[Header]()) != 0 then error // misaligned; reading would be UB on strict-alignment targets end ``` @@ -165,61 +165,89 @@ the alignment of the type being placed: ```pony fun ref _align_to[T](): USize => - let a = TypeInfo.align_of[T]() + let a = ABIMemoryLayout.align_of[T]() (_cursor + (a - 1)) and not (a - 1) ``` Confirming a buffer is suitable for SIMD or DMA, both of which 
often require stricter alignment than the natural type alignment. -## `offset_of[T](field: String)` +## `offsets_of[T]()` -The byte offset at which the named field begins within `T`'s layout. -The `field` argument must be a string literal known at compile time; -the compiler validates that `T` has a field by that name and rejects -the call otherwise. Equivalent in spirit to C's `offsetof()` and Rust's -`core::mem::offset_of!`. +Returns the complete field table of `T` as an immutable array of +`(name, byte_offset)` pairs, in declaration order. The compiler +synthesises this array per type at codegen time; callers iterate it +or look up a field by name. `T` must be a struct, class, actor, or tuple. Unions, interfaces, and -traits are compile-time errors. For tuples, the `field` argument is -the positional accessor name (`"_1"`, `"_2"`, …) — the same name used -to access the field in source code. For classes and actors, the -offset is measured from the start of the heap-allocated object — the -same anchor as `size_of` — so the type-descriptor pointer occupies -the first `USize`-sized word and user-declared fields begin after it. +traits are compile-time errors. For tuples, the names are the +positional accessor names (`"_1"`, `"_2"`, …) — the same names used +to access the fields in source code. For classes and actors, offsets +are measured from the start of the heap-allocated object — the same +anchor as `size_of` — so the type-descriptor pointer occupies the +first `USize`-sized word and user-declared fields begin after it. + +The returned `Array[(String, USize)] val` is fully immutable and safe +to share. Both the array itself and the names inside it are `val`. +The compiler is free to deduplicate the table across call sites for +the same `T`; callers must not rely on identity. 
**Example uses:** -Writing a generic serializer that reads each field of a C-compatible -struct without naming the fields one by one: +Inspecting the layout of a type — for documentation, debugging, or +validating against a hand-derived expected layout for a C struct +binding: ```pony -let off_x = TypeInfo.offset_of[Point]("x") -let off_y = TypeInfo.offset_of[Point]("y") -buf.write_at(off_x, p.x) -buf.write_at(off_y, p.y) +for (name, off) in ABIMemoryLayout.offsets_of[Point]().values() do + Debug(" " + name + " @ +" + off.string()) +end ``` +Iterating the table is the natural shape when the caller cares about +*every* field. Going further — copying every field into a buffer, say +— additionally needs per-field type information, which is out of scope +for this RFC (see "What these intrinsics do not do"). + +Looking up a single field by name (the typical `offsetof`-style use) +with a small helper: + +```pony +fun _offset_of[T](name: String): (USize | None) => + for (n, off) in ABIMemoryLayout.offsets_of[T]().values() do + if n == name then return off end + end + None +``` + +The helper returns `(USize | None)` rather than partial-erroring on a +missing name, so each caller chooses its own failure policy — error, +log, fall back to a default, or surface the absence in its own return +type. + Implementing an intrusive data structure (the Linux-kernel -`container_of` pattern) where a node embedded in a larger struct +`container_of` pattern), where a node embedded in a larger struct recovers the address of its container: ```pony -fun container_of[Outer, Inner](inner_ptr: Pointer[Inner]): Pointer[Outer] => - inner_ptr.offset(-TypeInfo.offset_of[Outer]("node").isize()) +fun container_of[Outer, Inner](inner_ptr: Pointer[Inner]): Pointer[Outer] ? => + let off = (_offset_of[Outer]("node") as USize).isize() + inner_ptr.offset(-off) ``` Crossing an FFI boundary where the C side hands back a pointer to a field, not the enclosing struct. -Why a string literal? 
Pony has no first-class field references and -no macro system, so a compile-time-validated string is the smallest -addition. The compiler is already touching field names during type -checking, so validating the literal against `T`'s field set is -cheap and gives a clear error message ("type `T` has no field -`foo`") at the call site. Nested paths (`"inner.field"`) are -deliberately not supported by this RFC; they can be added later -without breaking single-name calls. +Why an array of pairs rather than a per-field accessor taking a +string? Pony has no first-class field references and no macro +system. An accessor like `offset_of[T](field: String)` would need +the compiler to special-case "argument must be a string literal" so +typos could be rejected at compile time; that is a new category of +intrinsic-argument handling. Exposing the whole table once and +letting users look up by name keeps the compiler change small and +uniform with the other three intrinsics, at the cost of moving the +"unknown field name" error from compile time to whatever the caller's +lookup helper does at runtime. ## What these intrinsics do not do @@ -371,13 +399,12 @@ covering: - Numeric primitives (`U8`, `U32`, `F64`, `I128`) — exact known sizes. - Structs with mixed-alignment fields where `size_of` and `stride_of` must differ. -- Classes including ones with `embed`ded fields. -- `offset_of` for every field of a representative struct, asserting - that consecutive offsets are non-overlapping and respect alignment. -- `offset_of` with an invalid field name — must produce a compile - error, not a runtime failure. +- Classes and actors — `offsets_of` returns the correct number of + entries in declaration order, with offsets that are non-overlapping + and respect alignment. +- Tuples — `offsets_of` uses `"_1"`, `"_2"`, … as the names. - Negative tests: `size_of[SomeUnion]()`, `align_of[SomeInterface]()`, - `offset_of[SomeStruct]("nonexistent")` — all compile errors. 
+ `offsets_of[SomeUnion]()` — all compile errors. The LLVM functions these intrinsics wrap are exercised constantly by LLVM's own backend; we do not need to test that LLVM is correct, only @@ -390,10 +417,14 @@ that we are calling the right one. Standard CI coverage is sufficient. - The `size_of` / `stride_of` split is novel for C-trained users; picking the wrong one produces correct-looking code that breaks for types with trailing alignment padding. -- `offset_of` introduces compile-time string validation as a new - category of intrinsic argument handling; future intrinsics may want - the same machinery, which is a small but real generalization - pressure on the compiler. +- `offsets_of` returns a freshly-synthesised `Array val` per type + used, growing the read-only data section of the binary by a small + amount for each instantiated `T`. The cost scales with field count, + not call sites (the compiler is free to deduplicate). +- Field-name lookups via `offsets_of` move the "unknown field name" + error from compile time to runtime. Users who want compile-time + protection must build it themselves (e.g. unit tests that exercise + every lookup site). - Exposing layout questions invites users to write code that depends on platform- or version-specific layouts. The intrinsics are inherently target-dependent (a struct's size on a 32-bit ABI may @@ -416,12 +447,24 @@ the type chooses; making them virtual function calls would (a) impose a runtime cost where there is none today and (b) let types lie about their own size. The compiler is the source of truth. -**Put the methods on individual types instead of a `TypeInfo` +**Put the methods on individual types instead of an `ABIMemoryLayout` primitive** (e.g., `U32.size_of()`). Rejected because the method needs to work with type parameters, not just concrete types named at the call site, and because grouping all layout questions under one namespace makes them discoverable as a family. 
+**Expose `offset_of[T](field: String): USize` instead of `offsets_of`.** +The original draft of this RFC took the C/Rust shape directly: a +per-field accessor whose `field` argument is required to be a string +literal so the compiler can reject typos at compile time. Rejected +because "intrinsic argument must be a string literal" is a new +category of compile-time validation that doesn't exist for any other +intrinsic today, and the team prefers to keep the compiler change +small and uniform. The `offsets_of` shape recovers the +single-field-lookup ergonomics with a five-line helper (see the +`_offset_of` example above) at the cost of moving "unknown field +name" detection to runtime. + Not implementing this RFC leaves Pony users hand-computing layout numbers and copying them from C headers. The code keeps working until something gets reordered or the target ABI changes, at which point it @@ -429,9 +472,4 @@ silently breaks. # Unresolved questions -- **Should `offset_of` accept nested paths** like - `"outer_field.inner_field"` in this RFC, or punt to a follow-up? - The argument for now: the parsing is trivial and avoids a near-term - follow-up RFC. The argument for later: every additional feature in - the initial drop is more compiler surface to get right; landing the - flat case first lets nested be added without breaking anything. +None at this time. From 563f9e9d3fb87f74cf2cd0f35ca54087621ca2c9 Mon Sep 17 00:00:00 2001 From: Red Davies Date: Wed, 27 May 2026 00:46:09 -0400 Subject: [PATCH 3/3] Added example requested in Pony Sync. 
--- text/0000-type-layout-intrinsics.md | 332 +++++++--------------------- 1 file changed, 83 insertions(+), 249 deletions(-) diff --git a/text/0000-type-layout-intrinsics.md b/text/0000-type-layout-intrinsics.md index f1e04e7b..5ca383ca 100644 --- a/text/0000-type-layout-intrinsics.md +++ b/text/0000-type-layout-intrinsics.md @@ -5,46 +5,28 @@ # Summary -Add a small, cohesive set of compiler intrinsics that expose the in-memory -layout of types: store size, array stride, alignment, and field offsets. -These are exposed as methods on a new `ABIMemoryLayout` primitive and are direct -projections of information LLVM already computes during code generation. +Add a small, cohesive set of compiler intrinsics that expose the in-memory layout of types: store size, array stride, alignment, and field offsets. These are exposed as methods on a new `ABIMemoryLayout` primitive and are direct projections of information LLVM already computes during code generation. # Motivation -Pony users today have no first-class way to ask basic layout questions -about a type, oft required for C-FFI binding: +Pony users today have no first-class way to ask basic layout questions about a type, oft required for C-FFI binding: - "How many bytes do I need to allocate to hold one of these?" - "How many bytes between consecutive elements in an array of these?" - "What alignment does this type require?" - "Where does field `_x` start within this struct?" -These questions come up every time a Pony program touches raw memory: -allocating buffers via `@pony_alloc`, packing or unpacking C-ABI -structs across the FFI boundary, building custom serializers, implementing -allocators, computing addresses for pointer arithmetic, or interfacing -with hardware/protocol layouts that pin field positions. 
+These questions come up every time a Pony program touches raw memory: allocating buffers via `@pony_alloc`, packing or unpacking C-ABI structs across the FFI boundary, building custom serializers, implementing allocators, computing addresses for pointer arithmetic, or interfacing with hardware/protocol layouts that pin field positions. -Today, users hand-compute these numbers, hard-code them, or copy them out -of C headers. All three approaches drift silently when fields are -reordered, types are widened, or the target ABI changes. The information -is already known to the compiler — there is no good reason for users to -recompute it by hand. +Today, users hand-compute these numbers, hard-code them, or copy them out of C headers. All three approaches drift silently when fields are reordered, types are widened, or the target ABI changes. The information is already known to the compiler — there is no good reason for users to recompute it by hand. -There is also a related family of questions about deep memory accounting -("how much memory does this `Array[String]` actually consume, transitively?"). -Those are explicitly **out of scope** for this RFC; they will be proposed -separately as a separate interface. +There is also a related family of questions about deep memory accounting ("how much memory does this `Array[String]` actually consume, transitively?"). Those are explicitly **out of scope** for this RFC; they will be proposed separately, as their own interface. -Intrinsics in this RFC are the primitives that such a follow-up may choose -build upon. +Intrinsics in this RFC are the primitives that such a follow-up may choose to build upon. # Detailed design -A new primitive `ABIMemoryLayout` is added to the standard library. It exposes -four methods, all `compile_intrinsic`. Each takes the type of interest as -a type parameter. +A new primitive `ABIMemoryLayout` is added to the standard library. It exposes four methods, all `compile_intrinsic`.
Each takes the type of interest as a type parameter. ```pony primitive ABIMemoryLayout @@ -54,29 +36,15 @@ primitive ABIMemoryLayout fun tag offsets_of[T](): Array[(String, USize)] val => compile_intrinsic ``` -Each intrinsic maps to a specific LLVM query. Where two LLVM functions -both plausibly fit, the split below picks the one whose semantics match -the method's documented contract. +Each intrinsic maps to a specific LLVM query. Where two LLVM functions both plausibly fit, the split below picks the one whose semantics match the method's documented contract. ## `size_of[T]()` -The number of bytes occupied by the bits of a single value of type `T`, -not counting any padding required to place the next value in an array. -Equivalent to LLVM's `LLVMStoreSizeOfType`. +The number of bytes occupied by the bits of a single value of type `T`, not counting any padding required to place the next value in an array. Equivalent to LLVM's `LLVMStoreSizeOfType`. -`T` must be a concrete nominal type (a numeric primitive, a struct, a -class, or an actor). It may not be a union, intersection, interface, -trait, or type parameter that does not resolve to a concrete type at -the call site — those are compile-time errors. For class and actor -types, `size_of` returns the full size of the heap-allocated object: -the runtime type-descriptor pointer (one `USize`-sized word on the -current target — eight bytes on 64-bit) plus the type's fields. This -is the size the runtime allocates for one instance, and the value -needed for packing classes or actors into a C-compatible structure. +`T` must be a concrete nominal type (a numeric primitive, a struct, a class, or an actor). It may not be a union, intersection, interface, trait, or type parameter that does not resolve to a concrete type at the call site — those are compile-time errors. 
For class and actor types, `size_of` returns the full size of the heap-allocated object: the runtime type-descriptor pointer (one `USize`-sized word on the current target — eight bytes on 64-bit) plus the type's fields. This is the size the runtime allocates for one instance, and the value needed for packing classes or actors into a C-compatible structure. -To compute the size of a value whose static type is a union, match on -the value first to obtain a binding of the concrete member type, then -call `size_of` on that: +To compute the size of a value whose static type is a union, match on the value first to obtain a binding of the concrete member type, then call `size_of` on that: ```pony let bytes = match cell @@ -95,8 +63,7 @@ Allocating a buffer to hold `n` packed values: let buf = @pony_alloc(@pony_ctx(), (ABIMemoryLayout.size_of[Header]() * n)) ``` -Sanity-checking that an FFI-returned buffer is large enough before -casting: +Sanity-checking that an FFI-returned buffer is large enough before casting: ```pony if returned_bytes < ABIMemoryLayout.size_of[CSocketAddr]() then @@ -104,8 +71,7 @@ if returned_bytes < ABIMemoryLayout.size_of[CSocketAddr]() then end ``` -Computing the position of a trailing variable-length field in a serial -format: +Computing the position of a trailing variable-length field in a serial format: ```pony let payload_offset = ABIMemoryLayout.size_of[FrameHeader]() @@ -113,20 +79,13 @@ let payload_offset = ABIMemoryLayout.size_of[FrameHeader]() ## `stride_of[T]()` -The number of bytes between the start of one element and the start of -the next when `T` is laid out in an array. Equivalent to LLVM's -`LLVMABISizeOfType`. Always greater than or equal to `size_of[T]()`; -they differ only when `T`'s alignment requires trailing padding. +The number of bytes between the start of one element and the start of the next when `T` is laid out in an array. Equivalent to LLVM's `LLVMABISizeOfType`. 
Always greater than or equal to `size_of[T]()`; they differ only when `T`'s alignment requires trailing padding. -This is the value that should be used whenever pointer arithmetic -walks element-by-element. Calling `size_of` instead would silently miss -the inter-element padding and produce a wrong address for any type -whose store size is not a multiple of its alignment. +This is the value that should be used whenever pointer arithmetic walks element-by-element. Calling `size_of` instead would silently miss the inter-element padding and produce a wrong address for any type whose store size is not a multiple of its alignment. **Example uses:** -Computing the address of element `i` in a manually managed contiguous -buffer: +Computing the address of element `i` in a manually managed contiguous buffer: ```pony let p_i = base.offset(ABIMemoryLayout.stride_of[Entry]() * i) @@ -146,13 +105,11 @@ Passing element size to a C API that takes `(void*, element_size, count)`: ## `align_of[T]()` -The required alignment of `T` in bytes, as a power of two. Equivalent to -LLVM's `LLVMABIAlignmentOfType`. +The required alignment of `T` in bytes, as a power of two. Equivalent to LLVM's `LLVMABIAlignmentOfType`. 
**Example uses:** -Verifying that a pointer obtained from FFI is suitably aligned before -casting it to a typed pointer: +Verifying that a pointer obtained from FFI is suitably aligned before casting it to a typed pointer: ```pony if (raw_ptr.usize() % ABIMemoryLayout.align_of[Header]()) != 0 then @@ -160,8 +117,7 @@ if (raw_ptr.usize() % ABIMemoryLayout.align_of[Header]()) != 0 then end ``` -Implementing an arena/bump allocator that must round its cursor up to -the alignment of the type being placed: +Implementing an arena/bump allocator that must round its cursor up to the alignment of the type being placed: ```pony fun ref _align_to[T](): USize => @@ -169,34 +125,19 @@ fun ref _align_to[T](): USize => (_cursor + (a - 1)) and not (a - 1) ``` -Confirming a buffer is suitable for SIMD or DMA, both of which often -require stricter alignment than the natural type alignment. +Confirming a buffer is suitable for SIMD or DMA, both of which often require stricter alignment than the natural type alignment. ## `offsets_of[T]()` -Returns the complete field table of `T` as an immutable array of -`(name, byte_offset)` pairs, in declaration order. The compiler -synthesises this array per type at codegen time; callers iterate it -or look up a field by name. +Returns the complete field table of `T` as an immutable array of `(name, byte_offset)` pairs, in declaration order. The compiler synthesises this array per type at codegen time; callers iterate it or look up a field by name. -`T` must be a struct, class, actor, or tuple. Unions, interfaces, and -traits are compile-time errors. For tuples, the names are the -positional accessor names (`"_1"`, `"_2"`, …) — the same names used -to access the fields in source code. For classes and actors, offsets -are measured from the start of the heap-allocated object — the same -anchor as `size_of` — so the type-descriptor pointer occupies the -first `USize`-sized word and user-declared fields begin after it. 
+`T` must be a struct, class, actor, or tuple. Unions, interfaces, and traits are compile-time errors. For tuples, the names are the positional accessor names (`"_1"`, `"_2"`, …) — the same names used to access the fields in source code. For classes and actors, offsets are measured from the start of the heap-allocated object — the same anchor as `size_of` — so the type-descriptor pointer occupies the first `USize`-sized word and user-declared fields begin after it. -The returned `Array[(String, USize)] val` is fully immutable and safe -to share. Both the array itself and the names inside it are `val`. -The compiler is free to deduplicate the table across call sites for -the same `T`; callers must not rely on identity. +The returned `Array[(String, USize)] val` is fully immutable and safe to share. Both the array itself and the names inside it are `val`. The compiler is free to deduplicate the table across call sites for the same `T`; callers must not rely on identity. **Example uses:** -Inspecting the layout of a type — for documentation, debugging, or -validating against a hand-derived expected layout for a C struct -binding: +Inspecting the layout of a type — for documentation, debugging, or validating against a hand-derived expected layout for a C struct binding: ```pony for (name, off) in ABIMemoryLayout.offsets_of[Point]().values() do @@ -204,13 +145,9 @@ for (name, off) in ABIMemoryLayout.offsets_of[Point]().values() do end ``` -Iterating the table is the natural shape when the caller cares about -*every* field. Going further — copying every field into a buffer, say -— additionally needs per-field type information, which is out of scope -for this RFC (see "What these intrinsics do not do"). +Iterating the table is the natural shape when the caller cares about *every* field. Going further (for example, copying every field into a buffer automatically) additionally needs per-field type information, which is out of scope for this RFC (see "What these intrinsics do not do").
-Looking up a single field by name (the typical `offsetof`-style use) -with a small helper: +Looking up a single field by name (the typical `offsetof`-style use) with a small helper: ```pony fun _offset_of[T](name: String): (USize | None) => @@ -220,80 +157,52 @@ fun _offset_of[T](name: String): (USize | None) => None ``` -The helper returns `(USize | None)` rather than partial-erroring on a -missing name, so each caller chooses its own failure policy — error, -log, fall back to a default, or surface the absence in its own return -type. +The helper returns `(USize | None)` rather than partial-erroring on a missing name, so each caller chooses its own failure policy — error, log, fall back to a default, or surface the absence in its own return type. -Implementing an intrusive data structure (the Linux-kernel -`container_of` pattern), where a node embedded in a larger struct -recovers the address of its container: +Why an array of pairs rather than a per-field accessor taking a string? Pony has no first-class field references and no macro system. An accessor like `offset_of[T](field: String)` would need the compiler to special-case "argument must be a string literal" so typos could be rejected at compile time; that is a new category of intrinsic-argument handling. Exposing the whole table once and letting users look up by name keeps the compiler change small and uniform with the other three intrinsics, at the cost of moving the "unknown field name" error from compile time to whatever the caller's lookup helper does at runtime. + +Copying each field of a struct into the matching position of an `Array[U8]`, so the buffer ends up with the same byte image the struct itself has in memory. `_offset_of` says where each field belongs in the buffer; `size_of` of the field type says how many bytes to copy: ```pony -fun container_of[Outer, Inner](inner_ptr: Pointer[Inner]): Pointer[Outer] ? 
=> - let off = (_offset_of[Outer]("node") as USize).isize() - inner_ptr.offset(-off) +struct Header + var magic: U16 = 0 + var version: U8 = 0 + var length: U32 = 0 + +fun pack(a: Array[U8] ref, index: USize, h: Header) ? => + let base = ABIMemoryLayout.stride_of[Header]() * index + a.update_u16(base + (_offset_of[Header]("magic") as USize), h.magic)? + a.update_u8(base + (_offset_of[Header]("version") as USize), h.version)? + a.update_u32(base + (_offset_of[Header]("length") as USize), h.length)? ``` -Crossing an FFI boundary where the C side hands back a pointer to a -field, not the enclosing struct. - -Why an array of pairs rather than a per-field accessor taking a -string? Pony has no first-class field references and no macro -system. An accessor like `offset_of[T](field: String)` would need -the compiler to special-case "argument must be a string literal" so -typos could be rejected at compile time; that is a new category of -intrinsic-argument handling. Exposing the whole table once and -letting users look up by name keeps the compiler change small and -uniform with the other three intrinsics, at the cost of moving the -"unknown field name" error from compile time to whatever the caller's -lookup helper does at runtime. +Interestingly, if a `\packed\` annotation is on the struct, the same recipe can be used to encode and decode data off the wire in network protocols. ## What these intrinsics do not do -This RFC deliberately stops at the layout questions LLVM can answer -about a single type or value. It does not propose: +This RFC deliberately stops at the layout questions LLVM can answer about a single type or value. It does not propose: -- **Deep memory accounting.** "How many bytes does this `Array[String]` - transitively own?" needs traversal, sharing/cycle policy, and per-type - semantics. A separate RFC will propose a `MemoryFootprint` interface - that types opt into, implemented in terms of these intrinsics. 
+- **Deep memory accounting.** "How many bytes does this `Array[String]` transitively own?" needs traversal, sharing/cycle policy, and per-type semantics. A separate RFC will propose a `MemoryFootprint` interface that types opt into, implemented in terms of these intrinsics. - **Actor heap introspection.** "How much has *this actor* allocated?" - **Type names, descriptors, or other reflection.** Out of scope. ## Implementation -PR ponyc#5267 is a WIP for an example implementation for the purposes -of exploring this design-space. Changes should only exist in the -compiler and stdlib. No changes required to the runtime (libponyrt). +PR ponyc#5267 is a work-in-progress example implementation for exploring this design space. Changes should be confined to the compiler and stdlib; no changes are required to the runtime (libponyrt). # How We Teach This -These intrinsics live alongside `Pointer.alloc` and the FFI machinery -in the user's mental model. They should be introduced in the section -of the language tutorial that already covers raw memory and FFI, with -worked examples for each of the use cases above. The -`size_of` / `stride_of` distinction will be the most novel concept -for users coming from C (where `sizeof` conflates the two), and is -explained below. +These intrinsics live alongside `Pointer.alloc` and the FFI machinery in the user's mental model. They should be introduced in the section of the language tutorial that already covers raw memory and FFI, with worked examples for each of the use cases above. The `size_of` / `stride_of` distinction will be the most novel concept for users coming from C (where `sizeof` conflates the two), and is explained below. ## The size / stride / alignment relationship The mental model: -- **`align_of[T]`** is the rule. Every value of `T` must live at an - address divisible by `align_of[T]`. -- **`size_of[T]`** is how much one value of `T` occupies under that - rule.
-- **`stride_of[T]`** is how far apart the rule forces consecutive - elements to be in an array of `T`. +- **`align_of[T]`** is the rule. Every value of `T` must live at an address divisible by `align_of[T]`. +- **`size_of[T]`** is how much one value of `T` occupies under that rule. +- **`stride_of[T]`** is how far apart the rule forces consecutive elements to be in an array of `T`. -Stride must always be a multiple of alignment, so that every element's -start is aligned. The minimum legal stride is `size_of` rounded up to -the next multiple of `align_of`. For most types, size and stride are -equal — they diverge only when a type's content size isn't already a -multiple of its alignment. When they diverge, the gap is **trailing -padding**. +Stride must always be a multiple of alignment, so that every element's start is aligned. The minimum legal stride is `size_of` rounded up to the next multiple of `align_of`. For most types, size and stride are equal — they diverge only when a type's content size isn't already a multiple of its alignment. When they diverge, the gap is **trailing padding**. ### Example: `(U64, U8)` — size 9, align 8, stride 16 @@ -303,23 +212,19 @@ Layout of a single value (9 bytes): |-----------------|------| | 0 1 2 3 4 5 6 7 | 8 | -In an array, element 1 must start at a multiple of 8. Byte 9 isn't, -nor are 10..15. Byte 16 is. So elements are 16 bytes apart, and each -element's 16-byte slot has 7 bytes of trailing padding: +In an array, element 1 must start at a multiple of 8. Byte 9 isn't, nor are 10..15. Byte 16 is. So elements are 16 bytes apart, and each element's 16-byte slot has 7 bytes of trailing padding: | `U64` | `U8` | Padding | |-----------------|------|----------------------| | 0 1 2 3 4 5 6 7 | 8 | 9 10 11 12 13 14 15 | -Subsequent elements occupy slots starting at byte 16, byte 32, byte -48, and so on, each laid out the same way relative to its slot start. 
+Subsequent elements occupy slots starting at byte 16, byte 32, byte 48, and so on, each laid out the same way relative to its slot start. - `size_of` = 9 (a single value occupies 9 bytes) - `stride_of` = 16 (the next value starts 16 bytes later) - `align_of` = 8 (max alignment of any field — the `U64` needs 8-aligned) -The 7-byte trailing pad is what keeps each subsequent element's `U64` -8-aligned. +The 7-byte trailing pad is what keeps each subsequent element's `U64` 8-aligned. ### Example: `(U32, U8)` vs `(U8, U32)` — same fields, different layouts @@ -335,140 +240,69 @@ The 7-byte trailing pad is what keeps each subsequent element's `U64` |------|---------|---------| | 0 | 1 2 3 | 4 5 6 7 | -Same fields, different order, same stride. `size_of` differs because -internal padding (between fields) counts toward size but trailing -padding does not. The visible consequence: a serializer that writes -`size_of[T]()` bytes per record writes 5 bytes for `(U32, U8)` but -8 bytes for `(U8, U32)`. +Same fields, different order, same stride. `size_of` differs because internal padding (between fields) counts toward size but trailing padding does not. The visible consequence: a serializer that writes `size_of[T]()` bytes per record writes 5 bytes for `(U32, U8)` but 8 bytes for `(U8, U32)`. ### Working out the layout by hand -Given a struct or tuple, you can derive size, stride, and alignment by -reading the fields left-to-right: +Given a struct or tuple, you can derive size, stride, and alignment by reading the fields left-to-right: 1. Start a cursor at byte 0. -2. For each field: advance the cursor to the next address divisible - by that field's alignment (the gap is internal padding), then add - the field's size. +2. For each field: advance the cursor to the next address divisible by that field's alignment (the gap is internal padding), then add the field's size. 3. After the last field, the cursor's position is `size_of`. -4. 
The whole type's `align_of` is the maximum alignment among its - fields. -5. Round `size_of` up to a multiple of `align_of` — that's - `stride_of`. +4. The whole type's `align_of` is the maximum alignment among its fields. +5. Round `size_of` up to a multiple of `align_of` — that's `stride_of`. Worked example for `(U16, U64, U8)`: -1. Cursor at 0. `U16` (align 2) is already aligned. It occupies - bytes 0..1. Cursor advances to 2. -2. `U64` (align 8): pad to byte 8 (6 bytes of internal padding). - It occupies 8..15. Cursor advances to 16. -3. `U8` (align 1) is already aligned. It occupies byte 16. Cursor - advances to 17. +1. Cursor at 0. `U16` (align 2) is already aligned. It occupies bytes 0..1. Cursor advances to 2. +2. `U64` (align 8): pad to byte 8 (6 bytes of internal padding). It occupies 8..15. Cursor advances to 16. +3. `U8` (align 1) is already aligned. It occupies byte 16. Cursor advances to 17. 4. `align_of` = max(2, 8, 1) = 8. 5. Round 17 up to a multiple of 8: 24. -So `size_of[(U16, U64, U8)]() == 17`, -`align_of[(U16, U64, U8)]() == 8`, and -`stride_of[(U16, U64, U8)]() == 24`. +So `size_of[(U16, U64, U8)]() == 17`, `align_of[(U16, U64, U8)]() == 8`, and `stride_of[(U16, U64, U8)]() == 24`. ### The footgun -For every primitive type in Pony (`U8`, `U16`, …, `U64`, `F32`, -`F64`), `size_of == stride_of`. Generic code that uses `size_of` -where it should use `stride_of` passes tests written against -primitives and silently misbehaves when instantiated with a tuple or -struct whose alignment forces trailing padding. +For every primitive type in Pony (`U8`, `U16`, …, `U64`, `F32`, `F64`), `size_of == stride_of`. Generic code that uses `size_of` where it should use `stride_of` passes tests written against primitives and silently misbehaves when instantiated with a tuple or struct whose alignment forces trailing padding. 
-The rule of thumb: reach for `stride_of` whenever you are computing -positions in an array; reach for `size_of` only when you are packing -bytes into a format that has no inter-element padding (a wire format, -a tightly-packed file header, a record stream). +The rule of thumb: reach for `stride_of` whenever you are computing positions in an array; reach for `size_of` only when you are packing bytes into a format that has no inter-element padding (a wire format, a tightly-packed file header, a record stream). ## Naming precedent -The naming follows established precedent: Swift's `MemoryLayout.size` -/ `.stride` / `.alignment`, and Rust's `mem::size_of` / -`mem::align_of` / `mem::offset_of!`. Pony users coming from either -language should find the names self-explanatory. +The naming follows established precedent: Swift's `MemoryLayout.size` / `.stride` / `.alignment`, and Rust's `mem::size_of` / `mem::align_of` / `mem::offset_of!`. Pony users coming from either language should find the names self-explanatory. # How We Test This -Each intrinsic gets unit tests in the standard-library test suite -covering: +Each intrinsic gets unit tests in the standard-library test suite covering: - Numeric primitives (`U8`, `U32`, `F64`, `I128`) — exact known sizes. -- Structs with mixed-alignment fields where `size_of` and `stride_of` - must differ. -- Classes and actors — `offsets_of` returns the correct number of - entries in declaration order, with offsets that are non-overlapping - and respect alignment. +- Structs with mixed-alignment fields where `size_of` and `stride_of` must differ. +- Classes and actors — `offsets_of` returns the correct number of entries in declaration order, with offsets that are non-overlapping and respect alignment. - Tuples — `offsets_of` uses `"_1"`, `"_2"`, … as the names. -- Negative tests: `size_of[SomeUnion]()`, `align_of[SomeInterface]()`, - `offsets_of[SomeUnion]()` — all compile errors. 
+- Negative tests: `size_of[SomeUnion]()`, `align_of[SomeInterface]()`, `offsets_of[SomeUnion]()` — all compile errors. -The LLVM functions these intrinsics wrap are exercised constantly by -LLVM's own backend; we do not need to test that LLVM is correct, only -that we are calling the right one. Standard CI coverage is sufficient. +The LLVM functions these intrinsics wrap are exercised constantly by LLVM's own backend; we do not need to test that LLVM is correct, only that we are calling the right one. Standard CI coverage is sufficient. # Drawbacks -- Adds four new compiler intrinsics, growing the intrinsic surface - area the compiler must keep working as it evolves. -- The `size_of` / `stride_of` split is novel for C-trained users; - picking the wrong one produces correct-looking code that breaks for - types with trailing alignment padding. -- `offsets_of` returns a freshly-synthesised `Array val` per type - used, growing the read-only data section of the binary by a small - amount for each instantiated `T`. The cost scales with field count, - not call sites (the compiler is free to deduplicate). -- Field-name lookups via `offsets_of` move the "unknown field name" - error from compile time to runtime. Users who want compile-time - protection must build it themselves (e.g. unit tests that exercise - every lookup site). -- Exposing layout questions invites users to write code that depends - on platform- or version-specific layouts. The intrinsics are - inherently target-dependent (a struct's size on a 32-bit ABI may - differ from a 64-bit ABI); users must understand they are getting - the answer for the current target. +- Adds four new compiler intrinsics, growing the intrinsic surface area the compiler must keep working as it evolves. +- The `size_of` / `stride_of` split is novel for C-trained users; picking the wrong one produces correct-looking code that breaks for types with trailing alignment padding. 
+- `offsets_of` returns a freshly-synthesised `Array val` per type used, growing the read-only data section of the binary by a small amount for each instantiated `T`. The cost scales with field count, not call sites (the compiler is free to deduplicate). +- Field-name lookups via `offsets_of` move the "unknown field name" error from compile time to runtime. Users who want compile-time protection must build it themselves (e.g. unit tests that exercise every lookup site). +- Exposing layout questions invites users to write code that depends on platform- or version-specific layouts. The intrinsics are inherently target-dependent (a struct's size on a 32-bit ABI may differ from a 64-bit ABI); users must understand they are getting the answer for the current target. # Alternatives -**Expose only `size_of` and `align_of`; let users compute stride -themselves.** Stride is "round size up to alignment", which users could -compute. Rejected because the rounding rule is exactly the kind of -small, easy-to-get-subtly-wrong arithmetic that should live behind a -named primitive — and because "stride" is itself a load-bearing concept -in any code that walks an array. - -**Use traits/interfaces instead of intrinsics.** A `Sized` trait that -each type implements would avoid touching the compiler. Rejected -because the values are properties of the type's layout, not behavior -the type chooses; making them virtual function calls would (a) impose a -runtime cost where there is none today and (b) let types lie about -their own size. The compiler is the source of truth. - -**Put the methods on individual types instead of an `ABIMemoryLayout` -primitive** (e.g., `U32.size_of()`). Rejected because the method needs -to work with type parameters, not just concrete types named at the -call site, and because grouping all layout questions under one -namespace makes them discoverable as a family. 
- -**Expose `offset_of[T](field: String): USize` instead of `offsets_of`.** -The original draft of this RFC took the C/Rust shape directly: a -per-field accessor whose `field` argument is required to be a string -literal so the compiler can reject typos at compile time. Rejected -because "intrinsic argument must be a string literal" is a new -category of compile-time validation that doesn't exist for any other -intrinsic today, and the team prefers to keep the compiler change -small and uniform. The `offsets_of` shape recovers the -single-field-lookup ergonomics with a five-line helper (see the -`_offset_of` example above) at the cost of moving "unknown field -name" detection to runtime. - -Not implementing this RFC leaves Pony users hand-computing layout -numbers and copying them from C headers. The code keeps working until -something gets reordered or the target ABI changes, at which point it -silently breaks. +**Expose only `size_of` and `align_of`; let users compute stride themselves.** Stride is "round size up to alignment", which users could compute. Rejected because the rounding rule is exactly the kind of small, easy-to-get-subtly-wrong arithmetic that should live behind a named primitive — and because "stride" is itself a load-bearing concept in any code that walks an array. + +**Use traits/interfaces instead of intrinsics.** A `Sized` trait that each type implements would avoid touching the compiler. Rejected because the values are properties of the type's layout, not behavior the type chooses; making them virtual function calls would (a) impose a runtime cost where there is none today and (b) let types lie about their own size. The compiler is the source of truth. + +**Put the methods on individual types instead of an `ABIMemoryLayout` primitive** (e.g., `U32.size_of()`). 
Rejected because the method needs to work with type parameters, not just concrete types named at the call site, and because grouping all layout questions under one namespace makes them discoverable as a family. + +**Expose `offset_of[T](field: String): USize` instead of `offsets_of`.** The original draft of this RFC took the C/Rust shape directly: a per-field accessor whose `field` argument is required to be a string literal so the compiler can reject typos at compile time. Rejected because "intrinsic argument must be a string literal" is a new category of compile-time validation that doesn't exist for any other intrinsic today, and the team prefers to keep the compiler change small and uniform. The `offsets_of` shape recovers the single-field-lookup ergonomics with a five-line helper (see the `_offset_of` example above) at the cost of moving "unknown field name" detection to runtime. + +Not implementing this RFC leaves Pony users hand-computing layout numbers and copying them from C headers. The code keeps working until something gets reordered or the target ABI changes, at which point it silently breaks. # Unresolved questions