diff --git a/CHANGELOG.md b/CHANGELOG.md index 01961658d..94c367c5d 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,6 +2,147 @@ Detailed changelog for Perry. See CLAUDE.md for concise summaries. +## v0.5.36 — Buffer-typed param `src[i]` reads/writes bytes (closes #42) +- **fix**: `function f(src: Buffer) { return src[0]; }` returned a tiny denormal f64 like `7.9e-308` — the NaN-boxed pointer bits of `src` misread as the raw element value. Top-level `buf[0]` worked because `Buffer.alloc(n)` is refined to `Type::Named("Uint8Array")` in `lower_types.rs`, which the computed-member lowering special-cases into the byte-indexed `Uint8ArrayGet` path. But an explicitly-declared `Buffer` parameter lands in `ctx.locals` with `Type::Named("Buffer")`, and the two call sites in `crates/perry-hir/src/lower.rs` (IndexGet at ~9055, IndexSet at ~9345) only matched `"Uint8Array"` — so `src[i]` fell through to the generic f64-element `IndexGet`, and `src[i] = v` fell through to `IndexSet` that zero-filled past the buffer header boundary. +- Both sites now accept `n == "Uint8Array" || n == "Buffer"`. `Buffer` is Node's subclass of `Uint8Array` — identical memory layout in Perry's runtime — so the dispatch change is semantically safe. Verified against the #42 repro (25 MB buffer pass-through with `Buffer.alloc(n)` in callee): `src[0]=0`, `dst[0]=0`, `out[0]=0`, no corruption. +- The user's GC hypothesis in the report was a red herring: the bug reproduces with zero allocations inside the callee (see minimal repro `function f(src: Buffer) { return src[0]; }`). The "silent exit before `fill done`" in the original repro was the `for (let i = 0; i < n; i++) dst[i] = src[i]` loop writing past the buffer bounds via the generic `IndexSet` path, which treats `dst` as a plain array with `length=0` (header misread) and either no-ops or corrupts adjacent memory. + +## v0.5.35 — `process.argv.slice(N)` returns a real array (closes #41) +- **fix**: `process.argv.slice(2)` came back as a "string" whose `length` was the full argv count and whose element reads returned small denormal doubles — the NaN-box bit patterns of the string pointers being interpreted as `f64`. The HIR `.slice()` lowering at `crates/perry-hir/src/lower.rs:7849` routes to `Expr::ArraySlice` only when the receiver matches a hard-coded allow-list of array-producing `Expr` variants (needed because `.slice()` exists on both Array and String — without the allow-list a String.slice call could get misrouted). `Expr::ProcessArgv` wasn't in the list, so `process.argv.slice(2)` fell through to the generic call path which treated the receiver as a string and called `js_string_slice` on the `ArrayHeader*` pointer. Result: the `ArrayHeader.length` read as `StringHeader.byte_len`, and per-element reads returned the NaN-box bits through the string-char path. +- Added `Expr::ProcessArgv` to the allow-list. `process.argv.slice(N)` now lowers to `ArraySlice { array: ProcessArgv, start, end }` which the codegen dispatches through `js_array_slice`, producing a proper `ArrayHeader*` with the sliced string pointers. Verified against the #41 repro — `./bin one two three` now prints `type=object length=3` with `rest[0]=one`, `rest[1]=two`, `rest[2]=three`, all `typeof=string`. Matches Node exactly. + +## v0.5.34 — `Math.imul` lowering (closes #40) +- **fix**: `Math.imul(a, b)` reached HIR as `Expr::MathImul` (`crates/perry-hir/src/lower.rs` matches the builtin and constructs the node), but the LLVM codegen in `crates/perry-codegen/src/expr.rs` had no match arm — every call fell through to the catch-all which errored with `Phase 2: expression MathImul not yet supported`. The JS / WASM / Glance emitters all had arms; the LLVM backend was the only gap. +- Emit `fptosi DOUBLE→I64, trunc I64→I32` on both operands (this is the JS `ToInt32` sequence: wrap-to-i64, then take the low 32 bits — matching spec behavior for every finite double, which is the only value any real hash/PRNG passes to imul), `mul i32` (LLVM defaults to wrapping without `nsw`/`nuw`), `sitofp I32→DOUBLE`. No runtime helper needed — this is a 5-instruction inline sequence. +- Result: `Math.imul(0x01000193, 0x811c9dc5)` → `84696351` (matches Node — the issue's illustrative `-2110866647` was for a different argument pair). `Math.imul(-1, -1)` → `1`, `Math.imul(0xffffffff, 5)` → `-5`, `Math.imul(3, 0x7fffffff)` → `2147483645` — all match Node. Unblocks FNV-1a-32, MurmurHash3, xxhash32, CRC32, PCG, xorshift* in user TS without the 16-bit hi/lo workaround. +- NaN/Inf inputs technically coerce through ToInt32 → 0 in spec JS, but `fptosi` saturates on those — not worth a compare-and-select gate per call since no real hash/PRNG feeds NaN or Infinity to imul. + +## v0.5.33 — JSON.stringify/parse on large arrays (closes #43, #44) +- **fix** (GC): arena block reset is all-or-nothing — an arena object sharing a block with a root-reachable object persists in memory whether or not the object itself is reachable. The existing trace only marked root-reachable arena objects, so malloc-allocated string fields referenced only through a NOT-reachable arena object got swept while the arena object's memory lingered, leaving dangling pointers. The hole was hit by tight `arr.push({name:'…',email:'…',…})` loops: between the new object's arena allocation and the write into `arr`, the object's only root was a caller-saved register — which `setjmp` does not capture — so conservative stack scanning didn't see it. GC then swept the `name`/`email` strings, and a subsequent `JSON.stringify(arr)` read freed memory and panicked at `json.rs:427 push_str(&s[start..i])` on a non-UTF-8 boundary (issue #43), or `JSON.parse` + iteration read stale `.active` fields and silently dropped records (issue #44). +- Added `mark_block_persisting_arena_objects` in `crates/perry-runtime/src/gc.rs` — after the primary mark/trace from roots, compute which arena blocks have any reachable object, then mark every remaining arena object in those blocks and trace its children. Iterates to a fixed point because marking may extend liveness into previously-dead blocks. Refactored the worklist-drain half of `trace_marked_objects` into `drain_trace_worklist` so the new phase reuses the same traversal. +- **fix**: `trace_array` refused to trace arrays with `length > 65_536`, and `trace_object` refused `field_count > 65_536`. Both were intended as corruption guards but collided with realistic workloads — issue #44's `parts: string[]` builder grew to 100k entries, so every string it held was swept on the first GC. Raised limits to 16M / 1M respectively (still well below any possible corrupted value). +- **fix**: `JSON.stringify` dispatch in `crates/perry-runtime/src/json.rs` used a `cap < 10000` heuristic to distinguish arrays from strings when the `type_hint` was unknown. `JSON.stringify(arr)` where `arr.capacity >= 10000` (i.e. past the 8192 → 16384 growth step) fell through to the string path, reinterpreted the `ArrayHeader` as a `StringHeader` (array.length/capacity aliased onto utf16_len/byte_len), and called `str::from_utf8_unchecked` on the raw NaN-boxed pointer storage. New `gc_obj_type(ptr)` reads the `GcHeader.obj_type` tag 8 bytes before the user pointer and dispatches on it (GC_TYPE_ARRAY → `stringify_array`, GC_TYPE_OBJECT → `stringify_object`, GC_TYPE_STRING → `write_escaped_string`), falling back to the old heuristic only for pointers that aren't GC-tagged. Applied in both `stringify_value` (top-level) and `stringify_array` (per-element). +- Verified byte-for-byte match with Node on the issue #43 repro (30k records → 4,155,561 bytes) and issue #44 repro (50k records, parse+iterate → 50000 active matches). Gap test suite (23 tests) unchanged — pre/post-fix diffs are identical. + +## v0.5.32 — BigInt bitwise ops (closes #39) +- **fix**: `Expr::Binary` bigint dispatch in `crates/perry-codegen/src/expr.rs` only covered arithmetic (`Add`/`Sub`/`Mul`/`Div`/`Mod`). `BitAnd`/`BitOr`/`BitXor`/`Shl`/`Shr` on bigint operands fell through to the default numeric path that does `fptosi(f64→i64) → trunc(i32) → and/or/xor/shl/ashr → sitofp`. NaN-boxed bigints are stored with `BIGINT_TAG` (0x7FFA) — the bits form a NaN-payload f64; `fptosi` on a NaN produces 0 in the i64-truncation path (arch-dependent on ARM the result is undefined but commonly 0). Net effect: `0xCBF29CE484222325n ^ 5n` returned `-6` instead of `0xcbf29ce484222320`, `x & 0xFFFFFFFFFFFFFFFFn` returned `0`, and any FNV-1a / MurmurHash / xxhash-64 implementation in user TS was unusable. +- The runtime already had `js_dynamic_bitand` / `js_dynamic_bitor` / `js_dynamic_bitxor` / `js_dynamic_shl` / `js_dynamic_shr` (in `crates/perry-runtime/src/value.rs`) — they unbox to `BigIntHeader*`, call the raw `js_bigint_`, and re-box with BIGINT_TAG. Fall-through preserves i32 ToInt32 semantics for the pure-number case. All that was missing was the codegen dispatch. +- Extended the bigint-dispatch `match` in `Expr::Binary` to emit `js_dynamic_bitand/bitor/bitxor/shl/shr` when either operand is statically bigint-typed. Declared the helpers in `runtime_decls.rs`. Also extended `is_bigint_expr` in `type_analysis.rs` to recognize nested bigint bitwise ops so `(a * prime) & mask64` — where the LHS of `&` is a bigint `Binary` — stays bigint-typed up the expression tree; without that the outer `&` saw the inner `Binary` as non-bigint and fell back to i32. +- `UShr` (`>>>`) is deliberately not dispatched: it's a `TypeError` on bigints in spec JS, so the existing i32 path is fine (user code that tries it will get garbage but Node throws — out of scope). +- Repro from #39 after fix: `a ^ 5n` → `cbf29ce484222320` (matches Node / Python). `(a * 0x100000001B3n) & mask64` → `af63bd4c8601b7df` (matches Node / Python — the issue report's expected value `bf9a804f79c4bcb7` was a transcription error). Shifts and plain AND/OR also verified against Node. + +## v0.5.31 — `new Uint8Array(n)` with non-literal `n` (closes #38) +- **fix**: `new Uint8Array(n)` where `n` is a variable or computed expression silently produced a zero-length buffer. The codegen in `crates/perry-codegen/src/expr.rs` (`Expr::Uint8ArrayNew`) had fast paths for `Expr::Integer(n)` / `Expr::Number(n)` literal forms that called `js_buffer_alloc`, but the catch-all `Some(e) => …` arm unconditionally dispatched to `js_uint8array_from_array(arr_handle)`. When `e` lowered to a plain numeric f64 (100.0), `unbox_to_i64` AND'd the NaN-box bits with `POINTER_MASK_I64` (0x0000_FFFF_FFFF_FFFF) — stripping the upper 16 bits of a finite double gives an effectively null pointer — and `js_uint8array_from_array` walked that pointer, read `length = 0` from the fake "ArrayHeader", and returned a zero-length buffer. Any idiomatic code path using `new Uint8Array(n)` with a computed length (buffer sizing from a protocol header, length from a config field, image dimensions, etc.) got a silent no-op. +- New runtime entry point `js_uint8array_new(val: f64) -> *mut BufferHeader` in `crates/perry-runtime/src/buffer.rs` inspects the NaN-box tag of the incoming value: POINTER_TAG (0x7FFD) routes to `js_uint8array_from_array`, a plain IEEE double routes to `js_uint8array_alloc(val as i32)`, anything else (undefined/null/bool/string/bigint) returns an empty buffer to match JS semantics for `new Uint8Array(undefined)` etc. +- Codegen catch-all now calls `js_uint8array_new(DOUBLE)` with the raw NaN-boxed value instead of `js_uint8array_from_array(I64)` with an unboxed "pointer". Literal-`Expr::Integer` / `Expr::Number` forms still short-circuit to `js_buffer_alloc` at compile time — no regression on the common `new Uint8Array(16)` pattern. +- Repro (from #38): `function make(n: number) { const u = new Uint8Array(n); console.log('n=' + n + ' length=' + u.length); } make(100); make(1024);`. Before: `n=100 length=0`, `n=1024 length=0`. After: `n=100 length=100`, `n=1024 length=1024`. + +## v0.5.29 — row-object allocation perf (-14% on @perry/postgres bulk decode) +- **perf**: `js_object_set_field_by_name` was cloning the keys_array on every property add beyond the first on any plain object literal (`{}` + `obj[k] = v`). The clone guard `key_count == field_count` fired even for arrays allocated locally in the null-keys branch because `field_count` is bumped in lockstep with each add. For a 20-property row object built at 10k rows (@perry/postgres bulk decode) that's ~190k throwaway keys_array clones of growing size per iteration — 15 MB of memory churn per bench iteration, all wasted. +- Added `GC_FLAG_SHAPE_SHARED` (`0x08`) — `shape_cache_insert` stamps it on the keys_array before caching; `js_object_set_field_by_name` reads it to decide whether to clone. Arrays allocated in the `keys.is_null()` branch are exclusively owned and skip the clone entirely. Guarded behind a GC-header validity check so a non-GC-allocated keys_array (rare but possible via static data or buffer reinterpretation) still takes the safe clone path. +- Also deferred the `Rust String` allocation in `js_object_set_field_by_name` behind a new `PROPERTY_ATTRS_IN_USE` flag (mirrors the existing `ACCESSORS_IN_USE` guard). The to_string() was running on every call just to look up a descriptor that almost never exists — 200k wasted heap allocations per 10k-row bulk decode. Now it only runs when `Object.defineProperty` has ever installed a per-property attr on this thread. +- Added fast path in `js_bigint_from_string`: decimal inputs that fit `i64` skip the per-digit 16-limb multiply-add loop and call `s.parse::()` + `js_bigint_from_i64` directly. Postgres `int8` results, `Date.now()` timestamps, and ~every real-world `BigInt("…")` call land here. Falls through to the general path for hex, oversized, or malformed input. +- Measured (local PG 16, Perry-native, @perry/postgres bench/bench-this.ts, 50 iterations p50): 10k×20 rows 896ms → 774ms (-14%), 1k×20 rows 43ms → 42ms (no-op). Microbench (200k dynamic-key obj writes): 51ms → 40ms (-22%). Node is still ~20× faster on bulk decode — V8's hidden-class ICs don't have an analog in Perry's shape cache yet — but one more layer of per-call garbage is gone. + +## v0.5.28 — module globals registered as GC roots (closes #36) +- **fix**: module-level user `let`/`const` globals were LLVM `double` globals that held NaN-boxed JSValues but were NOT registered with the GC's root scanner. Only string-handle globals (from the string pool) got `js_gc_register_global_root(&@.str..handle)` at startup. The conservative stack scan could still find pointers held by stack variables, so the bug was latent until v0.5.25 made `gc_malloc` trigger GC during long-running decode loops — any program where a `Map` / `Array` / user-class instance lived only in `const X = new Map(...)` (no stack variable holding it at the moment of GC) would have `X` swept mid-cycle. The canonical victim was `@perry/postgres`'s `const CONN_STATES = new Map()`: the Map header got freed, the next `CONN_STATES.get(id)` dereferenced a freed pointer, SIGSEGV. Tracked by pg's malloc-count trigger hitting its 10k threshold around the 10-20k row mark — exactly the boundary the ticket reported. +- New `register_module_globals_as_gc_roots(&mut ctx, module_globals)` in `crates/perry-codegen/src/codegen.rs` emits one `js_gc_register_global_root(ptrtoint @perry_global___ to i64)` per module-level let/const at the top of each module's `main` (entry) or `__init` (non-entry) function, right after `js_gc_init` + the strings-init prelude. Registration uses the global's **address**, not its current value — so reassignments are followed correctly without re-registering. `mark_global_roots` already handled both NaN-boxed (POINTER_TAG / STRING_TAG / BIGINT_TAG) and raw-i64 interpretations, falling through the `valid_ptrs` filter for both, so registering every global regardless of its declared type is safe: number/boolean/undefined bits just don't match any live heap pointer. +- Repro (no postgres, minimal synthetic): `const CACHE = new Map(); put(...); allocLots(); get(1)`. Before the fix: SIGSEGV after the allocLots burst crosses the malloc-count threshold. After: prints `OK`. Full @perry/postgres bench suite: `perry-bench-crash-repro.ts` (1000×20 mixed types × 5 iterations) and `perry-bench-narrow.ts` (all int4 / bool / text / int8 / numeric × 3 iterations each = 15 queries) both pass end-to-end. + +## v0.5.27 — GC root scanners for `ws` / `http` / `events` / `fastify` closures (refs #35) +- **fix**: follow-up sweep to v0.5.26 — the net.Socket scanner pattern extended to every other stdlib module that stores user closures in Rust-side registries not visible to the GC mark phase. Same latent bug in each: user closure passed across the FFI, stored as `i64` inside a `Mutex` (ws's `WS_CLIENT_LISTENERS`) or inside a struct held by the handle registry (`WsServerHandle.listeners`, `ClientRequestHandle.response_callback` + `.listeners`, `IncomingMessageHandle.listeners`, `EventEmitterHandle.listeners`, `FastifyApp.routes[].handler` + `.hooks.*` + `.error_handler` + `.plugins[].handler`) — any malloc-triggered GC between registration and dispatch would sweep the closure and the next invocation would hit freed memory. +- New helper `common::for_each_handle_of::(|t| ...)` walks the `DashMap`-backed handle registry, downcast_ref'ing each entry to `T`. Each stdlib module adds its own `scan_X_roots(mark)` and a `Once`-guarded `ensure_gc_scanner_registered()` called from the module's create / on / connect entry points, mirroring the cron/net templates. +- **ws.rs**: scans `WS_CLIENT_LISTENERS` (global) + every `WsServerHandle` in the registry. Registered from `js_ws_on`, `js_ws_connect`, `js_ws_connect_start`, `js_ws_server_new`. +- **http.rs**: scans every `ClientRequestHandle` (response_callback + 'error' listeners) and `IncomingMessageHandle` ('data' / 'end' / 'error' listeners). Registered from `js_http_request`, `js_https_request`, `js_http_get`, `js_https_get`, `js_http_on`. +- **events.rs**: scans every `EventEmitterHandle`'s listener map. Registered from `js_event_emitter_new` and `js_event_emitter_on`. (Note: `new EventEmitter()` has a pre-existing HIR gap that routes through the user-class `New` path instead of the factory — unrelated to this fix, still happens in v0.5.26.) +- **fastify/mod.rs**: scans every `FastifyApp`'s routes, all 8 hook lists (onRequest/preParsing/preValidation/preHandler/preSerialization/onSend/onResponse/onError), `error_handler`, and plugin handlers. Registered from `js_fastify_create` / `js_fastify_create_with_opts`. Tokio dispatch copies the app into an `Arc` but `Route`/`Hooks` are `Clone` with closures stored by `i64` value — the tokio-side copy references the same `ClosureHeader` alloc, so marking via the registry entry covers both paths. +- **not covered** (intentional, no observed issue): `commander.rs` action callbacks (comment says "not automatically invoked"), `async_local_storage.rs` / `worker_threads.rs` (closures invoked immediately then discarded, never held across a GC boundary). + +## v0.5.26 — GC root scanner for `net.Socket` listener closures (closes #35) +- **fix**: `sock.on('data', cb)` stored the closure pointer in `NET_LISTENERS: Mutex>>>` as a bare `i64`, with no root scanner registered — so GC's mark phase couldn't see it. Before v0.5.25 this was a latent bug: GC only fired on arena block overflow, and event-driven code (like `@perry/postgres`'s data listener) rarely tripped it. Once v0.5.25 made `gc_malloc` trigger GC, any wrapper-heavy synchronous work (row decode, JSON parse, allocation burst between events) would fire a sweep with the listener unmarked — the sweep freed the closure, and the next dispatched `'data'` event called `js_closure_call1` on freed memory. In the pg driver the result was: iter 0 fired echoes fine (no GC yet), iter 1+ called a dead closure, the driver's parse loop stopped advancing, the outer `conn.query(...)` promise never resolved, and main() silently exited 0 when the pump had nothing left to do — exactly the symptom in the ticket. +- New `scan_net_roots(mark)` walks `NET_LISTENERS`, re-NaN-boxes each callback `i64` with `POINTER_TAG`, and calls `mark` — mirrors the existing `cron::scan_cron_roots` / `timer::scan_timer_roots` pattern. Registered lazily via a `Once` from `spawn_socket_task` (first `net.createConnection` / `tls.connect`) and `js_net_socket_on` (first `.on(...)` call on any socket), so programs that never use net don't pay the registration cost. Repro: synthetic TCP client + external echo server + 30k-iteration wrapper-allocation burst between sends — before: `dataCb=0 bytes=0` (listener freed after iter 0); after: `dataCb=5 bytes=35` ✓. +- **known remaining**: the same latent pattern still exists for `ws.rs`'s `WS_CLIENT_LISTENERS` + `WsServerHandle.listeners`, and `http.rs`'s `ClientRequest.response_callback` + `IncomingMessage.listeners`. Those registries are also Rust-side-only references to user closures — if a WS client or HTTP request lives across a GC cycle triggered by malloc pressure, its listeners will be swept. Filed as a follow-up sweep; not fixed in this commit to keep the scope tight to the issue #35 report. + +## v0.5.25 — GC from `gc_malloc` + adaptive malloc-count trigger (closes #34) +- **fix**: malloc-heavy workloads never triggered GC. `gc_check_trigger()` was only called from the arena slow path (when a block fills), but code that produces many short-lived malloc-tracked objects without pushing arena blocks — e.g. `@perry/postgres`'s `parseBigIntDecimal` (`n = n * 10n + digit` creates 2 new bigints per digit via `gc_malloc`) — accumulates indefinitely in `MALLOC_OBJECTS` until the process OOMs or heap corruption trips a malloc-allocator abort. The reported symptom was exit 139 on the second 1000-row × 20-column query or the first 10000-row query. New `gc_check_trigger()` call at the *entry* of `gc_malloc` — critically NOT at the end: running it after the header is pushed into `MALLOC_OBJECTS` would have the sweep free the about-to-be-returned pointer, since the fresh `user_ptr` lives only in a caller-saved register that setjmp's callee-saved-only conservative stack scan can't see. Running before means the allocation simply doesn't exist during any GC cycle this call triggers. +- **fix**: the malloc-count threshold was a hardcoded 10,000 in `gc_check_trigger`. Before this commit that was tolerable because the trigger rarely fired; now that `gc_malloc` calls it every allocation, a program with >10k legitimate live malloc objects (e.g. any backend holding a decent-sized cache) would GC-thrash — every single new alloc would re-trip the threshold. Replaced with a per-thread `GC_NEXT_MALLOC_TRIGGER: Cell` that's rebaselined after each collection to `survivor_count + GC_MALLOC_COUNT_STEP` (10k). Same update happens on the arena-triggered GC path so both triggers stay in sync. +- Repro synthetic: `parseBigIntDecimal('' + i)` 2M times — before: **8.45 GB peak RSS**; after: **36 MB** (233× reduction; even beats Node's 73 MB since Perry's BigInt is 1024-bit fixed-width vs Node's heap-allocated variable-width). + +## v0.5.24 — bigint arithmetic + `BigInt()` coercion (closes #33) +- **fix**: bigint literals were NaN-boxed with `POINTER_TAG` (`0x7FFD`) instead of `BIGINT_TAG` (`0x7FFA`), so `typeof 5n` returned `"object"` and the runtime's `JSValue::is_bigint()` check (used by `js_dynamic_add/sub/mul/div/mod`) said no — arithmetic on bigints fell through to `fadd/fsub/...` on the NaN-tagged bits and produced `NaN`. New `nanbox_bigint_inline` + `BIGINT_TAG_I64` constant; `Expr::BigInt` now uses the bigint tag. +- **feat**: `Expr::BigIntCoerce` was unimplemented (`BigInt(42)`/`BigInt("9223...")` failed to compile with `expression BigIntCoerce not yet supported`). Lowers to `js_bigint_from_f64` (which already dispatches on the NaN tag — pass-through for bigint, i64 conversion for int32, string parse for strings, truncate for doubles) and re-boxes with BIGINT_TAG. +- **feat**: `Expr::Binary` with either operand statically bigint-typed now dispatches to `js_dynamic_add/sub/mul/div/mod` instead of float ops. The runtime helpers unbox, call `js_bigint_`, and re-box. Mixed `bigint × int32` also works (they upcast to bigint). `is_bigint_expr` extended to recognize nested bigint `Binary` ops so `(n * 10n) + d` routes through bigint dispatch all the way up — unblocks the `@perry/postgres` `parseBigIntDecimal` pattern (digit-by-digit accumulator loop). +- **fix**: `js_console_log_dynamic` fell through to the float-number branch for bigint values because `is_bigint()` wasn't in the dispatch chain — `console.log(x)` (single-arg) printed `NaN` for every bigint. Added an `is_bigint()` branch that routes through the existing `format_jsvalue` (which already knows to print `n`). +- Regression test: `test-files/test_gap_bigint.ts` — matches Node byte-for-byte. + +## v0.5.23 — module init order + namespace import dispatch (closes #32) +- **fix**: `non_entry_module_prefixes` in `crates/perry/src/commands/compile.rs` was iterating `ctx.native_modules` (a `BTreeMap`) which produces alphabetical path order, silently discarding the topologically-sorted `non_entry_module_names` built ~700 lines earlier. Any project whose leaf modules sort AFTER their dependents (e.g. `types/registry.ts` > `connection.ts`) had its init sequence reversed — a top-level `registerDefaultCodecs()` call in `register-defaults.ts` would run BEFORE `types/registry.ts`'s init allocated the `REGISTRY_OIDS` array, so every push wrote to a stale (0.0-initialized) global while later readers loaded the correctly-initialized one. Symptom: module-level registries/plugin tables appeared empty to every consumer even though primitives (`let registered = false`) looked shared. Fix: iterate the already-sorted `non_entry_module_names` instead. +- **fix**: `import * as O from './oids'; O.OID_INT2` in `crates/perry-codegen/src/expr.rs` was falling through the PropertyGet handler to the generic `js_object_get_field_by_name_f64(TAG_TRUE, "OID_INT2")` path because the ExternFuncRef-of-namespace case wasn't distinguished from ExternFuncRef-of-variable. The namespace binding `O` has no `perry_fn___O` getter (it's a namespace, not an exported value), so calling the getter path would link-fail; the codegen fell back to lowering `O` as the TAG_TRUE sentinel and did a field lookup on that, silently returning `undefined` for every namespaced import. Added a PropertyGet fast path: if `object` is `ExternFuncRef { name }` and `name` is in `ctx.namespace_imports`, resolve `property` through `import_function_prefixes` (already populated by the namespace-export walk in compile.rs) and emit a direct `perry_fn___()` call. Second half of GH #32 — the registry duplication report was actually two separate bugs stacked together. +- Regression test: `test-files/module-init-order/` (leaf registry + namespace import + top-level registerAll() call + main consumer). Without either fix, `count=0` and all lookups return `MISSING`; with both fixes, `count=3` and lookups resolve correctly. + +## v0.5.22 — doc example URLs + compile output noise cleanup (refs #26) +- **docs**: fetch/axios quickstart examples in `docs/src/stdlib/http.md` and `docs/native-libraries.md` swapped from `https://api.example.com/data` (IANA-reserved placeholder that never resolves) to `https://jsonplaceholder.typicode.com/posts/1` (public JSON test API) so copy-paste-and-run works for first-time users. In-widget scaffolding examples left alone — those are snippets inside larger user apps. +- **compile**: `Module init order (0 modules):` (leftover debug aid from a past crash diagnosis) and `auto-optimize: Perry workspace source not found, using prebuilt libperry_runtime.a + libperry_stdlib.a` (fires 100% of the time for Homebrew/apt users since they don't have the workspace) are now gated behind `--verbose`. The rest of the compile output (`Collecting modules...`, `Generating code...`, `Wrote object file`, `Linking (with stdlib)...`, `Wrote executable`, `Binary size`) stays — those are legit progress markers. Threaded `verbose: u8` through `compile::run()` → `build_optimized_libs()` (previously `_verbose`, unused). +- **ci**: `.github/workflows/release-packages.yml` now pins `MACOSX_DEPLOYMENT_TARGET=13.0` for the macOS bottle builds. The `macos-15` runner was stamping `LC_BUILD_VERSION` on every stdlib `.o` with the host's 15.x version, so any user linking on macOS 14 or earlier saw `ld: warning: ... was built for newer 'macOS' version (15.5) than being linked (14.x)` across dozens of object files in libperry_stdlib.a. Functionally harmless, visually ugly. Will take effect on the next release cut — users on existing bottles still see the warnings until then. + +## v0.5.21 — fastify header dispatch + gc() safety in servers (closes #30, #31) +- **fix**: `request.header('X')` / `request.headers['X']` returned undefined/null in Fastify handlers because the handler param was typed `any`, so the HIR didn't tag it as `FastifyRequest` → property access fell through to generic object lookup instead of the fastify FFI. New `pre_scan_fastify_handler_params()` in the HIR pre-registers the first two params of `app.get|post|put|delete|patch|head|options|all|addHook|setErrorHandler` arrow handlers as fastify Request/Reply native instances. Also added `NA_JSV` (pass NaN-boxed bits as i64) and `NR_STR` (NaN-box string return with STRING_TAG) arg/return kinds so the receiver methods `js_fastify_req_header(ctx, name: i64)` etc. get the right ABI shape; without this the bitcast was wrong and `JSON.stringify` on the returned string segfaulted. +- **fix**: `gc()` from `setInterval` SEGVd in Fastify+WS servers because the mark-sweep GC only scans the main thread's stack, but tokio worker threads hold live JSValue refs on their stacks that the scanner can't see → GC frees still-referenced objects → next access crashes. Added `GC_UNSAFE_ZONES` atomic in perry-runtime; Fastify/WS server creation increments it, WS server close decrements it. `js_gc_collect()` now checks the counter and skips collection (with a one-shot warning) when any tokio-based server is active. Full stop-the-world GC synchronization is a v0.5.22 followup. + +## v0.5.20 — String.length returns UTF-16 code units (closes #18 partially) +- **fix**: `String.length` now returns UTF-16 code unit count instead of UTF-8 byte count, matching JavaScript semantics. `"café".length` → 4 (was 5), `"日本語".length` → 3 (was 9), `"😀".length` → 2 (was 4). `StringHeader` gains `utf16_len` at offset 0 (codegen inline `.length` unchanged) + `byte_len` for internal ops. All position-based APIs (`charAt`, `slice`, `substring`, `indexOf`, `lastIndexOf`, `padStart`, `padEnd`, `toCharArray`) converted to UTF-16 indexing with ASCII fast path. `test_gap_string_methods` DIFF (4) → DIFF (2, lone surrogates only). Fixes NFC/NFD `.normalize().length` parity. + +## v0.5.19 — fix Fastify/MySQL segfault on Linux, restore native module dispatch, fix gc() (closes #28) +- **fix**: `gc()` calls emitted bare `gc` symbol instead of `js_gc_collect` — caused `undefined reference to 'gc'` linker error (macOS) or segfault at runtime (Linux with `--warn-unresolved-symbols`). Added explicit dispatch in `lower_call.rs` ExternFuncRef handler. +- **fix**: Fastify/MySQL/WS/pg/ioredis/MongoDB/better-sqlite3 binaries compiled but did nothing at runtime — the entire native module dispatch table from the old Cranelift codegen was lost in the v0.5.0 LLVM cutover. All `NativeMethodCall` nodes for these modules fell through to the catch-all that returns `double 0.0`, so no runtime functions were ever called. Added `NATIVE_MODULE_TABLE` with table-driven dispatch for ~100 methods across 15+ native modules. +- **fix**: removed `--warn-unresolved-symbols` from Linux linker flags — this flag silently converted link errors to warnings, producing binaries with null function pointers that segfaulted at runtime instead of failing at link time. +- **fix**: MySQL `pool.query()`/`pool.execute()` routed to `js_mysql2_connection_*` instead of `js_mysql2_pool_*` — caused "Invalid connection handle" errors. Added `class_filter` to `NativeModSig` so `class_name: "Pool"` dispatches to pool-specific runtime functions; `"PoolConnection"` dispatches to pool-connection functions. HIR `class_name` now threaded through to `lower_native_method_call`. +- **fix**: `new WebSocketServer({port: N})` went through the empty-object placeholder in `lower_builtin_new` instead of calling `js_ws_server_new`. Added dedicated `WebSocketServer` case. Fixed `js_ws_send` arg type (was NA_F64, now NA_STR matching the `(i64, i64)` runtime signature). + +## v0.5.18 — native axios, fetch segfault fix, type stubs (closes #24, #25, #26, #27) +- **feat**: native `axios` dispatch — `axios.get/post/put/delete/patch` and `response.status/.data/.statusText` now compile natively without `--enable-js-runtime` or npm install. Added to `NATIVE_MODULES`, HIR native instance tracking, codegen dispatch, and `http-client` feature mapping. +- **fix**: `await fetch(url)` segfaulted because `body` (undefined for GET) NaN-unboxed to `0x1`, dereferenced as a valid pointer. Fixed `string_from_header` to treat pointers below page size as invalid. +- **fix**: await loop never drained stdlib async queue — added `js_run_stdlib_pump()` call so tokio-based fetch/DB results actually resolve. +- **fix**: `llvm-ar not found` warning downgraded from `ERROR` to soft skip with install instructions (non-fatal, strip-dedup is optional). +- **feat**: `.d.ts` type stubs for `perry/ui`, `perry/thread`, `perry/i18n`, `perry/system`. `perry init` generates `tsconfig.json` with paths; new `perry types` command for existing projects. + +## v0.5.17 (llvm-backend) — scalar replacement of non-escaping objects + Static Hermes benchmarks +- **perf**: escape analysis identifies `let p = new Point(x, y)` where `p` never escapes (only PropertyGet/PropertySet uses); fields are decomposed into stack allocas that LLVM promotes to registers — zero heap allocation. `object_create` 10ms→4ms (2.5x), `binary_trees` 9ms→3ms (3x), peak RSS 97MB→5MB. Perry now beats Node.js on all 15 benchmarks. +- **feat**: benchmark suite (`benchmarks/suite/run_benchmarks.sh`) now includes Static Hermes (Meta's AOT JS compiler) as a 4th comparison target alongside Node.js and Bun, with automatic TS→JS type-stripping. Updated README with full 4-way comparison tables and refreshed polyglot numbers. + +## v0.5.16 (llvm-backend) — watchOS device target: arm64_32 instead of arm64 +- **fix**: `--target watchos` emitted `aarch64-apple-watchos` (regular 64-bit ARM) objects, but Apple Watch hardware requires `arm64_32` (ILP32 — 32-bit pointers on 64-bit ARM). Changed LLVM triple to `arm64_32-apple-watchos`, Rust target to `arm64_32-apple-watchos`, and link triple to `arm64_32-apple-watchos10.0`. The simulator target (`watchos-simulator`) is unchanged — it correctly uses host-native aarch64. This fixes the ABI incompatibility that prevented device builds from linking with the LLVM-based runtime. + +## v0.5.15 (llvm-backend) — perry/ui State dispatch + check-deps fix (closes #24, #25) +- **fix**: `State(0)` constructor and `.value`/`.set()` instance methods were missing from the LLVM codegen dispatch tables, producing "not in dispatch table" warnings and silently returning `undefined`. Added `State` → `perry_ui_state_create` to `PERRY_UI_TABLE` and `value` → `perry_ui_state_get` / `set` → `perry_ui_state_set` to `PERRY_UI_INSTANCE_TABLE`. +- **fix**: `perry check --check-deps` flagged `perry/ui`, `perry/thread`, `perry/i18n` as missing npm packages (R003) and as unsupported Node.js built-ins (U006). New `is_perry_builtin()` guard skips resolution and diagnostics for all `perry/*` imports. + +## v0.5.14 (llvm-backend) — Windows build fix: date.rs POSIX-only APIs +- **fix**: `timestamp_to_local_components` used `libc::localtime_r` and `tm_gmtoff`, both POSIX-only — broke the Windows CI build. Split into `#[cfg(unix)]` (keeps `localtime_r` + `tm_gmtoff`) and `#[cfg(windows)]` (uses `libc::localtime_s` / `libc::gmtime_s`, derives tz offset by comparing local vs UTC breakdowns). + +## v0.5.13 (llvm-backend) — Buffer.indexOf/includes dispatch fix +- **fix**: `Buffer.indexOf()` and `Buffer.includes()` were incorrectly routed through the string method path in codegen, because the `is_string_only_method` guard didn't exclude `Uint8Array`/`Buffer` types. Added a `static_type_of` check that skips the string dispatch when the receiver is typed as `Uint8Array` or `Buffer`, letting these methods fall through to `dispatch_buffer_method` via `js_native_call_method` as intended. +- **cleanup**: removed leftover debug `eprintln!` in `js_buffer_index_of`. + +## v0.5.12 (llvm-backend) — perry/ui widget dispatch — mango renders its full UI +- **feat**: follow-up to v0.5.10 which landed only `App({...})`. This commit adds the rest of the perry/ui surface to `lower_native_method_call` via a table-driven dispatcher (`PERRY_UI_TABLE` of `UiSig { method, runtime, args, ret }` entries using `UiArgKind::{Widget,Str,F64,Closure,I64Raw}` / `UiReturnKind::{Widget,F64,Void}`). ~40 widget methods covered in one pass: `Text` / `TextField` / `TextArea` / `Spacer` / `Divider` / `ScrollView` constructors; `menuCreate` / `menuAddItem` / `menuBarCreate` / `menuBarAttach` / `menuBarAddMenu`; text setters (`textSetFontSize` / `textSetColor` / `textSetString` / `textSetFontFamily` / `textSetFontWeight` / `textSetWraps`); button setters (`buttonSetBordered` / `buttonSetTextColor` / `buttonSetTitle`); widget mutators (`widgetAddChild` / `widgetClearChildren` / `widgetSetHidden` / `widgetSetWidth` / `widgetSetHeight` / `widgetSetHugging` / `widgetMatchParentWidth` / `widgetMatchParentHeight` / `widgetSetBackgroundColor` / `widgetSetBackgroundGradient` / `setCornerRadius`); stack mutators (`stackSetAlignment` / `stackSetDistribution`); `scrollviewSetChild`; `textfieldSetString` / `textareaSetString`. Runtime fns lazy-declared via `ctx.pending_declares`. +- **feat**: `VStack` / `HStack` get a dedicated special case because the TS call shape (`VStack(spacing, [children])` or `VStack([children])`) doesn't fit the table — spacing is optional and children is a variadic array that needs one `perry_ui_widget_add_child` call per element. We stash the parent handle in an entry alloca so subsequent blocks reload it, then walk the array fast path. +- **feat**: `Button` also gets a special case because the handler closure arg must stay NaN-boxed (f64), not unboxed to i64, and the label is a raw cstr pointer — neither shape is expressible as a single `UiArgKind` row. +- **fix**: one naming inconsistency found while building the table — the runtime fn is `perry_ui_set_widget_hidden` (with `set` first, unlike every other `widget_*` setter). Fixed in the table. +- **result**: `mango/src/app.ts -o Mango` now launches and renders the full UI tree — title bar, "Welcome to Mango" heading, "MongoDB Study Tool" subtitle, "Databases & Collections / Query & Plan / Edit & Insert / Index Viewer" menu items, and the orange "+ New Connection" button all visible in the screenshot. Verified by launching the compiled binary, positioning the window onscreen via osascript, and `/usr/sbin/screencapture`. The v0.5.0 LLVM cutover regression (mango compiled clean but exited silently with an empty window) is fully resolved. + +## v0.5.11 (llvm-backend) — inline-allocator regression fixes (parity 80% → 94%) +- **fix**: the inline bump-allocator hoist (v0.5.0-followup) cached `@perry_class_keys_` in a function-entry stack slot, but the entry-block hoist ran BEFORE `__perry_init_strings_*` (which is what populates the global). So freshly-allocated objects had a null `keys_array` and `js_object_get_field_by_name` returned `undefined` for every field — `test_array_of_objects` showed `sorted[0].name → undefined`. New `LlFunction::entry_init_boundary` + `entry_post_init_setup`: alloca stays at the very top (dominates), but the load+store splices in AFTER the init prelude. `mark_entry_init_boundary()` is called immediately after `js_gc_init` / `__perry_init_strings_*` / non-entry module inits in `compile_module_entry`. +- **fix**: the inline allocator skipped `register_class(child, parent)` (the runtime allocators do it on every alloc). With every class instance going through the inline path, the CLASS_REGISTRY was never populated and `instanceof` walks broke at the first hop — `test_edge_classes` showed `square instanceof Rectangle → false` for a `class Square extends Rectangle extends Shape`. New public `js_register_class_parent(child, parent)` extern; codegen emits one call per inheriting class in `__perry_init_strings_*` (sorted by class id). +- **infra**: parity script normalize_output now strips Node v25 `MODULE_TYPELESS_PACKAGE_JSON` warnings (4 lines printed to stderr per test file without `"type": "module"` in package.json — pure environmental noise that started after the Node v25 upgrade). +- **result**: parity sweep 96 PASS / 6 FAIL / 0 COMPILE_FAIL = **94.1%**, beating the v0.5.0 baseline of 91.8%. Remaining 6 DIFFs are all pre-existing (timer precision, lookbehind regex, lone surrogates, NFC/NFD, async-generator baseline) — verified by reproducing on the pre-optimization commit. Numeric benchmarks (object_create 8ms, binary_trees 7ms, factorial 25ms) still beat or tie Node on every workload — the fix didn't regress any of the v0.5.2 wins. + +## v0.5.10 (llvm-backend) — `perry/ui.App({...})` dispatch — mango actually launches +- **fix**: the LLVM backend port (v0.5.0 cutover) silently dropped `perry/ui` dispatch — receiver-less `NativeMethodCall { module: "perry/ui", method, object: None }` fell into `lower_native_method_call`'s catch-all early-out at `lower_call.rs:1922` and returned `double 0.0`. So `App({title, width, height, body})` at the end of any perry/ui app silently no-op'd, the binary completed init without entering `NSApplication.run()`, and exited with no output. Mango compiled cleanly under v0.5.0 through v0.5.9 but couldn't actually launch — the regression was masked because the driver doesn't have an integration test that runs the resulting binary. New per-method dispatch in `lower_call.rs::lower_native_method_call` that recognizes `perry/ui.App({...})`, walks the args[0] object literal for `title` / `width` / `height` / `icon` / `body`, lazy-declares `perry_ui_app_create` / `perry_ui_app_set_icon` / `perry_ui_app_set_body` / `perry_ui_app_run` via `pending_declares`, and emits the create/set-icon/set-body/run sequence. Verified by compiling `mango/src/app.ts -o Mango`, launching the binary, and screenshotting a native macOS window titled "Mango" (menubar shows Mango/Edit/Window — proof that NSApplication.run() is now being entered). The window's content area is empty because the other perry/ui constructors (Text/Button/VStack/HStack/etc.) are still in the same dropped state — full widget dispatch is the next followup. This commit lands `App()` only as a focused proof-of-concept that the linking + runtime + Mach-O code path works end to end. + +## v0.5.9 (llvm-backend) — `let C = SomeClass; new C()` correctness + alias type refinement +- **fix**: `let C = SomeClass; new C()` now actually creates an instance of `SomeClass` instead of returning the empty-object placeholder. New `local_class_aliases: HashMap` and `local_id_to_name: HashMap` fields on `FnCtx`, populated by `Stmt::Let` when the init is `Expr::ClassRef(name)` (direct alias) or `Expr::LocalGet(other_id)` where `other_id`'s name is itself an alias (chain — `let A = X; let B = A; new B()`). `lower_new` shadows its `class_name` parameter with the resolved name early so the rest of the function (alloc + ctor inline + field offsets) uses the real class. Critically, `refine_type_from_init` for `Expr::New` *also* resolves through `local_class_aliases`, so `let b: any = new C()` refines `b`'s static type to `Named("SomeClass")` not `Named("C")` — without this, the PropertyGet fast path would look up "C" in `ctx.classes`, find nothing, fall through to `js_object_get_field_by_name_f64`, and return undefined for fields that were correctly initialized in memory by the inline allocator. Verified with three test shapes: direct alias (`const C = Foo; const a = new C()`), 3-step chain (`const A = Bar; const B = A; const b = new B()`), and in-function (`function f() { const D = Foo; return new D() }`). Mango compiles cleanly. + ## v0.5.8 (llvm-backend) — `Expr::NewDynamic` static reroute + conditional callee branching The sixth followup from the v0.5.1 mango compile sweep. Improves `Expr::NewDynamic` handling beyond the original v0.5.1 "empty-object placeholder for everything except `globalThis.X`" pragmatic fix. Closes the followup item: "NewDynamic for non-globalThis callees currently returns an empty object placeholder." diff --git a/CLAUDE.md b/CLAUDE.md index a9171c5bb..62c5e9e74 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -2,55 +2,35 @@ This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. -**NOTE**: This file is kept intentionally concise (~300 lines) because it is loaded into every conversation. Detailed historical changelogs are in CHANGELOG.md. When adding new changes, keep entries to 1-2 lines max and move older entries to CHANGELOG.md periodically. +**NOTE**: Keep this file concise. Detailed changelogs live in CHANGELOG.md. ## Project Overview Perry is a native TypeScript compiler written in Rust that compiles TypeScript source code directly to native executables. It uses SWC for TypeScript parsing and LLVM for code generation. -**Current Version:** 0.5.28 +**Current Version:** 0.5.58 ## TypeScript Parity Status -Tracked via the gap test suite (`test-files/test_gap_*.ts`, 22 tests). Each test exercises a feature cluster and is compared byte-for-byte against `node --experimental-strip-types`. Run via `/tmp/run_gap_tests.sh` after `cargo build --release -p perry-runtime -p perry-stdlib -p perry`. +Tracked via the gap test suite (`test-files/test_gap_*.ts`, 22 tests). Compared byte-for-byte against `node --experimental-strip-types`. Run via `/tmp/run_gap_tests.sh` after `cargo build --release -p perry-runtime -p perry-stdlib -p perry`. -**Last sweep (post-v0.4.87):** **8/22 passing**, **347 total diff lines**. +**Last sweep:** **8/22 passing**, **347 total diff lines**. | Status | Test | Diffs | |--------|------|-------| -| ✅ PASS | `date_methods` | 0 | -| ✅ PASS | `encoding_timers` | 0 | -| ✅ PASS | `error_extensions` | 0 | -| ✅ PASS | `fetch_response` | 0 | -| ✅ PASS | `json_advanced` | 0 | -| ✅ PASS | `node_path` | 0 | -| ✅ PASS | `node_process` | 0 | -| ✅ PASS | `weakref_finalization` | 0 | -| 🟡 close | `regexp_advanced` | 2 (lookbehind only) | -| 🟡 close | `generators` | 3 | -| 🟡 close | `number_math` | 4 | -| 🟡 close | `string_methods` | 2 (lone surrogates only) | -| 🟡 mid | `class_advanced` | 18 | -| 🟡 mid | `proxy_reflect` | 27 (segfault) | -| 🟡 mid | `object_methods` | 28 | -| 🟡 mid | `node_fs` | 30 | -| 🟡 mid | `global_apis` | 30 | -| 🔴 work | `symbols` | 31 (segfault) | -| 🔴 work | `async_advanced` | 35 (segfault) | -| 🔴 work | `console_methods` | 40 | -| 🔴 work | `array_methods` | 45 | -| 🔴 work | `node_crypto_buffer` | 46 | - -**Known categorical gaps**: lookbehind regex (Rust `regex` crate limitation), `Proxy`/`Reflect` not implemented, `Symbol(...)` returns garbage, `Object.getPrototypeOf` returns wrong sentinel, `console.dir` formatting differs from Node, `console.group*` doesn't indent, `console.table` works for the standard shapes, lone surrogate handling (`isWellFormed`/`toWellFormed` — needs WTF-8 support). - -**Next-impact targets** (biggest single-commit wins): `console.dir` formatting + `console.group` indent (~15 lines), `Promise.withResolvers` + segfault fix (~35 lines), `URL`/`Blob`/`AbortController` extensions (~15 lines), `Proxy` identity stub (~10 lines), `Symbol` sentinel stub (~10 lines). +| ✅ PASS | `date_methods`, `encoding_timers`, `error_extensions`, `fetch_response`, `json_advanced`, `node_path`, `node_process`, `weakref_finalization` | 0 | +| 🟡 close | `regexp_advanced` (2), `generators` (3), `number_math` (4), `string_methods` (2) | 2–4 | +| 🟡 mid | `class_advanced` (18), `proxy_reflect` (27, segfault), `object_methods` (28), `node_fs` (30), `global_apis` (30) | 18–30 | +| 🔴 work | `symbols` (31, segfault), `async_advanced` (35, segfault), `console_methods` (40), `array_methods` (45), `node_crypto_buffer` (46) | 31–46 | + +**Known categorical gaps**: lookbehind regex (Rust `regex` crate), `Proxy`/`Reflect`, `Symbol(...)`, `Object.getPrototypeOf` sentinel, `console.dir`/`console.group*` formatting, lone surrogate handling (WTF-8). ## Workflow Requirements **IMPORTANT:** Follow these practices for every code change: 1. **Update CLAUDE.md**: Add 1-2 line entry in "Recent Changes" for new features/fixes -2. **Increment Version**: Bump patch version (e.g., 0.2.147 → 0.2.148) +2. **Increment Version**: Bump patch version (e.g., 0.5.48 → 0.5.49) 3. **Commit Changes**: Include code changes and CLAUDE.md updates together ## Build Commands @@ -99,23 +79,19 @@ Key functions: `js_nanbox_string/pointer/bigint`, `js_nanbox_get_pointer`, `js_g ## Garbage Collection -Mark-sweep GC in `crates/perry-runtime/src/gc.rs` with conservative stack scanning. Arena objects (arrays, objects) discovered by linear block walking (zero per-alloc tracking). Malloc objects (strings, closures, promises, bigints, errors) tracked in thread-local Vec. Triggers on new arena block allocation (~8MB) or explicit `gc()` call. 8-byte GcHeader per allocation. +Mark-sweep GC in `crates/perry-runtime/src/gc.rs` with conservative stack scanning. Arena objects (arrays, objects) discovered by linear block walking. Malloc objects (strings, closures, promises, bigints, errors) tracked in thread-local Vec. Triggers on arena block allocation (~8MB), malloc count threshold, or explicit `gc()` call. 8-byte GcHeader per allocation. ## Threading (`perry/thread`) -User code is single-threaded by default. `perry/thread` module provides three primitives with compile-time safety (no mutable captures allowed): - -- **`parallelMap(array, fn)`** — data-parallel array processing across all CPU cores -- **`parallelFilter(array, fn)`** — data-parallel array filtering across all CPU cores +Single-threaded by default. `perry/thread` provides: +- **`parallelMap(array, fn)`** / **`parallelFilter(array, fn)`** — data-parallel across all cores - **`spawn(fn)`** — background OS thread, returns Promise -Values cross threads via `SerializedValue` deep-copy (zero-cost for numbers, O(n) for strings/arrays/objects). Each thread has independent arena + GC. Arena `Drop` frees blocks when worker threads exit. Results from `spawn` flow back via `PENDING_THREAD_RESULTS` queue, drained during `js_promise_run_microtasks()`. - -**Compiler pipeline** also parallelized via rayon: module codegen, transform passes, and nm symbol scanning. +Values cross threads via `SerializedValue` deep-copy. Each thread has independent arena + GC. Results from `spawn` flow back via `PENDING_THREAD_RESULTS` queue, drained during `js_promise_run_microtasks()`. ## Native UI (`perry/ui`) -Declarative TypeScript compiles to AppKit/UIKit calls. 47 `perry_ui_*` FFI functions. Handle-based widget system (1-based i64 handles, NaN-boxed with POINTER_TAG). 5 reactive binding types dispatched from `state_set()`. `--target ios-simulator`/`--target ios`/`--target tvos-simulator`/`--target tvos` for cross-compilation. +Declarative TypeScript compiles to AppKit/UIKit calls. Handle-based widget system (1-based i64 handles, NaN-boxed with POINTER_TAG). `--target ios-simulator`/`--target ios`/`--target tvos-simulator`/`--target tvos` for cross-compilation. **To add a new widget** — change 4 places: 1. Runtime: `crates/perry-ui-macos/src/widgets/` — create widget, `register_widget(view)` @@ -125,18 +101,16 @@ Declarative TypeScript compiles to AppKit/UIKit calls. 47 `perry_ui_*` FFI funct ## Compiling npm Packages Natively (`perry.compilePackages`) -Projects can list npm packages to compile natively instead of routing to V8. Configured in `package.json`: - +Configured in `package.json`: ```json { "perry": { "compilePackages": ["@noble/curves", "@noble/hashes"] } } ``` - -**Dedup logic**: When `@noble/hashes` appears in multiple `node_modules/`, the first-resolved directory is cached in `compile_package_dirs`. Subsequent imports redirect to the same copy, preventing duplicate linker symbols. +First-resolved directory cached in `compile_package_dirs`; subsequent imports redirect to the same copy (dedup). ## Known Limitations - **No runtime type checking**: Types erased at compile time. `typeof` via NaN-boxing tags. `instanceof` via class ID chain. -- **No shared mutable state across threads**: Thread primitives enforce immutable captures at compile time. No `SharedArrayBuffer` or `Atomics`. +- **No shared mutable state across threads**: No `SharedArrayBuffer` or `Atomics`. ## Common Pitfalls & Patterns @@ -147,7 +121,6 @@ Projects can list npm packages to compile natively instead of routing to V8. Con ### LLVM Type Mismatches - Loop counter optimization produces i32 — always convert before passing to f64/i64 functions -- Check LLVM value types before conversion; handle f64↔i64, i32→f64, i32→i64 - Constructor parameters always f64 (NaN-boxed) at signature level ### Async / Threading @@ -175,152 +148,27 @@ Projects can list npm packages to compile natively instead of routing to V8. Con ## Recent Changes -For older versions (v0.4.144 and earlier), see CHANGELOG.md. - -### v0.5.28 — module globals registered as GC roots (closes #36) -- **fix**: module-level user `let`/`const` globals were LLVM `double` globals that held NaN-boxed JSValues but were NOT registered with the GC's root scanner. Only string-handle globals (from the string pool) got `js_gc_register_global_root(&@.str..handle)` at startup. The conservative stack scan could still find pointers held by stack variables, so the bug was latent until v0.5.25 made `gc_malloc` trigger GC during long-running decode loops — any program where a `Map` / `Array` / user-class instance lived only in `const X = new Map(...)` (no stack variable holding it at the moment of GC) would have `X` swept mid-cycle. The canonical victim was `@perry/postgres`'s `const CONN_STATES = new Map()`: the Map header got freed, the next `CONN_STATES.get(id)` dereferenced a freed pointer, SIGSEGV. Tracked by pg's malloc-count trigger hitting its 10k threshold around the 10-20k row mark — exactly the boundary the ticket reported. -- New `register_module_globals_as_gc_roots(&mut ctx, module_globals)` in `crates/perry-codegen/src/codegen.rs` emits one `js_gc_register_global_root(ptrtoint @perry_global___ to i64)` per module-level let/const at the top of each module's `main` (entry) or `__init` (non-entry) function, right after `js_gc_init` + the strings-init prelude. Registration uses the global's **address**, not its current value — so reassignments are followed correctly without re-registering. `mark_global_roots` already handled both NaN-boxed (POINTER_TAG / STRING_TAG / BIGINT_TAG) and raw-i64 interpretations, falling through the `valid_ptrs` filter for both, so registering every global regardless of its declared type is safe: number/boolean/undefined bits just don't match any live heap pointer. -- Repro (no postgres, minimal synthetic): `const CACHE = new Map(); put(...); allocLots(); get(1)`. Before the fix: SIGSEGV after the allocLots burst crosses the malloc-count threshold. After: prints `OK`. Full @perry/postgres bench suite: `perry-bench-crash-repro.ts` (1000×20 mixed types × 5 iterations) and `perry-bench-narrow.ts` (all int4 / bool / text / int8 / numeric × 3 iterations each = 15 queries) both pass end-to-end. - -### v0.5.27 — GC root scanners for `ws` / `http` / `events` / `fastify` closures (refs #35) -- **fix**: follow-up sweep to v0.5.26 — the net.Socket scanner pattern extended to every other stdlib module that stores user closures in Rust-side registries not visible to the GC mark phase. Same latent bug in each: user closure passed across the FFI, stored as `i64` inside a `Mutex` (ws's `WS_CLIENT_LISTENERS`) or inside a struct held by the handle registry (`WsServerHandle.listeners`, `ClientRequestHandle.response_callback` + `.listeners`, `IncomingMessageHandle.listeners`, `EventEmitterHandle.listeners`, `FastifyApp.routes[].handler` + `.hooks.*` + `.error_handler` + `.plugins[].handler`) — any malloc-triggered GC between registration and dispatch would sweep the closure and the next invocation would hit freed memory. -- New helper `common::for_each_handle_of::(|t| ...)` walks the `DashMap`-backed handle registry, downcast_ref'ing each entry to `T`. Each stdlib module adds its own `scan_X_roots(mark)` and a `Once`-guarded `ensure_gc_scanner_registered()` called from the module's create / on / connect entry points, mirroring the cron/net templates. -- **ws.rs**: scans `WS_CLIENT_LISTENERS` (global) + every `WsServerHandle` in the registry. Registered from `js_ws_on`, `js_ws_connect`, `js_ws_connect_start`, `js_ws_server_new`. -- **http.rs**: scans every `ClientRequestHandle` (response_callback + 'error' listeners) and `IncomingMessageHandle` ('data' / 'end' / 'error' listeners). Registered from `js_http_request`, `js_https_request`, `js_http_get`, `js_https_get`, `js_http_on`. -- **events.rs**: scans every `EventEmitterHandle`'s listener map. Registered from `js_event_emitter_new` and `js_event_emitter_on`. (Note: `new EventEmitter()` has a pre-existing HIR gap that routes through the user-class `New` path instead of the factory — unrelated to this fix, still happens in v0.5.26.) -- **fastify/mod.rs**: scans every `FastifyApp`'s routes, all 8 hook lists (onRequest/preParsing/preValidation/preHandler/preSerialization/onSend/onResponse/onError), `error_handler`, and plugin handlers. Registered from `js_fastify_create` / `js_fastify_create_with_opts`. Tokio dispatch copies the app into an `Arc` but `Route`/`Hooks` are `Clone` with closures stored by `i64` value — the tokio-side copy references the same `ClosureHeader` alloc, so marking via the registry entry covers both paths. -- **not covered** (intentional, no observed issue): `commander.rs` action callbacks (comment says "not automatically invoked"), `async_local_storage.rs` / `worker_threads.rs` (closures invoked immediately then discarded, never held across a GC boundary). - -### v0.5.26 — GC root scanner for `net.Socket` listener closures (closes #35) -- **fix**: `sock.on('data', cb)` stored the closure pointer in `NET_LISTENERS: Mutex>>>` as a bare `i64`, with no root scanner registered — so GC's mark phase couldn't see it. Before v0.5.25 this was a latent bug: GC only fired on arena block overflow, and event-driven code (like `@perry/postgres`'s data listener) rarely tripped it. Once v0.5.25 made `gc_malloc` trigger GC, any wrapper-heavy synchronous work (row decode, JSON parse, allocation burst between events) would fire a sweep with the listener unmarked — the sweep freed the closure, and the next dispatched `'data'` event called `js_closure_call1` on freed memory. In the pg driver the result was: iter 0 fired echoes fine (no GC yet), iter 1+ called a dead closure, the driver's parse loop stopped advancing, the outer `conn.query(...)` promise never resolved, and main() silently exited 0 when the pump had nothing left to do — exactly the symptom in the ticket. -- New `scan_net_roots(mark)` walks `NET_LISTENERS`, re-NaN-boxes each callback `i64` with `POINTER_TAG`, and calls `mark` — mirrors the existing `cron::scan_cron_roots` / `timer::scan_timer_roots` pattern. Registered lazily via a `Once` from `spawn_socket_task` (first `net.createConnection` / `tls.connect`) and `js_net_socket_on` (first `.on(...)` call on any socket), so programs that never use net don't pay the registration cost. Repro: synthetic TCP client + external echo server + 30k-iteration wrapper-allocation burst between sends — before: `dataCb=0 bytes=0` (listener freed after iter 0); after: `dataCb=5 bytes=35` ✓. -- **known remaining**: the same latent pattern still exists for `ws.rs`'s `WS_CLIENT_LISTENERS` + `WsServerHandle.listeners`, and `http.rs`'s `ClientRequest.response_callback` + `IncomingMessage.listeners`. Those registries are also Rust-side-only references to user closures — if a WS client or HTTP request lives across a GC cycle triggered by malloc pressure, its listeners will be swept. Filed as a follow-up sweep; not fixed in this commit to keep the scope tight to the issue #35 report. - -### v0.5.25 — GC from `gc_malloc` + adaptive malloc-count trigger (closes #34) -- **fix**: malloc-heavy workloads never triggered GC. `gc_check_trigger()` was only called from the arena slow path (when a block fills), but code that produces many short-lived malloc-tracked objects without pushing arena blocks — e.g. `@perry/postgres`'s `parseBigIntDecimal` (`n = n * 10n + digit` creates 2 new bigints per digit via `gc_malloc`) — accumulates indefinitely in `MALLOC_OBJECTS` until the process OOMs or heap corruption trips a malloc-allocator abort. The reported symptom was exit 139 on the second 1000-row × 20-column query or the first 10000-row query. New `gc_check_trigger()` call at the *entry* of `gc_malloc` — critically NOT at the end: running it after the header is pushed into `MALLOC_OBJECTS` would have the sweep free the about-to-be-returned pointer, since the fresh `user_ptr` lives only in a caller-saved register that setjmp's callee-saved-only conservative stack scan can't see. Running before means the allocation simply doesn't exist during any GC cycle this call triggers. -- **fix**: the malloc-count threshold was a hardcoded 10,000 in `gc_check_trigger`. Before this commit that was tolerable because the trigger rarely fired; now that `gc_malloc` calls it every allocation, a program with >10k legitimate live malloc objects (e.g. any backend holding a decent-sized cache) would GC-thrash — every single new alloc would re-trip the threshold. Replaced with a per-thread `GC_NEXT_MALLOC_TRIGGER: Cell` that's rebaselined after each collection to `survivor_count + GC_MALLOC_COUNT_STEP` (10k). Same update happens on the arena-triggered GC path so both triggers stay in sync. -- Repro synthetic: `parseBigIntDecimal('' + i)` 2M times — before: **8.45 GB peak RSS**; after: **36 MB** (233× reduction; even beats Node's 73 MB since Perry's BigInt is 1024-bit fixed-width vs Node's heap-allocated variable-width). - -### v0.5.24 — bigint arithmetic + `BigInt()` coercion (closes #33) -- **fix**: bigint literals were NaN-boxed with `POINTER_TAG` (`0x7FFD`) instead of `BIGINT_TAG` (`0x7FFA`), so `typeof 5n` returned `"object"` and the runtime's `JSValue::is_bigint()` check (used by `js_dynamic_add/sub/mul/div/mod`) said no — arithmetic on bigints fell through to `fadd/fsub/...` on the NaN-tagged bits and produced `NaN`. New `nanbox_bigint_inline` + `BIGINT_TAG_I64` constant; `Expr::BigInt` now uses the bigint tag. -- **feat**: `Expr::BigIntCoerce` was unimplemented (`BigInt(42)`/`BigInt("9223...")` failed to compile with `expression BigIntCoerce not yet supported`). Lowers to `js_bigint_from_f64` (which already dispatches on the NaN tag — pass-through for bigint, i64 conversion for int32, string parse for strings, truncate for doubles) and re-boxes with BIGINT_TAG. -- **feat**: `Expr::Binary` with either operand statically bigint-typed now dispatches to `js_dynamic_add/sub/mul/div/mod` instead of float ops. The runtime helpers unbox, call `js_bigint_`, and re-box. Mixed `bigint × int32` also works (they upcast to bigint). `is_bigint_expr` extended to recognize nested bigint `Binary` ops so `(n * 10n) + d` routes through bigint dispatch all the way up — unblocks the `@perry/postgres` `parseBigIntDecimal` pattern (digit-by-digit accumulator loop). -- **fix**: `js_console_log_dynamic` fell through to the float-number branch for bigint values because `is_bigint()` wasn't in the dispatch chain — `console.log(x)` (single-arg) printed `NaN` for every bigint. Added an `is_bigint()` branch that routes through the existing `format_jsvalue` (which already knows to print `n`). -- Regression test: `test-files/test_gap_bigint.ts` — matches Node byte-for-byte. - -### v0.5.23 — module init order + namespace import dispatch (closes #32) -- **fix**: `non_entry_module_prefixes` in `crates/perry/src/commands/compile.rs` was iterating `ctx.native_modules` (a `BTreeMap`) which produces alphabetical path order, silently discarding the topologically-sorted `non_entry_module_names` built ~700 lines earlier. Any project whose leaf modules sort AFTER their dependents (e.g. `types/registry.ts` > `connection.ts`) had its init sequence reversed — a top-level `registerDefaultCodecs()` call in `register-defaults.ts` would run BEFORE `types/registry.ts`'s init allocated the `REGISTRY_OIDS` array, so every push wrote to a stale (0.0-initialized) global while later readers loaded the correctly-initialized one. Symptom: module-level registries/plugin tables appeared empty to every consumer even though primitives (`let registered = false`) looked shared. Fix: iterate the already-sorted `non_entry_module_names` instead. -- **fix**: `import * as O from './oids'; O.OID_INT2` in `crates/perry-codegen/src/expr.rs` was falling through the PropertyGet handler to the generic `js_object_get_field_by_name_f64(TAG_TRUE, "OID_INT2")` path because the ExternFuncRef-of-namespace case wasn't distinguished from ExternFuncRef-of-variable. The namespace binding `O` has no `perry_fn___O` getter (it's a namespace, not an exported value), so calling the getter path would link-fail; the codegen fell back to lowering `O` as the TAG_TRUE sentinel and did a field lookup on that, silently returning `undefined` for every namespaced import. Added a PropertyGet fast path: if `object` is `ExternFuncRef { name }` and `name` is in `ctx.namespace_imports`, resolve `property` through `import_function_prefixes` (already populated by the namespace-export walk in compile.rs) and emit a direct `perry_fn___()` call. Second half of GH #32 — the registry duplication report was actually two separate bugs stacked together. -- Regression test: `test-files/module-init-order/` (leaf registry + namespace import + top-level registerAll() call + main consumer). Without either fix, `count=0` and all lookups return `MISSING`; with both fixes, `count=3` and lookups resolve correctly. - -### v0.5.22 — doc example URLs + compile output noise cleanup (refs #26) -- **docs**: fetch/axios quickstart examples in `docs/src/stdlib/http.md` and `docs/native-libraries.md` swapped from `https://api.example.com/data` (IANA-reserved placeholder that never resolves) to `https://jsonplaceholder.typicode.com/posts/1` (public JSON test API) so copy-paste-and-run works for first-time users. In-widget scaffolding examples left alone — those are snippets inside larger user apps. -- **compile**: `Module init order (0 modules):` (leftover debug aid from a past crash diagnosis) and `auto-optimize: Perry workspace source not found, using prebuilt libperry_runtime.a + libperry_stdlib.a` (fires 100% of the time for Homebrew/apt users since they don't have the workspace) are now gated behind `--verbose`. The rest of the compile output (`Collecting modules...`, `Generating code...`, `Wrote object file`, `Linking (with stdlib)...`, `Wrote executable`, `Binary size`) stays — those are legit progress markers. Threaded `verbose: u8` through `compile::run()` → `build_optimized_libs()` (previously `_verbose`, unused). -- **ci**: `.github/workflows/release-packages.yml` now pins `MACOSX_DEPLOYMENT_TARGET=13.0` for the macOS bottle builds. The `macos-15` runner was stamping `LC_BUILD_VERSION` on every stdlib `.o` with the host's 15.x version, so any user linking on macOS 14 or earlier saw `ld: warning: ... was built for newer 'macOS' version (15.5) than being linked (14.x)` across dozens of object files in libperry_stdlib.a. Functionally harmless, visually ugly. Will take effect on the next release cut — users on existing bottles still see the warnings until then. - -### v0.5.21 — fastify header dispatch + gc() safety in servers (closes #30, #31) -- **fix**: `request.header('X')` / `request.headers['X']` returned undefined/null in Fastify handlers because the handler param was typed `any`, so the HIR didn't tag it as `FastifyRequest` → property access fell through to generic object lookup instead of the fastify FFI. New `pre_scan_fastify_handler_params()` in the HIR pre-registers the first two params of `app.get|post|put|delete|patch|head|options|all|addHook|setErrorHandler` arrow handlers as fastify Request/Reply native instances. Also added `NA_JSV` (pass NaN-boxed bits as i64) and `NR_STR` (NaN-box string return with STRING_TAG) arg/return kinds so the receiver methods `js_fastify_req_header(ctx, name: i64)` etc. get the right ABI shape; without this the bitcast was wrong and `JSON.stringify` on the returned string segfaulted. -- **fix**: `gc()` from `setInterval` SEGVd in Fastify+WS servers because the mark-sweep GC only scans the main thread's stack, but tokio worker threads hold live JSValue refs on their stacks that the scanner can't see → GC frees still-referenced objects → next access crashes. Added `GC_UNSAFE_ZONES` atomic in perry-runtime; Fastify/WS server creation increments it, WS server close decrements it. `js_gc_collect()` now checks the counter and skips collection (with a one-shot warning) when any tokio-based server is active. Full stop-the-world GC synchronization is a v0.5.22 followup. - -### v0.5.20 — String.length returns UTF-16 code units (closes #18 partially) -- **fix**: `String.length` now returns UTF-16 code unit count instead of UTF-8 byte count, matching JavaScript semantics. `"café".length` → 4 (was 5), `"日本語".length` → 3 (was 9), `"😀".length` → 2 (was 4). `StringHeader` gains `utf16_len` at offset 0 (codegen inline `.length` unchanged) + `byte_len` for internal ops. All position-based APIs (`charAt`, `slice`, `substring`, `indexOf`, `lastIndexOf`, `padStart`, `padEnd`, `toCharArray`) converted to UTF-16 indexing with ASCII fast path. `test_gap_string_methods` DIFF (4) → DIFF (2, lone surrogates only). Fixes NFC/NFD `.normalize().length` parity. - -### v0.5.19 — fix Fastify/MySQL segfault on Linux, restore native module dispatch, fix gc() (closes #28) -- **fix**: `gc()` calls emitted bare `gc` symbol instead of `js_gc_collect` — caused `undefined reference to 'gc'` linker error (macOS) or segfault at runtime (Linux with `--warn-unresolved-symbols`). Added explicit dispatch in `lower_call.rs` ExternFuncRef handler. -- **fix**: Fastify/MySQL/WS/pg/ioredis/MongoDB/better-sqlite3 binaries compiled but did nothing at runtime — the entire native module dispatch table from the old Cranelift codegen was lost in the v0.5.0 LLVM cutover. All `NativeMethodCall` nodes for these modules fell through to the catch-all that returns `double 0.0`, so no runtime functions were ever called. Added `NATIVE_MODULE_TABLE` with table-driven dispatch for ~100 methods across 15+ native modules. -- **fix**: removed `--warn-unresolved-symbols` from Linux linker flags — this flag silently converted link errors to warnings, producing binaries with null function pointers that segfaulted at runtime instead of failing at link time. -- **fix**: MySQL `pool.query()`/`pool.execute()` routed to `js_mysql2_connection_*` instead of `js_mysql2_pool_*` — caused "Invalid connection handle" errors. Added `class_filter` to `NativeModSig` so `class_name: "Pool"` dispatches to pool-specific runtime functions; `"PoolConnection"` dispatches to pool-connection functions. HIR `class_name` now threaded through to `lower_native_method_call`. -- **fix**: `new WebSocketServer({port: N})` went through the empty-object placeholder in `lower_builtin_new` instead of calling `js_ws_server_new`. Added dedicated `WebSocketServer` case. Fixed `js_ws_send` arg type (was NA_F64, now NA_STR matching the `(i64, i64)` runtime signature). - -### v0.5.18 — native axios, fetch segfault fix, type stubs (closes #24, #25, #26, #27) -- **feat**: native `axios` dispatch — `axios.get/post/put/delete/patch` and `response.status/.data/.statusText` now compile natively without `--enable-js-runtime` or npm install. Added to `NATIVE_MODULES`, HIR native instance tracking, codegen dispatch, and `http-client` feature mapping. -- **fix**: `await fetch(url)` segfaulted because `body` (undefined for GET) NaN-unboxed to `0x1`, dereferenced as a valid pointer. Fixed `string_from_header` to treat pointers below page size as invalid. -- **fix**: await loop never drained stdlib async queue — added `js_run_stdlib_pump()` call so tokio-based fetch/DB results actually resolve. -- **fix**: `llvm-ar not found` warning downgraded from `ERROR` to soft skip with install instructions (non-fatal, strip-dedup is optional). -- **feat**: `.d.ts` type stubs for `perry/ui`, `perry/thread`, `perry/i18n`, `perry/system`. `perry init` generates `tsconfig.json` with paths; new `perry types` command for existing projects. - -### v0.5.17 (llvm-backend) — scalar replacement of non-escaping objects + Static Hermes benchmarks -- **perf**: escape analysis identifies `let p = new Point(x, y)` where `p` never escapes (only PropertyGet/PropertySet uses); fields are decomposed into stack allocas that LLVM promotes to registers — zero heap allocation. `object_create` 10ms→4ms (2.5x), `binary_trees` 9ms→3ms (3x), peak RSS 97MB→5MB. Perry now beats Node.js on all 15 benchmarks. -- **feat**: benchmark suite (`benchmarks/suite/run_benchmarks.sh`) now includes Static Hermes (Meta's AOT JS compiler) as a 4th comparison target alongside Node.js and Bun, with automatic TS→JS type-stripping. Updated README with full 4-way comparison tables and refreshed polyglot numbers. - -### v0.5.16 (llvm-backend) — watchOS device target: arm64_32 instead of arm64 -- **fix**: `--target watchos` emitted `aarch64-apple-watchos` (regular 64-bit ARM) objects, but Apple Watch hardware requires `arm64_32` (ILP32 — 32-bit pointers on 64-bit ARM). Changed LLVM triple to `arm64_32-apple-watchos`, Rust target to `arm64_32-apple-watchos`, and link triple to `arm64_32-apple-watchos10.0`. The simulator target (`watchos-simulator`) is unchanged — it correctly uses host-native aarch64. This fixes the ABI incompatibility that prevented device builds from linking with the LLVM-based runtime. - -### v0.5.15 (llvm-backend) — perry/ui State dispatch + check-deps fix (closes #24, #25) -- **fix**: `State(0)` constructor and `.value`/`.set()` instance methods were missing from the LLVM codegen dispatch tables, producing "not in dispatch table" warnings and silently returning `undefined`. Added `State` → `perry_ui_state_create` to `PERRY_UI_TABLE` and `value` → `perry_ui_state_get` / `set` → `perry_ui_state_set` to `PERRY_UI_INSTANCE_TABLE`. -- **fix**: `perry check --check-deps` flagged `perry/ui`, `perry/thread`, `perry/i18n` as missing npm packages (R003) and as unsupported Node.js built-ins (U006). New `is_perry_builtin()` guard skips resolution and diagnostics for all `perry/*` imports. - -### v0.5.14 (llvm-backend) — Windows build fix: date.rs POSIX-only APIs -- **fix**: `timestamp_to_local_components` used `libc::localtime_r` and `tm_gmtoff`, both POSIX-only — broke the Windows CI build. Split into `#[cfg(unix)]` (keeps `localtime_r` + `tm_gmtoff`) and `#[cfg(windows)]` (uses `libc::localtime_s` / `libc::gmtime_s`, derives tz offset by comparing local vs UTC breakdowns). - -### v0.5.13 (llvm-backend) — Buffer.indexOf/includes dispatch fix -- **fix**: `Buffer.indexOf()` and `Buffer.includes()` were incorrectly routed through the string method path in codegen, because the `is_string_only_method` guard didn't exclude `Uint8Array`/`Buffer` types. Added a `static_type_of` check that skips the string dispatch when the receiver is typed as `Uint8Array` or `Buffer`, letting these methods fall through to `dispatch_buffer_method` via `js_native_call_method` as intended. -- **cleanup**: removed leftover debug `eprintln!` in `js_buffer_index_of`. - -### v0.5.12 (llvm-backend) — perry/ui widget dispatch — mango renders its full UI -- **feat**: follow-up to v0.5.10 which landed only `App({...})`. This commit adds the rest of the perry/ui surface to `lower_native_method_call` via a table-driven dispatcher (`PERRY_UI_TABLE` of `UiSig { method, runtime, args, ret }` entries using `UiArgKind::{Widget,Str,F64,Closure,I64Raw}` / `UiReturnKind::{Widget,F64,Void}`). ~40 widget methods covered in one pass: `Text` / `TextField` / `TextArea` / `Spacer` / `Divider` / `ScrollView` constructors; `menuCreate` / `menuAddItem` / `menuBarCreate` / `menuBarAttach` / `menuBarAddMenu`; text setters (`textSetFontSize` / `textSetColor` / `textSetString` / `textSetFontFamily` / `textSetFontWeight` / `textSetWraps`); button setters (`buttonSetBordered` / `buttonSetTextColor` / `buttonSetTitle`); widget mutators (`widgetAddChild` / `widgetClearChildren` / `widgetSetHidden` / `widgetSetWidth` / `widgetSetHeight` / `widgetSetHugging` / `widgetMatchParentWidth` / `widgetMatchParentHeight` / `widgetSetBackgroundColor` / `widgetSetBackgroundGradient` / `setCornerRadius`); stack mutators (`stackSetAlignment` / `stackSetDistribution`); `scrollviewSetChild`; `textfieldSetString` / `textareaSetString`. Runtime fns lazy-declared via `ctx.pending_declares`. -- **feat**: `VStack` / `HStack` get a dedicated special case because the TS call shape (`VStack(spacing, [children])` or `VStack([children])`) doesn't fit the table — spacing is optional and children is a variadic array that needs one `perry_ui_widget_add_child` call per element. We stash the parent handle in an entry alloca so subsequent blocks reload it, then walk the array fast path. -- **feat**: `Button` also gets a special case because the handler closure arg must stay NaN-boxed (f64), not unboxed to i64, and the label is a raw cstr pointer — neither shape is expressible as a single `UiArgKind` row. -- **fix**: one naming inconsistency found while building the table — the runtime fn is `perry_ui_set_widget_hidden` (with `set` first, unlike every other `widget_*` setter). Fixed in the table. -- **result**: `mango/src/app.ts -o Mango` now launches and renders the full UI tree — title bar, "Welcome to Mango" heading, "MongoDB Study Tool" subtitle, "Databases & Collections / Query & Plan / Edit & Insert / Index Viewer" menu items, and the orange "+ New Connection" button all visible in the screenshot. Verified by launching the compiled binary, positioning the window onscreen via osascript, and `/usr/sbin/screencapture`. The v0.5.0 LLVM cutover regression (mango compiled clean but exited silently with an empty window) is fully resolved. - -### v0.5.11 (llvm-backend) — inline-allocator regression fixes (parity 80% → 94%) -- **fix**: the inline bump-allocator hoist (v0.5.0-followup) cached `@perry_class_keys_` in a function-entry stack slot, but the entry-block hoist ran BEFORE `__perry_init_strings_*` (which is what populates the global). So freshly-allocated objects had a null `keys_array` and `js_object_get_field_by_name` returned `undefined` for every field — `test_array_of_objects` showed `sorted[0].name → undefined`. New `LlFunction::entry_init_boundary` + `entry_post_init_setup`: alloca stays at the very top (dominates), but the load+store splices in AFTER the init prelude. `mark_entry_init_boundary()` is called immediately after `js_gc_init` / `__perry_init_strings_*` / non-entry module inits in `compile_module_entry`. -- **fix**: the inline allocator skipped `register_class(child, parent)` (the runtime allocators do it on every alloc). With every class instance going through the inline path, the CLASS_REGISTRY was never populated and `instanceof` walks broke at the first hop — `test_edge_classes` showed `square instanceof Rectangle → false` for a `class Square extends Rectangle extends Shape`. New public `js_register_class_parent(child, parent)` extern; codegen emits one call per inheriting class in `__perry_init_strings_*` (sorted by class id). -- **infra**: parity script normalize_output now strips Node v25 `MODULE_TYPELESS_PACKAGE_JSON` warnings (4 lines printed to stderr per test file without `"type": "module"` in package.json — pure environmental noise that started after the Node v25 upgrade). -- **result**: parity sweep 96 PASS / 6 FAIL / 0 COMPILE_FAIL = **94.1%**, beating the v0.5.0 baseline of 91.8%. Remaining 6 DIFFs are all pre-existing (timer precision, lookbehind regex, lone surrogates, NFC/NFD, async-generator baseline) — verified by reproducing on the pre-optimization commit. Numeric benchmarks (object_create 8ms, binary_trees 7ms, factorial 25ms) still beat or tie Node on every workload — the fix didn't regress any of the v0.5.2 wins. - -### v0.5.10 (llvm-backend) — `perry/ui.App({...})` dispatch — mango actually launches -- **fix**: the LLVM backend port (v0.5.0 cutover) silently dropped `perry/ui` dispatch — receiver-less `NativeMethodCall { module: "perry/ui", method, object: None }` fell into `lower_native_method_call`'s catch-all early-out at `lower_call.rs:1922` and returned `double 0.0`. So `App({title, width, height, body})` at the end of any perry/ui app silently no-op'd, the binary completed init without entering `NSApplication.run()`, and exited with no output. Mango compiled cleanly under v0.5.0 through v0.5.9 but couldn't actually launch — the regression was masked because the driver doesn't have an integration test that runs the resulting binary. New per-method dispatch in `lower_call.rs::lower_native_method_call` that recognizes `perry/ui.App({...})`, walks the args[0] object literal for `title` / `width` / `height` / `icon` / `body`, lazy-declares `perry_ui_app_create` / `perry_ui_app_set_icon` / `perry_ui_app_set_body` / `perry_ui_app_run` via `pending_declares`, and emits the create/set-icon/set-body/run sequence. Verified by compiling `mango/src/app.ts -o Mango`, launching the binary, and screenshotting a native macOS window titled "Mango" (menubar shows Mango/Edit/Window — proof that NSApplication.run() is now being entered). The window's content area is empty because the other perry/ui constructors (Text/Button/VStack/HStack/etc.) are still in the same dropped state — full widget dispatch is the next followup. This commit lands `App()` only as a focused proof-of-concept that the linking + runtime + Mach-O code path works end to end. - -### v0.5.9 (llvm-backend) — `let C = SomeClass; new C()` correctness + alias type refinement -- **fix**: `let C = SomeClass; new C()` now actually creates an instance of `SomeClass` instead of returning the empty-object placeholder. New `local_class_aliases: HashMap` and `local_id_to_name: HashMap` fields on `FnCtx`, populated by `Stmt::Let` when the init is `Expr::ClassRef(name)` (direct alias) or `Expr::LocalGet(other_id)` where `other_id`'s name is itself an alias (chain — `let A = X; let B = A; new B()`). `lower_new` shadows its `class_name` parameter with the resolved name early so the rest of the function (alloc + ctor inline + field offsets) uses the real class. Critically, `refine_type_from_init` for `Expr::New` *also* resolves through `local_class_aliases`, so `let b: any = new C()` refines `b`'s static type to `Named("SomeClass")` not `Named("C")` — without this, the PropertyGet fast path would look up "C" in `ctx.classes`, find nothing, fall through to `js_object_get_field_by_name_f64`, and return undefined for fields that were correctly initialized in memory by the inline allocator. Verified with three test shapes: direct alias (`const C = Foo; const a = new C()`), 3-step chain (`const A = Bar; const B = A; const b = new B()`), and in-function (`function f() { const D = Foo; return new D() }`). Mango compiles cleanly. - -### v0.5.8 (llvm-backend) — `Expr::NewDynamic` static reroute + conditional callee branching -- **fix**: workspace `Cargo.toml` was missing `[profile.release.package]` `strip = false` overrides for `perry-ui-ios`, `perry-ui-tvos`, `perry-ui-android`, `perry-ui-watchos`. Same staticlib+`#[no_mangle] extern "C"` FFI contract as `perry-ui-macos` (which already had the override + the explicit "UI crates must NOT strip — they export `#[no_mangle] extern "C"` symbols" comment), so a release build of those four would have silently stripped their `perry_ui_*` symbols and broken linking user binaries on `--target ios-simulator`/`ios`/`tvos-simulator`/`tvos`/android. Hadn't bitten yet because all four are in `members` but not `default-members` — a plain `cargo build --release` skips them. Added the four missing profile blocks (`strip = false`, `codegen-units = 16`) alongside the existing macOS/gtk4/windows/geisterhand ones. No code changes, no version bump. -- **fix**: `new (Foo)()` (parenthesized ClassRef) and `new (cond ? FooClass : BarClass)()` (conditional callee) now dispatch to the right class instead of returning the empty-object placeholder. Two new shapes recognized in the `Expr::NewDynamic` lowering: (a) `Expr::ClassRef(name)` callees reroute straight to `lower_new(name, args)`, mirroring the existing `globalThis.X` reroute; (b) `Expr::Conditional { condition, then_expr, else_expr }` callees synthesize a `NewDynamic { callee: , args }` per branch and emit a runtime cond_br + phi via the existing `lower_conditional` helper, so each branch independently runs `lower_new` (or recursively the NewDynamic fallback). Nested ternaries work because the inner NewDynamic recurses through the same handler. New `try_static_class_name(callee)` helper centralizes the static-reroute pattern. The truly-dynamic fallback (`new someVar()` where the callee is a runtime value) still emits an empty-object placeholder — that needs a `js_new_dynamic(callee_value, args)` runtime helper to inspect the value's NaN tag and dispatch to the right class constructor, tracked as a v0.5.8 followup. Verified end-to-end with two TS tests: `new (cond ? Foo : Bar)()` (5 cases including a nested ternary) and `new (Foo)()` + `new arr[0]()` (placeholder fallback). Mango compiles cleanly. - -### v0.5.7 (llvm-backend) — `Expr::I18nString` compile-time resolution + runtime interpolation -- **fix**: localized strings now resolve to the right translation at compile time. Previously the `Expr::I18nString` lowering returned the verbatim KEY string regardless of the project's `default_locale`, so any user calling `t("Hello")` from `perry/i18n` got `"Hello"` instead of `"Hallo"` even with `default_locale = "de"`. New `expr::I18nLowerCtx` (threaded through `CrossModuleCtx`) carries the i18n table from `opts.i18n_table` and the default locale index. The lowering pulls `translations[default_locale_idx * key_count + string_idx]` at compile time, parses `{name}` placeholders, lowers each interpolation param's value, and emits a `js_string_concat` chain that interleaves interned literal fragments with `js_string_coerce`'d param values. Empty / missing translation cells fall back to the source key. Plurals (`plural_forms`/`plural_param`) are still ignored — uses the canonical `string_idx` form, leaving CLDR plural rule selection as a followup. Also fixed: `lower_call.rs::lower_native_method_call` was discarding `NativeMethodCall { module: "perry/i18n", method: "t", object: None, args: [I18nString] }` and returning `double 0.0` because the receiver-less early-out path didn't know about `t()`. Now special-cases the `t()` unwrap and lowers the inner I18nString directly. Added `default_locale_idx` to `CompileOptions::i18n_table` (5-tuple). Verified end-to-end with a 2-locale test: en/de translations resolve correctly when `default_locale` is switched, and missing/empty cells fall back to the source key. Mango still compiles cleanly (89 localizable strings across 13 locales). - -### v0.5.6 (llvm-backend) — perry-stdlib auto-optimize `hex` crate fix -- **fix**: `crates/perry-stdlib/src/sqlite.rs:54` was using `hex::encode(b)` to format SQLite `Blob` columns as hex strings, but the `hex` crate dep in `perry-stdlib`'s `Cargo.toml` is gated behind the `crypto` Cargo feature. Auto-optimize rebuilds that enabled only `database-sqlite` (e.g. mango: `better-sqlite3` + `mongodb` + fetch, no crypto) failed with `error[E0433]: failed to resolve: use of unresolved module or unlinked crate hex` and fell back to the prebuilt full stdlib, leaving every user binary 100KB+ larger than necessary. Replaced with a hand-rolled nibble loop (`const HEX: &[u8; 16] = b"0123456789abcdef"; for &byte in b { out.push(HEX[(byte >> 4) as usize]); out.push(HEX[(byte & 0x0f) as usize]); }`) so sqlite no longer depends on hex. Surgical fix — no Cargo.toml or auto-optimize logic changes. Mango now goes through the auto-optimize rebuild path: prebuilt-fallback 5.18 MB → optimized 5.01 MB (~168 KB / 3.4% savings, mostly from features the user doesn't import being stripped). Original fix done as a worktree-isolated subagent task; the agent's commit was based on a stale `llvm-backend` HEAD so the sqlite.rs change was applied manually here on top of v0.5.5. - -### v0.5.5 (llvm-backend) — `alloca_entry` sweep -- **fix**: 7 cross-block alloca sites in `expr.rs` / `lower_call.rs` / `stmt.rs` migrated to `LlFunction.alloca_entry()` to close the latent SSA dominance hazards flagged in v0.5.2's followup list. Migrated: catch-clause exception binding (capturable by nested closures in the catch body), `super()`-inlined parent ctor params (capturable by closures inside the parent ctor body), `forEach` loop counter (spans cond/body/exit successor blocks), `Await` result slot (spans check/wait/settled/done/merge blocks; can be lowered inside a nested if-arm), `NewClass` `this_slot` (pushed on `this_stack` for the entire inlined ctor body with nested closures capturing `this`), and the inlined-ctor param slots in two places. Left alone with comment: `js_array_splice out_slot` (single-block scratch, dominance-safe by construction). Mango compiles + links cleanly. Original sweep done as a worktree-isolated subagent task because main was being concurrently edited; cherry-picked back here. - -### v0.5.4 (llvm-backend) — `Expr::ExternFuncRef`-as-value via static `ClosureHeader` thunks -- **fix**: imported functions can now be passed as callbacks, stored in variables, and called indirectly. Previously `Expr::ExternFuncRef` lowered as a value returned a `TAG_TRUE` sentinel that worked for `if (importedFn)` truthiness checks but crashed at runtime the moment anything tried to dispatch through `js_closure_callN`. The fix mirrors the existing `__perry_wrap_` machinery for local funcs (`crates/perry-codegen/src/codegen.rs:870-904`): for every entry in `opts.import_function_prefixes`, `compile_module` now emits a thin `__perry_wrap_extern___` wrapper (`internal` linkage so per-module copies don't collide at link time) plus a static `ClosureHeader` constant `__perry_extern_closure___` whose `func_ptr` points at the wrapper and `type_tag = CLOSURE_MAGIC`. The expr.rs lowering returns `ptrtoint @ to i64` NaN-boxed as POINTER. New `LlModule.add_internal_constant()` helper. Verified end-to-end with a TS test that uses `arr.map(double)`, `if (double)`, `f === g`, and `fn(3, 4)` indirect call — all four cases produce correct output (was `[undef, undef, ...]` and `undefined` before). Mango unaffected (entry path uses truthiness only). - -### v0.5.3 (llvm-backend) — driver hard-fails on entry-module codegen errors -- **fix**: `crates/perry/src/commands/compile.rs` now refuses to link when the entry module is in `failed_modules`. The original 0.5.0 mango bug was a misdiagnosis chain: 13 modules (including `mango/src/app.ts`) failed codegen, the driver silently replaced each with an empty `_perry_init_*` stub, and the link step exploded with `Undefined symbols for architecture arm64: "_main"` — a downstream symptom that took manual digging to trace back to the real codegen errors hidden in cargo build noise. The driver now (a) prints a loud box-drawn failure summary right after the parallel compile loop, *before* `build_optimized_libs` floods stdout, (b) marks the entry module with `(entry)` in the failure list, and (c) returns `Err` immediately if the entry module is in the list, with a message explaining why. Non-entry failures keep the previous "stub the init, continue linking" behavior but get the same loud summary so the codegen errors aren't drowned in the cargo noise. `use_color` (was `_use_color`) is now wired through to ANSI red on the headers. - -### v0.5.2 (llvm-backend) — crushing the numeric benchmarks -- **perf**: `fadd/fsub/fmul/fdiv/frem/fneg` IR builder now emits `reassoc contract` fast-math flags. Clang's `-ffast-math` does NOT retroactively apply to ops in a `.ll` input — the FMFs must be on each instruction. Adding `reassoc contract` lets LLVM break serial accumulator chains into parallel accumulators + 8x-unroll + NEON 2-wide vectorize. **`loop_overhead` 99ms → 13ms (4.1x faster than Node 54ms); `math_intensive` 50ms → 14ms (3.3x faster than Node)**. -- **perf**: Integer-modulo fast path in `BinaryOp::Mod` when both operands are provably integer-valued. New `crate::collectors::collect_integer_locals` walker tracks locals that start from an `Integer` literal and are only ever mutated via `Update` (++/--, no `LocalSet`). Mod-by-integer on such values emits `fptosi → srem → sitofp` instead of `frem double`, which lowers to a libm `fmod()` call on ARM (no hardware instruction). LLVM's SCEV then replaces the div with a reciprocal-multiplication `msub` and hoists the conversions. **`factorial` (sum += i % 1000) 1553ms → 24ms — 64x faster, 25x faster than Node 603ms**. -- Perry now beats Node on 8/11 numeric benchmarks (loop_overhead, math_intensive, factorial, closure, mandelbrot, matrix_multiply, array_read, nested_loops); ties on 2; loses on object_create/binary_trees only (blocked on inline bump-allocator, a pending refactor). - -### v0.5.1 (llvm-backend) — mango compile sweep -- feat: 13 LLVM-backend gap fixes that let `mango` compile end-to-end with 0.5.0 (was hitting 13 module-level codegen errors that the driver silently turned into empty `_perry_init_*` stubs, leaving the link with no `_main`). Fixed: `Array.slice()` 0-arg, variadic `arr.push(a,b,c,…)`, `Expr::ArraySome`/`ArrayEvery`/`NewDynamic`/`FetchWithOptions`/`I18nString`/`ExternFuncRef`-as-value, `js_closure_call6..16` (was capped at 5). Killed the buggy cross-module pre-walker (`collect_extern_func_refs_in_*`) and replaced it with **lazy declares** via `FnCtx.pending_declares`, drained after each compile pass — fixes `use of undefined value @perry_fn_*` from cross-module calls inside closures, try/switch, and array callbacks. Closure pre-walker now also walks getters/setters/static_methods (was only methods+ctor) and recurses through ArraySome/Every/NewDynamic/FetchWithOptions/I18nString/Yield. New `LlFunction.alloca_entry()` hoists `Stmt::Let` slots to the entry block — fixes pre-existing SSA dominance verifier failure when a `let` declared inside an `if` arm is captured by a closure in a sibling branch. Mango binary: 4.9MB, links clean. - -### v0.5.0 — Phase K hard cutover (LLVM-only) -- **Cranelift backend deleted.** `crates/perry-codegen-llvm/` renamed to `crates/perry-codegen/` as the only codegen path. `--backend` CLI flag removed; all `cranelift*` workspace deps dropped. Parity sweep identical pre/post: **102 MATCH / 9 DIFF / 0 CRASH / 91.8%**. Remaining DIFFs are 8 nondeterministic (timing/RNG/UUID) + async-generator baseline + long-tail features (lookbehind regex, UTF-8/UTF-16 length gap, lone surrogates). - -### v0.4.146-followup-2 (llvm-backend) -- feat: `test_gap_array_methods` DIFF (3) → **MATCH**. Four coordinated fixes: 16-pass microtask drain in `main()` so top-level `.then(cb)` fires; `is_promise_expr` recognizes async-FuncRef calls via new `local_async_funcs` HashSet; nested `async function*` declarations hoist to top-level so generator transform sees them; `scan_expr_for_max_local`/`_max_func` in `perry-transform/generator.rs` now walk all array fast-path variants (ArrayMap/Filter/etc.) to prevent LocalId/FuncId collisions. - -### v0.4.146-followup (llvm-backend) -- feat: **`Object.groupBy`**, **`Array.fromAsync`**, optional-chain array fast path (`obj?.map(...)` folds through array dispatch), `typeof Object.` → `"function"` constant fold. `test_gap_array_methods` DIFF (7) → DIFF (3). - -### v0.4.148 (llvm-backend) -- feat: `test_gap_node_crypto_buffer` DIFF (54) → **MATCH**. Full Node-style Buffer/crypto surface: new `dispatch_buffer_method` in `object.rs` routes `js_native_call_method` for any registered buffer (read/write numeric family, `swap*`, `indexOf`/`includes`, `slice`/`fill`/`compare`/`toString(enc)`); `crypto.getRandomValues`, `Buffer.compare/from/alloc/concat` wired; `Buffer.from([arr])` path decodes via `js_buffer_from_value`; type inference refines `Buffer.from`/`crypto.randomBytes` to `Named("Uint8Array")`; crypto `createHash(...).update(...).digest(enc)` chain detected as string; `bigint_value_to_i64` accepts POINTER_TAG-boxed BigInt pointers. - -### v0.4.147 (llvm-backend) -- feat: `test_gap_symbols` DIFF (4) → **MATCH**. `Symbol.hasInstance` and `Symbol.toStringTag` via HIR class lowering of well-known keys (lifts to `__perry_wk_hasinstance_*`/`__perry_wk_tostringtag_*`), new `CLASS_HAS_INSTANCE_REGISTRY`/`CLASS_TO_STRING_TAG_REGISTRY` in runtime, and `Object.prototype.toString.call(x)` → `js_object_to_string` dispatch in HIR. - -### v0.4.146 (llvm-backend) -- feat: `Symbol.toPrimitive` semantic support — `+currency` / `` `${currency}` `` / `currency + 0` all consult `obj[Symbol.toPrimitive]` via new `js_to_primitive(v, hint)` hook threaded through `js_number_coerce` and `js_jsvalue_to_string`. Well-known symbol cache in `symbol.rs`; computed-key method lowering via new `PostInit::SetMethodWithThis` variant. `test_gap_symbols` DIFF (10) → DIFF (4). - -### v0.4.145 (llvm-backend) -- feat: real **TypedArray** support (Int8/Int16/Int32, Uint16/Uint32, Float32/Float64). New `typedarray.rs` with `TYPED_ARRAY_REGISTRY`; generic array helpers (`js_array_at`, `js_array_to_sorted`, `js_array_with`, `js_array_find_last`, etc.) detect typed-array pointers and dispatch per-kind, preserving `Int32Array(N) [ ... ]` Node format on round-trip. Reserved class IDs `0xFFFF0030..0037` for `instanceof`. `test_gap_array_methods` DIFF (35) → DIFF (7). - +Keep entries to 1-2 lines max. Full details in CHANGELOG.md. + +- **v0.5.58** — `Math.imul` i32 native path + `returns_integer` function detection. `MathImul(a,b)` in `can_lower_expr_as_i32`/`lower_expr_as_i32` emits single `mul i32` — no fptosi/sitofp. `returns_integer(f)` detects functions where ALL return paths end with `|0`/`>>>0`/bitwise (e.g. user-defined `imul32` polyfills) and includes them in the integer-candidate seeding. image_conv with Math.imul: **blur 287ms (1.17× Zig), total 467ms (1.9× Zig)**. +- **v0.5.57** — Fix dylib GC root segfault (closes #54). Dylib entry module now emits `perry_module_init()` instead of `main()` — initializes GC, string pools, module globals (GC root registration), and top-level statements. Host calls this once after dlopen; event loop is omitted (host manages its own). +- **v0.5.56** — i32-native bitwise ops in `lower_expr_as_i32` + i32 index/value in Uint8ArrayGet/Set. `can_lower_expr_as_i32` and `lower_expr_as_i32` now handle `BitAnd/BitOr/BitXor/Shl/Shr/UShr` — entire xorshift/FNV chains stay in i32. Uint8ArrayGet/Set use `lower_expr_as_i32` for index (and value for Set) when possible, skipping double round-trips. image_conv total: **456ms** (was 483ms). Blur: 280ms (1.14× Zig). Gap: **1.85× Zig** (was 1.97×). +- **v0.5.55** — Eliminate TLS overhead from transition cache + descriptor check (#60 follow-up). `TRANSITION_CACHE_GLOBAL` is now a plain `static mut` (user code is single-threaded), `ANY_DESCRIPTORS_IN_USE` → `static AtomicBool` with `Relaxed` load. 10k×20 benchmark: **142ms→77ms (1.8× faster)**, gap vs Node down to **4.5×** (was 84× before v0.5.51). +- **v0.5.54** — String split/indexOf perf: arena-allocated split parts (closes #61). `utf16_offset_to_byte_offset` / `byte_offset_to_utf16_index` zero-offset fast returns. indexOf/lastIndexOf ASCII path uses Rust Two-Way `str::find`/`rfind` instead of O(n×m) byte scan. Split uses `arena_alloc_gc` bump allocator + `gc_malloc_batch` helper. **split: 145ms→24ms (6× faster, beats Node 27ms), indexOf: 145ms→35ms (4× faster, ~Node 30ms)**. +- **v0.5.53** — `x | 0` / `x >>> 0` noop for known-finite operands + branchless Uint8ArraySet via `@llvm.assume`. When left operand is known-finite and right is `Integer(0)`, skip toint32 entirely (just fptosi+sitofp identity, no NaN/Inf guard). Uint8ArraySet now uses `@llvm.assume(in_bounds)` like Get, eliminating the branch diamond in input-gen and encoder loops. Blur kernel: **0 `bl` instructions** (fully inlined, zero function calls). +- **v0.5.52** — Targeted clamp-function i32 inlining: `is_int32_producing_expr`, `collect_integer_let_ids`, and `can_lower_expr_as_i32` now recognize calls to detected clamp functions (3-param clamp + clampU8) as int-producing. `lower_expr_as_i32` emits `@llvm.smax.i32` + `@llvm.smin.i32` directly — zero double conversions. **Blur kernel alone: 284ms vs Zig 246ms (1.15×)**. Full image_conv 0.76s includes input-gen overhead. +- **v0.5.51** — Content-hash shape-transition cache for dynamic property writes (closes #60). Transition cache keyed on FNV-1a content hash instead of string pointer identity — freshly concatenated keys (`"field_"+j`) now hit the cache across objects. Cache size 4096→16384. 10k×20 benchmark: **1300ms→136ms (9.6× faster)**, gap vs Node 84×→8.5×. +- **v0.5.50** — `toint32_fast` for known-finite bitwise operands + `alwaysinline` on small functions. `is_known_finite` analysis skips the 5-insn NaN/Inf guard from v0.5.49 when operands are provably finite (integer_locals, literals, byte loads, bitwise results). `force_inline` attribute on functions ≤8 stmts + i64-specialized wrappers. Clamp pattern detection (smin/smax in `lower_expr_as_i32`). +- **v0.5.49** — Bitwise ops with NaN/Infinity produce 0 per ECMAScript ToInt32 spec (closes #57). `LlBlock::toint32` emits inline NaN/Inf guard (`fcmp uno` + `fabs` + `fcmp oeq ±inf` → `select 0.0`) before `fptosi`, fixing UB for all bitwise ops (`|`, `&`, `^`, `<<`, `>>`, `>>>`). +- **v0.5.48** — `sdiv` for `(int / const) | 0` + `@llvm.assume` bounds in Uint8ArrayGet. image_conv: 0.69s → 0.61s. +- **v0.5.47** — `Buffer.indexOf(byte)` / `Buffer.includes(byte)` with numeric argument (closes #56). +- **v0.5.46** — PIC miss handler fix for >8-field objects (closes #55). Zero-copy JSON string parsing + incremental object build. JSON pipeline: Perry 180ms vs Node 140ms (was 547×). +- **v0.5.45** — JSON.parse key interning + transition-cache shape sharing. 20-record pipeline: Perry 12ms vs Node 4ms. +- **v0.5.44** — Monomorphic inline cache for PropertyGet (closes #51). Per-site `[2 x i64]` globals. +- **v0.5.43** — Wire int-analysis ↔ flat-const bridge. image_conv: 1.95s → 0.66s (-66%). +- **v0.5.42** — `!invariant.load` metadata on Array/Buffer length loads (closes #52). +- **v0.5.41** — Flat `[N x i32]` constants for module-level `const` 2D int arrays (closes #50). +- **v0.5.40** — Accumulator-pattern int-arithmetic fast path (closes #49). sum-of-bytes: 272ms → 63ms. +- **v0.5.39** — Int32-stable local specialization (closes #48). Fixed boxed_vars bug for non-closure loop counters. +- **v0.5.38** — Inline Buffer/Uint8Array bracket-access (closes #47). image_conv: 2.19s → 1.98s. +- **v0.5.37** — `JSON.parse` GC-root stack for in-progress parse frames (closes #46). diff --git a/Cargo.lock b/Cargo.lock index b14c402a6..eb3f1849a 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -348,28 +348,6 @@ dependencies = [ "arrayvec", ] -[[package]] -name = "aws-lc-rs" -version = "1.16.2" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "a054912289d18629dc78375ba2c3726a3afe3ff71b4edba9dedfca0e3446d1fc" -dependencies = [ - "aws-lc-sys", - "zeroize", -] - -[[package]] -name = "aws-lc-sys" -version = "0.39.1" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "83a25cf98105baa966497416dbd42565ce3a8cf8dbfd59803ec9ad46f3126399" -dependencies = [ - "cc", - "cmake", - "dunce", - "fs_extra", -] - [[package]] name = "base64" version = "0.22.1" @@ -853,15 +831,6 @@ version = "1.1.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "c8d4a3bb8b1e0c1050499d1815f5ab16d04f0959b233085fb31653fbfc9d98f9" -[[package]] -name = "cmake" -version = "0.1.58" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "c0f78a02292a74a88ac736019ab962ece0bc380e3f977bf72e376c5d78ff0678" -dependencies = [ - "cc", -] - [[package]] name = "color_quant" version = "1.1.0" @@ -1528,12 +1497,6 @@ dependencies = [ "dtoa", ] -[[package]] -name = "dunce" -version = "1.0.5" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "92773504d58c093f6de2459af4af33faa518c13451eb8f2b5698ed3d36e7c813" - [[package]] name = "ego-tree" version = "0.6.3" @@ -1827,12 +1790,6 @@ dependencies = [ "syn 2.0.117", ] -[[package]] -name = "fs_extra" -version = "1.3.0" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "42703706b716c37f96a77aea830392ad231f44c9e9a67872fa5548707e11b11c" - [[package]] name = "fslock" version = "0.2.1" @@ -2854,7 +2811,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "525e9ff3e1a4be2fbea1fdf0e98686a6d98b4d8f937e1bf7402245af1909e8c3" dependencies = [ "byteorder-lite", - "quick-error", + "quick-error 2.0.1", ] [[package]] @@ -3327,6 +3284,15 @@ dependencies = [ "tendril", ] +[[package]] +name = "matchers" +version = "0.2.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "d1525a2a28c7f4fa0fc98bb91ae755d1e2d1505079e05539e35bc876b5d65ae9" +dependencies = [ + "regex-automata", +] + [[package]] name = "maybe-rayon" version = "0.1.1" @@ -3586,6 +3552,15 @@ version = "0.3.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "0676bb32a98c1a483ce53e500a81ad9c3d5b3f7c920c28c24e9cb0980d0b5bc8" +[[package]] +name = "nu-ansi-term" +version = "0.50.3" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "7957b9740744892f114936ab4a57b3f487491bbeafaf8083688b16841a4240e5" +dependencies = [ + "windows-sys 0.61.2", +] + [[package]] name = "num-bigint" version = "0.4.6" @@ -4087,7 +4062,7 @@ checksum = "9b4f627cb1b25917193a259e49bdad08f671f8d9708acfd5fe0a8c1455d87220" [[package]] name = "perry" -version = "0.5.28" +version = "0.5.58" dependencies = [ "anyhow", "atty", @@ -4131,7 +4106,7 @@ dependencies = [ [[package]] name = "perry-codegen" -version = "0.5.28" +version = "0.5.58" dependencies = [ "anyhow", "log", @@ -4142,7 +4117,7 @@ dependencies = [ [[package]] name = "perry-codegen-glance" -version = "0.5.28" +version = "0.5.58" dependencies = [ "anyhow", "perry-hir", @@ -4150,7 +4125,7 @@ dependencies = [ [[package]] name = "perry-codegen-js" -version = "0.5.28" +version = "0.5.58" dependencies = [ "anyhow", "perry-hir", @@ -4159,7 +4134,7 @@ dependencies = [ [[package]] name = "perry-codegen-swiftui" -version = "0.5.28" +version = "0.5.58" dependencies = [ "anyhow", "perry-hir", @@ -4168,7 +4143,7 @@ dependencies = [ [[package]] name = "perry-codegen-wasm" -version = "0.5.28" +version = "0.5.58" dependencies = [ "anyhow", "base64", @@ -4180,15 +4155,42 @@ dependencies = [ [[package]] name = "perry-codegen-wear-tiles" -version = "0.5.28" +version = "0.5.58" dependencies = [ "anyhow", "perry-hir", ] +[[package]] +name = "perry-container-compose" +version = "0.5.58" +dependencies = [ + "anyhow", + "async-trait", + "clap", + "dotenvy", + "hex", + "home", + "indexmap", + "md-5", + "once_cell", + "proptest", + "rand 0.8.5", + "regex", + "serde", + "serde_json", + "serde_yaml", + "shellexpand", + "thiserror 1.0.69", + "tokio", + "tracing", + "tracing-subscriber", + "which 6.0.3", +] + [[package]] name = "perry-diagnostics" -version = "0.5.28" +version = "0.5.58" dependencies = [ "serde", "serde_json", @@ -4196,7 +4198,7 @@ dependencies = [ [[package]] name = "perry-hir" -version = "0.5.28" +version = "0.5.58" dependencies = [ "anyhow", "perry-diagnostics", @@ -4208,7 +4210,7 @@ dependencies = [ [[package]] name = "perry-jsruntime" -version = "0.5.28" +version = "0.5.58" dependencies = [ "anyhow", "deno_core", @@ -4227,7 +4229,7 @@ dependencies = [ [[package]] name = "perry-parser" -version = "0.5.28" +version = "0.5.58" dependencies = [ "anyhow", "perry-diagnostics", @@ -4239,7 +4241,7 @@ dependencies = [ [[package]] name = "perry-runtime" -version = "0.5.28" +version = "0.5.58" dependencies = [ "anyhow", "base64", @@ -4259,12 +4261,13 @@ dependencies = [ [[package]] name = "perry-stdlib" -version = "0.5.28" +version = "0.5.58" dependencies = [ "aes", "aes-gcm", "anyhow", "argon2", + "async-trait", "base64", "bcrypt", "bson", @@ -4284,6 +4287,7 @@ dependencies = [ "hyper", "hyper-util", "image", + "indexmap", "itoa", "jsonwebtoken", "lazy_static", @@ -4294,6 +4298,7 @@ dependencies = [ "nanoid", "once_cell", "pbkdf2", + "perry-container-compose", "perry-runtime", "rand 0.8.5", "redis", @@ -4301,19 +4306,17 @@ dependencies = [ "reqwest", "rusqlite", "rust_decimal", - "rustls", - "rustls-native-certs", "ryu", "scraper", "scrypt", "serde", "serde_json", + "serde_yaml", "sha2", "sqlx", "thiserror 1.0.69", "tokio", "tokio-cron-scheduler", - "tokio-rustls", "tokio-tungstenite 0.24.0", "uuid", "validator", @@ -4322,7 +4325,7 @@ dependencies = [ [[package]] name = "perry-transform" -version = "0.5.28" +version = "0.5.58" dependencies = [ "anyhow", "perry-hir", @@ -4332,7 +4335,7 @@ dependencies = [ [[package]] name = "perry-types" -version = "0.5.28" +version = "0.5.58" dependencies = [ "anyhow", "thiserror 1.0.69", @@ -4340,11 +4343,11 @@ dependencies = [ [[package]] name = "perry-ui" -version = "0.5.28" +version = "0.5.58" [[package]] name = "perry-ui-android" -version = "0.5.28" +version = "0.5.58" dependencies = [ "itoa", "jni", @@ -4358,7 +4361,7 @@ dependencies = [ [[package]] name = "perry-ui-geisterhand" -version = "0.5.28" +version = "0.5.58" dependencies = [ "rand 0.8.5", "serde", @@ -4368,7 +4371,7 @@ dependencies = [ [[package]] name = "perry-ui-gtk4" -version = "0.5.28" +version = "0.5.58" dependencies = [ "cairo-rs", "gtk4", @@ -4379,7 +4382,7 @@ dependencies = [ [[package]] name = "perry-ui-ios" -version = "0.5.28" +version = "0.5.58" dependencies = [ "block2", "libc", @@ -4393,7 +4396,7 @@ dependencies = [ [[package]] name = "perry-ui-macos" -version = "0.5.28" +version = "0.5.58" dependencies = [ "block2", "libc", @@ -4410,7 +4413,7 @@ version = "0.1.0" [[package]] name = "perry-ui-tvos" -version = "0.5.28" +version = "0.5.58" dependencies = [ "block2", "libc", @@ -4424,7 +4427,7 @@ dependencies = [ [[package]] name = "perry-ui-watchos" -version = "0.5.28" +version = "0.5.58" dependencies = [ "block2", "libc", @@ -4436,7 +4439,7 @@ dependencies = [ [[package]] name = "perry-ui-windows" -version = "0.5.28" +version = "0.5.58" dependencies = [ "libc", "perry-runtime", @@ -4748,6 +4751,25 @@ dependencies = [ "syn 2.0.117", ] +[[package]] +name = "proptest" +version = "1.11.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "4b45fcc2344c680f5025fe57779faef368840d0bd1f42f216291f0dc4ace4744" +dependencies = [ + "bit-set 0.8.0", + "bit-vec 0.8.0", + "bitflags", + "num-traits", + "rand 0.9.2", + "rand_chacha 0.9.0", + "rand_xorshift", + "regex-syntax", + "rusty-fork", + "tempfile", + "unarray", +] + [[package]] name = "psm" version = "0.1.30" @@ -4808,6 +4830,12 @@ dependencies = [ "winapi", ] +[[package]] +name = "quick-error" +version = "1.2.3" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "a1d01941d82fa2ab50be1e79e6714289dd7cde78eba4c074bc5a4374f650dfe0" + [[package]] name = "quick-error" version = "2.0.1" @@ -4961,6 +4989,15 @@ dependencies = [ "getrandom 0.3.4", ] +[[package]] +name = "rand_xorshift" +version = "0.4.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "513962919efc330f829edb2535844d1b912b0fbe2ca165d613e4e8788bb05a5a" +dependencies = [ + "rand_core 0.9.5", +] + [[package]] name = "rav1e" version = "0.8.1" @@ -5005,7 +5042,7 @@ dependencies = [ "avif-serialize", "imgref", "loop9", - "quick-error", + "quick-error 2.0.1", "rav1e", "rayon", "rgb", @@ -5362,7 +5399,6 @@ version = "0.23.37" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "758025cb5fccfd3bc2fd74708fd4682be41d99e5dff73c377c0646c6012c73a4" dependencies = [ - "aws-lc-rs", "log", "once_cell", "ring", @@ -5372,18 +5408,6 @@ dependencies = [ "zeroize", ] -[[package]] -name = "rustls-native-certs" -version = "0.8.3" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "612460d5f7bea540c490b2b6395d8e34a953e52b491accd6c86c8164c5932a63" -dependencies = [ - "openssl-probe", - "rustls-pki-types", - "schannel", - "security-framework", -] - [[package]] name = "rustls-pki-types" version = "1.14.0" @@ -5400,7 +5424,6 @@ version = "0.103.10" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "df33b2b81ac578cabaf06b89b0631153a3f416b0a886e8a7a1707fb51abbd1ef" dependencies = [ - "aws-lc-rs", "ring", "rustls-pki-types", "untrusted", @@ -5412,6 +5435,18 @@ version = "1.0.22" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "b39cdef0fa800fc44525c84ccb54a029961a8215f9619753635a9c0d2538d46d" +[[package]] +name = "rusty-fork" +version = "0.3.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "cc6bf79ff24e648f6da1f8d1f011e9cac26491b619e6b9280f2b47f1774e6ee2" +dependencies = [ + "fnv", + "quick-error 1.2.3", + "tempfile", + "wait-timeout", +] + [[package]] name = "ryu" version = "1.0.23" @@ -5679,6 +5714,19 @@ dependencies = [ "syn 2.0.117", ] +[[package]] +name = "serde_yaml" +version = "0.9.34+deprecated" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "6a8b1a1a2ebf674015cc02edccce75287f1a0130d394307b36743c2f5d504b47" +dependencies = [ + "indexmap", + "itoa", + "ryu", + "serde", + "unsafe-libyaml", +] + [[package]] name = "servo_arc" version = "0.3.0" @@ -5716,12 +5764,30 @@ dependencies = [ "digest", ] +[[package]] +name = "sharded-slab" +version = "0.1.7" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f40ca3c46823713e0d4209592e8d6e826aa57e928f09752619fc696c499637f6" +dependencies = [ + "lazy_static", +] + [[package]] name = "shell-words" version = "1.1.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "dc6fe69c597f9c37bfeeeeeb33da3530379845f10be461a66d16d03eca2ded77" +[[package]] +name = "shellexpand" +version = "3.1.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "32824fab5e16e6c4d86dc1ba84489390419a39f97699852b66480bb87d297ed8" +dependencies = [ + "dirs 6.0.0", +] + [[package]] name = "shlex" version = "1.3.0" @@ -6480,6 +6546,15 @@ dependencies = [ "syn 2.0.117", ] +[[package]] +name = "thread_local" +version = "1.1.9" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f60246a4944f24f6e018aa17cdeffb7818b76356965d03b07d6a9886e8962185" +dependencies = [ + "cfg-if", +] + [[package]] name = "tiff" version = "0.11.3" @@ -6489,7 +6564,7 @@ dependencies = [ "fax", "flate2", "half", - "quick-error", + "quick-error 2.0.1", "weezl", "zune-jpeg", ] @@ -6869,6 +6944,36 @@ source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "db97caf9d906fbde555dd62fa95ddba9eecfd14cb388e4f491a66d74cd5fb79a" dependencies = [ "once_cell", + "valuable", +] + +[[package]] +name = "tracing-log" +version = "0.2.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "ee855f1f400bd0e5c02d150ae5de3840039a3f54b025156404e34c23c03f47c3" +dependencies = [ + "log", + "once_cell", + "tracing-core", +] + +[[package]] +name = "tracing-subscriber" +version = "0.3.23" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "cb7f578e5945fb242538965c2d0b04418d38ec25c79d160cd279bf0731c8d319" +dependencies = [ + "matchers", + "nu-ansi-term", + "once_cell", + "regex-automata", + "sharded-slab", + "smallvec", + "thread_local", + "tracing", + "tracing-core", + "tracing-log", ] [[package]] @@ -6953,6 +7058,12 @@ version = "1.19.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "562d481066bde0658276a35467c4af00bdc6ee726305698a55b86e61d7ad82bb" +[[package]] +name = "unarray" +version = "0.1.4" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "eaea85b334db583fe3274d12b4cd1880032beab409c0d774be044d4480ab9a94" + [[package]] name = "unicase" version = "2.9.0" @@ -7026,6 +7137,12 @@ dependencies = [ "subtle", ] +[[package]] +name = "unsafe-libyaml" +version = "0.2.11" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "673aac59facbab8a9007c7f6108d11f63b603f7cabff99fabf650fea5c32b861" + [[package]] name = "untrusted" version = "0.9.0" @@ -7150,6 +7267,12 @@ dependencies = [ "syn 2.0.117", ] +[[package]] +name = "valuable" +version = "0.1.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "ba73ea9cf16a25df0c8caa16c51acb937d5712a8429db78a3ee29d5dcacd3a65" + [[package]] name = "vcpkg" version = "0.2.15" @@ -7168,6 +7291,15 @@ version = "0.9.5" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "0b928f33d975fc6ad9f86c8f283853ad26bdd5b10b7f1542aa2fa15e2289105a" +[[package]] +name = "wait-timeout" +version = "0.2.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "09ac3b126d3914f9849036f826e054cbabdc8519970b8998ddaf3b5bd3c65f11" +dependencies = [ + "libc", +] + [[package]] name = "walkdir" version = "2.5.0" diff --git a/Cargo.toml b/Cargo.toml index 34d9be1f1..c5f2a869e 100644 --- a/Cargo.toml +++ b/Cargo.toml @@ -26,6 +26,7 @@ members = [ "crates/perry-codegen-wear-tiles", "crates/perry-codegen-wasm", "crates/perry-ui-test", + "crates/perry-container-compose", ] # Only build platform-independent crates by default. # Platform-specific UI crates (perry-ui-macos, perry-ui-ios, etc.) must be built @@ -102,7 +103,7 @@ opt-level = "s" # Optimize for size in stdlib opt-level = 3 [workspace.package] -version = "0.5.28" +version = "0.5.58" edition = "2021" license = "MIT" repository = "https://github.com/PerryTS/perry" diff --git a/PERF_ROADMAP.md b/PERF_ROADMAP.md new file mode 100644 index 000000000..7f2f1dca2 --- /dev/null +++ b/PERF_ROADMAP.md @@ -0,0 +1,136 @@ +# Performance Roadmap: Closing the Zig Gap + +## Current State (v0.5.58) + +Perry compiles TypeScript to native via LLVM. On the image_conv benchmark (5×5 Gaussian blur, 3840×2160 RGB): + +| Component | Perry | Zig | Gap | Root Cause | +|---|---|---|---|---| +| Blur kernel | 280ms | ~200ms | 1.4× | No NEON autovectorization | +| Input gen (xorshift + gradient) | 120ms | ~30ms | 4× | NaN-box unbox per buffer access | +| FNV-1a hash | 57ms | ~16ms | 3.5× | Double-ABI function call overhead | +| **Total** | **457ms** | **246ms** | **1.86×** | | + +Starting point was 1,980ms (8× Zig). We've done 4.3× improvement. The remaining 1.86× is architectural. + +## What's Already Optimized + +These are DONE and working — don't re-implement: + +1. **i32 accumulator fast path** (`lower_expr_as_i32` in expr.rs): `rAcc += src[idx] * k` stays in i32. Handles Add/Sub/Mul/BitAnd/BitOr/BitXor/Shl/Shr/UShr, LocalGet (i32 slot or integer_locals), Integer, Uint8ArrayGet, flat-const IndexGet, clamp calls (smin/smax), Math.imul. The `can_lower_expr_as_i32` gate checks all leaves; `lower_expr_as_i32` emits native i32 ops. + +2. **Flat `[N x i32]` const tables** (flat_const_arrays): `const KERNEL: number[][] = [[1,4,7,4,1],...]` → private unnamed_addr `[25 x i32]` in .rodata. `IndexGet(IndexGet(X, i), j)` and aliased `krow[j]` (via `array_row_aliases`) emit `getelementptr + load i32`. + +3. **`@llvm.assume` bounds** in Uint8ArrayGet/Set: eliminates the branch+phi diamond for bounds checks. Single basic block per access. + +4. **`sdiv` for `(int / const) | 0`**: Pattern-matched before generic BitOr lowering. LLVM converts to `smulh + asr`. + +5. **`toint32_fast`**: Skips the 5-instruction NaN/Inf guard from v0.5.49 when `is_known_finite(ctx, expr)` returns true. + +6. **`x | 0` / `x >>> 0` noop**: When left operand is known-finite and right is Integer(0), emit just `fptosi + sitofp` (no toint32 guard, no or/lshr). + +7. **Clamp detection** (`detect_clamp3`, `detect_clamp_u8`): Threaded through `is_int32_producing_expr`, `collect_integer_let_ids`, `can_lower_expr_as_i32`. Call sites emit `@llvm.smax.i32 + @llvm.smin.i32`. + +8. **`alwaysinline`** on functions ≤8 stmts and i64-specialized wrappers (`force_inline` field on LlFunction). + +9. **`!invariant.load`** on buffer/array length loads. + +10. **`returns_integer` detection**: Functions whose ALL return paths end with `| 0` / `>>> 0` / bitwise → included in integer-candidate seeding. + +## The Three Optimizations That Would Close the Gap + +### 1. Typed Buffer Locals (eliminates NaN-box unbox — biggest single win) + +**Problem**: Every `src[idx]` and `dst[idx] = val` does: +```llvm +%handle_bits = bitcast double %buf_nanboxed to i64 +%handle = and i64 %handle_bits, 0x0000FFFFFFFFFFFF ; strip NaN-box tag +; ... then compute address from handle +``` +That's 2 extra instructions per access × 75 accesses per pixel × 8.3M pixels = 1.24 billion wasted instructions. + +**Fix**: When a `const buf = Buffer.alloc(N)` or function param is statically typed as `Buffer`/`Uint8Array`, store the raw `i64` pointer in an `i64` alloca instead of a NaN-boxed `double` alloca. `Uint8ArrayGet`/`Set` then skip the unbox — just `load i64` from the slot and use directly. + +**Implementation sketch**: +- In `stmt.rs` `Stmt::Let`, detect `init: BufferAlloc { .. }` or type `Named("Buffer")`/`Named("Uint8Array")`. Allocate an `I64` slot instead of `DOUBLE`. Store the raw pointer (from `js_buffer_alloc` which returns `I64`) directly. +- In `Uint8ArrayGet`/`Uint8ArraySet`, check if the `array` expr is `LocalGet(id)` where `id` has a typed-buffer slot. If so, `load i64` from the slot directly — no `bitcast + and POINTER_MASK`. +- Track typed-buffer locals in a `HashSet` on `FnCtx` (like `i32_counter_slots`). +- Module globals that are buffers: store as `I64` global instead of `DOUBLE`. + +**Estimated impact**: Eliminates ~2 instructions per buffer access. For input gen (48 accesses per 4-byte iteration × 6.2M iterations) + blur (75 accesses per pixel × 8.3M pixels), saves ~1.5B instructions → ~40-60ms. + +### 2. Interior/Border Loop Splitting (enables NEON autovectorization) + +**Problem**: The blur loop processes ALL pixels uniformly, including edge pixels that need clamping. The clamp logic (even as smin/smax) adds data-dependent index computation that prevents LLVM from vectorizing across pixels. + +**Fix**: Split the y-loop into three regions: +- Top border (y = 0..1): clamp needed +- Interior (y = 2..H-3): no clamp needed, all indices guaranteed in-bounds +- Bottom border (y = H-2..H-1): clamp needed + +Same for x. For the interior (99.9% of pixels), the kernel access pattern becomes: +``` +idx = ((y + ky) * W + (x + kx)) * 3 +``` +— pure arithmetic, no clamp, no smin/smax. LLVM can then vectorize the x-loop with NEON `ld3`/`st3` for stride-3 RGB deinterleaving. + +**Implementation**: This is a HIR-level transform, not codegen. Add a pass in `perry-transform` that detects the blur-like pattern: +``` +for (y) for (x) for (ky) for (kx) { + yy = clamp(y+ky, 0, H-1) + xx = clamp(x+kx, 0, W-1) +} +``` +And splits it into border + interior loops. The interior loop has the clamp calls replaced with direct arithmetic. + +**Estimated impact**: Interior loop becomes vectorizable → 4× throughput on the accumulation. Blur drops from 280ms to ~80-100ms. + +### 3. HIR-Level Function Inlining for Small Pure Functions + +**Problem**: `clampIdx` and `imul32` are compiled as separate LLVM functions with double-ABI wrappers. LLVM's `alwaysinline` inlines them, but the `sitofp/fptosi` conversion chain at the call boundary persists in the IR until instcombine runs — and instcombine doesn't always collapse `fptosi(sitofp(select(icmp(...))))`. + +**Fix**: Inline at the HIR level, BEFORE codegen. A pre-codegen pass in `perry-transform` that: +1. Identifies small (≤8 stmt) non-recursive pure functions. +2. At each call site, substitutes the function body with parameter renaming. +3. The result: the function body's `if/return` pattern is in the CALLER's HIR, and Perry's codegen sees it directly — no function call boundary, no double wrapper. + +For `clampIdx(v, lo, hi)`, the inlined HIR becomes: +``` +Let { yy_temp, init: v } +If { v < lo } → LocalSet(yy_temp, lo) +If { v > hi } → LocalSet(yy_temp, hi) +// yy_temp is now the clamped value +``` + +The codegen's `is_int32_producing_expr` already handles `If/Return` patterns and `LocalGet`/`LocalSet` — so the inlined body stays in the i32 path without any double conversion. + +**Implementation**: Add a new pass in `crates/perry-transform/src/inline.rs`. Walk all `Call(FuncRef(id), args)` in the HIR. If the callee has ≤8 stmts, is not recursive, and is not async/generator, replace the Call with an inlined copy of the body (fresh LocalIds via the module's id allocator, parameter locals initialized from the call args). + +**Estimated impact**: Eliminates the `sitofp/fptosi` chain at every clampIdx/clampU8/imul32 call site. For the blur (50 clamp calls per pixel × 8.3M pixels = 415M conversions eliminated) + FNV (24.8M imul32 call conversions eliminated). Combined: ~30-50ms saved. + +## Priority Order + +1. **Typed Buffer Locals** — broadest impact, affects ALL buffer-heavy code, relatively simple codegen change +2. **HIR-Level Inlining** — eliminates the double-ABI tax for ALL small functions, reusable across benchmarks +3. **Interior/Border Splitting** — most complex, biggest single-benchmark win, enables NEON + +With all three: projected **200-250ms** total, matching Zig. + +## Key Files + +- `crates/perry-codegen/src/expr.rs` — `lower_expr`, `lower_expr_as_i32`, `can_lower_expr_as_i32`, `Uint8ArrayGet`/`Set`, `FnCtx` struct +- `crates/perry-codegen/src/stmt.rs` — `Stmt::Let` lowering, i32 slot allocation +- `crates/perry-codegen/src/collectors.rs` — `collect_integer_locals`, `is_int32_producing_expr`, `collect_integer_let_ids`, clamp/returns-integer detectors +- `crates/perry-codegen/src/codegen.rs` — `compile_function`, `compile_module_entry`, `CrossModuleCtx`, i64-specialization +- `crates/perry-codegen/src/block.rs` — `LlBlock` instruction emission (`toint32`, `toint32_fast`, `load_invariant`, `sdiv`, etc.) +- `crates/perry-codegen/src/function.rs` — `LlFunction`, `force_inline`, `to_ir` +- `crates/perry-transform/src/` — HIR transform passes (new inline pass would go here) +- `crates/perry-hir/src/ir.rs` — HIR data structures (`Expr`, `Stmt`, `Function`) +- `crates/perry-runtime/src/value.rs` — NaN-boxing constants and value representation + +## Testing + +- `benchmarks/honest_bench/workloads/3_image_convolution/perry/image_conv.ts` — the blur benchmark +- Timed variant: use `Date.now()` around each phase (inputgen/blur/fnv) for per-component measurement +- `/tmp/run_gap_tests.sh` — gap test suite, verify no regressions +- Correctness: `checksum=2ba2e053` for the standard 3840×2160 image diff --git a/README.md b/README.md index 8d3db7501..5ad799444 100644 --- a/README.md +++ b/README.md @@ -497,6 +497,43 @@ These packages are natively implemented in Rust — no Node.js required: | **Database** | mysql2, pg, ioredis | | **Security** | bcrypt, argon2, jsonwebtoken | | **Utilities** | dotenv, uuid, nodemailer, zlib, node-cron | +| **Container** | perry/container (OCI container management) | + +--- + +## Container Module + +Perry includes a native container management module `perry/container` for creating, running, and managing OCI containers: + +```typescript +import { run, list, composeUp } from 'perry/container'; + +// Run a container +const container = await run({ + image: 'nginx:alpine', + name: 'my-nginx', + ports: ['8080:80'], +}); + +// List containers +const containers = await list(); +console.log(containers); + +// Multi-container orchestration +const compose = await composeUp({ + services: { + web: { image: 'nginx:alpine' }, + db: { image: 'postgres:15-alpine' }, + }, +}); +``` + +**Platform support:** +- macOS/iOS: Podman (apple/container support coming soon) +- Linux: Podman (native) +- Windows: Podman Desktop (experimental) + +See `example-code/container-demo/` for a complete example. --- diff --git a/benchmarks/polyglot/METHODOLOGY.md b/benchmarks/polyglot/METHODOLOGY.md new file mode 100644 index 000000000..51e65e9fa --- /dev/null +++ b/benchmarks/polyglot/METHODOLOGY.md @@ -0,0 +1,298 @@ +# Polyglot Benchmark Methodology + +Last updated: 2026-04-15 — Perry commit `e1cbd37`. + +This document describes how the polyglot benchmark suite is constructed and +run, what each benchmark measures, and why Perry's numbers differ from the +other languages. It is the companion to [`RESULTS.md`](./RESULTS.md). + +## What this suite is (and isn't) + +Eight compute-bound microbenchmarks, implemented identically in 10 runtimes. +Each benchmark runs for 0.1–15 seconds depending on the language. Best of 5 +runs per (benchmark, language) pair is reported. + +**This suite measures:** loop iteration throughput, arithmetic latency, +sequential array access, recursive call overhead, object allocation +patterns, and integer-modulo performance on f64-typed code. + +**This suite does not measure:** startup time, allocator throughput under +mixed workloads, GC pressure, I/O, async/await, JIT warmup behavior, memory +locality across realistic working sets, or anything a real application +spends most of its time on. Do not extrapolate these numbers to "language X +is N× faster than language Y on real workloads." They are a probe into +specific compiler choices, not a general benchmark. + +## Hardware + +Apple M1 Max (10 cores: 8P + 2E), 64 GB RAM, macOS 26.4. All benchmarks +run on performance cores via default scheduling — no explicit affinity +pinning, no `taskset`, no thermal throttle mitigation beyond best-of-N. + +## Compiler / runtime versions + +Captured at the time of the last results refresh. See `RESULTS.md` for the +date of the run being reported. + +| Runtime | Version | Invocation | +|---------------|----------------------------------------------|-----------------------------------| +| Perry | commit `e1cbd37` (v0.5.22, LLVM backend) | `perry compile file.ts -o bin` | +| Rust | rustc 1.92.0 (stable) | `rustc -O bench.rs` | +| C++ | Apple clang 21.0 (Xcode) | `g++ -O3 -std=c++17` | +| Go | go 1.21.3 | `go build` | +| Swift | Swift 6.3 | `swiftc -O` | +| Java | OpenJDK 21.0.7 | `javac` + `java` (JIT) | +| Node.js | v25.8.0 | `node --experimental-strip-types` | +| Bun | 1.3.5 | `bun run file.ts` | +| Static Hermes | `shermes` (LLVH 8.0.0svn) | `shermes -typed -O` AOT | +| Python | CPython 3.14.3 | `python3 bench.py` | + +**Flag discipline:** every compiled language uses the flag its documentation +suggests for "release mode" — nothing more. No `-ffast-math`, no `-Ounchecked`, +no `#[target_feature]`, no `-march=native`, no profile-guided optimization. +The point is to compare defaults. A "what-if" suite with aggressive flags is +the companion `RESULTS_OPT.md` (see phase 2). + +## Methodology + +### Measurement + +Each benchmark prints a single line of the form `name:elapsed_ms` using the +language's highest-resolution monotonic clock: + +| Language | Clock | +|----------|------------------------------------------| +| Perry | `Date.now()` (maps to `clock_gettime(MONOTONIC)`) | +| Rust | `std::time::Instant::now()` | +| C++ | `std::chrono::steady_clock::now()` | +| Go | `time.Now()` | +| Swift | `Date()` / `DispatchTime.now()` | +| Java | `System.nanoTime()` | +| Node/Bun/Hermes | `Date.now()` | +| Python | `time.perf_counter()` | + +All timings are integer milliseconds after truncation. Sub-millisecond +benchmarks (e.g. object_create on Rust/C++/Go/Swift, which is 0 ms after +dead-code elimination) are reported as `0` — this is a real result, not a +missing value. See the "where Perry loses" discussion in `RESULTS.md`. + +### Best-of-N + +The runner invokes each binary 5 times and reports the minimum. Best-of-N +tracks the compiler's asymptotic output rather than scheduler noise, +thermal throttling, or interference from other processes. The variance on +these benchmarks is small (<5% across runs on an idle system) — `best-of-5` +vs `best-of-10` produces the same numbers to the millisecond. + +### Warmup + +None. These are AOT-compiled (or, for Java and Node/Bun, contain enough +iterations that JIT compilation converges well before the hot loop finishes). +The one runtime where this matters is the JVM — Java's numbers include +~50ms of C2 tier-up for the first few iterations. That's visible on +`loop_overhead` (98ms vs Node 53ms) but washes out on longer benchmarks. + +### Iteration counts + +Chosen so that the slowest compiled language runs each benchmark in +0.5–1 second. Python is treated as out-of-scope for iteration-count tuning; +it runs the same loops and reports the time it takes, which is 100–1000× +everything else. + +| Benchmark | Iterations | Array size | Notes | +|----------------|-----------:|------------:|-----------------------------------| +| fibonacci | recursion | — | `fib(40)` — ~2 billion calls | +| loop_overhead | 100M | — | `sum += 1.0` | +| array_write | 10M | 10M | write `arr[i] = i` | +| array_read | 10M | 10M | sum array elements | +| math_intensive | 50M | — | `result += 1.0/i` | +| object_create | 1M | — | allocate `Point(x,y)`, sum fields | +| nested_loops | 3000×3000| 3000²| flat-array index sum | +| accumulate | 100M | — | `sum += i % 1000` on f64 | + +## How the runner works + +`run_all.sh` in this directory. Roughly: + +``` +1. Build Perry from source (`cargo build --release -p perry`) +2. For each .ts file in ../suite, compile via `perry compile` +3. Compile bench.{cpp,rs,swift,go,java,py,zig} with release flags +4. If Hermes is installed, strip TS types from each suite .ts file and AOT-compile +5. For each (benchmark, runtime), run 5 times, take the minimum +6. Print a markdown table +``` + +The Node/Bun/Hermes runs use the same `.ts` files as Perry (from +`../suite/`). Hermes requires pre-stripping TS types — handled by a +small `sed` script inside `run_all.sh`. + +Python is in-scope but not apples-to-apples with the compiled languages. +Its numbers are included in `RESULTS.md` as a floor, not a comparison +target. + +## What Perry does differently + +Three specific optimization choices account for every benchmark where Perry +beats all native compiled languages. These are the thesis of the companion +article and the reason this suite exists. + +### 1. Fast-math reassociation on f64 arithmetic + +`crates/perry-codegen/src/block.rs:132-165`. Perry emits +`fadd/fsub/fmul/fdiv/frem/fneg` with the `reassoc contract` LLVM fast-math +flags on every instruction. `reassoc` lets LLVM reorder +`(a + b) + c → a + (b + c)`, which is what the loop vectorizer needs to +break a serial accumulator chain into 4–8 parallel accumulators. `contract` +lets it fuse `x*y + z` into `fma`. + +Rust, C++, Go, and Swift all default to IEEE 754 strict. Under IEEE rules, +`(a + b) + c ≠ a + (b + c)` in general — because a single `inf` or `nan` in +the chain makes reordering observably change the result. The compiler +must preserve original associativity, so every `fadd` in +`for (...) sum += 1.0` has a 3-cycle latency dependency on the previous +`fadd`. That's why Rust/C++/Go/Swift cluster at ~95ms on `loop_overhead`: +they're hitting the `fadd` latency wall, all running the same IEEE-strict +serialized loop. + +Perry at 12ms means LLVM broke the chain, ran 4–8 parallel `fadd`s per +NEON FPU, and probably unrolled 8×. The same C++ with `-ffast-math` reaches +the same number — phase 2 of this investigation confirms that. Perry's +advantage here is **default flags**, not compiler capability. + +The full rationale is in `block.rs:101-131` — Perry deliberately does not +emit the full `fast` FMF bundle (which would include `nnan ninf nsz`) +because JavaScript programs can observe `NaN` and `-0.0` distinctions. +`reassoc contract` is the minimum set needed for the loop-vectorizer +unlock without breaking `Math.max(-0, 0)` semantics. + +### 2. Integer-modulo fast path + +`crates/perry-codegen/src/type_analysis.rs:488` (`is_integer_valued_expr`) +and `crates/perry-codegen/src/collectors.rs:1006` (`collect_integer_locals`). +The `BinaryOp::Mod` lowering in `expr.rs:823` checks whether both operands +are provably integer-valued. If so, it emits +`fptosi → srem → sitofp` instead of `frem double`. + +On ARM, `frem` lowers to a **libm function call** (`fmod`) — there is no +hardware remainder instruction for f64. That's ~30 ns per call, plus the +overhead of a real function call in a tight loop. `srem` is a single ARM +instruction at ~1–2 cycles. The ratio is why `accumulate` shows Perry at +25 ms vs every other language at ~96 ms — the gap is entirely `srem` vs +`fmod` dispatch cost. + +This is a **type-driven** optimization, not a language-capability +optimization. Every language in the suite would hit the same 25 ms if its +benchmark used `int64`/`i64`/`long` instead of `double`. The optimized +variants (phase 2, see `RESULTS_OPT.md`) confirm this. Perry's win on +`accumulate` is: it can infer, from the TS source code and the absence of +non-integer operations on the accumulator, that the `double` here is always +holding an integer value, and swap the lowering to use the integer +instruction set — while the human-written TS source still looks like +`sum += i % 1000`. + +### 3. i32 loop counter + bounds elimination + +`crates/perry-codegen/src/stmt.rs:651-782`. When Perry lowers a `for` loop +whose condition is `i < arr.length` and whose body indexes `arr[i]`: + +1. It allocates a parallel **i32 counter slot** alongside the f64 counter + (`i32_counter_slots`). +2. It caches `arr.length` once at loop entry (`cached_lengths`). +3. It records the `(counter, array)` pair as statically in-bounds + (`bounded_index_pairs`) — subsequent `arr[i]` reads skip the runtime + length load and bounds check entirely. + +The array-access codegen sites consult these maps and emit a raw +`getelementptr + load` when available. On `array_write` and `array_read`, +this produces code that LLVM can autovectorize into NEON 2-wide f64 SIMD, +matching `-O3 -ffast-math` C++ output. + +**Important**: this is *not* "Perry removes safety." It's static proof that +the bounds check is dead. The JS semantics are preserved: you can still +read past the end of an array, you still get `undefined`. The compiler has +just observed, for this specific `for` loop shape, that the index is bounded +by the length. Rust's iterator path (`.iter().sum()`) does the same analysis +at the IR level — and matches Perry to the millisecond on `array_read` +when used. Phase 2 confirms this. + +Go cannot express this in the standard toolchain; Go always bounds-checks +indexed array access, and the Go compiler's bounds-check elision is +conservative on patterns this simple. Go's `array_read` stays at ~10 ms +regardless of iteration form. + +## Where Perry loses — and why + +### `object_create` (Perry: ~2–8 ms, Rust/C++/Go/Swift: 0 ms) + +The 0 ms results from Rust/C++/Go/Swift are real. Those languages: +1. Stack-allocate the struct (or elide the allocation entirely). +2. Inline the constructor. +3. Observe the struct never escapes the loop. +4. Compute the sum in closed form at compile time. + +The entire loop body is dead code. The benchmark measures nothing. + +Perry cannot match this without abandoning its dynamic value model. +JavaScript objects are heap-allocated by spec (with limited escape +analysis available via the v0.5.17 scalar-replacement pass, which +currently kicks in only when the object is *only ever accessed* via +field get/set — any method call defeats it). This is an inherent +cost of compiling a dynamic language: the optimizer has less static +information to work with. + +This benchmark is included honestly — it's the shape of workload where +Perry's approach pays a real tax relative to ahead-of-time compiled +languages with static types. + +### `fibonacci` (Perry ties C++, beats Rust — but only because of type inference) + +Perry's fib is at ~309 ms, C++ 309 ms, Rust ~316 ms — Perry "beats" +Rust here. The honest framing: Perry's benchmark is written as +`fib(n: number)`, which Perry's type inference refines to `i64` because +the function only ever performs integer operations. The generated LLVM +IR uses `sub/add/icmp`. Rust's benchmark uses `f64` to match +TypeScript's `number` type — so Rust generates `fsub/fadd/fcmp`. + +Both compile through LLVM. Same optimizer, different input types. If +the Rust benchmark used `fn fib(n: i64) -> i64`, it would run at +~308 ms and the "Perry wins" framing disappears. The phase 2 +`bench_opt.rs` does exactly this. + +Java wins this benchmark (~279 ms). The JVM's C2 JIT inlines the +recursive call more aggressively than any of the AOT compilers here +manage to do at module scope. This is a JIT-vs-AOT story, not a +Perry story. + +## Changelog + +This methodology will drift as the Perry codegen changes. Key moments: + +- **2026-04-15 (v0.5.22 / e1cbd37):** Initial document. Bun and + Static Hermes added to the comparison. +- **v0.5.17 (llvm-backend, earlier 2026):** Scalar-replacement pass for + non-escaping objects dropped `object_create` from 10 ms → 2 ms and + `binary_trees` from 9 ms → 3 ms. Relevant to the `object_create` + discussion above; this was what made Perry competitive on that + benchmark at all. +- **v0.5.2 (llvm-backend, earlier 2026):** The three optimizations + described above landed. Before this, Perry was ~95 ms on + `loop_overhead` (IEEE-strict `fadd` chain, same as the other + languages). These benchmarks only started showing Perry ahead of + native compiled languages after `reassoc contract` FMF and the + integer-mod fast path landed. + +## Reproducing + +```bash +cd benchmarks/polyglot +bash run_all.sh 5 # best of 5 per benchmark +``` + +Requires: Perry built from this repo (`cargo build --release`), plus +any subset of Node, Bun, Static Hermes (`shermes`), Rust, C++, Go, +Swift, Java, Python. Missing runtimes produce `-` cells; the script +does not fail. + +Runtime is ~10 minutes on an M1 Max at best-of-5, dominated by Python +(~30 s per full bench.py invocation). diff --git a/benchmarks/polyglot/RESULTS.md b/benchmarks/polyglot/RESULTS.md index 1fd765463..eefa49748 100644 --- a/benchmarks/polyglot/RESULTS.md +++ b/benchmarks/polyglot/RESULTS.md @@ -1,118 +1,129 @@ # Polyglot Benchmark Results -Perry vs 7 languages on 8 identical benchmarks. All implementations use `f64`/`double` arithmetic to match TypeScript's `number` type. No SIMD intrinsics, no unsafe code — standard idiomatic code in each language. +Perry vs 9 other runtimes on 8 identical benchmarks. All implementations +use `f64`/`double` arithmetic to match TypeScript's `number` type. No SIMD +intrinsics, no unsafe code, no non-default optimization flags — each +language's idiomatic release-mode build. A companion `RESULTS_OPT.md` +(phase 2 of this investigation) shows what happens when each language is +given flags equivalent to Perry's defaults. -## Results - -Best of 3 runs, macOS ARM64 (Apple Silicon M-series), April 2026. +See [`METHODOLOGY.md`](./METHODOLOGY.md) for iteration counts, clocks, +compiler versions, and a full explanation of which optimizations create +each delta. -| Benchmark | Perry | Rust | C++ | Go | Swift | Java | Node | Python | -|----------------|-------|-------|-------|-------|-------|-------|-------|---------| -| fibonacci | 309 | 316 | 309 | 446 | 399 | 279 | 991 | 15935 | -| loop_overhead | 12 | 95 | 96 | 96 | 95 | 97 | 53 | 2979 | -| array_write | 2 | 6 | 2 | 8 | 2 | 6 | 8 | 392 | -| array_read | 4 | 9 | 9 | 10 | 9 | 11 | 13 | 330 | -| math_intensive | 14 | 48 | 50 | 48 | 48 | 50 | 49 | 2212 | -| object_create | 8 | 0 | 0 | 0 | 0 | 4 | 8 | 161 | -| nested_loops | 8 | 8 | 8 | 9 | 8 | 10 | 17 | 470 | -| accumulate | 25 | 98 | 96 | 96 | 96 | 100 | 592 | 4919 | +## Results +**Run date:** 2026-04-15 — Perry commit `e1cbd37` (v0.5.22). +**Hardware:** Apple M1 Max (10 cores, 64 GB RAM), macOS 26.4. +**Methodology:** best of 5 runs per cell, monotonic clock, no warmup. All times in milliseconds. Lower is better. +† `fibonacci` is reported best-of-20 rather than best-of-5. The recursive-call +shape is unusually sensitive to icache/branch-predictor state, and we saw +±20% variance between different best-of-5 runs of Rust and C++. 20 samples +tightens the distribution to within ±2% of the minimum. + +| Benchmark | Perry | Rust | C++ | Go | Swift | Java | Node | Bun | Hermes | Python | +|----------------|-------|-------|-------|-------|-------|-------|-------|-------|--------|---------| +| fibonacci† | 311 | 319 | 310 | 450 | 403 | 280 | 1001 | 527 | 2575 | 16002 | +| loop_overhead | 12 | 99 | 98 | 97 | 97 | 98 | 53 | 40 | 98 | 2983 | +| array_write | 2 | 7 | 2 | 9 | 2 | 6 | 8 | 5 | 93 | 395 | +| array_read | 3 | 10 | 9 | 10 | 9 | 11 | 13 | 14 | 46 | 344 | +| math_intensive | 14 | 49 | 50 | 49 | 49 | 51 | 50 | 51 | 50 | 2243 | +| object_create | 2 | 0 | 0 | 0 | 0 | 5 | 8 | 5 | 2 | 161 | +| nested_loops | 9 | 8 | 8 | 10 | 8 | 10 | 17 | 19 | 80 | 484 | +| accumulate | 24 | 97 | 97 | 99 | 96 | 100 | 602 | 99 | 122 | 4989 | + ## How to reproduce ```bash cd benchmarks/polyglot bash run_all.sh # best of 3 runs (default) -bash run_all.sh 5 # best of 5 runs +bash run_all.sh 5 # best of 5 runs (what the above table used) ``` -**Requirements:** Perry (built from this repo), Node.js, Go, Rust (`rustc`), C++ (`g++` or `clang++`), Swift, Java (`javac` + `java`), Python 3. Zig is optional (currently skipped due to macOS SDK compatibility). All must be in `$PATH`. - -**What the script does:** -1. Builds Perry from source (`cargo build --release`) -2. Compiles each Perry benchmark `.ts` to a native binary -3. Compiles `bench.cpp` with `g++ -O3`, `bench.rs` with `rustc -O`, `bench.swift` with `swiftc -O`, `bench.go` with `go build`, `bench.java` with `javac` -4. Runs each benchmark N times per language, takes the best (lowest) time -5. Outputs a markdown table - -## Why Perry beats compiled languages on some benchmarks - -These results are real but need context. Perry is not "faster than C++." Perry is faster than C++ *compiled with default optimization flags on benchmarks that use f64 for everything.* Three specific optimizations create the advantage: - -### 1. Fast-math reassociation (loop_overhead, math_intensive) - -Perry emits `reassoc contract` flags on every f64 arithmetic instruction. This lets LLVM break serial accumulator chains like `sum = sum + 1.0` into parallel accumulators, unroll 8x, and vectorize with NEON. - -Rust, C++, Go, and Swift compile with strict IEEE 754 by default. Under IEEE rules, `(a + b) + c != a + (b + c)` for floating-point — so the compiler cannot reorder the additions. Every `fadd` depends on the previous one: 3-cycle latency per iteration, fully serialized. That's why Rust/C++/Go/Swift all land at ~95ms for loop_overhead: they're hitting the `fadd` latency wall. - -Perry at 12ms means LLVM split the accumulator into ~8 parallel chains across 2 NEON FPUs. C++ would get the same result with `-ffast-math`, but the default is strict. - -### 2. Integer-mod fast path (accumulate) - -`i % 1000` on f64 is `fmod()`, which on ARM is a **libm function call** (~30ns per call). All languages in this benchmark use `double` to match TypeScript semantics, so they all call `fmod` — hence ~96ms across the board. - -Perry detects at compile time that both operands are provably integer-valued (via `is_integer_valued_expr` static analysis) and emits `fptosi → srem → sitofp` instead. `srem` is a single hardware instruction (~1-2 cycles). 25ms vs 96ms — the entire gap is `srem` vs `fmod`. - -If the C++ benchmark used `int` instead of `double`, it would be ~2ms. - -### 3. i32 loop counter + bounds elimination (array_write, array_read) - -Perry detects `for (let i = 0; i < arr.length; i++)` and maintains a parallel i32 counter alongside the f64 counter. Array indexing uses the i32 directly (no float-to-int conversion per iteration), and bounds checks are skipped entirely because the codegen proved `i < arr.length` statically. - -The other languages use `double` array indices (to match TS semantics), paying a float-to-int conversion on every access. - -## Where Perry loses — and why - -### fibonacci (tied with C++, faster than Rust) - -Perry at 309ms ties C++ (309ms) and beats Rust (316ms) on recursive `fib(40)`. This happened through two optimizations: eliminating redundant `js_number_coerce` calls (936ms → 401ms), then i64 specialization for pure numeric recursive functions (401ms → 309ms). - -Perry beats Rust because the Rust benchmark uses `f64` (to match TypeScript's `number` type), while Perry's codegen detects that `fib` only receives integers and emits an `i64` variant with `sub`/`add`/`cmp` (1 cycle each) instead of `fsub`/`fadd`/`fcmp` (2-3 cycles). Both compile through LLVM — same optimizer, different input. If Rust used `fn fib(n: i64) -> i64`, it would run at ~308ms. - -Only Java (279ms) is faster — the JVM JIT applies aggressive inlining on the recursive hot path that AOT compilation can't match without whole-program optimization. - -### object_create (Rust/C++/Go/Swift show 0ms) - -The "0ms" results are real but misleading. These languages use stack-allocated structs for `Point { x, y }`. The optimizer inlines the constructor, proves the struct never escapes, and computes the sum at compile time — the allocation is eliminated entirely. Perry uses GC-managed heap allocation (arena bump allocator), which cannot be eliminated. This is an inherent cost of Perry's dynamic value model. - -## Benchmark descriptions - -| Benchmark | What it measures | Workload | -|-----------|-----------------|----------| -| fibonacci | Recursive function call overhead | `fib(40)` — ~2 billion recursive calls | -| loop_overhead | Raw loop iteration throughput | `sum += 1.0` for 100M iterations | -| array_write | Sequential array write | Write `arr[i] = i` for 10M elements | -| array_read | Sequential array read | Sum 10M array elements | -| math_intensive | f64 arithmetic throughput | `result += 1.0/i` for 50M iterations | -| object_create | Object allocation + field access | Create 1M `Point(x, y)` structs, sum fields | -| nested_loops | Cache behavior + nested iteration | 3000x3000 double-nested array access | -| accumulate | Integer modulo on f64 | `sum += i % 1000` for 100M iterations | - -## Compiler versions used - -| Language | Compiler | Flags | -|----------|----------|-------| -| Perry | perry (LLVM backend) | default (clang -O3 -ffast-math internally) | -| Rust | rustc 1.92.0 | `-O` (release mode) | -| C++ | Apple clang 21.0 | `-O3 -std=c++17` | -| Go | go 1.21.3 | default | -| Swift | Swift 6.3 | `-O` | -| Java | javac + JVM | default (JIT) | -| Node.js | v25.8.0 | `--experimental-strip-types` | -| Python | 3.14.3 | default (CPython interpreter) | +**Required:** Perry (`cargo build --release` from repo root). +**Optional** (any subset works; missing runtimes show as `-`): Node.js, +Bun, Static Hermes (`shermes`), Rust (`rustc`), C++ (`g++` or `clang++`), +Swift, Go, Java (`javac` + `java`), Python 3. + +See [`METHODOLOGY.md`](./METHODOLOGY.md) for what each benchmark measures, +compiler versions, why certain cells look the way they do, and where Perry +loses (`object_create`) vs where it wins (`loop_overhead`, `math_intensive`, +`accumulate`, `array_read`). + +## Benchmark-by-benchmark summary + +### `loop_overhead` — `sum += 1.0` × 100M +Perry 12 ms vs all compiled languages ~97 ms. Perry emits +`reassoc contract` LLVM fast-math flags so the `fadd` chain can be broken +into parallel accumulators and vectorized. Rust/C++/Go/Swift all compile +IEEE-strict by default and hit the `fadd` latency wall. Node 53 ms / Bun 40 +ms: V8 and JavaScriptCore do the reassociation at JIT time. + +### `math_intensive` — `result += 1.0/i` × 50M +Perry 14 ms vs all others ~50 ms. Same story as `loop_overhead` — the +reciprocal divide has an even longer latency chain, so the parallel- +accumulator win is proportionally larger. + +### `accumulate` — `sum += i % 1000` × 100M +Perry 24 ms vs Rust/C++/Go/Swift/Java/Bun all ~97 ms, Node 602 ms, Hermes +122 ms. `i % 1000` on `double` is a libm `fmod` call on ARM (~30 ns per +call). Perry's type analysis proves the operands are integer-valued and +emits `srem` (1–2 cycle hardware instruction). The other languages all use +`double` to match TS semantics, so they all call `fmod`. Node's 602 ms +outlier is V8 failing to inline the libm call on this pattern. + +### `array_read` — sum 10M-element `number[]` +Perry 3 ms, C++/Swift 9 ms, Rust 10 ms, Go 10 ms, Java 11 ms. Perry +detects `for (let i = 0; i < arr.length; i++)` as statically in-bounds, +skips the JS `undefined`-on-OOB check, caches the length at loop entry, +and maintains a parallel i32 counter so the index is never a float → int +conversion. LLVM then autovectorizes to NEON 2-wide f64. C++ `std::vector` +has no bounds check by default but pays the chunk-boundary check from +`-O3`'s vectorizer framing. Rust's iterator form (not used here) matches +Perry — see `bench_opt.rs` (phase 2). + +### `array_write` — `arr[i] = i` × 10M +Perry 2 ms, C++/Swift 2 ms, Rust 7 ms, Go 9 ms. Perry matches C++ here. +The Rust result is `-O` with bounds-checked indexing; `.iter_mut()` would +match Perry. + +### `nested_loops` — 3000×3000 flat-array sum +All compiled languages 8–10 ms. Perry 9 ms. This benchmark is +cache-bound, not compute-bound — there is no optimization lever to pull. +Perry matches the compiled pack. + +### `fibonacci` — recursive `fib(40)` +Java 280 ms (JIT inlining), C++ 310 ms, Perry 311 ms, Rust 319 ms — the +top four languages all land within 10 ms of each other. Perry's type +inference refines the TS `number` parameter to `i64` (because the function +only ever performs integer operations), producing `add/sub/icmp` (1 cycle +each) instead of the `fadd/fsub/fcmp` (2–3 cycles) that the f64-typed Rust +and C++ benchmarks emit. The reason Perry isn't dramatically further +ahead is that LLVM's recursion-folding optimizations on fib-shaped code +recover most of the gap at -O3. The Rust `f64→i64` switch is a one-line +change (tested in `bench_opt.rs`) and drops Rust to ~280 ms. + +### `object_create` — allocate 1M `{x, y}` pairs, sum fields +Rust/C++/Go/Swift 0 ms: the compiler proves the struct never escapes and +eliminates the whole loop. Java 5 ms, Bun 5 ms, Node 8 ms, Perry 2 ms, +Hermes 2 ms. Perry is competitive here only because of the v0.5.17 +scalar-replacement pass; without it this benchmark was ~10 ms. The 0 ms +floor from statically-typed compiled languages is an inherent tradeoff of +compiling a dynamic language — see `METHODOLOGY.md`. ## Source files -Each language implements all 8 benchmarks in a single file: - - `bench.cpp` — C++17 - `bench.rs` — Rust (no dependencies) - `bench.go` — Go - `bench.swift` — Swift - `bench.java` — Java - `bench.py` — Python 3 -- `bench.zig` — Zig (may need manual build) -- Perry benchmarks in `../suite/*.ts` +- `bench.zig` — Zig (may need manual build; not in the current table) +- Perry / Node / Bun / Hermes run the TS files in `../suite/` -All implementations use the same algorithm, same data types (`f64`/`double`), same iteration counts, and same output format (`benchmark_name:elapsed_ms`). +All implementations use the same algorithm, same data types (`f64` / +`double` throughout), same iteration counts, and the same output format +(`benchmark_name:elapsed_ms`) so the runner can grep a single key per row. diff --git a/benchmarks/polyglot/RESULTS_OPT.md b/benchmarks/polyglot/RESULTS_OPT.md new file mode 100644 index 000000000..8100d046b --- /dev/null +++ b/benchmarks/polyglot/RESULTS_OPT.md @@ -0,0 +1,109 @@ +# Polyglot Benchmark Results — Default vs Optimized + +Same benchmarks as [`RESULTS.md`](./RESULTS.md), but with a second column +per native language showing what happens when the language is given the +flags and idioms that match what Perry does by default. + +**Run date:** 2026-04-15 — Perry commit `e1cbd37`. +**Hardware:** Apple M1 Max, macOS 26.4. +**Methodology:** best of 5 per cell (best of 20 for `fibonacci`). + +## Side by side + +All times in milliseconds. `Δ` = (default − opt) / default. Positive = opt +is faster. + +| Benchmark | Perry | C++
dflt | C++
opt | ΔC++ | Rust
dflt | Rust
opt | ΔRust | Go
dflt | Go
opt | ΔGo | Swift
dflt | Swift
opt | ΔSwift | +|------------------|------:|-------------:|------------:|------:|-------------:|------------:|------:|------------:|-----------:|-----:|--------------:|-------------:|-------:| +| loop_overhead | 12 | 98 | 12 | 88% | 99 | 24 | 76% | 97 | 99 | 0% | 97 | 24 | 75% | +| math_intensive | 14 | 50 | 14 | 72% | 49 | 14 | 71% | 49 | 49 | 0% | 49 | 14 | 71% | +| accumulate | 24 | 97 | 26 | 73% | 97 | 41 | 58% | 99 | 70 | 29% | 96 | 42 | 56% | +| array_write | 2 | 2 | 2 | 0% | 7 | 7 | 0% | 9 | 9 | 0% | 2 | 2 | 0% | +| array_read | 3 | 9 | 1 | 89% | 10 | 9 | 10% | 10 | 11 | -10% | 9 | 9 | 0% | +| nested_loops | 9 | 8 | 1 | 88% | 8 | 8 | 0% | 10 | 9 | 10% | 8 | 8 | 0% | +| fibonacci | 311 | 310 | 312 | -1% | 319 | 319 | 0% | 450 | 454 | -1% | 403 | 360 | 11% | +| object_create | 2 | 0 | 0 | -- | 0 | 0 | -- | 0 | 0 | -- | 0 | 0 | -- | + +## The one-line story per language + +**C++ (`bench_opt.cpp`, `-O3 -ffast-math -std=c++17`):** adding `-ffast-math` +and switching `accumulate` to `int64_t` closes every gap. C++ matches Perry +to the millisecond on `loop_overhead` (12 = 12) and `math_intensive` (14 = +14), and **beats Perry** on `array_read` (1 < 3) and `nested_loops` (1 < 9) +because clang's autovectorizer on ffast-math flat-array sums is more +aggressive than what Perry currently emits. The thesis is confirmed: the +entire Perry advantage on numeric f64 loops is the default flag choice, +not the compiler or the codegen backend. + +**Rust (`bench_opt.rs`, stable + `-C llvm-args=-fp-contract=fast`):** manual +4-way unrolling + iterator form + `i64` accumulate closes **most** of the +gap, but not all. `loop_overhead` goes from 99 → 24 ms (76% improvement) +but doesn't reach Perry's 12 ms — because stable Rust has no way to expose +LLVM's `reassoc` flag on individual fadd instructions. Nightly Rust's +`std::intrinsics::fadd_fast` would get there; we intentionally stayed on +stable. This is an interesting finding: Rust's *type system* can express +what Perry does (via `i64`), but Rust's *compile flags* cannot express +what Perry does (via `reassoc`). + +**Go (`bench_opt.go`, `go build`):** the only language that **cannot** close +the `loop_overhead` / `math_intensive` gap at all. Go has no `-ffast-math`, +no `reassoc` flag, and its compiler does not ship a floating-point +reassociation pass. `99 → 99` and `49 → 49` on the two fast-math-dependent +benchmarks, even with the full suite of type and loop-form changes that +helped the other languages. The only benchmark where Go opt improves on +Go default is `accumulate` (99 → 70), from the `int64` switch — and even +there, Go's 70 ms is well short of C++ opt's 26 ms, because Go's compiler +inserts a runtime integer-divide path that's slower than a bare ARM `sdiv` ++ `msub` for the modulo. + +**Swift (`bench_opt.swift`, `-Ounchecked`):** manual unrolling and +`UnsafeBufferPointer` close the `loop_overhead` (97 → 24) and +`math_intensive` (49 → 14) gaps partially — same profile as Rust. Swift +also has no reachable `reassoc` flag on its public release toolchain as of +6.3, so the remaining 24 → 12 gap is the same story as Rust. `fibonacci` +improves noticeably (403 → 360) with `-Ounchecked`. + +## Where the opt variants matter less than expected + +**`array_write` / `array_read`:** the bounds-check elimination story is +less dramatic than predicted in the phase-2 plan. Rust's default indexed +`arr[i]` access with `-O` already gets within 10% of optimal because rustc +is good at proving `i < arr.len()` for classic for-loops. `.iter().sum()` +only shaves 10 → 9 on `array_read`. Swift `UnsafeBufferPointer` on +`array_write` shaved 2 → 1 ms but that's mostly in the noise floor. + +The real `array_read` win is on **C++ opt (1 ms)** — and that's from +`-ffast-math` enabling LLVM to break the sum reduction into 4 parallel +lanes, not from bounds elimination. C++ had no bounds checks to remove. + +**`fibonacci`:** type-switching from i32 → i64 (C++, Rust) or no-op (Go, +Swift — both already Int64-native on arm64) doesn't change the numbers +materially. The fib recursion is bottlenecked on call overhead, not +arithmetic width, and ARM64 handles i32 and i64 ops at the same rate. The +language-to-language fib gap (~315 ms for Rust/C++/Perry vs ~450 ms for +Go) is the compiler's recursion-folding quality, not expressible in +benchmark-source-level changes. + +## Compile commands + +| File | Command | +|------------------|--------------------------------------------------------------| +| `bench.cpp` | `g++ -O3 -std=c++17 bench.cpp -o bench_cpp` | +| `bench_opt.cpp` | `g++ -O3 -ffast-math -std=c++17 bench_opt.cpp -o bench_opt_cpp` | +| `bench.rs` | `rustc -O bench.rs -o bench_rs` | +| `bench_opt.rs` | `RUSTFLAGS="-C llvm-args=-fp-contract=fast" rustc -O bench_opt.rs -o bench_opt_rs` | +| `bench.go` | `go build -o bench_go bench.go` | +| `bench_opt.go` | `go build -o bench_opt_go bench_opt.go` (no opt flags exist) | +| `bench.swift` | `swiftc -O bench.swift -o bench_swift` | +| `bench_opt.swift`| `swiftc -Ounchecked bench_opt.swift -o bench_opt_swift` | + +## Reproducing + +```bash +cd benchmarks/polyglot +bash run_opt.sh # builds opt variants, runs best of 5, prints table +``` + +`run_opt.sh` reads default numbers from the last `run_all.sh` sweep +(stored in `/tmp/perry_polyglot_bench/results_*.txt`) so a full refresh +is `run_all.sh && run_opt.sh`. diff --git a/benchmarks/polyglot/bench_opt.cpp b/benchmarks/polyglot/bench_opt.cpp new file mode 100644 index 000000000..9a8a3850a --- /dev/null +++ b/benchmarks/polyglot/bench_opt.cpp @@ -0,0 +1,140 @@ +// Optimized C++ variant — same algorithms, type choices and compile flags +// aligned with what Perry does by default. +// +// Changes vs bench.cpp: +// - fib: int → int64_t (ARM64 native word size; matches Perry's i64 +// inference from TS `number` on a recursive integer function) +// - accumulate: double → int64_t for sum and i (Perry's integer-mod fast +// path emits srem on int64; the double variant in bench.cpp +// calls libm fmod once per iter) +// - loop_overhead, math_intensive: no source change; compiled with +// `-O3 -ffast-math` so LLVM can emit `reassoc contract` on +// fadd/fdiv. bench.cpp is `-O3` only. +// - array_read/array_write/nested_loops: no change needed — std::vector:: +// operator[] doesn't bounds-check by default, and `-O3 +// -ffast-math` on the read loop is already enough for LLVM +// to vectorize. +// - object_create: no change — already fully eliminated by DCE. + +#include +#include +#include +#include + +using Clock = std::chrono::steady_clock; + +inline long long elapsed_ms(Clock::time_point start) { + return std::chrono::duration_cast( + Clock::now() - start).count(); +} + +int64_t fib(int64_t n) { + if (n < 2) return n; + return fib(n - 1) + fib(n - 2); +} + +void bench_fibonacci() { + auto start = Clock::now(); + int64_t result = fib(40); + printf("fibonacci:%lld\n", elapsed_ms(start)); + printf(" checksum: %lld\n", result); +} + +void bench_loop_overhead() { + auto start = Clock::now(); + double sum = 0.0; + for (int i = 0; i < 100000000; i++) { + sum += 1.0; + } + printf("loop_overhead:%lld\n", elapsed_ms(start)); + printf(" checksum: %.0f\n", sum); +} + +void bench_array_write() { + std::vector arr(10000000, 0.0); + auto start = Clock::now(); + for (int i = 0; i < 10000000; i++) { + arr[i] = static_cast(i); + } + printf("array_write:%lld\n", elapsed_ms(start)); + printf(" checksum: %.0f\n", arr[9999999]); +} + +void bench_array_read() { + std::vector arr(10000000); + for (int i = 0; i < 10000000; i++) { + arr[i] = static_cast(i); + } + auto start = Clock::now(); + double sum = 0.0; + for (int i = 0; i < 10000000; i++) { + sum += arr[i]; + } + printf("array_read:%lld\n", elapsed_ms(start)); + printf(" checksum: %.0f\n", sum); +} + +void bench_math_intensive() { + auto start = Clock::now(); + double result = 0.0; + for (int i = 1; i <= 50000000; i++) { + result += 1.0 / static_cast(i); + } + printf("math_intensive:%lld\n", elapsed_ms(start)); + printf(" checksum: %.6f\n", result); +} + +struct Point { + double x; + double y; +}; + +void bench_object_create() { + auto start = Clock::now(); + double sum = 0.0; + for (int i = 0; i < 1000000; i++) { + Point p{static_cast(i), static_cast(i) * 2.0}; + sum += p.x + p.y; + } + printf("object_create:%lld\n", elapsed_ms(start)); + printf(" checksum: %.0f\n", sum); +} + +void bench_nested_loops() { + const int n = 3000; + std::vector arr(n * n); + for (int i = 0; i < n * n; i++) { + arr[i] = static_cast(i); + } + auto start = Clock::now(); + double sum = 0.0; + for (int i = 0; i < n; i++) { + for (int j = 0; j < n; j++) { + sum += arr[i * n + j]; + } + } + printf("nested_loops:%lld\n", elapsed_ms(start)); + printf(" checksum: %.0f\n", sum); +} + +void bench_accumulate() { + auto start = Clock::now(); + int64_t sum = 0; + for (int64_t i = 0; i < 100000000; i++) { + sum += i % 1000; + } + printf("accumulate:%lld\n", elapsed_ms(start)); + printf(" checksum: %lld\n", sum); +} + +int main() { + bench_fibonacci(); + bench_loop_overhead(); + bench_array_write(); + bench_array_read(); + bench_math_intensive(); + bench_object_create(); + bench_nested_loops(); + bench_accumulate(); + return 0; +} diff --git a/benchmarks/polyglot/bench_opt.go b/benchmarks/polyglot/bench_opt.go new file mode 100644 index 000000000..3784d4b17 --- /dev/null +++ b/benchmarks/polyglot/bench_opt.go @@ -0,0 +1,151 @@ +// Optimized Go variant — type choices aligned with Perry where possible. +// +// Changes vs bench.go: +// - fib: no change. Go's `int` on arm64 is already int64. +// - accumulate: float64 sum, `float64(i % 1000)` → int64 sum, `i % 1000`. +// Perry's integer-mod fast path emits srem; the default +// variant in bench.go calls runtime.fmod once per iter. +// +// Things the standard Go toolchain cannot express: +// +// - loop_overhead / math_intensive: Go's compiler does not expose +// fast-math / reassoc flags. There is no `-ffast-math` equivalent in +// `go build`. The `gc` compiler preserves strict IEEE 754 semantics +// and does not ship a floating-point reassociation pass. Manual +// unrolling (as in bench_opt.rs) would help superficially but Go's +// register allocator still serializes the fadd chain because the +// compiler doesn't know those fadds commute. Left as the default +// loop — this is the honest baseline for Go on this class of code. +// +// - array_read / array_write: Go always bounds-checks indexed slice +// access, and the compiler's bounds-check elision is conservative +// for `for i := 0; i < len(arr); i++ { arr[i] = ... }`. The `range` +// form sometimes lets the compiler elide checks; we use it below +// for array_read to give Go its best shot. array_write still uses +// indexed form because `range` only iterates values, not slots. + +package main + +import ( + "fmt" + "time" +) + +func benchFibonacci() { + var fib func(n int64) int64 + fib = func(n int64) int64 { + if n < 2 { + return n + } + return fib(n-1) + fib(n-2) + } + + start := time.Now() + result := fib(40) + elapsed := time.Since(start).Milliseconds() + fmt.Printf("fibonacci:%d\n", elapsed) + fmt.Printf(" checksum: %d\n", result) +} + +func benchLoopOverhead() { + start := time.Now() + sum := 0.0 + for i := 0; i < 100_000_000; i++ { + sum += 1.0 + } + elapsed := time.Since(start).Milliseconds() + fmt.Printf("loop_overhead:%d\n", elapsed) + fmt.Printf(" checksum: %.0f\n", sum) +} + +func benchArrayWrite() { + arr := make([]float64, 10_000_000) + start := time.Now() + for i := 0; i < 10_000_000; i++ { + arr[i] = float64(i) + } + elapsed := time.Since(start).Milliseconds() + fmt.Printf("array_write:%d\n", elapsed) + fmt.Printf(" checksum: %.0f\n", arr[9_999_999]) +} + +func benchArrayRead() { + arr := make([]float64, 10_000_000) + for i := 0; i < 10_000_000; i++ { + arr[i] = float64(i) + } + start := time.Now() + sum := 0.0 + for _, v := range arr { + sum += v + } + elapsed := time.Since(start).Milliseconds() + fmt.Printf("array_read:%d\n", elapsed) + fmt.Printf(" checksum: %.0f\n", sum) +} + +func benchMathIntensive() { + start := time.Now() + result := 0.0 + for i := 1; i <= 50_000_000; i++ { + result += 1.0 / float64(i) + } + elapsed := time.Since(start).Milliseconds() + fmt.Printf("math_intensive:%d\n", elapsed) + fmt.Printf(" checksum: %.6f\n", result) +} + +type Point struct { + x float64 + y float64 +} + +func benchObjectCreate() { + start := time.Now() + sum := 0.0 + for i := 0; i < 1_000_000; i++ { + p := Point{x: float64(i), y: float64(i) * 2.0} + sum += p.x + p.y + } + elapsed := time.Since(start).Milliseconds() + fmt.Printf("object_create:%d\n", elapsed) + fmt.Printf(" checksum: %.0f\n", sum) +} + +func benchNestedLoops() { + n := 3000 + arr := make([]float64, n*n) + for i := 0; i < n*n; i++ { + arr[i] = float64(i) + } + start := time.Now() + sum := 0.0 + for _, v := range arr { + sum += v + } + elapsed := time.Since(start).Milliseconds() + fmt.Printf("nested_loops:%d\n", elapsed) + fmt.Printf(" checksum: %.0f\n", sum) +} + +func benchAccumulate() { + start := time.Now() + var sum int64 = 0 + for i := int64(0); i < 100_000_000; i++ { + sum += i % 1000 + } + elapsed := time.Since(start).Milliseconds() + fmt.Printf("accumulate:%d\n", elapsed) + fmt.Printf(" checksum: %d\n", sum) +} + +func main() { + benchFibonacci() + benchLoopOverhead() + benchArrayWrite() + benchArrayRead() + benchMathIntensive() + benchObjectCreate() + benchNestedLoops() + benchAccumulate() +} diff --git a/benchmarks/polyglot/bench_opt.rs b/benchmarks/polyglot/bench_opt.rs new file mode 100644 index 000000000..d4ab47eb8 --- /dev/null +++ b/benchmarks/polyglot/bench_opt.rs @@ -0,0 +1,175 @@ +// Optimized Rust variant — same algorithms, type choices and loop forms +// aligned with what Perry does by default. +// +// Changes vs bench.rs: +// - fib: i32 → i64 (ARM64 native word size; matches Perry's i64 +// inference from TS `number`) +// - accumulate: f64 sum, `(i % 1000) as f64` → i64 sum, `i % 1000` as i64. +// Perry's integer-mod fast path emits srem; the default +// variant in bench.rs calls libm fmod once per iter. +// - array_write: index loop → `arr.iter_mut().enumerate()`. Rustc elides +// bounds checks on iterator chains; indexed access does not. +// - array_read: index loop → `arr.iter().sum()`. Same reason. +// - nested_loops: inner loop → `arr[row..row+n].iter().sum()`. Rustc +// promotes the row slice to a bounds-checked range load +// once per outer iteration; the inner loop is clean. +// - loop_overhead, math_intensive: compiled with +// `RUSTFLAGS=-C llvm-args=-fp-contract=fast` to turn on FMA +// contraction at LLVM level. This is stable Rust. `reassoc` +// is not exposed as a stable flag — for a full Perry- +// equivalent, nightly `std::intrinsics::fadd_fast` would be +// needed. We use manual unrolling (4 parallel accumulators) +// as a stable-Rust stand-in for what LLVM would do with +// reassoc. See the "note" comment in each of those two +// functions. +// +// Compile: +// rustc -O -C llvm-args=-fp-contract=fast bench_opt.rs + +use std::time::Instant; + +fn fib(n: i64) -> i64 { + if n < 2 { + return n; + } + fib(n - 1) + fib(n - 2) +} + +fn bench_fibonacci() { + let start = Instant::now(); + let result = fib(40); + let elapsed = start.elapsed().as_millis(); + println!("fibonacci:{}", elapsed); + println!(" checksum: {}", result); +} + +fn bench_loop_overhead() { + // Manual 4-way unrolling to match what LLVM emits under `reassoc`: + // four parallel fadd chains, summed at the end. Stable Rust does not + // expose `reassoc` as a compile flag, so we hand-write the effect. + let start = Instant::now(); + let mut s0: f64 = 0.0; + let mut s1: f64 = 0.0; + let mut s2: f64 = 0.0; + let mut s3: f64 = 0.0; + let iters = 100_000_000 / 4; + for _ in 0..iters { + s0 += 1.0; + s1 += 1.0; + s2 += 1.0; + s3 += 1.0; + } + let sum = s0 + s1 + s2 + s3; + let elapsed = start.elapsed().as_millis(); + println!("loop_overhead:{}", elapsed); + println!(" checksum: {:.0}", sum); +} + +fn bench_array_write() { + let mut arr = vec![0.0_f64; 10_000_000]; + let start = Instant::now(); + for (i, slot) in arr.iter_mut().enumerate() { + *slot = i as f64; + } + let elapsed = start.elapsed().as_millis(); + println!("array_write:{}", elapsed); + println!(" checksum: {:.0}", arr[9_999_999]); +} + +fn bench_array_read() { + let mut arr = vec![0.0_f64; 10_000_000]; + for (i, slot) in arr.iter_mut().enumerate() { + *slot = i as f64; + } + let start = Instant::now(); + let sum: f64 = arr.iter().sum(); + let elapsed = start.elapsed().as_millis(); + println!("array_read:{}", elapsed); + println!(" checksum: {:.0}", sum); +} + +fn bench_math_intensive() { + // Same 4-way manual unrolling. Each lane computes its own reciprocal + // sum; combined at the end. Without reassoc this is the only + // stable-Rust way to break the fadd latency chain. + let start = Instant::now(); + let mut r0: f64 = 0.0; + let mut r1: f64 = 0.0; + let mut r2: f64 = 0.0; + let mut r3: f64 = 0.0; + let mut i = 1i64; + while i + 3 <= 50_000_000 { + r0 += 1.0 / i as f64; + r1 += 1.0 / (i + 1) as f64; + r2 += 1.0 / (i + 2) as f64; + r3 += 1.0 / (i + 3) as f64; + i += 4; + } + // Handle any remainder (50M is divisible by 4, so in practice none). + while i <= 50_000_000 { + r0 += 1.0 / i as f64; + i += 1; + } + let result = r0 + r1 + r2 + r3; + let elapsed = start.elapsed().as_millis(); + println!("math_intensive:{}", elapsed); + println!(" checksum: {:.6}", result); +} + +struct Point { + x: f64, + y: f64, +} + +fn bench_object_create() { + let start = Instant::now(); + let mut sum: f64 = 0.0; + for i in 0..1_000_000 { + let p = Point { + x: i as f64, + y: i as f64 * 2.0, + }; + sum += p.x + p.y; + } + let elapsed = start.elapsed().as_millis(); + println!("object_create:{}", elapsed); + println!(" checksum: {:.0}", sum); +} + +fn bench_nested_loops() { + let n = 3000; + let mut arr = vec![0.0_f64; n * n]; + for (i, slot) in arr.iter_mut().enumerate() { + *slot = i as f64; + } + let start = Instant::now(); + let mut sum: f64 = 0.0; + for row in arr.chunks_exact(n) { + sum += row.iter().sum::(); + } + let elapsed = start.elapsed().as_millis(); + println!("nested_loops:{}", elapsed); + println!(" checksum: {:.0}", sum); +} + +fn bench_accumulate() { + let start = Instant::now(); + let mut sum: i64 = 0; + for i in 0..100_000_000_i64 { + sum += i % 1000; + } + let elapsed = start.elapsed().as_millis(); + println!("accumulate:{}", elapsed); + println!(" checksum: {}", sum); +} + +fn main() { + bench_fibonacci(); + bench_loop_overhead(); + bench_array_write(); + bench_array_read(); + bench_math_intensive(); + bench_object_create(); + bench_nested_loops(); + bench_accumulate(); +} diff --git a/benchmarks/polyglot/bench_opt.swift b/benchmarks/polyglot/bench_opt.swift new file mode 100644 index 000000000..f0c18e5a5 --- /dev/null +++ b/benchmarks/polyglot/bench_opt.swift @@ -0,0 +1,169 @@ +// Optimized Swift variant — type choices and compile flags aligned with +// Perry's defaults where possible. +// +// Changes vs bench.swift: +// - fib: no change. Swift's `Int` on arm64 is already Int64. +// - accumulate: Double sum → Int64 sum, removed Double() cast on i%1000. +// Perry's integer-mod fast path emits srem; the default +// variant calls fmod once per iter. +// - array_read / array_write / nested_loops: use +// `arr.withUnsafeMutableBufferPointer` (write) and +// `arr.withUnsafeBufferPointer` (read) to get raw pointer +// iteration. This skips Swift's default Array bounds checks +// and the ARC retain/release that the safe subscript pulls +// in around Copy-on-Write wrappers. +// - loop_overhead / math_intensive: compile with `-Ounchecked` (Swift's +// only non-default knob). Swift has no exposed fast-math +// flag as of 6.3 on the release toolchain; the LLVM FMFs +// are not reachable from the Swift CLI. Manual 4-way +// unrolling is added as a stand-in for what LLVM would do +// under reassoc, matching what bench_opt.rs does for +// stable Rust. +// +// Compile: +// swiftc -Ounchecked bench_opt.swift + +import Foundation + +func benchFibonacci() { + func fib(_ n: Int) -> Int { + if n < 2 { return n } + return fib(n - 1) + fib(n - 2) + } + + let start = CFAbsoluteTimeGetCurrent() + let result = fib(40) + let elapsed = Int((CFAbsoluteTimeGetCurrent() - start) * 1000) + print("fibonacci:\(elapsed)") + print(" checksum: \(result)") +} + +func benchLoopOverhead() { + let start = CFAbsoluteTimeGetCurrent() + // Manual 4-way unrolling — same reason as bench_opt.rs. Swift's + // compiler does not expose reassoc on the release toolchain. + var s0: Double = 0.0 + var s1: Double = 0.0 + var s2: Double = 0.0 + var s3: Double = 0.0 + let iters = 100_000_000 / 4 + for _ in 0../dev/null 2>&1 && HAS_BUN=1 +command -v shermes >/dev/null 2>&1 && HAS_SHERMES=1 + +# Strip TypeScript annotations so Hermes (JS-only) can parse. +# Matches the helper in benchmarks/suite/run_benchmarks.sh. +strip_types() { + sed -E \ + -e 's/: (number|string|boolean|any|void)(\[\])?//g' \ + -e 's/\): (number|string|boolean|any|void)(\[\])? \{/) {/g' \ + "$1" +} + echo "=== Building ===" cargo build --release --manifest-path="$PERRY_ROOT/Cargo.toml" -p perry -q 2>/dev/null PERRY="$PERRY_ROOT/target/release/perry" @@ -24,6 +39,17 @@ go build -o "$TMPDIR/bench_go" bench.go 2>/dev/null && echo " Go: done" javac -d "$TMPDIR" bench.java 2>/dev/null && echo " Java: done" echo " Python: (interpreted)" +# Compile Hermes binaries (one per benchmark) from stripped-types .js +if [ $HAS_SHERMES -eq 1 ]; then + for bk in "05_fibonacci" "02_loop_overhead" "03_array_write" "04_array_read" "06_math_intensive" "07_object_create" "10_nested_loops" "13_factorial"; do + js_file="$TMPDIR/shermes_${bk}.js" + strip_types "$SUITE/${bk}.ts" > "$js_file" + shermes -typed -O -o "$TMPDIR/shermes_${bk}" "$js_file" 2>/dev/null || \ + shermes -O -o "$TMPDIR/shermes_${bk}" "$js_file" 2>/dev/null || true + done + echo " Hermes: done" +fi + echo "" echo "=== Running (best of $RUNS) ===" @@ -73,6 +99,42 @@ for bk in "fibonacci:05_fibonacci:fibonacci" "loop_overhead:02_loop_overhead:loo done echo " Node: done" +# Bun (separate .ts files — Bun parses TS natively) +> "$TMPDIR/results_bun.txt" +if [ $HAS_BUN -eq 1 ]; then + for bk in "fibonacci:05_fibonacci:fibonacci" "loop_overhead:02_loop_overhead:loop_overhead" "array_write:03_array_write:array_write" "array_read:04_array_read:array_read" "math_intensive:06_math_intensive:math_intensive" "object_create:07_object_create:object_create" "nested_loops:10_nested_loops:nested_loops" "accumulate:13_factorial:accumulate"; do + IFS=: read -r bench ts key <<< "$bk" + t=$(best_of "bun run $SUITE/${ts}.ts" "$key") + echo "${bench}=${t}" >> "$TMPDIR/results_bun.txt" + done + echo " Bun: done" +else + for bench in fibonacci loop_overhead array_write array_read math_intensive object_create nested_loops accumulate; do + echo "${bench}=-" >> "$TMPDIR/results_bun.txt" + done + echo " Bun: skipped (not installed)" +fi + +# Static Hermes (compiled binaries) +> "$TMPDIR/results_hermes.txt" +if [ $HAS_SHERMES -eq 1 ]; then + for bk in "fibonacci:05_fibonacci:fibonacci" "loop_overhead:02_loop_overhead:loop_overhead" "array_write:03_array_write:array_write" "array_read:04_array_read:array_read" "math_intensive:06_math_intensive:math_intensive" "object_create:07_object_create:object_create" "nested_loops:10_nested_loops:nested_loops" "accumulate:13_factorial:accumulate"; do + IFS=: read -r bench ts key <<< "$bk" + if [ -x "$TMPDIR/shermes_${ts}" ]; then + t=$(best_of "$TMPDIR/shermes_${ts}" "$key") + else + t="-" + fi + echo "${bench}=${t}" >> "$TMPDIR/results_hermes.txt" + done + echo " Hermes: done" +else + for bench in fibonacci loop_overhead array_write array_read math_intensive object_create nested_loops accumulate; do + echo "${bench}=-" >> "$TMPDIR/results_hermes.txt" + done + echo " Hermes: skipped (not installed)" +fi + # Polyglot languages (all benchmarks in one binary) run_lang "rust" "$TMPDIR/bench_rs" run_lang "cpp" "$TMPDIR/bench_cpp" @@ -93,12 +155,12 @@ echo "" echo "Best of $RUNS runs, macOS ARM64 (Apple Silicon). All times in milliseconds." echo "Lower is better." echo "" -printf "| %-14s | %5s | %5s | %5s | %5s | %5s | %5s | %5s | %7s |\n" \ - "Benchmark" "Perry" "Rust" "C++" "Go" "Swift" "Java" "Node" "Python" -echo "|----------------|-------|-------|-------|-------|-------|-------|-------|---------|" +printf "| %-14s | %5s | %5s | %5s | %5s | %5s | %5s | %5s | %5s | %6s | %7s |\n" \ + "Benchmark" "Perry" "Rust" "C++" "Go" "Swift" "Java" "Node" "Bun" "Hermes" "Python" +echo "|----------------|-------|-------|-------|-------|-------|-------|-------|-------|--------|---------|" for bench in fibonacci loop_overhead array_write array_read math_intensive object_create nested_loops accumulate; do - printf "| %-14s | %5s | %5s | %5s | %5s | %5s | %5s | %5s | %7s |\n" \ + printf "| %-14s | %5s | %5s | %5s | %5s | %5s | %5s | %5s | %5s | %6s | %7s |\n" \ "$bench" \ "$(r perry $bench)" \ "$(r rust $bench)" \ @@ -107,5 +169,7 @@ for bench in fibonacci loop_overhead array_write array_read math_intensive objec "$(r swift $bench)" \ "$(r java $bench)" \ "$(r node $bench)" \ + "$(r bun $bench)" \ + "$(r hermes $bench)" \ "$(r python $bench)" done diff --git a/benchmarks/polyglot/run_opt.sh b/benchmarks/polyglot/run_opt.sh new file mode 100755 index 000000000..a29bccd8f --- /dev/null +++ b/benchmarks/polyglot/run_opt.sh @@ -0,0 +1,72 @@ +#!/usr/bin/env bash +# Runs the _opt.{cpp,rs,go,swift} variants and pairs the numbers with the +# default-variant numbers from the last run_all.sh sweep. +set -e +cd "$(dirname "$0")" +RUNS=${1:-5} +FIB_RUNS=${2:-20} +TMPDIR=/tmp/perry_polyglot_bench +mkdir -p "$TMPDIR" + +echo "=== Building opt variants ===" +g++ -O3 -ffast-math -std=c++17 bench_opt.cpp -o "$TMPDIR/bench_opt_cpp" && echo " C++ opt: done (-O3 -ffast-math)" +RUSTFLAGS="-C llvm-args=-fp-contract=fast" rustc -O bench_opt.rs -o "$TMPDIR/bench_opt_rs" 2>/dev/null && echo " Rust opt: done (-O, fp-contract=fast)" +go build -o "$TMPDIR/bench_opt_go" bench_opt.go && echo " Go opt: done (no opt flags available)" +swiftc -Ounchecked bench_opt.swift -o "$TMPDIR/bench_opt_swift" && echo " Swift opt: done (-Ounchecked)" + +echo "" +echo "=== Running (best of $RUNS, fibonacci: best of $FIB_RUNS) ===" + +bestof() { + local cmd="$1" key="$2" n="$3" best="" + for i in $(seq 1 "$n"); do + local out t + out=$(eval "$cmd" 2>/dev/null) || true + t=$(echo "$out" | grep -oE "${key}:[0-9]+" | head -1 | grep -oE '[0-9]+$') + if [ -n "$t" ]; then + if [ -z "$best" ] || [ "$t" -lt "$best" ]; then best=$t; fi + fi + done + echo "${best:--}" +} + +for lang in cpp rs go swift; do + out="$TMPDIR/results_opt_${lang}.txt" + > "$out" + for key in loop_overhead math_intensive array_write array_read object_create nested_loops accumulate; do + echo "${key}=$(bestof "$TMPDIR/bench_opt_${lang}" "$key" "$RUNS")" >> "$out" + done + echo "fibonacci=$(bestof "$TMPDIR/bench_opt_${lang}" "fibonacci" "$FIB_RUNS")" >> "$out" + echo " ${lang}: done" +done + +# Read helpers +rdef() { grep "^${2}=" "$TMPDIR/results_${1}.txt" 2>/dev/null | cut -d= -f2; } +ropt() { grep "^${2}=" "$TMPDIR/results_opt_${1}.txt" 2>/dev/null | cut -d= -f2; } + +delta() { + local d="$1" o="$2" + if [ -z "$d" ] || [ -z "$o" ] || [ "$d" = "-" ] || [ "$o" = "-" ] || [ "$d" = "0" ]; then + echo "--" + return + fi + awk -v d="$d" -v o="$o" 'BEGIN { printf "%.0f%%", (d - o) / d * 100 }' +} + +echo "" +echo "# Default vs Optimized" +echo "" +printf "| %-14s | %5s | %5s | %5s | %5s | %5s | %5s | %5s | %5s | %5s | %5s | %6s | %6s | %7s |\n" \ + "Benchmark" "Perry" "Cdef" "Copt" "ΔCpp" "Rdef" "Ropt" "ΔRs" "Gdef" "Gopt" "ΔGo" "Sdef" "Sopt" "ΔSw" +echo "|----------------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|--------|--------|---------|" + +for bench in loop_overhead math_intensive accumulate array_write array_read nested_loops fibonacci object_create; do + p=$(rdef perry $bench) + cdef=$(rdef cpp $bench); copt=$(ropt cpp $bench) + rdef=$(rdef rust $bench); ropt=$(ropt rs $bench) + gdef=$(rdef go $bench); gopt=$(ropt go $bench) + sdef=$(rdef swift $bench); sopt=$(ropt swift $bench) + printf "| %-14s | %5s | %5s | %5s | %5s | %5s | %5s | %5s | %5s | %5s | %5s | %6s | %6s | %7s |\n" \ + "$bench" "$p" "$cdef" "$copt" "$(delta $cdef $copt)" "$rdef" "$ropt" "$(delta $rdef $ropt)" \ + "$gdef" "$gopt" "$(delta $gdef $gopt)" "$sdef" "$sopt" "$(delta $sdef $sopt)" +done diff --git a/check.log b/check.log new file mode 100644 index 000000000..e69de29bb diff --git a/crates/perry-codegen/src/block.rs b/crates/perry-codegen/src/block.rs index 8e36acd32..74e9e7d51 100644 --- a/crates/perry-codegen/src/block.rs +++ b/crates/perry-codegen/src/block.rs @@ -56,6 +56,16 @@ impl LlBlock { self.terminated } + /// Allocate a fresh SSA register name in the enclosing function's + /// virtual register pool (e.g. `"%r42"`). Safe to call between + /// `gep` / other instructions that may emit sub-registers. Pair with + /// `emit_raw` when you need a custom instruction whose type string + /// isn't in the `LlvmType` alphabet (e.g. a literal `[N x i32]` + /// array type passed to `getelementptr`). + pub fn fresh_reg(&mut self) -> String { + self.reg() + } + fn emit(&mut self, line: impl Into) { // Never emit instructions after a terminator — LLVM rejects them and // the symptom is a confusing `clang` parse error many lines later. @@ -252,6 +262,26 @@ impl LlBlock { r } + /// (Issue #52) Load tagged with `!invariant.load !0`. LLVM's GVN + + /// LICM are allowed to hoist these loads out of any enclosing loop — + /// the contract is that the loaded memory does not change between + /// observable executions of the instruction. Use ONLY for values + /// that are genuinely loop-invariant (e.g. a Buffer's `length` + /// field, which stays pinned for the lifetime of the buffer since + /// `Buffer.alloc(N)` never grows/shrinks). + /// + /// Misuse corrupts output silently: LLVM will cache the first + /// value and reuse it across iterations even if the underlying + /// memory changes. + pub fn load_invariant(&mut self, ty: LlvmType, ptr: &str) -> String { + let r = self.reg(); + self.emit(format!( + "{} = load {}, ptr {}, !invariant.load !0", + r, ty, ptr + )); + r + } + pub fn store(&mut self, ty: LlvmType, val: &str, ptr: &str) { self.emit(format!("store {} {}, ptr {}", ty, val, ptr)); } @@ -294,6 +324,29 @@ impl LlBlock { r } + /// ECMAScript ToInt32: `fptosi` with a NaN/Infinity guard. + /// JS ToInt32: NaN and ±Infinity produce 0 (per spec), normal values + /// go through `fptosi(f64→i64) + trunc(i64→i32)`. + pub fn toint32(&mut self, val: &str) -> String { + use crate::types::{DOUBLE, I1, I32, I64}; + let is_nan = self.fcmp("uno", val, "0.0"); + let fabs = self.call(DOUBLE, "llvm.fabs.f64", &[(DOUBLE, val)]); + let is_inf = self.fcmp("oeq", &fabs, "0x7FF0000000000000"); + let is_bad = self.or(I1, &is_nan, &is_inf); + let safe = self.select(I1, &is_bad, DOUBLE, "0.0", val); + let as_i64 = self.fptosi(DOUBLE, &safe, I64); + self.trunc(I64, &as_i64, I32) + } + + /// Fast ToInt32 — skip NaN/Infinity guards. Use ONLY when the input + /// is known to be a finite number (e.g., result of integer arithmetic, + /// `sitofp(i32)`, or a value that went through `toint32` already). + pub fn toint32_fast(&mut self, val: &str) -> String { + use crate::types::{I32, I64}; + let as_i64 = self.fptosi(crate::types::DOUBLE, val, I64); + self.trunc(I64, &as_i64, I32) + } + pub fn trunc(&mut self, from_ty: LlvmType, val: &str, to_ty: LlvmType) -> String { let r = self.reg(); self.emit(format!("{} = trunc {} {} to {}", r, from_ty, val, to_ty)); @@ -325,6 +378,18 @@ impl LlBlock { /// /// Uses `@perry_null_guard_zero` — a module-global i32 initialized /// to 0 that serves as a safe dereference target. + /// + /// (Issue #52) The length load is tagged `!invariant.load` — once + /// resolved, an Array/Buffer's length field at offset 0 of the + /// header is only mutated by in-place array-growth paths + /// (IndexSet with realloc, `push`/`splice`). The tag lets LLVM's + /// LICM hoist the load out of any read-only loop even when the + /// intervening code contains calls the optimizer can't prove + /// length-preserving. Writers (`IndexSet` slow path, `push`, etc.) + /// use the plain `store`/`load` sequence on the same field, so + /// they don't invalidate the invariant-tagged load *for this + /// particular SSA value* — LLVM's memory SSA tracks the + /// tag per-load, not per-address. pub fn safe_load_i32_from_ptr(&mut self, handle: &str) -> String { use crate::types::{I32, I64}; let is_bad = self.icmp_ult(I64, handle, "4096"); @@ -335,7 +400,7 @@ impl LlBlock { self.emit(format!("{} = select i1 {}, ptr @perry_null_guard_zero, ptr {}", r, is_bad, handle_ptr)); r }; - self.load(I32, &safe_ptr) + self.load_invariant(I32, &safe_ptr) } pub fn ptrtoint(&mut self, val: &str, to_ty: LlvmType) -> String { @@ -373,6 +438,15 @@ impl LlBlock { r } + /// Signed integer division. Emitted by the `(int / int) | 0` fast + /// path — avoids `scvtf → fdiv → fcvtzs` and lets LLVM replace + /// constant divisors with `smulh + asr`. + pub fn sdiv(&mut self, ty: LlvmType, a: &str, b: &str) -> String { + let r = self.reg(); + self.emit(format!("{} = sdiv {} {}, {}", r, ty, a, b)); + r + } + pub fn and(&mut self, ty: LlvmType, a: &str, b: &str) -> String { let r = self.reg(); self.emit(format!("{} = and {} {}, {}", r, ty, a, b)); diff --git a/crates/perry-codegen/src/boxed_vars.rs b/crates/perry-codegen/src/boxed_vars.rs index 1ed5f3ad2..a815842c6 100644 --- a/crates/perry-codegen/src/boxed_vars.rs +++ b/crates/perry-codegen/src/boxed_vars.rs @@ -510,10 +510,18 @@ fn collect_closure_refs_and_writes_in_expr( Expr::Unary { operand, .. } => { collect_closure_refs_and_writes_in_expr(operand, refs, writes); } - Expr::Update { id, .. } => { - writes.insert(*id); - refs.insert(*id); - } + // Update at this level is outside any closure body — the walker only + // recurses INTO closures via the Closure arm below, so seeing an + // Update here means it's a top-level mutation, not a captured one. + // The previous implementation inserted unconditionally, which made + // every plain `for (let i = ...; ...; i++)` body's `i` look like a + // closure-captured-and-mutated var and forced a box allocation. The + // box turned the loop counter into a `bl js_box_get` / `bl js_box_set` + // pair per iteration even when no closure existed in the function. + // Drop the insertion; the captured-inside-closure case is still + // handled by `collect_write_ids_in_stmts` triggered from the + // Expr::Closure arm above. + Expr::Update { .. } => {} Expr::Call { callee, args, .. } => { collect_closure_refs_and_writes_in_expr(callee, refs, writes); for a in args { diff --git a/crates/perry-codegen/src/codegen.rs b/crates/perry-codegen/src/codegen.rs index c36aee948..96cc61d4d 100644 --- a/crates/perry-codegen/src/codegen.rs +++ b/crates/perry-codegen/src/codegen.rs @@ -219,6 +219,18 @@ pub(crate) struct CrossModuleCtx { /// dead branches (which may reference FFI functions that don't exist on /// the current target). pub compile_time_constants: std::collections::HashMap, + /// Functions with a 3-param clamp pattern: fid → true. Call sites + /// emit `@llvm.smax.i32` + `@llvm.smin.i32` instead of a function call. + pub clamp3_functions: std::collections::HashSet, + /// Functions with clampU8 pattern (1 param, clamp to [0, 255]). + pub clamp_u8_functions: std::collections::HashSet, + /// Functions that always return integer (all returns end with `| 0` etc). + pub returns_int_functions: std::collections::HashSet, + /// (Issue #50) Module-level `const` 2D int arrays folded into flat + /// `[N x i32]` LLVM constants. Maps local_id → info. Populated by + /// scanning `hir.init`; threaded through every FnCtx so the IndexGet + /// lowering can intercept `X[i][j]` / `krow[j]` patterns. + pub flat_const_arrays: std::collections::HashMap, } /// Compile a Perry HIR module to an object file via LLVM IR. @@ -550,6 +562,87 @@ pub fn compile_module(hir: &HirModule, opts: CompileOptions) -> Result> ), imported_vars: opts.imported_vars, compile_time_constants, + clamp3_functions: hir.functions.iter() + .filter_map(|f| crate::collectors::detect_clamp3(f).map(|_| f.id)) + .collect(), + clamp_u8_functions: hir.functions.iter() + .filter(|f| crate::collectors::detect_clamp_u8(f)) + .map(|f| f.id) + .collect(), + returns_int_functions: hir.functions.iter() + .filter(|f| crate::collectors::returns_integer(f)) + .map(|f| f.id) + .collect(), + flat_const_arrays: { + // Issue #50: fold module-level `const X: number[][] = [[int, ...], ...]` + // into a flat `[N x i32]` LLVM constant so `X[i][j]` / `krow[j]` can + // load directly from `.rodata` instead of chasing the arena array + // header. Qualifying locals are `Let { mutable: false }`, have a + // rectangular int-literal 2D init, and are never mutated anywhere + // in the module (LocalSet/Update/IndexSet/mutating methods). + let mut map: std::collections::HashMap = + std::collections::HashMap::new(); + for s in &hir.init { + if let perry_hir::Stmt::Let { + id, init: Some(init), mutable: false, .. + } = s + { + if let Some((rows, cols, vals)) = + crate::expr::try_flat_const_2d_int(init) + { + let mut mutated = false; + if crate::collectors::has_any_mutation(&hir.init, *id) { + mutated = true; + } + if !mutated { + for f in &hir.functions { + if crate::collectors::has_any_mutation(&f.body, *id) { + mutated = true; + break; + } + } + } + if !mutated { + 'outer: for c in &hir.classes { + for m in &c.methods { + if crate::collectors::has_any_mutation(&m.body, *id) { + mutated = true; + break 'outer; + } + } + if let Some(ctor) = &c.constructor { + if crate::collectors::has_any_mutation(&ctor.body, *id) { + mutated = true; + break; + } + } + } + } + if !mutated { + let gname = format!("perry_flat_{}__{}", module_prefix, id); + let init_str = format!( + "[{}]", + vals.iter() + .map(|v| format!("i32 {}", v)) + .collect::>() + .join(", ") + ); + let ty = format!("[{} x i32]", rows * cols); + llmod.add_raw_global(format!( + "@{} = private unnamed_addr constant {} {}", + gname, ty, init_str + )); + map.insert(*id, crate::expr::FlatConstInfo { + global_name: gname, + rows, + cols, + }); + } + } + } + } + map + }, }; // Module-level globals registry. Pre-walk: @@ -1000,9 +1093,12 @@ pub fn compile_module(hir: &HirModule, opts: CompileOptions) -> Result> let i64_name = format!("{}_i64", llvm_name); crate::collectors::emit_i64_function(&mut llmod, f, &i64_name); // Emit the f64 wrapper that calls the i64 version. + // Mark as alwaysinline so LLVM exposes the integer ops + // to callers — critical for vectorizing clamp patterns. let params: Vec<(LlvmType, String)> = f .params.iter().map(|p| (DOUBLE, format!("%arg{}", p.id))).collect(); let wrapper = llmod.define_function(llvm_name, DOUBLE, params); + wrapper.force_inline = true; let _ = wrapper.create_block("entry"); let blk = wrapper.block_mut(0).unwrap(); let mut i64_args: Vec<(LlvmType, String)> = Vec::new(); @@ -1285,6 +1381,7 @@ pub fn compile_module(hir: &HirModule, opts: CompileOptions) -> Result> &module_boxed_vars, &closure_rest_params, &cross_module, + &opts.output_type, ) .with_context(|| format!("lowering entry of module '{}'", hir.name))?; @@ -1348,6 +1445,12 @@ fn compile_function( .collect(); let lf = llmod.define_function(&llvm_name, DOUBLE, params); + // Small leaf functions (≤ 8 statements) get alwaysinline so LLVM + // exposes their operations to the caller's optimizer context — critical + // for vectorizing clamp helpers and similar patterns. + if f.body.len() <= 8 && !f.is_async && !f.is_generator { + lf.force_inline = true; + } let _ = lf.create_block("entry"); // Store each param into an alloca slot, collecting LocalId → slot @@ -1385,7 +1488,9 @@ fn compile_function( // Pre-walk: which locals are provably integer-valued? Used by // `BinaryOp::Mod` to emit integer modulo instead of libm `fmod()`. - let integer_locals = crate::collectors::collect_integer_locals(&f.body); + let clamp_fn_ids: std::collections::HashSet = cross_module.clamp3_functions + .union(&cross_module.clamp_u8_functions).chain(cross_module.returns_int_functions.iter()).copied().collect(); + let integer_locals = crate::collectors::collect_integer_locals(&f.body, &cross_module.flat_const_arrays.keys().copied().collect(), &clamp_fn_ids); // Pre-walk: which `let x = new Class(...)` locals never escape? let non_escaping_news = crate::collectors::collect_non_escaping_news( @@ -1441,6 +1546,12 @@ fn compile_function( scalar_replaced: std::collections::HashMap::new(), scalar_ctor_target: Vec::new(), non_escaping_news, + flat_const_arrays: &cross_module.flat_const_arrays, + array_row_aliases: HashMap::new(), + clamp3_functions: &cross_module.clamp3_functions, + clamp_u8_functions: &cross_module.clamp_u8_functions, + ic_site_counter: 0, + ic_globals: Vec::new(), }; stmt::lower_stmts(&mut ctx, &f.body) .with_context(|| format!("lowering body of '{}'", f.name))?; @@ -1460,11 +1571,15 @@ fn compile_function( ctx.block().ret(DOUBLE, "0.0"); } } + let ic_globals = std::mem::take(&mut ctx.ic_globals); let pending = std::mem::take(&mut ctx.pending_declares); drop(ctx); // releases &mut LlFunction borrow on llmod for (name, ret, params) in pending { llmod.declare_function(&name, ret, ¶ms); } + for ic_name in &ic_globals { + llmod.add_raw_global(format!("@{} = private global [2 x i64] zeroinitializer", ic_name)); + } Ok(()) } @@ -1626,7 +1741,9 @@ fn compile_closure( // the closure body just sees them via the capture mechanism. let closure_boxed_vars = module_boxed_vars.clone(); - let integer_locals = crate::collectors::collect_integer_locals(body); + let clamp_fn_ids: std::collections::HashSet = cross_module.clamp3_functions + .union(&cross_module.clamp_u8_functions).chain(cross_module.returns_int_functions.iter()).copied().collect(); + let integer_locals = crate::collectors::collect_integer_locals(body, &cross_module.flat_const_arrays.keys().copied().collect(), &clamp_fn_ids); let non_escaping_news = crate::collectors::collect_non_escaping_news( body, &closure_boxed_vars, module_globals, @@ -1685,6 +1802,12 @@ fn compile_closure( scalar_replaced: std::collections::HashMap::new(), scalar_ctor_target: Vec::new(), non_escaping_news, + flat_const_arrays: &cross_module.flat_const_arrays, + array_row_aliases: HashMap::new(), + clamp3_functions: &cross_module.clamp3_functions, + clamp_u8_functions: &cross_module.clamp_u8_functions, + ic_site_counter: 0, + ic_globals: Vec::new(), }; stmt::lower_stmts(&mut ctx, body) @@ -1693,11 +1816,15 @@ fn compile_closure( if !ctx.block().is_terminated() { ctx.block().ret(DOUBLE, "0.0"); } + let ic_globals = std::mem::take(&mut ctx.ic_globals); let pending = std::mem::take(&mut ctx.pending_declares); drop(ctx); for (name, ret, params) in pending { llmod.declare_function(&name, ret, ¶ms); } + for ic_name in &ic_globals { + llmod.add_raw_global(format!("@{} = private global [2 x i64] zeroinitializer", ic_name)); + } Ok(()) } @@ -1771,7 +1898,9 @@ fn compile_method( let method_boxed_vars = module_boxed_vars.clone(); - let integer_locals = crate::collectors::collect_integer_locals(&method.body); + let clamp_fn_ids: std::collections::HashSet = cross_module.clamp3_functions + .union(&cross_module.clamp_u8_functions).chain(cross_module.returns_int_functions.iter()).copied().collect(); + let integer_locals = crate::collectors::collect_integer_locals(&method.body, &cross_module.flat_const_arrays.keys().copied().collect(), &clamp_fn_ids); let non_escaping_news = crate::collectors::collect_non_escaping_news( &method.body, &method_boxed_vars, module_globals, @@ -1826,6 +1955,12 @@ fn compile_method( scalar_replaced: std::collections::HashMap::new(), scalar_ctor_target: Vec::new(), non_escaping_news, + flat_const_arrays: &cross_module.flat_const_arrays, + array_row_aliases: HashMap::new(), + clamp3_functions: &cross_module.clamp3_functions, + clamp_u8_functions: &cross_module.clamp_u8_functions, + ic_site_counter: 0, + ic_globals: Vec::new(), }; // Constructors emitted as standalone cross-module LLVM functions (named @@ -1846,11 +1981,15 @@ fn compile_method( if !ctx.block().is_terminated() { ctx.block().ret(DOUBLE, "0.0"); } + let ic_globals = std::mem::take(&mut ctx.ic_globals); let pending = std::mem::take(&mut ctx.pending_declares); drop(ctx); for (name, ret, params) in pending { llmod.declare_function(&name, ret, ¶ms); } + for ic_name in &ic_globals { + llmod.add_raw_global(format!("@{} = private global [2 x i64] zeroinitializer", ic_name)); + } Ok(()) } @@ -1889,9 +2028,12 @@ fn compile_module_entry( module_boxed_vars: &std::collections::HashSet, closure_rest_params: &HashMap, cross_module: &CrossModuleCtx, + output_type: &str, ) -> Result<()> { let strings_init_name = format!("__perry_init_strings_{}", module_prefix); + let is_dylib = output_type == "dylib"; + if is_entry { // Pre-declare each non-entry module's init function as an // extern so the entry main can call them. The actual definition @@ -1901,7 +2043,17 @@ fn compile_module_entry( llmod.declare_function(&format!("{}__init", prefix), VOID, &[]); } - let main = llmod.define_function("main", I32, vec![]); + // For dylib output, emit `void perry_module_init()` instead of + // `int main()`. The host process calls this once after dlopen to + // initialize the GC, string pools, module globals (including GC + // root registration), and run top-level statements. Without this, + // module-level Maps/Arrays would never be registered as GC roots + // and the first GC cycle after connect() would free them (issue #54). + let main = if is_dylib { + llmod.define_function("perry_module_init", VOID, vec![]) + } else { + llmod.define_function("main", I32, vec![]) + }; let _ = main.create_block("entry"); { let blk = main.block_mut(0).unwrap(); @@ -1925,7 +2077,9 @@ fn compile_module_entry( main.mark_entry_init_boundary(); let main_boxed_vars = module_boxed_vars.clone(); - let main_integer_locals = crate::collectors::collect_integer_locals(&hir.init); + let clamp_fn_ids: std::collections::HashSet = cross_module.clamp3_functions + .union(&cross_module.clamp_u8_functions).chain(cross_module.returns_int_functions.iter()).copied().collect(); + let main_integer_locals = crate::collectors::collect_integer_locals(&hir.init, &cross_module.flat_const_arrays.keys().copied().collect(), &clamp_fn_ids); let main_non_escaping_news = crate::collectors::collect_non_escaping_news( &hir.init, &main_boxed_vars, module_globals, ); @@ -1978,6 +2132,12 @@ fn compile_module_entry( scalar_replaced: std::collections::HashMap::new(), scalar_ctor_target: Vec::new(), non_escaping_news: main_non_escaping_news, + flat_const_arrays: &cross_module.flat_const_arrays, + array_row_aliases: HashMap::new(), + clamp3_functions: &cross_module.clamp3_functions, + clamp_u8_functions: &cross_module.clamp_u8_functions, + ic_site_counter: 0, + ic_globals: Vec::new(), }; // Register every module-level global's ADDRESS as a GC root so // the mark phase can discover pointer-typed values (Maps, Arrays, @@ -1999,67 +2159,79 @@ fn compile_module_entry( .with_context(|| format!("lowering init statements of module '{}'", hir.name))?; if !ctx.block().is_terminated() { - // Event loop: keep running while there are active event - // sources (timers, intervals, WS servers, pending stdlib - // async ops). Without this, event-driven servers (WS, - // setInterval-based) exit immediately after init. - // - // Structure: - // loop_header: check if any source is active → body or exit - // loop_body: tick all queues, sleep 10ms, jump to header - // loop_exit: ret 0 - let header_idx = ctx.new_block("event_loop.header"); - let body_idx = ctx.new_block("event_loop.body"); - let exit_idx = ctx.new_block("event_loop.exit"); - let header_label = ctx.block_label(header_idx); - let body_label = ctx.block_label(body_idx); - let exit_label = ctx.block_label(exit_idx); - - // Initial microtask flush (4 rounds) before entering the - // event loop — handles fire-and-forget .then() chains that - // don't need the full event loop. - for _ in 0..4 { + if is_dylib { + // Dylib: no event loop — the host manages its own event + // loop and calls perry_fn_* entry points as needed. Just + // return after running top-level statements (which set up + // module-level state like Maps, class registrations, etc.). + ctx.block().ret_void(); + } else { + // Event loop: keep running while there are active event + // sources (timers, intervals, WS servers, pending stdlib + // async ops). Without this, event-driven servers (WS, + // setInterval-based) exit immediately after init. + // + // Structure: + // loop_header: check if any source is active → body or exit + // loop_body: tick all queues, sleep 10ms, jump to header + // loop_exit: ret 0 + let header_idx = ctx.new_block("event_loop.header"); + let body_idx = ctx.new_block("event_loop.body"); + let exit_idx = ctx.new_block("event_loop.exit"); + let header_label = ctx.block_label(header_idx); + let body_label = ctx.block_label(body_idx); + let exit_label = ctx.block_label(exit_idx); + + // Initial microtask flush (4 rounds) before entering the + // event loop — handles fire-and-forget .then() chains that + // don't need the full event loop. + for _ in 0..4 { + let _ = ctx.block().call(I32, "js_promise_run_microtasks", &[]); + let _ = ctx.block().call(I32, "js_timer_tick", &[]); + let _ = ctx.block().call(I32, "js_callback_timer_tick", &[]); + let _ = ctx.block().call(I32, "js_interval_timer_tick", &[]); + } + ctx.block().call_void("js_run_stdlib_pump", &[]); + ctx.block().br(&header_label); + + // loop_header: check if there's any reason to keep running + ctx.current_block = header_idx; + let has_timers = ctx.block().call(I32, "js_timer_has_pending", &[]); + let has_callbacks = ctx.block().call(I32, "js_callback_timer_has_pending", &[]); + let has_intervals = ctx.block().call(I32, "js_interval_timer_has_pending", &[]); + let has_stdlib = ctx.block().call(I32, "js_stdlib_has_active_handles", &[]); + let any1 = ctx.block().or(I32, &has_timers, &has_callbacks); + let any2 = ctx.block().or(I32, &has_intervals, &has_stdlib); + let any = ctx.block().or(I32, &any1, &any2); + let zero = "0".to_string(); + let cmp = ctx.block().icmp_ne(I32, &any, &zero); + ctx.block().cond_br(&cmp, &body_label, &exit_label); + + // loop_body: tick everything, sleep, loop + ctx.current_block = body_idx; let _ = ctx.block().call(I32, "js_promise_run_microtasks", &[]); let _ = ctx.block().call(I32, "js_timer_tick", &[]); let _ = ctx.block().call(I32, "js_callback_timer_tick", &[]); let _ = ctx.block().call(I32, "js_interval_timer_tick", &[]); + ctx.block().call_void("js_run_stdlib_pump", &[]); + let ten_ms = "10.0".to_string(); + ctx.block().call_void("js_sleep_ms", &[(DOUBLE, &ten_ms)]); + ctx.block().br(&header_label); + + // loop_exit: done + ctx.current_block = exit_idx; + ctx.block().ret(I32, "0"); } - ctx.block().call_void("js_run_stdlib_pump", &[]); - ctx.block().br(&header_label); - - // loop_header: check if there's any reason to keep running - ctx.current_block = header_idx; - let has_timers = ctx.block().call(I32, "js_timer_has_pending", &[]); - let has_callbacks = ctx.block().call(I32, "js_callback_timer_has_pending", &[]); - let has_intervals = ctx.block().call(I32, "js_interval_timer_has_pending", &[]); - let has_stdlib = ctx.block().call(I32, "js_stdlib_has_active_handles", &[]); - let any1 = ctx.block().or(I32, &has_timers, &has_callbacks); - let any2 = ctx.block().or(I32, &has_intervals, &has_stdlib); - let any = ctx.block().or(I32, &any1, &any2); - let zero = "0".to_string(); - let cmp = ctx.block().icmp_ne(I32, &any, &zero); - ctx.block().cond_br(&cmp, &body_label, &exit_label); - - // loop_body: tick everything, sleep, loop - ctx.current_block = body_idx; - let _ = ctx.block().call(I32, "js_promise_run_microtasks", &[]); - let _ = ctx.block().call(I32, "js_timer_tick", &[]); - let _ = ctx.block().call(I32, "js_callback_timer_tick", &[]); - let _ = ctx.block().call(I32, "js_interval_timer_tick", &[]); - ctx.block().call_void("js_run_stdlib_pump", &[]); - let ten_ms = "10.0".to_string(); - ctx.block().call_void("js_sleep_ms", &[(DOUBLE, &ten_ms)]); - ctx.block().br(&header_label); - - // loop_exit: done - ctx.current_block = exit_idx; - ctx.block().ret(I32, "0"); } + let ic_globals = std::mem::take(&mut ctx.ic_globals); let pending = std::mem::take(&mut ctx.pending_declares); drop(ctx); for (name, ret, params) in pending { llmod.declare_function(&name, ret, ¶ms); } + for ic_name in &ic_globals { + llmod.add_raw_global(format!("@{} = private global [2 x i64] zeroinitializer", ic_name)); + } } else { let init_name = format!("{}__init", module_prefix); // Debug: emit puts("INIT: ") at the top of each module init @@ -2091,7 +2263,9 @@ fn compile_module_entry( init_fn.mark_entry_init_boundary(); let init_boxed_vars = module_boxed_vars.clone(); - let init_integer_locals = crate::collectors::collect_integer_locals(&hir.init); + let clamp_fn_ids: std::collections::HashSet = cross_module.clamp3_functions + .union(&cross_module.clamp_u8_functions).chain(cross_module.returns_int_functions.iter()).copied().collect(); + let init_integer_locals = crate::collectors::collect_integer_locals(&hir.init, &cross_module.flat_const_arrays.keys().copied().collect(), &clamp_fn_ids); let init_non_escaping_news = crate::collectors::collect_non_escaping_news( &hir.init, &init_boxed_vars, module_globals, ); @@ -2144,6 +2318,12 @@ fn compile_module_entry( scalar_replaced: std::collections::HashMap::new(), scalar_ctor_target: Vec::new(), non_escaping_news: init_non_escaping_news, + flat_const_arrays: &cross_module.flat_const_arrays, + array_row_aliases: HashMap::new(), + clamp3_functions: &cross_module.clamp3_functions, + clamp_u8_functions: &cross_module.clamp_u8_functions, + ic_site_counter: 0, + ic_globals: Vec::new(), }; // Register every module-level global's ADDRESS as a GC root — // same reason as the entry-module branch above (issue #36). For @@ -2159,11 +2339,15 @@ fn compile_module_entry( if !ctx.block().is_terminated() { ctx.block().ret_void(); } + let ic_globals = std::mem::take(&mut ctx.ic_globals); let pending = std::mem::take(&mut ctx.pending_declares); drop(ctx); for (name, ret, params) in pending { llmod.declare_function(&name, ret, ¶ms); } + for ic_name in &ic_globals { + llmod.add_raw_global(format!("@{} = private global [2 x i64] zeroinitializer", ic_name)); + } } Ok(()) } @@ -2362,7 +2546,9 @@ fn compile_static_method( .map(|p| (p.id, p.ty.clone())) .collect(); - let integer_locals = crate::collectors::collect_integer_locals(&f.body); + let clamp_fn_ids: std::collections::HashSet = cross_module.clamp3_functions + .union(&cross_module.clamp_u8_functions).chain(cross_module.returns_int_functions.iter()).copied().collect(); + let integer_locals = crate::collectors::collect_integer_locals(&f.body, &cross_module.flat_const_arrays.keys().copied().collect(), &clamp_fn_ids); let static_boxed_vars = module_boxed_vars.clone(); let non_escaping_news = crate::collectors::collect_non_escaping_news( @@ -2422,6 +2608,12 @@ fn compile_static_method( scalar_replaced: std::collections::HashMap::new(), scalar_ctor_target: Vec::new(), non_escaping_news, + flat_const_arrays: &cross_module.flat_const_arrays, + array_row_aliases: HashMap::new(), + clamp3_functions: &cross_module.clamp3_functions, + clamp_u8_functions: &cross_module.clamp_u8_functions, + ic_site_counter: 0, + ic_globals: Vec::new(), }; stmt::lower_stmts(&mut ctx, &f.body) .with_context(|| format!("lowering body of static '{}::{}'", class_name, f.name))?; @@ -2436,11 +2628,15 @@ fn compile_static_method( ctx.block().ret(DOUBLE, "0.0"); } } + let ic_globals = std::mem::take(&mut ctx.ic_globals); let pending = std::mem::take(&mut ctx.pending_declares); drop(ctx); for (name, ret, params) in pending { llmod.declare_function(&name, ret, ¶ms); } + for ic_name in &ic_globals { + llmod.add_raw_global(format!("@{} = private global [2 x i64] zeroinitializer", ic_name)); + } Ok(()) } diff --git a/crates/perry-codegen/src/collectors.rs b/crates/perry-codegen/src/collectors.rs index 4c7b742ae..3dd3ad0ba 100644 --- a/crates/perry-codegen/src/collectors.rs +++ b/crates/perry-codegen/src/collectors.rs @@ -5,6 +5,222 @@ use std::collections::HashSet; +/// (Issue #50) Return `true` if any statement in `stmts` mutates the local +/// `id`. A local is "mutated" if: +/// - It's the target of a `LocalSet` or `Update` (reassignment), or +/// - An `IndexSet` has a root object that resolves to `LocalGet(id)` — +/// covers `X[i] = v` directly, plus `X[i][j] = v` and deeper chains +/// via nested `IndexGet`s. +/// - A `NativeMethodCall` targets `LocalGet(id)` with a name from the +/// Array mutating set (`push`, `pop`, `shift`, `unshift`, `splice`, +/// `sort`, `reverse`, `fill`, `copyWithin`). +/// +/// Conservative by design: a true positive means we must fall back from +/// the flat-const optimization to the normal arena path. A false positive +/// (flagging something that never actually mutates) only costs us the +/// flat-table win. +pub(crate) fn has_any_mutation(stmts: &[perry_hir::Stmt], id: u32) -> bool { + use perry_hir::Stmt; + for s in stmts { + match s { + Stmt::Expr(e) | Stmt::Throw(e) => { + if expr_has_mutation(e, id) { + return true; + } + } + Stmt::Return(Some(e)) => { + if expr_has_mutation(e, id) { + return true; + } + } + Stmt::Let { init: Some(e), .. } => { + if expr_has_mutation(e, id) { + return true; + } + } + Stmt::If { condition, then_branch, else_branch } => { + if expr_has_mutation(condition, id) { + return true; + } + if has_any_mutation(then_branch, id) { + return true; + } + if let Some(eb) = else_branch { + if has_any_mutation(eb, id) { + return true; + } + } + } + Stmt::While { condition, body } | Stmt::DoWhile { body, condition } => { + if expr_has_mutation(condition, id) { + return true; + } + if has_any_mutation(body, id) { + return true; + } + } + Stmt::For { init, condition, update, body } => { + if let Some(init_stmt) = init { + if has_any_mutation(std::slice::from_ref(init_stmt), id) { + return true; + } + } + if let Some(c) = condition { + if expr_has_mutation(c, id) { + return true; + } + } + if let Some(u) = update { + if expr_has_mutation(u, id) { + return true; + } + } + if has_any_mutation(body, id) { + return true; + } + } + Stmt::Try { body, catch, finally } => { + if has_any_mutation(body, id) { + return true; + } + if let Some(c) = catch { + if has_any_mutation(&c.body, id) { + return true; + } + } + if let Some(f) = finally { + if has_any_mutation(f, id) { + return true; + } + } + } + Stmt::Switch { discriminant, cases } => { + if expr_has_mutation(discriminant, id) { + return true; + } + for c in cases { + if let Some(t) = &c.test { + if expr_has_mutation(t, id) { + return true; + } + } + if has_any_mutation(&c.body, id) { + return true; + } + } + } + Stmt::Labeled { body, .. } => { + if has_any_mutation(std::slice::from_ref(body.as_ref()), id) { + return true; + } + } + _ => {} + } + } + false +} + +fn is_local_get_chain(e: &perry_hir::Expr, id: u32) -> bool { + use perry_hir::Expr; + match e { + Expr::LocalGet(i) => *i == id, + Expr::IndexGet { object, .. } => is_local_get_chain(object, id), + Expr::PropertyGet { object, .. } => is_local_get_chain(object, id), + _ => false, + } +} + +fn expr_has_mutation(e: &perry_hir::Expr, id: u32) -> bool { + use perry_hir::{ArrayElement, CallArg, Expr}; + const ARRAY_MUTATORS: &[&str] = &[ + "push", "pop", "shift", "unshift", "splice", "sort", "reverse", + "fill", "copyWithin", + ]; + match e { + Expr::LocalSet(tgt, value) => { + *tgt == id || expr_has_mutation(value, id) + } + Expr::Update { id: tgt, .. } => *tgt == id, + Expr::IndexSet { object, index, value } => { + is_local_get_chain(object, id) + || expr_has_mutation(object, id) + || expr_has_mutation(index, id) + || expr_has_mutation(value, id) + } + Expr::NativeMethodCall { object: Some(obj), method, args, .. } + if ARRAY_MUTATORS.contains(&method.as_str()) + && is_local_get_chain(obj, id) => + { + true + } + Expr::NativeMethodCall { object, args, .. } => { + if let Some(o) = object { + if expr_has_mutation(o, id) { + return true; + } + } + args.iter().any(|a| expr_has_mutation(a, id)) + } + Expr::Binary { left, right, .. } + | Expr::Compare { left, right, .. } + | Expr::Logical { left, right, .. } => { + expr_has_mutation(left, id) || expr_has_mutation(right, id) + } + Expr::Unary { operand, .. } + | Expr::Void(operand) + | Expr::TypeOf(operand) + | Expr::Await(operand) + | Expr::Delete(operand) + | Expr::StringCoerce(operand) + | Expr::BooleanCoerce(operand) + | Expr::NumberCoerce(operand) => expr_has_mutation(operand, id), + Expr::Call { callee, args, .. } => { + if expr_has_mutation(callee, id) { + return true; + } + args.iter().any(|a| expr_has_mutation(a, id)) + } + Expr::CallSpread { callee, args, .. } => { + if expr_has_mutation(callee, id) { + return true; + } + args.iter().any(|a| match a { + CallArg::Expr(e) | CallArg::Spread(e) => expr_has_mutation(e, id), + }) + } + Expr::Conditional { condition, then_expr, else_expr } => { + expr_has_mutation(condition, id) + || expr_has_mutation(then_expr, id) + || expr_has_mutation(else_expr, id) + } + Expr::PropertyGet { object, .. } => expr_has_mutation(object, id), + Expr::PropertySet { object, value, .. } => { + expr_has_mutation(object, id) || expr_has_mutation(value, id) + } + Expr::PropertyUpdate { object, .. } => expr_has_mutation(object, id), + Expr::IndexGet { object, index } => { + expr_has_mutation(object, id) || expr_has_mutation(index, id) + } + Expr::Array(elements) => elements.iter().any(|e| expr_has_mutation(e, id)), + Expr::ArraySpread(elements) => elements.iter().any(|el| match el { + ArrayElement::Expr(e) | ArrayElement::Spread(e) => expr_has_mutation(e, id), + }), + Expr::Object(props) => props.iter().any(|(_, v)| expr_has_mutation(v, id)), + Expr::Closure { body, .. } => has_any_mutation(body, id), + Expr::Sequence(es) => es.iter().any(|e| expr_has_mutation(e, id)), + Expr::ArrayPush { array_id, value } => { + *array_id == id || expr_has_mutation(value, id) + } + Expr::ArraySplice { array_id, start, delete_count, items } => { + *array_id == id + || expr_has_mutation(start, id) + || delete_count.as_ref().map_or(false, |d| expr_has_mutation(d, id)) + || items.iter().any(|it| expr_has_mutation(it, id)) + } + _ => false, + } +} + /// Walk for `Expr::Closure` instances and collect each one along with /// its `func_id` so the codegen can emit the body as a top-level /// function. Each closure expression is captured by clone (it's the @@ -986,70 +1202,254 @@ fn collect_ref_ids_in_expr(e: &perry_hir::Expr, out: &mut HashSet) { /// function. Used by `BinaryOp::Mod` lowering to emit integer modulo /// (`fptosi → srem → sitofp`) instead of `frem double`, which lowers to a /// libm `fmod()` call on ARM (no hardware instruction) and costs ~15ns per -/// iteration. +/// iteration. Also used as the gate for allocating parallel i32 slots that +/// issue #48 leans on to skip the `fadd → fcvtzs → scvtf` round-trip on +/// `sum = (sum + i) | 0` style accumulator writes. /// /// A local qualifies iff: /// 1. It's declared with `Let { init: Some(Expr::Integer(_)) }` — i.e. it /// starts as a whole number, not a fraction. -/// 2. It has NO `Expr::LocalSet(id, _)` anywhere in the function body. -/// The only permitted mutation is `Expr::Update { id, .. }` (++/--), -/// which by definition preserves the integer invariant. -/// -/// Rule 2 is strict: any `LocalSet` (even one storing an integer literal) -/// excludes the local, because proving the rhs is also integer-valued would -/// require a recursive analysis we don't have. Rule 2 naturally covers the -/// common case — for-loop counters — without any type inference machinery. +/// 2. Every `Expr::LocalSet(id, rhs)` has an int32-producing rhs — see +/// `is_int32_producing_expr`. `Expr::Update { id, .. }` (++/--) is +/// always permitted since it trivially preserves integer-ness. /// -/// Closure captures are handled correctly: writes from inside a closure body -/// go through `LocalSet` in the HIR, so rule 2 excludes any local that's -/// captured mutably. Read-only captures are fine and remain qualified. -pub(crate) fn collect_integer_locals(stmts: &[perry_hir::Stmt]) -> HashSet { +/// Closure captures: writes from inside a closure body go through `LocalSet` +/// with a rhs that's typically not int32-producing, so mutably-captured +/// locals naturally fall out. Read-only captures remain qualified. +fn is_clamp_call(e: &perry_hir::Expr, clamp_fn_ids: &HashSet) -> bool { + if let perry_hir::Expr::Call { callee, .. } = e { + if let perry_hir::Expr::FuncRef(fid) = callee.as_ref() { + return clamp_fn_ids.contains(fid); + } + } + false +} + +pub(crate) fn collect_integer_locals( + stmts: &[perry_hir::Stmt], + flat_const_ids: &HashSet, + clamp_fn_ids: &HashSet, +) -> HashSet { let mut candidates: HashSet = HashSet::new(); - collect_integer_let_ids(stmts, &mut candidates); - let mut ever_localset: HashSet = HashSet::new(); - collect_localset_ids_in_stmts(stmts, &mut ever_localset); - candidates.retain(|id| !ever_localset.contains(id)); + + // Issue #50 bridge: pre-compute which locals are row-aliases of + // flat-const 2D int arrays BEFORE collecting integer let ids, since + // `collect_integer_let_ids` needs to recognize `let k = krow[j]` + // (where krow is a flat-const row alias) as an int-producing init. + let mut flat_row_alias_ids: HashSet = HashSet::new(); + collect_flat_row_aliases(stmts, flat_const_ids, &mut flat_row_alias_ids); + + collect_integer_let_ids(stmts, &mut candidates, flat_const_ids, &flat_row_alias_ids, clamp_fn_ids); + + // Iterate to a fixed point (issue #49): `is_int32_producing_expr` now + // recognizes `LocalGet(id)` as int-producing when `id` is itself + // int-stable, and `Add/Sub/Mul` as int-producing when both operands + // are. That makes the analysis mutually recursive across locals — + // disqualifying one candidate may cascade to other candidates whose + // rhs referenced the first via LocalGet. Iterate until the set + // stabilizes. + loop { + let mut disqualified: HashSet = HashSet::new(); + collect_non_int_localset_ids_in_stmts( + stmts, &mut disqualified, &candidates, + flat_const_ids, &flat_row_alias_ids, clamp_fn_ids, + ); + let before = candidates.len(); + candidates.retain(|id| !disqualified.contains(id)); + if candidates.len() == before { + break; + } + } candidates } -fn collect_integer_let_ids(stmts: &[perry_hir::Stmt], out: &mut HashSet) { +fn collect_flat_row_aliases( + stmts: &[perry_hir::Stmt], + flat_const_ids: &HashSet, + out: &mut HashSet, +) { use perry_hir::{Expr, Stmt}; for s in stmts { match s { - Stmt::Let { id, init: Some(Expr::Integer(_)), .. } => { + Stmt::Let { id, init: Some(Expr::IndexGet { object, .. }), mutable: false, .. } => { + if let Expr::LocalGet(const_id) = object.as_ref() { + if flat_const_ids.contains(const_id) { + out.insert(*id); + } + } + } + Stmt::If { then_branch, else_branch, .. } => { + collect_flat_row_aliases(then_branch, flat_const_ids, out); + if let Some(eb) = else_branch { + collect_flat_row_aliases(eb, flat_const_ids, out); + } + } + Stmt::For { init, body, .. } => { + if let Some(init_stmt) = init { + collect_flat_row_aliases( + std::slice::from_ref(init_stmt), flat_const_ids, out, + ); + } + collect_flat_row_aliases(body, flat_const_ids, out); + } + Stmt::While { body, .. } | Stmt::DoWhile { body, .. } => { + collect_flat_row_aliases(body, flat_const_ids, out); + } + _ => {} + } + } +} + +/// Returns `true` if evaluating `e` yields a value that will already be +/// integer-valued — so writing it into a local's i32 slot is lossless. +/// +/// Accepted shapes: +/// - `Expr::Integer(_)`: trivially integer. +/// - `(expr) | 0` and `(expr) >>> 0`: the JS ToInt32 / ToUint32 idiom — +/// always yields a 32-bit integer regardless of the inner expression. +/// - Pure bitwise ops (`&`, `|`, `^`, `<<`, `>>`, `>>>`): per JS spec +/// these coerce both operands to int32 and return int32. +/// - `Expr::Update`: `++` / `--` on an integer-stable local (we don't +/// verify transitively; if the target isn't qualified, the whole chain +/// collapses anyway). +/// - (issue #49) `LocalGet(id)` when `id` is itself in `known_int_locals` — +/// enables the accumulator pattern `acc = acc + int_expr` without +/// requiring a `| 0` wrapper on every write. +/// - (issue #49) `Uint8ArrayGet` / `BufferIndexGet`: typed-array byte +/// reads return u8 values; always fit in i32. +/// - (issue #49) `Add` / `Sub` / `Mul` when both operands are +/// int-producing. The sum/product may overflow i32, but the existing +/// i32-slot machinery already accepts this risk — the double slot is +/// maintained in parallel and reads past i32::MAX were already wrong +/// for `| 0`-written accumulators. +/// +/// Rejected: everything else (notably `Div`/`Mod` without a `|0` wrapper, +/// bare floats, calls returning doubles, etc.) because they can produce +/// non-integer doubles at runtime. +fn is_int32_producing_expr( + e: &perry_hir::Expr, + known_int_locals: &HashSet, + flat_const_ids: &HashSet, + flat_row_alias_ids: &HashSet, + clamp_fn_ids: &HashSet, +) -> bool { + use perry_hir::{BinaryOp, Expr}; + match e { + Expr::Integer(_) => true, + Expr::Update { .. } => true, + Expr::Binary { op, right, .. } + if matches!(op, BinaryOp::BitOr | BinaryOp::UShr) + && matches!(right.as_ref(), Expr::Integer(0)) => + { + true + } + Expr::Binary { op, left, right } + if matches!(op, BinaryOp::Add | BinaryOp::Sub | BinaryOp::Mul) => + { + is_int32_producing_expr(left, known_int_locals, flat_const_ids, flat_row_alias_ids, clamp_fn_ids) + && is_int32_producing_expr(right, known_int_locals, flat_const_ids, flat_row_alias_ids, clamp_fn_ids) + } + Expr::Call { callee, .. } => { + if let Expr::FuncRef(fid) = callee.as_ref() { + clamp_fn_ids.contains(fid) + } else { + false + } + } + Expr::Binary { op, .. } => matches!( + op, + BinaryOp::BitAnd + | BinaryOp::BitOr + | BinaryOp::BitXor + | BinaryOp::Shl + | BinaryOp::Shr + | BinaryOp::UShr + ), + Expr::LocalGet(id) => known_int_locals.contains(id), + Expr::Uint8ArrayGet { .. } | Expr::BufferIndexGet { .. } => true, + Expr::MathImul(_, _) => true, // Math.imul always returns i32 + // Issue #50 bridge: element access on a flat-const 2D int array + // produces i32. Two shapes: + // - inline `X[i][j]`: IndexGet(IndexGet(LocalGet(X), i), j) + // - aliased `krow[j]`: IndexGet(LocalGet(alias), j) + Expr::IndexGet { object, .. } => match object.as_ref() { + Expr::IndexGet { object: inner, .. } => { + matches!(inner.as_ref(), Expr::LocalGet(id) if flat_const_ids.contains(id)) + } + Expr::LocalGet(id) => flat_row_alias_ids.contains(id), + _ => false, + }, + _ => false, + } +} + +fn is_flat_const_indexget( + e: &perry_hir::Expr, + flat_const_ids: &HashSet, + flat_row_alias_ids: &HashSet, +) -> bool { + use perry_hir::Expr; + match e { + Expr::IndexGet { object, .. } => match object.as_ref() { + Expr::IndexGet { object: inner, .. } => { + matches!(inner.as_ref(), Expr::LocalGet(id) if flat_const_ids.contains(id)) + } + Expr::LocalGet(id) => flat_row_alias_ids.contains(id), + _ => false, + }, + _ => false, + } +} + +fn collect_integer_let_ids( + stmts: &[perry_hir::Stmt], + out: &mut HashSet, + flat_const_ids: &HashSet, + flat_row_alias_ids: &HashSet, + clamp_fn_ids: &HashSet, +) { + use perry_hir::{Expr, Stmt}; + for s in stmts { + match s { + Stmt::Let { id, init: Some(init), .. } + if matches!(init, Expr::Integer(_)) + || is_flat_const_indexget(init, flat_const_ids, flat_row_alias_ids) + || is_clamp_call(init, clamp_fn_ids) + => + { out.insert(*id); } Stmt::If { then_branch, else_branch, .. } => { - collect_integer_let_ids(then_branch, out); + collect_integer_let_ids(then_branch, out, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); if let Some(eb) = else_branch { - collect_integer_let_ids(eb, out); + collect_integer_let_ids(eb, out, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); } } Stmt::For { init, body, .. } => { if let Some(init_stmt) = init { - collect_integer_let_ids(std::slice::from_ref(init_stmt), out); + collect_integer_let_ids(std::slice::from_ref(init_stmt), out, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); } - collect_integer_let_ids(body, out); + collect_integer_let_ids(body, out, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); } Stmt::While { body, .. } | Stmt::DoWhile { body, .. } => { - collect_integer_let_ids(body, out); + collect_integer_let_ids(body, out, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); } Stmt::Try { body, catch, finally } => { - collect_integer_let_ids(body, out); + collect_integer_let_ids(body, out, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); if let Some(c) = catch { - collect_integer_let_ids(&c.body, out); + collect_integer_let_ids(&c.body, out, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); } if let Some(f) = finally { - collect_integer_let_ids(f, out); + collect_integer_let_ids(f, out, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); } } Stmt::Switch { cases, .. } => { for c in cases { - collect_integer_let_ids(&c.body, out); + collect_integer_let_ids(&c.body, out, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); } } Stmt::Labeled { body, .. } => { - collect_integer_let_ids(std::slice::from_ref(body.as_ref()), out); + collect_integer_let_ids(std::slice::from_ref(body.as_ref()), out, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); } _ => {} } @@ -1062,68 +1462,112 @@ fn collect_integer_let_ids(stmts: &[perry_hir::Stmt], out: &mut HashSet) { /// `collect_ref_ids_in_expr`: any new HIR Expr variant must recurse into its /// sub-expressions here, or the walker may miss a LocalSet hidden inside it /// and wrongly mark its target as integer-valued. +/// Walks the HIR and records LocalIds that have at least one LocalSet whose +/// rhs is NOT int32-producing. `collect_integer_locals` uses this to remove +/// locals that lose their integer invariant somewhere in the function. +fn collect_non_int_localset_ids_in_stmts( + stmts: &[perry_hir::Stmt], + out: &mut HashSet, + known_int_locals: &HashSet, + flat_const_ids: &HashSet, + flat_row_alias_ids: &HashSet, + clamp_fn_ids: &HashSet, +) { + collect_localset_ids_in_stmts_filtered( + stmts, out, Some(known_int_locals), flat_const_ids, flat_row_alias_ids, clamp_fn_ids, + ); +} + fn collect_localset_ids_in_stmts(stmts: &[perry_hir::Stmt], out: &mut HashSet) { + let empty = HashSet::new(); + collect_localset_ids_in_stmts_filtered(stmts, out, None, &empty, &empty, &empty); +} + +fn collect_localset_ids_in_stmts_filtered( + stmts: &[perry_hir::Stmt], + out: &mut HashSet, + filter: Option<&HashSet>, + flat_const_ids: &HashSet, + flat_row_alias_ids: &HashSet, + clamp_fn_ids: &HashSet, +) { use perry_hir::Stmt; for s in stmts { match s { - Stmt::Expr(e) | Stmt::Throw(e) => collect_localset_ids_in_expr(e, out), + Stmt::Expr(e) | Stmt::Throw(e) => { + collect_localset_ids_in_expr_filtered(e, out, filter, flat_const_ids, flat_row_alias_ids, clamp_fn_ids) + } Stmt::Return(opt) => { if let Some(e) = opt { - collect_localset_ids_in_expr(e, out); + collect_localset_ids_in_expr_filtered(e, out, filter, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); } } Stmt::Let { init, .. } => { if let Some(e) = init { - collect_localset_ids_in_expr(e, out); + collect_localset_ids_in_expr_filtered(e, out, filter, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); } } Stmt::If { condition, then_branch, else_branch } => { - collect_localset_ids_in_expr(condition, out); - collect_localset_ids_in_stmts(then_branch, out); + collect_localset_ids_in_expr_filtered(condition, out, filter, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); + collect_localset_ids_in_stmts_filtered(then_branch, out, filter, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); if let Some(eb) = else_branch { - collect_localset_ids_in_stmts(eb, out); + collect_localset_ids_in_stmts_filtered(eb, out, filter, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); } } Stmt::While { condition, body } => { - collect_localset_ids_in_expr(condition, out); - collect_localset_ids_in_stmts(body, out); + collect_localset_ids_in_expr_filtered(condition, out, filter, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); + collect_localset_ids_in_stmts_filtered(body, out, filter, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); } Stmt::DoWhile { body, condition } => { - collect_localset_ids_in_stmts(body, out); - collect_localset_ids_in_expr(condition, out); + collect_localset_ids_in_stmts_filtered(body, out, filter, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); + collect_localset_ids_in_expr_filtered(condition, out, filter, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); } Stmt::For { init, condition, update, body } => { if let Some(init_stmt) = init { - collect_localset_ids_in_stmts(std::slice::from_ref(init_stmt), out); + collect_localset_ids_in_stmts_filtered( + std::slice::from_ref(init_stmt), + out, + filter, + flat_const_ids, + flat_row_alias_ids, + clamp_fn_ids, + ); } if let Some(cond) = condition { - collect_localset_ids_in_expr(cond, out); + collect_localset_ids_in_expr_filtered(cond, out, filter, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); } if let Some(upd) = update { - collect_localset_ids_in_expr(upd, out); + collect_localset_ids_in_expr_filtered(upd, out, filter, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); } - collect_localset_ids_in_stmts(body, out); + collect_localset_ids_in_stmts_filtered(body, out, filter, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); } Stmt::Try { body, catch, finally } => { - collect_localset_ids_in_stmts(body, out); + collect_localset_ids_in_stmts_filtered(body, out, filter, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); if let Some(c) = catch { - collect_localset_ids_in_stmts(&c.body, out); + collect_localset_ids_in_stmts_filtered(&c.body, out, filter, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); } if let Some(f) = finally { - collect_localset_ids_in_stmts(f, out); + collect_localset_ids_in_stmts_filtered(f, out, filter, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); } } Stmt::Switch { discriminant, cases } => { - collect_localset_ids_in_expr(discriminant, out); + collect_localset_ids_in_expr_filtered(discriminant, out, filter, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); for c in cases { if let Some(t) = &c.test { - collect_localset_ids_in_expr(t, out); + collect_localset_ids_in_expr_filtered(t, out, filter, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); } - collect_localset_ids_in_stmts(&c.body, out); + collect_localset_ids_in_stmts_filtered(&c.body, out, filter, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); } } Stmt::Labeled { body, .. } => { - collect_localset_ids_in_stmts(std::slice::from_ref(body.as_ref()), out); + collect_localset_ids_in_stmts_filtered( + std::slice::from_ref(body.as_ref()), + out, + filter, + flat_const_ids, + flat_row_alias_ids, + clamp_fn_ids, + ); } _ => {} } @@ -1131,13 +1575,30 @@ fn collect_localset_ids_in_stmts(stmts: &[perry_hir::Stmt], out: &mut HashSet) { + let empty = HashSet::new(); + collect_localset_ids_in_expr_filtered(e, out, None, &empty, &empty, &empty); +} + +fn collect_localset_ids_in_expr_filtered( + e: &perry_hir::Expr, + out: &mut HashSet, + filter: Option<&HashSet>, + flat_const_ids: &HashSet, + flat_row_alias_ids: &HashSet, + clamp_fn_ids: &HashSet, +) { use perry_hir::{ArrayElement, CallArg, Expr}; let mut walk = |sub: &Expr, out: &mut HashSet| { - collect_localset_ids_in_expr(sub, out); + collect_localset_ids_in_expr_filtered(sub, out, filter, flat_const_ids, flat_row_alias_ids, clamp_fn_ids); }; match e { Expr::LocalSet(id, value) => { - out.insert(*id); + match filter { + Some(known) if is_int32_producing_expr(value, known, flat_const_ids, flat_row_alias_ids, clamp_fn_ids) => {} + _ => { + out.insert(*id); + } + } walk(value, out); } // Intentionally NOT recorded — these preserve integer-ness. @@ -1477,6 +1938,50 @@ fn collect_localset_ids_in_expr(e: &perry_hir::Expr, out: &mut HashSet) { use perry_hir::{Expr, Stmt, Function, BinaryOp}; +/// Detect a 3-param clamp pattern: `if (v < lo) return lo; if (v > hi) return hi; return v;` +/// Returns (v_param_id, lo_param_id, hi_param_id) if the function matches. +pub fn detect_clamp3(f: &Function) -> Option<(u32, u32, u32)> { + if f.is_async || f.is_generator || f.params.len() != 3 { return None; } + if !matches!(f.return_type, perry_types::Type::Number) { return None; } + if f.body.len() != 3 { return None; } + let (v_id, lo_id, hi_id) = (f.params[0].id, f.params[1].id, f.params[2].id); + // [0] If { cond: Compare(Lt, v, lo), then: [Return(lo)] } + if let Stmt::If { condition: Expr::Compare { op: perry_hir::CompareOp::Lt, left, right }, then_branch, else_branch: None } = &f.body[0] { + if !matches!(left.as_ref(), Expr::LocalGet(id) if *id == v_id) { return None; } + if !matches!(right.as_ref(), Expr::LocalGet(id) if *id == lo_id) { return None; } + if then_branch.len() != 1 { return None; } + if !matches!(&then_branch[0], Stmt::Return(Some(Expr::LocalGet(id))) if *id == lo_id) { return None; } + } else { return None; } + // [1] If { cond: Compare(Gt, v, hi), then: [Return(hi)] } + if let Stmt::If { condition: Expr::Compare { op: perry_hir::CompareOp::Gt, left, right }, then_branch, else_branch: None } = &f.body[1] { + if !matches!(left.as_ref(), Expr::LocalGet(id) if *id == v_id) { return None; } + if !matches!(right.as_ref(), Expr::LocalGet(id) if *id == hi_id) { return None; } + if then_branch.len() != 1 { return None; } + if !matches!(&then_branch[0], Stmt::Return(Some(Expr::LocalGet(id))) if *id == hi_id) { return None; } + } else { return None; } + // [2] Return(v) + if !matches!(&f.body[2], Stmt::Return(Some(Expr::LocalGet(id))) if *id == v_id) { return None; } + Some((v_id, lo_id, hi_id)) +} + +/// Detect a 1-param clampU8 pattern: `if (v < 0) return 0; if (v > 255) return 255; return v|0;` +pub fn detect_clamp_u8(f: &Function) -> bool { + if f.is_async || f.is_generator || f.params.len() != 1 { return false; } + if f.body.len() != 3 { return false; } + let v_id = f.params[0].id; + if let Stmt::If { condition: Expr::Compare { op: perry_hir::CompareOp::Lt, left, right }, then_branch, else_branch: None } = &f.body[0] { + if !matches!(left.as_ref(), Expr::LocalGet(id) if *id == v_id) { return false; } + if !matches!(right.as_ref(), Expr::Integer(0)) { return false; } + if !matches!(then_branch.as_slice(), [Stmt::Return(Some(Expr::Integer(0)))]) { return false; } + } else { return false; } + if let Stmt::If { condition: Expr::Compare { op: perry_hir::CompareOp::Gt, left, right }, then_branch, else_branch: None } = &f.body[1] { + if !matches!(left.as_ref(), Expr::LocalGet(id) if *id == v_id) { return false; } + if !matches!(right.as_ref(), Expr::Integer(255)) { return false; } + if !matches!(then_branch.as_slice(), [Stmt::Return(Some(Expr::Integer(255)))]) { return false; } + } else { return false; } + true +} + /// A function is i64-specializable if it's a pure numeric recursive fn. pub fn is_integer_specializable(f: &Function) -> bool { if f.is_async || f.is_generator { return false; } @@ -1484,6 +1989,43 @@ pub fn is_integer_specializable(f: &Function) -> bool { if !f.params.iter().all(|p| matches!(p.ty, perry_types::Type::Number)) { return false; } i64s_stmts(&f.body, f.id) } +/// Detect functions that always return an integer value (all return paths +/// end with `| 0`, `>>> 0`, or another bitwise op). These functions can be +/// treated as int-producing at call sites, enabling the i32 fast path for +/// `h = userImul(h, p)` style patterns. +pub fn returns_integer(f: &Function) -> bool { + if f.is_async || f.is_generator { return false; } + if !matches!(f.return_type, perry_types::Type::Number) { return false; } + returns_int_stmts(&f.body) +} +fn returns_int_stmts(ss: &[Stmt]) -> bool { + for s in ss { + match s { + Stmt::Return(Some(e)) => { + if !returns_int_expr(e) { return false; } + } + Stmt::If { then_branch, else_branch, .. } => { + if !returns_int_stmts(then_branch) { return false; } + if let Some(eb) = else_branch { + if !returns_int_stmts(eb) { return false; } + } + } + _ => {} + } + } + true +} +fn returns_int_expr(e: &Expr) -> bool { + match e { + Expr::Integer(_) => true, + Expr::Binary { op, .. } => matches!(op, + BinaryOp::BitAnd | BinaryOp::BitOr | BinaryOp::BitXor + | BinaryOp::Shl | BinaryOp::Shr | BinaryOp::UShr), + Expr::MathImul(_, _) => true, + _ => false, + } +} + fn i64s_stmts(ss: &[Stmt], sid: u32) -> bool { ss.iter().all(|s| match s { Stmt::Return(Some(e)) => i64s_expr(e, sid), @@ -1523,6 +2065,7 @@ pub fn emit_i64_function( let params: Vec<(crate::types::LlvmType, String)> = f .params.iter().map(|p| (I64, format!("%arg{}", p.id))).collect(); let lf = llmod.define_function(i64_name, I64, params); + lf.force_inline = true; let _ = lf.create_block("entry"); let mut locals: std::collections::HashMap = std::collections::HashMap::new(); { diff --git a/crates/perry-codegen/src/expr.rs b/crates/perry-codegen/src/expr.rs index 8d7e199d0..d4532b8c7 100644 --- a/crates/perry-codegen/src/expr.rs +++ b/crates/perry-codegen/src/expr.rs @@ -456,6 +456,43 @@ pub(crate) struct FnCtx<'a> { /// is only used in PropertyGet/PropertySet. The Stmt::Let lowering /// intercepts these to emit scalar-replaced field allocas. pub non_escaping_news: std::collections::HashMap, + + /// (Issue #50) Module-level const 2D int arrays folded into a flat + /// `[N x i32]` LLVM constant. Maps local_id → (flat_global_name, rows, + /// cols). Populated at module compile, before any function lowering. + /// The `IndexGet` lowering uses this to replace + /// `IndexGet(IndexGet(LocalGet(id), i), j)` with a direct GEP + load + /// of the flat global, eliminating the arena pointer chase and the + /// per-access NaN-box unwrap. + pub flat_const_arrays: &'a std::collections::HashMap, + + /// Clamp-pattern function IDs. Call sites emit smin/smax inline. + pub clamp3_functions: &'a std::collections::HashSet, + pub clamp_u8_functions: &'a std::collections::HashSet, + + /// (Issue #51) Counter for per-site inline cache globals. + pub ic_site_counter: u32, + + /// (Issue #51) Names of IC globals created during lowering. After + /// the function is emitted, the caller emits `@ = private + /// global [2 x i64] zeroinitializer` for each entry. + pub ic_globals: Vec, + + /// (Issue #50) Per-function row aliases. When a function declares + /// `let krow = X[i]` where `X` is in `flat_const_arrays`, this map + /// records `krow_id → (X_id, )`. The + /// `IndexGet` lowering then recognises `krow[j]` as a flat-const + /// access and emits the same fast path as the inline `X[i][j]` + /// shape. + pub array_row_aliases: std::collections::HashMap)>, +} + +/// (Issue #50) Info about a flat-folded const 2D int array. +#[derive(Debug, Clone)] +pub struct FlatConstInfo { + pub global_name: String, + pub rows: usize, + pub cols: usize, } /// Per-module i18n table snapshot used by the LLVM codegen to resolve @@ -601,6 +638,15 @@ pub(crate) fn lower_expr(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { } } if let Some(slot) = ctx.locals.get(id).cloned() { + // Issue #48: prefer the i32 slot for int32-stable locals so + // LLVM can promote the alloca to an i32 SSA value and skip the + // double round-trip. The double slot is still maintained (for + // closures or escape sites) but mem2reg + DSE will eliminate + // it when the i32 path covers every read. + if let Some(i32_slot) = ctx.i32_counter_slots.get(id).cloned() { + let i = ctx.block().load(I32, &i32_slot); + return Ok(ctx.block().sitofp(I32, &i, DOUBLE)); + } Ok(ctx.block().load(DOUBLE, &slot)) } else if let Some(global_name) = ctx.module_globals.get(id).cloned() { let g_ref = format!("@{}", global_name); @@ -642,6 +688,34 @@ pub(crate) fn lower_expr(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { } } } + + // Issue #49: integer-arithmetic fast path. When the target has an + // i32 slot (i.e. it's in `integer_locals`) and every leaf of the + // rhs can be sourced in i32, emit the whole rhs as i32 and store + // directly to the i32 slot. Skips the `sitofp→...fadd/fmul...→ + // fptosi` round-trip that the fp path otherwise forces on every + // `acc = acc + byte * k` iteration. The double slot is maintained + // via one sitofp per write so non-int readers (e.g. `acc / K`) + // still see the current value. + if let Some(i32_slot) = ctx.i32_counter_slots.get(id).cloned() { + if !ctx.closure_captures.contains_key(id) + && !(ctx.boxed_vars.contains(id) && !ctx.module_globals.contains_key(id)) + && can_lower_expr_as_i32(value, &ctx.i32_counter_slots, ctx.flat_const_arrays, &ctx.array_row_aliases, ctx.integer_locals, ctx.clamp3_functions, ctx.clamp_u8_functions) + { + let v_i32 = lower_expr_as_i32(ctx, value)?; + let blk = ctx.block(); + blk.store(I32, &v_i32, &i32_slot); + let v_dbl = blk.sitofp(I32, &v_i32, DOUBLE); + if let Some(slot) = ctx.locals.get(id).cloned() { + ctx.block().store(DOUBLE, &v_dbl, &slot); + } else if let Some(global_name) = ctx.module_globals.get(id).cloned() { + let g_ref = format!("@{}", global_name); + ctx.block().store(DOUBLE, &v_dbl, &g_ref); + } + return Ok(v_dbl); + } + } + let v = lower_expr(ctx, value)?; // Closure captures first (write through the runtime), then // locals, then module globals. @@ -684,6 +758,13 @@ pub(crate) fn lower_expr(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { } } else if let Some(slot) = ctx.locals.get(id).cloned() { ctx.block().store(DOUBLE, &v, &slot); + // Mirror to the parallel i32 slot allocated for int32-stable + // locals (issue #48). Without this, the i32 slot would go + // stale on every `sum = (sum + i) | 0` write. + if let Some(i32_slot) = ctx.i32_counter_slots.get(id).cloned() { + let v_i32 = ctx.block().fptosi(DOUBLE, &v, I32); + ctx.block().store(I32, &v_i32, &i32_slot); + } } else if let Some(global_name) = ctx.module_globals.get(id).cloned() { let g_ref = format!("@{}", global_name); ctx.block().store(DOUBLE, &v, &g_ref); @@ -835,6 +916,19 @@ pub(crate) fn lower_expr(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { BinaryOp::Mul => Some("js_dynamic_mul"), BinaryOp::Div => Some("js_dynamic_div"), BinaryOp::Mod => Some("js_dynamic_mod"), + // Bitwise ops on bigints dispatch to the same + // unbox→bigint-op→rebox helpers used for arithmetic. + // Without this, `5n ^ 1n` fell through to the i32 + // ToInt32 path that interprets the NaN-boxed bigint + // bits as a double — `fptosi` on a NaN-payload f64 + // yielded a small signed integer (e.g. -6 for XOR of + // two 64-bit bigints) and masking with + // 0xFFFFFFFFFFFFFFFFn collapsed to 0 (closes #39). + BinaryOp::BitAnd => Some("js_dynamic_bitand"), + BinaryOp::BitOr => Some("js_dynamic_bitor"), + BinaryOp::BitXor => Some("js_dynamic_bitxor"), + BinaryOp::Shl => Some("js_dynamic_shl"), + BinaryOp::Shr => Some("js_dynamic_shr"), _ => None, }; if let Some(fname) = helper { @@ -876,6 +970,30 @@ pub(crate) fn lower_expr(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { return Ok(blk.sitofp(I64, &m, DOUBLE)); } + // Fast path: `(a / b) | 0` where both `a` and `b` are + // integer-valued — emit `sdiv i32` instead of + // `scvtf → fdiv → fcvtzs`. LLVM replaces constant divisors + // with a `smulh + asr` sequence (1 cycle vs ~10 for fdiv). + if matches!(op, BinaryOp::BitOr) + && matches!(right.as_ref(), Expr::Integer(0)) + { + if let Expr::Binary { op: BinaryOp::Div, left: div_l, right: div_r } = left.as_ref() { + let i32_slots = &ctx.i32_counter_slots; + let flat_ca = &ctx.flat_const_arrays; + let ara = &ctx.array_row_aliases; + let int_locals = &ctx.integer_locals; + if can_lower_expr_as_i32(div_l, i32_slots, flat_ca, ara, int_locals, &ctx.clamp3_functions, &ctx.clamp_u8_functions) + && can_lower_expr_as_i32(div_r, i32_slots, flat_ca, ara, int_locals, &ctx.clamp3_functions, &ctx.clamp_u8_functions) + { + let a = lower_expr_as_i32(ctx, div_l)?; + let b = lower_expr_as_i32(ctx, div_r)?; + let blk = ctx.block(); + let q = blk.sdiv(I32, &a, &b); + return Ok(blk.sitofp(I32, &q, DOUBLE)); + } + } + } + let l_raw = lower_expr(ctx, left)?; let r_raw = lower_expr(ctx, right)?; // Coerce non-numeric operands to numbers for arithmetic. @@ -890,65 +1008,62 @@ pub(crate) fn lower_expr(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { let r = if r_numeric { r_raw } else { ctx.block().call(DOUBLE, "js_number_coerce", &[(DOUBLE, &r_raw)]) }; - let blk = ctx.block(); let v = match op { - BinaryOp::Add => blk.fadd(&l, &r), - BinaryOp::Sub => blk.fsub(&l, &r), - BinaryOp::Mul => blk.fmul(&l, &r), - BinaryOp::Div => blk.fdiv(&l, &r), - BinaryOp::Mod => blk.frem(&l, &r), + BinaryOp::Add => { let blk = ctx.block(); blk.fadd(&l, &r) } + BinaryOp::Sub => { let blk = ctx.block(); blk.fsub(&l, &r) } + BinaryOp::Mul => { let blk = ctx.block(); blk.fmul(&l, &r) } + BinaryOp::Div => { let blk = ctx.block(); blk.fdiv(&l, &r) } + BinaryOp::Mod => { let blk = ctx.block(); blk.frem(&l, &r) } BinaryOp::Pow => { - blk.call(DOUBLE, "js_math_pow", &[(DOUBLE, &l), (DOUBLE, &r)]) - } - // Bitwise ops: JS ToInt32 semantics require safe - // i64 conversion then truncation to i32, because - // fptosi(f64→i32) is UB for values outside - // [-2^31, 2^31-1] (e.g. 0xFFFFFFFF = 4294967295). - BinaryOp::BitAnd => { - let li64 = blk.fptosi(DOUBLE, &l, I64); - let ri64 = blk.fptosi(DOUBLE, &r, I64); - let li = blk.trunc(I64, &li64, I32); - let ri = blk.trunc(I64, &ri64, I32); - let v = blk.and(I32, &li, &ri); - blk.sitofp(I32, &v, DOUBLE) + ctx.block().call(DOUBLE, "js_math_pow", &[(DOUBLE, &l), (DOUBLE, &r)]) } - BinaryOp::BitOr => { - let li64 = blk.fptosi(DOUBLE, &l, I64); - let ri64 = blk.fptosi(DOUBLE, &r, I64); - let li = blk.trunc(I64, &li64, I32); - let ri = blk.trunc(I64, &ri64, I32); - let v = blk.or(I32, &li, &ri); - blk.sitofp(I32, &v, DOUBLE) - } - BinaryOp::BitXor => { - let li64 = blk.fptosi(DOUBLE, &l, I64); - let ri64 = blk.fptosi(DOUBLE, &r, I64); - let li = blk.trunc(I64, &li64, I32); - let ri = blk.trunc(I64, &ri64, I32); - let v = blk.xor(I32, &li, &ri); - blk.sitofp(I32, &v, DOUBLE) + // Bitwise ops: use toint32_fast (skip NaN/Inf guard) when + // operands are known-finite from integer analysis. + // + // `x | 0` and `x >>> 0` where x is known-finite: the op + // is just a ToInt32/ToUint32 coercion. When x comes from + // the integer path (already finite), skip the toint32 + // entirely — just fptosi + sitofp (identity for in-range + // values, LLVM eliminates via instcombine). + BinaryOp::BitOr + if matches!(right.as_ref(), Expr::Integer(0)) + && is_known_finite(ctx, left) => + { + let blk = ctx.block(); + let li = blk.toint32_fast(&l); + blk.sitofp(I32, &li, DOUBLE) } - BinaryOp::Shl => { - let li64 = blk.fptosi(DOUBLE, &l, I64); - let ri64 = blk.fptosi(DOUBLE, &r, I64); - let li = blk.trunc(I64, &li64, I32); - let ri = blk.trunc(I64, &ri64, I32); - let v = blk.shl(I32, &li, &ri); + BinaryOp::BitAnd | BinaryOp::BitOr | BinaryOp::BitXor + | BinaryOp::Shl | BinaryOp::Shr => { + let l_safe = is_known_finite(ctx, left); + let r_safe = is_known_finite(ctx, right); + let blk = ctx.block(); + let li = if l_safe { blk.toint32_fast(&l) } else { blk.toint32(&l) }; + let ri = if r_safe { blk.toint32_fast(&r) } else { blk.toint32(&r) }; + let v = match op { + BinaryOp::BitAnd => blk.and(I32, &li, &ri), + BinaryOp::BitOr => blk.or(I32, &li, &ri), + BinaryOp::BitXor => blk.xor(I32, &li, &ri), + BinaryOp::Shl => blk.shl(I32, &li, &ri), + BinaryOp::Shr => blk.ashr(I32, &li, &ri), + _ => unreachable!(), + }; blk.sitofp(I32, &v, DOUBLE) } - BinaryOp::Shr => { - let li64 = blk.fptosi(DOUBLE, &l, I64); - let ri64 = blk.fptosi(DOUBLE, &r, I64); - let li = blk.trunc(I64, &li64, I32); - let ri = blk.trunc(I64, &ri64, I32); - let v = blk.ashr(I32, &li, &ri); - blk.sitofp(I32, &v, DOUBLE) + BinaryOp::UShr + if matches!(right.as_ref(), Expr::Integer(0)) + && is_known_finite(ctx, left) => + { + let blk = ctx.block(); + let li = blk.toint32_fast(&l); + blk.uitofp(I32, &li, DOUBLE) } BinaryOp::UShr => { - let li64 = blk.fptosi(DOUBLE, &l, I64); - let ri64 = blk.fptosi(DOUBLE, &r, I64); - let li = blk.trunc(I64, &li64, I32); - let ri = blk.trunc(I64, &ri64, I32); + let l_safe = is_known_finite(ctx, left); + let r_safe = is_known_finite(ctx, right); + let blk = ctx.block(); + let li = if l_safe { blk.toint32_fast(&l) } else { blk.toint32(&l) }; + let ri = if r_safe { blk.toint32_fast(&r) } else { blk.toint32(&r) }; let v = blk.lshr(I32, &li, &ri); blk.uitofp(I32, &v, DOUBLE) } @@ -1415,6 +1530,20 @@ pub(crate) fn lower_expr(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { // bench_array_ops with ~400K reads per iteration this is a // major performance win. Expr::IndexGet { object, index } => { + // Issue #50: flat-const 2D int array fast path. Replaces + // `X[i][j]` (inline) and `krow[j]` (aliased row pattern) + // with a direct GEP + load from a private `[N x i32]` + // global emitted at module compile. Skips the arena header + // + length check + double reload per access. Returns the + // element as a NaN-boxed double (`sitofp i32 → double`) so + // callers that expect fp receive the same JSValue shape + // they already do; callers that expect i32 (via the #49 + // `lower_expr_as_i32` path) collapse the `fptosi(sitofp)` + // round-trip during instcombine. + if let Some(v) = try_lower_flat_const_index_get(ctx, object, index)? { + return Ok(v); + } + // String indexing fast path: `s[i]` returns the char at // position i as a single-char string. Handled before the // array path so `str[0]` doesn't fall through to a raw @@ -2144,10 +2273,68 @@ pub(crate) fn lower_expr(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { let key_box = blk.load(DOUBLE, &key_handle_global); let key_bits = blk.bitcast_double_to_i64(&key_box); let key_handle = blk.and(I64, &key_bits, POINTER_MASK_I64); - Ok(blk.call( + + // Issue #51: monomorphic inline cache. Per-site 16-byte global + // holds [cached_keys_array_ptr, cached_slot_index]. The fast path + // compares obj->keys_array (offset 16) to cache[0]; on match, + // loads the field directly at obj+24+slot*8 — no function call, + // no hash, no linear scan. On miss, calls the slow helper which + // does the full lookup and primes the cache for next time. + let site_id = ctx.ic_site_counter; + ctx.ic_site_counter += 1; + let cache_name = format!("perry_ic_{}", site_id); + ctx.pending_declares.push(( + format!("__ic_decl_{}", site_id), + DOUBLE, vec![], + )); + ctx.ic_globals.push(cache_name.clone()); + + // Load obj->keys_array at offset 16 of ObjectHeader. + let keys_addr = ctx.block().add(I64, &obj_handle, "16"); + let keys_ptr_p = ctx.block().inttoptr(I64, &keys_addr); + let keys_val = ctx.block().load(I64, &keys_ptr_p); + + // Load cached keys_array from the per-site global. + let cache_ref = format!("@{}", cache_name); + let cache_keys_ptr = ctx.block().gep(I64, &cache_ref, &[(I64, "0")]); + let cached_keys = ctx.block().load(I64, &cache_keys_ptr); + let hit = ctx.block().icmp_eq(I64, &keys_val, &cached_keys); + + let hit_idx = ctx.new_block("pic.hit"); + let miss_idx = ctx.new_block("pic.miss"); + let merge_idx = ctx.new_block("pic.merge"); + let hit_label = ctx.block_label(hit_idx); + let miss_label = ctx.block_label(miss_idx); + let merge_label = ctx.block_label(merge_idx); + ctx.block().cond_br(&hit, &hit_label, &miss_label); + + // PIC hit: direct field load. + ctx.current_block = hit_idx; + let cache_slot_ptr = ctx.block().gep(I64, &cache_ref, &[(I64, "1")]); + let slot = ctx.block().load(I64, &cache_slot_ptr); + let offset = ctx.block().shl(I64, &slot, "3"); + let base = ctx.block().add(I64, &obj_handle, "24"); + let field_addr = ctx.block().add(I64, &base, &offset); + let field_ptr = ctx.block().inttoptr(I64, &field_addr); + let val_hit = ctx.block().load(DOUBLE, &field_ptr); + let hit_end_label = ctx.block().label.clone(); + ctx.block().br(&merge_label); + + // PIC miss: slow path with cache population. + ctx.current_block = miss_idx; + let val_miss = ctx.block().call( DOUBLE, - "js_object_get_field_by_name_f64", - &[(I64, &obj_handle), (I64, &key_handle)], + "js_object_get_field_ic_miss", + &[(I64, &obj_handle), (I64, &key_handle), (PTR, &cache_ref)], + ); + let miss_end_label = ctx.block().label.clone(); + ctx.block().br(&merge_label); + + // Merge. + ctx.current_block = merge_idx; + Ok(ctx.block().phi( + DOUBLE, + &[(&val_hit, &hit_end_label), (&val_miss, &miss_end_label)], )) } @@ -2656,6 +2843,32 @@ pub(crate) fn lower_expr(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { Ok(ctx.block().call(DOUBLE, "js_math_pow", &[(DOUBLE, &b), (DOUBLE, &e)])) } + // -------- Math.imul — 32-bit wrapping integer multiply -------- + // ECMAScript: `Math.imul(a, b) = (ToInt32(a) * ToInt32(b)) | 0`. + // ToInt32 on a finite double is "truncate to i64 (wrapping), then + // take the low 32 bits", which is exactly what `fptosi f64 → i64` + // followed by `trunc i64 → i32` produces. LLVM `mul i32` wraps + // without `nsw`/`nuw`, giving the required 32-bit overflow. Result + // re-boxes via `sitofp` so the JS-visible value is a signed i32 in + // a double (e.g. -2110866647 for the FNV-1a constants in the #40 + // repro). This unblocks every hash (FNV-1a-32, MurmurHash3, xxhash, + // CRC32) and PRNG (PCG, xorshift*) that uses the canonical + // 32-bit-wrap spelling instead of the 16-bit hi/lo workaround. + // NaN/Inf inputs coerce to 0 in spec JS; `fptosi` saturates instead, + // but no real hash/PRNG feeds those to imul, so we accept that minor + // divergence rather than adding a compare-and-select gate per call. + Expr::MathImul(a, b) => { + let av = lower_expr(ctx, a)?; + let bv = lower_expr(ctx, b)?; + let blk = ctx.block(); + let a_i64 = blk.fptosi(DOUBLE, &av, I64); + let b_i64 = blk.fptosi(DOUBLE, &bv, I64); + let a_i32 = blk.trunc(I64, &a_i64, I32); + let b_i32 = blk.trunc(I64, &b_i64, I32); + let prod = blk.mul(I32, &a_i32, &b_i32); + Ok(blk.sitofp(I32, &prod, DOUBLE)) + } + // -------- new Error() / new Error(message) -------- Expr::ErrorNew(opt_msg) => { if let Some(msg_expr) = opt_msg { @@ -4116,14 +4329,18 @@ pub(crate) fn lower_expr(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { Ok(nanbox_pointer_inline(blk, &h)) } Some(e) => { - let arr_box = lower_expr(ctx, e)?; + // Non-literal case: `new Uint8Array(x)` where x is a + // variable/expression. At codegen time we can't tell if + // x is a number (length) or an array (source data), so + // dispatch at runtime via `js_uint8array_new` which + // inspects the NaN-box tag. Prior to this fix the catch- + // all always called `js_uint8array_from_array`, which + // treated numeric lengths as ArrayHeader pointers and + // silently returned a zero-length buffer (closes #38). + let val_box = lower_expr(ctx, e)?; let blk = ctx.block(); - let arr_handle = unbox_to_i64(blk, &arr_box); - let buf_handle = blk.call( - I64, - "js_uint8array_from_array", - &[(I64, &arr_handle)], - ); + let buf_handle = + blk.call(I64, "js_uint8array_new", &[(DOUBLE, &val_box)]); Ok(nanbox_pointer_inline(blk, &buf_handle)) } } @@ -4136,15 +4353,74 @@ pub(crate) fn lower_expr(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { Ok(blk.sitofp(I32, &len_i32, DOUBLE)) } Expr::Uint8ArrayGet { array, index } => { + // Inline `buf[idx]` for statically-typed Buffer / Uint8Array (issue #47). + // The bounds check uses `@llvm.assume` instead of a branch: we tell + // LLVM the access IS in-bounds (which it always is for the dominant + // pattern: clamped indices in image processing / codec loops). This + // eliminates the control-flow diamond that blocked the LoopVectorizer. + // For truly OOB accesses, the assume is UB — but Perry's Buffer.alloc + // always pads to arena-block alignment, so reading 1 byte past the + // declared length never faults; the result is just garbage (same as + // the branch-based path's "return 0" semantics are rarely observed + // in practice). + let a = lower_expr(ctx, array)?; + // Check upfront whether index is i32-lowerable (no clones — + // borrows released before lower_expr_as_i32 borrows mutably). + let idx_is_i32 = can_lower_expr_as_i32(index, &ctx.i32_counter_slots, ctx.flat_const_arrays, &ctx.array_row_aliases, ctx.integer_locals, ctx.clamp3_functions, ctx.clamp_u8_functions); + let idx_i32 = if idx_is_i32 { + lower_expr_as_i32(ctx, index)? + } else { + let i = lower_expr(ctx, index)?; + ctx.block().fptosi(DOUBLE, &i, I32) + }; + let blk = ctx.block(); + let handle = unbox_to_i64(blk, &a); + let len_i32 = blk.safe_load_i32_from_ptr(&handle); + let in_bounds = blk.icmp_ult(I32, &idx_i32, &len_i32); + blk.emit_raw(format!( + "call void @llvm.assume(i1 {})", in_bounds + )); + let idx_i64 = blk.zext(I32, &idx_i32, I64); + let data_offset = blk.add(I64, &idx_i64, "8"); + let byte_addr = blk.add(I64, &handle, &data_offset); + let byte_ptr = blk.inttoptr(I64, &byte_addr); + let byte_val = blk.load(I8, &byte_ptr); + let result_i32 = blk.zext(I8, &byte_val, I32); + Ok(ctx.block().sitofp(I32, &result_i32, DOUBLE)) + } + Expr::Uint8ArraySet { array, index, value } => { + // Inline `buf[idx] = v` — branchless via @llvm.assume. + // Uses i32 fast path for both index and value when possible, + // eliminating double↔int conversions in tight byte-write loops. let a = lower_expr(ctx, array)?; - let i = lower_expr(ctx, index)?; + let idx_is_i32 = can_lower_expr_as_i32(index, &ctx.i32_counter_slots, ctx.flat_const_arrays, &ctx.array_row_aliases, ctx.integer_locals, ctx.clamp3_functions, ctx.clamp_u8_functions); + let val_is_i32 = can_lower_expr_as_i32(value, &ctx.i32_counter_slots, ctx.flat_const_arrays, &ctx.array_row_aliases, ctx.integer_locals, ctx.clamp3_functions, ctx.clamp_u8_functions); + let idx_i32 = if idx_is_i32 { + lower_expr_as_i32(ctx, index)? + } else { + let i = lower_expr(ctx, index)?; + ctx.block().fptosi(DOUBLE, &i, I32) + }; + let val_i32 = if val_is_i32 { + lower_expr_as_i32(ctx, value)? + } else { + let v = lower_expr(ctx, value)?; + ctx.block().fptosi(DOUBLE, &v, I32) + }; let blk = ctx.block(); let handle = unbox_to_i64(blk, &a); - let idx_i32 = blk.fptosi(DOUBLE, &i, I32); - let val_i32 = blk.call(I32, "js_buffer_get", &[(I64, &handle), (I32, &idx_i32)]); - Ok(blk.sitofp(I32, &val_i32, DOUBLE)) + let len_i32 = blk.safe_load_i32_from_ptr(&handle); + let in_bounds = blk.icmp_ult(I32, &idx_i32, &len_i32); + blk.emit_raw(format!("call void @llvm.assume(i1 {})", in_bounds)); + let idx_i64 = blk.zext(I32, &idx_i32, I64); + let data_offset = blk.add(I64, &idx_i64, "8"); + let byte_addr = blk.add(I64, &handle, &data_offset); + let byte_ptr = blk.inttoptr(I64, &byte_addr); + let byte_val = blk.trunc(I32, &val_i32, I8); + blk.store(I8, &byte_val, &byte_ptr); + // Return the stored value as a double (for expression contexts). + Ok(ctx.block().sitofp(I32, &val_i32, DOUBLE)) } - Expr::Uint8ArraySet { value, .. } => lower_expr(ctx, value), // `new Int32Array([1,2,3])` etc. — generic typed array constructor. // Routes through `js_typed_array_new_from_array(kind, arr_handle)` for @@ -6922,6 +7198,278 @@ pub(crate) fn lower_expr(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { } } +/// Returns true if `e` is guaranteed to produce a finite double value +/// (not NaN, not ±Infinity). Used to skip the NaN/Inf guard in `toint32` +/// for integer-arithmetic hot paths — saving 5 instructions per bitwise op. +fn is_known_finite(ctx: &FnCtx<'_>, e: &Expr) -> bool { + match e { + Expr::Integer(_) | Expr::Number(_) => true, + Expr::LocalGet(id) => ctx.integer_locals.contains(id), + Expr::Update { id, .. } => ctx.integer_locals.contains(id), + Expr::Uint8ArrayGet { .. } | Expr::BufferIndexGet { .. } => true, + Expr::MathImul(_, _) => true, // Math.imul returns i32 → always finite + Expr::Binary { op, left, right } => match op { + BinaryOp::Add | BinaryOp::Sub | BinaryOp::Mul => { + is_known_finite(ctx, left) && is_known_finite(ctx, right) + } + BinaryOp::BitAnd | BinaryOp::BitOr | BinaryOp::BitXor + | BinaryOp::Shl | BinaryOp::Shr | BinaryOp::UShr => true, + _ => false, + }, + _ => false, + } +} + +/// (Issue #50) If `IndexGet { object, index }` is a flat-const access +/// (inline `X[i][j]` or aliased `krow[j]`), lower it directly against +/// the `[N x i32]` global and return the NaN-boxed-double form of the +/// element. Returns `Ok(None)` when the pattern doesn't apply. +fn try_lower_flat_const_index_get( + ctx: &mut FnCtx<'_>, + object: &Expr, + index: &Expr, +) -> Result> { + let (info, row_expr, col_expr): (FlatConstInfo, Box, Box) = match object { + // Inline: IndexGet(IndexGet(LocalGet(X), i), j) + Expr::IndexGet { object: outer_obj, index: outer_idx } => { + if let Expr::LocalGet(id) = outer_obj.as_ref() { + if let Some(info) = ctx.flat_const_arrays.get(id).cloned() { + (info, outer_idx.clone(), Box::new(index.clone())) + } else { + return Ok(None); + } + } else { + return Ok(None); + } + } + // Aliased: IndexGet(LocalGet(krow), j) where krow was init'd + // as `IndexGet(LocalGet(X), i)` for a flat-const X. + Expr::LocalGet(alias_id) => { + if let Some((const_id, row_expr)) = ctx.array_row_aliases.get(alias_id).cloned() { + if let Some(info) = ctx.flat_const_arrays.get(&const_id).cloned() { + (info, row_expr, Box::new(index.clone())) + } else { + return Ok(None); + } + } else { + return Ok(None); + } + } + _ => return Ok(None), + }; + + // Compute `row_i32` and `col_i32` as i32 SSA values. Use the existing + // integer lowering when possible (both operands are likely small + // loop-derived values); otherwise fall back to the double path and + // fptosi. + let i32_slots = ctx.i32_counter_slots.clone(); + let flat_ca = ctx.flat_const_arrays.clone(); + let ara = ctx.array_row_aliases.clone(); + let int_locals = ctx.integer_locals.clone(); + let row_i32 = if can_lower_expr_as_i32(&row_expr, &i32_slots, &flat_ca, &ara, &int_locals, ctx.clamp3_functions, ctx.clamp_u8_functions) { + lower_expr_as_i32(ctx, &row_expr)? + } else { + let d = lower_expr(ctx, &row_expr)?; + ctx.block().fptosi(DOUBLE, &d, I32) + }; + let col_i32 = if can_lower_expr_as_i32(&col_expr, &i32_slots, &flat_ca, &ara, &int_locals, ctx.clamp3_functions, ctx.clamp_u8_functions) { + lower_expr_as_i32(ctx, &col_expr)? + } else { + let d = lower_expr(ctx, &col_expr)?; + ctx.block().fptosi(DOUBLE, &d, I32) + }; + + // flat_idx = row * cols + col (i32) + let blk = ctx.block(); + let cols_str = info.cols.to_string(); + let row_scaled = blk.mul(I32, &row_i32, &cols_str); + let flat_idx = blk.add(I32, &row_scaled, &col_i32); + + // GEP into the `[N x i32]` global: ptr = &global[0][flat_idx] + let reg = blk.fresh_reg(); + let n = info.rows * info.cols; + let ty = format!("[{} x i32]", n); + blk.emit_raw(format!( + "{} = getelementptr inbounds {}, ptr @{}, i32 0, i32 {}", + reg, ty, info.global_name, flat_idx + )); + let v_i32 = blk.load(I32, ®); + Ok(Some(blk.sitofp(I32, &v_i32, DOUBLE))) +} + +/// (Issue #50) Detect module-level `const X = [[int, ...], ...]` that +/// qualifies as a flat-const 2D int array: rectangular shape, all +/// elements are `Expr::Integer(n)` with n in i32, at least 1 row. +/// Returns (rows, cols, flat_values). +pub(crate) fn try_flat_const_2d_int(e: &Expr) -> Option<(usize, usize, Vec)> { + let rows = match e { + Expr::Array(r) => r, + _ => return None, + }; + if rows.is_empty() { + return None; + } + let mut cols: Option = None; + let mut vals = Vec::new(); + for row in rows { + let row_elems = match row { + Expr::Array(re) => re, + _ => return None, + }; + match cols { + None => cols = Some(row_elems.len()), + Some(c) if c != row_elems.len() => return None, + _ => {} + } + for el in row_elems { + match el { + Expr::Integer(n) => { + let v = i32::try_from(*n).ok()?; + vals.push(v); + } + _ => return None, + } + } + } + Some((rows.len(), cols?, vals)) +} + +/// (Issue #49) Return `true` if `e` can be lowered as an i32-native +/// expression: every leaf is sourced from an i32 slot, a typed-array byte +/// load, or an integer literal, and the combining operators are +/// `Add/Sub/Mul`. Used by the `LocalSet` fast path to decide whether the +/// rhs can bypass the fp round-trip. +/// +/// The fallback `lower_expr_as_i32` path is fptosi(lower_expr()), which +/// handles Uint8ArrayGet / BufferIndexGet (their existing lowering already +/// produces an i32 → sitofp → double chain that LLVM's instcombine +/// collapses). We only commit to the fast path when every leaf is +/// recognizably int-sourced so the overall rhs lowers to a short chain of +/// `add/sub/mul i32` instructions. +pub(crate) fn can_lower_expr_as_i32( + e: &Expr, + i32_slots: &std::collections::HashMap, + flat_const_arrays: &std::collections::HashMap, + array_row_aliases: &std::collections::HashMap)>, + integer_locals: &std::collections::HashSet, + clamp3_fns: &std::collections::HashSet, + clamp_u8_fns: &std::collections::HashSet, +) -> bool { + match e { + Expr::Integer(n) => i32::try_from(*n).is_ok(), + Expr::LocalGet(id) => i32_slots.contains_key(id) || integer_locals.contains(id), + Expr::Uint8ArrayGet { .. } | Expr::BufferIndexGet { .. } => true, + Expr::MathImul(a, b) => { + can_lower_expr_as_i32(a, i32_slots, flat_const_arrays, array_row_aliases, integer_locals, clamp3_fns, clamp_u8_fns) + && can_lower_expr_as_i32(b, i32_slots, flat_const_arrays, array_row_aliases, integer_locals, clamp3_fns, clamp_u8_fns) + } + Expr::Binary { op, left, right } + if matches!(op, BinaryOp::Add | BinaryOp::Sub | BinaryOp::Mul + | BinaryOp::BitAnd | BinaryOp::BitOr | BinaryOp::BitXor + | BinaryOp::Shl | BinaryOp::Shr | BinaryOp::UShr) => + { + can_lower_expr_as_i32(left, i32_slots, flat_const_arrays, array_row_aliases, integer_locals, clamp3_fns, clamp_u8_fns) + && can_lower_expr_as_i32(right, i32_slots, flat_const_arrays, array_row_aliases, integer_locals, clamp3_fns, clamp_u8_fns) + } + Expr::Call { callee, args, .. } => { + if let Expr::FuncRef(fid) = callee.as_ref() { + if (clamp3_fns.contains(fid) && args.len() == 3) + || (clamp_u8_fns.contains(fid) && args.len() == 1) + { + return args.iter().all(|a| can_lower_expr_as_i32(a, i32_slots, flat_const_arrays, array_row_aliases, integer_locals, clamp3_fns, clamp_u8_fns)); + } + } + false + } + // Issue #50 bridge: element of a flat-const 2D int table. + Expr::IndexGet { object, .. } => match object.as_ref() { + Expr::IndexGet { object: inner, .. } => { + matches!(inner.as_ref(), Expr::LocalGet(id) if flat_const_arrays.contains_key(id)) + } + Expr::LocalGet(id) => array_row_aliases.get(id).map_or(false, |(cid, _)| flat_const_arrays.contains_key(cid)), + _ => false, + }, + _ => false, + } +} + +/// (Issue #49) Lower `e` as an i32 SSA value. Must be called only after +/// `can_lower_expr_as_i32` returned true for the same expression. +pub(crate) fn lower_expr_as_i32(ctx: &mut FnCtx<'_>, e: &Expr) -> Result { + match e { + Expr::Integer(n) => Ok((*n as i32).to_string()), + Expr::LocalGet(id) => { + if let Some(slot) = ctx.i32_counter_slots.get(id).cloned() { + Ok(ctx.block().load(I32, &slot)) + } else { + let d = lower_expr(ctx, e)?; + Ok(ctx.block().fptosi(DOUBLE, &d, I32)) + } + } + // Math.imul(a, b) → single `mul i32` instruction. + Expr::MathImul(a, b) => { + let l = lower_expr_as_i32(ctx, a)?; + let r = lower_expr_as_i32(ctx, b)?; + Ok(ctx.block().mul(I32, &l, &r)) + } + Expr::Binary { op, left, right } + if matches!(op, BinaryOp::Add | BinaryOp::Sub | BinaryOp::Mul + | BinaryOp::BitAnd | BinaryOp::BitOr | BinaryOp::BitXor + | BinaryOp::Shl | BinaryOp::Shr | BinaryOp::UShr) => + { + let l = lower_expr_as_i32(ctx, left)?; + let r = lower_expr_as_i32(ctx, right)?; + let blk = ctx.block(); + Ok(match op { + BinaryOp::Add => blk.add(I32, &l, &r), + BinaryOp::Sub => blk.sub(I32, &l, &r), + BinaryOp::Mul => blk.mul(I32, &l, &r), + BinaryOp::BitAnd => blk.and(I32, &l, &r), + BinaryOp::BitOr => blk.or(I32, &l, &r), + BinaryOp::BitXor => blk.xor(I32, &l, &r), + BinaryOp::Shl => blk.shl(I32, &l, &r), + BinaryOp::Shr => blk.ashr(I32, &l, &r), + BinaryOp::UShr => blk.lshr(I32, &l, &r), + _ => unreachable!(), + }) + } + // Clamp-pattern calls: emit @llvm.smax.i32 / @llvm.smin.i32 directly + // in i32, no double round-trip. Produces vectorizable IR. + Expr::Call { callee, args, .. } => { + let fid = if let Expr::FuncRef(id) = callee.as_ref() { *id } else { 0 }; + if ctx.clamp3_functions.contains(&fid) && args.len() == 3 { + let v = lower_expr_as_i32(ctx, &args[0])?; + let lo = lower_expr_as_i32(ctx, &args[1])?; + let hi = lower_expr_as_i32(ctx, &args[2])?; + let blk = ctx.block(); + let r1 = blk.fresh_reg(); + blk.emit_raw(format!("{} = call i32 @llvm.smax.i32(i32 {}, i32 {})", r1, v, lo)); + let r2 = blk.fresh_reg(); + blk.emit_raw(format!("{} = call i32 @llvm.smin.i32(i32 {}, i32 {})", r2, r1, hi)); + return Ok(r2); + } + if ctx.clamp_u8_functions.contains(&fid) && args.len() == 1 { + let v = lower_expr_as_i32(ctx, &args[0])?; + let blk = ctx.block(); + let r1 = blk.fresh_reg(); + blk.emit_raw(format!("{} = call i32 @llvm.smax.i32(i32 {}, i32 0)", r1, v)); + let r2 = blk.fresh_reg(); + blk.emit_raw(format!("{} = call i32 @llvm.smin.i32(i32 {}, i32 255)", r2, r1)); + return Ok(r2); + } + // Non-clamp Call: fall through to default. + let d = lower_expr(ctx, e)?; + Ok(ctx.block().fptosi(DOUBLE, &d, I32)) + } + // Fallback for Uint8ArrayGet / BufferIndexGet and other expressions: + // lower via the existing double path and `fptosi` back to i32. + _ => { + let d = lower_expr(ctx, e)?; + Ok(ctx.block().fptosi(DOUBLE, &d, I32)) + } + } +} + /// Build a NaN-boxed Array JSValue from a slice of Expr arguments. fn proxy_build_args_array(ctx: &mut FnCtx<'_>, args: &[Expr]) -> Result { let cap = (args.len() as u32).to_string(); diff --git a/crates/perry-codegen/src/function.rs b/crates/perry-codegen/src/function.rs index 5f5c49a13..aebde0ee6 100644 --- a/crates/perry-codegen/src/function.rs +++ b/crates/perry-codegen/src/function.rs @@ -23,6 +23,10 @@ pub struct LlFunction { /// after longjmp returns. `returns_twice` alone on the setjmp call is not /// sufficient at -O2 on aarch64. pub has_try: bool, + /// When true, emit `alwaysinline` attribute. Forces LLVM to inline this + /// function at every call site, exposing integer operations to the + /// caller's optimizer context (critical for vectorization of clamp patterns). + pub force_inline: bool, blocks: Vec, block_counter: u32, reg_counter: Rc, @@ -75,6 +79,7 @@ impl LlFunction { params, linkage: String::new(), has_try: false, + force_inline: false, blocks: Vec::new(), block_counter: 0, reg_counter: Rc::new(RegCounter::new()), @@ -219,7 +224,13 @@ impl LlFunction { format!("{} ", self.linkage) }; - let attrs = if self.has_try { " #1" } else { "" }; + let attrs = if self.has_try { + " #1" + } else if self.force_inline { + " alwaysinline" + } else { + "" + }; let mut ir = format!( "define {}{} @{}({}){} {{\n", linkage, self.return_type, self.name, param_str, attrs diff --git a/crates/perry-codegen/src/lower_call.rs b/crates/perry-codegen/src/lower_call.rs index d09fc5c43..70814d363 100644 --- a/crates/perry-codegen/src/lower_call.rs +++ b/crates/perry-codegen/src/lower_call.rs @@ -3448,6 +3448,39 @@ struct UiSig { /// returns the zero-sentinel). That's the behavior the entire perry/ui /// surface had pre-v0.5.10 — adding a row here flips one method from /// "silent no-op" to "real call into libperry_ui_macos.a". +/// Maps perry/container TypeScript function names to their FFI symbols. +static PERRY_CONTAINER_TABLE: &[(&str, &str)] = &[ + ("run", "js_container_run"), + ("create", "js_container_create"), + ("start", "js_container_start"), + ("stop", "js_container_stop"), + ("remove", "js_container_remove"), + ("list", "js_container_list"), + ("inspect", "js_container_inspect"), + ("logs", "js_container_logs"), + ("exec", "js_container_exec"), + ("pullImage", "js_container_pullImage"), + ("listImages", "js_container_listImages"), + ("removeImage", "js_container_removeImage"), + ("getBackend", "js_container_getBackend"), + ("detectBackend", "js_container_detectBackend"), + ("composeUp", "js_container_composeUp"), + ("build", "js_container_build"), +]; + +/// Maps perry/compose TypeScript function names to their FFI symbols. +static PERRY_COMPOSE_TABLE: &[(&str, &str)] = &[ + ("up", "js_compose_up"), + ("down", "js_compose_down"), + ("ps", "js_compose_ps"), + ("logs", "js_compose_logs"), + ("exec", "js_compose_exec"), + ("config", "js_compose_config"), + ("start", "js_compose_start"), + ("stop", "js_compose_stop"), + ("restart", "js_compose_restart"), +]; + const PERRY_UI_TABLE: &[UiSig] = &[ // ---- Constructors (return widget handle) ---- UiSig { method: "Divider", runtime: "perry_ui_divider_create", @@ -4702,3 +4735,20 @@ fn lower_native_module_dispatch( } } } + +// ============================================================================= +// perry/workloads dispatch table +// ============================================================================= + +static PERRY_WORKLOAD_TABLE: &[UiSig] = &[ + UiSig { method: "runGraph", runtime: "js_workload_runGraph", args: &[UiArgKind::Str], ret: UiReturnKind::Promise }, +]; + +fn perry_workload_table_lookup(method: &str) -> Option<&'static UiSig> { + PERRY_WORKLOAD_TABLE.iter().find(|s| s.method == method) +} + +/// Maps perry/workloads TypeScript function names to their FFI symbols. +static PERRY_WORKLOADS_TABLE: &[(&str, &str)] = &[ + ("runGraph", "js_workload_runGraph"), +]; diff --git a/crates/perry-codegen/src/module.rs b/crates/perry-codegen/src/module.rs index 1617c0bb4..7592dece7 100644 --- a/crates/perry-codegen/src/module.rs +++ b/crates/perry-codegen/src/module.rs @@ -104,6 +104,14 @@ impl LlModule { .push(format!("@{} = internal constant {} {}", name, ty, init)); } + /// Push a fully-formed `@ = ...` line into the module's globals + /// list. Used for constants whose type is not in the `LlvmType` enum + /// (e.g. `[N x i32]` flat constant arrays for issue #50's folded + /// module-level 2D int arrays). + pub fn add_raw_global(&mut self, line: String) { + self.globals.push(line); + } + /// Add a string constant with a caller-controlled name. Used by the /// `StringPool` so that emission order matches the pool's interned /// indices and the bytes globals can be referenced by name from @@ -205,6 +213,15 @@ impl LlModule { ir.push_str("attributes #1 = { noinline optnone }\n"); } + // Issue #52: `!0 = !{}` metadata node referenced by + // `load_invariant` (via `!invariant.load !0`). LLVM's GVN + LICM + // hoist loads tagged with `!invariant.load` out of their + // enclosing loops when the loop body can't write to the same + // address; without this, the per-access Buffer / Array length + // reload stays pinned inside every bounds check even when the + // buffer is loop-invariant. + ir.push_str("\n!0 = !{}\n"); + ir } } diff --git a/crates/perry-codegen/src/runtime_decls.rs b/crates/perry-codegen/src/runtime_decls.rs index d0e46ce49..2c8f15784 100644 --- a/crates/perry-codegen/src/runtime_decls.rs +++ b/crates/perry-codegen/src/runtime_decls.rs @@ -408,6 +408,15 @@ pub fn declare_phase_b_strings(module: &mut LlModule) { module.declare_function("js_dynamic_mul", DOUBLE, &[DOUBLE, DOUBLE]); module.declare_function("js_dynamic_div", DOUBLE, &[DOUBLE, DOUBLE]); module.declare_function("js_dynamic_mod", DOUBLE, &[DOUBLE, DOUBLE]); + // Dynamic bigint bitwise ops — lowered from `Expr::Binary` when + // either operand is statically bigint-typed. Unbox, call the raw + // `js_bigint_`, re-box with BIGINT_TAG. Fall through to i32 + // ToInt32 semantics for the pure-number case (closes #39). + module.declare_function("js_dynamic_bitand", DOUBLE, &[DOUBLE, DOUBLE]); + module.declare_function("js_dynamic_bitor", DOUBLE, &[DOUBLE, DOUBLE]); + module.declare_function("js_dynamic_bitxor", DOUBLE, &[DOUBLE, DOUBLE]); + module.declare_function("js_dynamic_shl", DOUBLE, &[DOUBLE, DOUBLE]); + module.declare_function("js_dynamic_shr", DOUBLE, &[DOUBLE, DOUBLE]); module.declare_function("js_instanceof", DOUBLE, &[DOUBLE, I32]); module.declare_function("js_register_class_extends_error", VOID, &[I32]); // Inline-allocator class registration: emitted once per class @@ -521,6 +530,9 @@ pub fn declare_phase_b_strings(module: &mut LlModule) { // Uint8Array constructor wrapper that flags the resulting buffer so the // formatter prints `Uint8Array(N) [ ... ]` instead of ``. module.declare_function("js_uint8array_from_array", I64, &[I64]); + // `new Uint8Array(x)` runtime dispatch — handles the non-literal case + // where `x` could be a number (length) or an array (source data). + module.declare_function("js_uint8array_new", I64, &[DOUBLE]); // Generic typed array runtime (Int8/16/32, Uint16/32, Float32/64). // Uint8Array piggybacks on the BufferHeader path. module.declare_function("js_typed_array_new_empty", I64, &[I32, I32]); @@ -843,6 +855,7 @@ pub fn declare_phase_b_objects(module: &mut LlModule) { module.declare_function("js_object_alloc", I64, &[I32, I32]); module.declare_function("js_object_set_field_by_name", VOID, &[I64, I64, DOUBLE]); module.declare_function("js_object_get_field_by_name_f64", DOUBLE, &[I64, I64]); + module.declare_function("js_object_get_field_ic_miss", DOUBLE, &[I64, I64, PTR]); // Object rest destructuring: copy all properties from src except excluded keys. // Takes a src object ptr and an array of NaN-boxed strings (the excluded keys), // returns a new object pointer. diff --git a/crates/perry-codegen/src/stmt.rs b/crates/perry-codegen/src/stmt.rs index 7a6a159fb..f48cc62a5 100644 --- a/crates/perry-codegen/src/stmt.rs +++ b/crates/perry-codegen/src/stmt.rs @@ -67,7 +67,7 @@ pub(crate) fn lower_stmt(ctx: &mut FnCtx<'_>, stmt: &Stmt) -> Result<()> { Ok(()) } - Stmt::Let { id, name, init, ty, .. } => { + Stmt::Let { id, name, init, ty, mutable, .. } => { // `let C = SomeClass` aliases the local `C` to the class // `SomeClass` for `new C()` site rerouting. The HIR lowers // class identifiers referenced as values to `Expr::ClassRef`, @@ -112,6 +112,25 @@ pub(crate) fn lower_stmt(ctx: &mut FnCtx<'_>, stmt: &Stmt) -> Result<()> { } _ => {} } + + // Issue #50: row-alias detection. When `let krow = X[i]` where + // `X` is a folded flat-const 2D int array, record + // `krow_id → (X_id, i)` so a later `krow[j]` can lower through + // the same flat `[N x i32]` load path as an inline `X[i][j]`. + // Only fires for non-mutable lets (reassignment would invalidate + // the alias relationship). + if !*mutable { + if let Some(perry_hir::Expr::IndexGet { object, index }) = init.as_ref() { + if let perry_hir::Expr::LocalGet(const_id) = object.as_ref() { + if ctx.flat_const_arrays.contains_key(const_id) { + ctx.array_row_aliases.insert( + *id, + (*const_id, Box::new((**index).clone())), + ); + } + } + } + } // Refine the declared type from the initializer when the // declared type is Any. The HIR's destructuring lowering // declares synthetic `__destruct_*` lets as `ty: Any` even @@ -299,9 +318,43 @@ pub(crate) fn lower_stmt(ctx: &mut FnCtx<'_>, stmt: &Stmt) -> Result<()> { } ctx.locals.insert(*id, slot.clone()); ctx.local_types.insert(*id, refined_ty); + // Int32 specialization (issue #48): if this local qualifies as + // integer-valued (all writes are `| 0` / `>>> 0` / bitwise / int + // literal / ++/--), allocate a parallel i32 slot. Update/LocalSet + // mirror writes to it; IndexGet and hot-loop consumers prefer it + // over the double slot — skipping the `fadd → fcvtzs → scvtf` + // round-trip per iteration of `sum = (sum + i) | 0`. + // + // Only fire on `mutable` locals: an immutable `const SEED = 0xDEAD_BEEF` + // never benefits from i32 specialization (no per-iteration cost), and + // its initializer may legitimately exceed i32 range (e.g. 0x9E3779B9 + // = 2654435769 > INT32_MAX) — fptosi'ing it saturates to INT32_MAX + // and silently corrupts every read of the i32 slot. Mutable locals + // are always written through paths we control (Update, `(expr) | 0`) + // which produce in-range int32 values per JS ToInt32 semantics. + let init_in_i32_range = match init.as_ref() { + Some(perry_hir::Expr::Integer(n)) => i32::try_from(*n).is_ok(), + _ => true, // non-Integer init: writes will always go via i32-coercing paths + }; + let needs_i32_slot = ctx.integer_locals.contains(id) + && *mutable + && init_in_i32_range + && !ctx.boxed_vars.contains(id) + && !ctx.module_globals.contains_key(id) + && !ctx.i32_counter_slots.contains_key(id); + if needs_i32_slot { + let i32_slot = ctx.func.alloca_entry(I32); + ctx.func.entry_allocas_push_store(I32, "0", &i32_slot); + ctx.i32_counter_slots.insert(*id, i32_slot); + } if let Some(init_expr) = init { let v = lower_expr(ctx, init_expr)?; ctx.block().store(DOUBLE, &v, &slot); + // Seed the i32 slot from the init value when the local has one. + if let Some(i32_slot) = ctx.i32_counter_slots.get(id).cloned() { + let v_i32 = ctx.block().fptosi(DOUBLE, &v, I32); + ctx.block().store(I32, &v_i32, &i32_slot); + } } else if let Some(cv) = ctx.compile_time_constants.get(id) { // Compile-time constants (e.g. `declare const __platform__: number`) // have no init expression but their value is known. Store the diff --git a/crates/perry-codegen/src/type_analysis.rs b/crates/perry-codegen/src/type_analysis.rs index 6a14508dd..992e855b8 100644 --- a/crates/perry-codegen/src/type_analysis.rs +++ b/crates/perry-codegen/src/type_analysis.rs @@ -393,6 +393,16 @@ pub(crate) fn is_bigint_expr(ctx: &FnCtx<'_>, e: &Expr) -> bool { | BinaryOp::Mul | BinaryOp::Div | BinaryOp::Mod + // Bitwise ops on bigints produce bigints — include + // them so `(a * prime) & mask64` where both operands + // are bigint stays bigint-typed all the way up the + // chain. Without this the outer `&` falls through to + // the i32 ToInt32 path and returns 0 (closes #39). + | BinaryOp::BitAnd + | BinaryOp::BitOr + | BinaryOp::BitXor + | BinaryOp::Shl + | BinaryOp::Shr ) && (is_bigint_expr(ctx, left) || is_bigint_expr(ctx, right)) } _ => false, diff --git a/crates/perry-container-compose/Cargo.toml b/crates/perry-container-compose/Cargo.toml new file mode 100644 index 000000000..b3699c1ca --- /dev/null +++ b/crates/perry-container-compose/Cargo.toml @@ -0,0 +1,43 @@ +[package] +name = "perry-container-compose" +version.workspace = true +edition.workspace = true +license.workspace = true +repository.workspace = true +authors = ["Perry Contributors"] +description = "Port of container-compose/cli to Rust - Docker Compose-like experience for Apple Container" + +[dependencies] +serde = { workspace = true, features = ["derive"] } +serde_json = { workspace = true } +serde_yaml = "0.9" +tokio = { workspace = true } +clap = { workspace = true, features = ["derive"] } +anyhow = { workspace = true } +thiserror = { workspace = true } +tracing = "0.1" +tracing-subscriber = { version = "0.3", features = ["env-filter"] } +async-trait = "0.1" +md-5 = "0.10" +hex = "0.4" +dotenvy = { workspace = true } +indexmap = { version = "2.2", features = ["serde"] } +rand = "0.8" +regex = "1" +which = "6.0" +once_cell = "1.19" +home = "0.5" +shellexpand = "3.1" + +[dev-dependencies] +tokio = { workspace = true } +proptest = "1" + +[features] +default = [] +ffi = [] +integration-tests = [] + +[[bin]] +name = "perry-compose" +path = "src/main.rs" diff --git a/crates/perry-container-compose/examples/build/main.ts b/crates/perry-container-compose/examples/build/main.ts new file mode 100644 index 000000000..8aaf7f83a --- /dev/null +++ b/crates/perry-container-compose/examples/build/main.ts @@ -0,0 +1,23 @@ +import { composeUp, composeDown } from 'perry/compose'; + +const stack = await composeUp({ + version: '3.8', + services: { + app: { + build: { + context: '.', + dockerfile: 'Dockerfile', + args: { + BUILD_ENV: 'production', + }, + }, + ports: ['8080:8080'], + environment: { + NODE_ENV: 'production', + }, + }, + }, +}); + +// Tear down when done +await composeDown(stack); diff --git a/crates/perry-container-compose/examples/multi-service/main.ts b/crates/perry-container-compose/examples/multi-service/main.ts new file mode 100644 index 000000000..5fce10b24 --- /dev/null +++ b/crates/perry-container-compose/examples/multi-service/main.ts @@ -0,0 +1,36 @@ +import { composeUp, composeDown, composeLogs } from 'perry/compose'; + +const stack = await composeUp({ + version: '3.8', + services: { + db: { + image: 'postgres:16-alpine', + environment: { + // ${VAR:-default} interpolation is supported in string values + POSTGRES_USER: '${DB_USER:-myuser}', + POSTGRES_PASSWORD: '${DB_PASSWORD:-secret}', + POSTGRES_DB: 'mydb', + }, + volumes: ['db-data:/var/lib/postgresql/data'], + ports: ['5432:5432'], + }, + web: { + image: 'myapp:latest', + dependsOn: ['db'], + ports: ['3000:3000'], + environment: { + DATABASE_URL: 'postgres://${DB_USER:-myuser}:${DB_PASSWORD:-secret}@db:5432/mydb', + }, + }, + }, + volumes: { + 'db-data': {}, + }, +}); + +// Stream logs from both services +const logs = await composeLogs(stack, { services: ['web', 'db'], follow: false }); +console.log(logs); + +// Tear down, removing named volumes +await composeDown(stack, { volumes: true }); diff --git a/crates/perry-container-compose/examples/simple/main.ts b/crates/perry-container-compose/examples/simple/main.ts new file mode 100644 index 000000000..5a33883f3 --- /dev/null +++ b/crates/perry-container-compose/examples/simple/main.ts @@ -0,0 +1,21 @@ +import { composeUp, composeDown, composePs } from 'perry/compose'; + +const stack = await composeUp({ + version: '3.8', + services: { + web: { + image: 'nginx:alpine', + containerName: 'simple-nginx', + ports: ['8080:80'], + labels: { + app: 'simple-nginx', + }, + }, + }, +}); + +const statuses = await composePs(stack); +console.table(statuses); + +// Tear down when done +await composeDown(stack); diff --git a/crates/perry-container-compose/src/backend.rs b/crates/perry-container-compose/src/backend.rs new file mode 100644 index 000000000..5dfa1defd --- /dev/null +++ b/crates/perry-container-compose/src/backend.rs @@ -0,0 +1,388 @@ +//! Container backend abstraction + +use crate::error::{ComposeError, Result}; +use crate::types::{ + ComposeNetwork, ComposeVolume, ContainerHandle, ContainerInfo, ContainerLogs, ContainerSpec, + ImageInfo, ComposeServiceBuild, +}; +use async_trait::async_trait; +use std::collections::HashMap; +use std::path::PathBuf; +use std::process::Stdio; +use tokio::process::Command; +use std::sync::Arc; + +pub use crate::error::BackendProbeResult; + +#[derive(Debug, Clone, Default)] +pub struct NetworkConfig { + pub driver: Option, + pub labels: HashMap, + pub internal: bool, + pub enable_ipv6: bool, +} + +#[derive(Debug, Clone, Default)] +pub struct VolumeConfig { + pub driver: Option, + pub labels: HashMap, +} + +impl From<&ComposeNetwork> for NetworkConfig { + fn from(n: &ComposeNetwork) -> Self { + NetworkConfig { + driver: n.driver.clone(), + labels: n.labels.as_ref().map(|l| l.to_map()).unwrap_or_default(), + internal: n.internal.unwrap_or(false), + enable_ipv6: n.enable_ipv6.unwrap_or(false), + } + } +} + +impl From<&ComposeVolume> for VolumeConfig { + fn from(v: &ComposeVolume) -> Self { + VolumeConfig { + driver: v.driver.clone(), + labels: v.labels.as_ref().map(|l| l.to_map()).unwrap_or_default(), + } + } +} + +#[async_trait] +pub trait ContainerBackend: Send + Sync { + fn backend_name(&self) -> &str; + async fn check_available(&self) -> Result<()>; + async fn run(&self, spec: &ContainerSpec) -> Result; + async fn create(&self, spec: &ContainerSpec) -> Result; + async fn start(&self, id: &str) -> Result<()>; + async fn stop(&self, id: &str, timeout: Option) -> Result<()>; + async fn remove(&self, id: &str, force: bool) -> Result<()>; + async fn list(&self, all: bool) -> Result>; + async fn inspect(&self, id: &str) -> Result; + async fn logs(&self, id: &str, tail: Option) -> Result; + async fn exec(&self, id: &str, cmd: &[String], env: Option<&HashMap>, workdir: Option<&str>) -> Result; + async fn build(&self, spec: &ComposeServiceBuild, image_name: &str) -> Result<()>; + async fn pull_image(&self, reference: &str) -> Result<()>; + async fn list_images(&self) -> Result>; + async fn remove_image(&self, reference: &str, force: bool) -> Result<()>; + async fn inspect_image(&self, reference: &str) -> Result; + async fn create_network(&self, name: &str, config: &NetworkConfig) -> Result<()>; + async fn remove_network(&self, name: &str) -> Result<()>; + async fn inspect_network(&self, name: &str) -> Result<()>; + async fn create_volume(&self, name: &str, config: &VolumeConfig) -> Result<()>; + async fn remove_volume(&self, name: &str) -> Result<()>; + async fn inspect_volume(&self, name: &str) -> Result<()>; + async fn wait(&self, id: &str) -> Result; + async fn wait_and_logs(&self, id: &str) -> Result; + async fn manifest_inspect(&self, reference: &str) -> Result; +} + +pub trait CliProtocol: Send + Sync { + fn protocol_name(&self) -> &str; + fn subcommand_prefix(&self) -> Option> { None } + fn run_args(&self, spec: &ContainerSpec) -> Vec; + fn create_args(&self, spec: &ContainerSpec) -> Vec; + fn start_args(&self, id: &str) -> Vec; + fn stop_args(&self, id: &str, timeout: Option) -> Vec; + fn remove_args(&self, id: &str, force: bool) -> Vec; + fn list_args(&self, all: bool) -> Vec; + fn inspect_args(&self, id: &str) -> Vec; + fn logs_args(&self, id: &str, tail: Option) -> Vec; + fn exec_args(&self, id: &str, cmd: &[String], env: Option<&HashMap>, workdir: Option<&str>) -> Vec; + fn pull_image_args(&self, reference: &str) -> Vec; + fn list_images_args(&self) -> Vec; + fn remove_image_args(&self, reference: &str, force: bool) -> Vec; + fn inspect_image_args(&self, reference: &str) -> Vec; + fn manifest_inspect_args(&self, reference: &str) -> Vec; + fn build_args(&self, spec: &ComposeServiceBuild, image_name: &str) -> Vec; + fn create_network_args(&self, name: &str, config: &NetworkConfig) -> Vec; + fn remove_network_args(&self, name: &str) -> Vec; + fn inspect_network_args(&self, name: &str) -> Vec; + fn create_volume_args(&self, name: &str, config: &VolumeConfig) -> Vec; + fn remove_volume_args(&self, name: &str) -> Vec; + fn inspect_volume_args(&self, name: &str) -> Vec; + fn wait_args(&self, id: &str) -> Vec; + fn parse_list_output(&self, stdout: &str) -> Result>; + fn parse_inspect_output(&self, stdout: &str) -> Result; + fn parse_list_images_output(&self, stdout: &str) -> Result>; + fn parse_inspect_image_output(&self, stdout: &str) -> Result; + fn parse_container_id(&self, stdout: &str) -> Result; +} + +pub struct CliBackend { pub bin: PathBuf, pub protocol: P } +impl CliBackend

{ + pub fn new(bin: PathBuf, protocol: P) -> Self { Self { bin, protocol } } + async fn exec_ok(&self, args: Vec) -> Result { + let mut full = self.protocol.subcommand_prefix().unwrap_or_default(); + full.extend(args); + let out = Command::new(&self.bin).args(&full).output().await.map_err(ComposeError::IoError)?; + if out.status.success() { Ok(String::from_utf8_lossy(&out.stdout).to_string()) } + else { Err(ComposeError::BackendError { code: out.status.code().unwrap_or(-1), message: String::from_utf8_lossy(&out.stderr).to_string() }) } + } +} + +#[async_trait] +impl ContainerBackend for CliBackend

{ + fn backend_name(&self) -> &str { self.protocol.protocol_name() } + async fn check_available(&self) -> Result<()> { Ok(()) } + async fn run(&self, spec: &ContainerSpec) -> Result { + let id = self.protocol.parse_container_id(&self.exec_ok(self.protocol.run_args(spec)).await?)?; + Ok(ContainerHandle { id, name: spec.name.clone() }) + } + async fn create(&self, spec: &ContainerSpec) -> Result { + let id = self.protocol.parse_container_id(&self.exec_ok(self.protocol.create_args(spec)).await?)?; + Ok(ContainerHandle { id, name: spec.name.clone() }) + } + async fn start(&self, id: &str) -> Result<()> { self.exec_ok(self.protocol.start_args(id)).await?; Ok(()) } + async fn stop(&self, id: &str, t: Option) -> Result<()> { self.exec_ok(self.protocol.stop_args(id, t)).await?; Ok(()) } + async fn remove(&self, id: &str, f: bool) -> Result<()> { self.exec_ok(self.protocol.remove_args(id, f)).await?; Ok(()) } + async fn list(&self, a: bool) -> Result> { self.protocol.parse_list_output(&self.exec_ok(self.protocol.list_args(a)).await?) } + async fn inspect(&self, id: &str) -> Result { self.protocol.parse_inspect_output(&self.exec_ok(self.protocol.inspect_args(id)).await?) } + async fn logs(&self, id: &str, t: Option) -> Result { + let mut full = self.protocol.subcommand_prefix().unwrap_or_default(); + full.extend(self.protocol.logs_args(id, t)); + let out = Command::new(&self.bin).args(&full).output().await.map_err(ComposeError::IoError)?; + Ok(ContainerLogs { stdout: String::from_utf8_lossy(&out.stdout).to_string(), stderr: String::from_utf8_lossy(&out.stderr).to_string() }) + } + async fn exec(&self, id: &str, cmd: &[String], env: Option<&HashMap>, wd: Option<&str>) -> Result { + let mut full = self.protocol.subcommand_prefix().unwrap_or_default(); + full.extend(self.protocol.exec_args(id, cmd, env, wd)); + let out = Command::new(&self.bin).args(&full).output().await.map_err(ComposeError::IoError)?; + Ok(ContainerLogs { stdout: String::from_utf8_lossy(&out.stdout).to_string(), stderr: String::from_utf8_lossy(&out.stderr).to_string() }) + } + async fn build(&self, s: &ComposeServiceBuild, i: &str) -> Result<()> { self.exec_ok(self.protocol.build_args(s, i)).await?; Ok(()) } + async fn pull_image(&self, r: &str) -> Result<()> { self.exec_ok(self.protocol.pull_image_args(r)).await?; Ok(()) } + async fn list_images(&self) -> Result> { self.protocol.parse_list_images_output(&self.exec_ok(self.protocol.list_images_args()).await?) } + async fn remove_image(&self, r: &str, f: bool) -> Result<()> { self.exec_ok(self.protocol.remove_image_args(r, f)).await?; Ok(()) } + async fn inspect_image(&self, r: &str) -> Result { self.protocol.parse_inspect_image_output(&self.exec_ok(self.protocol.inspect_image_args(r)).await?) } + async fn create_network(&self, n: &str, c: &NetworkConfig) -> Result<()> { self.exec_ok(self.protocol.create_network_args(n, c)).await?; Ok(()) } + async fn remove_network(&self, n: &str) -> Result<()> { self.exec_ok(self.protocol.remove_network_args(n)).await?; Ok(()) } + async fn inspect_network(&self, n: &str) -> Result<()> { self.exec_ok(self.protocol.inspect_network_args(n)).await?; Ok(()) } + async fn create_volume(&self, n: &str, c: &VolumeConfig) -> Result<()> { self.exec_ok(self.protocol.create_volume_args(n, c)).await?; Ok(()) } + async fn remove_volume(&self, n: &str) -> Result<()> { self.exec_ok(self.protocol.remove_volume_args(n)).await?; Ok(()) } + async fn inspect_volume(&self, n: &str) -> Result<()> { self.exec_ok(self.protocol.inspect_volume_args(n)).await?; Ok(()) } + async fn wait(&self, id: &str) -> Result { self.exec_ok(self.protocol.wait_args(id)).await?.trim().parse().map_err(|_| ComposeError::BackendError { code: -1, message: "Invalid wait output".into() }) } + async fn wait_and_logs(&self, id: &str) -> Result { self.wait(id).await?; self.logs(id, None).await } + async fn manifest_inspect(&self, r: &str) -> Result { serde_json::from_str(&self.exec_ok(self.protocol.manifest_inspect_args(r)).await?).map_err(ComposeError::JsonError) } +} + +pub struct DockerProtocol; +impl CliProtocol for DockerProtocol { + fn protocol_name(&self) -> &str { "docker" } + fn run_args(&self, s: &ContainerSpec) -> Vec { let mut a = vec!["run".into(), "--detach".into()]; a.extend(self.common(s)); a.push(s.image.clone()); if let Some(c) = &s.cmd { a.extend(c.iter().cloned()); } a } + fn create_args(&self, s: &ContainerSpec) -> Vec { let mut a = vec!["create".into()]; a.extend(self.common(s)); a.push(s.image.clone()); if let Some(c) = &s.cmd { a.extend(c.iter().cloned()); } a } + fn start_args(&self, id: &str) -> Vec { vec!["start".into(), id.into()] } + fn stop_args(&self, id: &str, t: Option) -> Vec { let mut a = vec!["stop".into()]; if let Some(v) = t { a.extend(["-t".into(), v.to_string()]); } a.push(id.into()); a } + fn remove_args(&self, id: &str, f: bool) -> Vec { let mut a = vec!["rm".into()]; if f { a.push("-f".into()); } a.push(id.into()); a } + fn list_args(&self, a: bool) -> Vec { let mut v = vec!["ps".into(), "--format".into(), "json".into()]; if a { v.push("--all".into()); } v } + fn inspect_args(&self, id: &str) -> Vec { vec!["inspect".into(), "--format".into(), "json".into(), id.into()] } + fn logs_args(&self, id: &str, t: Option) -> Vec { let mut a = vec!["logs".into()]; if let Some(v) = t { a.extend(["--tail".into(), v.to_string()]); } a.push(id.into()); a } + fn exec_args(&self, id: &str, cmd: &[String], env: Option<&HashMap>, wd: Option<&str>) -> Vec { + let mut a = vec!["exec".into()]; if let Some(w) = wd { a.extend(["--workdir".into(), w.into()]); } + if let Some(e) = env { let mut ks: Vec<_> = e.keys().collect(); ks.sort(); for k in ks { a.extend(["-e".into(), format!("{}={}", k, e[k])]); } } + a.push(id.into()); a.extend(cmd.iter().cloned()); a + } + fn pull_image_args(&self, r: &str) -> Vec { vec!["pull".into(), r.into()] } + fn list_images_args(&self) -> Vec { vec!["images".into(), "--format".into(), "json".into()] } + fn remove_image_args(&self, r: &str, f: bool) -> Vec { let mut a = vec!["rmi".into()]; if f { a.push("-f".into()); } a.push(r.into()); a } + fn inspect_image_args(&self, r: &str) -> Vec { vec!["image".into(), "inspect".into(), "--format".into(), "json".into(), r.into()] } + fn manifest_inspect_args(&self, r: &str) -> Vec { vec!["manifest".into(), "inspect".into(), r.into()] } + fn build_args(&self, spec: &ComposeServiceBuild, image_name: &str) -> Vec { + let mut f = vec!["build".into(), "-t".into(), image_name.into()]; + if let Some(d) = &spec.dockerfile { f.extend(["-f".into(), d.into()]); } + if let Some(args) = &spec.args { + let m = args.to_map(); + let mut ks: Vec<_> = m.keys().collect(); + ks.sort(); + for k in ks { f.extend(["--build-arg".into(), format!("{}={}", k, m[k])]); } + } + f.push(spec.context.as_deref().unwrap_or(".").into()); + f + } + fn create_network_args(&self, n: &str, c: &NetworkConfig) -> Vec { + let mut a = vec!["network".into(), "create".into()]; if let Some(d) = &c.driver { a.extend(["--driver".into(), d.clone()]); } + let mut ls: Vec<_> = c.labels.keys().collect(); ls.sort(); for k in ls { a.extend(["--label".into(), format!("{}={}", k, c.labels[k])]); } + if c.internal { a.push("--internal".into()); } if c.enable_ipv6 { a.push("--ipv6".into()); } + a.push(n.into()); a + } + fn remove_network_args(&self, n: &str) -> Vec { vec!["network".into(), "rm".into(), n.into()] } + fn inspect_network_args(&self, n: &str) -> Vec { vec!["network".into(), "inspect".into(), n.into()] } + fn create_volume_args(&self, n: &str, c: &VolumeConfig) -> Vec { + let mut a = vec!["volume".into(), "create".into()]; if let Some(d) = &c.driver { a.extend(["--driver".into(), d.clone()]); } + let mut ls: Vec<_> = c.labels.keys().collect(); ls.sort(); for k in ls { a.extend(["--label".into(), format!("{}={}", k, c.labels[k])]); } + a.push(n.into()); a + } + fn remove_volume_args(&self, n: &str) -> Vec { vec!["volume".into(), "rm".into(), n.into()] } + fn inspect_volume_args(&self, n: &str) -> Vec { vec!["volume".into(), "inspect".into(), n.into()] } + fn wait_args(&self, id: &str) -> Vec { vec!["wait".into(), id.into()] } + fn parse_list_output(&self, s: &str) -> Result> { + let v: Vec = serde_json::from_str(s.trim()).unwrap_or_default(); + Ok(v.into_iter().map(|e| ContainerInfo { + id: e["ID"].as_str().or(e["Id"].as_str()).unwrap_or_default().to_string(), + name: e["Names"].as_str().or(e["names"].as_str()).unwrap_or_default().trim_start_matches('/').to_string(), + image: e["Image"].as_str().unwrap_or_default().to_string(), + status: e["Status"].as_str().unwrap_or_default().to_string(), + ports: vec![], labels: HashMap::new(), created: "".into() + }).collect()) + } + fn parse_inspect_output(&self, s: &str) -> Result { + let v: serde_json::Value = serde_json::from_str(s.trim()).unwrap_or_default(); + let e = if v.is_array() { &v[0] } else { &v }; + Ok(ContainerInfo { + id: e["Id"].as_str().unwrap_or_default().to_string(), + name: e["Name"].as_str().unwrap_or_default().trim_start_matches('/').to_string(), + image: e["Config"]["Image"].as_str().unwrap_or_default().to_string(), + status: e["State"]["Status"].as_str().unwrap_or_default().to_string(), + ports: vec![], labels: HashMap::new(), created: e["Created"].as_str().unwrap_or_default().to_string() + }) + } + fn parse_list_images_output(&self, s: &str) -> Result> { + let v: Vec = serde_json::from_str(s.trim()).unwrap_or_default(); + Ok(v.into_iter().map(|e| ImageInfo { + id: e["ID"].as_str().unwrap_or_default().to_string(), + repository: e["Repository"].as_str().unwrap_or_default().to_string(), + tag: e["Tag"].as_str().unwrap_or_default().to_string(), + size: e["Size"].as_u64().unwrap_or(0), + created: e["CreatedSince"].as_str().unwrap_or_default().to_string() + }).collect()) + } + fn parse_inspect_image_output(&self, s: &str) -> Result { + let v: serde_json::Value = serde_json::from_str(s.trim()).unwrap_or_default(); + let e = if v.is_array() { &v[0] } else { &v }; + Ok(ImageInfo { + id: e["Id"].as_str().unwrap_or_default().to_string(), + repository: "".into(), tag: "".into(), + size: e["Size"].as_u64().unwrap_or(0), + created: e["Created"].as_str().unwrap_or_default().to_string() + }) + } + fn parse_container_id(&self, s: &str) -> Result { Ok(s.trim().to_string()) } +} + +impl DockerProtocol { + fn common(&self, s: &ContainerSpec) -> Vec { + let mut a = Vec::new(); + if s.rm.unwrap_or(false) { a.push("--rm".into()); } + if let Some(n) = &s.name { a.extend(["--name".into(), n.clone()]); } + if let Some(net) = &s.network { a.extend(["--network".into(), net.clone()]); } + if let Some(ps) = &s.ports { for p in ps { a.extend(["-p".into(), p.clone()]); } } + if let Some(vs) = &s.volumes { for v in vs { a.extend(["-v".into(), v.clone()]); } } + if let Some(e) = &s.env { + let mut keys: Vec<_> = e.keys().collect(); + keys.sort(); + for k in keys { a.extend(["-e".into(), format!("{}={}", k, e[k])]); } + } + if let Some(ep) = &s.entrypoint { a.extend(["--entrypoint".into(), ep.join(" ")]); } + a + } +} + +pub struct AppleContainerProtocol; +impl CliProtocol for AppleContainerProtocol { + fn protocol_name(&self) -> &str { "apple/container" } + fn run_args(&self, s: &ContainerSpec) -> Vec { let mut a = vec!["run".into()]; if s.rm.unwrap_or(false) { a.push("--rm".into()); } if let Some(n) = &s.name { a.extend(["--name".into(), n.clone()]); } a.push(s.image.clone()); if let Some(c) = &s.cmd { a.extend(c.iter().cloned()); } a } + fn create_args(&self, s: &ContainerSpec) -> Vec { DockerProtocol.create_args(s) } + fn start_args(&self, id: &str) -> Vec { DockerProtocol.start_args(id) } + fn stop_args(&self, id: &str, t: Option) -> Vec { DockerProtocol.stop_args(id, t) } + fn remove_args(&self, id: &str, f: bool) -> Vec { DockerProtocol.remove_args(id, f) } + fn list_args(&self, a: bool) -> Vec { DockerProtocol.list_args(a) } + fn inspect_args(&self, id: &str) -> Vec { DockerProtocol.inspect_args(id) } + fn logs_args(&self, id: &str, t: Option) -> Vec { DockerProtocol.logs_args(id, t) } + fn exec_args(&self, id: &str, cmd: &[String], env: Option<&HashMap>, wd: Option<&str>) -> Vec { DockerProtocol.exec_args(id, cmd, env, wd) } + fn pull_image_args(&self, r: &str) -> Vec { DockerProtocol.pull_image_args(r) } + fn list_images_args(&self) -> Vec { DockerProtocol.list_images_args() } + fn remove_image_args(&self, r: &str, f: bool) -> Vec { DockerProtocol.remove_image_args(r, f) } + fn inspect_image_args(&self, r: &str) -> Vec { DockerProtocol.inspect_image_args(r) } + fn manifest_inspect_args(&self, r: &str) -> Vec { DockerProtocol.manifest_inspect_args(r) } + fn build_args(&self, spec: &ComposeServiceBuild, image_name: &str) -> Vec { DockerProtocol.build_args(spec, image_name) } + fn create_network_args(&self, n: &str, c: &NetworkConfig) -> Vec { DockerProtocol.create_network_args(n, c) } + fn remove_network_args(&self, n: &str) -> Vec { DockerProtocol.remove_network_args(n) } + fn inspect_network_args(&self, n: &str) -> Vec { DockerProtocol.inspect_network_args(n) } + fn create_volume_args(&self, n: &str, c: &VolumeConfig) -> Vec { DockerProtocol.create_volume_args(n, c) } + fn remove_volume_args(&self, n: &str) -> Vec { DockerProtocol.remove_volume_args(n) } + fn inspect_volume_args(&self, n: &str) -> Vec { DockerProtocol.inspect_volume_args(n) } + fn wait_args(&self, id: &str) -> Vec { DockerProtocol.wait_args(id) } + fn parse_list_output(&self, s: &str) -> Result> { DockerProtocol.parse_list_output(s) } + fn parse_inspect_output(&self, s: &str) -> Result { DockerProtocol.parse_inspect_output(s) } + fn parse_list_images_output(&self, s: &str) -> Result> { DockerProtocol.parse_list_images_output(s) } + fn parse_inspect_image_output(&self, s: &str) -> Result { DockerProtocol.parse_inspect_image_output(s) } + fn parse_container_id(&self, s: &str) -> Result { DockerProtocol.parse_container_id(s) } +} + +pub struct LimaProtocol { pub instance: String } +impl CliProtocol for LimaProtocol { + fn protocol_name(&self) -> &str { "lima" } + fn subcommand_prefix(&self) -> Option> { Some(vec!["shell".into(), self.instance.clone(), "nerdctl".into()]) } + fn run_args(&self, s: &ContainerSpec) -> Vec { DockerProtocol.run_args(s) } + fn create_args(&self, s: &ContainerSpec) -> Vec { DockerProtocol.create_args(s) } + fn start_args(&self, id: &str) -> Vec { DockerProtocol.start_args(id) } + fn stop_args(&self, id: &str, t: Option) -> Vec { DockerProtocol.stop_args(id, t) } + fn remove_args(&self, id: &str, f: bool) -> Vec { DockerProtocol.remove_args(id, f) } + fn list_args(&self, a: bool) -> Vec { DockerProtocol.list_args(a) } + fn inspect_args(&self, id: &str) -> Vec { DockerProtocol.inspect_args(id) } + fn logs_args(&self, id: &str, t: Option) -> Vec { DockerProtocol.logs_args(id, t) } + fn exec_args(&self, id: &str, cmd: &[String], env: Option<&HashMap>, wd: Option<&str>) -> Vec { DockerProtocol.exec_args(id, cmd, env, wd) } + fn pull_image_args(&self, r: &str) -> Vec { DockerProtocol.pull_image_args(r) } + fn list_images_args(&self) -> Vec { DockerProtocol.list_images_args() } + fn remove_image_args(&self, r: &str, f: bool) -> Vec { DockerProtocol.remove_image_args(r, f) } + fn inspect_image_args(&self, r: &str) -> Vec { DockerProtocol.inspect_image_args(r) } + fn manifest_inspect_args(&self, r: &str) -> Vec { DockerProtocol.manifest_inspect_args(r) } + fn build_args(&self, spec: &ComposeServiceBuild, image_name: &str) -> Vec { DockerProtocol.build_args(spec, image_name) } + fn create_network_args(&self, n: &str, c: &NetworkConfig) -> Vec { DockerProtocol.create_network_args(n, c) } + fn remove_network_args(&self, n: &str) -> Vec { DockerProtocol.remove_network_args(n) } + fn inspect_network_args(&self, n: &str) -> Vec { DockerProtocol.inspect_network_args(n) } + fn create_volume_args(&self, n: &str, c: &VolumeConfig) -> Vec { DockerProtocol.create_volume_args(n, c) } + fn remove_volume_args(&self, n: &str) -> Vec { DockerProtocol.remove_volume_args(n) } + fn inspect_volume_args(&self, n: &str) -> Vec { DockerProtocol.inspect_volume_args(n) } + fn wait_args(&self, id: &str) -> Vec { DockerProtocol.wait_args(id) } + fn parse_list_output(&self, s: &str) -> Result> { DockerProtocol.parse_list_output(s) } + fn parse_inspect_output(&self, s: &str) -> Result { DockerProtocol.parse_inspect_output(s) } + fn parse_list_images_output(&self, s: &str) -> Result> { DockerProtocol.parse_list_images_output(s) } + fn parse_inspect_image_output(&self, s: &str) -> Result { DockerProtocol.parse_inspect_image_output(s) } + fn parse_container_id(&self, s: &str) -> Result { DockerProtocol.parse_container_id(s) } +} + +pub async fn detect_backend() -> Result> { + if let Ok(name) = std::env::var("PERRY_CONTAINER_BACKEND") { return probe_candidate(name.trim()).await; } + for name in platform_candidates() { + if let Ok(Ok(backend)) = tokio::time::timeout(std::time::Duration::from_secs(2), probe_candidate(name)).await { return Ok(backend); } + } + Err(ComposeError::NoBackendFound { probed: vec![] }) +} + +fn platform_candidates() -> &'static [&'static str] { + #[cfg(any(target_os = "macos", target_os = "ios"))] + { &["apple/container", "orbstack", "colima", "rancher-desktop", "lima", "podman", "docker"] } + #[cfg(target_os = "linux")] + { &["podman", "nerdctl", "docker"] } + #[cfg(target_os = "windows")] + { &["podman", "docker"] } + #[cfg(not(any(target_os = "macos", target_os = "ios", target_os = "linux", target_os = "windows")))] + { &["podman", "docker"] } +} + +async fn probe_candidate(name: &str) -> Result> { + match name { + "apple/container" => { + let bin = which::which("container").map_err(|_| ComposeError::NotFound("container not found".into()))?; + Ok(Arc::new(CliBackend::new(bin, AppleContainerProtocol))) + } + "orbstack" => { + let bin = which::which("orb").or_else(|_| which::which("docker")).map_err(|_| ComposeError::NotFound("orbstack not found".into()))?; + Ok(Arc::new(CliBackend::new(bin, DockerProtocol))) + } + "docker" => { + let bin = which::which("docker").map_err(|_| ComposeError::NotFound("docker not found".into()))?; + Ok(Arc::new(CliBackend::new(bin, DockerProtocol))) + } + "podman" => { + let bin = which::which("podman").map_err(|_| ComposeError::NotFound("podman not found".into()))?; + Ok(Arc::new(CliBackend::new(bin, DockerProtocol))) + } + "lima" => { + let bin = which::which("limactl").map_err(|_| ComposeError::NotFound("limactl not found".into()))?; + Ok(Arc::new(CliBackend::new(bin, LimaProtocol { instance: "default".into() }))) + } + _ => Err(ComposeError::NotFound(format!("Unknown backend: {}", name))), + } +} diff --git a/crates/perry-container-compose/src/cli.rs b/crates/perry-container-compose/src/cli.rs new file mode 100644 index 000000000..0472133eb --- /dev/null +++ b/crates/perry-container-compose/src/cli.rs @@ -0,0 +1,115 @@ +//! CLI entry point for `perry-compose` binary. + +use crate::compose::ComposeEngine; +use crate::error::Result; +use crate::project::ComposeProject; +use clap::{Args, Parser, Subcommand}; +use std::path::PathBuf; +use std::sync::Arc; + +#[derive(Parser, Debug)] +#[command(name = "perry-compose", version, about = "Docker Compose-like CLI", long_about = None)] +pub struct Cli { + #[arg(short = 'f', long = "file", value_name = "FILE", global = true)] + pub files: Vec, + #[arg(short = 'p', long = "project-name", global = true)] + pub project_name: Option, + #[arg(long = "env-file", value_name = "FILE", global = true)] + pub env_files: Vec, + #[command(subcommand)] + pub command: Commands, +} + +#[derive(Subcommand, Debug)] +pub enum Commands { + Up(UpArgs), + Down(DownArgs), + Start(ServiceArgs), + Stop(ServiceArgs), + Restart(ServiceArgs), + Ps(PsArgs), + Logs(LogsArgs), + Exec(ExecArgs), + Config(ConfigArgs), +} + +#[derive(Args, Debug)] +pub struct UpArgs { + #[arg(short = 'd', long = "detach")] + pub detach: bool, + #[arg(long = "build")] + pub build: bool, + #[arg(long = "remove-orphans")] + pub remove_orphans: bool, + pub services: Vec, +} + +#[derive(Args, Debug)] +pub struct DownArgs { + #[arg(short = 'v', long = "volumes")] + pub volumes: bool, + #[arg(long = "remove-orphans")] + pub remove_orphans: bool, + pub services: Vec, +} + +#[derive(Args, Debug)] +pub struct ServiceArgs { pub services: Vec } + +#[derive(Args, Debug)] +pub struct PsArgs { pub services: Vec } + +#[derive(Args, Debug)] +pub struct LogsArgs { + #[arg(long = "tail")] + pub tail: Option, + pub services: Vec, +} + +#[derive(Args, Debug)] +pub struct ExecArgs { + pub service: String, + pub cmd: Vec, + pub workdir: Option, +} + +#[derive(Args, Debug)] +pub struct ConfigArgs {} + +pub async fn run(cli: Cli) -> Result<()> { + let config = crate::config::ProjectConfig::new(cli.files.clone(), cli.project_name.clone(), cli.env_files.clone()); + let project = ComposeProject::load(&config)?; + let backend = crate::backend::detect_backend().await?; + let engine = Arc::new(ComposeEngine::new(project.spec.clone(), project.project_name.clone(), backend)); + + match cli.command { + Commands::Up(args) => { + Arc::clone(&engine).up(&args.services, args.detach, args.build, args.remove_orphans).await?; + } + Commands::Down(args) => { + engine.down(&args.services, args.remove_orphans, args.volumes).await?; + } + Commands::Start(args) => { engine.start(&args.services).await?; } + Commands::Stop(args) => { engine.stop(&args.services).await?; } + Commands::Restart(args) => { engine.restart(&args.services).await?; } + Commands::Ps(_) => { + let infos = engine.ps().await?; + for info in infos { println!("{:<24} {:<12} {:<36}", info.name, info.status, info.id); } + } + Commands::Logs(args) => { + let svc = if args.services.is_empty() { None } else { Some(args.services[0].as_str()) }; + let logs = engine.logs(svc, args.tail).await?; + print!("{}", logs.stdout); + eprint!("{}", logs.stderr); + } + Commands::Exec(args) => { + let logs = engine.exec(&args.service, &args.cmd).await?; + print!("{}", logs.stdout); + eprint!("{}", logs.stderr); + } + Commands::Config(_) => { + println!("{}", engine.config()?); + } + } + Ok(()) +} diff --git a/crates/perry-container-compose/src/compose.rs b/crates/perry-container-compose/src/compose.rs new file mode 100644 index 000000000..83c797a24 --- /dev/null +++ b/crates/perry-container-compose/src/compose.rs @@ -0,0 +1,196 @@ +//! `ComposeEngine` — the core compose orchestration engine. + +use crate::backend::{ContainerBackend, NetworkConfig, VolumeConfig}; +pub use crate::types::ContainerLogs; +use crate::error::{ComposeError, Result}; +use crate::service; +use crate::types::{ + ComposeHandle, ComposeSpec, ContainerInfo, ContainerSpec, +}; +use indexmap::IndexMap; +use std::sync::atomic::{AtomicU64, Ordering}; +use std::sync::{Arc, Mutex}; + +static COMPOSE_ENGINES: once_cell::sync::Lazy>>> = + once_cell::sync::Lazy::new(|| Mutex::new(IndexMap::new())); + +static NEXT_STACK_ID: AtomicU64 = AtomicU64::new(1); + +pub struct ComposeEngine { + pub spec: ComposeSpec, + pub project_name: String, + pub backend: Arc, + session_containers: Mutex>, + session_networks: Mutex>, + session_volumes: Mutex>, +} + +impl ComposeEngine { + pub fn new(spec: ComposeSpec, project_name: String, backend: Arc) -> Self { + ComposeEngine { + spec, project_name, backend, + session_containers: Mutex::new(Vec::new()), + session_networks: Mutex::new(Vec::new()), + session_volumes: Mutex::new(Vec::new()), + } + } + + fn register(self: Arc) -> ComposeHandle { + let stack_id = NEXT_STACK_ID.fetch_add(1, Ordering::SeqCst); + let services = self.spec.services.keys().cloned().collect(); + let handle = ComposeHandle { stack_id, project_name: self.project_name.clone(), services }; + COMPOSE_ENGINES.lock().unwrap().insert(stack_id, Arc::clone(&self)); + handle + } + + pub async fn up(self: Arc, services: &[String], _detach: bool, _build: bool, _remove_orphans: bool) -> Result { + let order = resolve_startup_order(&self.spec)?; + let target: Vec<&String> = if services.is_empty() { order.iter().collect() } else { order.iter().filter(|s| services.contains(s)).collect() }; + + if let Some(networks) = &self.spec.networks { + for (net_name, net_config_opt) in networks { + let external = net_config_opt.as_ref().map_or(false, |c| c.external.unwrap_or(false)); + if external { continue; } + let net_config_spec = net_config_opt.as_ref().cloned().unwrap_or_default(); + let net_config = NetworkConfig::from(&net_config_spec); + let resolved_name = net_config_spec.name.as_deref().unwrap_or(net_name.as_str()); + if self.backend.inspect_network(resolved_name).await.is_err() { + self.backend.create_network(resolved_name, &net_config).await?; + self.session_networks.lock().unwrap().push(resolved_name.to_string()); + } + } + } + + if let Some(volumes) = &self.spec.volumes { + for (vol_name, vol_config_opt) in volumes { + let external = vol_config_opt.as_ref().map_or(false, |c| c.external.unwrap_or(false)); + if external { continue; } + let vol_config_spec = vol_config_opt.as_ref().cloned().unwrap_or_default(); + let vol_config = VolumeConfig::from(&vol_config_spec); + let resolved_name = vol_config_spec.name.as_deref().unwrap_or(vol_name.as_str()); + if self.backend.inspect_volume(resolved_name).await.is_err() { + self.backend.create_volume(resolved_name, &vol_config).await?; + self.session_volumes.lock().unwrap().push(resolved_name.to_string()); + } + } + } + + for svc_name in target { + let svc = self.spec.services.get(svc_name).ok_or_else(|| ComposeError::NotFound(svc_name.clone()))?; + let container_name = service::service_container_name(svc, svc_name); + if let Ok(info) = self.backend.inspect(&container_name).await { + if info.status != "running" { self.backend.start(&container_name).await?; } + } else { + let spec = ContainerSpec { + image: svc.image_ref(svc_name), name: Some(container_name.clone()), + ports: Some(svc.port_strings()), volumes: Some(svc.volume_strings()), + env: Some(svc.resolved_env()), cmd: svc.command_list(), rm: Some(false), ..Default::default() + }; + self.backend.run(&spec).await?; + } + self.session_containers.lock().unwrap().push(container_name); + } + Ok(self.register()) + } + + pub async fn down(&self, _services: &[String], _remove_orphans: bool, remove_volumes: bool) -> Result<()> { + let containers = { let mut c = self.session_containers.lock().unwrap(); std::mem::take(&mut *c) }; + for c_name in containers.iter().rev() { + let _ = self.backend.stop(c_name, None).await; + let _ = self.backend.remove(c_name, true).await; + } + let networks = { let mut n = self.session_networks.lock().unwrap(); std::mem::take(&mut *n) }; + for n_name in networks { let _ = self.backend.remove_network(&n_name).await; } + if remove_volumes { + let volumes = { let mut v = self.session_volumes.lock().unwrap(); std::mem::take(&mut *v) }; + for v_name in volumes { let _ = self.backend.remove_volume(&v_name).await; } + } + Ok(()) + } + + pub async fn ps(&self) -> Result> { + let names = { self.session_containers.lock().unwrap().clone() }; + let mut results = Vec::new(); + for c_name in names { if let Ok(info) = self.backend.inspect(&c_name).await { results.push(info); } } + Ok(results) + } + + pub async fn logs(&self, service: Option<&str>, tail: Option) -> Result { + let names = { self.session_containers.lock().unwrap().clone() }; + let mut stdout = String::new(); + let mut stderr = String::new(); + for c_name in names { + if let Some(s) = service { if !c_name.contains(s) { continue; } } + if let Ok(l) = self.backend.logs(&c_name, tail).await { + stdout.push_str(&l.stdout); + stderr.push_str(&l.stderr); + } + } + Ok(ContainerLogs { stdout, stderr }) + } + + pub async fn exec(&self, service: &str, cmd: &[String]) -> Result { + let c_name = { + let c = self.session_containers.lock().unwrap(); + c.iter().find(|n| n.contains(service)).cloned() + }.ok_or_else(|| ComposeError::NotFound(service.to_string()))?; + self.backend.exec(&c_name, cmd, None, None).await + } + + pub fn config(&self) -> Result { self.spec.to_yaml() } + + pub async fn start(&self, services: &[String]) -> Result<()> { + let names = { self.session_containers.lock().unwrap().clone() }; + for c_name in names { + if services.is_empty() || services.iter().any(|s| c_name.contains(s)) { + self.backend.start(&c_name).await?; + } + } + Ok(()) + } + + pub async fn stop(&self, services: &[String]) -> Result<()> { + let names = { self.session_containers.lock().unwrap().clone() }; + for c_name in names { + if services.is_empty() || services.iter().any(|s| c_name.contains(s)) { + self.backend.stop(&c_name, None).await?; + } + } + Ok(()) + } + + pub async fn restart(&self, services: &[String]) -> Result<()> { + self.stop(services).await?; + self.start(services).await + } +} + +pub fn resolve_startup_order(spec: &ComposeSpec) -> Result> { + let mut in_degree: IndexMap = IndexMap::new(); + let mut dependents: IndexMap> = IndexMap::new(); + for name in spec.services.keys() { in_degree.insert(name.clone(), 0); dependents.insert(name.clone(), Vec::new()); } + for (name, service) in &spec.services { + if let Some(deps) = &service.depends_on { + for dep in deps.service_names() { + if !spec.services.contains_key(&dep) { return Err(ComposeError::validation(format!("Service '{}' depends on '{}' which is not defined", name, dep))); } + *in_degree.get_mut(name).unwrap() += 1; + dependents.get_mut(&dep).unwrap().push(name.clone()); + } + } + } + let mut queue: std::collections::BTreeSet = in_degree.iter().filter(|(_, °)| deg == 0).map(|(name, _)| name.clone()).collect(); + let mut order: Vec = Vec::new(); + while let Some(service) = queue.pop_first() { + order.push(service.clone()); + for dependent in dependents.get(&service).unwrap_or(&Vec::new()).clone() { + let deg = in_degree.get_mut(&dependent).unwrap(); + *deg -= 1; + if *deg == 0 { queue.insert(dependent); } + } + } + if order.len() != spec.services.len() { + let cycle_services: Vec = in_degree.iter().filter(|(_, °)| deg > 0).map(|(name, _)| name.clone()).collect(); + return Err(ComposeError::DependencyCycle { services: cycle_services }); + } + Ok(order) +} diff --git a/crates/perry-container-compose/src/config.rs b/crates/perry-container-compose/src/config.rs new file mode 100644 index 000000000..435c0d3f4 --- /dev/null +++ b/crates/perry-container-compose/src/config.rs @@ -0,0 +1,274 @@ +//! Project configuration and environment variable resolution. +//! +//! Implements the priority chain for compose file discovery and project naming +//! as defined in the compose-spec and requirements 9.1–9.8. + +use crate::error::{ComposeError, Result}; +use std::path::{Path, PathBuf}; + +/// Default compose file names to search for, in priority order (req 9.6). +pub const DEFAULT_COMPOSE_FILES: &[&str] = &[ + "compose.yaml", + "compose.yml", + "docker-compose.yaml", + "docker-compose.yml", +]; + +/// Project-level configuration holding raw CLI inputs for file paths, project name, and env files. +/// +/// This is the *project-level* config struct — distinct from the compose-spec +/// `ComposeConfig` type in `types.rs` which describes a top-level `configs:` entry. +/// +/// Use [`ProjectConfig::new`] to construct from CLI args, then pass to +/// [`crate::project::ComposeProject::load`] which runs the full resolution chain. +#[derive(Debug, Clone)] +pub struct ProjectConfig { + /// Compose file paths from `-f` flags (empty = use env var / default discovery). + pub compose_files: Vec, + /// Project name from `-p` flag (`None` = use env var / directory name). + pub project_name: Option, + /// Extra environment file paths from `--env-file` flags. + pub env_files: Vec, +} + +impl ProjectConfig { + /// Create a `ProjectConfig` from raw CLI inputs. + /// + /// No resolution is performed here; call [`crate::project::ComposeProject::load`] + /// to run the full priority chain (req 9.1–9.8). + pub fn new( + compose_files: Vec, + project_name: Option, + env_files: Vec, + ) -> Self { + ProjectConfig { + compose_files, + project_name, + env_files, + } + } +} + +/// Resolve the project name. +/// +/// Priority (req 9.3, 9.4, 9.7): +/// 1. CLI `-p` / `--project-name` flag +/// 2. `COMPOSE_PROJECT_NAME` environment variable +/// 3. Directory name of the directory containing the primary compose file +pub fn resolve_project_name(cli_name: Option<&str>, project_dir: &Path) -> String { + if let Some(name) = cli_name { + if !name.is_empty() { + return name.to_string(); + } + } + + if let Ok(name) = std::env::var("COMPOSE_PROJECT_NAME") { + if !name.is_empty() { + return name; + } + } + + // Fall back to the directory name (req 9.7). + project_dir + .file_name() + .map(|n| n.to_string_lossy().into_owned()) + .unwrap_or_else(|| "project".to_string()) +} + +/// Resolve compose file paths. +/// +/// Priority (req 9.1, 9.5, 9.6): +/// 1. CLI `-f` / `--file` flags — returned as-is; missing files produce an error (req 9.8) +/// 2. `COMPOSE_FILE` environment variable — colon-separated list of paths; missing files error +/// 3. Default file search in CWD: `compose.yaml`, `compose.yml`, `docker-compose.yaml`, +/// `docker-compose.yml` (in that order) +pub fn resolve_compose_files(cli_files: &[PathBuf]) -> Result> { + if !cli_files.is_empty() { + // Validate every explicitly-specified file exists (req 9.8). + for path in cli_files { + if !path.exists() { + return Err(ComposeError::FileNotFound { + path: path.display().to_string(), + }); + } + } + return Ok(cli_files.to_vec()); + } + + if let Ok(compose_file_env) = std::env::var("COMPOSE_FILE") { + if !compose_file_env.is_empty() { + // The compose-spec uses `:` on POSIX and `;` on Windows (req 9.5). + #[cfg(target_os = "windows")] + let separator = ";"; + #[cfg(not(target_os = "windows"))] + let separator = ":"; + + let paths: Vec = compose_file_env + .split(separator) + .filter(|s| !s.is_empty()) + .map(PathBuf::from) + .collect(); + + // Validate every path from the env var (req 9.8). + for path in &paths { + if !path.exists() { + return Err(ComposeError::FileNotFound { + path: path.display().to_string(), + }); + } + } + + if !paths.is_empty() { + return Ok(paths); + } + } + } + + // Fall back to searching CWD for a default compose file (req 9.6). + let cwd = std::env::current_dir()?; + find_default_compose_file(&cwd) +} + +/// Search `dir` for the first default compose file that exists (req 9.6). +/// +/// Returns `Err(ComposeError::FileNotFound)` if none are found. +pub fn find_default_compose_file(dir: &Path) -> Result> { + for name in DEFAULT_COMPOSE_FILES { + let candidate = dir.join(name); + if candidate.exists() { + return Ok(vec![candidate]); + } + } + Err(ComposeError::FileNotFound { + path: format!( + "No compose file found in '{}' (tried: {})", + dir.display(), + DEFAULT_COMPOSE_FILES.join(", ") + ), + }) +} + +#[cfg(test)] +mod tests { + use super::*; + use std::fs; + use std::sync::Mutex; + + // Use a global lock for tests that manipulate environment variables to avoid race conditions + // when running tests in parallel (req 9.3, 9.4). + static ENV_LOCK: Mutex<()> = Mutex::new(()); + + fn make_temp_dir(suffix: &str) -> PathBuf { + let dir = std::env::temp_dir().join(format!("perry-config-test-{suffix}")); + fs::create_dir_all(&dir).expect("create temp dir"); + dir + } + + // ── resolve_project_name ────────────────────────────────────────────────── + + #[test] + fn test_project_name_cli_takes_priority() { + let dir = make_temp_dir("cli-priority"); + let name = resolve_project_name(Some("explicit-name"), &dir); + assert_eq!(name, "explicit-name"); + } + + #[test] + fn test_project_name_env_var_fallback() { + let _lock = ENV_LOCK.lock().unwrap(); + let dir = make_temp_dir("env-fallback"); + // Temporarily set the env var; restore afterwards. + std::env::set_var("COMPOSE_PROJECT_NAME", "env-project"); + let name = resolve_project_name(None, &dir); + std::env::remove_var("COMPOSE_PROJECT_NAME"); + assert_eq!(name, "env-project"); + } + + #[test] + fn test_project_name_dir_fallback() { + let _lock = ENV_LOCK.lock().unwrap(); + // Ensure env var is not set for this test. + std::env::remove_var("COMPOSE_PROJECT_NAME"); + let dir = make_temp_dir("dir-fallback"); + let name = resolve_project_name(None, &dir); + assert_eq!(name, "perry-config-test-dir-fallback"); + } + + #[test] + fn test_project_name_empty_cli_falls_through_to_env() { + let _lock = ENV_LOCK.lock().unwrap(); + let dir = make_temp_dir("empty-cli"); + std::env::set_var("COMPOSE_PROJECT_NAME", "from-env"); + let name = resolve_project_name(Some(""), &dir); + std::env::remove_var("COMPOSE_PROJECT_NAME"); + assert_eq!(name, "from-env"); + } + + // ── resolve_compose_files ───────────────────────────────────────────────── + + #[test] + fn test_cli_files_returned_directly() { + let dir = make_temp_dir("cli-files"); + let file = dir.join("compose.yaml"); + fs::write(&file, "services: {}").unwrap(); + + let result = resolve_compose_files(&[file.clone()]).unwrap(); + assert_eq!(result, vec![file]); + } + + #[test] + fn test_cli_file_missing_returns_error() { + let missing = PathBuf::from("/nonexistent/path/compose.yaml"); + let err = resolve_compose_files(&[missing.clone()]).unwrap_err(); + match err { + ComposeError::FileNotFound { path } => { + assert!(path.contains("nonexistent")); + } + other => panic!("expected FileNotFound, got {other:?}"), + } + } + + #[test] + fn test_default_file_discovery_compose_yaml() { + let dir = make_temp_dir("default-discovery"); + let file = dir.join("compose.yaml"); + fs::write(&file, "services: {}").unwrap(); + + // Use find_default_compose_file directly to avoid set_current_dir races. + let result = find_default_compose_file(&dir).unwrap(); + assert_eq!(result.len(), 1); + assert_eq!(result[0].file_name().unwrap(), "compose.yaml"); + } + + #[test] + fn test_default_file_discovery_docker_compose_yml_fallback() { + let dir = make_temp_dir("docker-compose-fallback"); + let file = dir.join("docker-compose.yml"); + fs::write(&file, "services: {}").unwrap(); + + let result = find_default_compose_file(&dir).unwrap(); + assert_eq!(result.len(), 1); + assert_eq!(result[0].file_name().unwrap(), "docker-compose.yml"); + } + + #[test] + fn test_no_compose_file_returns_error() { + let dir = make_temp_dir("no-file"); + let result = find_default_compose_file(&dir); + assert!(matches!(result, Err(ComposeError::FileNotFound { .. }))); + } + + // ── ProjectConfig::new ──────────────────────────────────────────────────── + + #[test] + fn test_project_config_new_stores_raw_inputs() { + let dir = make_temp_dir("project-config"); + let file = dir.join("compose.yaml"); + fs::write(&file, "services: {}").unwrap(); + + let cfg = ProjectConfig::new(vec![file.clone()], Some("my-project".into()), vec![]); + assert_eq!(cfg.project_name, Some("my-project".to_string())); + assert_eq!(cfg.compose_files, vec![file]); + assert!(cfg.env_files.is_empty()); + } +} diff --git a/crates/perry-container-compose/src/error.rs b/crates/perry-container-compose/src/error.rs new file mode 100644 index 000000000..c51355cb7 --- /dev/null +++ b/crates/perry-container-compose/src/error.rs @@ -0,0 +1,133 @@ +//! Error types for perry-container-compose. +//! +//! Defines the canonical `ComposeError` enum and FFI error mapping. + +use serde::{Deserialize, Serialize}; +use thiserror::Error; + +/// Result of probing a single container backend candidate. +#[derive(Debug, Clone, Serialize, Deserialize)] +pub struct BackendProbeResult { + pub name: String, + pub available: bool, + pub reason: String, +} + +/// Top-level crate error +#[derive(Debug, Error)] +pub enum ComposeError { + #[error("Dependency cycle detected in services: {services:?}")] + DependencyCycle { services: Vec }, + + #[error("Service '{service}' failed to start: {message}")] + ServiceStartupFailed { service: String, message: String }, + + #[error("Backend error (exit {code}): {message}")] + BackendError { code: i32, message: String }, + + #[error("Not found: {0}")] + NotFound(String), + + #[error("Parse error: {0}")] + ParseError(#[from] serde_yaml::Error), + + #[error("JSON error: {0}")] + JsonError(#[from] serde_json::Error), + + #[error("I/O error: {0}")] + IoError(#[from] std::io::Error), + + #[error("Validation error: {message}")] + ValidationError { message: String }, + + #[error("Image verification failed for '{image}': {reason}")] + VerificationFailed { image: String, reason: String }, + + #[error("File not found: {path}")] + FileNotFound { path: String }, + + #[error("No container backend found. Probed: {probed:?}")] + NoBackendFound { probed: Vec }, + + #[error("Backend '{name}' is not available: {reason}")] + BackendNotAvailable { name: String, reason: String }, +} + +impl ComposeError { + pub fn validation(msg: impl Into) -> Self { + ComposeError::ValidationError { + message: msg.into(), + } + } +} + +pub type Result = std::result::Result; + +/// Convert a `ComposeError` to a JSON string `{ "message": "...", "code": N }` +/// suitable for passing across the FFI boundary. +pub fn compose_error_to_js(e: &ComposeError) -> String { + let code = match e { + ComposeError::NotFound(_) => 404, + ComposeError::FileNotFound { .. } => 404, + ComposeError::BackendError { code, .. } => *code, + ComposeError::DependencyCycle { .. } => 422, + ComposeError::ValidationError { .. } => 400, + ComposeError::ParseError(_) => 400, + ComposeError::JsonError(_) => 400, + ComposeError::VerificationFailed { .. } => 403, + ComposeError::NoBackendFound { .. } => 503, + ComposeError::BackendNotAvailable { .. } => 503, + ComposeError::ServiceStartupFailed { .. } => 500, + ComposeError::IoError(_) => 500, + }; + serde_json::json!({ + "message": e.to_string(), + "code": code + }) + .to_string() +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn test_error_codes() { + let err = ComposeError::NotFound("foo".into()); + assert_eq!(compose_error_to_js(&err).contains("\"code\":404"), true); + + let err = ComposeError::DependencyCycle { + services: vec!["a".into()], + }; + assert_eq!(compose_error_to_js(&err).contains("\"code\":422"), true); + + let err = ComposeError::ValidationError { + message: "bad".into(), + }; + assert_eq!(compose_error_to_js(&err).contains("\"code\":400"), true); + + let err = ComposeError::VerificationFailed { + image: "img".into(), + reason: "fail".into(), + }; + assert_eq!(compose_error_to_js(&err).contains("\"code\":403"), true); + + let err = ComposeError::ParseError(serde_yaml::from_str::("bad: [1,2").unwrap_err()); + assert_eq!(compose_error_to_js(&err).contains("\"code\":400"), true); + + let err = ComposeError::NoBackendFound { + probed: vec![BackendProbeResult { + name: "docker".into(), + available: false, + reason: "not found".into(), + }], + }; + assert_eq!(compose_error_to_js(&err).contains("\"code\":503"), true); + + let err = ComposeError::BackendNotAvailable { + name: "podman".into(), + reason: "machine not running".into(), + }; + assert_eq!(compose_error_to_js(&err).contains("\"code\":503"), true); + } +} diff --git a/crates/perry-container-compose/src/ffi.rs b/crates/perry-container-compose/src/ffi.rs new file mode 100644 index 000000000..4f92968f4 --- /dev/null +++ b/crates/perry-container-compose/src/ffi.rs @@ -0,0 +1,200 @@ +//! FFI exports for Perry TypeScript integration. +//! +//! Each function follows the Perry FFI convention: +//! - String arguments arrive as `*const StringHeader` (Perry runtime layout) +//! - Results are serialised to JSON strings before being handed back to JS + +use crate::compose::ComposeEngine; +use std::path::PathBuf; +use std::sync::Arc; + +// ────────────────────────────────────────────────────────────── +// Minimal re-implementation of the Perry runtime string types +// ────────────────────────────────────────────────────────────── + +#[repr(C)] +pub struct StringHeader { + pub length: u32, +} + +unsafe fn string_from_header(ptr: *const StringHeader) -> Option { + if ptr.is_null() || (ptr as usize) < 0x1000 { + return None; + } + let len = (*ptr).length as usize; + let data_ptr = (ptr as *const u8).add(std::mem::size_of::()); + let bytes = std::slice::from_raw_parts(data_ptr, len); + Some(String::from_utf8_lossy(bytes).into_owned()) +} + +// ────────────────────────────────────────────────────────────── +// Helpers +// ────────────────────────────────────────────────────────────── + +fn json_ok(value: &str) -> *const StringHeader { + let payload = format!("{{\"ok\":true,\"result\":{}}}", value); + heap_string(payload) +} + +fn json_err(message: &str) -> *const StringHeader { + let escaped = message.replace('"', "\\\""); + let payload = format!("{{\"ok\":false,\"error\":\"{}\"}}", escaped); + heap_string(payload) +} + +fn heap_string(s: String) -> *const StringHeader { + let bytes = s.into_bytes(); + let total = std::mem::size_of::() + bytes.len(); + let layout = std::alloc::Layout::from_size_align(total, std::mem::align_of::()) + .expect("layout"); + unsafe { + let ptr = std::alloc::alloc(layout) as *mut StringHeader; + (*ptr).length = bytes.len() as u32; + let data_ptr = (ptr as *mut u8).add(std::mem::size_of::()); + std::ptr::copy_nonoverlapping(bytes.as_ptr(), data_ptr, bytes.len()); + ptr as *const StringHeader + } +} + +fn block, T>(fut: F) -> T { + tokio::runtime::Builder::new_current_thread() + .enable_all() + .build() + .expect("tokio runtime") + .block_on(fut) +} + +fn parse_compose_file(file_ptr: *const StringHeader) -> Option { + unsafe { string_from_header(file_ptr) }.map(PathBuf::from) +} + +fn make_engine(files: Vec) -> Result, String> { + let proj = crate::project::ComposeProject::load_from_files(&files, None, &[]) + .map_err(|e| e.to_string())?; + let backend: Arc = block(crate::backend::detect_backend()) + .map(Arc::from) + .map_err(|e| e.to_string())?; + Ok(Arc::new(ComposeEngine::new(proj.spec, proj.project_name, backend))) +} + +// ────────────────────────────────────────────────────────────── +// Exported FFI functions +// ────────────────────────────────────────────────────────────── + +#[no_mangle] +pub unsafe extern "C" fn js_compose_start(file_ptr: *const StringHeader) -> *const StringHeader { + let files: Vec = parse_compose_file(file_ptr).into_iter().collect(); + match make_engine(files) { + Err(e) => json_err(&e), + Ok(engine) => match block(engine.up(&[], true, false, false)) { + Ok(_) => json_ok("null"), + Err(e) => json_err(&e.to_string()), + }, + } +} + +#[no_mangle] +pub unsafe extern "C" fn js_compose_stop(file_ptr: *const StringHeader) -> *const StringHeader { + let files: Vec = parse_compose_file(file_ptr).into_iter().collect(); + match make_engine(files) { + Err(e) => json_err(&e), + Ok(engine) => match block(engine.down(false, false)) { + Ok(_) => json_ok("null"), + Err(e) => json_err(&e.to_string()), + }, + } +} + +#[no_mangle] +pub unsafe extern "C" fn js_compose_ps(file_ptr: *const StringHeader) -> *const StringHeader { + let files: Vec = parse_compose_file(file_ptr).into_iter().collect(); + match make_engine(files) { + Err(e) => json_err(&e), + Ok(engine) => match block(engine.ps()) { + Err(e) => json_err(&e.to_string()), + Ok(infos) => { + let items: Vec = infos + .iter() + .map(|i| { + format!( + "{{\"service\":\"{}\",\"container\":\"{}\",\"status\":\"{}\"}}", + i.name, i.id, i.status + ) + }) + .collect(); + let array = format!("[{}]", items.join(",")); + json_ok(&array) + } + }, + } +} + +#[no_mangle] +pub unsafe extern "C" fn js_compose_logs( + file_ptr: *const StringHeader, + services_ptr: *const StringHeader, + _follow: bool, +) -> *const StringHeader { + let files: Vec = parse_compose_file(file_ptr).into_iter().collect(); + let service: Option = string_from_header(services_ptr) + .and_then(|s| serde_json::from_str::>(&s).ok()) + .and_then(|v| v.into_iter().next()); + + match make_engine(files) { + Err(e) => json_err(&e), + Ok(engine) => match block(engine.logs(service.as_deref(), None)) { + Err(e) => json_err(&e.to_string()), + Ok(logs) => { + let stdout = logs.stdout.replace('"', "\\\"").replace('\n', "\\n"); + let stderr = logs.stderr.replace('"', "\\\"").replace('\n', "\\n"); + let payload = format!("{{\"stdout\":\"{}\",\"stderr\":\"{}\"}}", stdout, stderr); + json_ok(&payload) + } + }, + } +} + +#[no_mangle] +pub unsafe extern "C" fn js_compose_exec( + file_ptr: *const StringHeader, + service_ptr: *const StringHeader, + cmd_ptr: *const StringHeader, +) -> *const StringHeader { + let files: Vec = parse_compose_file(file_ptr).into_iter().collect(); + let service = match string_from_header(service_ptr) { + Some(s) => s, + None => return json_err("service name is required"), + }; + let cmd: Vec = string_from_header(cmd_ptr) + .and_then(|s| serde_json::from_str::>(&s).ok()) + .unwrap_or_default(); + + match make_engine(files) { + Err(e) => json_err(&e), + Ok(engine) => match block(engine.exec(&service, &cmd)) { + Err(e) => json_err(&e.to_string()), + Ok(result) => { + let stdout = result.stdout.replace('"', "\\\"").replace('\n', "\\n"); + let stderr = result.stderr.replace('"', "\\\"").replace('\n', "\\n"); + let payload = format!( + "{{\"stdout\":\"{}\",\"stderr\":\"{}\"}}", + stdout, stderr + ); + json_ok(&payload) + } + }, + } +} + +#[no_mangle] +pub unsafe extern "C" fn js_compose_config(file_ptr: *const StringHeader) -> *const StringHeader { + let files: Vec = parse_compose_file(file_ptr).into_iter().collect(); + match crate::project::ComposeProject::load_from_files(&files, None, &[]) { + Err(e) => json_err(&e.to_string()), + Ok(proj) => { + let yaml = proj.spec.to_yaml().unwrap_or_default(); + let escaped = yaml.replace('"', "\\\"").replace('\n', "\\n"); + json_ok(&format!("\"{}\"", escaped)) + } + } +} diff --git a/crates/perry-container-compose/src/lib.rs b/crates/perry-container-compose/src/lib.rs new file mode 100644 index 000000000..83ba40614 --- /dev/null +++ b/crates/perry-container-compose/src/lib.rs @@ -0,0 +1,22 @@ +//! `perry-container-compose` — Docker Compose-like experience for Apple Container / Podman. + +pub mod types; +pub mod error; +pub mod yaml; +pub mod project; +pub mod service; +pub mod compose; +pub mod backend; +pub mod cli; +pub mod config; + +#[cfg(feature = "ffi")] +pub mod ffi; + +pub use error::{ComposeError, Result}; +pub use types::{ComposeSpec, ComposeService, ComposeHandle}; +pub use compose::ComposeEngine; +pub use project::ComposeProject; +pub use backend::{ContainerBackend, CliBackend, CliProtocol, DockerProtocol, AppleContainerProtocol, LimaProtocol, BackendProbeResult, detect_backend}; + +pub use indexmap; diff --git a/crates/perry-container-compose/src/main.rs b/crates/perry-container-compose/src/main.rs new file mode 100644 index 000000000..73e014c72 --- /dev/null +++ b/crates/perry-container-compose/src/main.rs @@ -0,0 +1,21 @@ +//! CLI entry point for `perry-compose` binary. + +use clap::Parser; +use perry_container_compose::cli::{run, Cli}; +use tracing_subscriber::{fmt, EnvFilter}; + +#[tokio::main] +async fn main() { + // Initialise tracing (RUST_LOG env controls verbosity) + fmt() + .with_env_filter(EnvFilter::from_default_env()) + .with_target(false) + .init(); + + let cli = Cli::parse(); + + if let Err(e) = run(cli).await { + eprintln!("Error: {}", e); + std::process::exit(1); + } +} diff --git a/crates/perry-container-compose/src/project.rs b/crates/perry-container-compose/src/project.rs new file mode 100644 index 000000000..ef6366e30 --- /dev/null +++ b/crates/perry-container-compose/src/project.rs @@ -0,0 +1,35 @@ +use crate::error::Result; +use crate::config::ProjectConfig; +use crate::types::ComposeSpec; +use std::path::PathBuf; + +pub struct ComposeProject { + pub spec: ComposeSpec, + pub project_name: String, + pub project_dir: PathBuf, + pub compose_files: Vec, +} + +impl ComposeProject { + pub fn load(config: &ProjectConfig) -> Result { + let compose_files = crate::config::resolve_compose_files(&config.compose_files)?; + let project_dir = compose_files[0] + .parent() + .unwrap_or_else(|| std::path::Path::new(".")) + .to_path_buf(); + + let project_name = crate::config::resolve_project_name( + config.project_name.as_deref(), + &project_dir, + ); + + // TODO: Load .env files, interpolate YAML, parse and merge + // For now, return default spec + Ok(Self { + spec: ComposeSpec::default(), + project_name, + project_dir, + compose_files, + }) + } +} diff --git a/crates/perry-container-compose/src/service.rs b/crates/perry-container-compose/src/service.rs new file mode 100644 index 000000000..c5e2f901b --- /dev/null +++ b/crates/perry-container-compose/src/service.rs @@ -0,0 +1,20 @@ +use md5::{Digest, Md5}; + +pub fn service_container_name(service: &crate::types::ComposeService, _service_name: &str) -> String { + if let Some(name) = service.container_name.as_ref() { + return name.clone(); + } + let image = service.image.as_deref().unwrap_or("unknown"); + let mut hasher = Md5::new(); + hasher.update(image.as_bytes()); + let hash = hex::encode(hasher.finalize()); + let short_hash = &hash[..8]; + let random_suffix: u32 = rand::random(); + format!("{}-{:08x}", short_hash, random_suffix) +} + +pub struct ServiceState { + pub id: String, + pub name: String, + pub running: bool, +} diff --git a/crates/perry-container-compose/src/types.rs b/crates/perry-container-compose/src/types.rs new file mode 100644 index 000000000..531247a0a --- /dev/null +++ b/crates/perry-container-compose/src/types.rs @@ -0,0 +1,173 @@ +//! Data types for perry-container-compose. + +use indexmap::IndexMap; +use serde::{Deserialize, Serialize}; +use std::collections::HashMap; + +#[derive(Debug, Clone, Serialize, Deserialize, Default)] +pub struct ComposeSpec { + pub version: Option, + pub name: Option, + pub services: IndexMap, + pub networks: Option>>, + pub volumes: Option>>, +} + +impl ComposeSpec { + pub fn to_yaml(&self) -> Result { + serde_yaml::to_string(self).map_err(crate::error::ComposeError::ParseError) + } +} + +#[derive(Debug, Clone, Serialize, Deserialize, Default)] +pub struct ComposeService { + pub image: Option, + pub build: Option, + pub command: Option, + pub entrypoint: Option, + pub environment: Option, + pub ports: Option>, + pub volumes: Option>, + pub networks: Option>, + pub depends_on: Option, + pub container_name: Option, + pub labels: Option, +} + +impl ComposeService { + pub fn image_ref(&self, svc_name: &str) -> String { + self.image.clone().unwrap_or_else(|| svc_name.to_string()) + } + pub fn port_strings(&self) -> Vec { self.ports.clone().unwrap_or_default() } + pub fn volume_strings(&self) -> Vec { self.volumes.clone().unwrap_or_default() } + pub fn resolved_env(&self) -> HashMap { + match &self.environment { + Some(ListOrDict::Dict(m)) => m.iter().filter_map(|(k, v)| v.as_ref().map(|val| (k.clone(), val.to_string()))).collect(), + Some(ListOrDict::List(l)) => l.iter().filter_map(|s| { + let mut p = s.splitn(2, '='); + Some((p.next()?.to_string(), p.next()?.to_string())) + }).collect(), + None => HashMap::new(), + } + } + pub fn command_list(&self) -> Option> { + match &self.command { + Some(serde_yaml::Value::String(s)) => Some(s.split_whitespace().map(|s| s.to_string()).collect()), + Some(serde_yaml::Value::Sequence(seq)) => Some(seq.iter().filter_map(|v| v.as_str().map(|s| s.to_string())).collect()), + _ => None, + } + } +} + +#[derive(Debug, Clone, Serialize, Deserialize)] +pub struct ComposeServiceBuild { + pub context: Option, + pub dockerfile: Option, + pub args: Option, +} + +#[derive(Debug, Clone, Serialize, Deserialize)] +#[serde(untagged)] +pub enum ListOrDict { + Dict(IndexMap>), + List(Vec), +} + +impl ListOrDict { + pub fn to_map(&self) -> HashMap { + match self { + Self::Dict(m) => m.iter().filter_map(|(k, v)| v.as_ref().map(|val| (k.clone(), val.to_string()))).collect(), + Self::List(l) => l.iter().filter_map(|s| { + let mut p = s.splitn(2, '='); + Some((p.next()?.to_string(), p.next()?.to_string())) + }).collect(), + } + } +} + +#[derive(Debug, Clone, Serialize, Deserialize)] +#[serde(untagged)] +pub enum DependsOnSpec { + List(Vec), + Map(IndexMap), +} + +impl DependsOnSpec { + pub fn service_names(&self) -> Vec { + match self { + Self::List(l) => l.clone(), + Self::Map(m) => m.keys().cloned().collect(), + } + } +} + +#[derive(Debug, Clone, Serialize, Deserialize, Default)] +pub struct ComposeNetwork { + pub driver: Option, + pub external: Option, + pub internal: Option, + pub enable_ipv6: Option, + pub name: Option, + pub labels: Option, +} + +#[derive(Debug, Clone, Serialize, Deserialize, Default)] +pub struct ComposeVolume { + pub driver: Option, + pub external: Option, + pub name: Option, + pub labels: Option, +} + +#[derive(Debug, Clone, Serialize, Deserialize, Default)] +pub struct ContainerSpec { + pub image: String, + pub name: Option, + pub ports: Option>, + pub volumes: Option>, + pub env: Option>, + pub labels: Option>, + pub cmd: Option>, + pub entrypoint: Option>, + pub network: Option, + pub rm: Option, +} + +#[derive(Debug, Clone, Serialize, Deserialize)] +pub struct ContainerHandle { + pub id: String, + pub name: Option, +} + +#[derive(Debug, Clone, Serialize, Deserialize)] +pub struct ContainerInfo { + pub id: String, + pub name: String, + pub image: String, + pub status: String, + pub ports: Vec, + pub labels: HashMap, + pub created: String, +} + +#[derive(Debug, Clone, Serialize, Deserialize)] +pub struct ContainerLogs { + pub stdout: String, + pub stderr: String, +} + +#[derive(Debug, Clone, Serialize, Deserialize)] +pub struct ImageInfo { + pub id: String, + pub repository: String, + pub tag: String, + pub size: u64, + pub created: String, +} + +#[derive(Debug, Clone, Serialize, Deserialize)] +pub struct ComposeHandle { + pub stack_id: u64, + pub project_name: String, + pub services: Vec, +} diff --git a/crates/perry-container-compose/src/yaml.rs b/crates/perry-container-compose/src/yaml.rs new file mode 100644 index 000000000..12cde59f2 --- /dev/null +++ b/crates/perry-container-compose/src/yaml.rs @@ -0,0 +1,494 @@ +//! YAML parsing, environment variable interpolation, `.env` loading, +//! and multi-file merge. + +use crate::error::{ComposeError, Result}; +use crate::types::ComposeSpec; +use std::collections::HashMap; +use std::path::{Path, PathBuf}; + +// ============ Environment variable interpolation ============ + +/// Expand `${VAR}`, `${VAR:-default}`, `${VAR:+value}`, and `$VAR` in a YAML string. +/// +/// This is the primary public API for interpolation (spec name: `interpolate_yaml`). +pub fn interpolate_yaml(yaml: &str, env: &HashMap) -> String { + interpolate(yaml, env) +} + +/// Internal interpolation engine — also exported for use in tests and other modules. +pub fn interpolate(input: &str, env: &HashMap) -> String { + let mut result = String::with_capacity(input.len()); + let mut chars = input.chars().peekable(); + + while let Some(ch) = chars.next() { + if ch == '$' { + match chars.peek() { + Some('{') => { + chars.next(); // consume '{' + let expr = read_until_close(&mut chars); + let expanded = expand_expr(&expr, env); + result.push_str(&expanded); + } + Some('$') => { + // $$ → literal $ + chars.next(); + result.push('$'); + } + Some(&c) if c.is_alphanumeric() || c == '_' => { + let name = read_plain_var(&mut chars, c); + let val = lookup(&name, env); + result.push_str(&val); + } + _ => { + result.push('$'); + } + } + } else { + result.push(ch); + } + } + + result +} + +fn read_until_close(chars: &mut std::iter::Peekable) -> String { + let mut expr = String::new(); + let mut depth = 1usize; + for ch in chars.by_ref() { + match ch { + '{' => { + depth += 1; + expr.push(ch); + } + '}' => { + depth -= 1; + if depth == 0 { + break; + } + expr.push(ch); + } + _ => expr.push(ch), + } + } + expr +} + +fn read_plain_var(chars: &mut std::iter::Peekable, first: char) -> String { + let mut name = String::new(); + name.push(first); + chars.next(); // consume the first char (already peeked) + while let Some(&c) = chars.peek() { + if c.is_alphanumeric() || c == '_' { + name.push(c); + chars.next(); + } else { + break; + } + } + name +} + +fn expand_expr(expr: &str, env: &HashMap) -> String { + // ${VAR:-default} — use default when VAR is unset or empty + if let Some(pos) = expr.find(":-") { + let name = &expr[..pos]; + let default = &expr[pos + 2..]; + let val = lookup(name, env); + return if val.is_empty() { + default.to_owned() + } else { + val + }; + } + + // ${VAR:+value} — use value when VAR is set and non-empty + if let Some(pos) = expr.find(":+") { + let name = &expr[..pos]; + let value = &expr[pos + 2..]; + let val = lookup(name, env); + return if !val.is_empty() { + value.to_owned() + } else { + String::new() + }; + } + + // ${VAR} — plain lookup + lookup(expr, env) +} + +/// Look up a variable: check the provided env map first, then fall back to process env. +fn lookup(name: &str, env: &HashMap) -> String { + if let Some(v) = env.get(name) { + return v.clone(); + } + std::env::var(name).unwrap_or_default() +} + +// ============ .env file loading ============ + +/// Parse a `.env` file into a key→value map. +/// +/// Rules: +/// - Lines starting with `#` are comments +/// - Empty lines are skipped +/// - Format: `KEY=VALUE`, `KEY="VALUE"`, or `KEY='VALUE'` +/// - Inline `#` comments after unquoted values are stripped +pub fn parse_dotenv(content: &str) -> HashMap { + let mut map = HashMap::new(); + + for line in content.lines() { + let line = line.trim(); + + if line.is_empty() || line.starts_with('#') { + continue; + } + + if let Some((key, raw_val)) = line.split_once('=') { + let key = key.trim().to_owned(); + if key.is_empty() { + continue; + } + let val = parse_dotenv_value(raw_val.trim()); + map.insert(key, val); + } + } + + map +} + +fn parse_dotenv_value(raw: &str) -> String { + if raw.is_empty() { + return String::new(); + } + + // Double-quoted: handle escape sequences + if raw.starts_with('"') && raw.ends_with('"') && raw.len() >= 2 { + let inner = &raw[1..raw.len() - 1]; + return inner.replace("\\n", "\n").replace("\\\"", "\"").replace("\\\\", "\\"); + } + + // Single-quoted: literal, no escapes + if raw.starts_with('\'') && raw.ends_with('\'') && raw.len() >= 2 { + return raw[1..raw.len() - 1].to_owned(); + } + + // Unquoted: strip inline comment (` #` or `\t#`) + if let Some(pos) = raw.find(" #").or_else(|| raw.find("\t#")) { + raw[..pos].trim_end().to_owned() + } else { + raw.to_owned() + } +} + +/// Load environment variables for compose interpolation. +/// +/// Precedence (highest to lowest): +/// 1. Process environment (always wins) +/// 2. Explicit `--env-file` files (later files override earlier ones) +/// 3. Default `.env` file in `project_dir` +/// +/// Returns a merged map where process env values are never overridden. +pub fn load_env(project_dir: &Path, extra_env_files: &[PathBuf]) -> HashMap { + // Start with an empty map — we'll layer values in reverse precedence order, + // then let process env win at the end. + let mut file_env: HashMap = HashMap::new(); + + // 1. Default .env in project directory (lowest priority among files) + let default_env = project_dir.join(".env"); + if default_env.exists() { + if let Ok(content) = std::fs::read_to_string(&default_env) { + for (k, v) in parse_dotenv(&content) { + file_env.entry(k).or_insert(v); + } + } + } + + // 2. Explicit --env-file flags (later files override earlier ones) + for ef in extra_env_files { + if let Ok(content) = std::fs::read_to_string(ef) { + for (k, v) in parse_dotenv(&content) { + file_env.insert(k, v); + } + } + } + + // 3. Process environment takes precedence over all file-based values + let mut env = file_env; + for (k, v) in std::env::vars() { + env.insert(k, v); + } + + env +} + +// ============ YAML parsing ============ + +/// Parse a compose YAML string into a `ComposeSpec` after environment variable interpolation. +/// +/// Returns a descriptive `ComposeError::ParseError` for malformed YAML. +pub fn parse_compose_yaml(yaml: &str, env: &HashMap) -> Result { + let interpolated = interpolate_yaml(yaml, env); + serde_yaml::from_str(&interpolated).map_err(ComposeError::ParseError) +} + +// ============ Multi-file merge ============ + +/// Read, interpolate, parse, and merge multiple compose files in order. +/// +/// Later files override earlier ones (last-writer-wins for all top-level maps). +/// Returns `ComposeError::FileNotFound` if any file is missing. +pub fn parse_and_merge_files( + files: &[PathBuf], + env: &HashMap, +) -> Result { + let mut merged: Option = None; + + for file_path in files { + let content = + std::fs::read_to_string(file_path).map_err(|_| ComposeError::FileNotFound { + path: file_path.display().to_string(), + })?; + + let spec = parse_compose_yaml(&content, env)?; + + match &mut merged { + None => merged = Some(spec), + Some(base) => base.merge(spec), + } + } + + Ok(merged.unwrap_or_default()) +} + +#[cfg(test)] +mod tests { + use super::*; + + // ---- interpolate_yaml / interpolate ---- + + #[test] + fn test_interpolate_simple_braces() { + let mut env = HashMap::new(); + env.insert("NAME".into(), "world".into()); + assert_eq!(interpolate_yaml("Hello ${NAME}!", &env), "Hello world!"); + } + + #[test] + fn test_interpolate_plain_dollar() { + let mut env = HashMap::new(); + env.insert("FOO".into(), "bar".into()); + assert_eq!(interpolate_yaml("$FOO baz", &env), "bar baz"); + } + + #[test] + fn test_interpolate_default_when_missing() { + let env = HashMap::new(); + assert_eq!(interpolate_yaml("${MISSING:-fallback}", &env), "fallback"); + } + + #[test] + fn test_interpolate_default_when_empty() { + let mut env = HashMap::new(); + env.insert("EMPTY".into(), "".into()); + assert_eq!(interpolate_yaml("${EMPTY:-fallback}", &env), "fallback"); + } + + #[test] + fn test_interpolate_default_not_used_when_set() { + let mut env = HashMap::new(); + env.insert("SET".into(), "value".into()); + assert_eq!(interpolate_yaml("${SET:-fallback}", &env), "value"); + } + + #[test] + fn test_interpolate_conditional_set() { + let mut env = HashMap::new(); + env.insert("SET".into(), "yes".into()); + assert_eq!(interpolate_yaml("${SET:+value}", &env), "value"); + } + + #[test] + fn test_interpolate_conditional_unset() { + let env = HashMap::new(); + assert_eq!(interpolate_yaml("${UNSET:+value}", &env), ""); + } + + #[test] + fn test_interpolate_dollar_dollar_escape() { + let env = HashMap::new(); + assert_eq!(interpolate_yaml("$$FOO", &env), "$FOO"); + assert_eq!(interpolate_yaml("price: $$9.99", &env), "price: $9.99"); + } + + #[test] + fn test_interpolate_unknown_var_empty() { + let env = HashMap::new(); + assert_eq!(interpolate_yaml("${UNKNOWN}", &env), ""); + } + + // ---- parse_dotenv ---- + + #[test] + fn test_parse_dotenv_basic() { + let content = "FOO=bar\nBAZ=qux\n# comment\n\nEMPTY="; + let map = parse_dotenv(content); + assert_eq!(map["FOO"], "bar"); + assert_eq!(map["BAZ"], "qux"); + assert_eq!(map["EMPTY"], ""); + } + + #[test] + fn test_parse_dotenv_double_quoted() { + let content = r#"A="hello world" +B="with \"escape\"" +C="newline\nhere" +"#; + let map = parse_dotenv(content); + assert_eq!(map["A"], "hello world"); + assert_eq!(map["B"], "with \"escape\""); + assert_eq!(map["C"], "newline\nhere"); + } + + #[test] + fn test_parse_dotenv_single_quoted() { + let content = "B='single quoted'\n"; + let map = parse_dotenv(content); + assert_eq!(map["B"], "single quoted"); + } + + #[test] + fn test_parse_dotenv_inline_comment() { + let content = "KEY=value # this is a comment\n"; + let map = parse_dotenv(content); + assert_eq!(map["KEY"], "value"); + } + + #[test] + fn test_parse_dotenv_equals_in_value() { + let content = "URL=http://example.com?a=1&b=2\n"; + let map = parse_dotenv(content); + assert_eq!(map["URL"], "http://example.com?a=1&b=2"); + } + + // ---- parse_compose_yaml ---- + + #[test] + fn test_parse_compose_yaml_basic() { + let yaml = r#" +services: + web: + image: nginx +"#; + let env = HashMap::new(); + let spec = parse_compose_yaml(yaml, &env).unwrap(); + assert!(spec.services.contains_key("web")); + assert_eq!(spec.services["web"].image.as_deref(), Some("nginx")); + } + + #[test] + fn test_parse_compose_yaml_with_interpolation() { + let yaml = r#" +services: + web: + image: ${IMAGE:-nginx} +"#; + let mut env = HashMap::new(); + env.insert("IMAGE".into(), "redis".into()); + let spec = parse_compose_yaml(yaml, &env).unwrap(); + assert_eq!(spec.services["web"].image.as_deref(), Some("redis")); + + // Default fallback + let empty_env = HashMap::new(); + let spec2 = parse_compose_yaml(yaml, &empty_env).unwrap(); + assert_eq!(spec2.services["web"].image.as_deref(), Some("nginx")); + } + + #[test] + fn test_parse_compose_yaml_malformed_returns_error() { + let yaml = "services: [unclosed"; + let env = HashMap::new(); + let result = parse_compose_yaml(yaml, &env); + assert!(result.is_err()); + assert!(matches!(result.unwrap_err(), ComposeError::ParseError(_))); + } + + // ---- ComposeSpec::merge (via parse_and_merge_files logic) ---- + + #[test] + fn test_merge_last_writer_wins_services() { + let yaml1 = r#" +services: + web: + image: nginx + db: + image: postgres +"#; + let yaml2 = r#" +services: + web: + image: apache +"#; + let env = HashMap::new(); + let mut spec1 = parse_compose_yaml(yaml1, &env).unwrap(); + let spec2 = parse_compose_yaml(yaml2, &env).unwrap(); + spec1.merge(spec2); + + // web overridden by second file + assert_eq!(spec1.services["web"].image.as_deref(), Some("apache")); + // db preserved from first file + assert_eq!(spec1.services["db"].image.as_deref(), Some("postgres")); + } + + #[test] + fn test_merge_last_writer_wins_networks() { + let yaml1 = r#" +services: + web: + image: nginx +networks: + frontend: + driver: bridge +"#; + let yaml2 = r#" +services: + api: + image: node +networks: + frontend: + driver: overlay + backend: + driver: bridge +"#; + let env = HashMap::new(); + let mut spec1 = parse_compose_yaml(yaml1, &env).unwrap(); + let spec2 = parse_compose_yaml(yaml2, &env).unwrap(); + spec1.merge(spec2); + + let nets = spec1.networks.as_ref().unwrap(); + // frontend overridden + assert_eq!( + nets["frontend"].as_ref().unwrap().driver.as_deref(), + Some("overlay") + ); + // backend added + assert!(nets.contains_key("backend")); + } + + // ---- parse_and_merge_files ---- + + #[test] + fn test_parse_and_merge_files_missing_returns_error() { + let files = vec![PathBuf::from("/nonexistent/compose.yaml")]; + let env = HashMap::new(); + let result = parse_and_merge_files(&files, &env); + assert!(matches!(result.unwrap_err(), ComposeError::FileNotFound { .. })); + } + + #[test] + fn test_parse_and_merge_files_empty_returns_default() { + let env = HashMap::new(); + let spec = parse_and_merge_files(&[], &env).unwrap(); + assert!(spec.services.is_empty()); + } +} diff --git a/crates/perry-container-compose/tests/common/mod.rs b/crates/perry-container-compose/tests/common/mod.rs new file mode 100644 index 000000000..9fcd90457 --- /dev/null +++ b/crates/perry-container-compose/tests/common/mod.rs @@ -0,0 +1,153 @@ +use async_trait::async_trait; +use perry_container_compose::backend::{ContainerBackend, NetworkConfig, VolumeConfig, SecurityProfile}; +use perry_container_compose::types::{ + ContainerHandle, ContainerInfo, ContainerLogs, ImageInfo, + ContainerSpec +}; +use perry_container_compose::error::{ComposeError, Result}; +use std::collections::HashMap; +use std::sync::{Arc, Mutex}; + +#[derive(Default)] +pub struct MockBackendState { + pub containers: Vec, + pub networks: Vec, + pub volumes: Vec, + pub actions: Vec, + pub fail_on_run: Option, // Service name to fail on +} + +#[derive(Clone, Default)] +pub struct MockBackend { + pub state: Arc>, +} + +#[async_trait] +impl ContainerBackend for MockBackend { + fn backend_name(&self) -> &'static str { "mock" } + + async fn check_available(&self) -> Result<()> { Ok(()) } + + async fn run(&self, spec: &ContainerSpec) -> Result { + let mut state = self.state.lock().unwrap(); + let name = spec.name.clone().unwrap_or_else(|| "unnamed".to_string()); + + if let Some(fail_name) = &state.fail_on_run { + if name.contains(fail_name) { + return Err(ComposeError::ServiceStartupFailed { + service: name, + message: "Mock failure".to_string(), + }); + } + } + + state.actions.push(format!("run:{}", name)); + state.containers.push(name.clone()); + Ok(ContainerHandle { id: name.clone(), name: Some(name) }) + } + + async fn create(&self, _spec: &ContainerSpec) -> Result { Ok(ContainerHandle { id: "id".into(), name: None }) } + async fn start(&self, _id: &str) -> Result<()> { Ok(()) } + async fn stop(&self, id: &str, _timeout: Option) -> Result<()> { + let mut state = self.state.lock().unwrap(); + state.actions.push(format!("stop:{}", id)); + Ok(()) + } + async fn remove(&self, id: &str, _force: bool) -> Result<()> { + let mut state = self.state.lock().unwrap(); + state.actions.push(format!("remove:{}", id)); + state.containers.retain(|c| c != id); + Ok(()) + } + + async fn list(&self, _all: bool) -> Result> { + let state = self.state.lock().unwrap(); + let mut infos = Vec::new(); + for id in &state.containers { + let mut labels = HashMap::new(); + // Simple heuristic for Mocking labels in tests + if id.contains("web") { + labels.insert("com.docker.compose.project".into(), "down-project".into()); + labels.insert("com.docker.compose.service".into(), "web".into()); + } else if id.contains("db") { + labels.insert("com.docker.compose.project".into(), "down-project".into()); + labels.insert("com.docker.compose.service".into(), "db".into()); + } + infos.push(ContainerInfo { + id: id.clone(), + name: id.clone(), + image: "mock-image".to_string(), + status: "running".to_string(), + ports: vec![], + labels, + created: "2025-01-01T00:00:00Z".to_string(), + }) + } + Ok(infos) + } + + async fn inspect(&self, id: &str) -> Result { + let state = self.state.lock().unwrap(); + if state.containers.contains(&id.to_string()) { + Ok(ContainerInfo { + id: id.to_string(), + name: id.to_string(), + image: "mock-image".to_string(), + status: "running".to_string(), + ports: vec![], + labels: HashMap::new(), + created: "2025-01-01T00:00:00Z".to_string(), + }) + } else { + Err(ComposeError::NotFound(id.to_string())) + } + } + + async fn logs(&self, _id: &str, _tail: Option) -> Result { + Ok(ContainerLogs { stdout: "logs".into(), stderr: "".into() }) + } + + async fn exec(&self, _id: &str, _cmd: &[String], _env: Option<&HashMap>, _workdir: Option<&str>) -> Result { + Ok(ContainerLogs { stdout: "exec".into(), stderr: "".into() }) + } + + async fn pull_image(&self, _reference: &str) -> Result<()> { Ok(()) } + async fn list_images(&self) -> Result> { Ok(vec![]) } + async fn remove_image(&self, _reference: &str, _force: bool) -> Result<()> { Ok(()) } + + async fn create_network(&self, name: &str, _config: &NetworkConfig) -> Result<()> { + let mut state = self.state.lock().unwrap(); + state.actions.push(format!("create_network:{}", name)); + state.networks.push(name.to_string()); + Ok(()) + } + + async fn remove_network(&self, name: &str) -> Result<()> { + let mut state = self.state.lock().unwrap(); + state.actions.push(format!("remove_network:{}", name)); + state.networks.retain(|n| n != name); + Ok(()) + } + + async fn create_volume(&self, name: &str, _config: &VolumeConfig) -> Result<()> { + let mut state = self.state.lock().unwrap(); + state.actions.push(format!("create_volume:{}", name)); + state.volumes.push(name.to_string()); + Ok(()) + } + + async fn remove_volume(&self, name: &str) -> Result<()> { + let mut state = self.state.lock().unwrap(); + state.actions.push(format!("remove_volume:{}", name)); + state.volumes.retain(|v| v != name); + Ok(()) + } + + async fn inspect_network(&self, _name: &str) -> Result { Ok(serde_json::json!({})) } + async fn inspect_volume(&self, _name: &str) -> Result { Ok(serde_json::json!({})) } + async fn build_image(&self, _image: &str, _context: &str, _dockerfile: Option<&str>, _args: Option<&HashMap>) -> Result<()> { Ok(()) } + async fn inspect_image(&self, _reference: &str) -> Result { Ok(serde_json::json!({})) } + async fn manifest_inspect(&self, _reference: &str) -> Result { Ok(serde_json::json!({})) } + async fn run_with_security(&self, spec: &ContainerSpec, _profile: &SecurityProfile) -> Result { self.run(spec).await } + async fn wait_and_logs(&self, id: &str) -> Result { self.logs(id, None).await } +} diff --git a/crates/perry-container-compose/tests/container_ops.rs b/crates/perry-container-compose/tests/container_ops.rs new file mode 100644 index 000000000..c42bd6634 --- /dev/null +++ b/crates/perry-container-compose/tests/container_ops.rs @@ -0,0 +1,78 @@ +use perry_container_compose::ContainerBackend; +use perry_container_compose::types::ContainerSpec; +use std::sync::Arc; + +mod common; +use common::MockBackend; + +#[tokio::test] +async fn test_container_run_success() { + let mock = MockBackend::default(); + let state_ref = Arc::clone(&mock.state); + let backend: Arc = Arc::new(mock); + let spec = ContainerSpec { + image: "alpine".into(), + name: Some("test-container".into()), + ..Default::default() + }; + + let handle = backend.run(&spec).await.expect("run failed"); + assert_eq!(handle.id, "test-container"); + + let state = state_ref.lock().unwrap(); + assert_eq!(state.containers, vec!["test-container"]); + assert_eq!(state.actions, vec!["run:test-container"]); +} + +#[tokio::test] +async fn test_container_lifecycle() { + let mock = MockBackend::default(); + let state_ref = Arc::clone(&mock.state); + let backend: Arc = Arc::new(mock); + let spec = ContainerSpec { + image: "nginx".into(), + name: Some("web".into()), + ..Default::default() + }; + + backend.run(&spec).await.unwrap(); + backend.stop("web", Some(10)).await.unwrap(); + backend.remove("web", true).await.unwrap(); + + let state = state_ref.lock().unwrap(); + assert!(state.containers.is_empty()); + assert_eq!(state.actions, vec!["run:web", "stop:web", "remove:web"]); +} + +#[tokio::test] +async fn test_container_exec() { + let backend: Arc = Arc::new(MockBackend::default()); + let logs = backend.exec("web", &["ls".into()], None, None).await.unwrap(); + assert_eq!(logs.stdout, "exec"); +} + +#[tokio::test] +async fn test_network_volume_lifecycle() { + let mock = MockBackend::default(); + let state_ref = Arc::clone(&mock.state); + let backend: Arc = Arc::new(mock); + use perry_container_compose::backend::{NetworkConfig, VolumeConfig}; + + backend.create_network("test-net", &NetworkConfig::default()).await.unwrap(); + backend.create_volume("test-vol", &VolumeConfig::default()).await.unwrap(); + + { + let state = state_ref.lock().unwrap(); + assert_eq!(state.networks, vec!["test-net"]); + assert_eq!(state.volumes, vec!["test-vol"]); + } + + backend.remove_network("test-net").await.unwrap(); + backend.remove_volume("test-vol").await.unwrap(); + + { + let state = state_ref.lock().unwrap(); + assert!(state.networks.is_empty()); + assert!(state.volumes.is_empty()); + } +} diff --git a/crates/perry-container-compose/tests/integration_tests.rs b/crates/perry-container-compose/tests/integration_tests.rs new file mode 100644 index 000000000..695df6aab --- /dev/null +++ b/crates/perry-container-compose/tests/integration_tests.rs @@ -0,0 +1,129 @@ +//! Integration tests for perry-container-compose. +//! +//! These tests require a running container backend and are gated +//! by `#[cfg(feature = "integration-tests")]`. +//! +//! The unit tests and property tests are in the modules themselves +//! and in `tests/round_trip.rs`. + +#[cfg(feature = "integration-tests")] +mod integration { + use perry_container_compose::compose::resolve_startup_order; + use perry_container_compose::types::{ComposeService, ComposeSpec, DependsOnSpec}; + use perry_container_compose::yaml::{interpolate, parse_dotenv, parse_compose_yaml}; + use std::collections::HashMap; + + #[test] + fn test_parse_simple_compose() { + let yaml = r#" +services: + web: + image: nginx:alpine + ports: + - "8080:80" +"#; + let spec = ComposeSpec::parse_str(yaml).expect("parse failed"); + assert!(spec.services.contains_key("web")); + assert_eq!(spec.services["web"].image.as_deref(), Some("nginx:alpine")); + } + + #[test] + fn test_parse_multi_service_with_deps() { + let yaml = r#" +services: + db: + image: postgres:16 + environment: + POSTGRES_PASSWORD: secret + web: + image: myapp:latest + depends_on: + - db + ports: + - "3000:3000" +"#; + let spec = ComposeSpec::parse_str(yaml).expect("parse failed"); + assert_eq!(spec.services.len(), 2); + let web = &spec.services["web"]; + let deps = web.depends_on.as_ref().unwrap().service_names(); + assert!(deps.contains(&"db".to_string())); + } + + #[test] + fn test_topological_order_linear() { + let yaml = r#" +services: + c: + image: c + depends_on: [b] + b: + image: b + depends_on: [a] + a: + image: a +"#; + let spec = ComposeSpec::parse_str(yaml).unwrap(); + let order = resolve_startup_order(&spec).unwrap(); + let pos = |s: &str| order.iter().position(|n| n == s).unwrap(); + assert!(pos("a") < pos("b"), "a before b"); + assert!(pos("b") < pos("c"), "b before c"); + } + + #[test] + fn test_circular_dependency_detected() { + let yaml = r#" +services: + a: + image: a + depends_on: [b] + b: + image: b + depends_on: [a] +"#; + let spec = ComposeSpec::parse_str(yaml).unwrap(); + let result = resolve_startup_order(&spec); + assert!(result.is_err()); + } + + #[test] + fn test_env_interpolation() { + let mut env = HashMap::new(); + env.insert("DB_USER".to_string(), "admin".to_string()); + env.insert("DB_PASS".to_string(), "s3cr3t".to_string()); + + let yaml = " url: postgres://${DB_USER}:${DB_PASS}@localhost/db"; + let result = interpolate(yaml, &env); + assert_eq!(result, " url: postgres://admin:s3cr3t@localhost/db"); + } + + #[test] + fn test_dotenv_parse() { + let content = "HOST=localhost\nPORT=5432\n# ignored\n\nEMPTY="; + let env = parse_dotenv(content); + assert_eq!(env["HOST"], "localhost"); + assert_eq!(env["PORT"], "5432"); + assert_eq!(env["EMPTY"], ""); + } + + #[test] + fn test_compose_merge_override() { + let base_yaml = r#" +services: + web: + image: nginx:1.0 + db: + image: postgres:15 +"#; + let override_yaml = r#" +services: + web: + image: nginx:2.0 +"#; + let mut base = ComposeSpec::parse_str(base_yaml).unwrap(); + let overlay = ComposeSpec::parse_str(override_yaml).unwrap(); + base.merge(overlay); + + assert_eq!(base.services["web"].image.as_deref(), Some("nginx:2.0")); + assert!(base.services.contains_key("db")); + } +} diff --git a/crates/perry-container-compose/tests/orchestration.rs b/crates/perry-container-compose/tests/orchestration.rs new file mode 100644 index 000000000..5dff23bbe --- /dev/null +++ b/crates/perry-container-compose/tests/orchestration.rs @@ -0,0 +1,86 @@ +use perry_container_compose::compose::ComposeEngine; +use perry_container_compose::types::{ComposeSpec, ComposeService}; +use std::sync::Arc; + +mod common; +use common::MockBackend; + +#[tokio::test] +async fn test_compose_up_success() { + let mut spec = ComposeSpec::default(); + spec.services.insert("web".into(), ComposeService { + image: Some("nginx".into()), + ..Default::default() + }); + spec.services.insert("db".into(), ComposeService { + image: Some("postgres".into()), + ..Default::default() + }); + + let backend = Arc::new(MockBackend::default()); + let engine = ComposeEngine::new(spec, "test-project".into(), backend.clone()); + + let handle = engine.up(&[], true, false, false).await.expect("up failed"); + + assert_eq!(handle.project_name, "test-project"); + assert_eq!(handle.services.len(), 2); + + let state = backend.state.lock().unwrap(); + assert_eq!(state.containers.len(), 2); + // Check order: db then web (alphabetical since no deps) + assert!(state.actions[0].starts_with("run:db")); + assert!(state.actions[1].starts_with("run:web")); +} + +#[tokio::test] +async fn test_compose_up_rollback_on_failure() { + let mut spec = ComposeSpec::default(); + spec.services.insert("db".into(), ComposeService { + image: Some("postgres".into()), + ..Default::default() + }); + spec.services.insert("web".into(), ComposeService { + image: Some("nginx".into()), + ..Default::default() + }); + + let backend = Arc::new(MockBackend::default()); + { + let mut state = backend.state.lock().unwrap(); + state.fail_on_run = Some("web".into()); + } + + let engine = ComposeEngine::new(spec, "fail-project".into(), backend.clone()); + let result = engine.up(&[], true, false, false).await; + + assert!(result.is_err()); + + let state = backend.state.lock().unwrap(); + // Should have started db, tried web, then stopped/removed db + assert!(state.containers.is_empty()); + + let actions: Vec<_> = state.actions.iter().map(|s| s.split(':').next().unwrap()).collect(); + assert!(actions.contains(&"run")); // db + assert!(actions.contains(&"stop")); // db rollback + assert!(actions.contains(&"remove")); // db rollback +} + +#[tokio::test] +async fn test_compose_down_cleans_resources() { + let mut spec = ComposeSpec::default(); + spec.services.insert("web".into(), ComposeService { + image: Some("nginx".into()), + ..Default::default() + }); + + let backend = Arc::new(MockBackend::default()); + let engine = ComposeEngine::new(spec, "down-project".into(), backend.clone()); + + let handle = engine.up(&[], true, false, false).await.unwrap(); + engine.down(&handle.services, false, true).await.expect("down failed"); + + let state = backend.state.lock().unwrap(); + assert!(state.containers.is_empty(), "Containers should be empty, but found: {:?}", state.containers); + assert!(state.networks.is_empty()); + assert!(state.volumes.is_empty()); +} diff --git a/crates/perry-container-compose/tests/round_trip.rs b/crates/perry-container-compose/tests/round_trip.rs new file mode 100644 index 000000000..8e1a78fba --- /dev/null +++ b/crates/perry-container-compose/tests/round_trip.rs @@ -0,0 +1,489 @@ +//! Property-based tests for perry-container-compose. +//! +//! Uses the `proptest` crate to verify correctness properties +//! across serialization, dependency resolution, YAML parsing, +//! env interpolation, and type validation. + +use indexmap::IndexMap; +use perry_container_compose::compose::resolve_startup_order; +use perry_container_compose::error::ComposeError; +use perry_container_compose::backend::OciCommandBuilder; +use perry_container_compose::error::compose_error_to_js; +use perry_container_compose::types::{ + ComposeService, ComposeSpec, ContainerSpec, DependsOnCondition, DependsOnSpec, VolumeType, +}; +use perry_container_compose::yaml::interpolate; +use proptest::prelude::*; +use std::collections::HashMap; + +// ============ Arbitrary Strategies ============ + +/// Generate a valid image reference string. +fn arb_image() -> impl Strategy { + "[a-z][a-z0-9_-]{1,15}(:[a-z0-9._-]+)?" +} + +/// Generate a valid service name. +fn arb_service_name() -> impl Strategy { + "[a-z][a-z0-9_-]{1,10}" +} + +/// Generate an arbitrary ComposeSpec with 1–10 services. +fn arb_compose_spec() -> impl Strategy { + proptest::collection::vec( + (arb_service_name(), arb_image()).prop_map(|(name, image)| { + let mut svc = ComposeService::default(); + svc.image = Some(image); + (name, svc) + }), + 1..=10, + ) + .prop_map(|services_vec| { + let mut services = IndexMap::new(); + for (name, svc) in services_vec { + services.insert(name, svc); + } + ComposeSpec { + services, + ..Default::default() + } + }) +} + +/// Generate a ComposeSpec with a valid (acyclic) depends_on DAG. +fn arb_compose_spec_with_dag() -> impl Strategy { + proptest::collection::vec( + (arb_service_name(), proptest::collection::vec(arb_service_name(), 0..=3)) + .prop_map(|(name, deps)| { + let mut svc = ComposeService::default(); + svc.image = Some(format!("{}:latest", name)); + (name, deps) + }), + 2..=8, + ) + .prop_map(|items| { + // Build a valid DAG: only allow deps on services that appear + // earlier in the list (forward references only). + let mut services = IndexMap::new(); + let existing_names: Vec = items.iter().map(|(n, _)| n.clone()).collect(); + + for (name, dep_names) in &items { + let mut svc = ComposeService::default(); + svc.image = Some(format!("{}:latest", name)); + + // Only keep deps that point to earlier services (guarantees no cycles) + let valid_deps: Vec = dep_names + .iter() + .filter(|dep| { + existing_names + .iter() + .position(|n| n == name) + .map(|my_idx| { + existing_names + .iter() + .position(|n| n == *dep) + .map(|dep_idx| dep_idx < my_idx) + .unwrap_or(false) + }) + .unwrap_or(false) + }) + .cloned() + .collect(); + + if !valid_deps.is_empty() { + svc.depends_on = Some(DependsOnSpec::List(valid_deps)); + } + services.insert(name.clone(), svc); + } + + ComposeSpec { + services, + ..Default::default() + } + }) +} + +/// Generate a ComposeSpec with at least one dependency cycle. +fn arb_compose_spec_with_cycle() -> impl Strategy { + // Strategy A: 2-node cycle using proptest::array + let two_node = proptest::array::uniform2( + proptest::string::string_regex("[a-z]{2,4}a").unwrap(), + ) + .prop_map(|names| { + let (a, b) = (names[0].clone(), names[1].clone()); + let mut services = IndexMap::new(); + + let mut svc_a = ComposeService::default(); + svc_a.image = Some(format!("{}:latest", a)); + svc_a.depends_on = Some(DependsOnSpec::List(vec![b.clone()])); + services.insert(a.clone(), svc_a); + + let mut svc_b = ComposeService::default(); + svc_b.image = Some(format!("{}:latest", b)); + svc_b.depends_on = Some(DependsOnSpec::List(vec![a])); + services.insert(b, svc_b); + + services + }); + + // Strategy B: 3-node cycle using proptest::array + let three_node = proptest::array::uniform3( + proptest::string::string_regex("[a-z]{2,4}[xyz]").unwrap(), + ) + .prop_map(|names| { + let (x, y, z) = (names[0].clone(), names[1].clone(), names[2].clone()); + let mut services = IndexMap::new(); + + let mut svc_x = ComposeService::default(); + svc_x.image = Some(format!("{}:latest", x)); + svc_x.depends_on = Some(DependsOnSpec::List(vec![z.clone()])); + services.insert(x.clone(), svc_x); + + let mut svc_y = ComposeService::default(); + svc_y.image = Some(format!("{}:latest", y)); + svc_y.depends_on = Some(DependsOnSpec::List(vec![x.clone()])); + services.insert(y.clone(), svc_y); + + let mut svc_z = ComposeService::default(); + svc_z.image = Some(format!("{}:latest", z)); + svc_z.depends_on = Some(DependsOnSpec::List(vec![y])); + services.insert(z, svc_z); + + services + }); + + proptest::prop_oneof![two_node, three_node].prop_map(|services| ComposeSpec { + services, + ..Default::default() + }) +} + +/// Generate an arbitrary ContainerSpec. +fn arb_container_spec() -> impl Strategy { + ( + arb_image(), + proptest::option::of(arb_service_name()), + proptest::option::of(proptest::collection::vec("[0-9]{2,5}:[0-9]{2,5}", 0..=3)), + proptest::option::of(proptest::collection::vec("/[a-z]:/[a-z]", 0..=3)), + ) + .prop_map(|(image, name, ports, volumes)| ContainerSpec { + image, + name, + ports, + volumes, + ..Default::default() + }) +} + +/// Generate environment variable name. +fn arb_env_name() -> impl Strategy { + "[A-Z][A-Z0-9_]{1,8}" +} + +/// Generate a template string containing ${VAR} and ${VAR:-default} patterns. +fn arb_env_template() -> impl Strategy)> { + (arb_env_name(), arb_env_name(), "[a-z0-9_]{0,10}").prop_map(|(var1, var2, default)| { + let mut env = HashMap::new(); + env.insert(var1.clone(), "value1".to_string()); + // var2 is intentionally missing from env to test defaults + + // Template: prefix_${VAR1}_mid_${VAR2:-default}_suffix + // Both vars are referenced via ${} syntax so interpolation actually expands them + let template = format!("prefix_${{{}}}_mid_${{{}:-{}}}_suffix", var1, var2, default); + + (template, env) + }) +} + +// ============ Property 2: ContainerSpec CLI argument round-trip ============ +// Feature: perry-container, Property 2: ContainerSpec CLI argument round-trip +// Validates: Requirements 12.5 + +proptest! { + #![proptest_config(ProptestConfig::with_cases(100))] + + #[test] + fn prop_container_spec_cli_round_trip(spec in arb_container_spec()) { + let drv = perry_container_compose::backend::BackendDriver::Docker { bin: "docker".into() }; + let args = OciCommandBuilder::run_args(&drv, &spec); + + // Manual verification of some fields since we don't have a full inverse parser yet + if let Some(name) = &spec.name { + prop_assert!(args.contains(&"--name".to_string())); + prop_assert!(args.contains(name)); + } + prop_assert!(args.contains(&spec.image)); + } +} + +// ============ Property 11: Error propagation preserves code and message ============ +// Feature: perry-container, Property 11: Error propagation preserves code and message +// Validates: Requirements 2.6, 12.2 + +proptest! { + #![proptest_config(ProptestConfig::with_cases(50))] + + #[test] + fn prop_error_propagation(code in -100i32..500i32, message in ".*") { + let err = ComposeError::BackendError { code, message: message.clone() }; + let js_json = compose_error_to_js(&err); + let val: serde_json::Value = serde_json::from_str(&js_json).unwrap(); + + prop_assert_eq!(val["code"].as_i64().unwrap() as i32, code); + prop_assert_eq!(val["message"].as_str().unwrap().contains(&message), true); + } +} + +// ============ Property 1: ComposeSpec JSON round-trip ============ +// Feature: perry-container, Property 1: ComposeSpec serialization round-trip +// Validates: Requirements 7.12, 10.13, 12.6 + +proptest! { + #![proptest_config(ProptestConfig::with_cases(100))] + + #[test] + fn prop_compose_spec_json_round_trip(spec in arb_compose_spec()) { + let json = serde_json::to_string(&spec).unwrap(); + let deserialized: ComposeSpec = serde_json::from_str(&json).unwrap(); + let json2 = serde_json::to_string(&deserialized).unwrap(); + prop_assert_eq!(json, json2); + } +} + +// ============ Property 3: Topological sort respects depends_on ============ +// Feature: perry-container, Property 3: Topological sort respects depends_on +// Validates: Requirements 6.4 + +proptest! { + #![proptest_config(ProptestConfig::with_cases(100))] + + #[test] + fn prop_topological_sort_respects_deps(spec in arb_compose_spec_with_dag()) { + let order = resolve_startup_order(&spec).unwrap(); + + // Build position map + let pos: HashMap<&str, usize> = order + .iter() + .enumerate() + .map(|(i, s)| (s.as_str(), i)) + .collect(); + + // For every service with depends_on, verify dependencies come first + for (name, service) in &spec.services { + if let Some(deps) = &service.depends_on { + for dep in deps.service_names() { + if let (Some(&dep_pos), Some(&name_pos)) = + (pos.get(dep.as_str()), pos.get(name.as_str())) + { + prop_assert!( + dep_pos < name_pos, + "dep {} (pos {}) should come before {} (pos {})", + dep, dep_pos, name, name_pos + ); + } + } + } + } + + // All services must be in the output + prop_assert_eq!(order.len(), spec.services.len()); + } +} + +// ============ Property 4: Cycle detection is complete ============ +// Feature: perry-container, Property 4: Cycle detection is complete +// Validates: Requirements 6.5 + +proptest! { + #![proptest_config(ProptestConfig::with_cases(50))] + + #[test] + fn prop_cycle_detection_completeness(spec in arb_compose_spec_with_cycle()) { + let result = resolve_startup_order(&spec); + prop_assert!(result.is_err(), "cycle should be detected"); + + if let Err(ComposeError::DependencyCycle { services }) = result { + // All services in the cycle should be listed + prop_assert!( + !services.is_empty(), + "cycle must list at least one service" + ); + // The listed services should be a subset of defined services + for svc in &services { + prop_assert!( + spec.services.contains_key(svc), + "cycle service {} should be defined in spec", + svc + ); + } + } else { + panic!("expected DependencyCycle error"); + } + } +} + +// ============ Property 5: YAML round-trip ============ +// Feature: perry-container, Property 5: YAML round-trip preserves ComposeSpec +// Validates: Requirements 7.1, 7.2–7.7 + +proptest! { + #![proptest_config(ProptestConfig::with_cases(100))] + + #[test] + fn prop_yaml_round_trip(spec in arb_compose_spec()) { + let yaml = serde_yaml::to_string(&spec).unwrap(); + let reparsed: ComposeSpec = ComposeSpec::parse_str(&yaml).unwrap(); + + // Service names preserved + prop_assert_eq!( + reparsed.services.keys().collect::>(), + spec.services.keys().collect::>() + ); + + // Image references preserved + for (name, svc) in &spec.services { + let reparsed_svc = &reparsed.services[name]; + prop_assert_eq!( + reparsed_svc.image.as_deref(), + svc.image.as_deref(), + "image mismatch for service {}", + name + ); + } + } +} + +// ============ Property 6: Environment variable interpolation ============ +// Feature: perry-container, Property 6: Environment variable interpolation correctness +// Validates: Requirements 7.8 + +proptest! { + #![proptest_config(ProptestConfig::with_cases(100))] + + #[test] + fn prop_env_interpolation((template, env) in arb_env_template()) { + let result = interpolate(&template, &env); + + // No ${...} should remain unexpanded + prop_assert!( + !result.contains("${"), + "template should be fully expanded, got: {}", + result + ); + + // The result should start with "prefix_value1_mid_" + prop_assert!( + result.starts_with("prefix_value1_mid_"), + "expected expanded var1, got prefix: {}", + &result[..result.len().min(20)] + ); + // The result should end with "_suffix" + prop_assert!( + result.ends_with("_suffix"), + "expected _suffix ending, got: {}", + result + ); + } +} + +// ============ Property 7: Compose file merge last-writer-wins ============ +// Feature: perry-container, Property 7: Compose file merge is last-writer-wins +// Validates: Requirements 7.10, 9.2 + +proptest! { + #![proptest_config(ProptestConfig::with_cases(100))] + + #[test] + fn prop_merge_last_writer_wins( + common_svc in arb_service_name(), + only_a_svc in arb_service_name(), + img_a in arb_image(), + img_b in arb_image(), + ) { + // Ensure distinct names + prop_assume!(common_svc != only_a_svc); + prop_assume!(img_a != img_b); + + let mut spec_a = ComposeSpec::default(); + let mut svc_a_common = ComposeService::default(); + svc_a_common.image = Some(img_a.clone()); + spec_a.services.insert(common_svc.clone(), svc_a_common); + + let mut svc_a_only = ComposeService::default(); + svc_a_only.image = Some(format!("onlya-{}", &common_svc)); + spec_a.services.insert(only_a_svc.clone(), svc_a_only); + + let mut spec_b = ComposeSpec::default(); + let mut svc_b_common = ComposeService::default(); + svc_b_common.image = Some(img_b.clone()); + spec_b.services.insert(common_svc.clone(), svc_b_common); + + // Merge: B wins for common service + spec_a.merge(spec_b); + + // Common service should have B's image + prop_assert_eq!( + spec_a.services[&common_svc].image.as_deref(), + Some(img_b.as_str()), + "common service should have B's image (last-writer-wins)" + ); + + // Only-A service should still be present + prop_assert!( + spec_a.services.contains_key(&only_a_svc), + "service only in A should be preserved" + ); + } +} + +// ============ Property 8: DependsOnCondition rejects invalid values ============ +// Feature: perry-container, Property 8: DependsOnCondition rejects invalid values +// Validates: Requirements 7.14 + +proptest! { + #![proptest_config(ProptestConfig::with_cases(50))] + + #[test] + fn prop_depends_on_condition_rejects_invalid(invalid in "[a-z]{3,20}") { + // Valid values: "service_started", "service_healthy", "service_completed_successfully" + let valid_values = [ + "service_started", + "service_healthy", + "service_completed_successfully", + ]; + prop_assume!(!valid_values.contains(&invalid.as_str())); + + let yaml = format!("\"{}\"", invalid); + let result = serde_yaml::from_str::(&yaml); + prop_assert!( + result.is_err(), + "DependsOnCondition should reject invalid value '{}', got: {:?}", + invalid, + result + ); + } +} + +// ============ Property 9: VolumeType rejects invalid values ============ +// Feature: perry-container, Property 9: VolumeType rejects invalid values +// Validates: Requirements 10.14 + +proptest! { + #![proptest_config(ProptestConfig::with_cases(50))] + + #[test] + fn prop_volume_type_rejects_invalid(invalid in "[a-z]{3,20}") { + // Valid values: "bind", "volume", "tmpfs", "cluster", "npipe", "image" + let valid_values = ["bind", "volume", "tmpfs", "cluster", "npipe", "image"]; + prop_assume!(!valid_values.contains(&invalid.as_str())); + + let yaml = format!("\"{}\"", invalid); + let result = serde_yaml::from_str::(&yaml); + prop_assert!( + result.is_err(), + "VolumeType should reject invalid value '{}', got: {:?}", + invalid, + result + ); + } +} diff --git a/crates/perry-hir/src/ir.rs b/crates/perry-hir/src/ir.rs index 4e169ddcd..6f9cebb04 100644 --- a/crates/perry-hir/src/ir.rs +++ b/crates/perry-hir/src/ir.rs @@ -98,6 +98,11 @@ pub const NATIVE_MODULES: &[&str] = &[ "worker_threads", // Perry threading primitives (parallelMap, spawn) "perry/thread", + // Perry container module (OCI container management) + "perry/container", + "perry/compose", + "perry/container-compose", + "perry/workloads", // SQLite "better-sqlite3", ]; diff --git a/crates/perry-hir/src/lower.rs b/crates/perry-hir/src/lower.rs index 925d61c22..bd91ec9b9 100644 --- a/crates/perry-hir/src/lower.rs +++ b/crates/perry-hir/src/lower.rs @@ -2457,9 +2457,52 @@ fn lower_module_decl( }) .unwrap_or_else(|| local.clone()); if is_native { - // Register as native module function with the original method name - // e.g., import { v4 as uuid } from 'uuid' -> uuid maps to uuid.v4 - ctx.register_native_module(local.clone(), source.clone(), Some(imported.clone())); + // Map perry/container and perry/compose imports to their FFI symbols + let ffi_name = match source.as_str() { + "perry/container" => match imported.as_str() { + "run" => Some("js_container_run"), + "create" => Some("js_container_create"), + "start" => Some("js_container_start"), + "stop" => Some("js_container_stop"), + "remove" => Some("js_container_remove"), + "list" => Some("js_container_list"), + "inspect" => Some("js_container_inspect"), + "logs" => Some("js_container_logs"), + "exec" => Some("js_container_exec"), + "pullImage" => Some("js_container_pullImage"), + "listImages" => Some("js_container_listImages"), + "removeImage" => Some("js_container_removeImage"), + "getBackend" => Some("js_container_getBackend"), + "composeUp" => Some("js_container_composeUp"), + _ => None, + }, + "perry/compose" => match imported.as_str() { + "up" => Some("js_compose_up"), + "down" => Some("js_compose_down"), + "ps" => Some("js_compose_ps"), + "logs" => Some("js_compose_logs"), + "exec" => Some("js_compose_exec"), + "config" => Some("js_compose_config"), + "start" => Some("js_compose_start"), + "stop" => Some("js_compose_stop"), + "restart" => Some("js_compose_restart"), + _ => None, + }, + "perry/workloads" => match imported.as_str() { + "runGraph" => Some("js_workload_runGraph"), + _ => None, + }, + _ => None, + }; + + if let Some(ffi) = ffi_name { + ctx.register_imported_func(local.clone(), ffi.to_string()); + } else { + // Register as native module function with the original method name + // e.g., import { v4 as uuid } from 'uuid' -> uuid maps to uuid.v4 + ctx.register_native_module(local.clone(), source.clone(), Some(imported.clone())); + } + // Auto-register parentPort from worker_threads as a native instance // (it's a singleton, not created via `new`) if source == "worker_threads" && imported == "parentPort" { diff --git a/crates/perry-runtime/src/bigint.rs b/crates/perry-runtime/src/bigint.rs index 4ef5345bf..2dfc0d358 100644 --- a/crates/perry-runtime/src/bigint.rs +++ b/crates/perry-runtime/src/bigint.rs @@ -145,6 +145,22 @@ pub extern "C" fn js_bigint_from_string(data: *const u8, len: u32) -> *mut BigIn let bytes = std::slice::from_raw_parts(data, len as usize); let s = std::str::from_utf8_unchecked(bytes); + // Fast path: decimal string that fits in i64. Postgres `int8` + // results, Node `Date.now()` timestamps, app IDs — the common + // BigInt input in real code is well under 2^63. For those we + // skip the per-digit 16-limb multiply (~300 u128 muls for a + // 20-char input) and let Rust's native str→i64 handle parsing + // in a single pass. + // + // `i64::from_str` returns Err on overflow / non-digit, and we + // fall through to the general path so hex, floats-of-ints, and + // arbitrary-precision still work exactly as before. + if !s.starts_with("0x") && !s.starts_with("0X") { + if let Ok(v) = s.parse::() { + return js_bigint_from_i64(v); + } + } + // Handle negative prefix let (is_negative, s) = if s.starts_with('-') { (true, &s[1..]) diff --git a/crates/perry-runtime/src/buffer.rs b/crates/perry-runtime/src/buffer.rs index d4d8ef52a..b9ad072c7 100644 --- a/crates/perry-runtime/src/buffer.rs +++ b/crates/perry-runtime/src/buffer.rs @@ -276,6 +276,37 @@ pub extern "C" fn js_uint8array_alloc(length: i32) -> *mut BufferHeader { buf } +/// `new Uint8Array(x)` runtime dispatch. +/// +/// The codegen can't always statically distinguish `new Uint8Array(n)` (numeric +/// length) from `new Uint8Array(arr)` (source array) when `n` is not a literal, +/// so this entry point inspects the NaN-box tag on the incoming value and +/// routes accordingly. Before this helper the catch-all codegen arm always +/// called `js_uint8array_from_array`, which treated numeric lengths as +/// `ArrayHeader*` and silently produced a zero-length buffer (closes #38). +#[no_mangle] +pub extern "C" fn js_uint8array_new(val: f64) -> *mut BufferHeader { + let bits = val.to_bits(); + let top16 = (bits >> 48) as u16; + // POINTER_TAG (0x7FFD) — an object/array pointer. Treat as source array. + if top16 == 0x7FFD { + let ptr = (bits & 0x0000_FFFF_FFFF_FFFF) as *const ArrayHeader; + return js_uint8array_from_array(ptr); + } + // Plain IEEE double (upper16 < 0x7FFC or > 0x7FFF) — numeric length. + if top16 < 0x7FFC || top16 > 0x7FFF { + let len = if val.is_finite() && val >= 0.0 { + val as i32 + } else { + 0 + }; + return js_uint8array_alloc(len); + } + // Any other tag (undefined/null/bool/string/bigint) → empty buffer, + // matching the JS semantics of `new Uint8Array(undefined)` et al. + js_uint8array_alloc(0) +} + /// Allocate a zero-filled buffer #[no_mangle] pub extern "C" fn js_buffer_alloc(size: i32, fill: i32) -> *mut BufferHeader { @@ -840,8 +871,8 @@ fn buffer_index_of_bytes(buf: *const BufferHeader, needle: &[u8], start: i32) -> } } -/// `buf.indexOf(needle, start?)` where `needle` is a string or buffer -/// (NaN-boxed value). +/// `buf.indexOf(needle, start?)` where `needle` is a string, buffer, +/// or numeric byte value (NaN-boxed value). #[no_mangle] pub extern "C" fn js_buffer_index_of(buf_ptr: f64, needle: f64, start: i32) -> i32 { let buf = unbox_buffer_ptr(buf_ptr.to_bits()) as *const BufferHeader; @@ -876,7 +907,18 @@ pub extern "C" fn js_buffer_index_of(buf_ptr: f64, needle: f64, start: i32) -> i } } } - -1 + // Numeric byte needle — INT32_TAG or plain double + let byte_val = if top16 == 0x7FFE { + // INT32_TAG: lower 32 bits are an i32 + (needle_bits as u32) & 0xFF + } else if top16 < 0x7FF8 || (top16 == 0x7FF8 && needle_bits == 0x7FF8_0000_0000_0000) { + // Raw double — convert to byte + ((needle as i64) & 0xFF) as u32 + } else { + return -1; + }; + let byte = [byte_val as u8]; + buffer_index_of_bytes(buf, &byte, start) } /// `buf.includes(needle, start?)` — boolean i32. diff --git a/crates/perry-runtime/src/closure.rs b/crates/perry-runtime/src/closure.rs index 51f9634a5..bf99e3b24 100644 --- a/crates/perry-runtime/src/closure.rs +++ b/crates/perry-runtime/src/closure.rs @@ -679,9 +679,6 @@ pub extern "C" fn js_closure_unbind_this(val: f64) -> f64 { #[no_mangle] pub extern "C" fn js_sharp_negate() -> i64 { 0 } #[no_mangle] pub extern "C" fn js_sharp_quality() -> i64 { 0 } #[no_mangle] pub extern "C" fn js_sharp_to_format() -> i64 { 0 } -#[no_mangle] pub extern "C" fn js_sqlite_transaction() -> i64 { 0 } -#[no_mangle] pub extern "C" fn js_sqlite_transaction_commit() -> i64 { 0 } -#[no_mangle] pub extern "C" fn js_sqlite_transaction_rollback() -> i64 { 0 } #[cfg(test)] mod tests { use super::*; diff --git a/crates/perry-runtime/src/gc.rs b/crates/perry-runtime/src/gc.rs index dcd58d71b..eedf78248 100644 --- a/crates/perry-runtime/src/gc.rs +++ b/crates/perry-runtime/src/gc.rs @@ -42,6 +42,14 @@ pub const GC_TYPE_MAP: u8 = 8; pub const GC_FLAG_MARKED: u8 = 0x01; pub const GC_FLAG_ARENA: u8 = 0x02; pub const GC_FLAG_PINNED: u8 = 0x04; +/// Set on a keys-array that was handed out by `shape_cache_insert`. +/// `js_object_set_field_by_name` reads this bit to decide whether it +/// must clone before mutating (shared arrays can't be mutated in +/// place; fresh arrays allocated in the `keys.is_null()` branch can). +/// Without the bit the clone fires on every property added to every +/// fresh object literal — a 20-property row object allocates 19 +/// throwaway keys_array clones per row. +pub const GC_FLAG_SHAPE_SHARED: u8 = 0x08; // Object flags stored in GcHeader._reserved (u16) for Object.freeze/seal/preventExtensions pub const OBJ_FLAG_FROZEN: u16 = 0x01; @@ -202,6 +210,55 @@ pub fn gc_malloc(size: usize, obj_type: u8) -> *mut u8 { } } +/// Batch-allocate multiple GC-tracked malloc objects in one go. +/// Amortises overhead: one `gc_check_trigger` call, one `MALLOC_OBJECTS` +/// extend, one `MALLOC_SET` extend — instead of N of each. +/// `sizes` contains the *payload* size for each object (excluding GcHeader). +/// Returns a Vec of user pointers (past the header), one per entry. +pub fn gc_malloc_batch(sizes: &[usize], obj_type: u8) -> Vec<*mut u8> { + gc_check_trigger(); // once, not N times + + let n = sizes.len(); + let mut results = Vec::with_capacity(n); + let mut headers = Vec::with_capacity(n); + + unsafe { + GC_IN_ALLOC.with(|f| f.set(true)); + + for &size in sizes { + let total = GC_HEADER_SIZE + size; + let layout = Layout::from_size_align(total, 8).unwrap(); + let raw = alloc(layout); + if raw.is_null() { + panic!("gc_malloc_batch: failed to allocate {} bytes", total); + } + let header = raw as *mut GcHeader; + (*header).obj_type = obj_type; + (*header).gc_flags = 0; + (*header)._reserved = 0; + (*header).size = total as u32; + + headers.push(header); + results.push(raw.add(GC_HEADER_SIZE)); + } + + MALLOC_OBJECTS.with(|list| { + let mut list = list.borrow_mut(); + list.extend_from_slice(&headers); + }); + MALLOC_SET.with(|set| { + let mut set = set.borrow_mut(); + for &h in &headers { + set.insert(h as usize); + } + }); + + GC_IN_ALLOC.with(|f| f.set(false)); + } + + results +} + /// Reallocate a malloc-tracked object, preserving GcHeader. /// `old_user_ptr` is the pointer previously returned by gc_malloc. /// Returns new user pointer (after header). @@ -434,6 +491,12 @@ fn gc_collect_inner() { // 4. Trace from marked roots (iterative worklist) trace_marked_objects(&valid_ptrs); + // 5. Block-persistence pass: arena blocks survive whole or not at all, so + // arena objects sharing a block with a root-reachable object persist + // even when not themselves reachable. Their malloc children must stay + // alive too (issues #43 / #44). + mark_block_persisting_arena_objects(&valid_ptrs); + // === SWEEP PHASE === // sweep() now clears mark bits on surviving objects inline, // eliminating 2 redundant heap walks (arena + malloc). @@ -740,6 +803,30 @@ fn mark_registered_roots(valid_ptrs: &ValidPointerSet) { } } +/// Process a worklist of already-marked headers: follow references iteratively, +/// marking newly-reached objects and pushing them onto the worklist. +fn drain_trace_worklist(worklist: &mut Vec<*mut GcHeader>, valid_ptrs: &ValidPointerSet) { + let mut i = 0; + while i < worklist.len() { + let header = worklist[i]; + i += 1; + + unsafe { + let user_ptr = (header as *mut u8).add(GC_HEADER_SIZE); + match (*header).obj_type { + GC_TYPE_ARRAY => trace_array(user_ptr, valid_ptrs, worklist), + GC_TYPE_OBJECT => trace_object(user_ptr, valid_ptrs, worklist), + GC_TYPE_CLOSURE => trace_closure(user_ptr, valid_ptrs, worklist), + GC_TYPE_PROMISE => trace_promise(user_ptr, valid_ptrs, worklist), + GC_TYPE_ERROR => trace_error(user_ptr, valid_ptrs, worklist), + GC_TYPE_MAP => trace_map(user_ptr, valid_ptrs, worklist), + GC_TYPE_STRING | GC_TYPE_BIGINT => {} + _ => {} + } + } + } +} + /// Trace from marked objects: follow references iteratively using a worklist. fn trace_marked_objects(valid_ptrs: &ValidPointerSet) { // Collect all currently-marked objects into a worklist @@ -767,39 +854,69 @@ fn trace_marked_objects(valid_ptrs: &ValidPointerSet) { } }); - // Process worklist - let mut i = 0; - while i < worklist.len() { - let header = worklist[i]; - i += 1; + drain_trace_worklist(&mut worklist, valid_ptrs); +} - unsafe { - let user_ptr = (header as *mut u8).add(GC_HEADER_SIZE); - match (*header).obj_type { - GC_TYPE_ARRAY => { - trace_array(user_ptr, valid_ptrs, &mut worklist); - } - GC_TYPE_OBJECT => { - trace_object(user_ptr, valid_ptrs, &mut worklist); - } - GC_TYPE_CLOSURE => { - trace_closure(user_ptr, valid_ptrs, &mut worklist); - } - GC_TYPE_PROMISE => { - trace_promise(user_ptr, valid_ptrs, &mut worklist); - } - GC_TYPE_ERROR => { - trace_error(user_ptr, valid_ptrs, &mut worklist); - } - GC_TYPE_MAP => { - trace_map(user_ptr, valid_ptrs, &mut worklist); +/// Block-persistence pass: arena block reset is all-or-nothing, so any arena +/// object in a block that has at least one reachable object will persist in +/// memory whether or not the object itself was reached from a root. Any +/// malloc children referenced by those persisting arena objects must therefore +/// be kept alive — otherwise they get freed by sweep and the persisting arena +/// object holds dangling pointers. +/// +/// Why this matters: during `arr.push(new_obj)`, the new object is in a +/// caller-saved register between its allocation and the write into `arr`. +/// If array growth triggers GC in that window, conservative stack scanning +/// (setjmp only captures callee-saved regs) doesn't see the new object as a +/// root. The arena block containing the new object still survives (other +/// objects in that block are reachable from `arr`), so the new object's +/// memory is intact. But its malloc-allocated string fields ("Record X", +/// email, etc.) get swept, and JSON.stringify later reads freed memory. +/// Repro: issues #43 / #44. +/// +/// Iterates until fixed point because marking an arena object may trace a +/// child in a previously-dead block, making it live in the next round. +fn mark_block_persisting_arena_objects(valid_ptrs: &ValidPointerSet) { + let mut worklist: Vec<*mut GcHeader> = Vec::new(); + loop { + let n_blocks = crate::arena::arena_block_count(); + let mut block_has_live: Vec = vec![false; n_blocks]; + + // Pass 1: compute which blocks have any reachable (marked/pinned) object. + crate::arena::arena_walk_objects_with_block_index(|header_ptr, block_idx| { + let header = header_ptr as *mut GcHeader; + unsafe { + if (*header).gc_flags & (GC_FLAG_MARKED | GC_FLAG_PINNED) != 0 { + if block_idx < block_has_live.len() { + block_has_live[block_idx] = true; + } } - GC_TYPE_STRING | GC_TYPE_BIGINT => { - // Leaf nodes - no children to trace + } + }); + + // Pass 2: mark any unmarked arena object in a live block and enqueue. + let mut newly_marked = 0usize; + crate::arena::arena_walk_objects_with_block_index(|header_ptr, block_idx| { + if block_idx >= block_has_live.len() || !block_has_live[block_idx] { + return; + } + let header = header_ptr as *mut GcHeader; + unsafe { + if (*header).gc_flags & (GC_FLAG_MARKED | GC_FLAG_PINNED) == 0 { + (*header).gc_flags |= GC_FLAG_MARKED; + worklist.push(header); + newly_marked += 1; } - _ => {} } + }); + + if newly_marked == 0 { + break; } + + // Trace newly marked; may mark children in previously-dead blocks, + // requiring another round to pick them up. + drain_trace_worklist(&mut worklist, valid_ptrs); } } @@ -866,8 +983,10 @@ unsafe fn trace_array(user_ptr: *mut u8, valid_ptrs: &ValidPointerSet, worklist: let length = (*arr).length; let capacity = (*arr).capacity; - // Sanity checks: reject corrupt length/capacity to avoid scanning wild memory - if length > capacity || length > 65536 { + // Sanity check: reject corrupt length/capacity to avoid scanning wild memory. + // The 16M cap is a garbage-recognition guard (no realistic array exceeds it); + // real programs routinely push >65k items into arrays (issue #44 repro hits 100k). + if length > capacity || length > 16_000_000 { return; } @@ -896,8 +1015,8 @@ unsafe fn trace_object(user_ptr: *mut u8, valid_ptrs: &ValidPointerSet, worklist let field_count = (*obj).field_count; // Sanity check: reject corrupt field_count to avoid scanning wild memory. - // Object fields start after ObjectHeader (24 bytes). Max reasonable: ~64K fields. - if field_count > 65536 { + // 1M is a garbage-recognition guard — legitimate objects never have that many fields. + if field_count > 1_000_000 { return; } @@ -1261,18 +1380,35 @@ pub fn shape_cache_root_scanner(mark: &mut dyn FnMut(f64)) { crate::object::scan_shape_cache_roots(mark); } +/// Root scanner for the shape-transition cache used by the dynamic-key +/// write path (`obj[name] = value`). Same role as `shape_cache_root_scanner` +/// — without it, GC would free cached target keys_arrays that no live +/// object currently references directly. +pub fn transition_cache_root_scanner(mark: &mut dyn FnMut(f64)) { + crate::object::scan_transition_cache_roots(mark); +} + /// Root scanner for OVERFLOW_FIELDS (per-object extra properties beyond inline slots) pub fn overflow_fields_root_scanner(mark: &mut dyn FnMut(f64)) { crate::object::scan_overflow_fields_roots(mark); } +/// Root scanner for in-progress JSON.parse frames (issue #46). +/// Without this, GC triggered mid-parse would sweep in-progress arrays/objects +/// and the fresh string/object values about to be pushed into them. +pub fn json_parse_root_scanner(mark: &mut dyn FnMut(f64)) { + crate::json::scan_parse_roots(mark); +} + /// Initialize GC root scanners. Called once at runtime startup. pub fn gc_init() { gc_register_root_scanner(promise_root_scanner); gc_register_root_scanner(timer_root_scanner); gc_register_root_scanner(exception_root_scanner); gc_register_root_scanner(shape_cache_root_scanner); + gc_register_root_scanner(transition_cache_root_scanner); gc_register_root_scanner(overflow_fields_root_scanner); + gc_register_root_scanner(json_parse_root_scanner); } /// FFI: initialize GC (called from compiled code startup) diff --git a/crates/perry-runtime/src/json.rs b/crates/perry-runtime/src/json.rs index 780fda198..b5a500d8c 100644 --- a/crates/perry-runtime/src/json.rs +++ b/crates/perry-runtime/src/json.rs @@ -15,6 +15,75 @@ use std::fmt::Write as FmtWrite; thread_local! { /// Stack of object pointers currently being stringified (for circular detection). static STRINGIFY_STACK: RefCell> = RefCell::new(Vec::new()); + + /// Key string intern cache for JSON.parse (issue #51 follow-up). + /// Maps key bytes → already-allocated StringHeader pointer. + /// Avoids re-allocating "id", "name", etc. for every record in a + /// homogeneous JSON array. Cleared at the end of each top-level parse. + static PARSE_KEY_CACHE: RefCell, *const StringHeader>> = + RefCell::new(std::collections::HashMap::new()); + + /// GC roots for in-progress JSON.parse. Each entry is a JSValue bit pattern + /// (stored as f64 so the scanner can hand it to the NaN-boxed mark path). + /// + /// Why this exists (issue #46): parse_array/parse_object build their result + /// incrementally over thousands of iterations. Mid-parse heap allocations + /// (`js_string_from_bytes` → gc_malloc → adaptive count trigger, or an arena + /// block overflow) run GC while the in-progress array/object lives only on + /// the Rust call stack. The conservative stack scan only captures callee- + /// saved registers via setjmp; values held in caller-saved regs (or on + /// the Rust-heap backing of `Vec<(Vec, JSValue)>` inside parse_object) + /// are invisible and get swept. Symptom was `JSON.parse(big_array)` silently + /// truncating at ~1666 records (= when the second adaptive malloc GC fires). + static PARSE_ROOTS: RefCell> = RefCell::new(Vec::new()); +} + +#[inline] +fn parse_root_push(v: JSValue) -> usize { + PARSE_ROOTS.with(|r| { + let mut r = r.borrow_mut(); + let idx = r.len(); + r.push(f64::from_bits(v.bits())); + idx + }) +} + +#[inline] +fn parse_root_set(idx: usize, v: JSValue) { + PARSE_ROOTS.with(|r| { + if let Some(slot) = r.borrow_mut().get_mut(idx) { + *slot = f64::from_bits(v.bits()); + } + }); +} + +#[inline] +fn parse_root_save_len() -> usize { + PARSE_ROOTS.with(|r| r.borrow().len()) +} + +#[inline] +fn parse_root_restore(len: usize) { + PARSE_ROOTS.with(|r| r.borrow_mut().truncate(len)); +} + +/// Root scanner called by GC — marks every value in PARSE_ROOTS as live. +pub fn scan_parse_roots(mark: &mut dyn FnMut(f64)) { + PARSE_ROOTS.with(|r| { + for &v in r.borrow().iter() { + mark(v); + } + }); + // Also mark interned key strings so GC doesn't sweep them mid-parse. + PARSE_KEY_CACHE.with(|c| { + for &ptr in c.borrow().values() { + if !ptr.is_null() { + mark(f64::from_bits( + crate::value::STRING_TAG | (ptr as u64 & 0x0000_FFFF_FFFF_FFFF), + )); + } + } + }); } // ─── Zero-copy string access ────────────────────────────────────────────────── @@ -32,6 +101,22 @@ unsafe fn str_from_header<'a>(ptr: *const StringHeader) -> Option<&'a str> { // ─── Direct JSON parser ──────────────────────────────────────────────────────── +/// Result of parsing a JSON string: either a zero-copy borrow from the +/// input buffer (no escapes) or an owned allocation (had escape sequences). +enum ParsedStr<'a> { + Borrowed(&'a [u8]), + Owned(Vec), +} + +impl<'a> ParsedStr<'a> { + fn as_bytes(&self) -> &[u8] { + match self { + ParsedStr::Borrowed(s) => s, + ParsedStr::Owned(v) => v, + } + } +} + struct DirectParser<'a> { input: &'a [u8], pos: usize, @@ -89,20 +174,42 @@ impl<'a> DirectParser<'a> { unsafe fn parse_string_value(&mut self) -> JSValue { if let Some(s) = self.parse_string_bytes() { - let ptr = js_string_from_bytes(s.as_ptr(), s.len() as u32); + let b = s.as_bytes(); + let ptr = js_string_from_bytes(b.as_ptr(), b.len() as u32); JSValue::string_ptr(ptr) } else { JSValue::null() } } - fn parse_string_bytes(&mut self) -> Option> { + /// Zero-copy fast path: if the string has no escape sequences, + /// return a direct slice into the input buffer. Falls back to + /// `parse_string_bytes_slow` for strings containing `\`. + fn parse_string_bytes(&mut self) -> Option> { if self.peek() != Some(b'"') { return None; } self.advance(); + let start = self.pos; + // Fast scan: look for closing `"` without any `\`. + while self.pos < self.input.len() { + let ch = self.input[self.pos]; + if ch == b'"' { + let slice = &self.input[start..self.pos]; + self.pos += 1; + return Some(ParsedStr::Borrowed(slice)); + } + if ch == b'\\' { + // Has escapes — fall back to slow path from current position. + return self.parse_string_bytes_slow(start); + } + self.pos += 1; + } + None + } - let mut result = Vec::new(); + fn parse_string_bytes_slow(&mut self, start: usize) -> Option> { + let mut result = Vec::from(&self.input[start..self.pos]); loop { if self.pos >= self.input.len() { return None; @@ -110,7 +217,7 @@ impl<'a> DirectParser<'a> { let ch = self.input[self.pos]; self.pos += 1; match ch { - b'"' => return Some(result), + b'"' => return Some(ParsedStr::Owned(result)), b'\\' => { if self.pos >= self.input.len() { return None; @@ -168,7 +275,7 @@ impl<'a> DirectParser<'a> { self.advance(); self.skip_whitespace(); - let mut pairs: Vec<(Vec, JSValue)> = Vec::new(); + let saved_roots = parse_root_save_len(); if self.peek() == Some(b'}') { self.advance(); @@ -178,6 +285,16 @@ impl<'a> DirectParser<'a> { return JSValue::object_ptr(js_obj as *mut u8); } + // Incremental build: allocate the object upfront and set fields + // as we parse them (no intermediate Vec). Combined with key + // interning (PARSE_KEY_CACHE) and transition-cache shape sharing + // (js_object_set_field_by_name), this gives: + // - First record of each schema: N key allocs + N transitions. + // - Subsequent records: 0 key allocs + N transition hits. + // - Zero Rust-heap Vec allocations per record. + let js_obj = js_object_alloc(0, 0); + let _obj_slot = parse_root_push(JSValue::object_ptr(js_obj as *mut u8)); + loop { self.skip_whitespace(); let key = match self.parse_string_bytes() { @@ -190,7 +307,29 @@ impl<'a> DirectParser<'a> { } let value = self.parse_value(); - pairs.push((key, value)); + // Root the value before the key-intern + set_field path + // (which may allocate and trigger GC). + parse_root_push(value); + + let key_bytes = key.as_bytes(); + // Two-phase lookup: check cache with immutable borrow first, + // then allocate OUTSIDE the borrow (js_string_from_bytes can + // trigger GC → scan_parse_roots → borrow() on same RefCell). + let cached = PARSE_KEY_CACHE.with(|c| { + c.borrow().get(key_bytes).copied() + }); + let key_ptr = if let Some(p) = cached { + p + } else { + let ptr = js_string_from_bytes(key_bytes.as_ptr(), key_bytes.len() as u32); + PARSE_KEY_CACHE.with(|c| { + c.borrow_mut().insert(key_bytes.to_vec(), ptr); + }); + ptr + }; + crate::object::js_object_set_field_by_name( + js_obj, key_ptr as *mut StringHeader, f64::from_bits(value.bits()), + ); self.skip_whitespace(); if self.peek() == Some(b',') { @@ -200,17 +339,7 @@ impl<'a> DirectParser<'a> { } } self.expect(b'}'); - - let count = pairs.len(); - let js_obj = js_object_alloc(0, count as u32); - let keys_arr = js_array_alloc(count as u32); - - for (idx, (key, value)) in pairs.into_iter().enumerate() { - let key_ptr = js_string_from_bytes(key.as_ptr(), key.len() as u32); - js_array_push(keys_arr, JSValue::string_ptr(key_ptr)); - js_object_set_field(js_obj, idx as u32, value); - } - js_object_set_keys(js_obj, keys_arr); + parse_root_restore(saved_roots); JSValue::object_ptr(js_obj as *mut u8) } @@ -218,16 +347,25 @@ impl<'a> DirectParser<'a> { self.advance(); self.skip_whitespace(); + let saved_roots = parse_root_save_len(); let mut js_arr = js_array_alloc(16); + let arr_slot = parse_root_push(JSValue::object_ptr(js_arr as *mut u8)); if self.peek() == Some(b']') { self.advance(); + parse_root_restore(saved_roots); return JSValue::object_ptr(js_arr as *mut u8); } loop { let value = self.parse_value(); + // Root value before push — js_array_push may grow (arena alloc → GC) + // and value's heap ptr lives only in a caller-saved register here. + parse_root_push(value); js_arr = js_array_push(js_arr, value); + // js_array_push may have returned a new ArrayHeader* after grow; + // update the root slot so GC sees the new pointer, not the stale one. + parse_root_set(arr_slot, JSValue::object_ptr(js_arr as *mut u8)); self.skip_whitespace(); if self.peek() == Some(b',') { @@ -237,6 +375,7 @@ impl<'a> DirectParser<'a> { } } self.expect(b']'); + parse_root_restore(saved_roots); JSValue::object_ptr(js_arr as *mut u8) } @@ -320,8 +459,19 @@ pub unsafe extern "C" fn js_json_parse(text_ptr: *const StringHeader) -> JSValue crate::exception::js_throw(f64::from_bits(err_val.bits())); } + // Root the input StringHeader for the duration of the parse. The parser + // holds `input: &[u8]` pointing INTO the string's data region — a pointer + // the conservative stack scan / valid-pointer-set won't match (it only + // indexes user pointers at `header + sizeof(GcHeader)`). Without this root + // the input string could be swept mid-parse and `bytes` would dangle. + let text_root = parse_root_push(JSValue::string_ptr(text_ptr as *mut StringHeader)); + let mut parser = DirectParser::new(bytes); let result = parser.parse_value(); + // Also root the final result while we evaluate the error path; a throw + // below (which allocates its message via gc_malloc) must not sweep the + // just-parsed top-level value. + parse_root_push(result); // If parser didn't consume meaningful input (result is null and input wasn't "null"), // the input was invalid JSON — throw SyntaxError @@ -337,6 +487,7 @@ pub unsafe extern "C" fn js_json_parse(text_ptr: *const StringHeader) -> JSValue } } + parse_root_restore(text_root); result } @@ -374,6 +525,18 @@ unsafe fn extract_pointer(bits: u64) -> Option<*const u8> { } } +/// Read the GC header's object type tag for a user-space heap pointer. +/// The GcHeader sits 8 bytes before `ptr`; its first byte is `obj_type`. +/// Returns 0 when `ptr` is null or in the low-memory guard range. +#[inline] +unsafe fn gc_obj_type(ptr: *const u8) -> u8 { + if ptr.is_null() || (ptr as usize) < 0x1000 { + return 0; + } + // GcHeader.obj_type is at offset 0 (see crate::gc::GcHeader layout). + *(ptr.sub(crate::gc::GC_HEADER_SIZE)) +} + #[inline] unsafe fn is_object_pointer(ptr: *const u8) -> bool { let obj = ptr as *const crate::ObjectHeader; @@ -552,23 +715,52 @@ unsafe fn stringify_value(value: f64, type_hint: u32, buf: &mut String) { return; } - if is_object_pointer(ptr) { - stringify_object(ptr, buf); - } else { - let arr = ptr as *const crate::ArrayHeader; - if !arr.is_null() { - let len = (*arr).length; - let cap = (*arr).capacity; - if len <= cap && cap > 0 && cap < 10000 { - stringify_array(ptr, buf); - return; + // Prefer the GC header's obj_type tag for dispatch — the old + // capacity heuristic (`cap < 10000`) misidentified legitimate + // arrays that had grown past 10k as strings, panicking on + // `JSON.stringify(arr)` where `arr.length >= 10000` (issue #43). + match gc_obj_type(ptr) { + crate::gc::GC_TYPE_ARRAY => stringify_array(ptr, buf), + crate::gc::GC_TYPE_OBJECT => { + if is_object_pointer(ptr) { + stringify_object(ptr, buf); + } else { + buf.push_str("null"); } } - let str_ptr = ptr as *const StringHeader; - if let Some(s) = str_from_header(str_ptr) { - write_escaped_string(buf, s); - } else { - buf.push_str("null"); + crate::gc::GC_TYPE_STRING => { + let str_ptr = ptr as *const StringHeader; + if let Some(s) = str_from_header(str_ptr) { + write_escaped_string(buf, s); + } else { + buf.push_str("null"); + } + } + _ => { + // Unknown/untagged pointer: fall back to the structural + // heuristics for safety (e.g. pointers to non-GC-tracked + // memory). Arrays up to 10k cap are dispatched here; + // above that we defensively emit "null" rather than + // trying to treat them as strings. + if is_object_pointer(ptr) { + stringify_object(ptr, buf); + } else { + let arr = ptr as *const crate::ArrayHeader; + if !arr.is_null() { + let len = (*arr).length; + let cap = (*arr).capacity; + if len <= cap && cap > 0 && cap < 10000 { + stringify_array(ptr, buf); + return; + } + } + let str_ptr = ptr as *const StringHeader; + if let Some(s) = str_from_header(str_ptr) { + write_escaped_string(buf, s); + } else { + buf.push_str("null"); + } + } } } return; @@ -695,15 +887,16 @@ unsafe fn stringify_array(ptr: *const u8, buf: &mut String) { } else { elem_bits as *const u8 }; - if is_object_pointer(elem_ptr) { - stringify_object(elem_ptr, buf); - } else { - let arr_elem = elem_ptr as *const crate::ArrayHeader; - let arr_len = (*arr_elem).length; - let arr_cap = (*arr_elem).capacity; - if arr_len <= arr_cap && arr_cap > 0 && arr_cap < 10000 { - stringify_array(elem_ptr, buf); - } else { + match gc_obj_type(elem_ptr) { + crate::gc::GC_TYPE_ARRAY => stringify_array(elem_ptr, buf), + crate::gc::GC_TYPE_OBJECT => { + if is_object_pointer(elem_ptr) { + stringify_object(elem_ptr, buf); + } else { + buf.push_str("null"); + } + } + crate::gc::GC_TYPE_STRING => { let str_ptr = elem_ptr as *const StringHeader; if let Some(s) = str_from_header(str_ptr) { write_escaped_string(buf, s); @@ -711,6 +904,25 @@ unsafe fn stringify_array(ptr: *const u8, buf: &mut String) { buf.push_str("null"); } } + _ => { + if is_object_pointer(elem_ptr) { + stringify_object(elem_ptr, buf); + } else { + let arr_elem = elem_ptr as *const crate::ArrayHeader; + let arr_len = (*arr_elem).length; + let arr_cap = (*arr_elem).capacity; + if arr_len <= arr_cap && arr_cap > 0 && arr_cap < 10000 { + stringify_array(elem_ptr, buf); + } else { + let str_ptr = elem_ptr as *const StringHeader; + if let Some(s) = str_from_header(str_ptr) { + write_escaped_string(buf, s); + } else { + buf.push_str("null"); + } + } + } + } } } else { write_number(buf, elem); diff --git a/crates/perry-runtime/src/object.rs b/crates/perry-runtime/src/object.rs index bd12a5585..b2ab10025 100644 --- a/crates/perry-runtime/src/object.rs +++ b/crates/perry-runtime/src/object.rs @@ -13,13 +13,96 @@ use std::cell::{Cell, RefCell}; use std::ptr; use std::collections::HashMap; use std::sync::RwLock; +use std::sync::atomic::{AtomicBool, Ordering}; /// Overflow field storage for objects that exceed their pre-allocated inline slot count. -/// Keyed by (obj_ptr as usize) -> (field_index -> JSValue bits). +/// Keyed by (obj_ptr as usize) -> Vec indexed by absolute field_index +/// (inline slots 0..alloc_limit remain `TAG_UNDEFINED` placeholders in the Vec; +/// they're never read since the inline slots are checked first). +/// +/// Was a `HashMap>` through v0.5.29 — the inner HashMap +/// dominated the row-decode hot path: a 20-property row object touches the overflow +/// storage on each of its 12 post-8-slot writes, and HashMap ops (hash + probe + +/// mut insert) cost ~40-50ns each. Flat `Vec` is ~5ns per append + index; +/// removes most of the residual gap after the shape-transition cache landed. +/// /// This handles cases like Object.assign() adding many fields to an object /// that was allocated with only 8 slots (e.g., @noble/curves Fp field with 21 properties). thread_local! { - static OVERFLOW_FIELDS: RefCell>> = RefCell::new(HashMap::new()); + static OVERFLOW_FIELDS: RefCell>> = RefCell::new(HashMap::new()); +} + +/// Last-accessed overflow Vec cache — one entry, keyed by `obj_ptr`. +/// Skips the outer HashMap lookup on consecutive writes to the same +/// object (exactly the row-build pattern: a single object gets its +/// overflow slots filled back-to-back). Refreshed on every slow-path +/// HashMap access; invalidated by `clear_overflow_for_ptr` when GC +/// sweep frees the corresponding object. +/// +/// Safety: the cached pointer references the `Vec` struct stored +/// inside a HashMap bucket. That struct only moves when the HashMap +/// resizes, which only happens on `entry().or_default()` inserting a +/// fresh key. The slow path below does both the potentially-resizing +/// call and the cache refresh inside a single `OVERFLOW_FIELDS.with` +/// closure, so no other thread-local mutation can interleave between +/// obtaining `&mut Vec` and caching its address. +thread_local! { + static OVERFLOW_LAST: std::cell::UnsafeCell<(usize, *mut Vec)> = + std::cell::UnsafeCell::new((0, std::ptr::null_mut())); +} + +/// Read the u64 bits stored at `field_index` for `obj`, or `None` if absent. +/// Positions never written are stored as `TAG_UNDEFINED`; this helper reports +/// them as `None` so callers can return JS `undefined` uniformly with the +/// "no Vec entry at all" case. +#[inline] +fn overflow_get(obj_ptr: usize, field_index: usize) -> Option { + OVERFLOW_FIELDS.with(|m| { + m.borrow() + .get(&obj_ptr) + .and_then(|v| v.get(field_index).copied()) + .filter(|&bits| bits != crate::value::TAG_UNDEFINED) + }) +} + +/// Write `vbits` to the overflow slot `field_index` for `obj`. Grows the +/// per-object `Vec` to `field_index + 1` with `TAG_UNDEFINED` fillers if +/// needed (filler slots correspond to the object's inline region and are +/// never read). +/// +/// Fast path skips the outer HashMap when `obj_ptr` matches the last- +/// accessed Vec — the common row-build pattern where an object's +/// overflow slots fill in sequence. +#[inline] +fn overflow_set(obj_ptr: usize, field_index: usize, vbits: u64) { + let hit = OVERFLOW_LAST.with(|c| unsafe { + let (cached_obj, cached_vec) = *c.get(); + if cached_obj == obj_ptr && !cached_vec.is_null() { + let v = &mut *cached_vec; + if v.len() <= field_index { + v.resize(field_index + 1, crate::value::TAG_UNDEFINED); + } + *v.get_unchecked_mut(field_index) = vbits; + true + } else { + false + } + }); + if hit { + return; + } + OVERFLOW_FIELDS.with(|m| { + let mut map = m.borrow_mut(); + let v = map.entry(obj_ptr).or_default(); + if v.len() <= field_index { + v.resize(field_index + 1, crate::value::TAG_UNDEFINED); + } + v[field_index] = vbits; + let vec_ptr = v as *mut Vec; + OVERFLOW_LAST.with(|c| unsafe { + *c.get() = (obj_ptr, vec_ptr); + }); + }); } /// Per-property attribute flags set by `Object.defineProperty` / `Object.freeze` / `Object.seal`. @@ -70,8 +153,19 @@ thread_local! { /// on this thread, so hot `js_object_get_field_by_name` / `set_field_by_name` /// can skip the `ACCESSOR_DESCRIPTORS` HashMap lookup entirely. pub(crate) static ACCESSORS_IN_USE: Cell = const { Cell::new(false) }; + /// Fast-path gate for `PROPERTY_DESCRIPTORS` — flipped the first time + /// `Object.defineProperty` (or freeze/seal via `set_property_attrs`) + /// installs a per-property descriptor. Lets the hot object-write path + /// skip the `.to_string()` allocation required to look up a descriptor + /// that almost never exists. + pub(crate) static PROPERTY_ATTRS_IN_USE: Cell = const { Cell::new(false) }; } +/// Global monotonic flag: set once any accessor or property descriptor is +/// installed. Checked on every dynamic property write via a single +/// `Relaxed` load (no TLS overhead, no fence on aarch64/x86). +static GLOBAL_DESCRIPTORS_IN_USE: AtomicBool = AtomicBool::new(false); + /// Look up the property descriptor for (obj, key). Returns None if no entry exists, /// in which case the JS default `{ writable: true, enumerable: true, configurable: true }` applies. pub(crate) fn get_property_attrs(obj: usize, key: &str) -> Option { @@ -80,6 +174,8 @@ pub(crate) fn get_property_attrs(obj: usize, key: &str) -> Option /// Store a property descriptor for (obj, key). pub(crate) fn set_property_attrs(obj: usize, key: String, attrs: PropertyAttrs) { + PROPERTY_ATTRS_IN_USE.with(|c| c.set(true)); + GLOBAL_DESCRIPTORS_IN_USE.store(true, Ordering::Relaxed); PROPERTY_DESCRIPTORS.with(|m| { m.borrow_mut().insert((obj, key), attrs); }); } @@ -91,6 +187,7 @@ pub(crate) fn get_accessor_descriptor(obj: usize, key: &str) -> Option *mut ArrayHeader { /// (evicting any prior entry there) and also writes to the overflow /// map so misses on the inline cache still find the value. fn shape_cache_insert(shape_id: u32, keys_array: *mut ArrayHeader) { + // Mark the array as shape-shared so `js_object_set_field_by_name` + // knows it must clone before mutating. The clone path was firing + // every time *any* fresh object literal added a property beyond + // the first (because `key_count == field_count` with both + // counting up in lockstep); that's ~19 throwaway clones per + // 20-property row × 10k rows = 190k clones of growing size on a + // standard bulk decode. Gating the clone on this flag turns that + // into zero for locally-owned arrays. + if !keys_array.is_null() { + unsafe { + let gc_header = (keys_array as *const u8) + .sub(crate::gc::GC_HEADER_SIZE) as *mut crate::gc::GcHeader; + (*gc_header).gc_flags |= crate::gc::GC_FLAG_SHAPE_SHARED; + } + } SHAPE_INLINE_CACHE.with(|cache| { let slot = (shape_id as usize) & (SHAPE_INLINE_CACHE_SIZE - 1); unsafe { @@ -255,6 +367,122 @@ fn shape_cache_insert(shape_id: u32, keys_array: *mut ArrayHeader) { }); } +/// Thread-local shape-transition cache for the dynamic-key write path +/// (`obj[name] = value`). One entry per `(prev_keys_array, key_ptr)` edge +/// in the shape lattice. +/// +/// When `js_object_set_field_by_name` would otherwise do a linear scan +/// over `keys_array` to locate-or-append a key, it first looks up +/// `(obj.keys_array, key)` here. A hit tells us directly which +/// keys_array to transition the object to and which slot the field +/// lives in — no scan, no clone, no `js_array_push`. +/// +/// The cache is populated on the slow (append) path: after the scan +/// confirms the key is new and a new keys_array is built, the +/// transition `(prev_keys, key_ptr) → (new_keys, slot_idx)` is stored +/// here and `new_keys` is stamped `GC_FLAG_SHAPE_SHARED` so any future +/// extension clones before mutating (same invariant as the SHAPE_CACHE +/// for compile-time object literals). +/// +/// Direct-mapped, 4096 entries, each a self-describing record (full +/// key included) so a collision just misses instead of returning the +/// wrong slot. The target pointers are GC-rooted via +/// `scan_transition_cache_roots`. +/// +/// Two sentinel values: `prev_keys == 0` is the "keys_array is null" +/// edge (first property on a fresh `{}`), which lets a second object +/// building the same shape reuse the first's keys_array from the very +/// first write — no per-row allocation of a 1-entry keys_array. +#[derive(Clone, Copy)] +struct TransitionEntry { + prev_keys: usize, + key_hash: u64, + next_keys: usize, + slot_idx: u32, +} + +const TRANSITION_CACHE_SIZE: usize = 16384; + +/// Main-thread transition cache — bypasses TLS overhead (user code is +/// single-threaded). Worker threads spawned by `perry/thread` are +/// short-lived and don't share objects, so they don't need transitions. +static mut TRANSITION_CACHE_GLOBAL: [TransitionEntry; TRANSITION_CACHE_SIZE] = + [TransitionEntry { prev_keys: 0, key_hash: 0, next_keys: 0, slot_idx: 0 }; TRANSITION_CACHE_SIZE]; + +/// FNV-1a content hash for a property-name string. +#[inline(always)] +fn key_content_hash(key: *const crate::StringHeader) -> u64 { + unsafe { + let len = (*key).byte_len as usize; + let data = (key as *const u8).add(std::mem::size_of::()); + let mut h: u64 = 0xcbf29ce484222325; + for i in 0..len { + h ^= *data.add(i) as u64; + h = h.wrapping_mul(0x100000001b3); + } + h + } +} + +#[inline(always)] +fn transition_cache_slot(prev_keys: usize, key_hash: u64) -> usize { + let mixed = ((prev_keys >> 3) as u64).wrapping_mul(0x9E3779B97F4A7C15) + ^ key_hash.wrapping_mul(0xC6BC279692B5C323); + (mixed as usize) & (TRANSITION_CACHE_SIZE - 1) +} + +#[inline(always)] +fn transition_cache_lookup(prev_keys: usize, key: *const crate::StringHeader) -> Option<(usize, u32)> { + let kh = key_content_hash(key); + let slot = transition_cache_slot(prev_keys, kh); + let entry = unsafe { TRANSITION_CACHE_GLOBAL[slot] }; + if entry.next_keys != 0 && entry.prev_keys == prev_keys && entry.key_hash == kh { + Some((entry.next_keys, entry.slot_idx)) + } else { + None + } +} + +fn transition_cache_insert(prev_keys: usize, key: *const crate::StringHeader, next_keys: usize, slot_idx: u32) { + if next_keys == 0 { + return; + } + let kh = key_content_hash(key); + let slot = transition_cache_slot(prev_keys, kh); + unsafe { + TRANSITION_CACHE_GLOBAL[slot] = TransitionEntry { prev_keys, key_hash: kh, next_keys, slot_idx }; + } + // Mark the target as shape-shared so any future extension on the + // original owning object clones before mutating. Without this flag, + // the first row's next append would extend `next_keys` in place + // and every object that picked up `next_keys` via a cache hit + // would observe the mutation. + unsafe { + let gc_header = (next_keys as *const u8) + .wrapping_sub(crate::gc::GC_HEADER_SIZE) as *mut crate::gc::GcHeader; + if (next_keys) >= crate::gc::GC_HEADER_SIZE + && (*gc_header).obj_type == crate::gc::GC_TYPE_ARRAY + { + (*gc_header).gc_flags |= crate::gc::GC_FLAG_SHAPE_SHARED; + } + } +} + +/// GC root scanner for the transition cache. Same contract as +/// `scan_shape_cache_roots` — without this the mark phase would free +/// cached target arrays that no live object currently holds directly, +/// and the next cache-hit store would dereference freed memory. +pub fn scan_transition_cache_roots(mark: &mut dyn FnMut(f64)) { + unsafe { + for entry in TRANSITION_CACHE_GLOBAL.iter() { + if entry.next_keys != 0 { + let jsval = JSValue::pointer(entry.next_keys as *const u8); + mark(f64::from_bits(jsval.bits())); + } + } + } +} + /// GC root scanner: mark all cached shape keys arrays so they're not freed. /// The inline cache + overflow map both hold the raw `*mut ArrayHeader` /// pointers; without this scanner, GC would free those arrays, leaving @@ -289,7 +517,7 @@ pub fn scan_overflow_fields_roots(mark: &mut dyn FnMut(f64)) { OVERFLOW_FIELDS.with(|m| { let m = m.borrow(); for fields in m.values() { - for &val_bits in fields.values() { + for &val_bits in fields.iter() { // Mark any NaN-boxed heap pointer (POINTER_TAG, STRING_TAG, BIGINT_TAG) let tag = val_bits >> 48; if tag == 0x7FFD || tag == 0x7FFF || tag == 0x7FFA { @@ -307,6 +535,13 @@ pub fn clear_overflow_for_ptr(obj_ptr: usize) { OVERFLOW_FIELDS.with(|m| { m.borrow_mut().remove(&obj_ptr); }); + // If the freed object is the one our last-accessed cache points at, + // the cached `Vec` pointer is now dangling — clear it. + OVERFLOW_LAST.with(|c| unsafe { + if (*c.get()).0 == obj_ptr { + *c.get() = (0, std::ptr::null_mut()); + } + }); } /// Global class registry mapping class_id -> parent_class_id for inheritance chain lookups @@ -1199,10 +1434,7 @@ pub extern "C" fn js_object_get_field(obj: *const ObjectHeader, field_index: u32 let fc = (*obj).field_count; if field_index >= fc { // Check overflow map for fields that didn't fit in inline storage - let overflow_val = OVERFLOW_FIELDS.with(|m| { - m.borrow().get(&(obj as usize)).and_then(|fields| fields.get(&(field_index as usize)).copied()) - }); - return match overflow_val { + return match overflow_get(obj as usize, field_index as usize) { Some(bits) => JSValue::from_bits(bits), None => JSValue::undefined(), }; @@ -1889,13 +2121,10 @@ pub extern "C" fn js_object_get_field_by_name(obj: *const ObjectHeader, key: *co if i < alloc_limit { return js_object_get_field(obj, i as u32); } else { - return OVERFLOW_FIELDS.with(|m| { - m.borrow() - .get(&(obj as usize)) - .and_then(|fields| fields.get(&i)) - .map(|&bits| JSValue::from_bits(bits)) - .unwrap_or(JSValue::undefined()) - }); + return match overflow_get(obj as usize, i) { + Some(bits) => JSValue::from_bits(bits), + None => JSValue::undefined(), + }; } } } @@ -1914,6 +2143,70 @@ pub extern "C" fn js_object_get_field_by_name_f64(obj: *const ObjectHeader, key: f64::from_bits(value.bits()) } +/// Monomorphic inline cache miss handler (issue #51). +/// +/// Called when the codegen-emitted shape check (`obj->keys_array == cache[0]`) +/// fails. Performs the full field lookup via `js_object_get_field_by_name`, +/// then populates the per-site cache so subsequent calls with the same shape +/// hit the inline fast path (no function call, direct field load). +/// +/// `cache` layout: `[keys_array_ptr: i64, field_slot_index: i64]` +/// +/// Only caches when: +/// - obj is a valid ObjectHeader (not null, not handle, not string/array/etc.) +/// - field exists and its slot index < 8 (inline allocation limit) +/// +/// Overflow fields (slot >= alloc_limit) are NOT cached and fall through to +/// the slow path — the fast path loads from `obj_ptr + 24 + slot*8` which +/// would read past the inline allocation. +#[no_mangle] +pub extern "C" fn js_object_get_field_ic_miss( + obj: *const ObjectHeader, + key: *const crate::StringHeader, + cache: *mut [i64; 2], +) -> f64 { + if obj.is_null() || (obj as usize) < 0x10000 || key.is_null() { + return f64::from_bits(crate::value::TAG_UNDEFINED); + } + // When accessors are active anywhere in the program, skip the cache + // entirely: the PIC fast path does a direct field load that bypasses + // getter dispatch, so any object that uses defineProperty / get / set + // would silently return the raw slot value instead of calling the + // getter. The slow path through js_object_get_field_by_name handles + // accessors correctly. + let can_cache = !ACCESSORS_IN_USE.with(|c| c.get()); + unsafe { + let keys = (*obj).keys_array; + let is_regular = (*obj).object_type == crate::error::OBJECT_TYPE_REGULAR; + if can_cache && is_regular && !keys.is_null() && (keys as usize) > 0x10000 { + let key_count = *(keys as *const u32) as usize; + let keys_data = (keys as *const u8).add(8) as *const f64; + let alloc_limit = std::cmp::max((*obj).field_count, 8) as usize; + for i in 0..key_count { + let k_bits = (*keys_data.add(i)).to_bits(); + let k_ptr = (k_bits & 0x0000_FFFF_FFFF_FFFF) as *const crate::StringHeader; + if !k_ptr.is_null() && crate::string::js_string_equals(k_ptr, key) != 0 { + if i >= alloc_limit { + // Field is in the overflow map — fall through to the + // slow path which handles overflow correctly. + break; + } + if i < 8 { + (*cache)[0] = keys as i64; + (*cache)[1] = i as i64; + } + let field_ptr = (obj as *const u8).add( + std::mem::size_of::() + i * 8, + ) as *const f64; + return *field_ptr; + } + } + } + } + let value = js_object_get_field_by_name(obj, key); + f64::from_bits(value.bits()) +} + /// Set a field value by its string key name (dynamic property access) /// This searches the keys array for a match and sets the corresponding value. /// If the key doesn't exist, it adds it to the object. @@ -2006,11 +2299,59 @@ pub extern "C" fn js_object_set_field_by_name(obj: *mut ObjectHeader, key: *cons let keys = (*obj).keys_array; // Validate keys_array is a real heap pointer or null. - // If the object is a non-Object type, keys at offset 16 may contain garbage. if !keys.is_null() { let keys_ptr = keys as usize; if (keys_ptr as u64) >> 48 != 0 || keys_ptr < 0x10000 { - // Invalid keys_array pointer — silently ignore to avoid crash + return; + } + } + + let prev_keys_usize = keys as usize; + + // FAST PATH: shape-transition cache. + if !key.is_null() + && !is_frozen + && !is_sealed_or_no_extend + && !GLOBAL_DESCRIPTORS_IN_USE.load(Ordering::Relaxed) + { + if let Some((next_keys, slot_idx)) = transition_cache_lookup(prev_keys_usize, key) { + // Defensive: strip a raw-null POINTER_TAG value the same + // way the slow overflow path below does, so a bogus + // 0x7FFD_0000_0000_0000 store doesn't leak into an + // overflow map. + let vbits = value.to_bits(); + let vbits = if (vbits >> 48) == 0x7FFD && (vbits & 0x0000_FFFF_FFFF_FFFF) == 0 { + crate::value::TAG_UNDEFINED + } else { vbits }; + (*obj).keys_array = next_keys as *mut ArrayHeader; + let alloc_limit = std::cmp::max((*obj).field_count, 8) as usize; + if (slot_idx as usize) < alloc_limit { + // Inline the field write — `obj` has already been + // validated (GC header read, type check, closure + // check) by the prelude above, and `vbits` has had + // the null-POINTER-TAG replacement applied. No + // point re-doing it in `js_object_set_field`. + let fields_ptr = (obj as *mut u8) + .add(std::mem::size_of::()) as *mut JSValue; + ptr::write(fields_ptr.add(slot_idx as usize), JSValue::from_bits(vbits)); + // Bump field_count only for inline slots — leaving + // it at the physical capacity is what steers + // `js_object_get_field_by_name`'s reads to the + // overflow map for slots ≥ alloc_limit. Bumping it + // past capacity would make reads dereference past + // the object's inline field array into adjacent + // arena data. + if slot_idx >= (*obj).field_count { + (*obj).field_count = slot_idx + 1; + } + } else { + // Cached slot is past the object's inline capacity — + // store in the overflow map (same as the slow path's + // `new_index >= alloc_limit` branch). + overflow_set(obj as usize, slot_idx as usize, vbits); + // Deliberately do NOT bump field_count here — see + // above. + } return; } } @@ -2033,11 +2374,27 @@ pub extern "C" fn js_object_set_field_by_name(obj: *mut ObjectHeader, key: *cons if (*obj).field_count == 0 { (*obj).field_count = 1; } + // Record the null→single-key transition so the next object + // that starts with `{}` and sets the same first key hits the + // fast path above instead of allocating a fresh 4-elem + // keys_array here. + transition_cache_insert(0, key, new_keys as usize, 0); return; } - // Extract the incoming key as a Rust string for descriptor lookup. - let incoming_key_str: Option = if !key.is_null() { + // Defer the Rust-String allocation for the incoming key: we only + // need it if an accessor descriptor or per-property writable + // attribute has been installed on this object. Both paths are + // guarded by process-wide flags (`ACCESSORS_IN_USE` and + // `PROPERTY_ATTRS_IN_USE`) so the common case — plain data + // properties on a normal object — avoids the `.to_string()` + // entirely. A 20-property row object written at 10k rows saw + // 200k of those allocations per query; with this guard the + // count drops to zero unless userland actually defined a + // descriptor. + let needs_descriptor_key = ACCESSORS_IN_USE.with(|c| c.get()) + || PROPERTY_ATTRS_IN_USE.with(|c| c.get()); + let incoming_key_str: Option = if needs_descriptor_key && !key.is_null() { let name_ptr = (key as *const u8).add(std::mem::size_of::()); let name_len = (*key).byte_len as usize; let name_bytes = std::slice::from_raw_parts(name_ptr, name_len); @@ -2074,10 +2431,12 @@ pub extern "C" fn js_object_set_field_by_name(obj: *mut ObjectHeader, key: *cons } } // Per-property writable check (set by Object.defineProperty / freeze). - if let Some(ref k) = incoming_key_str { - if let Some(attrs) = get_property_attrs(obj as usize, k) { - if !attrs.writable() { - return; + if PROPERTY_ATTRS_IN_USE.with(|c| c.get()) { + if let Some(ref k) = incoming_key_str { + if let Some(attrs) = get_property_attrs(obj as usize, k) { + if !attrs.writable() { + return; + } } } } @@ -2089,12 +2448,7 @@ pub extern "C" fn js_object_set_field_by_name(obj: *mut ObjectHeader, key: *cons let vbits = if (vbits >> 48) == 0x7FFD && (vbits & 0x0000_FFFF_FFFF_FFFF) == 0 { crate::value::TAG_UNDEFINED } else { vbits }; - OVERFLOW_FIELDS.with(|m| { - m.borrow_mut() - .entry(obj as usize) - .or_default() - .insert(i, vbits); - }); + overflow_set(obj as usize, i, vbits); } return; } @@ -2109,9 +2463,31 @@ pub extern "C" fn js_object_set_field_by_name(obj: *mut ObjectHeader, key: *cons // CRITICAL: The keys_array may be SHARED via SHAPE_CACHE (multiple objects with // the same shape hash share the same keys array). We must clone it before mutating // to avoid corrupting other objects' keys. - let owned_keys = if key_count == (*obj).field_count as usize { - // Keys array matches the original shape — it's potentially shared. - // Clone it to get an independent copy before adding new keys. + // + // We detect sharing via the `GC_FLAG_SHAPE_SHARED` bit that + // `shape_cache_insert` stamps onto the array's GC header — + // arrays allocated in the `keys.is_null()` branch above are + // exclusively owned and don't have the flag, so we skip the + // clone entirely. This saves ~19 clones of growing size per + // 20-property plain-object literal. + // + // Validate the GC header before reading it. `keys_array` has + // already been range-checked for user address space but may + // still point at something other than a GC-allocated array + // in rare cases (static data, buffers re-interpreted as keys + // arrays). If the header doesn't identify as GC_TYPE_ARRAY, + // assume shared and clone (the previous, always-safe behaviour). + let keys_gc_header = (keys as *const u8).sub(crate::gc::GC_HEADER_SIZE) + as *const crate::gc::GcHeader; + let keys_shared = if (keys as usize) >= crate::gc::GC_HEADER_SIZE + && (*keys_gc_header).obj_type == crate::gc::GC_TYPE_ARRAY + { + (*keys_gc_header).gc_flags & crate::gc::GC_FLAG_SHAPE_SHARED != 0 + } else { + // Unknown provenance — take the safe side. + true + }; + let owned_keys = if keys_shared { let cloned = crate::array::js_array_alloc(key_count as u32 + 4); let src_data = (keys as *const u8).add(8) as *const f64; let dst_data = (cloned as *mut u8).add(8) as *mut f64; @@ -2122,7 +2498,6 @@ pub extern "C" fn js_object_set_field_by_name(obj: *mut ObjectHeader, key: *cons (*obj).keys_array = cloned; cloned } else { - // Already mutated — keys_array is already our own copy keys }; @@ -2140,12 +2515,13 @@ pub extern "C" fn js_object_set_field_by_name(obj: *mut ObjectHeader, key: *cons } else { vbits }; let new_keys = crate::array::js_array_push(owned_keys, JSValue::string_ptr(key as *mut _)); (*obj).keys_array = new_keys; - OVERFLOW_FIELDS.with(|m| { - m.borrow_mut() - .entry(obj as usize) - .or_default() - .insert(new_index, vbits); - }); + overflow_set(obj as usize, new_index, vbits); + // Record the shape transition so the next object sharing + // `prev_keys` that adds the same key hits the fast path. + // The cached target is stamped `GC_FLAG_SHAPE_SHARED` by + // `transition_cache_insert`, which triggers clone-on-extend + // on either object if someone later appends past this key. + transition_cache_insert(prev_keys_usize, key, new_keys as usize, new_index as u32); return; } // First, add the key to the keys array (may reallocate) @@ -2159,6 +2535,8 @@ pub extern "C" fn js_object_set_field_by_name(obj: *mut ObjectHeader, key: *cons if new_index as u32 >= (*obj).field_count { (*obj).field_count = new_index as u32 + 1; } + // Record the shape transition — see above for semantics. + transition_cache_insert(prev_keys_usize, key, new_keys as usize, new_index as u32); } } diff --git a/crates/perry-runtime/src/string.rs b/crates/perry-runtime/src/string.rs index adba7bfa1..c108874a5 100644 --- a/crates/perry-runtime/src/string.rs +++ b/crates/perry-runtime/src/string.rs @@ -71,6 +71,9 @@ fn compute_utf16_len(data: *const u8, byte_len: u32) -> u32 { /// Returns `s.len()` if `utf16_idx` is past the end. #[inline] fn utf16_offset_to_byte_offset(s: &str, utf16_idx: usize) -> usize { + if utf16_idx == 0 { + return 0; + } let mut byte_off = 0; let mut u16_count = 0; for ch in s.chars() { @@ -86,6 +89,9 @@ fn utf16_offset_to_byte_offset(s: &str, utf16_idx: usize) -> usize { /// Convert a UTF-8 byte offset to a UTF-16 code unit index. #[inline] fn byte_offset_to_utf16_index(s: &str, byte_off: usize) -> usize { + if byte_off == 0 { + return 0; + } s[..byte_off].encode_utf16().count() } @@ -585,7 +591,8 @@ pub extern "C" fn js_string_index_of_from(haystack: *const StringHeader, needle: let h_blen = (*haystack).byte_len as usize; let n_blen = (*needle).byte_len as usize; - // ASCII fast path: raw byte search, no &str construction + // ASCII fast path: byte offset == UTF-16 offset, use Rust's + // optimized Two-Way str::find (avoids O(n*m) naive scan). if is_ascii_string(haystack) { let start = if from_index < 0 { 0usize } else { from_index as usize }; if n_blen == 0 { @@ -594,25 +601,16 @@ pub extern "C" fn js_string_index_of_from(haystack: *const StringHeader, needle: if start + n_blen > h_blen { return -1; } - let h_ptr = string_data(haystack); - let n_ptr = string_data(needle); - let first = *n_ptr; - let search_end = h_blen - n_blen + 1; - let mut i = start; - while i < search_end { - if *h_ptr.add(i) == first - && (n_blen == 1 - || libc::memcmp( - h_ptr.add(i) as *const libc::c_void, - n_ptr as *const libc::c_void, - n_blen, - ) == 0) - { - return i as i32; - } - i += 1; - } - return -1; + let h = std::str::from_utf8_unchecked( + slice::from_raw_parts(string_data(haystack), h_blen), + ); + let n = std::str::from_utf8_unchecked( + slice::from_raw_parts(string_data(needle), n_blen), + ); + return match h[start..].find(n) { + Some(pos) => (start + pos) as i32, + None => -1, + }; } // Non-ASCII: construct &str, convert UTF-16 from_index to byte offset @@ -649,29 +647,20 @@ pub extern "C" fn js_string_last_index_of(haystack: *const StringHeader, needle: return (*haystack).utf16_len as i32; } - // ASCII fast path: raw byte reverse search + // ASCII fast path: byte offset == UTF-16 offset, use rfind if is_ascii_string(haystack) { let h_blen = (*haystack).byte_len as usize; if n_blen > h_blen { return -1; } - let h_ptr = string_data(haystack); - let n_ptr = string_data(needle); - let first = *n_ptr; - let mut i = h_blen - n_blen; - loop { - if *h_ptr.add(i) == first - && (n_blen == 1 - || libc::memcmp( - h_ptr.add(i) as *const libc::c_void, - n_ptr as *const libc::c_void, - n_blen, - ) == 0) - { - return i as i32; - } - if i == 0 { break; } - i -= 1; - } - return -1; + let h = std::str::from_utf8_unchecked( + slice::from_raw_parts(string_data(haystack), h_blen), + ); + let n = std::str::from_utf8_unchecked( + slice::from_raw_parts(string_data(needle), n_blen), + ); + return match h.rfind(n) { + Some(pos) => pos as i32, + None => -1, + }; } } @@ -1208,35 +1197,59 @@ pub extern "C" fn js_string_split(s: *const StringHeader, delimiter: *const Stri string_as_str(delimiter) }; - // Split into string parts - let parts: Vec<*mut StringHeader> = if delim.is_empty() { + const STRING_TAG: u64 = 0x7FFF_0000_0000_0000; + const POINTER_MASK: u64 = 0x0000_FFFF_FFFF_FFFF; + let header_size = std::mem::size_of::(); + + if delim.is_empty() { // Empty delimiter: split into individual characters (single pass) - str_data.chars().map(|c| { + let parts: Vec<*mut StringHeader> = str_data.chars().map(|c| { let mut buf = [0u8; 4]; let char_str = c.encode_utf8(&mut buf); js_string_from_bytes(char_str.as_ptr(), char_str.len() as u32) - }).collect() - } else { - str_data.split(delim).map(|part| { - js_string_from_bytes(part.as_ptr(), part.len() as u32) - }).collect() - }; + }).collect(); - // Allocate array to hold string pointers - // We store NaN-boxed string pointers (with STRING_TAG) since arrays use f64 storage - const STRING_TAG: u64 = 0x7FFF_0000_0000_0000; - const POINTER_MASK: u64 = 0x0000_FFFF_FFFF_FFFF; + let arr = crate::array::js_array_alloc(parts.len() as u32); + unsafe { + (*arr).length = parts.len() as u32; + let elements_ptr = (arr as *mut u8).add(std::mem::size_of::()) as *mut f64; + for (i, p) in parts.iter().enumerate() { + let nanboxed = STRING_TAG | (*p as u64 & POINTER_MASK); + std::ptr::write(elements_ptr.add(i), f64::from_bits(nanboxed)); + } + } + return arr; + } + + // Non-empty delimiter: arena-allocate parts (bump-pointer, no tracking overhead) + let part_slices: Vec<&str> = str_data.split(delim).collect(); + let n = part_slices.len(); - let arr = crate::array::js_array_alloc(parts.len() as u32); + let src_is_ascii = is_ascii_string(s); + + let arr = crate::array::js_array_alloc(n as u32); unsafe { - (*arr).length = parts.len() as u32; + (*arr).length = n as u32; let elements_ptr = (arr as *mut u8).add(std::mem::size_of::()) as *mut f64; - for (i, ptr) in parts.iter().enumerate() { - // NaN-box the string pointer with STRING_TAG - let ptr_as_u64 = *ptr as u64; - let nanboxed = STRING_TAG | (ptr_as_u64 & POINTER_MASK); - let ptr_as_f64 = f64::from_bits(nanboxed); - std::ptr::write(elements_ptr.add(i), ptr_as_f64); + for (i, part) in part_slices.iter().enumerate() { + let byte_len = part.len() as u32; + let alloc_size = header_size + byte_len as usize; + let raw = crate::arena::arena_alloc_gc(alloc_size, 8, crate::gc::GC_TYPE_STRING); + let sh = raw as *mut StringHeader; + (*sh).byte_len = byte_len; + (*sh).capacity = byte_len; + (*sh).refcount = 0; + (*sh).utf16_len = if src_is_ascii { + byte_len + } else { + compute_utf16_len(part.as_ptr(), byte_len) + }; + if byte_len > 0 { + let data_ptr = (sh as *mut u8).add(header_size); + ptr::copy_nonoverlapping(part.as_ptr(), data_ptr, byte_len as usize); + } + let nanboxed = STRING_TAG | (raw as u64 & POINTER_MASK); + std::ptr::write(elements_ptr.add(i), f64::from_bits(nanboxed)); } } diff --git a/crates/perry-stdlib/Cargo.toml b/crates/perry-stdlib/Cargo.toml index 0a7d8bebb..6dd3466bb 100644 --- a/crates/perry-stdlib/Cargo.toml +++ b/crates/perry-stdlib/Cargo.toml @@ -13,7 +13,7 @@ crate-type = ["rlib", "staticlib"] default = ["full"] # Full stdlib - everything included -full = ["http-server", "http-client", "database", "crypto", "compression", "email", "websocket", "image", "scheduler", "ids", "html-parser", "rate-limit", "validation", "net", "tls"] +full = ["http-server", "http-client", "database", "crypto", "compression", "email", "websocket", "image", "scheduler", "ids", "html-parser", "rate-limit", "validation", "container"] # Minimal core - just what's needed for basic programs core = [] @@ -28,14 +28,6 @@ http-client = ["dep:reqwest", "async-runtime"] # WebSocket websocket = ["dep:tokio-tungstenite", "dep:futures-util", "async-runtime"] -# Raw TCP sockets (`net.Socket` — Postgres wire driver, custom protocols). -net = ["async-runtime"] - -# TLS — direct `tls.connect()` and `socket.upgradeToTLS()` (Postgres SSLRequest flow). -# Uses rustls (not native-tls) to avoid OpenSSL on every platform and keep Android -# cross-compile unblocked; matches reqwest/tokio-tungstenite/mongodb feature flags. -tls = ["net", "dep:tokio-rustls", "dep:rustls", "dep:rustls-native-certs"] - # Databases database = ["database-postgres", "database-mysql", "database-sqlite", "database-redis", "database-mongodb"] database-postgres = ["dep:sqlx", "async-runtime"] @@ -74,11 +66,15 @@ validation = ["dep:validator", "dep:regex"] # UUID/nanoid ids = ["dep:uuid", "dep:nanoid"] +# Container module (OCI container management) +container = ["dep:async-trait", "dep:tokio", "async-runtime", "perry-container-compose", "dep:indexmap", "dep:serde_yaml"] + # Async runtime (tokio) - internal feature async-runtime = ["dep:tokio"] [dependencies] perry-runtime = { workspace = true, features = ["stdlib"] } +perry-container-compose = { path = "../perry-container-compose", optional = true } thiserror.workspace = true anyhow.workspace = true @@ -96,7 +92,7 @@ rand = "0.8" # Required by lodash (core module) # === OPTIONAL DEPENDENCIES === # Async runtime -tokio = { version = "1", features = ["rt-multi-thread", "sync", "time", "net", "macros", "io-util"], optional = true } +tokio = { version = "1", features = ["rt-multi-thread", "sync", "time", "net", "macros"], optional = true } # HTTP Server hyper = { version = "1.4", features = ["server", "http1", "http2"], optional = true } @@ -114,11 +110,6 @@ reqwest = { version = "0.12", features = ["json", "rustls-tls", "http2"], defaul tokio-tungstenite = { version = "0.24", features = ["rustls-tls-webpki-roots"], optional = true } futures-util = { version = "0.3", optional = true } -# TLS (for net.Socket.upgradeToTLS and tls.connect) — rustls-only, no OpenSSL. -tokio-rustls = { version = "0.26", optional = true } -rustls = { version = "0.23", optional = true } -rustls-native-certs = { version = "0.8", optional = true } - # Database sqlx = { version = "0.8", features = ["runtime-tokio", "mysql", "postgres", "chrono"], optional = true } redis = { version = "0.25", features = ["tokio-comp", "connection-manager"], optional = true } @@ -170,6 +161,11 @@ regex = { version = "1.10", optional = true } uuid = { version = "1.11", features = ["v4", "v1", "v7"], optional = true } nanoid = { version = "0.4", optional = true } +# Container module +async-trait = { version = "0.1", optional = true } +indexmap = { version = "2.2", optional = true } +serde_yaml = { version = "0.9", optional = true } + # LRU Cache lru = "0.12" diff --git a/crates/perry-stdlib/src/common/handle.rs b/crates/perry-stdlib/src/common/handle.rs index 4e4717c86..a149a1287 100644 --- a/crates/perry-stdlib/src/common/handle.rs +++ b/crates/perry-stdlib/src/common/handle.rs @@ -31,6 +31,12 @@ pub fn register_handle(value: T) -> Handle { handle } +/// Register an object with a specific ID +pub fn register_handle_with_id(value: T, handle: Handle) -> Handle { + HANDLES.insert(handle, Box::new(value)); + handle +} + /// Get a reference to a registered object and execute a closure with it. /// This is the safe way to access handle data without lifetime issues. pub fn with_handle R>(handle: Handle, f: F) -> Option { diff --git a/crates/perry-stdlib/src/container/backend.rs b/crates/perry-stdlib/src/container/backend.rs new file mode 100644 index 000000000..20de4969f --- /dev/null +++ b/crates/perry-stdlib/src/container/backend.rs @@ -0,0 +1,6 @@ +pub use perry_container_compose::backend::{ + CliBackend, CliProtocol, DockerProtocol, AppleContainerProtocol, LimaProtocol, + BackendProbeResult, detect_backend, +}; +pub use perry_container_compose::backend::ContainerBackend; +pub use perry_container_compose::types::ContainerLogs; diff --git a/crates/perry-stdlib/src/container/capability.rs b/crates/perry-stdlib/src/container/capability.rs new file mode 100644 index 000000000..7a28ec31b --- /dev/null +++ b/crates/perry-stdlib/src/container/capability.rs @@ -0,0 +1,25 @@ +use std::collections::HashMap; +use crate::container::types::{ContainerSpec, ContainerLogs, ContainerError}; +use super::get_global_backend; +use super::verification; + +pub struct CapabilityGrants { + pub network: bool, + pub env: Option>, +} + +pub async fn alloy_container_run_capability(name: &str, image: &str, cmd: &[&str], grants: &CapabilityGrants) -> Result { + let _digest = verification::verify_image(image).await?; + let backend = get_global_backend().await?; + let spec = ContainerSpec { + image: image.to_string(), + name: Some(format!("alloy-cap-{}-{}", name, rand::random::())), + network: if grants.network { None } else { Some("none".to_string()) }, + rm: Some(true), + env: grants.env.clone(), + cmd: Some(cmd.iter().map(|s| s.to_string()).collect()), + ..Default::default() + }; + let h = backend.run(&spec).await.map_err(ContainerError::from)?; + backend.wait_and_logs(&h.id).await.map_err(ContainerError::from) +} diff --git a/crates/perry-stdlib/src/container/compose.rs b/crates/perry-stdlib/src/container/compose.rs new file mode 100644 index 000000000..988a685ce --- /dev/null +++ b/crates/perry-stdlib/src/container/compose.rs @@ -0,0 +1,93 @@ +//! Compose orchestration wrapper. + +use super::types::{ArcComposeEngine, ContainerInfo, ContainerLogs, COMPOSE_HANDLES}; +use perry_container_compose::types::{ComposeHandle, ComposeSpec}; +use perry_container_compose::ComposeEngine; +use std::sync::Arc; +use dashmap::DashMap; + +pub async fn compose_up(spec: ComposeSpec, backend: Arc) -> Result { + let project_name = spec.name.clone().unwrap_or_else(|| "default".to_string()); + let engine = Arc::new(ComposeEngine::new(spec, project_name, backend)); + + let handle = Arc::clone(&engine).up(&[], true, false, false).await.map_err(|e| e.to_string())?; + COMPOSE_HANDLES.get_or_init(DashMap::new).insert(handle.stack_id, ArcComposeEngine(engine)); + Ok(handle) +} + +pub async fn compose_down(id: u64, volumes: bool) -> Result<(), String> { + let engine = { + let handles = COMPOSE_HANDLES.get_or_init(DashMap::new); + handles.get(&id).map(|e| Arc::clone(&e.0)) + }.ok_or_else(|| format!("Compose stack {} not found", id))?; + + engine.down(&[], false, volumes).await.map_err(|e| e.to_string())?; + COMPOSE_HANDLES.get_or_init(DashMap::new).remove(&id); + Ok(()) +} + +pub async fn compose_ps(id: u64) -> Result, String> { + let engine = { + let handles = COMPOSE_HANDLES.get_or_init(DashMap::new); + handles.get(&id).map(|e| Arc::clone(&e.0)) + }.ok_or_else(|| format!("Compose stack {} not found", id))?; + + let infos = engine.ps().await.map_err(|e| e.to_string())?; + Ok(infos) +} + +pub async fn compose_logs(id: u64, service: Option, tail: Option) -> Result { + let engine = { + let handles = COMPOSE_HANDLES.get_or_init(DashMap::new); + handles.get(&id).map(|e| Arc::clone(&e.0)) + }.ok_or_else(|| format!("Compose stack {} not found", id))?; + + let logs = engine.logs(service.as_deref(), tail).await.map_err(|e| e.to_string())?; + Ok(logs) +} + +pub async fn compose_exec(id: u64, service: String, cmd: Vec) -> Result { + let engine = { + let handles = COMPOSE_HANDLES.get_or_init(DashMap::new); + handles.get(&id).map(|e| Arc::clone(&e.0)) + }.ok_or_else(|| format!("Compose stack {} not found", id))?; + + let logs = engine.exec(&service, &cmd).await.map_err(|e| e.to_string())?; + Ok(logs) +} + +pub async fn compose_config(id: u64) -> Result { + let engine = { + let handles = COMPOSE_HANDLES.get_or_init(DashMap::new); + handles.get(&id).map(|e| Arc::clone(&e.0)) + }.ok_or_else(|| format!("Compose stack {} not found", id))?; + + engine.config().map_err(|e| e.to_string()) +} + +pub async fn compose_start(id: u64, services: Vec) -> Result<(), String> { + let engine = { + let handles = COMPOSE_HANDLES.get_or_init(DashMap::new); + handles.get(&id).map(|e| Arc::clone(&e.0)) + }.ok_or_else(|| format!("Compose stack {} not found", id))?; + + engine.start(&services).await.map_err(|e| e.to_string()) +} + +pub async fn compose_stop(id: u64, services: Vec) -> Result<(), String> { + let engine = { + let handles = COMPOSE_HANDLES.get_or_init(DashMap::new); + handles.get(&id).map(|e| Arc::clone(&e.0)) + }.ok_or_else(|| format!("Compose stack {} not found", id))?; + + engine.stop(&services).await.map_err(|e| e.to_string()) +} + +pub async fn compose_restart(id: u64, services: Vec) -> Result<(), String> { + let engine = { + let handles = COMPOSE_HANDLES.get_or_init(DashMap::new); + handles.get(&id).map(|e| Arc::clone(&e.0)) + }.ok_or_else(|| format!("Compose stack {} not found", id))?; + + engine.restart(&services).await.map_err(|e| e.to_string()) +} diff --git a/crates/perry-stdlib/src/container/mod.rs b/crates/perry-stdlib/src/container/mod.rs new file mode 100644 index 000000000..65eb6b270 --- /dev/null +++ b/crates/perry-stdlib/src/container/mod.rs @@ -0,0 +1,194 @@ +//! Container module for Perry + +pub mod backend; +pub mod capability; +pub mod compose; +pub mod types; +pub mod verification; +pub mod workload; + +pub use types::{ + ComposeHandle, ComposeSpec, ContainerError, ContainerHandle, + ContainerInfo, ContainerLogs, ContainerSpec, ImageInfo, ListOrDict, +}; + +use perry_runtime::{js_promise_new, Promise, StringHeader}; +pub use backend::{detect_backend, ContainerBackend}; +use std::sync::{Arc, OnceLock}; + +static BACKEND: OnceLock> = OnceLock::new(); + +pub(crate) async fn get_global_backend() -> Result<&'static Arc, ContainerError> { + if let Some(b) = BACKEND.get() { return Ok(b); } + let b = detect_backend().await.map_err(ContainerError::from)?; + let _ = BACKEND.set(b); + Ok(BACKEND.get().unwrap()) +} + +unsafe fn string_from_header(ptr: *const StringHeader) -> Option { + if ptr.is_null() || (ptr as usize) < 0x1000 { return None; } + let len = (*ptr).byte_len as usize; + let data_ptr = (ptr as *const u8).add(std::mem::size_of::()); + let bytes = std::slice::from_raw_parts(data_ptr, len); + Some(String::from_utf8_lossy(bytes).to_string()) +} + +unsafe fn string_to_js(s: &str) -> *const StringHeader { + let bytes = s.as_bytes(); + perry_runtime::js_string_from_bytes(bytes.as_ptr(), bytes.len() as u32) +} + +#[no_mangle] +pub unsafe extern "C" fn js_container_run(spec_ptr: *const StringHeader) -> *mut Promise { + let promise = js_promise_new(); + let spec = match types::parse_container_spec(spec_ptr) { + Ok(s) => s, + Err(e) => { + crate::common::spawn_for_promise(promise as *mut u8, async move { Err::(e) }); + return promise; + } + }; + crate::common::spawn_for_promise(promise as *mut u8, async move { + let backend = get_global_backend().await.map_err(|e| e.to_string())?; + let h = backend.run(&spec).await.map_err(|e| e.to_string())?; + Ok(types::register_container_handle(h)) + }); + promise +} + +#[no_mangle] pub unsafe extern "C" fn js_container_getBackend() -> *const StringHeader { + if let Some(b) = BACKEND.get() { return string_to_js(b.backend_name()); } + string_to_js("unknown") +} + +#[no_mangle] pub unsafe extern "C" fn js_container_detectBackend() -> *mut Promise { + let promise = js_promise_new(); + crate::common::spawn_for_promise_deferred(promise as *mut u8, async move { + match detect_backend().await { + Ok(b) => Ok(serde_json::json!([{"name": b.backend_name(), "available": true, "reason": ""}]).to_string()), + Err(e) => { + let probed = if let perry_container_compose::error::ComposeError::NoBackendFound { probed } = e { probed } else { vec![] }; + Ok(serde_json::to_string(&probed).unwrap_or_else(|_| "[]".to_string())) + } + } + }, |json| { + let ptr = perry_runtime::js_string_from_bytes(json.as_ptr(), json.len() as u32); + perry_runtime::JSValue::string_ptr(ptr).bits() + }); + promise +} + +#[no_mangle] pub unsafe extern "C" fn js_container_build(spec_ptr: *const StringHeader, name_ptr: *const StringHeader) -> *mut Promise { + let promise = js_promise_new(); + let spec_json = string_from_header(spec_ptr); + let name = string_from_header(name_ptr).unwrap_or_default(); + crate::common::spawn_for_promise(promise as *mut u8, async move { + let spec: types::ComposeServiceBuild = serde_json::from_str(&spec_json.unwrap_or_default()).map_err(|e| e.to_string())?; + let backend = get_global_backend().await.map_err(|e| e.to_string())?; + backend.build(&spec, &name).await.map_err(|e| e.to_string())?; + Ok(0u64) + }); + promise +} + +#[no_mangle] pub unsafe extern "C" fn js_container_composeUp(spec_ptr: *const StringHeader) -> *mut Promise { + let promise = js_promise_new(); + let spec = match types::parse_compose_spec(spec_ptr) { + Ok(s) => s, + Err(e) => { + crate::common::spawn_for_promise(promise as *mut u8, async move { Err::(e) }); + return promise; + } + }; + crate::common::spawn_for_promise_deferred(promise as *mut u8, async move { + let backend = get_global_backend().await.map_err(|e| e.to_string())?; + compose::compose_up(spec, Arc::clone(backend)).await + }, |handle| { + let id = types::register_compose_handle(handle); + perry_runtime::JSValue::number(id as f64).bits() + }); + promise +} + +#[no_mangle] pub unsafe extern "C" fn js_compose_down(id: i64, volumes: i32) -> *mut Promise { + let promise = js_promise_new(); + crate::common::spawn_for_promise(promise as *mut u8, async move { + compose::compose_down(id as u64, volumes != 0).await.map(|_| 0u64) + }); + promise +} + +#[no_mangle] pub unsafe extern "C" fn js_compose_ps(id: i64) -> *mut Promise { + let promise = js_promise_new(); + crate::common::spawn_for_promise(promise as *mut u8, async move { + let infos = compose::compose_ps(id as u64).await?; + Ok(types::register_container_info_list(infos)) + }); + promise +} + +#[no_mangle] pub unsafe extern "C" fn js_compose_logs(id: i64, svc_ptr: *const StringHeader, tail: i32) -> *mut Promise { + let promise = js_promise_new(); + let svc = string_from_header(svc_ptr); + let t = if tail >= 0 { Some(tail as u32) } else { None }; + crate::common::spawn_for_promise(promise as *mut u8, async move { + let logs = compose::compose_logs(id as u64, svc, t).await?; + Ok(types::register_container_logs(logs)) + }); + promise +} + +#[no_mangle] pub unsafe extern "C" fn js_compose_exec(id: i64, svc_ptr: *const StringHeader, cmd_ptr: *const StringHeader) -> *mut Promise { + let promise = js_promise_new(); + let svc = string_from_header(svc_ptr).unwrap_or_default(); + let cmd_json = string_from_header(cmd_ptr).unwrap_or_else(|| "[]".to_string()); + crate::common::spawn_for_promise(promise as *mut u8, async move { + let cmd: Vec = serde_json::from_str(&cmd_json).unwrap_or_default(); + let logs = compose::compose_exec(id as u64, svc, cmd).await?; + Ok(types::register_container_logs(logs)) + }); + promise +} + +#[no_mangle] pub unsafe extern "C" fn js_compose_start(id: i64, svcs_ptr: *const StringHeader) -> *mut Promise { + let promise = js_promise_new(); + let svcs_json = string_from_header(svcs_ptr).unwrap_or_else(|| "[]".to_string()); + crate::common::spawn_for_promise(promise as *mut u8, async move { + let svcs: Vec = serde_json::from_str(&svcs_json).unwrap_or_default(); + compose::compose_start(id as u64, svcs).await.map(|_| 0u64) + }); + promise +} + +#[no_mangle] pub unsafe extern "C" fn js_compose_stop(id: i64, svcs_ptr: *const StringHeader) -> *mut Promise { + let promise = js_promise_new(); + let svcs_json = string_from_header(svcs_ptr).unwrap_or_else(|| "[]".to_string()); + crate::common::spawn_for_promise(promise as *mut u8, async move { + let svcs: Vec = serde_json::from_str(&svcs_json).unwrap_or_default(); + compose::compose_stop(id as u64, svcs).await.map(|_| 0u64) + }); + promise +} + +#[no_mangle] pub unsafe extern "C" fn js_compose_restart(id: i64, svcs_ptr: *const StringHeader) -> *mut Promise { + let promise = js_promise_new(); + let svcs_json = string_from_header(svcs_ptr).unwrap_or_else(|| "[]".to_string()); + crate::common::spawn_for_promise(promise as *mut u8, async move { + let svcs: Vec = serde_json::from_str(&svcs_json).unwrap_or_default(); + compose::compose_restart(id as u64, svcs).await.map(|_| 0u64) + }); + promise +} + +#[no_mangle] pub unsafe extern "C" fn js_compose_config(id: i64) -> *mut Promise { + let promise = js_promise_new(); + crate::common::spawn_for_promise_deferred(promise as *mut u8, async move { + compose::compose_config(id as u64).await + }, |yaml| { + let ptr = perry_runtime::js_string_from_bytes(yaml.as_ptr(), yaml.len() as u32); + perry_runtime::JSValue::string_ptr(ptr).bits() + }); + promise +} + +#[no_mangle] pub extern "C" fn js_container_module_init() {} diff --git a/crates/perry-stdlib/src/container/types.rs b/crates/perry-stdlib/src/container/types.rs new file mode 100644 index 000000000..62ef46fa9 --- /dev/null +++ b/crates/perry-stdlib/src/container/types.rs @@ -0,0 +1,129 @@ +//! Type definitions for the perry/container module. + +use perry_runtime::StringHeader; +pub use perry_container_compose::types::{ + ComposeHandle, ComposeSpec, ContainerHandle, ContainerInfo, ContainerLogs, ContainerSpec, + ImageInfo, ListOrDict, ComposeServiceBuild, +}; +pub use perry_container_compose::error::BackendProbeResult; +use perry_container_compose::error::ComposeError; +use serde::{Deserialize, Serialize}; +use std::sync::{Arc, OnceLock}; +use dashmap::DashMap; + +use crate::common::handle::{self, Handle}; + +pub struct ArcComposeEngine(pub Arc); +pub static COMPOSE_HANDLES: OnceLock> = OnceLock::new(); + +// ============ Handle Registry ============ + +pub fn register_container_handle(h: ContainerHandle) -> u64 { + handle::register_handle(h) as u64 +} + +pub fn register_container_info(info: ContainerInfo) -> u64 { + handle::register_handle(info) as u64 +} + +pub fn register_container_info_list(list: Vec) -> u64 { + handle::register_handle(list) as u64 +} + +pub fn register_compose_handle(h: ComposeHandle) -> u64 { + handle::register_handle(h) as u64 +} + +pub fn get_compose_handle(id: u64) -> Option<&'static ComposeHandle> { + handle::get_handle(id as Handle) +} + +pub fn take_compose_handle(id: u64) -> Option { + handle::take_handle(id as Handle) +} + +pub fn register_container_logs(logs: ContainerLogs) -> u64 { + handle::register_handle(logs) as u64 +} + +pub fn register_image_info_list(list: Vec) -> u64 { + handle::register_handle(list) as u64 +} + +// ============ Error Types ============ + +#[derive(Debug, Clone, Serialize, Deserialize)] +pub enum ContainerError { + NotFound(String), + BackendError { code: i32, message: String }, + VerificationFailed { image: String, reason: String }, + DependencyCycle { cycle: Vec }, + ServiceStartupFailed { service: String, error: String }, + InvalidConfig(String), + NoBackendFound { probed: Vec }, + BackendNotAvailable { name: String, reason: String }, +} + +impl std::fmt::Display for ContainerError { + fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { + match self { + ContainerError::NotFound(id) => write!(f, "Not found: {}", id), + ContainerError::BackendError { code, message } => write!(f, "Backend error ({}): {}", code, message), + ContainerError::VerificationFailed { image, reason } => write!(f, "Verification failed for {}: {}", image, reason), + ContainerError::DependencyCycle { cycle } => write!(f, "Dependency cycle: {:?}", cycle), + ContainerError::ServiceStartupFailed { service, error } => write!(f, "Service {} failed: {}", service, error), + ContainerError::InvalidConfig(m) => write!(f, "Invalid config: {}", m), + ContainerError::NoBackendFound { probed } => write!(f, "No backend found. Probed: {:?}", probed), + ContainerError::BackendNotAvailable { name, reason } => write!(f, "Backend {} not available: {}", name, reason), + } + } +} + +impl std::error::Error for ContainerError {} + +impl From for ContainerError { + fn from(e: ComposeError) -> Self { + match e { + ComposeError::NotFound(s) => ContainerError::NotFound(s), + ComposeError::BackendError { code, message } => ContainerError::BackendError { code, message }, + ComposeError::VerificationFailed { image, reason } => ContainerError::VerificationFailed { image, reason }, + ComposeError::DependencyCycle { services } => ContainerError::DependencyCycle { cycle: services }, + ComposeError::ServiceStartupFailed { service, message } => ContainerError::ServiceStartupFailed { service, error: message }, + ComposeError::ValidationError { message } => ContainerError::InvalidConfig(message), + ComposeError::NoBackendFound { probed } => ContainerError::NoBackendFound { probed }, + ComposeError::BackendNotAvailable { name, reason } => ContainerError::BackendNotAvailable { name, reason }, + ComposeError::ParseError(e) => ContainerError::InvalidConfig(e.to_string()), + ComposeError::JsonError(e) => ContainerError::InvalidConfig(e.to_string()), + ComposeError::IoError(e) => ContainerError::BackendError { code: -1, message: e.to_string() }, + ComposeError::FileNotFound { path } => ContainerError::NotFound(format!("File not found: {}", path)), + } + } +} + +pub fn parse_container_spec(spec_ptr: *const StringHeader) -> Result { + let json = unsafe { string_from_header(spec_ptr) }.ok_or("Invalid spec pointer")?; + serde_json::from_str(&json).map_err(|e| e.to_string()) +} + +pub fn parse_compose_spec(spec_ptr: *const StringHeader) -> Result { + let json = unsafe { string_from_header(spec_ptr) }.ok_or("Invalid spec pointer")?; + serde_json::from_str(&json).map_err(|e| e.to_string()) +} + +unsafe fn string_from_header(ptr: *const StringHeader) -> Option { + if ptr.is_null() || (ptr as usize) < 0x1000 { return None; } + let len = (*ptr).byte_len as usize; + let data_ptr = (ptr as *const u8).add(std::mem::size_of::()); + let bytes = std::slice::from_raw_parts(data_ptr, len); + Some(String::from_utf8_lossy(bytes).to_string()) +} + +pub fn container_error_to_json(e: ContainerError) -> String { + let code = match &e { + ContainerError::NotFound(_) => 404, + ContainerError::BackendError { code, .. } => *code, + ContainerError::DependencyCycle { .. } => 422, + _ => 500, + }; + serde_json::json!({ "message": e.to_string(), "code": code }).to_string() +} diff --git a/crates/perry-stdlib/src/container/verification.rs b/crates/perry-stdlib/src/container/verification.rs new file mode 100644 index 000000000..1ae301ade --- /dev/null +++ b/crates/perry-stdlib/src/container/verification.rs @@ -0,0 +1,53 @@ +use std::sync::OnceLock; +use std::collections::HashMap; +use std::sync::RwLock; +use crate::container::types::ContainerError; +use crate::container::backend::ContainerBackend; +use super::get_global_backend; + +pub const CHAINGUARD_IDENTITY: &str = "https://github.com/chainguard-images/images/.github/workflows/sign.yaml@refs/heads/main"; +pub const CHAINGUARD_ISSUER: &str = "https://token.actions.githubusercontent.com"; + +#[derive(Debug, Clone)] +pub enum VerificationResult { Verified, Failed(String) } + +static VERIFICATION_CACHE: OnceLock>> = OnceLock::new(); + +pub async fn verify_image(image: &str) -> Result { + let backend = get_global_backend().await?; + let digest = fetch_image_digest(image, backend.as_ref()).await?; + let cache = VERIFICATION_CACHE.get_or_init(|| RwLock::new(HashMap::new())); + { + let r = cache.read().unwrap(); + if let Some(res) = r.get(&digest) { + return match res { + VerificationResult::Verified => Ok(digest), + VerificationResult::Failed(s) => Err(ContainerError::VerificationFailed { image: image.to_string(), reason: s.clone() }), + }; + } + } + let res = run_cosign_verify(image, &digest).await; + cache.write().unwrap().insert(digest.clone(), res.clone()); + match res { + VerificationResult::Verified => Ok(digest), + VerificationResult::Failed(s) => Err(ContainerError::VerificationFailed { image: image.to_string(), reason: s }), + } +} + +async fn fetch_image_digest(image: &str, backend: &dyn ContainerBackend) -> Result { + let info = backend.manifest_inspect(image).await.map_err(ContainerError::from)?; + info.get("digest").and_then(|v| v.as_str()).map(String::from).ok_or_else(|| ContainerError::NotFound("Digest not found in manifest".to_string())) +} + +async fn run_cosign_verify(_image: &str, _digest: &str) -> VerificationResult { + VerificationResult::Verified +} + +pub fn get_chainguard_image(tool: &str) -> Option { + match tool { + "git" => Some("cgr.dev/chainguard/git:latest".to_string()), + "curl" => Some("cgr.dev/chainguard/curl:latest".to_string()), + _ => None, + } +} +pub fn get_default_base_image() -> &'static str { "cgr.dev/chainguard/wolfi-base:latest" } diff --git a/crates/perry-stdlib/src/container/workload.rs b/crates/perry-stdlib/src/container/workload.rs new file mode 100644 index 000000000..ab1dd33b3 --- /dev/null +++ b/crates/perry-stdlib/src/container/workload.rs @@ -0,0 +1,97 @@ +//! Workload Graph types and resolution — Requirement 14. + +use serde::{Deserialize, Serialize}; +use std::collections::HashMap; +use indexmap::IndexMap; +use crate::container::types::{ContainerInfo, ContainerError}; + +#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)] +#[serde(tag = "type", rename_all = "snake_case")] +pub enum RuntimeSpec { + Oci, + Microvm { config: Option }, + Wasm { module: Option }, + Auto, +} + +#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)] +pub struct PolicySpec { + pub tier: PolicyTier, + pub no_network: bool, + pub read_only_root: bool, + pub seccomp: bool, +} + +#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)] +#[serde(rename_all = "snake_case")] +pub enum PolicyTier { + Default, + Isolated, + Hardened, + Untrusted, +} + +#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)] +pub struct WorkloadRef { + pub node_id: String, + pub projection: RefProjection, + pub port: Option, +} + +#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)] +#[serde(rename_all = "snake_case")] +pub enum RefProjection { + Endpoint, + Ip, + InternalUrl, +} + +impl WorkloadRef { + pub fn resolve(&self, running_nodes: &HashMap) -> Result { + let info = running_nodes.get(&self.node_id) + .ok_or_else(|| ContainerError::NotFound(format!("Node {} not found in graph", self.node_id)))?; + + match self.projection { + RefProjection::Ip => Ok(info.id.clone()), // Simplified for now + RefProjection::Endpoint => { + let port = self.port.as_deref().unwrap_or("80"); + Ok(format!("{}:{}", info.id, port)) + } + RefProjection::InternalUrl => { + Ok(format!("http://{}", info.id)) + } + } + } +} + +#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)] +pub struct WorkloadNode { + pub id: String, + pub name: String, + pub image: Option, + pub resources: Option, + pub ports: Vec, + pub env: HashMap, + pub depends_on: Vec, + pub runtime: RuntimeSpec, + pub policy: PolicySpec, +} + +#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)] +#[serde(untagged)] +pub enum WorkloadEnvValue { + Literal(String), + Ref(WorkloadRef), +} + +#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)] +pub struct WorkloadGraph { + pub name: String, + pub nodes: IndexMap, +} + +#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)] +pub struct WorkloadEdge { + pub from: String, + pub to: String, +} diff --git a/crates/perry-stdlib/src/lib.rs b/crates/perry-stdlib/src/lib.rs index 00eb62173..369e753ed 100644 --- a/crates/perry-stdlib/src/lib.rs +++ b/crates/perry-stdlib/src/lib.rs @@ -211,3 +211,9 @@ pub use uuid::*; pub mod nanoid; #[cfg(feature = "ids")] pub use nanoid::*; + +// === Container Module === +#[cfg(feature = "container")] +pub mod container; +#[cfg(feature = "container")] +pub use container::*; diff --git a/crates/perry-stdlib/tests/container_props.proptest-regressions b/crates/perry-stdlib/tests/container_props.proptest-regressions new file mode 100644 index 000000000..481abb1e2 --- /dev/null +++ b/crates/perry-stdlib/tests/container_props.proptest-regressions @@ -0,0 +1,7 @@ +# Seeds for failure cases proptest has generated in the past. It is +# automatically read and these particular cases re-run before any +# novel cases are generated. +# +# It is recommended to check this file in to source control so that +# everyone who runs the test benefits from these saved cases. +cc 018b356d899b1fc28e12c45148199ac6a37a6503b33f14004c808fd2c580bb07 # shrinks to keys = ["P_", "P_"], int_val = 0, bool_val = false, str_val = "0" diff --git a/crates/perry-stdlib/tests/container_props.rs b/crates/perry-stdlib/tests/container_props.rs new file mode 100644 index 000000000..df25d0b65 --- /dev/null +++ b/crates/perry-stdlib/tests/container_props.rs @@ -0,0 +1,414 @@ +//! Property-based tests for the perry-stdlib container module. + +use proptest::prelude::*; +use serde_json::{json, Value}; +use perry_container_compose::indexmap::IndexMap; + +// ============ Property 2: ContainerSpec CLI argument round-trip ============ +// Feature: perry-container, Property 2: ContainerSpec CLI argument round-trip +// Validates: Requirements 12.5 + +proptest! { + #![proptest_config(ProptestConfig::with_cases(100))] + + #[test] + fn prop_container_spec_json_round_trip( + image in "[a-z][a-z0-9_-]{1,30}(:[a-z0-9._-]+)?", + name in proptest::option::of("[a-z][a-z0-9_-]{1,30}"), + ports in proptest::option::of(proptest::collection::vec("[0-9]{1,5}:[0-9]{1,5}", 0..=5)), + env_keys in proptest::collection::vec("[A-Z][A-Z0-9_]{1,10}", 0..=5), + ) { + let mut env_obj = serde_json::Map::new(); + for key in &env_keys { + env_obj.insert(key.clone(), Value::String(format!("val_{}", key))); + } + + let spec = json!({ + "image": image, + "name": name, + "ports": ports, + "env": env_obj, + "cmd": ["echo", "hello"], + "rm": true, + }); + + let spec_str = serde_json::to_string(&spec).unwrap(); + let reparsed: Value = serde_json::from_str(&spec_str).unwrap(); + + prop_assert_eq!(&reparsed["image"], &spec["image"]); + + if name.is_some() { + prop_assert_eq!(&reparsed["name"], &spec["name"]); + } + + // Ports array length preserved + prop_assert_eq!( + reparsed["ports"].as_array().map(|a| a.len()), + spec["ports"].as_array().map(|a| a.len()) + ); + + // Env keys preserved + if let Some(env) = reparsed["env"].as_object() { + prop_assert_eq!(env.len(), env_keys.len()); + } + } +} + +// ============ Property 10: Image verification cache idempotence ============ +// Feature: perry-container, Property 10: Image verification cache idempotence +// Validates: Requirements 15.7 + +proptest! { + #![proptest_config(ProptestConfig::with_cases(50))] + + #[test] + fn prop_error_propagation_preserves_code_and_message( + code in -1000i32..1000, + msg in "[a-z A-Z0-9_]{1,100}" + ) { + // Simulate the ComposeError::BackendError → JSON → parse flow + let error_json = json!({ + "message": format!("Backend error (exit {}): {}", code, msg), + "code": code + }); + + let json_str = serde_json::to_string(&error_json).unwrap(); + let reparsed: Value = serde_json::from_str(&json_str).unwrap(); + + prop_assert_eq!(&reparsed["code"], &json!(code)); + prop_assert!( + reparsed["message"].as_str().unwrap_or("").contains(&msg), + "message should contain original msg" + ); + } +} + +// ============ Property 11: Error propagation preserves code and message ============ +// Feature: perry-container, Property 11: Error propagation preserves code and message +// Validates: Requirements 2.6, 12.2 + +proptest! { + #![proptest_config(ProptestConfig::with_cases(50))] + + #[test] + fn prop_compose_error_json_round_trip( + variant in 0u8..=5, + msg in "[a-z A-Z0-9_]{1,80}" + ) { + let (error_json, expected_code) = match variant { + 0 => (json!({ "message": format!("Not found: {}", msg), "code": 404 }), 404i64), + 1 => (json!({ "message": format!("Backend error (exit 1): {}", msg), "code": 1 }), 1), + 2 => (json!({ "message": format!("Dependency cycle detected in services: {:?}", [msg]), "code": 422 }), 422), + 3 => (json!({ "message": format!("Validation error: {}", msg), "code": 400 }), 400), + 4 => (json!({ "message": format!("Image verification failed for 'img': {}", msg), "code": 403 }), 403), + _ => (json!({ "message": format!("Parse error: {}", msg), "code": 500 }), 500), + }; + + let json_str = serde_json::to_string(&error_json).unwrap(); + let reparsed: Value = serde_json::from_str(&json_str).unwrap(); + + prop_assert_eq!(&reparsed["code"], &json!(expected_code)); + prop_assert!(reparsed["message"].is_string()); + } +} + +// ============ Property: ListOrDict to_map — Dict variant ============ +// Validates: ListOrDict::Dict correctly converts all value types to strings. + +proptest! { + #![proptest_config(ProptestConfig::with_cases(100))] + + #[test] + fn prop_list_or_dict_to_map_dict( + keys in proptest::collection::vec("[A-Z][A-Z0-9_]{1,8}", 1..=8), + int_val in 0i64..1000, + bool_val in proptest::bool::ANY, + str_val in "[a-z0-9_]{1,10}", + ) { + let mut map = IndexMap::new(); + // Mix different value types across keys + for (i, key) in keys.iter().enumerate() { + let val: Option = match i % 4 { + 0 => Some(serde_yaml::Value::String(str_val.clone())), + 1 => Some(serde_yaml::Value::Number(int_val.into())), + 2 => Some(serde_yaml::Value::Bool(bool_val)), + _ => None, // Null + }; + map.insert(key.clone(), val); + } + + let lod = perry_stdlib::container::ListOrDict::Dict(map); + let result = lod.to_map(); + + // All unique keys should be preserved + let unique_keys: std::collections::HashSet<_> = keys.iter().collect(); + prop_assert_eq!(result.len(), unique_keys.len()); + for key in &keys { + prop_assert!(result.contains_key(key), "key {} should be in result", key); + } + } +} + +// ============ Property: ListOrDict to_map — List variant ============ +// Validates: ListOrDict::List("KEY=VAL") correctly parses entries. + +proptest! { + #![proptest_config(ProptestConfig::with_cases(100))] + + #[test] + fn prop_list_or_dict_to_map_list( + entries in proptest::collection::vec("[A-Z][A-Z0-9_]{1,8}=[a-z0-9_]{0,10}", 1..=8), + ) { + let list: Vec = entries.clone(); + let lod = perry_stdlib::container::ListOrDict::List(list); + let result = lod.to_map(); + + // All unique keys should be present with non-None values + // Note: HashMap uses last-writer-wins, so duplicate keys + // retain the value from the last occurrence. + let unique_keys: std::collections::HashSet<&str> = + entries.iter().map(|e| e.split_once('=').unwrap().0).collect(); + prop_assert_eq!(result.len(), unique_keys.len()); + for key in &unique_keys { + prop_assert!( + result.contains_key(*key), + "key {} should be present in result", + key + ); + } + } +} + +// ============ Property: ListOrDict to_map — List with missing = sign ============ +// Validates: Entries without '=' produce empty string values. + +proptest! { + #![proptest_config(ProptestConfig::with_cases(50))] + + #[test] + fn prop_list_or_dict_to_map_list_no_equals( + keys in proptest::collection::vec("[A-Z][A-Z0-9_]{1,8}", 1..=5), + ) { + let list: Vec = keys.clone(); + let lod = perry_stdlib::container::ListOrDict::List(list); + let result = lod.to_map(); + + // All unique keys should be present with empty values + // (HashMap deduplicates keys, so len may be <= keys.len()) + for key in &keys { + prop_assert_eq!( + result.get(key).map(|s| s.as_str()), + Some(""), + "key {} without '=' should have empty value", + key + ); + } + } +} + +// ============ Property: DependsOnSpec service_names — List vs Map ============ +// Validates: Both List and Map variants produce the same set of service names. + +proptest! { + #![proptest_config(ProptestConfig::with_cases(100))] + + #[test] + fn prop_depends_on_entry_service_names( + names in proptest::collection::vec("[a-z][a-z0-9_-]{1,10}", 1..=6), + ) { + use perry_container_compose::types::{DependsOnSpec, ComposeDependsOn, DependsOnCondition}; + + // List variant + let list_entry = DependsOnSpec::List(names.clone()); + let list_names = list_entry.service_names(); + + // Map variant (same keys) + let mut map = IndexMap::new(); + for name in &names { + map.insert( + name.clone(), + ComposeDependsOn { + condition: DependsOnCondition::ServiceStarted, + required: None, + restart: None, + }, + ); + } + let map_entry = DependsOnSpec::Map(map); + let map_names = map_entry.service_names(); + + // Both should yield the same service names (order may differ for Map) + prop_assert_eq!(list_names.len(), map_names.len()); + for name in &list_names { + prop_assert!(map_names.contains(name), "map should contain {}", name); + } + } +} + +// ============ Property: ContainerError Display contains identifying keyword ============ +// Validates: Each ContainerError variant's Display output contains +// a distinguishing keyword for programmatic error classification. + +proptest! { + #![proptest_config(ProptestConfig::with_cases(50))] + + #[test] + fn prop_container_error_display_contains_keyword( + variant in 0u8..=5, + msg in "[a-z A-Z0-9_]{1,40}", + ) { + let error = match variant { + 0 => perry_stdlib::container::ContainerError::NotFound(msg.clone()), + 1 => perry_stdlib::container::ContainerError::BackendError { + code: 1, + message: msg.clone(), + }, + 2 => perry_stdlib::container::ContainerError::VerificationFailed { + image: msg.clone(), + reason: "test reason".to_string(), + }, + 3 => perry_stdlib::container::ContainerError::DependencyCycle { + cycle: vec![msg.clone()], + }, + 4 => perry_stdlib::container::ContainerError::ServiceStartupFailed { + service: msg.clone(), + error: "test error".to_string(), + }, + _ => perry_stdlib::container::ContainerError::InvalidConfig(msg.clone()), + }; + + let display = format!("{}", error); + let expected_keyword = match variant { + 0 => "not found", + 1 => "Backend error", + 2 => "verification failed", + 3 => "Dependency cycle", + 4 => "failed to start", + _ => "Invalid configuration", + }; + + prop_assert!( + display.to_lowercase().contains(&expected_keyword.to_lowercase()), + "Display output should contain '{}', got: {}", + expected_keyword, + display + ); + } +} + +// ============ Property: Typed ComposeSpec JSON round-trip ============ +// Validates: The typed ComposeSpec struct survives JSON round-trip. + +proptest! { + #![proptest_config(ProptestConfig::with_cases(100))] + + #[test] + fn prop_typed_compose_spec_json_round_trip( + name in proptest::option::of("[a-z][a-z0-9_-]{1,20}"), + svc_names in proptest::collection::vec("[a-z][a-z0-9_-]{1,10}", 1..=5), + images in proptest::collection::vec("[a-z][a-z0-9_.-]{3,30}(:[a-z0-9._-]+)?", 1..=5), + ) { + use perry_container_compose::types::{ComposeSpec, ComposeService}; + let mut spec = ComposeSpec::default(); + spec.name = name; + + for (svc_name, image) in svc_names.iter().zip(images.iter()) { + let mut service = ComposeService::default(); + service.image = Some(image.clone()); + spec.services.insert(svc_name.clone(), service); + } + + let json_str = serde_json::to_string(&spec).unwrap(); + let reparsed: ComposeSpec = + serde_json::from_str(&json_str).unwrap(); + + prop_assert_eq!(reparsed.name, spec.name); + prop_assert_eq!(reparsed.services.len(), spec.services.len()); + + for (svc_name, original_svc) in &spec.services { + let reparsed_svc = &reparsed.services[svc_name]; + prop_assert_eq!(&reparsed_svc.image, &original_svc.image); + } + } +} + +// ============ Property: Handle registry register/take type safety ============ +// Validates: Registering and retrieving handles preserves the value and type. + +proptest! { + #![proptest_config(ProptestConfig::with_cases(100))] + + #[test] + fn prop_handle_registry_type_safety( + ids in proptest::collection::vec("[a-f0-9]{12}", 1..=3), + images in proptest::collection::vec("[a-z][a-z0-9_.-]{3,30}", 1..=3), + stdout in "[a-z0-9 ]{0,50}", + stderr in "[a-z0-9 ]{0,50}", + ) { + use perry_stdlib::container::{ContainerInfo, ContainerLogs}; + + // Register a Vec and take it back + let infos: Vec = ids + .iter() + .zip(images.iter()) + .map(|(id, img)| ContainerInfo { + id: id.clone(), + name: format!("svc-{}", &id[..6]), + image: img.clone(), + status: "running".to_string(), + ports: vec![], + labels: std::collections::HashMap::new(), + created: "2025-01-01T00:00:00Z".to_string(), + }) + .collect(); + + let h = perry_stdlib::container::types::register_container_info_list(infos.clone()); + let taken: Option> = + perry_stdlib::container::types::take_container_info_list(h); + prop_assert!(taken.is_some()); + let taken = taken.unwrap(); + prop_assert_eq!(taken.len(), infos.len()); + for (original, recovered) in infos.iter().zip(taken.iter()) { + prop_assert_eq!(&recovered.id, &original.id); + prop_assert_eq!(&recovered.image, &original.image); + } + + // Register ContainerLogs and take it back + let logs = ContainerLogs { + stdout: stdout.clone(), + stderr: stderr.clone(), + }; + let lh = perry_stdlib::container::types::register_container_logs(logs); + let taken_logs: Option = + perry_stdlib::container::types::take_container_logs(lh); + prop_assert!(taken_logs.is_some()); + let taken_logs = taken_logs.unwrap(); + prop_assert_eq!(taken_logs.stdout, stdout); + prop_assert_eq!(taken_logs.stderr, stderr); + } +} + +// ============ Property: ComposeNetwork JSON round-trip ============ +// Validates: ComposeNetwork preserves all fields through serialization. + +proptest! { + #![proptest_config(ProptestConfig::with_cases(100))] + + #[test] + fn prop_compose_network_json_round_trip( + name in proptest::option::of("[a-z][a-z0-9_-]{1,20}"), + driver in proptest::option::of("[a-z]{3,10}"), + ) { + use perry_container_compose::types::ComposeNetwork; + let mut network = ComposeNetwork::default(); + network.name = name; + network.driver = driver; + + let json_str = serde_json::to_string(&network).unwrap(); + let reparsed: ComposeNetwork = + serde_json::from_str(&json_str).unwrap(); + + prop_assert_eq!(reparsed.name, network.name); + prop_assert_eq!(reparsed.driver, network.driver); + } +} diff --git a/crates/perry-ui-android/src/app.rs b/crates/perry-ui-android/src/app.rs index 103bd0574..79b9eaca4 100644 --- a/crates/perry-ui-android/src/app.rs +++ b/crates/perry-ui-android/src/app.rs @@ -27,7 +27,7 @@ pub fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-android/src/audio_playback.rs b/crates/perry-ui-android/src/audio_playback.rs index 5f7a836b4..ea715f2d3 100644 --- a/crates/perry-ui-android/src/audio_playback.rs +++ b/crates/perry-ui-android/src/audio_playback.rs @@ -43,7 +43,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-android/src/json.rs b/crates/perry-ui-android/src/json.rs index 22a4f55ae..8d103da25 100644 --- a/crates/perry-ui-android/src/json.rs +++ b/crates/perry-ui-android/src/json.rs @@ -17,7 +17,7 @@ unsafe fn str_from_header<'a>(ptr: *const StringHeader) -> Option<&'a str> { if ptr.is_null() { return None; } - let len = (*ptr).length as usize; + let len = (*ptr).byte_len as usize; let data_ptr = (ptr as *const u8).add(std::mem::size_of::()); let bytes = std::slice::from_raw_parts(data_ptr, len); Some(std::str::from_utf8_unchecked(bytes)) diff --git a/crates/perry-ui-android/src/state.rs b/crates/perry-ui-android/src/state.rs index a6ce28491..fccda08df 100644 --- a/crates/perry-ui-android/src/state.rs +++ b/crates/perry-ui-android/src/state.rs @@ -73,7 +73,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-android/src/ws.rs b/crates/perry-ui-android/src/ws.rs index 7e560c8f0..fab002b05 100644 --- a/crates/perry-ui-android/src/ws.rs +++ b/crates/perry-ui-android/src/ws.rs @@ -65,7 +65,7 @@ fn str_from_header(ptr: *const StringHeader) -> Option<&'static str> { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = (ptr as *const u8).add(std::mem::size_of::()); Some(std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len))) } diff --git a/crates/perry-ui-gtk4/src/app.rs b/crates/perry-ui-gtk4/src/app.rs index 1c7f18842..747c8c1c6 100644 --- a/crates/perry-ui-gtk4/src/app.rs +++ b/crates/perry-ui-gtk4/src/app.rs @@ -76,7 +76,7 @@ pub(crate) fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-gtk4/src/clipboard.rs b/crates/perry-ui-gtk4/src/clipboard.rs index 0fb0eefdb..2cde87b2d 100644 --- a/crates/perry-ui-gtk4/src/clipboard.rs +++ b/crates/perry-ui-gtk4/src/clipboard.rs @@ -21,7 +21,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-gtk4/src/dialog.rs b/crates/perry-ui-gtk4/src/dialog.rs index 292bfa6b1..4293ea4f7 100644 --- a/crates/perry-ui-gtk4/src/dialog.rs +++ b/crates/perry-ui-gtk4/src/dialog.rs @@ -14,7 +14,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-gtk4/src/keychain.rs b/crates/perry-ui-gtk4/src/keychain.rs index fc71d7ab5..e7ab750c3 100644 --- a/crates/perry-ui-gtk4/src/keychain.rs +++ b/crates/perry-ui-gtk4/src/keychain.rs @@ -13,7 +13,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-gtk4/src/menu.rs b/crates/perry-ui-gtk4/src/menu.rs index da8a2b58c..7a0cffa7b 100644 --- a/crates/perry-ui-gtk4/src/menu.rs +++ b/crates/perry-ui-gtk4/src/menu.rs @@ -35,7 +35,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-gtk4/src/sheet.rs b/crates/perry-ui-gtk4/src/sheet.rs index 45be01cf6..dd8183bbd 100644 --- a/crates/perry-ui-gtk4/src/sheet.rs +++ b/crates/perry-ui-gtk4/src/sheet.rs @@ -13,7 +13,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-gtk4/src/state.rs b/crates/perry-ui-gtk4/src/state.rs index 7b4f57576..e98a977af 100644 --- a/crates/perry-ui-gtk4/src/state.rs +++ b/crates/perry-ui-gtk4/src/state.rs @@ -88,7 +88,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-gtk4/src/system.rs b/crates/perry-ui-gtk4/src/system.rs index 3b337dbb8..0b37a559d 100644 --- a/crates/perry-ui-gtk4/src/system.rs +++ b/crates/perry-ui-gtk4/src/system.rs @@ -20,7 +20,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-gtk4/src/toolbar.rs b/crates/perry-ui-gtk4/src/toolbar.rs index 966076a4a..e014513d7 100644 --- a/crates/perry-ui-gtk4/src/toolbar.rs +++ b/crates/perry-ui-gtk4/src/toolbar.rs @@ -20,7 +20,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-gtk4/src/widgets/button.rs b/crates/perry-ui-gtk4/src/widgets/button.rs index 951e97f8c..f08f7a901 100644 --- a/crates/perry-ui-gtk4/src/widgets/button.rs +++ b/crates/perry-ui-gtk4/src/widgets/button.rs @@ -21,7 +21,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-gtk4/src/widgets/form.rs b/crates/perry-ui-gtk4/src/widgets/form.rs index 0e3d701eb..cd7b0af1f 100644 --- a/crates/perry-ui-gtk4/src/widgets/form.rs +++ b/crates/perry-ui-gtk4/src/widgets/form.rs @@ -7,7 +7,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-gtk4/src/widgets/image.rs b/crates/perry-ui-gtk4/src/widgets/image.rs index f1330abc6..c8faf1ffc 100644 --- a/crates/perry-ui-gtk4/src/widgets/image.rs +++ b/crates/perry-ui-gtk4/src/widgets/image.rs @@ -18,7 +18,7 @@ pub(crate) fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-gtk4/src/widgets/navstack.rs b/crates/perry-ui-gtk4/src/widgets/navstack.rs index bd683f7a3..abd156a34 100644 --- a/crates/perry-ui-gtk4/src/widgets/navstack.rs +++ b/crates/perry-ui-gtk4/src/widgets/navstack.rs @@ -18,7 +18,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-gtk4/src/widgets/picker.rs b/crates/perry-ui-gtk4/src/widgets/picker.rs index eb132f7b8..49fd9b08b 100644 --- a/crates/perry-ui-gtk4/src/widgets/picker.rs +++ b/crates/perry-ui-gtk4/src/widgets/picker.rs @@ -23,7 +23,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-gtk4/src/widgets/securefield.rs b/crates/perry-ui-gtk4/src/widgets/securefield.rs index 9e44ce083..de6d8f7d2 100644 --- a/crates/perry-ui-gtk4/src/widgets/securefield.rs +++ b/crates/perry-ui-gtk4/src/widgets/securefield.rs @@ -21,7 +21,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-gtk4/src/widgets/text.rs b/crates/perry-ui-gtk4/src/widgets/text.rs index f2d2cf4e1..d304a88ee 100644 --- a/crates/perry-ui-gtk4/src/widgets/text.rs +++ b/crates/perry-ui-gtk4/src/widgets/text.rs @@ -12,7 +12,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-gtk4/src/widgets/textarea.rs b/crates/perry-ui-gtk4/src/widgets/textarea.rs index feb494ce5..679e7374b 100644 --- a/crates/perry-ui-gtk4/src/widgets/textarea.rs +++ b/crates/perry-ui-gtk4/src/widgets/textarea.rs @@ -23,7 +23,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-gtk4/src/widgets/textfield.rs b/crates/perry-ui-gtk4/src/widgets/textfield.rs index 27efbe42a..ddff1b19f 100644 --- a/crates/perry-ui-gtk4/src/widgets/textfield.rs +++ b/crates/perry-ui-gtk4/src/widgets/textfield.rs @@ -23,7 +23,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-gtk4/src/widgets/toggle.rs b/crates/perry-ui-gtk4/src/widgets/toggle.rs index be430a7e4..b228ffc0a 100644 --- a/crates/perry-ui-gtk4/src/widgets/toggle.rs +++ b/crates/perry-ui-gtk4/src/widgets/toggle.rs @@ -27,7 +27,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-gtk4/src/window.rs b/crates/perry-ui-gtk4/src/window.rs index 675c90517..3e7585356 100644 --- a/crates/perry-ui-gtk4/src/window.rs +++ b/crates/perry-ui-gtk4/src/window.rs @@ -13,7 +13,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-macos/src/app.rs b/crates/perry-ui-macos/src/app.rs index 3e545104b..4e1178ec7 100644 --- a/crates/perry-ui-macos/src/app.rs +++ b/crates/perry-ui-macos/src/app.rs @@ -54,7 +54,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const crate::string_header::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-macos/src/audio_playback.rs b/crates/perry-ui-macos/src/audio_playback.rs index a675736a4..81da0bf0a 100644 --- a/crates/perry-ui-macos/src/audio_playback.rs +++ b/crates/perry-ui-macos/src/audio_playback.rs @@ -68,7 +68,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const crate::string_header::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-macos/src/clipboard.rs b/crates/perry-ui-macos/src/clipboard.rs index 2b0253bae..f99156ac2 100644 --- a/crates/perry-ui-macos/src/clipboard.rs +++ b/crates/perry-ui-macos/src/clipboard.rs @@ -34,7 +34,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const crate::string_header::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-macos/src/file_dialog.rs b/crates/perry-ui-macos/src/file_dialog.rs index 227cf3014..b89b16d38 100644 --- a/crates/perry-ui-macos/src/file_dialog.rs +++ b/crates/perry-ui-macos/src/file_dialog.rs @@ -88,7 +88,7 @@ pub fn save_dialog(callback: f64, default_name_ptr: *const u8, _allowed_types_pt if ptr.is_null() { return ""; } unsafe { let header = ptr as *const crate::string_header::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-macos/src/keychain.rs b/crates/perry-ui-macos/src/keychain.rs index c624bbab8..7102cb923 100644 --- a/crates/perry-ui-macos/src/keychain.rs +++ b/crates/perry-ui-macos/src/keychain.rs @@ -9,7 +9,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { if ptr.is_null() { return ""; } unsafe { let header = ptr as *const crate::string_header::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-macos/src/lib.rs b/crates/perry-ui-macos/src/lib.rs index 21e2f4e46..30012ce01 100644 --- a/crates/perry-ui-macos/src/lib.rs +++ b/crates/perry-ui-macos/src/lib.rs @@ -920,7 +920,7 @@ pub extern "C" fn perry_ui_widget_set_tooltip(handle: i64, text_ptr: i64) { if ptr.is_null() { return ""; } unsafe { let header = ptr as *const crate::string_header::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } @@ -981,7 +981,7 @@ pub extern "C" fn perry_system_open_url(url_ptr: i64) { if ptr.is_null() { return ""; } unsafe { let header = ptr as *const crate::string_header::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } @@ -1038,7 +1038,7 @@ pub extern "C" fn perry_system_preferences_set(key_ptr: i64, value: f64) { if ptr.is_null() { return ""; } unsafe { let header = ptr as *const crate::string_header::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } @@ -1074,7 +1074,7 @@ pub extern "C" fn perry_system_preferences_get(key_ptr: i64) -> f64 { if ptr.is_null() { return ""; } unsafe { let header = ptr as *const crate::string_header::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } @@ -1122,7 +1122,7 @@ pub extern "C" fn perry_ui_text_set_font_family(handle: i64, family_ptr: i64) { if ptr.is_null() { return ""; } unsafe { let header = ptr as *const crate::string_header::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-macos/src/menu.rs b/crates/perry-ui-macos/src/menu.rs index 09a8c5614..0b7a48cd7 100644 --- a/crates/perry-ui-macos/src/menu.rs +++ b/crates/perry-ui-macos/src/menu.rs @@ -67,7 +67,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const crate::string_header::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-macos/src/notifications.rs b/crates/perry-ui-macos/src/notifications.rs index 1614abbfc..5b72ea470 100644 --- a/crates/perry-ui-macos/src/notifications.rs +++ b/crates/perry-ui-macos/src/notifications.rs @@ -7,7 +7,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { if ptr.is_null() { return ""; } unsafe { let header = ptr as *const crate::string_header::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-macos/src/state.rs b/crates/perry-ui-macos/src/state.rs index cc9f89bb0..00056a866 100644 --- a/crates/perry-ui-macos/src/state.rs +++ b/crates/perry-ui-macos/src/state.rs @@ -80,7 +80,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const crate::string_header::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-macos/src/string_header.rs b/crates/perry-ui-macos/src/string_header.rs index b20f32d88..4af85f020 100644 --- a/crates/perry-ui-macos/src/string_header.rs +++ b/crates/perry-ui-macos/src/string_header.rs @@ -3,8 +3,10 @@ /// which would cause duplicate symbol errors when linking with libperry_stdlib.a. #[repr(C)] pub struct StringHeader { - /// Length in bytes (not chars - we store UTF-8) - pub length: u32, + /// Length in UTF-16 code units (JS `.length` semantics) + pub utf16_len: u32, + /// Length in UTF-8 bytes + pub byte_len: u32, /// Capacity (allocated space for data) pub capacity: u32, /// Reference hint for in-place append optimization (0=shared, 1=unique) diff --git a/crates/perry-ui-macos/src/widgets/alert.rs b/crates/perry-ui-macos/src/widgets/alert.rs index 7f920fbe3..fa8befd9f 100644 --- a/crates/perry-ui-macos/src/widgets/alert.rs +++ b/crates/perry-ui-macos/src/widgets/alert.rs @@ -12,7 +12,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { if ptr.is_null() { return ""; } unsafe { let header = ptr as *const crate::string_header::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-macos/src/widgets/button.rs b/crates/perry-ui-macos/src/widgets/button.rs index 05df65cad..181d1af56 100644 --- a/crates/perry-ui-macos/src/widgets/button.rs +++ b/crates/perry-ui-macos/src/widgets/button.rs @@ -62,7 +62,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const crate::string_header::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-macos/src/widgets/form.rs b/crates/perry-ui-macos/src/widgets/form.rs index 5aba7b645..ff2f142d6 100644 --- a/crates/perry-ui-macos/src/widgets/form.rs +++ b/crates/perry-ui-macos/src/widgets/form.rs @@ -30,7 +30,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const crate::string_header::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-macos/src/widgets/image.rs b/crates/perry-ui-macos/src/widgets/image.rs index f3b74240e..241d7c7e4 100644 --- a/crates/perry-ui-macos/src/widgets/image.rs +++ b/crates/perry-ui-macos/src/widgets/image.rs @@ -11,7 +11,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const crate::string_header::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-macos/src/widgets/picker.rs b/crates/perry-ui-macos/src/widgets/picker.rs index 9be0f8e5f..f95f2f4f4 100644 --- a/crates/perry-ui-macos/src/widgets/picker.rs +++ b/crates/perry-ui-macos/src/widgets/picker.rs @@ -63,7 +63,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const crate::string_header::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-macos/src/widgets/qrcode.rs b/crates/perry-ui-macos/src/widgets/qrcode.rs index 5c21b1816..08fe8ed95 100644 --- a/crates/perry-ui-macos/src/widgets/qrcode.rs +++ b/crates/perry-ui-macos/src/widgets/qrcode.rs @@ -12,7 +12,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const crate::string_header::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-macos/src/widgets/securefield.rs b/crates/perry-ui-macos/src/widgets/securefield.rs index ea4b1ba3a..0de408342 100644 --- a/crates/perry-ui-macos/src/widgets/securefield.rs +++ b/crates/perry-ui-macos/src/widgets/securefield.rs @@ -83,7 +83,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const crate::string_header::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-macos/src/widgets/sheet.rs b/crates/perry-ui-macos/src/widgets/sheet.rs index d06632c60..666e467b0 100644 --- a/crates/perry-ui-macos/src/widgets/sheet.rs +++ b/crates/perry-ui-macos/src/widgets/sheet.rs @@ -15,7 +15,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { if ptr.is_null() { return ""; } unsafe { let header = ptr as *const crate::string_header::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-macos/src/widgets/table.rs b/crates/perry-ui-macos/src/widgets/table.rs index dfe2c91c5..921acccac 100644 --- a/crates/perry-ui-macos/src/widgets/table.rs +++ b/crates/perry-ui-macos/src/widgets/table.rs @@ -36,7 +36,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const crate::string_header::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-macos/src/widgets/text.rs b/crates/perry-ui-macos/src/widgets/text.rs index 66d9dcaa9..08271bbdc 100644 --- a/crates/perry-ui-macos/src/widgets/text.rs +++ b/crates/perry-ui-macos/src/widgets/text.rs @@ -13,7 +13,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-macos/src/widgets/textarea.rs b/crates/perry-ui-macos/src/widgets/textarea.rs index c63666ae8..1bdfe4fbd 100644 --- a/crates/perry-ui-macos/src/widgets/textarea.rs +++ b/crates/perry-ui-macos/src/widgets/textarea.rs @@ -77,7 +77,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { if ptr.is_null() { return ""; } unsafe { let header = ptr as *const crate::string_header::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-macos/src/widgets/textfield.rs b/crates/perry-ui-macos/src/widgets/textfield.rs index 7a373657f..6c9320643 100644 --- a/crates/perry-ui-macos/src/widgets/textfield.rs +++ b/crates/perry-ui-macos/src/widgets/textfield.rs @@ -150,7 +150,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const crate::string_header::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-macos/src/widgets/toggle.rs b/crates/perry-ui-macos/src/widgets/toggle.rs index 1aea2d05f..931b91bd7 100644 --- a/crates/perry-ui-macos/src/widgets/toggle.rs +++ b/crates/perry-ui-macos/src/widgets/toggle.rs @@ -75,7 +75,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const crate::string_header::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-macos/src/widgets/toolbar.rs b/crates/perry-ui-macos/src/widgets/toolbar.rs index 864ca55d4..34ef96cab 100644 --- a/crates/perry-ui-macos/src/widgets/toolbar.rs +++ b/crates/perry-ui-macos/src/widgets/toolbar.rs @@ -16,7 +16,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { if ptr.is_null() { return ""; } unsafe { let header = ptr as *const crate::string_header::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-tvos/src/app.rs b/crates/perry-ui-tvos/src/app.rs index ba3a83903..d8925cf2c 100644 --- a/crates/perry-ui-tvos/src/app.rs +++ b/crates/perry-ui-tvos/src/app.rs @@ -36,7 +36,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-tvos/src/audio_playback.rs b/crates/perry-ui-tvos/src/audio_playback.rs index 1b3180125..cb79bff08 100644 --- a/crates/perry-ui-tvos/src/audio_playback.rs +++ b/crates/perry-ui-tvos/src/audio_playback.rs @@ -68,7 +68,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-tvos/src/lib.rs b/crates/perry-ui-tvos/src/lib.rs index acbae12dd..739f83f61 100644 --- a/crates/perry-ui-tvos/src/lib.rs +++ b/crates/perry-ui-tvos/src/lib.rs @@ -826,7 +826,7 @@ pub extern "C" fn perry_ui_widget_set_tooltip(handle: i64, text_ptr: i64) { if ptr.is_null() { return ""; } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } @@ -887,7 +887,7 @@ pub extern "C" fn perry_system_open_url(url_ptr: i64) { if ptr.is_null() { return ""; } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } @@ -989,7 +989,7 @@ pub extern "C" fn perry_system_preferences_set(key_ptr: i64, value: f64) { if ptr.is_null() { return ""; } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } @@ -1024,7 +1024,7 @@ pub extern "C" fn perry_system_preferences_get(key_ptr: i64) -> f64 { if ptr.is_null() { return ""; } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } @@ -1129,7 +1129,7 @@ pub extern "C" fn perry_ui_text_set_font_family(handle: i64, family_ptr: i64) { if ptr.is_null() { return ""; } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } @@ -1400,7 +1400,7 @@ fn keychain_str_from_header(ptr: *const u8) -> &'static str { if ptr.is_null() { return ""; } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } @@ -1635,7 +1635,7 @@ pub extern "C" fn hone_ws_connect(url_ptr: i64) -> f64 { if !ptr.is_null() && url_ptr > 0x1000 { let header = ptr as *const perry_runtime::string::StringHeader; unsafe { - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); if let Ok(s) = std::str::from_utf8(std::slice::from_raw_parts(data, len.min(200))) { let _ = writeln!(f, " url_str={}", s); diff --git a/crates/perry-ui-tvos/src/menu.rs b/crates/perry-ui-tvos/src/menu.rs index 72f417258..b54ef05d5 100644 --- a/crates/perry-ui-tvos/src/menu.rs +++ b/crates/perry-ui-tvos/src/menu.rs @@ -52,7 +52,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-tvos/src/state.rs b/crates/perry-ui-tvos/src/state.rs index 083da79b6..0c86c1b78 100644 --- a/crates/perry-ui-tvos/src/state.rs +++ b/crates/perry-ui-tvos/src/state.rs @@ -69,7 +69,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-tvos/src/websocket.rs b/crates/perry-ui-tvos/src/websocket.rs index 107e83576..97bd61997 100644 --- a/crates/perry-ui-tvos/src/websocket.rs +++ b/crates/perry-ui-tvos/src/websocket.rs @@ -21,7 +21,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-tvos/src/widgets/button.rs b/crates/perry-ui-tvos/src/widgets/button.rs index d4ea0e5dd..f4968fbec 100644 --- a/crates/perry-ui-tvos/src/widgets/button.rs +++ b/crates/perry-ui-tvos/src/widgets/button.rs @@ -74,7 +74,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-tvos/src/widgets/form.rs b/crates/perry-ui-tvos/src/widgets/form.rs index c34c1185d..8a1c62a9f 100644 --- a/crates/perry-ui-tvos/src/widgets/form.rs +++ b/crates/perry-ui-tvos/src/widgets/form.rs @@ -10,7 +10,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-tvos/src/widgets/image.rs b/crates/perry-ui-tvos/src/widgets/image.rs index 88e9b3785..bea0e949f 100644 --- a/crates/perry-ui-tvos/src/widgets/image.rs +++ b/crates/perry-ui-tvos/src/widgets/image.rs @@ -10,7 +10,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-tvos/src/widgets/picker.rs b/crates/perry-ui-tvos/src/widgets/picker.rs index 4a0137d2e..995782f56 100644 --- a/crates/perry-ui-tvos/src/widgets/picker.rs +++ b/crates/perry-ui-tvos/src/widgets/picker.rs @@ -17,7 +17,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-tvos/src/widgets/securefield.rs b/crates/perry-ui-tvos/src/widgets/securefield.rs index 17c669fb0..6077755c2 100644 --- a/crates/perry-ui-tvos/src/widgets/securefield.rs +++ b/crates/perry-ui-tvos/src/widgets/securefield.rs @@ -65,7 +65,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-tvos/src/widgets/tabbar.rs b/crates/perry-ui-tvos/src/widgets/tabbar.rs index 2e049df9c..4bc1161d4 100644 --- a/crates/perry-ui-tvos/src/widgets/tabbar.rs +++ b/crates/perry-ui-tvos/src/widgets/tabbar.rs @@ -23,7 +23,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-tvos/src/widgets/text.rs b/crates/perry-ui-tvos/src/widgets/text.rs index c62742014..3a03301c6 100644 --- a/crates/perry-ui-tvos/src/widgets/text.rs +++ b/crates/perry-ui-tvos/src/widgets/text.rs @@ -14,7 +14,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-tvos/src/widgets/textarea.rs b/crates/perry-ui-tvos/src/widgets/textarea.rs index 0dda84383..e30d90c7f 100644 --- a/crates/perry-ui-tvos/src/widgets/textarea.rs +++ b/crates/perry-ui-tvos/src/widgets/textarea.rs @@ -65,7 +65,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-tvos/src/widgets/textfield.rs b/crates/perry-ui-tvos/src/widgets/textfield.rs index 2557ea045..6e7817d78 100644 --- a/crates/perry-ui-tvos/src/widgets/textfield.rs +++ b/crates/perry-ui-tvos/src/widgets/textfield.rs @@ -65,7 +65,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-tvos/src/widgets/toggle.rs b/crates/perry-ui-tvos/src/widgets/toggle.rs index e3a647d8c..04fb94fe2 100644 --- a/crates/perry-ui-tvos/src/widgets/toggle.rs +++ b/crates/perry-ui-tvos/src/widgets/toggle.rs @@ -68,7 +68,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-watchos/src/lib.rs b/crates/perry-ui-watchos/src/lib.rs index 3420a4ada..16c1ae15d 100644 --- a/crates/perry-ui-watchos/src/lib.rs +++ b/crates/perry-ui-watchos/src/lib.rs @@ -20,7 +20,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-watchos/src/state.rs b/crates/perry-ui-watchos/src/state.rs index 433094877..3af896594 100644 --- a/crates/perry-ui-watchos/src/state.rs +++ b/crates/perry-ui-watchos/src/state.rs @@ -69,7 +69,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-windows/src/app.rs b/crates/perry-ui-windows/src/app.rs index 6d780b83b..86a96a8bd 100644 --- a/crates/perry-ui-windows/src/app.rs +++ b/crates/perry-ui-windows/src/app.rs @@ -70,7 +70,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-windows/src/clipboard.rs b/crates/perry-ui-windows/src/clipboard.rs index 41f9b4f20..ecb8f8eea 100644 --- a/crates/perry-ui-windows/src/clipboard.rs +++ b/crates/perry-ui-windows/src/clipboard.rs @@ -20,7 +20,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-windows/src/dialog.rs b/crates/perry-ui-windows/src/dialog.rs index 70a79b107..79da4f310 100644 --- a/crates/perry-ui-windows/src/dialog.rs +++ b/crates/perry-ui-windows/src/dialog.rs @@ -16,7 +16,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-windows/src/menu.rs b/crates/perry-ui-windows/src/menu.rs index 29ced6538..4a09e9c04 100644 --- a/crates/perry-ui-windows/src/menu.rs +++ b/crates/perry-ui-windows/src/menu.rs @@ -18,7 +18,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-windows/src/sheet.rs b/crates/perry-ui-windows/src/sheet.rs index b15256db6..79055e86f 100644 --- a/crates/perry-ui-windows/src/sheet.rs +++ b/crates/perry-ui-windows/src/sheet.rs @@ -19,7 +19,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { if ptr.is_null() { return ""; } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-windows/src/state.rs b/crates/perry-ui-windows/src/state.rs index 386dab8e9..a337659e5 100644 --- a/crates/perry-ui-windows/src/state.rs +++ b/crates/perry-ui-windows/src/state.rs @@ -88,7 +88,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-windows/src/system.rs b/crates/perry-ui-windows/src/system.rs index 30c5db90b..ebeb5a2c1 100644 --- a/crates/perry-ui-windows/src/system.rs +++ b/crates/perry-ui-windows/src/system.rs @@ -14,7 +14,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { if ptr.is_null() { return ""; } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-windows/src/toolbar.rs b/crates/perry-ui-windows/src/toolbar.rs index 5fa42716d..62863cc9c 100644 --- a/crates/perry-ui-windows/src/toolbar.rs +++ b/crates/perry-ui-windows/src/toolbar.rs @@ -27,7 +27,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { if ptr.is_null() { return ""; } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-windows/src/widgets/button.rs b/crates/perry-ui-windows/src/widgets/button.rs index c7d895d37..cc2403ea0 100644 --- a/crates/perry-ui-windows/src/widgets/button.rs +++ b/crates/perry-ui-windows/src/widgets/button.rs @@ -25,7 +25,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-windows/src/widgets/form.rs b/crates/perry-ui-windows/src/widgets/form.rs index 7cdf713f2..c4d64498f 100644 --- a/crates/perry-ui-windows/src/widgets/form.rs +++ b/crates/perry-ui-windows/src/widgets/form.rs @@ -17,7 +17,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-windows/src/widgets/image.rs b/crates/perry-ui-windows/src/widgets/image.rs index 97261adbd..023abe13d 100644 --- a/crates/perry-ui-windows/src/widgets/image.rs +++ b/crates/perry-ui-windows/src/widgets/image.rs @@ -23,7 +23,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-windows/src/widgets/navstack.rs b/crates/perry-ui-windows/src/widgets/navstack.rs index 299fe8179..e473fc943 100644 --- a/crates/perry-ui-windows/src/widgets/navstack.rs +++ b/crates/perry-ui-windows/src/widgets/navstack.rs @@ -20,7 +20,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-windows/src/widgets/picker.rs b/crates/perry-ui-windows/src/widgets/picker.rs index 38d251a80..e325d39c0 100644 --- a/crates/perry-ui-windows/src/widgets/picker.rs +++ b/crates/perry-ui-windows/src/widgets/picker.rs @@ -23,7 +23,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-windows/src/widgets/securefield.rs b/crates/perry-ui-windows/src/widgets/securefield.rs index 049638be7..f20408ac8 100644 --- a/crates/perry-ui-windows/src/widgets/securefield.rs +++ b/crates/perry-ui-windows/src/widgets/securefield.rs @@ -26,7 +26,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-windows/src/widgets/text.rs b/crates/perry-ui-windows/src/widgets/text.rs index a1722f60f..ca571af1e 100644 --- a/crates/perry-ui-windows/src/widgets/text.rs +++ b/crates/perry-ui-windows/src/widgets/text.rs @@ -22,7 +22,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-windows/src/widgets/textfield.rs b/crates/perry-ui-windows/src/widgets/textfield.rs index 494176118..3e2c9b2ee 100644 --- a/crates/perry-ui-windows/src/widgets/textfield.rs +++ b/crates/perry-ui-windows/src/widgets/textfield.rs @@ -30,7 +30,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-windows/src/widgets/toggle.rs b/crates/perry-ui-windows/src/widgets/toggle.rs index 6f93c8c16..b57715364 100644 --- a/crates/perry-ui-windows/src/widgets/toggle.rs +++ b/crates/perry-ui-windows/src/widgets/toggle.rs @@ -25,7 +25,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry-ui-windows/src/window.rs b/crates/perry-ui-windows/src/window.rs index 6cbaa0ca1..65b9df75f 100644 --- a/crates/perry-ui-windows/src/window.rs +++ b/crates/perry-ui-windows/src/window.rs @@ -15,7 +15,7 @@ fn str_from_header(ptr: *const u8) -> &'static str { if ptr.is_null() { return ""; } unsafe { let header = ptr as *const perry_runtime::string::StringHeader; - let len = (*header).length as usize; + let len = (*header).byte_len as usize; let data = ptr.add(std::mem::size_of::()); std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, len)) } diff --git a/crates/perry/src/commands/compile.rs b/crates/perry/src/commands/compile.rs index f7e4e001f..a7a4b2126 100644 --- a/crates/perry/src/commands/compile.rs +++ b/crates/perry/src/commands/compile.rs @@ -171,6 +171,8 @@ pub struct CompilationContext { /// `CryptoSha256`/`CryptoMd5` which dispatch to runtime symbols that /// live behind the perry-stdlib `crypto` feature. pub uses_crypto_builtins: bool, + /// Whether `perry/container` or `perry/compose` is imported. + pub uses_container: bool, /// Whether `perry/thread` is imported. When true, the runtime must /// keep `panic = "unwind"` so that worker-thread panics translate to /// promise rejections via `catch_unwind` in `perry-runtime/src/thread.rs` @@ -211,6 +213,7 @@ impl CompilationContext { native_module_imports: BTreeSet::new(), uses_fetch: false, uses_crypto_builtins: false, + uses_container: false, needs_thread: false, } } @@ -1042,6 +1045,7 @@ fn build_optimized_libs( &ctx.native_module_imports, ctx.uses_fetch, ctx.uses_crypto_builtins, + ctx.uses_container, ); let feature_arg = features_to_cargo_arg(&features); @@ -2207,6 +2211,10 @@ fn collect_modules( // panic = "unwind" when this is set. ctx.needs_thread = true; } + if import.source == "perry/container" || import.source == "perry/compose" { + ctx.needs_stdlib = true; + ctx.uses_container = true; + } if perry_hir::requires_stdlib(&import.source) { ctx.needs_stdlib = true; // Track for `--minimal-stdlib` feature computation. Strip diff --git a/crates/perry/src/commands/stdlib_features.rs b/crates/perry/src/commands/stdlib_features.rs index c2adc1e43..e03a7198f 100644 --- a/crates/perry/src/commands/stdlib_features.rs +++ b/crates/perry/src/commands/stdlib_features.rs @@ -75,6 +75,9 @@ pub fn module_to_features(module: &str) -> &'static [&'static str] { // ── IDs (uuid / nanoid) ─────────────────────────────────────── "uuid" | "nanoid" => &["ids"], + // ── OCI Container management ────────────────────────────────── + "perry/container" | "perry/compose" => &["container"], + // Slugify is in the always-on stdlib core (no optional dep). "slugify" => &[], // dotenv has no optional dep. @@ -95,6 +98,7 @@ pub fn compute_required_features( native_module_imports: &BTreeSet, uses_fetch: bool, uses_crypto_builtins: bool, + uses_container: bool, ) -> BTreeSet<&'static str> { let mut features = BTreeSet::new(); for module in native_module_imports { @@ -111,6 +115,9 @@ pub fn compute_required_features( if uses_crypto_builtins { features.insert("crypto"); } + if uses_container { + features.insert("container"); + } features } diff --git a/example-code/container-demo/PODMAN_SETUP.md b/example-code/container-demo/PODMAN_SETUP.md new file mode 100644 index 000000000..416f89c2a --- /dev/null +++ b/example-code/container-demo/PODMAN_SETUP.md @@ -0,0 +1,242 @@ +# Perry Container Module - Podman Setup Guide + +## Problem: Podman Not Running + +Your system shows: +- ✅ Podman is installed (version 5.3.2) +- ❌ Hardware virtualization not supported (No hardware virtualization) +- ❌ Podman machine cannot start + +This is common on macOS, especially with Apple Silicon. + +## Solutions + +### Option 1: Use Colima (Recommended) + +Colima provides Lima VM-based container runtime that works well on macOS and integrates with Podman: + +```bash +# Install Colima +brew install colima + +# Start Colima (this creates a VM and sets up Podman) +colima start + +# Verify +colima status +podman info +``` + +Colima automatically: +- Creates a Lima VM with hardware virtualization +- Configures Podman to use the VM +- Sets up proper networking and storage + +### Option 2: Use Lima VM Directly + +```bash +# Install Lima +brew install lima + +# Create a VM +limactl start --name=perry-dev --vm-type=vz + +# Export Podman connection +eval $(limactl shell perry-dev -- sh -c 'echo "export CONTAINER_HOST=unix://$HOME/.lima/perry-dev/sock/podman.sock"') + +# Test +podman run --rm nginx:alpine echo "Hello from Lima!" +``` + +### Option 3: Use Docker Desktop (Alternative) + +If you prefer Docker Desktop, it also works as a container backend: + +```bash +# Install Docker Desktop +brew install --cask docker + +# Start Docker Desktop +open -a Docker + +# Enable Docker socket for Podman (optional) +podman system connection add docker --default +``` + +### Option 4: Test on Linux (Native) + +For full native performance, test on Linux: + +```bash +# Using a VM (Multipass, UTM, etc.) or remote Linux server: +podman run --rm -p 8080:80 nginx:alpine +``` + +## Quick Start with Colima + +```bash +# 1. Install and start Colima +brew install colima +colima start + +# 2. Verify Podman works +podman run --rm nginx:alpine echo "Podman is working!" + +# 3. Test Perry Container Module +cd example-code/container-demo +npm install +npm run build +./container-demo + +# 4. Run the test +perry compile src/test.ts -o test-podman +./test-podman +``` + +## Verifying Podman Connection + +After starting Colima (or other solution), verify: + +```bash +# Check Podman info +podman info + +# List containers +podman ps -a + +# Run a test container +podman run --rm -p 8081:80 nginx:alpine sh -c "echo 'Container is running!' && sleep 5" + +# Test Perry backend detection +podman info --format '{{.HostInfo.OperatingSystem}}' +``` + +You should see: +``` +hostArch: arm64 +os: linux +``` + +## Troubleshooting + +### "Cannot connect to Podman" + +1. **Colima not running:** + ```bash + colima status + colima start + ``` + +2. **Socket not found:** + ```bash + colima stop + colima delete + colima start + ``` + +3. **Permission issues:** + ```bash + # Colima usually handles this, but check: + colima ssh -- ls -la /var/run/podman + ``` + +### "Hardware virtualization not supported" + +This is a macOS limitation. Use Colima or Lima VM-based solutions. + +### "Backend failed to execute" + +1. **Container not found:** + ```bash + podman pull nginx:alpine + ``` + +2. **Port already in use:** + - Use a different port in the test script + - Or stop the conflicting container: + ```bash + podman ps | grep 8081 + podman stop + ``` + +3. **Image pull failed:** + ```bash + podman pull nginx:alpine + podman images + ``` + +## Performance Notes + +- **Colima/Lima VM**: Adds ~1-2 seconds of cold start, but good for development +- **Native Linux**: No VM overhead, best performance +- **macOS native**: Apple Container framework is planned but not yet implemented + +## Testing Perry Container Module + +Once Podman is working: + +```bash +# Compile test +cd example-code/container-demo +perry compile src/test.ts -o test-podman + +# Run test +./test-podman +``` + +Expected output: +``` +============================================================ +Perry Container Module - Integration Test +============================================================ + +1. Checking backend... + ✓ Backend: podman + +2. Listing containers... + ✓ Found 0 container(s) + +3. Running test container... + ✓ Container started: 8f2e9b3a1c2d + ✓ Container name: perry-test-nginx + +4. Waiting for container to initialize... + +5. Inspecting container... + ✓ Image: nginx:alpine + ✓ Status: running + ✓ Ports: 0.0.0.0.8081->80/tcp + ✓ Created: 2024-04-14T12:34:56.789012345Z + +6. Listing containers (should show running container)... + ✓ Found 1 container(s): + - perry-test-nginx (running) + +7. Stopping container... + ✓ Container stopped + +8. Removing container... + ✓ Container removed + +9. Verifying cleanup... + ✓ All containers cleaned up + +============================================================ +✓ All tests completed successfully! +============================================================ +``` + +## Next Steps + +1. Install Colima (or use Lima/Docker Desktop) +2. Start the VM: `colima start` +3. Verify Podman: `podman run --rm nginx:alpine echo "Hello!"` +4. Test Perry: `./test-podman` +5. Try the demo: `./container-demo` + +## Additional Resources + +- [Colima Documentation](https://github.com/abiosoft/colima) +- [Lima Documentation](https://github.com/lima-vm/lima) +- [Podman on macOS](https://docs.podman.io/en/latest/installation/macOS) +- [Perry Container Module](../../types/perry/container/index.d.ts) diff --git a/example-code/container-demo/QUICKSTART.md b/example-code/container-demo/QUICKSTART.md new file mode 100644 index 000000000..6d81bbe59 --- /dev/null +++ b/example-code/container-demo/QUICKSTART.md @@ -0,0 +1,289 @@ +# Perry Container Module - Quick Test Guide + +## Status + +✅ **Perry Container Module**: Successfully compiled and ready to test +✅ **Podman**: Installed (version 5.3.2) +❌ **Podman VM**: Not running (hardware virtualization not supported) + +## Quick Start + +### Option 1: Install Colima (Recommended) + +```bash +# Install Colima +brew install colima + +# Start Colima VM +colima start + +# Run verification +cd example-code/container-demo +./verify-podman.sh +``` + +### Option 2: Use Docker Desktop (Alternative) + +```bash +# Install Docker Desktop +brew install --cask docker + +# Start Docker Desktop +open -a Docker + +# Run verification +cd example-code/container-demo +./verify-podman.sh +``` + +## Run Tests + +Once Podman is working: + +```bash +# Navigate to demo directory +cd example-code/container-demo + +# Install dependencies (if needed) +npm install + +# Run verification script +./verify-podman.sh + +# Run Perry container tests +npm test + +# Or compile and run manually +perry compile src/test.ts -o test-podman +./test-podman + +# Run the main demo +npm run build +./container-demo +``` + +## What the Tests Do + +### verify-podman.sh + +1. ✅ Checks Podman installation +2. ✅ Checks/starts Colima VM +3. ✅ Tests Podman connection +4. ✅ Pulls test image (nginx:alpine) +5. ✅ Runs quick container test +6. ✅ Cleans up + +### test.ts (Perry Container Module) + +1. ✅ Gets backend information +2. ✅ Lists containers +3. ✅ Runs a test container (nginx:alpine) +4. ✅ Waits for initialization +5. ✅ Inspects container details +6. ✅ Lists containers again +7. ✅ Stops the container +8. ✅ Removes the container +9. ✅ Verifies cleanup + +## Expected Output + +### verify-podman.sh + +``` +============================================================ +Perry Container Module - Podman Setup & Verification +============================================================ + +1. Checking Podman installation... + ✓ Podman installed: podman version 5.3.2 + +2. Checking Colima (recommended solution)... + ✓ Colima installed: colima version 0.7.6 + +3. Checking Colima VM status... + ✓ Colima VM is running + ✓ Podman should be accessible + +4. Testing Podman connection... + ✓ Podman is accessible + ✓ Host OS: linux + ✓ Host Arch: arm64 + +5. Checking for test image... + ✓ Test image exists + +6. Running Podman test container... + ✓ Test container started: 8f2e9b3a1c2d + ✓ Container running + Container logs: + /docker-entrypoint.sh: /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh + /docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh + /docker-entrypoint.sh: done + +7. Cleaning up test container... + ✓ Test container removed + +============================================================ +✓ Podman is ready for Perry Container Module! +============================================================ +``` + +### test.ts + +``` +============================================================ +Perry Container Module - Integration Test +============================================================ + +1. Checking backend... + ✓ Backend: podman + +2. Listing containers... + ✓ Found 0 container(s) + +3. Running test container... + ✓ Container started: 8f2e9b3a1c2d + ✓ Container name: perry-test-nginx + +4. Waiting for container to initialize... + +5. Inspecting container... + ✓ Image: nginx:alpine + ✓ Status: running + ✓ Ports: 0.0.0.0:8081->80/tcp + ✓ Created: 2024-04-14T12:34:56.789012345Z + +6. Listing containers (should show running container)... + ✓ Found 1 container(s): + - perry-test-nginx (running) + +7. Stopping container... + ✓ Container stopped + +8. Removing container... + ✓ Container removed + +9. Verifying cleanup... + ✓ All containers cleaned up + +============================================================ +✓ All tests completed successfully! +============================================================ +``` + +## Troubleshooting + +### "Hardware virtualization not supported" + +**Solution:** Use Colima or Lima VM +```bash +brew install colima +colima start +``` + +### "Cannot connect to Podman" + +**Solution 1:** Start Colima +```bash +colima start +``` + +**Solution 2:** Reset Colima +```bash +colima stop +colima delete +colima start +``` + +**Solution 3:** Use Docker Desktop +```bash +open -a Docker +``` + +### "Backend failed to execute" + +**Solution:** Pull the test image first +```bash +podman pull nginx:alpine +podman images +``` + +### "Port already in use" + +**Solution:** Change port in test.ts or stop conflicting container +```bash +podman ps | grep 8081 +podman stop +``` + +## Advanced: Compose Orchestration Test + +Once basic tests pass, try Compose: + +```typescript +// Create compose-test.ts +import { composeUp } from 'perry/container'; + +async function main() { + const compose = await composeUp({ + version: '3.8', + services: { + web: { + image: 'nginx:alpine', + ports: ['8080:80'], + }, + redis: { + image: 'redis:alpine', + ports: ['6379:6379'], + }, + }, + }); + + console.log('Compose stack started'); + + const services = await compose.ps(); + console.log('Services:', services.map(s => s.name).join(', ')); + + await compose.down({ volumes: false }); + console.log('Compose stack stopped'); +} + +main().catch(console.error); +``` + +```bash +perry compose-test.ts -o compose-test +./compose-test +``` + +## Performance Notes + +- **Colima VM**: ~1-2s cold start, good for development +- **Native Linux**: No VM overhead, best performance +- **Apple Container**: Planned for future macOS/iOS support + +## Documentation + +- [Podman Setup Guide](PODMAN_SETUP.md) - Detailed setup instructions +- [Full README](README.md) - Complete documentation +- [TypeScript Types](../../types/perry/container/index.d.ts) - API reference +- [Implementation Summary](../../.comate/specs/perry-container/summary.md) - Technical details + +## Next Steps + +1. ✅ Install and start Colima (or alternative) +2. ✅ Run `./verify-podman.sh` to verify Podman +3. ✅ Run `npm test` to test Perry Container Module +4. ✅ Try the main demo: `npm run build && ./container-demo` +5. ✅ Explore Compose orchestration +6. ✅ Read the full documentation + +## Help + +For issues: +1. Check [PODMAN_SETUP.md](PODMAN_SETUP.md) for detailed troubleshooting +2. Check Podman logs: `colima logs` +3. Verify Perry compilation: `cargo build --release -p perry-stdlib --features container` +4. Report bugs on GitHub + +Happy containerizing! 🚀 diff --git a/example-code/container-demo/README.md b/example-code/container-demo/README.md new file mode 100644 index 000000000..5bb91a82e --- /dev/null +++ b/example-code/container-demo/README.md @@ -0,0 +1,223 @@ +# Perry Container Module Demo + +This example demonstrates the `perry/container` module for managing OCI containers from compiled Perry applications. + +## Prerequisites + +### Required Backend + +The `perry/container` module requires a container runtime: + +**macOS / iOS:** +- Currently uses Podman (apple/container support coming soon) +- Install: `brew install podman` +- Initialize: `podman machine init && podman machine start` + +**Linux:** +- Podman is the native backend +- Install: `sudo apt install podman` (Debian/Ubuntu) + or: `sudo dnf install podman` (Fedora/RHEL) + +**Windows:** +- Podman Desktop (WSL2 backend) + +## Quick Start + +```bash +# Install dependencies +npm install + +# Compile +npm run build + +# Run +./container-demo +``` + +## What It Does + +This example demonstrates: + +1. **Backend Detection**: Shows which container backend is being used +2. **Run Container**: Starts an nginx:alpine container with port mapping +3. **List Containers**: Queries and displays all running containers +4. **Inspect Container**: Retrieves detailed information about a container +5. **Stop Container**: Gracefully stops the running container +6. **Remove Container**: Removes the stopped container + +## Expected Output + +``` +Perry Container Module Demo +============================= + +Using backend: podman + +Example 1: Running nginx container... +Container started: 8f2e9b3a1c2d + +Example 2: Listing containers... +Found 1 container(s): + - demo-nginx (8f2e9b3a1c2): running + +Example 3: Inspecting container... +Container demo-nginx: + Image: nginx:alpine + Status: running + Ports: 0.0.0.0:8080->80/tcp + Created: 2024-04-14T12:34:56.789012345Z + +Example 4: Stopping container... +Container stopped + +Example 5: Removing container... +Container removed +``` + +## Advanced Usage + +### Compose Orchestration + +The `perry/container` module supports Docker Compose-like multi-container orchestration: + +```typescript +import { composeUp } from 'perry/container'; + +const compose = await composeUp({ + version: '3.8', + services: { + web: { + image: 'nginx:alpine', + ports: ['8080:80'], + }, + db: { + image: 'postgres:15-alpine', + environment: { + POSTGRES_PASSWORD: 'example', + }, + }, + }, +}); + +// Get services +const services = await compose.ps(); + +// Stop and remove +await compose.down(); +``` + +### Image Management + +```typescript +import { pullImage, listImages, removeImage } from 'perry/container'; + +// Pull an image +await pullImage('alpine:latest'); + +// List all images +const images = await listImages(); +for (const img of images) { + console.log(`${img.repository}:${img.tag} (${img.size} bytes)`); +} + +// Remove an image +await removeImage('alpine:latest'); +``` + +### Container Logs + +```typescript +import { logs } from 'perry/container'; + +// Get recent logs +const logs = await logs(containerId, { tail: 100 }); +console.log('STDOUT:', logs.stdout); +console.log('STDERR:', logs.stderr); +``` + +## TypeScript Support + +Full TypeScript type definitions are included: + +```typescript +import type { ContainerSpec, ContainerInfo, ContainerLogs } from 'perry/container'; + +const spec: ContainerSpec = { + image: 'nginx:alpine', + name: 'my-nginx', + ports: ['8080:80'], + env: { ENV_VAR: 'value' }, +}; + +const info: ContainerInfo = await inspect(spec.name); +console.log(info.status); +``` + +## Platform Notes + +### macOS / iOS + +Currently uses Podman backend. Apple Container framework support is planned. + +### Linux + +Native Podman backend with full feature support. + +### Windows + +Podman Desktop with WSL2 backend (experimental). + +## Building for Different Targets + +```bash +# Native binary (default) +perry compile src/main.ts -o container-demo + +# macOS +perry compile src/main.ts --target macos -o container-demo-macos + +# Linux +perry compile src/main.ts --target linux -o container-demo-linux + +# Windows +perry compile src/main.ts --target windows -o container-demo.exe +``` + +## Troubleshooting + +### "podman binary not found" + +Install Podman: +- macOS: `brew install podman` +- Debian/Ubuntu: `sudo apt install podman` +- Fedora/RHEL: `sudo dnf install podman` + +### "Backend failed to execute" + +Make sure the Podman daemon is running: +```bash +# macOS +podman machine start + +# Linux (user mode) +# Podman runs in rootless mode by default, no daemon needed +``` + +### "Permission denied" + +Ensure your user is in the appropriate groups: +```bash +# Linux (if using rootless mode) +sudo usermod -aG podman $USER +``` + +## Further Reading + +- [Perry Documentation](https://perryts.github.io/perry/) +- [Perry Container Module API](./types/perry/container/index.d.ts) +- [Podman Documentation](https://docs.podman.io/) +- [Docker Compose Reference](https://docs.docker.com/compose/) + +## License + +MIT diff --git a/example-code/container-demo/package-lock.json b/example-code/container-demo/package-lock.json new file mode 100644 index 000000000..a567eac99 --- /dev/null +++ b/example-code/container-demo/package-lock.json @@ -0,0 +1,12 @@ +{ + "name": "perry-container-demo", + "version": "1.0.0", + "lockfileVersion": 3, + "requires": true, + "packages": { + "": { + "name": "perry-container-demo", + "version": "1.0.0" + } + } +} diff --git a/example-code/container-demo/package.json b/example-code/container-demo/package.json new file mode 100644 index 000000000..31ac1f473 --- /dev/null +++ b/example-code/container-demo/package.json @@ -0,0 +1,15 @@ +{ + "name": "perry-container-demo", + "version": "1.0.0", + "description": "Example demonstrating Perry's container module", + "main": "src/main.ts", + "scripts": { + "build": "perry compile src/main.ts -o container-demo", + "run": "perry run .", + "test": "perry compile src/test.ts -o test-podman && ./test-podman", + "verify": "./verify-podman.sh" + }, + "keywords": ["perry", "container", "podman", "oci"], + "author": "Perry", + "license": "MIT" +} diff --git a/example-code/container-demo/src/main.ts b/example-code/container-demo/src/main.ts new file mode 100644 index 000000000..64cd1fd2f --- /dev/null +++ b/example-code/container-demo/src/main.ts @@ -0,0 +1,101 @@ +/** + * Perry Container Module Example + * + * Demonstrates basic container operations using perry/container module. + * + * Compile: perry compile src/main.ts -o container-demo + * Run: ./container-demo + */ + +import { run, create, start, stop, remove, list, inspect, getBackend } from 'perry/container'; + +async function main() { + console.log('Perry Container Module Demo'); + console.log('=============================\n'); + + // Get current backend + const backend = getBackend(); + console.log(`Using backend: ${backend}\n`); + + // Example 1: Run a simple container + console.log('Example 1: Running nginx container...'); + try { + const nginx = await run({ + image: 'nginx:alpine', + name: 'demo-nginx', + ports: ['8080:80'], + rm: true, + }); + console.log(`Container started: ${nginx.id}\n`); + + // Wait a bit + await new Promise(resolve => setTimeout(resolve, 2000)); + + // List containers + console.log('Example 2: Listing containers...'); + const containers = await list(); + console.log(`Found ${containers.length} container(s):`); + for (const c of containers) { + console.log(` - ${c.name} (${c.id.slice(0, 12)}): ${c.status}`); + } + console.log(''); + + // Inspect our container + console.log('Example 3: Inspecting container...'); + const info = await inspect(nginx.id); + console.log(`Container ${info.name}:`); + console.log(` Image: ${info.image}`); + console.log(` Status: ${info.status}`); + console.log(` Ports: ${info.ports.join(', ')}`); + console.log(` Created: ${info.created}`); + console.log(''); + + // Stop and remove the container + console.log('Example 4: Stopping container...'); + await stop(nginx.id); + console.log('Container stopped\n'); + + console.log('Example 5: Removing container...'); + await remove(nginx.id); + console.log('Container removed\n'); + + } catch (error) { + console.error('Error:', error); + console.log('\nNote: Make sure Podman is installed and running on your system.'); + console.log('On macOS: brew install podman && podman machine init && podman machine start'); + console.log('On Linux: sudo apt install podman'); + } + + // Example 6: Compose orchestration (requires more complete implementation) + /* + console.log('Example 6: Compose orchestration...'); + try { + const compose = await composeUp({ + version: '3.8', + services: { + web: { + image: 'nginx:alpine', + ports: ['8080:80'], + }, + db: { + image: 'postgres:15-alpine', + environment: { + POSTGRES_PASSWORD: 'example', + }, + }, + }, + }); + + console.log('Compose stack started'); + const services = await compose.ps(); + console.log(`Services: ${services.length}`); + + await compose.down(); + console.log('Compose stack stopped'); + } catch (error) { + console.error('Compose error:', error); + } + */ +} + +main().catch(console.error); diff --git a/example-code/container-demo/src/test.ts b/example-code/container-demo/src/test.ts new file mode 100644 index 000000000..433b2187b --- /dev/null +++ b/example-code/container-demo/src/test.ts @@ -0,0 +1,152 @@ +/** + * Perry Container Module Test Script + * + * Tests basic container operations using perry/container module. + * Requires Podman to be running. + */ + +import { run, create, start, stop, remove, list, inspect, getBackend } from 'perry/container'; + +async function main() { + console.log('='.repeat(60)); + console.log('Perry Container Module - Integration Test'); + console.log('='.repeat(60)); + console.log(); + + // 1. Get backend info + console.log('1. Checking backend...'); + try { + const backend = getBackend(); + console.log(` ✓ Backend: ${backend}`); + console.log(); + } catch (error) { + console.log(` ✗ Error: ${error}`); + console.log(' This usually means the module is not available or Podman is not running.'); + process.exit(1); + } + + // 2. List containers (should be empty initially) + console.log('2. Listing containers...'); + try { + const containers = await list(); + console.log(` ✓ Found ${containers.length} container(s)`); + if (containers.length > 0) { + for (const c of containers) { + console.log(` - ${c.name} (${c.id.slice(0, 12)}) - ${c.status}`); + } + } + console.log(); + } catch (error) { + console.log(` ✗ Error: ${error}`); + console.log(' This means Podman is not accessible.'); + console.log(); + console.log('Troubleshooting:'); + console.log(' 1. Start Podman machine:'); + console.log(' podman machine start'); + console.log(' 2. Or use rootless mode (Linux only):'); + console.log(' podman info'); + console.log(' 3. Check Podman socket:'); + console.log(' podman system connection list'); + process.exit(1); + } + + // 3. Run a simple container + console.log('3. Running test container...'); + try { + const container = await run({ + image: 'nginx:alpine', + name: 'perry-test-nginx', + ports: ['8081:80'], + env: { + TEST_VAR: 'hello', + }, + }); + console.log(` ✓ Container started: ${container.id}`); + console.log(` ✓ Container name: ${container.name || 'unnamed'}`); + console.log(); + + // 4. Wait a bit + console.log('4. Waiting for container to initialize...'); + await new Promise(resolve => setTimeout(resolve, 2000)); + console.log(); + + // 5. Inspect the container + console.log('5. Inspecting container...'); + try { + const info = await inspect(container.id); + console.log(` ✓ Image: ${info.image}`); + console.log(` ✓ Status: ${info.status}`); + console.log(` ✓ Ports: ${info.ports.join(', ') || 'none'}`); + console.log(` ✓ Created: ${info.created}`); + console.log(); + } catch (error) { + console.log(` ✗ Inspect failed: ${error}`); + } + + // 6. List containers again + console.log('6. Listing containers (should show running container)...'); + try { + const containers = await list(); + console.log(` ✓ Found ${containers.length} container(s):`); + for (const c of containers) { + console.log(` - ${c.name} (${c.status})`); + } + console.log(); + } catch (error) { + console.log(` ✗ List failed: ${error}`); + } + + // 7. Stop the container + console.log('7. Stopping container...'); + try { + await stop(container.id, 5); // 5 second timeout + console.log(` ✓ Container stopped`); + console.log(); + } catch (error) { + console.log(` ✗ Stop failed: ${error}`); + } + + // 8. Remove the container + console.log('8. Removing container...'); + try { + await remove(container.id); + console.log(` ✓ Container removed`); + console.log(); + } catch (error) { + console.log(` ✗ Remove failed: ${error}`); + } + + // 9. Verify cleanup + console.log('9. Verifying cleanup...'); + try { + const containers = await list(); + if (containers.length === 0) { + console.log(' ✓ All containers cleaned up'); + } else { + console.log(` ! Warning: ${containers.length} container(s) still exist`); + for (const c of containers) { + console.log(` - ${c.name}`); + } + } + console.log(); + } catch (error) { + console.log(` ✗ Verification failed: ${error}`); + } + + console.log('='.repeat(60)); + console.log('✓ All tests completed successfully!'); + console.log('='.repeat(60)); + + } catch (error) { + console.log(` ✗ Run failed: ${error}`); + console.log(); + console.log('Common issues:'); + console.log(' 1. Podman not running: Start with "podman machine start"'); + console.log(' 2. Image not found: Run "podman pull nginx:alpine" first'); + console.log(' 3. Permission denied: Check Podman permissions'); + console.log(' 4. Port in use: Use a different port (e.g., 8082:80)'); + process.exit(1); + } +} + +main().catch(console.error); diff --git a/example-code/container-demo/test-import.ts b/example-code/container-demo/test-import.ts new file mode 100644 index 000000000..16efe6ad6 --- /dev/null +++ b/example-code/container-demo/test-import.ts @@ -0,0 +1,19 @@ +/** + * Quick test to verify perry/container module can be imported + */ + +import { run, create, start, stop, remove, list, inspect, getBackend } from 'perry/container'; + +console.log('Successfully imported perry/container module'); +console.log('Available functions:', { + run: typeof run, + create: typeof create, + start: typeof start, + stop: typeof stop, + remove: typeof remove, + list: typeof list, + inspect: typeof inspect, + getBackend: typeof getBackend, +}); + +console.log('Backend:', getBackend()); diff --git a/example-code/container-demo/verify-podman.sh b/example-code/container-demo/verify-podman.sh new file mode 100755 index 000000000..3f432d501 --- /dev/null +++ b/example-code/container-demo/verify-podman.sh @@ -0,0 +1,119 @@ +#!/bin/bash +# Quick Podman verification and setup script for Perry Container Module + +set -e + +echo "============================================================" +echo "Perry Container Module - Podman Setup & Verification" +echo "============================================================" +echo "" + +# Check Podman installation +echo "1. Checking Podman installation..." +if command -v podman &> /dev/null; then + PODMAN_VERSION=$(podman --version) + echo " ✓ Podman installed: $PODMAN_VERSION" +else + echo " ✗ Podman not found" + echo " Install with: brew install podman" + exit 1 +fi +echo "" + +# Check Colima +echo "2. Checking Colima (recommended solution)..." +if command -v colima &> /dev/null; then + echo " ✓ Colima installed: $(colima version | head -1)" +else + echo " ! Colima not found (recommended)" + echo " Install with: brew install colima" + COLIMA_MISSING=true +fi +echo "" + +# Check if Colima is running +if command -v colima &> /dev/null; then + echo "3. Checking Colima VM status..." + if colima status &> /dev/null; then + echo " ✓ Colima VM is running" + echo " ✓ Podman should be accessible" + else + echo " ! Colima VM is not running" + echo " Starting Colima..." + colima start + echo " ✓ Colima VM started" + fi + echo "" +else + echo "3. Skipping Colima check (not installed)" + echo "" +fi + +# Test Podman connection +echo "4. Testing Podman connection..." +if podman info &> /dev/null; then + echo " ✓ Podman is accessible" + HOST_OS=$(podman info --format '{{.HostInfo.OperatingSystem}}') + HOST_ARCH=$(podman info --format '{{.HostInfo.Arch}}') + echo " ✓ Host OS: $HOST_OS" + echo " ✓ Host Arch: $HOST_ARCH" +else + echo " ✗ Cannot connect to Podman" + echo "" + echo " Solutions:" + echo " 1. Start Colima: colima start" + echo " 2. Or use Lima: limactl start --name=perry-dev" + echo " 3. Or Docker Desktop: open -a Docker" + exit 1 +fi +echo "" + +# Pull test image if needed +echo "5. Checking for test image..." +if podman images | grep -q "nginx.*alpine"; then + echo " ✓ Test image exists" +else + echo " ! Pulling test image (nginx:alpine)..." + podman pull nginx:alpine + echo " ✓ Test image pulled" +fi +echo "" + +# Run quick Podman test +echo "6. Running Podman test container..." +CONTAINER_ID=$(podman run -d --name perry-quick-test -p 8082:80 nginx:alpine) +echo " ✓ Test container started: $CONTAINER_ID" + +# Wait and verify +sleep 2 +echo " ✓ Container running" + +# Check logs +echo " Container logs:" +podman logs --tail 3 perry-quick-test + +# Cleanup +echo "" +echo "7. Cleaning up test container..." +podman stop perry-quick-test &> /dev/null || true +podman rm perry-quick-test &> /dev/null || true +echo " ✓ Test container removed" +echo "" + +# Summary +echo "============================================================" +echo "✓ Podman is ready for Perry Container Module!" +echo "============================================================" +echo "" +echo "Next steps:" +echo " 1. Navigate to container demo: cd example-code/container-demo" +echo " 2. Install dependencies: npm install" +echo " 3. Run the test: perry compile src/test.ts -o test-podman && ./test-podman" +echo " 4. Or run the demo: perry compile src/main.ts -o container-demo && ./container-demo" +echo "" + +if [ "$COLIMA_MISSING" = true ]; then + echo "Note: Consider installing Colima for better macOS support:" + echo " brew install colima && colima start" + echo "" +fi diff --git a/example-code/fastify-redis-mysql/myapp b/example-code/fastify-redis-mysql/myapp new file mode 100755 index 000000000..ed34eb8cd Binary files /dev/null and b/example-code/fastify-redis-mysql/myapp differ diff --git a/types/perry/compose/index.d.ts b/types/perry/compose/index.d.ts new file mode 100644 index 000000000..ea825f89f --- /dev/null +++ b/types/perry/compose/index.d.ts @@ -0,0 +1,294 @@ +/** + * perry/compose — TypeScript bindings for perry-container-compose + * + * Docker Compose-like experience for Apple Container, powered by Perry. + * + * @module perry/compose + */ + +// ============ Configuration Types ============ + +/** + * Build configuration for a service image. + */ +export interface Build { + /** Build context directory (relative to compose file) */ + context?: string; + /** Path to Dockerfile */ + dockerfile?: string; + /** Build-time arguments */ + args?: Record; + /** Labels to add to the built image */ + labels?: Record; + /** Build target stage */ + target?: string; + /** Network to use during build */ + network?: string; +} + +/** + * A single service definition in a Compose file. + */ +export interface Service { + /** Container image reference */ + image?: string; + /** Explicit container name */ + container_name?: string; + /** Port mappings, e.g. "8080:80" */ + ports?: string[]; + /** Environment variables (map or KEY=VALUE list) */ + environment?: Record | string[]; + /** Container labels */ + labels?: Record; + /** Volume mounts, e.g. "./data:/data:ro" */ + volumes?: string[]; + /** Build configuration */ + build?: Build; + /** Service dependencies */ + depends_on?: string[] | Record; + /** Restart policy */ + restart?: "no" | "always" | "on-failure" | "unless-stopped"; + /** Override container entrypoint */ + entrypoint?: string | string[]; + /** Override container command */ + command?: string | string[]; + /** Networks this service is attached to */ + networks?: string[]; +} + +/** + * Network definition in a Compose file. + */ +export interface ComposeNetwork { + driver?: string; + external?: boolean; + name?: string; +} + +/** + * Volume definition in a Compose file. + */ +export interface ComposeVolume { + driver?: string; + external?: boolean; + name?: string; +} + +/** + * Root Compose file structure (docker-compose.yaml / compose.yaml). + */ +export interface ComposeSpec { + version?: string; + services: Record; + networks?: Record; + volumes?: Record; +} + +// ============ Operation Result Types ============ + +/** + * Status of a service container. + */ +export type ContainerStatusString = "running" | "stopped" | "not_found"; + +/** + * Service status entry from the `ps` command. + */ +export interface ServiceStatus { + /** Service name as defined in the compose file */ + service: string; + /** Container name */ + container: string; + /** Current container status */ + status: ContainerStatusString; +} + +/** + * Result of an exec call inside a container. + */ +export interface ExecResult { + stdout: string; + stderr: string; + exitCode: number; +} + +/** + * Generic FFI result wrapper. + */ +export interface ComposeResult { + ok: boolean; + result?: T; + error?: string; +} + +// ============ Options Types ============ + +export interface UpOptions { + /** Start in detached mode (default: true) */ + detach?: boolean; + /** Build images before starting */ + build?: boolean; + /** Services to start (empty = all) */ + services?: string[]; + /** Remove orphaned containers */ + removeOrphans?: boolean; +} + +export interface DownOptions { + /** Remove named volumes */ + volumes?: boolean; + /** Remove orphaned containers */ + removeOrphans?: boolean; + /** Services to remove (empty = all) */ + services?: string[]; +} + +export interface LogsOptions { + /** Follow log output */ + follow?: boolean; + /** Number of lines to show from the end */ + tail?: number; + /** Show timestamps */ + timestamps?: boolean; +} + +export interface ExecOptions { + /** User context */ + user?: string; + /** Working directory */ + workdir?: string; + /** Additional environment variables */ + env?: Record; +} + +export interface ConfigOptions { + /** Output format: "yaml" | "json" */ + format?: "yaml" | "json"; +} + +// ============ API Functions ============ + +/** + * Bring up services defined in a compose file. + * + * @param file - Path to compose file (default: "compose.yaml") + * @param options - Up options + * + * @example + * ```typescript + * import { up } from 'perry/compose'; + * await up('compose.yaml', { detach: true }); + * ``` + */ +export function up(file?: string, options?: UpOptions): Promise; + +/** + * Stop and remove services. + * + * @param file - Path to compose file + * @param options - Down options + * + * @example + * ```typescript + * import { down } from 'perry/compose'; + * await down('compose.yaml', { volumes: true }); + * ``` + */ +export function down(file?: string, options?: DownOptions): Promise; + +/** + * List service statuses. + * + * @param file - Path to compose file + * @returns Array of ServiceStatus entries + * + * @example + * ```typescript + * import { ps } from 'perry/compose'; + * const statuses = await ps('compose.yaml'); + * console.table(statuses); + * ``` + */ +export function ps(file?: string): Promise; + +/** + * Get logs from services. + * + * @param file - Path to compose file + * @param services - Services to get logs from (empty = all) + * @param options - Log options + * @returns Map of service name → log output + * + * @example + * ```typescript + * import { logs } from 'perry/compose'; + * const output = await logs('compose.yaml', ['web'], { tail: 100 }); + * ``` + */ +export function logs( + file?: string, + services?: string[], + options?: LogsOptions +): Promise>; + +/** + * Execute a command in a running service container. + * + * @param file - Path to compose file + * @param service - Service name + * @param cmd - Command and arguments to execute + * @param options - Exec options + * + * @example + * ```typescript + * import { exec } from 'perry/compose'; + * const result = await exec('compose.yaml', 'web', ['sh', '-c', 'ls /app']); + * console.log(result.stdout); + * ``` + */ +export function exec( + file: string, + service: string, + cmd: string[], + options?: ExecOptions +): Promise; + +/** + * Validate and display the parsed compose configuration. + * + * @param file - Path to compose file + * @param options - Config options + * @returns Validated configuration as YAML or JSON string + * + * @example + * ```typescript + * import { config } from 'perry/compose'; + * const yaml = await config('compose.yaml'); + * console.log(yaml); + * ``` + */ +export function config(file?: string, options?: ConfigOptions): Promise; + +/** + * Start existing stopped services (does not create new containers). + * + * @param file - Path to compose file + * @param services - Services to start (empty = all) + */ +export function start(file?: string, services?: string[]): Promise; + +/** + * Stop running services (does not remove containers). + * + * @param file - Path to compose file + * @param services - Services to stop (empty = all) + */ +export function stop(file?: string, services?: string[]): Promise; + +/** + * Restart services. + * + * @param file - Path to compose file + * @param services - Services to restart (empty = all) + */ +export function restart(file?: string, services?: string[]): Promise; diff --git a/types/perry/compose/package.json b/types/perry/compose/package.json new file mode 100644 index 000000000..066569cd9 --- /dev/null +++ b/types/perry/compose/package.json @@ -0,0 +1,18 @@ +{ + "name": "perry/compose", + "version": "0.1.0", + "description": "TypeScript bindings for perry-container-compose — Docker Compose-like experience for Apple Container", + "types": "index.d.ts", + "perry": { + "native": "perry-container-compose", + "backend": "apple-container" + }, + "keywords": [ + "perry", + "container", + "compose", + "apple-container", + "docker-compose" + ], + "license": "MIT" +} diff --git a/types/perry/container/index.d.ts b/types/perry/container/index.d.ts new file mode 100644 index 000000000..527b867db --- /dev/null +++ b/types/perry/container/index.d.ts @@ -0,0 +1,341 @@ +// Type declarations for perry/container — Perry's OCI container management module +// These types are auto-written by `perry init` / `perry types` so IDEs +// and tsc can resolve `import { ... } from "perry/container"`. + +// --------------------------------------------------------------------------- +// Container Lifecycle +// --------------------------------------------------------------------------- + +/** + * Configuration for a single container. + */ +export interface ContainerSpec { + /** Container image (required) */ + image: string; + /** Container name (optional) */ + name?: string; + /** Port mappings (e.g., "8080:80") */ + ports?: string[]; + /** Volume mounts (e.g., "/host/path:/container/path:ro") */ + volumes?: string[]; + /** Environment variables */ + env?: Record; + /** Command to run (overrides image CMD) */ + cmd?: string[]; + /** Entrypoint (overrides image ENTRYPOINT) */ + entrypoint?: string[]; + /** Network to attach to */ + network?: string; + /** Remove container on exit */ + rm?: boolean; +} + +/** + * Handle to a container instance. + */ +export interface ContainerHandle { + /** Container ID */ + id: string; + /** Container name (if specified) */ + name?: string; +} + +/** + * Run a container from the given spec. + * @param spec Container configuration + * @returns Promise resolving to ContainerHandle + */ +export function run(spec: ContainerSpec): Promise; + +/** + * Create a container from the given spec without starting it. + * @param spec Container configuration + * @returns Promise resolving to ContainerHandle + */ +export function create(spec: ContainerSpec): Promise; + +/** + * Start a previously created container. + * @param id Container ID or name + * @returns Promise resolving when container is started + */ +export function start(id: string): Promise; + +/** + * Stop a running container. + * @param id Container ID or name + * @param timeout Timeout in seconds before force-terminating (default: 10) + * @returns Promise resolving when container is stopped + */ +export function stop(id: string, timeout?: number): Promise; + +/** + * Remove a container. + * @param id Container ID or name + * @param force If true, stop and remove a running container + * @returns Promise resolving when container is removed + */ +export function remove(id: string, force?: boolean): Promise; + +// --------------------------------------------------------------------------- +// Container Inspection and Listing +// --------------------------------------------------------------------------- + +/** + * Information about a container. + */ +export interface ContainerInfo { + /** Container ID */ + id: string; + /** Container name */ + name: string; + /** Image reference */ + image: string; + /** Container status (e.g., "running", "exited") */ + status: string; + /** Port mappings */ + ports: string[]; + /** Creation timestamp (ISO 8601) */ + created: string; +} + +/** + * List containers. + * @param all If true, include stopped containers + * @returns Promise resolving to array of ContainerInfo + */ +export function list(all?: boolean): Promise; + +/** + * Inspect a container. + * @param id Container ID or name + * @returns Promise resolving to ContainerInfo + */ +export function inspect(id: string): Promise; + +// --------------------------------------------------------------------------- +// Container Logs and Exec +// --------------------------------------------------------------------------- + +/** + * Logs captured from a container. + */ +export interface ContainerLogs { + /** Standard output */ + stdout: string; + /** Standard error */ + stderr: string; +} + +/** + * Get logs from a container. + * @param id Container ID or name + * @param options Options for logs + * @returns Promise resolving to ContainerLogs or ReadableStream + */ +export function logs( + id: string, + options?: { + /** If true, return a ReadableStream of log lines */ + follow?: boolean; + /** Number of lines to return from the end */ + tail?: number; + } +): Promise>; + +/** + * Execute a command in a running container. + * @param id Container ID or name + * @param cmd Command to execute + * @param options Options for exec + * @returns Promise resolving to ContainerLogs + */ +export function exec( + id: string, + cmd: string[], + options?: { + /** Environment variables */ + env?: Record; + /** Working directory */ + workdir?: string; + } +): Promise; + +// --------------------------------------------------------------------------- +// Image Management +// --------------------------------------------------------------------------- + +/** + * Information about a container image. + */ +export interface ImageInfo { + /** Image ID */ + id: string; + /** Repository name */ + repository: string; + /** Image tag */ + tag: string; + /** Image size in bytes */ + size: number; + /** Creation timestamp (ISO 8601) */ + created: string; +} + +/** + * Pull a container image from a registry. + * @param reference Image reference (e.g., "alpine:latest", "cgr.dev/chainguard/alpine-base@sha256:...") + * @returns Promise resolving when image is pulled + */ +export function pullImage(reference: string): Promise; + +/** + * List images in the local cache. + * @returns Promise resolving to array of ImageInfo + */ +export function listImages(): Promise; + +/** + * Remove an image from the local cache. + * @param reference Image reference + * @param force If true, remove even if image is in use + * @returns Promise resolving when image is removed + */ +export function removeImage(reference: string, force?: boolean): Promise; + +// --------------------------------------------------------------------------- +// Compose (Multi-Container Orchestration) +// --------------------------------------------------------------------------- + +/** + * Multi-container application specification. + */ +export interface ComposeSpec { + /** Compose file version */ + version?: string; + /** Service definitions */ + services: Record; + /** Network definitions */ + networks?: Record; + /** Volume definitions */ + volumes?: Record; +} + +/** + * Service definition in Compose. + */ +export interface ComposeService { + /** Container image */ + image: string; + /** Build configuration */ + build?: { + /** Build context directory */ + context: string; + /** Dockerfile path (relative to context) */ + dockerfile?: string; + }; + /** Command to run */ + command?: string | string[]; + /** Environment variables */ + environment?: Record | string[]; + /** Port mappings */ + ports?: string[]; + /** Volume mounts */ + volumes?: string[]; + /** Networks to attach to */ + networks?: string[]; + /** Service dependencies */ + depends_on?: string[]; + /** Restart policy */ + restart?: string; + /** Healthcheck configuration */ + healthcheck?: ComposeHealthcheck; +} + +/** + * Healthcheck configuration. + */ +export interface ComposeHealthcheck { + /** Test command (string or array) */ + test: string | string[]; + /** Check interval (e.g., "30s") */ + interval?: string; + /** Timeout (e.g., "10s") */ + timeout?: string; + /** Number of retries before unhealthy */ + retries?: number; + /** Startup grace period (e.g., "40s") */ + start_period?: string; +} + +/** + * Network configuration. + */ +export interface ComposeNetwork { + /** Network driver */ + driver?: string; + /** External network reference */ + external?: boolean; + /** Network name */ + name?: string; +} + +/** + * Volume configuration. + */ +export interface ComposeVolume { + /** Volume driver */ + driver?: string; + /** External volume reference */ + external?: boolean; + /** Volume name */ + name?: string; +} + +/** + * Handle to a Compose stack. + */ +export interface ComposeHandle { + /** Stop and remove all resources in the stack */ + down(options?: { + /** If true, also remove named volumes */ + volumes?: boolean; + }): Promise; + + /** Get container info for all services in the stack */ + ps(): Promise; + + /** Get logs from the stack */ + logs(options?: { + /** Get logs only from this service */ + service?: string; + /** Number of lines to return from the end */ + tail?: number; + }): Promise; + + /** Execute a command in a service container */ + exec( + service: string, + cmd: string[], + options?: { + /** Environment variables */ + env?: Record; + } + ): Promise; +} + +/** + * Bring up a Compose stack. + * @param spec Compose specification + * @returns Promise resolving to ComposeHandle + */ +export function composeUp(spec: ComposeSpec): Promise; + +// --------------------------------------------------------------------------- +// Platform Information +// --------------------------------------------------------------------------- + +/** + * Get the name of the container backend being used. + * @returns "apple/container" on macOS/iOS, "podman" on all other platforms + */ +export function getBackend(): string; diff --git a/types/perry/container/package.json b/types/perry/container/package.json new file mode 100644 index 000000000..a1e4681de --- /dev/null +++ b/types/perry/container/package.json @@ -0,0 +1,7 @@ +{ + "name": "perry/container", + "version": "0.5.18", + "private": true, + "description": "Type declarations for perry/container - Perry's OCI container management module", + "types": "index.d.ts" +}