NIP-AE: agent engrams (kind:30174) — core memory injection + sprout mem CLI by tlongwell-block · Pull Request #593 · block/sprout

tlongwell-block · 2026-05-15T01:58:34Z

Summary

Implements NIP-AE (Agent Engrams, kind 30174) as the smallest viable surface across relay, CLI, and ACP harness. The goal: let an owner write a small, durable "core memory" that their agent reads once per new session, with no new daemons, no new crates, and minimal moving parts.

What this gives you

sprout mem set core "I am Sami. Be terse." — writes a NIP-44-encrypted, parameterized-replaceable note addressed to your agent.
The agent fetches that note at session creation and injects it as a prompt section between [System] and [Context]. One fetch per session, fail-open on every error path.
No core yet? The agent gets a short onboarding nudge instead, so it learns to ask the user about themselves and create one.

How it's structured

sprout-core::engram — pure primitives, no I/O. Shared by CLI and ACP.

Conversation key (NIP-44 v2 symmetric, so either party reads with their own seckey).
d-tag = lower_hex(HMAC-SHA256(K_c, "agent-memory/v1/d-tag\0" || slug)) — spec vectors pinned byte-for-byte.
Body parse/serialize with strict duplicate-key rejection at any depth (serde Visitor, not a hand-rolled scanner).
Envelope build + head selection (created_at desc, then event-id desc; tombstones honoured).
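
The head-selection rule above (newest created_at first, ties broken by the larger event id, tombstones honoured) can be sketched roughly as follows. This is an illustration only: the `Candidate` struct and the `select_head` / `live_value` names are hypothetical and are not taken from sprout-core.

```rust
/// Hypothetical, trimmed-down view of a decrypted kind:30174 event.
/// The real type in sprout-core is not shown in this PR description.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Candidate {
    pub id: String,      // lowercase-hex event id
    pub created_at: u64, // unix seconds
    pub tombstone: bool, // true if the body marks a deletion
}

/// Pick the winning head: the largest created_at, with ties broken by
/// the lexicographically larger event id (i.e. first in desc/desc order).
pub fn select_head(candidates: &[Candidate]) -> Option<&Candidate> {
    candidates
        .iter()
        .max_by(|a, b| a.created_at.cmp(&b.created_at).then_with(|| a.id.cmp(&b.id)))
}

/// A tombstone that wins head selection means "deleted", so there is no
/// live value even if older non-tombstone events exist.
pub fn live_value(candidates: &[Candidate]) -> Option<&Candidate> {
    select_head(candidates).filter(|c| !c.tombstone)
}
```

Comparing ids as strings is equivalent to comparing them numerically only because they are assumed to be equal-length lowercase hex.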

Relay — kind 30174 added to ALL_KINDS, the per-kind scope allowlist (UsersWrite, same group as KIND_READ_STATE), and is_global_only_kind. A new validate_engram_envelope rejects malformed events (≠1 d, ≠1 p, non-lowercase-hex d, empty content) before they reach NIP-33 replacement, so a bad event can't poison the storage head and become invisible to #p readers.
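
As a rough sketch of the envelope rules listed above (exactly one d tag, exactly one p tag, lowercase-hex d, non-empty content), assuming tags are represented as string vectors. The error enum, the helper names, and the `_sketch` function are hypothetical; the relay's actual validate_engram_envelope signature is not shown in this PR description.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum EnvelopeError {
    TagCount(&'static str), // tag missing, repeated, or without a value
    DTagNotLowerHex,
    EmptyContent,
}

fn is_lower_hex(s: &str) -> bool {
    !s.is_empty() && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
}

/// Value of the only tag called `name`; None if it appears zero or
/// several times, or has no value element.
fn single_tag<'a>(tags: &'a [Vec<String>], name: &str) -> Option<&'a str> {
    let mut values = tags
        .iter()
        .filter(|t| t.first().map(String::as_str) == Some(name))
        .map(|t| t.get(1).map(String::as_str));
    match (values.next(), values.next()) {
        (Some(Some(v)), None) => Some(v),
        _ => None,
    }
}

/// Reject malformed envelopes before they can enter NIP-33 replacement.
pub fn validate_envelope_sketch(
    tags: &[Vec<String>],
    content: &str,
) -> Result<(), EnvelopeError> {
    let d = single_tag(tags, "d").ok_or(EnvelopeError::TagCount("d"))?;
    single_tag(tags, "p").ok_or(EnvelopeError::TagCount("p"))?;
    if !is_lower_hex(d) {
        return Err(EnvelopeError::DTagNotLowerHex);
    }
    if content.is_empty() {
        return Err(EnvelopeError::EmptyContent);
    }
    Ok(())
}
```

The sketch covers only the checks named in this summary; the later commits in this PR tighten the p tag and content rules further.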

sprout mem CLI — ls | get | set | rm. Slug shorthand normalises foo → mem/foo; core is reserved and rm core is refused. set reads from - (stdin). submit_engram parses the relay's {accepted, message} so a duplicate: response (same-second NIP-33 dominated write) surfaces as a Conflict instead of a silent "wrote".
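
The slug shorthand can be illustrated with a minimal sketch. The function names here are hypothetical, and slug validation (the "invalid slug rejected" case in the test plan) is deliberately left out because its rules are not stated in this description.

```rust
/// The reserved slug for the core engram.
pub const CORE_SLUG: &str = "core";

/// Expand the shorthand: `foo` becomes `mem/foo`; `core` and anything
/// already under `mem/` are passed through unchanged.
pub fn normalise_slug(input: &str) -> String {
    if input == CORE_SLUG || input.starts_with("mem/") {
        input.to_string()
    } else {
        format!("mem/{input}")
    }
}

/// Guard for `rm`: the core engram may be overwritten with `set`,
/// but removing it is refused.
pub fn check_rm(input: &str) -> Result<String, String> {
    let slug = normalise_slug(input);
    if slug == CORE_SLUG {
        Err("refusing to remove the reserved core engram".to_string())
    } else {
        Ok(slug)
    }
}
```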

ACP harness — at new-session creation, one synchronous fetch + decrypt of the core engram, cached per channel in the rendered prompt section. Re-fetched only when the session is invalidated. On transport errors we return None (no section) rather than the onboarding nudge — a flaky relay shouldn't gaslight the agent into thinking its memory is empty.
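
To make the three harness outcomes concrete (core present, confirmed absent, fetch failed), here is a small sketch of how the section might be placed between [System] and [Context]. Everything except those two labels is an assumption: the `CoreSection` enum, the `[Core Memory]` heading, the nudge wording, and this `format_prompt_sketch` signature are invented for illustration and do not reflect the real format_prompt().

```rust
/// Hypothetical result of the one-per-session core fetch.
pub enum CoreSection {
    Core(String), // decrypted core engram body
    Nudge,        // relay confirmed there is no core yet
    Unavailable,  // transport error: emit no section at all
}

/// Placeholder wording; the real nudge text is not shown in the PR.
pub const ONBOARDING_NUDGE: &str =
    "No core memory yet. Ask the user about themselves and create one.";

pub fn format_prompt_sketch(system: &str, core: &CoreSection, context: &str) -> String {
    let mut out = format!("[System]\n{system}\n");
    match core {
        CoreSection::Core(body) => {
            out.push_str("\n[Core Memory]\n");
            out.push_str(body);
            out.push('\n');
        }
        CoreSection::Nudge => {
            out.push_str("\n[Core Memory]\n");
            out.push_str(ONBOARDING_NUDGE);
            out.push('\n');
        }
        // A flaky relay must not look like an empty memory.
        CoreSection::Unavailable => {}
    }
    out.push_str(&format!("\n[Context]\n{context}\n"));
    out
}
```

Keeping `Unavailable` distinct from `Nudge` is the point of the design: only a confirmed absence should invite the agent to create a core.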

Test plan

13 engram unit tests in sprout-core, including the spec's K_c vector and three d-tag vectors verified byte-for-byte.
6 envelope-validation tests in the relay.
2 format_prompt injection tests in the ACP queue (core present → injected; absent → nudge).
Scope-allowlist coverage extended.
Live end-to-end against a local relay per TESTING.md:
- ls (empty) → set core → get core → set foo (stdin) → set mem/bar → ls (two entries) → rm foo → get foo exits non-zero with tombstoned: → ls shows only mem/bar → rm core refused → invalid slug rejected.
- Fresh agent with core set: harness logs injected NIP-AE core section ... section_len=60.
- Fresh agent without core: harness logs onboarding nudge.

Notes for review

Codex reviewed an earlier revision at 7/10 and flagged five issues; all are addressed in this PR. See commit for specifics: visitor-based dup detection, conflict surfacing, fail-closed-on-transport-error, envelope pre-validation, structured NotFound/Conflict exit codes.
Diffstat: 16 files, +1709 / -41. The bulk is sprout-core/src/engram.rs (~835 lines, mostly tests + spec vectors).
No new crates. No new daemons. The CLI uses the same SproutClient as everything else; the harness uses the existing RestClient::query.

Out of scope (intentionally)

Mid-session refresh of the core engram.
Owner-side UI for browsing/editing engrams (the CLI is enough for now; a desktop surface can come later using the same sprout-core::engram API).
Engram kinds beyond core / mem/* (the d-tag derivation is slug-agnostic; future kinds slot in without protocol changes).

Closes the NIP-AE implementation thread.

Implements NIP-AE (Agent Engrams, kind:30174) as the smallest viable surface across relay, CLI, and ACP harness.

* sprout-core::engram: pure crypto + parsing primitives shared by CLI and ACP harness. Conversation key, d-tag HMAC, body parse/serialize with duplicate-key rejection, envelope build, head selection. Pinned spec vectors (K_c, three d-tags) verified byte-for-byte.
* Relay: kind 30174 added to ALL_KINDS, the per-kind scope allowlist (UsersWrite, same group as KIND_READ_STATE), and is_global_only_kind. NIP-33 plumbing (replace_parameterized_event) handles the rest.
* sprout mem CLI: ls/get/set/rm. Slug shorthand normalises 'foo' to 'mem/foo'; 'core' is reserved. set reads stdin with '-'. Monotonic created_at + tombstone semantics per spec. Symmetric decrypt: either party (owner or agent) reads with their own seckey.
* ACP harness: at new-session creation, fire one synchronous fetch + decrypt of the core engram and cache the rendered prompt section per channel. If no core exists or any error occurs, inject the onboarding nudge so the agent learns to bootstrap itself. format_prompt() emits the section after [System] and before [Context]. No mid-session refresh; only re-fetched when a session is invalidated.
* Tests: 13 engram unit tests including spec vectors, 2 format_prompt injection tests, scope-allowlist coverage extended.

Signed-off-by: Tyler Longwell <tlongwell@squareup.com>

Codex P2 findings on PR #593:

1. Non-NIP44 content slipped past the relay envelope check. A signed kind:30174 with valid d/p tags but content like 'x' won NIP-33 replacement against a valid head and was then silently discarded by readers, silently erasing memory.
2. Uppercase-hex p tags were accepted. Readers query #p with the lowercase hex of the owner pubkey (byte-exact tag match), so an uppercase-tagged event that won replacement became invisible to readers: the same bricking pattern.

Tighten validate_engram_envelope to:

- require lowercase hex for the p tag (consistent with the existing rule on d)
- validate that content is a syntactically plausible NIP-44 v2 payload: standard base64 alphabet, length multiple of 4, decoded length >= 99 bytes, first decoded byte = 0x02 (version prefix).

Relay-side sanity check only: the MAC and decryption still happen at the reader. The point is to refuse obvious junk before it can supersede a valid head.

+6 regression tests; canonical-accepts fixture updated to use a real-shape NIP-44 v2 sample.

Signed-off-by: Tyler Longwell <tlongwell@squareup.com>
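
The payload plausibility rules in this commit message can be sketched without any base64 crate, since only the alphabet, the length, and the first decoded byte matter. The function names are hypothetical, and the handling of `=` padding is an assumption; the relay's actual check may differ in detail.

```rust
/// Map a standard-alphabet base64 character to its 6-bit value.
fn b64_value(c: u8) -> Option<u8> {
    match c {
        b'A'..=b'Z' => Some(c - b'A'),
        b'a'..=b'z' => Some(c - b'a' + 26),
        b'0'..=b'9' => Some(c - b'0' + 52),
        b'+' => Some(62),
        b'/' => Some(63),
        _ => None,
    }
}

/// Syntactic check only: no MAC verification and no decryption.
pub fn looks_like_nip44_v2(content: &str) -> bool {
    let bytes = content.as_bytes();
    if bytes.is_empty() || bytes.len() % 4 != 0 {
        return false;
    }
    // Assumption: at most two trailing '=' padding characters are allowed.
    let padding = bytes.iter().rev().take_while(|&&b| b == b'=').count();
    if padding > 2 {
        return false;
    }
    let body = &bytes[..bytes.len() - padding];
    if !body.iter().all(|&b| b64_value(b).is_some()) {
        return false;
    }
    let decoded_len = bytes.len() / 4 * 3 - padding;
    if decoded_len < 99 {
        return false;
    }
    // The first decoded byte is the top 6 bits of char 0 plus the top
    // 2 bits of char 1; body has at least two characters at this point.
    let first = (b64_value(body[0]).unwrap() << 2) | (b64_value(body[1]).unwrap() >> 4);
    first == 0x02
}
```

A consequence of the version-byte rule is that every accepted payload starts with `A` followed by a character in the range `g` to `v`.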

Second codex review pass on PR #593 flagged two more P2s:

1. (engram_fetch.rs) When the relay returns kind:30174 events addressed to the agent but none decrypt (wrong key, MAC failure, body schema mismatch, or an event injected by another party that happened to be p-tagged at this agent), the previous code returned Ok(None), which the harness then renders as the onboarding nudge, inviting the agent to overwrite a real-but-unreadable core. Now distinguish three outcomes:

   - empty array → Ok(None) (confirmed absence; nudge)
   - >=1 event decrypts → use winning head
   - non-empty, none decrypt → Err (fail closed; no section)

   Extracted the post-query decode logic into a pure decode_core_body() helper so it's unit-testable without mocking RestClient. Added 5 tests: empty-array-absent, valid-core-returns-profile, undecryptable-is-err-not-absent (the regression), non-core-body-is-absent, unparseable-candidates-is-err.

2. (commands/mem.rs) The module doc comment claimed `sprout mem` and `sprout mem ls` were equivalent, but the clap wiring requires a subcommand so bare `sprout mem` exits with a usage error. Drop the false claim: bare-group-shows-help is the convention across the other 12 subcommand groups; adding a default action just for mem would be inconsistent.

Signed-off-by: Tyler Longwell <tlongwell@squareup.com>
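
The three-outcome rule from item 1 can be sketched as a pure function that takes the decrypt step as a closure, which is roughly what makes such a helper testable without a relay. The `Event` struct, the closure parameter, and the `decode_core_sketch` name are assumptions; this is not the actual decode_core_body() signature, and it does not model the separate "non-core body" case.

```rust
/// Hypothetical minimal event shape for this sketch.
pub struct Event {
    pub created_at: u64,
    pub id: String,
    pub content: String,
}

/// - no events               -> Ok(None)  (confirmed absence; nudge)
/// - at least one decrypts   -> Ok(Some(body of the winning head))
/// - events but none decrypt -> Err       (fail closed; no section)
pub fn decode_core_sketch<F>(events: &[Event], decrypt: F) -> Result<Option<String>, String>
where
    F: Fn(&Event) -> Option<String>,
{
    if events.is_empty() {
        return Ok(None);
    }
    events
        .iter()
        .filter_map(|e| decrypt(e).map(|body| (e.created_at, e.id.as_str(), body)))
        // Winning head: newest created_at, ties broken by larger id.
        .max_by(|a, b| (a.0, a.1).cmp(&(b.0, b.1)))
        .map(|(_, _, body)| Some(body))
        .ok_or_else(|| format!("{} candidate event(s), none decryptable", events.len()))
}
```

Returning an error in the last case keeps an unreadable core from being mistaken for a missing one.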

Third codex pass on PR #593 flagged that engram events were stored as global, so any authenticated relay member could REQ `{"kinds":[30174]}` and harvest:

- the encrypted ciphertext (no plaintext but still a fingerprint)
- the public `#p` (owner pubkey)
- the public `#d` (HMAC-derived per-slug fingerprint)
- timestamps (write-activity patterns)

Together that leaks who-pairs-with-which-agent + when they're active. Strictly speaking the NIP-AE design encrypts content for confidentiality but assumes the relay enforces read gating on the event metadata.

Add a new `engram_filters_authorized` predicate alongside the existing `p_gated_filters_authorized`. A filter that can match KIND_AGENT_ENGRAM must satisfy at least one of:

- `authors` non-empty AND every entry == authed (agent reading own), or
- `#p` non-empty AND every entry == authed (owner reading addressed-to-self).

Specific-event-ids lookups (`ids: [...]`) are exempt: knowing the id implies prior authorization.

Hook into all four read paths:

- WS REQ (historical + live subscription registration)
- WS COUNT
- HTTP /query
- HTTP /count

+9 unit tests: agent_querying_own, owner_querying, owner_no_authors, ids_lookup, skips_non_engram_kinds (positive); unrelated_reader, bare_kind_filter, wildcard_kind_filter, mixed_authors_with_unauthed (negative).

Signed-off-by: Tyler Longwell <tlongwell@squareup.com>
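
The read-gate rule described in this commit message might look roughly like the following. The `Filter` struct is a simplified stand-in for a NIP-01 filter, and the `_sketch` function is an illustration under that assumption, not the relay's actual `engram_filters_authorized`.

```rust
pub const KIND_AGENT_ENGRAM: u32 = 30174;

/// Simplified, hypothetical NIP-01 filter: only the fields the gate reads.
#[derive(Default)]
pub struct Filter {
    pub ids: Vec<String>,
    pub kinds: Vec<u32>, // empty means "any kind" (wildcard)
    pub authors: Vec<String>,
    pub p_tags: Vec<String>, // values of the `#p` filter
}

fn can_match_engram(f: &Filter) -> bool {
    f.kinds.is_empty() || f.kinds.contains(&KIND_AGENT_ENGRAM)
}

/// True when the list is non-empty and every entry is the authed pubkey.
fn only_authed(values: &[String], authed: &str) -> bool {
    !values.is_empty() && values.iter().all(|v| v.as_str() == authed)
}

/// Every filter that could match an engram must be an id lookup, or be
/// restricted to the authed pubkey via `authors` or `#p`.
pub fn engram_filters_authorized_sketch(filters: &[Filter], authed: &str) -> bool {
    filters.iter().all(|f| {
        !can_match_engram(f)
            || !f.ids.is_empty()
            || only_authed(&f.authors, authed)
            || only_authed(&f.p_tags, authed)
    })
}
```

Treating an empty `kinds` list as able to match engrams is what makes the wildcard filter a rejected case rather than a bypass.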

The round-4 fix added p_gated/engram_filters_authorized gates to the WS REQ historical-delivery branch, WS COUNT, HTTP /query, and HTTP /count, but missed the WS REQ NIP-50 search branch, which intercepts before reaching the gate. Since kind:30174 envelopes are indexed in Typesense (only NIP-17 gift wraps are skipped), an authenticated relay member could send {"search":"*","kinds":[30174]} and harvest every engram ciphertext + owner #p + slug #d fingerprint on the relay, leaking the metadata the round-4 gate was specifically written to protect.

Fix: move the two filter-auth checks above the search early-return. The same reordering also closes the equivalent search-bypass for the pre-existing P_GATED_KINDS (observer frames, member notifications) which are likewise globally stored, indexed, and were previously only gated on the non-search path.

+4 regression tests asserting the gate rejects search-shaped attack filters and still allows authored search. Found by codex review round 5.

Signed-off-by: Tyler Longwell <tlongwell@squareup.com>

Signed-off-by: Tyler Longwell <tlongwell@squareup.com>

* origin/main:
  dev-mcp: add view_image tool (#602)
  fix(relay,desktop): only advertise NIP-43 when enforced; probe pairing by supported_nips (#601)
  fix(desktop): derive unread state from NIP-RS + relay catch-up only (#599)
  docs(testing): rewrite TESTING.md for current API and CLI-first workflow (#597)
  fix(agent): fix OpenAI-compat request body serialization and max_tokens (#595)
  feat(desktop): per-persona and per-agent env var overrides (#594)

The HTTP /query NIP-50 search path in handle_bridge_search pushed only kind/authors/time/channel into Typesense and applied a channel-access post-filter, but did not enforce the rest of the requesting filter against the fetched events. The WS NIP-50 path does (handlers/req.rs).

For NIP-AE this meant the engram read gate (which authorizes the *filter*: kind=30174 with author=self or #p=self) was bypassed for /query specifically: an authorized search like {"search":"foo","kinds":[30174],"#p":[owner_self]} could return text-matching engram envelopes whose #p belongs to a different owner (or an authors=[agent_self] search could return events authored by other agents), because Typesense doesn't see #p and the post-filter wasn't running.

Fix: extract the per-hit acceptance logic into search_hit_accepted() and call sprout_core::filter::filters_match against the current filter before channel-access and dedup. This mirrors the WS post-filter at handlers/req.rs and locks the bridge to the same NIP-01 semantics.

Tests: three unit tests covering the leak (mismatched #p tag, mismatched author, and channel scope), exercising the helper that owns the fix. Full suites Mari named also green: engram (17), engram_envelope (12), engram_gate (12), engram_fetch (5).

Signed-off-by: Tyler Longwell <109685178+tlongwell-block@users.noreply.github.com>
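
The per-hit acceptance order described above (filter match first, then channel access, then dedup) can be sketched as follows. The `Hit` and `Filter` types, the treatment of channel-less hits as globally readable, and the `hit_accepted_sketch` name are all assumptions; the real search_hit_accepted() and sprout_core::filter::filters_match are not shown in this PR description.

```rust
use std::collections::HashSet;

/// Hypothetical search hit as returned from the index.
pub struct Hit {
    pub id: String,
    pub kind: u32,
    pub author: String,
    pub p_tags: Vec<String>,
    pub channel: Option<String>,
}

/// Simplified stand-in for a NIP-01 filter; empty lists mean "any".
#[derive(Default)]
pub struct Filter {
    pub kinds: Vec<u32>,
    pub authors: Vec<String>,
    pub p_tags: Vec<String>,
}

fn filter_matches(f: &Filter, h: &Hit) -> bool {
    (f.kinds.is_empty() || f.kinds.contains(&h.kind))
        && (f.authors.is_empty() || f.authors.contains(&h.author))
        && (f.p_tags.is_empty() || h.p_tags.iter().any(|p| f.p_tags.contains(p)))
}

/// Accept a hit only if it matches the full requesting filter, sits in an
/// accessible channel (assumption: no channel means global), and is new.
pub fn hit_accepted_sketch(
    filter: &Filter,
    hit: &Hit,
    accessible: &HashSet<String>,
    seen: &mut HashSet<String>,
) -> bool {
    if !filter_matches(filter, hit) {
        return false;
    }
    if let Some(channel) = &hit.channel {
        if !accessible.contains(channel) {
            return false;
        }
    }
    // insert() returns false when the id was already present.
    seen.insert(hit.id.clone())
}
```

Running the filter match against each fetched event is what covers fields such as #p that the search index never sees.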

Bring the PR up to date with origin/main. Two conflicts resolved:

* crates/sprout-acp/src/main.rs: main extracted the binary body into sprout-acp/src/lib.rs (commit 70cb53e, "Add Sprig all-in-one agent binary"). Took main's 3-line shim verbatim and replayed Sami's three hunks against lib.rs instead:
  - declare `mod engram_fetch;`
  - clone `startup_owner` for `OwnerCache::new` so we can also use it for the PromptContext below
  - thread `agent_keys` + `agent_owner_pubkey` into PromptContext construction (needed by the NIP-AE core fetch in pool.rs)
* Cargo.lock + desktop/src-tauri/Cargo.lock: took theirs; cargo check --workspace produced no further changes (all engram deps were already present on main).

Verified:

* cargo check --workspace: clean
* cargo clippy --workspace --all-targets -- -D warnings: clean
* cargo test -p sprout-core -p sprout-acp -p sprout-cli -p sprout-relay: all green, including the 17 engram tests, 5 engram_fetch tests (covering the d7842a0 fail-closed regression), and 13 format_prompt tests (the two new agent_core injection cases included).

Signed-off-by: Tyler Longwell <tlongwell@squareup.com>

Signed-off-by: Tyler Longwell <tlongwell@squareup.com>

tlongwell-block requested a review from wesbillman as a code owner May 15, 2026 01:58

tlongwell-block added 6 commits May 15, 2026 16:57

tlongwell-block force-pushed the sami/nip-ae-minimal branch from ca57552 to 860ec66 Compare May 15, 2026 21:02

tlongwell-block and others added 2 commits May 15, 2026 17:47

tlongwell-block mentioned this pull request May 18, 2026

feat(acp): --no-memory flag to disable NIP-AE core injection #611

Merged

feat(acp): --no-memory flag to disable NIP-AE core injection (#611)

995baa1

Signed-off-by: Tyler Longwell <tlongwell@squareup.com>

tlongwell-block wants to merge 9 commits into main from sami/nip-ae-minimal

tlongwell-block commented May 15, 2026
