[DX-1211] Address currently silent failures around missing modes #2236
umair-ably wants to merge 11 commits
Pull request overview
This PR surfaces two previously silent realtime “missing channel modes” failures by emitting a default-level warning log (and, when clientOptions.strictMode: true, throwing an ErrorInfo with a remediation hint) for:
- `presence.get()` without `presence_subscribe`
- `channel.subscribe()` without `subscribe`
Changes:
- Add runtime detection for missing `presence_subscribe`/`subscribe` modes and emit hintful warnings or throw based on `strictMode`.
- Introduce `Logger.silentFailureLogSuffix()` to explain the “warn now, throw in next major” behavior in log output.
- Add tests covering strict vs default behavior and add `strictMode` to `ClientOptions` typings.
Reviewed changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| test/realtime/presence.test.js | Adds coverage for presence.get() with/without presence_subscribe under strict/default modes. |
| test/realtime/channel.test.js | Adds coverage for channel.subscribe() with/without subscribe under strict/default modes. |
| src/common/lib/util/logger.ts | Adds a reusable suffix for strictMode-off silent-failure warnings. |
| src/common/lib/client/realtimepresence.ts | Warn/throw on presence.get() when presence_subscribe is not granted. |
| src/common/lib/client/realtimechannel.ts | Warn/throw on subscribe() when subscribe mode is not granted; adds one-time warning gate. |
| src/common/lib/client/realtimeannotations.ts | Fixes mode-check precedence and adds a log before throwing for missing annotation_subscribe. |
| ably.d.ts | Documents ClientOptions.strictMode. |
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/common/lib/client/realtimechannel.ts (1)
503-530: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Avoid registering the listener before the strict-mode validation can fail.
Lines 503-507 add the subscription before the attach/mode check runs. When `strictMode === true`, `subscribe()` can reject on Line 529, but the listener remains registered. Retrying the call stacks duplicates, and a later reattach can unexpectedly activate callbacks from a failed `subscribe()` attempt.
Suggested fix
```diff
-    // Filtered
-    if (event && typeof event === 'object' && !Array.isArray(event)) {
-      this.client._FilteredSubscriptions.subscribeFilter(this, event, listener);
-    } else {
-      this.subscriptions.on(event, listener);
-    }
+    const registerListener = () => {
+      if (event && typeof event === 'object' && !Array.isArray(event)) {
+        this.client._FilteredSubscriptions.subscribeFilter(this, event, listener);
+      } else {
+        this.subscriptions.on(event, listener);
+      }
+    };
     // (RTL7g)
+    const strictMode = this.client.options.strictMode === true;
+    if (!strictMode) {
+      registerListener();
+    }
+
     if (this.channelOptions.attachOnSubscribe !== false) {
       const stateChange = await this.attach();
       if (this.state === 'attached' && (this._mode & flags.SUBSCRIBE) === 0) {
@@
-        if (this.client.options.strictMode === true) throw err;
+        if (strictMode) throw err;
       }
+      if (strictMode) {
+        registerListener();
+      }
       return stateChange;
     } else {
+      registerListener();
       return null;
     }
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/common/lib/client/realtimechannel.ts` around lines 503 - 530, The listener is being registered before the attach/mode strict-mode check, causing orphaned listeners if attach/mode validation fails; change subscribe() so it first performs the attach() call and the mode check (the block that constructs the ErrorInfo and checks this.client.options.strictMode and this._silentSubscribeWarned) and only after that, if appropriate, register the listener via this.client._FilteredSubscriptions.subscribeFilter(this, event, listener) or this.subscriptions.on(event, listener); ensure that if attach/mode validation throws (strictMode === true) the listener is not registered, and if you intend to allow registration when strictMode === false, register only after the validation path completes successfully or explicitly permits it.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/common/lib/client/realtimechannel.ts`:
- Around line 509-533: The missing-subscribe validation is currently inside the
attachOnSubscribe branch so it is skipped when channelOptions.attachOnSubscribe
=== false; move the check that constructs ErrorInfo and warns/throws (the block
using this.state === 'attached', (this._mode & flags.SUBSCRIBE) === 0,
Logger.logActionNoStrip, this._silentSubscribeWarned and
client.options.strictMode) out of that attachOnSubscribe conditional so it runs
whenever the channel ends up attached (after calling attach() or if already
attached), regardless of attachOnSubscribe; keep using attach() to get
stateChange and still return stateChange (or null when not calling attach), but
ensure the subscribe-mode validation executes unconditionally after determining
the attached state.
In `@src/common/lib/util/logger.ts`:
- Around line 120-127: Update the static method silentFailureLogSuffix in the
logger (silentFailureLogSuffix) to return a generic suffix that doesn't assume
the behavior is a silent return; replace the phrase "returns silently" with
wording that covers any silent/no-op behavior (e.g., "currently has no
observable effect or fails silently") and keep the rest of the guidance about
clientOptions.strictMode and future throwing intact so the message is accurate
for subscribe/listener and other silent-failure paths.
🧹 Nitpick comments (2)
test/realtime/presence.test.js (1)
2403-2431: ⚡ Quick win
Align the non-strict test claim with what it actually verifies.
The test name says it verifies warning logging, but assertions only check resolution to `[]`. Either assert the warning output explicitly or rename the test to avoid over-claiming behavior.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/realtime/presence.test.js` around lines 2403 - 2431, The test "with strictMode disabled (default), logs a warning and resolves to []" currently only asserts that channel.presence.get() resolves to []; update it to either (A) explicitly assert the warning is emitted by capturing/logging the realtime or helper logger during the test and checking that a warning containing "strictMode" or similar text was logged when presence.get() ran (reference the test's realtime instance, channel.presence.get(), and helper logger/monitoring utilities), or (B) rename the test string to something accurate like "with strictMode disabled (default), resolves to []" so it no longer claims to check warning logging; ensure you change only the test description or add the log assertion around the existing whenPromiseSettles callback where presence.get() is validated.
test/realtime/channel.test.js (1)
2118-2153: ⚡ Quick win
Add coverage for `attachOnSubscribe: false` in this DX-1211 test block.
These tests cover the default attach-on-subscribe flow, but not the manual-attach flow (`attachOnSubscribe: false`). Add a case where the channel is explicitly attached without `subscribe` mode, then `subscribe()` is called, to lock in strict/non-strict behavior on that path too.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/realtime/channel.test.js` around lines 2118 - 2153, Add tests for the manual-attach flow by creating channels with attachOnSubscribe: false (via channels.get(..., { modes: ['publish'], attachOnSubscribe: false })) then explicitly call channel.attach() before calling channel.subscribe() to exercise the path where subscribe does not auto-attach; add two cases mirroring the existing ones: one with realtime = helper.AblyRealtime({ strictMode: true }) asserting channel.subscribe() rejects with code 93003 and an appropriate hint, and one with realtime = helper.AblyRealtime() (default strictMode false) asserting channel.subscribe() resolves (returns null or a ChannelStateChange-like object) and channel.state === 'attached'; reuse the same naming pattern and helper.monitorConnectionThenCloseAndFinishAsync wrapper as the other tests.
      hint: 'Re-create the channel with subscribe in modes: realtime.channels.get(name, { modes: ["subscribe", ...] }). Your token/API-key capability must permit subscribe on this channel. If you have the Ably CLI installed, `ably auth keys list` shows your key\'s capabilities. Note: appending to channel.modes after attach() does not enable the mode server-side - the array reflects what the server granted, not what you requested.',
    });
    if (this.client.options.strictMode === true) {
      // The call is about to throw, so undo the listener registration above to avoid leaking a handler.
This is a breaking change.
The current behaviour of subscribe (with the implicit attach) is that the listener is always added regardless of failure.
Ahh, I see. This was a CodeRabbit suggestion from earlier. It still seems strange that the listener is added despite this erroring out?
I've undone this to align with the existing behaviour, though.
    code: 93001,
    statusCode: 400,
-   hint: 'Re-create the channel with annotation_subscribe in modes: realtime.channels.get(name, { modes: ["subscribe", "annotation_subscribe", ...] }). If the subsequent attach is rejected by the server, check that the channel namespace has "Message annotations, updates, and deletes" enabled in the Ably dashboard and that your API key has annotation-subscribe capability on this channel. If you have the Ably CLI installed, `ably apps rules list` shows which channel namespaces have Mutable Messages enabled, and `ably auth keys list` shows your key\'s capabilities. Note: appending to channel.modes after attach() does not enable the mode server-side - the array reflects what the server granted, not what you requested.',
+   hint: 'Include "annotation_subscribe" in the channel modes: realtime.channels.get(name, { modes: ["subscribe", "annotation_subscribe", ...] }), or call channel.setOptions({ modes: [...] }) on an existing channel (this triggers a reattach). If the subsequent attach is rejected by the server, check that the channel namespace has "Message annotations, updates, and deletes" enabled in the Ably dashboard and that your API key has annotation-subscribe capability on this channel. If you have the Ably CLI installed, `ably apps rules list` shows which channel namespaces have Mutable Messages enabled, and `ably auth keys list` shows your key\'s capabilities.',
I'm wondering if this hint is a bit presumptuous? Like, are there other reasons the next attach could be rejected? Or should this hint stop at what you need to do to fix the immediate problem and then let the next error hint at how to fix that? Do we risk confusing LLMs by throwing too many possibilities at them?
Agree it was over-reaching, and it was partly wrong. The two things the old tail told you to check "if the attach is rejected" actually fail differently (confirmed by live sandbox traces + the canonical registry):
- Missing namespace rule + an explicit `annotation_subscribe` mode → server rejects the attach (channel → `failed`, server code `93002` "…namespace with Mutable Messages enabled").
- Missing `annotation-subscribe` capability → no rejection; the server silently drops the mode, the attach succeeds, and this same `93001` re-fires.
So "if the attach is rejected, check the namespace and the capability" conflated a reject path with a downgrade path. Reworded to split by the symptom you actually see:
> …If the attach is then rejected, the channel namespace does not have "Message annotations, updates, and deletes" enabled (`ably apps rules list`…). If the attach succeeds but annotations still are not delivered, your API key lacks the annotation-subscribe capability and the server silently dropped the mode (`ably auth keys list`…).
On "let the next error hint at the fix" — I'd like to, but the downstream rejection is currently unhinted: server errors come through ErrorInfo.fromValues, which only adds an href, never a hint. So the cause has to live here for now. If we add a hint to the server's 93002 rejection (a separate change), we could then trim this branch to just the immediate fix — happy to file that follow-up.
— 🤖 Drafted with Claude Code (Claude Opus 4.8, 1M context)
What's the status of getting hint or whatever we decide to call it into Realtime? That's a trivial PR when it comes to it, so would it not be better to solve it properly first time rather than adding technical debt we have to undo later?
Added a ticket so we don't lose track of this: https://ably.atlassian.net/browse/DX-1460
      'The channel was attached without the presence_subscribe mode, so the server has not delivered any members to this client.',
    code: 93002,
    statusCode: 400,
    hint: 'Include "presence_subscribe" in the channel modes: realtime.channels.get(name, { modes: ["presence_subscribe", ...] }), or call channel.setOptions({ modes: [...] }) on an existing channel (this triggers a reattach). Alternatively, omit modes entirely and ensure your token/API-key capability permits presence-subscribe on this channel. If you have the Ably CLI installed, `ably auth keys list` shows your key\'s capabilities.',
`presence_subscribe` isn't a capability we publish in our documentation. It exists on the server, but it's not something we expose, as it's a subset of subscribe, which for legacy reasons gives message + presence + objects.
So that's a judgement call to make here as to whether this should become a public capability or not.
In any case, telling an LLM to do this right now would only confuse it further, as ably.d.ts doesn't list it as a valid capability, so the LLM would just see a TypeScript error.
Good catch — fixed the capability, with one push-back on the mode.
The real bug (fixed): the hint named a presence-subscribe capability, which doesn't exist. Presence delivery is governed by the subscribe capability (the legacy "subscribe ⇒ messages + presence + objects" you describe), so it now reads "…capability permits subscribe", matching the sibling subscribe-mode hint.
On presence_subscribe the mode — I'd keep it: it's a valid ChannelMode in ably.d.ts (doc-commented "The client will receive presence messages"), so modes: ["presence_subscribe"] typechecks — no TS error. And it's required: modes map 1:1 to flag bits (encodeModesToFlags OR-sets one bit per mode, no supersetting), and the integration tests prove it — channel.test.js attaches modes: ['subscribe'] and asserts presence subscription is denied, and ['publish', 'presence_subscribe'] and asserts message subscribe is denied. So subscribe doesn't imply presence_subscribe; when a user has restricted modes it's the only client-side way to get the flag. The hint leads with "omit modes entirely" for the common case, so we only steer toward presence_subscribe when someone has explicitly restricted modes. Whether to document it more prominently is a fair product question, but the hint needs no new public surface.
Also reassigned the code: while checking this I found 93002 is the canonical server code for "namespace needs Mutable Messages" (ably-common errors.json + faqs.ably.com/error-code-93002), so reusing it client-side for presence was a collision. presence.get() now throws 91008 — the presence block, next to 91005 which get() already throws for the suspended case. Reserved in ably/ably-common#345.
— 🤖 Drafted with Claude Code (Claude Opus 4.8, 1M context)
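The "one bit per mode, no supersetting" point in the reply above can be illustrated with a small standalone sketch. Everything here is hypothetical: `encodeModes` and `hasMode` are stand-ins for illustration (not the SDK's `encodeModesToFlags`), and the bit positions are assumed values chosen only to make the example concrete.

```typescript
// Assumed, illustrative bit positions; not taken from the SDK source.
const FLAGS = {
  PRESENCE: 1 << 16,
  PUBLISH: 1 << 17,
  SUBSCRIBE: 1 << 18,
  PRESENCE_SUBSCRIBE: 1 << 19,
} as const;

type Mode = 'presence' | 'publish' | 'subscribe' | 'presence_subscribe';

// Hypothetical encoder: OR exactly one bit per requested mode.
function encodeModes(modes: Mode[]): number {
  let bits = 0;
  for (const mode of modes) {
    bits |= FLAGS[mode.toUpperCase() as keyof typeof FLAGS];
  }
  return bits;
}

// Hypothetical check: a mode is present only if its own bit is set.
function hasMode(bits: number, mode: Mode): boolean {
  return (bits & FLAGS[mode.toUpperCase() as keyof typeof FLAGS]) !== 0;
}

const subscribeOnly = encodeModes(['subscribe']);
const publishAndPresenceSub = encodeModes(['publish', 'presence_subscribe']);

console.log(hasMode(subscribeOnly, 'presence_subscribe')); // false
console.log(hasMode(publishAndPresenceSub, 'subscribe')); // false
```

Under this encoding, requesting `subscribe` sets no other bit, which matches the behaviour the integration tests in the reply are said to assert.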
Declaration only — no runtime use yet. Documented silent-failure paths will read this option in subsequent commits to gate a hint-carrying throw; the default stays `false` in v2.x. Per DXRFC-022 work item B5 the default flips to `true` in v3 with no per-call opt-out. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The guard at line 73 parenthesised `(state === 'attached' && _mode & flag) === 0` so the `=== 0` compared against `(boolean && number)`. It happened to behave correctly — `false === 0` is `false` so the throw skipped when not attached, and the bitwise result === 0 was correct when attached — but the comparison was structural luck rather than the obvious reading of the predicate. Re-parenthesise to `state === 'attached' && (_mode & flag) === 0`. Also emit an always-on warning log adjacent to the throw so the diagnostic fires in the SDK output even when the caller swallows the throw. No silentFailureLogSuffix here because this throw is unconditional (pre-DXRFC-022) and not strictMode-gated. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
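The "structural luck" described in this commit message can be checked by enumerating both groupings. The sketch below is standalone; `oldGuard`, `newGuard`, and the `SUBSCRIBE_FLAG` value are hypothetical names and an assumed bit, used only to compare the two parenthesisations.

```typescript
const SUBSCRIBE_FLAG = 1 << 18; // assumed, illustrative bit

// Original grouping: `=== 0` is applied to the result of the whole `&&` expression.
function oldGuard(state: string, mode: number): boolean {
  return (state === 'attached' && mode & SUBSCRIBE_FLAG) === 0;
}

// Re-parenthesised grouping: the bitwise test is compared to 0 before the `&&`.
function newGuard(state: string, mode: number): boolean {
  return state === 'attached' && (mode & SUBSCRIBE_FLAG) === 0;
}

const cases: Array<[string, number]> = [
  ['attached', 0],
  ['attached', SUBSCRIBE_FLAG],
  ['attaching', 0],
  ['attaching', SUBSCRIBE_FLAG],
];

for (const [state, mode] of cases) {
  // Columns: state, whether the flag is granted, old result, new result.
  console.log(state, mode !== 0, oldGuard(state, mode), newGuard(state, mode));
}
```

Both functions return `true` only for an attached channel without the flag, so the rewrite changes readability rather than behaviour, as the commit message says.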
A short suffix appended to silent-failure warning logs emitted when clientOptions.strictMode is off, so the reader knows the same path will throw in a future major version. Co-locating on Logger keeps the import surface tight; callers do `Logger.silentFailureLogSuffix()` next to `Logger.logActionNoStrip(...)`. Log-only by design — the suffix is not put into ErrorInfo.hint, because the hint is also shown when the throw fires (strictMode on), where the suffix would be misleading. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When a channel was attached without the presence_subscribe mode, the
server never delivers presence members, so presence.get() resolves to
[] regardless of who is actually present. Today there is no signal
distinguishing "no one is present" from "this client cannot see anyone".
This commit detects the case after ensureAttached() populates the
server-granted mode set, then:
- emits an always-on warning log carrying the hint + a suffix telling
the reader that strictMode will throw in a future major version.
- throws ErrorInfo with err.hint when clientOptions.strictMode === true.
Code 93002 sits next to 93001 (annotation_subscribe missing) in the
SDK-internal precondition class. It would also be defensible to use
40160 (server-side capability denied), but the hint-coverage rubric
already pins 40160 to the "no auth options" hint shape, so a second
40160 throw site would either weaken that pin or need a rubric refactor.
Suspended-state and {waitForSync: false} paths return earlier and are
unaffected.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
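The warn-by-default, throw-under-`strictMode` behaviour described in this commit might be sketched as follows. This is a simplified, synchronous illustration, not the SDK code: `HintedError`, `getMembers`, the suffix wording, and the bit value are all hypothetical, and the error code used is the 91008 that a later commit on this branch moves presence to.

```typescript
// Hypothetical, simplified stand-in for the SDK's ErrorInfo.
class HintedError extends Error {
  code: number;
  hint: string;
  constructor(message: string, code: number, hint: string) {
    super(message);
    this.code = code;
    this.hint = hint;
  }
}

const PRESENCE_SUBSCRIBE = 1 << 19; // assumed, illustrative bit
const FUTURE_THROW_SUFFIX = 'This will throw in a future major version.'; // assumed wording

// Hypothetical sketch of presence.get(): `delivered` is what the server sent.
function getMembers(
  grantedModes: number,
  delivered: string[],
  strictMode: boolean,
  warn: (line: string) => void,
): string[] {
  if ((grantedModes & PRESENCE_SUBSCRIBE) === 0) {
    const err = new HintedError(
      'Channel attached without presence_subscribe mode.',
      91008,
      'Include "presence_subscribe" in the channel modes, or omit modes entirely.',
    );
    // The warning is always emitted; the suffix goes in the log only, not in the hint.
    warn(`${err.message} ${err.hint} ${FUTURE_THROW_SUFFIX}`);
    if (strictMode) throw err;
    return []; // default behaviour is unchanged: an empty list
  }
  return delivered;
}

const logLines: string[] = [];
console.log(getMembers(0, [], false, (l) => logLines.push(l))); // []
console.log(logLines.length); // 1
```

In the real SDK the call is asynchronous, so under `strictMode` the caller would observe a rejected promise rather than a synchronous throw.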
When a channel was attached without the subscribe mode, the server never delivers messages to the listener, so channel.subscribe() appears to succeed but no callback ever fires. This commit closes the gap symmetrically to presence.get() in the previous commit:
- After the implicit attach completes (RTL7g), if subscribe mode was not granted by the server, emit an always-on warning log carrying the hint + the future-throw suffix.
- One-shot per channel: the warning fires once per attach cycle to keep noise down on long-lived listeners. Reset _silentSubscribeWarned on the ATTACHED message so a channels.release() + re-attach with corrected modes restores signalling.
- Throw ErrorInfo (code 93003) when clientOptions.strictMode === true.
attachOnSubscribe: false is out of scope because the check requires an attach to have populated _mode. Document this caveat separately if needed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
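The once-per-attach-cycle warning gate described in this commit could look roughly like the sketch below. `ChannelSketch` and its members are hypothetical; the sketch assumes the strict-mode throw fires on every call while the log line is gated, which the commit message does not state explicitly.

```typescript
const SUBSCRIBE = 1 << 18; // assumed, illustrative bit

// Hypothetical, minimal model of a channel's warn-once gate.
class ChannelSketch {
  warnings: string[] = [];
  private grantedModes: number;
  private strictMode: boolean;
  private warned = false; // plays the role of _silentSubscribeWarned

  constructor(grantedModes: number, strictMode: boolean) {
    this.grantedModes = grantedModes;
    this.strictMode = strictMode;
  }

  // Called when an ATTACHED message arrives: record modes and re-arm the warning.
  onAttached(grantedModes: number): void {
    this.grantedModes = grantedModes;
    this.warned = false;
  }

  subscribe(): void {
    if ((this.grantedModes & SUBSCRIBE) !== 0) return;
    if (!this.warned) {
      this.warned = true;
      this.warnings.push('subscribe mode not granted; listener will never fire');
    }
    // Assumption: strict mode rejects every call, not just the first.
    if (this.strictMode) throw new Error('subscribe mode not granted');
  }
}

const channel = new ChannelSketch(0, false);
channel.subscribe();
channel.subscribe();
console.log(channel.warnings.length); // 1

channel.onAttached(0); // a new attach cycle re-arms the gate
channel.subscribe();
console.log(channel.warnings.length); // 2
```

Resetting the flag on attach is what lets a corrected re-attach restore signalling without spamming long-lived listeners in between.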
- Terser, technical error messages for 93002/93003
- Hints now offer setOptions() or omitting modes + capability check; drop confusing channel.modes note
- Keep listener registered on strict-mode throw (matches existing subscribe semantics)
- Remove ticket IDs from test names
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…grade
The 93001 hint lumped two distinct failure modes under "if the subsequent attach is rejected": a missing namespace rule (Mutable Messages) does reject the attach, but a missing annotation-subscribe capability does NOT - the server silently drops the mode and this same error re-fires. Confirmed by live sandbox traces and the canonical error registry. Reword to key off the symptom the caller observes: attach rejected => enable the namespace rule; attach succeeds but annotations still undelivered => grant the capability. Addresses review feedback that the hint was presumptuous and over-enumerated.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… off colliding 93002
Two fixes to the presence.get()/channel.subscribe() missing-mode errors:
- Hint correctness (review feedback): the presence hint named a "presence-subscribe" capability that does not exist. Presence delivery is governed by the "subscribe" capability, matching the subscribe-mode hint.
- Error-code collision: 93002 is the canonical server code for "namespace needs Mutable Messages" (ably-common errors.json, faqs.ably.com/error-code-93002). presence.get() reused it client-side. Move presence to 91008 (presence block, next to 91005) and channel.subscribe() to 90009 (channel block); 93xxx is the annotations/mutable-messages block. Both codes are new on this unreleased branch, so the renumber is non-breaking. Reserved in ably/ably-common#345. Update the two tests asserting them.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…align the silent-failure log suffix
From the 2026-07-02 docstrings-pr-review run:
- ClientOptions.strictMode: both strictMode-gated paths are async, so the caller-observable failure is a rejected promise, not a throw. Also drop the vague "documented silent-failure paths" (D10), the version label (C9), the presence.get() worked example (B6), the parenthesised default (A5), and the em-dash opt-out clause (B5/A2). Default stated once, late, as its own sentence.
- Logger.silentFailureLogSuffix: fold the JSDoc into two one-line paragraphs, fix ClientOptions casing, drop the "(human or LLM)" aside, and align the suffix string with the strictMode docstring framing (default will change in a future major version; enabling it now makes the call reject).
Also re-verified by live sandbox trace: a namespace without the annotations rule rejects the attach with 93002, while a capability shortfall resolves the attach and silently drops the mode. The 93001 hint's reject-vs-downgrade split is therefore correct as shipped and is deliberately left unchanged.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…e-get mode hints
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…capability branch
Drops the attach-rejection narration, which surfaces as its own error,
and promotes the parentheticals to sentences. Keeps the guard that
channels.get(name, { modes }) on an existing channel throws, steering
the caller between the two suggested remedies.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Targets #2233 - review that first please
Author's Note
I'd advise reading the Claude generated sections below as they outline the experiments and exact numbers, but if you want a summary of my thoughts...
This PR aims to address 2 silent failures we have today:
- `presence.get()` on a channel without `presence_subscribe` returns an empty list.
- `channel.subscribe()` on a channel without `subscribe` registers a listener that never fires.
The fixes themselves were easy in that we just throw an error paired with a hint when running into these issues. We also introduce a `strictMode` which allows us to keep this as an additive change (log instead of throw) before promoting it to a breaking change in a later release (always throw). We want this `strictMode` in case people don't wish to upgrade to a breaking change release, but they can still benefit from the improvements we've made here.
The experiments themselves were the tricky and time-consuming part.
The full experiment runs and details are at https://github.com/ably-labs/llm-eval/pull/10
What this PR does
Two SDK calls fail silently today:
- `presence.get()` on a channel without `presence_subscribe` returns an empty list.
- `channel.subscribe()` on a channel without `subscribe` registers a listener that never fires.
Both with no error and no warning. A developer or AI agent gets back "nothing", assumes it works, and ships broken code.
This PR makes those calls speak up: a warning log by default, and, if you set `clientOptions.strictMode`, a thrown error with a hint instead of the silent result. This is the v2.x step of the A5/B5 rollout in DXRFC-022 (warn now; `strictMode` becomes the default in v3).
How we tested it
We had AI agents do small Ably coding tasks, with no web access, treating the SDK as a black box. Each
task ran against three versions of the SDK:
- `strictMode: true`: the call throws instead of returning silently.
(A fourth, B-verbose, is the warning shown only if the developer raised the log level themselves.)
About 370 runs, 0 invalid, on Opus 4.8 and Sonnet 4.6, plus Haiku 4.5 on the last experiment. The agents can't cheat by swapping in a better API key because the harness blocks it. Every result is compared against the currently-published `ably@2.21.0`.
We labelled each run by what the agent did:
The main result
On the strong models, the surfacing barely changed anything — because Opus and Sonnet usually work out
the cause on their own. At baseline they shipped broken code only about 10% of the time, so there was
little left to fix.
But that's the easy case: a capable model, full attention, a tiny task. In real use a strong model
often has a full or long context window, or is working in a large unfamiliar codebase, and can't give
each empty result that much attention. We can't easily simulate "Opus with a full context window", so
we used Haiku as a stand-in for a strong model that can't fully focus — same task, less reasoning
brought to bear.
That is where the surfacing matters. On the keystone task:
So the change removes most silent failures exactly when the model is stretched thin: the realistic case, not the exception.
This is a stand-in, not a direct measurement: we did not run Opus with a deliberately full context.
Haiku is the proxy. Measuring a frontier model under real load is the obvious next test.
The seven experiments
1. …works - there's no test." The key can't subscribe, so nothing arrives. Opus/Sonnet shipped broken 2/20 at baseline, 1/20 with B-error, 0/10 with C. More telling: at baseline, 0 of 20 runs reached the diagnosis from the SDK (they guessed, or tripped a loud server error by probing); with B-error, 20 of 20 did.
2. …models nearly always noticed (the task says someone is present, and a wrong guess triggers a loud server error anyway), so all arms scored about the same.
3. …(B-error) against the throw (C). Result: B-error ≈ C. The default warning does the job, and `strictMode` is the backstop for the rest.
4. …channel options. Every model and every arm fixed it cleanly in about 6 steps. When the bug is reported and the cause is in plain sight, the warning adds nothing.
5. …naming `strictMode`. 100% of runs found `strictMode` in the type definitions and switched it on. Agents read the JSDoc.
6. …`modes` replaces the defaults, so adding one mode silently drops presence. We buried this in a multi-file app and in a from-scratch build. Two-thirds of agents made the mistake, but on the strong models they all caught it themselves, so nothing shipped broken. The "wasted hours debugging" failures we'd heard about did not show up on the strong models at this scale.
7. …fixes a reported bug. The no-oracle task (1) is where the 50% → 10% → 0% result above came from.
Where it doesn't help
Good next step: clearer JSDoc
Experiment 5 showed agents reliably read the SDK's type definitions: 100% found `strictMode` there. That makes the JSDoc a cheap, high-value place to stop these failures before they happen.
The clearest example is the bug in experiment 6. The JSDoc for `ChannelOptions.modes` just says "An array of ChannelMode objects." It does not say that setting `modes` replaces the defaults instead of adding to them, which is the exact cause of that silent failure. One added line ("setting `modes` replaces the default set; list every mode you need") would stop agents making the mistake at all, in the place they actually look.
This PR makes failures speak up at runtime. Clearer JSDoc on `modes` (and related options) would prevent some of them up front. The two together are the natural follow-up.
Summary by CodeRabbit
New Features
strictModesetting (default:false) to convert certain previously silent failure cases into thrown errors with clearer, hint-rich messages.Bug Fixes
subscribe()andpresence.get(), including consistent error codes/hints and reduced repeated warnings (logging is limited unless strict mode is enabled).Tests
strictModebehavior for missingsubscribeandpresence_subscribemodes.