From 57ccc488050d616a8e13f2680076d997dbe8a8f4 Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Tue, 16 Sep 2025 08:36:51 +0100 Subject: [PATCH 01/50] Sticky Events --- proposals/4354-sticky-events.md | 337 ++++++++++++++++++++++++++++++++ 1 file changed, 337 insertions(+) create mode 100644 proposals/4354-sticky-events.md diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md new file mode 100644 index 00000000000..3bb90d6116c --- /dev/null +++ b/proposals/4354-sticky-events.md @@ -0,0 +1,337 @@ +# MSC4354: Sticky Events + +MatrixRTC currently depends on [MSC3757](https://github.com/matrix-org/matrix-spec-proposals/pull/3757) +for sending per-user per-device state. MatrixRTC wants to be able to share a temporary state to all +users in a room to indicate whether the given client is in the call or not. + +The concerns with MSC3757 and using it for MatrixRTC are mainly: + +1. In order to ensure other users are unable to modify each other’s state, it proposes using + string packing for authorization which feels wrong, given the structured nature of events. +2. Allowing unprivileged users to send arbitrary amounts of state into the room is a potential + abuse vector, as these states can pile up and can never be cleaned up as the DAG is append-only. +3. State resolution can cause rollbacks. These rollbacks may inadvertently affect per-user per-device state. + +Other proposals have similar problems such as live location sharing which uses state events when it +really just wants per-user last-write-wins behaviour. + +There currently exists no good communication primitive in Matrix to send this kind of data. EDUs are +almost the right primitive, but: + +* They can’t be sent via clients (there is no concept of EDUs in the Client-Server API\!) +* They aren’t extensible. +* They do not guarantee delivery. 
Each EDU type has slightly different persistence/delivery guarantees, + all of which currently fall short of guaranteeing delivery. + +This proposal adds such a primitive, called Sticky Events, which provides the following guarantees: + +* Eventual delivery (with timeouts) and convergence. +* Access control tied to the joined members in the room. +* Extensible, able to be sent by clients. + +This new primitive can be used to implement MatrixRTC participation, live location sharing, among other functionality. + +## Proposal + +Message events can be annotated with a new top-level `sticky` key, which MUST have a `duration_ms`, +which is the number of milliseconds for the event to be sticky. The presence of `sticky.duration_ms` +with a valid value makes the event “sticky”[^stickyobj]. Valid values are the integer range 0-3600000 (1 hour). + +```json +{ + "type": "m.rtc.member", + "sticky": { + "duration_ms": 600000 + }, + "sender": "@alice:example.com", + "room_id": "!foo", + "origin_server_ts": 1757920344000, + "content": { ... } +} +``` + +This key can be set by clients in the CS API by a new query parameter `stick_duration_ms`, which is +added to the following endpoints: + +* `PUT /\_matrix/client/v3/rooms/{roomId}/send/{eventType}/{txnId}` +* `PUT /\_matrix/client/v3/rooms/{roomId}/state/{eventType}/{stateKey}` + +To calculate if any sticky event is still sticky: + +* Calculate the start time: + * The start time is `min(now, origin_server_ts)`. This ensures that malicious origin timestamps cannot + specify start times in the future. + * If the event is pushed via `/send`, servers MAY use the current time as the start time. This minimises + the risk of clock skew causing the start time to be too far in the past. See “Potential issues \> Time”. +* Calculate the end time as `start_time + min(stick_duration_ms, 3600000)`. +* If the end time is in the future, the event remains sticky. 
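The expiry calculation above can be sketched as follows. This is a minimal illustration rather than a reference implementation: the `StickyEventLike` shape and the function names are hypothetical, and it assumes that a duration above the 1 hour limit is clamped (as the end-time formula suggests) rather than treated as invalid. The optional "use the current time for `/send`" behaviour is not modelled.

```typescript
// Hypothetical minimal shape: only the fields the expiry calculation reads.
interface StickyEventLike {
  origin_server_ts: number;
  sticky?: { duration_ms?: unknown };
}

// Hard upper bound on stickiness from the proposal (1 hour).
const MAX_STICKY_DURATION_MS = 3600000;

// Returns the time (ms since epoch) at which the event stops being sticky,
// or null if the event carries no usable `sticky.duration_ms`.
function stickyEndTime(event: StickyEventLike, nowMs: number): number | null {
  const duration = event.sticky?.duration_ms;
  if (typeof duration !== "number" || !Number.isInteger(duration) || duration < 0) {
    return null;
  }
  // Per the proposal: start time is min(now, origin_server_ts), so it is never in the future.
  const startMs = Math.min(nowMs, event.origin_server_ts);
  // Assumption: over-long durations are clamped, following the end-time formula.
  return startMs + Math.min(duration, MAX_STICKY_DURATION_MS);
}

// An event remains sticky only while its end time is strictly in the future.
function isSticky(event: StickyEventLike, nowMs: number): boolean {
  const endMs = stickyEndTime(event, nowMs);
  return endMs !== null && endMs > nowMs;
}
```

A server or client could call `isSticky(event, Date.now())` whenever it decides whether to keep treating an event as sticky.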
+ +Sticky events are like normal message events and are authorised using normal PDU checks. They have the +following _additional_ properties: + +* They are eagerly synchronised with all other servers.[^partial] +* They must appear in the `/sync` response.[^sync] +* The soft-failure checks MUST be re-evaluated when the membership state changes for a user with unexpired sticky events.[^softfail] + +To implement these properties, servers MUST: + +* Attempt to send all sticky events to all joined servers, whilst respecting per-server backoff times. + Large volumes of events to send MUST NOT cause the sticky event to be dropped from the send queue on the server. +* Ensure all sticky events are delivered to clients via `/sync` in a new section of the sync response, + regardless of whether the sticky event falls within the timeline limit of the request. +* When a new server joins the room, the server MUST attempt delivery of all sticky events immediately. +* Remember sticky events per-user, per-room such that the soft-failure checks can be re-evaluated. + +When an event loses its stickiness, these properties disappear with the stickiness. Servers SHOULD NOT +eagerly synchronise such events anymore, nor send them down `/sync`, nor re-evaluate their soft-failure status. +Note: policy servers and other similar antispam techniques still apply to these events. + +The new sync section looks like: + +```json +{ + "rooms": { + "join": { + "!726s6s6q:example.com": { + "account_data": { ... }, + "ephemeral": { ... }, + "state": { ... }, + "timeline": { ... }, + "sticky_events": [ + { + "event": { + "sender": "@bob:example.com", + "type": "m.foo", + "sticky": { + "duration_ms": 300000 + }, + "origin_server_ts": 1757920344000, + "content": { ... } + }, + "prev_batch": "s1234_5678_90123" + }, + { + "event": { + "sender": "@alice:example.com", + "type": "m.foo", + "sticky": { + "duration_ms": 300000 + }, + "origin_server_ts": 1757920311020, + "content": { ... 
} + }, + "prev_batch": "s1234_5678_90125" + } + ], + } + } + } +} +``` + +Over Simplified Sliding Sync, Sticky Events have their own extension `sticky_events`, which has the following response shape: + +```json +{ + "rooms": { + "!726s6s6q:example.com": [ + { + "prev_batch": "s1234_5678_90125", + "event": { + "sender": "@bob:example.com", + "type": "m.foo", + "sticky": { + "duration_ms": 300000 + }, + "origin_server_ts": 1757920344000, + "content": { ... } + } + } + ] + } +} +``` + +Sticky messages MAY be sent in the timeline section of the `/sync` response, regardless of whether +or not they exceed the timeline limit[^ordering]. + +Servers SHOULD rate limit sticky events over federation. If the rate limit kicks in, servers MUST +return a non-2xx status code from `/send` such that the sending server *retries the request* in order +to guarantee that the sticky event is eventually delivered. Servers MUST NOT silently drop sticky events +and return 200 OK from `/send`, as this breaks the eventual delivery guarantee. + +These messages may be combined with [MSC4140: Delayed Events](https://github.com/matrix-org/matrix-spec-proposals/pull/4140) +to provide heartbeat semantics (e.g required for MatrixRTC). Note that the sticky duration in this proposal +is distinct from that of delayed events. The purpose of the sticky duration in this proposal is to ensure sticky events are cleaned up. + +### Implementing a map + +MatrixRTC relies on a per-user, per-device map of RTC member events. To implement this, this MSC proposes +a standardised mechanism for determining keys on sticky events, the `content.sticky_key` property: + +```json +{ + "type": "m.rtc.member", + "sticky": { + "duration_ms": 300000 + }, + "sender": "@alice:example.com", + "room_id": "!foo", + "origin_server_ts": 1757920344000, + "content": { + "sticky_key": "LAPTOPXX123", + ... + } +} +``` + +`content.sticky_key` is ignored server-side[^encryption] and is purely informational. 
Clients which +receive a sticky event with a sticky key SHOULD keep a map with keys determined via the 4-uple +`(room_id, sender, type, content.sticky_key)` to track the current values in the map. Nothing stops +users sending multiple events with the same `sticky_key`. To deterministically tie-break, clients which +implement this behaviour MUST: + +- pick the one with the highest `origin_server_ts`, +- tie break on the one with the highest lexicographical event ID (A < Z). + +When overwriting keys, clients SHOULD use the same sticky duration as the previous sticky event to avoid clients diverging. +This can happen when a client sends a sticky event with key K with a long timeout, then overwrites it with the same key K’ +with a short timeout. If the sticky event K’ fails to be sent to all servers before the short timeout is hit, +some clients will believe the state is K and others will have no state. This will only resolve once the long timeout is hit. + +Note that encrypted sticky events will encrypt some parts of the 4-uple. 
An encrypted sticky event only exposes the room ID and sender to the server: + +```json +{ + "content": { + "algorithm": "m.megolm.v1.aes-sha2", + "ciphertext": "AwgCEqABubgx7p8AThCNreFNHqo2XJCG8cMUxwVepsuXAfrIKpdo8UjxyAsA50IOYK6T5cDL4s/OaiUQdyrSGoK5uFnn52vrjMI/+rr8isPzl7+NK3hk1Tm5QEKgqbDJROI7/8rX7I/dK2SfqN08ZUEhatAVxznUeDUH3kJkn+8Onx5E0PmQLSzPokFEi0Z0Zp1RgASX27kGVDl1D4E0vb9EzVMRW1PrbdVkFlGIFM8FE8j3yhNWaWE342eaj24NqnnWJ5VG9l2kT/hlNwUenoGJFMzozjaUlyjRIMpQXqbodjgyQkGacTEdhBuwAQ", + "device_id": "AAvTvsyf5F", + "sender_key": "KVMNIv/HyP0QMT11EQW0X8qB7U817CUbqrZZCsDgeFE", + "session_id": "c4+O+eXPf0qze1bUlH4Etf6ifzpbG3YeDEreTVm+JZU" + }, + "origin_server_ts": 1757948616527, + "sender": "@alice:example.com", + "type": "m.room.encrypted", + "sticky": { + "duration_ms": 600000 + }, + "event_id": "$lsFIWE9JcIMWUrY3ZTOKAxT_lIddFWLdK6mqwLxBchk", + "room_id": "!ffCSThQTiVQJiqvZjY:matrix.org" +} +``` + +The decrypted event would contain the `type` and `content.sticky_key`. + +## Potential issues + +### Time + +Servers who can’t maintain correct clock frequency may expire sticky events at a slightly slower/faster rate +than other servers. As the maximum timeout is relatively low, the total deviation is also reasonably low, +making this less problematic. The alternative of explicitly sending an expiration event would likely cause +more deviation due to retries than deviations due to clocks. + +Servers with significant clock skew may set `origin_server_ts` too far in the past or future. If the value +is too far in the past this will cause sticky events to expire quicker than they should, or to always be +treated as expired. If the value is too far in the future, this has no effect as it is bounded by the current time. +As such, this proposal relies somewhat on NTP to ensure clocks over federation are roughly in sync. 
+As a consequence of this, the sticky duration SHOULD NOT be set to below 5 minutes.[^ttl] + +### Encryption + +Encrypted sticky events reduce reliability as in order for a sticky event to be visible to the end user it +requires *both* the sending client to think the receiver is joined (so we encrypt for their devices) and the +receiving server to think the sender is joined (so it passes auth checks). Unencrypted events only strictly +require the receiving server to think the sender is joined. + +The lack of historical room key sharing may make some encrypted sticky events undecryptable when new users join the room. + +### Spam + +Servers may send every event as a sticky event, causing a higher amount of events to be sent eagerly over federation +and to be sent down `/sync` to clients. The former is already an issue as servers can simply `/send` many events. +The latter is a new abuse vector, as up until this point the `timeline_limit` would restrict the amount of events +that arrive on client devices (only state events are unbounded and setting state is a privileged operation). +This proposal has the following protections in place: + +* All sticky events expire, with a hard limit of 1 hour. The hard limit ensures that servers cannot set years-long expiry times. + This ensures that the data in the `/sync` response can go down and not grow unbounded. +* All sticky events are subject to normal PDU checks, meaning that the sender must be authorised to send events into the room. +* Servers sending lots of sticky events may be asked to try again later as a form of rate-limiting. + Due to data expiring, subsequent requests will gradually have less data. + +## Alternatives + +### Use state events + +We could do [MSC3757](https://github.com/matrix-org/matrix-spec-proposals/pull/3757), but for the +reasons mentioned at the start we don’t really want to do so. 
+ +### Make stickiness persistent not ephemeral + +There are arguments that, at least for some use cases, we don’t want these sticky events to timeout. +However, that opens the possibility of bloating the `/sync` response with sticky events. + +Suggestions for minimizing that have been to have a hard limit on the number of sticky events a user can have per room, +instead of a timeout. However, this has two drawbacks: a) you still may end up with substantial bloat as stale data doesn’t +automatically get reaped (even if the amount of bloat is limited), and b) what do clients do if there are already too many +sticky events? The latter is tricky, as deleting the oldest may not be what the user wants if it happens to be not-stale data, +and asking the user what data it wants to delete vs keep is unergonomic. + +Non-expiring sticky events could be added later if the above issues are resolved. + +### Have a dedicated ‘ephemeral user state’ section + +Early prototypes of this proposal devised a key-value map with timeouts maintained over EDUs rather than PDUs. +This early proposal had much the same feature set as this proposal but with one major difference: equivocation. +Servers could broadcast different values for the same key to different servers, causing the map to not converge: +the Byzantine Broadcast problem. Matrix already has a data structure to agree on shared state: the room DAG. +As such, this led to the prototype to the current proposal. By putting the data into the DAG, other servers +can talk to each other via it to see if they have been told different values. When combined with a simple +conflict resolution algorithm (which works because there is [no need for coordination](https://arxiv.org/abs/1901.01930)), +this provides a way for clients to agree on the same values. Note that in practice this needs servers to *eagerly* +share forward extremities so servers aren’t reliant on unrelated events being sent in order to check for equivocation. 
+Currently, there is no mechanism for servers to express “these are my latest events, what are yours?” without actually sending another event. + +## Security Considerations + +Servers may equivocate over federation and send different events to different servers in an attempt to cause +the key-value map maintained by clients to not converge. Alternatively, servers may fail to send sticky events +to their own clients to produce the same outcome. Federation equivocation is mitigated by the events being +persisted in the DAG, as servers can talk to each other to fetch all events. There is no way to protect against +dropped updates for the latter scenario. + +## Unstable Prefix + +- The `stick_duration_ms` query param is `msc4354_stick_duration_ms`. +- The `sticky` key in the PDU is `msc4354_sticky`. +- The `/sync` response section is `msc4354_sticky_events`. +- The sticky key in the `content` of the PDU is `msc4354_sticky_key`. + +[^stickyobj]: The presence of the `sticky` object alone is insufficient. +[^partial]: Over federation, servers are not required to send all timeline events to every other server. +Servers mostly lazy load timeline events, and will rely on clients hitting `/messages` which in turn +hits`/backfill` to request events from federated servers. +[^sync]: Normal timeline events do not always appear in the sync response if the event is more than `timeline_limit` events away. +[^softfail]: Not all servers will agree on soft-failure status due to the check considering the “current state” of the room. +To ensure all servers agree on which events are sticky, we need to re-evaluate this rule when the current room state changes. +This becomes particularly important when room state is rolled back. For example, if Charlie sends some sticky event E and +then Bob kicks Charlie, but concurrently Alice kicks Bob then whether or not a receiving server would accept E would depend +on whether they saw “Alice kicks Bob” or “Bob kicks Charlie”. 
If they saw “Alice kicks Bob” then E would be accepted. If they +saw “Bob kicks Charlie” then E would be rejected, and would need to be rolled back when they see “Alice kicks Bob”. +[^ordering]: Sticky events expose gaps in the timeline which cannot be expressed using the current sync API. If sync used +something like [stitched ordering](https://codeberg.org/andybalaam/stitched-order) +or [MSC3871](https://github.com/matrix-org/matrix-spec-proposals/pull/3871) then sticky events could be inserted straight +into the timeline without any additional section, hence “MAY” would enable this behaviour in the future. +[^encryption]: Previous versions of this proposal had the key be at the top-level of the event JSON so servers could +implement map-like semantics on client’s behalf. However, this would force the key to remain visible to the server and +thus leak metadata. As a result, the key now falls within the encrypted `content` payload, and clients are expected to +implement the map-like semantics should they wish to. +[^ttl]: Earlier designs had servers inject a new `unsigned.ttl_ms` field into the PDU to say how many milliseconds were left. +This was problematic because it would have to be modified every time the server attempted delivery of the event to another server. +Furthermore, it didn’t really add any more protection because it assumed servers honestly set the value. +Malicious servers could set the TTL to be the maximum allowed time all the time, ensuring maximum divergence +on whether or not an event was sticky. In contrast, using `origin_server_ts` is a consistent reference point +that all servers are guaranteed to see, limiting the ability for malicious servers to cause divergence as all +servers approximately track NTP. 
\ No newline at end of file From 94b1a875db403213529c22e764fe7c3841689e51 Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Tue, 16 Sep 2025 08:51:16 +0100 Subject: [PATCH 02/50] Remove prev_batch It wasn't particularly useful for clients, and doesn't help equivocation much. --- proposals/4354-sticky-events.md | 68 +++++++++++++++------------------ 1 file changed, 30 insertions(+), 38 deletions(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 3bb90d6116c..69dfcdd2024 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -53,8 +53,8 @@ with a valid value makes the event “sticky”[^stickyobj]. Valid values are th This key can be set by clients in the CS API by a new query parameter `stick_duration_ms`, which is added to the following endpoints: -* `PUT /\_matrix/client/v3/rooms/{roomId}/send/{eventType}/{txnId}` -* `PUT /\_matrix/client/v3/rooms/{roomId}/state/{eventType}/{stateKey}` +* `PUT /_matrix/client/v3/rooms/{roomId}/send/{eventType}/{txnId}` +* `PUT /_matrix/client/v3/rooms/{roomId}/state/{eventType}/{stateKey}` To calculate if any sticky event is still sticky: @@ -97,33 +97,28 @@ The new sync section looks like: "ephemeral": { ... }, "state": { ... }, "timeline": { ... }, - "sticky_events": [ - { - "event": { - "sender": "@bob:example.com", - "type": "m.foo", - "sticky": { - "duration_ms": 300000 - }, - "origin_server_ts": 1757920344000, - "content": { ... } - }, - "prev_batch": "s1234_5678_90123" - }, - { - "event": { - "sender": "@alice:example.com", - "type": "m.foo", - "sticky": { - "duration_ms": 300000 - }, - "origin_server_ts": 1757920311020, - "content": { ... } - }, - "prev_batch": "s1234_5678_90125" - } - ], - } + "sticky_events": { + "events": [ + { + "sender": "@bob:example.com", + "type": "m.foo", + "sticky": { + "duration_ms": 300000 + }, + "origin_server_ts": 1757920344000, + "content": { ... 
} + }, + { + "sender": "@alice:example.com", + "type": "m.foo", + "sticky": { + "duration_ms": 300000 + }, + "origin_server_ts": 1757920311020, + "content": { ... } + } + ] + } } } } @@ -134,10 +129,8 @@ Over Simplified Sliding Sync, Sticky Events have their own extension `sticky_eve ```json { "rooms": { - "!726s6s6q:example.com": [ - { - "prev_batch": "s1234_5678_90125", - "event": { + "!726s6s6q:example.com": { + "events": [{ "sender": "@bob:example.com", "type": "m.foo", "sticky": { @@ -145,15 +138,14 @@ Over Simplified Sliding Sync, Sticky Events have their own extension `sticky_eve }, "origin_server_ts": 1757920344000, "content": { ... } - } - } - ] + }] + } } } ``` Sticky messages MAY be sent in the timeline section of the `/sync` response, regardless of whether -or not they exceed the timeline limit[^ordering]. +or not they exceed the timeline limit[^ordering]. Servers SHOULD rate limit sticky events over federation. If the rate limit kicks in, servers MUST return a non-2xx status code from `/send` such that the sending server *retries the request* in order @@ -334,4 +326,4 @@ Furthermore, it didn’t really add any more protection because it assumed serve Malicious servers could set the TTL to be the maximum allowed time all the time, ensuring maximum divergence on whether or not an event was sticky. In contrast, using `origin_server_ts` is a consistent reference point that all servers are guaranteed to see, limiting the ability for malicious servers to cause divergence as all -servers approximately track NTP. \ No newline at end of file +servers approximately track NTP. 
From 50d76e6af2ee9cd25463de99064f7aa4368e467c Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Tue, 16 Sep 2025 09:01:25 +0100 Subject: [PATCH 03/50] Update 4354-sticky-events.md --- proposals/4354-sticky-events.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 69dfcdd2024..7321524b7d3 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -323,7 +323,7 @@ implement the map-like semantics should they wish to. [^ttl]: Earlier designs had servers inject a new `unsigned.ttl_ms` field into the PDU to say how many milliseconds were left. This was problematic because it would have to be modified every time the server attempted delivery of the event to another server. Furthermore, it didn’t really add any more protection because it assumed servers honestly set the value. -Malicious servers could set the TTL to be the maximum allowed time all the time, ensuring maximum divergence +Malicious servers could set the TTL to be 0 ~ `sticky.duration_ms` , ensuring maximum divergence on whether or not an event was sticky. In contrast, using `origin_server_ts` is a consistent reference point that all servers are guaranteed to see, limiting the ability for malicious servers to cause divergence as all servers approximately track NTP. From 3baf0d89c8c27fd1af56b9a95b4518939812e478 Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Tue, 16 Sep 2025 12:56:35 +0100 Subject: [PATCH 04/50] Update 4354-sticky-events.md --- proposals/4354-sticky-events.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 7321524b7d3..2ba1cbb925a 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -97,7 +97,7 @@ The new sync section looks like: "ephemeral": { ... }, "state": { ... 
}, "timeline": { ... }, - "sticky_events": { + "sticky": { "events": [ { "sender": "@bob:example.com", From 29e9bf736afb6d9b3f53ee323e5ef4d1e95d9e5a Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Tue, 16 Sep 2025 13:55:37 +0100 Subject: [PATCH 05/50] Update 4354-sticky-events.md --- proposals/4354-sticky-events.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 2ba1cbb925a..d7ee5015040 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -18,7 +18,8 @@ really just wants per-user last-write-wins behaviour. There currently exists no good communication primitive in Matrix to send this kind of data. EDUs are almost the right primitive, but: -* They can’t be sent via clients (there is no concept of EDUs in the Client-Server API\!) +* They can’t be sent via clients (there is no concept of EDUs in the Client-Server API\! + [MSC2477](https://github.com/matrix-org/matrix-spec-proposals/pull/2477) tries to change that) * They aren’t extensible. * They do not guarantee delivery. Each EDU type has slightly different persistence/delivery guarantees, all of which currently fall short of guaranteeing delivery. @@ -145,7 +146,8 @@ Over Simplified Sliding Sync, Sticky Events have their own extension `sticky_eve ``` Sticky messages MAY be sent in the timeline section of the `/sync` response, regardless of whether -or not they exceed the timeline limit[^ordering]. +or not they exceed the timeline limit[^ordering]. If a sticky event is in the timeline, it MAY be +omitted from the `sticky.events` section. This ensures we minimise duplication in the `/sync` response JSON. Servers SHOULD rate limit sticky events over federation. 
If the rate limit kicks in, servers MUST return a non-2xx status code from `/send` such that the sending server *retries the request* in order From b6e8159abdf4cfeace5f7c204538a71512bf85e1 Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Wed, 17 Sep 2025 08:56:05 +0100 Subject: [PATCH 06/50] Syntax --- proposals/4354-sticky-events.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index d7ee5015040..e7a0544d201 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -89,7 +89,7 @@ Note: policy servers and other similar antispam techniques still apply to these The new sync section looks like: -```json +```js { "rooms": { "join": { @@ -123,11 +123,12 @@ The new sync section looks like: } } } + ``` Over Simplified Sliding Sync, Sticky Events have their own extension `sticky_events`, which has the following response shape: -```json +```js { "rooms": { "!726s6s6q:example.com": { @@ -163,7 +164,7 @@ is distinct from that of delayed events. The purpose of the sticky duration in t MatrixRTC relies on a per-user, per-device map of RTC member events. To implement this, this MSC proposes a standardised mechanism for determining keys on sticky events, the `content.sticky_key` property: -```json +```js { "type": "m.rtc.member", "sticky": { @@ -195,7 +196,7 @@ some clients will believe the state is K and others will have no state. This wil Note that encrypted sticky events will encrypt some parts of the 4-uple. 
An encrypted sticky event only exposes the room ID and sender to the server: -```json +```js { "content": { "algorithm": "m.megolm.v1.aes-sha2", From 33ec282c3f52f49e36650855dea980b5a6d9b989 Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Thu, 18 Sep 2025 13:06:59 +0100 Subject: [PATCH 07/50] Update proposals/4354-sticky-events.md Co-authored-by: Johannes Marbach --- proposals/4354-sticky-events.md | 1 + 1 file changed, 1 insertion(+) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index e7a0544d201..9d0dacb6e8c 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -120,6 +120,7 @@ The new sync section looks like: } ] } + } } } } From 7725f74d04b0b30b28c24bc613fc9e6ba6b5f3db Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Thu, 18 Sep 2025 14:08:12 +0100 Subject: [PATCH 08/50] Update proposals/4354-sticky-events.md Co-authored-by: Johannes Marbach --- proposals/4354-sticky-events.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 9d0dacb6e8c..98c79cd803c 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -37,6 +37,8 @@ This new primitive can be used to implement MatrixRTC participation, live locati Message events can be annotated with a new top-level `sticky` key, which MUST have a `duration_ms`, which is the number of milliseconds for the event to be sticky. The presence of `sticky.duration_ms` with a valid value makes the event “sticky”[^stickyobj]. Valid values are the integer range 0-3600000 (1 hour). +For use cases that require stickiness beyond this limit, the application is responsible for sending another +event to make it happen. 
```json { From 192c6b46f861d9207887cf097d53d60844bd20c1 Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Thu, 18 Sep 2025 14:20:00 +0100 Subject: [PATCH 09/50] Update 4354-sticky-events.md --- proposals/4354-sticky-events.md | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 98c79cd803c..b991b4a48d7 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -221,6 +221,18 @@ Note that encrypted sticky events will encrypt some parts of the 4-uple. An encr The decrypted event would contain the `type` and `content.sticky_key`. +If a client wishes to implement access control in this key-value map based on the power levels event, +they must ensure that they accept all writes in order to ensure all clients converge. For an example +where this goes wrong if you don't, consider the case where two events are sent concurrently: + - Alice sets a key in this map which requires PL100, + - Alice is granted PL100 from PL0. + +If clients only update the map for authorised actions, then clients which see Alice setting a key before the +PL event will not update the map and hence forget the value. Clients which see the PL event first would then +accept Alice setting the key and the net result is divergence between clients. By always updating the map even +for unauthorised updates, we ensure that the arrival order doesn't affect the end result. Clients can then +choose whether or not to materialise/show/process a given key based on the current PL event. 
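As a rough client-side sketch of the behaviour described here, the map below records every write keyed on `(room_id, sender, type, content.sticky_key)`, tie-breaks on `origin_server_ts` and then event ID, and applies any power-level policy only when reading. The `StickyEvent` shape, the `StickyMap` class and the `isAuthorised` callback are hypothetical names used for illustration; expiry handling is omitted, and event IDs are compared with plain JavaScript string comparison, which is an assumption about what "lexicographical" means in practice.

```typescript
// Hypothetical minimal (decrypted) event shape for illustration only.
interface StickyEvent {
  event_id: string;
  room_id: string;
  sender: string;
  type: string;
  origin_server_ts: number;
  content: { sticky_key: string };
}

// True if `candidate` should replace `current` under the proposal's tie-break:
// highest origin_server_ts first, then highest event ID.
function wins(candidate: StickyEvent, current: StickyEvent): boolean {
  if (candidate.origin_server_ts !== current.origin_server_ts) {
    return candidate.origin_server_ts > current.origin_server_ts;
  }
  return candidate.event_id > current.event_id;
}

class StickyMap {
  private entries = new Map<string, StickyEvent>();

  // Encode the 4-tuple as a JSON array so distinct tuples never collide.
  private static keyOf(e: StickyEvent): string {
    return JSON.stringify([e.room_id, e.sender, e.type, e.content.sticky_key]);
  }

  // Always record the write. Authorisation is deliberately NOT checked here,
  // so the arrival order of events cannot change the stored result.
  apply(e: StickyEvent): void {
    const key = StickyMap.keyOf(e);
    const current = this.entries.get(key);
    if (current === undefined || wins(e, current)) {
      this.entries.set(key, e);
    }
  }

  // Access control is applied at read time, against whatever the caller
  // currently believes the power levels to be.
  visible(isAuthorised: (e: StickyEvent) => boolean): StickyEvent[] {
    return Array.from(this.entries.values()).filter(isAuthorised);
  }
}
```

Because `apply` never rejects a write, two clients that receive the same events in different orders end up with the same entries, and can re-run `visible` whenever the power levels event changes.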
+ ## Potential issues ### Time From 97c9c5b70f9b2ea53cb9c7b6530e3514297e7988 Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Fri, 19 Sep 2025 11:01:53 +0100 Subject: [PATCH 10/50] Update 4354-sticky-events.md --- proposals/4354-sticky-events.md | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index b991b4a48d7..5eff6f57e98 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -26,7 +26,7 @@ almost the right primitive, but: This proposal adds such a primitive, called Sticky Events, which provides the following guarantees: -* Eventual delivery (with timeouts) and convergence. +* Eventual delivery (with timeouts) with convergence. * Access control tied to the joined members in the room. * Extensible, able to be sent by clients. @@ -34,7 +34,7 @@ This new primitive can be used to implement MatrixRTC participation, live locati ## Proposal -Message events can be annotated with a new top-level `sticky` key, which MUST have a `duration_ms`, +Message events can be annotated with a new top-level `sticky` key[^toplevel], which MUST have a `duration_ms`, which is the number of milliseconds for the event to be sticky. The presence of `sticky.duration_ms` with a valid value makes the event “sticky”[^stickyobj]. Valid values are the integer range 0-3600000 (1 hour). For use cases that require stickiness beyond this limit, the application is responsible for sending another @@ -70,7 +70,7 @@ To calculate if any sticky event is still sticky: * If the end time is in the future, the event remains sticky. Sticky events are like normal message events and are authorised using normal PDU checks. 
They have the -following _additional_ properties: +following _additional_ properties[^prop]: * They are eagerly synchronised with all other servers.[^partial] * They must appear in the `/sync` response.[^sync] @@ -319,7 +319,12 @@ dropped updates for the latter scenario. - The `/sync` response section is `msc4354_sticky_events`. - The sticky key in the `content` of the PDU is `msc4354_sticky_key`. +[^toplevel]: This has to be at the top-level as we want to support _encrypted_ sticky events, and therefore metadata the server +needs cannot be within `content`. [^stickyobj]: The presence of the `sticky` object alone is insufficient. +[^prop]: An interesting observation is that these additional properties are entirely to handle edge cases. In the happy case, +events _are sent_ to others over federation, they _aren't_ soft-failed and they _do appear_ down `/sync`. This MSC just makes this +reliable. [^partial]: Over federation, servers are not required to send all timeline events to every other server. Servers mostly lazy load timeline events, and will rely on clients hitting `/messages` which in turn hits`/backfill` to request events from federated servers. From 8d101fd9589dc7548a4af933302b348044d1ce20 Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Fri, 19 Sep 2025 11:02:48 +0100 Subject: [PATCH 11/50] Update 4354-sticky-events.md --- proposals/4354-sticky-events.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 5eff6f57e98..d497ae7653a 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -38,7 +38,8 @@ Message events can be annotated with a new top-level `sticky` key[^toplevel], wh which is the number of milliseconds for the event to be sticky. The presence of `sticky.duration_ms` with a valid value makes the event “sticky”[^stickyobj]. Valid values are the integer range 0-3600000 (1 hour). 
For use cases that require stickiness beyond this limit, the application is responsible for sending another -event to make it happen. +event to make it happen. The `sticky` key is not protected from redaction. A redacted sticky event is the same +as a normal event. ```json { From c75e19c5c08ac55bf6dcec3348544fad3cc04e75 Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Fri, 19 Sep 2025 11:24:58 +0100 Subject: [PATCH 12/50] Update 4354-sticky-events.md --- proposals/4354-sticky-events.md | 23 +++++++++++++++++++++-- 1 file changed, 21 insertions(+), 2 deletions(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index d497ae7653a..645b0e50e36 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -257,14 +257,18 @@ receiving server to think the sender is joined (so it passes auth checks). Unenc require the receiving server to think the sender is joined. The lack of historical room key sharing may make some encrypted sticky events undecryptable when new users join the room. +[MSC4268: Sharing room keys for past messages](https://github.com/matrix-org/matrix-spec-proposals/pull/4268) would +help with this. ### Spam Servers may send every event as a sticky event, causing a higher amount of events to be sent eagerly over federation and to be sent down `/sync` to clients. The former is already an issue as servers can simply `/send` many events. The latter is a new abuse vector, as up until this point the `timeline_limit` would restrict the amount of events -that arrive on client devices (only state events are unbounded and setting state is a privileged operation). -This proposal has the following protections in place: +that arrive on client devices (only state events are unbounded and setting state is a privileged operation). 
Even so, +if a client was actively syncing then they would see all these events anyway, so it's only really a concern for "gappy" +incremental syncs when the client was not actively syncing and has now started their application. This proposal has the +following protections in place: * All sticky events expire, with a hard limit of 1 hour. The hard limit ensures that servers cannot set years-long expiry times. This ensures that the data in the `/sync` response can go down and not grow unbounded. @@ -272,6 +276,21 @@ This proposal has the following protections in place: * Servers sending lots of sticky events may be asked to try again later as a form of rate-limiting. Due to data expiring, subsequent requests will gradually have less data. +We could add a layer of indirection to the `/sync` response where we only announce the number of sticky events, and +expect the client to fetch them when they are ready via a different endpoint. This has roughly the same bandwidth cost, but +the client chooses when to pull in this information, reducing the time-to-interactivity. This has a few problems: + - It assumes sticky events are not urgently required when opening the application. This may be true for something like live + location sharing but may not be true for VoIP calls. + - It's not clear that there is a strong need for the extra indirection, given the strong rate limits and expirations already in + place. + - Adding the indirection increases complexity and friction when using the API, and presupposes the standard `/sync` model. + For [MSC4186: Simplified Sliding Sync](https://github.com/matrix-org/matrix-spec-proposals/pull/4186), clients can already indirect + if they wish to by simply not enabling the extension until they are ready to receive the data. Therefore, any new `/get_sticky_events` + API would really only be useful for A) applications which do not sync, B) users of the existing `/sync` API. 
The use case for applications + which do not sync is weak, given the entire point of sticky events is to ensure rapid synchronisation of temporary data. This heavily + implies the use of some kind of syncing mechanism to receive timely updates, which polling a `/get_sticky_events` endpoint subverts. + + ## Alternatives ### Use state events From c925a4cfbed5816aa33268b70be92844fb501f7f Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Fri, 19 Sep 2025 15:54:08 +0100 Subject: [PATCH 13/50] Update 4354-sticky-events.md --- proposals/4354-sticky-events.md | 1 + 1 file changed, 1 insertion(+) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 645b0e50e36..3feee7a91a3 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -76,6 +76,7 @@ following _additional_ properties[^prop]: * They are eagerly synchronised with all other servers.[^partial] * They must appear in the `/sync` response.[^sync] * The soft-failure checks MUST be re-evaluated when the membership state changes for a user with unexpired sticky events.[^softfail] +* They ignore history visibility checks. Any joined user is authorised to see sticky events for the duration they remain sticky. 
To implement these properties, servers MUST: From 6524be23c1b6c4b4da97a2e4151fb3662e5ff915 Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Mon, 22 Sep 2025 08:11:19 +0100 Subject: [PATCH 14/50] Update 4354-sticky-events.md --- proposals/4354-sticky-events.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 3feee7a91a3..7e962767f96 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -315,7 +315,11 @@ Non-expiring sticky events could be added later if the above issues are resolved ### Have a dedicated ‘ephemeral user state’ section Early prototypes of this proposal devised a key-value map with timeouts maintained over EDUs rather than PDUs. -This early proposal had much the same feature set as this proposal but with one major difference: equivocation. +This early proposal had a similar overall feature set as this proposal but with two differences: + - The early proposal never persisted anything, whereas this one persists by default to the DAG (which could be deleted via message retention). + Some use cases would like to have persistence. + - The lack of any persistence enabled equivocation attacks. + Servers could broadcast different values for the same key to different servers, causing the map to not converge: the Byzantine Broadcast problem. Matrix already has a data structure to agree on shared state: the room DAG. As such, this led to the prototype to the current proposal. 
By putting the data into the DAG, other servers From d14448c7a6a887ece4358669f75d235a944869d3 Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Mon, 22 Sep 2025 14:46:25 +0100 Subject: [PATCH 15/50] Update 4354-sticky-events.md --- proposals/4354-sticky-events.md | 1 + 1 file changed, 1 insertion(+) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 7e962767f96..e1a34aa5b6a 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -343,6 +343,7 @@ dropped updates for the latter scenario. - The `sticky` key in the PDU is `msc4354_sticky`. - The `/sync` response section is `msc4354_sticky_events`. - The sticky key in the `content` of the PDU is `msc4354_sticky_key`. +- To enable this in SSS, the extension name is `org.matrix.msc4308.sticky_events`. [^toplevel]: This has to be at the top-level as we want to support _encrypted_ sticky events, and therefore metadata the server needs cannot be within `content`. From ce37b02f40b507c9b16f988f53e95963cc7558f7 Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Mon, 22 Sep 2025 14:46:55 +0100 Subject: [PATCH 16/50] Update 4354-sticky-events.md --- proposals/4354-sticky-events.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index e1a34aa5b6a..14714e0cc9e 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -343,7 +343,7 @@ dropped updates for the latter scenario. - The `sticky` key in the PDU is `msc4354_sticky`. - The `/sync` response section is `msc4354_sticky_events`. - The sticky key in the `content` of the PDU is `msc4354_sticky_key`. -- To enable this in SSS, the extension name is `org.matrix.msc4308.sticky_events`. +- To enable this in SSS, the extension name is `org.matrix.msc4354.sticky_events`. 
[^toplevel]: This has to be at the top-level as we want to support _encrypted_ sticky events, and therefore metadata the server needs cannot be within `content`. From caf3fcd819b5016f091df1301710334d5f819ae4 Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Tue, 23 Sep 2025 10:10:04 +0100 Subject: [PATCH 17/50] Update 4354-sticky-events.md --- proposals/4354-sticky-events.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 14714e0cc9e..6b543891292 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -341,7 +341,7 @@ dropped updates for the latter scenario. - The `stick_duration_ms` query param is `msc4354_stick_duration_ms`. - The `sticky` key in the PDU is `msc4354_sticky`. -- The `/sync` response section is `msc4354_sticky_events`. +- The `/sync` response section is `msc4354_sticky`. - The sticky key in the `content` of the PDU is `msc4354_sticky_key`. - To enable this in SSS, the extension name is `org.matrix.msc4354.sticky_events`. From ba01efd5f495240ab2271b05b900d2be6c9327a5 Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Tue, 23 Sep 2025 12:36:55 +0100 Subject: [PATCH 18/50] Update 4354-sticky-events.md --- proposals/4354-sticky-events.md | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 6b543891292..615dff70ea4 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -84,7 +84,7 @@ To implement these properties, servers MUST: Large volumes of events to send MUST NOT cause the sticky event to be dropped from the send queue on the server. * Ensure all sticky events are delivered to clients via `/sync` in a new section of the sync response, regardless of whether the sticky event falls within the timeline limit of the request. 
-* When a new server joins the room, the server MUST attempt delivery of all sticky events immediately. +* When a new server joins the room, existing servers MUST attempt delivery of all sticky events _originating from their server only_[^newjoiner]. * Remember sticky events per-user, per-room such that the soft-failure checks can be re-evaluated. When an event loses its stickiness, these properties disappear with the stickiness. Servers SHOULD NOT @@ -360,7 +360,10 @@ To ensure all servers agree on which events are sticky, we need to re-evaluate t This becomes particularly important when room state is rolled back. For example, if Charlie sends some sticky event E and then Bob kicks Charlie, but concurrently Alice kicks Bob then whether or not a receiving server would accept E would depend on whether they saw “Alice kicks Bob” or “Bob kicks Charlie”. If they saw “Alice kicks Bob” then E would be accepted. If they -saw “Bob kicks Charlie” then E would be rejected, and would need to be rolled back when they see “Alice kicks Bob”. +saw “Bob kicks Charlie” then E would be rejected, and would need to be rolled back when they see “Alice kicks Bob”. +[^newjoiner]: We restrict delivery of sticky events to ones sent locally to reduce the number of events sent on join. If +we sent all active sticky events then the number of received events by the new joiner would be `O(nm)` where `n` = number of joined servers, +`m` = number of active sticky events. [^ordering]: Sticky events expose gaps in the timeline which cannot be expressed using the current sync API. 
If sync used something like [stitched ordering](https://codeberg.org/andybalaam/stitched-order) or [MSC3871](https://github.com/matrix-org/matrix-spec-proposals/pull/3871) then sticky events could be inserted straight From 06d7aa59b850ad3314dd3328a129ede27f1b1bcb Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Wed, 24 Sep 2025 17:04:18 +0100 Subject: [PATCH 19/50] Update 4354-sticky-events.md --- proposals/4354-sticky-events.md | 49 +++++++++++++++++++++++++++++---- 1 file changed, 43 insertions(+), 6 deletions(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 615dff70ea4..a2eb097eac0 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -164,7 +164,7 @@ These messages may be combined with [MSC4140: Delayed Events](https://github.com to provide heartbeat semantics (e.g required for MatrixRTC). Note that the sticky duration in this proposal is distinct from that of delayed events. The purpose of the sticky duration in this proposal is to ensure sticky events are cleaned up. -### Implementing a map +### Implementing an ephemeral map MatrixRTC relies on a per-user, per-device map of RTC member events. To implement this, this MSC proposes a standardised mechanism for determining keys on sticky events, the `content.sticky_key` property: @@ -194,10 +194,46 @@ implement this behaviour MUST: - pick the one with the highest `origin_server_ts`, - tie break on the one with the highest lexicographical event ID (A < Z). -When overwriting keys, clients SHOULD use the same sticky duration as the previous sticky event to avoid clients diverging. -This can happen when a client sends a sticky event with key K with a long timeout, then overwrites it with the same key K’ -with a short timeout. If the sticky event K’ fails to be sent to all servers before the short timeout is hit, -some clients will believe the state is K and others will have no state. 
This will only resolve once the long timeout is hit. +Clients SHOULD expire sticky events in maps when their stickiness ends. They should use the algorithm described in this proposal +to determine if an event is still sticky. Clients may diverge if they do not expire sticky events as in the following scenario: +```mermaid +sequenceDiagram + HS1->>+HS2: Sticky event S (1h sticky) + HS2->>Client: Sticky event S (1h sticky) + HS2->>-HS1: OK + HS2->>+HS2: Goes offline + HS1->>+HS2: Sticky event S' (1h sticky) + HS1->>+HS2: Retry Sticky event S' (1h sticky) + HS1->>+HS2: Retry Sticky event S' (1h sticky) + HS1->>HS1: Sticky event S' expires + HS2->>HS2: Back online + Client->>Client: Sticky event S still valid! +``` + +When clients create multiple events with the same `sticky_key`, they SHOULD use the same sticky duration as the previous +sticky event to avoid clients diverging. This can happen when a client sends a sticky event S with a long timeout, then overwrites it with S’ +with a short timeout. If S’ fails to be sent to all servers before the short timeout is hit, +some clients will believe the state is S and others will have no state. This will only resolve once the long timeout is hit. +To illustrate this, consider the scenario when clients use the _same sticky duration_: +``` +Event Lifetime + S [=========|==] | + S' [====|=====|=] + | | + A B +``` +At the time `A`, the possible states on clients is `{ _, S, S'}` where `_` is nothing due to not seeing either event. At time `B`, +the possible states on clients is `{ _, S'}`. +Contrast with: +``` +Event Lifetime + S [=======|====|=] + S' [==|=] | + | | + A B +``` +Just like before, at time `A` the possible states are `{ _, S, S'}`, but now at time `B` the possible states are `{ _, S }`. +This is problematic if you're trying to agree on the "latest" values, like you would in a k:v map. Note that encrypted sticky events will encrypt some parts of the 4-uple. 
An encrypted sticky event only exposes the room ID and sender to the server: @@ -350,7 +386,8 @@ needs cannot be within `content`. [^stickyobj]: The presence of the `sticky` object alone is insufficient. [^prop]: An interesting observation is that these additional properties are entirely to handle edge cases. In the happy case, events _are sent_ to others over federation, they _aren't_ soft-failed and they _do appear_ down `/sync`. This MSC just makes this -reliable. +reliable. These properties are also properties which _state events_ already have, so we need the equivalent functionality if this +proposal wants to replace [MSC3757: Restricting who can overwrite a state event](https://github.com/matrix-org/matrix-spec-proposals/pull/3757). [^partial]: Over federation, servers are not required to send all timeline events to every other server. Servers mostly lazy load timeline events, and will rely on clients hitting `/messages` which in turn hits`/backfill` to request events from federated servers. From b44ccaa42b84c2118d106fee1062495b2006079c Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Thu, 25 Sep 2025 08:47:12 +0100 Subject: [PATCH 20/50] Update 4354-sticky-events.md --- proposals/4354-sticky-events.md | 49 +++++++++++++++++++++++---------- 1 file changed, 34 insertions(+), 15 deletions(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index a2eb097eac0..62c1697202a 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -91,7 +91,18 @@ When an event loses its stickiness, these properties disappear with the stickine eagerly synchronise such events anymore, nor send them down `/sync`, nor re-evaluate their soft-failure status. Note: policy servers and other similar antispam techniques still apply to these events. -The new sync section looks like: +Servers SHOULD rate limit sticky events over federation. 
If the rate limit kicks in, servers MUST +return a non-2xx status code from `/send` such that the sending server *retries the request* in order +to guarantee that the sticky event is eventually delivered. Servers MUST NOT silently drop sticky events +and return 200 OK from `/send`, as this breaks the eventual delivery guarantee. + +These messages may be combined with [MSC4140: Delayed Events](https://github.com/matrix-org/matrix-spec-proposals/pull/4140) +to provide heartbeat semantics (e.g required for MatrixRTC). Note that the sticky duration in this proposal +is distinct from that of delayed events. The purpose of the sticky duration in this proposal is to ensure sticky events are cleaned up. + +### Sync API changes + +The new `/sync` section looks like: ```js { @@ -128,8 +139,11 @@ The new sync section looks like: } } } - ``` +Sticky messages MAY be sent in the timeline section of the `/sync` response, regardless of whether +or not they exceed the timeline limit[^ordering]. If a sticky event is in the timeline, it MAY be +omitted from the `sticky.events` section. This ensures we minimise duplication in the `/sync` response JSON. + Over Simplified Sliding Sync, Sticky Events have their own extension `sticky_events`, which has the following response shape: @@ -151,18 +165,8 @@ Over Simplified Sliding Sync, Sticky Events have their own extension `sticky_eve } ``` -Sticky messages MAY be sent in the timeline section of the `/sync` response, regardless of whether -or not they exceed the timeline limit[^ordering]. If a sticky event is in the timeline, it MAY be -omitted from the `sticky.events` section. This ensures we minimise duplication in the `/sync` response JSON. - -Servers SHOULD rate limit sticky events over federation. If the rate limit kicks in, servers MUST -return a non-2xx status code from `/send` such that the sending server *retries the request* in order -to guarantee that the sticky event is eventually delivered. 
Servers MUST NOT silently drop sticky events -and return 200 OK from `/send`, as this breaks the eventual delivery guarantee. - -These messages may be combined with [MSC4140: Delayed Events](https://github.com/matrix-org/matrix-spec-proposals/pull/4140) -to provide heartbeat semantics (e.g required for MatrixRTC). Note that the sticky duration in this proposal -is distinct from that of delayed events. The purpose of the sticky duration in this proposal is to ensure sticky events are cleaned up. +Sticky events are expected to be encrypted and so there is no "state filter" equivalent provided for sticky events +e.g to filter sticky events by event type. ### Implementing an ephemeral map @@ -259,6 +263,22 @@ Note that encrypted sticky events will encrypt some parts of the 4-uple. An encr The decrypted event would contain the `type` and `content.sticky_key`. +#### Spam + +Under normal circumstances for the MatrixRTC use case there will be a window of time where clients will receive +sticky events that are not useful. MatrixRTC defines an `m.rtc.member` event with an empty content (and optional `leave_reason`) +as having [left the session](https://github.com/matrix-org/matrix-spec-proposals/blob/toger5/matrixRTC/proposals/4143-matrix-rtc.md#leaving-a-session). +This is conceptually the same as deleting a key from the map. However, as the server is unaware of the `sticky_key`, it +cannot perform the delete operation for clients, and will instead send the empty content event down `/sync`. This means if +N users leave a call, there will be N sticky events present in `/sync` for the sticky duration specified. + +This is the tradeoff for providing the ability to encrypt sticky events to reduce metadata visible to the server. It's worth +noting that this increase in inactionable sticky events only applies in a small time window. Had the client synced earlier when the +call was active, then the `m.rtc.member` events would be actionable. 
Had the client synced later when the inactionable sticky events +had expired, then the client wouldn't see them at all. + +#### Access control + If a client wishes to implement access control in this key-value map based on the power levels event, they must ensure that they accept all writes in order to ensure all clients converge. For an example where this goes wrong if you don't, consider the case where two events are sent concurrently: @@ -327,7 +347,6 @@ the client chooses when to pull in this information, reducing the time-to-intera which do not sync is weak, given the entire point of sticky events is to ensure rapid synchronisation of temporary data. This heavily implies the use of some kind of syncing mechanism to receive timely updates, which polling a `/get_sticky_events` endpoint subverts. - ## Alternatives ### Use state events From 81cf7282ffccbe3b366155ce1ec98684292a58f5 Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Thu, 25 Sep 2025 09:36:16 +0100 Subject: [PATCH 21/50] Update 4354-sticky-events.md --- proposals/4354-sticky-events.md | 52 ++++++++++++++++++++++++++++----- 1 file changed, 45 insertions(+), 7 deletions(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 62c1697202a..19a293d7fcd 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -80,26 +80,63 @@ following _additional_ properties[^prop]: To implement these properties, servers MUST: -* Attempt to send all sticky events to all joined servers, whilst respecting per-server backoff times. +* Attempt to send their own[^origin] sticky events to all joined servers, whilst respecting per-server backoff times. Large volumes of events to send MUST NOT cause the sticky event to be dropped from the send queue on the server. 
* Ensure all sticky events are delivered to clients via `/sync` in a new section of the sync response, regardless of whether the sticky event falls within the timeline limit of the request. -* When a new server joins the room, existing servers MUST attempt delivery of all sticky events _originating from their server only_[^newjoiner]. +* When a new server joins the room, existing servers MUST attempt delivery of all of their own sticky events[^newjoiner]. * Remember sticky events per-user, per-room such that the soft-failure checks can be re-evaluated. When an event loses its stickiness, these properties disappear with the stickiness. Servers SHOULD NOT eagerly synchronise such events anymore, nor send them down `/sync`, nor re-evaluate their soft-failure status. Note: policy servers and other similar antispam techniques still apply to these events. -Servers SHOULD rate limit sticky events over federation. If the rate limit kicks in, servers MUST -return a non-2xx status code from `/send` such that the sending server *retries the request* in order -to guarantee that the sticky event is eventually delivered. Servers MUST NOT silently drop sticky events -and return 200 OK from `/send`, as this breaks the eventual delivery guarantee. - These messages may be combined with [MSC4140: Delayed Events](https://github.com/matrix-org/matrix-spec-proposals/pull/4140) to provide heartbeat semantics (e.g required for MatrixRTC). Note that the sticky duration in this proposal is distinct from that of delayed events. The purpose of the sticky duration in this proposal is to ensure sticky events are cleaned up. +### Rate limits + +As sticky events are sent to clients regardless of the timeline limit, care needs to be taken to ensure +that other room participants cannot send large volumes of sticky events. + +Servers SHOULD rate limit sticky events over federation. 
Servers can choose one of two options to do this: + - A) Do not persist the sticky events and expect the other server to retry later. + - B) Persist the sticky events but wait a while before delivering them to clients. + +Option A means servers don't need to store sticky events in their database, protecting disk usage at the cost of more bandwidth. +To implement this, servers MUST return a non-2xx status code from `/send` such that the sending server +*retries the request* in order to guarantee that the sticky event is eventually delivered. Servers MUST NOT +silently drop sticky events and return 200 OK from `/send`, as this breaks the eventual delivery guarantee. +Care must be taken with this approach as all the PDUs in the transaction will be retried, even ones for different rooms / not sticky events. + +Option B means servers have to store the sticky event in their database, protecting bandwidth at the cost of more disk usage. +This provides fine-grained control over when to deliver the sticky events to clients as the server doesn't need +to wait for another request. Servers SHOULD deliver the event to clients before the sticky event expires. This may not +always be possible if the remaining time is very short. + +### Federation behaviour + +Servers are only responsible for sending sticky events originating from their own server. This ensures the server is aware +of the `prev_events` of all sticky events they send to other servers. This is important because the receiving server will +attempt to fetch those previous events if they are unaware of them, _rejecting the transaction_ if the sending server fails +to provide them. For this reason, it is not possible for servers to reliably deliver _other server's_ sticky events. + +In the common case, sticky events are sent over federation like any other event and do not cause any behavioural changes. 
+The two cases where this is different is: + - when sending sticky events to newly joined servers + - when sending "old" but unexpired sticky events + +Servers tend to maintain a sliding window of events to deliver to other servers e.g the most recent 50 PDUs. Sticky events +can fall outside this range, which is what we define as "old". On the receiving server, old events appear to have unknown +`prev_events`, which cannot be connected to any known part of the room DAG. Sending sticky events to newly joined servers can be seen +as a form of sending old but unexpired sticky events, and so this proposal only considers this case. Sending these old events +will potentially increase the number of forward extremities in the room for the receiving server. This may impact state resolution +performance if there are many forward extremities. Servers MAY send dummy events to remove forward extremities (Synapse has the +option to do this since 2019). Alternatively, servers MAY choose not to add old sticky events to their forward extremities, but +this A) reduces eventual delivery guarantees by reducing the frequency of transitive delivery of events, B) reduces the convergence +rate when implementing ephemeral maps (see "Implementing an ephemeral map"), as that relies on servers referencing sticky events from other servers. + ### Sync API changes The new `/sync` section looks like: @@ -417,6 +454,7 @@ This becomes particularly important when room state is rolled back. For example, then Bob kicks Charlie, but concurrently Alice kicks Bob then whether or not a receiving server would accept E would depend on whether they saw “Alice kicks Bob” or “Bob kicks Charlie”. If they saw “Alice kicks Bob” then E would be accepted. If they saw “Bob kicks Charlie” then E would be rejected, and would need to be rolled back when they see “Alice kicks Bob”. +[^origin]: That is, the domain of the sender of the sticky event is the sending server. 
[^newjoiner]: We restrict delivery of sticky events to ones sent locally to reduce the number of events sent on join. If we sent all active sticky events then the number of received events by the new joiner would be `O(nm)` where `n` = number of joined servers, `m` = number of active sticky events. From eced090df4ed50d27560f90c6a031cb519cfc9b5 Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Thu, 25 Sep 2025 09:42:50 +0100 Subject: [PATCH 22/50] Update 4354-sticky-events.md --- proposals/4354-sticky-events.md | 88 +++++++++++++++++---------------- 1 file changed, 45 insertions(+), 43 deletions(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 19a293d7fcd..c69cfb9fe10 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -76,7 +76,7 @@ following _additional_ properties[^prop]: * They are eagerly synchronised with all other servers.[^partial] * They must appear in the `/sync` response.[^sync] * The soft-failure checks MUST be re-evaluated when the membership state changes for a user with unexpired sticky events.[^softfail] -* They ignore history visibility checks. Any joined user is authorised to see sticky events for the duration they remain sticky. +* They ignore history visibility checks. Any joined user is authorised to see sticky events for the duration they remain sticky.[^hisvis] To implement these properties, servers MUST: @@ -95,48 +95,6 @@ These messages may be combined with [MSC4140: Delayed Events](https://github.com to provide heartbeat semantics (e.g required for MatrixRTC). Note that the sticky duration in this proposal is distinct from that of delayed events. The purpose of the sticky duration in this proposal is to ensure sticky events are cleaned up. 
-### Rate limits - -As sticky events are sent to clients regardless of the timeline limit, care needs to be taken to ensure -that other room participants cannot send large volumes of sticky events. - -Servers SHOULD rate limit sticky events over federation. Servers can choose one of two options to do this: - - A) Do not persist the sticky events and expect the other server to retry later. - - B) Persist the sticky events but wait a while before delivering them to clients. - -Option A means servers don't need to store sticky events in their database, protecting disk usage at the cost of more bandwidth. -To implement this, servers MUST return a non-2xx status code from `/send` such that the sending server -*retries the request* in order to guarantee that the sticky event is eventually delivered. Servers MUST NOT -silently drop sticky events and return 200 OK from `/send`, as this breaks the eventual delivery guarantee. -Care must be taken with this approach as all the PDUs in the transaction will be retried, even ones for different rooms / not sticky events. - -Option B means servers have to store the sticky event in their database, protecting bandwidth at the cost of more disk usage. -This provides fine-grained control over when to deliver the sticky events to clients as the server doesn't need -to wait for another request. Servers SHOULD deliver the event to clients before the sticky event expires. This may not -always be possible if the remaining time is very short. - -### Federation behaviour - -Servers are only responsible for sending sticky events originating from their own server. This ensures the server is aware -of the `prev_events` of all sticky events they send to other servers. This is important because the receiving server will -attempt to fetch those previous events if they are unaware of them, _rejecting the transaction_ if the sending server fails -to provide them. 
For this reason, it is not possible for servers to reliably deliver _other server's_ sticky events. - -In the common case, sticky events are sent over federation like any other event and do not cause any behavioural changes. -The two cases where this is different is: - - when sending sticky events to newly joined servers - - when sending "old" but unexpired sticky events - -Servers tend to maintain a sliding window of events to deliver to other servers e.g the most recent 50 PDUs. Sticky events -can fall outside this range, which is what we define as "old". On the receiving server, old events appear to have unknown -`prev_events`, which cannot be connected to any known part of the room DAG. Sending sticky events to newly joined servers can be seen -as a form of sending old but unexpired sticky events, and so this proposal only considers this case. Sending these old events -will potentially increase the number of forward extremities in the room for the receiving server. This may impact state resolution -performance if there are many forward extremities. Servers MAY send dummy events to remove forward extremities (Synapse has the -option to do this since 2019). Alternatively, servers MAY choose not to add old sticky events to their forward extremities, but -this A) reduces eventual delivery guarantees by reducing the frequency of transitive delivery of events, B) reduces the convergence -rate when implementing ephemeral maps (see "Implementing an ephemeral map"), as that relies on servers referencing sticky events from other servers. - ### Sync API changes The new `/sync` section looks like: @@ -205,6 +163,48 @@ Over Simplified Sliding Sync, Sticky Events have their own extension `sticky_eve Sticky events are expected to be encrypted and so there is no "state filter" equivalent provided for sticky events e.g to filter sticky events by event type. 
+### Rate limits + +As sticky events are sent to clients regardless of the timeline limit, care needs to be taken to ensure +that other room participants cannot send large volumes of sticky events. + +Servers SHOULD rate limit sticky events over federation. Servers can choose one of two options to do this: + - A) Do not persist the sticky events and expect the other server to retry later. + - B) Persist the sticky events but wait a while before delivering them to clients. + +Option A means servers don't need to store sticky events in their database, protecting disk usage at the cost of more bandwidth. +To implement this, servers MUST return a non-2xx status code from `/send` such that the sending server +*retries the request* in order to guarantee that the sticky event is eventually delivered. Servers MUST NOT +silently drop sticky events and return 200 OK from `/send`, as this breaks the eventual delivery guarantee. +Care must be taken with this approach as all the PDUs in the transaction will be retried, even ones for different rooms / not sticky events. + +Option B means servers have to store the sticky event in their database, protecting bandwidth at the cost of more disk usage. +This provides fine-grained control over when to deliver the sticky events to clients as the server doesn't need +to wait for another request. Servers SHOULD deliver the event to clients before the sticky event expires. This may not +always be possible if the remaining time is very short. + +### Federation behaviour + +Servers are only responsible for sending sticky events originating from their own server. This ensures the server is aware +of the `prev_events` of all sticky events they send to other servers. This is important because the receiving server will +attempt to fetch those previous events if they are unaware of them, _rejecting the transaction_ if the sending server fails +to provide them. 
For this reason, it is not possible for servers to reliably deliver _other server's_ sticky events. + +In the common case, sticky events are sent over federation like any other event and do not cause any behavioural changes. +The two cases where this is different is: + - when sending sticky events to newly joined servers + - when sending "old" but unexpired sticky events + +Servers tend to maintain a sliding window of events to deliver to other servers e.g the most recent 50 PDUs. Sticky events +can fall outside this range, which is what we define as "old". On the receiving server, old events appear to have unknown +`prev_events`, which cannot be connected to any known part of the room DAG. Sending sticky events to newly joined servers can be seen +as a form of sending old but unexpired sticky events, and so this proposal only considers this case. Sending these old events +will potentially increase the number of forward extremities in the room for the receiving server. This may impact state resolution +performance if there are many forward extremities. Servers MAY send dummy events to remove forward extremities (Synapse has the +option to do this since 2019). Alternatively, servers MAY choose not to add old sticky events to their forward extremities, but +this A) reduces eventual delivery guarantees by reducing the frequency of transitive delivery of events, B) reduces the convergence +rate when implementing ephemeral maps (see "Implementing an ephemeral map"), as that relies on servers referencing sticky events from other servers. + ### Implementing an ephemeral map MatrixRTC relies on a per-user, per-device map of RTC member events. To implement this, this MSC proposes @@ -454,6 +454,8 @@ This becomes particularly important when room state is rolled back. For example, then Bob kicks Charlie, but concurrently Alice kicks Bob then whether or not a receiving server would accept E would depend on whether they saw “Alice kicks Bob” or “Bob kicks Charlie”. 
If they saw “Alice kicks Bob” then E would be accepted. If they saw “Bob kicks Charlie” then E would be rejected, and would need to be rolled back when they see “Alice kicks Bob”. +[^hisvis]: This ensures that newly joined servers can see sticky events sent from before they were joined to the room, regardless +of the history visibility setting. This matches the behaviour of state events. [^origin]: That is, the domain of the sender of the sticky event is the sending server. [^newjoiner]: We restrict delivery of sticky events to ones sent locally to reduce the number of events sent on join. If we sent all active sticky events then the number of received events by the new joiner would be `O(nm)` where `n` = number of joined servers, From cec181556464d79c465f085296f86ad7ddc754fa Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Fri, 26 Sep 2025 15:42:49 +0100 Subject: [PATCH 23/50] Update 4354-sticky-events.md --- proposals/4354-sticky-events.md | 205 +++++++++++++++++--------------- 1 file changed, 108 insertions(+), 97 deletions(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index c69cfb9fe10..11ad743474b 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -22,7 +22,7 @@ almost the right primitive, but: [MSC2477](https://github.com/matrix-org/matrix-spec-proposals/pull/2477) tries to change that) * They aren’t extensible. * They do not guarantee delivery. Each EDU type has slightly different persistence/delivery guarantees, - all of which currently fall short of guaranteeing delivery. + all of which currently fall short of guaranteeing delivery, with the exception of to-device messages. This proposal adds such a primitive, called Sticky Events, which provides the following guarantees: @@ -203,102 +203,7 @@ will potentially increase the number of forward extremities in the room for the performance if there are many forward extremities. 
Servers MAY send dummy events to remove forward extremities (Synapse has the option to do this since 2019). Alternatively, servers MAY choose not to add old sticky events to their forward extremities, but this A) reduces eventual delivery guarantees by reducing the frequency of transitive delivery of events, B) reduces the convergence -rate when implementing ephemeral maps (see "Implementing an ephemeral map"), as that relies on servers referencing sticky events from other servers. - -### Implementing an ephemeral map - -MatrixRTC relies on a per-user, per-device map of RTC member events. To implement this, this MSC proposes -a standardised mechanism for determining keys on sticky events, the `content.sticky_key` property: - -```js -{ - "type": "m.rtc.member", - "sticky": { - "duration_ms": 300000 - }, - "sender": "@alice:example.com", - "room_id": "!foo", - "origin_server_ts": 1757920344000, - "content": { - "sticky_key": "LAPTOPXX123", - ... - } -} -``` - -`content.sticky_key` is ignored server-side[^encryption] and is purely informational. Clients which -receive a sticky event with a sticky key SHOULD keep a map with keys determined via the 4-uple -`(room_id, sender, type, content.sticky_key)` to track the current values in the map. Nothing stops -users sending multiple events with the same `sticky_key`. To deterministically tie-break, clients which -implement this behaviour MUST: - -- pick the one with the highest `origin_server_ts`, -- tie break on the one with the highest lexicographical event ID (A < Z). - -Clients SHOULD expire sticky events in maps when their stickiness ends. They should use the algorithm described in this proposal -to determine if an event is still sticky. 
Clients may diverge if they do not expire sticky events as in the following scenario: -```mermaid -sequenceDiagram - HS1->>+HS2: Sticky event S (1h sticky) - HS2->>Client: Sticky event S (1h sticky) - HS2->>-HS1: OK - HS2->>+HS2: Goes offline - HS1->>+HS2: Sticky event S' (1h sticky) - HS1->>+HS2: Retry Sticky event S' (1h sticky) - HS1->>+HS2: Retry Sticky event S' (1h sticky) - HS1->>HS1: Sticky event S' expires - HS2->>HS2: Back online - Client->>Client: Sticky event S still valid! -``` - -When clients create multiple events with the same `sticky_key`, they SHOULD use the same sticky duration as the previous -sticky event to avoid clients diverging. This can happen when a client sends a sticky event S with a long timeout, then overwrites it with S’ -with a short timeout. If S’ fails to be sent to all servers before the short timeout is hit, -some clients will believe the state is S and others will have no state. This will only resolve once the long timeout is hit. -To illustrate this, consider the scenario when clients use the _same sticky duration_: -``` -Event Lifetime - S [=========|==] | - S' [====|=====|=] - | | - A B -``` -At the time `A`, the possible states on clients is `{ _, S, S'}` where `_` is nothing due to not seeing either event. At time `B`, -the possible states on clients is `{ _, S'}`. -Contrast with: -``` -Event Lifetime - S [=======|====|=] - S' [==|=] | - | | - A B -``` -Just like before, at time `A` the possible states are `{ _, S, S'}`, but now at time `B` the possible states are `{ _, S }`. -This is problematic if you're trying to agree on the "latest" values, like you would in a k:v map. - -Note that encrypted sticky events will encrypt some parts of the 4-uple. 
An encrypted sticky event only exposes the room ID and sender to the server: - -```js -{ - "content": { - "algorithm": "m.megolm.v1.aes-sha2", - "ciphertext": "AwgCEqABubgx7p8AThCNreFNHqo2XJCG8cMUxwVepsuXAfrIKpdo8UjxyAsA50IOYK6T5cDL4s/OaiUQdyrSGoK5uFnn52vrjMI/+rr8isPzl7+NK3hk1Tm5QEKgqbDJROI7/8rX7I/dK2SfqN08ZUEhatAVxznUeDUH3kJkn+8Onx5E0PmQLSzPokFEi0Z0Zp1RgASX27kGVDl1D4E0vb9EzVMRW1PrbdVkFlGIFM8FE8j3yhNWaWE342eaj24NqnnWJ5VG9l2kT/hlNwUenoGJFMzozjaUlyjRIMpQXqbodjgyQkGacTEdhBuwAQ", - "device_id": "AAvTvsyf5F", - "sender_key": "KVMNIv/HyP0QMT11EQW0X8qB7U817CUbqrZZCsDgeFE", - "session_id": "c4+O+eXPf0qze1bUlH4Etf6ifzpbG3YeDEreTVm+JZU" - }, - "origin_server_ts": 1757948616527, - "sender": "@alice:example.com", - "type": "m.room.encrypted", - "sticky": { - "duration_ms": 600000 - }, - "event_id": "$lsFIWE9JcIMWUrY3ZTOKAxT_lIddFWLdK6mqwLxBchk", - "room_id": "!ffCSThQTiVQJiqvZjY:matrix.org" -} -``` - -The decrypted event would contain the `type` and `content.sticky_key`. +rate when implementing ephemeral maps (see "Addendum: Implementing an ephemeral map"), as that relies on servers referencing sticky events from other servers. #### Spam @@ -437,6 +342,109 @@ dropped updates for the latter scenario. - The sticky key in the `content` of the PDU is `msc4354_sticky_key`. - To enable this in SSS, the extension name is `org.matrix.msc4354.sticky_events`. +## Addendum + +This section explains how sticky events can be used to implement a short-lived, per-user, per-room key-value store. +This technique would be used by MatrixRTC to synchronise RTC members. + +### Implementing an ephemeral map + +MatrixRTC relies on a per-user, per-device map of RTC member events. 
To implement this, this MSC proposes +a standardised mechanism for determining keys on sticky events, the `content.sticky_key` property: + +```js +{ + "type": "m.rtc.member", + "sticky": { + "duration_ms": 300000 + }, + "sender": "@alice:example.com", + "room_id": "!foo", + "origin_server_ts": 1757920344000, + "content": { + "sticky_key": "LAPTOPXX123", + ... + } +} +``` + +`content.sticky_key` is ignored server-side[^encryption] and is purely informational. Clients which +receive a sticky event with a sticky key SHOULD keep a map with keys determined via the 3-uple[^4uple] +`(room_id, sender, content.sticky_key)` to track the current values in the map. Nothing stops +users sending multiple events with the same `sticky_key`. To deterministically tie-break, clients which +implement this behaviour MUST: + +- pick the one with the highest `origin_server_ts`, +- tie break on the one with the highest lexicographical event ID (A < Z). + +Clients SHOULD expire sticky events in maps when their stickiness ends. They should use the algorithm described in this proposal +to determine if an event is still sticky. Clients may diverge if they do not expire sticky events as in the following scenario: +```mermaid +sequenceDiagram + HS1->>+HS2: Sticky event S (1h sticky) + HS2->>Client: Sticky event S (1h sticky) + HS2->>-HS1: OK + HS2->>+HS2: Goes offline + HS1->>+HS2: Sticky event S' (1h sticky) + HS1->>+HS2: Retry Sticky event S' (1h sticky) + HS1->>+HS2: Retry Sticky event S' (1h sticky) + HS1->>HS1: Sticky event S' expires + HS2->>HS2: Back online + Client->>Client: Sticky event S still valid! +``` + +There is no mechanism for sticky events to expire earlier than their timeout value. To remove entries in the map, clients SHOULD +send another sticky event with just `content.sticky_key` set, with all the other application-specific fields omitted. 
+ +When clients create multiple events with the same `sticky_key`, they SHOULD use the same sticky duration as the previous +sticky event to avoid clients diverging. This can happen when a client sends a sticky event S with a long timeout, then overwrites it with S’ +with a short timeout. If S’ fails to be sent to all servers before the short timeout is hit, +some clients will believe the state is S and others will have no state. This will only resolve once the long timeout is hit. +To illustrate this, consider the scenario when clients use the _same sticky duration_: +``` +Event Lifetime + S [=========|==] | + S' [====|=====|=] + | | + A B +``` +At the time `A`, the possible states on clients is `{ _, S, S'}` where `_` is nothing due to not seeing either event. At time `B`, +the possible states on clients is `{ _, S'}`. +Contrast with: +``` +Event Lifetime + S [=======|====|=] + S' [==|=] | + | | + A B +``` +Just like before, at time `A` the possible states are `{ _, S, S'}`, but now at time `B` the possible states are `{ _, S }`. +This is problematic if you're trying to agree on the "latest" values, like you would in a k:v map. + +Note that encrypted sticky events will encrypt some parts of the 4-uple. 
An encrypted sticky event only exposes the room ID and sender to the server: + +```js +{ + "content": { + "algorithm": "m.megolm.v1.aes-sha2", + "ciphertext": "AwgCEqABubgx7p8AThCNreFNHqo2XJCG8cMUxwVepsuXAfrIKpdo8UjxyAsA50IOYK6T5cDL4s/OaiUQdyrSGoK5uFnn52vrjMI/+rr8isPzl7+NK3hk1Tm5QEKgqbDJROI7/8rX7I/dK2SfqN08ZUEhatAVxznUeDUH3kJkn+8Onx5E0PmQLSzPokFEi0Z0Zp1RgASX27kGVDl1D4E0vb9EzVMRW1PrbdVkFlGIFM8FE8j3yhNWaWE342eaj24NqnnWJ5VG9l2kT/hlNwUenoGJFMzozjaUlyjRIMpQXqbodjgyQkGacTEdhBuwAQ", + "device_id": "AAvTvsyf5F", + "sender_key": "KVMNIv/HyP0QMT11EQW0X8qB7U817CUbqrZZCsDgeFE", + "session_id": "c4+O+eXPf0qze1bUlH4Etf6ifzpbG3YeDEreTVm+JZU" + }, + "origin_server_ts": 1757948616527, + "sender": "@alice:example.com", + "type": "m.room.encrypted", + "sticky": { + "duration_ms": 600000 + }, + "event_id": "$lsFIWE9JcIMWUrY3ZTOKAxT_lIddFWLdK6mqwLxBchk", + "room_id": "!ffCSThQTiVQJiqvZjY:matrix.org" +} +``` + +The decrypted event would contain the `type` and `content.sticky_key`. + [^toplevel]: This has to be at the top-level as we want to support _encrypted_ sticky events, and therefore metadata the server needs cannot be within `content`. [^stickyobj]: The presence of the `sticky` object alone is insufficient. @@ -475,3 +483,6 @@ Malicious servers could set the TTL to be 0 ~ `sticky.duration_ms` , ensuring ma on whether or not an event was sticky. In contrast, using `origin_server_ts` is a consistent reference point that all servers are guaranteed to see, limiting the ability for malicious servers to cause divergence as all servers approximately track NTP. +[^4uple]: Earlier versions of this proposal had the key be the 4-uple `(room_id, sender, **type**, content.sticky_key)`, but there may +be valid use cases for a key to have different event types as values. Given you can emulate the 4-uple by string-packing the event type +into the `content.sticky_key`, for proposing a standard map mechanism it makes sense to just use the 3-uple form. 
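Note on the ephemeral map rules in the patch above: the client-side behaviour (3-uple key, tie-break on `origin_server_ts` then event ID, expiry via the stickiness algorithm) can be sketched roughly as follows. This is only an illustration under stated assumptions, not part of the proposal: `StickyEvent`, `EphemeralMap`, `is_sticky` and their fields are hypothetical names, the event is assumed to be already decrypted, and `now_ms` is assumed to come from the client's local clock.

```python
from dataclasses import dataclass

MAX_STICKY_MS = 3_600_000  # upper bound on duration_ms in the proposal (1 hour)


@dataclass(frozen=True)
class StickyEvent:
    """Hypothetical client-side view of a decrypted sticky event."""
    event_id: str
    room_id: str
    sender: str
    sticky_key: str
    origin_server_ts: int
    duration_ms: int
    content: dict


def is_sticky(ev: StickyEvent, now_ms: int) -> bool:
    # Start time is min(now, origin_server_ts), as in the proposal text.
    start = min(now_ms, ev.origin_server_ts)
    end = start + min(ev.duration_ms, MAX_STICKY_MS)
    # The event remains sticky only while the end time is in the future.
    return end > now_ms


class EphemeralMap:
    """Map keyed by the 3-uple (room_id, sender, sticky_key)."""

    def __init__(self) -> None:
        self._entries: dict[tuple[str, str, str], StickyEvent] = {}

    def insert(self, ev: StickyEvent) -> None:
        key = (ev.room_id, ev.sender, ev.sticky_key)
        current = self._entries.get(key)
        # Highest origin_server_ts wins; ties go to the lexicographically
        # highest event ID. Arrival order therefore does not matter.
        if current is None or (ev.origin_server_ts, ev.event_id) > (
            current.origin_server_ts,
            current.event_id,
        ):
            self._entries[key] = ev

    def get(self, room_id: str, sender: str, sticky_key: str, now_ms: int):
        ev = self._entries.get((room_id, sender, sticky_key))
        # Expired entries are treated as absent.
        if ev is None or not is_sticky(ev, now_ms):
            return None
        return ev
```

Because the comparison depends only on fields carried by the events themselves, two clients that receive the same set of events in different orders should end up holding the same winner for each key. The sketch does not cover the "delete by sending only `content.sticky_key`" convention; a caller would need to interpret an entry with otherwise empty content as removed.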
From b94096a2ac187c154d86d218864eb9ef791e1a83 Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Fri, 26 Sep 2025 15:53:12 +0100 Subject: [PATCH 24/50] Update 4354-sticky-events.md --- proposals/4354-sticky-events.md | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 11ad743474b..0fd8ef56a9d 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -198,12 +198,18 @@ The two cases where this is different is: Servers tend to maintain a sliding window of events to deliver to other servers e.g the most recent 50 PDUs. Sticky events can fall outside this range, which is what we define as "old". On the receiving server, old events appear to have unknown `prev_events`, which cannot be connected to any known part of the room DAG. Sending sticky events to newly joined servers can be seen -as a form of sending old but unexpired sticky events, and so this proposal only considers this case. Sending these old events -will potentially increase the number of forward extremities in the room for the receiving server. This may impact state resolution +as a form of sending old but unexpired sticky events, and so this proposal only considers this case. + +Servers MUST send old sticky events in the order they were created on the server (stream ordering / based on `origin_server_ts`). +This ensures that sticky events appear in roughly the right place in the timeline as servers use the arrival ordering to determine +an event's position in the timeline. + +Sending these old events will potentially increase the number of forward extremities in the room for the receiving server. This may impact state resolution performance if there are many forward extremities. Servers MAY send dummy events to remove forward extremities (Synapse has the option to do this since 2019). 
Alternatively, servers MAY choose not to add old sticky events to their forward extremities, but this A) reduces eventual delivery guarantees by reducing the frequency of transitive delivery of events, B) reduces the convergence -rate when implementing ephemeral maps (see "Addendum: Implementing an ephemeral map"), as that relies on servers referencing sticky events from other servers. +rate when implementing ephemeral maps (see "Addendum: Implementing an ephemeral map"), as that relies on servers referencing sticky +events from other servers. #### Spam From b9ed93f2d148f29a1f4613059d9cb20abd884fe7 Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Fri, 26 Sep 2025 15:56:14 +0100 Subject: [PATCH 25/50] Move around k:v map bits to Addendum --- proposals/4354-sticky-events.md | 56 ++++++++++++++++----------------- 1 file changed, 28 insertions(+), 28 deletions(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 0fd8ef56a9d..ee71fb97e5a 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -211,34 +211,6 @@ this A) reduces eventual delivery guarantees by reducing the frequency of transi rate when implementing ephemeral maps (see "Addendum: Implementing an ephemeral map"), as that relies on servers referencing sticky events from other servers. -#### Spam - -Under normal circumstances for the MatrixRTC use case there will be a window of time where clients will receive -sticky events that are not useful. MatrixRTC defines an `m.rtc.member` event with an empty content (and optional `leave_reason`) -as having [left the session](https://github.com/matrix-org/matrix-spec-proposals/blob/toger5/matrixRTC/proposals/4143-matrix-rtc.md#leaving-a-session). -This is conceptually the same as deleting a key from the map. 
However, as the server is unaware of the `sticky_key`, it -cannot perform the delete operation for clients, and will instead send the empty content event down `/sync`. This means if -N users leave a call, there will be N sticky events present in `/sync` for the sticky duration specified. - -This is the tradeoff for providing the ability to encrypt sticky events to reduce metadata visible to the server. It's worth -noting that this increase in inactionable sticky events only applies in a small time window. Had the client synced earlier when the -call was active, then the `m.rtc.member` events would be actionable. Had the client synced later when the inactionable sticky events -had expired, then the client wouldn't see them at all. - -#### Access control - -If a client wishes to implement access control in this key-value map based on the power levels event, -they must ensure that they accept all writes in order to ensure all clients converge. For an example -where this goes wrong if you don't, consider the case where two events are sent concurrently: - - Alice sets a key in this map which requires PL100, - - Alice is granted PL100 from PL0. - -If clients only update the map for authorised actions, then clients which see Alice setting a key before the -PL event will not update the map and hence forget the value. Clients which see the PL event first would then -accept Alice setting the key and the net result is divergence between clients. By always updating the map even -for unauthorised updates, we ensure that the arrival order doesn't affect the end result. Clients can then -choose whether or not to materialise/show/process a given key based on the current PL event. - ## Potential issues ### Time @@ -451,6 +423,34 @@ Note that encrypted sticky events will encrypt some parts of the 4-uple. An encr The decrypted event would contain the `type` and `content.sticky_key`. 
+#### Spam + +Under normal circumstances for the MatrixRTC use case there will be a window of time where clients will receive +sticky events that are not useful. MatrixRTC defines an `m.rtc.member` event with an empty content (and optional `leave_reason`) +as having [left the session](https://github.com/matrix-org/matrix-spec-proposals/blob/toger5/matrixRTC/proposals/4143-matrix-rtc.md#leaving-a-session). +This is conceptually the same as deleting a key from the map. However, as the server is unaware of the `sticky_key`, it +cannot perform the delete operation for clients, and will instead send the empty content event down `/sync`. This means if +N users leave a call, there will be N sticky events present in `/sync` for the sticky duration specified. + +This is the tradeoff for providing the ability to encrypt sticky events to reduce metadata visible to the server. It's worth +noting that this increase in inactionable sticky events only applies in a small time window. Had the client synced earlier when the +call was active, then the `m.rtc.member` events would be actionable. Had the client synced later when the inactionable sticky events +had expired, then the client wouldn't see them at all. + +#### Access control + +If a client wishes to implement access control in this key-value map based on the power levels event, +they must ensure that they accept all writes in order to ensure all clients converge. For an example +where this goes wrong if you don't, consider the case where two events are sent concurrently: + - Alice sets a key in this map which requires PL100, + - Alice is granted PL100 from PL0. + +If clients only update the map for authorised actions, then clients which see Alice setting a key before the +PL event will not update the map and hence forget the value. Clients which see the PL event first would then +accept Alice setting the key and the net result is divergence between clients. 
By always updating the map even +for unauthorised updates, we ensure that the arrival order doesn't affect the end result. Clients can then +choose whether or not to materialise/show/process a given key based on the current PL event. + [^toplevel]: This has to be at the top-level as we want to support _encrypted_ sticky events, and therefore metadata the server needs cannot be within `content`. [^stickyobj]: The presence of the `sticky` object alone is insufficient. From b135726ea45188ecd40186dec7be53c85f2d02c8 Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Fri, 26 Sep 2025 15:57:35 +0100 Subject: [PATCH 26/50] Update 4354-sticky-events.md --- proposals/4354-sticky-events.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index ee71fb97e5a..920b8d6226b 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -347,7 +347,7 @@ a standardised mechanism for determining keys on sticky events, the `content.sti ``` `content.sticky_key` is ignored server-side[^encryption] and is purely informational. Clients which -receive a sticky event with a sticky key SHOULD keep a map with keys determined via the 3-uple[^4uple] +receive a sticky event with a `sticky_key` SHOULD keep a map with keys determined via the 3-uple[^4uple] `(room_id, sender, content.sticky_key)` to track the current values in the map. Nothing stops users sending multiple events with the same `sticky_key`. 
To deterministically tie-break, clients which implement this behaviour MUST: From 8f0e3ceb785786d71c568d70c59c8d7f3fdd27f7 Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Sat, 27 Sep 2025 08:51:25 +0100 Subject: [PATCH 27/50] Update proposals/4354-sticky-events.md Co-authored-by: Timo <16718859+toger5@users.noreply.github.com> --- proposals/4354-sticky-events.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 920b8d6226b..e3cc4959240 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -34,7 +34,7 @@ This new primitive can be used to implement MatrixRTC participation, live locati ## Proposal -Message events can be annotated with a new top-level `sticky` key[^toplevel], which MUST have a `duration_ms`, +Message events can be annotated with a new top-level `sticky` object[^toplevel], which MUST have a `duration_ms`, which is the number of milliseconds for the event to be sticky. The presence of `sticky.duration_ms` with a valid value makes the event “sticky”[^stickyobj]. Valid values are the integer range 0-3600000 (1 hour). For use cases that require stickiness beyond this limit, the application is responsible for sending another From 3c26e3b2ad2b7962af91c6e298ded0a2e816dc0d Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Mon, 29 Sep 2025 12:46:38 +0100 Subject: [PATCH 28/50] Update 4354-sticky-events.md --- proposals/4354-sticky-events.md | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index e3cc4959240..e515ddf34d8 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -139,6 +139,9 @@ Sticky messages MAY be sent in the timeline section of the `/sync` response, reg or not they exceed the timeline limit[^ordering]. 
If a sticky event is in the timeline, it MAY be omitted from the `sticky.events` section. This ensures we minimise duplication in the `/sync` response JSON. +When sending sticky events down `/sync`, the `unsigned` section SHOULD have a `sticky_duration_ttl_ms` to indicate +how many milliseconds until the sticky event expires. This provides a way to reduce clock skew between a local homeserver +and their connected clients. Clients SHOULD use this value to determine when the sticky event expires. Over Simplified Sliding Sync, Sticky Events have their own extension `sticky_events`, which has the following response shape: @@ -312,6 +315,12 @@ to their own clients to produce the same outcome. Federation equivocation is mit persisted in the DAG, as servers can talk to each other to fetch all events. There is no way to protect against dropped updates for the latter scenario. +Servers may lie to their own clients about the `unsigned.sticky_duration_ttl_ms` value, with the aim of making +certain sticky events last longer or shorter than intended. Servers can already maliciously drop sticky events +to lose updates, and the lack of any verification of the event hash means servers can also maliciously alter the +`origin_server_ts`. Therefore, adding `unsigned.sticky_duration_ttl_ms` doesn't materially make the situation worse. +In the common case, it provides protection against clock skew when clients have the wrong time. + ## Unstable Prefix - The `stick_duration_ms` query param is `msc4354_stick_duration_ms`. @@ -319,6 +328,7 @@ dropped updates for the latter scenario. - The `/sync` response section is `msc4354_sticky`. - The sticky key in the `content` of the PDU is `msc4354_sticky_key`. - To enable this in SSS, the extension name is `org.matrix.msc4354.sticky_events`. 
+- The `unsigned.sticky_duration_ttl_ms` field is `unsigned.msc4354_sticky_duration_ttl_ms` ## Addendum From 71e83cbb4a694f783e098eff5c71f75be45b77f3 Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Wed, 1 Oct 2025 11:56:49 +0100 Subject: [PATCH 29/50] Update 4354-sticky-events.md --- proposals/4354-sticky-events.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index e515ddf34d8..41685aaf72c 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -54,7 +54,7 @@ as a normal event. } ``` -This key can be set by clients in the CS API by a new query parameter `stick_duration_ms`, which is +This key can be set by clients in the CS API by a new query parameter `sticky_duration_ms`, which is added to the following endpoints: * `PUT /_matrix/client/v3/rooms/{roomId}/send/{eventType}/{txnId}` @@ -67,7 +67,7 @@ To calculate if any sticky event is still sticky: specify start times in the future. * If the event is pushed via `/send`, servers MAY use the current time as the start time. This minimises the risk of clock skew causing the start time to be too far in the past. See “Potential issues \> Time”. -* Calculate the end time as `start_time + min(stick_duration_ms, 3600000)`. +* Calculate the end time as `start_time + min(sticky_duration_ms, 3600000)`. * If the end time is in the future, the event remains sticky. Sticky events are like normal message events and are authorised using normal PDU checks. They have the @@ -323,7 +323,7 @@ In the common case, it provides protection against clock skew when clients have ## Unstable Prefix -- The `stick_duration_ms` query param is `msc4354_stick_duration_ms`. +- The `sticky_duration_ms` query param is `org.matrix.msc4354.sticky_duration_ms`. - The `sticky` key in the PDU is `msc4354_sticky`. - The `/sync` response section is `msc4354_sticky`. 
- The sticky key in the `content` of the PDU is `msc4354_sticky_key`. From b2eab83fdc4eb9f40950b2cdb338c8b89c34b67e Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Wed, 1 Oct 2025 14:55:22 +0100 Subject: [PATCH 30/50] Apply suggestions from code review Co-authored-by: Travis Ralston --- proposals/4354-sticky-events.md | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 41685aaf72c..c142e924c32 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -39,7 +39,7 @@ which is the number of milliseconds for the event to be sticky. The presence of with a valid value makes the event “sticky”[^stickyobj]. Valid values are the integer range 0-3600000 (1 hour). For use cases that require stickiness beyond this limit, the application is responsible for sending another event to make it happen. The `sticky` key is not protected from redaction. A redacted sticky event is the same -as a normal event. +as a normal event. Note: this new top-level object is added to the [`ClientEvent` format](https://spec.matrix.org/v1.16/client-server-api/#room-event-format). ```json { @@ -57,15 +57,15 @@ as a normal event. 
This key can be set by clients in the CS API by a new query parameter `sticky_duration_ms`, which is added to the following endpoints: -* `PUT /_matrix/client/v3/rooms/{roomId}/send/{eventType}/{txnId}` -* `PUT /_matrix/client/v3/rooms/{roomId}/state/{eventType}/{stateKey}` +* [`PUT /_matrix/client/v3/rooms/{roomId}/send/{eventType}/{txnId}`](https://spec.matrix.org/v1.16/client-server-api/#put_matrixclientv3roomsroomidsendeventtypetxnid) +* [`PUT /_matrix/client/v3/rooms/{roomId}/state/{eventType}/{stateKey}`](https://spec.matrix.org/v1.16/client-server-api/#put_matrixclientv3roomsroomidstateeventtypestatekey) To calculate if any sticky event is still sticky: * Calculate the start time: * The start time is `min(now, origin_server_ts)`. This ensures that malicious origin timestamps cannot specify start times in the future. - * If the event is pushed via `/send`, servers MAY use the current time as the start time. This minimises + * If the event is pushed over federation via `/send`, servers MAY use the current time as the start time instead. This minimises the risk of clock skew causing the start time to be too far in the past. See “Potential issues \> Time”. * Calculate the end time as `start_time + min(sticky_duration_ms, 3600000)`. * If the end time is in the future, the event remains sticky. @@ -93,7 +93,8 @@ Note: policy servers and other similar antispam techniques still apply to these These messages may be combined with [MSC4140: Delayed Events](https://github.com/matrix-org/matrix-spec-proposals/pull/4140) to provide heartbeat semantics (e.g required for MatrixRTC). Note that the sticky duration in this proposal -is distinct from that of delayed events. The purpose of the sticky duration in this proposal is to ensure sticky events are cleaned up. +is distinct from that of delayed events. 
The purpose of the sticky duration in this proposal is to ensure sticky events are cleaned up, +whereas the purpose of delayed events is to affect the send time (and thus start time for stickiness) of an event. ### Sync API changes From 99ee9f86f02f809b3eb520bf7ef741ced6514d08 Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Wed, 1 Oct 2025 15:51:10 +0100 Subject: [PATCH 31/50] Update 4354-sticky-events.md --- proposals/4354-sticky-events.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index c142e924c32..3299d300549 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -140,6 +140,9 @@ Sticky messages MAY be sent in the timeline section of the `/sync` response, reg or not they exceed the timeline limit[^ordering]. If a sticky event is in the timeline, it MAY be omitted from the `sticky.events` section. This ensures we minimise duplication in the `/sync` response JSON. +Sticky events follow the same 'stream-like' behaviour as the `timeline`. This means clients will receive a sticky +event S _once_, and subsequent requests with an advanced `since` token will not return the same sticky event S. + When sending sticky events down `/sync`, the `unsigned` section SHOULD have a `sticky_duration_ttl_ms` to indicate how many milliseconds until the sticky event expires. This provides a way to reduce clock skew between a local homeserver and their connected clients. Clients SHOULD use this value to determine when the sticky event expires. @@ -181,6 +184,7 @@ To implement this, servers MUST return a non-2xx status code from `/send` such t *retries the request* in order to guarantee that the sticky event is eventually delivered. Servers MUST NOT silently drop sticky events and return 200 OK from `/send`, as this breaks the eventual delivery guarantee. 
Care must be taken with this approach as all the PDUs in the transaction will be retried, even ones for different rooms / not sticky events. +Servers solely relying on this option will need to consider that sticky events may be transitively delivered by a 3rd server. Option B means servers have to store the sticky event in their database, protecting bandwidth at the cost of more disk usage. This provides fine-grained control over when to deliver the sticky events to clients as the server doesn't need From 3ff65a57fa11aa40bac80df323fde3ab44680d7a Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Wed, 1 Oct 2025 16:00:23 +0100 Subject: [PATCH 32/50] Update 4354-sticky-events.md --- proposals/4354-sticky-events.md | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 3299d300549..fe807781f65 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -362,8 +362,8 @@ a standardised mechanism for determining keys on sticky events, the `content.sti ``` `content.sticky_key` is ignored server-side[^encryption] and is purely informational. Clients which -receive a sticky event with a `sticky_key` SHOULD keep a map with keys determined via the 3-uple[^4uple] -`(room_id, sender, content.sticky_key)` to track the current values in the map. Nothing stops +receive a sticky event with a `sticky_key` SHOULD keep a map with keys determined via the 4-uple[^3uple] +`(room_id, sender, type, content.sticky_key)` to track the current values in the map. Nothing stops users sending multiple events with the same `sticky_key`. To deterministically tie-break, clients which implement this behaviour MUST: @@ -504,6 +504,12 @@ Malicious servers could set the TTL to be 0 ~ `sticky.duration_ms` , ensuring ma on whether or not an event was sticky. 
In contrast, using `origin_server_ts` is a consistent reference point that all servers are guaranteed to see, limiting the ability for malicious servers to cause divergence as all servers approximately track NTP. -[^4uple]: Earlier versions of this proposal had the key be the 4-uple `(room_id, sender, **type**, content.sticky_key)`, but there may -be valid use cases for a key to have different event types as values. Given you can emulate the 4-uple by string-packing the event type -into the `content.sticky_key`, for proposing a standard map mechanism it makes sense to just use the 3-uple form. +[^3uple]: Earlier versions of this proposal removed the event type and had the key be the 3-uple `(room_id, sender, content.sticky_key)`. +Conceptually, this made a per-user per-room map that looked like `Map`. In contrast, the 4-uple format is +`Map>`. This proposal has flip-flopped on 4-uple vs 3-uple because they are broadly speaking +equivalent: the 3-uple form can string pack the event type in the sticky key and the 4-uple form can loop over the outer event type map +looking for the same sticky key and apply tie-breaking rules to the resulting events. We do not have any use cases for the 3-uple form +as all use cases expect the values for a given key to be of the same type e.g `m.rtc.member`. Furthermore, it's not clear how this MSC +could safely propose a delimiter when both the event type and sticky key would be freeform unicode decided by the application. For these +reasons, this MSC chooses the 4-uple format. This makes implementing `Map` behaviour _slightly_ harder, and makes +implementing `Map>` _slightly_ easier. 
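The map behaviour described in the patch above can be sketched as follows. This is a minimal, hedged illustration rather than the proposal's reference implementation: the `StickyEvent` dataclass, `StickyMap` class and `rank` helper are hypothetical names introduced here, and the event is assumed to be already decrypted so that `type` and `content.sticky_key` are readable. It keys entries on the 4-uple `(room_id, sender, type, content.sticky_key)` and applies the tie-break as written at this point in the series (highest `origin_server_ts`, then highest lexicographical event ID). A later patch in this series changes the first criterion to `origin_server_ts + sticky.duration_ms`.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class StickyEvent:
    """Hypothetical, simplified view of a decrypted sticky event."""
    event_id: str
    room_id: str
    sender: str
    type: str
    sticky_key: str          # taken from content.sticky_key
    origin_server_ts: int    # milliseconds since the epoch


# The 4-uple used as the map key.
MapKey = Tuple[str, str, str, str]


def map_key(ev: StickyEvent) -> MapKey:
    return (ev.room_id, ev.sender, ev.type, ev.sticky_key)


def rank(ev: StickyEvent) -> Tuple[int, str]:
    # Tie-break order at this revision: highest origin_server_ts first,
    # then highest event ID. Event IDs are assumed to be ASCII, so plain
    # string comparison is used for the lexicographical ordering.
    return (ev.origin_server_ts, ev.event_id)


class StickyMap:
    """Per-client map of current sticky values (illustrative only)."""

    def __init__(self) -> None:
        self._entries: Dict[MapKey, StickyEvent] = {}

    def apply(self, ev: StickyEvent) -> bool:
        """Store ev if it wins the tie-break; return True if it was stored."""
        key = map_key(ev)
        current = self._entries.get(key)
        if current is None or rank(ev) > rank(current):
            self._entries[key] = ev
            return True
        return False

    def get(self, room_id: str, sender: str, type: str,
            sticky_key: str) -> Optional[StickyEvent]:
        return self._entries.get((room_id, sender, type, sticky_key))
```

Because the winner is decided purely from fields inside the events, two clients that see the same set of events in different orders should end up with the same map contents. Expiring entries when their stickiness ends is deliberately left out of this sketch.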
From 865746cba550498294137897f8372812ab889e7c Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Thu, 2 Oct 2025 10:47:45 +0100 Subject: [PATCH 33/50] Update 4354-sticky-events.md --- proposals/4354-sticky-events.md | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index fe807781f65..e45a357b38f 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -335,6 +335,16 @@ In the common case, it provides protection against clock skew when clients have - To enable this in SSS, the extension name is `org.matrix.msc4354.sticky_events`. - The `unsigned.sticky_duration_ttl_ms` field is `unsigned.msc4354_sticky_duration_ttl_ms` +The `/versions` response in the CSAPI includes: +```json +{ + "versions": ["..."], + "unstable_features": { + "org.matrix.msc4354": true + } +} +``` + ## Addendum This section explains how sticky events can be used to implement a short-lived, per-user, per-room key-value store. From 240d6502cbcddb1a52acc1df6c88768c1793f411 Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Thu, 2 Oct 2025 11:04:59 +0100 Subject: [PATCH 34/50] Update proposals/4354-sticky-events.md Co-authored-by: Travis Ralston --- proposals/4354-sticky-events.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index e45a357b38f..c254fc9fb6a 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -348,7 +348,7 @@ The `/versions` response in the CSAPI includes: ## Addendum This section explains how sticky events can be used to implement a short-lived, per-user, per-room key-value store. -This technique would be used by MatrixRTC to synchronise RTC members. +This technique would be used by MatrixRTC to synchronise RTC members, and should land in the spec as a suggested algorithm to follow. 
### Implementing an ephemeral map From 434794d7cb27a5271ec013674ed2d7e026592d68 Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Tue, 7 Oct 2025 17:04:35 +0100 Subject: [PATCH 35/50] Update 4354-sticky-events.md --- proposals/4354-sticky-events.md | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index c254fc9fb6a..6403b6d8f65 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -191,6 +191,11 @@ This provides fine-grained control over when to deliver the sticky events to cli to wait for another request. Servers SHOULD deliver the event to clients before the sticky event expires. This may not always be possible if the remaining time is very short. +Servers SHOULD return sticky events down `/sync` in batches if there are many sticky events to return in one go. +This ensures that clients can always make forward progress and can't get into a death spiral of never being able to +download a large `/sync` response. This MSC recommends a batch size of 100 sticky events per `/sync` response, across +all rooms. This means at most ~6.5MB of the sync response will contain sticky events. + ### Federation behaviour Servers are only responsible for sending sticky events originating from their own server. This ensures the server is aware @@ -275,6 +280,7 @@ the client chooses when to pull in this information, reducing the time-to-intera which do not sync is weak, given the entire point of sticky events is to ensure rapid synchronisation of temporary data. This heavily implies the use of some kind of syncing mechanism to receive timely updates, which polling a `/get_sticky_events` endpoint subverts. + ## Alternatives ### Use state events @@ -380,6 +386,11 @@ implement this behaviour MUST: - pick the one with the highest `origin_server_ts`, - tie break on the one with the highest lexicographical event ID (A < Z). 
+>[!NOTE] +> If a client sends two sticky events in the same millisecond, the 2nd event may be replaced by the 1st if +> the event ID of the 1st event has a higher lexicographical event ID. To protect against this, clients should +> ensure that they wait at least 1 millisecond between sending sticky events. + Clients SHOULD expire sticky events in maps when their stickiness ends. They should use the algorithm described in this proposal to determine if an event is still sticky. Clients may diverge if they do not expire sticky events as in the following scenario: ```mermaid From 6f94547a1c81bbf3cb3d3f3cf437d1fefb42f6ed Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Wed, 8 Oct 2025 10:01:26 +0100 Subject: [PATCH 36/50] Update 4354-sticky-events.md --- proposals/4354-sticky-events.md | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 6403b6d8f65..82a9557c572 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -265,6 +265,8 @@ following protections in place: * All sticky events are subject to normal PDU checks, meaning that the sender must be authorised to send events into the room. * Servers sending lots of sticky events may be asked to try again later as a form of rate-limiting. Due to data expiring, subsequent requests will gradually have less data. +* Sticky events are returned down `/sync` in batches of 100 to ensure clients never get a single enormous `/sync` response. They + will still get all unexpired sticky events via batches. We could add a layer of indirection to the `/sync` response where we only announce the number of sticky events, and expect the client to fetch them when they are ready via a different endpoint. 
This has roughly the same bandwidth cost, but @@ -381,7 +383,7 @@ a standardised mechanism for determining keys on sticky events, the `content.sti receive a sticky event with a `sticky_key` SHOULD keep a map with keys determined via the 4-uple[^3uple] `(room_id, sender, type, content.sticky_key)` to track the current values in the map. Nothing stops users sending multiple events with the same `sticky_key`. To deterministically tie-break, clients which -implement this behaviour MUST: +implement this behaviour MUST[^maporder]: - pick the one with the highest `origin_server_ts`, - tie break on the one with the highest lexicographical event ID (A < Z). @@ -534,3 +536,8 @@ as all use cases expect the values for a given key to be of the same type e.g `m could safely propose a delimiter when both the event type and sticky key would be freeform unicode decided by the application. For these reasons, this MSC chooses the 4-uple format. This makes implementing `Map` behaviour _slightly_ harder, and makes implementing `Map>` _slightly_ easier. +[^maporder]: We determine the order based on the data in the event and not by the order of sticky events in the array returned by `/sync`. +We do this because the `/sync` ordering may not match the sending order if A) servers violate the proposal and send later events first, +B) newer sticky events are retrieved transitively from a 3rd server via `/get_missing_events` _first_, then older sticky events are sent +afterwards, C) when batches of sticky events are returned down `/sync`, newer sticky events may appear in the timeline before older sticky events are +returned via batching. 
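The batching and size figures introduced in the two patches above can be illustrated with a short sketch. It is hedged throughout: `is_sticky` and `batch_sticky_events` are hypothetical helper names, events are modelled as plain dicts, and the expiry rule is the one written at this point in the series (`start = min(now, origin_server_ts)`), which a later patch changes to use the receive time. The `~6.5MB` figure is presumably 100 events at the Matrix event size limit of 65,536 bytes each; that derivation is an assumption here, not something the patch states.

```python
from typing import Dict, Iterator, List

MAX_STICKY_DURATION_MS = 3_600_000   # 1 hour cap from the proposal
BATCH_SIZE = 100                     # batch size recommended by the proposal
MAX_EVENT_BYTES = 65_536             # assumed: Matrix's maximum event size

# Worst-case bytes of sticky events in one /sync response: 6,553,600.
WORST_CASE_BYTES = BATCH_SIZE * MAX_EVENT_BYTES


def is_sticky(origin_server_ts: int, duration_ms: int, now_ms: int) -> bool:
    """Return True while the event's end time is still in the future."""
    # Clamp the start time so a timestamp in the future cannot extend stickiness.
    start = min(now_ms, origin_server_ts)
    end = start + min(duration_ms, MAX_STICKY_DURATION_MS)
    return end > now_ms


def batch_sticky_events(events: List[Dict], now_ms: int,
                        batch_size: int = BATCH_SIZE) -> Iterator[List[Dict]]:
    """Yield unexpired sticky events in chunks of at most batch_size."""
    live = [
        ev for ev in events
        if is_sticky(ev["origin_server_ts"], ev["sticky"]["duration_ms"], now_ms)
    ]
    for i in range(0, len(live), batch_size):
        yield live[i:i + batch_size]
```

A real server would re-check expiry when each batch is actually served, rather than once up front as this sketch does. Clients should also not infer sending order from the batch order, which is the point of the `[^maporder]` footnote.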
From 0d5e4d8ad75842fa98110f8161a7d5195b9f6995 Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Wed, 29 Oct 2025 08:51:22 +0000 Subject: [PATCH 37/50] Apply suggestions from code review Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com> Co-authored-by: Timo <16718859+toger5@users.noreply.github.com> --- proposals/4354-sticky-events.md | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 82a9557c572..51278257822 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -12,8 +12,9 @@ The concerns with MSC3757 and using it for MatrixRTC are mainly: abuse vector, as these states can pile up and can never be cleaned up as the DAG is append-only. 3. State resolution can cause rollbacks. These rollbacks may inadvertently affect per-user per-device state. -Other proposals have similar problems such as live location sharing which uses state events when it -really just wants per-user last-write-wins behaviour. +[MSC3489](https://github.com/matrix-org/matrix-spec-proposals/pull/3489) ("Sharing streams of location +data with history", AKA "live location sharing") has similar problems: it uses state events when it +really just needs per-user last-write-wins behaviour. There currently exists no good communication primitive in Matrix to send this kind of data. EDUs are almost the right primitive, but: @@ -30,7 +31,7 @@ This proposal adds such a primitive, called Sticky Events, which provides the fo * Access control tied to the joined members in the room. * Extensible, able to be sent by clients. -This new primitive can be used to implement MatrixRTC participation, live location sharing, among other functionality. +This new primitive can be used to implement MatrixRTC participation and live location sharing, among other functionality. ## Proposal @@ -198,7 +199,7 @@ all rooms. 
This means at most ~6.5MB of the sync response will contain sticky ev ### Federation behaviour -Servers are only responsible for sending sticky events originating from their own server. This ensures the server is aware +As with regular events, servers are only responsible for sending sticky events originating from their own server. This ensures the server is aware of the `prev_events` of all sticky events they send to other servers. This is important because the receiving server will attempt to fetch those previous events if they are unaware of them, _rejecting the transaction_ if the sending server fails to provide them. For this reason, it is not possible for servers to reliably deliver _other server's_ sticky events. @@ -385,13 +386,13 @@ receive a sticky event with a `sticky_key` SHOULD keep a map with keys determine users sending multiple events with the same `sticky_key`. To deterministically tie-break, clients which implement this behaviour MUST[^maporder]: -- pick the one with the highest `origin_server_ts`, +- pick the one with the highest `origin_server_ts + sticky.duration_ms`, - tie break on the one with the highest lexicographical event ID (A < Z). >[!NOTE] > If a client sends two sticky events in the same millisecond, the 2nd event may be replaced by the 1st if > the event ID of the 1st event has a higher lexicographical event ID. To protect against this, clients should -> ensure that they wait at least 1 millisecond between sending sticky events. +> ensure that they wait at least 1 millisecond between sending sticky events with the same `sticky_key`. Clients SHOULD expire sticky events in maps when their stickiness ends. They should use the algorithm described in this proposal to determine if an event is still sticky. 
Clients may diverge if they do not expire sticky events as in the following scenario: From 7e54063dbc183514b36fd555348b757ffdd46743 Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Wed, 29 Oct 2025 10:02:15 +0000 Subject: [PATCH 38/50] Redaction and last-to-expire commentary --- proposals/4354-sticky-events.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 51278257822..c4d5b52db60 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -386,7 +386,7 @@ receive a sticky event with a `sticky_key` SHOULD keep a map with keys determine users sending multiple events with the same `sticky_key`. To deterministically tie-break, clients which implement this behaviour MUST[^maporder]: -- pick the one with the highest `origin_server_ts + sticky.duration_ms`, +- pick the one with the highest `origin_server_ts + sticky.duration_ms` (last to expire wins), - tie break on the one with the highest lexicographical event ID (A < Z). >[!NOTE] @@ -411,13 +411,12 @@ sequenceDiagram ``` There is no mechanism for sticky events to expire earlier than their timeout value. To remove entries in the map, clients SHOULD -send another sticky event with just `content.sticky_key` set, with all the other application-specific fields omitted. +send another sticky event with just `content.sticky_key` set, with all the other application-specific fields omitted. Redacting +sticky events are an alternative way to do this, although this loses the `content.sticky_key` property so clients will need to +remember the sticky event ID to know which sticky key was affected. When clients create multiple events with the same `sticky_key`, they SHOULD use the same sticky duration as the previous -sticky event to avoid clients diverging. 
This can happen when a client sends a sticky event S with a long timeout, then overwrites it with S’ -with a short timeout. If S’ fails to be sent to all servers before the short timeout is hit, -some clients will believe the state is S and others will have no state. This will only resolve once the long timeout is hit. -To illustrate this, consider the scenario when clients use the _same sticky duration_: +sticky event to avoid clients not applying more recent sticky events. ``` Event Lifetime S [=========|==] | @@ -436,7 +435,8 @@ Event Lifetime A B ``` Just like before, at time `A` the possible states are `{ _, S, S'}`, but now at time `B` the possible states are `{ _, S }`. -This is problematic if you're trying to agree on the "latest" values, like you would in a k:v map. +This is problematic if you're trying to agree on the "latest" values, like you would in a k:v map. Note that if a client had +seen S then sees S', they will ignore it due to it having a lower expiry time than S (last to expire wins). Note that encrypted sticky events will encrypt some parts of the 4-uple. 
An encrypted sticky event only exposes the room ID and sender to the server: From da7c7c756e5dbc40beb325876eb4a0ff52ef2b16 Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Wed, 29 Oct 2025 11:50:35 +0000 Subject: [PATCH 39/50] vdh comments --- proposals/4354-sticky-events.md | 31 +++++++++++++++++++------------ 1 file changed, 19 insertions(+), 12 deletions(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index c4d5b52db60..abcc60ba8d9 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -1,13 +1,15 @@ # MSC4354: Sticky Events -MatrixRTC currently depends on [MSC3757](https://github.com/matrix-org/matrix-spec-proposals/pull/3757) +MatrixRTC currently relies on allowing any user (PL0) to send `org.matrix.msc3401.call` +and `org.matrix.msc3401.call.member` state events into the room for sending per-user per-device state. MatrixRTC wants to be able to share a temporary state to all users in a room to indicate whether the given client is in the call or not. -The concerns with MSC3757 and using it for MatrixRTC are mainly: +The concerns with allowing any user to send room state and using it for MatrixRTC are mainly: -1. In order to ensure other users are unable to modify each other’s state, it proposes using - string packing for authorization which feels wrong, given the structured nature of events. +1. Any user can modify other user's call state. MSC3757 tries to fix this, but in order to ensure other + users are unable to modify each other’s state, it proposes using string packing for authorization which + feels wrong, given the structured nature of events. 2. Allowing unprivileged users to send arbitrary amounts of state into the room is a potential abuse vector, as these states can pile up and can never be cleaned up as the DAG is append-only. 3. State resolution can cause rollbacks. These rollbacks may inadvertently affect per-user per-device state. 
@@ -16,14 +18,18 @@ The concerns with MSC3757 and using it for MatrixRTC are mainly: data with history", AKA "live location sharing") has similar problems: it uses state events when it really just needs per-user last-write-wins behaviour. -There currently exists no good communication primitive in Matrix to send this kind of data. EDUs are +There currently exists no good communication primitive in Matrix to send this kind of data. Receipt/Typing EDUs are almost the right primitive, but: -* They can’t be sent via clients (there is no concept of EDUs in the Client-Server API\! - [MSC2477](https://github.com/matrix-org/matrix-spec-proposals/pull/2477) tries to change that) -* They aren’t extensible. -* They do not guarantee delivery. Each EDU type has slightly different persistence/delivery guarantees, - all of which currently fall short of guaranteeing delivery, with the exception of to-device messages. +* They aren't extensible. You can _only_ send receipts / typing notifications and cannot add extra keys to the JSON object. + There is no concept of EDUs in the Client-Server API to allow additional EDU types, though + [MSC2477](https://github.com/matrix-org/matrix-spec-proposals/pull/2477) tries to change that. +* They do not guarantee delivery. Receipts/typing have slightly different persistence/delivery guarantees, + all of which currently fall short of guaranteeing delivery. You _can_ guarantee delivery with EDUs, which is what to-device messages + do, but that lacks the per-room scoping required for a per-room, per-user state. It's insufficient to just slap on some extra + keys to make it per-room, per-user though because of the Byzantine broadcast problem: a user can send each server _different_ + state, thus breaking convergence. To-device messages fundamentally avoid this by being point-to-point communication, and not + a broadcast mechanism. 
This proposal adds such a primitive, called Sticky Events, which provides the following guarantees: @@ -168,8 +174,9 @@ Over Simplified Sliding Sync, Sticky Events have their own extension `sticky_eve } ``` -Sticky events are expected to be encrypted and so there is no "state filter" equivalent provided for sticky events -e.g to filter sticky events by event type. +Sticky events are expected to be encrypted and so there is no [state filter](https://spec.matrix.org/v1.16/client-server-api/#post_matrixclientv3useruseridfilter_request_roomeventfilter) +equivalent provided for sticky events e.g to filter sticky events by event type. +As with normal events, sticky events sent by ignored users MUST NOT be delivered to clients. ### Rate limits From 50c1910978c8517d2593ad4be2d8e9dad0b27b1a Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Wed, 29 Oct 2025 12:00:14 +0000 Subject: [PATCH 40/50] more vdh --- proposals/4354-sticky-events.md | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index abcc60ba8d9..02147c77982 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -46,7 +46,8 @@ which is the number of milliseconds for the event to be sticky. The presence of with a valid value makes the event “sticky”[^stickyobj]. Valid values are the integer range 0-3600000 (1 hour). For use cases that require stickiness beyond this limit, the application is responsible for sending another event to make it happen. The `sticky` key is not protected from redaction. A redacted sticky event is the same -as a normal event. Note: this new top-level object is added to the [`ClientEvent` format](https://spec.matrix.org/v1.16/client-server-api/#room-event-format). +as a normal event. 
Note: this new top-level object is added to the [`ClientEvent` format](https://spec.matrix.org/v1.16/client-server-api/#room-event-format) +and the [`Persistent Data Unit`](https://spec.matrix.org/v1.16/rooms/v12/#event-format-1) for each room version. ```json { @@ -70,10 +71,8 @@ added to the following endpoints: To calculate if any sticky event is still sticky: * Calculate the start time: - * The start time is `min(now, origin_server_ts)`. This ensures that malicious origin timestamps cannot + * The start time is `min(received_ts, origin_server_ts)`. This ensures that malicious origin timestamps cannot specify start times in the future. - * If the event is pushed over federation via `/send`, servers MAY use the current time as the start time instead. This minimises - the risk of clock skew causing the start time to be too far in the past. See “Potential issues \> Time”. * Calculate the end time as `start_time + min(sticky_duration_ms, 3600000)`. * If the end time is in the future, the event remains sticky. From 732a72b07937438dc5d94f1c68933a0360ca28c6 Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Fri, 31 Oct 2025 08:56:56 +0000 Subject: [PATCH 41/50] More vdh comments --- proposals/4354-sticky-events.md | 38 +++++++++++++++++---------------- 1 file changed, 20 insertions(+), 18 deletions(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 02147c77982..86df7703013 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -79,19 +79,21 @@ To calculate if any sticky event is still sticky: Sticky events are like normal message events and are authorised using normal PDU checks. 
They have the following _additional_ properties[^prop]: -* They are eagerly synchronised with all other servers.[^partial] -* They must appear in the `/sync` response.[^sync] -* The soft-failure checks MUST be re-evaluated when the membership state changes for a user with unexpired sticky events.[^softfail] -* They ignore history visibility checks. Any joined user is authorised to see sticky events for the duration they remain sticky.[^hisvis] +* They are eagerly **pushed** to all other servers.[^partial] +* They must be **delivered** to clients.[^sync] +* Only state event level **checks** are applied to them.[^softfail][^hisvis] To implement these properties, servers MUST: -* Attempt to send their own[^origin] sticky events to all joined servers, whilst respecting per-server backoff times. - Large volumes of events to send MUST NOT cause the sticky event to be dropped from the send queue on the server. -* Ensure all sticky events are delivered to clients via `/sync` in a new section of the sync response, - regardless of whether the sticky event falls within the timeline limit of the request. -* When a new server joins the room, existing servers MUST attempt delivery of all of their own sticky events[^newjoiner]. -* Remember sticky events per-user, per-room such that the soft-failure checks can be re-evaluated. +* Attempt to **push** their own[^origin] sticky events to all joined servers, whilst respecting per-server backoff times. + Large volumes of events to send MUST NOT cause the sticky event to be dropped from the send queue on the server. +* When a new server joins the room, existing servers MUST attempt to **push** all of their own sticky events[^newjoiner]. +* Ensure sticky events are **delivered** to clients via `/sync` in a new section of the sync response, + regardless of whether the sticky event falls within the timeline limit of the request. 
If there are too many sticky events, + the client is informed of this and can fetch the remaining sticky events via a new pagination endpoint. +* Soft-failure **checks** MUST be re-evaluated when the membership state changes for a user with unexpired sticky events.[^softfail] +* History visibility **checks** MUST NOT be applied to sticky events. Any joined user is authorised to see sticky events + for the duration they remain sticky.[^hisvis] When an event loses its stickiness, these properties disappear with the stickiness. Servers SHOULD NOT eagerly synchronise such events anymore, nor send them down `/sync`, nor re-evaluate their soft-failure status. @@ -123,8 +125,9 @@ The new `/sync` section looks like: "sticky": { "duration_ms": 300000 }, - "origin_server_ts": 1757920344000, - "content": { ... } + "origin_server_ts": 1757920341020, + "content": { ... }, + "unsigned": { "sticky_duration_ttl_ms": 258113 } }, { "sender": "@alice:example.com", @@ -132,8 +135,9 @@ The new `/sync` section looks like: "sticky": { "duration_ms": 300000 }, - "origin_server_ts": 1757920311020, + "origin_server_ts": 1757920344000, "content": { ... } + "unsigned": { "sticky_duration_ttl_ms": 289170 } } ] } @@ -145,6 +149,9 @@ The new `/sync` section looks like: Sticky messages MAY be sent in the timeline section of the `/sync` response, regardless of whether or not they exceed the timeline limit[^ordering]. If a sticky event is in the timeline, it MAY be omitted from the `sticky.events` section. This ensures we minimise duplication in the `/sync` response JSON. +This proposal recommends always putting sticky events into the `sticky.events` section _except_ if +the sticky event is going to be returned in the `timeline.events` section of the current sync response. +In other words, filter out any event from `sticky.events` where the event ID appears in `timeline.events`. Sticky events follow the same 'stream-like' behaviour as the `timeline`. 
This means clients will receive a sticky event S _once_, and subsequent requests with an advanced `since` token will not return the same sticky event S. @@ -198,11 +205,6 @@ This provides fine-grained control over when to deliver the sticky events to cli to wait for another request. Servers SHOULD deliver the event to clients before the sticky event expires. This may not always be possible if the remaining time is very short. -Servers SHOULD return sticky events down `/sync` in batches if there are many sticky events to return in one go. -This ensures that clients can always make forward progress and can't get into a death spiral of never being able to -download a large `/sync` response. This MSC recommends a batch size of 100 sticky events per `/sync` response, across -all rooms. This means at most ~6.5MB of the sync response will contain sticky events. - ### Federation behaviour As with regular events, servers are only responsible for sending sticky events originating from their own server. This ensures the server is aware From e5c1635932d51392ec5ecd4322706e0457c35bbe Mon Sep 17 00:00:00 2001 From: Kegan Dougal <7190048+kegsay@users.noreply.github.com> Date: Fri, 31 Oct 2025 11:30:28 +0000 Subject: [PATCH 42/50] Pagination --- proposals/4354-sticky-events.md | 60 ++++++++++++++++++++++++--------- 1 file changed, 44 insertions(+), 16 deletions(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 86df7703013..a1345d6a3e3 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -118,6 +118,7 @@ The new `/sync` section looks like: "state": { ... }, "timeline": { ... 
}, "sticky": { + "prev_batch": "s11_22_33_44_55_66_77_88", "events": [ { "sender": "@bob:example.com", @@ -166,6 +167,7 @@ Over Simplified Sliding Sync, Sticky Events have their own extension `sticky_eve { "rooms": { "!726s6s6q:example.com": { + "prev_batch": "s11_22_33_44_55_66_77_88", "events": [{ "sender": "@bob:example.com", "type": "m.foo", @@ -184,6 +186,45 @@ Sticky events are expected to be encrypted and so there is no [state filter](htt equivalent provided for sticky events e.g to filter sticky events by event type. As with normal events, sticky events sent by ignored users MUST NOT be delivered to clients. +#### Pagination + +If there are too many sticky events to return down `/sync`, the server may choose not to deliver all sticky events and +instead provide a `prev_batch` token which can be passed to a new endpoint to retrieve sticky events in that room. If all +sticky events were delivered, the `prev_batch` token is omitted from the Sync / Sliding Sync response object. + +This proposal recommends only sending a `prev_batch` token if there are more than 100 sticky events in a given room. This +minimises the chances that clients will need to paginate, improving responsiveness at the cost of higher initial bandwidth used. + +The API shape follows the one proposed in [MSC4308: Thread Subscriptions extension to Sliding Sync](https://github.com/matrix-org/matrix-spec-proposals/blob/rei/msc_ssext_threadsubs/proposals/4308-sliding-sync-ext-thread-subscriptions.md#companion-endpoint-for-backpaginating-thread-subscription-changes): + +``` +GET /_matrix/client/v1/rooms/{roomId}/sticky_events +``` +URL query parameters: + - dir (string, required): always `b` (backward), to mirror other pagination endpoints. The forward direction is not yet specified to be implemented. + - from (string, optional): a token used to continue backpaginating + The token is either acquired from a previous `/sticky_events` response, or the `prev_batch` in a Sliding Sync / Sync response. 
+ The token is opaque and has no client-discernible meaning. + If this token is not provided, then backpagination starts from the 'end'. + - to (string, optional): a token used to limit the backpagination + The token, originally acquired from pos in a Sliding Sync response, would be the same one used as the pos request parameter in the Sliding Sync request that returned the prev_batch. + - limit (int, optional; default 100): a maximum number of sticky events to fetch in one response. + Must be greater than zero. Servers may impose a smaller limit than requested. + +Response body: +```js +{ + "sticky_events": [ ... ], // list of sticky events + // If there are still more sticky events to fetch, + // a new `from` token the client can use to walk further + // backwards. (The `to` token, if used, should be reused.) + "end": "OPAQUE_TOKEN" +} +``` + +NB: This endpoint may also be used to retrieve sticky events in a room without calling `/sync` at all (by omitting both `from` and `to`), +which may be useful for bots. + ### Rate limits As sticky events are sent to clients regardless of the timeline limit, care needs to be taken to ensure @@ -274,22 +315,8 @@ following protections in place: * All sticky events are subject to normal PDU checks, meaning that the sender must be authorised to send events into the room. * Servers sending lots of sticky events may be asked to try again later as a form of rate-limiting. Due to data expiring, subsequent requests will gradually have less data. -* Sticky events are returned down `/sync` in batches of 100 to ensure clients never get a single enormous `/sync` response. They - will still get all unexpired sticky events via batches. - -We could add a layer of indirection to the `/sync` response where we only announce the number of sticky events, and -expect the client to fetch them when they are ready via a different endpoint. 
This has roughly the same bandwidth cost, but -the client chooses when to pull in this information, reducing the time-to-interactivity. This has a few problems: - - It assumes sticky events are not urgently required when opening the application. This may be true for something like live - location sharing but may not be true for VoIP calls. - - It's not clear that there is a strong need for the extra indirection, given the strong rate limits and expirations already in - place. - - Adding the indirection increases complexity and friction when using the API, and presupposes the standard `/sync` model. - For [MSC4186: Simplified Sliding Sync](https://github.com/matrix-org/matrix-spec-proposals/pull/4186), clients can already indirect - if they wish to by simply not enabling the extension until they are ready to receive the data. Therefore, any new `/get_sticky_events` - API would really only be useful for A) applications which do not sync, B) users of the existing `/sync` API. The use case for applications - which do not sync is weak, given the entire point of sticky events is to ensure rapid synchronisation of temporary data. This heavily - implies the use of some kind of syncing mechanism to receive timely updates, which polling a `/get_sticky_events` endpoint subverts. +* Sticky events are returned down `/sync` with a recommended limit of 100 per room to ensure clients never get a single enormous `/sync` response. They + will still get all unexpired sticky events via the pagination endpoint. ## Alternatives @@ -351,6 +378,7 @@ In the common case, it provides protection against clock skew when clients have - The sticky key in the `content` of the PDU is `msc4354_sticky_key`. - To enable this in SSS, the extension name is `org.matrix.msc4354.sticky_events`. 
- The `unsigned.sticky_duration_ttl_ms` field is `unsigned.msc4354_sticky_duration_ttl_ms` +- The endpoint `/_matrix/client/v1/rooms/{roomId}/sticky_events` is `/_matrix/client/unstable/org.matrix.msc4354/rooms/{roomId}/sticky_events`. The `/versions` response in the CSAPI includes: ```json From 331484dfd982b360f83c9de29459deb986180a68 Mon Sep 17 00:00:00 2001 From: "Olivier Wilkinson (reivilibre)" Date: Tue, 16 Dec 2025 16:35:39 +0000 Subject: [PATCH 43/50] Replace 'companion endpoint pagination' with MSC3885-style 'subtoken pagination' --- proposals/4354-sticky-events.md | 80 ++++++++++++++++----------------- 1 file changed, 40 insertions(+), 40 deletions(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index a1345d6a3e3..fc14b0aca34 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -89,8 +89,8 @@ To implement these properties, servers MUST: Large volumes of events to send MUST NOT cause the sticky event to be dropped from the send queue on the server. * When a new server joins the room, existing servers MUST attempt to **push** all of their own sticky events[^newjoiner]. * Ensure sticky events are **delivered** to clients via `/sync` in a new section of the sync response, - regardless of whether the sticky event falls within the timeline limit of the request. If there are too many sticky events, - the client is informed of this and can fetch the remaining sticky events via a new pagination endpoint. + regardless of whether the sticky event falls within the timeline limit of the request. + If there are too many sticky events to deliver at once, they will be delivered in subsequent `/sync` responses instead. * Soft-failure **checks** MUST be re-evaluated when the membership state changes for a user with unexpired sticky events.[^softfail] * History visibility **checks** MUST NOT be applied to sticky events. 
Any joined user is authorised to see sticky events for the duration they remain sticky.[^hisvis] @@ -118,7 +118,6 @@ The new `/sync` section looks like: "state": { ... }, "timeline": { ... }, "sticky": { - "prev_batch": "s11_22_33_44_55_66_77_88", "events": [ { "sender": "@bob:example.com", @@ -161,13 +160,24 @@ When sending sticky events down `/sync`, the `unsigned` section SHOULD have a `s how many milliseconds until the sticky event expires. This provides a way to reduce clock skew between a local homeserver and their connected clients. Clients SHOULD use this value to determine when the sticky event expires. -Over Simplified Sliding Sync, Sticky Events have their own extension `sticky_events`, which has the following response shape: +Over Simplified Sliding Sync, Sticky Events have their own extension `sticky_events`, which has the following request extension +shape: ```js { + "enabled": true, + "limit": 100, // optional (default 100, min 1): max number of events to return, server can override to a lower number + "since": "some_token" // optional: can be omitted on initial sync / when extension is only just enabled +} +``` + +and, when enabled, the following response extension shape: + +```js +{ + "next_batch": "some_token", // REQUIRED when there are changes "rooms": { "!726s6s6q:example.com": { - "prev_batch": "s11_22_33_44_55_66_77_88", "events": [{ "sender": "@bob:example.com", "type": "m.foo", @@ -188,42 +198,35 @@ As with normal events, sticky events sent by ignored users MUST NOT be delivered #### Pagination -If there are too many sticky events to return down `/sync`, the server may choose not to deliver all sticky events and -instead provide a `prev_batch` token which can be passed to a new endpoint to retrieve sticky events in that room. If all -sticky events were delivered, the `prev_batch` token is omitted from the Sync / Sliding Sync response object. 
+Because sticky events and to-device messages are alike in the way that they should be *reliably* delivered to +clients, without any gaps in the pagination, they follow the [MSC3885: Sliding Sync Extension: To-Device messages][MSC3885] +model for pagination in sliding sync. -This proposal recommends only sending a `prev_batch` token if there are more than 100 sticky events in a given room. This -minimises the chances that clients will need to paginate, improving responsiveness at the cost of higher initial bandwidth used. +In short: when there are too many sticky events to return in one response, the server returns a limited number +of the oldest sticky events that have not yet been delivered. -The API shape follows the one proposed in [MSC4308: Thread Subscriptions extension to Sliding Sync](https://github.com/matrix-org/matrix-spec-proposals/blob/rei/msc_ssext_threadsubs/proposals/4308-sliding-sync-ext-thread-subscriptions.md#companion-endpoint-for-backpaginating-thread-subscription-changes): +At every response, the server returns a `next_batch` token which the client MUST persist and send +as a `since` token in the next Sliding Sync request (in the extension), if the client wishes to advance +in the sticky events stream. -``` -GET /_matrix/client/v1/rooms/{roomId}/sticky_events -``` -URL query parameters: - - dir (string, required): always `b` (backward), to mirror other pagination endpoints. The forward direction is not yet specified to be implemented. - - from (string, optional): a token used to continue backpaginating - The token is either acquired from a previous `/sticky_events` response, or the `prev_batch` in a Sliding Sync / Sync response. - The token is opaque and has no client-discernible meaning. - If this token is not provided, then backpagination starts from the 'end'. 
- - to (string, optional): a token used to limit the backpagination - The token, originally acquired from pos in a Sliding Sync response, would be the same one used as the pos request parameter in the Sliding Sync request that returned the prev_batch. - - limit (int, optional; default 100): a maximum number of sticky events to fetch in one response. - Must be greater than zero. Servers may impose a smaller limit than requested. - -Response body: -```js -{ - "sticky_events": [ ... ], // list of sticky events - // If there are still more sticky events to fetch, - // a new `from` token the client can use to walk further - // backwards. (The `to` token, if used, should be reused.) - "end": "OPAQUE_TOKEN" -} -``` +However, we don't require `next_batch` to be provided in the response when there are no changes, because that seems +like a mistake, which would lead to unnecessarily high quiescent bandwidth usage if many extensions follow this pattern. +[There is a comment thread open on MSC3885.](https://github.com/matrix-org/matrix-spec-proposals/pull/3885#discussion_r2623950713). + +One concern is that [MSC3885] has not yet been updated to account for [MSC4186 'Simplified' Sliding Sync][MSC4186], +the 'modern-day' dialect of Sliding Sync, so it is unknown whether this pattern will remain in use. +Whatever happens, this MSC should likely follow the same evolution as that one. + +[MSC3885]: https://github.com/matrix-org/matrix-spec-proposals/pull/3885 +[MSC4186]: https://github.com/matrix-org/matrix-spec-proposals/pull/4186 + +Another concern is a potential problem that we are calling 'flickering'. +This is where due to oldest-first pagination, a client might briefly display stale data before +near-immediately updating it with later data, despite that later data already having been 'available' +on the server. 
-NB: This endpoint may also be used to retrieve sticky events in a room without calling `/sync` at all (by omitting both `from` and `to`), -which may be useful for bots. +With that said, given this is an edge case that requires a substantial number of sticky events to trigger, +we don't currently consider it worthwhile to add complexity to avoid. ### Rate limits @@ -315,8 +318,6 @@ following protections in place: * All sticky events are subject to normal PDU checks, meaning that the sender must be authorised to send events into the room. * Servers sending lots of sticky events may be asked to try again later as a form of rate-limiting. Due to data expiring, subsequent requests will gradually have less data. -* Sticky events are returned down `/sync` with a recommended limit of 100 per room to ensure clients never get a single enormous `/sync` response. They - will still get all unexpired sticky events via the pagination endpoint. ## Alternatives @@ -378,7 +379,6 @@ In the common case, it provides protection against clock skew when clients have - The sticky key in the `content` of the PDU is `msc4354_sticky_key`. - To enable this in SSS, the extension name is `org.matrix.msc4354.sticky_events`. - The `unsigned.sticky_duration_ttl_ms` field is `unsigned.msc4354_sticky_duration_ttl_ms` -- The endpoint `/_matrix/client/v1/rooms/{roomId}/sticky_events` is `/_matrix/client/unstable/org.matrix.msc4354/rooms/{roomId}/sticky_events`. 
The `/versions` response in the CSAPI includes: ```json From 4340903c15e9eab1bfb2f6a31cfa08fd535f7e7c Mon Sep 17 00:00:00 2001 From: "Olivier Wilkinson (reivilibre)" Date: Fri, 19 Dec 2025 16:50:38 +0000 Subject: [PATCH 44/50] Describe interaction of sticky events with /sync RoomFilter --- proposals/4354-sticky-events.md | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index fc14b0aca34..e00c8e0fe9b 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -151,7 +151,12 @@ or not they exceed the timeline limit[^ordering]. If a sticky event is in the ti omitted from the `sticky.events` section. This ensures we minimise duplication in the `/sync` response JSON. This proposal recommends always putting sticky events into the `sticky.events` section _except_ if the sticky event is going to be returned in the `timeline.events` section of the current sync response. -In other words, filter out any event from `sticky.events` where the event ID appears in `timeline.events`. +In other words, filter out any event from `sticky.events` where the event ID appears in `timeline.events`. + +**Interaction with `RoomFilter`:** The `RoomFilter` does not apply to the `sticky.events` section, as it is neither `timeline` nor `state`. +However, the `timeline` filter MUST be applied before applying the deduplication logic above. +In other words, if a sticky event would normally appear in both the `timeline.events` section and the `sticky.events` section, +but is filtered out by the `timeline` filter, the sticky event MUST appear in `sticky.events`. Sticky events follow the same 'stream-like' behaviour as the `timeline`. This means clients will receive a sticky event S _once_, and subsequent requests with an advanced `since` token will not return the same sticky event S. 
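The ordering rule introduced in the patch above (apply the `timeline` filter first, then deduplicate `sticky.events` against what is left in `timeline.events`) can be sketched as follows. This is a minimal illustration under stated assumptions, not how any homeserver implements it: the function name `build_room_sections`, the `Event` alias, and the predicate-style `timeline_filter` are all hypothetical stand-ins, and events are modelled as plain dicts carrying an `event_id`.

```python
from typing import Callable

# Hypothetical stand-in for a ClientEvent: a dict with at least "event_id".
Event = dict


def build_room_sections(
    timeline_candidates: list[Event],
    sticky_candidates: list[Event],
    timeline_filter: Callable[[Event], bool],
) -> tuple[list[Event], list[Event]]:
    """Return (timeline_events, sticky_events) for one room.

    Assumption: `timeline_filter` returns True for events that the
    client's `timeline` filter lets through.
    """
    # Step 1: apply the timeline filter BEFORE any deduplication.
    timeline_events = [ev for ev in timeline_candidates if timeline_filter(ev)]

    # Step 2: drop a sticky event only if it really is in the filtered
    # timeline. The sticky section itself is not subject to the filter.
    ids_in_timeline = {ev["event_id"] for ev in timeline_events}
    sticky_events = [
        ev for ev in sticky_candidates if ev["event_id"] not in ids_in_timeline
    ]
    return timeline_events, sticky_events


# Illustrative data only; the event IDs and types are made up for the example.
msg = {"event_id": "$msg", "type": "m.room.message"}
rtc = {"event_id": "$rtc", "type": "m.rtc.member", "sticky": {"duration_ms": 600000}}

# A timeline filter that only admits m.room.message removes $rtc from the
# timeline, so $rtc must still be delivered via the sticky section.
timeline, sticky = build_room_sections(
    [msg, rtc], [rtc], lambda ev: ev["type"] == "m.room.message"
)
```

Deduplicating before filtering would be the failure mode this rule guards against: the sticky event would be removed from `sticky.events` because it was a timeline candidate, then removed from the timeline by the filter, and the client would never receive it.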
From 41deb2d6255e4c749f9e8d0b9fba4ee60f2867a3 Mon Sep 17 00:00:00 2001 From: "Olivier Wilkinson (reivilibre)" Date: Wed, 1 Apr 2026 16:56:57 +0100 Subject: [PATCH 45/50] Describe 'join room' behaviour for sync, as well as 'all joined rooms' as targets --- proposals/4354-sticky-events.md | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index e00c8e0fe9b..074f24fde68 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -165,6 +165,9 @@ When sending sticky events down `/sync`, the `unsigned` section SHOULD have a `s how many milliseconds until the sticky event expires. This provides a way to reduce clock skew between a local homeserver and their connected clients. Clients SHOULD use this value to determine when the sticky event expires. +When the user joins a room, the server MUST include all unexpired sticky events for that room in their subsequent +sync response. + Over Simplified Sliding Sync, Sticky Events have their own extension `sticky_events`, which has the following request extension shape: @@ -201,6 +204,14 @@ Sticky events are expected to be encrypted and so there is no [state filter](htt equivalent provided for sticky events e.g to filter sticky events by event type. As with normal events, sticky events sent by ignored users MUST NOT be delivered to clients. +The server MUST include sticky events across all joined rooms in the Sticky Event extension response for Sliding Sync, +regardless of what subscription lists are requested by the client. + +As with regular `/sync`, when a user joins a room, the server MUST include all unexpired sticky events for that room +in their subsequent sync responses. +The server MAY spread them across multiple sync responses or the server MAY ignore the `limit` specified +in the request extension for this case, depending on implementation preference. 
+ #### Pagination Because sticky events and to-device messages are alike in the way that they should be *reliably* delivered to From 082a157696e7311982a4b65b1ba77ca26e784cbb Mon Sep 17 00:00:00 2001 From: "Olivier Wilkinson (reivilibre)" Date: Wed, 1 Apr 2026 17:05:09 +0100 Subject: [PATCH 46/50] MUST send in order -> SHOULD (best-effort) but no need to guarantee --- proposals/4354-sticky-events.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 074f24fde68..028792425fb 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -282,10 +282,6 @@ can fall outside this range, which is what we define as "old". On the receiving `prev_events`, which cannot be connected to any known part of the room DAG. Sending sticky events to newly joined servers can be seen as a form of sending old but unexpired sticky events, and so this proposal only considers this case. -Servers MUST send old sticky events in the order they were created on the server (stream ordering / based on `origin_server_ts`). -This ensures that sticky events appear in roughly the right place in the timeline as servers use the arrival ordering to determine -an event's position in the timeline. - Sending these old events will potentially increase the number of forward extremities in the room for the receiving server. This may impact state resolution performance if there are many forward extremities. Servers MAY send dummy events to remove forward extremities (Synapse has the option to do this since 2019). Alternatively, servers MAY choose not to add old sticky events to their forward extremities, but @@ -293,6 +289,12 @@ this A) reduces eventual delivery guarantees by reducing the frequency of transi rate when implementing ephemeral maps (see "Addendum: Implementing an ephemeral map"), as that relies on servers referencing sticky events from other servers. 
+Servers SHOULD (best-effort) send sticky events to other homeservers in the order they were created on the server +(stream ordering / based on `origin_server_ts`). +However, this does not need to be guaranteed, particularly when catching up sending old sticky events +(either after a network partition or to a newly-joined server) at the same time as new sticky events +are being created in real-time. + ## Potential issues ### Time From 8fbd13da77e725f14cefb2ae93b96d91abb69920 Mon Sep 17 00:00:00 2001 From: "Olivier Wilkinson (reivilibre)" Date: Wed, 1 Apr 2026 17:07:58 +0100 Subject: [PATCH 47/50] Simplify by picking a lane for /sync deduplication --- proposals/4354-sticky-events.md | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 028792425fb..40edeba1159 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -146,12 +146,9 @@ The new `/sync` section looks like: } } ``` -Sticky messages MAY be sent in the timeline section of the `/sync` response, regardless of whether -or not they exceed the timeline limit[^ordering]. If a sticky event is in the timeline, it MAY be -omitted from the `sticky.events` section. This ensures we minimise duplication in the `/sync` response JSON. -This proposal recommends always putting sticky events into the `sticky.events` section _except_ if -the sticky event is going to be returned in the `timeline.events` section of the current sync response. -In other words, filter out any event from `sticky.events` where the event ID appears in `timeline.events`. +If a sticky event appears in the timeline section of the `/sync` response (`timeline.events`), +it MUST NOT be included in the `sticky.events` section. +This ensures we minimise duplication in the `/sync` response JSON. **Interaction with `RoomFilter`:** The `RoomFilter` does not apply to the `sticky.events` section, as it is neither `timeline` nor `state`. 
However, the `timeline` filter MUST be applied before applying the deduplication logic above. From 8c491f319ee462432af8dc91e7fc331ffd694ef2 Mon Sep 17 00:00:00 2001 From: "Olivier Wilkinson (reivilibre)" Date: Wed, 1 Apr 2026 17:40:44 +0100 Subject: [PATCH 48/50] Unify on deduplication also for sliding sync --- proposals/4354-sticky-events.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 40edeba1159..4da2854ad84 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -197,6 +197,9 @@ and, when enabled, the following response extension shape: } ``` +As with regular `/sync`, if a sticky event appears in the `timeline_events` section +of the sync response, it MUST NOT be included in the Sticky Events extension response. + Sticky events are expected to be encrypted and so there is no [state filter](https://spec.matrix.org/v1.16/client-server-api/#post_matrixclientv3useruseridfilter_request_roomeventfilter) equivalent provided for sticky events e.g to filter sticky events by event type. As with normal events, sticky events sent by ignored users MUST NOT be delivered to clients. From 5c6bd897be4a8906229d1cc9b0732fb2dc096910 Mon Sep 17 00:00:00 2001 From: "Olivier Wilkinson (reivilibre)" Date: Wed, 1 Apr 2026 18:12:30 +0100 Subject: [PATCH 49/50] Policy servers and similar spam checkers disable the stickiness --- proposals/4354-sticky-events.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 4da2854ad84..1758e94ad2b 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -97,7 +97,9 @@ To implement these properties, servers MUST: When an event loses its stickiness, these properties disappear with the stickiness. Servers SHOULD NOT eagerly synchronise such events anymore, nor send them down `/sync`, nor re-evaluate their soft-failure status. 
-Note: policy servers and other similar antispam techniques still apply to these events. + +Policy servers and similar homeserver-specific antispam techniques (e.g. custom spam checker modules) still apply to these events, +in which case the stickiness of the event is prevented. These messages may be combined with [MSC4140: Delayed Events](https://github.com/matrix-org/matrix-spec-proposals/pull/4140) to provide heartbeat semantics (e.g required for MatrixRTC). Note that the sticky duration in this proposal From ad1203d0c15043a11a2540ff49d8137f931b74db Mon Sep 17 00:00:00 2001 From: "Olivier Wilkinson (reivilibre)" Date: Tue, 7 Apr 2026 12:39:04 +0100 Subject: [PATCH 50/50] Sliding sync: fix to 'only for interested rooms, regardless of top-N window' --- proposals/4354-sticky-events.md | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/proposals/4354-sticky-events.md b/proposals/4354-sticky-events.md index 1758e94ad2b..c0c04b6efc1 100644 --- a/proposals/4354-sticky-events.md +++ b/proposals/4354-sticky-events.md @@ -108,6 +108,8 @@ whereas the purpose of delayed events is to affect the send time (and thus start ### Sync API changes +#### Current `/sync` + The new `/sync` section looks like: ```js @@ -167,6 +169,8 @@ and their connected clients. Clients SHOULD use this value to determine when the When the user joins a room, the server MUST include all unexpired sticky events for that room in their subsequent sync response. +#### MSC4186 (Simplified) Sliding Sync + Over Simplified Sliding Sync, Sticky Events have their own extension `sticky_events`, which has the following request extension shape: @@ -206,8 +210,11 @@ Sticky events are expected to be encrypted and so there is no [state filter](htt equivalent provided for sticky events e.g to filter sticky events by event type. As with normal events, sticky events sent by ignored users MUST NOT be delivered to clients. 
-The server MUST include sticky events across all joined rooms in the Sticky Event extension response for Sliding Sync, -regardless of what subscription lists are requested by the client. +The server MUST include sticky events across all rooms that would be matched by at least one subscription list +(i.e. all rooms that the client is interested in), even if the room does not appear in top-N window for that +subscription list at this time. +Rooms that would not be matched by a list are not included, as this means the client is not interested +in those rooms. As with regular `/sync`, when a user joins a room, the server MUST include all unexpired sticky events for that room in their subsequent sync responses.