Support sending and receiving MSC4354 Sticky Event metadata.#19365

Merged
reivilibre merged 47 commits into develop from
rei/sticky_events1
Feb 11, 2026

@reivilibre reivilibre commented Jan 9, 2026

Part of: MSC4354 whose experimental feature tracking issue is #19409

Follows: #19340 (a necessary bugfix for /event/ to set this metadata)

Partially supersedes: #18968

This PR implements the first batch of work to support MSC4354 Sticky Events.

Sticky events are events that have been configured with a finite 'stickiness' duration,
capped to 1 hour per current MSC draft.

Whilst an event is sticky, we provide stronger delivery guarantees for the event, both to
our clients and to remote homeservers, essentially making it reliable delivery as long as we
have a functional connection to the client/server and until the stickiness expires.

This PR merely supports creating sticky events and receiving the sticky TTL metadata in clients.
It is not suitable for trialling sticky events since none of the other semantics are implemented.
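As a rough illustration of the TTL metadata described above, the sketch below computes how long an event remains sticky. It assumes the stickiness window is measured from the event's `origin_server_ts` and applies the 1 hour cap mentioned in the description. The function and constant names are hypothetical and not taken from the Synapse code, and the MSC may specify further clamping (for example against the time the event was received).

```python
# Hypothetical names; only the 1 hour cap comes from the PR description.
MAX_STICKY_DURATION_MS = 60 * 60 * 1000


def sticky_ttl_ms(origin_server_ts: int, sticky_duration_ms: int, now_ms: int) -> int:
    """Return the remaining stickiness in milliseconds (0 once expired)."""
    # Cap the requested duration at the maximum allowed by the draft.
    duration = min(sticky_duration_ms, MAX_STICKY_DURATION_MS)
    # Assumption: the window starts at the event's origin_server_ts.
    expires_at = origin_server_ts + duration
    return max(0, expires_at - now_ms)
```

An event that has already passed its expiry yields a TTL of 0, which a client could treat as "no longer sticky".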

The current plan is to follow this PR up with more PRs, roughly parcelled up as follows:

  • Implement the sliding sync extension specified in MSC4354, to proactively notify
  • Implement the oldschool sync support, specified in MSC4354, to do the same but for clients still using oldschool sync.
  • Notice new servers joining rooms and proactively tell them about sticky events.
  • Add special federation catch-up support for sticky events, so that (as long as we have a connection) they don't get 'dropped' in the gaps between federation /send requests.
  • Re-evaluate soft-failed sticky events when we think that might be possible.

  1. Add MSC4354 experimental feature flag

  2. Expose MSC4354 enablement on /versions

  3. Add constants for sticky events

  4. Add sticky_events table

  5. Add sticky events store and stream

  6. EventBase: add the concept of sticky_duration

  7. EventBuilder: allow building events with sticky event fields

  8. store method: insert_sticky_events_txn

  9. When persisting currently-sticky events, add to sticky event stream

  10. Allow clients to send sticky events
    Including delayed events

  11. Add test helper for sending sticky events

  12. Expose the sticky event TTL to clients

  13. Add a test for sticky TTL calculation and exposure to clients

@reivilibre reivilibre requested a review from a team as a code owner January 9, 2026 16:36
@reivilibre reivilibre changed the title Support sending and receiving [MSC4354 Sticky Event](https://github.com/matrix-org/matrix-spec-proposals/pull/4354) metadata. Support sending and receiving MSC4354 Sticky Event metadata. Jan 9, 2026
@reivilibre reivilibre force-pushed the rei/sticky_events1 branch 2 times, most recently from e49d4bd to 78ee6e8 Compare January 16, 2026 09:00
Comment thread synapse/api/constants.py Outdated
Comment thread synapse/api/constants.py Outdated
Comment thread synapse/api/constants.py
Comment thread synapse/api/constants.py
Comment thread synapse/api/constants.py Outdated

class StickyEvent:
    QUERY_PARAM_NAME: Final = "org.matrix.msc4354.sticky_duration_ms"
    FIELD_NAME: Final = "msc4354_sticky"

The separation between StickyEventField and StickyEvent.FIELD_NAME is slightly mind bending.
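For context, here is roughly how the `QUERY_PARAM_NAME` value quoted in this thread might show up in a client request. This is a sketch under assumptions: it assumes the parameter is attached to the standard client-server send endpoint, and the helper name `sticky_send_url` and the example homeserver URL are made up for illustration.

```python
from urllib.parse import quote, urlencode

# Value copied from the constants quoted in this review thread.
QUERY_PARAM_NAME = "org.matrix.msc4354.sticky_duration_ms"


def sticky_send_url(
    base_url: str, room_id: str, event_type: str, txn_id: str, sticky_duration_ms: int
) -> str:
    """Build a send URL carrying the sticky duration (hypothetical helper)."""
    # Path components are percent-encoded individually; safe="" also encodes "/".
    path = "/_matrix/client/v3/rooms/{}/send/{}/{}".format(
        quote(room_id, safe=""), quote(event_type, safe=""), quote(txn_id, safe="")
    )
    query = urlencode({QUERY_PARAM_NAME: sticky_duration_ms})
    return f"{base_url.rstrip('/')}{path}?{query}"
```

The request body and authentication are unchanged from a normal send; only the query string differs in this sketch.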

Comment thread synapse/storage/databases/main/events.py
Comment thread synapse/handlers/delayed_events.py Outdated
Comment thread tests/rest/client/test_sticky_events.py
Comment thread synapse/storage/databases/main/delayed_events.py
Comment thread tests/rest/client/utils.py Outdated
@reivilibre reivilibre mentioned this pull request Feb 3, 2026
3 tasks
Comment on lines +156 to +158
expr_soft_failed = "COALESCE(((ej.internal_metadata::jsonb)->>'soft_failed')::boolean, FALSE)"
else:
expr_soft_failed = "COALESCE(ej.internal_metadata->>'soft_failed', FALSE)"
@MadLittleMods MadLittleMods Feb 3, 2026

trial-olddeps -> sqlite3.OperationalError: near ">>": syntax error

trial-olddeps CI failing ❌, https://github.com/element-hq/synapse/actions/runs/21635340214/job/62359543300?pr=19365#step:10:7282

tests.storage.test_sticky_events.StickyEventsTestCase.test_get_updated_sticky_events
===============================================================================
[FAIL]
Traceback (most recent call last):
  File "/home/runner/work/synapse/synapse/tests/storage/test_sticky_events.py", line 153, in test_get_updated_sticky_events_with_limit
    updates = self.get_success(
  File "/home/runner/work/synapse/synapse/tests/unittest.py", line 707, in get_success
    return self.successResultOf(deferred)
  File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/twisted/trial/_synctest.py", line 706, in successResultOf
    self.fail(
twisted.trial.unittest.FailTest: Success result expected on <Deferred at 0x7f36aca5efe0 current result: None>, found failure result instead:
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/twisted/internet/defer.py", line 517, in errback
    self._startRunCallbacks(fail)
  File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/twisted/internet/defer.py", line 580, in _startRunCallbacks
    self._runCallbacks()
  File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/twisted/internet/defer.py", line 662, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/twisted/internet/defer.py", line 1514, in gotResult
    current_context.run(_inlineCallbacks, r, g, status)
--- <exception caught here> ---
  File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/twisted/internet/defer.py", line 1443, in _inlineCallbacks
    result = current_context.run(result.throwExceptionIntoGenerator, g)
  File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/twisted/python/failure.py", line 500, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/synapse/storage/databases/main/sticky_events.py", line 144, in get_updated_sticky_events
    return await self.db_pool.runInteraction(
  File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/synapse/storage/database.py", line 1015, in runInteraction
    return await delay_cancellation(_runInteraction())
  File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/twisted/internet/defer.py", line 1443, in _inlineCallbacks
    result = current_context.run(result.throwExceptionIntoGenerator, g)
  File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/twisted/python/failure.py", line 500, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/synapse/storage/database.py", line 981, in _runInteraction
    result: R = await self.runWithConnection(
  File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/synapse/storage/database.py", line 1117, in runWithConnection
    return await make_deferred_yieldable(
  File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/twisted/internet/defer.py", line 662, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/home/runner/work/synapse/synapse/tests/server.py", line 814, in <lambda>
    d.addCallback(lambda x: function(*args, **kwargs))
  File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/twisted/enterprise/adbapi.py", line 293, in _runWithConnection
    compat.reraise(excValue, excTraceback)
  File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/twisted/python/deprecate.py", line 298, in deprecatedFunction
    return function(*args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/twisted/python/compat.py", line 403, in reraise
    raise exception.with_traceback(traceback)
  File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/twisted/enterprise/adbapi.py", line 284, in _runWithConnection
    result = func(conn, *args, **kw)
  File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/synapse/storage/database.py", line 1110, in inner_func
    return func(db_conn, *args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/synapse/storage/database.py", line 819, in new_transaction
    r = func(cursor, *args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/synapse/storage/databases/main/sticky_events.py", line 160, in _get_updated_sticky_events_txn
    txn.execute(
  File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/synapse/storage/database.py", line 458, in execute
    self._do_execute(self.txn.execute, sql, parameters)
  File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/synapse/storage/database.py", line 520, in _do_execute
    return func(sql, *args, **kwargs)
sqlite3.OperationalError: near ">>": syntax error
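For readers hitting the same error: the `->>` JSON operator only exists in SQLite 3.38.0 and later, so older builds (such as the 3.37 series shipped with Ubuntu 22.04) reject it at parse time. The sketch below shows one possible way to pick an expression by version, falling back to `json_extract`. It is an illustration only, not the workaround that was merged, and `soft_failed_expr` and `demo` are hypothetical names. It assumes the SQLite build includes the JSON functions.

```python
import sqlite3
from typing import List, Tuple


def soft_failed_expr(column: str = "internal_metadata") -> str:
    """Return a SQL expression yielding 1/0 for the soft_failed flag."""
    if sqlite3.sqlite_version_info >= (3, 38, 0):
        # The ->> operator is available from SQLite 3.38.0 onwards.
        return f"COALESCE({column} ->> '$.soft_failed', 0)"
    # Older SQLite: json_extract gives the same result for this path.
    return f"COALESCE(json_extract({column}, '$.soft_failed'), 0)"


def demo() -> List[Tuple[str, int]]:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE event_json (event_id TEXT, internal_metadata TEXT)")
    conn.executemany(
        "INSERT INTO event_json VALUES (?, ?)",
        [("$a", '{"soft_failed": true}'), ("$b", "{}")],
    )
    rows = conn.execute(
        f"SELECT event_id, {soft_failed_expr()} FROM event_json ORDER BY event_id"
    ).fetchall()
    conn.close()
    return rows
```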

Related discussion in #synapse-dev:matrix.org

Comment thread synapse/storage/databases/main/events.py
Comment thread synapse/storage/databases/main/sticky_events.py Outdated
Comment thread synapse/storage/schema/main/delta/93/01_sticky_events.sql Outdated
Comment thread tests/rest/client/utils.py
Comment thread tests/rest/client/test_sticky_events.py
Comment thread tests/storage/test_sticky_events.py
Comment thread synapse/events/builder.py

content: JsonDict = attr.Factory(dict)
unsigned: JsonDict = attr.Factory(dict)
sticky: StickyEventField | None = None

Probably fine as-is. We don't have any great options ⏩

@reivilibre
Contributor Author

Because it's not trivial to unstick our SQLite supported-version situation, for now I'm going to merge this with a workaround for when the SQLite version is technically unsupported, such as on Ubuntu 22.04. I've opened #19452 to track this because I'd like to remove the workaround soon.

This does mean that the feature is not getting tested under olddeps, which currently uses Ubuntu 22.04. This will definitely not be OK once the feature gets stabilised.

Another approach we could take, if SQLite 3.37 support is a requirement, is to mount a custom function into SQLite to deal with the JSON.
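To make the "custom function" idea concrete, here is a minimal sketch using the standard library's `sqlite3.Connection.create_function`. The SQL function name `is_soft_failed` and both Python helpers are hypothetical, and this is not what the PR actually merged. It just shows that the JSON handling can live in Python when the SQLite version lacks the needed operators.

```python
import json
import sqlite3
from typing import Optional


def _is_soft_failed(internal_metadata: Optional[str]) -> int:
    """Parse the JSON in Python and return 1 if soft_failed is truthy, else 0."""
    if internal_metadata is None:
        return 0
    try:
        return int(bool(json.loads(internal_metadata).get("soft_failed", False)))
    except (ValueError, AttributeError):
        # Malformed JSON or a non-object value: treat as not soft-failed.
        return 0


def register_soft_failed_function(conn: sqlite3.Connection) -> None:
    # Exposes the Python function to SQL as a one-argument function.
    conn.create_function("is_soft_failed", 1, _is_soft_failed)
```

After registration, a query can call `is_soft_failed(ej.internal_metadata)` regardless of which JSON operators the underlying SQLite supports, at the cost of a Python call per row.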

@reivilibre reivilibre merged commit 52fb6e9 into develop Feb 11, 2026
78 of 80 checks passed
@reivilibre reivilibre deleted the rei/sticky_events1 branch February 11, 2026 12:41
alexlebens pushed a commit to alexlebens/infrastructure that referenced this pull request Feb 24, 2026
This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [element-hq/synapse](https://github.com/element-hq/synapse) | minor | `v1.147.1` → `v1.148.0` |

---

### Release Notes

<details>
<summary>element-hq/synapse (element-hq/synapse)</summary>

### [`v1.148.0`](https://github.com/element-hq/synapse/releases/tag/v1.148.0)

[Compare Source](element-hq/synapse@v1.147.1...v1.148.0)

### Synapse 1.148.0 (2026-02-24)

No significant changes since 1.148.0rc1.

### Synapse 1.148.0rc1 (2026-02-17)

#### Features

- Support sending and receiving [MSC4354 Sticky Event](matrix-org/matrix-spec-proposals#4354) metadata. ([#19365](element-hq/synapse#19365))

#### Improved Documentation

- Fix reference to the `experimental_features` section of the configuration manual documentation. ([#19435](element-hq/synapse#19435))

#### Deprecations and Removals

- Remove support for [MSC3244: Room version capabilities](matrix-org/matrix-spec-proposals#3244) as the MSC was rejected. ([#19429](element-hq/synapse#19429))

#### Internal Changes

- Add in-repo Complement tests so we can test Synapse specific behavior at an end-to-end level. ([#19406](element-hq/synapse#19406))
- Push Synapse docker images to Element OCI Registry. ([#19420](element-hq/synapse#19420))
- Allow configuring the Rust HTTP client to use HTTP/2 only. ([#19457](element-hq/synapse#19457))
- Correctly refuse to start if the Rust workspace config has changed and the Rust library has not been rebuilt. ([#19470](element-hq/synapse#19470))

</details>

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).

Reviewed-on: https://gitea.alexlebens.dev/alexlebens/infrastructure/pulls/4203
Co-authored-by: Renovate Bot <renovate-bot@alexlebens.net>
Co-committed-by: Renovate Bot <renovate-bot@alexlebens.net>
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Feb 26, 2026
Tested on NetBSD 9 amd64 by reporting pyproject.toml buglets upstream!

# Synapse 1.148.0 (2026-02-24)

## Features

- Support sending and receiving [MSC4354 Sticky Event](matrix-org/matrix-spec-proposals#4354) metadata. ([\#19365](element-hq/synapse#19365))

## Deprecations and Removals

- Remove support for [MSC3244: Room version capabilities](matrix-org/matrix-spec-proposals#3244) as the MSC was rejected. ([\#19429](element-hq/synapse#19429))
github-merge-queue bot pushed a commit to famedly/synapse that referenced this pull request Mar 2, 2026
# Famedly Synapse Release v1.148.0_1

## Famedly additions for v1.148.0_1

- chore: fix the inconsistent stream error log message to contain the
proper information instead of raising value error
([\#238](#238)) (itsoyou &
FrenchGithubUser)

### Notes for Famedly:

- Support sending and receiving
[https://github.com/matrix-org/matrix-spec-proposals/pull/4354](https://github.com/matrix-org/matrix-spec-proposals/pull/4354)
metadata. [\#19365](element-hq/synapse#19365)
reivilibre added a commit that referenced this pull request Mar 10, 2026
Follows: #19365

Part of: MSC4354 whose experimental feature tracking issue is #19409

Partially supersedes: #18968

---------

Signed-off-by: Olivier 'reivilibre' <oliverw@matrix.org>
Comment thread synapse/streams/events.py
self._instance_name
)
thread_subscriptions_key = self.store.get_max_thread_subscriptions_stream_id()
sticky_events_key = self.store.get_max_sticky_events_stream_id()
@MadLittleMods MadLittleMods Mar 31, 2026

🤔 Instead of a single stream_id int, should we be using something that holds onto multiple positions?

def get_device_stream_token(self) -> MultiWriterStreamToken:
    return MultiWriterStreamToken.from_generator(self._device_list_id_gen)

(we use this above for example: to_device_key = self.store.get_to_device_stream_token())

I assume we're trying to support multiple writers although we didn't document anything in docs/workers.md (I assume because this is experimental)

(same with the StickyEventsStream in synapse/replication/tcp/streams/_base.py)

(same with thread_subscriptions_key)


Spawning from spotting this while looking at yet another new stream, #19558 (comment)

Contributor Author

@reivilibre reivilibre Apr 2, 2026

right I did consider this (for thread subscriptions honestly; it did slip my mind here), but (particularly for thread subscriptions) it seemed like an overcomplication for now.

My understanding is that it's OK to use a single int for following a MultiWriterStreamToken stream, the only 'problem' is that you have to wait for all writers to persist before you can advance — or (rewording because that sounds awkward), if a writer worker allocates a stream ID but takes a while to persist the row using it, other workers in the meantime might allocate and use stream IDs and their new stream rows will be 'invisible' to a reader until that slow worker finishes persisting.

(edit: https://element-hq.github.io/synapse/latest/development/synapse_architecture/streams.html#multi-writer-streams is decent on this topic
edit 2: https://element-hq.github.io/synapse/latest/development/synapse_architecture/streams.html#current-stream-id is even more decent on this topic)

For an example, despite the events stream allowing multiple writers, the stream reader for federation senders (see federation_stream_position table) has a single int position. The repercussion is that one slow event persister can cause us to stall sending out events over federation. (OK that doesn't sound too convincing, this is probably a case where it'd make sense to use the sharded tokens :))

Whilst I don't think thread subscriptions need to worry about this — they should persist very quickly — it's possible that the events stream is maybe the most important place to use this, because event persistence can invoke state resolution which is slow.
Since the sticky events stream is also written during event persistence, I suppose that would be a fair argument that we should do that here too, at least for sync, to be consistent with how we do timeline of events.

So all in all, I guess: not fatal, but well spotted. I will do the Proper Thing™ at least for sliding sync since that seems important going forwards. I am not sure if it is worth doing for oldschool sync (just on the cost/benefit ratio). Will write a note on the experimental feature tracking issue though.
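To illustrate the "invisible rows" effect described in this comment, here is a toy model of a single integer position on a multi-writer stream. It is a simplification for illustration, not Synapse's ID generator: the position is taken to be the largest ID such that every allocated ID at or below it has been persisted. The function name is hypothetical.

```python
from typing import Set


def current_stream_id(allocated: Set[int], persisted: Set[int]) -> int:
    """Largest stream ID with no unpersisted (in-flight) ID at or below it."""
    in_flight = allocated - persisted
    if in_flight:
        # A reader cannot advance past the oldest ID still being persisted.
        return min(in_flight) - 1
    return max(persisted, default=0)
```

With IDs 1 to 5 allocated and ID 3 still in flight on a slow writer, the position stays at 2, so rows 4 and 5 are hidden from an integer-token reader until row 3 lands.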

@MadLittleMods MadLittleMods Apr 2, 2026

What's overkill about it?

Supporting multiple writers is the hard part and the rest is just using the right method to craft the token 🤔

Thanks for the extra docs - 👍 I understand that a single stream_id for a multi-writer stream is ok but could result in more waiting depending on the worker's view of the world. The important part is that a worker actually waits for the token, whatever it is, when reading data (c.f. #19647)

Contributor Author

> What's overkill about it?

With an integer stream token on the reader, it's easy enough to 'get updates between A and B'. With a vector stream token, it gets a little more complicated. (Mostly this is just writing some code, but debugging it later can also be more confusing)

For completeness, although this risk doesn't seem to apply here, you also need to account for the possibility of different workers experiencing the stream rows (facts) in different orders, because the vector stream token isn't guaranteed to advance in the same way on each worker (afaik, but could be wrong)
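A small sketch of the difference being described: with integer tokens, "updates between A and B" is a single range check, whereas with a vector token the bounds depend on which writer produced each row. The `VectorToken` class here is a simplified stand-in written for this example and is not Synapse's `MultiWriterStreamToken`. Rows are modelled as `(stream_id, writer_name)` tuples.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Row = Tuple[int, str]  # (stream_id, writer instance name)


@dataclass(frozen=True)
class VectorToken:
    """Simplified stand-in: a base position plus per-writer overrides."""
    stream: int
    instance_map: Dict[str, int] = field(default_factory=dict)

    def position_for(self, writer: str) -> int:
        # Writers without an explicit entry are assumed to be at the base position.
        return self.instance_map.get(writer, self.stream)


def rows_between_int(rows: List[Row], a: int, b: int) -> List[Row]:
    return [r for r in rows if a < r[0] <= b]


def rows_between_vector(rows: List[Row], a: VectorToken, b: VectorToken) -> List[Row]:
    # Each row is compared against the bounds for its own writer.
    return [r for r in rows if a.position_for(r[1]) < r[0] <= b.position_for(r[1])]
```

In the vector case, a row from a fast writer can be included while a lower-numbered row from a slow writer is not, which is the source of the extra debugging complexity mentioned above.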

reivilibre added a commit that referenced this pull request Apr 15, 2026
Follows: #19365

Part of: MSC4354 Sticky Events (experimental feature #19409)

This PR introduces a `spam_checker_spammy` flag, analogous to
`policy_server_spammy`, as an explicit flag
that an event was decided to be spammy by a spam-checker module.

The original Sticky Events PR (#18968) just reused
`policy_server_spammy`, but it didn't sit right with me
because we (at least appear to be experimenting with features that)
allow users to opt-in to seeing
`policy_server_spammy` events (presumably for moderation purposes).

Keeping these flags separate felt best, therefore.

As for why we need this flag: soon soft-failed status won't be
permanent, at least for sticky events.
The spam checker modules currently work by making events soft-failed.
We want to prevent spammy events from getting
reconsidered/un-soft-failed, so it seems like we need
a flag to track spam-checker spamminess *separately* from soft-failed.
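
The intent of keeping the flags separate can be sketched as a simple predicate. This is only an illustration of the reasoning in this message: the function name and the exact set of conditions are assumptions, not the logic in the Synapse code.

```python
def may_reconsider_soft_failure(
    is_currently_sticky: bool, soft_failed: bool, spam_checker_spammy: bool
) -> bool:
    """Hypothetical check: should a soft-failed event be re-evaluated?"""
    # Events flagged by a spam-checker module stay soft-failed, even if sticky.
    return is_currently_sticky and soft_failed and not spam_checker_spammy
```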

Should be commit-by-commit friendly, but is also small.

---------

Signed-off-by: Olivier 'reivilibre' <oliverw@matrix.org>