Support sending and receiving MSC4354 Sticky Event metadata. #19365
reivilibre merged 47 commits into develop
    class StickyEvent:
        QUERY_PARAM_NAME: Final = "org.matrix.msc4354.sticky_duration_ms"
        FIELD_NAME: Final = "msc4354_sticky"
The separation between StickyEventField and StickyEvent.FIELD_NAME is slightly mind bending.
Co-authored-by: Eric Eastwood <erice@element.io>
        expr_soft_failed = "COALESCE(((ej.internal_metadata::jsonb)->>'soft_failed')::boolean, FALSE)"
    else:
        expr_soft_failed = "COALESCE(ej.internal_metadata->>'soft_failed', FALSE)"
trial-olddeps -> sqlite3.OperationalError: near ">>": syntax error
trial-olddeps CI failing ❌, https://github.com/element-hq/synapse/actions/runs/21635340214/job/62359543300?pr=19365#step:10:7282
tests.storage.test_sticky_events.StickyEventsTestCase.test_get_updated_sticky_events
===============================================================================
[FAIL]
Traceback (most recent call last):
File "/home/runner/work/synapse/synapse/tests/storage/test_sticky_events.py", line 153, in test_get_updated_sticky_events_with_limit
updates = self.get_success(
File "/home/runner/work/synapse/synapse/tests/unittest.py", line 707, in get_success
return self.successResultOf(deferred)
File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/twisted/trial/_synctest.py", line 706, in successResultOf
self.fail(
twisted.trial.unittest.FailTest: Success result expected on <Deferred at 0x7f36aca5efe0 current result: None>, found failure result instead:
Traceback (most recent call last):
File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/twisted/internet/defer.py", line 517, in errback
self._startRunCallbacks(fail)
File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/twisted/internet/defer.py", line 580, in _startRunCallbacks
self._runCallbacks()
File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/twisted/internet/defer.py", line 662, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/twisted/internet/defer.py", line 1514, in gotResult
current_context.run(_inlineCallbacks, r, g, status)
--- <exception caught here> ---
File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/twisted/internet/defer.py", line 1443, in _inlineCallbacks
result = current_context.run(result.throwExceptionIntoGenerator, g)
File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/twisted/python/failure.py", line 500, in throwExceptionIntoGenerator
return g.throw(self.type, self.value, self.tb)
File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/synapse/storage/databases/main/sticky_events.py", line 144, in get_updated_sticky_events
return await self.db_pool.runInteraction(
File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/synapse/storage/database.py", line 1015, in runInteraction
return await delay_cancellation(_runInteraction())
File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/twisted/internet/defer.py", line 1443, in _inlineCallbacks
result = current_context.run(result.throwExceptionIntoGenerator, g)
File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/twisted/python/failure.py", line 500, in throwExceptionIntoGenerator
return g.throw(self.type, self.value, self.tb)
File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/synapse/storage/database.py", line 981, in _runInteraction
result: R = await self.runWithConnection(
File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/synapse/storage/database.py", line 1117, in runWithConnection
return await make_deferred_yieldable(
File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/twisted/internet/defer.py", line 662, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/home/runner/work/synapse/synapse/tests/server.py", line 814, in <lambda>
d.addCallback(lambda x: function(*args, **kwargs))
File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/twisted/enterprise/adbapi.py", line 293, in _runWithConnection
compat.reraise(excValue, excTraceback)
File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/twisted/python/deprecate.py", line 298, in deprecatedFunction
return function(*args, **kwargs)
File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/twisted/python/compat.py", line 403, in reraise
raise exception.with_traceback(traceback)
File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/twisted/enterprise/adbapi.py", line 284, in _runWithConnection
result = func(conn, *args, **kw)
File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/synapse/storage/database.py", line 1110, in inner_func
return func(db_conn, *args, **kwargs)
File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/synapse/storage/database.py", line 819, in new_transaction
r = func(cursor, *args, **kwargs)
File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/synapse/storage/databases/main/sticky_events.py", line 160, in _get_updated_sticky_events_txn
txn.execute(
File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/synapse/storage/database.py", line 458, in execute
self._do_execute(self.txn.execute, sql, parameters)
File "/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/synapse/storage/database.py", line 520, in _do_execute
return func(sql, *args, **kwargs)
sqlite3.OperationalError: near ">>": syntax error
    content: JsonDict = attr.Factory(dict)
    unsigned: JsonDict = attr.Factory(dict)
    sticky: StickyEventField | None = None
Probably fine as-is. We don't have any great options ⏩
Because it's not trivial to unstick our supported-SQLite-version situation, for now I'm going to merge this with a workaround for when the SQLite version is technically unsupported, such as on Ubuntu 22.04. I've opened #19452 to track this because I'd like to remove the workaround soon. This does mean that the feature is not being tested under olddeps, which currently uses Ubuntu 22.04. That is definitely not OK once the feature gets stabilised. Another approach we could take, if SQLite 3.37 support is a requirement, is to mount a custom function into SQLite to deal with the JSON.
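For illustration, the "custom function" idea could look roughly like the sketch below. It is only a sketch under assumptions: the function name `json_get_bool` and the simplified `event_json` table are hypothetical and not taken from this PR. It relies on the standard-library `sqlite3.Connection.create_function`, which works on SQLite versions that lack the `->>` operator (added in SQLite 3.38.0).

```python
import json
import sqlite3


def json_get_bool(raw_json, key):
    """Return 1 if `key` is JSON `true` in the object, otherwise 0.

    Hypothetical helper: missing keys, NULL input, non-object JSON and
    malformed JSON are all treated as false.
    """
    if raw_json is None:
        return 0
    try:
        value = json.loads(raw_json).get(key)
    except (ValueError, AttributeError):
        return 0
    return 1 if value is True else 0


conn = sqlite3.connect(":memory:")
# Register the Python function so SQL can call it with two arguments.
conn.create_function("json_get_bool", 2, json_get_bool)

# Simplified stand-in table, for demonstration only.
conn.execute("CREATE TABLE event_json (event_id TEXT, internal_metadata TEXT)")
conn.executemany(
    "INSERT INTO event_json VALUES (?, ?)",
    [("$a", '{"soft_failed": true}'), ("$b", "{}")],
)

rows = conn.execute(
    "SELECT event_id, json_get_bool(internal_metadata, 'soft_failed') "
    "FROM event_json ORDER BY event_id"
).fetchall()
print(rows)
```

The trade-off is that every row crosses into Python for the JSON parse, so this would be slower than a native operator, but it would let the same query shape run on old and new SQLite alike.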
This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [element-hq/synapse](https://github.com/element-hq/synapse) | minor | `v1.147.1` → `v1.148.0` |

---

### Release Notes

### [`v1.148.0`](https://github.com/element-hq/synapse/releases/tag/v1.148.0)

[Compare Source](element-hq/synapse@v1.147.1...v1.148.0)

### Synapse 1.148.0 (2026-02-24)

No significant changes since 1.148.0rc1.

### Synapse 1.148.0rc1 (2026-02-17)

#### Features

- Support sending and receiving [MSC4354 Sticky Event](matrix-org/matrix-spec-proposals#4354) metadata. ([#19365](element-hq/synapse#19365))

#### Improved Documentation

- Fix reference to the `experimental_features` section of the configuration manual documentation. ([#19435](element-hq/synapse#19435))

#### Deprecations and Removals

- Remove support for [MSC3244: Room version capabilities](matrix-org/matrix-spec-proposals#3244) as the MSC was rejected. ([#19429](element-hq/synapse#19429))

#### Internal Changes

- Add in-repo Complement tests so we can test Synapse specific behavior at an end-to-end level. ([#19406](element-hq/synapse#19406))
- Push Synapse docker images to Element OCI Registry. ([#19420](element-hq/synapse#19420))
- Allow configuring the Rust HTTP client to use HTTP/2 only. ([#19457](element-hq/synapse#19457))
- Correctly refuse to start if the Rust workspace config has changed and the Rust library has not been rebuilt. ([#19470](element-hq/synapse#19470))
Tested on NetBSD 9 amd64 by reporting pyproject.toml buglets upstream!

# Synapse 1.148.0 (2026-02-24)

## Features

- Support sending and receiving [MSC4354 Sticky Event](matrix-org/matrix-spec-proposals#4354) metadata. ([\#19365](element-hq/synapse#19365))

## Deprecations and Removals

- Remove support for [MSC3244: Room version capabilities](matrix-org/matrix-spec-proposals#3244) as the MSC was rejected. ([\#19429](element-hq/synapse#19429))
# Famedly Synapse Release v1.148.0_1

## Famedly additions for v1.148.0_1

- chore: fix the inconsistent stream error log message to contain the proper information instead of raising value error ([\#238](#238)) (itsoyou & FrenchGithubUser)

### Notes for Famedly:

- Support sending and receiving [https://github.com/matrix-org/matrix-spec-proposals/pull/4354](https://github.com/matrix-org/matrix-spec-proposals/pull/4354) metadata. [\#19365](element-hq/synapse#19365)
        self._instance_name
    )
    thread_subscriptions_key = self.store.get_max_thread_subscriptions_stream_id()
    sticky_events_key = self.store.get_max_sticky_events_stream_id()
🤔 Instead of a single stream_id int, should we be using something that holds onto multiple positions?
synapse/synapse/storage/databases/main/devices.py
Lines 250 to 251 in e1a429a
(we use this above for example: to_device_key = self.store.get_to_device_stream_token())
I assume we're trying to support multiple writers although we didn't document anything in docs/workers.md (I assume because this is experimental)
(same with the StickyEventsStream in synapse/replication/tcp/streams/_base.py)
(same with thread_subscriptions_key)
Spawning from spotting this while looking at yet another new stream, #19558 (comment)
right I did consider this (for thread subscriptions honestly; it did slip my mind here), but (particularly for thread subscriptions) it seemed like an overcomplication for now.
My understanding is that it's OK to use a single int for following a MultiWriterStreamToken stream, the only 'problem' is that you have to wait for all writers to persist before you can advance — or (rewording because that sounds awkward), if a writer worker allocates a stream ID but takes a while to persist the row using it, other workers in the meantime might allocate and use stream IDs and their new stream rows will be 'invisible' to a reader until that slow worker finishes persisting.
(edit: https://element-hq.github.io/synapse/latest/development/synapse_architecture/streams.html#multi-writer-streams is decent on this topic
edit 2: https://element-hq.github.io/synapse/latest/development/synapse_architecture/streams.html#current-stream-id is even more decent on this topic)
For an example, despite the events stream allowing multiple writers, the stream reader for federation senders (see federation_stream_position table) has a single int position. The repercussion is that one slow event persister can cause us to stall sending out events over federation. (OK that doesn't sound too convincing, this is probably a case where it'd make sense to use the sharded tokens :))
Whilst I don't think thread subscriptions need to worry about this — they should persist very quickly — it's possible that the events stream is maybe the most important place to use this, because event persistence can invoke state resolution which is slow.
Since the sticky events stream is also written during event persistence, I suppose that would be a fair argument that we should do that here too, at least for sync, to be consistent with how we do timeline of events.
So all in all, I guess: not fatal, but well spotted. I will do the Proper Thing™ at least for sliding sync since that seems important going forwards. I am not sure if it is worth doing for oldschool sync (just on the cost/benefit ratio). Will write a note on the experimental feature tracking issue though.
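To make the stalling effect concrete, here is a toy model, not Synapse code. It assumes a reader tracking a single integer can only advance to the minimum of the writers' persisted-up-to positions; `StreamRow`, `single_int_position` and `visible_rows` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamRow:
    stream_id: int
    writer: str


def single_int_position(persisted_up_to):
    """A single-int reader may only advance to the slowest writer's position."""
    return min(persisted_up_to.values())


def visible_rows(rows, position):
    """Rows a single-int reader at `position` is allowed to see."""
    return sorted(r.stream_id for r in rows if r.stream_id <= position)


# Writer A allocated stream ID 5 but has not persisted it yet.
# Writer B meanwhile allocated and persisted 6 and 7.
rows = [StreamRow(6, "B"), StreamRow(7, "B")]
stalled_at = single_int_position({"A": 4, "B": 7})
seen_while_stalled = visible_rows(rows, stalled_at)

# A finally persists 5 (and, in this toy example, also 8).
rows += [StreamRow(5, "A"), StreamRow(8, "A")]
caught_up_at = single_int_position({"A": 8, "B": 7})
seen_after = visible_rows(rows, caught_up_at)

print(stalled_at, seen_while_stalled)  # B's rows are invisible for now
print(caught_up_at, seen_after)
```

While A is slow, B's rows 6 and 7 exist but stay hidden from the reader. That is the same trade-off as the federation sender example above.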
What's overkill about it?
Supporting multiple writers is the hard part and the rest is just using the right method to craft the token 🤔
Thanks for the extra docs - 👍 I understand that a single stream_id for a multi-writer stream is ok but could result in more waiting depending on the workers view of the world. The important part is that a worker actually waits for the token, whatever it is, when reading data (c.f. #19647)
> What's overkill about it?
With an integer stream token on the reader, it's easy enough to 'get updates between A and B'. With a vector stream token, it gets a little more complicated. (Mostly this is just writing some code, but debugging it later can also be more confusing)
For completeness, although this risk doesn't seem to apply here, you also need to account for the possibility of different workers experiencing the stream rows (facts) in different orders, because the vector stream token isn't guaranteed to advance in the same way on each worker (afaik, but could be wrong)
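As a rough illustration of why "updates between A and B" gets more involved with a vector token, here is a sketch with hypothetical names (`VectorToken`, `updates_between`). It assumes a token holds a base position plus per-writer overrides, and that a row counts as covered when its stream ID is at or below the position recorded for its writer.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class VectorToken:
    # Base position: every writer is assumed to have persisted up to here.
    stream: int
    # Per-writer positions for writers that are ahead of the base.
    instance_map: dict = field(default_factory=dict)

    def position_for(self, writer):
        return self.instance_map.get(writer, self.stream)

    def covers(self, stream_id, writer):
        return stream_id <= self.position_for(writer)


def updates_between(rows, a, b):
    """Rows covered by token `b` that were not already covered by token `a`."""
    return [
        (stream_id, writer)
        for stream_id, writer in rows
        if b.covers(stream_id, writer) and not a.covers(stream_id, writer)
    ]


rows = [(5, "A"), (6, "B"), (7, "B"), (8, "A")]

# Integer tokens: a plain range check is enough.
int_updates = [row for row in rows if 4 < row[0] <= 8]

# Vector tokens: the reader at `a` has already seen B's rows up to 7.
a = VectorToken(stream=4, instance_map={"B": 7})
b = VectorToken(stream=8)
vector_updates = updates_between(rows, a, b)

print(int_updates)
print(vector_updates)
```

With integer tokens the query is a single range comparison. With vector tokens, each row has to be checked against the position for its own writer, which is the extra code (and debugging surface) mentioned above.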
Follows: #19365
Part of: MSC4354 Sticky Events (experimental feature #19409)

This PR introduces a `spam_checker_spammy` flag, analogous to `policy_server_spammy`, as an explicit flag that an event was decided to be spammy by a spam-checker module.

The original Sticky Events PR (#18968) just reused `policy_server_spammy`, but it didn't sit right with me because we (at least appear to be experimenting with features that) allow users to opt-in to seeing `policy_server_spammy` events (presumably for moderation purposes). Keeping these flags separate felt best, therefore.

As for why we need this flag: soon soft-failed status won't be permanent, at least for sticky events. The spam checker modules currently work by making events soft-failed. We want to prevent spammy events from getting reconsidered/un-soft-failed, so it seems like we need a flag to track spam-checker spamminess *separately* from soft-failed.

Should be commit-by-commit friendly, but is also small.

---------

Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
Part of: MSC4354 whose experimental feature tracking issue is #19409
Follows: #19340 (a necessary bugfix for `/event/` to set this metadata)
Partially supersedes: #18968
This PR implements the first batch of work to support MSC4354 Sticky Events.
Sticky events are events that have been configured with a finite 'stickiness' duration,
capped to 1 hour per current MSC draft.
Whilst an event is sticky, we provide stronger delivery guarantees for the event, both to
our clients and to remote homeservers, essentially making it reliable delivery as long as we
have a functional connection to the client/server and until the stickiness expires.
This PR merely supports creating sticky events and receiving the sticky TTL metadata in clients.
It is not suitable for trialling sticky events since none of the other semantics are implemented.
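As a hedged sketch of what a remaining-TTL calculation might look like: the function name and the choice to measure expiry from `origin_server_ts` are assumptions made for illustration, and the MSC and this PR's code are authoritative on the exact rule. Only the 1 hour cap comes from the description above.

```python
# Cap from the description above: 1 hour per the current MSC draft.
MAX_STICKY_DURATION_MS = 60 * 60 * 1000


def sticky_ttl_ms(origin_server_ts, sticky_duration_ms, now_ms):
    """Remaining stickiness in milliseconds, never negative.

    Hypothetical helper. Assumes stickiness starts at `origin_server_ts`
    and that the requested duration is clamped to [0, 1 hour].
    """
    capped = min(max(sticky_duration_ms, 0), MAX_STICKY_DURATION_MS)
    expires_at = origin_server_ts + capped
    return max(expires_at - now_ms, 0)


# 5 s of stickiness requested at t=1000 ms, checked 2 s later.
print(sticky_ttl_ms(origin_server_ts=1_000, sticky_duration_ms=5_000, now_ms=3_000))
# A 2 hour request is capped to 1 hour.
print(sticky_ttl_ms(origin_server_ts=0, sticky_duration_ms=7_200_000, now_ms=0))
```

An event whose expiry is already in the past yields a TTL of 0, meaning it is no longer sticky.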
The current plan is to follow this PR up with more PRs, roughly parcelled up as follows:
Add MSC4354 experimental feature flag
Expose MSC4354 enablement on /versions
Add constants for sticky events
Add sticky_events table
Add sticky events store and stream
EventBase: add the concept of sticky_duration
EventBuilder: allow building events with sticky event fields
store method: insert_sticky_events_txn
When persisting currently-sticky events, add to sticky event stream
Allow clients to send sticky events
Including delayed events
Add test helper for sending sticky events
Expose the sticky event TTL to clients
Add a test for sticky TTL calculation and exposure to clients