Commit Graph

289 Commits

Author SHA1 Message Date
Richard van der Hoff
c897ac63e9
Ensure that incoming to-device messages are not dropped (#17127)
... when workers are unreachable, etc.

Fixes https://github.com/element-hq/synapse/issues/17117.

The general principle is just to make sure that we propagate any
exceptions to the JsonResource, so that we return an error code to the
sending server. That means that the sending server no longer considers
the message safely sent, so it will retry later.

In the issue, Erik mentions that an alternative solution would be to
persist the to-device messages into a table so that they can be retried.
This might be an improvement for performance, but even if we did that,
we still need this mechanism, since we might be unable to reach the
database. So, if we want to do that, it can be a later follow-up.

---------

Co-authored-by: Erik Johnston <erik@matrix.org>
2024-04-29 14:11:00 +01:00
dependabot[bot]
1e68b56a62
Bump black from 23.10.1 to 24.2.0 (#16936) 2024-03-13 16:46:44 +00:00
Erik Johnston
23740eaa3d
Correctly mention previous copyright (#16820)
During the migration the automated script to update the copyright
headers accidentally got rid of some of the existing copyright lines.
Reinstate them.
2024-01-23 11:26:48 +00:00
Patrick Cloke
8e1e62c9e0 Update license headers 2023-11-21 15:29:58 -05:00
Patrick Cloke
70b503f144 Fix import ordering issue introduced in 7a3a55ac98. 2023-10-31 10:32:35 -04:00
Patrick Cloke
7a3a55ac98
Merge pull request from GHSA-mp92-3jfm-3575 2023-10-31 13:58:30 +00:00
Patrick Cloke
85e5f2dc25
Add a new module API to update user presence state. (#16544)
This adds a module API which allows a module to update a user's
presence state/status message. This is useful for controlling presence
from an external system.

To fully control presence from the module the presence.enabled config
parameter gains a new state of "untracked" which disables internal tracking
of presence changes via user actions, etc. Only updates from the module will
be persisted and sent down sync properly).
2023-10-26 15:11:24 -04:00
Patrick Cloke
fc31b495b3
Stop sending incorrect knock_state_events. (#16403)
Synapse was incorrectly implemented with a knock_state_events
property on some APIs (instead of knock_room_state). This was
correct in Synapse 1.70.0, but *both* fields were sent to also be
compatible with Synapse versions expecting the wrong field.

Enough time has passed that only the correct field needs to be
included/handled.
2023-10-06 07:27:35 -04:00
Patrick Cloke
f84da3c32e
Add a cache around server ACL checking (#16360)
* Pre-compiles the server ACLs onto an object per room and
  invalidates them when new events come in.
* Converts the server ACL checking into Rust.
2023-09-26 11:57:50 -04:00
Mathieu Velten
8c3bcea2da
Rename pagination&purge locks and add comments explaining them (#16112) 2023-08-16 16:19:54 +02:00
Erik Johnston
ae55cc1e6b
Add ability to wait for locks and add locks to purge history / room deletion (#15791)
c.f. #13476
2023-07-31 10:58:03 +01:00
Patrick Cloke
6d81aec09f
Support room version 11 (#15912)
And fix a bug in the implementation of the updated redaction
format (MSC2174) where the top-level redacts field was not
properly added for backwards-compatibility.
2023-07-18 08:44:59 -04:00
Mathieu Velten
59ec4a0dc1
Fix MSC3983 support: only one OTK per device was returned through federation (#15770) 2023-06-13 19:51:47 +02:00
Eric Eastwood
4e6390cb10
Update error to more plainly explain we can only authorize our own events (#15725) 2023-06-06 16:26:12 -05:00
Patrick Cloke
c01343de43
Add stricter mypy options (#15694)
Enable warn_unused_configs, strict_concatenate, disallow_subclassing_any,
and disallow_incomplete_defs.
2023-05-31 07:18:29 -04:00
Sean Quah
5d8c659373
Remove unused FederationServer.__str__ override (#15690)
Signed-off-by: Sean Quah <seanq@matrix.org>
2023-05-30 14:37:39 +01:00
Patrick Cloke
07771fa487
Remove experimental configuration flags & unstable values for faster joins (#15625)
Synapse will no longer send (or respond to) the unstable flags
for faster joins. These were only available behind a configuration
flag and handled in parallel with the stable flags.
2023-05-19 07:23:09 -04:00
Sean Quah
e46d5f3586
Factor out an is_mine_server_name method (#15542)
Add an `is_mine_server_name` method, similar to `is_mine_id`.

Ideally we would use this consistently, instead of sometimes comparing
against `hs.hostname` and other times reaching into
`hs.config.server.server_name`.

Also fix a bug in the tests where `hs.hostname` would sometimes differ
from `hs.config.server.server_name`.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-05-05 15:06:22 +01:00
Patrick Cloke
57aeeb308b
Add support for claiming multiple OTKs at once. (#15468)
MSC3983 provides a way to request multiple OTKs at once from appservices,
this extends this concept to the Client-Server API.

Note that this will likely be spit out into a separate MSC, but is currently part of
MSC3983.
2023-04-27 12:57:46 -04:00
Patrick Cloke
8e9739449d
Add unstable /keys/claim endpoint which always returns fallback keys. (#15462)
It can be useful to always return the fallback key when attempting to
claim keys. This adds an unstable endpoint for `/keys/claim` which
always returns fallback keys in addition to one-time-keys.

The fallback key(s) are not marked as "used" unless there are no
corresponding OTKs.

This is currently defined in MSC3983 (although likely to be split out
to a separate MSC). The endpoint shape may change or be requested
differently (i.e. a keyword parameter on the current endpoint), but the
core logic should be reasonable.
2023-04-25 13:30:41 -04:00
Andrew Morgan
aec639e3e3
Move Spam Checker callbacks to a dedicated file (#15453) 2023-04-18 00:57:40 +00:00
Patrick Cloke
5282ba1e2b
Implement MSC3983 to proxy /keys/claim queries to appservices. (#15314)
Experimental support for MSC3983 is behind a configuration flag.
If enabled, for users which are exclusively owned by an application
service then the appservice will be queried for one-time keys *if*
there are none uploaded to Synapse.
2023-03-28 18:26:27 +00:00
Mathieu Velten
6cddf24e36
Faster joins: don't stall when a user joins during a fast join (#14606)
Fixes #12801.
Complement tests are at
https://github.com/matrix-org/complement/pull/567.

Avoid blocking on full state when handling a subsequent join into a
partial state room.

Also always perform a remote join into partial state rooms, since we do
not know whether the joining user has been banned and want to avoid
leaking history to banned users.

Signed-off-by: Mathieu Velten <mathieuv@matrix.org>
Co-authored-by: Sean Quah <seanq@matrix.org>
Co-authored-by: David Robertson <davidr@element.io>
2023-02-10 23:31:05 +00:00
Sean Quah
d0c713cc85
Return read-only collections from @cached methods (#13755)
It's important that collections returned from `@cached` methods are not
modified, otherwise future retrievals from the cache will return the
modified collection.

This applies to the return values from `@cached` methods and the values
inside the dictionaries returned by `@cachedList` methods. It's not
necessary for the dictionaries returned by `@cachedList` methods
themselves to be read-only.

Signed-off-by: Sean Quah <seanq@matrix.org>
Co-authored-by: David Robertson <davidr@element.io>
2023-02-10 23:29:00 +00:00
Patrick Cloke
1182ae5063
Add helper to parse an enum from query args & use it. (#14956)
The `parse_enum` helper pulls an enum value from the query string
(by delegating down to the parse_string helper with values generated
from the enum).

This is used to pull out "f" and "b" in most places and then we thread
the resulting Direction enum throughout more code.
2023-02-01 21:35:24 +00:00
David Robertson
3b8574b4f2
Tag /send_join responses to detect faster joins (#14950)
* Tag /send_join responses to detect faster joins

* Changelog

* Define a proper SynapseTag

* isort
2023-01-31 12:43:20 +00:00
David Robertson
85a7a201fa
Also use stable name in SendJoinResponse struct (#14841)
* Also use stable name in SendJoinResponse struct

follow-up to #14832

* Changelog

* Fix a rename I missed

* Run black

* Update synapse/federation/federation_client.py

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
2023-01-16 12:40:25 +00:00
David Robertson
52ae80dd1a
Use stable identifiers for faster joins (#14832)
* Use new query param when requesting a partial join

* Read new query param when serving partial join

* Provide new field names when serving partial joins

* Read new field names from partial join response

* Changelog
2023-01-13 17:58:53 +00:00
David Robertson
1eed795fc5
Include heroes in partial join responses' state (#14442)
* Pull out hero selection logic

* Include heroes in partial join response's state

* Changelog

* Fixup trial test

* Remove TODO
2022-11-15 17:35:19 +00:00
Eric Eastwood
70b3396506
Explain SynapseError and FederationError better (#14191)
Explain `SynapseError` and `FederationError` better

Spawning from https://github.com/matrix-org/synapse/pull/13816#discussion_r993262622
2022-10-19 15:39:43 -05:00
Andrew Morgan
9c23442ac9
Correct field name for stripped state events when knocking. knock_state_events -> knock_room_state (#14102) 2022-10-12 14:37:20 +01:00
reivilibre
c06b2b7142
Faster Remote Room Joins: tell remote homeservers that we are unable to authorise them if they query a room which has partial state on our server. (#13823) 2022-09-23 11:47:16 +01:00
reivilibre
ba882c0357
Faster Room Joins: fix /make_knock blocking indefinitely when the room in question is a partial-stated room. (#13583)
Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
2022-08-24 09:09:59 +00:00
Eric Eastwood
344a2f767c
Instrument FederationStateIdsServlet - /state_ids (#13499)
Instrument FederationStateIdsServlet - `/state_ids` so it's easier to follow what's going on in Jaeger when viewing a trace.
2022-08-15 19:41:23 +01:00
reivilibre
e9e6aacfbe
Faster Room Joins: prevent Synapse from answering federated join requests for a room which it has not fully joined yet. (#13416) 2022-08-04 16:27:04 +01:00
Will Hunt
502f075e96
Implement MSC3848: Introduce errcodes for specific event sending failures (#13343)
Implements MSC3848
2022-07-27 13:44:40 +01:00
David Robertson
b977867358
Rate limit joins per-room (#13276) 2022-07-19 11:45:17 +00:00
Sean Quah
68db233f0c
Handle race between persisting an event and un-partial stating a room (#13100)
Whenever we want to persist an event, we first compute an event context,
which includes the state at the event and a flag indicating whether the
state is partial. After a lot of processing, we finally try to store the
event in the database, which can fail for partial state events when the
containing room has been un-partial stated in the meantime.

We detect the race as a foreign key constraint failure in the data store
layer and turn it into a special `PartialStateConflictError` exception,
which makes its way up to the method in which we computed the event
context.

To make things difficult, the exception needs to cross a replication
request: `/fed_send_events` for events coming over federation and
`/send_event` for events from clients. We transport the
`PartialStateConflictError` as a `409 Conflict` over replication and
turn `409`s back into `PartialStateConflictError`s on the worker making
the request.

All client events go through
`EventCreationHandler.handle_new_client_event`, which is called in
*a lot* of places. Instead of trying to update all the code which
creates client events, we turn the `PartialStateConflictError` into a
`429 Too Many Requests` in
`EventCreationHandler.handle_new_client_event` and hope that clients
take it as a hint to retry their request.

On the federation event side, there are 7 places which compute event
contexts. 4 of them use outlier event contexts:
`FederationEventHandler._auth_and_persist_outliers_inner`,
`FederationHandler.do_knock`, `FederationHandler.on_invite_request` and
`FederationHandler.do_remotely_reject_invite`. These events won't have
the partial state flag, so we do not need to do anything for then.

The remaining 3 paths which create events are
`FederationEventHandler.process_remote_join`,
`FederationEventHandler.on_send_membership_event` and
`FederationEventHandler._process_received_pdu`.

We can't experience the race in `process_remote_join`, unless we're
handling an additional join into a partial state room, which currently
blocks, so we make no attempt to handle it correctly.

`on_send_membership_event` is only called by
`FederationServer._on_send_membership_event`, so we catch the
`PartialStateConflictError` there and retry just once.

`_process_received_pdu` is called by `on_receive_pdu` for incoming
events and `_process_pulled_event` for backfill. The latter should never
try to persist partial state events, so we ignore it. We catch the
`PartialStateConflictError` in `on_receive_pdu` and retry just once.

Refering to the graph of code paths in
https://github.com/matrix-org/synapse/issues/12988#issuecomment-1156857648
may make the above make more sense.

Signed-off-by: Sean Quah <seanq@matrix.org>
2022-07-05 16:12:52 +01:00
Erik Johnston
e3163e2e11
Reduce the amount of state we pull from the DB (#12811) 2022-06-06 09:24:12 +01:00
Erik Johnston
888a29f412
Wait for lazy join to complete when getting current state (#12872) 2022-06-01 16:02:53 +01:00
Richard van der Hoff
f0aec0abef
Improve logging when signature checks fail (#12925)
* Raise a dedicated `InvalidEventSignatureError` from `_check_sigs_on_pdu`

* Downgrade logging about redactions to DEBUG

this can be very spammy during a room join, and it's not very useful.

* Raise `InvalidEventSignatureError` from `_check_sigs_and_hash`

... and, more importantly, move the logging out to the callers.

* changelog
2022-05-31 23:32:56 +01:00
Erik Johnston
1e453053cb
Rename storage classes (#12913) 2022-05-31 12:17:50 +00:00
Patrick Cloke
c52abc1cfd
Additional constants for EDU types. (#12884)
Instead of hard-coding strings in many places.
2022-05-27 07:14:36 -04:00
Jess Porter
a608ac847b
add SpamChecker callback for silently dropping inbound federated events (#12744)
Signed-off-by: jesopo <github@lolnerd.net>
2022-05-23 16:36:21 +00:00
David Robertson
6463244375
Remove unused # type: ignores (#12531)
Over time we've begun to use newer versions of mypy, typeshed, stub
packages---and of course we've improved our own annotations. This makes
some type ignore comments no longer necessary. I have removed them.

There was one exception: a module that imports `select.epoll`. The
ignore is redundant on Linux, but I've kept it ignored for those of us
who work on the source tree using not-Linux. (#11771)

I'm more interested in the config line which enforces this. I want
unused ignores to be reported, because I think it's useful feedback when
annotating to know when you've fixed a problem you had to previously
ignore.

* Installing extras before typechecking

Lacking an easy way to install all extras generically, let's bite the bullet and
make install the hand-maintained `all` extra before typechecking.

Now that https://github.com/matrix-org/backend-meta/pull/6 is merged to
the release/v1 branch.
2022-04-27 14:03:44 +01:00
Richard van der Hoff
b121a3ad2b
Back out implementation of MSC2314 (#12474)
MSC2314 has now been closed, so we're backing out its implementation, which
originally happened in #6176.

Unfortunately it's not a direct revert, as that PR mixed in a bunch of
unrelated changes to tests etc.
2022-04-19 11:17:29 +00:00
Patrick Cloke
4bdbebccb9
Remove the unstable event field for /send_join per MSC3083. (#12395)
This was missed when initially stabilising room version 8 and was
left in as a compatibility shim. Most homeservers have upgraded
to a version which expects the proper field name, and the failure
mode is reasonable (a user on an older server may have to attempt
joining the room twice with an obscure error message the first time).
2022-04-12 11:27:45 -04:00
Sean Quah
800ba87cc8
Refactor and convert Linearizer to async (#12357)
Refactor and convert `Linearizer` to async. This makes a `Linearizer`
cancellation bug easier to fix.

Also refactor to use an async context manager, which eliminates an
unlikely footgun where code that doesn't immediately use the context
manager could forget to release the lock.

Signed-off-by: Sean Quah <seanq@element.io>
2022-04-05 15:43:52 +01:00
Richard van der Hoff
38adf14998
Enhance logging for inbound federation events (#12301)
It is currently rather hard to see which rooms are causing inbound federation
traffic. Add the room id to the logs.
2022-03-25 14:44:57 +00:00
Richard van der Hoff
afa17f0eab
Return a 404 from /state for an outlier (#12087)
* Replace `get_state_for_pdu` with  `get_state_ids_for_pdu` and `get_events_as_list`.
* Return a 404 from `/state` and `/state_ids` for an outlier
2022-03-21 11:23:32 +00:00