Commit Graph

476 Commits

Author SHA1 Message Date
Shay
a648a06d52
Add some tracing spans to give insight into local joins (#13439) 2022-08-03 10:19:34 -07:00
andrew do
78a3111c41
Return 404 or member list when getting joined_members after leaving (#13374)
Signed-off-by: Andrew Doh <andrewddo@gmail.com>
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
Co-authored-by: Andrew Morgan <andrewm@element.io>
Co-authored-by: Brendan Abolivier <babolivier@matrix.org>
2022-08-03 14:26:31 +02:00
Will Hunt
502f075e96
Implement MSC3848: Introduce errcodes for specific event sending failures (#13343)
Implements MSC3848
2022-07-27 13:44:40 +01:00
Sean Quah
335ebb21cc
Faster room joins: avoid blocking when pulling events with missing prevs (#13355)
Avoid blocking on full state in `_resolve_state_at_missing_prevs` and
return a new flag indicating whether the resolved state is partial.
Thread that flag around so that it makes it into the event context.

Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
2022-07-26 12:39:23 +01:00
David Robertson
b977867358
Rate limit joins per-room (#13276) 2022-07-19 11:45:17 +00:00
Erik Johnston
cf5fa5063d
Don't pull out full state when sending dummy events (#13310) 2022-07-18 14:19:11 +01:00
Erik Johnston
0ca4172b5d
Don't pull out state in compute_event_context for unconflicted state (#13267) 2022-07-14 13:57:02 +00:00
Sean Quah
68db233f0c
Handle race between persisting an event and un-partial stating a room (#13100)
Whenever we want to persist an event, we first compute an event context,
which includes the state at the event and a flag indicating whether the
state is partial. After a lot of processing, we finally try to store the
event in the database, which can fail for partial state events when the
containing room has been un-partial stated in the meantime.

We detect the race as a foreign key constraint failure in the data store
layer and turn it into a special `PartialStateConflictError` exception,
which makes its way up to the method in which we computed the event
context.

To make things difficult, the exception needs to cross a replication
request: `/fed_send_events` for events coming over federation and
`/send_event` for events from clients. We transport the
`PartialStateConflictError` as a `409 Conflict` over replication and
turn `409`s back into `PartialStateConflictError`s on the worker making
the request.

All client events go through
`EventCreationHandler.handle_new_client_event`, which is called in
*a lot* of places. Instead of trying to update all the code which
creates client events, we turn the `PartialStateConflictError` into a
`429 Too Many Requests` in
`EventCreationHandler.handle_new_client_event` and hope that clients
take it as a hint to retry their request.

On the federation event side, there are 7 places which compute event
contexts. 4 of them use outlier event contexts:
`FederationEventHandler._auth_and_persist_outliers_inner`,
`FederationHandler.do_knock`, `FederationHandler.on_invite_request` and
`FederationHandler.do_remotely_reject_invite`. These events won't have
the partial state flag, so we do not need to do anything for then.

The remaining 3 paths which create events are
`FederationEventHandler.process_remote_join`,
`FederationEventHandler.on_send_membership_event` and
`FederationEventHandler._process_received_pdu`.

We can't experience the race in `process_remote_join`, unless we're
handling an additional join into a partial state room, which currently
blocks, so we make no attempt to handle it correctly.

`on_send_membership_event` is only called by
`FederationServer._on_send_membership_event`, so we catch the
`PartialStateConflictError` there and retry just once.

`_process_received_pdu` is called by `on_receive_pdu` for incoming
events and `_process_pulled_event` for backfill. The latter should never
try to persist partial state events, so we ignore it. We catch the
`PartialStateConflictError` in `on_receive_pdu` and retry just once.

Refering to the graph of code paths in
https://github.com/matrix-org/synapse/issues/12988#issuecomment-1156857648
may make the above make more sense.

Signed-off-by: Sean Quah <seanq@matrix.org>
2022-07-05 16:12:52 +01:00
Shay
046a6513bc
Don't process /send requests for users who have hit their ratelimit (#13134) 2022-06-30 09:22:40 -07:00
Quentin Gliech
92103cb2c8
Decouple synapse.api.auth_blocking.AuthBlocking from synapse.api.auth.Auth. (#13021) 2022-06-14 09:51:15 +01:00
David Teller
a164a46038
Uniformize spam-checker API, part 4: port other spam-checker callbacks to return Union[Allow, Codes]. (#12857)
Co-authored-by: Brendan Abolivier <babolivier@matrix.org>
2022-06-13 18:16:16 +00:00
Richard van der Hoff
f68b5e5773 Merge branch 'rav/simplify_event_auth_interface' into develop 2022-06-13 11:34:59 +01:00
Richard van der Hoff
c1b28b8842 Remove redundant room_version param from check_auth_rules_from_context
It's now implied by the room_version property on the event.
2022-06-12 23:13:10 +01:00
Richard van der Hoff
68be42f6b6 Remove room_version param from validate_event_for_room_version
Instead, use the `room_version` property of the event we're validating.

The `room_version` was originally added as a parameter somewhere around #4482,
but really it's been redundant since #6875 added a `room_version` field to `EventBase`.
2022-06-12 23:13:09 +01:00
Richard van der Hoff
7c6b2204d1
Faster joins: add issue links to the TODOs (#13004)
... to help us keep track of these things
2022-06-09 10:13:03 +00:00
Erik Johnston
e3163e2e11
Reduce the amount of state we pull from the DB (#12811) 2022-06-06 09:24:12 +01:00
Erik Johnston
888a29f412
Wait for lazy join to complete when getting current state (#12872) 2022-06-01 16:02:53 +01:00
Richard van der Hoff
79dadf7216
Fix 404 on /sync when the last event is a redaction of an unknown/purged event (#12905)
Currently, we try to pull the event corresponding to a sync token from the database. However, when
we fetch redaction events, we check the target of that redaction (because we aren't allowed to send
redactions to clients without validating them). So, if the sync token points to a redaction of an event
that we don't have, we have a problem.

It turns out we don't really need that event, and can just work with its ID and metadata, which
sidesteps the whole problem.
2022-06-01 11:29:51 +00:00
Erik Johnston
3594f6c1f3 Merge branch 'master' into develop 2022-05-31 14:48:22 +01:00
Erik Johnston
1e453053cb
Rename storage classes (#12913) 2022-05-31 12:17:50 +00:00
Brendan Abolivier
8fd87739bf
Fix import in module_api module and docs on the new check_event_for_spam signature (#12918)
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
2022-05-31 12:04:53 +02:00
David Teller
af7db19e1e
Uniformize spam-checker API, part 3: Expand check_event_for_spam with the ability to return additional fields (#12846)
Signed-off-by: David Teller <davidt@element.io>
2022-05-30 18:24:56 +02:00
Erik Johnston
b83bc5fab5
Pull out less state when handling gaps mk2 (#12852) 2022-05-26 09:48:12 +00:00
Erik Johnston
4660d9fdcf
Fix up state_store naming (#12871) 2022-05-25 12:59:04 +01:00
David Teller
28199e9357
Uniformize spam-checker API, part 2: check_event_for_spam (#12808)
Signed-off-by: David Teller <davidt@element.io>
2022-05-23 17:27:39 +00:00
Shay
71e8afe34d
Update EventContext get_current_event_ids and get_prev_event_ids to accept state filters and update calls where possible (#12791) 2022-05-20 09:54:12 +01:00
Patrick Cloke
86a515ccbf
Consolidate logic for parsing relations. (#12693)
Parse the `m.relates_to` event content field (which describes relations)
in a single place, this is used during:

* Event persistence.
* Validation of the Client-Server API.
* Fetching bundled aggregations.
* Processing of push rules.

Each of these separately implement the logic and each made slightly
different assumptions about what was valid. Some had minor / potential
bugs.
2022-05-16 12:42:45 +00:00
Patrick Cloke
a4c75918b3
Remove unneeded ActionGenerator class. (#12691)
It simply passes through to `BulkPushRuleEvaluator`, which can be
called directly instead.
2022-05-11 07:15:21 -04:00
Erik Johnston
c72d26c1e1
Refactor EventContext (#12689)
Refactor how the `EventContext` class works, with the intention of reducing the amount of state we fetch from the DB during event processing.

The idea here is to get rid of the cached `current_state_ids` and `prev_state_ids` that live in the `EventContext`, and instead defer straight to the database (and its caching). 

One change that may have a noticeable effect is that we now no longer prefill the `get_current_state_ids` cache on a state change. However, that query is relatively light, since its just a case of reading a table from the DB (unlike fetching state at an event which is more heavyweight). For deployments with workers this cache isn't even used.


Part of #12684
2022-05-10 19:43:13 +00:00
David Robertson
2607b3e181
Update mypy to 0.950 and fix complaints (#12650) 2022-05-06 12:35:20 +00:00
andrew do
01e625513a
remove constantly lib use and switch to enums. (#12624) 2022-05-04 11:26:11 +00:00
Nick Mills-Barrett
e3a49f4784
Fix missing sync events during historical batch imports (#12319)
Discovered after much in-depth investigation in #12281.

Closes: #12281
Closes: #3305

Signed off by: Nick Mills-Barrett nick@beeper.com
2022-04-13 11:38:35 +01:00
Patrick Cloke
86cf6a3a17
Remove references to unstable identifiers from MSC3440. (#12382)
Removes references to unstable thread relation, unstable
identifiers for filtering parameters, and the experimental
config flag.
2022-04-12 08:42:03 -04:00
Sean Quah
800ba87cc8
Refactor and convert Linearizer to async (#12357)
Refactor and convert `Linearizer` to async. This makes a `Linearizer`
cancellation bug easier to fix.

Also refactor to use an async context manager, which eliminates an
unlikely footgun where code that doesn't immediately use the context
manager could forget to release the lock.

Signed-off-by: Sean Quah <seanq@element.io>
2022-04-05 15:43:52 +01:00
Eric Eastwood
6f2943714b
Remove unused auth_event_ids argument plumbing (#12304)
Follow-up to https://github.com/matrix-org/synapse/pull/12083

Since we are now using the new `state_event_ids` parameter to do all of the heavy lifting.
We can remove any spots where we plumbed `auth_event_ids` just for MSC2716 things in
https://github.com/matrix-org/synapse/pull/9247/files.

Removing `auth_event_ids` from following functions:

 - `create_and_send_nonmember_event`
 - `_local_membership_update`
 - `update_membership`
 - `update_membership_locked`
2022-03-29 09:18:52 +01:00
Eric Eastwood
14662d3c18
Refactor create_new_client_event to use a new parameter, state_event_ids, which accurately describes the usage with MSC2716 instead of abusing auth_event_ids (#12083)
Spawned from https://github.com/matrix-org/synapse/pull/10975#discussion_r813183430

Part of [MSC2716](https://github.com/matrix-org/matrix-spec-proposals/pull/2716)
2022-03-25 09:21:06 -05:00
Patrick Cloke
ea27528b5d
Support stable identifiers for MSC3440: Threading (#12151)
The unstable identifiers are still supported if the experimental configuration
flag is enabled. The unstable identifiers will be removed in a future release.
2022-03-10 15:36:13 +00:00
Erik Johnston
61fd2a8f59
Limit the size of the aggregation_key (#12101)
There's no reason to let people use long keys.
2022-03-03 10:52:35 +00:00
Richard van der Hoff
e2e1d90a5e
Faster joins: persist to database (#12012)
When we get a partial_state response from send_join, store information in the
database about it:
 * store a record about the room as a whole having partial state, and stash the
   list of member servers too.
 * flag the join event itself as having partial state
 * also, for any new events whose prev-events are partial-stated, note that
   they will *also* be partial-stated.

We don't yet make any attempt to interpret this data, so API calls (and a bunch
of other things) are just going to get incorrect data.
2022-03-01 12:49:54 +00:00
Richard van der Hoff
9d11fee8f2
Improve exception handling for concurrent execution (#12109)
* fix incorrect unwrapFirstError import

this was being imported from the wrong place

* Refactor `concurrently_execute` to use `yieldable_gather_results`

* Improve exception handling in `yieldable_gather_results`

Try to avoid swallowing so many stack traces.

* mark unwrapFirstError deprecated

* changelog
2022-03-01 09:34:30 +00:00
Richard van der Hoff
e24ff8ebe3
Remove HomeServer.get_datastore() (#12031)
The presence of this method was confusing, and mostly present for backwards
compatibility. Let's get rid of it.

Part of #11733
2022-02-23 11:04:02 +00:00
Richard van der Hoff
a85dde3445
Minor typing fixes (#12034)
These started failing in
https://github.com/matrix-org/synapse/pull/12031... I'm a bit mystified by how
they ever worked.
2022-02-21 18:37:04 +00:00
Eric Eastwood
fef2e792be
Fix historical messages backfilling in random order on remote homeservers (MSC2716) (#11114)
Fix https://github.com/matrix-org/synapse/issues/11091
Fix https://github.com/matrix-org/synapse/issues/10764 (side-stepping the issue because we no longer have to deal with `fake_prev_event_id`)

 1. Made the `/backfill` response return messages in `(depth, stream_ordering)` order (previously only sorted by `depth`)
    - Technically, it shouldn't really matter how `/backfill` returns things but I'm just trying to make the `stream_ordering` a little more consistent from the origin to the remote homeservers in order to get the order of messages from `/messages` consistent ([sorted by `(topological_ordering, stream_ordering)`](https://github.com/matrix-org/synapse/blob/develop/docs/development/room-dag-concepts.md#depth-and-stream-ordering)).
    - Even now that we return backfilled messages in order, it still doesn't guarantee the same `stream_ordering` (and more importantly the [`/messages` order](https://github.com/matrix-org/synapse/blob/develop/docs/development/room-dag-concepts.md#depth-and-stream-ordering)) on the other server. For example, if a room has a bunch of history imported and someone visits a permalink to a historical message back in time, their homeserver will skip over the historical messages in between and insert the permalink as the next message in the `stream_order` and totally throw off the sort.
       - This will be even more the case when we add the [MSC3030 jump to date API endpoint](https://github.com/matrix-org/matrix-doc/pull/3030) so the static archives can navigate and jump to a certain date.
       - We're solving this in the future by switching to [online topological ordering](https://github.com/matrix-org/gomatrixserverlib/issues/187) and [chunking](https://github.com/matrix-org/synapse/issues/3785) which by its nature will apply retroactively to fix any inconsistencies introduced by people permalinking
 2. As we're navigating `prev_events` to return in `/backfill`, we order by `depth` first (newest -> oldest) and now also tie-break based on the `stream_ordering` (newest -> oldest). This is technically important because MSC2716 inserts a bunch of historical messages at the same `depth` so it's best to be prescriptive about which ones we should process first. In reality, I think the code already looped over the historical messages as expected because the database is already in order.
 3. Making the historical state chain and historical event chain float on their own by having no `prev_events` instead of a fake `prev_event` which caused backfill to get clogged with an unresolvable event. Fixes https://github.com/matrix-org/synapse/issues/11091 and https://github.com/matrix-org/synapse/issues/10764
 4. We no longer find connected insertion events by finding a potential `prev_event` connection to the current event we're iterating over. We now solely rely on marker events which when processed, add the insertion event as an extremity and the federating homeserver can ask about it when time calls.
    - Related discussion, https://github.com/matrix-org/synapse/pull/11114#discussion_r741514793


Before | After
--- | ---
![](https://user-images.githubusercontent.com/558581/139218681-b465c862-5c49-4702-a59e-466733b0cf45.png) | ![](https://user-images.githubusercontent.com/558581/146453159-a1609e0a-8324-439d-ae44-e4bce43ac6d1.png)



#### Why aren't we sorting topologically when receiving backfill events?

> The main reason we're going to opt to not sort topologically when receiving backfill events is because it's probably best to do whatever is easiest to make it just work. People will probably have opinions once they look at [MSC2716](https://github.com/matrix-org/matrix-doc/pull/2716) which could change whatever implementation anyway.
> 
> As mentioned, ideally we would do this but code necessary to make the fake edges but it gets confusing and gives an impression of “just whyyyy” (feels icky). This problem also dissolves with online topological ordering.
>
> -- https://github.com/matrix-org/synapse/pull/11114#discussion_r741517138

See https://github.com/matrix-org/synapse/pull/11114#discussion_r739610091 for the technical difficulties
2022-02-07 15:54:13 -06:00
Patrick Cloke
6bf81a7a61
Bundle aggregations outside of the serialization method. (#11612)
This makes the serialization of events synchronous (and it no
longer access the database), but we must manually calculate and
provide the bundled aggregations.

Overall this should cause no change in behavior, but is prep work
for other improvements.
2022-01-07 09:10:46 -05:00
Patrick Cloke
f58b300d27
Do not attempt to bundled aggregations for /members and /state. (#11623)
Both of those APIs return state events, which will not have bundled
aggregations added anyway.
2021-12-29 08:02:03 -05:00
Patrick Cloke
dd47788752
Do not bundle aggregations for APIs which shouldn't include them. (#11592)
And make bundling aggregations opt-in, instead of opt-out to avoid
having APIs to include extraneous data (and being much heavier than
necessary).
2021-12-20 14:14:38 -05:00
Sean Quah
0147b3de20
Add missing type hints to synapse.logging.context (#11556) 2021-12-14 17:35:28 +00:00
Eric Eastwood
aa8708ebed
Allow events to be created with no prev_events (MSC2716) (#11243)
The event still needs to have `auth_events` defined to be valid.

Split out from https://github.com/matrix-org/synapse/pull/11114
2021-12-10 23:08:51 -06:00
Patrick Cloke
494ebd7347
Include bundled aggregations in /sync and related fixes (#11478)
Due to updates to MSC2675 this includes a few fixes:

* Include bundled aggregations for /sync.
* Do not include bundled aggregations for /initialSync and /events.
* Do not bundle aggregations for state events.
* Clarifies comments and variable names.
2021-12-06 15:51:15 +00:00
Patrick Cloke
6a5dd485bd
Refactor the code to inject bundled relations during serialization. (#11408) 2021-11-23 06:43:56 -05:00