synapse-product

mirror of https://git.anonymousland.org/anonymousland/synapse-product.git synced 2024-12-21 20:15:01 -05:00

Author	SHA1	Message	Date
Richard van der Hoff	7c6b2204d1	Faster joins: add issue links to the TODOs (#13004 ) ... to help us keep track of these things	2022-06-09 10:13:03 +00:00
Erik Johnston	e3163e2e11	Reduce the amount of state we pull from the DB (#12811 )	2022-06-06 09:24:12 +01:00
Sean Quah	2fba1076c5	Faster room joins: Try other destinations when resyncing the state of a partial-state room (#12812 ) Signed-off-by: Sean Quah <seanq@matrix.org>	2022-05-31 15:50:29 +01:00
Erik Johnston	1e453053cb	Rename storage classes (#12913 )	2022-05-31 12:17:50 +00:00
Erik Johnston	b83bc5fab5	Pull out less state when handling gaps mk2 (#12852 )	2022-05-26 09:48:12 +00:00
Erik Johnston	4660d9fdcf	Fix up `state_store` naming (#12871 )	2022-05-25 12:59:04 +01:00
Eric Eastwood	7c2a78bb3b	Marker events as state - MSC2716 (#12718 ) Sending marker events as state now so they are always able to be seen by homeservers (not lost in some timeline gap). Part of [MSC2716](https://github.com/matrix-org/matrix-spec-proposals/pull/2716) Complement tests: https://github.com/matrix-org/complement/pull/371 As initially discussed at https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r782629097 and https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r876684431 When someone joins a room, process all of the marker events we see in the current state. Marker events should be sent with a unique `state_key` so that they can all resolve in the current state to easily be discovered. Marker events as state - If we re-use the same `state_key` (like `""`), then we would have to fetch previous snapshots of state up through time to find all of the marker events. This way we can avoid all of that. This PR was originally doing this but then thought of the smarter way to tackle in an [out of band discussion with @erikjohnston](https://docs.google.com/document/d/1JJDuPfcPNX75fprdTWlxlaKjWOdbdJylbpZ03hzo638/edit#bookmark=id.sm92fqyq7vpp). - Also avoids state resolution conflicts where only one of the marker events win As a homeserver, when we see new marker state, we know there is new history imported somewhere back in time and should process it to fetch the insertion event where the historical messages are and set it as an insertion extremity. This way we know where to backfill more messages when someone asks for scrollback.	2022-05-23 20:43:37 -05:00
Shay	71e8afe34d	Update EventContext `get_current_event_ids` and `get_prev_event_ids` to accept state filters and update calls where possible (#12791 )	2022-05-20 09:54:12 +01:00
Patrick Cloke	a4c75918b3	Remove unneeded `ActionGenerator` class. (#12691 ) It simply passes through to `BulkPushRuleEvaluator`, which can be called directly instead.	2022-05-11 07:15:21 -04:00
Erik Johnston	c72d26c1e1	Refactor `EventContext` (#12689 ) Refactor how the `EventContext` class works, with the intention of reducing the amount of state we fetch from the DB during event processing. The idea here is to get rid of the cached `current_state_ids` and `prev_state_ids` that live in the `EventContext`, and instead defer straight to the database (and its caching). One change that may have a noticeable effect is that we now no longer prefill the `get_current_state_ids` cache on a state change. However, that query is relatively light, since it's just a case of reading a table from the DB (unlike fetching state at an event which is more heavyweight). For deployments with workers this cache isn't even used. Part of #12684	2022-05-10 19:43:13 +00:00
andrew do	01e625513a	remove constantly lib use and switch to enums. (#12624 )	2022-05-04 11:26:11 +00:00
Richard van der Hoff	f5668f0b4a	Await un-partial-stating after a partial-state join (#12399 ) When we join a room via the faster-joins mechanism, we end up with "partial state" at some points on the event DAG. Many parts of the codebase need to wait for the full state to load. So, we implement a mechanism to keep track of which events have partial state, and wait for them to be fully-populated.	2022-04-21 07:42:03 +01:00
Richard van der Hoff	320186319a	Resync state after partial-state join (#12394 ) We work through all the events with partial state, updating the state at each of them. Once it's done, we recalculate the state for the whole room, and then mark the room as having complete state.	2022-04-12 13:23:43 +00:00
Sean Quah	800ba87cc8	Refactor and convert `Linearizer` to async (#12357 ) Refactor and convert `Linearizer` to async. This makes a `Linearizer` cancellation bug easier to fix. Also refactor to use an async context manager, which eliminates an unlikely footgun where code that doesn't immediately use the context manager could forget to release the lock. Signed-off-by: Sean Quah <seanq@element.io>	2022-04-05 15:43:52 +01:00
Richard van der Hoff	9b43df1f7b	Optimise `_get_state_after_missing_prev_event`: use `/state` (#12040 ) If we're missing most of the events in the room state, then we may as well call the /state endpoint, instead of individually requesting each and every event.	2022-04-01 12:53:42 +01:00
Richard van der Hoff	9b67715bc3	Disable proactive sends for remote joins (#12330 ) Do not attempt to send remote joins out over federation. Normally, it will do nothing; occasionally, it will do the wrong thing.	2022-03-30 12:04:35 +01:00
Richard van der Hoff	e2e1d90a5e	Faster joins: persist to database (#12012 ) When we get a partial_state response from send_join, store information in the database about it: * store a record about the room as a whole having partial state, and stash the list of member servers too. * flag the join event itself as having partial state * also, for any new events whose prev-events are partial-stated, note that they will also be partial-stated. We don't yet make any attempt to interpret this data, so API calls (and a bunch of other things) are just going to get incorrect data.	2022-03-01 12:49:54 +00:00
Richard van der Hoff	e24ff8ebe3	Remove `HomeServer.get_datastore()` (#12031 ) The presence of this method was confusing, and mostly present for backwards compatibility. Let's get rid of it. Part of #11733	2022-02-23 11:04:02 +00:00
Richard van der Hoff	696acd3515	`send_join` response: get create event from `state`, not `auth_chain` (#12005 ) msc3706 proposes changing the `/send_join` response: > Any events returned within `state` can be omitted from `auth_chain`. Currently, we rely on `m.room.create` being returned in `auth_chain`, but since the `m.room.create` event must necessarily be part of the state, the above change will break this. In short, let's look for `m.room.create` in `state` rather than `auth_chain`.	2022-02-17 11:59:26 +00:00
Richard van der Hoff	bab2394aa9	`_auth_and_persist_outliers`: drop events we have already seen (#11994 ) We already have two copies of this code, in 2/3 of the callers of `_auth_and_persist_outliers`. Before I add a third, let's push it down.	2022-02-15 14:33:28 +00:00
Eric Eastwood	fef2e792be	Fix historical messages backfilling in random order on remote homeservers (MSC2716) (#11114 ) Fix https://github.com/matrix-org/synapse/issues/11091 Fix https://github.com/matrix-org/synapse/issues/10764 (side-stepping the issue because we no longer have to deal with `fake_prev_event_id`) 1. Made the `/backfill` response return messages in `(depth, stream_ordering)` order (previously only sorted by `depth`) - Technically, it shouldn't really matter how `/backfill` returns things but I'm just trying to make the `stream_ordering` a little more consistent from the origin to the remote homeservers in order to get the order of messages from `/messages` consistent ([sorted by `(topological_ordering, stream_ordering)`](https://github.com/matrix-org/synapse/blob/develop/docs/development/room-dag-concepts.md#depth-and-stream-ordering)). - Even now that we return backfilled messages in order, it still doesn't guarantee the same `stream_ordering` (and more importantly the [`/messages` order](https://github.com/matrix-org/synapse/blob/develop/docs/development/room-dag-concepts.md#depth-and-stream-ordering)) on the other server. For example, if a room has a bunch of history imported and someone visits a permalink to a historical message back in time, their homeserver will skip over the historical messages in between and insert the permalink as the next message in the `stream_order` and totally throw off the sort. - This will be even more the case when we add the [MSC3030 jump to date API endpoint](https://github.com/matrix-org/matrix-doc/pull/3030) so the static archives can navigate and jump to a certain date. - We're solving this in the future by switching to [online topological ordering](https://github.com/matrix-org/gomatrixserverlib/issues/187) and [chunking](https://github.com/matrix-org/synapse/issues/3785) which by its nature will apply retroactively to fix any inconsistencies introduced by people permalinking 2. 
As we're navigating `prev_events` to return in `/backfill`, we order by `depth` first (newest -> oldest) and now also tie-break based on the `stream_ordering` (newest -> oldest). This is technically important because MSC2716 inserts a bunch of historical messages at the same `depth` so it's best to be prescriptive about which ones we should process first. In reality, I think the code already looped over the historical messages as expected because the database is already in order. 3. Making the historical state chain and historical event chain float on their own by having no `prev_events` instead of a fake `prev_event` which caused backfill to get clogged with an unresolvable event. Fixes https://github.com/matrix-org/synapse/issues/11091 and https://github.com/matrix-org/synapse/issues/10764 4. We no longer find connected insertion events by finding a potential `prev_event` connection to the current event we're iterating over. We now solely rely on marker events which when processed, add the insertion event as an extremity and the federating homeserver can ask about it when time calls. - Related discussion, https://github.com/matrix-org/synapse/pull/11114#discussion_r741514793 Before \| After --- \| --- ![](https://user-images.githubusercontent.com/558581/139218681-b465c862-5c49-4702-a59e-466733b0cf45.png) \| ![](https://user-images.githubusercontent.com/558581/146453159-a1609e0a-8324-439d-ae44-e4bce43ac6d1.png) #### Why aren't we sorting topologically when receiving backfill events? > The main reason we're going to opt to not sort topologically when receiving backfill events is because it's probably best to do whatever is easiest to make it just work. People will probably have opinions once they look at [MSC2716](https://github.com/matrix-org/matrix-doc/pull/2716) which could change whatever implementation anyway. 
> > As mentioned, ideally we would do this but code necessary to make the fake edges but it gets confusing and gives an impression of “just whyyyy” (feels icky). This problem also dissolves with online topological ordering. > > -- https://github.com/matrix-org/synapse/pull/11114#discussion_r741517138 See https://github.com/matrix-org/synapse/pull/11114#discussion_r739610091 for the technical difficulties	2022-02-07 15:54:13 -06:00
Richard van der Hoff	251b5567ec	Remove `log_function` and its uses (#11761 ) I've never found this terribly useful. I think it was added in the early days of Synapse, without much thought as to what would actually be useful to log, and has just been cargo-culted ever since. Rather, it tends to clutter up debug logs with useless information.	2022-01-18 13:06:04 +00:00
Richard van der Hoff	0fb3dd0830	Refactor the way we set `outlier` (#11634 ) * `_auth_and_persist_outliers`: mark persisted events as outliers Mark any events that get persisted via `_auth_and_persist_outliers` as, well, outliers. Currently this will be a no-op as everything will already be flagged as an outlier, but I'm going to change that. * `process_remote_join`: stop flagging as outlier The events are now flagged as outliers later on, by `_auth_and_persist_outliers`. * `send_join`: remove `outlier=True` The events created here are returned in the result of `send_join` to `FederationHandler.do_invite_join`. From there they are passed into `FederationEventHandler.process_remote_join`, which passes them to `_auth_and_persist_outliers`... which sets the `outlier` flag. * `get_event_auth`: remove `outlier=True` stop flagging the events returned by `get_event_auth` as outliers. This method is only called by `_get_remote_auth_chain_for_event`, which passes the results into `_auth_and_persist_outliers`, which will flag them as outliers. * `_get_remote_auth_chain_for_event`: remove `outlier=True` we pass all the events into `_auth_and_persist_outliers`, which will now flag the events as outliers. * `_check_sigs_and_hash_and_fetch`: remove unused `outlier` parameter This param is now never set to True, so we can remove it. * `_check_sigs_and_hash_and_fetch_one`: remove unused `outlier` param This is no longer set anywhere, so we can remove it. * `get_pdu`: remove unused `outlier` parameter ... and chase it down into `get_pdu_from_destination_raw`. * `event_from_pdu_json`: remove redundant `outlier` param This is never set to `True`, so can be removed. * changelog * update docstring	2022-01-05 12:26:11 +00:00
Richard van der Hoff	878aa55293	`FederationClient.backfill`: stop flagging events as outliers (#11632 ) Events returned by `backfill` should not be flagged as outliers. Fixes: ``` AssertionError: null File "synapse/handlers/federation.py", line 313, in try_backfill dom, room_id, limit=100, extremities=extremities File "synapse/handlers/federation_event.py", line 517, in backfill await self._process_pulled_events(dest, events, backfilled=True) File "synapse/handlers/federation_event.py", line 642, in _process_pulled_events await self._process_pulled_event(origin, ev, backfilled=backfilled) File "synapse/handlers/federation_event.py", line 669, in _process_pulled_event assert not event.internal_metadata.is_outlier() ``` See https://sentry.matrix.org/sentry/synapse-matrixorg/issues/231992 Fixes #8894.	2022-01-04 16:31:32 +00:00
Richard van der Hoff	2359ee3864	Remove redundant `get_current_events_token` (#11643 ) * Push `get_room_{min,max_stream_ordering}` into StreamStore Both implementations of this are identical, so we may as well push it down and get rid of the abstract base class nonsense. * Remove redundant `StreamStore` class This is empty now * Remove redundant `get_current_events_token` This was an exact duplicate of `get_room_max_stream_ordering`, so let's get rid of it. * newsfile	2022-01-04 16:10:27 +00:00
Richard van der Hoff	73cbb284b9	Remove redundant parameters on `_check_event_auth` (#11292 ) as of #11012, these parameters are unused.	2021-11-10 14:16:06 +00:00
Patrick Cloke	c01bc5f43d	Add remaining type hints to `synapse.events`. (#11098 )	2021-11-02 09:55:52 -04:00
reivilibre	75ca0a6168	Annotate `log_function` decorator (#10943 ) Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>	2021-10-27 17:27:23 +01:00
Brendan Abolivier	c7a5e49664	Implement an `on_new_event` callback (#11126 ) Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>	2021-10-26 15:17:36 +02:00
Richard van der Hoff	da957a60e8	Ensure that we correctly auth events returned by `send_join` (#11012 ) This is the final piece of the jigsaw for #9595. As with other changes before this one (eg #10771), we need to make sure that we auth the auth events in the right order, and actually check that their predecessors haven't been rejected. To do this I've reused the existing code we use when persisting outliers elsewhere. I've removed the code for attempting to fetch missing auth_events - the events should have been present in the send_join response, so the likely reason they are missing is that we couldn't verify them, so requesting them again is unlikely to help. Instead, we simply drop any state which relies on those auth events, as we do at a backwards-extremity. See also matrix-org/complement#216 for a test for this.	2021-10-25 15:21:09 +01:00
Richard van der Hoff	0930e9ae12	Clean up `_update_auth_events_and_context_for_auth` (#11122 ) Remove some redundant code, and generally simplify.	2021-10-20 18:22:40 +01:00
Richard van der Hoff	f3efa0036b	Move _persist_auth_tree into FederationEventHandler (#11115 ) This is just a lift-and-shift, because it fits more naturally here. We do rename it to `process_remote_join` at the same time though.	2021-10-19 10:24:09 +01:00
Richard van der Hoff	0170774b19	Rename `_auth_and_persist_fetched_events` (#11116 ) ... to `_auth_and_persist_outliers`, since that reflects its purpose better.	2021-10-19 10:23:55 +01:00
Richard van der Hoff	cc33d9eee2	Check auth on received events' auth_events (#11001 ) Currently, when we receive an event whose auth_events differ from those we expect, we state-resolve between the two state sets, and check that the event passes auth based on the resolved state. This means that it's possible for us to accept events which don't pass auth at their declared auth_events (or where the auth events themselves were rejected), leading to problems down the line like #10083. This change means we will: * ignore any events where we cannot find the auth events * reject any events whose auth events were rejected * reject any events which do not pass auth at their declared auth_events. Together with a whole raft of previous work, this is a partial fix to #9595. Fixes #6643. Based on #11009.	2021-10-18 18:29:37 +01:00
Richard van der Hoff	a5d2ea3d08	Check all auth events for room id and rejection (#11009 ) This fixes a bug where we would accept an event whose `auth_events` include rejected events, if the rejected event was shadowed by another `auth_event` with same `(type, state_key)`. The approach is to pass a list of auth events into `check_auth_rules_for_event` instead of a dict, which of course means updating the call sites. This is an extension of #10956.	2021-10-18 18:28:30 +01:00
Richard van der Hoff	e8f24b6c35	`_run_push_actions_and_persist_event`: handle no min_depth (#11014 ) Make sure that we correctly handle rooms where we do not yet have a `min_depth`, and also add some comments and logging.	2021-10-18 17:17:15 +01:00
Eric Eastwood	daf498e099	Fix 500 error on `/messages` when we accumulate more than 5 backward extremities (#11027 ) Found while working on the Gitter backfill script and noticed it only happened after we sent 7 batches, https://gitlab.com/gitterHQ/webapp/-/merge_requests/2229#note_665906390 When there are more than 5 backward extremities for a given depth, backfill will throw an error because we sliced the extremity list to 5 but then try to iterate over the full list. This causes us to look for state that we never fetched and we get a `KeyError`. Before when calling `/messages` when there are more than 5 backward extremities: ``` Traceback (most recent call last): File "/usr/local/lib/python3.8/site-packages/synapse/http/server.py", line 258, in _async_render_wrapper callback_return = await self._async_render(request) File "/usr/local/lib/python3.8/site-packages/synapse/http/server.py", line 446, in _async_render callback_return = await raw_callback_return File "/usr/local/lib/python3.8/site-packages/synapse/rest/client/room.py", line 580, in on_GET msgs = await self.pagination_handler.get_messages( File "/usr/local/lib/python3.8/site-packages/synapse/handlers/pagination.py", line 396, in get_messages await self.hs.get_federation_handler().maybe_backfill( File "/usr/local/lib/python3.8/site-packages/synapse/handlers/federation.py", line 133, in maybe_backfill return await self._maybe_backfill_inner(room_id, current_depth, limit) File "/usr/local/lib/python3.8/site-packages/synapse/handlers/federation.py", line 386, in _maybe_backfill_inner likely_extremeties_domains = get_domains_from_state(states[e_id]) KeyError: '$zpFflMEBtZdgcMQWTakaVItTLMjLFdKcRWUPHbbSZJl' ```	2021-10-14 18:53:45 -05:00
Richard van der Hoff	96fe77c254	Improve the logging in _auth_and_persist_outliers (#11010 ) Include the event ids being persisted	2021-10-07 11:43:25 +00:00
Richard van der Hoff	86af6b2f0e	Add a comment in _process_received_pdu (#11011 )	2021-10-07 12:20:03 +01:00
Eric Eastwood	392863fbf1	Fix logic flaw preventing tracking of MSC2716 events in existing room versions (#10962 ) We correctly allowed using the MSC2716 batch endpoint for the room creator in existing room versions but accidentally didn't track the events because of a logic flaw. This prevented you from connecting subsequent chunks together because it would throw the unknown batch ID error. We only want to process MSC2716 events when: - The room version supports MSC2716 - Any room where the homeserver has the `msc2716_enabled` experimental feature enabled and the event is from the room creator	2021-10-05 11:51:57 -05:00
Richard van der Hoff	787af4a106	Hoist `cache_joined_hosts_for_event` to caller (#10986 ) `_check_event_auth` is only called in two places, and only one of those sets `send_on_behalf_of`. Warming the cache isn't really part of auth anyway, so moving it out makes a lot more sense.	2021-10-05 13:01:41 +01:00
Richard van der Hoff	d099535deb	`_update_auth_events_and_context_for_auth`: add some comments (#10987 ) Add some more comments about wtf is going on here.	2021-10-05 12:50:38 +01:00
Richard van der Hoff	cb88ed912b	`_check_event_auth`: move event validation earlier (#10988 ) There's little point in doing a fancy state reconciliation dance if the event itself is invalid. Likewise, there's no point checking it again in `_check_for_soft_fail`.	2021-10-05 12:50:07 +01:00
Richard van der Hoff	428174f902	Split `event_auth.check` into two parts (#10940 ) Broadly, the existing `event_auth.check` function has two parts: * a validation section: checks that the event isn't too big, that it has the right signatures, etc. This bit is independent of the rest of the state in the room, and so need only be done once for each event. * an auth section: ensures that the event is allowed, given the rest of the state in the room. This gets done multiple times, against various sets of room state, because it forms part of the state res algorithm. Currently, this is implemented with `do_sig_check` and `do_size_check` parameters, but I think that makes everything hard to follow. Instead, we split the function in two and call each part separately where it is needed.	2021-09-29 18:59:15 +01:00
Richard van der Hoff	2622b28c5c	Inline `_check_event_auth` for outliers (#10926 ) * Inline `_check_event_auth` for outliers When we are persisting an outlier, most of `_check_event_auth` is redundant: * `_update_auth_events_and_context_for_auth` does nothing, because the `input_auth_events` are (now) exactly the event's auth_events, which means that `missing_auth` is empty. * we don't care about soft-fail, kicking guest users or `send_on_behalf_of` for outliers ... so the only thing that matters is the auth itself, so let's just do that. * `_auth_and_persist_fetched_events_inner`: de-async `prep` `prep` no longer calls any `async` methods, so let's make it synchronous. * Simplify `_check_event_auth` We no longer need to support outliers here, which makes things rather simpler. * changelog * lint	2021-09-28 15:25:07 +01:00
Richard van der Hoff	0420d4e6a5	Stop trying to auth/persist events whose auth events we do not have. (#10907 )	2021-09-24 14:01:45 +01:00
Richard van der Hoff	85551b7a85	Factor out common code for persisting fetched auth events (#10896 ) * Factor more stuff out of `_get_events_and_persist` It turns out that the event-sorting algorithm in `_get_events_and_persist` is also useful in other circumstances. Here we move the current `_auth_and_persist_fetched_events` to `_auth_and_persist_fetched_events_inner`, and then factor the sorting part out to `_auth_and_persist_fetched_events`. * `_get_remote_auth_chain_for_event`: remove redundant `outlier` assignment `get_event_auth` returns events with the outlier flag already set, so this is redundant (though we need to update a test where `get_event_auth` is mocked). * `_get_remote_auth_chain_for_event`: move existing-event tests earlier Move a couple of tests outside the loop. This is a bit inefficient for now, but a future commit will make it better. It should be functionally identical. * `_get_remote_auth_chain_for_event`: use `_auth_and_persist_fetched_events` We can use the same codepath for persisting the events fetched as part of an auth chain as for those fetched individually by `_get_events_and_persist` for building the state at a backwards extremity. * `_get_remote_auth_chain_for_event`: use a dict for efficiency `_auth_and_persist_fetched_events` sorts the events itself, so we no longer need to care about maintaining the ordering from `get_event_auth` (and no longer need to sort by depth in `get_event_auth`). That means that we can use a map, making it easier to filter out events we already have, etc. * changelog * `_auth_and_persist_fetched_events`: improve docstring	2021-09-24 11:56:33 +01:00
Richard van der Hoff	261c9763c4	Simplify `_auth_and_persist_fetched_events` (#10901 ) Combine the two loops over the list of events, and hence get rid of `_NewEventInfo`. Also pass the event back alongside the context, so that it's easier to process the result.	2021-09-24 11:56:13 +01:00
Richard van der Hoff	a7304adc7d	Factor out `_get_remote_auth_chain_for_event` from `_update_auth_events_and_context_for_auth` (#10884 ) * Reload auth events from db after fetching and persisting In `_update_auth_events_and_context_for_auth`, when we fetch the remote auth tree and persist the returned events: load the missing events from the database rather than using the copies we got from the remote server. This is mostly in preparation for additional refactors, but does have an advantage in that if we later get around to checking the rejected status, we'll be able to make use of it. * Factor out `_get_remote_auth_chain_for_event` from `_update_auth_events_and_context_for_auth` * changelog	2021-09-23 17:34:33 +01:00
Richard van der Hoff	26f2bfedbf	Factor out a separate `EventContext.for_outlier` (#10883 ) Constructing an EventContext for an outlier is actually really simple, and there's no sense in going via an `async` method in the `StateHandler`. This also means that we can resolve a bunch of FIXMEs.	2021-09-22 17:58:57 +01:00

60 Commits