forked-synapse

mirror of https://mau.dev/maunium/synapse.git synced 2024-10-01 01:36:05 -04:00

Author	SHA1	Message	Date
Nick Mills-Barrett	190f49d8ab	Use cache store remove base slaved (#13329 ) This comes from two identical definitions in each of the base stores, and means the base slaved store is now empty and can be removed.	2022-07-21 11:51:30 +01:00
Patrick Cloke	a6895dd576	Add type annotations to `trace` decorator. (#13328 ) Functions that are decorated with `trace` are now properly typed and the type hints for them are fixed.	2022-07-19 14:14:30 -04:00
David Robertson	b977867358	Rate limit joins per-room (#13276 )	2022-07-19 11:45:17 +00:00
Erik Johnston	f721f1baba	Revert "Make all `process_replication_rows` methods async (#13304 )" (#13312 ) This reverts commit `5d4028f217`.	2022-07-18 14:28:14 +01:00
Nick Mills-Barrett	5d4028f217	Make all `process_replication_rows` methods async (#13304 ) More prep work for asynchronous caching, also makes all process_replication_rows methods consistent (presence handler already is so). Signed off by Nick @ Beeper (@Fizzadar)	2022-07-17 22:19:43 +01:00
Sean Quah	1391a76cd2	Faster room joins: fix race in recalculation of current room state (#13151 ) Bounce recalculation of current state to the correct event persister and move recalculation of current state into the event persistence queue, to avoid concurrent updates to a room's current state. Also give recalculation of a room's current state a real stream ordering. Signed-off-by: Sean Quah <seanq@matrix.org>	2022-07-07 12:19:31 +00:00
Sean Quah	68db233f0c	Handle race between persisting an event and un-partial stating a room (#13100 ) Whenever we want to persist an event, we first compute an event context, which includes the state at the event and a flag indicating whether the state is partial. After a lot of processing, we finally try to store the event in the database, which can fail for partial state events when the containing room has been un-partial stated in the meantime. We detect the race as a foreign key constraint failure in the data store layer and turn it into a special `PartialStateConflictError` exception, which makes its way up to the method in which we computed the event context. To make things difficult, the exception needs to cross a replication request: `/fed_send_events` for events coming over federation and `/send_event` for events from clients. We transport the `PartialStateConflictError` as a `409 Conflict` over replication and turn `409`s back into `PartialStateConflictError`s on the worker making the request. All client events go through `EventCreationHandler.handle_new_client_event`, which is called in a lot of places. Instead of trying to update all the code which creates client events, we turn the `PartialStateConflictError` into a `429 Too Many Requests` in `EventCreationHandler.handle_new_client_event` and hope that clients take it as a hint to retry their request. On the federation event side, there are 7 places which compute event contexts. 4 of them use outlier event contexts: `FederationEventHandler._auth_and_persist_outliers_inner`, `FederationHandler.do_knock`, `FederationHandler.on_invite_request` and `FederationHandler.do_remotely_reject_invite`. These events won't have the partial state flag, so we do not need to do anything for them. The remaining 3 paths which create events are `FederationEventHandler.process_remote_join`, `FederationEventHandler.on_send_membership_event` and `FederationEventHandler._process_received_pdu`. We can't experience the race in `process_remote_join`, unless we're handling an additional join into a partial state room, which currently blocks, so we make no attempt to handle it correctly. `on_send_membership_event` is only called by `FederationServer._on_send_membership_event`, so we catch the `PartialStateConflictError` there and retry just once. `_process_received_pdu` is called by `on_receive_pdu` for incoming events and `_process_pulled_event` for backfill. The latter should never try to persist partial state events, so we ignore it. We catch the `PartialStateConflictError` in `on_receive_pdu` and retry just once. Referring to the graph of code paths in https://github.com/matrix-org/synapse/issues/12988#issuecomment-1156857648 may make the above make more sense. Signed-off-by: Sean Quah <seanq@matrix.org>	2022-07-05 16:12:52 +01:00
David Robertson	97e9fbe1b2	Type annotations in `synapse.databases.main.devices` (#13025 ) Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>	2022-06-15 15:20:04 +00:00
Patrick Cloke	cf05258f76	Remove groups replication code. (#12900 ) The replication logic for groups is no longer used, so the message passing infrastructure can be removed.	2022-05-31 13:04:08 -04:00
Erik Johnston	1e453053cb	Rename storage classes (#12913 )	2022-05-31 12:17:50 +00:00
reivilibre	39dee30f01	Send `USER_IP` commands on a different Redis channel, in order to reduce traffic to workers that do not process these commands. (#12809 )	2022-05-20 15:28:23 +01:00
reivilibre	177b884ad7	Lay some foundation work to allow workers to only subscribe to some kinds of messages, reducing replication traffic. (#12672 )	2022-05-19 16:29:08 +01:00
Andrew Morgan	83be72d76c	Add `StreamKeyType` class and replace string literals with constants (#12567 )	2022-05-16 15:35:31 +00:00
Sean Quah	a559c8b0d9	Respect the `@cancellable` flag for `ReplicationEndpoint`s (#12700 ) While `ReplicationEndpoint`s register themselves via `JsonResource`, they pass a method that calls the handler, instead of the handler itself, to `register_paths`. As a result, `JsonResource` will not correctly pick up the `@cancellable` flag and we have to apply it ourselves. Signed-off-by: Sean Quah <seanq@element.io>	2022-05-11 12:25:39 +01:00
Shay	d80a7ab151	Update `replication.md` with info on TCP module structure (#12621 )	2022-05-09 14:46:43 -07:00
Šimon Brandner	ef86cf3d28	Update `_on_new_receipts()` to work with MSC2285 changes. (#12636 )	2022-05-05 13:25:51 +00:00
Erik Johnston	c0379d6e5b	Reduce log spam when running multiple event persisters (#12610 )	2022-05-05 10:20:23 +01:00
Erik Johnston	d1cd96ce29	Add opentracing spans to calls to external cache (#12380 )	2022-04-07 13:18:29 +01:00
Sean Quah	800ba87cc8	Refactor and convert `Linearizer` to async (#12357 ) Refactor and convert `Linearizer` to async. This makes a `Linearizer` cancellation bug easier to fix. Also refactor to use an async context manager, which eliminates an unlikely footgun where code that doesn't immediately use the context manager could forget to release the lock. Signed-off-by: Sean Quah <seanq@element.io>	2022-04-05 15:43:52 +01:00
Erik Johnston	66053b6bfb	Prefill more stream change caches. (#12372 )	2022-04-05 14:26:41 +01:00
Erik Johnston	b446c99ac9	Prefill the device_list_stream_cache (#12367 ) * Prefill the device_list_stream_cache * Newsfile * Newsfile	2022-04-04 20:12:25 +01:00
Erik Johnston	5c9e39e619	Track device list updates per room. (#12321 ) This is a first step in dealing with #7721. The idea is basically that rather than calculating the full set of users a device list update needs to be sent to up front, we instead simply record the rooms the user was in at the time of the change. This will allow a few things: 1. we can defer calculating the set of remote servers that need to be poked about the change; and 2. during `/sync` and `/keys/changes` we can also avoid calculating users who share rooms with other users, and instead just look at the rooms that have changed. However, care needs to be taken to correctly handle server downgrades. As such this PR writes to both `device_lists_changes_in_room` and the `device_lists_outbound_pokes` table synchronously. In a future release we can then bump the database schema compat version to `69` and then we can assume that the new `device_lists_changes_in_room` exists and is handled. There is a temporary option to disable writing to `device_lists_outbound_pokes` synchronously, allowing us to test the new code path does work (and by implication upgrading to a future release and downgrading to this one will work correctly). Note: Ideally we'd do the calculation of room to servers on a worker (e.g. the background worker), but currently only master can write to the `device_lists_outbound_pokes` table.	2022-04-04 15:25:20 +01:00
reivilibre	f871222880	Move `update_client_ip` background job from the main process to the background worker. (#12251 )	2022-04-01 13:08:55 +01:00
David Robertson	a2b00a4486	Bump `black` and `click` versions (#12320 )	2022-03-29 10:41:19 +00:00
reivilibre	4a53f35737	Improve code documentation for the typing stream over replication. (#12211 )	2022-03-11 14:00:15 +00:00
Patrick Cloke	3e4af36bc8	Rename get_tcp_replication to get_replication_command_handler. (#12192 ) Since the object it returns is a ReplicationCommandHandler. This is clean-up from adding support to Redis where the command handler was added as an additional layer of abstraction from the TCP protocol.	2022-03-10 13:01:56 +00:00
Nick Mills-Barrett	180d8ff0d4	Retry some http replication failures (#12182 ) This allows for the target process to be down for around a minute which provides time for restarts during synapse upgrades/config updates. Closes: #12178 Signed off by Nick Mills-Barrett nick@beeper.com	2022-03-09 14:53:28 +00:00
Patrick Cloke	d8bab6793c	Fix incorrect type hints for txredis. (#12042 ) Some properties were marked as RedisProtocol instead of ConnectionHandler, which wraps RedisProtocol instance(s).	2022-03-08 07:26:05 -05:00
Erik Johnston	423cca9efe	Spread out sending device lists to remote hosts (#12132 )	2022-03-04 11:48:15 +00:00
Richard van der Hoff	e24ff8ebe3	Remove `HomeServer.get_datastore()` (#12031 ) The presence of this method was confusing, and mostly present for backwards compatibility. Let's get rid of it. Part of #11733	2022-02-23 11:04:02 +00:00
Erik Johnston	6d14b3dabf	Better error message when failing to request from another process (#12060 )	2022-02-22 15:52:08 +00:00
Patrick Cloke	d0e78af35e	Add missing type hints to synapse.replication. (#11938 )	2022-02-08 11:03:08 -05:00
Patrick Cloke	6c0984e3f0	Remove unnecessary ignores due to Twisted upgrade. (#11939 ) Twisted 22.1.0 fixed some internal type hints, allowing Synapse to remove ignore calls for parameters to connectTCP.	2022-02-08 09:15:59 -05:00
Patrick Cloke	63d90f10ec	Add missing type hints to synapse.replication.http. (#11856 )	2022-02-08 07:44:39 -05:00
Richard van der Hoff	2277275485	Stop reading from `event_reference_hashes` (#11794 ) Preparation for dropping this table altogether. Part of #6574.	2022-01-21 09:18:10 +00:00
Patrick Cloke	10a88ba91c	Use auto_attribs/native type hints for attrs classes. (#11692 )	2022-01-13 13:49:28 +00:00
Richard van der Hoff	2359ee3864	Remove redundant `get_current_events_token` (#11643 ) * Push `get_room_{min,max_stream_ordering}` into StreamStore Both implementations of this are identical, so we may as well push it down and get rid of the abstract base class nonsense. * Remove redundant `StreamStore` class This is empty now * Remove redundant `get_current_events_token` This was an exact duplicate of `get_room_max_stream_ordering`, so let's get rid of it. * newsfile	2022-01-04 16:10:27 +00:00
Patrick Cloke	cbd82d0b2d	Convert all namedtuples to attrs. (#11665 ) To improve type hints throughout the code.	2021-12-30 18:47:12 +00:00
Sean Quah	5305a5e881	Type hint the constructors of the data store classes (#11555 )	2021-12-13 17:05:00 +00:00
Quentin Gliech	a15a893df8	Save the OIDC session ID (sid) with the device on login (#11482 ) As a step towards allowing back-channel logout for OIDC.	2021-12-06 12:43:06 -05:00
Sean Quah	ffd858aa68	Add type hints to `synapse/storage/databases/main/events_worker.py` (#11411 ) Also refactor the stream ID trackers/generators a bit and try to document them better.	2021-11-26 18:41:31 +00:00
Patrick Cloke	5cace20bf1	Add missing type hints to `synapse.app`. (#11287 )	2021-11-10 15:06:54 -05:00
Nick Barrett	af54167516	Enable passing typing stream writers as a list. (#11237 ) This makes the typing stream writer config match the other stream writers that only currently support a single worker.	2021-11-03 14:25:47 +00:00
Brendan Abolivier	c7a5e49664	Implement an `on_new_event` callback (#11126 ) Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>	2021-10-26 15:17:36 +02:00
Sean Quah	2b82ec425f	Add type hints for most `HomeServer` parameters (#11095 )	2021-10-22 18:15:41 +01:00
Sean Quah	6a67f3786a	Fix logging context warnings when losing replication connection (#10984 ) Instead of triggering `__exit__` manually on the replication handler's logging context, use it as a context manager so that there is an `__enter__` call to balance the `__exit__`.	2021-10-15 13:10:58 +01:00
Sean Quah	6b18eb4430	Fix opentracing and Prometheus metrics for replication requests (#10996 ) This commit fixes two bugs to do with decorators not instrumenting `ReplicationEndpoint`'s `send_request` correctly. There are two decorators on `send_request`: Prometheus' `Gauge.track_inprogress()` and Synapse's `opentracing.trace`. `Gauge.track_inprogress()` does not have any support for async functions when used as a decorator. Since async functions behave like regular functions that return coroutines, only the creation of the coroutine was covered by the metric and none of the actual body of `send_request`. `Gauge.track_inprogress()` returns a regular, non-async function wrapping `send_request`, which is the source of the next bug. The `opentracing.trace` decorator would normally handle async functions correctly, but since the wrapped `send_request` is a non-async function, the decorator ends up suffering from the same issue as `Gauge.track_inprogress()`: the opentracing span only measures the creation of the coroutine and none of the actual function body. Using `Gauge.track_inprogress()` as a context manager instead of a decorator resolves both bugs.	2021-10-12 11:23:46 +01:00
David Robertson	51a5da74cc	Annotate synapse.storage.util (#10892 ) Also mark `synapse.streams` as having no untyped defs Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>	2021-10-08 14:25:16 +00:00
Patrick Cloke	f4b1a9a527	Require direct references to configuration variables. (#10985 ) This removes the magic allowing accessing configurable variables directly from the config object. It is now required that a specific configuration class is used (e.g. `config.foo` must be replaced with `config.server.foo`).	2021-10-06 10:47:41 -04:00
David Robertson	29364145b2	Pass str to twisted's IReactorTCP (#10895 ) This follows a correction made in twisted/twisted#1664 and should fix our Twisted Trial CI job. Until that change is in a twisted release, we'll have to ignore the type of the `host` argument. I've raised #10899 to remind us to review the issue in a few months' time.	2021-09-30 12:51:47 +01:00
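
Note on 6b18eb4430 (#10996): the decorator problem described in that commit message can be reproduced with a small standalone sketch. The `InProgress` class below is a hypothetical stand-in written for illustration; it is not the Prometheus client API and not Synapse code, and the function names are invented. It only assumes what the message states: a synchronous wrapper around an async function covers the creation of the coroutine and none of its body.

```python
import asyncio
from contextlib import contextmanager
from functools import wraps


class InProgress:
    """Hypothetical in-progress counter (illustration only, not Prometheus)."""

    def __init__(self):
        self.value = 0

    @contextmanager
    def track(self):
        # Count one unit of work for as long as the `with` block is active.
        self.value += 1
        try:
            yield
        finally:
            self.value -= 1

    def as_decorator(self, func):
        # A plain (non-async) wrapper: for an async `func`, the call inside
        # the `with` block only creates the coroutine object.
        @wraps(func)
        def wrapper(*args, **kwargs):
            with self.track():
                return func(*args, **kwargs)

        return wrapper


gauge = InProgress()


@gauge.as_decorator
async def send_decorated():
    await asyncio.sleep(0)
    # Tracking already ended when the coroutine was created.
    return gauge.value


async def send_with_context_manager():
    # Context manager inside the body: the awaited work is covered.
    with gauge.track():
        await asyncio.sleep(0)
        return gauge.value


def run_demo():
    return asyncio.run(send_decorated()), asyncio.run(send_with_context_manager())
```

Here `run_demo()` reports the counter value seen inside each function body: the decorated variant sees no request in progress, while the context-manager variant sees one, which matches the fix described in the commit message.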


614 Commits