forked-synapse

mirror of https://mau.dev/maunium/synapse.git synced 2024-10-01 01:36:05 -04:00

Author	SHA1	Message	Date
Richard van der Hoff	9635822cc1	Clarify docs for some room state functions (#16950 ) State before an event is different to state after that event, and people tend to assume the wrong one.	2024-03-19 17:16:37 +00:00
Quentin Gliech	4af33015af	Fix joining remote rooms when a `on_new_event` callback is registered (#16973 ) Since Synapse 1.76.0, any module which registers a `on_new_event` callback would brick the ability to join remote rooms. This is because this callback tried to get the full state of the room, which would end up in a deadlock. Related: https://github.com/matrix-org/synapse-auto-accept-invite/issues/18 The following module would brick the ability to join remote rooms: ```python from typing import Any, Dict, Literal, Union import logging from synapse.module_api import ModuleApi, EventBase logger = logging.getLogger(__name__) class MyModule: def __init__(self, config: None, api: ModuleApi): self._api = api self._config = config self._api.register_third_party_rules_callbacks( on_new_event=self.on_new_event, ) async def on_new_event(self, event: EventBase, _state_map: Any) -> None: logger.info(f"Received new event: {event}") @staticmethod def parse_config(_config: Dict[str, Any]) -> None: return None ``` This is technically a breaking change, as we are now passing partial state on the `on_new_event` callback. However, this callback was broken for federated rooms since 1.76.0, and local rooms have full state anyway, so it's unlikely that it would change anything.	2024-03-06 16:00:20 +01:00
Erik Johnston	23740eaa3d	Correctly mention previous copyright (#16820 ) During the migration the automated script to update the copyright headers accidentally got rid of some of the existing copyright lines. Reinstate them.	2024-01-23 11:26:48 +00:00
Patrick Cloke	8e1e62c9e0	Update license headers	2023-11-21 15:29:58 -05:00
Patrick Cloke	e3e0ae4ab1	Convert state delta processing from a dict to attrs. (#16469 ) For improved type checking & memory usage.	2023-10-16 07:35:22 -04:00
Patrick Cloke	f84da3c32e	Add a cache around server ACL checking (#16360 ) * Pre-compiles the server ACLs onto an object per room and invalidates them when new events come in. * Converts the server ACL checking into Rust.	2023-09-26 11:57:50 -04:00
Patrick Cloke	7ec0a141b4	Convert more cached return values to immutable types (#16356 )	2023-09-20 07:48:55 -04:00
Erik Johnston	fc1e534e41	Speed up updating state in large rooms (#15971 ) This should speed up updating state in rooms with lots of state.	2023-07-20 15:51:28 +01:00
Gabriel Féron	daf3a67908	Add get_canonical_room_alias to module API (#15450 ) Co-authored-by: Boxdot <d@zerovolt.org>	2023-05-31 09:18:37 -04:00
Eric Eastwood	379eb2d7ab	Fix `@trace` not wrapping some state methods that return coroutines correctly (#15647 ) ``` 2023-05-21 09:30:09,288 - synapse.logging.opentracing - 940 - ERROR - POST-1 - @trace may not have wrapped StateStorageController.get_state_for_groups correctly! The function is not async but returned a coroutine ``` Tracing instrumentation for these functions originally introduced in https://github.com/matrix-org/synapse/pull/15610	2023-05-23 12:26:25 -05:00
Eric Eastwood	703a8f9c67	Instrument `state` and `state_group` storage related things (tracing) (#15610 ) Instrument `state` and `state_group` storage related things (tracing) so it's a little more clear where these database transactions are coming from as there is a lot of wires crossing in these functions. Part of `/messages` performance investigation: https://github.com/matrix-org/synapse/issues/13356	2023-05-19 12:26:58 -05:00
Sean Quah	d0c713cc85	Return read-only collections from `@cached` methods (#13755 ) It's important that collections returned from `@cached` methods are not modified, otherwise future retrievals from the cache will return the modified collection. This applies to the return values from `@cached` methods and the values inside the dictionaries returned by `@cachedList` methods. It's not necessary for the dictionaries returned by `@cachedList` methods themselves to be read-only. Signed-off-by: Sean Quah <seanq@matrix.org> Co-authored-by: David Robertson <davidr@element.io>	2023-02-10 23:29:00 +00:00
Sean Quah	0a686d1d13	Faster joins: Refactor handling of servers in room (#14954 ) Ensure that the list of servers in a partial state room always contains the server we joined off. Also refactor `get_partial_state_servers_at_join` to return `None` when the given room is no longer partial stated, to explicitly indicate when the room has partial state. Otherwise it's not clear whether an empty list means that the room has full state, or the room is partial stated, but the server we joined off told us that there are no servers in the room. Signed-off-by: Sean Quah <seanq@matrix.org>	2023-02-03 15:39:59 +00:00
Sean Quah	2ec9c58496	Faster joins: Update room stats and the user directory on workers when finishing join (#14874 ) * Faster joins: Update room stats and user directory on workers when done When finishing a partial state join to a room, we update the current state of the room without persisting additional events. Workers receive notice of the current state update over replication, but neglect to wake the room stats and user directory updaters, which then get incidentally triggered the next time an event is persisted or an unrelated event persister sends out a stream position update. We wake the room stats and user directory updaters at the appropriate time in this commit. Part of #12814 and #12815. Signed-off-by: Sean Quah <seanq@matrix.org> * fixup comment Signed-off-by: Sean Quah <seanq@matrix.org>	2023-01-23 10:31:36 +00:00
David Robertson	b5b5f66084	Move `StateFilter` to `synapse.types` (#14668 ) * Move `StateFilter` to `synapse.types` * Changelog	2022-12-12 16:19:30 +00:00
Erik Johnston	3dfc4a08dc	Fix performance regression in `get_users_in_room` (#13972 ) Fixes #13942. Introduced in #13575. Basically, let's only get the ordered set of hosts out of the DB if we need an ordered set of hosts. Since we split the function up the caching won't be as good, but I think it will still be fine as e.g. multiple backfill requests for the same room will hit the cache.	2022-09-30 13:15:32 +01:00
Sean Quah	f49f73c0da	Faster room joins: Avoid blocking `/keys/changes` (#13888 ) Part of the work for #12993. Once #12993 is fully resolved, we expect `/keys/changes` to behave sensibly when joined to a room with partial state. Signed-off-by: Sean Quah <seanq@matrix.org>	2022-09-23 17:55:15 +01:00
Sean Quah	03c2bfb7f8	Send device list updates out to servers in partially joined rooms (#13874 ) Use the provided list of servers in the room from the `/send_join` response, since we will not know which users are in the room. This isn't sufficient to ensure that all remote servers receive the right device list updates, since the `/send_join` response may be inaccurate or we may calculate the membership state of new users in the room incorrectly. Signed-off-by: Sean Quah <seanq@matrix.org>	2022-09-23 13:44:03 +01:00
reivilibre	d3d9ca156e	Cancel the processing of key query requests when they time out. (#13680 )	2022-09-07 12:03:32 +01:00
Eric Eastwood	51d732db3b	Optimize how we calculate `likely_domains` during backfill (#13575 ) Optimize how we calculate `likely_domains` during backfill because I've seen this take 17s in production just to `get_current_state` which is used to `get_domains_from_state` (see case [2. Loading tons of events in the `/messages` investigation issue](https://github.com/matrix-org/synapse/issues/13356)). There are 3 ways we currently calculate hosts that are in the room: 1. `get_current_state` -> `get_domains_from_state` - Used in `backfill` to calculate `likely_domains` and `/timestamp_to_event` because it was cargo-culted from `backfill` - This one is being eliminated in favor of `get_current_hosts_in_room` in this PR 🕳 1. `get_current_hosts_in_room` - Used for other federation things like sending read receipts and typing indicators 1. `get_hosts_in_room_at_events` - Used when pushing out events over federation to other servers in the `_process_event_queue_loop` Fix https://github.com/matrix-org/synapse/issues/13626 Part of https://github.com/matrix-org/synapse/issues/13356 Mentioned in [internal doc](https://docs.google.com/document/d/1lvUoVfYUiy6UaHB6Rb4HicjaJAU40-APue9Q4vzuW3c/edit#bookmark=id.2tvwz3yhcafh) ### Query performance #### Before The query from `get_current_state` sucks just because we have to get all 80k events. And we see almost the exact same performance locally trying to get all of these events (16s vs 17s): ``` synapse=# SELECT type, state_key, event_id FROM current_state_events WHERE room_id = '!OGEhHVWSdvArJzumhm:matrix.org'; Time: 16035.612 ms (00:16.036) synapse=# SELECT type, state_key, event_id FROM current_state_events WHERE room_id = '!OGEhHVWSdvArJzumhm:matrix.org'; Time: 4243.237 ms (00:04.243) ``` But what about `get_current_hosts_in_room`: When there is 8M rows in the `current_state_events` table, the previous query in `get_current_hosts_in_room` took 13s from complete freshness (when the events were first added). 
But takes 930ms after a Postgres restart or 390ms if running back to back to back. ```sh $ psql synapse synapse=# \timing on synapse=# SELECT COUNT(DISTINCT substring(state_key FROM '@[^:]*:(.*)$')) FROM current_state_events WHERE type = 'm.room.member' AND membership = 'join' AND room_id = '!OGEhHVWSdvArJzumhm:matrix.org'; count ------- 4130 (1 row) Time: 13181.598 ms (00:13.182) synapse=# SELECT COUNT(*) from current_state_events where room_id = '!OGEhHVWSdvArJzumhm:matrix.org'; count ------- 80814 synapse=# SELECT COUNT(*) from current_state_events; count --------- 8162847 synapse=# SELECT pg_size_pretty( pg_total_relation_size('current_state_events') ); pg_size_pretty ---------------- 4702 MB ``` #### After I'm not sure how long it takes from complete freshness as I only really get that opportunity once (maybe restarting computer but that's cumbersome) and it's not really relevant to normal operating times. Maybe you get closer to the fresh times the more access variability there is so that Postgres caches aren't as exact. Update: The longest I've seen this run for is 6.4s and 4.5s after a computer restart. After a Postgres restart, it takes 330ms and running back to back takes 260ms. ```sh $ psql synapse synapse=# \timing on Timing is on. synapse=# SELECT substring(c.state_key FROM '@[^:]*:(.*)$') as host FROM current_state_events c /* Get the depth of the event from the events table */ INNER JOIN events AS e USING (event_id) WHERE c.type = 'm.room.member' AND c.membership = 'join' AND c.room_id = '!OGEhHVWSdvArJzumhm:matrix.org' GROUP BY host ORDER BY min(e.depth) ASC; Time: 333.800 ms ``` #### Going further To improve things further we could add a `limit` parameter to `get_current_hosts_in_room`. Realistically, we don't need 4k domains to choose from because there is no way we're going to query that many before we a) probably get an answer or b) we give up. 
Another thing we can do is optimize the query to use a index skip scan: - https://wiki.postgresql.org/wiki/Loose_indexscan - Index Skip Scan, https://commitfest.postgresql.org/37/1741/ - https://www.timescale.com/blog/how-we-made-distinct-queries-up-to-8000x-faster-on-postgresql/	2022-08-30 01:38:14 -05:00
Sean Quah	84169a82dc	Avoid blocking lazy-loading `/sync`s during partial joins (#13477 ) Use a state filter or accept partial state in a few places where we request state, to avoid blocking. To make lazy-loading `/sync`s work, we need to provide the memberships of event senders, which are not guaranteed to be in the room state. Instead we dig through auth events for memberships to present to clients. The auth events of an event are guaranteed to contain a passable membership event, otherwise the event would have been rejected. Note that this only covers the common code paths encountered during testing. There has been no exhaustive checking of all sync code paths. Fixes #13146. Signed-off-by: Sean Quah <seanq@matrix.org>	2022-08-18 11:53:02 +01:00
Eric Eastwood	0a4efbc1dd	Instrument the federation/backfill part of `/messages` (#13489 ) Instrument the federation/backfill part of `/messages` so it's easier to follow what's going on in Jaeger when viewing a trace. Split out from https://github.com/matrix-org/synapse/pull/13440 Follow-up from https://github.com/matrix-org/synapse/pull/13368 Part of https://github.com/matrix-org/synapse/issues/13356	2022-08-16 12:39:40 -05:00
reivilibre	c3516e9dec	Faster room joins: make `/joined_members` block whilst the room is partial stated. (#13514 )	2022-08-16 13:16:56 +01:00
Eric Eastwood	92d21faf12	Instrument `/messages` for understandable traces in Jaeger (#13368 ) In Jaeger: - Before: huge list of uncategorized database calls - After: nice and collapsible into units of work	2022-08-03 10:57:38 -05:00
Sean Quah	224d792dd7	Refactor `_resolve_state_at_missing_prevs` to return an `EventContext` (#13404 ) Previously, `_resolve_state_at_missing_prevs` returned the resolved state before an event and a partial state flag. These were unwieldy to carry around and would only ever be used to build an event context. Build the event context directly instead. Signed-off-by: Sean Quah <seanq@matrix.org>	2022-08-01 13:53:56 +01:00
Sean Quah	335ebb21cc	Faster room joins: avoid blocking when pulling events with missing prevs (#13355 ) Avoid blocking on full state in `_resolve_state_at_missing_prevs` and return a new flag indicating whether the resolved state is partial. Thread that flag around so that it makes it into the event context. Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>	2022-07-26 12:39:23 +01:00
Erik Johnston	0731e0829c	Don't pull out the full state when storing state (#13274 )	2022-07-15 12:59:45 +00:00
Richard van der Hoff	7c6b2204d1	Faster joins: add issue links to the TODOs (#13004 ) ... to help us keep track of these things	2022-06-09 10:13:03 +00:00
Erik Johnston	44de53bb79	Reduce state pulled from DB due to sending typing and receipts over federation (#12964 ) Reducing the amount of state we pull from the DB is useful as fetching state is expensive in terms of DB, CPU and memory.	2022-06-06 16:46:11 +01:00
Erik Johnston	e3163e2e11	Reduce the amount of state we pull from the DB (#12811 )	2022-06-06 09:24:12 +01:00
Erik Johnston	888a29f412	Wait for lazy join to complete when getting current state (#12872 )	2022-06-01 16:02:53 +01:00
Erik Johnston	1e453053cb	Rename storage classes (#12913 )	2022-05-31 12:17:50 +00:00

32 Commits