forked-synapse

mirror of https://mau.dev/maunium/synapse.git synced 2024-10-01 01:36:05 -04:00

Author	SHA1	Message	Date
Erik Johnston	c900d18647	Fix OIDC login regression (#17031 ) Requests may require a User-Agent header, and the change in #16972 accidentally removed it, resulting in requests getting rejected causing login to fail.	2024-03-26 13:26:46 +00:00
Richard van der Hoff	b5322b4daf	Ensure that pending to-device events are sent over federation at startup (#16925 ) Fixes https://github.com/element-hq/synapse/issues/16680, as well as a related bug, where servers which we had never successfully sent an event to would not be retried. In order to fix the case of pending to-device messages, we hook into the existing `wake_destinations_needing_catchup` process, by extending it to look for destinations that have pending to-device messages. The federation transmission loop then attempts to send the pending to-device messages as normal.	2024-03-22 13:24:11 +00:00
Mathieu Velten	b7af076ab5	Add OIDC config to add extra parameters to the authorize URL (#16971 )	2024-03-22 10:35:11 +00:00
SpiritCroc	9ad49e7ecf	Do not refuse to set read_marker if previous event_id is in wrong room (#16990 )	2024-03-21 18:43:07 +00:00
Hanadi	f7a3ebe44d	Fix reject knocks on deactivating account (#17010 )	2024-03-21 18:05:54 +00:00
Mathieu Velten	3ab9e6d524	OIDC: try to JWT decode userinfo response if JSON parsing failed (#16972 )	2024-03-21 17:49:44 +00:00
Shay	cf5adc80e1	Update power level default for public rooms (#16907 )	2024-03-19 17:55:31 +00:00
Shay	8fb5b0f335	Improve event validation (#16908 ) As the title states.	2024-03-19 17:52:53 +00:00
Mathieu Velten	74ab329eaa	Pass module API to OIDC mapping provider (#16974 ) As done for SAML mapping provider, let's pass the module API to the OIDC one so the mapper can do more logic in its code.	2024-03-19 17:20:10 +00:00
Richard van der Hoff	9635822cc1	Clarify docs for some room state functions (#16950 ) State before an event is different to state after that event, and people tend to assume the wrong one.	2024-03-19 17:16:37 +00:00
Richard van der Hoff	52f456a822	`/sync`: Fix edge-case in calculating the "device_lists" response (#16949 ) Fixes https://github.com/element-hq/synapse/issues/16948. If the `join` and the `leave` are in the same sync response, we need to count them as a "left" user.	2024-03-14 17:34:19 +00:00
Richard van der Hoff	6d5bafb2c8	Split up `SyncHandler.compute_state_delta` (#16929 ) This is a huge method, which melts my brain. This is a non-functional change which lays some groundwork for future work in this area.	2024-03-14 17:18:48 +00:00
Mathieu Velten	cb562d73aa	Improve lock performance when a lot of locks are waiting (#16840 ) When a lot of locks are waiting for a single lock, notifying all locks independently with `call_later` on each release is really costly and incurs some kind of async contention, where the CPU is spinning a lot for not much. The included test is taking around 30s before the change, and 0.5s after. It was found following failing tests with https://github.com/element-hq/synapse/pull/16827.	2024-03-14 13:49:54 +00:00
dependabot[bot]	9b5eef95ad	Bump ruff from 0.1.14 to 0.3.2 (#16994 )	2024-03-13 17:06:23 +00:00
dependabot[bot]	e161103b46	Bump mypy from 1.5.1 to 1.8.0 (#16901 )	2024-03-13 17:05:57 +00:00
dependabot[bot]	1e68b56a62	Bump black from 23.10.1 to 24.2.0 (#16936 )	2024-03-13 16:46:44 +00:00
Gerrit Gogel	1f88790764	Prevent locking up while processing batched_auth_events (#16968 ) This PR aims to fix #16895, caused by a regression in #7 and not fixed by #16903. The PR #16903 only fixes a starvation issue, where the CPU isn't released. There is a second issue, where the execution is blocked. This theory is supported by the flame graphs provided in #16895 and the fact that I see the CPU usage reducing and far below the limit. Since the changes in #7, the method `check_state_independent_auth_rules` is called with the additional parameter `batched_auth_events`: `6fa13b4f92/synapse/handlers/federation_event.py (L1741-L1743)` It makes the execution enter this if clause, introduced with #15195 `6fa13b4f92/synapse/event_auth.py (L178-L189)` There are two issues in the above code snippet. First, there is the blocking issue. I'm not entirely sure if this is a deadlock, starvation, or something different. In the beginning, I thought the copy operation was responsible. It wasn't. Then I investigated the nested `store.get_events` inside the function `update`. This was also not causing the blocking issue. Only when I replaced the set difference operation (`-` ) with a list comprehension, the blocking was resolved. Creating and comparing sets with a very large amount of events seems to be problematic. This is how the flamegraph looks now while persisting outliers. As you can see, the execution no longer locks up in the above function. ![output_2024-02-28_13-59-40](https://github.com/element-hq/synapse/assets/13143850/6db9c9ac-484f-47d0-bdde-70abfbd773ec) Second, the copying here doesn't serve any purpose, because only a shallow copy is created. This means the same objects from the original dict are referenced. This fails the intention of protecting these objects from mutation. The review of the original PR https://github.com/matrix-org/synapse/pull/15195 had an extensive discussion about this matter. 
Various approaches to copying the auth_events were attempted: 1) Implementing a deepcopy caused issues due to builtins.EventInternalMetadata not being pickleable. 2) Creating a dict with new objects akin to a deepcopy. 3) Creating a dict with new objects containing only necessary attributes. Concluding, there is no easy way to create an actual copy of the objects. Opting for a deepcopy can significantly strain memory and CPU resources, making it an inefficient choice. I don't see why the copy is necessary in the first place. Therefore I'm proposing to remove it altogether. After these changes, I was able to successfully join these rooms, without the main worker locking up: - #synapse:matrix.org - #element-android:matrix.org - #element-web:matrix.org - #ecips:matrix.org - #ipfs-chatter:ipfs.io - #python:matrix.org - #matrix:matrix.org	2024-03-12 15:07:36 +00:00
Alexander Fechler	48f59d3806	deactivated flag refactored to filter deactivated users. (#16874 ) Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>	2024-03-11 16:08:04 +00:00
Patrick Cloke	696cc9e802	Stabilize support for Retry-After header (MSC4014) (#16947 )	2024-03-08 09:33:46 +00:00
Quentin Gliech	4af33015af	Fix joining remote rooms when a `on_new_event` callback is registered (#16973 ) Since Synapse 1.76.0, any module which registers a `on_new_event` callback would brick the ability to join remote rooms. This is because this callback tried to get the full state of the room, which would end up in a deadlock. Related: https://github.com/matrix-org/synapse-auto-accept-invite/issues/18 The following module would brick the ability to join remote rooms:
```python
from typing import Any, Dict, Literal, Union
import logging

from synapse.module_api import ModuleApi, EventBase

logger = logging.getLogger(__name__)


class MyModule:
    def __init__(self, config: None, api: ModuleApi):
        self._api = api
        self._config = config

        self._api.register_third_party_rules_callbacks(
            on_new_event=self.on_new_event,
        )

    async def on_new_event(self, event: EventBase, _state_map: Any) -> None:
        logger.info(f"Received new event: {event}")

    @staticmethod
    def parse_config(_config: Dict[str, Any]) -> None:
        return None
```
This is technically a breaking change, as we are now passing partial state on the `on_new_event` callback. However, this callback was broken for federated rooms since 1.76.0, and local rooms have full state anyway, so it's unlikely that it would change anything.	2024-03-06 16:00:20 +01:00
Andrew Morgan	8a05304222	Revert "Improve DB performance of calculating badge counts for push. (#16756 )" (#16979 )	2024-03-05 12:27:27 +00:00
Erik Johnston	cdbbf3653d	Don't lock up when joining large rooms (#16903 ) Co-authored-by: Andrew Morgan <andrew@amorgan.xyz>	2024-02-20 14:29:18 +00:00
kegsay	c51a2240d1	bugfix: always prefer unthreaded receipt when >1 exist (MSC4102) (#16927 ) Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>	2024-02-20 14:12:06 +00:00
Remi Rampin	0621e8eb0e	Add metric for emails sent (#16881 ) This adds a counter `synapse_emails_sent_total` for emails sent. They are broken down by `type`, which are `password_reset`, `registration`, `add_threepid`, `notification` (matching the methods of `Mailer`).	2024-02-14 15:30:03 +00:00
Erik Johnston	7b4d7429f8	Don't invalidate the entire event cache when we purge history (#16905 ) We do this by adding support to the LRU cache for "extra indices" based on the cached value. This allows us to efficiently map from room ID to the cached events and only invalidate those.	2024-02-13 13:24:11 +00:00
Erik Johnston	01910b981f	Add a config to not send out device list updates for specific users (#16909 ) List of users not to send out device list updates for when they register new devices. This is useful to handle bot accounts. This is undocumented as it's mostly a hack to test on matrix.org. Note: This will still send out device list updates if the device is later updated, e.g. end to end keys are added.	2024-02-13 13:23:03 +00:00
Erik Johnston	ea1b30940e	Merge remote-tracking branch 'origin/release-v1.101' into develop	2024-02-09 10:52:35 +00:00
Erik Johnston	bfa93d1d3b	Only do one concurrent fetch per server in keyring (#16894 ) Otherwise if we've stacked a bunch of requests for the keys of a server, we'll end up sending lots of concurrent requests for its keys, needlessly.	2024-02-09 10:51:11 +00:00
Erik Johnston	02a147039c	Increase batching when fetching auth chains (#16893 ) This basically reverts a change that was in https://github.com/element-hq/synapse/pull/16833, where we reduced the batching. The smaller batching can cause performance issues on busy servers and databases.	2024-02-09 10:51:00 +00:00
David Baker	71ca199165	Accept unprefixed form of MSC3981 recurse parameter (#16842 ) Now that the MSC3981 has passed FCP	2024-02-06 09:48:39 +00:00
dependabot[bot]	871f51c270	Bump lxml-stubs from 0.4.0 to 0.5.1 (#16885 )	2024-02-06 09:29:17 +00:00
Erik Johnston	adf15c4f6b	Run `ANALYZE` after fiddling with stats (#16849 ) Introduced in #16833 Fixes #16844	2024-01-24 13:57:12 +00:00
Erik Johnston	c925b45567	Speed up e2e device keys queries for bot accounts (#16841 ) This helps with bot accounts with lots of non-e2e devices. The change is basically to change the order of the join for the case of using `INNER JOIN`	2024-01-23 11:37:16 +00:00
Erik Johnston	23740eaa3d	Correctly mention previous copyright (#16820 ) During the migration the automated script to update the copyright headers accidentally got rid of some of the existing copyright lines. Reinstate them.	2024-01-23 11:26:48 +00:00
Erik Johnston	14c725f73b	Preparatory work for tweaking performance of auth chain lookups (#16833 )	2024-01-23 11:26:27 +00:00
Shay	a68b48a5dd	Allow room creation but not publishing to continue if room publication rules are violated when creating a new room. (#16811 ) Prior to this PR, if a request to create a public (public as in published to the rooms directory) room violated the room list publication rules set in the [config](https://matrix-org.github.io/synapse/latest/usage/configuration/config_documentation.html#room_list_publication_rules), the request to create the room was denied and the room was not created. This PR changes the behavior such that when a request to create a room published to the directory violates room list publication rules, the room is still created but the room is not published to the directory.	2024-01-22 13:59:45 +00:00
Mo Balaa	b99f6db039	Handle wildcard type filters properly (#14984 )	2024-01-22 10:46:30 +00:00
Hanadi	42e1aaea68	feat: add msc4028 to versions api (#16787 ) Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>	2024-01-16 14:36:08 +00:00
Erik Johnston	c43f751013	Optimize query for fetching to-device messages in `/sync` (#16805 ) The current query supports passing in a list of users, which generates a query using `user_id = ANY(..)`. This generates a less efficient query plan that is notably slower than a simple `user_id = ?` condition. Note: The new function is mostly a copy and paste and then a simplification of the existing function.	2024-01-11 13:37:57 +00:00
Erik Johnston	b11f7b5122	Improve DB performance of calculating badge counts for push. (#16756 ) The crux of the change is to try and make the queries simpler and pull out fewer rows. Before, there were quite a few joins against subqueries, which caused postgres to pull out more rows than necessary. Instead, let's simplify the query and do some of the filtering out in Python instead, letting Postgres do better optimizations now that it doesn't have to deal with joins against subqueries. Review note: this is a complete rewrite of the function, so not sure how useful the diff is. --------- Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>	2024-01-11 11:52:13 +00:00
Erik Johnston	a986f86c82	Correctly handle OIDC config with no `client_secret` set (#16806 ) In previous versions of authlib using `client_secret_basic` without a `client_secret` would result in an invalid auth header. Since authlib 1.3 it throws an exception. The configuration may be accepted by very lax servers, so we don't want to deny it outright. Instead, let's default the `client_auth_method` to `none`, which does the right thing. If the config specifies `client_auth_method` and no `client_secret` then that is going to be bogus and we should reject it.	2024-01-10 17:16:49 +00:00
Erik Johnston	cbe8a80d10	Faster load recents for sync (#16783 ) This hopefully reduces the amount of state we need to keep in memory	2024-01-10 15:11:59 +00:00
Erik Johnston	0a96fa52a2	Pull less state out if we fail to backfill (#16788 ) Sometimes we fail to fetch events during backfill due to missing state, and we often end up querying the same bad events periodically (as people backpaginate). In such cases it's likely we will continue to fail to get the state, and therefore we should try before loading the state that we have from the DB (as otherwise it's wasted DB and memory). --------- Co-authored-by: reivilibre <oliverw@matrix.org>	2024-01-10 14:42:13 +00:00
Erik Johnston	578c5c736e	Reduce amount of state pulled out when querying federation hierarchy (#16785 ) There are two changes here: 1. Only pull out the required state when handling the request. 2. Change the get filtered state return type to check that we're only querying state that was requested --------- Co-authored-by: reivilibre <oliverw@matrix.org>	2024-01-10 14:31:35 +00:00
Erik Johnston	4c67f0391b	Split up deleting devices into batches (#16766 ) Otherwise for users with large numbers of devices this can cause a lot of woe.	2024-01-10 13:55:16 +00:00
Erik Johnston	c3f2f0f063	Faster partial join to room with complex auth graph (#7 ) Instead of persisting outliers in a bunch of batches, let's just do them all at once. This is fine because all `_auth_and_persist_outliers_inner` is doing is checking the auth rules for each event, which requires the events to be topologically sorted by the auth graph.	2024-01-10 12:29:42 +00:00
reivilibre	a83a337c4d	Filter out rooms from the room directory being served to other homeservers when those rooms block that homeserver by their Access Control Lists. (#16759 ) The idea here being that the directory server shouldn't advertise rooms to a requesting server if the requesting server would not be allowed to join or participate in the room. Base: `develop` Original commit schedule, with full messages: 1. Pass `from_federation_origin` down into room list retrieval code 2. Don't cache /publicRooms response for inbound federated requests 3. fixup! Don't cache /publicRooms response for inbound federated requests 4. Cap the number of /publicRooms entries to 100 5. Simplify code now that you can't request unlimited rooms 6. Filter out rooms from federated requests that don't have the correct ACL 7. Request a handful more when filtering ACLs so that we can try to avoid shortchanging the requester --------- Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>	2024-01-08 17:24:20 +00:00
Erik Johnston	5d3850b038	Port `EventInternalMetadata` class to Rust (#16782 ) There are a couple of things we need to be careful of here: 1. The current python code does no validation when loading from the DB, so we need to be careful to ignore such errors (at least on jki.re there are some old events with internal metadata fields of the wrong type). 2. We want to be memory efficient, as we often have many hundreds of thousands of events in the cache at a time. --------- Co-authored-by: Quentin Gliech <quenting@element.io>	2024-01-08 14:06:48 +00:00
Erik Johnston	81b1c56288	Fix linting (#16780 ) Introduced in #16762	2024-01-05 13:29:00 +00:00
Erik Johnston	7469fa7585	Simplify internal metadata class. (#16762 ) We remove these fields as they're just duplicating data the event already stores, and (for reasons 🤫) I'd like to simplify the class to only store simple types. I'm not entirely convinced that we shouldn't instead add helper methods to the event class to generate stream tokens, but I don't really think that's where they belong either	2024-01-05 13:03:20 +00:00
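
Note on commit 1f88790764: the message observes that copying the auth events dict only creates a shallow copy, so the same objects are still referenced and are not protected from mutation. The sketch below illustrates that general Python behaviour. The `Event` class and the `auth_events` contents are hypothetical stand-ins, not Synapse's actual types or code.

```python
import copy


class Event:
    """Hypothetical stand-in for an event object (not Synapse's EventBase)."""

    def __init__(self, event_id: str) -> None:
        self.event_id = event_id
        self.rejected = False


auth_events = {"$a": Event("$a"), "$b": Event("$b")}

# dict.copy() builds a new dict, but its values are the very same objects.
shallow = auth_events.copy()
shallow["$a"].rejected = True   # mutates the Event shared with auth_events
shallow["$c"] = Event("$c")     # only adds a key to the new dict

shared_object = shallow["$a"] is auth_events["$a"]
original_mutated = auth_events["$a"].rejected
original_keys = sorted(auth_events)

# copy.deepcopy() duplicates the values as well, at extra memory and CPU cost.
deep = copy.deepcopy(auth_events)
deep["$b"].rejected = True
deep_isolated = auth_events["$b"].rejected is False
```

A shallow copy therefore only protects the mapping itself (which keys exist), not the objects it holds, which matches the commit's argument that the copy did not achieve its intended purpose.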

15898 Commits
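
Note on commit 7b4d7429f8: the message describes adding "extra indices" to the LRU cache so that cached events can be found by room ID and only those entries invalidated. A minimal sketch of that idea follows, assuming a plain dict-backed cache with no LRU eviction. `IndexedCache` and all of its method names are hypothetical and are not Synapse's `LruCache` API.

```python
from typing import Dict, Optional, Set


class IndexedCache:
    """Toy cache keyed by event ID with a secondary index on room ID.

    Hypothetical illustration only: no eviction, not Synapse's LruCache.
    """

    def __init__(self) -> None:
        self._values: Dict[str, dict] = {}
        self._by_room: Dict[str, Set[str]] = {}

    def set(self, event_id: str, event: dict) -> None:
        self.invalidate(event_id)  # drop any stale index entry first
        self._values[event_id] = event
        self._by_room.setdefault(event["room_id"], set()).add(event_id)

    def get(self, event_id: str) -> Optional[dict]:
        return self._values.get(event_id)

    def invalidate(self, event_id: str) -> None:
        event = self._values.pop(event_id, None)
        if event is None:
            return
        ids = self._by_room.get(event["room_id"])
        if ids is not None:
            ids.discard(event_id)
            if not ids:
                del self._by_room[event["room_id"]]

    def invalidate_room(self, room_id: str) -> int:
        """Remove every cached event for one room; other rooms are untouched."""
        event_ids = self._by_room.pop(room_id, set())
        for event_id in event_ids:
            self._values.pop(event_id, None)
        return len(event_ids)


cache = IndexedCache()
cache.set("$1", {"room_id": "!a:example.org"})
cache.set("$2", {"room_id": "!a:example.org"})
cache.set("$3", {"room_id": "!b:example.org"})
removed = cache.invalidate_room("!a:example.org")
```

The secondary index trades a little extra memory per entry for the ability to purge one room without clearing the whole cache.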