Commit Graph

706 Commits

Author SHA1 Message Date
Andrew Morgan
182147195b
Check third party rules before persisting knocks over federation (#10212)
An accidental mis-ordering of operations during #6739 technically allowed an incoming knock event over federation in before checking it against any configured Third Party Access Rules modules.

This PR corrects that by performing the TPAR check *before* persisting the event.
2021-06-21 11:57:09 +01:00
Marcus
8070b893db
update black to 21.6b0 (#10197)
Reformat all files with the new version.

Signed-off-by: Marcus Hoffmann <bubu@bubu1.eu>
2021-06-17 15:20:06 +01:00
Eric Eastwood
a911dd768b
Add fields to better debug where events are being soft_failed (#10168)
Follow-up to https://github.com/matrix-org/synapse/pull/10156#discussion_r650292223
2021-06-17 14:59:45 +01:00
Patrick Cloke
9e5ab6dd58
Remove the experimental flag for knocking and use stable prefixes / endpoints. (#10167)
* Room version 7 for knocking.
* Stable prefixes and endpoints (both client and federation) for knocking.
* Removes the experimental configuration flag.
2021-06-15 07:45:14 -04:00
Eric Eastwood
b31daac01c
Add metrics to track how often events are soft_failed (#10156)
Spawned from missing messages we were seeing on `matrix.org` from a
federated Gtiter bridged room, https://gitlab.com/gitterHQ/webapp/-/issues/2770.
The underlying issue in Synapse is tracked by https://github.com/matrix-org/synapse/issues/10066
where the message and join event race and the message is `soft_failed` before the
`join` event reaches the remote federated server.

Less soft_failed events = better and usually this should only trigger for events
where people are doing bad things and trying to fuzz and fake everything.
2021-06-11 10:12:35 +01:00
Sorunome
d936371b69
Implement knock feature (#6739)
This PR aims to implement the knock feature as proposed in https://github.com/matrix-org/matrix-doc/pull/2403

Signed-off-by: Sorunome mail@sorunome.de
Signed-off-by: Andrew Morgan andrewm@element.io
2021-06-09 19:39:51 +01:00
Erik Johnston
a0101fc021
Handle /backfill returning no events (#10133)
Fixes #10123
2021-06-08 10:37:01 +01:00
Erik Johnston
a0cd8ae8cb
Don't try and backfill the same room in parallel. (#10116)
If backfilling is slow then the client may time out and retry, causing
Synapse to start a new `/backfill` before the existing backfill has
finished, duplicating work.
2021-06-04 10:47:58 +01:00
Erik Johnston
c96ab31dff
Limit number of events in a replication request (#10118)
Fixes #9956.
2021-06-04 10:35:47 +01:00
Richard van der Hoff
b4b2fd2ece
add a cache to have_seen_event (#9953)
Empirically, this helped my server considerably when handling gaps in Matrix HQ. The problem was that we would repeatedly call have_seen_events for the same set of (50K or so) auth_events, each of which would take many minutes to complete, even though it's only an index scan.
2021-06-01 12:04:47 +01:00
Brendan Abolivier
f828a70be3
Limit the number of events sent over replication when persisting events. (#10082) 2021-05-27 17:10:58 +01:00
Patrick Cloke
ac6bfcd52f
Refactor checking restricted join rules (#10007)
To be more consistent with similar code. The check now automatically
raises an AuthError instead of passing back a boolean. It also absorbs
some shared logic between callers.
2021-05-18 12:17:04 -04:00
Erik Johnston
2b2985b5cf
Improve performance of backfilling in large rooms. (#9935)
We were pulling the full auth chain for the room out of the DB each time
we backfilled, which can be *huge* for large rooms and is totally
unnecessary.
2021-05-10 13:29:02 +01:00
Erik Johnston
de8f0a03a3
Don't set the external cache if its been done recently (#9905) 2021-05-05 16:53:22 +01:00
Patrick Cloke
d924827da1
Check for space membership during a remote join of a restricted room (#9814)
When receiving a /send_join request for a room with join rules set to 'restricted',
check if the user is a member of the spaces defined in the 'allow' key of the join rules.

This only applies to an experimental room version, as defined in MSC3083.
2021-04-23 07:05:51 -04:00
Jonathan de Jong
495b214f4f
Fix (final) Bugbear violations (#9838) 2021-04-20 11:50:49 +01:00
Patrick Cloke
936e69825a
Separate creating an event context from persisting it in the federation handler (#9800)
This refactoring allows adding logic that uses the event context
before persisting it.
2021-04-14 12:35:28 -04:00
Patrick Cloke
e8816c6ace Revert "Check for space membership during a remote join of a restricted room. (#9763)"
This reverts commit cc51aaaa7a.

The PR was prematurely merged and not yet approved.
2021-04-14 12:33:37 -04:00
Patrick Cloke
cc51aaaa7a
Check for space membership during a remote join of a restricted room. (#9763)
When receiving a /send_join request for a room with join rules set to 'restricted',
check if the user is a member of the spaces defined in the 'allow' key of the join
rules.
    
This only applies to an experimental room version, as defined in MSC3083.
2021-04-14 12:32:20 -04:00
Jonathan de Jong
4b965c862d
Remove redundant "coding: utf-8" lines (#9786)
Part of #9744

Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as python 3 automatically reads source code as utf-8 now.

`Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`
2021-04-14 15:34:27 +01:00
Jonathan de Jong
2ca4e349e9
Bugbear: Add Mutable Parameter fixes (#9682)
Part of #9366

Adds in fixes for B006 and B008, both relating to mutable parameter lint errors.

Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>
2021-04-08 22:38:54 +01:00
Patrick Cloke
d959d28730
Add type hints to the federation handler and server. (#9743) 2021-04-06 07:21:57 -04:00
Erik Johnston
963f4309fe
Make RateLimiter class check for ratelimit overrides (#9711)
This should fix a class of bug where we forget to check if e.g. the appservice shouldn't be ratelimited.

We also check the `ratelimit_override` table to check if the user has ratelimiting disabled. That table is really only meant to override the event sender ratelimiting, so we don't use any values from it (as they might not make sense for different rate limits), but we do infer that if ratelimiting is disabled for the user we should disabled all ratelimits.

Fixes #9663
2021-03-30 12:06:09 +01:00
Richard van der Hoff
af2248f8bf
Optimise missing prev_event handling (#9601)
Background: When we receive incoming federation traffic, and notice that we are missing prev_events from 
the incoming traffic, first we do a `/get_missing_events` request, and then if we still have missing prev_events,
we set up new backwards-extremities. To do that, we need to make a `/state_ids` request to ask the remote
server for the state at those prev_events, and then we may need to then ask the remote server for any events
in that state which we don't already have, as well as the auth events for those missing state events, so that we
can auth them.

This PR attempts to optimise the processing of that state request. The `state_ids` API returns a list of the state
events, as well as a list of all the auth events for *all* of those state events. The optimisation comes from the
observation that we are currently loading all of those auth events into memory at the start of the operation, but
we almost certainly aren't going to need *all* of the auth events. Rather, we can check that we have them, and
leave the actual load into memory for later. (Ideally the federation API would tell us which auth events we're
actually going to need, but it doesn't.)

The effect of this is to reduce the number of events that I need to load for an event in Matrix HQ from about
60000 to about 22000, which means it can stay in my in-memory cache, whereas previously the sheer number
of events meant that all 60K events had to be loaded from db for each request, due to the amount of cache
churn. (NB I've already tripled the size of the cache from its default of 10K).

Unfortunately I've ended up basically C&Ping `_get_state_for_room` and `_get_events_from_store_or_dest` into
a new method, because `_get_state_for_room` is also called during backfill, which expects the auth events to be
returned, so the same tricks don't work. That said, I don't really know why that codepath is completely different
(ultimately we're doing the same thing in setting up a new backwards extremity) so I've left a TODO suggesting
that we clean it up.
2021-03-15 13:51:02 +00:00
Richard van der Hoff
2b328d7e02
Improve logging when processing incoming transactions (#9596)
Put the room id in the logcontext, to make it easier to understand what's going on.
2021-03-12 15:08:03 +00:00
Patrick Cloke
2a99cc6524
Use the chain cover index in get_auth_chain_ids. (#9576)
This uses a simplified version of get_chain_cover_difference to calculate
auth chain of events.
2021-03-10 09:57:59 -05:00
Eric Eastwood
0a00b7ff14
Update black, and run auto formatting over the codebase (#9381)
- Update black version to the latest
 - Run black auto formatting over the codebase
    - Run autoformatting according to [`docs/code_style.md
`](80d6dc9783/docs/code_style.md)
 - Update `code_style.md` docs around installing black to use the correct version
2021-02-16 22:32:34 +00:00
Andrew Morgan
594f2853e0
Remove dead handled_events set in invite_join (#9394)
This PR removes a set that was created and [initially used](1d2a0040cf (diff-0bc92da3d703202f5b9be2d3f845e375f5b1a6bc6ba61705a8af9be1121f5e42R435-R436)), but is no longer today.

May help cut down a bit on the time it takes to accept invites.
2021-02-12 22:15:50 +00:00
Erik Johnston
ff55300b91
Honour ratelimit flag for application services for invite ratelimiting (#9302) 2021-02-03 10:17:37 +00:00
Erik Johnston
f2c1560eca
Ratelimit invites by room and target user (#9258) 2021-01-29 16:38:29 +00:00
Erik Johnston
dd8da8c5f6
Precompute joined hosts and store in Redis (#9198) 2021-01-26 13:57:31 +00:00
David Teller
f14428b25c
Allow spam-checker modules to be provide async methods. (#8890)
Spam checker modules can now provide async methods. This is implemented
in a backwards-compatible manner.
2020-12-11 14:05:15 -05:00
Patrick Cloke
30fba62108
Apply an IP range blacklist to push and key revocation requests. (#8821)
Replaces the `federation_ip_range_blacklist` configuration setting with an
`ip_range_blacklist` setting with wider scope. It now applies to:

* Federation
* Identity servers
* Push notifications
* Checking key validitity for third-party invite events

The old `federation_ip_range_blacklist` setting is still honored if present, but
with reduced scope (it only applies to federation and identity servers).
2020-12-02 11:09:24 -05:00
Richard van der Hoff
950bb0305f
Consistently use room_id from federation request body (#8776)
* Consistently use room_id from federation request body

Some federation APIs have a redundant `room_id` path param (see
https://github.com/matrix-org/matrix-doc/issues/2330). We should make sure we
consistently use either the path param or the body param, and the body param is
easier.

* Kill off some references to "context"

Once upon a time, "rooms" were known as "contexts". I think this kills of the
last references to "contexts".
2020-11-19 10:05:33 +00:00
Andrew Morgan
e8d0853739
Generalise _maybe_store_room_on_invite (#8754)
There's a handy function called maybe_store_room_on_invite which allows us to create an entry in the rooms table for a room and its version for which we aren't joined to yet, but we can reference when ingesting events about.

This is currently used for invites where we receive some stripped state about the room and pass it down via /sync to the client, without us being in the room yet.

There is a similar requirement for knocking, where we will eventually do the same thing, and need an entry in the rooms table as well. Thus, reusing this function works, however its name needs to be generalised a bit.

Separated out from #6739.
2020-11-13 16:24:04 +00:00
Patrick Cloke
34a5696f93
Fix typos and spelling errors. (#8639) 2020-10-23 12:38:40 -04:00
Richard van der Hoff
123711ed19 Move third_party_rules check to event creation time
Rather than waiting until we handle the event, call the ThirdPartyRules check
when we fist create the event.
2020-10-13 21:38:48 +01:00
Richard van der Hoff
d59378d86b Remove redundant calls to third_party_rules in on_send_{join,leave}
There's not much point in calling these *after* we have decided to accept them
into the DAG.
2020-10-13 21:38:48 +01:00
Erik Johnston
b2486f6656
Fix message duplication if something goes wrong after persisting the event (#8476)
Should fix #3365.
2020-10-13 12:07:56 +01:00
Richard van der Hoff
f31f8e6319
Remove stream ordering from Metadata dict (#8452)
There's no need for it to be in the dict as well as the events table. Instead,
we store it in a separate attribute in the EventInternalMetadata object, and
populate that on load.

This means that we can rely on it being correctly populated for any event which
has been persited to the database.
2020-10-05 14:43:14 +01:00
Richard van der Hoff
937393abd8 Move resolve_events_with_store into StateResolutionHandler 2020-09-29 17:35:20 +01:00
Richard van der Hoff
2649d545a5
Mypy fixes for synapse.handlers.federation (#8422)
For some reason, an apparently unrelated PR upset mypy about this module. Here are a number of little fixes.
2020-09-29 15:57:36 +01:00
Richard van der Hoff
450ec48445
A pair of tiny cleanups in the federation request code. (#8401) 2020-09-28 13:15:00 +01:00
Erik Johnston
ac11fcbbb8
Add EventStreamPosition type (#8388)
The idea is to remove some of the places we pass around `int`, where it can represent one of two things:

1. the position of an event in the stream; or
2. a token that partitions the stream, used as part of the stream tokens.

The valid operations are then:

1. did a position happen before or after a token;
2. get all events that happened before or after a token; and
3. get all events between two tokens.

(Note that we don't want to allow other operations as we want to change the tokens to be vector clocks rather than simple ints)
2020-09-24 13:24:17 +01:00
Patrick Cloke
00db7786de Synapse 1.20.0rc5 (2020-09-18)
==============================
 
 In addition to the below, Synapse 1.20.0rc5 also includes the bug fix that was included in 1.19.3.
 
 Features
 --------
 
 - Add flags to the `/versions` endpoint for whether new rooms default to using E2EE. ([\#8343](https://github.com/matrix-org/synapse/issues/8343))
 
 Bugfixes
 --------
 
 - Fix rate limiting of federation `/send` requests. ([\#8342](https://github.com/matrix-org/synapse/issues/8342))
 - Fix a longstanding bug where back pagination over federation could get stuck if it failed to handle a received event. ([\#8349](https://github.com/matrix-org/synapse/issues/8349))
 
 Internal Changes
 ----------------
 
 - Blacklist [MSC2753](https://github.com/matrix-org/matrix-doc/pull/2753) SyTests until it is implemented. ([\#8285](https://github.com/matrix-org/synapse/issues/8285))
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEF3tZXk38tRDFVnUIM/xY9qcRMEgFAl9kzk8ACgkQM/xY9qcR
 MEim6A//aERkhyLGRlGpLd37lCyFQCeffTMH1rTvu04iIBQBaUZ6g7CYWOpK43zT
 U8kt379+5OShjdAXs/X4XP+ucdHVbrwsRSP3hBS/fFLiDT0fJgP8uiSf5QqO6NnT
 OqDyXYjcXvj/c6tMKglVtsdh8u4hFwNZjGPMGG68IzJu14uEhnD100cL9jSB9bLB
 ongWpsQzzdGBpJPSFRjv9dCUSeRbzyUdl1t0uqzrNqyN9s/JnzFTn7ZYo6y3lnSS
 dHGVMMo/12M2PkbBHnbJVvDY5Q/R7ZxyXlpz0gvSNOQIw8FqYFnuB0Niy5dQhXSR
 Sy5h4qbczLxqbql1x+lmzeQm4ZMORsW/Tl4C3z6yK6OYaOCJHIf9en4DplTSTqp1
 t+85JxWR2wH10d99YHBpaYKmkVovpwgchrO4YWrtXljUFAhhavzf+YiAdOHYT52s
 RDsDLsvjMbxEHsz4cHfycmshYhjzjb340wkoDXuQpj0zrO99d+Zd83xdK8pS0UQn
 OaljLRAd/5iBjTSyZPSrB1U5141OzlM3QZVJzaYAnP12yhR9eaX2twSCk+lPYOWd
 nhLJjNnj1B1XSGArthuE5NLyEiCPz6KyN2RhO0EOx5YjZN9TwH7LS9upyNFe1nN1
 GIhO5gz+jWLuBZE3xzRNjJyCx/I/LolpCwGMvKDu6638rpsbrPs=
 =tT5/
 -----END PGP SIGNATURE-----

Merge tag 'v1.20.0rc5' into develop

Synapse 1.20.0rc5 (2020-09-18)
==============================

In addition to the below, Synapse 1.20.0rc5 also includes the bug fix that was included in 1.19.3.

Features
--------

- Add flags to the `/versions` endpoint for whether new rooms default to using E2EE. ([\#8343](https://github.com/matrix-org/synapse/issues/8343))

Bugfixes
--------

- Fix rate limiting of federation `/send` requests. ([\#8342](https://github.com/matrix-org/synapse/issues/8342))
- Fix a longstanding bug where back pagination over federation could get stuck if it failed to handle a received event. ([\#8349](https://github.com/matrix-org/synapse/issues/8349))

Internal Changes
----------------

- Blacklist [MSC2753](https://github.com/matrix-org/matrix-doc/pull/2753) SyTests until it is implemented. ([\#8285](https://github.com/matrix-org/synapse/issues/8285))
2020-09-18 11:17:58 -04:00
Patrick Cloke
8a4a4186de
Simplify super() calls to Python 3 syntax. (#8344)
This converts calls like super(Foo, self) -> super().

Generated with:

    sed -i "" -Ee 's/super\([^\(]+\)/super()/g' **/*.py
2020-09-18 09:56:44 -04:00
Erik Johnston
43f2b67e4d
Intelligently select extremities used in backfill. (#8349)
Instead of just using the most recent extremities let's pick the
ones that will give us results that the pagination request cares about,
i.e. pick extremities only if they have a smaller depth than the
pagination token.

This is useful when we fail to backfill an extremity, as we no longer
get stuck requesting that same extremity repeatedly.
2020-09-18 14:25:52 +01:00
Patrick Cloke
aec294ee0d
Use slots in attrs classes where possible (#8296)
slots use less memory (and attribute access is faster) while slightly
limiting the flexibility of the class attributes. This focuses on objects
which are instantiated "often" and for short periods of time.
2020-09-14 12:50:06 -04:00
Erik Johnston
04cc249b43
Add experimental support for sharding event persister. Again. (#8294)
This is *not* ready for production yet. Caveats:

1. We should write some tests...
2. The stream token that we use for events can get stalled at the minimum position of all writers. This means that new events may not be processed and e.g. sent down sync streams if a writer isn't writing or is slow.
2020-09-14 10:16:41 +01:00
Erik Johnston
5d3e306d9f
Clean up Notifier.on_new_room_event code path (#8288)
The idea here is that we pass the `max_stream_id` to everything, and only use the stream ID of the particular event to figure out *when* the max stream position has caught up to the event and we can notify people about it.

This is to maintain the distinction between the position of an item in the stream (i.e. event A has stream ID 513) and a token that can be used to partition the stream (i.e. give me all events after stream ID 352). This distinction becomes important when the tokens are more complicated than a single number, which they will be once we start tracking the position of multiple writers in the tokens.

The valid operations here are:

1. Is a position before or after a token
2. Fetching all events between two tokens
3. Merging multiple tokens to get the "max", i.e. `C = max(A, B)` means that for all positions P where P is before A *or* before B, then P is before C.

Future PR will change the token type to a dedicated type.
2020-09-10 13:24:43 +01:00