* add tests for fetching key locally
* add logic to check if origin server is same as host and fetch verify key locally rather than over federation
* add changelog
* slight refactor, add docstring, change changelog entry
* Make changelog entry one line
* remove verify_json_locally and push locality check to process_request, add function process_request_locally
* remove leftover code reference
* refactor to add common call to 'verify_json and associated handling code
* add type hint to process_json
* add some docstrings + very slight refactor
* Teach MyPy that the sentinel context is False
This means that if `ctx: LoggingContextOrSentinel`
then `bool(ctx)` narrows us to `ctx:LoggingContext`, which is a really
neat find!
* Annotate RequestMetrics
- Raise errors for sentry if we use the sentinel context
- Ensure we don't raise an error and carry on, but not recording stats
- Include stack trace in the error case to lower Sean's blood pressure
* Make mypy pass for synapse.http.request_metrics
* Make synapse.http.connectproxyclient pass mypy
Co-authored-by: reivilibre <oliverw@matrix.org>
Users admin API can now also modify user
type in addition to allowing it to be
set on user creation.
Signed-off-by: Jason Robinson <jasonr@matrix.org>
Co-authored-by: Brendan Abolivier <babolivier@matrix.org>
This is the final piece of the jigsaw for #9595. As with other changes before this one (eg #10771), we need to make sure that we auth the auth events in the right order, and actually check that their predecessors haven't been rejected.
To do this I've reused the existing code we use when persisting outliers elsewhere.
I've removed the code for attempting to fetch missing auth_events - the events should have been present in the send_join response, so the likely reason they are missing is that we couldn't verify them, so requesting them again is unlikely to help. Instead, we simply drop any state which relies on those auth events, as we do at a backwards-extremity. See also matrix-org/complement#216 for a test for this.
`synapse.config.__main__` has the possibility to read a config item. This can be used to conveniently also validate the config is valid before trying to start Synapse.
The "read" command broke in https://github.com/matrix-org/synapse/pull/10916 as it now requires passing in "server.server_name" for example.
Also made the read command optional so one can just call this with just the confirm file reference and get a "Config parses OK" if things are ok.
Signed-off-by: Jason Robinson <jasonr@matrix.org>
Co-authored-by: Brendan Abolivier <babolivier@matrix.org>
* We only need to fetch users in private rooms
* Filter out `user_id` at the top
* Discard excluded users in the top loop
We weren't doing this in the "First, if they're our user" branch so this
is a bugfix.
* The caller must check that `user_id` is included
This is in the docstring. There are two call sites:
- one in `_handle_room_publicity_change`, which explicitly checks before calling;
- and another in `_handle_room_membership_event`, which returns early if
the user is excluded.
So this change is safe.
* Test joining a private room with an excluded user
* Tweak an existing test
* Changelog
* test docstring
* lint
If we find ourselves dealing with rejected events, we proably want to know
about it. Let's include it in the stringification of the event so that it gets
logged.
Currently, when we receive an event whose auth_events differ from those we expect, we state-resolve between the two state sets, and check that the event passes auth based on the resolved state.
This means that it's possible for us to accept events which don't pass auth at their declared auth_events (or where the auth events themselves were rejected), leading to problems down the line like #10083.
This change means we will:
* ignore any events where we cannot find the auth events
* reject any events whose auth events were rejected
* reject any events which do not pass auth at their declared auth_events.
Together with a whole raft of previous work, this is a partial fix to #9595.
Fixes#6643.
Based on #11009.
This fixes a bug where we would accept an event whose `auth_events` include
rejected events, if the rejected event was shadowed by another `auth_event`
with same `(type, state_key)`.
The approach is to pass a list of auth events into
`check_auth_rules_for_event` instead of a dict, which of course means updating
the call sites.
This is an extension of #10956.
Instead of triggering `__exit__` manually on the replication handler's
logging context, use it as a context manager so that there is an
`__enter__` call to balance the `__exit__`.
Found while working on the Gitter backfill script and noticed
it only happened after we sent 7 batches, https://gitlab.com/gitterHQ/webapp/-/merge_requests/2229#note_665906390
When there are more than 5 backward extremities for a given depth,
backfill will throw an error because we sliced the extremity list
to 5 but then try to iterate over the full list. This causes
us to look for state that we never fetched and we get a `KeyError`.
Before when calling `/messages` when there are more than 5 backward extremities:
```
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/synapse/http/server.py", line 258, in _async_render_wrapper
callback_return = await self._async_render(request)
File "/usr/local/lib/python3.8/site-packages/synapse/http/server.py", line 446, in _async_render
callback_return = await raw_callback_return
File "/usr/local/lib/python3.8/site-packages/synapse/rest/client/room.py", line 580, in on_GET
msgs = await self.pagination_handler.get_messages(
File "/usr/local/lib/python3.8/site-packages/synapse/handlers/pagination.py", line 396, in get_messages
await self.hs.get_federation_handler().maybe_backfill(
File "/usr/local/lib/python3.8/site-packages/synapse/handlers/federation.py", line 133, in maybe_backfill
return await self._maybe_backfill_inner(room_id, current_depth, limit)
File "/usr/local/lib/python3.8/site-packages/synapse/handlers/federation.py", line 386, in _maybe_backfill_inner
likely_extremeties_domains = get_domains_from_state(states[e_id])
KeyError: '$zpFflMEBtZdgcMQWTakaVItTLMjLFdKcRWUPHbbSZJl'
```
==============================
**Note:** This release candidate [fixes](https://github.com/matrix-org/synapse/issues/11053) the user directory [bug](https://github.com/matrix-org/synapse/issues/11025) present in 1.45.0rc1. However, the [performance issue](https://github.com/matrix-org/synapse/issues/11049) which appeared in v1.44.0 is yet to be resolved.
Bugfixes
--------
- Fix a long-standing bug when using multiple event persister workers where events were not correctly sent down `/sync` due to a race. ([\#11045](https://github.com/matrix-org/synapse/issues/11045))
- Fix a bug introduced in Synapse 1.45.0rc1 where the user directory would stop updating if it processed an event from a
user not in the `users` table. ([\#11053](https://github.com/matrix-org/synapse/issues/11053))
- Fix a bug introduced in Synapse v1.44.0 when logging errors during oEmbed processing. ([\#11061](https://github.com/matrix-org/synapse/issues/11061))
Internal Changes
----------------
- Add an 'approximate difference' method to `StateFilter`. ([\#10825](https://github.com/matrix-org/synapse/issues/10825))
- Fix inconsistent behavior of `get_last_client_by_ip` when reporting data that has not been stored in the database yet. ([\#10970](https://github.com/matrix-org/synapse/issues/10970))
- Fix a bug introduced in Synapse 1.21.0 that causes opentracing and Prometheus metrics for replication requests to be measured incorrectly. ([\#10996](https://github.com/matrix-org/synapse/issues/10996))
- Ensure that cache config tests do not share state. ([\#11036](https://github.com/matrix-org/synapse/issues/11036))
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE1508oLYUKainYFJakD7OEIo53t0FAmFoBHYACgkQkD7OEIo5
3t2+wRAAlk2GR9Ezi8E/AG6f5Dsq0+guswHK9b34PtI0wcLQFgy76y200g8D/V0F
XFdby4IZsSLVmP2EVF+vIaU0/zviClHR81b8pvNcIRxORBAhV6fH5JkCRPNAlurS
aGnoKA1TSSg4HwNHn+dWyg7VCHl76oLEBhixRdz7aWPhLnCNkCM62E+jlwVq19BH
lPIDcwEvtq2ovHIGyE/sTg6Shrf9ELy0mYMQpPJNfDmxqjWqRhKaSpx5JSEJnzih
amONFY7Nc8kIUdP6JSR6105IhUDAnItScmyyUowyCvwtAQsXsgMd8Cs0FdiJ0JLy
BtE9qsBmBpmVWjfk7ESXE1eTHqH9Eexsy5dcB43hiVZ3ihhCCLmtVANej3QaRnRL
rYAiH+W9WK7n3189hzkmEG/BKk+zu65dLaHSMjBvEDMyiYLoPRlYnmunLeo6mESq
HHa25o5WiY8+UQ4RgcYqScIqvlgFqODmZT21OlknZfupUd2kDCN7Ga7TY599sBp2
XSGNB6l/eGWRWtOPcdrandsDnppcoYsfJkXIfAUIxwjnSN/RU4kDPDiYDXhU8d+D
hW/f59XwfMy3D3zZI4a+pEY8ZhcSHTnLJni95UFyb/7zOwnT+pZXUfVX46727nxg
imUfXsXIxUYQnDkAGBXdt7QMZPy4gvTGenllqaA987xjfd5/fKs=
=TTLK
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE1508oLYUKainYFJakD7OEIo53t0FAmFoCZQACgkQkD7OEIo5
3t2Ygw/9E48TTeW7F5j16aCVVmRYnlAoh+rBlrKiaIE4MSJtwI+TEdIIrwGm+JdB
lh59N3YmkXrn2No4ew0UY9fjD8LgofX3GAmlP2vN/fHst9kfe/UtoIDHWdQG1Xhh
xeHnt9vXirQMOLxapkDM4ymBW1WAQdvgdn3Ozy3hQ+ZUQiMqJg4CQVMb7mUpjdJz
D6Jb0Ub5FGnCGcSrofBk9KvHSZQb9Qrp8Rz3kokib98/+eJGdJDXhf0Yg4junJDy
mOvTAhbJYG9TS3taFajuw8IfcngJYyBMUOf7HlcqMtyEPdHFW2wClUHzeQIPOurU
YQmpqpDO7LSrHhFOt3cdmIlCGpGxKgR9AIUn8Volb2pxwH2i/BA7ryP9ZKz9dht3
VnPFLEHYumBYOAB05TsUkDlpr/0x3YS1JCgzHGABM8spsi8XO/ZPRkwpjdc/jSRB
3A4I7WGpKddk7pY3mCOnDx8hm2asgTPgTHkvJr2v2wYndIHxW91BDKhhSZ4tzjIY
aOr2CtJmoFb3Buek/s1S2AGVLRXMYa/IIRHl8FjmL7WhsqebJkcRfcaMfHHtCq5Y
tLhK2CYavEOu8TrkFWyIPPCCn8Y6Y8qnfg8c8f1ncfHsVHE4sOs9hjsVVz+n4QBd
LD+ICWBeLvpGQK+0cZu3QFtIyo6XKBJlYuAyLD5saNkuLnupA3Y=
=A9jl
-----END PGP SIGNATURE-----
Merge tag 'v1.45.0rc2' into develop
Synapse 1.45.0rc2 (2021-10-14)
==============================
**Note:** This release candidate [fixes](https://github.com/matrix-org/synapse/issues/11053) the user directory [bug](https://github.com/matrix-org/synapse/issues/11025) present in 1.45.0rc1. However, the [performance issue](https://github.com/matrix-org/synapse/issues/11049) which appeared in v1.44.0 is yet to be resolved.
Bugfixes
--------
- Fix a long-standing bug when using multiple event persister workers where events were not correctly sent down `/sync` due to a race. ([\#11045](https://github.com/matrix-org/synapse/issues/11045))
- Fix a bug introduced in Synapse 1.45.0rc1 where the user directory would stop updating if it processed an event from a
user not in the `users` table. ([\#11053](https://github.com/matrix-org/synapse/issues/11053))
- Fix a bug introduced in Synapse v1.44.0 when logging errors during oEmbed processing. ([\#11061](https://github.com/matrix-org/synapse/issues/11061))
Internal Changes
----------------
- Add an 'approximate difference' method to `StateFilter`. ([\#10825](https://github.com/matrix-org/synapse/issues/10825))
- Fix inconsistent behavior of `get_last_client_by_ip` when reporting data that has not been stored in the database yet. ([\#10970](https://github.com/matrix-org/synapse/issues/10970))
- Fix a bug introduced in Synapse 1.21.0 that causes opentracing and Prometheus metrics for replication requests to be measured incorrectly. ([\#10996](https://github.com/matrix-org/synapse/issues/10996))
- Ensure that cache config tests do not share state. ([\#11036](https://github.com/matrix-org/synapse/issues/11036))
Resolve and share `state_groups` for all historical events in batch. This also helps for showing the appropriate avatar/displayname in Element and will work whenever `/messages` has one of the historical messages as the first message in the batch.
This does have the flaw where if you just insert a single historical event somewhere, it probably won't resolve the state correctly from `/messages` or `/context` since it will grab a non historical event above or below with resolved state which never included the historical state back then. For the same reasions, this also does not work in Element between the transition from actual messages to historical messages. In the Gitter case, this isn't really a problem since all of the historical messages are in one big lump at the beginning of the room.
For a future iteration, might be good to look at `/messages` and `/context` to additionally add the `state` for any historical messages in that batch.
---
How are the `state_groups` shared? To illustrate the `state_group` sharing, see this example:
**Before** (new `state_group` for every event 😬, very inefficient):
```
# Tests from https://github.com/matrix-org/complement/pull/206
$ COMPLEMENT_ALWAYS_PRINT_SERVER_LOGS=1 COMPLEMENT_DIR=../complement ./scripts-dev/complement.sh TestBackfillingHistory/parallel/should_resolve_member_state_events_for_historical_events
create_new_client_event m.room.member event=$_JXfwUDIWS6xKGG4SmZXjSFrizhARM7QblhATVWWUcA state_group=None
create_new_client_event org.matrix.msc2716.insertion event=$1ZBfmBKEjg94d-vGYymKrVYeghwBOuGJ3wubU1-I9y0 state_group=9
create_new_client_event org.matrix.msc2716.insertion event=$Mq2JvRetTyclPuozRI682SAjYp3GqRuPc8_cH5-ezPY state_group=10
create_new_client_event m.room.message event=$MfmY4rBQkxrIp8jVwVMTJ4PKnxSigpG9E2cn7S0AtTo state_group=11
create_new_client_event m.room.message event=$uYOv6V8wiF7xHwOMt-60d1AoOIbqLgrDLz6ZIQDdWUI state_group=12
create_new_client_event m.room.message event=$PAbkJRMxb0bX4A6av463faiAhxkE3FEObM1xB4D0UG4 state_group=13
create_new_client_event org.matrix.msc2716.batch event=$Oy_S7AWN7rJQe_MYwGPEy6RtbYklrI-tAhmfiLrCaKI state_group=14
```
**After** (all events in batch sharing `state_group=10`) (the base insertion event has `state_group=8` which matches the `prev_event` we're inserting next to):
```
# Tests from https://github.com/matrix-org/complement/pull/206
$ COMPLEMENT_ALWAYS_PRINT_SERVER_LOGS=1 COMPLEMENT_DIR=../complement ./scripts-dev/complement.sh TestBackfillingHistory/parallel/should_resolve_member_state_events_for_historical_events
create_new_client_event m.room.member event=$PWomJ8PwENYEYuVNoG30gqtybuQQSZ55eldBUSs0i0U state_group=None
create_new_client_event org.matrix.msc2716.insertion event=$e_mCU7Eah9ABF6nQU7lu4E1RxIWccNF05AKaTT5m3lw state_group=9
create_new_client_event org.matrix.msc2716.insertion event=$ui7A3_GdXIcJq0C8GpyrF8X7B3DTjMd_WGCjogax7xU state_group=10
create_new_client_event m.room.message event=$EnTIM5rEGVezQJiYl62uFBl6kJ7B-sMxWqe2D_4FX1I state_group=10
create_new_client_event m.room.message event=$LGx5jGONnBPuNhAuZqHeEoXChd9ryVkuTZatGisOPjk state_group=10
create_new_client_event m.room.message event=$wW0zwoN50lbLu1KoKbybVMxLbKUj7GV_olozIc5i3M0 state_group=10
create_new_client_event org.matrix.msc2716.batch event=$5ZB6dtzqFBCEuMRgpkU201Qhx3WtXZGTz_YgldL6JrQ state_group=10
```
* Pull out `_handle_room_membership_event`
* Discard excluded users early
* Rearrange logic so the change is membership is effectively switched over. See PR for rationale.
The following scenarios would halt the user directory updater:
- user joins room
- user leaves room
- user present in room which switches from private to public, or vice versa.
for two classes of users:
- appservice senders
- users missing from the user table.
If this happened, the user directory would be stuck, unable to make forward progress.
Exclude both cases from the user directory, so that we ignore them.
Co-authored-by: Eric Eastwood <erice@element.io>
Co-authored-by: reivilibre <oliverw@matrix.org>
Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
Co-authored-by: Brendan Abolivier <babolivier@matrix.org>
The race allowed the current position to advance too far when stream IDs
are still being persisted.
This happened when it received a new stream ID from a remote write
between a new stream ID being allocated and it being added to the set of
unpersisted stream IDs.
Fixes#9424.
Make `get_last_client_by_ip` return the same dictionary structure
regardless of whether the data has been persisted to the database.
This change will allow slightly cleaner type hints to be applied later
on.
This commit fixes two bugs to do with decorators not instrumenting
`ReplicationEndpoint`'s `send_request` correctly. There are two
decorators on `send_request`: Prometheus' `Gauge.track_inprogress()`
and Synapse's `opentracing.trace`.
`Gauge.track_inprogress()` does not have any support for async
functions when used as a decorator. Since async functions behave like
regular functions that return coroutines, only the creation of the
coroutine was covered by the metric and none of the actual body of
`send_request`.
`Gauge.track_inprogress()` returns a regular, non-async function
wrapping `send_request`, which is the source of the next bug.
The `opentracing.trace` decorator would normally handle async functions
correctly, but since the wrapped `send_request` is a non-async function,
the decorator ends up suffering from the same issue as
`Gauge.track_inprogress()`: the opentracing span only measures the
creation of the coroutine and none of the actual function body.
Using `Gauge.track_inprogress()` as a context manager instead of a
decorator resolves both bugs.
Updating mypy past version 0.9 means that third-party stubs are no-longer distributed with typeshed. See http://mypy-lang.blogspot.com/2021/06/mypy-0900-released.html for details.
We therefore pull in stub packages in setup.py
Additionally, some modules that we were previously ignoring import failures for now have stubs. So let's use them.
The rest of this change consists of fixups to make the newer mypy + stubs pass CI.
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
This splits apart `handle_new_user` into a function which adds an entry to the `user_directory` and a function which updates the room sharing tables. I plan to continue doing more of this kind of refactoring to clarify the implementation.
The shared ratelimit function was replaced with a dedicated
RequestRatelimiter class (accessible from the HomeServer
object).
Other properties were copied to each sub-class that inherited
from BaseHandler.
Use `PreserveLoggingContext()` to ensure that logging contexts are not
lost when exiting a read/write lock.
When exiting a read/write lock, callbacks on a `Deferred` are triggered
as a signal to any waiting coroutines. Any waiting coroutine that
becomes runnable is likely to follow the Synapse logging context rules
and will restore its own logging context, then either run to completion
or await another `Deferred`, resetting the logging context in the
process.
This removes the magic allowing accessing configurable
variables directly from the config object. It is now required
that a specific configuration class is used (e.g. `config.foo`
must be replaced with `config.server.foo`).
Fix a long-standing bug where a batch of user directory changes would be
silently dropped if the server left a room early in the batch.
* Pull out `wait_for_background_update` in tests
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
The following modules now pass `disallow_untyped_defs`:
* synapse.util.caches.cached_call
* synapse.util.caches.lrucache
* synapse.util.caches.response_cache
* synapse.util.caches.stream_change_cache
* synapse.util.caches.ttlcache pass
* synapse.util.daemonize
* synapse.util.patch_inline_callbacks pass `no-untyped-defs`
* synapse.util.versionstring
Additional typing in synapse.util.metrics. Didn't get this to pass `no-untyped-defs`, think I'll need to watch #10847
There are two steps to rebuilding the user directory:
1. a scan over rooms, followed by
2. a scan over local users.
The former reads avatars and display names from the `room_memberships`
table and therefore contains potentially private avatars and
display names. The latter reads from the the `profiles` table which only
contains public data; moreover it will overwrite any private profiles
that the rooms scan may have written to the user directory. This means
that the rebuild could leak private user while the rebuild was in
progress, only to later cover up the leaks once the rebuild had completed.
This change skips over local users when writing user_directory rows
when scanning rooms. Doing so means that it'll take longer for a rebuild
to make local users searchable, which is unfortunate. I think a future
PR can improve this by swapping the order of the two steps above. (And
indeed there's more to do here, e.g. copying from `profiles` without
going via Python.)
Small tidy-ups while I'm here:
* Remove duplicated code from test_initial. This was meant to be pulled into `purge_and_rebuild_user_dir`.
* Move `is_public` before updating sharing tables. No functional change; it's still before the first read of `is_public`.
* Don't bother creating a set from dict keys. Slightly nicer and makes the code simpler.
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
We correctly allowed using the MSC2716 batch endpoint for
the room creator in existing room versions but accidentally didn't track
the events because of a logic flaw.
This prevented you from connecting subsequent chunks together because it would
throw the unknown batch ID error.
We only want to process MSC2716 events when:
- The room version supports MSC2716
- Any room where the homeserver has the `msc2716_enabled` experimental feature enabled and the event is from the room creator