==============================
**Note:** This release candidate [fixes](https://github.com/matrix-org/synapse/issues/11053) the user directory [bug](https://github.com/matrix-org/synapse/issues/11025) present in 1.45.0rc1. However, the [performance issue](https://github.com/matrix-org/synapse/issues/11049) which appeared in v1.44.0 is yet to be resolved.
Bugfixes
--------
- Fix a long-standing bug when using multiple event persister workers where events were not correctly sent down `/sync` due to a race. ([\#11045](https://github.com/matrix-org/synapse/issues/11045))
- Fix a bug introduced in Synapse 1.45.0rc1 where the user directory would stop updating if it processed an event from a
user not in the `users` table. ([\#11053](https://github.com/matrix-org/synapse/issues/11053))
- Fix a bug introduced in Synapse v1.44.0 when logging errors during oEmbed processing. ([\#11061](https://github.com/matrix-org/synapse/issues/11061))
Internal Changes
----------------
- Add an 'approximate difference' method to `StateFilter`. ([\#10825](https://github.com/matrix-org/synapse/issues/10825))
- Fix inconsistent behavior of `get_last_client_by_ip` when reporting data that has not been stored in the database yet. ([\#10970](https://github.com/matrix-org/synapse/issues/10970))
- Fix a bug introduced in Synapse 1.21.0 that causes opentracing and Prometheus metrics for replication requests to be measured incorrectly. ([\#10996](https://github.com/matrix-org/synapse/issues/10996))
- Ensure that cache config tests do not share state. ([\#11036](https://github.com/matrix-org/synapse/issues/11036))
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE1508oLYUKainYFJakD7OEIo53t0FAmFoBHYACgkQkD7OEIo5
3t2+wRAAlk2GR9Ezi8E/AG6f5Dsq0+guswHK9b34PtI0wcLQFgy76y200g8D/V0F
XFdby4IZsSLVmP2EVF+vIaU0/zviClHR81b8pvNcIRxORBAhV6fH5JkCRPNAlurS
aGnoKA1TSSg4HwNHn+dWyg7VCHl76oLEBhixRdz7aWPhLnCNkCM62E+jlwVq19BH
lPIDcwEvtq2ovHIGyE/sTg6Shrf9ELy0mYMQpPJNfDmxqjWqRhKaSpx5JSEJnzih
amONFY7Nc8kIUdP6JSR6105IhUDAnItScmyyUowyCvwtAQsXsgMd8Cs0FdiJ0JLy
BtE9qsBmBpmVWjfk7ESXE1eTHqH9Eexsy5dcB43hiVZ3ihhCCLmtVANej3QaRnRL
rYAiH+W9WK7n3189hzkmEG/BKk+zu65dLaHSMjBvEDMyiYLoPRlYnmunLeo6mESq
HHa25o5WiY8+UQ4RgcYqScIqvlgFqODmZT21OlknZfupUd2kDCN7Ga7TY599sBp2
XSGNB6l/eGWRWtOPcdrandsDnppcoYsfJkXIfAUIxwjnSN/RU4kDPDiYDXhU8d+D
hW/f59XwfMy3D3zZI4a+pEY8ZhcSHTnLJni95UFyb/7zOwnT+pZXUfVX46727nxg
imUfXsXIxUYQnDkAGBXdt7QMZPy4gvTGenllqaA987xjfd5/fKs=
=TTLK
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE1508oLYUKainYFJakD7OEIo53t0FAmFoCZQACgkQkD7OEIo5
3t2Ygw/9E48TTeW7F5j16aCVVmRYnlAoh+rBlrKiaIE4MSJtwI+TEdIIrwGm+JdB
lh59N3YmkXrn2No4ew0UY9fjD8LgofX3GAmlP2vN/fHst9kfe/UtoIDHWdQG1Xhh
xeHnt9vXirQMOLxapkDM4ymBW1WAQdvgdn3Ozy3hQ+ZUQiMqJg4CQVMb7mUpjdJz
D6Jb0Ub5FGnCGcSrofBk9KvHSZQb9Qrp8Rz3kokib98/+eJGdJDXhf0Yg4junJDy
mOvTAhbJYG9TS3taFajuw8IfcngJYyBMUOf7HlcqMtyEPdHFW2wClUHzeQIPOurU
YQmpqpDO7LSrHhFOt3cdmIlCGpGxKgR9AIUn8Volb2pxwH2i/BA7ryP9ZKz9dht3
VnPFLEHYumBYOAB05TsUkDlpr/0x3YS1JCgzHGABM8spsi8XO/ZPRkwpjdc/jSRB
3A4I7WGpKddk7pY3mCOnDx8hm2asgTPgTHkvJr2v2wYndIHxW91BDKhhSZ4tzjIY
aOr2CtJmoFb3Buek/s1S2AGVLRXMYa/IIRHl8FjmL7WhsqebJkcRfcaMfHHtCq5Y
tLhK2CYavEOu8TrkFWyIPPCCn8Y6Y8qnfg8c8f1ncfHsVHE4sOs9hjsVVz+n4QBd
LD+ICWBeLvpGQK+0cZu3QFtIyo6XKBJlYuAyLD5saNkuLnupA3Y=
=A9jl
-----END PGP SIGNATURE-----
Merge tag 'v1.45.0rc2' into develop
Synapse 1.45.0rc2 (2021-10-14)
==============================
**Note:** This release candidate [fixes](https://github.com/matrix-org/synapse/issues/11053) the user directory [bug](https://github.com/matrix-org/synapse/issues/11025) present in 1.45.0rc1. However, the [performance issue](https://github.com/matrix-org/synapse/issues/11049) which appeared in v1.44.0 is yet to be resolved.
Bugfixes
--------
- Fix a long-standing bug when using multiple event persister workers where events were not correctly sent down `/sync` due to a race. ([\#11045](https://github.com/matrix-org/synapse/issues/11045))
- Fix a bug introduced in Synapse 1.45.0rc1 where the user directory would stop updating if it processed an event from a
user not in the `users` table. ([\#11053](https://github.com/matrix-org/synapse/issues/11053))
- Fix a bug introduced in Synapse v1.44.0 when logging errors during oEmbed processing. ([\#11061](https://github.com/matrix-org/synapse/issues/11061))
Internal Changes
----------------
- Add an 'approximate difference' method to `StateFilter`. ([\#10825](https://github.com/matrix-org/synapse/issues/10825))
- Fix inconsistent behavior of `get_last_client_by_ip` when reporting data that has not been stored in the database yet. ([\#10970](https://github.com/matrix-org/synapse/issues/10970))
- Fix a bug introduced in Synapse 1.21.0 that causes opentracing and Prometheus metrics for replication requests to be measured incorrectly. ([\#10996](https://github.com/matrix-org/synapse/issues/10996))
- Ensure that cache config tests do not share state. ([\#11036](https://github.com/matrix-org/synapse/issues/11036))
Resolve and share `state_groups` for all historical events in batch. This also helps for showing the appropriate avatar/displayname in Element and will work whenever `/messages` has one of the historical messages as the first message in the batch.
This does have the flaw where if you just insert a single historical event somewhere, it probably won't resolve the state correctly from `/messages` or `/context` since it will grab a non historical event above or below with resolved state which never included the historical state back then. For the same reasions, this also does not work in Element between the transition from actual messages to historical messages. In the Gitter case, this isn't really a problem since all of the historical messages are in one big lump at the beginning of the room.
For a future iteration, might be good to look at `/messages` and `/context` to additionally add the `state` for any historical messages in that batch.
---
How are the `state_groups` shared? To illustrate the `state_group` sharing, see this example:
**Before** (new `state_group` for every event 😬, very inefficient):
```
# Tests from https://github.com/matrix-org/complement/pull/206
$ COMPLEMENT_ALWAYS_PRINT_SERVER_LOGS=1 COMPLEMENT_DIR=../complement ./scripts-dev/complement.sh TestBackfillingHistory/parallel/should_resolve_member_state_events_for_historical_events
create_new_client_event m.room.member event=$_JXfwUDIWS6xKGG4SmZXjSFrizhARM7QblhATVWWUcA state_group=None
create_new_client_event org.matrix.msc2716.insertion event=$1ZBfmBKEjg94d-vGYymKrVYeghwBOuGJ3wubU1-I9y0 state_group=9
create_new_client_event org.matrix.msc2716.insertion event=$Mq2JvRetTyclPuozRI682SAjYp3GqRuPc8_cH5-ezPY state_group=10
create_new_client_event m.room.message event=$MfmY4rBQkxrIp8jVwVMTJ4PKnxSigpG9E2cn7S0AtTo state_group=11
create_new_client_event m.room.message event=$uYOv6V8wiF7xHwOMt-60d1AoOIbqLgrDLz6ZIQDdWUI state_group=12
create_new_client_event m.room.message event=$PAbkJRMxb0bX4A6av463faiAhxkE3FEObM1xB4D0UG4 state_group=13
create_new_client_event org.matrix.msc2716.batch event=$Oy_S7AWN7rJQe_MYwGPEy6RtbYklrI-tAhmfiLrCaKI state_group=14
```
**After** (all events in batch sharing `state_group=10`) (the base insertion event has `state_group=8` which matches the `prev_event` we're inserting next to):
```
# Tests from https://github.com/matrix-org/complement/pull/206
$ COMPLEMENT_ALWAYS_PRINT_SERVER_LOGS=1 COMPLEMENT_DIR=../complement ./scripts-dev/complement.sh TestBackfillingHistory/parallel/should_resolve_member_state_events_for_historical_events
create_new_client_event m.room.member event=$PWomJ8PwENYEYuVNoG30gqtybuQQSZ55eldBUSs0i0U state_group=None
create_new_client_event org.matrix.msc2716.insertion event=$e_mCU7Eah9ABF6nQU7lu4E1RxIWccNF05AKaTT5m3lw state_group=9
create_new_client_event org.matrix.msc2716.insertion event=$ui7A3_GdXIcJq0C8GpyrF8X7B3DTjMd_WGCjogax7xU state_group=10
create_new_client_event m.room.message event=$EnTIM5rEGVezQJiYl62uFBl6kJ7B-sMxWqe2D_4FX1I state_group=10
create_new_client_event m.room.message event=$LGx5jGONnBPuNhAuZqHeEoXChd9ryVkuTZatGisOPjk state_group=10
create_new_client_event m.room.message event=$wW0zwoN50lbLu1KoKbybVMxLbKUj7GV_olozIc5i3M0 state_group=10
create_new_client_event org.matrix.msc2716.batch event=$5ZB6dtzqFBCEuMRgpkU201Qhx3WtXZGTz_YgldL6JrQ state_group=10
```
* Pull out `_handle_room_membership_event`
* Discard excluded users early
* Rearrange logic so the change is membership is effectively switched over. See PR for rationale.
The following scenarios would halt the user directory updater:
- user joins room
- user leaves room
- user present in room which switches from private to public, or vice versa.
for two classes of users:
- appservice senders
- users missing from the user table.
If this happened, the user directory would be stuck, unable to make forward progress.
Exclude both cases from the user directory, so that we ignore them.
Co-authored-by: Eric Eastwood <erice@element.io>
Co-authored-by: reivilibre <oliverw@matrix.org>
Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
Co-authored-by: Brendan Abolivier <babolivier@matrix.org>
The race allowed the current position to advance too far when stream IDs
are still being persisted.
This happened when it received a new stream ID from a remote write
between a new stream ID being allocated and it being added to the set of
unpersisted stream IDs.
Fixes#9424.
Make `get_last_client_by_ip` return the same dictionary structure
regardless of whether the data has been persisted to the database.
This change will allow slightly cleaner type hints to be applied later
on.
This commit fixes two bugs to do with decorators not instrumenting
`ReplicationEndpoint`'s `send_request` correctly. There are two
decorators on `send_request`: Prometheus' `Gauge.track_inprogress()`
and Synapse's `opentracing.trace`.
`Gauge.track_inprogress()` does not have any support for async
functions when used as a decorator. Since async functions behave like
regular functions that return coroutines, only the creation of the
coroutine was covered by the metric and none of the actual body of
`send_request`.
`Gauge.track_inprogress()` returns a regular, non-async function
wrapping `send_request`, which is the source of the next bug.
The `opentracing.trace` decorator would normally handle async functions
correctly, but since the wrapped `send_request` is a non-async function,
the decorator ends up suffering from the same issue as
`Gauge.track_inprogress()`: the opentracing span only measures the
creation of the coroutine and none of the actual function body.
Using `Gauge.track_inprogress()` as a context manager instead of a
decorator resolves both bugs.
Updating mypy past version 0.9 means that third-party stubs are no-longer distributed with typeshed. See http://mypy-lang.blogspot.com/2021/06/mypy-0900-released.html for details.
We therefore pull in stub packages in setup.py
Additionally, some modules that we were previously ignoring import failures for now have stubs. So let's use them.
The rest of this change consists of fixups to make the newer mypy + stubs pass CI.
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
This splits apart `handle_new_user` into a function which adds an entry to the `user_directory` and a function which updates the room sharing tables. I plan to continue doing more of this kind of refactoring to clarify the implementation.
The shared ratelimit function was replaced with a dedicated
RequestRatelimiter class (accessible from the HomeServer
object).
Other properties were copied to each sub-class that inherited
from BaseHandler.
Use `PreserveLoggingContext()` to ensure that logging contexts are not
lost when exiting a read/write lock.
When exiting a read/write lock, callbacks on a `Deferred` are triggered
as a signal to any waiting coroutines. Any waiting coroutine that
becomes runnable is likely to follow the Synapse logging context rules
and will restore its own logging context, then either run to completion
or await another `Deferred`, resetting the logging context in the
process.
This removes the magic allowing accessing configurable
variables directly from the config object. It is now required
that a specific configuration class is used (e.g. `config.foo`
must be replaced with `config.server.foo`).
Fix a long-standing bug where a batch of user directory changes would be
silently dropped if the server left a room early in the batch.
* Pull out `wait_for_background_update` in tests
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
The following modules now pass `disallow_untyped_defs`:
* synapse.util.caches.cached_call
* synapse.util.caches.lrucache
* synapse.util.caches.response_cache
* synapse.util.caches.stream_change_cache
* synapse.util.caches.ttlcache pass
* synapse.util.daemonize
* synapse.util.patch_inline_callbacks pass `no-untyped-defs`
* synapse.util.versionstring
Additional typing in synapse.util.metrics. Didn't get this to pass `no-untyped-defs`, think I'll need to watch #10847
There are two steps to rebuilding the user directory:
1. a scan over rooms, followed by
2. a scan over local users.
The former reads avatars and display names from the `room_memberships`
table and therefore contains potentially private avatars and
display names. The latter reads from the the `profiles` table which only
contains public data; moreover it will overwrite any private profiles
that the rooms scan may have written to the user directory. This means
that the rebuild could leak private user while the rebuild was in
progress, only to later cover up the leaks once the rebuild had completed.
This change skips over local users when writing user_directory rows
when scanning rooms. Doing so means that it'll take longer for a rebuild
to make local users searchable, which is unfortunate. I think a future
PR can improve this by swapping the order of the two steps above. (And
indeed there's more to do here, e.g. copying from `profiles` without
going via Python.)
Small tidy-ups while I'm here:
* Remove duplicated code from test_initial. This was meant to be pulled into `purge_and_rebuild_user_dir`.
* Move `is_public` before updating sharing tables. No functional change; it's still before the first read of `is_public`.
* Don't bother creating a set from dict keys. Slightly nicer and makes the code simpler.
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
We correctly allowed using the MSC2716 batch endpoint for
the room creator in existing room versions but accidentally didn't track
the events because of a logic flaw.
This prevented you from connecting subsequent chunks together because it would
throw the unknown batch ID error.
We only want to process MSC2716 events when:
- The room version supports MSC2716
- Any room where the homeserver has the `msc2716_enabled` experimental feature enabled and the event is from the room creator
`_check_event_auth` is only called in two places, and only one of those sets
`send_on_behalf_of`. Warming the cache isn't really part of auth anyway, so
moving it out makes a lot more sense.