Commit Graph

950 Commits

Author SHA1 Message Date
Nick Mills-Barrett
e6af49fbea
Reintroduce membership tables event stream ordering (#15128)
* Add `event_stream_ordering` column to membership state tables

Specifically this adds the column to `current_state_events`,
`local_current_membership` and `room_memberships`. Each of these tables
is regularly joined with the `events` table to get the stream ordering
and denormalising this into each table will yield significant query
performance improvements once used.

* Make denormalised `event_stream_ordering` columns foreign keys
* Add comment in schema file explaining new denormalised columns
* Add triggers to enforce consistency of `event_stream_ordering` columns
* Re-order purge room tables to account for foreign keys
* Bump schema version to 75

Co-authored-by: David Robertson <david.m.robertson1@gmail.com>
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
2023-03-24 11:44:01 +00:00
David Robertson
3b0083c92a
Use immutabledict instead of frozendict (#15113)
Additionally:

* Consistently use `freeze()` in test

---------

Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
Co-authored-by: 6543 <6543@obermui.de>
2023-03-22 17:15:34 +00:00
Patrick Cloke
1bc4feb6c9
Apply & bundle edits for non-message events. (#15295) 2023-03-21 14:19:54 -04:00
Andrew Morgan
ec9224bf9a
Make POST /_matrix/client/v3/rooms/{roomId}/report/{eventId} endpoint return 404 if event exists, but the user lacks access (#15300) 2023-03-21 13:24:03 +00:00
reivilibre
1f5473465d
Refresh remote profiles that have been marked as stale, in order to fill the user directory. [rei:userdirpriv] (#14756)
* Scaffolding for background process to refresh profiles

* Add scaffolding for background process to refresh profiles for a given server

* Implement the code to select servers to refresh from

* Ensure we don't build up multiple looping calls

* Make `get_profile` able to respect backoffs

* Add logic for refreshing users

* When backing off, schedule a refresh when the backoff is over

* Wake up the background processes when we receive an interesting state event

* Add tests

* Newsfile

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>

* Add comment about 1<<62

---------

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
2023-03-16 11:44:11 +00:00
reivilibre
f54f877f27
Preparatory work to fix the user directory assuming that any remote membership state events represent a profile change. [rei:userdirpriv] (#14755)
* Remove special-case method for new memberships only, use more generic method

* Only collect profiles from state events in public rooms

* Add a table to track stale remote user profiles

* Add store methods to set and delete rows in this new table

* Mark remote profiles as stale when a member state event comes in to a private room

* Newsfile

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>

* Simplify by removing Optionality of `event_id`

* Replace names and avatars with None if they're set to dodgy things

I think this makes more sense anyway.

* Move schema delta to 74 (I missed the boat?)

* Turns out these can be None after all

---------

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
2023-03-16 09:55:19 +00:00
reivilibre
d0fe417f5c
Remove unused store method _set_destination_retry_timings_emulated. (#15266) 2023-03-14 17:32:46 +00:00
Patrick Cloke
88efc75bab
Include the room ID in more purge room log lines. (#15222) 2023-03-08 20:08:56 +00:00
Erik Johnston
c69aae94cd
Split up txn for fetching device keys (#15215)
We look up keys in batches, but we should do that outside of the
transaction to avoid starving the database pool.
2023-03-07 08:51:34 +00:00
Patrick Cloke
02f74f3a99
Combine AbstractStreamIdTracker and AbstractStreamIdGenerator. (#15192)
AbstractStreamIdTracker (now) has only a single sub-class: AbstractStreamIdGenerator,
combine them to simplify some code and remove any direct references to
AbstractStreamIdTracker.
2023-03-03 08:13:37 -05:00
Andrew Morgan
15e975f68f
Experimental MSC3890 Implementation: Fix deleting account data when using an account data writer worker (#14869) 2023-03-03 10:51:57 +00:00
Andrew Morgan
1eea662780
Add a get_next_txn method to StreamIdGenerator to match MultiWriterIdGenerator (#15191 2023-03-02 18:27:00 +00:00
Dirk Klimpel
65f10afb64
Move event_reports to RoomWorkerStore (#15165) 2023-03-02 10:38:46 +00:00
Richard van der Hoff
2b78981736
Remove support for aggregating reactions (#15172)
It turns out that no clients rely on server-side aggregation of `m.annotation`
relationships: it's just not very useful as currently implemented.

It's also non-trivial to calculate.

I want to remove it from MSC2677, so to keep the implementation in line, let's
remove it here.
2023-02-28 18:49:28 +00:00
reivilibre
d62cd940cb
Fix a long-standing bug where an initial sync would not respond to changes to the list of ignored users if there was an initial sync cached. (#15163) 2023-02-28 17:11:26 +00:00
reivilibre
682d31c702
Allow use of the /filter Client-Server APIs on workers. (#15134) 2023-02-28 16:37:19 +00:00
Dirk Klimpel
93f7955eba
Admin API endpoint to delete a reported event (#15116)
* Admin api to delete event report

* lint +  tests

* newsfile

* Apply suggestions from code review

Co-authored-by: David Robertson <david.m.robertson1@gmail.com>

* revert changes - move to WorkerStore

* update unit test

* Note that timestamp is in millseconds

---------

Co-authored-by: David Robertson <david.m.robertson1@gmail.com>
2023-02-28 12:09:10 +00:00
Andrew Morgan
b40657314e
Add module API callbacks for adding and deleting local 3PID associations (#15044 2023-02-27 14:19:19 +00:00
Shay
1c95ddd09b
Batch up storing state groups when creating new room (#14918) 2023-02-24 13:15:29 -08:00
Sean Quah
335f52d595
Improve handling of non-ASCII characters in user directory search (#15143)
* Fix a long-standing bug where non-ASCII characters in search terms,
  including accented letters, would not match characters in a different
  case.
* Fix a long-standing bug where search terms using combining accents
  would not match display names using precomposed accents and vice
  versa.

To fully take effect, the user directory must be rebuilt after this
change.

Fixes #14630.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-02-24 13:39:45 +00:00
dependabot[bot]
9bb2eac719
Bump black from 22.12.0 to 23.1.0 (#15103) 2023-02-22 15:29:09 -05:00
reivilibre
1cbc3f197c
Fix a bug introduced in Synapse v1.74.0 where searching with colons when using ICU for search term tokenisation would fail with an error. (#15079)
Co-authored-by: David Robertson <davidr@element.io>
2023-02-20 12:00:18 +00:00
David Robertson
06ba71083e
Fix order of partial state tables when purging (#15068)
* Fix order of partial state tables when purging

`partial_state_rooms` has an FK on `events` pointing to the join event we
get from `/send_join`, so we must delete from that table before deleting
from `events`.

**NB:** It would be nice to cancel any resync processes for the room
being purged. We do not do this at present. To do so reliably we'd need
an internal HTTP "replication" endpoint, because the worker doing the
resync process may be different to that handling the purge request.

The first time the resync process tries to write data after the deletion
it will fail because we have deleted necessary data e.g. auth
events. AFAICS it will not retry the resync, so the only downside to
not cancelling the resync is a scary-looking traceback.

(This is presumably extremely race-sensitive.)

* Changelog

* admist(?) -> between

* Warn about a race

* Fix typo, thanks Sean

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>

---------

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
2023-02-14 23:42:29 +00:00
Erik Johnston
cb262713b7
Fix clashing DB txn name (#15070)
* Fix clashing DB txn name

* Newsfile
2023-02-14 11:20:25 +00:00
Erik Johnston
f09db5c991
Skip calculating unread push actions in /sync when enable_push is false. (#14980) 2023-02-14 11:10:29 +00:00
Harishankar Kumar
db2b105d69
Change collection[str] to StrCollection in event_auth code (#14929)
Signed-off-by: Harishankar Kumar <hari01584@gmail.com>
2023-02-14 09:37:08 +00:00
Mathieu Velten
6cddf24e36
Faster joins: don't stall when a user joins during a fast join (#14606)
Fixes #12801.
Complement tests are at
https://github.com/matrix-org/complement/pull/567.

Avoid blocking on full state when handling a subsequent join into a
partial state room.

Also always perform a remote join into partial state rooms, since we do
not know whether the joining user has been banned and want to avoid
leaking history to banned users.

Signed-off-by: Mathieu Velten <mathieuv@matrix.org>
Co-authored-by: Sean Quah <seanq@matrix.org>
Co-authored-by: David Robertson <davidr@element.io>
2023-02-10 23:31:05 +00:00
Sean Quah
d0c713cc85
Return read-only collections from @cached methods (#13755)
It's important that collections returned from `@cached` methods are not
modified, otherwise future retrievals from the cache will return the
modified collection.

This applies to the return values from `@cached` methods and the values
inside the dictionaries returned by `@cachedList` methods. It's not
necessary for the dictionaries returned by `@cachedList` methods
themselves to be read-only.

Signed-off-by: Sean Quah <seanq@matrix.org>
Co-authored-by: David Robertson <davidr@element.io>
2023-02-10 23:29:00 +00:00
Patrick Cloke
cf5233b783
Avoid fetching unused account data in sync. (#14973)
The per-room account data is no longer unconditionally
fetched, even if all rooms will be filtered out.

Global account data will not be fetched if it will all be
filtered out.
2023-02-10 14:22:16 +00:00
Patrick Cloke
a481fb9f98
Refactor get_user_devices_from_cache to avoid mutating cached values. (#15040)
The previous version of the code could mutate a cached value,
but only if the input requested all devices of a user *and* a specific
device.

To avoid this nonsensical situation we no longer fetch a specific
device ID if all of a user's devices are returned.
2023-02-10 08:09:47 -05:00
Erik Johnston
fd296b7343
Fix exception on start up about device lists (#15041)
Fixes #15010.
2023-02-10 09:52:35 +00:00
Patrick Cloke
733531ee3e
Add final type hint to synapse.server. (#15035) 2023-02-09 09:49:04 -05:00
David Robertson
f10caa73ee
Disambiguate get_ex_outlier_stream_rows query
A backwards-compatible piece of #14979 that's safe to land now.
2023-02-07 15:33:33 +00:00
David Robertson
9cd7610f86
Revert "Add event_stream_ordering column to membership state tables (#14979)"
This reverts commit 5fdc12f482.
2023-02-07 15:26:55 +00:00
Nick Mills-Barrett
5fdc12f482
Add event_stream_ordering column to membership state tables (#14979)
This adds an `event_stream_ordering` column to `current_state_events`,
`local_current_membership` and `room_memberships`. Each of these tables
is regularly joined with the `events` table to get the stream ordering
and denormalising this into each table will yield significant query
performance improvements once used. Includes a background job to
populate these values from the `events` table.

Same idea as https://github.com/matrix-org/synapse/pull/13703.

Signed off by Nick @ Beeper (@fizzadar).
2023-02-07 00:10:54 +00:00
David Robertson
e8269ed391
Type hints for tests.appservice (#14990)
* Accept a Sequence of events in synapse.appservice

This avoids some casts/ignores in the tests I'm about to fixup. It seems
that `List[Mock]` is not a subtype of `List[EventBase]`, but
`Sequence[Mock]` is a subtype of `Sequence[EventBase]`. So presumably
`Mock` is considered a subtype of anything, much like `Any`.

* make tests.appservice.test_scheduler pass mypy

* Extra hints in tests.appservice.test_scheduler

* Extra hints in tests.appservice.test_api

* Extra hints in tests.appservice.test_appservice

* Disallow untyped defs

* Changelog
2023-02-06 12:49:06 +00:00
Patrick Cloke
b2d97bac09
Implement MSC3958: suppress notifications from edits (#14960)
Co-authored-by: Brad Murray <brad@beeper.com>
Co-authored-by: Nick Barrett <nick@beeper.com>

Copy the suppress_edits push rule from Beeper to implement MSC3958.

9415a1284b/rust/src/push/base_rules.rs (L98-L114)
2023-02-03 14:31:14 -05:00
Sean Quah
0a686d1d13
Faster joins: Refactor handling of servers in room (#14954)
Ensure that the list of servers in a partial state room always contains
the server we joined off.

Also refactor `get_partial_state_servers_at_join` to return `None` when
the given room is no longer partial stated, to explicitly indicate when
the room has partial state. Otherwise it's not clear whether an empty
list means that the room has full state, or the room is partial stated,
but the server we joined off told us that there are no servers in the
room.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-02-03 15:39:59 +00:00
David Robertson
2186ebed6c
Fetch fewer events when getting hosts in room (#14962) 2023-02-02 16:49:14 +00:00
Patrick Cloke
1182ae5063
Add helper to parse an enum from query args & use it. (#14956)
The `parse_enum` helper pulls an enum value from the query string
(by delegating down to the parse_string helper with values generated
from the enum).

This is used to pull out "f" and "b" in most places and then we thread
the resulting Direction enum throughout more code.
2023-02-01 21:35:24 +00:00
Patrick Cloke
230a831c73
Attempt to delete more duplicate rows in receipts_linearized table. (#14915)
The previous assumption was that the stream_id column was unique
(for a room ID, receipt type, user ID tuple), but this turned out to be
incorrect.

Now find the max stream ID, then map this back to a database-specific
row identifier and delete other rows which match the (room ID, receipt type,
user ID) tuple, but *not* the row ID.
2023-02-01 15:45:10 -05:00
David Robertson
796a4b7482
Prefer type(x) is int to isinstance(x, int) (#14945)
* Perfer `type(x) is int` to `isinstance(x, int)`

This covered all additional instances I could see where `x` was
user-controlled.
The remaining cases are

```
$ rg -s 'isinstance.*[^_]int'
tests/replication/_base.py
576:        if isinstance(obj, int):

synapse/util/caches/stream_change_cache.py
136:        assert isinstance(stream_pos, int)
214:        assert isinstance(stream_pos, int)
246:        assert isinstance(stream_pos, int)
267:        assert isinstance(stream_pos, int)

synapse/replication/tcp/external_cache.py
133:        if isinstance(result, int):

synapse/metrics/__init__.py
100:        if isinstance(calls, (int, float)):

synapse/handlers/appservice.py
262:        assert isinstance(new_token, int)

synapse/config/_util.py
62:        if isinstance(p, int):
```

which cover metrics, logic related to `jsonschema`, and replication and
data streams. AFAICS these are all internal to Synapse

* Changelog
2023-01-31 10:33:07 +00:00
Patrick Cloke
2a51f3ec36
Implement MSC3952: Intentional mentions (#14823)
MSC3952 defines push rules which searches for mentions in a list of
Matrix IDs in the event body, instead of searching the entire event
body for display name / local part.

This is implemented behind an experimental configuration flag and
does not yet implement the backwards compatibility pieces of the MSC.
2023-01-27 10:16:21 -05:00
David Robertson
faecc6c083
Merge branch 'release-v1.76' into develop 2023-01-27 13:01:18 +00:00
Patrick Cloke
265735db9d
Use an enum for direction. (#14927)
For better type safety we  use an enum instead of strings to
configure direction (backwards or forwards).
2023-01-27 07:27:55 -05:00
Patrick Cloke
345576bc34
Fix paginating /relations with a live token (#14866)
The `/relations` endpoint was not properly handle "live tokens"
(i.e sync tokens), to do this properly we abstract the code that
`/messages` has and re-use it.
2023-01-26 13:24:15 -05:00
Patrick Cloke
8a05d5de21
Batch look-ups to see if rooms are partial stated. (#14917)
* Batch look-ups to see if rooms are partial stated.

* Fix issues found in linting.

* Fix typo.

* Apply suggestions from code review

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>

* Clarify comments.

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>

* Also improve the cache size while we're at it

* is_partial_state_rooms -> is_partial_state_room_batched

* Run `black`

* Improve annotation for `simple_select_many_batch`

* Fix is_partial_state_room_batched impl

* Okay, _actually_ fix impl

* Update description.

* Update synapse/storage/databases/main/room.py

Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>

* Run black.

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
Co-authored-by: David Robertson <davidr@element.io>
2023-01-26 17:15:36 +00:00
Sean Quah
cf66d712c6
Fix initialization of _device_list_id_gen (#14914)
On startup, the `_device_list_id_gen` stream id generator is initialized
using the maximum stream id seen in a list of tables. When we started
populating the `device_list_remote_pending` table in #13913, we forgot
to add it to the aforementioned list of tables, so the stream id
generator can hand out old stream ids after a restart. The end result is
that Synapse can fail to handle device list update EDUs after a restart
when a partial state join is in progress.

Add the `device_list_remote_pending` table to the list of tables to
consider when initializing the `_device_list_id_gen` stream id generator.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-01-26 10:38:49 +00:00
David Robertson
4607be0b7b
Request partial joins by default (#14905)
* Request partial joins by default

This is a little sloppy, but we are trying to gain confidence in faster
joins in the upcoming RC.

Admins can still opt out by adding the following to their Synapse
config:

```yaml
experimental:
    faster_joins: false
```

We may revert this change before the release proper, depending on how
testing in the wild goes.

* Changelog

* Try to fix the backfill test failures

* Upgrade notes

* Postgres compat?
2023-01-24 15:28:20 +00:00
David Robertson
80d44060c9
Faster joins: omit partial rooms from eager syncs until the resync completes (#14870)
* Allow `AbstractSet` in `StrCollection`

Or else frozensets are excluded. This will be useful in an upcoming
commit where I plan to change a function that accepts `List[str]` to
accept `StrCollection` instead.

* `rooms_to_exclude` -> `rooms_to_exclude_globally`

I am about to make use of this exclusion mechanism to exclude rooms for
a specific user and a specific sync. This rename helps to clarify the
distinction between the global config and the rooms to exclude for a
specific sync.

* Better function names for internal sync methods

* Track a list of excluded rooms on SyncResultBuilder

I plan to feed a list of partially stated rooms for this sync to ignore

* Exclude partial state rooms during eager sync

using the mechanism established in the previous commit

* Track un-partial-state stream in sync tokens

So that we can work out which rooms have become fully-stated during a
given sync period.

* Fix mutation of `@cached` return value

This was fouling up a complement test added alongside this PR.
Excluding a room would mean the set of forgotten rooms in the cache
would be extended. This means that room could be erroneously considered
forgotten in the future.

Introduced in #12310, Synapse 1.57.0. I don't think this had any
user-visible side effects (until now).

* SyncResultBuilder: track rooms to force as newly joined

Similar plan as before. We've omitted rooms from certain sync responses;
now we establish the mechanism to reintroduce them into future syncs.

* Read new field, to present rooms as newly joined

* Force un-partial-stated rooms to be newly-joined

for eager incremental syncs only, provided they're still fully stated

* Notify user stream listeners to wake up long polling syncs

* Changelog

* Typo fix

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>

* Unnecessary list cast

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>

* Rephrase comment

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>

* Another comment

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>

* Fixup merge(?)

* Poke notifier when receiving un-partial-stated msg over replication

* Fixup merge whoops

Thanks MV :)

Co-authored-by: Mathieu Velen <mathieuv@matrix.org>

Co-authored-by: Mathieu Velten <mathieuv@matrix.org>
Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
2023-01-23 15:44:39 +00:00