Commit Graph

24174 Commits

Author SHA1 Message Date
Erik Johnston
e5d07bb083
Fix bump stamp for non-joined rooms (#17674)
We should only look for bump stamps in joined rooms, otherwise we should
just use the membership stream ordering.
2024-09-06 11:44:37 +01:00
Erik Johnston
a708e1afd0
Small performance improvements for sliding sync (#17672)
A couple of small performance improvements for sliding sync.
2024-09-06 11:44:13 +01:00
Erik Johnston
786de8570b
Speed up fetching partial-state rooms on sliding sync (#17666)
Instead of having a large cache of `room_id -> bool` about whether a
room is partially stated, replace with a "fetch rooms the user is which
are partially-stated". This is a lot faster as the set of partially
stated rooms at any point across the whole server is small, and so such
a query is fast.

The main issue with the bulk cache lookup is the CPU time looking all
the rooms up in the cache.
2024-09-06 11:12:54 +01:00
Erik Johnston
d5accec2e5
Speed up sliding sync by avoiding copies (#17670)
We ended up spending ~10% CPU creating a new dictionary and
`_RoomMembershipForUser`, so let's avoid creating new dicts and copying
by returning `newly_joined`, `newly_left` and `is_dm` as sets directly.

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-09-06 11:12:29 +01:00
Johannes Marbach
de3363ef58
Stabilise MSC4156: server_name -> via (#17650) 2024-09-05 17:07:39 +01:00
Erik Johnston
6b770d8bfc Revert "Fix bump stamp for non-joined rooms"
This reverts commit f73c844403.
2024-09-05 15:43:37 +01:00
Erik Johnston
f73c844403 Fix bump stamp for non-joined rooms
We should only look for bump stamps in joined rooms, otherwise we should
just use the membership stream ordering.
2024-09-05 15:42:49 +01:00
Erik Johnston
b09bcf16d9
Fix background update to handle invalid events (#17641)
Follow-up to #17634, https://github.com/element-hq/synapse/pull/17631
and https://github.com/element-hq/synapse/pull/17632 to fix-up
https://github.com/element-hq/synapse/pull/17512
2024-09-05 14:15:04 +01:00
Eric Eastwood
b054690c8c
Sliding Sync: Prevent duplicate tags being added to traces (#17655)
Prevent duplicate tags being added to traces.

Noticed because we see these warnings in Jaeger:

<img width="462" alt="Screenshot 2024-09-03 at 2 34 05 PM"
src="https://github.com/user-attachments/assets/6fac12ed-0074-435b-9451-eccde7e7012a">
2024-09-05 10:05:01 +01:00
Erik Johnston
dce38f3faf
Fix sliding sync on workers (#17649)
Broke in #17630

---------

Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
2024-09-04 10:52:46 +01:00
dependabot[bot]
fc10d38849
Bump twisted from 24.7.0rc1 to 24.7.0 (#17647)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-03 18:48:43 +01:00
dependabot[bot]
4255c03599
Bump types-psycopg2 from 2.9.21.20240417 to 2.9.21.20240819 (#17646)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-03 18:38:01 +01:00
dependabot[bot]
c24cce73a1
Bump towncrier from 24.7.1 to 24.8.0 (#17645)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-03 18:37:30 +01:00
dependabot[bot]
1c5d2a4197
Bump types-pillow from 10.2.0.20240520 to 10.2.0.20240822 (#17644)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-03 18:19:42 +01:00
Erik Johnston
391c4f870b Merge remote-tracking branch 'origin/release-v1.114' into develop 2024-09-02 20:58:49 +01:00
Erik Johnston
5eec67b6ef Fix changelog 2024-09-02 17:08:34 +01:00
Erik Johnston
6722adf04e Update changelog 2024-09-02 16:27:13 +01:00
Erik Johnston
ac27c9e46a 1.114.0 2024-09-02 15:14:57 +01:00
Erik Johnston
f729ef08c9
Enable sliding sync support by default (#17648)
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
2024-09-02 15:09:04 +01:00
Quentin Gliech
7d52ce7d4b
Format files with Ruff (#17643)
I thought ruff check would also format, but it doesn't.

This runs ruff format in CI and dev scripts. The first commit is just a
run of `ruff format .` in the root directory.
2024-09-02 12:39:04 +01:00
Erik Johnston
709b7363fe
Sliding sync: use new DB tables (#17630)
Based on https://github.com/element-hq/synapse/pull/17629

Utilizing the new sliding sync tables added in
https://github.com/element-hq/synapse/pull/17512 for fast acquisition of
rooms for the user and filtering/sorting.

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-09-01 11:25:39 +01:00
Erik Johnston
560b43ac02
Sliding Sync: Split up get_room_membership_for_user_at_to_token (#17629)
This is to make it easier to reuse the logic when adding support for the
new tables

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-09-01 10:52:03 +01:00
Erik Johnston
8b6ff1dba5 Revert "Also handle invalid event errors"
This reverts commit b4d0356e48.
2024-09-01 10:43:26 +01:00
Erik Johnston
b4d0356e48 Also handle invalid event errors 2024-09-01 10:42:49 +01:00
Erik Johnston
d52c17ce01
Sliding sync: various fixes to background update (#17636)
Follows on from #17512, other fixes include: #17633, #17634, #17635
2024-09-01 10:18:45 +01:00
Erik Johnston
966a50bb63 Fixup changelog 2024-08-30 16:38:53 +01:00
Erik Johnston
d6125c583d 1.114.0rc3 2024-08-30 16:38:08 +01:00
Erik Johnston
da58e55a0b Fix starting non-media repos (#17626)
Regressed in #17543.

The `max_download_size` config is not available on workers that don't
load the media repo.

Besides, we should honour the max_size param that was passed into the
function.
2024-08-30 16:37:11 +01:00
Erik Johnston
a5a454fc35 Fixup changelog 2024-08-30 15:39:53 +01:00
Erik Johnston
1caff75526 Fixup changelog 2024-08-30 15:36:52 +01:00
Erik Johnston
7b75922020 1.114.0rc2 2024-08-30 15:35:18 +01:00
Quentin Gliech
26c1330764 Replace isort and black with ruff (#17620)
Ruff now has decent parity with black and isort, so this is going to just save us a bunch of time
2024-08-30 15:32:43 +01:00
Quentin Gliech
48303fcbcc MSC3861: load the issuer and account management URLs from OIDC discovery (#17407)
This will help mitigating any discrepancies between the issuer
configured and the one returned by the OIDC provider.

This also removes the need for configuring the `account_management_url`
explicitely, as it will now be loaded from the OIDC discovery, as per
MSC2965.

Because we may now fetch stuff for the .well-known/matrix/client
endpoint, this also transforms the client well-known resource to be
asynchronous.
2024-08-30 15:31:51 +01:00
Michael Telatynski
53a3783750 Use custom stage UIA error for MAS cross-signing reset (#17509)
Rather than 501 M_UNRECOGNISED

Client side implementation at
https://github.com/matrix-org/matrix-react-sdk/pull/12892/
2024-08-30 15:31:51 +01:00
Erik Johnston
b913aaa788 Sliding sync: Store the per-connection state in the database. (#17599)
Based on #17600

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-08-30 15:31:05 +01:00
Erik Johnston
dab88a7b1f Sliding Sync: Make PerConnectionState immutable (#17600)
This is so that we can cache it.

We also move the sliding sync types to
`synapse/types/handlers/sliding_sync.py`. This is mainly in-prep for

The only change in behaviour is that
`RoomSyncConfig.combine_sync_config(..)` now returns a new room sync
config rather than mutating in-place.

Reviewable commit-by-commit.

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-08-30 15:29:07 +01:00
Quentin Gliech
ca69d0f571
MSC3861: load the issuer and account management URLs from OIDC discovery (#17407)
This will help mitigating any discrepancies between the issuer
configured and the one returned by the OIDC provider.

This also removes the need for configuring the `account_management_url`
explicitely, as it will now be loaded from the OIDC discovery, as per
MSC2965.

Because we may now fetch stuff for the .well-known/matrix/client
endpoint, this also transforms the client well-known resource to be
asynchronous.
2024-08-30 14:04:08 +00:00
Michael Telatynski
02ebcf7725
Use custom stage UIA error for MAS cross-signing reset (#17509)
Rather than 501 M_UNRECOGNISED

Client side implementation at
https://github.com/matrix-org/matrix-react-sdk/pull/12892/
2024-08-30 14:52:57 +02:00
Quentin Gliech
cdd5979129
Replace isort and black with ruff (#17620)
Ruff now has decent parity with black and isort, so this is going to just save us a bunch of time
2024-08-30 10:07:46 +02:00
Erik Johnston
89801e04ca
Sliding sync: Ignore tables with no create event in current state (#17633) 2024-08-30 08:54:14 +01:00
Erik Johnston
7098d47f29
Sliding sync: Fix bg update again (v3) (#17634)
Follow-up to https://github.com/element-hq/synapse/pull/17631 and
https://github.com/element-hq/synapse/pull/17632 to fix-up
https://github.com/element-hq/synapse/pull/17599

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-08-30 08:54:07 +01:00
Eric Eastwood
26f81fb5be
Sliding Sync: Fix outlier re-persisting causing problems with sliding sync tables (#17635)
Fix outlier re-persisting causing problems with sliding sync tables

Follow-up to https://github.com/element-hq/synapse/pull/17512

When running on `matrix.org`, we discovered that a remote invite is
first persisted as an `outlier` and then re-persisted again where it is
de-outliered. The first the time, the `outlier` is persisted with one
`stream_ordering` but when persisted again and de-outliered, it is
assigned a different `stream_ordering` that won't end up being used.
Since we call `_calculate_sliding_sync_table_changes()` before
`_update_outliers_txn()` which fixes this discrepancy (always use the
`stream_ordering` from the first time it was persisted), we're working
with an unreliable `stream_ordering` value that will possibly be unused
and not make it into the `events` table.
2024-08-30 08:53:57 +01:00
Erik Johnston
d844afdc29
Fix background update for sliding sync (find previous membership) (#17632)
This reverts commit
ab414f2ab8.

Introduced in https://github.com/element-hq/synapse/pull/17512
2024-08-29 19:16:39 +01:00
Erik Johnston
bb80894391
Fix background update for sliding sync (#17631)
This reverts commit ab414f2ab8a294fbffb417003eeea0f14bbd6588.

Introduced in https://github.com/element-hq/synapse/pull/17599
2024-08-29 16:58:53 +01:00
Erik Johnston
e43c2b023e
Sliding sync: Store the per-connection state in the database. (#17599)
Based on #17600

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-08-29 16:26:58 +01:00
Erik Johnston
2999a14aed
Sliding Sync: Make PerConnectionState immutable (#17600)
This is so that we can cache it.

We also move the sliding sync types to
`synapse/types/handlers/sliding_sync.py`. This is mainly in-prep for
#17599 to avoid circular imports.

The only change in behaviour is that
`RoomSyncConfig.combine_sync_config(..)` now returns a new room sync
config rather than mutating in-place.

Reviewable commit-by-commit.

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-08-29 16:22:57 +01:00
Eric Eastwood
1a6b718f8c
Sliding Sync: Pre-populate room data for quick filtering/sorting (#17512)
Pre-populate room data for quick filtering/sorting in the Sliding Sync
API

Spawning from
https://github.com/element-hq/synapse/pull/17450#discussion_r1697335578

This PR is acting as the Synapse version `N+1` step in the gradual
migration being tracked by
https://github.com/element-hq/synapse/issues/17623

Adding two new database tables:

- `sliding_sync_joined_rooms`: A table for storing room meta data that
the local server is still participating in. The info here can be shared
across all `Membership.JOIN`. Keyed on `(room_id)` and updated when the
relevant room current state changes or a new event is sent in the room.
- `sliding_sync_membership_snapshots`: A table for storing a snapshot of
room meta data at the time of the local user's membership. Keyed on
`(room_id, user_id)` and only updated when a user's membership in a room
changes.

Also adds background updates to populate these tables with all of the
existing data.


We want to have the guarantee that if a row exists in the sliding sync
tables, we are able to rely on it (accurate data). And if a row doesn't
exist, we use a fallback to get the same info until the background
updates fill in the rows or a new event comes in triggering it to be
fully inserted. This means we need a couple extra things in place until
we bump `SCHEMA_COMPAT_VERSION` and run the foreground update in the
`N+2` part of the gradual migration. For context on why we can't rely on
the tables without these things see [1].

1. On start-up, block until we clear out any rows for the rooms that
have had events since the max-`stream_ordering` of the
`sliding_sync_joined_rooms` table (compare to max-`stream_ordering` of
the `events` table). For `sliding_sync_membership_snapshots`, we can
compare to the max-`stream_ordering` of `local_current_membership`
- This accounts for when someone downgrades their Synapse version and
then upgrades it again. This will ensure that we don't have any
stale/out-of-date data in the
`sliding_sync_joined_rooms`/`sliding_sync_membership_snapshots` tables
since any new events sent in rooms would have also needed to be written
to the sliding sync tables. For example a new event needs to bump
`event_stream_ordering` in `sliding_sync_joined_rooms` table or some
state in the room changing (like the room name). Or another example of
someone's membership changing in a room affecting
`sliding_sync_membership_snapshots`.
1. Add another background update that will catch-up with any rows that
were just deleted from the sliding sync tables (based on the activity in
the `events`/`local_current_membership`). The rooms that need
recalculating are added to the
`sliding_sync_joined_rooms_to_recalculate` table.
1. Making sure rows are fully inserted. Instead of partially inserting,
we need to check if the row already exists and fully insert all data if
not.

All of this extra functionality can be removed once the
`SCHEMA_COMPAT_VERSION` is bumped with support for the new sliding sync
tables so people can no longer downgrade (the `N+2` part of the gradual
migration).


<details>
<summary><sup>[1]</sup></summary>

For `sliding_sync_joined_rooms`, since we partially insert rows as state
comes in, we can't rely on the existence of the row for a given
`room_id`. We can't even rely on looking at whether the background
update has finished. There could still be partial rows from when someone
reverted their Synapse version after the background update finished, had
some state changes (or new rooms), then upgraded again and more state
changes happen leaving a partial row.

For `sliding_sync_membership_snapshots`, we insert items as a whole
except for the `forgotten` column ~~so we can rely on rows existing and
just need to always use a fallback for the `forgotten` data. We can't
use the `forgotten` column in the table for the same reasons above about
`sliding_sync_joined_rooms`.~~ We could have an out-of-date membership
from when someone reverted their Synapse version. (same problems as
outlined for `sliding_sync_joined_rooms` above)

Discussed in an [internal
meeting](https://docs.google.com/document/d/1MnuvPkaCkT_wviSQZ6YKBjiWciCBFMd-7hxyCO-OCbQ/edit#bookmark=id.dz5x6ef4mxz7)

</details>


### TODO

 - [x] Update `stream_ordering`/`bump_stamp`
 - [x] Handle remote invites
 - [x] Handle state resets
- [x] Consider adding `sender` so we can filter `LEAVE` memberships and
distinguish from kicks.
     - [x] We should add it to be able to tell leaves from kicks 
- [x] Consider adding `tombstone` state to help address
https://github.com/element-hq/synapse/issues/17540
     - [x] We should add it `tombstone_successor_room_id`
- [x] Consider adding `forgotten` status to avoid extra
lookup/table-join on `room_memberships`
    - [x] We should add it
- [x] Background update to fill in values for all joined rooms and
non-join membership
 - [x] Clean-up tables when room is deleted
 - [ ] Make sure tables are useful to our use case
- First explored in
https://github.com/element-hq/synapse/compare/erikj/ss_use_new_tables
- Also explored in
76b5a576eb
 - [x] Plan for how can we use this with a fallback
     - See plan discussed above in main area of the issue description
- Discussed in an [internal
meeting](https://docs.google.com/document/d/1MnuvPkaCkT_wviSQZ6YKBjiWciCBFMd-7hxyCO-OCbQ/edit#bookmark=id.dz5x6ef4mxz7)
 - [x] Plan for how we can rely on this new table without a fallback
- Synapse version `N+1`: (this PR) Bump `SCHEMA_VERSION` to `87`. Add
new tables and background update to backfill all rows. Since this is a
new table, we don't have to add any `NOT VALID` constraints and validate
them when the background update completes. Read from new tables with a
fallback in cases where the rows aren't filled in yet.
- Synapse version `N+2`: Bump `SCHEMA_VERSION` to `88` and bump
`SCHEMA_COMPAT_VERSION` to `87` because we don't want people to
downgrade and miss writes while they are on an older version. Add a
foreground update to finish off the backfill so we can read from new
tables without the fallback. Application code can now rely on the new
tables being populated.
- Discussed in an [internal
meeting](https://docs.google.com/document/d/1MnuvPkaCkT_wviSQZ6YKBjiWciCBFMd-7hxyCO-OCbQ/edit#bookmark=id.hh7shg4cxdhj)




### Dev notes

```
SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.storage.test_events.SlidingSyncPrePopulatedTablesTestCase

SYNAPSE_POSTGRES=1 SYNAPSE_POSTGRES_USER=postgres SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.storage.test_events.SlidingSyncPrePopulatedTablesTestCase
```

```
SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.handlers.test_sliding_sync.FilterRoomsTestCase
```

Reference:

- [Development docs on background updates and worked examples of gradual
migrations

](1dfa59b238/docs/development/database_schema.md (background-updates))
- A real example of a gradual migration:
https://github.com/matrix-org/synapse/pull/15649#discussion_r1213779514
- Adding `rooms.creator` field that needed a background update to
backfill data, https://github.com/matrix-org/synapse/pull/10697
- Adding `rooms.room_version` that needed a background update to
backfill data, https://github.com/matrix-org/synapse/pull/6729
- Adding `room_stats_state.room_type` that needed a background update to
backfill data, https://github.com/matrix-org/synapse/pull/13031
- Tables from MSC2716: `insertion_events`, `insertion_event_edges`,
`insertion_event_extremities`, `batch_events`
- `current_state_events` updated in
`synapse/storage/databases/main/events.py`

---

```
persist_event (adds to queue)
_persist_event_batch
_persist_events_and_state_updates (assigns `stream_ordering` to events)
_persist_events_txn
	_store_event_txn
        _update_metadata_tables_txn
            _store_room_members_txn
	_update_current_state_txn
```

---

> Concatenated Indexes [...] (also known as multi-column, composite or
combined index)
>
> [...] key consists of multiple columns.
> 
> We can take advantage of the fact that the first index column is
always usable for searching
>
> *--
https://use-the-index-luke.com/sql/where-clause/the-equals-operator/concatenated-keys*

---

Dealing with `portdb` (`synapse/_scripts/synapse_port_db.py`),
https://github.com/element-hq/synapse/pull/17512#discussion_r1725998219

---

<details>
<summary>SQL queries:</summary>

Both of these are equivalent and work in SQLite and Postgres

Options 1:
```sql
WITH data_table (room_id, user_id, membership_event_id, membership, event_stream_ordering, {", ".join(insert_keys)}) AS (
    VALUES (
        ?, ?, ?,
        (SELECT membership FROM room_memberships WHERE event_id = ?),
        (SELECT stream_ordering FROM events WHERE event_id = ?),
        {", ".join("?" for _ in insert_values)}
    )
)
INSERT INTO sliding_sync_non_join_memberships
    (room_id, user_id, membership_event_id, membership, event_stream_ordering, {", ".join(insert_keys)})
SELECT * FROM data_table
WHERE membership != ?
ON CONFLICT (room_id, user_id)
DO UPDATE SET
    membership_event_id = EXCLUDED.membership_event_id,
    membership = EXCLUDED.membership,
    event_stream_ordering = EXCLUDED.event_stream_ordering,
    {", ".join(f"{key} = EXCLUDED.{key}" for key in insert_keys)}
```

Option 2:
```sql
INSERT INTO sliding_sync_non_join_memberships
    (room_id, user_id, membership_event_id, membership, event_stream_ordering, {", ".join(insert_keys)})
SELECT 
    column1 as room_id,
    column2 as user_id,
    column3 as membership_event_id,
    column4 as membership,
    column5 as event_stream_ordering,
    {", ".join("column" + str(i) for i in range(6, 6 + len(insert_keys)))}
FROM (
    VALUES (
        ?, ?, ?,
        (SELECT membership FROM room_memberships WHERE event_id = ?),
        (SELECT stream_ordering FROM events WHERE event_id = ?),
        {", ".join("?" for _ in insert_values)}
    )
) as v
WHERE membership != ?
ON CONFLICT (room_id, user_id)
DO UPDATE SET
    membership_event_id = EXCLUDED.membership_event_id,
    membership = EXCLUDED.membership,
    event_stream_ordering = EXCLUDED.event_stream_ordering,
    {", ".join(f"{key} = EXCLUDED.{key}" for key in insert_keys)}
```

If we don't need the `membership` condition, we could use:

```sql
INSERT INTO sliding_sync_non_join_memberships
    (room_id, membership_event_id, user_id, membership, event_stream_ordering, {", ".join(insert_keys)})
VALUES (
    ?, ?, ?,
    (SELECT membership FROM room_memberships WHERE event_id = ?),
    (SELECT stream_ordering FROM events WHERE event_id = ?),
    {", ".join("?" for _ in insert_values)}
)
ON CONFLICT (room_id, user_id)
DO UPDATE SET
    membership_event_id = EXCLUDED.membership_event_id,
    membership = EXCLUDED.membership,
    event_stream_ordering = EXCLUDED.event_stream_ordering,
    {", ".join(f"{key} = EXCLUDED.{key}" for key in insert_keys)}
```

</details>

### Pull Request Checklist

<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->

* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
  - Use markdown where necessary, mostly for `code blocks`.
  - End with either a period (.) or an exclamation mark (!).
  - Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [x] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct
(run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))

---------

Co-authored-by: Erik Johnston <erik@matrix.org>
2024-08-29 16:09:51 +01:00
Gordan Trevis
594cd5f9fd
Fix Internal Server Error for Non-Local Users in Room Actions (#17607) 2024-08-29 14:34:29 +00:00
Erik Johnston
b21134de3b
Fix starting non-media repos (#17626)
Regressed in #17543.

The `max_download_size` config is not available on workers that don't
load the media repo.

Besides, we should honour the max_size param that was passed into the
function.
2024-08-29 12:26:17 +00:00
meise
a8f29c9913
docs: fix typo in saml2_config example (#17594) 2024-08-29 10:39:16 +00:00