Commit Graph

801 Commits

Author SHA1 Message Date
Erik Johnston
f6c2b0ec2e
Sliding sync: don't fetch room summary for named rooms. (#17683)
For rooms with a name we can skip fetching a full room summary, as we
don't need to calculate heroes, and instead just fetch the room counts
directly.

This also changes things to not return counts and heroes for non-joined
rooms. For left/banned rooms we were returning zero values anyway, and
for invite/knock rooms we don't really want to leak such information
(even if some of is included in the stripped state).
2024-09-11 13:16:57 +01:00
Quentin Gliech
7d52ce7d4b
Format files with Ruff (#17643)
I thought ruff check would also format, but it doesn't.

This runs ruff format in CI and dev scripts. The first commit is just a
run of `ruff format .` in the root directory.
2024-09-02 12:39:04 +01:00
Erik Johnston
e43c2b023e
Sliding sync: Store the per-connection state in the database. (#17599)
Based on #17600

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-08-29 16:26:58 +01:00
Eric Eastwood
1a6b718f8c
Sliding Sync: Pre-populate room data for quick filtering/sorting (#17512)
Pre-populate room data for quick filtering/sorting in the Sliding Sync
API

Spawning from
https://github.com/element-hq/synapse/pull/17450#discussion_r1697335578

This PR is acting as the Synapse version `N+1` step in the gradual
migration being tracked by
https://github.com/element-hq/synapse/issues/17623

Adding two new database tables:

- `sliding_sync_joined_rooms`: A table for storing room meta data that
the local server is still participating in. The info here can be shared
across all `Membership.JOIN`. Keyed on `(room_id)` and updated when the
relevant room current state changes or a new event is sent in the room.
- `sliding_sync_membership_snapshots`: A table for storing a snapshot of
room meta data at the time of the local user's membership. Keyed on
`(room_id, user_id)` and only updated when a user's membership in a room
changes.

Also adds background updates to populate these tables with all of the
existing data.


We want to have the guarantee that if a row exists in the sliding sync
tables, we are able to rely on it (accurate data). And if a row doesn't
exist, we use a fallback to get the same info until the background
updates fill in the rows or a new event comes in triggering it to be
fully inserted. This means we need a couple extra things in place until
we bump `SCHEMA_COMPAT_VERSION` and run the foreground update in the
`N+2` part of the gradual migration. For context on why we can't rely on
the tables without these things see [1].

1. On start-up, block until we clear out any rows for the rooms that
have had events since the max-`stream_ordering` of the
`sliding_sync_joined_rooms` table (compare to max-`stream_ordering` of
the `events` table). For `sliding_sync_membership_snapshots`, we can
compare to the max-`stream_ordering` of `local_current_membership`
- This accounts for when someone downgrades their Synapse version and
then upgrades it again. This will ensure that we don't have any
stale/out-of-date data in the
`sliding_sync_joined_rooms`/`sliding_sync_membership_snapshots` tables
since any new events sent in rooms would have also needed to be written
to the sliding sync tables. For example a new event needs to bump
`event_stream_ordering` in `sliding_sync_joined_rooms` table or some
state in the room changing (like the room name). Or another example of
someone's membership changing in a room affecting
`sliding_sync_membership_snapshots`.
1. Add another background update that will catch-up with any rows that
were just deleted from the sliding sync tables (based on the activity in
the `events`/`local_current_membership`). The rooms that need
recalculating are added to the
`sliding_sync_joined_rooms_to_recalculate` table.
1. Making sure rows are fully inserted. Instead of partially inserting,
we need to check if the row already exists and fully insert all data if
not.

All of this extra functionality can be removed once the
`SCHEMA_COMPAT_VERSION` is bumped with support for the new sliding sync
tables so people can no longer downgrade (the `N+2` part of the gradual
migration).


<details>
<summary><sup>[1]</sup></summary>

For `sliding_sync_joined_rooms`, since we partially insert rows as state
comes in, we can't rely on the existence of the row for a given
`room_id`. We can't even rely on looking at whether the background
update has finished. There could still be partial rows from when someone
reverted their Synapse version after the background update finished, had
some state changes (or new rooms), then upgraded again and more state
changes happen leaving a partial row.

For `sliding_sync_membership_snapshots`, we insert items as a whole
except for the `forgotten` column ~~so we can rely on rows existing and
just need to always use a fallback for the `forgotten` data. We can't
use the `forgotten` column in the table for the same reasons above about
`sliding_sync_joined_rooms`.~~ We could have an out-of-date membership
from when someone reverted their Synapse version. (same problems as
outlined for `sliding_sync_joined_rooms` above)

Discussed in an [internal
meeting](https://docs.google.com/document/d/1MnuvPkaCkT_wviSQZ6YKBjiWciCBFMd-7hxyCO-OCbQ/edit#bookmark=id.dz5x6ef4mxz7)

</details>


### TODO

 - [x] Update `stream_ordering`/`bump_stamp`
 - [x] Handle remote invites
 - [x] Handle state resets
- [x] Consider adding `sender` so we can filter `LEAVE` memberships and
distinguish from kicks.
     - [x] We should add it to be able to tell leaves from kicks 
- [x] Consider adding `tombstone` state to help address
https://github.com/element-hq/synapse/issues/17540
     - [x] We should add it `tombstone_successor_room_id`
- [x] Consider adding `forgotten` status to avoid extra
lookup/table-join on `room_memberships`
    - [x] We should add it
- [x] Background update to fill in values for all joined rooms and
non-join membership
 - [x] Clean-up tables when room is deleted
 - [ ] Make sure tables are useful to our use case
- First explored in
https://github.com/element-hq/synapse/compare/erikj/ss_use_new_tables
- Also explored in
76b5a576eb
 - [x] Plan for how can we use this with a fallback
     - See plan discussed above in main area of the issue description
- Discussed in an [internal
meeting](https://docs.google.com/document/d/1MnuvPkaCkT_wviSQZ6YKBjiWciCBFMd-7hxyCO-OCbQ/edit#bookmark=id.dz5x6ef4mxz7)
 - [x] Plan for how we can rely on this new table without a fallback
- Synapse version `N+1`: (this PR) Bump `SCHEMA_VERSION` to `87`. Add
new tables and background update to backfill all rows. Since this is a
new table, we don't have to add any `NOT VALID` constraints and validate
them when the background update completes. Read from new tables with a
fallback in cases where the rows aren't filled in yet.
- Synapse version `N+2`: Bump `SCHEMA_VERSION` to `88` and bump
`SCHEMA_COMPAT_VERSION` to `87` because we don't want people to
downgrade and miss writes while they are on an older version. Add a
foreground update to finish off the backfill so we can read from new
tables without the fallback. Application code can now rely on the new
tables being populated.
- Discussed in an [internal
meeting](https://docs.google.com/document/d/1MnuvPkaCkT_wviSQZ6YKBjiWciCBFMd-7hxyCO-OCbQ/edit#bookmark=id.hh7shg4cxdhj)




### Dev notes

```
SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.storage.test_events.SlidingSyncPrePopulatedTablesTestCase

SYNAPSE_POSTGRES=1 SYNAPSE_POSTGRES_USER=postgres SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.storage.test_events.SlidingSyncPrePopulatedTablesTestCase
```

```
SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.handlers.test_sliding_sync.FilterRoomsTestCase
```

Reference:

- [Development docs on background updates and worked examples of gradual
migrations

](1dfa59b238/docs/development/database_schema.md (background-updates))
- A real example of a gradual migration:
https://github.com/matrix-org/synapse/pull/15649#discussion_r1213779514
- Adding `rooms.creator` field that needed a background update to
backfill data, https://github.com/matrix-org/synapse/pull/10697
- Adding `rooms.room_version` that needed a background update to
backfill data, https://github.com/matrix-org/synapse/pull/6729
- Adding `room_stats_state.room_type` that needed a background update to
backfill data, https://github.com/matrix-org/synapse/pull/13031
- Tables from MSC2716: `insertion_events`, `insertion_event_edges`,
`insertion_event_extremities`, `batch_events`
- `current_state_events` updated in
`synapse/storage/databases/main/events.py`

---

```
persist_event (adds to queue)
_persist_event_batch
_persist_events_and_state_updates (assigns `stream_ordering` to events)
_persist_events_txn
	_store_event_txn
        _update_metadata_tables_txn
            _store_room_members_txn
	_update_current_state_txn
```

---

> Concatenated Indexes [...] (also known as multi-column, composite or
combined index)
>
> [...] key consists of multiple columns.
> 
> We can take advantage of the fact that the first index column is
always usable for searching
>
> *--
https://use-the-index-luke.com/sql/where-clause/the-equals-operator/concatenated-keys*

---

Dealing with `portdb` (`synapse/_scripts/synapse_port_db.py`),
https://github.com/element-hq/synapse/pull/17512#discussion_r1725998219

---

<details>
<summary>SQL queries:</summary>

Both of these are equivalent and work in SQLite and Postgres

Options 1:
```sql
WITH data_table (room_id, user_id, membership_event_id, membership, event_stream_ordering, {", ".join(insert_keys)}) AS (
    VALUES (
        ?, ?, ?,
        (SELECT membership FROM room_memberships WHERE event_id = ?),
        (SELECT stream_ordering FROM events WHERE event_id = ?),
        {", ".join("?" for _ in insert_values)}
    )
)
INSERT INTO sliding_sync_non_join_memberships
    (room_id, user_id, membership_event_id, membership, event_stream_ordering, {", ".join(insert_keys)})
SELECT * FROM data_table
WHERE membership != ?
ON CONFLICT (room_id, user_id)
DO UPDATE SET
    membership_event_id = EXCLUDED.membership_event_id,
    membership = EXCLUDED.membership,
    event_stream_ordering = EXCLUDED.event_stream_ordering,
    {", ".join(f"{key} = EXCLUDED.{key}" for key in insert_keys)}
```

Option 2:
```sql
INSERT INTO sliding_sync_non_join_memberships
    (room_id, user_id, membership_event_id, membership, event_stream_ordering, {", ".join(insert_keys)})
SELECT 
    column1 as room_id,
    column2 as user_id,
    column3 as membership_event_id,
    column4 as membership,
    column5 as event_stream_ordering,
    {", ".join("column" + str(i) for i in range(6, 6 + len(insert_keys)))}
FROM (
    VALUES (
        ?, ?, ?,
        (SELECT membership FROM room_memberships WHERE event_id = ?),
        (SELECT stream_ordering FROM events WHERE event_id = ?),
        {", ".join("?" for _ in insert_values)}
    )
) as v
WHERE membership != ?
ON CONFLICT (room_id, user_id)
DO UPDATE SET
    membership_event_id = EXCLUDED.membership_event_id,
    membership = EXCLUDED.membership,
    event_stream_ordering = EXCLUDED.event_stream_ordering,
    {", ".join(f"{key} = EXCLUDED.{key}" for key in insert_keys)}
```

If we don't need the `membership` condition, we could use:

```sql
INSERT INTO sliding_sync_non_join_memberships
    (room_id, membership_event_id, user_id, membership, event_stream_ordering, {", ".join(insert_keys)})
VALUES (
    ?, ?, ?,
    (SELECT membership FROM room_memberships WHERE event_id = ?),
    (SELECT stream_ordering FROM events WHERE event_id = ?),
    {", ".join("?" for _ in insert_values)}
)
ON CONFLICT (room_id, user_id)
DO UPDATE SET
    membership_event_id = EXCLUDED.membership_event_id,
    membership = EXCLUDED.membership,
    event_stream_ordering = EXCLUDED.event_stream_ordering,
    {", ".join(f"{key} = EXCLUDED.{key}" for key in insert_keys)}
```

</details>

### Pull Request Checklist

<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->

* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
  - Use markdown where necessary, mostly for `code blocks`.
  - End with either a period (.) or an exclamation mark (!).
  - Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [x] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct
(run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))

---------

Co-authored-by: Erik Johnston <erik@matrix.org>
2024-08-29 16:09:51 +01:00
Erik Johnston
f1e8d2d15a
Sliding Sync: Speed up getting receipts for initial rooms (#17592)
Let's only pull out the events we care about. Note that the index isn't
necessary here, as postgres is happy to scan the set of rooms for the
events.
2024-08-20 12:57:34 +01:00
Shay
dc8ddc6472
Prepare for authenticated media freeze (#17433)
As part of the rollout of
[MSC3916](https://github.com/matrix-org/matrix-spec-proposals/blob/main/proposals/3916-authentication-for-media.md)
this PR adds support for designating authenticated media and ensuring
that authenticated media is not served over unauthenticated endpoints.
2024-07-22 10:33:17 +01:00
Eric Eastwood
fa91655805
Return some room data in Sliding Sync /sync (#17320)
- Timeline events
 - Stripped `invite_state`

Based on [MSC3575](https://github.com/matrix-org/matrix-spec-proposals/pull/3575): Sliding Sync
2024-07-02 11:07:05 -05:00
Travis Ralston
f1c4dfb08b
Add report room API (MSC4151) (#17270)
https://github.com/matrix-org/matrix-spec-proposals/pull/4151

This is intended to be enabled by default for immediate use. When FCP is
complete, the unstable endpoint will be dropped and stable endpoint
supported instead - no backwards compatibility is expected for the
unstable endpoint.
2024-06-12 12:27:46 +02:00
Erik Johnston
8c4937b216
Fix bug where device lists would break sync (#17292)
If the stream ID in the unconverted table is ahead of the device lists
ID gen, then it can break all /sync requests that had an ID from ahead
of the table.

The fix is to make sure we add the unconverted table to the list of
tables we check at start up.

Broke in https://github.com/element-hq/synapse/pull/17229
2024-06-10 15:56:57 +01:00
Erik Johnston
d16910ca02
Replaces all usages of StreamIdGenerator with MultiWriterIdGenerator (#17229)
Replaces all usages of `StreamIdGenerator` with `MultiWriterIdGenerator`, which is safer.
2024-05-30 11:07:32 +00:00
Erik Johnston
225f378ffa
Clean out invalid destinations from outbox (#17242)
We started ensuring we only insert valid destinations:
https://github.com/element-hq/synapse/pull/17240
2024-05-30 11:25:24 +01:00
Shay
37558d5e4c
Add support for MSC3823 - Account Suspension (#17051) 2024-05-01 17:45:17 +01:00
Erik Johnston
55b0aa847a Fix GHSA-3h7q-rfh9-xm4v
Weakness in auth chain indexing allows DoS from remote room members
through disk fill and high CPU usage.

A remote Matrix user with malicious intent, sharing a room with Synapse
instances before 1.104.1, can dispatch specially crafted events to
exploit a weakness in how the auth chain cover index is calculated. This
can induce high CPU consumption and accumulate excessive data in the
database of such instances, resulting in a denial of service.

Servers in private federations, or those that do not federate, are not
affected.
2024-04-23 15:25:49 +01:00
Erik Johnston
d40878451c
Add forgotten schema delta (#17054)
This should have been in #17045. Whoops.
2024-04-09 13:03:41 +01:00
Erik Johnston
adf15c4f6b
Run ANALYZE after fiddling with stats (#16849)
Introduced in #16833

Fixes #16844
2024-01-24 13:57:12 +00:00
Erik Johnston
23740eaa3d
Correctly mention previous copyright (#16820)
During the migration the automated script to update the copyright
headers accidentally got rid of some of the existing copyright lines.
Reinstate them.
2024-01-23 11:26:48 +00:00
Erik Johnston
14c725f73b
Preparatory work for tweaking performance of auth chain lookups (#16833) 2024-01-23 11:26:27 +00:00
Erik Johnston
0455c40085 Update book location 2023-12-13 16:15:22 +00:00
Erik Johnston
eaad9bb156 Merge remote-tracking branch 'gitlab/clokep/license-license' into new_develop 2023-12-13 15:11:56 +00:00
David Robertson
44377f5ac0
Revert postgres logical replication deltaas
This reverts two commits:

    0bb8e418a4
    "Fix postgres schema after dropping old tables (#16730)"

and

    51e4e35653
    "Add a Postgres `REPLICA IDENTITY` to tables that do not have an implicit one. This should allow use of Postgres logical replication. (take  2, now with no added deadlocks!) (#16658)"

and also amends the changelog.
2023-12-05 16:10:48 +00:00
David Robertson
0bb8e418a4
Fix postgres schema after dropping old tables (#16730) 2023-12-05 11:08:40 +00:00
reivilibre
51e4e35653
Add a Postgres REPLICA IDENTITY to tables that do not have an implicit one. This should allow use of Postgres logical replication. (take 2, now with no added deadlocks!) (#16658)
* Add `ALTER TABLE ... REPLICA IDENTITY ...` for individual tables

We can't combine them into one file as it makes it likely to hit a deadlock

if Synapse is running, as it only takes one other transaction to access two

tables in a different order to the schema delta.

* Add notes

* Newsfile

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>

* Re-introduce REPLICA IDENTITY test

---------

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
2023-12-04 14:57:28 +00:00
Patrick Cloke
579c6be5f6
Drop unused tables & unneeded access token ID for events. (#16522) 2023-12-01 10:12:00 +00:00
Patrick Cloke
d199b84006
Remove old full schema dumps. (#16697)
These are not useful and make it difficult to search for
table definitions, etc.
2023-11-28 07:28:07 -05:00
Patrick Cloke
8e1e62c9e0 Update license headers 2023-11-21 15:29:58 -05:00
Erik Johnston
9c02ef21e0
Speed up purge room by adding index (#16657)
What it says on the tin
2023-11-17 14:15:44 +00:00
Erik Johnston
4d6b800385
Revert "Fix test not detecting tables with missing primary keys and missing replica identities, then add more replica identities. (#16647)" (#16652)
This reverts commit 830988ae72.
2023-11-16 16:57:26 +00:00
Erik Johnston
ef5329a9f9
Revert "Add a Postgres REPLICA IDENTITY to tables that do not have an implicit one. This should allow use of Postgres logical replication. (#16456)" (#16651)
This reverts commit 69afe3f7a0.
2023-11-16 16:48:48 +00:00
reivilibre
830988ae72
Fix test not detecting tables with missing primary keys and missing replica identities, then add more replica identities. (#16647)
* Fix the CI query that did not detect all cases of missing primary keys

* Add more missing REPLICA IDENTITY entries

* Newsfile

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>

---------

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
2023-11-16 12:26:27 +00:00
David Robertson
43d1aa75e8
Add an Admin API to temporarily grant the ability to update an existing cross-signing key without UIA (#16634) 2023-11-15 17:28:10 +00:00
Patrick Cloke
f2f2c7c1f0
Use full GitHub links instead of bare issue numbers. (#16637) 2023-11-15 08:02:11 -05:00
reivilibre
69afe3f7a0
Add a Postgres REPLICA IDENTITY to tables that do not have an implicit one. This should allow use of Postgres logical replication. (#16456)
* Add Postgres replica identities to tables that don't have an implicit one

Fixes #16224

* Newsfile

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>

* Move the delta to version 83 as we missed the boat for 82

* Add a test that all tables have a REPLICA IDENTITY

* Extend the test to include when indices are deleted

* isort

* black

* Fully qualify `oid` as it is a 'hidden attribute' in Postgres 11

* Update tests/storage/test_database.py

Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>

* Add missed tables

---------

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
2023-11-13 16:03:22 +00:00
Erik Johnston
ba47fea528
Allow multiple workers to write to receipts stream. (#16432)
Fixes #16417
2023-10-25 16:16:19 +01:00
Patrick Cloke
12ca87f5ea
Remove the last reference to event_txn_id. (#16521)
This table was no longer used, except for a background process
which purged old entries in it.
2023-10-23 07:37:45 -04:00
Erik Johnston
e9069c9f91
Mark sync as limited if there is a gap in the timeline (#16485)
This splits thinsg into two queries, but most of the time we won't have
new event backwards extremities so this shouldn't actually add an extra
RTT for the majority of cases.

Note this removes the check for events with no prev events, but that was
part of MSC2716 work that has since been removed.
2023-10-19 15:04:18 +01:00
Patrick Cloke
4cc729d480
Revert "Drop unused tables & unneeded access token ID for events. (#16268)" (#16465)
This reverts commit cabd577460.

There are additional usages of these tables
which need to be removed first.
2023-10-12 08:56:10 -04:00
David Robertson
28fd28e92e
Add DB indices to speed up purging rooms (#16457) 2023-10-10 10:33:39 +01:00
Patrick Cloke
cabd577460
Drop unused tables & unneeded access token ID for events. (#16268)
Drop the event_txn_id table and the tables related to MSC2716,
which is no longer supported in Synapse.
2023-10-06 08:29:33 -04:00
Erik Johnston
329597022e
Some minor performance fixes for task schedular (#16313) 2023-09-14 16:20:47 +01:00
Patrick Cloke
16ef6f1e3c
Stop purging tables which are slated for removal. (#16273) 2023-09-12 07:12:31 -04:00
Mathieu Velten
4f1840a88a
Delete device messages asynchronously and in staged batches (#16240) 2023-09-06 09:30:53 +02:00
Patrick Cloke
ebd8374fb5
Stop writing to the event_txn_id table (#16175) 2023-08-30 11:10:56 +01:00
Erik Johnston
18279631e9
Fix rare deadlock when using read/write locks (#16169) 2023-08-23 16:24:30 +01:00
Erik Johnston
4adaba9acf
Fix rare deadlock when using read/write locks (#16133) 2023-08-23 13:45:25 +01:00
Erik Johnston
7cd79ce051
Reduce DB contention on worker locks (#16160) 2023-08-23 13:45:19 +01:00
Erik Johnston
3b3fed7229
Increase perf of read/write locks (#16149)
We do this by marking the tables as `UNLOGGED` in PostgreSQL.
2023-08-23 09:23:22 +01:00
Mathieu Velten
358896e1b8
Implements a task scheduler for resumable potentially long running tasks (#15891) 2023-08-21 14:17:13 +02:00
Mathieu Velten
dac97642e4
Implements admin API to lock an user (MSC3939) (#15870) 2023-08-10 09:10:55 +00:00
Patrick Cloke
d98a43d922
Stabilize support for MSC3970: updated transaction semantics (scope to device_id) (#15629)
For now this maintains compatible with old Synapses by falling back
to using transaction semantics on a per-access token. A future version
of Synapse will drop support for this.
2023-08-04 07:47:18 -04:00
Mathieu Velten
8ebfd577e2
Bump DB version to 79 since synapse v1.88 was already there (#15998) 2023-07-26 14:51:44 +02:00