Commit Graph

83 Commits

Author SHA1 Message Date
Quentin Gliech
7d52ce7d4b
Format files with Ruff (#17643)
I thought ruff check would also format, but it doesn't.

This runs ruff format in CI and dev scripts. The first commit is just a
run of `ruff format .` in the root directory.
2024-09-02 12:39:04 +01:00
Eric Eastwood
1a6b718f8c
Sliding Sync: Pre-populate room data for quick filtering/sorting (#17512)
Pre-populate room data for quick filtering/sorting in the Sliding Sync
API

Spawning from
https://github.com/element-hq/synapse/pull/17450#discussion_r1697335578

This PR is acting as the Synapse version `N+1` step in the gradual
migration being tracked by
https://github.com/element-hq/synapse/issues/17623

Adding two new database tables:

- `sliding_sync_joined_rooms`: A table for storing room meta data that
the local server is still participating in. The info here can be shared
across all `Membership.JOIN`. Keyed on `(room_id)` and updated when the
relevant room current state changes or a new event is sent in the room.
- `sliding_sync_membership_snapshots`: A table for storing a snapshot of
room meta data at the time of the local user's membership. Keyed on
`(room_id, user_id)` and only updated when a user's membership in a room
changes.

Also adds background updates to populate these tables with all of the
existing data.


We want to have the guarantee that if a row exists in the sliding sync
tables, we are able to rely on it (accurate data). And if a row doesn't
exist, we use a fallback to get the same info until the background
updates fill in the rows or a new event comes in triggering it to be
fully inserted. This means we need a couple extra things in place until
we bump `SCHEMA_COMPAT_VERSION` and run the foreground update in the
`N+2` part of the gradual migration. For context on why we can't rely on
the tables without these things see [1].

1. On start-up, block until we clear out any rows for the rooms that
have had events since the max-`stream_ordering` of the
`sliding_sync_joined_rooms` table (compare to max-`stream_ordering` of
the `events` table). For `sliding_sync_membership_snapshots`, we can
compare to the max-`stream_ordering` of `local_current_membership`
- This accounts for when someone downgrades their Synapse version and
then upgrades it again. This will ensure that we don't have any
stale/out-of-date data in the
`sliding_sync_joined_rooms`/`sliding_sync_membership_snapshots` tables
since any new events sent in rooms would have also needed to be written
to the sliding sync tables. For example a new event needs to bump
`event_stream_ordering` in `sliding_sync_joined_rooms` table or some
state in the room changing (like the room name). Or another example of
someone's membership changing in a room affecting
`sliding_sync_membership_snapshots`.
1. Add another background update that will catch-up with any rows that
were just deleted from the sliding sync tables (based on the activity in
the `events`/`local_current_membership`). The rooms that need
recalculating are added to the
`sliding_sync_joined_rooms_to_recalculate` table.
1. Making sure rows are fully inserted. Instead of partially inserting,
we need to check if the row already exists and fully insert all data if
not.

All of this extra functionality can be removed once the
`SCHEMA_COMPAT_VERSION` is bumped with support for the new sliding sync
tables so people can no longer downgrade (the `N+2` part of the gradual
migration).


<details>
<summary><sup>[1]</sup></summary>

For `sliding_sync_joined_rooms`, since we partially insert rows as state
comes in, we can't rely on the existence of the row for a given
`room_id`. We can't even rely on looking at whether the background
update has finished. There could still be partial rows from when someone
reverted their Synapse version after the background update finished, had
some state changes (or new rooms), then upgraded again and more state
changes happen leaving a partial row.

For `sliding_sync_membership_snapshots`, we insert items as a whole
except for the `forgotten` column ~~so we can rely on rows existing and
just need to always use a fallback for the `forgotten` data. We can't
use the `forgotten` column in the table for the same reasons above about
`sliding_sync_joined_rooms`.~~ We could have an out-of-date membership
from when someone reverted their Synapse version. (same problems as
outlined for `sliding_sync_joined_rooms` above)

Discussed in an [internal
meeting](https://docs.google.com/document/d/1MnuvPkaCkT_wviSQZ6YKBjiWciCBFMd-7hxyCO-OCbQ/edit#bookmark=id.dz5x6ef4mxz7)

</details>


### TODO

 - [x] Update `stream_ordering`/`bump_stamp`
 - [x] Handle remote invites
 - [x] Handle state resets
- [x] Consider adding `sender` so we can filter `LEAVE` memberships and
distinguish from kicks.
     - [x] We should add it to be able to tell leaves from kicks 
- [x] Consider adding `tombstone` state to help address
https://github.com/element-hq/synapse/issues/17540
     - [x] We should add it `tombstone_successor_room_id`
- [x] Consider adding `forgotten` status to avoid extra
lookup/table-join on `room_memberships`
    - [x] We should add it
- [x] Background update to fill in values for all joined rooms and
non-join membership
 - [x] Clean-up tables when room is deleted
 - [ ] Make sure tables are useful to our use case
- First explored in
https://github.com/element-hq/synapse/compare/erikj/ss_use_new_tables
- Also explored in
76b5a576eb
 - [x] Plan for how can we use this with a fallback
     - See plan discussed above in main area of the issue description
- Discussed in an [internal
meeting](https://docs.google.com/document/d/1MnuvPkaCkT_wviSQZ6YKBjiWciCBFMd-7hxyCO-OCbQ/edit#bookmark=id.dz5x6ef4mxz7)
 - [x] Plan for how we can rely on this new table without a fallback
- Synapse version `N+1`: (this PR) Bump `SCHEMA_VERSION` to `87`. Add
new tables and background update to backfill all rows. Since this is a
new table, we don't have to add any `NOT VALID` constraints and validate
them when the background update completes. Read from new tables with a
fallback in cases where the rows aren't filled in yet.
- Synapse version `N+2`: Bump `SCHEMA_VERSION` to `88` and bump
`SCHEMA_COMPAT_VERSION` to `87` because we don't want people to
downgrade and miss writes while they are on an older version. Add a
foreground update to finish off the backfill so we can read from new
tables without the fallback. Application code can now rely on the new
tables being populated.
- Discussed in an [internal
meeting](https://docs.google.com/document/d/1MnuvPkaCkT_wviSQZ6YKBjiWciCBFMd-7hxyCO-OCbQ/edit#bookmark=id.hh7shg4cxdhj)




### Dev notes

```
SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.storage.test_events.SlidingSyncPrePopulatedTablesTestCase

SYNAPSE_POSTGRES=1 SYNAPSE_POSTGRES_USER=postgres SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.storage.test_events.SlidingSyncPrePopulatedTablesTestCase
```

```
SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.handlers.test_sliding_sync.FilterRoomsTestCase
```

Reference:

- [Development docs on background updates and worked examples of gradual
migrations

](1dfa59b238/docs/development/database_schema.md (background-updates))
- A real example of a gradual migration:
https://github.com/matrix-org/synapse/pull/15649#discussion_r1213779514
- Adding `rooms.creator` field that needed a background update to
backfill data, https://github.com/matrix-org/synapse/pull/10697
- Adding `rooms.room_version` that needed a background update to
backfill data, https://github.com/matrix-org/synapse/pull/6729
- Adding `room_stats_state.room_type` that needed a background update to
backfill data, https://github.com/matrix-org/synapse/pull/13031
- Tables from MSC2716: `insertion_events`, `insertion_event_edges`,
`insertion_event_extremities`, `batch_events`
- `current_state_events` updated in
`synapse/storage/databases/main/events.py`

---

```
persist_event (adds to queue)
_persist_event_batch
_persist_events_and_state_updates (assigns `stream_ordering` to events)
_persist_events_txn
	_store_event_txn
        _update_metadata_tables_txn
            _store_room_members_txn
	_update_current_state_txn
```

---

> Concatenated Indexes [...] (also known as multi-column, composite or
combined index)
>
> [...] key consists of multiple columns.
> 
> We can take advantage of the fact that the first index column is
always usable for searching
>
> *--
https://use-the-index-luke.com/sql/where-clause/the-equals-operator/concatenated-keys*

---

Dealing with `portdb` (`synapse/_scripts/synapse_port_db.py`),
https://github.com/element-hq/synapse/pull/17512#discussion_r1725998219

---

<details>
<summary>SQL queries:</summary>

Both of these are equivalent and work in SQLite and Postgres

Options 1:
```sql
WITH data_table (room_id, user_id, membership_event_id, membership, event_stream_ordering, {", ".join(insert_keys)}) AS (
    VALUES (
        ?, ?, ?,
        (SELECT membership FROM room_memberships WHERE event_id = ?),
        (SELECT stream_ordering FROM events WHERE event_id = ?),
        {", ".join("?" for _ in insert_values)}
    )
)
INSERT INTO sliding_sync_non_join_memberships
    (room_id, user_id, membership_event_id, membership, event_stream_ordering, {", ".join(insert_keys)})
SELECT * FROM data_table
WHERE membership != ?
ON CONFLICT (room_id, user_id)
DO UPDATE SET
    membership_event_id = EXCLUDED.membership_event_id,
    membership = EXCLUDED.membership,
    event_stream_ordering = EXCLUDED.event_stream_ordering,
    {", ".join(f"{key} = EXCLUDED.{key}" for key in insert_keys)}
```

Option 2:
```sql
INSERT INTO sliding_sync_non_join_memberships
    (room_id, user_id, membership_event_id, membership, event_stream_ordering, {", ".join(insert_keys)})
SELECT 
    column1 as room_id,
    column2 as user_id,
    column3 as membership_event_id,
    column4 as membership,
    column5 as event_stream_ordering,
    {", ".join("column" + str(i) for i in range(6, 6 + len(insert_keys)))}
FROM (
    VALUES (
        ?, ?, ?,
        (SELECT membership FROM room_memberships WHERE event_id = ?),
        (SELECT stream_ordering FROM events WHERE event_id = ?),
        {", ".join("?" for _ in insert_values)}
    )
) as v
WHERE membership != ?
ON CONFLICT (room_id, user_id)
DO UPDATE SET
    membership_event_id = EXCLUDED.membership_event_id,
    membership = EXCLUDED.membership,
    event_stream_ordering = EXCLUDED.event_stream_ordering,
    {", ".join(f"{key} = EXCLUDED.{key}" for key in insert_keys)}
```

If we don't need the `membership` condition, we could use:

```sql
INSERT INTO sliding_sync_non_join_memberships
    (room_id, membership_event_id, user_id, membership, event_stream_ordering, {", ".join(insert_keys)})
VALUES (
    ?, ?, ?,
    (SELECT membership FROM room_memberships WHERE event_id = ?),
    (SELECT stream_ordering FROM events WHERE event_id = ?),
    {", ".join("?" for _ in insert_values)}
)
ON CONFLICT (room_id, user_id)
DO UPDATE SET
    membership_event_id = EXCLUDED.membership_event_id,
    membership = EXCLUDED.membership,
    event_stream_ordering = EXCLUDED.event_stream_ordering,
    {", ".join(f"{key} = EXCLUDED.{key}" for key in insert_keys)}
```

</details>

### Pull Request Checklist

<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->

* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
  - Use markdown where necessary, mostly for `code blocks`.
  - End with either a period (.) or an exclamation mark (!).
  - Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [x] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct
(run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))

---------

Co-authored-by: Erik Johnston <erik@matrix.org>
2024-08-29 16:09:51 +01:00
eyJhb
8da16e55fe
hash_password accepts stdin now (#17608)
`hash_password` now actually accepts password from stdin. The `getpass`
reads from TTY, and does NOT accept stdin in any way.

The manpage has been updated to reflect that.
2024-08-27 18:51:43 +01:00
Shay
dc8ddc6472
Prepare for authenticated media freeze (#17433)
As part of the rollout of
[MSC3916](https://github.com/matrix-org/matrix-spec-proposals/blob/main/proposals/3916-authentication-for-media.md)
this PR adds support for designating authenticated media and ensuring
that authenticated media is not served over unauthenticated endpoints.
2024-07-22 10:33:17 +01:00
dependabot[bot]
20de685a4b
Bump ruff from 0.3.7 to 0.5.0 (#17381) 2024-07-05 12:35:57 +00:00
Jörg Thalheim
c99203d98c
register-new-matrix-user: add a flag to ignore already existing users (#17304)
Co-authored-by: Andrew Morgan <andrew@amorgan.xyz>
2024-06-19 12:03:08 +01:00
Jörg Thalheim
97c3d98816
register_new_matrix_user: add password-file flag (#17294)
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
Co-authored-by: Andrew Morgan <andrew@amorgan.xyz>
2024-06-18 16:21:51 +01:00
Erik Johnston
d16910ca02
Replaces all usages of StreamIdGenerator with MultiWriterIdGenerator (#17229)
Replaces all usages of `StreamIdGenerator` with `MultiWriterIdGenerator`, which is safer.
2024-05-30 11:07:32 +00:00
Shay
37558d5e4c
Add support for MSC3823 - Account Suspension (#17051) 2024-05-01 17:45:17 +01:00
Erik Johnston
ea6bfae0fc
Add support for moving /push_rules off of main process (#17037) 2024-03-28 15:44:07 +00:00
dependabot[bot]
1e68b56a62
Bump black from 23.10.1 to 24.2.0 (#16936) 2024-03-13 16:46:44 +00:00
Erik Johnston
23740eaa3d
Correctly mention previous copyright (#16820)
During the migration the automated script to update the copyright
headers accidentally got rid of some of the existing copyright lines.
Reinstate them.
2024-01-23 11:26:48 +00:00
Erik Johnston
eaad9bb156 Merge remote-tracking branch 'gitlab/clokep/license-license' into new_develop 2023-12-13 15:11:56 +00:00
elara-leitstellentechnik
10ada2ff6d
Write signing keys with file mode 0640 (#16740)
Co-authored-by: Fabian Klemp <fabian.klemp@frequentis.com>
2023-12-08 16:25:57 +00:00
Patrick Cloke
8e1e62c9e0 Update license headers 2023-11-21 15:29:58 -05:00
Patrick Cloke
ab3f1b3b53
Convert simple_select_one_txn and simple_select_one to return tuples. (#16612) 2023-11-09 11:13:31 -05:00
David Robertson
747416e94c
Portdb: don't copy a table that gets rebuilt (#16563) 2023-10-27 20:14:02 +01:00
Denis Kasak
3a0aa6fe76
Force TLS certificate verification in registration script. (#16530)
If using the script remotely, there's no particularly convincing reason
to disable certificate verification, as this makes the connection
interceptible.

If on the other hand, the script is used locally (the most common use
case), you can simply target the HTTP listener and avoid TLS altogether.
This is what the script already attempts to do if passed a homeserver
configuration YAML file.
2023-10-23 07:38:51 -04:00
Patrick Cloke
aa483cb4c9
Update ruff config (#16283)
Enable additional checks & clean-up unneeded configuration.
2023-09-08 11:24:36 -04:00
Patrick Cloke
9ec3da06da
Bump mypy-zope & mypy. (#16188) 2023-08-29 10:38:56 -04:00
Patrick Cloke
ad3f43be9a
Run pyupgrade for python 3.7 & 3.8. (#16110) 2023-08-15 08:11:20 -04:00
Mathieu Velten
dac97642e4
Implements admin API to lock an user (MSC3939) (#15870) 2023-08-10 09:10:55 +00:00
Patrick Cloke
90ad836ed8
Properly setup the additional sequences in the portdb script. (#16043)
The un_partial_stated_event_stream_sequence and
application_services_txn_id_seq were never properly configured
in the portdb script, resulting in an error on start-up.
2023-08-01 10:36:33 -04:00
Erik Johnston
39d131b016
Add basic read/write lock (#15782) 2023-07-05 17:25:00 +01:00
Erik Johnston
95a96b21eb
Add foreign key constraint to event_forward_extremities. (#15751) 2023-07-05 09:43:19 +00:00
Erik Johnston
289ce3b8d9
Fix harmless exception in port DB script (#15814)
The port DB script would try and run database background tasks, which
could fail if the data they acted on was in the process of being ported.
These exceptions were non fatal.

Fixes #15789
2023-06-21 13:20:46 +00:00
Shay
89f6fb0d5a
Add an admin API endpoint to support per-user feature flags (#15344) 2023-04-28 11:33:45 -07:00
Shay
301b4156d5
Add column full_user_id to tables profiles and user_filters. (#15458) 2023-04-26 16:03:26 -07:00
Shay
6b23d74ad1
Delete server-side backup keys when deactivating an account. (#15181) 2023-04-04 20:16:08 +00:00
Quentin Gliech
5b70f240cf
Make cleaning up pushers depend on the device_id instead of the token_id (#15280)
This makes it so that we rely on the `device_id` to delete pushers on logout,
instead of relying on the `access_token_id`. This ensures we're not removing
pushers on token refresh, and prepares for a world without access token IDs
(also known as the OIDC).

This actually runs the `set_device_id_for_pushers` background update, which
was forgotten in #13831.

Note that for backwards compatibility it still deletes pushers based on the
`access_token` until the background update finishes.
2023-03-24 11:09:39 -04:00
reivilibre
98fd558382
Add a primitive helper script for listing worker endpoints. (#15243)
Co-authored-by: Patrick Cloke <patrickc@matrix.org>
2023-03-23 12:11:14 +00:00
Shay
7f02fafa28
Add a check to SQLite port DB script to ensure that the sqlite database passed to the script exists before trying to port from it (#15306) 2023-03-22 08:36:42 -07:00
Shay
96bcc5d902
Revert "check sqlite database file exists before porting/#14692" (#15301) 2023-03-21 10:49:25 -07:00
Patrick Cloke
4fc8875876
Refactor media modules. (#15146)
* Removes the `v1` directory from `test.rest.media.v1`.
* Moves the non-REST code from `synapse.rest.media.v1` to `synapse.media`.
* Flatten the `v1` directory from `synapse.rest.media`,  but leave compatiblity
  with 3rd party media repositories and spam checkers.
2023-02-27 08:26:05 -05:00
dependabot[bot]
9bb2eac719
Bump black from 22.12.0 to 23.1.0 (#15103) 2023-02-22 15:29:09 -05:00
David Robertson
e26d7d5ae7
Teach portdb about un_partial_stated_event_stream (#15108)
* Sort BOOLEAN_COLUMNS and APPEND_ONLY_TABLES

So I can see if a given table is present in logarithmic time, rather
than linear.

* Teach portdb about `un_partial_stated_event_streams`

* Comments comments comments

* Changelog
2023-02-20 13:35:24 +00:00
Erik Johnston
65d0386693
Always notify replication when a stream advances (#14877)
This ensures that all other workers are told about stream updates in a timely manner, without having to remember to manually poke replication.
2023-01-20 18:02:18 +00:00
Jeyachandran Rathnam
5c9be9c760
Check sqlite database file exists before porting. (#14692)
To avoid creating an empty SQLite file if the given path
is incorrect.
2022-12-22 13:26:37 -05:00
reivilibre
96251af50d
Fix a bug introduced in v1.67.0 where not specifying a config file or a server URL would lead to the register_new_matrix_user script failing. (#14637) 2022-12-07 13:39:27 +00:00
Finn
fe50738e59
let update_synapse_database run on a multi-database configurations (#13422)
* Allow sharded database in db migrate script

Signed-off-by: Finn Herzfeld <finn@beeper.com>

* Update changelog.d/13422.bugfix

Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>

* Remove check entirely

* remove unused import

Signed-off-by: Finn Herzfeld <finn@beeper.com>
Co-authored-by: finn <finn@beeper.com>
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
2022-10-19 19:08:40 +01:00
Patrick Cloke
3bbe532abb
Add an API for listing threads in a room. (#13394)
Implement the /threads endpoint from MSC3856.

This is currently unstable and behind an experimental configuration
flag.

It includes a background update to backfill data, results from
the /threads endpoint will be partial until that finishes.
2022-10-13 08:02:11 -04:00
Brendan Abolivier
be76cd8200
Allow admins to require a manual approval process before new accounts can be used (using MSC3866) (#13556) 2022-09-29 15:23:24 +02:00
Brendan Abolivier
8ae42ab8fa
Support enabling/disabling pushers (from MSC3881) (#13799)
Partial implementation of MSC3881
2022-09-21 14:39:01 +00:00
David Robertson
fff9b955fa
Generate separate snapshots for logical databases (#13792)
* Generate separate snapshots for sqlite, postgres and common
* Cleanup postgres dbs in the TRAP
* Say which logical DB we're applying updates to
* Run background updates on the state DB
* Add new option for accepting a SCHEMA_NUMBER
2022-09-20 14:14:12 +01:00
Nick Mills-Barrett
cdbb641232
Add receipts event stream ordering (#13703) 2022-09-13 08:16:37 +01:00
Richard van der Hoff
d092e6f32a
Support registration_shared_secret in a file (#13614)
A new `registration_shared_secret_path` option. This is kinda handy for k8s deployments and things.
2022-08-25 16:27:46 +00:00
Richard van der Hoff
a2ce614447
register_new_matrix_user: read server url from config (#13616)
Fixes https://github.com/matrix-org/synapse/issues/3672:
`https://localhost:8448` is virtually never right.
2022-08-25 15:29:08 +01:00
Brendan Abolivier
47822fd2e8
Merge branch 'master' into develop 2022-07-19 16:14:02 +02:00
Andrew Morgan
6faaf76a32
Remove 'anonymised' from the phone home stats documentation (#13321) 2022-07-19 12:38:29 +00:00
Patrick Cloke
4db7862e0f
Drop unused tables from groups/communities. (#12967)
These tables have been unused since Synapse v1.61.0, although schema version 72
was added in Synapse v1.62.0.
2022-07-13 09:55:14 -04:00