Commit Graph

5168 Commits

Author SHA1 Message Date
Eric Eastwood
e1ed959a68
Sliding Sync: Get bump_stamp from new sliding sync tables because it's faster (#17658)
Get `bump_stamp` from [new sliding sync
tables](https://github.com/element-hq/synapse/pull/17512) which should
be faster (performance) than flipping through the latest events in the
room.
2024-09-09 16:41:25 +01:00
Erik Johnston
786de8570b
Speed up fetching partial-state rooms on sliding sync (#17666)
Instead of having a large cache of `room_id -> bool` about whether a
room is partially stated, replace with a "fetch rooms the user is which
are partially-stated". This is a lot faster as the set of partially
stated rooms at any point across the whole server is small, and so such
a query is fast.

The main issue with the bulk cache lookup is the CPU time looking all
the rooms up in the cache.
2024-09-06 11:12:54 +01:00
Erik Johnston
d5accec2e5
Speed up sliding sync by avoiding copies (#17670)
We ended up spending ~10% CPU creating a new dictionary and
`_RoomMembershipForUser`, so let's avoid creating new dicts and copying
by returning `newly_joined`, `newly_left` and `is_dm` as sets directly.

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-09-06 11:12:29 +01:00
Erik Johnston
b09bcf16d9
Fix background update to handle invalid events (#17641)
Follow-up to #17634, https://github.com/element-hq/synapse/pull/17631
and https://github.com/element-hq/synapse/pull/17632 to fix-up
https://github.com/element-hq/synapse/pull/17512
2024-09-05 14:15:04 +01:00
Erik Johnston
dce38f3faf
Fix sliding sync on workers (#17649)
Broke in #17630

---------

Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
2024-09-04 10:52:46 +01:00
Quentin Gliech
7d52ce7d4b
Format files with Ruff (#17643)
I thought ruff check would also format, but it doesn't.

This runs ruff format in CI and dev scripts. The first commit is just a
run of `ruff format .` in the root directory.
2024-09-02 12:39:04 +01:00
Erik Johnston
709b7363fe
Sliding sync: use new DB tables (#17630)
Based on https://github.com/element-hq/synapse/pull/17629

Utilizing the new sliding sync tables added in
https://github.com/element-hq/synapse/pull/17512 for fast acquisition of
rooms for the user and filtering/sorting.

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-09-01 11:25:39 +01:00
Erik Johnston
8b6ff1dba5 Revert "Also handle invalid event errors"
This reverts commit b4d0356e48.
2024-09-01 10:43:26 +01:00
Erik Johnston
b4d0356e48 Also handle invalid event errors 2024-09-01 10:42:49 +01:00
Erik Johnston
d52c17ce01
Sliding sync: various fixes to background update (#17636)
Follows on from #17512, other fixes include: #17633, #17634, #17635
2024-09-01 10:18:45 +01:00
Quentin Gliech
cdd5979129
Replace isort and black with ruff (#17620)
Ruff now has decent parity with black and isort, so this is going to just save us a bunch of time
2024-08-30 10:07:46 +02:00
Erik Johnston
89801e04ca
Sliding sync: Ignore tables with no create event in current state (#17633) 2024-08-30 08:54:14 +01:00
Erik Johnston
7098d47f29
Sliding sync: Fix bg update again (v3) (#17634)
Follow-up to https://github.com/element-hq/synapse/pull/17631 and
https://github.com/element-hq/synapse/pull/17632 to fix-up
https://github.com/element-hq/synapse/pull/17599

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-08-30 08:54:07 +01:00
Eric Eastwood
26f81fb5be
Sliding Sync: Fix outlier re-persisting causing problems with sliding sync tables (#17635)
Fix outlier re-persisting causing problems with sliding sync tables

Follow-up to https://github.com/element-hq/synapse/pull/17512

When running on `matrix.org`, we discovered that a remote invite is
first persisted as an `outlier` and then re-persisted again where it is
de-outliered. The first the time, the `outlier` is persisted with one
`stream_ordering` but when persisted again and de-outliered, it is
assigned a different `stream_ordering` that won't end up being used.
Since we call `_calculate_sliding_sync_table_changes()` before
`_update_outliers_txn()` which fixes this discrepancy (always use the
`stream_ordering` from the first time it was persisted), we're working
with an unreliable `stream_ordering` value that will possibly be unused
and not make it into the `events` table.
2024-08-30 08:53:57 +01:00
Erik Johnston
d844afdc29
Fix background update for sliding sync (find previous membership) (#17632)
This reverts commit
ab414f2ab8.

Introduced in https://github.com/element-hq/synapse/pull/17512
2024-08-29 19:16:39 +01:00
Erik Johnston
bb80894391
Fix background update for sliding sync (#17631)
This reverts commit ab414f2ab8a294fbffb417003eeea0f14bbd6588.

Introduced in https://github.com/element-hq/synapse/pull/17599
2024-08-29 16:58:53 +01:00
Erik Johnston
e43c2b023e
Sliding sync: Store the per-connection state in the database. (#17599)
Based on #17600

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-08-29 16:26:58 +01:00
Eric Eastwood
1a6b718f8c
Sliding Sync: Pre-populate room data for quick filtering/sorting (#17512)
Pre-populate room data for quick filtering/sorting in the Sliding Sync
API

Spawning from
https://github.com/element-hq/synapse/pull/17450#discussion_r1697335578

This PR is acting as the Synapse version `N+1` step in the gradual
migration being tracked by
https://github.com/element-hq/synapse/issues/17623

Adding two new database tables:

- `sliding_sync_joined_rooms`: A table for storing room meta data that
the local server is still participating in. The info here can be shared
across all `Membership.JOIN`. Keyed on `(room_id)` and updated when the
relevant room current state changes or a new event is sent in the room.
- `sliding_sync_membership_snapshots`: A table for storing a snapshot of
room meta data at the time of the local user's membership. Keyed on
`(room_id, user_id)` and only updated when a user's membership in a room
changes.

Also adds background updates to populate these tables with all of the
existing data.


We want to have the guarantee that if a row exists in the sliding sync
tables, we are able to rely on it (accurate data). And if a row doesn't
exist, we use a fallback to get the same info until the background
updates fill in the rows or a new event comes in triggering it to be
fully inserted. This means we need a couple extra things in place until
we bump `SCHEMA_COMPAT_VERSION` and run the foreground update in the
`N+2` part of the gradual migration. For context on why we can't rely on
the tables without these things see [1].

1. On start-up, block until we clear out any rows for the rooms that
have had events since the max-`stream_ordering` of the
`sliding_sync_joined_rooms` table (compare to max-`stream_ordering` of
the `events` table). For `sliding_sync_membership_snapshots`, we can
compare to the max-`stream_ordering` of `local_current_membership`
- This accounts for when someone downgrades their Synapse version and
then upgrades it again. This will ensure that we don't have any
stale/out-of-date data in the
`sliding_sync_joined_rooms`/`sliding_sync_membership_snapshots` tables
since any new events sent in rooms would have also needed to be written
to the sliding sync tables. For example a new event needs to bump
`event_stream_ordering` in `sliding_sync_joined_rooms` table or some
state in the room changing (like the room name). Or another example of
someone's membership changing in a room affecting
`sliding_sync_membership_snapshots`.
1. Add another background update that will catch-up with any rows that
were just deleted from the sliding sync tables (based on the activity in
the `events`/`local_current_membership`). The rooms that need
recalculating are added to the
`sliding_sync_joined_rooms_to_recalculate` table.
1. Making sure rows are fully inserted. Instead of partially inserting,
we need to check if the row already exists and fully insert all data if
not.

All of this extra functionality can be removed once the
`SCHEMA_COMPAT_VERSION` is bumped with support for the new sliding sync
tables so people can no longer downgrade (the `N+2` part of the gradual
migration).


<details>
<summary><sup>[1]</sup></summary>

For `sliding_sync_joined_rooms`, since we partially insert rows as state
comes in, we can't rely on the existence of the row for a given
`room_id`. We can't even rely on looking at whether the background
update has finished. There could still be partial rows from when someone
reverted their Synapse version after the background update finished, had
some state changes (or new rooms), then upgraded again and more state
changes happen leaving a partial row.

For `sliding_sync_membership_snapshots`, we insert items as a whole
except for the `forgotten` column ~~so we can rely on rows existing and
just need to always use a fallback for the `forgotten` data. We can't
use the `forgotten` column in the table for the same reasons above about
`sliding_sync_joined_rooms`.~~ We could have an out-of-date membership
from when someone reverted their Synapse version. (same problems as
outlined for `sliding_sync_joined_rooms` above)

Discussed in an [internal
meeting](https://docs.google.com/document/d/1MnuvPkaCkT_wviSQZ6YKBjiWciCBFMd-7hxyCO-OCbQ/edit#bookmark=id.dz5x6ef4mxz7)

</details>


### TODO

 - [x] Update `stream_ordering`/`bump_stamp`
 - [x] Handle remote invites
 - [x] Handle state resets
- [x] Consider adding `sender` so we can filter `LEAVE` memberships and
distinguish from kicks.
     - [x] We should add it to be able to tell leaves from kicks 
- [x] Consider adding `tombstone` state to help address
https://github.com/element-hq/synapse/issues/17540
     - [x] We should add it `tombstone_successor_room_id`
- [x] Consider adding `forgotten` status to avoid extra
lookup/table-join on `room_memberships`
    - [x] We should add it
- [x] Background update to fill in values for all joined rooms and
non-join membership
 - [x] Clean-up tables when room is deleted
 - [ ] Make sure tables are useful to our use case
- First explored in
https://github.com/element-hq/synapse/compare/erikj/ss_use_new_tables
- Also explored in
76b5a576eb
 - [x] Plan for how can we use this with a fallback
     - See plan discussed above in main area of the issue description
- Discussed in an [internal
meeting](https://docs.google.com/document/d/1MnuvPkaCkT_wviSQZ6YKBjiWciCBFMd-7hxyCO-OCbQ/edit#bookmark=id.dz5x6ef4mxz7)
 - [x] Plan for how we can rely on this new table without a fallback
- Synapse version `N+1`: (this PR) Bump `SCHEMA_VERSION` to `87`. Add
new tables and background update to backfill all rows. Since this is a
new table, we don't have to add any `NOT VALID` constraints and validate
them when the background update completes. Read from new tables with a
fallback in cases where the rows aren't filled in yet.
- Synapse version `N+2`: Bump `SCHEMA_VERSION` to `88` and bump
`SCHEMA_COMPAT_VERSION` to `87` because we don't want people to
downgrade and miss writes while they are on an older version. Add a
foreground update to finish off the backfill so we can read from new
tables without the fallback. Application code can now rely on the new
tables being populated.
- Discussed in an [internal
meeting](https://docs.google.com/document/d/1MnuvPkaCkT_wviSQZ6YKBjiWciCBFMd-7hxyCO-OCbQ/edit#bookmark=id.hh7shg4cxdhj)




### Dev notes

```
SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.storage.test_events.SlidingSyncPrePopulatedTablesTestCase

SYNAPSE_POSTGRES=1 SYNAPSE_POSTGRES_USER=postgres SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.storage.test_events.SlidingSyncPrePopulatedTablesTestCase
```

```
SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.handlers.test_sliding_sync.FilterRoomsTestCase
```

Reference:

- [Development docs on background updates and worked examples of gradual
migrations

](1dfa59b238/docs/development/database_schema.md (background-updates))
- A real example of a gradual migration:
https://github.com/matrix-org/synapse/pull/15649#discussion_r1213779514
- Adding `rooms.creator` field that needed a background update to
backfill data, https://github.com/matrix-org/synapse/pull/10697
- Adding `rooms.room_version` that needed a background update to
backfill data, https://github.com/matrix-org/synapse/pull/6729
- Adding `room_stats_state.room_type` that needed a background update to
backfill data, https://github.com/matrix-org/synapse/pull/13031
- Tables from MSC2716: `insertion_events`, `insertion_event_edges`,
`insertion_event_extremities`, `batch_events`
- `current_state_events` updated in
`synapse/storage/databases/main/events.py`

---

```
persist_event (adds to queue)
_persist_event_batch
_persist_events_and_state_updates (assigns `stream_ordering` to events)
_persist_events_txn
	_store_event_txn
        _update_metadata_tables_txn
            _store_room_members_txn
	_update_current_state_txn
```

---

> Concatenated Indexes [...] (also known as multi-column, composite or
combined index)
>
> [...] key consists of multiple columns.
> 
> We can take advantage of the fact that the first index column is
always usable for searching
>
> *--
https://use-the-index-luke.com/sql/where-clause/the-equals-operator/concatenated-keys*

---

Dealing with `portdb` (`synapse/_scripts/synapse_port_db.py`),
https://github.com/element-hq/synapse/pull/17512#discussion_r1725998219

---

<details>
<summary>SQL queries:</summary>

Both of these are equivalent and work in SQLite and Postgres

Options 1:
```sql
WITH data_table (room_id, user_id, membership_event_id, membership, event_stream_ordering, {", ".join(insert_keys)}) AS (
    VALUES (
        ?, ?, ?,
        (SELECT membership FROM room_memberships WHERE event_id = ?),
        (SELECT stream_ordering FROM events WHERE event_id = ?),
        {", ".join("?" for _ in insert_values)}
    )
)
INSERT INTO sliding_sync_non_join_memberships
    (room_id, user_id, membership_event_id, membership, event_stream_ordering, {", ".join(insert_keys)})
SELECT * FROM data_table
WHERE membership != ?
ON CONFLICT (room_id, user_id)
DO UPDATE SET
    membership_event_id = EXCLUDED.membership_event_id,
    membership = EXCLUDED.membership,
    event_stream_ordering = EXCLUDED.event_stream_ordering,
    {", ".join(f"{key} = EXCLUDED.{key}" for key in insert_keys)}
```

Option 2:
```sql
INSERT INTO sliding_sync_non_join_memberships
    (room_id, user_id, membership_event_id, membership, event_stream_ordering, {", ".join(insert_keys)})
SELECT 
    column1 as room_id,
    column2 as user_id,
    column3 as membership_event_id,
    column4 as membership,
    column5 as event_stream_ordering,
    {", ".join("column" + str(i) for i in range(6, 6 + len(insert_keys)))}
FROM (
    VALUES (
        ?, ?, ?,
        (SELECT membership FROM room_memberships WHERE event_id = ?),
        (SELECT stream_ordering FROM events WHERE event_id = ?),
        {", ".join("?" for _ in insert_values)}
    )
) as v
WHERE membership != ?
ON CONFLICT (room_id, user_id)
DO UPDATE SET
    membership_event_id = EXCLUDED.membership_event_id,
    membership = EXCLUDED.membership,
    event_stream_ordering = EXCLUDED.event_stream_ordering,
    {", ".join(f"{key} = EXCLUDED.{key}" for key in insert_keys)}
```

If we don't need the `membership` condition, we could use:

```sql
INSERT INTO sliding_sync_non_join_memberships
    (room_id, membership_event_id, user_id, membership, event_stream_ordering, {", ".join(insert_keys)})
VALUES (
    ?, ?, ?,
    (SELECT membership FROM room_memberships WHERE event_id = ?),
    (SELECT stream_ordering FROM events WHERE event_id = ?),
    {", ".join("?" for _ in insert_values)}
)
ON CONFLICT (room_id, user_id)
DO UPDATE SET
    membership_event_id = EXCLUDED.membership_event_id,
    membership = EXCLUDED.membership,
    event_stream_ordering = EXCLUDED.event_stream_ordering,
    {", ".join(f"{key} = EXCLUDED.{key}" for key in insert_keys)}
```

</details>

### Pull Request Checklist

<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->

* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
  - Use markdown where necessary, mostly for `code blocks`.
  - End with either a period (.) or an exclamation mark (!).
  - Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [x] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct
(run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))

---------

Co-authored-by: Erik Johnston <erik@matrix.org>
2024-08-29 16:09:51 +01:00
Gordan Trevis
594cd5f9fd
Fix Internal Server Error for Non-Local Users in Room Actions (#17607) 2024-08-29 14:34:29 +00:00
Erik Johnston
8678516e79
Sliding sync: Always send your own receipts down (#17617)
When returning receipts in sliding sync for initial rooms we should
always include our own receipts in the room (even if they don't match
any timeline events).

Reviewable commit-by-commit.

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-08-29 10:09:40 +01:00
Erik Johnston
defd4aca67
Speed up fetching latest stream positions via cache (#17606)
The idea is to engineer it so that the vast majority of the rooms can
stay in the cache, so we can just ignore them.
2024-08-27 11:03:56 +00:00
Erik Johnston
f1e8d2d15a
Sliding Sync: Speed up getting receipts for initial rooms (#17592)
Let's only pull out the events we care about. Note that the index isn't
necessary here, as postgres is happy to scan the set of rooms for the
events.
2024-08-20 12:57:34 +01:00
Erik Johnston
8b8d74d12f
Sliding sync: Correctly track which read receipts we have or have not sent down. (#17575)
Add connection tracking to the receipts extension.

Based on #17574

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-08-19 21:16:07 +01:00
Erik Johnston
f162c92f2a
Speed up /keys/changes (#17548)
Follow on from #17537.

This is just adding a batched lookup function (you might want to hide
whitespace in the diff).
2024-08-16 16:04:02 +01:00
Patrick Cloke
8fea190a1f
Add missing docstrings related to profile methods. (#17559) 2024-08-13 17:04:35 +01:00
Erik Johnston
70b0e38603
Fix performance of device lists in /key/changes and sliding sync (#17537)
We do this by reusing the code from sync v2.

Reviewable commit-by-commit. The function `get_user_ids_changed` has
been rewritten entirely, so I would recommend not looking at the diff.
2024-08-09 11:59:44 +01:00
Eric Eastwood
11db575218
Sliding Sync: Use stream_ordering based timeline pagination for incremental sync (#17510)
Use `stream_ordering` based `timeline` pagination for incremental
`/sync` in Sliding Sync. Previously, we were always using a
`topological_ordering` but we should only be using that for historical
scenarios (initial `/sync`, newly joined, or haven't sent the room down
the connection before).

This is slightly different than what the [spec
suggests](https://spec.matrix.org/v1.10/client-server-api/#syncing)

> Events are ordered in this API according to the arrival time of the
event on the homeserver. This can conflict with other APIs which order
events based on their partial ordering in the event graph. This can
result in duplicate events being received (once per distinct API
called). Clients SHOULD de-duplicate events based on the event ID when
this happens.

But we've had a [discussion below in this
PR](https://github.com/element-hq/synapse/pull/17510#discussion_r1699105569)
and this matches what Sync v2 already does and seems like it makes
sense. Created a spec issue
https://github.com/matrix-org/matrix-spec/issues/1917 to clarify this.

Related issues:

 - https://github.com/matrix-org/matrix-spec/issues/1917
 - https://github.com/matrix-org/matrix-spec/issues/852
 - https://github.com/matrix-org/matrix-spec-proposals/pull/4033
2024-08-07 11:27:50 -05:00
Eric Eastwood
1dfa59b238
Sliding Sync: Add more tracing (#17514)
Spawning from looking at a couple traces and wanting a little more info.

Follow-up to github.com/element-hq/synapse/pull/17501

The changes in this PR allow you to find slow Sliding Sync traces ignoring the
`wait_for_events` time. In Jaeger, you can now filter for the `current_sync_for_user`
operation with `RESULT.result=true` indicating that it actually returned non-empty results.

If you want to find traces for your own user, you can use
`RESULT.result=true ARG.sync_config.user="@madlittlemods:matrix.org"`
2024-08-06 11:43:43 -05:00
Eric Eastwood
46de0ee16b
Sliding Sync: Update filters to be robust against remote invite rooms (#17450)
Update `filters.is_encrypted` and `filters.types`/`filters.not_types` to
be robust when dealing with remote invite rooms in Sliding Sync.

Part of
[MSC3575](https://github.com/matrix-org/matrix-spec-proposals/pull/3575):
Sliding Sync

Follow-up to https://github.com/element-hq/synapse/pull/17434

We now take into account current state, fallback to stripped state
for invite/knock rooms, then historical state. If we can't determine
the info needed to filter a room (either from state or stripped state),
it is filtered out.
2024-07-30 13:20:29 -05:00
Erik Johnston
34306be5aa
Only send rooms with updates down sliding sync (#17479)
Rather than always including all rooms in range.

Also adds a pre-filter to rooms that checks the stream change cache to
see if anything might have happened.

Based on #17447

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-07-30 09:30:44 +01:00
Erik Johnston
be4a16ff44
Sliding Sync: Track whether we have sent rooms down to clients (#17447)
The basic idea is that we introduce a new token for a sliding sync
connection, which stores the mapping of room to room "status" (i.e. have
we sent the room down?). This token allows us to handle duplicate
requests properly. In future it can be used to store more
"per-connection" information safely.

In future this should be migrated into the DB, so its important that we
try to reduce the number of syncs where we need to update the
per-connection information. In this PoC this only happens when we: a)
send down a set of room for the first time, or b) we have previously
sent down a room and there are updates but we are not sending the room
down the sync (due to not falling in a list range)

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-07-29 22:45:48 +01:00
Erik Johnston
d225b6b3eb
Speed up SS room sorting (#17468)
We do this by bulk fetching the latest stream ordering.

---------

Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
2024-07-23 14:03:14 +01:00
Shay
dc8ddc6472
Prepare for authenticated media freeze (#17433)
As part of the rollout of
[MSC3916](https://github.com/matrix-org/matrix-spec-proposals/blob/main/proposals/3916-authentication-for-media.md)
this PR adds support for designating authenticated media and ensuring
that authenticated media is not served over unauthenticated endpoints.
2024-07-22 10:33:17 +01:00
Erik Johnston
d3f9afd8d9
Add a cache on get_rooms_for_local_user_where_membership_is (#17460)
As it gets used in sliding sync.

We basically invalidate it in all the same places as
`get_rooms_for_user`. Most of the changes are due to needing the
arguments you pass in to be hashable (which lists aren't)
2024-07-19 16:19:15 +01:00
Eric Eastwood
3fee32ed6b
Order heroes by stream_ordering (as spec'ed) (#17435)
The spec specifically mentions `stream_ordering` but that's a Synapse specific concept. In any case, the essence of the spec is basically the first 5 members of the room which `stream_ordering` accomplishes.

Split off from https://github.com/element-hq/synapse/pull/17419#discussion_r1671342794

## Spec compliance

> This should be the first 5 members of the room, **ordered by stream ordering**, which are joined or invited. The list must never include the client’s own user ID. When no joined or invited members are available, this should consist of the banned and left users.
>
> *-- https://spec.matrix.org/v1.10/client-server-api/#_matrixclientv3sync_roomsummary*

Related to https://github.com/matrix-org/matrix-spec/issues/1334
2024-07-17 13:10:15 -05:00
Erik Johnston
606da398fc
Fix filtering room types on remote rooms (#17434)
We can only fetch room types for rooms the server is in, so we need to
only filter rooms that we're joined to.

Also includes a perf fix to bulk fetch room types.
2024-07-11 16:00:44 +01:00
Erik Johnston
8cdd2d214e
Fix bug in sliding sync when using old DB. (#17398)
We don't necessarily have `instance_name` for old events (before we
support multiple event persisters). We treat those as if the
`instance_name` was "master".

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-07-08 20:30:23 +01:00
Eric Eastwood
3fef535ff2
Add rooms.bump_stamp to Sliding Sync /sync for easier client-side sorting (#17395)
`bump_stamp` corresponds to the `stream_ordering` of the latest `DEFAULT_BUMP_EVENT_TYPES` in the room. This helps clients sort more readily without them needing to pull in a bunch of the timeline to determine the last activity. `bump_event_types` is a thing because for example, we don't want display name changes to mark the room as unread and bump it to the top. For encrypted rooms, we just have to consider any activity as a bump because we can't see the content and the client has to figure it out for themselves.

Outside of Synapse, `bump_stamp` is just a free-form counter so other implementations could use `received_ts`or `origin_server_ts` (see the [*Security considerations* section in MSC3575 about the potential pitfalls of using `origin_server_ts`](https://github.com/matrix-org/matrix-spec-proposals/blob/kegan/sync-v3/proposals/3575-sync.md#security-considerations)). It doesn't have any guarantee about always going up. In the Synapse case, it could go down if an event was redacted/removed (or purged in cases of retention policies).

In the future, we could add `bump_event_types` as [MSC3575](https://github.com/matrix-org/matrix-spec-proposals/pull/3575) mentions if people need to customize the event types.

---

In the Sliding Sync proxy, a similar [`timestamp` field was added](https://github.com/matrix-org/sliding-sync/pull/247) for the same purpose but the name is not obvious what it pertains to or what it's for.

The `timestamp` field was also added to Ruma in https://github.com/ruma/ruma/pull/1622
2024-07-08 13:17:08 -05:00
Erik Johnston
57538eb4d9
Finish up work to allow per-user feature flags (#17392)
Follows on from @H-Shay's great work at
https://github.com/matrix-org/synapse/pull/15344 and MSC4026.

Also enables its use for MSC3881, mainly as an easy but concrete example
of how to use it.
2024-07-05 13:02:35 +01:00
Eric Eastwood
22aeb78b77
Add rooms.required_state to Sliding Sync /sync (#17342)
Also handles excluding rooms with partial state when people are asking for room membership events unless it's `$LAZY` room membership.
2024-07-04 12:25:36 -05:00
Eric Eastwood
7be03d854b
Add room_types/not_room_types filtering to Sliding Sync /sync (#17337)
Based on
[MSC3575](https://github.com/matrix-org/matrix-spec-proposals/pull/3575):
Sliding Sync
2024-07-02 12:46:27 -05:00
Eric Eastwood
fa91655805
Return some room data in Sliding Sync /sync (#17320)
- Timeline events
 - Stripped `invite_state`

Based on [MSC3575](https://github.com/matrix-org/matrix-spec-proposals/pull/3575): Sliding Sync
2024-07-02 11:07:05 -05:00
Erik Johnston
b3b793786c
Fix sync waiting for an invalid token from the "future" (#17386)
Fixes https://github.com/element-hq/synapse/issues/17274, hopefully.

Basically, old versions of Synapse could advance streams without
persisting anything in the DB (fixed in #17229). On restart those
updates would get lost, and so the position of the stream would revert
to an older position. If this happened across an upgrade to a later
Synapse version which included #17215, then sync could get blocked
indefinitely (until the stream advanced to the position in the token).

We fix this by bounding the stream positions we'll wait for to the
maximum position of the underlying stream ID generator.
2024-07-02 12:39:49 +01:00
Erik Johnston
cc5e5893fe
Handle multiple rows device inbox (#17362)
Fix bug where we don't get new to-device from remote if they resent a
message we've already persisted and have recorded in the DB twice.

`device_federation_inbox` table doesn't have a unique index, and so we
can race and store an entry in there twice. If we do so then
`simple_select_one_txn` will throw an error due to the query returning
more than one row. We should add an unique index, but it doesn't really
matter so lets just handle the case of multiple rows correctly for now.
2024-06-27 11:04:31 +01:00
Erik Johnston
c89fea3fd1
Limit amount of replication we send (#17358)
Fixes up #17333, where we failed to actually send less data (the
`DISTINCT` didn't work due to `stream_id` being different).

We fix this by making it so that every device list outbound poke for a
given user ID has the same stream ID. We can't change the query to only
return e.g. max stream ID as the receivers look up the destinations to
send to by doing `SELECT WHERE stream_id = ?`
2024-06-25 11:17:39 +01:00
Erik Johnston
554a92601a
Reintroduce "Reduce device lists replication traffic."" (#17361)
Reintroduces https://github.com/element-hq/synapse/pull/17333


Turns out the reason for revert was down two master instances running
2024-06-25 10:34:34 +01:00
Erik Johnston
a98cb87bee
Revert "Reduce device lists replication traffic." (#17360)
Reverts element-hq/synapse#17333

It looks like master was still sending out replication RDATA with the
old format... somehow
2024-06-25 09:57:34 +01:00
Erik Johnston
930a64b6c1
Reintroduce #17291. (#17338)
This is #17291 (which got reverted), with some added fixups, and change
so that tests actually pick up the error.

The problem was that we were not calculating any new chain IDs due to a
missing `not` in a condition.
2024-06-24 14:40:28 +00:00
Erik Johnston
cf711ac03c
Reduce device lists replication traffic. (#17333)
Reduce the replication traffic of device lists, by not sending every
destination that needs to be sent the device list update over
replication. Instead a "hosts to send to have been calculated"
notification over replication, and then federation senders read the
destinations from the DB.

For non federation senders this should heavily reduce the impact of a
user in many large rooms changing a device.
2024-06-24 14:15:13 +01:00
Erik Johnston
4243c1f074
Revert "Handle large chain calc better (#17291)" (#17334)
This reverts commit bdf82efea5  (#17291)

This seems to have stopped persisting auth chains for new events, and so
is causing state res to fall back to the slow methods
2024-06-19 17:39:33 +01:00