Commit Graph

90 Commits

Author SHA1 Message Date
Erik Johnston
f58d202e3f
Fix bug with reusing 'txn' when persisting event. (#10743)
This will only happen when a server has multiple out of band membership
events in a single room.
2021-09-03 10:59:25 +01:00
Eric Eastwood
684d19a11c
Add support for MSC2716 marker events (#10498)
* Make historical messages available to federated servers

Part of MSC2716: https://github.com/matrix-org/matrix-doc/pull/2716

Follow-up to https://github.com/matrix-org/synapse/pull/9247

* Debug message not available on federation

* Add base starting insertion point when no chunk ID is provided

* Fix messages from multiple senders in historical chunk

Follow-up to https://github.com/matrix-org/synapse/pull/9247

Part of MSC2716: https://github.com/matrix-org/matrix-doc/pull/2716

---

Previously, Synapse would throw a 403,
`Cannot force another user to join.`,
because we were trying to use `?user_id` from a single virtual user
which did not match with messages from other users in the chunk.

* Remove debug lines

* Messing with selecting insertion event extremeties

* Move db schema change to new version

* Add more better comments

* Make a fake requester with just what we need

See https://github.com/matrix-org/synapse/pull/10276#discussion_r660999080

* Store insertion events in table

* Make base insertion event float off on its own

See https://github.com/matrix-org/synapse/pull/10250#issuecomment-875711889

Conflicts:
	synapse/rest/client/v1/room.py

* Validate that the app service can actually control the given user

See https://github.com/matrix-org/synapse/pull/10276#issuecomment-876316455

Conflicts:
	synapse/rest/client/v1/room.py

* Add some better comments on what we're trying to check for

* Continue debugging

* Share validation logic

* Add inserted historical messages to /backfill response

* Remove debug sql queries

* Some marker event implemntation trials

* Clean up PR

* Rename insertion_event_id to just event_id

* Add some better sql comments

* More accurate description

* Add changelog

* Make it clear what MSC the change is part of

* Add more detail on which insertion event came through

* Address review and improve sql queries

* Only use event_id as unique constraint

* Fix test case where insertion event is already in the normal DAG

* Remove debug changes

* Add support for MSC2716 marker events

* Process markers when we receive it over federation

* WIP: make hs2 backfill historical messages after marker event

* hs2 to better ask for insertion event extremity

But running into the `sqlite3.IntegrityError: NOT NULL constraint failed: event_to_state_groups.state_group`
error

* Add insertion_event_extremities table

* Switch to chunk events so we can auth via power_levels

Previously, we were using `content.chunk_id` to connect one
chunk to another. But these events can be from any `sender`
and we can't tell who should be able to send historical events.
We know we only want the application service to do it but these
events have the sender of a real historical message, not the
application service user ID as the sender. Other federated homeservers
also have no indicator which senders are an application service on
the originating homeserver.

So we want to auth all of the MSC2716 events via power_levels
and have them be sent by the application service with proper
PL levels in the room.

* Switch to chunk events for federation

* Add unstable room version to support new historical PL

* Messy: Fix undefined state_group for federated historical events

```
2021-07-13 02:27:57,810 - synapse.handlers.federation - 1248 - ERROR - GET-4 - Failed to backfill from hs1 because NOT NULL constraint failed: event_to_state_groups.state_group
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/synapse/handlers/federation.py", line 1216, in try_backfill
    await self.backfill(
  File "/usr/local/lib/python3.8/site-packages/synapse/handlers/federation.py", line 1035, in backfill
    await self._auth_and_persist_event(dest, event, context, backfilled=True)
  File "/usr/local/lib/python3.8/site-packages/synapse/handlers/federation.py", line 2222, in _auth_and_persist_event
    await self._run_push_actions_and_persist_event(event, context, backfilled)
  File "/usr/local/lib/python3.8/site-packages/synapse/handlers/federation.py", line 2244, in _run_push_actions_and_persist_event
    await self.persist_events_and_notify(
  File "/usr/local/lib/python3.8/site-packages/synapse/handlers/federation.py", line 3290, in persist_events_and_notify
    events, max_stream_token = await self.storage.persistence.persist_events(
  File "/usr/local/lib/python3.8/site-packages/synapse/logging/opentracing.py", line 774, in _trace_inner
    return await func(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/persist_events.py", line 320, in persist_events
    ret_vals = await yieldable_gather_results(enqueue, partitioned.items())
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/persist_events.py", line 237, in handle_queue_loop
    ret = await self._per_item_callback(
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/persist_events.py", line 577, in _persist_event_batch
    await self.persist_events_store._persist_events_and_state_updates(
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/databases/main/events.py", line 176, in _persist_events_and_state_updates
    await self.db_pool.runInteraction(
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/database.py", line 681, in runInteraction
    result = await self.runWithConnection(
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/database.py", line 770, in runWithConnection
    return await make_deferred_yieldable(
  File "/usr/local/lib/python3.8/site-packages/twisted/python/threadpool.py", line 238, in inContext
    result = inContext.theWork()  # type: ignore[attr-defined]
  File "/usr/local/lib/python3.8/site-packages/twisted/python/threadpool.py", line 254, in <lambda>
    inContext.theWork = lambda: context.call(  # type: ignore[attr-defined]
  File "/usr/local/lib/python3.8/site-packages/twisted/python/context.py", line 118, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/usr/local/lib/python3.8/site-packages/twisted/python/context.py", line 83, in callWithContext
    return func(*args, **kw)
  File "/usr/local/lib/python3.8/site-packages/twisted/enterprise/adbapi.py", line 293, in _runWithConnection
    compat.reraise(excValue, excTraceback)
  File "/usr/local/lib/python3.8/site-packages/twisted/python/deprecate.py", line 298, in deprecatedFunction
    return function(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/twisted/python/compat.py", line 403, in reraise
    raise exception.with_traceback(traceback)
  File "/usr/local/lib/python3.8/site-packages/twisted/enterprise/adbapi.py", line 284, in _runWithConnection
    result = func(conn, *args, **kw)
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/database.py", line 765, in inner_func
    return func(db_conn, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/database.py", line 549, in new_transaction
    r = func(cursor, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/synapse/logging/utils.py", line 69, in wrapped
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/databases/main/events.py", line 385, in _persist_events_txn
    self._store_event_state_mappings_txn(txn, events_and_contexts)
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/databases/main/events.py", line 2065, in _store_event_state_mappings_txn
    self.db_pool.simple_insert_many_txn(
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/database.py", line 923, in simple_insert_many_txn
    txn.execute_batch(sql, vals)
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/database.py", line 280, in execute_batch
    self.executemany(sql, args)
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/database.py", line 300, in executemany
    self._do_execute(self.txn.executemany, sql, *args)
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/database.py", line 330, in _do_execute
    return func(sql, *args)
sqlite3.IntegrityError: NOT NULL constraint failed: event_to_state_groups.state_group
```

* Revert "Messy: Fix undefined state_group for federated historical events"

This reverts commit 187ab28611546321e02770944c86f30ee2bc742a.

* Fix federated events being rejected for no state_groups

Add fix from https://github.com/matrix-org/synapse/pull/10439
until it merges.

* Adapting to experimental room version

* Some log cleanup

* Add better comments around extremity fetching code and why

* Rename to be more accurate to what the function returns

* Add changelog

* Ignore rejected events

* Use simplified upsert

* Add Erik's explanation of extra event checks

See https://github.com/matrix-org/synapse/pull/10498#discussion_r680880332

* Clarify that the depth is not directly correlated to the backwards extremity that we return

See https://github.com/matrix-org/synapse/pull/10498#discussion_r681725404

* lock only matters for sqlite

See https://github.com/matrix-org/synapse/pull/10498#discussion_r681728061

* Move new SQL changes to its own delta file

* Clean up upsert docstring

* Bump database schema version (62)
2021-08-04 12:07:57 -05:00
Eric Eastwood
d0b294ad97
Make historical events discoverable from backfill for servers without any scrollback history (MSC2716) (#10245)
* Make historical messages available to federated servers

Part of MSC2716: https://github.com/matrix-org/matrix-doc/pull/2716

Follow-up to https://github.com/matrix-org/synapse/pull/9247

* Debug message not available on federation

* Add base starting insertion point when no chunk ID is provided

* Fix messages from multiple senders in historical chunk

Follow-up to https://github.com/matrix-org/synapse/pull/9247

Part of MSC2716: https://github.com/matrix-org/matrix-doc/pull/2716

---

Previously, Synapse would throw a 403,
`Cannot force another user to join.`,
because we were trying to use `?user_id` from a single virtual user
which did not match with messages from other users in the chunk.

* Remove debug lines

* Messing with selecting insertion event extremeties

* Move db schema change to new version

* Add more better comments

* Make a fake requester with just what we need

See https://github.com/matrix-org/synapse/pull/10276#discussion_r660999080

* Store insertion events in table

* Make base insertion event float off on its own

See https://github.com/matrix-org/synapse/pull/10250#issuecomment-875711889

Conflicts:
	synapse/rest/client/v1/room.py

* Validate that the app service can actually control the given user

See https://github.com/matrix-org/synapse/pull/10276#issuecomment-876316455

Conflicts:
	synapse/rest/client/v1/room.py

* Add some better comments on what we're trying to check for

* Continue debugging

* Share validation logic

* Add inserted historical messages to /backfill response

* Remove debug sql queries

* Some marker event implemntation trials

* Clean up PR

* Rename insertion_event_id to just event_id

* Add some better sql comments

* More accurate description

* Add changelog

* Make it clear what MSC the change is part of

* Add more detail on which insertion event came through

* Address review and improve sql queries

* Only use event_id as unique constraint

* Fix test case where insertion event is already in the normal DAG

* Remove debug changes

* Switch to chunk events so we can auth via power_levels

Previously, we were using `content.chunk_id` to connect one
chunk to another. But these events can be from any `sender`
and we can't tell who should be able to send historical events.
We know we only want the application service to do it but these
events have the sender of a real historical message, not the
application service user ID as the sender. Other federated homeservers
also have no indicator which senders are an application service on
the originating homeserver.

So we want to auth all of the MSC2716 events via power_levels
and have them be sent by the application service with proper
PL levels in the room.

* Switch to chunk events for federation

* Add unstable room version to support new historical PL

* Fix federated events being rejected for no state_groups

Add fix from https://github.com/matrix-org/synapse/pull/10439
until it merges.

* Only connect base insertion event to prev_event_ids

Per discussion with @erikjohnston,
https://matrix.to/#/!UytJQHLQYfvYWsGrGY:jki.re/$12bTUiObDFdHLAYtT7E-BvYRp3k_xv8w0dUQHibasJk?via=jki.re&via=matrix.org

* Make it possible to get the room_version with txn

* Allow but ignore historical events in unsupported room version

See https://github.com/matrix-org/synapse/pull/10245#discussion_r675592489

We can't reject historical events on unsupported room versions because homeservers without knowledge of MSC2716 or the new room version don't reject historical events either.

Since we can't rely on the auth check here to stop historical events on unsupported room versions, I've added some additional checks in the processing/persisting code (`synapse/storage/databases/main/events.py` ->  `_handle_insertion_event` and `_handle_chunk_event`). I've had to do some refactoring so there is method to fetch the room version by `txn`.

* Move to unique index syntax

See https://github.com/matrix-org/synapse/pull/10245#discussion_r675638509

* High-level document how the insertion->chunk lookup works

* Remove create_event fallback for room_versions

See https://github.com/matrix-org/synapse/pull/10245/files#r677641879

* Use updated method name
2021-07-28 10:46:37 -05:00
Eric Eastwood
7387d6f624
Remove unused events_by_room (#10421)
It looks like it was first used and introduced in 5130d80d79 (diff-8a4a36a7728107b2ccaff2cb405dbab229a1100fe50653a63d1aa9ac10ae45e8R305) but the 

But the usage was removed in 4c6a31cd6e (diff-8a4a36a7728107b2ccaff2cb405dbab229a1100fe50653a63d1aa9ac10ae45e8)
2021-07-19 10:16:46 +01:00
Jonathan de Jong
bdfde6dca1
Use inline type hints in http/federation/, storage/ and util/ (#10381) 2021-07-15 12:46:54 -04:00
Andreas Rammhold
e3e73e181b
Upsert redactions in case they already exists (#10343)
* Upsert redactions in case they already exists

Occasionally, in combination with retention, redactions aren't deleted
from the database whenever they are due for deletion. The server will
eventually try to backfill the deleted events and trip over the already
existing redaction events.

Switching to an UPSERT for those events allows us to recover from there
situations. The retention code still needs fixing but that is outside of
my current comfort zone on this code base.

This is related to #8707 where the error was discussed already.

Signed-off-by: Andreas Rammhold <andreas@rammhold.de>

* Also purge redactions when purging events

Previously redacints where left behind leading to backfilling issues
when the server stumbled across the already existing yet to be
backfilled redactions.

This issues has been discussed in #8707.

Signed-off-by: Andreas Rammhold <andreas@rammhold.de>
2021-07-09 11:03:02 +01:00
Richard van der Hoff
224f2f949b
Combine LruCache.invalidate and invalidate_many (#9973)
* Make `invalidate` and `invalidate_many` do the same thing

... so that we can do either over the invalidation replication stream, and also
because they always confused me a bit.

* Kill off `invalidate_many`

* changelog
2021-05-27 10:33:56 +01:00
Jonathan de Jong
495b214f4f
Fix (final) Bugbear violations (#9838) 2021-04-20 11:50:49 +01:00
Erik Johnston
601b893352
Small speed up joining large remote rooms (#9825)
There are a couple of points in `persist_events` where we are doing a
query per event in series, which we can replace.
2021-04-16 14:44:55 +01:00
Jonathan de Jong
4b965c862d
Remove redundant "coding: utf-8" lines (#9786)
Part of #9744

Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as python 3 automatically reads source code as utf-8 now.

`Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`
2021-04-14 15:34:27 +01:00
Jonathan de Jong
2ca4e349e9
Bugbear: Add Mutable Parameter fixes (#9682)
Part of #9366

Adds in fixes for B006 and B008, both relating to mutable parameter lint errors.

Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>
2021-04-08 22:38:54 +01:00
Richard van der Hoff
567f88f835
Prep work for removing outlier from internal_metadata (#9411)
* Populate `internal_metadata.outlier` based on `events` table

Rather than relying on `outlier` being in the `internal_metadata` column,
populate it based on the `events.outlier` column.

* Move `outlier` out of InternalMetadata._dict

Ultimately, this will allow us to stop writing it to the database. For now, we
have to grandfather it back in so as to maintain compatibility with older
versions of Synapse.
2021-03-17 12:33:18 +00:00
Erik Johnston
0b5c967813
Refactor to ensure we call check_consistency (#9470)
The idea here is to stop people forgetting to call `check_consistency`. Folks can still just pass in `None` to the new args in `build_sequence_generator`, but hopefully they won't.
2021-02-24 10:13:53 +00:00
Patrick Cloke
65a9eb8994
Include newly added sequences in the port DB script. (#9449)
And ensure the consistency of `event_auth_chain_id`.
2021-02-23 07:33:24 -05:00
Eric Eastwood
0a00b7ff14
Update black, and run auto formatting over the codebase (#9381)
- Update black version to the latest
 - Run black auto formatting over the codebase
    - Run autoformatting according to [`docs/code_style.md
`](80d6dc9783/docs/code_style.md)
 - Update `code_style.md` docs around installing black to use the correct version
2021-02-16 22:32:34 +00:00
Patrick Cloke
7950aa8a27 Fix some typos. 2021-02-12 11:14:12 -05:00
Erik Johnston
758ed5f1bc
Speed up chain cover calculation (#9176) 2021-01-21 17:00:12 +00:00
Erik Johnston
eee6fcf5fa
Use execute_batch instead of executemany in places (#9181)
`execute_batch` does fewer round trips in postgres than `executemany`, but does not give a correct `txn.rowcount` result after.
2021-01-21 10:22:53 +00:00
Erik Johnston
659c415ed4
Fix chain cover background update to work with split out event persisters (#9115) 2021-01-14 17:19:35 +00:00
Erik Johnston
7036e24e98
Add background update for add chain cover index (#9029) 2021-01-14 15:18:27 +00:00
Erik Johnston
1315a2e8be
Use a chain cover index to efficiently calculate auth chain difference (#8868) 2021-01-11 16:09:22 +00:00
Erik Johnston
63f4990298
Ensure rejected events get added to some metadata tables (#9016)
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
2021-01-11 13:57:33 +00:00
Richard van der Hoff
b6ca69e4f1 Remove frozendict_json_encoder and support frozendicts everywhere
Not being able to serialise `frozendicts` is fragile, and it's annoying to have
to think about which serialiser you want. There's no real downside to
supporting frozendicts, so let's just have one json encoder.
2020-10-28 15:56:57 +00:00
Richard van der Hoff
97647b33c2
Replace DeferredCache with LruCache where possible (#8563)
Most of these uses don't need a full-blown DeferredCache; LruCache is lighter and more appropriate.
2020-10-19 12:20:29 +01:00
Brendan Abolivier
3ee97a2748
Make sure a retention policy is a state event (#8527)
* Make sure a retention policy is a state event

* Changelog
2020-10-14 12:00:52 +01:00
Erik Johnston
b2486f6656
Fix message duplication if something goes wrong after persisting the event (#8476)
Should fix #3365.
2020-10-13 12:07:56 +01:00
Erik Johnston
5009ffcaa4
Only send RDATA for instance local events. (#8496)
When pulling events out of the DB to send over replication we were not
filtering by instance name, and so we were sending events for other
instances.
2020-10-09 13:10:33 +01:00
Richard van der Hoff
f31f8e6319
Remove stream ordering from Metadata dict (#8452)
There's no need for it to be in the dict as well as the events table. Instead,
we store it in a separate attribute in the EventInternalMetadata object, and
populate that on load.

This means that we can rely on it being correctly populated for any event which
has been persited to the database.
2020-10-05 14:43:14 +01:00
Patrick Cloke
4ff0201e62
Enable mypy checking for unreachable code and fix instances. (#8432) 2020-10-01 08:09:18 -04:00
Richard van der Hoff
302dc89f6a
Fix bug which caused failure on join with malformed membership events (#8385) 2020-09-23 16:42:14 +01:00
Erik Johnston
cbabb312e0
Use async with for ID gens (#8383)
This will allow us to hit the DB after we've finished using the generated stream ID.
2020-09-23 16:11:18 +01:00
Erik Johnston
04cc249b43
Add experimental support for sharding event persister. Again. (#8294)
This is *not* ready for production yet. Caveats:

1. We should write some tests...
2. The stream token that we use for events can get stalled at the minimum position of all writers. This means that new events may not be processed and e.g. sent down sync streams if a writer isn't writing or is slow.
2020-09-14 10:16:41 +01:00
Erik Johnston
fe8ed1b46f
Make StreamToken.room_key be a RoomStreamToken instance. (#8281) 2020-09-11 12:22:55 +01:00
Brendan Abolivier
9f8abdcc38
Revert "Add experimental support for sharding event persister. (#8170)" (#8242)
* Revert "Add experimental support for sharding event persister. (#8170)"

This reverts commit 82c1ee1c22.

* Changelog
2020-09-04 10:19:42 +01:00
Brendan Abolivier
5a1dd297c3
Re-implement unread counts (again) (#8059) 2020-09-02 17:19:37 +01:00
Erik Johnston
82c1ee1c22
Add experimental support for sharding event persister. (#8170)
This is *not* ready for production yet. Caveats:

1. We should write some tests...
2. The stream token that we use for events can get stalled at the minimum position of all writers. This means that new events may not be processed and e.g. sent down sync streams if a writer isn't writing or is slow.
2020-09-02 15:48:37 +01:00
Erik Johnston
2231dffee6
Make StreamIdGen get_next and get_next_mult async (#8161)
This is mainly so that `StreamIdGenerator` and `MultiWriterIdGenerator`
will have the same interface, allowing them to be used interchangeably.
2020-08-25 15:10:08 +01:00
Patrick Cloke
e8861957d9
Convert receipts and events databases to async/await. (#8076) 2020-08-14 10:05:19 -04:00
Brendan Abolivier
2ffd6783c7
Revert #7736 (#8039) 2020-08-06 17:15:35 +01:00
Erik Johnston
a7bdf98d01
Rename database classes to make some sense (#8033) 2020-08-05 21:38:57 +01:00