When synapse receives an event for a room its not in over federation, it
double checks with the remote server to see if it is in fact in the
room. This is done so that if the server has forgotten about the room
(usually as a result of the database being dropped) it can recover from
it.
However, in the presence of state resets in large rooms, this can cause
a lot of work for servers that have legitimately left. As a hacky
solution that supports both cases we drop incoming events for rooms that
we have explicitly left.
This means that we no longer support the case of servers having
forgotten that they've rejoined a room, but that is sufficiently rare
that we're not going to support it for now.
Since we didn't instansiate the PusherPool at start time it could fail
at run time, which it did for some users.
This may or may not fix things for those users, but it should happen at
start time and stop the server from starting.
The `except SynapseError` clauses were pointless because the wrapped functions
would never throw a `SynapseError` (they either throw a `CodeMessageException`
or a `RuntimeError`).
The `except CodeMessageException` is now also pointless because the caller
treats all exceptions equally, so we may as well just throw the
`CodeMessageException`.
During a rejection of an invite received over federation, we ask a remote
server to make us a `leave` event, then sign it, then send that with
`send_leave`.
We were saving the *unsigned* version of the event (which has a different event
id to the signed version) to our db (and sending it to the clients), whereas
other servers in the room will have seen the *signed* version. We're not aware
of any actual problems that caused, except that it makes the database confusing
to look at and generally leaves the room in a weird state.
Make sure that we accept join events from any server, rather than just the
origin server, to make the federation join dance work correctly.
(Fixes#1893).
Fix a bug in ``logcontext.preserve_fn`` which made it leak context into the
reactor, and add a test for it.
Also, get rid of ``logcontext.reset_context_after_deferred``, which tried to do
the same thing but had its own, different, set of bugs.
A few non-functional changes:
* A bunch of docstrings to document types
* Split `EventsStore._persist_events_txn` up a bit. Hopefully it's a bit more
readable.
* Rephrase `EventFederationStore._update_min_depth_for_room_txn` to avoid
mind-bending conditional.
* Rephrase rejected/outlier conditional in `_update_outliers_txn` to avoid
mind-bending conditional.
This just takes the existing `room_queues` logic and moves it out to
`on_receive_pdu` instead of `_process_received_pdu`, which ensures that we
don't start trying to fetch prev_events and whathaveyou until the join has
completed.
Unfortunately this significantly increases the size of the already-rather-big
FederationHandler, but the code fits more naturally here, and it paves the way
for the tighter integration that I need between handling incoming PDUs and
doing the join dance.
Other than renaming the existing `FederationHandler.on_receive_pdu` to
`_process_received_pdu` to make way for it, this just consists of the move, and
replacing `self.handler` with `self` and `self` with `self.replication_layer`.