Primarily this fixes a bug in the handling of remote users joining a
room where the server sent out the presence for all local users in the
room to all servers in the room.
We also change to using the state delta stream, rather than the
distributor, as it will make it easier to split processing out of the
master process (as well as being more flexible).
Finally, when sending presence states to newly joined servers we filter
out old presence states to reduce the number sent. Initially we filter
out states that are offline and have a last active more than a week ago,
though this can be changed down the line.
Fixes#3962
In worker mode, on the federation sender, when we receive an edu for sending
over the replication socket, it is parsed into an Edu object. There is no point
extracting the contents of it so that we can then immediately build another Edu.
There were a bunch of places where we fire off a process to happen in the
background, but don't have any exception handling on it - instead relying on
the unhandled error being logged when the relevent deferred gets
garbage-collected.
This is unsatisfactory for a number of reasons:
- logging on garbage collection is best-effort and may happen some time after
the error, if at all
- it can be hard to figure out where the error actually happened.
- it is logged as a scary CRITICAL error which (a) I always forget to grep for
and (b) it's not really CRITICAL if a background process we don't care about
fails.
So this is an attempt to add exception handling to everything we fire off into
the background.
This is mainly done by moving the calculation of where to send presence
updates from the presence handler to the transaction queue, so we only
need to send the presence event (and not the destinations) across the
replication connection. Before we were duplicating by sending the full
state across once per destination.
The TCP replication protocol streams deltas of who has started or
stopped syncing. This is different from the HTTP API which periodically
sends the full list of users who are syncing. This commit adds support
for the new TCP style of sending deltas.