If the client failed to process incoming commands during the initial set
up of the replication connection it would immediately disconnect and
reconnect, resulting in a tightloop.
This can happen, for example, when subscribing to a stream that has a
row that is too long in the backlog.
The fix here is to not consider the connection successfully set up until
the client has succesfully subscribed and caught up with the streams.
This ensures that the retry logic timers aren't reset until then,
meaning that if an error does happen during start up the client will
continue backing off before retrying again.
Currently whenever the current state changes in a room invalidate a lot
of caches, which cause *a lot* of traffic over replication. Instead,
lets batch up all those invalidations and send a single poke down
the replication streams.
Hopefully this will reduce load on the master process by substantially
reducing traffic.
This allows registration to be handled by a worker, though the actual
write to the database still happens on master.
Note: due to the in-memory session map all registration requests must be
handled by the same worker.
* Fix replication for room v3
We were not correctly quoting the path fragments over http replication,
which meant that it exploded when the event IDs had a slash in them
* Newsfile
Run the handlers for replication commands as background processes. This should
improve the visibility in our metrics, and reduce the number of "running db
transaction from sentinel context" warnings.
Ideally it means converting the things that fire off deferreds into the night
into things that actually return a Deferred when they are done. I've made a bit
of a stab at this, but it will probably be leaky.
This is the first tranche of support for room versioning. It includes:
* setting the default room version in the config file
* new room_version param on the createRoom API
* storing the version of newly-created rooms in the m.room.create event
* fishing the version of existing rooms out of the m.room.create event
This code brings the SimpleHttpClient into line with the
MatrixFederationHttpClient by having it raise HttpResponseExceptions when a
request fails (rather than trying to parse for matrix errors and maybe raising
MatrixCodeMessageException).
Then, whenever we were checking for MatrixCodeMessageException and turning them
into SynapseErrors, we now need to check for HttpResponseExceptions and call
to_synapse_error.
on_notifier_poke no longer runs synchonously, so we have to do a different hack
to make sure that the replication data has been sent. Let's actually listen for
its arrival.