Replication Architecture
========================
Motivation
----------
We'd like to be able to split some of the work that synapse does into multiple
python processes. In theory multiple synapse processes could share a single
postgresql database and we'd scale up by running more synapse processes.
However, much of synapse assumes that only one process is interacting with the
database, whether for assigning unique identifiers when inserting into tables,
notifying components about new updates, or invalidating its caches.
So running multiple copies of the current code isn't an option. One way to
run multiple processes would be to have a single writer process and multiple
reader processes connected to the same database. In order to do this we'd
need a way for the reader processes to invalidate their in-memory caches
when an update happens on the writer. One way to do this is for the writer
to present an append-only log of updates which the readers can consume to
invalidate their caches and to push updates to listening clients or pushers.
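
To make this concrete, here is a minimal sketch of a reader consuming such a
log to invalidate its caches. This is illustration only, not Synapse's actual
code; the ``Cache`` class, ``consume_updates`` and the log layout are all
invented for the example.

.. code:: python

    from typing import Dict, List, Tuple


    class Cache:
        """A minimal in-memory cache keyed by (table, key)."""

        def __init__(self) -> None:
            self._entries: Dict[Tuple[str, str], object] = {}

        def invalidate(self, table: str, key: str) -> None:
            self._entries.pop((table, key), None)


    def consume_updates(
        log: List[Tuple[int, str, str]], last_seen: int, cache: Cache
    ) -> int:
        """Apply every update after ``last_seen`` from the append-only log.

        Each log entry is (stream_id, table, key). Because the log is
        ordered and append-only, a reader only needs to remember the last
        stream id it processed in order to catch up after a gap.
        """
        for stream_id, table, key in log:
            if stream_id > last_seen:
                cache.invalidate(table, key)
                last_seen = stream_id
        return last_seen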
Synapse already stores much of its data as an append-only log so that it can
correctly respond to ``/sync`` requests, so the code changes needed to expose
the append-only log to the readers should be fairly minimal.
Architecture
------------
The Replication Protocol
~~~~~~~~~~~~~~~~~~~~~~~~
See ``tcp_replication.rst``
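
For a rough flavour of what this looks like on the wire (the protocol is
line-based; ``tcp_replication.rst`` is the authoritative description), an
exchange between the server and a client might look something like::

    > SERVER example.com
    < REPLICATE events 53
    > RDATA events 54 ["$foo1:bar.com", ...]
    > RDATA events 55 ["$foo4:bar.com", ...]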
The Slaved DataStore
~~~~~~~~~~~~~~~~~~~~
There are read-only versions of the synapse storage layer in
``synapse/replication/slave/storage`` that use the response of the replication
API to invalidate their caches.
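
As a rough sketch of the pattern, a slaved store answers reads from the shared
database and drops cache entries as replication rows arrive. The decorator,
class, method and stream names below are invented for illustration and are not
the actual classes under ``synapse/replication/slave/storage``.

.. code:: python

    from collections import namedtuple
    from typing import Any, Callable, Dict, Iterable, Tuple

    # Hypothetical replication row; real streams carry richer row types.
    RoomsStreamRow = namedtuple("RoomsStreamRow", ["room_id"])


    def cached(fn: Callable) -> Callable:
        """Toy stand-in for a memoising cache decorator: stores results by
        argument tuple and exposes ``invalidate`` on the wrapper."""
        results: Dict[Tuple, Any] = {}

        def wrapper(self: Any, *args: Any) -> Any:
            if args not in results:
                results[args] = fn(self, *args)
            return results[args]

        wrapper.invalidate = lambda key: results.pop(key, None)
        return wrapper


    class SlavedRoomStore:
        """Read-only store that invalidates its caches from replication."""

        @cached
        def get_room_name(self, room_id: str) -> str:
            # A real slaved store would run a read-only database query here.
            return "Unnamed room"

        def process_replication_rows(
            self, stream_name: str, token: int, rows: Iterable[RoomsStreamRow]
        ) -> None:
            # Invoked for each batch of updates from the writer's stream.
            if stream_name == "rooms":
                for row in rows:
                    self.get_room_name.invalidate((row.room_id,))


    store = SlavedRoomStore()
    store.get_room_name("!abc:example.com")  # first call fills the cache
    store.process_replication_rows(
        "rooms", token=42, rows=[RoomsStreamRow("!abc:example.com")]
    )  # the cached name is dropped and will be re-read on the next call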