Merge pull request #4671 from matrix-org/erikj/state_cache_invalidation

Batch cache invalidation over replication
2025-11-28 15:40:43 -05:00 · 2019-02-19 13:14:30 +00:00 · 2019-02-19 13:14:30 +00:00 · c003450057
commit c003450057
parent 1c0eb8bbb2 62175a20e5
5 changed files with 96 additions and 33 deletions
--- a/docs/tcp_replication.rst
+++ b/docs/tcp_replication.rst
@@ -137,7 +137,6 @@ for each stream so that on reconnection it can start streaming from the correct
 place. Note: not all RDATA have valid tokens due to batching. See
 ``RdataCommand`` for more details.

-
 Example
 ~~~~~~~

@@ -221,3 +220,28 @@ SYNC (S, C)

 See ``synapse/replication/tcp/commands.py`` for a detailed description and the
 format of each command.
+
+
+Cache Invalidation Stream
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The cache invalidation stream is used to inform workers when they need to
+invalidate any of their caches in the data store. This is done by streaming all
+cache invalidations done on master down to the workers, assuming that any caches
+on the workers also exist on the master.
+
+Each individual cache invalidation results in a row being sent down replication,
+which includes the cache name (the name of the function) and the key to
+invalidate. For example::
+
+    > RDATA caches 550953771 ["get_user_by_id", ["@bob:example.com"], 1550574873251]
+
+However, there are times when a number of caches need to be invalidated at the
+same time with the same key. To reduce traffic we batch those invalidations into
+a single poke by defining a special cache name, which workers expand into
+invalidations of all the relevant caches.
+
+Currently the special cache names are declared in ``synapse/storage/_base.py``
+and are:
+
+1. ``cs_cache_fake`` ─ invalidates caches that depend on the current state
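
To make the row format and the batching idea concrete, here is a minimal Python sketch of how a worker might parse a ``caches`` RDATA line and expand a special cache name. It is not Synapse's implementation: ``parse_caches_rdata``, ``expand_cache_names``, ``CacheInvalidation`` and the contents of ``SPECIAL_CACHE_NAMES`` are hypothetical names invented for illustration, and the third row field is only assumed to be a millisecond timestamp.

```python
import json
from typing import List, NamedTuple, Tuple

# Hypothetical expansion table. The real special names are declared in
# synapse/storage/_base.py; the cache names listed here are placeholders.
SPECIAL_CACHE_NAMES = {
    "cs_cache_fake": ["example_state_cache_a", "example_state_cache_b"],
}


class CacheInvalidation(NamedTuple):
    cache_name: str  # name of the cached function, or a special name
    keys: List[str]  # key to invalidate
    timestamp: int   # assumed to be a millisecond timestamp


def parse_caches_rdata(line: str) -> Tuple[str, CacheInvalidation]:
    """Split 'RDATA caches <token> <json row>' into a token and a row."""
    command, stream, token, payload = line.split(" ", 3)
    if command != "RDATA" or stream != "caches":
        raise ValueError("not a caches RDATA line: %r" % (line,))
    cache_name, keys, timestamp = json.loads(payload)
    # The token is kept as a string, since not every RDATA line carries a
    # valid numeric token when rows are batched.
    return token, CacheInvalidation(cache_name, keys, timestamp)


def expand_cache_names(row: CacheInvalidation) -> List[str]:
    """Return the caches to invalidate, expanding special names."""
    return SPECIAL_CACHE_NAMES.get(row.cache_name, [row.cache_name])
```

With this sketch, an ordinary row maps to a single cache name, while one ``cs_cache_fake`` row fans out to every cache in the table, so the master sends one poke instead of one per cache.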