Some of the caches on worker processes were not being correctly invalidated
when a room's state was changed in a way that did not affect the membership
list of the room.
We need to make sure we send out cache invalidations even when no memberships
are changing.
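A minimal sketch of the intent, using made-up helper names rather than Synapse's actual replication API:

```python
def on_current_state_change(room_id, membership_delta, send_invalidation):
    """Hypothetical hook run after a room's current state has been persisted.

    `send_invalidation(cache_name, key)` stands in for whatever pokes the
    worker processes over replication.
    """
    # Send the poke for state-related caches unconditionally, not only when
    # the membership list changed.
    send_invalidation("get_current_state_ids", (room_id,))

    if membership_delta:
        # Membership caches only need invalidating when memberships changed.
        send_invalidation("get_users_in_room", (room_id,))
```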
Fixes intermittent errors observed on Apple hardware, caused by time.clock()
appearing to go backwards when called from different threads.
Also fixes a bug where database activity times were logged at 1/1000 of their
correct value, due to a confusion between milliseconds and seconds.
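For illustration only (not the actual patch): timing with a monotonic clock, and keeping durations in seconds until the logging boundary, avoids both classes of bug:

```python
import logging
import time

logger = logging.getLogger(__name__)


def timed_db_txn(func, *args, **kwargs):
    # time.monotonic() never goes backwards, unlike the old time.clock(),
    # whose value could differ between threads on some platforms.
    start = time.monotonic()
    try:
        return func(*args, **kwargs)
    finally:
        duration_secs = time.monotonic() - start
        # Keep a single unit (seconds) internally; convert only at the
        # boundary, so a milliseconds/seconds mix-up can't skew the figures.
        logger.debug("DB transaction took %.3f ms", duration_secs * 1000)
```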
Sends password reset emails from the homeserver instead of proxying them to the identity server. This is now the default behaviour, for security reasons. If you wish to continue proxying password reset requests to the identity server, you must now enable the `email.trust_identity_server_for_password_resets` option.
This PR is the culmination of three smaller PRs, each of which has been reviewed separately:
* #5308
* #5345
* #5368
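For example, opting back into the old proxying behaviour might look like this in homeserver.yaml (placement inferred from the option name; check the sample config):

```yaml
email:
  # Revert to the old behaviour: delegate password reset emails to the
  # configured identity server rather than sending them from this homeserver.
  trust_identity_server_for_password_resets: true
```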
When the account validity feature is enabled, Synapse will look at startup for registered accounts without an expiration date, and will set one equal to 'now + validity_period' for them. On large servers this can mean that a large number of users end up with the same expiration date, and therefore all get sent a renewal email at the same time, which isn't ideal.
To mitigate this, this PR allows server admins to define a 'max_delta' so that the expiration date is a random value in the range [now + validity_period, now + validity_period + max_delta]. This lets renewal emails be sent progressively over a configured period instead of all in one big batch.
If account validity is enabled in the server's configuration, this job runs at startup as a background job and attaches an expiration date to any registered account that is missing one.
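A sketch of the randomisation, with illustrative names rather than the actual implementation:

```python
import random
import time


def pick_expiration_ts(validity_period_ms, max_delta_ms, now_ms=None):
    """Pick an expiration timestamp (in ms) in the range
    [now + validity_period, now + validity_period + max_delta], so that
    accounts stamped in the same startup run don't all expire, and hence get
    renewal emails, at the same instant."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    delta = random.randint(0, max_delta_ms)
    return now_ms + validity_period_ms + delta
```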
Hopefully this time we really will fix #4422.
We need to make sure that the cache on
`get_rooms_for_user_with_stream_ordering` is invalidated *before* the
SyncHandler is notified for the new events, and we can now do so reliably via
the `events` stream.
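Schematically (hypothetical handler and method names):

```python
def on_events_stream_rows(store, notifier, rows):
    # Invalidate the cache for every affected user first...
    for row in rows:
        store.get_rooms_for_user_with_stream_ordering.invalidate((row.user_id,))

    # ...and only then wake up the sync machinery, so that it can never
    # observe a stale cache entry for the new events.
    notifier.on_new_events(rows)
```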
Currently, whenever the current state in a room changes, we invalidate a lot
of caches, which causes *a lot* of traffic over replication. Instead,
let's batch up all those invalidations and send a single poke down
the replication streams.
Hopefully this will reduce load on the master process by substantially
reducing traffic.
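The batching idea, sketched with made-up helper names:

```python
class CurrentStateCacheInvalidator:
    """Collects the cache invalidations for one state change and sends them
    as a single poke down replication, instead of one poke per entry."""

    def __init__(self, send_poke):
        self._send_poke = send_poke  # callable taking a list of (cache, key)
        self._pending = []

    def invalidate(self, cache_name, key):
        self._pending.append((cache_name, key))

    def flush(self):
        if self._pending:
            self._send_poke(self._pending)
            self._pending = []
```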
Add more tables to the list of tables which need a background update to
complete before we can upsert into them, which fixes a race against the
background updates.
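Roughly, the guard looks like this (simplified, hypothetical names):

```python
def can_native_upsert(table, completed_bg_updates, index_bg_update_for_table):
    """Return True if it is safe to use a native upsert on `table`.

    `index_bg_update_for_table` maps a table name to the background update
    that creates the unique index the upsert relies on (illustrative only);
    until that update has completed we must fall back to the emulated,
    lock-based code path.
    """
    required = index_bg_update_for_table.get(table)
    return required is None or required in completed_bg_updates
```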
We need to run the errback in the sentinel context to avoid losing our own
context.
Also: add logging to `runInteraction` to help identify where "Starting db
connection from sentinel context" warnings are coming from.
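A toy illustration of the first point, standing in for Synapse's real logcontext machinery (names and semantics simplified):

```python
from twisted.internet import defer

# Toy stand-ins: `current_context` is the context owned by whoever is
# currently running; SENTINEL means "nobody".
SENTINEL = "sentinel"
current_context = SENTINEL


def add_errback_in_sentinel_context(d, errback):
    """Attach `errback` to Deferred `d` so that it runs with the sentinel
    context active, instead of stealing (and then losing) whichever context
    happens to be current when the failure is delivered."""

    def wrapped(failure):
        global current_context
        calling_context, current_context = current_context, SENTINEL
        try:
            return errback(failure)
        finally:
            current_context = calling_context

    d.addErrback(wrapped)
    return d
```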
The psycopg2 package isn't available for PyPy. This commit adds a check
for whether the runtime is PyPy and, if it is, uses the psycopg2cffi module
in place of psycopg2. This is almost a drop-in replacement, except for one
place where an additional cast to string is required.
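The check is roughly as follows; `psycopg2cffi.compat.register()` installs the module under the `psycopg2` name (a sketch, not necessarily the exact change):

```python
import platform

if platform.python_implementation() == "PyPy":
    # psycopg2cffi is almost a drop-in replacement for psycopg2 and can
    # register itself under the psycopg2 module name.
    from psycopg2cffi import compat

    compat.register()

import psycopg2  # resolves to psycopg2cffi on PyPy
```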
We poked the notifier before updating the current token for the cache
invalidation stream. This meant that sometimes the update wouldn't be
sent until the next time a cache was invalidated.
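In outline (hypothetical names), the fix is just a question of ordering:

```python
def on_cache_invalidated(stream, notifier, new_token):
    # Advance the stream's current token first...
    stream.current_token = new_token
    # ...and only then poke the notifier. The other way round, the poke can
    # go out while the token still points at the previous update, so the new
    # invalidation isn't sent until the next one comes along.
    notifier.notify_replication()
```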
For each request, track the amount of time spent waiting for a db
connection. This entails adding it to the LoggingContext, and we may as well
add metrics for it while we're at it.
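A sketch of the measurement, with invented names for the context and metric hooks:

```python
import time
from contextlib import contextmanager


@contextmanager
def track_db_sched_wait(request_context, metric_callback=None):
    """Measure how long we spend waiting for a database connection, and
    account it both to the per-request context and to a metrics callback."""
    start = time.monotonic()
    try:
        yield
    finally:
        waited = time.monotonic() - start
        request_context["db_sched_duration_sec"] = (
            request_context.get("db_sched_duration_sec", 0.0) + waited
        )
        if metric_callback is not None:
            metric_callback(waited)
```

The idea is to wrap only the "wait for a connection from the pool" step in this context manager, so the recorded time excludes the transaction itself.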
Wrap the call to `_simple_upsert_txn` in a loop so that we retry on an
IntegrityError: this means we can avoid locking the table, provided there is
a unique index.
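The retry loop is essentially the following (simplified, using sqlite3 so the example is self-contained):

```python
import sqlite3


def emulated_upsert(txn, table, keyvalues, values, max_attempts=5):
    """Try UPDATE; if no row matched, try INSERT. A concurrent INSERT can
    make ours fail with IntegrityError (thanks to the unique index), in
    which case we simply retry, so no table lock is needed."""
    for _ in range(max_attempts):
        where = " AND ".join("%s = ?" % k for k in keyvalues)
        sets = ", ".join("%s = ?" % k for k in values)
        txn.execute(
            "UPDATE %s SET %s WHERE %s" % (table, sets, where),
            list(values.values()) + list(keyvalues.values()),
        )
        if txn.rowcount > 0:
            return False  # updated an existing row

        cols = list(keyvalues) + list(values)
        try:
            txn.execute(
                "INSERT INTO %s (%s) VALUES (%s)"
                % (table, ", ".join(cols), ", ".join("?" for _ in cols)),
                list(keyvalues.values()) + list(values.values()),
            )
            return True  # inserted a new row
        except sqlite3.IntegrityError:
            continue  # lost the race against a concurrent insert; retry

    raise RuntimeError("too many upsert retries on %s" % table)
```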
Add db_conn parameters to the `__init__` methods of the *Store classes, so that
they are all consistent, which makes the multiple inheritance work correctly
(and so that we can later extract mixins which can be used in the slaved stores).
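The reason a consistent signature matters is Python's cooperative `super().__init__()` chain; a sketch with hypothetical store classes:

```python
class BaseStore:
    def __init__(self, db_conn, hs):
        self.db_conn = db_conn
        self.hs = hs


class RoomStore(BaseStore):
    def __init__(self, db_conn, hs):
        super().__init__(db_conn, hs)
        # room-specific setup using db_conn would go here


class EventsStore(BaseStore):
    def __init__(self, db_conn, hs):
        super().__init__(db_conn, hs)
        # events-specific setup would go here


class DataStore(RoomStore, EventsStore):
    # Because every __init__ in the MRO takes (db_conn, hs), the chain of
    # super().__init__ calls works however the mixins are combined,
    # including in hypothetical slaved-store variants.
    def __init__(self, db_conn, hs):
        super().__init__(db_conn, hs)
```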
This is due to the fact that we prefilled caches using `txn.call_after`,
which always gets called, even on error.
We fix this by making `txn.call_after` fire only when a transaction
completes successfully, which is what we want most of the time anyway.
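Schematically, the change looks like this (a toy transaction wrapper, not the real LoggingTransaction):

```python
class TransactionWrapper:
    """Toy wrapper: callbacks registered with call_after only run if the
    transaction commits successfully, so cache prefills done via call_after
    can no longer populate caches with data from a rolled-back transaction."""

    def __init__(self, conn):
        self._conn = conn
        self._after_callbacks = []

    def call_after(self, callback, *args, **kwargs):
        self._after_callbacks.append((callback, args, kwargs))

    def run(self, func, *args, **kwargs):
        try:
            result = func(self, *args, **kwargs)
            self._conn.commit()
        except Exception:
            self._conn.rollback()
            raise  # the callbacks are deliberately NOT run on failure
        for callback, cargs, ckwargs in self._after_callbacks:
            callback(*cargs, **ckwargs)
        return result
```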