We start all pushers on start up and immediately start a background
process to fetch push to send. This makes start up incredibly painful
when dealing with many pushers.
Instead, let's do a quick fast DB check to see if there *may* be push to
send and only start the background processes for those pushers. We also
stagger starting up and doing those checks so that we don't try and
handle all pushers at once.
This brings it into line with on_new_notifications and on_new_receipts. It
requires a little bit of hoop-jumping in EmailPusher to load the throttle
params before the first loop.
`on_new_notifications` and `on_new_receipts` in `HttpPusher` and `EmailPusher`
now always return synchronously, so we can remove the `defer.gatherResults` on
their results, and the `run_as_background_process` wrappers can be removed too
because the PusherPool methods will now complete quickly enough.
Each pusher has its own loop which runs for as long as it has work to do. This
should run in its own background thread with its own logcontext, as other
similar loops elsewhere in the system do - which means that CPU usage is
consistently attributed to that loop, rather than to whatever request happened
to start the loop.
move the example email templates into the synapse package so that they can be
used as package data, which should mean that all of the packaging mechanisms
(pip, docker, debian, arch, etc) should now come with the example templates.
In order to grandfather in people who relied on the templates being in the old
place, check for that situation and fall back to using the defaults if the
templates directory does not exist.
First of all, avoid resetting the logcontext before running the pushers, to fix
the "Starting db txn 'get_all_updated_receipts' from sentinel context" warning.
Instead, give them their own "background process" logcontexts.
While I was going through uses of preserve_fn for other PRs, I converted places
which only use the wrapped function once to use run_in_background, to avoid
creating the function object.
There were a bunch of places where we fire off a process to happen in the
background, but don't have any exception handling on it - instead relying on
the unhandled error being logged when the relevent deferred gets
garbage-collected.
This is unsatisfactory for a number of reasons:
- logging on garbage collection is best-effort and may happen some time after
the error, if at all
- it can be hard to figure out where the error actually happened.
- it is logged as a scary CRITICAL error which (a) I always forget to grep for
and (b) it's not really CRITICAL if a background process we don't care about
fails.
So this is an attempt to add exception handling to everything we fire off into
the background.
The redact_content option never worked because it read the wrong config
section. The PR introducing it
(https://github.com/matrix-org/synapse/pull/2301) had feedback suggesting the
name be changed to not re-use the term 'redact' but this wasn't
incorporated.
This reanmes the option to give it a less confusing name, and also
means that people who've set the redact_content option won't suddenly
see a behaviour change when upgrading synapse, but instead can set
include_content if they want to.
This PR also updates the wording of the config comment to clarify
that this has no effect on event_id_only push.
Includes https://github.com/matrix-org/synapse/pull/2422
Only prepend / append word bounary characters if the search
expression starts or ends with a word character, otherwise they
don't work because there's no word bounary between whitespace and
a non-word char.
Param in the data dict of a pusher that tells an HTTP pusher to
send just the event_id of the event it's notifying about and the
notification counts. For clients that want to go & fetch the body
of the event themselves anyway.
Initialising `result` to `{}` in the parameters meant that every call to
_flatten_dict used the *same* target dictionary.
I'm hopeful this will fix https://github.com/matrix-org/synapse/issues/2270,
but I suspect it won't. (This code seems to have been here since forever,
unlike the bug, and I don't really think it explains the observed
behaviour). Still, it makes it hard to investigate the problem.