Previously only Twisted's EPollReactor was compatible with the
reactor timing metric, notably not working when asyncio was used.
After this change, the following configurations support the reactor
timing metric:
* poll, epoll, or select reactors
* asyncio reactor with a poll, epoll, select, /dev/poll, or kqueue event loop.
`setup()` is run under the sentinel context manager, so we wrap the
initial update in a background process. Before this change, Synapse
would log two warnings on startup:
Starting db txn 'count_daily_users' from sentinel context
Starting db connection from sentinel context: metrics will be lost
Signed-off-by: Sean Quah <seanq@matrix.org>
Principally, `prometheus_client.REGISTRY.register` now requires its argument to
extend `prometheus_client.Collector`.
Additionally, `Gauge.set` is now annotated so that passing `Optional[int]`
causes an error.
The existing implementation of the `python_twisted_reactor_tick_time` metric is pretty useless, because it *only*
measures the time taken to execute timed calls and callbacks from threads. That neglects everything that
happens off the back of I/O, which is obviously quite a lot for us.
To improve this, I've hooked into a different place in the reactor - in particular, where it calls `epoll`. That call is
the only place it should wait for something to happen - the rest of the loop *should* be quick.
I've also removed `python_twisted_reactor_pending_calls`, because I don't believe anyone ever looks at it, and
it's a nuisance to populate.
Rather than hooking into the reactor loop, just add a timed task that runs every 100 ms to do the garbage collection.
Part 1 of a quest to simplify the reactor monkey-patching.
* Teach MyPy that the sentinel context is False
This means that if `ctx: LoggingContextOrSentinel`
then `bool(ctx)` narrows us to `ctx:LoggingContext`, which is a really
neat find!
* Annotate RequestMetrics
- Raise errors for sentry if we use the sentinel context
- Ensure we don't raise an error and carry on, but not recording stats
- Include stack trace in the error case to lower Sean's blood pressure
* Make mypy pass for synapse.http.request_metrics
* Make synapse.http.connectproxyclient pass mypy
Co-authored-by: reivilibre <oliverw@matrix.org>
Updating mypy past version 0.9 means that third-party stubs are no-longer distributed with typeshed. See http://mypy-lang.blogspot.com/2021/06/mypy-0900-released.html for details.
We therefore pull in stub packages in setup.py
Additionally, some modules that we were previously ignoring import failures for now have stubs. So let's use them.
The rest of this change consists of fixups to make the newer mypy + stubs pass CI.
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
Synapse can be quite memory intensive, and unless care is taken to tune
the GC thresholds it can end up thrashing, causing noticable performance
problems for large servers. We fix this by limiting how often we GC a
given generation, regardless of current counts/thresholds.
This does not help with the reverse problem where the thresholds are set
too high, but that should only happen in situations where they've been
manually configured.
Adds a `gc_min_seconds_between` config option to override the defaults.
Fixes#9890.
As far as I can tell our logging contexts are meant to log the request ID, or sometimes the request ID followed by a suffix (this is generally stored in the name field of LoggingContext). There's also code to log the name@memory location, but I'm not sure this is ever used.
This simplifies the code paths to require every logging context to have a name and use that in logging. For sub-contexts (created via nested_logging_contexts, defer_to_threadpool, Measure) we use the current context's str (which becomes their name or the string "sentinel") and then potentially modify that (e.g. add a suffix).
Part of #9744
Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as python 3 automatically reads source code as utf-8 now.
`Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`
This PR modifies `GaugeBucketCollector` to only report data once it has been updated, rather than initially reporting a value of 0. Fixes zero values being reported for some metrics on startup until a background job to update the metric's value runs later.
- Update black version to the latest
- Run black auto formatting over the codebase
- Run autoformatting according to [`docs/code_style.md
`](80d6dc9783/docs/code_style.md)
- Update `code_style.md` docs around installing black to use the correct version
The main use case is to see how many requests are being made, and how
many are second/third/etc attempts. If there are large number of retries
then that likely indicates a delivery problem.
#8567 started a span for every background process. This is good as it means all Synapse code that gets run should be in a span (unless in the sentinel logging context), but it means we generate about 15x the number of spans as we did previously.
This PR attempts to reduce that number by a) not starting one for send commands to Redis, and b) deferring starting background processes until after we're sure they're necessary.
I don't really know how much this will help.